Rutgers Receives $4 Million Grant from NSF to Establish Regional Data-Sharing Network
The National Science Foundation (NSF) has awarded a $4 million grant to Rutgers for the creation and assessment of a distributed regional computing infrastructure that will support collaborative data-intensive science.
Manish Parashar, founding director of the Rutgers Discovery Informatics Institute (RDI²) and distinguished professor of computer science at Rutgers–New Brunswick, is the project’s principal investigator. He is working with teams at Penn State University and the Keystone Initiative for Network Based Education and Research (KINBER), the grant’s sub-awardees, on the development and deployment of the state-of-the-art computing infrastructure that will benefit academia and industry in New Jersey, Pennsylvania and New York. The project team also includes researchers from City University of New York, Drexel University and Temple University.
The research team will design a Virtual Data Collaboratory (VDC), a regional infrastructure that integrates state-of-the-art, data-intensive computing platforms, storage and networking with an innovative data services layer across Rutgers, Penn State and the other institutions. A high-speed network will connect the services, with the potential to incorporate academic and research institutions nationwide. The VDC will leverage existing regional, national and international data repositories, such as the NSF-funded Ocean Observatories Initiative, which is operated by RDI² and Rutgers’ Center for Ocean Observing Leadership. It will link to existing advanced cyberinfrastructure such as the NSF-funded Big Data Regional Hubs, XSEDE and OSG, among others.
“Science, and society in general, are being increasingly transformed by data, and it is critical that we develop the necessary data ecosystem that can enable researchers to acquire, share, integrate, steward and analyze disparate types of data,” Parashar said. The VDC has the potential, he added, to transform shared data into a core modality for research, education and innovation, with direct impacts on the quality and reproducibility of data-driven science and researchers’ productivity.
Helen Berman, a Rutgers structural biologist, founder of the Nucleic Acid Database and former director of the Protein Data Bank, will be collaborating with Vasant Honavar, the principal investigator of the project at Penn State. They will use the VDC to assemble curated data sets of protein-DNA and RNA complexes and interfaces, and then use that information to develop machine learning and other computational methods to reliably predict protein-DNA and RNA interfaces. This work will not only help develop and evaluate the VDC infrastructure but also advance the understanding of molecular mechanisms in protein binding. It is one of many examples of the partnership between the universities.
“Scientific progress in many disciplines is increasingly enabled by our ability to examine natural phenomena through the computational lens, such as using algorithmic abstractions of the underlying processes, and our ability to acquire, share, integrate and analyze disparate types of data,” Honavar said. “Realizing the full potential of data to accelerate science calls for significant advances in data and computational infrastructure to support collaborative data-intensive science by teams of researchers that transcend institutional and disciplinary boundaries.”
Other Rutgers faculty working on the VDC include Grace Agnew, associate university librarian for digital library systems and technical and automated services; Thu Nguyen, professor of computer science and associate director for research cyberinfrastructure at RDI²; Ivan Rodero, associate director for technical operations and associate research professor at RDI²; Jie Gong, assistant professor of civil and environmental engineering; and James Barr von Oehsen, a computational scientist who was recruited this year from Clemson University to lead Rutgers’ new Office of Advanced Research Computing.