
Data Commons connects researchers through data sharing


UNIVERSITY PARK, Pa. -- To promote open access to research data, many funding agencies, such as the National Science Foundation (NSF) and the National Institutes of Health (NIH), require that research data generated by publicly funded projects be made publicly available. In addition, some journals require authors to make materials, data and associated protocols promptly available to readers as a condition of publication. Researchers can now more easily comply with these policies by using the services of Penn State’s Data Commons.

Data Commons was developed to provide a resource for data sharing, discovery, and archiving for the Penn State research and teaching community. “A commons is a place you go to share something,” explained Maurie Kelly, director of the Pennsylvania Spatial Data Access (PASDA) and one of the Data Commons initiators. “What we’re sharing here is data, a service that is now expected of many researchers who publish work funded by these agencies, so it’s easy to see how the Data Commons helps to make that connection.”

Data Commons came from small beginnings. Kelly presented the concept for an Environmental Data Library in 2005. “It was one of the first projects we conducted where we wanted to do something that was different and that hadn’t been done before, but something that was equally important and vital to the Penn State community,” she said.

At that time, large-scale data management projects were still relatively new to researchers. However, as some of the major federal agencies began to change their grant requirements, expecting that data and research findings become public, the idea of a data commons became increasingly valuable. 

Kelly recalled numerous meetings, both with individual faculty and with groups around campus, about the importance of the Data Commons, and watched as it gained traction throughout the University. “Long-term data storage is something everybody wants,” she said. “I think people get very excited when they learn that you are able and willing to provide that service for them.”

Since the pilot site launched in April 2011, the Data Commons has continued to grow. With ongoing support from the Penn State Institutes of Energy and the Environment (PSIEE), the Institute for CyberScience (ICS), and the Research Computing and Cyberinfrastructure Group, the Data Commons has expanded its services as a portal to data, applications and resources by, for, or about Penn State research.

“The Internet has vastly increased the options for sharing our scholarship,” said Tom Richard, director of the PSIEE. “There are national repositories for many data types, and libraries are opening their electronic doors to new types of archival material (ScholarSphere is the Penn State library’s digital data archiving service). The Data Commons archives data but also provides a suite of wrap-around tools to support data-user interactions, including customized interfaces, models and apps. The goal is to make the data not just available, but useful.”

The Data Commons currently hosts data from researchers across the University including the College of Earth and Mineral Sciences, College of Agricultural Sciences and the College of Education; from departments such as rural sociology and ecosystem science and management; as well as from institutes and centers including the Earth and Environmental Systems Institute, and the Metabolomics Core Facility. 

Andrew Patterson, assistant professor of molecular toxicology and director of the Metabolomics Core Facility, partnered with the Data Commons for his research. “When you’re producing terabytes of data on a monthly basis, data storage becomes a very apparent issue,” he said. “Without the Data Commons, we simply could not store all the data we produce at the Metabolomics lab, so this service has been absolutely essential.”

Another useful feature of the Data Commons is the Apps and Tools section, something Kelly thought was missing from earlier efforts to provide public access to research data. These applications, tools and models are free and available to the public. They cover a wide range of research interests and have the potential to enhance teaching and outreach both within and outside the University.

Other services provided by the Data Commons include creation of Digital Object Identifiers (DOIs) and metadata development. Metadata is the key to understanding and accessing data, so the Data Commons staff will provide metadata training for researchers at Penn State in the coming months.

The Data Commons also is working on special projects that feature Penn State data. For example, the Data Commons is working with The Arboretum at Penn State to make its plant records available to visitors through a mobile application.

Kelly and her staff members, who include Ryan Baxter, information technology coordinator and doctoral candidate in geography, and James Spayd, data systems coordinator, are working on an app that would allow an Arboretum visitor to search by location for the names of nearby plants or to find the location of a species of interest.

“Our accession records for each plant in the Arboretum include, of course, the name and location,” said Kim Steiner, director of the Arboretum and professor of forest biology. “However, because many smart phones have GPS capability, it is now possible to link a person’s location with recorded information about his environment.”

The app and accompanying Web mapping application use GIS data provided by the Penn State Office of Physical Plant and are a successful example of cross-department collaboration.

The goal of the Data Commons, Kelly reiterated, is to make research findings and data publicly accessible and to promote research, teaching and outreach. “One of the driving forces behind the creation of the Data Commons,” she said, “is looking at the research output of Penn State. We can look at dollars, we can look at publications, but ultimately, we are providing this data to the public, and that in and of itself shows the incredible productivity of this University.”

Researchers interested in sharing data can email the Data Commons or call 814-863-0104. For more information, visit the Data Commons online.

Last Updated April 21, 2017
