Gathering big data to accelerate the COVID-19 fight

Matthew G. Solovey
May 27, 2020

HERSHEY, Pa. — Penn State is one of 15 institutions participating in a national collaboration to turn data from hundreds of thousands of medical records from coronavirus patients into effective treatments and predictive tools.

The National COVID Cohort Collaborative partners National Institutes of Health-supported Clinical and Translational Science Award programs with U.S. Department of Health and Human Services agencies and clinical organizations to support the analysis of electronic health records on a new, secure database. Penn State Clinical and Translational Science Institute led the effort for Penn State to join the collaborative. 

The Federal Risk and Authorization Management Program, which provides standardized assessment, authorization and continuous monitoring for cloud products and services, certified the secure, cloud-based database. The National Center for Advancing Translational Sciences is providing the database, which contains records from patients who have been tested for the coronavirus or are suspected of being infected.

Individuals granted access to the database will be able to run algorithms on this first-of-its-kind patient data set without seeing actual patient records. Synthetic data will also be an option. Synthetic data artificially replicates statistical components of the data without providing identifiable patient information. 

A $25 million NIH award to the National Center for Data to Health supports the National COVID Cohort Collaborative. NIH's National Center for Advancing Translational Sciences provides the overall stewardship of the collaborative. 

"This collaborative shows the strength of the NIH's Clinical and Translational Science Award program," Dr. Larry Sinoway, director of Penn State Clinical and Translational Science Institute, said. "The quick creation of the collaborative during the COVID-19 pandemic illustrates the importance of the more than 60 award program institutions across the nation working together to advance research and improve health. Without this award program, significant roadblocks may have existed."

The database will enable machine learning and modern statistical analyses. Researchers can potentially predict patient responses to antiviral or anti-inflammatory therapies, identify new drugs and treatments, and discover body changes that can help in diagnosis and treatment.

"We thank our colleagues in our information technology, electronic health record, contracts and research departments for helping Penn State join this collaborative," Sinoway said. "Without the collaboration of the University, Penn State Health and Penn State Health Milton S. Hershey Medical Center, we would not be able to join this critical effort to address the COVID-19 pandemic. Data collected in central Pennsylvania can now benefit researchers nationally."

The transfer of electronic health records to the database began on May 12. More will be uploaded as additional partners join the effort. The 15 institutions that are currently contributing are Penn State, Oregon Health and Science University, Johns Hopkins University, University of North Carolina, Rockefeller, Washington University, University of Kentucky, Medical University of South Carolina, Stony Brook University, University of Alabama, Tufts University, University of Wisconsin-Madison, University of Massachusetts, Wake Forest University Health Sciences and Maine Medical Center Research Institute.

Penn State Clinical and Translational Science Institute is one of nearly 60 Clinical and Translational Science Award programs nationally. The institute's Informatics Core led this effort to join the National COVID Cohort Collaborative. Vasant Honavar, Wenke Hwang and Jason Hughes lead the core. Avnish Katoch, institute research informatics project manager, coordinated logistics for this project.

Last Updated June 16, 2020