Research network helps Penn State researchers study gene regulation

Katie Bohn
October 25, 2016

UNIVERSITY PARK, Pa. -- Human DNA stores reams of data.

The tiny molecules hold information about a person’s eye color, their height and even which diseases they might be susceptible to.

Tapping into all this data is important to the work of scientists like B. Franklin Pugh, Evan Pugh University Professor and Willaman Chair in molecular biology, and Shaun Mahony, assistant professor of biochemistry and molecular biology, who are both biochemistry researchers in Penn State’s Eberly College of Science.

Pugh, Mahony and their respective labs are researching different cell genomes and how they interact with proteins to regulate genes. Genes tell us who we are and how we behave, but misregulated genes can cause such diseases as cancer.

“We’re in the era of personalized genomic medicine,” said Pugh. “Cancer is really thousands of different diseases, because there are many different kinds of organs or cell types that start to proliferate uncontrollably. We’re trying to learn more about the defining components of these diseases to hopefully improve treatment.”

Pugh and his team of researchers are using a blend of technologies to help them along the way, including a sophisticated gene sequencer and the Penn State Research Network — a high-speed, secure network funded in 2012 by the National Science Foundation that allows for better data transfers here at the University.

To study a genome, Pugh says they begin with a process called “mapping.” Strands of DNA are made up of four types of building blocks, or “bases,” that are represented by the letters A, T, C and G. The team uses a machine called a gene sequencer, which analyzes the order of the bases on a DNA sample and reports it as a very long string of text.

The process begins with loading a DNA sample onto a plastic cartridge that’s then placed in the sequencer. A button on the machine’s touch screen is tapped and the sequencing begins.

The lab’s sequencer, an Illumina NextSeq 500, is quick.

Older models took a week to sequence a DNA sample; this one completes the task overnight. In the morning, the team returns to a massive amount of data, up to 500 million data points per run, that must be moved from the sequencer onto the Penn State Institute for CyberScience’s computer cluster, which is across campus, before the sequencer can be run again.

While some computer-based labs could move their systems directly into a data center, eliminating the need to transfer the data, this wasn’t feasible for Pugh and Mahony’s labs.

“Our work involves growing human cells and conducting experiments here on our lab bench,” said Pugh. “So it would be all but impossible to move these experiments into a data center.”

That’s where the Penn State Research Network came in. In September, the lab connected to the network with help from the Office of the Vice Provost for Information Technology, as well as IT professionals in the lab and in the Eberly College of Science’s IT group.

The network enables quicker file transfers and better security to protect sensitive research. Additionally, the research team can access large data files on their lab workstations instead of needing to log in remotely to a data center computer system.

Prior to joining the network, the team says it took 36 hours or longer before the sequencing data could be transferred and viewed by researchers. Now, more complete and comprehensive results are available within 18 to 24 hours.

“I don’t know how feasible it would have been to produce this scale of data before we were on the network,” Mahony said. “When we started talking about joining the network two years ago, transferring data was already becoming an issue, and we weren’t producing as much data as we are now. Now, we’re producing more data, and it’s not an issue.”

Once the data is transferred to the Computer Building, the research team can begin their analysis and start to make sense of the information they’ve gathered.

Pugh hopes that by learning more about the genomes of different kinds of cells, scientists and researchers will be closer to being able to personalize and fine-tune cancer treatments depending on which type of the disease a person has.

“We’re hoping to be able to characterize each cell type at a molecular level so we can better match treatments to the type of cancer a person has,” Pugh said. “While we’re not able to do that just yet, we are building the foundation for making that happen over the next several years.”

If you have questions or are interested in getting connected to the Penn State Research Network, contact or visit the Penn State Research Network website.

  • The lit home screen of a gene sequencer

    Gene sequencing can begin after the cartridge is loaded and the button on the touch screen is pressed.

    IMAGE: Katie Bohn
  • The lit screen of a gene sequencer displaying data

    Data can also be viewed on the sequencer before it’s moved off the machine.

    IMAGE: Katie Bohn

Last Updated April 21, 2017