Whose genes are they, anyway?

The promise and peril of personal genomics

Our DNA can reveal a lot more about us than our ethnic ancestry, raising concerns about who has access to our genetic code and who may profit from it. Credit: GettyImages / metamorworks. All Rights Reserved.

You’ve seen the ads from companies that promise to tell you, based on your DNA, where your ancestors came from. You’re eager to trace your family’s roots, so you order a test kit, send in your sample, and await the results.

Your involvement with the company may end there, but two Penn State researchers say that for your DNA sequence — your genome — the journey has just begun.

What you may not realize is that when you get your DNA sequenced, in most cases you don’t own the sequence in a legal sense. The company that sequenced it does, or at least, in our current legal framework, it can act as if it does: It can sell or give your data to other organizations, which often are not bound by the agreement you signed with the sequencing company. Even if you pay for just the basic service that will allow you to sketch your ethnic background, the company may sequence your entire genome — and then pass that information along to others.

An open-ended journey

Your genome’s first stop will probably be at a research institution, where it joins a database of thousands or millions of other genomes that researchers use to pinpoint genes that correlate with specific diseases or health risks. Those institutions, in turn, may partner with businesses that use the data to develop commercial products and services they can sell, such as pharmacogenomic drugs.

“Many, many people do not understand what the potential uses of their DNA might be,” says Barbara Gray, emerita professor of business in the Smeal College of Business.

She and Forrest Briscoe, professor of management and organization at Smeal, have built their careers studying organizational actions and decision-making, especially in situations that involve controversy and ethical choices. Genomics presents them with a whole new kind of challenge and opportunity.

“It’s not too often that you get to study a new, emerging field of organizations, and on top of that, one that has a lot of interesting heterogeneity, including private sector firms from different kinds of industries, nonprofit organizations, universities, hospitals and health-care systems, and government agencies,” says Briscoe.

Like ethnographers documenting an unfamiliar culture, he and Gray attend gatherings of people in the field, ask questions, and try to discern the relationships and value systems that underlie the conduct they observe. They’re intrigued by the ethical dilemmas the field presents: How can the need for privacy co-exist with the need for researcher access to the data? “We’re interested to see how these problems materialize, how people perceive them, and what can and what will be done to address the potential downsides of genomics,” says Gray.

They hope to map the entire arena: who is creating genomic databases, where the data is housed, how it’s secured, how it’s being used and by whom, how all that is regulated, and who speaks for the individuals whose genomes are the raw material for the whole endeavor — all while the field continues to develop at a vertiginous pace.

Promise and peril

The first complete human genome sequence was published in 2003, after a 13-year international effort that involved hundreds of researchers and cost $2.7 billion. Since then, sequencing technology has gotten faster and much less costly. At the same time, the advent of supercomputing centers that can analyze and compare millions of genomes has turned the mountain of raw genomic data into a motherlode of invaluable information. National agencies, huge corporations, and tiny startups all are vying to amass the biggest and best collections of genomes and discover their marketable secrets. In 2017, investment in genomics businesses topped $3 billion.

Companies dealing with genomic information have attracted billions of dollars from investors. Credit: GettyImages / KEXINO. All Rights Reserved.

The gleaming promise that justifies this level of investment and excitement is health: the potential to create medical care tailor-made for each individual. That raises a privacy issue, though, because scientists looking for variants related to health and disease need access to entire genomes, not just the short segments used for general ancestry work.

Other people examining our genes would seem intrusive to many of us, but for others, like the parents of children with rare and incurable diseases, privacy is a lesser concern. Many such families freely share their genetic data and medical records in hopes that researchers will be able to identify gene variants responsible for the disease, and perhaps develop better therapies or even a cure.

The problem is that when dealing with your genetics, it’s never just about you, says Gray. “The DNA for your blood relatives is very similar to yours, so when you put your data in the system, you’re not only exposing yourself, you’re also exposing your progeny, your parents, uncles and aunts, and other people in your family, who did not sign a waiver. Your child may have a rare disease, but your brother’s family may not want the data to be available. How does the family make that decision?”

Privacy and profit

To allay fears, some companies separate genome data from the name, age, gender, and other personal details about the person who provided the genome. The idea is that if we just send in a saliva sample, and our name and other identifying information are kept separate from the genome data, we’ll be anonymous to the system and its users.

Unfortunately, says Briscoe, as we learn more about the genetics of personal traits, it becomes more difficult to keep our genomes anonymous. Scientists recently announced that they can predict what you look like, with fair accuracy, just from your DNA sequence, and Briscoe says there’s now a thriving cottage industry in creating algorithms that can identify a specific person within a supposedly anonymized collection of genomes.

Then there’s the possibility of being identified in ancestry databases even if you’ve never had your own DNA sequenced. In early 2018, police identified the serial rapist-murderer known as the Golden State Killer by comparing DNA from crime scenes with genome information and family trees in a publicly available ancestry website. The genome data came from relatives who had had their DNA sequenced for genealogy purposes. Several other cases have been solved in a similar way but to less fanfare.

“It’s the same kind of data that you can use for biomedical research, here used for tracking someone down,” says Briscoe. “That’s kind of violating their privacy. It’s good when we’re finding killers, but it might not be good for other reasons.”

As for those “other reasons,” say Briscoe and Gray, it’s important to know who will provide the genomic information — and who will be using and benefiting from it. For instance, some of the most valuable genomes will belong to members of indigenous groups. Because their community has evolved to adapt to local conditions and may have had little outside contact, they may carry gene variants that are rare or nonexistent in the wider population.

The Bajau people of Indonesia, like this child, live on and in the water. Their DNA holds clues to their exceptional ability to hunt underwater for many minutes, at depths of up to 200 feet. Credit: Logan Lambert on Unsplash. All Rights Reserved.

Briscoe points to the Bajau people in Indonesia, who have evolved larger spleens and other adaptations that allow them to make long dives in the sea to gather food. Their DNA has already been sequenced, with their permission, says Briscoe, and at some point someone’s going to make money from it. “But who is going to make it is the question,” he says. “Who’s going to be able to capitalize on these new developments? Something tells me the community is not going to be compensated.”

Who’s minding the store?

The physical location of DNA databases is also a concern. A single human genome contains about 7 GB of data, which means that a collection of thousands or millions of genomes runs to…almost unimaginable numbers. To store and analyze that much data requires heavy-duty computer firepower, usually referred to as “super-computing,” and massive storage space — which is almost always in “the cloud.” Such a reassuring word, cloud; it sounds airy and diffuse — and safe, because who can grab and run off with a cloud?

But the computing cloud is not floating in space. It’s housed in huge arrays of digital equipment, in specific buildings in specific countries, managed by employees of specific organizations. Data can be stored in any geographic location, but it’s better to keep close to the corporate home, or at least in the same country, where the rules are understood and the legal recourse if they’re not obeyed is clear.

“Right now the fastest-growing big databases are not in the U.S.,” says Briscoe. “We’re an innovation culture, but in this case countries that have nationalized health systems, like the U.K. and Australia, and China, which have a different system but can collect tons of data quickly, may move faster and discover more things more quickly.”

Forrest Briscoe, professor of management and organization in Penn State's Smeal College of Business, says the points where DNA data passes from one organization to another are especially vulnerable to lapses in security. Credit: Patrick Mansell / Penn State. Creative Commons.

The complexity of genomics — the sheer number of organizations and individuals who have access to the sequences, and the number of different jurisdictions they operate in — is a major issue. The hand-offs where data and security pass from one organization to another are especially vulnerable, and they are where responsibilities most need to be clarified, says Briscoe. “Any time you’ve got so many different actors handling data from so many different organizations, that’s a challenge.”

The newness of the field is also a concern. There are a lot of start-up businesses dealing with DNA; when some of those inevitably fail, what happens to the genomic data they hold?

It’s easy to say that every person and every organization with access to the data is responsible for keeping it safe — but when a breach comes, who bears legal responsibility? Gray likens it to trying to figure out who contributed what at industrial waste dumps. “When Superfund sites were being excavated, it was a morass trying to know whose barrel of gunk belonged to which company,” she says. “That’s going to be an analogy for who bears responsibility if there’s a breach of data when there are multiple partnerships, when DNA data has traveled from one organization to another to another and may have gone to a database in another country.”

Businesses, research institutions, and other organizations involved in genomics are rightly concerned about this: The potential costs of a breach are enormous.

Even now, the consequences of a breach of medical information are profound, says Gray. “Imagine what the consequences will be for a health-care provider if 450,000 of their patients’ genomic data is made available to the general public.”

Even people who work for companies or institutions that deal with genomics are concerned. “A chief information security officer for one of the organizations we spoke with said he’d given a bunch of these ‘spit kits’ as Christmas presents,” recalls Briscoe. “And subsequently, as he learned more about the field, was regretting that decision.”

The health-care model

Because researchers have a better chance of finding meaningful links if they have more genomes at their disposal, organizations from small startup companies to the National Institutes of Health are scrambling to develop bigger databases with more genomic information from more people. Their legal agreements with donors, and the security measures they employ to protect the data, are all over the map. A consortium of organizations is working to design protocols for storage and sharing of data, but to date, there is no industry-wide standard.

In medical settings, genomic data can get out the same ways that private health information systems can be breached now — sloppy office procedures, doctors sharing patient info, outright theft by someone with access. Oddly, genetic data, even that collected in a medical or scientific context, is not covered under HIPAA, the federal regulation that says health-care providers can’t reveal your medical information to others without your consent. Your doctor needs your permission to tell someone else your blood pressure, but medical researchers can send your entire genome to others without telling you about it.

There’s definitely an upside to making your genome available to researchers, say Briscoe and Gray: If your DNA isn’t included, it can’t be part of what the researchers discover. But there’s a dark side, as well. What happens if insurance companies or employers gain access to your DNA data? The federal Genetic Information Nondiscrimination Act, passed in 2008, bars employers from considering the genetic information of employees, job applicants, or members of their families, but a bill now before Congress, H.R. 1313 (the “Preserving Employee Wellness Programs Act”), would get around that by allowing employers to “invite” employees to provide a DNA sample, and to charge those who say no up to twice as much for health insurance. In other words, you don’t have to provide a DNA sample, but you may pay a lot more for insurance if you don’t.

Searching for norms

Gray and Briscoe think the various stakeholders in genomics come from such different backgrounds that it could be difficult for them to agree on norms for the field. Governance that starts within the health-care system tends to reflect the patient-oriented values of that system, but the regulatory and values systems in the business world may lead to different guidelines.

Barbara Gray, emerita professor in Penn State's Smeal College of Business, explores ethical aspects of corporate decision-making. She and Forrest Briscoe are concerned that many people do not understand the potential uses of their genetic information, or who might benefit from it. Credit: Patrick Mansell / Penn State. Creative Commons.

Two general frameworks stake out opposing positions. On one end of the spectrum are those who advocate for the rights of individuals to control who sees their genetic information and under what conditions. That might not be workable, says Gray. “You could say, ‘I don’t want any commercial companies to have access to my data.’ The problem is that they are partnering all over the place. So maybe Children’s Hospital from Town Y accesses your data, and that’s OK with you, but their partner is Pfizer.”

Others favor an open-source process where everyone’s genetic data is available for anyone to see, like a genomic World Wide Web. This could lead to faster advances in the field, says Briscoe, but in addition to conflicting with the notion of genomic privacy, this option isn’t likely to be popular with companies that have invested a lot in assembling their own databases. The solution, if the industry settles on just one framework, is likely to lie between these extremes.

Mostly absent from the discussion so far is the general public, the millions who have sent in their samples or are thinking of doing it due to health concerns, an interest in their ancestry, or simple curiosity. Gray and Briscoe would like to see the public become more aware of the issues around genomic privacy and take an active role in developing the norms that will govern how DNA information will be handled in the future.

Their study is still young, but Gray and Briscoe have already learned enough to reach one conclusion: Barring a medical emergency, they don’t plan to send in their own cheek swabs for sequencing.

“There is an infinite number of ways that people can capitalize on this information,” says Gray. “For good or for ill.”

An abridged version of this story appeared in the Fall 2018 issue of Research/Penn State magazine.

Last Updated March 1, 2019