Sort It Out

Sequencing DNA is not as overwhelming as it once was.

"The technology is really amazing," said Deb Grove of Penn State's Nucleic Acid Facility. "When I started here a few years ago the 'read length' for a DNA template was 300 to 400 nucleotides long. Today we can obtain read lengths of over 800 nucleotides, and can sequence templates up to 200k."

The first step in sequencing is to cut the DNA into short lengths and clone them using the PCR technique (see "An Amazing Reaction"). Said Nina Fedoroff, director of Penn State's Life Sciences Consortium, "Cloning is the heart of genetic engineering. You can't sequence a single molecule; it's like figuring out how a molecule of sugar tastes. You need to get at least a quarter teaspoonful in your mouth to taste anything, and a quarter teaspoonful has millions and millions of molecules in it. The same problem confronts the scientist trying to find out the sequence of DNA. You have to have enough of it to actually carry out your analytical tools."

Once you have enough copies, you heat them to break the double helix into two strands: If DNA is like a spiral staircase, the nucleotides pair up to make the steps. The four nucleotides, or bases—adenine (A), thymine (T), guanine (G), and cytosine (C)—are complementary (C always bonds with G, and A always bonds with T) because C and G need three hydrogen bonds to hold them together, while A and T need two. These hydrogen bonds, Fedoroff said, are "sort of like Scotch tape. They hold things together but it's easy to pull them apart." (By comparison, the backbone of a DNA strand, made up of alternating groups of sugars and phosphates, "hangs together at high temperatures—it's more like crazy glue.")
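For readers who think in code, the pairing rule can be sketched in a few lines of Python. This is only an illustration: the names PAIRS, BONDS, complement, and hydrogen_bonds are made up for this example, and the sketch pairs bases position by position without modeling the opposite directions in which the two real strands run.

```python
# Pairing rule from the text: A bonds with T, and C bonds with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hydrogen bonds each base forms with its partner: two for A-T, three for C-G.
BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def complement(strand):
    """Return the partner base for every base in the strand, in the same order."""
    return "".join(PAIRS[base] for base in strand)


def hydrogen_bonds(strand):
    """Count the hydrogen bonds holding this strand to its partner strand."""
    return sum(BONDS[base] for base in strand)


print(complement("GATTACA"))
print(hydrogen_bonds("GATTACA"))
```

Heating the sample breaks every one of the bonds counted by hydrogen_bonds while leaving the letters of each strand in order, which is why the two strands come apart intact.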

Next you place the single strands into a solution along with four other ingredients: a primer, DNA polymerase, and two forms of free nucleotides, normal ones and so-called "stop" nucleotides (dideoxynucleotides). Each of the four stop nucleotides is labeled with a different fluorescent color.

The primer, a strand of DNA about 20 nucleotides long, gets things started. "DNA polymerase doesn't like to put the first two nucleotides together," Fedoroff explained, "but if you give it a place to hang onto, it'll go right to that place, which is extremely convenient for the molecular biologist." The primer, synthesized to match the end of the DNA strand you are sequencing, attaches, leaving an open hydroxyl group for the next nucleotide to grab onto. Then the polymerase takes over. It reads along the original DNA strand, picking up a free nucleotide from the solution to pair with each one it reads. It continues until it randomly picks up a "stop" nucleotide. These lack the hydroxyl group needed for the next nucleotide to grab on.
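The copying step can be imitated with a small simulation. The sketch below is a toy model under stated assumptions, not a description of the real chemistry: the function name extend_once, the stop_chance parameter, and the use of a lowercase letter to mark a stop nucleotide are all invented for illustration, and strand direction is ignored.

```python
import random

# Pairing rule: A bonds with T, and C bonds with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend_once(template, primer_len, stop_chance, rng):
    """Copy the template past the primer until a stop nucleotide is picked up.

    Returns only the newly added bases. A lowercase final letter marks a
    stop nucleotide (an illustrative convention, not standard notation).
    """
    added = []
    for base in template[primer_len:]:
        partner = PAIRS[base]
        # Assumption: each pickup has the same chance of being a stop nucleotide.
        if rng.random() < stop_chance:
            added.append(partner.lower())
            break  # nothing can attach after a stop nucleotide
        added.append(partner)
    return "".join(added)


rng = random.Random(2001)  # fixed seed so repeated runs give the same fragments
template = "ACGTACGTACGTACGTACGT" + "GATTACAGGC"  # first 20 bases sit under the primer
for _ in range(5):
    print(extend_once(template, 20, 0.3, rng))
```

Run on many copies of the same template, the loop produces fragments of many different lengths, each ending in a marked stop nucleotide.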

Because the polymerase picks up a stop nucleotide at random, copying stops at a different location on each copy. You separate the strands again and put the new complementary strands into a sequencing gel. When you run electricity through the gel, the long pieces of DNA, which move slowly, remain near the top, while the short pieces run to the bottom. Moving from the bottom of the gel to the top, each fragment is one nucleotide longer than the one below it. Each fragment ends with a stop nucleotide tagged with one of four fluorescent colors; under a laser, the dye molecule is excited and emits light. A sequencing machine, which can scan 96 gel lanes 12,000 times in 12 hours, detects the color of the light and translates it into a letter: A, T, G, or C.
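The readout amounts to sorting fragments by length and reading off one color per length. Here is a minimal sketch of that idea, assuming each fragment has already been reduced to a (length, color) pair. The color assignments in COLOR_TO_BASE and the function name read_gel are hypothetical; a real instrument defines its own dye-to-base mapping.

```python
# Hypothetical dye assignment, chosen only for this example.
COLOR_TO_BASE = {"green": "A", "red": "T", "yellow": "G", "blue": "C"}


def read_gel(fragments):
    """Turn (length, color) pairs into a sequence, shortest fragment first.

    Many copies stop at the same position, so fragments of equal length are
    assumed to carry the same color and only the first one seen is kept.
    """
    color_at_length = {}
    for length, color in fragments:
        color_at_length.setdefault(length, color)
    # Shortest fragments sit at the bottom of the gel and are read first.
    return "".join(COLOR_TO_BASE[color_at_length[n]] for n in sorted(color_at_length))


fragments = [(3, "yellow"), (1, "green"), (2, "red"), (1, "green"), (4, "blue")]
print(read_gel(fragments))  # prints ATGC
```

The letters that come out spell the newly made strand, which is the complement of the original template.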

Deborah Grove, Ph.D., directs the Nucleic Acid Facility in the Life Sciences Consortium, 210 Wartik Lab, University Park, PA 16802; 814-865-3332; dsg4@psu.edu. Nina Fedoroff, Ph.D., directs Penn State's Life Sciences Consortium; nvf1@psu.edu.

Last Updated September 01, 2001