How does DNA sequencing work?
Regardless of the approach to the genome as a whole, the actual process
of DNA sequencing is the same. Sequencing employs a technique known
as electrophoresis to separate pieces of DNA that differ in length by
only one base.

[Figure: Lab with sequencing machines. Courtesy of Celera Genomics]
In electrophoresis, DNA to be sequenced is placed at one end of a gel, a
slab of a gelatin-like substance. (A major part of DNA sequencing simply
comes down to making a bunch of Jell-O.) Electrodes are placed at either
end of the gel and an electrical current is applied, causing the DNA molecules
to move through the gel. Smaller molecules move through the gel more rapidly,
so the DNA molecules become separated into different bands according to
their size. The catch is that electrophoresis can only separate about
500 bases into clear bands, hence the need for chopping DNA up into
small pieces in order to sequence it.
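The size-based separation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the constant `MAX_RESOLVABLE_BASES` are hypothetical, and the model ignores everything about a real gel except fragment length.

```python
from collections import defaultdict

# Approximate limit quoted in the text; an illustrative constant, not a hard rule.
MAX_RESOLVABLE_BASES = 500

def separate_into_bands(fragments):
    """Group fragments by length, shortest (fastest-moving) band first."""
    bands = defaultdict(list)
    for fragment in fragments:
        bands[len(fragment)].append(fragment)
    # Shorter fragments move through the gel faster, so they lead the way.
    return [(length, bands[length]) for length in sorted(bands)]

def is_resolvable(fragment):
    """True if the fragment is short enough to land in a clear band."""
    return len(fragment) <= MAX_RESOLVABLE_BASES

for length, members in separate_into_bands(["TAG", "T", "TAGA", "TA", "GC"]):
    print(length, members)
```

Note that "TA" and "GC" end up in the same band: in this sketch, as in the gel, separation depends on length alone, not on which bases a fragment contains.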
Until the late 1980s, electrophoresis gels were always read by a person.
Each piece of DNA was attached to a radioactive label, and an X-ray picture
was made of the gel to make the positions of the DNA bands visible. Painstakingly
analyzing the rows and columns of bands on the gel, a person could determine
the sequence of the DNA.
But this process was slow, tedious, and fraught with error. Today's
large-scale sequencing projects would be impossible without automatic
sequencing machines, which became commercially available in the late
1980s and have made DNA sequencing much quicker and more reliable. In
one year, a person can produce a finished sequence of 20,000 to 50,000
bases; a machine can produce a rough draft of a sequence that long in
just a few hours.
Most automatic sequencing machines have a design based closely on the
original, manual sequencing process. To run the machine, a technician
pours gel into the space between two glass plates set less than half
a millimeter (two-hundredths of an inch) apart. After the gel sets,
DNA is loaded into each of the 96 lanes (just like the lanes on
a highway or in a pool) that run the length of the 30-cm (about
1 foot) gel. As the DNA pieces move through the gel, the sequencing
machine reads the order of DNA bases and stores this information in
its computer memory.
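As a rough back-of-the-envelope check on the machine's output, one can multiply the number of lanes by the length of sequence each lane can resolve. The sketch below assumes every lane yields one stretch of about 500 bases (the electrophoresis limit mentioned earlier); the names are hypothetical and real yields vary from run to run.

```python
LANES_PER_GEL = 96      # lanes on one slab gel, from the text
BASES_PER_LANE = 500    # assumed: roughly the clear-band limit of electrophoresis

def bases_per_run(lanes=LANES_PER_GEL, bases_per_lane=BASES_PER_LANE):
    """Estimate the rough-draft bases produced by a single run of the gel."""
    return lanes * bases_per_lane

print(bases_per_run())  # 48000
```

Under these assumptions a single run gives about 48,000 bases, which falls at the upper end of the 20,000 to 50,000 bases quoted above.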
In some newer machines, known as capillary sequencers, DNA is run through
an array of 96 gel-filled capillaries (glass tubes about the width
of a human hair) rather than through a slab of gel. But just like
the slab-gel machines, capillary machines read the base sequence as
DNA moves through the gel.

[Figure: Close-up of capillaries from a capillary sequencing machine. Courtesy of Celera Genomics]
Capillary sequencers can sequence each piece of DNA about twice as
fast as slab-gel machines. Moreover, they are fully automated: a
robotic arm places the DNA into the top of the capillaries. The machine
automatically fills the capillaries with gel and cleans them between
runs, so only a minimum of human attention (about 15 minutes a day) is
necessary to refill the containers of gel, water, and other solutions
located in the machine's "guts." On the other hand, sequencing machines
are expensive and capillary sequencers are so new that some labs have
had trouble getting them to work at top efficiency. Most large-scale
sequencing projects use a combination of slab-gel and capillary machines.
How does the sequencing machine know whether a base is an A, C, G, or T?
Sequencing machines can't "see" DNA directly, so scientists must use
a complex set of procedures to prepare DNA for sequencing. When DNA
is finally in a form that the machines can read, it has been chopped
up, copied, chemically modified, and tagged with fluorescent dyes corresponding
to the four different DNA bases, or genetic letters.
Before it is sequenced, a piece of DNA is copied many times, then divided
into four batches in preparation for another round of copying. In this
second round, a small amount of chemically modified base is added to each
batch: that is, modified T to one batch, A to another, and so on.
When one of these modified bases is incorporated into a DNA molecule,
the chain of bases stops growing. The result of all this is that one batch
of DNA will contain only pieces that end in T, another only pieces that
end in A, a third only pieces that end in G, and the fourth batch only
pieces that end in C.
In the second round of copying, a different fluorescent dye is also added
to each batch of DNA. Thus, every piece of DNA that ends with T has a
blue dye tag, for example; those that end in A have a red dye tag; those
that end in G have a yellow dye tag; and those that end in C have a green
dye tag.
Suppose you apply that procedure to this sequence of DNA:
TAGACT
At the end of the second round of copying, each batch will contain
the following pieces of DNA:
1: blue-T, blue-TAGACT
2: red-TA, red-TAGA
3: yellow-TAG
4: green-TAGAC
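The four batches can be reproduced with a short Python sketch. It follows the simplified picture used in this example: every copy starts at the first base, stops at a modified base, and carries the dye for the base it ends in. The function name `terminated_fragments` and the dictionary `DYE_FOR_BASE` are hypothetical names chosen for illustration.

```python
# Dye colors as assigned in the text's example.
DYE_FOR_BASE = {"T": "blue", "A": "red", "G": "yellow", "C": "green"}

def terminated_fragments(sequence):
    """Return, for each base, the fragments that stop at that base."""
    batches = {base: [] for base in DYE_FOR_BASE}
    for end in range(1, len(sequence) + 1):
        fragment = sequence[:end]
        # The last base of the fragment decides which batch it belongs to.
        batches[fragment[-1]].append(fragment)
    return batches

for base, fragments in terminated_fragments("TAGACT").items():
    labelled = ", ".join(f"{DYE_FOR_BASE[base]}-{f}" for f in fragments)
    print(f"{base}: {labelled}")
```

Running it on TAGACT prints the same four batches listed above, one fragment for every position in the sequence.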
Into one lane or capillary of a sequencing machine goes a mixture of
DNA from all four batches. Because smaller molecules move through the
gel faster, the DNA pieces come through the gel in increasing order
of size, each piece one base longer than the last.
Thus, in this example, the first piece to make it all the way through
the gel is a T attached to a blue dye tag; the next piece is TA with a
red dye tag; next is TAG attached to a yellow dye tag; and so on.
As the pieces emerge from the gel, they move past a laser that causes
the dye molecules to fluoresce. A detector reads the color of the fluorescence (blue,
red, yellow…) and a software program matches the color to the corresponding
base (T, A, G…). In this way, the sequence grows base by base. Each
sequence of 500 bases or so that a sequencing machine generates is known
as a "read."
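The color-to-base matching step can be sketched as follows. This is a toy model, not the software that ships with any sequencing machine: each fragment is reduced to a hypothetical `(length, dye_color)` pair, and sorting by length stands in for the order in which pieces emerge from the gel.

```python
# Inverse of the dye assignment used in the text's example.
BASE_FOR_DYE = {"blue": "T", "red": "A", "yellow": "G", "green": "C"}

def call_bases(tagged_fragments):
    """Rebuild a sequence from (length, dye_color) pairs given in any order."""
    # Shorter pieces reach the detector first, so sort by length.
    in_arrival_order = sorted(tagged_fragments, key=lambda item: item[0])
    # Each color seen by the detector maps to exactly one base.
    return "".join(BASE_FOR_DYE[color] for _, color in in_arrival_order)

# The mixture from the TAGACT example, deliberately shuffled.
mixture = [(6, "blue"), (1, "blue"), (4, "red"),
           (2, "red"), (3, "yellow"), (5, "green")]
print(call_bases(mixture))  # TAGACT
```

The sketch assumes exactly one fragment per length and a clean color signal; real traces are noisier, which is one source of the mistakes and ambiguities mentioned in the next section.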
What happens after DNA sequences come out of the sequencing machines?
An automatic sequencing machine spits out what genome scientists call
"raw" sequence. In raw sequence, the reads or short DNA sequences are
all jumbled together, like the pieces of a jigsaw puzzle in a just-opened
box. Inevitably, raw sequence also contains a few gaps, mistakes, and
ambiguities.
The process of polishing that raw sequence, transforming the fragmented
rough draft into a long, continuous final product without breaks or errors, is
called finishing. Finishing involves both assembly, in which individual
reads are hooked together in the proper order, and a laborious process
of double-checking and refining the sequence to eliminate mistakes and
close gaps. Finishing often takes longer than the sequencing itself.
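To give a feel for the assembly step, here is a minimal sketch that repeatedly merges the two reads with the longest overlap. This greedy merging is a toy stand-in chosen for illustration, not the method used by real assembly software; it assumes error-free reads, and the function names and the `min_length` parameter are hypothetical.

```python
def overlap(left, right, min_length=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_length - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def greedy_assemble(reads, min_length=3):
    """Merge reads pairwise by best overlap until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best = (0, None, None)
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i != j:
                    size = overlap(left, right, min_length)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # nothing overlaps: the remaining pieces are separated by gaps
        merged = reads[i] + reads[j][size:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

print(greedy_assemble(["CGTTAGC", "ATGGCGTA", "CGTACGTT"]))
# ['ATGGCGTACGTTAGC']
```

When the reads do not overlap at all, the function returns more than one piece, which corresponds to the gaps that finishing has to close by other means.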