GNN - Genome News Network  
Genetics and Genomics Timeline
Sequencing the genome of Haemophilus influenzae Rd

Early proponents of the Human Genome Project recognized both the importance of innovation and the promise of sequencing the DNA of various model organisms besides human beings. By the mid-1990s, however, the principal strategies had produced complete genomes of only a few viruses. Demonstrating the value of a new "shotgun" sequencing strategy, J. Craig Venter and colleagues published in July 1995 the first completely sequenced genome of a self-replicating, free-living organism: the bacterium Haemophilus influenzae Rd.

A circular representation of the H. influenzae Rd genome.

Venter, after leaving the National Institutes of Health in 1992, founded The Institute for Genomic Research (TIGR). In collaboration with Hamilton Smith of the Johns Hopkins University School of Medicine, who in 1970 had discovered site-specific restriction enzymes, he decided to sequence an organism much more complex than any yet attempted, using what they called "whole-genome random sequencing."

Haemophilus influenzae, known as H. flu for short, is a bacterium that can cause ear and respiratory infections, as well as meningitis in children. At 1.8 million base pairs, its genome was fairly typical in size for a bacterium but about ten times longer than any viral genome that had been sequenced.

"Whole-genome random sequencing," as used with H. flu, was a stepwise process that, in simplest terms, aimed to assemble a wholly sequenced genome from partly sequenced DNA fragments with the help of a computational model. This approach dispensed with the need for a preliminary physical map of the genome.

Copies of H. flu DNA were cut into pieces of random length, between 1,600 and 2,000 base pairs, to create a library of plasmid clones. The clones were then partly sequenced at both ends with automated sequencing machines, yielding reads each several hundred base pairs long. These base-pair sequences, with their many overlaps, became the raw data that was entered into the computer. Smaller libraries of longer fragments, 15,000 to 20,000 base pairs, were also developed.
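The library construction and end sequencing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TIGR laboratory protocol: the function names, the 400-base read length (the text says only "several hundred"), and the random toy genome are all hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def sample_clone_reads(genome, n_clones, insert_range=(1600, 2000),
                       read_len=400, seed=0):
    """Cut random inserts from a circular genome and read both ends.

    insert_range follows the 1,600 to 2,000 bp range in the text;
    read_len is an ASSUMPTION standing in for "several hundred".
    """
    rng = random.Random(seed)
    size = len(genome)
    clones = []
    for _ in range(n_clones):
        start = rng.randrange(size)
        length = rng.randint(*insert_range)
        # Wrap past the end of the string because the genome is circular.
        insert = "".join(genome[(start + i) % size] for i in range(length))
        clones.append({
            "start": start,
            "insert_len": length,
            "forward": insert[:read_len],
            # The far end is read from the opposite strand.
            "reverse": reverse_complement(insert[-read_len:]),
        })
    return clones

if __name__ == "__main__":
    rng = random.Random(1)
    toy_genome = "".join(rng.choice("ACGT") for _ in range(20_000))
    clones = sample_clone_reads(toy_genome, n_clones=100)
    print(len(clones), "clones,", 2 * len(clones), "end reads")
```

Because the two reads from one clone lie a known distance apart, each pair carries information about order and spacing as well as sequence, which is what makes the longer-fragment libraries useful during assembly.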

A software tool, the TIGR assembler, compared, clustered, and matched the many thousands of fragments to assemble the genome. The most informative and nonrepeating sequences were identified first, and repeating fragments were compared next. The longer fragments helped to order some of the very repetitive and almost identical sequences. Small physical gaps that remained after the TIGR assembler had done its work were closed with several auxiliary strategies.
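The core idea of joining fragments by their overlaps can be illustrated with a toy greedy merger. This is a minimal sketch, not the TIGR assembler's actual algorithm: it assumes error-free reads all taken from one strand, ignores repeats and paired-end information, and the function names and the short example sequence are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b.

    Returns 0 if that overlap is shorter than min_len.
    """
    start = 0
    while True:
        start = a.find(b[:min_len], start)
        if start == -1:
            return 0
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1

def drop_contained(seqs):
    """Remove duplicates and sequences fully contained in another one."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != t and s in t for t in unique)]

def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = drop_contained(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    olen = overlap(a, b, min_len)
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        if best_len == 0:
            return contigs  # nothing left to join; gaps may remain
        merged = contigs[best_i] + contigs[best_j][best_len:]
        rest = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])

if __name__ == "__main__":
    toy = "ATGGCGTACCTGAATCGGACTTAGC"
    reads = [toy[10:20], toy[0:10], toy[15:25], toy[5:15]]
    print(greedy_assemble(reads, min_len=4))
```

When the reads do not overlap enough, the function returns more than one contig, which corresponds to the gaps that had to be closed by other means. A greedy merger like this one can also join the wrong pieces when a sequence repeats, which is why repetitive regions needed the extra ordering information from longer fragments.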

Assembling the H. flu genome from 24,304 DNA fragments was a considerable achievement, and to some observers a surprise. The genome contains 1,830,137 base pairs, in which 1,749 genes are embedded. Once the genome was assembled, the genetic coding regions were located and compared to known genes, and a detailed map was developed.
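A rough calculation shows why this many fragments could be expected to cover nearly the whole genome. The sketch below takes the fragment count and genome size from the text but assumes a 450-base read length, since the text says only "several hundred." It also treats each fragment as one read and uses an idealized model of uniformly random reads, in which the fraction of bases left uncovered is about e^(-coverage). None of this is the authors' own analysis.

```python
import math

def coverage_estimate(n_reads, read_len, genome_size):
    """Return (average coverage, expected uncovered fraction).

    Uses the idealized Poisson model for uniformly random reads.
    """
    coverage = n_reads * read_len / genome_size
    uncovered_fraction = math.exp(-coverage)
    return coverage, uncovered_fraction

GENOME_SIZE = 1_830_137   # base pairs, from the text
N_FRAGMENTS = 24_304      # fragments assembled, from the text
ASSUMED_READ_LEN = 450    # ASSUMPTION: not stated in the text

coverage, uncovered = coverage_estimate(N_FRAGMENTS, ASSUMED_READ_LEN, GENOME_SIZE)
print(f"coverage ~ {coverage:.1f}x")
print(f"expected uncovered ~ {uncovered * GENOME_SIZE:.0f} bp")
```

Under these assumptions each base is sequenced about six times on average, and only a fraction of a percent of the genome is expected to be missed, which is consistent with the small gaps mentioned above.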

Success in sequencing Haemophilus influenzae Rd (the project took about a year) demonstrated that random shotgun sequencing could be applied to whole genomes with speed and accuracy. Within months the same method was applied to another bacterium, Mycoplasma genitalium, and genomes of other organisms soon followed.

Fleischmann, R.D. et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496-512 (July 28, 1995).
