Clues to the human genome puzzle
By Haleh V. Samiei
June 23, 2000
Now the sequence of all three billion DNA bases (the four-letter alphabet) of the human genome is in hand. It will be difficult, however, to make sense of the volumes of information generated without the genomes of other organisms for comparison. To complement research on the human genome, scientists have been sequencing genomes of a number of model organisms from yeast to roundworms to mice. Greg Elgar, of the United Kingdom Human Genome Mapping Project (UK HGMP) Resource Centre in Hinxton, England, has been working with Sydney Brenner, now at The Molecular Sciences Institute in Berkeley, California, to study the genome of the pufferfish.
Elgar and his colleagues have focused their sequencing efforts on certain complex human disease genes rather than on determining the entire sequence of the pufferfish genome. They believe that they can find information about these genes faster and more efficiently by looking at the pufferfish genome rather than the human genome. So far, they have found the genes they have been looking for, including the pufferfish counterparts of the genes for Huntington's and Alzheimer's diseases.
Several advantages in studying the genome of the pufferfish contribute to the success of Brenner and his colleagues. One advantage is that it is much faster to get from one end of a pufferfish gene to the other end and from one gene to the next when determining DNA sequence on continuous stretches of chromosomes. This is because the pufferfish genome is only about an eighth of the size of the human genome: 400 million DNA bases. But the pufferfish is not deficient in its total number of genes. Rather, the pufferfish genome contains less of what seems to be irrelevant DNA, sometimes called "junk." This junk DNA separates genes from one another like the space that separates words in a sentence. It also breaks genes into sections like syllables. The human genome is diluted with so much junk DNA that genes are contained in only three percent of it, compared to fifteen percent in the pufferfish.
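The figures in the paragraph above can be checked with a little arithmetic. The short Python sketch below uses only the rounded numbers quoted in the article (three billion and 400 million bases, three and fifteen percent); the variable and function names are illustrative, and the results are rough estimates rather than measured values.

```python
# Rounded figures quoted in the article; treat the results as rough estimates.
HUMAN_GENOME_BASES = 3_000_000_000
PUFFERFISH_GENOME_BASES = 400_000_000
HUMAN_GENE_PERCENT = 3
PUFFERFISH_GENE_PERCENT = 15


def gene_containing_bases(genome_bases: int, gene_percent: int) -> int:
    """Bases that fall inside genes, using integer arithmetic to avoid rounding noise."""
    return genome_bases * gene_percent // 100


size_ratio = HUMAN_GENOME_BASES / PUFFERFISH_GENOME_BASES
human_gene_bases = gene_containing_bases(HUMAN_GENOME_BASES, HUMAN_GENE_PERCENT)
pufferfish_gene_bases = gene_containing_bases(
    PUFFERFISH_GENOME_BASES, PUFFERFISH_GENE_PERCENT
)

print(f"Human genome is about {size_ratio:.1f} times larger")
print(f"Gene-containing DNA, human: {human_gene_bases:,} bases")
print(f"Gene-containing DNA, pufferfish: {pufferfish_gene_bases:,} bases")
```

With these rounded inputs, the two genomes differ about 7.5-fold in total size, yet the amount of gene-containing DNA is of the same order: roughly 90 million bases in humans and 60 million in the pufferfish.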
Another advantage to studying the pufferfish is that, compared to other important model organisms, including fruit flies, the pufferfish is closer to humans on the evolutionary scale, and will have more of the same genes. According to Elgar and Brenner, the sequence of the pufferfish genome will be more useful in filling any gaps that may remain in the human genome. Elgar believes that "one way to bridge those gaps, if you have conserved regions, is simply to look at them in other organisms," such as the pufferfish. Conserved regions are parts of the genome that are similar between organisms separated by millions of years of evolution, because they are important to the survival of the organisms. Sequences that are not important will slowly change in a random way, accumulating mutations. This is what happens to junk sequences.
Compared to organisms that are closer to humans on the evolutionary scale, such as laboratory mice and primates, the pufferfish has the advantage of a compact genome with less room for junk sequences. In addition, the pufferfish is distant enough from humans, compared to mammals, for the junk sequences to become random. When comparing human and pufferfish genomes, "if you find a conserved homology block, even quite small conserved blocks, they seem to have some significance," says Elgar. This makes it easier to recognize an important sequence among oceans of irrelevant DNA.
Given these advantages, Brenner and collaborators use the pufferfish in several types of experiments. In one, they use sensitive computer software to compare the sequences separating and interrupting genes that are shared by the pufferfish and other organisms. With this approach, they have been able to recognize short stretches of conserved sequences among longer stretches of random sequences. They tested the significance of each conserved sequence by chemically altering the sequence and studying how the change affects a laboratory mouse. The mouse is very useful in these studies because it is possible to inject genes from other organisms into the developing embryo. If the mouse has a gene similar to the one being tested, it will treat the injected gene as one of its own. It is then possible to see if alteration of conserved sequences separating and interrupting genes will affect when, where, and in what amounts the protein product of the gene is made. Using this approach, Brenner's team has been able to show the significance of several conserved sequences in the DNA separating and interrupting pufferfish genes.
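The computer comparison described above can be illustrated with a toy example. The Python sketch below is not the software Brenner's team used, which the article does not name. It is a deliberately simplified stand-in that reports only exact, uninterrupted matches between two sequences, whereas real comparison tools tolerate mismatches and gaps. The function name `conserved_blocks`, the `min_len` threshold, and both example sequences are made up for illustration.

```python
def conserved_blocks(seq_a: str, seq_b: str, min_len: int = 8):
    """Return (start_a, start_b, length) for each maximal exact match
    of at least min_len bases shared by the two sequences."""
    blocks = []
    prev = [0] * (len(seq_b) + 1)  # match-run lengths ending in the previous row
    for i in range(1, len(seq_a) + 1):
        curr = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            if seq_a[i - 1] != seq_b[j - 1]:
                continue
            curr[j] = prev[j - 1] + 1
            # The run is maximal if it cannot be extended by one more base.
            at_end = i == len(seq_a) or j == len(seq_b)
            if curr[j] >= min_len and (at_end or seq_a[i] != seq_b[j]):
                blocks.append((i - curr[j], j - curr[j], curr[j]))
        prev = curr
    return blocks


# Invented toy sequences: a shared block surrounded by unrelated bases.
human_like = "TTGACCATGGCGTACGTTAGCAAT"
fish_like = "CCGGATGGCGTACGTAAC"

for start_a, start_b, length in conserved_blocks(human_like, fish_like):
    print(start_a, start_b, human_like[start_a:start_a + length])
```

The table-filling approach compares every position in one sequence with every position in the other, which is fine for short examples but far too slow for whole genomes. It is meant only to show why a shared block stands out when the surrounding sequence has drifted apart.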
In other experiments, Brenner's team studied certain large complex human disease genes. For example, in 1995 Elgar and colleagues identified and sequenced the pufferfish counterpart of the human Huntington's disease gene, which had already been sequenced. The pufferfish gene turned out to be only 23,000 DNA bases long, seven and a half times shorter than the human gene. Although the pufferfish gene has the same sixty-seven interruptions, they are rarely over 1,000 DNA bases, compared to interruptions as long as 12,500 DNA bases in the human gene for Huntington's. The actual gene, however, is very similar to the human gene and provides no further information about the protein.
In another example, Mike Trower, who left Brenner's lab to join the Glaxo Wellcome Medicines Research Centre in England, collaborated with Brenner to identify and sequence one of the Alzheimer's disease genes in both human and pufferfish genomes. They knew which human chromosome the gene was on, and what other sequenced genes were in the vicinity. The usual way to get to a gene of interest, when the only available information is about the neighboring genes, is to start at the DNA location of one of the neighbors and "walk" away until the telltale signs of the next gene appear. This may be a long hike with the human genome because of all the junk DNA.
Trower devised a more efficient approach. He assumed that genes are in the same order on the human and pufferfish chromosomes. This phenomenon is known as conserved synteny and is seen between the genomes of more closely related organisms, such as human and mouse. If this assumption held, Trower would reach the Alzheimer's disease gene much faster by following the lead to the human gene but sequencing the more compact pufferfish genome.
Although other teams with a head start reached the human Alzheimer's disease gene first, Trower's team did show a powerful new way of "fishing out" human genes. They found that the order of genes is indeed conserved in the area of the Alzheimer's disease gene. They also found that the genes in that area are concentrated in less DNA, with three of them occupying only 12,400 DNA bases compared to over 600,000 DNA bases in the human genome.
Elgar and colleagues have also found other areas of conserved synteny between pufferfish and human genomes. One such area corresponds to a site on human chromosome 11. This site, known as the WAGR region, has three unrelated genes spanning one and a half million DNA bases in humans and less than 100,000 bases in pufferfish. Both the order and the direction of the unrelated genes in this region are conserved between the two organisms. In humans, deletions in the WAGR region lead to Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation.
This conserved synteny, however, does not seem to apply to all shared genes between human and pufferfish genomes. Elgar believes that "the evidence at the moment suggests that there are probably quite large areas of conserved synteny...but there will be also some areas that will be very disrupted."
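The idea of conserved synteny discussed above, with shared genes appearing in the same order and direction in two genomes, can be expressed as a simple check. The Python sketch below is only an illustration: the gene names (`geneA`, `geneB`, `geneC`) are placeholders rather than the actual genes of any region, the `+`/`-` strand labels are an assumed way of recording direction, and the function name is hypothetical.

```python
def has_conserved_synteny(region_a, region_b):
    """Return True if the genes shared by both regions appear in the same
    order and on the same strand. Each region is a list of (name, strand)."""
    names_a = {name for name, _ in region_a}
    names_b = {name for name, _ in region_b}
    shared_a = [gene for gene in region_a if gene[0] in names_b]
    shared_b = [gene for gene in region_b if gene[0] in names_a]
    # Require at least one shared gene so unrelated regions are not reported.
    return bool(shared_a) and shared_a == shared_b


# Placeholder genes; "+" and "-" mark the direction in which each gene is read.
human_region = [("geneA", "+"), ("geneB", "-"), ("geneC", "+")]
pufferfish_region = [("geneA", "+"), ("geneB", "-"), ("geneC", "+")]
disrupted_region = [("geneB", "-"), ("geneA", "+"), ("geneC", "+")]

print(has_conserved_synteny(human_region, pufferfish_region))
print(has_conserved_synteny(human_region, disrupted_region))
```

This check says nothing about distance, so a region can count as conserved even when the genes sit much closer together in one genome than in the other, as the article describes for the pufferfish. It also treats a region that has been flipped end to end as disrupted, which a more careful comparison would handle separately.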
Elgar and colleagues have now obtained over 50,000 random samples of pufferfish genome sequences, using a new, fast and relatively inexpensive "sequence scanning" technique. The sequence information from pufferfish will continue to help decipher the meaning of the billions of bases of human DNA.