A nationwide team of researchers, funded in part by the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health (NIH), has produced the first sequence-based map of large-scale structural variation across the human genome. The work, published in the journal "Nature", provides a starting point to examine how such DNA variation contributes to human health and disease.
Other recently created maps, such as the HapMap, have catalogued the patterns of small-scale variations in the genome that involve single DNA letters, or bases. However, the scientific community has been eagerly awaiting the creation of additional types of maps in light of findings that larger-scale differences account for a great deal of the common genetic variation among individuals and between populations, and may account for a significant fraction of disease. While previous work has identified structural variation in the human genome, a sequence-based map provides much finer resolution and more precise location information.
Large-scale structural variations are differences in the genome among people that range from a few thousand to a few million DNA bases. Some are gains or losses of stretches of genome sequence. Others appear as rearrangements of stretches of sequence.
Already, some structural variations have been linked to individual differences in susceptibility to human immunodeficiency virus (HIV) infection and in risk of coronary heart disease, as well as to schizophrenia and autism. Researchers hope the new map will open the door to uncovering the roles of structural variants in even more conditions.
"It is important that we understand how changes in the human genome, both small and large, contribute to individual differences in susceptibility to diseases," said Francis Collins, M.D., Ph.D. "This map is a valuable starting point for researchers studying the normal patterns of structural variation and how differences in those patterns affect human health."
Researchers constructed the structural variation map by partially sequencing the genomes of eight people: four people of African descent, two of Asian descent and two of European descent. The samples were collected as part of the International HapMap Project. No medical or personal identifying information was obtained from the donors, but the samples were labeled by population group.
Sequence data were collected from each end of roughly 1 million random small pieces of DNA from each individual's genome. These end sequences were compared to the reference sequence of the human genome completed in 2003. Where precise matches did not occur, the scientists inferred that there was a structural difference between the volunteer's sample and the reference sequence of the human genome.
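The comparison step described above can be illustrated with a minimal Python sketch. It is not the researchers' actual pipeline: the `EndPair` record, the `classify_pair` function, the labels it returns, and the `expected_size` and `tolerance` values are all hypothetical, chosen only to show how a pair of end sequences that maps unexpectedly to the reference might be flagged.

```python
from dataclasses import dataclass


@dataclass
class EndPair:
    """Reference positions where the two ends of one DNA piece mapped (hypothetical record)."""
    chrom_a: str
    pos_a: int
    strand_a: str  # "+" or "-"
    chrom_b: str
    pos_b: int
    strand_b: str  # "+" or "-"


def classify_pair(pair, expected_size=40_000, tolerance=5_000):
    """Label one end pair as concordant or as a candidate structural difference.

    expected_size and tolerance are illustrative assumptions, not figures from the study.
    """
    if pair.chrom_a != pair.chrom_b:
        return "different_chromosomes"
    # The two ends of one piece are assumed to map to opposite strands.
    if pair.strand_a == pair.strand_b:
        return "orientation_mismatch"
    span = abs(pair.pos_b - pair.pos_a)
    if span > expected_size + tolerance:
        # Ends lie farther apart on the reference than the piece is long:
        # the sample may lack sequence that the reference contains.
        return "possible_deletion"
    if span < expected_size - tolerance:
        # Ends lie closer together on the reference than the piece is long:
        # the sample may carry sequence that the reference lacks.
        return "possible_insertion"
    return "concordant"


# Made-up example pairs, one per outcome.
pairs = [
    EndPair("chr1", 100_000, "+", "chr1", 140_500, "-"),
    EndPair("chr1", 100_000, "+", "chr1", 190_000, "-"),
    EndPair("chr2", 500_000, "+", "chr2", 510_000, "-"),
    EndPair("chr3", 1_000, "+", "chr3", 41_000, "+"),
]
labels = [classify_pair(p) for p in pairs]
print(labels)
```

In a sketch like this, a single flagged pair would be weak evidence on its own; requiring several independent pairs to flag the same region is one plausible way to reduce false calls.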
Besides revealing new variations, the map provides a more detailed look at the locations of nearly 1,700 structural variations, half of which had not been previously described. About half of the structural variations were found in at least two of the eight genomes analyzed. The work also uncovered 525 new regions of large-scale structural variation in the human genome.
The large-scale differences came in many forms, including deletions and out-of-place insertions of long stretches of DNA. Almost half of the new variations are differences in the number of copies of a given gene that individuals carry, which researchers refer to as copy number variants.
"The structural variation map will give us a much better picture of genetic variation between each individual, and help us better understand these areas of the genome that are prone to large-scale changes over time," said Evan Eichler, Ph.D., of the University of Washington, who led the research.