These findings may help to explain why some environmental exposures in utero or during early childhood are known to increase risk of diseases that produce symptoms years or even decades later. In addition, researchers were able to pinpoint which cell types are affected by different diseases. These results provide new insight into disease mechanisms, and suggest novel targets for therapeutics development and disease prevention strategies.
Senior author John Stamatoyannopoulos, M.D., and colleagues report their findings in the Sept. 5 online issue of the journal Science.
Scientists have previously used genome-wide association studies (GWAS) to look across the DNA of many people and identify genetic differences, or variants, associated with different diseases. However, the overwhelming majority of GWAS variants are not located within genes themselves, but are in stretches of DNA between genes, called non-coding DNA. These non-coding regions were once considered to be mostly non-essential, but it is now known that these regions have important roles in regulating how, where, and when genes are expressed.
Regulation of gene expression can be controlled by proteins that bind to DNA and by epigenetic factors, chemical modifications that occur on DNA or DNA-associated proteins and do not alter the underlying DNA sequence. Research is pointing to an important role for epigenetic modifications in many diseases, but discovering how genetic variants and epigenetic modifications interact to influence disease risk has been difficult.
"While previous studies have examined the connection between epigenetic modifications and genetic variants associated with individual diseases, this study provides a much more comprehensive look across the genome at a wide variety of diseases and a wide variety of human tissues," said NIH Director Francis S. Collins, M.D., Ph.D. "These findings can potentially lead to a new understanding about the mechanisms of many common diseases, such as cancer, cardiovascular disease, diabetes, and neurological disease, as well as normal variation in physical traits."
In this study, researchers examined thousands of GWAS variants in non-coding regions of DNA to see whether and how these variants may regulate gene expression. Regions of non-coding DNA that actively regulate gene expression are known to be sensitive to DNaseI, an enzyme that acts like molecular scissors to chop up bits of DNA. The researchers used over 400 different cell and tissue samples, including samples from the NIH Common Fund’s Epigenomics program and the National Human Genome Research Institute (NHGRI)'s ENCyclopedia Of DNA Elements (ENCODE) project, to create a map of DNA regions that are sensitive to DNaseI, called DNaseI hypersensitivity sites (DHSs).
Over 76 percent of non-coding GWAS variants were found to be in or very near DHSs, indicating that the vast majority of non-coding GWAS variants in these samples are actively involved in regulating genes. In addition, 88 percent of GWAS variants in regulatory DNA lie in regions that are active during fetal development, including variants associated with adult-onset disease.
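To make the "in or very near DHSs" measurement concrete, the following is a minimal Python sketch of how such a fraction could be computed. It is not the researchers' pipeline: the function names, the half-open interval convention, the `window` parameter that stands in for "very near," and the toy coordinates are all assumptions made for illustration.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Sort and merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def build_index(dhs_by_chrom, window=0):
    """Pad each DHS by `window` bases (assumed meaning of "near") and merge per chromosome."""
    index = {}
    for chrom, intervals in dhs_by_chrom.items():
        padded = [(max(0, s - window), e + window) for s, e in intervals]
        merged = merge_intervals(padded)
        index[chrom] = ([s for s, _ in merged], merged)
    return index


def in_dhs(index, chrom, pos):
    """True if `pos` falls inside a (padded) DHS on `chrom`."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < merged[i][1]


def fraction_in_dhs(variants, index):
    """Fraction of (chrom, pos) variants that fall in or near a DHS."""
    if not variants:
        return 0.0
    return sum(in_dhs(index, c, p) for c, p in variants) / len(variants)


# Toy data, invented for illustration only.
dhs = {"chr1": [(100, 200), (150, 300), (1000, 1100)], "chr2": [(50, 80)]}
variants = [("chr1", 120), ("chr1", 305), ("chr1", 999), ("chr2", 79), ("chr3", 10)]

print(fraction_in_dhs(variants, build_index(dhs)))             # strict overlap
print(fraction_in_dhs(variants, build_index(dhs, window=10)))  # within 10 bases
```

Merging the padded intervals first lets each lookup use a single binary search, which matters when the map covers hundreds of samples and the variant list runs to thousands of entries.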
The finding that a large number of disease-associated GWAS variants are located in regulatory DNA regions that are active during fetal development suggests that environmental exposures during this period could influence risk for a large number of diseases, the researchers say. Understanding how and when environmental exposures affect gene regulation may provide insights into ways to prevent disease by reducing exposures early in life, instead of treating the disease when symptoms occur in adulthood.
A major challenge in GWAS has been the difficulty of linking variants in non-coding regions with the genes they regulate, because the target genes can be located a great distance away. In the current study, researchers were able to look across the genome and identify which genes were regulated by hundreds of GWAS variants, including variants associated with blood platelet counts, amyotrophic lateral sclerosis (ALS), Crohn’s disease, breast and ovarian cancer, and schizophrenia.
Overall, 79 percent of GWAS variants in regulatory DNA were connected to genes that were not the closest ones to the variant, underscoring why previous efforts to link GWAS variants with target genes have been so difficult.
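The comparison between a variant's nearest gene and the gene it is actually linked to can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not the study's method: the gene names, coordinates, and variant-to-gene links below are invented, each gene is reduced to a single transcription start site (TSS), and distance is measured along the chromosome only.

```python
def nearest_gene(chrom, pos, tss_by_gene):
    """Return the gene whose TSS is closest to `pos` on `chrom` (ties break alphabetically)."""
    candidates = [
        (abs(pos - tss), gene)
        for gene, (gene_chrom, tss) in tss_by_gene.items()
        if gene_chrom == chrom
    ]
    return min(candidates)[1] if candidates else None


def fraction_not_nearest(links, tss_by_gene):
    """Fraction of (chrom, pos, linked_gene) records whose linked gene is not the nearest one."""
    if not links:
        return 0.0
    mismatches = sum(
        nearest_gene(chrom, pos, tss_by_gene) != linked
        for chrom, pos, linked in links
    )
    return mismatches / len(links)


# Hypothetical genes and variant-to-gene links, for illustration only.
tss_by_gene = {
    "GENE_A": ("chr1", 1000),
    "GENE_B": ("chr1", 5000),
    "GENE_C": ("chr1", 20000),
}
links = [
    ("chr1", 1200, "GENE_A"),   # linked gene is also the nearest
    ("chr1", 4800, "GENE_C"),   # nearest is GENE_B
    ("chr1", 6000, "GENE_A"),   # nearest is GENE_B
    ("chr1", 19000, "GENE_C"),  # linked gene is also the nearest
]

print(fraction_not_nearest(links, tss_by_gene))
```

A high value from a calculation like this is what makes the "assign each variant to its closest gene" shortcut unreliable.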
"These exciting results show how a broad, systematic approach to deciphering regulatory DNA — essentially the genome's operating system — can have major implications for our understanding of the genetic basis of many common diseases and traits," said Dr. Stamatoyannopoulos, associate professor of genome sciences at the University of Washington.
GWAS variants are the same in every cell; however, many diseases only affect a limited number of cell and tissue types. In contrast, the activity of regulatory DNA can vary between cell types, suggesting that GWAS variants only have an effect in those cells where the variant is in a DHS and is therefore actively influencing gene expression. Researchers used the cell- and tissue-specific pattern of DHSs to identify which cell types may be playing a role in various diseases.
The scientists identified cell types associated with Crohn's disease and multiple sclerosis (MS) that were only recently discovered to play a role in these diseases. These results suggest that looking for the presence of disease-associated GWAS variants located within DHSs in specific cell types can help identify previously unknown cell types that contribute to disease pathology.
Researchers also determined that GWAS variants from similar diseases often disrupted interconnected networks of proteins. GWAS variants associated with a group of autoimmune diseases were found to disrupt a distinct set of proteins, and this set of proteins was different from the sets of proteins affected by GWAS variants associated with neurological diseases and various cancers.
Disruption of these common regulatory protein networks may explain why seemingly unconnected GWAS variants can be associated with closely related diseases that share overlapping symptoms.
"The fact that susceptibility to many diseases can be traced to variants that affect common regulatory networks opens the door to a greater understanding of the roles that different genes and regulatory elements play in health and disease, and should enable new approaches to disease diagnosis, treatment, and prevention," said Dr. Stamatoyannopoulos.
"The study of epigenomics is transforming how we think about how genes are regulated, and how human disease can arise when that regulation goes awry," said James M. Anderson, M.D., Ph.D., director of the Division of Program Coordination, Planning, and Strategic Initiatives that guides the NIH Common Fund's programs. "The integrated analysis of genetic and epigenetic factors presented in this research will allow researchers to identify cell types and gene networks that play a role in disease, increasing our understanding of disease mechanisms as well as our knowledge of potential therapeutic targets for drug development."
This study was supported in part by the NIH Common Fund’s Epigenomics program, which aims to understand how chemical modifications to DNA regulate gene activity without altering the DNA sequence itself, and how these modifications can affect human health and disease. The Epigenomics program is managed by the National Institute of Environmental Health Sciences (NIEHS), the National Institute on Drug Abuse (NIDA), and the National Institute on Deafness and Other Communication Disorders (NIDCD), in partnership with the Office of the Director. The National Human Genome Research Institute (NHGRI), National Heart, Lung, and Blood Institute (NHLBI), Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), and National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) also supported this research.
The NIH Common Fund supports a series of exceptionally high-impact research programs that are broadly relevant to health and disease. Common Fund programs are designed to overcome major research barriers and pursue emerging opportunities for the benefit of the biomedical research community at large. The research products of Common Fund programs are expected to catalyze disease-specific research supported by the NIH Institutes and Centers.