Punctuating Messages Encoded in Human Genome with Transposable Elements
News Aug 06, 2015
Since the classical studies of Jacob and Monod in the early 1960s, it has been evident that genome sequences contain not only blueprints for genes and the proteins that they encode, but also the instructions for a coordinated regulatory program that governs when, where and to what extent these genes and proteins are expressed. The execution of this regulatory code is what allows for the creation of very different cell- and tissue-types from the same set of genetic instructions found in the nucleus of every cell.
The vast majority of the human genome (~98% of the total genetic information) is not dedicated to encoding proteins, and this non-coding sequence was initially designated as "junk DNA" to underscore its lack of apparent function. Much of the so-called junk DNA in our genomes has accumulated over evolutionary time due to the activity of retrotransposable elements (RTEs), which are capable of moving (transposing) from one location to another in the genome and of making copies of themselves when they do so. These elements have been considered genomic parasites that exist by virtue of their ability to replicate themselves to high numbers within genomes without providing any beneficial function for the hosts in which they reside. However, recent studies on RTEs have shown that they can in fact encode important functions, and much of their functional activity turns out to be related to how genomes are regulated. RTEs have been linked to stem cell function, tissue differentiation, cancer progression and ultimately to aging and age-related pathologies.
The study by Wang et al. provides a new perspective on the role that RTE-derived sequences play in the precise execution of the human genome's regulatory program. This study found that one particular class of RTEs - Mammalian-wide Interspersed Repeats (MIRs) - can serve as genetic landmarks that help to target specific regulatory mechanisms to a large number of genomic sites and thereby lead to the coordinated regulation of the genes located near these sites.
This discovery was spearheaded by a team of computational biologists, led by Dr. King Jordan, Associate Professor and Director of the Bioinformatics Graduate Program at the Georgia Institute of Technology, who performed a "big data" analysis of massive datasets generated by hundreds of scientists from dozens of laboratories around the world working as part of the "Encyclopedia of DNA Elements" or ENCODE project. Their comprehensive and integrated data analysis, conducted primarily by Dr. Jianrong Wang from Dr. Jordan's team, allowed them to pinpoint the location of thousands of individual MIR elements in the human genome that appear to function as so-called "boundary elements" in T lymphocyte cells of the immune system.
Boundary elements are epigenetic regulatory sequences that separate transcriptionally active regions of the human genome from transcriptionally silent regions in a cell-type specific manner. In so doing, these critical regulatory elements help to provide distinct identities to different cell types, even though all of these cells contain identical sets of genetic information. The regulatory programs that underlie these cell- and tissue-specific functions and identities are based largely on genome packaging. Genes that should not be expressed in a given cell or tissue are located in tightly packaged regions of the genome and are inaccessible to the transcription factors that would otherwise turn them on. Boundary elements help to establish the geography of genome packaging by delineating the margins between silent regions in which genes are not expressed and active regions in which they are. In this critical role, boundary elements help to control the timing and extent of gene expression across the entire genome. As a result, defects in the organization of the genome by boundary elements are highly relevant for physiological and pathological processes.
"Our colleagues at the Georgia Institute of Technology were able to build upon our early discovery that another class of retrotransposon, the SINEB2 element, can provide boundary function at the mouse growth hormone locus," said Dr. Victoria Lunyak, CEO of Aelan Cell Technologies, whose research team collaborated with the Jordan lab on this project.
"We randomly picked a handful of the MIR sequences predicted to serve as boundary elements by the Jordan lab and experimentally validated their activity in mouse cell lines and, with the help of our Spanish collaborators, in zebrafish upon embryonic development," Dr. Lunyak said. "This testing revealed that MIR sequences can serve as punctuation marks within our genome that enable cells to correctly read and comprehend the message transmitted by the genomic sequences."
"One thing that is particularly striking is the fact that these punctuation marks, as Victoria calls them, play a role that is deeply evolutionarily conserved," said Dr. Jordan. "The same exact MIR sequences were able to function as boundaries in human CD4+ lymphocytes, in mouse cell models and in zebrafish."
"This is an important discovery because the understanding of how RTEs punctuate messages encoded in the human genome can help researchers to develop treatments for a wide variety of human diseases, including aging," added Dr. Lunyak.
Aging is characterized by a number of global changes in genome organization and function, and aging-associated defects in how our genome is packaged can have severe pathological consequences. In particular, age-related defects in genomic packaging can greatly increase the susceptibility of the genome to damage. Based on the discoveries published in their PNAS paper, the Jordan lab at Georgia Tech and the Lunyak team at Aelan Cell Technologies and their partner Nuclea Biotechnologies are now working towards the development of novel diagnostic and therapeutic strategies that target the critical roles of epigenetic regulators, such as human retrotransposons, in coordinating cell-type specific regulatory programs.