Molecular biologists, developmental biologists and computer scientists at the University of Helsinki, Finland, joined forces to help crack the code that controls gene expression. The results of this work were published in Cell in January 2006.
A genomic milestone was reached in 2001, when the draft sequence of the human genome was published.
In the years since, the DNA sequences of several other species, for example mouse, dog, cow and chicken, have been read out as well.
But without a code or 'grammar' to reveal the message behind the sequence, genomic DNA is merely a list of billions of base pairs: A's, C's, G's and T's one after the other.
The universal genetic code, by which DNA encodes amino acids, makes it possible to interpret the constantly increasing amount of DNA sequence data, but only where that sequence encodes proteins.
This code was solved in 1966 and it has allowed researchers to find genes and estimate the total number of genes in the human genome. However, coding sequence covers only about 1.2% of the human genome.
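To illustrate how the genetic code turns protein-coding DNA into an amino acid sequence, here is a minimal Python sketch. It uses the standard codon table; the function name `translate` and the short example sequences are chosen purely for illustration, and the sketch ignores real-world complications such as introns, reading-frame detection and the opposite DNA strand.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna):
    """Translate a coding DNA sequence, read from its first base,
    into one-letter amino acid codes, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

A lookup of this kind works only for the small protein-coding fraction of the genome; it says nothing about the remaining sequence.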
The researchers at the University of Helsinki have focused on the sequences that regulate gene expression. The research group, led by Professor Jussi Taipale, Ph.D., has defined the binding specificities of several transcription factors.
Transcription factors are DNA-binding proteins which are required to activate gene expression.
Working together, the biologists and computer scientists designed software called EEL (Enhancer Element Locator), which searches genomic sequence for regions where many transcription factors bind DNA side by side.
Finding the same region, densely packed with transcription factor binding sites, in several species indicates that the DNA element regulates gene expression.
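The idea of looking for clusters of binding sites shared between species can be sketched in a few lines of Python. This is not the EEL algorithm itself, only a simplified illustration: the motifs are hypothetical exact-match strings rather than measured binding specificities, and the names `MOTIFS`, `find_sites`, `cluster_factors` and `is_conserved`, the example sequences, and the `window` and `min_shared` parameters are all assumptions made for this example.

```python
# Hypothetical motifs: made-up consensus strings, not real binding data.
MOTIFS = {"TF_A": "GATAAG", "TF_B": "CACGTG", "TF_C": "TGACTCA"}

# Toy "genomic" sequences from two species (invented for illustration).
mouse = "TTTTGATAAGACCACGTGGGTGACTCATTTTTTTT"
human = "CCGATAAGTTACACGTGATGACTCACCCC"


def find_sites(sequence, motifs):
    """Return sorted (position, factor_name) pairs for every exact motif match."""
    sites = []
    for name, motif in motifs.items():
        start = sequence.find(motif)
        while start != -1:
            sites.append((start, name))
            start = sequence.find(motif, start + 1)
    return sorted(sites)


def cluster_factors(sequence, motifs, window=50):
    """Return the factor names found in the window that holds the most sites."""
    sites = find_sites(sequence, motifs)
    best = []
    for i, (start, _) in enumerate(sites):
        # Collect all sites that begin within `window` bases of this one.
        in_window = [name for pos, name in sites[i:] if pos < start + window]
        if len(in_window) > len(best):
            best = in_window
    return set(best)


def is_conserved(sequences, motifs, window=50, min_shared=2):
    """True if the densest cluster in every sequence shares enough factors."""
    clusters = [cluster_factors(s, motifs, window) for s in sequences]
    shared = set.intersection(*clusters)
    return len(shared) >= min_shared


if __name__ == "__main__":
    print(cluster_factors(mouse, MOTIFS))
    print(is_conserved([mouse, human], MOTIFS))
```

In this toy case both sequences carry the same three sites close together, so the cluster counts as conserved. A realistic tool would score sites with binding-specificity matrices and align the sequences rather than compare sets of names.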
The researchers showed that the predicted regulatory elements direct organ-specific expression of a marker gene in transgenic mice. Novel experimental and computational methods enabled genome-wide analysis of regulatory elements in several species.
The findings of the Finnish scientists have implications for the study of cancer, evolution, developmental biology and many other areas of biology. The work revealed a potential mechanism explaining why many different genes are linked to cancer.