Aequatus - a new bioinformatics tool developed at Earlham Institute (EI) - is helping to give an in-depth view of syntenic information between different species, providing a system to better identify important, positively-selected, and evolutionarily-conserved regions of DNA.
Generally, organisms that are closely related show a high degree of synteny, i.e. they possess similar sequences along their chromosomes, where closely related genes that are presumed to have the same function are clustered in a similar organisation between species. Thus, many human genes have high synteny with those of other mammals, from chimpanzees to mice.
Studying the synteny between organisms can help us to identify how genetic regions change through evolution, and has far-reaching applications - including better understanding evolution and how we came to be, aiding studies into human health, as well as in breeding better crops.
Anil Thanki, of the Data Infrastructure group, said: "We are very excited about Aequatus because it provides a really intuitive way to visualise homologous genes among species. Aequatus provides a seamless user experience using the latest web technologies available to represent genomics data. It helps biologists delve into the details of homologous genes by comparing them at the genomic feature level. We have also connected this resource with the SMART protein domain information server to let researchers get to relevant data without having to switch services."
Award-winning, real-world applications
Alongside the publication of the Aequatus tool in GigaScience, the main developer Anil Thanki of the Davey Group at EI was nominated for an award in the prize track for ICG-13, the 13th International Conference on Genomics in Shenzhen, China.
Built using open-source technologies, Aequatus provides a fast and intuitive web-based browsing experience to bridge the gap between phylogenetic changes and gene feature information.
One such application is the recently published GeneSeqToFamily tool, a Galaxy workflow based on the Ensembl Compara GeneTrees pipeline to find gene families. The Aequatus plugin has been made available within Galaxy (currently on usegalaxy.eu) to visualise the gene families produced by GeneSeqToFamily.
A novel, more complete visualisation tool
Whereas traditional phylogenetic trees (a visualisation of the shared ancestry in a "family tree") present an overview of synteny, Aequatus also provides information regarding structural changes in genes, including variation within them that corresponds to changes in phenotype (appearance).
Using a "guide" gene as a reference, other genes are mapped based on alignment (an analysis of sequence similarity, or how closely two genes are related to each other based on their DNA or protein sequence). Alignments are retrieved from the open-source databases Ensembl Compara and Ensembl Core. Aequatus then processes both comparative and feature data to provide a visual representation of phylogenetic and structural changes between species based on a shared colour scheme.
This helps to visualise regions of homology, while also allowing the identification of changes to genes, such as insertions or deletions, with black bars representing insertions specific to a given gene compared to the "guide".
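The idea of reading insertions and deletions off an alignment against a "guide" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Aequatus's own code: it assumes the two sequences are already aligned to equal length with "-" marking gaps, and the function name classify_columns and the toy sequences are hypothetical.

```python
def classify_columns(guide_aln, other_aln, gap="-"):
    """Compare a gene to the guide, column by column.

    Both inputs are assumed to be rows of the same alignment (equal length,
    gaps written as `gap`). Returns (insertions, deletions), each a list of
    half-open (start, end) column ranges:
      - insertion: the guide has a gap but the other gene has sequence
      - deletion:  the guide has sequence but the other gene has a gap
    """
    if len(guide_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")

    insertions, deletions = [], []

    def extend(runs, col):
        # Merge this column into the previous run if it is directly adjacent.
        if runs and runs[-1][1] == col:
            runs[-1] = (runs[-1][0], col + 1)
        else:
            runs.append((col, col + 1))

    for col, (g, o) in enumerate(zip(guide_aln, other_aln)):
        if g == gap and o != gap:
            extend(insertions, col)
        elif g != gap and o == gap:
            extend(deletions, col)
    return insertions, deletions


# Toy example (made-up sequences, not real genes).
guide = "ATG--CCGTTA"
other = "ATGAACC---A"
insertions, deletions = classify_columns(guide, other)
print("insertions relative to guide:", insertions)
print("deletions relative to guide:", deletions)
```

In a viewer, the insertion ranges would correspond to the regions drawn as black bars, while the deletion ranges mark stretches of the guide that are missing from the compared gene.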
Overall, Aequatus provides a unique way to explore complex relationships between genes from various species at a level that has so far been unrealised. Applicable not only to high-quality reference genomes including mouse and human, Aequatus has also been designed for use with hard-to-assemble or non-model organisms.
The latest version of Aequatus also supports the Ensembl REST API, which retrieves data directly from the Ensembl server and removes the need for local data, improving the portability of Aequatus.
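To give a flavour of what retrieving comparative data over the Ensembl REST API can look like, here is a minimal Python sketch. It does not show which endpoints Aequatus itself calls, which the article does not state. It assumes the public genetree/id endpoint at rest.ensembl.org, and it assumes the returned JSON nests nodes under "children" with leaf genes carrying an "id" object that has an "accession" field. The helper names and the example tree identifier are illustrative only.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def genetree_url(tree_id):
    """Build the request URL for a gene tree by its stable ID."""
    return f"{ENSEMBL_REST}/genetree/id/{tree_id}?content-type=application/json"


def fetch_genetree(tree_id, timeout=30):
    """Fetch a gene tree as parsed JSON (needs network access)."""
    with urllib.request.urlopen(genetree_url(tree_id), timeout=timeout) as resp:
        return json.load(resp)


def leaf_accessions(node):
    """Collect gene accessions from the leaves of a nested tree.

    Assumed shape: internal nodes have a "children" list; leaves have
    {"id": {"accession": ...}}. Leaves without an accession are skipped.
    """
    children = node.get("children")
    if not children:
        accession = node.get("id", {}).get("accession")
        return [accession] if accession else []
    found = []
    for child in children:
        found.extend(leaf_accessions(child))
    return found


# Example usage (commented out because it requires network access; the
# tree ID is a placeholder and may not exist in the current release):
# data = fetch_genetree("ENSGT00390000003602")
# print(leaf_accessions(data["tree"]))
```

Because the data come from a remote server on request, nothing has to be installed or mirrored locally, which is the portability gain described above.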
Rob Davey, Data Infrastructure group leader, added: "It's great to see this work published and indeed selected for an award at an international conference. This shows that visualisation of genomic data is still an active and valuable area of research, and Aequatus can really help researchers gain access to even more fine-grained information about their genes and organisms of interest."
This article has been republished from materials provided by the Earlham Institute. Note: material may have been edited for length and content. For further information, please contact the cited source.