Family Gathering: Galaxy Program Makes Grouping Genes Easier
Researchers at Earlham Institute (EI) have released ‘GeneSeqToFamily’, an open-source Galaxy workflow that helps scientists to find gene families based on the powerful ‘EnsemblCompara GeneTrees’ pipeline.
Published in GigaScience, the open-source Galaxy workflow makes it easier for researchers to find gene families, an important step in analysing the evolution, structure and function of genes across species.
Co-author Wilfried Haerty, Group Leader of Evolutionary Genomics at EI, explained why this tool is so useful to biologists: “The software developed at the Earlham Institute enables scientists to investigate species of interest using a flexible and reproducible pipeline. The performance of our workflow was assessed on vertebrate genome assemblies of various qualities (platypus, pig, horse, dog, mouse and human). The species were selected to assess the impact of genome quality on gene families identification. The mouse, dog and human genomes are of high quality whereas the three others are at different stages of analysis completion.”
Based on and expanding Ensembl’s existing EnsemblCompara GeneTrees pipeline, the GeneSeqToFamily workflow removes many complex prerequisites of the process, such as having to use the command line to install a large number of separate tools, by converting the whole process into Galaxy, a much simpler platform to use.
Importantly, the workflow is highly customisable, allowing users to choose parameters, change tools and run the software on their own genes, without having to use the Ensembl database.
Not just a workflow, GeneSeqToFamily contains a number of new, standalone Galaxy tools, including TreeBeST, hcluster_sg, T-Coffee and ETE. Developed at EI by Anil Thanki and Nicola Soranzo of the Data Infrastructure Group, the software makes the process of finding and generating phylogenetic trees easier, using a range of open platforms and databases. Anil Thanki, Scientific Programmer at EI, said: “We are excited to put our work in the open domain, where it allows biologists and bioinformaticians to use the Ensembl Compara GeneTrees Pipeline in a simple, graphical user interface and modify it if needed.”
The team hopes that the new workflow will make it easier for users unfamiliar with the complexities of Compara to analyse phylogenetic datasets, while collating a number of useful gene family tools in one Galaxy workflow. Users can either select existing Ensembl databases to use as the reference sets for their analysis, or provide their own data in the same format; tools are provided to help with this.
Earlham Institute is committed to providing tools and algorithms to support, enable and develop computational biology and life sciences research, with projects such as Galaxy helping to open access to a range of scientific tools and databases.
The Data Infrastructure Group, led by Dr Rob Davey, also supports resources such as CyVerse UK and COPO which, alongside Galaxy, expand the availability and usability of computational resources to the wider scientific community in the UK and internationally through EI’s National Capability in e-Infrastructure.
This article has been republished from materials provided by the Earlham Institute. Note: material may have been edited for length and content. For further information, please contact the cited source.
Reference: Thanki, A. S., Soranzo, N., Haerty, W., & Davey, R. P. (2018). GeneSeqToFamily: a Galaxy workflow to find gene families based on the Ensembl Compara GeneTrees. GigaScience. https://doi.org/10.1093/gigascience/giy005