Venter Institute Scientists Sequence 178 Microbial Reference Genomes Associated with the Human Body
News May 28, 2010
Researchers from the J. Craig Venter Institute (JCVI), a not-for-profit genomic research organization, along with other members of the National Institutes of Health (NIH) Human Microbiome Jumpstart Reference Strains Consortium, have published a catalog of 178 microbial reference genomes isolated from the human body.
Other members of the Consortium are: Baylor College of Medicine Human Genome Sequencing Center, the Broad Institute, and the Genome Center at Washington University. The paper is being published in the May 21 issue of the journal Science.
The human body is teeming with a variety of microbial species. This collective community is called the human microbiome. The role these microbes play in human health and disease is still relatively unknown but likely very important.
The NIH Human Microbiome Project (HMP) was launched in 2007 as part of the NIH Common Fund’s Roadmap for Medical Research. It is a $157 million, five-year effort that will implement a series of increasingly complicated studies to reveal the interactive role of the microbiome in human health.
The HMP Jumpstart Consortium has been charged with selecting the microbial strains to sequence from the following body sites: gastrointestinal tract, oral cavity, urogenital/vaginal tract, skin and respiratory tract. It is also charged with creating standards for sequencing and annotation, and with ensuring the rapid release of information to the scientific community.
The Consortium’s ultimate goal is to produce 900 reference genomes. So far the group has produced 239 genomes and released them into the public domain. The 178 genomes represented in this publication are completely annotated and analyzed. The sequencing was conducted mostly on the Roche-454 sequencing platform, along with some traditional Sanger sequencing. The team compared the sequenced reference genomes to human metagenomic data in the public domain to find new genes and proteins, to ascertain some function for these genes and to assign metagenomic data to species.
From the analysis of 547,968 predicted proteins, the team found 29,987 unique proteins. This data set was compared with a randomly selected set of 178 previously sequenced prokaryotic genomes from the public database GenBank, and the GenBank set contained fewer unique genes than the human microbiome set. This is said to be a unique finding and suggests that there is greater microbial diversity in the human microbiome than was previously known.
The group found some novel gene functions unique to particular microbial strains in the reference genome set. While the group cautioned that this was preliminary data and more work was needed on gene functions, some initial insights into important functions were gleaned.
One of the main goals of having HMP reference genomes is to help interpret and understand metagenomic data. Since the HMP reference genomes were isolated from humans and were not in metagenomic data sets, the group was uncertain if these reference genomes would aid in identifying metagenomic sequences. However, in comparing 16.8 million sequences, the team found that 62 of the HMP reference genomes recruited 11.3 million sequences, and of these, 6.9 million sequences recruited most closely to the HMP reference genomes. Thus, having the HMP reference genomes allowed for 20 to 40% of metagenomic sequences to be identified.
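The recruitment counts quoted above can be turned into simple proportions. The short Python sketch below is only a reader's aid: the variable and function names are illustrative, the inputs are the rounded figures given in this article, and the article does not say how the 20 to 40% range was derived, so the sketch does not attempt to reproduce it.

```python
# Rounded figures quoted in the article, in millions of sequences.
TOTAL_SEQUENCES = 16.8   # metagenomic sequences compared
RECRUITED = 11.3         # sequences recruited by 62 HMP reference genomes
BEST_MATCH_HMP = 6.9     # recruited sequences matching HMP genomes most closely


def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Illustrative ratios only; these are not figures reported by the Consortium.
recruited_pct = share(RECRUITED, TOTAL_SEQUENCES)
best_of_total_pct = share(BEST_MATCH_HMP, TOTAL_SEQUENCES)
best_of_recruited_pct = share(BEST_MATCH_HMP, RECRUITED)

print(f"Recruited, of all sequences compared: {recruited_pct:.1f}%")
print(f"Closest to HMP genomes, of all compared: {best_of_total_pct:.1f}%")
print(f"Closest to HMP genomes, of recruited: {best_of_recruited_pct:.1f}%")
```

These aggregate ratios are approximate because the published counts are rounded to one decimal place.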
This analysis shows that the HMP reference genomes are aiding substantially in the understanding of the human microbiome. However, the group added that there is still much work to be done in fully understanding the microbiome, and achieving the goal of 900 sequenced reference genomes is necessary for a more complete understanding.
The group concluded that while this initial catalog focused on bacteria, future efforts will concentrate on adding eukaryotic microbes and viruses, since both are found in the human microbiome. The group will also continue its work on developing standards for sequencing unculturable strains, defining strain selection criteria, and providing online access to these large datasets, among many other issues.
The corresponding author, JCVI’s Karen E. Nelson, Ph.D., said, “This is a major study that moves us in the right direction to understanding the complex microbiota associated with the human body, and outlines how we benefit from this relationship. We will continue to learn more about the impact of these species in health and disease conditions.” Nelson also added that the consortium anticipates several additional significant publications on the human microbiome in the near future.