Computer Model Is ‘Crystal Ball’ for E. Coli Bacteria
News Oct 31, 2016
It’s difficult to make predictions, especially about the future, and even more so when they involve the reactions of living cells — huge numbers of genes, proteins and enzymes, embedded in complex pathways and feedback loops. Yet researchers at the University of California, Davis, Genome Center and Department of Computer Science are attempting just that, building a computer model that predicts the behavior of a single cell of the bacterium Escherichia coli.
The results of their work were published Oct. 7 in the journal Nature Communications.
The new simulation is the largest of its kind yet, said Ilias Tagkopoulos, professor of computer science, who led the team.
“The number of layers, and the amount of data involved are unprecedented,” he said. The dataset on which the model is based includes, for example, over 4,389 profiles of the expression of different genes and proteins across 649 different conditions. Both the dataset, named “Ecomics,” and the integrated model, MOMA (Multi-Omics Model and Analytics), are available to other researchers to use and test.
The model could be useful to researchers as a fast and inexpensive way to predict how an organism might behave in a specific experiment, Tagkopoulos said. Although no prediction can be as accurate as actually performing the experiment, this would help scientists design their hypotheses and experiments. Applications range from finding the best growth conditions in biotechnology to identifying key pathways for antibiotic and stress resistance.
A week to download, 2 years to build
Collecting and downloading the data took a week, but processing the data into a single dataset took two years of the three-year project, Tagkopoulos said. The team built models for four layers, starting with gene expression and working up to activity at the whole-cell level, and then integrated the layers. They used machine learning techniques to train the models to predict the behavior of each layer, and ultimately of the cell itself, under different conditions.
The model was built on computer clusters at UC Davis and on supercomputers available through a national network. The researchers received a National Science Foundation grant of computing time on “Blue Waters,” one of the world’s most powerful supercomputers, at the National Center for Supercomputing Applications.
Although E. coli is a well-known organism, we are far from knowing everything about its biochemistry and metabolism, Tagkopoulos said.
“We are exploring a vast space here,” he said. “Our aim is to create a crystal ball for the bacteria, which can help us decide what is the next experiment we should do to explore this space better.”
With collaborators at Mars Inc., Tagkopoulos hopes to begin building similar databases and models for bacteria involved in foodborne illness, such as Salmonella enterica and Bacillus subtilis. He expects other researchers to draw on the Ecomics database, and hopes to make the MOMA model interface more accessible for biologists to use.
“We’re living in an amazing era at the intersection of computer science, engineering and biology,” he said. “It’s a very interesting time.”