A multi-institutional team of more than 30 scientists has found that statistical analysis of DNA from natural microbial communities can be used to accurately identify environmental contaminants and serve as quantitative geochemical biosensors.
“Changes induced in the natural microbial community structure by contaminants last long after the contaminants themselves have become undetectable,” says Terry Hazen, an internationally recognized authority on microbial ecology who led the research. “This means the DNA of these microbial communities can be used as a forensic tool for measuring anthropogenic effects on the environment.”
For this study, the ENIGMA collaborators identified the most independent and interesting groundwater well clusters from 25 years of monitoring data collected at the Bear Creek watershed in Oak Ridge. This watershed was a crucial site for the early development of nuclear weapons under the Manhattan Project and harbors spectacular geochemical gradients. The collaborators then collected a large number of microbial DNA samples from the identified wells, along with measurements of 28 other physical and chemical characteristics.
“The wells we sampled typically contain a high number of particulates in the groundwater, thus causing the filters we collected DNA on to clog easily,” says Andrea Rocha, a post-doctoral associate in Hazen’s research group, who spent three months collecting samples from the watershed. “We had to change our filters each time they clogged until we obtained our required four liters of groundwater. Sometimes this meant changing filters five to six times for one well.”
The ENIGMA team analyzed the high-throughput DNA sequencing data from the collected groundwater samples using a technique called “supervised machine learning,” in which a model is trained on samples whose properties are already known and is then used to predict those properties for new samples.
“Because microbial communities continuously sense and respond to their environments, they form a ubiquitous environmental surveillance network that can be inexpensively digitized through DNA sequencing,” Hazen says. “Our idea was to determine whether and how information encoded in bacterial communities might be tapped to quantitatively characterize the environment.”
While previous research demonstrated that specific proteins or whole bacterial cells could be used as biosensors to translate environmental signals into machine-readable data, the focus of the ENIGMA study was on the integration of information gathered from native bacterial communities containing billions of cells and encompassing thousands of taxonomic groups.
Using sequencing data from the 16S rRNA gene alone, the ENIGMA team was able to quantitatively produce “a rich catalogue of 26 geochemical features” from 93 groundwater wells with widely differing geochemical characteristics. These features were then used to predict contamination. The accuracy for predicting uranium contamination of the groundwater was about 88%, and the accuracy for predicting nitrate contamination was about 73%.
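For readers curious about what predicting contamination from community data looks like in practice, the general idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ENIGMA team's pipeline: the well profiles are synthetic, the "enriched taxon" is hypothetical, and a simple nearest-centroid classifier stands in for whatever model the team actually used. Accuracy is estimated by holding out one well at a time.

```python
import random


def make_synthetic_wells(n_wells=40, n_taxa=6, seed=0):
    """Build toy relative-abundance profiles. Synthetic, NOT real ENIGMA data."""
    rng = random.Random(seed)
    samples = []
    for i in range(n_wells):
        contaminated = i % 2 == 0
        counts = [rng.uniform(1.0, 2.0) for _ in range(n_taxa)]
        if contaminated:
            # Assumption: one hypothetical taxon is enriched in contaminated wells.
            counts[0] += 3.0
        total = sum(counts)
        samples.append(([c / total for c in counts], contaminated))
    return samples


def centroid(rows):
    """Mean abundance profile of a group of wells."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train, profile):
    """Label a profile by whichever class centroid it lies closer to."""
    pos = centroid([p for p, label in train if label])
    neg = centroid([p for p, label in train if not label])
    return squared_distance(profile, pos) < squared_distance(profile, neg)


def leave_one_out_accuracy(samples):
    """Train on all wells but one, test on the held-out well, and repeat."""
    correct = 0
    for i, (profile, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if predict(train, profile) == label:
            correct += 1
    return correct / len(samples)


wells = make_synthetic_wells()
accuracy = leave_one_out_accuracy(wells)
print(f"leave-one-out accuracy on synthetic wells: {accuracy:.2f}")
```

Because the toy signal is deliberately strong, the score here says nothing about real groundwater. The point is the workflow: profiles with known labels train the model, and held-out wells test how well it generalizes.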
“Our work shows that knowing what bacteria are present allows us to infer something about the current or past chemistry of a site,” says Eric Alm, an MIT microbiologist and one of the principal investigators on this project. “The next big challenge will be to understand why different bacteria are associated with different environmental conditions.”
As human populations continue to grow and the industrialization of developing nations continues to expand, the impact of human activity on the environment is only going to intensify. Measuring the causes and consequences of these impacts is a challenge that science must meet. The ENIGMA project demonstrates one path towards meeting this objective.
“It takes an integrated team to tackle a large problem like this,” says Paul Adams, the SFA Laboratory Research Manager for ENIGMA. “The work with these natural microbial communities highlights what can be achieved through interdisciplinary research that harnesses ENIGMA’s scientific expertise in field sampling, high-throughput sequencing, and computational analysis.”