We've updated our Privacy Policy to make it clearer how we use your personal data. We use cookies to provide you with a better experience. You can read our Cookie Policy here.

Advertisement

Analyzing Large Data Sets in Metabolomics Just Got Easier

Three researchers in a lab.
Credit: iStock.
Listen with
Speechify
0:00
Register for free to listen to this article
Thank you. Listen to this article using the player above.

Want to listen to this article for FREE?

Complete the form below to unlock access to ALL audio articles.

Read time: 2 minutes

A research team led by scientists at the University of California, Riverside, has developed a computational workflow for analyzing large data sets in the field of metabolomics, the study of small molecules found within cells, biofluids, tissues, and entire ecosystems.


Most recently, the team applied this new computational tool to analyze pollutants in seawater in Southern California. The team swiftly captured the chemical profiles of coastal environments and highlighted potential sources of pollution.


“We are interested in understanding how such pollutants get introduced in the ecosystem,” said Daniel Petras, an assistant professor of biochemistry at UC Riverside, who led the research team. “Figuring out which molecules in the ocean are important for environmental health is not straightforward because of the ocean’s sheer chemical diversity. The protocol we developed greatly speeds up this process. More efficient sorting of the data means we can understand problems related to ocean pollution faster.”

Want more breaking news?

Subscribe to Technology Networks’ daily newsletter, delivering breaking science news straight to your inbox every day.

Subscribe for FREE

Petras and his colleagues report in the journal Nature Protocols that their protocol is designed not only for experienced researchers but also for educational purposes, making it an ideal resource for students and early-career scientists. This computational workflow is accompanied by an accessible web application with a graphical user interface that makes metabolomics data analysis accessible for non-experts and enables them to gain statistical insights into their data within minutes.


“This tool is accessible to a broad range of researchers, from absolute beginners to experts, and is tailored for use in conjunction with the molecular networking software my group is developing,” said coauthor Mingxun Wang, an assistant professor of computer science and engineering at UCR. “For beginners, the guidelines and code we provide make it easier to understand common data processing and analysis steps. For experts, it accelerates reproducible data analysis, enabling them to share their statistical data analysis workflows and results.”


Petras explained the research paper is unique, serving as a large educational resource organized through a virtual research group called Virtual Multiomics Lab, or VMOL. With more than 50 scientists participating from around the world, VMOL is a community-driven, open-access community. It aims to simplify and democratize the chemical analysis process, making it accessible to researchers worldwide, regardless of their background or resources.


“I’m incredibly proud to see how this project evolved into something impactful, involving experts and students from across the globe,” said Abzer Pakkir Shah, a doctoral student in Petras’ group and the first author of the paper. “By removing physical and economic barriers, VMOL provides training in computational mass spectrometry and data science and aims to launch virtual research projects as a new form of collaborative science.”


All software the team developed is free and publicly available. The software development was initiated during a summer school for non-targeted metabolomics in 2022 at the University of Tübingen, where the team also launched VMOL.


Petras expects the protocol will be especially useful to environmental researchers as well as scientists working in the biomedical field and researchers doing clinical studies in microbiome science.


“The versatility of our protocol extends to a wide range of fields and sample types, including combinatorial chemistry, doping analysis, and trace contamination of food, pharmaceuticals, and other industrial products,” he said.


Reference: Pakkir Shah AK, Walter A, Ottosson F, et al. Statistical analysis of feature-based molecular networking results from non-targeted metabolomics data. Nat Protoc. 2024. doi: 10.1038/s41596-024-01046-3


This article has been republished from the following materials. Note: material may have been edited for length and content. For further information, please contact the cited source. Our press release publishing policy can be accessed here.