RAMclust/RAMsearch: efficient post-XCMS feature clustering and annotation of MS-based metabolomics datasets
Poster Dec 22, 2016
Corey D. Broeckling and Jessica E. Prenni
Introduction: Chromatographically coupled mass spectrometry is a powerful tool for profiling, semi-quantitatively or quantitatively, a breadth of small molecules with sensitivity and selectivity. The complexity of these datasets has driven the development of informatics approaches for feature finding, retention time alignment, feature grouping, and annotation. However, the complexity of signals derived from a single compound is generally underestimated, resulting in poor spectral reproducibility, misannotation, and misinterpretation of individual mass signals. This limitation has driven us to develop informatics tools to improve the quality of post-XCMS data processing.
Methods: RAMclustR is developed in R and is freely available. It is designed with memory constraints in mind and typically runs in minutes, although it can take an hour when peak shape similarity scoring is also used. The initial output is an R object containing a dataset of reduced dimensionality compared to the input XCMS set, along with spectra that are written to .msp format. These spectra can include MSE (indiscriminate MS/MS) spectra when available. The .msp output is taken as input for RAMsearch, a .NET-based GUI for performing batch spectral searching against NIST-formatted spectral libraries. The results can be exported in a format that can be reimported into RAMclustR.
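To illustrate the kind of text record that passes from the clustering step to the search step, here is a minimal Python sketch of an .msp-style entry. It uses only the `Name:` and `Num Peaks:` fields followed by m/z and intensity pairs, which are common to NIST-style .msp files. The function name `format_msp_record`, the compound label, and the number formatting are illustrative assumptions; the abstract does not describe the exact fields RAMclustR writes.

```python
def format_msp_record(name, peaks):
    """Format one spectrum as a minimal NIST-style .msp text record.

    `peaks` is a list of (mz, intensity) pairs. Only the Name and
    Num Peaks fields are written; this is an assumed minimal layout,
    not a description of RAMclustR's actual output.
    """
    lines = [f"Name: {name}", f"Num Peaks: {len(peaks)}"]
    for mz, intensity in peaks:
        # One peak per line: m/z to four decimals, intensity as an integer.
        lines.append(f"{mz:.4f} {intensity:.0f}")
    # A blank line separates consecutive records in a multi-spectrum file.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    # Hypothetical two-peak spectrum for a cluster labelled "C0001".
    print(format_msp_record("C0001", [(100.0, 50), (150.1234, 999)]), end="")
```

Concatenating one such record per cluster would give a single file that a batch search tool could read spectrum by spectrum.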
Preliminary Results: RAMclustR feature similarity scores are calculated for all feature pairs in the input XCMS R object, where feature similarity is the product of three individual similarities: correlation of intensity across the dataset, feature retention time, and peak shape. The contribution of each score is tunable using sigmoid functions, so results can be evaluated and the parameters adjusted when necessary. The output datasets demonstrate improved injection reproducibility compared to individual features, reduce the false discovery rate burden, and improve annotation quality. Annotation efficiency is dramatically improved by using the output spectra from RAMclustR as input for spectral searching with RAMsearch, a novel GUI for batch searching and manual validation of search results. The output from RAMsearch is imported into RAMclustR, enabling the storing, visualization, and sharing of the evidence for a given annotation. These outputs are suitable as supplementary material upon publication of the dataset, to ensure transparency in the annotation process. This workflow reduces annotation time several-fold by automating routine manual tasks. Further, it is designed to streamline the effort that goes into reporting annotation confidence, which will enable more robust, transparent, and accessible reporting of metabolomics data.
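The pairwise score described above can be sketched in Python as a product of three sigmoid-weighted terms. This is a rough illustration under stated assumptions, not the RAMclustR implementation: the logistic form of the sigmoid, the function names, and every midpoint and steepness value are hypothetical and were chosen only to make the example concrete.

```python
import math


def logistic(x, midpoint, steepness):
    """Numerically stable logistic curve rising from 0 to 1 around `midpoint`."""
    z = steepness * (x - midpoint)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def pearson(a, b):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant feature carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def feature_similarity(int_a, int_b, rt_a, rt_b, shape_corr=None,
                       corr_mid=0.7, corr_steep=20.0,    # assumed values
                       rt_mid=2.0, rt_steep=3.0,         # assumed values (seconds)
                       shape_mid=0.7, shape_steep=20.0):  # assumed values
    """Product of correlation, retention time, and peak shape similarities."""
    s_corr = logistic(pearson(int_a, int_b), corr_mid, corr_steep)
    # Negate the retention time difference so a larger gap scores lower.
    s_rt = logistic(-abs(rt_a - rt_b), -rt_mid, rt_steep)
    # Peak shape scoring is optional; skipping it leaves the product unchanged.
    s_shape = 1.0 if shape_corr is None else logistic(shape_corr, shape_mid, shape_steep)
    return s_corr * s_rt * s_shape


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0]
    b = [2.0, 4.0, 6.0, 8.0, 10.5]
    print(feature_similarity(a, b, rt_a=120.0, rt_b=120.1))  # co-eluting, correlated
    print(feature_similarity(a, b, rt_a=120.0, rt_b=130.0))  # correlated, far apart
```

Because the terms are multiplied, a pair must score well on every criterion to be grouped: two features that correlate strongly but elute far apart still receive a score near zero. Moving a midpoint or steepness changes how strictly each criterion is enforced, which is one way the tunable contributions could be realized.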