Recent advances in liquid chromatography-mass spectrometry (LC-MS) technology have led to more effective approaches for measuring changes in peptide/protein abundances in biological samples. Label-free LC-MS methods have been used for extraction of quantitative information and for detection of differentially abundant peptides/proteins. However, difference detection by analysis of data derived from label-free LC-MS methods requires various preprocessing steps including filtering, baseline correction, peak detection, alignment, and normalization. Although several specialized tools have been developed to analyze LC-MS data, determining the most appropriate computational pipeline remains challenging partly due to lack of established gold standards.
The work in this paper is an initial study to develop a simple model with "presence" or "absence" condition using spike-in experiments and to be able to identify these "true differences" using available software tools. In addition to the preprocessing pipelines, choosing appropriate statistical tests and determining critical values are important. We observe that individual statistical tests could lead to different results due to different assumptions and employed metrics. It is therefore preferable to incorporate several statistical tests for either exploration or confirmation purpose.
The LC-MS data from our spike-in experiment can be used for developing and optimizing LC-MS data preprocessing algorithms and to evaluate workflows implemented in existing software tools. Our current work is a stepping stone towards optimizing LC-MS data acquisition and testing the accuracy and validity of computational tools for difference detection in future studies that will be focused on spiking peptides of diverse physicochemical properties in different concentrations to better represent biomarker discovery of differentially abundant peptides/proteins.
The article is published online in the journal Proteome Science and is free to access.