Alain van Gool, Professor of Personalized Healthcare at Radboud University Medical Center (Radboudumc) in Nijmegen, The Netherlands, helps oversee Radboudumc’s 19 Technology Centers, which span multiple aspects of biomedical science. He also holds a more specialized position as head of the Translational Metabolic Laboratory (TML). The TML employs about 120 researchers who study the metabolic basis of disease, turning disease-related molecules and metabolites into biomarkers for diagnostics. Mass spectrometry (MS) is an integral part of that endeavor.
Van Gool explains how his lab uses MS data: “Our laboratory is focused on genetic metabolic diagnosis, working frequently with genetic variants where it is not always possible to identify the causal variant. In some cases, patients have a certain phenotype and we know the unique variant causing this phenotype. However, in a large number of cases, there are multiple variants of unknown significance, proving difficult to find the cause. In our metabolomics approach, we use MS analysis of preselected metabolites to elucidate mechanisms of inborn errors of metabolism. By knowing which metabolites are changed in a particular patient, we can contextualize the genomic information.”
Another key area where Van Gool and his team use MS information is glycoproteomics research: the study of proteins that carry post-translational modifications in the form of attached carbohydrate molecules. These glycoproteins include those with important roles in human health and disease. In this research, Van Gool works closely with Radboudumc’s Department of Genetics. “We map a large variety of sugar chains which are circulating in the blood, providing a different functional read-out of genetic variants, linked to sugar metabolism and protein glycosylation. This approach nicely complements genomic screening, whereby the genomic analysis directs glycome analysis or vice versa. We expect a lot from our innovative approach towards glycopeptide profiling that we are developing for diagnostic applications.”
Keeping Things Intact
Key to the study of glycoproteins is their variable state in disease: these proteins aren’t just modified as a result of genetic mutation, but often also by certain infections and diseases. To study these modifications, Van Gool and colleagues try to optimize their assays for the analysis of intact proteins, which Van Gool explains is superior to classical proteomic analysis of smaller chunks of protein: “By measuring an epitope, or a piece of a protein, using immunoassays or peptide-based MS, this [glycoprotein] complexity is not covered,” says Van Gool. “We utilize this extra layer of analysis to better understand biology and develop improved and more representative protein diagnostics.”
He gives an example of a 2014 study in which his team developed a diagnostic assay for congenital disorders affecting glycosylation, diseases in which the glycosylation process that attaches carbohydrate molecules to glycoproteins is defective. To do this, they studied glycosylation at two separate sites on a protein called transferrin, which when analyzed together acted as a disease biomarker. Without intact analysis, says Van Gool, the two sites couldn’t be measured simultaneously, and the diagnostic assay wouldn’t be possible.
Data Challenges to Overcome
Van Gool’s research shows the power of modern-day proteomics, but challenges remain. He highlights the process of analyzing MS data as a key obstacle his team are looking to overcome. “Data analysis becomes an increasing issue now MS instrumentation generates data with higher resolution and coverage, revealing more information about the complexity of biological samples. Identifications of peptides, proteins and their modifications and interactions are key challenges that require advanced machine and deep learning methods to obtain insights and to identify the patterns that lie beneath. The sheer size of proteomics data, where we can easily generate millions of data points on a single day, requires advanced approaches to process study outcomes.”
Dealing With Data Hairballs
The diversity of data types is increasing alongside the amount of data available. Combining data from multiple omics fields is a common goal in biomedical research but isn’t always easy with LC-MS data. Hans Wessels, Proteomics Scientist at the TML, describes the confusion that may arise between two researchers trying to combine their data: “Typically, one researcher focuses from an applications and biological point of view and another more from the bioinformatics point of view. In both cases they produce insights in biology but these interpretations may not always overlap.” He recalls a suggestion from a bioinformatician he met at a recent conference: researchers should focus on the core information produced by the different omics rather than attempting to build complex networks, which often turn into “hairballs” once the network’s intricacy reaches a certain point.
Another sticking point in diagnostics research is the apparent disconnect between biomarker research and valuable clinical outcomes for disease. Prostate cancer research, for example, has identified biomarkers at a rate of about 2000 a year in peer-reviewed journals. That relatively rapid identification process is then followed by an arduous validation process, which Van Gool says takes 1-3 years, and an even longer diagnostic development process, which takes up to 10 years. Why is there this imbalance in the diagnostics pipeline? Van Gool identifies two key aspects: “Firstly, our research output is not standardized to allow one to reproduce 100% of all studies. This could be for a number of reasons, such as the study materials may be treated slightly different, the protocol might be adapted, the tuning of the mass spec may be different, and the data may be organized differently.” Van Gool continues, “Secondly, the biomarker research and development pipeline is rarely connected. The identification of a biomarker requires different skills than the validation of a biomarker, which is again different than running and developing a clinical test to evaluate the biomarker in real life. These steps are often performed by different parties.”
Van Gool thinks that to solve this, harmonization and standardization need to happen across the field, particularly with regard to how MS data are handled: “Proper data stewardship is key to help improve the reproducibility of (MS) data and major steps have been made recently to define standards to adopt the FAIR data principles.” FAIR is a set of principles that aims to improve research data under four goals of “Findability, Accessibility, Interoperability and Reusability”, first suggested in 2016 after much collaborative work by researchers, industry leaders, funding agencies and publishers. Van Gool says he has been heartened by FAIR’s adoption across the industry.
Summing up, Van Gool is optimistic about the future of MS in diagnostics research but feels that the focus of the research needs to shift in order to turn the potential of biomarker discovery into significant diagnostic outcomes: “I believe that we, as a biomedical research field, could really improve the quality of what we do. It’s time for quality and not quantity in terms of biomarker discovery.”