Clinical Proteomics and Computational Biochemistry With Professor Jürgen Cox
The Computational Systems Biochemistry group at the Max Planck Institute of Biochemistry develop computational methods for the identification and quantification of the molecular components of cells, tissues and body fluids.
Using inputs from large data sets generated by mass spectrometry (MS)-based proteomic studies and next generation sequencing, the research group endeavour to identify the interactions of biomolecules and decipher their physiological functions. This is achieved through computational statistical analysis and computer modeling.
The group is led by Professor Jürgen Cox. Working collaboratively with the research department run by Professor Matthias Mann, Cox developed MaxQuant, a quantitative proteomics software tool that analyses large MS data sets. Cox's research is also focused on medicine, where his methods could be utilized for a wide range of applications.
Technology Networks recently spoke with Professor Cox to learn more about the software tools he is developing, and how 4D proteomics workflows enhance research in clinical proteomics.
Molly Campbell (MC): Please can you explain what 4D proteomics is, and how it contributes to our understanding of the role of proteins in clinical conditions?
Jürgen Cox (JC): 4D proteomics is a novel way of performing proteomics measurements, enabled by recent developments in MS instrumentation, for instance those made by Bruker with the timsTOF Pro, and by improvements in proteomics software such as MaxQuant, which we develop in my lab. The essential ingredient is ion mobility spectrometry, which is coupled to the standard LC-MS/MS proteomics workflow. It increases the number of dimensions of the raw data from three (m/z, retention time and signal intensity) to four.
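As a rough illustration of the dimensionality described in this answer, a single raw data point could be modelled as follows. This is a minimal sketch only: the class name, field names and example values are hypothetical and do not reflect the data format of MaxQuant or of any instrument vendor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RawPoint:
    """One hypothetical raw data point from an LC-MS/MS run."""
    mz: float                 # mass-to-charge ratio
    retention_time: float     # chromatographic retention time (assumed seconds)
    intensity: float          # signal intensity
    ion_mobility: Optional[float] = None  # present only when ion mobility is recorded

    def dimensions(self) -> int:
        # Three dimensions for a standard workflow, four once ion mobility is added.
        return 3 if self.ion_mobility is None else 4


# Illustrative values only, not measured data.
point_3d = RawPoint(mz=500.25, retention_time=1200.0, intensity=1.5e6)
point_4d = RawPoint(mz=500.25, retention_time=1200.0, intensity=1.5e6,
                    ion_mobility=0.95)
```

The optional fourth field is simply one way to show that the 4D case extends the 3D record rather than replacing it.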
MC: Why is it advantageous to adopt a 4D proteomics approach, based on TIMS and PASEF, to study proteins rather than the standard 3D approach?
JC: The 4D proteomics approach offers several advantages. The underlying TOF technology offers a high speed of generating peptide fragmentation spectra which are then used to identify peptides. This allows for scanning the extra dimension in depth. The extra dimension achieves a better separation of peptide features in the raw data. Also, with the collision cross section, it offers an additional molecular feature that can be used for improving the identification of peptides.
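The separation benefit mentioned in this answer can be sketched with a toy example: two peptide features that look identical in m/z and retention time become distinguishable once ion mobility is compared as well. The function name, tolerances and values below are assumptions chosen for illustration and are not taken from MaxQuant or from any published workflow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feature:
    """A hypothetical peptide feature summarised by its peak coordinates."""
    mz: float
    retention_time: float  # assumed seconds
    ion_mobility: float    # arbitrary units for this sketch


def overlaps(a: Feature, b: Feature, mz_tol: float = 0.01,
             rt_tol: float = 5.0, im_tol: Optional[float] = None) -> bool:
    """Return True if two features cannot be told apart within the tolerances.

    With im_tol=None only m/z and retention time are compared (the 3D case);
    passing im_tol also compares ion mobility (the 4D case).
    """
    close = (abs(a.mz - b.mz) <= mz_tol
             and abs(a.retention_time - b.retention_time) <= rt_tol)
    if im_tol is None:
        return close
    return close and abs(a.ion_mobility - b.ion_mobility) <= im_tol


# Illustrative values only: near-identical m/z and retention time,
# but clearly different ion mobility.
a = Feature(mz=650.300, retention_time=1800.0, ion_mobility=0.85)
b = Feature(mz=650.305, retention_time=1802.0, ion_mobility=1.10)

overlaps_3d = overlaps(a, b)               # compared without ion mobility
overlaps_4d = overlaps(a, b, im_tol=0.05)  # compared with ion mobility
```

In this toy case the two features overlap when only m/z and retention time are considered, and are resolved once the ion mobility tolerance is applied.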
MC: Why have you chosen to adopt a multi-disciplinary research approach in your lab when researching proteomics? Are there any challenges associated with this?
JC: We need expertise from a large portfolio of different backgrounds. Being cutting edge in software and algorithm development alone would not be sufficient for our team. In addition, we need knowledge in statistics and machine learning, including deep learning, and understanding of the inner workings of MS instrumentation. Finally, in order to be able to translate the data into biomedical findings, we have to integrate knowledge about biological processes.
MC: Which technologies do you most commonly adopt in your lab and why?
JC: We are a computational lab without a wet lab or MS facility of our own. Instead, we collaborate with university research labs, MS vendors and pharmaceutical companies to incorporate the non-software aspects of our work. This arrangement allows us to work on a broad range of technologies.
MC: You developed MaxQuant, which you say is a "work in progress" that is "continually expanding and improving to meet the complexity of biological processes". What have some of the most recent improvements been and what are your next steps here?
JC: One of the most recent improvements is a breakthrough in predicting MS/MS spectra with the help of deep learning algorithms. The precision with which we can predict fragmentation spectra from sequence will open up new directions in analysing data-independent acquisition proteomics data and improve coverage in shotgun proteomics workflows.
MC: When developing software for proteomics and computational methods suitable for use in a clinical environment, what key features are required? What challenges exist in developing such software?
JC: One of the main challenges in analysing clinical samples, in particular the plasma proteome, is the high dynamic range of protein concentrations within a sample. Here, advances in instrumentation and software are continuously improving sensitivity. Particular challenges for software development are to drive quantification to its limits in terms of accuracy and sensitivity for both data-dependent acquisition (DDA) and data-independent acquisition (DIA) data. Also, since clinical data analysis has direct implications for the health of human individuals, special care has to be devoted to the reliability of answers obtained in data analysis.
MC: How can 4D proteomics advance the research field going forward? Are there any barriers?
JC: 4D proteomics offers good depth in terms of the number of proteins that can be measured and a respectable dynamic range, both of which are beneficial for many applications.
Professor Jürgen Cox was speaking with Molly Campbell, Science Writer, Technology Networks.