The Evolution of Proteomics - Dr Gary Kruppa
Gary Kruppa, PhD, Vice President of Proteomics, Bruker Daltonics Inc., has over 30 years of experience in the field of mass spectrometry (MS), having served as a Vice President at Bruker Daltonics for over 20 years.
Kruppa received his PhD in chemical physics from the California Institute of Technology, and his BS from the University of Delaware. Kruppa oversees market and applications development management for Bruker's innovative solutions for research in proteomics.
In this instalment of The Evolution of Proteomics, Kruppa discusses the recent technological advances that are driving MS-based plasma proteomics for biomarker discovery and beyond.
Molly Campbell (MC): Can you provide some background into the need for new plasma biomarkers? How can new methods improve biomarker discovery?
Gary Kruppa (GK): Many existing diagnostic methods that are currently based on proteins measured by antibody binding could be translated to MS-based assays, improving specificity. Leigh Anderson’s review paper, “The Human Plasma Proteome”, published as early as 2002, provides a good reference to both the advantages of plasma proteomics and its inherent challenges.
There is an unmet need for new biomarkers for many diseases, but the protein content of plasma is very complex, making discovery and validation a challenge. A number of cancers currently have no known plasma biomarkers, yet early detection of these cancers would revolutionize patient care. New techniques for biomarker discovery, as well as instruments with high sensitivity and robustness, are required to meet this need. The abundance of biomarkers for early-stage cancer in plasma is likely to be quite low, so very good sensitivity and very high coverage of the plasma proteome are required to detect such biomarkers.
MC: In addition to medicine, what other fields of science can benefit from proteomics insights?
GK: The development of new proteomics technologies is key for pharmaceutical drug development. Many drug targets are proteins, so a deeper understanding of the relevant cellular mechanisms is essential: a target’s normal concentration range and turnover rate, the accessibility of its protein pockets, protein–lipid complexes on the cell surface, and multiple on- or off-target hits are all important factors that MS-based proteomics helps unravel. Once researchers have a suitable target and a drug with which to target it, they want to assess the proteomic profile to ensure they are knocking down the expression of the target protein and affecting only that pathway. The more sensitive the methods, the smaller the on- and off-target effects on the proteome that can be studied.
Much early stage drug development is done in cell culture, so sensitivity and dynamic range are not as much of a problem as they are in plasma, but the whole range of pharmaceutical sciences is very interested in proteomics for numerous applications. In the field of biology, metaproteomics is a major area for study, which includes the study of how organisms interact, which organisms are present, and how their proteomes are affected by these interactions.
A key area of metaproteomics is the study of gut microbes in humans, both from a fundamental science standpoint and in terms of the effect of the human microbiome on health. Host–pathogen protein–protein interaction (PPI) networks are another key area of focus, where drug development could be tailored to target viral diseases such as Zika, dengue or Ebola, which hijack the cellular protein machinery of the host and often lead to widespread illness and even death.
MC: Can you explain the use of 4D matching, and what this means for biomarker discovery?
GK: 4D matching has a huge impact on biomarker discovery, pharmaceutical drug development, and fields like metaproteomics and PPI analysis. In standard bottom-up proteomic studies, the proteome is digested and the resulting peptides are detected as they elute from a liquid chromatography (LC) system. The retention time on the LC column is one dimension of analysis, the measured mass is the second, and the intensity of the peaks is the third. The resultant 3D peaks are integrated to give the intensity of each peptide; the peptide is then identified by MS/MS, which tells you what protein it came from and, by inference, how much of that protein was in the sample.
The fourth dimension (4D) refers to the addition of ion mobility, which has been around for a number of years but has not been used routinely in proteomics. The invention of trapped ion mobility spectrometry (TIMS) by Bruker has made the routine use of ion mobility in proteomics possible. Additionally, the parallel accumulation–serial fragmentation (PASEF) scan technique in the TIMS cell increases sensitivity and speed. Because the ions are trapped and then elute as a function of their mobility, this additional dimension of information can be used to improve identification. When multiple peptides co-elute off the nano-LC column, their unique collision cross sections (CCS) allow further gas-phase separation in the TIMS cell, so more peptides can be identified. This gas-phase separation from TIMS is the fourth dimension, in addition to retention time, mass-to-charge, and intensity, and the approach is termed 4D matching.
The benefits of 4D matching with the PASEF scan allow researchers to identify lower-abundance proteins, such as tissue leakage proteins or signaling proteins, with higher confidence and with the required high sensitivity. For example, in data-dependent analysis (DDA/PASEF), the CCS values can be used as an additional identification criterion in the search engine to provide confidence in the peptide identification. In data-independent analysis (DIA/PASEF), or an intermediate method called “match between runs”, the CCS value serves as a unique peptide signature to help align features across runs and increase confidence in assignments.
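The idea of matching features across runs in all four dimensions can be illustrated with a minimal sketch. This is a hypothetical toy example, not Bruker's actual algorithm: the `Feature` record, the `matches` helper, and all tolerance values are assumptions chosen only to show how a CCS tolerance rejects a candidate that retention time and m/z alone would accept.

```python
# Toy sketch of 4D "match between runs": a feature identified in one run is
# transferred to another run only if retention time, m/z, AND CCS all agree
# within tolerances. All values and tolerances here are illustrative.
from dataclasses import dataclass

@dataclass
class Feature:
    rt: float         # retention time, seconds
    mz: float         # mass-to-charge ratio
    ccs: float        # collision cross section, A^2
    intensity: float  # integrated peak intensity

def matches(lib: Feature, obs: Feature,
            rt_tol: float = 30.0, mz_ppm: float = 10.0,
            ccs_pct: float = 1.0) -> bool:
    """True if the observed feature agrees with the library feature
    in retention time, m/z, and CCS within the given tolerances."""
    return (abs(obs.rt - lib.rt) <= rt_tol
            and abs(obs.mz - lib.mz) / lib.mz * 1e6 <= mz_ppm
            and abs(obs.ccs - lib.ccs) / lib.ccs * 100.0 <= ccs_pct)

# A feature identified by MS/MS in run 1 ...
lib = Feature(rt=612.0, mz=722.3248, ccs=420.5, intensity=1.8e5)

# ... and two unidentified features from run 2. Both agree in rt and m/z;
# only the CCS dimension separates them.
hit = Feature(rt=618.4, mz=722.3251, ccs=421.9, intensity=9.2e4)
miss = Feature(rt=618.4, mz=722.3251, ccs=455.0, intensity=9.5e4)

print(matches(lib, hit))   # True: all four dimensions agree
print(matches(lib, miss))  # False: CCS rules out the co-eluting isobar
```

The point of the sketch is the last comparison: without the CCS dimension, `miss` would be an ambiguous (and potentially wrong) assignment, since it co-elutes with the library peptide at essentially the same mass.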
MC: What stages are involved in the development of a novel technology for use in proteomics research?
GK: Firstly, you have to identify the unmet need of a particular application, which in proteomics is usually the depth of coverage of the proteome, speed, and sensitivity. There are an enormous number of proteins in plasma, over an incredible dynamic range, so this is the major unmet need that has been recognized for years.
While you want to drive the higher sensitivity and dynamic range by improving the specificity, sensitivity and speed of the mass spectrometers, you have to keep in mind that in the clinic, samples must be analyzed quickly. This is another unmet need – to be able to generate a proteome from a person in an hour at a reasonable cost, to ensure applicability in the clinic. Researchers may be willing to spend hours to days on a single sample to find a biomarker, but validation cannot take this long because you need a minimum of a thousand samples, and for that to work you need a method that can be done in an hour or less. For routine application in the clinic with thousands of patients per day, it has to be even shorter.
Robustness is another need to be met. To routinely measure a patient’s proteome and compare results at regular intervals, measurements must be reproducible, so that the changes observed reflect the patient rather than instrument performance. Thus, you need a mass spectrometer that is very robust. In addition to its 4D matching capability, which adds specificity, and its speed, which enables you to hit a lot more targets, the robustness of Bruker’s timsTOF Pro is a big advantage.
Once the unmet needs are identified, we must then develop solutions to them. In many cases, partnerships and collaborations are also crucial. Bruker works closely with both commercial and academic software partners, e.g. Bioinformatics Solutions Inc. (the producer of PEAKS software for proteomics), MaxQuant, Skyline, and Protein Metrics Inc. (the producer of Byonic™), to help dig deeper into results.
We also partner with chromatography experts to maximize robustness and speed. Samples are injected into an LC system to separate the peptides, and generally for sensitivity purposes this is done with nanoflow chromatography. However, alternative methods may be more suitable for clinical applications because nanoflow can pose some practical challenges.
With the timsTOF Pro, the nanoflow chromatography is one of the bottlenecks, so we have partnered with Evosep, which has developed a very robust, high-throughput LC system, the Evosep One, with moderately low flow rates that are ideal for clinical research applications. By partnering with these different companies and academic institutions, we can help speed up development and bring best practices from different sources to meet these needs.
MC: In your opinion, what have been some of the most exciting technological developments in the proteomics research field thus far? What major advances do you see in the future of the proteomics research field?
GK: Even after nearly 25 years of proteomics, the field remains largely fragmented, especially in contrast with genomics and the landmark human genome sequencing work led by Venter and Lander in the early 2000s.
In proteomics, the challenge remains achieving relevant proteomic depth in the shortest possible time, which is no easy task given that the human genome contains about 20,000 protein-coding genes. To achieve the required depth within the duty cycle of modern mass spectrometers, two alternative approaches are employed, each with its pros and cons: DIA, and DDA with a “match between runs” philosophy.
Furthermore, LC-MS is a hyphenated technique, and LC methods segment into nanoflow and microflow, each with its strengths and weaknesses. So unlike DNA sequencing, which benefited from PCR amplification and thus removed the focus from the analytical technique itself, proteomics very much remains dependent on advances in both LC-MS and in the bioinformatics needed to decipher the complex information acquired every second.
The Orbitrap mass spectrometer made high-resolution, accurate-mass data routine, which in turn triggered bioinformatics tools that used the accurate mass information to dramatically improve confidence in analysis. This key development resulted in the widespread use of MS-based proteomics, leading to rapid advances in the field. The timsTOF Pro with PASEF technology represents a new step-change for MS-based proteomics, as it adds an additional key qualifier – the peptide CCS value. Even today, chromatographic retention time plays a key role in bioinformatics, so adding peptide CCS values – a critical gas-phase separation signature of the peptide – could dramatically improve the confidence of analysis by reducing false discovery rates, enable the discovery of multiple-site PTMs on the same peptide sequence, and allow deeper coverage of the proteome by triggering MS/MS in windows around certain ion mobilities. This unique CCS signature could also be used to add a level of intelligence, for example in immunopeptidomics (and other non-tryptic peptide or targeted proteomics applications) using PASEF-triggered parallel reaction monitoring (PRM), or in connecting the various PPI pathways to study host–pathogen infections and discover new pharmaceutical drugs.
The field has been developing rapidly for the past 25 years, thanks to the continuous evolution of MS-based proteomics. The dramatic improvement in robustness and speed delivered by the timsTOF Pro makes large cohort studies possible, analogous to the impact made by genome-wide association studies (GWAS). We believe the TIMS/PASEF approach, with its critical CCS peptide-signature information, together with advances in machine-learning-capable bioinformatics tools, will be important in enabling proteomics to become more clinically relevant.
Gary Kruppa was speaking with Molly Campbell, Science Writer, Technology Networks.