Deciphering the Cancer Proteome
Researchers have been attempting to decode protein alterations in cancer for many years, but the dynamic nature of proteins, coupled with the fact that cancer is an extremely diverse and heterogeneous disease, has made deciphering the human cancer proteome difficult.1
Nevertheless, efforts to advance tools and develop novel strategies to ‘de-complex’ the proteome – allowing researchers to analyze it in a more accessible format – have fueled progress in cancer proteomics. Here we explore the current state of cancer proteomics, highlight some of the techniques used to investigate the proteome, and look at the impact of combining proteomics with other ‘omics’ in cancer research.
What is a proteome?
A proteome is the entire set of proteins that can be expressed by an organism. It can also be used to describe the proteins produced in a specific cell or tissue type at a given time. Proteomics is the term used to define the large-scale study of proteomes.2,3
Oncoproteomics: Tools for investigating the cancer proteome
What is oncoproteomics?
Oncoproteomics, also known as ‘cancer proteomics’, is the term used to describe the use of proteomics technologies to study proteins and their interactions within a cancer cell.4,5
Oncoproteomics holds great promise as a means of unveiling the molecular events that disrupt normal cell function and trigger the transformation of normal cells into cancer cells.
There are several ongoing proteome projects that intend to help revolutionize our understanding of the human proteome, including the Human Proteome Project (HPP), organized by the Human Proteome Organization (HUPO), and the Human Protein Atlas, a project that is primarily based in Sweden. The Human Protein Atlas aims to “map all of the human proteins in cells, tissues and organs” using numerous ‘omics’ technologies such as mass spectrometry-based proteomics, antibody-based approaches and knowledge-based proteomics.
In addition to the above-mentioned proteome projects, specialized projects with a specific focus on cancer also exist.
“Through recent consortium projects such as The Cancer Genome Atlas (TCGA) or Clinical Proteomic Tumor Analysis Consortium (CPTAC), proteomics data has been systematically generated over large cohorts of cancer patients. These studies provide the comprehensive catalogs of oncogenic events at the protein level and reveal tumor subtypes by proteomics data,” explains Han Liang, Ph.D., Department of Bioinformatics and Computational Biology, Division of Science, MD Anderson Cancer Center.
There are ongoing efforts to generate antibodies that are capable of recognizing specific components of the human proteome. There are several antibody-based approaches that can be used in conjunction with high-throughput assays, including tissue and protein microarrays, to examine the cancer proteome. These technologies are capable of targeting and profiling several proteins. The resulting “proteomic snapshots” can then be converted to proteomic maps exposing the proteome composition.6,7
Forward-phase protein array
In a forward-phase protein array (FPPA), analytes are captured by a capture molecule (typically an antibody) that is immobilized on the array surface. Each microarray ‘spot’ consists of one specific type of immobilized antibody. Numerous parameters can be analyzed from a single sample. The sample analytes can be directly or indirectly (sandwich assay) labeled.8
Reverse-phase protein array
A major focus of Liang’s group is to develop bioinformatics tools and analyses for cancer functional proteomics data generated from a protein microarray. “We focus on functional proteomic data generated by reverse-phase protein arrays. This rapidly maturing quantitative antibody-based assay can assess many protein markers in many samples in a cost-effective, sensitive and high-throughput manner,” says Liang.
A reverse-phase protein array (RPPA) platform immobilizes minute amounts of protein lysates on individual spots on a microarray. Each array is incubated with a specific detection antibody, which enables you to detect the relative expression of the corresponding protein across multiple samples simultaneously.6
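As a rough illustration of how relative expression across samples might be read from one array, the sketch below log2-transforms spot intensities and centers them on the median across samples. The sample names, intensity values and the function name `median_center_log2` are hypothetical, and real RPPA pipelines typically include further steps (such as dilution-curve fitting and loading correction) that are not shown here.

```python
import math
from statistics import median


def median_center_log2(intensities):
    """Return log2 intensities centered on the median across samples.

    `intensities` maps a sample name to the raw spot intensity measured
    with one detection antibody (one protein) on one array.
    """
    log_values = {s: math.log2(v) for s, v in intensities.items()}
    center = median(log_values.values())
    return {s: lv - center for s, lv in log_values.items()}


# Hypothetical raw intensities for one protein across four lysates.
raw = {"tumor_A": 800.0, "tumor_B": 200.0, "normal_A": 400.0, "normal_B": 400.0}
relative = median_center_log2(raw)

for sample, value in relative.items():
    # Positive values are above the median sample, negative values below it.
    print(f"{sample}: {value:+.2f}")
```

Because the values are on a log2 scale, a difference of 1 between two samples corresponds to a two-fold difference in raw intensity.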
The MD Anderson Cancer Center has been a leader in the implementation of this technology. The RPPA platform currently contains ~300 protein markers, covering all major signaling pathways. Liang’s group has developed an integrated bioinformatics resource, The Cancer Proteome Atlas (TCPA), which contains the RPPA data of ~10,000 patient samples and ~1000 cell lines.
“TCPA provides user-friendly modules for fluent exploration of RPPA data visualization and analysis in a rich context. It is a major data portal for TCGA and will support other important NCI initiatives,” says Liang.
Mass spectrometry-based proteomics
Mass spectrometry (MS) is an indispensable tool for cancer proteomics studies. It can detect very small changes within the proteome and has a multitude of uses within the field – disease diagnosis, biomarker discovery, drug toxicity monitoring… the list goes on.9
“Some of the applications include the identification of proteins and their post-translational modifications, the elucidation of protein complexes, their subunits and functional interactions, as well as the global measurement of proteins in proteomics.”
“MS can also be used to localize proteins to the various organelles and determine the interactions between different proteins as well as with membrane lipids,” explains Dr. William Cho, Department of Clinical Oncology, Queen Elizabeth Hospital, Hong Kong.
Cho explains that ionization of proteins is the fundamental step in mass spectrometry, with electrospray ionization and matrix-assisted laser desorption/ionization being the most commonly used methods.
“These ionization techniques are used in conjunction with mass analyzers such as tandem mass spectrometry. Usually the proteins are analyzed either in a ‘top-down’ approach in which proteins are analyzed intact, or a ‘bottom-up’ approach in which proteins are first digested into fragments,” says Cho.
Top-down vs bottom-up
Bottom-up MS-based proteomics or “shotgun proteomics” is a well-established approach that involves digesting the protein sample with a protease (e.g. trypsin). The resulting peptides are separated using liquid chromatography and then mass analyzed using MS. The spectra are subsequently compared against specialized proteome reference databases to identify peptide-spectrum matches.
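The digestion and database lookup steps can be sketched in silico. The example below is a minimal illustration, assuming the commonly used simplified trypsin rule (cleave after K or R unless the next residue is P) and a toy two-entry reference database; the sequences, protein IDs and function names are hypothetical. Real search engines compare measured spectra with theoretical ones rather than matching peptide strings directly.

```python
def tryptic_digest(sequence):
    """Split a protein sequence into peptides using a simplified trypsin rule.

    Cleaves C-terminal to K or R, except when the next residue is P.
    Missed cleavages are not modeled in this sketch.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def build_peptide_index(proteins):
    """Map each theoretical peptide to the set of protein IDs that contain it."""
    index = {}
    for protein_id, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            index.setdefault(peptide, set()).add(protein_id)
    return index


# Hypothetical reference database (toy sequences, not real proteins).
reference = {"PROT_1": "MKWVTFISLLRPSAKGG", "PROT_2": "AAKWVTFISLLRPSAKCC"}
index = build_peptide_index(reference)

print(tryptic_digest(reference["PROT_1"]))
print(sorted(index["WVTFISLLRPSAK"]))
```

In this toy case the peptide `WVTFISLLRPSAK` maps to both entries, which illustrates why a peptide shared between proteins cannot identify either of them on its own.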
However, ‘bottom-up’ methods are not well suited to analyzing a protein’s post-translational modifications (PTMs), because the proteomic mixture must be digested before it can be mass analyzed. Top-down MS-based proteomics has therefore gained momentum, as the approach involves introducing the intact protein into the mass spectrometer – meaning it is possible to obtain more detailed information about PTMs.10
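One way to see why intact-mass measurements are informative about PTMs: each modification adds a characteristic mass, so the difference between an observed intact mass and the unmodified theoretical mass hints at what is attached. The sketch below is illustrative only. It assumes monoisotopic mass shifts for three common modifications, considers a single modification type at a time, and uses hypothetical masses and a hypothetical function name `candidate_ptms`.

```python
# Monoisotopic mass shifts (Da) for a few common modifications.
PTM_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
}


def candidate_ptms(theoretical_mass, observed_mass, tolerance=0.02, max_count=3):
    """Suggest single-type PTM explanations for an intact-mass difference.

    Returns (modification, count) pairs whose combined shift lies within
    `tolerance` Da of the observed difference. Mixed modification types
    are not considered in this sketch.
    """
    delta = observed_mass - theoretical_mass
    hits = []
    for name, shift in PTM_SHIFTS.items():
        for count in range(1, max_count + 1):
            if abs(delta - count * shift) <= tolerance:
                hits.append((name, count))
    return hits


# Hypothetical masses: unmodified sequence vs. an observed intact protein.
print(candidate_ptms(15000.000, 15159.933))
```

A tight tolerance matters here: three methyl groups (about 42.047 Da) and one acetyl group (about 42.011 Da) differ by only a few hundredths of a dalton.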
PTMs have a central role in cell signaling, and aberrant signaling has been identified as a key mechanism in cancer biology, which underlines the importance of PTM analysis. Due to their involvement in neoplastic transformation, PTMs are potential cancer biomarkers and/or attractive therapeutic targets.11
Challenges associated with deciphering the cancer proteome
Although proteome studies are gaining momentum, challenges remain: cost, protein stability and the limited availability of large-scale proteomic data all slow progress within the field.
Liang touches on some of these challenges: “First, in general, proteome studies are more expensive and require larger amounts of tumor cells, and these factors limit the possibility that proteomics can be routinely applied to clinical samples (as DNA sequencing) in a cost-effective manner. So far, large-scale proteomic data of clinical patients’ samples are still limited.”
“Second, since proteins are less stable, the quality control step is critical, and there is a major need to develop systematic approaches to credential proteomic data of clinical samples.”
“Third, there has been limited progress to elucidate intratumor heterogeneity using proteomic approaches, but in contrast, tremendous progress has been made by single-cell DNA/RNA sequencing approaches.”
Cancer proteomics joins forces with other ‘omics’
Emerging ‘omics’ technologies are being increasingly used for cancer research and biomarker discovery. “Peptides identified with mass spectrometry are used for improving gene annotations and protein annotations. Parallel analysis of the genome and the proteome facilitates discovery of post-translational modifications and proteolytic events,” explains Cho.
Proteogenomics is the term used to describe the integration of proteomics, transcriptomics and genomics techniques. Whilst proteogenomics is not restricted to the cancer field, it is a particularly useful approach for studying molecular signatures of cancers and for identifying tumor-specific peptides.12,13 This unified approach has provided more detailed and accurate insight into cancer biology than data derived from a single ‘omics’ technology in isolation.
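A very simple form of such integration is to compare transcript and protein abundance for the same genes across the same tumors. The sketch below is a minimal illustration, assuming matched log-scale measurements; the gene names, values and variable names are hypothetical. A weak correlation for a gene may hint at regulation beyond the transcript level, although measurement noise can produce the same pattern.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log-scale abundances for the same five tumors.
mrna = {"GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0], "GENE_B": [2.0, 2.1, 1.9, 2.0, 2.2]}
protein = {"GENE_A": [1.1, 2.0, 2.9, 4.2, 5.1], "GENE_B": [3.0, 1.0, 2.5, 1.2, 2.8]}

# Only genes measured on both platforms can be compared.
correlations = {
    gene: pearson(mrna[gene], protein[gene])
    for gene in sorted(mrna.keys() & protein.keys())
}

for gene, r in correlations.items():
    print(f"{gene}: r = {r:.2f}")
```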
Liang concludes: “Through integrative analysis with other ‘omic’ data such as whole-exome and transcriptome data, we have a better understanding about how key oncogenic events are achieved by dysregulation.”