Whole-Genome Analysis of 12,000 Patients Reveals “Treasure Trove” of Cancer Insights

Scientists and clinicians from the Cambridge University Hospitals NHS Trust and the University of Cambridge – supported by Cancer Research UK (CRUK) – have conducted the largest whole-genome sequencing analysis of cancer tumors to date. The researchers say they have uncovered a “treasure trove” of insights into the underlying molecular mechanisms behind different forms of cancer. Their work, published in Science, has also identified novel mutational signatures in individuals and across sub-populations.

Next-generation sequencing in cancer

In the early 2000s, next-generation sequencing (NGS) technologies emerged that transformed scientific research. The ability to read and explore the function of DNA code at scale and pace has had a dramatic impact on many fields of research and how we study the biology of humans and other organisms.

In clinical medicine, NGS has helped us to understand how slight variations in the genetic code from one person to another can contribute to a disease, or why the same drug – administered at the same dosage – may be metabolized differently across individuals, leading to reduced efficacy or unwanted side effects.

In oncology, NGS has proven critical for exploring the heterogeneity of cancer in our quest towards personalized medicine. By identifying, characterizing and quantifying even the smallest genetic differences between cancer cells from the same tumor, or between two patients’ tumors, researchers can begin to tailor treatments accordingly.

What is heterogeneity?

In this context, this term describes differences in tumors – e.g., breast cancer tumors – both at the cellular level within a tumor and between tumors from different patients.


Broadly speaking, NGS is subcategorized into:

  • Whole-exome sequencing (WES) methods, where only the protein-coding regions of the genome are sequenced.

  • Whole-genome sequencing (WGS) methods, where the entirety of the genome – including coding and non-coding regions – is sequenced.

“WES tends to seek out ‘driver’ mutations that occur in cancer genes, which are believed to be causally implicated in cancer,” explains Serena Nik-Zainal, professor of genomic medicine and bioinformatics at the University of Cambridge, NIHR research professor and honorary consultant in clinical genetics at Cambridge University Hospitals NHS Trust.

Nik-Zainal adds that there are only a handful of driver mutations per cancer. “Beyond driver mutations, there are thousands and thousands of mutations that may not have been considered to be important in the past but are in fact rather informative. They can tell us about things that have gone awry during the formation of each person’s cancer,” she says.

These “awry” processes might include the influence of environmental factors and lifestyle choices, such as exposure to UV light or smoking, on cell biology, or biological processes such as DNA repair defects.

In this context, WGS offers more than WES: it lets scientists see more of the genome, enabling new discoveries. Nik-Zainal has been applying WGS to the study of breast cancer for several years. Now, she is the leader of a new research study that has utilized WGS in the analysis of over 12,000 cancer patient tumors.

Novel mutation signatures

The data were obtained from the 100,000 Genomes Project, a UK-based initiative by Genomics England designed to sequence 100,000 genomes from NHS patients affected by rare diseases or cancer.

Applying WGS to this large cohort yielded a vast volume of genetic data, from which the research team were able to explore the genetic landscape of specific cancers and identify genetic patterns, or “mutational signatures”.

In total, 58 novel mutational signatures were identified – a “treasure trove” of information, the researchers say. Nik-Zainal likens this identification process to finding criminal fingerprints. Once the fingerprints have been detected, the key challenge is matching them to a suspect. To whom do the fingerprints belong? In the case of studying cancer mutations, what is the source of the mutation?
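The fingerprint analogy can be made concrete with a small sketch. One common way to compare mutational profiles is cosine similarity, and the Python below scores an unknown profile against a few reference profiles. This is an illustration only, not the method used in the study: the six-channel profiles, their names and all numbers are invented for the example, and real substitution signatures are typically described over many more channels.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile matches nothing
    return dot / (norm_a * norm_b)


def best_match(profile, references):
    """Return (name, score) of the reference profile most similar to `profile`."""
    scores = {name: cosine_similarity(profile, ref) for name, ref in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical toy profiles over six substitution classes (values are made up).
CHANNELS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
reference = {
    "toy_uv_like": [0.02, 0.02, 0.85, 0.03, 0.05, 0.03],
    "toy_smoking_like": [0.60, 0.08, 0.12, 0.08, 0.07, 0.05],
    "toy_aa_like": [0.05, 0.03, 0.05, 0.75, 0.07, 0.05],
}

# A hypothetical newly observed profile, the "fingerprint" to be matched.
unknown = [0.06, 0.04, 0.07, 0.70, 0.08, 0.05]

name, score = best_match(unknown, reference)
print(f"closest reference: {name} (cosine similarity {score:.3f})")
```

A high similarity only suggests a candidate source; as the researchers note, many signatures still have no known cause.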

“Some of those signatures have behaviors that suggest they are likely due to external or environmental sources,” says Nik-Zainal. She provides signature “113” as an example. It has many hallmarks that imply it is related to another well-known environmental carcinogen, aristolochic acid. “That’s quite worrying,” Nik-Zainal explains. “As clearly the patients have been exposed to something that can damage their DNA.”

Some of the novel mutational signatures resembled those associated with certain chemotherapies, such as platinum-based drugs. These are therefore likely to result from iatrogenic exposure, meaning they have been incurred by medical interventions.

“Other signatures occurred only in particular tissues, which leaves us to think that there must be something very specific about the organs that they have been found in. These are likely to be intrinsic factors, and there have been one or two genetic factors that produce marked signatures. These are important to identify because they may indicate sensitivities to different drugs,” Nik-Zainal explains.

Collectively, the WGS data provide insight into a wide variety of possible cancer causes, both internal and external, as well as many novel signatures with no known cause. “There is much more to learn,” Nik-Zainal emphasizes.

“This study shows how powerful whole genome sequencing tests can be in giving clues into how the cancer may have developed, how it will behave and what treatment options would work best,” Michelle Mitchell, chief executive of CRUK, said in a news release.

Translating findings from the lab to the clinic

Studies that explore the biological underpinnings of diseases such as cancer are incredibly valuable for clinicians. However, it can be challenging – and a long process – for this data to trickle into patient care. Nik-Zainal explains: “There are many hurdles between making interesting biological findings and turning them into useful clinical applications. Currently, one needs to demonstrate whether any computational tool/algorithm has clinical value first. That requires well-planned studies where in addition to collecting samples and genetic information, clinical information is collected in parallel and in a systematic way.”

In addition, generated data needs to be stored and organized in a sensible way that is also accessible. “Today, the hurdle is not in generating data, but in accessing it, asking the best questions and using it to the best of its potential,” Nik-Zainal adds.

She envisions that, should these hurdles be overcome, genomic analysis will become a routine part of cancer assessments. Just as the current standard of care often dictates that patients undergo a blood test or a CT scan, Nik-Zainal sees genomic tests as another step towards personalized care. “Currently, some think of genomics as a little esoteric, mysterious and/or academic, but one of the things that the 100,000 Genomes Project shows us is that we can recruit and perform cancer WGS at scale from a public-sector service, and have a real-world population impact,” she says.

To ensure that the wealth of genetic data generated from this research can be translated to patient care, the research team developed a computational tool known as FitMS. It is designed to help scientists and clinicians identify mutational signatures – both old and novel – in cancer patients, which may inform their treatment strategy.

“FitMS is ready-to-use. It is available with the publication, and we are hoping to implement it into the Genomics England NHS bioinformatics pipeline so that it is accessible nationally. It is also available as a separate package for anyone to use from anywhere in the world,” Nik-Zainal notes.
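To give a sense of what identifying signatures in a patient's tumor can involve, here is a minimal Python sketch of the general idea of signature fitting: estimating how much each known signature contributes to a tumor's mutation counts. It is not FitMS and does not reproduce its algorithm. It uses plain projected gradient descent on a non-negative least-squares problem, and the six-channel signatures, the function name `fit_exposures` and all numbers are assumptions made up for illustration.

```python
def fit_exposures(catalog, signatures, iterations=2000):
    """Estimate non-negative exposures x minimizing 0.5 * ||S x - catalog||^2.

    `signatures` is a list of profiles (one list of channel weights per
    signature) and `catalog` is the tumor's mutation count per channel.
    Uses projected gradient descent; this is an illustrative choice.
    """
    n_channels = len(catalog)
    n_sigs = len(signatures)
    # 1 / ||S||_F^2 is a conservative step size: the squared Frobenius norm
    # is an upper bound on the largest eigenvalue of S^T S.
    step = 1.0 / sum(v * v for sig in signatures for v in sig)
    x = [0.0] * n_sigs
    for _ in range(iterations):
        # Residual between the reconstructed and the observed catalog.
        residual = [
            sum(signatures[k][c] * x[k] for k in range(n_sigs)) - catalog[c]
            for c in range(n_channels)
        ]
        # Gradient of the objective with respect to each exposure.
        grad = [
            sum(signatures[k][c] * residual[c] for c in range(n_channels))
            for k in range(n_sigs)
        ]
        # Gradient step, then project back onto x >= 0.
        x = [max(0.0, x[k] - step * grad[k]) for k in range(n_sigs)]
    return x


# Hypothetical toy signatures over six substitution classes (values are made up).
toy_uv_like = [0.02, 0.02, 0.85, 0.03, 0.05, 0.03]
toy_smoking_like = [0.60, 0.08, 0.12, 0.08, 0.07, 0.05]
toy_aa_like = [0.05, 0.03, 0.05, 0.75, 0.07, 0.05]

# A synthetic tumor catalog built from 300 mutations of the first toy
# signature and 100 of the third, with no noise.
catalog = [300 * u + 100 * a for u, a in zip(toy_uv_like, toy_aa_like)]

exposures = fit_exposures(catalog, [toy_uv_like, toy_smoking_like, toy_aa_like])
for label, value in zip(["toy_uv_like", "toy_smoking_like", "toy_aa_like"], exposures):
    print(f"{label}: {value:.1f}")
```

Real catalogs are noisy, so practical tools also need ways to decide which signatures are genuinely present rather than fitted by chance; that part is outside the scope of this sketch.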

Representation in genomic studies

A key bottleneck in clinical genomics research is that study cohorts have historically not been representative of wider populations. Consequently, applying the results from such studies more broadly has been problematic. Technology Networks asked Nik-Zainal to discuss how representative the 100,000 Genomes Project sample is.

“Many genomic studies are limited to certain geographic regions within a country which tends to result in a somewhat limited representation of the total population,” she explains. “One of the benefits of this study was that it captured cancers from every geographic region in England. The region was divided into 13 recruiting ‘Genomic Medicine Centres’ between 2013–2018 (see Figure 1). Every location contributed samples without exception, thus we have as representative of England geographically as you can get.”

Figure 1: The distribution of the NHS Genomic Medicine Centres across England. Credit: Serena Nik-Zainal.

“Now is that a representation of the population demographic?” Nik-Zainal asks. Possibly not, she says. This is because there are different population densities and ethnic distributions in different areas. It’s difficult to determine whether the cancers analyzed in this study absolutely capture the true diversity of England’s population.

Nonetheless, Nik-Zainal adds, “Other ethnicities were represented. All the data were studied in aggregate and, as far as we are aware, mutational signatures do not show strong differences between ethnicities. Thus, we don’t think it affects the results.”

“Although for sure, more should be done to try to ensure that we have as broad and true a representation of the population as possible,” she concludes.

Professor Serena Nik-Zainal was speaking to Molly Campbell, Senior Science Writer for Technology Networks.

Reference: Degasperi A, Zou X, Amarante D, et al. Substitution mutational signatures in whole-genome-sequenced cancers of the UK national health service. Science. 2022. doi: 10.1126/science.abl9283.