

Using Epidemiology in Virology: Experiences From COVID-19

This article includes research findings that are yet to be peer-reviewed. Results are therefore regarded as preliminary and should be interpreted as such. For further information, please contact the cited source.

Over the past two years, researchers working in virology have been trying to keep one step ahead of the pathogen behind the COVID-19 pandemic, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To do so, they are combining data from classical epidemiology – i.e., studying movement of the virus through populations – with genetic epidemiology, which tracks the evolution of new variants. We spoke to two researchers about how they use insights from epidemiology to answer key questions about existing and emerging viruses.

Studying the origins of emerging viruses

“Whenever a new virus emerges, there is a checklist of questions that you need to answer,” says Marion Koopmans, head of the Erasmus MC Department of Viroscience. “You need to know what the virus is, and does it cause disease? Is the disease severe? How widespread is infection – is it found in humans and/or animals, and in what locations?” To address this, you need a well-designed epidemiological study. This means combining virology lab tools and serology with epidemiological information from patient samples and cohorts, community surveys and/or animal population surveys. “Then you might add in genetic sequencing to really help you reconstruct the events that led to the outbreak: how did this virus spread in the community? Is there a link with animals or not? Has there been a single introduction of the virus or multiple points of introduction?”

The importance of combining genetic and epidemiological information was highlighted by a study from Professor Koopmans looking at the role of farmed mink in SARS-CoV-2 transmission in the Netherlands.1 “When the pandemic started, quite early on there was mention of infection in a dog and in some cats, and so a group of us got together and asked, what other animals should we be looking out for? Because if you have a totally new disease, you really don't know where it will spread,” explains Koopmans. From this, they shortlisted a number of animal species based on what was currently known about the receptor the virus uses to infect cells. “Mink were on that list because they are related to ferrets, which are used as a model for studies of viral activity in the laboratory.”

They began screening mink through an agricultural health monitoring program in the Netherlands, and the first evidence of SARS-CoV-2 infection in the animals was soon reported. “This triggered a cascade of epidemiological studies to determine how widespread the infection was. Once we sequenced the genome of the mink viruses, we found they were directly related to the viruses circulating in humans. We then went back through the municipal health service to try and find out whether people on the farm had been sick and whether they had been testing positive for SARS-CoV-2.” They found evidence of cases among people on the farms weeks before the first reported cases in mink, but it was genetic sequencing that showed when the virus was introduced. “This told us that circulation on the farms had been ongoing for a while, because there was considerable genetic diversity in the circulating viruses.”
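The reasoning in that last quote, that more genetic diversity among sampled viruses suggests longer circulation, can be illustrated with a rough sketch. The example below computes the mean number of pairwise differences between aligned sequences. It is an illustration only, not the method used in the mink study: the function names and the short toy sequences are hypothetical, and real analyses work on full genome alignments with phylogenetic models.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_differences(seqs: list[str]) -> float:
    """Average Hamming distance over all pairs of sequences (0.0 if fewer than two)."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


# Toy, made-up sequences (not real viral data).
recent_introduction = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
longer_circulation = ["ACGTACGTAC", "ACTTACGTAT", "GCGTACCTAC"]

print(mean_pairwise_differences(recent_introduction))
print(mean_pairwise_differences(longer_circulation))
```

In this toy case the second set of sequences differs more from one another on average, which is the kind of signal that would point towards a longer period of undetected spread.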

Affordable access to genomics and different types of sequencers makes a big difference to this work, as do emerging models and tools for understanding host responses to a novel virus. “Of course, you have animal models, but there's increasing use of organoids to study some essential properties. Our lab and others are now developing organoids using key tissues from animal species to screen for possible virus susceptibility or other traits rapidly. One that I also personally favor is multiplex antigen serology for profiling the host response to an incoming virus across species, for which there are different types of platform in development. Alternative detection technologies are also available, such as metagenomic sequencing and virome sequencing to characterize whatever is in the sample. But you need to be able to decide which data are relevant for your question. So the demand is increasing for bioinformaticians who understand how to collaborate with clinicians and epidemiologists. We need hybrids of these skills.”

One of the challenges with virus epidemiology, Koopmans explains, is that it is still mostly reactive. “I really think we need to get smarter there. We know we have many different types of pathogen in circulation, so how can we pick up potential problems earlier than we do now? One of the questions I’m interested in is whether we can build alert systems based on available virus surveillance data, so if we see a high density of animal populations with a virus in certain regions, we could go in with our metagenomics toolbox and figure out what’s going on. I see it as smart surveillance.”

Using epidemiology to understand virus mechanisms

Gary Whittaker, professor of virology at Cornell University, is trying to incorporate epidemiology into mechanistic studies of viral function. His research has focused on SARS, Middle East respiratory syndrome (MERS) and coronaviruses in cats, but recently he has turned his attention to tracking surveillance and virus evolution data on SARS-CoV-2 and using these to predict the mechanistic impact of new mutations.

His lab is particularly interested in the importance of the furin protease cleavage site, a distinguishing feature of the SARS-CoV-2 genome that is thought to be important in fusion of the virus spike glycoprotein with the host cell membrane.2 In a recent paper, Whittaker reports that the cleavage site mutation that gave rise to the Alpha variant in late 2020 (which causes the amino acid substitution P681H) is very similar to the one that created Alpha’s younger sibling Omicron, but with very different results in terms of disease severity.3
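For readers unfamiliar with the notation, a label such as P681H gives the original amino acid (P, proline), its position in the protein (681) and the replacement (H, histidine). The sketch below parses such a label and applies it to a short fragment. It is a minimal illustration under stated assumptions: the helper names are hypothetical, and the five-residue fragment "PRRAR" starting at position 681 is used here only as a toy stand-in for the cleavage site region, not as a full spike sequence.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str       # original amino acid (one-letter code)
    position: int  # 1-based position in the protein
    alt: str       # replacement amino acid (one-letter code)


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'P681H' into (ref, position, alt)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(fragment: str, first_position: int, sub: Substitution) -> str:
    """Apply a substitution to a fragment whose first residue sits at first_position."""
    index = sub.position - first_position
    if not 0 <= index < len(fragment):
        raise ValueError("position lies outside the fragment")
    if fragment[index] != sub.ref:
        raise ValueError("fragment does not carry the expected original residue")
    return fragment[:index] + sub.alt + fragment[index + 1:]


# Toy fragment; treating it as positions 681-685 is an assumption for illustration.
fragment = "PRRAR"
print(apply_substitution(fragment, 681, parse_substitution("P681H")))
print(apply_substitution(fragment, 681, parse_substitution("P681R")))
```

Checking the original residue before replacing it guards against applying a label to the wrong position or the wrong sequence.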

“One of the challenges is keeping up with the pace of mutation, as new data on emerging variants is not only published or shared on preprint servers, but is increasingly being posted in its raw form on social media,” says Whittaker. “As a result, we are kind of drowning in data and finding it hard to interpret it quickly, given that wet lab science is slow, time-consuming and expensive. Everybody talks about rigor and reproducibility, but everything's being pushed out at incredible speed.”

By the time Whittaker and colleagues had their results on Alpha, it was all about the new Delta variant, and then suddenly Omicron came along. “The same point mutation we were studying in Alpha had now reappeared in Omicron.”

“This is not an unusual situation. With so many different variants emerging in a global pandemic, it can be hard to predict which ones are going to be functionally important,” Whittaker says.

“What people don't quite realize is there are two very distinct lineages of SARS-CoV-2 – lineages A and B. The initial Wuhan outbreak arose from lineage B, and although both lineages emerged almost at the same time, lineage B received all the attention because it really took hold in Wuhan and drove the pandemic into Italy and the rest of Europe. However, the A lineage never disappeared. It was appearing in pockets here and there, including in Uganda.”

One Saturday in late February 2021, Whittaker came back from his morning run, checked his phone and spotted that a colleague had tweeted about a new variant in Uganda with the P681R amino acid substitution that their studies predicted would be functional. They used the genetic sequence to create synthetic genes for the variant and then went back to the Ugandan lab to collaborate with them on some mechanistic experiments. “We found that if we took that genetic mutation that caused the P681R substitution and transplanted it into the original Wuhan 1 variant, then it lost those functional changes. The new mutation only worked in the right mutational background context. The virus has to kind of figure out the right evolutionary path for that mutation to really take off.”

This latest research, now published as a preprint,4* shows that the P681R variant appeared in Uganda around six months ahead of the Delta variant (which arose from lineage B) emerging in other countries. The key mutation that enabled Delta to behave as it did – increasing transmission and causing more severe disease – had appeared six months earlier in Uganda, but in lineage A. Although the lineage A variant took over quickly within Uganda, it didn't have the same global impact.

“It’s a very intriguing question: why did this key mutation cause a local, not global, virus outbreak in Uganda whereas when the same mutation occurred in Delta, it became global,” says Whittaker. “Either the virus was intrinsically different, or there were different environmental factors involved, such as public health interventions (e.g., type and degree of mask wearing), what the public transportation systems were like, what flights were canceled and whether hospitals or healthcare structure played a role, especially if you've got asymptomatic people moving between locations.”

What this shows is that predicting SARS-CoV-2’s next move is challenging, even for the most experienced virus surveillance teams. “There's a lot of talk about SARS-CoV-2 now becoming an endemic virus, but the hallmark of an endemic virus such as influenza is that, when you look at a phylogenetic tree, you’ll see they evolve in a stepwise manner. Conversely, SARS-CoV-2 is still sampling its mutational space and is making these evolutionary bursts and then retreating and trying an alternative route. That sort of forward momentum isn't quite at the stage where it's going to become endemic. It's clearly not firmly established in its host; it's taking its time to adapt.”

The question is, will we end up with one virus or multiple viruses? Whittaker believes that Delta may have gone away for the time being, but not completely, so we may end up with separate Delta- and Omicron-like lineages that continue forward, much like we see with influenza A and B.

One thing is certain, though: if we are looking for a positive from the COVID-19 pandemic, it has demonstrated the importance of virus surveillance. “People were doing very modest, small-scale coronavirus surveillance but now that's completely opened up,” Whittaker notes. “Our ability to understand viral evolution is really coming from people who have spent decades studying HIV. Now that information has been translated into this field, that’s definitely a good thing.”


1. Lu L, Sikkema RS, Velkers FC, et al. Adaptation, spread and transmission of SARS-CoV-2 in farmed minks and associated humans in the Netherlands. Nat Commun. 2021;12(1):6802. doi:10.1038/s41467-021-27096-9

2. Peacock TP, Goldhill DH, Zhou J, et al. The furin cleavage site in the SARS-CoV-2 spike protein is required for transmission in ferrets. Nat Microbiol. 2021;6(7):899-909. doi:10.1038/s41564-021-00908-w

3. Lubinski B, Fernandes MHV, Frazier L, et al. Functional evaluation of the P681H mutation on the proteolytic activation of the SARS-CoV-2 variant B.1.1.7 (Alpha) spike. iScience. 2022;25(1):103589. doi:10.1016/j.isci.2021.103589

4. Lubinski B, Frazier LE, Phan MVT, et al. Spike protein cleavage-activation mediated by the SARS-CoV-2 P681R mutation: a case-study from its first appearance in variant of interest (VOI) A.23.1 identified in Uganda. bioRxiv. 2021.06.30.450632. doi:10.1101/2021.06.30.450632 *Preprint.