

How an Unexpected Discovery in a University Pond Changed the DNA Rulebook

A body of water.
Credit: Adora Goodenough / Unsplash.


Accidental discoveries have led to numerous breakthroughs and innovations throughout history; the antibiotic penicillin, the microwave and even the drug Viagra are all products of chance discoveries. 

Serendipity in science often challenges laws or rules that we have previously regarded as established and absolute. It serves as a tool for advancing knowledge and a sweet reminder that, sometimes, it is wise to be sure of nothing.

A recent discovery, published in PLoS Genetics, challenges the “rulebook” of DNA.

At the Earlham Institute, Dr. Jamie McGowan, a postdoctoral scientist, was working with researchers from Professor Thomas Richards’ group at the University of Oxford to test a novel DNA sequencing pipeline. Their aim was to sequence small amounts of DNA from single cells, obtained from organisms that are tricky to grow in a lab. But when studying the genome sequence of a microscopic organism – a protist – obtained from a freshwater pond at the University of Oxford, McGowan and colleagues discovered a new species. This species had a novel variant in its genetic code.

The genetic code is a prime example of a highly conserved element in biology. In most species, the genetic code comprises 64 codons, three of which are known as “stop codons” – meaning they mark the end of a gene. In the protist analyzed by McGowan and colleagues, two of the stop codons had taken on new meanings. This is highly unusual and defies the conventional “rules” of gene translation as previously understood.

In an interview with Technology Networks, McGowan discusses how the team’s original study plan evolved, how it feels to stumble upon a novel discovery and how the team’s findings could be utilized in synthetic biology.


Molly Campbell (MC): Can you describe your original plans for this study, and how it then felt to land upon an unexpected discovery?

Jamie McGowan (JM): This research is part of the Darwin Tree of Life (DToL) Project, an exciting collaborative project aiming to sequence the genomes of all eukaryotic species in Britain and Ireland – that’s all animals, fungi, plants and single-celled microbial eukaryotes known as protists. This particular part of the DToL project is a collaboration between the University of Oxford and the Earlham Institute, where we are working on developing approaches to sequence protist genomes. Protists are particularly challenging to sequence as most species can’t be easily grown in the lab, so we need techniques that enable us to sequence the tiny amounts of DNA present in single cells isolated from the environment. For this work, we are taking advantage of the Earlham Institute’s expertise and cutting-edge instrumentation for single-cell genomics and adapting techniques that are typically used for biomedical research and applying them to study microbial diversity.

It was really exciting that one of the species we sequenced turned out to be a novel species with such unusual biology. It’s always surprising when exceptions are found for some of the “rules” we take for granted in genetics, in this case, the genetic code. I think our study highlights just how little we know about the genetics and genomics of protists and how we are surrounded by novel biology in the microbial world that tends to be ignored.

MC: For readers who haven’t yet encountered your paper, can you describe your key findings?

JM: The genetic code is the set of rules that dictates how genes are translated into amino acid sequences to build proteins, the molecular machinery in our cells. The genetic code is one of the most conserved features in biology. Most species, whether they are humans, animals, plants or microbes, use the “universal genetic code”. This genetic code is made up of 64 codons (groups of three nucleotides). Three codons (TAA, TAG and TGA) are used as “stop codons” which signal the end of a gene. The remaining 61 codons (“sense codons”) are used to specify an amino acid to be inserted into a protein sequence, including the “start codon” (ATG) that signals the beginning of a gene’s sequence. The genetic code is described as being “redundant” as there are 61 sense codons but only 20 standard amino acids, meaning that multiple codons can specify the same amino acid.

Very few species use a different genetic code or set of rules for protein translation. In our study we discovered a new species of ciliate (a hairy single-celled microbe) that uses a novel variant of the genetic code. In this new variant, two of the stop codons have evolved to have different meanings. The TAA codon was reassigned to specify the amino acid lysine, while the TAG codon was reassigned to specify the amino acid glutamic acid. So, this species only uses one codon – TGA – as a stop codon, instead of the usual three.

Variants of the genetic code have been discovered before, but this is a particularly unusual variant as the codons TAA and TAG were thought to be evolutionarily coupled, as they almost always have the same meaning – either they are both used as a stop codon, or they are both translated to the same amino acid. But here we discovered that they specify two different amino acids, which was a novel finding.
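To make the reassignment described above concrete, here is a minimal Python sketch that translates the same DNA sequence under the standard genetic code and under a variant in which TAA specifies lysine (K), TAG specifies glutamic acid (E) and only TGA acts as a stop codon. The sequence `toy_gene` and the helper `translate` are made up for illustration; they are not taken from the study, and the sketch ignores real-world details such as reading-frame detection.

```python
from itertools import product

# Standard genetic code, with codons listed in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Variant described in the interview: TAA -> lysine, TAG -> glutamic acid.
# TGA remains the only stop codon.
VARIANT_CODE = dict(STANDARD_CODE)
VARIANT_CODE["TAA"] = "K"
VARIANT_CODE["TAG"] = "E"


def translate(dna, code):
    """Translate dna from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = code[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequence (not from the paper): ATG GCT TAA GGT TAG CAT TGA
toy_gene = "ATGGCTTAAGGTTAGCATTGA"

print(translate(toy_gene, STANDARD_CODE))  # MA
print(translate(toy_gene, VARIANT_CODE))   # MAKGEH
```

Under the standard code, translation halts at the first TAA, whereas under the variant code the ribosome would read through both TAA and TAG and stop only at TGA.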


MC: Why did you decide to analyze the genome of a protist?

JM: Protists tend to be overlooked when it comes to genome sequencing projects, so we are really interested in increasing the number of protist genomes that are available. Of the relatively modest number of protist species that have had their genome sequenced, most are pathogenic or parasitic protists. Sequencing the genomes of disease-causing protists is critical for identifying drug targets and developing diagnostic techniques, but it is also essential that we sequence the genomes of non-pathogenic or free-living protists, which can reveal important insights into how all life on the planet has evolved.

Protists tend to have a bad name as they include some of the most devastating pathogens of humans (e.g., the malaria parasite Plasmodium) and plants (e.g., the potato blight pathogen Phytophthora infestans). But they aren’t all bad. In fact, most of life on Earth depends on protists to survive.

Many protists function as primary producers, such as the photosynthetic protists (e.g., diatoms and green algae) that produce a huge chunk of the Earth’s oxygen supply. Protists also occupy important positions in the food chain: some eat bacteria and are, in turn, eaten by larger organisms. They also play a central role in the environment’s nutrient cycle, decomposing organic matter such as dead plants and recycling nutrients back into the environment. Protists are also incredibly diverse. When most people think of biodiversity, they tend to think of animals and plants. But protists make up the vast majority of eukaryotic biodiversity. By sequencing protist genomes, we hope we can better understand biodiversity and evolution.

Our collaborators at the University of Oxford have been refining approaches to specifically isolate single cells of interest from complex environmental samples, in this case, water collected from a freshwater pond in Oxford. Using fluorescence-activated cell sorting (FACS), it is possible to isolate single cells with specific properties that we are interested in, for example cells of a certain size or shape, or cells with specific features such as flagella or the ability to photosynthesize.

The species we sequenced for this study was a ciliate, a single-celled organism covered in cilia (hair-like structures) that swims very fast and eats bacteria for food. Ciliates are found pretty much everywhere – in bodies of water, in the soil, or even living inside other organisms such as cows and sheep.

MC: Ciliates are a hotspot for genetic code changes. Why?

JM: This is something that is not very well understood. Genetic code changes are pretty rare across the tree of life but seem to have happened more than might be expected in ciliates.

There are a few hypotheses for why the meaning of a codon might change. Ciliate genomes tend to have very low GC content – that’s the proportion of the genome that is made up of guanine or cytosine (G or C) compared to adenine or thymine (A or T). For example, the human genome has an average GC content of 41% whereas in some ciliate species GC content can be as low as 16%. Having such a low GC content influences which codons are used in genes and can lead to certain codons being avoided. When a codon isn’t used frequently, it may provide an opportunity for its meaning to change to another amino acid, or in this case, from “stop” to an amino acid. However, there are many examples of species with very low GC content whose genetic code hasn’t changed.
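As a rough illustration of the GC content measure described above, the following Python sketch computes the fraction of G and C bases in a DNA sequence. The function name `gc_content` and the example sequences are invented for this sketch and are not from the study; real genome analyses would work on full assemblies rather than short strings.

```python
def gc_content(sequence):
    """Return the fraction of bases in sequence that are G or C.

    Only A, C, G and T are counted, so ambiguous bases such as N
    are ignored in both the numerator and the denominator.
    """
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    total = gc + sequence.count("A") + sequence.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / total


# Made-up sequences for illustration only.
balanced = "ATGCGCAT"    # 4 of 8 bases are G or C
at_rich = "AAAATTTTGC"   # 2 of 10 bases are G or C

print(f"{gc_content(balanced):.0%}")  # 50%
print(f"{gc_content(at_rich):.0%}")   # 20%
```

In an AT-rich genome like the second example, codons containing many G or C bases would be expected to appear less often, which is the kind of codon usage bias the hypothesis relies on.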

Using a non-standard genetic code might even be beneficial. Consider a viral infection. If a virus tries to infect a cell and that host cell uses a different genetic code to the virus, the virus won’t be able to efficiently hijack the host cell’s translational machinery to translate its viral proteins, which could provide a robust form of protection. More research is needed to better understand why ciliates are such a hotspot for deviations of the genetic code.

MC: Since the publication of your paper, have you been contacted regarding any other examples of genetic sequences in organisms where the stop codons are linked to two amino acids?

JM: No, not yet. It will be exciting to find out where else this new ciliate species or any closely related species are found across the world. Or if there are any other undiscovered genetic code variants lurking in nature.

MC: What are the implications of your research findings, and what are your next steps in this research space?

JM: Our findings strengthen the idea that the evolution of the genetic code is more flexible than was once thought. They extend our knowledge of genetic code reassignments.

Fascinatingly, phylogenetic analyses suggest that several ciliate lineages have independently evolved the same variant genetic code multiple times. This further supports the idea that ciliates appear to be more prone to genetic code changes. From sequencing the DNA and mRNA of this ciliate species, we have assembled its genome sequence and the catalogue of genes that it encodes. This has given us detailed insights into how this species has evolved and how it is related to other species. But future lab work will be required to better understand the molecular mechanisms surrounding this and other genetic code change events.

Findings like these can have important implications in the fields of biotechnology and synthetic biology. Understanding how genetic code changes like this occur in nature might inspire biotech solutions, for example, engineering organisms with altered genetic codes to prevent viral infections.

Dr. Jamie McGowan was speaking to Molly Campbell, Senior Science Writer for Technology Networks.