Y Chromosome’s Complete Sequence Published

A picture of an X and Y chromosome.
Credit: iStock.

An international research team of over 100 scientists has determined the full genetic sequence of the Y chromosome. The milestone study bolsters our understanding of human genetics and further improves the human reference genome that was first constructed in 2003.

An incomplete picture of the human genome

In 2022, the Telomere-to-Telomere (T2T) consortium published the first complete sequence of a human genome. By sequencing DNA “dark matter”, it filled the gaps that remained in the human reference genome first produced by the Human Genome Project (HGP) in 2003.

These sequences of DNA, sometimes nicknamed “junk” DNA, are regions that do not encode proteins. “The HGP mapped about 92% of the human genome sequence. The remaining sequences were complex in nature and required advances in technology that were not available at the time,” explains Evan Eichler, professor of genome sciences at the University of Washington School of Medicine and investigator at the Howard Hughes Medical Institute.

The creation of the “gapless” human reference genome was a tremendous feat, but the genome that was sequenced carried two X chromosomes and no Y chromosome. It therefore wasn’t technically “gapless”: the Y chromosome was still missing.

The Y chromosome contains many repetitive sequences of DNA, a challenge for even the most sophisticated of next-generation sequencing technologies, explains Dylan Taylor, a geneticist and doctoral candidate at Johns Hopkins: “While it would be terrific if we could sequence an entire chromosome all at once and without any errors, our current genome sequencing technology does not allow that. Instead, we can only sequence short fragments of DNA (called reads) at a time. Afterwards, we have to figure out how those pieces fit together using overlaps between them, almost like doing a jigsaw puzzle.”
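To make the jigsaw analogy concrete, here is a minimal Python sketch of overlap-based assembly. It is a toy illustration only: the reads are made up, the function names `overlap` and `greedy_assemble` are hypothetical, and the greedy merging strategy is a textbook simplification rather than the method used in the study. Real assemblers also have to cope with sequencing errors, both DNA strands and far larger inputs.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the two reads that share the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left, so the remaining pieces cannot be joined
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return max(reads, key=len)


# Three made-up reads, given out of order, that overlap by four bases each.
toy_reads = ["CTGAAGT", "ATGCGTAC", "GTACCTGA"]
print(greedy_assemble(toy_reads))  # prints ATGCGTACCTGAAGT
```

In this toy case every overlap is unique, so the pieces can only fit together one way.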

Why is the Y chromosome sometimes called the “male” chromosome?

The X and Y chromosomes contain genetic information that determines a person’s sex, in addition to other biological traits. Individuals with two X chromosomes (XX) typically develop female reproductive systems, while individuals with one X and one Y chromosome (XY) typically develop male reproductive systems. For this reason, the Y chromosome is often referred to as the “male” chromosome. It’s important to note that genes relating to human sex characteristics also exist in other areas of the human genome, not just on the X and Y chromosomes.

Fitting these pieces together is no easy feat when long stretches of repetitive DNA are present. “Like the hardest part of a jigsaw puzzle, such as blue sky or green fields where the pieces are very similar to each other, it is also very hard to assemble parts of a chromosome that are very similar to each other, and the Y chromosome is filled with these types of sequences,” Taylor adds. It took the emergence of long-read sequencing technologies, equipped for studying complex genomic regions, for scientists to overcome these hurdles. The T2T consortium has now published a complete 62,460,029-base-pair sequence of a human Y chromosome in Nature, in a study that drew on the insight of over 100 researchers across numerous academic institutes, including Taylor, who is a co-lead author.
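The “blue sky” problem, and why longer reads help, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the toy sequence, the repeat unit and the helper names `reads_of_length` and `ambiguous_fraction` are hypothetical and are not taken from the study. The sketch simply measures how many reads of a given length could have come from more than one place in the sequence.

```python
from collections import Counter


def reads_of_length(genome: str, length: int) -> list[str]:
    """Every error-free read of the given length, one per start position."""
    return [genome[i:i + length] for i in range(len(genome) - length + 1)]


def ambiguous_fraction(genome: str, length: int) -> float:
    """Fraction of reads whose sequence occurs at more than one position."""
    reads = reads_of_length(genome, length)
    counts = Counter(reads)
    return sum(1 for r in reads if counts[r] > 1) / len(reads)


# Made-up sequence: unique flanks around four copies of a 7-base repeat.
toy_genome = "ACGTTGCA" + "GATTACA" * 4 + "CCGTAAGC"

short = ambiguous_fraction(toy_genome, 5)   # reads shorter than the repeat
long_ = ambiguous_fraction(toy_genome, 30)  # reads longer than the whole repeat array
print(f"5-base reads with more than one possible position:  {short:.0%}")
print(f"30-base reads with more than one possible position: {long_:.0%}")
```

In this toy sequence, most of the short reads match several positions inside the repeat, while every read long enough to reach a unique flank can be placed in exactly one spot. That is the intuition behind long-read sequencing, although real repeats on the Y chromosome are vastly longer and real reads contain errors.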

Y chromosome’s sequence secrets unveiled

The T2T consortium discovered an additional 41 protein-coding genes in the Y chromosome, mostly from a family of genes known as Testis-specific Y-encoded protein 1, or TSPY1, known to be involved in sperm cell production. Is it fair to assume that these data could inform fertility research? Taylor thinks so: “Now that the structure of these regions is resolved, we’ll be able to better explore the impact that genetic variation in these regions might have on traits like fertility,” he says. This was a major focus of Taylor’s role in the project, where he led the analysis of how the new Y chromosome sequence improved the team’s ability to find and analyze genetic variations. “For this we looked to thousands of other genome sequences from people all around the world and found that we could do a much more complete and accurate investigation using the new genome,” he describes.

Taylor and colleagues are now exploring how these variants relate to fertility and other health traits. While the Y chromosome is heavily involved in sexual development, it also carries genes that appear to contribute to other biological processes, including the development of diseases. With a clearer picture of the blueprint that makes us human, the researchers envision that improvements can be made in future studies of human DNA, such as those centering around personalized medicine. “As with completing the rest of the human genome, one of the biggest impacts of completing the Y chromosome is that now we can actually dig deeper into those regions that were previously missing/unresolved and figure out what’s going on there,” says Taylor. “We didn’t know if there were any interesting genetic features in those regions affecting human development and disease because we couldn’t even study them. Now we have the tools to do just that.”

Creating a panel of reference human genomes

Equipped with the full Y chromosome sequence, the T2T consortium was able to re-examine research studies that had used the previous human reference genome, identifying potential errors in the data. Does that render the earlier work futile? It’s highly unlikely, Taylor emphasizes: “For the portions of the Y chromosome that were resolved in the previous reference, that reference was largely very good. For regions where we did identify errors, most previous studies would have likely avoided analyzing the regions anyway, as they were known to be potentially ‘problematic’.”

Looking to the future, Taylor hopes that scientists will one day be able to generate genome assemblies of this quality for anyone. “While I do think this research is incredibly impactful, it is just one person’s genome at the end of the day,” he says, adding that the full spectrum of genetic diversity cannot be captured by using a single reference genome. “Instead, it would be better to have a panel of reference genomes that better cover the scope of human genetic diversity,” Taylor notes.

That’s the very goal of the Human Pangenome Reference Consortium, a project funded by the National Human Genome Research Institute that aims to sequence and assemble genomes from diverse populations. “One of the biggest impacts of our own work is that we’ve laid the groundwork for how to generate more high-quality genome assemblies,” Taylor says.

For now, Taylor’s focus is on extending this work to primate species. “This allows us to accurately explore the evolutionary history of these species, and how it relates to human evolution,” he concludes.

Dylan Taylor was speaking to Molly Campbell, Senior Science Writer for Technology Networks.

Reference: Rhie A, Nurk S, Cechova M et al. The complete sequence of a human Y chromosome. Nature. 2023. doi: 10.1038/s41586-023-06425-6