

Curation of the Entire Human Genome Requires the Best of Both Human and Artificial Intelligence


Read time: 3 minutes

The following article is an opinion piece written by Mark J. Kiel. The views and opinions expressed in this article are those of the author and do not necessarily reflect the official position of Technology Networks.


The recent publication of the gapless, telomere-to-telomere human genome assembly is a reminder of how much work remained between the initial draft genome and the final sequence assembly. Similarly, while we have learned much over the years about the genetic variants that cause human disease, much work remains to complete this knowledge and put it into practice.


To fully realize the promise of precision medicine, I contend that we must pre-curate every variant in the human genome. It is not scalable for analysts in clinical practice to spend as much as 90 minutes per variant to report the results of a molecular lab test. Pre-curation will make this analysis far more efficient and reproducible.


A great example of where pre-curation will be beneficial is newborn screening by next-generation sequencing. This new method of screening demands a great deal of information about a multitude of diseases to ensure the utmost accuracy of resulting diagnoses. However, real-time assessment of this information for each patient and each variant will challenge the scalability of this initiative.


Conventional methods for variant curation are laborious, slow, incomplete and error-prone, relying as they do on painstaking manual searches for evidence in the scientific and clinical literature. These manual processes too often miss key data and lead to inaccurate conclusions about the clinical significance of a patient’s variant. Collectively, all of this previous work has led to the pre-curation of just a fraction of human genetic variants. Moreover, these older pre-curations are often inadequate: new knowledge is created every day, so existing curations require continuous updates to ensure nothing is missed. If we continue to rely on outmoded techniques, it will not be possible to fully curate the human genome in our lifetimes.


Faster and more comprehensive variant curation will require a combinatorial approach that merges the scalability and sensitivity of AI with the specificity and accuracy that can only come from the expert judgment of experienced human curators. AI-driven indexing of the scientific and clinical literature can ensure more complete information for each variant, while AI-informed expert curation delivers highly specific results efficiently.


The primary bottleneck to achieving accurate and comprehensive variant curation is the need to manually locate, assess, annotate and document evidence from the scientific and clinical literature. In particular, the nuances of the genetic code and the idiosyncrasies of genetic nomenclature and other complexities of biology make it difficult to disambiguate terminology and ensure that curations are correct and complete.


AI is a natural solution to these challenges. Only computational approaches can meet the scale and sensitivity requirements of this ambitious genome-wide variant curation. There is precedent for using AI to index vast amounts of data, often unstructured and poorly organized, so it is an excellent fit for indexing the scientific and clinical literature. In doing so, it is paramount to resolve genetic ambiguities and to focus on the most critical clinical information.


A team of highly trained experts is necessary to carefully assess the assembled evidence and make an informed judgment about its appropriateness and applicability, as well as ensure the utmost accuracy of all final interpretations. This review process can be further accelerated by AI-driven organization and annotation of the data.


This approach is already making it possible to deliver new insights about disease-causing variants in patients. In one example, scientists at the Rare Genomics Institute reanalyzed previously inconclusive exome results using an AI-powered tool and found a single scientific report that allowed them to classify a key variant as pathogenic and produce an evidence-based diagnosis and effective treatment for the patient.


With the speed and scale afforded by AI technology, it will be possible within the next several years to curate the entire human genome. With a combinatorial approach that brings together the best of automated AI and expert curation, we can firmly establish the foundation of genomic intelligence needed to make precision medicine a reality for all patients.


About the author:

Mark Kiel is chief scientific officer and co-founder of Genomenon, an AI genomics company. He has extensive experience in genome sequencing and clinical data analysis. Mark is a molecular genetic pathologist who received his MD/PhD in stem cell biology and cancer genomics from the University of Michigan.