

Probing the Biomarker Landscape of Human Disease

Futuristic representation of human genomics.
Credit: iStock

Biomarkers can provide invaluable insight into our understanding of human disease, and may be leveraged for both early disease diagnostics and precision medicine approaches. Biomarker discovery, however, can be a challenging and lengthy process, with limited clinical translation. Traditionally, biomarker discovery has focused largely on genetic markers using technologies such as next-generation sequencing (NGS). Advances in discovery technologies that look beyond the genome could help to identify biomarkers that enable a more complete understanding of human disease and lead to the development and selection of optimized treatments.


To find out more about the importance of improving biomarker discovery and how next-generation mass spectrometry (MS)-based systems can help to fill the current gap in biomarker technologies, Technology Networks spoke with Dr. Mo Jain, founder and CEO of biomarker discovery lab Sapient Bioanalytics.



Anna MacDonald (AM): Why is biomarker discovery and validation such an important area?

Dr. Mo Jain (MJ): The cost of bringing drugs to market continues to rise, while the number of new molecular entities that are approved each year continues to drop – despite significant advances in technology and overall computing power. The reality today is that only 1 in 10 drug candidates that enter clinical studies will ever make it to a patient. Most concerning are the high failure rates plaguing later-phase clinical trials, at which point significant time and cost investments have already been made. Seventy percent of drugs entering Phase II and 50% entering Phase III will fail, and a majority of those failures stem from lack of efficacy in the population in which the drug was tested. Even among drugs that do make it through the development pipeline and to patients, we know that only a fraction of individuals experience a positive benefit from the drug’s intended action.

These startling statistics are reflective of the fact that there is often a great deal of heterogeneity within patient populations as well as in human disease. Individuals with the same clinical diagnosis and even the same disease pathology can have very different paths that led them from a normal to a disease state, and as such, respond very differently to any given treatment. Biomarkers allow us to overcome this variability issue, and essentially allow the scientific community to develop and deploy effective drugs faster, more efficiently, and at a lower cost. Biomarkers enhance our understanding of disease by providing readouts of host and disease factors that influence biological processes, disease progression and drug response, enabling us to align a given patient with their specific disease process, and ultimately with the therapeutic they are most likely to respond to and benefit from.

Greater discovery and use of biomarkers in drug programs will transform efficiency, success rates and patient outcomes across complex disease areas. In fact, an analysis of over 20,000 clinical studies encompassing multiple therapeutic areas and clinical agents found that the unifying factor for a drug's success was whether or not it was developed with a biomarker. Drugs with an associated biomarker have a 2- to 10-fold increase in US Food and Drug Administration approval and a faster approval time. The ultimate value that biomarkers bring is greater understanding of disease to aid diagnosis, prognosis and therapeutic alignment.

AM: What has limited biomarker discovery historically? How did you set out to change this?

MJ: For all of the time, energy and expense spent on understanding human physiology and biology, we still understand an exceedingly small percentage of the total human system, whether it be 10% at best or more likely less than 1%. This means more than 90% of this complex human system is still unexplained. Probing this unknown space is not a trivial process, and historically, technological constraints have limited our ability to do so efficiently in complex systems like the human body.

This is why high-throughput discovery technologies have become essential over the last decade, in that they now allow us to broadly sample thousands to millions of data points on an individual and their disease. Thanks to NGS in cancer, for example, the classification of tumors is now based on the molecular and genetic mutations that make up the tumor, allowing us to stratify patient populations and develop drugs aimed at those specific mutations to improve response. As these technologies have emerged and allowed us to make more broad-scale measurements in humans, discovery has been greatly accelerated.

The challenge is that many of the discovery technologies developed over the last 20 years have centered largely on genetics and DNA. NGS has been transformative in cancer, but what about diseases in which sequencing may not provide the same level of insight? There are many common conditions such as heart disease, lung disease, stroke, neurodegenerative disorders, autoimmune diseases and liver disease, as well as physiologic processes such as pregnancy and biological aging, for which genetics do not represent the majority of population attributable risk, and therefore likely do not hold the key to understanding the underlying disease process and ideal treatment.

Internal and external exposures over the course of our lifespan – from where we live, to what we eat, what we smell or smoke, the microbes in our gut, to our dynamic organ physiology – have a profound influence on biological processes, disease development and progression, and treatment responsiveness. These dynamic measures are not encoded in the genome, but rather in circulating small molecule chemistry. This is where there is a massive gap in technologies that allow us to advance small molecule biomarker discovery. This is why we focused our efforts on the development of next-generation MS systems to rapidly probe the non-genetic landscape of human disease.


AM: Can you tell us more about the development of your next-generation MS systems? What challenges did you face along the way?

MJ: Much in the way parallelizing sequencing has allowed us to measure and understand genetic variation across large populations at a much faster pace, we set out to develop MS technology that could go faster, and measure more of the biology that doesn’t specifically come from the genome. There are tens of thousands of small molecules in human circulation which read out the dynamic influences of organ and cellular physiology, as well as exogenous exposures such as diet, lifestyle, physical activity, toxicants, environment, microbes and myriad other exposures on health and disease. The underlying thesis of Sapient has been that if we can measure the breadth of these small molecules, whether it be in a simple system such as cells in a dish or in a complex specimen like human blood, then we can begin to understand and uncover the non-genetic factors that give rise to disease. We can then integrate this information with genetics to build a more complete picture of human biology and better identify biomarkers for early disease detection, for understanding disease prognosis and course and, ultimately, for aligning patients with specific therapeutics.

The idea of measuring small molecules is not new, but the challenge of achieving these measurements at scale has always been a technical one. Our goal was to enable what we call “Discovery MS”, in which we can take a complex biosample and measure thousands of small molecule factors within that sample – including unknown, uncharacterized compounds – and do so across thousands of biosamples at a time. This is really the type of scale that is required to uncover robust biomarkers, understand how they behave, and ultimately leverage this information for drug development and implementation.

There were several technical questions that we had to address to reach this goal. From a hardware perspective, how do you physically develop MS systems that go this quickly and how do you classify molecules as you are measuring them? From a software perspective, how do you handle and extract the massive data troves generated by mass spectrometers operating at this scale? It took more than a decade of hardware and software development and a number of innovations to overcome these technical challenges and finally enable the speed and scale of discovery that we are able to achieve today. A key innovation occurred in the chromatographic separation with the development of what we call rapid liquid chromatography, or rLC, which is coupled to our high-resolution mass spectrometers (in a platform system termed “rLC-MS”). These technologies are what allow us to take a complex sample such as blood, urine, cerebrospinal fluid, tissue, cells, tears, etc., and separate out the thousands of chemicals that comprise that specimen. We also had to build backend software systems that would allow us to handle tens of thousands of mass spectra files at a given time and to extract meaningful data from them, and to homogenize these very complex data sets in a way that allowed us to build large databases to make very robust discoveries.

Today, our rLC-MS systems are fully operational and allow us to assay more than 11,000 small molecule biomarkers per biosample, in a high-throughput manner, with a capacity that exceeds 4,000 samples analyzed per day. At this scale, we can probe a greater depth and breadth of human biology, including the unknown spaces, to truly transform our understanding of the non-genetic basis of disease.
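
As a rough, back-of-the-envelope illustration of the scale described above, the two figures quoted in the interview can be multiplied to give a lower bound on molecule-level measurements per day. This is a sketch rather than a calculation supplied by Sapient: the function names and the 100,000-sample cohort are hypothetical, and because both quoted figures are stated as minimums ("more than", "exceeds"), the result is a floor rather than an exact value.

```python
# Figures quoted in the interview; both are stated as lower bounds.
MOLECULES_PER_SAMPLE = 11_000  # "more than 11,000 small molecule biomarkers per biosample"
SAMPLES_PER_DAY = 4_000        # "capacity that exceeds 4,000 samples analyzed per day"


def daily_measurements(molecules_per_sample: int, samples_per_day: int) -> int:
    """Return the number of molecule-level measurements produced per day."""
    return molecules_per_sample * samples_per_day


def days_to_profile(cohort_size: int, samples_per_day: int) -> int:
    """Return the whole days needed to run a cohort, rounding up.

    Hypothetical helper: assumes the full daily capacity is devoted to one cohort.
    """
    return -(-cohort_size // samples_per_day)  # ceiling division


if __name__ == "__main__":
    # Lower bound on measurements per day at the quoted capacity.
    print(daily_measurements(MOLECULES_PER_SAMPLE, SAMPLES_PER_DAY))
    # Days to run an assumed (hypothetical) 100,000-sample cohort.
    print(days_to_profile(100_000, SAMPLES_PER_DAY))
```

Under these assumptions, the quoted capacity corresponds to at least 44 million measurements per day, and a cohort of 100,000 samples would take about 25 days of instrument time.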

AM: What made you decide to make the leap to spin out the technology and form Sapient in 2021? 

MJ: This was a process that started over a decade and a half ago when I was still a student and a postdoc in the Boston area and next-generation sequencing was just coming to fruition. I posed this question to a number of my colleagues: if we sequence every single person in the world, how much of disease can we understand? It turns out that even if we sequence everyone, only 15 to 20% of disease risk will be explained. After hearing that, I began working on the concept of innovating mass spectrometers to help capture the 80% of non-genetic information that remains unexplored.

Initially we developed the prototypes for high-throughput mass spectrometry in academia, at the Jain Laboratory at the University of California, San Diego, and a number of individuals from industry, government, and non-profit academic organizations began to contact us to access these tools. The need was clear and we felt very strongly that we had to democratize access to these technologies because of their potential to truly transform drug development and implementation. This underscored the ultimate decision by my Jain Lab colleagues and me, now co-founders of Sapient, to spin out the company in 2021.

Spinning out Sapient allowed us to gain the resources required to build the next iteration of high-throughput mass spectrometers, essentially developing enterprise-grade bioanalytical platforms. We have subsequently grown to be an organization made up of individuals with very deep expertise that spans analytical chemistry and engineering, bioanalytics and regulatory processes, chemistry, math, statistics, computer engineering and computer software development, as well as human biomedicine. Today we are able to offer our platform commercially in support of industry sponsors, including many large pharma companies, that are developing the next line of therapies.

AM: How has the platform evolved since then? Are you able to share how this may continue in the next few years/future plans for Sapient?

MJ: While we’re still early in our overall evolution, I feel strongly that what we are doing now – providing discovery services to biopharma partners in support of their drug development programs – is where we can make the greatest impact. Our systems, software and pipeline are robust and will only continue to evolve and strengthen as we continue to analyze more samples. At the same time, there is quite a bit of research and development occurring at Sapient internally. We are generating our own databases across hundreds of thousands of biological samples that are being collected from around the world, measuring thousands of molecules per sample, and ultimately integrating this very rich data together with genetics information, longitudinal clinical data and demographic data in what we call our Human Biology Database. As Sapient and our data assets have continued to grow, we have reached a point where these data assets allow for de novo mining and discovery to innovate in and develop early diagnostics across many, many disease areas, not only as part of our own internal efforts but also in support of our biopharma partners.

AM: Can you give us some examples of how Sapient’s technology is being used in research studies? What difference has it made to researchers?


MJ: Fundamentally our technologies can be applied in many different ways to answer many different types of questions, which boil down to finding insights that drive a better understanding of patients, of disease states, and ultimately, of drug therapeutics.

Because of this, the applicability of our tools spans the entire drug development spectrum all the way from very early discovery, profiling cells and media, to preclinical systems, through clinical implementation in Phase I, II, III and even Phase IV programs. We can discover biomarkers that relate to companion diagnostics and tell us whether an individual is going to respond to a particular drug, and whether they are going to have an adverse event in response to that drug. We can also work to find biomarkers that identify new targets for therapeutic intervention, and validate specific targets in preclinical models. And then ultimately, we can identify biomarkers that denote target engagement, that allow us to understand how a drug may actually be working in vivo in a human being, to improve understanding of pharmacodynamic responses.

Again, the fundamental goal of the technology is to find and use new biomarkers to align our understanding of the individual patient and their specific disease process, and to identify the best drug for their disease.


Dr. Mo Jain was speaking to Anna MacDonald, Interim Managing Editor for Technology Networks.