A decade ago there was talk that high-throughput screening (HTS) was a contributory factor in drug discovery’s declining productivity and that it stifled creativity and innovation.1 But the last ten years have certainly disproved those claims, with HTS playing a role in the discovery of many new drugs and moving from the preserve of large pharma to a method available to smaller companies and even academic institutions. Today, combining the large data sets generated using HTS with the latest advances in artificial intelligence is providing even more grist for the drug discovery mill.
HTS uses automated robotics, microfluidics and sensitive detectors to allow researchers to quickly perform millions of assays to determine the drug-like properties of large libraries, typically of 100,000 compounds. It has been in use for almost 30 years, but in the last five years there has been a shift in who is doing it, says Professor Bill Janzen, Executive Director for Lead Discovery at Epizyme Inc, Cambridge, MA. “The biggest advance in the area that people tend to ignore is the ability to outsource high-throughput screening,” says Janzen. Contract research organizations are now carrying out large amounts of screening, he adds: “Without the need to sink a large capital investment, it gives smaller companies a lot more flexibility and in fact many large companies are taking this approach too.”
Many of the basics of HTS have not changed significantly in recent years, but Janzen points out: “It’s a steady state that’s by no means homogeneous across the industry.” A selection of testing microplates is available, with small test wells in multiples of 96, including the standard 384- and 1536-well plates, and with wells holding volumes down to 3 μl. Adoption of miniaturization beyond this is limited by the volumes necessary to mix reagents and prevent evaporation, explains David Cronk, Director of High-Throughput Screening Sciences at Charles River Laboratories in Saffron Walden, UK: “Until this can be overcome, we appear to be at our limit, and one could argue the need to progress to lower volumes or higher density is not required.”
Biochemical versus cell-based assays
The assays that are used and being developed for high-throughput screens reflect the large variety of biological targets being investigated. One main distinction is whether a screen uses a biochemical or cell-based assay; both are widely used. “If you go to any company I think you will still find there is a division somewhat close to [50%],” says Janzen. Cellular assays often require increased potency compared to biochemical assays: “You miss some of your leads, though you get better starting material,” Janzen adds. Cronk suggests that today the trend is moving away from cellular systems and towards the biochemical, but he explains: “This is probably more reflective of the interest in target classes where cellular detection systems have yet to be fully validated and adopted, rather than a conscious move.”
A large proportion of assays use fluorescence-based detection methods, which offer unrivalled sensitivity and adaptability for their cost per data point. This has prevented a shift to new assay systems that may be less prone to interference (caused by intrinsically fluorescent compounds or by light scattered from precipitated compounds). But luminescent assays have taken a share of the market. These are based on bioluminescent luciferases and can be used for diverse targets.2
“The key breakthroughs have been in the label free environment and the development of high-throughput mass spectrometry systems, and this field continues to move at pace,” says Cronk.
Speed was a problem because mass spectrometers (MS) are serial detection systems capable of analyzing only one sample at a time, and each sample undergoes micro-scale solid-phase extraction to rapidly desalt and purify it before analysis. State-of-the-art systems are now able to inject samples approximately 10 s apart,3 but adoption of the technology has been limited.
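As a rough illustration of why serial detection limits throughput, the sketch below estimates run time when samples are injected about 10 s apart, the figure quoted above. The function name is hypothetical, and the 100,000-compound library is simply the typical size mentioned earlier. Real runs would also include controls, replicates and instrument downtime, which this sketch ignores.

```python
def serial_run_time_hours(n_samples, seconds_per_sample=10.0):
    """Estimate instrument time for a detector that reads one sample at a time.

    Assumption: a fixed interval between injections and no downtime.
    """
    return n_samples * seconds_per_sample / 3600.0


# One 384-well plate at 10 s per sample: a little over an hour.
plate_hours = serial_run_time_hours(384)

# A 100,000-compound library, one well per compound: more than 11 days.
library_days = serial_run_time_hours(100_000) / 24.0

print(f"384-well plate: {plate_hours:.2f} h")
print(f"100,000 compounds: {library_days:.1f} days")
```

By contrast, a plate reader measuring fluorescence can read many wells of a plate in parallel, which is one reason serial methods have to be very fast per sample to compete.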
Chemical libraries used for screening are now better and more widely available. “You don’t have to be at a big pharmaceutical company to get a good chemical library,” says Janzen. The way they are used is also changing: “In the early days of HTS the key driver was to screen as many compounds as possible, and as a result the chemical diversity and/or quality of many screening collections was far from ideal.” The type of targets being sought has also expanded, with historically important classes such as GPCRs, ion channels and kinases becoming less relevant, and more interest in novel enzyme classes and immuno-oncology targets. “The problem is our compound libraries are largely designed to identify hits for these more classical targets,” says Cronk.
But the emphasis has now shifted from quantity to quality. “Whilst many organizations still have large compound collections (>1 million compounds) the default position is no longer to screen the entire library. Instead libraries are stratified to offer the widest representation of chemical diversity of the entire library, in a reduced set where possible, coupled with target directed components of the library,” explains Cronk. Data acquired on library compounds allows this type of stratification and it has also allowed the removal of problematic compound classes that frequently show false positives, driving up the quality of hits.
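The article does not say how such a reduced, diverse set is chosen. One commonly used approach is MaxMin diversity picking, sketched below under stated assumptions: each compound is represented by a hypothetical set of structural feature indices (a stand-in for a real chemical fingerprint), and dissimilarity is one minus the Tanimoto coefficient. The compound IDs, features and function names are invented for illustration.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two feature sets (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def maxmin_pick(fingerprints, k, seed_id):
    """Greedily pick k compounds, each as far as possible from those already picked.

    fingerprints: dict of compound ID -> frozenset of feature indices (hypothetical).
    """
    picked = [seed_id]
    # For every remaining compound, track its distance to the nearest picked compound.
    min_dist = {
        cid: 1.0 - tanimoto(fp, fingerprints[seed_id])
        for cid, fp in fingerprints.items()
        if cid != seed_id
    }
    while len(picked) < k and min_dist:
        # Sorting the IDs first makes tie-breaking deterministic.
        next_id = max(sorted(min_dist), key=min_dist.get)
        picked.append(next_id)
        del min_dist[next_id]
        for cid in min_dist:
            d = 1.0 - tanimoto(fingerprints[cid], fingerprints[next_id])
            if d < min_dist[cid]:
                min_dist[cid] = d
    return picked


# Toy library: A and B are near-duplicates, as are C and D.
toy_library = {
    "A": frozenset({1, 2, 3}),
    "B": frozenset({1, 2, 4}),
    "C": frozenset({7, 8, 9}),
    "D": frozenset({7, 8, 10}),
}
print(maxmin_pick(toy_library, k=2, seed_id="A"))
```

With two picks the subset covers both structural clusters rather than two near-duplicates, which is the point of stratifying a large collection before screening.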
Still thinking big are DNA-encoded libraries, where sequences of DNA are conjugated to compounds to serve as barcodes and, in some cases, control their chemical synthesis. In 2017, Danish biopharma company Nuevolution announced that its DNA-encoded library now includes 40 trillion unique molecules.4 The DNA encoding enables affinity selection screening against protein targets, which can be performed in one vessel without the need for biochemical assays.5 Following washing to remove weakly bound compounds, the target protein can be denatured and the DNA tags associated with the hits are amplified by PCR and sequenced for identification. But Cronk cautions: “Whether this approach will deliver on its promise to identify hits for target if enough compounds are screened remains to be seen, as it is still early days.”
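The final identification step, reading sequenced tags back to compounds, can be sketched in a few lines. This is a simplified illustration, not any vendor’s pipeline: the barcode sequences, compound IDs and the `min_reads` threshold are all hypothetical, and real analyses typically also correct for sequencing errors and compare against a control selection.

```python
from collections import Counter


def rank_hits(reads, tag_to_compound, min_reads=2):
    """Count sequenced DNA tags per compound and rank compounds by read count.

    reads: list of barcode sequences recovered after selection (hypothetical data).
    tag_to_compound: dict mapping each known barcode to a compound ID.
    Reads that match no known barcode are ignored.
    """
    counts = Counter(tag_to_compound[r] for r in reads if r in tag_to_compound)
    return [(cid, n) for cid, n in counts.most_common() if n >= min_reads]


# Toy example: three known barcodes and one unrecognized read.
barcodes = {"ACGT": "cmpd-1", "TTGA": "cmpd-2", "GGCC": "cmpd-3"}
sequenced = ["ACGT", "ACGT", "ACGT", "TTGA", "TTGA", "GGCC", "NNNN"]
print(rank_hits(sequenced, barcodes))
```

Compounds whose tags appear most often after washing are the ones most likely to have stayed bound to the target, so they are the first candidates for follow-up.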
High-throughput Screening: CRISPR-Cas9 and RNAi
HTS is also being used to probe functional genomics. RNA interference (RNAi) screens are widely used to discover the proteins involved in specific disease pathways and to identify ways to intervene. Using synthetic RNA molecules, specific genes are silenced in cells, and any resulting ‘loss-of-function’ can pinpoint the gene responsible.6 The CRISPR-Cas9 system has also been used in HTS and can provide more specific information than RNA interference. CRISPR-Cas9-based gene editing has been used to create libraries of viral CRISPR guides that target different genes, and these can be transduced into cell lines.7
CRISPR pioneer Professor Feng Zhang, of the Broad Institute of MIT and Harvard, has used the method to probe resistance to the cancer drug vemurafenib (Zelboraf®) in melanoma cells. Using a high-throughput screen of a CRISPR knockout gene library, he identified several hits that confer vemurafenib resistance.
High-throughput Screening: Artificial intelligence and machine learning
The other big advance in HTS is the ability to mine the large amounts of data generated using artificial intelligence. Cronk says the impact on HTS will be huge: “The ability to parallel process data through cloud computing will influence things more than developments in the way the data is acquired.”
“It’s [in] the last decade that [using machine learning to analyse HTS data] has taken off because of the confluence of the size of data, power of computing and then also the algorithms that have started to be developed,” says Dr Sean Ekins, CEO of Collaborations Pharmaceuticals, based in North Carolina. The company focuses on rare and neglected disease drug discovery, using publicly available HTS datasets and machine learning models to identify molecules of interest for further testing.
“The machine learning models are trying to pull out key descriptors or features [of active molecules from HTS data],” explains Ekins. “Once you have a model you can use it to screen a very large library of compounds that are virtual and see if it can find things that have those features that you have already pulled out.” Ekins has used his approach to identify lead molecules for tuberculosis therapies. He was able to virtually screen bioactivity and cytotoxicity information and produce hit rates exceeding typical HTS results.8 He has used a similar approach to identify molecules active against the Ebola virus and the parasite causing Chagas disease.
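To make the idea concrete, here is a minimal sketch of learning feature weights from screened compounds and using them to rank a virtual library. It is not Ekins’s method: it uses a simplified, naive-Bayes-style log-odds score over the features a compound contains, with add-one smoothing, and every feature name, compound ID and function name is hypothetical.

```python
import math
from collections import Counter


def train_feature_weights(screened):
    """Learn a log-odds weight per feature from (feature_set, is_active) pairs."""
    n_active = sum(1 for _, active in screened if active)
    n_inactive = len(screened) - n_active
    active_counts, inactive_counts = Counter(), Counter()
    for feats, active in screened:
        (active_counts if active else inactive_counts).update(feats)
    weights = {}
    for f in set(active_counts) | set(inactive_counts):
        # Add-one smoothing keeps the ratio finite for features seen in one class only.
        p_active = (active_counts[f] + 1) / (n_active + 2)
        p_inactive = (inactive_counts[f] + 1) / (n_inactive + 2)
        weights[f] = math.log(p_active / p_inactive)
    return weights


def score(feats, weights):
    """Sum the weights of the features present; unseen features contribute zero."""
    return sum(weights.get(f, 0.0) for f in feats)


def virtual_screen(library, weights, top_n):
    """Return the IDs of the top_n virtual compounds by model score."""
    ranked = sorted(library.items(), key=lambda kv: score(kv[1], weights), reverse=True)
    return [cid for cid, _ in ranked[:top_n]]


# Toy HTS results: feature sets with an active/inactive label (invented data).
hts_results = [
    ({"ring", "amide"}, True),
    ({"ring", "nitro"}, True),
    ({"ester"}, False),
    ({"ester", "nitro"}, False),
]
virtual_library = {"v1": {"ring", "amide"}, "v2": {"ester"}, "v3": {"nitro"}}
model = train_feature_weights(hts_results)
print(virtual_screen(virtual_library, model, top_n=2))
```

Features seen mostly in actives get positive weights, so virtual compounds carrying them rise to the top of the list and become candidates for physical testing.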
The evolution of high-throughput screening
Ekins suggests the need for conventional high-throughput screens may actually decline with new developments in machine learning and the increasing amount of data already available. But screening may start to focus on new areas, he muses: “What we might be thinking about in the future is very large numbers of combinations, three or four or more compounds.”
“When high-throughput screening first started out, there was this expectation that it was going to essentially produce drug leads de novo from the screen,” says Janzen, “…but people have realized now that it’s [just] one tool for the discovery of novel chemical templates, and by no means the only one.” It’s clear that discovering drugs needs a whole raft of tools, often used in parallel, but today, says Cronk: “The perception of the HTS department has certainly changed and there is a greater recognition of the skills that reside within those teams.”