How AI Can Speed Up High Content Screening Analysis
Blog Nov 14, 2018 | by Ruairi J Mackenzie, Science Writer for Technology Networks
Genedata's Imagence workflow aims to speed up HCS analysis. Source: Genedata Screener
What molecules affect our cells and how do they affect them? This is a question of central importance across biomedical science, particularly in systems biology and drug discovery. For researchers in these fields, high content screening (HCS) has become a staple method. Imaging tools measure the fluorescent readout from cells exposed to particular molecules or agents to determine any phenotypic effects on the cells. Multiple cell processes are assessed in an HCS assay, and the biological effects of thousands of agents can be determined in a high-throughput fashion.
The huge volume of data that HCS produces is both its strength and weakness; it rapidly gives us a huge amount of information on our compounds of interest, but analyzing that data becomes quite a painful and tiring process. Basel-based Genedata believes that traditional computer analysis techniques are out of date, and suggests that deep learning-based approaches, like those used by their Genedata Imagence workflow, are the way forward. I spoke to Genedata's Head of Science, Dr. Stephan Steigele, to find out more.
Ruairi Mackenzie (RM): What is involved in HCS image analysis, and what are the pitfalls that slow down the process?
Stephan Steigele (SS): Image analysis for HCS is a time-consuming and labor-intensive process that involves different levels of expertise and software solutions. It requires many manual steps, such as the selection of extracted features or correct detection of cells. This process can be highly repetitive and error-prone due to operational complexity and multiple data hand-overs. These challenges become magnified as early drug discovery increasingly relies on complex phenotypic assays as biologically relevant model systems. Traditional computer vision doesn’t scale well with the analytical complexity and the sheer amount of data these assays produce, which often require complex image analysis procedures and lengthy setup. Bottom line: traditional HCS image analysis cannot keep pace with big data, undermining novel drug discovery and increasing associated costs and resources.
RM: How can automation improve HCS image analysis?
SS: Automation is driving higher throughput and better quality, e.g. by increasing the coherence of results from repeated measurements with lower experimental errors. Automated image analysis frees up the time of domain experts and resources, enabling researchers to focus on the pharmacology and biology of research systems rather than on technical details. Genedata Imagence effectively automates HCS image analysis, enabling researchers to:
• rapidly detect and define all cellular phenotypes in an HCS - unhindered by IT questions;
• precisely quantify relevant pharmacology;
• reduce the time and costs required for phenotypic image analysis; and
• produce quality results from a new experiment in just minutes to a few hours vs. the weeks typically required by manual analysis and optimization.
RM: How can deep learning aid the analysis process and enable transfer of learnings between assays?
SS: Historically, scientists (bioinformaticians) had to ‘design’ the image analysis by handcrafting features. For example, a feature could be the size of a cell or the intensity of emitted light from a cellular compartment. With deep learning, however, this manual step is no longer required, as a convolutional network can, on its own, learn the features and their importance (e.g., is the size of a cell really important for the pharmacological effect?).
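To make the idea of a handcrafted feature concrete, here is a minimal sketch of the two examples mentioned above: the size of a cell and the intensity of its emitted light. It is an illustration only, not how Genedata Imagence or any other package computes features. The function name `handcrafted_features`, the nested-list image format and the toy pixel values are all assumptions made for this example.

```python
from statistics import mean


def handcrafted_features(image, mask):
    """Compute two hand-designed features for one segmented cell.

    image: 2D list of pixel intensities (hypothetical toy format).
    mask:  2D list of 0/1 values, where 1 marks a pixel inside the cell.
    """
    # Collect the intensities of every pixel the mask assigns to the cell.
    pixels = [
        value
        for image_row, mask_row in zip(image, mask)
        for value, inside in zip(image_row, mask_row)
        if inside
    ]
    if not pixels:
        raise ValueError("mask selects no pixels")
    return {
        "area_px": len(pixels),          # cell size, in pixels
        "mean_intensity": mean(pixels),  # average fluorescence in the cell
    }


# Toy 3x4 image with one 2x2 "cell" in the middle columns.
image = [
    [0, 10, 12, 0],
    [0, 14, 16, 0],
    [0, 0, 0, 0],
]
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

features = handcrafted_features(image, mask)
print(features)
```

In a classical workflow, someone has to decide in advance that area and mean intensity are the quantities worth measuring, and to segment the cell correctly first. That design step is what a convolutional network replaces by learning its own features from the tagged images.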
With deep learning, we can create intuitive maps that present the so-called phenotypic space to support assay biologists. They only need to visually review the images of a few hundred cells in the context of the experimental question and tag those images by their corresponding phenotype class (e.g. small vs. large cells), a process that takes just a few hours. During the hands-free training of the neural network, the network learns by itself which features to extract, and which ones are important for distinguishing between different phenotype classes. The result is a trained network that can be applied to subsequent production assay runs. The network analyzes production data en masse and generates pharmacologically relevant results.
Genedata Imagence also demonstrates that we can transfer the knowledge a biologist has generated by tagging the different images from one specific experimental data set to another data set. Typically, when a new assay is generated the image analysis will require refinements to an already existing image analysis. This process can take days to weeks. We have, however, demonstrated in a joint Genedata/AstraZeneca project that we can translate the knowledge obtained from one experimental condition to new conditions without any human intervention, in just a few hundred seconds.
RM: What are the advantages of Genedata Imagence software over other available automated analysis technologies?
SS: To the best of my knowledge, software packages such as Columbus rely on classical computer vision and manually ‘handcrafting’ features. The advantage of Genedata Imagence over these packages is that Imagence is the industry’s first commercial and instrument-agnostic software that automates the analysis of HCS images without human intervention and image analysis expertise. As noted, our deep learning-based solution accelerates the time-consuming and complex process of analyzing phenotypic high-content images. Plus, it delivers reproducible, unbiased results proven to be equal to or better than results from more traditional analysis approaches. Imagence does not require technical expertise beyond that of an assay biologist to curate a few hundred images, a process which is extremely efficient due to the guidance from our deep learning-generated phenotype maps (referred to as similarity maps in Genedata Imagence).
Also, classical non-deep learning-based approaches require several stakeholders, which creates a multidisciplinary task that involves coordinating many different roles. This approach slows down cycle times and hinders both the advancement of drug discovery and the roll-out of research activities to a broader set of scientists. With Genedata Imagence, we enable more groups to perform high-content screening while increasing the number of high-quality result parameters, which drives more efficient drug discovery processes and improved therapies and healthcare.
Dr. Stephan Steigele was speaking to Ruairi J Mackenzie, Science Writer for Technology Networks.