Deep Phenotyping - Harnessing Data Richness for Unsupervised High-Content Analysis

High-content analysis (HCA) has recently emerged as a systematic approach to cell biology research. Because the assay is intrinsically variable and noisy, HCA has mostly been applied in supervised settings, where the phenotypes are known beforehand and manual labels guide the analysis. Furthermore, most existing studies use only a few parameters and so fail to exploit the richness of the raw images.

In this work, we develop an end-to-end computational framework for unsupervised HCA. The framework, which we call “Deep Phenotyping”, identifies unknown, subtle phenotypes from massive multi-parametric data and correlates them with gene perturbations. It consists of a series of modules: high-dimensional parameter extraction for each cell from raw images, multi-level quality control using both supervised and unsupervised methods with minimal labeling effort, a novelty detection algorithm that automatically retrieves cells that differ from the control, and an unsupervised clustering algorithm that identifies unknown phenotypes in the multi-parametric space.
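
To make the last two modules more concrete, the following is a minimal sketch of the general pattern (novelty detection against the control, then clustering of the retrieved cells). It is not the framework's actual implementation: the abstract does not name the algorithms used, so a robust z-score and plain k-means stand in here as assumptions. The feature vectors are synthetic, and every name (`robust_scale`, `novelty_scores`, `kmeans`, `THRESHOLD`) is hypothetical.

```python
import math
import random
import statistics


def robust_scale(control):
    """Per-feature median and scaled MAD, estimated from control cells only."""
    columns = list(zip(*control))
    medians = [statistics.median(col) for col in columns]
    # 1.4826 * MAD approximates the standard deviation for Gaussian data;
    # fall back to 1.0 if a feature has zero spread.
    scales = [
        1.4826 * statistics.median(abs(v - m) for v in col) or 1.0
        for col, m in zip(columns, medians)
    ]
    return medians, scales


def novelty_scores(cells, medians, scales):
    """Score each cell by its largest robust z-score across features."""
    return [
        max(abs(v - m) / s for v, m, s in zip(cell, medians, scales))
        for cell in cells
    ]


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on tuples of floats (stand-in clustering step)."""
    init_rng = random.Random(seed)
    centers = init_rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(statistics.fmean(c) for c in zip(*members))
    return labels, centers


# Synthetic two-feature "cells" (assumption: real data would have many more
# parameters per cell, extracted from the raw images).
rng = random.Random(42)


def sample(center, n, spread=0.5):
    return [tuple(rng.gauss(c, spread) for c in center) for _ in range(n)]


control = sample((0.0, 0.0), 200)
perturbed = (
    sample((0.0, 0.0), 50)    # cells that look like the control
    + sample((8.0, 0.0), 30)  # hypothetical phenotype A
    + sample((0.0, 8.0), 30)  # hypothetical phenotype B
)

medians, scales = robust_scale(control)
scores = novelty_scores(perturbed, medians, scales)

THRESHOLD = 6.0  # hypothetical cutoff, in robust standard deviations
novel = [cell for cell, s in zip(perturbed, scores) if s > THRESHOLD]

labels, centers = kmeans(novel, k=2)
print(len(novel), "novel cells;", "cluster sizes:",
      [labels.count(j) for j in range(2)])
```

Estimating the scale from the control alone means perturbed cells cannot distort the notion of "normal". In a real screen the number of clusters would not be known beforehand, so the fixed `k=2` is purely an artifact of the synthetic data.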

The framework is modular and applicable to a variety of HCA tasks. We demonstrate its use on the phenotypic analysis of the Golgi apparatus, where it identifies novel Golgi phenotypes beyond the three known ones and reveals potential protein-protein interactions that regulate Golgi functions.