Deep Phenotyping - Harnessing Data Richness for Unsupervised High-Content Analysis
In this work, we develop an end-to-end computational framework for unsupervised high-content analysis (HCA). The framework identifies subtle, previously unknown phenotypes from massive multi-parametric data, hence the name “Deep Phenotyping”, and correlates them with gene perturbations. It consists of a series of modules: extraction of high-dimensional parameters for each cell from raw images; multi-level quality control that combines supervised and unsupervised methods with minimal labeling effort; a novelty detection algorithm that automatically retrieves cells differing from the control; and an unsupervised clustering algorithm that identifies unknown phenotypes in the multi-parametric space.
The framework is modular and applicable to a variety of HCA tasks. We demonstrate it on the phenotypic analysis of the Golgi apparatus, where it identifies novel Golgi phenotypes beyond the three known ones and reveals potential protein-protein interactions regulating Golgi functions.
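As a rough illustration of how the novelty-detection and clustering modules described above could be chained, the following minimal sketch flags cells that lie far from the control population and then groups the flagged cells. The abstract does not name the algorithms used, so everything here is an assumption made for illustration: Mahalanobis distance stands in for the novelty detector, a plain k-means with farthest-first initialization stands in for the clustering step, and the function names (`mahalanobis_novelty`, `kmeans`), the 0.99 quantile, the number of clusters, and the synthetic feature matrices are all hypothetical.

```python
import numpy as np


def mahalanobis_novelty(control, treated, quantile=0.99):
    """Flag treated cells farther from the control population than the
    given quantile of the control cells' own Mahalanobis distances.
    (Stand-in technique; not the method of the paper.)"""
    mu = control.mean(axis=0)
    cov = np.cov(control, rowvar=False)
    # Small ridge term keeps the covariance invertible for correlated features.
    cov += 1e-6 * np.eye(cov.shape[0])
    inv = np.linalg.inv(cov)

    def dist(x):
        d = x - mu
        return np.sqrt(np.einsum("ij,jk,ik->i", d, inv, d))

    threshold = np.quantile(dist(control), quantile)
    return dist(treated) > threshold


def kmeans(x, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-first initialization.
    Returned labels are the assignments made in the final iteration."""
    centers = [x[0]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest chosen center.
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels, centers


# Synthetic per-cell feature matrices (hypothetical data, 5 features per cell).
rng = np.random.default_rng(0)
control = rng.normal(size=(500, 5))
shift_a = np.array([6.0, 0.0, 0.0, 0.0, 0.0])   # hypothetical phenotype A
shift_b = np.array([0.0, -6.0, 0.0, 0.0, 0.0])  # hypothetical phenotype B
treated = np.vstack([
    rng.normal(size=(200, 5)),            # cells that look like the control
    rng.normal(size=(50, 5)) + shift_a,   # cells shifted along feature 0
    rng.normal(size=(50, 5)) + shift_b,   # cells shifted along feature 1
])

novel = mahalanobis_novelty(control, treated)
labels, centers = kmeans(treated[novel], k=2)
print("flagged cells:", int(novel.sum()), "cluster sizes:", np.bincount(labels, minlength=2))
```

In a real analysis the number of phenotypes is not known in advance, so fixing `k` as done here is a simplification; the sketch only shows the order of operations (compare against the control first, cluster only the cells that differ).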