Data Visualization in Biopharma: Leveraging AI, VR and MR to Support Drug Discovery
Article Jun 12, 2019 | by Nimita Limaye, PhD
The importance of data visualization is increasingly being recognized. Once regarded as merely a tool for presenting data, it is now valued by scientists for data exploration as well.1 There has been a paradigm shift from viewing data as static plots to dynamic 3D visualizations, which enable deeper insights into the interplay of different parameters. Visualize a flock of migrating swallows flying across the sky in formation: sheer poetry in motion. It is easy to interpret that image and to figure out the direction in which the birds are headed. Now try viewing the same information as multiple data points. Then, imagine trying to derive formulae and draw meaningful conclusions from millions of data points, from different sources, flowing in continuous motion like a swarm of buzzing bees!
While real-time access to diverse sources of data has been viewed as the greatest benefit of this generation, not enough consideration has been given to the challenges faced by end users trying to wade through those innumerable and confusing data points. There is a need for new tools to make visualizing data easier, as Dr. Sangeeta Sawant, director of the Bioinformatics Centre at Savitribai Phule Pune University, suggests: “As biomedical data continue to become more and more diverse, the need and importance of tools that facilitate interactive, versatile and integrative visualization of the data is also growing. The field of drug discovery demands such visualization tools not only for the clear and comprehensive representation of data but also for exploration leading to new insights and discovery.
Especially so, in case of structural data which is at the heart of the drug discovery process involving the characterization of drug targets, potential small molecule leads as well as their interactions. The availability of advanced technologies such as immersive visualization, coupled with AI-based and deep learning algorithms for molecular modeling and simulations would definitely be a key component for drug discovery in the future”.
Limitations of Existing Software Tools
Examples of existing software tools include browser-based tools such as ProViz, which is used for in-depth viewing and manipulation of multiple sequence alignments when exploring proteins. JalView displays the alignment of both the protein sequence from the UniProt database (which lists 40 million protein amino acid sequences) and the coding sequence from the European Molecular Biology Laboratory (EMBL) database. In addition, JalView generates a 3D protein folding structure using a molecular modeling tool called Jmol. STRAP-NT is a program that enables the structural interpretation of mutations and single nucleotide polymorphisms. While these are powerful tools, they have limited accessibility, are not user-friendly, and often require deep knowledge of their functionality.
To add to the challenge of analysis, genomic and proteomic data are voluminous.2 For example, the Genomic Data Commons (GDC) is the largest cancer data repository in the world, and the National Cancer Institute (NCI) has gathered 4.1 petabytes (i.e., 4.1 million gigabytes) of data to analyze genomic signatures and drive the precision medicine approach.3
The only way to effectively analyze this kind of data is by using advanced technology.
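As a quick check on the storage figure quoted above, the conversion can be written out in a few lines of Python. The sketch assumes decimal (SI) units, which is the convention that yields 4.1 million gigabytes. The function name and the binary-unit comparison are additions for illustration and do not come from the cited source.

```python
# Decimal (SI) storage units: 1 GB = 10**9 bytes, 1 PB = 10**15 bytes.
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15
# Binary unit, shown only for comparison: 1 PiB = 2**50 bytes.
BYTES_PER_PIB = 2**50


def to_gigabytes(amount, bytes_per_unit):
    """Convert an amount expressed in some storage unit to decimal gigabytes."""
    return amount * bytes_per_unit / BYTES_PER_GB


# The article's figure, read as decimal petabytes.
gdc_gb_decimal = to_gigabytes(4.1, BYTES_PER_PB)
# The same number read as binary pebibytes, for comparison.
gdc_gb_binary = to_gigabytes(4.1, BYTES_PER_PIB)

print(f"4.1 PB  is about {gdc_gb_decimal / 1e6:.1f} million GB")
print(f"4.1 PiB is about {gdc_gb_binary / 1e6:.1f} million GB")
```

Read as decimal petabytes, the figure matches the 4.1 million gigabytes given in the text. Read as binary pebibytes, it would come to roughly 4.6 million gigabytes, so it helps to know which convention a repository uses when comparing sizes.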
AI-based Drug Discovery Solutions
Artificial intelligence (AI) is transforming data visualization strategies and is a key component of dealing with big data in the precision medicine age.4 AI allows scientists to handle phenomenal volumes of diverse data types, often flowing in real time. It also helps them focus on data interpretation, rather than analysis. Interactive data displays need to be complemented by information-rich, yet intuitive visualizations, so that “slice-and-dice” views of protein or genomic data can be obtained. These tools cater to distinct requirements of different end users, yet ensure that data traceability and data security requirements are met in the highly regulated pharmaceutical world. Most importantly, predictive analytics is enabling speedy and accurate interpretation and forecasting.
The market is becoming crowded with AI-driven data visualization applications that support drug discovery. Cloud-based cognitive platforms that use natural language processing (NLP), a branch of AI that helps computers understand, interpret and manipulate human language, are being developed to support drug target identification and drug repurposing.5 Diverse AI-based drug discovery visualization tools, such as next-generation data discovery software platforms, are based on metadata-driven AI technology. These tools are often complemented with data management task automation features to enable semantic searches and track data lineage.
Add-ons include smart self-service data visualization and exploration tools, fully interactive online and offline mobility, drag-and-drop simplicity, and interactive visualizations and dashboards.6 Other desirable features, such as advanced data exploration options and easy-to-navigate analytical tests, have been built in to reduce hit discovery data processing times. Design environments have enabled interactive visualization and embedded analytics with AI-driven dashboards, both of which make analysis easier.7,8
Multi-parameter visualization features utilize data-rich views such as virtual libraries, while integrating physicochemical properties and predictive models. These features help optimize drug design by displaying trends and parameter relationships.9 Yosipof et al. (2018) describe a new learning method called AL Boost. Their workflow combines the t-Distributed Stochastic Neighbor Embedding (t-SNE) method for visualization with different machine learning (ML) methods (such as decision trees, random forests (RF), support vector machines (SVM), artificial neural networks (ANN), k-nearest neighbors (k-NN), and logistic regression models) to analyze and categorize potential drugs or other characteristics, such as disease categories or organs of action for a molecule.10
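To make the categorization step concrete, here is a minimal Python sketch of one of the method families listed above, k-nearest neighbors. It is an illustration only and not the workflow of Yosipof et al.: the two descriptors, their values, the category labels and the `knn_predict` helper are all hypothetical, and the t-SNE visualization step is left out.

```python
from collections import Counter
from math import dist

# Hypothetical toy data: (molecular weight / 100, logP) -> category label.
# The numbers are invented for illustration and are not taken from any study.
TRAINING = [
    ((1.8, 0.5), "category_a"),
    ((2.0, 0.9), "category_a"),
    ((2.2, 0.7), "category_a"),
    ((4.5, 3.8), "category_b"),
    ((4.8, 4.1), "category_b"),
    ((5.0, 3.5), "category_b"),
]


def knn_predict(query, training, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank the labeled molecules by Euclidean distance in descriptor space.
    neighbours = sorted(training, key=lambda item: dist(item[0], query))[:k]
    # Let the k closest molecules vote on the category.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Classify two unlabeled (hypothetical) molecules.
prediction_small = knn_predict((2.1, 0.8), TRAINING)
prediction_large = knn_predict((4.7, 3.9), TRAINING)
print(prediction_small, prediction_large)
```

In practice the descriptors would number in the dozens or hundreds and would need scaling before distances are compared, which is one reason a two-dimensional embedding such as t-SNE is useful for inspecting how the categories separate.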
Transitioning into the VR-MR World of Drug Discovery
In augmented reality (AR), users engage with a real world that has digital content added to it (“Pokemon Go” and “Google Glass” may ring a bell here!). In contrast, virtual reality (VR) places the user entirely within a digitally generated virtual environment.
Mixed reality (MR) combines features of both VR and AR. Thus, while AR overlays virtual objects on the real-world environment, MR anchors the virtual objects to the real world and allows users to interact with them, using devices such as Microsoft’s HoloLens, which creates a three-dimensional image, a hologram.11,12
Studies conducted at Dartmouth College, New Hampshire, and at Cardiff University, UK, demonstrated the potential of a consumer AR device for improving the functional vision of people with near-complete vision loss.13 The Dartmouth College team, along with teams from several other institutions, won a $100,000 award from Microsoft for innovative academic applications of the HoloLens.14
How VR Is Transforming Data Visualization for Drug Discovery
The future of data visualization is about making the process more dynamic and fueling the creative instincts of scientists by allowing them to play around with the data. Leading pharmaceutical companies have used VR to visualize protein targets and small molecules, and explore interactions between them in a visual three-dimensional mode.
Go down memory lane to those carefree days when you spent hours playing video games. Now imagine yourself as a researcher, chasing molecules around instead and discovering how a drug locks on to its target. VR gives end users this level of creative freedom!
VR-based drug discovery solutions may prove useful across multiple platforms. Human genomics analysis platforms may be used to find and validate drug targets. Others can be used to map the dynamic 3D shapes of free drug molecules, providing unprecedented insights into their behavior and physical properties.15
Mixed Reality: Re-Discovering the Discovery Process
Deciphering the code: A Mixed Reality Affair! (Source: In Virtu Data Solutions)
Dr. Pavel Terentiev, Postdoctoral Fellow at WPI (Korkin Lab) and co-founder of In Virtu Data Solutions, has highlighted the significance of MR in data visualization. “The ground-breaking technology of MR provides a completely new way to represent data and setup human-computer interaction. Its greatest advantages are the ability to go beyond the size limits of conventional monitors or tablet screens and to represent data in true 3D space where users are physically situated. These benefits are playing a critical role for biologists in their search for drug targets or for analyzing clinical trials. Their expertise is crucial for successful research.
However, without intuitive, fast and transparent data infrastructure, biologists may get lost with such massive data volumes as results of transcriptomic experiments. Immersive representation allows one to utilize all physical space around researchers to layout data, to avoid losing touch with the bigger picture, when a user is focused to specific subset of data, and to prevent overlap of data points at global observation. Having data in the same space is natural for human beings to explore, and allows them to save time for exploration as well as to reduce the risk of wrong decision or missing pattern by having a clear and a transparent representation”.
Conclusion
The inadequacy of creative and interactive real-time data visualizations and the time lost from dependency on data scientists are slowing down biologists in biopharma, resulting in significant delays in bringing a drug to the market.
These needs should be addressed as a priority. Access to self-serve, user-friendly data visualizations will inject speed, cost efficiencies and creativity into the drug discovery process.
This has been emphasized by Dr. Priya Chaudhary, assistant professor at Oregon Health and Science University, who observes that “Experimental, clinical trial, cell culture, drug-target interaction, biochemical, imaging, pathology, genomic, transcriptomic, metabolomic, and proteomic data all need proper tools for mining and visualization. Data analysis and visualization have been transformed due to the availability of cloud and web-based tools. Open source data will revolutionize data visualization and accelerate drug discovery process. Machine learning and artificial intelligence will pave a smooth road for drug discovery and precision medicine”.
1. Owens J. Data Visualization Innovations in Life Sciences and Drug Discovery. 2018. Technology Networks. https://www.technologynetworks.com/informatics/articles/data-visualization-innovations-in-life-sciences-and-drug-discovery-296360
2. Jehl P, Manguy J, Shields DC, Higgins DG and Davey NE. ProViz—a web-based visualization tool to investigate the functional and evolutionary features of protein sequences. 2016. Nucleic Acids Research, 44, W11–W15. https://doi.org/10.1093/nar/gkw265
3. Editorial Team. Why big data is critical to the pharmaceutical industry. 2018. https://insidebigdata.com/2018/11/23/big-data-critical-pharmaceutical-industry/
4. Buvailo A. How Big Pharma Adopts AI to Boost Drug Discovery. 2018. BiopharmaTrend.com. https://www.biopharmatrend.com/post/34-biopharmas-hunt-for-artificial-intelligence-who-does-what/
5. Watson for Drug Discovery accelerates drug research. https://www.ibm.com/products/watson-drug-discovery
6. Violino B. A look at the leading data discovery software and vendors. TechTarget. https://searchbusinessanalytics.techtarget.com/feature/A-look-at-the-leading-data-discovery-software-and-vendors
7. O’Connell M. TIBCO Spotfire—A Big Jump in the 2017 Gartner BI & Analytics MQ. 2017. https://www.tibco.com/blog/2017/02/27/tibco-spotfire-a-big-jump-in-the-2017-gartner-bi-analytics-mq-2/
8. WEHI Speeds Drug Discovery with TIBCO Spotfire. https://www.tibco.com/resources/success-story/wehi-speeds-drug-discovery-tibco-spotfire
9. Theorem's Definitions for Augmented, Mixed and Virtual Reality. http://www.theorem.com/digital-realities/augmented-mixed-virtual-reality-definitions.htm
10. Yosipof A, Guedes RC and Garcia-Sosa AT. Data Mining and Machine Learning Models for Predicting Drug Likeness and Their Disease or Organ Category. 2018. Front Chem, 6: 162. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5954128/
11. Mixed Reality Technology — The Future of Virtual and Augmented Reality is Here. 2018. https://codeburst.io/mixed-reality-technology-the-future-of-virtual-and-augmented-reality-is-here-b9fb2a552d19
12. Novartis Explores Virtual Reality Tools In Drug Discovery R&D. 2017. Biopharma Trend. https://www.biopharmatrend.com/post/35-novartis-explores-virtual-reality-tools-in-drug-discovery-rd/
13. Kinateder M, Gualtieri J, Dunn MJ, Optom MC, Jarosz W, Yang XD and Cooper EA. Using an Augmented Reality Device as a Distance-based Vision Aid—Promise and Limitations. Optom Vis Sci 2018; Vol 00(00), 1-1 11. https://www.cs.dartmouth.edu/~xingdong/papers/AR.pdf
14. Conditt J. Microsoft will hand out $500K to these five HoloLens grant winners. Engadget. https://www.engadget.com/2015/11/11/microsoft-hololens-contest-winners/
15. C4X Discovery uses virtual reality for drug discovery. 2018. https://healthiar.com/c4x-discovery-uses-virtual-reality-for-drug-discovery