Maximizing Drug Discovery Success With Machine Learning
The pipeline to new drugs isn’t straightforward. Although estimates that only 5-10% of drug programs make it to approval may understate the true rate, success is still far from guaranteed. What is guaranteed is the gargantuan time and cost of taking a compound from discovery to therapy. In response, pharmas are wising up and seeking to make use of what is perhaps their greatest resource – data.
Leveraging molecular data before trials begin means better selection of target compounds and a quicker development process. But data isn’t just a tool; it’s a commodity, and a hugely valuable one. How can we encourage companies to share data assets freely?
One potential solution is offered by Machine Learning Ledger Orchestration for Drug Discovery (MELLODDY). This Innovative Medicines Initiative (IMI)-funded consortium hopes to leverage blockchain architecture to guarantee shareability and security for pharmaceutical data. What’s more, it already has the backing of 10 major pharmas. We talked to Hugo Ceulemans, MELLODDY Project Leader and Scientific Director, Discovery Data Sciences at Janssen Pharmaceutica NV, and Mathieu Galtier, Project Coordinator at data science company Owkin, to find out more.
Laura Lansdowne (LL): What is “MELLODDY” and why was it launched?
Hugo Ceulemans (HC): MELLODDY is a new Innovative Medicines Initiative (IMI)-funded consortium of pharmaceutical, technology and academic partners across Europe. For the first time, it aims to apply machine learning methods to the annotated chemical collections of 10 pharma companies, developing a platform that can build more accurate models for predicting which compounds could be promising in the later stages of drug discovery and development.
MELLODDY was launched to accelerate drug discovery using machine learning to unlock the maximum potential of pharma data – the world’s largest collection of small molecules with known biochemical or cellular activity – to enable more accurate predictive models and increase efficiencies in drug discovery.
Ruairi Mackenzie (RM): How will MELLODDY help to increase efficiencies in drug discovery?
HC: Our hypothesis is that the MELLODDY privacy-preserving federated machine learning platform will help the pharma partners in the consortium to explore fewer drug candidates that are of a higher overall quality, therefore likely saving time and costs.
LL: The MELLODDY consortium will use Owkin’s blockchain technology. Can you tell us more about this technology and its benefits in relation to drug discovery?
Mathieu Galtier (MG): The MELLODDY consortium will use a blockchain-based technology called Substra, which is being developed by Owkin. Substra is a framework for traceable machine learning orchestration on distributed and sensitive data. It makes it possible to deploy a network of nodes (in our case pharma servers) and orchestrate the training of machine learning models on data that remain stored in the different nodes. It is based on a distributed ledger technology (Hyperledger Fabric) that ensures full traceability of operations occurring on the network. It allows actors to collaborate without having to share their assets. This approach works well for drug discovery because pharmaceutical companies value confidentiality and data protection, and Substra can lift the limitations usually linked to data sharing.
We believe that by combining large amounts of small molecule data with our advanced machine learning technologies, we can better understand molecular structures and effects, biological pathways and heterogeneous outcomes – becoming the cornerstone of modern data-driven approaches to precision drug discovery.
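Editor's note: the traceability idea described above can be illustrated with a toy, single-process log in which every entry commits to the hash of the previous one, so that altering a past record becomes detectable. This is only a minimal sketch of the general hash-chaining principle; it is not Substra's API or Hyperledger Fabric, and the names `ToyLedger`, `record`, `verify` and the node labels are hypothetical.

```python
import hashlib
import json


def entry_hash(body: dict) -> str:
    """Deterministic SHA-256 digest of an entry body."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ToyLedger:
    """Append-only log; each entry stores the hash of the previous entry.

    Hypothetical illustration only: a real distributed ledger also
    replicates entries across nodes and agrees on their order.
    """

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def record(self, node: str, operation: str, model_version: int) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {
            "index": len(self.entries),
            "node": node,
            "operation": operation,
            "model_version": model_version,
            "prev_hash": prev,
        }
        entry = dict(body, hash=entry_hash(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that the chain links up."""
        prev = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev or entry_hash(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


ledger = ToyLedger()
ledger.record("pharma_a", "train", 1)
ledger.record("pharma_b", "train", 2)
ledger.record("pharma_c", "evaluate", 2)
intact_before = ledger.verify()

# Rewriting a past entry invalidates its stored hash.
ledger.entries[1]["node"] = "someone_else"
intact_after = ledger.verify()
```

Here `intact_before` reflects the untouched log and `intact_after` the tampered one. Only metadata about each operation is logged; the training data themselves never appear in the ledger.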
LL: What is federated learning?
MG: The federated learning paradigm provides a practical solution to training predictive models on large amounts of data owned by different data controllers (in our case pharma servers), while giving each controller complete visibility over how its data is used. That data never leaves its “home” location, therefore preserving privacy and security. While data is kept locked on each partner’s infrastructure, our predictive models travel from one data controller to another to be trained on the protected data. Federated learning enables us to train algorithms on data at scale, which creates the most accurate and robust predictive models for powering applications.
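Editor's note: the "model travels, data stays" pattern described above can be sketched in a few lines. The example below is a simplified illustration, not MELLODDY's or Owkin's implementation: it assumes a tiny linear model, synthetic data generated from y = 2x + 1, and cyclic training in which the weights visit each node in turn. Many federated systems instead average updates from all nodes; that variant is not shown. The `Node` class and node names are hypothetical.

```python
import random


class Node:
    """A data controller that trains locally and returns only weights."""

    def __init__(self, name, data):
        self.name = name
        self._data = data  # private (x, y) pairs; never handed to the caller

    def train(self, weights, lr=0.05, epochs=5):
        """Run a few passes of stochastic gradient descent on local data."""
        w, b = weights
        for _ in range(epochs):
            for x, y in self._data:
                err = (w * x + b) - y
                w -= lr * err * x
                b -= lr * err
        return w, b


def make_private_data(rng, n, lo, hi):
    """Synthetic data (an assumption): y = 2x + 1 plus small noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(lo, hi)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)))
    return data


rng = random.Random(0)
# Each node sees a different slice of the input range.
nodes = [
    Node("pharma_a", make_private_data(rng, 30, -1.0, 0.0)),
    Node("pharma_b", make_private_data(rng, 30, 0.0, 1.0)),
    Node("pharma_c", make_private_data(rng, 30, -1.0, 1.0)),
]

# The coordinator only ever handles the model weights.
weights = (0.0, 0.0)
for _ in range(30):
    for node in nodes:
        weights = node.train(weights)

w, b = weights
print(f"learned w={w:.2f}, b={b:.2f}")
```

The coordinating loop never reads a node's data; it passes weights in and receives weights back. With these synthetic inputs the learned values should land close to the generating slope and intercept, even though no single node covers the same data as the others.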
RM: Does MELLODDY herald a future of more collaboration between pharma companies on data handling and analysis?
HC: The MELLODDY approach combines predictivity-boosting information exchange with security and privacy preservation of the underlying data and sensitive models. That unique combination has been key to pharma partners contributing an unprecedented collective volume of competitive discovery data. The hypothesis MELLODDY aims to test directly is that the sizeable jump in collective data volume, compared with what any individual partner can access directly, will boost the performance and applicability domain of models. A successful proof of concept would likely inspire the use of this and related technologies to enable collaborative analyses across data from pharma and other stakeholders, with the aim of bringing safe and efficacious therapeutics to patients.
Hugo Ceulemans and Mathieu Galtier were speaking to Laura Elizabeth Lansdowne and Ruairi J Mackenzie, Science Writers for Technology Networks.