Through a five-year collaboration, researchers and software engineers at the new Intel-Broad Center for Genomic Data Engineering will build, optimize, and widely share new tools and infrastructure that will help scientists integrate and process genomic data. The project aims to refine hardware and software best practices for genome analytics so that research data sets residing on private, public, and hybrid clouds can be combined and used together.
The project will enable researchers worldwide to run more data-intensive studies and generate robust results more quickly by accessing data that may have been unavailable to them before.
“The size of genomic datasets doubles about every eight months and, as it does, the challenge of acquiring, processing, storing, and analyzing this information increases as well,” said Eric Banks, director of the Data Sciences and Data Engineering group at the Broad Institute. “Working with Intel, we plan to build out solutions that can work across different infrastructures to facilitate efficient processing of these growing data sets, and then make these tools openly available for researchers worldwide. Our work is a step toward building something analogous to a superhighway to connect disparate databases of genomic information for the advancement of research and precision medicine.”
Building upon an existing collaboration, the new effort will combine Intel’s data analytics and artificial intelligence capabilities with Broad’s expertise in genomic data generation, health research, and analysis tools to build new resources that promote biomedical discoveries, including those that advance precision medicine.
Under the five-year agreement, the Intel-Broad Center for Genomic Data Engineering will focus on three goals:
• Overcome the challenge of diverse genomic datasets by optimizing the hardware recommendations in Broad’s Genome Analysis Toolkit (GATK) Best Practices for genomic workloads in on-premises, public cloud, and hybrid cloud use cases.
• Simplify and accelerate the execution of genome analytics by optimizing genomics software tools such as GATK, Cromwell, and GenomicsDB on industry standard Intel-based platforms.
• Empower users such as healthcare providers, pharmaceutical companies, and academic research organizations to collaborate by partnering on workflow execution models across complex and distributed datasets. Achieving this goal will enable secure processing of data across organizations, which can stimulate basic research, drug discovery, clinical trial recruitment, and ultimately clinical decision-making across the entire research and discovery ecosystem.
“Intel and Broad share the common vision of harnessing the power of genomic data and making it widely accessible for research around the world to yield important discoveries,” said Diane Bryant, executive vice president and general manager of the Data Center Group for Intel Corporation. “We each bring to the collaboration our unique expertise and capabilities. At Intel, through the use of artificial intelligence, we are confident we can solve the massive data challenges facing the industry.”
Original story from The Broad Institute.