Google Helps the Search for Secure and Shareable Cloud Data
Industry Insight Apr 17, 2019 | by Ruairi J Mackenzie, Science Writer for Technology Networks
Cloud technologies are becoming a central part of modern research. Adopting the cloud has been an important step in making modern “big science” projects possible, whilst the cloud has also shown its worth to individual labs. Nonetheless, questions about the security and shareability of cloud data remain. Only by proving the cloud’s utility in high-profile research projects will these questions be finally answered. We caught up with Kevin Kells, Google Cloud’s Director of Google for Education NA, to discuss the projects that show how the cloud is changing research and how new technologies are making secure and simple data sharing a reality.
Ruairi Mackenzie (RM): How has the cloud changed research?
Kevin Kells (KK): Generally, by enabling more questions, large-scale computation, faster results and easier collaboration, all for significantly less money. One example of how researchers have taken advantage of all of these benefits is at Columbia University’s Lamont-Doherty Earth Observatory, where researchers drew on an NSF BIGDATA grant to move their data-intensive work on ocean currents to the cloud and collaborate with colleagues around the world on climate studies. The high-resolution three-dimensional datasets of climate science meant processing hundreds of terabytes of data to create their Earth System Modelling simulations. Moving to the cloud allowed researchers to ask more questions, run models with more variables, and get faster answers.
RM: How can cloud providers guarantee security to their academic users, especially those working with sensitive data?
KK: Data security and privacy are especially important in the field of human genomics. FIMM, the Institute for Molecular Medicine Finland, used their cloud system’s built-in identity and access management tools to help meet donor, institutional, and EU regulations for deidentifying clinical and biological data while scaling up to manage 1.5 petabytes of genomic data over the next three years.
Innovations in cloud-based biomedical research are already improving clinical practice too: researchers at Emory University used the cloud to establish a Fast Healthcare Interoperability Resources (FHIR) database of real-time patient data and run analytics to help predict the onset of sepsis, a potentially fatal condition that affects 750,000 Americans every year. Cloud platforms make sharing data easier while helping keep private health information more secure and supporting compliance with stringent HIPAA regulations and institutional standards.
While no organization can guarantee security, working with a cloud provider can offer significant security benefits to organizations. Cloud providers are able to hire and retain highly skilled security engineering and operations teams whose sole responsibility is keeping their own and their users’ data safe. It is the providers’ responsibility to secure all of the underlying infrastructure, allowing customers to focus on what they know best: making sure the right people have access to their data. Further, cloud providers can offer very advanced functionality that users may not be able to get or deploy in their own environments. For example, at Google Cloud, we encrypt all data at rest by default, a capability that few organizations have implemented across their own infrastructure. Finally, most cloud providers are regularly audited by third parties against rigorous international security and privacy standards to ensure that they are following accepted best practices.
RM: Data sharing is a central part of collaborative research. Can researchers using Google Cloud easily share data to collaborators using other platforms?
KK: At the University of Michigan, genomic researchers moved their three petabytes of data into Google’s cloud-based containers to make it easier to collaborate and reproduce results across the forty TOPMed precision medicine studies being conducted across the country. Using GCP and Google’s Preemptible VMs allowed them to accelerate their research, streamline deployment, and cut costs.
Other researchers at Columbia’s Magnetic Resonance Research Center (CMRRC) are leveraging GCP’s scalability to link five New York City research institutions into a cloud-based MRI Research Center that collects, processes, stores, and analyzes high-resolution medical images in one shared data pool. In the longer term, cloud infrastructure may be able to help the CMRRC expand this model of distributed points of care connected to cloud-based diagnostic and research resources to create a national and global network of clinics and laboratories.
Google Cloud Platform makes it easy to share data with collaborators using Google accounts or other types of accounts. Depending on how researchers want to configure their applications, consumer Gmail accounts, school- or organization-affiliated G Suite accounts, and single sign-on options can be used for authentication.
Collaborators without Google accounts can be given time-limited signed URLs, which make a specific piece of data available to anyone holding the link until it expires. Finally, application developers may choose to take advantage of Cloud Identity for Customers and Partners, an identity and access management system that supports multiple authentication methods such as user name / password and social network login.
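To make the idea of a time-limited signed URL concrete, here is a minimal Python sketch of the general technique: the link carries an expiry timestamp plus a signature over the resource address and that timestamp, so it can be neither extended nor pointed at a different file. This is an illustration only and is not Google Cloud's actual signing scheme; the function names `sign_url` and `verify_url`, the query parameters `expires` and `signature`, and the example address are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def sign_url(base_url: str, secret: bytes, expires_at: int) -> str:
    """Return base_url with an expiry timestamp and an HMAC-SHA256 signature.

    Hypothetical scheme for illustration; real providers define their own.
    """
    payload = f"{base_url}?expires={expires_at}".encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"{base_url}?{query}"


def verify_url(url: str, secret: bytes, now: int) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters

    # Recompute the signature over the same payload the signer used.
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    payload = f"{base_url}?expires={expires_at}".encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()

    # Constant-time comparison avoids leaking signature bytes via timing.
    valid = hmac.compare_digest(signature.encode(), expected.encode())
    return valid and now < expires_at
```

In this sketch the data owner would call `sign_url("https://example.org/data/file.csv", secret, expires_at)` and send the result to a collaborator, and the server would call `verify_url` on each request. In practice, researchers would rely on their cloud provider's own client library to generate signed URLs rather than writing this by hand.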
Kevin Kells was speaking to Ruairi J Mackenzie, Science Writer for Technology Networks.