Healthcare and Life Sciences, Meet the Third Platform of Storage
News Oct 30, 2015
By now, everyone is familiar with IDC’s view of technology progression. We are transitioning to the third platform where both people and their data are mobile, apps and users number in the millions and interact through new social outlets, Big Data is needed to explore these social interconnections, and cloud services are what enable it all.
IDC’s Third Platform Applied to Storage
While IDC’s model shows the general direction of IT, it also applies to data storage. The first platform used block-based, direct-attached storage and the second platform used networked storage, which brings us to the third storage platform: object storage. Virtually unlimited scalability and rich metadata, along with the ability to store very large collections of unstructured data, enable collaboration, a key premise of the third platform.
Object Storage in Healthcare and Life Sciences
Enterprise Content Management in Healthcare
Healthcare IT continues to transition to electronic health records (EHR) ahead of regulatory deadlines. In clinical environments, data resides in so many locations and in so many different formats that the true value of the data cannot be realized easily. On top of this, a clinician needs to have instant access to test results and patient information from anywhere and at any time.
Current EHR solutions are good for capturing and organizing patient information within a hospital environment, but they are not designed to organize data across the many systems found in a healthcare network, let alone dissimilar systems used by service partners. This is the pain point that Enterprise Content Management (ECM) systems such as Hyland's OnBase and others are targeting. An ECM system is able to aggregate patient information across varied repositories, making it searchable and accessible. An object storage-backed ECM supports anywhere, anytime access across platforms regardless of file type.
Genomics and Object Storage
The promise of personalized medicine is fueling significant investment in the field of genomics. With each run, genome sequencers create massive amounts of test data that must be stored and analyzed. Current next-generation sequencing (NGS) instruments are able to run tests in hours instead of days, allowing researchers to run more tests in less time. To drive this point home, consider a sequencing system that can process approximately 18,000 whole genomes per year. Now, multiply that by file sizes that can easily reach 80-90 terabytes and it’s obvious that data storage quickly becomes a choke point inhibiting scientific progress.
Admittedly, this is an extreme case given the sequencer’s hefty price tag (don’t expect to see one of these on your next visit to the doctor). Still, it is not uncommon for researchers to generate between 10 and 30 terabytes of data per day, enough to quickly consume a high-end storage array. Researchers must choose between keeping the data on fast primary storage and moving it to cheaper, slower storage. Many organizations are seriously considering cloud storage as an alternative to expensive, high-performance disk arrays.
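To make "quickly consume" concrete, here is a rough back-of-the-envelope sketch. The 10 and 30 terabytes per day come from the figures above; the 1,000-terabyte usable capacity is purely an assumed number for illustration, not a spec for any particular array, and the function name is made up for this example.

```python
def days_until_full(capacity_tb: float, daily_ingest_tb: float) -> float:
    """Days until an empty array fills at a constant daily ingest rate."""
    if daily_ingest_tb <= 0:
        raise ValueError("daily ingest must be positive")
    return capacity_tb / daily_ingest_tb


# Assumption: a hypothetical array with 1,000 TB (1 PB) of usable capacity.
ARRAY_CAPACITY_TB = 1000.0

if __name__ == "__main__":
    # Low and high ends of the daily data range cited in the text.
    for rate_tb in (10, 30):
        days = days_until_full(ARRAY_CAPACITY_TB, rate_tb)
        print(f"{rate_tb} TB/day fills {ARRAY_CAPACITY_TB:.0f} TB in about {days:.0f} days")
```

Under these assumptions, a full petabyte lasts roughly 100 days at the low end and only about a month at the high end, and that is before any copies, snapshots or analysis output are counted.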
I don’t know about you, but I can’t imagine routinely moving several days’ worth of test data across a wide area network, fat pipe or not. Data transfer is both time-consuming and expensive, and it represents a security risk. Illumina recently introduced BaseSpace, allowing its sequencers to transfer raw sequence data directly to Amazon’s object storage service via an S3 interface. This is a step forward in terms of storing large data sets, and you can expect the sequencing industry as a whole to rapidly follow suit. However, with a limited research budget it makes sense to deploy a flexible storage architecture, one that scales well and can support public, private and hybrid clouds. The benefits are the ability to store data locally in cases requiring improved performance and security, as well as the ability to support collaboration with remote partners. Object storage systems can disperse data both locally and to remote data centers; this greatly reduces the risk of data loss and eliminates the expense of replication software or a dedicated disaster recovery system.
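To put the data-transfer concern raised above in perspective, here is a minimal sketch of the arithmetic. The link speeds and the 80 percent efficiency factor are assumptions chosen for illustration (real WAN throughput varies widely), and a terabyte is taken as 10^12 bytes.

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Hours needed to move data_tb terabytes over a link_gbps link.

    efficiency is the assumed fraction of nominal bandwidth actually
    achieved after protocol overhead and contention (0 < efficiency <= 1).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    bits = data_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # One day of data at the high end of the range cited earlier (30 TB),
    # over two assumed link speeds.
    for gbps in (1, 10):
        print(f"30 TB over {gbps} Gbit/s: {transfer_hours(30, gbps):.1f} hours")
```

With these assumed numbers, a single 30-terabyte day takes a bit over eight hours on a 10 Gbit/s link and more than three days on a 1 Gbit/s link, so several days' worth of data can easily take longer to ship than it took to generate.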
The Joining of Forces
Given the different workflows and performance needs of Healthcare and Life Sciences, it seems unlikely that one storage platform could support both segments. However, the benefits of personalized medicine cannot be achieved until the gap is bridged between genomics research and clinical practice. In fact, the National Institutes of Health (NIH) recently committed $48M in research grants to integrate genomic information into EHRs in a bid to prepare EHRs and the healthcare system for precision medicine. Couple this with the government’s recently published Health IT Strategic Plan calling for healthcare IT systems to support personal monitoring devices such as Fitbit and smartphone apps and you can see where things are heading – lots of data streaming from multiple sources continuously. Talk about an IoT data deluge. The question becomes, will your storage system support the needs of the third platform?
Author: Scott Cleland, Senior Director Product Marketing, Cloud Infrastructure Business Unit, HGST
Office of the National Coordinator for Health Information Technology – move beyond EHR to encompass all forms of health information