Angry Birds Meets Bioinformatics
News Jul 23, 2012
So far, Big Data and Web 3.0 -- techie terms for massive stores of patient data and a unified global system to analyze them -- have not realized their full potential in medical research.
Meanwhile, you can cruise your favorite app store for hundreds of Web applications that analyze data, in many cases to create a smartphone game. No need to buy pricey hard drives or software packages. Ironically, the market asked Web browsers, the part of your computer that talks to the internet, to do so many things in recent years that this cheap, universal computing environment became an analytical powerhouse, especially when you network browsers together in "clouds."
Bioinformatics experts have looked on with envy, wishing they too had access to simple, powerful and free Web apps like those that drive Angry Birds. Now a team at the University of Alabama at Birmingham has made a start at delivering that.
As detailed in the July 20 edition of the Journal of Pathology Informatics, ImageJS is a free app system in some ways like Angry Birds, or perhaps more like Instagram, except that it analyzes tissue images instead of offering a digital shooting gallery or dressing up family photos. Specifically, the first ImageJS module enables pathologists to drag a digitized pathology slide into a Web app and analyze it for malignancy based on color changes that occur when cancer cells are exposed to standard dyes. But the developers' vision is more ambitious than that.
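To make the idea of color-based analysis concrete, here is a minimal sketch in Python. It is not the ImageJS algorithm, which the article does not describe. It assumes pixels arrive as RGB tuples, and the reference color, the tolerance and the function names are all hypothetical choices made for illustration.

```python
from math import dist

# Hypothetical reference color for stained tissue (a purple tone); not taken from ImageJS.
REFERENCE_RGB = (120, 60, 150)
# Hypothetical tolerance, measured as Euclidean distance in RGB space.
MAX_DISTANCE = 60.0


def is_stained(pixel, reference=REFERENCE_RGB, max_distance=MAX_DISTANCE):
    """Return True when the pixel's color lies close to the reference color."""
    return dist(pixel, reference) <= max_distance


def stained_fraction(pixels):
    """Return the fraction of pixels whose color is close to the reference color."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_stained(p) for p in pixels) / len(pixels)


# A made-up four-pixel "image": two purple-ish pixels and two pale background pixels.
tiny_image = [(125, 65, 145), (110, 50, 160), (240, 235, 240), (250, 250, 250)]
fraction = stained_fraction(tiny_image)
```

A real tool would presumably let the pathologist pick the reference color and tolerance interactively and would work on millions of pixels, but the core step of measuring how much of an image matches a stain color can be this simple.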
The pathology modules are the first in a series coming from Jonas Almeida, Ph.D., director of the Division of Informatics in the UAB School of Medicine Department of Pathology and corresponding author of the JPI study.
Future modules will seek to perform genomics analysis, make the system capable of cloud computing and enable doctors to compare their patients' data to similar cases stored in national databases. Such comparisons promise to increase diagnostic accuracy and rule out treatments that clash with a person's genetic signature.
"We created a new kind of computational tool that promises to make patient data more useful where it's collected," says Almeida. A demonstration is online on YouTube, and the app is available from the Google Chrome App store, Google Code and GitHub.
"But ImageJS is an informatics experiment at this point; it will only become something special if pathologists embrace it," he says.
Public resources such as The Cancer Genome Atlas, Gene Expression Omnibus and the 1000 Genomes Project have been generating massive data sets for years, but researchers are just beginning to harness them with Web apps to improve health-care delivery. To tap this potential, experts need apps that are not overly complex and that do not require unsafe downloads of desktop applications.
The promise of ImageJS, Almeida says, lies in the willingness of pathologists to partner with biostatisticians at other institutions to write modules for this open-coded system that add value or resolve their specific problems. If that happens, ImageJS could come to resemble the social coding communities that surround smartphones, in this case to deliver better health care.
"Pathologists are extremely eager for this, but the barriers so far have made it frustrating," says James Robinson Hackney, M.D., a UAB pathologist who consulted on the ImageJS project. "It's great to watch a colleague's face when this system enables them almost instantly to accomplish a goal they had given up on using past applications."
The "ImageJS" name pays homage to ImageJ, an image-analysis program pioneered by the National Institutes of Health. ImageJ was written in the Java language, and making even the small changes needed to work with a hospital's system required hours of programming and software downloads that breach security rules. Because hospital systems often block just such downloads, its use there has been limited.
Most important, ImageJS code travels from the code repository to the browser, so the data can stay where it was collected. That eliminates the risk that patient data will be damaged in transit or exposed in ways that violate patient privacy.
The central barrier to Web 3.0, an approaching era when all computers act as one to achieve new levels of collaboration and computing power, is that data compiled to date worldwide has been locked in silos that can't be searched, shared or analyzed.
While researchers hope to translate existing databases into the simple, universal Web 3.0 data formats that will allow this kind of sharing, they are also starting fresh. Almeida's team, for instance, will attempt over the next year to build a database that offers future ImageJS users the option to store images and related analysis in the Resource Description Framework, or RDF, format.
Once data is stored in this format, it can be found by queries written in SPARQL, a simple computer language that can be learned by most people in a couple of hours. That is extremely important, says Almeida, because the next generation of bioinformatics must not remain the province of programmers. Instead, the researchers running the clinical trials and translational studies must write their own programs, which would add "stunning, new analytic value" to their research and feed into the new global databases.
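For readers curious about what that looks like, RDF stores every fact as a three-part statement of subject, predicate and object, and a SPARQL query is essentially a pattern with blanks to fill in. The following Python sketch imitates that idea with a plain list. It is a toy, not a real RDF store or SPARQL engine, and every identifier in it (the slide names, the `ex:` predicates, the `match` helper) is made up for illustration.

```python
# Each fact is a (subject, predicate, object) triple, the core idea behind RDF.
# All identifiers below are hypothetical examples, not ImageJS data.
TRIPLES = [
    ("slide:001", "ex:stain", "H&E"),
    ("slide:001", "ex:finding", "malignant"),
    ("slide:002", "ex:stain", "H&E"),
    ("slide:002", "ex:finding", "benign"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Roughly what a SPARQL query such as
#   SELECT ?slide WHERE { ?slide ex:finding "malignant" }
# asks for: every subject that has the given predicate and object.
malignant_slides = [
    s for (s, _, _) in match(TRIPLES, predicate="ex:finding", obj="malignant")
]
```

Because every fact has the same three-part shape, data from different institutions can in principle be pooled and searched with the same kind of pattern, which is the sharing that the Web 3.0 formats are meant to enable.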
The UAB team's work proceeds alongside other academic efforts, such as the Tetherless World Constellation at Rensselaer Polytechnic Institute, and ventures such as LinkedLifeData, Sindice and Hadoop.
As in a game, everyone's hoping to get to the next level. But for the field of bioinformatics -- and for patients -- the stakes are real, and they're much higher.