Texas A&M System Teams with IBM to Drive Computational Sciences Research through Big Data and Analytics
News Feb 03, 2014
The Texas A&M University System and IBM have announced an agreement that launches a broad research collaboration supported by one of the largest computational sciences infrastructures dedicated to advances in agriculture, geosciences and engineering.
The collaboration will leverage the power of big data analytics and high performance computing (HPC) systems for innovative solutions across a spectrum of challenges, such as improving extraction of Earth-based energy resources, facilitating the smart energy grid, accelerating materials development, improving disease identification and tracking in animals, and fostering better understanding and monitoring of our global food supplies.
“Combining the incredible intellectual and technological resources of Texas A&M University and IBM will further position Texas as a leader in identifying and solving some of the most complex challenges we face,” Texas Gov. Rick Perry said. “The work that will be done here will change lives and potentially save lives not just in our state, but our nation and around the world.”
IBM will provide the infrastructure for the joint research, consisting of Blue Gene/Q technology, Power and System x servers, and General Parallel File System (GPFS) storage systems. In an on-campus test, the Blue Gene/Q took a materials science problem that previously required weeks to solve and produced a solution in "a fraction of an hour" with much greater analytical depth.
“The Texas A&M System and IBM share a passion and a commitment to research that identifies practical solutions to global challenges,” said Chancellor John Sharp, Texas A&M University System. “As the largest research university in the state, this agreement is a major step forward for the A&M System in research computing power. This brings together the best computer scientists and technology in the world to focus on issues so important to our role as a leading research institution and to our land-grant mission of serving the state while also providing resources to serve the greater good throughout the world.”
IBM Research and the A&M System intend to align skills, assets and resources to pursue fundamental research, applied development, educational reach and sustainable commercial activities with projects that may include:
- Sustainable Availability of Food: Efficiently providing sufficient food for a growing global population
- Disease Spread Tracking, Modeling and Prediction: Early and accurate detection and prediction of infectious disease spread to allow the design, testing and manufacturing of medical countermeasures
- Energy Resource Management: Responsibly exploring, extracting, and delivering energy resources
- New Materials Development: Atomic-level modeling, design and testing of new materials for advanced energy, aerospace, structural and defense applications
The Texas A&M Engineering Experiment Station (TEES), a premier engineering research agency of Texas that conducts research to provide practical answers to critical state and national needs, will be heavily involved on behalf of the Texas A&M University System. According to Katherine Banks, Director of TEES and Vice Chancellor of Engineering, “This is a unique opportunity to meet the needs of engineering, geosciences and agriculture and life sciences researchers to expand in areas not feasible before with small-scale HPC systems.”
“IBM and the Texas A&M System have crafted a unique collaboration that could apply computational science and big data analytics to some of the most daunting problems in agriculture, geosciences and engineering,” said William LaFontaine, Vice President of High Performance Analytics and Cognitive Markets at IBM. “With the combined research capabilities of both institutions and ready access to state-of-the-art computing technology, we feel this collaboration could produce significant scientific insights leading to industry-changing solutions and material economic impact. We are extremely pleased to be engaged with such extraordinarily capable institutions in the A&M System and look forward to years of discovery and innovation.”
TEES partners with academic institutions, governmental agencies, industries and communities to solve problems that help improve the quality of life, promote economic development, and enhance the educational systems of Texas. It is intimately connected with the College of Engineering of Texas A&M University, which is undergoing unprecedented growth to become a college with 25,000 students by 2025 and to hire a new generation of faculty who will address the nation’s needs for research and technology development.
In support of the long-term research effort, IBM will supply the A&M System with cutting-edge, cloud-enabled technical computing technologies. The A&M System will deploy a research computing cloud comprising IBM hardware and software, including:
- Blue Gene/Q: Serving as the foundation of the computing infrastructure, a Blue Gene/Q system consisting of two racks, with more than 2,000 compute nodes, will provide 418 teraflops (TF) of sustained performance for big data analytics, complex modeling, and simulation of molecular dynamics, protein folding and organ modeling.
- Power Systems: A total of 75 PowerLinux 7R2 servers with POWER7+ microprocessors will be connected by 10GbE into a system optimized for big data and analytics and high performance computing. This complex includes IBM BigInsights and Platform Symphony software, IBM Platform LSF scheduler, and IBM General Parallel File System.
- System x: The solution will contain an estimated 900 IBM System x dense hyperscale compute nodes as part of an IBM NeXtScale system. Some of the nodes will be managed by Platform Cluster Manager Advanced Edition (PCM-AE) as a university-wide HPC cloud, while the others will be managed by Platform Cluster Manager Standard Edition (PCM-SE) and serve as a general-purpose compute infrastructure for the geosciences and open source analytics initiatives.
- Platform Computing: Platform Computing software will be used to manage and accelerate various computational workloads. Platform Symphony will drive big data and analytics, and Platform LSF will drive traditional HPC and technical computing workloads. Platform Computing will also power the creation of an HPC cloud, allowing users within the A&M System access to the system.
- General Parallel File System (GPFS): Five IBM System x GPFS Storage Servers (GSS) will provide five petabytes (PB) of shared storage, available to the compute building blocks over high-speed networks. The GPFS storage will also include an IBM FlashSystem 820 tier with 10 terabytes (TB) of flash storage to accelerate computation, primarily for Texas A&M AgriLife Research, Geosciences and university HPC, as part of the research computing infrastructure.
Furthermore, IBM will work with researchers at the A&M System to assess new computing technologies that will be necessary to advance data-driven scientific discovery and innovation over the next several years.
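To put the announced hardware figures in per-unit terms, the short Python sketch below does some back-of-envelope arithmetic on the numbers quoted above. It is illustrative only: the node count of 2,048 is an assumption (the announcement says only "more than 2,000"), the even split of storage across the five GSS units is an assumption, and decimal units are assumed throughout. The names `per_unit` and `summarize` are hypothetical helpers, not part of any IBM or Texas A&M software.

```python
# Back-of-envelope arithmetic on the figures quoted in the announcement.
# ASSUMPTION: 2,048 compute nodes; the announcement says only "more than 2,000".
# ASSUMPTION: storage is split evenly across the five GSS units.
# ASSUMPTION: decimal units (1 TF = 1,000 GF, 1 PB = 1,000 TB).

BLUE_GENE_TERAFLOPS = 418.0    # from the announcement
BLUE_GENE_RACKS = 2            # from the announcement
ASSUMED_COMPUTE_NODES = 2048   # hypothetical value, see assumptions above
GPFS_PETABYTES = 5.0           # from the announcement
GSS_SERVERS = 5                # from the announcement
FLASH_TERABYTES = 10.0         # from the announcement


def per_unit(total: float, units: int) -> float:
    """Split a total evenly across a positive number of units."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total / units


def summarize() -> dict:
    """Return per-rack, per-node, and per-server figures as a dictionary."""
    return {
        "teraflops_per_rack": per_unit(BLUE_GENE_TERAFLOPS, BLUE_GENE_RACKS),
        "gigaflops_per_node": per_unit(BLUE_GENE_TERAFLOPS * 1000, ASSUMED_COMPUTE_NODES),
        "petabytes_per_gss": per_unit(GPFS_PETABYTES, GSS_SERVERS),
        "flash_share_of_capacity": FLASH_TERABYTES / (GPFS_PETABYTES * 1000),
    }


if __name__ == "__main__":
    # Print each derived figure on its own line.
    for name, value in summarize().items():
        print(f"{name}: {value:g}")
```

Under these assumptions, each Blue Gene/Q rack accounts for 209 TF, each GSS unit for 1 PB, and the flash tier for 0.2 percent of the shared storage capacity. The per-node figure changes with the actual node count, which the announcement does not state exactly.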