TGAC Installs Largest SGI UV 300 Supercomputer for Life Sciences
News May 11, 2016
With the world’s largest SGI® UV™ 300 installation for life sciences, and one of the largest Intel® SSD for PCIe* deployments worldwide, TGAC’s new supercomputing platform gives the Institute access to the next generation of SGI UV technology for genomics. This will enable TGAC researchers to store, categorise and analyse more genomic data in less time for decoding living systems and answering crucial biological questions.
This new TGAC platform comprises two SGI UV 300 systems totalling 24 terabytes (TB) of shared-memory, 512 Intel® Xeon® Processor E7 v3 cores and 64TB of Intel® P3700 SSDs with NVMe* storage technology. Each SGI UV 300 flash memory solution features 12TB of shared-memory with 7th generation SGI NUMAlink® ASIC technology, scaling up to 64 TB of global addressable memory as a single system.
Paired with flash storage, the combined 24TB SGI UV 300 supercomputers can increase processing speeds of heavy workloads in scientific research by 80 percent. This combination of leading-edge technology allows TGAC researchers to benefit from the faster processing capabilities of the SGI UV 300, providing an extraordinarily powerful platform for genomics analysis.
In particular, the system will dramatically reduce the time needed to perform the large genome assemblies in which TGAC researchers specialise, as well as the analysis of wheat genotype and phenotype data generated by the Seeds of Discovery programme. TGAC will use the pioneering SGI HPC technology to enable faster analysis of complex genomes, which require both large memory and fast processing capabilities, providing a powerful boost to TGAC’s research projects. This will include sequencing and assembling multiple lines of wheat with the Institute’s ‘w2rap’ assembly software, developed by the Algorithm Development team led by Bernardo Clavijo.
The new technology will also be used to aid the development of novel analysis techniques for data integration, by taking advantage of the larger, faster memory-per-core specifications of the system and its accelerated I/O capabilities from the NVMe SSDs. This will provide a significant speedup of data movement between the hardware and TGAC’s genome analysis software.
Dr Tim Stitt, Head of Scientific Computing at TGAC, said: “Continuing our successful collaborations with SGI and Intel in deploying novel and innovative computing technology, the new SGI UV 300 appliance with NVMe SSDs will undoubtedly be a leader in the field of genomic analysis.
“With the unique shared-memory technology from SGI and Intel’s leading processor and non-volatile memory storage solutions, this system will set the new yardstick for large-scale data-intensive bioinformatics computations. The combination of processor performance, memory capacity and one of the largest deployments of Intel SSD storage worldwide makes this a truly powerful computing platform for the life sciences.”
Jorge Titinger, president and CEO, SGI, said: “The complexity of genomic data and workloads today requires high performance computing (HPC) to provide new insights for researchers and allow them to derive conclusions from the massive data jigsaw puzzle. This enables TGAC to tackle fundamental problems such as increasing the yield of crops to keep up with population growth. SGI is honored to be extending the partnership with TGAC by providing the SGI UV 300 architecture. This system is designed for data-intensive, in-memory, I/O-demanding applications and includes the very latest NUMAlink 7 ASIC, scaling right up to 64TB of globally addressable memory as a single system. This is all done using unmodified, open-standards Linux, allowing the SGI UV to be run as one system and offering flexibility for exploring new algorithms and programming models without restriction.
“By tightly coupling the Intel® P3700 SSDs with NVMe* storage technology with the latest generation SGI UV 300 system, SGI allows customers like TGAC to achieve extraordinary bandwidth and inputs/outputs per second (IOPS) and will keep TGAC research and productivity at the very top of this highly competitive worldwide domain of genomic analysis. We are very proud of such a successful collaboration and look forward to TGAC researchers achieving even more breakthroughs and accelerated productivity with this new system,” said Titinger.
Ketan Paranjape, general manager of Intel’s Life Sciences team, added: “Knowledge of plant and animal genomes can lead to breakthroughs in drug discovery, food safety, and more, helping us to better manage climate change, feed a growing population, and mitigate the impact of newly emergent diseases. With the SGI UV 300 system, Intel Xeon Processor E7 v3 product family and Intel DC P3700 SSDs with NVMe, TGAC can now assemble large plant and animal genomes in record times that, until a few months ago, were impossible.”
Part of the SGI UV server line for high performance in-memory computing, the new SGI UV 300 systems are advanced symmetric multiprocessing (SMP) machines designed for compute-intensive, fast algorithm workloads such as genome assembly, computer-aided engineering (CAE) and scientific simulations. The limited ability of computer clusters to perform in-memory analytics on very large data sets, and to scale without severe latency penalties from cluster networking, has led to a surge in scale-up server platforms.
The SGI UV server, in its fifth generation, is well placed to serve this need. For TGAC, a key benefit of this powerful system is its scalability and memory capacity, which allow it to execute some of the most demanding data- and compute-intensive workloads, including large, complex genome assembly and analytics. The SGI UV, combined with the Intel® Xeon® Processor E7 v3 product family, creates a powerful technology that can advance genomic breakthroughs.
TGAC is strategically funded by BBSRC and operates a National Capability to promote the application of genomics and bioinformatics to advance bioscience research and innovation.