HPC in Research: Analysing more data, more quickly
Article Jun 28, 2017 | by Jack Rudd, Senior Editor for Technology Networks
Credit: University of Bristol
For over a decade the University of Bristol has been contributing to world-leading and life-changing scientific research using High Performance Computing (HPC), having invested over £16 million in HPC and research data storage. To continue meeting the needs of researchers working with large and complex data sets, the University has introduced a new HPC machine, named BlueCrystal 4 (BC4). Designed, integrated and configured by the HPC, storage and data analytics integrator OCF, BC4 has more than 15,000 cores, making it the largest UK university system by core count, and a theoretical peak performance of 600 teraflops.
To find out more about this new system and the work it will enable, we spoke to Simon Burbidge, Director of Advanced Computing at the University of Bristol.
What is BC4 and how does it build on the system it has superseded?
Alongside scientific theory and experimentation, High Performance Computing (HPC) is the third pillar of modern research. At the University of Bristol we have a rolling programme to continually update and invest in our HPC services. Every three years we talk to our users to find out what they need and want from a system, and what their future ambitions are. We then set out to procure a machine that meets their needs and provides as much power as possible within our budget.
BC4 is the culmination of a lot of hard work from the University and our HPC integration partner OCF. The team worked to design, integrate and configure the new system in collaboration with Lenovo, DDN and Intel. It’s highly compatible with BC3, our previous HPC cluster, so our users can easily migrate work from the existing machine onto the new system, which is much more powerful: in the region of three to four times more so.
As well as being significantly faster, does BC4 enable any new applications?
BC4 includes 32 nodes with dual NVIDIA Pascal P100 GPUs, plus a GPU login node, which will really make a difference to certain research areas. In molecular dynamics, for example, the new system enables us to carry out much larger simulations and to tackle bigger atomic systems with many more molecules.
GPUs are hard to program for computational use. You need to rewrite applications, which is complex and time-consuming, so you need to be a good programmer. But because of the inherent power of GPUs and the parallel features of the chips, you can get a substantial speed-up over CPUs.
There are particular applications, like the Amber Molecular Dynamics Package and BUDE, that can take advantage of CUDA, the parallel computing platform and application programming interface (API) model created by NVIDIA for GPUs.
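To give a rough sense of why applications have to be rewritten for GPUs, the sketch below contrasts a conventional loop with the per-thread "kernel" style that GPU programming models such as CUDA are built around. It is a CPU-only illustration in plain Python, not code from Amber, BUDE or BC4; the function names (scale_serial, scale_kernel, launch) are hypothetical, and the "threads" here simply run one after another.

```python
# CPU-only illustration of the per-thread "kernel" style used in GPU code.
# Nothing here runs on a GPU; all names are hypothetical.

def scale_serial(values, factor):
    """Conventional CPU style: a single loop walks the whole array."""
    out = []
    for v in values:
        out.append(v * factor)
    return out

def scale_kernel(thread_id, values, factor, out):
    """Kernel style: each logical thread handles exactly one element."""
    if thread_id < len(values):  # guard: more threads may be launched than elements
        out[thread_id] = values[thread_id] * factor

def launch(kernel, n_threads, *args):
    """Stand-in for a GPU launch. On real hardware the threads would run
    in parallel; here they run sequentially so the logic can be checked."""
    for thread_id in range(n_threads):
        kernel(thread_id, *args)

values = [1.0, 2.0, 3.0, 4.0, 5.0]
out = [0.0] * len(values)
launch(scale_kernel, 8, values, 2.0, out)  # 8 threads for 5 elements: 3 do nothing

print(out == scale_serial(values, 2.0))
```

The point of the restructuring is that each element's work becomes independent of the others, which is what lets a GPU spread it across many cores at once. Real applications are far harder to reshape this way than a simple scaling loop.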
Could you tell us about some of the exciting research this new cluster is going to support?
BC4 played a pivotal role in a €1.8m study into Ebola, looking at the speed of the virus’s evolution and the corresponding effect on vaccines, diagnostics and treatment. The capabilities of BC4 were invaluable to the research: it was used to analyse raw data on the virus in 179 patient blood samples to determine the precise genetic make-up of the virus in each case.
This allowed the team, led by Dr. David Matthews, Senior Lecturer in Virology at the University, to examine how the virus had evolved over the previous year, informing public health policy in key areas such as diagnostic testing, vaccine deployment and experimental treatment options.
This complex data analysis took around 560 days of supercomputer processing time and generated nine thousand billion letters of genetic data in order to determine the virus’s 18,000-letter genetic sequence for all 179 blood samples.
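As a back-of-the-envelope check on the scale of those figures, the short calculation below averages them per blood sample. It assumes that "billion" means 10**9 and that the processing time and data volume were spread evenly across the 179 samples, neither of which is stated in the interview, so the results are indicative only.

```python
# Figures quoted in the interview; the even split per sample is an assumption.
SAMPLES = 179
GENOME_LETTERS = 18_000
RAW_LETTERS = 9_000 * 10**9  # "nine thousand billion", assuming billion = 10**9
PROCESSING_DAYS = 560

days_per_sample = PROCESSING_DAYS / SAMPLES
raw_letters_per_sample = RAW_LETTERS / SAMPLES
raw_letters_per_genome_letter = raw_letters_per_sample / GENOME_LETTERS

print(f"Processing time per sample: about {days_per_sample:.1f} days")
print(f"Raw letters per sample: about {raw_letters_per_sample:.1e}")
print(f"Raw letters per letter of final sequence: about {raw_letters_per_genome_letter:.1e}")
```

On these assumptions, each sample accounts for roughly three days of processing and tens of billions of raw letters, millions of times more data than the 18,000-letter sequence it yields.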
Dr. Matthews will be using BC4 to help with further research into Dengue fever and the Zika virus.
BC4 is also being used as part of the UK Biobank research into genomics. The data comes from real patients, whose genetics are studied to help determine possible common causes of diseases by searching through genetic structures and correlating them.
As you can imagine, that’s very compute-intensive. The UK Biobank recruited 500,000 people aged between 40 and 69 from across the country to improve the prevention, diagnosis and treatment of a wide range of life-threatening illnesses, including cancer, heart disease, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression, and forms of dementia.
BC4 will enable them to process many more datasets much more quickly.
Simon Burbidge was speaking to Jack Rudd, Senior Editor for Technology Networks.