Using Supercomputers To Understand Biomolecular Properties
The use of futuristic computers for analyzing the biomolecular structures responsible for a disease and rapidly designing a perfect cure is commonplace in science fiction. The general idea is both hopeful and well-motivated. But determining the movement and functions of biomolecular particles is not an easy task – even with today’s most powerful supercomputers.
Scientists at the University of Chicago are studying the function and movement of atoms in biological systems. The team is modifying its code to run on the future Intel-HPE Aurora supercomputer, which is expected to deliver more than two exaflops of peak double-precision compute performance. Aurora will be located at the U.S. Department of Energy’s (DOE) Argonne National Laboratory. The research group is supported by the Argonne Leadership Computing Facility’s (ALCF) Aurora Early Science Program (ESP).
Dr. Benoît Roux of the University of Chicago, Principal Investigator on the ESP project, explains, “Our team is running computer simulations on a pre-production supercomputer using Intel hardware and software that will be on the future Aurora supercomputer. The system includes pre-production Intel Graphics Processing Units (GPUs).”
Roux states, “The goal of our ESP project is to develop new technologies to simulate virtual models of biomolecular systems with an unprecedented accuracy. As we move to running computer simulations on an exascale supercomputer such as the future Aurora system, our hope is that we are moving toward a rational understanding of biological systems.”
Defining the nature and movement of biochemical atomic particles
Molecules follow the laws of physics, thermodynamics and chemistry in the ways they behave and move. Roux indicates that different parts of living cells have structures such as membranes, proteins and enzymes. For example, membranes are thin sheaths formed by lipids that separate the various compartments of the cell. Membranes are typically 30 or 40 angstroms thick, but some membrane proteins are large and can be 100 angstroms wide. Cell membranes communicate and signal what is occurring outside the cell. Proteins typically traverse the membrane and have functions such as pumping chemicals across it or controlling the passage of different substances.
Molecular biochemistry has so many aspects that a computer model provides only an approximation of how the molecules move and behave. Biology is very complex, and unexpected factors can be discovered as the research proceeds. Roux indicates that it is important to compare the results of computer simulations with real-world experiments done in laboratories or clinical trials.
“We use such all-atom molecular dynamics simulations to rigorously compute conformational free energies and binding free energies. We are particularly interested in understanding the function of biomolecular systems. We are also developing new computational approaches (polarizable force field, solvent boundary potentials, efficient sampling methods) for studying biological macromolecular systems,” states Roux.
Biomolecular force field energy case study
The team’s ESP research focuses on using supercomputer simulations to determine the free energy landscapes underlying the function of two large membrane transporters. The team studies the Ca2+ ATPase (SERCA) and the P-glycoprotein multidrug resistance transporter (PGP) [1-6]. The goal is to gain a more complete understanding of their mechanisms by quantitatively characterizing the transition pathways and the free energy landscapes that govern motions along them.
SERCA and PGP represent two broad classes of transport proteins. Both proteins use adenosine triphosphate (ATP) hydrolysis as the energy source for transport, which involves complex conformational transitions that are tightly coupled to ATP/ADP binding, unbinding and hydrolysis. PGP is a biomedically important member of the largest superfamily of ATP-binding cassette transporters and mediates multi-drug resistance in many cancer types. PGP is a well-characterized membrane transporter that can efflux drug molecules out of the cancer cell, which reduces the efficiency of chemotherapy. Cancer cells upregulate PGP expression as an adaptive response to evade chemotherapy-mediated cell death.
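To give a sense of what a free energy landscape means numerically, the sketch below applies the textbook Boltzmann relation F = -kB T ln(p) to a histogram of how often a system visits each region along some coordinate. This is only an illustration under stated assumptions: the function name, the 300 K temperature and the sample counts are hypothetical, and the team’s actual calculations rely on far more sophisticated sampling methods than a direct histogram.

```python
import math

# Boltzmann constant in kcal/(mol*K), the energy unit commonly used in biomolecular work.
KB_KCAL_PER_MOL_K = 0.0019872041


def free_energy_profile(counts, temperature_k=300.0):
    """Convert histogram counts along a coordinate into relative free energies.

    Applies F_i = -kB * T * ln(p_i), where p_i is the fraction of samples in
    bin i, then shifts the profile so the most populated bin sits at zero.
    Bins that were never visited get math.inf because no estimate exists.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one sample")
    kt = KB_KCAL_PER_MOL_K * temperature_k
    raw = [(-kt * math.log(c / total)) if c > 0 else math.inf for c in counts]
    lowest = min(raw)
    return [value - lowest for value in raw]


if __name__ == "__main__":
    # Hypothetical counts: two well-populated states separated by a rarely visited region.
    hypothetical_counts = [100, 10, 0, 100]
    for bin_index, energy in enumerate(free_energy_profile(hypothetical_counts)):
        print(f"bin {bin_index}: {energy:.2f} kcal/mol")
```

A bin visited ten times less often than the most populated one sits roughly 1.4 kcal/mol higher at 300 K, which is why rarely visited transition regions are so expensive to characterize by direct simulation.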
Figure 1 shows an example of these structures. The results of this research can provide important information relating to multi-drug resistance in cancer.
Figure 1. Membrane bound structures for the two transporter proteins investigated in the research: the Ca2+ ATPase pump SERCA (left) and multidrug transporter PGP (right). Courtesy of Dr. Roux, University of Chicago.
Software used in the biomolecular research
The team performs large-scale molecular dynamics (MD) simulations using the Nanoscale Molecular Dynamics (NAMD) program. NAMD is a parallel MD code designed for high-performance simulation of large biomolecular systems. It supports biological research by resolving the dynamics of cellular processes at atomic and sub-nanosecond resolution, a level of detail that experimental methods cannot achieve.
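At its core, an MD simulation repeatedly computes the forces on every atom and advances positions and velocities by a very small time step. The sketch below shows that loop for a single particle on a spring, using the standard velocity Verlet scheme. It is a toy illustration, not NAMD’s implementation: the function names, the unit mass and spring constant, and the step size are all hypothetical choices.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance one particle with the velocity Verlet integrator.

    x, v    : initial position and velocity
    force   : function returning the force at a given position
    mass    : particle mass
    dt      : time step
    n_steps : number of integration steps
    """
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
    return x, v


def harmonic_force(x, k=1.0):
    """Restoring force of an ideal spring (a stand-in for a real force field)."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the particle on the spring."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


if __name__ == "__main__":
    x_end, v_end = velocity_verlet(1.0, 0.0, harmonic_force, mass=1.0, dt=0.01, n_steps=1000)
    print(f"position {x_end:.4f}, velocity {v_end:.4f}, energy {total_energy(x_end, v_end):.6f}")
```

In a production code the force evaluation covers millions of interacting atoms rather than one spring, and that step is the part that is distributed across processors and GPUs.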
Preparing to run on the Aurora supercomputer
The Roux team started migrating the NAMD code for Aurora on the Joint Laboratory for System Evaluation testbed systems at Argonne. Aurora will incorporate new Intel technologies such as the Intel Xe-HPC GPUs (codename Ponte Vecchio) and the Next Generation Intel Xeon Scalable processor (codename Sapphire Rapids), both equipped with high bandwidth memory designed to improve memory usage. The team uses SYCL compiled by the Data Parallel C++ (DPC++) compiler, which is part of the Intel-led cross-industry oneAPI initiative designed to unify and simplify application development across diverse computing architectures.
Wei Jiang, who was previously a postdoctoral researcher in the Roux lab, is now a computational scientist at Argonne and part of ALCF’s Catalyst team. He is using SYCL compiled by Intel’s DPC++ compiler to help port the CUDA code to run on the Intel GPU. Jiang indicates that using oneAPI will also make it easier for developers to modify code to run across a variety of systems.
Jiang indicates that the ALCF and Intel team started with the existing NAMD CUDA GPU code. The team used the oneAPI tools to convert that code into C++-based kernels that can run on Intel Xe-HPC GPUs. The NAMD development effort will improve cross-platform support by porting the CUDA kernels to SYCL with help from the Intel DPC++ Compatibility Tool. The Intel VTune profiler is used to help improve GPU utilization and the overall performance of the new SYCL kernels.
Jiang indicates that Intel provides workshops and support to the ALCF team. The team currently has access to the Intel oneMKL library, and Intel engineers help debug issues in code designed to run on the future Aurora exascale system.
Jiang states, “The oneAPI tools are very convenient because they contain a complete compiler and linker. In addition, the Intel VTune profiler is included which aids in resolving performance issues. oneAPI is designed to work across various GPUs which minimizes the task of writing code for various architectures.”
Future research on complex biologic systems
“Current supercomputers can simulate a few hundreds of microseconds for a moderate size biological system, but they are still limited. Molecular motions take place over a broad range of time scales, going from a fraction of picoseconds up to milliseconds. Much of the biologically relevant dynamics occurs within the range of microsecond-to-millisecond. Existing supercomputers cannot literally simulate the function of a system as complex [as] an ATP-driven Ca2+ pump. The new frontier for future supercomputers requires theoretical advances to build on the information provided by the simulations to understand the function of complex biological systems and accurately determine what is occurring in the system. Ultimately, the goal is to rapidly find answers relating to disease or drug development,” states Roux.
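The scale of the problem Roux describes becomes clearer with some simple arithmetic. The sketch below counts how many integration steps are needed to cover a given span of simulated time and how long that would take at a given throughput. The 2 femtosecond step is an assumption (a commonly used value for all-atom MD, not a figure from the team), and the function names and the 100 nanoseconds-per-day throughput in the example are purely hypothetical.

```python
# Number of femtoseconds in each supported time unit.
FS_PER_UNIT = {"fs": 1, "ps": 10**3, "ns": 10**6, "us": 10**9, "ms": 10**12}


def steps_needed(duration, unit, timestep_fs=2):
    """Return the number of MD steps needed to cover `duration` of simulated time.

    duration    : integer amount of simulated time
    unit        : one of "fs", "ps", "ns", "us", "ms"
    timestep_fs : integration step in femtoseconds (2 fs is an assumed typical value)
    """
    total_fs = duration * FS_PER_UNIT[unit]
    return -(-total_fs // timestep_fs)  # ceiling division on integers


def wall_clock_days(duration, unit, ns_per_day):
    """Return the days of computing needed at a given throughput in ns/day."""
    duration_ns = duration * FS_PER_UNIT[unit] / FS_PER_UNIT["ns"]
    return duration_ns / ns_per_day


if __name__ == "__main__":
    print("steps for 100 microseconds:", steps_needed(100, "us"))
    print("steps for 1 millisecond:   ", steps_needed(1, "ms"))
    # Hypothetical throughput of 100 ns of simulated time per day of computing.
    print("days for 1 millisecond at 100 ns/day:", wall_clock_days(1, "ms", 100))
```

Under these assumptions a single millisecond of simulated time takes 500 billion steps, each requiring a full force evaluation over every atom, which is why brute-force simulation alone cannot reach the slowest biologically relevant motions.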
The ALCF is a DOE Office of Science User Facility.
1. Radak BK, Chipot C, Suh D, et al. Constant-pH molecular dynamics simulations for large biomolecular systems. J Chem Theory Comput. 2017;13(12):5933-5944. doi: 10.1021/acs.jctc.7b00875
2. Jiang W, Chipot C, Roux B. Computing relative binding affinity of ligands to receptor: an effective hybrid single-dual-topology free-energy perturbation approach in NAMD. J Chem Inf Model. 2019;59(9):3794-3802. doi: 10.1021/acs.jcim.9b00362
3. Das A, Rui H, Nakamoto R, Roux B. Conformational transitions and alternating-access mechanism in the sarcoplasmic reticulum calcium pump. J Mol Biol. 2017;429(5):647-666. doi: 10.1016/j.jmb.2017.01.007
4. Thirman J, Rui H, Roux B. Elusive intermediate state key in the conversion of ATP hydrolysis into useful work driving the Ca2+ pump SERCA. J Phys Chem B. 2021;125(11):2921-2928. doi: 10.1021/acs.jpcb.1c00558
5. Verhalen B, Dastvan R, Thangapandian S, et al. Energy transduction and alternating access of the mammalian ABC transporter P-glycoprotein. Nature. 2017;543(7647):738-741. doi: 10.1038/nature21414
6. Kapoor K, Pant S, Tajkhorshid E. Active participation of membrane lipids in inhibition of multidrug transporter P-glycoprotein. Chem Sci. 2021;12(18):6293-6306. doi: 10.1039/D0SC06288J
7. Phillips JC, Hardy DJ, Maia JDC, et al. Scalable molecular dynamics on CPU and GPU architectures with NAMD. J Chem Phys. 2020;153(4):044130. doi: 10.1063/5.0014475
This article was produced as part of Intel’s editorial program, with the goal of highlighting cutting-edge science, research and innovation driven by the HPC and AI communities through advanced technology. The publisher of the content has final editing rights and determines what articles are published.