GROMACS is a powerful and widely used software package for molecular dynamics modelling. Simulating large dynamical systems (hundreds of millions of particles) quickly and efficiently requires extensive optimisation for the specific hardware being used. One of our Research Software Engineers, Robin Long, has been working with Stian Soiland-Reyes in The University of Manchester’s eScience Lab to develop Conda packages of GROMACS for the BioExcel2 project. The aim is to reduce the effort users need to set up and use the software themselves, and to enable its reproducible use from computational workflow systems such as Common Workflow Language.
Using Conda, an open source package and environment management system, has enabled us to prepare consistent packages for both macOS and Linux. Packages are installed as pre-compiled binaries together with all the necessary system-specific dependencies, which simplifies installation for users. Conda has a number of _software channels_ for sharing packages. The main channel for user-provided packages is `conda-forge`, but the packages we have released are hosted on the biology-specific channel `bioconda`. BioConda packages are also accompanied by automatically built BioContainers, which can be used with Docker and Singularity.
BioConda uses a set of _build recipes_, with CircleCI as the build system. CircleCI is a free cloud service traditionally meant for Continuous Integration testing of source code, but now also frequently used for Continuous Delivery of applications and cloud services. For third parties like BioExcel, builds are triggered automatically through GitHub pull requests, making it relatively easy to contribute new recipes to BioConda.
When Conda compiles software like GROMACS, the compiler may need to be configured with specific hardware support flags, as the built binaries will be distributed to a wide range of consumers whose hardware differs from the CircleCI build nodes. For this simulation software, however, it is not sufficient to just assume “Linux x64”: we want to avoid the build inadvertently depending on hardware features that older HPC clusters might lack, while configuring for the lowest common denominator would mean a loss of performance on newer compute nodes.
SIMD (Single Instruction, Multiple Data) is a parallel computing capability of modern CPUs, used for instance for matrix calculations, but it comes in different instruction set variants. MPI (Message Passing Interface) is a standard for parallel computing across multiple HPC nodes, which handles synchronisation and message passing between processes. GROMACS uses it to simulate the physics of different parts of the molecular system separately, and then synchronise at the boundaries and at each time step. We built multiple GROMACS binaries, optimised for a number of SIMD types (SSE2, AVX_256 and AVX2_256), with and without MPI support. The final Conda package contains all these binaries in parallel folders like `/bin.SSE2/`, along with an _environment activation_ script that at runtime modifies the execution PATH depending on the detected CPU support.
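The runtime selection described above can be sketched roughly as follows. This is only an illustration of the logic in Python, not the activation script shipped in the package (which is not reproduced here). The function names, the mapping from SIMD variant to CPU flag, the use of `/proc/cpuinfo` for detection, and the fallback to SSE2 are all assumptions made for the example; only the `bin.<SIMD>` folder naming and the three SIMD types come from the text.

```python
import os

# SIMD variants named in the article, ordered from most to least capable.
# The CPU flag names are those Linux reports in /proc/cpuinfo; pairing each
# variant with a single flag is a simplifying assumption for this sketch.
SIMD_PREFERENCE = [
    ("AVX2_256", "avx2"),
    ("AVX_256", "avx"),
    ("SSE2", "sse2"),
]


def read_cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Return the set of CPU feature flags reported by a Linux kernel."""
    try:
        with open(cpuinfo_path) as handle:
            for line in handle:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # file missing or unreadable (e.g. not running on Linux)
    return set()


def select_simd(cpu_flags):
    """Pick the most capable SIMD variant that the CPU flags allow."""
    for variant, flag in SIMD_PREFERENCE:
        if flag in cpu_flags:
            return variant
    return "SSE2"  # assumed lowest-common-denominator fallback


def activated_path(prefix, cpu_flags, current_path):
    """Prepend the matching bin.<SIMD> folder to a PATH-style string."""
    bin_dir = os.path.join(prefix, "bin." + select_simd(cpu_flags))
    return bin_dir + os.pathsep + current_path


if __name__ == "__main__":
    # CONDA_PREFIX is set by Conda in an activated environment.
    prefix = os.environ.get("CONDA_PREFIX", "/opt/conda")
    print(activated_path(prefix, read_cpu_flags(), os.environ.get("PATH", "")))
```

Checking variants from most to least capable means a newer CPU gets the fastest binary it supports, while an older node still ends up with a binary it can run.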
Because of CircleCI build time limitations, and the need to recompile for each variant, we could not support all possible SIMD types or MPI implementations. Instead, working with the GROMACS developers, we focussed on supporting the SIMD types and MPI implementation that would prove most useful to the majority of users, while still enabling us to use a single build process that fits within the CircleCI limitations. This has now been merged into the main BioConda channel and a single GROMACS version with optional MPI support is now available.
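To give a sense of why build time becomes a constraint, the supported variants can be enumerated as a small build matrix. The sketch below is illustrative only: `GMX_SIMD` and `GMX_MPI` are GROMACS CMake options, but whether the BioConda recipe passes them in exactly this form is an assumption, and the helper names are hypothetical.

```python
from itertools import product

# SIMD types and MPI choices named in the article.
SIMD_TYPES = ["SSE2", "AVX_256", "AVX2_256"]
MPI_OPTIONS = [False, True]


def cmake_flags(simd, mpi):
    """CMake options for one variant (how the recipe sets them is assumed)."""
    return [f"-DGMX_SIMD={simd}", f"-DGMX_MPI={'ON' if mpi else 'OFF'}"]


def build_matrix():
    """List every SIMD x MPI combination that needs its own compilation."""
    return [
        {
            "simd": simd,
            "mpi": mpi,
            "bin_dir": f"bin.{simd}",  # folder naming follows the article
            "flags": cmake_flags(simd, mpi),
        }
        for mpi, simd in product(MPI_OPTIONS, SIMD_TYPES)
    ]


if __name__ == "__main__":
    for variant in build_matrix():
        print(variant["bin_dir"], "MPI" if variant["mpi"] else "no MPI",
              " ".join(variant["flags"]))
```

Three SIMD types with and without MPI already require six full compilations, so each additional SIMD type or MPI implementation adds further builds to the same time budget.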
The packages are configured so that the non-MPI package is installed by default:
conda install gromacs -c conda-forge -c bioconda
To install the MPI package you need to specifically reference the start of the build string:
conda install 'gromacs=*=mpi*' -c conda-forge -c bioconda
These commands will install the latest version of GROMACS in BioConda (currently 2021.1).
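The MPI command works because the third field of the package specification is a pattern matched against the build string. The sketch below mimics that matching with Python's `fnmatch` to show why the pattern must anchor on the start of the build string. It is a simplification, not Conda's actual matching code, and the build strings used are hypothetical examples rather than the real ones published on the channel.

```python
from fnmatch import fnmatchcase


def split_spec(spec):
    """Split 'name=version=build' into three fields, filling gaps with '*'."""
    fields = spec.split("=")
    return tuple(fields + ["*"] * (3 - len(fields)))


def build_matches(spec, build_string):
    """Check a build string against the build pattern of a spec (simplified)."""
    _name, _version, build_pattern = split_spec(spec)
    return fnmatchcase(build_string, build_pattern)


if __name__ == "__main__":
    # Hypothetical build strings, for illustration only.
    for build in ["nompi_h1234567_0", "mpi_openmpi_h1234567_0"]:
        print(build, build_matches("gromacs=*=mpi*", build))
```

A pattern such as `*mpi*` would also match a build string beginning with `nompi`, which is why the specification refers to the start of the build string.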
More details on how the package was created are available in the blog post "Creating hardware optimised Conda Packages".
Future development work will focus on adding GPU support for GROMACS in these precompiled packages. The BioConda distribution of GROMACS is compiled with OpenCL support, which can be used with many types of GPUs by installing their corresponding bindings. However, greater optimisation of GROMACS performance is possible when the built-in CUDA support for machines with NVIDIA GPUs is enabled. Although the BioConda build system does not yet support NVIDIA’s CUDA libraries, the Conda-Forge build system does, so future work will examine a transition to Conda-Forge.
Stian Soiland-Reyes, The University of Manchester’s deputy work package leader in BioExcel, adds: "With the effort from Research IT we have been able to optimise the packaging of high performance simulation code for reproducible use from workflows across compute architectures on HPC and cloud. We have been able to address the challenge of optimization across 4 different dimensions (GPU, SIMD, MPI, OS), which produce different binaries and hardware requirements, because with our Conda approach we were able to unify their installation and usage for the benefit of users and system administrators."