Gromacs 2018.4 (CPU & GPU versions)

Overview

GROMACS is a package for computing molecular dynamics, simulating Newtonian equations of motion for systems with hundreds to millions of particles. GROMACS is designed for biochemical molecules with complicated bonded interactions (e.g. proteins, lipids, nucleic acids) but can also be used for non-biological systems (e.g. polymers).

Please do not add the -v flag to your mdrun command.

It will write to a log file every second for the duration of your job and can lead to severe overloading of the file servers.

Significant Change in this Version

Within Gromacs 2018, the different gromacs commands (e.g., mdrun, grompp, g_hbond) should now be run using the command:

gmx command

where command is the name of the command you wish to run (without any g_ prefix), for example:

gmx mdrun

The gmx executable changes its name to reflect the GROMACS flavour being used, but the command name that follows it does not change. For example, when running mdrun:

# New 2018.4 method                  # Previous 5.0.4 method
# =================                  # =====================
gmx   mdrun                          mdrun
gmx_d mdrun                          mdrun_d
mpirun -n $NSLOTS gmx_mpi   mdrun    mpirun -n $NSLOTS mdrun_mpi
mpirun -n $NSLOTS gmx_mpi_d mdrun    mpirun -n $NSLOTS mdrun_mpi_d
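The mapping in the comparison above is mechanical, so it can be sketched as a small shell function. This is only an illustration: old_to_new is a hypothetical helper name, not something supplied by GROMACS or the modulefiles, and it assumes an old-style name carries at most one flavour suffix (_d, _mpi or _mpi_d) plus an optional g_ prefix.

```shell
# Hypothetical helper (not part of GROMACS or the modulefiles):
# translate an old-style command name into the new "gmx" form.
old_to_new() {
    local name="$1"
    local flavour=""
    # Peel the flavour suffix off the old name, longest suffix first.
    case "$name" in
        *_mpi_d) flavour="_mpi_d"; name="${name%_mpi_d}" ;;
        *_mpi)   flavour="_mpi";   name="${name%_mpi}" ;;
        *_d)     flavour="_d";     name="${name%_d}" ;;
    esac
    # Drop the g_ prefix used by some older tools (e.g. g_hbond).
    name="${name#g_}"
    echo "gmx${flavour} ${name}"
}

old_to_new mdrun_mpi_d   # prints: gmx_mpi_d mdrun
old_to_new g_hbond       # prints: gmx hbond
```

The flavour moves from the end of the command name to the gmx executable, which is the only change a jobscript needs.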

The complete list of command names can be found by running the following on the login node:

gmx help commands
# The following commands are available:
anadock			gangle			rdf
anaeig			genconf			rms
analyze			genion			rmsdist
angle			genrestr		rmsf
awh			grompp			rotacf
bar			gyrate			rotmat
bundle			h2order			saltbr
check			hbond			sans
chi			helix			sasa
cluster			helixorient		saxs
clustsize		help			select
confrms			hydorder		sham
convert-tpr		insert-molecules	sigeps
covar			lie			solvate
current			make_edi		sorient
density			make_ndx		spatial
densmap			mdmat			spol
densorder		mdrun			tcaf
dielectric		mindist			traj
dipoles			mk_angndx		trajectory
disre			morph			trjcat
distance		msd			trjconv
do_dssp			nmeig			trjorder
dos			nmens			tune_pme
dump			nmtraj			vanhove
dyecoupl		order			velacc
dyndom			pairdist		view
editconf		pdb2gmx			wham
eneconv			pme_error		wheel
enemat			polystat		x2top
energy			potential		xpm2ps
filter			principal
freevolume		rama

Notice that the command names do NOT start with g_ and do NOT reference the flavour being run (e.g., _mpi_d). Only the main gmx command changes its name to reflect the flavour (see the table of modulefiles below for the full list of flavours available).

To obtain more help about a particular command run:

gmx help command

For example

gmx help mdrun

Helper scripts

To assist with moving to the new command calling method, we have recreated some of the individual commands that you may have used in your jobscript. For example, you can continue to use mdrun (or mdrun_d) instead of the new gmx mdrun (or gmx_d mdrun) in this release. These extra commands are automatically included in your environment when you load the gromacs modulefiles. This old method uses the flavour of gromacs in the command name (see above for comparison of new and old commands).

However, please note that some commands are new to 2018.4 and so can only be run using the new method (gmx command).

Available Flavours

For version 2018.4 we have compiled multiple versions of Gromacs, each of which is optimised for a particular CPU architecture. We have also built versions with GPU support (note that GPU versions of Gromacs only support single precision). The module file has been written to detect which CPU the compute node is using and to automatically select the correct Gromacs executable. If you want to ensure you get a particular level of optimisation, specify an architecture in the jobscript, e.g. -l skylake.

2018.4 for Ivybridge (and Haswell, Broadwell and Skylake nodes) only

With AVX optimisation.

2018.4 for Haswell and Broadwell (and Skylake) nodes only

With AVX2 optimisation.

2018.4 for Skylake nodes only

With AVX-512 optimisation.

2018.4 for Skylake nodes with GPU acceleration

With AVX-512 optimisation and with GPU acceleration turned on. Note that only single precision versions are available with GPU acceleration.
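To see which of the optimisation levels listed above a node could use, one option is to inspect the CPU feature flags that Linux reports. The sketch below is not the modulefile's own detection logic (which is not shown on this page). simd_level is a hypothetical helper that assumes the standard Linux flag names avx, avx2 and avx512f.

```shell
# Hypothetical helper: report the highest AVX level present in a
# space-separated list of CPU flags (as found in /proc/cpuinfo).
simd_level() {
    # Pad with spaces so each flag is matched as a whole word.
    local flags=" $1 "
    case "$flags" in
        *" avx512f "*) echo "AVX-512" ;;
        *" avx2 "*)    echo "AVX2" ;;
        *" avx "*)     echo "AVX" ;;
        *)             echo "none" ;;
    esac
}

# On a Linux node, pass in the first "flags" line from /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
    simd_level "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
fi
```

Running this inside a job shows which type of node the job landed on when no architecture was requested.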

Restrictions on use

GROMACS is free software, available under the GNU General Public License.

Set up procedure

You must load the appropriate modulefile:

module load modulefile

replacing modulefile with one of the modules listed in the table below. The module file will auto-detect and pick a version of Gromacs with AVX optimisations to match the CPU of the compute node(s) you are assigned.

 

Version | Modulefile | Notes | Typical executable name
Single precision multi-threaded (single-node) | apps/intel-17.0/gromacs/2018.4/single | non-MPI, with GPU acceleration available | mdrun or gmx mdrun
Double precision multi-threaded (single-node) | apps/intel-17.0/gromacs/2018.4/double | non-MPI | mdrun_d or gmx_d mdrun
Single precision MPI | apps/intel-17.0/gromacs/2018.4/single_mpi | For MPI | mdrun_mpi or gmx_mpi mdrun
Double precision MPI | apps/intel-17.0/gromacs/2018.4/double_mpi | For MPI | mdrun_mpi_d or gmx_mpi_d mdrun
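The modulefile names follow a regular pattern, so a jobscript can build the name from two choices: precision, and whether MPI is used. The following is a minimal sketch. gromacs_modulefile and the labels "threads" and "mpi" are hypothetical names chosen for this illustration, and only the four modulefiles listed above are covered.

```shell
# Hypothetical helper: build the modulefile name from the table above.
#   $1 = single | double
#   $2 = threads (non-MPI, single node) | mpi
gromacs_modulefile() {
    local precision="$1"
    local parallel="$2"
    local suffix
    case "$precision" in
        single|double) suffix="$precision" ;;
        *) echo "precision must be single or double" >&2; return 1 ;;
    esac
    case "$parallel" in
        mpi)     suffix="${suffix}_mpi" ;;
        threads) ;;
        *) echo "second argument must be threads or mpi" >&2; return 1 ;;
    esac
    echo "apps/intel-17.0/gromacs/2018.4/${suffix}"
}

gromacs_modulefile double mpi   # prints: apps/intel-17.0/gromacs/2018.4/double_mpi
# In a jobscript: module load "$(gromacs_modulefile single threads)"
```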

Running the application

Please do not run GROMACS on the login node.

Important notes regarding running jobs in batch

We now recommend that the relevant module file (single/double non-MPI/MPI) is loaded as part of your batch script.  The module file itself will select the most suitable build of Gromacs for the processor architecture you end up running on.

Please NOTE the following which is important for running jobs correctly and efficiently:

Ensure you inform gromacs how many cores it can use. This is done using the $NSLOTS variable, which is automatically set for you in the jobscript to the number of cores you request in the jobscript header (see later for complete examples). You can use either of the following methods depending on whether you want a multi-core job (running on a single compute node) or a larger job running across multiple compute nodes:

# Multi-core (single-node) or Multi-node MPI jobs

mpirun -n $NSLOTS mdrun_mpi         # Old method (v5.0.4 and earlier)
mpirun -n $NSLOTS mdrun_mpi_d       # Old method (v5.0.4 and earlier)

mpirun -n $NSLOTS gmx_mpi mdrun     # New method (v5.1.4 and later)
mpirun -n $NSLOTS gmx_mpi_d mdrun   # New method (v5.1.4 and later)

or

# Single-node multi-threaded job

export OMP_NUM_THREADS=$NSLOTS      # Do this for all versions
mdrun                               # Old method (v5.0.4 and earlier)
mdrun_d                             # Old method (v5.0.4 and earlier)

export OMP_NUM_THREADS=$NSLOTS      # Do this for all versions
gmx mdrun                           # New method (v5.1.4 and later)
gmx_d mdrun                         # New method (v5.1.4 and later)

# Single-node multi-threaded job with GPU acceleration.

The examples below can be used for single precision or double precision gromacs. Simply run mdrun (single precision) or mdrun_d (double precision).

Please do not add the -v flag to your mdrun command.

It will write to a log file every second for the duration of your job and can lead to severe overloading of the file servers.
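A jobscript can guard against the -v flag slipping in by accident. The sketch below is one possible check, with uses_verbose_flag being a hypothetical helper name. It only looks for -v as a separate word on the command line, so it does not catch a flag hidden inside a variable or a wrapper script.

```shell
# Hypothetical helper: succeed if any argument is exactly "-v".
uses_verbose_flag() {
    local word
    for word in "$@"; do
        if [ "$word" = "-v" ]; then
            return 0
        fi
    done
    return 1
}

# Example: check a command line before running it.
if uses_verbose_flag gmx mdrun -deffnm md -v; then
    echo "Remove -v from the mdrun command before submitting"
fi
```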

Multi-threaded single-precision on Intel nodes, 2 to 32 cores

Note that GROMACS 2018.4 (unlike v4.5.4) does not support the -nt flag to set the number of threads when using the multithreaded OpenMP (non-MPI) version. Instead set the OMP_NUM_THREADS environment variable as shown below.

An example batch submission script to run the single-precision mdrun executable with 12 threads:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 12            # Can specify 2 to 32 cores in smp.pe
                           
module load apps/intel-17.0/gromacs/2018.4/single
export OMP_NUM_THREADS=$NSLOTS
mdrun
  #
  # This is the old naming convention (it will still work in this release)
  # The new gromacs convention is to run: gmx mdrun

Submit with the command: qsub scriptname

The system will run your job on an Ivybridge, Haswell, Broadwell or Skylake node depending on the number of cores requested and what is available. Not specifying an architecture (recommended) means that your job will start as soon as any type of compute node that can accommodate it becomes available (it gives the job the biggest pool of nodes to target). To get a more optimised run on Haswell/Broadwell you should specify the architecture you require, but note that your job may take longer to start, as specifying an architecture reduces the size of the pool that the system can target.

Multi-threaded double-precision on Intel nodes, 2 to 32 cores

An example batch submission script to run the double-precision mdrun_d executable with 24 threads:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 24
module load apps/intel-17.0/gromacs/2018.4/double
export OMP_NUM_THREADS=$NSLOTS
mdrun_d
  #
  # This is the old naming convention (it will still work in this release)
  # The new gromacs convention is to run: gmx_d mdrun

Submit with the command: qsub scriptname

Single precision MPI (single-node), 2 to 32 cores

An example batch submission script to run the single-precision mdrun_mpi executable on 8 cores using MPI:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8            
module load apps/intel-17.0/gromacs/2018.4/single_mpi                                          
mpirun -n $NSLOTS mdrun_mpi

Submit with the command: qsub scriptname

Double precision MPI (single-node), 2 to 32 cores

An example batch submission script to run the double-precision mdrun_mpi_d executable on 8 cores using mpi:

#!/bin/bash --login
#$ -cwd
#$ -V
#$ -pe smp.pe 8
module load apps/intel-17.0/gromacs/2018.4/double_mpi                                           
mpirun -n $NSLOTS mdrun_mpi_d
  #
  # This is the old naming convention (it will still work in this release)
  # The new gromacs convention is to run: mpirun -n $NSLOTS gmx_mpi_d mdrun

Submit with the command: qsub scriptname

Single-precision, MPI, 48 cores or more in multiples of 24

An example batch submission script to run the single precision mdrun_mpi executable with 48 MPI processes (48 cores on two 24-core nodes) with the mpi-24-ib.pe parallel environment (Intel Haswell nodes using infiniband):

#!/bin/bash --login
#$ -cwd
#$ -pe mpi-24-ib.pe 48           # EG: Two 24-core Intel Haswell nodes
module load apps/intel-17.0/gromacs/2018.4/single_mpi
mpirun -n $NSLOTS gmx_mpi mdrun
  #
  # This is the new naming convention
  # The old convention will still work in this release: mpirun -n $NSLOTS mdrun_mpi

Submit with the command: qsub scriptname

Double-precision, MPI, 48 cores or more in multiples of 24

An example batch submission script to run the double precision mdrun_mpi_d executable with 48 MPI processes (48 cores on two 24-core nodes) with the mpi-24-ib.pe parallel environment (Intel Haswell nodes using infiniband):

#!/bin/bash --login
#$ -cwd
#$ -pe mpi-24-ib.pe 48           # EG: Two 24-core Intel Haswell nodes
module load apps/intel-17.0/gromacs/2018.4/double_mpi
mpirun -n $NSLOTS gmx_mpi_d mdrun

Submit with the command: qsub scriptname
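Both multi-node examples above require 48 or more cores in multiples of 24. A quick arithmetic check before submitting can catch a mistyped core count. This is a sketch only, with valid_mpi24_cores being a hypothetical helper name. The batch system applies its own checks when the job is submitted.

```shell
# Hypothetical helper: succeed if the core count suits mpi-24-ib.pe,
# i.e. it is at least 48 and a whole number of 24-core nodes.
valid_mpi24_cores() {
    local cores="$1"
    [ "$cores" -ge 48 ] && [ $((cores % 24)) -eq 0 ]
}

for n in 24 48 60 72; do
    if valid_mpi24_cores "$n"; then
        echo "$n cores: OK ($((n / 24)) nodes)"
    else
        echo "$n cores: not valid for mpi-24-ib.pe"
    fi
done
```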

Multi-threaded single-precision on a single node with one GPU.

You must ask to be added to the relevant group to access GPUs before you can run GROMACS on them.

Please note that if you have ‘free at the point of use’ access to the GPUs then the maximum number of GPUs you can request is 2.

The maximum number of CPU cores that anyone can request is 8 per GPU.

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8             #Specify the number of CPUs, maximum of 8 per GPU.
#$ -l v100               #This requests a single GPU.

module load apps/intel-17.0/gromacs/2018.4/single
gmx mdrun -ntmpi 1 -ntomp ${NSLOTS} ...

Submit with the command: qsub scriptname

This requests 1 thread-MPI rank for the GPU and $NSLOTS (8 in this case) OpenMP threads per rank.

Multi-threaded single-precision on a single node with multiple GPUs

You must ask to be added to the relevant group to access GPUs before you can run GROMACS on them.

Please note that if you have ‘free at the point of use’ access to the GPUs then the maximum number of GPUs you can request is 2 (please therefore follow the previous example).

The maximum number of CPU cores that anyone can request is 8 per GPU requested e.g. 1 GPU and 8 cores, 2 GPUs and 16 cores.

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 16            #Specify the number of CPUs, maximum of 8 per GPU.
#$ -l v100=2             #Specify we want a GPU (nvidia_v100) node with two GPUs, maximum is 4.

module load apps/intel-17.0/gromacs/2018.4/single
gmx mdrun -ntmpi 2 -ntomp 8 ...

Here ntmpi is the number of thread-MPI ranks and ntomp is the number of OpenMP threads per rank.

For the example above, where we have requested 2 GPUs (and therefore have a maximum of 16 cores to use), sensible combinations are:

-ntmpi 2 -ntomp 8   # 1 Rank per GPU, 8 threads per rank
-ntmpi 4 -ntomp 4   # 2 Ranks per GPU, 4 threads per rank
-ntmpi 8 -ntomp 2   # 4 Ranks per GPU, 2 threads per rank
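The combinations above follow a simple rule: the number of ranks is a multiple of the number of GPUs, and ranks multiplied by threads equals the number of cores. The sketch below lists the combinations for any request. rank_thread_combos is a hypothetical helper name, and skipping combinations with fewer than 2 threads per rank is an assumption made here to match the list above.

```shell
# Hypothetical helper: list -ntmpi/-ntomp pairs for a GPU job.
#   $1 = total cores (NSLOTS), $2 = number of GPUs
rank_thread_combos() {
    local cores="$1"
    local gpus="$2"
    local per_gpu=1
    local ranks
    [ "$gpus" -ge 1 ] || return 1
    while [ $((gpus * per_gpu)) -le "$cores" ]; do
        ranks=$((gpus * per_gpu))
        # Keep only even splits with at least 2 threads per rank.
        if [ $((cores % ranks)) -eq 0 ] && [ $((cores / ranks)) -ge 2 ]; then
            echo "-ntmpi ${ranks} -ntomp $((cores / ranks))"
        fi
        per_gpu=$((per_gpu + 1))
    done
}

rank_thread_combos 16 2
```

For 16 cores and 2 GPUs this prints the same three combinations as listed above.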

If you have time to experiment, you can try each combination to see which gives the best performance. If not, use the following:

export OMP_NUM_THREADS=$((NSLOTS/NGPUS))
gmx mdrun -ntmpi ${NGPUS} -ntomp ${OMP_NUM_THREADS}
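The default split of one rank per GPU only works when the core count divides evenly by the number of GPUs and respects the limit of 8 cores per GPU. A defensive version of the calculation might look like the sketch below. gpu_mdrun_args is a hypothetical helper name, and the core and GPU counts are passed in explicitly rather than read from the environment.

```shell
# Hypothetical helper: print the -ntmpi/-ntomp arguments for one rank
# per GPU, or fail if the request breaks the rules stated above.
#   $1 = total cores, $2 = number of GPUs
gpu_mdrun_args() {
    local cores="$1"
    local gpus="$2"
    if [ "$gpus" -lt 1 ]; then
        echo "at least one GPU is required" >&2
        return 1
    fi
    if [ "$cores" -gt $((8 * gpus)) ]; then
        echo "at most 8 cores per GPU are allowed" >&2
        return 1
    fi
    if [ $((cores % gpus)) -ne 0 ]; then
        echo "cores must divide evenly between the GPUs" >&2
        return 1
    fi
    echo "-ntmpi ${gpus} -ntomp $((cores / gpus))"
}

gpu_mdrun_args 16 2   # prints: -ntmpi 2 -ntomp 8
```

In a jobscript the result could be passed to mdrun unquoted, so that the shell splits it into separate arguments.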

Submit with the command: qsub scriptname

Error about OpenMP and cut-off scheme

If you encounter the following error:

OpenMP threads have been requested with cut-off scheme Group, but these 
are only supported with cut-off scheme Verlet

then please try using the MPI version of the software. Note that it is possible to run the MPI versions on a single node (see the examples above).
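To confirm which cut-off scheme a simulation requests, you can look for the cutoff-scheme option in the .mdp input file. The sketch below is one way to do that, with mdp_cutoff_scheme being a hypothetical helper name. It assumes the option is written in lower case, as cutoff-scheme or cutoff_scheme, and the file name md.mdp in the usage comment is only an example.

```shell
# Hypothetical helper: print the cutoff-scheme value set in an .mdp
# file, or "unset" if the option does not appear.
mdp_cutoff_scheme() {
    local value
    # Take the last matching line, since a file may repeat the option.
    value=$(sed -n -E \
        's/^[[:space:]]*cutoff[-_]scheme[[:space:]]*=[[:space:]]*([A-Za-z]+).*$/\1/p' \
        "$1" | tail -n 1)
    echo "${value:-unset}"
}

# Usage (file name is an example only):
#   mdp_cutoff_scheme md.mdp
```

If this reports the group scheme, the multi-threaded (OpenMP) build will raise the error shown above.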

Further info

Updates

Dec 2018 – 2018.4 installed with AVX, AVX2 and AVX-512 support enabled and GPU builds
Oct 2018 – 2016.4 installed with AVX, AVX2 and AVX-512 support enabled and patched with Plumed 2.4.0
Oct 2018 – 2016.4 installed with AVX, AVX2 and AVX-512 support enabled
Oct 2018 – 2016.3 installed with AVX, AVX2 and AVX-512 support enabled

Last modified on January 3, 2024 at 3:14 pm by George Leaver