MPI (OpenMPI)

Overview

The Message-Passing Interface (MPI) library provides functionality to implement single-node and multi-node multi-core parallel applications. The runtime tools allow you to run applications compiled against the MPI library. The particular implementation installed and supported on CSF is OpenMPI.

NOTE: do not confuse OpenMPI with OpenMP. While they are both used to develop parallel applications, they are different technologies and offer different capabilities.

Versions available:

  • 4.1.0
  • 4.0.1 (with and without CUDA support)
  • 3.1.4
  • 3.1.3

The modulefile names below indicate which compiler was used and the version of OpenMPI. NOTE: we no longer distinguish between non-InfiniBand (slower networking) and InfiniBand (faster networking) versions – OpenMPI will use the fastest available network. Previously you may have loaded a modulefile with -ib at the end of the name; this is no longer necessary.

When to use an MPI Modulefile

The following two scenarios require an MPI modulefile:

Running Centrally Installed MPI programs

If you intend to run a centrally installed application (e.g., GROMACS) then we provide a modulefile for that application which loads the appropriate MPI modulefile. Hence you usually do not need to load an MPI modulefile from the list below yourself – the application’s modulefile will do it for you. If you are not compiling your own code or an open-source application, you can probably stop reading this page here.

Writing your own MPI programs or compiling open source programs

If writing your own parallel, MPI application, or compiling a downloaded open-source MPI application, you must load an MPI modulefile to compile the source code and also when running the executable (in a jobscript). If writing your own application you will need to amend your program to include the relevant calls to the MPI library.

The MPI modulefiles below will automatically load the correct compiler modulefile for you. This is the compiler that was used to build the MPI libraries and so you should use that same compiler to build your own MPI application. The modulefile name/path (see below) indicates which compiler will be used.
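For example, a quick check you can do on the login node (using one of the modulefiles from the lists below; any of the others behaves the same way):

module load mpi/intel-18.0/openmpi/4.1.0
module list                  # the matching Intel 18.0 compiler modulefile appears alongside OpenMPI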

For further details of the available compilers, please see the relevant compiler documentation pages.

Please contact its-ri-team@manchester.ac.uk if you require advice.

Programming with MPI is currently beyond the scope of this webpage.
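That said, as a very minimal illustration of the kind of calls involved (a sketch only, not a template for a real application), an MPI C program initialises the library, queries its rank and the total number of processes, and finalises:

/* hello_mpi.c - minimal MPI sketch (illustrative only) */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* start the MPI runtime            */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* id of this process (0..size-1)   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of MPI processes    */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                        /* shut down the MPI runtime        */
    return 0;
}

Such a program would be compiled with the mpicc wrapper and launched with mpirun, as described in the sections below.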

Process Binding

When an MPI job runs on the CSF, the MPI processes will be bound to CPU cores by default. This means that each MPI process started by your jobscript will be pinned to a specific core on a compute node and will not hop from core to core. Without process binding Linux is free to move processes around, although it tries not to – it depends on what else is running on the compute node where your job is running. Binding to a core may improve performance.

Process binding is considered a bad idea if you are not using all of the cores on a compute node: other jobs could be running on the node, and binding to specific cores may compete with those jobs. To turn off process binding where it is enabled, use the following in your jobscript:

mpirun --bind-to none ...

or, alternatively, load the following modulefile after one of the MPI modulefiles listed below:

module load mpi/nobind

A full discussion of process binding is beyond the scope of this page but please email its-ri-team@manchester.ac.uk if you want more information or advice on this. See also the mpirun man page by running

man mpirun

after you have loaded one of the modulefiles below.
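If you wish to check where your processes have been placed, OpenMPI's mpirun also accepts a --report-bindings flag, which prints the binding of each process when the job starts (the exact output format varies between OpenMPI versions). For example, in your jobscript:

mpirun --report-bindings -n $NSLOTS ./my_prog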

Restrictions on use

Code may be compiled on the login node, but aside from very short test runs (e.g., one minute on fewer than 4 cores), executables must always be run by submitting to the batch system, SGE. If you do wish to do a short test run on the login node, ensure you have first loaded one of the modulefiles listed below.
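For example, a short test run on the login node of your own code (my_prog is a placeholder name) might look like:

module load mpi/gcc/openmpi/4.1.0
mpirun -n 2 ./my_prog        # very short run, 2 processes only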

Set up procedure

Load the appropriate modulefile from the lists below. The openmpi modulefiles below will automatically load the compiler modulefile matching the version of the compiler used to build the MPI libraries.

Intel compilers

Load one of the following modulefiles:

module load mpi/intel-18.0/openmpi/4.1.0
module load mpi/intel-18.0/openmpi/4.0.1
module load mpi/intel-18.0/openmpi/3.1.4

module load mpi/intel-17.0/openmpi/4.0.1
module load mpi/intel-17.0/openmpi/3.1.3


# CUDA-aware MPI is available via
module load mpi/intel-18.0/openmpi/4.0.1-cuda
module load mpi/intel-17.0/openmpi/3.1.3-cuda

GNU compilers

Load one of the following modulefiles:

module load mpi/gcc/openmpi/4.1.0
module load mpi/gcc/openmpi/4.0.1
module load mpi/gcc/openmpi/3.1.3

# CUDA-aware MPI is available via
module load mpi/gcc/openmpi/4.0.1-cuda

Note that the default system-wide gcc compiler (v4.8.5) was used to build these libraries, hence no specific gcc compiler modulefile will be loaded.
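If you want to confirm this, you can check on the command line after loading one of the above modulefiles (a quick sanity check only):

gcc --version                # reports the system compiler, gcc 4.8.5
mpicc --showme:command       # reports the compiler command the mpicc wrapper will invoke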

Compiling the application

If you are simply running an existing application you can skip this step.

If compiling your own (or open source, for example) MPI application you should use the MPI compiler wrapper scripts mpif90, mpicc, mpiCC. These will ultimately use the compiler you selected above (Intel, PGI, GNU) but will add compiler flags to ensure the MPI header files and libraries are used during compilation (setting these flags manually can be very difficult to get right).

Example code compilations when compiling on the command-line:

mpif90 my_prog.f90 -o my_prog        # Produces a Fortran MPI executable called my_prog

mpicc my_prog.c -o my_prog           # Produces a C MPI executable called my_prog

mpiCC my_prog.cpp -o my_prog         # Produces a C++ MPI executable called my_prog
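If you want to see exactly what a wrapper will run, for example when debugging a build, OpenMPI's wrapper scripts accept a --showme option which prints the full underlying compiler command, including the MPI include and library flags, without compiling anything:

mpicc --showme my_prog.c -o my_prog     # print the compile command that would be run, do not compile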

In some build procedures you specify the compiler name using environment variables such as CC and FC. When compiling MPI code you simply use the name of the wrapper script as your compiler name. For example:

CC=mpicc FC=mpif90 ./configure --prefix=~/local/appinst

Please consult the build instructions for your application.

Parallel batch job submission

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

Intel nodes

Note that you are not restricted to using the version of MPI compiled with the Intel compiler. The GNU (gcc) compiled version or the PGI compiled version can be used on the Intel hardware.

To submit an MPI batch job to Intel nodes so that two or more nodes will be used, create a jobscript similar to the following.

#!/bin/bash --login

#$ -cwd                         # Job runs in current directory
#$ -pe mpi-24-ib.pe 48          # Number of cores must be 48 or more and a multiple of 24

## Load the required modulefile (can use intel, gcc or PGI versions)
module load mpi/intel-17.0/openmpi/3.1.3

## The variable NSLOTS sets the number of processes to match the pe core request
mpirun -n $NSLOTS ./my_prog

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
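You can then check the progress of your job with the usual SGE commands, for example:

qstat                 # list your queued and running jobs
qstat -j 123456       # more detail about a specific job (replace 123456 with your own job id)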

AMD Magny-Cours nodes

The AMD M-C nodes are not available in the CSF3.

AMD Bulldozer nodes

The AMD Bulldozer nodes are not available in the CSF3.

Small MPI jobs that require 32 cores or fewer

In these jobs, all processes will be placed on the same physical node and hence no communication over the network will take place. Instead shared-memory communication will be used, which is more efficient.

To submit an MPI batch job that requires 32 cores or fewer replace the pe line in the above scripts with:

#$ -pe smp.pe 16              # where 16 is replaced with the appropriate number of cores for your job
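For example, a complete single-node jobscript (a sketch, using the gcc build of OpenMPI and a placeholder program name my_prog) might look like:

#!/bin/bash --login

#$ -cwd                         # Job runs in current directory
#$ -pe smp.pe 16                # Single-node parallel environment, 32 cores or fewer

## Load the required modulefile
module load mpi/gcc/openmpi/4.1.0

## The variable NSLOTS sets the number of processes to match the pe core request
mpirun -n $NSLOTS ./my_prog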

Further info

Online help via the command line:

man mpif90         # for Fortran MPI
man mpicc          # for C/C++ MPI
man mpirun         # for information on running MPI executables
