The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.
OpenMPI
Overview
The MPI implementation installed and supported on CSF is Open MPI.
Versions available:
- 1.8.3
- 1.6
- 1.4.5
- 1.4.3
You will need to load one of the modulefiles (below) whether you are simply running an existing MPI application or compiling your own MPI code.
The modulefile names below indicate which compiler was used, the version of OpenMPI and whether the faster InfiniBand network hardware can be used. We provide further information about those factors below.
Centrally Installed MPI programs
If you intend to run a centrally installed application then we usually provide a modulefile for that application which loads the appropriate MPI modulefile. You will choose between the InfiniBand version and the non-InfiniBand version depending on the size of your job (InfiniBand networking is faster). Hence you often do not need to load an MPI modulefile from the list below yourself – the application’s modulefile will do it for you. You can probably stop reading this page here!
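For example, loading a centrally installed application’s modulefile will usually pull in the MPI modulefile for you (the application modulefile name below is purely illustrative, not a real CSF modulefile):

# Load a centrally installed application (name is only an example)
module load apps/intel-14.0/someapp/1.2.3

# Check which modulefiles are now loaded – an openmpi modulefile should appear in the list
module list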
Writing MPI programs
If writing your own code, it will not compile without an MPI modulefile loaded. To use MPI you will need to amend your program to include the relevant calls to the MPI library.
The MPI modulefiles below will automatically load the correct compiler modulefile for you. This is the compiler that was used to build the MPI libraries and so you should use that same compiler to build your own MPI application. The modulefile name/path (see below) indicates which compiler will be used.
Note that the choice of InfiniBand or non-InfiniBand modulefile (see below) when compiling your code does not matter. It only affects the job when you run it. Hence you can load either modulefile while developing your application. The choice of Intel-only or Intel-and-AMD modulefiles (see below) is important when compiling your code. It affects where you can run your compiled application.
Code compiled for AMD Bulldozer will only run on AMD Bulldozer nodes. Hence we do provide AMD Bulldozer-specific modulefiles.
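After loading one of the MPI modulefiles you can check which underlying compiler the wrapper scripts will invoke; Open MPI’s wrapper compilers accept a --showme option that prints the real compile command without running it (the modulefile chosen below is just one of the Intel builds listed later on this page):

# Load an MPI modulefile (one of the Intel 15.0 builds listed below)
module load mpi/intel-15.0/openmpi/1.8.3-ib

# Print the real compiler command and flags the wrappers will use
mpicc --showme
mpif90 --showme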
For further details of the available compilers please see:
Please contact its-ri-team@manchester.ac.uk if you require advice.
Programming with MPI is currently beyond the scope of this webpage. IT Services for Research run training courses in parallel programming.
InfiniBand vs GigE networking
If running a multi-node MPI job (across two or more compute nodes) then all suitable Parallel Environments (PEs) on the CSF will run your job on InfiniBand-connected compute nodes. InfiniBand networking is faster than the GigE network. Hence it is strongly recommended that you use the -ib (InfiniBand) versions of the modulefiles below. This will give much better application performance.
If running a single-node MPI job (on one compute node using some or all of its cores) then you should load the non-IB version of the modulefile (those without -ib). This is because some single-node parallel environments on the CSF run your job on compute nodes that do not have any InfiniBand hardware. Your job may fail on these nodes if the -ib modulefile is used.
Note that the choice of InfiniBand or non-InfiniBand does not matter when compiling your own MPI application. But it does affect a job when you run it.
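As a rule of thumb, the modulefile should match the parallel environment you intend to use at run time (the PE names below are the ones used in the job submission examples later on this page):

# Multi-node job in an InfiniBand PE (e.g. orte-24-ib.pe): load an -ib modulefile
module load mpi/intel-15.0/openmpi/1.8.3-ib

# Single-node job in smp.pe: load the non-IB modulefile
module load mpi/intel-15.0/openmpi/1.8.3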
Intel and Magny-Cours vs Intel-only
Some of the modulefiles below are for use on Intel-only compute nodes. The MPI tools and libraries will only run on Intel nodes. They may be more optimised for Intel hardware (although any significant performance gains will come from optimising your application code and compiling that for Intel hardware).
Some of the modulefiles below are for use on Intel and AMD Magny-Cours nodes. The MPI tools and libraries may not be as optimised for either but they will run on both types of hardware. This may be convenient if you intend to run your jobs on Intel and AMD Magny-Cours nodes – the same application and modulefile can be used for both.
Note that the choice of Intel-only or Intel-and-AMD does matter if compiling your own MPI application. The choice made before compiling your code will affect where you can run your application.
Process Binding
Some of the modulefiles below indicate they will bind an MPI process to a core by default. This means that each MPI process started by your jobscript will be pinned to a specific core on a compute node and will not hop about from core to core. Without process binding Linux is free to move processes around, although it tries not to – it depends on what else is running on the compute node where your job is running. Binding to a core may improve performance. However, this is only recommended if you are using all of the cores in a compute node, i.e., a fully populated single-node job or a multi-node IB job where you are forced to specify two or more fully populated nodes.
If no mention of process binding is indicated below then the MPI processes will not be bound to specific cores. This may reduce performance of your application if using all of the cores on a node.
Note that our use of process binding is currently a little inconsistent. In general, OpenMPI 1.8.3 will bind to cores by default (unless we’ve switched it off). OpenMPI 1.6 will not do any process binding by default. We indicate below when process binding will be used but the behaviour can be changed in all versions by adding flags to the mpirun command in your jobscript.
Process binding is considered a bad idea if not using all of the cores on a compute node. This is because other jobs could be running on the node and so binding to specific cores may compete with those jobs. To turn off process binding where it is enabled use the following in your jobscript:
mpirun --bind-to none ...
A full discussion of process binding is beyond the scope of this page but please email its-ri-team@manchester.ac.uk if you want more information or advice on this. See also the mpirun man page by running
man mpirun
after you have loaded one of the modulefiles below.
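As a reference for the mpirun flags, the binding syntax differs between the Open MPI versions installed here; a minimal sketch follows (check man mpirun for the version you actually have loaded):

# Open MPI 1.8.x syntax
mpirun --bind-to core -n $NSLOTS ./my_prog    # bind each MPI process to a core
mpirun --bind-to none -n $NSLOTS ./my_prog    # turn binding off

# Open MPI 1.6.x syntax
mpirun --bind-to-core -n $NSLOTS ./my_prog    # bind each MPI process to a core
mpirun --bind-to-none -n $NSLOTS ./my_prog    # turn binding off (usually the default)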
Restrictions on use
Code may be compiled on the login node, but aside from very short test runs (e.g., one minute on fewer than 4 cores), executables must always be run by submitting to the batch system, SGE. If you do wish to do a short test run on the login node, ensure you have loaded the non-InfiniBand Intel version of one of the modulefiles listed below.
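For example, a very short login-node test might look like the following (the modulefile, core count and program name are illustrative; keep such runs to a minute or so):

# Load a non-InfiniBand Intel modulefile (suitable for the login node)
module load mpi/intel-15.0/openmpi/1.8.3

# Run a very short test on a small number of cores
mpirun -n 2 ./my_prog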
Set up procedure
Load the appropriate modulefile from the lists below. The openmpi modulefiles below will automatically load the compiler modulefile matching the version of the compiler used to build the MPI libraries.
IB network, Intel compilers, runs on Intel and AMD M-C
Note: These versions can be used on Intel and AMD Magny-Cours nodes containing InfiniBand hardware. Your choice of PE will determine which nodes you run on. Load one of the following modulefiles:
module load mpi/intel-15.0/openmpi/1.8.3m-ib    # bind-to-core default proc binding
module load mpi/intel-15.0/openmpi/1.6m-ib
module load mpi/intel-14.0/openmpi/1.8.3m-ib    # bind-to-core default proc binding
module load mpi/intel-14.0/openmpi/1.6m-ib
module load mpi/intel-12.0/openmpi/1.6-ib
module load mpi/intel-12.0/openmpi/1.4.5-ib
module load mpi/intel-11.1/openmpi/1.4.3-ib
IB network, Intel compilers, runs on Intel only
Note: These versions can be used only on Intel nodes (not AMD nodes) containing InfiniBand hardware. Load one of the following modulefiles:
module load mpi/intel-15.0/openmpi/1.8.3-ib    # bind-to-core default proc binding
module load mpi/intel-14.0/openmpi/1.8.3-ib    # bind-to-core default proc binding
module load mpi/intel-14.0/openmpi/1.6-ib
IB network, GNU compilers, runs on Intel and AMD M-C
Note: These versions can be used on Intel and AMD Magny-Cours nodes containing InfiniBand hardware. Your choice of PE will determine which nodes you run on. Load one of the following modulefiles:
module load mpi/gcc/openmpi/1.6-ib
module load mpi/gcc/openmpi/1.4.5-ib
module load mpi/gcc/openmpi/1.4.3-ib
Note that the default system-wide gcc compiler (v4.4.6) was used. Hence no specific gcc compiler modulefile will be loaded.
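If you wish, you can confirm which gcc the wrappers will use:

# No gcc modulefile is needed – the system compiler is used
gcc --version    # should report the system gcc (4.4.6)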
IB network, Open64 compilers, runs on AMD Bulldozer only
This compiler will generate more optimised code for the AMD Bulldozer architecture than the Intel compiler. However, you may still use one of the Intel modulefiles above instead (and therefore compile your code with the Intel compiler) to run on AMD Bulldozer nodes if that is better for your dependent libraries etc. But Intel-compiled code running on AMD Bulldozer may not be as optimised as Open64-compiled code.
# New: If using fewer than 64 cores (single-node job, not fully populated)
module load mpi/open64-4.5.2.1/openmpi/1.8.3-amd-bd
module load mpi/open64-4.5.2/openmpi/1.6-amd-bd

# If using entire node(s) (64, 128, 192, ... cores)
module load mpi/open64-4.5.2.1/openmpi/1.8.3-ib-amd-bd    # bind-to-core default proc binding
module load mpi/open64-4.5.2/openmpi/1.6-ib-amd-bd
Please see our AMD Bulldozer OpenMPI documentation for more information on compiling for and running MPI jobs on this architecture.
IB network, PGI compilers, runs on AMD Bulldozer only
# New: If using fewer than 64 cores (single-node job, not fully populated)
module load mpi/pgi-14.10-acml-fma4/openmpi/1.8.3-amd-bd

# If using entire node(s) (64, 128, 192, ... cores)
module load mpi/pgi-14.10-acml-fma4/openmpi/1.8.3-ib-amd-bd    # bind-to-core default proc binding
module load mpi/pgi-13.6-acml-fma4/openmpi/1.6-ib-amd-bd
module load mpi/pgi-12.10/openmpi/1.6-ib-amd-bd
Please see our AMD Bulldozer OpenMPI documentation for more information on compiling for and running MPI jobs on this architecture.
GigE network, Intel compilers, runs on Intel and AMD M-C
These modulefiles should be used when running small, single-node MPI jobs in SMP parallel environments. See below for more info about running jobs. We recommend you use InfiniBand networking rather than GigE networking where multi-node jobs are possible.
Process binding is turned off in all of the modulefiles below. Process binding is considered a bad idea if not using all of the cores on a compute node. This is because other jobs could be running on the node and so binding to specific cores may compete with those jobs.
If you are using all of the cores in a single compute node then you may wish to manually turn on process binding as this may give a performance improvement in your application. Please contact its-ri-team@manchester.ac.uk for further advice about this.
Load one of the following modulefiles (process binding is off in all of these versions):
module load mpi/intel-15.0/openmpi/1.8.3m
module load mpi/intel-15.0/openmpi/1.6m
module load mpi/intel-14.0/openmpi/1.8.3m
module load mpi/intel-14.0/openmpi/1.6m
module load mpi/intel-12.0/openmpi/1.6
module load mpi/intel-12.0/openmpi/1.4.5
module load mpi/intel-11.1/openmpi/1.4.3
GigE network, Intel compilers, runs on Intel only
Note: These versions can be used only on Intel nodes (not AMD nodes).
These modulefiles should be used when running small, single-node MPI jobs in SMP parallel environments. See below for more info about running jobs. We recommend you use InfiniBand networking rather than GigE networking where multi-node jobs are possible.
Process binding is turned off in all of the modulefiles below. Process binding is considered a bad idea if not using all of the cores on a compute node. This is because other jobs could be running on the node and so binding to specific cores may compete with those jobs.
If you are using all of the cores in a single compute node then you may wish to manually turn on process binding as this may give a performance improvement in your application. Please contact its-ri-team@manchester.ac.uk for further advice about this.
Load one of the following modulefiles (process binding is off in all of these versions):
module load mpi/intel-15.0/openmpi/1.8.3
module load mpi/intel-15.0/openmpi/1.6
module load mpi/intel-14.0/openmpi/1.8.3
module load mpi/intel-14.0/openmpi/1.6
GigE network, GNU compilers, runs on Intel and AMD M-C
These modulefiles should be used when running small, single-node MPI jobs in SMP parallel environments. See below for more info about running jobs. We recommend you use InfiniBand networking rather than GigE networking where multi-node jobs are possible. Load one of the following modulefiles:
module load mpi/gcc/openmpi/1.6
module load mpi/gcc/openmpi/1.4.5
module load mpi/gcc/openmpi/1.4.3
Compiling the application
If you are simply running an existing application you can skip this step. If compiling your own (or open source, for example) MPI application you should use the MPI compiler wrapper scripts mpif90, mpicc and mpiCC. These will ultimately use the compiler you selected above (Intel, PGI, GNU, Open64) but will add compiler flags to ensure the MPI header files and libraries are used during compilation (setting these flags manually can be very difficult to get right).
Example code compilations when compiling on the command-line:
mpif90 my_prog.f90 -o my_prog    # ...produces fortran mpi executable binary called my_prog
mpicc my_prog.c -o my_prog       # ...produces C mpi executable binary called my_prog
mpiCC my_prog.cpp -o my_prog     # ...produces C++ mpi executable binary called my_prog
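Any additional flags are passed through to the underlying compiler, and you can check that the resulting binary is linked against the Open MPI libraries (the -O2 flag and the ldd check below are illustrative, not required):

# Extra flags (e.g. optimisation) are passed straight through to the underlying compiler
mpif90 -O2 my_prog.f90 -o my_prog

# Confirm the executable is linked against the Open MPI libraries
ldd my_prog | grep -i mpi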
In some build procedures you specify the compiler name using environment variables such as CC and FC. When compiling MPI code you simply use the name of the wrapper script as your compiler name. For example:
CC=mpicc FC=mpif90 ./configure --prefix=~/local/appinst
Please consult the build instructions for your application.
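If the application uses a plain Makefile rather than a configure script, the same idea usually applies, although the exact variable names depend on the Makefile in question:

# Override the compiler variables on the make command line (variable names may differ)
make CC=mpicc CXX=mpiCC FC=mpif90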
Parallel batch job submission
Using InfiniBand on Intel nodes
To submit an MPI batch job to Intel nodes connected by InfiniBand networking:
- Make sure you have the appropriate modules loaded, including the MPI InfiniBand modulefile.
- Jobs must request a multiple of 12 cores
- Example submission script:
#!/bin/bash
###### PE for Intel InfiniBand nodes ######
#$ -pe orte-24-ib.pe 48     # Num cores must be 48 or more and a multiple of 24

### Use the current/submission directory as the working directory
#$ -cwd

### Inherit the user environment settings from the login node (important so the job can find all commands)
#$ -V

## The variable NSLOTS sets the number of processes to match the pe core request
mpirun -n $NSLOTS ./my_prog
- To submit:
qsub jobscript
# where 'jobscript' is replaced with the name of your submission script
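You can then monitor the job with the usual SGE commands, for example:

# Show the status of your queued and running jobs
qstat

# Show jobs belonging to a particular user
qstat -u $USER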
Using InfiniBand on AMD Magny-Cours nodes
To submit an MPI batch job to AMD 32-core Magny-Cours nodes connected by InfiniBand:
- Make sure you have the appropriate modules loaded, including the MPI InfiniBand modulefile.
- Jobs must request a multiple of 32 cores
- Example submission script:
#!/bin/bash
###### PE for AMD Magny-Cours InfiniBand nodes ######
#$ -pe orte-32-ib.pe 64

### Use the current/submission directory as the working directory
#$ -cwd

### Inherit the user environment settings from the login node (important so the job can find all commands)
#$ -V

## The variable NSLOTS sets the number of processes to match the pe core request
mpirun -n $NSLOTS ./my_prog
- To submit:
qsub jobscript
# where 'jobscript' is replaced with the name of your submission script
Using InfiniBand on AMD Bulldozer nodes
To submit an MPI batch job to AMD 64-core Bulldozer nodes connected by InfiniBand:
- Make sure you have the appropriate modules loaded, including the MPI InfiniBand modulefile.
- Jobs must request a multiple of 64 cores
- Example submission script:
#!/bin/bash
###### PE for AMD Bulldozer InfiniBand nodes ######
#$ -pe orte-64bd-ib.pe 128

### Use the current/submission directory as the working directory
#$ -cwd

### Inherit the user environment settings from the login node (important so the job can find all commands)
#$ -V

## The variable NSLOTS sets the number of processes to match the pe core request
mpirun -n $NSLOTS ./my_prog
- To submit:
qsub jobscript
# where 'jobscript' is replaced with the name of your submission script
Please see our AMD Bulldozer OpenMPI documentation for more information on compiling for and running MPI jobs on this architecture.
Jobs that require 24 cores or fewer
- These jobs will not use InfiniBand nodes. They will run on Intel nodes connected via GigE. However, all processes will be placed on the same physical node and hence no communication over the network will take place. Instead shared-memory communication will be used, which is more efficient. Please ensure you have the correct (non-InfiniBand) MPI modulefile loaded or your job will fail.
- To submit an MPI batch job which requires 24 cores or fewer, replace the pe line in the above scripts with the following (a complete example jobscript for this case is given below):
###### PE for Intel single-node (24 cores max) ######
#$ -pe smp.pe 4
# where 4 is replaced with the appropriate number for your job
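For completeness, a full single-node jobscript for this case might look like the following sketch, based on the multi-node examples above (the core count and program name are illustrative, and one of the non-InfiniBand modulefiles should be loaded before submitting):

#!/bin/bash
###### PE for Intel single-node (24 cores max) ######
#$ -pe smp.pe 4

### Use the current/submission directory as the working directory
#$ -cwd

### Inherit the user environment settings from the login node (so the job can find the mpirun command)
#$ -V

## NSLOTS is set to the number of cores requested above (4 in this example)
mpirun -n $NSLOTS ./my_prog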
Further info
- Online help via the command line:
man mpif90    # for fortran mpi
man mpicc     # for C/C++ mpi
man mpirun    # for information on running mpi executables
- Open MPI website
- Training Courses run by IT Services for Research.
- The community wiki MPI webpage has useful links to other information.