OpenMP

Overview

OpenMP (Open Multi-Processing) is a specification for shared memory parallelism. As discussed below, it is used to develop multi-core software that runs on a single multi-core compute node in the CSF. Many applications installed on the CSF use this technology, and you can also use it to write your own parallel applications. However, programming with OpenMP is currently beyond the scope of this webpage.

NOTE: do not confuse OpenMP with OpenMPI. While they are both used to develop parallel applications, they are different technologies and offer different capabilities.

Restrictions on use

You will need to use an OpenMP-compliant compiler to produce a threaded executable. Threaded executables require shared memory which, on the CSF, means you can only run on a single node (see examples).

Set up procedure

OpenMP is accessed via the compiler being used. For details on how to access compilers on the CSF, please see the CSF documentation for the relevant compiler.

Compiling source code

The compilers all use different flags to turn on OpenMP compilation as follows:

  • Intel compiler: -qopenmp (you may still use the older -openmp flag)
  • GNU compiler: -fopenmp
  • PGI compiler: -mp
  • Open64 compiler: -openmp

Your code will also need to include the relevant OpenMP header and use OpenMP directives. The behaviour of your code (e.g., how many threads to run) will be determined by OpenMP environment variables, whose settings you can query using the OpenMP runtime library functions.

Example compiler commands, all of which produce an executable called omp_hello:

# Intel Fortran, C and C++
ifort omp_hello.f -qopenmp -o omp_hello
icc omp_hello.c -qopenmp -o omp_hello
icpc omp_hello.cxx -qopenmp -o omp_hello

# GNU Fortran, C and C++
gfortran omp_hello.f -fopenmp -o omp_hello
gcc omp_hello.c -fopenmp -o omp_hello
g++ omp_hello.cxx -fopenmp -o omp_hello

Running the application

Within your jobscript, you must inform your application how many cores to use, and this must match the number of cores you request from the batch system. The batch system cannot tell your application how many cores you have reserved, so you must do this yourself.

When using OpenMP, the preferred method of informing your application how many cores to use is to set the OMP_NUM_THREADS environment variable to the number required. This is normally done in your jobscript.

If you forget to set OMP_NUM_THREADS, your application will use all cores on the node where it runs. This may be more than you have told the batch system you will be using and may slow down other users’ jobs on that node. Jobs that are found to be using more cores than they have requested in the batch system will be killed by the system administrators.

Example parallel batch submission script (Intel nodes)

Setting OMP_NUM_THREADS within the submission script is the preferred method. Here we use the value of the variable $NSLOTS which is automatically set by the batch system to the number of cores you request in your batch submission script:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8              # Example: request 8 cores (can be 2 to 32) on Intel compute nodes

## If you have compiled your own source code, load the compiler's modulefile, e.g.:
# module load compilers/intel/17.0.7

## The variable NSLOTS is automatically set to the number specified after smp.pe above.
## We use it to set OMP_NUM_THREADS so that our code will use that many cores.
export OMP_NUM_THREADS=$NSLOTS

## Run the application. It will use the OMP_NUM_THREADS variable.
./omp_hello

Submit the jobscript using qsub jobscript, where jobscript is the name of your file.

Further info

Last modified on September 27, 2021 at 8:35 am by George Leaver