OpenMP
Overview
OpenMP (Open Multi-Processing) is a specification for shared-memory parallelism. As discussed below, OpenMP is used to develop multi-core software that runs on a single multi-core compute node in the CSF. Many applications installed on the CSF use this technology and you can also use it to write your own parallel applications. However, programming with OpenMP is currently beyond the scope of this webpage.
NOTE: do not confuse OpenMP with OpenMPI. While they are both used to develop parallel applications, they are different technologies and offer different capabilities.
Restrictions on use
You will need to use an OpenMP-compliant compiler to produce a threaded executable. Threaded executables require shared memory which, on the CSF, means you can only run on a single node (see the examples below).
Set up procedure
OpenMP support is provided by the compiler being used. For details on how to access compilers on the CSF please see:
Compiling source code
The compilers all use different flags to turn on OpenMP compilation, as follows:
- Intel compiler: -qopenmp (you may still use the older -openmp flag)
- GNU compiler: -fopenmp
- PGI compiler: -mp
- Open64 compiler: -openmp
Your code will also need to include the relevant OpenMP header and use OpenMP directives. The behaviour of your code (e.g., how many threads to run) is determined by OpenMP environment variables, which you can query using the OpenMP runtime library functions.
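As an illustration of these points (the omp.h header, a parallel directive, and calls to the runtime library), a minimal C program along the following lines could serve as omp_hello.c in the compile examples below. This is only a sketch, not necessarily the exact code behind the examples on this page:

/* omp_hello.c -- minimal OpenMP example (illustrative sketch) */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* Start a team of threads; the team size is normally taken
       from the OMP_NUM_THREADS environment variable. */
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();   /* runtime library call */
        printf("Hello from thread %d\n", tid);

        if (tid == 0) {
            printf("Team size: %d threads\n", omp_get_num_threads());
        }
    }
    return 0;
}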
Example compiler commands, all of which produce an executable called omp_hello:

# Intel fortran, C and C++
ifort omp_hello.f -qopenmp -o omp_hello
icc omp_hello.c -qopenmp -o omp_hello
icpc omp_hello.cxx -qopenmp -o omp_hello

# GNU fortran, C and C++
gfortran omp_hello.f -fopenmp -o omp_hello
gcc omp_hello.c -fopenmp -o omp_hello
g++ omp_hello.cxx -fopenmp -o omp_hello
Running the application
Within your jobscript, you must inform your application how many cores to use, and this must match the number of cores you request from the batch system. The batch system does not automatically tell your application how many cores have been reserved, so you must pass this information on yourself.
When using OpenMP, the preferred method of informing your application how many cores to use is to set the OMP_NUM_THREADS environment variable to the required number. This is normally done in your jobscript.
If you forget to set OMP_NUM_THREADS, your application will typically use all of the cores on the node where it runs. This may be more than you told the batch system you would be using, and may slow down other users' jobs on that node. Jobs found to be using more cores than they requested from the batch system will be killed by the sys admins.
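As a quick sanity check that your program is picking up the value of OMP_NUM_THREADS, a small C sketch like the following can be compiled and run in the same way as omp_hello (the file name check_threads.c is purely illustrative):

/* check_threads.c -- report how many threads OpenMP will use (illustrative sketch) */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* omp_get_max_threads() returns the upper limit on the team size for
       the next parallel region. With OMP_NUM_THREADS exported in the
       jobscript this should match the number of cores requested; if the
       variable is unset, the value is implementation-defined and is
       usually every core on the node. */
    printf("OpenMP will use up to %d threads\n", omp_get_max_threads());
    return 0;
}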
Example Parallel batch submission scripts (Intel nodes)
Setting OMP_NUM_THREADS within the submission script is the preferred method. Here we use the value of the variable $NSLOTS, which is automatically set by the batch system to the number of cores you request in your batch submission script:
#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8     # Example: request 8 cores (can be 2 -- 32) on intel compute nodes

## If you have compiled your own source code, load the compiler's modulefile. EG:
# module load compilers/intel/17.0.7

## The variable NSLOTS is automatically set to the number specified after smp.pe above.
## We use it to set OMP_NUM_THREADS so that our code will use that many cores.
export OMP_NUM_THREADS=$NSLOTS

## Run the application. It will use the OMP_NUM_THREADS variable.
./omp_hello
Submit the jobscript using qsub jobscript, where jobscript is the name of your file.