The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.
OpenMP
Overview
OpenMP (Open Multi-Processing) is a specification for shared memory parallelism.
Programming using OpenMP is currently beyond the scope of this webpage. IT Services for Research run training courses in parallel programming.
Restrictions on use
You will need to use an OpenMP-compliant compiler to produce a threaded executable. Threaded executables require shared memory which, on the CSF, means you can only run on a single node (see the examples below).
Set up procedure
OpenMP is accessed through whichever compiler you use. For details on how to access compilers on the CSF, please see the CSF compiler documentation.
Compiling
The compilers use different flags to turn on OpenMP compilation:
- Intel compiler: -openmp
- GNU compiler: -fopenmp
- PGI compiler: -mp
- Open64 compiler: -openmp
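If you build with more than one compiler, the flag list above can be captured in a small lookup. The following is only a sketch: the function name omp_flag is hypothetical, and the PGI command name pgcc is an assumption, since this page names the PGI compiler but not its command.

```shell
# Hypothetical helper: print the OpenMP flag for a given compiler command.
# icc, ifort, gcc, gfortran and opencc appear on this page; pgcc is an
# assumed command name for the PGI C compiler.
omp_flag() {
    case "$1" in
        icc|ifort)     echo "-openmp"  ;;
        gcc|gfortran)  echo "-fopenmp" ;;
        pgcc)          echo "-mp"      ;;
        opencc)        echo "-openmp"  ;;
        *)             echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

# Build (but do not run) a compile command for the chosen compiler.
compiler=gcc
echo "$compiler omp_hello.c -o omp_hello $(omp_flag "$compiler")"
# prints: gcc omp_hello.c -o omp_hello -fopenmp
```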
Your code will also need to include the relevant OpenMP header and use OpenMP directives. The behaviour of your code (e.g., how many threads to run) is determined by OpenMP environment variables, whose effect you can query using the OpenMP run time library functions.
Example compiles, all of which produce an executable called ‘omp_hello’:
- Intel Fortran: ifort omp_hello.f -o omp_hello -openmp
- gfortran: gfortran omp_hello.f -o omp_hello -fopenmp
- Intel C: icc omp_hello.c -o omp_hello -openmp
- GCC: gcc omp_hello.c -o omp_hello -fopenmp
- Open64: opencc omp_hello.c -o omp_hello -march=bdver1 -openmp
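The compile lines above assume a source file named omp_hello.c, whose contents are not shown on this page. As a minimal sketch, the shell snippet below writes one possible version of that file into a scratch directory and, if gcc happens to be available, compiles and runs it with two threads. The C source, the scratch directory and the thread count of 2 are all illustrative assumptions.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir="$(mktemp -d)"

# Hypothetical contents for omp_hello.c; the real file is not shown on this page.
cat > "$workdir/omp_hello.c" <<'EOF'
#include <stdio.h>
#include <omp.h>   /* OpenMP run time library functions */

int main(void)
{
    /* Each thread in the team executes this block once. */
    #pragma omp parallel
    {
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}
EOF

# Compile and run only if gcc is available; -fopenmp is the GNU flag listed above.
if command -v gcc >/dev/null 2>&1; then
    gcc "$workdir/omp_hello.c" -o "$workdir/omp_hello" -fopenmp \
        && OMP_NUM_THREADS=2 "$workdir/omp_hello"
fi
```

The order in which threads print is not fixed, so the output lines may appear in a different order from run to run.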
Running the application
You must inform your application how many cores to use and this must match the number of cores you request from the batch system. The batch system does not automatically run your program with the number of cores reserved for your job. You need to reserve the required number of cores in your jobscript AND inform your application to use that number of cores.
The preferred method of informing your application how many cores to use is to set the OMP_NUM_THREADS environment variable to the number required. This is normally done in your jobscript. Alternatively, use the omp_set_num_threads() run time library function in your code (but this is not the preferred method).
If you forget to set OMP_NUM_THREADS, for example, your application will use all cores on the node where it runs. This may be more than you have told the batch system you will be using and may slow down other users’ jobs on that node. Jobs that are found to be using more cores than they have requested in the batch system will be killed by the sys admins.
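One way to guard against a forgotten setting is to derive OMP_NUM_THREADS from the batch system's NSLOTS variable (set to the number of cores you request) and to fall back to a single thread when NSLOTS is missing. The sketch below is illustrative only: the function name set_omp_threads is hypothetical, and the fallback value of 1 is an assumption chosen so that the job never uses more cores than it reserved.

```shell
# Hypothetical helper for the top of a jobscript.
# Uses NSLOTS when the batch system has set it; otherwise falls back to
# one thread (an assumed safe default, not a site policy).
set_omp_threads() {
    if [ -z "${NSLOTS:-}" ]; then
        echo "NSLOTS is not set; using 1 thread" >&2
        export OMP_NUM_THREADS=1
    else
        export OMP_NUM_THREADS="$NSLOTS"
    fi
}

set_omp_threads
echo "Running with OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Outside a batch job NSLOTS is normally unset, so running this interactively reports one thread.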
Example Parallel batch submission scripts (Intel nodes)
- Setting OMP_NUM_THREADS within the submission script (the preferred method). Here we use the value of the variable $NSLOTS, which is automatically set by the batch system to the number of cores you request in your batch submission script:

    #!/bin/bash
    ## SGE Stuff
    #$ -cwd
    #$ -V
    #$ -pe smp.pe 8     # Example: request 8 cores

    ## The variable NSLOTS is automatically set to the number specified after smp.pe above.
    ## We use it to set OMP_NUM_THREADS so that our code will use that many cores.
    export OMP_NUM_THREADS=$NSLOTS
    ./omp_hello
- If you have set OMP_NUM_THREADS on the command line using export OMP_NUM_THREADS=8 (for example) before submitting your job:

    #!/bin/bash
    ## SGE Stuff
    #$ -cwd
    ## Must use -V to inherit the OMP_NUM_THREADS set outside of the jobscript
    #$ -V
    ## The number on the pe line (number of cores) **must** match the number you set for OMP_NUM_THREADS
    #$ -pe smp.pe 8
    ./omp_hello
In both cases submit the jobscript using qsub jobscript, where jobscript is the name of your file.
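When OMP_NUM_THREADS is set on the command line, it is easy for it to drift out of step with the number on the pe line. As a rough sketch, the check below reads the core count from the "#$ -pe" line of a jobscript and compares it with the current OMP_NUM_THREADS before you submit. The function names pe_cores and check_threads_match are hypothetical, and the parsing assumes the pe line has the form shown in the examples on this page.

```shell
# Hypothetical helper: print the core count from the first "#$ -pe <name> <cores>" line.
pe_cores() {
    awk '$1 == "#$" && $2 == "-pe" { print $4; exit }' "$1"
}

# Hypothetical helper: succeed only if OMP_NUM_THREADS equals the pe core count.
check_threads_match() {
    script="$1"
    cores="$(pe_cores "$script")"
    if [ -z "$cores" ]; then
        echo "no '#\$ -pe' line found in $script" >&2
        return 2
    fi
    if [ "${OMP_NUM_THREADS:-}" != "$cores" ]; then
        echo "mismatch: pe line requests $cores cores, OMP_NUM_THREADS is '${OMP_NUM_THREADS:-unset}'" >&2
        return 1
    fi
    echo "OK: $cores cores requested and OMP_NUM_THREADS=$OMP_NUM_THREADS"
}

# Example usage (jobscript is the name of your file):
#   export OMP_NUM_THREADS=8
#   check_threads_match jobscript && qsub jobscript
```

Chaining the check with && means qsub only runs when the two numbers agree.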
Example Parallel batch submission scripts (AMD Bulldozer nodes)
- Setting OMP_NUM_THREADS within the submission script (the preferred method). Here we use the value of the variable $NSLOTS, which is automatically set by the batch system to the number of cores you request in your batch submission script:

    #!/bin/bash
    ## SGE Stuff
    #$ -cwd
    #$ -V
    #$ -pe smp-64bd.pe 64     # Example: request all 64 cores

    ## The variable NSLOTS is automatically set to the number specified after smp-64bd.pe above.
    ## We use it to set OMP_NUM_THREADS so that our code will use that many cores.
    export OMP_NUM_THREADS=$NSLOTS
    ./omp_hello
Submit the jobscript using qsub jobscript, where jobscript is the name of your file.
Example Parallel batch submission scripts (AMD Magny-Cours nodes)
- Setting OMP_NUM_THREADS within the submission script (the preferred method). Here we use the value of the variable $NSLOTS, which is automatically set by the batch system to the number of cores you request in your batch submission script:

    #!/bin/bash
    ## SGE Stuff
    #$ -cwd
    #$ -V
    #$ -pe smp-32mc.pe 32     # Example: request all 32 cores

    ## The variable NSLOTS is automatically set to the number specified after smp-32mc.pe above.
    ## We use it to set OMP_NUM_THREADS so that our code will use that many cores.
    export OMP_NUM_THREADS=$NSLOTS
    ./omp_hello
Submit the jobscript using qsub jobscript, where jobscript is the name of your file.
Further info
- The official OpenMP site
- Training Courses run by IT Services for Research.