The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.
Parallel Jobs
Current Configuration and Parallel Environments
For jobs that require more than one CPU core, the appropriate SGE parallel environment should be selected from the table below. Please also consult the software page specific to the code/application you are running for advice on the most suitable PE. In particular, starccm+ users should use the parallel environments listed on the CSF starccm page, not the ones listed here.
The various PEs are described below. After that we show how to specify a PE name (and number of cores) in your jobscripts.
No Broadwell or Ivy Bridge nodes are available in CSF2 due to the upgrade.
Intel parallel environments
Single Node Multi-core (SMP) and small MPI Jobs

PE names: smp.pe, fluent-smp.pe

Optional Resources (use only one flag¹):

| SGE Flag | Max cores per job, available RAM per core | Additional usage guidance |
|---|---|---|
| -l short | Max 12 cores, 4GB/core, 1 hour runtime | Currently a pool of 24 Westmere cores. |
| -l highmem (not currently available) | Max 12 cores, 8GB/core | Jobs must genuinely need extra memory. Total pool 84 Westmere cores (shared with vhighmem jobs). |
| -l vhighmem (not currently available) | Max 6 cores, 16GB/core | Jobs must genuinely need extra memory. Total pool 42 Westmere cores (shared with highmem jobs). |
| -l westmere | Max 12 cores, 4GB/core | Use only Westmere cores. |
| -l sandybridge | Max 12 cores, 5GB/core | Use only Sandybridge cores. |
| -l haswell | Max 24 cores, 5GB/core | Use only Haswell cores. |
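For illustration only, the fragment below sketches how one of these optional resource flags can be combined with smp.pe in a jobscript. The core count, the choice of -l haswell and the program name my_smp_prog are assumptions made for the example, so check the limits in the table above before adapting it.

#!/bin/bash
#$ -S /bin/bash
#$ -cwd                 # Run in the directory qsub was issued from
#$ -V                   # Inherit the environment (loaded modules etc)
#$ -pe smp.pe 24        # Single-node SMP job (example core count)
#$ -l haswell           # Optional resource: Haswell cores only (max 24 cores, 5GB/core)

# my_smp_prog is a placeholder for your own application
./my_smp_prog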
Multiple Nodes — Infiniband-Connected
PE name: orte-24-ib.pe

| Optional Resources | Additional usage guidance |
|---|---|
| None | None |

orte-12-ib.pe – NO LONGER AVAILABLE (11-07-2016). Please use orte-24-ib.pe or smp.pe instead.
AMD Bulldozer parallel environments
PE names: smp-64bd.pe, orte-64bd-ib.pe

| Optional Resources | Additional usage guidance |
|---|---|
| -l bulldozer -l short | For compilations (open64 compiler recommended for Bulldozer) and small test jobs only. 12 hour time limit. 64 cores in total available, via smp-64bd.pe only, but jobs should use as few cores as possible. Both of these flags are needed for this type of work. Specifying -l bulldozer for any other type of job will not work. |
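As a sketch only, a small Bulldozer compile-and-test job following the guidance above could look like the fragment below; the core count and the script name compile_test.sh are placeholders for your own choices.

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -V
#$ -pe smp-64bd.pe 4     # Test/compile work: use as few cores as possible
#$ -l bulldozer          # Both flags are required for this type of job
#$ -l short              # 12 hour time limit applies

# compile_test.sh is a placeholder for your compile and test commands
# (the open64 compiler is recommended for Bulldozer)
./compile_test.sh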
AMD Magny-Cours parallel environments
PE names: smp-32mc.pe, orte-32-ib.pe

| Optional Resources | Additional usage guidance |
|---|---|
| None | None |
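As a hedged sketch, a Magny-Cours SMP job might look like the fragment below; the 32-core request assumes the "32" in the PE name reflects the cores available per node, and my_prog is a placeholder for your own application.

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -V
#$ -pe smp-32mc.pe 32    # AMD Magny-Cours node (core count assumed from the PE name)

./my_prog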
¹ Unless indicated, you should specify only one optional resource – they are mutually exclusive.
Basic parallel batch SGE job submission
When submitting parallel jobs to the batch system you usually specify the number of cores required in two places:
- The -pe option tells the batch system how many cores you are requesting. It will only run your job when the correct resources become available. Add the following to your jobscript:

  #$ -pe pename numcores

  replacing pename with one of the PE names described above and numcores with the number of cores to use (satisfying the rules in the PE description above).
- Your application will also need to be informed how many cores you have requested from the batch system. There is usually a command-line flag or environment variable that you must give to your application so that it uses no more cores than you requested from the batch system. Be careful here. Some software will try to use all cores in a node if you don’t specifically tell it how many you actually requested. You might end up using cores that haven’t been allocated to you, which could adversely affect other users’ jobs. You must ensure you tell your application how many cores to use.
The batch system automatically sets the environment variable $NSLOTS to the number of cores requested on the #$ -pe line (the number you replaced numcores with). You can then use this environment variable to tell your application how many cores to use. See below for examples of this when running MPI and OpenMP jobs.
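For example, assuming a hypothetical application that accepts the number of cores via a command-line flag (the --threads option and my_app below are illustrative only), the jobscript would simply pass $NSLOTS through:

# $NSLOTS matches the number given on the '#$ -pe' line
./my_app --threads $NSLOTS    # hypothetical flag; check your application's documentation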
If you use some other parallel method (e.g., Java threads or Boost library threads) then you should check the application’s documentation for how to specify the number of cores to use. In particular, if running Java applications please see our instructions for running Java applications to ensure you only use the cores which you have reserved in the batch system. Another example is the Gaussian application – this requires you to put the number of cores to use in your data input file!
MPI applications
MPI is used by many applications to provide parallelism (multi-core and multi-node). If your application uses MPI, here’s how you typically run it:
- Make sure you have all appropriate modules loaded (see the relevant software page for further information), including the correct MPI. Further info regarding the MPI Implementation (OpenMPI) on the CSF
- You also need to choose the appropriate PE (to specify single-node or multi-node parallelism, Intel or AMD hardware).
- We run our job from the scratch filesystem:
cd ~/scratch
- Example multi-core MPI job script:
#!/bin/bash
#$ -S /bin/bash
#$ -cwd                     # Job runs in current directory (where you run qsub)
#$ -V                       # Job inherits environment (settings from loaded modules etc)
#$ -pe orte-24-ib.pe 48     # This example requests 48 cores (i.e., 2 physical 24-core Intel nodes
                            # with InfiniBand networking hardware)

# $NSLOTS is automatically set to the number of requested cores.
# Tell MPI to start that many processes.
mpirun -n $NSLOTS ./my_mpi_prog
- To submit:
qsub jobscript     # where 'jobscript' is replaced with the name of your submission script
OpenMP applications
OpenMP is used by many applications to provide parallelism (multi-core). If your application uses OpenMP, here’s how you typically run it:
- Make sure you have all appropriate modules loaded (see the relevant software page for further information). Further info regarding OpenMP on the CSF
- We run our job from the scratch filesystem:
cd ~/scratch
- Example multi-core OpenMP job script:
#!/bin/bash
#$ -S /bin/bash
#$ -cwd                     # Job runs in current directory (where you run qsub)
#$ -V                       # Job inherits environment (settings from loaded modules etc)
#$ -pe smp.pe 8             # This example requests 8 cores (Intel)

# $NSLOTS is automatically set to the number of requested cores.
# Tell OpenMP to start that many threads.
export OMP_NUM_THREADS=$NSLOTS
./my_openmp_prog
- To submit:
qsub jobscript     # where 'jobscript' is replaced with the name of your submission script