Parallel Jobs

Current Configuration and Parallel Environments

For jobs that require two or more CPU cores, the appropriate SGE parallel environment should be selected from the table below.

Please also consult the software page specific to the code/application you are running for advice on the most suitable PE.

A parallel job script takes the form:

#!/bin/bash --login
#$ -cwd                       # Job will run in the current directory (where you ran qsub)
#$ -pe pename numcores        # Choose a PE name from the tables below and a number of cores

# Load any required modulefiles
module load apps/some/example/1.2.3

# Now the commands to be run in the job. You MUST tell your app how many cores to use! There are usually
# three ways to do this. Note: $NSLOTS is automatically set to the number of cores requested above.

  • OpenMP applications (multicore but all in a single compute node):
    export OMP_NUM_THREADS=$NSLOTS
    the_openmp_app
    
  • MPI applications (small jobs on a single node or larger jobs across multiple compute nodes):
    mpirun -n $NSLOTS the_mpi_app 
    
  • Other multicore apps that use their own command-line flags (you must check the app’s documentation for how to do this correctly). For example:
    the_bioinfo_app --numthreads $NSLOTS         # This is an example - check your app's docs!
    
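Putting this together, a minimal sketch of a complete jobscript for a 16-core OpenMP job follows. The modulefile and application names are placeholders – substitute your own, and choose the PE from the tables below:

#!/bin/bash --login
#$ -cwd                               # Run in the current directory
#$ -pe amd.pe 16                      # 16 cores on a single AMD node (see PE tables below)

module load apps/some/example/1.2.3   # Placeholder - load your app's modulefile

export OMP_NUM_THREADS=$NSLOTS        # $NSLOTS is set to 16 by the batch system
the_openmp_app                        # Placeholder application name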

The available parallel environments are now described. Use the name of a parallel environment on the #$ -pe line of your jobscript.

AMD Parallel Environments

New from September 2024. We are installing new AMD compute nodes in the CSF, and these will eventually be the majority of nodes.

Single Node Multi-core (SMP) and MPI Jobs

PE name: amd.pe (NOTE: it is NOT smp.pe – that is for Intel CPUs – see later)

  • For jobs of 2 to 168 cores.
  • Jobs will use a single compute node. Use for OpenMP (or other multicore/SMP jobs) and MPI jobs.
  • 8GB RAM per core.
  • 7 day runtime limit.
  • Currently, jobs will run on AMD “Genoa” CPUs. All nodes provide avx2 and avx512 vector extensions.
  • A large pool of nodes is available – currently 10,248 cores.
Optional Resources | Max cores per job, RAM per core | Additional usage guidance
-l short | Max 28 cores, 8GB/core (Genoa nodes), 1 hour runtime | Usually has shorter queue-wait times. Only 2 nodes available. This option is for test jobs and interactive use only – DO NOT use it for production runs, as that is unfair on those who need it for testing/interactive work.
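
For example, a hedged sketch of a 64-core MPI jobscript using this PE (the modulefile and application names are placeholders):

#!/bin/bash --login
#$ -cwd
#$ -pe amd.pe 64                      # Any value from 2 to 168 cores, all on one AMD node

module load apps/some/mpi_app/1.0.0   # Placeholder - load your app's modulefile

mpirun -n $NSLOTS the_mpi_app         # $NSLOTS is 64 here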

Intel Parallel Environments

Single Node Multi-core (SMP) and small MPI Jobs

PE name: smp.pe

  • For jobs of 2 to 32 cores.
  • Jobs will use a single compute node. Use for OpenMP (or other multicore/SMP jobs) and small MPI jobs.
  • 4GB or 6GB RAM per core, depending on where the job runs.
  • 7 day runtime limit.
  • Currently, jobs may be placed on either Broadwell (max 28 cores) or Skylake (max 32 cores) CPUs. See optional resources below to control this. The system will choose if not specified.
  • We recommend you do not specify a type of CPU unless absolutely necessary for your application/situation as doing so reduces the pool of nodes available to you and can lead to an increased wait in the queue.
  • Large pool of cores.
  • The optional resource flags below can be used to modify your job. Specify only one flag (if using any) unless indicated in the table. As noted above, restricting the CPU type can mean a much longer wait in the queue.
Optional Resources | Max cores per job, RAM per core | Additional usage guidance
-l mem512, -l mem1500, -l mem2000, -l mem4000 (restricted) | Please see the High Memory Jobs page | High memory nodes. Jobs must genuinely need extra memory.
-l short | Max 24 cores, 4GB/core (Haswell nodes), 1 hour runtime | Usually has shorter queue-wait times. Only 2 nodes available. This option is for test jobs and interactive use only – DO NOT use it for production runs, as that is unfair on those who need it for testing/interactive work.

Most users will not need to use the following flags. They will restrict the pool of nodes available to your jobs, which will result in longer queue-wait times.

-l broadwell | Max 28 cores, 5GB/core | Use only Broadwell cores.
-l skylake | Max 32 cores, 6GB/core | Use only Skylake cores.
-l avx | Limits depend on the node type the system chooses and on any memory options | System will choose Broadwell or Skylake CPUs.
-l avx2 | Limits depend on the node type the system chooses and on any memory options | System will choose Broadwell or Skylake CPUs.
-l avx512 | Max 32 cores, 6GB/core | Use only Skylake CPUs.
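
As an illustration, a jobscript restricted to Skylake nodes might look like the following (remember that specifying a CPU type can increase queue-wait time; the application name is a placeholder):

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 32                      # Maximum for Skylake nodes
#$ -l skylake                         # Optional: restrict to Skylake (max 32 cores, 6GB/core)

export OMP_NUM_THREADS=$NSLOTS        # $NSLOTS is set to 32 by the batch system
the_openmp_app                        # Placeholder application name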

Multi-node large MPI Jobs

PE name: mpi-24-ib.pe

This PE HAS NOW BEEN RETIRED! Use the AMD PE above (amd.pe) for larger parallel jobs (up to 168 cores).

For multi-node jobs larger than 168 cores, please see the HPC Pool.


Basic parallel batch SGE job submission

When submitting parallel jobs to the batch system you usually specify the number of cores required in two places:

  1. The -pe option tells the batch system how many cores you are requesting. It will only run your job when the requested resources become available. Add the following to your jobscript:
    #$ -pe pename numcores
    

    replacing pename with one of the PE names described above and numcores with the number of cores to use (satisfying the rules in the PE description above).

  2. Your application will also need to be informed how many cores you have requested from the batch system. There is usually a command-line flag or environment variable that you must give to your application so that it uses no more cores than you requested. Be careful here: some software will try to use all of the cores in a node if you don't specifically tell it how many you actually requested. You might then end up using cores that haven't been allocated to you, which could adversely affect other users' jobs. You must ensure you tell your application how many cores to use.

The batch system automatically sets the environment variable $NSLOTS to the number of cores requested on the #$ -pe line. You can then use this environment variable to tell your application how many cores to use. See above for examples of this when running MPI and OpenMP jobs.

If you use some other parallel method (e.g., Java threads or Boost library threads) then you should check the application's documentation for how to specify the number of cores to use. In particular, if running Java applications please see our instructions for running Java applications to ensure you only use the cores which you have reserved in the batch system. Another example is the Gaussian application – this requires you to set the GAUSS_PDEF environment variable to the number of required cores. The example jobscript below shows several of these approaches:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 4
#### Some examples of how you might inform an app it can use the 4 cores requested above:
## OpenMP application
export OMP_NUM_THREADS=$NSLOTS
someOpenMPApp.exe
## MPI application
mpirun -n $NSLOTS someMPIAPP.exe
## Gaussian
export GAUSS_PDEF=$NSLOTS
g16 ...
## pplacer
pplacer -j $NSLOTS ...
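## Java (a hedged example: assumes a JVM recent enough, JDK 8u191+ or 10+, to
## support -XX:ActiveProcessorCount, which caps the core count the JVM sees;
## see our Java instructions for the recommended approach on this system)
java -XX:ActiveProcessorCount=$NSLOTS -jar someJavaApp.jar   # someJavaApp.jar is a placeholder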
