Parallel Jobs (Slurm)

Parallel batch job submission (Slurm)

This page is for jobs that require 2-168 CPU cores, all running on a single AMD Genoa compute node. A jobscript template is shown below. Please also consult the Partitions page for details of the available compute resources.

Please also consult the software page for the application you are using, which gives advice specific to running that application.

A parallel job script will run in the directory (folder) from which you submit the job.
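For example, to have the job run in (and write its output files to) a particular directory, change to that directory before submitting. A minimal sketch, using a hypothetical directory and jobscript name:

# Change to the directory the job should run in, then submit the jobscript
cd ~/scratch/my_project
sbatch jobscript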

MPI parallel apps

The jobscript takes the form:

#!/bin/bash --login
#SBATCH -p multicore  # Partition is required. Runs on AMD Genoa hardware.
#SBATCH -n numcores   # (or --ntasks=) where numcores is between 2 and 168.
#SBATCH -t 4-0        # Wallclock limit (days-hours). Required!
                      # Max permitted is 7 days (7-0).

# Load any required modulefiles. A purge is used to start with a clean environment.
module purge
module load apps/some/example/1.2.3

### MPI jobs ###
# mpirun will run $SLURM_NTASKS (-n above) MPI processes
mpirun mpi-app.exe
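Once numcores, the modulefile and the executable name have been replaced with real values, the job is submitted and monitored in the usual way. A brief sketch (the job ID shown is illustrative):

sbatch jobscript        # Prints: Submitted batch job 123456
squeue -u $USER         # Check the state of your queued / running jobs
cat slurm-123456.out    # By default Slurm writes job output to slurm-<jobid>.out
                        # in the directory from which you submitted the job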

OpenMP parallel apps

Here we request 1 “task” (a process, i.e., a running copy of your executable) and multiple CPUs per task.

On the CSF, this is functionally equivalent to the jobscript above, which simply uses -n numcores, so you can use either form. Many of the jobscript examples on the software pages use only -n to specify the number of cores. This works because jobs in the multicore partition are single-node jobs: although Slurm can distribute the -n tasks across several compute nodes, here there is only ever one node.

If you use the Slurm srun starter to run your executable inside the batch job (optional, and in most cases we don’t use this) then the distinction between -n and -c is important. A short sketch follows the jobscript below.

#!/bin/bash --login
#SBATCH -p multicore  # Partition is required. Runs on AMD Genoa hardware.
#SBATCH -n 1          # (or --ntasks=) The default is 1 so this line can be omitted.
#SBATCH -c numcores   # (or --cpus-per-task) where numcores is between 2 and 168
#SBATCH -t 4-0        # Wallclock limit (days-hours). Required!
                      # Max permitted is 7 days (7-0).

# Load any required modulefiles. A purge is used to start with a clean environment.
module purge
module load apps/some/example/1.2.3

### OpenMP jobs ###
# OpenMP code will use $SLURM_CPUS_PER_TASK cores (-c above)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
omp-app.exe
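For completeness, the sketch below shows why the -n / -c distinction matters when srun is used: srun starts one copy of the executable per task. This is illustrative only; depending on the Slurm version you may need to pass --cpus-per-task to srun explicitly, as shown.

### OpenMP jobs started with srun (optional) ###
# With -n 1 and -c numcores: srun starts ONE copy of the app, which runs numcores OpenMP threads
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun --cpus-per-task=$SLURM_CPUS_PER_TASK omp-app.exe

# With -n numcores and no -c: srun would instead start numcores separate copies of the app,
# each on a single core - not what is wanted for a single OpenMP application.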

MPI+OpenMP mixed-mode apps

Here we run a small number of MPI processes, each of which runs several OpenMP threads to make use of multiple cores. This type of job is usually run in the HPC Pool because it is better suited to multi-node jobs; remember that the multicore partition can only run single-node jobs.

#!/bin/bash --login
#SBATCH -p multicore  # Partition is required. Runs on AMD Genoa hardware.
#SBATCH -n numtasks   # (or --ntasks=) Number of MPI processes (e.g., 4)
#SBATCH -c numcores   # (or --cpus-per-task) Number of OpenMP threads per MPI process (e.g., 42)
                      # This will use numtasks x numcores cores in total (e.g., 168)
#SBATCH -t 4-0        # Wallclock limit (days-hours). Required!
                      # Max permitted is 7 days (7-0).

# Load any required modulefiles. A purge is used to start with a clean environment.
module purge
module load apps/some/example/1.2.3

### OpenMP ###
# Each MPI process uses OpenMP to run on $SLURM_CPUS_PER_TASK cores (-c above)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

### MPI ###
# The MPI application is then started with $SLURM_NTASKS processes (copies of the app)
# We inform MPI how many cores each MPI process should bind to so that OpenMP can use them.
mpirun --map-by ppr:${SLURM_NTASKS}:node:pe=$OMP_NUM_THREADS mix-mode-app.exe
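As an optional sanity check, a few echo lines can be added before the mpirun command to report the layout the job has been given. A short sketch (purely illustrative and safe to omit):

echo "MPI processes (tasks):       $SLURM_NTASKS"
echo "OpenMP threads per process:  $OMP_NUM_THREADS"
echo "Total cores used by the job: $(( SLURM_NTASKS * SLURM_CPUS_PER_TASK ))"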

Available Hardware and Resources

Please see the Partitions page for details on available compute resources.
