Serial Jobs (Slurm)

Audience

The instructions in this section are for users who have been asked to test applications in the Slurm batch system on the upgraded CSF3, thereby helping us to ensure the upgraded system is running as expected.

PLEASE DO NOT SUBMIT A REQUEST ASKING FOR ACCESS TO THE NEW SYSTEM – WE WILL CONTACT YOU AT THE APPROPRIATE TIME!!!
(answering these requests will slow down our upgrade work!)

Serial batch job submission (Slurm)

For jobs that require one CPU core.

Please also consult the software page specific to the code / application you are running for advice on running your application.

A serial job script will run in the directory (folder) from which you submit the job. The jobscript takes the form:

#!/bin/bash --login
#SBATCH -p serial  # Partition is required. Runs on an Intel Haswell node (during cluster testing)
#SBATCH -t 4-0     # Wallclock limit (days-hours). Required!
                   # Max permitted is 7 days (7-0).

# Load any required modulefiles. A purge is used to start with a clean environment.
module purge
module load apps/some/example/1.2.3

# Now the commands to be run by the job.
serialapp.exe

# If an app needs to be told explicitly to use one core, use $SLURM_NTASKS. For example:
serialapp --cores $SLURM_NTASKS
             #
             # Check your app's documentation for its required flags!
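
To submit the above jobscript, save it in a file (for example myjob.sbatch – the filename here is just an illustration) and pass it to sbatch on the login node. A quick sketch:

# Submit the jobscript to the batch system (prints the assigned job id)
sbatch myjob.sbatch

# Check the status of your queued and running jobs
squeue -u $USER

By default, Slurm writes the job's output to a file named slurm-<jobid>.out in the directory from which you submitted the job.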

In the above jobscript, we do not explicitly specify the number of cores. In this case, the default is assumed:

#SBATCH -n 1        # Causes the $SLURM_NTASKS env var to be set to 1

If you require the $SLURM_CPUS_PER_TASK env var in your jobscript (set to 1), then you should specify:

#SBATCH -c 1
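
For example, a minimal sketch of a complete jobscript using -c (the serialapp name and its --threads flag are hypothetical – check your own app's documentation for the correct flag):

#!/bin/bash --login
#SBATCH -p serial  # Partition is required
#SBATCH -t 1-0     # Wallclock limit (1 day)
#SBATCH -c 1       # Causes $SLURM_CPUS_PER_TASK to be set to 1

module purge
module load apps/some/example/1.2.3

# Hypothetical flag – consult your app's documentation
serialapp --threads $SLURM_CPUS_PER_TASK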

Command-line one-liner

The above serial job can also be submitted entirely from the command line on the login node (a quick way of running a serial job in batch).

Note that you will need to load any modulefiles before submitting the job. The batch system takes a copy of any settings made by the modulefiles so that they are visible to the job when it eventually runs. You can even log out of the CSF before the job runs; the job will still have a copy of the modulefile settings, allowing it to run correctly.

module purge
module load apps/some/example/1.2.3
sbatch -p serial -t 4-0 --wrap="theapp.exe optional-args"
                  #
                  # The wallclock time limit is now required.
                  # This example specifies 4-0 (4 days, 0 hours).

where optional-args are any command-line flags that you want to pass to the theapp.exe program (or your own program). The --wrap flag tells sbatch to wrap the given command string in a minimal, automatically generated jobscript, so theapp.exe is run as a program rather than being treated as a jobscript file.
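
Because --wrap simply wraps the supplied string in a generated jobscript, the string can be a short shell command line rather than a single program name. A sketch (the directory name and program flags are hypothetical):

module purge
module load apps/some/example/1.2.3
sbatch -p serial -t 1-0 --wrap="cd myjobdir; theapp.exe --input mydata.dat"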

AMD Serial Hardware

Not currently possible – please use the Intel hardware for serial jobs.

Intel Serial Hardware

Serial jobs can run on various Intel compute-nodes. The type of Intel hardware used (amount of memory available to the job, CPU architecture, runtime) can be controlled with extra jobscript flags as shown below.

Note: You must specify the serial partition AND a job “wallclock” time limit.

Partition name: serial

  • For 1 core jobs (includes serial job arrays – see the sketch after this list).
  • You must specify the partition name.
  • You must specify a runtime limit – max 7 days permitted.
  • 5GB per core by default.
  • Jobs will currently run on an Intel Haswell node by default.
  • During initial cluster testing, only one node (24 cores) is available!
  • An Intel Skylake node is also available for “old scratch” file transfers (see below).
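
As noted in the first bullet above, the serial partition accepts serial job arrays. A minimal sketch of a 10-task, 1-core job array (the input file naming and app name are hypothetical):

#!/bin/bash --login
#SBATCH -p serial  # Partition is required
#SBATCH -t 1-0     # Wallclock limit per array task (1 day)
#SBATCH -a 1-10    # Run 10 array tasks, numbered 1 to 10

module purge
module load apps/some/example/1.2.3

# Each array task processes a different input file (naming is hypothetical)
serialapp.exe data.$SLURM_ARRAY_TASK_ID.in
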
Optional Resources   Node type                                 Additional usage guidance
-C scratch-old       6GB/core Intel Skylake node (32 cores)    The old scratch f/s (from CSF3) is available at /scratch-old/$USER.
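
For example, a sketch of a file-transfer job using the optional resource above (the source and destination directory names are assumptions – check the filesystems documentation for the correct target on the upgraded system):

#!/bin/bash --login
#SBATCH -p serial        # Partition is required
#SBATCH -t 1-0           # Wallclock limit (1 day)
#SBATCH -C scratch-old   # Run on the Intel Skylake node with the old scratch f/s

# Copy results from the old CSF3 scratch to your home area (paths are assumptions)
cp -r /scratch-old/$USER/myresults ~/myresults-from-old-scratch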
