
Page Contents
- Batch Commands
- Further Information

Slurm Batch Commands (sbatch, squeue, scancel, sacct)

Batch Commands

Your applications should be run in the batch system. You’ll need a jobscript (a plain text file) describing your job – its CPU, memory and possibly GPU requirements, and also the commands you actually want the job to run.

Further details on how to write jobscripts are in the sections on serial jobs, parallel jobs, job-arrays and GPU jobs.
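As a rough illustration, a minimal serial jobscript might look like the sketch below. The partition name serial is borrowed from the example squeue output later on this page, the one-hour limit is arbitrary, and the echo line is only a placeholder for your own commands, so treat all three as assumptions to adapt.

```shell
#!/bin/bash
# Partition to run in (assumed name, check which partitions you can use)
#SBATCH -p serial
# Wallclock limit in hours:minutes:seconds (arbitrary value for this sketch)
#SBATCH -t 01:00:00

# Placeholder for the commands you actually want the job to run.
# SLURM_JOB_ID is set by Slurm inside a running job; the fallback text
# is printed if the script is run outside the batch system.
echo "Job ${SLURM_JOB_ID:-unknown} started in ${PWD}"
```

The #SBATCH lines are ordinary comments as far as bash is concerned, so they are only acted on when the file is passed to sbatch.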

You’ll then use one or more of the following batch system commands to submit your job to the system and check on its status. These commands should be run from the CSF’s login nodes.

Job submission

sbatch jobscript

Submit a job to the batch system, usually by submitting a jobscript. Alternatively you can specify job options on the sbatch command-line. We recommend using a jobscript because this allows you to easily reuse your jobscript every time you want to run the job. Remembering the command-line options you used (possibly months ago) is much more difficult.

The sbatch command will return a unique job-ID number if it accepts the job. You can use this in other commands (see below) and, when requesting support about a job, you should include this number in the details you send in.

For example, when submitting a job you will see a message similar to:

[mabcxyz1@login1[csf3] ~]$ sbatch myjobscript
Submitted batch job 373

For scripting purposes, you may prefer just to receive the jobid number from the sbatch command. Add the --parsable flag to achieve this:

sbatch --parsable myjobscript
12345
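In a script you would typically capture that number in a variable. The sketch below shows one way to do it; submit_and_record is a hypothetical helper name, not a CSF or Slurm command. With --parsable, sbatch may append a semicolon and a cluster name after the job ID, so the helper keeps only the part before any semicolon.

```shell
#!/bin/bash
# submit_and_record: hypothetical helper that submits a jobscript and
# prints only the numeric job ID.
submit_and_record() {
    local script="$1" out jobid
    # Stop with a non-zero status if sbatch rejects the job
    out=$(sbatch --parsable "$script") || return 1
    # --parsable can print "jobid;clustername"; keep the part before ';'
    jobid="${out%%;*}"
    echo "$jobid"
}

# Example use (commented out so that sourcing this file submits nothing):
#   jobid=$(submit_and_record myjobscript) && echo "Submitted job $jobid"
```

The captured ID can then be passed to the other commands described on this page, such as scancel or sacct.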

When submitting a job, you may see one of the following errors. Each one means the job was rejected and your jobscript needs a change:

sbatch: error: Batch job submission failed: No partition specified or system default partition

You must specify a partition, even for serial jobs. Add to your jobscript: #SBATCH -p partitionname.

sbatch: error: Batch job submission failed: Requested time limit is invalid (missing or exceeds some limit)

You must specify a “wallclock” time limit for your job. The maximum permitted is usually 7 days (or 4 days for GPU and HPC Pool jobs). Add to your jobscript: #SBATCH -t timelimit.
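Both errors can be caught before submission with a quick check of the jobscript. The sketch below is one possible approach; check_jobscript is a hypothetical helper, and it only inspects #SBATCH lines, so it will not notice options given on the sbatch command line.

```shell
#!/bin/bash
# check_jobscript: hypothetical helper, not a CSF or Slurm command.
# Returns 0 if the jobscript sets both a partition and a time limit
# via #SBATCH lines, and 1 otherwise.
check_jobscript() {
    local script="$1" status=0
    # Accept the short (-p) or long (--partition) form
    if ! grep -Eq '^#SBATCH[[:space:]]+(-p|--partition[=[:space:]])' "$script"; then
        echo "$script: no partition set (add: #SBATCH -p partitionname)" >&2
        status=1
    fi
    # Accept the short (-t) or long (--time) form
    if ! grep -Eq '^#SBATCH[[:space:]]+(-t|--time[=[:space:]])' "$script"; then
        echo "$script: no time limit set (add: #SBATCH -t timelimit)" >&2
        status=1
    fi
    return "$status"
}

# Example use (commented out so that sourcing this file submits nothing):
#   check_jobscript myjobscript && sbatch myjobscript
```

Passing this check does not guarantee that sbatch accepts the job, since the partition name or time limit could still be invalid.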

Job Status

squeue

Report the current status of your jobs in the batch system (queued/waiting, running, completing). Note that if you see no jobs listed when you run squeue, you have no jobs in the system: they have all finished or you haven’t submitted any!

Some examples:

In this example squeue returns no output which means you have no jobs in the queue, either running or waiting:

[mabcxyz1@login1[csf3] ~]$ squeue
[mabcxyz1@login1[csf3] ~]$

In this example squeue shows we have two jobs running (one using 1 core, the other using 8 cores) and one job waiting (it will use 4 cores when it runs):

[mabcxyz1@login1[csf3] ~]$ squeue
JOBID PRIORITY  PARTITION NAME         USER     ST SUBMIT_TIME    START_TIME     TIME NODES CPUS NODELIST(REASON)
  372 0.0000005 multicore mymulticore  mabcxyz1 R  08/03/25 13:02 08/03/25 13:32 2:04     1    8 node1260
  371 0.0000005 serial    simple.x     mabcxyz1 R  09/03/25 14:58 09/03/25 15:02 8:22     1    1 node603
  403 0.0000003 himem     mypythonjob  mabcxyz1 PD 11/03/25 09:25 N/A            0:00     1    4 (Resources)
   #                          #                 #                 ###                          #
   #                          #                 #                  #                           # Number of
   #                          #                 #                  #                           # CPU cores
   #                          #                 #                  #
   #                          #                 #                  # If running: date & time the job started
   #                          #                 #                  # If waiting: N/A
   #                          #                 #
   #                          #                 # R   - job is running
   #                          #                 # PD  - job is queued waiting
   #                          #                 # CG  - Completing (contact us, may indicate an error)
   #                          #
   #                          # Usually the name of your jobscript
   #
   # Every job is given a unique job ID number
   # (please tell us this number if requesting support)
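If you have many jobs, a per-state count can be easier to read than the full listing. The sketch below is one way to produce it; job_state_counts is a hypothetical helper name. It relies on the standard squeue options -h (no header), -u (restrict to one user) and -o with the %t column code (compact state).

```shell
#!/bin/bash
# job_state_counts: hypothetical helper that prints one line per job
# state with the number of your jobs in that state, e.g. "PD 1".
job_state_counts() {
    # -h: no header line; -u: only this user's jobs;
    # -o '%t': print only the compact state code (R, PD, CG, ...)
    squeue -h -u "${USER:-$(id -un)}" -o '%t' |
        awk '{ n[$1]++ } END { for (s in n) print s, n[s] }' |
        sort
}

# Example use (commented out so that sourcing this file runs nothing):
#   job_state_counts
```

The helper prints nothing at all when you have no jobs in the system.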

If your job is queued you might see one of the following reasons:

(QOSMaxCpuPerUserLimit) = You have reached a global limit for the type of resources you need.

(Resources) = There is currently not enough of a resource free to run your job, e.g. a 168-core job needs a whole AMD node and none are available.

(MaxCpuPerAccount) = The group you are a part of has reached a global limit.

(ReqNodeNotAvail, Reserved for maintenance) = The resources you have requested have been flagged for maintenance and as such are temporarily unavailable for your job. Your job will queue until the resource becomes available again. For significant/lengthy maintenance work we will always advise all users in advance by email.
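To list only your waiting jobs along with their reasons, something like the following sketch may help. pending_reasons is a hypothetical helper name; it uses the standard squeue options -t PD (pending jobs only) and -o with the %i (job ID) and %r (reason) column codes.

```shell
#!/bin/bash
# pending_reasons: hypothetical helper that prints "jobid reason" for
# each of your jobs that is still waiting in the queue.
pending_reasons() {
    # -h: no header; -u: only this user's jobs; -t PD: pending jobs only
    squeue -h -u "${USER:-$(id -un)}" -t PD -o '%i %r'
}

# Example use (commented out so that sourcing this file runs nothing):
#   pending_reasons
```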

You can modify the list of fields (columns) output by the squeue command by setting the $SQUEUE_FORMAT or $SQUEUE_FORMAT2 environment variables. In fact, the default set of columns you see is given by the first variable, which is given a default value when you log in to the CSF. To see the value, run:

echo $SQUEUE_FORMAT
%.15i %9p %9P %15j %8u %2t %14V %14S %10M %.6D %.5C %R

For more information on the two SQUEUE_FORMAT env vars and the column codes you can use, run man squeue.
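As an illustration, the sketch below switches to a narrower set of columns for the current shell session. The particular columns and widths are an arbitrary choice for this example, not a CSF recommendation, and default_squeue_format is a hypothetical variable name used to remember the site default.

```shell
#!/bin/bash
# Remember the current value so it can be restored later in this session
default_squeue_format="${SQUEUE_FORMAT:-}"

# Columns: job ID, partition, job name, state, time used, time limit,
# and node list (or reason, if the job is waiting)
export SQUEUE_FORMAT="%.10i %9P %20j %2t %10M %10l %R"

# To go back to the previous layout:
#   export SQUEUE_FORMAT="$default_squeue_format"
```

The change lasts only for the shell in which it is made. A new login will pick up the site default again.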

gpusqueue / gpustat

To see a list of GPU jobs you have in the system, you can use the custom gpusqueue (also runnable using gpustat) command. This runs squeue but only shows GPU jobs and adds in some extra columns to show the types of GPUs requested:

# Show GPU job information
gpusqueue

April 2025: We are monitoring the system and will make tweaks as more and more people get onto the SLURM system and as more resources are added to it. Please bear with us whilst things get settled in.

Delete a Job

scancel jobid

Remove your job from the batch system early, either to terminate a running job before it finishes or to remove a queued job before it has started running.

Also use this if your job goes into an error state or you decide you don’t want a job to run.

Note that if your job is in the CG state, please leave it in the queue if requesting support. It is easier for us to diagnose the error if we can see the job. We may ask you to scancel the job once we have looked at it – there is usually no way to fix an existing job.

For example, maybe you realise you’ve given a job the wrong input parameters causing it to produce junk results. You don’t need to leave it running to completion (which might be hours or days). Instead you can kill the job using scancel. You need to know the job-ID number of the job:

[mabcxyz1@login1[csf3] ~]$ scancel 12345

The job will eventually be deleted (it may take a minute or two for this to happen). Use squeue to check your list of jobs.

To delete all of your jobs:

# Delete all of your jobs. Use with CAUTION!
[mabcxyz1@login1[csf3] ~]$ scancel -u $USER
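A less drastic variant is to cancel only the jobs that are still waiting and leave running jobs alone. The sketch below uses the standard scancel --state option for this; cancel_pending is a hypothetical helper name.

```shell
#!/bin/bash
# cancel_pending: hypothetical helper that cancels only your queued
# (pending) jobs. Running jobs are not affected.
cancel_pending() {
    scancel -u "${USER:-$(id -un)}" --state=PENDING
}

# Example use (commented out so that sourcing this file cancels nothing):
#   cancel_pending
```

As with the command above, check the output of squeue first so you know which jobs will be removed.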

Please also see the Deleting Job Arrays notes.

Get Info About Finished Jobs

sacct -j jobid: Advanced users. Once your job has finished you can use this command to get a summary of information for wall-clock time, max memory consumption and exit status amongst many other statistics about the job. This is useful for diagnosing why a job failed.
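The default sacct output does not include every statistic mentioned above, so it is common to ask for specific columns with --format. The sketch below is one possible selection; job_summary is a hypothetical helper name, and the field list is an assumption about what is most useful rather than a CSF recommendation.

```shell
#!/bin/bash
# job_summary: hypothetical helper that prints a short accounting
# summary for one job: state, exit code, wall-clock time and peak memory.
job_summary() {
    local jobid="$1"
    # Elapsed is the wall-clock time; MaxRSS is the peak memory use and
    # is reported on the job-step lines of the output
    sacct -j "$jobid" --format=JobID,JobName,State,ExitCode,Elapsed,MaxRSS
}

# Example use (commented out so that sourcing this file runs nothing):
#   job_summary 12345
```

Run man sacct to see the full list of field names that --format accepts.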

Further Information

Our own documentation throughout this site provides lots of examples of writing jobscripts and how to submit jobs. Slurm also comes with a set of comprehensive man pages. Some of the most useful ones are:

man sbatch
man squeue
man scancel
man sacct

Last modified on September 1, 2025 at 9:33 am by Pen Richardson