Job Arrays (Multiple similar jobs)

 

Please do not run jobarrays in the short environment, even if your tasks have a short runtime. There are not enough cores in short for jobarrays.

Why use a Job Array?

Suppose you wish to run a large number of almost identical jobs – for example you may wish to process a thousand different data files with the same application (e.g., processing 1000s of images with the same image-processing app). Or you may wish to run the same program many times with different arguments or parameters (e.g., to do a parameter sweep where you wish to find the best value of some variable.)

You may have used the Condor pool to do this (where idle PCs on campus are used to run your jobs overnight) but the CSF can also run these High Throughput Computing jobs.

How NOT to do it

The wrong way to do this would be to write a script (using Perl, Python or BASH for example) to generate all the required qsub jobscripts and then use another BASH script to submit them all (running qsub 1000s of times). This is not a good use of your time and it will do horrible things to the submit node (which manages the job queues) on a cluster. The sysadmins may kill such jobs to keep the system running smoothly!

Do not run qsub 100s, 1000s, … of times to submit 100s, 1000s, … of individual jobs. This will strain the batch system. If you are about to do this, STOP. You should be using a job array instead. Please read through the examples below. Contact its-ri-team@manchester.ac.uk if you require further advice.

The right way to do it

A much better way is to use an SGE Job Array. Simply put, a job array runs multiple copies (100s, 1000s, …) of your job in a way that places much less strain on the queue manager. You only write one jobscript and use qsub once to submit that job.

Your jobscript includes a flag to say how many copies of it should be run. Each copy of the job is given a unique task id. You use the task id in your jobscript to have each task do some unique work (e.g., each task processes a different data file or uses a different set of input parameters).

Using the unique task id in your jobscript is the key to writing a good job array script. You can be creative here – the task id can be used in many ways.

Below, we first describe how to submit an SGE job array that runs several serial (single core) tasks. We also show how to submit a job array that will run SMP (multicore) tasks. You can also submit job arrays that will run larger multi-node tasks. If you have access to the GPUs in the CSF, you can also submit jobarrays to the GPU nodes. The majority of this page then gives examples of how to use the task id in your jobscript in different ways.

We have users on the CSF who run job arrays with tens of thousands of tasks, so they are a very easy way of repeatedly running the same job on lots of different data files!

Job runtime

Each task in an array gets a maximum runtime of 7 days. A job array as a whole is not terminated at the 7 day limit: it will remain in the system until all of its tasks have completed.

Please note: Job-arrays are not permitted in the short area due to limited resources.

Job Array Basics

Here is a simple example of a job array – notice the #$ -t 1-1000 line and the use of the special $SGE_TASK_ID variable. These will both be explained below.

#!/bin/bash --login
#$ -cwd
#$ -t 1-1000          # A job-array with 1000 "tasks", numbered 1...1000
                      # NOTE: No #$ -pe line so each task will use 1-core by default.

./myprog -in data.$SGE_TASK_ID.dat -out results.$SGE_TASK_ID.dat
               #
               # My input files are named: data.1.dat, data.2.dat, ..., data.1000.dat
               # 1000 tasks (copies of this job) will run.
               # Task 1 will read data.1.dat, task 2 will read data.2.dat, ... 

Computationally, this is equivalent to 1000 individual queue submissions in which $SGE_TASK_ID takes the values 1, 2, 3, ..., 1000, and where input and output files have the task ID number in their name. Hence task 1 will read the file data.1.dat and write the results to results.1.dat. Task 2 will read data.2.dat and write the results to results.2.dat and so on, all the way up to task 1000.

The $SGE_TASK_ID variable is automatically set for you by the batch system when a particular task runs. Please note that for serial jobs you don’t use a PE setting.

To submit the job simply issue one qsub command:

qsub jobscript

where jobscript is the name of your script above.
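If your files use zero-padded numbers (for example data.0007.dat rather than data.7.dat), the task id can be padded inside the jobscript before it is used. The following is a minimal sketch under that assumption; the four-digit width and the file names are illustrative only, and the default value lets you try the lines on the login node where $SGE_TASK_ID is not set.

```bash
#!/bin/bash
# Sketch: build zero-padded file names from the task id.
# Outside the batch system SGE_TASK_ID is unset, so default to 7 for a dry run.
SGE_TASK_ID=${SGE_TASK_ID:-7}

# Pad to four digits: 7 -> 0007, 123 -> 0123
printf -v PADDED "%04d" "$SGE_TASK_ID"

INFILE="data.$PADDED.dat"
OUTFILE="results.$PADDED.dat"
echo "Task $SGE_TASK_ID would read $INFILE and write $OUTFILE"
```

In a real jobscript you would replace the echo line with your own program, passing $INFILE and $OUTFILE as arguments.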

Job Array Size Limit

The maximum number of tasks that can be requested in an array job on CSF3 is currently 75,000. For example:

#$ -t 1-75000         # Max job array size is currently 75000 on CSF3

If you request more than 75000 job array tasks you will receive the following error when trying to submit the job:

Unable to run job: job rejected: you tried to submit a job with more than 75000 tasks
Exiting.

Multi-core Job Array Tasks

Multi-core (SMP) tasks (e.g., OpenMP jobs) can also be run in jobarrays. Each task will run your program with the requested number of cores. Simply add a -pe option to the jobscript and then tell your program how many cores it can use in the usual manner (see parallel job submission). Please be aware that each task will be requesting the specified resources (number of cores). It may take longer for each task to get through the batch queue, depending on how busy the system is.

An example SMP array job is given below:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 4       # Each task will use 4 cores in this example
#$ -t 1-1000          # A job-array with 1000 "tasks", numbered 1...1000

# My OpenMP program will read this variable to get how many cores to use.
# $NSLOTS is automatically set to the number specified on the -pe line above.
export OMP_NUM_THREADS=$NSLOTS

./myOMPprog -in data.$SGE_TASK_ID.dat -out results.$SGE_TASK_ID.dat

Again, simply submit the job once using qsub jobscript.

Advantages of Job Arrays

Job arrays have several advantages over submitting 100s or 1000s of individual jobs. In both of the above cases:

  • Only one qsub command is issued (and only one qdel command would be required to delete all tasks).
  • The batch system will try to run many of your tasks at once (possibly hundreds simultaneously for serial job arrays, depending on the limit we have set according to demand on the system). So you get a lot more than one task running in parallel from just one qsub command. The system will churn through your tasks, running them as cores become free. Once every task has completed, the job is finished.
  • Only one entry appears to be queued (qw) in the qstat output for the job array, but each individual task running (r) will be visible. This makes reading your qstat output a lot easier than if you’d submitted 1000s of individual jobs.
  • The load on the SGE submit node (i.e., the cluster node responsible for managing the queues and scheduling which jobs run) is vastly less than that of submitting 1000 separate jobs.

There are many ways to use the $SGE_TASK_ID variable to supply a different input to each task and several examples are shown below.

More General Task ID Numbering

It is not necessary that $SGE_TASK_ID starts at 1; nor must the increment be 1. The general format is:

#$ -t start-end:increment
         #             #      
         #             # The default :increment is 1 if not supplied
         #
         # The start task id CANNOT be zero! It must be >=1

For example:

#$ -t 100-995:5

so that $SGE_TASK_ID takes the values 100, 105, 110, 115... 995.

Note: The $SGE_TASK_ID is not allowed to start at 0. The start value must be 1 or more.

Incidentally, in the case in which the upper-bound is not equal to the lower-bound plus an integer-multiple of the increment, for example

#$ -t 1-42:6     # Tasks will be numbered 1, 7, 13, 19, 25, 31, 37 !!

SGE automatically changes the upper bound, but this is only visible when you run qstat to check on the status of your job.

[username@hlogin2 [csf3] ~]$ qsub array.qsub
Your job-array 2642.1-42:6 ("array.qsub") has been submitted
                      #
                      # The qsub command simply reports what you requested in the jobscript

[username@hlogin2 [csf3] ~]$ qstat
job-ID   prior  name        user      state  submit/start at      queue    slots ja-task-ID
-------------------------------------------------------------------------------------------
2642    0.00000 array.qsub  simonh    qw     04/24/2014 12:29:29               1 1-37:6
                                                                                   #
                                                 # The qstat command now shows the #
                                                 # adjusted upper bound (37).

Note the 1-37 in the ja-task-ID column: the final task id has been adjusted to be 37 rather than 42. Hence the task ids used will be 1,7,13,19,25,31,37. Remember that the task id cannot start at zero so don’t be tempted to try #$ -t 0-42:6.
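You can work out the task ids and the adjusted upper bound yourself before submitting. The sketch below uses the standard seq command and some shell arithmetic on the login node; it does not talk to the batch system, and the variable names are our own.

```bash
#!/bin/bash
# Sketch: list the task ids that '#$ -t START-END:STEP' will produce.
START=1
END=42
STEP=6

# 'seq FIRST INCREMENT LAST' prints FIRST, FIRST+INCREMENT, ... without exceeding LAST.
# xargs joins the lines in to one space-separated line.
TASK_IDS=$(seq "$START" "$STEP" "$END" | xargs)
echo "Task ids: $TASK_IDS"

# The effective upper bound is the largest id that does not exceed END
LAST_ID=$(( START + ( (END - START) / STEP ) * STEP ))
NUM_TASKS=$(( (END - START) / STEP + 1 ))
echo "Effective range: $START-$LAST_ID:$STEP ($NUM_TASKS tasks)"
```

Knowing the number of tasks in advance is useful if your jobscript indexes in to a list or array of a fixed length.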

Ad-hoc Task ID Numbering

It is not possible to specify several ad-hoc task numbers – e.g., you CANNOT use: -t 3,7,25,26,50,51,52,100 to run ad-hoc tasks.

Instead, specify multiple single task-ids or small ranges on the qsub command-line:

# In the following, the -t on the qsub command-line will override the '#$ -t' in the jobscript:
qsub -t 3 myjobscript          # Run task 3
qsub -t 7 myjobscript          # Run task 7
qsub -t 25-26 myjobscript      # Run tasks 25 and 26
qsub -t 50-52 myjobscript      # Run tasks 50, 51 and 52
qsub -t 100 myjobscript        # Run task 100
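For a long list of ad-hoc task ids, typing the ranges by hand is error-prone. The following sketch, which is our own helper and not a feature of the batch system, collapses a sorted list of ids into consecutive ranges and prints the matching qsub commands. It only prints the commands so that you can check them before running anything.

```bash
#!/bin/bash
# Sketch: turn an ad-hoc list of task ids into the fewest 'qsub -t' commands.
# The ids must be listed in ascending order with no duplicates.
IDS=( 3 7 25 26 50 51 52 100 )
JOBSCRIPT=myjobscript      # Name taken from the example above

RANGES=()
start=${IDS[0]}
prev=${IDS[0]}
for id in "${IDS[@]:1}"; do
    if (( id == prev + 1 )); then
        prev=$id                      # Still inside a consecutive run
    else
        # The run has ended: record it as N or N-M
        if (( start == prev )); then RANGES+=( "$start" ); else RANGES+=( "$start-$prev" ); fi
        start=$id
        prev=$id
    fi
done
# Record the final run
if (( start == prev )); then RANGES+=( "$start" ); else RANGES+=( "$start-$prev" ); fi

# Print the commands rather than running them, so they can be checked first
for r in "${RANGES[@]}"; do
    echo "qsub -t $r $JOBSCRIPT"
done
```

With the ids shown, this prints the same five qsub commands as listed above.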

Related Environment Variables

There are three more automatically created environment variables one can use, as illustrated by this simple qsub script:

#!/bin/bash --login
#$ -cwd 
#$ -t 1-37:6          # Tasks will be numbered 1, 7, 13, 19, 25, 31, 37

# This will report that we requested an increment of 6
echo "The ID increment is: $SGE_TASK_STEPSIZE"

# These should be used with caution (see below for explanation)
if [[ $SGE_TASK_ID == $SGE_TASK_FIRST ]]; then
    echo "first"
elif [[ $SGE_TASK_ID == $SGE_TASK_LAST ]]; then
    echo "last"
else
    echo "neither - I am task $SGE_TASK_ID"
fi

Note that the batch system will try to start your tasks in numerical order but there is no guarantee that they will finish in the same order, because some tasks may take longer to run than others. So you cannot rely on the task with id $SGE_TASK_LAST being the last task to finish. Hence do not try something like:

# DO NOT do this in your jobscript - we may not be the last task to finish!
if [[ $SGE_TASK_ID == $SGE_TASK_LAST ]]; then
  # Archive output files from all tasks (output.1, output.2, ...).
  tar czf ~/scratch/all-my-results.tgz output.*
    #
    # BAD: we may not be the last task to finish just because we are the last
    # BAD: task id. Hence we may miss some output files from other tasks that
    # BAD: are still running.
fi

The correct way to do something like this (where the work carried out by a task is dependent on other tasks having finished) is to use a job dependency which uses two separate jobs and automatically runs the second job only when the first job has completely finished. You would generally only use the $SGE_TASK_FIRST and $SGE_TASK_LAST variables where you wanted those tasks to do something different but where they are still independent of the other tasks.

Job output files

Please read: a possible downside of job arrays is that every task generates its own jobname.oNNNNNN.TTT and jobname.eNNNNNN.TTT job output files. You get a .o and .e file for every task in your job array! If you have a large job array, you might end up with tens of thousands of files in the directory (folder) from where you submitted the job. This can make managing your files very difficult.

For example, if you submit the following job array named myjobarray:

#!/bin/bash --login
#$ -cwd
#$ -t 1-5000
module load ...
theapp ...

Then it will generate the following output files:

# The job array created 10,000 output files! This is difficult to manage and can slow down your job!
myjobarray.o12345.1
myjobarray.o12345.2
...
myjobarray.o12345.5000
myjobarray.e12345.1
myjobarray.e12345.2
...
myjobarray.e12345.5000
              #     #
              #     # taskid number
              #
              # Jobid number

As you can see, there will be 10,000 files in total (5000 .o files and 5000 .e files). Any output from your job will be captured in these files. This at least ensures that no output will be lost and that the output from one task will never overwrite the output from another task.

When a folder contains 1000s of files it can be slow to list all of the files and to also read and write other files. Hence, you will be slowing down your own job if you allow a folder to contain 1000s of files!

But it is often the case that the .e files are empty. For example:

ls -l myjobarray.e12345.1
-rw-r--r-- 1 mabcxyz1 xy01 0 Aug 30 17:16 myjobarray.e12345.1
                           #
                           # This column shows the size, in this case 0 bytes means the file is empty

To reduce the number of files generated there are two things you can do:

  1. The first thing to do is join the .o and .e file for each task in to one file (the .o file) by adding the #$ -j y flag:
    #!/bin/bash --login
    #$ -cwd
    #$ -t 1-5000
    #$ -j y        # Yes, join the .o and .e file in to just the .o file.
    ...
    

    This will then cause the job to generate just the .o files:

    # The .e has been joined in to the .o file so now we "only" have 5000 output files (still a lot!)
    myjobarray.o12345.1
    myjobarray.o12345.2
    ...
    myjobarray.o12345.5000
    

    Any error messages that would have gone to the .e files will instead be captured by the .o files, so you won’t lose any output by joining the files.

  2. A second thing to do is decide whether you really need the .o and .e files at all. If your app writes its data to a different file then the .o and .e files will often both be empty. In this case, you can completely disable the output of both files:
    #!/bin/bash --login
    #$ -cwd
    #$ -t 1-5000
    #$ -o /dev/null    # No .o files will be generated
    #$ -e /dev/null    # No .e files will be generated
    # You should check that your app will write its results to some other file!
    # For example:
    theapp -in data.$SGE_TASK_ID -out results.$SGE_TASK_ID
    

    The use of the special /dev/null filename for the .o and .e filename prevents the job from creating any such output files.

Remember, managing your files when you have tens of thousands of files in a single directory (folder) can be difficult and you will often slow down your own job if you allow a lot of files to be created. It can also lead to the whole filesystem performance being impacted which then affects other users of the service.
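To help decide between joining the files and disabling them, you can check how many .e files from an earlier, smaller run were actually empty. This is a minimal sketch using the standard find command. The job name and job id are the placeholder values from the listing above, and the demonstration files are created in a temporary folder purely so that the lines can be tried safely.

```bash
#!/bin/bash
# Sketch: count how many .e files from a finished job array are empty.
# JOBNAME and JOBID are placeholders; the demo files below stand in for real output.
JOBNAME=myjobarray
JOBID=12345

# For demonstration only: a scratch folder with three empty .e files and one non-empty
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/$JOBNAME.e$JOBID.1" "$DEMO_DIR/$JOBNAME.e$JOBID.2" "$DEMO_DIR/$JOBNAME.e$JOBID.3"
echo "some error message" > "$DEMO_DIR/$JOBNAME.e$JOBID.4"

# '-size 0' matches only files that contain no data
EMPTY=$(find "$DEMO_DIR" -maxdepth 1 -name "$JOBNAME.e$JOBID.*" -size 0 | wc -l | tr -d ' ')
TOTAL=$(find "$DEMO_DIR" -maxdepth 1 -name "$JOBNAME.e$JOBID.*" | wc -l | tr -d ' ')
echo "$EMPTY of $TOTAL .e files are empty"

# Remove the demonstration folder
rm -rf "$DEMO_DIR"
```

For your own job you would point find at the directory you submitted from instead of the temporary folder.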

Examples

We now show example job scripts which use the job array environment variables in various ways. All of the examples below are serial jobs (each task uses only one core) but you could equally use multicore (smp) jobs if your code/executable supports multicore. You should adapt these examples to your own needs.

A List of Input Files

Suppose we have a list of input files, rather than input files explicitly indexed by a number. For example, your input files may have names such as:

C2H4O.dat
NH3.dat
C6H8O7.dat
PbCl2.dat
H2O.dat
... and so on ...

The files do not have the convenient 1, 2, 3, .... number sequence in their name. So how do we use a job array with these input files? We can put the names in to a simple text file, with one name per line (as above). We then ask the job array tasks to read a name from this master list of filenames. As you might expect, task number 1 will read the filename on line 1 of the master-list. Task number 2 will read the filename from line 2 of the master-list, and so on.

#!/bin/bash --login
#$ -cwd
#$ -t 1-42

# Task id 1 will read line 1 from my_file_list.txt
# Task id 2 will read line 2 from my_file_list.txt
# and so on...
# Each line contains the name of an input file to be used by 'my_chemistry_prog'

# Use some Linux commands to save the filename read from 'my_file_list.txt' to
# a script variable named INFILE that we can use in other commands.
INFILE=`awk "NR==$SGE_TASK_ID" my_file_list.txt`
    #
    # Can also use another linux tool named 'sed' to get the n-th line of a file:
    # INFILE=`sed -n "${SGE_TASK_ID}p" my_file_list.txt`
   
# We now use the value of our variable by using $INFILE.
# In task 1, $INFILE will be replaced with C2H4O.dat
# In task 2, $INFILE will be replaced with NH3.dat
# ... and so on ...

# Run the app with the .dat filename specific to this task
./my_chemistry_prog -in $INFILE -out result.$INFILE
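The master list and the number of tasks can both be generated rather than typed. The sketch below is one possible way to do it on the login node, assuming all of your input files end in .dat and live in one folder. It builds the list in a temporary folder for demonstration and prints the qsub command instead of running it.

```bash
#!/bin/bash
# Sketch: build the master list of input files and derive the '-t' range from it.
# For demonstration only: a scratch folder containing five .dat files
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/C2H4O.dat" "$DEMO_DIR/NH3.dat" "$DEMO_DIR/C6H8O7.dat" \
      "$DEMO_DIR/PbCl2.dat" "$DEMO_DIR/H2O.dat"

# Write one file name per line (names are relative to the folder holding the files)
(cd "$DEMO_DIR" && printf '%s\n' *.dat) > "$DEMO_DIR/my_file_list.txt"

# The number of lines is the number of tasks needed
NUM_TASKS=$(wc -l < "$DEMO_DIR/my_file_list.txt" | tr -d ' ')
echo "Submit with: qsub -t 1-$NUM_TASKS jobscript"

# Dry run of what task 2 would do inside the jobscript
SGE_TASK_ID=2
INFILE=$(awk "NR==$SGE_TASK_ID" "$DEMO_DIR/my_file_list.txt")
echo "Task $SGE_TASK_ID would process $INFILE"

# Remove the demonstration folder
rm -rf "$DEMO_DIR"
```

A -t given on the qsub command line overrides the #$ -t line in the jobscript, so the jobscript itself does not need editing when the number of input files changes.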

Bash Scripting and Arrays

Another way of passing different parameters to your application (e.g., to run the same simulation but with different input parameters) is to list all the parameters in a bash array and index in to the array. For example:

#!/bin/bash --login
#$ -cwd
#$ -t 1-10           # Only 10 tasks in this example!

# A bash array of my 10 input parameters
X_PARAM=( 3400 4500 9700 10020 20000 30000 40000 44400 50000 60910 )

# Bash arrays use zero-based indexing but you CAN'T use -t 0-9 above (0 is an invalid task id)
INDEX=$((SGE_TASK_ID-1))

# Run the app with one of the parameters
./myprog -xflag ${X_PARAM[$INDEX]} > output.${INDEX}.log

Running from Different Directories (simple)

Here we run each task in a separate directory (folder) that we create when each task runs. We run two applications from the jobscript – the first outputs to a file, the second reads that file as input and outputs to another file (your own applications may do something completely different). We run 1000 tasks, numbered 1…1000.

#!/bin/bash --login
#$ -cwd
#$ -t 1-1000

# Create a new directory for each task and go in to that directory
mkdir myjob-$SGE_TASK_ID
cd myjob-$SGE_TASK_ID

# Each task runs the same executables stored in the parent directory
../myprog-a.exe > a.output
../myprog-b.exe < a.output > b.output

In the above example all tasks use the same input and output filenames (a.output and b.output). This is safe because each task runs in its own directory.

Running from Different Directories (intermediate)

Here we use one of the techniques from above – read the names of folders (directories) we want to run in from a file. Task 1 will read line 1, task 2 reads line 2 and so on.

Create a simple text file (for example my_dir_list.txt) with each folder name you wish to run a job in listed on a new line.

We assume the file contains sub-directory names. For example, suppose we are currently working in a directory named ~/scratch/jobs/ (it is in our scratch directory). The subdirectories are named after some property such as:

s023nn/arun1206/
s023nn/arun1207/
s023nn/brun1208/
s023nx/brun1201/
s023nx/crun1731/

and so on - it doesn't really matter what the subdirectories are called.
Note that in this example we assume there are 500 lines (i.e. 500 directories)
in this file. This tells us how many tasks to run in the job array.

The jobscript reads a line from the above list and changes (cd) in to that directory:

#!/bin/bash --login
#$ -cwd                # Run from where we ran qsub
#$ -t 1-500            # Assuming my_dir_list.txt has 500 lines

# Task id 1 will read line 1 from my_dir_list.txt
# Task id 2 will read line 2 from my_dir_list.txt
# and so on...

# This time we use the 'sed' command but we could use the 'awk' command (see earlier).
# Task 1 will read line 1 of my_dir_list.txt, task 2 will read line 2, ...
# Assign the name of this task's sub-directory to a variable named 'SUBDIR'
SUBDIR=`sed -n "${SGE_TASK_ID}p" my_dir_list.txt`

# Go in to the sub-directory for this task by reading the value of the variable
cd $SUBDIR
  #
  # You could use the subdir name to form a longer path, for example:
  # cd ~/scratch/myjobs/medium_dataset/$SUBDIR

# Run our code. Each sub-directory contains a file named input.dat. 
./myprog -in input.dat

The above script assumes that each subdirectory contains a file named input.dat which we process.
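Because a task will fail if its directory or input file is missing, it can be worth checking the whole list before submitting. This is a sketch of such a check, run on the login node. The folder layout is created in a temporary location for demonstration, and one directory is deliberately left without an input.dat to show what is reported.

```bash
#!/bin/bash
# Sketch: check every directory in the list before submitting the job array.
# For demonstration only: two valid sub-directories and one that lacks input.dat
DEMO_DIR=$(mktemp -d)
mkdir -p "$DEMO_DIR/s023nn/arun1206" "$DEMO_DIR/s023nn/arun1207" "$DEMO_DIR/s023nx/brun1201"
touch "$DEMO_DIR/s023nn/arun1206/input.dat" "$DEMO_DIR/s023nx/brun1201/input.dat"
printf '%s\n' s023nn/arun1206/ s023nn/arun1207/ s023nx/brun1201/ > "$DEMO_DIR/my_dir_list.txt"

LINE=0
BAD_LINES=""
while IFS= read -r subdir; do
    LINE=$((LINE + 1))
    # Line number N in the list corresponds to task id N
    if [[ ! -f "$DEMO_DIR/$subdir/input.dat" ]]; then
        echo "Task $LINE: no input.dat in $subdir"
        BAD_LINES="$BAD_LINES $LINE"
    fi
done < "$DEMO_DIR/my_dir_list.txt"

echo "Checked $LINE directories; problems on lines:${BAD_LINES:- none}"

# Remove the demonstration folder
rm -rf "$DEMO_DIR"
```

The final count also tells you the upper bound to use on the #$ -t line.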

Running from Different Directories (advanced)

This example runs the same code but from different directories. Here we expect each directory to contain an input file. You can name your directories (and subdirectories) appropriately to match your experiments. We use BASH scripting to index in to arrays giving the names of directories. This example requires some knowledge of BASH but it should be straightforward to modify for your own work.

In this example we have the following directory structure (use whatever names are suitable for your code)

  • 3 top-level directories named: Helium, Neon, Argon
  • 2 mid-level directories named: temperature, pressure
  • 4 bottom-level directories named: test1, test2, test3, test4

So the directory tree looks something like:

|
+---Helium---+---temperature---+---test1
|            |                 +---test2
|            |                 +---test3
|            |                 +---test4
|            |
|            +------pressure---+---test1
|                              +---test2
|                              +---test3
|                              +---test4
|
+-----Neon---+---temperature---+---test1
|            |                 +---test2
|            |                 +---test3
|            |                 +---test4
|            |
|            +------pressure---+---test1
|                              +---test2
|                              +---test3
|                              +---test4
|
+----Argon---+---temperature---+---test1
             |                 +---test2
             |                 +---test3
             |                 +---test4
             |
             +------pressure---+---test1
                               +---test2
                               +---test3
                               +---test4

Hence we have 3 x 2 x 4=24 input files all named myinput.dat in paths such as

$HOME/scratch/chemistry/Helium/temperature/test1/myinput.dat
...
$HOME/scratch/chemistry/Helium/temperature/test4/myinput.dat
$HOME/scratch/chemistry/Helium/pressure/test1/myinput.dat
...
$HOME/scratch/chemistry/Neon/temperature/test1/myinput.dat
...
$HOME/scratch/chemistry/Argon/pressure/test4/myinput.dat

The following jobscript will run the executable mycode.exe in each path (so that we process all 24 input files). In this example the code is a serial code (hence no PE is specified).

#!/bin/bash --login
#$ -cwd

# This creates a job array of 24 tasks numbered 1...24 (IDs can't start at zero)
#$ -t 1-24

# Subdirectories will all have this common root (saves me some typing)
BASE=$HOME/scratch/chemistry

# Path to my executable
EXE=$BASE/exe/mycode.exe

# Arrays giving subdirectory names (note no commas - use spaces to separate)
DIRS1=( Helium Neon Argon )
DIRS2=( temperature pressure )
DIRS3=( test1 test2 test3 test4 )

# BASH script to get length of arrays
NUMDIRS1=${#DIRS1[@]}
NUMDIRS2=${#DIRS2[@]}
NUMDIRS3=${#DIRS3[@]}
TOTAL=$(( NUMDIRS1 * NUMDIRS2 * NUMDIRS3 ))
echo "Total runs: $TOTAL"

# Remember that $SGE_TASK_ID will be 1, 2, 3, ... 24.
# BASH array indexing starts from zero so decrement.
TID=$(( SGE_TASK_ID - 1 ))

# Create indices in to the above arrays of directory names.
# The first index increments the slowest, then the middle index, and so on.
IDX1=$(( TID / (NUMDIRS2 * NUMDIRS3) ))
IDX2=$(( (TID / NUMDIRS3) % NUMDIRS2 ))
IDX3=$(( TID % NUMDIRS3 ))

# Index in to the arrays of directory names to create a path
JOBDIR=${DIRS1[$IDX1]}/${DIRS2[$IDX2]}/${DIRS3[$IDX3]}

# Echo some info to the job output file
echo "Running SGE_TASK_ID $SGE_TASK_ID in directory $BASE/$JOBDIR"

# Finally run my executable from the correct directory
cd $BASE/$JOBDIR
$EXE < myinput.dat > myoutput.dat

You may not need three levels of subdirectories and you’ll want to edit the names (BASE, EXE, DIRS1, DIRS2, DIRS3) and change the number of tasks requested.

To submit your job simply use qsub myjobscript.sh, i.e., you only submit a single jobscript.
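Index arithmetic like this is easy to get wrong when you change the number of directories, so it helps to print the mapping from task id to directory before submitting. The sketch below repeats the same arithmetic inside a small function (task_dir is our own name) and loops over every task id on the login node. Nothing is executed in the directories.

```bash
#!/bin/bash
# Sketch: print which directory each task id maps to, without running anything.
DIRS1=( Helium Neon Argon )
DIRS2=( temperature pressure )
DIRS3=( test1 test2 test3 test4 )
N2=${#DIRS2[@]}
N3=${#DIRS3[@]}
TOTAL=$(( ${#DIRS1[@]} * N2 * N3 ))

# Same arithmetic as the jobscript, wrapped in a function so it can be looped over
task_dir() {
    local tid=$(( $1 - 1 ))           # Task ids start at 1, array indices at 0
    local i1=$(( tid / (N2 * N3) ))   # Slowest-changing index
    local i2=$(( (tid / N3) % N2 ))
    local i3=$(( tid % N3 ))          # Fastest-changing index
    echo "${DIRS1[$i1]}/${DIRS2[$i2]}/${DIRS3[$i3]}"
}

for ((id = 1; id <= TOTAL; id++)); do
    echo "Task $id -> $(task_dir "$id")"
done
```

Every directory should appear exactly once in the output, with task 1 in Helium/temperature/test1 and task 24 in Argon/pressure/test4.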

Running MATLAB (lock file error)

If you wish to run compiled MATLAB code in a job array please see MATLAB job arrays (CSF documentation) for details of an extra environment variable needed to prevent a lock-file error. This is a problem in MATLAB when running many instances at the same time, which can occur if running from a job array.

Our MATLAB documentation also contains an example of passing the $SGE_TASK_ID value to your MATLAB code.

Limit the number of tasks to be run at the same time

By default the batch system will attempt to run as many tasks as possible concurrently. If you do not want this to happen you can limit how many tasks can be running at the same time with the -tc option. For example to limit it to 5 tasks:

#$ -tc 5

Job Dependencies with Job Arrays

It is possible to make a job wait for an entire job array to complete or to make the tasks of a job array wait for the corresponding task of another job array. It is also possible to make the tasks within a job array wait for other tasks in the same job array (although this is limited). Examples are now given.

In the following examples we name each job using the -N flag to make the text more readable. This is optional. If you don’t name a job you should use the Job ID number when referring to previous jobs.

Wait for entire job array to finish

Suppose you want JobB to wait until all tasks in JobA have finished. JobB can be an ordinary job or another job array. But it will not run until all tasks in the job array JobA have finished. This is useful where you need to do something with the results of all tasks from a job array. Using a job dependency is the correct way to ensure that all tasks in a job array have completed (using the last task in a job array to do some extra processing is incorrect because not all tasks may have finished even if they have all started).

------------------- Time --------------------->
JobA.task1 -----> End   |
JobA.task2 --------> End|                     # JobA tasks can run in parallel
 ...                    |
   JobA.taskN ----> End |
                        |JobB -----> End      # JobB won't start until all of
                        |                     # JobA's tasks have finished

Here is the jobscript for JobB – we use the -hold_jid flag to give the name of the job we should wait for.

#!/bin/bash --login
#$ -cwd
#$ -N JobB
#$ -hold_jid JobA      # We will wait for all of JobA's tasks to finish
./myapp.exe

Submit the jobs in the expected order and JobB will wait for JobA to finish.

qsub jobscript_a                   # The job-array
qsub jobscript_b                   # The job that will wait for the job-array to finish before starting

Wait for individual tasks to finish

Suppose you have two job arrays to run, both with the same number of tasks. You want task 1 from JobB to run after task 1 from JobA has finished. Similarly you want task 2 from JobB to run after task 2 from JobA has finished. And so on. This allows you to pipeline tasks but still have them run independently and in parallel with other tasks.

----------------------- Time ----------------------->
JobA.task1 -----> End JobB.task1 ------> END
JobA.task2 -------> End JobB.task2 -------> END      # Tasks can run in parallel. JobA tasks
 ...                                                 # and JobB tasks form pipelines.
   JobA.taskN -----> End JobB.taskN ----> END

Use the -hold_jid_ad flag to set up an array dependency. Here are the two jobscripts:

JobA’s jobscript:

#!/bin/bash --login
#$ -cwd
#$ -N JobA            # We are JobA
#$ -t 1-20            # 20 Tasks in this example

./myAapp.exe data.$SGE_TASK_ID.in > data.$SGE_TASK_ID.A_result

JobB’s jobscript:

#!/bin/bash --login
#$ -cwd
#$ -N JobB            # We are JobB
#$ -t 1-20            # 20 Tasks in this example (must be same as JobA)
#$ -hold_jid_ad JobA  # JobB.task1 waits only for JobA.task1 to finish and so on...
                      # (the _ad means array dependency)

./myBapp.exe data.$SGE_TASK_ID.A_result > data.$SGE_TASK_ID.B_result

Submit both jobs:

qsub jobscript_a
qsub jobscript_b

Tasks wait within a job array

It is possible to make the tasks within a jobarray wait for earlier tasks within the same jobarray. This is generally not recommended because it removes one of the advantages of job arrays – the ability to run independent jobs in parallel so that you get your results sooner. However, it does provide a method of easily submitting a large number of jobs where job N must wait for job N-1 to finish. We use the -tc flag to control the number of concurrent tasks that the job array can execute.

------------------------------- Time --------------------------------->
JobA.task1 -----> End
                     JobA.task2 -----> End
                                          ...
                                                  JobA.taskN -----> End

The -tc N allows the job array to run N tasks in the job array at the same time. Without the flag the batch system will attempt to run as many tasks as possible concurrently. If you set this to 1 you effectively make each task wait for the previous task to finish:

#!/bin/bash --login
#$ -cwd
#$ -t 1-10
#$ -tc 1     # Run only one task at a time
./myApp.exe data.$SGE_TASK_ID.in > data.$SGE_TASK_ID.out

Capturing the Job ID

When submitting a job-array, the job ID returned by the -terse flag includes some extra information about the range of task ids and the increment of the task counter:

# When submitting a job-array, info about the number of tasks and task increment is returned
qsub -terse -t 1-100 jobscript-array1.sh
129674.1-100:1

# To capture only the jobid, use the cut command to remove the extra info:
qsub -terse -t 1-100 jobscript-array1.sh | cut -d. -f1
129674

You can use the job ID of the job array to programmatically make a second job wait for the earlier job. For example:

# Submit a job array and capture its jobid
JID=$(qsub -terse -t 1-100 jobscript-array1.sh | cut -d. -f1)

# Now submit a normal job that waits for all tasks in the job array to finish
qsub -hold_jid $JID second-job.sh
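If you also want the task range or increment, the terse output can be split with bash parameter expansion instead of cut. The sketch below works on the sample string shown above rather than calling qsub, so it can be tried anywhere. The variable names are our own.

```bash
#!/bin/bash
# Sketch: split the 'qsub -terse' output for a job array using bash parameter expansion.
# TERSE_OUT would normally come from: TERSE_OUT=$(qsub -terse -t 1-100 jobscript-array1.sh)
TERSE_OUT="129674.1-100:1"     # Sample value from the example above

JID=${TERSE_OUT%%.*}           # Everything before the first '.'  -> 129674
TASK_SPEC=${TERSE_OUT#*.}      # Everything after the first '.'   -> 1-100:1
TASK_RANGE=${TASK_SPEC%%:*}    # Drop the ':increment' part       -> 1-100
STEP=${TASK_SPEC##*:}          # Keep only the increment          -> 1

echo "Job id: $JID, tasks: $TASK_RANGE, increment: $STEP"
```

For an ordinary (non-array) job the terse output contains no '.', so JID is left unchanged and only that variable is meaningful.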

Deleting Job Arrays

Deleting All Tasks

An entire job array can be deleted with one command. This will delete from the batch system all tasks that are running or are yet to run:

qdel 18305
     #
     # replace 18305 with your own job id number

Deleting Specific Tasks

Alternatively, it is possible to delete specific tasks while leaving other tasks running or in the queue waiting to run. You may wish to change some input files for these tasks, for example, or it might be because specific tasks have crashed and you need to delete them and then resubmit them. Simply add the -t taskrange flag to qdel where taskrange gives the tasks to delete. You must always give the job id followed by the tasks. For example:

  • To delete a single task (id 30) from a job (id 18205)
    qdel 18205 -t 30
    
  • To delete tasks 200-300 inclusive from a job (id 18205)
    qdel 18205 -t 200-300
    
  • To delete tasks 3,7,25,26,50,51,52,100 from a job (id 18205)
    qdel 18205 -t 3
    qdel 18205 -t 7
    qdel 18205 -t 25-26
    qdel 18205 -t 50-52
    qdel 18205 -t 100
    

    Note that it is not possible to use -t 3,7,25,26,50,51,52,100 to delete ad-hoc tasks.

Resubmitting Deleted/Failed Tasks

If you need to resubmit specific task ids that have failed (for whatever reason, e.g., the app you were running crashed or there was a hardware problem), you can specify those specific task id numbers on the qsub command-line to override the range of task ids in the jobscript. For example, to resubmit the eight ad-hoc tasks 3,7,25,26,50,51,52,100:

# The -t flag on the command-line will override any #$ -t start-finish line in the jobscript.
# You do not need to edit your jobscript to remove the #$ -t line.
qsub -t 3 jobscript
qsub -t 7 jobscript
qsub -t 25-26 jobscript
qsub -t 50-52 jobscript
qsub -t 100 jobscript

Note that it is not possible to use -t 3,7,25,26,50,51,52,100 to submit ad-hoc tasks in one go.
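
Working out which task ids need resubmitting can also be scripted. The sketch below assumes, purely for illustration, that each task writes a file named result.<taskid> when it succeeds; that naming scheme and the helper name missing_tasks are hypothetical and should be adapted to whatever your own jobscript produces. The loop prints the qsub commands rather than running them.

```shell
#!/bin/bash
# List the task ids in first..last whose expected output file is missing
# or empty. The result.<taskid> file name is an assumption for illustration.
missing_tasks() {
  local first=$1 last=$2 dir=${3:-.} id
  for (( id = first; id <= last; id++ )); do
    # -s is true only if the file exists and is not empty
    [[ -s "$dir/result.$id" ]] || echo "$id"
  done
}

# Dry run: print one qsub command per missing task of a 1-100 job array.
for id in $(missing_tasks 1 100); do
  echo qsub -t "$id" jobscript
done
```

If many tasks are missing, group consecutive ids into ranges (for example qsub -t 25-26 jobscript) so that qsub is run as few times as possible.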

Email from Jobarrays

Aug 2021: Please note a change of policy

It is no longer possible to submit job-arrays that will send emails from each task when there are more than 20 tasks in the job-array. This is to protect the University mail routers which have recently blocked the CSF from sending emails due to large job-arrays sending 1000s of emails.

To receive an email when an entire job array has completed, please see the section below on using a job-dependency to send an email after a job array.

As with ordinary batch jobs, it is possible to have the job email you when it begins, ends or aborts due to an error. Unfortunately, with a job array every task sends its own emails, so the number of emails grows with the number of tasks. If this is what you require, and your job array is within the 20-task limit described above, please add the following to your jobscript:

#$ -M your.name@manchester.ac.uk       # Can use any address
#$ -m bea
      #
      # b = email when each job array task begins running
      # e = email when each job array task ends
      # a = email when each job array task aborts
      #
      # You can specify any one or more of b, e, a

Please note we do not recommend the above as it can cause issues. It isn’t possible to have the job array email you only when the last task has finished. However, it is possible to do something very similar to this, as follows:

Email during last task

You can manually email yourself from the job array task with the highest (last) task id. There is no guarantee this will be the last task to finish (other tasks that started earlier may run for longer and finish later), but it will be the last task to start, which in most cases is close enough. Do this as follows:

#$ -t 1-100

# Run some application in our jobscript
./my_app.exe data.$SGE_TASK_ID

# Send email at end of last task. Will usually be
# close enough to being the last task to finish.
if [[ $SGE_TASK_ID == $SGE_TASK_LAST ]]; then
  echo "Last task $SGE_TASK_ID in job $JOB_ID finished" | mail -s "CSF jobarray" $USER
fi

The email will be sent to your University email address.

Email from a Job Dependency after the Job Array

To guarantee that you receive an email after all tasks have finished, you can submit a second serial job (not a job array) that has a job dependency on the main job array, perhaps to the short area if available. The second job will only run after all tasks in the job array have finished. Note, however, that the second job may have to wait in the queue, depending on how busy the system is, so your email may arrive some time after the job array actually finished. To submit the jobs use the following command-lines:

# First, submit your job array as normal, noting the jobid
qsub my-jobarray.sh
Your job-array 869219.1-10:1 ("my-jobarray.sh") has been submitted
                 #
                 # Make a note of this job id

# Now immediately submit a serial job with a dependency (-hold_jid jobid) 
# and request that it emails you when it ends (-m e)
qsub -b y -hold_jid 869219 -m e -M $USER true
                       #                   #
                       #                   # this app simply returns 'no error'
                       #
                       # Use the job id from the previous job array

The second serial job will execute (it won’t do any useful work and will finish immediately) when the job array ends. It will send you an email when it has finished and so you will then know that the job array on which it was dependent has also finished. Note that the information in the email (wallclock time etc) is about the serial job, not the job array.
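
The two submissions can also be combined in a small script so that the job id does not have to be copied by hand. This is a minimal sketch, assuming qsub -terse prints a job array's id in the form jobid.start-end:step (as the earlier -hold_jid example relies on). The function names array_jobid and submit_with_email are illustrative and not part of SGE.

```shell
#!/bin/bash
# Strip the task range from 'qsub -terse' output for a job array,
# e.g. "869219.1-10:1" becomes "869219". (Illustrative helper name.)
array_jobid() {
  echo "${1%%.*}"
}

# Submit a job array, then a do-nothing serial job that is held until every
# task has finished and emails you when it ends. (Illustrative helper name.)
submit_with_email() {
  local jobscript=$1 out jid
  out=$(qsub -terse "$jobscript") || return 1   # stop if the array was rejected
  jid=$(array_jobid "$out")
  qsub -b y -hold_jid "$jid" -m e -M "$USER" true
}

# Example usage on the login node (not run here):
#   submit_with_email my-jobarray.sh
```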

Job Array limits

A huge number of files in a single location can cause connectivity issues with the scratch servers.
In order to mitigate this, please keep the number of files in any one directory below around 5000.
This can be achieved by reducing the size of the array, or by directing file output to several different directories so that each one stays under the 5000-file limit.
Another helpful step is to direct .o and .e (output and error) files to /dev/null if they are not required.
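
One way to spread output over several directories is to derive a subdirectory from the task id. The jobscript below is a sketch under stated assumptions: the bucket size FILES_PER_DIR, the results/ directory layout, the out.<taskid> file name and the helper task_outdir are all hypothetical choices for illustration. The -o and -e lines discard the per-task .o and .e files and should only be used if that output is not needed.

```shell
#!/bin/bash
#$ -cwd
#$ -t 1-20000
#$ -o /dev/null      # discard per-task .o files (only if not required)
#$ -e /dev/null      # discard per-task .e files (only if not required)

# Hypothetical bucket size, chosen to stay well below 5000 files per directory
FILES_PER_DIR=1000

# Map a task id to a subdirectory:
# tasks 1-1000 use results/0, tasks 1001-2000 use results/1, and so on.
task_outdir() {
  echo "results/$(( ($1 - 1) / FILES_PER_DIR ))"
}

# Only do the work when running as a job array task
if [[ "${SGE_TASK_ID:-undefined}" != "undefined" ]]; then
  outdir=$(task_outdir "$SGE_TASK_ID")
  mkdir -p "$outdir"
  ./my_app.exe "data.$SGE_TASK_ID" > "$outdir/out.$SGE_TASK_ID"
fi
```

With 20000 tasks and a bucket size of 1000, this gives 20 directories of at most 1000 output files each.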

Further Information

More on SGE Job Arrays can be found at:

Last modified on December 9, 2024 at 11:43 am by George Leaver