Job Dependencies
SGE job dependencies allow you to specify that one job should not start until some other job has completed. For example, this is useful if ‘jobB’ relies on the output from ‘jobA’. You do not need to track when individual jobs start or complete, because SGE will run them in the order you specify.
You can build up more complicated pipelines of jobs, where you might be waiting for several jobs to finish before further jobs can run. Job dependencies will ensure jobs run in the correct order.
Job Dependency via the Jobscript
To use this functionality add this option to your submission script:
#$ -hold_jid jobid
# Replace jobid with the number of the job to wait for.
When you run qstat, the jobs that are waiting on others will be listed with the state hqw.
It may be easier to set up job dependencies by naming jobs rather than specifying jobIDs. For example, the first job can have the following added to its jobscript:
#$ -N myFirstJob
This will name the job (and cause its output .o and .e files to use that name). The second job, which is going to wait for the first job, can now refer to it by name:
#$ -hold_jid myFirstJob
Our job dependency is now independent of the jobID number the batch system assigns to the first job.
Job Dependency via the qsub command-line
Alternatively you can add the -hold_jid flag to the qsub command-line:
# Submit the first job
qsub jobscript1.sh
Your job 129673 ("jobscript1.sh") has been submitted
# Make a note of the job id (129673 in this example)

# Now submit the second job and make it wait for the first
qsub -hold_jid 129673 jobscript2.sh
By adding the flag to the qsub command-line we don’t have to edit the second jobscript to set the unique jobid to wait for. We don’t need to name the jobs either.
Scripting the use of qsub
You may wish to develop a helper shell script to submit dependency jobs. This could simplify the submission of complicated job pipelines by running the qsub command and processing the job IDs of the various jobs. Adding the -terse flag to the qsub command simplifies your shell script by changing the message reported by qsub to be just the job ID of the newly submitted job. For example:
qsub -terse jobscript1.sh
129673
# The message from qsub is now only the jobid of this job
# (instead of 'Your job 129673 ("jobscript1.sh") has been submitted')
You could capture this into a shell variable and use it to submit a second job that waits for this job to finish. For example:
JOBID=$(qsub -terse jobscript1.sh)                    # Save our JOBID
JOBID=$(qsub -terse -hold_jid $JOBID jobscript2.sh)   # Use previous value of $JOBID then save our JOBID
JOBID=$(qsub -terse -hold_jid $JOBID jobscript3.sh)   # Use previous value of $JOBID then save our JOBID
...
You should add some error-checking shell-script code to handle the case where a qsub command fails, for example due to an error in a jobscript. In this case the qsub command returns an empty string (i.e. no job ID); any error message is written to stderr.
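The error checking described above could be sketched as follows. This is only a minimal illustration, assuming bash; the function name submit_job is hypothetical (it is not part of SGE), and the jobscript names are the placeholders used earlier on this page.

```shell
#!/bin/bash
# Hypothetical helper (not an SGE command): submit a job with qsub -terse
# and fail loudly if no job ID comes back.
# Usage: submit_job [qsub options...] jobscript
submit_job() {
    local jobid
    # Only stdout is captured here; qsub error messages still reach stderr.
    if ! jobid=$(qsub -terse "$@"); then
        echo "qsub failed for: $*" >&2
        return 1
    fi
    # Guard against an empty string being returned instead of a job ID.
    if [ -z "$jobid" ]; then
        echo "qsub returned no job ID for: $*" >&2
        return 1
    fi
    echo "$jobid"
}

# Example pipeline (uncomment to use): stop at the first failed submission.
# JOBID=$(submit_job jobscript1.sh) || exit 1
# JOBID=$(submit_job -hold_jid "$JOBID" jobscript2.sh) || exit 1
# JOBID=$(submit_job -hold_jid "$JOBID" jobscript3.sh) || exit 1
```

Stopping at the first failure avoids submitting later jobs with an empty -hold_jid value, which would otherwise leave them without the dependency you intended.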
Waiting for more than one job
If your job needs to wait for multiple jobs to finish you may specify a comma-separated list of jobIDs (or names) on the -hold_jid line. For example:
# Wait for three earlier jobs (which were given -N names) to
# finish before we proceed
#$ -hold_jid myFirstJob,SimWork,dataJob
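If you are scripting qsub rather than editing jobscripts, the comma-separated list can be built from captured job IDs. The following is a sketch only, assuming bash; join_ids is a hypothetical helper name and prep1.sh, prep2.sh and final.sh are placeholder jobscripts.

```shell
#!/bin/bash
# Hypothetical helper: join all arguments with commas,
# giving a list in the form expected by -hold_jid.
join_ids() {
    local IFS=,
    echo "$*"
}

# Usage sketch (uncomment to use; jobscript names are placeholders):
# ID1=$(qsub -terse prep1.sh)
# ID2=$(qsub -terse prep2.sh)
# qsub -hold_jid "$(join_ids "$ID1" "$ID2")" final.sh
```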
Job Array Dependencies
When submitting a job-array, the job ID returned by the -terse flag includes some extra information about the number of tasks and the increment of the task counter:
# When submitting a job-array, info about the number of tasks and task increment is returned
qsub -terse -t 1-100 jobscript-array1.sh
129674.1-100:1

# To capture only the jobid, use the cut command to remove the extra info:
qsub -terse -t 1-100 jobscript-array1.sh | cut -d. -f1
129674
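In a helper script the same trimming can be done on an already captured value using shell parameter expansion instead of cut. This is a sketch under the assumption that you are using bash (or another POSIX shell); array_jobid is a hypothetical helper name and jobscript2.sh is a placeholder.

```shell
#!/bin/bash
# Hypothetical helper: strip everything from the first "." onwards,
# so "129674.1-100:1" becomes "129674". Plain job IDs pass through unchanged.
array_jobid() {
    echo "${1%%.*}"
}

# Usage sketch (uncomment to use):
# ARRAYID=$(array_jobid "$(qsub -terse -t 1-100 jobscript-array1.sh)")
# qsub -hold_jid "$ARRAYID" jobscript2.sh
```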
There are a number of ways in which job dependencies can be used with job arrays (which includes dependencies between job arrays and ordinary jobs). Please see our Job Array Dependencies Information for further details.