{"id":210,"date":"2018-09-04T18:21:22","date_gmt":"2018-09-04T17:21:22","guid":{"rendered":"http:\/\/ri.itservices.manchester.ac.uk\/csf3\/?page_id=210"},"modified":"2025-05-23T15:33:13","modified_gmt":"2025-05-23T14:33:13","slug":"qsub-options","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/batch\/qsub-options\/","title":{"rendered":"qsub Options and Environment Variables"},"content":{"rendered":"<p><script type=\"text\/javascript\">\n    function toggle() {\n        var x = document.getElementById(\"hidetext\");\n        if (x.style.display === \"none\") {x.style.display = \"block\";}\n        else {x.style.display = \"none\";}\n    }\n<\/script><\/p>\n<div class=\"warning\">The SGE batch system has been shutdown and the CSF upgraded to use the Slurm batch system. Please read the <a href=\"\/csf3\/batch-slurm\">CSF3 Slurm documentation<\/a> instead.<\/p>\n<p>To display this old SGE page, <a href=\"javascript:toggle()\">click here<\/a>\n<\/div>\n<div id=\"hidetext\" style=\"display: none\">\n<h2>Jobscript vs Command-line<\/h2>\n<p>Batch job options can be specified in a <code>qsub<\/code> <em>jobscript<\/em> by placing <code>#$<\/code> in front of the option, for example using<\/p>\n<pre>#$ -cwd\r\n<\/pre>\n<p>or on the <code>qsub<\/code> <em>command line<\/em>, for example using<\/p>\n<pre>qsub -cwd ... ... <em>filename [optional args]<\/em> \r\n<\/pre>\n<p><strong>Note:<\/strong> if <em>filename<\/em> is an executable (e.g., <code>myapp.exe<\/code>) rather than a jobscript you must use the <code>-b y<\/code> flag (please <a href=\"#bflag\">see below<\/a>).<\/p>\n<p>All of the possible flags are described in the manual page (<code>man qsub<\/code>). The most commonly used options are briefly described below. 
Note that the order in which you specify options, either in the jobscript or on the command line, does not matter.<\/p>\n<div class=\"hint\">\nWe recommend using a jobscript so that you have a permanent record of how you ran a job. This is important for reproducibility of results.\n<\/div>\n<h2>qsub Flags<\/h2>\n<dl>\n<dt><tt>-cwd<\/tt><\/dt>\n<dd>Execute the job from the current (working) directory, i.e., the directory from which the <code>qsub<\/code> command is issued. If this option is not present, the job will be executed in the user&#8217;s home directory. The <code>.oNNNNN<\/code> and <code>.eNNNNN<\/code> stdout and stderr files created by SGE for each job will also be written to the directory specified by this flag (or the home directory if not present) unless the <code>-o<\/code> and <code>-e<\/code> flags are used to override where these files are written.<\/dd>\n<dt><tt>-V<\/tt><\/dt>\n<dd>(Uppercase V). This ensures that any environment settings you&#8217;ve made on the login node are passed to the compute node, including the settings applied by loading software <code>modulefiles<\/code>. A copy of your current environment is taken when you run the <code>qsub<\/code> command (i.e., immediately, not when the job finally runs). Hence you can change your environment after running <code>qsub<\/code>, perhaps to set up for another job, or even log out, and your job will still see the environment that was in place when you originally ran <code>qsub<\/code>. Unlike on the CSF2, we now recommend that you load the modulefiles your job needs in your jobscript, rather than on the login node command line or in your <code>.bashrc<\/code> or <code>.modules<\/code> file. 
When loading modulefiles in your jobscript you should <em>not use<\/em> <code>-V<\/code> in the jobscript.<\/dd>\n<dt><tt>-j y<\/tt><\/dt>\n<dd>Merge the standard error stream into the standard output stream, i.e., job output and error messages are sent to the same <code>.o<\/code> file, rather than different files (usually <code>.o<\/code> and <code>.e<\/code> files).<\/dd>\n<dt><tt>-pe <em>name.pe<\/em><\/tt><\/dt>\n<dd>Specify the SGE parallel environment to which a job is sent; see the section on running <a href=\"\/csf3\/batch\/parallel-jobs\/\">parallel jobs<\/a>.<\/dd>\n<dt><tt>-l <em>resource<\/em><\/tt><\/dt>\n<dd>Specify a resource to control where in the system the job is placed, for example <code>-l mem512<\/code> to select a higher-memory node. You may specify more than one resource flag, for example <code>-l haswell -l 's_rt=00:10:00'<\/code>, although not all combinations are supported. Resource flags exist for <a href=\"\/csf3\/batch\/intel-cores\/\">CPU architectures<\/a>, <a href=\"\/csf3\/batch\/timelimits\/\">job time limits<\/a>, <a href=\"\/csf3\/batch\/high-memory-jobs\/\">memory requirements<\/a>, <a href=\"\/csf3\/batch\/gpu-jobs\/\">GPUs<\/a> and <a href=\"\/csf3\/batch\/timelimits\/\">interactivity<\/a>. Check those pages for details, together with the <a href=\"\/csf3\/batch\/parallel-jobs\">parallel environment<\/a> documentation, to determine whether a resource and a <em>PE<\/em> are compatible.<\/dd>\n<dt><tt>-P <em>project-code<\/em><\/tt><\/dt>\n<dd>Specify the <em>project<\/em> to which your jobs will be accounted. Do NOT use this flag unless specifically told to do so (e.g., you have been given an HPC-Pool project code). By default your jobs are accounted to the <em>project<\/em> associated with your supervisor.<\/dd>\n<dt><tt>-S \/bin\/bash<\/tt><\/dt>\n<dd>(Uppercase S). Indicates that your jobscript is written using <code>\/bin\/bash<\/code> shell syntax. 
This is not required in your jobscripts &#8211; by default the jobscript will use the shell specified on the first line via the <code>#!<\/code> marker.<\/dd>\n<dt><tt>-N <em>name<\/em><\/tt><\/dt>\n<dd>(Uppercase N). Sets the job name, e.g. <code>-N my_job_name<\/code> to set job name to <code>my_job_name<\/code>. The <code>.o<\/code> and <code>.e<\/code> job output files will be named using this value \u2014 for example <code>my_job_name.o12345<\/code> and <code>my_job_name.e12345<\/code>. If you don&#8217;t use the <code>-N<\/code> option then the job output files will use the name of the jobscript (or executable) specified on the <code>qsub<\/code> command-line. Do not use spaces in the name.<\/dd>\n<\/dl>\n<p><a name=\"oflag\"><\/a><\/p>\n<dl>\n<dt><tt>-o <em>\/path\/to\/dir<\/em><br \/>\n-e <em>\/path\/to\/dir<\/em><br \/>\nalternatively:<br \/>\n-o <em>\/path\/to\/dir\/stdoutfile<\/em><br \/>\n-e <em>\/path\/to\/dir\/stderrfile<\/em><br \/>\nor to prevent any .o and .e files being generated (use with caution!)<br \/>\n-o \/dev\/null<br \/>\n-e \/dev\/null<\/tt>\n<\/dt>\n<dd>Use either the <em>directory<\/em> form or the <em>filename<\/em> form. If a directory name is given, it specifies the path to a <em>directory<\/em> where the usual standard output stream (stdout) and standard error stream (stderr) files (<code><em>JobName<\/em>.o<em>NNNNN<\/em><\/code> and <code><em>JobName<\/em>.e<em>NNNNN<\/em><\/code> respectively) will be written. The directories <em><strong>must<\/strong><\/em> already exist <em><strong>before<\/strong><\/em> the job runs &#8211; the batch system will <em>not<\/em> create them for you. If filenames are given, they specify the files to which stdout and stderr output will be written. No <em>JobID<\/em> number will be appended &#8211; your supplied filenames will be used as-is. 
If these flags are not used, the standard output and error stream files will be written in the directory in which the job runs (see <code>-cwd<\/code>).<\/p>\n<p>For example, this jobscript writes the job&#8217;s <code>.o<em>NNNNN<\/em><\/code> and <code>.e<em>NNNNN<\/em><\/code> files to a directory named <code>logs<\/code> in your home directory:<\/p>\n<pre>#!\/bin\/bash --login\r\n#$ -cwd\r\n#$ -o ~\/logs\r\n#$ -e ~\/logs\r\nmyapp.exe input.dat\r\n<\/pre>\n<p>You <strong>must<\/strong> ensure you have a directory named <code>~\/logs<\/code> (i.e. in your home directory) <em>before<\/em> submitting the job. The job will not create the directory for you. See also the <code>-j<\/code> flag for combining the <code>.o<\/code> and <code>.e<\/code> files into one file (the <code>.o<\/code> file).<\/dd>\n<dt><tt>-hold_jid <em>jobid<\/em><\/tt><\/dt>\n<dd>Specifies that this job is conditional upon completion of a previous job or jobs, e.g. <code>-hold_jid jobID<\/code> to submit a job which will not start until <code>jobID<\/code> has completed. <code>jobID<\/code> can be a job number (e.g., 89213) or a job name (i.e., the earlier job was named using the <code>-N<\/code> flag). Multiple jobIDs can be specified as a comma-separated list, in which case the current job will not run until all of the specified jobs have finished.<\/dd>\n<\/dl>\n<p><a name=\"email\"><\/a><\/p>\n<dl>\n<dt><tt>-m <em>bea<\/em><\/tt><\/dt>\n<dd>Causes an email to be sent when the job <strong>b<\/strong>egins, when it <strong>e<\/strong>nds and\/or if it is <strong>a<\/strong>borted. You can specify any or all of the <code>bea<\/code> letters. For example, most users only want to know when a job ends or aborts, so they use <code>-m ea<\/code>.<\/p>\n<p>Please note that you <em>must<\/em> put your email address in your jobscript or on the <code>qsub<\/code> command line (see the <code>-M<\/code> flag below) as the batch system does not automatically detect your University email address at the moment. 
We are looking into this.<\/dd>\n<dt><tt>-M <em>emailaddress<\/em><\/tt><\/dt>\n<dd>(Uppercase M). Specify an email address to which <code>-m<\/code> status emails will be sent. You may supply a comma-separated list if you want to receive email at more than one address. For example:<\/p>\n<pre>-M &#x6d;y&#x2e;&#110;&#x61;&#109;e&#x40;m&#x61;&#110;&#x63;&#104;e&#x73;&#116;&#x65;&#114;&#46;&#x61;c&#x2e;&#117;&#x6b;,my&#110;&#97;&#x6d;&#x65;&#x40;&#x67;ma&#105;&#108;&#x2e;&#x63;&#x6f;&#x6d;<\/pre>\n<p>Please note that you <em>must<\/em> put your email address in your jobscript or on the <code>qsub<\/code> command line as the batch system does not automatically detect your University email address at the moment. We are looking into this.<\/dd>\n<\/dl>\n<p>The following options are used on the <code>qsub<\/code> command line directly, rather than in your jobscripts:<br \/>\n<a name=\"bflag\"><\/a><\/p>\n<dl>\n<dt><tt>-b y<\/tt><\/dt>\n<dd>For use on the <strong>qsub command line only<\/strong>. Indicates that the filename given on the <code>qsub<\/code> command line is an executable (binary) file, not a jobscript. This allows you to specify the executable directly on the command line rather than in a jobscript. By default the <code>qsub<\/code> command assumes the filename refers to a jobscript. For example, the following command line and jobscript (submitted with <code>qsub myjobscript<\/code>) are equivalent:<\/p>\n<pre>qsub -b y -cwd \/bin\/hostname\r\n<\/pre>\n<p>and<\/p>\n<pre>#!\/bin\/bash\r\n#$ -cwd\r\n\/bin\/hostname\r\n<\/pre>\n<p>It is up to you which method you prefer. However, we recommend writing a jobscript so that you can see how the job was submitted if referring back to an old job (perhaps submitted months ago) rather than trying to remember a command line. 
It also allows the sysadmins to identify problems with jobs more easily.<\/dd>\n<dt><tt>-terse<\/tt><\/dt>\n<dd>This flag can be used when constructing pipelines. It causes the <code>qsub<\/code> command to output only the job ID, not the user-friendly message about your job submission. The job ID can then be captured and used in subsequent job submissions to make later jobs wait for earlier jobs. For example:<\/p>\n<pre>\r\n# Submit two jobs, where the second job will not run until the first job has finished:\r\nJID=$(qsub -terse jobscript_1.sh)\r\nqsub -hold_jid $JID jobscript_2.sh\r\n<\/pre>\n<p>When submitting job arrays, some extra information about the task range and increment is included in the terse output. You should remove this if capturing the job ID:<\/p>\n<pre>\r\nqsub -terse -t 1-100 jobscript-array1.sh\r\n129674.1-100:1\r\n\r\n# To capture only the jobid, use the cut command to remove the extra info:\r\nqsub -terse -t 1-100 jobscript-array1.sh | cut -d. -f1\r\n129674\r\n<\/pre>\n<\/dd>\n<\/dl>\n<p>See the individual pages (in the menu on the left side of this page) for the PE names and resources available on this system.<\/p>\n<h2>SGE Environment Variables<\/h2>\n<p>The following environment variables are available for use in your jobscript when the job runs. They can be used to create unique names for output files, for example, by including the job id or name in the output filename.<\/p>\n<dl>\n<dt><tt>$NSLOTS<\/tt><\/dt>\n<dd>The number of cores requested using the <code>-pe<\/code> flag or <code>1<\/code> if running a serial job (no <code>-pe<\/code> option specified). Use this variable if your application requires the number of cores to use on its command line, rather than repeating the number in two places. This makes running jobs with different numbers of cores easier. 
For example:<\/p>\n<pre>#$ -pe smp.pe 4\r\nmyapp -cores $NSLOTS -input sample.dat -output results.dat\r\n  #\r\n  # $NSLOTS will be automatically replaced with 4 in this example\r\n<\/pre>\n<p>You could also use this variable in the name of an output file if doing several runs with a different number of cores when timing your code. For example:<\/p>\n<pre>#$ -pe smp.pe 4\r\nmyapp -cores $NSLOTS -input sample.dat -output results.${NSLOTS}cores.dat\r\n   #\r\n   # The output file will be named results.4cores.dat\r\n<\/pre>\n<\/dd>\n<dt><tt>$NGPUS<\/tt><\/dt>\n<dd>The number of GPUs requested by a GPU job using the <code>-l nvidia_v100=<em>N<\/em><\/code> flag. For example:<\/p>\n<pre>#$ -l nvidia_v100=2\r\nmyGPUApp --numgpus $NGPUS\r\n<\/pre>\n<p>[Technical note: this is a non-standard SGE env var, injected by the JSV]<\/dd>\n<dt><tt>$NHOSTS<\/tt><\/dt>\n<dd>The number of compute nodes in use by your job. For serial jobs (1-core) and single-node SMP (multi-core) jobs this will always be 1. For multi-node jobs (e.g., those running in <code>mpi-24-ib.pe<\/code>) this will be the number of compute nodes. For example, a 48-core job will need two 24-core compute nodes, hence <code>NHOSTS<\/code> will be set to <code>2<\/code>.<\/dd>\n<dt><tt>$JOB_ID<\/tt><\/dt>\n<dd>The unique <em>job id<\/em> number assigned to the job at runtime by the batch system. You can use this to generate unique filenames that won&#8217;t be overwritten by other jobs. For example:<\/p>\n<pre>#$ -cwd\r\nmyapp -input sample.dat -output results.$JOB_ID.dat\r\n   #\r\n   # Output file will be named results.37823.dat where 37823 is my unique jobid.\r\n<\/pre>\n<\/dd>\n<dt><tt>$JOB_NAME<\/tt><\/dt>\n<dd>The value of the <code>-N<\/code> flag if present or the name of the jobscript if that flag is not used. Note that a unique jobid is always generated even if you use the <code>-N<\/code> flag. 
For example:<\/p>\n<pre>#$ -cwd\r\n#$ -N phase1\r\nmyapp -input sample.dat -output results.$JOB_NAME.$JOB_ID.dat\r\n  #\r\n  # Names the output file results.phase1.38795.dat (in this case)\r\n<\/pre>\n<\/dd>\n<dt><tt>$PE<\/tt><\/dt>\n<dd>The name of the parallel environment given after the <code>-pe<\/code> flag in parallel jobs. For example <code>smp.pe<\/code> or <code>mpi-24-ib.pe<\/code>. Unset in serial (1-core) jobs.<\/dd>\n<dt><tt>$SGE_O_WORKDIR<\/tt><\/dt>\n<dd>The full path to the directory from where you submitted the job.<\/dd>\n<dt><tt>$SGE_TASK_ID<br \/>\n$SGE_TASK_FIRST<br \/>\n$SGE_TASK_LAST<br \/>\n$SGE_TASK_STEPSIZE<\/tt><\/dt>\n<dd>See the <a href=\"\/csf3\/batch\/job-arrays\/\">Job Arrays documentation<\/a> for environment variables related to each task.<\/dd>\n<dt><tt>$PE_HOSTFILE<\/tt><\/dt>\n<dd>You will not normally need to use this variable in your jobscripts. However, some applications documented on the <a href=\"\/csf3\/software\">CSF software page<\/a> process the names of the nodes on which your job will run in to their own format. This variable gives the name of a file containing the names of the nodes on which your job has been scheduled to run. Do not, however, change the value of this variable yourself.<\/dd>\n<\/dl>\n<h2>Automatically Requeue a Job<\/h2>\n<p>A jobscript can ask the batch system to automatically requeue the job when the current job has finished. This can be used with an application that does <em>checkpointing<\/em>. This is where an application saves its current state to disk and then, when a new job starts, it can read the previous state and carry on from where it left off. This allows an app to run for more than 7 days (the max runtime on the CSF) by running one job after another, saving the state between each run.<\/p>\n<p>A jobscript that exits with code <code>99<\/code> will automatically requeue. 
Add the following to your jobscript:<\/p>\n<pre>exit 99\r\n<\/pre>\n<p>The job will then automatically be placed back in the queue to run again. It will have the special <code>Rr<\/code> status when it runs, indicating it is a requeued job. The output from the job will be appended to the existing <em><code>jobname.o12345<\/code><\/em> and <em><code>jobname.e12345<\/code><\/em> files.<\/p>\n<p>Another use of this method would be where a job checks its own results files and may decide that it needs to rerun an analysis with some different parameters. Add whatever checking of the results you need to perform to the jobscript, then use the <code>exit 99<\/code> command to terminate the jobscript.<\/p>\n<p>For an example of an application that does checkpointing please see the <a href=\"\/csf3\/software\/applications\/starccm\/#Automatically_in_your_jobscript\">StarCCM<\/a> webpage.\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The SGE batch system has been shutdown and the CSF upgraded to use the Slurm batch system. Please read the CSF3 Slurm documentation instead. To display this old SGE page, click here Jobscript vs Command-line Batch job options can be specified in a qsub jobscript by placing #$ in front of the option, for example using #$ -cwd or on the qsub command line, for example using qsub -cwd &#8230; &#8230; filename [optional args] Note:.. 
<a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/batch\/qsub-options\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":22,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-210","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/210","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/comments?post=210"}],"version-history":[{"count":21,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/210\/revisions"}],"predecessor-version":[{"id":10092,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/210\/revisions\/10092"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/22"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/media?parent=210"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}