Modifying job parameters

condor_submit: command line tricks with the -append argument

Apologies if you are a DropAndCompute user, but these tricks are Linux command line only.

The condor_submit command is normally used as simply as this:

condor_submit submit.txt

That is, a submit file is produced that fully describes the Condor submission. An alternative is to leave some lines out of the submit file and supply them on the command line using -append (or -a for short).

Example 1

A trivial example: sometimes you want to be notified by email that your job has finished, and sometimes you don’t. When notification is not specified in the submit file, the default is that you do get an email notification. But you could submit as follows:

condor_submit -a "notification = never" submit.txt

Logically, the appended line is treated as if it came just before the Queue line in submit.txt (at least one Queue line is mandatory).

Example 2

In this example we change the number of jobs queued dynamically; that is, we supply the count on the command line. Your Queue line in the submit file should be written like this:

Queue $(queuecount)

Do not define the macro queuecount in the submit file. Undefined macros default to the empty string, and so the above is equivalent to just

Queue

(which itself means Queue 1).

When you have, say, finished testing and you want to queue 100 jobs, submit it this way:

condor_submit -a "queuecount = 100" submit.txt

Note the quotes around the -a (or -append) argument: they are needed because the value contains spaces. Without the spaces, the quotes can be dropped and the following is equivalent:

condor_submit -a queuecount=100 submit.txt

Thanks to Matt Farrellee of Red Hat for this tip. Matt also supplied the next one.
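Returning briefly to the queue count in Example 2: a mistyped or empty value is easy to miss on the command line, so it can help to check it before submitting. The following is only a sketch; queue_append_arg is a hypothetical helper name, not part of Condor, and it simply prints the text to pass to -a or fails if the count is not a positive integer.

```shell
# queue_append_arg (hypothetical helper, not a Condor command):
# prints "queuecount = N" for a positive integer N, otherwise fails.
queue_append_arg() {
    case "$1" in
        '' | *[!0-9]* | 0*)
            # empty, contains a non-digit, or starts with zero
            echo "queue count must be a positive integer: '$1'" >&2
            return 1
            ;;
    esac
    printf 'queuecount = %s\n' "$1"
}

# Possible use (not run here):
#   arg=$(queue_append_arg 100) && condor_submit -a "$arg" submit.txt
```

Because the submission only runs when the helper succeeds, a typo such as "1OO" stops the submission rather than reaching Condor.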

Example 3

What if you have lots of little files to transfer, and they come and go during your workflow? Let’s say they all end in .data, and you would like to write in your submit file:

transfer_input_files = *.data

but you can’t. Again, the command line -append trick comes to the rescue:

condor_submit -a "transfer_input_files = mycode.sh$(for i in *.data; do echo -n ",$i"; done)" submit.txt

Note that no transfer_input_files line should be used in the submit file. Also, the for loop emits a comma before each file name, so no comma is needed straight after the first input file (mycode.sh).
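One thing to watch with the one-liner above: if no .data files exist at submission time, the shell leaves the pattern unexpanded and the list ends in a literal ",*.data". A minimal sketch of a more defensive version follows; build_input_list is a hypothetical helper name introduced here for illustration, and mycode.sh is the first input file from the example.

```shell
# build_input_list (hypothetical helper, not a Condor command):
# prints the first file followed by ",name" for every *.data file
# in the current directory. Runs in a subshell so its variables
# do not leak into the caller.
build_input_list() (
    list="$1"
    for f in *.data; do
        # when nothing matches, f is the literal pattern: skip it
        [ -e "$f" ] || continue
        list="$list,$f"
    done
    printf '%s\n' "$list"
)

# Possible use (not run here):
#   condor_submit -a "transfer_input_files = $(build_input_list mycode.sh)" submit.txt
```

With no .data files present, the helper prints just mycode.sh, so the appended line stays valid.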

Example 4

This one is a little similar to Example 2 in that you may have different submissions for different purposes: test versus production; or just differing data sets. The idea here is that you switch sub-directory on submission, like so:

condor_submit -a "initialdir = Test" submit.txt

An additional issue encountered here is that the original Condor submit file and your executable code are likely to remain in the parent directory. It is also possible to script this to allow the submission of a large array of data sets at the same time. A script which achieves this is as follows:

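#!/bin/bash

# Directory holding submit.txt, the scripts and the shared .data files
BASE_DIR='/home/[your_UoM_username]/[your_working_directory]'
# Number of inputX directories (input1 .. inputN) to submit
NUM_OF_INPUTS=2

# Submit one job per data set, using inputX as that job's initial directory
for ((i = 1; i <= NUM_OF_INPUTS; i++))
do
    condor_submit -a "transfer_input_files = $BASE_DIR/mycode.sh,input.txt,$BASE_DIR/myscript.sh$(for i2 in $BASE_DIR/*.data; do echo -n ",$i2"; done)" -a "initialdir = input$i" $BASE_DIR/submit.txt
done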

For the above example to work, each data set must be placed in its own directory called inputX, where X is an integer from 1 to the value of ‘NUM_OF_INPUTS’ set in the above script. For each job, the script transfers the copy of ‘input.txt’ found in that job’s inputX directory, so that it is available to the executable named on the executable line of the Condor submit file. The inputX directories are assumed to sit directly within your main ‘BASE_DIR’ directory, alongside the shared files that the script references by absolute path (mycode.sh, myscript.sh, the .data files and submit.txt).

The ‘submit.txt’ file contains, among other things:

executable = /home/[your_UoM_username]/[your_working_directory]/condor_script.sh

This is a direct reference to the wrapper script that Condor runs, which in turn passes ‘input.txt’ as input to the main executable/script:

#!/bin/bash

./my_main_script.sh -file input.txt

Each time the first (master) script submits a job to Condor, the job’s working directory changes to the next inputX directory, and the input.txt file found there is used as the input for the task being executed.
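Since every job in Example 4 depends on its own inputX/input.txt, it may be worth checking that these files exist before submitting anything. The sketch below assumes it is run from the directory that contains the inputX directories; check_inputs is a hypothetical helper name, not something provided by Condor.

```shell
# check_inputs (hypothetical helper, not a Condor command):
# for X = 1..N, prints each inputX/input.txt that is missing and
# returns non-zero if any is missing. Runs in a subshell so its
# variables do not leak into the caller.
check_inputs() (
    num="$1"
    status=0
    i=1
    while [ "$i" -le "$num" ]; do
        if [ ! -f "input$i/input.txt" ]; then
            echo "input$i/input.txt"
            status=1
        fi
        i=$((i + 1))
    done
    exit "$status"
)

# Possible use at the top of the master script (not run here):
#   check_inputs "$NUM_OF_INPUTS" || { echo "missing input files" >&2; exit 1; }
```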

Example 5

Another simple example, in which we decide how we want matching machines to be ranked:

condor_submit -a "rank = memory" submit.txt # go with most memory first

or

condor_submit -a "rank = kflops" submit.txt # go with those fastest at floating point first

or

condor_submit -a "rank = Target.TotalCPUs" submit.txt # go with most cores first

Finally, note that you can add multiple lines with multiple -a (or -append) arguments.
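As an illustration of combining several appended lines, here is a small sketch of a wrapper that turns each of its arguments into a separate -a option. The name submit_with and the DRY_RUN variable are assumptions made for this example only; with DRY_RUN=1 the wrapper prints the command, one word per line, instead of running it.

```shell
# submit_with (hypothetical wrapper, not a Condor command):
#   submit_with SUBMIT_FILE "line 1" "line 2" ...
# passes each extra argument to condor_submit as its own -a option.
# Runs in a subshell so its variables do not leak into the caller.
submit_with() (
    submit_file="$1"
    shift
    n=$#
    # rotate through the arguments, re-adding each one preceded by -a
    while [ "$n" -gt 0 ]; do
        set -- "$@" -a "$1"
        shift
        n=$((n - 1))
    done
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # print the command, one word per line, without running it
        printf '%s\n' condor_submit "$@" "$submit_file"
    else
        condor_submit "$@" "$submit_file"
    fi
)

# Possible use (not run here):
#   submit_with submit.txt "queuecount = 100" "notification = never"
```

Run without DRY_RUN, this is the same as typing condor_submit -a "queuecount = 100" -a "notification = never" submit.txt by hand.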

Last modified on July 4, 2019 at 4:45 pm by Tim Furmston