High Throughput Computing using Condor

Retrieving entire sub-directories created by your job


Some programs and applications create one or more sub-directories (folders) at run-time and store some or all of their results in files there. The rules for getting such output transferred back to the submit machine are a little confusing, and vary between versions of Condor.

Summary of the rules and issues

By default Condor does not transfer back sub-directories created by your job, but you can choose to have them transferred back to the submit machine.

If you do choose to transfer back entire sub-directories, you must explicitly name all the top-level files and directories you want copied back. For example, assuming output files f1 and f2 are created, as well as directories d1 and d2 which could contain many more files and sub-directories:

    transfer_output_files = f1, f2, d1, d2

This can be a problem if your application generates certain files or directories only for some input data. If any of f1, f2, d1 or d2 does not exist when the job finishes, Condor treats this as an error and places the job on Hold.
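One common workaround, sketched below, is to have your wrapper script create empty placeholders for any expected output that the application did not produce, so every name listed in transfer_output_files always exists. The names f1, f2, d1 and d2 match the example above; adapt them to your own job.

```shell
#!/bin/sh
# Sketch: guarantee that everything named in transfer_output_files
# exists before the job exits, so Condor never puts the job on Hold
# because of a missing file or directory.
for f in f1 f2; do
    [ -e "$f" ] || touch "$f"       # empty placeholder file if absent
done
for d in d1 d2; do
    [ -d "$d" ] || mkdir -p "$d"    # empty placeholder directory if absent
done
```

Empty placeholders are cheap to transfer and easy to recognise (and discard) on the submit side.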

If, as is often better, you rely on Condor's default behaviour and omit the transfer_output_files line from your submit file, only the ordinary files created by your job are copied back; any sub-directories are excluded.

A solution

Assuming you control your application via a bash shell script, which is very common and easy to arrange, you can post-process your application's output sub-directories. In general this is a two-step process. First, remove any unwanted or temporary working directories, for example:

rm -rf d1

Then, add the following line:

for d in *; do if test -d "$d"; then tar czf "$d".tgz "$d"; fi; done

which finds every directory and creates a compressed tar archive of each one. These archives, being ordinary files, will then be copied back by Condor.
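Putting the two steps together, a complete wrapper script might look like the sketch below. The application name my_app is illustrative; here its run is commented out and some sample output is created instead so the post-processing steps can be seen in isolation.

```shell
#!/bin/sh
# Sketch of a wrapper script: run the application, discard temporary
# directories, then tar up the remaining directories so Condor's
# default transfer copies them back as ordinary files.

# ./my_app "$@"            # your real application would run here

# Simulate the application's output for illustration:
mkdir -p d1 d2/sub
echo data > f1
echo result > d2/sub/out.txt

# Step 1: remove unwanted or temporary working directories.
rm -rf d1

# Step 2: archive each remaining directory as an ordinary file.
# Directories left behind are simply not transferred by default.
for d in *; do
    if test -d "$d"; then
        tar czf "$d".tgz "$d"
    fi
done
```

On the submit machine, the results can then be unpacked with, for example, tar xzf d2.tgz.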

Last modified on May 25, 2017 at 2:04 pm by Pen Richardson