Running a Condor Job: Overview

  • Make it batch-ready
  • Choose a Universe
  • Create a submit file
  • Submit the job
  • Monitor your job's status




Running a Job

Step 1: Make it Batch-Ready

Only non-interactive jobs can be submitted to Condor

Jobs must:

  • be able to run "in the background"
    • no GUI;
    • non-interactive: no user input at run time (input can be taken from a file instead)
  • use STDIN, STDOUT and STDERR plus input/output data files only.
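
As a rough illustration of what batch-ready means in practice, the sketch below runs a program with no terminal interaction: its input comes from a file and its two output streams go to files. The shell function `my_prog` is a hypothetical stand-in for a real executable, and the file names are placeholders chosen for this example.

```shell
# Hypothetical stand-in for the real program: reads one number from STDIN,
# writes a result to STDOUT and a diagnostic message to STDERR.
my_prog() {
    read -r n
    echo "steps requested: $n"
    echo "my_prog: finished" >&2
}

# What a user would have typed at a prompt goes into an input file instead.
echo 3000 > my_prog.in

# Run with STDIN taken from the file; capture STDOUT and STDERR in files.
my_prog < my_prog.in > my_prog.out 2> my_prog.err
```

If a program can be run this way from an ordinary shell, with nobody at the keyboard, it is a reasonable candidate for submission to Condor.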




These are the same restrictions as under traditional batch systems

  • SGE, LSF, PBS. . .




Choose a Universe

Vanilla
  Any batch-style computation will run here; Condor-related features restricted.
Standard
  Full-fat Condor:
    • checkpointing and job migration;
    • remote I/O.
Parallel
  e.g., MPI jobs can be run under Condor, but. . .?!?
Others
  Java, . . .


More on Vanilla and Standard later. . .




Create a Submit File

  • A small ASCII text file.
  • cf. an SGE qsub file.
  • Specifies: executable; universe; STDIN, STDOUT and STDERR files to use; data files if any; and more. . .

Example

  executable = my_prog.exe

  universe = standard

  output = my_prog.$(Process).out
  error  = my_prog.$(Process).err
  log    = my_prog.log

  arguments = 3000
  queue

  arguments = 4500
  queue
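
Each queue statement in the example adds one job, and $(Process) expands to that job's number, counting from 0. The sketch below only mimics that substitution in the shell to show which files the two jobs would write; it does not call Condor, and the variable names are made up for this illustration.

```shell
# Mimic how $(Process) is expanded for the two queue statements in the
# example submit file: the first queued job is process 0, the second is 1.
process=0
expected_files=""
for args in 3000 4500; do
    echo "job $process: arguments=$args output=my_prog.$process.out error=my_prog.$process.err"
    expected_files="$expected_files my_prog.$process.out my_prog.$process.err"
    process=$((process + 1))
done
```

Both jobs share the single log file my_prog.log, since the log line contains no $(Process).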




Submit the Job

Set Up Environment
prompt> export CONDOR_CONFIG=<path_to_condor>/etc/condor_config
    #
    # Ensure Condor programs can find the Condor 
    # configuration!

prompt> export PATH=$PATH:/<path_to_condor_exes>/bin
    #
    # Add the Condor programs to your PATH (e.g., condor_submit, 
    # condor_q...)

  • Or use the provided script, e.g.,
        source /opt/condor-7.4.2/condor.sh

Submit and Monitor Job
prompt> condor_submit my_job.cond_sub

prompt> condor_q
    #
    # To list the queues on every submit machine in the pool,
    # use condor_q -global
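
The two steps above can be combined into a small guard that refuses to submit when the environment is not set up. This is only a sketch: the function name `check_condor_env` is invented here, and it assumes, as on this slide, that CONDOR_CONFIG is how the configuration file is located.

```shell
# Return 0 if the Condor environment looks usable, non-zero otherwise.
check_condor_env() {
    if [ -z "${CONDOR_CONFIG:-}" ] || [ ! -r "$CONDOR_CONFIG" ]; then
        echo "CONDOR_CONFIG is unset or does not point to a readable file" >&2
        return 1
    fi
    if ! command -v condor_submit >/dev/null 2>&1; then
        echo "condor_submit is not on PATH" >&2
        return 1
    fi
    return 0
}

# Submit and list the queue only when the checks pass.
if check_condor_env; then
    condor_submit my_job.cond_sub && condor_q
else
    echo "Condor environment not set up; nothing submitted."
fi
```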




First Practical Session 1/3

Try Condor for yourselves!

  1. Log in to man2condor.nw-grid.ac.uk with OpenSSH or PuTTY, using the username and password you have been given.
  2. Set up your environment:
        source /opt/condor-7.4.2/condor.sh
  3. Check the status of the pool using condor_status.
    • Notice that some nodes are busy with non-Condor activity, while others are available to Condor.
  4. Change directory to first-practical:
        cd first-practical
        
  5. Notice that there are three examples to run: hello.cmd, hello-2.cmd and loop.cmd.




First Practical Session 2/3

Run two vanilla universe jobs

  1. Examine the two vanilla universe jobs, the hello* files:
    • Look in the Fortran source files, hello*.f90, and notice that one prints a message to STDOUT while the other writes to a file, myfile.txt.
    • In the Condor submit files, hello*.cmd, notice the file-transfer related commands.
  2. Compile and submit the two vanilla universe jobs:
        gfortran -o hello hello.f90
        condor_submit hello.cmd
    
        gfortran -o hello-2 hello-2.f90
        condor_submit hello-2.cmd
    
  3. If you are quick, you may catch your jobs in the Condor pool queue using condor_q.
  4. Check the output and error files, and also the newly-created file myfile.txt.
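
The file-transfer commands mentioned in step 1 might look something like the sketch below. The contents of the real hello-2.cmd are not reproduced in these slides, so this is a guess at a typical vanilla universe submit file, written under a different name (hello-2.sketch.cmd) so that it cannot overwrite the course file. The output, error and log file names are assumptions.

```shell
# Write a hypothetical vanilla universe submit file; the real hello-2.cmd
# supplied for the practical may differ.
cat > hello-2.sketch.cmd <<'EOF'
universe   = vanilla
executable = hello-2
output     = hello-2.out
error      = hello-2.err
log        = hello-2.log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
queue
EOF

# Show just the file-transfer related lines.
grep transfer hello-2.sketch.cmd
```

With these two settings, files the job creates in its working directory on the execute machine, such as myfile.txt, are copied back to the submit directory when the job exits.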




First Practical Session 3/3

  1. Examine the standard universe job submit file, loop.cmd.
    • Notice that we submit more than one computation.
  2. Compile and submit the standard universe job:
        condor_compile gcc -o loop.remote loop.c
        condor_submit loop.cmd
    
  3. Quickly check the queue using condor_q and notice your jobs waiting or running.
  4. You can check the progress of one of your jobs with, for example:
        tail -f loop.4.out
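
To glance at all of the computations at once rather than following a single file, something like the sketch below could be used. It assumes the output files are named loop.<N>.out, as the tail example suggests; the function name `show_progress` is made up for this illustration.

```shell
# Print the most recent line of every output file from the loop job.
show_progress() {
    for f in loop.*.out; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        printf '%s: %s\n' "$f" "$(tail -n 1 "$f")"
    done
}

show_progress
```

Running it again after a short wait should show the later lines changing for jobs that are still executing.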