Universes

The Standard Universe 1/2

Jobs can be checkpointed/migrated/restarted

Checkpointing, Job Migration
  • Condor checkpoints a job at regular intervals: it saves the state of the process (memory, CPU, I/O, etc.) to a file.
  • The process can later be restarted from the checkpoint exactly as if it had never stopped.
  • Jobs can be migrated to another machine, e.g., when the machine's owner returns.
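
The checkpoint-and-restart cycle above can be illustrated with a toy sketch. This is not how Condor implements checkpointing (Condor saves the state of the whole process); it only mimics the idea at application level by pickling a small state dictionary. The names run_job, demo, stop_after, and the file name job.ckpt are hypothetical and chosen for illustration.

```python
import os
import pickle
import tempfile


def run_job(total_steps, ckpt_path, checkpoint_every=3, stop_after=None):
    """Sum the integers 0..total_steps-1, saving a checkpoint periodically.

    stop_after simulates an eviction (e.g. the machine's owner returns)
    after that many steps. Returns (state, finished).
    """
    # Restart: if a checkpoint file exists, resume from the saved state.
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)
    else:
        state = {"step": 0, "total": 0}

    executed = 0
    while state["step"] < total_steps:
        if stop_after is not None and executed >= stop_after:
            # Evicted: work done since the last checkpoint is lost.
            return state, False
        state["total"] += state["step"]
        state["step"] += 1
        executed += 1
        if state["step"] % checkpoint_every == 0:
            # Write to a temporary file first, then rename, so a crash
            # during the write cannot corrupt the previous checkpoint.
            tmp_path = ckpt_path + ".tmp"
            with open(tmp_path, "wb") as f:
                pickle.dump(state, f)
            os.replace(tmp_path, ckpt_path)
    return state, True


def demo():
    """Run, get evicted, restart from the checkpoint, and finish."""
    with tempfile.TemporaryDirectory() as workdir:
        ckpt = os.path.join(workdir, "job.ckpt")
        _, finished = run_job(10, ckpt, stop_after=5)
        assert not finished
        # The second call stands in for a restart, possibly on another machine.
        state, finished = run_job(10, ckpt)
        assert finished
        return state["total"]
```

Calling demo() interrupts the job partway through and then resumes it from the last checkpoint; the final total matches an uninterrupted run. Migration corresponds to the second run_job call happening on a different machine that has access to the checkpoint file.
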
Remote System Calls; File Transfer
  • Access to I/O files is through remote system calls; these files are not transferred to the execute machine.
  • Executable binaries and checkpoint files are transferred automatically as needed.
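
The remote system call idea above can be sketched in a few lines. This is a simplified illustration under loud assumptions: the classes SubmitSide and RemoteFile and the function demo_remote_io are hypothetical, and the "channel" is a direct method call within one process, whereas a real system would forward each request over the network to the submit machine.

```python
import os
import tempfile


class SubmitSide:
    """Hypothetical agent on the submit machine that performs the real I/O."""

    def __init__(self):
        self._files = {}
        self._next_handle = 0

    def handle(self, call, *args):
        # Each request names a file operation and is executed locally here,
        # on the machine where the job's files actually live.
        if call == "open":
            path, mode = args
            handle = self._next_handle
            self._next_handle += 1
            self._files[handle] = open(path, mode)
            return handle
        if call == "read":
            handle, size = args
            return self._files[handle].read(size)
        if call == "write":
            handle, data = args
            return self._files[handle].write(data)
        if call == "close":
            (handle,) = args
            self._files.pop(handle).close()
            return None
        raise ValueError("unsupported call: " + call)


class RemoteFile:
    """Execute-side stub: forwards every operation instead of touching local disk."""

    def __init__(self, channel, path, mode="r"):
        self._channel = channel
        self._handle = channel.handle("open", path, mode)

    def read(self, size=-1):
        return self._channel.handle("read", self._handle, size)

    def write(self, data):
        return self._channel.handle("write", self._handle, data)

    def close(self):
        self._channel.handle("close", self._handle)


def demo_remote_io():
    """Write and read a file purely through forwarded calls."""
    submit = SubmitSide()
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "output.txt")
        out = RemoteFile(submit, path, "w")
        out.write("hello from the job\n")
        out.close()
        inp = RemoteFile(submit, path, "r")
        text = inp.read()
        inp.close()
        return text
```

The job-side code only ever sees RemoteFile, so the data file never has to be copied to the machine where the job runs; only the individual requests and their results cross the channel.
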

