Overview

What is HTCondor?

The UoM HTCondor service is part of the IT Services Computationally Intensive Research ecosystem. HTCondor is designed for high throughput computing, where you run the same application to process 10s, 100s, 1000s, … of different data files, or run a simulation 10s, 100s, 1000s, … of times with different input parameters, perhaps to find the “best” set of parameters. HTCondor can perform many of these runs at the same time, giving you your results much sooner than if you had performed each run one after another on your PC/laptop.
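
As an illustration of this pattern, the Python sketch below builds a minimal HTCondor submit description that queues 100 runs of one program, with each run receiving its own run number through the standard `$(Process)` macro. The program name `simulate.sh`, the output file names, and the run count are hypothetical, and the UoM service may require additional settings that are not shown here.

```python
def make_sweep_submit(executable: str, n_runs: int) -> str:
    """Return submit-description text that queues n_runs copies of one program."""
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    lines = [
        f"executable = {executable}",
        # $(Process) expands to 0, 1, ..., n_runs - 1, one value per queued job,
        # so each run can select its own data file or parameter set.
        "arguments  = $(Process)",
        "output     = run_$(Process).out",
        "error      = run_$(Process).err",
        "log        = sweep.log",
        f"queue {n_runs}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical program name and run count, for illustration only.
submit_text = make_sweep_submit("simulate.sh", 100)
print(submit_text)
```

Saving the printed text to a file such as `sweep.sub` and passing it to `condor_submit` on the submitter node would, under these assumptions, queue all 100 runs with a single command.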

Computations run on HTCondor can be either single-threaded or multi-threaded with a small number of threads (provided the application you run supports multi-threaded parallelism). The number of threads in a multi-threaded calculation is limited to the number of CPU cores available on the HTCondor node where it runs.
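
A multi-threaded job normally states how many cores it needs with the standard `request_cpus` submit command. The Python sketch below produces that line and rejects a request larger than a single node can provide. The value `cores_per_node=8` is a placeholder chosen for illustration, not a figure for the UoM nodes.

```python
def request_cpus_line(threads: int, cores_per_node: int) -> str:
    """Return the request_cpus submit line for a job using `threads` threads."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if threads > cores_per_node:
        # A job cannot span nodes, so it can never be matched to a machine.
        raise ValueError(
            f"{threads} threads requested but a node has only {cores_per_node} cores"
        )
    return f"request_cpus = {threads}"


# Placeholder core count: check the actual node sizes before relying on this.
print(request_cpus_line(4, cores_per_node=8))
```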

The HTCondor service comprises a mixture of “always on” rack servers (locally referred to as the HTCondor backbone nodes), desktop PCs located in student teaching clusters (generally only available overnight and during weekends and vacation periods), and “on demand” virtual machines (VMs) running on Amazon Web Services (AWS). Currently there is no charge for using the HTCondor service, although charging for the use of AWS VMs may be introduced in the future.

Typically there are around 1,200 CPU cores available 24/7 in the HTCondor backbone, 1,000 to 3,000 CPU cores provided by student teaching cluster PCs, and a virtually unlimited number of CPU cores available in the AWS cloud. Currently, however, IT Services imposes a limit of 400 CPU cores per user for jobs running on AWS.

The HTCondor service does not use a shared filesystem, so all files required during a calculation must be transferred from the submitter node to each compute node along with the job, and any result/log files generated by the job must be transferred back in the same way (HTCondor will do the file transfers for you). This means that HTCondor is unsuitable for computations involving huge datasets, which are better suited to high performance computing services such as the IT Services Computational Shared Facility.
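
In a submit description file, file transfer is typically controlled by the standard commands `should_transfer_files`, `when_to_transfer_output`, and `transfer_input_files`. The Python sketch below assembles those lines for a list of input files. The file names are hypothetical, and the defaults configured on the UoM service may differ from what is shown.

```python
from typing import Sequence


def transfer_lines(input_files: Sequence[str]) -> str:
    """Return the file-transfer section of a submit description."""
    lines = [
        # No shared filesystem, so always ask HTCondor to move the files.
        "should_transfer_files   = YES",
        # Send output files back to the submitter when the job finishes.
        "when_to_transfer_output = ON_EXIT",
    ]
    if input_files:
        # Comma-separated list of files copied to the compute node with the job.
        lines.append("transfer_input_files    = " + ", ".join(input_files))
    return "\n".join(lines)


# Hypothetical input files, for illustration only.
print(transfer_lines(["model.py", "params.csv"]))
```

Because every listed file is copied for every job, keeping the input list small matters more here than it would on a system with shared storage.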

N.B. HTCondor MUST NOT be used with restricted or highly restricted data. Data with any degree of sensitivity should NOT be processed on this service. Please speak to Research IT if you have any questions.

Acknowledgement

We acknowledge with gratitude the University of Wisconsin-Madison for developing and providing the HTCondor software and the original AWS Marketplace Community AMI HTCondor image used for “bursting into the (AWS) cloud”. Online manuals for the production and development versions of the HTCondor software are available.

Last modified on June 22, 2021 at 10:17 am by George Leaver