High Memory Jobs

Default Memory on the CSF

The standard Intel nodes in the CSF have 4GB to 6GB of RAM per core.

Higher-memory nodes are also available and are described below. First we describe how to check if your jobs are running out of memory (RAM).

Important information

  • None of the high memory nodes have fast InfiniBand networking, so you can only run serial and single-node multi-core (SMP) jobs on them.

We only have a small number of high memory nodes. They should only be used for work that requires significant amounts of RAM. Incorrect use of these nodes may result in restrictions being placed on your account.

How to check memory usage of your job

The batch system keeps track of the resources your jobs are using, and also records statistics about the job once it has finished.

A currently running job

You can see the peak memory usage with the following command:

qstat -j jobid | grep maxvmem

For example:

qstat -j 123456 | grep maxvmem
usage  1:  cpu=7:01:59:41, mem=491221.37209 GBs, io=1.66402, vmem=15.448G, maxvmem=15.522G
                                                                               #
                                                                               # Check this one!

You should ignore the value displayed after mem as this is a total aggregated over time. The value to check is maxvmem, which shows the peak memory usage your job has reached. Divide that value by the number of cores your job is using to get a per-core value.
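
For example, if the hypothetical job above had requested 4 cores, the per-core figure can be worked out as a quick sketch on the command line:

echo "scale=2; 15.522 / 4" | bc
3.88

A peak of roughly 3.88GB per core would fit comfortably within a standard 4GB-per-core node.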

A finished job

If your job has finished, you can check the peak memory usage with the following command:

qacct -j jobid | grep maxvmem
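
For example, reusing the hypothetical job ID from above (the output format and value shown are illustrative):

qacct -j 123456 | grep maxvmem
maxvmem      15.522G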

Reminder: your job’s memory usage will grow and shrink over the lifetime of the job. The maxvmem value is the peak memory usage, so it tells you the largest amount of memory your job tried to use.

Depending on the software you are using you may also find memory usage reported in output files.

Job termination

If, at any point while the job is running, its memory usage goes above the limit the job is permitted to use, the batch system will terminate the job.

The memory limit imposed on a job by the batch system is:

  • “Number of cores your job requests” × “memory per core on the node where the job runs”.

If you are using the standard CSF nodes, can your job use more cores? If so, this would also be a potential way to increase the amount of RAM available to a job and it may result in a shorter wait in the queue than waiting for high memory cores.
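
For example, a minimal sketch of a jobscript that requests extra cores on a standard node purely to gain more memory (the application name myapp is hypothetical):

#!/bin/bash
#$ -cwd
#$ -pe smp.pe 8      # 8 cores x (at least) 4GB per core = at least 32GB for the job

# Run your app (hypothetical name)
myapp my_dataset.dat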

Alternatively, you can give your job access to more memory by running on one of the high-mem nodes detailed below. They offer more memory-per-core.

Requesting high memory nodes

24th July 2024: the mem256 nodes and the mem1024 node have been removed from service permanently. They were very old and several had experienced hardware failures. The number of mem512 nodes has also been permanently reduced to approximately 10 nodes.

Up to 32GB per core

Add ONE of the following options to your jobscript. We recommend you do not request a CPU architecture unless you need support for specific instruction sets as it can lead to a much longer wait in the queue.

#$ -l mem512     # For 32GB per core, any of the CPU types below (system chooses)

# Note: We DO NOT recommend specifying an architecture in most cases
#$ -l mem512 -l haswell
#$ -l mem512 -l ivybridge
#$ -l mem256     # RETIRED!

There are approximately 10 mem512 nodes. When demand for them is high, jobs may wait longer in the queue.
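
For example, a minimal sketch of a mem512 jobscript (the application name myapp is hypothetical):

#!/bin/bash
#$ -cwd
#$ -l mem512         # 32GB per core
#$ -pe smp.pe 4      # 4 cores x 32GB = 128GB available to the job

# Run your app (hypothetical name)
myapp my_large_dataset.dat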

Up to 64GB per core

17th July 2024: The following very high memory nodes are no longer restricted access – anyone can use these nodes.
#$ -l mem1500    # 1.5TB RAM = 48GB per core, max 32 cores (Skylake and Cascade Lake CPU). 7 nodes.
#$ -l mem2000    # 2TB RAM   = 64GB per core, max 32 cores (Icelake CPU), 8TB SSD /tmp. 10 nodes.

#$ -l mem1024    # RETIRED!
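
For example, a minimal sketch of a jobscript requesting one of the mem1500 nodes (the application name myapp is hypothetical):

#!/bin/bash
#$ -cwd
#$ -l mem1500        # 48GB per core
#$ -pe smp.pe 16     # 16 cores x 48GB = 768GB available to the job

# Run your app (hypothetical name)
myapp my_very_large_dataset.dat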

Up to 128GB per core

Access to the following node is restricted. You must demonstrate a clear need for this resource to gain access.
#$ -l mem4000    # 4TB RAM   = 128GB per core, max 32 cores (Icelake CPU), 32TB SSD /tmp. One node.

Runtimes and queue times on high memory nodes

The maximum job runtime on higher-memory nodes is the same as other CSF nodes, namely 7 days.

Due to the limited number of high memory nodes we cannot guarantee that jobs submitted to these nodes will start within the 24 hours that we aim for on the standard CSF3 nodes. Queue times may be several days or more.

Please do not use any of the high memory nodes for jobs that do not require the memory, as this can lead to longer wait times for jobs that genuinely need it.

We ask that you do not submit a large number of jobs to any of the high memory nodes at any one time, so that everyone who needs to use them has an opportunity to do so.

Please request only the number of cores needed to meet the memory requirements of your job. Even if your job would run faster with more cores, requesting cores whose memory the job does not use means other jobs will queue longer than necessary.

We monitor usage of all the high memory nodes and will from time to time advise people if we think they are being incorrectly used. Persistent unfair use of high memory nodes may result in a ban from the nodes or limitations being placed on your usage of them.

mem2000 and mem4000 local SSD storage

The mem2000 and mem4000 nodes have particularly large, fast local SSD storage. This can be useful if your jobs do a lot of disk I/O – frequently reading and writing large files. Your jobs may benefit from first copying large datasets to the SSD drives, then running in that area where they can write output files, and finally copying any results you want to keep back to your scratch area. To access the SSD drives within a jobscript, use the preset environment variable $TMPDIR. For example:

#!/bin/bash
#$ -cwd
#$ -l mem2000        # 64GB per core, large local SSD /tmp
#$ -pe smp.pe 8      # 8 cores x 64GB = 512GB available to the job

# Copy data from scratch to the local SSD drives
cp ~/scratch/my_huge_dataset.dat $TMPDIR

# Go to the SSD drives
cd $TMPDIR
# Run your app
myapp my_huge_dataset.dat -o my_huge_results.dat

# Copy result back to scratch
cp my_huge_results.dat ~/scratch

The $TMPDIR area (which is private to your job) will be deleted automatically by the batch system when the job ends.

Remember, the $TMPDIR location is local to each compute node. So you won’t be able to see the same $TMPDIR storage on the login nodes or any other compute node. It is only available while a job is running.
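
If you want to confirm how much local SSD space is free before copying a large dataset, a simple sketch is to check it from within the jobscript (the amount reported will vary from node to node and from job to job):

# Inside your jobscript, before copying data to the SSD area
df -h $TMPDIR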
