CSF3 – New Scratch Filesystem – Feb 2025

Audience

These instructions are for people who have previously been using the CSF3 with the SGE batch system (your jobscripts use #$ flags), but who now also have an account on the upgraded CSF3 with the Slurm batch system (your jobscripts use #SBATCH).

Users that have recently received an account on the upgraded CSF3, but have never used the older CSF3, can stop reading now – no action needed.

PLEASE DO NOT SUBMIT A REQUEST ASKING FOR ACCESS TO THE NEW SYSTEM – WE WILL CONTACT YOU AT THE APPROPRIATE TIME!!!
(answering these requests will slow down our upgrade work!)

Introduction

As part of the Q1 2025 CSF3 upgrade work, a new scratch filesystem will be installed providing approximately 1.9PB storage (an increase of 500TB) as well as improved performance. This will REPLACE the existing scratch storage.

It also ensures future operation of the CSF – the current hardware will be going off maintenance, which puts the filesystem at risk. The new hardware will have 5 more years of hardware support.

Terminology

This page will use the following terms:

“Old scratch”
The scratch area on the older CSF3, which runs the SGE batch system (your jobscripts use #$ flags). This is the area you have been using so far, possibly for years, and it will be replaced. It contains the existing scratch files you have been working with.

On the upgraded CSF3, this old scratch is available at /scratch-old/$USER and is READ-ONLY.

“New scratch”
The scratch area on the upgraded CSF3, which runs the Slurm batch system (your jobscripts use #SBATCH). When you log in to the upgraded CSF3, this will be your usual day-to-day scratch area on that system. It will be empty initially!

On the upgraded CSF3, this new scratch is available at ~/scratch and is now your scratch area on that system.

Action Required

You are REQUIRED to decide which files you want to retain from your OLD SCRATCH, then COPY THEM to your NEW SCRATCH.

WE WILL NOT BULK TRANSFER YOUR OLD SCRATCH FILES TO THE NEW SCRATCH. AND NOR SHOULD YOU! Please copy ONLY WHAT YOU NEED (spring-clean your scratch!)

IF YOU DO NOTHING, YOUR OLD SCRATCH FILES WILL (EVENTUALLY) BE LOST FOREVER WHEN WE SWITCH OFF THE OLD SCRATCH HARDWARE!!!
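
To help decide what is worth keeping, it can be useful to see how much space each top-level folder in your OLD SCRATCH uses. The following is a minimal sketch, not an official CSF tool: the function name list_sizes is made up for illustration, and it assumes you are logged in to the upgraded CSF3, where /scratch-old/$USER is visible. On a large scratch area, du may take some time to finish.

```shell
#!/bin/bash
# Hypothetical helper (not part of the CSF3 tooling): print the size of each
# top-level item in a directory, smallest first, so the largest appear last.
list_sizes() {
    local dir="${1:-/scratch-old/$USER}"   # defaults to your OLD SCRATCH
    # du -sh: one human-readable total per item; sort -h: order by those sizes
    du -sh "$dir"/* 2>/dev/null | sort -h
}

# Example: summarise your OLD SCRATCH (prints nothing if it is not visible)
list_sizes || true
```

Hidden files and folders (names starting with a dot) are not matched by the * pattern, so check those separately if you use them.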

Accessing your Scratch Areas

You must be logged in to the upgraded CSF3 (Slurm) to work with your old and new scratch. To check that you are on the upgraded CSF3:

#### Remember: You MUST be on the upgraded CSF3 (Slurm) login node ####

[mabcxyz1@login1[csf3] ~]$              # The 'csf3' should be green

ls /scratch-old/$USER
  #
  # You should see your 'old' scratch files from the CSF3 that runs SGE. 

If you see:

ls: cannot access '/scratch-old/mabcxyz1': No such file or directory
  #
  # You're probably on the old CSF3 - you need to be on the upgraded CSF3!

then you are either on the wrong CSF3 (the old CSF3), or you are a new user that does not have an old scratch area.
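
The check above can also be wrapped in a small function, for example near the top of a script that copies files. This is only a sketch: the function name check_old_scratch is hypothetical, and the default path is the one given on this page.

```shell
#!/bin/bash
# Hypothetical helper: report whether an old-scratch directory is visible.
# With no argument it checks the path given on this page.
check_old_scratch() {
    local dir="${1:-/scratch-old/$USER}"
    if [ -d "$dir" ]; then
        echo "OK: $dir is visible"
    else
        echo "NOT FOUND: $dir"
        return 1
    fi
}

# Example: check your own old scratch (|| true stops a failure ending the script)
check_old_scratch || true
```

A NOT FOUND result has the same two possible causes as the ls error: you are on the old CSF3, or you are a new user without an old scratch area.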

The scratch shortcut in your home directory on the upgraded CSF3 (Slurm) now points to NEW SCRATCH.

#### Remember: You MUST be on the upgraded CSF3 (Slurm) login node ####

# NEW SCRATCH is now your default scratch (the three paths below all point to the same place)
~/scratch                       # Will be empty to begin with!!
$HOME/scratch
/scratch/username               # Where username is your CSF username

# OLD SCRATCH is READ-ONLY and there is no shortcut. The path below takes you to your OLD SCRATCH
/scratch-old/username
   #
   # You CANNOT run jobs from here - it is READ-ONLY. BATCH JOBS WILL FAIL IF RUN FROM HERE.

REMEMBER: On the upgraded CSF3 (Slurm), the ~/scratch symlink (shortcut) in your home directory points to your NEW SCRATCH area.
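
If you want to confirm where the shortcut leads, readlink -f prints the real location of a path after following any symlinks. The sketch below wraps it in a function; the name show_target is made up for illustration, and what gets printed depends on how the filesystem is mounted, so treat the result as informational only.

```shell
#!/bin/bash
# Hypothetical helper: print the real location a path resolves to,
# following any symlinks (shortcuts) along the way.
show_target() {
    readlink -f "$1"
}

# On the upgraded CSF3 these would be expected to print the same location:
show_target "$HOME/scratch" || true
show_target "/scratch/$USER" || true
```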

Copying files from OLD to NEW SCRATCH

There are three methods:

  1. Copying files on the login node

    This is fine for smaller copies – not too many files / folders – and is useful when you are working on the upgraded CSF3 and realise you need a few files from your OLD SCRATCH area. You will need to remain logged in for the copy to complete successfully, so if the copy is going to take any length of time, see methods 2 and 3 below.

    Use the following commands:

    #### Remember: You MUST be on the upgraded CSF3 (Slurm) login node ####
    
    # All commands should be run from the OLD SCRATCH area. So first do:
    cd /scratch-old/username
    
    # Copy files from OLD to NEW SCRATCH (remember, on the upgraded CSF3, ~/scratch is NEW SCRATCH)
    rsync -av filename ~/scratch         # Will copy a single file
    rsync -av *.dat *.log ~/scratch      # Copy all files ending with .dat and .log to NEW SCRATCH
    
    # Copy specific files from a directory to the same directory in the NEW scratch
    cd /scratch-old/username/run5/outputs
    mkdir -p ~/scratch/run5/outputs
    rsync -av *.out ~/scratch/run5/outputs
    
    # Copy an entire folder from old scratch to new scratch.
    # With no trailing slash on myfolder, this creates ~/scratch/myfolder.
    # (If this will take more than one hour, use a batch job instead - see below.)
    rsync -av myfolder ~/scratch
    
  2. Copying files in an interactive session

    This is similar to the above login-node copy, but will let you run the commands on a dedicated compute node, where the OLD scratch has been made available. Note that it is NOT available on any other compute node. This is useful when copying a lot of files (or large files) – it keeps the load off the login node, which might otherwise slow down the login node for other users:

    #### Remember: You MUST be on the upgraded CSF3 (Slurm) login node ####
    
    # From the upgraded CSF3 login node, start an interactive session on the transfer node.
    # This will give you a 1-day session on the compute node, but you'll need to remain
    # logged in to the CSF to run any commands.
    srun -p serial -t 1-0 --constraint scratch-old --pty bash
      #
      # Wait to be logged in to the compute node, then:
    
    # Go to your OLD scratch
    cd /scratch-old/username
    
    # Use the commands given in the previous login-node copy method - e.g.,
    rsync -av filename ~/scratch
    rsync -av myfolder ~/scratch
    
    # When finished, go back to the login node
    exit
    
  3. Copying files in a batch job

    This is preferred for larger copies – entire folders or very large datasets. You don’t need to remain logged in once you’ve submitted the batch job. Note that the job must be instructed to run on the dedicated compute node where OLD scratch has been made available. It is NOT available on any other compute node.

    Create a jobscript in your home directory (e.g., ~/my-transfer-job.txt) containing the following:

    #!/bin/bash --login
    #SBATCH -p serial                     # A single core (serial) job to do a file transfer 
    #SBATCH --constraint scratch-old      # Job must run on the dedicated file transfer node
    #SBATCH -t 4-0                        # This requests a 4-day time limit. Time is REQUIRED.
                                          # (Max permitted is 7 days: 7-0).
    
    # Go to your old scratch
    cd /scratch-old/$USER
    
    # Now copy files and folders to the new scratch area (see examples above)
    rsync -av filename ~/scratch
    rsync -av folder ~/scratch
    

    Submit the batch job using:

    #### Remember: You MUST be on the upgraded CSF3 (Slurm) login node ####
    
    # Submit from your home directory, where the jobscript was created
    cd ~
    sbatch my-transfer-job.txt
    

    You can check on the job using squeue (for example, squeue -u $USER lists only your own jobs).
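
Whichever of the three methods you use, it can help to preview a copy before running it and to do a quick sanity check afterwards. The functions below are a sketch under stated assumptions, not an official procedure: the names preview_copy, count_files and compare_counts are made up for illustration. The preview relies on rsync's -n (dry run) option, which lists what would be transferred without copying anything.

```shell
#!/bin/bash
# Hypothetical helpers for checking a copy; the names are made up for illustration.

# Preview what rsync would copy without copying anything (-n means dry run).
preview_copy() {
    rsync -avn "$1" "$2"
}

# Count the regular files below a directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Compare file counts between an old and a new copy of the same folder.
# Succeeds (exit status 0) only when the counts match.
compare_counts() {
    local old new
    old=$(count_files "$1")
    new=$(count_files "$2")
    echo "old=$old new=$new"
    [ "$old" -eq "$new" ]
}
```

For example, preview_copy /scratch-old/$USER/myfolder ~/scratch shows what would be copied, and after the real copy compare_counts /scratch-old/$USER/myfolder ~/scratch/myfolder compares the two folders. Matching counts are only a quick check; they do not prove that the file contents are identical.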

Can I use NEW SCRATCH but carry on using OLD SCRATCH?

No! Jobs cannot be run from OLD SCRATCH (it is READ-ONLY – jobs will fail if run from here). You should use NEW SCRATCH for all of your work.

Remember: the traditional ~/scratch (or /scratch/username) path will take you to your NEW SCRATCH area. You can use the special file-transfer node to copy files from OLD SCRATCH to NEW SCRATCH.
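
Because Slurm jobs normally start in the directory they were submitted from, it is easy to submit a job from OLD SCRATCH by accident. The guard below is a minimal sketch of one way to catch that before submitting; the function name in_old_scratch is hypothetical, and it assumes your current directory path begins with /scratch-old as shown on this page.

```shell
#!/bin/bash
# Hypothetical guard: succeed if a directory is inside the read-only OLD SCRATCH.
# With no argument it checks the current working directory.
in_old_scratch() {
    case "${1:-$PWD}" in
        /scratch-old|/scratch-old/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: warn before submitting a job from the wrong place
if in_old_scratch; then
    echo "You are in OLD SCRATCH (read-only): cd ~/scratch before submitting jobs" >&2
fi
```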

Last modified on March 11, 2025 at 6:20 pm by George Leaver