Logging In – Upgraded (SLURM) CSF3
How to log in to the upgraded CSF3
As usual, you must use a Secure Shell (ssh) program to log in to the upgraded CSF3 cluster. You will need to use a specific address to access the upgraded cluster:
# Temporary address of upgraded CSF cluster (running Slurm)
login01-csf3-test.itservices.manchester.ac.uk
Use your UoM IT username (not email) and IT password (the same as used for many UoM services).
For example, the following command can be run in MobaXterm on Windows, or in a Terminal window on macOS and Linux:
ssh username@login01-csf3-test.itservices.manchester.ac.uk
#
# Use YOUR UoM username here (this is of the form: mxyzabc1, NOT your email address!)
# You will still be asked to perform 2FA as per usual.
# If presented with a message along the lines of the following please enter 'yes'

This host key is known by the following other names/addresses:
    ~/.ssh/known_hosts:7: [hashed name]
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
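Optionally, you can save yourself some typing by adding a host alias on your own computer. The sketch below assumes an OpenSSH-style client and is entirely optional; the alias name csf3-test is just an example:

# Append an alias to ~/.ssh/config on YOUR computer (create the file if it doesn't exist)
cat >> ~/.ssh/config <<'EOF'
# Alias for the upgraded CSF3 (the alias name is just an example)
Host csf3-test
    HostName login01-csf3-test.itservices.manchester.ac.uk
    # Replace mxyzabc1 with YOUR UoM username
    User mxyzabc1
EOF

# You can then log in (2FA still applies) with:
ssh csf3-test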
Please Note: At present there is only one login node available; additional login nodes will be brought online in due course.
Mac users: Previous problems when logging in from a Mac while connected to EduRoam WiFi and the VPN (GlobalProtect) have now been resolved.
How can I tell which system I’m using?
As part of the CSF3 upgrade, new login nodes have been installed. If you are on a new login node, you will have access to the Slurm batch system. If you’re on an old login node, you’ll have access to SGE.
To check which system you are on, look at the prompt:
[mabcxyz1@login1 [csf3] ~]$    # Old CSF3 (running SGE) uses red.
[mabcxyz1@login1[csf3] ~]$     # Upgraded CSF3 (running Slurm) uses green.

# You can also run:
which qsub      # Tells you where it is on old CSF3, or "no qsub in..." on upgraded CSF3.
which sbatch    # Tells you where it is on upgraded CSF3, or "no sbatch in..." on old CSF3.
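If you have your own helper scripts that need to work on both systems during the transition period, one possible approach (a sketch, not an official recommendation) is to test which submission command exists:

# Detect which batch system is available on the current login node
if command -v sbatch >/dev/null 2>&1; then
    echo "Upgraded CSF3 (Slurm)"
elif command -v qsub >/dev/null 2>&1; then
    echo "Old CSF3 (SGE)"
else
    echo "No batch system commands found"
fi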
I’ve logged in, now what?
Congratulations, you have now logged on to the newly upgraded CSF cluster. You should be able to interact with the cluster via the command line:
Run some initial commands to check your storage areas:
# List what is in your home directory - it is the same as on the older CSF!
ls

# List what is in your new scratch directory - it is empty to begin with!
ls ~/scratch
#
# No files will be listed - the scratch area is brand-new!

# List what is in your old scratch directory - it is the same as on the older CSF!
ls /scratch-old/$USER
#
# Only available on the login nodes and one special compute node
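Before you start copying anything from the old scratch area (see the notes on the new scratch filesystem below), it is worth checking how much data you actually have there. A quick sketch:

# Total size of your old scratch area (may take a while for large areas)
du -sh /scratch-old/$USER

# Size of each top-level directory, smallest first, to help decide what to keep
du -sh /scratch-old/$USER/* | sort -h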
Now run some initial commands to check the batch system:
squeue
#
# Yes, we're now using Slurm, not SGE! qstat, qsub and qdel will NOT work (try it!)
# No jobs will be listed - you haven't submitted any!

# Submit a job from the new scratch area (a simple serial job "one-liner")
cd ~/scratch
sbatch -p serial -t 5 --wrap="date && sleep 60 && date"
#
# Note: The "serial" partition name is now required.
# The wallclock time limit (5 minutes in this example) is now required.
# We'll not run a jobscript but instead give a list of commands to run.

# Check your queue (you should have one job there)
squeue
#
# When the job has finished, squeue will show no jobs

# Check the output file (look for a file named "slurm-jobid.out" where jobid was the job number)
ls
cat slurm-456.out
That’s it – you’ve now run your first job in Slurm on the upgraded CSF.
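If you prefer a jobscript to the --wrap one-liner, the sketch below is a minimal serial jobscript that does the same thing. The filename first-job.sbatch and the commented-out modulefile name are illustrative only – see the SGE to Slurm guide and the Software pages for full examples:

#!/bin/bash --login
# The --login option is commonly used on the CSF so that the module command is available.
#SBATCH -p serial      # Partition name - now required
#SBATCH -t 5           # Wallclock time limit in minutes - now required
#SBATCH -n 1           # A single task (serial job)

# module load someapp/1.2.3   # Hypothetical modulefile - names are the same as on the old CSF3

date
sleep 60
date

Submit it from your new scratch area with sbatch first-job.sbatch and monitor it with squeue, exactly as above.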
There are some major changes you should be aware of:
- New Batch System/Job Scheduler – Slurm. The older batch system, SGE, has been retired on the upgraded cluster and replaced with Slurm. As a result, existing SGE job scripts will need to be updated for compatibility with Slurm, and any new job scripts must be written following Slurm conventions. We have a dedicated guide available here – SGE to Slurm – which also tells you the new Slurm commands to run (how to submit jobs, check your queue etc.). A quick command comparison is sketched after this list.
- New Scratch File System – A new scratch file system has been introduced, providing 1.9 PB of storage capacity (an increase of 500 TB) along with improved performance. The old scratch system is now accessible in read-only mode on the login nodes and a dedicated file transfer node. As a result, users will need to selectively copy their data from the old scratch file system – the old scratch will eventually be switched off! Please DO NOT simply copy everything from old scratch to new scratch! Note that you CANNOT run jobs from the old scratch area. Detailed instructions on how to perform this transfer can be found here – New Scratch filesystem. An example copy command is sketched after this list.
- Same applications and modulefiles – The applications that you use on the old CSF should be there on the upgraded CSF. The modulefile names are the same. Please check the Software pages for the applications you use – we are in the process of updating these pages to provide Slurm jobscript examples.
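For reference, the everyday SGE commands map to Slurm commands roughly as follows. This is only a quick sketch of the most common cases; the SGE to Slurm guide linked above is the definitive reference:

# SGE (old CSF3)          Slurm (upgraded CSF3)
# qsub jobscript    --->  sbatch jobscript         (submit a job)
# qstat             --->  squeue -u $USER          (list your queued/running jobs)
# qdel 12345        --->  scancel 12345            (delete/cancel a job)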
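And as an illustration of a selective copy from the old scratch area to the new one – the directory name my_project below is purely hypothetical; copy only the data you actually still need, as described in the New Scratch filesystem guide:

# Copy ONE directory you still need from old (read-only) scratch to new scratch.
# "my_project" is a hypothetical name - replace it with a real directory of yours.
# For large transfers, use the dedicated file transfer node mentioned in the guide.
rsync -av /scratch-old/$USER/my_project/ ~/scratch/my_project/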