Monitoring Jobs
Monitoring Existing Jobs
You can use srun to monitor existing jobs. It logs in to the allocated resource on the compute node where the job is running and gives you an interactive session there.
You will need to know the JobID number of the job you wish to monitor, then run:
srun --jobid JOBID --pty bash
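If you do not have the JobID to hand, Slurm's squeue command can list your jobs. The following is a minimal sketch rather than a site-provided tool: the function names running_job_ids and first_running_job are hypothetical helpers introduced here for illustration, and the sketch assumes squeue is available on the login node.

```shell
#!/bin/bash
# Print the JobIDs of your running jobs, one per line.
#   -u USER     only show jobs owned by this user
#   -t RUNNING  only show jobs in the RUNNING state
#   -h          omit the header line
#   -o '%i'     print only the JobID column
running_job_ids() {
    squeue -u "$(id -un)" -t RUNNING -h -o '%i'
}

# Hypothetical helper: print the first running JobID,
# or return a non-zero status if no job is running.
first_running_job() {
    local jobid
    jobid=$(running_job_ids | head -n 1)
    if [ -z "$jobid" ]; then
        echo "no running jobs found" >&2
        return 1
    fi
    echo "$jobid"
}
```

With these functions defined, srun --jobid "$(first_running_job)" --pty bash would attach to the first job listed. If you have several jobs running, check the output of running_job_ids and pick the JobID yourself.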
If you’ll be using a GUI tool to monitor your job, use:
srun-x11 --jobid JOBID # NO "--pty bash" needed for srun-x11
To limit how long your interactive session runs, add the -t timespec flag to the srun command. For example, -t 10 limits the session to 10 minutes.
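As an illustration of combining the JobID and the time limit, here is a small sketch of a wrapper. The function name monitor_job is a hypothetical helper, not part of Slurm or of this site's tooling; it only assembles the srun invocation described above.

```shell
#!/bin/bash
# Hypothetical helper: attach to a running job, optionally with a time limit.
# Usage: monitor_job JOBID [MINUTES]
monitor_job() {
    local jobid="${1:-}"
    local minutes="${2:-}"
    if [ -z "$jobid" ]; then
        echo "usage: monitor_job JOBID [MINUTES]" >&2
        return 1
    fi
    if [ -n "$minutes" ]; then
        # -t MINUTES ends the interactive session after that many minutes.
        srun --jobid "$jobid" -t "$minutes" --pty bash
    else
        srun --jobid "$jobid" --pty bash
    fi
}
```

For example, monitor_job 1234567 10 would run srun --jobid 1234567 -t 10 --pty bash, where 1234567 is a placeholder JobID.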
GPU jobs
If you are monitoring a GPU job, you can run nvidia-smi inside the interactive session to see GPU utilization and memory usage.
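If the default nvidia-smi table is more than you need, its CSV query mode can be reduced to one line per GPU. The sketch below assumes you are already inside the monitoring session on the compute node; format_gpu_usage and gpu_usage are hypothetical helper names chosen for this example.

```shell
#!/bin/bash
# Turn CSV lines of the form "index, util, used, total" (read from stdin)
# into a short human-readable summary, one line per GPU.
format_gpu_usage() {
    awk -F', *' '{ printf "GPU %s: %s%% utilization, %s/%s MiB\n", $1, $2, $3, $4 }'
}

# Query each GPU's index, utilization and memory, then format the result.
# noheader and nounits strip the column names and the "%"/"MiB" suffixes.
gpu_usage() {
    nvidia-smi \
        --query-gpu=index,utilization.gpu,memory.used,memory.total \
        --format=csv,noheader,nounits | format_gpu_usage
}
```

Running gpu_usage once gives a snapshot. To refresh it periodically, something like watch -n 5 nvidia-smi is a simpler alternative.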
Ending your monitoring session
Run exit to end your interactive monitoring session. This will NOT terminate your batch job. You’ll return to the login node.