The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.
Jupyter Notebook
Overview
Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
It is installed as part of Anaconda Python. The modulefile described here provides a helper script to submit Jupyter Notebook jobs correctly on the CSF. The version number is the version of our helper script, not of Jupyter Notebook. You must also load an Anaconda Python modulefile to access your required version of Jupyter Notebook.
Please read carefully the instructions below when connecting a web-browser to your jupyter notebook on the CSF – there are some extra steps needed that you would not normally do if you were running a jupyter notebook on your local desktop/laptop.
Restrictions on use
There are no restrictions on accessing this software.
Set up procedure
To access the software you must first load the modulefile:
# The version number is the version of our helper script, not jupyter notebook.
module load apps/binapps/jupyter-notebook/1.0
You must also load the version of Anaconda Python you require. To see the available versions, run:
module avail apps/binapps/anaconda
This will list the available modulefiles. Use the module load command to load your required version.
Initial Setup (required!)
Before running Jupyter Notebook for the first time, a configuration file needs to be generated. You only need to do this once and it can be done on the login node.
After loading the modulefile, run the following:
jupyter-notebook --generate-config
#
# It will output:
[I 10:03:01.010 NotebookApp] Writing notebook server cookie secret to /run/.../notebook_cookie_secret
Writing default config to: /mnt/iusers01/xy01/mabcxyz1/.jupyter/jupyter_notebook_config.py
#
# The xy01 group and mabcxyz1 username will differ
The command reports where it has generated the configuration file (within your home area). The file is at the following location, where ~ means your home directory:
~/.jupyter/jupyter_notebook_config.py
You should now create a password to protect access to your notebooks when the server is running. NOTE: please do NOT use your central IT password. This is not a terribly secure method of protecting your notebooks but will offer some protection. Please do NOT use a password that is used on other systems.
Run the following on the login node:
python -c "from notebook.auth import passwd; print(passwd())"
#
# It will display:
Enter password:      Enter your new notebook password and press [return]
Verify password:     Enter the same password again and press [return]
sha1:bf8810b0e5a2:bac141e3590ce339c955f21dd6b7de1b20
#
# The sha1:... line is the result (yours will be different).
# We need to copy this in to the config file generated earlier.
Take a copy of the above line. Now edit the config file generated earlier:
gedit ~/.jupyter/jupyter_notebook_config.py
#
# Could also use nano or vi or emacs editors
Search the file for a line that looks like:
# c.NotebookApp.password = ''
and change it to be:
c.NotebookApp.password = 'sha1:bf8810b0e5a2:bac141e3590ce339c955f21dd6b7de1b20'
#
# Notice we have removed the '#' at the start
where the sha1:.... value is copied from the output generated earlier.
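The hashed value has three colon-separated fields: the algorithm name, a random salt and a hex digest. The sketch below illustrates that layout using only the Python standard library. It assumes the digest is the SHA-1 of the password followed by the salt, which is an assumption about how older notebook releases behave and is not confirmed by this page. The names make_hash and check_hash are hypothetical. You should still generate the real value with passwd() as shown above.

```python
import hashlib
import secrets


def make_hash(password, salt=None):
    """Return an 'sha1:salt:digest' string (illustrative layout only)."""
    if salt is None:
        # 6 random bytes give a 12-character hex salt, like the example above.
        salt = secrets.token_hex(6)
    data = password.encode("utf-8") + salt.encode("ascii")
    digest = hashlib.sha1(data).hexdigest()
    return ":".join(("sha1", salt, digest))


def check_hash(stored, password):
    """Check a password against a stored 'sha1:salt:digest' string."""
    algorithm, salt, digest = stored.split(":", 2)
    if algorithm != "sha1":
        return False
    data = password.encode("utf-8") + salt.encode("ascii")
    candidate = hashlib.sha1(data).hexdigest()
    # Constant-time comparison of the two hex strings.
    return secrets.compare_digest(candidate, digest)


example = make_hash("not-my-IT-password")
print(example)
print(check_hash(example, "not-my-IT-password"))
```

Because the salt is random, running the command twice with the same password gives two different strings, which is why your result will not match the one shown on this page.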
Now search the config file for a line that looks like:
# c.NotebookApp.ip = 'localhost'
Change this to be:
c.NotebookApp.ip = '*'
#
# Notice we have removed the '#' at the start
When you connect to a jupyter notebook server with a web-browser you will be asked for the password you specified earlier. Note that this does NOT use https (at the moment). So you should NOT use a password that is of value to you.
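If you prefer not to edit the file by hand, the two config changes described above can be sketched as a small Python function that uncomments an option and sets its value. This is a minimal sketch, not part of the CSF helper script: set_option is a hypothetical name, and it assumes each option appears at most once, on its own line, as in the generated file. It works on a string here so that you can check the result before writing anything to ~/.jupyter/jupyter_notebook_config.py.

```python
import re


def set_option(config_text, name, value):
    """Uncomment (or append) 'name = value' in the text of a config file."""
    # Match the option at the start of a line, with or without a leading '#'.
    pattern = re.compile(
        r"^#?[ \t]*" + re.escape(name) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = "{} = {!r}".format(name, value)
    # A function replacement avoids backslash handling in the new text.
    new_text, count = pattern.subn(lambda match: new_line, config_text, count=1)
    if count == 0:
        # Option not present at all: add it at the end of the file.
        new_text = config_text.rstrip("\n") + "\n" + new_line + "\n"
    return new_text


sample = "# c.NotebookApp.ip = 'localhost'\n# c.NotebookApp.password = ''\n"
updated = set_option(sample, "c.NotebookApp.ip", "*")
updated = set_option(
    updated,
    "c.NotebookApp.password",
    "sha1:bf8810b0e5a2:bac141e3590ce339c955f21dd6b7de1b20",  # use your own value
)
print(updated)
```

Keep a copy of the original config file before overwriting it with the updated text.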
That is all of the setup we need to do. You can now start a jupyter-notebook job as described below.
Running the application
Your jupyter notebook will run in the batch system on a compute node. The basic procedure is as follows:
- Submit a jupyter notebook job on the CSF and wait for it to run. We do this using a helper script named jupyter-notebook-csf run on the login node (no need to write a jobscript yourself!)
- On your local desktop/laptop create an SSH tunnel to the CSF compute node running your jupyter notebook.
- On your local desktop/laptop connect a web-browser to the SSH tunnel – it will then connect to the CSF job.
The above procedure is shown in detail below.
jupyter-notebook-csf helper script
The jupyter-notebook-csf helper script mentioned above (used to submit a batch job from the login node) accepts the following flags:
jupyter-notebook-csf -h

jupyter-notebook-csf [-p NUMCORES] [-s] [-j] [-k] [-P PORTNUM]
Starts a Jupyter Notebook server on a compute node.
  -p NUMCORES -- Run in smp.pe with NUMCORES cores. Default is 1, max 24.
                 Note: -p 1 will run as a serial job, not in smp.pe.
  -s          -- Run in the short area (shorter max runtime allowed).
  -j          -- Generate the jobscript but DO NOT submit it. Debugging only.
  -k          -- Keep the jobscript file after it has been submitted. Deleted by default.
                 The autogenerated jobscript is not very useful to you.
  -P PORTNUM  -- Port number from which to start searching for a free port for
                 the the notebook server to listen on. Default starting port is 8888.
  -h          -- Display this help.
Check the .o file for instructions on how to connect a web-browser to the
server once the job runs.
You only need to use the flags if you want more than one core or want to run on a particular type of compute node, for example to get a certain amount of memory. Some examples:
jupyter-notebook-csf            # runs a serial (1-core) job
jupyter-notebook-csf -p 4       # runs a parallel 4-core job
jupyter-notebook-csf -p 8 -s    # runs a parallel 8-core job in the short area (max 12 cores)
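The flag rules above can be sketched as a small Python function that builds the helper script command line and rejects core counts outside the stated limits (24 cores normally, 12 in the short area). This is only an illustration: build_submit_command is a hypothetical name and the function does not submit anything itself.

```python
def build_submit_command(cores=1, short=False, start_port=None):
    """Build the jupyter-notebook-csf command line as a list of arguments."""
    # Limits taken from the help text and examples on this page.
    max_cores = 12 if short else 24
    if not 1 <= cores <= max_cores:
        raise ValueError(
            "cores must be between 1 and {} (got {})".format(max_cores, cores)
        )
    command = ["jupyter-notebook-csf"]
    if cores > 1:
        # With no -p flag the job runs as a serial (1-core) job.
        command += ["-p", str(cores)]
    if short:
        command.append("-s")
    if start_port is not None:
        # First port to try when searching for a free port (default 8888).
        command += ["-P", str(start_port)]
    return command


print(" ".join(build_submit_command()))                      # jupyter-notebook-csf
print(" ".join(build_submit_command(cores=4)))               # jupyter-notebook-csf -p 4
print(" ".join(build_submit_command(cores=8, short=True)))   # jupyter-notebook-csf -p 8 -s
```

Returning a list rather than a single string means the result could be passed directly to subprocess.run on the login node without any shell quoting.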
We now go through the complete steps for using a Jupyter Notebook.
Complete Steps to Running a Jupyter Notebook
- Submit a jupyter notebook job using the helper script run on the login node:
# Run in scratch area
cd ~/scratch

# Choose your version of anaconda python
module load apps/binapps/anaconda/3/4.2.0

# Load the CSF helper script modulefile
module load apps/binapps/jupyter-notebook/1.0

# Run one of the following examples on the login node:
jupyter-notebook-csf            # Submit a serial (1-core) job
jupyter-notebook-csf -p 4       # Submit a parallel 4-core job
jupyter-notebook-csf -p 4 -s    # As above but runs in the 'short' area
The output from the helper script can be one of two things:
- If the job runs straight away in the batch system you’ll see:
Your job 83873 ("jnotebook") has been submitted
Checking if job has already started...

Starting jupyter-notebook on node012 port 8888

You must now ssh in to the CSF from your local machine and tunnel to the
backend compute node. Use the following command on *your* computer (not on the CSF):

  ssh -L 8888:node012:8888 mabcxyz@csf2.itservices.manchester.ac.uk

You should then start a web-browser on *your* computer and browse to:

  http://localhost:8888

When you have logged out of your notebook in the web-browser you *must* run
the following command on the CSF to stop the Jupyter server:

  qdel 83873

Then log out of your tunnelled ssh session.
This means that the jupyter-notebook job started immediately on a compute node so you are able to proceed with setting up the SSH tunnel from your local computer to the CSF and then connecting your web-browser to it by following the instructions in the above message (see below for an example).
- If the job doesn’t run immediately but waits in the batch queue:
The job is probably still waiting to run. You *cannot* connect a web-browser
to the CSF until the job runs. Please be patient. To check your job's status, run:

  jnotebook-status 83873

When the job eventually runs you should then run the following command to get
further instructions on how to connect your web-browser to the Jupyter notebook:

  cat jnotebook.o83873

Follow the instructions given in that file.
Note that the job-ID 83873 will be different for your job. The above message means that you must check on your job using the jnotebook-status JOBID command to see when it is running. When it is running you can proceed with the instructions given in the file jnotebook.oJOBID, where JOBID is the job-ID of your own job.
- Once the notebook job is running you can now set up the SSH tunnel on your computer (not the CSF). You’ll need to open a terminal window. For example, if you are on Windows use MobaXterm, if on Mac use the Terminal application and if on Linux use an Xterm or GNOME Terminal. Then run:
ssh -L 8888:node012:8888 username@csf.itservices.manchester.ac.uk
        ^     ^      ^   ^
        |     |      |   |
        |     |      |   +----- use your own central IT username here
        +-----+------+
              |
              +---------- These values (port numbers and compute node name) are reported
                          in the above output when you submitted the notebook job.
                          You should use the values appropriate to your job (the port
                          will likely be 8888 but could be different. The node name
                          will very likely be a different node number).
- Now start a web-browser on your PC and browse to:
http://localhost:8888
#
# Use the same port number as used above
You should be taken to the Jupyter Notebook password page. Enter the password you generated earlier when you did the one-time setup to create the config file.
You can now use the Jupyter Notebook through your own web-browser. Any code you enter is executed on the CSF compute node on which the Jupyter Notebook server is running. Hence you should ensure your code uses the correct number of cores you reserved when you submitted the notebook job.
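The tunnel step in the list above can be sketched in Python, run on your own computer. The first function pulls the node name and port out of the job output and builds the matching ssh command. The second checks whether anything is listening on the local end of the tunnel before you open the web-browser. Both function names are hypothetical, and the sketch assumes the job output contains a line of the form "Starting jupyter-notebook on NODE port PORT" as in the example message above.

```python
import re
import socket

# Line printed by the helper script once the job has started (see above).
START_LINE = re.compile(r"Starting jupyter-notebook on (\S+) port (\d+)")


def tunnel_command(job_output, username, login_host):
    """Return the ssh tunnel command as a list, or None if the job has not started."""
    match = START_LINE.search(job_output)
    if match is None:
        return None
    node, port = match.group(1), int(match.group(2))
    # The same port number is used on your computer and on the compute node.
    forward = "{0}:{1}:{0}".format(port, node)
    return ["ssh", "-L", forward, "{}@{}".format(username, login_host)]


def tunnel_is_up(port, host="localhost", timeout=2.0):
    """Return True if something accepts connections on host:port.

    This only shows that the local end of the tunnel is listening. It does
    not prove that the notebook server on the compute node is reachable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


output = "Starting jupyter-notebook on node012 port 8888"
print(tunnel_command(output, "username", "csf.itservices.manchester.ac.uk"))
```

Copy the text of your jnotebook.oJOBID file in as the job output, and replace username with your own central IT username.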
Stopping a Notebook
When you have finished with the jupyter notebook you must log out of the notebook in your web-browser and remove the job that is running on the CSF compute node. Do the following:
- Log out of the server through your web-browser (hit the logout button)
- Stop the CSF job that is running the server:
# On the CSF login node:
qdel 83873
#
# Change the job id to be that of your notebook job.
# If unsure, run 'qstat' to see a list of your jobs.
- Log out of the window running the ssh tunnel created earlier.
Please do remember to stop your CSF server job via the qdel command. Logging out of the notebook in your web-browser does NOT stop the CSF job. You MUST do this using qdel to free up the compute node for other users.
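If you saved the text printed when you submitted the job, the job-ID needed for qdel can be recovered from it. The following is a minimal sketch with hypothetical function names. It assumes the submission message has the form shown earlier on this page, and it only builds the command and the output file name without running anything.

```python
import re

# Submission message printed by the batch system, as shown earlier on this page.
SUBMIT_LINE = re.compile(r'Your job (\d+) \("([^"]+)"\) has been submitted')


def parse_submission(submit_output):
    """Return (job_id, job_name) from the submission message."""
    match = SUBMIT_LINE.search(submit_output)
    if match is None:
        raise ValueError("no job id found in the submission output")
    return match.group(1), match.group(2)


def stop_command(submit_output):
    """Build the qdel command that stops the notebook job."""
    job_id, _ = parse_submission(submit_output)
    return ["qdel", job_id]


def output_file(submit_output):
    """Name of the job output file, for example jnotebook.o83873."""
    job_id, job_name = parse_submission(submit_output)
    return "{}.o{}".format(job_name, job_id)


message = 'Your job 83873 ("jnotebook") has been submitted'
print(" ".join(stop_command(message)))   # qdel 83873
print(output_file(message))              # jnotebook.o83873
```

On the CSF login node the list returned by stop_command could be passed to subprocess.run to stop the job.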
Further info
Updates
None.