Jupyter Notebook

Overview

Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.

Version 5.5.0 is installed on the CSF. This is part of Anaconda3 Python v5.2.0, which provides Python 3.6.5.

Please read the instructions below carefully when connecting a web-browser to your jupyter notebook on the CSF – there are some extra steps needed that you would not normally do if you were running a jupyter notebook on your local desktop/laptop (or the x2go virtual desktop if using that).

It is possible to run jupyter notebook on the Nvidia GPU nodes. However, you must first ask to be added to the relevant group that grants access to GPUs before you can run Jupyter Notebook on them.

Restrictions on use

There are no restrictions on accessing this software.

Set up procedure

To access the software you must first load the modulefile:

module load apps/binapps/jupyter-notebook/5.5.0     # Uses anaconda3/5.2.0
module load apps/binapps/jupyter-notebook/5.7.8     # Uses anaconda3/2019.03
module load apps/binapps/jupyter-notebook/6.0.0     # Uses anaconda3/2019.07

These modulefiles will automatically load the Anaconda Python modulefile for you – jupyter is provided by Anaconda.

If you wish to load your own Anaconda modulefile version, for example if you have used a specific version to create a conda env or install some python packages, then the following modulefile can be used. It does NOT load any anaconda modulefiles. You must load an anaconda modulefile before loading this one. But it will give you access to the jupyter-notebook-csf helper script, which makes it easier to submit Jupyter Notebook jobs:

# You must load an anaconda modulefile before loading this:
module load apps/binapps/jupyter-notebook/any       # Does NOT load any anaconda modulefile

Initial Setup

No initial setup is needed to run jupyter notebooks. If you wish, you can password protect your notebooks using the commands below, but this is optional. By default a jupyter notebook will present you with a login page when you first visit the notebook with your web-browser. You can set up a password via that webpage. You can also use the token-based security which the notebook uses if you don’t set up a password. This requires you to enter a random string of letters and numbers, which is reported in the job output .e file for the job running the notebook.

Hence only you can access your notebooks when they are running on the CSF, even if you don’t set up a notebook password.

Our recommendation is to use the default token-based authentication and not bother setting up a password for the notebooks.

Optional Setup

After loading the Jupyter Notebook modulefile, run the following on the login node:

jupyter-notebook --generate-config

# It will output:
Writing default config to: /mnt/iusers01/xy01/mabcxyz1/.jupyter/jupyter_notebook_config.py
   #
   # The xy01 group and mabcxyz1 username will differ

The command has reported where it has generated the configuration file (within your home area). The file is at (where ~ means your home directory):

~/.jupyter/jupyter_notebook_config.py

Optional Password Protection (skip this)

Password protecting the access to your running notebooks is optional. If you do not set a password then the notebooks are still only accessible by you – a notebook’s webpage will ask for a token (a random set of letters and numbers) to be entered, which only you can get from the job output .e file (see below). It is very unlikely another user would be able to guess this token!

In fact we recommend that you skip password protection and use the token method instead – i.e., you don’t need to do anything here and can jump straight to running the notebook.

If you do wish to create a password to protect access to your notebooks when the server is running, do NOT use your central IT password. This is not a terribly secure method of protecting your notebooks but will offer some protection. Please do NOT use a password that is used on other systems or is valuable to you.

Run the following on the login node:

python -c "from notebook.auth import passwd; print(passwd())"

# It will display:
Enter password: Enter your new notebook password and press [return]
Verify password: Enter the same password again and press [return]
sha1:bf8810b0e5a2:bac141e3590ce339c955f21dd6b7de1b20
  #
  # This line is the result (yours will be different).
  # We need to copy this in to the config file generated earlier.

Take a copy of the above line. Now edit the config file generated earlier:

gedit ~/.jupyter/jupyter_notebook_config.py
  #
  # Could also use nano or vi or emacs editors

Search the file for a line that looks like:

# c.NotebookApp.password = ''

and change it to be:

c.NotebookApp.password = 'sha1:bf8810b0e5a2:bac141e3590ce339c955f21dd6b7de1b20'
#
# Notice we have removed the '#' at the start

where the sha1:.... is copied from the output generated earlier.

Now search the config file for a line that looks like:

# c.NotebookApp.ip = 'localhost'

Change this to be:

c.NotebookApp.ip = '*'
#
# Notice we have removed the '#' at the start

When you connect to a jupyter notebook server with a web-browser you will be asked for the password you specified earlier. Note that this does NOT use https (at the moment). So you should NOT use a password that is of value to you.

That is all of the setup we need to do. You can now start a jupyter-notebook job as described below.

Running the application

Your jupyter notebook will run in the batch system on a compute node. The basic procedure is as follows:

  1. Submit a jupyter notebook job on the CSF and wait for it to run. We do this using a helper script named jupyter-notebook-csf run on the login node – no need to write a jobscript yourself!
  2. On your local desktop/laptop (or the x2go virtual desktop if using that) create an SSH tunnel to the CSF compute node running your jupyter notebook.
  3. On your local desktop/laptop (or the x2go virtual desktop if using that) connect a web-browser to the SSH tunnel – it will then connect to the CSF job.

Using an SSH tunnel may seem like a complicated method but it is the only way to reach the CSF compute node where your jupyter job is running. It also improves security – any traffic between your web-browser and the jupyter server will be encrypted.
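The tunnel command always has the same shape: ssh -L PORT:NODE:PORT USERNAME@LOGINHOST. As a minimal sketch, the Python below builds that command from the node name and port that the helper script reports. The function name tunnel_command is hypothetical and is not part of the CSF tooling; the login host is the one used elsewhere on this page.

```python
import shlex

def tunnel_command(node, port, username,
                   login_host="csf3.itservices.manchester.ac.uk"):
    """Build the ssh port-forwarding command for a notebook job.

    node and port are the values reported when the notebook job starts;
    username is your central IT username.
    """
    args = ["ssh", "-L", f"{port}:{node}:{port}", f"{username}@{login_host}"]
    # Quote each argument so the result can be pasted into a terminal safely.
    return " ".join(shlex.quote(a) for a in args)

# Example values only: your node name, port and username will differ.
print(tunnel_command("node780", 8888, "mabcxyz1"))
```

The printed command is meant to be run on your own computer, not on the CSF.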

The above procedure is now shown in detail below.

jupyter-notebook-csf helper script

The jupyter-notebook-csf helper script mentioned above (used to submit a batch job from the login node) accepts the following flags:

jupyter-notebook-csf [-p|-a NUMCORES] [-m 16|32|50|256|1024] [-j] [-k] [-P PORTNUM] \
                     [-g NUMGPUS] [-t JOBID] [-c JOBID]

Starts a Jupyter Notebook server on a compute node.

-p NUMCORES  -- Run in smp.pe with NUMCORES cores. Default is 1, max 32.
-a NUMCORES  -- Run in amd.pe with NUMCORES cores. Default is 2, max 168.
                Note: -p|-a 1 will run as a serial job on the serial nodes, not in smp.pe or amd.pe.
-m 16  or -m 256      -- Run on a 16GB/core (256GB) node (max 16 cores).
-m 32  or -m 512      -- Run on a 32GB/core (512GB) node (max 16 cores).
-m 50  or -m 1024     -- Run on a 50GB/core (1024GB) node (max 20 cores). [REQUEST ACCESS BEFORE USING]
-m 46  or -m 1500     -- Run on a 46GB/core (1500GB or 1.5TB) node (max 32 cores). [REQUEST ACCESS ...]
-m 62  or -m 2000     -- Run on a 62GB/core (2000GB or 2TB) node (max 32 cores). [REQUEST ACCESS ...]
-m 125 or -m 4000     -- Run on a 125GB/core (4000GB or 4TB) node (max 32 cores). [REQUEST ACCESS ...]
                If no -m flag given, the system will choose an ordinary compute node for you.
                NOTE: you cannot use the -m flag if running on GPU nodes (-g flag).
-j           -- Generate the jobscript but DO NOT submit it. Debugging only.
-k           -- Keep the jobscript file after it has been submitted. Deleted by default.
                The autogenerated jobscript is not very useful to you.
-P PORTNUM   -- Port number from which to start searching for a free port for the
                notebook server to listen on. Default starting port is 8888.
-g NUMGPUS   -- Run on the Nvidia v100 GPU nodes IF YOU HAVE ACCESS TO SUCH NODES!
-G NUMGPUS   -- Run on the Nvidia A100 GPU nodes IF YOU HAVE ACCESS TO SUCH NODES!
-G40 NUMGPUS -- Run on the Nvidia A100(40GB) GPU nodes IF YOU HAVE ACCESS TO SUCH NODES!
-L NUMGPUS   -- Run on the Nvidia L40S GPU nodes IF YOU HAVE ACCESS TO SUCH NODES!
                Depending on your level of access you can request 1 -- 4 GPUs
                NOTE: you cannot use the -g flag if running on highmem nodes (-m flag).

To check whether a jupyter notebook job is running:

-c JOBID     -- report qstat status of job

Once a job is running, get info about accessing the notebook via a web-browser:

-t JOBID     -- read the job .e file and report the token used for authentication.
                You can only do this AFTER the jupyter notebook batch job has started.
                This is to help with logging in to the notebook in a web-browser.

-h           -- Display this help.

Check the .o file for instructions on how to connect
a web-browser to the server once the job runs.

You only need to use the flags if you want more than one core or want to run on a particular type of compute node (for example, to get a certain amount of memory). Some examples:

jupyter-notebook-csf              # runs a serial (1-core) job - on a 4--6GB/core node
jupyter-notebook-csf -p 4         # runs a parallel 4-core job - on a 4--6GB RAM/core node
jupyter-notebook-csf -p 8 -m 32   # runs a parallel 8-core job - on a 32GB/core node
jupyter-notebook-csf -p 8 -g 1    # runs a parallel 8-core job - on a GPU node using 1 v100 GPU
jupyter-notebook-csf -p 8 -g 1 -j # generate job as above but don't submit the jobscript
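
The flag rules above (at most 32 cores with -p, and no mixing of -m with -g) can be checked before submitting. The following is only an illustrative sketch: helper_command is a hypothetical wrapper, not something provided on the CSF, and it covers only the -p, -m, -g and -j flags described in the help text.

```python
def helper_command(cores=1, mem=None, gpus=None, dry_run=False):
    """Return the jupyter-notebook-csf command line as a list of arguments."""
    if mem is not None and gpus is not None:
        # The help text says -m and -g cannot be used together.
        raise ValueError("-m (high-memory) and -g (GPU) cannot be combined")
    if not 1 <= cores <= 32:
        # The help text gives 32 as the maximum for -p.
        raise ValueError("-p accepts 1 to 32 cores")
    cmd = ["jupyter-notebook-csf"]
    if cores > 1:
        cmd += ["-p", str(cores)]
    if mem is not None:
        cmd += ["-m", str(mem)]
    if gpus is not None:
        cmd += ["-g", str(gpus)]
    if dry_run:
        cmd.append("-j")  # generate the jobscript but do not submit it
    return cmd

print(" ".join(helper_command(cores=8, mem=32)))
```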

We now go through the complete steps to using a Jupyter Notebook.

Complete Steps to Running a Jupyter Notebook

  1. Submit a jupyter notebook job using the helper script run on the CSF login node:
    module load apps/binapps/jupyter-notebook/5.5.0
    
    # Run one of the following examples on the login node:
    
    jupyter-notebook-csf             # Submit a serial (1-core) job
    jupyter-notebook-csf -p 4        # Submit a parallel 4-core job
    jupyter-notebook-csf -p 4 -m 32  # As above but running on a 512GB (32GB/core) node
    

    The output from the helper script can be one of two things:

    1. If the job runs straight away in the batch system you’ll see:
      Your job 83873 ("jnotebook") has been submitted
      Checking if job has already started...
      
      Starting jupyter-notebook on node780 port 8888
                                    #
                                    # Some nodes use the name: hnodeNNN
                                    # Check for an 'h' in your node name!
      
      You must now ssh in to the CSF from your local machine and tunnel
      to the backend compute node. Use the following command on *your*
      computer or the x2go virtual desktop  (not on the CSF):
      
        ssh -L 8888:node780:8888 mabcxyz1@csf3.itservices.manchester.ac.uk
      
      You should then start a web-browser on *your* computer and browse to:
      
        http://localhost:8888
      
      If you are asked for a token to login to the notebook, have a look in
      this job's .e file for the token by running the following command:
      
        jupyter-notebook-csf -t 83873
      
      (you should only run the above command after connecting your web-browser to the notebook)
      
      If you press the Quit button in your notebook's web-page it will terminate the batch job.
      If you press the Logout button in your notebook's web-page or simply close the web-browser
      then you *must* run the following command on the CSF3 to terminate the Jupyter batch job:
      
        qdel 83873
      
      Then log out of your tunnelled ssh session.
      

      This means that the jupyter-notebook job started immediately on a compute node so you are able to proceed with setting up the SSH tunnel from your local computer (or the x2go virtual desktop if using that) to the CSF and then connecting your web-browser to it by following the instructions in the above message (see below for an example).

    2. If the job doesn’t run immediately but waits in the batch queue:
      The job is probably still waiting to run. Here is the qstat output....
      
      job-ID  prior   name       user     state submit/start at     queue    slots ja-task-ID 
      ---------------------------------------------------------------------------------------
        83873 1.23456 jnotebook  mabcxyz1 qw    03/10/2017 15:50:37              1           
      
      You *cannot* connect a web-browser to the CSF until the job runs.
      Please be patient.
      
      To check your job's status, run:
      
         qstat
      
      When the job eventually runs you should then run the following command
      to get further instructions on how to connect your web-browser to the
      Jupyter notebook:
      
         cat jnotebook.o83873
      
      Follow the instructions given in that file.
      

      Note that the job-ID 83873 will be different for your job. The above message means that you must check on your job using the qstat command to see when it is running. When it is running you can proceed with the instructions given in the file: jnotebook.oNNNNN where NNNNN is the job-ID of your own job.

  2. Once the notebook job is running you can now set up the SSH tunnel on your computer (not the CSF). You’ll need to open another terminal window so that you have one that is NOT logged in to the CSF. For example, if you are on Windows use MobaXterm and press the new tab icon to open a Local Terminal (see image below), if on Mac use the Terminal application and if on Linux use an Xterm or GNOME Terminal.
    [Image: MobaXterm new tab]

    Then run:

    # Run this on your computer (PC/laptop), NOT in a terminal logged in to the CSF
    # The terminal window you run this in (e.g., MobaXterm) must NOT be logged in to the CSF.
    ssh -L 8888:node780:8888 username@csf3.itservices.manchester.ac.uk
             ^     ^      ^      ^
             |     |      |      |
             |     |      |      +----- use your own central IT username here
             +-----+------+
                   |
                   |
                   +---------- These values (port numbers and compute node name) are reported in
                               the above output when you submitted the notebook job. You should
                               use the values appropriate to your job (the port will likely be
                               8888 but could be different. The node name will very likely be
                               a different node number).
                               Note also, some high-memory nodes are named hnodeNNN.
    
    
  3. Now start a web-browser on your PC and browse to:
    http://localhost:8888
                       #
                       # Use the same port number as used above
    
  4. You should be taken to the Jupyter Notebook login page. If you created a password earlier then enter it in the webpage to access the jupyter notebook webpage. If you did not set up a password the notebook’s webpage will indicate that token authentication is in use. You will need to get the notebook’s token (see below).

You can now use the Jupyter Notebook through your own web-browser. Any code you enter is executed on the CSF compute node on which the Jupyter Notebook server is running. Hence you should ensure your code uses the correct number of cores you reserved when you submitted the notebook job.
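
One way to match your code to the reserved cores is to read the core count from the job environment inside the notebook. This sketch assumes the batch system's NSLOTS variable is visible to the notebook kernel, and that your code uses OpenMP-threaded libraries that respect OMP_NUM_THREADS; neither has been verified for the CSF setup described here. The function name reserved_cores is hypothetical.

```python
import os

def reserved_cores(environ=os.environ):
    """Return the number of cores the batch system reserved for this job.

    Assumes the NSLOTS variable is set in the job environment;
    falls back to 1 (a serial job) when it is not.
    """
    return int(environ.get("NSLOTS", "1"))

ncores = reserved_cores()
# Cap OpenMP-threaded libraries to the reserved core count.
# Run this cell before importing numpy or similar libraries.
os.environ["OMP_NUM_THREADS"] = str(ncores)
print(f"Using {ncores} core(s)")
```

The same ncores value can be passed to anything that takes a worker count, such as a multiprocessing pool.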

Obtaining the Notebook Token

When you visit your notebook in a web-browser you may be asked for a token if you didn’t set up a password. The token is similar to a random auto-generated password.

To find out what the token is for your notebook run the following command on the login node when your notebook is running and after you have pointed a web-browser at the notebook page (you’ll see the login page):

# Run this on the CSF login node, in the folder where you submitted the jupyter job from
jupyter-notebook-csf -t JOBID

where JOBID is the job id of your notebook job (run qstat if you are unsure). You should see lines similar to the following:

Authentication token for your notebook is 3827b48cc078578c5b50232f7835e6cb809c128a25a74294
Type it in to the webpage at: http://localhost:8888/  (if asked for a token)
OR browse to http://localhost:8888/?token=3827b48cc078578c5b50232f7835e6cb809c128a25a74294

You can either browse to the notebook by appending the ?token=... text to the localhost URL we used above:

http://localhost:8888/?token=3827b48cc078578c5b50232f7835e6cb809c128a25a74294
                   #
                   # Use the same port number as used above

or copy the token part of the address:

3827b48cc078578c5b50232f7835e6cb809c128a25a74294

in to the type-in box in the notebook’s login web-page.
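
If you prefer to pull the token out of saved text yourself, the sketch below shows one way to do it. It assumes the text contains a URL with a ?token=... part, as in the helper script output shown above. The names find_token and notebook_url are hypothetical, and the token is the example value from this page, not a real one.

```python
import re

# Matches the hexadecimal token in a URL such as http://localhost:8888/?token=abc123
TOKEN_RE = re.compile(r"[?&]token=([0-9a-f]+)")

def find_token(log_text):
    """Return the first token found in the given text, or None if absent."""
    match = TOKEN_RE.search(log_text)
    return match.group(1) if match else None

def notebook_url(port, token):
    """Build the localhost URL that logs in with the given token."""
    return f"http://localhost:{port}/?token={token}"

sample = ("OR browse to http://localhost:8888/"
          "?token=3827b48cc078578c5b50232f7835e6cb809c128a25a74294")
token = find_token(sample)
print(notebook_url(8888, token))
```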

Example Notebook

Once you have connected to your notebook, try the following in your web-browser:

  1. Select New -- Python3 to start a new python notebook.
  2. Enter the following code in the first cell (see image below):
    import os
    file=os.getenv("JUPYTER_HOME")+"/example_job/mandelbrot.py"
    %load $file
    

    [Image: Load a python file in to a notebook]

  3. Press the Run button twice – firstly to load the mandelbrot.py in to the notebook. Then secondly to execute the notebook code.
  4. You should see a Mandelbrot fractal image in your web-browser. The code was executed in the batch job running on the compute node!
    [Image: Mandelbrot notebook result]

Stopping a Notebook

When you have finished with the jupyter notebook you can either:

  • Press the Quit button in the notebook’s web-page to stop the notebook and terminate the jupyter batch job on the CSF
  • Press the ‘Logout’ button in the notebook’s web-page, or simply close your web-browser and, if you no longer want to connect back to the notebook, remove the job that is running on the CSF compute node using the usual qdel command:
    # On the CSF login node:
    qdel 83873
          #
          # Change the job id to be that of your notebook job.
          # If unsure, run 'qstat' to see a list of your jobs.
    
  • Don’t forget to log out of the window running the ssh tunnel created earlier.

Please do remember to stop your CSF server job via the Quit button or Logout button followed by the qdel command. Logging out of the notebook in your web-browser does NOT stop the CSF job. You MUST do this using qdel to free up the compute node for other users.

Using your Conda Environments with Jupyter Notebooks

If you have a personal conda environment (e.g., myenv) which you would like to use in jupyter, please do the following on the login node. It will install a package inside your conda environment to allow it to work with jupyter notebooks.

# Add a package to your conda env to allow it to be used with jupyter notebooks:

# You must do this via an interactive session, to have access to the outside world.
# This is needed to download a package. So, on the CSF login node:
qrsh -l short
  #
  # Wait until you've been logged in to a compute node, then:

# If you used Anaconda python to create the conda env then you should
# load the modulefile for that version of Anaconda now. For example:
module load apps/binapps/anaconda3/2022.10        # Load the version you used to create the conda env
source activate myenv                    # Activate your conda env

# Install a package in your conda env to allow jupyter to use the conda env
pip install --isolated ipykernel
pip install --isolated ipywidgets
python -m ipykernel install --user --name=myenv --display-name='Environment (myenv)'
source deactivate myenv

# Return to the login node:
exit

You can now submit a jupyter notebook job using the “any” jupyter modulefile. It will NOT load an anaconda modulefile itself, so you must load your required version of anaconda first:

# Submit a notebook job as normal from the login node:

module load apps/binapps/anaconda3/2022.10        # Load the version you used to create the conda env
module load apps/binapps/jupyter-notebook/any     # Uses the anaconda version you loaded

# Now use the helper script to submit a job as described earlier. You DO NOT need to
# activate your conda env, either on the login node or in the job. The above ipykernel
# installation allows jupyter to find your conda envs.
jupyter-notebook-csf ...

Now when you connect to your notebook via a web-browser, you will be able to access your conda environment by selecting

# In your web-browser, in the jupyter main screen, start a new notebook with access to your conda env:
New --> Environment (myenv)

in the drop-down menu (at the top right-hand-side of the browser) to start a new python notebook “inside” your conda environment.
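
If the environment does not appear in the menu, it can help to check which kernels have been registered for your user. The sketch below assumes the ipykernel install --user command wrote its kernel.json under ~/.local/share/jupyter/kernels, which is the usual per-user location on Linux but may differ if JUPYTER_DATA_DIR or XDG_DATA_HOME is set. The function name list_user_kernels is hypothetical.

```python
import json
from pathlib import Path

def list_user_kernels(kernels_dir=Path.home() / ".local/share/jupyter/kernels"):
    """Map kernel name -> display name for each kernel.json under kernels_dir."""
    kernels_dir = Path(kernels_dir)
    if not kernels_dir.is_dir():
        return {}
    kernels = {}
    for spec in sorted(kernels_dir.glob("*/kernel.json")):
        with spec.open() as fh:
            kernels[spec.parent.name] = json.load(fh).get("display_name", "")
    return kernels

# Prints one line per registered kernel, e.g. the myenv entry created above.
for name, display in list_user_kernels().items():
    print(f"{name}: {display}")
```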

Further info

Updates

None.

Last modified on October 23, 2024 at 3:28 pm by George Leaver