Modules

Overview

The modules system is how most software is made available. Through a simple interface, it updates the various paths (and other environment settings) needed to access software.

Modules also provide a convenient way to switch between different versions of the same software (for example, compilers, or frequently updated applications such as MATLAB).

Note: We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. Alternatively, you may load modulefiles on the login node and let the job inherit these settings. See below for more information about these two methods.

Module Commands

Your environment is controlled using the module command with various arguments. The most commonly used command is:

module load name/of/module/1.2.3

These commands can be used directly on the login node or in your jobscripts.

You may wish to search for modules on the login node while writing your jobscript, and then load the modulefile inside the jobscript so that your jobs always use the same version of the application.

The full list of module commands is given below.

Command – Description
module search pattern – Lists all modules that have pattern in their name. For example: module search bio
    The module search command only works on the login node.
module avail – Lists all available modules for use on the CSF (this shows what software has been installed).
module avail category – Lists all available modules whose names begin with category (change this to compilers, apps, libs, tools, mpi). Can be useful to narrow down the list of things you want to look for. See below.
module list – Lists modules that you have currently loaded in your environment, if any.
module load modulename – Loads module modulename
module unload modulename – Unloads module modulename and clears any settings made by the module from your environment
module switch oldmodulename newmodulename – Switches between two modules, unloading one and loading the other
module purge – Unload all modules from your environment. Very useful for starting with a clean environment (alternatively just log out and back in again).
module initadd modulename – Run this command once to automatically ensure that a module is loaded whenever you log in to the CSF. It creates a .modules file in your home directory which acts as your personal configuration. We do not recommend doing this – you may forget which modulefiles you load upon login and other module loads may clash.
module help modulename – May show longer description of the module if present in the modulefile
module help or module -H – Show help about the module command
module show modulename – Shows all of the settings that would be made by the module if you were to load it. Very useful to see what the module does.
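
As an illustration of how the commands above fit together, the following sketch previews a modulefile, loads it and then confirms what is loaded. The helper name try_module is hypothetical (it is not a CSF command), and the modulefile name is simply the example used elsewhere on this page. The guard means nothing happens on a machine where the module command is not available.

```shell
#!/bin/bash
# Sketch only: try_module is a hypothetical helper, not part of the CSF.

# Preview a modulefile, load it, then confirm what is loaded.
try_module() {
    local name="$1"
    module show "$name"    # preview the settings without applying them
    module load "$name"    # apply the settings to the current shell
    module list            # confirm which modulefiles are now loaded
}

# Only run where the module command exists (e.g., on the login node).
if type module >/dev/null 2>&1; then
    try_module compilers/intel/17.0.7
fi
```

Running module show before module load is optional, but it is a quick way to see which paths and variables a modulefile will change.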

Categories

The software available on the CSF has been placed in categories:

  • apps – for main applications. This category has subsections named after the compiler used to build the application (for example, intel-17.0, where the Research Infrastructure team has compiled the application), or binapps where no compilation was required and the application could simply be unpacked and installed.
  • compilers – GCC and Intel compilers
  • libs – for example, NAG, FFTW, Intel’s Math Kernel Library (MKL)
  • mpi – for running large parallel applications
  • tools – for small useful utilities

The module avail command will list everything available. You can view a single category at a time with:

module avail category
                #
                # replacing the optional category with one of the above

To use a module/piece of software:

module load name/of/module/1.2.3
              #
              # replacing name/of/module/1.2.3 with the module you want

For example:

module load compilers/intel/17.0.7

It is possible, and often necessary, to have multiple modules loaded at once. For example, the following two forms are equivalent:

module load compilers/intel/17.0.7
module load tools/gcc/cmake/3.13.2

or

module load compilers/intel/17.0.7 tools/gcc/cmake/3.13.2

Some modulefiles have restricted access (you need to be in a certain Unix group to load the modulefile). If you cannot see a modulefile that you expect to see in the list of available modules, please contact its-ri-team@manchester.ac.uk giving the name of the modulefile.

Bash Completion

The module paths on the CSF are lengthy, but you can use bash completion to help you. Press TAB part-way through typing a module name and the name will be completed for you or a list of possible completions will be given. Type a few more characters and hit TAB again to narrow the list down until you get the full module name.

Module environment settings in batch

Modulefiles can be loaded:

  • on the login node before you submit a batch job
  • or from the jobscript when the job actually runs

There are pros and cons to each method but which method you choose is up to you – both are valid.

Loading a modulefile on the login node

The advantage of loading a modulefile on the login node before you submit the job is that you can check that the name of the modulefile is correct (a misspelt modulefile will not be loaded). You can also check the settings the modulefile makes (e.g., an environment variable) and then use them in your jobscript. The disadvantage is that the modulefile name used is not recorded in your jobscript. If you wish to run the job again (perhaps in a few months’ time) you may have forgotten which version of the modulefile you used.

Loading a modulefile in the jobscript

The key advantage of loading a modulefile in the jobscript is that the names of any modulefiles used by your job are recorded, which helps with reproducibility: if you want to run the job again in six months’ time, the modulefiles written in the jobscript show exactly which apps were used by the job. However, if you spell a modulefile name incorrectly, the mistake will only be detected when the job runs. Your job will probably fail and you’ll have to resubmit it and wait in the queue again.
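
One way to reduce the risk of a misspelt name reaching the queue is to pull the modulefile names out of the jobscript on the login node and look each one up before submitting. The sketch below is only an illustration: list_jobscript_modules is a hypothetical helper and myjob.sh is a hypothetical jobscript name. It assumes each module load command is written on a single line.

```shell
# Print each modulefile name found on 'module load' lines of a jobscript.
# list_jobscript_modules is a hypothetical helper, not a CSF command.
list_jobscript_modules() {
    awk '$1 == "module" && $2 == "load" {
             for (i = 3; i <= NF; i++) {
                 if ($i ~ /^#/) break   # ignore trailing comments
                 print $i
             }
         }' "$1"
}

# On the login node, look each name up before running qsub
# (myjob.sh is a hypothetical jobscript name).
if type module >/dev/null 2>&1 && [ -f myjob.sh ]; then
    for name in $(list_jobscript_modules myjob.sh); do
        module avail "$name"
    done
fi
```

If module avail lists nothing for one of the names, that name is probably misspelt or you do not have access to the modulefile.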

Example of each method

You should write your jobscripts for the modulefile method you prefer:

Jobscript that loads a modulefile:

#!/bin/bash --login
#$ -cwd

module load name/of/app/1.2.3

app.exe input.dat ....

Jobscript that inherits a modulefile’s settings from the login node:

#!/bin/bash
#$ -cwd
#$ -V       # Inherit settings from login node so do:
            # module load name/of/app/1.2.3 there

app.exe input.dat ....

The --login flag on the first line is needed to make module commands work inside jobscripts. The #$ -V flag uses an UPPERcase V.
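
To go a step further with the first method, a jobscript can also write the list of loaded modulefiles into the job output, so the record is kept alongside the results. The sketch below creates such a jobscript from the login node. The file name myjob.sh is hypothetical, app.exe and name/of/app/1.2.3 are the placeholders used above, and the 2>&1 redirection assumes that module list writes to standard error, which may not be true of every version.

```shell
# Write a jobscript (hypothetical file name: myjob.sh) from the login node.
cat > myjob.sh <<'EOF'
#!/bin/bash --login
#$ -cwd

# Start from a clean environment, then load exactly what the job needs
module purge
module load name/of/app/1.2.3

# Record the loaded modulefiles in the job output
# (assumes 'module list' writes to standard error)
module list 2>&1

app.exe input.dat
EOF

# Submit it with: qsub myjob.sh
```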

Note: If loading the modulefile on the login node and inheriting the settings in the jobscript, the batch system takes a copy of your environment at the time at which you run the qsub command, not when your job eventually runs. This means that after submitting your job with qsub you are free to change your environment or even log out and your job will still see the correct environment when it runs. For example:

# From the CSF login node, submit a job to run appA 
module load name/of/appA/1.2.3
qsub job_to_run_appA
  #
  # This takes a copy right now of all of the environment
  # settings made by appA's modulefile

# Unload all modules, clear the settings
module purge
  #
  # This does NOT affect the previous job. It has already
  # copied all of the required settings. When the job eventually
  # runs it will have all of the settings from the appA modulefile.

# Set up for a different app
module load name/of/appB/4.5.6
qsub job_to_run_appB
  #
  # This takes a copy right now of all of the environment
  # settings made by appB's modulefile. 

# Even if we log out before the jobs have run, they have a copy
# of all of the settings needed for each job.
logout

Automatically loading modulefiles on log in

Note: We recommend that you only load modulefiles as and when you require them, not automatically on login. This way you avoid unexpected software clashes, interactions and environment problems that can cause job failure.

If you do wish to always have certain modulefiles loaded when you log into the CSF then best practice is to specify them in your .modules environment file, not the .bashrc, or .bash_profile. Add the module load commands to the end of the .modules file and leave any other content in that file untouched. However, bear in mind that the warning above still applies.

Private modules – software you have installed

If you have installed a piece of software in your home directory then you can create your own modulefile for it which saves you having to set variables on the command line, in your .bashrc, .bash_profile or in batch scripts.

  • In your home dir create a directory called ‘privatemodules’ (the name is key)
    mkdir ~/privatemodules
    
  • Within that dir create the required folders and modulefiles (see template below) so you get something like:
    cd ~/privatemodules
    pwd
    /mnt/iusers01/xy01/mxyzabc2/privatemodules
    
    # Create a directory named after my own application
    mkdir myapp
    
    # Go in to the dir and create a text file named using the application version number (e.g., 2.0.3):
    cd myapp
    gedit 2.0.3
           #
           # Write your modulefile in this plain text file which is named '2.0.3'
           # It is an unusual name for a text file but Linux allows any name to be used.
           # It will make the 'module load' command used later look like the others on the CSF.
           # See below for a template modulefile
    
    # You should now have a file at the following location:
    ~/privatemodules/myapp/2.0.3   
    
  • To load your new modulefile, first load the ‘use.own’ modulefile (you only need to do this once).
    module load use.own
    module load myapp/2.0.3
    

A template for a modulefile

#%Module1.0

## (The above line is required and must be the very first line)
##
## APP Modulefile
##    -- Replace APP with the name of your app
##    -- In this example we use a made-up app called 'paracode'
##    -- It requires a few additions to our environment.

## -- The 'module load' line below should be where you specify other modulefiles that may
## -- be required. Remove the 'module load' line if you do not need it.
## -- For example, compilers and the MPI library you used to build the software.
module load compilers/gcc/x.x.x mpi/gcc/openmpi/1.x.x

## -- Get my home directory path in case we need it
set     USER_HOME   $env(HOME)

## -- Now, specify the path to the top level of your software install area.
## -- Using 'set' just sets a local variable for use inside the modulefile.
set     instdir     $USER_HOME/mysoftware/build/paracode

## -- Now add required environment variables as detailed in the software manual.
## -- Using 'setenv' makes the variable 'visible' in your environment after loading the module.
setenv  PARACODE_DIR     $instdir
setenv  PARACODE_DATA    $instdir/sample_data_dir

## -- Can also add to environment variables that may already exist. We don't
## -- want to overwrite these variables but instead add to them.
prepend-path    PATH               $instdir/bin
prepend-path    LD_LIBRARY_PATH    $instdir/lib
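
As a rough guide to what the template does when it is loaded, the following shell sketch makes approximately the same changes by hand. It is an illustration only, not what the module command literally executes: the module command can also undo these settings on module unload, which this sketch cannot. The paths are those of the made-up paracode example.

```shell
# Rough shell equivalent of the settings made by the template modulefile.
instdir="$HOME/mysoftware/build/paracode"

# setenv NAME value  ->  export NAME=value
export PARACODE_DIR="$instdir"
export PARACODE_DATA="$instdir/sample_data_dir"

# prepend-path NAME dir  ->  put dir in front of any existing value
export PATH="$instdir/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$instdir/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Prepending rather than overwriting keeps the existing entries, so commands that were already on your PATH remain available.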

Loading modules from python scripts

It is possible to load modulefiles directly from within Python scripts, or other scripts such as Perl or Ruby. This is not something most people will need to do, but if you are developing a pipeline in Python, say, then you might prefer to do everything in Python. The usual method is to load the modules in the jobscript before running the python command; your Python script will then have all environment settings made by the modulefiles available to it.

To load modules directly in python:

# First take a local copy of a system file and modify it (correcting a misnamed directory path.)
# We also rename it to give it a more meaningful name:
cp /opt/clusterware/opt/modules/init/python.py modulefiles.py
sed -i 's/Modules/modules/' modulefiles.py

Now write your python code – e.g., myapp.py, containing:

import sys
import os
# Load the modulefiles.py to make the 'module' command available
from modulefiles import module
# Can now run module commands 
module('load', 'tools/env/proxy')
module('load', 'tools/gcc/git/2.24.0')
module('list')
os.system('git clone https://github.com/someproject')

You can now run your Python app using the python command (the default system-wide python is v2.7):

python myapp.py

If you want to use the modulefiles.py file with python3, you’ll need to edit the file to modify some python2 syntax. The following commands will modify the file for you:

sed -i -e "s/if not os.environ.has_key('\([A-Z]*\)'):/if '\1' not in os.environ:/" modulefiles.py
sed -i -e "s/exec output/exec(output)/" modulefiles.py

You can then run your python app using:

python3 myapp.py

Further Information

Last modified on May 26, 2023 at 3:54 pm by George Leaver