Theano

Overview

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.

Versions 0.8.2, 0.9.0 and 1.0.1 are installed on zrek. CUDA, cuDNN and cnmem are supported on both the K20 and K40 GPUs.

Restrictions on use

There are no access restrictions on zrek.

You are encouraged to cite the library in any research. Please see the citation page for details.

Supported Backend Nodes

This application is available on the Nvidia GPU nodes: besso and kaiju{[1-5],101}. Please see the K40 node instructions and the K20 node instructions for how to access the nodes.

Set up procedure

To access the software you must first load one of the following modulefiles:

module load apps/gcc/python-packages/anaconda-2.5.0/theano/1.0.1
module load apps/gcc/python-packages/anaconda-2.5.0/theano/0.9.0
module load apps/binapps/theano/0.9.0
module load apps/binapps/theano/0.8.2

This will automatically load the Anaconda v2.5.0 Python modulefile (which provides Python 2.7.11) and also the latest CUDA modulefile. To see which modulefiles have been loaded use

module list
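
To confirm that the expected Anaconda Python is first on your path you can also run:

python --version
   # Should report Python 2.7.11 (provided by the Anaconda 2.5.0 modulefile)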

If you require cuDNN support please also load one of the additional cuDNN modulefiles, for example:

module load libs/cuDNN/5.1.5
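
The THEANO_FLAGS examples below use the CUDNN_INCLUDE_PATH and CUDNN_LIB_PATH environment variables, which the cuDNN modulefile is expected to set. After loading the modulefile you can check that they point at the cuDNN installation with:

echo $CUDNN_INCLUDE_PATH
echo $CUDNN_LIB_PATH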

If you require the PyCUDA libraries please also load one of the following additional modulefiles:

module load libs/gcc/pycuda/2017.1.1
module load libs/gcc/pycuda/2016.1.1

To see what versions are available run:

module avail libs/cuDNN
module avail libs/gcc/pycuda

Running the application

Please do not run Theano on the login node. Jobs should be run interactively on the backend nodes (via qrsh) or submitted to the compute nodes via batch.

The following instructions describe interactive use on a backend node and batch jobs from the login node.

Interactive use on a Backend Node

To see the commands used to log in to a particular backend node, run the following on the zrek login node:

backends

Once logged in to a backend K20 node or K40 node (using qrsh) and having loaded the modulefile there (see above), run:

python
   #
   # See below for an example theano script
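
A quick way to check, without starting an interactive session, that Theano can be imported and to confirm which version has been picked up is, for example:

python -c "import theano; print(theano.__version__)"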

By default Theano will use the CPU (not GPU) for computation. You can configure Theano to always use a GPU. A complete discussion of Theano config is beyond the scope of this documentation but the following examples show how to use the CPU, GPU and GPU with cuDNN and cnmem enabled. Note that we set configuration options on the command-line using the THEANO_FLAGS environment variable. Alternatively you can create a .theanorc (plain text) config file in your home directory – please see the Theano documentation.
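
For reference, the GPU with cuDNN and cnmem settings shown below could instead be placed in a ~/.theanorc file along the lines of the following sketch. The cuDNN paths here are placeholders only; replace them with the actual paths provided by the cuDNN modulefile (e.g. the values of $CUDNN_INCLUDE_PATH and $CUDNN_LIB_PATH):

[global]
mode = FAST_RUN
device = gpu
floatX = float32

[dnn]
include_path = /path/to/cudnn/include
library_path = /path/to/cudnn/lib64

[lib]
cnmem = 1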

The following examples assume the use of a simple Theano python test program. See below for the actual code:

# Run on the CPU (not GPU)
THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python theano-test.py

# Run on the assigned GPU
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python theano-test.py

# Run on the assigned GPU, enable cuDNN and cnmem (requires cuDNN modulefile to be loaded).
# It can be all on one line or split with \ chars.
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32,\
dnn.include_path=${CUDNN_INCLUDE_PATH},dnn.library_path=${CUDNN_LIB_PATH},lib.cnmem=1 \
python theano-test.py

In the above tests Theano will report which device it is using and whether cuDNN is in use.

Serial batch job submission

Do not log in to a backend node. The job must be submitted from the zrek login node. Ensure you have loaded the correct modulefile on the login node and then create a jobscript similar to the following:

#!/bin/bash
#$ -S /bin/bash
#$ -cwd                   # Run job from directory where submitted
#$ -V                     # Inherit environment (modulefile) settings
#$ -l k20                 # Select a single GPU (Nvidia K20) node
                          # Or use: #$ -l k40
# Run Theano three times: on CPU, GPU and GPU+cuDNN+cnmem
THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python theano-test.py
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python theano-test.py
# This next one can be all on one line or split with \ chars
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32,\
dnn.include_path=${CUDNN_INCLUDE_PATH},dnn.library_path=${CUDNN_LIB_PATH},lib.cnmem=1 \
python theano-test.py

Submit your jobscript from the zrek login node using

qsub jobscript

where jobscript is the name of your jobscript.

Theano Python Example

The following Python code is taken from the Theano "Using the GPU" tutorial (http://deeplearning.net/software/theano/tutorial/using_gpu.html#cuda).

It has been modified to use xrange or range depending on whether Python 2 or 3 is being used. In the previous interactive and jobscript examples this code was saved as theano-test.py.

'''
 From: http://deeplearning.net/software/theano/tutorial/using_gpu.html#cuda
 GWL: Added the python2/python3 xrange/range hack.
'''

from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time
vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

# GWL: Hack to use range() in python3, xrange() in python 2
try:
    xrange
except NameError:
    xrange = range

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
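
As the final check in the script indicates, a CPU run leaves Elemwise operations in the compiled graph and prints 'Used the cpu'; a successful GPU run prints 'Used the gpu', as the exponential is then computed by a GPU elementwise operation instead.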

Further info

Updates

None.
