The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.
Keras
Overview
Keras is a high-level neural networks API, written in Python, that runs on top of the TensorFlow installation on the CSF.
Versions 2.2.2 and 2.0.8 are installed on the CSF.
Restrictions on use
There are no restrictions on accessing this software on the CSF. The software is released under the MIT License.
Set up procedure
Please note that Tensorflow 1.10.1 and up must be run on Sandybridge or better Intel CPUs. The CSF GPU nodes are Sandybridge so no further action is needed. The CPU-only nodes can contain older Westmere CPUs which will not run Tensorflow. Hence CPU-only jobs must avoid the Westmere nodes. See Intel CPUs and the CSF Tensorflow page for more details.
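The usual reason for this restriction is that TensorFlow binaries from this era are built with AVX instructions, which first appeared in Sandy Bridge CPUs and are absent on Westmere. If you are unsure which node type you have landed on, the CPU flags can be checked directly. The sketch below is our own illustration, not part of the CSF tooling (the helper name `cpu_has_avx` is hypothetical); it parses the Linux `/proc/cpuinfo` flags line:

```python
def cpu_has_avx(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the CPU flags listed in cpuinfo_path include AVX.

    Hypothetical helper, not part of the CSF software stack. The Linux
    /proc/cpuinfo "flags" line lists one token per CPU feature, so a
    token-wise membership test avoids false matches on e.g. "avx2".
    """
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    return "avx" in line.split()
    except OSError:
        pass
    return False
```

If this returns False for the node you are on, a TensorFlow 1.10.1+ job will fail with an illegal-instruction error, so resubmit to a newer node type.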
To access the software you must first load one of the following modulefiles:
# Backend is: Tensorflow 1.11.0 (Python 3.6)
module load apps/gcc/python-packages/anaconda3-5.2.0/keras/2.2.2-tensorflow-gpu   # GPU version of tensorflow 1.11.0
module load apps/gcc/python-packages/anaconda3-5.2.0/keras/2.2.2-tensorflow-cpu   # CPU version of tensorflow 1.11.0

# Backend is: Tensorflow but no tensorflow modulefile loaded. You must
# load a tensorflow modulefile first. This can be one of:
#   apps/gcc/tensorflow/1.10.1-py36-gpu or apps/gcc/tensorflow/1.11.0-py36-gpu   # GPU versions (tf 1.10.1 or 1.11.0)
#   apps/gcc/tensorflow/1.10.1-py36-cpu or apps/gcc/tensorflow/1.11.0-py36-cpu   # CPU versions (tf 1.10.1 or 1.11.0)
module load apps/gcc/python-packages/anaconda3-5.2.0/keras/2.2.2

# Backend is: Tensorflow 1.2.1 (Python 3.5, CPU-only)
module load apps/gcc/python-packages/anaconda3-4.2.0/keras/2.0.8-tensorflow-cpu
The keras modulefile will automatically load the tensorflow and anaconda python modulefiles for you unless otherwise indicated.
Running the application
Please do not run Keras (via python) on the login node. Jobs should be submitted to the compute nodes via batch.
Interactive use on a Backend Node
To request an interactive session on a backend compute node run:
qrsh -l inter -l short

# Wait until you are logged in to a backend compute node, then:
module load apps/gcc/python-packages/anaconda3-4.2.0/keras/2.0.8-tensorflow-cpu
python
An example Keras session is given below.
If there are no free interactive resources the qrsh command will ask you to try again later. Please do not run Keras (python) on the login node. Any jobs running there will be killed without warning.
Example Script
The following skeleton script can be used in an interactive session or in a batch job (from a jobscript). It ensures TensorFlow does not use more cores than you have requested in your jobscript:
# Some of the following code has been taken from the Keras examples at:
# https://keras.io/getting-started/sequential-model-guide/
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense, Dropout
import numpy as np
import tensorflow as tf
import os

# Get number of cores reserved by the batch system
# (NSLOTS is automatically set, or use 1 otherwise)
NUMCORES = int(os.getenv("NSLOTS", 1))
print("Using", NUMCORES, "core(s)")

# Create TF session using correct number of cores
sess = tf.Session(config=tf.ConfigProto(inter_op_parallelism_threads=NUMCORES,
                                        intra_op_parallelism_threads=NUMCORES,
                                        allow_soft_placement=True,
                                        device_count={'CPU': NUMCORES}))

# Set the Keras TF session
K.set_session(sess)

# Replace the rest of the script with your own code

# Generate dummy data
x_train = np.random.random((1000, 20))
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.random((100, 20))
y_test = np.random.randint(2, size=(100, 1))

# MLP for binary classification example
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=20, batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
Serial batch job submission
Make sure you have the modulefile loaded then create a batch submission script, for example:
#!/bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings

# $NSLOTS is automatically set to the number of cores requested on the pe line
# and can be read by your python code.
export OMP_NUM_THREADS=$NSLOTS
python my-script.py
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
Parallel batch job submission
Ensure you have loaded the correct modulefile and then create a jobscript similar to the following:
#!/bin/bash
#$ -cwd             # Run job from directory where submitted
#$ -V               # Inherit environment (modulefile) settings
#$ -pe smp.pe 16    # Number of cores on a single compute node. Can be 2-24.

# $NSLOTS is automatically set to the number of cores requested on the pe line
# and can be read by your python code.
export OMP_NUM_THREADS=$NSLOTS
python my-script.py
The above my-script.py example will get the number of cores to use from the $NSLOTS environment variable.
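The NSLOTS lookup can be exercised on its own as a sanity check before a full run. This is a minimal sketch (the `reserved_cores` helper name is ours, not part of the batch system) showing the fallback to one core when the variable is unset:

```python
import os

def reserved_cores(default=1):
    # NSLOTS is exported by the batch system for each job; outside a
    # batch job it is normally unset, so fall back to `default`.
    return int(os.getenv("NSLOTS", default))

os.environ.pop("NSLOTS", None)   # simulate running outside a batch job
print(reserved_cores())          # prints 1

os.environ["NSLOTS"] = "16"      # simulate a 16-core smp.pe job
print(reserved_cores())          # prints 16
```

Because the fallback is 1, the same script runs unchanged in a qrsh session, a serial batch job, or a parallel smp.pe job.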
Submit your jobscript using:

qsub jobscript

where jobscript is the name of your jobscript.
Further info
Updates
None.