The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.
Anaconda Scientific Python Distribution
Overview
Anaconda is a completely free, enterprise-ready Python distribution for large-scale data processing, predictive analytics, and scientific computing. It includes more than 100 of the most popular Python packages for science, mathematics, engineering, and data analysis.
Versions 2.5.0, 2.3.0, 2.1.0, 1.9.1 and 1.8.0 of Anaconda are installed on the CSF. These provide Python 2.7.x.
Versions 5.1.0, 4.2.0, 4.1.1 and 2.3.0 of Anaconda3 are also installed. These provide Python 3.6.x, 3.5.x and 3.4.x.
To see what is available in these versions load the appropriate modulefile (see below) and then run the command:
conda list
Note: none of the Python 2.6 or Python 3.3 packages have been installed.
Restrictions on Use
None, but all users should read the End User License Agreement before using the software.
Set up procedure
Load one of the following modulefiles for the version you wish to use:
For Python 2.7.x
module load apps/binapps/anaconda/2.5.0
module load apps/binapps/anaconda/2.3.0
module load apps/binapps/anaconda/2.1.0
module load apps/binapps/anaconda/1.9.1
module load apps/binapps/anaconda/1.8.0
For Python 3.4
# Notice the extra /3/ in the modulefile name
module load apps/binapps/anaconda/3/2.3.0
For Python 3.5
# Notice the extra /3/ in the modulefile name
module load apps/binapps/anaconda/3/4.2.0
module load apps/binapps/anaconda/3/4.1.1
For Python 3.6
# Notice the extra /3/ in the modulefile name
module load apps/binapps/anaconda/3/5.1.0
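After loading one of the modulefiles above, it can be useful to confirm which interpreter is active and whether the packages you need can be imported. The following is a minimal sketch rather than an official CSF tool: the function name check_modules and the list of package names are illustrative assumptions, and the script only uses the standard library so it should run under either the Python 2.7 or the Python 3 modulefiles.

```python
import importlib
import sys


def check_modules(names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Report the interpreter provided by the currently loaded modulefile.
    print("Python %d.%d.%d" % sys.version_info[:3])
    # The package names below are examples only; substitute the ones you need.
    results = check_modules(["numpy", "scipy", "matplotlib"])
    for name in sorted(results):
        print("%-12s %s" % (name, "available" if results[name] else "missing"))
```

Run it on the login node after the module load command, for example with python check.py (the file name is arbitrary).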
Running the application
After loading the modulefile, you will need a batch job submission script. Below are some examples.
Simple, Python-only job
Here is a simple Python script (download fib.py)
# Python 2 syntax (print statement): run with one of the Python 2.7 modulefiles
parents, babies = (1, 1)
while babies < 100:
    print 'This generation has %d babies' % babies
    parents, babies = (babies, parents + babies)
This is an example SGE submission script (download fib.sge)
#!/bin/bash
#$ -S /bin/bash
#$ -N Python_fib
#$ -cwd
#$ -o outputfile.log
#$ -j y
#$ -V
python fib.py
Submit with the command
qsub fib.sge
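The fib.py script shown earlier uses the Python 2 print statement, so it would raise a SyntaxError under the Anaconda3 (Python 3) modulefiles. As a hedged sketch, the same loop can be written so that it runs under both Python 2.7 and Python 3. The generator name generations and its limit parameter are illustrative and are not part of the downloadable fib.py.

```python
from __future__ import print_function  # makes print() a function on Python 2.7


def generations(limit=100):
    """Yield the number of babies in each generation while it is below limit."""
    parents, babies = 1, 1
    while babies < limit:
        yield babies
        parents, babies = babies, parents + babies


if __name__ == "__main__":
    for babies in generations():
        print('This generation has %d babies' % babies)
```

The submission script does not need to change; only the modulefile loaded before qsub decides which interpreter runs the script.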
A numpy job
No extra SGE configuration is necessary in order to use modules such as numpy or scipy. For example, to run the following script
import numpy

def test_eigenvalue():
    i = 500
    data = numpy.random.rand(i, i)
    result = numpy.linalg.eig(data)
    return result

print(test_eigenvalue())
you can use the following submission script, assuming that you've called the above script eig.py
#!/bin/bash
#$ -S /bin/bash
#$ -N Python_fib
#$ -cwd
#$ -o outputfile.log
#$ -j y
#$ -V
export OMP_NUM_THREADS=$NSLOTS    ### For serial jobs auto sets to 1 core. DO NOT set a number here.
python eig.py
Parallel Jobs - using more than one core
If your job has some parallelism (i.e. it can use more than one core), or you have used a package/module that can use multiple cores, you must request a number of cores from the smp.pe batch parallel environment and include the OMP_NUM_THREADS variable in your job script to ensure that it uses the allocated resources correctly. Example:
#!/bin/bash
#$ -S /bin/bash
#$ -N Python_fib
#$ -cwd
#$ -o outputfile.log
#$ -j y
#$ -V
#$ -pe smp.pe 4    ### Request 4 cores from the batch system. Min 2, max 24.
export OMP_NUM_THREADS=$NSLOTS    ## For parallel jobs auto sets to the no. on the pe line. DO NOT set a number here.
python eig.py
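OMP_NUM_THREADS only affects libraries that use OpenMP-style threading. If your own Python code creates worker processes, the same principle applies: take the core count from the batch system instead of hard-coding it. The sketch below assumes the job runs under SGE so that the NSLOTS environment variable is set, and uses multiprocessing.Pool from the standard library (Python 3 syntax). The names core_count and square are illustrative only.

```python
import os
from multiprocessing import Pool


def core_count(environ=os.environ):
    """Return the number of cores granted by the batch system (1 if NSLOTS is unset)."""
    return int(environ.get("NSLOTS", "1"))


def square(x):
    """Stand-in for a real unit of work."""
    return x * x


if __name__ == "__main__":
    # Size the pool from NSLOTS so the job never uses more cores than it requested.
    with Pool(processes=core_count()) as pool:
        print(pool.map(square, range(8)))
```

Falling back to 1 when NSLOTS is missing keeps the script safe if it is run outside a batch job.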
PyLab
If you wish to import NumPy and Matplotlib (pyplot) in one easy step, use pylab. Note that pylab does not import SciPy, which must be imported separately:
import pylab
Adding packages
Before proceeding, load the following modulefile on the command line:
module load tools/env/proxy
OR alternatively, set the following manually:
export http_proxy=http://proxy.man.ac.uk:3128
export https_proxy=http://proxy.man.ac.uk:3128
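If a package download fails, one thing worth checking is whether the proxy settings are actually visible to Python. The following is a small sketch, assuming Python 3 (it uses urllib.request from the standard library); the function name missing_proxies is hypothetical and not part of any CSF tooling.

```python
import os
from urllib.request import getproxies_environment


def missing_proxies(required=("http", "https")):
    """Return the URL schemes that have no proxy set in the environment."""
    proxies = getproxies_environment()  # reads variables such as http_proxy
    return [scheme for scheme in required if scheme not in proxies]


if __name__ == "__main__":
    missing = missing_proxies()
    if missing:
        print("No proxy set for: %s" % ", ".join(missing))
    else:
        print("Proxy settings found for http and https")
```

An empty result means both variables were picked up; otherwise load the modulefile or export the variables again in the same shell.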
Anaconda packages
To see what is already installed:
conda list
To see if a package is available:
conda search package
where package is replaced with the name of the package you want. If the package is listed as available for install please contact its-ri-team@manchester.ac.uk and we will try to add it to the central installation.
pip/pypi installation
To search for packages available to the pip installer:
pip search keyword
To install a package within your home folder storage:
pip install --user package
where package is replaced with the name of the package you want. This will install the package to a hidden directory called .local in your home directory. It should be picked up automatically by Python; you can test this as follows:
python
>>> import package
>>> help(package)
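If the import fails, it can help to check where Python expects per-user packages and whether that directory is on the search path. The sketch below uses only the standard site and sys modules; the function name user_site_status is illustrative. The exact path printed depends on the Python version in use.

```python
import site
import sys


def user_site_status():
    """Return the per-user site-packages path and whether it is on sys.path."""
    path = site.getusersitepackages()
    return path, path in sys.path


if __name__ == "__main__":
    path, on_path = user_site_status()
    print("User site-packages directory: %s" % path)
    # The directory is only added to sys.path if it exists when Python starts.
    print("On sys.path: %s" % on_path)
```

Because the directory name includes the Python version, a package installed with pip install --user under one modulefile is not necessarily visible after loading a modulefile that provides a different Python version.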
Further information
- Anaconda Python documentation, FAQ, mailing list etc
- IT Services runs an Introduction to numerical computing with Python workshop approximately three times a year.
Hints and Tips
Got a handy tip? Please send it in to its-ri-team@manchester.ac.uk ...
Plotting graphs with Pyplot
If you want to plot a graph to a PNG file, say, in batch, try the following (see this stackoverflow question and answer):
# Create a file named graph.py:
import matplotlib as mpl
# Agg backend will render without an X server on a compute node in batch
mpl.use('Agg')
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(range(10))
fig.savefig('temp.png')
# End of python file.

# Tested on CSF using:
module load apps/binapps/anaconda/3/2.3.0
qsub -b y -cwd -V -l short python ./graph.py
# When job completes, view the graph
eog temp.png
Updates
2.3.0 (Anaconda3) - installed 8th September 2015 - included docopt, biopython
2.3.0 - installed 8th July 2015 - included wxpython, docopt and MDAnalysis
2.1.0 - installed 27th Nov 2014 - included wxpython, docopt and MDAnalysis
1.9.1 - MDAnalysis installed via pip - Nov 2014
1.9.1 - docopt installed via pip - June 2014
1.9.1 - installed 27th Feb 2014 and wxpython added to it.