The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date; please read the CSF3 documentation instead.
PyPcaZip
Overview
PyPcaZip is a package that uses principal component analysis (PCA) to compress trajectory data. It is a re-implementation, by the ExTASY project (www.extasy-project.org), of the PCAZIP toolkit developed and distributed by the Laughton group at the University of Nottingham, UK, and the Orozco group at the University of Barcelona, Spain.
Version 1.5.1 is installed on the CSF.
Restrictions on use
There are no restrictions on accessing this software on the CSF.
Set up procedure
To access the software you must first load the modulefile:
module load apps/binapps/pypcazip/1.5.1
This will automatically load the Anaconda Python (v2.3.0) modulefile, which provides Python 2.7.10.
Running the application
Please do not run PyPcaZip computations on the login node. Jobs should be submitted to the compute nodes via batch. The following executables are available:
pyPcaunzip pyPcazip pyPczclust pyPczcomp pyPczdump pyPczplot
You may run an executable on the login node with the --help flag added to see a summary of its command-line flags.
Only the pyPcazip utility may be run in parallel, using MPI parallelism - see below.
Serial batch job submission
Make sure you have the modulefile loaded then create a batch submission script, for example:
#!/bin/bash
#$ -S /bin/bash
#$ -cwd              # Job will run from the current directory
#$ -V                # Job will inherit current environment settings

## This is a serial jobscript (1 core) so turn off MPI parallelism with --nompi
pyPcazip --nompi -i INPUT -a ALBUM -t TOPOLOGY -o OUTPUT

## The other tools are serial-only, e.g.:
pyPcaunzip -t TOPOLOGY -c COMPRESSED -o OUTPUT
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
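If you prefer to create the jobscript from the command line rather than an editor, the serial example above can be written out with a here-document. This is just a sketch: the filename serial.sge is an assumption, and INPUT, ALBUM, TOPOLOGY and OUTPUT are the same placeholders as above, to be replaced with your own files.

```shell
# Sketch: write the serial jobscript shown above to a file named serial.sge.
# The filename and the INPUT/ALBUM/TOPOLOGY/OUTPUT names are placeholders.
cat > serial.sge <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -V
pyPcazip --nompi -i INPUT -a ALBUM -t TOPOLOGY -o OUTPUT
EOF
echo "wrote serial.sge"
```

The quoted 'EOF' delimiter stops the shell expanding anything inside the here-document, so the jobscript is written verbatim. Submit it with qsub serial.sge as described above.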
Parallel batch job submission
Only the pyPcazip utility may be run in parallel.
Make sure you have the modulefile loaded then create a batch submission script, for example:
#!/bin/bash
#$ -S /bin/bash
#$ -cwd              # Job will run from the current directory
#$ -V                # Job will inherit current environment settings
#$ -pe smp.pe 4      # Single-node parallel job (number of cores can be 2 to 24)

## $NSLOTS is automatically set to the number of cores given above
mpirun -n $NSLOTS pyPcazip -i INPUT -a ALBUM -t TOPOLOGY -o OUTPUT
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
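Because smp.pe accepts only 2 to 24 cores, it can be worth sanity-checking the requested core count before editing it into the jobscript. A minimal sketch, assuming a variable named NCORES holds the value you intend to put after -pe smp.pe:

```shell
# Sketch: validate an smp.pe core request before submitting.
# The smp.pe parallel environment accepts 2 to 24 cores (see the jobscript above).
NCORES=4
if [ "$NCORES" -ge 2 ] && [ "$NCORES" -le 24 ]; then
  echo "ok: requesting $NCORES cores"    # prints "ok: requesting 4 cores"
else
  echo "error: smp.pe needs between 2 and 24 cores" >&2
  exit 1
fi
```

The same range check could be placed at the top of the jobscript itself so that an invalid request fails immediately rather than sitting in the queue.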
Further info
Updates
None.