The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.
QIIME
Overview
QIIME (Quantitative Insights Into Microbial Ecology) is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data).
Versions 1.9.1 and 1.8.0 are installed on the CSF.
QIIME is often used with USEARCH or vsearch. These pieces of software are installed separately on the CSF. If you wish to use them together you need to load both QIIME (see below) and USEARCH or vsearch.
Restrictions on use
None. Please read the citation instructions for publishing work that has used QIIME.
Set up procedure
To access the software you must first load the modulefile for the version you require:
module load apps/gcc/qiime/1.9.1
This will automatically load the apps/binapps/anaconda/2.3.0 modulefile providing Anaconda Python.
OR
module load apps/gcc/qiime/1.8.0
This will automatically load the apps/binapps/anaconda/1.8.0 modulefile providing Anaconda Python.
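After loading a modulefile, a quick check that the QIIME scripts are on your PATH can catch a forgotten module load before a job is submitted. The following is a minimal sketch only; check_qiime_env is a hypothetical helper name, not something provided by QIIME or the CSF.

```shell
# Sketch: confirm that a QIIME script can be found on PATH.
# check_qiime_env is a hypothetical helper, not part of QIIME or the CSF.
check_qiime_env() {
    # Default to print_qiime_config.py, which ships with QIIME.
    local tool="${1:-print_qiime_config.py}"
    local tool_path
    if tool_path=$(command -v "$tool" 2>/dev/null); then
        echo "found: $tool_path"
    else
        echo "$tool not found on PATH; has the QIIME modulefile been loaded?" >&2
        return 1
    fi
}

# Example use after "module load apps/gcc/qiime/1.9.1":
#   check_qiime_env pick_open_reference_otus.py
```

The function only inspects PATH, so it is safe to run on the login node.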
Running the application
Please do not run QIIME on the login node. Jobs should be submitted to the compute nodes via the batch system.
Serial batch job submission
Make sure you have the modulefile loaded, that you have created a directory for the job to run in, and that the directory contains all the files and other directories needed by your job. Then, in the main directory for the job, create a batch submission script, for example:
#!/bin/bash
#$ -S /bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings

pick_open_reference_otus.py -i /scratch/$USER/data/combined_seqs.fna -m usearch \
    -r /scratch/$USER/Genes/abc.fasta -o /scratch/$USER/uclust_picked_otus \
    -p /scratch/$USER/qiime_parameters.txt --suppress_align_and_tree
where pick_open_reference_otus.py is the QIIME tool you wish to use. To see a list of those available, run the following on the login node:
ls $QIIME_HOME/bin
Now submit the job with:
qsub jobscript
where jobscript is the name of the submission script.
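Before running qsub, it can help to confirm that the job directory exists and that every input the jobscript refers to is present. A minimal sketch follows; prepare_job_dir is a hypothetical helper, and the directory name in the usage comment is illustrative only.

```shell
# Sketch: create the job directory and check that the inputs exist.
# prepare_job_dir is a hypothetical helper, not part of QIIME or the CSF.
prepare_job_dir() {
    local job_dir="$1"
    shift
    mkdir -p "$job_dir" || return 1
    local missing=0 f
    # Every remaining argument is treated as a required input path.
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing input: $f" >&2
            missing=1
        fi
    done
    # Returns 0 only when all inputs were found.
    return "$missing"
}

# Example use (myjob is an illustrative directory name):
#   prepare_job_dir /scratch/$USER/myjob \
#       /scratch/$USER/data/combined_seqs.fna \
#       /scratch/$USER/qiime_parameters.txt && qsub jobscript
```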
Parallel batch job submission – 2 to 16 cores
As above, but add a line to the submission script requesting more than one core (4 in this example), and tell the software to use the same number of cores. For the tool used here this is done with -a -O $NSLOTS:
#!/bin/bash
#$ -S /bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings
#$ -pe smp.pe 4     # 4 core parallel job request

pick_open_reference_otus.py -a -O $NSLOTS -i /scratch/$USER/data/combined_seqs.fna -m usearch61 \
    -r /scratch/$USER/Genes/abc.fasta -o /scratch/$USER/uclust_picked_otus \
    -p /scratch/$USER/qiime_parameters.txt --suppress_align_and_tree
Note 1: Using $NSLOTS ensures that the number of cores you have requested (on the #$ -pe line) matches the number of cores the software actually tries to use. If they do not match, nodes can be overloaded (the software uses more cores than you reserved) or resources wasted (the software uses fewer cores than you reserved), either of which can harm the CSF service. DO NOT edit your config or jobscript to use more than 16 cores, as this will overload nodes.
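One way to guard against a mismatch is to validate the core count inside the jobscript before passing it to the QIIME tool. The following is a sketch under stated assumptions: check_slots and MAX_CORES are hypothetical names introduced here, and the value 16 is taken from the limit described above.

```shell
# Sketch: validate the core count before handing it to a QIIME tool.
# check_slots and MAX_CORES are hypothetical names, not part of QIIME
# or the batch system; 16 is the per-job limit described above.
MAX_CORES=16

check_slots() {
    # Fall back to 1 core when the argument is empty (e.g. a serial job
    # where NSLOTS is not set).
    local slots="${1:-1}"
    # Reject anything that is not a positive integer.
    case "$slots" in
        *[!0-9]*|0)
            echo "invalid core count: $slots" >&2
            return 1
            ;;
    esac
    if [ "$slots" -gt "$MAX_CORES" ]; then
        echo "refusing to use $slots cores (limit is $MAX_CORES)" >&2
        return 1
    fi
    echo "$slots"
}

# Example use inside a jobscript:
#   CORES=$(check_slots "$NSLOTS") || exit 1
#   pick_open_reference_otus.py -a -O "$CORES" ...
```

Because the value still comes from $NSLOTS, the software uses exactly what the #$ -pe line reserved; the check only stops the job early if something unexpected has been set.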
Note 2: Not all tools in QIIME can be run in parallel. Some may require options as in the example above, others may not. Some are named parallel_tool.py (where tool is the QIIME tool you are using). Please consult the QIIME documentation for further details.
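To see which tools follow the parallel_ naming pattern in the installed version, the bin directory can be filtered by name. This is a minimal sketch; list_parallel_tools is a hypothetical helper, and it assumes $QIIME_HOME is set by the modulefile as in the ls example earlier on this page.

```shell
# Sketch: list the scripts named parallel_*.py in the QIIME bin directory.
# list_parallel_tools is a hypothetical helper; it assumes the modulefile
# has set QIIME_HOME unless a directory is given as the first argument.
list_parallel_tools() {
    local bin_dir="${1:-$QIIME_HOME/bin}"
    local f
    for f in "$bin_dir"/parallel_*.py; do
        # When nothing matches, the pattern is left unexpanded; skip it.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}

# Example use on the login node, after loading the modulefile:
#   list_parallel_tools
```

A tool that does not appear in this list may still accept parallel options of its own, so the QIIME documentation remains the authority.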
Local Config File
The QIIME configuration options can be displayed by running print_qiime_config.py. If you wish to modify any options you should create a local configuration file as follows:
cp $QIIME_CONFIG_FP ~/.qiime_config
and then edit your local ~/.qiime_config file.
DO NOT make changes to this file if you are not confident about how they will affect your job. Errors in this config can cause issues in the batch system, affecting not only your jobs but also those of other users.
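Keeping a backup and reviewing exactly what differs from the system-wide config makes it easier to undo a change that misbehaves. The following is a sketch only; backup_and_diff_config is a hypothetical helper and the .bak suffix is an arbitrary choice.

```shell
# Sketch: back up the local config, then show how it differs from the
# system-wide one. backup_and_diff_config is a hypothetical helper.
backup_and_diff_config() {
    local system_cfg="$1" local_cfg="$2"
    if [ ! -f "$local_cfg" ]; then
        echo "no local config found: $local_cfg" >&2
        return 1
    fi
    # Keep a copy (with timestamps preserved) next to the original.
    cp -p "$local_cfg" "$local_cfg.bak" || return 1
    # diff exits with 1 when the files differ, which is expected here;
    # only an exit status above 1 indicates a real problem.
    local status=0
    diff "$system_cfg" "$local_cfg" || status=$?
    [ "$status" -le 1 ]
}

# Example use, after loading the modulefile:
#   backup_and_diff_config "$QIIME_CONFIG_FP" ~/.qiime_config
```

To revert, copy ~/.qiime_config.bak back over ~/.qiime_config.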
Further info
Updates
None.