SPAdes

Overview

SPAdes (St. Petersburg genome assembler) is intended for assembling both standard isolate and single-cell MDA bacterial data sets.

Versions 3.13.1, 3.11.1, 3.10.1 and 3.5.0 are installed on the CSF.

Note that other tools such as metaSPAdes are included in SPAdes. Run spades.py -h to see the command-line flags accepted by spades.py (for example, --meta selects metaSPAdes).

Restrictions on use

This software is open source and may be used by any CSF user. Please ensure that you cite your usage in any results or publications as per the developers' documentation.

Set up procedure

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

To access the software, load one of the following modulefiles (one per installed version):

module load apps/binapps/spades/3.13.1
module load apps/binapps/spades/3.11.1
module load apps/binapps/spades/3.10.1
module load apps/binapps/spades/3.5.0
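
To confirm that a modulefile has taken effect in the current shell, you can check that spades.py is on your PATH. The following is a minimal sketch only; the helper name check_loaded is hypothetical and is not part of SPAdes or the CSF environment.

```bash
#!/bin/bash
# Hypothetical helper: report whether a command can be found on PATH.
# Returns 0 if it is found, 1 otherwise.
check_loaded() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found: load a spades modulefile first" >&2
        return 1
    fi
}

# Check for the SPAdes driver script; do not abort the shell if it is missing.
check_loaded spades.py || true
```

Running module list shows which modulefiles are currently loaded, which is a useful cross-check if the command is not found.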

Running the application

Please do not run SPAdes on the login node. Jobs should be submitted to the compute nodes via the batch system.

Serial batch job submission

Create a batch submission script that loads the modulefile and runs SPAdes, for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory

# Load the modulefile in the jobscript - choose your required version
module load apps/binapps/spades/3.11.1

# $NSLOTS will be set automatically to 1 in a serial job
spades.py --threads $NSLOTS -1 file1.in -2 file2.in -o output_dir

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
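
A job that waits in the queue and then fails on a missing input file wastes time. As an optional sketch, the jobscript could check its inputs before calling spades.py. The helper name require_files is hypothetical and is not part of SPAdes; file1.in and file2.in are the placeholder names used in the example above.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if every named file exists and is non-empty.
require_files() {
    local f
    local missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "Missing or empty input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use in a jobscript, placed before the spades.py line:
#   require_files file1.in file2.in || exit 1
```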

Parallel batch job submission

Create a batch submission script that loads the modulefile and requests the number of cores you need, for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 8     # Number of cores (2-32 permitted)

# Load the modulefile in the jobscript - choose your required version
module load apps/binapps/spades/3.11.1

# $NSLOTS will be set automatically to the number of cores given above
spades.py --threads $NSLOTS -1 file1.in -2 file2.in -o output_dir

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.

High-memory batch job submission

Note that SPAdes (and therefore metaSPAdes) has a built-in default memory limit of 250GB. Even if your batch job has access to more memory (because it is running on one of the high-memory nodes), you must tell SPAdes that it can use more by adding the -m NUM_GB flag to the spades.py command line. For example, to tell SPAdes that it is running on a node with 512GB of RAM:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 16    # Number of cores (2-16 permitted on a mem512 node)
#$ -l mem512        # Run on a 512GB node

# Load the modulefile in the jobscript - choose your required version
module load apps/binapps/spades/3.11.1

# $NSLOTS will be set automatically to the number of cores given above
spades.py -m 512 --threads $NSLOTS -1 file1.in -2 file2.in -o output_dir
            #
            # Extra flag to increase the SPAdes built-in memory
            # limit (from 250GB) to 512GB.

Note that your jobscript must request a high-memory node and enough cores to give the job the memory SPAdes needs to process your data. Simply adding the -m flag to the spades.py command is not enough: if the job tries to use more memory than the batch system allows for it, the batch system will kill the job.
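
The link between the number of cores requested and a sensible -m value can be sketched as follows. This is an illustration under a stated assumption: the figure of 32GB per core is simply 512GB divided by the 16 cores of a mem512 node, and the real per-core allowance should be checked against the CSF batch system documentation. The helper name spades_mem_gb is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: estimate the memory (in GB) available to a job.
#   $1 = number of cores requested
#   $2 = GB of RAM per core (ASSUMPTION: defaults to 32, i.e. 512GB / 16 cores)
spades_mem_gb() {
    local slots="$1"
    local gb_per_core="${2:-32}"
    echo $(( slots * gb_per_core ))
}

# NSLOTS is set by the batch system inside a job; fall back to 1 elsewhere.
MEM_GB=$(spades_mem_gb "${NSLOTS:-1}")

# Print the command that would be run, rather than running it.
echo "spades.py -m ${MEM_GB} --threads ${NSLOTS:-1} -1 file1.in -2 file2.in -o output_dir"
```

Under this assumption, a 16-core job on a mem512 node gives -m 512, whereas an 8-core job would only justify -m 256.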

Further info

Updates

None.

Last modified on July 8, 2019 at 5:40 pm by George Leaver