

The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.

Canu

Overview

Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION).

The application is written in C++ and can be used in batch or interactively (via the qrsh command).

Version 1.7 is installed on the CSF.

Restrictions on use

There are no restrictions on using this software on the CSF. The application is released under the GPLv2 license.

Set up procedure

To access the software you must first load the modulefile:

module load apps/gcc/canu/1.7

Running the application

Please do not run canu on the login node. Jobs should be submitted to the compute nodes via batch or run interactively via the qrsh command. Canu can submit batch jobs on your behalf, but this does not work on the CSF because canu would try to submit jobs from within other jobs, which is not supported. You must therefore run canu from a jobscript with the useGrid=false option and submit that jobscript to the batch system in the usual way.
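
A jobscript can guard against being run by accident outside the batch system. The following is a minimal sketch, not part of the CSF setup: it assumes the scheduler exports the JOB_ID and NSLOTS variables to jobs (as Grid Engine normally does), and the function name in_batch_job is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: succeeds only when running inside a scheduler job.
# Assumption: the batch system sets JOB_ID and NSLOTS in the job environment.
in_batch_job() {
    [ -n "${JOB_ID:-}" ] && [ -n "${NSLOTS:-}" ]
}

if in_batch_job; then
    echo "Running inside job $JOB_ID with $NSLOTS core(s)"
else
    echo "Not inside a batch job: submit with qsub or use qrsh" >&2
fi
```

A check like this could sit at the top of a jobscript, before the canu command, so that the script stops early if it is started directly on the login node.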

Serial batch job submission

Make sure you have the modulefile loaded then create a batch submission script, for example:

#!/bin/bash
#$ -S /bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings

# $NSLOTS is automatically set to 1 in a serial job
# You MUST set the maxThreads, mhapThreads and maxMemory options
# to tell canu how much resource it can use. We assume 4GB per core.
# The useGrid=false option is needed to run from a batch job.

canu -d run1 -p godzilla genomeSize=1g maxThreads=$NSLOTS mhapThreads=$NSLOTS maxMemory=$((4*NSLOTS)) useGrid=false \
   -nanopore-raw reads/*.fasta.gz

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.

Parallel batch job submission

Make sure you have the modulefile loaded then create a batch submission script, for example:

#!/bin/bash
#$ -S /bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings
#$ -pe smp.pe 8     # Number of cores (2-24 permitted)

# $NSLOTS is automatically set to the number given on the -pe line above.
# You MUST set the maxThreads, mhapThreads and maxMemory options
# to tell canu how much resource it can use. We assume 4GB per core.
# The useGrid=false option is needed to run from a batch job.

# maxMemory=$((4*NSLOTS)) ensures 4GB per core is used.
# mhapThreads=$NSLOTS ensures Java will use the correct number of cores
# if your pipeline uses the MHAP algorithm.

canu -d run1 -p godzilla genomeSize=1g maxThreads=$NSLOTS mhapThreads=$NSLOTS maxMemory=$((4*NSLOTS)) useGrid=false \
   -nanopore-raw reads/*.fasta.gz

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
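
To see how the resource options in the jobscripts above expand for a given core count, the arithmetic can be sketched as a small shell function. This is only an illustration under the same 4GB-per-core assumption; the function name canu_resource_opts is hypothetical and is not part of canu or the CSF.

```bash
#!/bin/bash
# Hypothetical helper: print the canu resource options for a given core count,
# assuming 4GB of memory per core as in the jobscripts above.
canu_resource_opts() {
    local cores=${1:-1}
    local mem_gb=$((4 * cores))   # same arithmetic as maxMemory=$((4*NSLOTS))
    echo "maxThreads=$cores mhapThreads=$cores maxMemory=$mem_gb useGrid=false"
}

# Inside a job NSLOTS is set by the batch system; default to 1 elsewhere.
canu_resource_opts "${NSLOTS:-1}"
```

For an 8-core job (#$ -pe smp.pe 8) this prints maxThreads=8 mhapThreads=8 maxMemory=32 useGrid=false, i.e. canu is limited to 8 threads and 32GB of memory.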

Interactive use

Canu can be run interactively but this must not be done on the login node. Instead you must schedule an interactive session on a compute node. Use the following command (after loading the modulefile):

qrsh -l inter -l short -V canu

This will only run when a free core is available on one of the interactive nodes. If no free resources are available you will be asked to try again later.

Further info

Brief notes on command-line flags are available on the CSF using:
```
canu -help
```

Updates

None.

Last modified on April 10, 2018 at 8:34 am by George Leaver
