Guppy

Overview

Guppy is a data processing toolkit that contains Oxford Nanopore Technologies’ basecalling algorithms and several bioinformatic post-processing features. It can be run as a CPU-only application or on the GPUs in the CSF.

Restrictions on use

Access to this software is highly restricted – currently only one research group has a license and you must be an approved member of that research group. Please ask the license holder to contact us requesting access for you. Use of the software will also require you to register with the Oxford Nanopore Community website.

Set up procedure

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

Load one of the following modulefiles:

# GPU versions (the CUDA modulefile will be loaded automatically for you)
module load apps/binapps/guppy/6.1.7-gpu
module load apps/binapps/guppy/5.0.16-gpu
module load apps/binapps/guppy/5.0.7-gpu
module load apps/binapps/guppy/4.4.1-gpu
module load apps/binapps/guppy/4.0.14-gpu
module load apps/binapps/guppy/3.4.5-gpu
module load apps/binapps/guppy/3.1.5-gpu

# CPU-only versions
module load apps/binapps/guppy/6.1.7
module load apps/binapps/guppy/5.0.16
module load apps/binapps/guppy/5.0.7
module load apps/binapps/guppy/4.4.1
module load apps/binapps/guppy/4.0.14
module load apps/binapps/guppy/3.4.5
module load apps/binapps/guppy/3.1.5

Running the application

Please do not run Guppy on the login node. Jobs should be submitted to the compute nodes via batch.

The following executables are available. You may run any of them with the --help flag to obtain a list of accepted flags. However, please note that the GPU versions must be run on a GPU compute node – they will not run (even with just the --help flag) on the login node because there is no GPU available there.

guppy_aligner
guppy_basecaller
guppy_basecaller_1d2
guppy_barcoder

Serial batch job submission

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
                    # NO -V line - we load modulefiles in the jobscript

# Load your required version (this is the CPU-only version)
module load apps/binapps/guppy/3.1.5

# Example of using the guppy_basecaller. For a serial (1 core) job ensure
# we use only one thread with one basecaller.
# Replace input_path, save_path and config_file with your own values.
guppy_basecaller --cpu_threads_per_caller 1 --num_callers 1 \
     -i input_path -s save_path -c config_file

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.

Parallel batch job submission

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 8     # Number of cores. Can be 2--32.

# Load your required version (this is the CPU-only version)
module load apps/binapps/guppy/3.1.5

# Example of using the guppy_basecaller. For a parallel job ensure
# the number of threads per basecaller X number of basecallers = $NSLOTS (the number of cores)
#
# In this example we request 8 cores so we'll use 4 cores per basecaller and 2 basecallers.
# Replace input_path, save_path and config_file with your own values.
guppy_basecaller --cpu_threads_per_caller 4 --num_callers 2 \
     -i input_path -s save_path -c config_file

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
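
The rule used in the parallel jobscript above (threads per basecaller x number of basecallers = $NSLOTS) can also be computed rather than hard-coded. The following is only a rough sketch, not part of Guppy: threads_per_caller, NUM_CALLERS and THREADS are hypothetical names chosen for illustration, and NSLOTS is given a default of 8 so the snippet can be tried outside a batch job.

```shell
#!/bin/bash
# Sketch: derive --cpu_threads_per_caller from the core count and a chosen
# number of basecallers, so that threads x callers = cores.
# threads_per_caller is a hypothetical helper, not a Guppy command.
threads_per_caller() {
    local cores=$1 callers=$2
    # Reject splits that would leave cores idle or oversubscribed.
    if [ "$callers" -lt 1 ] || [ $((cores % callers)) -ne 0 ]; then
        echo "error: $cores cores cannot be split evenly across $callers basecallers" >&2
        return 1
    fi
    echo $((cores / callers))
}

# NSLOTS is set by the batch system inside a job; default to 8 elsewhere.
NSLOTS=${NSLOTS:-8}
NUM_CALLERS=2           # Hypothetical choice; adjust to suit your job

if THREADS=$(threads_per_caller "$NSLOTS" "$NUM_CALLERS"); then
    # Print the flags that would be passed to guppy_basecaller.
    echo "--cpu_threads_per_caller $THREADS --num_callers $NUM_CALLERS"
fi
```

In a jobscript the computed values could be passed to guppy_basecaller in place of the fixed 4 and 2 used above. Stopping on an uneven split, instead of rounding, avoids running with more or fewer threads than the cores requested.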

GPU batch job submission

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 8     # Number of cores. Can be up to 8 per GPU.
#$ -l v100=1        # Number of GPUs

# Load your required version (this is the GPU-enabled version)
module load apps/binapps/guppy/3.1.5-gpu

# Example of using the guppy_basecaller. For a parallel job ensure
# the number of threads per basecaller X number of basecallers = $NSLOTS (the number of cores)
#
# In this example we request 8 cores so we'll use 4 cores per basecaller and 2 basecallers.
#
# NOTE: Replace NUMRUNNERS with a number of runners per GPU that suits your needs,
# and input_path, save_path and config_file with your own values.
guppy_basecaller --gpu_runners_per_device NUMRUNNERS \
     --cpu_threads_per_caller 4 --num_callers 2 \
     -i input_path -s save_path -c config_file

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
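
The GPU jobscript above notes that up to 8 cores may be requested per GPU. As a minimal sketch of that limit, a jobscript could check the core count before starting the basecaller. The names check_gpu_cores and NGPUS are hypothetical and not provided by Guppy or the batch system; NGPUS is assumed to be set by hand to match the '#$ -l v100=' line.

```shell
#!/bin/bash
# Sketch: check a core count against the "up to 8 cores per GPU" limit.
# check_gpu_cores and NGPUS are hypothetical names used for illustration.
check_gpu_cores() {
    local cores=$1 gpus=$2
    local max=$((8 * gpus))
    if [ "$cores" -gt "$max" ]; then
        echo "error: $cores cores exceeds the limit of $max for $gpus GPU(s)" >&2
        return 1
    fi
    return 0
}

NGPUS=1                 # Set by hand to match the '#$ -l v100=' line
NSLOTS=${NSLOTS:-8}     # Set by the batch system inside a job

if check_gpu_cores "$NSLOTS" "$NGPUS"; then
    echo "OK: $NSLOTS cores requested for $NGPUS GPU(s)"
fi
```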

Further info

Updates

None.

Last modified on July 11, 2022 at 11:06 am by