BBTools

Overview

BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file formats such as fastq, fasta, sam, scarf, fasta+qual, compressed or raw, with autodetection of quality encoding and interleaving. Program descriptions and options are shown when running the shell scripts with no parameters.

The BBTools suite includes programs such as:

  • bbduk – filters or trims reads for adapters and contaminants using k-mers
  • bbmap – short-read aligner for DNA and RNA-seq data
  • bbmerge – merges overlapping or nonoverlapping pairs into a single read
  • reformat – converts sequence files between different formats such as fastq and fasta
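
As a rough illustration of how three of these programs are typically invoked once the modulefile is loaded, the sketch below only prints example command lines (a dry run). All file names are hypothetical placeholders, and the k-mer settings shown for bbduk are common illustrative values rather than a recommendation for your data.

```shell
#!/bin/bash
# Print example command lines for three BBTools programs (dry run only).
# All file names below are hypothetical placeholders.
show_examples() {
    # Convert fastq to fasta; formats are inferred from the file extensions.
    echo "reformat.sh in=reads.fq out=reads.fa"
    # Trim adapters from the right end of reads by k-mer matching against adapters.fa.
    echo "bbduk.sh in=reads.fq out=trimmed.fq ref=adapters.fa ktrim=r k=23 mink=11 hdist=1"
    # Merge overlapping read pairs; pairs that cannot be merged go to the outu files.
    echo "bbmerge.sh in1=r1.fq in2=r2.fq out=merged.fq outu1=unmerged1.fq outu2=unmerged2.fq"
}

show_examples
```

Running any of these scripts with no parameters lists the options actually supported by the installed version.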

Version 38.96 is installed on the CSF.

Restrictions on use

BBTools is open source and free for unlimited use, and is used regularly by DOE JGI and other institutions around the world.
BBTools may be cited using the primary website: BBMap – Bushnell B. – http://sourceforge.net/projects/bbmap/

Set up procedure

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

Load the following modulefile:

module load apps/binapps/bbtools/38.96

Running the application

Please do not run BBTools on the login node. Jobs should be submitted to the compute nodes as batch jobs.

Program descriptions and options are shown when running the shell scripts with no parameters, e.g.

module load apps/binapps/bbtools/38.96
bbmap.sh

##### omitted for clarity #####

To index:     bbmap.sh ref=<reference fasta>
To map:       bbmap.sh in=<reads> out=<output sam>
To map without writing an index:
    bbmap.sh ref=<reference fasta> in=<reads> out=<output sam> nodisk

##### omitted for clarity #####
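
The two-step workflow in the usage text above (index first, then map) can be sketched as follows. This is a minimal sketch with hypothetical file names; it runs the commands only if bbmap.sh and the input files are present, and otherwise just prints them. By default BBMap writes the index to a ref/ directory in the current working directory, so both steps should be run from the same place.

```shell
#!/bin/bash
# Hypothetical file names; replace with your own data.
REF="reference.fa"
READS="reads.fq"
OUT="mapped.sam"

# Step 1: build the index on disk from the reference.
INDEX_CMD=(bbmap.sh ref="$REF")
# Step 2: map the reads against the index built in step 1.
MAP_CMD=(bbmap.sh in="$READS" out="$OUT")

if command -v bbmap.sh >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$READS" ]; then
    # Only map if indexing succeeded.
    "${INDEX_CMD[@]}" && "${MAP_CMD[@]}"
else
    # Dry run: show what would be executed.
    echo "${INDEX_CMD[*]}"
    echo "${MAP_CMD[*]}"
fi
```

Writing the index once is worthwhile when the same reference is reused across several jobs; the nodisk form avoids leaving index files behind for one-off runs.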

Serial batch job submission

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
### You may need to add a high memory flag here

# Choose your required version
module load apps/binapps/bbtools/38.96

# Choose the program you would like to run. Various programs are available; bbmap is used here for illustration.
# The command below maps without writing an index
bbmap.sh ref=<reference fasta> in=<reads> out=<output sam> nodisk


Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.

If you need more RAM (memory) to complete the analysis successfully (and you may well do), please add the flags described on the high-memory jobs page.
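
Separately from the scheduler's memory flags, BBTools programs run on Java and their shell scripts accept a -Xmx flag that sets the Java heap size, overriding autodetection. The sketch below derives such a flag from a hypothetical JOB_MEM_GB value that you would set to match the memory your job actually has. The 85% headroom is an assumption based on common BBTools guidance, and the file names are placeholders.

```shell
#!/bin/bash
# Hypothetical value: total memory (in GB) available to the job.
# Adjust this to match what your job actually requests.
JOB_MEM_GB=16

# Leave headroom for the JVM itself; 85% is an illustrative assumption.
JAVA_MEM_GB=$(( JOB_MEM_GB * 85 / 100 ))
XMX_FLAG="-Xmx${JAVA_MEM_GB}g"

# Dry run: print the command line that would be used in the jobscript.
echo "bbmap.sh ${XMX_FLAG} ref=reference.fa in=reads.fq out=mapped.sam nodisk"
```

If a job fails with a Java out-of-memory error, request more memory from the scheduler first and then raise this value to match.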

Parallel batch job submission

PLEASE NOTE: BBMap, like many other BBTools programs, is multithreaded for both indexing and mapping. It scales near-linearly with processor cores, but it will use all available threads unless capped with the “t=” flag, so set “t=” to the number of cores your job requests.
Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 16    # Number of cores, can be 2--32
### You may need to add a high memory flag here

# Choose your required version
module load apps/binapps/bbtools/38.96

# $NSLOTS is automatically set to the number of cores requested above
bbmap.sh t=$NSLOTS ref=<reference fasta> in=<reads> out=<output sam> nodisk

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
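
One optional refinement is to guard against $NSLOTS being unset, for example when dry-running the script outside the batch system. The sketch below falls back to a single thread in that case. The thread_count helper and the file names are hypothetical, introduced only for illustration.

```shell
#!/bin/bash
# Return the scheduler-provided core count, or 1 if NSLOTS is unset or empty.
thread_count() {
    echo "${NSLOTS:-1}"
}

THREADS="$(thread_count)"

# Hypothetical file names; replace with your own data.
CMD=(bbmap.sh t="$THREADS" ref=reference.fa in=reads.fq out=mapped.sam nodisk)

if command -v bbmap.sh >/dev/null 2>&1 && [ -f reference.fa ] && [ -f reads.fq ]; then
    "${CMD[@]}"
else
    # Dry run: show the command without executing it.
    echo "Would run: ${CMD[*]}"
fi
```

Capping the thread count this way keeps the program within the cores allocated to the job rather than every core it can see.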

If you need more RAM (memory) to complete the analysis successfully (and you may well do), please add the flags described on the high-memory jobs page.

Further info

Updates

None.

Last modified on April 29, 2022 at 3:07 pm by Chris Grave