The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.

IDBA and IDBA-UD

Overview

IDBA is a basic iterative de Bruijn graph assembler for second-generation sequencing reads. IDBA-UD, an extension of IDBA, uses paired-end reads to assemble low-depth regions and applies progressive depth on contigs to reduce errors in high-depth regions.

Version 1.2.0 is installed on the CSF.

Restrictions on use

There are no restrictions on accessing the software on the CSF. It is released under the GNU GPL v2 license and all usage should adhere to that license.

Set up procedure

To access the software you must first load the modulefile:

module load apps/intel-15.0/idba/1.2.0
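
As a quick sanity check after loading the modulefile, the sketch below tests whether idba_ud is on your PATH. The function name check_idba is hypothetical and not part of the IDBA installation; it assumes only that the modulefile adds the executables to PATH.

```shell
# check_idba is a hypothetical helper, not something shipped with IDBA.
# It succeeds if idba_ud can be found on the current PATH.
check_idba() {
    if command -v idba_ud >/dev/null 2>&1; then
        echo "idba_ud found: $(command -v idba_ud)"
    else
        echo "idba_ud not found: load the modulefile first" >&2
        return 1
    fi
}
```

Run check_idba on its own after the module load command. A non-zero exit status suggests the modulefile is not loaded in the current shell.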

Running the application

Please do not run IDBA on the login node. Jobs should be submitted to the compute nodes via batch.

The following executables are available. To see the flags accepted by an executable, run it with the --help flag (this is fine to do on the login node):

fa2fq           idba_hybrid        print_graph    sim_reads_tran  validate_component
filter_blat     idba_tran          raw_n50        sort_psl        validate_contigs_blat
filter_contigs  idba_tran_test     sample_reads   sort_reads      validate_contigs_mummer
filterfa        idba_ud            scaffold       split_fa        validate_reads_blat
fq2fa           parallel_blat      shuffle_reads  split_fq        validate_rna
idba            parallel_rna_blat  sim_reads      split_scaffold

A number of helper scripts are available – run the following command after loading the modulefile to see what they are:

ls $IDBADIR/script/

Serial batch job submission

Make sure you have the modulefile loaded, then create a batch submission script, for example:

#!/bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings

# $NSLOTS will be set to the number of cores requested (1 in a serial job)
idba_ud --num_threads $NSLOTS -r read.fa -o output_dir

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
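
The steps above can be sketched end to end as follows. This is an illustration under stated assumptions rather than a recommended workflow: the file names idba_job.sh, read_1.fq and read_2.fq are placeholders, and the fq2fa step assumes your reads start as a pair of FASTQ files that need merging into a single FASTA file before assembly. Confirm the fq2fa flags with fq2fa --help on the installed version.

```shell
# Write a serial jobscript. idba_job.sh is a placeholder name.
# The quoted 'EOF' stops the shell expanding $NSLOTS while writing the file.
cat > idba_job.sh <<'EOF'
#!/bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings

# Assumption: reads are paired FASTQ files (placeholder names).
# Merge them into one FASTA file for idba_ud.
fq2fa --merge --filter read_1.fq read_2.fq read.fa

# $NSLOTS will be set to the number of cores requested (1 in a serial job)
idba_ud --num_threads $NSLOTS -r read.fa -o output_dir
EOF

# Submit only where qsub exists (for example, on the CSF login node).
if command -v qsub >/dev/null 2>&1; then
    qsub idba_job.sh
else
    echo "qsub not found; jobscript written to idba_job.sh"
fi
```

If your reads are already in a single FASTA file, drop the fq2fa line and pass that file to -r directly.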

Parallel batch job submission