VSEARCH

Overview

VSEARCH is a sequence analysis tool (similar to USEARCH). It supports de novo and reference-based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.

Versions 1.1.3 and 2.10.4 (pre-compiled 64-bit binaries) are installed on the CSF.

Restrictions on use

There are no restrictions on accessing this software on the CSF. It is released under the GNU GPL v3 license and all use must adhere to that license.

Set up procedure

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

To access the software you must first load one of the following modulefiles:

module load apps/binapps/vsearch/2.10.4
module load apps/binapps/vsearch/1.1.3
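
After loading a modulefile, a quick check that the executable is on your PATH can catch setup problems before a job is submitted. The following is a minimal sketch only; the helper name require_cmd is hypothetical and is not part of VSEARCH or the modules system.

```shell
# require_cmd is a hypothetical helper for illustration.
# It prints the full path of a command, or reports that it is missing.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $(command -v "$1")"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# vsearch is only expected to be found after one of the modulefiles above is loaded
require_cmd vsearch || echo "load a vsearch modulefile first"
```

The same check can be placed in a jobscript directly after the module load line, so that a missing modulefile is reported clearly in the job output.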

Running the application

Please do not run VSEARCH on the login node. Jobs should be submitted to the compute nodes via the batch system. You may run the following commands on the login node to get help with running the application:

# Display flags accepted by the executable
vsearch --help

# Display the manual page
man vsearch

Serial batch job submission

Create a batch submission script that loads the modulefile and runs VSEARCH, for example:

#!/bin/bash --login
#$ -cwd                # Job will run from the current directory

### We now load the modulefile in the jobscript, for example:
module load apps/binapps/vsearch/2.10.4

vsearch --cluster_fast combined.fasta --id 0.97 --uc results.uc --centroids centroids.fasta

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
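
Once the job has finished, it can be useful to check how many clusters were produced. The sketch below assumes the output file names from the example jobscript above (centroids.fasta and results.uc) and assumes that cluster summary lines in the tab-separated UC file have C in the first column. The function names are hypothetical helpers, not VSEARCH commands.

```shell
# Count FASTA records: each record starts with a header line beginning with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Count cluster records in a UC file: lines whose first tab-separated field is "C"
# (assumed here to be one line per cluster).
count_uc_clusters() {
    awk -F'\t' '$1 == "C" { n++ } END { print n + 0 }' "$1"
}

# Report only if the output files from the example jobscript are present
if [ -f centroids.fasta ] && [ -f results.uc ]; then
    echo "centroid sequences: $(count_fasta_records centroids.fasta)"
    echo "cluster records:    $(count_uc_clusters results.uc)"
fi
```

If the two counts differ, that may indicate a truncated or incomplete output file, so it is worth checking the job's output and error files.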

Parallel batch job submission