VSEARCH
Overview
VSEARCH is a sequence analysis tool (similar to USEARCH). It supports de novo and reference-based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.
Versions 1.1.3 and 2.10.4 (pre-compiled 64-bit binaries) are installed on the CSF.
Restrictions on use
There are no restrictions on accessing this software on the CSF. It is released under the GNU GPL v3 license and all use must adhere to that license.
Set up procedure
We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.
To access the software you must first load one of the following modulefiles:
module load apps/binapps/vsearch/2.10.4
module load apps/binapps/vsearch/1.1.3
Running the application
Please do not run VSEARCH on the login node. Jobs should be submitted to the compute nodes via batch. You may run the following commands on the login node to get help with running the application:
# Display flags accepted by the executable
vsearch --help

# Display the manual page
man vsearch
Serial batch job submission
Create a batch submission script that loads the modulefile and runs VSEARCH, for example:
#!/bin/bash --login
#$ -cwd             # Job will run from the current directory

### We now load the modulefile in the jobscript, for example:
module load apps/binapps/vsearch/2.10.4

vsearch -cluster_fast combined.fasta -id 0.97 -uc results.uc -centroids centroids.fasta
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
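Once the job has finished, a quick sanity check on the two output files can catch a truncated or failed run. The sketch below is not part of VSEARCH: the function names are hypothetical helpers. It relies on the UC format convention that each cluster centroid is written as a record whose first tab-separated field is S, so the number of S records should match the number of sequences in the centroids file.

```shell
#!/bin/bash
# Hypothetical helpers (not part of VSEARCH) for checking clustering output.

# Count sequences in a FASTA file: one header line starting with '>' per sequence.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Count centroid (seed) records in a UC file: first tab-separated field is "S".
count_uc_centroids() {
    awk -F'\t' '$1 == "S" { n++ } END { print n + 0 }' "$1"
}

# Compare the two counts; returns non-zero if they differ.
# $1 = centroids FASTA file, $2 = UC file
check_cluster_output() {
    fasta_count=$(count_fasta_records "$1")
    uc_count=$(count_uc_centroids "$2")
    echo "$1: ${fasta_count} sequences; $2: ${uc_count} centroid records"
    [ "$fasta_count" -eq "$uc_count" ]
}
```

With the file names from the example jobscript, the check would be run as: check_cluster_output centroids.fasta results.uc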
Parallel batch job submission
Create a batch submission script that loads the modulefile and requests the number of cores to use, for example:
#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 6     # Use 6 cores in smp.pe (2-32 permitted)

### We now load the modulefile in the jobscript, for example:
module load apps/binapps/vsearch/2.10.4

### $NSLOTS is automatically set to number of cores requested above
vsearch --threads $NSLOTS -cluster_fast combined.fasta -id 0.97 -uc results.uc -centroids centroids.fasta
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
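The parallel jobscript depends on the batch system setting $NSLOTS. If the same script is ever run where that variable is unset (for example in a test outside the batch system), --threads would be left without a value. A minimal sketch of a guard follows; resolve_threads is a hypothetical helper name, not something provided by VSEARCH or the CSF, and the fallback of one thread is an assumption chosen so the job never uses more cores than it requested.

```shell
#!/bin/bash
# Hypothetical helper: choose a thread count for "vsearch --threads".
# Prints $NSLOTS when it is set to a positive integer, otherwise prints 1.
resolve_threads() {
    case "${NSLOTS:-}" in
        ''|*[!0-9]*|0) echo 1 ;;      # unset, empty, non-numeric or zero
        *) echo "$NSLOTS" ;;          # value supplied by the batch system
    esac
}
```

In the jobscript, the vsearch line would then use --threads "$(resolve_threads)" in place of --threads $NSLOTS.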
Further info
- VSEARCH Wiki (includes example pipelines)
- VSEARCH Web Forum
- VSEARCH PubMed entry
Updates
None.