Diamond
Overview
Diamond is a sequence aligner for protein and translated DNA searches, designed for high-performance analysis of big sequence data. It performs pairwise alignment of proteins and translated DNA at 500x-20,000x the speed of BLAST, offers frameshift alignments for long-read analysis, and supports various output formats, including BLAST pairwise, tabular and XML, as well as taxonomic classification.
Version 0.9.26 (binary) is installed on the CSF.
Restrictions on use
There are no restrictions on accessing the software on the CSF. The software is released under the GNU GPL v3 license and all usage should adhere to that license.
Set up procedure
We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.
Load the following modulefile:
module load apps/binapps/diamond/0.9.26
Running the application
Please do not run Diamond on the login node. Jobs should be submitted to the compute nodes via batch.
You may run the following command to obtain help text (diamond accepts a large number of flags):
diamond help
You can also read the manual using:
evince $DIAMONDDIR/diamond_manual.pdf
Serial batch job submission
Create a batch submission script (which will load the modulefile in the jobscript), for example:
#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
                    # NO -V line - we load modulefiles in the jobscript

module load apps/binapps/diamond/0.9.26

# In the commands below $NSLOTS is set to the number of cores to use.
# For serial jobscripts this is 1. You must inform diamond to use 1 core.

# To set up a reference database for diamond
diamond makedb --in nr.faa -d nr -p $NSLOTS

# The alignment task may then be initiated using the blastx command like this:
diamond blastx -d nr -q reads.fna -o matches.m8 -p $NSLOTS
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
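Once the job has finished, the matches file can be inspected with standard tools. The following is a minimal sketch, assuming diamond wrote its default BLAST tabular output, in which column 11 is the e-value and column 12 the bit score. The function name filter_hits and the default threshold of 1e-5 are hypothetical choices for illustration, not part of diamond.

```shell
# filter_hits FILE [MAX_EVALUE]
# Hypothetical helper: print query, subject, e-value and bit score for every
# hit whose e-value is at or below MAX_EVALUE (default 1e-5, an arbitrary choice).
# Assumes tab-separated BLAST tabular columns:
#   qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
filter_hits() {
    local file="$1"
    local max_evalue="${2:-1e-5}"
    awk -F'\t' -v max="$max_evalue" \
        '($11 + 0) <= (max + 0) { print $1 "\t" $2 "\t" $11 "\t" $12 }' "$file"
}
```

For example, filter_hits matches.m8 1e-10 > strong_hits.tsv would keep only the stronger matches. For large result files, run this inside a batch job rather than on the login node.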
Parallel batch job submission
To convert the serial jobscript above into a parallel jobscript, add the line:
#$ -pe smp.pe 8     # Number of cores (can be 2-32).
You should use the -p $NSLOTS flag on the diamond command line as shown above.
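Putting the pieces together, a complete parallel jobscript might look like the sketch below. It reuses the modulefile, input files and diamond commands from the serial example and adds only the parallel environment line. The filename diamond_parallel.sh is a hypothetical choice, and 8 cores is just one value in the permitted range.

```shell
# Write a parallel jobscript to a file (the filename is a hypothetical example).
# The quoted 'EOF' stops the shell expanding $NSLOTS while the file is written.
cat > diamond_parallel.sh <<'EOF'
#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 8     # Number of cores (can be 2-32)

module load apps/binapps/diamond/0.9.26

# $NSLOTS is set to the number of cores requested above
diamond makedb --in nr.faa -d nr -p $NSLOTS
diamond blastx -d nr -q reads.fna -o matches.m8 -p $NSLOTS
EOF

# The jobscript can then be submitted with: qsub diamond_parallel.sh
```

Because the thread count is taken from $NSLOTS, changing the number on the -pe line is the only edit needed to run on a different number of cores.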
Further info
Updates
None.