The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.
MEGAHIT
Overview
MEGAHIT is an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph.
Version 1.1.3 is installed on the CSF.
Restrictions on use
There are no restrictions on accessing the software on the CSF. It is released under the GNU GPLv3 license and all usage must adhere to this license.
Please cite your usage of MEGAHIT using the citation instructions.
Set up procedure
To access the software you must first load the modulefile:
# CPU-only executables
module load apps/gcc/megahit/1.1.3

# CPU executables and GPU version of megahit_sdbg_build named megahit_sdbg_build_gpu.
# This will automatically load the CUDA modulefile.
module load apps/gcc/megahit/1.1.3-gpu
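Before submitting a job it can be useful to confirm that the modulefile has actually placed megahit on your PATH. The following is a minimal sketch of such a check; `have_cmd` is a hypothetical helper name introduced here for illustration and is not provided by the CSF or by MEGAHIT.

```shell
#!/bin/bash
# have_cmd is a hypothetical helper (an assumption for this sketch):
# it succeeds if the named command can be found by the shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd megahit; then
    echo "megahit found at: $(command -v megahit)"
else
    # Typically means the modulefile has not been loaded in this shell
    echo "megahit not found; load the modulefile first" >&2
fi
```

If the check reports that megahit is not found, load one of the modulefiles listed above and run the check again.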
Running the application
Please do not run megahit on the login node. Jobs should be submitted to the compute nodes via batch.
Serial batch job submission
Make sure you have the modulefile loaded then create a batch submission script, for example:
#!/bin/bash
#$ -cwd   # Job will run from the current directory
#$ -V     # Job will inherit current environment settings

# $NSLOTS is automatically set to the number of cores (1 for a serial job)
megahit -t $NSLOTS args... -o out_dir
#
# If an output directory name is not given megahit will
# use 'megahit_out'. You should use a unique name if
# running more than one job in the same directory and you
# can use $JOB_ID to use the current job's unique number.
# For example: my_output_dir.$JOB_ID
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
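The advice about using $JOB_ID for a unique output directory can be sketched as follows. This is an illustration under stated assumptions, not a required pattern: `unique_out_dir` is a hypothetical helper, the `nojob` fallback is an arbitrary choice for when the snippet is run outside the batch system, and the megahit command is only printed, with `args...` left as the same placeholder used in the jobscript.

```shell
#!/bin/bash
# unique_out_dir is a hypothetical helper (an assumption for this sketch).
# It appends the batch system's $JOB_ID to a prefix, or "nojob" if
# $JOB_ID is not set (for example when testing outside a batch job).
unique_out_dir() {
    local prefix="$1"
    echo "${prefix}.${JOB_ID:-nojob}"
}

out_dir=$(unique_out_dir my_output_dir)

# Guard against reusing a directory left over from an earlier run
if [ -e "$out_dir" ]; then
    echo "Output directory $out_dir already exists; choose another name" >&2
else
    # Print the command instead of running it; args... is a placeholder
    echo "Would run: megahit -t ${NSLOTS:-1} args... -o $out_dir"
fi
```

Inside a jobscript you would replace the final echo with the megahit command itself.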
Parallel batch job submission
Make sure you have the modulefile loaded then create a batch submission script, for example:
#!/bin/bash
#$ -cwd           # Job will run from the current directory
#$ -V             # Job will inherit current environment settings
#$ -pe smp.pe 8   # Number of cores (can be 2--24).

# $NSLOTS is automatically set to the number of cores given above
megahit -t $NSLOTS args... -o out_dir
#
# If an output directory name is not given megahit will
# use 'megahit_out'. You should use a unique name if
# running more than one job in the same directory and you
# can use $JOB_ID to use the current job's unique number.
# For example: my_output_dir.$JOB_ID
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
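If you choose the core count from a wrapper script, a small check can catch values outside the 2 to 24 range stated in the jobscript before the job is submitted. The sketch below is illustrative only: `valid_smp_cores` is a hypothetical helper, and the qsub command is printed rather than run.

```shell
#!/bin/bash
# valid_smp_cores is a hypothetical helper (an assumption for this sketch).
# It succeeds only for whole numbers in the range 2 to 24, the range
# given for smp.pe in the jobscript comments.
valid_smp_cores() {
    local n="$1"
    # Reject empty input and anything containing a non-digit
    case "$n" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$n" -ge 2 ] && [ "$n" -le 24 ]
}

cores=8
if valid_smp_cores "$cores"; then
    # Print the submission command instead of running it
    echo "qsub -pe smp.pe $cores scriptname"
else
    echo "smp.pe accepts 2 to 24 cores, got: $cores" >&2
fi
```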
Further info
Updates
None.