SAMTools
Overview
SAMTools provides various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments.
Restrictions on use
There are no restrictions on accessing this software on the CSF. It is provided under the MIT/Expat license, and all usage of the application must adhere to that license.
Set up procedure
We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.
Load one of the following modulefiles:
# These versions are available on the CSF3 (Slurm) system
module load apps/gcc/samtools/1.21
module load apps/gcc/samtools/1.13
module load apps/gcc/samtools/1.11
module load apps/gcc/samtools/1.9

# These versions are available on the CSF (SGE) system (new versions will not be installed)
module load apps/gcc/samtools/1.13
module load apps/gcc/samtools/1.11
module load apps/gcc/samtools/1.9
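Because the aim of loading modulefiles in the jobscript is to keep a record of how the job was run, it can help to confirm in the job output which version was actually picked up. The following is a minimal sketch, not part of the CSF documentation: the helper name samtools_version_from is hypothetical, and it assumes that the first line of samtools --version has the form "samtools 1.21".

```shell
#!/bin/bash
# Hypothetical helper: extract the version number from the first line of
# `samtools --version` output, assumed to look like "samtools 1.21".
samtools_version_from() {
    awk 'NR == 1 { print $2 }'
}

expected="1.21"   # the version named in the module load line

if command -v samtools >/dev/null 2>&1; then
    found="$(samtools --version | samtools_version_from)"
    if [ "$found" = "$expected" ]; then
        echo "samtools $found loaded as expected"
    else
        echo "Warning: expected samtools $expected but found $found" >&2
    fi
else
    echo "samtools not found - has the modulefile been loaded?" >&2
fi
```

Placed directly after the module load line in a jobscript, this writes the version check to the job's output file.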
Running the application
Please do not run SAMTools on the login node. Jobs should be submitted to the compute nodes via batch.
Serial batch job submission
Create a batch submission script that loads the modulefile in the jobscript. For example, on the CSF (SGE) system:
#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
                    # NO -V line - we load modulefiles in the jobscript

module load apps/gcc/samtools/1.13

samtools command [options]
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
On the CSF3 (Slurm) system, use a jobscript such as:
#!/bin/bash --login
#SBATCH -p serial   # 1-core job
#SBATCH -t 4-0      # 4-day max wallclock

# Start with a clean environment in Slurm
module purge
module load apps/gcc/samtools/1.21

samtools command [options]
Submit the jobscript using:
sbatch scriptname
where scriptname is the name of your jobscript.
Parallel batch job submission
Not all of the SAMTools tools can use multiple cores. Check the options of each tool by running the help command on the login node:
samtools help
samtools help merge    # Or one of the other tools
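To check several tools at once, the help output can be scanned for the threads option. The following is a minimal sketch under stated assumptions: the helper name has_threads_option is hypothetical, the tool names in the loop are only examples, and it assumes that a tool's help text lists the option literally as --threads.

```shell
#!/bin/bash
# Hypothetical helper: reads help text on stdin and succeeds if it
# mentions a --threads option.
has_threads_option() {
    grep -q -e '--threads'
}

if command -v samtools >/dev/null 2>&1; then
    # The tool names below are examples; substitute the tools you plan to use.
    for tool in sort merge index; do
        # Help text may go to stdout or stderr, so capture both.
        if samtools help "$tool" 2>&1 | has_threads_option; then
            echo "$tool: accepts --threads"
        else
            echo "$tool: no --threads option found"
        fi
    done
else
    echo "samtools not found - has the modulefile been loaded?" >&2
fi
```

A tool reported as having no --threads option should be run as a serial job.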
The merge tool can use multiple cores, so we use it as an example. On the CSF (SGE) system:
#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8     # Run in parallel with 8 threads (max is 32 in smp.pe)

module load apps/gcc/samtools/1.13

# Use --threads $NSLOTS to tell samtools to use the number of threads requested above
samtools merge --threads $NSLOTS args...
Submit the jobscript using qsub jobscript
On the CSF3 (Slurm) system, the equivalent jobscript is:
#!/bin/bash --login
#SBATCH -p multicore   # AMD 168-core nodes
#SBATCH -n 8           # Run in parallel with 8 threads (max is 168 in multicore)
#SBATCH -t 2-0         # 2-day wallclock (7-0 is max permitted)

# Start with a clean environment in Slurm
module purge
module load apps/gcc/samtools/1.21

# Use --threads $SLURM_NTASKS to tell samtools to use the number of threads requested above
samtools merge --threads $SLURM_NTASKS args...
Submit the jobscript using sbatch jobscript
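The two schedulers expose the requested core count through different variables ($NSLOTS under SGE, $SLURM_NTASKS under Slurm), so the same command line can be made to work on both systems by picking whichever is set. The following is a minimal sketch, not an official CSF recipe: the helper name samtools_threads and the file names merged.bam, a.bam and b.bam are hypothetical placeholders.

```shell
#!/bin/bash
# Hypothetical helper: pick the thread count from whichever scheduler is in use.
# SLURM_NTASKS is set by Slurm (#SBATCH -n), NSLOTS by SGE (#$ -pe smp.pe).
# Falls back to 1 when neither is set (e.g. a serial job).
samtools_threads() {
    echo "${SLURM_NTASKS:-${NSLOTS:-1}}"
}

# Build the command as an array so it can be inspected before it is run.
# merged.bam, a.bam and b.bam are placeholder file names.
cmd=(samtools merge --threads "$(samtools_threads)" merged.bam a.bam b.bam)
echo "Would run: ${cmd[*]}"

# Uncomment inside a jobscript, after loading the modulefile, to run it:
# "${cmd[@]}"
```

Printing the command first leaves a record in the job output of the thread count that was actually used.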
Further info
Updates
None.