MaxBin

Overview

MaxBin2 is a program for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm.

Version 2.2.5 is installed on the CSF. Note that the application incorrectly reports itself as version 2.2.4. This may be corrected in a future release, but the installed version is definitely 2.2.5.

It was compiled with the default GCC 4.8.5 compiler.

The auxiliary applications provided by MaxBin (IDBA-UD v1.1.1, HMMER3 v3.1b1, Bowtie2 v2.2.3, FragGeneScan v1.30) have been installed. While some of these already exist on the CSF (available via other modulefiles), the versions used by MaxBin are installed within the MaxBin installation and no further modulefiles are required. Depending on which modulefile you load, you may be able to run these applications manually in your jobs, outside of MaxBin (see below for the modulefiles). MaxBin itself will always be able to run these applications.

Restrictions on use

There are no restrictions on accessing the software on the CSF. It is released under the BSD license and any use should fall within the restrictions of that license. Please see the $MAXBINDIR/License text file on the CSF.

Set up procedure

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

Load one of the following modulefiles:

module load apps/gcc/maxbin/2.2.5        # You can run the external apps (IDBA etc) manually

module load apps/gcc/maxbin/2.2.5-noext  # You CANNOT run the external apps (IDBA etc) manually
                                         # But MaxBin can still run them.
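
To confirm which of the bundled tools your chosen modulefile exposes, a small check can be run after the module load line. The following is a minimal sketch and not part of the MaxBin installation: check_tools is a hypothetical helper, and the executable names in the usage comment are assumptions about how the auxiliary applications are named.

```shell
# Report which of the named commands are visible on the current PATH.
# Returns success only if every command was found.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example (executable names are assumptions, not taken from this page):
#   check_tools bowtie2 hmmsearch idba_ud
```

With the -noext modulefile loaded, the tools would be expected to show as missing from your PATH even though MaxBin can still run them internally.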

Running the application

Please do not run MaxBin on the login node. Jobs should be submitted to the compute nodes via batch.

You may, however, run the command without any arguments on the login node to display the help text:

run_MaxBin.pl

This will display various flags/options:

Usage:
  run_MaxBin.pl
    -contig (contig file)
    -out (output file)
  ...

Serial batch job submission

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
                    # NO -V line - we load modulefiles in the jobscript

module load apps/gcc/maxbin/2.2.5

# $NSLOTS is automatically set to 1 in a serial job
run_MaxBin.pl -thread $NSLOTS -contig contig_file -out output_file other args...

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
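
The jobscript relies on $NSLOTS being set by the batch system. As a defensive measure, the thread count can fall back to 1 if the variable is ever unset. This is a minimal sketch: thread_count is a hypothetical helper name, not something provided by MaxBin or the CSF.

```shell
# Print the core count set by the batch system, or 1 if $NSLOTS is not set.
thread_count() {
    echo "${NSLOTS:-1}"
}

# Example use inside a jobscript (shown as a comment, not executed here):
#   run_MaxBin.pl -thread "$(thread_count)" -contig contig_file -out output_file other args...
```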

Parallel batch job submission

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 8     # Number of cores (2-32 permitted)

module load apps/gcc/maxbin/2.2.5

# $NSLOTS is automatically set to the number of cores requested above
run_MaxBin.pl -thread $NSLOTS -contig contig_file -out output_file other args...

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
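
Once a job has finished, a quick sanity check is to count the sequences in each bin that was written. The sketch below assumes the bins are FASTA files named after the -out value with a numeric suffix (for example output_file.001.fasta). That naming pattern is an assumption and should be checked against your own output directory. count_bin_contigs is a hypothetical helper.

```shell
# Print "<file> <number of sequences>" for every FASTA file matching
# <prefix>.*.fasta (the naming pattern is an assumption about the output).
count_bin_contigs() {
    prefix=$1
    for f in "$prefix".*.fasta; do
        [ -e "$f" ] || continue                  # glob matched nothing
        n=$(grep -c '^>' "$f" || true)           # FASTA headers start with '>'
        echo "$f $n"
    done
}

# Example (shown as a comment, not executed here):
#   count_bin_contigs output_file
```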

Further info

Updates

None.

Last modified on June 28, 2019 at 12:21 pm by George Leaver