RAxML

Overview

RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees. It can also be used for post-analysis of sets of phylogenetic trees, analysis of alignments, and evolutionary placement of short reads.

Version 8.2.12 is installed on the CSF.

Restrictions on use

There are no restrictions on accessing the software on the CSF. It is released under the GNU GPL v3 license and all usage should adhere to that license.

Please ensure you cite your usage of this software using the reference given on the RAxML website:

A. Stamatakis: “RAxML Version 8: A tool for Phylogenetic Analysis and Post-Analysis of Large Phylogenies”. Bioinformatics, 2014 (open access).

Set up procedure

Several RAxML executables exist, each optimized for a different Intel processor architecture. Either the modulefile can detect the type of Intel processor your job is running on and select the appropriate executable, or you can specify the executable manually. See below for details of each method (we recommend automatically detecting the architecture in use by your job).

Automatically match the executable to the architecture (recommended)

The following modulefile can only be loaded from within your jobscript – it will not load on the login node. When the job runs, the architecture of the node will be detected and environment variables will be set giving the correct names of the RAxML executables for that node (e.g., the AVX2 executables will be used if the job is running on Skylake, Broadwell or Haswell nodes):

# The modulefile will determine the names of the executables to match the CSF node architecture
# You must load the modulefile in the jobscript, not on the login node

module load apps/gcc/raxml/8.2.12-detectcpu           # Serial, multi-core (pthreads) and
                                                      # multi-node (MPI) jobs

The modulefile sets the following environment variables, which you use to run RAxML. Each variable contains the correct executable name for the type of node on which your job has landed:

$RAXML_EXE                   # Runs the serial executable
$RAXML_PTHREAD_EXE           # Runs the parallel (single-node) executable
$RAXML_MPI_EXE               # Runs the parallel (single and multi-node) executable
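
The -detectcpu modulefile also sets $RAXML_ARCH to the name of the detected instruction set (SSE3, AVX or AVX2); this is used in the echo statements in the examples below. As a quick sanity check, you could add something like the following to a jobscript after loading the modulefile:

# Report which executables the modulefile selected for this compute node
echo "Detected architecture: $RAXML_ARCH"
echo "Serial executable:     $RAXML_EXE"
echo "Pthreads executable:   $RAXML_PTHREAD_EXE"
echo "MPI executable:        $RAXML_MPI_EXE"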

It is up to you whether you run the serial, pthreads parallel or MPI parallel version; the environment variables give the correct architecture in each case. For example, you can use the following jobscript to run the multi-core (pthreads) version optimized for AVX2 architectures:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8
#$ -l broadwell         # Force use of a Broadwell node for AVX2 optimization

# Load the modulefile in the jobscript (hence the --login on the first line)
module load apps/gcc/raxml/8.2.12-detectcpu

# The env var will automatically have AVX2 in the executable name.
# We must inform the pthreads executable how many threads it can use.
# $NSLOTS is automatically set to the number on the -pe line above.
$RAXML_PTHREAD_EXE -T $NSLOTS arg1 arg2 ....

Similarly, you can use this method when you don’t mind which type of compute node your job lands on. You may or may not get the fastest architecture, but the CSF has a much larger pool of nodes on which to run your job, so you will likely spend less time in the queue than if you had requested a specific architecture:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8          # Can run on any Intel node (architecture) in the CSF

# Load the modulefile in the jobscript (hence the --login on the first line)
module load apps/gcc/raxml/8.2.12-detectcpu

# The env var will always have the correct executable name for the architecture
# We must inform the pthreads executable how many threads it can use.
# $NSLOTS is automatically set to the number on the -pe line above.
$RAXML_PTHREAD_EXE -T $NSLOTS arg1 arg2 ....

Manually match the executable to the architecture

If you wish to manually specify the compute node architecture AND the corresponding executable name, the following modulefile can be used. It can be loaded on the login node (or in the jobscript). We recommend the more automatic jobscripts above if you are unsure which executable will run on which architecture.

# You will have to determine the names of the executables to match the CSF node architecture
# This can be loaded on the login node or in your jobscript

module load apps/gcc/raxml/8.2.12                    # Serial, multi-core (pthreads) and
                                                     # multi-node (MPI) jobs

The following versions are then available (fastest architecture listed first); you must ensure your job lands on a compute node capable of running the executable you use in your jobscript:

Architecture   Executables                                        CSF Nodes
------------   ------------------------------------------------   --------------------------------------
AVX2           raxmlHPC-AVX2 (serial)                             Skylake, Broadwell, Haswell
               raxmlHPC-PTHREADS-AVX2 (multi-core)
               raxmlHPC-MPI-AVX2 (multi-core, also multi-node)
AVX            raxmlHPC-AVX (serial)                              Skylake, Broadwell, Haswell, Ivybridge
               raxmlHPC-PTHREADS-AVX (multi-core)
               raxmlHPC-MPI-AVX (multi-core, also multi-node)
SSE3           raxmlHPC-SSE3 (serial)                             Skylake, Broadwell, Haswell, Ivybridge
               raxmlHPC-PTHREADS-SSE3 (multi-core)
               raxmlHPC-MPI-SSE3 (multi-core, also multi-node)
Any            raxmlHPC (serial)                                  Skylake, Broadwell, Haswell, Ivybridge
               raxmlHPC-PTHREADS (multi-core)
               raxmlHPC-MPI (multi-core, also multi-node)

To select a specific CSF compute node architecture on which your job will be placed, add the following flag to the jobscript:

#$ -l architecture
   #  #
   #  # Replace 'architecture' with one of: skylake, broadwell, haswell, ivybridge
   #  # (only one architecture flag can be specified)
   #
   # Lowercase letter L (not the number one!)

Use the table above to match executable names to architectures, then ensure your job lands on the required CSF node by using jobscript flags (either by forcing an architecture or as a consequence of the number of cores requested).
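
For example, a minimal jobscript using this manual method might look like the following (a sketch: here we force a Haswell node and name the matching AVX2 pthreads executable from the table above; the RAxML arguments are placeholders):

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8
#$ -l haswell       # AVX2-capable node, matching the executable named below

# The manual modulefile - can be loaded on the login node or in the jobscript
module load apps/gcc/raxml/8.2.12

# Executable name chosen manually to match the architecture requested above
raxmlHPC-PTHREADS-AVX2 -T $NSLOTS -s sequenceFileName -n outputFileName -m substitutionModel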

Running the application

The following examples all use the -detectcpu modulefile to automatically select the executable name for the compute node on which the job has landed. Hence we load the modulefile in the jobscript (not on the login node). We show an example of each type of job (serial, parallel multi-core and parallel multi-node).

Please do not run RAxML on the login node. Jobs should be submitted to the compute nodes via the batch system.

Serial batch job submission (detectcpu)

The following examples use the detectcpu modulefile to automatically detect the CSF compute node architecture on which the job is running. Hence the modulefile must be loaded in the jobscript (if it were loaded on the login node it would detect the architecture of the login node which may differ from the compute node where the job actually runs).

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
                    # NOTE: NO '-V' because we load the modulefile in the jobscript
                    #       NO '-pe' because we are a serial (1-core) job
                    #       The --login flag (on first line) is needed to load the modulefile

# The same modulefile name is used for all versions on CSF3
module load apps/gcc/raxml/8.2.12-detectcpu

# Let's report the architecture
echo "RAxML will use the $RAXML_ARCH executable"

# Run the serial version with the correct architecture automatically detected
$RAXML_EXE -s sequenceFileName -n outputFileName -m substitutionModel

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
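
The -s, -n and -m arguments above are placeholders. As an illustration (the filename, run name and seed are hypothetical), a single maximum-likelihood search under the GTR+GAMMA model might look like:

# -s input alignment, -n suffix for the RAxML output filenames,
# -m substitution model, -p random-number seed for the parsimony starting tree
$RAXML_EXE -s alignment.phy -n mlsearch -m GTRGAMMA -p 12345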

Parallel single-node batch job submission

As with the serial example, the detectcpu modulefile must be loaded in the jobscript so that the architecture of the compute node, not the login node, is detected.

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 8     # Can be 2-24. Jobs of up to 12 cores can run anywhere on the CSF.
#$ -l haswell       # (optional) In this example we force the job on to a haswell node to use AVX2 optimization.
                    # NOTE: NO '-V' because we load the modulefile in the jobscript
                    #       The --login flag (on first line) is needed to load the modulefile

# The same modulefile name is used for all versions on CSF3
module load apps/gcc/raxml/8.2.12-detectcpu

# Report the architecture (SSE3, AVX or AVX2 being used)
echo "RAxML will use the $RAXML_ARCH executable"

# ----------------------------------------
# Use either the PTHREADS parallel version
# ----------------------------------------

# $NSLOTS is automatically set to the number given on the -pe line above.
$RAXML_PTHREAD_EXE -T $NSLOTS -s sequenceFileName -n outputFileName -m substitutionModel

# ---------------------------
# OR the MPI parallel version
# ---------------------------

# $NSLOTS is automatically set to the number given on the -pe line above.
mpirun -n $NSLOTS $RAXML_MPI_EXE -s sequenceFileName -n outputFileName -m substitutionModel

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
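
Again, the arguments shown are placeholders. As one illustrative possibility (filenames and seeds are hypothetical), RAxML's rapid bootstrapping combined with an ML search (its -f a mode) could be run with the pthreads executable as:

# -f a: rapid bootstrap analysis plus search for the best-scoring ML tree
# -x: rapid bootstrap seed, -p: parsimony seed, -N: number of bootstrap replicates
$RAXML_PTHREAD_EXE -T $NSLOTS -f a -x 12345 -p 12345 -N 100 -s alignment.phy -n boot100 -m GTRGAMMA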

Parallel multi-node batch job submission

As before, the detectcpu modulefile must be loaded in the jobscript so that the architecture of the compute node, not the login node, is detected.

#!/bin/bash --login
#$ -cwd                     # Job will run from the current directory
#$ -pe mpi-24-ib.pe 48      # Must be 48 or more in multiples of 24.
                            # NOTE: NO '-V' because we load the modulefile in the jobscript
                            #       The --login flag (on first line) is needed to load the modulefile

# The same modulefile name is used for all versions on CSF3
module load apps/gcc/raxml/8.2.12-detectcpu

# Report the architecture (SSE3, AVX or AVX2 being used)
echo "RAxML will use the $RAXML_ARCH executable"

# ------------------------
# The MPI parallel version
# ------------------------

# $NSLOTS is automatically set to the number given on the -pe line above.
mpirun -n $NSLOTS $RAXML_MPI_EXE -s sequenceFileName -n outputFileName -m substitutionModel

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
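
Note that the MPI executable provides coarse-grained parallelism: it distributes independent tree searches or bootstrap replicates across the MPI processes, so it is only worthwhile for analyses that perform many runs. A hypothetical example (filenames and seed are placeholders) running 100 independent ML searches:

# -N 100 requests 100 independent searches, farmed out across the MPI processes
mpirun -n $NSLOTS $RAXML_MPI_EXE -N 100 -p 12345 -s alignment.phy -n multisearch -m GTRGAMMA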

Further info

Updates

None.
