FastTree

Overview

FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. It can handle alignments with up to a million sequences in a reasonable amount of time and memory.

Version 2.1.11 is installed on the CSF.

Restrictions on use

There are no restrictions on accessing the software on the CSF. It is released under the GNU GPL v2 license and all usage must adhere to that license.

Set up procedure

We recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. Load the modulefile like this:

module load apps/intel-18.0/fasttree/2.1.11          # Serial and multi-core versions
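
After loading the modulefile, a quick check that the executables can be found may catch a missing or mistyped module before any real work starts. The following is a minimal sketch, not part of the CSF setup: the helper name require_cmd is hypothetical, and it assumes only that the modulefile places FastTree and FastTreeMP on your PATH.

```bash
# Hypothetical helper: succeed only if the named command can be found on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Assumption: the modulefile puts both executables on PATH.
for exe in FastTree FastTreeMP; do
    if require_cmd "$exe"; then
        echo "$exe found at: $(command -v "$exe")"
    else
        echo "$exe not found - has the modulefile been loaded?" >&2
    fi
done
```

In a jobscript, the same helper could be used as `require_cmd FastTree || exit 1` immediately after the module load line.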

Running the application

Please do not run FastTree on the login node. Jobs should be submitted to the compute nodes via batch.

Serial batch job submission

Create a batch submission script that loads the modulefile, for example:

#!/bin/bash --login
#SBATCH -p serial   # Partition is required. Runs on Intel hardware
#SBATCH -t 1-0      # Wallclock limit (days-hours). Required!
                    # Max permitted is 7 days (7-0).

# Purge your environment and load the FastTree module
module purge
module load apps/intel-18.0/fasttree/2.1.11

# Serial version of the app (will use 1 CPU-core)

# Run ONE of the following commands; comment out the other, since both write to tree_file.

# To infer a tree for a protein alignment with the JTT+CAT model, use
FastTree alignment_file > tree_file

# To infer a tree for a nucleotide alignment with the GTR+CAT model, use
FastTree -gtr -nt alignment_file > tree_file

Submit the jobscript using:

sbatch scriptname

where scriptname is the name of your jobscript.
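
Because the jobscript redirects FastTree's output into tree_file, a job that hits its wallclock limit can leave an empty or truncated file behind. FastTree writes trees in Newick format, and a Newick tree ends with a semicolon, so a simple completeness check is possible. The sketch below is illustrative only: the function name looks_like_newick is hypothetical, and the demonstration uses a tiny hand-written tree rather than real FastTree output.

```bash
# Hypothetical helper: succeed if the file is non-empty and its last
# non-blank character is ';', which terminates a Newick tree.
looks_like_newick() {
    local file="$1"
    [ -s "$file" ] || return 1
    local last
    last="$(tr -d '[:space:]' < "$file" | tail -c 1)"
    [ "$last" = ";" ]
}

# Demonstration on a tiny hand-written tree (not FastTree output).
demo="$(mktemp)"
printf '(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);\n' > "$demo"
if looks_like_newick "$demo"; then
    echo "tree file looks complete"
else
    echo "tree file is empty or truncated" >&2
fi
rm -f "$demo"
```

In a jobscript, the function could be called as `looks_like_newick tree_file || exit 1` after the FastTree line. It checks only that the file is terminated, not that the tree is biologically sensible.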

Parallel batch job submission

FastTree can use multiple CPU cores on a single compute node. The multi-core executable is called FastTreeMP. Note that the FastTree website contains the following information about the multi-core version:

  • As of version 2.1, FastTreeMP will not give exactly the same results as FastTree because the top-hits heuristics become non-deterministic (depending on which seed is reached first) and because the star topology test is turned off. However, in practice, the results are of the same quality.

Create a batch submission script that loads the modulefile, for example:

#!/bin/bash --login
#SBATCH -p multicore  # Partition is required. Runs on AMD Genoa hardware
#SBATCH -n 4          # Number of cores, can be 2-168
#SBATCH -t 1-0        # Wallclock limit (days-hours). Required!
                      # Max permitted is 7 days (7-0).

# Purge your environment and load the FastTree module
module purge
module load apps/intel-18.0/fasttree/2.1.11

# Inform FastTreeMP how many cores to use. $SLURM_NTASKS is set to the number of cores above.
export OMP_NUM_THREADS=$SLURM_NTASKS

# Run ONE of the following commands; comment out the other, since both write to tree_file.

# To infer a tree for a protein alignment with the JTT+CAT model, use
FastTreeMP alignment_file > tree_file

# To infer a tree for a nucleotide alignment with the GTR+CAT model, use
FastTreeMP -gtr -nt alignment_file > tree_file

Submit the jobscript using:

sbatch scriptname

where scriptname is the name of your jobscript.
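
The parallel jobscript hard-codes FastTreeMP and relies on $SLURM_NTASKS being set by the batch system. If you would rather keep one script for both serial and multi-core runs, one option is to derive the thread count and the executable name from the number of cores. This is a sketch under stated assumptions: the function name pick_fasttree and the variables NCORES and FASTTREE_BIN are illustrative and are not part of FastTree or the CSF configuration.

```bash
# Hypothetical helper: print the executable name suited to a given core count.
# With no argument, or with 1, the serial program is chosen.
pick_fasttree() {
    local ncores="${1:-1}"
    if [ "$ncores" -gt 1 ]; then
        echo FastTreeMP
    else
        echo FastTree
    fi
}

# Use the core count Slurm granted, falling back to 1 if the variable is unset.
NCORES="${SLURM_NTASKS:-1}"
export OMP_NUM_THREADS="$NCORES"
FASTTREE_BIN="$(pick_fasttree "$NCORES")"

echo "Selected $FASTTREE_BIN with OMP_NUM_THREADS=$OMP_NUM_THREADS"

# The tree would then be inferred with, for example:
#   "$FASTTREE_BIN" -gtr -nt alignment_file > tree_file
```

The partition line in the jobscript still has to match the core count, since the serial and multicore partitions are requested separately.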

Further info

Last modified on June 5, 2025 at 3:32 pm by Paraskevas Mitsides