AlphaFold

Overview

AlphaFold is an application for predicting models of protein structures.

AlphaFold requires a suite of supporting tools, so AlphaFold and all of its supporting tools are provided within a Singularity container.

Restrictions on use

Although the AlphaFold code is licensed under the Apache License, Version 2.0 (the “License”) and is therefore considered open source, AlphaFold uses various genetic databases, model parameters, and third-party software, each with its own license.

Users wishing to access AlphaFold should review the various licenses and agree, in a support request submitted using the Connect Portal HPC form, that they will abide by their terms and conditions.

For further information and to view all associated licenses, see the AlphaFold License and Disclaimer.

Only users who have been added to the alphafold unix group can run the application.

Any publication that discloses findings arising from using this source code or the model parameters should cite the AlphaFold paper and, if applicable, the AlphaFold-Multimer paper.

Access to GPUs may be controlled. If you do not have access to GPUs and wish to run this software on GPUs, please let us know when you request access.

Set up procedure

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

Load one of the following modulefiles:

module load apps/singularity/alphafold/2.3.0
module load apps/singularity/alphafold/2.1.1  
module load apps/singularity/alphafold/2.0

Versions 2.1.1 and later support multimer modelling; however, the multimer system should not be considered as stable as the monomer AlphaFold system.

Running the application

Please do not run AlphaFold on the login node. Jobs should be submitted to the compute nodes as batch jobs.

Ideally, AlphaFold should be run in a parallel environment using 8 cores (some steps in the code are hardcoded to use 8 cores), with or without a single GPU (a GPU is used by default). AlphaFold cannot use more than one GPU.
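Because some steps are hardcoded to use 8 cores, a jobscript can check its allocation before starting. The following is a minimal sketch, assuming Slurm exports SLURM_NTASKS for jobs submitted with -n; the function name check_cores is hypothetical and not part of the installed software.

```shell
# Hypothetical helper: warn if the job was not given the 8 cores AlphaFold expects.
# Assumes Slurm sets SLURM_NTASKS for jobs submitted with -n.
check_cores() {
    want=8
    got="${SLURM_NTASKS:-unknown}"
    if [ "$got" = "$want" ]; then
        echo "OK: running with $got cores"
    else
        echo "WARNING: expected $want cores, got $got"
    fi
}
```

Calling check_cores near the top of a jobscript leaves a record of the core count in the job output.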

Genetic Databases

AlphaFold needs multiple genetic (sequence) databases to run. These have already been downloaded, and AlphaFold has been set up to access them by default. Please note that users need to be a member of the alphafold unix group to access them; see the Restrictions on use section.
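To confirm group membership before submitting a job, something like the following sketch may help. It relies only on the standard id command; the helper name in_group is hypothetical.

```shell
# Hypothetical helper: succeed if the current user belongs to the named unix group.
in_group() {
    id -nG | tr ' ' '\n' | grep -qx -- "$1"
}

# "alphafold" is the group name given in the text above.
if in_group alphafold; then
    echo "You are in the alphafold group"
else
    echo "Not in the alphafold group yet: request access first"
fi
```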

Parallel batch jobscript with GPU

Please note that access to GPUs may be controlled, depending on your group.

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#SBATCH -p gpuV  # Partition is required, choose based on type of GPU you want
#SBATCH -G 1     # use only 1 x v100 GPU
#SBATCH -n 8     # Job will run using 8 CPU cores (max 8 CPU cores per v100 GPU)
#SBATCH -t 2-0   # Wallclock limit (days-hours). Required!
                 # Max permitted is 7 days (7-0).

# Clear any loaded modules in pre-existing environment, then load the required version
module purge
module load apps/singularity/alphafold/2.1.1

run_alphafold.sh -f $PWD/filename.fasta -t YYYY-MM-DD -o $PWD/output_directory -m model_preset
# PLEASE NOTE: $PWD is required so the Singularity container is able to map to the CSF filesystem
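Since the container needs full paths, one way to avoid passing a relative path by accident is to normalise paths first. This is only a sketch: abs_path is a hypothetical helper, and the command is printed rather than executed.

```shell
# Hypothetical helper: turn a relative path into an absolute one.
abs_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;        # already absolute, leave unchanged
        *)  printf '%s\n' "$PWD/$1" ;;   # prefix the current directory
    esac
}

fasta=$(abs_path filename.fasta)
outdir=$(abs_path output_directory)

# Print the command that would be run (placeholders as in the jobscript above).
echo "run_alphafold.sh -f $fasta -t YYYY-MM-DD -o $outdir -m model_preset"
```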

Parallel batch jobscript without GPU

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#SBATCH -p multicore     # Partition is required. Runs on AMD Genoa hardware.
#SBATCH -n 8             # Job will run using 8 CPU cores
#SBATCH -t 2-0           # Wallclock limit (days-hours). Required!
                         # Max permitted is 7 days (7-0).

# Clear any loaded modules in pre-existing environment, then load the required version
module purge
module load apps/singularity/alphafold/2.1.1

run_alphafold.sh -f $PWD/filename.fasta -t YYYY-MM-DD -o $PWD/output_directory -m model_preset -g false
# PLEASE NOTE: $PWD is required so the Singularity container is able to map to the CSF filesystem
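The GPU and CPU-only jobscripts differ only in the partition request and the -g flag, so a single script could in principle derive the flag from the job environment. The sketch below assumes Slurm sets SLURM_GPUS only when the job was submitted with -G; this has not been verified on the CSF, and gpu_flag is a hypothetical helper.

```shell
# Hypothetical helper: print "true" if Slurm reports a GPU request, else "false".
# Assumes SLURM_GPUS is set only when the job was submitted with -G.
gpu_flag() {
    if [ -n "${SLURM_GPUS:-}" ] && [ "$SLURM_GPUS" != "0" ]; then
        echo true
    else
        echo false
    fi
}

# Show the value that would be passed to run_alphafold.sh.
echo "would pass: -g $(gpu_flag)"
```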

Required and Optional Parameters

Required Parameters:

-o <output_dir> Path to a directory that will store the results.
-m <model_preset> Choose preset model configuration - the monomer model (monomer), the monomer model with extra ensembling (monomer_casp14), monomer model with pTM head (monomer_ptm), or multimer model (multimer) (default: 'monomer')
-f <fasta_path> Path to a FASTA file containing one sequence
-t <max_template_date> Maximum template release date to consider (ISO-8601 format - i.e. YYYY-MM-DD). Important if folding historical test sets
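Two of the required parameters are easy to get wrong: the number of sequences in the FASTA file and the date format. The helpers below are a rough sketch for checking both before submitting; their names are hypothetical, and the date check covers the format only, not whether the date is a real calendar date.

```shell
# Hypothetical helpers for checking inputs before submitting a job.

# Count FASTA records by counting header lines (those starting with ">").
count_sequences() {
    grep -c '^>' "$1"
}

# Succeed only if the argument looks like YYYY-MM-DD (format check only).
is_iso_date() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
}
```

For example, `count_sequences filename.fasta` prints the number of records, and `is_iso_date 2021-11-01 && echo ok` prints ok.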

Optional Parameters:

-g <use_gpu> Enable NVIDIA runtime to run with GPUs (default: true)
-c <db_preset> Choose preset MSA database configuration - smaller genetic database config (reduced_dbs) or full genetic database config (full_dbs) (default: 'full_dbs')
-p <use_precomputed_msas> Whether to read MSAs that have been written to disk. WARNING: This will not check if the sequence, database or configuration have changed (default: 'false')
-l <is_prokaryote> Optional for the multimer system, not used by the single chain system. A boolean specifying true where the target complex is from a prokaryote, and false where it is not, or where the origin is unknown. This value determines the pairing method for the MSA (default: 'None')
-b <benchmark> Run multiple JAX model evaluations to obtain a timing that excludes the compilation time, which should be more indicative of the time required for inferencing many proteins (default: 'false')
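As an illustration of how the optional flags combine with the required ones, the sketch below assembles the optional part of the command line from environment variables. The variable names DB_PRESET, USE_GPU and PRECOMPUTED and the helper optional_args are made up for this example; the defaults mirror those listed above. The command is printed, not run.

```shell
# Hypothetical helper: build the optional part of the command line.
# Defaults mirror those documented above; override via environment variables
# (DB_PRESET, USE_GPU and PRECOMPUTED are names invented for this sketch).
optional_args() {
    echo "-c ${DB_PRESET:-full_dbs} -g ${USE_GPU:-true} -p ${PRECOMPUTED:-false}"
}

# Print (not run) a command using the smaller databases on a CPU-only node.
opts=$(DB_PRESET=reduced_dbs USE_GPU=false optional_args)
echo "run_alphafold.sh -f \$PWD/filename.fasta -t YYYY-MM-DD -o \$PWD/output_directory -m monomer $opts"
```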

Batch job submission

Submit the jobscript using:

sbatch scriptname

where scriptname is the name of your jobscript.
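After submission, sbatch normally prints a confirmation line of the form "Submitted batch job <id>". The sketch below extracts the job ID from that line so it can be passed to squeue; job_id_from is a hypothetical helper.

```shell
# Hypothetical helper: extract the job ID from sbatch's confirmation line,
# which normally reads "Submitted batch job <id>".
job_id_from() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ {print $4}'
}

# On the cluster you might use it like this (not run here):
#   out=$(sbatch scriptname)
#   squeue -j "$(job_id_from "$out")"
```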

Own singularity container image

If you wish to build your own Singularity image, you should copy the above files to your local PC and run Singularity there. For security reasons, only the sysadmins can build images on the CSF.

Further info

GitHub: deepmind/alphafold

Guide with input files

One of our AlphaFold users on CSF3 has very kindly provided a guide with example input files to help new users get started.

Last modified on June 5, 2025 at 4:23 pm by Paraskevas Mitsides