The CSF2 has been retired and replaced by the CSF3. Please use that system and read the CSF3 documentation instead, as this page may be out of date.

fastStructure

Overview

fastStructure is an algorithm for quickly inferring population structure from large SNP genotype data. It is based on a variational Bayesian framework for posterior inference and is written in Python 2.x.

Version 1.0 is installed on the CSF.

Restrictions on use

There are no restrictions on accessing this software on the CSF. It is released under the MIT license and all use must fall within that license.

Set up procedure

To access the software you must first load the modulefile:

module load apps/gcc/python-packages/anaconda-2.5.0/faststructure/1.0

This automatically loads the Anaconda 2.5.0 modulefile, which provides Python 2.7.11.

The following Python scripts are then available on your path:

chooseK.py
distruct.py
structure.py

Running the application

Please do not run fastStructure on the login node. Submit jobs to the compute nodes through the batch system instead.

Serial batch job submission

The jobscript below uses -V to inherit your current environment, so make sure the modulefile is loaded before you submit. Then create a batch submission script (jobscript), for example:

#!/bin/bash
#$ -S /bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings

structure.py args...

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
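
As an illustration of what might replace the args... placeholder, here is a minimal sketch of a complete jobscript. The options -K, --input, --output and --format are those listed in the fastStructure README. The file prefix genotypes and the value K=3 are assumptions chosen only for this example, so substitute your own data and settings.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings

# Hypothetical input prefix: assumes genotypes.bed, genotypes.bim and
# genotypes.fam exist in the current directory
INPUT=genotypes

# Number of populations to fit (arbitrary illustrative value)
K=3

# Output file names will start with the prefix given to --output
structure.py -K "$K" --input="$INPUT" --output="${INPUT}_out" --format=bed
```

See the fastStructure website for the full list of options and their defaults.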

Parallel batch job submission

fastStructure is serial only, so there is no parallel version to run. If you have multiple datasets to process, submit them as a job array so that many serial jobs can run at the same time.
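
A job array of this kind could be sketched as follows. The -t flag and the SGE_TASK_ID variable are standard batch system features. The dataset prefixes popA, popB and popC and the value K=3 are hypothetical and used only for illustration.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd             # Job will run from the current directory
#$ -V               # Job will inherit current environment settings
#$ -t 1-3           # Job array: three tasks, with SGE_TASK_ID set to 1, 2 and 3

# Hypothetical input prefixes, one per task (e.g. popA.bed, popA.bim, popA.fam)
DATASETS=(popA popB popC)

# Bash arrays are indexed from 0 but task IDs start at 1
PREFIX=${DATASETS[$((SGE_TASK_ID - 1))]}

# Each task processes its own dataset; K=3 is an arbitrary illustrative value
structure.py -K 3 --input="$PREFIX" --output="${PREFIX}_out"
```

The array is submitted with a single qsub command, and each task then runs as an independent serial job. Adjust the range given to -t to match the number of entries in the list.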

Further info

fastStructure website

Updates

None.

Last modified on May 11, 2018 at 3:27 pm by George Leaver
