Conda – Miniforge3
Overview
Conda is a package and environment manager.
New versions of Conda are regularly released but environments created with older versions are generally usable with newer versions.
On the CSF we recommend the Miniforge3 implementation of Conda because it is aligned with the open conda-forge community and repository.
The Bioconda channel is already configured within our Miniforge modules.
Restrictions on use
Conda is open source but packages and repositories may not be entirely open or available under the same open source license.
The Miniforge installer code is released under the BSD-3-Clause license.
Set up procedure
We recommend loading modulefiles within your jobscripts so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.
Load one of the following modulefiles:
module load apps/binapps/conda/miniforge3/25.9.1
module load apps/binapps/conda/miniforge3/25.3.0
Loading the module will set all parameters to allow Conda to function, but will not activate the conda base environment. To use a Conda environment, you first need to create it.
The information provided here is for quick reference. For more complete documentation, please see the official Conda getting started guide.
Create a Conda environment
Create an environment called “myenv” with a specific Python version and the pandas package:
conda create -n myenv python==3.11 pandas
Conda will look up the requested packages and, if they are available, will present you with a package plan. You do not have to install Python: Conda is used to manage a wide range of different software.
Type ‘Y’ to accept the plan and allow Conda to install the software.
When the install is complete you will be prompted to activate the environment:
conda activate myenv
Once your environment is active, your shell prompt will look something like:
(myenv) [a12345bc@login2[csf3] ~]$
To deactivate an environment:
conda deactivate
You can install/uninstall any other available Conda packages with:
conda install PACKAGE
conda remove PACKAGE
Mamba
If your packages are particularly complicated and have many dependencies, using mamba to install into an empty Conda environment may be faster. This tends to be the case with Bioinformatics pipeline environments. To use mamba:
conda create -n myenv
conda activate myenv
mamba install PACKAGE1 PACKAGE2
Running jobs within a Conda environment
While you can set up your environment on the login nodes, jobs must be submitted to the compute nodes via the batch system.
Serial batch job submission
Create a batch submission script which loads the modulefile, checking that you are loading the version you want, for example:
#!/bin/bash --login
#SBATCH -p serial     # (or --partition=) Run on the nodes dedicated to 1-core jobs
#SBATCH -t 4-0        # Wallclock time limit. 4-0 is 4 days. Max permitted is 7-0.

# Start with a clean environment - modules are inherited from the login node by default.
module purge
module load apps/binapps/conda/miniforge3/VERSION

conda activate myenv
python3 myscript.py
Submit the jobscript using:
sbatch scriptname
where scriptname is the name of your jobscript.
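If the environment name in a jobscript is mistyped, conda activate reports an error but the rest of the script may still carry on without the environment you intended. One way to guard against this, sketched here, is to check the CONDA_DEFAULT_ENV variable, which Conda sets when an environment is activated. The function name require_conda_env is made up for this illustration and is not part of Conda or the CSF.

```shell
# Hypothetical helper: succeed only if the named Conda environment is active.
# CONDA_DEFAULT_ENV is set by "conda activate"; for environments created
# with -p it is expected to hold the full path rather than a short name.
require_conda_env() {
    local wanted="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" != "$wanted" ]; then
        echo "Expected Conda environment '$wanted' but found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}

# In a jobscript this would go straight after "conda activate myenv":
#   require_conda_env myenv || exit 1
```

With this in place the job stops straight away with a clear message in its output file, rather than failing later with a missing-module error.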
Parallel batch job submission
If the app is multicore capable, the following parallel jobscript shows how you might set the number of threads or processes automatically from Slurm environment variables. Whether this is necessary will depend on your code.
#!/bin/bash --login
#SBATCH -p multicore  # (or --partition=) Run on the AMD 168-core nodes
#SBATCH -n 16         # (or --ntasks=) Number of cores to use.
#SBATCH -t 4-0        # Wallclock time limit. 4-0 is 4 days. Max permitted is 7-0.

## Start with a clean environment - modules are inherited from the login node by default.
module purge
module load apps/binapps/conda/miniforge3/VERSION

conda activate myenv

## Set any multi-thread parameters you may need from the Slurm variables.
## SLURM_NTASKS holds the value given to -n, so THREADS will be 16 in this example.
THREADS=$SLURM_NTASKS

## Run the code, assuming that its -t option sets the number of threads.
python3 myscript.py -t "$THREADS"
Submit the jobscript using:
sbatch scriptname
where scriptname is the name of your jobscript.
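Outside a batch job, for example during a brief test on a login node, the Slurm variables are unset and a thread count taken from them would be empty. A minimal sketch of a fallback using shell parameter expansion is given here. SLURM_NTASKS is set by Slurm when -n is given; exporting OMP_NUM_THREADS is an assumption that only matters if your code or its libraries use OpenMP.

```shell
# Use the number of tasks Slurm allocated, or 1 when not running under Slurm.
THREADS="${SLURM_NTASKS:-1}"

# Assumption: only relevant for OpenMP-based code or libraries.
export OMP_NUM_THREADS="$THREADS"

echo "Running with $THREADS thread(s)"
```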
Hints and tips
By default, Conda environments are created in a hidden subdirectory in your home directory:
~/.conda/
This can lead to your home directory filling up. The following actions can help you clean up:
Review your environments and remove any you no longer need:
conda env list
conda env remove -n NAME
Clean up cached downloads (will not delete environments):
conda clean -a
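To see how much space is involved before and after cleaning up, the size of the Conda directory can be checked with du. This is a small sketch: the function name conda_dir_usage is hypothetical, and the default path assumes your environments are in the default ~/.conda location.

```shell
# Hypothetical helper: print the total size of a Conda directory.
# Defaults to ~/.conda; pass another path as the first argument.
conda_dir_usage() {
    local dir="${1:-$HOME/.conda}"
    if [ -d "$dir" ]; then
        du -sh "$dir"
    else
        echo "No directory found at $dir" >&2
        return 1
    fi
}

# Example: conda_dir_usage                  (checks ~/.conda)
#          conda_dir_usage /some/other/path
```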
You can specify a different location when you create a new environment provided you have write access there:
conda create -p /some/path/you/have/write/access/myenv
and activate with the full path:
conda activate /some/path/you/have/write/access/myenv
Reproducible environments
It is good practice to keep a record of exactly what packages and versions you used to produce your work. Conda supports this by allowing you to export the complete list of packages in an environment to a YAML file:
conda export -n NAME --no-builds --file environment.yaml
We suggest putting this environment.yaml into your source control alongside your code or scripts.
You can subsequently re-create a deleted environment using the file:
conda env create --file=environment.yaml
Further details can be found in the conda export documentation.
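When re-creating an environment from a file, the environment name is normally taken from the name: field in the YAML unless you override it with -n. A quick way to check that field first is sketched here. The function name env_name_from_yaml is made up for this example, and it assumes the file has a top-level name: line, as exported files usually do.

```shell
# Hypothetical helper: print the value of the top-level "name:" field
# of an environment file (default: environment.yaml).
env_name_from_yaml() {
    local file="${1:-environment.yaml}"
    sed -n 's/^name:[[:space:]]*//p' "$file" | head -n 1
}

# Example: env_name_from_yaml environment.yaml
```

If an environment with that name already exists, you may prefer to pass a different name with -n when re-creating it.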
