{"id":11998,"date":"2026-03-03T09:57:45","date_gmt":"2026-03-03T09:57:45","guid":{"rendered":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/?page_id=11998"},"modified":"2026-03-11T16:19:01","modified_gmt":"2026-03-11T16:19:01","slug":"nextflow","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/software\/applications\/nextflow\/","title":{"rendered":"Nextflow"},"content":{"rendered":"<p><!-- \nTODO:\n- jobscript example to run nextflow process on a compute node - 7days limit\n- make adaptive profile based on https:\/\/github.com\/nf-core\/configs\/blob\/master\/conf\/utd_juno.config  for the CSF3\n--><\/p>\n<h2>Overview<\/h2>\n<p><a href=\"https:\/\/www.nextflow.io\">Nextflow<\/a> is a scientific workflow system for creating scalable, portable, and reproducible workflows. It is based on the dataflow programming model, which greatly simplifies the writing of parallel and distributed pipelines, allowing you to focus on the flow of data and computation.<\/p>\n<p>It consists of a Domain Specific Language (DSL), currently called Nextflow DSL2, based on <a href=\"https:\/\/groovy-lang.org\/\">Apache Groovy<\/a>. It runs on the Java Virtual Machine (JVM).<\/p>\n<p>You can either code your own workflow pipeline using the nextflow DSL2 or you can use one of the pre-existing pipelines developed and published by the community. You can find these hosted in community repositories such as the <a href=\"https:\/\/nf-co.re\/\">nf-core project<\/a> or sequencing vendor specific such as <a href=\"https:\/\/github.com\/epi2me-labs\">Oxford Nanopore EPI2ME<\/a>.<\/p>\n<h2>Restrictions on use<\/h2>\n<p>Nextflow is released under the open source <a href=\"https:\/\/www.apache.org\/licenses\/LICENSE-2.0\">Apache 2.0 license<\/a> and can be accessed by all CSF users. Be aware that your usage must still adhere to the license terms and that publicly available pipelines you use may have different licenses e.g. 
nf-core uses the standard MIT License, but Oxford Nanopore (EPI2ME) applies its own Public License.<\/p>\n<h2>Set up procedure<\/h2>\n<p>We have created a module which sets the required environment variables, loads the Java virtual machine, and provides a Nextflow profile compatible with the CSF3 Slurm partitions.<\/p>\n<p>To load the module:<\/p>\n<pre>\r\nmodule load apps\/binapps\/nextflow\/25.10.4        # also loads Java openJDK 25.0.1\r\n<\/pre>\n<div class=\"note\">A pipeline you want to run might require a specific version of Nextflow. You generally do not need to request a new Nextflow module to achieve this; see <a href=\"#Running_a_specific_version_of_Nextflow\">below for details<\/a>.<\/div>\n<h2>Running Nextflow<\/h2>\n<p>Nextflow is a bit different from other software on the CSF; please read and understand the information below before raising support requests for Nextflow-related issues.<\/p>\n<p>When you run Nextflow, it launches a lightweight main task, which in turn submits a series of sub-tasks to Slurm on the CSF to run the steps in the pipeline.<\/p>\n<p>Because the main task must be available for the whole of the pipeline run, we do not recommend submitting it as a Slurm job itself, as it may time out if sub-tasks have to wait for resources. Instead, you may run the Nextflow main task on one of the login nodes.<\/p>\n<p>Pipelines (e.g. from nf-core) and their dependencies will be automatically downloaded into the <code>~\/scratch\/.nextflow<\/code> directory structure. Be aware that if you use multiple pipelines, or different versions of the same pipeline, each of which downloads its own dependencies, this scratch location can become large and you may want to <a href=\"https:\/\/www.nextflow.io\/docs\/latest\/reference\/cli.html#clean\">clean up<\/a> periodically.<\/p>\n<p>Please always run Nextflow from within a directory you have created in <code>scratch<\/code>. 
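<\/p>\n<p>For example, to create a working directory in <code>scratch<\/code> and launch from inside it (the directory name <code>rnaseq_run1<\/code> is just an illustration):<\/p>\n<pre>\r\nmkdir -p ~/scratch/rnaseq_run1    # create a working directory in your scratch area\r\ncd ~/scratch/rnaseq_run1          # launch nextflow from inside this directory\r\n<\/pre>\n<p>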
Nextflow will download and create a lot of data while running a workflow and launching several jobs; storing this data in your <code>$HOME<\/code> directory is therefore not ideal.<\/p>\n<p>If you are an experienced Nextflow user, you can find the path to the Manchester profile configuration file in the environment variable <code>$NXF_UOM_CONFIG<\/code> after loading the Nextflow module. If you are using your own config file, please ensure that you do NOT use a local profile, as Nextflow will then not use Slurm and will run the pipeline from where it was launched, usually the login node.<\/p>\n<p>If you want to inspect the NXF_xxx variables we defined in our module file, as some of them change the default storage locations, you can try <code>module show apps\/binapps\/nextflow\/25.10.4<\/code>.<\/p>\n<h3>Running Nextflow from a login node<\/h3>\n<p>We recommend launching Nextflow from one of the three <strong>login nodes<\/strong>. Make a note of which login node you use: if you log out, you will need to reconnect to the <strong>same node<\/strong> in order to <a href=\"#Monitoring_a_pipeline8217s_progress\">monitor<\/a> or <a href=\"#Cancelling_a_pipeline_run\">cancel<\/a> the pipeline run. A specific login node can be chosen when logging in, e.g. <code>ssh <em>username<\/em>@login3-csf3.itservices.manchester.ac.uk<\/code> for login node 3.<\/p>\n<p>The generic command sequence to run Nextflow on the CSF3 is:<\/p>\n<pre>module load apps\/binapps\/nextflow\/25.10.4        # load the module\r\n\r\nnextflow -bg run &lt;<em>pipeline<\/em>&gt; -c &lt;<em>nextflow_config_file<\/em>&gt; \\\r\n         -profile &lt;<em>profile_1,profile_2,...<\/em>&gt; [arg...] &> NXF_OUT.log\r\n<\/pre>\n<p>Explanation of the options shown:<\/p>\n<ul>\n<li><code>-bg<\/code>:\n<p>Run Nextflow as a background process. With this option the Nextflow process will keep running in the background (on the login node), even if you log out. 
You can find the Nextflow process ID (PID) stored in the <code>.nextflow.pid<\/code> file created in the directory from which you launched Nextflow. See also: <a href=\"#Monitoring_a_pipeline8217s_progress\">Monitoring a pipeline&#8217;s progress<\/a> and <a href=\"#Cancelling_a_pipeline_run\">Cancelling a pipeline run<\/a>.<\/p>\n<\/li>\n<li><code>&lt;<em>pipeline<\/em>&gt;<\/code>:\n<p>For a local pipeline this is your pipeline script, e.g. <code>\/path\/to\/main.nf<\/code>.<br \/>\nFor <a href=\"https:\/\/www.nextflow.io\/docs\/stable\/cli.html#launching-a-remote-project\">a pipeline from an online repository<\/a>, you can give the project URL, or a short name where supported, e.g. for nf-core pipelines.<\/p>\n<\/li>\n<li><code>-c &lt;<em>nextflow_config_file<\/em>&gt;<\/code>:\n<p>A custom <a href=\"https:\/\/www.nextflow.io\/docs\/stable\/config.html#configuration-file\">configuration file<\/a> is needed to define the parameters and limits specific to the CSF. By loading our Nextflow module you get access to our custom University of Manchester configuration file via the variable <code>$NXF_UOM_CONFIG<\/code>.<\/p>\n<\/li>\n<li><code>-profile &lt;<em>profile_1,profile_2,...<\/em>&gt;<\/code>:\n<p>Use one or more of the defined profiles (in a comma-separated list). <a href=\"https:\/\/www.nextflow.io\/docs\/stable\/config.html#config-profiles\">A profile<\/a> is a set of configuration settings to be used during pipeline execution. Profiles may be defined in a pipeline&#8217;s configuration file (in the pipeline&#8217;s project directory) and in the custom configuration file loaded with the <code>-c<\/code> option.<\/p>\n<p>In our <code>$NXF_UOM_CONFIG<\/code> Nextflow configuration file we currently have the <code>csf3himem<\/code> profile defined. This instructs Nextflow to submit all jobs to the <code>himem<\/code> Slurm partition. 
This partition is chosen because <code>himem<\/code> allows both single-core and multicore jobs and has large memory, which is a typical requirement for bioinformatics pipelines.<\/p>\n<\/li>\n<li><code>[arg...]<\/code>:\n<p>One or more <strong>pipeline-specific<\/strong> arguments. They always start with a double dash (<code>--<\/code>), in contrast to the generic Nextflow options, which start with a single hyphen (e.g. <code>-bg<\/code>). Each pipeline may define its own options; for published pipelines these are usually described in the pipeline&#8217;s documentation.<\/p>\n<\/li>\n<li><code>&> NXF_OUT.log<\/code>:\n<p>This part of the command redirects any standard output and error messages from the terminal into a file called <code>NXF_OUT.log<\/code> and returns you to the command prompt.<\/p>\n<\/li>\n<\/ul>\n<h3>Example test runs<\/h3>\n<h4>RNAseq pipeline from nf-core<\/h4>\n<p>The basic command to run the test case for the <a href=\"https:\/\/nf-co.re\/rnaseq\">rnaseq pipeline<\/a> published in the nf-core project is:<\/p>\n<pre>\r\nnextflow -bg run nf-core\/rnaseq -c $NXF_UOM_CONFIG \\\r\n         -profile test,singularity,csf3himem --outdir output-rnaseq &> NXF_OUT.log\r\n<\/pre>\n<p>To run the pipeline for real, remove <code>test<\/code> from the <code>-profile<\/code> arguments and add the required pipeline arguments at the end, e.g. for inputs, outputs, settings. 
These arguments will be defined in the pipeline documentation.<\/p>\n<p>If you want to run a specific <strong>revision of the pipeline<\/strong> (recommended), include the <code>-r<\/code> option:<\/p>\n<pre>\r\nnextflow -bg run nf-core\/rnaseq -r 3.22.2  -c $NXF_UOM_CONFIG \\\r\n         -profile test,singularity,csf3himem --outdir output-rnaseq &> NXF_OUT.log\r\n<\/pre>\n<h4>wf-human-variation workflow from Oxford Nanopore EPI2ME Labs<\/h4>\n<p>Detailed instructions to run the workflow are provided in its <a href=\"https:\/\/github.com\/epi2me-labs\/wf-human-variation\">Github project page<\/a>.<\/p>\n<p>Create a working directory in <code>scratch<\/code>, and from inside that directory download and unpack the demo dataset: <\/p>\n<pre>\r\nwget https:\/\/ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com\/wf-human-variation\/hg002%2bmods.v1\/wf-human-variation-demo.tar.gz\r\ntar -xzvf wf-human-variation-demo.tar.gz\r\n<\/pre>\n<p>Then run the pipeline:<\/p>\n<pre>\r\nnextflow run epi2me-labs\/wf-human-variation \\\r\n    -bg \\\r\n    -c $NXF_UOM_CONFIG \\\r\n    -profile singularity,csf3himem \\\r\n    --bam 'wf-human-variation-demo\/demo.bam' \\\r\n    --ref 'wf-human-variation-demo\/demo.fasta' \\\r\n    --bed 'wf-human-variation-demo\/demo.bed' \\\r\n    --sample_name 'DEMO' \\\r\n    --snp \\\r\n    --sv \\\r\n    --mod \\\r\n    --phased &> NXF_OUT.log\r\n<\/pre>\n<h3>Running a specific version of Nextflow<\/h3>\n<p>If you need a specific <strong>Nextflow version<\/strong> set the <code>NXF_VER<\/code> variable at the start of the command:<\/p>\n<pre>\r\nNXF_VER=23.10.1 nextflow -bg run nf-core\/rnaseq -r 3.22.2  -c $NXF_UOM_CONFIG \\\r\n    -profile test,singularity,csf3himem --outdir output-rnaseq &> NXF_OUT.log\r\n<\/pre>\n<p>It doesn&#8217;t matter if this version is older or newer than the one we have installed on the CSF. The requested version will be downloaded into the <code>~\/scratch\/.nextflow<\/code> directory structure and run from there. 
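<\/p>\n<p>This one-shot form is standard shell behaviour: a variable assignment prefixed to a command applies to that command only. A quick, self-contained illustration with a throwaway variable (<code>FOO<\/code> is not a Nextflow variable):<\/p>\n<pre>\r\nFOO=hello sh -c 'echo $FOO'    # prints: hello (FOO is set for this one command)\r\necho ${FOO:-unset}             # prints: unset (FOO is not set in the current shell)\r\n<\/pre>\n<p>By contrast, <code>export NXF_VER=23.10.1<\/code> would make the version apply to every subsequent run in that shell session. 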
<\/p>\n<p>Defining the version as shown above applies only to the run in which it is specified; it does not persist for future runs.<\/p>\n<h3>Monitoring a pipeline&#8217;s progress<\/h3>\n<p>Run <code>nextflow log<\/code> from the launch directory to see the status and metadata of the current and past pipeline runs launched from that directory.<\/p>\n<p>To check the progress of a currently running pipeline, read the <code>NXF_OUT.log<\/code> file in the launch directory:<br \/>\n<code>cat NXF_OUT.log<\/code><br \/>\nOr, if you want live updates as the log file grows:<br \/>\n<code>tail -f NXF_OUT.log<\/code><\/p>\n<p>If you want more detail, check the more verbose <code>.nextflow.log<\/code> file in the same manner.<\/p>\n<p>To monitor the Slurm jobs launched by Nextflow, bear in mind that their names usually start with <code>nf-<\/code>. Useful commands are:<\/p>\n<pre>\r\nsqueue                    # see pending and running jobs\r\nsacct -X -S now-2hours    # see all jobs submitted within the past 2 hours, including completed ones\r\n<\/pre>\n<p>Another helpful tip is to add <a href=\"https:\/\/docs.seqera.io\/nextflow\/reports#trace-file\">the <code>-with-trace<\/code> option<\/a> when launching Nextflow. This creates a tab-separated file that lists all pipeline tasks, with the Slurm job ID in the 3rd column. 
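<\/p>\n<p>Since the trace file is tab-separated, the <code>column<\/code> utility can align it into readable columns (hence the <code>-s $'\\t'<\/code> separator in the real command). A self-contained sketch of the idea, using a small fabricated stand-in file that is space-separated for simplicity:<\/p>\n<pre>\r\necho 'task_id hash      native_id' > trace_demo.txt   # fabricated header row\r\necho '1       ab/12cd34 987654'   >> trace_demo.txt   # fabricated task row\r\ncolumn -t trace_demo.txt                              # align whitespace-separated columns\r\n<\/pre>\n<p>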
To view it, from the Nextflow launch directory, try:<\/p>\n<p><code>column -t -s $'\\t' output\/execution\/trace.txt<\/code><\/p>\n<p>Each hash value in the 2nd column corresponds to a subdirectory in <code>$launchDir\/.nextflow\/work<\/code> that contains all the task log files and generated outputs.<\/p>\n<p>Finally, you can use the <code>nextflow log [options]<\/code> command to <a href=\"https:\/\/docs.seqera.io\/nextflow\/reports\">filter and analyse pipeline data<\/a> and even get an HTML report or timeline (exported in <code>$launchDir\/.nextflow\/output\/execution<\/code>) when the pipeline has finished.<\/p>\n<h3>Cancelling a pipeline run<\/h3>\n<p>To cleanly cancel a pipeline run (along with all related Slurm jobs):<\/p>\n<ol>\n<li>\n<p>Make sure you are logged in to the <a href=\"#Running_Nextflow_from_a_login_node\">same login node you launched the pipeline from<\/a>.<\/p>\n<\/li>\n<li>\n<p>Type <code>kill <em>PID<\/em><\/code>, where <em>PID<\/em> is the Nextflow process ID: the number stored in the <code>.nextflow.pid<\/code> file located in the directory you launched the pipeline from.<\/p>\n<p>A convenient way to do this is to change to the launch directory, which contains the <code>.nextflow.pid<\/code> file, and run: <code>kill $(cat .nextflow.pid)<\/code>.<\/p>\n<\/li>\n<\/ol>\n<h3>Resuming a pipeline run<\/h3>\n<p>Nextflow also provides the ability to resume a cancelled or interrupted pipeline run. 
Please read <a href=\"https:\/\/docs.seqera.io\/nextflow\/cache-and-resume\">the official documentation<\/a> on how to achieve this.<\/p>\n<p>Generally, as long as you have not cleaned the launch directory, all you have to do is:<\/p>\n<ul>\n<li>launch from the same directory that the original run was launched from<\/li>\n<li>add the <code>-resume<\/code> CLI option to the original command<\/li>\n<\/ul>\n<h2>Further Info<\/h2>\n<h3>Official documentation<\/h3>\n<p><a href=\"https:\/\/docs.seqera.io\/nextflow\/\">https:\/\/docs.seqera.io\/nextflow\/<\/a><\/p>\n<h3>Official courses<\/h3>\n<p><a href=\"https:\/\/training.nextflow.io\/latest\/\">https:\/\/training.nextflow.io\/latest\/<\/a><\/p>\n<h3>Example tutorials for pipelines<\/h3>\n<p><a href=\"https:\/\/www.nextflow.io\/docs\/edge\/tutorials\/rnaseq-nf.html\">https:\/\/www.nextflow.io\/docs\/edge\/tutorials\/rnaseq-nf.html<\/a><br \/>\n<a href=\"https:\/\/www.nextflow.io\/docs\/edge\/tutorials\/data-lineage.html\">https:\/\/www.nextflow.io\/docs\/edge\/tutorials\/data-lineage.html<\/a><\/p>\n<h3>VSCode integration<\/h3>\n<p><a href=\"https:\/\/www.nextflow.io\/docs\/latest\/vscode.html\">https:\/\/www.nextflow.io\/docs\/latest\/vscode.html<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overview Nextflow is a scientific workflow system for creating scalable, portable, and reproducible workflows. It is based on the dataflow programming model, which greatly simplifies the writing of parallel and distributed pipelines, allowing you to focus on the flow of data and computation. It consists of a Domain Specific Language (DSL), currently called Nextflow DSL2, based on Apache Groovy. It runs on the Java Virtual Machine (JVM). You can either code your own workflow pipeline.. 
<a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/software\/applications\/nextflow\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":24,"featured_media":0,"parent":86,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-11998","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/11998","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/users\/24"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/comments?post=11998"}],"version-history":[{"count":21,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/11998\/revisions"}],"predecessor-version":[{"id":12130,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/11998\/revisions\/12130"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/86"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/media?parent=11998"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}