{"id":10724,"date":"2025-07-22T18:23:40","date_gmt":"2025-07-22T17:23:40","guid":{"rendered":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/?page_id=10724"},"modified":"2025-07-23T14:54:51","modified_gmt":"2025-07-23T13:54:51","slug":"gatk","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/software\/applications\/gatk\/","title":{"rendered":"GATK"},"content":{"rendered":"<h2>Overview<\/h2>\n<p><a href=\"https:\/\/gatk.broadinstitute.org\/hc\/en-us\">GATK<\/a> offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.<\/p>\n<p>Various versions are installed on the CSF &#8211; please see modulefiles below.<\/p>\n<h2>Restrictions on use<\/h2>\n<p>There are no restrictions on accessing GATK4 on the CSF. It is released under the <a href=\"https:\/\/github.com\/broadinstitute\/gatk\/blob\/master\/LICENSE.TXT\">Apache 2.0 license<\/a> and all use must adhere to that license.<\/p>\n<p>GATKv3 is released under a more restrictive <a href=\"https:\/\/github.com\/broadgsa\/gatk\/blob\/master\/licensing\/protected_license.txt\">license<\/a> which prohibits commercial\/for-profit use. All usage must adhere to that license. If this is too restrictive, you must switch to GATK4, which is fully open sourced.<\/p>\n<h2>Set up procedure<\/h2>\n<p>We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. 
Alternatively, you may load modulefiles on the login node and let the job inherit these settings.<\/p>\n<p>Load one of the following modulefiles:<\/p>\n<pre>\r\nmodule load apps\/<strong>singularity<\/strong>\/gatk\/4.5.0.0\r\nmodule load apps\/binapps\/gatk\/4.4.0.0\r\nmodule load apps\/binapps\/gatk\/4.1.8.0\r\n\r\n# For older versions, first load the bioinf modulefile\r\n<strong>module load apps\/bioinf<\/strong>\r\n# Then the required gatk modulefile\r\nmodule load apps\/gatk\/3.8.0               # See StatusLogger Log4j2 error fix below\r\nmodule load apps\/gatk\/3.6.0\r\nmodule load apps\/gatk\/3.5.0\r\n<\/pre>\n<h2>Running the application<\/h2>\n<p>Please do not run gatk on the login node to process data. Jobs should be submitted to the compute nodes via batch.<\/p>\n<p>You <em>may<\/em> run <code>gatk -h<\/code> on the login node to see a list of flags that can be used to run the various GATK tools in your jobscripts.<\/p>\n<p>Please note that complete instructions on how to run gatk are beyond the scope of this page. Please consult the <a href=\"https:\/\/gatk.broadinstitute.org\/hc\/en-us\">GATK Online Documentation<\/a> for how to use this application. <\/p>\n<h3>StatusLogger Log4j2 Error Fix<\/h3>\n<div class=\"hint\">\nThis section gives a fix for the <code>StatusLogger<\/code> error, which has been seen in v3.8.0 and may exist in other versions.\n<\/div>\n<p>If you receive an error similar to the following, particularly in v3.8.0 when <strong>running on the AMD compute nodes<\/strong> (<code>#SBATCH -p multicore<\/code>):<\/p>\n<pre>\r\nERROR StatusLogger Unable to create class org.apache.logging.log4j.core.impl.Log4jContextFactory ...\r\nERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. 
\r\n<\/pre>\n<p>then please append the following flags to the gatk command-line in your jobscript:<\/p>\n<pre>\r\n-jdk_inflater -jdk_deflater\r\n<\/pre>\n<p>Without these flags, gatk will use some optimized components that only run on Intel CPUs.<\/p>\n<p>See the jobscript examples below.<\/p>\n<h3>Serial batch job submission<\/h3>\n<p>Create a batch submission script (which will load the modulefile in the jobscript), for example:<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#SBATCH -p serial     # (or --partition=) Run on the nodes dedicated to 1-core jobs\r\n#SBATCH -t 4-0        # Wallclock time limit. 4-0 is 4 days. Max permitted is 7-0.\r\n\r\n# Start with a clean environment - modules are inherited from the login node by default.\r\nmodule purge\r\nmodule load apps\/binapps\/gatk\/4.4.0.0\r\n\r\n# Note: The -jdk_inflater -jdk_deflater may be needed in v3.8.0 jobs on the AMD (-p multicore) nodes\r\ngatk -T RealignerTargetCreator -R <em>my<\/em>.fasta -I <em>my<\/em>.bam -o <em>my<\/em>_realigner.intervals  <em>-jdk_inflater -jdk_deflater<\/em>\r\n\r\n<\/pre>\n<p>Submit the jobscript using: <\/p>\n<pre>sbatch <em>scriptname<\/em><\/pre>\n<p>where <em>scriptname<\/em> is the name of your jobscript.<\/p>\n<h3>Parallel batch job submission<\/h3>\n<p>For GATK tools that can use multiple cores, create a parallel jobscript that requests a suitable partition and number of cores, for example:<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#SBATCH -p multicore  # (or --partition=) Run on the AMD 168-core nodes\r\n#SBATCH -n 16         # (or --ntasks=) Number of cores to use.\r\n#SBATCH -t 4-0        # Wallclock time limit. 4-0 is 4 days. Max permitted is 7-0.\r\n\r\n# Start with a clean environment - modules are inherited from the login node by default.\r\nmodule purge\r\nmodule load apps\/binapps\/gatk\/4.4.0.0\r\n\r\n# You must inform your app how many cores to use. 
$SLURM_NTASKS will be set to the -n number above.\r\n# Note: The -jdk_inflater -jdk_deflater may be needed in v3.8.0 jobs on the AMD (-p multicore) nodes\r\ngatk -T RealignerTargetCreator <strong>-nt $SLURM_NTASKS<\/strong> -R <em>my<\/em>.fasta -I <em>my<\/em>.bam -o <em>my<\/em>_realigner.intervals <em>-jdk_inflater -jdk_deflater<\/em>\r\n<\/pre>\n<p>Submit the jobscript using: <\/p>\n<pre>sbatch <em>scriptname<\/em><\/pre>\n<p>where <em>scriptname<\/em> is the name of your jobscript.<\/p>\n<h2>Further info<\/h2>\n<ul>\n<li><a href=\"https:\/\/gatk.broadinstitute.org\/hc\/en-us\">GATK website<\/a><\/li>\n<\/ul>\n<h2>Updates<\/h2>\n<p>None.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overview GATK offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size. Various versions are installed on the CSF &#8211; please see modulefiles below. Restrictions on use There are no restrictions on accessing GATK4 on the CSF. It is released under the Apache 2.0 license and all use must adhere to that.. 
<a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/software\/applications\/gatk\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":86,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-10724","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/10724","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/comments?post=10724"}],"version-history":[{"count":12,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/10724\/revisions"}],"predecessor-version":[{"id":10738,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/10724\/revisions\/10738"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/86"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/media?parent=10724"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}