{"id":3268,"date":"2019-05-08T14:25:27","date_gmt":"2019-05-08T13:25:27","guid":{"rendered":"http:\/\/ri.itservices.manchester.ac.uk\/csf3\/?page_id=3268"},"modified":"2025-09-10T10:10:05","modified_gmt":"2025-09-10T09:10:05","slug":"checkm","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/software\/applications\/checkm\/","title":{"rendered":"CheckM"},"content":{"rendered":"<h2>Overview<\/h2>\n<p><a href=\"https:\/\/ecogenomics.github.io\/CheckM\/\">CheckM<\/a> provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. It provides robust estimates of genome completeness and contamination by using collocated sets of genes that are ubiquitous and single-copy within a phylogenetic lineage.<\/p>\n<p>Version 1.2.2 is installed on the CSF.<br \/>\nVersion 1.1.0 is installed on the CSF.<\/p>\n<h2>Restrictions on use<\/h2>\n<p>There are no restrictions on accessing this software on the CSF. It is licensed under the GNU General Public License version 3 and all usage must adhere to that license.<\/p>\n<p>Please cite your usage of this software using:<\/p>\n<p>Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. <a href=\"http:\/\/genome.cshlp.org\/content\/25\/7\/1043\">Assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes<\/a>. Genome Research, 25: 1043-1055.<\/p>\n<h2>Set up procedure<\/h2>\n<p>We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. 
<\/p>\n<p>Load one of the following modulefiles:<\/p>\n<pre>\r\nmodule load apps\/python\/checkm\/1.2.2\r\nmodule load apps\/python\/checkm\/1.1.0\r\n\r\n# You will also need to load modulefiles for HMMER (hmmsearch and hmmfetch), prodigal and pplacer, as follows:\r\nmodule load apps\/gcc\/hmmer\/3.2.1\r\nmodule load apps\/gcc\/prodigal\/2.6.3\r\nmodule load apps\/binapps\/pplacer\/1.1.alpha19\r\n<\/pre>\n<h2>Standard Data (dataRoot)<\/h2>\n<p>The checkm standard database has been downloaded and installed centrally, and the <em>dataRoot<\/em> variable has been configured to point to it. You do not need to do anything to configure checkm to use this data. For reference, the data was downloaded from <a href=\"https:\/\/data.ace.uq.edu.au\/public\/CheckM_databases\/\">https:\/\/data.ace.uq.edu.au\/public\/CheckM_databases\/<\/a>.<\/p>\n<h2>Running the application<\/h2>\n<p>Please do not run checkm analyses on the login node. <strong>It is very memory hungry<\/strong>. Jobs should be submitted to the compute nodes via batch.<\/p>\n<p>Please see the checkm online documentation for an example of <a href=\"https:\/\/github.com\/Ecogenomics\/CheckM\/wiki\/Quick-Start#example-usage\">typical usage<\/a>.<\/p>\n<p>You <em>may<\/em> run <code>checkm<\/code> without any arguments or flags on the login node to get the help text showing the names of the checkm tools that can be run. For example:<\/p>\n<pre>\r\ncheckm\r\n                ...::: CheckM v1.1.0 :::...\r\n\r\n  Lineage-specific marker set:\r\n    tree         -> Place bins in the reference genome tree\r\n    tree_qa      -> Assess phylogenetic markers found in each bin\r\n    lineage_set  -> Infer lineage-specific marker sets for each bin\r\n\r\n  Taxonomic-specific marker set:\r\n...\r\n<\/pre>\n<h3>Serial batch job submission<\/h3>\n<p><strong>PLEASE NOTE:<\/strong> due to the high memory requirements of checkm, a serial (1-core) job is unlikely to give you enough memory. 
If you are on a compute node with &lt;40GB of memory per core, the <code>--reduced_tree<\/code> flag can be added to the <code>checkm<\/code> command to reduce the memory requirements to approximately 14GB. Most CSF3 compute nodes offer only 4&#8211;5GB per core! The <a href=\"\/csf3\/batch\/high-memory-jobs\/\">high memory nodes<\/a> can offer 16GB, 32GB, 46GB, and 50GB per core. However, because checkm can use multiple cores to speed up computation, a multi-core job is recommended as it will also give you access to more memory.<\/p>\n<p>Create a batch submission script (which will load the modulefile in the jobscript), for example:<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#SBATCH -p serial   # Use the nodes dedicated to 1-core jobs\r\n#SBATCH -n 1        # Use one core (causes the $SLURM_NTASKS var to be set)\r\n#SBATCH -t 4-0      # 4-day wallclock (max permitted is 7 days)\r\n\r\n# Choose the version you require\r\nmodule purge\r\nmodule load apps\/python\/checkm\/1.1.0\r\n\r\n# CheckM also needs hmmer, prodigal and pplacer (see the set up procedure above)\r\nmodule load apps\/gcc\/hmmer\/3.2.1\r\nmodule load apps\/gcc\/prodigal\/2.6.3\r\nmodule load apps\/binapps\/pplacer\/1.1.alpha19\r\n\r\n# $SLURM_NTASKS will be automatically set to 1 (one core) in a serial job\r\ncheckm <em>tool<\/em> -t $SLURM_NTASKS <em>arg1<\/em> <em>arg2 ...<\/em>\r\n        #\r\n        # See the checkm documentation for a list of its available tools\r\n<\/pre>\n<p>Submit the jobscript using: <\/p>\n<pre>sbatch <em>scriptname<\/em><\/pre>\n<p>where <em>scriptname<\/em> is the name of your jobscript.<\/p>\n<h3>Parallel batch job submission<\/h3>\n<p>Create a batch submission script (which will load the modulefile in the jobscript), for example:<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#SBATCH -p multicore    # (or --partition=) Use the AMD 168-core nodes\r\n#SBATCH -n 8            # Number of cores. 
Can be 2--168.\r\n#SBATCH -t 4-0          # 4-day wallclock (max permitted is 7 days)\r\n\r\n# Choose the version you require\r\nmodule purge\r\nmodule load apps\/python\/checkm\/1.1.0\r\n\r\n# CheckM also needs hmmer, prodigal and pplacer (see the set up procedure above)\r\nmodule load apps\/gcc\/hmmer\/3.2.1\r\nmodule load apps\/gcc\/prodigal\/2.6.3\r\nmodule load apps\/binapps\/pplacer\/1.1.alpha19\r\n\r\n# $SLURM_NTASKS will be automatically set to the number of cores requested above\r\ncheckm <em>tool<\/em> -t $SLURM_NTASKS <em>arg1<\/em> <em>arg2 ...<\/em>\r\n        #\r\n        # See the checkm documentation for a list of its available tools\r\n<\/pre>\n<p>Submit the jobscript using <code>sbatch <em>scriptname<\/em><\/code>, where <em>scriptname<\/em> is the name of your jobscript.<\/p>\n<h2>Further info<\/h2>\n<ul>\n<li><a href=\"https:\/\/ecogenomics.github.io\/CheckM\/\">CheckM GitHub website<\/a><\/li>\n<\/ul>\n<h2>Updates<\/h2>\n<p>None.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overview CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. It provides robust estimates of genome completeness and contamination by using collocated sets of genes that are ubiquitous and single-copy within a phylogenetic lineage. Version 1.2.2 is installed on the CSF. Version 1.1.0 is installed on the CSF. Restrictions on use There are no restrictions on accessing this software on the CSF. It is licensed.. 
<a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/software\/applications\/checkm\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":86,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-3268","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/3268","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/comments?post=3268"}],"version-history":[{"count":9,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/3268\/revisions"}],"predecessor-version":[{"id":10919,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/3268\/revisions\/10919"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/86"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/media?parent=3268"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}