{"id":847,"date":"2022-07-20T11:54:15","date_gmt":"2022-07-20T10:54:15","guid":{"rendered":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/?page_id=847"},"modified":"2022-07-20T11:54:56","modified_gmt":"2022-07-20T10:54:56","slug":"pigz","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/software\/tools\/pigz\/","title":{"rendered":"pigz and unpigz"},"content":{"rendered":"<h2>Overview<\/h2>\n<p><a href=\"http:\/\/zlib.net\/pigz\/\">Pigz<\/a> is a parallel version of gzip, a tool to compress\/uncompress files into <code>gzip<\/code> or <code>zip<\/code> archives. It uses multiple cores on a compute node to speed up file compression.<\/p>\n<p>Note that the archive (<code>.gz<\/code>) files written by <code>pigz<\/code> and by the ordinary <code>gzip<\/code> installed on many Linux systems are compatible. You do <em>not<\/em> need to use <code>pigz<\/code> to compress <em>and<\/em> <code>unpigz<\/code> to uncompress the same file. For example, you can compress a file quickly on the CSF using <code>pigz<\/code> (see below), then transfer the file to your local desktop machine and uncompress it using the ordinary <code>gunzip<\/code> command. Conversely, if you download a data file from the web that has been compressed with <code>gzip<\/code>, you can uncompress it on the CSF using <code>unpigz<\/code>.<\/p>\n<p><strong>Under no circumstances<\/strong> should <code>pigz<\/code> be run on the login node. If found running it will be killed without warning. It <em>must<\/em> be submitted as a batch job.<\/p>\n<p>Version 2.4 is installed on the CSF.<\/p>\n<h2>Restrictions on use<\/h2>\n<p>There are no restrictions on accessing pigz on the CSF. As noted above, however, it must be run as a batch job, never on the login node.<\/p>\n<h2>Set up procedure<\/h2>\n
<p>Load the modulefile:<\/p>\n<pre>\r\nmodule load pigz\/2.4-gcccore-9.3.0\r\n<\/pre>\n<h2>Running the application<\/h2>\n<p>Jobs must be submitted to the compute nodes via the batch system; please do not run pigz on the login node.<\/p>\n<h3>Parallel batch job submission &#8211; file compression<\/h3>\n<p>It is recommended that you run pigz on files in your scratch area, which is a faster filesystem than your home area:<\/p>\n<pre>\r\ncd ~\/scratch\r\n<\/pre>\n<p>Create a batch submission script, for example:<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#SBATCH -p multicore       # (--partition=multicore) Single compute-node parallel job\r\n#SBATCH -n 8               # (--ntasks=8) Number of cores to use for file compression\/decompression\r\n\r\n# Load the required version\r\nmodule load pigz\/2.4-gcccore-9.3.0\r\n\r\n### Some example compression uses are given below ###\r\n\r\n## Note that $SLURM_NTASKS is automatically set to the number of cores requested above\r\n\r\n## Compress a file named mydatafile.dat - it will be renamed mydatafile.dat.gz once compressed\r\npigz -p $SLURM_NTASKS mydatafile.dat\r\n\r\n## OR Compress everything found in a directory named 'my_data' to a compressed tar file named my_data.tar.gz\r\ntar cf - my_data | pigz -p $SLURM_NTASKS > my_data.tar.gz\r\n       #\r\n       #\r\n       # Note that a '-' here means the output is sent through the\r\n       # pipe (the | symbol) to the pigz command, not to an intermediate\r\n       # tar file.\r\n<\/pre>\n<p>Submit the jobscript using:<\/p>\n<pre>sbatch <em>scriptname<\/em><\/pre>\n<p>where <em>scriptname<\/em> is the name of your jobscript.<\/p>\n<h3>Parallel batch job &#8211; decompression<\/h3>\n<p>The pigz manual states:<\/p>\n<ul>\n<li>Decompression can\u2019t be parallelized. 
As a result, pigz uses a single thread (the main thread) for decompression, but will create three other threads for reading, writing, and check calculation, which can speed up decompression under some circumstances. Parallel decompression can be turned off by specifying one process (-dp 1 or -tp 1).<\/li>\n<\/ul>\n<p>Hence, when using <code>unpigz<\/code>, you should request 4 cores unless you turn off parallel decompression.<\/p>\n<p>Create a batch submission script, for example:<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#SBATCH -p multicore       # (--partition=multicore) Single compute-node parallel job\r\n#SBATCH -n 4               # (--ntasks=4) 4 is the maximum number of cores that decompression can use\r\n\r\n# Load the required version\r\nmodule load pigz\/2.4-gcccore-9.3.0\r\n\r\n### Some example decompression uses are given below ###\r\n### These will use the maximum of 4 cores           ###\r\n\r\n## Uncompress a file named mydatafile.dat.gz - it will be renamed mydatafile.dat once decompressed\r\nunpigz mydatafile.dat.gz\r\n\r\n## OR Uncompress a previously compressed tar file named my_data.tar.gz into the current directory\r\nunpigz -c my_data.tar.gz | tar xf -\r\n        #                         #\r\n        #                         #\r\n        # Send output through     # Note that a '-' here means the tar command will read\r\n        # the pipe to the tar     # input sent through the pipe by the pigz command, not\r\n        # command.                
# from a tar file on disk.\r\n<\/pre>\n<p>Submit the jobscript using:<\/p>\n<pre>sbatch <em>scriptname<\/em><\/pre>\n<p>where <em>scriptname<\/em> is the name of your jobscript.<\/p>\n<h2>Further info<\/h2>\n<ul>\n<li>On the CSF login node, run <code>man pigz<\/code> to get a list of options.<\/li>\n<li><a href=\"http:\/\/zlib.net\/pigz\/\">Pigz<\/a> website.<\/li>\n<\/ul>\n<h2>Updates<\/h2>\n<p>None.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overview Pigz is a parallel version of gzip, a tool to compress\/uncompress files into gzip or zip archives. It uses multiple cores on a compute node to speed up file compression. Note that the archive (.gz) files written by pigz and ordinary gzip, installed on many Linux systems, are compatible. You do not need to compress and uncompress a file using pigz and unpigz. For example you can compress a file quickly on the.. <a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/software\/tools\/pigz\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"parent":156,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-847","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages\/847","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/comments?post=847"}],"version-history":[{"count":3,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages\/847\/revisions"}],"predecessor-version":[{"id":850,"href":"https:\/\/ri.itservices.ma
nchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages\/847\/revisions\/850"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages\/156"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/media?parent=847"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}