zstd

Overview

Zstd (Zstandard) is a real-time compression algorithm that provides high compression ratios. It offers a very wide range of compression/speed trade-offs and is backed by a very fast decoder. It also offers a special mode for small data, called dictionary compression, and can create dictionaries from any sample set. The Zstandard library is provided as open source software under a BSD license.
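
The compression/speed trade-off is selected with a numeric level option on the zstd command line. The following is a minimal sketch, not a benchmark: it assumes a hypothetical file named sample.txt, which it generates itself purely for illustration, and compresses it at a fast level and at a high level so that the resulting sizes can be compared. On the CSF, commands like these belong in a jobscript, not on the login node.

```shell
#!/bin/bash
# Sketch only: sample.txt and the output names are hypothetical.
set -euo pipefail

# Generate some repetitive sample data to compress.
for i in $(seq 1 20000); do
    echo "sample record $i: the quick brown fox jumps over the lazy dog"
done > sample.txt

# -1 is the fastest standard level, -19 the strongest standard level
# (the default level is 3).
# -q suppresses progress output, -f overwrites existing output files,
# -o names the output file.
zstd -q -f -1  sample.txt -o sample.fast.zst
zstd -q -f -19 sample.txt -o sample.small.zst

# Compare the sizes of the original and the two compressed files.
ls -l sample.txt sample.fast.zst sample.small.zst
```

Higher levels take longer to compress, but decompression speed remains high at every level, so a high level can be worthwhile for data that is written once and read many times.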

Version 1.4.0 is installed on the CSF and was compiled with GCC 4.8.5.

Restrictions on use

There are no restrictions on accessing zstd on the CSF.

Set up procedure

Under no circumstances should zstd be run on the login node. If it is found running there, it will be killed without warning. It must be submitted as a batch job.

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

Load the modulefile:

module load tools/gcc/zstd/1.4.0

Running the application

Jobs should be submitted to the compute nodes via batch.

Serial batch job submission – file compression

It is recommended you run zstd on files in your scratch area. This is a faster filesystem than your home area:

cd ~/scratch

Create a batch submission script that loads the modulefile and runs zstd, for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory


# Load the required version
module load tools/gcc/zstd/1.4.0

### Some example compression uses are given below ###



## Compress a file named mydatafile.dat - this creates mydatafile.dat.zst
## The original file is kept by default; add --rm to delete it after successful compression
zstd mydatafile.dat

## OR Compress everything found in a directory named 'my_data' to a compressed tar file named my_data.tar.zst
## Note that the '-' after 'cf' means the tar output is sent through the
## pipe (the | symbol) to the zstd command, not to an intermediate
## tar file.
tar cf - my_data | zstd > my_data.tar.zst
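
The reverse operations (checking and decompressing the files) are not shown above. The following is a minimal sketch using the same file names as the jobscript. The demo data it creates, and the names restored.dat and restored, are hypothetical and exist only to make the sketch self-contained. On the CSF, the zstd lines would go in a jobscript after the module load line.

```shell
#!/bin/bash
# Sketch only: creates small demo inputs, then checks and decompresses them.
set -euo pipefail

# Hypothetical demo data standing in for your own files.
mkdir -p my_data
echo "example contents" > my_data/a.txt
echo "example contents" > mydatafile.dat
zstd -q -f mydatafile.dat
tar cf - my_data | zstd -q > my_data.tar.zst

# Test the integrity of a compressed file without writing any output.
zstd -t -q mydatafile.dat.zst

# Decompress to a new file name (-d decompresses, -o names the output).
zstd -d -q -f mydatafile.dat.zst -o restored.dat

# Unpack the compressed tar file into a separate directory.
# -c sends the decompressed data to standard output, which tar reads via '-'.
mkdir -p restored
zstd -dc my_data.tar.zst | tar xf - -C restored
```

Decompressing to a new name, rather than over the original, makes it easy to compare the two files before deleting anything.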

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.

Further info

  • On the CSF login node, load the modulefile and run: zstd -h to get a list of options.
  • Zstd website.

Updates

None.

Last modified on May 1, 2019 at 2:02 pm by Daniel Nisbet