{"id":10879,"date":"2025-09-08T14:48:56","date_gmt":"2025-09-08T13:48:56","guid":{"rendered":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/?page_id=10879"},"modified":"2026-03-31T12:56:15","modified_gmt":"2026-03-31T11:56:15","slug":"file-management","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/faqs\/file-management\/","title":{"rendered":"File Management FAQ"},"content":{"rendered":"<div class=\"hint\">The contents of this article originally appeared on <a href=\"\/userdocs\/file-management\/\">a different site<\/a> and it has be migrated to this site for better visibility.<\/div>\n<p>For the definitions of terms see <a href=\"\/csf3\/filesystems\/#introd\">the section on file systems<\/a>.<\/p>\n<h2>Maintaining Directories<\/h2>\n<h3>How can I manage my directories (folders)?<\/h3>\n<p>There are 7 basic commands to inspect and manipulate files and directories: <code>cp<\/code> to copy files,  <code>ls<\/code> to list files, <code>mv<\/code> to rename\/move files, <code>mkdir<\/code> to create directories (folders), <code>rm<\/code> to remove files, <code>rmdir<\/code> to remove empty directories, and <code>ln<\/code> to create links (&#8220;shortcuts&#8221;). 
For detailed information about their use, see <a href=\"\/csf3\/getting-started\/accessing-documentation\/\">their pages in the manual<\/a>.<\/p>\n<p>For example, to see the manual (known as the &#8220;man page&#8221;) for the <code>cp<\/code> command, run:<\/p>\n<pre>\r\nman cp\r\n #\r\n # Press space to page through the text,\r\n # b to go backwards, q to quit.\r\n<\/pre>\n<h3>How can I make better use of the file system?<\/h3>\n<p>The following strategies can be used to optimise use of the file systems:<\/p>\n<ul>\n<li><a href=\"#how-do-i-compress-my-files\">compressing files<\/a> so that less data needs to be held on the storage medium;<\/li>\n<li>combining many files into a single file by <a href=\"#how-do-i-tarball-files-together\">creating an archive<\/a> so that the file system has fewer files to index;<\/li>\n<li><a href=\"#do-empty-files-use-disk-space\">removing empty files<\/a> for the same reason;<\/li>\n<li>removing files that are no longer required after a submission has finished so that less data is held and fewer files are indexed;<\/li>\n<li>storing source files (e.g. data sets), temporary files, or reproducible files (e.g. output files) in <a href=\"\/csf3\/filesystems\/home-scratch-rds\/\">the appropriate directory<\/a>.<\/li>\n<\/ul>\n<h3>Some of my scratch files have been deleted! Where have they gone?<\/h3>\n<p>Files on the <em>scratch<\/em> file system that have not been accessed recently <a href=\"\/csf3\/filesystems\/scratch-cleanup\/\">are removed<\/a>. 
This is because scratch is intended for work in progress; final result files should be stored on a backed-up filesystem (e.g. your home directory) for longer-term storage.<\/p>\n<p>A list of candidate files for removal can be constructed with <code>lfs find<\/code>.<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ lfs find $HOME\/scratch --atime +90\r\n...\r\n<\/pre>\n<p>The resulting list will be of files that have not been <em>accessed<\/em> in over 90 days.<\/p>\n<h3>Is there a way to keep my scratch files from being deleted by the clean up policy?<\/h3>\n<p>No.<\/p>\n<p>Any user found to be trying to systematically retain scratch files beyond the 3 month limit may be banned from the system.<\/p>\n<p>If you need long-term storage, then your principal investigator (PI) needs to request an allocation on <a href=\"\/rds\/\">Research Data Storage<\/a> and clearly state that it is required on the CSF and\/or iCSF when applying.<\/p>\n<h3 id=\"how-do-i-compress-my-files\">How do I compress my files?<\/h3>\n<p>A file is compressed (and uncompressed) using a program that implements a compression algorithm. Many exist but, at the time of writing, the most common commands are <code>bzip2<\/code>, <code>gzip<\/code>, and <code>xz<\/code>.<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ ls\r\nfoo\r\n$ du -a foo\r\n123116  foo\r\n\r\n# Compress the file then check its size\r\n$ gzip foo\r\n$ ls\r\nfoo.gz\r\n\r\n$ du -a foo.gz\r\n336     foo.gz\r\n<\/pre>\n<p>It is standard behaviour for these commands to remove the original file. The output file will have a similar name with a suffix related to the command used: <code>.bz2<\/code>, <code>.gz<\/code>, <code>.xz<\/code>, etc.<\/p>\n<p>The previously compressed file is uncompressed using a related command. 
For those listed above, these are: <code>bunzip2<\/code>, <code>gunzip<\/code>, and <code>unxz<\/code>.<\/p>\n<pre>$ ls\r\nfoo.gz\r\n$ gunzip foo.gz\r\n$ ls\r\nfoo\r\n<\/pre>\n<h3 id=\"how-do-i-tarball-files-together\">How do I tarball files together?<\/h3>\n<p>The following command creates a gzipped tarball (the Unix equivalent of a .zip file) called <code>my-job-files.tar.gz<\/code> in the current directory, containing the contents of 3 directories and a text file:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ tar zcvf my-job-files.tar.gz dir1 dir2 jobscript.txt dir3\r\n...\r\n<\/pre>\n<p>You can create the file somewhere else if you wish. For example, the same command again specifying your directory in a group area:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ tar zcvf \/mnt\/zx01-data01\/zfgg1234\/my-job-files.tar.gz dir1 dir2 jobscript.txt dir3\r\n...\r\n<\/pre>\n<p>Note: you should suffix the name of the file you are creating with .gz so that you know it is compressed. You can also check whether a file is compressed with:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ file my-job-files.tar.gz\r\nmy-job-files.tar.gz: gzip compressed data, from Unix, last modified: Mon Dec 19 15:31:08 2016\r\n<\/pre>\n<p>This <code>file<\/code> command will work even if the file does not have a <code>.gz<\/code> suffix.<\/p>\n<h3>Does tar delete the original files?<\/h3>\n<p>No. Once you have created your tarball, you need to delete the original files. For example, based on the above tar command that would be:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ rm -rf dir1 dir2 jobscript.txt dir3\r\n<\/pre>\n<p>Warning: Be careful with this &#8211; it will delete all the files and folders you specify and anything inside the folders (i.e. all files and subdirectories they contain).<\/p>\n<h3>I&#8217;ve accidentally deleted\/overwritten a file. Is there a backup? 
Is it possible to recover the file?<\/h3>\n<p>The possibility of restoring a file (either in its entirety or a previous revision) depends on where it is stored. This will be one of three locations: a directory that is part of Research Data Storage (RDS), a home directory, or on the <em>scratch<\/em> file system. It can be done for <a href=\"\/csf3\/filesystems\/backup\/\">the first and second<\/a>; it cannot be done for the third.<\/p>\n<h3>All this file management stuff takes a lot of time, can I make it any faster?<\/h3>\n<h4>rsync<\/h4>\n<p><code>rsync<\/code> is much faster than <code>cp<\/code> or <code>mv<\/code>, and a little faster than <code>scp<\/code> or the drag and drop of MobaXterm. It is also clever: if your transfer gets interrupted, it can be restarted using the exact same command and it will figure out where it got up to and continue from there (i.e. it does not do the whole transfer again). When copying files that have been copied before, it only transfers the changes.<\/p>\n<p>You can use the <code>--remove-source-files<\/code> option, which will remove the files from the source once they have been copied to the destination. For example, the command below copies three files from the current directory to a group area and then removes the three files from the current directory:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ rsync --remove-source-files -avz file1 file2 file3 \/mnt\/zx01-data01\/zfgg1234\r\n<\/pre>\n<p>This saves considerable time when you are, in effect, moving files from one location to another. It is much more reliable than doing a <code>mv<\/code> or a <code>cp<\/code> followed by an <code>rm<\/code>.<\/p>\n<h4>pigz<\/h4>\n<p>You can speed up the creation and compression of tarballs with the <code>pigz<\/code> command. This must be run as a batch job. 
Some examples are given below and in the detailed <a href=\"\/csf-apps\/software\/applications\/pigz\/\">pigz<\/a> CSF documentation.<\/p>\n<h2>Querying File Systems<\/h2>\n<h3>How can I see how much home or group disk space is available?<\/h3>\n<p>Home and group file systems are part of the Research Data Storage service. To see how much disk space is allocated, type the following in any directory on the file system you are interested in:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ df -h .\r\nFilesystem            Size  Used Avail Use% Mounted on\r\n...\r\n<\/pre>\n<p>The <code>.<\/code> is important. An example of what a home file system (e.g. <code>\/mnt\/iusers01\/zx01<\/code>) might report:<\/p>\n<pre>Filesystem            Size  Used Avail Use% Mounted on\r\nnas.isilonr.manchester.ac.uk:\/isilon\/x\/x\/x\/csf-users01\r\n                      500G  328G  173G  66% \/mnt\/iusers01\r\n<\/pre>\n<p>Here we can see that the group has been allocated 500GB and, of that, 328GB is in use, leaving 173GB available.<\/p>\n<h3>How can I see how much scratch disk space is available?<\/h3>\n<p>The scratch file system uses something called <a href=\"https:\/\/www.lustre.org\/\">Lustre<\/a>. You can see how big it is and how much space is available with:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ df -h \/scratch\r\n<\/pre>\n<p>which would then report something like this:<\/p>\n<pre>Filesystem            Size  Used Avail Use% Mounted on\r\n10.x.x.x@o2ib0:10.x.x.x@o2ib0:\/lustre\r\n                      306T  228T   64T  79% \/net\/lustre\r\n<\/pre>\n<p>To see how much you are using on the CSF3 scratch file system run this command:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ scrusage\r\nYour scratch usage: 450.7GB, 84,053 files\r\n<\/pre>\n<p>Bear in mind that we have a <a href=\"\/csf\/getting-started-on-the-csf\/filestore-home-directories-and-scratch\/filesystems\/#Scratch_clean-up_policy\">3 month clean up policy<\/a> on files in <em>scratch<\/em>. 
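<\/p>
<p>To check when a particular scratch file was last accessed (the time the clean-up policy considers), you can print its access time with <code>stat<\/code> (the path here is illustrative):<\/p>

```shell
# %x prints the last access time, %n the file name (GNU stat)
stat -c '%x  %n' ~/scratch/myjobdir/results.dat
```

<p>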
It may look big, but with over 1000 users on the system there is the potential for it to run out!<\/p>\n<h3>Is there a limit\/quota on how much disk space I can use?<\/h3>\n<p>In home and group areas you are limited by the quota set on your group. Within that group quota, there are usually no quotas set on individuals, but everyone in the group is expected to make fair usage of the space. If one or two users use up most of the space, it will cause problems for the group and prevent group users from being able to work.<\/p>\n<p>A group can consist of between 2 and 200 users. If someone persistently uses an unfair amount of disk space, then a per-user quota can be imposed.<\/p>\n<p>Principal investigators (PIs) may request that a quota be set on individual directories to help them manage their space. Some groups do this routinely with project areas.<\/p>\n<p>The scratch file system has to be used fairly by everyone on the system.<\/p>\n<p>We will contact people when disk usage on home, group, or scratch file systems gets too high and expect them to tidy up as a matter of priority.<\/p>\n<h3>I need a lot of disk space or long term storage, what should I do?<\/h3>\n<p>Ask your <strong>head of research group<\/strong> to request some Research Data Storage (RDS) and to indicate clearly, when they do, that it needs to be accessible on the CSF. 
For further information and the application form please see the <a href=\"http:\/\/ri.itservices.manchester.ac.uk\/rds\/\">RDS website<\/a>.<\/p>\n<h3>How much disk space am I using?<\/h3>\n<p>To see how much space you are using in any directory, run:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ du -sh\r\n...\r\n<\/pre>\n<p>This will report usage for the current directory and all directories below it, e.g.<\/p>\n<pre>56G\t.\r\n<\/pre>\n<p>If you want to see a breakdown for the files and directories in the current directory:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ du -sh *\r\n...\r\n<\/pre>\n<p>might return something like:<\/p>\n<pre>4.0K      jobscript.txt\r\n11G       my_jobs\r\n1.4G      fileA\r\n43.6G     results\r\n<\/pre>\n<p>Note: typing <code>cd<\/code> (followed by the enter\/return key) will always automatically place you in your home directory.<\/p>\n<h3>I&#8217;m using a lot of home disk space, but can&#8217;t see why.<\/h3>\n<p>Not all names in a directory are ordinarily visible. The result of <a href=\"https:\/\/web.archive.org\/web\/20141205101508\/https:\/\/plus.google.com\/+RobPikeTheHuman\/posts\/R58WgWwN9jp\">an accident in Unix<\/a> is that names whose first character is &#8220;.&#8221; are, normally, hidden. The option <code>-a<\/code> for <code>ls<\/code> will list all names.<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ ls\r\nbaz  foo\r\n$ ls -a\r\n.  ..  .bar  baz  foo  .quux<\/pre>\n<p>If you have been using a tool such as XFE, for example on the RDS-SSH service, to delete files then you may have a trash folder which is rather full and needs emptying. 
On the command line you will find the size of the trash directory by doing this:<\/p>\n<pre>du -sh .Trash\r\n<\/pre>\n<p>Note the leading dot: it is a hidden directory.<\/p>\n<h3>How can I see the size of a single file?<\/h3>\n<p>The command <code>du<\/code> will print the space consumed by a file.<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ du -a foo\r\n4       foo<\/pre>\n<p>The file&#8217;s size (in bytes) appears in the fifth column of the long listing printed by <code>ls<\/code>.<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ ls -l bar\r\n-rw-rw---- 1 user user 742 Sep  9 14:39 bar<\/pre>\n<h3>How do I find large files?<\/h3>\n<p>The <code>find<\/code> command is really good at this. Please run it in batch if you have a large directory for it to work through. For example, to find files in the current directory and directories below it that are bigger than 5GB:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ find . -size +5G\r\n...\r\n<\/pre>\n<p>(The <code>.<\/code> is important.) Example results of this command:<\/p>\n<pre>my_model.sim\r\ndir1\/my_old_model.sim\r\ndir1\/my_old_model.output\r\ndir2\/input.txt\r\n<\/pre>\n<p>To see more detailed information about each file returned by find, including its size, you can add a little bit more to the command, for example:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ find . -size +5G -exec ls -sh {} \\;\r\n...\r\n<\/pre>\n<p>Might return:<\/p>\n<pre>5.1G  my_model.sim\r\n10.8G dir1\/my_old_model.sim\r\n6G    dir1\/my_old_model.output\r\n5G    dir2\/input.txt\r\n<\/pre>\n<p>On the <em>scratch<\/em> file system, the alternative, more specialised, command <code>lfs find<\/code> is available.<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ lfs find $HOME\/scratch --size +5G\r\n...\r\n<\/pre>\n<h3 id=\"do-empty-files-use-disk-space\">Do empty files use disk space?<\/h3>\n<p>Yes, they do. Every file recorded on the file system uses at least one file system block. 
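<\/p>
<p>You can see this for yourself: an empty file reports no data usage to <code>du<\/code>, yet it still consumes an entry in the file system&#8217;s index, counted in the <code>IUsed<\/code> column of <code>df -i<\/code>:<\/p>

```shell
touch empty-file    # create an empty file
du empty-file       # reports 0 - no data in use
df -i .             # IUsed (inodes used) counts every file, empty or not
```

<p>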
This is very small, but when 100s of users have 100s or 1000s of empty files it can add up to a significant amount. Please delete them. Common empty files are those from batch jobs, such as:<\/p>\n<pre>slurm-123456.out\r\nslurm-123456_123.out\r\njobscript.txt.e123456\r\njobscript.txt.o123456\r\n<\/pre>\n<p>Many applications write their results to their own specific output files, so the batch output files are often empty or very small and of no further use. Do check them before deleting, though: some applications write to them, even if you do not require the information.<\/p>\n<p>The command <code>find<\/code> is capable of searching for files whose size is 0.<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ du -a\r\n0       .\/baz\r\n28      .\/foo\r\n0       .\/.quux\r\n8       .\/.bar\r\n40      .\r\n$ find . -size 0\r\n.\/baz\r\n.\/.quux<\/pre>\n<p>The command <code>lfs find<\/code> is also available for use on the <em>scratch<\/em> file system.<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ lfs find $HOME\/scratch --size 0\r\n...\r\n<\/pre>\n<h2>Transferring Files<\/h2>\n<h3>Can I do file management on the CSF login node?<\/h3>\n<p>We do see high loads on the CSF login nodes caused by people doing lots of <code>rsync<\/code>, compression or uploading and downloading. Please use batch jobs to manage your files directly on the system, and for file transfers of more than a few GB please use the RDS-SSH service.<\/p>\n<h3 id=\"Moving-files\">How can I move files from one file system to another? I&#8217;ve heard of something called rsync.<\/h3>\n<p><code>rsync<\/code> is a fast, versatile tool for copying files between file systems and from one computer to another. 
An example of copying a directory from scratch to your group area:<\/p>\n<pre>[<em>mabcxyz1<\/em>@login1[csf3] ~]$ rsync -avz ~\/scratch\/myjobdir \/mnt\/zx01-data01\/zfgg1234\r\n<\/pre>\n<p>The above will create an exact copy of <code>myjobdir<\/code> and all the files it contains in <code>\/mnt\/zx01-data01\/zfgg1234\/myjobdir<\/code>.<\/p>\n<p>We <strong>strongly recommend<\/strong> that you do this <strong>via a batch job<\/strong> to avoid overloading the login nodes. See the examples below.<\/p>\n<h3>How do I download\/upload to\/from the system?<\/h3>\n<p>For <strong>a few small files<\/strong> you can use <code>scp<\/code> or <code>rsync<\/code> from a Linux or Mac computer direct to the login nodes. If you are using Windows then WinSCP is a very good tool, or you can use the drag and drop feature of MobaXterm. All of these methods are described in our <a href=\"\/userdocs\/file-transfer\/\">file transfer<\/a> docs.<\/p>\n<p>For <strong>large files or many files<\/strong> you can use all of the same tools as for small files, but please do so via the RDS-SSH service. Note: you may have to request an account on this service, and it cannot see the scratch file system. Full details are available on the <a href=\"http:\/\/ri.itservices.manchester.ac.uk\/rds\/the-rds-ssh-service\/\">RDS-SSH service webpage<\/a>.<\/p>\n<h3>Can I use a batch script to transfer files to and from the system?<\/h3>\n<p>Unfortunately not. 
Please follow <em>How do I download\/upload to\/from the system?<\/em><\/p>\n<h2>Miscellaneous<\/h2>\n<h3>How do I use the CSF batch system to manage my files, copy, compress, rsync etc?<\/h3>\n<p>It is a simple case of putting the same commands you run on the command line into a serial batch job.<\/p>\n<h4>Example 1: Create compressed tar and rsync it to RDS space via the batch system<\/h4>\n<p>To create a tarball in the current directory and then copy it to a group area:<\/p>\n<pre class=\"slurm\">#!\/bin\/bash --login\r\n#SBATCH -p serial\r\n#SBATCH -t 0-1\r\n\r\ntar zcvf my-job-files.tar.gz dir1 dir2 jobscript.txt dir3\r\nrsync -avz my-job-files.tar.gz \/mnt\/zx01-data01\/zfgg1234\r\n<\/pre>\n<p>and submit in the usual way:<\/p>\n<pre class=\"slurm\">sbatch file-tidy.sbatch<\/pre>\n<p>where <code>file-tidy.sbatch<\/code> is the name of your batch submission script.<\/p>\n<p>Once you are happy that everything in the batch job completed without issue, remember that, depending on which commands you ran (e.g. 
tar) you may need to delete the original copies of files (via a new batch job).<\/p>\n<h4>Example 2: Create compressed tar in RDS space using pigz via the batch system<\/h4>\n<p>This example uses <a href=\"\/csf-apps\/software\/applications\/pigz\/\">pigz<\/a> (a parallel compression tool) and places the compressed file in your directory in your group&#8217;s Research Data Storage:<\/p>\n<pre class=\"slurm\">#!\/bin\/bash --login\r\n#SBATCH -p multicore_small\r\n#SBATCH -n 4               # Use 4 cores to do parallel file compression\r\n#SBATCH -t 0-4             # Wallclock max allowed time needs to be specified - here 4 hrs\r\n\r\n# Set up to use pigz\r\nmodule purge\r\nmodule load tools\/gcc\/pigz\r\n\r\n## Note that $SLURM_NTASKS is automatically set to the number of cores requested above\r\n\r\n## Compress everything found in a directory named 'my_data' to a compressed tar file named my_data.tar.gz in a group area\r\ntar cf - my_data | pigz -p $SLURM_NTASKS &gt; \/mnt\/zx01-data01\/zfgg1234\/my_data.tar.gz\r\n       #\r\n       #\r\n       # Note that a '-' here means the output is sent through the\r\n       # pipe (the | symbol) to the pigz command, not to an intermediate\r\n       # tar file.\r\n<\/pre>\n<p>and submit in the usual way:<\/p>\n<pre class=\"slurm\">sbatch file-tidy.sbatch<\/pre>\n<p>where <code>file-tidy.sbatch<\/code> is the name of your batch submission script.<\/p>\n<p>Once you are happy that everything in the batch job completed without issue, remember that, depending on which commands you ran (e.g. tar), you may need to delete the original copies of files (via a new batch job).<\/p>\n<h4>Example 3: Compress individual large files via the batch system<\/h4>\n<p>The following will use the <code>find<\/code> command from earlier to find files of size 5GB or larger in the current directory (and any sub-directories below it). It will compress each file. 
Here we use the parallel <a href=\"\/csf-apps\/software\/applications\/pigz\/\">pigz<\/a> command on the CSF. The jobscript is:<\/p>\n<pre class=\"slurm\">#!\/bin\/bash --login\r\n#SBATCH -p multicore_small\r\n#SBATCH -n 4           # Use 4 cores to do parallel file compression\r\n#SBATCH -t 0-4         # Wallclock max allowed time needs to be specified - here 4 hrs\r\n    \r\n# Set up to use pigz\r\nmodule purge\r\nmodule load tools\/gcc\/pigz\r\n\r\n# Find all 5GB (or larger) files, show the size, compress in parallel, show the new size\r\nfor fname in `find . -size +5G`; do ls -sh $fname; pigz -p $SLURM_NTASKS $fname; ls -sh $fname.gz; done\r\n<\/pre>\n<p>Submit the jobscript in the usual way:<\/p>\n<pre>sbatch pigz-jobscript<\/pre>\n<h3>I&#8217;ve got 1000s of files in scratch I want to download. What&#8217;s the best way?<\/h3>\n<p>Downloading a large number of individual files can be very time-consuming and will place a lot of strain on the login node.<\/p>\n<p>First consider whether you need to download the files at all. If they are important result files, you should consider keeping them in your home area, which is on secure, backed-up Isilon storage. Your research group may also have additional Isilon areas for specific research projects or data areas. Downloading to a PC that isn&#8217;t backed up could result in data loss if the local PC&#8217;s disks fail. If you don&#8217;t have enough space in your Isilon area then consider compressing the files (with <code>zip<\/code> or <code>gzip<\/code>).<\/p>\n<p>If you still want to download a copy then a better option would be to zip up the files into a single compressed archive. Zip files are common on Windows \/ MacOS, so if you want to transfer the files to a local Windows \/ MacOS computer you can create the zip file on the CSF and then download it. Alternatively, if your local PC is running Linux you can create a <em>tar.gz<\/em> file on the CSF and download that. 
We advise running the zip app as a batch job to prevent the login node from being overloaded. Here&#8217;s how:<\/p>\n<pre># In this example we assume the files to be downloaded are in the folder:\r\n~\/scratch\/my_data\/experiment1\/\r\n\r\n# Go to the parent of the required location in scratch. For example:\r\ncd ~\/scratch\/my_data\/\r\n\r\n# zip up all the files from a sub-directory named 'experiment1' (job wallclock is 1 day: -t 1-0)\r\nsbatch -p serial -t 1-0 --mail-type=END --mail-user=$USER --wrap=\"zip -r my_stuff.zip experiment1\"\r\n\r\n# Or, to create a .tar.gz file for use on Linux PCs\/Laptops (job wallclock is 1 day: -t 1-0)\r\nsbatch -p serial -t 1-0 --mail-type=END --mail-user=$USER --wrap=\"tar czf my_stuff.tar.gz experiment1\"\r\n<\/pre>\n<p>The above command will submit a batch job (without writing a jobscript), run it from the current directory in the <em>serial<\/em> partition, and email you when it has finished. The job will zip up and compress all the files in the <code>experiment1<\/code> sub-directory of the <code>scratch\/my_data\/<\/code> directory (change the names to suit your own directory structure). When the job finishes you&#8217;ll have a file named <code>my_stuff.zip<\/code> in your <code>~\/scratch\/my_data\/<\/code> directory which you can then download using WinSCP, scp or your favourite file transfer program from your PC. Alternatively, copy the zip file to your home area.<\/p>\n<h3>How can I free up some space in my home or scratch area?<\/h3>\n<p>The obvious answer is to delete unwanted files (use the <code>rm<\/code> command or your preferred graphical file browser such as that in MobaXterm). However, deleting results and data files is not always possible. But there are ways to reduce your usage:<\/p>\n<ol>\n<li>Compress your files. Many applications write out plain text results files and other log files. These can be huge. Do you need the log file? If not, delete it. 
The results files will compress well using <code>gzip myresult.dat<\/code> (which will create a new, smaller file named <code>myresult.dat.gz<\/code>). You can still read the file using <code>zless myresult.dat.gz<\/code> or uncompress it using <code>gunzip myresult.dat.gz<\/code>.<\/li>\n<li>Delete unwanted job <code>.o<em>NNNNNN<\/em><\/code> and <code>.e<em>NNNNNN<\/em><\/code> output files. Every job will produce an output file capturing what would have been printed to screen when your application ran. The files can contain normal output (the <code>.o<\/code> file) and error messages (the <code>.e<\/code> file). Each file will have the unique job number at the end of the name. If you run a lot of jobs (1000s &#8211; and many users do!) you will soon have 1000s of files. We&#8217;ve seen some directories with millions of these output files! Individually each file is often small but they soon accumulate. They also take up more space on the file system than you think (the minimum block size of the storage system is used even if your file is smaller). Please <strong>delete unwanted job output files<\/strong>. The following command can be used:\n<pre>rm -f *.[oe][0-9]* \r\n<\/pre>\n<\/li>\n<li>Keep your job directories tidy. Deleting files from jobs you ran months ago is never an exciting task &#8211; you may have 1000s of output files. Nobody likes looking through old files to see if they are needed or not. Deleting unwanted files when the job finishes is the best way to keep your storage areas tidy. You can even put the delete commands (<code>rm<\/code>) in your jobscript to clean up any junk at the end of a job.<\/li>\n<\/ol>\n<h3>I have downloaded a .zip of several datasets but the scratch clean-up keeps deleting them. What can I do?<\/h3>\n<p>When you unzip (or untar) the archive of datasets, the files in the archive will be created with their original timestamps. These could be months or years in the past. 
The scratch clean-up will then see these old files and delete them.<\/p>\n<p>The solution is to ask unzip (or tar) to extract the files and apply today&#8217;s date to them. See <a href=\"\/csf3\/filesystems\/scratch-cleanup\/#tardate\">here<\/a> for the extra flag\/switch you must add to <code>unzip<\/code> or <code>tar -xf<\/code> to extract the files correctly.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The contents of this article originally appeared on a different site and it has been migrated to this site for better visibility. For the definitions of terms see the section on file systems. Maintaining Directories How can I manage my directories (folders)? There are 7 basic commands to inspect and manipulate files and directories: cp to copy files, ls to list files, mv to rename\/move files, mkdir to create directories (folders), rm to remove files,.. <a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/faqs\/file-management\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":22,"featured_media":0,"parent":40,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-10879","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/10879","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/users\/22"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/comments?post=10879"}],"version-history":[{"count":20,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/10879\/revisions"}],"predecessor-version":[{"id":12221,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/10879\/revisions\/12221"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/40"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/media?parent=10879"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}