tabix
Overview
Tabix is a generic indexer for TAB-delimited genome position files. Also provided is bgzip, a block compression/decompression utility. Both are closely associated with SAMtools.
Version 0.2.6 is installed on the iCSF. It was compiled with gcc 4.7.0.
Version 1.3.1 is available through SAMtools.
Restrictions on use
The software is open source. All iCSF users may access it.
Set up procedure
To access the software you must first load the modulefile:
module load apps/gcc/tabix/0.2.6
Version 1.3.1 is available through SAMtools.
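Once the modulefile is loaded, both tabix and bgzip should be on your path. As a quick check, the following minimal sketch block-compresses and indexes a tiny local file and then queries a region from it. The file name local.vcf and the two records in it are made up purely for illustration. Note that tabix can only index files compressed with bgzip, not files compressed with plain gzip.

```shell
# Hypothetical file name and made-up records, used only for illustration.
# Records must be sorted by chromosome and position for tabix to index them.
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t14720010\t.\tA\tG\t.\tPASS\t.\n1\t14730500\t.\tC\tT\t.\tPASS\t.\n' > local.vcf

if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    # Block-compress the file, keeping the uncompressed original.
    bgzip -c local.vcf > local.vcf.gz
    # Build the index; this writes local.vcf.gz.tbi next to the data file.
    tabix -f -p vcf local.vcf.gz
    # Print the header plus any records that fall inside the region.
    tabix -h local.vcf.gz 1:14720000-14725000
else
    echo "bgzip/tabix not found: load the modulefile first" >&2
fi
```

If the tools are found, the final command should print the header lines followed by the records inside the requested region.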
Running the application
It is possible to use tabix to download VCF files or, more usefully, portions of large VCF files from remote repositories. When specifying the remote repository address, use http:// rather than ftp:// so that tabix can download through the University proxy server (Incline has no direct access to the outside world, i.e. off-campus servers). For example:
tabix -fh http://ftp.1000genomes.ebi.ac.uk/somepath/blah.genotypes.vcf.gz \
    1:14720000-15600000 | gzip -c > data.vcf.gz
Notice that in the above command we pipe the downloaded data (which is plain text) through the gzip command to produce a compressed file named data.vcf.gz. Text data usually compresses well, so this method will save you a lot of disk space.
To view the contents of the compressed file (without first needing to uncompress it), use the following command to page through it:
zless data.vcf.gz
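Besides paging through the file, it can be handy to summarise it without uncompressing it to disk. The following is a minimal sketch using only standard tools. The file sample.vcf.gz and its two records are a made-up stand-in so that the commands can be tried anywhere; substitute data.vcf.gz to inspect your own download.

```shell
# Stand-in for a downloaded file: a tiny, made-up VCF fragment (hypothetical data),
# compressed with gzip in the same way as data.vcf.gz.
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t14720010\t.\tA\tG\t.\tPASS\t.\n1\t14730500\t.\tC\tT\t.\tPASS\t.\n' | gzip -c > sample.vcf.gz

# Count the variant records (lines that do not start with '#').
records=$(gzip -dc sample.vcf.gz | grep -vc '^#')
echo "records: $records"

# Show only the header lines.
gzip -dc sample.vcf.gz | grep '^#'

# Report the compressed and uncompressed sizes.
gzip -l sample.vcf.gz
```

Here gzip -dc writes the uncompressed text to standard output, so nothing extra is stored on disk.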
Further info
Updates
None.