TEtranscripts
Overview
TEtranscripts is a software package that utilizes both unambiguously (uniquely) and ambiguously (multi-) mapped reads to perform differential enrichment analyses from high throughput sequencing experiments.
Version 2.0.5 is installed on the CSF.
Restrictions on use
There are no restrictions on accessing this software on the CSF. It is released under the GPL v3 license and all usage must adhere to that license.
Please cite your use of this software using:
Jin Y., Tam O.H., Paniagua E. and Hammell M. (2015). TEtranscripts: A package for including transposable elements in differential expression analysis of RNA-seq datasets. Bioinformatics 31: 3593-3599. Pubmed ID: 26206304
Set up procedure
We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.
Load the following modulefile:
module load apps/python/tetranscripts/2.0.5   # Also loads modulefiles for
                                              # Pysam 0.15.2 and R 3.6.1 (inc DESeq2)
Reference and Test Data
The modulefile will set the following environment variables which can be used in your jobscripts to access the reference / pre-built data and test data files. NOTE: all downloaded datafiles have been uncompressed.
$TE_DATA      set to /mnt/data-sets/tetranscripts/
$TE_TESTDATA  set to /mnt/data-sets/tetranscripts/test_data
$TE_PREBUILT  set to /mnt/data-sets/tetranscripts/Prebuilt_indices/
$TE_GTF       set to /mnt/data-sets/tetranscripts/TE_GTF/
To see what prebuilt and GTF data files are available, run the following on the login node after loading the modulefile there:
ls $TE_PREBUILT
ls $TE_GTF
ls $TE_TESTDATA
You can then use the filenames when specifying command-line flags for TEtranscripts. For example:
TEtranscripts ... --TE $TE_GTF/ce10_rmsk_TE.gtf
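A mistyped annotation filename may only show up once the job has started running, so it can help to resolve the path up front. The following is a minimal sketch, not part of the CSF installation: the function name te_gtf_for is hypothetical, and the <genome>_rmsk_TE.gtf naming pattern is an assumption taken from the single example filename above, so check the output of ls $TE_GTF before relying on it.

```shell
# Hypothetical helper (not provided by the modulefile): print the path of the
# TE annotation for a genome build, or fail if no readable file is found.
te_gtf_for() {
    local genome="$1"
    # Naming pattern assumed from the ce10_rmsk_TE.gtf example; other files
    # in $TE_GTF may be named differently.
    local path="${TE_GTF%/}/${genome}_rmsk_TE.gtf"
    if [ ! -r "$path" ]; then
        echo "No readable TE GTF for '$genome' at $path" >&2
        return 1
    fi
    printf '%s\n' "$path"
}
```

In a jobscript this might be used as: te_gtf=$(te_gtf_for ce10) || exit 1, followed by passing --TE "$te_gtf" to TEtranscripts.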
Running the application
Please do not run TEtranscripts or TEcount on the login node. Jobs should be submitted to the compute nodes via batch.
Serial batch job submission
Create a batch submission script (which will load the modulefile in the jobscript), for example:
#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -l mem512        # It is recommended to run on the high-memory nodes.
                    # This will give you 32GB memory to work with (1 core).
                    # NO -V line - we load modulefiles in the jobscript

module load apps/python/tetranscripts/2.0.5

TEtranscripts -t RNAseq1.bam RNAseq2.bam -c CtlRNAseq1.bam CtlRNAseq.bam \
              --GTF genic-GTF-file --TE TE-GTF-file
#
# See above for environment variables that provide
# the location of the downloaded reference data
Submit the jobscript using:
qsub scriptname
where scriptname is the name of your jobscript.
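Because the job may wait in the queue before it runs, a missing or empty BAM file is cheaper to catch at the top of the jobscript than partway through the analysis. The sketch below shows one way to do that. The function name check_inputs is hypothetical and is not part of TEtranscripts or of the CSF installation.

```shell
# Hypothetical helper: report every input file that is missing or empty,
# and return non-zero if at least one problem was found.
check_inputs() {
    local missing=0 f
    for f in "$@"; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "$f" ]; then
            echo "Missing or empty input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In the jobscript above, a line such as check_inputs RNAseq1.bam RNAseq2.bam CtlRNAseq1.bam CtlRNAseq.bam || exit 1 could be placed just before the TEtranscripts command, so that the job stops with a clear message instead of starting the analysis.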
Further info
- TEtranscripts github page. This includes instructions on running TEtranscripts and example command-lines.
- TEtranscripts website
Updates
None.