RSEM

Overview

RSEM is a software package for estimating gene and isoform expression levels from RNA-Seq data. The package provides a user-friendly interface and supports threads for parallel computation of the EM algorithm, single-end and paired-end read data, quality scores, variable-length reads and RSPD (read start position distribution) estimation. In addition, it provides posterior mean and 95% credibility interval estimates for expression levels. For visualization, it can generate BAM and Wiggle files in both transcript coordinates and genomic coordinates. Genomic-coordinate files can be visualized with both the UCSC Genome Browser and the Broad Institute’s Integrative Genomics Viewer (IGV); transcript-coordinate files can be visualized with IGV. RSEM also has its own scripts to generate transcript read depth plots in PDF format. A unique feature of RSEM is that the read depth plots can be stacked, with read depth contributed by unique reads shown in black and read depth contributed by multi-reads shown in red. Models learned from the data can also be visualized. Finally, RSEM includes a simulator.

Version 1.3.1 is installed on the CSF.

Restrictions on use

There are no restrictions on accessing the software on the CSF. It is released under the GNU General Public License v3.

Set up procedure

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

Load the following modulefile:

# Load the RSEM 1.3.1 modulefile
module load apps/gcc/rsem/1.3.1
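
Because the modulefile is loaded inside the jobscript, it can be worth confirming that the RSEM commands are actually on the PATH before any work starts. The following is a minimal sketch of such a check. The function name check_tools is hypothetical and is not part of RSEM or the CSF. The commands rsem-prepare-reference and rsem-calculate-expression are RSEM's own.

```shell
# Report any required commands that are not on PATH.
# Returns 0 if every command is found, 1 otherwise.
# check_tools is a hypothetical helper name used for illustration only.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Missing from PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use in a jobscript, placed after the module load line:
# check_tools rsem-prepare-reference rsem-calculate-expression || exit 1
```

If a command is missing, its name is written to standard error, so it appears in the job's error file and the job can stop before using any compute time.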

Running the application

Please do not run RSEM on the login node. Jobs should be submitted to the compute nodes via the batch system.

Serial batch job submission

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
                    # NO -V line - we load modulefiles in the jobscript

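# Load the RSEM modulefile
module load apps/gcc/rsem/1.3.1
# Note that $NSLOTS is set to the number of cores: 1 for a serial job.
# $STARBIN and $BOWTIE2BIN are assumed to be set already to the
# directories containing the STAR and Bowtie executables.
rsem-prepare-reference --gtf mm9.gtf \
                       --star \
                       --star-path $STARBIN \
                       -p $NSLOTS \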
                       --prep-pRSEM \
                       --bowtie-path $BOWTIE2BIN \
                       --mappability-bigwig-file /path/to/data/mm9.bigWig \
                       /data/mm9 \
                       /ref/mouse_0

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
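
The -p option tells RSEM how many threads to use, and it is usually best kept in step with the number of cores the job has been given. The following is a minimal sketch of one way to do that, assuming the batch system sets $NSLOTS as described in the jobscript comments. The function name rsem_threads is hypothetical. It falls back to 1 when $NSLOTS is unset, empty or not a positive integer, for example when the script is tried outside the batch system.

```shell
# Print the thread count to pass to RSEM's -p option.
# Uses $NSLOTS when it is a positive integer, otherwise prints 1.
# rsem_threads is a hypothetical helper name used for illustration only.
rsem_threads() {
    case "${NSLOTS:-}" in
        ''|*[!0-9]*)
            # Unset, empty, or contains a non-digit character
            echo 1
            ;;
        *)
            if [ "$NSLOTS" -ge 1 ]; then
                echo "$NSLOTS"
            else
                echo 1
            fi
            ;;
    esac
}

# Example use in a jobscript:
# rsem-prepare-reference ... -p "$(rsem_threads)" ...
```

With this in place the same jobscript can be reused for serial and multi-core jobs without editing the thread count by hand.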

 

Further info

Updates

None.

Last modified on January 24, 2020 at 2:04 pm by Daniel Nisbet