...
Figure from: Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks, Trapnell et al, Nature Protocols, 2013.
Get your data
| Code Block | ||
|---|---|---|
| ||
#IF YOU HAVE NOT COPIED THIS OVER ALREADY
cds
cd my_rnaseq_course
cp -r /corral-repl/utexas/BioITeam/rnaseq_course_2015/day_3/cuffdiff_results .
IF YOU HAVE ALREADY COPIED IT OVER
cds
cd my_rnaseq_course
cd day_3/cuffdiff_results
#We need the cuffdiff results because that is the input to cummeRbund. |
Introduction
CummeRbund is powerful package with many different functions. Cuffdiff produces a lot of output files and all the output files are related by sets by IDs.
| Code Block | ||
|---|---|---|
| ||
#If you have a local copy:
ls -l cuffdiff_results
-rwxr-x--- 1 daras G-801020 2691192 Aug 21 12:20 isoform_exp.diff : Differential expression testing for transcripts
-rwxr-x--- 1 daras G-801020 1483520 Aug 21 12:20 gene_exp.diff : Differential expression testing for genes
-rwxr-x--- 1 daras G-801020 1729831 Aug 21 12:20 tss_group_exp.diff: Differential expression testing for primary transcripts
-rwxr-x--- 1 daras G-801020 1369451 Aug 21 12:20 cds_exp.diff : Differential expression testing for coding sequences
-rwxr-x--- 1 daras G-801020 3277177 Aug 21 12:20 isoforms.fpkm_tracking
-rwxr-x--- 1 daras G-801020 1628659 Aug 21 12:20 genes.fpkm_tracking
-rwxr-x--- 1 daras G-801020 1885773 Aug 21 12:20 tss_groups.fpkm_tracking
-rwxr-x--- 1 daras G-801020 1477492 Aug 21 12:20 cds.fpkm_tracking
-rwxr-x--- 1 daras G-801020 1349574 Aug 21 12:20 splicing.diff : Differential splicing tests
-rwxr-x--- 1 daras G-801020 1158560 Aug 21 12:20 promoters.diff : Differential promoter usage
-rwxr-x--- 1 daras G-801020 919690 Aug 21 12:20 cds.diff : Differential coding output |
...
Figures from http://compbio.mit.edu/cummeRbund/manual_2_0.html
...
MAKE SURE YOU ARE IN THE IDEV SESSION FOR THIS!
...
| title | Get set up |
|---|
...
A) First load R and enter R environment
| Code Block |
|---|
#module load R will not work for this R package because its needs the latest version of R. So, we'll load a local version. /work/01184/daras/bin/R |
B) DON'T RUN THIS-IT HAS ALREADY BEEN INSTALLED! Within R environment, install package cummeRbundthis is how cummeRbund was installed.
| Code Block |
|---|
#DON'T RUN THIS-IT HAS ALREADY BEEN INSTALLED!
source("http://bioconductor.org/biocLite.R")
biocLite("cummeRbund")
|
...
| Code Block |
|---|
library(cummeRbund)
cuff_data <- readCufflinks('cuffdiffdiff_resultsout')
|
D) Use cummeRbund to visualize the differential expression results.
...
| Code Block | ||
|---|---|---|
| ||
#Below command is equivalent to looking at the gene_exp.diff file that we spent a lot of time parsing yesterday gene_diff_data <- diffData(genes(cuff_data)) #Do gene_diff_data followed by tab to see all the variables in this data object sig_gene_data <- subset(gene_diff_data, (significant == 'yes')) #Count how many we have nrow(sig_gene_data) #Below command is equivalent to looking at the isoform_exp.diff file isoform_diff_data <-diffData(isoforms(cuff_data)) sig_isoform_data <- subset(isoform_diff_data, (significant == 'yes')) #Count how many we have nrow(sig_isoform_data) Why do we have so many? Yesterday we saw a lot less. |
...
Exercise 2b: Pull out from your data, genes that are significantly differentially expressed and above log2 fold change of 1
| Code Block | ||
|---|---|---|
| ||
#Below command is equivalent to looking at the gene_exp.diff file that we spent a lot of time parsing yesterday gene_diff_data <- diffData(genes(cuff_data)) #Do gene_diff_data followed by tab to see all the variables in this data object sig_gene_data <- subset(gene_diff_data, (significant == 'yes')) up_gene_data <- subset(sig_gene_data, (log2_fold_change > 1)) #How many nrow(up_gene_data) #Pause here to consider a mistake I made yesterday down_gene_data <- subset(sig_gene_data, (log2_fold_change < -1)) nrow(up_gene_data) |
...