Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

In this lab, we'll look at how to use cummeRbund to visualize our gene expression results from cuffdiff.  CummeRbund is part of the tuxedo pipeline and it is an R package that is capable of plotting the results generated by cuffdiff.

Image RemovedImage Added

Figure from: Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks, Trapnell et al, Nature Protocols, 2013. 

Get your data

Code Block
titleGet set up
#IF YOU HAVE NOT COPIED THIS OVER ALREADY
cds
cd my_rnaseq_course
cp -r /corral-repl/utexas/BioITeam/rnaseq_course_2015/day_3_partB/cuffdiff_results .

IF YOU HAVE ALREADY COPIED IT OVER
cds
cd my_rnaseq_course
cd day_3_partB/cuffdiff_results
 

#We need the cuffdiff results because that is the input to cummeRbund.

Introduction

CummeRbund is powerful package with many different functions. Cuffdiff produces a lot of output files and all the output files are related by sets by IDs.

Code Block
titlecuffdiff output
#If you have a local copy:
ls
-l cuffdiff_results

-rwxr-x--- 1 daras G-801020  2691192 Aug 21 12:20 isoform_exp.diff  : Differential expression testing for transcripts
-rwxr-x--- 1 daras G-801020  1483520 Aug 21 12:20 gene_exp.diff     : Differential expression testing for genes
-rwxr-x--- 1 daras G-801020  1729831 Aug 21 12:20 tss_group_exp.diff: Differential expression testing for primary transcripts
-rwxr-x--- 1 daras G-801020  1369451 Aug 21 12:20 cds_exp.diff      : Differential expression testing for coding sequences
 
-rwxr-x--- 1 daras G-801020  3277177 Aug 21 12:20 isoforms.fpkm_tracking
-rwxr-x--- 1 daras G-801020  1628659 Aug 21 12:20 genes.fpkm_tracking
-rwxr-x--- 1 daras G-801020  1885773 Aug 21 12:20 tss_groups.fpkm_tracking
-rwxr-x--- 1 daras G-801020  1477492 Aug 21 12:20 cds.fpkm_tracking
 
-rwxr-x--- 1 daras G-801020  1349574 Aug 21 12:20 splicing.diff  : Differential splicing tests
-rwxr-x--- 1 daras G-801020  1158560 Aug 21 12:20 promoters.diff : Differential promoter usage
-rwxr-x--- 1 daras G-801020   919690 Aug 21 12:20 cds.diff       : Differential coding output

...

Figures from http://compbio.mit.edu/cummeRbund/manual_2_0.html

...


MAKE SURE YOU ARE IN THE IDEV SESSION FOR THIS!

...

titleGet set up

...

 

A) First load R and enter R environment

Code Block
#module load R will not work for this R package because its needs the latest version of R. So, we'll load a local version.

/work/01184/daras/bin/R

 

B) DON'T RUN THIS-IT HAS ALREADY BEEN INSTALLED! Within R environment, install package cummeRbundthis is how cummeRbund was installed. 

Code Block
#DON'T RUN THIS-IT HAS ALREADY BEEN INSTALLED!
source("http://bioconductor.org/biocLite.R")
biocLite("cummeRbund")

...

Code Block
library(cummeRbund)
cuff_data <- readCufflinks('cuffdiffdiff_resultsout')

 
 

 

D) Use cummeRbund to visualize the differential expression results.

...

Code Block
pdf(file="scatterplot.pdf")

csScatter(genes(cuff_data), 'C1', 'C2')

dev.off()

 
#How does this plot look? What is it telling us?

The resultant plot is here.


Exercise 2a:  Pull out from your data, significantly differentially expressed genes and isoforms.

Code Block
titleTo pull out significantly differentially expressed genes and isoforms
#Below command is equivalent to looking at the gene_exp.diff file that we spent a lot of time parsing yesterday
gene_diff_data <- diffData(genes(cuff_data))   
#Do gene_diff_data followed by tab to see all the variables in this data object
 
sig_gene_data  <- subset(gene_diff_data, (significant ==  'yes'))

#Count how many we have
nrow(sig_gene_data)

#Below command is equivalent to looking at the isoform_exp.diff file 
 
isoform_diff_data <-diffData(isoforms(cuff_data))
sig_isoform_data <- subset(isoform_diff_data, (significant == 'yes'))

#Count how many we have
nrow(sig_isoform_data)

 
Why do we have so many? Yesterday we saw
a
lot
less.

 

Exercise 2b:    Pull out from your data, genes that are significantly differentially expressed and above log2 fold change of 1

Code Block
titleTo pull out genes
#Below command is equivalent to looking at the gene_exp.diff file that we spent a lot of time parsing yesterday
gene_diff_data <- diffData(genes(cuff_data))   
#Do gene_diff_data followed by tab to see all the variables in this data object
 
sig_gene_data  <- subset(gene_diff_data, (significant ==  'yes'))
up_gene_data  <- subset(sig_gene_data, (log2_fold_change > 1))
#How many
nrow(up_gene_data)
 
#Pause here to consider a mistake I made yesterday

 
 
  
down_gene_data <- subset(sig_gene_data, (log2_fold_change < -1))
nrow(updown_gene_data)


Exercise 3: For a gene, regucalcin, plot gene and isoform level expression.

Code Block
titleTo plot gene level and isoform level expression for gene regucalcin
pdf(file="regucalcin.pdf")
mygene1 <- getGene(cuff_data,'regucalcin')
expressionBarplot(mygene1)
expressionBarplot(isoforms(mygene1))
dev.off()

The resultant plot is here.

 

Exercise 4: For a gene, Rala, plot gene and isoform level expression.

Code Block
titleTo plot gene level and isoform level expression for gene Rala
pdf(file="rala.pdf")
mygene2 <- getGene(cuff_data, 'Rala')
expressionBarplot(mygene2)
expressionBarplot(isoforms(mygene2))
dev.off()

The figure is here.

Take cummeRbund for a spin...

...

Expand
Solution
Solution
Code Block
titleR command to generate box plot of gene level fpkms
csBoxplot(genes(cuff_data))
Code Block
titleR command to generate box plot of isoform level fpkms
csBoxplot(isoforms(cuff_data))

The resultant graph is here.

 

Expand
Hint
Hint

Use csBoxplot function on cuff_data object to generate a boxplot of gene or isoform level fpkms.

The resultant plot is here.

Exercise 6: Visualize the significant vs non-significant genes using a volcano plot.

...

Expand
Hint
Hint

Use csVolcano function on cuff_data object to generate a volcano plot.

The resultant plot is here.

 

Exercise 7: MORE COMPLICATED! Generate a heatmap for genes that are upregulated and significant.

...

Expand
Hint
Hint

Use csHeatmap function on the up_gene_data data structure we created. But its a little tricky.

The resultant plot is here.