Overview

In this lab, we'll use cummeRbund to visualize the gene expression results from cuffdiff. CummeRbund is an R package in the Tuxedo pipeline that plots the results cuffdiff generates.

Figure from: Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks, Trapnell et al, Nature Protocols, 2013. 

What can you do with cummeRbund?

CummeRbund is a powerful package with many different functions. Cuffdiff produces a lot of output files, and these files are related to one another by sets of IDs. CummeRbund makes managing and querying this data easier by loading it into multiple objects of several different classes. Because all of this is stored in an SQLite database, you can access it quickly without loading everything into memory.

  •   readCufflinks: a function that reads all the output files cuffdiff generates into a single R data object (of class CuffSet).
    •  cuff_data <- readCufflinks(diff_out)
    • You can then access gene-level information in your differential output with genes(cuff_data), isoform-level output with isoforms(cuff_data), TSS-related groups with TSS(cuff_data), and so forth.
  • Global statistics on the data: quality, dispersion, distribution of gene expression values, etc.
  • Plots for specific features or genes.

  • Get your data

Get set up
cds
cd my_rnaseq_course
cp -r /corral-repl/utexas/BioITeam/rnaseq_course/diff_results .
cd diff_results
 
 

# We need the cuffdiff results because they are the input to cummeRbund.

 

A) First load R and enter R environment

# "module load R" will not work for this R package because it needs the latest version of R, so we'll load a local version.
$BI/bin/R

 
What is $BI here?

 

B) Within R environment, install package cummeRbund

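source("http://bioconductor.org/biocLite.R")
biocLite("cummeRbund")
# On R 3.5 or later, biocLite has been replaced by BiocManager:
# install.packages("BiocManager"); BiocManager::install("cummeRbund")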

 

C) Load cummeRbund library and read in the differential expression results.  If you save and exit the R environment and return, these commands must be executed again.

library(cummeRbund)
cuff_data <- readCufflinks('cuffdiff_results')
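If readCufflinks fails, a common cause is pointing it at the wrong directory. The sketch below is a small pre-flight check in base R. The helper name missing_cuffdiff_files is hypothetical (it is not part of cummeRbund), and the four file names are only a subset of what cuffdiff writes; the demo runs on a throwaway directory so it works without any real data.

```r
# Hypothetical helper (not part of cummeRbund): report which of four main
# cuffdiff output files are missing from a directory.
missing_cuffdiff_files <- function(dir) {
  expected <- c("genes.fpkm_tracking", "gene_exp.diff",
                "isoforms.fpkm_tracking", "isoform_exp.diff")
  expected[!file.exists(file.path(dir, expected))]
}

# Demo on a throwaway directory that holds only one of the four files.
demo_dir <- file.path(tempdir(), "cuffdiff_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, "gene_exp.diff")))

missing_demo <- missing_cuffdiff_files(demo_dir)
print(missing_demo)
```

In this lab, calling missing_cuffdiff_files('cuffdiff_results') from inside diff_results should return an empty character vector before you run readCufflinks.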

 

D) Use cummeRbund to visualize the differential expression results.

NOTE: Any graphic output is automatically saved to "Rplots.pdf", which causes problems when you want to create several plots with different names in the same session. To save each plot under its own name, precede any of the plotting commands below with the command:

pdf(file="myPlotName.pdf")

And after executing the necessary commands, add the line:

dev.off()
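The pdf() / dev.off() pattern is plain base R and works the same way for any plotting command. As a minimal sketch, the example below writes two unrelated plots to two separately named files; the file names and the plotted numbers are placeholders, and the files go to a temporary directory rather than your working directory.

```r
# Write two plots to two separately named PDF files.
# File names and plotted data are placeholders for illustration.
out_dir    <- tempdir()
first_pdf  <- file.path(out_dir, "first_plot.pdf")
second_pdf <- file.path(out_dir, "second_plot.pdf")

pdf(file = first_pdf)               # open a PDF device; plots now go to this file
plot(1:10, (1:10)^2, main = "First plot")
invisible(dev.off())                # close the device so the file is complete

pdf(file = second_pdf)              # a new device, so a new file
hist(c(1, 2, 2, 3, 3, 3, 4, 4, 5), main = "Second plot")
invisible(dev.off())

print(file.exists(c(first_pdf, second_pdf)))
```

Every plotting command issued between pdf() and dev.off() lands in the same file, one plot per page, which is why the exercises below wrap each group of plots in its own pair.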

Exercise 1: Generate a scatterplot comparing genes across conditions C1 vs C2.

pdf(file="scatterplot.pdf")

csScatter(genes(cuff_data), 'C1', 'C2')

dev.off()

The resultant plot is here.


Exercise 2: Pull out the significantly differentially expressed genes and isoforms from your data.

To pull out significantly differentially expressed genes and isoforms
gene_diff_data <- diffData(genes(cuff_data))
sig_gene_data <- subset(gene_diff_data, significant == 'yes')

# Count how many we have
nrow(sig_gene_data)

isoform_diff_data <- diffData(isoforms(cuff_data))
sig_isoform_data <- subset(isoform_diff_data, significant == 'yes')

#Count how many we have
nrow(sig_isoform_data)
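Because diffData returns an ordinary data frame, the filtering above is standard base R and can be extended in the usual ways. The sketch below uses a small mock table in place of real cuffdiff output: the gene IDs and numbers are invented, and only a few of the columns are reproduced, on the assumption that the real table has columns named log2_fold_change, q_value and significant.

```r
# Mock table standing in for diffData(genes(cuff_data)); all values are invented.
mock_diff <- data.frame(
  gene_id          = c("geneA", "geneB", "geneC", "geneD"),
  log2_fold_change = c(2.1, -0.3, -1.8, 0.1),
  q_value          = c(0.001, 0.60, 0.02, 0.95),
  significant      = c("yes", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

# Keep only rows flagged as significant, as in the exercise.
sig <- subset(mock_diff, significant == "yes")

# Rank the significant genes from smallest to largest q-value.
sig <- sig[order(sig$q_value), ]
n_sig <- nrow(sig)

# Further restrict to genes with a positive log2 fold change.
up <- subset(sig, log2_fold_change > 0)

print(sig)
print(up$gene_id)
```

The same ordering and subsetting steps should apply unchanged to sig_gene_data and sig_isoform_data.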

 

Exercise 3: Plot gene-level and isoform-level expression for the gene regucalcin.

To plot gene level and isoform level expression for gene regucalcin
pdf(file="regucalcin.pdf")
mygene1 <- getGene(cuff_data,'regucalcin')
expressionBarplot(mygene1)
expressionBarplot(isoforms(mygene1))
dev.off()

The resultant plot is here.

 

Exercise 4: Plot gene-level and isoform-level expression for the gene Rala.

To plot gene level and isoform level expression for gene Rala
pdf(file="rala.pdf")
mygene2 <- getGene(cuff_data, 'Rala')
expressionBarplot(mygene2)
expressionBarplot(isoforms(mygene2))
dev.off()
  • Take cummeRbund for a spin...

CummeRbund is a powerful package with many different functions, and the exercises above illustrate only a few of them. Try any of the suggested exercises below to explore the differential expression results further with other cummeRbund functions.

If you would rather just look at the resulting graphs, a link to each graph is given below the commands.

You can refer to the cummeRbund manual for more help and remember that ?functionName will provide syntax information for different functions.
http://compbio.mit.edu/cummeRbund/manual.html

 

Exercise 5: Visualize the distribution of FPKM values across the two conditions using a boxplot.

 Solution
R command to generate a boxplot of gene-level FPKM values
csBoxplot(genes(cuff_data))
R command to generate a boxplot of isoform-level FPKM values
csBoxplot(isoforms(cuff_data))
 Hint

Use the csBoxplot function on the cuff_data object to generate a boxplot of gene-level or isoform-level FPKM values.

The resultant plot is here.

Exercise 6: Visualize the significant vs non-significant genes using a volcano plot.

 Solution

csVolcano(genes(cuff_data), "C1", "C2")

 Hint

Use the csVolcano function on the cuff_data object to generate a volcano plot.

The resultant plot is here.
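To make clear what a volcano plot displays, here is a base R sketch of the general idea: log2 fold change on the x axis against -log10 of the p-value on the y axis, with significant genes highlighted. This is not how csVolcano is implemented, and the data frame below is invented for illustration; only the column names are assumed to mirror cuffdiff's output.

```r
# Invented values standing in for cuffdiff's log2_fold_change, p_value
# and significant columns.
mock <- data.frame(
  log2_fold_change = c(-3.0, -1.2, -0.2, 0.1, 0.9, 2.5),
  p_value          = c(1e-6, 0.04, 0.70, 0.90, 0.20, 1e-4),
  significant      = c("yes", "no", "no", "no", "no", "yes"),
  stringsAsFactors = FALSE
)

# Small p-values become large positive numbers on the y axis.
mock$neg_log10_p <- -log10(mock$p_value)

volcano_pdf <- file.path(tempdir(), "mock_volcano.pdf")
pdf(file = volcano_pdf)
plot(mock$log2_fold_change, mock$neg_log10_p,
     col  = ifelse(mock$significant == "yes", "red", "black"),
     pch  = 19,
     xlab = "log2 fold change",
     ylab = "-log10(p value)",
     main = "Mock volcano plot")
invisible(dev.off())
```

Genes with large expression changes and small p-values end up in the upper left and upper right corners, which is where the significant genes cluster in the real plot.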

 
