Objectives
In this lab, you will explore a popular transcriptome-aware mapper called Tophat. Simulated RNA-seq data will be provided to you; the data consists of paired-end reads that have been generated in silico to replicate real gene count data from Drosophila. The data simulates two biological groups with three biological replicates per group (6 samples total). The objectives of this lab are to:
...
- c1_r1, c1_r2, c1_r3 from the first biological condition
- c2_r1, c2_r2, and c2_r3 from the second biological condition
Introduction
Tophat is part of the Tuxedo suite of RNA-Seq tools. Tophat performs a transcriptome-aware alignment of the input reads to a reference genome using either the Bowtie or Bowtie2 aligner (in theory it can use other aligners, but we do not recommend this).
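As a rough sketch of how the six samples could be handed to Tophat, the Python snippet below only builds the command lines and prints them; it does not run anything. The index prefix dmel_index, the annotation file genes.gtf, the reads/ directory, the tophat_out/ directory, and the _1.fq/_2.fq file naming are all assumptions made for illustration and do not come from the lab materials; substitute the paths used in the exercises. The -p (threads), -G (annotation GTF), and -o (output directory) options are standard tophat options.

```python
# All paths below are hypothetical placeholders, not names from the lab data.
BOWTIE_INDEX = "dmel_index"   # assumed Bowtie2 index prefix
ANNOTATION = "genes.gtf"      # assumed gene annotation file
READ_DIR = "reads"            # assumed directory holding the FASTQ files


def sample_names(conditions=2, replicates=3):
    """Return the sample names c1_r1 ... c2_r3 for the two-group design."""
    return [
        f"c{c}_r{r}"
        for c in range(1, conditions + 1)
        for r in range(1, replicates + 1)
    ]


def tophat_command(sample, threads=4):
    """Build (but do not run) a tophat command for one paired-end sample."""
    return [
        "tophat",
        "-p", str(threads),             # number of alignment threads
        "-G", ANNOTATION,               # known transcripts guide the alignment
        "-o", f"tophat_out/{sample}",   # one output directory per sample
        BOWTIE_INDEX,
        f"{READ_DIR}/{sample}_1.fq",    # assumed name of the first-mate file
        f"{READ_DIR}/{sample}_2.fq",    # assumed name of the second-mate file
    ]


if __name__ == "__main__":
    for name in sample_names():
        print(" ".join(tophat_command(name)))
```

Keeping each command as a list of arguments means it could later be passed to subprocess.run unchanged, once the placeholder paths have been replaced with real ones.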
How Tophat Works
Image from: http://genomebiology.com/2013/14/4/R36
...
More documentation on tophat2 can be found here: http://tophat.cbcb.umd.edu/manual.shtml
Why is splice-aware (split) alignment important?
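A toy illustration of the problem, using exact string matching in place of a real aligner: a read sequenced from spliced mRNA that spans an exon-exon junction has no contiguous match in the genome, because the intron separates its two halves. The sequences and the function names contiguous_hit and split_hit are made up for this sketch, and the search shown here is a simplification, not how Tophat is implemented internally.

```python
# Made-up sequences for illustration only.
EXON1 = "ATGGCCATTGTAATGGGCCGC"
INTRON = "GTAAGTCTTTCCCTTTTTTTCAG"   # begins with GT and ends with AG
EXON2 = "TGAAAGGGTGCCCGATAGTTGA"

GENOME = EXON1 + INTRON + EXON2      # the intron is present in genomic DNA
TRANSCRIPT = EXON1 + EXON2           # the intron is removed by splicing

# A 20-base read covering the junction: 10 bases from each exon.
READ = EXON1[-10:] + EXON2[:10]


def contiguous_hit(read, reference):
    """Return the start of an exact, unspliced match, or None if absent."""
    pos = reference.find(read)
    return pos if pos >= 0 else None


def split_hit(read, genome, min_anchor=5):
    """Try each split point of the read.

    Returns (left_start, right_start, gap_length) for the first split where
    both pieces match exactly and the right piece lies downstream of the
    left piece, or None if no such split exists. Only the first occurrence
    of the left piece is considered, which is enough for this toy genome.
    """
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:cut], read[cut:]
        left_start = genome.find(left)
        if left_start < 0:
            continue
        left_end = left_start + len(left)
        # Require at least one skipped base between the two pieces.
        right_start = genome.find(right, left_end + 1)
        if right_start >= 0:
            return left_start, right_start, right_start - left_end
    return None


if __name__ == "__main__":
    print("unspliced match in genome:    ", contiguous_hit(READ, GENOME))
    print("unspliced match in transcript:", contiguous_hit(READ, TRANSCRIPT))
    print("split match in genome:        ", split_hit(READ, GENOME))
```

In this toy case the read matches the transcript as one piece but not the genome; splitting it recovers an anchor in each exon, and the gap between the anchors equals the length of the intron. A mapper that cannot split reads would leave such junction-spanning reads unaligned or misaligned.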
Now on to our tophat exercises.