All these steps have already been run. We'll be spending time looking at the commands and output. Let's get set up.
Get to the data and results
cd /corral-repl/utexas/BioITeam/rnaseq_course/cufflinks_exercise
Step 1: Run tophat2
We've already gone over how this is done and looked over the results. Let's move on to step 2.
Step 2: Run cufflinks
cufflinks [options] <hits.bam>
Some of the important options:
-p/--num-threads
-G/--GTF
-g/--GTF-guide
-b/--frag-bias-correct
-u/--multi-read-correct
Look at $BI/rnaseq_course/cufflinks_exercise/run_commands/commands.cufflinks to see how it was run.
Take a minute to look at the output files produced by one cufflinks run.
The important file is transcripts.gtf, which contains the transcripts Cufflinks assembled for C1_R1.
cd $BI/rnaseq_course/cufflinks_exercise/result/C1_R1_clout
ls -l
drwxrwxr-x 2 nsabell G-801021 32768 May 22 15:10 cuffcmp
-rwxr-xr-x 1 daras G-803889 14M Aug 16 12:49 transcripts.gtf
-rwxr-xr-x 1 daras G-803889 597K Aug 16 12:49 genes.fpkm_tracking
-rwxr-xr-x 1 daras G-803889 960K Aug 16 12:49 isoforms.fpkm_tracking
-rwxr-xr-x 1 daras G-803889 0 Aug 16 12:33 skipped.gtf
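One quick way to get a feel for a GTF file is to count how many records of each feature type it holds. The sketch below is not part of the course commands; the function name count_features and the tiny demo file are made up for illustration, and it assumes the feature type sits in column 3 of a tab-separated GTF.

```shell
# Count the records of each feature type (GTF column 3) in the file $1.
# Prints "<feature><TAB><count>", one per line, sorted by feature name.
# Comment lines and lines with fewer than 9 tab-separated fields are skipped.
count_features() {
    awk -F'\t' '!/^#/ && NF >= 9 { n[$3]++ }
                END { for (f in n) printf "%s\t%d\n", f, n[f] }' "$1" | sort
}

# Made-up GTF with two transcripts and three exons (illustration only, not course data).
demo_gtf=$(mktemp)
{
    printf 'chr1\tCufflinks\ttranscript\t100\t900\t1000\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1";\n'
    printf 'chr1\tCufflinks\texon\t100\t400\t1000\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "1";\n'
    printf 'chr1\tCufflinks\texon\t600\t900\t1000\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "2";\n'
    printf 'chr2\tCufflinks\ttranscript\t50\t300\t1000\t-\t.\tgene_id "CUFF.2"; transcript_id "CUFF.2.1";\n'
    printf 'chr2\tCufflinks\texon\t50\t300\t1000\t-\t.\tgene_id "CUFF.2"; transcript_id "CUFF.2.1"; exon_number "1";\n'
} > "$demo_gtf"

count_features "$demo_gtf"
```

On the real data, the same function could be pointed at transcripts.gtf from inside C1_R1_clout.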
Step 3: Merging assemblies using cuffmerge
We first create a file listing the paths of all per-sample transcripts.gtf files so far, then pass that to cuffmerge:
find . -name transcripts.gtf > assembly_list.txt
cuffmerge <assembly_list.txt>
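Before handing the list to cuffmerge, it can be worth confirming that every listed path exists and is non-empty. The following is a minimal sketch of that check, not the way the course ran it. The helper names make_assembly_list and check_assembly_list are hypothetical, and the demo directory only imitates the per-sample layout (C2_R1_clout is an assumed sample name).

```shell
# Write the sorted paths of every transcripts.gtf under directory $1 to file $2.
make_assembly_list() {
    find "$1" -name transcripts.gtf | sort > "$2"
}

# Succeed only if every path listed in file $1 exists and is non-empty.
# Runs in a subshell, so its variables do not leak into the caller.
check_assembly_list() (
    status=0
    while IFS= read -r path; do
        if [ ! -s "$path" ]; then
            echo "missing or empty: $path" >&2
            status=1
        fi
    done < "$1"
    exit "$status"
)

# Throwaway directory imitating two per-sample cufflinks output folders.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/C1_R1_clout" "$demo_dir/C2_R1_clout"
echo 'placeholder' > "$demo_dir/C1_R1_clout/transcripts.gtf"
echo 'placeholder' > "$demo_dir/C2_R1_clout/transcripts.gtf"

make_assembly_list "$demo_dir" "$demo_dir/assembly_list.txt"
check_assembly_list "$demo_dir/assembly_list.txt" && echo "assembly list OK"
cat "$demo_dir/assembly_list.txt"
# With the list verified, the merge step itself would be: cuffmerge assembly_list.txt
```

Note that the cuffmerge output directory also holds a file named transcripts.gtf, so rerunning the find from a directory above merged_asm would add that file to the list.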
Take a minute to look at the output files produced by cuffmerge. The most important file is merged.gtf, which contains the consensus transcriptome annotations cuffmerge has calculated.
cd $BI/rnaseq_course/cufflinks_exercise/merged_asm
ls -l
-rwxrwxr-x 1 daras G-803889 1571816 Aug 16 2012 genes.fpkm_tracking
-rwxrwxr-x 1 daras G-803889 2281319 Aug 16 2012 isoforms.fpkm_tracking
drwxrwxr-x 2 daras G-803889 32768 Aug 16 2012 logs
-r-xrwxr-x 1 daras G-803889 32090408 Aug 16 2012 merged.gtf
-rwxrwxr-x 1 daras G-803889 0 Aug 16 2012 skipped.gtf
drwxrwxr-x 2 daras G-803889 32768 Aug 16 2012 tmp
-rwxrwxr-x 1 daras G-803889 34844830 Aug 16 2012 transcripts.gtf
Step 4: Finding differentially expressed genes and isoforms using cuffdiff
cuffdiff [options] <merged.gtf> <sample1_rep1.bam,sample1_rep2.bam> <sample2_rep1.bam,sample2_rep2.bam>
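In the usage line above, replicates of one condition are joined with commas, while the two conditions are separated by a space. The sketch below shows one way to assemble such a command line. The BAM file names, the output directory diff_out, the thread count and the condition labels are all assumptions for illustration, and the command is only printed, not run.

```shell
# Join all arguments into one comma-separated string.
# The body runs in a subshell so the caller's IFS is left untouched.
join_by_comma() (
    IFS=,
    echo "$*"
)

# Hypothetical replicate BAM files for two conditions, C1 and C2.
c1=$(join_by_comma C1_R1.bam C1_R2.bam)
c2=$(join_by_comma C2_R1.bam C2_R2.bam)

# -o sets the output directory, -p the number of threads, -L the condition labels.
cmd="cuffdiff -o diff_out -p 4 -L C1,C2 merged.gtf $c1 $c2"
echo "$cmd"
```

Printing the command first makes it easy to check that the replicates ended up grouped under the intended condition before starting a long run.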
Exercise: What does cuffdiff -b do?