All these steps have already been run. We'll be spending time looking at the commands and output. Let's get set up.
Get to the results
Get set up
cds
cd my_rnaseq_course
cp -r /corral-repl/utexas/BioITeam/rnaseq_course/cufflinks_results .
Step 1: Tophat
We've already gone over how tophat results look here so let's move on to step 2.
Step 2: Cufflinks
HOW WAS IT RUN?
HOW DOES THE OUTPUT LOOK?
Take a look at the output for one of our samples, C1_R1. The important file is transcripts.gtf, which contains Cufflinks' assembled transcripts for C1_R1.
Cufflinks output files
#If you have a local copy:
ls -l C1_R1_clout
#If you don't have a local copy:
ls -l /corral-repl/utexas/BioITeam/rnaseq_course/cufflinks_results/C1_R1_clout

-rw------- 1 daras G-801020   627673 May 17 16:58 genes.fpkm_tracking
-rw------- 1 daras G-801020  1021025 May 17 16:58 isoforms.fpkm_tracking
-rw------- 1 daras G-801020        0 May 17 16:50 skipped.gtf
-rw------- 1 daras G-801020 14784740 May 17 16:58 transcripts.gtf
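A quick way to get a feel for transcripts.gtf is to tally the feature types in its third column. The sketch below is not part of the course materials: count_gtf_features is a hypothetical helper name, and it assumes only the standard nine-column, tab-separated GTF layout.

```shell
# Hypothetical helper (not from the course): tally the feature types found in
# column 3 of a GTF file, for example "transcript" and "exon".
count_gtf_features() {
    local gtf="$1"
    # Skip comment lines and malformed rows, then count each column-3 value.
    awk -F'\t' '!/^#/ && NF >= 9 { n[$3]++ }
                END { for (f in n) print f "\t" n[f] }' "$gtf" | sort
}

# Usage, assuming a local copy of the results:
# count_gtf_features C1_R1_clout/transcripts.gtf
```

Each output line is a feature type followed by how many rows of that type the file holds, which gives a rough idea of how many transcripts and exons were assembled for the sample.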
Step 3: Cuffmerge
HOW WAS IT RUN?
We first created a file listing the paths of all the per-sample transcripts.gtf files, then passed that file to cuffmerge:
How did we do that?
find . -name transcripts.gtf > assembly_list.txt
#If you have a local copy:
cat assembly_list.txt
#If you don't have a local copy:
cat /corral-repl/utexas/BioITeam/rnaseq_course/cufflinks_results/assembly_list.txt
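Because find writes paths relative to the directory it was run from, the list is only valid from that same directory. As a rough sanity check before a merge, something like the following could confirm that every listed file exists and is non-empty. check_assembly_list is a hypothetical helper name, not something the course provides.

```shell
# Hypothetical helper (not from the course): report any path in the list
# that is missing or empty. Returns non-zero if at least one path is bad.
check_assembly_list() {
    local list="$1" bad=0 path
    # The "|| [ -n ... ]" keeps a final line that lacks a trailing newline.
    while IFS= read -r path || [ -n "$path" ]; do
        [ -n "$path" ] || continue        # ignore blank lines
        if [ ! -s "$path" ]; then         # -s: file exists and has size > 0
            echo "missing or empty: $path"
            bad=1
        fi
    done < "$list"
    return "$bad"
}

# Usage, run from the directory where assembly_list.txt was created:
# check_assembly_list assembly_list.txt && echo "all assemblies found"
```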
commands.cuffmerge file
cat run_commands/commands.cuffmerge
cuffmerge -g reference/genes.exons.gtf assembly_list.txt
HOW DOES THE OUTPUT LOOK?
The most important file is merged.gtf, which contains the consensus transcriptome annotation that cuffmerge calculated.
cuffmerge output
#If you have a local copy:
ls -l merged_asm
#If you don't have a local copy:
ls -l /corral-repl/utexas/BioITeam/rnaseq_course/cufflinks_results/merged_asm

-rwxrwxr-x 1 daras G-803889  1571816 Aug 16 2012 genes.fpkm_tracking
-rwxrwxr-x 1 daras G-803889  2281319 Aug 16 2012 isoforms.fpkm_tracking
drwxrwxr-x 2 daras G-803889    32768 Aug 16 2012 logs
-r-xrwxr-x 1 daras G-803889 32090408 Aug 16 2012 merged.gtf
-rwxrwxr-x 1 daras G-803889        0 Aug 16 2012 skipped.gtf
drwxrwxr-x 2 daras G-803889    32768 Aug 16 2012 tmp
-rwxrwxr-x 1 daras G-803889 34844830 Aug 16 2012 transcripts.gtf
Step 4: Cuffdiff
HOW WAS IT RUN?
commands.cuffdiff file
#If you have a local copy:
cat run_commands/commands.cuffdiff
#If you don't have a local copy:
cat /corral-repl/utexas/BioITeam/rnaseq_course/cufflinks_results/commands.cuffdiff

cuffdiff -o diff_out -b reference/genome.fa -p 8 -L C1,C2 -u merged_asm/merged.gtf C1_R1_thout/accepted_hits.bam,C1_R2_thout/accepted_hits.bam,C1_R3_thout/accepted_hits.bam C2_R1_thout/accepted_hits.bam,C2_R2_thout/accepted_hits.bam,C2_R3_thout/accepted_hits.bam
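In the cuffdiff command, replicates of the same condition are joined with commas, the two conditions are separated by a space, and the labels given to -L follow the same order. The sketch below shows one way such a replicate list could be assembled. replicate_bams is a hypothetical helper, and it assumes the CONDITION_R<n>_thout/accepted_hits.bam naming used in this exercise.

```shell
# Hypothetical helper (not from the course): build the comma-separated BAM
# list for one condition, assuming the <cond>_R<n>_thout naming used here.
replicate_bams() {
    local cond="$1" n="$2" i=1 list=""
    while [ "$i" -le "$n" ]; do
        # Prefix a comma only when the list already has an entry.
        list="${list:+$list,}${cond}_R${i}_thout/accepted_hits.bam"
        i=$((i + 1))
    done
    echo "$list"
}

# Usage: print the two positional arguments that end the cuffdiff command.
# echo "$(replicate_bams C1 3)" "$(replicate_bams C2 3)"
```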
cuffdiff output
#If you have a local copy:
ls -l diff_out
#If you don't have a local copy:
ls -l /corral-repl/utexas/BioITeam/rnaseq_course/cufflinks_results/diff_out

-rwxr-x--- 1 daras G-801020 2691192 Aug 21 12:20 isoform_exp.diff : Differential expression testing for transcripts
-rwxr-x--- 1 daras G-801020 1483520 Aug 21 12:20 gene_exp.diff : Differential expression testing for genes
-rwxr-x--- 1 daras G-801020 1729831 Aug 21 12:20 tss_group_exp.diff : Differential expression testing for primary transcripts
-rwxr-x--- 1 daras G-801020 1369451 Aug 21 12:20 cds_exp.diff : Differential expression testing for coding sequences
-rwxr-x--- 1 daras G-801020 3277177 Aug 21 12:20 isoforms.fpkm_tracking
-rwxr-x--- 1 daras G-801020 1628659 Aug 21 12:20 genes.fpkm_tracking
-rwxr-x--- 1 daras G-801020 1885773 Aug 21 12:20 tss_groups.fpkm_tracking
-rwxr-x--- 1 daras G-801020 1477492 Aug 21 12:20 cds.fpkm_tracking
-rwxr-x--- 1 daras G-801020 1349574 Aug 21 12:20 splicing.diff : Differential splicing tests
-rwxr-x--- 1 daras G-801020 1158560 Aug 21 12:20 promoters.diff : Differential promoter usage
-rwxr-x--- 1 daras G-801020  919690 Aug 21 12:20 cds.diff : Differential coding output
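A common first look at gene_exp.diff is to pull out the rows that cuffdiff flagged as significant. The sketch below is an illustration, not the course's own method: significant_rows is a hypothetical helper, and it assumes the file is tab-separated with a header row containing a column named significant whose values are yes or no. The column is located by name rather than by position in case the layout differs between versions.

```shell
# Hypothetical helper (not from the course): print the header plus every row
# whose "significant" column is "yes" in a tab-separated *.diff file.
significant_rows() {
    local diff_file="$1"
    awk -F'\t' '
        NR == 1 {
            # Find the column by name instead of hard-coding its position.
            for (i = 1; i <= NF; i++) if ($i == "significant") col = i
            if (!col) { print "no significant column found" > "/dev/stderr"; exit 1 }
            print; next
        }
        $col == "yes"
    ' "$diff_file"
}

# Usage, assuming a local copy of the results:
# significant_rows diff_out/gene_exp.diff | head
```

The same helper should work on isoform_exp.diff, tss_group_exp.diff and cds_exp.diff if they share that header column.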