Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

 

Code Block
titlecommands.cuffdiff file
#If you have a local copy:

 
cuffdiff -o diff_out -b reference/genome.fa -p 8 -L C1,C2 -u merged_asm/merged.gtf C1_R1_thout/accepted_hits.bam,C1_R2_thout/accepted_hits.bam,C1_R3_thout/accepted_hits.bam C2_R1_thout/accepted_hits.bam,C2_R2_thout/accepted_hits.bam,C2_R3_thout/accepted_hits.bam


HOW DOES THE OUTPUT LOOK?
Code Block
titlecuffdiff output
#If you have a local copy:
ls -l diff_out

-rwxr-x--- 1 daras G-801020  2691192 Aug 21 12:20 isoform_exp.diff  : Differential expression testing for transcripts
-rwxr-x--- 1 daras G-801020  1483520 Aug 21 12:20 gene_exp.diff     : Differential expression testing for genes
-rwxr-x--- 1 daras G-801020  1729831 Aug 21 12:20 tss_group_exp.diff: Differential expression testing for primary transcripts
-rwxr-x--- 1 daras G-801020  1369451 Aug 21 12:20 cds_exp.diff      : Differential expression testing for coding sequences
 
-rwxr-x--- 1 daras G-801020  3277177 Aug 21 12:20 isoforms.fpkm_tracking
-rwxr-x--- 1 daras G-801020  1628659 Aug 21 12:20 genes.fpkm_tracking
-rwxr-x--- 1 daras G-801020  1885773 Aug 21 12:20 tss_groups.fpkm_tracking
-rwxr-x--- 1 daras G-801020  1477492 Aug 21 12:20 cds.fpkm_tracking
 
-rwxr-x--- 1 daras G-801020  1349574 Aug 21 12:20 splicing.diff  : Differential splicing tests
-rwxr-x--- 1 daras G-801020  1158560 Aug 21 12:20 promoters.diff : Differential promoter usage
-rwxr-x--- 1 daras G-801020   919690 Aug 21 12:20 cds.diff       : Differential coding output


PARSING CUFFDIFF OUTPUT

Exercise 3: What is the status column in our gene_exp.diff files? and what values does it have?

Code Block
titlestatus
cat diff_out/gene_exp.diff|cut -f 7|sort|uniq

Here is a basic command useful for parsing/sorting the gene_exp.diff or isoform_exp.diff files:

Code Block
titleLinux one-liner for sorting cuffdiff output by log2 fold-change values
cat diff_out/isoform_exp.diff | awk '{print $10 "\t" $4}' | sort -n -r | head

Exercise 3: Find the 10 most up-regulated genes, by fold change that are classified as significantly changed. 

Code Block
titleOne-line command to get 10 most up regulated genes
cat diff_out/gene_exp.diff |grep 'yes'|sort -k10nr,10|head

 
#Lets pull out just gene names
cat diff_out/gene_exp.diff |grep 'yes'|sort -k10nr,10|head|cut -f 3
Expand
Solution
Solution

Top 10 upregulated genes

scf

crc

KdelR

Nacalpha

CG8979

CG4389

Vha68-2

regucalcin

CG3835

CG1746


Exercise 4: Find the 10 most down-regulated genes, by fold change that are classified as significantly changed. 

Code Block
titleOne-line command to get 10 most down regulated genes
cat diff_out/gene_exp.diff |grep 'yes'|sort -k10n,10|head|cut -f 3
Expand
Solution
Solution

Top 10 downregulated genes

CG17065

Git

CG18519

del

CG13319

RNaseX25

msb1l

achi

CG15611

CG11309


 

Exercise 5: Find the 10 most up-regulated isoforms, by fold change that are classified as significantly changed. What genes do they belong to?

Code Block
titleOne-line command to get 10 most up-regulated isoforms
cat diff_out/isoform_exp.diff |grep 'yes'|sort -k10nr,10|head

 
#To pull out gene names:
cat diff_out/isoform_exp.diff |grep 'yes'|sort -k10nr,10|head 
Expand
Solution
Solution

sick

mub

CG5021

akirin

CG4587

c(3)G

Ank2

CG1674

CrebB-17A

stai

Exercise 6: Pull out and count the significantly differentially expressed genes do we have (pvalue <=0.05 and abs fold change > 2 or log2 fold change > 1)?

Code Block
titleDEG list
cat diff_out/gene_exp.diff |grep 'yes' |awk '{if (($10 >= 1)||($10<=1)) print }' > DEG

wc -l DEG