Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

File Name

Description

Sample

SRR034450.fastq

Single-end Illumina 36-bp reads

wild-type, biological replicate 1

SRR034451.fastq

Single-end Illumina 36-bp reads

Δ ?sigB mutant, biological replicate 1

SRR034452.fastq

Single-end Illumina 36-bp reads

wild-type, biological replicate 2

SRR034453.fastq

Single-end Illumina 36-bp reads

Δ ?sigB mutant, biological replicate 2

NC_017544.1.fasta

Reference Genome sequence (FASTA)

Listeria monocytogenes strain 10403S

NC_017544.1.gff

Reference Genome features (GFF)

Listeria monocytogenes strain 10403S

...

The multicov command takes a feature file (GFF) and counts how many reads are in certain regions from many input files. By default it counts how many reads overlap the feature on either strand, but it can be made specific with an option. Note: Remember that the chromosome names in your gff file should match the way the chromosomes are named in the reference fasta file used in the mapping step. For example, if gff file contains chr1, chrX etc, the GFF file must also call the chromosomes as chr1, chrX and so on.

Our GFF file has a lot of redundant features that describe a gene multiple times, so we are going to trim it just to have "gene" features using grep.

...

Code Block
titleUsing DESeq
login1$ R
...
> library("DESeq")
> counts = read.delim("gene_counts.tab", header=F, row.names=1)
> head(counts)
> colnames(counts) = c("wt1", "mut1", "wt2", "mut2")
> head(counts)
> my.design <- data.frame(
  row.names = colnames( counts ),
  condition = c( "wt", "mut", "wt", "mut"),
  libType = c( "single-end", "single-end", "single-end", "single-end" ) 
)
> conds <- factor(my.design$condition)

> cds <- newCountDataSet( counts, conds )
> cds

> cds <- estimateSizeFactors( cds )
> sizeFactors( cds )

> cds <- estimateDispersions( cds )

> pdf("DESeq-dispersion_estimates.pdf")
> plot(
  rowMeans( counts( cds, normalized=TRUE ) ),
  fitInfo(cds)$perGeneDispEsts,
  pch = '.', log="xy"
  )
> xg <- 10^seq( -.5, 5, length.out=300 )
> lines( xg, fitInfo(cds)$dispFun( xg ), col="red" )
> dev.off()

> result <- nbinomTest( cds, "wt", "mut" )
> head(result)

> result = result[order(result$pval), ]
> head(result)

> write.csv(result, "DESeq-wt-vs-mut.csv")

> pdf("DESeq-MA-plot.pdf")
> plot(
  result$baseMean,
  result$log2FoldChange,
  log="x", pch=20, cex=.3,
  col = ifelse( result$padj < .1, "red", "black" ) )
> dev.off()

> q()
Save workspace image? [y/n/c]: n
login1$ head DESeq-wt-vs-mut.csv

...