...

Code Block

title	Reformatting gene_counts.gff

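# Look at the first few lines of the original file
head gene_counts.gff
# Delete everything on each line up to and including "locus_tag="
sed -e 's/^.*locus_tag=//' < gene_counts.gff > gene_counts.tab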

After it has run, take a peek at the new file:

Code Block
head gene_counts.tab
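
To see what the sed substitution does without touching the real data, here is a minimal sketch that runs it on a single made-up GFF-style line. The line itself, its coordinates, the locus tag lmo0001, and the two trailing count columns are all hypothetical; the actual layout of gene_counts.gff may differ.

```shell
# Hypothetical GFF-style line: the attribute column ends with locus_tag=...,
# followed by made-up per-sample count columns (tab-separated).
line=$(printf 'chr\tsrc\tgene\t318\t1673\t.\t+\t.\tID=gene0;locus_tag=lmo0001\t10\t20')

# Same substitution as in the tutorial: ^.* is greedy, so everything up to
# and including the last "locus_tag=" on the line is deleted.
result=$(printf '%s\n' "$line" | sed -e 's/^.*locus_tag=//')

printf '%s\n' "$result"
```

Only the locus tag and whatever follows it on the line survive, which is why the new file starts each row with a gene identifier.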

Now we are going to load the GFF file straight into R, remove the columns we don't want, name the columns and rows the way R expects, and write the file back out. You could do this in any other scripting language, or even in Excel. We will print the first few lines of the file at each step so that you can see what each command is doing.

...

What are the numbers returned by sizeFactors( cds )?

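They are, roughly speaking, the relative sequencing depth (average coverage) of each data set. There are roughly 5 times as many read counts in genes for wt2 as there are for mut2. Specifically, each size factor is a per-sample scaling factor rather than a parameter of the negative binomial fit itself: it is estimated as the median, across genes, of the ratio of that data file's count to the geometric mean count for the gene across all data files, and the counts are divided by it to make the data files comparable.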
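
As a rough sanity check on the "relative coverage" reading, one can total the counts in each sample column and compare the totals. The sketch below assumes a tab-delimited table with a header row, gene names in the first column, and one count column per sample. The temporary file and every number in it are made up for illustration. Ratios of column totals are only a crude approximation and will generally not match the values returned by sizeFactors( cds ) exactly.

```shell
# Build a tiny, made-up count table (hypothetical values, not real data).
tmp=$(mktemp)
printf 'gene\twt2\tmut2\n'  > "$tmp"
printf 'geneA\t500\t100\n' >> "$tmp"
printf 'geneB\t250\t50\n'  >> "$tmp"
printf 'geneC\t50\t10\n'   >> "$tmp"

# Sum each sample column (skipping the header row), then report every
# total relative to the total of the last column.
ratios=$(awk -F'\t' '
    NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
    { for (i = 2; i <= NF; i++) total[i] += $i }
    END { for (i = 2; i <= n; i++) printf "%s\t%d\t%.2f\n", name[i], total[i], total[i] / total[n] }
' "$tmp")

printf '%s\n' "$ratios"
rm -f "$tmp"
```

With these made-up numbers the wt2 column totals five times the mut2 column, mirroring the kind of ratio described in the answer above.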

What are the dispersion estimates?

The model also assumes a per-gene component to the variance in the observed counts, which is again fit with a negative binomial distribution (an overdispersed Poisson distribution). In this model, the lower the counts are, the more dispersion relative to the mean is expected (red line in the graph), and so the less significant a given change in counts becomes. As a result, lowly expressed genes need a larger fold change than highly expressed genes to be called significant.

What was the predominant effect of the mutation on gene expression in this Listeria strain?

...

Version	Old Version 30	New Version 31
Changes made by	Jeffrey E Barrick	Jeffrey E Barrick
Saved on	May 23, 2012	May 23, 2012
