...

Answer:
cat yeast_mrna_gene_counts.bed | awk '
 BEGIN{FS="\t";sum=0;tot=0}
 {if($9 > 0) { sum = sum + $9; tot = tot + 1 }}
 END{print sum,"overlapping reads in",tot,"records"}'

There are 1144990 overlapping reads in 6235 records

Recall that in the yeast annotations from SGD there are 3 gene classifications: Verified, Uncharacterized and Dubious. The Dubious ones have no experimental evidence, so they are generally excluded.

Exercise: What is the total count of reads mapping to gene features other than Dubious?

Hint:

The classification is in the 7th column of sc_genes.bed.

Use cut to isolate that field, sort to sort the resulting values into blocks, then uniq -c to count the members of each block.
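
The cut, sort, uniq -c pipeline described above can be sketched as follows. The sample file and its contents are made up for illustration (the real input would be sc_genes.bed), and count_classes is a hypothetical helper name, not part of the course materials.

```bash
# Hypothetical helper: count how often each value appears in column 7.
count_classes() {
  # cut isolates field 7, sort groups equal values together,
  # and uniq -c counts the members of each group.
  cut -f 7 "$1" | sort | uniq -c
}

# Made-up 3-record sample standing in for sc_genes.bed (assumed tab-separated).
printf 'chrI\t100\t200\tgeneA\t0\t-\tVerified\n'  > sample_genes.bed
printf 'chrI\t300\t400\tgeneB\t0\t+\tDubious\n'  >> sample_genes.bed
printf 'chrI\t500\t600\tgeneC\t0\t+\tVerified\n' >> sample_genes.bed

count_classes sample_genes.bed
```

On the sample this reports one Dubious and two Verified records. The sort step matters because uniq only collapses adjacent duplicate lines.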

...

grep -v 'Dubious'
Hint:
grep -v 'Dubious' yeast_mrna_gene_counts.bed | awk '
 BEGIN{FS="\t";sum=0;tot=0}
 {if($9 > 0) { sum = sum + $9; tot = tot + 1 }}
 END{print sum,"overlapping reads in",tot,"records"}'

There are 1093140 overlapping reads in 5578 records
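
Note that grep -v 'Dubious' drops any line containing that string in any field. A possible alternative is to test the classification column directly inside awk. The sketch below assumes the classification is also in column 7 of yeast_mrna_gene_counts.bed, which is not confirmed here; the sample data and the sum_non_dubious helper name are made up for illustration.

```bash
# Hypothetical helper: sum column 9 over records whose column 7 is not "Dubious".
# ASSUMPTION: column 7 holds the classification and column 9 the read count.
sum_non_dubious() {
  awk 'BEGIN { FS = "\t"; sum = 0; tot = 0 }
       $7 != "Dubious" && $9 > 0 { sum += $9; tot++ }
       END { print sum, "overlapping reads in", tot, "records" }' "$1"
}

# Made-up 4-record sample (tab-separated, 9 columns).
printf 'chrI\t100\t200\tgeneA\t0\t-\tVerified\tnote\t10\n'         > sample_counts.bed
printf 'chrI\t300\t400\tgeneB\t0\t+\tDubious\tnote\t5\n'          >> sample_counts.bed
printf 'chrI\t500\t600\tgeneC\t0\t+\tUncharacterized\tnote\t20\n' >> sample_counts.bed
printf 'chrI\t700\t800\tgeneD\t0\t+\tVerified\tnote\t0\n'         >> sample_counts.bed

sum_non_dubious sample_counts.bed
# prints: 30 overlapping reads in 2 records
```

The Dubious record and the zero-count record are both skipped, so only geneA and geneC contribute to the total.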

BEDTools merge