Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Exercise: How many of the chrI:1000-2000 alignments are from properly paired mapped reads?

Expand
titleHint 1 (building on the last command)

Properly paired reads have the 0x2 flag set (1). All our reads also have the 0x1 flag set because they are paired-end reads. Mapped reads will have the 0x4 flag cleared (0), and properly paired mapped reads will have the 0x8 flag cleared (0) as well (mate not unmapped = mate mapped). So all mapped proper pairs will end in 0x3 = binary 0011.

Expand
titleHint 2
Using

Also, using '$' in a grep pattern anchors the pattern to the end of the line (only patterns matching before EOL are printed).

Expand
titleAnswer 1 (building on the last command)

Here's one way to do it, building on our last command line:

Code Block
languagebash
samtools view -F 0x4 yeast_pe.sort.bam chrI:1000-2000 | awk '
  {$2 = sprintf("0x%x", $2); print $2}' | grep -c '3$'
Expand
titleHint 2 (using a SAM flag)
Use the 0x2 flag (1 = properly paired)
Expand
titleAnswer 2 (using SAM flag)

Here's a simpler way of doing it:

Code Block
languagebash
samtools view -c -F 0x4 -f 0x2 yeast_pe.sort.bam chrI:1000-2000

About mapping quality

Mapping qualities are a measure of how likely a given sequence alignment to its reported location is correct. If a read's mapping quality is low (especially if it is zero, or mapQ 0 for short) the read maps to multiple locations on the genome (they are multi-hit reads), and we can't be sure whether the reported location is the correct one.

...