...
Exercise: How many of the chrI:1000-2000 alignments are from properly paired mapped reads?
Expand |
---|
title | Hint 1 (building on the last command) |
---|
|
Properly paired reads have the 0x2 flag set (1). All our reads also have the 0x1 flag set because they are paired-end reads. Mapped reads will have the 0x4 flag cleared (0), and properly paired mapped reads will have the 0x8 flag cleared (0) as well (mate not unmapped = mate mapped). So all mapped proper pairs will end in 0x3 = binary 0011. |
Expand |
---|
|
Using Also, using '$' in a grep pattern anchors the pattern to the end of the line (only patterns matching before EOL are printed).
|
Expand |
---|
title | Answer 1 (building on the last command) |
---|
|
Here's one way to do it, building on our last command line: Code Block |
---|
| samtools view -F 0x4 yeast_pe.sort.bam chrI:1000-2000 | awk '
{$2 = sprintf("0x%x", $2); print $2}' | grep -c '3$' |
|
Expand |
---|
title | Hint 2 (using a SAM flag) |
---|
|
Use the 0x2 flag (1 = properly paired) |
Expand |
---|
title | Answer 2 (using SAM flag) |
---|
|
Here's a simpler way of doing it: Code Block |
---|
| samtools view -c -F 0x4 -f 0x2 yeast_pe.sort.bam chrI:1000-2000 |
|
About mapping quality
Mapping qualities are a measure of how likely a given sequence alignment to its reported location is correct. If a read's mapping quality is low (especially if it is zero, or mapQ 0 for short) the read maps to multiple locations on the genome (they are multi-hit reads), and we can't be sure whether the reported location is the correct one.
...