
...

For major bonus points and a great THANK YOU from Scott, compute the mean and standard deviation of the intersected and subtracted SNPs from NA12878 vs. all, then perform a t-test to check whether the differences are statistically significant, using only Linux command-line tools (probably in a shell script). Yes, it's probably easier in Python, Perl, or R.
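As a starting point for the bonus exercise, here is a minimal sketch of the arithmetic in awk. It assumes the counts have already been collected one per line; the helper names `mean_sd` and `welch_t` and the toy numbers are hypothetical, and Welch's unequal-variance form of the t statistic is an assumption, since the exercise does not say which t-test to use. It prints only the t statistic; the p-value still has to be looked up against a t distribution.

```shell
#!/bin/sh
# Hypothetical helper: n, mean and sample standard deviation of the numbers
# on stdin, one per line (needs at least two values).
mean_sd() {
  awk '{ n++; s += $1; ss += $1 * $1 }
       END { m = s / n; v = (ss - n * m * m) / (n - 1); if (v < 0) v = 0
             printf "%d\t%.4f\t%.4f\n", n, m, sqrt(v) }'
}

# Hypothetical helper: Welch's t statistic from two "n mean sd" summaries.
welch_t() {
  awk -v n1="$1" -v m1="$2" -v s1="$3" -v n2="$4" -v m2="$5" -v s2="$6" \
    'BEGIN { printf "%.4f\n", (m1 - m2) / sqrt(s1 * s1 / n1 + s2 * s2 / n2) }'
}

# Toy counts, made up for illustration (not real results)
a=$(printf '%s\n' 2 4 4 4 5 5 7 9 | mean_sd)
b=$(printf '%s\n' 1 2 3 4 5 | mean_sd)
echo "$a"
echo "$b"
# $a and $b are deliberately unquoted so each splits into n, mean, sd
welch_t $a $b
```

In practice the toy `printf` lines would be replaced by whatever produces the per-comparison SNP counts, for example a loop over `wc -l` results.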

Other Linux utilities useful for making subsets of VCF files and comparing them

Make files containing all the het & hom alt alleles from a VCF, and simplify it somewhat:

# Keep POS, the 3-character genotype (e.g. 0/1), REF and ALT.
# The final sort puts the lines in the lexical order that join expects later on.
cat NA12878.raw.vcf | awk 'BEGIN {FS="\t"} {print $2 "\t" substr($10,1,3) "\t" $4 "\t" $5}' \
  | sort -n | grep "0/1" | sort > NA12878.raw.vcf.simple.het
cat NA12878.raw.vcf | awk 'BEGIN {FS="\t"} {print $2 "\t" substr($10,1,3) "\t" $4 "\t" $5}' \
  | sort -n | grep "1/1" | sort > NA12878.raw.vcf.simple.hom
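To see what the awk simplification keeps, it can be run on a single record first. The record below is made up for illustration (it is not taken from the NA12878 data) and assumes a single-sample VCF whose tenth column starts with the genotype.

```shell
# A made-up, tab-separated, single-sample VCF record
simple=$(printf '20\t14370\t.\tG\tA\t50\tPASS\t.\tGT:DP\t0/1:14\n' \
  | awk 'BEGIN {FS="\t"} {print $2 "\t" substr($10,1,3) "\t" $4 "\t" $5}')
# Prints position, genotype, REF and ALT: 14370  0/1  G  A (tab-separated)
echo "$simple"
```

Note that the chromosome column is dropped, so the simplified files identify a variant by position alone.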

Make files containing all the het & hom alt alleles from the other two VCFs (NA12891 and NA12892), with the same simplification we used above:

cat NA12891.raw.vcf | awk 'BEGIN {FS="\t"} {print $2 "\t" substr($10,1,3) "\t" $4 "\t" $5}' \
  | sort -n | grep "0/1" | sort > NA12891.raw.vcf.simple.het
cat NA12892.raw.vcf | awk 'BEGIN {FS="\t"} {print $2 "\t" substr($10,1,3) "\t" $4 "\t" $5}' \
  | sort -n | grep "0/1" | sort > NA12892.raw.vcf.simple.het
cat NA12891.raw.vcf | awk 'BEGIN {FS="\t"} {print $2 "\t" substr($10,1,3) "\t" $4 "\t" $5}' \
  | sort -n | grep "1/1" | sort > NA12891.raw.vcf.simple.hom
cat NA12892.raw.vcf | awk 'BEGIN {FS="\t"} {print $2 "\t" substr($10,1,3) "\t" $4 "\t" $5}' \
  | sort -n | grep "1/1" | sort > NA12892.raw.vcf.simple.hom

Now count how many genotypes (GT) are het in both of the second two (parents) but hom in the first (child):

join NA12892.raw.vcf.simple.het NA12891.raw.vcf.simple.het > both.het
join both.het NA12878.raw.vcf.simple.hom | wc -l

(Would you have expected this result?)

Now count how many genotypes (GT) are hom in both of the second two (parents) but het in the first (child):

join both.hom NA12878.raw.vcf.simple.het | wc -l
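The two-step join can be tried on toy data before running it on the real files. This sketch mirrors the first case (het in both parents, hom in the child); the file names and positions are made up, and it assumes each file is sorted lexically on its first field, which is the order join needs.

```shell
# Toy inputs with made-up positions, written to a scratch directory
tmp=$(mktemp -d)
printf '1000\t0/1\tA\tG\n205\t0/1\tC\tT\n' | sort > "$tmp/p1.het"
printf '1000\t0/1\tA\tG\n999\t0/1\tG\tA\n' | sort > "$tmp/p2.het"
printf '1000\t1/1\tA\tG\n205\t1/1\tC\tT\n' | sort > "$tmp/child.hom"

# Positions that are het in both parents
join "$tmp/p1.het" "$tmp/p2.het" > "$tmp/both.het"
# ...and also hom in the child; tr strips padding some wc versions add
n=$(join "$tmp/both.het" "$tmp/child.hom" | wc -l | tr -d ' ')
echo "$n"
rm -r "$tmp"
```

Only position 1000 appears in all three toy files, so the count is 1. Replacing `| wc -l` with a redirect to a file lists the matching records instead of counting them.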

Virmid - an advanced auto-screener

...
