Evaluating capture metrics

...

To run the program on Lonestar, there are three prerequisites: 1) a BAM file, 2) a list of the genomic intervals that were to be captured, and 3) the reference (.fa). As you would guess, the BAM file and the interval list both have to be based on exactly the same genomic reference file.

For our tutorial, use one of these BAM files:

Code Block
BAM files for exome capture evaluation tutorial
/corral-repl/utexas/BioITeam/ngs_course/human_variation/NA12878.chrom20.ILLUMINA.bwa.CEU.exome.20111114.bam  
/corral-repl/utexas/BioITeam/ngs_course/human_variation/NA12892.chrom20.ILLUMINA.bwa.CEU.exome.20111114.bam
/corral-repl/utexas/BioITeam/ngs_course/human_variation/NA12891.chrom20.ILLUMINA.bwa.CEU.exome.20111114.bam

I've started with one of Illumina's target capture definitions (the vendor of your capture kit will provide this), but since the BAM files only contain chr20 data, I've also created a target definitions file restricted to chr20. Here they are:

Code Block
Two relevant target list definitions
/corral-repl/utexas/BioITeam/ngs_course/human_variation/target_intervals.chr20.reduced.withhead.intervallist
/corral-repl/utexas/BioITeam/ngs_course/human_variation/target_intervals.reduced.withhead.intervallist

And the relevant reference is:

Code Block
Reference for exome metrics
/corral-repl/utexas/BioITeam/ngs_course/human_variation/ref/hs37d5.fa


If you'd like to try this, copy the interval list, a BAM file, and the reference (.fa and .fai) to a temporary directory under your $SCRATCH, and don't forget to run "module load picard" first!

If you're in a hurry...
Code Block
cds
mkdir tmpE
cd tmpE
cp /corral-repl/utexas/BioITeam/ngs_course/human_variation/NA12878.chrom20.ILLUMINA.bwa.CEU.exome.20111114.bam .
cp /corral-repl/utexas/BioITeam/ngs_course/human_variation/target_intervals.chr20.reduced.withhead.intervallist .
cp /corral-repl/utexas/BioITeam/ngs_course/human_variation/ref/hs37d5.fa .
cp /corral-repl/utexas/BioITeam/ngs_course/human_variation/ref/hs37d5.fa.fai .
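
Because the BAM file and the interval list must be based on the same reference, it can be worth checking that before launching the run. The following is a minimal sketch, assuming the interval list carries a SAM-style header with @SQ lines (as the "withhead" file name suggests) and that the .fai index lists each sequence name and length in its first two columns. The function name check_interval_dict is hypothetical and not part of Picard.

```shell
#!/usr/bin/env bash
# Sketch: compare the @SQ lines of an interval list header against the
# sequence names and lengths in a reference .fai index.
# check_interval_dict is a hypothetical helper, not a Picard tool.
check_interval_dict() {
    local intervals="$1" fai="$2"
    awk -F'\t' '
        # First file (.fai): column 1 is the sequence name, column 2 its length.
        NR == FNR { len[$1] = $2; next }
        # Second file (interval list): only inspect @SQ header lines.
        $1 == "@SQ" {
            name = ""; n = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) n = substr($i, 4)
            }
            # Report sequences that are missing from the index or differ in length.
            if (!(name in len) || (len[name] + 0) != (n + 0)) {
                printf "MISMATCH\t%s\t%s\n", name, n
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$fai" "$intervals"
}

# Example call, using the files copied above:
# check_interval_dict target_intervals.chr20.reduced.withhead.intervallist hs37d5.fa.fai
```

The function prints one MISMATCH line per disagreeing sequence and returns a non-zero exit status if any are found. It only covers the interval list; checking the BAM header the same way would need a tool such as samtools to dump the header first.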


The run command looks long but, like most Java programs, isn't that complicated:

Code Block
titleHow to run exactly these files on Lonestar
java -Xmx4g -Djava.io.tmpdir=/tmp -jar $TACC_PICARD_DIR/CalculateHsMetrics.jar \
    BI=target_intervals.chr20.reduced.withhead.intervallist \
    TI=target_intervals.chr20.reduced.withhead.intervallist \
    I=NA12878.chrom20.ILLUMINA.bwa.CEU.exome.20111114.bam \
    R=hs37d5.fa \
    O=exome.picard.stats \
    PER_TARGET_COVERAGE=exome.pertarget.stats
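
The summary file written to O= is a tab-delimited table with one very wide row, which is hard to read in a terminal. As a rough sketch, assuming the usual Picard metrics layout (comment lines starting with "#", then one row of column names and one row of values), the helper below prints one metric per line. The name show_metrics is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print a Picard-style metrics file as "name<TAB>value" pairs.
# show_metrics is a hypothetical helper, not part of Picard.
show_metrics() {
    awk -F'\t' '
        # Skip "#" comment lines and blank lines.
        /^#/ || /^[[:space:]]*$/ { next }
        # The first remaining line holds the column names.
        !seen { n = split($0, name, "\t"); seen = 1; next }
        # The next line holds the values; pair them up and stop.
        { for (i = 1; i <= n; i++) printf "%s\t%s\n", name[i], $i; exit }
    ' "$1"
}

# Example call on the summary file produced by the run command:
# show_metrics exome.picard.stats
```

Only the first data row is shown, so a file with several rows would need the final `exit` removed and some way to separate the rows.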

...