...
| Code Block |
|---|
bwa index -a bwtsw reference/genome.fa |
Part 2a. Align the samples to reference using bwa aln/samse/sampe
We will be using this set of commands (with options that you should try to figure out) in this order, on each sample:
bwa aln
bwa samse or sampe
Let's submit the bwa aln job.
...
| title | Submit to the TACC queue or run in an idev shell |
|---|
Create a commands file and use launcher_creator.py followed by sbatch.
...
nano commands.bwa
Put this in your commands file:
bwa aln -f GSM794483_C1_R1_1.sai reference/genome.fa data/GSM794483_C1_R1_1.fq
bwa aln -f GSM794483_C1_R1_2.sai reference/genome.fa data/GSM794483_C1_R1_2.fq
bwa aln -f GSM794484_C1_R2_1.sai reference/genome.fa data/GSM794484_C1_R2_1.fq
bwa aln -f GSM794484_C1_R2_2.sai reference/genome.fa data/GSM794484_C1_R2_2.fq
bwa aln -f GSM794485_C1_R3_1.sai reference/genome.fa data/GSM794485_C1_R3_1.fq
...
2.
...
bwa aln -f GSM794486_C2_R1_1.sai reference/genome.fa data/GSM794486_C2_R1_1.fq
bwa aln -f GSM794486_C2_R1_2.sai reference/genome.fa data/GSM794486_C2_R1_2.fq
bwa aln -f GSM794487_C2_R2_1.sai reference/genome.fa data/GSM794487_C2_R2_1.fq
bwa aln -f GSM794487_C2_R2_2.sai reference/genome.fa data/GSM794487_C2_R2_2.fq
bwa aln -f GSM794488_C2_R3_1.sai reference/genome.fa data/GSM794488_C2_R3_1.fq
bwa aln -f GSM794488_C2_R3_2.sai reference/genome.fa data/GSM794488_C2_R3_2.fq
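Typing one line per FASTQ file is error-prone, so the commands file can also be generated with a loop. The following is a minimal sketch, assuming that every sample has both a `_1` and a `_2` file under `data/` (including any not listed above) and that the sample prefixes are exactly the ones used in this exercise.

```shell
#!/bin/bash
# Sample prefixes taken from the file names used in this exercise.
samples="GSM794483_C1_R1 GSM794484_C1_R2 GSM794485_C1_R3 GSM794486_C2_R1 GSM794487_C2_R2 GSM794488_C2_R3"

# Start with an empty commands file.
: > commands.bwa

for sample in $samples; do
  for mate in 1 2; do
    # One bwa aln line per FASTQ file; each mate is aligned independently.
    echo "bwa aln -f ${sample}_${mate}.sai reference/genome.fa data/${sample}_${mate}.fq" >> commands.bwa
  done
done

# Show how many commands were written.
wc -l < commands.bwa
```

Because the script only writes text, it is safe to run it and inspect commands.bwa before submitting anything.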
| Expand | ||
|---|---|---|
| ||
launcher_creator.py -n aln -t 04:00:00 -j commands.bwa -q normal -a UT-2015-05-18 -m "module load bwa/0.7.7" -l bwa_launcher.slurm |
A *.sai file contains "alignment seeds" in a file format specific to BWA. We still need to extend these seed matches into alignments of entire reads, choose the best matches, and convert the output to SAM format. Do we use sampe or samse?
Let's submit the bwa sampe job, but keep it on hold until the previous job has finished.
| Warning | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
Create a
|
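Each sample here has a `_1` and a `_2` file, so the reads are paired and bwa sampe is the relevant command. The sketch below writes one sampe line per sample; the names commands.sampe, sampe_launcher.slurm, and the `.sam` output names are assumptions made for illustration, and the job ID is a placeholder you would take from the output of the earlier sbatch call.

```shell
#!/bin/bash
# Sample prefixes taken from the file names used in this exercise.
samples="GSM794483_C1_R1 GSM794484_C1_R2 GSM794485_C1_R3 GSM794486_C2_R1 GSM794487_C2_R2 GSM794488_C2_R3"

# Start with an empty commands file (hypothetical name).
: > commands.sampe

for sample in $samples; do
  # bwa sampe takes the index prefix, both .sai files, then both FASTQ files,
  # and writes SAM to standard output.
  echo "bwa sampe reference/genome.fa ${sample}_1.sai ${sample}_2.sai data/${sample}_1.fq data/${sample}_2.fq > ${sample}.sam" >> commands.sampe
done

cat commands.sampe

# To hold the sampe job until the aln job succeeds, Slurm accepts a dependency:
#   sbatch --dependency=afterok:<aln_job_id> sampe_launcher.slurm
# (<aln_job_id> and sampe_launcher.slurm are placeholders, not real values.)
```

With `afterok`, the dependent job starts only if the aln job exits successfully, which avoids running sampe on incomplete *.sai files.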
Part 2b. Align the samples to reference using bwa mem
| Warning | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| |||||||||||||
Create a
|
Since this will take a while to run, you can look at already generated results at: /corral-repl/utexas/BioITeam/rnaseq_course_2015/bwa_mem_results
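Unlike bwa aln, bwa mem takes both mate files at once and writes SAM directly, so there is no intermediate *.sai step. As a rough sketch, a commands file for it could be generated as follows; the names commands.mem and `*.mem.sam` are assumptions for illustration, not the names used for the pre-generated results.

```shell
#!/bin/bash
# Sample prefixes taken from the file names used in this exercise.
samples="GSM794483_C1_R1 GSM794484_C1_R2 GSM794485_C1_R3 GSM794486_C2_R1 GSM794487_C2_R2 GSM794488_C2_R3"

# Start with an empty commands file (hypothetical name).
: > commands.mem

for sample in $samples; do
  # bwa mem: index prefix, mate 1 FASTQ, mate 2 FASTQ; SAM goes to standard output.
  echo "bwa mem reference/genome.fa data/${sample}_1.fq data/${sample}_2.fq > ${sample}.mem.sam" >> commands.mem
done

cat commands.mem
```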
Help! I have a large number of reads. Make BWA go faster!
- Use the threading option of the bwa subcommand (for example, bwa aln -t <number of threads> or bwa mem -t <number of threads>).
- Split one data file into smaller chunks and run multiple instances of bwa. Finally, concatenate the output.
- WAIT! We have a pipeline for that!
- Look for runBWA.sh in $BI/bin (it should be in your path)
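To illustrate the splitting approach by hand, here is a small sketch that uses a made-up three-read FASTQ file so it can be run anywhere. The file names, the chunk size, and the thread count of 4 are all assumptions; this is not how runBWA.sh is implemented. For paired-end data, both mate files would need to be split with the same chunk size so that the pairs stay in sync.

```shell
#!/bin/bash
# Build a tiny stand-in FASTQ (3 reads, 4 lines each) so the sketch is self-contained.
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nGGCC\n+\nIIII\n' > demo.fq

# Split into chunks of 2 reads. The line count per chunk must be a multiple
# of 4 so that no FASTQ record is cut in half.
reads_per_chunk=2
split -l $((reads_per_chunk * 4)) demo.fq demo_chunk_

# Report each chunk and print the bwa aln command that would process it.
for chunk in demo_chunk_*; do
  echo "$chunk: $(( $(wc -l < "$chunk") / 4 )) reads"
  echo "bwa aln -t 4 -f ${chunk}.sai reference/genome.fa ${chunk}"
done
```

When the per-chunk SAM files are combined afterwards, the header lines (those starting with `@`) should be kept only once.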
Now that we are done mapping, let's look at how to assess mapping results.