Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Expand
titleExpand here if you'd like to try this on your own...

Let's use Anna Battenhouse's shell script align_bwa.sh to align the paired end fastq file to the hg19 version of the genome calling the output trios_mapping. The input fastq file can be found at $BI/ngs_course/human_variation/allseqs_R1.fastq or if you have already copied today's materials to your scratch directory at $SCRATCH/BDIB_Human_tutorial/raw_files. Throughout the course we have been telling you to copy things to a local location before working with them, and you can continue to do so, but it is important to remember this is not necessarily a requirement.

If you use launcher_creator.py to create the job submission script. Be sure to append &> "meaningful file name" to the end of the script to capture the standard output and standard

 

Expand
titleClick here for a hint

Type align_bwa.sh without any options to gain insight into the programs expectations. As you might have put together, the "BWA" part of the script name represents the BWA mapping tool, but what isn't obvious is that it also uses SAMtools. Make sure that these things are available before starting the commands

Code Block
languagebash
titleIf on an iDev node:
collapsetrue
module load bwa
module load samtools
cds
cd BDIB_Human_tutorial
align_bwa.sh raw_files/allseqs_R1.fastq trios_mapping hg19 1
Expand
titleIf you got an error message check here for a possible reason why
Code Block
titleIf your error message was this:
Fastq read1 File 'raw_files/allseqs_R1.fastq' not found...exiting
Code Block
languagebash
titleDid you remember to do this after the copy command finished?
cd $SCRATCH/BDIB_Human_tutorial/rawfiles
gunzip *.gz  # this will unzip all the compressed files you have just copied
Code Block
languagebash
titleMake job submission script for mapping & variant calling
collapsetrue
echo "align_bwa.sh $BI/ngs_course/human_variation/allseqs_R1.fastq trios_mapping hg19 &> aln.trios_mapping.log" > commands  # this is an alternative way to generate a commands file than nano, use nano if more comfortable, or try this new option
launcher_creator.py -l map_call.sge -n map_call -t 01:00:00 -j commands -A UT-2015-05-18
module load bwa
module load samtools
qsub map_call.sge

 


Code Block
title"ls -lt" output ... (a way of checking if things completed correctly
collapsetrue
-rw-r-rw-r-- 1 sphsmithded G-801020802740       5732392 May 2028 2303:0113 maptrios_callmapping.flagstat.o586338txt
-rw-rw-r----- 1 sphsmithded G-801020802740       3922149304 May 2028 2303:01 test.flagstat.txt13 trios_mapping.sorted.bam.bai
-rw-rw---r--- 1 sphsmithded G-801020802740   2175952334770401 May 2028 2303:0113 testtrios_mapping.sorted.bam.bai
-rw-rw-r-----  1 sphsmithded G-801020802740 334782188411155732 May 2028 2303:01 test.sorted.bam
-rw-r--r-- 1 sphsmith G-801020     13135 May 20 23:00 map_call.e586338
-rw------- 1 sphsmith G-801020 411244396 May 20 23:00 test.bam
-rw------- 1 sphsmith G-801020  45695040 May 20 22:49 test.read2.sai12 trios_mapping.bam
-rw-rw-r----- 1 sphsmithded G-801020802740  4537240062257040 May 2028 2203:3901 testtrios_mapping.read1read2.sai
-rw-rrw--r-- 1 sphsmithded G-801020802740         061884104 May 2028 2202:2651 maptrios_callmapping.read1.pe586338sai
Code Block
title"samtools flagstat trios_mapping.sorted.bam" expected results
collapsetrue
Running flagstat...
4546280 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
39922743992266 + 0 mapped (87.81%:nan%)
4546280 + 0 paired in sequencing
2273140 + 0 read1
2273140 + 0 read2
4029040234 + 0 properly paired (0.89%88%:nan%)
36369463636928 + 0 with itself and mate mapped
355328355338 + 0 singletons (7.82%:nan%)
4412843366 + 0 with mate mapped to a different chr
1563415352 + 0 with mate mapped to a different chr (mapQ>=5)

...

Code Block
titleExpected output of vcfutils.pl script
collapsetrue
login1$ vcfutils.pl qstats testtrios_tutorial.raw.vcf
QUAL #non-indel #SNPs #transitions #joint ts/tv #joint/#ref #joint/#non-indel
186	1011	1011	757	0	2.9803	0.0000	0.0000	2.9803
142	2036	2036	1441	0	2.4218	0.0000	0.0000	2.0059
122	3091	3091	2156	0	2.3059	0.0000	0.0000	2.1029
109	4177	4177	2925	0	2.3363	0.0000	0.0000	2.4259
99.5	5237	5237	3647	0	2.2937	0.0000	0.0000	2.1361
92	6273	6273	4354	0	2.2689	0.0000	0.0000	2.1489
85.5	7328	7328	5105	0	2.2964	0.0000	0.0000	2.4704
79	8352	8352	5811	0	2.2869	0.0000	0.0000	2.2201
74	9369	9369	6497	0	2.2622	0.0000	0.0000	2.0725
69	10553	10553	7311	0	2.2551	0.0000	0.0000	2.2000
65	11782	11782	8176	0	2.2673	0.0000	0.0000	2.3764
61	13059	13059	9090	0	2.2902	0.0000	0.0000	2.5179
57	14280	14280	9945	0	2.2941	0.0000	0.0000	2.3361
53	15301	15301	10656	0	2.2941	0.0000	0.0000	2.2935
48.8	16323	16323	11347	0	2.2803	0.0000	0.0000	2.0876
45	17460	17460	12122	0	2.2709	0.0000	0.0000	2.1409
41.8	18639	18639	12899	0	2.2472	0.0000	0.0000	1.9328
39.8	19660	19660	13572	0	2.2293	0.0000	0.0000	1.9339
37.8	21013	21013	14496	0	2.2243	0.0000	0.0000	2.1538
35.8	22309	22309	15380	0	2.2197	0.0000	0.0000	2.1456
33.8	23568	23568	16251	0	2.2210	0.0000	0.0000	2.2448
31.8	24846	24846	17101	0	2.2080	0.0000	0.0000	1.9860
29.8	26171	26171	17975	0	2.1931	0.0000	0.0000	1.9379
28	27308	27308	18688	0	2.1680	0.0000	0.0000	1.6816
26	28654	28654	19551	0	2.1478	0.0000	0.0000	1.7867
24	29947	29947	20371	0	2.1273	0.0000	0.0000	1.7336
22	31050	31050	21065	0	2.1097	0.0000	0.0000	1.6968
20	32081	32081	21719	0	2.0960	0.0000	0.0000	1.7347
17.1	33251	33251	22446	0	2.0774	0.0000	0.0000	1.6411
13.2	34447	34447	23238	0	2.0732	0.0000	0.0000	1.9604
10.4	35751	35751	24067	0	2.0598	0.0000	0.0000	1.7453
9.53	36772	36772	24724	0	2.0521	0.0000	0.0000	1.8049
8.65	38023	38023	25539	0	2.0457	0.0000	0.0000	1.8693
7.8	39301	39301	26360	0	2.0369	0.0000	0.0000	1.7965
6.98	40665	40665	27196	0	2.0192	0.0000	0.0000	1.5833
6.2	42394	42394	28243	0	1.9958	0.0000	0.0000	1.5352
5.46	44018	44018	29211	0	1.9728	0.0000	0.0000	1.4756
4.77	45553	45553	30168	0	1.9609	0.0000	0.0000	1.6557
4.13	47020	47020	31081	0	1.9500	0.0000	0.0000	1.6480
3.54	48572	48572	32063	0	1.9422	0.0000	0.0000	1.7228
3.01	50463	50463	33139	0	1.9129	0.0000	0.0000	1.3202

...