...
The samtools mpileup command will take a few minutes to run. As practice for a fairly common situation when working in the iDEV environment, once the command is running you could try putting it in the background by pressing control-z and then typing the command bg so that you can do other things in this terminal window at the same time. Remember, there are still many other processors available on this node for other work! Just remember that if you have something running in the background, you need to check on it to see whether it is still running, either with the ps command or by watching the command line each time you execute a new command, as you may see a message that your background task has finished.
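The background-job workflow described above can be sketched in a script-friendly form. This is only a minimal sketch under stated assumptions, not the tutorial's own commands: `sleep 3` is a stand-in for a long-running command such as samtools mpileup, `&` starts it in the background directly (control-z followed by bg only works at an interactive prompt), and `kill -0` is used as a scriptable substitute for reading the output of ps.

```shell
# Stand-in for a long-running job (assumption: sleep replaces the real command)
sleep 3 &
bg_pid=$!                 # process ID of the job just sent to the background

jobs                      # list this shell's background jobs

# kill -0 sends no signal; it only tests whether the process still exists
if kill -0 "$bg_pid" 2>/dev/null; then
    job_state="running"
else
    job_state="finished"
fi
echo "background job $bg_pid is $job_state"

wait "$bg_pid"            # pause here until the background job ends
job_exit=$?
echo "background job $bg_pid ended with exit status $job_exit"
```

At an interactive prompt, typing `jobs` or `ps` gives the same information without any of the variables used here.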
...
Analyzing variants detected
VCF format has alternative allele frequency tags denoted by AF= (AF1= in this file). Try the following command to see at what frequencies our variants exist.
Code Block |
---|
grep AF1 SRR030257.vcf |
...
Code Block | ||||
---|---|---|---|---|
| ||||
awk -F";" '{for(i=1;i<=NF;i++){if ($i ~ /AF1/){print $i}}}' SRR030257.vcf | sort
|
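To see how often each frequency occurs rather than reading a long sorted list, the same awk scan can be followed by `uniq -c`. The sketch below is illustrative only: the three variant records are made up (they are not results from SRR030257.vcf), and `toy.vcf` and the variable names are hypothetical, chosen so the pipeline can be tried without the real data.

```shell
# Made-up miniature VCF for illustration only (not real results)
workdir=$(mktemp -d)
toy_vcf="$workdir/toy.vcf"
{
    printf '##fileformat=VCFv4.2\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t100\t.\tA\tG\t50\t.\tDP=20;AF1=1;AC1=2\n'
    printf 'chr1\t200\t.\tC\tT\t40\t.\tDP=15;AF1=0.5;AC1=1\n'
    printf 'chr1\t300\t.\tG\tA\t60\t.\tDP=30;AF1=1;AC1=2\n'
} > "$toy_vcf"

# Same awk field scan as above, then count how often each value occurs
af_counts=$(awk -F";" '{for(i=1;i<=NF;i++){if ($i ~ /AF1/){print $i}}}' "$toy_vcf" | sort | uniq -c)
echo "$af_counts"

rm -r "$workdir"          # clean up the scratch directory
```

On the real file, replacing `"$toy_vcf"` with SRR030257.vcf should give one count per distinct AF1 value.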
After you ran the bcftools call command you saw: "Note: none of --samples-file, --ploidy or --ploidy-file given, assuming all sites are diploid". Just like on a web page, you can use control/command + F to find specific text in the terminal window. Look for 'diploid' and you should see the line referenced above. This suggests a way you could go back and reanalyze this data: add one of the recommended flags to the bcftools call command and see how this might affect your analysis. If you choose to do this, I suggest adding a descriptive name between 'SRR030257' and '.vcf' to make the results easier to compare.
An initial attempt might be something like this:
...
Remember from our Raw Sequencing Data tutorial yesterday that we can group certain characters together by placing them between square brackets [].
Here we added a decimal point after the 0 and then allowed a match to any digit between 0 and 8. Thus lines that have AF1=1 would not match, nor would a line with AF1=0.9.
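The bracket-expression idea can be tried on a small made-up file. This is a sketch under assumptions, not necessarily the exact command used in the tutorial: the four records and the name `toy.vcf` are hypothetical, and the decimal point is escaped as `\.` because an unescaped `.` in a grep pattern matches any single character.

```shell
# Made-up records for illustration only (not real results)
workdir=$(mktemp -d)
toy_vcf="$workdir/toy.vcf"
{
    printf 'chr1\t100\t.\tA\tG\t50\t.\tDP=20;AF1=1;AC1=2\n'
    printf 'chr1\t200\t.\tC\tT\t40\t.\tDP=15;AF1=0.5;AC1=1\n'
    printf 'chr1\t300\t.\tG\tA\t60\t.\tDP=30;AF1=0.9;AC1=2\n'
    printf 'chr1\t400\t.\tT\tC\t55\t.\tDP=25;AF1=0.83;AC1=2\n'
} > "$toy_vcf"

# \. matches a literal decimal point; [0-8] matches one digit from 0 to 8
matches=$(grep 'AF1=0\.[0-8]' "$toy_vcf")
echo "$matches"

# grep -c reports how many lines matched the same pattern
match_count=$(grep -c 'AF1=0\.[0-8]' "$toy_vcf")
echo "matching lines: $match_count"

rm -r "$workdir"          # clean up the scratch directory
```

Only the first digit after the decimal point is tested, so a value such as 0.83 matches while 0.9 and 1 do not.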
Return to GVA2020 course page.
Optional Exercises at the end of class or for Wednesday/Thursday choose your own tutorial time.
...
- Trim both Read1 and Read2 using info from read preprocessing tutorial.
- Map reads with bowtie2 using info from read mapping tutorial.
- Call variants using this tutorial.
Remember that in the intro tutorial we talked about file/directory naming. Be sure you don't write over your old files. Maybe create a new directory like GVA_samtools_bowtie_improved for the outputs.
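One possible way to guard against overwriting old results is sketched below. It is a suggestion rather than part of the course instructions: the scratch location created with `mktemp -d` and the file name `demo.txt` are hypothetical, and the shell's `noclobber` option is an optional extra safeguard that makes `>` refuse to replace an existing file.

```shell
# Scratch location so this demonstration touches no real data (assumption)
cd "$(mktemp -d)"

outdir="GVA_samtools_bowtie_improved"   # directory name suggested in the text
if [ -e "$outdir" ]; then
    echo "$outdir already exists; pick a different name"
else
    mkdir "$outdir"
fi

# noclobber makes '>' refuse to overwrite an existing file
set -o noclobber
echo "first result" > "$outdir/demo.txt"     # demo.txt is a hypothetical file name
if { echo "second result" > "$outdir/demo.txt"; } 2>/dev/null; then
    overwrite="allowed"
else
    overwrite="blocked"
fi
set +o noclobber                             # restore the default behavior
echo "overwrite attempt was $overwrite"
```

With `noclobber` set, the second redirect fails and the original file content is left untouched.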
...