Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migrated to Confluence 5.3

SAMtools is a suite of commands for dealing with databases of mapped reads. You'll be using it quite a bit throughout the course. It includes programs for performing variant calling (mpileup-bcftools).

Table of Contents

Calling variants in reads mapped by

...

bowtie2

Right now, we'll be using it to call variants (find mutations) in the re-sequenced E. coli genome from the Mapping tutorial. You will need the output SAM files from that tutorial to continue here.

Warning

If you do not have the output from the Mapping tutorial, run these commands to copy over the output that would have been produced. Then, you can immediately start this tutorial!

Code Block
cds
mkdir intro_to_mapping
cd intro_to_mapping
cp -r $BI/gva_course/intro_to_mapping/bowtie .
cp -r $BI/gva_course/intro_to_mapping/bwa . 
cp -r $BI/gva_course/intro_to_mapping/bowtie2 .

We assume that you are still working in the main directory called intro_to_mapping data that you created on $SCRATCH.

...

Expand
No, give me the commands...
No, give me the commands...

This should work:

Code Block
samtools
which samtools

Prepare your directories

Create From inside your main mapping directory, create a new output directory called samtools_bowtiebowtie2 or whatever makes sense to you.

...

SAM is a text file, so it is slow to access information about how any given read was mapped. SAMtools and many of the commands that we will run later work on BAM files (essentially GZIP compressed binary forms of the text SAM files). These can be loaded much more quickly. Typically, they also need to be sorted, so that when the program wants to look at all reads overlapping position 4,129,888, it can easily find them all at once without having to search through the entire BAM file.

Warning
titleDo not run on head node

Many commands past this point are computationally intensive. You should run them through an idev shell or by qsub. We recommend idev for the tutorial.

Code Block
titleExample command to start an idev shell
idev -m 60 -q development -A CCBB

 

Convert from SAM to BAM format.

...

What are all the options doing?

The samtools mpileup command will take a few minutes to run. You might consider putting it in the background by pressing control-z and then typing the command bg so that you can do some other things in this terminal window at the same time – like starting to run variant calling on the BWA or bowtie mapping results. Remember, there are still many other processors available on this node!

Convert BCF to human-readable VCF:

...

Comparing the results of different mappers using bedtools

You will need the output from #Calling variants in reads mapped by BWA or Bowtie2 to complete this exercise.

Often you want to compare the results of variant calling on different samples or using different pipelines. Bedtools is a suite of utility programs that work on a variety of file formats, one of which is conveniently VCF format. It provides many ways of slicing, dicing, and comparing the information in VCF files. For example, we can use it to find out what predictions are the same and which are different from the variant calling on reads mapped with different programs if you generated VCF files for each one.

Set up a new output directory and copy the respective VCF files to it, renaming them so that we know where they came from:

...