Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

This says that anyone can read the file, the owner (you) or anyone in your group can modify it (write permission), but no one can execute it. We use the chmod program to allow anyone to execute the script:

...

  1. Aligns a fastq file to a pre-made reference genome
  2. Extracts alignments from bwa's proprietary binary .sai file to a .sam file
  3. Converts the .sam file into a .bam file using samtools
  4. Sort and index the .bam file so that it can be viewed in IGV.
  5. Count the number of aligned and unaligned reads, and calculate the mapping rate.

Since we want to use this script on different datasets, it should take some arguments on the command line telling it what to work on. Let's have it take the following arguments:

  1. Name of the input fastq file (or the R1 file if paired).
  2. A prefix to use when writing output files (e.g. <prefix>.bam).
  3. Name of an reference genome to use. The script will find the appropriate reference index based on this value.
  4. A flag indicating whether single end or paired end alignment should be done. 0 = single, 1 = paired.

Here is a completed Example BWA alignment script.You may want to open it in a separate window so you can read along as it is discussed here.

What could possibly go wrong?

The first thing you will notice about this script.

Daechan original