...

If you're stuck...
Code Block
bowtie2-build NC_012967.1.fasta bowtie2/NC_012967.1

The first argument is the reference FASTA. The second argument is the "base" file name to use for the created index files. It will create several files whose names begin with bowtie2/NC_012967.1.
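To confirm that the index was built, you can check for the files that share the base name. The sketch below assumes the usual six-file small-index layout (files ending in .bt2) and uses a hypothetical helper named check_bt2_index. It is not part of bowtie2.

```shell
# Sketch: report any index file that is missing for a given base name.
# Assumption: a "small" bowtie2 index made of six *.bt2 files.
# check_bt2_index is a hypothetical helper, not a bowtie2 command.
check_bt2_index() {
    prefix=$1
    missing=0
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        if [ ! -e "$prefix.$suffix" ]; then
            echo "missing: $prefix.$suffix"
            missing=1
        fi
    done
    # Exit status 0 means every expected file was found.
    return $missing
}

# Usage, after running bowtie2-build:
#   check_bt2_index bowtie2/NC_012967.1 && echo "index looks complete"
```

If the helper reports missing files, check that the output directory existed before you ran bowtie2-build.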

...

Warning: Submit to the TACC queue or run in an idev shell

Create a commands file and use launcher_creator.py followed by qsub.

Not sure what to do...

Put this in your commands file:

Code Block
bowtie2 -t -x bowtie2/NC_012967.1 -1 SRR030257_1.fastq -2 SRR030257_2.fastq -S bowtie2/SRR030257.sam

What does the -t option do?

...

Still, you should recognize some of the information on a line in a SAM file from the input FASTQ, and some of the other information is relatively straightforward to understand, like the position where the read mapped. Give this a try:

Code Block
head bowtie2/SRR030257.sam

What do you think the 4th and 8th columns mean?
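The header lines at the top of a SAM file (they start with "@") can make it hard to compare columns by eye. One way to look at just the read name and the two columns in question is sketched below. The helper name sam_columns is hypothetical, and the sketch assumes the tab-separated SAM layout.

```shell
# Sketch: print the read name plus columns 4 and 8 for the first five alignments.
# sam_columns is a hypothetical helper. SAM header lines start with "@"
# and alignment fields are separated by tabs.
sam_columns() {
    sam_file=$1
    awk -F'\t' 'BEGIN { OFS = "\t" }
        !/^@/ { print $1, $4, $8; if (++n == 5) exit }' "$sam_file"
}

# Usage:
#   sam_columns bowtie2/SRR030257.sam
```

Compare the two numbers printed for the two reads of the same pair and see whether a pattern stands out.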

...

...

...

Multithreaded execution

We have actually massively under-utilized Lonestar in this example. We submitted a job that reserved a single node on the cluster, but that node has 12 processors. Bowtie was only using one of those processors (a single "thread")! For programs that support multithreaded execution (and most mappers do because they are obsessed with speed) we could have sped things up by using all 12 processors for the bowtie process.

What's the command line option to enable multithreaded execution in bowtie?

It's -p, for "processors". Since we had 12 processors available to our job, a better bowtie2 alignment commands file would look like this:

Code Block
bowtie2 -t -p 12 -x bowtie2/NC_012967.1 -1 SRR030257_1.fastq -2 SRR030257_2.fastq -S bowtie2/SRR030257.sam

Try it out and compare the speed of execution by looking at the log files.

...

One consequence of using multithreading that might be confusing is that the aligned reads might appear in your output SAM file in a different order than they were in the input FASTQ. This happens because small sets of reads get continuously packaged, "sent" to the different processors, and whichever set "returns" fastest is written first. You can force them to appear in the same order (at a slight cost in speed) by adding the --reorder flag to your command.
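If you want to see whether the order actually changed, you can compare the read names in the first FASTQ file with the read names in the SAM file. This is a minimal sketch under several assumptions: the helper same_read_order is hypothetical, every read appears in the SAM output, both mates of a pair sit on adjacent lines, and a trailing /1 or /2 on a read name can be ignored.

```shell
# Sketch: succeed (exit status 0) if read names appear in the same order
# in a FASTQ file and in a SAM file. same_read_order is a hypothetical helper.
same_read_order() {
    fastq_file=$1
    sam_file=$2
    fastq_names=$(mktemp)
    sam_names=$(mktemp)
    # FASTQ: the name is on every 4th line starting at line 1; drop the leading "@".
    awk 'NR % 4 == 1 { name = substr($1, 2); sub(/\/[12]$/, "", name); print name }' \
        "$fastq_file" > "$fastq_names"
    # SAM: skip "@" header lines, take column 1, and collapse the two mates of a pair.
    awk -F'\t' '!/^@/ { name = $1; sub(/\/[12]$/, "", name); print name }' \
        "$sam_file" | uniq > "$sam_names"
    if cmp -s "$fastq_names" "$sam_names"; then result=0; else result=1; fi
    rm -f "$fastq_names" "$sam_names"
    return $result
}

# Usage:
#   same_read_order SRR030257_1.fastq bowtie2/SRR030257.sam && echo "same order"
```

Running this on the output of a multithreaded job with and without --reorder should show the difference the flag makes.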


Mapping with BWA

BWA (the Burrows-Wheeler Aligner) is another fast mapping program. It's the successor to another aligner you might have used or heard of called MAQ (Mapping and Assembly with Quality).

...

Code Block
module load bwa

There could be multiple versions of BWA on TACC, and this command loads the default one. You might want to check which one you have loaded for when you write up your awesome publication that was made possible by your analysis of next-gen sequencing data.

How could you check to see what version you are using to write in the materials and methods of your paper?

Here are some commands that could help...
Code Block
module spider bwa
module list
bwa

Create a fresh output directory, so that we don't write over the output from bowtie. Be sure you are back in your main intro_to_mapping directory. Then:

Code Block
mkdir bwa

...

By default, BWA creates its index files in the same directory as the FASTA that you input, named after that file. So copy the FASTA in your intro_to_mapping directory to your new bwa directory:

...