...
Info | ||
---|---|---|
| ||
Like an index for a book (in the olden days before Kindles and Nooks), creating an index for a computer database allows quick access to any "record" given a short "key". In the case of mapping programs, creating an index for a reference sequence allows it to more rapidly place a read on that sequence at a location where it knows at least a piece of the read matches perfectly or with only a few mismatches. By jumping right to these spots in the genome, rather than trying to fully align the read to every place in the genome, it saves a ton of time. Indexing is a separate step in running most mapping programs because it can take a LONG time if you are indexing a very large genome (like our own overly complicated human genome). Furthermore, you only need to index a genome sequence once, no matter how many samples you want to map. Keeping it as a separate step means that you can skip it later when you want to align a new sample to the same reference sequence. |
Finally, it's time to map the reads! The command you need is:
Code Block | ||||
---|---|---|---|---|
|
...
|
...
| |||
bowtie2
|
Warning | ||
---|---|---|
| ||
This command can take a while (~5 minutes) and is extremely taxing. This is longer than we want to run a job on the head node (especially when all of us are doing it at once). In fact, in previous years, TACC has noticed the spike in usage when multiple students forgot to make sure they were on idev nodes and complained pretty forcefully to us about it. Let's not have this be one of those years. Use the showq -u command to make sure you are on an idev node. If you are not sure if you are on an idev node, speak up on zoom and I'll show(q) -u what you are looking for. Yes, your instructor likes bad puns. My apologies. If you are not on an idev node, and need help to relaunch it, click over to the idev tutorial. |
...
In the bowtie2 example, we mapped in
--local
mode. Try mapping in--end-to-end
mode (aka global mode).- Do the BWA tutorial so you can compare their outputs.
- Did bowtie2 or BWA map more reads?
- In our examples, we mapped in paired-end mode. Try to figure out how to map the reads in single-end mode and create this output.
- Which aligner took less time to run? Are there any options you can change that:
- Lead to a larger percentage of the reads being mapped? (increase sensitivity)
- Speed up run time without causing many fewer reads to be mapped? (increase performance)
...