Get your data
Six raw data files were provided as the starting point:
...
Due to the size of the data and length of run time, many of the programs have already been run for this exercise. We will submit some jobs, but because we cannot wait for it to complete, we will be looking through already generated results. You will then be parsing the output, finding answers, and visualizing results (in the directory tophat_results).
Run tophat
On lonestar, to run tophat, following modules need to be loaded.
...
Code Block | ||
---|---|---|
| ||
qsub tophat_launcher.sge |
Examine tophat parameters
Why did we choose the tophat parameters we did and what do they mean? Here's our tophat command again:
...
As you can see there are many many other options for running tophat!
Examine tophat results
We are going to be running commands that can take a couple of minutes to run. But with 20 people running the same commands, we don't want to run these on the head node. They are too small to be submitted too. So, let's switch over to the tab with our idev session.
...
Code Block |
---|
samtools flagstat accepted_hits.bam |
Let's see how this compares to BWA results...
Code Block | ||
---|---|---|
| ||
cd $SCRATCH/my_rnaseq_course/bwa_mem_results |
...
Expand | ||||
---|---|---|---|---|
| ||||
|
Help! I have a lots of reads and a large number of reads. Make tophat go faster!
Use threading option efficiently (tophat -p <number of threads>)
- Split one data file into smaller chunks and run multiple instances of tophat. Finally concatenate the output.
- WAIT! We have a pipeline for that!
- Look for fastTophat in $BI/scripts
- Split mapping by chromosome- mapping to each chromosome=1 job.
...