...

See if you can install breseq and get it running from the installation instructions.

Warning

You will need Bowtie version 2.0.0-beta7 or later to run breseq. The version currently available on TACC through module load is older than this.

I need help...

Hint: The previous lesson on Installing Linux tools should help you get bowtie2 and breseq installed. A suitable version of R is already installed on TACC. Remember that you can load it using the command:

Code Block
module load R
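Once everything is installed, it can help to confirm that the shell can actually find each tool before submitting a job. The following is a minimal sketch; the helper name check_tool is made up for illustration, and it only checks that a command is on your PATH, not that its version is new enough.

```shell
# Hypothetical helper: report whether a command can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: NOT found on PATH"
    fi
}

# The three programs this lesson relies on.
for tool in bowtie2 breseq R; do
    check_tool "$tool"
done
```

If bowtie2 is found, running bowtie2 --version should print its version so that you can compare it against the requirement above.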

...

The data files for this example are in the path:

Code Block
$BI/ngs_course/lambda_mixed_pop/data

...

Because this data set is relatively small (roughly 100x coverage of a 48,000 bp genome), a breseq run will take < 5 minutes. Submit this command to the TACC development queue.

Code Block

breseq -r lambda.gbk lambda_mixed_population.fastq > log.txt

Progress messages will stream by during the breseq run. They report the steps of a pipeline that combines read mapping (using Bowtie 2), variant calling, mutation annotation, and so on. You can watch them by following the log.txt file as your job runs using tail -f. The -f option means to "follow" the file and keep printing new output as the file grows. You will need to wait for your job to start running before you can tail -f log.txt.
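Because log.txt does not exist until the job starts, one option is to wait for the file to appear and only then follow it. This is a rough sketch rather than anything breseq or TACC provides; the function name wait_for_file and the default 600 second limit are assumptions chosen for illustration.

```shell
# Hypothetical helper: wait until a file exists, giving up after a time limit.
wait_for_file() {
    file=$1             # file to wait for, e.g. log.txt
    limit=${2:-600}     # maximum number of seconds to wait (assumed default)
    waited=0
    while [ ! -e "$file" ]; do
        if [ "$waited" -ge "$limit" ]; then
            echo "gave up waiting for $file" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage (press Ctrl-C to stop following the log):
#   wait_for_file log.txt && tail -f log.txt
```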

Looking at breseq predictions

...

Need some help?

If you use scp, you will need to run it in a terminal on your desktop and not on the remote TACC system. It can be tricky to figure out where the files are on the remote TACC system, because your desktop won't understand what $HOME, $WORK, and $SCRATCH mean (they are only defined on TACC).

To figure out the full path to your file, you can use the pwd command in your terminal on TACC:

Code Block
login1$ pwd

Then try a command like this on your desktop:

Code Block
desktop1$ scp -r username@lonestar.tacc.utexas.edu:/the/directory/returned/by/pwd/output .
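To avoid retyping the path by hand, you could let the TACC shell print the whole scp command for you and then paste it into your desktop terminal. The sketch below is only an illustration: the function name print_scp_command is made up, and it assumes that $USER on TACC holds the username you log in with.

```shell
# Hypothetical helper: run this on TACC, in the directory that contains
# the breseq output folder. It prints a command to paste on your desktop.
print_scp_command() {
    dir=${1:-output}    # directory to copy; defaults to "output"
    # ${USER:-username} falls back to a placeholder if USER is not set.
    echo "scp -r ${USER:-username}@lonestar.tacc.utexas.edu:$(pwd)/$dir ."
}

print_scp_command output
```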

It would be even better practice to archive and gzip the output directory before copying it: use tar -cvzf to create the archive on TACC, copy that single file, and then use tar -xvzf on your desktop to unpack it.
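The archive-then-copy approach might look like the following sketch. The function names pack_dir and unpack_archive are invented for illustration; the tar options are the ones mentioned above.

```shell
# Hypothetical helper for TACC: bundle a directory into <name>.tar.gz.
pack_dir() {
    # -c create, -v list files as they are added, -z gzip, -f archive name
    tar -cvzf "$1.tar.gz" "$1"
}

# Hypothetical helper for your desktop: unpack an archive made by pack_dir.
unpack_archive() {
    # -x extract, -v list files as they are extracted, -z gunzip, -f archive name
    tar -xvzf "$1"
}

# Usage:
#   on TACC:     pack_dir output
#   on desktop:  scp the single output.tar.gz file, then
#                unpack_archive output.tar.gz
```

Copying one compressed file is usually faster than copying a directory that holds many small files.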

Inside the output directory is a file called index.html. Open this in a web browser on your desktop and click around to take a look at the mutation predictions and summary information.

...

The data files for this example are in the path:

Code Block
$BI/ngs_course/ecoli_clones/data

...