...

Note: Running this data set is not recommended

Running this data set yourself is not recommended, at least not until you have completed some of the advanced breseq tutorials, for 3 reasons:

  1. At best, this data set is expected to take ~1 hour to complete.
  2. Version conflicts with bowtie2 between TACC and breseq prevent achieving that best-case time.
  3. There is a potential issue with the launcher module.

Reason 3 can be worked around fairly simply, as shown in the multiqc tutorial. Reason 2 will be handled in the breseq installation tutorial, after which the run will still take ~1 hour.

Therefore, while the information on how to run this data is below, it is recommended that most of you focus on just downloading the results from when I ran it.

Partial scp command to copy to the current directory of your local computer:
scp <user>@ls5.tacc.utexas.edu:/corral-repl/utexas/BioITeam/ngs_course/lambda_mixed_pop/breseq_SRR030257_run_output_folder.tar.gz .

Once you have downloaded the compressed output to your local computer, you can jump down to evaluating it, or read through the commands I used for the run along with some more explanation of what is going on.
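Before you can evaluate the results, the downloaded archive has to be unpacked. The following is a minimal sketch of one way to do that, not necessarily how it was done for the course. The function name `unpack_run_output` is hypothetical, and the name of the folder inside the archive is not stated here, so the helper lists the first few entries before extracting.

```bash
# Hypothetical helper (not part of breseq or the course materials):
# preview, then extract, a downloaded .tar.gz archive.
# Usage: unpack_run_output <archive.tar.gz> [destination_dir]
unpack_run_output() {
    local archive="$1"
    local dest="${2:-.}"   # default to the current directory

    if [ ! -f "$archive" ]; then
        echo "Archive not found: $archive" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1

    # Show the first few entries so you know what folder name to expect.
    tar -tzf "$archive" | head -n 5

    # Extract everything into the destination directory.
    tar -xzf "$archive" -C "$dest"
}

# Example, run from the directory where you ran scp:
# unpack_run_output breseq_SRR030257_run_output_folder.tar.gz
```

Listing the contents first is a small safeguard: it tells you which folder to open afterwards and confirms the download is a readable gzip archive.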


Data

As we did yesterday, we'll start by downloading our reads and reference into a new folder on scratch:

...

Some interesting example coordinates

  • Coordinate 161,041. What gene is this in and what is the effect on the protein sequence?

    Gene is pcnB, mutation is a SNP.

  • Coordinate 3,248,957. What gene is this in and what is the effect on the protein sequence?

    Gene is infB, mutation is a SNP.

  • Coordinate 3,894,997. What type of mutation is this?

    Deletion of the rbsD gene.

  • Check out the rbsA gene region. What's going on here?

    There was a large deletion. Can you figure out the exact coordinates of the endpoints?

  • Navigate to coordinate 3,289,962. Compare the results for different alignment programs and settings. Can you explain what's going on here?

    There is a 16 base deletion in the gltB gene reading frame.

  • What is going on in the pykF gene region? You might see red read pairs. What does that mean? Can you guess what type of mutation occurred here?

    The read pairs are discordantly mapped. There was an insertion of a new copy of a mobile genetic element (an IS150 element) that exists at other locations in the reference sequence.

  • See if you can find more interesting locations. Recall that in the IGV tutorial we had ~190 mutations; here we have ~40. This highlights much

In addition to highlighting the power of the statistical testing under the hood of breseq to separate signal from noise, and the concise HTML output for visualizing the mutations, hopefully you also see how you can answer the same questions even more easily with breseq.
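If you prefer the command line to the HTML report, the same questions can be asked of the GenomeDiff (.gd) file that breseq writes alongside it. This is a rough sketch under stated assumptions: the function names are hypothetical, the file is assumed to be at `output/output.gd` inside the run folder, and the layout is assumed to be tab-separated with the entry type in column 1 and the reference position in column 5, with three-letter types (SNP, DEL, MOB, ...) marking predicted mutations.

```bash
# Hypothetical helpers for a breseq GenomeDiff (.gd) file.
# Assumptions: tab-separated; column 1 = entry type; column 5 = position;
# three-letter types are mutations, two-letter types (RA, MC, JC, UN) are
# evidence, and lines starting with '#' are header/metadata lines.

# Count predicted mutations per type.
# Usage: count_mutation_types <file.gd>
count_mutation_types() {
    awk -F '\t' '
        !/^#/ && length($1) == 3 { n[$1]++ }
        END { for (t in n) print t "\t" n[t] }
    ' "$1" | sort
}

# Print the mutation entries at one reference position.
# Commas in the position (e.g. 161,041) are stripped before matching.
# Usage: mutations_at <file.gd> <position>
mutations_at() {
    local pos
    pos=$(printf '%s' "$2" | tr -d ',')
    awk -F '\t' -v p="$pos" '!/^#/ && length($1) == 3 && $5 == p' "$1"
}

# Example, assuming the run folder layout described above:
# count_mutation_types output/output.gd
# mutations_at output/output.gd 161,041
```

Summing the per-type counts gives a quick way to compare the number of breseq calls against what you saw in the IGV tutorial.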

Additional tutorials dealing with breseq

...