Breseq Tutorial

Introduction

breseq is a tool developed by the Barrick lab for analyzing genome re-sequencing data from bacteria. It is primarily used to analyze laboratory evolution experiments with microbes. In these experiments, there is usually a high-quality reference genome for the ancestral strain, and the goal is to exhaustively find all of the mutations that occurred during the evolution experiment. One might then construct a phylogenetic tree of individual samples from a single population, or determine whether the same gene is mutated in many independent evolution experiments in a given environment.

Input data / expectations:

  • Haploid reference genome
  • Relatively small (<20 Mb) reference genome
  • Input FASTQ reads can be from any sequencing technology
  • Average genomic coverage > 30-fold
  • Less than ~1,000 mutations expected
  • Detects SNVs and SVs from single-end reads (does not use paired-end distance information)
  • Produces annotated HTML output
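As a rough way to check the coverage expectation above before running breseq, average coverage can be estimated as the total number of sequenced bases divided by the reference genome length. The sketch below is only an illustration: the function name `estimate_coverage` is hypothetical, and it assumes a standard FASTQ file with exactly four lines per read.

```shell
# Estimate average coverage as (total bases in reads) / (reference length).
# Assumption: the FASTQ file has exactly 4 lines per record, so the
# sequence is always the 2nd line of each group of 4.
estimate_coverage() {
    fastq="$1"          # path to a FASTQ file
    genome_length="$2"  # reference genome length in bp
    awk -v g="$genome_length" \
        'NR % 4 == 2 { bases += length($0) } END { printf "%.1f\n", bases / g }' \
        "$fastq"
}
```

For example, `estimate_coverage reads.fastq 4600000` (a hypothetical file name and genome size) prints the estimated fold coverage, which can then be compared against the 30-fold guideline.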

You can learn a great deal more about breseq by reading the Online Documentation.

Here is a rough outline of the workflow in breseq with proposed additions.

This tutorial was reformatted from the most recent version found here. Our thanks to the previous instructors.

Objectives:

  • Use a self-contained, automated pipeline to identify mutations.
  • Explain thoroughly the types of mutations found before moving on to methods better suited to higher-order organisms.

 

Example 1: Bacteriophage lambda data set

First, we'll run breseq on a small data set to be sure that it is installed correctly, and to get a taste for what the output looks like. This sample is a mixed population of bacteriophage lambda that was co-evolved in lab with its E. coli hosts.

Environment

To run breseq, the bowtie module must be loaded. Add the line "module load bowtie/2.1.0" to your profile so that the module is loaded automatically each time you log in.

Adding bowtie to your profile
cdh  #move to your home directory
echo "module load bowtie/2.1.0" >> .profile  #this command updates your profile to automatically load the bowtie module

After you've completed these commands, log out of Lonestar and log back in so that your profile is re-run.
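Note that `echo ... >> .profile` appends a new line every time it is run, so repeating the step leaves duplicate entries in your profile. A minimal sketch of a guarded append is shown here; the helper name `add_line_once` is hypothetical and not part of the original tutorial.

```shell
# Append a line to a file only if that exact line is not already present.
# grep flags: -q quiet, -x match the whole line, -F treat the pattern as a fixed string.
add_line_once() {
    line="$1"  # exact text to add
    file="$2"  # file to update
    touch "$file"  # make sure the file exists so grep does not complain
    grep -qxF -- "$line" "$file" || echo "$line" >> "$file"
}
```

With this helper, `add_line_once "module load bowtie/2.1.0" "$HOME/.profile"` can be run repeatedly without adding the line twice. After logging back in, `module list` should show which modules are currently loaded.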

Data

The data files for this example are in the path:

$BI/ngs_course/lambda_mixed_pop/data

Copy the files in this directory to a new directory called BDIB_breseq_lambda in your $SCRATCH space and cd into it.

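Solution
cds  #move to your $SCRATCH directory
mkdir BDIB_breseq_lambda  #create a directory for this example
cp $BI/ngs_course/lambda_mixed_pop/data/* BDIB_breseq_lambda  #copy the data files into it
cd BDIB_breseq_lambda
ls  #list the copied files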

If the copy worked correctly, you should see the following 2 files:

File Name

Description

Sample

lambda_mixed_population.fastq

Single-end Illumina 36-bp reads

Evolved lambda bacteriophage mixed population genome sequencing

lambda.gbk