Evaluating & Visualizing assemblies
Visualizing assemblies using Mauve
Mauve uses a fast sequence matching algorithm to identify Locally Collinear Blocks (LCBs) between genomes. It has a really slick interface for viewing whole-genome alignments or for viewing your genome assembly aligned to a closely related species.
Install Mauve
On your Desktop (local) machine, go here in a web browser:
Choose Linux (it's ok to leave the other boxes blank).
Mauve has a binary that you should be able to launch by double-clicking on it.
Download Data
In a terminal on your Desktop (local) machine, run:
scp -r username@lonestar.tacc.utexas.edu:/corral-repl/utexas/BioITeam/ngs_course/mauve_examples ~/Desktop
Example: Comparing two whole genomes
First we're just going to view two related species versus one another.
Choose: File → Align with progressiveMauve...
Navigate to the mauve_examples folder that you downloaded and choose:
NC_000913.2.gbk
NC_004631.1.gbk
Then hit the Align... button. Choose to save the result as genome_alignment_result or something similar.
A console window will pop up and show a bunch of commands that are being run at the command line for you. You could run them yourself at the terminal if you wanted to use Mauve on Lonestar or another machine in power-user mode.
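For the power-user route mentioned above, here is a minimal Python sketch of how the same alignment could be launched from a script. It assumes the `progressiveMauve` executable that ships with Mauve is on your PATH and that you are in the mauve_examples folder; the helper names `build_progressive_mauve_command` and `run_alignment` are made up for this illustration.

```python
import shutil
import subprocess


def build_progressive_mauve_command(output_path, genome_paths):
    """Return the argument list for a progressiveMauve run.

    Assumption: the executable is named 'progressiveMauve' and is on the PATH.
    """
    if len(genome_paths) < 2:
        raise ValueError("need at least two genomes to align")
    return ["progressiveMauve", f"--output={output_path}", *genome_paths]


def run_alignment(command):
    """Run the command only if the aligner is installed; return True if it ran."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


command = build_progressive_mauve_command(
    "genome_alignment_result",
    ["NC_000913.2.gbk", "NC_004631.1.gbk"],
)
# Show the equivalent terminal command without running anything.
print(" ".join(command))
```

Calling `run_alignment(command)` would then start the alignment. The console window that Mauve opens shows the exact command it ran, so compare against that if this sketch behaves differently on your install.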
After a little bit, a window will pop up showing the aligned genomes. It should look something like this:
What's going on?
From the Mauve manual:
"When a block lies above the center line the aligned region is in the forward orientation relative to the first genome sequence. Blocks below the center line indicate regions that align in the reverse complement (inverse) orientation. Regions outside blocks lack detectable homology among the input genomes. Inside each block Mauve draws a similarity profile of the genome sequence. The height of the similarity profile corresponds to the average level of conservation in that region of the genome sequence. Areas that are completely white were not aligned and probably contain sequence elements specific to a particular genome."
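To make "forward" versus "reverse complement" orientation concrete, here is a toy Python sketch. It only compares short strings for exact matches; it is not how Mauve finds or aligns blocks, and the function names are invented for this example.

```python
# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA string."""
    return sequence.translate(COMPLEMENT)[::-1]


def block_orientation(reference_block, other_block):
    """Classify a block relative to the first (reference) genome.

    'forward'   -> would be drawn above the center line
    'reverse'   -> would be drawn below the center line
    'unaligned' -> no exact match in either orientation (toy criterion)
    """
    if other_block == reference_block:
        return "forward"
    if other_block == reverse_complement(reference_block):
        return "reverse"
    return "unaligned"


reference = "ATGGCGTTAC"
print(block_orientation(reference, "ATGGCGTTAC"))
print(block_orientation(reference, reverse_complement(reference)))
```

In the viewer, an inverted region in the second genome shows up as a block below the center line, connected to its forward-orientation partner in the first genome.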
Basics of navigating in Mauve
- You can move around and zoom in and out using control and the arrow keys.
- You can click on a region in one genome to center the aligned regions in the other genome to it.
- You can switch which genome is the main reference by clicking on the up and down arrows on the left side.
- If you zoom in far enough, sequence features (genes) will show up.
Example: Comparing assemblies to a reference genome
Next, we are going to compare several assemblies that were created in.
Choose: Tools → Move Contigs...
Move Contigs reorders the contigs in each assembly based on their similarity so that we won't have a jumbled mess of connections and will be better able to compare the assemblies.
Navigate to the mauve_examples folder that you downloaded and choose:
NC_000913.2.gbk - reference genome for a related (but not identical) strain
pairedc20.fa
Be sure to add them in this order! We need the reference first.
To remind you of what we are looking at:
| | Set 1 | Set 2 | Set 3 | Set 4 |
|---|---|---|---|---|
| File Name | single.fa | pairedc50.fa | pairedc25.fa | pairedc20.fa |
| Read Size | 100 | 100 | 100 | 100 |
| Paired/Single Reads | Single | Paired | Paired | Paired |
| Gap Sizes | NA | 400 | 400, 3000 | 400, 3000, 1500 |
| Coverage | 50 | 50 | 25 for each subset | 20 for each subset |
| Number of Subsets | 1 | 1 | 2 | 3 |
Try out some of the other files as well.
You can also try to align all five files at once with Align with progressiveMauve...:
NC_000913.2.gbk
single.fa
pairedc20.fa
pairedc25.fa
pairedc50.fa
Warning! single.fa has a lot of sequences. If you add it to the mix, things will get ugly.
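If you want to see how many sequences each assembly contains before loading it, a quick check is to count the FASTA header lines. The sketch below assumes the files were copied to ~/Desktop/mauve_examples as in the download step; `count_fasta_headers` is a helper written for this example, not part of Mauve.

```python
import os


def count_fasta_headers(lines):
    """Count sequences in FASTA-formatted lines (one '>' header per sequence)."""
    return sum(1 for line in lines if line.startswith(">"))


# Assumed location of the example data from the download step.
examples_dir = os.path.expanduser("~/Desktop/mauve_examples")

for name in ["single.fa", "pairedc50.fa", "pairedc25.fa", "pairedc20.fa"]:
    path = os.path.join(examples_dir, name)
    if os.path.exists(path):  # silently skip files that are not present
        with open(path) as handle:
            print(name, count_fasta_headers(handle))
```

An assembly with many more contigs than the others will produce many more blocks and connecting lines in the viewer.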
Transferring annotation
Mauve has a useful feature for transferring the coordinates of genes across the alignment produced by Move Contigs. This can be a good way of assigning orthologs.
Choose: Tools → Export... → Export Orthologs
Other ways to create genome graphics
There are various powerful (non-interactive) programs for drawing the awesome pictures that you see in publications. They are not CPU intensive and you have to fiddle with the configuration files, so it's best to use them on your own computer where you can easily view the images that they create. These tools are also easiest to install on a computer where you have administrator privileges.