Evaluating eukaryotic assemblies
PASA
CEGMA
Visualizing assemblies
Mauve - progressiveAlignment
IGV - glimmer to bed or gff3 conversion
Creating publication quality genome graphics
- Circos
- cgview
Probably best installed
Using Mauve to visualize the assembly
Copy the contigs.fa files from the single_out, pairedc20_out, pairedc25_out, and pairedc50_out directories that were assembled earlier to a directory on your computer's Desktop. Rename each file as you copy it: they are all named contigs.fa, so without renaming, each copy will overwrite the previous one.
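The copy-and-rename step above can be sketched as a small shell function. This is a minimal sketch, not a required part of the tutorial: the function name copy_contigs, the output naming scheme (single_contigs.fa, pairedc20_contigs.fa, and so on), and the assumption that the four *_out directories are reachable under one local source directory are all illustrative choices, not something the original instructions specify.

```shell
# copy_contigs SRC_DIR DEST_DIR
# Hypothetical helper: copies SRC_DIR/<name>_out/contigs.fa to
# DEST_DIR/<name>_contigs.fa so the four files no longer share one name.
copy_contigs() {
    src_dir="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir"
    for name in single pairedc20 pairedc25 pairedc50; do
        src_file="$src_dir/${name}_out/contigs.fa"
        if [ -f "$src_file" ]; then
            # The new name carries the assembly label, so nothing is overwritten.
            cp "$src_file" "$dest_dir/${name}_contigs.fa"
        else
            echo "skipping $name: $src_file not found" >&2
        fi
    done
}

# Example call (paths are assumptions; adjust to your own layout):
# copy_contigs . ~/Desktop/velvet_contigs
```

If the assemblies live on a remote machine, the same renaming idea applies to scp: give each copy a distinct destination file name.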
If you were unable to complete your Velvet assembly, you can instead copy the following files to your local computer:
cd ~/Desktop
mkdir velvet_contigs
cd velvet_contigs
scp your_name@lonestar.tacc.utexas.edu:/corral-repl/utexas/BioITeam/velvet/all_fa/*paired*fa ./
"pairedc20" refers to the read set made up of 3 subsets, each with a coverage of 20.
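Before loading the files into Mauve, it can help to confirm that each copied file arrived intact. The following is a minimal sketch: count_contigs is a hypothetical helper name, and it simply counts FASTA header lines (lines starting with ">") in each file it is given.

```shell
# count_contigs FILE...
# Hypothetical helper: prints "<file><TAB><number of FASTA records>" for
# each non-empty file, and a warning on stderr for missing or empty ones.
count_contigs() {
    for f in "$@"; do
        if [ -s "$f" ]; then
            # Each FASTA record starts with a ">" header line.
            printf '%s\t%s\n' "$f" "$(grep -c '^>' "$f")"
        else
            echo "$f is missing or empty" >&2
        fi
    done
}

# Example call from inside the directory holding the copied files:
# count_contigs *.fa
```

A file that reports zero records, or that triggers the warning, most likely did not copy correctly.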
The remaining steps are done on your local computer.
Download Mauve from http://asap.ahabs.wisc.edu/mauve/download.php
The package comes ready to run, so no installation is needed: just execute the "Mauve" file.
With Mauve open, go to File -> Align with ProgressiveMauve
Find the 3 contig files ending with ".fa" that correspond to the paired reads and add them to the list of sequences to align. Then start the alignment.
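The same alignment can also be run without the graphical interface, using the progressiveMauve command-line program that ships with Mauve. The sketch below assumes progressiveMauve is on your PATH; the wrapper name align_contigs and the file names in the example call are hypothetical.

```shell
# align_contigs OUTPUT_XMFA FASTA...
# Hypothetical wrapper around progressiveMauve: refuses to run with fewer
# than two input files, then writes the alignment to OUTPUT_XMFA.
align_contigs() {
    xmfa_out="$1"
    shift
    if [ "$#" -lt 2 ]; then
        echo "need at least two FASTA files to align" >&2
        return 1
    fi
    progressiveMauve --output="$xmfa_out" "$@"
}

# Example call (file names are assumptions; use your own renamed files):
# align_contigs paired.xmfa pairedc20_contigs.fa pairedc25_contigs.fa pairedc50_contigs.fa
```

The resulting .xmfa file can then be opened in the Mauve viewer to get the same kind of display as the interactive route.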
When the alignment is complete, you'll see a graph.
The Mauve website explains how to read this graph (http://asap.ahabs.wisc.edu/mauve-aligner/mauve-user-guide/using-the-alignment-viewer.html):
When a block lies above the center line the aligned region is in the forward orientation relative to the first genome sequence. Blocks below the center line indicate regions that align in the reverse complement (inverse) orientation. Regions outside blocks lack detectable homology among the input genomes. Inside each block Mauve draws a similarity profile of the genome sequence. The height of the similarity profile corresponds to the average level of conservation in that region of the genome sequence. Areas that are completely white were not aligned and probably contain sequence elements specific to a particular genome.
You can choose which set is used as the reference by clicking the R button at the left of that set. To compare the similarities among the 3 sets, use the up and down arrows at the left.