Mauve uses a fast sequence matching algorithm to identify Locally Collinear Blocks (LCBs) between genomes. It has a really slick interface for viewing whole-genome alignments or for viewing your genome assembly aligned to a closely related species.
On your Desktop (local) machine, go here in a web browser:
Choose Linux (it's ok to leave the other boxes blank).
In a terminal on your Desktop (local) machine:
cd ~/Desktop
scp -r username@tacc.utexas.edu:
First we're just going to view two related species versus one another.
Choose: File → Align with progressiveMauve.
Navigate to the mauve_data folder that you downloaded and choose:
NC_000913.2.gbk
NC_004631.1.gbk
Then hit the Align... button. Choose to save the result as genome_alignment_result or something similar.
A console window will pop up and show a bunch of commands that are being run at the command line for you. You could run them yourself at the terminal if you wanted to use Mauve on Lonestar or another machine in power-user mode.
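As a rough sketch of what that power-user mode might look like, the commands below build a progressiveMauve call for the same two GenBank files. The output name and file locations are assumptions taken from the steps above (run it from inside the mauve_data folder), and the exact commands the GUI prints in its console window may differ, so treat this as an illustration rather than a copy of what Mauve runs.

```shell
# Assumed names: the two GenBank files from mauve_data and the result name
# suggested above. Adjust these to wherever your files actually live.
output="genome_alignment_result"
genomes="NC_000913.2.gbk NC_004631.1.gbk"

# --output names the alignment file, which progressiveMauve writes in XMFA format.
cmd="progressiveMauve --output=$output $genomes"

# Run only if progressiveMauve is on the PATH and the first input file is present;
# otherwise just print the command so you can check it.
if command -v progressiveMauve >/dev/null 2>&1 && [ -f NC_000913.2.gbk ]; then
    $cmd
else
    echo "Not running; the command would be: $cmd"
fi
```

The resulting alignment file can then be copied back to your Desktop machine and opened in the Mauve viewer.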
After a little bit, a window will pop up showing the aligned genomes. It should look something like this
Mauve has a useful feature to transfer the coordinates of genes across the alignment that it has made.
There are various powerful (non-interactive) programs
These tools are easiest to install Probably best installed
If you are going to seriously annotate a eukaryotic genome, then you are going to need a machine with database infrastructure and other tools installed. Here are two systems for annotating gene structure in eukaryotic genomes:
PASA - Program to Assemble Spliced Alignments
CEGMA - Core Eukaryotic Genes Mapping Approach