Custom Genome Databases

Setting up a genome database and/or BLAST server for your new assembly

Genome databases

Showing you how to build and host a genome database is beyond the scope of this course, but the general approach is well established and not too hard if you use the GMOD toolkit. GMOD is the infrastructure behind a large number of organism databases, including those for X. tropicalis, S. cerevisiae, E. coli, worm, and fly. It is widely used and well supported, and educational sessions are offered frequently.

As a starting point, I recommend using only the GBrowse browser and avoiding Chado unless you already have, or want to gain, experience with LAMP stacks, PHP, and MySQL databases.

BLAST server

Fortunately, thanks to a clever bit of code written by Yannick Wurm and colleagues (SequenceServer), it's very easy to make your sequences searchable by the whole world via BLAST.

A recommended path:
a) Use your credit card to open an account at Rackspace (any other cloud provider would work too, of course)
b) Spin up a virtual machine based on the Bio-Linux distribution of Linux, which comes with a host of bioinformatics tools pre-installed, including BLAST
c) Upload your data
d) Start SequenceServer
e) Share the IP address with anyone you want!
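
Steps c) and d) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the course's own procedure: it assumes the NCBI BLAST+ `makeblastdb` tool and the `sequenceserver` command are already installed on the virtual machine, and the directory `/home/ubuntu/blastdb` and file name `my_assembly.fasta` are hypothetical placeholders. The `-d` option for SequenceServer is also an assumption; check `sequenceserver --help` on your install. The functions only build the command lines and print them.

```python
import shlex
from pathlib import PurePosixPath


def makeblastdb_command(fasta_path, title):
    """Build the BLAST+ command that formats a nucleotide FASTA file as a BLAST database."""
    return [
        "makeblastdb",
        "-in", str(fasta_path),
        "-dbtype", "nucl",   # nucleotide database (use "prot" for proteins)
        "-title", title,
        "-parse_seqids",     # index sequence IDs so hits can be retrieved by ID
    ]


def sequenceserver_command(database_dir):
    """Build the command that starts SequenceServer on a directory of BLAST databases.

    The -d (database directory) option is an assumption about the installed version.
    """
    return ["sequenceserver", "-d", str(database_dir)]


# Hypothetical locations on the virtual machine.
db_dir = PurePosixPath("/home/ubuntu/blastdb")
assembly = db_dir / "my_assembly.fasta"

for cmd in (makeblastdb_command(assembly, "My assembly"),
            sequenceserver_command(db_dir)):
    # Only print the commands here; nothing is executed.
    print(shlex.join(cmd))
```

To run a command rather than print it, pass the list to `subprocess.run(cmd, check=True)`.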

An example of a custom genome database set up with SequenceServer can be found here.

Enter the following sequence:

CGGCGTAAACGCCTTATCCGGCCTACAAAAATGTGCAAATTCAATAAATTGCAATTCAACTTGTAGGCCT
GATAAGCGCAGCGCATCAGGCAATTTGGCGTTGCCGTCAGTCTCAGTTAATCAGGTTACAACGATTAACC
CTGCAGCAGAGACAGAACCTGCTGCGGTACCTGGTTAGCTTTTGCCAACACGGAGTTACCGGCCTGCTGG
ATGATCTGCGCTTTCGACATATTGGACACTTCGGTCGCATAGTCGGCGTCCTGAATACGGGACTGCGCTT

Select blastn as your BLAST method, check NC_000913.2.fasta as your nucleotide database, and finally click BLAST!
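
The same search can also be prepared for the command line. The sketch below is an illustration, not part of the SequenceServer interface: it cleans up the pasted sequence, checks that it is nucleotide (which is why blastn is the appropriate method), wraps it as a FASTA record, and builds a `blastn` command. It assumes BLAST+ is installed and that a database built from NC_000913.2.fasta is available under that name; the file name `query.fasta` and the record name `example_query` are hypothetical.

```python
PASTED = """
CGGCGTAAACGCCTTATCCGGCCTACAAAAATGTGCAAATTCAATAAATTGCAATTCAACTTGTAGGCCT
GATAAGCGCAGCGCATCAGGCAATTTGGCGTTGCCGTCAGTCTCAGTTAATCAGGTTACAACGATTAACC
CTGCAGCAGAGACAGAACCTGCTGCGGTACCTGGTTAGCTTTTGCCAACACGGAGTTACCGGCCTGCTGG
ATGATCTGCGCTTTCGACATATTGGACACTTCGGTCGCATAGTCGGCGTCCTGAATACGGGACTGCGCTT
"""


def clean_sequence(raw):
    """Remove whitespace and line breaks, and upper-case the sequence."""
    return "".join(raw.split()).upper()


def is_nucleotide(seq):
    """Return True if the sequence is non-empty and contains only A, C, G, T, or N."""
    return bool(seq) and set(seq) <= set("ACGTN")


def to_fasta(name, seq, width=70):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def blastn_command(query_path, db):
    """Build a blastn command with tabular output (-outfmt 6)."""
    return ["blastn", "-query", query_path, "-db", db, "-outfmt", "6"]


seq = clean_sequence(PASTED)
if is_nucleotide(seq):
    # Print the FASTA record and the command; nothing is written or executed.
    print(to_fasta("example_query", seq), end="")
    print(" ".join(blastn_command("query.fasta", "NC_000913.2.fasta")))
```

Saving the printed record as `query.fasta` and running the printed command on the server should be roughly equivalent to the web search described above, provided the database name matches.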