UCSC Genome Browser tracks
The UCSC Genome Browser is an invaluable resource both for obtaining public sequencing data and for visualizing it.
Tip: The UCSC Genome Browser at http://genome.ucsc.edu/ is sometimes slow; after all, it is a resource shared by the whole eukaryotic genomics community. There is also a second, "beta test" version of the browser at http://hgwdev.cse.ucsc.edu/. It runs slightly newer (and possibly less stable) code, but fewer people use it.
Configuring custom tracks
The UCSC Genome Browser has a "Custom Tracks" feature that lets you visualize your own data in the Genome Browser web application. This data is visible only to you, not publicly (unless you choose to share a link to it with others).
There are two approaches to visualizing your data in the UCSC Genome Browser:
- Directly upload a data file, in one of the supported formats.
  - Your data is copied over the Internet to UCSC, where it is stored in tables and displayed as you browse.
  - Appropriate for small to medium size files (up to a few MB).
- Host your data locally, and configure the UCSC Genome Browser with its URL.
  - Your data resides in a location accessible via a public HTTP or FTP URL (e.g., our /corral-repl/utexas/BioITeam/web directory). No data is copied to UCSC. You only tell the browser where to find the data when it is needed.
  - Appropriate for large data sets (e.g. BAM files) that can be indexed for fast retrieval.
BED data
BED is a simple column-based format for location-oriented data, with 3 required columns (chromosome, start, end) and up to 9 optional ones.
See supported data formats for custom tracks for more information and examples.
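As a small illustration of the format, the sketch below writes a four-column BED file (chromosome, start, end, name) and runs a basic sanity check on it. The file name example.bed, the coordinates, the feature names, and the check_bed helper are all made up for illustration; they are not part of the course data.

```shell
# Write a tiny, hypothetical 4-column BED file: chrom, start, end, name.
# Columns are tab-separated; start is 0-based and end is exclusive.
printf 'chr1\t1000\t2000\tfeatureA\n'  > example.bed
printf 'chr1\t5000\t5600\tfeatureB\n' >> example.bed
printf 'chr2\t300\t900\tfeatureC\n'   >> example.bed

# check_bed FILE: report rows with fewer than 3 columns or start >= end.
# Returns non-zero if any such row is found.
check_bed() {
    awk -F '\t' '
        NF < 3 || ($2 + 0) >= ($3 + 0) { bad++; print "bad row " NR ": " $0 }
        END { exit (bad > 0 ? 1 : 0) }
    ' "$1"
}

check_bed example.bed && echo "example.bed looks OK"
```

A file this small is a natural candidate for the direct-upload approach described earlier.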
VCF data
VCF data can only be configured as a URL, not uploaded directly. Directions are found at http://genome.ucsc.edu/goldenPath/help/vcf.html.
- The VCF file must be sorted by chromosome and position (most tools produce VCFs like this).
- The VCF file must be compressed using bgzip:
module load tabix   # also loads bgzip
cd $BI/web
bgzip progeria_ctcf.vcf
- The VCF file must be indexed using tabix:
tabix -p vcf progeria_ctcf.vcf.gz
This has already been done, and the resulting files are at this URL: http://loving.corral.tacc.utexas.edu/bioiteam/ucsc_custom_tracks/. These are hg18 SNP calls from published Iyer Lab CTCF ChIP-seq data in Progeria cells. The VCF file was produced using Broad's GATK.
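If you are preparing your own VCF and it is not already sorted, the sorting requirement can be met with standard Unix tools. This is a minimal sketch, assuming an uncompressed VCF whose header lines all start with "#"; the function name sort_vcf and the file names are hypothetical. It orders chromosomes lexically (chr1, chr10, chr2, ...), which should be sufficient for tabix because it needs each chromosome's records grouped together and ordered by position.

```shell
# sort_vcf IN OUT: copy the header lines first, then sort the records by
# chromosome (column 1, lexical) and position (column 2, numeric).
sort_vcf() {
    {
        grep '^#' "$1"
        grep -v '^#' "$1" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
    } > "$2"
}

# Hypothetical usage, followed by the compression and indexing steps:
#   sort_vcf unsorted.vcf sorted.vcf
#   bgzip sorted.vcf
#   tabix -p vcf sorted.vcf.gz
```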
- Add custom tracks (be sure to pick assembly March 2006, NCBI36/hg18)
- Here is the track configuration line:
track type=vcfTabix name="progeria_ctcf_snp_calls" bigDataUrl="http://loving.corral.tacc.utexas.edu/bioiteam/ucsc_custom_tracks/progeria_ctcf.vcf.gz"
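Because the track line follows a fixed pattern, it can be generated for any hosted VCF. The sketch below uses a hypothetical helper, vcf_track_line, that fills in the track name and URL. Keep in mind that the browser looks for the tabix index at the same URL with .tbi appended, so the .vcf.gz and .vcf.gz.tbi files need to sit side by side in the hosted directory.

```shell
# vcf_track_line NAME URL: print a custom track line for a bgzip-compressed,
# tabix-indexed VCF. The helper name is hypothetical; the line format is the
# one shown above.
vcf_track_line() {
    printf 'track type=vcfTabix name="%s" bigDataUrl="%s"\n' "$1" "$2"
}

vcf_track_line "progeria_ctcf_snp_calls" \
    "http://loving.corral.tacc.utexas.edu/bioiteam/ucsc_custom_tracks/progeria_ctcf.vcf.gz"
```

The printed line can be pasted into the custom tracks text box after selecting the matching assembly.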