Quick tips on GO analysis
Scott likes two approaches:
1. GoMiner: http://discover.nci.nih.gov/gominer/htgm.servlet
GoMiner requires gene names in a text file (the filename MUST end in .txt). A useful hack for getting gene names from, e.g., blast -m 8 results (presuming fish.t contains only the BLAST result identifiers, e.g. "gi|5729849|ref|NM_006496.1"):
grep -f fish.t /home/scott/data/blastdb/human_rna.annot > fish.t.genes
cat fish.t.genes | awk '{for (i=1;i<NF;i++) {if (substr($i,1,1)=="("&&substr($i,length($i),1)==",") {print substr($i,2,length($i)-3)} } }' > fish.t.genes.genenames
The first command retrieves the entire reference line from the BLAST database annotation; the second parses out the gene name (not attractively).
Then send fish.t.genes.genenames off to GoMiner (renamed so that it ends in .txt, as noted above), allowing it to select the background (you could of course send the annot file instead...)
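To see what the two commands above do, here is a minimal sketch run on made-up data. The annotation lines, the gene names EXAMPLE1 and OTHER2, and the scratch directory are all hypothetical; the real layout of human_rna.annot is assumed to look like an NCBI definition line with the gene symbol in parentheses followed by a comma. The sketch also uses grep -F (fixed strings) and writes straight to a .txt file, both of which are small departures from the original commands.

```shell
# Scratch directory so the demo does not touch real data.
demo=$(mktemp -d)

# Hypothetical annotation file: two made-up lines in NCBI definition-line style.
cat > "$demo/human_rna.annot" <<'EOF'
gi|5729849|ref|NM_006496.1| Homo sapiens example protein (G protein), subunit 3 (EXAMPLE1), mRNA
gi|0000000|ref|NM_000000.2| Homo sapiens other example transcript (OTHER2), mRNA
EOF

# fish.t holds only the BLAST hit identifiers of interest.
printf '%s\n' 'gi|5729849|ref|NM_006496.1' > "$demo/fish.t"

# Step 1: pull the full annotation line for each identifier.
# -F treats the identifiers as fixed strings, so "." is not a regex wildcard.
grep -F -f "$demo/fish.t" "$demo/human_rna.annot" > "$demo/fish.t.genes"

# Step 2: keep whitespace-separated tokens of the form "(NAME)," and
# strip the leading "(" and the trailing "),".
awk '{
  for (i = 1; i < NF; i++) {
    if (substr($i, 1, 1) == "(" && substr($i, length($i), 1) == ",") {
      print substr($i, 2, length($i) - 3)
    }
  }
}' "$demo/fish.t.genes" > "$demo/fish.t.genes.genenames.txt"

cat "$demo/fish.t.genes.genenames.txt"
```

On this sample the only token that both starts with "(" and ends with "," is "(EXAMPLE1),", so the output is the single line EXAMPLE1. The two-word parenthetical "(G protein)," is skipped because neither of its tokens meets both conditions, but a one-word parenthetical followed by a comma would be picked up as if it were a gene name, so it is worth eyeballing the result.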
2. DAVID: http://david.abcc.ncifcrf.gov/tools.jsp
In contrast to GoMiner, DAVID accepts NCBI identifiers (NM_..., NP_..., etc.), which are easy to parse out of BLAST results.
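As a rough sketch of that parsing step, the following pulls RefSeq accessions out of tabular (blast -m 8) output, where the second tab-separated column is the subject identifier. The file names, query names, and hit values are made up for illustration, the subject identifiers are assumed to have the "gi|...|ref|ACCESSION.VERSION|" form, and dropping the version suffix is an assumption about what is convenient to upload, not a documented DAVID requirement.

```shell
# Scratch directory so the demo does not touch real data.
demo=$(mktemp -d)

# Hypothetical blast -m 8 output: 12 columns per hit. Written with spaces
# and converted to tabs so the example survives copy and paste.
printf '%s\n' \
  'fish_contig1 gi|5729849|ref|NM_006496.1| 85.20 250 37 0 1 250 101 350 1e-60 230' \
  'fish_contig2 gi|5729849|ref|NM_006496.1| 90.00 100 10 0 1 100 400 499 1e-30 120' \
  'fish_contig3 gi|0000000|ref|NM_000000.2| 88.00 200 24 0 1 200 1 200 1e-50 200' \
  | tr ' ' '\t' > "$demo/blast.out"

# Column 2 is the subject id; field 4 of that id (split on "|") is the
# accession. Strip the ".N" version, then de-duplicate.
cut -f2 "$demo/blast.out" \
  | awk -F'|' '{print $4}' \
  | sed 's/\.[0-9]*$//' \
  | sort -u > "$demo/refseq_ids.txt"

cat "$demo/refseq_ids.txt"
```

With the sample hits above, the duplicate hit collapses and the file contains NM_000000 and NM_006496, one per line.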
Why would you choose one or the other?
GoMiner provides very clean output, suitable for grant applications and summaries of gene lists, but doesn't give you much to explore. DAVID's output is intended for interactive exploration when you're really trying to work out the biology.