| Code Block | ||
|---|---|---|
| ||
cds
cd my_rnaseq_course
cp -r /corral-repl/utexas/BioITeam/rnaseq_course/exercises .
cd exercises |
A) We have the fastq file, test.fastq. Can you find out how many reads we have in this fastq file? Can you think of multiple ways to do this?
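As one possible starting point, here is a minimal sketch of two ways to count reads. It assumes test.fastq is uncompressed and that every record occupies exactly four lines. The function names `count_reads` and `count_reads_awk` are made up for illustration.

```shell
# Way 1: count all lines and divide by four
# (assumes four lines per FASTQ record).
count_reads() {
    local fastq="$1"
    local lines
    lines=$(wc -l < "$fastq")
    echo $(( lines / 4 ))
}

# Way 2: count only the header line of each four-line record.
count_reads_awk() {
    local fastq="$1"
    awk 'NR % 4 == 1 { n++ } END { print n + 0 }' "$fastq"
}
```

Run them as `count_reads test.fastq` and `count_reads_awk test.fastq`. Counting lines that start with `@` using grep is tempting but unreliable, because quality strings can also begin with `@`.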
B) The instructions asked you to copy the exercises directory, but I left out one file. Can you copy it over to your exercises directory? You'll need to view it because it contains your next exercise.
The file is at
/corral-repl/utexas/BioITeam/rnaseq_course/C
| Expand | |
|---|---|
|
| |
Do not copy the whole directory over, because that will overwrite your current exercises directory. |
C) We are concerned that a weird artifact sequence, ACTACCGATCCA, may be in our data. Can you find out what proportion of our reads contain this artifact?
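A rough sketch of one way to estimate this proportion, assuming an uncompressed four-line-per-record FASTQ file. It looks only at the sequence line of each record and only for the forward orientation of the artifact. The function name `artifact_fraction` is hypothetical.

```shell
# Report how many reads contain a given sequence.
# Only line 2 of each four-line record (the bases) is examined.
artifact_fraction() {
    local fastq="$1" seq="$2"
    awk -v seq="$seq" '
        NR % 4 == 2 {
            total++
            if (index($0, seq) > 0) hits++
        }
        END {
            if (total > 0)
                printf "%d of %d reads (%.4f)\n", hits, total, hits / total
        }' "$fastq"
}
```

Call it as `artifact_fraction test.fastq ACTACCGATCCA`. If the artifact could also appear as its reverse complement, that sequence would need a second pass.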
D) We want to trim this sequence out of our data. Can you do that?
| Expand | ||
|---|---|---|
| ||
Use Fastx_toolkit! Here's the page where we covered that: FASTQ Quality Assurance tools |
E) OK, we've mapped the data using TopHat. We have a BAM file (test.bam) and an annotation file (genes.formatted.gtf), and we want to assemble novel and annotated transcripts using Cufflinks. Can you submit a Cufflinks job to Lonestar? Of course, we are not fully utilizing Lonestar because we are submitting just one job, but this is for practice.
| Expand | ||
|---|---|---|
| ||
Having trouble constructing the cufflinks command? Load the module and type cufflinks to see the options, or look at commands we've used before: Tuxedo Suite For Splice Variant Analysis and Identifying Novel Transcripts II 2014
module load cufflinks
cufflinks
Having trouble submitting jobs to Lonestar? Remember the three steps:
1. Create a commands file.
2. Create a launcher using launcher_creator.py.
3. Submit the job using qsub. |
F) Changing direction a bit, let's parse our alignment output file, test.bam. I'm concerned about the region on chromosome 2L between positions 7620 and 7700. Could you pull out just the alignments in this region from the BAM file?
| Expand | ||
|---|---|---|
| ||
Samtools has everything needed to parse SAM and BAM files. You may need to consult the samtools manual (http://samtools.sourceforge.net/samtools.shtml) to figure out how to do this. |
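If the manual leaves you stuck, the general pattern for a region query looks roughly like the sketch below. It assumes the BAM file is coordinate-sorted and that the chromosome is named `2L` in the BAM header (it may be `chr2L` instead; `samtools view -H test.bam` will show the names). The wrapper name `extract_region` and the output file name are made up for illustration.

```shell
# Extract the alignments overlapping a region into a new BAM file.
# Assumes a coordinate-sorted BAM; region queries need an index.
extract_region() {
    local bam="$1" region="$2" out="$3"
    samtools index "$bam"                        # writes the .bai index next to the BAM
    samtools view -b -o "$out" "$bam" "$region"  # -b keeps the output in BAM format
}
```

For this exercise the call would be something like `extract_region test.bam 2L:7620-7700 region.bam`. Dropping `-b` and `-o` prints the alignments as SAM text to the terminal instead.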
G) OK, we've assembled transcripts and run cuffdiff to get differential gene expression information. I can't really see what's going on just by looking at the alignments. Can you view this region in IGV to get a clue about what may be happening? This data is from the dm3 genome.
| Expand | ||
|---|---|---|
| ||
You'll need to transfer files. Review the first day's lessons if you don't remember how. |
If you want to venture even further...
H) OK, we've got some results from a differential expression tool in DESeq_output.csv. Can you pull out the top 10 changing genes from this file? The criteria are p-value <= 0.05 and up- or downregulated with abs(log2 fold change) >= 1.
| Expand | ||
|---|---|---|
| ||
Using grep and awk on the columns that represent log2 fold change and significance will give you the result.
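The hint above could be sketched as follows. Several things here are assumptions, not facts about DESeq_output.csv: that the file is comma-separated with one header row, that log2 fold change is in column 6 and the p-value in column 7 (check with `head -1 DESeq_output.csv` and adjust), and that "top 10" means the largest absolute log2 fold change. Rows with NA or Inf in either column are skipped in this sketch. The function name `top_changing_genes` is hypothetical.

```shell
# Print up to 10 rows passing pvalue <= 0.05 and |log2FC| >= 1,
# ordered by decreasing |log2FC|. Column numbers are assumptions.
top_changing_genes() {
    local csv="$1" lfc_col="${2:-6}" pval_col="${3:-7}"
    awk -F',' -v l="$lfc_col" -v p="$pval_col" '
        NR == 1 { next }                                 # skip the header row
        $l !~ /^-?[0-9.]+([eE][-+]?[0-9]+)?$/ { next }   # skip NA or Inf fold changes
        $p !~ /^[0-9.]+([eE][-+]?[0-9]+)?$/ { next }     # skip NA p-values
        {
            lfc = $l + 0
            a = (lfc < 0) ? -lfc : lfc                   # absolute log2 fold change
            if ($p + 0 <= 0.05 && a >= 1) print a "," $0 # prepend a sort key
        }' "$csv" | sort -t',' -k1,1gr | head -n 10 | cut -d',' -f2-
}
```

Usage would be `top_changing_genes DESeq_output.csv`, or `top_changing_genes DESeq_output.csv 7 8` if the columns sit elsewhere. Splitting on plain commas breaks if any field contains a quoted comma.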