...
Email your answers/PDFs to shunickesmith <at> gmail.com, cc: Prof. Matouschek, no later than TBD, 10:00 am (better: before Thanksgiving break).
Homework - Revised 9:00 pm Monday 11/24/14; revised 5:00 pm Thursday 11/20/14
For your homework, you will investigate the validity of combining data files from different sequencing runs. Only a few of these questions require working at a computer keyboard, but I encourage you to work in groups to solve the entire set of questions.
Before you begin, pull up this web site in a new window. It is an all-class, live group chat where you can post questions, get answers, and even answer your fellow students' questions. Dr. Hunicke-Smith and Benni will monitor it periodically (mostly during the daytime and early evenings).
By next Tuesday, 12/2 (previously 11/25), 10:00 am, do the following:
- Follow the steps listed above on this web page under the expanding section "This is your homework due 11/25..." to log in to appsoma and set up access to the data you will need from here on.
- Move into the "rawdata" directory and find the read 1 sequence file for the MURI 102 sample from lane 6 of sequencing run SA14008. Put its first four lines into a new file in that directory called "s1.fq" and copy the contents of that file into an email.
- Move into the "finaldata" directory and make sure you can see the gene expression data file "all3x3.counts".
- Using the Linux "sort" command, sort all3x3.counts 6 times, once on the expression values of each of the 6 samples, from lowest to highest, redirecting the output of each sort into a separate file.
- Using the Linux command "tail -1" on each of these 6 files, copy the name of the most abundant gene from each sample into the same email.
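
The command pattern behind the steps above can be tried on toy data first. The sketch below is only an illustration under stated assumptions: the file names toy_R1.fastq and toy.counts are hypothetical stand-ins, and the counts layout (gene name in column 1, six tab-separated counts in columns 2 to 7, no header line) is assumed, not confirmed for all3x3.counts. Check the real files with "head" before adapting it.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory with toy data so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Toy FASTQ file (hypothetical name): two reads, four lines each.
printf '%s\n' '@read1' 'ACGT' '+' 'IIII' '@read2' 'TTGA' '+' 'IIII' > toy_R1.fastq

# One FASTQ record is four lines, so head -4 extracts the first read.
head -4 toy_R1.fastq > s1.fq

# Toy counts table (ASSUMED layout): gene name, then six tab-separated counts.
printf 'geneA\t5\t90\t1\t7\t3\t2\n'      >  toy.counts
printf 'geneB\t50\t9\t10\t70\t30\t20\n'  >> toy.counts
printf 'geneC\t7\t900\t100\t8\t4\t200\n' >> toy.counts

tab=$(printf '\t')

# Sort numerically on one column at a time; sample i sits in column i+1.
for i in 1 2 3 4 5 6; do
    col=$((i + 1))
    sort -t "$tab" -k"${col},${col}n" toy.counts > "sorted_sample${i}.txt"
    # After an ascending sort, the last line holds the largest count.
    tail -1 "sorted_sample${i}.txt" | cut -f1
done
```

The "n" modifier matters here: without it, "sort" compares the values as text, so 9 would sort after 100. If the real file has a header line, it will be sorted in with the data rows, so check where it lands before trusting the last line.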
...