...

Email your answers/PDFs to shunickesmith <at> gmail.com, cc: Prof. Matouschek no later than TBD, 10:00 am (BETTER: before Thanksgiving break).

 

Homework - Revised 9:00 pm Monday 11/24/14  Revised 5:00 pm Thursday 11/20/14

For your homework, you will investigate the validity of combining data files from different sequencing runs.  Only a few of these questions require working at a computer keyboard, but I encourage you to work in groups to solve the entire set of questions.

Before you begin, pull up this web site in a new window - it's an all-class, live group chat to which you can post questions, get answers and even answer questions of your fellow students.  Dr. Hunicke-Smith and Benni will be monitoring it periodically (largely during the daytime and early evenings).

By next Tuesday, 12/2 11/25, 10:00 am do the following:

  1. Follow the steps listed above on this web page under the expanding section "This is your homework due 11/25..." to log into appsoma and set up access to the data you will need from here on.
  2. Move into the "rawdata" directory and find the read 1 sequence file for the MURI 102 sample from lane 6 of sequencing run SA14008. Put its first four lines into a new file in that directory called "s1.fq" and copy the contents of that file into an email.
  3. Move into the "finaldata" directory and make sure you can see the gene expression data file "all3x3.counts".
  4. Using the Linux "sort" command, sort all3x3.counts 6 times, once on the expression values of each of the 6 samples, from lowest to highest, redirecting the output of each sort into a separate file.
  5. Using the Linux command "tail -1" on each of these 6 files, copy the name of the most abundant gene from each sample into the same email.
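
The command pattern behind steps 2, 4 and 5 can be sketched on toy data as follows. Everything here is an assumption made for illustration: the file names toy_R1.fastq and toy.counts, the gene names and the counts are invented, and the layout (tab-delimited, gene name in column 1, the six samples in columns 2-7, no header line) may not match the real all3x3.counts, so check the real file with "head" before adapting the column numbers.

```shell
#!/bin/sh
# Toy demonstration only: file names, gene names and counts are invented.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# A FASTQ record is four lines, so "head -4" captures exactly the first read.
# toy_R1.fastq is a hypothetical stand-in for the real read 1 file.
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' > toy_R1.fastq
head -4 toy_R1.fastq > s1.fq

# Assumed layout: tab-delimited, gene name in column 1,
# six samples in columns 2-7, no header line.
printf 'geneA\t5\t90\t1\t7\t3\t2\n'      >  toy.counts
printf 'geneB\t50\t9\t10\t70\t30\t20\n'  >> toy.counts
printf 'geneC\t7\t900\t100\t8\t4\t200\n' >> toy.counts

tab=$(printf '\t')

# Sort on one sample column at a time. The "n" makes the comparison
# numeric; without it, "7" would sort after "50".
for col in 2 3 4 5 6 7; do
    sort -t "$tab" -k"${col},${col}n" toy.counts > "sorted_col${col}.txt"
done

# The last line of each ascending sort holds the largest value;
# "cut -f1" keeps only the gene name from that line.
for col in 2 3 4 5 6 7; do
    tail -1 "sorted_col${col}.txt" | cut -f1
done > top_genes.txt

cat top_genes.txt
```

If the real file turns out to have a header line, it will be sorted along with the data rows, so make sure the last line of each sorted file is a gene and not the header.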

...