...

This should take ~15 minutes to complete in an idev shell. While this is running, we suggest generating .fastq files in which the molecular index has been trimmed from the reads, both to familiarize yourself with flexbar and to set up input files for a more advanced optional breseq tutorial.

Error correction evaluation:

The SSCS_Log is a great place to start. Use the tail command to look at the last 8 lines of the log file and determine how many reads made it from raw reads to error-corrected SSCS reads.

Help with the tail command:
tail -n 8 SSCS_Log
Approximately what fraction of raw reads became SSCS reads?

There are approximately 1/6th as many SSCS reads as raw reads:

Total Reads: 6066836

SSCS count: 978142

This comparison is somewhat misleading, since it takes a minimum of 2 reads to generate a single SSCS read, but the log does give some additional information about what happened to the other reads. The first thing to consider is the "Dual MI Reads" count: these are the reads that correctly had the 12bp of degenerate sequence and the 4bp anchor. In this case, more than 1.5 million reads lacked an identifiable molecular index on read 1 and/or read 2. Counting only the reads with an identifiable molecular index, we had ~1/4 as many SSCS reads as raw reads.
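The ratio quoted above can be checked from the command line rather than by hand. The following is a minimal sketch, assuming the log contains lines of the form "Total Reads: N" and "SSCS count: N" as shown above; the function name sscs_fraction is hypothetical and not part of any tool used in this tutorial.

```bash
# Hypothetical helper: reads log text on stdin and prints SSCS reads
# as a fraction of raw reads.
# Assumes lines of the form "Total Reads: N" and "SSCS count: N".
sscs_fraction() {
    awk -F': *' '
        /^Total Reads:/ { total = $2 }
        /^SSCS count:/  { sscs = $2 }
        END {
            if (total > 0) printf "%.3f\n", sscs / total
            else { print "Total Reads not found" > "/dev/stderr"; exit 1 }
        }'
}

# Using the two values quoted above (prints 0.161, i.e. roughly 1/6):
printf 'Total Reads: 6066836\nSSCS count: 978142\n' | sscs_fraction
```

On real data, something like `tail -n 8 SSCS_Log | sscs_fraction` should give the same answer, provided both lines fall within the last 8 lines of your log.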

Perhaps more interesting is the number of errors removed. This is also available in the SSCS_Log file, but it sits in the middle of the file and has no good handle to grep for. One option is to cat the entire file and scroll around; another is to combine the tail and head commands to print only the specific lines:

Help with the tail and head commands:
tail -n 94 SSCS_Log | head -n 86
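If it is not obvious why this pipeline selects a block from the middle of the file, the sketch below runs the same tail/head combination on a numbered 100-line stand-in generated with seq. This is an illustration only, not the real log: tail keeps the last 94 lines, then head keeps the first 86 of those, which drops the final 8 lines.

```bash
# Stand-in for a 100-line file: the numbers 1..100, one per line.
# tail -n 94 keeps lines 7..100; head -n 86 then keeps lines 7..92.
seq 1 100 | tail -n 94 | head -n 86 | head -n 1   # first line kept: 7
seq 1 100 | tail -n 94 | head -n 86 | tail -n 1   # last line kept: 92
```

Applied to SSCS_Log, the same logic prints 86 lines ending just before the 8-line summary at the end of the file.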

The 3 columns are the read position, the number of bases changed, and the number of bases not changed. If you copy and paste these 3 columns into Excel, you can easily calculate the sum of the 2nd column to see the total number of bases that were changed.
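As an alternative to a spreadsheet, the column sums can be computed directly with awk. This is a minimal sketch, assuming the selected lines are whitespace-separated with the three columns in the order described above; the function name sum_changes is hypothetical, and the sample rows are made up for illustration and are not values from the real log.

```bash
# Hypothetical helper: sums columns 2 and 3 of whitespace-separated rows on stdin.
# Assumes each row is "<read position> <bases changed> <bases not changed>".
sum_changes() {
    awk 'NF >= 3 { changed += $2; unchanged += $3 }
         END { printf "changed=%d unchanged=%d\n", changed, unchanged }'
}

# Made-up rows for illustration only (not from the real SSCS_Log):
printf '1 5 95\n2 3 97\n3 2 98\n' | sum_changes   # changed=10 unchanged=290
```

On the real log, something like `tail -n 94 SSCS_Log | head -n 86 | sum_changes` should report both totals, if the lines match the assumed layout.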

Tutorial (Trimmed Reads):

...