...
First we want to generate SSCS reads, taking advantage of the molecular indexes added during library prep. For the purpose of this tutorial, the paired-end sequencing data for sample DED110 has been placed in the $BI/gva_course/mixed_population directory. To generate the SSCS reads we will use a "majority rules" Python script (named SSCS_DCS.py), which was heavily modified by DED from a script originally created by Mike Schmitt and Scott Kennedy for the original duplex seq paper. This script can be found in the $BI/scripts directory. Invoking the script is as simple as typing SSCS_DCS.py; adding -h will list the available options. The goal of this command is to generate SSCS reads for any molecular index that has at least 2 reads present, and to generate a log file that reports some information about the data.
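To make the "majority rules" idea concrete, here is a minimal Python sketch of building one consensus read per molecular index. It is not the code of SSCS_DCS.py: the function names `majority_consensus` and `build_sscs`, the toy reads, and the choice to write `N` where no base has a strict majority are all assumptions made for illustration. Only the `min_reads=2` threshold mirrors the "at least 2 reads" requirement described above.

```python
from collections import Counter, defaultdict


def majority_consensus(seqs, min_reads=2):
    """Return a per-position majority consensus, or None if too few reads.

    Hypothetical helper for illustration; SSCS_DCS.py may handle ties,
    read lengths, and quality scores differently.
    """
    if len(seqs) < min_reads:
        return None
    length = min(len(s) for s in seqs)  # compare only the shared length
    consensus = []
    for i in range(length):
        base, count = Counter(s[i] for s in seqs).most_common(1)[0]
        # Require a strict majority; otherwise mark the position ambiguous.
        consensus.append(base if count * 2 > len(seqs) else "N")
    return "".join(consensus)


def build_sscs(tagged_reads, min_reads=2):
    """Group (molecular_index, sequence) pairs and build one consensus per index."""
    families = defaultdict(list)
    for index, seq in tagged_reads:
        families[index].append(seq)
    result = {}
    for index, seqs in families.items():
        cons = majority_consensus(seqs, min_reads)
        if cons is not None:  # indexes with too few reads are dropped
            result[index] = cons
    return result


# Toy data: three reads share index AAAC, one read has index GGTT.
reads = [
    ("AAAC", "ACGTACGT"),
    ("AAAC", "ACGTACGT"),
    ("AAAC", "ACGAACGT"),  # one disagreeing base at position 4
    ("GGTT", "TTTTCCCC"),  # only one read, below the minimum
]
sscs = build_sscs(reads, min_reads=2)
print(sscs)
```

In this toy example the single disagreeing base in the AAAC family is outvoted, and the GGTT index is discarded because it is supported by only one read.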
Expand |
---|
title | If you need help to copy the fastq files click the triangle... |
---|
|
Code Block |
---|
SSCS_DCS.py -f1 fastq/DED110_CATGGC_L006_R1_001.fastq -f2 fastq/DED110_CATGGC_L006_R2_001.fastq -p DED110 -s -m 2 --log SSCS_Log |
|
Expand |
---|
title | If you need a hint without the answer click the triangle... |
---|
|
The following arguments are the ones needed to generate just the SSCS reads: Code Block |
---|
-f1 FASTQ1, --fastq1 FASTQ1
fastq read1 file to check
-f2 FASTQ2, --fastq2 FASTQ2
fastq read2 file to check
-p PREFIX, --prefix PREFIX
prefix for output files
-s, --SSCS calculate SSCS sequence, off by default. IF DCS
specificed, automatically on
-m MINIMUM_READS, --minimum_reads MINIMUM_READS
minimum number of reads needed to support SSCS reads
--log LOG name of output log file |
Expand |
---|
title | If you are still stuck and want the answer click the triangle... |
---|
| Code Block |
---|
SSCS_DCS.py -f1 fastq/DED110_CATGGC_L006_R1_001.fastq -f2 fastq/DED110_CATGGC_L006_R2_001.fastq -p DED110 -s -m 2 --log SSCS_Log |
|
|
...
Expand |
---|
title | If you need a hint without the answer click the triangle... |
---|
|
The following arguments are the ones that are needed to successfully trim the first 16 bases of the sequence: Code Block |
---|
-u, --max-uncalled NUM
Allowed uncalled bases (N or .) in reads, default: 0
-x, --pre-trim-left NUM
Trim specified number of bases on 5' end of reads before alignment
-t, --target STR
Prefix for output file names
-s, --source FILE
Input file with reads, that may contain barcodes
-p, --source2 FILE
Second input file for paired read scenario
-f, --format STR
Input format of reads: csfasta, csfastq, fasta, fastq, fastq-sanger, fastq-solexa, fastq-i1.3, fastq-i1.5,
fastq-i1.8 (illumina 1.8+) |
Expand |
---|
title | If you are still stuck and want the answer click the triangle... |
---|
| Code Block |
---|
flexbar -u 100 -x 16 -t trimmed -s fastq/DED110_CATGGC_L006_R1_001.fastq -p fastq/DED110_CATGGC_L006_R2_001.fastq -f fastq |
|
|
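As a rough picture of what trimming the first 16 bases means for a single read, the sketch below removes the same number of characters from both the sequence line and the quality line of a FASTQ record. This illustrates the concept only and is not how flexbar is implemented; the function name `pre_trim_left` and the example record are hypothetical.

```python
def pre_trim_left(record, n=16):
    """Drop the first n bases (5' end) from a FASTQ record.

    `record` is a (header, sequence, separator, quality) tuple. The
    quality string is trimmed by the same amount so that it stays
    aligned with the sequence.
    """
    header, seq, plus, qual = record
    return (header, seq[n:], plus, qual[n:])


# Hypothetical 24-base read: 16 leading bases to remove, then 8 bases to keep.
record = (
    "@read1",
    "ACGTACGTACGTACGT" + "TTGGCCAA",
    "+",
    "I" * 16 + "F" * 8,
)
trimmed = pre_trim_left(record, 16)
print(trimmed[1])  # sequence after trimming
print(trimmed[3])  # quality string after trimming
```

Trimming both lines together matters because a FASTQ record whose sequence and quality strings differ in length is invalid for downstream tools.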
...