...

  1. Use a Python script to generate SSCS reads.
  2. Use cutadapt to trim molecular indexes from duplex seq libraries.

...

2 example commands that will work:
# -u 17 removes the first 17 bases of each read; cutadapt takes the input file as a positional argument
cutadapt -u 17 -o DED110.R1.trimmed.fastq DED110_CATGGC_L006_R1_001.fastq
cutadapt -u 17 -o DED110.R2.trimmed.fastq DED110_CATGGC_L006_R2_001.fastq
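
To see what removing a fixed number of leading bases does to each record, here is a minimal sketch in plain bash and awk. It is only an illustration, not how cutadapt is implemented; the function name `trim_first_bases` is hypothetical, and it assumes every FASTQ record is exactly 4 lines (no wrapped sequences).

```shell
#!/usr/bin/env bash
# Hypothetical illustration: drop the first N characters from the sequence
# line (line 2) and the quality line (line 4) of every 4-line FASTQ record.
# Header lines and '+' separator lines pass through unchanged.
trim_first_bases() {
    local n=$1
    awk -v n="$n" '
        NR % 4 == 2 || NR % 4 == 0 { print substr($0, n + 1); next }
        { print }
    '
}
```

Piping a FASTQ file through `trim_first_bases 17` should shorten every sequence and quality string by 17 characters while leaving the read names alone, which is a quick way to sanity-check the trimmed output by eye.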

...

Possible answers:
# 1. Use a semicolon to separate the two commands so that the second starts as soon as the first finishes:
cutadapt -u 17 -o DED110.R1.trimmed.fastq DED110_CATGGC_L006_R1_001.fastq; cutadapt -u 17 -o DED110.R2.trimmed.fastq DED110_CATGGC_L006_R2_001.fastq

# 2. Use && between the commands so that the second starts as soon as the first finishes, but only if it finishes without any errors:
cutadapt -u 17 -o DED110.R1.trimmed.fastq DED110_CATGGC_L006_R1_001.fastq && cutadapt -u 17 -o DED110.R2.trimmed.fastq DED110_CATGGC_L006_R2_001.fastq

# 3. Use a trailing & to have both commands run in the background at the same time:
cutadapt -u 17 -o DED110.R1.trimmed.fastq DED110_CATGGC_L006_R1_001.fastq &
cutadapt -u 17 -o DED110.R2.trimmed.fastq DED110_CATGGC_L006_R2_001.fastq &

...

Answer and small discussion:

The 3rd solution will finish before the other two because both commands run at the same time rather than one waiting for the other to finish. In many circumstances this is among the best ways to do something like this, and 'simple' read trimming with cutadapt is one of them. If you are doing something much more computationally intense (say read mapping, variant calling, or genome assembly), trying to complete the tasks at the same time will often leave you with no results at all, because you run out of memory even on the compute nodes and the programs error out.
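
One practical catch with background jobs is knowing when they have all finished and whether any of them failed. The sketch below is one way to handle that with bash's built-in `wait`; the helper name `run_all_in_background` is hypothetical and not part of the tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start every argument as its own background command,
# then wait for each one and return the number of commands that failed
# (0 means everything succeeded).
run_all_in_background() {
    local pids=() failures=0 cmd pid
    for cmd in "$@"; do
        bash -c "$cmd" &      # launch without waiting for it to finish
        pids+=("$!")          # remember the process ID of the job just started
    done
    for pid in "${pids[@]}"; do
        # 'wait PID' blocks until that job ends and returns its exit status
        wait "$pid" || failures=$((failures + 1))
    done
    return "$failures"
}
```

Each argument would be one complete trimming command quoted as a single string. Because every job is started before any is waited on, the total run time is roughly that of the slowest job, which is the behavior described for the 3rd solution.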

...

Possible solution:
cat *.trimmed.fastq > DED110_all.trimmed.fastq  
# The above could also be done as 2 sequential steps with naming each file separately, and using a >> on the second line.
head DED110_all.trimmed.fastq
tail DED110_all.trimmed.fastq
wc -l *.trimmed.fastq
 
# These 4 commands should give you all the information you need to make sure you have a single file with all the information from the first 2. Ask if you aren't sure you have the right solution.
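
The same check can be automated. The following is a minimal sketch, assuming 4-line FASTQ records; the function name `check_combined` is hypothetical. It succeeds only when the combined file has exactly as many lines as the input files added together and that total is a multiple of 4.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check_combined COMBINED INPUT1 [INPUT2 ...]
# Succeeds (exit status 0) only if COMBINED has the same number of lines as
# all INPUT files together, and that number is divisible by 4.
check_combined() {
    local combined=$1
    shift
    local expected=0 actual f
    for f in "$@"; do
        # Reading from stdin makes wc print only the count, not the file name
        expected=$(( expected + $(wc -l < "$f") ))
    done
    actual=$(( $(wc -l < "$combined") ))
    [ "$actual" -eq "$expected" ] && [ $(( actual % 4 )) -eq 0 ]
}
```

For example, `check_combined DED110_all.trimmed.fastq DED110.R1.trimmed.fastq DED110.R2.trimmed.fastq && echo OK` would print OK only when no lines were lost or duplicated during concatenation.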

Next step:

You should now have 2 new .fastq files to call variants in: DED110_SSCS.fastq and DED110_all.trimmed.fastq. Take these files into a more in-depth breseq tutorial to compare the specific mutations that are eliminated by the error correction (SSCS). Link to other tutorial.

Optional (not recommended) tutorial: trimming reads with flexbar

For another discussion about version control and when it is necessary to update to new tools and versions of programs, take a look at the trimmed reads tutorial from last year, which used flexbar simply because 'it worked before so keep using it'. Compare the simple cutadapt commands used in this tutorial to all the work that went into flexbar last year. So while "well enough can be left alone", sometimes it is still better to use new tools. As the heading suggests, we don't actually suggest that you USE flexbar to trim this data set or any other; it is just worth looking at to see how different programs operate or are invoked to achieve the same goals.


Return to GVA2019