...
Code Block |
---|
|
#we have two fastq files: Lib01_R1.fastq and Lib01_R2.fastq
ls *.fastq
#These are paired end reads, two separate reads from either end of the same DNA fragment
#Double-check that the files have the same number of reads. (note that fastq genreally files have 4 lines per read)
expr $(cat Lib01_R1.fastq | wc -l) / 4
expr $(cat Lib01_R2.fastq | wc -l) / 4
#7453585
#(Note these are for demo purposes and shorter than they should be)
#this ddRAD library was prepared so that forward reads were cut with the nlaIII restriction enzyme.
#Looking up nlaIII, we see it's cut site:
# CATG'
# 'GTAC
#check if we see this cut site in the forward reads
grep CATG Lib01_R1.fastq | wc -l
#compare this with the total number of reads we got above
#do these numbers make sense?
#look at where in the reads the cut site is (may help to paste into a text editor and character search)
grep CATG Lib01_R1.fastq | head -n 30
#how many barcode combinations did we have?
#First look at the barcodes file:
cat barcodes_Lib1.tsv
#then count the lines
cat barcodes_Lib1.tsv | wc -l
#how r1 and r2 fastq files did we output (ignore the 'rem' files)
ls sample_fastqs/*AGCGAC.1.fq
ls sample_fastqs/*AGCGAC.1.fq
#are the paired end files still the same length?
cat sample_CATAT-AGCGAC.1.fq | wc -l
cat sample_CATAT-AGCGAC.2.fq | wc -l |
process the reads using process_radtags (part of the Stacks package):
...
Code Block |
---|
|
#how many barcode combinations did we have?
#First look at the barcodes file:
cat barcodes_Lib1.tsv
#then count the lines
cat barcodes_Lib1.tsv | wc -l
#how many r1 and r2 fastq files did we output (ignore the 'rem' files)
ls sample_fastqs/*AGCGAC.1.fq
ls sample_fastqs/*AGCGAC.1.fq
#are ourthe paired end files still the same length?
cat sample_CATAT-AGCGAC.1.fq | wc -l
cat sample_CATAT-AGCGAC.2.fq | wc -l
#did
|