Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

 

Code Block
titlecheck out reads
#This library was prepared with bcgl restriction enzyme
#Check that the cut site appears in the reads
#How many reads do we have?
expr $(cat A_CTGCAG_R1.fastq | wc -l) / 4
expr $(cat B_GAAGTT_R1.fastq | wc -l) / 4
	#7011846
 
#(Note these are for demo purposes and shorter than they should be)
#How many of the reads have the cut site?
grep "CGA......TGC" A_CTGCAG_R1.fastq | wc -l
	#3429158
 
#3429158/7011846 = 0.49
#Why is the cut site only in half the reads?
 
#This occurs because during the 2bRAD library prep
#the adapters can be ligated to the fragment in either
#orientation (note the mirrored sticky ends left by bcgl).
#So the genomic fragment may have been read from either direction
 
#check for reverse complement of cut site
grep "GCA......TCG" A_CTGCAG_R1.fastq | wc -l
	#3274171
 
#(3429158 + 3274171) / 7011846 = 0.96
#Good, as expected the vast majority of the reads have the cut site.
#These look like 2bRAD reads

 

Demultiplex the reads:

Code Block
titledemultiplex
#The command to run trim2bRAD_2barcodes_dedup.pl is complicated
#Another script -- 2bRAD_trim_launch_dedup.pl -- will generate the command for us
 
2bRAD_trim_launch_dedup.pl R1.fastq > processTags
 
#returns our next commands to execute:
trim2bRAD_2barcodes_dedup.pl input=A_CTGCAG_R1.fastq site=".{12}CGA.{6}TGC.{12}|.{12}GCA.{6}TCG.{12}" adaptor="AGATC" sampleID=100
trim2bRAD_2barcodes_dedup.pl input=B_GAAGTT_R1.fastq site=".{12}CGA.{6}TGC.{12}|.{12}GCA.{6}TCG.{12}" adaptor="AGATC" sampleID=100

Check results:

 

Code Block
titledemultiplexcheck results
#The resulting files are labled with their barcode combinations and a *tr0 suffix
ls -l A*tr0
ls -l B*tr0
 
#How many did we get?
ls -l A*tr0 | wc -l
ls -l B*tr0 | wc -l
 
#Looking back at our barcodes, does this make sense?
 
#Which sample does B_GAAGTT_R1_ACCA.tr0 come from?
grep TGGT barcode_data.tsv | grep GAAGTT
 
	#MCNA13
 
#Why search for TGGT instead of ACCA?
#
 

 

 

#make a directory to put the resulting sample fastq files into
mkdir sample_fastqs
 
#look at the documentation for process_radtags
./process_radtags -h
  
#execute the command to process the rad data
./process_radtags -i 'fastq' -1 Lib01_R1.fastq -2 Lib01_R2.fastq -o ./sample_fastqs/ -b barcodes_Lib1.tsv --inline_index -e 'nlaIII' -r --disable_rad_check

 

...