Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Demultiplexing ddRAD data with Stacks exercise:

This exercise should be done on TACC.


First copy over the exercise directoryCheck out the reads to make sure they make sense:

#start an idev session if you have not already idev #go to coure directory on scratch cds cd rad_intro/ #copy over the demultiplexing directory cp -r /work/02260/grovesd/lonestar/
Code Block
titlecopy over directory
check out reads
#navigate to the exercise directory
cd intro_to_rad_20172018/demultiplexing/process_ddRAD_stacks .

#go to the demultiplexing with stacks exercise
directory
cd
process_ddRAD_stacks

Check out the reads to make sure they make sense:

Code Block
titlecheck out reads
#we have two fastq files: Lib01_R1.fastq and Lib01_R2.fastq
ls *.fastq
 
#These files arehave paired end reads, (two separate reads from either end of the same DNA fragment)
#Double-check that the files have the same number of reads. (note that fastq genreally files have 4 lines per read)
expr $(cat Lib01_R1.fastq | wc -l) / 4
expr $(cat Lib01_R2.fastq | wc -l) / 4
	#7453585#53042
#(Note these are for demo purposes and shorter than they should be)
#this ddRAD library was prepared so that forward reads were cut with the nlaIII restriction
enzyme.
#Looking up nlaIII, we see it's cut site:
#     CATG'
#    'GTAC
 
#check if we see this cut site in the forward reads
grep CATG Lib01_R1.fastq | wc -l
 
#compare this with the total number of reads we got above
 
#do these numbers make sense?
 
#look at where in the reads the cut site is (may help to paste into a text editor and character search)
grep CATG Lib01_R1.fastq | head -n 30

Based on the library preparation, we expect that our forward reads were cut with the restriction enzyme nlaIII (NLA3).

Check that the forward reads fit this expectation.

Hint: look up restriction site and use grep

Code Block
titleSolution
collapsetrue
#Double-check that the files have the same number of reads. (note that fastq genreally files have 4 lines per read)
expr $(cat Lib01_R1.fastq | wc -l) / 4
expr $(cat Lib01_R2.fastq | wc -l) / 4
	#53042
#(Note these are for demo purposes and shorter than they should be)

#Looking up nlaIII, we see it's cut site:
#     CATG'
#    'GTAC
 
#check if we see this cut site in the forward reads
grep CATG Lib01_R1.fastq | wc -l
 
#compare this with the total number of reads we got above

process the reads using process_radtags (part of the Stacks package):

...