For full documentation of the 2bRAD de novo pipeline, see the GitHub page.
The pipeline is very similar to that performed by Stacks (Catchen et al. 2011):
#navigate to the directory
#look at starting trimmed fastq files
ls *.trim
#output: sampleA.trim sampleB.trim sampleC.trim
#run uniquerOne.pl
#(this is analogous to making 'stacks' in Stacks (Fig1A Catchen et al. (2011)))
#finds the unique RAD tags from each fastq
uniquerOne.pl sampleA.trim > sampleA.trim.uni
uniquerOne.pl sampleB.trim > sampleB.trim.uni
uniquerOne.pl sampleC.trim > sampleC.trim.uni
# merging uniqued files
#(Fig1B Catchen et al. (2011))
mergeUniq.pl uni minDP=2 >mydataMerged.uniq
#generates a merged set of unique tags: mergedUniqTags.fasta
# clustering allowing for up to 3 mismatches (-c 0.91; assuming 36-bp tags, 33/36 = 0.917 identity); the most abundant sequence becomes reference
#This is equivalent to calling loci (Fig1C-D Catchen et al. (2011))
module load cd-hit
cd-hit-est -i mergedUniqTags.fasta -o cdh_alltags.fas -aL 1 -aS 1 -g 1 -c 0.91 -M 0 -T 0
#now we have called de novo loci based on the tags
#assemble them into an artificial reference (cdh_alltags_cc.fasta) for re-mapping and genotyping
concatFasta.pl fasta=cdh_alltags.fas num=8
#index the artificial reference with bowtie2
module load bowtie
bowtie2-build cdh_alltags_cc.fasta cdh_alltags_cc.fasta
#now map the reads back to the artificial reference
bowtie2 --no-unal -x cdh_alltags_cc.fasta -U sampleC.trim -S sampleC.trim.bt2.sam
bowtie2 --no-unal -x cdh_alltags_cc.fasta -U sampleB.trim -S sampleB.trim.bt2.sam
bowtie2 --no-unal -x cdh_alltags_cc.fasta -U sampleA.trim -S sampleA.trim.bt2.sam
#The alignment files can now be used for whichever genotyping method you prefer
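
With more than a handful of samples, typing one uniquerOne.pl line and one bowtie2 line per file gets tedious. The following is a minimal sketch, not part of the 2bRAD scripts, of how the per-sample steps above could be generated for every trimmed file. The function names uniq_all and map_all and the variables DRY_RUN and REF are hypothetical helpers introduced here for illustration; the commands they build are the same ones shown in the walkthrough. By default the sketch only prints the commands so they can be checked before anything is run.

```shell
# Sketch only: uniq_all, map_all, DRY_RUN and REF are illustrative names,
# not part of the 2bRAD pipeline.
# DRY_RUN=1 prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
# Name of the bowtie2 index built from the artificial reference.
REF="${REF:-cdh_alltags_cc.fasta}"

# Find the unique tags in each trimmed fastq file given as an argument.
uniq_all() {
    for f in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "uniquerOne.pl $f > $f.uni"
        else
            uniquerOne.pl "$f" > "$f.uni"
        fi
    done
}

# Map each trimmed fastq file back to the artificial reference.
map_all() {
    for f in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "bowtie2 --no-unal -x $REF -U $f -S $f.bt2.sam"
        else
            bowtie2 --no-unal -x "$REF" -U "$f" -S "$f.bt2.sam"
        fi
    done
}
```

Calling `uniq_all *.trim` before the merging step and `map_all *.trim` after bowtie2-build would print one command per sample; setting DRY_RUN=0 runs them instead. The merging, clustering and indexing steps are run once for the whole dataset, so they are left as single commands.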