...
- Check for low-quality bases, low-quality reads, overrepresented sequences, and sequence duplication using FastQC.
- If needed, trim low-quality bases, filter out low-quality reads, and trim adaptors. We covered fastx_toolkit and cutadapt for these operations.
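To make the trimming and filtering steps above concrete, here is a minimal Python sketch of what they mean for a single read. It is a simplified illustration only, not the algorithm that fastx_toolkit or cutadapt actually implement. The function names, the Phred+33 encoding, the quality cutoff of 20, and the 80% threshold are all assumptions chosen for the example.

```python
# Simplified illustration of quality trimming and filtering for one read.
# Assumptions (not from the course material): Phred+33 encoded qualities,
# a cutoff of Q20, and an 80% "good bases" threshold for keeping a read.

def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, min_q=20):
    """Remove bases from the 3' end while their quality is below min_q."""
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


def passes_filter(quality, min_q=20, min_fraction=0.8):
    """Keep a read only if at least min_fraction of its bases reach min_q."""
    scores = phred_scores(quality)
    if not scores:
        return False
    good = sum(1 for s in scores if s >= min_q)
    return good / len(scores) >= min_fraction


if __name__ == "__main__":
    # Hypothetical read: 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
    seq, qual = "ACGTACGT", "IIIII###"
    print(trim_3prime(seq, qual))
    print(passes_filter(qual))
```

In practice you would let the dedicated tools do this work on whole FASTQ files; the sketch only shows the per-read logic behind "trim" and "filter".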
For mapping your reads to a reference:
- Unspliced mapper (BWA): we ran the BWA-MEM algorithm to map simulated RNA-seq data to the transcriptome.
- After mapping, use samtools to get some statistics from the mapping results.
- Spliced mappers (TopHat, STAR, etc.) map RNA-seq data to the genome, with or without a transcriptome annotation, to identify known and novel splice junctions.
- After mapping, always collect some mapping statistics; samtools flagstat and samtools idxstats are two ways to do this.
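As a small illustration of working with the mapping statistics mentioned above, the sketch below parses the text output of samtools idxstats, which is tab-separated with four columns: reference name, reference length, mapped read count, and unmapped read count. The helper names and the sample data are hypothetical, and the overall mapped fraction computed here is only a rough summary; samtools flagstat reports this kind of number directly.

```python
# Sketch: summarise the tab-separated output of `samtools idxstats`.
# Columns: reference name, reference length, mapped reads, unmapped reads.
# The function names and the sample text below are made up for illustration.

def parse_idxstats(text):
    """Parse idxstats text into (name, length, mapped, unmapped) tuples."""
    rows = []
    for line in text.strip().splitlines():
        name, length, mapped, unmapped = line.split("\t")
        rows.append((name, int(length), int(mapped), int(unmapped)))
    return rows


def mapped_fraction(rows):
    """Rough fraction of reads that were mapped, across all references."""
    mapped = sum(r[2] for r in rows)
    unmapped = sum(r[3] for r in rows)
    total = mapped + unmapped
    return mapped / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical output; the final '*' row holds reads with no reference.
    example = "tx1\t1500\t90\t0\ntx2\t800\t10\t0\n*\t0\t0\t25\n"
    rows = parse_idxstats(example)
    for name, length, mapped, _ in rows:
        print(name, length, mapped)
    print(mapped_fraction(rows))
```

When mapping to a transcriptome, the per-reference mapped counts from idxstats also give a quick first look at how many reads landed on each transcript.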
...
- Now it's ready for mapping.