QIIME Processing

Processing biome data sets with QIIME 1 (Mac/Unix protocol)
V. 2.0

Written by Bryant
Adapted from Michal

*Note: support for QIIME 1 has ended. Future protocols should be developed for QIIME 2.

MATERIALS

You will need the following installed before proceeding:

MacQIIME or QIIME (http://www.wernerlab.org/software/macqiime)
FLASH (https://ccb.jhu.edu/software/FLASH/)
Greengenes database (http://greengenes.secondgenome.com)
Miniconda (https://conda.io/miniconda.html)
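
A quick way to confirm that the tools can be reached from the terminal is sketched below. It assumes the executables are on your PATH under the names flash, conda, and split_libraries_fastq.py. check_tools is a hypothetical helper name, not part of any of these packages. If you run FLASH from its own folder as ./flash, it will be reported as missing here even though it works.

```shell
#!/usr/bin/env bash
# Report which of the named commands can be found on PATH.
# check_tools is a hypothetical helper, not part of QIIME or FLASH.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed command names; adjust these to match your installation.
check_tools flash conda split_libraries_fastq.py \
  || echo "Install or locate the missing tools before continuing."
```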

You will also need a basic understanding of working in a Unix terminal, including how to run scripts and invoke packages. You can learn more here: https://www.codecademy.com/courses/learn-the-command-line/lessons/navigation/exercises/your-first-command


OK… so you have installed the above! Yay… now you are ready to become a bioinformatician.

PROCEDURE

Note: it is better to work in a local directory rather than a shared or backed-up folder, because automatic backups can conflict with file versioning.

FLASH
1. Use FLASH to stitch the bidirectional (paired-end) reads together into a single sequencing file.
a. Using a pair of paired-end read files, e.g. GW1_S1_L001_R1_001.fastq and GW1_S1_L001_R2_001.fastq, invoke FLASH (v 1.2.7):
$ ./flash reads_1.fastq reads_2.fastq -d outputdirectory
Change the file names for each pair as you go. It should be possible to script this so that ./flash operates on a list of files; this is worth looking into.
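
One possible way to run FLASH over a whole folder of read pairs is sketched here. It is not part of the original protocol. It assumes Illumina-style file names like the GW1 example above (R1 and R2 files that differ only in the _R1_/_R2_ part, with the sample label before the first underscore) and that your FLASH version accepts -o to set the output file prefix. mate_for, sample_for, flash_all, and the FLASH_BIN variable are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: merge every R1/R2 pair in a folder with FLASH, one pair at a time.
# Assumes Illumina-style names such as GW1_S1_L001_R1_001.fastq.
# mate_for, sample_for, and flash_all are hypothetical helper names.

FLASH_BIN="${FLASH_BIN:-./flash}"   # location of the flash executable (assumption)

# Given an R1 file name, print the matching R2 file name.
mate_for() {
  printf '%s\n' "${1/_R1_/_R2_}"
}

# Given a read file name, print the sample label (text before the first underscore).
sample_for() {
  printf '%s\n' "${1%%_*}"
}

# Merge all pairs found in directory $1 and write the results to directory $2.
flash_all() {
  local indir="$1" outdir="$2" r1 r2 name sample
  mkdir -p "$outdir"
  for r1 in "$indir"/*_R1_*.fastq; do
    [ -e "$r1" ] || continue              # nothing matched the pattern
    name="$(basename "$r1")"
    r2="$indir/$(mate_for "$name")"
    if [ ! -e "$r2" ]; then
      echo "skipping $name: no matching R2 file" >&2
      continue
    fi
    sample="$(sample_for "$name")"
    # -d: output directory, -o: prefix for the output file names
    "$FLASH_BIN" -d "$outdir" -o "$sample" "$r1" "$r2"
  done
}

# Run only when both directories are given, e.g.: bash flash_all.sh raw_reads merged
if [ "$#" -eq 2 ]; then
  flash_all "$1" "$2"
fi
```

With these assumptions each sample ends up as SAMPLE.extendedFrags.fastq in the output directory, so rename the files if you prefer a different naming scheme for the QIIME steps.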

2. Take the output, out.extendedFrags.fastq, and use QIIME to finish the data processing.

QIIME
# You will need a pick parameters file, a mapping file, and (optionally) a sample descriptor file. See the QIIME help for creating these.

1. split_libraries_fastq.py -i out.extendedFrags.JB1.fastq,out.extendedFrags.JB2.fastq,out.extendedFrags.JB3.fastq -o slout/ --barcode_type 'not-barcoded' -q 19 --sample_ids JB1,JB2,JB3
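
The comma-separated file and sample lists in the command above get tedious to type for many samples. The sketch below builds them from the files in the current folder, assuming the merged files are named out.extendedFrags.<sample>.fastq as in that command. join_by_comma and sample_from_merged are hypothetical helper names. The script only prints the resulting command so you can check it before running it.

```shell
#!/usr/bin/env bash
# Sketch: build the comma-separated -i and --sample_ids lists from merged files.
# Assumes files are named out.extendedFrags.<sample>.fastq in the current folder.
# join_by_comma and sample_from_merged are hypothetical helper names.

# Join all arguments with commas.
join_by_comma() {
  local IFS=,
  printf '%s\n' "$*"
}

# Extract <sample> from a path ending in out.extendedFrags.<sample>.fastq.
sample_from_merged() {
  local name
  name="$(basename "$1" .fastq)"
  printf '%s\n' "${name#out.extendedFrags.}"
}

files=()
ids=()
for f in out.extendedFrags.*.fastq; do
  [ -e "$f" ] || continue               # nothing matched the pattern
  files+=("$f")
  ids+=("$(sample_from_merged "$f")")
done

if [ "${#files[@]}" -gt 0 ]; then
  # Print the command for review rather than running it.
  echo split_libraries_fastq.py -i "$(join_by_comma "${files[@]}")" \
       -o slout/ --barcode_type not-barcoded -q 19 \
       --sample_ids "$(join_by_comma "${ids[@]}")"
fi
```

Because both lists are filled in the same loop, the order of the sample IDs always matches the order of the input files.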
# Note that you can run split_libraries_fastq.py on the full set of files and samples, then filter out the samples you want for each project you are processing. This will save you time if you have several consulting data sets. If you are processing only your own data, there is no need to filter or to create the sample ID file. Filtering can be done with:

filter_fasta.py -f ~[result from split filter.fna] -o [output of filtered.fna] --sample_id_fp ~[MappingFile.txt]
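
If you would rather keep a plain list of the sample IDs for each project, the sketch below pulls them out of a mapping file. It assumes the usual QIIME 1 mapping file layout (tab-delimited, sample ID in the first column, header and comment lines starting with #) and that --sample_id_fp accepts a file with one sample ID per line. sample_ids_from_map is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the sample IDs from a QIIME mapping file, one per line.
# Assumes a tab-delimited file whose header and comment lines start with '#'.
# sample_ids_from_map is a hypothetical helper name.
sample_ids_from_map() {
  # Drop header/comment lines and blank lines, keep the first tab-separated column.
  grep -v -e '^#' -e '^[[:space:]]*$' "$1" | cut -f1
}

# Run only when a mapping file is given, e.g.:
#   bash sample_ids.sh MappingFile.txt > sample_ids.txt
if [ "$#" -eq 1 ]; then
  sample_ids_from_map "$1"
fi
```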