Generation of gene counts from results of mapping to genome

Starting from test.out, the output file produced by mapping reads to the genome:

1. Generate a tab-delimited info file using mapreads_interpreter, giving as input test.out and reference.fasta, the reference file the reads were mapped against:

mapreads_interpreter test.out reference.fasta > test.info

2. (Optional but recommended) Keep only the reads that mapped to the reference at exactly one location:

find_uniquely_mapped_reads test.info > test.uniq.info

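3. Locate and count genes. This takes two commands: locateGene_new finds the reads that map to a gene, then getGeneCount_new counts the reads per gene. IMPORTANT: The reference sequence identifiers in the GFF file must correspond exactly to the sequence identifiers used in the reference file.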

locateGene_new test.uniq.info genes.gff > test.gene.info 2>test.log

where

test.uniq.info : info file created in previous step

genes.gff : the GFF file containing the start and end coordinates of the genes

test.gene.info : output file listing the reads that map to a gene in the GFF file

test.log : log information
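
Because locateGene_new depends on the GFF and reference identifiers matching exactly, it can help to compare them before running it. The following is a minimal sketch, not part of the pipeline itself: missing_seqids is a hypothetical helper name, and it assumes the reference identifier is the FASTA header text up to the first whitespace and that column 1 of the GFF holds the sequence identifier. If the mapping tool uses the full header line instead, the first awk command would need adjusting.

```shell
# Hypothetical helper (not part of the pipeline): print every sequence
# identifier that appears in the GFF file but not in the reference FASTA.
missing_seqids() {
    gff_file=$1
    fasta_file=$2
    fasta_ids=$(mktemp)

    # ASSUMPTION: the identifier is the header text after '>' up to the
    # first whitespace character.
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$fasta_file" | sort -u > "$fasta_ids"

    # Column 1 of a GFF feature line is the sequence identifier.
    # Comment/directive lines (starting with '#') and lines without tabs are skipped.
    awk -F '\t' '!/^#/ && NF > 1 { print $1 }' "$gff_file" | sort -u |
        comm -23 - "$fasta_ids"

    rm -f "$fasta_ids"
}
```

Running missing_seqids genes.gff reference.fasta should print nothing when every GFF identifier is present in the reference; any identifier it prints has no matching sequence under the assumptions above.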

getGeneCount_new test.gene.info > test.gene.count 2>test.log

where

test.gene.info : file created in previous step

test.gene.count : output file containing two columns: gene ID and read count

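test.log : log information (this overwrites the test.log written by locateGene_new; redirect to a different file name to keep both logs)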
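
Once test.gene.count exists, a quick summary can confirm that the counts look plausible. This is a sketch only: summarize_counts is a hypothetical helper name, and it assumes the file has no header line and that the two columns (gene ID, read count) are separated by whitespace.

```shell
# Hypothetical helper (not part of the pipeline): summarize a two-column
# count file of the form "geneid <whitespace> read_count".
summarize_counts() {
    count_file=$1

    # Total number of reads assigned to genes (sum of column 2).
    awk '{ total += $2 } END { print "total_reads", total + 0 }' "$count_file"

    # The five genes with the most reads, highest first.
    sort -k2,2nr "$count_file" | head -n 5
}
```

For example, summarize_counts test.gene.count prints the total on the first line, followed by up to five of the most highly counted genes.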