Comment: Migrated to Confluence 5.3

Introduction 

The purpose of this guide is to help current and future Whiteley Lab members and University of Texas microbiologists with bacterial RNA-Seq analysis. Once you have analyzed your data with this pipeline, you will have files identifying differentially expressed genes and files that can be used to identify novel non-coding RNAs, transcriptional start sites, and operons. Throughout this guide I provide hyperlinks to resources and programs that will help get your analysis up and running.

...

            Follow this link to obtain a fourierseq account: fourierseq account details

            Follow this link to obtain a lonestar account: lonestar account details

...

Follow the steps on this wiki page to set up a profile that gives you access to the tools you will need throughout the analysis pipeline (how to set up a profile). This step is essential: if your profile is not set up properly, the pipeline will not work.

...

            A. actinomycetemcomitans D7S-1     AAD7S
            E. coli K12 W3110                  ECK12W3110
            P. aeruginosa PAO1                 PAO1
            P. aeruginosa PA14                 PA14
            S. gordonii Challis CH1            SGCH1
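Since a mistyped genome code is an easy mistake to make, a small check like the one below can catch it before a job is submitted. This is only a sketch: the function name is_known_genome is hypothetical (it is not part of the pipeline), and it knows only the five codes listed in the table above, so it would need updating whenever a new assembly is added.

```shell
# Hypothetical helper (not part of the pipeline): succeeds only if the
# argument is one of the genome codes listed in the table above.
is_known_genome() {
    case "$1" in
        AAD7S|ECK12W3110|PAO1|PA14|SGCH1) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: check a code before using it. Codes are case-sensitive.
if is_known_genome "PAO1"; then
    echo "PAO1 is a listed genome code"
fi
```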

How to run it:

To run the script properly on lonestar, you must launch it from a commands file located in the directory that contains your fastq files. Create a text file called commands containing the following text. This can be done in the Unix terminal using nano.

Code Block
# Open nano and save your file as "commands". Files are saved in nano with CTRL+o
#    and you can exit nano with CTRL+x. In the example below we have a control
#    condition and a test condition with two replicates each. We are mapping to the
#    PAO1 genome and using 12 processing threads (on lonestar always use 12).
nano

# Type the following lines into nano (these are the contents of the commands file):
            mapRNASeq.sh Control1.fastq Control1 PAO1 12
            mapRNASeq.sh Control2.fastq Control2 PAO1 12
            mapRNASeq.sh Test1.fastq Test1 PAO1 12
            mapRNASeq.sh Test2.fastq Test2 PAO1 12
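If you have many samples, typing each line into nano gets tedious. The sketch below generates the same kind of lines from the fastq files in the current directory. It is an assumption-laden convenience, not part of the pipeline: the function name make_commands is hypothetical, and it assumes every file is named SAMPLE.fastq and that you want SAMPLE as the second argument, as in the example above.

```shell
# Hypothetical helper (not part of the pipeline): print one mapRNASeq.sh
# line per *.fastq file in the current directory.
#   $1 = genome code (e.g. PAO1), $2 = number of threads (12 on lonestar)
make_commands() {
    mc_genome="$1"
    mc_threads="$2"
    for mc_fq in *.fastq; do
        # Skip the unexpanded pattern when no fastq files are present
        [ -e "$mc_fq" ] || continue
        # Assumption: the sample name is the file name without ".fastq"
        printf 'mapRNASeq.sh %s %s %s %s\n' \
            "$mc_fq" "${mc_fq%.fastq}" "$mc_genome" "$mc_threads"
    done
}

# Prints the lines to the screen; add "> commands" to write the file
make_commands PAO1 12
```

Look over the printed lines before saving them, since any stray fastq file in the directory will get a line too.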

...

-o OUT_PFX              substitute “OUT_PFX” with the prefix you want for all of your output files
-c CONTROL_PFX          substitute “CONTROL_PFX” with the prefix for your control condition
-x [#]                  substitute “[#]” with the number of control condition replicates
-t TEST_PFX             substitute “TEST_PFX” with the prefix for your test condition
-y [#]                  substitute “[#]” with the number of test condition replicates
<file1>… <filen>        list your count files, with control condition files listed first, followed by test condition files

How to run it:

This script is a little simpler to run than mapRNASeq.sh. You don’t need to create a commands file; because it requires less computational power, you can run it directly from the command line. Make sure you include the flags (e.g. “-o”, “-c”).

Code Block
# An example, continuing from mapRNASeq.sh above
calcRNASeq.sh -o Exp1 -c Control -x 2 -t Test -y 2 Control1.count.txt Control2.count.txt Test1.count.txt Test2.count.txt
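The replicate numbers and the file order (control files first) are easy to get wrong by hand. The sketch below assembles the command line from the count files in the current directory and prints it so you can inspect it before running it. The function name calc_cmd is hypothetical, and it assumes your count files are named PREFIX followed by anything and ending in .count.txt, as in the example above.

```shell
# Hypothetical helper (not part of the pipeline): print a calcRNASeq.sh
# command line built from the count files in the current directory.
#   $1 = output prefix, $2 = control prefix, $3 = test prefix
calc_cmd() {
    cc_out="$1"; cc_ctrl="$2"; cc_test="$3"

    # Collect the control count files and count them (-x)
    set -- "$cc_ctrl"*.count.txt
    [ -e "$1" ] || { echo "no count files for $cc_ctrl" >&2; return 1; }
    cc_nx=$#; cc_ctrl_files="$*"

    # Collect the test count files and count them (-y)
    set -- "$cc_test"*.count.txt
    [ -e "$1" ] || { echo "no count files for $cc_test" >&2; return 1; }
    cc_ny=$#; cc_test_files="$*"

    # Control files are listed first, followed by test files
    echo "calcRNASeq.sh -o $cc_out -c $cc_ctrl -x $cc_nx -t $cc_test -y $cc_ny $cc_ctrl_files $cc_test_files"
}

# Usage: calc_cmd Exp1 Control Test
```

The helper only prints the command rather than running it, so nothing happens until you copy the line to the prompt yourself. File names containing spaces are not handled.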

This script takes any number of count files and joins them into one count table, in which the first column holds the locus tags for the genome and each following column holds the read counts per locus for one condition/replicate. Once the table is in the proper format, the script determines differential expression using the R package DESeq. DESeq takes the joined count table with all of your conditions and replicates, normalizes the total counts for each condition/replicate, and tests for differential gene expression under a negative binomial model, using an exact test analogous to Fisher’s exact test.
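To make the joining step concrete, here is a minimal sketch of how count files could be merged on the locus tag using standard Unix tools. This is not the code inside calcRNASeq.sh. The function name join_counts is hypothetical, and it assumes each count file is tab-delimited with the locus tag in column 1 and the read count in column 2, and that every file lists the same locus tags.

```shell
# Hypothetical sketch (not the pipeline's own code): merge two-column,
# tab-delimited count files into one table keyed on the locus tag.
join_counts() {
    jc_tab=$(printf '\t')
    jc_acc=$(mktemp)

    # Start the table from the first file; join needs sorted input
    sort -k1,1 "$1" > "$jc_acc"
    shift

    # Add one count column per remaining file, in the order given
    for jc_file in "$@"; do
        jc_next=$(mktemp)
        sort -k1,1 "$jc_file" | join -t "$jc_tab" "$jc_acc" - > "$jc_next"
        mv "$jc_next" "$jc_acc"
    done

    cat "$jc_acc"
    rm -f "$jc_acc"
}

# Usage (control files first, then test files):
#   join_counts Control1.count.txt Control2.count.txt Test1.count.txt Test2.count.txt
```

Note that join silently drops any locus tag missing from one of the files, so the row count of the result is worth comparing against the inputs.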

What do I get out of this?

...

Code Block
# Change into the ref_genome directory
cd /home1/02173/pjorth/ref_genome/

# Make the new directory
mkdir NEWASSEMBLY 

Example:
/home1/02173/pjorth/ref_genome/PAO1

...