...

Then skip the mapping, SAM/BAM conversion, sorting, and indexing steps below.

...

Using the R environment for statistical computing

Many of the modules for doing statistical tests on NGS data have been written in the "R" language for statistical computing. If you're not familiar with R, this section will probably be a bit confusing. (You might be thinking "Stop with the new languages already, guys! Uncle!") To orient you, we are going to run the R command, which launches the R shell inside our terminal. Like the bash shell that we normally use, the R shell interprets commands, but now they are R commands rather than bash commands. To help clue you in, the prompt changes from login1$ to > while you are in the R shell. The R shell runs inside the bash shell, so when you quit R, you will be back where you were in bash.

...

Warning
Do not copy the > characters in the R examples.

They are the R prompt to remind you which commands are to be run inside the R shell!

The basic rules of R:

  • Don't forget: it's q() to quit.
  • For help, type ?command. Try ?read.table. The q key gets you out of help, just like for a man page.
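  • The left arrow <- (less-than-dash) assigns a value to a variable, just like an equals sign =. For simple assignments typed at the prompt, you can use them interchangeably. (Inside the parentheses of a function call they differ: there, = names an argument rather than creating a variable.)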
  • The prompt we will sometimes be showing for R is >. Don't type this for a command. It is like the login1$ at the beginning of the bash prompt when you log in to Lonestar. It just means that you are in the R shell.
  • You can type the name of a variable to have its value displayed. Like this...
    Code Block
    > x <- 10 + 5 + 6
    > x
    [1] 21
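
The assignment rule above can be tried out directly. What follows is a minimal sketch, not part of the original exercise: the variable names y, same, m, and m2 are made up for illustration, and the lines are written without the > prompt so they can be pasted into the R shell as-is. It shows that <- and = do the same job for a plain assignment, and that the two behave differently inside a function call.

```r
# Plain assignment: both forms store 21.
x <- 10 + 5 + 6
y = 10 + 5 + 6
same <- identical(x, y)
print(same)   # [1] TRUE

# Inside a function call, = only names the argument.
# The variable x defined above is left untouched.
m <- mean(x = c(1, 2, 3))
print(x)      # [1] 21

# Inside a function call, <- really does assign,
# so this line overwrites x as a side effect.
m2 <- mean(x <- c(1, 2, 3))
print(x)      # [1] 1 2 3
```

For everyday work at the prompt the difference rarely matters, which is why many tutorials treat the two as the same thing.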
    

Bioconductor modules for R

Like other languages, R can be expanded by loading modules. The R equivalent of Bioperl or Biopython is Bioconductor. Bioconductor can theoretically do things for you like convert sequences (none of us use it for that), but where it really shines is in doing statistical tests (where it is second to none in this list of languages). Many functions for analyzing microarray data are implemented in R, and this strength has now carried over to the analysis of RNAseq data.
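
As a rough illustration of what "loading a module" looks like in R, here is a minimal sketch. The package name Biostrings is an assumption chosen for illustration (it is a Bioconductor package for working with sequences, not one named in this tutorial), and the variables pkg and installed are hypothetical. The sketch only checks whether the package is present and loads it if so. It does not install anything.

```r
# Hypothetical example package; swap in whichever Bioconductor package you need.
pkg <- "Biostrings"

# requireNamespace() returns TRUE if the package is installed, FALSE otherwise,
# without stopping with an error.
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  # library() attaches the package so its functions can be called by name.
  library(pkg, character.only = TRUE)
  message(pkg, " loaded, version ", as.character(packageVersion(pkg)))
} else {
  # Assumption: on recent R versions, Bioconductor packages are usually
  # installed with BiocManager::install(); older setups used a different script.
  message(pkg, " is not installed on this system.")
}
```

On a shared system such as Lonestar, the packages you need may already be installed centrally, in which case only the library() line matters.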

...