Introduction to R

R and RStudio, its companion integrated development environment (IDE), together constitute one of the most popular toolsets researchers use to analyze large datasets. Because it was designed by statisticians for statistical work, R is well suited to data science. Wrangling large amounts of data and producing publication-ready graphics are often more straightforward in R than in other languages, and its use in data science remains strong.

R was created in the early 1990s by University of Auckland statisticians Ross Ihaka and Robert Gentleman. Ihaka and Gentleman “identified what they called a common need for a better software environment,” which led them to develop R based on a language called “S”. Although work began in the early 1990s, version 1.0.0 was not released until February 2000, so R is officially about 25 years old.

R is an open-source language, which means that anyone can contribute, and its development is driven by the collaborative effort of many people.

Like Unix, R is not as intuitive as, say, Python, but it is easier to learn than Unix because the syntax is more uniform across the many available R packages. As with Unix, you will come to rely on specific packages, and the ones you use most often will become very familiar, like old friends.

R is an interpreted language: commands are run directly by an interpreter, which you can access from the command line, with no separate compilation step. We will be using the RStudio IDE, but let’s first see what it’s like to use R at the command line.

Type R into your terminal to start an interactive session, then run a couple of “calculator” commands.

3+2*4
5^5
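The calculator commands follow the usual order of operations, which a slightly longer sketch makes explicit. The object names here (default_order, forced_order, power) are illustrative only and are not part of the lesson.

```r
# Multiplication is evaluated before addition: 3 + (2 * 4)
default_order <- 3 + 2 * 4

# Parentheses force the addition to happen first: (3 + 2) * 4
forced_order <- (3 + 2) * 4

# The ^ operator raises a number to a power
power <- 5^5

print(default_order)  # 11
print(forced_order)   # 20
print(power)          # 3125
```

If a result is not what you expect, adding parentheses is usually the quickest way to make the intended order explicit.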

Define an object named 'animal', then return its value.

animal <- "elephant"
animal
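As a small extension of the example above, the sketch below inspects the object once it has been defined. It uses only base R functions; which ones are covered later in the course is not assumed.

```r
animal <- "elephant"   # <- assigns the value on the right to the name on the left
print(animal)          # same result as typing the name alone at the prompt
class(animal)          # "character": the value is a text string
nchar(animal)          # 8: the number of characters in the string
ls()                   # lists the objects defined in the current session
```

Typing an object's name on its own prints its value, which is why the second line of the original example returns "elephant".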

We will not be using R from the terminal, however; we will be using the RStudio IDE.

More information/Resources

RStudio user guide
