Coding basics
Coding in R
Some guidelines:
R is case sensitive.
Anything that follows a # symbol is interpreted as a comment and ignored by R. Annotate heavily! Your future self will be happier.
In R, commands are generally separated by a new line. You can also use a semicolon ; to separate your commands on the same line, but this is rarely done.
If a continuation prompt + appears in the console after you execute your code, this means that you haven’t completed your code correctly. This often happens if you forget to close a bracket. Either try to finish the command on the new line, or hit escape on your keyboard until the console resets.
In general, R allows extra spaces inserted into your code; in fact, using spaces is actively encouraged. However, spaces should not be inserted into operators, i.e. <- should not read < - (note the space).
If your console ‘hangs’ and becomes unresponsive, try pressing the escape key (esc) or clicking on the stop icon in the top right of your console. This will terminate the current operation in most cases.
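A few of the guidelines above can be tried directly in the console. The following is a minimal sketch; the object names (my_value, My_Value, x, y, total) are made up for illustration.

```r
# R is case sensitive: these are two different objects
my_value <- 1
My_Value <- 2
my_value == My_Value   # FALSE

# Two commands on one line, separated by a semicolon (rarely done)
x <- 1; y <- 2

# A command that ends with an operator is incomplete, so R keeps
# reading on the next line (the console shows the + prompt here)
total <- x +
  y
total                  # 3
```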
R as a calculator: the R REPL (read-evaluate-print loop)
2+2
log(1) # log base e
log2(1) # log base 2
exp(1) # e^x
sqrt(4) # square root
4^2 # 4 to the power of 2
pi # not a function but useful
17%%6 # modulo operator
# Order of operations, options(digits = 10)
Objects, Variables, Datatypes
Objects are the central concept that unites R code. Everything in R can be thought of as an object.
Examples of objects include: a single number, a character string (like a word), a vector, a data table, or a highly complex plot or function. Understanding how you create objects and assign values to objects is key to understanding R.
To create an object we simply give the object a name. We assign a value to this name using the assignment operator <-. The assignment operator is a composite symbol composed of a ‘less-than’ symbol < and a hyphen -.
my_obj <- 5
In the code above, we created an object called my_obj and assigned it the value 5 using the assignment operator. You can also use = instead of <- to assign values. Some people do this, but it is considered bad practice.
To view the value of the object, enter the name of the object in the console, or execute it from the IDE <CTRL><ENTER>.
my_obj
## [1] 5
Check the ‘Environment’ tab for the object.
If you click on the down arrow on the ‘List’ icon in the same pane and change to ‘Grid’ view RStudio will show you a summary of the objects including the type (numeric - it’s a number), the length (only one value in this object), its ‘physical’ size and its value (5 in this case).
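The same summary can also be obtained from the console. This is a small sketch using base R functions only; the exact value reported by object.size() depends on your platform.

```r
my_obj <- 5
class(my_obj)         # "numeric"
length(my_obj)        # 1
object.size(my_obj)   # memory used by the object (platform dependent)
ls()                  # names of the objects in the current environment
```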
Naming objects
Naming your objects in R is important. Good object names should be short and informative. If you need to create objects with multiple words in their name then use either an underscore or a dot between words, or capitalise the first letter of new words. I prefer using underscores (snake case).
input_argument_last <- "cell type 1"
input.argument.last <- "cell type 1"
inputArgumentLast <- "cell type 1"
Code break: Create some additional objects
Write some code in the Source window, and execute it <CTRL><ENTER>. Use the mathematical operators above, and create objects by assigning variable names.
#Examples:
num_1 <- 33
char_1 <- "don\'t think so!"
num_2 <- num_1 / 2
num_4 <- char_1 + num_1 #gives an error; contrast with Python
Understanding errors is a learning process. To find more information about a particular error, Google a version of the error message, e.g. ‘non-numeric argument to binary operator error R’.
‘Base R’ Functions
There are many functions, operators, and objects already available in R distributions. These are referred to as ‘base R’ functions, ‘base R’ operators, etc. Functions are R objects that take one or more arguments, carry out some operations, and typically return a value.
For example, the log() call made above used the log() function. It also takes other arguments, such as the base of the logarithm that you may want to use.
To get help for a function, type a question mark before the name of the function and execute it.
?log
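As a brief illustration of the base argument mentioned above, the sketch below calls log() with and without it. args() is a base R function that prints a function's arguments without opening the help page.

```r
args(log)              # shows the arguments: x and base (default exp(1))
log(8, base = 2)       # 3, same as log2(8)
log(100, base = 10)    # 2, same as log10(100)
log(exp(1))            # 1, the default base is e
```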
A first look at vectors in R
The c() function is short for concatenate and we use it to join together a series of values and store them in a data structure called a vector.
my_vec <- c(2, 3, 1, 6, 4, 3, 3, 7)
A vector is essentially a one-dimensional container that holds a sequence of elements of the same type (such vectors are referred to as 'atomic vectors'). So far we have seen only two data types: numeric and character. Although a vector is really a data structure, it is also considered a basic object because many other data structures in R consist of vectors.
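To see what "elements of the same type" means in practice, here is a small sketch with made-up vectors (num_vec, char_vec, mixed). When types are mixed inside c(), R converts the elements to a common type.

```r
num_vec  <- c(1, 2, 3)
char_vec <- c("a", "b", "c")
mixed    <- c(1, "a", 3)   # the numbers are converted to character strings

typeof(num_vec)    # "double" (the storage type of ordinary numeric values)
typeof(char_vec)   # "character"
typeof(mixed)      # "character"
mixed              # "1" "a" "3"
```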
Now that we’ve created a vector we can use other functions to do useful stuff with this object. For example, we can calculate the mean, variance, standard deviation and number of elements in our vector by using the mean(), var(), sd() and length() functions.
Code Break: Introduction to vectors
Some code to play with vectors is given below. Type it out in your source window on your own, and play around with the objects to become familiar.
my_vec <- c(2, 2, 3, 9, 4, 5)
typeof(my_vec)
mean(my_vec) # returns the mean of my_vec
var(my_vec) # returns the variance of my_vec
sd(my_vec) # returns the standard deviation of my_vec
length(my_vec) # returns the number of elements in my_vec
vec_mean <- mean(my_vec)
Code Break: Creating vectors with seq(), rep(), and sample()
Some code for creating vectors is given below. Play with the code, or make your own and try to understand what is happening. If you want to read about the functions, type a question mark followed by the function name.
#The seq() function
seq2 <- seq(1,20)
seq3 <- seq(1,20,.1)
length(seq3)
seq4 <- c( seq(1,10), seq(20,30) )
#The rep() function
rep1 <- c( rep(1, 10), rep(2,10) ) #new name, so seq4 above is not overwritten
rep2 <- rep("abc", 10)
#query the sample function (?sample), and try to use it
#query the rnorm function, and try to use it
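One possible way to start with sample() and rnorm() is sketched below. The object names (draw, flips, noise) and the seed value are arbitrary choices for illustration; the drawn values themselves depend on the seed and the R version, so none are shown.

```r
set.seed(1)   # makes the random draws reproducible

draw  <- sample(1:10, 3)                          # 3 values from 1 to 10, without replacement
flips <- sample(c("H", "T"), 5, replace = TRUE)   # 5 draws with replacement
noise <- rnorm(5, mean = 0, sd = 1)               # 5 draws from a normal distribution

length(draw)    # 3
length(flips)   # 5
length(noise)   # 5
```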
In R, to "unset" or remove a variable from your environment, you can use the rm() function.
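A short sketch of rm() in use, with a throwaway object name (tmp_obj) chosen for illustration. exists() is a base R function that checks whether an object with the given name can be found.

```r
tmp_obj <- 42
exists("tmp_obj")   # TRUE
rm(tmp_obj)         # remove the object from the environment
exists("tmp_obj")   # FALSE

# rm(list = ls())   # would remove all objects; use with care
```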
The next section focuses on extracting or altering elements of vectors.
Vectors and extracting elements
By positional indices
my_vec <- c(2, 2, 3, 9, 4, 5)
my_vec[3] #extract single element
my_vec[3:8] #extract a range of elements; note the 'NA' for positions beyond the end
my_vec <- my_vec[c(3,4,5,6,1,2)] #re-order elements of a vector
#Not this:
my_vec <- my_vec[3,4,5,6,1,2] #the extraction operator [] expects a vector
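Positional indices can also be used to alter elements, by placing the extraction on the left-hand side of an assignment. A minimal sketch, with replacement values chosen arbitrarily:

```r
my_vec <- c(2, 2, 3, 9, 4, 5)

my_vec[2] <- 10        # replace the second element
my_vec[c(1, 3)] <- 0   # replace several elements at once
my_vec                 # 0 10 0 9 4 5

my_vec[8]              # NA: there is no eighth element
```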
Logical operators, and booleans
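The comparison operators in base R are ==, !=, <, >, <= and >=. Each returns a boolean (TRUE or FALSE). The sketch below tries them on a single made-up value x.

```r
x <- 5
x == 5   # TRUE  (equal to)
x != 5   # FALSE (not equal to)
x < 5    # FALSE (less than)
x > 3    # TRUE  (greater than)
x <= 5   # TRUE  (less than or equal to)
x >= 6   # FALSE (greater than or equal to)
```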
Code break:
Test my_vec with the above operators using a number, and look at the output.
Vectorization: Create another vector and see how the operators behave when used on two vectors. (You must use vectors of the same length.)
Examples (create your own)
#examples:
my_vec <- c(2, 2, 3, 9, 4, 5)
my_vec < 5
my_vec <= 5
my_vec1 <- 1:11 #can use the colon alone to create vectors!
my_vec2 <- rev(5:15)
my_vec1 < my_vec2
my_vec1 == my_vec2
Extraction of elements using booleans
Extraction of elements can be carried out using boolean vectors resulting from the above types of logical ‘tests’. Only the indices where TRUE occurs will be extracted, and indices where FALSE occurs will be ignored. Here is an example.
vect1 <- c(1:10)
vect2 <- c(rep(TRUE, 3), rep(FALSE, 7))
vect2
vect1[vect2]
In the code below, try to predict what the sub_vector will look like, before executing that line.
a_vector <- c(1:3, 10:15, 5:20, 33, -5) #note the shorter way to create number ranges!
sub_vector <- a_vector[a_vector < 12]
sub_vector
How would you find the element that is equal, at the same position, in the following two vectors?
my_vec1 <- 1:11
my_vec2 <- rev(5:15)
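One possible approach, sketched here as a suggestion rather than the only answer, is to compare the two vectors element by element and use the resulting boolean vector for extraction. The name same is made up; which() is a base R function that returns the positions where a boolean vector is TRUE.

```r
my_vec1 <- 1:11
my_vec2 <- rev(5:15)

same <- my_vec1 == my_vec2   # TRUE only where the two vectors agree
which(same)                  # 8, the position of the match
my_vec1[same]                # 8, the matching value
```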
Booleans with logical operators &, |, and !
vect1 <- c(T,T,T,F,F,F)
vect2 <- c(T,T,F,F,F,T)
vect1 & vect2
vect1 | vect2
vect1 & !vect2
Test what will happen if the boolean vector is shorter.
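As a hint for the shorter-vector case: R reuses ("recycles") the shorter vector until it matches the length of the longer one. A minimal sketch, with values chosen for illustration:

```r
vect1 <- c(1:10)

# A boolean vector shorter than vect1 is recycled: TRUE, FALSE, TRUE, FALSE, ...
vect1[c(TRUE, FALSE)]                          # 1 3 5 7 9

# The same recycling applies to the logical operators
c(TRUE, TRUE, FALSE, FALSE) & c(TRUE, FALSE)   # TRUE FALSE FALSE FALSE
```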
Vectorization
One of the great things about R functions is that most of them are vectorized. This means that the function will operate on all elements of a vector without needing to apply the function on each element separately. This will become very useful when manipulating data tables.
c(1:10) + c(21:30)
c(1:10) / c(21:30)
c(1:10) %% c(21:30)
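Vectorization applies to functions as well as to the arithmetic operators shown above. The sketch below uses a made-up vector x and compares a vectorized call with the equivalent explicit loop.

```r
x <- c(1, 4, 9, 16)

sqrt(x)   # 1 2 3 4, the function is applied to every element
x * 2     # 2 8 18 32, the single value 2 is reused for every element

# The same result written as an explicit loop, for comparison
roots <- numeric(length(x))
for (i in seq_along(x)) {
  roots[i] <- sqrt(x[i])
}
identical(roots, sqrt(x))   # TRUE
```

The vectorized form is shorter and is the idiomatic way to write this in R.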