Data Analysis with R — Part 3 (Control Flow)
“Data is a precious thing and will last longer than the systems themselves.” — Tim Berners-Lee, Inventor of the World Wide Web
This is the third part of a series of posts on the topic of Data Analysis with R. Here are the links to the previous two parts.
Data Analysis with R — Part 1 (Getting Started)
Data Analysis with R — Part 2 (Basics of R Programming)
In this post, you will learn how to use:
- Relational & Logical Operators and
- Control Flow (If...Else & For loop)
Relational & Logical Operators
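R has several categories of operators, such as arithmetic operators (+, -, *, /, ^, etc.), relational operators, and logical operators. The relational operators are < (less than), > (greater than), <= (less than or equal to), >= (greater than or equal to), == (equal to), and != (not equal to). The logical operators are ! (NOT), & and && (AND), and | and || (OR). Keep this list handy, as it will be useful whenever you do any sort of data processing.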
A quick example illustrating relational and logical operators can be found below.
Note: Bold text represents R code that you can copy and paste into your RStudio console. Italicized text within a code snippet is either the output of that command or, when it follows a '#', a comment.
# 1. Create a vector from 1 to 20
x = 1:20
# 2. Filter to only single digit numbers
x[x <= 9]
[1] 1 2 3 4 5 6 7 8 9
# 3. Let's filter to numbers greater than 8 and less than 15
x[x > 8 & x < 15]
[1]  9 10 11 12 13 14
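As a supplementary sketch, the remaining relational and logical operators can be tried on the same kind of vector. The variable names (low_or_high, not_big, evens) are illustrative and not part of the original example.

```r
x <- 1:20

# | is the element-wise OR: keep values below 3 or above 18
low_or_high <- x[x < 3 | x > 18]

# ! negates a logical vector: keep values that are NOT greater than 3
not_big <- x[!(x > 3)]

# == tests equality; combined with the modulo operator %% it picks even numbers
evens <- x[x %% 2 == 0]

# != tests inequality: keep every value except 5
all_but_five <- x[x != 5]

print(low_or_high)
print(not_big)
print(evens)
print(length(all_but_five))
```

Each expression inside the square brackets produces a logical vector of the same length as x, and only the positions that are TRUE are kept.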
Control Flow
While coding, more often than not, we have to control the flow of the actions our code performs. In simple terms, control flow is the order in which the statements in our code are evaluated. Usually, we control it by
- evaluating a block of code only if one or more conditions are met (If…Else)
- repeating a similar action a specific number of times (For or While loop)
If…Else
The if-else commands are quite popular in many programming languages (C, Java, BASIC, FORTRAN, etc.).
The syntax for a SIMPLE if-else in R is as follows
if (condition) {
  # do action 1
} else {
  # do action 2
}
# Notes
# 1. Condition can be created using relational and/or logical operators
# 2. For example the condition could be
# - (marks >= 70) or
# - (marks >=70 & marks <=90)
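The conditions in the notes can be evaluated on their own before being placed inside an if. The sketch below assumes a single illustrative value of marks. Because an if condition has to be a single TRUE or FALSE, the scalar operators && and || are a common choice there, while & and | work element by element on whole vectors.

```r
marks <- 75  # illustrative value, not taken from the post

# Each condition evaluates to a single logical value
cond_a <- marks >= 70
cond_b <- marks >= 70 && marks <= 90

if (cond_b) {
  band <- "between 70 and 90"
} else {
  band <- "outside 70 to 90"
}
print(cond_a)
print(band)
```

For a single number, & and && give the same answer, so the snippets in this post work either way.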
The syntax for a COMPLEX if-else in R is as follows
if (condition1) {
  # do action 1
} else if (condition2) {
  # do action 2
} else if (condition3) {
  # do action 3
} else {
  # do action 4
}
Illustration - IF-ELSE
# Scenario 1: Check if a student Passed or Failed in an exam based on the marks
marks = 95
if (marks >= 50) {
  result = "Pass"
} else {
  result = "Fail"
}
print(result)
[1] "Pass"
# Notes
# 1. Since this student's marks (95) are at least the pass mark in our condition (50), result = "Pass" is executed.

# Scenario 2: Identify the Grade of the student based on the marks
marks = 60
if (marks >= 50 & marks < 60) {
  grade = "C"
} else if (marks >= 60 & marks < 70) {
  grade = "B-"
} else if (marks >= 70 & marks < 80) {
  grade = "B"
} else if (marks >= 80 & marks < 90) {
  grade = "B+"
} else if (marks >= 90) {
  grade = "A"
} else {
  grade = "F"
}
print(grade)
[1] "B-"
# Notes
# 1. Since this student's marks (60) satisfy the second condition, the corresponding action, grade = "B-", is executed.
# 2. Do play around by changing the marks and see the output.
There is also a handy vectorized version of If-Else in R called ifelse, which we will cover in Part 5 of this series when we take a deep look at a data processing package called dplyr.
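As a small preview, here is a minimal sketch of ifelse applied to a whole vector at once. The marks values are made up for illustration.

```r
marks <- c(95, 40, 60, 35)  # illustrative marks

# ifelse(test, yes, no) evaluates the test for every element of the vector
result <- ifelse(marks >= 50, "Pass", "Fail")
print(result)
```

Unlike a plain if-else, which handles one condition at a time, ifelse returns one value per element of marks.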
QUIZ — IF-ELSE: This quiz is to improve your understanding of if-else. Feel free to drop a comment, if you’re unable to answer it.
#### Quiz: What's the output you expect out of the following set of codes, if
# 1. number = 1
# 2. number = 10
# 3. number = 1000
if (number < 10) {
  if (number < 5) {
    result <- "extra small"
  } else {
    result <- "small"
  }
} else if (number < 100) {
  result <- "medium"
} else {
  result <- "large"
}
print(result)
# 1. output =
# 2. output =
# 3. output =
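One way to check your answers, once you have written them down, is to wrap the quiz code in a function and call it for each value. This is only a sketch: size_label is a hypothetical helper name, and its body is the quiz code unchanged.

```r
# size_label is a hypothetical helper; its body is the quiz code unchanged
size_label <- function(number) {
  if (number < 10) {
    if (number < 5) {
      result <- "extra small"
    } else {
      result <- "small"
    }
  } else if (number < 100) {
    result <- "medium"
  } else {
    result <- "large"
  }
  result  # the last expression is the function's return value
}

# Call the helper once per quiz value and compare with your own answers
for (number in c(1, 10, 1000)) {
  print(size_label(number))
}
```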
FOR loop
A FOR loop in R is used to iterate over a vector or a data frame. First, let’s see the syntax for a FOR loop.
Note: In R, indexing of vectors, data frames, etc. starts from 1, unlike languages such as Python, where it starts from 0.
for (loop_variable in start:stop) {
  # statements you want to be repeated
  statement1
  statement2
}
# Notes
# 1. The loop variable can be named anything. Let's say i or n or x or y or even_your_name is fine.
# 2. start:stop always increments by 1. For any other increment, loop over seq(start, stop, by = increment) instead.
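In R, the colon operator always steps by 1, so a different increment is usually obtained with seq(). The sketch below loops over the odd numbers from 1 to 9 and adds them up. The names odd_numbers and total are illustrative.

```r
# seq(from, to, by) builds the sequence 1, 3, 5, 7, 9
odd_numbers <- seq(1, 9, by = 2)

total <- 0
for (i in odd_numbers) {
  total <- total + i  # add the current value to the running total
}
print(total)
```

A for loop in R simply walks through whatever vector follows the keyword in, so any vector, however it was built, can be used here.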
Illustration - FOR loop
# Loop 1: Print the loop variable (i)
for (i in 1:10) {
print(paste0("Loop variable = ", i))
# paste0 will concatenate multiple strings into a single string
}
[1] "Loop variable = 1"
[1] "Loop variable = 2"
[1] "Loop variable = 3"
[1] "Loop variable = 4"
[1] "Loop variable = 5"
[1] "Loop variable = 6"
[1] "Loop variable = 7"
[1] "Loop variable = 8"
[1] "Loop variable = 9"
[1] "Loop variable = 10"
# Notes
# 1. First the loop kicks off with i=1 and hence Loop variable = 1 would be printed.
# 2. Then i will be incremented by 1 (the default increment) and Loop variable = 2 would be printed.
# 3. Again i will be incremented by 1 (the default increment) and Loop variable = 3 would be printed.
# 4. I think you got the idea by now. :)
# Loop 2: Translate 10 students' marks into grades
marks = c(40, 60, 20, 70, 65, 90, 80, 75, 95, 35)
grade = character(length(marks))  # empty character vector, one slot per student
for (n in seq_along(marks)) {
  if (marks[n] >= 60 & marks[n] < 70) {
    grade[n] = "B-"
  } else if (marks[n] >= 70 & marks[n] < 80) {
    grade[n] = "B"
  } else if (marks[n] >= 80 & marks[n] < 90) {
    grade[n] = "B+"
  } else if (marks[n] >= 90) {
    grade[n] = "A"
  } else {
    grade[n] = "F"
  }
}
grade
[1] "F"  "B-" "F"  "B"  "B-" "A"  "B+" "B"  "A"  "F"
# Notes
# 1. In the first iteration, when n=1, marks[n] would be 40 and the final else branch would be executed. So, grade[1] would be "F"
# 2. In the second iteration, when n=2, marks[n] would be 60 and the first condition would be met. So, grade[2] would be "B-"
# 3. And so on
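The same pattern can be used to iterate over a data frame. The sketch below is a minimal illustration with made-up values: scores, first_mark, and col_means are hypothetical names. It also shows that indexing starts at 1.

```r
# A small illustrative data frame (the values are made up for this sketch)
scores <- data.frame(maths = c(40, 60, 90), science = c(70, 65, 35))

# Indexing starts at 1, so the first maths mark is scores$maths[1]
first_mark <- scores$maths[1]

# Loop over the column names and store the mean of each column
col_means <- numeric(0)
for (col in names(scores)) {
  col_means[col] <- mean(scores[[col]])
}
print(first_mark)
print(col_means)
```

Here the loop variable col takes each column name in turn, and scores[[col]] pulls out that column as a vector.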
Conclusion
In this post, we learned about relational and logical operators and control flow. Within control flow, we specifically looked into If-Else and the FOR loop. In R, we are encouraged to avoid FOR loops and rely on vectorized computations as much as we can. However, there are circumstances where a FOR loop is unavoidable, so it is good to learn it.
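To give a flavour of what a vectorized computation looks like, here is one possible loop-free version of the marks-to-grades translation. It is a sketch that uses the base R function cut(), which this post has not covered. The breaks and labels are chosen to mirror the conditions in Loop 2.

```r
marks <- c(40, 60, 20, 70, 65, 90, 80, 75, 95, 35)

# cut() assigns each mark to an interval; right = FALSE makes the
# intervals closed on the left, e.g. [60, 70), matching the loop's conditions
grade <- as.character(cut(
  marks,
  breaks = c(-Inf, 60, 70, 80, 90, Inf),
  labels = c("F", "B-", "B", "B+", "A"),
  right = FALSE
))
print(grade)
```

The whole vector is processed in a single call, with no loop variable and no indexing.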
In the next post let’s dive deep into how to read/write Excel and CSV files to/from R. Till then, I hope you get stuck into a nice long FOR loop of practicing. 😜