Chapter-4-slides
Thomas Maierhofer
Fall 2024
Learning Objectives
Why Loops?
▶ Loops reduce redundancy and make code more efficient by automating repeated
tasks.
▶ Several flow control statements help control how many times commands in a
loop are repeated.
▶ We will explore various flow control statements, such as for loops, which are ideal
for simulations.
Intro: The Sum Sign is a For Loop
The Sum Sign is a For Loop
We are all familiar with the sum sign and its use in math.
For data x = {1, 3, 5, 7} we can write
$$
\begin{aligned}
\sum_{i=1}^{n} x_i &= x_1 + x_2 + \dots + x_n \\
&= x_1 + x_2 + x_3 + x_4 \\
&= 1 + 3 + 5 + 7 \\
&= 16
\end{aligned}
$$
This is actually a for loop. If you understand the sum sign you are more than halfway
there!
Let’s take this apart:
$$\sum_{i=1}^{n} x_i$$
In R: The Laborious Way
x <- c(1, 3, 5, 7)
total_sum <- x[1] + x[2] + x[3] + x[4]
total_sum
## [1] 16
In R: The Smart Way
x <- c(1, 3, 5, 7)
total_sum <- 0 # initialize the total to be 0
for (i in 1:4) { # i takes values 1, 2, 3, 4
  # increase the current total by the i-th element of x
  total_sum <- total_sum + x[i]
}
# check the total at the end
total_sum
## [1] 16
In R: Inside the Smart Way
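x <- c(1, 3, 5, 7)
total_sum <- 0 # initialize the total to be 0
total_sum <- total_sum + x[1] # i = 1: total is 1
total_sum <- total_sum + x[2] # i = 2: total is 4
total_sum <- total_sum + x[3] # i = 3: total is 9
total_sum <- total_sum + x[4] # i = 4: total is 16
total_sum
## [1] 16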
1st Chorus: For Loops
Formal Introduction: What is a For Loop?
▶ For loops are common in most programming languages and repeat a set of
commands a fixed number of times.
▶ In R, we use the for() statement to create for loops.
▶ The general form is for (element in vector) { body }, where element is an R object that takes the next value of vector in each iteration.
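A minimal sketch of the general form; the names element and values are placeholders chosen for illustration, not taken from the slides:

```r
values <- c(10, 20, 30)    # any vector (or list) can be looped over
for (element in values) {  # element takes each value in turn
  print(element)           # body: runs once per value
}
```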
How For Loops Work in R
▶ The body of the loop is the code that is repeated each iteration.
▶ It’s recommended to indent the body for clarity. (In RStudio, Ctrl + I / Cmd + I
re-indents the selected lines.)
▶ Use curly braces {} to allow for multiple commands in the loop.
▶ Results are not printed automatically: To display results during each iteration,
use print().
for (i in 1:4) {
  i^2 # does not show up in output
  print(i) # shows up
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
Important Notes about Loops in R
No local environments: Objects created inside the loop will exist in the global
environment.
for (i in 1:3) {
result <- i # create object inside the loop
}
# 'result' is available in the global environment
result
## [1] 3
Example: Squaring Each Entry in a Vector
Let’s create a for loop that squares each entry in a vector.
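The loop itself is not reproduced on this slide. A minimal sketch, assuming five_num holds the values 0, 1, 2, 4, 8 (inferred from the squares printed further down), might look like this:

```r
five_num <- c(0, 1, 2, 4, 8)  # assumed values
for (i in five_num) {
  i^2  # computed, but neither stored nor printed
}
```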
Note: The current loop doesn’t store or print the squared values.
▶ To make the loop useful, we need to save the output from each iteration.
▶ This requires creating an empty object outside the loop and storing the results
inside it during each iteration.
For example, to save the squares of the five_num vector, the loop above can be
rewritten as follows:
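One possible version, sketched under the assumption that five_num is c(0, 1, 2, 4, 8) and using squares as an illustrative name for the storage vector:

```r
five_num <- c(0, 1, 2, 4, 8)           # assumed values
squares <- numeric(length(five_num))   # empty storage vector, created outside the loop
for (i in seq_along(five_num)) {
  squares[i] <- five_num[i]^2          # store the result of each iteration
}
squares
```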
## [1] 0 1 4 16 64
Example: The Fibonacci Sequence
▶ Sometimes, iterations in a loop depend on results from previous iterations.
▶ The Fibonacci sequence is a sequence where:
▶ The first two terms are 1 and 1.
▶ Each subsequent term is the sum of the previous two terms.
# Create a vector to store the Fibonacci numbers
fib <- numeric(12)
fib[1:2] <- c(1, 1) # Initialize the first two terms
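The loop that fills in the remaining terms is missing here. A sketch consistent with the definition above (the setup lines are repeated so the block runs on its own):

```r
fib <- numeric(12)
fib[1:2] <- c(1, 1)  # first two terms
for (i in 3:length(fib)) {
  # each term depends on the two terms computed before it
  fib[i] <- fib[i - 1] + fib[i - 2]
}
fib
```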
## [1] 1 1 2 3 5 8 13 21 34 55 89 144
Indexing Set vs Vector Elements
▶ In many settings, it’s useful to loop over an indexing set, that is, a vector of indices such as 1:length(vec).
▶ Use the letter i as the index to match the mathematical notation for indexing: “x_i
is the i-th entry of a vector.”
▶ You can also loop over the actual elements of a vector, not just the indices.
# a vector
vec <- c("Go", "Bruins!")
# loop through indexing set
for (i in 1:length(vec)) {
print(vec[i])
}
## [1] "Go"
## [1] "Bruins!"
# loop through actual elements of the vector
for (element in vec) {
print(element)
}
## [1] "Go"
## [1] "Bruins!"
I promised to talk about : vs. seq_len() vs. seq_along(),
and now is the time for it. Let’s start with a well-behaved example:
my_text <- c("I", "love", "Stats 20")
for (i in 1:length(my_text)) {
print(my_text[i])
}
## [1] "I"
## [1] "love"
## [1] "Stats 20"
for (i in seq_along(my_text)) {
print(my_text[i])
}
## [1] "I"
## [1] "love"
## [1] "Stats 20"
for (i in seq_len(length(my_text))) {
print(my_text[i])
}
## [1] "I"
## [1] "love"
## [1] "Stats 20"
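And here is the problem: for an empty vector, 1:length(my_text) is 1:0, which is the vector c(1, 0) rather than an empty index set, so the loop body runs twice.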
my_text <- character(0) # empty vector of length 0
for (i in 1:length(my_text)) { # i in 1:0
print(my_text[i])
}
## [1] NA
## character(0)
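By contrast, seq_along() and seq_len() return an empty index set for a vector of length zero, so the loop body never runs. A minimal sketch, with n_iter as an illustrative counter:

```r
my_text <- character(0)  # empty vector of length 0
n_iter <- 0              # illustrative counter of loop iterations

for (i in seq_along(my_text)) {  # seq_along() gives integer(0)
  n_iter <- n_iter + 1
  print(my_text[i])
}

n_iter                       # the body never ran
length(1:length(my_text))    # the index set 1:0 has two elements
length(seq_len(length(my_text)))  # seq_len(0) is empty as well
```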
Final Words: I prefer looping through an index set instead of the actual elements,
and seq_along() is my preferred way of setting up an index set for a loop.
Your turn: One more Variance function
Implement the two-pass variance function using for loops (no vectorized functions like
sum() or mean() allowed):
$$\mathrm{Var}(x) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
You are allowed to use the length() function, but if you are feeling fancy you can
implement your own using the fact that $\mathrm{length}(x) = \sum_{i=1}^{n} 1$.
Solution: compute length using a loop
x <- c(0, 1, 2, 5)
# compute the sample size n
length_loop <- function(x) {
n <- 0
for (i in seq_along(x)) {
n <- n + 1
}
n
}
c(length(x), length_loop(x)) # test
## [1] 4 4
Solution: compute mean using a loop
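The code for this solution is not shown. A minimal sketch of what mean_loop() could look like; it uses length() for the sample size (which the exercise allows), though the length_loop() helper from the previous solution could be swapped in:

```r
x <- c(0, 1, 2, 5)
# compute the sample mean with a loop
mean_loop <- function(x) {
  n <- length(x)  # sample size
  total <- 0      # running sum
  for (i in seq_along(x)) {
    total <- total + x[i]
  }
  total / n
}
c(mean(x), mean_loop(x)) # test
```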
## [1] 2 2
Solution: compute var using a loop
variance_loop <- function(x) {
# sample size
n <- length_loop(x)
# sample mean
x_bar <- mean_loop(x)
# compute the sum of squared deviations
ssd <- 0
for (i in seq_along(x)) {
ssd <- ssd + (x[i] - x_bar)^2
}
# compute the actual variance
1 / (n - 1) * ssd
}
c(var(x), variance_loop(x)) # test
Conditional Execution: The if() Statement
▶ Relational and Boolean operators produce logical vectors that can be used for
subsetting.
▶ Another way to control code execution based on conditions is the if() statement.
if (condition) {
# Commands run when condition is TRUE
}
Example: if
cups_of_coffee <- 1
if (cups_of_coffee == 0){ # FALSE
print("Running on empty... Send coffee!") # is not run
}
if (cups_of_coffee >= 1){ # TRUE
print("Fueled and focused! Let's get things done!") # is run
}
A missing value (NA) for condition will throw an error.
if (cups_of_coffee == sqrt(-1)){ # NA
print("Imaginary coffee won't fuel real work!") # is not run
}
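If a missing condition should simply count as FALSE, one option is to wrap the comparison in isTRUE(). A minimal sketch; treating NA as FALSE is an assumption about the desired behavior, not something the slides prescribe, and ran is an illustrative flag:

```r
cups_of_coffee <- NA  # missing value
ran <- FALSE          # illustrative flag: did the body run?

# isTRUE() returns TRUE only for a single, non-missing TRUE,
# so an NA comparison is treated as FALSE instead of raising an error
if (isTRUE(cups_of_coffee == 0)) {
  ran <- TRUE
}
ran
```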
A numeric input for condition will be coerced into a logical value using the
as.logical() function. (Bad style)
if (cups_of_coffee){ # TRUE
print("Running on caffeine and bold decisions!") # is run
}
The logical expression in condition should evaluate to a single logical value. If a
logical vector with more than one element is used, R will throw an error (as of R 4.2.0; earlier versions only issued a warning and used the first element).
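A short sketch of the problem and the usual remedy, using an illustrative vector cups. The tryCatch() call records whether R raised an error or (in versions before 4.2.0) a warning; any() and all() collapse the logical vector to a single value:

```r
cups <- c(0, 1, 2)  # illustrative vector of length 3

# A length-3 condition is rejected; tryCatch() records how
outcome <- tryCatch(
  if (cups >= 1) "ran",
  error = function(e) "error",
  warning = function(w) "warning"
)
outcome

# Collapse the logical vector to a single TRUE/FALSE first
if (any(cups >= 1)) print("At least one entry is >= 1")
if (all(cups >= 1)) print("Every entry is >= 1")  # not printed
```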
The if-else Statement
if (condition) {
# Commands when condition is TRUE
} else { # else has to be on the same line as closing }
# Commands when condition is FALSE
}
Example: if else
We can create objects in the global environment (workspace) from within if and if
else statements.
cups_of_coffee <- 1
if (cups_of_coffee == 0) {
say <- "One coffee please."
} else {
say <- "Another!"
}
if and if-else: Returning Values
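In R, if and if-else are expressions: they return the value of the last expression evaluated in the branch that runs, so the result can be assigned directly. A minimal sketch with illustrative strings (nothing is a made-up name):

```r
cups_of_coffee <- 1

# Assign the value returned by the branch that runs
say <- if (cups_of_coffee == 0) {
  "One coffee please."
} else {
  "Another!"
}
say

# Without an else branch, a FALSE condition yields NULL
nothing <- if (cups_of_coffee == 0) "One coffee please."
is.null(nothing)
```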
say <- if (cups_of_coffee == 0) { # FALSE
"One coffee please."
} else {
"another.PNG"
}
knitr::include_graphics(say)
if() in Functions
The if() or if else statement is often used in functions to:
▶ Enable optional features.
▶ Switch between behaviors based on input.
Side Note: The ... argument is used to pass optional arguments to functions used
inside the main function. In this example, the ... enables the cor_plot() function
to pass arguments to the plot() function.
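The definition of cor_plot() did not survive on this slide. The following is a hypothetical reconstruction, not the original code: the argument name show_plot and the exact structure are assumptions. It shows an if() enabling an optional feature and ... forwarding extra arguments to plot():

```r
# Hypothetical reconstruction; show_plot is an assumed argument name
cor_plot <- function(x, y, show_plot = TRUE, ...) {
  if (show_plot) {
    plot(x, y, ...)  # ... forwards any extra arguments to plot()
  }
  cor(x, y)  # the correlation is returned either way
}

x_seq <- seq(0, 2 * pi, by = 0.1)
# Switch the optional plot off and keep only the correlation
r <- cor_plot(x = x_seq, y = sin(x_seq), show_plot = FALSE)
r
```

With the default show_plot = TRUE, a call such as cor_plot(x_seq, sin(x_seq), type = "l") would pass type = "l" on to plot().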
Let’s try it
x_seq <- seq(0, 2 * pi, by = 0.1)
cor_plot(x = x_seq, y = sin(x_seq))
## [1] -0.7762993
[Figure: scatter plot of sin(x_seq) against x_seq]
Additional arguments to cor_plot() are handed over to the plot function.
[Figure: the same scatter plot, drawn with the additional arguments passed to plot(); the x-axis runs from 0 to 6]
## [1] -0.7762993
Bridge: Error Handling
Error Handling with if()
The if() statement can be used for error handling, allowing us to check for invalid
inputs and throw appropriate error or warning messages.
Let’s revisit the variance function we wrote in Chapter 2.
## [1] 10
## [1] NA
The stop() Function
The stop() function stops the execution of the current expression and throws an error
message.
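A minimal sketch of stop() used as an input check. The function var_checked() and its error message are hypothetical, written for illustration rather than taken from the Chapter 2 code; tryCatch() is used only so that the block keeps running after the error:

```r
# Hypothetical variance function with an input check
var_checked <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be a numeric vector.")  # execution ends here
  }
  sum((x - mean(x))^2) / (length(x) - 1)
}

var_checked(c(0, 1, 2, 5))

# Capture the error message instead of halting the script
msg <- tryCatch(var_checked("a"), error = function(e) conditionMessage(e))
msg
```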
The warning() Function
The warning() function throws a warning message but does not stop the execution of
the current expression.
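A minimal sketch of warning() in the same setting. The function var_warn() and the choice to warn for inputs with fewer than two values are assumptions made for illustration:

```r
# Hypothetical variance function; the length check is an assumed example
var_warn <- function(x) {
  if (length(x) < 2) {
    warning("Variance needs at least two values; returning NA.")
    return(NA)  # execution continues after warning()
  }
  sum((x - mean(x))^2) / (length(x) - 1)
}

var_warn(5)  # warns, but still returns a value
```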
## [1] NA
Your Turn:
Think about other things that can go wrong with this function, or what other
functionality you might want:
▶ What if the input object is not numeric?
▶ What if the input object is not a vector?
▶ What if you wanted to first check for NA values, throw a warning if necessary, then
remove the NA values before computing the variance?
The message() Function
A related function is the message() function, which is used for printing diagnostic
messages.
## [1] NA
The behavior of message() and warning() is similar, but their purposes are distinct:
▶ The purpose of the message() function is to notify the user of what the code is
doing, but typically the code is working as intended.
▶ The warning() function is reserved for warning when the result may not be what the user expects.
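A minimal sketch of message() used as a diagnostic. The helper var_verbose() is hypothetical, and dropping NA values is an assumed behavior chosen for illustration; the code works as intended and merely reports what it did:

```r
# Hypothetical helper: drops NA values and reports what it did
var_verbose <- function(x) {
  n_missing <- sum(is.na(x))
  if (n_missing > 0) {
    message("Removing ", n_missing, " missing value(s).")  # diagnostic only
    x <- x[!is.na(x)]
  }
  sum((x - mean(x))^2) / (length(x) - 1)
}

var_verbose(c(0, 1, NA, 2, 5))

# Messages can be silenced without affecting warnings
suppressMessages(var_verbose(c(0, 1, NA, 2, 5)))
```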
2nd Chorus: The while() Loop
The while() Loop
▶ Unlike a for() loop, which repeats commands a fixed number of times, the
while() loop is used when the number of iterations is unknown in advance.
▶ The while() loop repeats a set of commands as long as a specified condition
is TRUE.
while (condition) {
# Commands go here
}
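The example that goes with the output and question that follow is not shown. A sketch that reproduces that output, assuming a starting value of 0 and a step of 5 (both are guesses chosen to match):

```r
num <- 0            # assumed starting value
while (num <= 20) { # condition is checked before each iteration
  num <- num + 5    # assumed step size
}
num
```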
## [1] 25
Question: Why is the result larger than 20 if the loop only runs when num is less than
or equal to 20? How would you change that?
Example: Fibonacci Sequence with a while() Loop
# Create the vector to store the output from the while loop.
fib <- c(1, 1)
# While the sum of the last two terms is less than 500,
# execute the following commands.
while (fib[length(fib)] + fib[length(fib) - 1] < 500) {
# compute the next term
next_term <- fib[length(fib)] + fib[length(fib) - 1]
# Append the next term to the vector with all previous terms.
fib <- c(fib, next_term)
}
# Print the output from the while loop.
fib
Sidenote on Efficiency: Vector Memory Allocation
Increasing the length of a vector element by element inside a loop (e.g., fib <-
c(fib, next_term)) should generally be avoided.
Why Avoid This?
▶ Each time the vector grows by one element, R replaces the entire vector with a
new vector in a new memory location that includes the additional element.
▶ New memory space is allocated for each new vector, making this process
time-consuming and inefficient.
Solution:
▶ Pre-allocate a storage vector of a fixed size when possible to avoid frequent
memory allocations.
▶ This practice improves the performance of loops, especially when the number of
iterations is large.
Key Takeaway: Reducing memory re-allocation in large loops leads to more efficient
and faster code execution.
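A small sketch contrasting the two approaches on an illustrative task (squaring the integers 1 to n). The names grown and prealloc and the size n are arbitrary choices; wrapping each loop in system.time() would show the difference in run time on a given machine:

```r
n <- 10000

# Growing: the vector is copied every time it is extended
grown <- numeric(0)
for (i in 1:n) {
  grown <- c(grown, i^2)
}

# Pre-allocating: storage is created once, then filled in place
prealloc <- numeric(n)
for (i in seq_len(n)) {
  prealloc[i] <- i^2
}

identical(grown, prealloc)  # both approaches give the same values
```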
Outro: break and next Statements
The break Statement in Loops
Example: break
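A break statement exits the enclosing loop immediately, usually from inside an if() that checks a stopping condition. The example code did not survive on this slide; here is a sketch that would print the two lines that follow, with the vector contents and the stop word chosen purely for illustration:

```r
words <- c("Let's", "get", "STOP", "going")  # illustrative data
for (word in words) {
  if (word == "STOP") {
    break  # leave the loop; later elements are never reached
  }
  print(word)
}
```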
## [1] "Let's"
## [1] "get"
The next Statement in Loops
▶ A next statement skips the remainder of the current iteration of a loop and
immediately moves to the next iteration.
▶ A next statement is typically placed inside an if() statement so that the loop
skips based on a specific condition (the skip condition).
Example: next
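The example code is not shown here either. A sketch that would print the three lines that follow, with an illustrative vector and skip condition:

```r
words <- c("I", "do not", "love", "R code")  # illustrative data
for (word in words) {
  if (word == "do not") {
    next  # skip print() for this element only, then continue
  }
  print(word)
}
```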
## [1] "I"
## [1] "love"
## [1] "R code"
When To Use Which Loop (and When Not To)
▶ Loops are natural when executing code multiple times or iterating over an index.
▶ However, a loop may not be necessary in all cases:
When Not To Use a Loop
▶ If the order of execution doesn’t matter, vectorization should be used instead of
a loop.
▶ Vectorized approaches are usually simpler, shorter, and much more efficient in R.
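A brief sketch of the contrast, reusing the data from the sum example; squares_loop and squares_vec are illustrative names:

```r
x <- c(1, 3, 5, 7)

# Loop version: square each entry one at a time
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorized version: one expression, no explicit loop
squares_vec <- x^2

identical(squares_loop, squares_vec)
sum(x)  # vectorized replacement for the summation loop
```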