Chapter-4-slides

Uploaded by

levinali1225

STATS 20: Chapter 4 - Flow Control Statements

Thomas Maierhofer

Fall 2024

Learning Objectives

▶ Create loops using for(), while(), and repeat.


▶ Understand the difference between for and while loops.
▶ Understand and construct if() and if-else statements.

Why Loops?

▶ Loops reduce redundancy and make code more efficient by automating repeated
tasks.
▶ Several flow control statements help control how many times commands in a
loop are repeated.
▶ We will explore various flow control statements, such as for loops, which are ideal
for simulations.

Intro: The Sum Sign is a For Loop

The Sum Sign is a For Loop

We are all familiar with the sum sign and its use in math.
For data x = {1, 3, 5, 7} we can write
$$
\begin{aligned}
\sum_{i=1}^{n} x_i &= x_1 + x_2 + \dots + x_n \\
&= x_1 + x_2 + x_3 + x_4 \\
&= 1 + 3 + 5 + 7 \\
&= 16
\end{aligned}
$$

This is actually a for loop. If you understand the sum sign you are more than halfway
there!

Let’s take this apart:

$$
\sum_{i=1}^{n} x_i
$$

▶ There is a running index i = 1, ..., n. Here n = 4, so i takes the values 1, 2, 3, 4.
▶ In the first step, i = 1, so we access element x_i = x_1 = 1. This is our current total.
▶ In the second step, i = 2, so we access element x_i = x_2 = 3 and we add this to
the current total to obtain 1 + 3 = 4.
▶ In the third step, i = 3, so we access element x_i = x_3 = 5 and we add this to
the current total to obtain 4 + 5 = 9.
▶ In the fourth step, i = 4, so we access element x_i = x_4 = 7 and we add this to the
current total to obtain 9 + 7 = 16.

In R: The Laborious Way

We can sum over all entries in a vector manually.

x <- c(1, 3, 5, 7)
total_sum <- x[1] + x[2] + x[3] + x[4]
total_sum

## [1] 16

This would be annoying for long vectors or vectors of changing length.

In R: The Smart Way

We can sum over all entries in a vector using a for loop.

x <- c(1, 3, 5, 7)
total_sum <- 0 # initialize the total to be 0
for (i in 1:4) { # i takes values 1, 2, 3, 4
  # increase the current total by the i-th element of x
  total_sum <- total_sum + x[i]
}
# check the total at the end
total_sum

## [1] 16

In R: Inside the Smart Way
x <- c(1, 3, 5, 7)
total_sum <- 0 # initialize the total to be 0

i <- 1 # first iteration
total_sum <- total_sum + x[i]

i <- 2 # second iteration
total_sum <- total_sum + x[i]

i <- 3 # third iteration
total_sum <- total_sum + x[i]

i <- 4 # fourth iteration
total_sum <- total_sum + x[i]

total_sum

## [1] 16

1st Chorus: For Loops

Formal Introduction: What is a For Loop?
▶ For loops are common in most programming languages and repeat a set of
commands a fixed number of times.
▶ In R, we use the for() statement to create for loops.
▶ element is an R object that is available in each iteration:
  1. element <- vector[1] on the first iteration.
  2. element <- vector[2] on the second iteration.
  3. And so on, until all entries in vector are processed.

Syntax of a For Loop

for (element in vector) {
  # Commands go here
  # element is available as an object
}

How For Loops Work in R
▶ The body of the loop is the code that is repeated each iteration.
▶ It’s recommended to indent the body for clarity. (Ctrl + I / Cmd + I will fix all
indentations)
▶ Use curly braces {} to allow for multiple commands in the loop.
▶ Results are not printed automatically: To display results during each iteration,
use print().

for (i in 1:4) {
  i^2 # does not show up in output
  print(i) # shows up
}

## [1] 1
## [1] 2
## [1] 3
## [1] 4

Important Notes about Loops in R

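No local environments: A for loop does not create its own environment. Objects created
inside the loop exist in the environment where the loop is run, which is the global
environment for a loop run at the top level.

for (i in 1:3) {
  result <- i # create object inside the loop
}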
# 'result' is available in the global environment
result

## [1] 3
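
The same point can be checked for the loop variable itself. The sketch below also contrasts it with a loop that runs inside a function; `count_to()`, `k`, and `last_seen` are made-up names for illustration and do not appear in the slides.

```r
for (i in 1:3) {
  result <- i
}
i # the loop variable is still available after the loop

# Hypothetical helper: here the loop runs inside a function,
# so the objects it creates stay local to that function call
count_to <- function(n) {
  for (k in seq_len(n)) {
    last_seen <- k
  }
  last_seen
}
count_to(5)
exists("last_seen") # FALSE when run in a fresh session
```

The loop did not create a local environment in either case. In the second case the function did.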

Example: Squaring Each Entry in a Vector
Let’s create a for loop that squares each entry in a vector.

five_num <- c(0, 1, 2, 4, 8)

# Let n cycle over the entries in five_num
for (n in five_num) {
  # For each iteration, compute the square of n.
  n^2
}

Note: The current loop doesn’t store or print the squared values.
▶ To make the loop useful, we need to save the output from each iteration.
▶ This requires creating an empty object outside the loop and storing the results
inside it during each iteration.
For example, to save the squares of the five_num vector, the loop above can be
rewritten as follows:

# Create a vector to store the output from the for loop.
squares <- numeric(5)

# Let i cycle over the numbers 1 to 5 (the length of five_num).
for (i in 1:5) {
  # For the i-th iteration of the for loop
  # square the i-th entry of five_num
  # and save the output into the i-th entry of the squares vector.
  squares[i] <- five_num[i]^2
}
# Print the output from the for loop.
squares

## [1] 0 1 4 16 64

Example: The Fibonacci Sequence
▶ Sometimes, iterations in a loop depend on results from previous iterations.
▶ The Fibonacci sequence is a sequence where:
▶ The first two terms are 1 and 1.
▶ Each subsequent term is the sum of the previous two terms.
# Create a vector to store the Fibonacci numbers
fib <- numeric(12)
fib[1:2] <- c(1, 1) # Initialize the first two terms

# Use a for loop to compute the next terms
for (i in 3:12) {
  fib[i] <- fib[i - 2] + fib[i - 1] # Sum the previous two terms
}
fib # Print the Fibonacci sequence

## [1] 1 1 2 3 5 8 13 21 34 55 89 144

Indexing Set vs Vector Elements
▶ In many settings, it’s useful to use an index set (e.g., 1:length(vec)) as the vector
in a for loop.
▶ Use the letter i as the index to match the mathematical notation for indexing: “x_i
is the i-th entry of a vector.”
▶ You can also loop over the actual elements of a vector, not just the indices.
# a vector
vec <- c("Go", "Bruins!")
# loop through indexing set
for (i in 1:length(vec)) {
  print(vec[i])
}

## [1] "Go"
## [1] "Bruins!"

# loop through actual elements of the vector
for (element in vec) {
  print(element)
}

## [1] "Go"
## [1] "Bruins!"

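
One practical difference between the two styles, sketched with a made-up vector `scores` (not from the slides): assigning to the loop variable only changes a copy, whereas assigning through an index changes the vector itself.

```r
scores <- c(10, 20, 30) # hypothetical example data

# Looping over elements: 'value' is a copy, so 'scores' is unchanged
for (value in scores) {
  value <- value + 1
}
scores # still 10 20 30

# Looping over an index set: assignment through the index updates 'scores'
for (i in seq_along(scores)) {
  scores[i] <- scores[i] + 1
}
scores # now 11 21 31
```

This is one reason an index set is often the more flexible choice when results need to be stored.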
I promised to talk about : vs. seq_len vs. seq_along
and now is the time for it. Let’s start with a well-behaved example:
my_text <- c("I", "love", "Stats 20")
for (i in 1:length(my_text)) {
  print(my_text[i])
}

## [1] "I"
## [1] "love"
## [1] "Stats 20"

for (i in seq_along(my_text)) {
  print(my_text[i])
}

## [1] "I"
## [1] "love"
## [1] "Stats 20"

for (i in seq_len(length(my_text))) {
  print(my_text[i])
}

## [1] "I"
## [1] "love"
## [1] "Stats 20"

and here is the problem
my_text <- character(0) # empty vector of length 0
for (i in 1:length(my_text)) { # i in 1:0, which counts down: c(1, 0)
  print(my_text[i])
}

## [1] NA
## character(0)

for (i in seq_along(my_text)) { # i in seq_along(character(0))
  print(my_text[i])
} # there's no output

for (i in seq_len(length(my_text))) { # i in seq_len(0)
  print(my_text[i])
} # there's no output

Final Words: I prefer looping through an index set instead of the actual elements, and
seq_along() is my preferred way of setting up an index set for a loop.

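
To see why the first loop misbehaves, it can help to look at the three index sets directly. This is a small sketch; `empty`, `n_colon`, and `n_along` are illustrative names, not part of the slides.

```r
empty <- character(0) # hypothetical empty input

1:length(empty)        # 1:0 counts down and gives 1 0
seq_len(length(empty)) # integer(0): nothing to loop over
seq_along(empty)       # integer(0)

# Count how many times each loop body would run
n_colon <- 0
for (i in 1:length(empty)) {
  n_colon <- n_colon + 1
}
n_along <- 0
for (i in seq_along(empty)) {
  n_along <- n_along + 1
}
c(colon = n_colon, seq_along = n_along)
```

The `:` version runs twice on an empty vector, while the `seq_along()` version does not run at all.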
Your turn: One more Variance function

Implement the two-pass variance function using for loops (no vectorized functions like
sum() or mean() allowed):
$$
\mathrm{Var}(x) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
$$

You are allowed to use the length() function, but if you are feeling fancy you can
implement your own using the fact that $\mathrm{length}(x) = \sum_{i=1}^{n} 1$.

Solution: compute length using a loop

x <- c(0, 1, 2, 5)
# compute the sample size n
length_loop <- function(x) {
  n <- 0
  for (i in seq_along(x)) {
    n <- n + 1
  }
  n
}
c(length(x), length_loop(x)) # test

## [1] 4 4

Solution: compute mean using a loop

mean_loop <- function(x) {
  # compute sample size
  n <- length_loop(x)
  # sum over x
  sum_x <- 0
  for (i in seq_along(x)) {
    sum_x <- sum_x + x[i]
  }
  # return the sample mean
  sum_x / n
}
c(mean(x), mean_loop(x)) # test

## [1] 2 2

Solution: compute var using a loop
variance_loop <- function(x) {
  # sample size
  n <- length_loop(x)
  # sample mean
  x_bar <- mean_loop(x)
  # compute the sum of squared deviations
  ssd <- 0
  for (i in seq_along(x)) {
    ssd <- ssd + (x[i] - x_bar)^2
  }
  # compute the actual variance
  1 / (n - 1) * ssd
}
c(var(x), variance_loop(x)) # test

## [1] 4.666667 4.666667


Interlude: Conditional Execution

Conditional Execution: The if() Statement

▶ Relational and Boolean operators produce logical vectors that can be used for
subsetting.
▶ Another way to control code execution based on conditions is the if() statement.

if (condition) {
  # Commands run when condition is TRUE
}

▶ The condition is a logical expression that evaluates to TRUE or FALSE.
▶ The commands inside the block are executed only if the condition is TRUE.

Example: if

How many cups of coffee did you have?

cups_of_coffee <- 1
if (cups_of_coffee == 0) { # FALSE
  print("Running on empty... Send coffee!") # is not run
}
if (cups_of_coffee >= 1) { # TRUE
  print("Fueled and focused! Let's get things done!") # is run
}

## [1] "Fueled and focused! Let's get things done!"

A missing value (NA) for condition will throw an error.

if (cups_of_coffee == sqrt(-1)) { # NA
  print("Imaginary coffee won't fuel real work!") # is not run
}

## Warning in sqrt(-1): NaNs produced

## Error in if (cups_of_coffee == sqrt(-1)) {: missing value where TRUE/FAL
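
A common way to avoid this error is to test for the missing value before the comparison, or to wrap the comparison in `isTRUE()`. The following is a minimal sketch that assumes a missing measurement; the variable `checked` is made up for illustration.

```r
cups_of_coffee <- NA_real_ # hypothetical missing measurement

# is.na() always returns TRUE or FALSE, so it is safe inside if()
if (is.na(cups_of_coffee)) {
  print("No coffee data recorded.") # is run
}

# isTRUE() returns FALSE for NA, so this condition is never missing
checked <- isTRUE(cups_of_coffee == 0)
if (checked) {
  print("Running on empty... Send coffee!") # is not run
}
```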

A numeric input for condition will be coerced into a logical value, as the as.logical()
function would do. This works, but it is bad style.

as.logical(c(-1, 0, 1, 2)) # any number but 0 will return TRUE

## [1] TRUE FALSE TRUE TRUE

if (cups_of_coffee) { # TRUE
  print("Running on caffeine and bold decisions!") # is run
}

## [1] "Running on caffeine and bold decisions!"

The logical expression in condition should evaluate to a single logical value. If a
logical vector with more than one element is used, R will throw an error (since R 4.2.0;
earlier versions gave a warning and used only the first element).

if (cups_of_coffee == c(1, 2, 3, 4)) { # c(TRUE, FALSE, FALSE, FALSE)
  print("Over-caffeinated and under-prepared for all these choices!")
}

## Error in if (cups_of_coffee == c(1, 2, 3, 4)) {: the condition has lengt
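
If the comparison really does involve a vector, `any()` or `all()` can collapse it to the single logical value that `if()` expects. A short sketch follows; the vector name `choices` is an assumption for illustration.

```r
cups_of_coffee <- 1
choices <- c(1, 2, 3, 4) # hypothetical vector of options

cups_of_coffee == choices # TRUE FALSE FALSE FALSE

# any(): TRUE if at least one comparison is TRUE
if (any(cups_of_coffee == choices)) {
  print("At least one match.") # is run
}

# all(): TRUE only if every comparison is TRUE
if (all(cups_of_coffee == choices)) {
  print("Everything matches.") # is not run
}
```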

The if-else Statement

▶ Sometimes we need an alternative set of commands when a condition is FALSE.
▶ The else statement is used in conjunction with if() to provide this alternative.

if (condition) {
  # Commands when condition is TRUE
} else { # else has to be on the same line as closing }
  # Commands when condition is FALSE
}

Example: if else

We can create objects in the global environment (workspace) from within if and if
else statements.

cups_of_coffee <- 1
if (cups_of_coffee == 0) {
  say <- "One coffee please."
} else {
  say <- "Another!"
}

Question: What is the value of say?

if and if-else: Returning Values

▶ An if or if-else statement is similar to a function call in that the result from
the last command in the body is returned if executed.
▶ However, unlike functions, if and if-else statements do not create local
environments.

This allows us to rewrite the previous example as:

say <- if (cups_of_coffee == 0) { # FALSE
  "One coffee please."
} else {
  "another.PNG"
}
knitr::include_graphics(say)
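
The returned value can be inspected with a smaller example that needs no image file. This is only a sketch; `n`, `parity`, and `nothing` are made-up names. It also shows what happens when no branch is executed.

```r
n <- 7 # hypothetical input

# The value of the executed branch becomes the value of the whole statement
parity <- if (n %% 2 == 0) {
  "even"
} else {
  "odd"
}
parity # "odd"

# Without an else branch, a FALSE condition yields NULL
nothing <- if (n > 100) {
  "big"
}
is.null(nothing) # TRUE
```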

if() in Functions
The if() or if else statement is often used in functions to:
▶ Enable optional features.
▶ Switch between behaviors based on input.

cor_plot <- function(x, y, plot = FALSE, ...) {
  if (plot) {
    # If plot = TRUE, draw a scatterplot of x and y.
    plot(x, y, ...)
  }
  # Always compute and return the correlation coefficient
  cor(x, y)
}

Side Note: The ... argument is used to pass optional arguments to functions used
inside the main function. In this example, the ... enables the cor_plot() function
to pass arguments to the plot() function.

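
The same mechanism can be tried without any plotting. In this sketch, `rounded_mean()` and `values` are hypothetical names; the function forwards whatever it receives in `...` to `round()`.

```r
# Hypothetical helper: forwards any extra arguments to round()
rounded_mean <- function(x, ...) {
  round(mean(x), ...)
}

values <- c(1.2, 2.5, 3.9) # made-up data, mean is about 2.5333
rounded_mean(values)             # round() uses its default digits = 0
rounded_mean(values, digits = 2) # digits = 2 is passed through ...
```

Because `rounded_mean()` has no `digits` argument of its own, the named argument is collected by `...` and handed to `round()` unchanged.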
Let’s try it
x_seq <- seq(0, 2 * pi, by = 0.1)
cor_plot(x = x_seq, y = sin(x_seq))

## [1] -0.7762993

x_seq <- seq(0, 2 * pi, by = 0.1)
cor_plot(x = x_seq, y = sin(x_seq), plot = TRUE)

[Figure: scatterplot of y against x]

Additional arguments to cor_plot() are handed over to the plot function.

cor_plot(x = x_seq, y = sin(x_seq), plot = TRUE,
         # additional arguments are collected by ...
         col = "seagreen", pch = 16) # make sure you name them

[Figure: the same scatterplot of y against x, drawn with filled sea-green points]

## [1] -0.7762993

Bridge: Error Handling

Error Handling with if()
The if() statement can be used for error handling, allowing us to check for invalid
inputs and throw appropriate error or warning messages.
Let’s revisit the variance function we wrote in Chapter 2.

var_fn <- function(x) {
  sum((x - mean(x))^2) / (length(x) - 1)
}
var_fn(five_num)

## [1] 10

What if the input vector contains NA values?

incomplete_nums <- c(NA, 8, 12, 0)
var_fn(incomplete_nums)

## [1] NA

The stop() Function

The stop() function stops the execution of the current expression and throws an error
message.

var_fn2 <- function(x) {
  if (sum(is.na(x)) > 0) {
    stop("The input has NA values!")
  }
  sum((x - mean(x))^2) / (length(x) - 1)
}
var_fn2(incomplete_nums)

## Error in var_fn2(incomplete_nums): The input has NA values!

The warning() Function
The warning() function throws a warning message but does not stop the execution of
the current expression.

var_fn3 <- function(x) {
  if (sum(is.na(x)) > 0) {
    warning("The input has NA values!")
  }
  sum((x - mean(x))^2) / (length(x) - 1)
}
var_fn3(incomplete_nums)

## Warning in var_fn3(incomplete_nums): The input has NA values!

## [1] NA

Your Turn:

Think about other things that can go wrong with this function, or what other
functionality you might want:
▶ What if the input object is not numeric?
▶ What if the input object is not a vector?
▶ What if you wanted to first check for NA values, throw a warning if necessary, then
remove the NA values before computing the variance?
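
One possible way to approach the first and third questions is sketched below. The name `var_fn5()` and the message texts are assumptions, not the intended solution, and the sketch does not address non-vector input such as matrices.

```r
# Hypothetical var_fn5(): one possible answer to some of the questions above
var_fn5 <- function(x) {
  # is.numeric() is FALSE for character, logical, and list input
  if (!is.numeric(x)) {
    stop("The input must be numeric!")
  }
  if (sum(is.na(x)) > 0) {
    warning("The input has NA values! Removing them.")
    x <- x[!is.na(x)] # keep only the non-missing entries
  }
  sum((x - mean(x))^2) / (length(x) - 1)
}

incomplete_nums <- c(NA, 8, 12, 0)
var_fn5(incomplete_nums) # warns, then uses 8, 12, 0
```

The type check comes first because `stop()` ends the function immediately, so the later steps only ever see numeric input.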

The message() Function
A related function is the message() function, which is used for printing diagnostic
messages.

var_fn4 <- function(x) {
  message("Computing variance, unless there are NAs...")
  sum((x - mean(x))^2) / (length(x) - 1)
}
var_fn4(incomplete_nums)

## Computing variance, unless there are NAs...

## [1] NA

The behavior of message() and warning() is similar, but the purposes are distinct:
▶ The purpose of the message() function is to notify the user of what the code is
doing, but typically the code is working as intended.
▶ The warning() function is reserved for warning when the result may not be what
was intended.

2nd Chorus: The while() Loop

The while() Loop
▶ Unlike a for() loop, which repeats commands a fixed number of times, the
while() loop is used when the number of iterations is unknown in advance.
▶ The while() loop repeats a set of commands as long as a specified condition
is TRUE.

while (condition) {
  # Commands go here
}

A while() loop is essentially a repeating if statement:
1. The logical condition expression is evaluated.
2. If condition is TRUE, the commands are executed.
3. Repeat steps 1 and 2 until the condition evaluates to FALSE, when the loop
stops.

Example: while loop

# Start with num = 1.
num <- 1
# While num is less than or equal to 20, execute the following commands.
while (num <= 20) {
  # Add 6 to num and assign the sum to num (replace the old num value).
  num <- num + 6
}
# Print the output from the while loop.
num

## [1] 25

Question: Why is the result larger than 20 if the loop only runs when num is less than
or equal to 20? How would you change that?
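
One possible answer, offered as a sketch rather than the intended solution: the condition is checked before 6 is added, so the last pass starts at 19 and ends at 25. Testing the value that the next step would produce keeps num at or below 20.

```r
num <- 1
# Only enter the body if the updated value would still be at most 20
while (num + 6 <= 20) {
  num <- num + 6
}
num # 19
```

Here num takes the values 1, 7, 13, 19, and the loop stops because 19 + 6 exceeds 20.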
Example: Fibonacci Sequence with a while() Loop

▶ Suppose we want to list all Fibonacci numbers less than 500.
▶ Since we don’t know how long the sequence is beforehand, we cannot create an
index set (let alone specify the elements of the vector) which is needed for a
for() loop.

Why use a while() Loop?
▶ The number of iterations is not known in advance.
▶ Instead of using indexing, we check the last two terms in the Fibonacci series.

# Create the vector to store the output from the while loop.
fib <- c(1, 1)

# While the sum of the last two terms is less than 500,
# execute the following commands.
while (fib[length(fib)] + fib[length(fib) - 1] < 500) {
  # compute the next term
  next_term <- fib[length(fib)] + fib[length(fib) - 1]
  # Append the next term to the vector with all previous terms.
  fib <- c(fib, next_term)
}
# Print the output from the while loop.
fib

## [1] 1 1 2 3 5 8 13 21 34 55 89 144 233 377

Sidenote on Efficiency: Vector Memory Allocation
Increasing the length of a vector element by element inside a loop (e.g., fib <-
c(fib, next_term)) should generally be avoided.
Why Avoid This?
▶ Each time the vector grows by one element, R replaces the entire vector with a
new vector in a new memory location that includes the additional element.
▶ New memory space is allocated for each new vector, making this process
time-consuming and inefficient.

Solution:
▶ Pre-allocate a storage vector of a fixed size when possible to avoid frequent
memory allocations.
▶ This practice improves the performance of loops, especially when the number of
iterations is large.
Key Takeaway: Reducing memory re-allocation in large loops leads to more efficient
and faster code execution.
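
As a rough sketch of pre-allocation for the Fibonacci example, one can reserve more space than needed and trim the unused part afterwards. The upper bound `max_terms` and the counter `n` are assumptions introduced here, not part of the slides.

```r
# Sketch: pre-allocate more space than needed, then trim.
# max_terms is an assumed upper bound on the number of terms.
max_terms <- 50
fib <- numeric(max_terms)
fib[1:2] <- c(1, 1)
n <- 2 # number of terms filled in so far

while (n < max_terms && fib[n] + fib[n - 1] < 500) {
  n <- n + 1
  fib[n] <- fib[n - 1] + fib[n - 2]
}
fib <- fib[1:n] # drop the unused zeros
fib
```

The loop writes into existing positions instead of building a new vector each time. The price is having to guess a safe upper bound in advance.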
Outro: break and next Statements

The break Statement in Loops

▶ A break statement immediately exits or breaks out of a loop.
▶ A break is typically placed inside an if() statement so that the loop breaks
when a certain condition is met (the exit condition).

# within a while or for loop
if (condition) { # exit condition
  break # stop and exit
}

Example: break

my_text <- c("Let's", "get", "out", "of", "here")
for (word in my_text) {
  if (word == "out") {
    break
  }
  print(word)
}

## [1] "Let's"
## [1] "get"
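
The learning objectives also list repeat, which has no condition of its own and runs until a break is reached. A minimal sketch, reusing the add-6 idea from the while() example:

```r
num <- 1
repeat {
  num <- num + 6
  # exit condition: leave the loop once num exceeds 20
  if (num > 20) {
    break
  }
}
num # 25
```

Without the break, this loop would never stop, so the exit condition is an essential part of any repeat loop.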

The next Statement in Loops

▶ A next statement skips the remainder of the current iteration of a loop and
immediately moves to the next iteration.
▶ A next statement is typically placed inside an if() statement so that the loop
skips based on a specific condition (the skip condition).

# within a while or for loop
if (condition) { # skip condition
  next # skip the current iteration
}

Example: next

my_text <- c("I", "love", "debugging", "R code")
for (word in my_text) {
  if (word == "debugging") {
    next
  }
  print(word)
}

## [1] "I"
## [1] "love"
## [1] "R code"
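
A typical numeric use of next is skipping missing values while accumulating a total. The data and names in this sketch (`measurements`, `total`) are made up for illustration.

```r
measurements <- c(4, NA, 7, NA, 1) # made-up data with missing values
total <- 0
for (value in measurements) {
  if (is.na(value)) {
    next # skip condition: ignore missing values
  }
  total <- total + value
}
total # 12
```

The additions only happen for the three non-missing entries, so the result matches `sum(measurements, na.rm = TRUE)`.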

When To Use Which Loop (and When Not To)
▶ Loops are natural when executing code multiple times or iterating over an index.
▶ However, a loop may not be necessary in all cases:
When Not To Use a Loop
▶ If the order of execution doesn’t matter, vectorization should be used instead of
a loop.
▶ Vectorized approaches are usually simpler, shorter, and much more efficient.
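
For instance, squaring each entry of five_num does not depend on the order of the iterations, so the loop can be replaced by a single vectorized expression. The names `squares_loop` and `squares_vec` in this sketch are illustrative.

```r
five_num <- c(0, 1, 2, 4, 8)

# Loop version: fill a pre-allocated vector one entry at a time
squares_loop <- numeric(length(five_num))
for (i in seq_along(five_num)) {
  squares_loop[i] <- five_num[i]^2
}

# Vectorized version: ^ works on the whole vector at once
squares_vec <- five_num^2

identical(squares_loop, squares_vec) # TRUE
```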

When To Use a Loop
▶ Order matters: Use a loop if iterations depend on previous results (e.g.,
Fibonacci sequence).
▶ for() loop: Use when the number of iterations is known in advance.
▶ while() loop: Use when the number of iterations is unknown or depends on a
condition.
Key Takeaway: Choose loops only when necessary. Vectorized solutions are typically
preferred for efficiency.
