How to Change the Value of a Variable with dplyr

Last Updated : 03 Jun, 2024

The dplyr package in R is a widely used tool for data manipulation and transformation. It provides a small set of functions for performing common data manipulation tasks concisely. One of these tasks is changing the value of a variable (a column) within a data frame. This article walks through several ways to do that with dplyr.

Change the value of a variable with dplyr

dplyr is part of the tidyverse, a collection of R packages designed for data science. Its most commonly used functions include select(), filter(), mutate(), summarise(), and arrange(). Among these, mutate() is the primary function for creating new variables or modifying existing ones in a data frame.

Before diving into the examples, let's create a sample data frame to work with:

# Load dplyr package
library(dplyr)

# Create a sample data frame
data <- data.frame(
  id = 1:5,
  name = c("Ali", "Boby", "Charles", "David", "Eva"),
  age = c(25, 30, 35, 40, 45),
  score = c(88, 92, 85, 87, 90)
)

# Display the data frame
print(data)

Output:

  id    name age score
1  1     Ali  25    88
2  2    Boby  30    92
3  3 Charles  35    85
4  4   David  40    87
5  5     Eva  45    90

Changing Variable Values with mutate()

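The mutate() function adds new columns to a data frame, or overwrites an existing column when the name on the left of the = sign already exists. It returns a new data frame, so the result must be assigned back to a variable to keep the change.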

1. Changing Values Based on a Condition

You can use mutate() along with ifelse() to change the values of a variable based on a condition.

# Change 'score' to 100 if 'age' is greater than 40
data <- data %>%
  mutate(score = ifelse(age > 40, 100, score))

# Display the updated data frame
print(data)

Output:

  id    name age score
1  1     Ali  25    88
2  2    Boby  30    92
3  3 Charles  35    85
4  4   David  40    87
5  5     Eva  45   100
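
As a hedged variation on the example above, dplyr also provides if_else(), a stricter counterpart to base R's ifelse() that requires both branches to have compatible types and accepts a separate value for rows where the condition is NA. The sketch below uses a separate, hypothetical data frame named people so that it does not alter data.

```r
library(dplyr)

# Hypothetical data frame for illustration (not the article's `data`)
people <- data.frame(
  name = c("Ali", "Eva", "Zed"),
  age = c(25, 45, NA),
  score = c(88, 90, 70)
)

# Set score to 100 when age > 40; keep the old score otherwise.
# `missing` supplies the value used when the condition is NA.
people_updated <- people %>%
  mutate(score = if_else(age > 40, 100, score, missing = score))

print(people_updated)
```

Here only the row with age 45 is changed; the row with a missing age keeps its original score instead of becoming NA.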

2. Using mutate() with Multiple Conditions

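You can use case_when() for more complex conditional logic. Each row gets the value from the first condition it matches. Note that data still carries the change from the previous example, so the score for age 45 starts at 100 here.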

# Change 'score' based on multiple conditions
data <- data %>%
  mutate(
    score = case_when(
      age <= 30 ~ score + 10,
      age > 30 & age <= 40 ~ score + 5,
      age > 40 ~ score + 15
    )
  )

# Display the updated data frame
print(data)

Output:

  id    name age score
1  1     Ali  25    98
2  2    Boby  30   102
3  3 Charles  35    90
4  4   David  40    92
5  5     Eva  45   115
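
One detail worth illustrating: a row that matches none of the conditions in case_when() receives NA. The sketch below, which uses a hypothetical data frame named students, shows the effect and one common way to guard against it with a final TRUE branch.

```r
library(dplyr)

# Hypothetical data frame for illustration (not the article's `data`)
students <- data.frame(
  age = c(25, 35, NA),
  score = c(88, 85, 70)
)

# Without a fallback, the row with a missing age matches no condition,
# so its score becomes NA
no_fallback <- students %>%
  mutate(score = case_when(
    age <= 30 ~ score + 10,
    age > 30 ~ score + 5
  ))

# A final `TRUE ~ score` branch keeps the original value for unmatched rows
with_fallback <- students %>%
  mutate(score = case_when(
    age <= 30 ~ score + 10,
    age > 30 ~ score + 5,
    TRUE ~ score
  ))

print(no_fallback)
print(with_fallback)
```

The conditions in the article's example cover every age in data, so no fallback is needed there; it matters mainly when the data may contain missing or unexpected values.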

3. Modifying Multiple Variables

You can modify multiple variables within a single mutate() call.

# Change 'age' and 'score' simultaneously
data <- data %>%
  mutate(
    age = age + 1,
    score = score * 1.1
  )

# Display the updated data frame
print(data)

Output:

  id    name age score
1  1     Ali  26 107.8
2  2    Boby  31 112.2
3  3 Charles  36  99.0
4  4   David  41 101.2
5  5     Eva  46 126.5
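
The expressions inside a single mutate() call are evaluated in order, so a later expression can use a column created or changed by an earlier one. The following is a minimal sketch of that behavior; the data frame scores and the column bonus are hypothetical names introduced only for illustration.

```r
library(dplyr)

# Hypothetical data frame for illustration
scores <- data.frame(id = 1:3, score = c(80, 90, 100))

# `bonus` is created first, then used on the next line of the same call
scores_updated <- scores %>%
  mutate(
    bonus = score * 0.1,
    score = score + bonus
  )

print(scores_updated)
```

Because of this ordering, swapping the two lines would fail, since bonus would not yet exist when score is recomputed.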

Using transmute() to Change and Drop Variables

If you want to change the values of some variables and drop all the others in the same step, you can use transmute(). It works like mutate() but keeps only the variables that are explicitly listed in the call.

# Change 'score' and keep only 'id' and 'score'
data <- data %>%
  transmute(
    id,
    score = score * 1.2
  )

# Display the updated data frame
print(data)

Output:

  id  score
1  1 129.36
2  2 134.64
3  3 118.80
4  4 121.44
5  5 151.80
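
In recent dplyr releases, transmute() is marked as superseded in favor of mutate() with the .keep = "none" argument. The sketch below compares the two forms, assuming a dplyr version that supports .keep (1.0.0 or later); the data frame results is a hypothetical name used only here.

```r
library(dplyr)

# Hypothetical data frame for illustration
results <- data.frame(
  id = 1:3,
  name = c("A", "B", "C"),
  score = c(50, 60, 70)
)

# transmute() keeps only the listed columns
via_transmute <- results %>%
  transmute(id, score = score * 2)

# mutate() with .keep = "none" drops every column not named in the call
via_mutate <- results %>%
  mutate(id, score = score * 2, .keep = "none")

print(via_transmute)
print(via_mutate)
```

Both calls drop the name column and return id together with the doubled score. transmute() still works, so existing code does not need to change.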

Using across() for Multiple Columns

The across() function allows you to apply the same transformation to multiple columns simultaneously.

# Create a new sample data frame
data <- data.frame(
  id = 1:5,
  math_score = c(88, 92, 85, 87, 90),
  science_score = c(80, 85, 82, 84, 88)
)

# Apply a transformation across multiple columns
data <- data %>%
  mutate(across(c(math_score, science_score), ~ . * 1.1))

# Display the updated data frame
print(data)

Output:

  id math_score science_score
1  1       96.8          88.0
2  2      101.2          93.5
3  3       93.5          90.2
4  4       95.7          92.4
5  5       99.0          96.8
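
across() can also pick columns with a selection helper and write the results to new columns rather than overwriting the originals. The following sketch assumes a hypothetical data frame named marks and a hypothetical "_pct" suffix; it uses the ends_with() helper and the .names argument of across().

```r
library(dplyr)

# Hypothetical data frame for illustration
marks <- data.frame(
  id = 1:3,
  math_score = c(80, 90, 100),
  science_score = c(70, 60, 50)
)

# Select every column ending in "_score", divide by 100, and store the
# result in a new column named "<original name>_pct"
marks_scaled <- marks %>%
  mutate(across(ends_with("_score"), ~ .x / 100, .names = "{.col}_pct"))

print(marks_scaled)
```

Leaving out .names would overwrite math_score and science_score in place, as in the example above.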

Conclusion

The dplyr package offers several ways to change the value of variables within a data frame. Use mutate() with ifelse() for a single condition, case_when() for several conditions, multiple expressions in one mutate() call to update several variables at once, transmute() to keep only the variables you name, and across() to apply the same transformation to many columns.
