Data Wrangling Basics with R

The document discusses basic data wrangling in R. It describes using the mutate function to add or change columns, the filter function to subset rows, and the select function to choose columns. Pipelines can be created using the pipe operator %>% to combine data transformations. Data frames can be created with the data.frame function, and specifying stringsAsFactors=FALSE avoids converting character columns to factors.

Uploaded by

rahulrsingh

Basic Data Wrangling

Key Points

• To change a data table by adding a new column, or changing an existing one, we use the mutate function.

• To filter the data by subsetting rows, we use the function filter.

• To subset the data by selecting specific columns, we use the select function.

• We can perform a series of operations by sending the results of one function to another function using what is called the pipe operator, %>%.
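The four points above can be sketched on a small made-up table. This is only an illustration: `toy` and its values are hypothetical and are not the data set used in these notes, and the sketch assumes the dplyr package is already installed.

```r
# Sketch only: `toy` is a made-up table for illustration.
library(dplyr)

toy <- data.frame(
  state      = c("A", "B", "C"),
  region     = c("North", "South", "North"),
  population = c(1000000, 2000000, 500000),
  total      = c(5, 40, 1),
  stringsAsFactors = FALSE
)

# mutate: add a rate column (events per 100,000 people)
toy <- mutate(toy, rate = total / population * 100000)

# filter: keep only the rows whose rate is at most 0.71
low <- filter(toy, rate <= 0.71)

# select: keep only the named columns
slim <- select(toy, state, region, rate)

# pipe: the same select-then-filter steps chained with %>%
piped <- toy %>% select(state, region, rate) %>% filter(rate <= 0.71)
print(piped)
```

With these made-up numbers, states A and C pass the filter and B does not. The piped version gives the same rows as filtering `slim` directly, without storing the intermediate table.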

Code

# installing and loading the dplyr package (installing is only needed once)
install.packages("dplyr")
library(dplyr)

# adding a column with mutate
# (the murders data set comes from the dslabs package; run install.packages("dslabs") first if it is missing)
library(dslabs)
data("murders")
murders <- mutate(murders, rate = total / population * 100000)

# subsetting with filter
filter(murders, rate <= 0.71)

# selecting columns with select
new_table <- select(murders, state, region, rate)

# using the pipe
murders %>% select(state, region, rate) %>% filter(rate <= 0.71)

Creating Data Frames

Key Points

• We can use the data.frame() function to create data frames.

• In R versions before 4.0.0, the data.frame() function turns character vectors into factors by default. To avoid this, we set the stringsAsFactors argument to FALSE. Since R 4.0.0, FALSE is already the default.
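A minimal sketch of what the argument changes, using a made-up two-row table. The names `df_chr` and `df_fct` are illustrative only. Setting the argument explicitly in both calls gives the same result on any R version.

```r
# Sketch only: `df_chr` and `df_fct` are illustrative names.
df_chr <- data.frame(names = c("John", "Juan"), stringsAsFactors = FALSE)
df_fct <- data.frame(names = c("John", "Juan"), stringsAsFactors = TRUE)

# The same input column is stored as two different types
print(class(df_chr$names))  # character
print(class(df_fct$names))  # factor
```

Checking class() on a column is a quick way to confirm whether a conversion to factor has happened.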

Code

# creating a data frame with stringsAsFactors = FALSE
grades <- data.frame(names = c("John", "Juan", "Jean", "Yao"),
                     exam_1 = c(95, 80, 90, 85),
                     exam_2 = c(90, 85, 85, 90),
                     stringsAsFactors = FALSE)
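As a sketch of how a data frame built this way can go through the same dplyr functions described earlier, the example below recreates the grades table and chains mutate, filter, and select. The `average` column and the cutoff of 85 are assumptions made for illustration and do not come from the original notes. The sketch assumes dplyr is installed.

```r
# Sketch only: the `average` column and the cutoff of 85 are illustrative assumptions.
library(dplyr)

grades <- data.frame(names = c("John", "Juan", "Jean", "Yao"),
                     exam_1 = c(95, 80, 90, 85),
                     exam_2 = c(90, 85, 85, 90),
                     stringsAsFactors = FALSE)

top <- grades %>%
  mutate(average = (exam_1 + exam_2) / 2) %>%  # add the mean of the two exams
  filter(average >= 85) %>%                    # keep rows at or above the cutoff
  select(names, average)                       # keep only the two columns of interest
print(top)
```

Because names is stored as a character column, it can be compared and printed as plain text without handling factor levels.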
