Mission 493: Dataframes in R Takeaways
Syntax
• Import a dataset:
library(readr)
> glimpse(recent_grads)
Observations: 173
Variables: 18
$ Rank 1, 2...
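The import step above can be sketched end to end as follows. The source does not preserve the file name or the `read_csv()` call, so this sketch writes a small made-up CSV to a temporary path instead. The `Major` column and all values are illustrative, and `glimpse()` is assumed to come from dplyr.

```r
library(readr)
library(dplyr)  # provides glimpse()

# Hypothetical stand-in for the course file: a tiny CSV with made-up rows.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Rank,Major,Unemployment_rate",
  "1,Major A,0.02",
  "2,Major B,0.05"
), csv_path)

# read_csv() returns a tibble; col_types = cols() lets readr guess the
# column types without printing the column specification.
recent_grads <- read_csv(csv_path, col_types = cols())

# glimpse() prints one line per column with its type and first values.
glimpse(recent_grads)
```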
# Keeping data
# Removing data
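The code under these two comments is not preserved in the source. One common way to keep or remove columns is dplyr's `select()`, sketched here on the assumption that this is the step the comments described. The tibble, its values, and the `Major` and `Total` columns are made up for illustration.

```r
library(dplyr)

# Made-up rows; only Rank appears in the original output.
recent_grads <- tibble(
  Rank = c(1, 2),
  Major = c("Major A", "Major B"),
  Total = c(100, 200)
)

# Keeping data: name the columns to keep.
kept <- recent_grads %>% select(Rank, Major)

# Removing data: prefix a column name with - to drop it.
removed <- recent_grads %>% select(-Total)
```

Both calls return a new tibble and leave `recent_grads` unchanged.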
library(dplyr)
mutate(
mutate(
) %>%
arrange(-prop_male)
> head(new_recent_grads)
# A tibble: 6 x 3
1 124 124 1
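The `mutate()` and `arrange()` fragments above are incomplete, so here is a minimal sketch of how such a pipeline might look in full. The definition `prop_male = Men / Total` is an assumption, as are the `Men` and `Total` columns and every value in the tibble. Only `prop_male`, `new_recent_grads`, and `arrange(-prop_male)` come from the source.

```r
library(dplyr)

# Made-up data for illustration.
recent_grads <- tibble(
  Major = c("Major A", "Major B", "Major C"),
  Men = c(30, 80, 50),
  Total = c(100, 100, 100)
)

new_recent_grads <- recent_grads %>%
  # Add a new column computed from existing ones (assumed definition).
  mutate(prop_male = Men / Total) %>%
  # A minus sign on a numeric column sorts it in descending order.
  arrange(-prop_male)

head(new_recent_grads)
```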
summarize(
avg_unemp = mean(Unemployment_rate),
min_unemp = min(Unemployment_rate),
max_unemp = max(Unemployment_rate)
)
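The `summarize()` call above needs a tibble piped into it. A minimal runnable sketch, using made-up unemployment rates and a hypothetical result name `unemp_summary`:

```r
library(dplyr)

# Made-up values for illustration.
recent_grads <- tibble(
  Unemployment_rate = c(0.02, 0.05, 0.11)
)

# summarize() collapses the whole tibble into a single row,
# with one column per named summary.
unemp_summary <- recent_grads %>%
  summarize(
    avg_unemp = mean(Unemployment_rate),
    min_unemp = min(Unemployment_rate),
    max_unemp = max(Unemployment_rate)
  )

unemp_summary
```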
Concepts
• The four data structures covered in this course are:
• Tabular data is organized into rows and columns: each row represents a single entity, and
each column represents one characteristic of that entity.
• Microsoft Excel workbooks, Google Sheets, and CSV files are common formats for tabular data.
• Tibbles are a data structure that implements tabular data in R and the tidyverse.
• Piping with %>% lets us chain the functions we learned into a pipeline that converts
raw data in a tibble into a more refined dataset.
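To make the tibble and piping concepts concrete, the sketch below builds a small tibble by hand and runs the same two steps twice, once as nested function calls and once as a pipeline. The tibble `majors`, its columns, and its values are made up for illustration.

```r
library(dplyr)

# Each row is one entity (a major); each column is one characteristic of it.
majors <- tibble(
  Major = c("Major A", "Major B", "Major C"),
  Men = c(30, 80, 50),
  Women = c(70, 20, 50)
)

# Nested calls read inside-out: mutate() runs first, then arrange().
nested <- arrange(mutate(majors, Total = Men + Women), -Men)

# The same steps as a pipeline read top to bottom.
piped <- majors %>%
  mutate(Total = Men + Women) %>%
  arrange(-Men)

# Compare the two results.
identical(nested, piped)
```

The pipe passes the value on its left as the first argument of the function on its right, so each step stays on its own line as the pipeline grows.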