Dataframes in R: Takeaways

by Dataquest Labs, Inc. - All rights reserved © 2022

Syntax
• Import a dataset:

library(readr)

data <- read_csv("name_of_file_with_data.csv")

• Learn about a tibble's columns, types, and dimensions:

> glimpse(recent_grads)
Observations: 173
Variables: 18
$ Rank       1, 2...
$ Major_code 2419, 2416...
$ Major      "PETROLEUM ENGINEERING", "MINING AND MINERAL ENGINEERING"...

• Return the number of rows or columns in a tibble:

nrow(data) # returns the number of rows in `data`
ncol(data) # returns the number of columns in `data`

• Pick columns to keep or remove from your data:

# Keeping data: only the listed columns remain
filtered_data <- select(recent_grads, Rank, Major)

# Removing data: a leading minus drops the column
filtered_data <- select(recent_grads, -College_jobs)

• Filter rows based on conditions:

top_100_majors <- filter(recent_grads, Rank < 100)

• Chain together tidyverse functions into a pipeline:

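library(dplyr)

# Each step receives the result of the previous one as its first argument
low_total_ranked_majors <- recent_grads %>%
  select(Rank, Major, Total) %>% # keep only these three columns
  filter(Total < 2000)           # keep rows where Total is below 2000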

• Create new columns:

new_recent_grads <- recent_grads %>%
  mutate(
    prop_male = Men / Total # new column: Men as a share of Total
  )

• Sort data by one or more columns:

new_recent_grads <- recent_grads %>%
  mutate(
    prop_male = Men / Total
  ) %>%
  arrange(desc(prop_male)) # descending order; equivalent to arrange(-prop_male) for numeric columns

• Use head() to return just the first few rows of a tibble:

> head(new_recent_grads)
# A tibble: 6 x 3
  Total   Men prop_male
1   124   124     1
2  4790  4419     0.923
3 18498 16820     0.909
4   756   679     0.898
5  1258  1123     0.893
6 91227 80320     0.880

• Use summarize() to calculate some summary values based on entire columns:

summary_table <- recent_grads %>%
  summarize(
    avg_unemp = mean(Unemployment_rate),
    min_unemp = min(Unemployment_rate),
    max_unemp = max(Unemployment_rate)
  )
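
The snippets above assume that the recent_grads dataset is already loaded. As a minimal sketch that runs on its own, the same verbs can be tried on a small made-up tibble. The tibble `toy_grads`, its values, and the names `toy_result` and `toy_summary` are hypothetical and exist only for illustration; they are not taken from the real dataset.

```r
library(tibble) # tibble()
library(dplyr)  # select(), filter(), mutate(), arrange(), summarize(), %>%

# Hypothetical data: invented values, not rows from recent_grads
toy_grads <- tibble(
  Major = c("A", "B", "C", "D"),
  Total = c(100, 5000, 400, 1500),
  Men   = c(80, 1000, 100, 900)
)

# Keep three columns, drop large majors, add a proportion, sort descending
toy_result <- toy_grads %>%
  select(Major, Total, Men) %>%
  filter(Total < 2000) %>%
  mutate(prop_male = Men / Total) %>%
  arrange(desc(prop_male))

# Collapse the remaining rows into a one-row summary
toy_summary <- toy_result %>%
  summarize(
    avg_prop_male = mean(prop_male),
    n_majors = n()
  )

print(toy_result)
print(toy_summary)
```

Major "B" is dropped by filter() because its Total is not below 2000, so the summary is computed over the three remaining rows.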

Concepts
• The four data structures covered in this course are:

• Vector: one-dimensional structure for storing values of the SAME TYPE.

• Matrix: two-dimensional structure for storing values of the SAME TYPE.
• List: one-dimensional structure, nestable to hold other structures, for storing values of ANY DATA TYPE/OBJECT.
• Dataframe: two-dimensional structure for storing values of ANY DATA TYPE/OBJECT; different columns can hold different types.

• Tabular data is organized into rows and columns: each row represents a single entity, and each column
represents a characteristic of that entity.

• Microsoft Excel, Google Sheets, and CSV files are common sources of tabular data.
• Tibbles are a data structure that implements tabular data in R and the tidyverse.

• Piping enables us to create pipelines with all of the functions we learned, allowing us to convert
raw data in tibbles to more refined datasets.
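
To make the four data structures listed above concrete, here is a minimal base R sketch. The object names (`v`, `m`, `l`, `df`) and all values are hypothetical, chosen only to show how each structure is created and what distinguishes it from the others.

```r
# Vector: one dimension, every element has the same type
v <- c(1.5, 2, 3)

# Matrix: two dimensions, every element has the same type
m <- matrix(1:6, nrow = 2, ncol = 3)

# List: elements may differ in type and can themselves be other structures
l <- list(name = "example", values = v, grid = m)

# Dataframe: two dimensions; each column can have its own type
df <- data.frame(
  id = 1:3,
  label = c("a", "b", "c"),
  flag = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE # keep `label` as character in every R version
)

print(class(v))          # type of the vector's elements
print(dim(m))            # rows and columns of the matrix
print(length(l))         # number of top-level elements in the list
print(sapply(df, class)) # one type per dataframe column
```

A tibble behaves like the dataframe here in the respects shown: it is two-dimensional, and each column keeps its own type.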
