Week 11 Tasks and Solutions

The document provides instructions for a series of tasks to practice working with data frames and visualization in R. Task 1 involves loading and manipulating the iris dataset, including selecting columns, creating a new column, and filtering rows. Task 2 focuses on data visualization using ggplot2: loading demographic data, exploring plot types and shapes, creating dot plots of the USArrests dataset by assault and rape counts, and calculating summary statistics such as sums, means, and minimums, which are displayed in bar plots and other graphs. Task 3 sets aside any remaining time for assignment work.

Uploaded by

misxbeepics

R Practice Tasks

Task 1 Data Frame Practice

a) Save both the iris and demographic data to a folder in your M drive

b) Set your working directory to the same folder

c) Create a variable named “dataset” to read the iris dataset

dataset <- read.csv("iris.csv")

d) Display the first 15 rows of the dataset

head(dataset, n=15)

e) Using the factor command, investigate how many different species of iris flowers there are
levels(factor(dataset$species))
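
As a quick check of what part e) does, the sketch below uses R's built-in iris data frame instead of the iris.csv file, so the column is called Species (an assumption; the CSV in this task uses species). It lists the level names, counts them with nlevels, and tallies the rows per species with table.

```r
# Built-in iris data (column names differ from the iris.csv used in the task)
species_factor <- factor(iris$Species)

# The distinct categories stored in the factor
species_levels <- levels(species_factor)
print(species_levels)

# How many distinct species there are
species_count <- nlevels(species_factor)
print(species_count)

# Number of rows recorded for each species
species_table <- table(species_factor)
print(species_table)
```

levels gives the names, while nlevels answers the "how many" part of the question directly.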

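f) Create a dataframe “newdataset” to show all data for columns petal length and sepal length only
#name the columns explicitly, otherwise they are called dataset.sepal_length etc.
newdataset <- data.frame(sepal_length = dataset$sepal_length,
                         petal_length = dataset$petal_length)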
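
An alternative way to build the same kind of two-column data frame is to index by column name. The sketch below is only an illustration: iris_demo is a hypothetical stand-in for the CSV, made from the built-in iris data with the column names assumed to match iris.csv.

```r
# Stand-in for read.csv("iris.csv"): built-in iris with assumed snake_case names
iris_demo <- iris
names(iris_demo) <- c("sepal_length", "sepal_width", "petal_length",
                      "petal_width", "species")

# Select two columns by name; the result is still a data frame
length_only <- iris_demo[, c("sepal_length", "petal_length")]

print(names(length_only))
print(head(length_only, n = 3))
```

Indexing by name keeps the original column names and all rows, which makes the result easier to reuse later.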

g) Install and load the dplyr package and run the following statement: what does this show?
install.packages("dplyr")
library(dplyr)
select(dataset, sepal_width, petal_width)

h) Assign a variable name “widthonly” to the statement in question g, run the variable to
display the results
widthonly <- select(dataset, sepal_width, petal_width)

i) Create a new column in the iris dataset to show the area of the petal of each flower (petal
length * petal width) and ignore any error messages. Then display the dataset and check that
the new column is shown.

dataset$petal_area <- dataset$petal_width * dataset$petal_length

j) Show all petal areas which are greater than or equal to 8.64 – assign a variable and run it
petalfilter <- dataset$petal_area >= 8.64
dataset[petalfilter,]

k) Create a new variable to extend question j to display petal areas which are greater than or
equal to 8.64 and less than 12, run your new variable
petalfilter2 <- dataset$petal_area >= 8.64 & dataset$petal_area < 12
dataset[petalfilter2,]
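
The two-step filter in questions j and k can also be written in one step with base R's subset function. This is a sketch under assumptions: iris_demo is a hypothetical stand-in built from the built-in iris data, with column names assumed to match iris.csv.

```r
# Stand-in for the CSV data, with the petal_area column from question i
iris_demo <- iris
names(iris_demo) <- c("sepal_length", "sepal_width", "petal_length",
                      "petal_width", "species")
iris_demo$petal_area <- iris_demo$petal_length * iris_demo$petal_width

# Same condition as question k, written with subset()
petal_mid <- subset(iris_demo, petal_area >= 8.64 & petal_area < 12)

# The logical-vector filter from the worksheet, for comparison
petalfilter2 <- iris_demo$petal_area >= 8.64 & iris_demo$petal_area < 12
print(nrow(petal_mid) == sum(petalfilter2))
```

Both approaches keep the same rows; subset just avoids repeating the data frame name inside the condition.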

Task 2 – Visualisation

a) Install and load the ggplot2 package

install.packages("ggplot2")
library(ggplot2)

b) create a variable “mydata” to read the demographic dataset and display all data
mydata <- read.csv("Demographic-Data.csv")
mydata

c) Investigate the demographic data set using functions such as head, str, summary
head(mydata)
str(mydata)
summary(mydata)

d) Run each of the statements on slides 16 and 17 and observe the results

e) Follow and carry out the instructions from slides 18 – 23 – explore the different shapes
available.

f) You are now going to work with the USArrests built in dataset, display the results from this
dataset
USArrests

g) You will notice that the first column does not have a column name. We can use tibble to
assign a column name to the index column (1st column), which we can then use to visualise
the data. Install the tidytext, tibble and rlang packages and load all of them.

#install the packages (the as_tibble function comes from the tibble package)

install.packages("tidytext")
install.packages("tibble")
install.packages("rlang")

#load the libraries
library(tidytext)
library(tibble)
library(rlang)

h) Run the following statement which will now apply the name State to the first column
USArrests <- tibble::as_tibble(USArrests, rownames = "State")

i) Check that the first column now shows State as its column name. To remove this index, use
remove(USArrests). This deletes your modified copy, so USArrests refers to the built-in
dataset in its original format again. To show all rows you can use the command
print(USArrests, n = 50).
USArrests
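
To see what moving the row names into a State column involves, here is a minimal base R sketch that does not need the tibble package. The name arrests is hypothetical; storing the result under a new name is a design choice that leaves the built-in USArrests untouched, so nothing has to be removed afterwards.

```r
# Work on a copy so the built-in dataset is never masked
arrests <- data.frame(State = rownames(datasets::USArrests),
                      datasets::USArrests,
                      row.names = NULL)

# State is now an ordinary first column and the row names are plain numbers
print(names(arrests))
print(head(arrests, n = 3))
```

The result is a regular data frame rather than a tibble, so printing it shows all rows unless head is used.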

j) You are now going to use the following code to show a dot plot of the number of
assaults per state. We will be doing more with ggplot next week.
ggplot(USArrests, aes(x = Assault, y = reorder(State, Assault))) +
  geom_point(color = "red") +
  labs(title = "Assaults by State") +
  theme(plot.title = element_text(hjust = 0.5, face = "bold")) +
  theme(plot.subtitle = element_text(hjust = 0.5))

k) Change the code above to show a dot plot of the number of rapes per state
ggplot(USArrests, aes(x = Rape, y = reorder(State, Rape))) +
  geom_point(color = "red") +
  labs(title = "Rape by State") +
  theme(plot.title = element_text(hjust = 0.5, face = "bold")) +
  theme(plot.subtitle = element_text(hjust = 0.5))
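
The reorder call in these plots is what sorts the states along the y axis. The sketch below shows its effect on its own, without ggplot2, using the built-in dataset and its row names (the variable names are hypothetical).

```r
arrests <- datasets::USArrests

# Turn the state names into a factor whose levels are sorted by Assault
state_by_assault <- reorder(rownames(arrests), arrests$Assault)
ordered_levels <- levels(state_by_assault)

# First levels have the fewest assault arrests, last levels the most
print(head(ordered_levels, n = 3))
print(tail(ordered_levels, n = 3))
```

Without reorder the states would be plotted in alphabetical order, which makes the pattern much harder to read.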

l) We can use built-in functions such as sum, min, mean etc. to perform further calculations and
then visualise patterns. There are various ways in which these built-in functions can be used
within R code. We are going to do it in stages to help you understand.

1) We are going to create 4 variables to calculate the total number of arrest types using the
USArrests dataset. Execute the following code

murder <- sum(USArrests$Murder)
rape <- sum(USArrests$Rape)
assault <- sum(USArrests$Assault)
urbanpop <- sum(USArrests$UrbanPop)

2) Run each variable to see the value

3) We are now going to create 2 vectors, one for the headings and one for the totals

arrest_type <- c("Murder", "Rape", "Assault", "Urban Pop")

arrest_total <- c(murder, rape, assault, urbanpop)

Display your vectors to check they have been created

4) Use the code below to create a simple bar chart to display the list values

barplot(arrest_total, names.arg = arrest_type, main = "Arrest Types",
        xlab = "Arrest Type", ylab = "Arrest Total")

What conclusions can we draw from the graph?

5) We could quicken the process above by creating a data frame which includes the
headings and summed values. Write the code to create the data frame, looking back over
the BMI example on slide 23 from last week's presentation (“introduction to R”)
arrest_types <- data.frame(
  types = c("Murder", "Rape", "Assault", "Urban Pop"),
  arrest_totals = c(sum(USArrests$Murder), sum(USArrests$Rape),
                    sum(USArrests$Assault), sum(USArrests$UrbanPop)))

6) Display the dataframe, it should look like the one below:

arrest_types

#alternative: round each total to a whole number
arrest_types <- data.frame(
  types = c("Murder", "Rape", "Assault", "Urban Pop"),
  arrest_totals = c(round(sum(USArrests$Murder)), round(sum(USArrests$Rape)),
                    round(sum(USArrests$Assault)), round(sum(USArrests$UrbanPop))))
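
Because every column of the original USArrests dataset is numeric, the totals can also be computed in one call with colSums. This is a sketch, not the worksheet's intended answer; arrests and arrest_types_quick are hypothetical names, and the labels come from the column names (so "UrbanPop" rather than "Urban Pop", in the dataset's own column order).

```r
arrests <- datasets::USArrests

# One total per column, returned as a named numeric vector
column_totals <- colSums(arrests)

# Build the summary data frame from the names and the rounded totals
arrest_types_quick <- data.frame(types = names(column_totals),
                                 arrest_totals = round(unname(column_totals)))
print(arrest_types_quick)
```

This only works on the original data frame; once a State text column has been added, that column would need to be dropped before calling colSums.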

7) Use qplot, display the following graph using your data frame from question l5)

qplot(data = arrest_types, x = types, y = arrest_totals, size = I(3),
      colour = I("Red"))

m) Find the lowest value for murder

min(USArrests$Murder)

n) Can you show the name of the State with the lowest murder rate? Research the function
‘which.min’

USArrests[which.min(USArrests$Murder),"State"]
#returns
# A tibble: 1 × 1
#   State
#   <chr>
# 1 North Dakota

USArrests[which.min(USArrests$Murder),]
#returns
# A tibble: 1 × 5
#   State        Murder Assault UrbanPop  Rape
#   <chr>         <dbl>   <int>    <int> <dbl>
# 1 North Dakota    0.8      45       44   7.3

#with the tibble version this returns "34", because State is now an ordinary column
#and North Dakota is on row 34
rownames(USArrests)[which.min(USArrests$Murder)]
#[1] "34"

#after remove(USArrests) the original dataset is used - returns "North Dakota"
rownames(USArrests)[which.min(USArrests$Murder)]
#[1] "North Dakota"
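
The key point about which.min is that it returns a position, not a value. A small sketch with made-up numbers (murder_rates is hypothetical data), followed by the same idea on the built-in dataset:

```r
# Made-up values, only to show what which.min returns
murder_rates <- c(a = 3.2, b = 0.8, c = 5.1)
lowest_position <- which.min(murder_rates)
print(lowest_position)                       # position of the smallest value
print(names(murder_rates)[lowest_position])  # the name stored at that position

# Same idea on the original built-in dataset, where states are row names
arrests <- datasets::USArrests
lowest_state <- rownames(arrests)[which.min(arrests$Murder)]
print(lowest_state)
```

The position is then used to look up whatever holds the state name: the row names in the original dataset, or the State column in the tibble version.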

o) Write the code to find the average murder, assault, rape and urbanpop from the dataset and
store the results in a data frame:

arrest_types1 <- data.frame(
  types = c("Murder", "Rape", "Assault", "Urban Pop"),
  arrest_average = c(mean(USArrests$Murder), mean(USArrests$Rape),
                     mean(USArrests$Assault), mean(USArrests$UrbanPop)))
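
A more general pattern is to apply one summary function to every column with sapply. The sketch below assumes the original USArrests dataset (all columns numeric); arrests and arrest_means_quick are hypothetical names, and the labels are the dataset's own column names.

```r
arrests <- datasets::USArrests

# Apply mean to each column; the result is a named numeric vector
column_means <- sapply(arrests, mean)

arrest_means_quick <- data.frame(types = names(column_means),
                                 arrest_average = unname(column_means))
print(arrest_means_quick)
```

Swapping mean for min or max gives the other summaries from this task without rewriting the rest of the code.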

p) Display the results from question 2o in a graph:
qplot(data = arrest_types1, x = types, y = arrest_average, size = I(5),
      colour = I("blue"))

Task 3

Any remaining time, you can work on your assignment
