WIC 5: MLR & ANOVA

Sean Hekmat

2023-11-28

D1 <- read.csv("~/Desktop/ANOVA Golf Example 1.csv")

D2 <- read.csv("~/Desktop/ANOVA Golf Example 2.csv")
D3 <- read.csv("~/Desktop/Fresh-Demand.csv")

head(D1)

## Brand Distance
## 1 A 250.8
## 2 A 250.0
## 3 A 235.5
## 4 A 255.4
## 5 A 248.7
## 6 A 241.8

head(D2)

## Golfer Brand Dist
## 1 G1 A 248.73
## 2 G1 A 258.74
## 3 G1 A 223.65
## 4 G2 A 240.30
## 5 G2 A 258.00
## 6 G2 A 272.71

head(D3)

## x1 x2 y
## 1 3.85 3.80 7.38
## 2 3.75 4.00 7.51
## 3 3.70 4.30 9.52
## 4 3.70 3.70 7.50
## 5 3.60 3.85 9.33
## 6 3.60 3.80 8.28

lm1 <- lm(Distance ~ Brand, data = D1)

summary(lm1)

##
## Call:
## lm(formula = Distance ~ Brand, data = D1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -32.475 -11.896 -2.518 8.708 36.782
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 252.817 5.160 48.997 <2e-16 ***
## BrandB -2.608 7.297 -0.357 0.723
## BrandC -7.742 7.297 -1.061 0.295
## BrandD -3.598 7.461 -0.482 0.632
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 17.87 on 43 degrees of freedom
## Multiple R-squared: 0.0264, Adjusted R-squared: -0.04152
## F-statistic: 0.3887 on 3 and 43 DF, p-value: 0.7617

a1 <- aov(Distance ~ Brand, data = D1)

summary(a1)

## Df Sum Sq Mean Sq F value Pr(>F)
## Brand 3 373 124.2 0.389 0.762
## Residuals 43 13738 319.5

The ANOVA table above shows no statistically significant Brand effect on Distance: the F-test gives p = 0.762 > 0.05,
so we fail to reject the null hypothesis that the mean distances obtained from the 4 brands are all equal.
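The one-way ANOVA F-test is the same test as the overall F-test at the bottom of summary(lm1): both report F = 0.389 on 3 and 43 DF. A minimal sketch of that equivalence follows. It uses simulated data because the CSV file is not reproduced here; the data frame `sim`, its values, and the seed are assumptions for illustration only.

```r
# Simulated stand-in for D1: 4 brands, 12 shots each (hypothetical values)
set.seed(1)
sim <- data.frame(
  Brand = factor(rep(c("A", "B", "C", "D"), each = 12)),
  Distance = rnorm(48, mean = 250, sd = 18)
)

fit_lm  <- lm(Distance ~ Brand, data = sim)
fit_aov <- aov(Distance ~ Brand, data = sim)

# Overall F statistic from the regression summary
f_lm <- unname(summary(fit_lm)$fstatistic["value"])

# F statistic for Brand from the ANOVA table (first row)
f_aov <- summary(fit_aov)[[1]][["F value"]][1]

# The two agree up to floating-point error
all.equal(f_lm, f_aov)
```

The t-tests in the lm summary answer a different question: each compares one brand with the baseline brand A, whereas the F-test compares all four means at once.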

library(ggfortify)

## Loading required package: ggplot2

autoplot(lm1, label.size = 3)

[Figure: autoplot(lm1) diagnostic plots. Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Residuals vs Leverage. Labeled observations: 15, 26, 42.]
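The same four panels can be drawn without ggfortify, and the Normal Q-Q panel can be supplemented with a formal test. A hedged sketch using base R's plot method for lm objects and the Shapiro-Wilk test follows. The simulated data frame `sim` is an assumption standing in for D1, and the Shapiro-Wilk test is one common option, not necessarily the method used in class.

```r
# Simulated stand-in for D1 (hypothetical values)
set.seed(1)
sim <- data.frame(
  Brand = factor(rep(c("A", "B", "C", "D"), each = 12)),
  Distance = rnorm(48, mean = 250, sd = 18)
)
fit <- lm(Distance ~ Brand, data = sim)

# Base-R version of the four diagnostic panels, arranged in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(fit, which = c(1, 2, 3, 5))
par(op)

# Shapiro-Wilk test on the residuals; a small p-value suggests non-normality
sw <- shapiro.test(residuals(fit))
sw$p.value
```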

autoplot(a1, label.size = 3)

[Figure: autoplot(a1) diagnostic plots. Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Residuals vs Leverage. Labeled observations: 15, 26, 42.]

lm2 <- lm(Dist ~ Brand + Golfer, data = D2)

summary(lm2)

##
## Call:
## lm(formula = Dist ~ Brand + Golfer, data = D2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -33.153 -11.194 -0.059 11.022 40.171
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 241.5561 7.4523 32.414 <2e-16 ***
## BrandB -11.6933 8.6052 -1.359 0.1843
## BrandC -0.2344 8.6052 -0.027 0.9784
## BrandD -2.0833 8.6052 -0.242 0.8103
## GolferG2 13.7875 7.4523 1.850 0.0742 .
## GolferG3 5.6542 7.4523 0.759 0.4539
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.25 on 30 degrees of freedom
## Multiple R-squared: 0.1654, Adjusted R-squared: 0.02632
## F-statistic: 1.189 on 5 and 30 DF, p-value: 0.3379

a2 <- aov(Dist ~ Golfer + Brand, data=D2)
summary(a2)

## Df Sum Sq Mean Sq F value Pr(>F)
## Golfer 2 1153 576.4 1.730 0.195
## Brand 3 828 276.2 0.829 0.489
## Residuals 30 9997 333.2

The ANOVA table above shows that (i) there is no statistically significant Brand effect on Distance (p = 0.489 > 0.05),
so we fail to reject the hypothesis that the mean distances obtained from the 4 brands are all equal, and (ii) there is
no statistically significant Golfer effect (p = 0.195 > 0.05).
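Note that lm2 lists Brand first while a2 lists Golfer first. aov() reports sequential sums of squares, so term order can matter in general. With equal replication in every Golfer-by-Brand cell, the order makes no difference. D2 appears to have equal replication (36 rows for 3 golfers and 4 brands), but that is an inference from the output, not something the source states. A minimal sketch on simulated data follows; the data frame `sim2`, its values, and the seed are assumptions for illustration.

```r
# Simulated stand-in for D2: 3 golfers x 4 brands x 3 replicates (hypothetical values)
set.seed(2)
sim2 <- expand.grid(
  Rep = 1:3,
  Golfer = c("G1", "G2", "G3"),
  Brand = c("A", "B", "C", "D")
)
sim2$Dist <- rnorm(nrow(sim2), mean = 245, sd = 18)

# Fit the same additive model with the two term orders
tab_gb <- summary(aov(Dist ~ Golfer + Brand, data = sim2))[[1]]
tab_bg <- summary(aov(Dist ~ Brand + Golfer, data = sim2))[[1]]

# Sum of squares for Brand: row 2 in the first table, row 1 in the second
ss_brand_gb <- tab_gb[["Sum Sq"]][2]
ss_brand_bg <- tab_bg[["Sum Sq"]][1]

# Equal up to floating-point error because the design is balanced
all.equal(ss_brand_gb, ss_brand_bg)
```

If the design were unbalanced, the two tables could differ, and the order of terms would need to be chosen deliberately.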

autoplot(lm2)

[Figure: autoplot(lm2) diagnostic plots. Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Constant Leverage: Residuals vs Factor Levels (Brand:Golfer combinations A:G1 to D:G3). Labeled observations: 11, 22, 29.]

autoplot(a2)

[Figure: autoplot(a2) diagnostic plots. Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Constant Leverage: Residuals vs Factor Levels (Golfer:Brand combinations G1:A to G3:D). Labeled observations: 11, 22, 29.]

lm3 <- lm(y~x1+x2, data = D3)

summary(lm3)

##
## Call:
## lm(formula = y ~ x1 + x2, data = D3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.91839 -0.10915 0.02283 0.16094 0.58057
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.7402 4.0586 3.878 0.002195 **
## x1 -4.9976 1.0379 -4.815 0.000423 ***
## x2 2.8573 0.5495 5.200 0.000222 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3872 on 12 degrees of freedom
## Multiple R-squared: 0.7772, Adjusted R-squared: 0.74
## F-statistic: 20.93 on 2 and 12 DF, p-value: 0.0001224

a3 <- aov(y~x1+x2, data = D3)

summary(a3)

## Df Sum Sq Mean Sq F value Pr(>F)
## x1 1 2.221 2.221 14.82 0.002313 **
## x2 1 4.053 4.053 27.04 0.000222 ***
## Residuals 12 1.799 0.150
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

autoplot(lm3)

[Figure: autoplot(lm3) diagnostic plots. Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Residuals vs Leverage. Labeled observations: 2, 5, 12.]

autoplot(a3)

[Figure: autoplot(a3) diagnostic plots. Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Residuals vs Leverage. Labeled observations: 2, 5, 12.]

# Note: in an R formula, "- x2" removes the term x2, so this model is y ~ x1
lm4 <- lm(y~x1-x2, data = D3)

summary(lm4)

##
## Call:
## lm(formula = y ~ x1 - x2, data = D3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.9803 -0.5438 0.1154 0.5255 1.0483
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 22.953 6.610 3.473 0.00413 **
## x1 -3.914 1.762 -2.221 0.04472 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6709 on 13 degrees of freedom
## Multiple R-squared: 0.2751, Adjusted R-squared: 0.2194
## F-statistic: 4.934 on 1 and 13 DF, p-value: 0.04472

a4 <- aov(y~x1-x2, data = D3)

summary(a4)

## Df Sum Sq Mean Sq F value Pr(>F)
## x1 1 2.221 2.2211 4.934 0.0447 *
## Residuals 13 5.852 0.4502
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
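The output for lm4 and a4 contains only an x1 row because, in R's formula syntax, the minus sign drops a term instead of subtracting one variable from another. A short sketch that inspects the formulas directly, with no data needed, follows. The object names `f_minus`, `f_plain` and `f_diff` are hypothetical.

```r
# In R's formula syntax, "-" removes a term rather than subtracting a variable
f_minus <- y ~ x1 - x2
f_plain <- y ~ x1

# Both formulas expand to the same single model term, "x1"
labels_minus <- attr(terms(f_minus), "term.labels")
labels_plain <- attr(terms(f_plain), "term.labels")
identical(labels_minus, labels_plain)

# To regress on the arithmetic difference instead, wrap it in I()
f_diff <- y ~ I(x1 - x2)
attr(terms(f_diff), "term.labels")
```

So lm4 is the simple regression of y on x1, which makes it a reduced version of lm3.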

autoplot(lm4)

[Figure: autoplot(lm4) diagnostic plots. Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Residuals vs Leverage. Labeled observations: 3, 4, 9.]

autoplot(a4)

[Figure: autoplot(a4) diagnostic plots. Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Residuals vs Leverage. Labeled observations: 3, 4, 9.]

Since R-squared for model lm3 (0.78) is much higher than that for lm4 (0.28), and adjusted R-squared shows the same gap
(0.74 vs. 0.22), we recommend model lm3. The normality of residuals for all the linear models can be assessed from the
Normal Q-Q panels, as shown in class.
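Because lm4 is nested inside lm3, the two can also be compared with a partial F-test instead of by R-squared alone. With a single added term, that F equals the squared t statistic of the added predictor. This matches the source output, where x2 has t = 5.200 in summary(lm3) and F = 27.04 in summary(a3). A minimal sketch on simulated data follows; the data frame `sim3`, its coefficients, and the seed are assumptions for illustration.

```r
# Simulated stand-in for D3: 15 rows, two predictors (hypothetical values)
set.seed(3)
sim3 <- data.frame(x1 = runif(15, 3.5, 3.9), x2 = runif(15, 3.7, 4.3))
sim3$y <- 15 - 5 * sim3$x1 + 3 * sim3$x2 + rnorm(15, sd = 0.4)

full    <- lm(y ~ x1 + x2, data = sim3)
reduced <- lm(y ~ x1, data = sim3)

# Partial F-test: does adding x2 significantly reduce the residual sum of squares?
cmp <- anova(reduced, full)
print(cmp)

# With one added term, the partial F equals the squared t statistic for x2
t_x2 <- coef(summary(full))["x2", "t value"]
all.equal(cmp$F[2], t_x2^2)

# Adjusted R-squared penalizes extra predictors, unlike plain R-squared
c(reduced = summary(reduced)$adj.r.squared, full = summary(full)$adj.r.squared)
```

Plain R-squared never decreases when a predictor is added, so the partial F-test or adjusted R-squared is the safer basis for choosing between nested models.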
