Regression ANOVA

The document discusses testing for linear relationships between variables using regression analysis and ANOVA. It covers testing the slope of the regression line for significance, calculating R-squared as a measure of fit, and using the coefficient of correlation. It provides examples of manually conducting hypothesis tests of the slope and correlation coefficient using t-statistics. Finally, it outlines the steps for regression diagnostics and the basic concepts and terminology involved in one-way ANOVA such as factors, levels, and estimating between- and within-sample variances.

Uploaded by Jamal Abdullah

JKE 316E
QUANTITATIVE ECONOMICS

VC4
REGRESSION (Part 2)

Testing the Slope
If no linear relationship exists between the two variables, we would expect the regression line to be horizontal, that is, to have a slope of zero.
We want to see if there is a linear relationship, i.e. if the slope (β1) is something other than zero.
Research hypothesis: H1: β1 ≠ 0
Null hypothesis: H0: β1 = 0

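Testing the Slope
• Test statistic: t = (b1 – β1) / s_b1, where b1 is the sample slope and s_b1 is its standard error.
• Under H0: β1 = 0 this reduces to t = b1 / s_b1, which is Student t-distributed with n–2 degrees of freedom (assuming a normally distributed error variable).
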
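The slope test can be sketched in a few lines of Python. This is a minimal illustration on made-up data (not the odometer data set). The function name `slope_t_statistic` is hypothetical, and the standard error s_b1 = s_ε / √Σ(x – x̄)² is the usual textbook form, assumed here rather than taken from the slides.

```python
import math

def slope_t_statistic(x, y):
    """Return (b1, s_b1, t) for testing H0: beta1 = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least squares slope
    b0 = y_bar - b1 * x_bar             # least squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_eps = math.sqrt(sse / (n - 2))    # standard error of estimate
    s_b1 = s_eps / math.sqrt(sxx)       # standard error of the slope
    return b1, s_b1, b1 / s_b1

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, s_b1, t = slope_t_statistic(x, y)
print(f"b1 = {b1:.3f}, s_b1 = {s_b1:.3f}, t = {t:.3f}")
```

With n = 5 observations there are n–2 = 3 degrees of freedom, so the computed t would be compared with the t-table value for 3 degrees of freedom.
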
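Ex. 16.4
Test to determine if there is a linear relationship between the price and odometer readings (at the 5% significance level).
H0: β1 = 0 and H1: β1 ≠ 0
(If the null hypothesis is true, no linear relationship exists.)
The rejection region is two-tailed: reject H0 if t < –tCritical or t > tCritical, where tCritical is the α/2 = .025 critical value of the Student t distribution with n–2 degrees of freedom.
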
COMPUTE
Ex. 16.4
Compute t manually or refer to our Excel output:
The t statistic for “odometer” (i.e. the slope, b1) is –13.49. Compare it with the critical value: –13.49 is less than –tCritical = –1.984 (equivalently, |t| = 13.49 exceeds 1.984), so it falls in the rejection region. We also note that the p-value is 0.000.
There is overwhelming evidence to infer that a linear relationship between odometer reading and price exists.

Testing the Slope
To test for positive or negative linear relationships, conduct one-tail tests, i.e. our research hypothesis becomes:
H1: β1 < 0 (testing for a negative slope) or
H1: β1 > 0 (testing for a positive slope)
The null hypothesis remains: H0: β1 = 0.

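The two-tail and one-tail decision rules differ only in where the rejection region sits. A small sketch, assuming the critical value is looked up separately in a t table (α/2 for a two-tail test, α for a one-tail test); the helper name `reject_h0` is hypothetical.

```python
def reject_h0(t, t_critical, tail="two"):
    """Decide whether t falls in the rejection region.

    t_critical is the positive table value for the chosen tail and alpha.
    tail: "two"   -> H1: beta1 != 0
          "left"  -> H1: beta1 < 0
          "right" -> H1: beta1 > 0
    """
    if tail == "two":
        return abs(t) > t_critical
    if tail == "left":
        return t < -t_critical
    if tail == "right":
        return t > t_critical
    raise ValueError("tail must be 'two', 'left' or 'right'")

# Two-tail test with the values quoted for Ex. 16.4
print(reject_h0(-13.49, 1.984, tail="two"))
```

A strongly negative t leads to rejection in a two-tail or left-tail test, but never in a right-tail test, since it gives no evidence of a positive slope.
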
Coefficient of Determination
Tests thus far have shown whether a linear relationship exists; it is also useful to measure the strength of the relationship. This is done by calculating the coefficient of determination, R².
The coefficient of determination is the square of the coefficient of correlation (r), hence R² = r².

Coefficient of Determination
As we did with analysis of variance, we can partition the variation in y into two parts:
Variation in y = SSE + SSR
SSE – Sum of Squares Error – measures the amount of variation in y that remains unexplained (i.e. due to error).
SSR – Sum of Squares Regression – measures the amount of variation in y explained by variation in the independent variable x.

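The partition can be checked numerically. The sketch below uses made-up data and a hypothetical helper `sums_of_squares`; it fits the least squares line and confirms that SSE and SSR add up to the total variation in y.

```python
def sums_of_squares(x, y):
    """Return (SSE, SSR, total variation in y) for the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # explained
    total = sum((yi - y_bar) ** 2 for yi in y)
    return sse, ssr, total

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sse, ssr, total = sums_of_squares(x, y)
r_squared = ssr / total
print(f"SSE = {sse:.2f}, SSR = {ssr:.2f}, total = {total:.2f}, R^2 = {r_squared:.2f}")
```

The ratio SSR / total is the share of the variation in y that the line explains, which is how R² is read.
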
COMPUTE
Coefficient of Determination
We can compute this manually, as R² = SSR / (variation in y) = 1 – SSE / (variation in y), or with Excel…

INTERPRET
Coefficient of Determination
R² = .6483 means 64.83% of the variation in the auction selling prices (y) is explained by the variation in the odometer readings (x). The remaining 35.17% is unexplained, i.e. due to error.
In general, the higher the value of R², the better the model fits the data.
R² = 1: Perfect match between the line and the data points.
R² = 0: There is no linear relationship between x and y.

More on Excel’s Output
An analysis of variance (ANOVA) table for the simple linear regression model can be given by:

Source      Degrees of freedom  Sums of Squares  Mean Squares     F-Statistic
Regression  1                   SSR              MSR = SSR/1      F = MSR/MSE
Error       n–2                 SSE              MSE = SSE/(n–2)
Total       n–1                 Variation in y

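The mean squares and F statistic in the table follow directly from SSR, SSE and n. A minimal sketch, with a hypothetical function name and made-up input values:

```python
def regression_anova(ssr, sse, n):
    """Fill in the mean squares and F statistic of the regression ANOVA table."""
    msr = ssr / 1            # regression has 1 degree of freedom
    mse = sse / (n - 2)      # error has n - 2 degrees of freedom
    return {"MSR": msr, "MSE": mse, "F": msr / mse}

# Hypothetical sums of squares from a sample of n = 5 observations
table = regression_anova(ssr=3.6, sse=2.4, n=5)
print(table)
```

In simple linear regression the F statistic equals the square of the slope's t statistic, so the F-test and the two-tail t-test of the slope lead to the same conclusion.
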
Coefficient of Correlation
We can use the coefficient of correlation to test for a linear relationship between two variables.
Recall:
The coefficient of correlation ranges between –1 and +1.
• If r = –1 (negative association) or r = +1 (positive association), every point falls on the regression line.
• If r = 0, there is no linear pattern.

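Coefficient of Correlation
The population coefficient of correlation is denoted ρ (rho).
We estimate its value from sample data with the sample coefficient of correlation:
r = s_xy / (s_x · s_y)
where s_xy is the sample covariance and s_x, s_y are the sample standard deviations.
The test statistic for testing if ρ = 0 is:
t = r · √((n–2) / (1 – r²))
which is Student t-distributed with n–2 degrees of freedom.
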
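As a rough sketch of the calculation, the following computes r from the sample covariance and standard deviations and then the t statistic with n–2 degrees of freedom. The data are made up and the function name `correlation_t_test` is hypothetical.

```python
import math

def correlation_t_test(x, y):
    """Return (r, t) for testing H0: rho = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    r = s_xy / (s_x * s_y)
    # Undefined when r is exactly +1 or -1 (division by zero)
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    return r, t

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, t = correlation_t_test(x, y)
print(f"r = {r:.4f}, t = {t:.4f}")
```
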
Ex. 16.6
We can conduct the t-test of the coefficient of correlation as an alternate means to determine whether odometer reading and auction selling price are linearly related.
Research hypothesis: H1: ρ ≠ 0 (i.e. there is a linear relationship)
Null hypothesis: H0: ρ = 0 (i.e. there is no linear relationship)

COMPUTE
Ex. 16.6
We’ve already shown that:

Hence we calculate the coefficient of correlation as:

and the value of our test statistic becomes:

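One way to cross-check this example is to work from numbers quoted elsewhere in the slides: R² = .6483 and a negative slope (–.0669). Since R² = r², r is the negative square root of R². The sample size is not stated in the slides, so n = 100 below is purely an assumption for illustration.

```python
import math

r_squared = 0.6483   # coefficient of determination quoted in the slides
slope = -0.0669      # slope of the regression equation; only its sign is used
n = 100              # ASSUMPTION: sample size is not given in the slides

# r takes the sign of the slope
r = math.copysign(math.sqrt(r_squared), slope)
t = r * math.sqrt((n - 2) / (1 - r ** 2))
print(f"r = {r:.4f}, t = {t:.2f}")
```

Under that assumed n, the result lands close to the t statistic of –13.49 reported for the slope. The two tests are algebraically equivalent in simple regression, so any small gap would come from rounding and from the assumed sample size.
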
COMPUTE
Ex. 16.6
We can also use Excel > Add-Ins > Data Analysis Plus and the Correlation (Pearson) tool to get this output, then compare the test statistic with the critical value or read off the p-value.
We can also do a one-tail test for positive or negative linear relationships.
Again, we reject the null hypothesis (that there is no linear correlation) in favor of the alternative hypothesis (that our two variables are in fact related in a linear fashion).

Using the Regression Equation
We could use our regression equation:
ŷ = 17.250 – .0669x
to predict the selling price of a car with 40 (thousand) miles on it:
ŷ = 17.250 – .0669x = 17.250 – .0669(40) = 14.574 (in thousands of dollars)
We call this value ($14,574) a point prediction. Chances are, though, that the actual selling price will be different, hence we can also estimate the selling price in terms of an interval.

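The point prediction is a direct substitution into the regression equation. A minimal sketch using the coefficients from the slides; the function name `predict_price` is hypothetical, and both x and the result are in thousands.

```python
def predict_price(odometer_thousands, b0=17.250, b1=-0.0669):
    """Point prediction of selling price (in $ thousands) from odometer reading (in thousands of miles)."""
    return b0 + b1 * odometer_thousands

price = predict_price(40)
print(f"{price:.3f} thousand dollars")
```

Keeping the units explicit avoids the common slip of reading 14.574 as dollars instead of thousands of dollars.
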
Procedure for Regression Diagnostics
1. Develop a model that has a theoretical basis.
2. Gather data for the two variables in the model.
3. Draw the scatter diagram to determine whether a linear model appears to be appropriate. Identify possible outliers.
4. Determine the regression equation.
5. Calculate the residuals and check the required conditions.
6. Assess the model’s fit.
7. If the model fits the data, use the regression equation to predict a particular value of the dependent variable and/or estimate its mean.

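Steps 4 and 5 of the procedure can be sketched as follows. The data are made up, the function name `residual_check` is hypothetical, and the screening rule (a residual divided by the standard error of estimate, flagged when its absolute value exceeds 2) is a simplified rule of thumb assumed here, not something stated in the slides.

```python
import math

def residual_check(x, y, threshold=2.0):
    """Fit the least squares line, then return residuals, scaled residuals and flagged indices."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s_eps = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    scaled = [e / s_eps for e in residuals]       # simplified standardization
    flagged = [i for i, z in enumerate(scaled) if abs(z) > threshold]
    return residuals, scaled, flagged

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
residuals, scaled, flagged = residual_check(x, y)
print("possible outliers at indices:", flagged)
```

Least squares residuals always sum to zero, which makes a quick sanity check on the arithmetic before looking at their pattern.
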
MANUAL STEPS- For EXAM!!

ANOVA
• Used to compare two or more populations of interval data.
• A procedure which determines whether differences exist between population means.
• A procedure which works by analyzing sample variance.

1-way ANOVA
Independent samples are drawn from k populations.
Note: These populations are referred to as treatments.
It is not a requirement that n1 = n2 = … = nk.

1-way ANOVA
New Terminology:
x is the response variable, and its values are responses.
xij refers to the ith observation in the jth sample.
E.g. x35 is the third observation of the fifth sample.
The grand mean, x̿ (x double bar), is the mean of all the observations, i.e.:
x̿ = (sum of all xij over every sample) / n
(n = n1 + n2 + … + nk)

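A short sketch of the grand mean with unequal sample sizes, on made-up data. It pools every observation and divides by n = n1 + n2 + … + nk, and it shows that this differs from the plain average of the sample means when the samples are not the same size.

```python
def grand_mean(samples):
    """Mean of all observations pooled across the k samples."""
    n = sum(len(s) for s in samples)
    return sum(sum(s) for s in samples) / n

# Hypothetical samples of sizes 3, 2 and 4
samples = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
gm = grand_mean(samples)
mean_of_means = sum(sum(s) / len(s) for s in samples) / len(samples)
print(f"grand mean = {gm}, unweighted mean of sample means = {mean_of_means:.4f}")
```
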
1-way ANOVA
More New Terminology:
The population classification criterion is called a factor.
Each population is a factor level.

Basic ANOVA situation:
Two variables: 1 Categorical, 1 Quantitative
Main Question: Does the mean of the quantitative variable depend on which group (given by the categorical variable) the individual is in?
If the categorical variable has only 2 values:
• 2-sample t-test
ANOVA allows for 3 or more groups.

ANOVA:
• To test hypotheses about two or more population means

3 TYPES OF ANOVA:
1) One-way ANOVA
2) Randomized Blocks (2-way ANOVA)
3) Regression Model

4 STEPS OF ANOVA:
1. Estimate the population variance from the variation between the sample means (MST).
2. Estimate the population variance from the variation within the samples (MSE).
3. Calculate the F-ratio: F = MST / MSE
4. Reject H0 if F* > F (the critical value from the F-table).

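The first three steps can be sketched as below, on made-up data. The function name `one_way_anova` is hypothetical. The degrees of freedom k–1 (between samples) and n–k (within samples) are the standard ones for one-way ANOVA, and the critical F for step 4 still has to be read from a table.

```python
def one_way_anova(samples):
    """Return (MST, MSE, F) for k independent samples."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand = sum(sum(s) for s in samples) / n
    means = [sum(s) / len(s) for s in samples]
    # Between-sample variation (sum of squares for treatments)
    sst = sum(len(s) * (m - grand) ** 2 for s, m in zip(samples, means))
    # Within-sample variation (sum of squares for error)
    sse = sum((v - m) ** 2 for s, m in zip(samples, means) for v in s)
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return mst, mse, mst / mse

# Hypothetical samples from k = 3 treatments
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
mst, mse, f_ratio = one_way_anova(samples)
print(f"MST = {mst:.2f}, MSE = {mse:.2f}, F = {f_ratio:.2f}")
```

A large F means the sample means differ by more than the within-sample spread would suggest, which is the evidence against H0.
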
Thank You
