Statistics Overview Part II
Outline
• Covariance
• Correlation
• Simple Linear Regression Model
Measures Of The Relationship Between Two Numerical Variables
• The Covariance
• The Coefficient of Correlation
The Covariance
• The covariance measures the strength of the linear relationship between two numerical
variables (X & Y)
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
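As a minimal sketch of the covariance formula, the following computes the sample covariance directly from the cross-deviations. The function name `sample_covariance` and the five data points are hypothetical, chosen only for illustration.

```python
def sample_covariance(x, y):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross / (n - 1)


# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_xy = sample_covariance(x, y)
print(cov_xy)  # 1.5
```

A positive value indicates that X and Y tend to move in the same direction. The magnitude depends on the units of X and Y, so it is hard to interpret on its own.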
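The Coefficient of Correlation
$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y}$$
where
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}, \qquad S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}, \qquad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$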
[Figure: scatter plots of Y against X for r = -1, r = -0.6, r = +1, r = +0.3, and r = 0]
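The coefficient of correlation can be sketched the same way, by dividing the sample covariance by the product of the two sample standard deviations. The helper names and data below are hypothetical and serve only as an illustration.

```python
import math


def sample_std(values):
    """Sample standard deviation with an n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def correlation(x, y):
    """Coefficient of correlation r = cov(X, Y) / (S_X * S_Y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    return cov / (sample_std(x) * sample_std(y))


# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4))  # 0.7746

# A perfect increasing line gives r = +1 (up to floating-point rounding)
r_perfect = correlation(x, [2 * v + 1 for v in x])
```

Unlike the covariance, r is unit-free and always lies between -1 and +1.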
Introduction to Regression Analysis
• Regression analysis is used to:
• Predict the value of a dependent variable based on the value of at least one
independent variable
• Explain the impact of changes in an independent variable on the dependent
variable
Dependent variable: the variable we wish to predict or explain (Y)
Independent variable: the variable used to predict or explain the dependent variable (X)
Simple Linear Regression Model
[Figure: scatter plots of Y against X illustrating types of relationships; surviving panel labels: Quadratic/Parabolic, Exponential]
Types of Relationships (continued)
[Figure: scatter plots of Y against X contrasting strong relationships with weak relationships]
Types of Relationships (continued)
[Figure: scatter plot of Y against X showing no relationship]
Simple Linear Regression Model
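$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
where:
$Y_i$ = dependent variable
$\beta_0$ = population Y intercept
$\beta_1$ = population slope coefficient
$X_i$ = independent variable
$\varepsilon_i$ = random error term
$\beta_0 + \beta_1 X_i$ is the linear component; $\varepsilon_i$ is the random error component.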
Simple Linear Regression Model
(continued)
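$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
[Figure: the population line with intercept β0 and slope β1 plotted against X; at a given Xi, the random error εi is the vertical distance between the observed value of Y and the predicted value of Y on the line]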
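To make the two components of the model concrete, here is a small simulation sketch that generates observations as a linear component plus a random error. The parameter values, the `simulate` helper, and the choice of normally distributed errors are all assumptions made for illustration; the text above does not specify an error distribution.

```python
import random


def simulate(beta0, beta1, xs, sigma, seed=0):
    """Generate (x, linear, eps, y) rows with y = beta0 + beta1 * x + eps."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rows = []
    for x in xs:
        linear = beta0 + beta1 * x    # linear component (point on the line)
        eps = rng.gauss(0.0, sigma)   # random error component (assumed normal)
        rows.append((x, linear, eps, linear + eps))
    return rows


# Hypothetical population parameters
rows = simulate(beta0=2.0, beta1=0.5, xs=[1, 2, 3, 4, 5], sigma=1.0)
for x, linear, eps, y in rows:
    print(f"X={x}  line={linear:.2f}  error={eps:+.2f}  observed Y={y:.2f}")
```

Each observed Y sits above or below the line by exactly its error term, which is the vertical gap described in the figure.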
Simple Linear Regression Equation (Prediction Line)
$$\hat{Y}_i = b_0 + b_1 X_i$$
where:
$\hat{Y}_i$ = estimated (or predicted) Y value for observation i
$b_0$ = estimate of the regression intercept
$b_1$ = estimate of the regression slope
$X_i$ = value of X for observation i
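The estimates b0 and b1 are usually obtained by least squares. The sketch below assumes the standard least-squares formulas, b1 = SSXY / SSX and b0 = Ȳ - b1·X̄, which are not spelled out in the text above; the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares prediction line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx           # estimate of the regression slope
    b0 = y_bar - b1 * x_bar   # estimate of the regression intercept
    return b0, b1


def predict(b0, b1, x_value):
    """Predicted Y value on the fitted line for a given X."""
    return b0 + b1 * x_value


# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
y_hat = [predict(b0, b1, xi) for xi in x]
print(round(b0, 4), round(b1, 4))  # 2.2 0.6
```

The fitted line always passes through the point (X̄, Ȳ), which is a quick sanity check on any implementation.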
Inferences About the Slope
$$S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$$
where:
$S_{b_1}$ = estimate of the standard error of the slope
$S_{YX} = \sqrt{\dfrac{SSE}{n - 2}}$ = standard error of the estimate
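As an illustrative sketch, both standard errors can be computed from the residuals of a least-squares fit. The fit itself uses the standard least-squares formulas (an assumption, since they are not shown above), and the function name and data are hypothetical.

```python
import math


def slope_standard_error(x, y):
    """Return (s_yx, s_b1) for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ssx
    b0 = y_bar - b1 * x_bar
    # SSE: sum of squared differences between observed and predicted Y
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))   # standard error of the estimate
    s_b1 = s_yx / math.sqrt(ssx)      # standard error of the slope
    return s_yx, s_b1


# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
s_yx, s_b1 = slope_standard_error(x, y)
print(round(s_yx, 4), round(s_b1, 4))  # 0.8944 0.2828
```

Note that the slope's standard error shrinks as the X values spread out (larger SSX), not only as the scatter around the line decreases.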
Inferences About the Slope: t Test
• t test for a population slope
• Is there a linear relationship between X and Y?
• Null and alternative hypotheses
• H0: β1 = 0 (no linear relationship)
• H1: β1 ≠ 0 (linear relationship does exist)
• Test statistic:
$$t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad \text{d.f.} = n - 2$$
where:
$b_1$ = regression slope coefficient
$\beta_1$ = hypothesized slope
$S_{b_1}$ = standard error of the slope
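The test can be sketched in a few lines. The example below assumes a two-tailed test at the 0.05 level and takes the critical value 3.182 for 3 degrees of freedom from a standard t table; the function name, the data, and the significance level are all hypothetical choices for illustration.

```python
import math


def slope_t_stat(x, y, beta1_null=0.0):
    """Return (t_stat, df) for testing H0: beta1 = beta1_null."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ssx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(ssx)
    return (b1 - beta1_null) / s_b1, n - 2


# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t_stat, df = slope_t_stat(x, y)

# Two-tailed critical value for alpha = 0.05 and d.f. = 3, read from a t table
T_CRIT = 3.182
reject_h0 = abs(t_stat) > T_CRIT
print(round(t_stat, 4), df, reject_h0)  # 2.1213 3 False
```

With only five observations, even a slope of 0.6 is not enough to reject H0 at this level, which shows how strongly the conclusion depends on sample size.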
Measures of Variation
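$$SST = \sum (Y_i - \bar{Y})^2, \qquad SSR = \sum (\hat{Y}_i - \bar{Y})^2, \qquad SSE = \sum (Y_i - \hat{Y}_i)^2$$
[Figure: at a given Xi, the vertical distance from the observed Yi to the mean Ȳ (SST) is split into the distance from the predicted Ŷi to Ȳ (SSR) and the distance from Yi to Ŷi (SSE)]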
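A short sketch can confirm that the three measures fit together, with total variation equal to explained plus unexplained variation (SST = SSR + SSE) for a least-squares line. The function name and data are hypothetical, and the fit uses the standard least-squares formulas as an assumption.

```python
def sums_of_squares(x, y):
    """Return (sst, ssr, sse) for a least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ssx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained by the line
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # left unexplained
    return sst, ssr, sse


# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, ssr, sse = sums_of_squares(x, y)
print(round(sst, 4), round(ssr, 4), round(sse, 4))  # 6.0 3.6 2.4
```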
Coefficient of Determination, r²
[Figure: scatter plots of Y against X in which every point lies on the line, r² = 1]
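The following sketch assumes the standard definition r² = SSR / SST, the share of the total variation in Y explained by the regression line; the definition is not written out in the text above. The function name and data are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination r^2 = SSR / SST for a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ssx
    b0 = y_bar - b1 * x_bar
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
    return ssr / sst  # undefined when all Y values are equal (SST = 0)


# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2 = r_squared(x, y)
print(round(r2, 4))  # 0.6

# Points lying exactly on a line give r^2 = 1 (up to floating-point rounding)
r2_perfect = r_squared(x, [3 * v - 1 for v in x])
```

In simple linear regression this value equals the square of the coefficient of correlation r, so 0.6 here corresponds to r of about 0.77.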
Examples of Approximate r² Values
[Figure: scatter plot of Y against X with 0 < r² < 1]
Examples of Approximate r² Values
[Figure: scatter plot of Y against X with r² = 0]
No linear relationship
between X and Y: