10-Regression
Outline: Empirical models, Simple Linear Regression, Estimating σ², Hypothesis tests, Confidence intervals, Prediction, Adequacy, Correlation, Summary
Empirical models
• Many problems in engineering and science involve exploring the relationships between two or more variables.
• Regression analysis is a statistical technique that is very useful for these types of problems.
• For example, in a chemical process, suppose that the yield of the product is related to the process-operating temperature.
• Regression analysis can be used to build a model to predict yield at a given temperature level.
Based on the scatter diagram, it is probably reasonable to assume that the mean of the random variable Y is related to x by the following straight-line relationship:
$$E(Y \mid x) = \mu_{Y|x} = \beta_0 + \beta_1 x$$
where the slope $\beta_1$ and intercept $\beta_0$ of the line are called regression coefficients.
The simple linear regression model is given by
$$Y = \beta_0 + \beta_1 x + \varepsilon$$
where $\varepsilon$ is the random error term.
07/12/2023 Department of Mathematics 5/40
Suppose that the mean and variance of $\varepsilon$ are 0 and $\sigma^2$, respectively. Then
$$E(Y \mid x) = E(\beta_0 + \beta_1 x + \varepsilon) = \beta_0 + \beta_1 x + E(\varepsilon) = \beta_0 + \beta_1 x$$
The variance of Y given x is
$$V(Y \mid x) = V(\beta_0 + \beta_1 x + \varepsilon) = V(\beta_0 + \beta_1 x) + V(\varepsilon) = 0 + \sigma^2 = \sigma^2$$
The true regression model is a line of mean values:
$$\mu_{Y|x} = \beta_0 + \beta_1 x$$
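The mean and variance results above can be checked numerically. The sketch below simulates many responses at a single fixed x and compares the sample mean and variance with β0 + β1x and σ². The parameter values are hypothetical, and the errors are drawn from a normal distribution purely for convenience; the derivation above only requires mean 0 and variance σ².

```python
import random
import statistics

def simulate_responses(beta0, beta1, sigma, x, n, seed=0):
    """Draw n values of Y = beta0 + beta1*x + eps, with eps ~ Normal(0, sigma^2)."""
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for _ in range(n)]

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1, sigma, x = 2.0, 0.5, 0.5, 4.0
ys = simulate_responses(beta0, beta1, sigma, x, n=20000)

sample_mean = statistics.fmean(ys)    # should be close to beta0 + beta1*x = 4.0
sample_var = statistics.variance(ys)  # should be close to sigma^2 = 0.25
print(sample_mean, sample_var)
```

Changing x moves the sample mean along the line β0 + β1x, while the sample variance stays near σ².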
Simple Linear Regression
Suppose that we have n pairs of observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$:
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \ldots, n$$
Figure 11-3 Deviations of the data from the estimated regression model.
The sum of the squares of the deviations of the observations from the true regression line is
$$L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$
The least squares estimates of $\beta_0$ and $\beta_1$ are the values that minimize L.
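Define the sums of squares and cross-products:
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$
$$S_{xy} = \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$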
Theorem
The least squares estimates of the intercept and slope in the simple linear regression model are
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}$$
Estimated regression line is
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
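As a minimal sketch of the theorem, the function below computes the slope as S_xy / S_xx and the intercept as ȳ minus the slope times x̄. The function name `least_squares_fit` and the five data points are made up for illustration; they are not taken from the slides.

```python
def least_squares_fit(xs, ys):
    """Return (beta0_hat, beta1_hat) for the fitted line y = beta0 + beta1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    beta1_hat = s_xy / s_xx                # slope: S_xy / S_xx
    beta0_hat = y_bar - beta1_hat * x_bar  # intercept: y_bar - slope * x_bar
    return beta0_hat, beta1_hat

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = least_squares_fit(xs, ys)
print(f"y_hat = {b0:.3f} + {b1:.3f} x")
```

For this data set S_xx = 10 and S_xy = 19.9, so the slope is 1.99 and the intercept is 6.02 − 1.99 · 3 = 0.05.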
Therefore, the least squares estimates of the slope and intercept are
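Estimating σ²
The error sum of squares is $SS_E = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$. We have
$$E(SS_E) = (n - 2)\sigma^2,$$
so an unbiased estimator of $\sigma^2$ is $\hat{\sigma}^2 = SS_E / (n - 2)$. A convenient computing formula is
$$SS_E = SS_T - \hat{\beta}_1 S_{xy}$$
where $SS_T = \sum_{i=1}^{n} (y_i - \bar{y})^2$ is the total sum of squares.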
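The two routes to the error sum of squares can be compared directly. This sketch computes it once from the residuals and once from the shortcut (total sum of squares minus slope times S_xy), then divides by n − 2 to estimate σ². The helper name `error_variance_estimate` and the data are hypothetical.

```python
def error_variance_estimate(xs, ys):
    """Return (sse_direct, sse_shortcut, sigma2_hat) for a simple linear fit."""
    n = len(xs)
    if n < 3:
        raise ValueError("need n >= 3 so that n - 2 > 0")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    # Definition: sum of squared residuals about the fitted line.
    sse_direct = sum((y - (beta0_hat + beta1_hat * x)) ** 2
                     for x, y in zip(xs, ys))
    # Shortcut: SS_E = SS_T - beta1_hat * S_xy.
    sse_shortcut = sst - beta1_hat * s_xy
    sigma2_hat = sse_direct / (n - 2)
    return sse_direct, sse_shortcut, sigma2_hat

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
sse_direct, sse_shortcut, sigma2_hat = error_variance_estimate(xs, ys)
print(sse_direct, sse_shortcut, sigma2_hat)
```

The shortcut subtracts two nearly equal numbers when the fit is good, so the residual-based value is usually the safer one to divide by n − 2 in floating-point arithmetic.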
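Hypothesis tests
To test $H_0: \beta_1 = \beta_{1,0}$ against $H_1: \beta_1 \neq \beta_{1,0}$, the test statistic
$$T_0 = \frac{\hat{\beta}_1 - \beta_{1,0}}{\sqrt{\hat{\sigma}^2 / S_{xx}}}$$
where $\hat{\sigma}^2 = SS_E / (n - 2)$, has the t distribution with n − 2 degrees of freedom when H0 is true.
If $|t_0| > t_{\alpha/2,\,n-2}$: reject H0
If $|t_0| < t_{\alpha/2,\,n-2}$: fail to reject H0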
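The decision rule can be sketched in a few lines. The code below computes the t statistic for the slope as the estimated slope minus its hypothesized value, divided by the square root of (estimated σ² / S_xx), and compares it with a critical value that the caller supplies from a t table, since the Python standard library has no t distribution. The function name, the data, and the choice α = 0.05 are assumptions for illustration.

```python
import math

def slope_t_statistic(xs, ys, beta1_null=0.0):
    """Return t0 for testing H0: beta1 = beta1_null in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    sse = sum((y - beta0_hat - beta1_hat * x) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = sse / (n - 2)                 # estimate of sigma^2
    std_err = math.sqrt(sigma2_hat / s_xx)     # estimated standard error of the slope
    return (beta1_hat - beta1_null) / std_err

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
t0 = slope_t_statistic(xs, ys)

t_crit = 3.182  # t_{0.025, 3} read from a t table (alpha = 0.05, n - 2 = 3)
reject_h0 = abs(t0) > t_crit
print(t0, reject_h0)
```

Here t0 is roughly 33.3, far beyond the critical value, so H0: β1 = 0 would be rejected for this data set.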
An important special case is
$$H_0: \beta_1 = 0, \qquad H_1: \beta_1 \neq 0$$
These hypotheses relate to the significance of regression. Failure to reject H0 is equivalent to concluding that there is no linear relationship between x and Y.
Figure 11-6 The hypothesis $H_0: \beta_1 = 0$ is rejected.
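Confidence intervals
A 100(1 − α)% confidence interval on the intercept $\beta_0$ is
$$\hat{\beta}_0 - t_{\alpha/2,\,n-2} \sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]} \le \beta_0 \le \hat{\beta}_0 + t_{\alpha/2,\,n-2} \sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]}$$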
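Recall
CI 95% for $\beta_1$:
$$\hat{\beta}_1 - t_{0.025,\,n-2} \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} \le \beta_1 \le \hat{\beta}_1 + t_{0.025,\,n-2} \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}}$$
CI 95% on $\mu_{Y|x_0}$:
$$\hat{\mu}_{Y|x_0} - t_{0.025,\,n-2} \sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right]} \le \mu_{Y|x_0} \le \hat{\mu}_{Y|x_0} + t_{0.025,\,n-2} \sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right]}$$
where $\hat{\mu}_{Y|x_0} = \hat{\beta}_0 + \hat{\beta}_1 x_0$.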
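A short sketch of the coefficient intervals follows. It forms each interval as estimate ± critical value × estimated standard error, with the critical value passed in by the caller from a t table. The function name `coefficient_intervals` and the data are hypothetical.

```python
import math

def coefficient_intervals(xs, ys, t_crit):
    """Return ((lo0, hi0), (lo1, hi1)): intervals for the intercept and slope.

    t_crit is the caller-supplied critical value t_{alpha/2, n-2}.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    sse = sum((y - beta0_hat - beta1_hat * x) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = sse / (n - 2)
    se0 = math.sqrt(sigma2_hat * (1.0 / n + x_bar ** 2 / s_xx))  # intercept
    se1 = math.sqrt(sigma2_hat / s_xx)                           # slope
    return ((beta0_hat - t_crit * se0, beta0_hat + t_crit * se0),
            (beta1_hat - t_crit * se1, beta1_hat + t_crit * se1))

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
ci_beta0, ci_beta1 = coefficient_intervals(xs, ys, t_crit=3.182)  # t_{0.025, 3}
print(ci_beta0, ci_beta1)
```

For this data the 95% interval for the slope excludes 0 while the interval for the intercept contains 0, which agrees with what the corresponding two-sided t tests would conclude at α = 0.05.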
Scatter diagram of oxygen purity with fitted regression line and 95% confidence limits on $\mu_{Y|x_0}$.
Prediction
Scatter diagram of oxygen purity data from Table 11.1 with fitted regression line, 95% prediction limits, and 95% confidence limits on $\mu_{Y|x_0}$.
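The difference between the two sets of limits in the figure can be illustrated numerically. The sketch below uses the standard textbook forms, which are an assumption here because the slide text does not show them: the confidence interval on the mean response uses the variance factor 1/n + (x0 − x̄)²/S_xx, and the prediction interval for a new observation adds 1 to that factor. The function name, data, and x0 are hypothetical.

```python
import math

def intervals_at(xs, ys, x0, t_crit):
    """Return (y0_hat, mean_ci, pred_int) at x0 for a simple linear fit.

    t_crit is the caller-supplied critical value t_{alpha/2, n-2}.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    sse = sum((y - beta0_hat - beta1_hat * x) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = sse / (n - 2)
    y0_hat = beta0_hat + beta1_hat * x0
    spread = 1.0 / n + (x0 - x_bar) ** 2 / s_xx
    half_ci = t_crit * math.sqrt(sigma2_hat * spread)          # mean response
    half_pi = t_crit * math.sqrt(sigma2_hat * (1.0 + spread))  # new observation
    return (y0_hat,
            (y0_hat - half_ci, y0_hat + half_ci),
            (y0_hat - half_pi, y0_hat + half_pi))

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
y0_hat, mean_ci, pred_int = intervals_at(xs, ys, x0=3.5, t_crit=3.182)
print(y0_hat, mean_ci, pred_int)
```

The prediction interval is always wider than the confidence interval at the same x0, because it accounts for the error of a future observation in addition to the uncertainty in the fitted line. Both widen as x0 moves away from x̄.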
Adequacy of the Regression Model
• Fitting a regression model requires several assumptions.
1. Errors are uncorrelated random variables with mean zero;
2. Errors have constant variance; and,
3. Errors are normally distributed.
• The analyst should always consider the validity of these assumptions to be doubtful and conduct analyses to examine the adequacy of the model.
• The quantity
$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_E}{SS_T}$$
where $SS_R = SS_T - SS_E$ is the regression sum of squares, is called the coefficient of determination and is often used to judge the adequacy of a regression model.
• $0 \le R^2 \le 1$;
• We often refer to $R^2$ as the amount of variability in the data explained or accounted for by the regression model.
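The coefficient of determination can be computed as one minus the ratio of the error sum of squares to the total sum of squares, as in the minimal sketch below. The function name `r_squared` and the data are made up for illustration.

```python
def r_squared(xs, ys):
    """Return R^2 = 1 - SS_E / SS_T for a simple linear regression fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or sst == 0:
        raise ValueError("x and y must each vary for R^2 to be defined")
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    sse = sum((y - beta0_hat - beta1_hat * x) ** 2 for x, y in zip(xs, ys))
    return 1.0 - sse / sst

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(xs, ys)
print(r2)
```

For this data the error sum of squares is 0.107 and the total sum of squares is 39.708, so about 99.7% of the variability in y is accounted for by the fitted line. A high value alone does not confirm that the model assumptions hold.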
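Correlation
The sample correlation coefficient is
$$R = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} = \frac{S_{XY}}{\sqrt{S_{XX}\, SS_T}}$$
Note that
$$\hat{\beta}_1 = \left( \frac{SS_T}{S_{XX}} \right)^{1/2} R$$
so that $R^2 = \hat{\beta}_1 S_{XY} / SS_T = SS_R / SS_T$, the coefficient of determination.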
Properties of the Linear Correlation Coefficient r
1. $-1 \le r \le 1$
2. The value of r does not change if all values of either variable are converted to a different scale.
3. The value of r is not affected by the choice of x and y. Interchange all x- and y-values and the value of r will not change.
4. r measures strength of a linear relationship.
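Properties 1 to 3 can be checked on a small example. The sketch below computes r from its definition, then recomputes it after a change of units on x (a positive scale factor plus a shift) and after swapping the roles of x and y. The function name `correlation` and the data are hypothetical. Note the assumption that the rescaling factor is positive; a negative factor would flip the sign of r.

```python
import math

def correlation(xs, ys):
    """Return the sample linear correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = correlation(xs, ys)
r_rescaled = correlation([2.54 * x + 10.0 for x in xs], ys)  # change of units on x
r_swapped = correlation(ys, xs)                              # roles of x and y swapped
print(r, r_rescaled, r_swapped)
```

All three values agree up to floating-point rounding and lie between −1 and 1.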
Figure: scatter plots of y versus x. Panels: strong negative correlation; strong positive correlation; r = 0.07; r = 0.42.
To test $H_0: \rho = 0$ against $H_1: \rho \neq 0$, where $\rho$ is the population correlation coefficient, the test statistic
$$T_0 = \frac{R \sqrt{n - 2}}{\sqrt{1 - R^2}}$$
has the t distribution with n − 2 degrees of freedom when H0 is true.
If $|t_0| > t_{\alpha/2,\,n-2}$: reject H0
If $|t_0| < t_{\alpha/2,\,n-2}$: fail to reject H0
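The test on the correlation coefficient can be sketched as follows. The code computes t0 as r times the square root of n − 2, divided by the square root of 1 − r², and compares it with a critical value supplied from a t table. The function name, the data, and α = 0.05 are assumptions for illustration.

```python
import math

def correlation_t_statistic(xs, ys):
    """Return (r, t0) where t0 = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need n >= 3 so that n - 2 > 0")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = s_xy / math.sqrt(s_xx * s_yy)
    if abs(r) >= 1.0:
        raise ValueError("t0 is undefined when the points lie exactly on a line")
    t0 = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return r, t0

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r, t0 = correlation_t_statistic(xs, ys)

t_crit = 3.182  # t_{0.025, 3} read from a t table (alpha = 0.05, n - 2 = 3)
reject_h0 = abs(t0) > t_crit
print(r, t0, reject_h0)
```

For simple linear regression this statistic equals the t statistic for testing that the slope is zero, so the two tests always reach the same decision on the same data.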