Supply Chain Analytics

The document discusses predictive analytics and its applications in supply chain analytics, specifically focusing on linear regression techniques. It provides a case example involving a wine producing company analyzing the impact of advertising expenditures on sales, including the use of Excel and XLMiner for regression analysis. The document also addresses the importance of p-values and cautions against model misspecification and overfitting in regression analysis.

Uploaded by

aishwarya anand

18/01/2023

BUSINESS ANALYTICS

SUPPLY CHAIN ANALYTICS


MBA & MBAA TERM-VI
(2022-23)

SESSION 3
PREDICTIVE ANALYTICS & ITS
APPLICATIONS

Dr. Devendra Kumar Pathak
(M.Tech. & Ph.D., IIT Delhi)
Assistant Professor,
Operations Management & Decision Sciences,
Indian Institute of Management (IIM) Kashipur

MACHINE LEARNING OBJECTIVES & TECHNIQUES

LINEAR REGRESSION

• You own a 'KBC' wine-producing company that uses business analytics as its competitive advantage.
• You would like to understand the effect of advertising expenditures on sales for one of your brands.


LINEAR REGRESSION: CASE EXAMPLE

Region   First Year Sales   First Year Advertising
         ($ million)        Expenditure ($ million)
A        101.8              1.3
B         44.4              0.7
C        108.3              1.4
D         85.1              0.5
E         77.1              0.5
F        158.7              1.9
G        180.4              1.2
H         64.2              0.4
I         74.6              0.6
J        143.4              1.3
K        120.6              1.6
L         69.7              1.0
M         67.8              0.8
N        106.7              0.6
O        119.6              1.1

You are planning to start selling in a new region next year. What is your estimate of expected first year sales in this new region if you plan to spend $1.2M on advertising?

Average First Year Sales = $101.5M

LINEAR REGRESSION: CASE EXAMPLE

SCATTER PLOT

[Figure: First Year Sales & Advertising Data. Scatter plot of first year sales ($ million, axis 0 to 200) against first year advertising expenditures ($ million, axis 0 to 2). A second panel adds a horizontal line at the average first year sales of $101.5M.]


LINEAR REGRESSION: CASE EXAMPLE

Linear Regression

• We model sales (Y) as a linear function of advertising expenditures (x), plus some random deviations (ε) [residual]:

      Y = β0 + β1x + ε

  Y is the dependent variable; x is the predictor variable (independent variable, IDV).
• We must estimate the unknown parameters β0 and β1.
• We will call these estimates b0 and b1, respectively.

How can we find the best line?

• How can we estimate the intercept (b0) and slope (b1)?

[Figure: First Year Sales & Advertising Data. The scatter plot with a candidate line drawn through it; b0 marks the intercept on the sales axis and b1 marks the slope.]

LINEAR REGRESSION: CASE EXAMPLE

How can we find the best line?

• Regression estimate for Y at xi (prediction): ŷi = b0 + b1xi
• Residuals ("error" in prediction): ei = yi - ŷi
• Choose b0 and b1 to minimize the sum of squared residuals (or "errors").

Use of Excel to solve it!

• Data > Data Analysis > Regression
• Select data cells for your Y (sales) data and x (advertising expenditures) data

Use of XLMiner to solve it!

• Data Mining > Prediction > Linear Regression
• Select data cells for your Y (sales) data and x (advertising expenditures) data
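For readers without Excel or XLMiner, the same least-squares criterion can be worked out directly. The following is a minimal Python sketch, not the method used in the session: it applies the standard closed-form solution for a single predictor to the 15 regions in the case table. The function name fit_line is a hypothetical helper introduced only for this illustration.

```python
# First-year sales and advertising data from the case table ($ million).
sales = [101.8, 44.4, 108.3, 85.1, 77.1, 158.7, 180.4, 64.2,
         74.6, 143.4, 120.6, 69.7, 67.8, 106.7, 119.6]
advertising = [1.3, 0.7, 1.4, 0.5, 0.5, 1.9, 1.2, 0.4,
               0.6, 1.3, 1.6, 1.0, 0.8, 0.6, 1.1]


def fit_line(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(advertising, sales)
print(f"b0 = {b0:.1f}, b1 = {b1:.1f}")  # b0 = 42.2, b1 = 59.7
```

The estimates agree with the regression equation reported in the output for this case.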


LINEAR REGRESSION: CASE EXAMPLE

• Regression Output: Equation

Sales = 42.2 + 59.7 * advertising expenditures
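The fitted equation answers the question posed for the new region. A small Python sketch, assuming sales and advertising are both measured in $ million as in the case table (predict_sales is a hypothetical name used only here):

```python
def predict_sales(advertising_musd):
    """Fitted line from the regression output; input and output in $ million."""
    return 42.2 + 59.7 * advertising_musd


estimate = predict_sales(1.2)  # planned advertising spend of $1.2M
print(f"{estimate:.1f}")  # 113.8
```

Under this model the expected first year sales are roughly $113.8M, compared with the $101.5M average across the existing regions.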


LINEAR REGRESSION: CASE EXAMPLE

Best linear regression

[Figure: Line Fit Plot. Observed Y and Predicted Y (with a linear trend line through the predictions) plotted against the x variable. The first panel asks how large the error is; the second panel marks the residuals ei, with Σei² = 10185.6.]

Sales (Y) = 42.2 + 59.7 * advertising expenditures (x)

LINEAR REGRESSION: CASE EXAMPLE

NAÏVE ESTIMATION

[Figure: First Year Sales & Advertising Data. Scatter plot with a horizontal line at average first year sales = $101.5M; the residuals ei are measured from that line. SSE naive = 20405.]

How good is our sales prediction?

R2 is a measure of the overall quality of the regression. It is the proportion of the variance in the dependent variable that is predicted from the independent variable.
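R2 ties the two error sums together: it is one minus the ratio of the regression's squared error to the naive (average-only) squared error. The Python sketch below recomputes both sums from the case table, using the rounded coefficients 42.2 and 59.7 from the regression output; treat it as an illustration rather than a reproduction of the Excel output.

```python
# First-year sales and advertising data from the case table ($ million).
sales = [101.8, 44.4, 108.3, 85.1, 77.1, 158.7, 180.4, 64.2,
         74.6, 143.4, 120.6, 69.7, 67.8, 106.7, 119.6]
advertising = [1.3, 0.7, 1.4, 0.5, 0.5, 1.9, 1.2, 0.4,
               0.6, 1.3, 1.6, 1.0, 0.8, 0.6, 1.1]

y_bar = sum(sales) / len(sales)  # naive estimate: the average

# Squared error when every region is predicted by the average.
sse_naive = sum((y - y_bar) ** 2 for y in sales)

# Squared error when each region is predicted by the fitted line
# (rounded coefficients taken from the regression output).
sse_regression = sum((y - (42.2 + 59.7 * x)) ** 2
                     for x, y in zip(advertising, sales))

r2 = 1 - sse_regression / sse_naive
print(f"SSE naive = {sse_naive:.0f}")
print(f"SSE regression = {sse_regression:.1f}")
print(f"R2 = {r2:.2f}")
```

The two sums should land close to the 20405 and 10185.6 quoted on the slides, so the regression removes about half of the naive error.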


LINEAR REGRESSION: CASE EXAMPLE

How good are our b0 and b1 estimates?

Confidence Intervals

❖ Recall that b0 and b1 are estimates of β0 and β1.
❖ Interpretation: Our estimate for β1 is b1 = 59.7; we are 95% confident that the true value of β1 lies between 24.0 and 95.4.

• What if we had more data? Can we do even better?

What is your estimate of expected first year sales in the new region if you plan to spend $1.2M on advertising and $0.3M on promotions, and the competitors' sales are $20M?
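The 95% interval quoted for β1 in the single-feature model (24.0 to 95.4) can be reproduced from the case table. This is a minimal Python sketch assuming the usual textbook formula for the standard error of the slope; the critical value 2.160 is the two-sided 95% point of Student's t with n - 2 = 13 degrees of freedom and is hardcoded here as an assumption instead of being looked up from a statistics library.

```python
import math

# First-year sales and advertising data from the case table ($ million).
sales = [101.8, 44.4, 108.3, 85.1, 77.1, 158.7, 180.4, 64.2,
         74.6, 143.4, 120.6, 69.7, 67.8, 106.7, 119.6]
advertising = [1.3, 0.7, 1.4, 0.5, 0.5, 1.9, 1.2, 0.4,
               0.6, 1.3, 1.6, 1.0, 0.8, 0.6, 1.1]

T_CRIT = 2.160  # assumed: t critical value, 95% two-sided, 13 degrees of freedom

n = len(sales)
x_bar = sum(advertising) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in advertising)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(advertising, sales))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual variance uses n - 2 because two parameters were estimated.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(advertising, sales))
se_b1 = math.sqrt(sse / (n - 2) / s_xx)

lower = b1 - T_CRIT * se_b1
upper = b1 + T_CRIT * se_b1
print(f"{lower:.1f} to {upper:.1f}")  # 24.0 to 95.4
```

The interval is wide because only 15 regions are available; more observations would shrink the standard error.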

LINEAR REGRESSION: CASE EXAMPLE

Multiple Linear Regression

• The idea is to model sales (Y) as a linear function of multiple "features" (x1, x2, …, xk), plus some random deviations (ε):

      Y = β0 + β1x1 + β2x2 + … + βkxk + ε

• We must estimate the unknown parameters β0, β1, β2, …, βk.


LINEAR REGRESSION: CASE EXAMPLE

Regression Output: Equation

Sales = 65.7 + 49.0 * advertising expenditures + 59.7 * promotions expenditures - 1.8 * competitors' sales

R2 Revisited

R2 has increased from 0.50 to 0.83. In fact, R2 never decreases (and in practice almost always increases) when an additional feature is added. Does this imply that we should keep adding more features?
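The three-feature equation can be applied to the new-region scenario posed earlier ($1.2M advertising, $0.3M promotions, $20M competitors' sales). A short Python sketch, assuming all three inputs and the result are expressed in $ million; the slides do not state the units of the competitors' sales coefficient explicitly, and predict_sales is a hypothetical name.

```python
def predict_sales(advertising, promotions, competitor_sales):
    """Three-feature equation from the regression output (assumed $ million)."""
    return (65.7
            + 49.0 * advertising
            + 59.7 * promotions
            - 1.8 * competitor_sales)


estimate = predict_sales(advertising=1.2, promotions=0.3, competitor_sales=20.0)
print(f"{estimate:.1f}")  # 106.4
```

The negative coefficient on competitors' sales pulls the estimate below the single-feature prediction for the same advertising spend.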

LINEAR REGRESSION: CASE EXAMPLE

Adding an additional feature: Average Annual Snowfall

Regression Output

R2 has increased from 0.83 to 0.86.

Is average annual snowfall really a good predictor of sales?
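The snowfall, promotions, and competitor figures are not reproduced in the slides, so the effect cannot be recomputed exactly. As a hedged stand-in, the Python sketch below adds a column of pure random noise (a hypothetical irrelevant feature) to the advertising-only model and checks that R2 does not go down. The helpers solve and r_squared are illustrative names; they implement ordinary least squares through the normal equations, which is adequate for a tiny example like this one.

```python
import random

sales = [101.8, 44.4, 108.3, 85.1, 77.1, 158.7, 180.4, 64.2,
         74.6, 143.4, 120.6, 69.7, 67.8, 106.7, 119.6]
advertising = [1.3, 0.7, 1.4, 0.5, 0.5, 1.9, 1.2, 0.4,
               0.6, 1.3, 1.6, 1.0, 0.8, 0.6, 1.1]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R2 of a least-squares fit of y on the given columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


random.seed(0)
noise = [random.random() for _ in sales]  # hypothetical feature unrelated to sales

r2_one = r_squared([advertising], sales)
r2_two = r_squared([advertising, noise], sales)
print(f"advertising only: {r2_one:.2f}")
print(f"advertising + noise: {r2_two:.2f}")
print(r2_two >= r2_one - 1e-12)  # small tolerance for floating-point rounding
```

Any gain in R2 from the noise column reflects the fit adapting to chance patterns in 15 data points, which is the overfitting risk the snowfall example raises.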


LINEAR REGRESSION: CASE EXAMPLE

Regression Output

• Our estimate for the impact of one more inch of snow on sales is $0.3M; we are 95% confident that the true value is between -$0.2M and $0.8M.
• Zero is in this confidence interval, which implies that there is a good chance that snow has NO effect on sales.
   ◦ Associated with p-value > 0.05
   ◦ p-values < 0.05 ➔ variable is significant in prediction
• Adding this feature has artificially inflated R2.
   ◦ Example of overfitting
• Variable selection: make sure no confidence intervals contain 0.

LINEAR REGRESSION: SIGNIFICANCE OF P-VALUE

How to Interpret the P-values in Linear Regression Analysis?

• The p-value for each term tests the null hypothesis that the coefficient is equal to zero (no effect).
• A low p-value (< 0.05) indicates that you can reject the null hypothesis.
• In other words, a predictor that has a low p-value is likely to be a meaningful addition to your model because changes in the predictor's value are related to changes in the response variable.
• Conversely, a larger (insignificant) p-value suggests that changes in the predictor are not associated with changes in the response variable.
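The variable-selection rule of thumb (no confidence interval should contain 0) is easy to apply mechanically. A minimal Python sketch using only the two intervals quoted in these slides; note that the advertising interval comes from the single-feature model, the snowfall interval from the model with snowfall added, and the dictionary keys are labels chosen for this illustration.

```python
# 95% confidence intervals quoted in the slides (lower, upper).
intervals = {
    "advertising (single-feature model)": (24.0, 95.4),
    "snowfall (per inch, $ million)": (-0.2, 0.8),
}


def contains_zero(interval):
    """True when 0 lies inside the interval, i.e. 'no effect' is plausible."""
    lower, upper = interval
    return lower <= 0.0 <= upper


flagged = [name for name, ci in intervals.items() if contains_zero(ci)]
print(flagged)  # ['snowfall (per inch, $ million)']
```

A flagged coefficient corresponds to a p-value above 0.05, so snowfall is the candidate to drop.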

LINEAR REGRESSION: CAUTIONS

• Model misspecification
   ◦ May be due to left-out variables
   ◦ May be due to irrelevant variables
   ◦ Functional misspecification: What if the underlying relationship between x and Y is not linear? [The Ramsey Regression Specification Error Test (RESET)]
• Extrapolation
   ◦ Extending the model beyond the domain of available data
• Variable selection
   ◦ Exclude irrelevant variables to avoid overfitting (will result in confidence intervals containing 0)
   ◦ Exclude highly correlated variables (may also result in confidence intervals containing 0)
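The extrapolation caution can be turned into a simple guard before using the fitted line. The Python sketch below checks a planned advertising spend against the range observed in the case table; the $3.0M value is a hypothetical out-of-range example, and within_observed_range is an illustrative helper name.

```python
# Advertising expenditures observed in the case table ($ million).
advertising = [1.3, 0.7, 1.4, 0.5, 0.5, 1.9, 1.2, 0.4,
               0.6, 1.3, 1.6, 1.0, 0.8, 0.6, 1.1]


def within_observed_range(x, observed):
    """True when x lies inside the range of data the model was fitted on."""
    return min(observed) <= x <= max(observed)


print(within_observed_range(1.2, advertising))  # True: $1.2M is inside 0.4 to 1.9
print(within_observed_range(3.0, advertising))  # False: hypothetical $3.0M spend
```

A prediction for a spend outside $0.4M to $1.9M relies on the linear relationship holding where no data was observed.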
