Logistic Regression

INSTITUTE - University School of Business
DEPARTMENT - Management
Master of Business Administration
MARKETING ANALYTICS (23BAT737)
Parveen Abrol
Assistant Professor
Chandigarh University

DISCOVER . LEARN . EMPOWER
Learning Objectives
Introduction to marketing analytics

CO Number   Title                                                  Level
CO4         To draw and evaluate inferences to choose the most     Evaluate
            effective solution to the marketing problem, keeping
            in mind the market dynamics and resources at disposal.

How to measure success?
Logistic Regression vs TGDA
• Two-Group Discriminant Analysis
– Implicitly assumes that the Xs are Multivariate Normally
(MVN) Distributed
– This assumption is violated if Xs are categorical variables
• Logistic Regression does not impose any restriction
on the distribution of the Xs
• Logistic Regression is the recommended approach if
at least some of the Xs are categorical variables
Data

   Favored Stock        Less Favored Stock
   Success   Size       Success   Size
      1        1           0        1
      1        1           0        0
      1        1           0        0
      1        1           0        0
      1        1           0        0
      1        1           0        0
      1        1           0        0
      1        1           0        0
      1        1           0        0
      1        1           0        0
      1        0           0        0
      1        0           0        0
Contingency Table

Type of Stock     Large   Small   Total
Preferred           10      2      12
Not Preferred        1     11      12
Total               11     13      24
Basic Concepts
• Probability
– Probability of being a preferred stock = 12/24 =
0.5
– Probability that a company’s stock is preferred
given that the company is large = 10/11 = 0.909
– Probability that a company’s stock is preferred
given that the company is small = 2/13 = 0.154
Concepts … contd.
• Odds
– Odds of a preferred stock = 12/12 = 1
– Odds of a preferred stock given that the company
is large = 10/1 = 10
– Odds of a preferred stock given that the company
is small = 2/11 = 0.182
Odds and Probability

• Odds(Event) = Prob(Event)/(1-Prob(Event))

• Prob(Event) = Odds(Event)/(1+Odds(Event))
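
These two conversion formulas are easy to sanity-check with a few lines of Python; the sketch below is only illustrative (the helper names odds() and prob() are made up here) and uses the figures from the contingency table:

# Illustrative helpers for converting between odds and probability
def odds(prob_value):
    # Odds(Event) = Prob(Event) / (1 - Prob(Event))
    return prob_value / (1 - prob_value)

def prob(odds_value):
    # Prob(Event) = Odds(Event) / (1 + Odds(Event))
    return odds_value / (1 + odds_value)

print(odds(10 / 11))   # odds of preferred given large -> 10.0
print(odds(2 / 13))    # odds of preferred given small -> 0.1818...
print(prob(10))        # back to a probability         -> 0.909...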
Logistic Regression
• Take Natural Log of the odds:
– ln(odds(Preferred|Large)) = ln(10) = 2.303
– ln(odds(Preferred|Small)) = ln(0.182) = -1.704

• Combining these relationships


– ln(odds(Preferred|Size)) = -1.704 + 4.007*Size
– Log of the odds is a linear function of size
– The coefficient of size can be interpreted like the
coefficient in regression analysis
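
A short arithmetic check of how the intercept and slope follow from the two conditional odds (a sketch using Python's standard math module; b0 and b1 are just local names):

import math

log_odds_large = math.log(10 / 1)    # ln(odds(Preferred | Large)) =  2.303
log_odds_small = math.log(2 / 11)    # ln(odds(Preferred | Small)) = -1.704

# With a single 0/1 predictor, ln(odds) = b0 + b1*Size passes exactly
# through these two points:
b0 = log_odds_small                   # value when Size = 0
b1 = log_odds_large - log_odds_small  # change when Size goes from 0 to 1
print(round(b0, 3), round(b1, 3))     # about -1.705 and 4.007, as on the slide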
Interpretation
• Positive sign → ln(odds) is increasing in the size of the company, i.e. a large company is more likely to have a preferred stock vis-à-vis a small company
• The magnitude of the coefficient gives a measure of how much more likely
General Model
• ln(odds) = β0 + β1X1 + β2X2 + … + βkXk    (1)

• Recall:
  – Odds = p/(1-p)

• ln(p/(1-p)) = β0 + β1X1 + β2X2 + … + βkXk    (2)

• p = e^(β0 + β1X1) / (1 + e^(β0 + β1X1))

• p = 1 / (1 + e^-(β0 + β1X1))
Logistic Function

[Figure: the logistic function, an S-shaped curve of p against X, rising from near 0 to near 1 as X increases from -50 to 50]
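
A minimal sketch of the logistic function itself, assuming NumPy is available; the coefficients b0 and b1 below are illustrative placeholders, not the estimates from the stock example:

import numpy as np

def logistic(x, b0=0.0, b1=0.2):
    # p = 1 / (1 + exp(-(b0 + b1*x)))
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

x = np.linspace(-50, 50, 11)
print(np.round(logistic(x), 3))   # rises from near 0 to near 1 in an S-shape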
Estimation
• In ordinary linear regression, coefficients are estimated by minimizing the sum of squared errors
• Since p is non-linear in the parameters, we need a non-linear estimation technique
  – Maximum-Likelihood Approach
  – Non-Linear Least Squares
Maximum Likelihood Approach
• Conditional on the parameter vector β, write out the probability of observing each observation
• Multiply these probabilities together to get the joint probability (the likelihood) of observing the data, conditional on β
• Find the β that maximizes this likelihood
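
A rough sketch of the approach in Python, assuming NumPy and SciPy; it recodes the Size/Success sample from the earlier Data slide and minimizes the negative log-likelihood numerically:

import numpy as np
from scipy.optimize import minimize

# Size (1 = large) and Success (1 = favored) from the Data slide:
# 10 large favored, 2 small favored, 1 large not favored, 11 small not favored
size    = np.array([1]*10 + [0]*2 + [1]*1 + [0]*11)
success = np.array([1]*12 + [0]*12)

def neg_log_likelihood(beta):
    # Joint probability of the observed data, conditional on beta, in log form
    b0, b1 = beta
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * size)))
    return -np.sum(success * np.log(p) + (1 - success) * np.log(1 - p))

result = minimize(neg_log_likelihood, x0=[0.0, 0.0])
print(result.x)   # roughly [-1.70, 4.01], matching the slide's estimates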
Logistic Regression
• Logistic Regression with one categorical
explanatory variable reduces to an analysis of
the contingency table
Interpretation of Results
Look at the -2 Log L statistic
• Intercept only: 33.271
• Intercept and Covariates: 17.864
• Difference: 15.407 with 1 DF (p = 0.0001)
• The large, highly significant drop means that the size variable explains a great deal
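
The difference in -2 Log L is a likelihood-ratio statistic that is compared with a chi-square distribution; a small sketch, assuming SciPy, reproduces the reported p-value:

from scipy.stats import chi2

neg2logL_intercept_only  = 33.271
neg2logL_with_covariates = 17.864

lr_statistic = neg2logL_intercept_only - neg2logL_with_covariates  # 15.407
p_value = chi2.sf(lr_statistic, df=1)                              # upper-tail probability
print(round(lr_statistic, 3), round(p_value, 4))                   # 15.407, about 0.0001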
Do the Variables Have a Significant Impact?
• Analogous to testing whether the coefficients in a regression model are different from zero
• Look at the output from the Analysis of Maximum Likelihood Estimates
  – Loosely, the Pr > Chi-Square column gives the probability of obtaining the value in the Parameter Estimate column if the true coefficient were zero; if this value is < 0.05, the estimate is considered significant
Other Things to Look For
• Akaike's Information Criterion (AIC) and Schwarz's Criterion (SC): these work like adjusted R², so there is a penalty for having additional covariates
• The larger the improvement from the intercept-only column to the intercept-and-covariates column, the better the model fit
Interpretation of the Parameter Estimates
• ln(p/(1-p)) = -1.705 + 4.007*Size

• p/(1-p) = e^(-1.705) * e^(4.007*Size)

• For a unit increase in size, the odds of being a favored stock go up by e^4.007 = 54.982
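
The odds-ratio interpretation amounts to exponentiating the coefficient, for example:

import math

b_size = 4.007
print(round(math.exp(b_size), 3))   # 54.982: a large firm's odds of being
                                    # favored are about 55 times a small firm's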
Predicted Probabilities and Observed Responses
• The response variable (success) classifies an observation into an event or a no-event
• A concordant pair is a pair of one event and one no-event observation in which the event has the higher predicted probability (PHAT)
• The higher the percentage of concordant pairs, the better the model
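
A minimal illustration of how the concordant-pair percentage could be computed (a hypothetical helper, not the output of any particular package; ties are simply not counted as concordant here):

import numpy as np

def concordant_pct(y, phat):
    # y and phat are NumPy arrays of 0/1 responses and predicted probabilities
    events, nonevents = phat[y == 1], phat[y == 0]
    concordant = sum(e > n for e in events for n in nonevents)
    return 100.0 * concordant / (len(events) * len(nonevents))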
Classification
• For a set of new observations where you have information on size alone
• You can use the model to predict the probability that success = 1, i.e. that the stock is favored
• If PHAT > 0.5, classify success = 1; otherwise success = 0
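
A sketch of this classification rule, using the estimates from the single-predictor model (NumPy assumed; the function name is illustrative):

import numpy as np

def classify(size, b0=-1.705, b1=4.007, cutoff=0.5):
    # Predicted probability from the fitted model, then a 0/1 classification
    phat = 1.0 / (1.0 + np.exp(-(b0 + b1 * np.asarray(size))))
    return (phat > cutoff).astype(int)

print(classify([0, 1]))   # [0 1]: small firms not favored, large firms favored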
Logistic Regression with Multiple Independent Variables

• Independent variables are a mixture of continuous and categorical variables
Data

        Favored Stock                 Less Favored Stock
Success   Size    fp            Success   Size    fp
   1        1    0.58              0        1    2.28
   1        1    2.80              0        0    1.06
   1        1    2.77              0        0    1.08
   1        1    3.50              0        0    0.07
   1        1    2.67              0        0    0.16
   1        1    2.97              0        0    0.70
   1        1    2.18              0        0    0.75
   1        1    3.24              0        0    1.61
   1        1    1.49              0        0    0.34
   1        1    2.19              0        0    1.15
   1        0    2.70              0        0    0.44
   1        0    2.57              0        0    0.86
General Model
• ln(odds) = β0 + β1*Size + β2*FP

• ln(p/(1-p)) = β0 + β1*Size + β2*FP

• p = e^(β0 + β1*Size + β2*FP) / (1 + e^(β0 + β1*Size + β2*FP))

• p = 1 / (1 + e^-(β0 + β1*Size + β2*FP))
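
A sketch of how this two-predictor model might be fitted, assuming the statsmodels package (its standard Logit API) and using the Size/fp sample from the Data slide:

import numpy as np
import statsmodels.api as sm

# Size, fp and Success from the Data slide (favored stocks listed first)
size    = np.array([1,1,1,1,1,1,1,1,1,1,0,0,  1,0,0,0,0,0,0,0,0,0,0,0])
fp      = np.array([0.58,2.8,2.77,3.5,2.67,2.97,2.18,3.24,1.49,2.19,2.7,2.57,
                    2.28,1.06,1.08,0.07,0.16,0.7,0.75,1.61,0.34,1.15,0.44,0.86])
success = np.array([1]*12 + [0]*12)

X = sm.add_constant(np.column_stack([size, fp]))   # intercept, Size, fp
result = sm.Logit(success, X).fit()
print(result.summary())   # coefficients, their p-values and the fit statistics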
Estimation & Interpretation of the Results

• Identical to the case with one categorical variable
Summary
• Logistic Regression or Discriminant Analysis
• Techniques differ in underlying assumptions
about the distribution of the explanatory
(independent) variables
• Use logistic regression if you have a mix of
categorical and continuous variables
References
Text Books:
1. Grigsby, Mike, Marketing Analytics, Kogan Page, 1st Edition, 2015.
2. Winston, Wayne, Marketing Analytics, Wiley, 1st Edition.
3. Venkatesan, Rajkumar; Farris, Paul; Wilcox, Ronald, Cutting Edge Marketing Analytics, FT Press, 1st Edition, 2015.
4. Grigsby, Mike, Advanced Customer Analytics, Kogan Page, 1st Edition, 2016.

Reference Books:
1. Halligan, Brian; Shah, Dharmesh, Inbound Marketing, John Wiley, 1st Edition, 2016.
2. Sauro, Jeff, Customer Analytics for Dummies, John Wiley, 1st Edition, 2017.