Logistic Regression
• Modelling tomorrow
Simple linear regression
Table 1 Age and systolic blood pressure (SBP) among 33 adult women
[Figure: SBP (axis 80 to 220) plotted against Age (years, axis 20 to 90)]
$y = \alpha + \beta_1 x_1$ (slope: $\beta_1$)
• Regression coefficient $\beta_1$
– Measures association between y and x
– Amount by which y changes on average when x changes by one unit
– Least squares method
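The least squares method named above can be sketched in a few lines. This is a minimal illustration in plain Python, assuming a single predictor; the function name `least_squares` and the age/SBP values are hypothetical and are not the data of Table 1.

```python
def least_squares(xs, ys):
    """Return (alpha, beta1) that minimise the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx                 # slope: average change in y per unit of x
    alpha = mean_y - beta1 * mean_x   # intercept
    return alpha, beta1


# Hypothetical ages and SBP values, for illustration only
ages = [30, 40, 50, 60, 70]
sbp = [120, 128, 140, 149, 158]
alpha, beta1 = least_squares(ages, sbp)
print(alpha, beta1)
```

With these made-up values the fitted slope is the average increase in SBP per additional year of age.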
Multiple linear regression
$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_i x_i$
• Example
– SBP versus age, weight, height, etc
Multiple linear regression
$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_i x_i$
Model Outcome
and
– dichotomous variable Y
• Linear regression?
Dot-plot: Data from Table 2
Logistic regression (2)
[Figure: Diseased % (axis 0 to 100) plotted against Age (years)]
The logistic function (1)
[Figure: Probability of disease (axis 0.0 to 1.0) as a function of x]
The logistic function (2)
$\ln\left[\frac{P(y|x)}{1 - P(y|x)}\right] = \alpha + \beta x$ (the left-hand side is the logit of $P(y|x)$)
The logistic function (3)
$\ln\frac{P}{1-P} = \alpha + \beta x$
$\frac{P}{1-P} = e^{\alpha + \beta x}$
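The relationship between the probability P and its log-odds can be checked numerically. The sketch below assumes the usual form P = 1 / (1 + e^-(α + βx)); the function names and the coefficient values are hypothetical and chosen only for illustration.

```python
import math


def logistic(alpha, beta, x):
    """Probability P implied by the log-odds alpha + beta * x."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def logit(p):
    """Log-odds ln(P / (1 - P)) of a probability P."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients, for illustration only
alpha, beta = -2.0, 0.5
p = logistic(alpha, beta, 3.0)
# Applying the logit to P recovers alpha + beta * x
print(p, logit(p))
```

Whatever the value of x, the probability stays between 0 and 1, while the logit is linear in x.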
Interpretation of β (1)
$\frac{P}{1-P} = e^{\alpha + \beta x}$
Interpretation of β (2)
$\chi^2 = \frac{\beta^2}{\mathrm{Variance}(\beta)}$ (1 df)
• Interval testing
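A minimal sketch of the test statistic β² / Variance(β) with 1 df follows. The coefficient and standard error are hypothetical. The confidence interval part assumes that "interval testing" means checking whether an interval of β ± 1.96 SE excludes 0, which is one common reading and not something the slide spells out.

```python
import math


def wald_test(beta, se):
    """Chi-square statistic (1 df) and p-value for H0: beta = 0."""
    chi2 = beta ** 2 / se ** 2          # beta^2 / Variance(beta)
    # Upper tail of a chi-square with 1 df, via the complementary error function
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def confidence_interval(beta, se, z=1.96):
    """Approximate 95% interval for beta (assumed normal approximation)."""
    return beta - z * se, beta + z * se


# Hypothetical estimate and standard error, for illustration only
chi2, p_value = wald_test(1.0, 0.5)
low, high = confidence_interval(1.0, 0.5)
print(chi2, p_value, low, high)
```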
Example
$\ln\frac{P}{1-P} = \alpha + \beta_1\,\mathrm{Age} = -0.841 + 2.094\,\mathrm{Age}$
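The fitted equation can be turned into an odds ratio and predicted probabilities. The sketch below uses the two coefficients of the example (-0.841 and 2.094) and assumes Age enters the model as a variable taking the values 0 and 1; that coding is an assumption, as are the function and variable names.

```python
import math

ALPHA = -0.841   # intercept from the example
BETA_AGE = 2.094  # coefficient of Age from the example


def probability(age):
    """P implied by ln(P / (1 - P)) = ALPHA + BETA_AGE * age."""
    return 1.0 / (1.0 + math.exp(-(ALPHA + BETA_AGE * age)))


# Odds ratio for a one-unit increase in Age
odds_ratio = math.exp(BETA_AGE)
# Predicted probabilities at Age = 0 and Age = 1 (assumed coding)
p0 = probability(0)
p1 = probability(1)
print(odds_ratio, p0, p1)
```

Under these assumptions the odds ratio is roughly 8.1, and the predicted probability rises from about 0.30 at Age = 0 to about 0.78 at Age = 1.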
Fitting equation to the data
• Iterative computing
– Choice of an arbitrary value for the coefficients (usually 0)
– Computing of log-likelihood
– Variation of coefficients’ values
– Reiteration until maximisation (plateau)
• Results
– Maximum Likelihood Estimates (MLE) for α and β
– Estimates of P(y) for a given value of x
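The iterative procedure above can be sketched as follows. The slides do not say which update rule is used; this example assumes Newton-Raphson steps for a model with one predictor, starts both coefficients at 0, and stops when the log-likelihood reaches a plateau. The data (10 unexposed with 3 diseased, 10 exposed with 7 diseased) and all names are hypothetical.

```python
import math


def prob(alpha, beta, x):
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def log_likelihood(alpha, beta, xs, ys):
    total = 0.0
    for x, y in zip(xs, ys):
        p = prob(alpha, beta, x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def fit_logistic(xs, ys, tol=1e-10, max_iter=50):
    alpha, beta = 0.0, 0.0                      # arbitrary starting values
    current = log_likelihood(alpha, beta, xs, ys)
    for _ in range(max_iter):
        previous = current
        # Gradient (g) and information matrix (h) of the log-likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = prob(alpha, beta, x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton-Raphson step: solve the 2x2 system h * delta = g
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
        current = log_likelihood(alpha, beta, xs, ys)
        if abs(current - previous) < tol:       # plateau reached
            break
    return alpha, beta, current


# Hypothetical data: x = 0 (3 of 10 diseased), x = 1 (7 of 10 diseased)
xs = [0] * 10 + [1] * 10
ys = [1] * 3 + [0] * 7 + [1] * 7 + [0] * 3
alpha, beta, loglik = fit_logistic(xs, ys)
print(alpha, beta, loglik)
```

With a single binary predictor the estimates can be checked by hand: α is the log-odds of disease among the unexposed and β is the log of the odds ratio.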
Multiple logistic regression
$\ln\frac{P}{1-P} = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_i x_i$
• Interpretation of $\beta_i$
– Increase in log-odds for a one-unit increase in $x_i$, with all the
other $x_i$ held constant
– Measures the association between $x_i$ and the log-odds, adjusted for
all other $x_i$
Multiple logistic regression
• Effect modification
– Can be modelled by including interaction terms
$\ln\frac{P}{1-P} = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
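To see why an interaction term models effect modification, note that the odds ratio for x1 is no longer a single number. The sketch below assumes the interaction term is the product of x1 and x2, so the odds ratio for a one-unit increase in x1 is e^(β1 + β3·x2). The coefficients and function names are hypothetical.

```python
import math


def log_odds(alpha, b1, b2, b3, x1, x2):
    """Log-odds under a model with an x1 * x2 interaction term."""
    return alpha + b1 * x1 + b2 * x2 + b3 * x1 * x2


def odds_ratio_x1(b1, b3, x2):
    """Odds ratio for a one-unit increase in x1 at a fixed level of x2."""
    return math.exp(b1 + b3 * x2)


# Hypothetical coefficients, for illustration only
alpha, b1, b2, b3 = -1.0, 0.5, 0.3, 0.4
or_when_x2_is_0 = odds_ratio_x1(b1, b3, 0)
or_when_x2_is_1 = odds_ratio_x1(b1, b3, 1)
print(or_when_x2_is_0, or_when_x2_is_1)
```

When β3 is 0 the two odds ratios coincide, which corresponds to no effect modification.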
Statistical testing
• Question
– Does model including a given independent variable
provide more information about dependent variable than
model without this variable?
• Three tests
– Likelihood ratio statistic (LRS)
– Wald test
– Score test
Likelihood ratio statistic
• LR statistic
$-2\ln\left(\frac{\text{likelihood model 2}}{\text{likelihood model 1}}\right) = \left[-2\ln(\text{likelihood model 2})\right] - \left[-2\ln(\text{likelihood model 1})\right]$
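The LR statistic can be computed directly from the two log-likelihoods. This sketch assumes that model 2 is the model without the variable being tested and model 1 the model with it, and that the two models differ by a single parameter, so the statistic is compared with a chi-square distribution with 1 df. The log-likelihood values and names are hypothetical.

```python
import math


def likelihood_ratio_statistic(loglik_model2, loglik_model1):
    """-2 log(likelihood model 2) minus -2 log(likelihood model 1)."""
    return (-2.0 * loglik_model2) - (-2.0 * loglik_model1)


def p_value_1df(statistic):
    """Upper tail of a chi-square with 1 df (assumes one extra parameter)."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical log-likelihoods, for illustration only
lrs = likelihood_ratio_statistic(-130.0, -127.5)
p_value = p_value_1df(lrs)
print(lrs, p_value)
```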
$\ln\frac{P}{1-P} = \alpha + \beta_1\,\mathrm{Exc} + \beta_2\,\mathrm{Smk}$
$= 0.7102 + 1.0047\,\mathrm{Exc} + 0.7005\,\mathrm{Smk}$
(SE 0.2614) (SE 0.2664)
$\ln\frac{P}{1-P} = \alpha + \beta_1\,\mathrm{Exc} + \beta_2\,\mathrm{Smk} + \beta_3\,\mathrm{Smk} \times \mathrm{Exc}$
• 198 observations
• Low Birth Weight [LBW]
– 1= Birth weight < 2500g
– 0= Birth weight >= 2500g
• Age of mother in years
• Weight of mother in pounds [LWT]
• Race (1,2,3)
• Number of doctor’s visits in last trimester [FTV]
Example 2: Risk of death from bacterial
meningitis according to treatment
• 161 observations
• Death (0,1)
• Treatment
– 1=Chloramphenicol, 2=Ampicillin
• Delay before treatment (onset, in days)
• Convulsions (1,0)
• Level of consciousness (1-3)
• Severity of dehydration (1-3)
• Age in years
• Pathogen
– 1 Others, 2 HiB, 3 Streptococcus pneumoniae