Evaluating the Proportional Hazards (PH) Assumption
Week 10
Jerry D.T. Purnomo, Ph.D.
Outline
Graphical techniques
Goodness of Fit
Graphical Techniques (1/2)
The most popular graphical technique compares
estimated –ln(–ln) survival curves over different
(combinations of) categories of the variables being
investigated.
If the –ln(–ln) S curves are roughly parallel, the PH
assumption is supported for those variables.
Problem (1/3)
Problems with the log–log survival curve approach:
How parallel is parallel?
Recommend:
subjective decision
conservative strategy: assume PH is OK unless there is
strong evidence of nonparallelism
How to categorize a continuous variable?
many categories: data “thins out”
different categorizations may give different graphical
pictures
Recommend:
small number of categories (2 or 3)
meaningful choice
reasonable balance across categories
Problem (2/3)
How to evaluate several variables simultaneously?
Strategy:
categorize variables separately
form combinations of categories
compare log–log curves on same graph
Drawback:
data “thins out”
difficult to identify variables responsible for
nonparallelism
Alternative Strategy: Adjust for predictors already
satisfying PH assumption, i.e., use adjusted log–log
survival curves
Problem (3/3)
Alternative Strategy: Adjust for predictors
already satisfying PH assumption, i.e., use
adjusted log–log survival curves.
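A minimal sketch of adjusted log–log curves in R, using the ovarian data from the survival package. The choice of rx as the variable under investigation and age as the predictor adjusted for is an assumption made only for illustration. Note that `fun = "cloglog"` plots ln(–ln S) against log time, the mirror image of –ln(–ln S); parallelism is judged the same way.

```r
library(survival)

# ovarian is lazy-loaded with the survival package.
# Stratify on the variable being checked (rx) and adjust for age.
# Both choices are illustrative assumptions.
fit_strat <- coxph(Surv(futime, fustat) ~ age + strata(rx), data = ovarian)

# One adjusted survival curve per stratum, evaluated at the mean age
adj_curves <- survfit(fit_strat)

# fun = "cloglog" draws log(-log(S)) against log(time)
plot(adj_curves, fun = "cloglog", lty = 1:2,
     xlab = "Time in days (log scale)", ylab = "ln(-ln S(t))")
legend("topleft", legend = names(adj_curves$strata), lty = 1:2)
```

Roughly parallel curves would suggest that PH is reasonable for rx after adjusting for age.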
Data Example
> library(survival)
> data(ovarian)   # ovarian cancer data
The data consist of the following variables:
futime: survival or censoring time (in days)
fustat: censoring status, 0 = censored, 1 = dead
age: age in years
resid.ds: residual disease present (1 = no, 2 = yes)
rx: treatment group, 1 = treatment A, 2 = treatment B
ecog.ps: ECOG performance status (1 is better)
Cox PH Model (1/2)
Hypotheses for the overall test:
H0: β1 = β2 = β3 = β4 = 0
H1: at least one of these β’s is nonzero
From the coxph output, the likelihood ratio test (LRT)
gives p-value = 0.001896, which is lower than α = 0.05
(all three test statistics, likelihood ratio, Wald, and
score, lead to the same conclusion). H0 is therefore
rejected: at least one of these β’s is nonzero.
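The overall test can be reproduced along these lines. This is a sketch that assumes the four β’s correspond to age, resid.ds, rx, and ecog.ps, which the slides do not state explicitly.

```r
library(survival)

# Cox PH model with four predictors (assumed to match beta1..beta4)
fit <- coxph(Surv(futime, fustat) ~ age + resid.ds + rx + ecog.ps,
             data = ovarian)
s <- summary(fit)

# Each element holds the test statistic, degrees of freedom, and p-value
s$logtest    # likelihood ratio test
s$waldtest   # Wald test
s$sctest     # score (log-rank) test
```

`print(s)` shows the same three tests at the bottom of the usual coefficient table.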
Cox PH Model (2/2)
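–ln(–ln) Survival Curves (1/2)
log #1:
\[ \ln S(t, X) = \exp\left( \sum_{i=1}^{p} \beta_i X_i \right) \ln S_0(t), \qquad 0 \le S(t, X) \le 1 \]
log #2:
\[ \ln\left[ -\ln S(t, X) \right] = \sum_{i=1}^{p} \beta_i X_i + \ln\left[ -\ln S_0(t) \right] \]
or
\[ -\ln\left[ -\ln S(t, X) \right] = -\sum_{i=1}^{p} \beta_i X_i - \ln\left[ -\ln S_0(t) \right] \]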
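One way to see why parallel curves support PH, sketched under the Cox model for two covariate profiles $X^{(1)}$ and $X^{(2)}$ (notation introduced here only for illustration): subtracting the two –ln(–ln) curves cancels the baseline term.

```latex
\begin{align*}
-\ln\left[-\ln S(t, X^{(1)})\right] - \left(-\ln\left[-\ln S(t, X^{(2)})\right]\right)
  &= -\sum_{i=1}^{p} \beta_i X_i^{(1)} + \sum_{i=1}^{p} \beta_i X_i^{(2)} \\
  &= \sum_{i=1}^{p} \beta_i \left( X_i^{(2)} - X_i^{(1)} \right)
\end{align*}
```

The vertical distance between the two curves does not depend on t, which is what "parallel" means in these plots.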
–ln(–ln) Survival Curves (2/2)
Figure 1. KM survival curve vs -ln(-ln) survival curve
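A figure of this kind could be drawn roughly as follows. The grouping by rx is an assumption, since the slide does not say which variable Figure 1 uses. The –ln(–ln) transform is computed by hand at the event times.

```r
library(survival)

# Kaplan-Meier curves by treatment group (grouping variable is assumed)
km <- survfit(Surv(futime, fustat) ~ rx, data = ovarian)

s <- summary(km)                 # estimates at the observed event times
loglog <- -log(-log(s$surv))     # -ln(-ln S(t))
ok <- is.finite(loglog)          # guard against S(t) equal to 0 or 1
groups <- levels(s$strata)

op <- par(mfrow = c(1, 2))
plot(km, lty = 1:2, xlab = "Days", ylab = "S(t)", main = "KM")
plot(range(s$time[ok]), range(loglog[ok]), type = "n",
     xlab = "Days", ylab = "-ln(-ln S(t))", main = "-ln(-ln) S")
for (g in seq_along(groups)) {
  keep <- ok & s$strata == groups[g]
  lines(s$time[keep], loglog[keep], type = "s", lty = g)
}
par(op)
```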
Log Rank Test
Hypothesis:
H0: S1(t) = S2(t) for all t
H1: S1(t) ≠ S2(t) for at least one t
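A short sketch of the log-rank test with `survdiff`, comparing the two treatment groups. Using rx as the grouping variable is an assumption for illustration.

```r
library(survival)

# Log-rank test of H0: S1(t) = S2(t) for the two treatment groups
lr <- survdiff(Surv(futime, fustat) ~ rx, data = ovarian)
print(lr)

# p-value from the chi-square statistic with (number of groups - 1) df
p_value <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)
p_value
```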
Graphical Techniques (2/2)
Other commonly used graphical techniques are based on
Cox-Snell residuals and martingale residuals.
Cox-Snell Residual (1/3)
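The ith Cox-Snell residual is defined as
\[ r_{Ci} = \hat{H}_0(t_i) \exp\left( x_i^T \hat{\beta} \right) \]
where $\hat{H}_0(t_i)$ and $\hat{\beta}$ are the estimates of the
baseline cumulative hazard function and the coefficient
vector, respectively.
The Cox-Snell residuals are most useful for
examining the overall fit of a model.
If the model fits, the estimated cumulative hazard of the
residuals, plotted against the residuals, should follow a
straight line through the origin with slope one.
This plot is generally used only as a rough
diagnostic.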
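A sketch of the Cox-Snell plot in R. It uses the identity r_Ci = δ_i − (martingale residual) to obtain the residuals, and assumes the four-predictor model on the ovarian data; the model behind Figure 2 is not stated in the slides.

```r
library(survival)

fit <- coxph(Surv(futime, fustat) ~ age + resid.ds + rx + ecog.ps,
             data = ovarian)

# Cox-Snell residual = event indicator - martingale residual
mart <- residuals(fit, type = "martingale")
cox_snell <- ovarian$fustat - mart

# Treat the residuals as censored survival times and estimate
# their cumulative hazard (Nelson-Aalen)
cs_fit <- survfit(Surv(cox_snell, ovarian$fustat) ~ 1)

plot(cs_fit$time, cs_fit$cumhaz, type = "s",
     xlab = "Cox-Snell residual",
     ylab = "Estimated cumulative hazard")
abline(0, 1, lty = 2)   # reference line: intercept 0, slope 1
```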
Cox-Snell Residual (2/3)
Figure 2. Cox-Snell residual of ovarian data
Cox-Snell Residual (3/3)
The final model gives a reasonable fit to the data.
Overall, the residuals fall on a straight line with
intercept zero and slope one.
Further, there are no large departures from the
straight line and no large variation in the right-hand
tail.
Martingale Residual (1/3)
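The ith martingale residual is defined as
\[ \hat{M}_i = \delta_i - r_{Ci} \]
where $\delta_i$ is the event indicator (1 = event, 0 = censored)
and $r_{Ci}$ is the ith Cox-Snell residual.
The $\hat{M}_i$ take values in $(-\infty, 1]$ and are never
positive for censored observations.
They are used to check the linearity assumption for a
covariate (its functional form).
It is common practice in many medical studies to
discretize continuous covariates. The martingale
residuals are useful for determining possible cut
points for such variables.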
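A sketch of a functional-form check for age. The approach shown, fitting the model without age and smoothing the martingale residuals against age, is one common variant and an assumption here; the slides do not say how Figure 3 was produced.

```r
library(survival)

# Fit the model without the covariate whose form is being checked
fit_no_age <- coxph(Surv(futime, fustat) ~ resid.ds + rx + ecog.ps,
                    data = ovarian)
mart <- residuals(fit_no_age, type = "martingale")

# Plot residuals against age with a lowess smooth;
# a roughly straight smooth suggests a linear term is adequate
plot(ovarian$age, mart, xlab = "Age (years)",
     ylab = "Martingale residual")
lines(lowess(ovarian$age, mart), lty = 2)
```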
Martingale Residual (2/3)
Figure 3. Martingale residual of ovarian data
19
Martingale Residual (3/3)
In the plot of the martingale residuals (Figure 3),
there appears to be only a slight bump for age
between 52 and 65. Moreover, the lines before
and after the bump nearly coincide. Therefore, a
linear form seems appropriate for age.
20
The GOF Testing Approach (1/2)
Statistical test appealing
Provides p-value
More objective decision than when using graphical
approach
Test of Harrell and Lee (1986)
Variation of test of Schoenfeld
Uses Schoenfeld residuals
Schoenfeld residuals defined for
Each predictor in model
Every subject who has event
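For reference, the Schoenfeld residual for predictor k and a subject i who has an event at time t_i can be written as below. The symbols $R(t_i)$ (the risk set at $t_i$) and $\hat{p}_j$ are notation introduced here, not taken from the slides.

```latex
\[
  r_{ik} = x_{ik} - \sum_{j \in R(t_i)} x_{jk}\, \hat{p}_j ,
  \qquad
  \hat{p}_j = \frac{\exp\left( x_j^T \hat{\beta} \right)}
                   {\sum_{l \in R(t_i)} \exp\left( x_l^T \hat{\beta} \right)}
\]
```

That is, the residual is the observed covariate value minus its expected value over the subjects still at risk. Under PH these residuals should show no trend in time.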
21
The GOF Testing Approach (2/2)
p-value large ⇒ PH satisfied
(e.g., P > 0.10)
p-value small ⇒ PH not satisfied
(e.g., P < 0.05)
22
GOF (Test PH)
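A sketch of a PH test in R with `cox.zph`. Note that `cox.zph` implements the Grambsch and Therneau test based on scaled Schoenfeld residuals, which is related to but not the same as the Harrell and Lee test; the four-predictor model is assumed.

```r
library(survival)

fit <- coxph(Surv(futime, fustat) ~ age + resid.ds + rx + ecog.ps,
             data = ovarian)

# One test per predictor plus a GLOBAL test;
# a large p-value gives no evidence against PH
zph <- cox.zph(fit)
print(zph)
```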
23
Schoenfeld Residuals
Figure 4. Schoenfeld residual of ovarian data
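A plot like Figure 4 can be sketched by plotting the `cox.zph` object, which shows the scaled Schoenfeld residuals against (transformed) time with a smooth for each predictor. The four-predictor model is again an assumption.

```r
library(survival)

fit <- coxph(Surv(futime, fustat) ~ age + resid.ds + rx + ecog.ps,
             data = ovarian)
zph <- cox.zph(fit)

# One panel per predictor; a roughly horizontal smooth supports PH
op <- par(mfrow = c(2, 2))
plot(zph)
par(op)
```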
24
Backward Method
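A sketch of backward elimination using `step`, which drops terms by AIC. This is an assumption: the slides do not say which criterion the backward method used (it may have been p-value based), so the selected model may differ from the one reported.

```r
library(survival)

full <- coxph(Surv(futime, fustat) ~ age + resid.ds + rx + ecog.ps,
              data = ovarian)

# AIC-based backward elimination (criterion is an assumption)
best <- step(full, direction = "backward", trace = 0)
print(best)
```

The PH assumption should then be rechecked on the selected model, for example with `cox.zph(best)`.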
25
Cox PH (Best Model)
26
Test PH (Best Model)
Figure 5. Schoenfeld residual of best model