Retention Model
Peter S. Fader is the Frances and Pei-Yuan Chia Professor of Marketing at the Wharton School of the
University of Pennsylvania (address: 749 Huntsman Hall, 3730 Walnut Street, Philadelphia, PA 19104-6340; phone: 215.898.1132; email: [email protected]; web: www.petefader.com). Bruce G. S.
Hardie is Associate Professor of Marketing, London Business School (email: [email protected];
web: www.brucehardie.com). The authors thank Michael Berry and Gordon Linoff for providing the
data used in this paper, and Naufel Vilcassim for his helpful comments. The second author acknowledges
the support of the London Business School Centre for Marketing and the hospitality of the Department
of Marketing at the University of Auckland Business School.
Abstract
A Simple Probability Model for Projecting Customer Retention
At the heart of any contractual or subscription-oriented business model is the notion of the
retention rate. An important managerial task is to take a series of past retention numbers for
a given group of customers and project them into the future in order to make more accurate
predictions about customer tenure, lifetime value, and so on. In this paper we reanalyze data
from a leading book on data mining (Berry and Linoff 2004), whose authors drew the dire
conclusion that parametric approaches do not work for such a task. As an alternative to common
curve-fitting regression models, we develop and demonstrate a probability model with a
well-grounded story for the churn process. We show that our basic model (known as a
shifted-beta-geometric) can be implemented in a simple Microsoft Excel spreadsheet and provides
remarkably accurate forecasts and other useful diagnostics about customer retention. We provide
a detailed appendix covering the implementation details and offer additional pointers to other
related models.
Keywords: retention, churn, forecasting, customer base analysis, probability models, beta-geometric
Introduction
\[ S(t) = \prod_{i=1}^{t} r_i , \tag{1} \]
which implies
\[ r_t = \frac{S(t)}{S(t-1)} . \tag{2} \]
Several quantities of managerial interest can easily be calculated directly from the survivor
function. For example, the expected (or average) tenure of a customer is simply the area under
the survivor function. In a discrete-time setting, this is computed as
\[ \text{expected tenure} = \sum_{t=0}^{\infty} S(t) . \]
In light of (1), the standard textbook expression for (expected) customer lifetime value (CLV)
in a contractual setting that (correctly) reflects the phenomenon of nonconstant retention rates,
\[ E(CLV) = \sum_{t=0}^{\infty} m \Bigl( \prod_{i=1}^{t} r_i \Bigr) \Bigl( \frac{1}{1+d} \Bigr)^{t} , \]
can be written as
\[ E(CLV) = \sum_{t=0}^{\infty} \frac{m \, S(t)}{(1+d)^{t}} . \]

1 This is in contrast to a noncontractual setting, a defining characteristic of which is that the departure of a
customer is not observed by the firm. See Section 4 for a discussion of the implications of this characteristic.
2 Strictly speaking, we should talk of retention and churn probabilities, not rates.
The survival data presented in Table 1 are for two segments of customers (Regular and High
End) for an unspecified subscription-type business. These data are presented in graphical form
in Berry and Linoff (2004, Chapter 12). The High End data are used by Berry and Linoff in
their examination of parametric approaches to the projection of the survivor function.
Table 1: % survived, by year and segment

Year    Regular    High End
 0      100.0%     100.0%
 1       63.1%      86.9%
 2       46.8%      74.3%
 3       38.2%      65.3%
 4       32.6%      59.3%
 5       28.9%      55.1%
 6       26.2%      51.7%
 7       24.1%      49.1%
 8       22.3%      46.8%
 9       20.7%      44.5%
10       19.4%      42.7%
11       18.3%      40.9%
12       17.3%      39.4%
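The identities in (1) and (2), together with the tenure and CLV sums, can be illustrated with the High End column of Table 1. The following is a minimal Python sketch, not part of the original analysis: the function names are hypothetical, the values m = 100 and d = 0.10 are assumed purely for illustration, and both sums stop at year 12, so they understate the corresponding infinite-horizon quantities.

```python
# Survivor function S(0), ..., S(12) for the High End segment (Table 1).
high_end = [1.000, 0.869, 0.743, 0.653, 0.593, 0.551, 0.517,
            0.491, 0.468, 0.445, 0.427, 0.409, 0.394]


def retention_rates(survivor):
    """Equation (2): r_t = S(t) / S(t-1) for t = 1, 2, ..."""
    return [survivor[t] / survivor[t - 1] for t in range(1, len(survivor))]


def survivor_from_rates(rates):
    """Equation (1): S(t) is the running product of r_1..r_t, with S(0) = 1."""
    survivor = [1.0]
    for r in rates:
        survivor.append(survivor[-1] * r)
    return survivor


def truncated_tenure(survivor):
    """Sum of S(t) over the observed years only (understates expected tenure)."""
    return sum(survivor)


def truncated_clv(survivor, m, d):
    """Sum of m * S(t) / (1 + d)**t over the observed years only."""
    return sum(m * s / (1.0 + d) ** t for t, s in enumerate(survivor))


rates = retention_rates(high_end)
rebuilt = survivor_from_rates(rates)  # recovers the original survivor function

print([round(r, 3) for r in rates])
print(round(truncated_tenure(high_end), 2))
# m = 100 and d = 0.10 are hypothetical values chosen for illustration.
print(round(truncated_clv(high_end, m=100.0, d=0.10), 2))
```

Because the sums are cut off at the end of the observation window, the customers still active in year 12 contribute nothing beyond that point; projecting the survivor function further out is what removes this limitation.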
Linear: y = 0.925 - 0.071t (R² = 0.922)
Quadratic: R² = 0.998
Exponential: R² = 0.963
where y is the proportion of customers surviving at least t years. These equations are then used
to extrapolate the survivor function out to year 12; Figure 1 re-creates the plot presented in
Berry and Linoff's sidebar (p. 393).
[Figure 1 plot: % Survived against Tenure (years); series: Actual, Linear, Quadratic, Exponential]
Figure 1: Actual versus model-based estimates of the percentage of High End customers
surviving at least 0–12 years
The fit of all three models up to and including year 7 is reasonable, and the quadratic model
provides a particularly good fit. But when we consider the projections beyond the model
calibration period, all three models break down dramatically. The linear and exponential models
underestimate year 12 survival by 81% and 30%, respectively, while the quadratic model
overestimates year 12 survival by 92%. Furthermore, the models lack logical consistency: the linear
model would have S(t) < 0 after year 14, and according to the quadratic model the survival will
start to increase over time, which is not possible. It is therefore not surprising that Berry and
Linoff conclude that parametric curves do not work for the task of projecting the survivor
function over time.

3 In the models run by Berry and Linoff, time is indexed 1, 2, . . . , 8, but in order to maintain consistency with
the definitions of S(t) discussed earlier (specifically S(0) = 1), we reindex time to 0, 1, . . . , 7. This has no impact
at all on the fit or forecasting performance of any of the models.
Repeating this analysis for the Regular segment yields the following equations:
Linear: y = 0.773 - 0.092t (R² = 0.776)
Quadratic: R² = 0.960
Exponential: R² = 0.915
and the corresponding fits and projections are reported in Figure 2. The projections associated
with the linear and quadratic models are terrible and illogical once again. The exponential
model doesn't appear to be very bad in the figure, but in fact it underestimates year 12 survival
by 54%. This is not an acceptable range of error.
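The linear fits reported for the two segments can be checked with a few lines of ordinary least squares. The sketch below is illustrative only: it assumes the regressions were run on the year 0 to year 7 survival proportions from Table 1, and the helper name `ols_line` is hypothetical. It then extrapolates each line to year 12 and compares the projection with the observed value.

```python
def ols_line(ts, ys):
    """Ordinary least squares fit of y = a + b*t; returns (a, b)."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    sxx = sum((t - t_bar) ** 2 for t in ts)
    b = sxy / sxx
    return y_bar - b * t_bar, b


years = list(range(8))  # calibration period: years 0..7
calibration = {
    "High End": [1.000, 0.869, 0.743, 0.653, 0.593, 0.551, 0.517, 0.491],
    "Regular": [1.000, 0.631, 0.468, 0.382, 0.326, 0.289, 0.262, 0.241],
}
actual_year_12 = {"High End": 0.394, "Regular": 0.173}

fits = {}
for name, ys in calibration.items():
    a, b = ols_line(years, ys)
    projected = a + b * 12  # extrapolate the fitted line to year 12
    error = (projected - actual_year_12[name]) / actual_year_12[name]
    fits[name] = (a, b, projected, error)
    print(f"{name}: y = {a:.3f} + ({b:.3f})t, "
          f"year-12 projection = {projected:.3f}, relative error = {error:.0%}")
```

A straight line has no mechanism for keeping the projected proportion between 0 and 1, which is why the extrapolations can fall below zero.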
[Figure 2 plot: % Survived against Tenure (years), 0 to 12; series: Actual, Linear, Quadratic, Exponential]
a cohort of customers beyond the range of observations (see, for instance, Hardie, Fader, and
Wisniewski (1998) for the case of new product sales forecasting). With this in mind, the next
section sees us formulating a probabilistic model of contract duration that is based on a simple
story of customer behavior.
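\[ P(T = t \mid \theta) = \theta (1-\theta)^{t-1} , \quad t = 1, 2, 3, \ldots \tag{3} \]
\[ S(t \mid \theta) = (1-\theta)^{t} , \quad t = 1, 2, 3, \ldots \tag{4} \]
\[ f(\theta \mid \alpha, \beta) = \frac{\theta^{\alpha-1} (1-\theta)^{\beta-1}}{B(\alpha, \beta)} , \]
\[ P(T = t \mid \alpha, \beta) = \frac{B(\alpha+1, \beta+t-1)}{B(\alpha, \beta)} \tag{5} \]
\[ S(t \mid \alpha, \beta) = \frac{B(\alpha, \beta+t)}{B(\alpha, \beta)} \tag{6} \]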
[Figure 3 plot: four panels, each drawn over the interval 0.0 to 1.0]
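\[ P(T = t) = \begin{cases} \dfrac{\alpha}{\alpha+\beta} & t = 1 \\[1ex] \dfrac{\beta+t-2}{\alpha+\beta+t-1} \, P(T = t-1) & t = 2, 3, \ldots \end{cases} \tag{7} \]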
Recall from (2) that the retention rate is the ratio of sequential values of the survivor function.
Substituting (6) into (2) and simplifying gives us the following expression for the (aggregate)
retention rate associated with the sBG model:
\[ r_t = \frac{\beta+t-1}{\alpha+\beta+t-1} . \tag{8} \]
(See Appendix A for details of the derivation.) Given (8), we can compute S(t) without having
to deal with a beta function by using the expression given in (1).
We immediately see that, under the sBG model, the retention rate is an increasing function of
time, even though the underlying (unobserved) individual-level retention probability is constant.
According to this model, there are no underlying time dynamics at the level of the individual
customer; the observed phenomenon of retention rates increasing over time is simply due to
heterogeneity (i.e., the high churn customers drop out early in the observation period, with the
remaining customers having lower churn probabilities). This well-known ruse of heterogeneity
(Vaupel and Yashin 1985) is often overlooked by those attempting to make sense of various
aggregate patterns of customer behavior.
We fit the sBG model to the first seven years of the data presented in Table 1. For the
High End segment, \(\hat{\alpha} = 0.668\), \(\hat{\beta} = 3.806\); for the Regular segment,
\(\hat{\alpha} = 0.704\), \(\hat{\beta} = 1.182\). (See
Appendix B for details of how to estimate the model parameters in the familiar Microsoft Excel
environment.) Using these parameter estimates, we extrapolate the survivor function for each
segment out to year 12. These model-based numbers are plotted in Figure 4, along with the
corresponding empirical survivor functions. The resulting predictions are almost too good to
be true; the sBG model underestimates year 12 survival by only 4% and 2% for the High End
and Regular segments, respectively. Even though this model is no more complicated than the
regression models discussed earlier, its carefully constructed story makes it possible to tease
out, and therefore accurately project, the critical behavioral components.
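The projection step can be sketched in Python using only (8) and (1), so no beta function is needed. This is an illustrative sketch rather than the spreadsheet implementation: the function names are hypothetical, and the parameter values are the estimates reported in Appendix B (High End) and in the text (Regular).

```python
def sbg_retention_rate(t, alpha, beta):
    """Equation (8): aggregate retention rate in period t (t = 1, 2, ...)."""
    return (beta + t - 1) / (alpha + beta + t - 1)


def sbg_survivor(horizon, alpha, beta):
    """Equation (1): S(0)..S(horizon) as a running product of retention rates."""
    survivor = [1.0]
    for t in range(1, horizon + 1):
        survivor.append(survivor[-1] * sbg_retention_rate(t, alpha, beta))
    return survivor


# (alpha, beta) estimates and the observed year-12 survival from Table 1.
segments = {
    "High End": (0.668, 3.806, 0.394),
    "Regular": (0.704, 1.182, 0.173),
}

projections = {}
for name, (alpha, beta, actual) in segments.items():
    s12 = sbg_survivor(12, alpha, beta)[12]
    projections[name] = s12
    print(f"{name}: projected S(12) = {s12:.3f}, actual = {actual:.3f}, "
          f"relative error = {(s12 - actual) / actual:+.1%}")
```

Each retention rate returned by `sbg_retention_rate` is larger than the one before it, which reproduces the increasing aggregate retention pattern without any individual-level dynamics.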
Another plot of interest shows the (aggregate) retention rate as a function of tenure. The
model-based retention rate numbers (as computed using (8)) are plotted in Figure 5, along with
the corresponding observed retention rates as computed from the empirical survivor functions.
[Figure 4 plot: % Survived against Tenure (years) for the High End and Regular segments; series: Actual, Model]
[Figure 5 plot: Retention Rate against Tenure (years); series: Actual, Model]
Figure 5: Actual versus model-based estimates of retention rates by tenure for the High
End and Regular segments.
For both segments we note that the retention rates are an increasing function of the length of
a customer's relationship with the firm. The important point to emphasize, once again, is that
the sBG story assumes that these apparent dynamics are simply a result of heterogeneity;
any given individual has a constant (but unknown) retention probability \(1 - \theta\). Unlike the
conventional wisdom about customer retention, it is not a story of individual customers becoming
increasingly loyal as they develop a deeper relationship with the firm, etc.
As a final demonstration of the usefulness of the sBG model, we show and contrast the mixing
distributions that characterize how the churn probabilities (\(\theta\)) differ across the individuals in
each segment. In Figure 6 we see that both distributions are reverse J-shaped. This implies
that, within each group, most customers have fairly low churn probabilities, but there is a
sizeable sub-segment within each one that will tend to depart very quickly. These patterns
suggest that there is a fairly high degree of heterogeneity within each segment, and therefore a
model that doesn't take these cross-customer differences into account will not perform very well,
particularly in terms of out-of-sample forecasting. Closer examination shows that the overall
weight of the distribution for the Regular group is shifted slightly to the right compared to the
High End distribution. This reflects the fact that the Regular group has a higher mean churn
probability (\(E(\Theta) = \alpha/(\alpha + \beta) = 0.37\)) compared to that of the High End group (\(E(\Theta) = 0.15\)).
It should be clear from Figures 4 and 5 that this kind of difference in the means exists, but this
plot provides a better idea about the nature of these differences at a more fine-grained level.
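The two mean churn probabilities, and the reverse J shape of the mixing distributions, can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the density is the standard beta density evaluated with the standard-library `math.lgamma`, and the High End parameters are the values reported in Appendix B.

```python
import math


def beta_mean(alpha, beta):
    """Mean of a beta distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


def beta_pdf(theta, alpha, beta):
    """Beta density at 0 < theta < 1, with B(alpha, beta) computed on the log scale."""
    log_b = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    log_f = (alpha - 1) * math.log(theta) + (beta - 1) * math.log(1 - theta) - log_b
    return math.exp(log_f)


segments = {"High End": (0.668, 3.806), "Regular": (0.704, 1.182)}
grid = (0.05, 0.25, 0.50, 0.75, 0.95)

for name, (alpha, beta) in segments.items():
    densities = [round(beta_pdf(x, alpha, beta), 2) for x in grid]
    print(name, "mean churn:", round(beta_mean(alpha, beta), 2),
          "density on grid:", densities)
```

With \(\alpha < 1\) and \(\beta > 1\) in both segments, the density falls steadily as \(\theta\) increases, which is the reverse J shape described above.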
[Figure 6 plot: f(\(\theta\)) against \(\theta\) from 0.00 to 1.00; series: High End, Regular]
Figure 6: Estimated distributions of churn probabilities for the High End and Regular
segments
Discussion
We have presented the shifted-beta-geometric (sBG) distribution as a model for the duration
of customer relationships in a contractual setting. This easy-to-implement model enables us to
project an empirical survivor function beyond the observed time horizon. As a result, we need
not suffer the truncation problem associated with computing expected tenure or customer
lifetime value using just the observed survival data; that is, underestimating the quantity of
interest by ignoring the remaining lifetime of those customers who are still active at the end of
the observation period.
Strictly speaking, the sBG model is only applicable in discrete-time contractual settings.
In situations where a customer can become inactive at any point in time (rather than only at
discrete contract renewal times), it may be more appropriate to use the model's continuous-time
analog, the exponential-gamma (EG) distribution (also known as the Lomax distribution
or the Pareto distribution of the second kind). This model assumes that the duration of an
individual customer's relationship with the firm is characterized by the exponential distribution,
and that heterogeneity in departure rates is captured by a gamma distribution (Hardie et al.
1998; Morrison and Schmittlein 1980).
Both the sBG and EG models are based on the assumption that the commonly observed
phenomenon of increasing retention rates is due entirely to heterogeneity; individual-customer-level
retention rates are assumed to be constant. If we wish to allow for the possibility of time
dynamics at the level of the individual customer, we can no longer characterize the duration
of an individual's relationship with the firm using either the shifted-geometric or exponential
distribution, both of which have the memoryless property (i.e., the probability of survival to
s + t, given survival to t, is the same as the initial probability of survival to s).
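For the shifted-geometric case, the memoryless property follows in one line from the individual-level survivor function \(S(t \mid \theta) = (1-\theta)^{t}\). The short derivation below is added as a worked illustration; it uses only that survivor function and the definition of conditional probability.

```latex
\[
P(T > s + t \mid T > t)
  = \frac{S(s+t \mid \theta)}{S(t \mid \theta)}
  = \frac{(1-\theta)^{s+t}}{(1-\theta)^{t}}
  = (1-\theta)^{s}
  = S(s \mid \theta) .
\]
```

The conditional survival probability does not depend on t, so under this model an individual customer's churn propensity is the same in every period.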
A natural extension would be to assume that individual lifetimes can be characterized by the
Weibull distribution, which allows for an individual's risk of cancelling his contract to increase or
decrease as the length of the relationship with the firm increases. In a discrete-time contractual
setting, this leads to the beta-discrete-Weibull (BdW) model (Fader and Hardie 2005), which is
a generalization of the sBG model, while in a continuous-time contractual setting, this leads to a
generalization of the EG model, the Weibull-gamma (WG) model (Hardie et al. 1998; Morrison
and Schmittlein 1980; Schweidel et al. 2005).
The latter paper cited above shows how to add additional bells and whistles to the retention
model, including marketing mix effects, time-varying covariates (such as seasonality), and
cross-cohort differences. The key is to bring in all of these factors at the right level, i.e., at the level
of the latent parameter of interest (in this case \(\theta\)) instead of just jamming different covariate
effects into a regression-like model. Furthermore, we strongly recommend adherence to Occam's
Razor (often expressed as "entities should not be multiplied unnecessarily"), an implication of
which is that one should not make any more assumptions than the minimum needed. An appeal
to simplicity is particularly important when the managerial focus is more on projection than
detailed explanation.
We recognize that the model presented in this paper (along with the extensions discussed
above) applies only in a contractual setting, a defining characteristic of which is that the departure
of a customer is observed. This is in contrast to a noncontractual setting, a defining characteristic
of which is that the departure (or "death") of a customer is not observed by the firm. In a
noncontractual setting, the time at which a customer becomes inactive, and the likelihood that
it has occurred at all, must be inferred from the transaction history. This creates many challenges
that make the model-building process a tougher task than for the present (contractual) case.
Such models are developed by Fader et al. (2005a, 2005b) and Schmittlein et al. (1987). But in
those situations, the notion of customer retention is not a meaningful concept anyway, since the
company never owns the customer, or has any kind of formal relationship that would require
renewal. It is important, therefore, for managers to use the word "retention" more carefully than
current practice often shows. But when retention is indeed a relevant notion, as it is for the
dataset used here, then the proposed model and surrounding discussion should be among the
first analytical considerations undertaken by management in their desire to better understand
and project the retention patterns they have observed.
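\[ B(\alpha, \beta) = \int_{0}^{1} t^{\alpha-1} (1-t)^{\beta-1} \, dt \tag{A1} \]
\[ B(\alpha, \beta) = \frac{\Gamma(\alpha) \, \Gamma(\beta)}{\Gamma(\alpha+\beta)} \]
For the purposes of this paper, the only thing we need to know about the gamma function is its
so-called recursive property: \(\Gamma(x) = (x-1)\,\Gamma(x-1)\).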
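We derive the expression for the sBG probability mass function in the following manner.
If \(\theta\) were known, the probability of dropping out in period t would simply be the geometric
probability \(\theta(1-\theta)^{t-1}\). But since \(\theta\) is unobserved, P(T = t) for a randomly-chosen individual is
the expected value of the shifted-geometric probability of dropping out in period t (conditional
on \(\Theta = \theta\)), where the expectation is with respect to the beta distribution for \(\Theta\):
\[
\begin{aligned}
P(T = t \mid \alpha, \beta) &= \int_{0}^{1} P(T = t \mid \Theta = \theta) \, f(\theta \mid \alpha, \beta) \, d\theta \\
&= \int_{0}^{1} \theta (1-\theta)^{t-1} \, \frac{\theta^{\alpha-1} (1-\theta)^{\beta-1}}{B(\alpha, \beta)} \, d\theta
\end{aligned}
\]
which, combining terms and moving all non-\(\theta\) elements to the left of the integral sign,
\[ = \frac{1}{B(\alpha, \beta)} \int_{0}^{1} \theta^{\alpha} (1-\theta)^{\beta+t-2} \, d\theta . \]
Looking closely at the integral, we see that it is simply the integral expression for the beta
function with parameters \(\alpha + 1\) and \(\beta + t - 1\), so that
\[ P(T = t \mid \alpha, \beta) = \frac{B(\alpha+1, \beta+t-1)}{B(\alpha, \beta)} . \]
(The expression for the sBG survivor function is derived in exactly the same manner.)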
The forward-recursion formula used to compute sBG probabilities is derived in the following
manner. We first note that
\[ P(T = 1 \mid \alpha, \beta) = \frac{B(\alpha+1, \beta)}{B(\alpha, \beta)} \]
which, expressing the beta functions in terms of gamma functions and cancelling terms,
\[ = \frac{\Gamma(\alpha+1) \, \Gamma(\alpha+\beta)}{\Gamma(\alpha) \, \Gamma(\alpha+\beta+1)} = \frac{\alpha}{\alpha+\beta} . \]
But how does this help us compute P(T = t) for t = 2, 3, . . .? Reflecting on the identity
\[ P(T = t) = \frac{P(T = t)}{P(T = t-1)} \, P(T = t-1) , \]
if we have a simple expression for the ratio P(T = t)/P(T = t - 1), we can easily compute
P(T = 2) given the value of \(P(T = 1) = \alpha/(\alpha + \beta)\). Given the value of P(T = 2), we can then
compute P(T = 3), and so on.
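Recalling the sBG pmf, we have
\[
\frac{P(T = t)}{P(T = t-1)}
  = \frac{B(\alpha+1, \beta+t-1) / B(\alpha, \beta)}{B(\alpha+1, \beta+t-2) / B(\alpha, \beta)}
  = \frac{B(\alpha+1, \beta+t-1)}{B(\alpha+1, \beta+t-2)}
\]
which, expressing the beta functions in terms of gamma functions and cancelling terms,
\[
= \frac{\Gamma(\beta+t-1) \, \Gamma(\alpha+\beta+t-1)}{\Gamma(\beta+t-2) \, \Gamma(\alpha+\beta+t)}
= \frac{\beta+t-2}{\alpha+\beta+t-1} .
\]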
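The forward recursion can be checked against the closed-form expression \(B(\alpha+1, \beta+t-1)/B(\alpha, \beta)\). The sketch below is an illustration with hypothetical function names; it evaluates the beta function through the standard-library `math.lgamma`, and uses \(\alpha = \beta = 1\), the starting values of the Appendix B worksheet, so the output can be compared with the P(T=t) column shown there.

```python
import math


def log_beta(a, b):
    """log B(a, b), using B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def sbg_pmf_direct(t, alpha, beta):
    """P(T = t) = B(alpha + 1, beta + t - 1) / B(alpha, beta)."""
    return math.exp(log_beta(alpha + 1, beta + t - 1) - log_beta(alpha, beta))


def sbg_pmf_recursive(horizon, alpha, beta):
    """P(T = 1)..P(T = horizon) via the forward recursion."""
    probs = [alpha / (alpha + beta)]  # P(T = 1)
    for t in range(2, horizon + 1):
        probs.append(probs[-1] * (beta + t - 2) / (alpha + beta + t - 1))
    return probs


alpha, beta = 1.0, 1.0  # worksheet starting values, not estimates
recursive = sbg_pmf_recursive(7, alpha, beta)
direct = [sbg_pmf_direct(t, alpha, beta) for t in range(1, 8)]

for t, (p_rec, p_dir) in enumerate(zip(recursive, direct), start=1):
    print(t, round(p_rec, 3), round(p_dir, 3))
```

The recursion needs only multiplication and division, which is what makes the model practical in a spreadsheet.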
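\[ r_t = \frac{S(t)}{S(t-1)} = \frac{B(\alpha, \beta+t)}{B(\alpha, \beta+t-1)} \]
which, expressing the beta functions in terms of gamma functions and cancelling terms,
\[
= \frac{\Gamma(\beta+t) \, \Gamma(\alpha+\beta+t-1)}{\Gamma(\beta+t-1) \, \Gamma(\alpha+\beta+t)}
= \frac{\beta+t-1}{\alpha+\beta+t-1} .
\]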
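\[
P(\text{data} \mid \alpha, \beta) = P(T = 1 \mid \alpha, \beta)^{n_1} \, P(T = 2 \mid \alpha, \beta)^{n_2} \cdots P(T = 7 \mid \alpha, \beta)^{n_7} \, S(7 \mid \alpha, \beta)^{\, n - \sum_{t=1}^{7} n_t}
\tag{B1}
\]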
However, we do not know the values of \(\alpha\) and \(\beta\), even though we believe that the data come
from the sBG distribution.
The idea of maximum likelihood estimation is to ask what values of the model parameters
maximize the probability (or, more formally, the likelihood) of the observed data. We define the
likelihood function as
\[
\begin{aligned}
L(\alpha, \beta \mid \text{data}) = {} & P(T = 1 \mid \alpha, \beta)^{n_1} \, P(T = 2 \mid \alpha, \beta)^{n_2} \, P(T = 3 \mid \alpha, \beta)^{n_3} \\
& \times P(T = 4 \mid \alpha, \beta)^{n_4} \, P(T = 5 \mid \alpha, \beta)^{n_5} \, P(T = 6 \mid \alpha, \beta)^{n_6} \\
& \times P(T = 7 \mid \alpha, \beta)^{n_7} \, S(7 \mid \alpha, \beta)^{\, n - \sum_{t=1}^{7} n_t}
\end{aligned}
\tag{B2}
\]
and use numerical optimization methods (e.g., the Solver add-in in Excel) to find the values of
\(\alpha\) and \(\beta\) that maximize this function; these are called the maximum likelihood estimates of the
model parameters (see footnote 4). As the number computed using (B2) will be very small, we usually work
with the natural logarithm of the likelihood function, the so-called log-likelihood function:
\[
LL(\alpha, \beta \mid \text{data}) = \sum_{t=1}^{7} n_t \ln P(T = t \mid \alpha, \beta) + \Bigl( n - \sum_{t=1}^{7} n_t \Bigr) \ln S(7 \mid \alpha, \beta) .
\tag{B3}
\]
The observant reader will note that we do not actually know \(n, n_1, n_2, \ldots, n_7\) for the two datasets
given in Table 1; the data are expressed as percentages of the initial number of customers.
Looking closely at (B3), we see that this is not a problem; we can simply factor out n (e.g., \(n_1\)
becomes \(n_1/n\), the proportion of customers who become inactive in the first period). While this
will affect the height of the function, the location of the maximum (i.e., the values of \(\alpha\) and \(\beta\))
will be unaffected.
So our task is to code up this expression for the model log-likelihood function in an Excel
worksheet and find maximum likelihood estimates of \(\alpha\) and \(\beta\) by using Solver to find the values of
\(\alpha\) and \(\beta\) that maximize the value of this function. The relevant worksheet is shown in Figure B1
and is constructed in the following manner.
- In order to enter expressions for \(P(T = t \mid \alpha, \beta)\) without an error message appearing (e.g.,
  #NUM! or #DIV/0!), we need some starting values for \(\alpha\) and \(\beta\). The exact values do not
  matter, provided they are within the defined bounds, so we start with 1.0 for \(\alpha\) and \(\beta\),
  locating these parameter values in cells B1:B2, respectively.

4 We note that (B1) and (B2) look almost identical, but there is a subtle difference: in (B1), the probability
we compute is a function of the data pattern for fixed model parameters, while in (B2), we already have the data
and the probability we compute is a function of the model parameters.
[Figure B1: worksheet with starting values of 1.0 for both parameters]

      A        B         C        D          E        F
 1    alpha    1.000
 2    beta     1.000
 3    LL       -2.116
 4
 5    t        P(T=t)    S(t)     % alive    % die
 6    1        0.500     0.500    86.9%      13.1%    -0.091
 7    2        0.167     0.333    74.3%      12.6%    -0.226
 8    3        0.083     0.250    65.3%       9.0%    -0.224
 9    4        0.050     0.200    59.3%       6.0%    -0.180
10    5        0.033     0.167    55.1%       4.2%    -0.143
11    6        0.024     0.143    51.7%       3.4%    -0.127
12    7        0.018     0.125    49.1%       2.6%    -0.105
13                                                    -1.021
- As the proportion of customers who dropped out in year 1 is simply one minus the
  proportion of customers who are still active at the end of the first year, we enter
  =1-D6 in cell E6.
- For t > 1, the proportion of customers who dropped out in year t is the proportion
  of customers who are still active at the end of year t - 1 minus the proportion of
  customers who are still active at the end of year t. We therefore enter =D6-D7 in
  cell E7 and copy it to E8:E12.
- The first seven elements of the log-likelihood function are computed in cells F6:F12: we
  enter =E6*LN(B6) in cell F6 and copy it to F7:F12.
- The final element of the log-likelihood function, that associated with those customers who
  have survived at least seven years, is entered as =D12*LN(C12) in cell F13.
- The sum of cells F6:F13 is entered in cell B3; this is the value of the log-likelihood function
  given the values for the two model parameters in cells B1:B2. (With starting values of 1.0
  for both parameters, LL = -2.116.)
We find the maximum likelihood estimates of the two model parameters by maximizing the
log-likelihood function. We do this using the Excel add-in Solver, available under the Tools
menu. The target cell is the value of the log-likelihood, cell B3. We wish to maximize this by
changing cells B1:B2. The constraints we place on the parameters are that \(\alpha\) and \(\beta\) are greater
than 0. As Solver only offers us a "greater than or equal to" constraint, we add the constraint
that cells B1:B2 are greater than or equal to a small positive number (e.g., 0.0001); see Figure B2.
Clicking the Solve button, Solver converges to a solution where the maximum value of the
log-likelihood function is -1.611, associated with \(\hat{\alpha} = 0.668\) and \(\hat{\beta} = 3.806\). These are the
maximum likelihood estimates of the model parameters. (So as to be sure that we have actually
reached the maximum of the log-likelihood function, it is good practice to redo the optimization
process using a completely different set of starting values. For example, using starting values of
0.01 and 0.01 (for which LL = -2.742), use Solver to find the maximum of the log-likelihood
function. Are the corresponding values of the two model parameters equal to those given above?
They should be!)
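For readers working outside Excel, the worksheet logic can be mirrored in a short Python sketch. Everything here is an assumption made for illustration: the function names are hypothetical, and a simple compass (pattern) search stands in for Solver, whose own algorithm is not reproduced. The log-likelihood follows (B3) with n factored out, applied to the High End data from Table 1, and the search keeps both parameters at or above 0.0001, as in the Solver constraint.

```python
import math

# Table 1, High End segment: proportion surviving at least t years, t = 0..7.
alive = [1.000, 0.869, 0.743, 0.653, 0.593, 0.551, 0.517, 0.491]
# Proportion dropping out in year t, t = 1..7 (worksheet column E).
died = [alive[t - 1] - alive[t] for t in range(1, 8)]


def log_likelihood(alpha, beta):
    """Equation (B3) with n factored out, so counts become proportions."""
    ll = 0.0
    p = alpha / (alpha + beta)  # P(T = 1)
    survival = 1.0              # S(0)
    for t in range(1, 8):
        if t > 1:
            p *= (beta + t - 2) / (alpha + beta + t - 1)      # recursion (7)
        survival *= (beta + t - 1) / (alpha + beta + t - 1)   # retention rate (8)
        ll += died[t - 1] * math.log(p)
    return ll + alive[7] * math.log(survival)


def compass_search(f, start, step=0.5, tol=1e-7, lower=1e-4):
    """Maximize f(alpha, beta) by axis-aligned trial moves, halving the step
    whenever no move improves. A stand-in for Solver, not its algorithm."""
    x = list(start)
    best = f(*x)
    while step > tol:
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                if trial[i] < lower:
                    continue  # respect the positivity constraint
                value = f(*trial)
                if value > best:
                    x, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return x[0], x[1], best


print(round(log_likelihood(1.0, 1.0), 3))   # value at the worksheet starting point
for start in ((1.0, 1.0), (0.01, 0.01)):    # two different sets of starting values
    a_hat, b_hat, ll_hat = compass_search(log_likelihood, start)
    print(start, round(a_hat, 3), round(b_hat, 3), round(ll_hat, 3))
```

Running the search from two different starting points follows the good-practice check described above; both runs should finish at essentially the same log-likelihood value.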
References
Berry, Michael J. A. and Gordon S. Linoff (2004), Data Mining Techniques: For Marketing,
Sales, and Customer Relationship Management, 2nd edition, Indianapolis, IN: Wiley Publishing,
Inc.
Buchanan, Bruce and Donald G. Morrison (1988), "A Stochastic Model of List Falloff with
Implications for Repeat Mailings," Journal of Direct Marketing, 2 (Summer), 7–15.
Fader, Peter S. and Bruce G. S. Hardie (2005), "Accommodating Individual-level Dynamics in
a Discrete Lifetime Distribution," unpublished working paper.
Fader, Peter S., Bruce G. S. Hardie, and Ka Lok Lee (2005a), "'Counting Your Customers'
the Easy Way: An Alternative to the Pareto/NBD Model," Marketing Science, 24 (Spring),
275–284.
Fader, Peter S., Bruce G. S. Hardie, and Ka Lok Lee (2005b), "RFM and CLV: Using Iso-value
Curves for Customer Base Analysis," Journal of Marketing Research, 42 (November).
Hardie, Bruce G. S., Peter S. Fader, and Michael Wisniewski (1998), "An Empirical Comparison
of New Product Trial Forecasting Models," Journal of Forecasting, 17 (June–July), 209–229.
Morrison, Donald G. and David C. Schmittlein (1980), "Jobs, Strikes, and Wars: Probability
Models for Duration," Organizational Behavior and Human Performance, 25 (April), 224–251.
Schmittlein, David C., Donald G. Morrison, and Richard Colombo (1987), "Counting Your
Customers: Who They Are and What Will They Do Next?" Management Science, 33 (January),
1–24.
Schweidel, David A., Peter S. Fader, and Eric T. Bradlow (2005), "Modeling Retention in
and Across Cohorts," http://ssrn.com/abstract=742884.
Vaupel, James W. and Anatoli I. Yashin (1985), "Heterogeneity's Ruses: Some Surprising Effects
of Selection on Population Dynamics," The American Statistician, 39 (August), 176–185.
Weinberg, Clarice Ring and Beth C. Gladen (1986), "The Beta-Geometric Distribution Applied
to Comparative Fecundability Studies," Biometrics, 42 (September), 547–560.