A thesis presented
by
to
Economics
Harvard University
Cambridge, Massachusetts
December 1996
Copyright © 1996 by Pia Nandini Malaney
All rights reserved.
Abstract
The first part of this thesis looks at issues in index number theory. By
using techniques developed in differential geometry, it is shown that the so-
called index number problem can be resolved by the development of a special
economic derivative operator constructed for this purpose. This derivative
is shown to give rise to a unique differential geometric index number which
is then demonstrated to equal the Divisia index. It is then shown in the
second chapter (co-authored with Eric Weinstein) that when placed under
similar assumptions, this data based index equals the Konus index, previously
asserted to be incalculable in the absence of a knowledge of preference maps.
The third chapter deals with rural to urban migration. Based on the risk
neutral Todaro model, it has been suggested that raising rural incomes will
have the effect of controlling migration to urban labour markets. However, it
is shown in this chapter that in the case of decreasing absolute risk aversion
it is possible for increasing income to have the perverse effect of increasing
the desirability of migration.
Contents
Acknowledgments
Introduction
2.3 Implicit Assumptions Underlying a Cost-of-Living Index
2.3.1 Changing Preferences and Psychological Neutrality
2.4 The Konus Index under Changing Preferences
2.5 Conclusions
List of Figures
Acknowledgments
I would like to begin by thanking my advisor Eric Maskin for guidance and
assistance in the writing of this thesis. This dissertation has benefited from
his extraordinary breadth, precision and intellectual flexibility in guiding me
through such non-standard approaches as the application of geometry to
welfare considerations.
I would like also to thank Peter Timmer for his generous advice and
encouragement throughout this process. In particular, I took continuous
inspiration from his ability to focus my analysis on the core economic points,
even amidst the most seemingly arcane mathematics.
The author benefited from a research assistantship working with Prof.
Amartya Sen during the 1994-1995 academic year. Additionally, I have
found Sen’s discourse on index theory contained in the article “Welfare Basis
of Real Income Comparisons” (cited here as Sen [32]) to be invaluable as
a reference laying bare the otherwise implicit assumptions underlying the
theory of index numbers.
I am grateful to W. Erwin Diewert for his thoughtful input and for
freely sharing his comprehensive knowledge of index theory. Specifically,
I received invaluable information on the state of the literature concerning
changing preferences, chaining arguments, and welfare discussions of the Di-
visia index, from a thorough letter which is cited here as Diewert [10]. Per-
haps the most helpful resource in understanding the background and history
of Index Number theory has proved to be his compilation of articles Essays
in Index Number Theory Vol. I (cited here as Diewert and Nakamura [11]).
Many of the ideas contained in this dissertation are the result of an ongo-
ing discussion, debate, and intellectual interchange with my co-author Eric
Weinstein.
Throughout the course of writing this thesis I received a great deal of
valuable input. In particular, I would like to thank Scott Axelrod, Dick
Goldman, Thomas Hellmann, Stephan Klasen, Thomas Sjostrom
and Isadore Singer, for helpful conversations. It is probable that this list
contains omissions; they are entirely unintentional.
To my parents Esther and Hira Malaney: You always expected the
best from me, and it is only in striving to give it to you that I have made
it this far. I am grateful to have had the opportunity to fulfill some of the
dreams which when placed out of your reach, you put within mine. Thank
you for all the love and support that have enabled me to achieve this.
I would like to dedicate this thesis to my intellectual inspiration and best
friend, Eric: I have learned so much more than differential geometry from
you along the way, and I know the journey continues to hold ever more.
Introduction
The theory of index numbers has been of interest to economists for over a cen-
tury. This interest has been rejuvenated in the 1990s as the U.S. government
grapples with the question of how to construct a true cost of living adjustment
(C.O.L.A.), or how to correct for the problems of the current consumer price
index (CPI). The inconsistencies between the hundreds of available algebraic
index numbers, and their uncertain effects on welfare (when implemented as
C.O.L.A.) have led to extensive debate, and some pessimism about the ability
of economic theory to answer the fundamental questions lying at the core of
this issue.
The first chapter of this thesis takes a new approach to understanding
the question of what the “correct” index number is. We first observe that
all index numbers can be viewed as a combination of two components: an
algebraic formula and a notion of constancy. A general implicit choice has
been made using the ordinary calculus derivative as the notion of constancy.
By using techniques in differential geometry, we construct here an economic
derivative precisely to perform the separation of income and substitution
effects necessary for an economic index. It is shown that when the choice
of derivative is made explicit and the ordinary non-economic derivative is
replaced by the differential geometric ‘economic co-variant derivative’, the
discrepancy between all algebraic formulas for index numbers disappears,
thereby resolving the index number problem. The unique index resulting
from the use of the economic derivative in any of the previously developed
index number formulas is then shown to equal the Divisia index.
The second chapter looks at the welfare implications of the Divisia in-
dex based on the new understanding of its mathematical underpinnings. We
compare the welfare effects of the Divisia index with those of the theoretical
Konus index. While the Konus is referred to as the ‘true’ economic index, it is
asserted to be incalculable in the absence of preference maps. The Divisia index
is, however, entirely data based, and therefore computable. In comparing the
two, however, it is found that there is a major difference in the assumptions
upon which these indexes have been developed. The Konus index makes the
implicit assumption of unchanging preferences. The Divisia makes no such
assumption. It is essential therefore, when comparing the two, to uniformise
the assumptions under which they are studied. It is shown in the first part
of this chapter that if an individual is compensated using the Divisia price
index as a cost of living adjustment, under constant preferences the Divisia
price index is in fact equal to the Konus price index. The Divisia therefore
provides us with a revealed preference method for finding the Konus index,
previously thought to be unknowable in the absence of knowledge on pref-
erence maps. This is in fact an indication that the path dependent nature
of the Divisia C.O.L.A. arises entirely from the failure of the assumption of
unchanging preferences in the real world. In the second part of the chapter
this unrealistic assumption is relaxed, and both indexes are analysed under
the assumption of changing preferences. In a situation of unchanging preferences,
two distinct Konus indexes are defined, the Laspeyres Konus and the
Paasche Konus, depending on whether the indifference curve picked is that
of the base time or the current time. Once preferences are allowed to change,
one must select not only between base and current time indifference curves,
but also base and current time preferences. We therefore define four Konus
price indexes, with the various combinations of base and current indifference
curves, and base and current preferences. It is proved that chaining these
indexes also gives equivalence with the Divisia.
It is therefore shown that when viewed in its natural mathematical set-
ting, the Divisia index not only resolves the consistency issues that have
heretofore plagued index number theory, but also answers the welfare concerns
that have spurred the deep and long-lasting economic interest in the
issue.
The third chapter deals with issues of migration from rural to urban areas.
Since the seminal work of Todaro in the 1960s, there has been much interest
in the decision making process that leads to the migration of labour into
urban areas which often suffer from high unemployment. The conclusion of
the Todaro model was that the discrepancy in incomes between the low wage
rural labour market and the high wage urban market was sufficient that it
was possible for the present discounted value of the expected urban income to
outweigh that of rural income, even with a high probability of unemployment
in the urban market. The policy suggested by this conclusion has been to
reduce the discrepancy in incomes either by raising the level of rural wages or
by lowering the level of urban wages. In this chapter it is shown that these
two approaches may in fact not be equivalent. The Todaro model makes
two implicit assumptions which are unlikely to hold in reality. The first is
that an individual maximises expected income as opposed to expected utility.
This is equivalent to making the assumption of risk neutrality. It has been
shown that in general individuals are risk averse, especially at the low levels
of income being discussed here. The second assumption is that the decision
making unit is the individual. Here, too, much evidence has been accumulated
showing that migration decisions are often made at the level of the family
unit and not solely by the individual. In this chapter we relax these two
unrealistic assumptions. It is proved that when households facing migration
decisions are characterised by decreasing absolute risk aversion, it is in fact
possible for increasing rural wages to lead to increased migration to urban
areas. This ‘perverse’ migration arises because with the increased security of
higher wages for those family workers remaining in the rural labour market,
it is possible for the family to ‘gamble’ more of its members on the high wage-
low probability urban market. It is therefore essential that before using this
policy as a tool for stemming rural to urban migration, one ensures that wages
are not within the range of perversity defined herein.
Chapter 1
1.1 Introduction
“A man with one watch knows what time it is; a man with
two is never sure.” -Proverb
The primary diagnostic measures of an economy are price and quantity in-
dex numbers. The plethora of existing indexes can therefore leave economists
with an inability to evaluate accurately the most basic of our economic con-
cepts. The most widely used index numbers, the Paasche and Laspeyres, can
disagree not only in terms of the magnitude of change but also the direction.
These contradictions in qualitative results (with regards to growth versus
shrinkage or inflation versus deflation) caused by our index number problem
put us as economists in the embarrassing position of being doctors unable to
take a temperature without counseling the patient to get a second opinion.
In the words of W.E. Diewert:
base Laspeyres index, while the implicit deflator for the consump-
tion expenditures component of GNP is a fixed base Paasche In-
dex. Given the importance of consumer price indexes in indexing
wages and cost of living supplements, it is important that official
price indexes be consistent with each other.” -Diewert 1978, p. 268
This points out that in practical terms, the social welfare of those who
depend on cost of living adjustments is affected to a large extent by which of
several gauges is currently in favor. To put it simply, consistency issues are
pre-requisite to welfare analysis.
The fundamental nature of this problem has, over the last century, at-
tracted the attention of many economists. Starting in the late 1800s the quest
for the “true” index number led to the development of hundreds of different
formulas, each offering a different result. The inability to evaluate the rela-
tive validity of any of these formulae led to the development of what is known
as the test approach to index numbers. Since all index numbers agree in the
case of a single good, an attempt was made to formulate plausible properties
for index numbers in the one good case and test whether a proposed index
satisfies them for many goods. There were systematic attempts made by
Walsh, Fisher and other economists to develop tests to isolate the “true”
index number. However the people developing the tests, not coincidentally,
were often the people developing the indexes. Needless to say this lent an air
of relativism to the entire process. In the 1930’s, however, the test approach
showed up an even bigger problem: Frisch proved an impossibility theorem
showing that no bilateral index number could pass three basic tests.¹ This
led to pessimism as to the very existence of the “true” index that had been
sought.
An entirely different approach was developed by Konus in the 1920’s
which has come to be called the ‘economic’ approach to index numbers.
The Konus index, a somewhat theoretical concept, is based upon economic
optimisation under constraints. A Konus cost of living adjustment would
compensate (or penalise) a person by the amount it would cost him to con-
sume the least expensive basket of goods on his original indifference curve.
While this in some sense captures what we as economists would like to think
of as the “true” cost of living adjustment, there are several problems with
this index and its interpretation.
Undoubtedly, the most obvious practical shortcoming of the Konus in-
dex is the general unavailability of information on preference maps. While
this practical concern may limit its computation, there are even more chal-
lenging theoretical problems. One basic flaw is its unnatural assumption of
unchanging preferences, without which the answers it returns can be more
psychological than economic in nature.² Finally we once again are left with
the decision of whether to use base or current year preferences, bringing us
back to the original problem of consistency.
As has been noted by Sen and others, it is of the utmost importance in
work on index theory to clearly identify the scope of inquiry and the implicit
assumptions present. The focus of this paper will be on those price and
quantity indices which are directly computable from time-dependent vectors
p(t), q(t) of price and quantity data for an individual or group between
two time periods t = 0, 1. We will refer to such indices as bilateral temporal
indices. Since the disagreement between these indices can reduce what should
be an objective measurement (viz. of inflation or growth) to a subjective
choice of measurement instrument, the most vexing problem in the field is the
consistency or so-called index number problem. The solution of this problem
may in turn be thought of as the first step in the application of data-based
indexes to the analysis of welfare. It is this problem to which this paper is
addressed. It should be stressed that this is a theoretical paper attempting
to further develop the logical foundations of index number theory.³
¹ Frisch 1930.
² Sen (1979) addresses a broad range of issues underlying the assumption of unchanging
preferences.
³ For previous related results see: Richter 1966, Hulten 1973, Samuelson and Swamy
1974, and Diewert and Nakamura 1993 and the references therein.
It is possible to look at any bilateral index number as a combination of two
ingredients: an algebraic formula and a notion of constancy. We claim that
given such a decomposition there is no inherent problem with the algebraic
formulas of any of the common bilateral index numbers. The problem, when
properly isolated, is seen to arise from a consistent implicit choice of an
incorrect notion of constancy.
As the field of economics shifted in the last century to incorporate developments
in calculus, economists have adopted the standard notion of derivatives:
df = 0, as our notion of constancy. It was only later that mathematicians
themselves refined their concept of derivatives to allow for them
to be adapted to particular problems. The field of differential geometry was
developed around work with these “adapted derivatives” or “connections”.
By using differential geometry to adapt our notion of constancy to fit the
unique problem of index numbers, we are able to eradicate inconsistencies
between all common bilateral index numbers.
Updating the field of index numbers to incorporate these developments in
mathematics in this way enables us to cast age old issues into an entirely new
framework. Just as differential calculus was the framework within which to
think of optimisation, differential geometry is the correct framework within
which to think of index numbers. As with previous mathematical developments,
this framework immediately helps clarify the questions we are asking.
Not only is it able to recover various scattered results from index theory in
a cohesive picture, it enables us to prove new and valuable results.
We give here a plan of this chapter. In section 2 we use a simple toy
example to show explicitly how a seemingly innocuous use of the ordinary
derivative must be altered in order to recover the economics of constant
purchasing power; this guiding analogy is intended to bring out the logical
necessity of the differential geometric approach. In section 3 we will define
the differential geometric set-up and show how it produces a preferred
index which we refer to as the geometric index. A simplified 2-good economy
is constructed in section 4 which exhibits the index number problem
for the Paasche and Laspeyres indexes. We show in detail how the somewhat
abstract geometric set-up is used to carry out an explicit computation
producing agreement between the aforementioned offenders. In section 5 we
show that this framework actually leads to a resolution of the index number
problem for temporal bilateral indices. Section 6 contains a derivation and
discussion of the Divisia index. In section 7 we summarize our results and
discuss the merits of path dependence in the computations of C.O.L.A.s;
we also discuss plans for future extensions. Finally, in Appendix A we give
some of the terminology from differential geometry underlying the previous
discussions.
1.2 Adapted Derivatives and Differing Notions of Constancy
Each index number can be viewed as a combination of two elements:
1. An algebraic formula
2. A notion of constancy.
Let us posit:
$$\frac{\delta(\cdot)}{\delta t} = \frac{d(\cdot)}{dt} + g(t)(\cdot). \tag{1.2}$$
With this ansatz let us solve for g(t):
1.3 The Differential Geometric Index
1.3.1 Notation and Introduction to the Index
Let $V^n$ be a space of n goods and services under consideration. Let $V^*$ be
the space of pricing systems of $V^n$. ($V$ and $V^*$ can be thought of as column
and row vectors respectively.) Then if $q \in V^n$ and $p \in V^*$, we have
Assume $p(t) \cdot q(t) \neq 0$ for all $t$, i.e. our basket is never worthless. Any
combination of goods and services $w \in V$ can be uniquely described by a sum
to translate any particular basket is unavailable. For instance, the U.S. CPI
is a prominent representative example of a chained index whose links are
approximately 10 years in length. Rapid changes in fields such as computers
and the automotive industry mean that several generations of changes may
occur between recalibrations of the reference basket. Consider, for example,
the category referred to by the Bureau of Labor Statistics as “Information
Processing Equipment”; there has been only one recalibration (in 1982-84)
between the last slide rules and the modern laptops of the 1990s. Obviously,
this kind of rough taxonomy is insufficient to deal with rapidly changing
components of the basket.
What the index should require is the ability to allow the composition of
the basket to change, while maintaining constant some notion of value. We
can maintain a “constant value” basket while allowing the composition of
the basket to change by allowing trade-offs in the basket at any time based
on the current market value of the trade-offs.
This can be accomplished mathematically using an adapted or covariant
derivative. The tools we use to develop this are:
1. Projections onto the space of goods $[q(t)]$ and barters $\beta_{p(t)}$, i.e.
$v = \lambda(t) q(t) + b(t)$
Definition 3 We define $\Xi$ to be $V \times V^* - C$.
For most realistic purposes, the baskets of goods under consideration will
have non-zero value. Thus confining ourselves to consideration of only those
baskets and pricing systems represented by elements of $\Xi \subset \mathbb{R}^{2n}$ will not
present a serious restriction.
Given this motivation we give the following definition.
$$\Pi_{[q(t)]}(v) = \left(\frac{p(t) \cdot v}{p(t) \cdot q(t)}\right) q(t) \tag{1.8}$$
$$\Pi_{[p(t)]}(\omega) = \left(\frac{\omega \cdot q(t)}{p(t) \cdot q(t)}\right) \cdot p(t) \tag{1.10}$$
suggests, composites of debts and assets whose total value is identically zero.
The corresponding subspace $\beta_q$ can be defined within $V^*$.
Let us now look at the subspace $[q]$ spanned by the vector $q$ in $V$. The
dimension of $[q]$ will clearly be 1 so long as $q \neq 0$; likewise the dimension
of the subspace $\beta_p$ will be $n - 1$ so long as $p \neq 0$. Therefore we can, in the
generic situation, decompose any other basket $v$ into its projections onto $[q]$
and $\beta_p$ (see Figure 1.2). The $v_{[q]}$ component represents the multiple of $q$ whose
value is equal to that of $v$ in the pricing system $p$. Given the equivalence in
value between $v$ and $v_{[q]}$, there must exist a barter in $\beta_p$ which transforms $v$
into $v_{[q]}$; this is the economic interpretation of $v_{\beta_p}$.
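The decomposition just described can be illustrated numerically. The following Python fragment is a minimal sketch, assuming the projection formula (1.8); the function names and the two-good data are hypothetical and are introduced only for illustration.

```python
def dot(a, b):
    """Value of basket b in pricing system a (the pairing p . q)."""
    return sum(x * y for x, y in zip(a, b))


def project_onto_basket(v, p, q):
    """Projection (1.8): the multiple of q worth exactly as much as v at prices p."""
    pq = dot(p, q)
    if pq == 0:
        # Degenerate case p . q = 0, excluded from consideration in the text.
        raise ValueError("p . q = 0: the decomposition is undefined")
    scale = dot(p, v) / pq
    return [scale * qi for qi in q]


def decompose(v, p, q):
    """Split v into its [q] component and its barter component."""
    v_q = project_onto_basket(v, p, q)
    v_barter = [vi - wi for vi, wi in zip(v, v_q)]
    return v_q, v_barter


# Hypothetical two-good data (not taken from the thesis).
p = [100.0, 92.0]   # pricing system
q = [1.0, 2.0]      # reference basket
v = [3.0, 1.0]      # basket to be decomposed

v_q, v_barter = decompose(v, p, q)
print("component along q:", v_q)
print("barter component:", v_barter)
print("value of barter at prices p:", dot(p, v_barter))
```

The barter component has zero value at the prices p (up to rounding), which is the sense in which it represents a pure exchange.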
The subspaces $[q]$ and $\beta_p$ will, however, not be linearly independent on
those occasions when $p \cdot q = 0$, in which case we will not be able to span
the space $V$ in this way. In order to avoid this degeneracy, we define $C$ to
be all points in $V \times V^*$ such that $p \cdot q = 0$, and remove these points from
consideration.
Figure 1.2: Decomposition of a vector v into components.
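$$f_i : \Xi \to \mathbb{R}. \tag{1.13}$$
We can then assemble these real valued functions into a single vector valued
function
$$\sigma : \Xi \to V \tag{1.14}$$
given by
$$\sigma(q_1, \dots, q_n, p_1, \dots, p_n) = \sum_{i=1}^{n} f_i(q, p)\, e_i. \tag{1.15}$$
$$\nabla^o \sigma = \sum_{i=1}^{n} df_i(q, p)\, e_i \tag{1.16}$$
$$= \sum_{i=1}^{n} \left( \sum_{j=1}^{n} \frac{\partial f_i}{\partial q_j}\, dq_j + \sum_{k=1}^{n} \frac{\partial f_i}{\partial p_k}\, dp_k \right) e_i \tag{1.17}$$
Let $\alpha : \mathbb{R} \to \Xi$ be given by 2n functions $(q_1(t), \dots, q_n(t), p_1(t), \dots, p_n(t))$
with tangent vector $\dot\alpha = \frac{d\alpha}{dt}$; then
$$\nabla^o_{\dot\alpha} \sigma = \sum_{i=1}^{n} \left( \sum_{j=1}^{n} \frac{\partial f_i}{\partial q_j} \frac{dq_j}{dt} + \sum_{k=1}^{n} \frac{\partial f_i}{\partial p_k} \frac{dp_k}{dt} \right) e_i \tag{1.18}$$
This is simply the Jacobian evaluated in the direction of the tangent field
$\dot\alpha = \frac{d\alpha}{dt}$ along $\alpha$.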
Let us now define the derivative uniquely adapted to achieve the appropriate
index number. The non-trivial ‘economic’ covariant derivative of
$\sigma : \Xi \to V$ along $\alpha$ is:
The interested reader may check that $\nabla^a$ satisfies the axioms for a covariant
derivative found in Appendix A.⁴
If $\nabla^a_{\dot\alpha}\sigma = 0$, we will say that $\sigma$ is covariantly constant along $\alpha$ with
respect to the covariant derivative $\nabla^a$. If $\sigma$ is covariantly constant along $\alpha$
with respect to a given covariant derivative $\nabla$ and $\sigma(\alpha(s)) = v$ we will refer
to $\sigma(\alpha(d))$ as the parallel translation of $v$ along $\alpha$ from the ‘source’ time
$t = s$ to the ‘destination’ time $t = d$. We will denote this parallel translate
by $\tau_s(d)$; that is $\tau_s(d)$ is determined as a particular value of a vector valued
function $\sigma$, covariantly constant along a curve. It should be noted that it is
possible to parallel translate using any covariant derivative. If the parallel
translation is done using the ordinary derivative, it will simply give us the
original basket. This corresponds to the traditional notion of constancy. We
will denote the parallel translate of $q(s)$ (or $p(s)$) along $\alpha$ with respect to the
ordinary derivative $\nabla^o$ by $\tau^o$; likewise, the parallel translate of $q(s)$ (or $p(s)$)
along $\alpha$ using the adapted covariant derivative $\nabla^a$ will be written $\tau^a$.
⁴ It should be noted that when the vector valued function $\sigma(x)$ is a multiple of the
quantity vector comprising $x \in \Xi$, the second summand is identically zero; this simplifies
matters considerably and is the major case of interest in this paper. For example, if we
are attempting to track the growth of a basket $q(0)$ from time $t = 0$ to time $t = 1$, the
vector $\sigma(0)$ at the source time $t = 0$ will lie entirely in the subspace cut out by $q(0)$. The
projection of $\sigma$ onto the barter space, $\Pi_{\beta_{p_t}}\sigma$, will therefore be zero. In fact if $\nabla^a_{\dot\alpha}\sigma = 0$
then it will be the case that along $\alpha$ we will have $\Pi_{\beta_{p_t}}\sigma = 0$ for all $t$.
At this point one might ask ‘what is the economic interpretation of the
equation $\nabla^a_{\dot\alpha}\sigma = 0$?’ In order to address this question, it will help to recall
our earlier interpretation of the various projection maps. Let us begin by
making the assumption that $\sigma(\alpha(0)) = q(0)$; while this is certainly not a
generic assumption, all $\sigma$’s considered in this paper will satisfy similar conditions.
In order to induce modernisation or antiquation of the $\sigma(t)$ basket
components while disallowing ‘growth’, we would like to require that $\sigma(t)$
be changing at any given time strictly by barters. This allows the time dependent
basket $\sigma(t)$ to maintain the ‘value’ of the original basket $q(0)$ while
changing its composition as the composition of $q(t)$ changes. This goal is
achieved by selecting the covariant derivative $\nabla^a_{\dot\alpha}\sigma$, for by construction, the
parallel translation equation requires that $\nabla^a_{\dot\alpha}\sigma = \Pi_{[q]}\nabla^o_{\dot\alpha}(\Pi_{[q]}\sigma) = 0$. In
prose, this means that the projection of the “change in $\sigma(t)$” (namely $\nabla^o_{\dot\alpha}$)
onto ‘basket space’ (namely $[q]$), must be zero, thereby confining all change
to lie in barter space. This parallel translation of the basket $q(0)$ from time
$t = 0$ to time $t = 1$ using $\nabla^a_{\dot\alpha}\sigma$ will thereby ensure that at time $t = 1$, $\sigma(t)$
will be some multiple of the basket $q(t)$.
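The barter interpretation can be made concrete with a discrete sketch. We assume here the two-good price and quantity histories of the example in section 4 (the quantities are those of (1.26); the prices are read off from the factors appearing in (1.34)). The function name `translate_by_barters` and the choice of step count are ours, introduced only for illustration; this is a first-order approximation of the parallel translation equation, not the construction used in the text.

```python
import math


def dot(a, b):
    """Value of basket b in pricing system a."""
    return sum(x * y for x, y in zip(a, b))


def q_path(t):
    """Quantity history of the section 4 example, q(t) = (t + 1, -t + 2)."""
    return [t + 1.0, 2.0 - t]


def p_path(t):
    """Price history as it appears in (1.34), p(t) = (-10t + 100, 9t + 92)."""
    return [100.0 - 10.0 * t, 92.0 + 9.0 * t]


def translate_by_barters(p_path, q_path, steps):
    """Carry the basket q(0) to t = 1 through `steps` small barters.

    At each step the current basket lam * q(t0) is exchanged, at the prices
    p(t0), for the multiple of q(t1) of equal value. Returns the multiple
    lam such that the translated basket at t = 1 is lam * q(1).
    """
    lam = 1.0
    for k in range(steps):
        t0 = k / steps
        t1 = (k + 1) / steps
        p = p_path(t0)
        lam *= dot(p, q_path(t0)) / dot(p, q_path(t1))
    return lam


for steps in (1, 10, 100, 10000):
    print(steps, translate_by_barters(p_path, q_path, steps))

# Value that the refined barters approach, from the closed form (1.42) at t = 1.
print("limit:", math.sqrt(284.0 / 281.0))
```

With a single step the multiple is 284/292, the reciprocal of the ordinary Laspeyres quantity index for this example; as the barters are refined the multiple approaches the value given by the adapted derivative.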
We are therefore going to seek a vector valued function $\sigma : \Xi \to V$ such
that for a source time $s$ and a destination time $d$ we have:
1. $\sigma(\alpha(s)) = q(s)$
2. $\nabla^a_{\dot\alpha}\sigma = 0$
The first condition requires that our vector valued function σ agree with
the source-time basket q(s) at the point from which we are parallel trans-
lating. The second condition requires that we parallel translate this basket
using the adapted derivative.
We note that $\sigma(\alpha(t))$ is uniquely determined by a theorem in differential
geometry.⁵
⁵ See Bishop and Goldberg, pp. 225-226.
It is clear that depending on the covariant derivative used, such a solution
will have quite different properties. For the ordinary derivative $\nabla^o$, a solution
will be of the form
$$\sigma(\alpha(t)) = q(s). \tag{1.20}$$
For the adapted derivative $\nabla^a$, however, we will have,
As discussed above, the parallel translate of the basket from the source time
to the destination time using the adapted derivative is a multiple of the
basket at the destination time t. This multiple, Λ, expresses the “size” of the
source time basket relative to the destination-time basket q(t). Therefore
the geometric index is simply the multiple Λ.
We should note at this point that there is a complementary version of
this picture where we consider vector valued functions
$$\tilde\sigma : \Xi \to V^* \tag{1.22}$$
$$\Pi : V^* \to V^*. \tag{1.23}$$
number problem for the Laspeyres and Paasche quantity indices. While the
example has obviously been abstracted, it does exhibit the full complexity of
the paradox that we are trying to resolve.
2. Λ(0) = 1
$$q : [0, 1] \to V \cong \mathbb{R}^2 \tag{1.25}$$
given by
$$q(t) = (t + 1)e_1 + (-t + 2)e_2 \tag{1.26}$$
cuts out a changing 1-dimensional vector sub-space from V (a non-trivial
sub-bundle with 1-dimensional fiber in the geometric language of Appendix
A). Conversely, the price history
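Thus we calculate
$$\nabla^a_{\dot\alpha}\Lambda(t)q(t) = \Pi_{[q(t)]}\left(\frac{d(\Lambda(t)q(t))}{dt}\right) \tag{1.29}$$
$$= \Pi_{[q(t)]}\left(\frac{d\Lambda}{dt}(t)\,q(t) + \Lambda(t)(e_1 - e_2)\right) \tag{1.30}$$
$$= \Pi_{[q(t)]}\left(\frac{d\Lambda}{dt}(t)\bigl((t + 1)e_1 + (-t + 2)e_2\bigr) + \Lambda(t)(e_1 - e_2)\right) \tag{1.31}$$
$$= \Pi_{[q(t)]}\left(\Bigl(\frac{d\Lambda}{dt}(t)(t + 1) + \Lambda(t)\Bigr)e_1 + \Bigl(\frac{d\Lambda}{dt}(t)(-t + 2) - \Lambda(t)\Bigr)e_2\right) \tag{1.32}$$
Recalling that
$$\Pi_{[q(t)]}(v) = \left(\frac{p(t) \cdot v}{p(t) \cdot q(t)}\right) q(t) \tag{1.33}$$
we have
$$\nabla^a_{\dot\alpha}\Lambda(t)q(t) = \frac{\bigl(\frac{d\Lambda}{dt}(t)(t + 1) + \Lambda(t)\bigr)(-10t + 100) + \bigl(\frac{d\Lambda}{dt}(t)(-t + 2) - \Lambda(t)\bigr)(9t + 92)}{(t + 1)(-10t + 100) + (-t + 2)(9t + 92)}\; q(t) \tag{1.34}$$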
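The step from (1.34) to a differential equation for Λ can be sketched as follows. This is our reconstruction from (1.34) alone, obtained by requiring the numerator to vanish; the shorthand $D(t)$ is introduced here for convenience and is an assumption of this sketch rather than notation from the text.

```latex
\begin{align*}
D(t) &:= (t+1)(-10t+100) + (-t+2)(9t+92) = -19t^2 + 16t + 284, \\
0 &= \frac{d\Lambda}{dt}(t)\,D(t) + \Lambda(t)\bigl[(-10t+100) - (9t+92)\bigr]
   = \frac{d\Lambda}{dt}(t)\,D(t) + \Lambda(t)\,(-19t+8), \\
\frac{d}{dt}\ln\Lambda(t) &= -\frac{-19t+8}{D(t)}
   = -\frac{1}{2}\,\frac{D'(t)}{D(t)}
   = -\frac{1}{2}\,\frac{d}{dt}\ln D(t),
\end{align*}
```

where the last line uses $D'(t) = -38t + 16 = 2(-19t + 8)$, and $D(t) > 0$ on $[0, 1]$ since $D(0) = 284$ and $D(1) = 281$ with $D$ concave.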
which indicates that $g(t)$ is of the form $g(t) = -\frac{1}{2} \cdot \ln(-19t^2 + 16t + 284) + c$
where $c$ is a constant to be determined from initial conditions. Thus we have
$$\Lambda(t) = e^{-\frac{1}{2} \cdot \ln(-19t^2 + 16t + 284) + c} = \frac{e^c}{\sqrt{-19t^2 + 16t + 284}} \tag{1.41}$$
which when supplemented by our requirement that $\Lambda(0) = 1$ fixes the
value $c = \ln(\sqrt{284})$ giving our final solution
$$\Lambda(t) = \frac{\sqrt{284}}{\sqrt{-19t^2 + 16t + 284}}. \tag{1.42}$$
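As a numerical sanity check on (1.42), the following Python sketch verifies that the closed form satisfies the parallel translation condition, namely that the value at current prices of the change in Λ(t)q(t) vanishes. The price history is read off from (1.34); the helper names and the finite-difference step are assumptions of this sketch.

```python
import math


def big_d(t):
    """Value p(t) . q(t) of the current basket: -19 t^2 + 16 t + 284."""
    return -19.0 * t * t + 16.0 * t + 284.0


def lam(t):
    """Closed form (1.42) for the scaling factor, with lam(0) = 1."""
    return math.sqrt(284.0) / math.sqrt(big_d(t))


def transport_residual(t, h=1e-6):
    """p(t) . d/dt [lam(t) q(t)], estimated by a central difference.

    This is the numerator of (1.34) up to the finite-difference error,
    so it should be close to zero when lam solves the equation.
    """
    p = [100.0 - 10.0 * t, 92.0 + 9.0 * t]

    def basket(s):
        return [lam(s) * (s + 1.0), lam(s) * (2.0 - s)]

    hi, lo = basket(t + h), basket(t - h)
    return sum(pi * (a - b) / (2.0 * h) for pi, a, b in zip(p, hi, lo))


for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, lam(t), transport_residual(t))
```

The residuals are zero up to the accuracy of the finite difference, while Λ itself varies along the path.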
1. σ̃(t) = Λ̃(t)q(t)
2. Λ̃(1) = 1
1.5 Resolution of the Index Number Problem
As noted previously, the two most commonly used indexes, the Paasche and
Laspeyres, have an inherent discrepancy. Let us start by observing the effects
of updating our notion of constancy for these two indexes. We will later
extend these results to all bilateral indexes.
The Paasche and Laspeyres quantity indexes have the formulas
$$Q_L^{old} = \frac{p(0) \cdot q(1)}{p(0) \cdot q(0)} \qquad Q_P^{old} = \frac{p(1) \cdot q(1)}{p(1) \cdot q(0)} \tag{1.44}$$
In both of these indices, three of the four factors which appear are from a
common time period (the period used to make the comparison) and the odd
factor is imported unchanged across the time separating t = 0 from t = 1.
It has been held constant in the usual sense of the term. This leads us to
suspect that the problem is arising from the use of a derivative, un-adapted
to our situation. We thus rewrite (1.44) in the form
$$Q_L^{old} = Q_L^{new} \qquad Q_P^{old} = Q_P^{new} \tag{1.46}$$
thus resolving the so called index number problem.
Let us verify this for our example from the previous section:
In our example we see that the Laspeyres quantity index
$$\frac{p(0)q(1)}{p(0)q(0)} = \frac{292}{284} \tag{1.48}$$
$$\frac{p(1)q(1)}{p(1)q(0)} = \frac{281}{292} \tag{1.49}$$
1. σ(t) = Λ(t)q(t)
2. Λ(0) = 1
From our calculations in the last section we see that this function is given
by
$$\Lambda(t) = \frac{\sqrt{284}}{\sqrt{-19t^2 + 16t + 284}}. \tag{1.52}$$
enabling us to calculate the corrected Paasche index:
$$\frac{p(1)q(1)}{p(1)\tau_0^a(1)} = \frac{\sqrt{281}}{\sqrt{284}} \tag{1.53}$$
Similarly, to calculate the Laspeyres index we need to parallel translate
back from $t = 1$ to time $t = 0$, which we do by finding the function $\tilde\Lambda : [0, 1] \to \mathbb{R}$
determining $\tilde\sigma : [0, 1] \to V$ with
1. $\tilde\sigma(t) = \tilde\Lambda(t)q(t)$
2. $\tilde\Lambda(1) = 1$
As the Laspeyres and Paasche are actually equal for all t, the agreement with
the Fisher index is immediate.
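The figures of this example can be reproduced with a short Python sketch. The endpoint prices are those implied by (1.34). The backward scaling factor and the form of the corrected Laspeyres index (built by analogy with (1.53)) are our assumptions, obtained from the same closed form with the normalisation placed at t = 1; the function and variable names are illustrative only.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def big_d(t):
    """p(t) . q(t) for the example: -19 t^2 + 16 t + 284."""
    return -19.0 * t * t + 16.0 * t + 284.0


def scaling(t, source):
    """Scaling factor normalised to 1 at the source time (assumed form)."""
    return math.sqrt(big_d(source) / big_d(t))


p0, p1 = [100.0, 92.0], [90.0, 101.0]   # p(0), p(1)
q0, q1 = [1.0, 2.0], [2.0, 1.0]         # q(0), q(1)

# Unadapted indices (1.48), (1.49): they disagree even in direction.
laspeyres_old = dot(p0, q1) / dot(p0, q0)   # 292/284, above 1
paasche_old = dot(p1, q1) / dot(p1, q0)     # 281/292, below 1

# Adapted indices: the imported basket is replaced by its parallel translate.
tau_0_to_1 = [scaling(1.0, 0.0) * x for x in q1]   # q(0) carried to t = 1
tau_1_to_0 = [scaling(0.0, 1.0) * x for x in q0]   # q(1) carried to t = 0
paasche_new = dot(p1, q1) / dot(p1, tau_0_to_1)
laspeyres_new = dot(p0, tau_1_to_0) / dot(p0, q0)
fisher_new = math.sqrt(laspeyres_new * paasche_new)

print("old:", laspeyres_old, paasche_old)
print("new:", laspeyres_new, paasche_new, fisher_new)
```

Under these assumptions the two adapted indices coincide at the value in (1.53), and the Fisher index, being their geometric mean, agrees with both.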
This result is not restricted to the Paasche, Laspeyres and Fisher in-
dexes. What follows is a formal proof of the equivalence of all bilateral index
numbers when the correct adapted derivative is used.
1.5.1 Technical Proof of Equivalence of Bilateral Index Numbers
We take the opportunity below to set notation before going into the proof
of the equivalence of all bilateral adapted indices. Readers should consult
appendix A for terminology. We follow the proof of equivalence with explicit
formulas for the major indices and their differential geometric adaptations.
Preliminary Definitions
We begin by motivating the concept of bundle and sub-bundle. Intuitively
the idea is as follows.
As we have seen, the evolution of the economy we wish to investigate is
given by a path α in the space of all possible economies Ξ. It is convenient
in such situations to introduce a second copy of the space of possible prices
and quantities for use in tracking the ‘best analogues’ of the initial and final
states of the economy during the period under study. Thus we will work with
the space $\Xi \times V \times V^*$. The first factor $\Xi$ will be referred to as the “base
space”. For every point $x \in \Xi$ there exists a (different) copy of the vector
space $V \times V^*$; each copy $(V \times V^*)_x = x \times V \times V^*$ is referred to as the “fiber”
above the point $x$. The totality of these spaces is referred to as a “vector
bundle”. The base space $\Xi$ will function as home to the history $\alpha$. The space
of fibers $V \times V^*$ will be used to keep track of the parallel translates of the
price and quantity vectors.
Let us now give a brief intuitive explanation of the following proof. We
first observe that all of the algebraic formulas used by the common indexes
satisfy the so-called proportionality test; that is, if the original basket of
goods q(0) is a multiple 1/λ of the current basket, so that q(1) = λ · q(0), then the
quantity index is guaranteed to be λ.
As discussed in Section 2, all of the common bilateral index numbers
tacitly require a notion of parallel translation. To take but one example,
the Laspeyres quantity index purports to transport today’s goods back to
yesterday’s prices. Further, all such indices make an implicit choice of the
ordinary derivative for the purpose of this translation. By contrast, we will
insist that the choice of derivative be made explicit. In addition, we will
require that the data be expressed in a form which distinguishes the usual
price and quantity vectors (at the source and destination times) from their
parallel translates.
When one chooses the ordinary derivative, the parallel translation of the
source-time vector q(s) to the destination-time is merely the original vec-
tor q(s). However, when the adapted derivative is selected, we are ensured
that the parallel translate of the vector q(s) will be a multiple of the vector
q(d) at the destination time. Therefore, when this data is input into any
index number with an algebraic formula passing the proportionality test, it
is guaranteed to give us the same answer: namely λ.
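The proportionality test described above can be illustrated with a minimal sketch. The prices, the basket and the factor `lam` are hypothetical values chosen only for illustration; the three functions are the standard Laspeyres, Paasche and Fisher quantity formulas.

```python
import math

def dot(a, b):
    """Pairing of a price vector with a basket."""
    return sum(x * y for x, y in zip(a, b))

def laspeyres_q(p0, p1, q0, q1):
    # Quantity change valued at base-period prices.
    return dot(p0, q1) / dot(p0, q0)

def paasche_q(p0, p1, q0, q1):
    # Quantity change valued at current-period prices.
    return dot(p1, q1) / dot(p1, q0)

def fisher_q(p0, p1, q0, q1):
    # Geometric mean of the two indices above.
    return math.sqrt(laspeyres_q(p0, p1, q0, q1) * paasche_q(p0, p1, q0, q1))

# Hypothetical data: the final basket is exactly lam times the initial one.
p0 = [2.0, 5.0, 1.0]
p1 = [3.0, 4.0, 1.5]
q0 = [10.0, 4.0, 7.0]
lam = 1.3
q1 = [lam * x for x in q0]

results = {f.__name__: f(p0, p1, q0, q1)
           for f in (laspeyres_q, paasche_q, fisher_q)}
print(results)
```

Each formula returns `lam` whatever the prices are, which is the only property of the formulas that the equivalence argument relies on.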
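Definition 5 We define sub-bundles of Ξ × V × V ∗ by
\begin{align}
\Sigma_0 &= \Xi \times V \times V^* \tag{1.58} \\
\Sigma_1 &= \{(x, 0, v^*) \in \Sigma_0 \ \text{s.t.}\ v^* \in [p_x] \subset V^*\} \tag{1.59} \\
\Sigma_2 &= \{(x, 0, v^*) \in \Sigma_0 \ \text{s.t.}\ v^* \in \beta_{q_x} \subset V^*\} \tag{1.60} \\
\Sigma_3 &= \{(x, v, 0) \in \Sigma_0 \ \text{s.t.}\ v \in [q_x] \subset V\} \tag{1.61} \\
\Sigma_4 &= \{(x, v, 0) \in \Sigma_0 \ \text{s.t.}\ v \in \beta_{p_x} \subset V\} \tag{1.62}
\end{align}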
Definition 6
P (Ξ) = {α|α : [0, 1] −→ Ξ (with α piecewise differentiable)} (1.63)
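Definition 9 Let us define $\Theta := \bigoplus_{i=1}^{4} V \oplus \bigoplus_{i=1}^{4} V^*$ and κ ⊂ Θ by the rule
\[ \kappa = \{\theta \in \Theta \mid \theta = (A,\ y \cdot A,\ B,\ \tfrac{1}{y} \cdot B,\ C,\ z \cdot C,\ D,\ \tfrac{1}{z} \cdot D)\} \tag{1.67} \]
with A, B ∈ V , C, D ∈ V ∗ and nonzero y, z ∈ R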
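Definition 11 Let
\[ T : P(\Xi) \times A \to \Theta \tag{1.68} \]
be given by
\[ T(\alpha, \nabla) = \big(\, p(0),\ \tau_\alpha^\nabla(p(1), 0),\ p(1),\ \tau_\alpha^\nabla(p(0), 1),\ q(0),\ \tau_\alpha^\nabla(q(1), 0),\ q(1),\ \tau_\alpha^\nabla(q(0), 1) \,\big) \tag{1.69} \]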
F : Θ −→ R (1.70)
Proof of Equivalence
Theorem 2 Let $I_{1,2} : P(\Xi) \times A \to \mathbb{R}$ be a pair of quantity index formulas
where $I_{1,2} = F_{1,2} \circ T$,
\[ T : P(\Xi) \times A \to \Theta \tag{1.72} \]
is as before and
\[ F_{1,2} : \Theta \to \mathbb{R} \tag{1.73} \]
are two formulae which satisfy the proportionality test. Then if $\nabla^a_1 = \nabla^a_2 = \nabla^a$,
we have $I_1(\cdot, \nabla^a) = I_2(\cdot, \nabla^a)$ and $I_i(\alpha(t), \nabla^a) = 1 / I_i(\alpha(1-t), \nabla^a)$ (time reversal).
Proof: The proof proceeds in two parts. First we will show that any
adapted connection on Σ0 is “reducible” in that it is the sum of 4 connections
on the Σi sub-bundles of Σ0 . The implication of reducibility is that
if we parallel translate a vector v(α(s)) belonging to a sub-bundle Σi using
a reducible connection, then the parallel translate at the destination time d
will remain in the same sub-bundle as the original vector at source time s. As
both the sub-bundle of pricing systems Σ1 , and the sub-bundle of baskets Σ3
have 1-dimensional fibers, we are guaranteed to have our parallel translates
of p(s) and q(s) expressible as non-zero multiples of the price and quantity
data at the destination time d.
The second part of the proof will be to show that for any such reducible
connection ∇a , the 8 pieces of data comprising T (α, ∇a ) are guaranteed to
fit the pattern found in the proportionality hypothesis. Since all index num-
bers can be represented by formulas satisfying the proportionality hypothesis
above, the conclusion follows.
To begin, we must demonstrate that if ∇ is an adapted connection and
σ is a section of one of the four sub-bundles Σi ⊂ Σ0 , then $\nabla^a_X \sigma$ is again
in Σi for any X ∈ T Ξ. For concreteness, let us look at σ ⊂ Σ3 . Now by
hypothesis we have
\[ \nabla_X \sigma = \nabla^a_X \sigma = \Pi_{[q]} \nabla_X (\Pi_{[q]} \sigma) = \Pi_{[q]} \nabla_X \sigma \tag{1.74} \]
proving our claim. The claim that given any connection ∇ on Σ0 , $\Pi_{\Sigma_i} \nabla(\Pi_{\Sigma_i}\, \cdot)$
is a connection on Σi can be verified by checking the axioms for a connection
in Appendix A, and is straightforward.
Because the 4 summands of an adapted connection ∇a are individually
connections on the sub-bundles Σi , we are assured that the parallel translation
of a vector in one of the summands Σi will remain in that summand. In
particular, for the one dimensional [q] sub-bundle we have
\[ \tau_\alpha^{\nabla^a}(q(s), d) = \lambda_{q(d)} \cdot q(d) \tag{1.75} \]
for $\lambda_{q(d)} \in \mathbb{R}$. Now the equation for a section along α to be a parallel translate
of our source data,
\[ \nabla^a_{\dot\alpha} \sigma(t) = 0, \tag{1.76} \]
is a first order linear ordinary differential equation with initial data q(s).
This implies that there exists a unique solution to the initial value problem
which depends linearly on the initial conditions. Thus if σ satisfies (1.76)
with
\[ \sigma(\alpha(s)) = q(s) \tag{1.77} \]
then by (1.75) we have
\[ \sigma(\alpha(d)) = \lambda_{q(d)}\, q(d). \tag{1.78} \]
Conversely, the section
\[ \tilde\sigma = \frac{1}{\lambda_{q(d)}}\, \sigma \tag{1.79} \]
will also satisfy (1.76) but with the boundary condition
\[ \tilde\sigma(d) = \frac{1}{\lambda_{q(d)}} \cdot \lambda_{q(d)} \cdot q(d) = q(d). \tag{1.80} \]
This proves:
Proposition 3 For an adapted derivative ∇a on Σ0 there exist numbers
y, z ∈ R such that
\[ T(\alpha, \nabla^a) = \big(\, p(0),\ y \cdot p(0),\ p(1),\ \tfrac{1}{y} \cdot p(1),\ q(0),\ z \cdot q(0),\ q(1),\ \tfrac{1}{z} \cdot q(1) \,\big). \tag{1.81} \]
This means that in the case of an adapted covariant derivative ∇a , we
are guaranteed that T (α, ∇a ) ∈ κ. Thus by the hypothesis that our index
formulas satisfy the proportionality test we can see that
\[ I_1(\alpha(t), \tilde{A}) = I_2(\alpha(t), \tilde{A}) = \frac{1}{I_1(\alpha(1-t), \tilde{A})} = \frac{1}{I_2(\alpha(1-t), \tilde{A})} \tag{1.82} \]
for all piecewise smooth α. QED.
Adapted Index Numbers
Let us now see how this adjustment in the notion of constancy corrects
some common bilateral index numbers. In the following price index number
formulas the σ’s refer to the parallel translates of prices:
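Laspeyres
Original:
\[ P_L(p^0, p^1, q^0, q^1) \equiv \frac{p^1 \cdot q^0}{p^0 \cdot q^0} \tag{1.83} \]
Adapted:
\[ P_L(p^0, p^1, q^0, q^1) \equiv \frac{\sigma_1(0) \cdot q^0}{p^0 \cdot q^0} \tag{1.84} \]
Paasche
Original:
\[ P_P(p^0, p^1, q^0, q^1) \equiv \frac{p^1 \cdot q^1}{p^0 \cdot q^1} \tag{1.85} \]
Adapted:
\[ P_P(p^0, p^1, q^0, q^1) \equiv \frac{p^1 \cdot q^1}{\sigma_0(1) \cdot q^1} \tag{1.86} \]
Fisher
Original:
\[ P_F(p^0, p^1, q^0, q^1) \equiv \sqrt{\frac{p^1 \cdot q^0}{p^0 \cdot q^0} \cdot \frac{p^1 \cdot q^1}{p^0 \cdot q^1}} \tag{1.87} \]
Adapted:
\[ P_F(p^0, p^1, q^0, q^1) \equiv \sqrt{\frac{\sigma_1(0) \cdot q^0}{p^0 \cdot q^0} \cdot \frac{p^1 \cdot q^1}{\sigma_0(1) \cdot q^1}} \tag{1.88} \]
Tornqvist
Original:
\[ P_T(p^0, p^1, q^0, q^1) \equiv \sqrt{\prod_{n=1}^{N} \left(\frac{p^1_n}{p^0_n}\right)^{p^0_n q^0_n / (p^0 \cdot q^0)} \cdot \left(\frac{p^1_n}{p^0_n}\right)^{p^1_n q^1_n / (p^1 \cdot q^1)}} \tag{1.89} \]
Adapted:
\[ P_T(p^0, p^1, q^0, q^1) \equiv \sqrt{\prod_{n=1}^{N} \left(\frac{\sigma_1(0)_n}{p^0_n}\right)^{p^0_n q^0_n / (p^0 \cdot q^0)} \cdot \left(\frac{p^1_n}{\sigma_0(1)_n}\right)^{p^1_n q^1_n / (p^1 \cdot q^1)}} \tag{1.90} \]
Walsh
Original:
\[ P_W(p^0, p^1, q^0, q^1) \equiv \frac{\sum_{i=1}^{N} (q^0_i q^1_i)^{1/2}\, p^1_i}{\sum_{j=1}^{N} (q^0_j q^1_j)^{1/2}\, p^0_j} \tag{1.91} \]
Adapted:
\[ P_W(p^0, p^1, q^0, q^1) \equiv \frac{\sum_{i=1}^{N} (q^0_i q^1_i)^{1/2}\, p^1_i}{\sum_{j=1}^{N} (q^0_j q^1_j)^{1/2}\, \sigma_0(1)_j} = \frac{\sum_{i=1}^{N} (q^0_i q^1_i)^{1/2}\, \sigma_1(0)_i}{\sum_{j=1}^{N} (q^0_j q^1_j)^{1/2}\, p^0_j} \tag{1.92} \]
Jevons
Original:
\[ P_J(p^0, p^1) \equiv \prod_{i=1}^{N} \left(\frac{p^1_i}{p^0_i}\right)^{1/N} \tag{1.93} \]
Adapted:
\[ P_J(p^0, p^1) \equiv \prod_{i=1}^{N} \left(\frac{p^1_i}{\sigma_0(1)_i}\right)^{1/N} = \prod_{i=1}^{N} \left(\frac{\sigma_1(0)_i}{p^0_i}\right)^{1/N} \tag{1.94} \]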
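The agreement of the adapted formulas can be sketched numerically. The sketch below takes the form of the translates from Proposition 3 as given, namely σ1(0) = y · p(0) and σ0(1) = (1/y) · p(1); the value of `y` and all price and quantity data are hypothetical, and no attempt is made here to compute y from a path.

```python
import math

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

# Hypothetical data for a three-good economy.
p0 = [2.0, 5.0, 1.0]
p1 = [3.0, 4.0, 1.5]
q0 = [10.0, 4.0, 7.0]
q1 = [8.0, 6.0, 7.0]

# Assumed scale factor; in the text y comes from parallel translation.
y = 1.2
sigma1_at_0 = [y * p for p in p0]   # translate of p(1) back to time 0
sigma0_at_1 = [p / y for p in p1]   # translate of p(0) forward to time 1

# Ordinary (unadapted) Laspeyres and Paasche price indices.
ordinary = {
    "Laspeyres": dot(p1, q0) / dot(p0, q0),
    "Paasche": dot(p1, q1) / dot(p0, q1),
}

# Adapted versions, following (1.84), (1.86), (1.88), (1.92) and (1.94).
walsh_weights = [math.sqrt(a * b) for a, b in zip(q0, q1)]
adapted = {
    "Laspeyres": dot(sigma1_at_0, q0) / dot(p0, q0),
    "Paasche": dot(p1, q1) / dot(sigma0_at_1, q1),
    "Walsh": dot(walsh_weights, p1) / dot(walsh_weights, sigma0_at_1),
    "Jevons": math.prod((a / b) ** (1.0 / len(p1))
                        for a, b in zip(p1, sigma0_at_1)),
}
adapted["Fisher"] = math.sqrt(adapted["Laspeyres"] * adapted["Paasche"])

print(ordinary)
print(adapted)
```

With this data the ordinary Laspeyres and Paasche indices disagree, while every adapted formula returns the single number `y`.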
1.6 The Divisia Index
What is this new index? In this section we will prove that this index is
the same as the Divisia index. Divisia developed this index in the 1920’s
in a somewhat ad hoc fashion. In fact, Bennet [1920], writing a few years
before Divisia and using essentially the same argument, came up with almost the
same index, which differs from Divisia’s by its absence of logarithms. The
following is a description of the traditional development of the Divisia index⁶:
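\[ Q : \mathbb{R} \to \mathbb{R}, \qquad P : \mathbb{R} \to \mathbb{R} \tag{1.95} \]
such that
\[ p(t) \cdot q(t) = \sum_{i=1}^{n} p_i(t) q_i(t) = P(t) Q(t) \quad \forall t. \tag{1.96} \]
\[ P(t) = \frac{V(t)}{Q(t)} \tag{1.97} \]
with Q(t) arbitrary. Divisia’s idea was to take the derivatives of the logarithms
\[ \frac{d \ln(p(t) \cdot q(t))}{dt} = \frac{d \ln(P(t) Q(t))}{dt} \tag{1.98} \]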
expand using the chain and product rules
to be equal. We can then see that this does specify unique functions P (t), Q(t)
by the formulas
\[ P(t) = \exp\left( \int_0^1 \sum_{i=1}^{n} \frac{p_i(t) q_i(t)}{\sum_{j=1}^{n} p_j(t) q_j(t)} \, \frac{p_i'(t)}{p_i(t)} \, dt \right) \tag{1.102} \]
\[ Q(t) = \exp\left( \int_0^1 \sum_{i=1}^{n} \frac{p_i(t) q_i(t)}{\sum_{j=1}^{n} p_j(t) q_j(t)} \, \frac{q_i'(t)}{q_i(t)} \, dt \right). \tag{1.103} \]
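\begin{align}
&= \Pi_{[p]}\left(\nabla^o_{\dot\alpha}\, \Pi_{[p]}(\sigma)\right) = \Pi_{[p]}\left( \frac{d\Lambda}{dt}(t) \cdot p(t) + \Lambda(t) \frac{dp}{dt}(t) \right) \tag{1.107} \\
&= \frac{\left( \frac{d\Lambda}{dt}(t) \cdot p(t) + \Lambda(t) \frac{dp}{dt}(t) \right) \cdot q(t)}{p(t) \cdot q(t)} \; p(t) \tag{1.108} \\
&= \left( \frac{d\Lambda}{dt}(t)\, p(t) + \Lambda(t) \frac{dp}{dt}(t) \right) \cdot q(t) \tag{1.109}
\end{align}
so
\[ \frac{d\Lambda}{dt}(t) = -\left( \frac{\frac{dp}{dt}(t) \cdot q(t)}{p(t) \cdot q(t)} \right) \Lambda(t) \tag{1.110} \]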
⁷ Deaton and Muellbauer, pp. 174-175.
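hence
\begin{align}
\Lambda(t) &= \exp\left( -\int_0^t \left( \frac{\frac{dp}{dr}(r) \cdot q(r)}{p(r) \cdot q(r)} \right) dr \right) \tag{1.111} \\
&= \exp\left( -\int_0^t \sum_{i=1}^{n} \left( \frac{\frac{dp_i}{dr}(r) \cdot q_i(r)}{p(r) \cdot q(r)} \right) dr \right) \tag{1.112} \\
&= \exp\left( -\int_0^t \sum_{i=1}^{n} \left( \frac{p_i(r) \cdot q_i(r)}{p(r) \cdot q(r)} \right) \frac{1}{p_i(r)} \frac{dp_i(r)}{dr} \, dr \right) \tag{1.113} \\
&= \exp\left( -\int_\alpha \sum_{i=1}^{n} \left( \frac{p_i \cdot q_i}{p \cdot q} \right) \frac{dp_i}{p_i} \right) \tag{1.114}
\end{align}
Thus
\[ \text{Diff. Geom. Index} = \frac{1}{\Lambda(t)} = e^{\sum_{i=1}^{n} \int_\alpha w_i(\alpha)\, d\log(p_i)} = \text{Divisia Index} \tag{1.115} \]
QED.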
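The line integral in (1.114) and (1.115) can be approximated from sampled data. The sketch below is one possible discretisation, not a construction taken from the text: it averages the expenditure shares at the two ends of each step, which amounts to chaining Tornqvist-type links. The function name and the sample path are hypothetical.

```python
import math

def divisia_price_index(prices, quantities):
    """Approximate exp( sum_i integral of w_i d log p_i ) along a sampled path.

    prices and quantities are equally long lists of vectors observed at
    successive times; w_i is the expenditure share of good i.
    """
    log_index = 0.0
    for k in range(len(prices) - 1):
        p_a, p_b = prices[k], prices[k + 1]
        q_a, q_b = quantities[k], quantities[k + 1]
        value_a = sum(p * q for p, q in zip(p_a, q_a))
        value_b = sum(p * q for p, q in zip(p_b, q_b))
        for i in range(len(p_a)):
            # Average share over the step times the change in log price.
            share = 0.5 * (p_a[i] * q_a[i] / value_a + p_b[i] * q_b[i] / value_b)
            log_index += share * (math.log(p_b[i]) - math.log(p_a[i]))
    return math.exp(log_index)

# Hypothetical path on [0, 1]: all prices scale by (1 + t), baskets drift.
steps = 200
times = [k / steps for k in range(steps + 1)]
prices = [[2.0 * (1.0 + t), 5.0 * (1.0 + t)] for t in times]
quantities = [[10.0 - 3.0 * t, 4.0 + 2.0 * t] for t in times]

index = divisia_price_index(prices, quantities)
lambda_end = 1.0 / index   # by (1.115), Lambda at the end of the path
print(index, lambda_end)
```

Because every price doubles along this sample path and the shares always sum to one, the index comes out as 2 however the basket drifts, which makes the path a convenient sanity check.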
1.7 Results
As we have shown in this paper, much of index number theory can be viewed
as a special topic in differential geometry. When the traditional index number
problem is formulated in the context of differential geometry, a new feature
emerges. Unlike the ordinary calculus, differential geometry points out that
there is not a unique derivative as is often assumed. There is however a
unique choice of derivative adapted from the familiar one to the problems
of index theory. Once that choice is made successfully, all index numbers
currently in favor become equivalent and are equal to the Divisia index. The
shift in emphasis away from the algebraic formulas of the various index numbers
and towards the choice of derivative thereby enables us to resolve the so-called
Index Number Problem. Thus by using these mathematical techniques, we
have a natural answer to the question “What is the correct index number?”
Casting the question in the right framework not only allows us to unify and
recover previous results, it also enables us to understand better the nature
of the question.
One of the primary debates one sees in the current literature on index
number theory centers around whether or not to chain bilateral indexes.
With the sophistication of the differential geometric framework, we can cast
this question in an entirely new light. As we have seen above, all index
numbers make a choice of derivative. The choice has till now been made
implicitly. When we view the issue differential geometrically we are forced
to make the choice explicit. What then becomes clear is that there are two
different derivatives being used: the ordinary derivative, which has no economic
justification whatsoever, and the adapted derivative, whose economic
justification has been presented in this paper. Once one is forced to justify
the choice of derivative, one must realise the arbitrary nature of all bilateral
indexes using the ordinary derivative. The Divisia is the only index which
uses an economically justified derivative and is therefore superior to all bi-
lateral indexes using the ordinary derivative. Any further approximation to
the Divisia is therefore an improvement. Chaining is simply one such ap-
proximation. The issue should no longer be whether or not to chain bilateral
indexes, but rather to find the best way of approximating the Divisia.
The one characteristic of the Divisia index most troubling to economists
is that it is given by a path dependent line integral. What we would claim is
that even with this path dependence (or perhaps because of it), the Divisia
is the fair and correct way to calculate cost of living adjustments in our real
world filled with changing preferences. Let us attempt to make an argument
for this in a particular setting in order to show that path dependence can be
more of a virtue than a vice. Let us assume that we have two individuals
starting and finishing by consuming exactly the same basket of goods, and
look at a situation where the path dependent Divisia might give us a different
index for each of these individuals. Let us say that the first, a connoisseur
of the finer things in life, starts cultivating a taste for red wine at a time
when this is not a particularly popular habit. As he is functioning under a
budget constraint he thus shifts consumption away from other goods, for
instance soda pop. If we assume there is then a faddish move toward the
consumption of red wine we should expect that wine prices will start rising.
The second consumer, following the fad, starts shifting his consumption only
after the prices have already risen. The Divisia would require compensating
the first consumer, but not the second, precisely because their paths were
different. The first started consuming wine while prices were low, and then
got hit with the price increase. The second chose to shift his consumption
only after prices had increased. What we would claim is that it is only the
path dependent Divisia index that can give a fair cost of living adjustment.
What we see in the mathematics is that a Divisia index C.O.L.A. is a
guarantee that the recipient will be able to maintain his previous lifestyle
with the freedom to change to an equivalently priced lifestyle at any
time. Unlike the Konus C.O.L.A. discussed in Sen [1979], the recipient will
not be compensated for becoming increasingly depressed nor penalized for
greater happiness. Unlike the Paasche and Laspeyres C.O.L.A., the adjustment
is based on a lifestyle which is allowed to change at any time. But there
is an important feature of Divisia compensation not found amongst the other
indices: the element of risk. Under a Divisia C.O.L.A., your lifestyle is an
investment. If you change your spending patterns, there is no guarantee that
you will be able to get back to your previous level of consumption. Far from
being exotic, these risks are well known to all those who must decide how
much currency to change when visiting nations with volatile exchange rates.
What the differential geometry emphasizes is that path dependence is prefer-
able and that the path dependence of the Divisia is no more counter-intuitive
than the path dependence underlying the theory of investment.
As previously noted, the differential geometric framework allows us to
see that this index is the natural one to emerge from both the mathematics
and the economics, and that it enables us to resolve consistency issues raised
by the plethora of existing indexes. What it also enables us to do is recover
previous results about the Divisia in a unified framework and to extend these
results in ways that can practically facilitate our use of the index.
For economists concerned with issues of path dependence, an important
result is path independence of the Divisia under unchanging homothetic pref-
erences [see Hulten 1973]. What does the mathematics have to say about
this?
Differential geometry allows us to calculate something called a curvature
tensor. Recall that with the ordinary notion of a derivative, mixed partials
commute. This is no longer true of general covariant derivatives. The curvature
tensor measures the failure of covariant derivatives to commute,
i.e. ∇X ∇Y ≠ ∇Y ∇X . What differential geometry tells us is that when the
curvature tensor is equal to zero on a subspace, our index will be path in-
dependent. Homothetic preferences are a special case of a zero curvature
tensor.
Another important result is Diewert’s 1980 demonstration that when cer-
tain approximations are valid, the Paasche, Laspeyres and Tornqvist indexes
can be regarded as discrete approximations to the Divisia. In addition, it has
been noted that chaining usually improves the approximation and leads to
less variation among index numbers. As discussed previously, the differential
geometry shows us that chaining is a crude way of approximating our adapted
derivative. Given that the adapted derivative resolves the consistency issue,
greater agreement under chaining is therefore explained.
Thus we see that not only does our approach recover the above results,
but it also gives a unifying framework in which to understand them.
1.7.1 Extensions
Besides recovering previous results in a consistent framework, differential
geometry enables us to extend our knowledge about the Divisia in practically
applicable ways:
I.) Using the theory of connections, it is possible to do for the space of
barters what has been done for the space of baskets. In this paper we have
shown how to modernize or antiquate the components of the usual ‘fixed
basket’ to reflect changes in production and/or consumption.
Understanding how barters change is essential to understanding the econ-
omy as a whole. For example, if our basket q(t) is taken to be the U.S. GDP,
an individual’s basket will consist of a non-zero barter vector B added to
a scaled basket vector λ · q(t) (here B is asserted to be non-zero because
individuals never purchase multiples of the GDP). Further, the entire theory
of option pricing rests on the fact that barters today will not in general be so
tomorrow. It can be shown however that the adapted derivative gives us a
natural identification between the spaces of barters at two different instants
of time. This in turn results in an (n−1) × (n−1) matrix analog for barters of the
usual Divisia index for baskets. The significance of this matrix index is not
yet clear and is the subject of work in progress with E. Weinstein.
II.) Differential geometry allows us to discuss path independence in the
absence of knowledge of homothetic preferences. Let us assume that we are
not given access to preferences, but that we are instead given some region
R ⊂ V × V ∗ such that when we restrict the curvature tensor $F^{\nabla^a}$ to R we get
zero. Then if α1 , α2 are two curves with the same end points and α1 can be
deformed to α2 within R while keeping the endpoints fixed, then the Divisia
index of α1 equals that of α2 .
III.) When the curvature tensor $F^{\nabla^a} \neq 0$ we can use the ‘size’ of the
curvature tensor to estimate how far we are from path independence. In
other words, we can use the size of the curvature tensor to estimate the local
contribution to the path dependence of the Divisia index. Thus, this allows us
to deviate from pure homothetic preferences and estimate data needs based
on the curvature intensity of a region.
These issues will be further explored in following papers.
as a triple. We will postpone the definition of ∇0 until section 1.8.3.
Let $V^n$ be the vector space of all quantities of goods and services in the
sector of the economy under consideration. That is, we let $\{e_i\}_{i=1}^{n} \subset V^n$ be
a basis for V indexed by the (ordered) set of n goods and services under
consideration in their natural units of measurement (kilograms, units, etc.).
We will interpret negative multiples of these goods as debts or obligations to
provide.
We will let V ∗ be the dual space of linear functionals φ : V −→ R. This
is a vector space of the same dimension as V . We note that given our basis
$\{e_i\}_{i=1}^{n} \subset V^n$ there is a corresponding basis $\{e_i^*\}_{i=1}^{n} \subset V^*$ where $e_i^*$ is defined
by specifying that $e_i^*(e_j)$ is equal to 1 if i = j and 0 otherwise.
We wish to comment here that the reader may choose to represent the
elements of V as ‘column’ vectors and the elements of V ∗ as ‘row’ vectors.
In this notation the act of evaluating an element µ∗ of V ∗ on a vector v is
obtained by matrix multiplying the 1×n row vector against the n×1 column
vector to get a real number (i.e. a 1 × 1 matrix).
If we assume that we are interested in an initial state of the economy
at time τ0 contrasted against a later state of the economy at time τ1 then
we can define ρ : [τ0 , τ1 ] −→ [0, 1] by $\rho(x) = \frac{x - \tau_0}{\tau_1 - \tau_0}$ in order to restrict our
formalism to the interval [0, 1]. From now on we will assume that the time
interval of interest will always be [0, 1].
Let Ξ = {(v, φ) | v ∈ V, φ ∈ V ∗ s.t. φ(v) ≠ 0} be the collection of pairs
of baskets of goods and possible pricing systems.
Then let α = (m, π) where α : [0, 1] −→ Ξ and π : [0, 1] −→ V ∗ gives
the prices of the goods and services under consideration. Conversely m :
[0, 1] −→ V denotes the (possibly) time dependent basket of goods and
services with which we are attempting to calibrate the economy.
Given any non-zero vector w in a vector space W , we will denote by [w]
the linear subspace spanned by w.
With this defined we argue that if we are given any element (v, φ) ∈ Ξ
we get a decomposition of our vector space V as follows. In the first place
φ determines a subspace β[φ] defined by β[φ] = {h ∈ V s.t. φ(h) = 0}. Our
bracket notation is justified by the fact that β[φ] is independent of whether
one chooses to work with φ or r · φ (r ≠ 0). Now by our above definition of
Ξ we are guaranteed that
w_{[v]} ∈ [v] and w_{β[φ]} ∈ β[φ] such that w = w_{[v]} + w_{β[φ]} .
Proof: The argument above goes over word for word in this case with V ∗
and V interchanged. QED
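The decomposition of V determined by a point (v, φ) ∈ Ξ can be sketched in coordinates. The sketch assumes the standard formula for the component along [v], namely (φ(w)/φ(v)) · v, which is the only choice compatible with the remainder lying in β[φ]; the function names and the sample vectors are hypothetical.

```python
def evaluate(phi, v):
    """Evaluate the functional phi (a row vector) on v (a column vector)."""
    return sum(a * b for a, b in zip(phi, v))

def decompose(w, v, phi):
    """Split w into a part in [v] and a part in beta[phi] = kernel of phi."""
    phi_v = evaluate(phi, v)
    if phi_v == 0:
        raise ValueError("(v, phi) is not in Xi: phi(v) must be non-zero")
    coefficient = evaluate(phi, w) / phi_v
    w_span = [coefficient * x for x in v]          # component in [v]
    w_kernel = [a - b for a, b in zip(w, w_span)]  # component in beta[phi]
    return w_span, w_kernel

# Hypothetical basket v, pricing system phi, and vector w to decompose.
v = [1.0, 2.0, 0.0]
phi = [3.0, 1.0, 1.0]
w = [4.0, -1.0, 2.0]

w_span, w_kernel = decompose(w, v, phi)
print(w_span, w_kernel)
```

The guard on φ(v) is exactly the condition defining Ξ: without it the line [v] would lie inside β[φ] and the splitting would fail.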
The importance of the decompositions is that they determine projection
maps onto the various subspaces. Given vectors y ∈ V, ω ∈ V ∗ and our
point (v, φ) ∈ Ξ we get projection maps
1.8.2 Vector Bundles, Sections etc....
We now turn to a discussion of vector bundles. We wish to say at the outset
that we will only be discussing a restricted class of vector bundles and will
not need the full formalism. Given a vector space $W^n$ of dimension n and a
(topological) space X (set, manifold etc.) we refer to X × W as the total
space of the trivial vector bundle with base space X and fiber W . For
almost all of our discussion, X will be an open dense subset of $\mathbb{R}^n \times \mathbb{R}^n \cong \mathbb{R}^{2n}$
so that $X^{2n} = \mathbb{R}^{2n} - C$, on which we will use the standard coordinates for
$\mathbb{R}^{2n}$, agreeing at the outset simply to ignore the points of C. (Note: C is a
(2n − 1)-dimensional space with singularities.)
Let $P_m(W^n)$ denote the space of (oriented) m-dimensional planes in an
n-dimensional vector space $W^n$. That is, the points of the space $P_m(W^n)$
are in one to one correspondence with the oriented m-planes through the
origin $0 \in W^n$. Denote by $P_p \subset W^n$ the plane corresponding to a point
$p \in P_m(W^n)$.
Now if the points x ∈ X parameterise a family of m-dimensional subspaces
$P_{Z(x)} \subset W$ by a map $Z : X \to P_m(W^n)$ we can define a new non-trivial
bundle; we refer to the space $\Sigma = \{(x, u) : (x, u) \in X \times W^n \text{ and } u \in P_{Z(x)}\}$
as a (non-trivial) vector bundle with fiber $\mathbb{R}^m$ and base space X (the
idea being that a vector bundle is trivial if it is given as a cartesian product).
The bundle we have described is a sub-bundle of the trivial bundle $X \times W^n$.
Note that a trivial bundle is always a sub-bundle of itself (i.e. when n = m).
In this paper m will usually be either n − 1 or 1; we note that in these cases
we have $P_m(W^n) = S^{n-1}$.
The surjective map π : Σ −→ X given by π((x, u)) = x (defined for a
sub-bundle of a trivial bundle) is referred to as the ‘projection map onto
the base space’. The collection of points $P_{Z(x)} = \pi^{-1}(x)$ is called the fiber
of the vector bundle at x (or ‘above x’).
Now a map σ : X −→ Σ is called a section of Σ if for all x ∈ X we
have σ(x) = (x, u) for some $u \in \pi^{-1}(x)$. This is equivalent to requiring that
π ◦ σ(x) = x. The space of all sections is denoted by Γ(X, Σ). The space of
smooth (that is, infinitely differentiable) sections is denoted by $\Gamma^\infty(X, \Sigma)$.
It is important for our paper to know that given a curve α : [0, 1] −→ X,
every vector bundle over our space X determines a vector bundle over
[0, 1] as follows. If E is a vector bundle over X with projection map π, define
the pull-back bundle $\alpha^*(E)$ to be the set of pairs $\alpha^*(E) = \{(t, u) \text{ s.t. } t \in [0, 1],\ u \in \pi^{-1}(\alpha(t))\}$.
Thus if $E = X \times W^n$ then $\alpha^*(E)$ is simply the trivial
bundle $[0, 1] \times W^n$. Likewise if we have a sub-bundle of $X \times W^n$ given by a
map $Z : X \to P_m(W^n)$ then we get a sub-bundle of $[0, 1] \times W^n$ given by the
map $\hat{Z} : [0, 1] \to P_m(W^n)$ which sends t to the point $Z(\alpha(t)) \in P_m(W^n)$.
3. Product Rule: $\nabla_Y (f\sigma) = df(Y)\,\sigma + f\,\nabla_Y \sigma$
The simplest connection is just the familiar total differential of ordinary real
valued functions.
If we have a vector bundle Σ over a base space $\mathbb{R}^1$ (or any connected subset
thereof) and a (possibly non-trivial) connection ∇ on Σ, then we define the
parallel translation $T(w_p, \nabla)$ of a vector $w_p$ in the fiber above p to be
given by the section $\sigma : \mathbb{R}^1 \to \Sigma$ which satisfies
Chapter 2
2.1 Introduction
In recent months the U.S. Senate has been actively debating the Consumer
Price Index (CPI) and its relation to the “true” cost of living index. The
importance of this question for the U.S. economy is undeniable. According
to recent Senate hearings on the CPI, 30% of federal outlays and 45% of
federal revenues are indexed to the CPI, and a 1% overstatement in the CPI
costs the U.S. government $280,000,000,000 over 7 years.
The current CPI issued by the U.S. Bureau of Labor Statistics is based
on a Laspeyres index $P_L = \frac{P(1) \cdot Q(0)}{P(0) \cdot Q(0)}$ for a representative consumer whose
fixed basket is updated once every 10 years or so. It is widely accepted,
however, that this is not a cost of living index. There are several reasons for
this, both technical (e.g. the way data on housing is handled by the BLS)
and theoretical. We will focus here on that part of the failure of the CPI to
evaluate the true cost of living adjustment that arises from the theoretical
issues such as substitution effects, the new goods bias, and changes in quality.
The goal of a ‘cost of living adjustment’ PE (t) is that a utility maximising
consumer purchasing a basket q(0) with expenditure E(0) at time 0, should
receive the same utility from an expenditure of
at time t.²
The Konus index (often referred to as the ‘true cost of living index’ or
simply as ‘the economic index’), is believed by most economists to be a
complete theoretical (if incalculable) solution to this welfare problem. Pollak
(1983) defines the Konus as a ratio of expenditures in the following way:
\[ P_K(P^a, P^b, s, R) = \frac{E(P^a, s, R)}{E(P^b, s, R)} \tag{2.2} \]
where $P^a$ and $P^b$ are prices at t = a and t = b, R represents a preference
ordering, and s represents the choice of a base indifference curve from that
map.
While the Konus may seem the most natural and intuitively appealing
solution for the index PE given normal neoclassical assumptions, it does in
fact make severely restrictive demands and is therefore not the most realistic
measure of welfare in the real world. Perhaps the two most severe require-
ments are that the Konus requires access to an individual’s preference map,
and that no changes in taste occur during the period in question. In ad-
dition, using this index involves dealing with an “index number problem”.
Whereas the Paasche and Laspeyres indexes must fix a reference basket, the
Paasche-Konus and Laspeyres-Konus are fixed utility indexes which involve
the choice of a reference indifference curve, and thereby suffer from much of
the arbitrariness plaguing data based bilateral indices.
In this paper we will examine the welfare implications of the Divisia price
index
\[ P_D = \exp\left( \int_0^1 \sum_{i=1}^{n} \frac{p_i q_i}{\sum_{j=1}^{n} p_j q_j} \, \frac{dp_i}{p_i} \right) \tag{2.3} \]
defined for a consumer consuming a basket q(t) at prices p(t). While
this index is given by a path dependent line integral, as a data based index
it suffers from none of the preference restrictions under which the Konus
labours.
² This paper addresses itself to the benchmark case where marginal analysis is applicable.
Accordingly, all functions of time should be assumed once continuously differentiable
unless otherwise specified. If care is taken, this assumption can be weakened in some cases
to continuous or continuous and piecewise differentiable.
It is clear that the Laspeyres index, currently being used for the CPI,
is an overestimate of the Konus index, PK . What is not clear, however, is
by how much. The economists who have been asked by the government for
their estimations of the overstatement have delivered a range of suggested
adjustments to the CPI. This raises two important questions:
2.2 Welfare and Path Dependence
Let us start by looking at why the issue of changing preferences is of funda-
mental importance to a welfare index.
Standard neoclassical economic analysis generally makes the simplifying
assumption that preferences are static; as mentioned before, the definition of
the Konus index makes this assumption implicitly. In (Malaney I), however,
we discuss the example of a “fad” based upon the changes in taste one is
likely to see in any realistic model of an economy. The situation described
was that of two persons starting and ending with the same basket of goods
comprised of red wine and soda pop.
Let us assume at time t0 that Consumer A and Consumer B share not
only preferences but also a budget constraint I(0). For the sake of simplicity,
let us consider a toy economy in which the only goods are wine and cola.
Assume that at t0 the preferences of both consumers cause them each to
purchase 70% cola and 30% wine at p(0) prices. Assume that between t0 and
ta demand, supply and prices remain nearly constant but that consumer A
begins to cultivate a taste in red wine causing her to shift her consumption
towards 30% cola and 70% wine; consumer B’s preferences and purchases
remain static.
If one then assumes that between time ta and t1 a fad for red wine in-
creases society’s demand for wine, we should expect to see an increase in
price as well. If consumer B is carried along with the fad, she will increase
her consumption of wine despite the fact that the price of wine is increasing.
We note that under static preferences this behavior, common as it is with fads
and trends, is associated only with the exotic Giffen goods.
We may then assume that at time t1 both consumers are purchasing 30%
cola and 70% wine. If we compute their Divisia Indexes we will find that
Consumer A will have undergone considerable inflation while Consumer B
will not have. This despite the fact that their initial and final baskets are
the same.
The point here is that A was hit with a price increase in the basket she
was already consuming, while B increased her consumption into a good whose
price had already undergone much of the increase.
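The two-consumer example can be made concrete with a small sketch. Every number below is a hypothetical choice made for illustration: the 30% and 70% figures are read as expenditure shares, the cola price is held at 1, the wine price is flat until t = 0.5 and then doubles by t = 1, Consumer A shifts toward wine before the price moves, and Consumer B shifts only over the last tenth of the period.

```python
import math

def divisia_from_shares(prices, shares):
    """Approximate exp( sum_i integral of w_i d log p_i ) from sampled paths."""
    log_index = 0.0
    for k in range(len(prices) - 1):
        for i in range(len(prices[k])):
            mean_share = 0.5 * (shares[k][i] + shares[k + 1][i])
            log_index += mean_share * math.log(prices[k + 1][i] / prices[k][i])
    return math.exp(log_index)

def wine_price(t):
    # Flat until the fad starts at t = 0.5, then doubles by t = 1.
    return 1.0 if t <= 0.5 else 2.0 ** ((t - 0.5) / 0.5)

def wine_share_a(t):
    # Consumer A moves from 30% to 70% wine while prices are still flat.
    return 0.3 + 0.4 * min(t, 0.5) / 0.5

def wine_share_b(t):
    # Consumer B stays at 30% wine until t = 0.9, then follows the fad.
    return 0.3 if t <= 0.9 else 0.3 + 0.4 * (t - 0.9) / 0.1

steps = 1000
times = [k / steps for k in range(steps + 1)]
prices = [[1.0, wine_price(t)] for t in times]            # (cola, wine)
shares_a = [[1.0 - wine_share_a(t), wine_share_a(t)] for t in times]
shares_b = [[1.0 - wine_share_b(t), wine_share_b(t)] for t in times]

index_a = divisia_from_shares(prices, shares_a)
index_b = divisia_from_shares(prices, shares_b)
print(index_a, index_b)
```

Both consumers start at 30% wine and end at 70% wine, yet under these assumptions A's index works out to 2^0.7 and B's to 2^0.34, so A is owed the larger adjustment.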
A continuous C.O.L.A. based on the Divisia price index would protect the
rights of both consumers to keep their lifestyles intact in the face of a price
increase, but it would compensate the two consumers differently. This led us in
[Malaney I] to make the point that the path dependence of the Divisia index
C.O.L.A. represents something directly analogous to the path dependence of
currency exchange or stock portfolio composition; with a Divisia C.O.L.A.
your lifestyle is an investment. You can always retain your way of life but
you cannot necessarily return to it if you change your preferences.
\[ I(P^a, P^b, s, R) = \frac{E(P^a, s, R)}{E(P^b, s, R)} \tag{2.4} \]
The notation emphasizes that the index depends not only on the
two sets of prices, $P^a$ and $P^b$, but also on an initial choice of
an indifference map or preference ordering, R and the choice of
a base indifference curve s from that map. One set of prices is
called “reference prices” and the other, “comparison prices”. If
the comparison prices are twice the reference prices, the index is
2; if they are one-half, the index is one-half.” (Robert Pollak, The Theory of the
Cost of Living Index, p. 94)
It is plain from the above definition that ‘the cost of living index’ (otherwise
known as the Konus price index PK ) is only defined under the hypothesis
of unchanging ordinal preferences. The assertion commonly made is that if
a consumer with a given cardinal utility function U maintains such a static
indifference map R, then $P_K = I(P^a, P^b, s, R)$ by its definition ensures ‘constant
utility’. We do not take issue with this statement in this paper except
to point out that this assertion appeals to an implicit condition on U which
we wish to make explicit.
53
Let us assume for the moment that we were interested in a consumer
whose preference map Rt was derived from a dynamic utility function U :
RTime × V −→ R. It is quite possible that our consumer’s “efficiency as
a [cardinal] pleasure machine” might change even if her indifference map
Rt was constant. For example, consider a static cardinal utility function
U0 : V −→ R with an indifference map R0 . Then
and
    U_b(t, v) = \left( \frac{1}{2} \sin(2\pi t) + 1 \right) U_0(v)    (2.6)
are both time dependent utility functions with a common indifference map:
R0 .
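As a minimal numerical sketch of this point (not part of the original argument), the following Python snippet takes a hypothetical equal-weight Cobb-Douglas form for U0 and checks that the rescaled function Ub of equation (2.6) ranks a handful of baskets identically at several times, even though the cardinal utility levels change. The function names, the baskets and the choice of U0 are assumptions made purely for illustration.

```python
import math

def u0(v):
    """Hypothetical static utility U0: equal-weight Cobb-Douglas (an assumption)."""
    x, y = v
    return math.sqrt(x * y)

def u_b(t, v):
    """Time-dependent utility of equation (2.6): a strictly positive rescaling of U0."""
    return (0.5 * math.sin(2 * math.pi * t) + 1.0) * u0(v)

def ranking(utility, baskets):
    """Return basket indices ordered from least to most preferred."""
    return sorted(range(len(baskets)), key=lambda k: utility(baskets[k]))

baskets = [(1.0, 4.0), (2.0, 3.0), (0.5, 0.5), (3.0, 3.0)]  # illustrative baskets
times = [0.0, 0.25, 0.5, 0.75]

static_order = ranking(u0, baskets)
# The ordinal ranking at each time under U_b.
orders = [ranking(lambda v, t=t: u_b(t, v), baskets) for t in times]
# The cardinal utility of the first basket at each time.
levels = [u_b(t, baskets[0]) for t in times]
```

Here every entry of `orders` coincides with `static_order`, while `levels` varies with t: the indifference map is static but the consumer's cardinal "efficiency" is not.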
Let us then assume that our utility maximising consumer possesses just
such a static indifference map R derived from the cardinal utility function
U (s, v) and at time t expends an adjusted income of I(t) = PK (t)I(0). In this
case, the assumption that her Konus compensation yields constant cardinal
utility uc,
U (t, q(t)) = uc ∀t (2.7)
implies that
    \frac{dU}{dt} = \frac{\partial U}{\partial s} \frac{ds}{dt} + \frac{\partial U}{\partial q} \frac{dq}{dt} = 0.    (2.8)
We know however from the definition of the Konus that dq/dt lies tangent to
the indifference curve while ∂U/∂q = ∇U is orthogonal to the tangent planes of
U ’s level sets implying
    \frac{\partial U}{\partial q} \frac{dq}{dt} = 0.    (2.9)
Therefore, for U (t, q(t)) to be constant under Konus compensation, we must
make the single assumption
    \frac{\partial U(s, q(t))}{\partial s} = 0 \quad \text{whenever } t = s.    (2.10)
Relative to the basket q(t), equation (2.10) refers to the constancy of
what Sen calls the ‘change in man’s efficiency as a pleasure machine’
and what Balk refers to as the ‘cost of living effect of pure preference change’.
2.3.1 Changing Preferences and Psychological Neutrality
The Konus index in its standard form is of no use in evaluating welfare
considerations in the context of variable preference maps. However, if we
restrict our attention to those utility functions which:
then the Konus guarantees constant utility. This raises the question of
whether there is any way to guarantee constant cardinal utility when the
first assumption is lifted.
Happily this question can be answered in the affirmative; we will show
in this paper that the Divisia index PD can be seen to be an extension of
the Konus index into the realm of changing preferences where only the above
“condition 2” is retained. To this end we make the following precise definition
based on equation (2.10).
    \frac{dU(s, v)}{ds} \Big|_{s=t,\, v=q(t)} = 0    (2.11)
for all t. Given a dynamic utility function Û , we will refer to any U which
shares the same indifference curves as Û while satisfying (2.11) as a psycho-
logical neutralisation of Û .
With this in mind, we can introduce the function
    \tilde{U}_q(s, v) = \hat{U}(s, v) - \int_0^s \frac{\partial \hat{U}(t, q(s))}{\partial t} \, dt    (2.12)
Theorem 8 Consider a representative consumer with budget constraint I(0)
at time t = 0 and a differentiable time-dependent convex utility function
Û : R × V −→ R . Let us further assume the consumer’s expenditure at time
t is given by an experimental price index PE (t) according to the formula
Proof: For part 1, we first show that if the consumer’s PN utility U (t, q(t))
is constant for all time t then his Divisia quantity index QD (t) is equal to
unity.
Now define βt to be the time dependent space of barters
βt = {v ∈ V s.t. pt · v = 0} (2.16)
i.e. all baskets which our pricing system at time t evaluates as representing
a ‘fair trade’.
Then by the assumption that the basket q(t) represents the purchases of
a utility maximiser, we know that at any instant of time t there exists a strictly
positive constant λt such that (∇U )q(t) = λt pq(t)
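    \therefore \frac{\partial U(s, q)}{\partial q} \frac{dq}{dt} = \lambda_t \, p_t\!\left( \frac{dq}{dt} \right) = 0 \;\Rightarrow\; p_t\!\left( \frac{dq}{dt} \right) = 0 \;\Rightarrow\; \frac{dq}{dt} \in \beta_t \;\Rightarrow\; Q_D = 1    (2.17)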
To establish the converse, assume that the Divisia quantity index is equal
to unity for all t.
    Q_D = 1 \;\Rightarrow\; \frac{dq}{dt} \in \beta_t \;\; \forall t \;\Rightarrow\; p_t\!\left( \frac{dq}{dt} \right) = 0    (2.18)
    \frac{dq}{dt} \cdot \nabla U = 0    (2.20)
    \frac{\partial U}{\partial q} \frac{dq}{dt} = 0    (2.21)
By PN:
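    \frac{\partial U}{\partial s} \frac{ds}{dt} = 0 \quad \text{at } s = t,\; v = q(t)    (2.22)
    \therefore \frac{\partial U}{\partial s} \frac{ds}{dt} + \frac{\partial U}{\partial q} \frac{dq}{dt} = \frac{dU}{dt}    (2.23)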
The second point can be deduced from the above argument.
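    Q_D(t) = \exp\left( \int_0^t \sum_{i=1}^n \frac{p_i \cdot q_i}{\sum_{j=1}^n p_j \cdot q_j} \frac{dq_i}{q_i} \right) = \exp\left( \int_0^t \sum_{i=1}^n \frac{p_i(s) \cdot \frac{dq_i(s)}{ds}}{\sum_{j=1}^n p_j(s) \cdot q_j(s)} \, ds \right)    (2.24)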
but by P.N., maximisation and line (2.17) we have
    \frac{1}{\lambda_t} \frac{\partial U(s, q)}{\partial q} \frac{dq}{dt} = p_t\!\left( \frac{dq}{dt} \right)    (2.25)
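    \exp\left( \int_0^t \sum_{i=1}^n \frac{p_i(s) \cdot \frac{dq_i(s)}{ds}}{\sum_{j=1}^n p_j(s) \cdot q_j(s)} \, ds \right) = \exp\left( \int_0^t \frac{1}{\lambda_s} \frac{\nabla U(s, \cdot) \cdot \frac{dq_i(s)}{ds}}{\sum_{j=1}^n p_j(s) \cdot q_j(s)} \, ds \right)    (2.26)
    = \exp\left( \int_0^t \mu(s) \left( \nabla U(s, \cdot) \cdot \frac{dq_i(s)}{ds} \right) ds \right)    (2.27)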
The point here is that the function µ is strictly positive by the positivity of
λs and I(s) = p(s) · q(s). Thus the sign of the integrand is negative, positive
or zero according to whether utility is decreasing, increasing or stagnant.
Hence
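    \frac{dQ_D(t)}{dt} = \frac{d}{dt} \exp\left( \int_0^t \mu(s) \left( \nabla U(s, \cdot) \cdot \frac{dq_i(s)}{ds} \right) ds \right)    (2.28)
    = \exp\left( \int_0^t \mu(s) \left( \nabla U(s, \cdot) \cdot \frac{dq_i(s)}{ds} \right) ds \right) \mu(t) \left( \nabla U(t, \cdot) \cdot \frac{dq_i(t)}{dt} \right)    (2.29)
    = \nu(t) \left( \nabla U(t, \cdot) \cdot \frac{dq_i(t)}{dt} \right)    (2.30)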
where ν(t) is strictly positive; this proves the assertion.
In order to establish the third point let us introduce the notation v(t) =
p(t) · q(t) = I(t) for the value of a basket. We then multiply the Divisia price
and quantity indices to obtain:
    P_D(t) Q_D(t) = \exp(\ln(P_D) + \ln(Q_D)) = \exp\left( \int \frac{dp \cdot q + p \cdot dq}{p \cdot q} \right)    (2.31)
    = \exp\left( \int \frac{dv}{v} \right) = \exp\left( \int d\ln(v) \right) = \exp(\ln(v(t)) + c) = v(t) e^c    (2.32)
but since PD (0) = QD (0) = 1 we must have c = −ln( v(0) ) yielding the
expression
    P_D(t) Q_D(t) = \frac{v(t)}{v(0)}.    (2.33)
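The identity (2.33) can be checked numerically. The sketch below is only an illustration under assumed data: the smooth two-good price and quantity paths are hypothetical, and the Divisia integrals are approximated with a simple midpoint rule rather than by any method used in the thesis.

```python
import math

# Hypothetical smooth price and quantity paths on [0, 1] (assumptions).
def p(t):  return (1.0 + t, 2.0 + t * t)
def dp(t): return (1.0, 2.0 * t)          # derivative of p
def q(t):  return (3.0 - t, 1.0 + 2.0 * t)
def dq(t): return (-1.0, 2.0)             # derivative of q

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def divisia(t_end, steps=20000):
    """Midpoint-rule approximation of the Divisia price and quantity indices."""
    h = t_end / steps
    log_p = log_q = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        value = dot(p(t), q(t))
        log_p += dot(dp(t), q(t)) / value * h   # integrand of ln P_D
        log_q += dot(p(t), dq(t)) / value * h   # integrand of ln Q_D
    return math.exp(log_p), math.exp(log_q)

P_D, Q_D = divisia(1.0)
value_ratio = dot(p(1.0), q(1.0)) / dot(p(0.0), q(0.0))  # v(1) / v(0) = 13 / 5
```

For these paths v(0) = 5 and v(1) = 13, so the product `P_D * Q_D` should agree with `value_ratio` = 2.6 up to discretisation error.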
Therefore, if QD (t) = 1 we must have
since q(t) is the basket of minimal cost on the indifference curve C by the
maximisation hypothesis.
QED.
Corollary 9 Given the same assumptions and notation as in Theorem 8:
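    P_D(t) \text{ path dependent} \;\Rightarrow\; \exists\, \tilde{s} \in \mathbb{R},\; \tilde{v} \in V \text{ s.t. } \frac{\partial U(s, v)}{\partial s} \Big|_{\tilde{s}, \tilde{v}} \neq 0.    (2.36)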
Proof: If dU(t, q(t))/dt = 0 then by part 1 of the preceding theorem, we know
that QD (t) = 1 ∀t. But
QED.
first part of Corollary 9 shows that in this respect the Divisia index outperforms
the more brittle Konus. That is, it agrees with the Konus when both
are defined under constant preferences and utility but continues to guaran-
tee constant utility in the absence of constant preferences. This suggests the
idea that the definition of the Konus fixed utility price index be extended
by defining it to equal PD when QD = 1 ∀t. In the next section we back
up this assertion by showing that even more is possible. First we define four
variable-preference Konus indexes and then show through an advanced cal-
culus argument (translated from differential geometry) utilising a traditional
chaining approach, that they are all directly equal to the Divisia even under
changing utility.
2.4 The Konus Index under Changing Preferences
As discussed before, the Konus Price Index
    P_K = \frac{E(P^a, s, R)}{E(P^b, s, R)}    (2.39)
measures the change in the minimal cost of getting the consumer to the level
of utility ur associated with a reference basket. Depending on whether we
take the reference basket to be the base time basket or the current time
basket, therefore, we get different indexes. The Laspeyres Konus (PLK ) uses
the base time basket as the reference basket, whereas the Paasche Konus
(PP K ) uses the current time basket. In general these are not the same,
except in the case of homothetic preferences.
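To make the role of homotheticity concrete, here is a small sketch under assumed functional forms. It uses the standard Stone-Geary expenditure function, which reduces to the homothetic Cobb-Douglas case when the subsistence quantities are zero. The shares, prices and reference utility levels are hypothetical, and the two reference utility levels simply stand in for the base time and current time baskets.

```python
import math

def expenditure(prices, u, shares, subsistence):
    """Stone-Geary expenditure function; zero subsistence gives Cobb-Douglas."""
    committed = sum(p * g for p, g in zip(prices, subsistence))
    unit_cost = math.prod((p / a) ** a for p, a in zip(prices, shares))
    return committed + u * unit_cost

def konus(p1, p0, u, shares, subsistence):
    """Ratio of minimal costs of reaching utility u at prices p1 versus p0."""
    return expenditure(p1, u, shares, subsistence) / expenditure(p0, u, shares, subsistence)

shares = (0.3, 0.7)                    # hypothetical budget shares
p0, p1 = (1.0, 1.0), (2.0, 1.5)        # hypothetical base and current prices
u_base, u_current = 1.0, 5.0           # utility of base and current baskets

# Homothetic case: the index does not depend on the reference utility.
homothetic_lk = konus(p1, p0, u_base, shares, (0.0, 0.0))
homothetic_pk = konus(p1, p0, u_current, shares, (0.0, 0.0))

# Non-homothetic case: a subsistence requirement makes the two indexes differ.
stone_geary_lk = konus(p1, p0, u_base, shares, (1.0, 0.0))
stone_geary_pk = konus(p1, p0, u_current, shares, (1.0, 0.0))
```

In the homothetic case both values equal 2^0.3 · 1.5^0.7, whereas with a positive subsistence quantity the choice of reference basket changes the index.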
Let us define Q(Ua (q(b)), p(c)) ∈ V to be the basket of minimal cost in
the pricing at time c with utility equivalent to the consumer’s basket at time
b as measured by the consumer’s utility function at time a. With the above
notation, the traditional Laspeyres Konus and Paasche Konus price indices
are given by the formulas:
    P_{LK}(p(1), p(0); U(q_0)) = \frac{p(1) \cdot Q(U(q(0)), p(1))}{p(0) \cdot Q(U(q(0)), p(0))}    (2.40)
    P_{PK}(p(1), p(0); U(q_1)) = \frac{p(1) \cdot Q(U(q(1)), p(1))}{p(0) \cdot Q(U(q(1)), p(0))}    (2.41)
Here, the subscript on the utility function is suppressed because both
these indexes implicitly assume that the preferences of the consumer, i.e. the
utility function with which he is maximising, remain absolutely constant over
time. As pointed out before, the Divisia index does not make this unrealistic
assumption. If we relax this assumption for the Konus Price index, allowing
preferences to change, we find that as well as specifying which basket is the
reference basket, we need to specify which utility function we will use as the
reference function, base time preferences or current time preferences. We
thus define four Konus indexes:
Definition 15 The dynamic utility Konus price indexes are defined by the
formulas:
    P_{LKL}(p_1, p_0; U_0(q_0)) = \frac{p(1) \cdot Q(U_0(q(0)), p(1))}{p(0) \cdot Q(U_0(q(0)), p(0))}    (2.42)
    P_{LKP}(p_1, p_0; U_1(q_0)) = \frac{p(1) \cdot Q(U_1(q(0)), p(1))}{p(0) \cdot Q(U_1(q(0)), p(0))}    (2.43)
    P_{PKL}(p_1, p_0; U_0(q_1)) = \frac{p(1) \cdot Q(U_0(q(1)), p(1))}{p(0) \cdot Q(U_0(q(1)), p(0))}    (2.44)
    P_{PKP}(p_1, p_0; U_1(q_1)) = \frac{p(1) \cdot Q(U_1(q(1)), p(1))}{p(0) \cdot Q(U_1(q(1)), p(0))}    (2.45)
PLKL gives the change in the cost of achieving the utility associated with
the base time basket under base time preferences, whereas PLKP gives the
change in the cost of achieving the utility associated with the base time basket
under current preferences. Similarly, PP KL represents the change in the cost
of achieving the utility associated with the current basket under base time
preferences, while PP KP gives the change in the cost of achieving the utility
associated with the current basket under current preferences.
In this section we will prove the equality between the infinitely chained
versions of these indexes and the Divisia.
The infinitely chained Laspeyres-Konus-Laspeyres price index over the interval [t0 , t1 ] is
    P^{\infty}_{LKL}(t_0, t_1) = \lim_{s \to \infty} \prod_{a=0}^{s-1} P_{LKL}\left( \frac{a (t_1 - t_0)}{s} + t_0,\; \frac{(a + 1)(t_1 - t_0)}{s} + t_0 \right)    (2.46)
For the sake of simplicity we shall assume that the intervals under consider-
ation have been shifted and scaled to coincide with [0, 1].
Our strategy to demonstrate the equivalence
    P^{\infty}_{LKL}(0, 1) = P_D(0, 1)
will be to
1. Re-express the infinite limit of finite products as an infinite limit of
finite sums.
    = \exp\left( \lim_{s \to \infty} \sum_{a=0}^{s-1} \ln\left( P_{LKL}\left( \frac{a}{s}, \frac{a+1}{s} \right) \right) \right)    (2.49)
    \phi^{LKL}_s(x) = s \ln\left( P_{LKL}\left( \frac{a}{s}, \frac{a+1}{s} \right) \right), \qquad \frac{a}{s} < x \le \frac{a+1}{s}    (2.50)
where s represents the number of ‘links’ in our chain and a is a non-negative
integer less than s. These functions have been specifically constructed so
that their integrals
    \int_0^1 \phi^{LKL}_s(x) \, dx = \sum_{a=0}^{s-1} \ln\left( P_{LKL}\left( \frac{a}{s}, \frac{a+1}{s} \right) \right)    (2.51)
are exactly equal to the natural logarithms of the chained PLKL -indices.
Let us look at the limit as s → ∞ of values of these functions φ^{LKL}_s (x0 )
for a fixed point x0 ∈ [0, 1].
Let as (x0 ) be the sequence of integers such that
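    = \lim_{r \to 0} \frac{ \ln\left( P_{LKL}\left( \frac{a_s(x_0)}{s}, \frac{a_s(x_0)}{s} + r \right) \right) - \ln\left( P_{LKL}\left( \frac{a_s(x_0)}{s}, \frac{a_s(x_0)}{s} \right) \right) }{r}    (2.57)
    = \lim_{r \to 0} \frac{1}{r} \left[ \ln\left( \frac{ p\left( \frac{a_s(x_0)}{s} + r \right) \cdot Q\left( U_{\frac{a_s(x_0)}{s}}\left( q\left( \frac{a_s(x_0)}{s} \right) \right), p\left( \frac{a_s(x_0)}{s} + r \right) \right) }{ p\left( \frac{a_s(x_0)}{s} \right) \cdot q\left( \frac{a_s(x_0)}{s} \right) } \right) - \ln\left( \frac{ p\left( \frac{a_s(x_0)}{s} \right) \cdot Q\left( U_{\frac{a_s(x_0)}{s}}\left( q\left( \frac{a_s(x_0)}{s} \right) \right), p\left( \frac{a_s(x_0)}{s} \right) \right) }{ p\left( \frac{a_s(x_0)}{s} \right) \cdot Q\left( U_{\frac{a_s(x_0)}{s}}\left( q\left( \frac{a_s(x_0)}{s} \right) \right), p\left( \frac{a_s(x_0)}{s} \right) \right) } \right) \right]    (2.58)
    = \frac{\partial \ln(P_{LKL}(t_0, t_1))}{\partial t_1} \Big|_{t_0 = t_1 = x_0}    (2.59)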
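    = \frac{1}{p(x_0) \cdot q(x_0)} \cdot \left( \frac{dp(t_1)}{dt_1} \cdot q(x_0) + p(t_1) \cdot \frac{\partial Q(U_{x_0}(q(x_0)), p_{t_1})}{\partial t_1} \right) \Big|_{t_1 = x_0}    (2.61)
    = \frac{\dot{p}(x_0) \cdot q(x_0)}{p(x_0) \cdot q(x_0)}    (2.69)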
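    P^{\infty}_{LKL}[0, 1] = \lim_{s \to \infty} \prod_{a=0}^{s-1} P_{LKL}\left( \frac{a}{s}, \frac{a+1}{s} \right)    (2.71)
    = \exp\left( \lim_{s \to \infty} \sum_{a=0}^{s-1} \ln\left( P_{LKL}\left( \frac{a}{s}, \frac{a+1}{s} \right) \right) \right)    (2.72)
    = \exp\left( \lim_{s \to \infty} \int_0^1 \phi^{LKL}_s(t) \, dt \right) = \exp\left( \int_0^1 \lim_{s \to \infty} \phi^{LKL}_s(t) \, dt \right)    (2.73)
    = \exp\left( \int_0^1 \phi^{LKL}_{\infty}(t) \, dt \right)    (2.74)
    = \exp\left( \int_0^1 \frac{\dot{p}(t) \cdot q(t)}{p(t) \cdot q(t)} \, dt \right)    (2.75)
    = \exp\left( \int_0^1 \sum_{i=1}^n \frac{p_i \cdot q_i}{\sum_{j=1}^n p_j \cdot q_j} \frac{dp_i}{p_i} \right) = P_D[0, 1]    (2.76)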
QED
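The convergence just established can be illustrated with a toy consumer. The sketch below is an assumption-laden example, not the construction used in the proof: preferences are Cobb-Douglas with a hypothetical drifting share a1(t) = 0.3 + 0.4t, and prices follow the hypothetical paths p1 = e^t, p2 = e^{2t}. For Cobb-Douglas preferences each link of the chain has the closed form Π_i (p_i(t1)/p_i(t0))^{a_i(t0)}, and the Divisia index is exp(∫ [a1(t) + 2(1 − a1(t))] dt) = e^{1.5}.

```python
import math

def share(t):
    """Hypothetical drifting Cobb-Douglas share of good 1 (an assumption)."""
    return 0.3 + 0.4 * t

def log_prices(t):
    """Hypothetical price paths p1 = e^t and p2 = e^(2t), returned in logs."""
    return (t, 2.0 * t)

def link_log_index(t0, t1):
    """Log of one LKL link: base-time preferences, base-time basket."""
    a1 = share(t0)
    lp0, lp1 = log_prices(t0), log_prices(t1)
    return a1 * (lp1[0] - lp0[0]) + (1.0 - a1) * (lp1[1] - lp0[1])

def chained_index(links):
    """Chained LKL index over [0, 1] built from the given number of links."""
    total = sum(link_log_index(k / links, (k + 1) / links) for k in range(links))
    return math.exp(total)

divisia = math.exp(1.5)   # closed-form Divisia index for these assumed paths
errors = [abs(chained_index(s) - divisia) for s in (1, 10, 100, 1000)]
```

The entries of `errors` shrink as the number of links grows, in line with the limit statement above.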
2.5 Conclusions
A comparatively small amount of work seems to have been done on the
analysis of the welfare implications of the “little understood”5 Divisia index.
This is especially curious as in this paper we have seen that for an idealised
representative consumer with static or dynamic preferences, the Divisia price
index is peerless in its ability to maintain constant utility.
In retrospect this is perhaps not so surprising for two main reasons:
means for evaluating estimates of the CPI, which is of particular relevance
given the current Congressional debate on this issue.
Chapter 3
3.1 Introduction
The problem of controlling rural migration into overtaxed
urban areas presents one of the great challenges in development economics.
Thirty years of intensive study by economists have spawned numerous pol-
icy initiatives intended to combat the phenomenon. Nevertheless, despite
scattered success, many LDCs continue to report an increasing demographic
shift towards urban areas.
The pioneering work of Lewis (1954) and Todaro (1969) approached the
question of rural-to-urban labor migration in the face of high urban unem-
ployment and a high urban to rural wage ratio as an individual decision
making problem. The basic behavioral equation1 developed by Todaro can
be expressed by
    V(0) = \int_{t=0}^{\tau} [P(t) Y_u(t) - Y_r(t)] e^{-rt} \, dt - C(0)    (3.1)
where V (0) represents the present discounted value of the move, P represents
the probability of getting a job in the urban market, Yu represents urban in-
come, and Yr represents rural income. C is the one time cost of moving. This
model was then extended by Harris and Todaro (1970) to discuss productivity
and policy implications.
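A minimal numerical sketch of equation (3.1) follows. The parameter values are hypothetical and the integral is approximated with a midpoint rule; with constant P, Yu and Yr the result can be compared against the closed form (P·Yu − Yr)(1 − e^{−rτ})/r − C(0).

```python
import math

def migration_value(prob, urban_income, rural_income,
                    discount_rate, horizon, moving_cost, steps=10000):
    """Midpoint-rule approximation of V(0) in equation (3.1).

    prob, urban_income and rural_income are functions of time t.
    """
    h = horizon / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        gap = prob(t) * urban_income(t) - rural_income(t)
        total += gap * math.exp(-discount_rate * t) * h
    return total - moving_cost

# Hypothetical constant parameters (assumptions for illustration only).
P, Y_U, Y_R, R, TAU, C0 = 0.5, 300.0, 100.0, 0.1, 10.0, 200.0

value = migration_value(lambda t: P, lambda t: Y_U, lambda t: Y_R, R, TAU, C0)
closed_form = (P * Y_U - Y_R) * (1.0 - math.exp(-R * TAU)) / R - C0

# In this risk-neutral model a higher rural income always lowers V(0).
value_higher_rural = migration_value(lambda t: P, lambda t: Y_U,
                                     lambda t: Y_R + 20.0, R, TAU, C0)
```

The last comparison shows the comparative static that the risk-neutral model takes for granted: raising Yr can only reduce the value of moving.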
1 Cf., for example, Cole and Sanders 1985.
While this straightforward approach has enabled economists to under-
stand some of the decision making involved in migration, it has been pointed
out that there are pervasive phenomena that are unexplained by this model.
One such issue is that of remittances. It has been found that 10%-30% of a
migrant’s income may be transferred in remittances (Lucas and Stark 1988).
There are various theories based on pure self interest or pure altruism to
explain these remittances. Lucas and Stark also discuss migration as a con-
tractual risk sharing proposition: as there are basically non-correlated risks
involved in rural production and urban job search, the two parties can co-
insure, thereby diversifying the risk. The extent to which such co-operative
behavior is found however, suggests that it might be useful to think of the
decision making process not at the individual level, but instead at the house-
hold level. As Lauby and Stark (1988) contend
In fact there were already indications from the authors of the original
individual migrant models that the ultimate goal should be to move the level
of analysis from laborer to household. According to Harris and Todaro:
“...this notion that migrants retain their ties to the rural sec-
tor is quite common and manifested by the phenomenon of the
extended family system and the flow of remittances to rural rela-
tives of large proportions of urban earnings. However, the reverse
flow, i.e. rural-urban monetary transfers is also quite common in
cases where the migrant is temporarily unemployed and therefore,
must be supported by rural relatives.”3
Policy Considerations
In the face of high urban unemployment there has been an ongoing discussion
of the best policies by which to control rural to urban migration. This dis-
cussion has focused mainly on whether the migration incentive is best viewed
as a “push” from the low wage rural sector, or a “pull” from the high wage
urban sector. Throughout the development literature it is often stated as
self-evident that
Yet the expected income hypothesis, even in its revised formu-
lations, is devoid of any explicit decisional risk content (Todaro,
1976, 1980); the hypothesis does not incorporate a random vari-
able (multiplicative or other), and the implied utility function is
linear.”6
It is only once one incorporates the effects of risk aversion into the fam-
ily’s decision making that one realises that there may be an income effect
associated with the rise in rural incomes that may have adverse effects.
It is shown in this paper that if household utility can be characterised by
Decreasing Absolute Risk Averse functions, then it is possible for increases
in rural wages to in fact lead to ‘perverse migration’; i.e. depending on which
point of their utility function the family is at, an increase in rural wages may
increase the incentive to send workers into the city.
Proposition 11 Assume that u(x) is the Bernoulli utility function of an
investor displaying decreasing absolute risk aversion (DARA). Assume fur-
ther that the investor is given the opportunity to invest amounts of wealth
w = w1 , w2 in shares of a safe asset returning r dollars for every r dollars
invested and a risky asset returning z dollars with cumulative distribution
function F . Then if w1 < w2 , the optimal investments in the risky asset will
obey α1∗ < α2∗ .
Proof:
We will show that if the utility function displays the property of Decreas-
ing Absolute Risk Aversion, then at higher levels of wealth the individual
will invest more in the risky asset.
Consider the two levels of initial wealth w1 < w2 . Denote the increments
or decrements to wealth by z. Then the individual evaluates risk at wi by
the Bernoulli utility functions ui (z) = u(wi + z).
Because u is posited to exhibit DARA we are assured that whenever
w1 < w2 , u1 (z) = u(w1 + z) is a concave transformation of u2 (z) = u(w2 + z).
The expected utility maximising investor with wealth wi seeks to max-
imise the expression
    \int u(w_i - \alpha_i + \alpha_i z) \, dF(z) = \int u_i(-\alpha_i + \alpha_i z) \, dF(z) = 0.    (3.2)
As we know, the concavity of u(·) implies that the functions φi (·) are
decreasing. Therefore if we show φ1 (α2∗ ) < 0, it must follow that α1∗ < α2∗ ,
which is the result we are seeking. Now by the hypothesis that u exhibits
DARA there must exist an increasing concave function ψ(x) such that the
concave transformation, u1 (x) = ψ(u2 (x)) holds for all x. ψ′(x) is therefore
positive and decreasing. This allows us to assert that
    \phi_1(\alpha_2^*) = \int (z - 1) \, \psi'(u_2(\alpha_2^*[z - 1])) \, u_2'(\alpha_2^*[z - 1]) \, dF(z) < 0.    (3.4)
The reasoning is that the integral
    \phi_2(\alpha_2^*) = \int (z - 1) \, u_2'(\alpha_2^*[z - 1]) \, dF(z)    (3.5)
    = \int_{-\infty}^{1} (z - 1) \, u_2'(\alpha_2^*[z - 1]) \, dF(z) + \int_{1}^{\infty} (z - 1) \, u_2'(\alpha_2^*[z - 1]) \, dF(z) = 0.    (3.6)
is closely related to (3.4) with the first summand in line (3.6) exhibiting
a negative integrand and the second summand containing a positive one.
The integral in line (3.5) differs from the integral in equation (3.4) only by
the presence of the ‘weighting function’ ψ′(u2 (α2∗ [z − 1])) which due to its
decreasing nature will weight the negative integral more heavily. This gives
the desired result.
QED
Thus we see that increases in the wealth variable w lead to greater in-
vestment in the risky asset. Under such a shift in the budget line we see that
the risky asset behaves as a normal good.
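The comparative static in Proposition 11 can be illustrated numerically. The sketch below assumes logarithmic utility (which exhibits DARA), a safe asset with gross return 1 as in the expression w − α + αz of equation (3.2), and a hypothetical two-point distribution for the risky return z. It locates the optimum by bisection on the first-order condition φ(α) = E[(z − 1) u′(w + α(z − 1))], which is decreasing in α by concavity.

```python
def marginal_value(alpha, wealth, outcomes):
    """phi(alpha) = E[(z - 1) u'(w + alpha (z - 1))] with u = ln, so u'(x) = 1/x."""
    return sum(prob * (z - 1.0) / (wealth + alpha * (z - 1.0))
               for z, prob in outcomes)

def optimal_risky_investment(wealth, outcomes, tol=1e-10):
    """Bisection on the decreasing function phi over [0, wealth]."""
    lo, hi = 0.0, wealth
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_value(mid, wealth, outcomes) > 0.0:
            lo = mid      # still profitable to invest more at the margin
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical risky return: doubles or halves with equal probability.
outcomes = [(2.0, 0.5), (0.5, 0.5)]

alpha_low = optimal_risky_investment(10.0, outcomes)    # wealth w1 = 10
alpha_high = optimal_risky_investment(20.0, outcomes)   # wealth w2 = 20
```

For this particular distribution the first-order condition gives α* = w/2, so the optimal holdings are 5 and 10: the wealthier investor holds more of the risky asset, as the proposition asserts.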
The migration model can be regarded as just such a portfolio allocation
decision but with an important twist: the ‘wealth’ of the household is rep-
resented not by an amount of money but by the labor force within the unit.
In such a situation, direct application of the proposition above applies to an
increase in the family size n leading to an increase in the number of ‘assets’
(namely workers) being sent to the urban market. While this would give
rise to interesting population policy considerations, it is not the case being
discussed here.
It is reasonably clear that an increase in the rural wage cannot be for-
mulated to fit the structure of the above proposition. Were this possible, we
would be left with the uncomfortable conclusion that rural wage increases
could lead only to greater rural-to-urban migration and migration models
would be universally perverse. That this cannot be the case can be seen by
considering an increase in the rural wage beyond urban levels.
In the case under consideration, the potential for an increase in expected
return on investment is coming from an increase in the return to the ‘safe
asset’. Therefore, while we would expect something like the income effect
discussed above, there is also a substitution effect: the increase in the rural
wage naturally increases the relative attractiveness of the rural market. In
fact migration theory has focused on this substitution effect to the exclusion
of the ‘income’ effect, thereby positing the sort of policy discussed above.
What we see here is that with DARA utility functions, and the fact that the
increase in income is coming precisely from an increase in the return to the
safe asset, income and substitution effects move in opposite directions.
In the case of consumer goods it is assumed that even when these ef-
fects conflict, it is highly unlikely that the income effect will outweigh the
substitution effect. This is generally true because in order for the income
effect of a price rise in any good to be large, the good must constitute a large
proportion of the consumption basket. While this is unlikely to be true for
any good in a consumption situation, it is almost certainly true in the case of
employment. As there are only two assets between which a consumer must
choose, it is very likely that income effects will be substantial. The omission
of any discussion of this when forming policy can therefore lead to
counterproductive efforts.
3.3 A Risk Sensitive Household Migration Model.
Our aim in this section is to construct household migration models which
incorporate risk sensitivity into labour allocation decisions. While this seems
straightforward enough, a literature search indicates that such models are
either quite obscure or have not been introduced at all.7
There is much debate in the current development literature about the
appropriate methods of modeling household decision making. The problem
is rich in intricacies which include principal-agent problems, gender specific
labour8 , migrant network structures and various game theoretic considera-
tions. Given the wide variety of phenomena which are being uncovered, it is
unlikely that a ‘universal’ model will be settled upon any time soon. For the
sake of simplicity we will thus work with a household whose decision making
is well approximated by the maximisation of a household utility function that
views the decisions as being made by a single entity. While such a choice
does not pretend to universality, it is known to arise in a variety
of situations (e.g. a benevolent dictatorship, a nuclear family whose interests
are similar enough to work well in aggregation or an individual decision
maker) and may lend itself most easily to varied adaptation.
We define below the category of rural household models used in this paper.
H = (ν, n, wr , wu , πr , πu , c) (3.7)
where
    \frac{d\nu(x)}{dx} > 0 \quad \forall x \ge 0    (3.8)
guaranteeing that the function is monotonically increasing.
7 In his 1991 book, Stark encountered a related lacuna, writing “To the best of our
knowledge, the argument that aversion to risk is a major cause of rural-to-urban migration
has appeared only in Stark (1978,1981)”.
8 Gender issues in particular may play a role in migration decision making according to
the suggestion of Lauby and Stark 1988.
2. The number of workers in the household is given by a positive integer
n ∈ Z+ .
3. The rural wage is given as 0 < wr .
4. The urban wage wu is given by a value satisfying wr ≤ wu .
5. Rural employment is full, i.e. πr = 1.
6. The probability of urban employment is 0 < πu ≤ πr .
7. A Utility function UH (m) giving the expected utility of allocating m of
the household’s n workers for rural labour while sending the remaining
n − m workers to search for urban employment.
We can then calculate the expected utility of a family allocating m of
its n workers for rural labour. In this paper we shall assume for simplic-
ity that both labour markets make all employment offers at a single time
(e.g. spot labour markets in which all work opportunities are offered before
the beginning of the work day so that the chance of later employment is
negligible.)
Proposition 12 The expected utility of allocating m of the family’s n work-
ers to the rural labour market is given by
    U_{(\nu, n, w_r, w_u, \pi_r, \pi_u, c)}(m) = U_H(m)    (3.9)
    = \sum_{i=0}^{m} \sum_{j=0}^{n-m} \binom{m}{i} \binom{n-m}{j} (\pi_r)^i (1 - \pi_r)^{m-i} (\pi_u)^j (1 - \pi_u)^{n-m-j} \, \nu(i w_r + j w_u - (n - m) c).
Proof:
Let us assume that we have an ordered set of r identical though unrelated
binary events {X_i}_{i=1}^{r} (e.g. urban job searches) which all carry the probability
πA of outcome A (urban employment) and 1 − πA of outcome B (urban
unemployment). Then the probability π that only the first s events yield the
outcome A is given by
    \pi = \pi_A^s (1 - \pi_A)^{r-s}.    (3.10)
If one wishes to instead calculate the probability π̃ that exactly s of the
possible outcomes result in A (without respect to which events return the
result) the probability is given by the discrete binomial distribution:
    \tilde{\pi} = \frac{r!}{s!(r - s)!} \pi_A^s (1 - \pi_A)^{r-s} = \binom{r}{s} \pi_A^s (1 - \pi_A)^{r-s}    (3.11)
where the binomial coefficient \binom{r}{s} is the number of distinct ways of getting
the outcome A from exactly s of the r events {X_i}_{i=1}^{r}.
The above analysis tells us that if either all workers are allocated to a
single labour market (m = n or m = 0) or both rural and urban probabilities
are identical (πr = πu ), the resulting probability structure will be given by a
discrete binomial distribution. In our situation however we have two markets
which are governed by non-conditional probabilities which are in general not
equal. Therefore, since such probabilities are multiplicative, the expected
utility of a household which allocates m of its n workers for rural labour is
given by the following ‘hybrid’ distribution:
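    = \sum_{i=0}^{m} \sum_{j=0}^{n-m} \frac{m!}{i!(m - i)!} (\pi_r)^i (1 - \pi_r)^{m-i} \frac{(n - m)!}{j!(n - m - j)!} (\pi_u)^j (1 - \pi_u)^{n-m-j} \, \nu(i w_r + j w_u - (n - m) c)    (3.13)
    = \sum_{i=0}^{m} \sum_{j=0}^{n-m} \binom{m}{i} \binom{n-m}{j} (\pi_r)^i (1 - \pi_r)^{m-i} (\pi_u)^j (1 - \pi_u)^{n-m-j} \, \nu(i w_r + j w_u - (n - m) c)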
QED
3.4 Perverse Migration Incentives
We will show below that the migration model introduced in the previous
section may behave in a fashion exactly opposite to that expected from the
migration literature. In such situations, attempts to boost the attractiveness
of rural labour will serve to increase the appeal of migration to the urban
labour market.
While such phenomena can be shown to occur for families with any num-
ber of workers n exceeding 2, the higher cases do not contribute any quali-
tatively new phenomena; we will thus restrict our attentions to a 2 worker
household for ease of illustration.
In order to clarify our exposition, it will be helpful to define a condition
somewhat weaker than perverse migration which refers only to the effect
on incentive structure. The following definition is given for the two worker
household although the concept can be defined more generally:
With these definitions set, let us look at the case of a two worker house-
hold. In this particular case we have the explicit formulas for the expected
utility of the 3 possible labour allocations:
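    \frac{\partial U_{w_r}(2, 0)}{\partial w_r} = 2 \dot{\nu}(2 w_r)    (3.17)
    \frac{\partial U_{w_r}(1, 1)}{\partial w_r} = (1 - \Pi_u) \dot{\nu}(w_r) + (\Pi_u) \dot{\nu}(w_r + w_u)    (3.18)
    \frac{\partial U_{w_r}(0, 2)}{\partial w_r} = 0    (3.19)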
relative to changes in the rural wage.
As can be seen from the formula in line (3.16), the strategy of sending both
workers to the urban labour market carries a significant risk of a complete
loss of income. It is thus important to make sure that whatever functional
form is posited for ν(x) above wr , the value for ν(0) must be low enough
to prevent ‘gambling’ with the (0, 2) strategy in all but the most extreme
circumstances as the migrants under consideration are taken to belong to
risk averse subsistence level households.9
If we assume that the family was initially indifferent between having 1 or
2 workers in the rural market
Uwr (2, 0) − Uwr (1, 1) = 0 (3.20)
then asking whether rural wage increases will lead to perverse migration is
equivalent to asking whether
    \frac{\partial U_{w_r}(2_r, 0_u)}{\partial w_r} - \frac{\partial U_{w_r}(1_r, 1_u)}{\partial w_r} = 2 \dot{\nu}(2 w_r) - (1 - \Pi_u) \dot{\nu}(w_r) - (\Pi_u) \dot{\nu}(w_r + w_u)    (3.21)
is positive or negative.
In order for the increase in rural wages to cause an increase in the utility
of sending workers to the urban labour market, we require
    2 \dot{\nu}(2 w_r) < (1 - \Pi_u) \dot{\nu}(w_r) + (\Pi_u) \dot{\nu}(w_r + w_u)    (3.22)
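Condition (3.22) is easy to test for a given utility function. The sketch below is an illustration under assumptions: it uses the hypothetical DARA function ν(x) = ln(ln(x)), whose derivative is 1/(x ln x), and contrasts it with ν(x) = ln(x), for which 2ν̇(2wr) = 1/wr always exceeds the right-hand side so the condition can never hold. The wage and probability values are invented for the example.

```python
import math

def perverse_incentive(nu_prime, w_r, w_u, prob_urban):
    """Check inequality (3.22) for a two worker household."""
    lhs = 2.0 * nu_prime(2.0 * w_r)
    rhs = (1.0 - prob_urban) * nu_prime(w_r) + prob_urban * nu_prime(w_r + w_u)
    return lhs < rhs

def loglog_prime(x):
    """Derivative of the illustrative utility nu(x) = ln(ln(x)), valid for x > 1."""
    return 1.0 / (x * math.log(x))

def log_prime(x):
    """Derivative of nu(x) = ln(x)."""
    return 1.0 / x

# Hypothetical wages and urban employment probabilities.
low_probability_case = perverse_incentive(loglog_prime, 20.0, 60.0, 0.1)
high_probability_case = perverse_incentive(loglog_prime, 20.0, 60.0, 0.3)
log_utility_case = perverse_incentive(log_prime, 20.0, 60.0, 0.1)
```

With these numbers the inequality holds for ln(ln(x)) when the urban employment probability is 0.1, fails when it is 0.3, and fails for ln(x).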
On the pages that follow it is shown that (for a two worker household)
any unbounded DARA utility function ν which is a concave transformation
of the CRRA function ln(x) is capable of producing perverse migration at
every rural wage level. Intuitively this indicates that functions which are
‘intermediate’ between CARA and CRRA will exhibit perversity for a two
worker household.
9 See for example Sen (1984) Pg.260
“In matters of ‘life and death’ such as these decisions involve, affecting one’s
entire economic existence, the assumption of expected utility maximization
is recognized to be quite restrictive.”
First, we establish that for any concave transformation of the CRRA
function ln(x), there are always urban employment probabilities leading to
perverse incentives (without restriction on the urban wage).
    c > \frac{\psi'(\ln(2 w_r))}{\psi'(\ln(w_r))} \quad \forall w_r \in [a, b]    (3.24)
for some constant c ∈ [0, 1]. Then for any probability of urban employment
Πu satisfying
1 − c ≥ Πu (3.25)
all rural wage increases will lead to perverse incentives by increasing the
relative expected utility of the urban labour market.
Proof:
By hypothesis we have
1 − Πu ≥ c (3.26)
implying the inequality
    (1 - \Pi_u) > \frac{\psi'(\ln(2 w_r))}{\psi'(\ln(w_r))}    (3.27)
    2 \psi'(\ln(2 w_r)) \frac{1}{2 w_r} < (1 - \Pi_u) \psi'(\ln(w_r)) \frac{1}{w_r}    (3.29)
    2 \frac{d\psi(\ln(x))}{dx} \Big|_{2 w_r} < (1 - \Pi_u) \frac{d\psi(\ln(x))}{dx} \Big|_{w_r}    (3.30)
implying
From this theorem we can see that while the standard CARA function
−exp(−x) is not everywhere a concave transformation of the CRRA ln(x),
it nevertheless leads to perverse incentives.
as ln(1) = 0. By the fact that ψ(x) is an increasing concave function above
x = ln(1) = 0, we are assured that for wr large
    \frac{\psi'(\ln(2 w_r))}{\psi'(\ln(w_r))} < 1    (3.39)
and there will thus exist probabilities 1 > Πu > 0 such that
    \frac{\psi'(\ln(2 w_r))}{\psi'(\ln(w_r))} < 1 - \Pi_u    (3.40)
While perverse incentives indicate that raises in the rural wage are ineffec-
tive for combating rural to urban migration, they do not necessarily indicate
that effort is counterproductive to the point of inducing migration. The fol-
lowing theorem shows that if the concave transformation ψ is unbounded,
then for every choice of wr , it will always be possible to find an urban wage
wu which will lead to perverse migration for a marginal increase in the rural
wage.
then
∀wr ∈ [a, b] ∃wu > wr (3.42)
such that the family is indifferent between the pure rural strategy and the
mixed rural/urban strategy
with
    \frac{\partial U_{w_r}(2, 0)}{\partial w_r} < \frac{\partial U_{w_r}(1, 1)}{\partial w_r}    (3.44)
leading to perverse migration.
Proof:
First, we recall that the expected utility for the ‘mixed strategy’ is given
by:
Uwr (1, 1) = (1 − Πu )ν(wr ) + (Πu )ν(wr + wu ). (3.45)
Now by hypothesis we know that
ν(x) = ψ(ln(x)) (3.46)
with ψ(x) unbounded from above. However, since ln(x) is also unbounded,
we see from (3.45) that
    \lim_{w_u \to +\infty} U_{w_r, w_u}(1, 1) = \infty.    (3.47)
However, since the family is assumed to be risk averse we are assured that if
wu were set equal to wr we would have
Uwr ,wu (2, 0) > Uwr ,wu (1, 1). (3.48)
Thus by the intermediate value theorem we are assured that
∃wu > wr s.t. Uwr ,wu (2, 0) = Uwr ,wu (1, 1) (3.49)
as Uwr ,wu (2, 0) has no dependence on wu . Thus if wu is so chosen for a par-
ticular rural wage wr ∈ [a, b], the household will exhibit perverse migration
for marginal increases in the rural wage wr .
QED
    w_u = \exp\left( \exp\left( \frac{ \ln(\ln(2 w_r)) - \frac{\ln(w_r)}{\ln(2 w_r)} \ln(\ln(w_r)) }{ 1 - \frac{\ln(w_r)}{\ln(2 w_r)} } \right) \right) - w_r    (3.52)
lead to a 2 worker household being indifferent between the pure rural strategy
and a mixed urban/rural strategy with marginal increases in the rural wage
leading to perverse migration.
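Formula (3.52) can be sanity-checked numerically. The sketch below rests on two assumptions that should be read as such: the utility function is ν(x) = ln(ln(x)), and the urban employment probability is taken to be Πu = 1 − ln(wr)/ln(2wr), which is the value under which (3.52) solves the indifference condition ν(2wr) = (1 − Πu)ν(wr) + Πu ν(wr + wu). The rural wage of 20 is hypothetical and moving costs are set to zero.

```python
import math

def nu(x):
    """Illustrative utility nu(x) = ln(ln(x)), defined for x > 1."""
    return math.log(math.log(x))

def nu_prime(x):
    return 1.0 / (x * math.log(x))

def indifference_urban_wage(w_r):
    """Urban wage given by formula (3.52)."""
    ratio = math.log(w_r) / math.log(2.0 * w_r)
    exponent = (nu(2.0 * w_r) - ratio * nu(w_r)) / (1.0 - ratio)
    return math.exp(math.exp(exponent)) - w_r

w_r = 20.0                                               # hypothetical rural wage
prob_urban = 1.0 - math.log(w_r) / math.log(2.0 * w_r)   # assumed probability
w_u = indifference_urban_wage(w_r)

# Utilities of the pure rural and mixed strategies.
pure_rural = nu(2.0 * w_r)
mixed = (1.0 - prob_urban) * nu(w_r) + prob_urban * nu(w_r + w_u)

# Marginal effects of a rural wage increase on each strategy.
slope_pure = 2.0 * nu_prime(2.0 * w_r)
slope_mixed = (1.0 - prob_urban) * nu_prime(w_r) + prob_urban * nu_prime(w_r + w_u)
```

Under these assumptions `pure_rural` and `mixed` coincide up to rounding, while `slope_mixed` exceeds `slope_pure`, so a marginal rural wage increase tips the indifferent household towards the mixed strategy. The implied urban wage is very large for this particular ν, which reflects how slowly ln(ln(x)) grows.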
Proof: First, it can be readily checked that (3.50) is decreasing for x > 1
and can be solved by
ν(x) = ln(ln(x)) + c (3.53)
where for simplicity we take c = 0.
Next we see that if we take the difference in utilities between the two
strategies under consideration we have
discuss below a proposed method of relating the models of the current paper
with models which deal with uncertainty in the agricultural incomes.
The policies proposed for raising the attractiveness of rural labour may
attempt to do so by raising rural wages directly, or by attempting to reduce
the fluctuations associated with agricultural incomes. In the latter case it is
possible to model rural incomes as lotteries in which the certainty of rural
employment is preserved but the possible incomes fluctuate with agricultural
prices. The importance of this change in perspective is that it could be used
to incorporate the effects of agricultural price stabilisation policy into the
rural-to-urban migration decision.
In this situation we would have two cumulative distribution functions,
with $G_1(x)$ representing the natural variation in rural incomes and
$G_2(x)$ giving the variation which would result from government
stabilisation efforts. If the government is successful in doing no more than
stabilising rural incomes, we could expect $G_1(x)$ to be a mean preserving
spread of $G_2(x)$. In either case, if we call $Z_0^{1,2}$ the certainty equivalents
$$Z_0^i = \nu^{-1}\left(\int \nu(x)\, dG_i(x)\right) \tag{3.59}$$
of the two lotteries $G_1(x)$, $G_2(x)$, we know, for a household governed by
a risk averse utility function $\nu(x)$, that
3.5 Examples
Once one is made aware of the possibility of perverse migration, examples of
perverse models appear to be ubiquitous. It is thus not too difficult to find
examples of models where:
2. The values for probabilities and wages are not very different from
those found in actual labour markets.
While such models are not intended to mirror any particular labour mar-
ket, they do indicate that such models exist without recourse to unrealistic
hypotheses.
Let us then look at examples in which we expect perverse migration. We
will illustrate the phenomenon by graphing the utility difference between the two
main allocation strategies as a function of the rural wage.
It has been shown that any unbounded DARA utility function ν(x) which
is a concave transformation of the CRRA function ln(x) is capable of pro-
ducing perverse migration. Here we look at a specific function and evaluate
ranges of rural and urban wages and urban probabilities that cause perversity.
Let us assume the household’s utility function is equal to
$$\nu(x) = \ln\left(\ln\frac{x}{10}\right) \tag{3.61}$$
in the range of interest (e.g. Rs. 11 and up).
We are able to plot a graph of the Rural Advantage Function (RAF) for
the household with the utility function ν. This looks at how the difference
between the utility of keeping two workers in the country and the utility of
keeping one worker in the city and one in the country changes with rises in
the rural income.
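As a rough illustration, the Rural Advantage Function can be tabulated directly. This sketch assumes that the pure rural strategy pays ν(2w_r) and that the mixed strategy has the form of (3.45). The function name `rural_advantage`, the sample wages, and the parameter values Π_u = 0.5 and w_u = Rs. 200 are chosen for illustration only.

```python
import math

def nu(x):
    # Utility of equation (3.61): nu(x) = ln(ln(x / 10)), defined for x > 10.
    return math.log(math.log(x / 10.0))

def rural_advantage(w_r, w_u, pi_u):
    # Assumed RAF: utility of two rural workers minus utility of one rural
    # worker plus one urban job seeker employed with probability pi_u.
    u_rural = nu(2.0 * w_r)
    u_mixed = (1.0 - pi_u) * nu(w_r) + pi_u * nu(w_r + w_u)
    return u_rural - u_mixed

# Illustrative grid of rural wages (in Rs.).
for w_r in (12, 20, 30, 44, 60):
    print(w_r, round(rural_advantage(w_r, w_u=200.0, pi_u=0.5), 4))
```

A negative value indicates that, under these assumptions, the household prefers to place one worker in the city; a positive value indicates that it prefers to keep both workers in the country.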
to cause perverse migration as it becomes preferable for the family to send
another worker into the city.
We will look at two specific cases. In the first case we set the urban
probability of employment equal to 0.5 and observe the range of rural wages
wr which will cause perverse migration with an urban wage of Rs. 200. (It will
be noted in this example that a rural wage increase would have to more than
double the rural earnings before it brought the migrant out of the range of
perversity.) In the second case we look at a situation with an urban
probability of employment of 0.4 and an urban wage rate of Rs. 500.
We assume in these examples that the negative utility of a zero wage
is taken to be large enough that the family will consider it prohibitive to
gamble both workers in the city (as this strategy necessarily entails a high
risk of earning nothing).
Case 1
We set:
Πu = .5
wu = Rs.200
Perverse migration is caused whenever rural wages wr rise or fall to any
wage within the range of Rs. 15 to Rs. 44. (See Graph 1)
Graph 1:
Case 2
We set:
Πu = .4
wu = Rs.500
Perverse migration is caused whenever rural wages wr rise or
fall to any wage within the range of Rs. 20 to Rs. 56. (See Graph
2)
Graph 2:
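The wage ranges quoted in Cases 1 and 2 can be checked with a short scan. This is a sketch under stated assumptions: the utility is that of (3.61), the pure rural strategy is assumed to pay ν(2w_r), the mixed strategy has the form of (3.45), and rural wages are scanned on an integer grid from Rs. 11 to Rs. 100. The helper `negative_raf_range` is hypothetical.

```python
import math

def nu(x):
    # Utility of equation (3.61), defined for x > 10.
    return math.log(math.log(x / 10.0))

def rural_advantage(w_r, w_u, pi_u):
    # Assumed RAF: pure rural utility minus mixed urban/rural utility.
    return nu(2.0 * w_r) - ((1.0 - pi_u) * nu(w_r) + pi_u * nu(w_r + w_u))

def negative_raf_range(w_u, pi_u, wages=range(11, 101)):
    # Smallest and largest integer rural wage at which the mixed strategy
    # is preferred (RAF < 0); None if there is no such wage on the grid.
    hits = [w for w in wages if rural_advantage(w, w_u, pi_u) < 0.0]
    return (min(hits), max(hits)) if hits else None

case_1 = negative_raf_range(w_u=200.0, pi_u=0.5)
case_2 = negative_raf_range(w_u=500.0, pi_u=0.4)
print(case_1, case_2)
```

Under these assumptions the integer scan gives Rs. 15 to Rs. 44 for Case 1 and Rs. 20 to Rs. 56 for Case 2, in line with the ranges quoted in the text.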
3.6 Conclusion
When developing migration control policy, understanding the decision
making processes behind rural to urban migration is essential.
It has in general been assumed in the literature that there is a
theoretical equivalence between raising rural wages to dampen the
“push” factors of the low wage rural sector, and putting in place
urban wage controls to deal with the “pull” factor of a high wage
urban sector. It is shown here, however, that once attitudes to risk
are incorporated, it is possible to find that raising rural wages
may in fact have a counterproductive effect.
The migration literature has, until very recently, been based
closely on the Todaro model. While this model captures the basic
incentive structure for migration, it makes two simplifying assump-
tions that are generally acknowledged to be unrealistic, i.e.
Once these assumptions are lifted, it is found that one of the basic
conclusions of the Todaro model can no longer be relied upon.
Perverse incentives have been shown in this paper to afflict the
category of Decreasing Absolute Risk Averse utility functions. As
this is precisely the utility structure one expects to find in the
general situation, we are left with the question of how to evaluate the
likely migration effects of a generic rural wage support program.
This paper would argue that without specific information on the
particulars of the potential migrants being targeted (e.g. family
sizes, objective functions, wage rates and probabilities), the usual
conclusion drawn in the literature is not assured.
3.7 Appendix B: The Relationship Between
the Rural Household Labour Allocation
Model and the Todaro Model
The original Todaro equation serves something of a dual role in
the development literature; while it is in some sense one of the
simplest migration models, it also functions as the progenitor for
the more sophisticated models which followed it. It may there-
fore be worthwhile to see how the data specifying a rural labour
allocation model (RHLMAM) compares with the data specifying
Todaro’s equation.
In Todaro’s model (3.1), the basic behavioral equation gives a
single number V (0) whose sign is the answer to the discrete op-
timisation problem; if and only if the number is positive will the
rational individual choose to migrate. In the case of a household,
the number of options is enlarged so that there are $\frac{n^2+n}{2}$ decisions to
evaluate. We thus posit that the natural analog of Todaro’s equation
for a remitting household is given by the following ‘behavioral
matrix’ $M^H$.
Definition 19 The (n+1)×(n+1) skew-symmetric behavioral matrix
$M^H$ associated to a RHLMAM is given by the entries $\{M^H_{ij}\}_{i,j=0}^{n}$
where $M^H_{ij} = U^H(i) - U^H(j)$.
The number of rural workers m which will optimise the expected
utility of the household H is then the index of a row whose
entries are all non-negative. The greatest-numbered row with this
property will be called O(H). If we denote the collection of
household models by H:
Definition 20 Let O : H −→ Z be the integer valued function which
gives the number of workers who should be allocated to rural
labour in order to maximise the household’s expected utility.
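Definitions 19 and 20 can be sketched in a few lines. The example assumes that the expected utilities U^H(0), ..., U^H(n) have already been computed and are supplied as a list indexed by the number of rural workers. The utility values shown are hypothetical, as are the function names.

```python
def behavioral_matrix(utilities):
    # Entry (i, j) is U(i) - U(j), where utilities[i] is the household's
    # expected utility with i workers allocated to rural labour.
    return [[u_i - u_j for u_j in utilities] for u_i in utilities]

def optimal_rural_workers(utilities):
    # O(H): the greatest-numbered row whose entries are all non-negative.
    matrix = behavioral_matrix(utilities)
    rows = [i for i, row in enumerate(matrix) if all(e >= 0 for e in row)]
    return max(rows)

# Hypothetical utilities U(0), U(1), U(2) for a two-worker household.
utilities = [1.2, 1.9, 1.7]
matrix = behavioral_matrix(utilities)
print(optimal_rural_workers(utilities))
```

A row with all entries non-negative corresponds to an allocation whose utility is at least as large as every alternative, so at least one such row always exists; taking the greatest index resolves ties in favour of rural labour, as in the definition of O(H).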
With these conventions established we are now in a position to
explain how these RHLMAMs relate to the established literature.
In the following proposition it is shown that these models gener-
alise the (static) Todaro model to families or networks of labourers
governed by risk-sensitive household welfare functions.
Proposition 17 Let us assume that the following hypotheses are
made:
1. The household work force consists of an individual: n = 1.
2. The individual is risk neutral: ν(x) = x.
3. The interest rate is negligible: r = 0.
4. The probabilities of urban and rural employment are given by
constants as are the urban and rural wages: πu = P (t), πr =
1, Yu (t) = wu and Yr (t) = wr .
5. The time period τ in Todaro’s equation is set equal to unity:
τ = 1.
Then the RHLMAM is equivalent to the Todaro equation (3.1)
Proof: In order to evaluate the migration decision presented by a
single worker RHLMAM, one need only examine the $M^H_{01}$ component
of the 2 × 2 behavioral matrix,
$$M^H_{01} = U^H(0) - U^H(1) \tag{3.63}$$
which when expanded from equation (3.13) yields
$$= \pi_u\,\nu(w_u - c) + (1 - \pi_u)\,\nu(0 - c) - \pi_r\,\nu(w_r) - (1 - \pi_r)\,\nu(0) \tag{3.64}$$
$$= \pi_u w_u - \pi_r w_r + (-c)(\pi_u + (1 - \pi_u)) \tag{3.65}$$
$$= \pi_u w_u - w_r - c \tag{3.66}$$
$$= \int_{t=0}^{\tau} [P(t)\,Y_u(t) - Y_r(t)]\,e^{-rt}\,dt - C(0) = V(0) \tag{3.67}$$
giving equation (3.1) (the Todaro behavioral equation).
QED.
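The reduction in the proof can be checked numerically for one set of values. This is a sketch only: the wage, probability and cost figures are hypothetical, and the closed form used for V(0) is (3.67) evaluated under hypotheses 3 to 5 (r = 0, τ = 1, constant wages and probabilities), so that the integral collapses to P·Y_u − Y_r.

```python
def m01(nu, pi_u, pi_r, w_u, w_r, c):
    # Equation (3.64): expected utility of migrating minus that of staying.
    u_migrate = pi_u * nu(w_u - c) + (1.0 - pi_u) * nu(0.0 - c)
    u_stay = pi_r * nu(w_r) + (1.0 - pi_r) * nu(0.0)
    return u_migrate - u_stay

def todaro_v0(p, y_u, y_r, cost, tau=1.0):
    # Equation (3.67) with r = 0 and a constant integrand over [0, tau].
    return (p * y_u - y_r) * tau - cost

risk_neutral = lambda x: x                     # hypothesis 2: nu(x) = x
pi_u, w_u, w_r, c = 0.5, 200.0, 60.0, 10.0     # hypothetical values
lhs = m01(risk_neutral, pi_u, 1.0, w_u, w_r, c)  # hypothesis 4: pi_r = 1
rhs = todaro_v0(pi_u, w_u, w_r, c)
print(lhs, rhs)
```

With these values both expressions equal 30, so the individual would migrate under either formulation.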
Bibliography
[3] Banerjee, B. and Kanbur, S.M., “On the Specification and
Estimation of Macro Rural-Urban Migration Functions: With an
Application to Indian Data,” Oxford Bulletin of Economics
and Statistics, 43, 7-29, 1981
[10] Diewert, W.E., Personal Communication to the Author,
Spring, 1996
[20] Jorgenson, D.W., and Z. Griliches, “Divisia Index Numbers
and Productivity Measurement,” Review of Income and
Wealth, 17, 227-29, 1971
[31] Salvatore, D., “A Theoretical and Empirical Extension of the
Todaro Migration Model,” Regional Science and Urban
Economics, 11, 499-508, 1981