Cheeger Chung
II · 1–4
Abstract
1. Introduction
One of the major tools in spectral graph theory is the Cheeger
inequality. It concerns the relationship between two important graph in-
variants, the spectral gap and the Cheeger constant. For an (undirected)
graph G = (V, E), we denote by λG the spectral gap of the (normalized)
Laplacian (see [5]). Let hG denote the Cheeger constant of G, defined
as the minimum value h such that any cut separating a set of volume x
requires at least hx edges if x is less than half of the total volume. (The
∗ University of California, San Diego, USA
† Research supported in part by NSF Grants DMS 0457215 and ITR 0426858
2 F. Chung
\[
2h_G \ge \lambda_G \ge \frac{h_G^2}{2}.
\]
The proof of the above (discrete) Cheeger inequality is quite similar to
the proof in the continuous case in spectral geometry due to Jeff Cheeger
[4]. The idea of the proof is to use eigenvectors to guide the search for
good cuts. Indeed, the constructive proof gives the following expanded
version of the Cheeger inequality:
\[
2h_G \ge \lambda_G \ge \frac{\alpha_G^2}{2} \ge \frac{h_G^2}{2} \tag{11}
\]
where αG is the minimum Cheeger ratio (the size of the edge boundary
divided by the volume) over the sets consisting of the vertices associated
with the i largest coordinates of the eigenvector associated with λG .
A somewhat simplified proof for (11) will be given in Section 3. In
the search for the optimum cut (evaluated by the Cheeger constant)
among an exponential number of possibilities, we can then focus on
a linear number of choices for the cut using the order determined by
the eigenvector. The above Cheeger inequality guarantees that the cut
resulting from this efficient algorithm has the Cheeger ratio within a
quadratic factor of the optimum. This spectral partition algorithm has
been widely used in numerous applications in the generic divide-and-
conquer approach by reducing problems recursively to smaller and more
manageable sizes [12].
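The sweep described above can be sketched in a few lines. The following is a minimal illustration, not the author's implementation: the helper names (`cheeger_ratio`, `spectral_sweep`, `brute_force_cheeger`) and the six-vertex test graph (two triangles joined by one edge) are assumptions made for the example, and the graph is assumed to be connected with no isolated vertices. The sweep examines only n − 1 candidate cuts, while the brute-force search examines all proper subsets.

```python
import itertools
import numpy as np

def cheeger_ratio(A, S):
    """|boundary(S)| / min(vol(S), vol(G) - vol(S)) for a set S of vertex indices."""
    d = A.sum(axis=1)
    mask = np.zeros(len(d), dtype=bool)
    mask[np.asarray(S, dtype=int)] = True
    boundary = A[mask][:, ~mask].sum()          # edges leaving S
    vol_S = d[mask].sum()
    return boundary / min(vol_S, d.sum() - vol_S)

def spectral_sweep(A):
    """Best prefix cut in the vertex order given by the eigenvector for lambda_G."""
    n = A.shape[0]
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt   # symmetric normalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    lam = eigvals[1]                                  # spectral gap
    g = D_inv_sqrt @ eigvecs[:, 1]                    # minimizer of the Rayleigh quotient
    order = np.argsort(-g)                            # largest coordinates first
    best_ratio, best_set = np.inf, None
    for i in range(1, n):
        ratio = cheeger_ratio(A, order[:i])
        if ratio < best_ratio:
            best_ratio, best_set = ratio, sorted(order[:i].tolist())
    return lam, best_ratio, best_set

def brute_force_cheeger(A):
    """Cheeger constant by exhaustive search (exponential; tiny graphs only)."""
    n = A.shape[0]
    return min(cheeger_ratio(A, S)
               for r in range(1, n)
               for S in itertools.combinations(range(n), r))

# Hypothetical test graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

lam, alpha, S = spectral_sweep(A)
h = brute_force_cheeger(A)
print(lam, alpha, S, h)
```

On this small graph the sweep recovers one of the two triangles, which is also the optimum cut; in general the sweep is only guaranteed to be within the quadratic factor stated above.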
Nowadays we often deal with problems involving graphs or networks of
prohibitively large size, such as the Web graph, various social networks,
biological networks, and many information networks arising from massive
data sets [9]. It is often infeasible to compute eigenvectors for a graph
with hundreds of thousands of vertices. For problems involving such large
graphs, the total numbers of vertices and edges are no longer realistic
parameters. Alternative approaches are needed in which the computation
can be carried out “locally” and the local optimum can be analyzed. As it
turns out, one of the key ideas rests on random walks.
In the early 90’s, Lovász and Simonovits derived an elegant result
on rapid mixing of random walks in their work on approximating the
volume of convex polytopes [13, 14]. Using this mixing result, Spielman
and Teng in 2004 gave a graph partition algorithm which has running
time proportional to the size of the output, and not depending on the
total number of vertices [15]. Although it was not specifically described
in the papers mentioned above, the basic thrust of this approach is a
The Cheeger inequalities and graph partition algorithms 3
Cheeger inequality,
\[
2h_G \ge \lambda_G \ge \frac{\beta_G^2}{8} \ge \frac{h_G^2}{8} \tag{12}
\]
where β_G is the minimum Cheeger ratio of the sets determined by the
i largest values of the probability distribution of a random walk after
k steps starting at a vertex v, taken over all vertices v, all
i ≤ n = |V(G)| and all k ≤ (16 log n)/λ_G^2. Some local variations will be
further discussed in Section 5.
A third Cheeger inequality relies on the notion of PageRank, which
was first introduced by Brin and Page [3] as a method for quantitatively
ranking Web pages for Web search engines. The PageRank can be
viewed as a combination of random walks, scaled by a parameter called
the jumping constant. (The detailed definition will be given later.) In
[1, 2], PageRank is used for developing a local partition algorithm which
improves upon the partition algorithm using random walks. PageRank
provides a way to deal with random walks of various lengths simulta-
neously in an organized fashion. The underlying theme here is again a
Cheeger inequality, although it was not mentioned in [1, 2]. In Section
6, we will show that for a subset S in G with vol(S) ≤ vol(G)/2, we have
\[
h_S \ge \lambda_S \ge \frac{\gamma_S^2}{4 \log(\operatorname{vol}(G))}
\]
where λS is the Dirichlet eigenvalue of S and γS denotes the minimum
of the Cheeger ratios determined by the PageRank involving only
vertices in S with appropriately chosen parameters. As a result, the
local partition algorithm using PageRank improves both the running time
and the performance of the previous algorithms by factors that are powers
of log(vol(S)) (see [1]).
The fourth Cheeger inequality involves a notion of pagerank based
on the heat kernel of a graph. The heat kernel pagerank can be viewed
as an exponential sum of random walks while PageRank is a geometric
sum. The rate of diffusion of the heat kernel pagerank is controlled by a
heat parameter t ≥ 0. The heat kernel has many useful properties with
close relations to the spectrum of the graph. We will use a rapid mixing
inequality of the heat kernel to prove the following Cheeger inequality:
For a subset S of volume s ≤ vol(G)^{2/3}, we have
\[
h_S \ge \lambda_S \ge \frac{\kappa_S^2}{8} \tag{13}
\]
where κS denotes the minimum Cheeger ratio of subsets Si determined
in a sweep of the heat kernel pagerank associated with vertices in S with
vol(Si ) at most 2s for some appropriately chosen parameters for the heat
kernel pagerank. The details will be given in Section 7. This again leads
to an improved local partition algorithm.
In this paper, we examine four Cheeger inequalities and their proofs
using different analytic methods. Taken together, it is of interest to
compare them and to point out their interconnections. The methods
intertwine, ranging from their roots in differential geometry through
spectral graph theory, random walks, PageRank and heat kernels to graph
partition algorithms. These topics have been advancing rapidly and are
emerging as main tools for the age of information and the Internet.
2. Preliminaries
There are basically two ways to define the heat kernel and the
eigenvalues of a graph. One way is to use symmetric matrices throughout,
which in some situations simplifies the arguments (as in [5]). However,
in dealing with random walks, the transition probability matrix W is not
necessarily symmetric. Namely,
\[
W = D^{-1} A.
\]
Since the main results will be stated in the language of random walks,
we will use the unsymmetrical version. The spectral gap λG is the least
nonzero eigenvalue of L = I − W. The value λG can be expressed as the
infimum of the Rayleigh quotient:
\[
\lambda_G = \inf_g R(g) = \inf_g \frac{\sum_{u \sim v} (g(u) - g(v))^2}{\sum_u g(u)^2 d_u}
\]
\[
h_S = \frac{|\partial S|}{\min\{\operatorname{vol}(S), \operatorname{vol}(G) - \operatorname{vol}(S)\}},
\]
Brin and Page introduced the notion of PageRank which has been
used as a major tool for conducting Web search. The definition of Page-
Rank can be entirely described in graph-theoretical terms since its orig-
inal application is in the Webgraph which has its vertex set consisting of
all webpages and edge set consisting of all hyperlinks. In a graph G, the
PageRank operator can be described as a combination of random walks
on G and a jumping constant c, 0 ≤ c ≤ 1. The PageRank matrix Rc is
defined to be
\[
R_c = c \sum_{k=0}^{\infty} (1 - c)^k W^k.
\]
An equivalent definition is that the PageRank matrix satisfies the
following recurrence:
Rc = cI + (1 − c)Rc W. (21)
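As a numerical illustration of the equivalence between the geometric sum and the recurrence (21), the sketch below builds the PageRank matrix in two ways and checks that they agree. It assumes 0 < c ≤ 1 so that the series converges and the closed form c(I − (1 − c)W)^{-1} exists; the function names and the six-vertex test graph are hypothetical choices for the example.

```python
import numpy as np

# Hypothetical test graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
W = A / A.sum(axis=1, keepdims=True)        # W = D^{-1} A

def pagerank_closed_form(W, c):
    """c * (I - (1 - c) W)^{-1}, the sum of the geometric series for 0 < c <= 1."""
    size = W.shape[0]
    return c * np.linalg.inv(np.eye(size) - (1.0 - c) * W)

def pagerank_series(W, c, terms=400):
    """Truncated series c * sum_{k < terms} (1 - c)^k W^k."""
    size = W.shape[0]
    total = np.zeros((size, size))
    power = np.eye(size)                    # W^0
    for k in range(terms):
        total += (1.0 - c) ** k * power
        power = power @ W
    return c * total

c = 0.15
R = pagerank_closed_form(W, c)
recurrence_gap = np.abs(R - (c * np.eye(n) + (1 - c) * R @ W)).max()
series_gap = np.abs(R - pagerank_series(W, c)).max()
row_sums = R.sum(axis=1)
print(recurrence_gap, series_gap, row_sums)
```

Each row of R_c sums to 1 because W is row-stochastic and c Σ (1 − c)^k = 1, so every row is itself a probability distribution.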
For a vertex u, let χu denote the (0, 1) indicator vector, i.e., χu (v) = 1
if v = u and 0 otherwise. The personalized PageRank has additional
parameters, the jumping constant and a preference vector. For example,
a typical preference vector associated with a vertex u is χu . All vec-
tors here are taken to be row vectors unless mentioned otherwise. For
a vertex u, the corresponding personalized PageRank, denoted by pru ,
is defined by pru = χu Rc , the matrix product of χu and Rc . The
personalized PageRank can be viewed as a quantitative ranking of all
vertices with respect to u. The original definition of PageRank has the
preference vector (1/n, 1/n, . . . , 1/n). In general, we can consider an
arbitrary starting vector s, summing to 1, in place of χu .
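Multiplying the recurrence (21) on the left by a starting vector s gives the fixed-point equation p = cs + (1 − c)pW for the personalized PageRank p = sR_c. A minimal sketch of the resulting iteration follows; it is one common way to compute p and is not claimed to be the method used in [1, 2]. The function name, the iteration count and the test graph are assumptions for illustration, and c > 0 is assumed so that the iteration contracts.

```python
import numpy as np

def personalized_pagerank(W, s, c, iterations=300):
    """Iterate p <- c*s + (1 - c)*p W; the error shrinks by a factor (1 - c) per step."""
    s = np.asarray(s, dtype=float)
    p = s.copy()
    for _ in range(iterations):
        p = c * s + (1.0 - c) * (p @ W)     # row vector times W
    return p

# Hypothetical test graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
W = A / A.sum(axis=1, keepdims=True)

c = 0.15
chi_u = np.zeros(n)
chi_u[0] = 1.0                              # indicator vector of the vertex u = 0
pr_u = personalized_pagerank(W, chi_u, c)

# Reference value chi_u R_c computed from the closed form c (I - (1 - c) W)^{-1}.
exact = c * chi_u @ np.linalg.inv(np.eye(n) - (1.0 - c) * W)
gap = np.abs(pr_u - exact).max()
print(pr_u, gap)
```

The same function accepts any starting vector s summing to 1, for example the uniform vector (1/n, ..., 1/n).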
We will consider a new notion of pagerank which is based on the
heat kernel of a graph [8]. The heat kernel pagerank also has two pa-
rameters, the heat t and a seed distribution f . The heat kernel pagerank
ρt,f is the matrix product of f and the heat kernel Ht which is defined
as follows for t ≥ 0.
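\begin{align*}
H_t &= e^{-t} \left( I + tW + \frac{t^2}{2} W^2 + \ldots + \frac{t^k}{k!} W^k + \ldots \right) \\
    &= e^{-t(I - W)} \\
    &= e^{-tL} \\
    &= I - tL + \frac{t^2}{2} L^2 + \ldots + (-1)^k \frac{t^k}{k!} L^k + \ldots.
\end{align*}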
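The two series for H_t can be compared numerically. The sketch below truncates both after a fixed number of terms, which is an assumption that is adequate only for moderate t; the function names and the test graph are hypothetical. It also forms a heat kernel pagerank ρ_{t,f} = f H_t for a seed f concentrated on one vertex.

```python
import numpy as np

def heat_kernel_walk_series(W, t, terms=80):
    """Truncation of e^{-t} * sum_k (t^k / k!) W^k."""
    n = W.shape[0]
    term = np.eye(n)                        # k = 0 term
    total = np.eye(n)
    for k in range(1, terms):
        term = term @ W * (t / k)           # now equals (t^k / k!) W^k
        total += term
    return np.exp(-t) * total

def heat_kernel_laplacian_series(W, t, terms=80):
    """Truncation of sum_k ((-t)^k / k!) L^k with L = I - W."""
    n = W.shape[0]
    L = np.eye(n) - W
    term = np.eye(n)
    total = np.eye(n)
    for k in range(1, terms):
        term = term @ L * (-t / k)          # now equals ((-t)^k / k!) L^k
        total += term
    return total

# Hypothetical test graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
W = A / A.sum(axis=1, keepdims=True)

t = 2.0
H_walk = heat_kernel_walk_series(W, t)
H_lap = heat_kernel_laplacian_series(W, t)
series_gap = np.abs(H_walk - H_lap).max()

f = np.zeros(n)
f[0] = 1.0                                  # seed distribution on vertex 0
rho = f @ H_walk                            # heat kernel pagerank rho_{t,f}
print(series_gap, rho)
```

Since W is row-stochastic, each row of H_t sums to 1, so ρ_{t,f} is again a probability distribution whenever f is.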
\[
\frac{\partial}{\partial t} H_t = -(I - W) H_t.
\]
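The heat equation can be checked with a central finite difference in t. This is only a numerical sanity check under stated assumptions: the step size `eps`, the truncation length of the series, and the test graph are arbitrary choices made for the example.

```python
import numpy as np

def heat_kernel(W, t, terms=80):
    """Truncation of e^{-t} * sum_k (t^k / k!) W^k."""
    n = W.shape[0]
    term = np.eye(n)
    total = np.eye(n)
    for k in range(1, terms):
        term = term @ W * (t / k)
        total += term
    return np.exp(-t) * total

# Hypothetical test graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
W = A / A.sum(axis=1, keepdims=True)

t, eps = 1.5, 1e-5
# Central difference approximation of the derivative of H_t with respect to t.
derivative = (heat_kernel(W, t + eps) - heat_kernel(W, t - eps)) / (2 * eps)
right_hand_side = -(np.eye(n) - W) @ heat_kernel(W, t)
heat_equation_gap = np.abs(derivative - right_hand_side).max()
print(heat_equation_gap)
```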
\[
2h_G \ge \lambda_G \ge \frac{\alpha_G^2}{2} \ge \frac{h_G^2}{2}
\]
where αG is the minimum Cheeger ratio of subsets Si consisting of ver-
tices with the largest i values in the eigenvector associated with λG , over
all i.
\[
g = \chi_S - \frac{\operatorname{vol}(S)}{\operatorname{vol}(G)} \mathbf{1},
\]
we have
λG ≤ R(g) ≤ 2hG .
Thus, the proof of the Cheeger inequality mainly concerns a lower
bound for λG in terms of Cheeger ratios.
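The upper bound can be illustrated on a small example. In the sketch below the test graph (two triangles joined by an edge), the choice of S as one triangle, and the helper name `rayleigh_quotient` are assumptions for illustration; for this graph one can check by hand that S attains the minimum Cheeger ratio 1/7. The test function is g = χ_S − (vol(S)/vol(G))1, which satisfies Σ_v g(v)d_v = 0.

```python
import numpy as np

def rayleigh_quotient(A, g):
    """sum over edges u~v of (g(u) - g(v))^2, divided by sum_u g(u)^2 d_u."""
    d = A.sum(axis=1)
    diff = g[:, None] - g[None, :]
    numerator = 0.5 * (A * diff ** 2).sum()   # A is symmetric, so halve the double sum
    return numerator / (g ** 2 * d).sum()

# Hypothetical test graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
d = A.sum(axis=1)

S = [0, 1, 2]
complement = [3, 4, 5]
chi_S = np.zeros(n)
chi_S[S] = 1.0
g = chi_S - d[S].sum() / d.sum()              # chi_S - (vol(S) / vol(G)) * 1

R_g = rayleigh_quotient(A, g)
h_S = A[np.ix_(S, complement)].sum() / min(d[S].sum(), d[complement].sum())

# Spectral gap from the symmetric normalized Laplacian (same eigenvalues as I - W).
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
lam = np.linalg.eigvalsh(np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt)[1]
print(lam, R_g, 2 * h_S)
```

Here R(g) equals |∂S| vol(G)/(vol(S)(vol(G) − vol(S))) = 14/49 = 2/7, which coincides with 2h_S because the two sides of the cut have equal volume.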
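Now let g denote an eigenvector achieving λ_G. Namely, λ_G = R(g)
and Σ_v g(v) d_v = 0. We order the vertices so that
\[
g(v_1) \ge g(v_2) \ge \ldots \ge g(v_n).
\]
With S_i = {v_1, . . . , v_i}, we have
\[
\alpha_G = \min_i h_{S_i}.
\]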
Let r denote the largest integer such that vol(S_r) ≤ vol(G)/2. Since
Σ_v g(v) d_v = 0,
\[
\sum_v g(v)^2 d_v = \min_c \sum_v (g(v) - c)^2 d_v \le \sum_v (g(v) - g(v_r))^2 d_v.
\]
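\[
g_+(v) = \begin{cases} g(v) - g(v_r) & \text{if } g(v) \ge g(v_r), \\ 0 & \text{otherwise,} \end{cases}
\]
\[
g_-(v) = \begin{cases} |g(v) - g(v_r)| & \text{if } g(v) \le g(v_r), \\ 0 & \text{otherwise.} \end{cases}
\]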
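We consider
\begin{align*}
\lambda_G &= \frac{\sum_{u \sim v} (g(u) - g(v))^2}{\sum_v g(v)^2 d_v} \\
          &\ge \frac{\sum_{u \sim v} (g(u) - g(v))^2}{\sum_v (g(v) - g(v_r))^2 d_v} \\
          &\ge \frac{\sum_{u \sim v} \left[ (g_+(u) - g_+(v))^2 + (g_-(u) - g_-(v))^2 \right]}{\sum_v \left( g_+(v)^2 + g_-(v)^2 \right) d_v}.
\end{align*}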
\[
\frac{a + b}{c + d} \ge \min\left\{ \frac{a}{c}, \frac{b}{d} \right\}.
\]
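For completeness, the elementary inequality above can be verified in one line. The derivation below assumes c, d > 0 and, without loss of generality, a/c ≤ b/d; these assumptions are stated here for the sketch and match the way the inequality is applied to ratios with positive denominators.

```latex
\[
\frac{a}{c} \le \frac{b}{d}
\;\Longrightarrow\; b \ge \frac{ad}{c}
\;\Longrightarrow\; a + b \ge a + \frac{ad}{c} = \frac{a(c + d)}{c}
\;\Longrightarrow\; \frac{a + b}{c + d} \ge \frac{a}{c}
  = \min\left\{ \frac{a}{c}, \frac{b}{d} \right\}.
\]
```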
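\[
\widetilde{\operatorname{vol}}(S) = \min\{\operatorname{vol}(S), \operatorname{vol}(G) - \operatorname{vol}(S)\}
\]
so that
\[
|\partial(S_i)| \ge \alpha_G \, \widetilde{\operatorname{vol}}(S_i).
\]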
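Then we have
\begin{align*}
\lambda_G &\ge R(g_+) \\
  &= \frac{\sum_{u \sim v} (g_+(u) - g_+(v))^2}{\sum_u g_+(u)^2 d_u} \\
  &= \frac{\sum_{u \sim v} (g_+(u) - g_+(v))^2 \, \sum_{u \sim v} (g_+(u) + g_+(v))^2}{\sum_u g_+(u)^2 d_u \, \sum_{u \sim v} (g_+(u) + g_+(v))^2} \\
  &\ge \frac{\left( \sum_{u \sim v} |g_+(u)^2 - g_+(v)^2| \right)^2}{2 \left( \sum_u g_+(u)^2 d_u \right)^2}
      && \text{by the Cauchy-Schwarz inequality,} \\
  &= \frac{\left( \sum_i |g_+(v_i)^2 - g_+(v_{i+1})^2| \, |\partial(S_i)| \right)^2}{2 \left( \sum_u g_+(u)^2 d_u \right)^2}
      && \text{by counting,} \\
  &\ge \frac{\left( \sum_i |g_+(v_i)^2 - g_+(v_{i+1})^2| \, \alpha_G \, \widetilde{\operatorname{vol}}(S_i) \right)^2}{2 \left( \sum_u g_+(u)^2 d_u \right)^2}
      && \text{by the def. of } \alpha_G, \\
  &= \frac{\alpha_G^2}{2} \, \frac{\left( \sum_i g_+(v_i)^2 \, |\widetilde{\operatorname{vol}}(S_i) - \widetilde{\operatorname{vol}}(S_{i+1})| \right)^2}{\left( \sum_u g_+(u)^2 d_u \right)^2} \\
  &= \frac{\alpha_G^2}{2} \, \frac{\left( \sum_i g_+(v_i)^2 d_{v_i} \right)^2}{\left( \sum_u g_+(u)^2 d_u \right)^2} \\
  &= \frac{\alpha_G^2}{2}.
\end{align*}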
Therefore we have proved the Cheeger inequality in (11):
\[
2h_G \ge \lambda_G \ge \frac{\alpha_G^2}{2} \ge \frac{h_G^2}{2}.
\]
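\begin{align*}
fW(S) &= \frac{f(S) + f(S_{\mathrm{in}})}{2} \\
      &= \frac{f(S_{\mathrm{out}}) + f(S_{\mathrm{in}})}{2} \\
      &= \frac{f(S_{\mathrm{in}} \cup S_{\mathrm{out}}) + f(S_{\mathrm{in}} \cap S_{\mathrm{out}})}{2} \\
      &\le \frac{f(\operatorname{vol}(S) + |\partial S|) + f(\operatorname{vol}(S) - |\partial S|)}{2} \\
      &\le \frac{f(\operatorname{vol}(S)(1 + h_S)) + f(\operatorname{vol}(S)(1 - h_S))}{2}.
\end{align*}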
\[
fW(x) \le \frac{f(x(1 + h_f)) + f(x(1 - h_f))}{2}. \tag{41}
\]
Here we use the fact that √(1 + y) + √(1 − y) ≤ 2 − y^2/4 for 0 ≤ y ≤ 1.
To finish the proof of Lemma 3, we consider g(v) = gk (v) = −fk (v). We
can prove in a similar way that
\[
g_k(v) \le \sqrt{\frac{\operatorname{vol}(S)}{d_u}} \left( 1 - \frac{\beta_k^2}{8} \right)^k.
\]
The proof can be done by induction on k and we note that it is true for
k = 0 since g0 (x) ≤ 1. This finishes the proof for (51).
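The square-root inequality used in this proof, √(1 + y) + √(1 − y) ≤ 2 − y^2/4 for 0 ≤ y ≤ 1, admits a short verification. The derivation below is a sketch supplied for the reader and is not taken from the original argument; it uses only the bound √(1 − z) ≤ 1 − z/2 for 0 ≤ z ≤ 1 and the fact that both sides of the inequality are nonnegative.

```latex
\begin{align*}
\left( \sqrt{1 + y} + \sqrt{1 - y} \right)^2
  &= 2 + 2\sqrt{1 - y^2} \\
  &\le 2 + 2\left( 1 - \frac{y^2}{2} \right) = 4 - y^2 \\
  &\le 4 - y^2 + \frac{y^4}{16} = \left( 2 - \frac{y^2}{4} \right)^2 .
\end{align*}
```

Taking square roots of the nonnegative quantities on both ends gives the stated inequality.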
A special case of (51) is the following:
\[
|W^k(u, v) - \pi(v)| \le \sqrt{\frac{d_v}{d_u}} \left( 1 - \frac{\beta_k^2}{8} \right)^k. \tag{52}
\]
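\[
2h_G \ge \lambda_G \ge \frac{\beta_G^2}{8} \ge \frac{h_G^2}{8},
\]
where
\[
\beta_G = \min\left\{ \beta_t : t \le \left\lceil \frac{16 \log n}{\lambda_G^2} \right\rceil \right\}.
\]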
\[
\| \varphi (W^t - \mathbf{1}^* \pi) D^{-1/2} \| = \| \varphi W^t D^{-1/2} \| = \left( 1 - \frac{\lambda_G}{2} \right)^t \| \varphi D^{-1/2} \|.
\]
First we observe that (61) holds for the case that k = 0 since, for x ≥ du ,
we have f (x) ≤ 1, and for x ≤ du ,
\[
f(x) = \frac{x}{d_u} f(d_u) \le \sqrt{\frac{x}{d_u}}.
\]
\[
\mathrm{pr}_u(S) \ge 1 - \frac{(1 - c) h_S}{c}.
\]
Proof: We first consider
\[
h_S \ge \frac{\gamma_u^2}{32 \log(\operatorname{vol}(S))}
\]
where γ_u denotes the Cheeger ratio determined by the PageRank pr_u with
jumping constant c = γ_u^2/(16 log(vol(S))).
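\begin{align*}
& 1 - \frac{(1 - c) h_S}{c} - \pi(S) \\
& \quad \le \left( 1 - (1 - c)^k + \frac{\operatorname{vol}(S)}{d_u} \left( 1 - \frac{\gamma_u^2}{8} \right)^k (1 - c)^k \right) (1 - \pi(S)).
\end{align*}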
This implies
\[
\frac{h_S (1 - c)}{c (1 - \pi(S))} \ge (1 - c)^k \left( 1 - \frac{\operatorname{vol}(S)}{d_u} \left( 1 - \frac{\gamma_u^2}{8} \right)^k \right). \tag{62}
\]
where f ranges over all nontrivial functions f that satisfy the Dirichlet
boundary condition for S.
Theorem 4 For a subset S in G with vol(S) ≤ vol(G)/2 and a constant
c ≤ 1/2, we have
\[
h_S \ge \lambda_S \ge \frac{\gamma_S^2}{8 \log(\operatorname{vol}(S))}
\]
where γ_S denotes the minimum of the Cheeger ratios determined by the
PageRank pr_u for u in S and the jumping constant c ≥ γ_S^2/(8 log(vol(S))).
Proof: Let S denote the subset achieving the Cheeger constant hG .
From the definition in (63), we have hG = R(χS ) ≥ λS . To prove the
lower bound, we consider WS which has entries WS (u, v) the same as
W(u, v) if u, v are both in S and 0 otherwise. We define
\[
R'_c = c \sum_k (1 - c)^k W_S^k
\]
so that
\[
R'_c = c I_S + (1 - c) R'_c W_S. \tag{64}
\]
Clearly, by Lemma 4, we have
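\begin{align*}
\chi_u R'_c \chi_S &\le \chi_u R_c \chi_S \\
  &\le 1 - (1 - c)^k + \sqrt{\frac{\operatorname{vol}(S)}{d_u}} \left( 1 - \frac{\gamma_S^2}{8} \right)^k (1 - c)^k.
\end{align*}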
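Let ϕ denote the function achieving λ_S with ϕχ_S = Σ_{u∈S} ϕ(u) = 1. It
is not hard to see that we can choose ϕ so that ϕ(u) ≥ 0 for all u ∈ S. Then,
\begin{align*}
\phi R'_c \chi_S &\le \sum_{u \in S} \phi(u) \chi_u R'_c \chi_S \\
  &\le 1 - (1 - c)^k + \sqrt{\operatorname{vol}(S)} \left( 1 - \frac{\gamma_S^2}{8} \right)^k (1 - c)^k.
\end{align*}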
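Note that from (64), we have
\begin{align*}
\phi R'_c \chi_S &= \phi \left( I_S - \frac{1 - c}{c} R'_c (I_S - W_S) \right) \chi_S \\
  &= 1 - \frac{(1 - c) \lambda_S}{c + (1 - c) \lambda_S}.
\end{align*}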
\[
(1 - c)^k \le \frac{(1 - c) \lambda_S}{c + (1 - c) \lambda_S} + \sqrt{\operatorname{vol}(S)} \left( 1 - \frac{\gamma_S^2}{8} \right)^k (1 - c)^k. \tag{65}
\]
We can now select k so that
\[
\sqrt{\operatorname{vol}(S)} \left( 1 - \frac{\gamma_S^2}{8} \right)^k \le 1/2.
\]
Namely, k is chosen to be k = 4γ_S^{-2} log(vol(S)).
Then (65) implies that
\[
\frac{(1 - c)^k}{2} \le \frac{(1 - c) \lambda_S}{c + (1 - c) \lambda_S}.
\]
Therefore
\[
ck \ge \log \frac{1}{1 - c/((1 - c) \lambda_S)} \ge \frac{c}{(1 - c) \lambda_S}.
\]
We then have
\begin{align*}
\lambda_S &\ge \frac{1}{(1 - c) k} \\
          &\ge \frac{\gamma_S^2}{8 \log(\operatorname{vol}(S))}.
\end{align*}
The proof of Theorem 4 is complete.
if t ≥ (log s)/(2λ_S). Such t exists if s ≤ vol(G)^{2/3}. This leads to
\[
\lambda_S \ge \frac{\kappa_{t,S}^2}{8}
\]
for t = ⌊log(vol(G)/s)/λ_S⌋. The proof of Theorem 5 is complete.
For applications, the following modified Cheeger inequality can be
derived in a spirit similar to that of Theorem 4. Its proof is contained
in [7] and will be omitted.
Theorem 6 In a graph G, for a subset S of volume s with s ≤ vol(G)/4 and
Cheeger ratio h_S ≤ φ^2/4, there is a subset S' ⊂ S with vol(S') ≥ s/2
such that for any u ∈ S', the sweep using the heat kernel pagerank
ρ_{t,u}, with t = ⌈φ^{-2}/4⌉, will find a set T with s-local Cheeger ratio
at most φ √(log s).
so that the associated heat kernel pagerank can be used to find a cut with
Cheeger ratio at most O(√(Φ log(vol(S)))) (see [7]). The running time
is basically O(vol(S) log(vol(S))/Φ) for approximating and sorting the
heat kernel pagerank with a support no more than the target volume.
References
[1] R. Andersen, F. Chung and K. Lang, Local graph partitioning using
pagerank vectors, Proceedings of the 47th Annual IEEE Symposium
on Foundations of Computer Science (FOCS’2006), 475–486.
[2] R. Andersen, F. Chung and K. Lang, Detecting sharp drops in
PageRank and a simplified local partitioning algorithm, Theory and
Applications of Models of Computation, Proceedings of TAMC 2007,
1–12.
[3] S. Brin and L. Page, The anatomy of a large-scale hypertextual
Web search engine, Computer Networks and ISDN Systems, 30 (1-
7), (1998), 107–117.
[4] J. Cheeger, A lower bound for the smallest eigenvalue of the Lapla-
cian, Problems in Analysis (R. C. Gunning, ed.), Princeton Univ.
Press (1970), 195–199.
[5] F. Chung, Spectral Graph Theory, AMS Publications, 1997.
[6] F. Chung, Random walks and local cuts in graphs, LAA 423 (2007),
22–32.
[7] F. Chung, The heat kernel as the pagerank of a graph, PNAS, to
appear.
[8] F. Chung and S.-T. Yau, Coverings, heat kernels and spanning trees,
Electronic Journal of Combinatorics 6 (1999), #R12.
[9] F. Chung and L. Lu, Complex Graphs and Networks, CBMS Re-
gional Conference Series in Mathematics, 107, AMS Publications,
RI, 2006. viii+264 pp.
[10] D. Coppersmith and S. Winograd, Matrix multiplication via arith-
metic progressions, J. Symbolic Comput. 9 (1990), 251–280.
[11] M. Jerrum and A. J. Sinclair, Approximating the permanent, SIAM
J. Computing 18 (1989), 1149–1178.
[12] R. Kannan, S. Vempala and A. Vetta, On clusterings: Good,
bad and spectral, JACM 51 (2004), 497–515.
[13] L. Lovász and M. Simonovits, The mixing rate of Markov chains,
an isoperimetric inequality, and computing the volume, 31st IEEE
Annual Symposium on Foundations of Computer Science, (1990),
346–354.
[14] L. Lovász and M. Simonovits, Random walks in a convex body and
an improved volume algorithm, Random Structures and Algorithms
4 (1993), 359–412.
[15] D. Spielman and S.-H. Teng, Nearly-linear time algorithms for graph
partitioning, graph sparsification, and solving linear systems, Pro-
ceedings of the 36th Annual ACM Symposium on Theory of Com-
puting, (2004), 81–90.
[16] R. M. Schoen and S. T. Yau, Differential Geometry, International
Press, Cambridge, Massachusetts, 1994.