AUSTRALIAN MATHEMATICAL SOCIETY LECTURE SERIES
Editor-in-chief:
Professor J. Ramagge, School of Mathematics and Statistics, University of Sydney, NSW 2006,
Australia
Editors:
Professor G. Froyland, School of Mathematics and Statistics, University of New South Wales, NSW
2052, Australia
Professor M. Murray, School of Mathematical Sciences, University of Adelaide, SA 5005, Australia
Professor C. Praeger, School of Mathematics and Statistics, University of Western Australia, Crawley,
WA 6009, Australia
www.cambridge.org
Information on this title: www.cambridge.org/9781108417365
DOI: 10.1017/9781108277457
© Peter J. Cameron 2017
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2017
Printed in the United Kingdom by Clays, St Ives plc
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-41736-5 Hardback
ISBN 978-1-108-40495-2 Paperback
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party Internet Web sites referred to in this publication
and does not guarantee that any content on such Web sites is, or will remain,
accurate or appropriate.
Contents
Preface
1 Introduction
1.1 What is counting?
1.2 About how many?
1.3 How hard is it?
1.4 Exercises
4 Recurrence relations
4.1 Linear recurrences with constant coefficients
4.2 Linear recurrence relations with polynomial coefficients
4.3 Some non-linear recurrence relations
4.4 Appendix: Euler’s Pentagonal Numbers Theorem
4.5 Appendix: Some Catalan objects
4.6 Appendix: The RSK algorithm
4.7 Appendix: Some inverse semigroups
4.8 Exercises
6 q-analogues
6.1 Motivation
6.2 q-integers
6.3 The q-Binomial Theorem
6.4 Elementary symmetric functions
6.5 Partitions and permutations
6.6 Irreducible polynomials
6.7 Quantum calculus
6.8 Exercises
10 Species
10.1 Cayley’s Theorem
10.2 Species and counting
10.3 Examples of species
10.4 Operations on species
10.5 Cayley’s Theorem revisited
10.6 What is a species?
10.7 Oligomorphic permutation groups
10.8 Weights
10.9 Exercises
Index
Preface
the number of permutations which move every symbol, strings of zeros and ones
containing no occurrence of a fixed substring, invertible matrices of given size over
a finite field, expressions for a given positive integer as a sum of positive integers
(the answers are different depending on whether we care about the order of the
summands or not), trees and graphs on a given set of vertices, and many more.
Very often I will not be content with giving you one proof of a theorem. I may
return to an earlier result armed with a new technique and give a totally differ-
ent proof. We learn something from having several proofs of the same result. As
Michael Atiyah said in an interview in the Newsletter of the European Mathemat-
ical Society,
I think it is said that Gauss had ten different proofs for the law of
quadratic reciprocity. Any good theorem should have several proofs,
the more the better. For two reasons: usually, different proofs have
different strengths and weaknesses, and they generalise in different
directions — they are not just repetitions of each other.
In particular, there are two quite different styles of proof for results in enumer-
ative combinatorics. Consider a result which asserts that two counting functions
F(n) and G(n) are equal. We might prove this by showing that their generating
functions are equal, perhaps using analytic techniques of some kind. Alternatively,
we might prove the result by finding a bijection between the sets of objects counted
by F(n) and G(n). Often, when such an identity is proved by analytic methods,
the author will ask for a ‘bijective proof’ of the result.
As an example, if n is even, then it is fairly straightforward to prove by analytic
methods that the number of permutations of {1, . . . , n} with all cycles even is equal
to the number with all cycles odd. But finding an explicit bijection between the
two sets is not straightforward, though not too difficult.
I should stress, though, that the book is not full of big theorems. Tim Gowers, in
a perceptive article on ‘The two cultures of mathematics’, distinguishes branches
of mathematics in which theorems are all-important from those where the empha-
sis is on techniques; enumerative combinatorics falls on the side of techniques. (In
the past this has led to some disparagement of combinatorics by other mathemati-
cians. Many people know that Henry Whitehead said ‘Combinatorics is the slums
of topology’. A more honest appraisal is that the techniques of combinatorics
pervade all of mathematics, even the most theorem-rich parts.)
The notes which became this book were for a course on Enumerative and
Asymptotic Combinatorics at Queen Mary, University of London, in the spring
of 2003, and subsequently as Advanced Combinatorics at the University of St
Andrews. The reference material for the subject has been greatly expanded by
the publication of Richard Stanley’s two-volume work on Enumerative Combina-
torics, as well as the book on Analytic Combinatorics by Flajolet and Sedgewick.
(References to these and many other books can be found in the bibliography at the
end.) Many of these books are encyclopaedic in nature. I hope that this book will
be an introduction to the subject, which will encourage you to look further and to
tackle some of the weightier tomes.
What do you need to know to read this book? It will probably help to have had
some exposure to basic topics in undergraduate mathematics.
• Real and complex analysis (limits, convergence, power series, Cauchy’s the-
orem, singularities of complex functions);
• Abstract algebra (groups and rings, group actions);
• Combinatorics.
None of this is essential; in most cases you can pick up the needed material as you
go along.
The heart of the book is Chapters 2–4, in which the most important tools of the
subject (generating functions and recurrence relations) are introduced and used to
study the most important combinatorial objects (subsets, partitions and permuta-
tions of a set). The basic object here is a formal power series, a single object
encapsulating an infinite sequence of numbers, on which a wide variety of manip-
ulations can be done: formal power series are introduced in Chapter 2.
Later chapters treat more specialised topics: permanents, systems of distinct
representatives, and Latin squares in Chapter 5, ‘q-analogues’ (familiar formu-
lae with an extra parameter arising in a wide variety of applications) in Chap-
ter 6, group actions and the Redfield–Pólya theory of the cycle index in Chapter 7,
Möbius inversion (a wide generalisation of the Inclusion–Exclusion Principle) in
Chapter 8, the Tutte polynomial (a counting tool related to Inclusion–Exclusion)
in Chapter 9, species (an abstract formalism including many important count-
ing problems) in Chapter 10, and some miscellaneous topics (mostly analytic)
in Chapters 11 and 12. The final chapter includes an annotated list of books for
further study.
As always in a combinatorics book, the techniques described have unexpected
applications, and it is worth looking through the index. Cayley’s Theorem on
trees, for example, appears in Chapter 10, where several different proofs are given;
Young tableaux are discussed in Chapter 4, as are various counts of inverse semi-
groups of partial permutations.
The final chapter includes an annotated book list and a discussion of using the
On-line Encyclopedia of Integer Sequences.
The first few chapters contain various interdependences. For example, bino-
mial coefficients and the Binomial Theorem for natural number exponents appear
in Chapter 2, although they are discussed in more detail in Chapter 3. Such occur-
rences will be flagged, but I hope that you will have met these topics in undergrad-
uate courses or will be prepared to take them on trust when they first appear.
I am grateful to many students who have taken this course (especially Pablo
Spiga, Thomas Evans, and Wilf Wilson), to colleagues who have helped teach
it (especially Thomas Müller), and to others who have provided me with exam-
ples (especially Thomas Prellberg and Dudley Stark), and to Abdullahi Umar for
the material on inverse semigroups. I am also grateful to Morteza Mohammed-
Noori, who used my course notes for a course of his own in Tehran, and did a
very thorough proof-reading job, spotting many misprints. (Of course, I may have
introduced further misprints in the rewriting!)
The book will be supported by a web page at
http://www-circa.mcs.st-and.ac.uk/~pjc/books/counting/
which will have a list of misprints, further material, links, and possibly solutions
to some of the exercises.
Introduction
This book is about counting. Of course this doesn’t mean just counting a single
finite set. Usually, we have a family of finite sets indexed by a natural number n,
and we want to find F(n), the cardinality of the nth set in the family. For example,
we might want to count the subsets or permutations of a set of size n, lattice paths
of length n, words of length n in the alphabet {0, 1} with no two consecutive 1s,
and so on.
We will study these further in the next chapter, and meet them many times
during later chapters. In Chapter 10, we will see a sort of explanation of
why some problems need one kind of generating function and some need
the other.
• An asymptotic estimate for F(n) is a function G(n), typically expressed in
terms of the standard functions of analysis, such that F(n) − G(n) is of
smaller order of magnitude than G(n). (If G(n) does not vanish, we can
write this as F(n)/G(n) → 1 as n → ∞.) We write F(n) ∼ G(n) if this holds.
This might be accompanied by an asymptotic estimate for F(n) − G(n), and
so on; we obtain an asymptotic series for F. (The basics of asymptotic anal-
ysis are described further in the next section of this chapter.)
• Related to counting combinatorial objects is the question of generating them.
The first thing we might ask for is a system of sequential generation, where
we can produce an ordered list of the objects. Again there are two possibili-
ties.
If the number of objects is F(n), then we can in principle arrange the objects
in a list, numbered 0, 1, . . . , F(n) − 1; we might ask for a construction which,
given i with 0 ≤ i ≤ F(n) − 1, produces the ith object on the list directly,
without having to store the entire list and count through from the start.
Alternatively, we may simply require a method of moving from each object
to the next.
• We could also ask for a method for random generation of an object. If we
have a technique for generating the ith object directly, we simply choose a
random number in the range {0, . . . , F(n) − 1} and generate the correspond-
ing object. If not, we have to rely on other methods such as Markov chains.
Here are a few examples. These will be considered in more detail in Chapter 3;
it is not necessary to read what follows here in detail, but you are advised to skim
through it.
\[ \sum_{n \ge 0} 2^n x^n = \frac{1}{1 - 2x}, \]
\[ \sum_{n \ge 0} \frac{2^n x^n}{n!} = \exp(2x). \]
(I will use exp(x) instead of $e^x$ in these notes, except in some places involving
calculus.)
No asymptotic estimate is needed, since we have a simple exact formula. Indeed,
it is clear that $2^n$ is a number with about $n \log_{10} 2$ decimal digits.
Choosing a random subset, or generating all subsets in order, is easily achieved
by the following method. For each $i \in \{0, \ldots, 2^n - 1\}$, write i in base 2, producing
a string of length n of zeros and ones (padded with leading zeros if necessary).
Now j belongs to the ith subset if and only if the jth symbol in the string is 1.
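The base-2 method just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the string is read from the left, so that symbol j corresponds to element j.

```python
import random


def subset_from_index(i, n):
    """Return the i-th subset of {1, ..., n}, for 0 <= i < 2**n.

    i is written in base 2 as a string of length n; element j belongs to
    the subset exactly when the j-th symbol (from the left) is 1.
    """
    bits = format(i, "b").zfill(n)   # pad with leading zeros to length n
    return {j for j in range(1, n + 1) if bits[j - 1] == "1"}


def all_subsets(n):
    """List the subsets of {1, ..., n} in the order i = 0, 1, ..., 2**n - 1."""
    return [subset_from_index(i, n) for i in range(2 ** n)]


def random_subset(n):
    """Choose i uniformly at random and return the corresponding subset."""
    return subset_from_index(random.randrange(2 ** n), n)
```

Since every subset corresponds to exactly one index i, choosing i uniformly gives a uniformly random subset.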
A procedure for moving from one set to the next can be produced using the
odometer principle, based on the odometer or mileage gauge in a car. Represent
a subset as above by a string of zeros and ones. To construct the next subset in
the list, first identify the longest substring of ones at the end of the string. If the
string consists entirely of ones, then it is last in the order, and we have finished.
Otherwise, this string is preceded by a zero; change the zero to a one, and the ones
following it to zeros. For example, for n = 3, the odometer principle generates the
strings
000, 001, 010, 011, 100, 101, 110, 111,
which correspond to the subsets
∅, {3}, {2}, {2, 3}, {1}, {1, 3}, {1, 2}, {1, 2, 3}
of {1, 2, 3}.
Notice that the binary strings are in lexicographic order, the order in which they
would appear in a dictionary, regarding them as words over the alphabet {0, 1}.
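As a minimal sketch of the odometer principle, the following Python function moves from one zero-one string to the next. The name `next_string` and the use of `None` to signal the end of the list are choices made for this illustration only.

```python
def next_string(s):
    """Return the successor of the 0/1 string s, or None if s is all ones."""
    # Length of the longest run of ones at the end of the string.
    t = len(s) - len(s.rstrip("1"))
    if t == len(s):
        return None                      # all ones: s is last in the order
    # The run is preceded by a zero: change it to a one, and the ones
    # following it to zeros.
    return s[: len(s) - t - 1] + "1" + "0" * t


def all_strings(n):
    """Generate every string of length n, starting from all zeros."""
    s = "0" * n
    while s is not None:
        yield s
        s = next_string(s)
```

Calling `list(all_strings(3))` reproduces the eight strings listed above, in lexicographic order.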
For 0 ≤ k ≤ n, the number of k-element subsets of {1, . . . , n} is given by the
binomial coefficient
\[ \binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1}. \]
The binomial coefficients are traditionally written in a triangular array where, for
n ≥ 0, the nth row contains the numbers $\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}$. This is usually called
Pascal’s triangle, though, as we will see, it was not invented by Pascal. It begins
like this:
        1
      1   1
    1   2   1
  1   3   3   1
1   4   6   4   1
The most important property, and the reason for the name, is the form of the
generating function for these numbers (regarded as a sequence indexed by k for
fixed n), the Binomial Theorem:
\[ \sum_{k=0}^{n} \binom{n}{k} x^k = (1 + x)^n. \]
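The product formula for the binomial coefficient and the Binomial Theorem can be checked against each other numerically. The sketch below (function names are illustrative) computes a row of the triangle from the formula, and separately expands (1 + x)^n by repeated multiplication by 1 + x.

```python
def binomial(n, k):
    """n(n-1)...(n-k+1) / (k(k-1)...1), computed with exact integers."""
    result = 1
    for i in range(k):
        # After this step result equals binomial(n, i + 1),
        # so the integer division is always exact.
        result = result * (n - i) // (i + 1)
    return result


def pascal_row(n):
    """The nth row of the triangle, from the product formula."""
    return [binomial(n, k) for k in range(n + 1)]


def expand_one_plus_x(n):
    """Coefficients of (1 + x)**n, multiplying by (1 + x) n times."""
    coeffs = [1]
    for _ in range(n):
        # Multiplying by (1 + x) adds each coefficient to its neighbour.
        coeffs = [a + b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs
```

The two functions agree row by row, which is exactly the statement of the theorem for natural number exponents.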
or as an integral:
\[ n! = \int_0^\infty x^n e^{-x}\,dx. \]
Neither of these is easier to evaluate than the original definition. (We will meet
both these formulae later on.)
The recurrence relation for F(n) = n! is
F(0) = 1, F(n) = nF(n − 1) for n ≥ 1.
This leads to the same method of evaluation as we saw earlier.
The ordinary generating function for F(n) = n! fails to converge anywhere
except at the origin. The exponential generating function is 1/(1 − x), convergent
for |x| < 1.
As an example to show that convergence is not necessary for a power series to
be useful, let
\[ \Bigl(1 + \sum_{n \ge 1} n!\,x^n\Bigr)^{-1} = 1 - \sum_{n \ge 1} c(n)\,x^n. \]
Then c(n) is the number of connected permutations on {1, . . . , n}. (A permutation
π is connected if there does not exist k with 1 ≤ k ≤ n − 1 such that π maps
{1, . . . , k} to itself.) This will be proved in the next chapter.
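Although the series of factorials converges only at the origin, its inverse can be computed coefficient by coefficient using finite arithmetic. The sketch below does this and, as a sanity check rather than a proof, also counts connected permutations directly from the definition. The function names are illustrative.

```python
from itertools import permutations
from math import factorial


def connected_counts(N):
    """Return [c(1), ..., c(N)] read off from the inverse of 1 + sum n! x^n."""
    p = [factorial(n) for n in range(N + 1)]   # coefficients of 1 + sum n! x^n
    q = [1] + [0] * N                          # coefficients of the inverse
    for n in range(1, N + 1):
        # The coefficient of x^n in the product p*q must be zero for n >= 1;
        # since p[0] = 1 this determines q[n] from earlier coefficients.
        q[n] = -sum(p[k] * q[n - k] for k in range(1, n + 1))
    return [-q[n] for n in range(1, N + 1)]


def is_connected(perm):
    """perm[i-1] is the image of i. Test the definition directly."""
    n = len(perm)
    # {1, ..., k} is mapped to itself exactly when the largest of the
    # first k images is k.
    return not any(max(perm[:k]) == k for k in range(1, n))


def brute_force_count(n):
    return sum(1 for perm in permutations(range(1, n + 1)) if is_connected(perm))
```

For small n the two computations give the same values, beginning 1, 1, 3, 13, 71.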
The approximate size of the factorial function is not obvious, as it was for
powers of 2. An asymptotic estimate for n! is given by Stirling’s formula:
\[ n! \sim \sqrt{2\pi n}\,\Bigl(\frac{n}{e}\Bigr)^{n}. \]
We give the proof later.
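A quick numerical experiment makes the meaning of the symbol ∼ concrete: the ratio of n! to Stirling's estimate should approach 1 as n grows. The following sketch prints that ratio for a few values of n; floating-point arithmetic limits it to moderate n.

```python
from math import e, factorial, pi, sqrt


def stirling(n):
    """Stirling's estimate sqrt(2*pi*n) * (n/e)**n for n!."""
    return sqrt(2 * pi * n) * (n / e) ** n


def ratio(n):
    """n! divided by Stirling's estimate."""
    return factorial(n) / stirling(n)


for n in (1, 5, 10, 50, 100):
    print(n, ratio(n))
```

The ratios decrease towards 1, while the difference between n! and the estimate itself grows without bound; the estimate is good in the relative sense only.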
It is possible to generate permutations sequentially, or choose a random permu-
tation, by a method similar to that for subsets, using a variable base.
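One way to realise the variable-base idea (an assumption made for this sketch, not necessarily the method intended here) is to write the index i in the factorial number system, whose digits select in turn which of the remaining symbols comes next. The function names are illustrative.

```python
import random
from math import factorial


def permutation_from_index(i, n):
    """Return the i-th permutation of 1..n in lexicographic order, 0 <= i < n!."""
    remaining = list(range(1, n + 1))
    perm = []
    for pos in range(n, 0, -1):
        # The leading digit of i in the variable base says which of the
        # remaining symbols to place next; the remainder is carried on.
        digit, i = divmod(i, factorial(pos - 1))
        perm.append(remaining.pop(digit))
    return perm


def random_permutation(n):
    """Choose an index uniformly and return the corresponding permutation."""
    return permutation_from_index(random.randrange(factorial(n)), n)
```

Running the index through 0, 1, ..., n! − 1 lists all permutations in order, and a uniformly random index gives a uniformly random permutation.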
The set of permutations of {1, . . . , n} forms a group under the operation of
composition, the symmetric group of degree n, denoted by $S_n$.
The ordinary generating function for d(n) fails to converge, but the exponential
generating function is equal to exp(−x)/(1 − x).
These facts will be proved in Chapter 4.
Since the probability that a random permutation is a derangement is about 1/e,
we can choose a random derangement as follows: repeatedly choose a random
permutation until a derangement is obtained. The expected number of choices
necessary is about e.
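The rejection method for random derangements is short enough to write out. In this sketch the function name and the returned count of attempts are illustrative additions; the guard for n = 1 is needed because a 1-set has no derangement.

```python
import random


def random_derangement(n, rng=random):
    """Choose random permutations of 1..n until one has no fixed point.

    Returns the derangement together with the number of attempts used.
    """
    if n == 1:
        raise ValueError("there is no derangement of a 1-set")
    items = list(range(1, n + 1))
    tries = 0
    while True:
        tries += 1
        perm = items[:]
        rng.shuffle(perm)                # a uniformly random permutation
        if all(perm[i] != items[i] for i in range(n)):
            return perm, tries
```

Averaging the number of attempts over many runs gives a value close to e, in line with the estimate above.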
Example: set partitions The Bell number B(n) is the number of partitions of
the set {1, . . . , n}. Again, no simple formula is known, and the asymptotics are
very complicated. There is a recurrence relation,
\[ B(n) = \sum_{k=1}^{n} \binom{n-1}{k-1} B(n-k), \]
and the exponential generating function is
\[ \sum_{n \ge 0} \frac{B(n)\,x^n}{n!} = \exp(\exp(x) - 1). \]
Based on the recurrence one can derive a sequential generation algorithm, which
calls itself recursively.
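Both uses of the recurrence can be sketched in Python. The first function computes Bell numbers; the second is one possible recursive generator (an illustration, with hypothetical names) which chooses the part containing the last element, of size k, and then partitions what is left.

```python
from itertools import combinations
from math import comb


def bell_numbers(N):
    """Return [B(0), ..., B(N)] from B(n) = sum_k C(n-1, k-1) B(n-k)."""
    B = [1]                              # B(0) = 1: the empty partition
    for n in range(1, N + 1):
        B.append(sum(comb(n - 1, k - 1) * B[n - k] for k in range(1, n + 1)))
    return B


def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of parts."""
    if not elements:
        yield []
        return
    *rest, last = elements
    for size in range(len(rest) + 1):
        # Choose the other members of the part containing `last` ...
        for others in combinations(rest, size):
            part = list(others) + [last]
            remaining = [x for x in rest if x not in others]
            # ... and recursively partition the remaining elements.
            for partition in set_partitions(remaining):
                yield partition + [part]
```

Each partition is produced exactly once, because the part containing the last element is determined by the partition.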
an n-set have?’ The input data is the integer n, which (if written in base 2) requires
only $m = 1 + \lfloor \log_2 n \rfloor$ bits to specify. The question asks us to calculate the Bell
number B(n), which is greater than $2^{n-1}$ for n > 2, and so it takes time exponential
in m simply to write down the answer! To get round this difficulty, it is usual
to pretend that the size of the input data is actually n rather than log n. (We can
imagine that n is given by writing n consecutive 1s on the input tape of the Turing
machine, that is, by writing n as a tally rather than in base 2.)
We have seen that computing $2^n$ (the number of subsets of an n-set) requires at
most log n integer multiplications. But the integers may have as many as n digits,
so each multiplication takes about n Turing machine steps. Similarly, the solution
to a recurrence relation can be computed in time polynomial in n, provided that
each individual computation can be.
On the other hand, a method which involves generating and testing every subset
or permutation will take exponentially long, even if the generation and testing can
be done efficiently.
A notion of complexity relevant to this situation is the polynomial delay model,
which asks that the time required to generate each object should be at most $n^c$
for some fixed c, even if the number of objects to be generated is greater than
polynomial.
Of course, it is easy to produce combinatorial problems whose solution grows
faster than, say, the exponential of a polynomial. For example, how many
intersecting families of subsets of an n-set are there? The total number, for n odd, lies
between $2^{2^{n-1}}$ and $2^{2^n}$, so that even writing down the answer takes time exponential
in n.
We will not consider complexity questions further in this book.
1.4 Exercises
1.1 Construct a bijection between the set of all k-element subsets of {1, . . . , n}
containing no two consecutive elements, and the set of all k-element subsets
of {1, . . . , n − k + 1}. Hence show that the number of such subsets is $\binom{n-k+1}{k}$.
When the UK National Lottery was introduced in 1994, the draw consisted
of choosing six distinct numbers randomly from the set {1, . . . , 49}. What is the
probability that the draw contained no two consecutive numbers?
1.2 (a) In Vancouver in 1984, I saw a Dutch pancake house advertising ‘a thousand
and one combinations’ of toppings. What do you deduce?
(b) More recently McDonald’s offered a meal deal with a choice from eight components
of your meal, and advertised ‘40 312 combinations’. What do you
deduce?
1.4 Let f(n) be the number of partitions of an n-set into parts of size 2.
(b) Prove that the exponential generating function for the sequence (f(n)) is
$\exp(x^2/2)$.
for n = 2m.
[Figure: the subsets ∅, {1}, {1, 2}, {2} of {1, 2}]
Remark Gray codes are used in analog-to-digital converters. Since only one
digit changes at a time when the input varies continuously, the damage caused by
an error in reading the changing digit is minimised.
1.6 Counting can be used to prove structural results, as in the following exercise,
which proves a theorem of Mantel: A graph with n vertices and more than $n^2/4$
edges must contain a triangle.
Consider a graph with n vertices, e edges and t triangles. Let $x_i$ be the number
of edges containing vertex i.
(a) Show by Inclusion–Exclusion that, if vertices i and j are joined, then at least
$x_i + x_j - n$ triangles contain these two vertices.
\[ \sum_i x_i^2 \ge 4e^2/n. \]
(e) Hence show that, if $e > n^2/4$, the graph contains a triangle.
The Cauchy–Schwarz inequality says that the inner product of two vectors x and
y cannot exceed in modulus the product of the norms of the vectors. Indeed, the
ratio $(x \cdot y)/(\|x\| \cdot \|y\|)$ is equal to the cosine of the angle between x and y.
This is a remarkably useful inequality, in combinatorics as well as other branches
of mathematics.
Can you prove the inequality? [Hint: Calculate the squared norm of the vector
x + λy; this is a quadratic function of λ which can never be negative.]
1.7 (a) Find an iterative method for listing the k-subsets of the natural numbers
in reverse lexicographic order (that is, ordered by the largest element, and if
largest elements are equal then by the second largest, and so on). Thus, for
k = 4, the list begins
Your method should give an algorithm for moving from any subset to the
next in the list.
(b) Show that, if $a_1 < \cdots < a_k$, then the position of $\{a_1, \ldots, a_k\}$ in the list is
given by
\[ \binom{a_1}{1} + \binom{a_2}{2} + \cdots + \binom{a_k}{k}. \]
Can you describe the inverse of this function, which enables us to write down
the nth subset in the list?
2.1 Fibonacci numbers
What is $(1 - x - x^2)F(x)$? Clearly this product has constant term 1, while the term
in $x$ is $(1)(1) + (-1)(1) = 0$. For $n \ge 2$, the term in $x^n$ is $(1)F_n + (-1)F_{n-1} +
(-1)F_{n-2} = 0$. In other words, we have
\[ (1 - x - x^2)F(x) = 1, \]
so that $F(x) = 1/(1 - x - x^2)$.
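The cancellation can be checked on a truncated series with a few lines of Python. The sketch assumes the initial values F_0 = F_1 = 1, which is what the constant term and the term in x of the computation above imply; the function names are illustrative.

```python
def fibonacci(N):
    """Return [F_0, ..., F_N] with F_0 = F_1 = 1."""
    F = [1, 1]
    while len(F) <= N:
        F.append(F[-1] + F[-2])
    return F[: N + 1]


def multiply(a, b, N):
    """Coefficients of x^0, ..., x^N in the product of two series.

    a and b are coefficient lists; only finite sums are involved.
    """
    return [
        sum(a[k] * b[n - k] for k in range(n + 1) if k < len(a) and n - k < len(b))
        for n in range(N + 1)
    ]


# Coefficients of (1 - x - x^2) F(x) up to x^20.
product = multiply([1, -1, -1], fibonacci(20), 20)
```

Every coefficient after the constant term vanishes, as the recurrence relation predicts.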
This is an explicit formula for the power series; how can we turn it into a
formula for the coefficients? In this case, there are many ways to proceed; the
most elementary uses partial fractions. Write
\[ 1 - x - x^2 = (1 - \alpha x)(1 - \beta x), \]
so that
A + B = 1,
Aβ + Bα = 0.
\[ \sum_{n \ge 0} (\alpha x)^n = 1 + \alpha x + \alpha^2 x^2 + \cdots, \]
We have succeeded in finding a formula for the Fibonacci numbers. There are
several things to note about this formula.
First, it gives us a very good asymptotic estimate. For $\alpha = (1 + \sqrt{5})/2$ is greater
than 1, while $\beta = (1 - \sqrt{5})/2$ is between 0 and −1 (in fact, α and β are
approximately 1.618 and −0.618 respectively), so, as n → ∞, $\alpha^n$ grows exponentially
while $\beta^n$ decays exponentially to zero. Thus, $F_n \sim A\alpha^n$.
Indeed, the second term in our formula above is always less than 1/2 in modulus,
so we can conclude that $F_n$ is the nearest integer to $(1/\sqrt{5})((1 + \sqrt{5})/2)^{n+1}$.
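The nearest-integer statement is easy to test in floating-point arithmetic for moderate n. The sketch below assumes the initial values F_0 = F_1 = 1 used in this section, and compares rounding with the values obtained from the recurrence; for large n the limited precision of floating-point numbers would eventually spoil the comparison.

```python
from math import sqrt


def fib_by_recurrence(n):
    """F_n with F_0 = F_1 = 1, computed from the recurrence."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_by_rounding(n):
    """Nearest integer to (1/sqrt 5) * ((1 + sqrt 5)/2)**(n + 1)."""
    alpha = (1 + sqrt(5)) / 2
    return round(alpha ** (n + 1) / sqrt(5))


agree = all(fib_by_recurrence(n) == fib_by_rounding(n) for n in range(40))
```

Here `agree` records whether the two computations coincide for every n below 40.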
Second, the formula is all but useless for calculation. Evaluating it for n = 100
would require doing two binomial expansions, cancelling half the terms, and then
knowing $\sqrt{5}$ to sufficient accuracy that we could evaluate the result to within an
integer. Using the recurrence relation, as Fibonacci did, is a much more efficient
method! We will see an even quicker method in Exercise 2.12, allowing the evaluation
of $F_n$ in only 2 log n arithmetic operations with integers (though each of these
operations will take about n steps, since the integers have this many digits).
Third, it is almost always easier to prove properties of the Fibonacci numbers
using the recurrence relation than to use the formula.
Finally, you may feel a bit uneasy about the manipulations used in the argu-
ment. In analysis, you learn that these manipulations are justified if the series
converge absolutely; we did them with no regard to convergence, and the results
we obtain could be used to show that the series do converge. However, it all seems
a bit suspicious!
2.1.1 How do the Fibonacci numbers start?
Nobody doubts that the recurrence relation for the Fibonacci numbers is $F_n =
F_{n-1} + F_{n-2}$. But there is no general agreement on the ‘initial conditions’. There
are three fairly common conventions:
(a) $F_0 = 0$, $F_1 = 1$
(b) $F_0 = 1$, $F_1 = 1$
(c) $F_0 = 1$, $F_1 = 2$
Changing from one convention to another simply shifts the sequence by one or
two places, and requires a corresponding adjustment in the formula for the nth
Fibonacci number.
The second and third conventions are both rather natural if you believe that
the Fibonacci numbers count things. If, like Virahanka, you think that $F_n$ is the
number of compositions of n made of 1s and 2s, then you will probably use the
second convention above; if you think that it is the number of zero-one sequences
with no two consecutive ones, then you will use the third convention.
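The two counting interpretations can be compared by brute force for small n. The sketch below enumerates the objects directly, without using the recurrence; the function names are illustrative.

```python
from itertools import product


def count_compositions(n):
    """Number of ordered sums of 1s and 2s with total n."""
    return sum(
        1
        for length in range(n + 1)
        for parts in product((1, 2), repeat=length)
        if sum(parts) == n
    )


def count_strings(n):
    """Number of zero-one sequences of length n with no two consecutive ones."""
    return sum(1 for s in product("01", repeat=n) if "11" not in "".join(s))


compositions = [count_compositions(n) for n in range(8)]
strings = [count_strings(n) for n in range(8)]
```

The first list starts 1, 1, 2, 3, ... and the second starts 1, 2, 3, 5, ..., matching the second and third conventions respectively.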
The first convention has some remarkable properties. For a start, it satisfies
$\gcd(F_n, F_m) = F_{\gcd(m,n)}$, so that in particular, if m divides n then $F_m$ divides $F_n$.
A consequence is that $F_n$ is prime only if n is prime (apart from one small case:
$F_2 = 1$ with this convention, and $F_4 = 3$). Thus, $F_3 = 2$, $F_5 = 5$, $F_7 = 13$, . . . . There
is also the very curious property that $F_{12} = 12^2$.
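These properties of the first convention can be confirmed for small indices by direct computation. The following sketch checks the gcd identity over a range of m and n; it is a numerical check, not a proof.

```python
from math import gcd


def fib(n):
    """F_n with the first convention, F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Check gcd(F_n, F_m) = F_gcd(m, n) for 1 <= m, n <= 30.
gcd_identity_holds = all(
    gcd(fib(m), fib(n)) == fib(gcd(m, n))
    for m in range(1, 31)
    for n in range(1, 31)
)
```

With this convention `fib(12)` evaluates to 144, the square of 12.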
associative, and identity laws for addition and multiplication, the distributive law,
and the existence of additive inverses or negatives of all elements). In other words,
the entries should be taken from a commutative ring with identity.
We denote the set of all formal power series with coefficients in R, a fixed
commutative ring with identity, by R[[x]].
From this point of view, we can regard a polynomial with coefficients in R as a
formal power series, all of whose terms are zero from some point on. Thus the set
R[x] of polynomials can be identified as a subset (indeed, a subring) of R[[x]].
Now we come to the definitions. It is important to note that, even though we
write a formal power series as if it were an infinite sum, the definitions below
require no infinite processes or taking limits; everything we do will involve only
sums and products of finite numbers of terms in the sequences.
\[ rA = \sum (r a_n) x^n. \]
Proposition 2.1 Let R be a commutative ring with identity. Then the set R[[x]] of
formal power series over R, with the above operations, is also a commutative ring
with identity.
We will not prove this result, which involves checking all the appropriate ax-
ioms for R[[x]], using the fact that they hold in R.
We can extend the definition of multiplication to give the product of a finite
number of formal power series in an obvious way.
As usual, we define $A(x)^n$ by repeated multiplication, or by induction:
\[ A(x)^0 = 1, \qquad A(x)^{n+1} = A(x)\,(A(x))^n \text{ for } n \ge 0. \]
Differentiation This operation sounds suspiciously like calculus, but all we take
from calculus is the rule for differentiating positive powers of x. We define an
operator D on formal power series by the rule that, if $A = \sum a_n x^n$, then
$DA = \sum n a_n x^{n-1}$; in other words, the term of degree n in DA is $(n+1)a_{n+1}x^n$.
(We don’t get a negative power of x since, as expected, the derivative of the
constant term is zero.)
We usually write $df/dx$ instead of $Df$ (assuming that the variable in
the formal power series is x). But remember that no calculus is involved here!
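To emphasise that only finite sums and products of coefficients are involved, here is a minimal sketch that stores a formal power series as a list of its first few coefficients. The helper names `multiply` and `derivative` are hypothetical, and the product rule used (the coefficient of $x^k$ in $AB$ is $\sum_{i=0}^{k} a_i b_{k-i}$) is the usual Cauchy product, assumed here to be the multiplication intended in this chapter.

```python
def multiply(a, b):
    """Cauchy product of two coefficient lists, truncated to the shorter length."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

def derivative(a):
    """Formal derivative: the coefficient of x^n in DA is (n + 1) * a_{n+1}."""
    return [(n + 1) * a[n + 1] for n in range(len(a) - 1)]

N = 8
geometric = [1] * N                       # 1 + x + x^2 + ... (first N terms)
one_minus_x = [1, -1] + [0] * (N - 2)     # the polynomial 1 - x

product = multiply(geometric, one_minus_x)    # should be the series 1
squared = multiply(geometric, geometric)      # coefficients of (1 + x + ...)^2
differentiated = derivative(geometric)        # coefficients of D(1 + x + ...)

print(product)
print(squared)
print(differentiated)
```

Each coefficient of the output depends on finitely many coefficients of the inputs, so truncation loses nothing in the terms that are kept.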
Some examples
$$\prod_{n \ge 1} (1 + x^n)$$
This name is not really accurate since, as we have seen, the concept of a formal
power series makes sense even when the series does not converge (and so does not
define an analytic function). A more accurate term would be generating power
series; this term is sometimes used, but ‘generating function’ is more common, so
we will stick to that.
Sometimes the series A(x) above is called the ordinary generating function,
abbreviated to o.g.f. This contrasts with the exponential generating function or
e.g.f., which is defined by
$$a(x) = \sum_{n \ge 0} \frac{a_n x^n}{n!},$$
in other words the generating function of the sequence with nth term $a_n/n!$. Later
in this book we will see many examples, and motivation, for using the exponential
generating function in counting problems. The name comes from the fact that the
exponential generating function of the all-1 series is the exponential function:
$$e^x = \sum_{n \ge 0} \frac{x^n}{n!}.$$
Note that the exponential generating function will converge in many cases where
the ordinary generating function does not.
Another variant is useful when, instead of a single sequence, we have a
two-variable array of numbers, say $a_{n,m}$, where n and m are integers and the points
(n, m) lie in some appropriate region of the first quadrant. We associate a formal
variable with each subscript (say x with n and y with m in this case), and define the
multivariate generating function
$$A(x, y) = \sum_{n,m} a_{n,m} x^n y^m.$$
We will see an example in the next chapter, for the array of binomial coefficients.
This can be further varied, for example, we might take the ordinary generating
function in one variable and the exponential in the other, say
$$\sum_{n,m} \frac{a_{n,m} x^n y^m}{n!}.$$
Nonetheless, we often find that our series are actually convergent in some re-
gion of the complex plane, maybe even the whole plane. Let us just stop and
remind ourselves about convergence of complex power series.
For example, the geometric series $\sum c^n x^n$ converges for |x| < 1/|c|, and the limit
is equal to 1/(1 − cx) in this domain; it has a singularity (a simple pole) at x = 1/c.
Now, in analysis, one learns rules for addition and multiplication of convergent
series, and (under extra conditions) for infinite products, substitution, and differ-
entiation of power series. In every case, these definitions agree with the ones we
have given. So we have the following result:
Theorem 2.2 Suppose that a collection of formal power series over C converge in
the domain |x| < r, for some positive real number r. Then the formal power series
defined by addition, multiplication or differentiation of these series also converge
in this domain, and their limits are the sum, product or derivative of the limit
functions of the original series.
There are similar results for infinite products and substitution, but their state-
ments require a little more care. (In our earlier example of an infinite product, the
factors are polynomials and so converge everywhere, but it can be shown that the
product has radius of convergence 1.)
In other words, we can operate on formal power series with no consideration
of convergence; if the series happen to converge, then all our conclusions will be
correct for the limit functions of the series.
These theorems allow us to bring all the resources of analysis to bear on com-
binatorial counting problems. We will see examples of this later. First, here is a
very simple example of such a result.
Theorem 2.3 Let $A(x) = \sum a_n x^n$ be a formal power series with non-negative
coefficients. Suppose that the radius of convergence of the series is r, and assume that
r > 0.
(a) If r is finite, and c = 1/r, then the nearest singularity to the origin of the
limit function is at x = r, and for any ε > 0, we have
Note that we write exp(x) rather than $e^x$, to stress that we are thinking of it as a
formal power series. Its characteristic property is that it is equal to its derivative:
$$\frac{d}{dx}\exp(ax) = a\exp(ax).$$
The exponential series also has a multiplicative property. This will be familiar
to you from analysis; we give a combinatorial proof here.
Proposition 2.4
(In the first line we use the Binomial Theorem to expand $(x + y)^n$; and in getting
from the first line to the second we use the formula
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.)$$
Now on the right we have exactly the terms of total degree n in the two variables
which arise from multiplying the two exponential series on the right. We see that
the multiplicative property of the exponential series is equivalent to the Binomial
Theorem for positive integer exponents.
The second type of series is the binomial series for arbitrary exponent. First we
generalise the definition of binomial coefficients as follows. Let a be an arbitrary
complex number, and n a non-negative integer. Define
$$\binom{a}{n} = \frac{a(a-1)\cdots(a-n+1)}{n!}.$$
We see that, if a is a non-negative integer, the series is finite, and agrees with the
series $(1 + x)^a$ by the Binomial Theorem; so the definition is actually a theorem in
this case.
In general, the series converges for |x| < 1, and the sum can be shown to agree
with the analytic definition of powers with arbitrary exponent.
The exponent laws hold:
In the second and third equations, we use the substitution rule for formal power
series to evaluate $((1 + x)^a)^b$ or $(1 + (x + y + xy))^a$. (In the first case, we let
$(1 + x)^a = 1 + u$, where u is a formal power series with constant term zero; then, as we
have seen, the formal power series $(1 + u)^b$ is well-defined.)
We also have the familiar property of differentiation:
$$\frac{d}{dx}(1 + x)^a = a(1 + x)^{a-1}.$$
$$\log(1 + x) = \sum_{n \ge 1} \frac{(-1)^{n-1} x^n}{n}.$$
Again this series converges for |x| < 1, and the sum function agrees with the
analytic definition of the logarithm. Note that we use log(1 + x) rather than log x,
since log x has a singularity (a branch point) at the origin and so there is no Taylor
series expansion of it there.
We see that
exp(log(1 + x)) = 1 + x,
log(1 + (exp(x) − 1)) = x.
Again, note that exp(x) − 1 has constant term zero and so can be substituted into
the logarithmic series.
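The two identities above can be checked coefficient by coefficient, working modulo $x^N$. The sketch below is an illustration only: the names `multiply` and `compose` are hypothetical helpers, exact rational arithmetic from the standard `fractions` module stands in for coefficients in a field, and substitution is carried out by accumulating powers of the inner series, which is legitimate because that series has constant term zero.

```python
from fractions import Fraction
from math import factorial

N = 8  # all series are truncated modulo x^N

def multiply(a, b):
    """Cauchy product of two length-N coefficient lists, modulo x^N."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(N)]

def compose(outer, inner):
    """Substitute inner (constant term zero) into outer, modulo x^N."""
    assert inner[0] == 0
    result = [Fraction(0)] * N
    power = [Fraction(1)] + [Fraction(0)] * (N - 1)   # inner^0
    for coefficient in outer:
        result = [r + coefficient * p for r, p in zip(result, power)]
        power = multiply(power, inner)                # next power of inner
    return result

# exp(u) = sum u^n / n!  and  log(1 + u) = sum (-1)^(n-1) u^n / n
exp_series = [Fraction(1, factorial(n)) for n in range(N)]
log_series = [Fraction(0)] + [Fraction((-1) ** (n - 1), n) for n in range(1, N)]
exp_minus_one = [Fraction(0)] + exp_series[1:]

exp_of_log = compose(exp_series, log_series)       # exp(log(1 + x))
log_of_exp = compose(log_series, exp_minus_one)    # log(1 + (exp(x) - 1))

print([str(c) for c in exp_of_log])
print([str(c) for c in log_of_exp])
```

Because the inner series starts at degree 1, its kth power starts at degree k, so only the first N terms of the outer series can affect the result modulo $x^N$.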
There is some elaborate combinatorics underlying these two equations. Some
of this is discussed in an appendix to the next chapter.
2.6 Exercises
2.1 (a) Let $(a_n)$ be a sequence of integers, and $(b_n)$ the sequence of partial sums
of $(a_n)$ (in other words, $b_n = \sum_{i=0}^{n} a_i$). Suppose that the generating function
for $(a_n)$ is A(x). Show that the generating function for $(b_n)$ is A(x)/(1 − x).
(b) Let $(a_n)$ be a sequence of integers, and let $c_n = n a_n$ for all n ≥ 0. Suppose
that the generating function for $(a_n)$ is A(x). Show that the generating
function for $(c_n)$ is x(d/dx)A(x). What is the generating function for the
sequence $(n^2 a_n)$?
(c) Use the preceding parts of this exercise to find the generating function for
the sequence whose nth term is $\sum_{i=1}^{n} i^2$, and hence find a formula for the sum
of the first n squares.
2.2 Suppose that a collection of complex power series all define functions analytic
in some neighbourhood of the origin, and satisfy some identity there. Are we
allowed to conclude that this identity holds between the series regarded as formal
power series?
2.3 Suppose that A(x), B(x) and C(x) are the exponential generating functions of
sequences $(a_n)$, $(b_n)$ and $(c_n)$ respectively. Show that A(x)B(x) = C(x) if and only
if
$$c_n = \sum_{k=0}^{n} \binom{n}{k} a_k b_{n-k},$$
where
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$
2.4 Let R be a commutative ring with identity, and let R[[x]] denote the set of
formal power series over R.
(a) Prove that, with the operations of addition and multiplication defined in this
chapter, R[[x]] is a commutative ring with identity.
(b) Prove that a formal power series $\sum a_n x^n$ is a unit in R[[x]] (that is, has a
multiplicative inverse) if and only if its constant term $a_0$ is a unit in R.
(c) Suppose that R is a field. Show that R[[x]] has a unique maximal ideal I,
consisting of all formal power series with zero constant term (and this ideal
is generated by x). What is R[[x]]/I?
(d) Suppose that R is a field. Describe all the ideals in R[[x]].
2.5 Prove that $(d/dx)\exp(ax) = a\exp(ax)$ and $(d/dx)(1 + x)^a = a(1 + x)^{a-1}$.
2.6 Verify the following formula for the sloping diagonals of Pascal’s triangle:
$$\sum_{i=0}^{\lfloor n/2 \rfloor} \binom{n-i}{i} = F_n.$$
2.8 (a) Prove that the number of sequences of length n of zeros and ones which
have no two consecutive ones is $F_{n+1}$ for n ≥ 0.
(b) In how many ways can zeros and ones be arranged in n positions around a
circle so that no two ones are consecutive?
2.10 Let G(n) be the number computed in the following way. Write n as a com-
position (an ordered sum) of positive integers in all possible ways. For each ex-
pression, multiply the summands together; then add all the products formed in this
way. For example,
4 = 3 + 1 = 1 + 3 = 2 + 2 = 2 + 1 + 1 = 1 + 2 + 1 = 1 + 1 + 2 = 1 + 1 + 1 + 1,
so
G(4) = 4 + 3 + 3 + 4 + 2 + 2 + 2 + 1 = 21.
Prove that $G(n) = F_{2n-1}$.
2.11 In the previous exercise, instead of multiplying the summands, multiply
factors $2^{d-2}$ for each summand d > 2. Show that the answer is $F_{2n-2}$.
2.13 Use the result of the preceding exercise to show that the Fibonacci number
F(n) can be calculated with only c log n arithmetic operations, for some constant
c.
prove that
$$\frac{a^{m+1} - b^{m+1}}{a - b} = \sum_{k=0}^{\lfloor m/2 \rfloor} \binom{m-k}{k} (-ab)^k (a+b)^{m-2k}.$$
By taking a and b to be the roots of the equation $z^2 - z - 1 = 0$, deduce the
equality of two expressions for the Fibonacci numbers.
(I am grateful to Marcio Soares for this exercise.)
2.15 This exercise explores some remarkable properties of the Fibonacci recur-
rence relation. I learned this (and more) from J. H. Conway; Clark Kimberling
also proved some of these results. There are further properties not discussed here!
(a) Show that any positive integer n can be written uniquely in the form
$$n = F_{m_1} + F_{m_2} + \cdots + F_{m_k},$$
where $m_1, \ldots, m_k$ are positive integers (in increasing order) with no two
consecutive.
(b) Define the Fibonacci successor function σ on the set of positive integers as
follows: if n is expressed as in the preceding part of the question, then
$$\sigma(n) = F_{m_1+1} + F_{m_2+1} + \cdots + F_{m_k+1}.$$
Show that σ(σ(n)) = σ(n) + n for any positive integer n.
(c) Construct an array of positive integers as follows:
• The zeroth row consists of the Fibonacci numbers Fn for n ≥ 1: so this
row is (1, 2, 3, 5, . . .).
• The first element in any row after the zeroth is the smallest positive
integer which has not yet occurred in the table (so, for example, the
first number in the first row is 4). Subsequent elements in the row are
found by applying the Fibonacci successor function (so the first row
begins (4, 7, 11, 18, . . .)).
Prove that
• every positive integer occurs exactly once in the table;
• given any sequence $(a_1, a_2, \ldots)$ satisfying the Fibonacci recurrence
relation $a_{n+2} = a_{n+1} + a_n$, the terms of the sequence from some point on
agree with the entries in some row of the table.
(d) Extrapolate every row of the table backwards two places so that the
Fibonacci recurrence holds (so, for example, the zeroth row (1, 2, 3, . . .) is
preceded by (0, 1)). Show that the first entry in the new nth row of the
table is n.
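For experimenting with Exercise 2.15, the representation in part (a) and the successor function in part (b) can be computed as follows. This is a sketch under stated assumptions: it uses the indexing implied by part (c), namely $F_1 = 1$, $F_2 = 2$, $F_3 = 3$, $F_4 = 5$, and it finds the representation greedily (largest Fibonacci number first), which is assumed rather than proved to give the unique representation. The function names are hypothetical.

```python
def fibonacci_upto(limit):
    """Return [F_1, F_2, ...] = [1, 2, 3, 5, ...], extended while terms stay <= limit."""
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= limit:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs

def representation(n):
    """Indices m_1 < ... < m_k with n = F_{m_1} + ... + F_{m_k}, found greedily."""
    fibs = fibonacci_upto(n)
    indices = []
    for i in range(len(fibs) - 1, -1, -1):
        if fibs[i] <= n:
            indices.append(i + 1)      # fibs[i] is F_{i+1}
            n -= fibs[i]
    return sorted(indices)

def successor(n):
    """Fibonacci successor: replace each F_m in the representation by F_{m+1}."""
    fibs = fibonacci_upto(2 * n + 2)   # long enough to contain every F_{m+1} needed
    return sum(fibs[m] for m in representation(n))   # fibs[m] is F_{m+1}

row = [4]
for _ in range(3):
    row.append(successor(row[-1]))

print(representation(4))
print(row)
```

The list `row` is built by repeatedly applying the successor function starting from 4, so it can be compared with the first row of the table described in part (c).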
2.16 For this exercise, we adopt the convention that $F_0 = 0$ and $F_1 = 1$.
Prove that
(a) $\gcd(F_m, F_n) = F_{\gcd(m,n)}$;
(b) if m | n, then $F_m \mid F_n$;
(c) if $F_n$ is prime, then either n is prime or n = 4.
Is the converse of the last statement true?
3.1 Subsets
The number of k-element subsets of the set {1, . . . , n} is the binomial coefficient
$$\binom{n}{k} = \begin{cases} 0 & \text{if } k < 0 \text{ or } k > n; \\ \dfrac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1} & \text{if } 0 \le k \le n. \end{cases}$$
Proof Partition the k-element subsets into two classes: those containing n (which
have the form {n} ∪ L, where L is a (k − 1)-element subset of {1, . . . , n − 1}, and
so are $\binom{n-1}{k-1}$ in number); and those not containing n (which are k-element subsets
of {1, . . . , n − 1}, and so are $\binom{n-1}{k}$ in number).
(x + y)(x + y) · · · (x + y) (n factors);
multiplying this out we get the sum of $2^n$ terms, each of which is obtained by
choosing y from a subset of the factors and x from the remainder. There are $\binom{n}{k}$
subsets of size k, and each contributes a term $x^{n-k} y^k$ to the sum, for k = 0, . . . , n.
Once we have the Binomial Theorem, alternative proofs of various facts about
the binomial coefficient become available. Here, for example, is a proof of Propo-
sition 3.1.
Proof We have
$$(x + y)^n = (x + y)^{n-1} \cdot (x + y).$$
The coefficient of $x^{n-k} y^k$ on the left is $\binom{n}{k}$. On the right, this term arises from
multiplying the term in $x^{n-k} y^{k-1}$ in $(x + y)^{n-1}$ by y, and also from multiplying
the term $x^{n-k-1} y^k$ by x; so the resulting coefficient is $\binom{n-1}{k-1} + \binom{n-1}{k}$, as required.
The Binomial Theorem can be looked at in various ways. From one point of
view, it gives the generating function for the binomial coefficients $\binom{n}{k}$ for fixed n:
$$\sum_{k \ge 0} \binom{n}{k} y^k = (1 + y)^n.$$
Since the binomial coefficients have two indices, we could ask for a two-variable
generating function:
$$\sum_{n \ge 0} \sum_{k \ge 0} \binom{n}{k} x^n y^k = \sum_{n \ge 0} x^n (1 + y)^n = \frac{1}{1 - x(1 + y)}.$$
If we expand this in powers of y, we obtain
$$\frac{1}{(1 - x) - xy} = \frac{1}{1 - x} \cdot \frac{1}{1 - (x/(1 - x))y} = \sum_{k \ge 0} \frac{x^k}{(1 - x)^{k+1}}\, y^k,$$
Proof The sum is over the finite number of values of i for which both the binomial
coefficients on the right are non-zero; that is, 0 ≤ i ≤ m and 0 ≤ k − i ≤ n.
Consider the coefficient of $x^k$ in the expansion of $(1 + x)^{m+n}$. This is clearly
$\binom{m+n}{k}$. On the other hand, we can write
$$(1 + x)^{m+n} = (1 + x)^m (1 + x)^n,$$
The other aspect of the Binomial Theorem is its generalisation to arbitrary real
exponents (due to Isaac Newton). This depends on a revised definition of the
binomial coefficients.
Let a be an arbitrary real (or complex) number, and k a non-negative integer.
Define
$$\binom{a}{k} = \frac{a(a-1)\cdots(a-k+1)}{k!}.$$
Note that this agrees with the previous definition in the case when a = n is a
non-negative integer, since if k > n then one of the factors in the numerator is zero. We
do not define this version of the binomial coefficients if k is not a natural number.
Now the Binomial Theorem asserts that, for any real number a, we have
$$(1 + x)^a = \sum_{k \ge 0} \binom{a}{k} x^k. \tag{3.1}$$
and the series on the right-hand side is convergent for |x| < 1), it is a theorem, and
was understood by Newton in this form. As an equation connecting formal power
series, we may follow the same approach, or we may instead choose to regard (3.1)
as the definition and (3.2) as the theorem, according to taste. Whichever approach
we take, we need to know that the laws of exponents hold:
If (3.1) is our definition, these verifications will reduce to identities between bino-
mial coefficients; if (3.2) is the definition, they depend on properties of the power
series for exp and log, defined as in the last chapter.
3.1.1 Sampling
I have n distinguishable objects in a hat. In how many ways can I choose k of
them?
The answer depends on how the sampling is done. Two questions must be
resolved:
(a) Is the sampling with replacement (after each item is chosen, I note it down
and return it to the hat), or without replacement (an item once chosen is put
to one side and not used again)?
(b) Does the order in which items are chosen matter or not?
Theorem 3.6 The number of ways of sampling k objects from a set of n is given in
the following table, where $(n)_k = n(n-1)\cdots(n-k+1)$:

                         Order significant    Order not significant
    With replacement     $n^k$                $\binom{n+k-1}{k}$
    Without replacement  $(n)_k$              $\binom{n}{k}$

Remark The number $\binom{n+k-1}{k}$ of samples with replacement and with order
not significant can be written as a negative binomial coefficient:
$$\binom{n+k-1}{k} = (-1)^k \binom{-n}{k}.$$
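For small n and k the four entries of the table, and the identity in the Remark, can be confirmed by brute-force enumeration. The sketch below uses the standard `itertools` generators for the four kinds of sample; the values n = 5, k = 3 and the helper `generalised_binomial` (which evaluates $a(a-1)\cdots(a-k+1)/k!$ for an integer a, possibly negative) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, perm

n, k = 5, 3
objects = range(n)

# Enumerate every sample of each kind and count them.
counts = {
    "with replacement, order significant": len(list(product(objects, repeat=k))),
    "with replacement, order not significant": len(list(combinations_with_replacement(objects, k))),
    "without replacement, order significant": len(list(permutations(objects, k))),
    "without replacement, order not significant": len(list(combinations(objects, k))),
}

# The corresponding entries of the table in Theorem 3.6.
formulas = {
    "with replacement, order significant": n ** k,
    "with replacement, order not significant": comb(n + k - 1, k),
    "without replacement, order significant": perm(n, k),
    "without replacement, order not significant": comb(n, k),
}

def generalised_binomial(a, k):
    """a(a-1)...(a-k+1)/k! for any integer a and non-negative integer k."""
    result = Fraction(1)
    for i in range(k):
        result *= Fraction(a - i, i + 1)
    return result

negative_form = (-1) ** k * generalised_binomial(-n, k)

print(counts == formulas)
print(negative_form == comb(n + k - 1, k))
```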
Proof Three of the entries of the table are straightforward. The one that requires
proof is the number of samples with replacement and with order not significant.
We calculate this by transforming the problem into another.
Step 1 The number of samples of k objects from n, with replacement and with
order not significant, is equal to the number of n-tuples $(a_1, \ldots, a_n)$ of non-negative
integers with sum k.
We code a sample by the n-tuple whose ith entry $a_i$ is the number of occurrences
of the ith object in the sample. Clearly $a_i \ge 0$ and $a_1 + \cdots + a_n = k$. The bijection
reverses.
and $a_1 + \cdots + a_n = k$; so the number is $\binom{n+k-1}{k}$.
Interchanging the roles of n and k, the result in (b) asserts that the number of
compositions of n with k parts (that is, expressions for n as an ordered sum of k
positive integers) is $\binom{n-1}{k-1}$. Summing over k, the total number of compositions
of n is
$$\sum_{k=1}^{n} \binom{n-1}{k-1} = 2^{n-1}$$
(compare the introductory example in Chapter 4).
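Both counts of compositions can be checked by listing them. The following is a minimal sketch; the generator name `compositions` and the choice n = 6 are illustrative.

```python
from math import comb

def compositions(n):
    """Yield every composition of n as a tuple of positive integers."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

n = 6
all_compositions = list(compositions(n))

# Tally the compositions of n by their number of parts.
by_parts = {}
for c in all_compositions:
    by_parts[len(c)] = by_parts.get(len(c), 0) + 1

print(len(all_compositions), 2 ** (n - 1))
print(by_parts)
print({k: comb(n - 1, k - 1) for k in range(1, n + 1)})
```

The last two printed dictionaries compare the tally by number of parts with the binomial coefficients $\binom{n-1}{k-1}$.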
You see it won’t fit on the page, even at the smallest type size. The answer is
161700.
How can we find this more efficiently?
If k is small compared to n, the formula
$$\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k!}$$
uses only 2(k − 1) multiplications and one division. Even better is to do the
multiplications and divisions alternately:
$$\binom{n}{k} = \binom{n}{k-1} \cdot \frac{n-k+1}{k}.$$
This has k − 1 multiplications and the same number of divisions, but the numbers
being divided by are at most k and the numbers in the sum never grow too large.
Moreover, in the sequence
$$\binom{n}{k} = \frac{n}{1} \times \frac{n-1}{2} \times \frac{n-2}{3} \times \cdots$$
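The alternating scheme can be sketched in a few lines. The function name `binomial` is hypothetical; the integer division is exact at every step because the running value after step i is itself the binomial coefficient $\binom{n}{i}$.

```python
def binomial(n, k):
    """Compute C(n, k) by alternately multiplying and dividing."""
    if k < 0 or k > n:
        return 0
    result = 1
    for i in range(1, k + 1):
        # result is C(n, i - 1) before this line and C(n, i) after it,
        # so the division by i leaves no remainder.
        result = result * (n - i + 1) // i
    return result

print(binomial(100, 3))
print([binomial(6, k) for k in range(7)])
```

If the figure 161700 quoted above refers to the number of 3-element subsets of a 100-element set (an assumption, since the surrounding text is not shown here), then `binomial(100, 3)` reproduces it in three steps: 100, then 4950, then 161700.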
3.2 Partitions
In this section, we see for the first time a distinction between ‘labelled’ and ‘un-
labelled’ structures, which will be discussed in more detail in Chapters 7 and 10.
A labelled structure of a certain type is simply a structure on the set {1, 2, . . . , n},
for some n. An unlabelled structure is an equivalence class of labelled structures,
where two structures are equivalent if one can be obtained from the other by some
permutation of the set {1, 2, . . . , n}.
For a rather trivial example, consider the case where our ‘structures’ are subsets
of the set in question. The number of labelled structures is simply the number of
subsets of {1, . . . , n}, which is $2^n$. Now two subsets are equivalent in the above
sense if and only if they have the same cardinality. So the number of unlabelled
subsets of an n-set is the number of possible cardinalities: the possibilities are
0, 1, 2, . . . , n, so the number is n + 1.
The Bell number B(n) is the number of partitions of the set {1, . . . , n}. This is a
count of labelled structures. Two partitions are equivalent if and only if the lists of
sizes of the parts, arranged in non-increasing order, are the same in the two cases.
(For if this holds, we may find a permutation mapping the ith part of one partition
to the ith part of the other, for all i.) So the ‘unlabelled’ counting number is p(n),
the partition number, which is the number of partitions of the number n (that is,
lists in non-increasing order of positive integers with sum n). Thus, given any set
partition, the list of sizes of its parts is a number partition; and two set partitions
are equivalent under relabelling the elements of the underlying set (that is, under
permutations of {1, . . . , n}) if and only if the corresponding number partitions are
equal.
where the Bell number B(n) is the total number of partitions of {1, . . . , n}. By
analogy with the binomial coefficients, Karamata proposed the notation
$\left\{ {n \atop k} \right\}$ for S(n, k). This approach was championed by Knuth, and is sometimes
referred to as Karamata–Knuth notation.
Proposition 3.7 The recurrence relation for the Stirling numbers is
S(n, 1) = S(n, n) = 1, S(n, k) = S(n − 1, k − 1) + kS(n − 1, k) for 1 < k < n.
Proof We split the partitions into two classes: those for which {n} is a single part
(obtained by adjoining this part to a partition of {1, . . . , n − 1} into k − 1 parts),
and the remainder (obtained by taking a partition of {1, . . . , n − 1} into k parts,
selecting one part, and adding n to it). There are S(n − 1, k − 1) partitions in the
first class, and kS(n − 1, k) in the second.
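The recurrence of Proposition 3.7 translates directly into a short program. The sketch below is illustrative: the names `stirling2` and `bell` are hypothetical, values outside 1 ≤ k ≤ n are taken to be zero, and the Bell number is computed for n ≥ 1 as the sum of a row, using the fact stated above that B(n) is the total number of partitions of {1, . . . , n}.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """S(n, k): number of partitions of {1, ..., n} into k parts."""
    if k < 1 or k > n:
        return 0
    if k == 1 or k == n:
        return 1
    # n is a part on its own, or n is added to one of k existing parts.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def bell(n):
    """B(n) for n >= 1, as the sum of S(n, k) over k."""
    return sum(stirling2(n, k) for k in range(1, n + 1))

for n in range(1, 6):
    print([stirling2(n, k) for k in range(1, n + 1)], bell(n))
```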
Proposition 3.8 (a) The Stirling numbers satisfy the recurrence
$$S(n, k) = \sum_{i=1}^{n-1} \binom{n-1}{i-1} S(n - i, k - 1).$$
Proof Consider the part containing n of an arbitrary partition with k parts;
suppose that it has cardinality i. Then there are $\binom{n-1}{i-1}$ choices for the remaining i − 1
elements in this part, and S(n − i, k − 1) partitions of the remaining n − i elements
into k − 1 parts. This proves (a); the proof of (b) is almost identical.
The Stirling numbers also have the following property. Let $(x)_k$ denote the
polynomial $x(x-1)\cdots(x-k+1)$.
Proposition 3.9 $x^n = \sum_{k=1}^{n} S(n, k)(x)_k$.
Proof We prove this first when x is a positive integer. We take a set X with x
elements, and count the number of n-tuples of elements of X. The total number
is of course $x^n$. We now count them another way. Given an n-tuple $(x_1, \ldots, x_n)$,
we define an equivalence relation on {1, . . . , n} by i ≡ j if and only if $x_i = x_j$.
If this relation has k different classes, then there are k distinct elements among
$x_1, \ldots, x_n$, say $y_1, \ldots, y_k$ (listed in order). The choice of the partition and the
k-tuple $(y_1, \ldots, y_k)$ uniquely determines $(x_1, \ldots, x_n)$. There are S(n, k) choices for
the partition and $(x)_k$ choices for the k-tuple of distinct elements, so the number of
n-tuples is given by the right-hand expression also.
Now this equation between two polynomials of degree n holds for any positive
integer x, so it must be a polynomial identity.
$$\sum_{n \ge 1} S(n, k) y^n = \frac{y^k}{(1 - y)(1 - 2y) \cdots (1 - ky)}.$$
Now in the terms involving S(n − 1, ∗), we first note that we can start the summa-
tion at n = 2 since the terms for n = 1 are zero; then use a new variable m = n − 1;
and finally replace the variable m by n. We get
so that
$$F_k(y) = \frac{y}{1 - ky} F_{k-1}(y) = \frac{y^k}{(1 - y) \cdots (1 - ky)}.$$
Stirling numbers are involved in the substitution of exp(x) − 1 for x in formal
power series. The result depends on the following lemma:
Lemma 3.11
$$\sum_{n \ge k} \frac{S(n, k) x^n}{n!} = \frac{(\exp(x) - 1)^k}{k!}.$$
Proof The proof is by induction on k, the result being true when k = 1 since
S(n, 1) = 1. Suppose that it holds when k = l − 1. Then (setting S(n, k) = 0 if
n < k) we have
$$\frac{(\exp(x) - 1)^l}{l!} = \frac{1}{l} \cdot (\exp(x) - 1) \cdot \frac{(\exp(x) - 1)^{l-1}}{(l - 1)!} = \frac{1}{l} \left( \sum_{n \ge 1} \frac{x^n}{n!} \right) \cdot \left( \sum_{n \ge 1} \frac{S(n, l - 1) x^n}{n!} \right).$$
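Proof Suppose that (a) holds. Without loss of generality we may assume that
$a_0 = b_0 = 0$. Then
$$\begin{aligned}
B(x) &= \sum_{n \ge 1} \frac{b_n x^n}{n!} \\
     &= \sum_{n \ge 1} \frac{x^n}{n!} \sum_{k=1}^{n} S(n, k) a_k \\
     &= \sum_{k \ge 1} a_k \sum_{n \ge k} \frac{S(n, k) x^n}{n!} \\
     &= \sum_{k \ge 1} \frac{a_k (\exp(x) - 1)^k}{k!} \\
     &= A(\exp(x) - 1),
\end{aligned}$$
by Lemma 3.11.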
The converse is proved by reversing the argument.
Corollary 3.13 The exponential generating function for the Bell numbers is
$$\sum_{n \ge 0} \frac{B(n) x^n}{n!} = \exp(\exp(x) - 1).$$
Proof Apply Proposition 3.12 to the sequence with $a_n = 1$ for all n; or sum the
equation of Lemma 3.11 over k.
3.3 Permutations
A permutation of {1, . . . , n} is a bijective function from this set to itself.
In the nineteenth century, a more logical terminology was used. Such a
function was called a substitution, while a permutation was a sequence $(a_1, a_2, \ldots, a_n)$
containing each element of the set precisely once. Since there is a natural ordering
of {1, 2, . . . , n}, there is a one-to-one correspondence between ‘permutations’ and
‘substitutions’: the sequence $(a_1, a_2, \ldots, a_n)$ corresponds to the function $\pi : i \mapsto a_i$,
for i = 1, . . . , n.
The correspondence between permutations and total orderings of an n-set has
profound consequences for a number of enumeration problems. For now we
return to the usage ‘permutation = bijective function’. We refer to the sequence
$(a_1, \ldots, a_n)$ as the passive form of the permutation π in the last paragraph; the
function is the active form of the permutation.
Following the conventions of algebra, we write a permutation on the right of its
argument, so that iπ is the image of i under the permutation π (that is, the ith term
of the passive form of π).
The set of permutations of {1, . . . , n}, with the operation of composition, is a
group, called the symmetric group $S_n$. Products, identity, and inverses of
permutations always refer to the operations in this group.
value previously visited. This point must be a, since any other point will already
have a pre-image, and we have a cycle. If not all points have been covered, choose
another point and repeat the procedure, producing a cycle disjoint from the first
(again, no overlap is possible since all points in previously found cycles already
have pre-images). Repeat until all points are included.
Proposition 3.15 (a) The parity of π is equal to the parity of the number of
factors in any expression for π as a product of transpositions.
(b) Parity is a homomorphism from the symmetric group $S_n$ to the group Z/(2)
of integers mod 2, and hence sign is a homomorphism to the multiplicative
group {±1}.
(c) For n > 1, these homomorphisms are onto; their kernel (the set of
permutations of even parity, or of sign +1) is a normal subgroup of index 2 in $S_n$,
called the alternating group $A_n$.
Proof The proof of the first part will be indicated; the rest of the proposition is
an exercise. Let π be a permutation, written as a product of disjoint cycles. Show
that, for any transposition τ = (a, b),
• if a and b lie in distinct cycles $C_1$ and $C_2$ of π, these two are stitched together
into a single cycle of πτ;
• conversely, if a and b lie in the same cycle of π, this cycle is cut apart into
two cycles of πτ.
So the number of cycles increases or decreases by 1, and the parity changes.
Now the empty product of transpositions is the identity permutation, which has
n cycles, and so even parity; every transposition changes the parity, so a product
of l transpositions has the same parity as l.
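The two bullet points in the proof can be tested exhaustively for small n. The sketch below makes several assumptions that should be read as such: permutations are stored in passive form as tuples on {0, . . . , n − 1} rather than {1, . . . , n}; the parity of π is taken to be n minus the number of cycles of π, reduced mod 2 (consistent with the identity, with n cycles, being even); and the product πτ means "apply π, then τ", following the convention of writing permutations on the right. All function names are hypothetical.

```python
from itertools import permutations

def cycles(perm):
    """Disjoint cycles of a permutation given as a tuple: i maps to perm[i]."""
    seen, result = set(), []
    for start in range(len(perm)):
        if start not in seen:
            cycle, point = [], start
            while point not in seen:      # follow images until the cycle closes
                seen.add(point)
                cycle.append(point)
                point = perm[point]
            result.append(cycle)
    return result

def parity(perm):
    """n minus the number of cycles, mod 2 (0 = even, 1 = odd)."""
    return (len(perm) - len(cycles(perm))) % 2

def then(first, second):
    """The product 'first, then second' (permutations act on the right)."""
    return tuple(second[first[i]] for i in range(len(first)))

def transposition(n, a, b):
    """The transposition (a, b) on {0, ..., n-1}."""
    images = list(range(n))
    images[a], images[b] = b, a
    return tuple(images)

n = 4
# Every observed change in the number of cycles on multiplying by a transposition.
changes = {
    abs(len(cycles(then(p, transposition(n, a, b)))) - len(cycles(p)))
    for p in permutations(range(n))
    for a in range(n)
    for b in range(a + 1, n)
}

print(changes)
print(cycles((1, 2, 0, 3)), parity((1, 2, 0, 3)))
```

With these definitions, part (b) of Proposition 3.15 says that `parity(then(p, q))` equals `(parity(p) + parity(q)) % 2` for all p and q.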
Proposition 3.16 Two permutations are conjugate in the symmetric group if and
only if the lists of cycle lengths of the two permutations (written in non-increasing
order) are equal.
Proof We split the permutations into two classes: those for which (n) is a single
cycle (obtained by adjoining this cycle to a permutation of {1, . . . , n − 1} with k − 1
cycles), and the remainder (obtained by taking a permutation of {1, . . . , n − 1} with
k cycles and interpolating n at some position in one of the cycles). The second
construction, but not the first, changes the sign of the permutations.
To see that there are (n − 1)! permutations with a single cycle, note that if we
choose to start the cycle with 1 then the remaining n − 1 elements can be written
into the cycle in any order.
Note that substituting x = 1 into this equation shows that $\sum_k s(n, k) = 0$ for
n ≥ 2.
Corollary 3.19 The triangular matrices $S_1$ and $S_2$ whose entries are the Stirling
numbers of the first and second kinds are inverses of each other.
Proof Propositions 3.9 and 3.18 show that $S_1$ and $S_2$ are the transition matrices
between the bases $(x^n : n \ge 1)$ and $((x)_n : n \ge 1)$ of the space of real polynomials
with constant term zero.
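Corollary 3.19 can be checked on finite corners of the two triangular matrices. The sketch below is an illustration under assumptions: the second-kind numbers come from the recurrence of Proposition 3.7, while the signed first-kind numbers are generated by the standard recurrence s(n, k) = s(n − 1, k − 1) − (n − 1)s(n − 1, k), which is not displayed in the text above. The function names and the size 7 are illustrative.

```python
def stirling_tables(size):
    """Return (s, S): signed first-kind and second-kind Stirling numbers
    indexed as s[n][k], S[n][k] for 0 <= n, k <= size."""
    s = [[0] * (size + 1) for _ in range(size + 1)]
    S = [[0] * (size + 1) for _ in range(size + 1)]
    s[0][0] = S[0][0] = 1
    for n in range(1, size + 1):
        for k in range(1, n + 1):
            s[n][k] = s[n - 1][k - 1] - (n - 1) * s[n - 1][k]
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
    return s, S

def matrix_product(A, B, size):
    """Product of the corners of A and B with rows and columns 1..size."""
    return [[sum(A[n][j] * B[j][k] for j in range(1, size + 1))
             for k in range(1, size + 1)]
            for n in range(1, size + 1)]

size = 7
s, S = stirling_tables(size)
identity = [[int(n == k) for k in range(size)] for n in range(size)]

print(matrix_product(s, S, size) == identity)
print(matrix_product(S, s, size) == identity)
print([sum(s[n][1:]) for n in range(1, size + 1)])
```

The last line prints the row sums of the signed first-kind numbers, which can be compared with the remark above that they vanish for n ≥ 2.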
Proposition 3.20 Let $(a_0, a_1, \ldots)$ and $(b_0, b_1, \ldots)$ be two sequences of numbers,
with exponential generating functions A(x) and B(x) respectively. Then the
following two conditions are equivalent:
(a) $b_0 = a_0$ and $b_n = \sum_{k=1}^{n} s(n, k) a_k$ for n ≥ 1;
(b) B(x) = A(log(1 + x)).
Proof Write out the pattern for the cycle structure of a permutation with $c_k(\pi)$
cycles of length k for all k, leaving blank the entries in the cycles. There are
n! ways of entering the numbers 1, . . . , n in the pattern. However, each cycle of
length k can be written in k different ways, since the cycle can start at any point;
and the cycles of length k can be written in any of the $c_k(\pi)!$ possible orders. So the
number of ways of entering the numbers 1, . . . , n giving rise to each permutation
in the conjugacy class is $\prod_k k^{c_k(\pi)} c_k(\pi)!$.
The cycle index of the symmetric group $S_n$ is the generating function for the
numbers $c_k(\pi)$, for k = 1, . . . , n. By convention it is normalised by dividing by n!.
Thus,
$$Z(S_n) = \frac{1}{n!} \sum_{\pi \in S_n} \prod_{k=1}^{n} s_k^{c_k(\pi)}.$$
Because of the normalisation, this can be thought of as the probability generating
function for the cycle structure of a random permutation: that is, the coefficient
of the monomial $\prod s_k^{a_k}$ (where $\sum k a_k = n$) is the probability that a random
permutation π has $c_k(\pi) = a_k$ for k = 1, . . . , n; this is
$$\frac{1}{\prod_k k^{a_k} a_k!}.$$
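The probability formula can be compared with an exhaustive tally over a small symmetric group. This is a minimal sketch: permutations are taken on {0, . . . , n − 1} in passive form, n = 5 is an arbitrary small choice, and the names `cycle_type` and `predicted_probability` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """Return (a_1, ..., a_n), where a_k is the number of cycles of length k."""
    n = len(perm)
    seen = [False] * n
    counts = [0] * n
    for start in range(n):
        if not seen[start]:
            length, point = 0, start
            while not seen[point]:        # walk round the cycle through start
                seen[point] = True
                point = perm[point]
                length += 1
            counts[length - 1] += 1
    return tuple(counts)

def predicted_probability(a):
    """1 / prod_k (k^{a_k} * a_k!) for a cycle type a = (a_1, ..., a_n)."""
    denominator = 1
    for k, a_k in enumerate(a, start=1):
        denominator *= k ** a_k * factorial(a_k)
    return Fraction(1, denominator)

n = 5
tally = Counter(cycle_type(p) for p in permutations(range(n)))
observed = {a: Fraction(count, factorial(n)) for a, count in tally.items()}

for a in sorted(observed):
    print(a, observed[a], predicted_probability(a))
```

Each printed row shows a cycle type followed by the observed proportion of permutations with that type and the value of the formula.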
One result which we will meet later is the following. We adopt the convention
that $Z(S_0) = 1$.
Proposition 3.22 $\sum_{n \ge 0} Z(S_n) = \exp\left( \sum_{k \ge 1} \frac{s_k}{k} \right)$.
$$= \exp\left( \sum_{k \ge 1} \frac{s_k}{k} \right)$$
as required. (The sum on the right-hand side of the first line is over all infinite
sequences of natural numbers $(a_1, a_2, \ldots)$ with only finitely many entries non-zero.)
We will see much more about cycle indices in the chapter on orbit counting.
and so
$$(x)^{(n)} = \sum_{k=1}^{n} L(n, k)(x)_k,$$
by the rule for matrix multiplication. In other words, the Lah numbers express the
‘rising factorials’ $(x)^{(n)}$ in terms of the ‘falling factorials’ $(x)_n$.
Now
$$\begin{aligned}
(x)^{(n+1)} &= \sum L(n, k)(x)_k \big( (x - k) + (n + k) \big) \\
            &= \sum (n + k) L(n, k)(x)_k + \sum L(n, k)(x)_{k+1},
\end{aligned}$$
so
$$L(n + 1, k) = (n + k) L(n, k) + L(n, k - 1),$$
with the convention that L(n, k) = 0 if k = 0 or k > n. Now it is straightforward to
show that
$$L(n, k) = \frac{n!}{k!} \binom{n-1}{k-1}$$
is the unique solution of this recurrence relation with the appropriate boundary
conditions.
In other words, unlike the Stirling numbers, there is a closed form for the Lah
numbers in terms of factorials.
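The agreement between the recurrence and the closed form is easy to confirm for small values. The sketch below assumes the starting value L(1, 1) = 1 as the boundary condition; the function names are hypothetical, and the final check evaluates the rising and falling factorials at a few integer values of x.

```python
from math import comb, factorial

def lah_by_recurrence(size):
    """Table L[n][k] built from L(n+1, k) = (n + k) L(n, k) + L(n, k-1)."""
    L = [[0] * (size + 2) for _ in range(size + 1)]
    L[1][1] = 1                       # assumed boundary condition
    for n in range(1, size):
        for k in range(1, n + 2):
            L[n + 1][k] = (n + k) * L[n][k] + L[n][k - 1]
    return L

def lah_closed_form(n, k):
    """(n! / k!) * C(n-1, k-1) for 1 <= k <= n."""
    return factorial(n) // factorial(k) * comb(n - 1, k - 1)

def rising(x, n):
    """x (x + 1) ... (x + n - 1)."""
    result = 1
    for i in range(n):
        result *= x + i
    return result

def falling(x, k):
    """x (x - 1) ... (x - k + 1)."""
    result = 1
    for i in range(k):
        result *= x - i
    return result

size = 8
L = lah_by_recurrence(size)

recurrence_matches = all(
    L[n][k] == lah_closed_form(n, k)
    for n in range(1, size + 1) for k in range(1, n + 1)
)
expansion_matches = all(
    rising(x, n) == sum(L[n][k] * falling(x, k) for k in range(1, n + 1))
    for n in range(1, size + 1) for x in range(-3, 10)
)

print(recurrence_matches, expansion_matches)
print([L[4][k] for k in range(1, 5)])
```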
Proposition 3.23
$$\frac{d^n}{dx^n}\big( f(x) g(x) \big) = \sum_{k=0}^{n} \binom{n}{k} \frac{d^k}{dx^k} f(x) \cdot \frac{d^{n-k}}{dx^{n-k}} g(x).$$
Proof The proof is by induction. The (n − 1)st derivative is a sum of terms where
f is differentiated k times and g n − k − 1 times, with coefficient $\binom{n-1}{k}$. By the
product rule, terms in $(D^k f)(D^{n-k} g)$ in $D^n(fg)$ arise by differentiating terms in
either $(D^{k-1} f)(D^{n-k} g)$ or $(D^k f)(D^{n-k-1} g)$, so the coefficient of this term is
$$\binom{n-1}{k-1} + \binom{n-1}{k} = \binom{n}{k}.$$
$$D^n\big( f(\exp(x) - 1) \big) = \sum_{k=1}^{n} S(n, k)\, f^{(k)}(\exp(x) - 1) \exp(kx),$$
since the sum of $P(n; a_1, \ldots, a_k)$ over all $(a_1, \ldots, a_k)$ with fixed n and k is just the
number S(n, k) of partitions with k parts. Putting x = 0 we obtain the formula
$$b_n = \sum_{k=1}^{n} S(n, k) a_k$$
3.6 Unimodality
For the Stirling numbers of the second kind the recurrence is
so to get any particular value we take the value immediately above, multiply it by
its column number k, and add the value above and to the left:
1 1
1 3 1
1 7 6 1
↓1 ↓2 ↓3 ↓4
1 15 25 10 1
For the unsigned Stirling numbers of the first kind the recurrence is
so it works the same except that we multiply the value immediately above the one
1 1
2 3 1
6 11 6 1
↓4 ↓4 ↓4 ↓4
24 50 35 10 1
In both cases, we notice that the numbers in a row seem to increase to a maxi-
mum and then decrease. We now examine this property.
Given a sequence of positive numbers, say $a_0, a_1, a_2, \ldots, a_n$, we say that the
sequence is unimodal if there is an index m with 0 ≤ m ≤ n such that
$$a_0 \le a_1 \le \cdots \le a_m \ge a_{m+1} \ge \cdots \ge a_n.$$
while if n = 2m + 1, we have
$$\binom{n}{0} < \binom{n}{1} < \cdots < \binom{n}{m} = \binom{n}{m+1} > \binom{n}{m+2} > \cdots > \binom{n}{n}.$$
So the binomial coefficients increase to the middle and then decrease (remaining
constant for one step if n is odd).
This is all very well, but it depends on having a formula for the binomial
coefficients. We want to show that various other sequences are unimodal, so we need
to develop some machinery. Here is a simple but useful test.
The sequence $a_0, a_1, a_2, \ldots, a_n$ of positive integers is said to be log-concave if
$a_k^2 \ge a_{k-1} a_{k+1}$ for 1 ≤ k ≤ n − 1. The reason for the name is that the logarithms
of the $a_k$ are concave: setting $b_k = \log a_k$, we have $2b_k \ge b_{k-1} + b_{k+1}$, or in other
words, $b_{k+1} - b_k \le b_k - b_{k-1}$. So if we plot the points $(k, b_k)$ for 0 ≤ k ≤ n, then
the slopes of the lines joining consecutive points decrease as k increases, so that
the figure they form is concave when viewed from above.
Proof We have $a_k^2 \ge a_{k+1} a_{k-1}$; so $(a_{k+1}/a_k) \le (a_k/a_{k-1})$. So if $a_k \le a_{k-1}$ then
$a_{k+1} \le a_k$. So the numbers $a_k$ increase until it first happens that $a_k \le a_{k-1}$ and then
decrease.
Theorem 3.26 Let $P(x) = \sum_{k=0}^{n} a_k x^k$ be the generating polynomial for the numbers
$a_0, \ldots, a_n$. Suppose that all the roots of the equation P(x) = 0 are real and negative.
Then the sequence $a_0, \ldots, a_n$ is log-concave.
Proof The proof goes by induction on n. If n = 1, then there are only two num-
bers, and trivially they form a log-concave sequence. We will look at the case
n = 2 before going on to the general case.
The coefficients of the polynomial P(x) are, by assumption, all positive, so
there cannot be a non-negative root. So our assumption is that the roots are all
real. Now for n = 2 we have $P(x) = a_0 + a_1x + a_2x^2$; this has real roots if and only
if its discriminant is non-negative, that is, if and only if $a_1^2 - 4a_0a_2 \ge 0$; but this
implies that $a_1^2 \ge a_2a_0$, which is the definition of log-concavity!
Now we turn to the general case. Suppose that P(x) = (x + c)Q(x), where c > 0
and
\[ Q(x) = b_{n-1}x^{n-1} + \cdots + b_1x + b_0. \]
Now the polynomial Q(x) has all its roots real and negative, since they are all the
roots of P(x) except for −c. By the inductive hypothesis, the sequence $b_0, \ldots, b_{n-1}$
is log-concave; that is,
\[ b_k^2 \ge b_{k-1}b_{k+1} \]
for k = 1, . . . , n − 2. Also, since P(x) = (x + c)Q(x), we have $a_0 = cb_0$, $a_n = b_{n-1}$,
and $a_k = b_{k-1} + cb_k$ for 1 ≤ k ≤ n − 1.
We first show that $b_kb_{k-1} \ge b_{k+1}b_{k-2}$ for 2 ≤ k ≤ n − 2. For we have
and the equation $(1 + x)^n = 0$ has just the root −1 with multiplicity n. So the
binomial coefficients are log-concave and hence unimodal.
Example 2: Stirling numbers of the first kind Let u(n, k) be the number of
permutations of {1, . . . , n} having k cycles. We showed earlier that
\[ \sum_{k=1}^{n} u(n,k)x^k = x(x+1)\cdots(x+n-1), \]
and the polynomial on the right has roots 0, −1, −2, . . . , −(n − 1). We can neglect
the zero root: the Stirling numbers start at k = 1 rather than zero, and dividing by
x simply changes the indexing so that they start at 0. So again the Stirling numbers
are log-concave and hence unimodal.
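As an illustrative check (a sketch with hypothetical helper names, not taken from the text), we can expand x(x + 1) · · · (x + n − 1) numerically, drop the zero constant term, and test the remaining coefficients for log-concavity and unimodality.

```python
def rising_factorial_coeffs(n):
    """Coefficients [c_0, ..., c_n] of x(x+1)...(x+n-1), so that c_k = u(n, k)."""
    coeffs = [1]  # the constant polynomial 1
    for c in range(n):
        new = [0] * (len(coeffs) + 1)
        for i, a in enumerate(coeffs):
            new[i] += c * a      # contribution of the constant term c
            new[i + 1] += a      # contribution of the term x
        coeffs = new
    return coeffs


def is_log_concave(seq):
    """True if seq[k]^2 >= seq[k-1] * seq[k+1] for every interior index k."""
    return all(seq[k] ** 2 >= seq[k - 1] * seq[k + 1] for k in range(1, len(seq) - 1))


def is_unimodal(seq):
    """True if the sequence never increases again after it has strictly decreased."""
    falling = False
    for a, b in zip(seq, seq[1:]):
        if b < a:
            falling = True
        elif b > a and falling:
            return False
    return True


row = rising_factorial_coeffs(6)[1:]  # u(6,1), ..., u(6,6)
print(row, is_log_concave(row), is_unimodal(row))
```

A numerical check of this kind is no substitute for the theorem, but it is a quick way to catch an indexing slip.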
Example 3: Stirling numbers of the second kind Let S(n, k) be the number of
partitions of {1, . . . , n} into k parts. These numbers are also unimodal. The proof
is a little more difficult, but is a good showcase for some elementary real analysis
(Rolle’s Theorem).
We begin with the recurrence relation for the Stirling numbers of the second kind,
proved in Proposition 3.7:
\[ S(n,k) = S(n-1,k-1) + k\,S(n-1,k). \]
Let
\[ P_n(x) = \sum_{k=0}^{n} S(n,k)x^k. \]
Putting $Q_n(x) = P_n(x)e^x$, we see that $P_n(x) = 0$ and $Q_n(x) = 0$ have the same
roots. The identity above, multiplied by $e^x$, gives
By Rolle’s Theorem, there is a root of $Q_n(x)$ between each pair of roots of $Q_{n-1}(x)$,
and one to the left of the smallest root of $Q_{n-1}(x)$ (since $Q_{n-1}(x) \to 0$ as $x \to -\infty$);
and also a root at 0. This accounts for (n − 2) + 1 + 1 roots, that is, all the roots
of $Q_n(x)$. So the induction step is complete.
exp(log(1 + x)) = 1 + x.
We give here a combinatorial proof, rather than the usual analytic proof. We will
see that the proof ultimately rests on facts about the sign of a permutation.
n! different ways. But each permutation arises in several different ways, for two
reasons: we may start each cycle at any point (giving a factor $1^{b_1}2^{b_2}\cdots n^{b_n}$); and
we may rearrange the cycles of the same length (giving a factor $b_1!\,b_2!\cdots b_n!$). So
the number of such permutations is
\[ \frac{n!}{1^{b_1}b_1!\;2^{b_2}b_2!\cdots n^{b_n}b_n!}. \]
By inspection, we see that the claim is proved.
for n > 1, since there are equally many permutations with each sign.
3.8 Exercises
3.1 Write down Pascal’s triangle mod 2; that is, record only whether each entry
is odd or even. You should find that the triangle has a fractal structure; can you
explain why? [Exercise 3.6 below may help.]
3.6 Again let p be a prime number. Using the result of the previous exercise, show
the following. Suppose that n = ap + b and k = cp + d, with 0 ≤ b, d ≤ p − 1. By
computing the coefficient of $x^{cp+d}$ in $(1 + x)^{ap+b}$, show that
\[ \binom{ap+b}{cp+d} \equiv \binom{a}{c}\binom{b}{d} \pmod{p}. \]
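The congruence can be tested numerically before attempting a proof. The sketch below is only a sanity check, not a solution; `lucas_holds` is a hypothetical name, and the code relies on Python's `math.comb`, which returns 0 when the lower index exceeds the upper one.

```python
from math import comb


def lucas_holds(p, a, b, c, d):
    """Check C(ap+b, cp+d) = C(a,c) * C(b,d) modulo p for one choice of parameters."""
    left = comb(a * p + b, c * p + d) % p
    right = (comb(a, c) * comb(b, d)) % p
    return left == right


# Try every admissible choice with small a, c and 0 <= b, d <= p - 1.
ok = all(
    lucas_holds(p, a, b, c, d)
    for p in (2, 3, 5, 7)
    for a in range(5)
    for c in range(5)
    for b in range(p)
    for d in range(p)
)
print(ok)
```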
3.7 Deduce the Binomial Theorem by applying Taylor’s Theorem to the analytic
function $(1 + x)^a$ (for arbitrary real number a), using the standard formula for the
derivative of $(1 + x)^a$.
3.9 Use the method of the preceding exercise, together with the Central Limit
Theorem, to deduce the constant in Stirling’s formula.
3.11 Formulate and prove an analogue of Proposition 3.12 for binomial coeffi-
cients.
3.12 In how many ways can k identical sweets be distributed to n children if the
number xi of sweets given to the ith child is required to satisfy xi ≥ ai , for some
numbers a1 , . . . , an ?
3.13 Let B be the infinite matrix whose (n, k) entry is $\binom{n}{k}$. What are the entries
of the matrix $B^2$?
Can you find a bijective proof that the numbers in the two parts are the same?
3.16 Let B(n) be the number of partitions of {1, . . . , n}. Prove that
\[ \sqrt{n!} \le B(n) \le n!. \]
3.17 Prove that log n! is greater than n log n − n + 1 and differs from it by at
most $\tfrac12 \log n$. Deduce that
\[ \frac{n^n}{e^{n-1}} \le n! \le \frac{n^{n+1/2}}{e^{n-1}}. \]
3.18 Prove that
\[ (-1)^{k}\binom{n}{k} = \binom{-n+k-1}{k} \]
for 0 ≤ k ≤ n. Use this and Proposition 3.3 to prove the Binomial Theorem for
negative integer exponents.
\[ A(x_1, \ldots, x_n) = \prod_{1\le i<j\le n} (x_j - x_i). \]
Prove that, if two of the indeterminates are transposed and the others are fixed,
then A(x1 , . . . , xn ) is mapped to its negative.
Hence show that, if we define a function s from the symmetric group to the
group of integers mod 2 by the rule that
3.22 What is the relation between the numbers T (n, k) defined in Section 3.7 and
Stirling numbers?
3.23 A total preorder is a reflexive and transitive relation on a set X which satis-
fies the trichotomy law, that is, for any x, y ∈ X, either R(x, y) or R(y, x) holds.
(a) Show that the number of total preorders of an n-set is
\[ P(n) = \sum_{k=1}^{n} S(n,k)\,k!. \]
(b) Deduce that the exponential generating function for (P(n)) is 1/(2−exp(x)).
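A small computation can confirm the formula in part (a) for small n. The sketch below is an added illustration, not a solution: it assumes the standard description of a total preorder with k equivalence classes as a map from the n points onto the ranks 1, . . . , k, and all function names are hypothetical.

```python
from functools import lru_cache
from itertools import product
from math import factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """S(n, k) from the recurrence S(n,k) = S(n-1,k-1) + k*S(n-1,k)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def preorders_by_formula(n):
    """The sum of S(n,k) * k! over k = 1, ..., n."""
    return sum(stirling2(n, k) * factorial(k) for k in range(1, n + 1))


def preorders_brute_force(n):
    """Count rank functions: maps from n points onto {0, ..., k-1}, for k = 1, ..., n."""
    count = 0
    for k in range(1, n + 1):
        for ranks in product(range(k), repeat=n):
            if len(set(ranks)) == k:  # the map must use every rank
                count += 1
    return count


print([preorders_by_formula(n) for n in range(1, 6)])
```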
3.24 Prove that the number P(n) of total preorders on n points is given by the
formula
\[ P(n) = \sum_{k\ge 1} \frac{k^n}{2^{k+1}}. \]
[Hint: $k^n = \sum_{i=1}^{n} S(n,i)\,(k)_i$.]
Show that the following method allows us to choose a random preorder on n
points uniformly:
(a) choose a positive integer k from the probability distribution K given by
\[ P(K = k) = \frac{k^n}{P(n)\,2^{k+1}}; \]
3.27 Find a formula for the number P(n; a1 , . . . , ak ) appearing in Faà di Bruno’s
formula.
3.28 (a) Let $p_n$ be the counting function for permutations on an even number
of points: that is, $p_n = n!$ if n is even, $p_n = 0$ if n is odd. Show that the
exponential generating function of the sequence $(p_n)$ is $P(x) = (1 - x^2)^{-1}$.
(b) Let n = 2m. Show that the number of permutations of {1, . . . , n} with all
cycles even is the square of the number of permutations with all cycles of
length 2. (Hint: if σ and τ are permutations with all cycles of length 2,
colour their cycles red and blue respectively, and produce a permutation with
all cycles even by following red and blue edges alternately. Show that every
pair (σ, τ) gives rise to $2^c$ permutations, where c is the number of cycles of
length greater than 2, while every permutation with all cycles even and c of
length greater than 2 arises from $2^c$ such pairs (σ, τ).)
(c) Deduce that, if $e_n$ is the number of permutations of {1, . . . , n} with all
cycles even, then the exponential generating function for $(e_n)$ is
$E(x) = (1 - x^2)^{-1/2}$.
(d) Let $o_n$ be the number of permutations of a set {1, . . . , n} with all cycles odd,
where n is even (that is, $o_n = 0$ if n is odd). Show that the exponential
generating function O(x) for the sequence $(o_n)$ satisfies E(x)O(x) = P(x),
by decomposing a permutation into the product of a permutation with all
cycles even and one with all cycles odd.
(e) Deduce that $e_n = o_n$ for all even numbers n.
This demonstrates a result mentioned in the Preface.
Remark Can you find a bijective proof of the last assertion? [See the paper by
Richard Lewis and Simon Norton for a crib.]
Following the hint, we first calculate the expected number of times during the
trial when the numbers of patients receiving drug and placebo are equal. This
is obtained by summing, over all n-element subsets A of {1, . . . , 2n}, the number
of values of k for which |A ∩ {1, . . . , 2k}| = k, and dividing by the number $\binom{2n}{n}$
of subsets. Now the sum can be calculated by counting, for each value of k, the
number of n-subsets A for which |A ∩ {1, . . . , 2k}| = k, and summing the result
over k.
For a given k, the number of subsets is $\binom{2k}{k}\binom{2(n-k)}{n-k}$, since we must choose k
of the numbers 1, . . . , 2k, and n − k of the numbers 2k + 1, . . . , 2n. Hence, by the
stated result, the sum is $2^{2n}$, and the average is $2^{2n}/\binom{2n}{n}$.
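The identity used here, that the products of central binomial coefficients sum to a power of 4 when k runs from 0 to n, is easy to confirm numerically. The following sketch is an added check with hypothetical function names.

```python
from math import comb


def equal_count_sum(n):
    """Sum over k = 0..n of C(2k, k) * C(2(n-k), n-k)."""
    return sum(comb(2 * k, k) * comb(2 * (n - k), n - k) for k in range(n + 1))


def average_number_of_ties(n):
    """The sum divided by the number C(2n, n) of n-subsets of {1, ..., 2n}."""
    return equal_count_sum(n) / comb(2 * n, n)


for n in (1, 2, 5, 10):
    print(n, equal_count_sum(n), 4 ** n, average_number_of_ties(n))
```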
Now consider the doctor’s guesses in any particular trial. At any stage where
equally many patients have received drug and placebo, he guesses at random, and
is equally likely to be right as wrong. Such points contribute zero to the expected
number of correct minus incorrect guesses. In each interval between two consec-
utive such stages, say 2k, and 2l, the doctor will guess right one more time than
he guesses wrong. (For example, if the 2kth patient gets the drug, then between
stages 2k + 1 and 2l the number of patients getting the drug is l − k − 1 and the
number getting the placebo is l − k, but the doctor will always guess the placebo.)
So the expected number of correct minus incorrect guesses is the number of such
intervals, which is one less than the number of times that the numbers are equal.
So the expected number is $2^{2n}/\binom{2n}{n} - 1$, which is asymptotically $\sqrt{\pi n}$, by the
Recurrence relations
Clearly such a recurrence has a unique solution. (Note that this allows the possi-
bility of prescribing some initial values, by choosing the first few functions to be
constant. Also, it is permissible for the function Fn to depend explicitly on n.)
In this chapter, we will describe the complete solution of linear recurrence rela-
tions of finite length with constant coefficients. For other types, we usually do not
give any general theory, but describe some important examples and methods for
solution. Two of these examples, the partition numbers and the Catalan numbers,
are of such importance that we devote a section to further study of each.
To begin, an example to show that the first recurrence relation we think of for
a sequence of numbers is not always the easiest!
4 = 3 + 1 = 2 + 2 = 2 + 1 + 1 = 1 + 1 + 1 + 1.
4.1 Linear recurrences with constant coefficients
with initial condition $u_1 = 1$. This obviously has the solution $u_n = 2^{n-1}$ for n ≥ 1.
The simpler recurrence can be proved directly. Starting from a composition
of n − 1, we obtain two compositions of n: one by adding a summand 1 at the
beginning, and the other by increasing the first summand by 1. There is no overlap,
since the first construction gives all compositions beginning with 1, and the second
gives all beginning with a number greater than 1; moreover, every composition of
n is obtained from one or other construction.
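The two constructions translate directly into a recursive enumeration. The sketch below is an added illustration (the function name is hypothetical); it builds the compositions of n from those of n − 1 exactly as described, so the list doubles in length at each step.

```python
def compositions(n):
    """All compositions of n (ordered sums of positive integers), as tuples."""
    if n == 1:
        return [(1,)]
    result = []
    for c in compositions(n - 1):
        result.append((1,) + c)             # add a summand 1 at the beginning
        result.append((c[0] + 1,) + c[1:])  # increase the first summand by 1
    return result


for c in compositions(4):
    print(" + ".join(map(str, c)))
```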
for all n ≥ n0 .
This can be expressed in terms of generating functions as follows. Let
$U(x) = \sum_{n\ge 0} u_n x^n$ and $A(x) = 1 - \sum_{n\ge 1} a_n x^n$.
for all n ≥ n0 if and only if the generating functions defined above satisfy the
condition that U(x)A(x) is a polynomial of degree less than n0 .
Proof The recurrence expresses the condition that the coefficient of $x^n$ in U(x)A(x)
is zero for n ≥ n0 .
(a) the sequence (un ) satisfies the recurrence relation (4.1) for n ≥ n0 ;
(b) A(x)U(x) is a polynomial of degree less than n0 , where A(x) is the charac-
teristic polynomial of (4.1).
Proof As in Proposition 4.1, the proof is immediate from the observation that the
relation (4.1) asserts that the coefficient of $x^n$ in A(x)U(x) is zero.
Corollary 4.3 If a sequence (un ) satisfies the recurrence (4.1) for all n ≥ k, then
its generating function is given by U(x) = P(x)/A(x), where A(x) is the character-
istic polynomial of the recurrence, and P(x) is an arbitrary polynomial of degree
less than k.
Theorem 4.4 Suppose that the recurrence relation (4.1) has characteristic polynomial
\[ A(x) = (1 - \alpha_1 x)^{m_1} \cdots (1 - \alpha_r x)^{m_r}, \]
where $\alpha_1, \ldots, \alpha_r$ are distinct complex numbers and $m_1, \ldots, m_r$ are positive
integers. (Thus, we have $m_1 + \cdots + m_r = k$.) Suppose that the sequence $(u_n)$ satisfies
the relation (4.1) for all n ≥ k. Then
\[ u_n = p_1(n)\alpha_1^n + \cdots + p_r(n)\alpha_r^n, \]
Remark The roots $\alpha_1, \ldots, \alpha_r$ are the inverses of the roots of the characteristic
polynomial of the recurrence; so they are the roots of the ‘inverse polynomial’
\[ x^k - a_1x^{k-1} - a_2x^{k-2} - \cdots - a_k = 0. \]
Proof We have
\[ \sum u_n x^n = U(x) = \frac{P(x)}{A(x)}. \]
We use the technique of partial fractions to evaluate this fraction.
First, if the degree of P is less than k, we claim that
\[ \frac{P(x)}{A(x)} = \frac{P_1(x)}{(1 - \alpha_1 x)^{m_1}} + \cdots + \frac{P_r(x)}{(1 - \alpha_r x)^{m_r}}, \]
where the degree of $P_i$ is less than $m_i$ for i = 1, . . . , r.
To prove this, let
Now let $P_i(x)$ be the remainder when $T_i(x)$ is divided by $(1 - \alpha_i x)^{m_i}$. Then
and since both left and right sides of this congruence have degree strictly less than
∑ mi = k, we have equality. Now dividing by A(x) gives the claimed result.
for some uniquely determined numbers c0 , . . . , cm−1 . To see this, multiply both
sides by $(1 - \alpha x)^m$ to obtain an equation between two polynomials, the right-hand
side involving the coefficients c0 , . . . , cm−1 ; equating coefficients of powers of x
from largest to smallest now determines these coefficients.
To conclude the proof, we observe that, by the Binomial Theorem with negative
exponent, we have
\[ (1 - \alpha x)^{-j} = \sum_{n\ge 0} \binom{-j}{n} (-\alpha x)^n
   = \sum_{n\ge 0} \binom{j+n-1}{n} (\alpha x)^n
   = \sum_{n\ge 0} \binom{j+n-1}{j-1} \alpha^n x^n, \]
and the binomial coefficient $\binom{j+n-1}{j-1}$ is a polynomial of degree j − 1 in n.
Taking a linear combination of such expressions for j = 1, . . . , m gives a coefficient
$P(n)\alpha^n$, where P is a polynomial of degree at most m − 1.
An alternative proof is outlined in Exercise 4.1.
In particular, we see that for a recurrence of depth 2, that is, k = 2, there are
just two possibilities:
(a) the characteristic polynomial is (1 − αx)(1 − βx) with α ≠ β, and the
solution is $u_n = A\alpha^n + B\beta^n$;
(b) the characteristic polynomial is $(1 - \alpha x)^2$, and the solution is $u_n = (An + B)\alpha^n$.
\[ F_n = A\alpha^n + B\beta^n, \]
A + B = 1,
Aα + Bβ = 1.
since if we take a string with no occurrence of 11 and precede it with a 1, then the
only possible position of 11 is at the beginning. Also, if we take a string with no
occurrence of 11 and precede it with 11, then the resulting sequence contains 11,
but possibly two occurrences (if the original string began with a 1); so we have
for m < n.
We take the two-variable generating function
\[ F(x, y) = \sum\sum b_{m,n}\, x^n y^m. \]
We note that f(x) = F(x, 1), while $g(x) = [(\partial/\partial y)F(x, y)]_{y=1}$.
We separate the expression for F(x, y) into two parts: terms with n = m, and
terms with n ≠ m (necessarily n > m). The first part is just
\[ xy + x^2y^2 + \cdots = \frac{xy}{1 - xy}. \]
The second part, which we shall call Σ, is
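\[ \Sigma_2 = \sum_{m\ge 1} (xy)^m \sum_{n,i} i\,b_{i,n-m}\,x^{n-m} = g(x)\,\frac{xy}{1 - xy}, \]
and
\[ \Sigma_3 = \sum_{m\ge 1} (xy)^m \sum_{n,i} b_{i,n-m}\,x^{n-m} = f(x)\,\frac{xy}{1 - xy}. \]
So we get
\[ F(x, y) = \frac{xy}{1 - xy} + \left(\frac{xy}{1 - xy}\right)^{2} f(x) + \frac{xy}{1 - xy}\,g(x). \]
Putting y = 1, we obtain
\[ f(x) = \frac{x}{1 - x} + \left(\frac{x}{1 - x}\right)^{2} f(x) + \frac{x}{1 - x}\,g(x). \]
Differentiating and putting y = 1, we obtain
\[ g(x) = \frac{x}{(1 - x)^2} + \frac{2x^2}{(1 - x)^3}\,f(x) + \frac{x}{(1 - x)^2}\,g(x). \]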
We now have two equations for f (x) and g(x), which we can solve to obtain
\[ f(x) = \frac{x(1 - x)^3}{1 - 5x + 7x^2 - 4x^3}. \]
By reversing the proof of Theorem 4.2, we conclude that the number $a_n$ of HC
polyominoes with n cells satisfies the recurrence relation
\[ a_n = 5a_{n-1} - 7a_{n-2} + 4a_{n-3} \]
for n ≥ 5. (Remember from the proof that the recurrence relation holds for n at
least one greater than the degree of the polynomial in the numerator.)
It is worth spending a little while trying to deduce this recurrence relation di-
rectly from the definition of an . You will probably not succeed; no such direct
proof is known.
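The passage from generating function to recurrence can be watched in action. The following sketch (added here; `series_quotient` is a hypothetical helper) expands $x(1-x)^3/(1-5x+7x^2-4x^3)$ as a power series by long division and then tests the recurrence $a_n = 5a_{n-1} - 7a_{n-2} + 4a_{n-3}$ read off from the denominator.

```python
def series_quotient(num, den, terms):
    """First `terms` power-series coefficients of num(x)/den(x); requires den[0] == 1."""
    out = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        # subtract the contributions of earlier coefficients, as in long division
        for j in range(1, min(n, len(den) - 1) + 1):
            c -= den[j] * out[n - j]
        out.append(c)
    return out


numerator = [0, 1, -3, 3, -1]   # x(1 - x)^3 = x - 3x^2 + 3x^3 - x^4
denominator = [1, -5, 7, -4]    # 1 - 5x + 7x^2 - 4x^3
a = series_quotient(numerator, denominator, 12)


def recurrence_holds(n):
    return a[n] == 5 * a[n - 1] - 7 * a[n - 2] + 4 * a[n - 3]


print(a[1:9])
print([n for n in range(3, 12) if not recurrence_holds(n)])
```

The second line of output lists the small values of n at which the recurrence fails, which illustrates why the range n ≥ 5 is needed.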
for n ≥ d), then $\sigma^d a = \sum_{i=1}^{d} \alpha_i \sigma^{d-i} a$, so that $\sigma^d a$ lies in the span of the sequences
$a, \sigma a, \ldots, \sigma^{d-1} a$. By an easy induction, $\sigma^n a$ lies in this subspace for all n ≥ d, and
(c) holds.
Conversely, if (c) holds, then we may assume that the subspace is spanned by
finitely many shifts of a. So $\sigma^d a$ is a linear combination of $a, \sigma a, \ldots, \sigma^{d-1} a$ for
some d, and a satisfies a (d + 1)-term recurrence with constant coefficients.
We denote the subspace spanned by the shifts of a by S(a).
A C-finite sequence is one which satisfies the three equivalent conditions of the
proposition. Its degree can be defined as the smallest d such that the sequence
satisfies a (d + 1)-term recurrence; the smallest degree of the denominator of a
rational function representing the generating function; or the dimension of the
vector space spanned by the shifts of a.
From this, we see immediately:
Proposition 4.7 The sum of two C-finite sequences is C-finite; its degree is at most
the sum of the degrees of the two sequences.
Proof Clearly a, b ∈ S(a) + S(b) (where the sum is taken in the space of all se-
quences); and so a + b ∈ S(a) + S(b), a finite-dimensional vector space. Clearly
all shifts of a + b also lie in this space.
Note that the C-finite sequences form a vector space over Q, since scalar mul-
tiplication clearly takes a C-finite sequence to one of the same degree.
Corollary 4.8 Let a and b be two C-finite sequences with degrees d and e respec-
tively, and suppose that the first d + e terms of a and b agree. Then a = b.
The power of this test is increased by the following result. The pointwise prod-
uct a ◦ b of two sequences a = (an ) and b = (bn ) is the sequence whose nth term
is an bn .
Proposition 4.9 (a) Let a and b be two C-finite sequences with degrees d and e
respectively. Then a ◦ b is a C-finite sequence with degree at most de.
(b) Let a be a C-finite sequence with degree d. Then a ◦ a is a C-finite sequence
with degree at most d(d + 1)/2.
(c) More generally, if a is C-finite with degree d, then the sequence of k-th
powers of elements of a is C-finite with degree at most $\binom{d+k-1}{k}$.
Proof (a) We consider the set of all infinite matrices (with rows and columns
indexed by the natural numbers) such that each row satisfies the (d + 1)-term re-
currence defining a and each column satisfies the (e + 1)-term recurrence defining
b. It is easy to see that this set is a rational vector space. Moreover, any such ma-
trix is completely determined by the de entries in the first e rows and d columns,
so the vector space has dimension de.
A particular matrix lying in this space has entry bi a j in row i and column j.
All its simultaneous shifts (move everything one place up and to the left, deleting
the first row and column) also belong to this space. Now consider the linear map
which ‘projects’ the vector space onto the diagonal (that is, it maps any matrix to
the sequence of entries on the diagonal). This map is linear, and so its image has
dimension at most de. Projecting the matrix with entries $b_i a_j$ and its shifts by the same
horizontal and vertical amount onto the diagonal, we see that the product sequence
and all its shifts are contained in a space of dimension at most de. The result
follows.
(b) The proof is similar, except that we can use the space of symmetric matrices
whose rows and columns satisfy the recurrence. To specify such a matrix, we only
need to give the d(d + 1)/2 entries on and above the diagonal in the top left d × d
square.
(c) In the argument in part (b) above, instead of using a matrix, which is a
‘2-dimensional array’, we use a ‘k-dimensional array’. The space of completely
symmetric k-dimensional arrays of side d has dimension equal to the number of
selections of k things from {1, . . . , d} with repetition allowed and order unimpor-
tant.
Proposition 4.10 Let $a = (a_n)$ be a C-finite sequence of degree d. Then the sum
sequence $s = (s_n)$, where $s_n = \sum_{i=0}^{n} a_i$, is C-finite of degree at most d + 1.
Proof Let F(x) be the generating function of a, a rational function whose
denominator has degree d (and non-zero constant term). The sum sequence s is the
convolution of a with the all-1 sequence, whose generating function is 1/(1 − x);
so the generating function for s is F(x)/(1 − x), a rational function with
denominator of degree at most d + 1.
their degree established), the result can be checked by mechanical computation (or,
better, by computation involving some clever tricks to reduce the work required).
I won’t give the proof in detail. The fact that the number of domino tilings is
C-finite is obtained using the transfer matrix method. In essence, there are only
a finite number of ways of covering the four 4 × 4 corner squares with dominoes,
some of which may stick out. No matter what corner tilings are chosen, the number
of ways of tiling the sides satisfies the same recurrence relation, with initial
conditions depending on the corner tilings chosen. So the required number is a sum of
fourth powers of sequences satisfying a 3-term recurrence (hence of degree 2), and
so itself has degree at most 5, by Proposition 4.9(c).
The fact that the conjectured answer is C-finite follows from general results
about C-finite sequences. The Fibonacci sequence has dimension 2, so its square
has dimension 3; the sequence $((-1)^n)$ has dimension 1, so the sum has dimension
at most 4, and its square has dimension at most 10. So the result can be proved by
checking 15 values.
However, with some ingenuity, we can do better. Write the conjectured answer
as $16F_n^4 + 16(-1)^n F_n^2 + 4$. Now the fourth powers of Fibonacci numbers have
dimension 5, and the recurrence relation can be computed explicitly; then it can
be shown that the sequence $((-1)^n F_n^2)$ and the constant sequence both satisfy the
same recurrence relation. So the dimension is 5, and only 10 terms need to be
checked, a substantial saving.
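The claim that the three sequences share one recurrence can be tested by machine. In the sketch below the explicit coefficients 5, 15, −15, −5, 1 are an assumption supplied for illustration (the text does not print them), and the Fibonacci numbers are indexed with $F_0 = F_1 = 1$; since the recurrence is invariant under shifts, the choice of indexing should not matter.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    seq = [1, 1]
    while len(seq) < count:
        seq.append(seq[-1] + seq[-2])
    return seq[:count]


# Assumed recurrence: a_n = 5a_{n-1} + 15a_{n-2} - 15a_{n-3} - 5a_{n-4} + a_{n-5}
COEFFS = [5, 15, -15, -5, 1]


def satisfies(seq, coeffs=COEFFS):
    """True if seq obeys the linear recurrence with the given coefficients."""
    d = len(coeffs)
    return all(
        seq[n] == sum(c * seq[n - 1 - i] for i, c in enumerate(coeffs))
        for n in range(d, len(seq))
    )


F = fibonacci(40)
fourth_powers = [f ** 4 for f in F]
signed_squares = [(-1) ** n * f ** 2 for n, f in enumerate(F)]
constant = [1] * len(F)
combined = [16 * p + 16 * q + 4 for p, q in zip(fourth_powers, signed_squares)]

print(satisfies(fourth_powers), satisfies(signed_squares),
      satisfies(constant), satisfies(combined))
```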
Thus, to get a recurrence relation for p(n), we have to understand the coeffi-
cients of its inverse:
\[ \sum_{n\ge 0} a(n)x^n = \prod_{k\ge 1} (1 - x^k). \]
Now a term on the right arises from each expression for n as a sum of distinct
positive integers; its value is $(-1)^k$, where k is the number of terms in the sum.
Thus, a(n) is equal to the number of expressions for n as the sum of an even
number of distinct parts, minus the number of expressions for n as the sum of an
odd number of distinct parts.
This number is evaluated by Euler’s Pentagonal Numbers Theorem:
Proposition 4.11
\[ a(n) = \begin{cases} (-1)^k & \text{if } n = k(3k-1)/2 \text{ for some } k \in \mathbb{Z},\\ 0 & \text{otherwise.} \end{cases} \]
4.2.1 Derangements
Let d(n) be the number of derangements of {1, . . . , n} (permutations which
have no fixed points). We obtain a recurrence relation as follows. Each derange-
ment maps n to some i with 1 ≤ i ≤ n − 1, and by symmetry each i occurs equally
often. So we need only count the derangements mapping n to n − 1, and multiply
by n − 1.
We divide these derangements into two classes. The first type map n − 1 back
to n. Such a permutation must be a derangement of {1, . . . , n − 2} composed with
the transposition (n − 1, n); so there are d(n − 2) such. The second type map i
to n for some i ≠ n − 1. Replacing the sequence i → n → n − 1 by the sequence
i → n − 1, we obtain a derangement of {1, . . . , n − 1}; every such derangement arises. So
there are d(n − 1) derangements of this type.
Thus,
d(n) = (n − 1)(d(n − 1) + d(n − 2)).
This is a linear recurrence relation, where the coefficients, rather than being con-
stants, are polynomials in n.
There is a simpler recurrence satisfied by d(n), which can be deduced from this
one, namely
\[ d(n) = n\,d(n-1) + (-1)^n. \]
To prove this by induction, suppose that it is true for n − 1. Then
$(n-1)d(n-2) = d(n-1) - (-1)^{n-1}$; so $d(n) = (n-1)d(n-1) + d(n-1) + (-1)^n$, and the
inductive step is proved. (Starting the induction is an exercise.)
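Both recurrences are easy to compare against a direct count. The sketch below is an added illustration with hypothetical function names; it assumes the initial values d(0) = 1 and d(1) = 0.

```python
from itertools import permutations


def derangements_brute_force(n):
    """Count permutations of {0, ..., n-1} with no fixed point."""
    return sum(
        all(image != point for point, image in enumerate(perm))
        for perm in permutations(range(n))
    )


def derangements_two_term(n_max):
    """d(n) = (n-1)(d(n-1) + d(n-2)), starting from d(0) = 1, d(1) = 0."""
    d = [1, 0]
    for n in range(2, n_max + 1):
        d.append((n - 1) * (d[n - 1] + d[n - 2]))
    return d[: n_max + 1]


def derangements_one_term(n_max):
    """d(n) = n*d(n-1) + (-1)^n, starting from d(0) = 1."""
    d = [1]
    for n in range(1, n_max + 1):
        d.append(n * d[n - 1] + (-1) ** n)
    return d


print(derangements_two_term(7))
print(derangements_one_term(7))
```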
Now this is a special case of a general recursion which can be solved, namely
\[ u_0 = c, \qquad u_n = p_n u_{n-1} + q_n \ \text{ for } n \ge 1. \]
We can include the initial condition in the recursion by setting $q_0 = c$ and adopting
the convention that $u_{-1} = 0$.
If $q_n = 0$ for n ≥ 1, then the solution is simply $u_n = P_n$ for all n, where
\[ P_n = c \prod_{i=1}^{n} p_i. \]
\[ n!/e - d(n) = n! \sum_{i\ge n+1} \frac{(-1)^i}{i!}, \]
and the modulus of the alternating sum of decreasing terms on the right is smaller
than that of the first term, which is n!/(n + 1)! = 1/(n + 1).
The exponential generating function for the derangement numbers is
\[ \sum_{n\ge 0} \frac{d(n)}{n!}\,x^n = \sum_{i\ge 0} \frac{(-1)^i}{i!}\,x^i \cdot \sum_{j\ge 0} x^j = \frac{e^{-x}}{1 - x}. \]
4.2.3 Involutions
Let $s_n$ be the number of elements π of the symmetric group $S_n$ which satisfy
$\pi^2 = 1$ (this means that the cycles of π have lengths 1 or 2).
\[ s_n = s_{n-1} + (n-1)s_{n-2}. \]
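A direct count of the permutations with $\pi^2 = 1$ gives a quick check of this recurrence. The sketch below is an added illustration with hypothetical function names, assuming the starting values $s_0 = s_1 = 1$.

```python
from itertools import permutations


def involutions_by_recurrence(n_max):
    """s_n = s_{n-1} + (n-1) s_{n-2}, starting from s_0 = s_1 = 1."""
    s = [1, 1]
    for n in range(2, n_max + 1):
        s.append(s[n - 1] + (n - 1) * s[n - 2])
    return s[: n_max + 1]


def involutions_brute_force(n):
    """Count permutations of {0, ..., n-1} whose square is the identity."""
    return sum(
        all(perm[perm[i]] == i for i in range(n))
        for perm in permutations(range(n))
    )


print(involutions_by_recurrence(7))
```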
\[ \frac{d}{dx} F(x) = \exp(x)\,F(x). \]
This first-order differential equation can be solved in the usual way with the initial
condition F(0) = 1 to give
for some constants A, B which can be deduced from the initial conditions. With
luck, the solutions of this differential equation will lead to the generating function
f for our problem. See Exercise 4.15 for an example of what can go wrong.
By convention, we take $s_0^{(m)} = 1$ and $s_n^{(m)} = 0$ for n < 0, so the recurrence relation
holds for all n > 0.
Since we are counting permutations, we let f(x) be the exponential generating
function for the sequence; let $f(x) = \sum a_n x^n$, where $a_n = s_n^{(m)}/n!$. Dividing the
above recurrence relation by n!, we find that a factor of (n − 1)! can be cancelled,
giving
\[ n a_n = \sum_{d \mid m} a_{n-d}, \]
so
\[ \frac{d}{dx} f(x) = \sum_{d \mid m} x^{d-1} f(x), \]
(a ◦ (b ◦ (c ◦ d))), (a ◦ ((b ◦ c) ◦ d)), ((a ◦ b) ◦ (c ◦ d)), ((a ◦ (b ◦ c)) ◦ d), (((a ◦ b) ◦ c) ◦ d),
so C4 = 5.
Any bracketed product of n terms is of the form (A ◦ B), where A and B are
bracketed products of i and n − i terms respectively. So
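\[ C_n = \sum_{i=1}^{n-1} C_i C_{n-i} \quad \text{for } n \ge 2. \]
Putting $F(x) = \sum_{n\ge 1} C_n x^n$, the recurrence relation shows that F and $F^2$ agree in all
coefficients except n = 1. Since $C_1 = 1$ we have $F = F^2 + x$, or $F^2 - F + x = 0$.
Solving this equation gives
\[ F(x) = \tfrac12 \left( 1 \pm \sqrt{1 - 4x} \right). \]
Since F(0) = 0, we must take the negative sign. Hence
\[ \begin{aligned}
C_n &= -\frac12 \binom{1/2}{n} (-4)^n \\
    &= \frac12 \cdot \frac12 \cdot \frac12 \cdot \frac32 \cdots \frac{2n-3}{2} \cdot \frac{2^{2n}}{n!} \\
    &= \frac{1}{2^{n+1}} \cdot \frac{(2n-2)!}{2^{n-1}(n-1)!} \cdot \frac{2^{2n}}{n \cdot (n-1)!} \\
    &= \frac{1}{n} \binom{2n-2}{n-1}.
\end{aligned} \]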
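The recurrence and the closed formula can be compared numerically. This is an added sketch with hypothetical function names; it uses the indexing of the text, in which $C_1 = 1$ and $C_4 = 5$.

```python
from math import comb


def catalan_by_recurrence(n_max):
    """Return [0, C_1, ..., C_{n_max}] using C_n = sum of C_i * C_{n-i} for i = 1..n-1."""
    C = [0, 1]  # C[0] is only a placeholder so that C[n] is the n-th number
    for n in range(2, n_max + 1):
        C.append(sum(C[i] * C[n - i] for i in range(1, n)))
    return C


def catalan_closed_form(n):
    """C_n = (1/n) * binomial(2n-2, n-1); the division is always exact."""
    return comb(2 * n - 2, n - 1) // n


C = catalan_by_recurrence(12)
print(C[1:])
print([catalan_closed_form(n) for n in range(1, 13)])
```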
In an appendix to this chapter, we will have more to say about Catalan numbers.
Sometimes we cannot get an explicit solution, but can obtain some information
about the growth rate of the sequence.
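\[ W_n = \begin{cases}
\dfrac12 \displaystyle\sum_{i=1}^{n-1} W_i W_{n-i} & \text{if $n$ is odd,}\\[2ex]
\dfrac12 \left( \displaystyle\sum_{i=1}^{n-1} W_i W_{n-i} + W_{n/2} \right) & \text{if $n$ is even.}
\end{cases} \]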
This cannot be solved explicitly. We will obtain a rough estimate for the rate of
growth. Later, we find more precise asymptotics.
We seek the nearest singularity to the origin. Since all coefficients are real and
positive, this will be on the positive real axis. (If a power series with positive real
coefficients converges at z = r, then it converges absolutely at any z with |z| = r.)
Let s be the required point. Then s < 1, so $s^2 < s$; so $F(z^2)$ is analytic at z = s.
Now write the equation as
\[ F(z)^2 - 2F(z) + (F(z^2) + 2z) = 0, \]
with ‘solution’
\[ F(z) = 1 - \sqrt{1 - 2z - F(z^2)} \]
(taking the negative sign as before). Thus, s is the real positive solution of
\[ F(s^2) = 1 - 2s. \]
Solving this equation numerically (using the fact that $F(s^2)$ is the sum of a
convergent Taylor series and can be estimated from knowledge of a finite number of
terms), we find that s ≈ 0.403 . . ., so that $W_n$ grows ‘like’ $(2.483\ldots)^n$.
We will find more precise asymptotics for Wn in the final chapter of the book.
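The numerical solution can be reproduced in a few lines. The sketch below is an added illustration with hypothetical names. It assumes the recurrence encoded by the functional equation $F(z)^2 - 2F(z) + F(z^2) + 2z = 0$ with $W_1 = 1$, namely that $2W_n$ is the sum of the products $W_i W_{n-i}$ for i = 1, . . . , n − 1, plus $W_{n/2}$ when n is even. The series is truncated after 60 terms, and the equation $F(s^2) = 1 - 2s$ is solved by bisection on the interval (0, 1/2).

```python
def wedderburn_etherington(n_max):
    """Return [0, W_1, ..., W_{n_max}] from the recurrence, with W_1 = 1."""
    W = [0, 1]
    for n in range(2, n_max + 1):
        total = sum(W[i] * W[n - i] for i in range(1, n))
        if n % 2 == 0:
            total += W[n // 2]
        W.append(total // 2)  # total is always even, so the division is exact
    return W


def truncated_F(z, W):
    """Partial sum of the generating function using the known coefficients."""
    return sum(W[n] * z ** n for n in range(1, len(W)))


def nearest_singularity(W, tol=1e-12):
    """Bisection for the root of F(s^2) + 2s - 1 = 0, which is increasing in s."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if truncated_F(mid * mid, W) + 2 * mid - 1 < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


W = wedderburn_etherington(60)
s = nearest_singularity(W)
print(W[1:11])
print(s, 1 / s)
```

Bisection is used rather than a faster root-finder because the left-hand side is increasing in s and only a few digits are wanted.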
The next theorem, due to Euler, is quite unexpected, as is its application: it will
enable us to derive an efficient recurrence relation for the partition numbers.
Theorem 4.12 (Euler’s Pentagonal Numbers Theorem) (a) If n is not a pen-
tagonal number, then the numbers of partitions of n into an even and an odd
number of distinct parts are equal.
(b) If n = k(3k − 1)/2 for some k ∈ Z, then the number of partitions of n into an
even number of distinct parts exceeds the number of partitions into an odd
number of distinct parts by one if k is even, and vice versa if k is odd.
For example, there are four partitions of n = 6 into distinct parts, viz. 6 =
5 + 1 = 4 + 2 = 3 + 2 + 1, two of each parity; while if n = 7, there are five such
partitions, viz. 7 = 6 + 1 = 5 + 2 = 4 + 3 = 4 + 2 + 1, three with an even and two
with an odd number of parts.
• The base is the bottom row of the diagram (the smallest part).
• The slope is the set of cells starting at the east end of the top row and pro-
ceeding in a south-westerly direction for as long as possible.
Note that any cell in the slope is the last in its row, since the row lengths are all
distinct. See Figure 4.2.
• • • • • •
• • • • •
• • • •
• •
Now we divide the set of partitions of n with distinct parts into three classes, as
follows:
• Class 1 consists of the partitions for which either the base is longer than the
slope and they don’t intersect, or the base exceeds the slope by at least 2;
• Class 2 consists of the partitions for which either the slope is at least as long
as the base and they don’t intersect, or the slope is strictly longer than the
base;
• Class 3 consists of all other partitions with distinct parts.
These bijections are mutually inverse. Thus, the numbers of Class 1 and Class 2
partitions are equal. Moreover, these bijections change the number of parts by 1,
and hence change its parity. So, in the union of Classes 1 and 2, the numbers of
partitions with even and odd numbers of parts are equal.
Now we turn to Class 3. A partition in this class has the property that its base
and slope intersect, and either their lengths are equal, or the base exceeds the
slope by 1. So, if there are k parts, then $n = k^2 + k(k-1)/2 = k(3k-1)/2$ or
$n = k(k+1) + k(k-1)/2 = k(3k+1)/2$. Figure 4.4 shows the two possibilities.
So, if n is not pentagonal, then Class 3 is empty; and, if n = k(3k − 1)/2, for
some k ∈ Z, then it contains a single partition with |k| parts. Euler’s Theorem
follows.
Corollary 4.13
\[ \prod_{n\ge 1} (1 - x^n) = \sum_{k=-\infty}^{\infty} (-1)^k x^{k(3k-1)/2}. \]
• • • • •        • • • • • •
• • • •          • • • • •
• • •            • • • •
Proof By Euler’s Pentagonal Numbers Theorem, the right-hand side is the gen-
erating function for even(n) − odd(n), where even(n) and odd(n) are the numbers
of partitions having all parts distinct and having an even or odd number of parts
respectively. We must show that the same is true for the left-hand side.
The coefficient of $x^n$ is made up of contributions from factors of the form
$(1 - x^{n_1}), \ldots, (1 - x^{n_k})$, where $n_1 + \cdots + n_k = n$ and $n_1, \ldots, n_k$ are distinct; the
contribution from this choice of factors is $(-1)^k$. So each term counted by even(n)
contributes 1, and each term counted by odd(n) contributes −1. So the theorem is
proved.
The right-hand side can be written as
\[ 1 + \sum_{k>0} (-1)^k \left( x^{k(3k-1)/2} + x^{k(3k+1)/2} \right), \]
using the first ‘definition’ of the pentagonal numbers. From this, we deduce the
promised recurrence for the partition numbers. This illustrates the general princi-
ple that finding a linear recurrence relation for a sequence is equivalent to finding
the inverse of its generating function.
Proof Since
\[ \sum_{n\ge 0} p(n)x^n = \prod_{n>0} (1 - x^n)^{-1}, \]
we have
\[ \left( \sum_{n\ge 0} p(n)x^n \right) \cdot \left( 1 + \sum_{k>0} (-1)^k \left( x^{k(3k-1)/2} + x^{k(3k+1)/2} \right) \right) = 1. \]
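Comparing coefficients of $x^n$ for n > 0 in this product expresses p(n) through earlier values: each pentagonal number k(3k − 1)/2 or k(3k + 1)/2 not exceeding n contributes $(-1)^{k-1}$ times the partition number at n minus that pentagonal number. The sketch below is an added illustration of this recurrence; the function name is hypothetical.

```python
def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)] via the pentagonal-number recurrence."""
    p = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        total = 0
        k = 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 == 1 else -1  # (-1)^(k-1)
            total += sign * p[n - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= n:
                total += sign * p[n - k * (3 * k + 1) // 2]
            k += 1
        p[n] = total
    return p


p = partition_numbers(100)
print(p[:11])
print(p[100])
```

Only about $\sqrt{n}$ earlier values enter the computation of p(n), which is what makes the recurrence efficient.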
Let $T_n$ be the number of binary trees with n leaves. Then the first specification
gives $T_1 = 1$, and the second gives
\[ T_n = \sum_{k=1}^{n-1} T_k T_{n-k} \]
[Figure: the five binary trees with four leaves, labelled by the bracketings
(a ◦ (b ◦ (c ◦ d))), (a ◦ ((b ◦ c) ◦ d)), ((a ◦ b) ◦ (c ◦ d)), ((a ◦ (b ◦ c)) ◦ d), (((a ◦ b) ◦ c) ◦ d).]
• those in which the root has more than one descendant. Suppose that the
left-most descendant is a tree with k edges. There are Rk such trees. Each
accounts for k + 1 edges of the tree T (one of these is the edge through the
root of T ). Now deleting the left-most descendant gives an arbitrary rooted
plane tree with n − k − 1 edges. Here k runs between 0 and n − 1.
So we have
\[ R_n = \sum_{k=0}^{n-1} R_k R_{n-k-1} \]
there are $D_k$ of these (using the preceding paragraph). The dissection of the second
polygon is arbitrary, so there are $D_{n-k+1}$ of these. Thus we have
\[ D_n = D_{n-1} + \sum_{k=2}^{n-2} D_k D_{n-k+1}, \]
where we take $D_2 = 1$.
From this it is an easy exercise to deduce that $D_n = C_{n-1}$ for n ≥ 2.
[Figure 4.8: Dyck paths from (0, 0) to (6, 0)]
Proposition 4.15 The number of Dyck paths from (0, 0) to (2n, 0) is the Catalan
number Cn+1 . Of these, the number which meet the x-axis only at their endpoints
(0, 0) and (2n, 0) is Cn .
Proof Let Fn be the total number of paths satisfying the specification, and Gn be
the number which meet the axis only at the endpoints. By convention we take
F0 = 1. Figure 4.8 demonstrates that F3 = 5 and G3 = 2.
First we claim that Gn+1 = Fn . For take any path satisfying the specification;
translate it one step north-east, so that it goes from (1, 1) to (2n + 1, 1); then extend
it by a north-east step at the beginning and a south-east step at the end. The result-
ing path goes from (0, 0) to (2n + 2, 0), and meets the axis only at its endpoints.
Conversely, given such a path, if we cut off its first and last steps and translate it
by (−1, −1), we obtain a path from (0, 0) to (2n, 0) satisfying the original spec-
ification. So we have established a bijection between the sets counted by Fn and
Gn+1 .
Now take an arbitrary path counted by Fn , and suppose that its first return to
the x-axis is at the point (2k, 0). There are Gk ways to choose the part of the path
before the point (2k, 0), since it meets the axis only at its endpoints. Then there
are Fn−k ways to choose the remaining part of the path, since it is just a path from
(0, 0) to (2n − 2k, 0) satisfying the original specifications and moved east by 2k.
So
\[ F_n = \sum_{k=1}^{n} G_k F_{n-k} \]
for n ≥ 1.
From this it is easily deduced that $F_n = C_{n+1}$ and $G_n = C_n$.
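For small n the numbers $F_n$ and $G_n$ can be found by brute force. The sketch below assumes that the specification is the usual one for Dyck paths (steps north-east or south-east, never going below the x-axis); it enumerates all step sequences of length 2n and counts those that qualify. The function name `dyck_counts` is ours.

```python
from itertools import product


def dyck_counts(n):
    """Return (F_n, G_n) by enumerating all +1/-1 step sequences of length 2n."""
    F = G = 0
    for steps in product((1, -1), repeat=2 * n):
        height, heights = 0, []
        for s in steps:
            height += s
            heights.append(height)
        # A Dyck path ends on the axis and never goes below it.
        if height != 0 or min(heights, default=0) < 0:
            continue
        F += 1
        # heights[:-1] are the heights at the interior points of the path.
        if all(h > 0 for h in heights[:-1]):
            G += 1
    return F, G


print(dyck_counts(3))
```

The same function can be used to test the relations $G_{n+1} = F_n$ and $F_n = \sum_k G_k F_{n-k}$ for small n.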
4.5.6 Tableaux
This analysis can be translated into a different language again, one of great
importance in algebraic combinatorics.
Let us record the results of the count in the last subsection in a different way:
take two rows corresponding to the two candidates, and as each vote is counted,
enter its number in the row corresponding to the candidate for whom that vote is
cast. The representations of the five counts above are shown in Figure 4.9.
1 2 3     1 2 4     1 2 5     1 3 4     1 3 5
4 5 6     3 5 6     3 4 6     2 5 6     2 4 6
AAABBB    AABABB    AABBAB    ABAABB    ABABAB
Figure 4.9: Tableaux
We see that Figure 4.9 shows all possible ways of putting the numbers 1, . . . , 6
into six boxes in a 2 × 3 array such that
• the numbers in each row are increasing (this corresponds to the fact that the
votes are entered in order);
• the numbers in each column are increasing (this corresponds to the fact that
B is never ahead of A in the count).
Such an arrangement is called a tableau. We can generalise by asking for arbitrary
arrangements of boxes.
Let λ be a partition of the integer n: that is, λ is a sequence of positive integers,
arranged in non-increasing order, with sum n. (The number of partitions of n is the
partition function p(n) we met earlier in this chapter.)
The Young diagram, or Ferrers diagram, associated with λ is an arrangement
of boxes or cells in rows, the number of boxes in the ith row being the ith part of
λ . Figure 4.10 shows the Young diagrams corresponding to the five partitions of
4.
Figure 4.10: Young diagrams of the partitions (4), (3, 1), (2, 2), (2, 1, 1) and (1, 1, 1, 1)
Thus, the figures recording the results of a vote count of the type described
in the preceding subsection, with two candidates obtaining m votes each, are precisely
the tableaux of shape (m, m); so we have
\[ f_{(m,m)} = C_{m+1} = \frac{1}{m+1} \binom{2m}{m}. \]
Arbitrary tableaux have also an interpretation in terms of ballots. Suppose that
there are r candidates in an election with n voters, where the ith candidate receives
mi votes for i = 1, . . . , r; we may arrange the candidates so that their totals are
in non-increasing order (so that λ = (m1 , . . . , mr ) is a partition of n). Then fλ is
the number of ways of counting the votes if the candidates’ totals at any stage are
(non-strictly) in the order of the final result.
The numbers fλ are of very great importance in algebra. Their most important
interpretation is as follows. Let G be the symmetric group Sn of all permutations
of {1, . . . , n}. A representation of G is a homomorphism ρ from G to the group of
non-singular linear transformations of a vector space V over the complex numbers.
A representation is irreducible if the only subspaces of V mapped to themselves by
all the matrices ρ(g) are V itself and the zero space. Two representations are equivalent
if they are related by a change of basis of V; that is, if $\rho_2(g) = T^{-1}\rho_1(g)T$
for some invertible transformation T.
It is known that any finite group has only finitely many irreducible representa-
tions up to equivalence; indeed, the number of irreducible representations is equal
to the number of conjugacy classes of the group. Moreover, the degree of any such
representation (the dimension of the vector space) divides the order of the group,
and the sum of squares of the degrees of all irreducible representations is equal to
the order of the group.
In the case where G = Sn , we have seen that two permutations are conjugate if
and only if they have the same cycle structure. So the number of conjugacy classes,
and hence the number of irreducible representations, is equal to p(n). More is true
in this case. The irreducible representations can be indexed by partitions of n; the
degree of the representation indexed by λ is fλ . Hence we conclude that
• $f_\lambda$ divides n!;
• $\sum_\lambda f_\lambda^2 = n!$.
Both of these facts have a combinatorial interpretation. The first is the remark-
able hook length formula for fλ . For each cell in a Young diagram, the correspond-
ing hook consists of this cell together with all cells to the right in the same row, or
below in the same column. The hook length is equal to the number of cells in the
hook.
For example, consider the partition (m, m) of n = 2m. The hook lengths of the
cells in the first row of the diagram are m + 1, m, m − 1, . . . , 2, while those for the
cells in the bottom row are m, m − 1, . . . , 1. So the Proposition gives
\[ f_{(m,m)} = \frac{(2m)!}{(m+1)!\, m!}, \]
in agreement with the fact that $f_{(m,m)} = C_{m+1}$ and our earlier formula for Catalan
numbers.
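The computation can be mechanised. The sketch below assumes that the Proposition referred to is the hook length formula in its standard form, $f_\lambda = n! / \prod h$, the product being over the hook lengths h of all cells of the diagram. The function names are ours; a partition is given as a tuple of row lengths.

```python
from math import factorial


def hook_lengths(shape):
    """Hook length of each cell of the Young diagram with the given row lengths."""
    # cols[j] is the number of rows having a cell in column j.
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])] if shape else []
    return [[(shape[i] - j - 1) + (cols[j] - i - 1) + 1 for j in range(shape[i])]
            for i in range(len(shape))]


def num_tableaux(shape):
    """f_lambda, computed as n! divided by the product of the hook lengths."""
    prod = 1
    for row in hook_lengths(shape):
        for h in row:
            prod *= h
    return factorial(sum(shape)) // prod


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


print(hook_lengths((3, 3)), num_tableaux((3, 3)))
print(sum(num_tableaux(s) ** 2 for s in partitions(5)), factorial(5))
```

The first line of output illustrates the case (m, m) with m = 3; the second compares $\sum f_\lambda^2$ with n! for n = 5.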
The second fact is explained by the Robinson–Schensted–Knuth algorithm:
Theorem 4.17 There is a bijection between the set of permutations of {1, . . . , n},
and the set of ordered pairs of tableaux of the same shape with n cells. Under this
bijection, if π corresponds to the pair (S, T) of tableaux, then $\pi^{-1}$ corresponds
to (T, S). Furthermore, the number of cells in the first row of the tableaux corresponding
to π is equal to the length of the longest increasing subsequence of π.
The last part of the theorem has a ‘dual’ statement: The number of cells in the
first column of the tableaux (that is, the number of rows) is equal to the length of
the longest decreasing subsequence of π.
We will outline the proof of this theorem in the next section. The theorem
immediately gives us the following counting results:
Corollary 4.18 (a) $\sum_\lambda f_\lambda^2 = n!$, where the sum is over the set of all partitions of n.
(b) $\sum_\lambda f_\lambda = s_n$ is equal to the number of permutations π of {1, . . . , n} for which
$\pi^2$ is the identity, with the same range of summation as in (a).
(c) The number of permutations of {1, . . . , n} with longest increasing subsequence
of length m is equal to $\sum_\lambda f_\lambda^2$, where the sum is over all partitions
of n with largest part m.
Proof (a) The number of pairs of tableaux of the same shape is $\sum_\lambda f_\lambda^2$.
(b) A permutation π satisfies $\pi^2 = 1$ (that is, $\pi = \pi^{-1}$) if and only if it corresponds
to a pair (S, S) of tableaux with its entries equal.
(c) This is clear for the same reason as (a).
Note that we found a recurrence relation for the sequence (sn ) in Section 4.2.
For example, the numbers of tableaux of the three possible shapes with n = 3
are 1, 2 and 1 respectively. So $S_3$ contains $1^2 + 2^2 + 1^2 = 3!$ elements, of which
1 + 2 + 1 = 4 satisfy $g^2 = 1$; and the numbers with longest increasing subsequence
of length 1, 2, 3 are 1, 4, 1 respectively. All this is easily checked by listing the
permutations.
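The listing can also be left to a machine. The following sketch enumerates the permutations of {1, . . . , n}, counts those which are their own inverse, and tabulates them by the length of the longest increasing subsequence. The names `lis_length` and `summary` are ours.

```python
from collections import Counter
from itertools import permutations


def lis_length(seq):
    """Length of the longest increasing subsequence (quadratic dynamic programme)."""
    best = []
    for i, x in enumerate(seq):
        best.append(1 + max((best[j] for j in range(i) if seq[j] < x), default=0))
    return max(best, default=0)


def summary(n):
    """Return (n!, number of g with g^2 = 1, {length: number of permutations})."""
    perms = list(permutations(range(1, n + 1)))
    involutions = sum(1 for p in perms if all(p[p[i] - 1] == i + 1 for i in range(n)))
    by_lis = Counter(lis_length(p) for p in perms)
    return len(perms), involutions, dict(sorted(by_lis.items()))


print(summary(3))
```

For n = 3 this reproduces the figures 3!, 4 and 1, 4, 1 quoted above.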
Proposition 4.19 All six permutations of {1, 2, 3} are Wilf-equivalent; the number
of permutations of length n avoiding any one of them is the Catalan number Cn .
Proving this, and finding bijections between the corresponding sets of permu-
tations, is a challenging exercise.
However, the difficulty of the problem increases very rapidly. It is not known
how many permutations of length n avoid 1324; even asymptotic estimates for this
number are unknown.
If a is greater than the last element of the jth row, then simply append
it to this row. (If the jth row is empty, put a in the first position.)
Otherwise, let x be the smallest element of the jth row for which
a < x. (Work from right to left until an element smaller than a is first
reached; then we have gone one step too far.) ‘Bump’ x out of the jth
row, replacing it with a; then INSERT x into the ( j + 1)st row.
To verify this algorithm, there are several things to check. We must show that
the property that S is a partial tableau and T a tableau of the same shape holds after
each step in the algorithm. The fact that rows and columns are increasing is, for
S, a consequence of the way INSERT works; for T , it is because i is greater than
any element previously in the tableau. The key point is that the newly created cell
doesn’t violate the condition that the row lengths are non-increasing; that is, there
should be a cell immediately above it. This is because the element ‘bumped’ is
smaller than the element to the right of the position it is ‘bumped’ out of, and so it
comes to rest to the left of this position.
At the end of the algorithm, we have two tableaux of the same shape.
We illustrate the algorithm with the permutation (2, 3, 1).
• In the first stage, put 2 into the first row of S, and 1 into the first row of T :
S = 2, T = 1.
• In the second stage, put 3 at the end of the first row of S, and 2 in the same
position in T :
S = 2 3, T = 1 2.
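• In the third stage, 1 bumps 2 out of the first row of S into a new second row, and 3 goes into the corresponding new cell of T :
\[ S = \begin{matrix} 1 & 3 \\ 2 & \end{matrix}, \qquad T = \begin{matrix} 1 & 2 \\ 3 & \end{matrix}. \]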
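The procedure can be sketched in Python as follows. This is an illustration of row insertion as described above, not a reference implementation; the function name `rsk` and the representation of a tableau as a list of rows are our choices.

```python
def rsk(perm):
    """Return the pair (S, T) obtained by inserting perm[0], perm[1], ... in turn."""
    S, T = [], []
    for i, a in enumerate(perm, start=1):
        row = 0
        while True:
            if row == len(S):
                # The row is empty: a starts it, and i records the new cell in T.
                S.append([a])
                T.append([i])
                break
            current = S[row]
            if a > current[-1]:
                # a is greater than the last element: append it to this row.
                current.append(a)
                T[row].append(i)
                break
            # Bump the smallest x in this row with a < x, and insert x lower down.
            pos = next(j for j, x in enumerate(current) if x > a)
            a, current[pos] = current[pos], a
            row += 1
    return S, T


print(rsk([2, 3, 1]))
```

For the permutation (2, 3, 1) this returns the pair of tableaux found in the example.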
is no known closed formula for P(n); but all the other numbers are familiar com-
binatorial coefficients:
Theorem 4.20 (a) $P(n) = \sum_{k=0}^{n} \binom{n}{k}^2 \cdot k!$.
(b) $P_m(n) = \binom{2n}{n}$.
(c) $P_d(n) = B_{n+1}$ and $P_s(n) = B_n$, where $B_n$ is the nth Bell number.
(d) $P_{md}(n) = C_{n+1}$ and $P_{ms}(n) = C_n$, where $C_n$ is the nth Catalan number.
Proof (a) There are $\binom{n}{k}$ choices for the domain and $\binom{n}{k}$ choices for the range of a
partial permutation of cardinality k. Once the domain and range are chosen, there
are k! bijections between them. The formula follows.
(b) Argue as above. Once the domain and range are chosen, there is a unique
monotonic bijection between them. So
\[ P_m(n) = \sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}, \]
by Proposition 3.5 (putting m = k = n).
(c) We show first that Pd (n) = Ps (n + 1). If f is a decreasing partial permutation
on {1, . . . , n}, then the map g given by g(x + 1) = f (x) whenever this is defined is
a strictly decreasing partial permutation on {1, . . . , n + 1}. The argument reverses.
This correspondence preserves the property of being monotonic, so also Pmd (n) =
Pms (n + 1).
Now we select a decreasing bijection by first choosing its fixed points, and then
choosing a strictly decreasing bijection on the remaining points. If there are k fixed
points, then there are $P_s(n - k)$ ways to choose the strictly decreasing bijection. So
we have
\[ P_s(n + 1) = P_d(n) = \sum_{k=0}^{n} \binom{n}{k} P_s(n - k). \]
Thus, Ps (n) satisfies the same recurrence as the Bell number Bn (see Sec-
tion 4.2.4), and we have
Ps (n) = Bn , Pd (n) = Bn+1 .
(d) The preceding proof fails for monotonic decreasing maps, since such a map
cannot jump over a fixed point. Instead, we encode a strictly decreasing map by a
Catalan object.
Let f be monotonic and strictly decreasing on {1, . . . , n}. We encode f by
a sequence of length 2n in the alphabet consisting of two symbols A and B as
follows. In positions 2i − 1 and 2i, we put
AB, if i ∉ Dom( f ) and i ∉ Ran( f ),
AA, if i ∉ Dom( f ) and i ∈ Ran( f ),
BB, if i ∈ Dom( f ) and i ∉ Ran( f ),
BA, if i ∈ Dom( f ) and i ∈ Ran( f ).
It can be shown that this gives a bijective correspondence between the set of such
functions and the set of solutions to the ballot problem in Section 4.5.5 above.
The proof is an exercise. (It is necessary to show that the resulting string has
equally many As and Bs, but each initial substring has at least as many As as
Bs; and that every string with these properties can be decoded to give a strictly
decreasing monotone function. The proof that the correspondence is bijective is
then straightforward.)
It follows that Pms (n) = Cn (the nth Catalan number), and from the remark in
part (c), also Pmd (n) = Cn+1 .
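Parts (a) and (b) are easy to confirm by brute force for small n. The sketch below assumes that a partial permutation is an injective map between two subsets of {1, . . . , n} of equal size, and that "monotonic" means order-preserving; both readings are taken from the proof above. The function names are ours.

```python
from itertools import combinations, permutations
from math import comb, factorial


def partial_permutations(n):
    """Yield every partial permutation of {1, ..., n} as a dict from domain to range."""
    points = range(1, n + 1)
    for k in range(n + 1):
        for dom in combinations(points, k):
            for ran in combinations(points, k):
                for image in permutations(ran):
                    yield dict(zip(dom, image))


def is_monotonic(f):
    """True if x < y implies f(x) < f(y) on the domain of f."""
    keys = sorted(f)
    return all(f[x] < f[y] for x, y in zip(keys, keys[1:]))


def counts(n):
    """Return (P(n), P_m(n)) by direct enumeration."""
    P = Pm = 0
    for f in partial_permutations(n):
        P += 1
        Pm += is_monotonic(f)
    return P, Pm


for n in range(5):
    formula = (sum(comb(n, k) ** 2 * factorial(k) for k in range(n + 1)), comb(2 * n, n))
    print(n, counts(n), formula)
```

The enumeration grows quickly, so it is only useful as a check for very small n.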
Remark Abdullahi Umar has found that many other interesting counting se-
quences arise in calculating the orders of various inverse semigroups of partial
permutations. Among these are Fibonacci, Stirling, Schröder, Euler, Lah and
Narayana numbers. Details are given in a sequence of papers by A. Laradji and
A. Umar.
4.8 Exercises
4.1 Show directly that, with the hypotheses of Theorem 4.4, the sequence $u_n = n^j \alpha_i^n$
satisfies the recurrence (4.1) for $0 \le j \le m_i - 1$. Deduce the statement of the
Theorem.
(a) directly;
(b) using the method of Theorem 4.4.
4.3 Let A be a finite set of positive integers. Suppose that the currency of a certain
country has A as the set of denominations. Prove that the number f (n) of ways of
paying a bill of n units, where coins are paid in order (as in putting them into a
vending machine), has generating function $1/(1 - \sum_{a\in A} x^a)$.
Suppose that A = {1, 2, 5, 10}. Prove that $f(n) \sim c\,\alpha^n$ for some constants c
and α, and estimate α.
What is the generating function for the number in the case when the order of
the coins is not significant (as in tendering a handful of coins in a shop)?
4.5 This exercise is due to Wilf, and illustrates his ‘snake oil’ method.
(b) Let
\[ a_n = \sum_{k=0}^{n} \binom{n+k}{2k} 2^{n-k} \]
\[ \sum_{n\ge0} a_n x^n = \frac{1-2x}{(1-x)(1-4x)}, \]
4.6 Find an asymptotic formula for the number of horizontally convex polyomi-
noes with n cells.
4.7 (a) Let f (n) be the number of expressions for the natural number n as an
ordered sum of 2’s and 3’s. Prove that
(b) Let g(n) be the number of expressions for the natural number n as an un-
ordered sum of 2’s and 3’s. Prove that
\[ \sum_{n\ge0} g(n)x^n = \frac{1}{(1-x^2)(1-x^3)}, \]
4.8 According to Proposition 4.9(b), the sequence whose nth term is $F_n^2$ (the
square of the nth Fibonacci number) satisfies a four-term recurrence relation. Find
such a relation.
4.10 Prove that the number of permutations of {1, . . . , n} which have exactly k
fixed points is equal to $\binom{n}{k} d(n - k)$, where d denotes the derangement number.
4.11 Let an be the number of strings that can be formed from n distinct letters
(using each letter at most once, and including the empty string). Prove that
\[ a_0 = 1, \qquad a_n = n a_{n-1} + 1 \text{ for } n \ge 1, \]
and deduce that $a_n = \lfloor e\, n! \rfloor$ for n ≥ 1. What is the exponential generating function for this
sequence?
4.12 Prove from the recurrence relation for the number $s_n$ of involutions in $S_n$ that
its exponential generating function is $\exp(x + x^2/2)$.
for n ≥ 2.
Let U(x) be the exponential generating function of (un ). Prove that
\[ \frac{d}{dx} U(x) = (1 - x)U(x) + xS(x), \]
where S(x) is as in Exercise 4.12, and deduce that
\[ a_n = n a_{n-1}, \qquad a_0 = 1. \]
Clearly the unique solution is an = n!. What goes wrong if we try to apply the
method of Section 4.2.5 to this example?
4.17 (a) Prove that the inverse of the RSK algorithm produces a permutation π
from each pair (S, T ) of tableaux of the same shape.
(b) Prove that, if π is produced by (S, T ), then π −1 is produced by (T, S).
(c) Prove that, if the RSK algorithm adds a new element a in position i at the end
of the first row of S, then a is the smallest element which lies at the end of
an increasing subsequence of length i of the permutation π, and conversely.
Deduce that, on conclusion of the algorithm, the length of the first row of S
(and T ) is equal to the length of the longest increasing subsequence of π.
4.18 (a) Let π be a permutation of {1, . . . , n}. Suppose that the length of the
longest increasing subsequence of π is m. For i = 1, . . . , m, let Ai be the
subset of {1, . . . , n} consisting of positions for which the longest increasing
subsequence of π ending in that position is i. Show that each Ai carries a
decreasing subsequence of {1, . . . , n}. Deduce that the longest decreasing
subsequence of π has size at least n/m.
4.19 Let k and l be positive integers, and n = kl. How many permutations of
{1, . . . , n} have the property that the longest increasing subsequence has length k
and the longest decreasing subsequence has length l?
Find all such permutations in the case k = l = 2.
4.20 Let P(n) be the number of partial permutations of the set {1, . . . , n}. Prove
that P(n)/n! tends to infinity faster than any polynomial in n.
The permanent
The permanent of a square matrix is defined by the same sum of products as the
determinant, but omitting the signs. Much of enumerative combinatorics involves
calculating or estimating the permanent. However, there are no simple techniques
for this. In this chapter we define the permanent and prove some of the main results
about it (except for the proof of van der Waerden’s conjecture), and apply them to
the enumeration of Latin squares.
\[ \det(A) = \sum_{\pi\in S_n} \operatorname{sgn}(\pi)\, a_{1,1\pi}\, a_{2,2\pi} \cdots a_{n,n\pi}, \]
where $S_n$ is the symmetric group of degree n, and sgn(π) is the sign of the permutation
π. This is a beautiful theoretical formula, but very inefficient for calculating
determinants in practice, since it is a sum of n! terms; a matrix can be reduced to
upper triangular form (and its determinant calculated) with at most $n^3$ arithmetic
operations by Gaussian elimination.
A very similar formula defines the permanent of A:
\[ \operatorname{per}(A) = \sum_{\pi\in S_n} a_{1,1\pi}\, a_{2,2\pi} \cdots a_{n,n\pi}. \]
This looks simpler since the signs are not present. Unfortunately, there are no
short-cuts for calculating the permanent. No method substantially faster than eval-
uating the formula is known.
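For small matrices the defining sum can be evaluated directly. The following sketch does exactly that, with rows and columns indexed from 0; it has n! terms and is meant only as an illustration of the definition. The function name `permanent` is ours.

```python
from itertools import permutations


def permanent(A):
    """Permanent of a square matrix (a list of rows), summed over all permutations."""
    n = len(A)
    total = 0
    for pi in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= A[i][pi[i]]     # the entry in row i and column pi(i)
        total += term
    return total


print(permanent([[1, 2], [3, 4]]))      # 1*4 + 2*3

# The 4 x 4 matrix with zero diagonal and all other entries 1.
J_minus_I = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(permanent(J_minus_I))
```

The second matrix has permanent equal to the number of derangements of four points, since a permutation contributes 1 exactly when it has no fixed point.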
(a) $a_i \ne a_j$ for $i \ne j$;
(b) ai ∈ Ai for all i.
(The first condition says that the elements are distinct; the second, that they are
representatives of the corresponding sets.)
Now suppose that |X| = n, say X = {1, 2, . . . , n}. The sets A1 , . . . , An can be
represented by an n × n incidence matrix P = (pi j ), where
\[ p_{ij} = \begin{cases} 1 & \text{if } j \in A_i, \\ 0 & \text{otherwise.} \end{cases} \]
So we have
\[ |A(I)| \ge |I| \quad \text{for all } I \subseteq \{1, \ldots, n\}. \]
This is known as Hall’s condition, after Philip Hall, who proved that it is also
sufficient:
Proof We have seen the necessity of the condition; now we prove its sufficiency.
There are several proofs, of which the one given here is one of the simplest.
The proof is by induction on n. For n = 1, Hall’s condition says |A1 | ≥ 1, so
certainly an SDR exists. So suppose, inductively, that any family with fewer than
n sets which satisfies Hall’s condition has an SDR.
We say that a subset I of {1, . . . , n} is critical if |A(I)| = |I|. We divide the proof
into two cases.
since I is not critical. By induction, the family $(A_1, \ldots, A_{n-1})$ has an SDR, say
$(a_1, \ldots, a_{n-1})$. Clearly $a_i \ne a_n$ for i < n, so $(a_1, \ldots, a_n)$ is an SDR for the original
family.
where the second inequality holds because |A(J)| = |J|. Combining the two SDRs
gives an SDR for the whole family.
The condition of being doubly stochastic requires the same condition for column
sums, though its probabilistic interpretation is less clear.
Ai = { j ∈ X : pi j > 0}.
\[ P = qB + (1 - q) \sum x_i B_i, \]
and $q + (1 - q) \sum x_i = 1$, as required.
A Latin square of order n is an n × n array with entries taken from the set
{1, . . . , n} of symbols, such that each symbol occurs once in each row and once in
each column.
For example, the following are the two Latin squares of order 3 with first row
(1, 2, 3). Since the first row is arbitrary, there are 2 × 6 = 12 Latin squares of
order 3 altogether.
1 2 3 1 2 3
2 3 1 3 1 2
3 1 2 2 3 1
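The count of 12 can be confirmed by a small backtracking search that builds a square row by row, accepting a new row only if it agrees with no earlier row in any column. This is a sketch for very small orders only; the function name `count_latin_squares` is ours.

```python
from itertools import permutations


def count_latin_squares(n):
    """Count the n x n Latin squares on the symbols 1, ..., n by backtracking."""
    rows = list(permutations(range(1, n + 1)))

    def extend(square):
        if len(square) == n:
            return 1
        total = 0
        for row in rows:
            # The new row must differ from every earlier row in every column.
            if all(row[j] != prev[j] for prev in square for j in range(n)):
                total += extend(square + [row])
        return total

    return extend([])


print([count_latin_squares(n) for n in range(1, 5)])
```

The running time grows far too quickly for this to be of use beyond the first few orders.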
Let L(n) be the number of Latin squares of order n. As hinted in the Preface,
the problem of calculating L(n) exactly seems to be intractable. Values are known
for n ≤ 11, as a result of extensive computation. They are given in the
following table. (The last entry was found by Brendan McKay and Ian Wanless in
2005.)
n L(n)
1 1
2 2
3 12
4 576
5 161280
6 812851200
7 61479419904000
8 108776032459082956800
9 5524751496156892842531225600
10 9982437658213039871725064756920320000
11 776966836171770144107444346734230682311065600000
Can we find estimates for L(n), perhaps upper and lower bounds which are not
too far apart?
The number of ways of placing symbols into an n × n array without restriction
is $n^{n^2}$; so clearly $L(n) \le n^{n^2}$. We can improve this by observing that each row is
a permutation of {1, . . . , n}, so that $L(n) \le (n!)^n$. By Stirling’s formula, the right-hand
side is about $(2\pi n)^{n/2} (n/e)^{n^2}$. So, roughly speaking, we have knocked off a
factor of $e^{n^2}$.
This bound can be further improved by noting that the second, . . . , nth rows are
derangements of the first; so $L(n) \le n!\, d(n)^{n-1}$. The improvement is only a factor
of about $e^{n-1}$.
We can find a lower bound by building Latin squares row by row. We define a
k × n Latin rectangle for k ≤ n to be a k × n array containing the symbols {1, . . . , n}
so that each symbol occurs once in each row and at most once in each column.
Note that a 1 × n Latin rectangle is simply a permutation; there are n! of these. A
2 × n Latin rectangle consists of two permutations, the second a derangement of
the first; the number of these is $n!\, d(n) \sim (n!)^2/e$.
Proof Given a k × n Latin rectangle L, let Ai be the set of symbols which do not
occur in the ith column. Then the ith entry in the (k + 1)st row must be chosen
from $A_i$; and these entries must all be distinct, so the row is an SDR for the sets
(A1 , . . . , An ). Thus the number of ways to extend the array to one of size (k + 1) × n
is the number of SDRs of (A1 , . . . , An ), which is equal to the permanent of its
incidence matrix P.
We claim that P has all row and column sums n − k. For the rows this is clear,
since |Ai | = n − k for all i. The jth column sum is the number of columns in
which j does not occur in the array. But j occurs k times in the array, in k distinct
columns, so fails to appear in (n − k) columns.
The matrix (1/(n − k))P in the above proof is doubly stochastic. By the theorem
of Egorychev and Falikman, $\operatorname{per}((1/(n - k))P) \ge n!/n^n$; so we have
In the case k = 1, assuming that the first row is the identity permutation, we
have P = J − I, where J is the matrix with every entry 1. A permutation contributes
to the permanent of P if and only if it is a derangement; so per(P) = d(n) ∼ n!/e.
The estimate from the Egorychev–Falikman Theorem is
\[ n!\,(1 - 1/n)^n \sim n!/e, \]
since $(1 - 1/n)^n \to 1/e$ as n → ∞.
For n = 10, we have $(9/10)^{10} = 0.3486784401$, while 1/e = 0.367879441 . . .;
so our estimate from van der Waerden’s conjecture is about 5% too low. The
approximation gets better for larger n.
Now the total number of Latin squares satisfies
\[ L(n) \ge \prod_{k=0}^{n-1} \frac{n!\,(n-k)^n}{n^n} = \frac{(n!)^{2n}}{n^{n^2}}. \]
The number on the right is about $(2\pi n)^n (n/e^2)^{n^2}$, by Stirling’s formula.
We have obtained upper and lower bounds for L(n) which are roughly $(cn)^{n^2}$,
where $c = 1/e^2$ for the lower bound and $c = 1/e$ for the upper. Existing exact results
are not adequate for making guesses about which bound is closer to the truth!
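To see how far apart the bounds are, we can evaluate them for the first few values of n and compare with the exact values of L(n) from the table. The sketch below uses the lower bound $(n!)^{2n}/n^{n^2}$ and the upper bound $n!\, d(n)^{n-1}$; the derangement numbers are computed from the standard recurrence d(n) = (n − 1)(d(n − 1) + d(n − 2)), and all the names are ours.

```python
from fractions import Fraction
from math import factorial

# Exact values of L(n) for n = 1, ..., 6, copied from the table.
KNOWN = {1: 1, 2: 2, 3: 12, 4: 576, 5: 161280, 6: 812851200}


def derangements(n):
    """d(n) from d(0) = 1, d(1) = 0 and d(m) = (m - 1)(d(m - 1) + d(m - 2))."""
    d = [1, 0]
    for m in range(2, n + 1):
        d.append((m - 1) * (d[m - 1] + d[m - 2]))
    return d[n]


def lower_bound(n):
    """(n!)^(2n) / n^(n^2), kept as an exact fraction."""
    return Fraction(factorial(n) ** (2 * n), n ** (n * n))


def upper_bound(n):
    """n! d(n)^(n-1)."""
    return factorial(n) * derangements(n) ** (n - 1)


for n, exact in KNOWN.items():
    print(n, float(lower_bound(n)), exact, upper_bound(n))
```

Even for n = 6 the two bounds differ by many orders of magnitude, which illustrates the remark about the exact results.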
1 2 3 4           1 2 3 4
2 3 4 1    and    2 1 4 3
5.6 Exercises
Let d(n) denote the number of derangements of {1, . . . , n}.
5.1 Show that the permanent of the n × n matrix with diagonal entries 0 and off-
diagonal entries 1 is equal to d(n).
5.3 Show that it is not possible to construct Latin squares by placing elements
from {1, . . . , n} into an n × n grid arbitrarily subject to the constraint that no el-
ement occurs more than once in the same row or column. What is the smallest
number of elements which have to be added in order to reach a situation where
completion to a Latin square is impossible?
q-analogues
6.1 Motivation
We can look at q-analogues in several ways:
• The q-analogues are, typically, formulae which tend to the classical ones as
q → 1. Most basic is the fact that
\[ \lim_{q\to1} \frac{q^a - 1}{q - 1} = a \]
for any real number a (this is immediate from l’Hôpital’s rule).
• There is a formal similarity between statements about subsets of a set and
subspaces of a vector space, with cardinality replaced by dimension. For
example, the Inclusion–Exclusion rule
|U ∪V | + |U ∩V | = |U| + |V |
for sets becomes
dim(U +V ) + dim(U ∩V ) = dim(U) + dim(V )
for vector spaces. Now, if the underlying field has q elements, then the
number of 1-dimensional subspaces of an n-dimensional vector space is
$(q^n - 1)/(q - 1)$, which is exactly the q-analogue of n.
Theorem 6.1 The cardinality of any finite field is a prime power. Moreover, for
any prime power q, there is a unique field with q elements, up to isomorphism.
To commemorate Galois, finite fields are called Galois fields, and the field
with q elements is denoted by GF(q).
Here is an example where Gaussian coefficients arise in a weighted counting
problem.
A lattice path in the first quadrant of the plane is a path starting at the origin,
each step being either one unit to the right, or one unit upwards.
The number of lattice paths from the origin to the point (m, n) is $\binom{m+n}{m}$. For to move from the
origin to (m, n), we must take m + n steps, and exactly m of these steps must be to
the right; we can choose which m of the m + n steps are to the right arbitrarily.
This fact can also be proved by means of a recurrence relation. Let P(m, n) be
the number of lattice paths from the origin to (m, n). Clearly P(0, n) = P(m, 0) = 1.
If m, n > 0, then any path to (m, n) must pass either through (m − 1, n) or through
(m, n − 1); so we have
\[ P(m, n) = P(m - 1, n) + P(m, n - 1). \]
6.2 q-integers
We define the q-integer $[n]_q$ to be
\[ [n]_q = \frac{q^n - 1}{q - 1}. \]
These have the property that they tend to the usual factorials and binomial coeffi-
cients as q → 1. However, they will not always be the most appropriate definitions.
Indeed, it is not clear that there should always be a most appropriate definition. We
should proceed with caution!
It can be shown that the Gaussian coefficient is a polynomial in q, if we regard
q as an indeterminate. If instead we regard q as a complex number, it has a well-
defined value as long as q is not a dth root of unity for some d dividing k. (In the
excluded cases, the denominator is zero, but the limit still exists.)
Fortunately, in the case of q-binomial coefficients, the proposed definition works
without problems:
Proposition 6.2 If q is a prime power, then the number of k-dimensional subspaces
of an n-dimensional vector space over GF(q) is $\begin{bmatrix} n \\ k \end{bmatrix}_q$.
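For q = 2 the proposition can be tested by brute force. The sketch below assumes the usual product formula $\begin{bmatrix} n \\ k \end{bmatrix}_q = \prod_{i=0}^{k-1} \frac{q^{n-i} - 1}{q^{k-i} - 1}$ for the Gaussian coefficient, and represents a vector of GF(2)^n as an n-bit integer, so that vector addition is bitwise XOR. The function names are ours.

```python
from itertools import combinations


def gaussian(n, k, q):
    """Gaussian coefficient at an integer q > 1, from the product formula."""
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (k - i) - 1
    return num // den


def count_subspaces_gf2(n, k):
    """Count the k-dimensional subspaces of GF(2)^n by listing spans of k vectors."""
    def span(vectors):
        s = {0}
        for v in vectors:
            s |= {x ^ v for x in s}     # adjoin v to the subspace built so far
        return frozenset(s)

    spans = {span(c) for c in combinations(range(1, 2 ** n), k)}
    # A span has dimension k exactly when it contains 2^k vectors.
    return sum(1 for s in spans if len(s) == 2 ** k)


print(gaussian(4, 2, 2), count_subspaces_gf2(4, 2))
```

Other prime powers would need genuine finite field arithmetic, which this sketch does not attempt.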
since the ith vector must be chosen outside the span of its predecessors. Any such
choice is the basis of a unique k-dimensional subspace. Putting n = k, we see that
the number of bases of a k-dimensional space is
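Proof This comes straight from the definition. Suppose that 0 < k < n. Then
\begin{align*}
\begin{bmatrix} n \\ k \end{bmatrix}_q - \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}_q
 &= \left( \frac{q^n - 1}{q^k - 1} - 1 \right) \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}_q \\
 &= q^k \left( \frac{q^{n-k} - 1}{q^k - 1} \right) \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}_q \\
 &= q^k \begin{bmatrix} n-1 \\ k \end{bmatrix}_q.
\end{align*}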
The array of Gaussian coefficients has the same symmetry as that of binomial
coefficients. From this we can deduce another recurrence relation.
We come now to the q-analogue of the Binomial Theorem, which states the
following.
\[ e_1 = x_1 + x_2 + x_3, \quad e_2 = x_1 x_2 + x_2 x_3 + x_3 x_1, \quad e_3 = x_1 x_2 x_3. \]
Proposition 6.6
\[ \prod_{i=1}^{n} (z - x_i) = \sum_{k=0}^{n} (-1)^k e_k(x_1, \ldots, x_n) z^{n-k}. \]
Hence the Binomial Theorem and its q-analogue give the following specialisations:
so
\[ e_k(1, 1, \ldots, 1) = \binom{n}{k}. \]
so
\[ e_k(1, q, \ldots, q^{n-1}) = q^{k(k-1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_q. \]
These linear maps form a group, the general linear group GL(n, q).
To an algebraist, the general linear group might seem to be a q-analogue of the
symmetric group, but its order is not the q-factorial defined earlier!
Using the q-Binomial Theorem, we can transform the multiplicative formula
for the order of GL(n, q) into an additive formula:
Proposition 6.8
\[ |GL(n, q)| = (-1)^n q^{n(n-1)/2} \sum_{k=0}^{n} (-1)^k q^{k(k+1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_q. \]
Proof We have
\[ |GL(n, q)| = (-1)^n q^{n(n-1)/2} \prod_{i=1}^{n} (1 - q^i), \]
As n → ∞, we have
\[ p_n(q) \to p(q) = \prod_{i\ge1} (1 - q^{-i}). \tag{6.1} \]
So, for example, p(2) = 0.2887 . . . is the limiting probability that a large random
matrix over GF(2) is invertible.
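The product converges very quickly, so the limit is easy to estimate numerically. The sketch below assumes that $p_n(q) = \prod_{i=1}^{n} (1 - q^{-i})$, the proportion of n × n matrices over GF(q) which are invertible, and computes partial products in floating point. The function name and the default number of terms are ours.

```python
def invertible_probability(q, terms=200):
    """Partial product of (1 - q^(-i)) for i = 1, ..., terms, in floating point."""
    p = 1.0
    for i in range(1, terms + 1):
        p *= 1.0 - q ** (-i)
    return p


print(invertible_probability(2, terms=2))   # the 2 x 2 case over GF(2)
print(invertible_probability(2))
print(invertible_probability(3))
```

For q = 2 the factors beyond the first fifty or so are indistinguishable from 1 in double precision, so the choice of 200 terms is generous.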
What is the q-analogue of the Stirling number S(n, k), the number of partitions
of an n-set into k parts? This is a philosophical, not a mathematical question; I
argue that the q-analogue is the Gaussian coefficient $\begin{bmatrix} n \\ k \end{bmatrix}_q$.
The number of surjective maps from an n-set to a k-set is k!S(n, k), since the
preimages of the points in the k-set form a partition of the n-set whose k parts can
be mapped to the k-set in any order. The q-analogue is the number of surjective
linear maps from an n-space V to a k-space W . Such a map is determined by its
kernel U, an (n − k)-dimensional subspace of V , and a linear isomorphism from
V /U to W . So the analogue of S(n, k) is the number of choices of U, which is
\[ \begin{bmatrix} n \\ n-k \end{bmatrix}_q = \begin{bmatrix} n \\ k \end{bmatrix}_q. \]
Proof We give two proofs, one depending on some algebra, and the other a rather
nice exercise in manipulating formal power series.
First proof: We use the fact that the roots of an irreducible polynomial of
degree k over GF(q) lie in the unique field $GF(q^k)$ of degree k over GF(q). Moreover,
$GF(q^k) \subseteq GF(q^n)$ if and only if k | n; and every element of $GF(q^n)$ generates
some subfield over GF(q), which has the form $GF(q^k)$ for some k dividing n.
Now each of the $q^n$ elements of $GF(q^n)$ satisfies a unique minimal polynomial
of degree k for some k; and every irreducible polynomial arises in this way, and
has k distinct roots. So the result holds.
Second proof: All the algebra we use in this proof is that each monic poly-
nomial of degree n can be factorised uniquely into monic irreducible factors. If the
number of monic irreducibles of degree k is mk , then we obtain all monic polyno-
mials of degree n by the following procedure:
• Choose ak monic irreducibles of degree k from the set of all mk such, with
repetitions allowed and order not important;
• Multiply the chosen polynomials together.
Altogether there are $q^n$ monic polynomials $x^n + c_1 x^{n-1} + \cdots + c_n$ of degree n,
since there are q choices for each of the n coefficients. Hence
\[ q^n = \sum \prod_k \binom{m_k + a_k - 1}{a_k}, \tag{6.3} \]
where the sum is over all sequences $a_1, a_2, \ldots$ of natural numbers which satisfy
$\sum k a_k = n$.
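Multiplying by $x^n$ and summing over n, we get
\begin{align*}
\frac{1}{1 - qx} &= \sum_{n\ge0} q^n x^n \\
 &= \sum_{a_1, a_2, \ldots} \prod_{k\ge1} \binom{m_k + a_k - 1}{a_k} x^{k a_k} \\
 &= \prod_{k\ge1} \sum_{a\ge0} \binom{m_k + a - 1}{a} (x^k)^a \\
 &= \prod_{k\ge1} (1 - x^k)^{-m_k}.
\end{align*}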
Here the manipulations are similar to those for the sum of cycle indices which we
will meet in the next chapter; we use the fact that the number of choices of a things
from a set of m, with repetition allowed and order unimportant, is
\[ \binom{m + a - 1}{a} = (-1)^a \binom{-m}{a} \]
(see Theorem 3.6), and in the fourth line we invoke the Binomial Theorem with
negative exponent.
Taking logarithms of both sides, we obtain
\begin{align*}
\sum_{n\ge1} \frac{q^n x^n}{n} &= -\log(1 - qx) \\
 &= \sum_{k\ge1} -m_k \log(1 - x^k) \\
 &= \sum_{k\ge1} m_k \sum_{r\ge1} \frac{x^{kr}}{r}.
\end{align*}
The coefficient of $x^n$ in the last expression is the sum, over all divisors k of n,
of $m_k/r = k m_k/n$. This must be equal to the coefficient on the left, which is $q^n/n$.
We conclude that
\[ q^n = \sum_{k \mid n} k m_k, \tag{6.4} \]
as required.
Note how the very complicated recurrence relation (6.3) for the numbers mk
changes into the much simpler recurrence relation (6.4) after taking logarithms!
We will see how to solve such a recurrence in the chapter on Möbius inversion.
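Even without Möbius inversion, relation (6.4) determines the numbers $m_k$ one at a time, since $n m_n = q^n - \sum k m_k$ with the sum over the proper divisors k of n. The sketch below does this, and for q = 2 compares the result with a direct count in which a polynomial over GF(2) is stored as an integer whose bits are its coefficients. The function names are ours.

```python
def irreducible_counts(q, limit):
    """Return {n: m_n} for 1 <= n <= limit, solving q^n = sum over k | n of k * m_k."""
    m = {}
    for n in range(1, limit + 1):
        rest = sum(k * m[k] for k in range(1, n) if n % k == 0)
        m[n] = (q ** n - rest) // n
    return m


def clmul(a, b):
    """Product of two polynomials over GF(2), each stored as a bit mask."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def brute_force_gf2(n):
    """Number of irreducible polynomials of degree n over GF(2), by listing products."""
    # A polynomial of degree at least 1 is a bit mask >= 2; its degree is bit_length - 1.
    reducible = {clmul(a, b)
                 for a in range(2, 2 ** n) for b in range(2, 2 ** n)
                 if (a.bit_length() - 1) + (b.bit_length() - 1) == n}
    return 2 ** n - len(reducible)


print(irreducible_counts(2, 6))
print([brute_force_gf2(n) for n in range(1, 7)])
```

Over GF(2) every non-zero polynomial is monic, which is why the direct count needs no normalisation.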
\[ D_q f(x) = \frac{d_q f(x)}{d_q x}. \]
Note that, unlike ordinary calculus, the differentials are defined directly, and the
derivative is their quotient. If f(x) is differentiable, with derivative $f'(x)$, then we
have
\[ \lim_{q\to1} D_q f(x) = f'(x). \]
\[ D_q x^n = [n]_q x^{n-1} \]
Remark There is another type of quantum calculus, namely the h-calculus, defined
as follows: we have
\[ d_h f(x) = f(x + h) - f(x), \]
6.8 Exercises
6.1 Prove that $\binom{n}{k}_q$ is a polynomial of degree $k(n - k)$ in the indeterminate $q$.
6.2 Prove that, for fixed $n$, the Gaussian coefficients $\binom{n}{k}_q$ for $k = 0, 1, \ldots, n$ form
a log-concave sequence.
(b) Let
\[ F_q(n) = \sum_{k=0}^{n} \binom{n}{k}_q, \]
so that, if $q$ is a prime power, then $F_q(n)$ is the total number of subspaces of
an $n$-dimensional vector space over GF($q$). Prove that
\[ F_q(0) = 1, \quad F_q(1) = 2, \quad F_q(n) = 2F_q(n-1) + (q^{n-1} - 1)F_q(n-2) \text{ for } n \ge 2. \]
(c) Deduce that, if $q > 1$, then $F_q(n) \ge c\, q^{n^2/4}$ for some constant $c$ (depending
on $q$).
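A numerical sanity check is no substitute for the proofs requested above, but it can catch a misread formula. The sketch below computes Gaussian coefficients from the recurrence $\binom{n}{k}_q = \binom{n-1}{k-1}_q + q^k \binom{n-1}{k}_q$ and tests the recurrence for $F_q(n)$ for small values. The helper names `gaussian` and `subspace_total` are ours.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def gaussian(n, k, q):
    """Gaussian coefficient [n choose k]_q for an integer q, via q-Pascal."""
    if k < 0 or k > n:
        return 0
    if k == 0 or k == n:
        return 1
    return gaussian(n - 1, k - 1, q) + q**k * gaussian(n - 1, k, q)


def subspace_total(n, q):
    """F_q(n): the sum of the Gaussian coefficients in row n."""
    return sum(gaussian(n, k, q) for k in range(n + 1))


def recurrence_holds(q, n_max):
    """Check F_q(n) = 2 F_q(n-1) + (q^(n-1) - 1) F_q(n-2) for 2 <= n <= n_max."""
    return all(
        subspace_total(n, q)
        == 2 * subspace_total(n - 1, q) + (q ** (n - 1) - 1) * subspace_total(n - 2, q)
        for n in range(2, n_max + 1)
    )


if __name__ == "__main__":
    print([subspace_total(n, 2) for n in range(6)])
    print(recurrence_holds(2, 10), recurrence_holds(3, 10))
```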
6.4 This exercise shows that the Gaussian coefficients have a counting interpreta-
tion for all positive integer values of q (not just prime powers).
Suppose that q is an integer greater than 1. Let Q be a finite set of cardinality q
containing two distinguished elements 0 and 1. We say that a k × n matrix with
entries from Q is in reduced echelon form if the following conditions hold:
• If a row has any non-zero entries, then the first such entry is 1 (such entries
are called ‘leading 1’);
• if i < j and row j is non-zero, then row i is also non-zero, and its leading 1
occurs to the left of the leading 1 in row j;
• if a column contains the leading 1 of some row, then all other entries in that
column are 0.
Prove that $\binom{n}{k}_q$ is the number of $k \times n$ matrices in reduced echelon form with no
rows of zeros.
6.5 A matrix is said to be in echelon form if it satisfies the first two conditions in
the definition of reduced echelon form. Show that, if q is an integer greater than 2,
the right-hand side of the q-Binomial Theorem with x = 1 counts the number of
n × n matrices in echelon form.
How many n × n matrices in reduced echelon form are there?
6.6 Let $h_k(x_1, \ldots, x_n)$ be the complete symmetric function of degree $k$ in the indeterminates $x_1, \ldots, x_n$ (the sum of all monomials of degree $k$ that can be formed
using these indeterminates). For example,
\[ h_2(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 + x_1 x_2 + x_2 x_3 + x_3 x_1. \]
Prove that
\[ \sum_{k=0}^{\infty} h_k(x_1, \ldots, x_n) z^k = \prod_{i=1}^{n} (1 - x_i z)^{-1}. \]
Deduce that
6.7 The second proof of Theorem 6.9 shows that the number of irreducible polynomials over GF($q$) is exactly what is required if every element of GF($q^n$) is the
root of a unique irreducible of degree dividing $n$ over GF($q$). Turn the argument
around to give a counting proof of the existence and uniqueness of GF($q^n$), given
that of GF($q$).
6.8 Let $\omega$ be a primitive $d$th root of unity. Express $\binom{n}{k}_\omega$ in terms of binomial
coefficients (whenever you can).
Solution by Pablo Spiga Let $d$ be a natural number, and let $\omega$ be a primitive $d$th
root of unity in $\mathbb{C}$, i.e. $\omega^d = 1$. Then, if $0 \le a, b \le d - 1$, we have
\[ \binom{nd + a}{kd + b}_\omega = \binom{n}{k} \binom{a}{b}_\omega. \]
Note that we are assuming that $\binom{a}{b}_\omega = 0$ whenever $a < b$.
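Thus, we get
\[ \prod_{i=1}^{nd} (1 + \omega^{i-1}(-\xi)) = \sum_{j=0}^{nd} \omega^{j(j-1)/2} (-1)^j \binom{nd}{j}_\omega \xi^j, \tag{6.5} \]
but
\[ \prod_{i=1}^{nd} (1 + \omega^{i-1}(-\xi)) = (1 - \xi^d)^n = \sum_{k=0}^{n} \binom{n}{k} (-1)^k \xi^{kd}. \tag{6.6} \]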
We have proved that $\binom{nd}{j}_\omega = 0$ if $d$ does not divide $j$. Assume $j = dk$. By (6.5)
and (6.6), as
\[ \omega^{dk(dk-1)/2} (-1)^{k(d+1)} = 1, \tag{6.7} \]
we get
\[ \binom{nd}{kd}_\omega = \binom{n}{k}. \]
(For (6.7), note that if $d$ is odd then $(-1)^{d+1} = 1$, while if $d$ is even then we can
write $-1$ as $\omega^{d/2}$, and we find $\omega^{dk(dk+d)/2} = \omega^{d^2 k(k+1)/2}$.) This proves the result
for $a = 0$.
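Assume $a \ge 1$. If $b \ne 0$ then, by induction hypothesis and by the usual recurrence relation, we get
\begin{align*}
\binom{nd + a}{kd + b}_\omega &= \binom{nd + a - 1}{kd + b - 1}_\omega + \omega^{kd+b} \binom{nd + a - 1}{kd + b}_\omega \\
&= \binom{n}{k} \binom{a - 1}{b - 1}_\omega + \omega^{b} \binom{n}{k} \binom{a - 1}{b}_\omega \\
&= \binom{n}{k} \binom{a}{b}_\omega.
\end{align*}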
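Finally, if $b = 0$, then, as $a - 1 < d - 1$,
\begin{align*}
\binom{nd + a}{kd}_\omega &= \binom{nd + a - 1}{(k - 1)d + d - 1}_\omega + \omega^{kd} \binom{nd + a - 1}{kd}_\omega \\
&= \omega^{0} \binom{n}{k} \binom{a - 1}{0}_\omega \\
&= \binom{n}{k} \binom{a}{b}_\omega.
\end{align*}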
Remark Compare Lucas’ formula
\[ \binom{np + a}{kp + b} \equiv \binom{n}{k} \binom{a}{b} \pmod{p} \]
if $p$ is prime and $0 \le a, b < p$.
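The identity in the solution above can be checked numerically. The following sketch evaluates Gaussian coefficients at a complex primitive $d$th root of unity using the recurrence (which avoids dividing by zero) and compares both sides in floating point. The names `gaussian_at` and `q_lucas_holds`, and the tolerance, are our own choices; this is a sanity check, not a proof.

```python
import cmath
from math import comb


def gaussian_at(n, k, w):
    """Gaussian coefficient [n choose k] evaluated at q = w, via q-Pascal."""
    if k < 0 or k > n:
        return 0
    row = [1]  # row for n = 0
    for m in range(1, n + 1):
        new = [1]
        for j in range(1, m):
            new.append(row[j - 1] + w**j * row[j])
        new.append(1)
        row = new
    return row[k]


def q_lucas_holds(d, n_max=3, tol=1e-6):
    """Compare [nd+a, kd+b]_w with C(n,k) [a,b]_w for w = exp(2 pi i / d)."""
    w = cmath.exp(2j * cmath.pi / d)
    for n in range(n_max + 1):
        for k in range(n_max + 1):
            for a in range(d):
                for b in range(d):
                    lhs = gaussian_at(n * d + a, k * d + b, w)
                    rhs = comb(n, k) * gaussian_at(a, b, w)
                    if abs(lhs - rhs) > tol:
                        return False
    return True


if __name__ == "__main__":
    print([q_lucas_holds(d) for d in (2, 3, 4)])
```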
6.9 Prove that, if the indeterminates $x$ and $y$ satisfy $yx = qxy$, then
\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k}_q x^{n-k} y^k. \]
6.11 Prove Heine’s formula, the $q$-analogue of the negative Binomial Theorem:
\[ \frac{1}{(1 - x)_q^n} = \sum_{j=0}^{\infty} \binom{n + j - 1}{j}_q x^j. \]
6.14 Using each of the formulae (6.1) and (6.2), estimate the probability that a
large matrix over GF(2) is invertible. Compare the rate of convergence of the two
formulae.
k=0 q
This is a true analogue of a ‘false theorem’ about sets and functions. The number
of partial permutations of an $n$-set is
\[ \sum_{k=0}^{n} \binom{n}{k}^2 |S_k| \]
(see Theorem 4.20), whereas the number of maps from an $n$-set to itself is $n^n$; these
numbers are definitely not equal!
There are many situations in combinatorics where we want to count the number
of arrangements of some kind, not in total, but up to some symmetry of the prob-
lem concerned. The branch of mathematics which deals with symmetry is group
theory, in particular the theory of group actions. In this chapter we consider how
to use information about a group action (specifically, a polynomial known as the
cycle index) to solve problems of this kind.
Here is an example.
A cube has six faces, so if we paint each face red, white or blue, the total
number of ways that we can apply the colours is $3^6 = 729$. However, if we can
pick up the cube and move it around, it is natural to count in a different way,
where two coloured cubes differing only by a rotation are counted as ‘the same’.
There are 24 rotations of the cube into itself, but the answer to our question is not
obtained just by dividing 729 by 24. The purpose of this section is to develop tools
for answering such questions.
The theory of permutation groups has many connections to combinatorics other
than its use in enumeration. To choose one example, a number of the ‘sporadic’
finite simple groups were first constructed as groups of automorphisms of combi-
natorial structures such as graphs or designs.
Group actions and the cycle index
\[ x^{gh} = (x^g)^h \]
• $x = x^1$, so $x \sim x$: $\sim$ is reflexive.
• Let $x \sim y$. Then $y = x^g$, so $x = y^{g^{-1}}$, so $y \sim x$: $\sim$ is symmetric.
Note that the three conditions in the definition of a permutation group translate
precisely into the three conditions of an equivalence relation.
Remark In the literature, this result is often referred to as Burnside’s Lemma; see
Neumann’s paper in the bibliography to see why this name came about and why it
is inappropriate.
Proof We count in two different ways the number $N$ of pairs $(x, g)$, with $x \in X$,
$g \in G$, and $x^g = x$.
On the one hand, clearly
\[ N = \sum_{g \in G} \mathrm{fix}(g). \]
Using this, we can count our coloured cubes. We have to examine the 24 rota-
tions and find the number of colourings fixed by each.
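The count can be carried out mechanically. The sketch below builds the 24 rotations as permutations of the six faces, generated by two quarter-turns about perpendicular axes, and applies the Orbit-counting Lemma using the fact that a rotation with $c$ cycles on faces fixes $r^c$ colourings with $r$ colours. The face numbering and all function names are our own assumptions; a direct enumeration of orbits is included as an independent check.

```python
from itertools import product


def compose(p, q):
    """Apply p first, then q (permutations as tuples: i -> p[i])."""
    return tuple(q[p[i]] for i in range(len(p)))


def generate_group(generators):
    """Close a set of permutations under right multiplication."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def cycle_count(p):
    """Number of cycles of p, fixed points included."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = p[j]
    return cycles


# Assumed face numbering: 0 up, 1 down, 2 front, 3 back, 4 left, 5 right.
QUARTER_TURN_VERTICAL = (0, 1, 5, 4, 2, 3)  # front -> right -> back -> left
QUARTER_TURN_SIDE = (2, 3, 1, 0, 4, 5)      # up -> front -> down -> back
ROTATIONS = generate_group([QUARTER_TURN_VERTICAL, QUARTER_TURN_SIDE])


def colourings_up_to_rotation(colours):
    """Orbit-counting Lemma: average the number of fixed colourings."""
    total = sum(colours ** cycle_count(g) for g in ROTATIONS)
    return total // len(ROTATIONS)


def orbits_brute_force(colours):
    """Count orbits directly by choosing a canonical form in each orbit."""
    canonical = {
        min(tuple(c[g[i]] for i in range(6)) for g in ROTATIONS)
        for c in product(range(colours), repeat=6)
    }
    return len(canonical)


if __name__ == "__main__":
    print(len(ROTATIONS), colourings_up_to_rotation(3), orbits_brute_force(3))
```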
for the unsigned Stirling numbers of the first kind, from which the required equation is obtained by substituting $-x$ for $x$ and multiplying by $(-1)^n$. Suppose first
that $x$ is a positive integer. Consider the set of functions from $\{1, \ldots, n\}$ to a set $X$
of cardinality $x$. There are $x^n$ such functions. Now the symmetric group $S_n$ acts on
these functions: the permutation $g$ maps the function $f$ to $f^g$, where
\[ f^g(i) = f(ig^{-1}). \]
A function $f$ selects $n$ elements of $X$ with order important and repetitions allowed:
the selections are just $f(1), \ldots, f(n)$. So the orbit of $f$ under $S_n$ is simply a selection of $n$ things from $X$, where repetitions are allowed and order is not important
(since $S_n$ allows all reorderings). So the number of orbits is the number of such
selections, which is
\[ \binom{x + n - 1}{n} = x(x + 1) \cdots (x + n - 1)/n! \]
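The orbit count in this argument is easy to test for small values. A function is fixed by $g$ exactly when it is constant on the cycles of $g$, so $g$ fixes $x^{c(g)}$ functions, where $c(g)$ is the number of cycles. The sketch below averages this over $S_n$ and compares with the binomial coefficient. The function names are ours.

```python
from itertools import permutations
from math import comb, factorial


def cycle_count(p):
    """Number of cycles of the permutation i -> p[i]."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = p[j]
    return cycles


def orbits_on_functions(n, x):
    """Orbits of S_n on functions {1..n} -> X with |X| = x, by orbit counting."""
    total = sum(x ** cycle_count(p) for p in permutations(range(n)))
    assert total % factorial(n) == 0
    return total // factorial(n)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, [orbits_on_functions(n, x) == comb(x + n - 1, n) for x in range(1, 5)])
```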
For example, our analysis of the rotations of the cube shows that the cycle index
of this group (acting on faces) is
\[ \frac{1}{24}\left(s_1^6 + 3s_1^2 s_2^2 + 6s_1^2 s_4 + 6s_2^3 + 8s_3^2\right). \]
We use the notation
\[ Z(G; s_i \leftarrow f_i \text{ for } i = 1, \ldots, n) \]
for the result of substituting the expression $f_i$ for the indeterminate $s_i$ for $i = 1, \ldots, n$.
Theorem 7.3 If G acts on X, and we attach figures to the points of X with figure-
counting series A(x), then the function-counting series is given by
For example, in the coloured cubes, let Red have weight 1 and the other colours
weight 0. Then $A(x) = 2 + x$, and the function-counting series is
\begin{align*}
B(x) &= \frac{1}{24}\bigl((2 + x)^6 + 3(2 + x)^2 (2 + x^2)^2 + 6(2 + x)^2 (2 + x^4) \\
&\qquad + 6(2 + x^2)^3 + 8(2 + x^3)^2\bigr) \\
&= 10 + 12x + 16x^2 + 10x^3 + 6x^4 + 2x^5 + x^6.
\end{align*}
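The expansion can be reproduced with a few lines of polynomial arithmetic. In the sketch below a polynomial is a list of coefficients, each term of the cycle index is recorded as a coefficient together with its cycle lengths, and $s_i$ is replaced by $A(x^i) = 2 + x^i$. The data layout and names are our own.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def figure_series(i):
    """A(x^i) for A(x) = 2 + x, as a coefficient list."""
    return [2] + [0] * (i - 1) + [1]


# (coefficient, {cycle length: multiplicity}) for each term of 24 * Z(G).
CUBE_TERMS = [
    (1, {1: 6}),
    (3, {1: 2, 2: 2}),
    (6, {1: 2, 4: 1}),
    (6, {2: 3}),
    (8, {3: 2}),
]


def function_counting_series(terms, group_order, degree):
    """Substitute s_i <- A(x^i) in the cycle index and divide by |G|."""
    total = [0] * (degree + 1)
    for coefficient, monomial in terms:
        poly = [1]
        for length, multiplicity in monomial.items():
            for _ in range(multiplicity):
                poly = poly_mul(poly, figure_series(length))
        for d, c in enumerate(poly):
            total[d] += coefficient * c
    assert all(c % group_order == 0 for c in total)
    return [c // group_order for c in total]


if __name__ == "__main__":
    print(function_counting_series(CUBE_TERMS, 24, 6))
```

The coefficients sum to 57, the total number of colourings of the cube with three colours up to rotation.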
Proof The first step is to note that, if we ignore the group action and simply count
all the functions, the function-counting series is $B(x) = A(x)^n$, where $n = |X|$. For
the term in $x^m$ in $A(x)^n$ is obtained by taking all expressions $m = m_1 + \cdots + m_n$ for
$m$ as a sum of $n$ non-negative integers, multiplying the corresponding terms $a_{m_i} x^{m_i}$ in
$A(x)$, and summing the result. The indicated product counts the number of choices
of functions of weights $m_1, \ldots, m_n$ to attach at the points $1, \ldots, n$ of $X$, so the result
is indeed the function-counting series.
Note that this proves the theorem in the case where $G$ is the trivial group.
Next, we have to count the functions of given weight fixed by a permutation $g \in G$. As we have seen, a function is fixed by $g$ if and only if it is constant on the cycles
of $g$. Now if we choose a function of weight $r$ to attach to the points of a particular
$i$-cycle of $g$, the number of choices is $a_r$ but the contribution to the weight is $ir$.
Arguing as above, the generating function for the number of fixed functions is
\[ A(x)^{c_1(g)} A(x^2)^{c_2(g)} \cdots A(x^n)^{c_n(g)} = z(g; s_i \leftarrow A(x^i) \text{ for } i = 1, \ldots, n). \]
Finally, by the Orbit-counting Lemma, if we sum over $g \in G$ and divide by $|G|$,
we find that the function-counting series is
\[ B(x) = Z(G; s_i \leftarrow A(x^i) \text{ for } i = 1, \ldots, n). \]
labellings of $A$ is $n!/|\mathrm{Aut}(A)|$, since of the $n!$ labellings, two are the same if and
only if they are related by an automorphism of $A$. (More formally, labellings correspond bijectively to cosets of $\mathrm{Aut}(A)$ in the symmetric group $S_n$.) So the number
of labelled objects is
\[ \sum_{A} \frac{n!}{|\mathrm{Aut}(A)|}, \]
where the sum is over the unlabelled objects on $n$ points.
The cycle index method can be applied to give more sophisticated counts. For
example, let us count graphs on 4 vertices. The number of pairs of vertices is 6,
and each pair is either an edge or a non-edge. So the number of labelled graphs is
$2^6 = 64$, and the number of labelled graphs with $k$ edges is $\binom{6}{k}$ for $k = 0, \ldots, 6$.
In order to count orbits, we must let $S_4$ act on the set of 64 graphs. But we
can think of a graph as the set of $\binom{4}{2} = 6$ pairs of vertices with a figure (either an
edge or a non-edge) attached to each. So we must compute the cycle index of $S_4$
acting on pairs of vertices. Table 7.1 gives details. The notation $1^2 2^1$, for example,
means ‘two fixed points and one 2-cycle’. Such an element, say the transposition
$(1, 2)$, fixes the two pairs $\{1, 2\}$ and $\{3, 4\}$, and permutes the other four pairs in
two 2-cycles; so its cycle structure on pairs is $1^2 2^2$.
\[ Z(G) = \frac{1}{24}\left(s_1^6 + 9s_1^2 s_2^2 + 8s_3^2 + 6s_2 s_4\right). \]
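This cycle index can be recomputed by brute force. The sketch below lets each permutation of the vertices act on the pairs, records the cycle type of the induced permutation, and then substitutes $s_i \leftarrow 1 + x^i$ (weight 1 for an edge, weight 0 for a non-edge) to count graphs by number of edges. The function names are our own, and the substitution is the standard one for this figure-counting series rather than a quotation from the text.

```python
from collections import Counter
from itertools import combinations, permutations


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def pair_cycle_types(n):
    """Cycle types of S_n acting on 2-subsets, with multiplicities."""
    pairs = list(combinations(range(n), 2))
    index = {pair: i for i, pair in enumerate(pairs)}
    types = Counter()
    for g in permutations(range(n)):
        image = [index[tuple(sorted((g[a], g[b])))] for a, b in pairs]
        seen, lengths = set(), []
        for start in range(len(image)):
            if start not in seen:
                length, j = 0, start
                while j not in seen:
                    seen.add(j)
                    j = image[j]
                    length += 1
                lengths.append(length)
        types[tuple(sorted(lengths))] += 1
    return types


def graphs_by_edge_count(n):
    """Coefficient list: entry k is the number of unlabelled graphs with k edges."""
    types = pair_cycle_types(n)
    size = n * (n - 1) // 2
    total = [0] * (size + 1)
    for lengths, count in types.items():
        poly = [1]
        for i in lengths:
            poly = poly_mul(poly, [1] + [0] * (i - 1) + [1])  # 1 + x^i
        for d, c in enumerate(poly):
            total[d] += count * c
    order = sum(types.values())
    return [c // order for c in total]


if __name__ == "__main__":
    print(dict(pair_cycle_types(4)))
    print(graphs_by_edge_count(4))
```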
Now if we take edges to have weight 1 and non-edges to have weight 0 (that is,
7.5 Exercises
7.1 Use the Cycle Index Theorem to write down a polynomial in two variables $x$
and $y$ in which the coefficient of $x^i y^j$ is the number of cubes in which the faces are
coloured red, white and blue, having $i$ red and $j$ blue faces, up to rotations of the
cube.
7.2 Find a formula for the number of ways of colouring the faces of the cube
with r colours, up to rotations of the cube. Repeat this exercise for the other four
Platonic solids.
7.3 A necklace has ten beads, each of which is either black or white, arranged on
a loop of string. A cyclic permutation of the beads counts as the same necklace.
How many necklaces are there? How many are there if the necklace obtained by
turning over the given one is regarded as the same?
7.4 Repeat this question for necklaces with n beads of r possible colours.
7.5 Let $G$ act transitively on $X$, where $|X| = n > 1$.
(a) By considering the action of $G$ on ordered pairs of elements of $X$, show that
\[ \frac{1}{|G|} \sum_{g \in G} \mathrm{fix}(g)^2 \ge 2. \]
(c) Deduce that the proportion of elements of $G$ with no fixed points is at least
$1/n$.
F(x) = P(x + 1)
7.7 A necklace is made with two different colours of beads; there are $N$ beads
altogether. We specify that there are to be $k$ points on the necklace where the
colours change (that is, $k$ maximal runs of beads of the same colour), where $k$
is even. Show that, if the two colours are interchangeable, then the number of
different necklaces which can be produced is the coefficient of $x^N$ in
Möbius inversion
Theorem 8.1 The number of elements of $X$ lying in none of the sets $A_i$ is equal to
\[ \sum_{J \subseteq \{1, \ldots, n\}} (-1)^{|J|} |A_J|. \]
\[ \sum_{J \subseteq K} (-1)^{|J|}. \]
If $|K| = k > 0$, then there are $\binom{k}{j}$ sets of size $j$ in the sum, which is
\[ \sum_{j=0}^{k} \binom{k}{j} (-1)^j = (1 - 1)^k = 0, \]
whereas if $K = \emptyset$ then the sum is 1. So the points with $K = \emptyset$ (those lying in no set
$A_i$) each contribute 1 to the sum, and the remaining points contribute nothing. So
the theorem is proved.
For let $M$ and $N$ be the sets, with $N = \{1, \ldots, n\}$. Let $X$ be the set of all functions
$f : M \to N$, and $A_i$ the set of functions whose range does not include the point $i$.
Then $A_J$ is the set of functions whose range includes none of the points of $J$ (that
is, functions from $M$ to $N \setminus J$); so $|A_J| = (n - j)^m$ when $|J| = j$. A function is a
surjection if and only if it lies in none of the sets $A_i$. The result follows.
In particular, if $m = n$, then surjections are permutations, and we have
\[ \sum_{j=0}^{n} (-1)^j \binom{n}{j} (n - j)^n = n!. \]
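The argument gives $\sum_{j=0}^{n} (-1)^j \binom{n}{j} (n-j)^m$ for the number of surjections from an $m$-set to an $n$-set, since there are $\binom{n}{j}$ sets $J$ of size $j$. The sketch below compares this with a direct enumeration for small cases. The function names are ours.

```python
from itertools import product
from math import comb, factorial


def surjections(m, n):
    """Inclusion-Exclusion count of surjections from an m-set onto an n-set."""
    return sum((-1) ** j * comb(n, j) * (n - j) ** m for j in range(n + 1))


def surjections_brute_force(m, n):
    """Enumerate all functions and keep those whose range is everything."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)


if __name__ == "__main__":
    print(surjections(5, 3), surjections_brute_force(5, 3))
    print([surjections(n, n) == factorial(n) for n in range(1, 8)])
```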
Example: the problème des ménages We have some unfinished business from
Chapter 5: how many possible third rows of Latin rectangles which have first row
$(1, 2, \ldots, n)$ and second row $(2, 3, \ldots, n, 1)$ are there? In other words, we require a
permutation $(a_1, \ldots, a_n)$ so that $a_i$ does not belong to the $i$th column of the array
\[ \begin{pmatrix} 1 & 2 & \ldots & n-1 & n \\ 2 & 3 & \cdots & n & 1 \end{pmatrix}. \]
We will proceed slightly differently with the count: for r = 0, . . . , n, let Br be
the number of permutations (a1 , . . . , an ) for which ai ∈ {i, i + 1} holds for at least
a prescribed set of r values of i.
Now B1 = 2n(n − 1)!; for we can choose in n ways a position i for which
ai ∈ {i, i + 1} holds, in two ways which value of ai to take, and in (n − 1)! ways
the completion of the permutation.
To compute $B_r$ for $r > 1$, note that we may assume that either $a_1 = 1$ or $a_n = 1$.
In the first case, we consider the list
\[ (2, 2, 3, 3, \ldots, n - 1, n - 1, n) \]
consisting of the elements after the one chosen written column by column and
claim that the remaining $r - 1$ positions satisfying the condition form a selection of
$r - 1$ from this list of $2n - 3$ with no two consecutive. (Two consecutive numbers
either correspond to two entries from the same column, or two entries with the
same value.) This can be done in $\binom{2n-r-1}{r-1}$ ways (Exercise 1.1). The other choice
would give the same value, using the sequence $(1, 2, 2, \ldots, n - 1, n - 1)$ instead.
So there are $2n$ choices of starting point and $\binom{2n-r-1}{r-1}$ ways to choose the remaining values; but we must divide by $r$ since we arbitrarily chose one of the
distinguished $r$ points to be the start. Finally, there are $(n - r)!$ ways to choose the
remaining $n - r$ entries in the permutation. So
\[ B_r = \frac{2n}{r} \binom{2n - r - 1}{r - 1} (n - r)!. \]
Finally, applying PIE shows that the number of solutions of the problème des
ménages is
\[ M_n = n! - \sum_{r=1}^{n} (-1)^{r-1} \frac{2n}{r} \binom{2n - r - 1}{r - 1} (n - r)!. \]
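The formula for $M_n$ can be compared with a direct search. In the sketch below positions and values are numbered from 0, so the forbidden values in position $i$ are $i$ and $i + 1$ taken modulo $n$; this indexing and the function names are our own conventions.

```python
from itertools import permutations
from math import comb, factorial


def menage_formula(n):
    """M_n = n! - sum_{r=1}^{n} (-1)^(r-1) (2n/r) C(2n-r-1, r-1) (n-r)!."""
    correction = 0
    for r in range(1, n + 1):
        ways = 2 * n * comb(2 * n - r - 1, r - 1)
        assert ways % r == 0  # (2n/r) C(2n-r-1, r-1) is an integer
        correction += (-1) ** (r - 1) * (ways // r) * factorial(n - r)
    return factorial(n) - correction


def menage_brute_force(n):
    """Count permutations a with a[i] not in {i, i+1 mod n} for every i."""
    return sum(
        1
        for a in permutations(range(n))
        if all(a[i] != i and a[i] != (i + 1) % n for i in range(n))
    )


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, menage_formula(n), menage_brute_force(n))
```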
The statement of PIE can be generalised to give a formula for the number of el-
ements of X which lie in a given collection of sets Ai and not in the remaining ones
(see Exercise 8.2). Indeed, the same formula applies if the numbers concerned are
arbitrary real numbers rather than cardinalities of sets:
Theorem 8.2 Let real numbers aJ and bJ be given for each subset J of N =
{1, . . . , n}. Then the following are equivalent:
Proof The theorem asserts the form of the solution to a system of linear equa-
tions; in other words, the inverse of a certain matrix. However, the same matrix
occurs in the original form of PIE.
The theorem as stated involves sums over supersets of the given index set.
However, it is easily transformed to involve sums over subsets (see Exercise 8.3).
In this form, it is a generalisation of the inverse relationship between the triangular
matrix of binomial coefficients and the signed version (see Exercise 8.4).
• x ≤ x (reflexivity);
• if x ≤ y and y ≤ x then x = y (antisymmetry);
• if x ≤ y and y ≤ z then x ≤ z (transitivity).
where $x < y$ is short for $x \le y$ and $x \ne y$. (Note that antisymmetry implies that at
most one of these three conditions holds.)
The usual order relations on the natural numbers, integers, and real numbers are
total orders. An important example of a partial order is the relation of inclusion on
the set of all subsets of a given set. Other important examples of partially ordered
sets include
• the positive integers ordered by divisibility (that is, x ≤ y if and only if x | y);
• the subspaces of a finite vector space, ordered by inclusion. (This is known
as a projective space.)
Theorem 8.3 Any partial order on a set X can be extended to a total order on X.
This theorem is easily proved for finite sets: take any pair of elements x, y which
are incomparable in the given relation; set x ≤ y, and include all consequences of
transitivity (show that no conflicts arise from this); and repeat until all pairs are
comparable. It is more problematic for infinite sets; it cannot be proved from the
Zermelo–Fraenkel axioms, but requires an additional principle such as the Axiom
of Choice.
The upshot of the theorem for finite sets is that any finite partially ordered set
can be written as X = {x1 , . . . , xn } so that, if xi ≤ x j , then i ≤ j (but not necessarily
conversely). This is often possible in many ways. For example, the subsets of
{a, b, c}, ordered by inclusion, can be written as
\[ X_1 = \emptyset,\quad X_2 = \{a\},\quad X_3 = \{b\},\quad X_4 = \{c\}, \]
\[ X_5 = \{a, b\},\quad X_6 = \{a, c\},\quad X_7 = \{b, c\},\quad X_8 = \{a, b, c\}. \]
(These equations show that the way in which we extend the partial order to a total
order does not affect the definitions.)
The definitions of addition and multiplication work equally well for an infinite
locally finite poset (since the sum in the formula for multiplication is finite). So
the incidence algebra of a locally finite poset is defined.
The incidence algebra has an identity, the function $\iota$ given by
\[ \iota(x, y) = \begin{cases} 1 & \text{if } x = y, \\ 0 & \text{otherwise.} \end{cases} \]
(The matrix $A_\iota$ is the usual identity matrix.) Another important algebra element is
the zeta function $\zeta$, defined by
\[ \zeta(x, y) = \begin{cases} 1 & \text{if } x \le y, \\ 0 & \text{otherwise.} \end{cases} \]
Thus $\zeta$ is the characteristic function of the partial order, and an arbitrary function
$\alpha$ belongs to the incidence algebra if and only if
\[ \zeta(x, y) = 0 \Rightarrow \alpha(x, y) = 0. \]
A lower triangular matrix with ones on the diagonal has an inverse. The Möbius
function $\mu$ of a poset is the inverse of the zeta function. In other words, it satisfies
\[ \sum_{x \le z \le y} \mu(x, z) = \begin{cases} 1 & \text{if } x = y, \\ 0 & \text{otherwise.} \end{cases} \]
In particular, $\mu(x, x) = 1$ for all $x$. Moreover, if we know $\mu(x, z)$ for $x \le z < y$, then
we can calculate
\[ \mu(x, y) = - \sum_{x \le z < y} \mu(x, z). \]
In particular, we see that the values of the Möbius function are all integers.
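The recursive rule for $\mu(x, y)$ translates directly into a short program for any finite poset given by its elements and its order relation. The sketch below is one such translation, tried on the divisors of 60 ordered by divisibility and on the subsets of a 3-element set ordered by inclusion. The function name `mobius_function` and the two example posets are our own choices.

```python
from functools import lru_cache
from itertools import combinations


def mobius_function(elements, leq):
    """Return mu(x, y) for the finite poset (elements, leq)."""
    elements = tuple(elements)

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        # mu(x, y) = - sum of mu(x, z) over x <= z < y.
        return -sum(
            mu(x, z) for z in elements if z != y and leq(x, z) and leq(z, y)
        )

    return mu


# Divisors of 60, ordered by divisibility.
DIVISORS = [d for d in range(1, 61) if 60 % d == 0]
mu_div = mobius_function(DIVISORS, lambda a, b: b % a == 0)

# Subsets of {a, b, c}, ordered by inclusion.
SUBSETS = [frozenset(s) for r in range(4) for s in combinations("abc", r)]
mu_sub = mobius_function(SUBSETS, lambda s, t: s <= t)

if __name__ == "__main__":
    print({d: mu_div(1, d) for d in DIVISORS})
    print({tuple(sorted(s)): mu_sub(frozenset(), s) for s in SUBSETS})
```

All values produced are integers, in line with the remark above.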
Example: The integers In the poset of integers, with the usual order, the Möbius
function is given by
\[ \mu(x, y) = \begin{cases} 1 & \text{if } y = x; \\ -1 & \text{if } y = x + 1; \\ 0 & \text{otherwise.} \end{cases} \]
for n > 0. This is exactly the inductive step in the proof that
The third of these is the ‘classical’ Möbius function, and plays an important
role in number theory. If you see μ(z) without any further explanation, it probably
means the classical Möbius function. In this case, Möbius inversion can be stated
as follows:
Proposition 8.6 Let $f$ and $g$ be functions on the positive integers. Then the following are equivalent:
(a) $g(n) = \sum_{m \mid n} f(m)$;
\[ \sum_{d \mid n} \phi(n/d) = n, \]
From this it is easy to deduce that, if $n = p_1^{a_1} \cdots p_r^{a_r}$, where $p_i$ are distinct primes
and $a_i > 0$, then
\[ \phi(n) = p_1^{a_1 - 1}(p_1 - 1) \cdots p_r^{a_r - 1}(p_r - 1). \]
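Both statements about Euler's function are easy to test for small $n$. The sketch below computes $\phi(n)$ from its definition as the number of integers in $\{1, \ldots, n\}$ coprime to $n$, and compares it with the divisor sum and with the product formula. The function names are ours.

```python
from math import gcd


def phi(n):
    """Euler's function from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def prime_factorisation(n):
    """Return {prime: exponent} by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def phi_from_primes(n):
    """Product formula: prod p^(a-1) (p-1) over the prime powers p^a dividing n."""
    result = 1
    for p, a in prime_factorisation(n).items():
        result *= p ** (a - 1) * (p - 1)
    return result


def divisor_sum_of_phi(n):
    """Sum of phi(n/d) over the divisors d of n."""
    return sum(phi(n // d) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    print(all(divisor_sum_of_phi(n) == n for n in range(1, 101)))
    print(all(phi(n) == phi_from_primes(n) for n in range(1, 101)))
```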
\[ \sum_{m \mid n} m f_q(m) = q^n. \]
\[ \sum_{X \subseteq W} a(X) = q^{n(n-k)}, \]
since $\mu(W, V) = (-1)^k q^{k(k-1)/2}$ if $\dim(V) - \dim(W) = k$. (We have also used the
fact that $\binom{n}{k}_q = \binom{n}{n-k}_q$.) Now $a(V)$ is the required number of bijective linear maps;
a little manipulation gives the required formula. The exponent of $q$ in the $k$th term
is
\[ nk + \frac{(n - k)(n - k - 1)}{2} = \frac{n(n - 1)}{2} + \frac{k(k + 1)}{2}. \]
8.7 Exercises
8.1 Before doing this exercise, you should review Exercise 3.12.
(a) In how many ways can k identical sweets be given to n children if we require
that the number xi of sweets given to the ith child should satisfy xi ≤ bi , for
some numbers b1 , . . . , bn ?
(b) In how many ways can k identical sweets be given to n children if we require
that the number xi of sweets given to the ith child should satisfy ai ≤ xi ≤ bi ,
for some numbers a1 , b1 , . . . , an , bn ?
(In this exercise, you are not required to write down an explicit formula; descrip-
tion of a method for calculating the number will suffice.)
8.3 Prove that, with the hypotheses of Theorem 8.2, the following conditions are
equivalent:
8.4 By taking the numbers $a_J$ and $b_J$ of the preceding exercise to depend only on
the cardinality $j$ of $J$, show that the following statements are equivalent for two
sequences $(x_i)$ and $(y_i)$:
(a) $x_j = \sum_{i=0}^{j} \binom{j}{i} y_i$;
(b) $y_j = \sum_{i=0}^{j} (-1)^{j-i} \binom{j}{i} x_i$.
8.6 Prove that the solution $M_n$ to the problème des ménages satisfies the recurrence
\[ (n - 2)M_n = n(n - 2)M_{n-1} + nM_{n-2} + 4(-1)^{n-1} \]
for $n \ge 4$.
where C is the set of all chains from x to y, and l(c) is the length of c.
8.8 Let $d(n)$ be the number of divisors of the positive integer $n$. Prove that
\[ \sum_{m \mid n} d(m) \mu(n/m) = 1 \]
for $n > 1$.
8.9 Let P(X) denote the poset whose elements are the partitions of the set X,
with P ≤ Q if P refines Q (that is, every part of P is contained in a part of Q). Let
E be the partition into sets of size 1.
(b) Show that any interval [P, Q] is isomorphic to a product of posets of the form
P(X j ), and hence calculate μ(P, Q).
8.10 Let $G$ be the cyclic group consisting of all powers of the permutation
\[ g : 1 \to 2 \to \cdots \to n \to 1. \]
\[ Z(G) = \frac{1}{n} \sum_{m \mid n} \phi(n/m)\, s_{n/m}^{m}, \]
(Notice that, if q is a prime power, the second expression is equal to the number of
monic irreducible polynomials of degree n over GF(q). Finding a bijective proof
of this fact is much harder!)
(a) Suppose that $F$ and $G$ are multiplicative. Show that the function $H$ defined by
\[ H(n) = \sum_{k \mid n} F(k) G(n/k) \]
is multiplicative.
(b) Show that the Möbius and Euler functions are multiplicative.
(c) Let d(n) be the number of divisors of n, and σ (n) the sum of the divisors of n.
Show that d and σ are multiplicative.
8.13 The following problem, based on the children’s game ‘Screaming Toes’, was
suggested to me by Julian Gilbey.
n people stand in a circle. Each player looks down at someone else’s
feet (i.e., not at their own feet). At a given signal, everyone looks up
from the feet to the eyes of the person they were looking at. If two
people make eye contact, they scream. What is the probability of at
least one pair of people screaming?
Prove that the required probability is
\[ \sum_{k=1}^{\lfloor n/2 \rfloor} \frac{(-1)^{k-1} (n)_{2k}}{2^k\, k!\, (n - 1)^{2k}}, \]
Remark For more general approximate versions of PIE, see the paper by Linial
and Nisan in the bibliography.
9.1 The chromatic polynomial
Proposition 9.1 For a fixed (finite) loopless graph $\Gamma$ with $n$ vertices, the function
$q \mapsto P_\Gamma(q)$ is a monic polynomial of degree $n$.
Proof We can prove this, and find a formula for the polynomial, using the Princi-
ple of Inclusion and Exclusion, which we discussed in Chapter 8. Let E be the set
of edges, and X the set of all colourings (proper or not) of the vertices of the graph
with a fixed set of q colours. If f is a colouring, call an edge bad if its two vertices
have the same colour. In how many colourings does a given subset F of E consist
of bad edges?
Let $\Gamma(F)$ be the graph with the same vertex set as $\Gamma$ and with edge set $F$.
In any colouring in which the edges of $F$ are bad, any two vertices in the
same connected component of $\Gamma(F)$ have the same colour; so the number of such
colourings is $q^{c(F)}$, where $c(F)$ denotes the number of connected components of
$\Gamma(F)$.
So by PIE, the number of colourings with no bad edges is
Proof Let e = {v, w}. Consider the set of colourings of Γ\e, and partition this set
into two subsets:
Let S be the set of acyclic orientations of Γ\e. We divide S into two classes:
and the proof is complete. (In the penultimate line we use the deletion-contraction
formula for the chromatic polynomial of $\Gamma$, namely $P_\Gamma = P_{\Gamma \setminus e} - P_{\Gamma / e}$.)
The induction is on the number of edges, so we start the induction with null
graphs. The null graph on $n$ vertices has chromatic polynomial $q^n$ and has a single
acyclic orientation, and $(-1)^n (-1)^n = 1$ as required.
For any subset F of E, define the rank of F to be r(F) = n − c(F), where as before
c(F) is the number of connected components of the graph with edge set F. Said
otherwise, r(F) is the size of the largest subgraph of Γ(F) which is a forest (that
is, contains no cycles). Then the Tutte polynomial of Γ is
The next theorem gives some specialisations of the Tutte polynomial. A span-
ning subgraph of Γ is a graph having all the vertices and some of the edges of Γ;
that is, of the form Γ(F) for some F ⊆ E. It is called a spanning tree or spanning
forest if it is a tree or forest respectively.
Proof (a) Putting $x = y = 1$ selects only those terms in which the exponent of both
$x - 1$ and $y - 1$ is zero, that is, corresponding to subsets $F$ for which $|F| = r(F) = r(E)$. The second equality asserts that $\Gamma(F)$ is connected and the first that it is a
forest; so only trees are selected. Each tree contributes 1 to the evaluation.
(b) and (c) Similarly, putting x = 2 and y = 1 selects forests, while putting x = 1
and y = 2 selects connected graphs.
(d) is trivial since every term in the sum is 1.
(e) If x = 1 − q and y = 0, then the term corresponding to F in the sum is
using the fact that r(F) = n − c(F). Comparing with the Inclusion-Exclusion for-
mula (9.1) gives the result.
of f around any circuit in Γ is zero (where edges in the opposite direction to the
circuit are given a − sign). Note that, if we reverse the orientation of an edge, and
change the sign of the function f on that edge, the properties of being a flow or a
tension are unaltered. So the numbers of flows and tensions are independent of the
chosen orientation of the edges.
A flow or tension is nowhere-zero if it does not take the zero value (the identity
of the group A).
Proposition 9.5 Let A be an abelian group of order q. Then the number of nowhere-
zero tensions in a graph Γ with values in A is T (Γ; 1 − q, 0), while the number of
nowhere-zero flows in Γ with values in A is T (Γ; 0, 1 − q).
Remark Note that these numbers do not depend on the structure of the group A,
but only on its order.
There are many other counts associated with a graph, whose values are evaluations of the Tutte polynomial. For example, Stanley’s Theorem 9.3 shows that the
number of acyclic orientations of a graph $\Gamma$ is equal to $T(\Gamma; 2, 0)$.
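For a small graph, the Inclusion-Exclusion expansion of the chromatic polynomial, $P_\Gamma(q) = \sum_{F \subseteq E} (-1)^{|F|} q^{c(F)}$, and the count of acyclic orientations $(-1)^n P_\Gamma(-1)$ can both be checked by exhaustive search. The sketch below does this for a 4-cycle with one chord. The example graph and all function names are our own assumptions.

```python
from itertools import combinations, product


def components(n, edges):
    """Number of connected components of the graph on 0..n-1 with these edges."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in range(n)})


def chromatic(n, edges, q):
    """Sum over edge subsets F of (-1)^|F| q^c(F)."""
    return sum(
        (-1) ** size * q ** components(n, subset)
        for size in range(len(edges) + 1)
        for subset in combinations(edges, size)
    )


def proper_colourings(n, edges, q):
    """Direct count of colourings in which no edge is bad."""
    return sum(
        1
        for c in product(range(q), repeat=n)
        if all(c[a] != c[b] for a, b in edges)
    )


def is_acyclic(n, arcs):
    """Kahn's algorithm: repeatedly remove vertices of in-degree zero."""
    indegree = [0] * n
    for _, head in arcs:
        indegree[head] += 1
    stack = [v for v in range(n) if indegree[v] == 0]
    removed = 0
    while stack:
        v = stack.pop()
        removed += 1
        for tail, head in arcs:
            if tail == v:
                indegree[head] -= 1
                if indegree[head] == 0:
                    stack.append(head)
    return removed == n


def acyclic_orientations(n, edges):
    """Try every orientation of the edges and keep the acyclic ones."""
    count = 0
    for flips in product([False, True], repeat=len(edges)):
        arcs = [(b, a) if flip else (a, b) for (a, b), flip in zip(edges, flips)]
        if is_acyclic(n, arcs):
            count += 1
    return count


# Example graph: the 4-cycle 0-1-2-3 with the chord {0, 2}.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

if __name__ == "__main__":
    print([chromatic(4, EDGES, q) for q in range(5)])
    print(acyclic_orientations(4, EDGES), (-1) ** 4 * chromatic(4, EDGES, -1))
```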
appear below.
                       Order significant    Order not significant
With replacement       $q^n$                $\binom{q+n-1}{n}$
Without replacement    $(q)_n$              $\binom{q}{n}$
We can regard such a sample in which order is significant as a colouring of a graph
on n vertices with a set of q colours: for the first, second, . . . , nth vertex, we have to
choose a colour. Sampling with replacement gives an arbitrary colouring, which is
of course a proper colouring of the null graph (the graph with no edges). Sampling
without replacement gives a colouring in which all colours are distinct; this is a
proper colouring of the complete graph (in which all pairs of vertices are joined by
an edge).
A sample in which order is not significant gives rise to a colouring of the null or
complete graph, where we count two colourings as the same if some permutation
of the vertices carries the first to the second. In other words, for sampling with
order not significant, we are counting orbits of the symmetric group on proper
colourings of the null or complete graph.
Noting that the automorphism group of either of these graphs is the symmetric
group, we can now ask:
Given a graph Γ and a group G consisting of automorphisms of Γ,
what is the number of orbits of G on proper colourings of Γ with q
colours?
This question was considered in the paper by Cameron, Jackson and Rudd in the
bibliography, in which further details are given. Here we first consider the orbital
chromatic polynomial, which solves the question as stated, and then turn briefly to
the (more general) orbital Tutte polynomial.
Let Γ be a graph and G a group of automorphisms of Γ (a subgroup of Aut(Γ)).
Let PΓ,G (q) be the number of G-orbits on proper colourings of Γ with q colours.
By the Orbit-counting Lemma, we have
\[ P_{\Gamma,G}(q) = \frac{1}{|G|} \sum_{g \in G} \phi(\Gamma, g, q), \]
make the convention that, if a cycle of g contains an edge, then Γ/g has a loop on
the vertex resulting from shrinking this cycle, we conclude the following:
Theorem 9.7 Let A be a matrix over a principal ideal domain R, and suppose
that A has rank r. Then A can be converted, by row and column operations, into a
matrix D in which the first r elements d1 , . . . , dr on the main diagonal are non-zero
and satisfy di | di+1 for i = 1, . . . , r − 1; all other entries in the matrix are zero. The
elements d1 , . . . , dr are unique up to being replaced by associates.
The Smith normal form of A is the matrix D of the above theorem. The elements
d1 , . . . , dr and 0 (n − r times) are the invariant factors of A.
We now show that suitable specialisations of the orbital Tutte polynomial count
orbits of G on nowhere-zero flows and tensions, as well as on proper colourings.
If A is a finite abelian group and m a non-negative integer, we let αm (A) denote the
number of solutions of mx = 0 in A. Note that α0 (A) = |A| and α1 (A) = 1.
(a) If Γ is connected, then the number PΓ,G (q) of G-orbits on proper colourings
of Γ with q colours is obtained from OT (Γ, G) by substituting xi ← −1,
y0 ← q, yi ← 1 for i > 0, and multiplying by q.
(b) The number of G-orbits on nowhere-zero tensions with values in A is ob-
tained from OT (Γ, G) by substituting xi ← −1, yi ← αi (A).
(c) The number of G-orbits on nowhere-zero flows with values in A is obtained
from OT (Γ, G) by substituting xi ← αi (A), yi ← −1.
Remark The substitutions in (a) and (b) are not the same; that in (b) depends on
the structure of A, so the numbers of orbits on nowhere-zero flows or tensions do
depend on the structure of A if the group G is not the identity.
Proof We give a sketch of the proof for flows. (The version for colourings follows
from our earlier formula for the orbital chromatic polynomial, while the version
for tensions is very similar to that for flows.)
First, let $f$ be a column vector of elements of $A$, of length equal to the number $m$
of columns of the integer matrix $C$. How many solutions of $Cf = 0$ are there? Row
and column operations don’t change the number of solutions, so we can assume
that $C$ is in Smith normal form, with non-zero elements $d_1, \ldots, d_r$. Then the equations become $d_1 f_1 = 0, \ldots, d_r f_r = 0$, with $f_{r+1}, \ldots, f_m$ arbitrary. So the number of
solutions is $\alpha_{d_1}(A) \cdots \alpha_{d_r}(A)\, \alpha_0(A)^{m-r}$, which is just the indicator monomial $x(C)$
with $\alpha_i(A)$ substituted for $x_i$ for all $i$ (which we write $x_i \leftarrow \alpha_i(A)$).
The equations M f = 0 assert that f is a flow, while the equations (Pg − I) f = 0
assert that f is fixed by g. So the number of flows fixed by g is x(Mg ), with
xi ← αi (A) for all i.
\[ \sum_{F \subseteq E} (-1)^{|E \setminus F|} \bigl(x(M_g(F)) : x_i \leftarrow \alpha_i(A)\bigr). \]
\[ \sum_{F \subseteq E} x(M_g(F))\, y(M_g^*(E \setminus F)) \]
It is known that the matrix $M$ is totally unimodular; this means that any subdeterminant is equal to 0 or $\pm 1$. It follows that all the invariant factors are 0 or
1, so that $x(M) = x_0^{m-r} x_1^{r}$ (where $r$ is the rank of $M$). Since the row space of $M^*$
is the null space of $M$, and vice versa, a similar assertion holds for $M^*$. So, if $G$
is the trivial group, the orbital Tutte polynomial $OT(\Gamma, G)$ involves the variables
$x_0, x_1, y_0, y_1$ only. This is also an immediate consequence of Proposition 9.8.
It is now not difficult to show that the substitution x1 ← 1, y1 ← 1, x0 ← y − 1,
y0 ← x − 1 produces the usual Tutte polynomial of Γ.
Since the substitutions in parts (b) and (c) of Theorem 9.9 replace each of the
variables x0 , x1 , y0 , y1 by ±1 or |A|, we see again that the number of nowhere-zero
flows or tensions on Γ with values in A depends only on the order of A and not its
structure.
Example Let Γ be the complete graph on n vertices. Then L(Γ) has diagonal
entries n − 1 and off-diagonal entries −1. Thus, L(Γ) = nI − J, where J denotes
the all-1 matrix. We compute the eigenvalues of this matrix.
As we noted in general, the all-1 vector is an eigenvector of L(Γ) with
eigenvalue 0. We claim that the other n − 1 eigenvalues are all equal to n. To see this,
let v = (v1 , . . . , vn ) be any vector orthogonal to the all-1 vector, that is, satisfying
v1 + · · · + vn = 0. Then vJ = 0, and so v(nI − J) = nv. So the claim is proved.
Now the formula in part (c) of the Matrix-Tree Theorem shows that the number
of spanning trees is $n^{n-1}/n = n^{n-2}$.
The fact that the number of trees on n vertices is $n^{n-2}$ is Cayley’s Theorem.
We return to this theorem in the next chapter, where we will see several different
proofs of it.
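The count of spanning trees of the complete graph can be checked by machine for small n. The following is a minimal sketch, not the method of the text: it assumes the cofactor form of the Matrix-Tree Theorem (any cofactor of the Laplacian equals the number of spanning trees) rather than the eigenvalue form used above, and the function names are illustrative.

```python
from fractions import Fraction


def laplacian_complete(n):
    """Laplacian of the complete graph K_n: n-1 on the diagonal, -1 elsewhere."""
    return [[n - 1 if i == j else -1 for j in range(n)] for i in range(n)]


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


def spanning_trees_complete(n):
    """Delete the first row and column of the Laplacian and take the determinant."""
    lap = laplacian_complete(n)
    minor = [row[1:] for row in lap[1:]]
    return int(det(minor))


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, spanning_trees_complete(n), n ** (n - 2))
```

Exact rational arithmetic is used so that the comparison with $n^{n-2}$ is not affected by rounding.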
9.5 Exercises
9.1 Calculate the chromatic polynomials of the path and the cycle on n vertices.
9.2 Calculate the orbital chromatic polynomial PΓ,G (q), where
(a) Γ is a path with n vertices and G consists of the identity and the reflection;
(b) Γ is a cycle with n vertices and G is the cyclic group of order n consisting of
rotations of the cycle.
9.3 Show that, if G is any group of automorphisms of a graph Γ, and q a positive
integer, then PΓ,G (q) = 0 if and only if PΓ (q) = 0. Show that this is false if q is not
assumed to be a positive integer.
9.4 Compute the Tutte polynomial of the complete graph K4 on four vertices.
9.5 (a) Prove that, for an arbitrary graph Γ, the chromatic polynomial of Γ is
equal to $q^c\, T(\Gamma; 1 - q, 0)$, where c is the number of connected components of
Γ.
(b) Modify the other parts of Proposition 9.4 so that they work for arbitrary
graphs (not necessarily connected).
9.6 Let A be a real symmetric matrix with all row and column sums zero. Let J
denote the all-1 matrix. In this exercise we evaluate the determinant of B = A + J.
(a) Replace the first row by the sum of all the rows; this makes the entries in
the first row n and doesn’t change the other entries; the determinant is un-
changed.
(b) Replace the first column by the sum of all the columns. This makes the first
entry $n^2$, and the other entries in this column n, and doesn’t change the other
entries of the matrix; the determinant is unchanged.
(c) Subtract 1/n times the first row from each other row. The elements of the
first column, other than the first, become 0; we subtract 1 from all elements
not in the first row or column of B, leaving the entries of A; and the determi-
nant is unchanged.
Species
10.2 Species and counting 171
There are many different proofs of Cayley’s Theorem. Below, we will see two
proofs which are made clearer by means of the concept of species. But first, one
of the classics:
Proposition 10.2 In the tree with Prüfer code P, the valency of the vertex i is one
more than the number of occurrences of i in P.
For, at the conclusion of the second algorithm, if we add in the last two vertices
to L, then L contains each vertex precisely once; and edges join each of the first
n − 2 vertices of L to the corresponding vertex in P, together with an edge joining
the last two vertices of L.
Using this, one can count labelled trees with any prescribed degree sequence
(see Exercise 10.10).
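Proposition 10.2 can be tested exhaustively for small n. The sketch below assumes the standard decoding of a Prüfer code (repeatedly join the smallest unused leaf to the next code entry), which may differ in presentation from the second algorithm referred to above; the function names are illustrative.

```python
import heapq
from collections import Counter
from itertools import product


def prufer_to_edges(code, n):
    """Decode a Prüfer code (length n-2, entries in 1..n) into the edges of a tree."""
    degree = [1] * (n + 1)          # degree[v] = 1 + occurrences of v in the code
    for v in code:
        degree[v] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)  # smallest vertex that is currently a leaf
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Two vertices remain; they are joined by the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


def valencies(edges, n):
    """Valency of each vertex 1..n in the graph with the given edges."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return [count[v] for v in range(1, n + 1)]


def check_all_codes(n):
    """Check the valency rule for every code and return the number of distinct trees."""
    trees = set()
    for code in product(range(1, n + 1), repeat=n - 2):
        edges = prufer_to_edges(code, n)
        assert valencies(edges, n) == [1 + code.count(v) for v in range(1, n + 1)]
        trees.add(frozenset(frozenset(e) for e in edges))
    return len(trees)


if __name__ == "__main__":
    print(check_all_codes(5))
```

Since distinct codes give distinct trees, the number returned by `check_all_codes(n)` should agree with Cayley's count of labelled trees on n vertices.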
objects on the set {1, . . . , n}). By replacing the set of objects by the number of
labelled or unlabelled objects in the set, we obtain the usual generating functions
for such objects.
We say that two species are equivalent (written G ∼ H ) if there is a bijection
between the objects of the two species on a given point set such that the automor-
phism groups of corresponding objects are equal.
The most important formal power series associated with a species is its cycle
index, which is defined by the rule
\[ \tilde{Z}(\mathcal{G}) = \sum_{A \in \mathcal{G}} Z(\operatorname{Aut}(A)), \]
for the number Gn of labelled n-element G -objects (that is, objects on the point set
{1, . . . , n}); and the ordinary generating function
\[ g(x) = \sum_{n \ge 0} g_n x^n \]
for the number gn of unlabelled n-element G -objects (that is, isomorphism classes).
The first equation holds because putting $s_i = 0$ for all i > 1 kills all permutations
except the identity. The second holds because, with this substitution, each group
element contributes $x^n$, and the result is $\frac{1}{|G|} \sum_{g \in G} x^n = x^n$.
Example: Sets The species S has as its objects the finite sets, with one set of
each cardinality up to isomorphism. Its cycle index was calculated in Chapter 7:
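\[ \tilde{Z}(\mathcal{S}) = \sum_{n \ge 0} Z(S_n) = \exp\Bigl( \sum_{i \ge 1} \frac{s_i}{i} \Bigr). \]
\[ S(x) = \exp(x), \]
\[ s(x) = \exp\Bigl( \sum_{i \ge 1} \frac{x^i}{i} \Bigr) = \exp(-\log(1 - x)) = \frac{1}{1 - x}, \]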
in agreement with the fact that Sn = sn = 1 for all n ≥ 0.
Example: Total orders Let L be the species of total (or linear) orders. Each
n-set can be totally ordered in n! ways, all of which are isomorphic, and each of
which is rigid (that is, has the trivial automorphism group).
We have
\[ \tilde{Z}(\mathcal{L}) = \sum_{n \ge 0} s_1^n = \frac{1}{1 - s_1}, \]
so that
\[ L(x) = l(x) = \frac{1}{1 - x}. \]
\[ C(x) = 1 + \sum_{n \ge 1} \frac{x^n}{n} = 1 - \log(1 - x), \]
\[ c(x) = \sum_{n \ge 0} x^n = \frac{1}{1 - x}. \]
Example: Permutations An object of the species P consists of a set carrying
a permutation. We will see later how P can be expressed as a composition, from
which its cycle index can be deduced (Exercise 7.2). We have
\[ \tilde{Z}(\mathcal{P}) = \prod_{n \ge 1} (1 - s_n)^{-1}, \]
\[ P(x) = \frac{1}{1 - x}, \]
\[ p(x) = \prod_{n \ge 1} (1 - x^n)^{-1}. \]
The function p(x) is the generating function for number partitions. For, as
we saw earlier, an unlabelled permutation is the same as a conjugacy class of
permutations; and conjugacy classes are determined by their cycle structure.
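The identification of p(x) with the generating function for number partitions can be checked for small n. This is a minimal sketch with illustrative function names: one function expands the product $\prod_{n \ge 1} (1 - x^n)^{-1}$ term by term, the other counts cycle structures of permutations by brute force.

```python
from itertools import permutations


def partition_counts(limit):
    """Coefficients of prod_{n>=1} (1 - x^n)^(-1), up to and including x^limit."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        # Multiplying by 1/(1 - x^part) allows any number of parts of this size.
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p


def cycle_structure_count(n):
    """Number of distinct cycle structures among all permutations of n points."""
    structures = set()
    for perm in permutations(range(n)):
        seen = [False] * n
        lengths = []
        for start in range(n):
            if not seen[start]:
                length, v = 0, start
                while not seen[v]:
                    seen[v] = True
                    v = perm[v]
                    length += 1
                lengths.append(length)
        structures.add(tuple(sorted(lengths)))
    return len(structures)


if __name__ == "__main__":
    print(partition_counts(7))
    print([cycle_structure_count(n) for n in range(8)])
```

The two printed lists should agree, since a cycle structure on n points is the same thing as a partition of n.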
10.4.1 Products
Let G and H be species. We define the product K = G × H as follows: an
object of K on a set X consists of a distinguished subset Y of X, a G -object on Y ,
and a H -object on X \Y .
Since these objects are chosen independently, it is easy to check that
Z̃(G × H ) = Z̃(G )Z̃(H ).
Since the generating functions for labelled and unlabelled structures are speciali-
sations of the cycle index, we have similar multiplicative formulae for them.
For example, if S , G and G ◦ are the species of sets, graphs, and graphs with
no isolated vertices respectively, then
G ∼ S × G ◦.
10.4.2 Substitution
Let G and H be species. We define the substitution K = G [H ] as follows:
an object of K on a set X consists of a partition of X, an H -object on each part
\[ s_i \leftarrow \tilde{Z}(\mathcal{H};\ s_j \leftarrow s_{ij}) - 1 \]
for all i. (The −1 in the formula corresponds to removing the empty H -structure
before substituting.)
From this, we see that the exponential generating functions for labelled struc-
tures obey the simple substitution law:
The situation for unlabelled structures is more complicated, and k(x) cannot be
obtained from g(x) and h(x) alone. Instead, we have
This equation also follows from the Cycle Index Theorem, since we are count-
ing functions on G -structures where the figures are non-empty H -structures with
weight equal to cardinality.
For example, if S , P and C are the species of sets, permutations and circular
orders, then the standard decomposition of a permutation into disjoint cycles can
be written
P ∼ S [C ].
The counting series for labelled structures are given by
\[ S(x) = \sum_{n \ge 0} \frac{x^n}{n!} = \exp(x), \]
\[ P(x) = \sum_{n \ge 0} \frac{n!\,x^n}{n!} = \frac{1}{1 - x}, \]
\[ C(x) = 1 + \sum_{n \ge 1} \frac{(n - 1)!\,x^n}{n!} = 1 - \log(1 - x); \]
First proof Let L and P be the species of total (or linear) orders and permuta-
tions, respectively. These species are quite different, but have the property that the
numbers of labelled objects on n points are the same (namely n!).
Hence the numbers of labelled objects in the two species L [T ∗ ] and P[T ∗ ]
are equal. (Here T ∗ is the species of rooted trees.)
Consider an object in L [T ∗ ]. This consists of a linear order (x1 , . . . , xr ), with
a rooted tree Ti at xi for all i. I claim that this is equivalent to a tree with two
distinguished vertices. Take edges {xi , xi+1 } for i = 1, . . . , r − 1, and identify xi
with the root of Ti for all i. The resulting graph is a tree. Conversely, given a tree
with two distinguished vertices x and y, there is a unique path from x to y in the
tree, and the remainder of the tree consists of rooted trees attached to the vertices
of the path.
Now consider an object in P[T ∗ ]. Identify the root of each tree with a point
of the set on which the permutation acts, and orient each edge of this tree towards
the root. The resulting structure defines a function f on the point set, where
• if v is not a root, then f (v) is the unique vertex for which (v, f (v)) is a
directed edge of one of the trees.
Second proof As in the preceding proof, let T ∗ denote the species of rooted
trees. If we remove the root from a rooted tree, the result consists of an unordered
collection of trees, each of which has a natural root (at the neighbour of the root of
the original tree). Conversely, given a collection of rooted trees, add a new root,
joined to the roots of all the trees in the collection, to obtain a single rooted tree.
So, if E denotes the species consisting of a single 1-vertex structure, and S the
species of sets, we have
T ∗ ∼ E × S [T ∗ ].
Hence, for the exponential generating functions for labelled structures, we have
This is, formally, a recurrence relation for the coefficients of $T^*(x)$, and we need
to show that the nth coefficient is $n^{n-1}$. This can be done most easily with the
technique of Lagrange inversion, which is discussed in the next chapter.
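The recurrence can also be run by machine as a sanity check. The sketch below assumes that the relation for the exponential generating function reads $T^*(x) = x \exp(T^*(x))$ with zero constant term (which is what the product and substitution rules give for E × S[T*] when E(x) = x and S(x) = exp(x)); it solves this by fixed-point iteration on truncated power series. The function names are illustrative.

```python
from fractions import Fraction


def series_exp(a, limit):
    """exp of a power series a (with a[0] == 0), truncated after x^limit.

    Uses e' = a' e, that is, n e_n = sum_{k=1}^{n} k a_k e_{n-k}.
    """
    e = [Fraction(0)] * (limit + 1)
    e[0] = Fraction(1)
    for n in range(1, limit + 1):
        e[n] = sum(k * a[k] * e[n - k] for k in range(1, n + 1)) / n
    return e


def rooted_tree_counts(limit):
    """Numbers of labelled rooted trees on 1..limit vertices, from T = x exp(T)."""
    t = [Fraction(0)] * (limit + 1)
    # Each pass fixes one more coefficient, so `limit` passes are enough.
    for _ in range(limit):
        e = series_exp(t, limit)
        t = [Fraction(0)] + e[:limit]      # multiply exp(T) by x
    counts = []
    factorial = 1
    for n in range(1, limit + 1):
        factorial *= n
        counts.append(int(t[n] * factorial))  # e.g.f. coefficient times n!
    return counts


if __name__ == "__main__":
    print(rooted_tree_counts(6))
```

For small n the output can be compared directly with $n^{n-1}$; this is only a check, not a substitute for the proof by Lagrange inversion.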
It turns out that mathematics does provide a language to describe this, namely
category theory. It would take us too far afield to give all the definitions here. In
essence, a category consists of a collection of objects with a collection of mor-
phisms between them. In the only case with which we deal, objects are sets and
morphisms are set mappings. In particular, the class S whose objects are all finite
sets and whose morphisms are all bijections between them satisfies the axioms for
a category.
Now a species is simply a functor F from S to itself. This means that F
associates to each finite set S a set F(S), and to each bijection f : S → S′ a
bijection F( f ) : F(S) → F(S′), such that F respects composition and identity (that is,
F( f1 f2 ) = F( f1 )F( f2 ) and F(1S ) = 1F(S) , where 1S is the identity map on S).
The standard reference on species (apart from Joyal’s original paper) is the
book by Bergeron, Labelle and Leroux.
(c) Fn (G) ≤ Fn+1 (G), with equality if and only if Fn+1 (G) = 1.
(d) fn (G) ≤ fn+1 (G).
Theorem 10.6 Let C be a class of finite relational structures. Suppose that the
following conditions hold:
(a) C is closed under isomorphism;
(b) C is closed under taking induced substructures;
(c) C contains only countably many members up to isomorphism;
(d) C has the amalgamation property.
So now we know exactly which species arise in this way, and which have
enumeration problems equivalent to orbit-counting problems for oligomorphic groups:
they are precisely the Fraïssé classes, those which satisfy the conditions (a)–(d) of
Fraïssé’s Theorem.
Many classes satisfy these conditions: among them are the classes of finite
partitions, graphs, directed graphs, tournaments, partially ordered sets, and so on.
For a slightly different example, consider finite sets which are partitioned into
sets of size at most 2, the parts being totally ordered. This can be shown to be a
Fraïssé class. An n-element structure is specified up to isomorphism by an ordered
sequence of 1s and 2s with sum n; so the number of n-element unlabelled structures
in the class is the Fibonacci number $F_n$.
Investigations on enumeration of Fraïssé classes (that is, counting orbits of
oligomorphic permutation groups) suggest that the counting sequences for such
classes should grow rapidly and smoothly. There are some general results pointing
in this direction. Here is just one such result, due to Francesca Merola (improving
a result of Dugald Macpherson). A permutation group G on a set X is said to
be primitive if there is no non-trivial partition of X which is invariant under G.
(The trivial partitions are the partition with a single part and the partition into
singletons.)
Theorem 10.7 There is a universal constant c > 1 (in fact, c > 1.324) with the
following property. Let G be a primitive oligomorphic permutation group, and
assume that $f_n(G) > 1$ for some n. Then $f_n(G) \ge c^n/p(n)$ and $F_n(G) \ge n!\,c^n/q(n)$,
where p and q are polynomials depending on G.
Remark The prototype of a structure to which the theorem of Engeler et al. ap-
plies is the ordered set (Q, <) of rational numbers. A famous theorem of Cantor
asserts that Q is, up to isomorphism, the unique countable totally ordered set which
is
• dense, that is, given x < y, we can find z with x < z < y; and
• without least or greatest element, that is, for any x we can find y and z with
y < x and x < z.
The definition of a total order, together with the two conditions in the above bullet
points, are first-order; so (Q, <) is countably categorical. You are invited to prove
that its automorphism group is oligomorphic.
10.8 Weights
The theory of species allows us to interpret the statement
‘Every graph is uniquely expressible as the disjoint union of connected
graphs’
as the relation G ∼ S [C ], where S , G and C are the species of sets, graphs and
connected graphs respectively.
We can get extra information if the objects in our species carry weights. The
weights must satisfy some restrictions to make the process work nicely. Rather
than describe these in general, I consider one special case. We give each edge a
weight (which, in general, may differ from edge to edge, but in the special case
here will be the same for each edge). Now the weight of a graph, or connected
graph, is just the product of its edge weights; and so the weight of a graph is the
product of the weights of its connected components. This means that
Gw ∼ S [Cw ],
where Gw and Cw are the species of weighted graphs and weighted connected
graphs respectively.
Now specialise further, to the case where w = −1, and consider the exponential
generating function for labelled objects. For graphs on n vertices, if n > 1, the
generating function is
\[ \sum_{E \subseteq \binom{\{1,\dots,n\}}{2}} (-1)^{|E|} = \sum_{i=0}^{\binom{n}{2}} (-1)^i \binom{\binom{n}{2}}{i} = (1 - 1)^{\binom{n}{2}} = 0. \]
So $G_w(x) = 1 + x$. If $c_{n,k}$ is the number of connected labelled graphs with n vertices
and k edges, then
\[ C_w(x) = \sum_n \Bigl( \sum_{k=0}^{\binom{n}{2}} (-1)^k c_{n,k} \Bigr) \frac{x^n}{n!}. \]
And as we saw earlier, S(x) = exp(x).
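So we have
\[ 1 + x = \exp\Bigl( \sum_n \Bigl( \sum_{k=0}^{\binom{n}{2}} (-1)^k c_{n,k} \Bigr) \frac{x^n}{n!} \Bigr), \]
or in other words,
\[ \sum_n \Bigl( \sum_{k=0}^{\binom{n}{2}} (-1)^k c_{n,k} \Bigr) \frac{x^n}{n!} = \log(1 + x) = \sum_{n \ge 1} \frac{(-1)^{n-1}}{n}\, x^n. \]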
Equating coefficients, we find the pretty formula for the alternating sum of the
numbers of connected graphs:
\[ \sum_{k=0}^{\binom{n}{2}} (-1)^k c_{n,k} = (-1)^{n-1} (n - 1)!. \]
(The lower limit in the sum can be taken to be n − 1, since this is the smallest
number of edges in a connected graph.)
Note that the terms in the alternating sum are much larger than the final value.
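For small n the alternating sum can be computed directly. The following brute-force sketch (function names are illustrative) enumerates every labelled graph on n vertices, keeps the connected ones, and tallies them by number of edges; it is only practical for n up to about 6.

```python
from itertools import combinations
from math import factorial


def find(parent, v):
    """Root of v in a union-find forest, with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def connected_counts(n):
    """List c with c[k] = number of connected labelled graphs on n vertices with k edges."""
    pairs = list(combinations(range(n), 2))
    counts = [0] * (len(pairs) + 1)
    for mask in range(1 << len(pairs)):      # each bitmask is one edge set
        parent = list(range(n))
        edges, components = 0, n
        for index, (a, b) in enumerate(pairs):
            if (mask >> index) & 1:
                edges += 1
                ra, rb = find(parent, a), find(parent, b)
                if ra != rb:
                    parent[ra] = rb
                    components -= 1
        if components == 1:
            counts[edges] += 1
    return counts


def alternating_sum(n):
    """Sum over k of (-1)^k c_{n,k}."""
    return sum((-1) ** k * c for k, c in enumerate(connected_counts(n)))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, connected_counts(n), alternating_sum(n), (-1) ** (n - 1) * factorial(n - 1))
```

Printing `connected_counts(n)` alongside the sum makes the cancellation visible: the individual terms are much larger than the final value.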
10.9 Exercises
10.3 Show that the cycle index for the species C of circular structures is
\[ \tilde{Z}(\mathcal{C}) = 1 - \sum_{m \ge 1} \frac{\varphi(m)}{m} \log(1 - s_m). \]
10.4 Use the result of the preceding exercise, and the fact that cn = 1 for all n
(where cn is the number of unlabelled n-element structures in C ) to prove the
identity
\[ \prod_{m \ge 1} (1 - x^m)^{-\varphi(m)/m} = \exp\bigl( x/(1 - x) \bigr). \]
10.5 Suppose that gn is the number of unlabelled n-element objects in the species G .
Show that the generating function for unlabelled structures in S [G ] is
\[ \prod_{n \ge 1} (1 - x^n)^{-g_n}. \]
Verify this combinatorially in the case G = S . How would you describe the ob-
jects of S [S ]?
P ∼ S × D.
Use this to deduce the exponential generating function for the number of derange-
ments of an n-set.
10.7 Let G be a species. The Stirling numbers of G are the numbers S(G )(n, k),
defined to be the number of partitions of an n-set into k parts with a G -object on
each part.
(a) Prove that, for G = S , C and L respectively, the Stirling numbers are respec-
tively the Stirling numbers S(n, k) of the second kind, the unsigned Stirling
numbers |s(n, k)| of the first kind, and the Lah numbers L(n, k) respectively.
(b) Let S(G ) be the lower triangular matrix of Stirling numbers of G . Prove that
(c) Let (an ) and (bn ) be sequences of positive integers with exponential gener-
ating functions A(x) and B(x) respectively. Prove that the following two
conditions are equivalent:
• $a_0 = b_0$ and $b_n = \sum_{k=1}^{n} S(\mathcal{G})(n, k)\, a_k$ for n ≥ 1;
• $B(x) = A(G(x) - 1)$.
10.8 A forest is a graph whose connected components are trees. Show that there
is a bijection between labelled forests of rooted trees on n vertices, and labelled
rooted trees on n + 1 vertices with root n + 1.
Hence show that, if a forest of rooted trees on n vertices is chosen at random,
then the probability that it is connected tends to the limit 1/e as n → ∞.
Remark It is true but harder to prove that the analogous limit for unrooted trees
is $1/\sqrt{e}$.
U(x) = exp(2x),
u(x) = (1 − x)−2 .
10.10 Count the labelled trees in which the vertex i has valency ai for 1 ≤ i ≤ n,
where a1 , . . . , an are positive integers with sum 2n − 2.
10.12 Let L and E denote the species of linear orders and of sets of size at most 2
respectively. Prove that the number of labelled n-element structures in L [E ] is the
Fibonacci number Fn .
10.13 (a) Prove the first three parts of Proposition 10.4. (The fourth part is
more difficult.)
(b) Prove Proposition 10.5.
(c) Prove that the relational structure on X whose relations are the orbits on
n-tuples of a permutation group G on X is homogeneous.
10.14 (a) Prove that the class of finite graphs is a Fraïssé class.
(b) Prove that the class of finite posets is a Fraïssé class.
(c) Prove that the class of structures consisting of a finite set partitioned into
parts of size at most 2 with the set of parts totally ordered (mentioned in the
text in connection with Fibonacci numbers) is a Fraïssé class.
10.15 Let (a1 , . . . , an ) and (b1 , . . . , bn ) be two n-tuples of rational numbers, both
in strictly increasing order. Show that there exists an order-preserving permutation
of Q carrying ai to bi for i = 1, . . . , n. Deduce that, for G = Aut(Q, <), we have
\[ f_n(G) = 1, \quad F_n(G) = n!, \quad F_n^*(G) = \sum_{k=1}^{n} S(n, k)\, k!. \]
In this chapter, we say a bit more about asymptotics, prove Stirling’s formula,
consider the use of complex analysis in combinatorics, and finally describe sub-
additive and submultiplicative functions, for which the shape of the asymptotics is
easy to establish, although precise details may be more difficult.
188 Analytic methods: a first look
for i ≥ 0. (So in particular F(n) ∼ G0 (n), F(n) − G0 (n) ∼ G1 (n), and so on. Note
that Gi (n) = o(Gi−1 (n)) for all i.)
The definition comes with a couple of warnings:
• an asymptotic series is not necessarily convergent;
• it is not necessarily the case that taking more terms in the series gives a better
approximation to F(n) for a fixed n.
Typically, F is a combinatorial enumeration function, and G a combination of
standard functions of analysis. In the next section, we prove Stirling’s formula as
an example.
Let f (x) = log x, let g(x) be the function whose value is log m for m − 1 < x ≤ m,
and let h(x) be the function defined by the polygon with vertices (m, log m), for
1 ≤ m ≤ n. Clearly
\[ \int_1^n g(x)\,dx = \log 2 + \cdots + \log n = \log n!. \]
The difference between the integrals of g and h is the sum of the areas of
triangles with base 1 and total height log n; that is, $\tfrac{1}{2} \log n$.
Some calculus (given at the end of this proof) shows that the difference between
the integrals of f and h tends to a finite limit c as n → ∞.
Finally, a simple integration shows that
\[ \int_1^n f(x)\,dx = n \log n - n + 1. \]
We conclude that
so that
\[ n! \sim \frac{C\, n^{n+1/2}}{e^n}. \]
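A quick numerical experiment shows how the ratio $n!\,e^n / n^{n+1/2}$ behaves. This is only a sketch with an illustrative function name; it works with logarithms (via `math.lgamma`, which gives log n! as lgamma(n + 1)) to avoid overflow.

```python
import math


def stirling_ratio(n):
    """The ratio n! e^n / n^(n + 1/2), computed through logarithms."""
    log_factorial = math.lgamma(n + 1)
    return math.exp(log_factorial - ((n + 0.5) * math.log(n) - n))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 10**6):
        print(n, stirling_ratio(n))
    print("sqrt(2*pi) =", math.sqrt(2 * math.pi))
```

The printed values decrease slowly and appear to settle near 2.5066, which is consistent with the known value $\sqrt{2\pi}$ of the constant in Stirling's formula.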
To identify the constant C, we can proceed as follows. Consider the integral
\[ I_n = \int_0^{\pi/2} \sin^n x \, dx. \]
where the last inequality comes from another application of Taylor’s Theorem
which yields $\log(1 + x) \ge x - x^2/2$ for x ∈ [0, 1]. Now $\sum (1/m^2)$ converges, so
the integral is bounded.
A sequence has a lim sup if and only if it is bounded. (Take r to be the infimum of
the set of values y for which only finitely many terms $x_n$ are greater than y. This set
is non-empty (it contains any upper bound for the sequence) and bounded below (a
lower bound for the sequence is also a lower bound for the set), and so the infimum
exists; it clearly has the required property.)
There is an analogous dual notion of limit inferior or lim inf.
Theorem 11.2 Suppose that $A(z) = \sum a_n z^n$ defines a function which is analytic in
some neighbourhood of the origin in the complex plane. Suppose that the smallest
modulus of a singularity of A(z) is R. Then $\limsup |a_n|^{1/n} = 1/R$.
This shows that all but finitely many $|a_n|$ are bounded by $(c + \varepsilon)^n$, but infinitely
many are not bounded by $(c - \varepsilon)^n$, where c = 1/R. We conclude that $a_n$ is
‘roughly’ $c^n$. (However, it is not true to say that $a_n \sim c^n$: why?)
On the other hand, if A(z) is analytic everywhere, then $|a_n| \le \varepsilon^n$ for $n > n_0(\varepsilon)$,
for any positive ε. Indeed, $a_n = o(\varepsilon^n)$ for any positive ε.
For example, if B(n) is the nth Bell number, then
\[ \sum_{n \ge 0} \frac{B(n)\, z^n}{n!} = e^{e^z - 1}, \]
f (m + n) ≤ f (m) + f (n).
Note that the final conclusion requires that c > 0. The function $f(n) = n^a$ is
subadditive for any positive real number a < 1; we have $\lim_{n \to \infty} f(n)/n = 0$, but
the asymptotics of f are not determined by this fact.
We do not meet many subadditive functions in enumerative combinatorics. A
much more common situation is that we have a function which is submultiplica-
tive, in the following sense.
Let f be a function from the natural numbers to the positive reals; assume that
f (0) = 1. We say that f is submultiplicative if, for all positive integers m, n we
have
f (m + n) ≤ f (m) f (n).
\[ c = \lim_{n \to \infty} f(n)^{1/n} \]
The final statement of the proposition asserts that f (n) is ‘roughly’ $c^n$, that
is, f (n) grows exponentially with n; but it is not correct to conclude from our
hypothesis that $f(n) \sim c^n$, as we will see.
Here is a very simple example. Let f (n) be the number of strings of zeros
and ones of length n containing no two consecutive ones. We claim that f is
submultiplicative. For consider the strings of length m+n satisfying the constraint.
The first m positions form a string of length m with no two consecutive ones; there
are f (m) possibilities for this. The last n positions form a string of length n with no
two consecutive ones; there are f (n) possibilities. So there are f (m) f (n) possible
concatenations of such strings; not all pass the test, since we have to rule out the
possibility that the first string ends with 1 while the second begins with 1. So
certainly f (m + n) ≤ f (m) f (n).
In fact, as we saw in Chapter 2, we have $f(n) \sim A c^n$, where $c = (1 + \sqrt{5})/2$
and $A = (3 + \sqrt{5})/(2\sqrt{5})$.
The result obtained using submultiplicativity is of course much less precise,
but much more general; it holds for the number of binary strings with an arbitrary
finite collection of forbidden substrings, for example.
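The example can be explored numerically. The sketch below (illustrative names) counts the strings by tracking the last symbol, checks the submultiplicative inequality over a range of m and n, and prints $f(n)^{1/n}$ next to the golden ratio.

```python
import math


def count_no_11(n):
    """Number of binary strings of length n with no two consecutive ones."""
    end0, end1 = 1, 0   # the empty string is treated as ending in 0
    for _ in range(n):
        # A 0 may follow anything; a 1 may only follow a 0.
        end0, end1 = end0 + end1, end0
    return end0 + end1


def is_submultiplicative(limit):
    """Check f(m + n) <= f(m) f(n) for all 1 <= m, n <= limit."""
    return all(
        count_no_11(m + n) <= count_no_11(m) * count_no_11(n)
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
    )


if __name__ == "__main__":
    print([count_no_11(n) for n in range(8)])
    print(is_submultiplicative(30))
    golden = (1 + math.sqrt(5)) / 2
    for n in (10, 50, 200):
        print(n, count_no_11(n) ** (1.0 / n), golden)
```

The nth roots approach the golden ratio only slowly, which illustrates why the limit of $f(n)^{1/n}$ says much less than an asymptotic formula for f(n) itself.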
11.5 Exercises
11.1 Prove that $n^k = o(c^n)$ for any constants k > 0 and c > 1, and that
$\log n = o(n^\varepsilon)$ for any ε > 0.
(b) Define a relation ≤ on the set of equivalence classes of ≡ by the rule that
[x] ≤ [y] if R(x, y) holds. Show that this relation is well-defined (independent
of the choice of representatives of the equivalence classes) and is a total
order.
(c) Show that, given any equivalence relation on X and any total order on the set
of equivalence classes, there is a unique preorder on X which gives rise to
them in the manner just described.
(Hence P(n) is the number of ways of choosing a partition of the set {1, . . . , n} and
a total order of the set of parts.)
11.3 A self-avoiding walk of length n on the square lattice in the plane is a se-
quence of n + 1 points P0 , . . . , Pn with integer coordinates, such that
• P0 = (0, 0);
• for i = 1, . . . , n, we have Pi − Pi−1 ∈ {(1, 0), (0, 1), (−1, 0), (0, −1)};
• the points P0 , . . . , Pn are all distinct.
In other words, start at the origin, take n steps each one unit east, north, west or
south, such that no point is ever revisited.
Let f (n) be the number of self-avoiding walks of length n. Prove that
(a) $4 \cdot 2^{n-1} \le f(n) \le 4 \cdot 3^{n-1}$;
(b) f is submultiplicative.
Deduce that $c = \lim_{n \to \infty} f(n)^{1/n}$ exists, and that 2 ≤ c ≤ 3.
Remark The problem of actually finding the value of the limit c is extremely
difficult!
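Small values of f (n) can be found by exhaustive search, which gives a feel for the bounds in the exercise without proving them. The sketch below is a plain backtracking enumeration with an illustrative function name; the running time grows exponentially, so it is only suitable for small n.

```python
def count_self_avoiding_walks(n):
    """Number of self-avoiding walks of length n on the square lattice, from the origin."""
    steps = ((1, 0), (0, 1), (-1, 0), (0, -1))

    def extend(point, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        x, y = point
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            if nxt not in visited:       # never revisit a point
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)      # backtrack
        return total

    return extend((0, 0), {(0, 0)}, n)


if __name__ == "__main__":
    for n in range(1, 9):
        f = count_self_avoiding_walks(n)
        print(n, f, 4 * 2 ** (n - 1), 4 * 3 ** (n - 1), f ** (1.0 / n))
```

The last column shows $f(n)^{1/n}$ drifting downwards inside the interval from 2 to 3, as the exercise predicts.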
Further topics
In this section we consider some further topics, in rather less detail: Lagrange in-
version, Bernoulli numbers, the Euler–Maclaurin sum formula, and analytic tech-
niques due to Hayman, Meir and Moon, and Bender.
g(y) = yφ (g(y)).
where
\[ b_n = \left[ \frac{d^{n-1}}{dx^{n-1}}\, \phi(x)^n \right]_{x=0}. \]
\[ f(x) = \sum_{n \ge m} a_n x^n, \]
where m may be positive or negative. If the series is not identically zero, we may
assume without loss of generality that $a_m \ne 0$, in which case m is the valuation of
f , written
\[ m = \operatorname{val}(f). \]
We define addition, multiplication, composition, differentiation, etc., for formal
Laurent series as for formal power series. In particular, f (g(x)) is defined for any
formal Laurent series f , g with val(g) > 0. (This is less trivial than the analogous
result for formal power series.) In particular, we need to know that $g(x)^{-m}$ exists
as a formal Laurent series for m > 0. It is enough to deal with the case m = 1,
since certainly $g(x)^m$ exists. If val(g) = r, then $g(x) = x^r g_1(x)$, and so
$g(x)^{-1} = x^{-r} g_1(x)^{-1}$, and we have seen that $g_1(x)^{-1}$ exists as a formal power series, since
$g_1(0)$ is invertible.
We denote the derivative of the formal Laurent series f (x) by f ′(x).
We also introduce the following notation: $[x^n] f(x)$ denotes the coefficient of
$x^n$ in the formal power series (or formal Laurent series) f (x). The case n = −1 is
especially important, as we learn from complex analysis. The value of $[x^{-1}] f(x)$
is called the residue of f (x), and is also written as Res f (x).
Everything below hinges on the following simple observation, which is too
trivial to need a proof.
Proposition 12.2 For any formal Laurent series f (x), we have Res f ′(x) = 0.
Now the following result describes the residue of the composition of two formal
Laurent series.
Theorem 12.3 (Residue Composition Theorem) Let f (x), g(x) be formal Lau-
rent series satisfying val(g) = r > 0. Then
Proof It is enough to consider the case where f (x) = xn , since Res is a linear
function.
Suppose that n ≠ −1, so that the right-hand side is zero. Then
\[ \operatorname{Res}\bigl( g^n(x)\, g'(x) \bigr) = \frac{1}{n+1} \operatorname{Res} \frac{d}{dx}\, g^{n+1}(x) = 0. \]
So consider the case where n = −1. Let $g(x) = a x^r h(x)$, where a ≠ 0 and
h(0) = 1. Then
It is tempting to say
\[ g'(x)\, g(x)^{-1} = \frac{d}{dx} \log g(x) = \frac{d}{dx} \bigl( \log a + r \log x + \log h(x) \bigr) = \frac{r}{x} + \frac{d}{dx} \log h(x), \]
but this is not valid; log g(x) may not exist as a formal Laurent series. Consider
this point carefully; an error here would lead to the incorrect conclusion
that Res(g′(x)/g(x)) = 0.
From the Residue Composition Theorem, we can prove a more general version
of Lagrange Inversion.
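\[ g(x) = x\, \phi(g(x)) \]
has a unique formal power series solution g(x). Moreover, for any Laurent series f , we
have
\[ [x^n] f(g(x)) = \begin{cases} \dfrac{1}{n}\, [x^{n-1}] \bigl( f'(x)\, \phi(x)^n \bigr) & \text{if } n \ge \operatorname{val}(f) \text{ and } n \ne 0, \\[1ex] f(0) + \operatorname{Res}\bigl( f'(x) \log(\phi(0)^{-1} \phi(x)) \bigr) & \text{if } n = 0. \end{cases} \]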
Proof Let Φ(x) = x/φ (x), so that Φ(g(x)) = x and val(Φ(x)) = 1. Then g is the
inverse function of Φ.
We have
where we have made the substitution x = Φ(y) (so that y = g(x)) and used the
Residue Composition Theorem.
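For n ≠ 0, we have
\begin{align*}
[x^n] f(g(x)) &= -\frac{1}{n}\, [y^{-1}]\, f(y)\, \bigl( \Phi(y)^{-n} \bigr)' \\
&= \frac{1}{n}\, [y^{-1}]\, f'(y)\, \Phi(y)^{-n} \\
&= \frac{1}{n}\, [y^{n-1}]\, f'(y)\, \phi^n(y).
\end{align*}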
n
Call the right-hand side of this equation f (x). Now, if S is the sum that we want to
evaluate, then
\begin{align*}
S &= [y^{2n}]\, (1 + y)^j \sum_{k=0}^{n} \binom{2n+1}{2k+1} (1 + y)^k \\
&= \operatorname{Res}\, y^{-(2n+1)} (1 + y)^j f((1 + y)^{1/2}).
\end{align*}
Now we do the following rather strange thing: make the substitution $y = z^2(z^2 - 2)$.
Then val(y(z)) = 2, and $(1 + y)^{1/2} = 1 - z^2$. So the Residue Composition
Theorem gives
\begin{align*}
S &= \operatorname{Res}\, (z^2 - 1)^{2j} \left( \frac{1}{(z^2 - 2)^{2n+1}} + \frac{1}{z^{4n+2}} \right) z \\
&= \operatorname{Res}\, (z^2 - 1)^{2j} z^{-(4n+1)} \\
&= [z^{4n}]\, (z^2 - 1)^{2j} \\
&= \binom{2j}{2n},
\end{align*}
as required. (In the second line we have used the fact that $(z^2 - 2)^{-(2n+1)}$ is a
formal power series, so that the term involving it has residue zero.)
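The identity obtained here can be checked directly for small parameters. The sketch below assumes that the sum being evaluated is $S = \sum_{k=0}^{n} \binom{2n+1}{2k+1} \binom{j+k}{2n}$, which is what the coefficient extraction $[y^{2n}]\,(1 + y)^j \sum_k \binom{2n+1}{2k+1} (1 + y)^k$ amounts to; the function name is illustrative.

```python
from math import comb


def binomial_sum(n, j):
    """Sum over k of C(2n+1, 2k+1) * C(j+k, 2n), read off as the y^(2n) coefficient."""
    return sum(comb(2 * n + 1, 2 * k + 1) * comb(j + k, 2 * n) for k in range(n + 1))


if __name__ == "__main__":
    for n in range(4):
        for j in range(6):
            print(n, j, binomial_sum(n, j), comb(2 * j, 2 * n))
```

In every row the last two columns should coincide, in agreement with the value $\binom{2j}{2n}$ found by the residue calculation.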
since the term $B_{n+1}$ cancels from this equation (which expresses $B_n$ in terms of
earlier terms).
Conway and Guy, in The Book of Numbers, have a typically elegant presentation
of the Bernoulli numbers. They write this relation as
\[ (B + 1)^{n+1} = B^{n+1} \]
Theorem 12.5 The exponential generating function for the Bernoulli numbers is
\[ \sum_{n \ge 0} \frac{B_n x^n}{n!} = \frac{x}{\exp(x) - 1}. \]
Proof Let F(x) be the e.g.f., and consider F(x)(exp(x) − 1). The coefficient of
$x^{n+1}/(n + 1)!$ is
\[ (n + 1)! \sum_{k=0}^{n} \frac{B_k}{k!} \cdot \frac{1}{(n + 1 - k)!} = \sum_{k=0}^{n} \binom{n+1}{k} B_k = 0 \]
for n ≥ 1. (Note that the sum runs from 0 to n rather than n + 1 because we
subtracted the constant term from the exponential.) The coefficient of x, however,
is clearly 1. So the product is x.
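The relation $\sum_{k=0}^{n} \binom{n+1}{k} B_k = 0$ for n ≥ 1, together with $B_0 = 1$, determines the Bernoulli numbers and is easy to run by machine. The following sketch (illustrative function name) solves the relation for $B_n$ in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(limit):
    """Return [B_0, ..., B_limit], using sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1."""
    b = [Fraction(1)]
    for n in range(1, limit + 1):
        # The k = n term is (n + 1) B_n; move everything else to the other side.
        earlier = sum(comb(n + 1, k) * b[k] for k in range(n))
        b.append(-earlier / (n + 1))
    return b


if __name__ == "__main__":
    for n, value in enumerate(bernoulli_numbers(12)):
        print(n, value)
```

With this convention the value of $B_1$ comes out as −1/2, and the odd-indexed numbers beyond $B_1$ vanish.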
\[ a_n = \frac{(-1)^n\, n!}{n + 1}. \]
One application of the Bernoulli numbers is in Faulhaber’s formula for the sum
of the kth powers of the first n natural numbers. Everyone knows that
\[ \sum_{i=1}^{n} i = n(n + 1)/2, \]
\[ \sum_{i=1}^{n} i^2 = n(n + 1)(2n + 1)/6, \]
\[ \sum_{i=1}^{n} i^3 = n^2 (n + 1)^2/4, \]
Theorem 12.8
\[ \sum_{i=1}^{n} i^k = \frac{1}{k+1} \sum_{j=0}^{k} \binom{k+1}{j} B_j\, (n + 1)^{k+1-j}. \]
Proof This argument is written out in the shorthand notation of Conway and Guy.
Check that you can turn it into a more conventional proof!
We calculate
\[ (n + 1 + B)^{k+1} - (n + B)^{k+1} = \sum_{j=1}^{k+1} \binom{k+1}{j}\, n^{k+1-j} \bigl( (B + 1)^j - B^j \bigr). \]
Thus we have
\[ \frac{1}{k+1} \bigl( (n + 1 + B)^{k+1} - (n + B)^{k+1} \bigr) = n^k, \]
from which by induction we obtain
\[ \frac{1}{k+1} \bigl( (n + 1 + B)^{k+1} - B^{k+1} \bigr) = \sum_{i=1}^{n} i^k. \]
as required.
Warning Conway and Guy use a non-standard definition of the Bernoulli num-
bers, as a result of which they have B1 = 1/2 rather than −1/2. As a result, their
formulae look a bit different.
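Faulhaber's formula in the form of Theorem 12.8 can be checked against direct summation. The sketch below assumes the convention $B_1 = -1/2$ (not that of Conway and Guy) and recomputes the Bernoulli numbers from the relation $\sum_{k=0}^{m} \binom{m+1}{k} B_k = 0$ for m ≥ 1; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(limit):
    """[B_0, ..., B_limit] with B_1 = -1/2, from sum_{k=0}^{m} C(m+1, k) B_k = 0."""
    b = [Fraction(1)]
    for m in range(1, limit + 1):
        b.append(-sum(comb(m + 1, k) * b[k] for k in range(m)) / (m + 1))
    return b


def power_sum_by_formula(n, k):
    """Right-hand side of Theorem 12.8 for the sum of kth powers of 1..n (k >= 1)."""
    b = bernoulli_numbers(k)
    total = sum(comb(k + 1, j) * b[j] * (n + 1) ** (k + 1 - j) for j in range(k + 1))
    return total / (k + 1)


def power_sum_direct(n, k):
    """Left-hand side: 1^k + 2^k + ... + n^k."""
    return sum(i ** k for i in range(1, n + 1))


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, [power_sum_by_formula(n, k) for n in range(1, 6)],
              [power_sum_direct(n, k) for n in range(1, 6)])
```

The check is run for k ≥ 1 only, since the formula as stated counts one term too many when k = 0.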
How large are the Bernoulli numbers? The generating function x/(exp(x) − 1)
has a removable singularity at the origin; apart from this, the nearest singularities
are at ±2πi, and so $B_n$ is about $n!\,(2\pi)^{-n}$; in fact, it can be shown that
\[ |B_n| = \frac{2\, n!\, \zeta(n)}{(2\pi)^n} \]
for n even, where $\zeta(n) = \sum_{k \ge 1} k^{-n}$. Of course, $B_n = 0$ if n is odd and n > 1.
Another curious formula for Bn is due to von Staudt and Clausen:
\[ B_{2n} = N - \sum_{p-1 \mid 2n} \frac{1}{p} \]
for some integer N, where the sum is over the primes p for which p − 1 divides 2n.
\[ \frac{x \exp(tx)}{\exp(x) - 1} = \sum_{n \ge 0} \frac{B_n(t)\, x^n}{n!}. \]
Proof All parts are easy exercises. Let F(t) = x exp(tx)/(exp(x) − 1).
(a) F(0) is the e.g.f. for the regular Bernoulli numbers, and F(1) = x + F(0).
(b) F(t + 1) − F(t) = x exp(tx).
(c) F′(t) = xF(t).
(d) F(t) = F(0) exp(xt): use the rule for multiplying e.g.f.s.
Then
\[ \sum_{i=1}^{n} f(i) = S_k - R_k, \]
where the error term is
\[ R_k = \int_1^n \frac{B_{2k}(\{t\})}{(2k)!}\, f^{(2k)}(t)\, dt, \]
with $B_{2k}(t)$ the Bernoulli polynomial and $\{t\} = t - \lfloor t \rfloor$ the fractional part of t.
Proof First let g be any function with continuous (2k)th derivative on [0, 1]. We
claim that
\[ \frac{1}{2} \bigl( g(0) + g(1) \bigr) - \int_0^1 g(t)\, dt = \sum_{i=1}^{k} \frac{B_{2i}}{(2i)!} \bigl( g^{(2i-1)}(1) - g^{(2i-1)}(0) \bigr) - \int_0^1 \frac{B_{2k}(t)}{(2k)!}\, g^{(2k)}(t)\, dt. \]
The proof is by induction: both the start of the induction (at k = 1) and the
inductive step are done by integrating the last term by parts twice, using the fact that
$B_n'(t) = n B_{n-1}(t)$ (see Proposition 12.9).
Now the result is obtained by applying this claim successively to the functions
g(x) = f (x + 1), g(x) = f (x + 2), . . . , g(x) = f (x + n), and adding.
If f is a polynomial, then $f^{(2k)}(x) = 0$ for sufficiently large k, and the remainder term vanishes, giving Faulhaber’s formula. For other applications, we must estimate the size of the remainder term.
There are various analytic conditions which guarantee a bound on the size
of Rk , so that it can be shown that we have an asymptotic series for the sum. I
will not give precise conditions here.
\[ c + n \log n - n + \frac{1}{2} \log n + \sum_{k \ge 1} \frac{B_{2k}}{2k(2k-1)n^{2k-1}} \]
for
\[ \sum_{i=1}^{n} \log i = \log n!. \]
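A few terms of this series already give $\log n!$ to high accuracy. The sketch below assumes the standard value $c = \frac{1}{2}\log(2\pi)$ for the constant (it is not stated in the text above) and compares a truncated series with `math.lgamma(n + 1)`; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, lgamma, log, pi

def bernoulli(m):
    """Return [B_0, ..., B_m] with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B

def correction_term(n, k):
    """The k-th term B_2k / (2k (2k - 1) n^(2k - 1)) of the series."""
    return float(bernoulli(2 * k)[2 * k]) / (2 * k * (2 * k - 1) * n ** (2 * k - 1))

def log_factorial_series(n, terms):
    """Truncated series for log n!, with c = log(2 pi) / 2 (assumed value)."""
    c = 0.5 * log(2 * pi)
    tail = sum(correction_term(n, k) for k in range(1, terms + 1))
    return c + n * log(n) - n + 0.5 * log(n) + tail

print(log_factorial_series(10, 3), lgamma(11))
# For fixed n the terms eventually grow, so the series cannot converge:
print([abs(correction_term(1, k)) for k in (1, 5, 10)])
```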
where the sum begins $1/(2n) - 1/(12n^2) + 1/(120n^4) + \cdots$. Here γ is Euler’s constant, a somewhat mysterious constant with value approximately 0.5772157. Again the series is divergent for fixed n.
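The quality of the first few terms can be seen numerically. This sketch assumes that the series in question approximates the harmonic number $H_n = \sum_{i=1}^n 1/i$ and that its leading part is $\log n + \gamma$; the value of γ is hard-coded to more digits than quoted in the text.

```python
from fractions import Fraction
from math import log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded

def harmonic(n):
    """Exact harmonic number H_n as a fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def harmonic_series_estimate(n):
    """log n + gamma plus the first three correction terms quoted in the text."""
    return (log(n) + EULER_GAMMA
            + 1 / (2 * n) - 1 / (12 * n ** 2) + 1 / (120 * n ** 4))

for n in (5, 10, 100):
    print(n, float(harmonic(n)), harmonic_series_estimate(n))
```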
and let
\[ \frac{\mathrm{Li}_k(1 - e^{-x})}{1 - e^{-x}} = \sum_{n=0}^{\infty} B_n^{(k)} \frac{x^n}{n!}. \]
The numbers $B_n^{(k)}$ are the poly-Bernoulli numbers of order k.
Kaneko gave a couple of nice formulae for the poly-Bernoulli numbers of negative order, of which one is relevant here:
Theorem 12.11 (Kaneko)
\[ B_n^{(-k)} = \sum_{j=0}^{\min(n,k)} (j!)^2\, S(n+1, j+1)\, S(k+1, j+1). \]
This formula has the (entirely non-obvious) corollary that these numbers have a symmetry property: $B_n^{(-k)} = B_k^{(-n)}$ for all non-negative integers n and k.
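Kaneko's formula is easy to evaluate. In the sketch below, S(n, k) is taken to be the Stirling number of the second kind, computed from the usual recurrence; the function names are illustrative. With the formula written out, the symmetry is visible in the code as well: swapping n and k leaves the summand unchanged.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind, via S(n,k) = k S(n-1,k) + S(n-1,k-1)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def poly_bernoulli_neg(n, k):
    """Poly-Bernoulli number of order -k and index n, by Kaneko's formula."""
    return sum(factorial(j) ** 2 * stirling2(n + 1, j + 1) * stirling2(k + 1, j + 1)
               for j in range(min(n, k) + 1))

# Print a small table; rows are n, columns are k.
for n in range(4):
    print([poly_bernoulli_neg(n, k) for k in range(4)])
```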
Now we have a somewhat unexpected connection with a counting problem:
the poly-Bernoulli numbers of negative order evaluate the chromatic polynomial
of complete bipartite graphs at −1 (see Theorem 9.3). This application is taken
from an unpublished paper of the author with Celia Glass and Robert Schumacher.
Theorem 12.12 The number of acyclic orientations of $K_{n_1,n_2}$ is $B_{n_1}^{(-n_2)}$.
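For very small complete bipartite graphs the theorem can be confirmed by brute force. The following sketch is an illustration, not the method of the paper cited: it tries every orientation of $K_{a,b}$, tests acyclicity by repeatedly deleting sources, and compares the count with Kaneko's formula. It is exponential in ab, so it is only sensible for a, b ≤ 3 or so.

```python
from functools import lru_cache
from itertools import product
from math import factorial

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def poly_bernoulli_neg(n, k):
    """Poly-Bernoulli number of order -k and index n, by Kaneko's formula."""
    return sum(factorial(j) ** 2 * stirling2(n + 1, j + 1) * stirling2(k + 1, j + 1)
               for j in range(min(n, k) + 1))

def is_acyclic(vertices, edges):
    """A digraph is acyclic iff we can keep deleting vertices with no incoming edge."""
    remaining, edges = set(vertices), set(edges)
    while remaining:
        sources = {v for v in remaining if not any(head == v for _, head in edges)}
        if not sources:
            return False  # every remaining vertex has an incoming edge: a cycle exists
        remaining -= sources
        edges = {(tail, head) for tail, head in edges if tail not in sources}
    return True

def acyclic_orientations(a, b):
    """Count acyclic orientations of the complete bipartite graph K_{a,b}."""
    left = [("L", i) for i in range(a)]
    right = [("R", j) for j in range(b)]
    pairs = [(u, v) for u in left for v in right]
    count = 0
    for bits in product((0, 1), repeat=len(pairs)):
        edges = [(u, v) if bit else (v, u) for (u, v), bit in zip(pairs, bits)]
        count += is_acyclic(left + right, edges)
    return count

for a, b in [(1, 1), (2, 2), (2, 3), (3, 3)]:
    print(a, b, acyclic_orientations(a, b), poly_bernoulli_neg(a, b))
```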
(d) If f and g are H-admissible, then exp( f (x)) and f (x)g(x) are H-admissible.
Theorem 12.15 Let $f(x) = \sum_{n \ge 0} f_n x^n$ be H-admissible. Let $a(x) = x f'(x)/f(x)$ and $b(x) = x a'(x)$, and let $r_n$ be the smallest positive root of the equation $a(x) = n$. Then
\[ f_n \sim \frac{1}{\sqrt{2\pi b(r_n)}}\, f(r_n)\, r_n^{-n}. \]
Example: Stirling’s formula Take $f(x) = \exp(x)$ (we have noted that this function is admissible), so that $f_n = 1/n!$. Now $a(x) = x = b(x)$, and $r_n = n$. Thus
\[ \frac{1}{n!} \sim \frac{1}{\sqrt{2\pi n}}\, e^n n^{-n}, \]
which is just Stirling’s formula the other way up!
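The recipe of Theorem 12.15 is mechanical enough to code. The sketch below is a minimal illustration under assumptions: the caller supplies f, a and b as ordinary Python functions, a is assumed to be increasing on the search interval so that bisection finds the root $r_n$, and the remainder factor is taken as $1/\sqrt{2\pi b(r_n)}$. It is applied only to $f = \exp$, as in the example.

```python
from math import exp, factorial, pi, sqrt

def hayman_estimate(f, a, b, n, lo=1e-9, hi=1e3):
    """Estimate the coefficient f_n as f(r) r^(-n) / sqrt(2 pi b(r)), where a(r) = n.

    Assumes a is increasing on [lo, hi] with a(lo) < n < a(hi).
    """
    for _ in range(200):  # bisection for the root of a(x) = n
        mid = (lo + hi) / 2
        if a(mid) < n:
            lo = mid
        else:
            hi = mid
    r = (lo + hi) / 2
    return f(r) * r ** (-n) / sqrt(2 * pi * b(r))

# f(x) = exp(x): a(x) = b(x) = x, and the estimate is Stirling's formula for 1/n!.
for n in (5, 20, 50):
    estimate = hayman_estimate(exp, lambda x: x, lambda x: x, n)
    print(n, (1 / factorial(n)) / estimate)
```

The printed ratios approach 1 from below as n grows.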
Example: Rooted trees The generating function y = T ∗ (x) for labelled rooted
trees satisfies
y = x exp(y).
\[ \frac{T_n^*}{n!} \sim \frac{1}{\sqrt{2\pi}}\, n^{-3/2} e^n. \]
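Both the functional equation and the estimate can be explored with exact power series arithmetic. The sketch below is an illustration: it solves $y = x \exp(y)$ by iterating on truncated series (each pass fixes one more coefficient), confirms that the coefficients agree with Cayley's count $T_n^* = n^{n-1}$ of labelled rooted trees, and compares with the estimate above. The truncation degree 12 is an arbitrary choice.

```python
from fractions import Fraction
from math import exp, factorial, pi, sqrt

def exp_series(y, N):
    """Coefficients up to x^N of exp(y(x)), where y has zero constant term.

    Uses E' = y' E, i.e. n e_n = sum_{j=1..n} j y_j e_(n-j).
    """
    e = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        e[n] = sum(j * y[j] * e[n - j] for j in range(1, n + 1)) / n
    return e

def rooted_tree_egf(N):
    """Coefficients up to x^N of the solution of y = x exp(y)."""
    y = [Fraction(0)] * (N + 1)
    for _ in range(N):  # each iteration fixes one more coefficient
        y = [Fraction(0)] + exp_series(y, N)[:N]
    return y

N = 12
y = rooted_tree_egf(N)
for n in range(1, N + 1):
    estimate = n ** -1.5 * exp(n) / sqrt(2 * pi)
    print(n, y[n] * factorial(n), float(y[n]) / estimate)
```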
\[ n! \sim \sqrt{2\pi}\, \frac{n^{n+1/2}}{e^n}, \]
Theorem 12.17 Suppose that the function y = f(x) is defined implicitly by the equation F(x, y) = 0, and let $f(x) = \sum_{n \ge 0} f_n x^n$. Suppose that there exist real numbers ξ and η such that
(c) the only solution of $F(x, y) = F_y(x, y) = 0$ with |x| ≤ ξ and |y| ≤ η is (x, y) = (ξ, η).
Then
\[ f_n \sim C n^{-3/2} \xi^{-n}, \]
where
\[ C = \sqrt{\frac{\xi F_x(\xi, \eta)}{2\pi F_{yy}(\xi, \eta)}}. \]
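As an illustration of how the constants are found, take $F(x, y) = y^2 - y + x$, an example chosen here and not taken from the text. Solving $F = F_y = 0$ by hand gives $\eta = 1/2$, $\xi = 1/4$, with $F_x = 1$ and $F_{yy} = 2$, and the solution $y = (1 - \sqrt{1 - 4x})/2$ has coefficients $f_n = \binom{2n-2}{n-1}/n$. The sketch assumes that the hypotheses of the theorem not reproduced above hold for this F, and simply compares the estimate with the exact coefficients.

```python
from math import comb, pi, sqrt

def singularity_estimate(xi, Fx, Fyy, n):
    """C n^(-3/2) xi^(-n) with C = sqrt(xi Fx / (2 pi Fyy)), derivatives taken at (xi, eta)."""
    C = sqrt(xi * Fx / (2 * pi * Fyy))
    return C * n ** -1.5 * xi ** -n

def exact_coefficient(n):
    """Coefficient of x^n in (1 - sqrt(1 - 4x)) / 2, for n >= 1."""
    return comb(2 * n - 2, n - 1) // n

# Values for F(x, y) = y^2 - y + x, computed by hand: xi = 1/4, F_x = 1, F_yy = 2.
for n in (10, 50, 200):
    print(n, exact_coefficient(n) / singularity_estimate(0.25, 1.0, 2.0, n))
```

The ratios tend to 1, at a rate of roughly 1/n.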
12.7 Exercises
12.1 (a) Let f be a formal power series with constant term zero and coefficient
of x equal to 1. Suppose that all coefficients of f are integers. Show that the
unique formal power series g such that g( f (x)) = x has all its coefficients
integers.
(b) Show that the preceding statement holds if we replace the integers by an
arbitrary commutative ring with identity.
(c) Let R be a commutative ring with identity, and let N(R) denote the set of formal power series $\sum a_n x^n$ in R[[x]] with $a_0 = 0$ and $a_1 = 1$. Show that N(R), with the operation of composition, is a group.
(d) Let R = Z. Find the inverse of $x - x^2$ in N(R).
Remark The group N(R) is known to group theorists as the Nottingham group over R.
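12.2 Let $s_n$ be the number of involutions in $S_n$ (permutations of {1, . . . , n} whose square is the identity). We showed in Chapter 4 that
\[ \sum_{n \ge 0} \frac{s_n x^n}{n!} = \exp\Bigl(x + \frac{x^2}{2}\Bigr). \]
\[ s_n \sim \frac{1}{\sqrt{2}} \Bigl(\frac{n}{e}\Bigr)^{n/2} e^{\sqrt{n} - 1/4}. \]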
12.3 Let W (x) be the generating function for the Catalan numbers, shifted back
one place:
\[ W(x) = \sum_{n \ge 0} C_{n+1} x^n. \]
for n ≥ 1. Reconcile this with the formula for Catalan numbers found in Chapter 4.
In this chapter I list a few books, papers and websites which may be useful if you
would like to follow up some of the things I have discussed.
a formula for the terms of the sequence or its generating function if known, and several articles by the editor Neil Sloane and others describing uses of the Encyclopedia in research. I have used it myself on a number of occasions.
Index