
Alex Scott's Combinatorics Notes

The document discusses chains, antichains, and shadows in set systems. It presents proofs of Sperner's Lemma, which states that an antichain in P(n) has size at most $\binom{n}{\lfloor n/2 \rfloor}$, the size of a middle layer. It first proves this using Hall's Theorem from graph theory and a lemma that P(n) can be partitioned into $\binom{n}{\lfloor n/2 \rfloor}$ chains. It also gives a second proof of Sperner's Lemma via the LYM Inequality and discusses related topics such as shadows, symmetric chains, and the Kruskal-Katona Theorem.


Combinatorics

Part C

Alex Scott
Michaelmas 2020
Contents

1 Introduction and notation
1.1 Notes
1.2 Notation and basic definitions

2 Chains, antichains and shadows
2.1 Sperner’s Lemma, LYM Inequality and Dilworth’s Theorem
2.2 Symmetric chains and Littlewood-Offord
2.3 Shadows and Kruskal-Katona

3 Intersections and traces
3.1 Erdős-Ko-Rado and the Two Families Theorem
3.2 VC-dimension
3.3 A brief interlude on upsets and downsets
3.4 More on intersecting families
3.5 Borsuk’s Conjecture

4 Combinatorial Nullstellensatz

Chapter 1

Introduction and notation

1.1 Notes
These notes are intended to complement (but not replace) the Part C lecture
course in Combinatorics. In particular, they do not give all the details that
are explained in lectures.
The notes owe much to the book Combinatorics by Béla Bollobás, and to
notes from two courses given elsewhere by Imre Leader. Thanks are due to
Damiano Soardo and Albert Slawinksi, who contributed to earlier versions
of the notes.
Please send any further corrections/suggestions to scott at [Link].

1.2 Notation and basic definitions


For a set X, we write P(X) = {A : A ⊆ X} for its power set (i.e. the
set of all its subsets). A family of sets F on a ground set X is a sub-
set F ⊆ P(X). A family of sets is also called a set system or a hyper-
graph. For instance, with ground set X = {1, 2, 3, 4}, we could have the
hypergraph F = {1, 13, 124, 34}: note that we have dropped some brack-
ets and commas for convenience (strictly speaking we should write F =
{{1}, {1, 3}, {1, 2, 4}, {3, 4}}, but this is more cumbersome).
Most of this course concerns properties of set systems with a finite ground
set. So from now on all sets will be assumed to be finite unless stated other-
wise.

For $n \ge 1$, we write $[n] := \{1, 2, \dots, n\}$. We will usually write $\mathcal{P}(n)$ instead of $\mathcal{P}([n])$. Note that $|\mathcal{P}(n)| = 2^n$.
We will refer to sets of size $k$ as $k$-sets. For $k \ge 0$ we write $X^{(k)} = \{A \in \mathcal{P}(X) : |A| = k\}$ for the set of all subsets of $X$ of size $k$. We define $X^{(<k)}$, $X^{(\le k)}$, $X^{(>k)}$ and $X^{(\ge k)}$ in the obvious way. A family $\mathcal{F} \subseteq \mathcal{P}(X)$ is $k$-uniform if $\mathcal{F} \subseteq X^{(k)}$.
We think of $X^{(0)}, X^{(1)}, \dots$ as the layers of $\mathcal{P}(X)$, and refer to $X^{(i)}$ as the $i$th layer. So the $i$th layer of $\mathcal{P}(n)$ is $[n]^{(i)}$, which has cardinality $\binom{n}{i}$. The smallest layers of $\mathcal{P}(n)$ are the 0th layer and the $n$th layer: these are $\{\emptyset\}$ and $\{[n]\}$ respectively, and both have size 1. If $n$ is even, the largest layer is $[n]^{(n/2)}$; if $n$ is odd there are two largest layers, $[n]^{(\lfloor n/2 \rfloor)}$ and $[n]^{(\lceil n/2 \rceil)}$.
There are many ways of looking at the power set P(n). For instance:

• We can think of $\mathcal{P}(n)$ as a collection of sets with a partial order given by set containment: $A \subseteq B$. We will talk more about this later.

• We can turn $\mathcal{P}(n)$ into a graph: the discrete cube $Q_n$ is the graph with vertex set $\mathcal{P}(n)$ and an edge between $A$ and $B$ if and only if $|A \triangle B| = 1$. Here $A \triangle B$ is the symmetric difference: $A \triangle B := (A \setminus B) \cup (B \setminus A)$.

• We can turn $\mathcal{P}(n)$ into an abelian group or a vector space $\mathbb{F}_2^n$ by identifying each $A \subseteq [n]$ with its characteristic vector

$$\chi_A = (\chi_A(1), \dots, \chi_A(n)) \in \{0, 1\}^n,$$

where

$$\chi_A(i) := \begin{cases} 1 & i \in A \\ 0 & i \notin A. \end{cases}$$

We will move back and forward between these perspectives throughout the
course.
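
The three perspectives above can be tried out on small examples. The following is a minimal Python sketch; the function names `char_vector`, `add_mod2` and `adjacent_in_cube` are illustrative choices, not notation from the course. It checks that addition in $\mathbb{F}_2^n$ corresponds to symmetric difference of sets.

```python
def char_vector(a, n):
    """Characteristic vector of a subset a of {1, ..., n}, as a tuple of 0s and 1s."""
    return tuple(1 if i in a else 0 for i in range(1, n + 1))


def add_mod2(u, v):
    """Coordinatewise addition in F_2^n."""
    return tuple((x + y) % 2 for x, y in zip(u, v))


def adjacent_in_cube(a, b):
    """A and B are adjacent in the discrete cube iff |A symmetric-difference B| = 1."""
    return len(a ^ b) == 1


n = 4
A, B = {1, 3}, {1, 2, 4}
# Adding characteristic vectors mod 2 gives the characteristic vector of A ^ B.
assert add_mod2(char_vector(A, n), char_vector(B, n)) == char_vector(A ^ B, n)
```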

Chapter 2

Chains, antichains and shadows

2.1 Sperner’s Lemma, LYM Inequality and Dilworth’s Theorem
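A family $\mathcal{A} \subseteq \mathcal{P}(X)$ is a chain if, for all $A, B \in \mathcal{A}$, either $A \subseteq B$ or $B \subseteq A$. A family $\mathcal{A} \subseteq \mathcal{P}(X)$ is an antichain if, for all distinct $A, B \in \mathcal{A}$, we have $A \not\subseteq B$ and $B \not\subseteq A$.
For instance, $\{\emptyset, 1, 1234\}$ is a chain in $\mathcal{P}(4)$, and $\{12, 234, 14\}$ is an antichain. For any set $X$, the layers $X^{(0)}, X^{(1)}, \dots$ are antichains.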
How large can a chain in P(n) be? This has a simple answer.

Proposition 1. Every chain in $\mathcal{P}(n)$ has at most $n + 1$ elements. There are $n!$ different maximal chains with $n + 1$ elements, and every chain $\mathcal{C}$ is contained in some chain of size $n + 1$.

Proof. Exercise. □
How large can an antichain be? This is more interesting. We noted above that the layers of $\mathcal{P}(n)$ are antichains: the largest of these has size $\binom{n}{\lfloor n/2 \rfloor}$.

Theorem 2. (Sperner’s Lemma) An antichain in $\mathcal{P}(n)$ has size at most $\binom{n}{\lfloor n/2 \rfloor}$.

We shall see two proofs of Sperner’s Lemma.


For the first proof, we will need a result from Graph Theory. Let us recall that a graph $G = (V, E)$ is bipartite with vertex classes $X$, $Y$ if $X \cup Y$ is a partition of $V$ such that every edge of $G$ contains one vertex from each of $X$ and $Y$. The neighbourhood $\Gamma(v)$ of a vertex $v \in V$ is $\Gamma(v) = \{u : vu \in E(G)\}$, and for $S \subseteq V$ we write $\Gamma(S) = \bigcup_{v \in S} \Gamma(v)$. A complete matching from $X$ to $Y$ is a collection $M$ of vertex-disjoint edges such that every vertex in $X$ is incident to some edge in $M$. Here is the result that we will need.
Theorem 3. (Hall’s Theorem) Let $G = (V, E)$ be a bipartite graph with vertex classes $X$ and $Y$. Then $G$ has a complete matching from $X$ to $Y$ if and only if, for all $S \subseteq X$, we have
$$|\Gamma(S)| \ge |S|. \tag{2.1}$$
We will refer to inequality (2.1) as Hall’s Condition. Note that (2.1) is clearly necessary for $G$ to have a complete matching. The point of Hall’s Theorem is that it is also sufficient.
Sperner’s Lemma will follow quickly from the following result.

Lemma 4. There is a partition of $\mathcal{P}(n)$ into $\binom{n}{\lfloor n/2 \rfloor}$ chains.
Proof. Consider the subgraph of the discrete cube $Q_n$ between the vertices in two consecutive layers. We claim that:
1. for $r < n/2$, there is a complete matching from $[n]^{(r)}$ to $[n]^{(r+1)}$;
2. for $r > n/2$, there is a complete matching from $[n]^{(r)}$ to $[n]^{(r-1)}$.
If we glue these matchings together, we obtain a collection of $\binom{n}{\lfloor n/2 \rfloor}$ chains that partition $\mathcal{P}(n)$.
It is therefore enough to prove that these matchings exist. For $r < n/2$, we consider the bipartite subgraph $G$ of $Q_n$ induced by $[n]^{(r)} \cup [n]^{(r+1)}$. This has bipartition $([n]^{(r)}, [n]^{(r+1)})$ and an edge between $A \in [n]^{(r)}$ and $B \in [n]^{(r+1)}$ iff $A \subseteq B$. In order to prove that there is a complete matching, it is enough to verify Hall’s Condition (2.1). We will do this by a double counting argument.
Consider a set $S \subseteq [n]^{(r)}$, and let $T = \Gamma(S)$. Each $A \in S$ has degree $n - r$ in $G$ (as there are $n - r$ ways to add an element to $A$ to get an $(r + 1)$-set). So the number of edges between $S$ and $T$ is
$$e(S, T) = (n - r)|S|.$$
On the other hand, each $B \in [n]^{(r+1)}$ has degree $r + 1$ (as there are $r + 1$ choices for an element to delete from $B$ to get an $r$-set). So we have
$$e(S, T) \le (r + 1)|T|.$$
(Note that we may not have equality here, as there may be edges incident with $T$ that are not incident with $S$.)
Putting these two bounds together, we get
$$|T| \ge \frac{n - r}{r + 1}|S| \ge |S|,$$
as $r < n/2$ (and so $r \le (n - 1)/2$). So Hall’s Condition is satisfied, and hence there is a complete matching from $[n]^{(r)}$ to $[n]^{(r+1)}$.
For $r > n/2$, we can argue similarly. Alternatively we can consider the effect of replacing every set by its complement. □
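
The construction in this proof can be carried out by computer for small $n$. The following Python sketch finds the matchings between consecutive layers with a standard augmenting-path search (a swapped-in algorithmic technique, not part of the proof above) and glues them into chains. All function names are illustrative.

```python
from itertools import combinations
from math import comb


def layer(n, r):
    """All r-subsets of {1, ..., n}, as frozensets."""
    return [frozenset(c) for c in combinations(range(1, n + 1), r)]


def complete_matching(left, right):
    """Match every set in `left` to a comparable set in `right` (two consecutive
    layers), using augmenting paths. Returns a dict left -> right."""
    partner = {}  # right vertex -> left vertex currently matched to it

    def augment(a, seen):
        for b in right:
            if (a < b or b < a) and b not in seen:
                seen.add(b)
                if b not in partner or augment(partner[b], seen):
                    partner[b] = a
                    return True
        return False

    for a in left:
        if not augment(a, set()):
            raise ValueError("no complete matching: Hall's Condition fails")
    return {a: b for b, a in partner.items()}


def chain_partition(n):
    up = {}  # up[A] = B means B follows A in its chain, with |B| = |A| + 1
    for r in range(n):
        lower, upper = layer(n, r), layer(n, r + 1)
        if 2 * r < n:   # r < n/2: match layer r into layer r + 1
            up.update(complete_matching(lower, upper))
        else:           # r + 1 > n/2: match layer r + 1 into layer r
            up.update({a: b for b, a in complete_matching(upper, lower).items()})
    has_predecessor = set(up.values())
    chains = []
    for r in range(n + 1):
        for start in layer(n, r):
            if start not in has_predecessor:
                chain = [start]
                while chain[-1] in up:
                    chain.append(up[chain[-1]])
                chains.append(chain)
    return chains


chains = chain_partition(5)
print(len(chains), comb(5, 2))
```

The number of chains is $2^n$ minus the total number of matching edges, which is why it comes out as the size of a middle layer.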
Remarks:

1. Note that we cannot partition $\mathcal{P}(n)$ into fewer than $\binom{n}{\lfloor n/2 \rfloor}$ chains, because no two sets from the antichain $[n]^{(\lfloor n/2 \rfloor)}$ can belong to the same chain.

2. The chains that we get from Lemma 4 could be very ‘asymmetric’. For instance, the chain that starts at $\emptyset$ could finish on a middle layer (rather than continuing all the way up to $[n]$).

3. As an exercise you should calculate: what is (roughly) the average length of the chains in the partition of $\mathcal{P}(n)$ given by Lemma 4?

We can now prove Sperner’s Lemma.

First proof of Sperner’s Lemma. This is now easy. A chain and an antichain meet in at most one element. We have partitioned $\mathcal{P}(n)$ into $\binom{n}{\lfloor n/2 \rfloor}$ chains, so no antichain can have more than $\binom{n}{\lfloor n/2 \rfloor}$ elements. □
Sperner’s Lemma tells us the maximal size of an antichain, but what
can we say about uniqueness? And what happens if we start using sets of
different sizes? The LYM Inequality (named after Lubell, Yamamoto and
Meshalkin, who all gave independent proofs of the result) gives a much more
refined picture.

Theorem 5. (LYM Inequality) Let $\mathcal{F} \subseteq \mathcal{P}(n)$ be an antichain. Then
$$\sum_{i=0}^{n} \frac{|\mathcal{F} \cap [n]^{(i)}|}{\binom{n}{i}} \le 1. \tag{2.2}$$
Furthermore, we have equality in (2.2) if and only if $\mathcal{F} = [n]^{(i)}$ for some $i$.
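
As a quick sanity check of the statement, the left-hand side of (2.2) can be evaluated exactly for a given antichain. The sketch below uses the antichain $\{12, 234, 14\}$ in $\mathcal{P}(4)$ from earlier in this section; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def is_antichain(family):
    """True if no set in the family contains another."""
    return all(not (a <= b or b <= a) for a, b in combinations(family, 2))


def lym_sum(family, n):
    """Left-hand side of the LYM Inequality: each set A contributes 1 / C(n, |A|)."""
    return sum(Fraction(1, comb(n, len(a))) for a in family)


n = 4
F = [frozenset(s) for s in ({1, 2}, {2, 3, 4}, {1, 4})]
middle_layer = [frozenset(c) for c in combinations(range(1, n + 1), 2)]

print(is_antichain(F), lym_sum(F, n))   # mixed sizes: strictly less than 1
print(lym_sum(middle_layer, n))         # a full layer attains equality
```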

We will give two proofs of the LYM Inequality. The first one will use a “local” version of the inequality.
Let $\mathcal{F} \subseteq X^{(k)}$ be a $k$-uniform family on $X$. The (lower) shadow $\partial\mathcal{F}$ of $\mathcal{F}$ is
$$\partial\mathcal{F} := \{B \in X^{(k-1)} : B \subseteq A \text{ for some } A \in \mathcal{F}\}.$$

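Lemma 6. (Local LYM Inequality) Let $\mathcal{A} \subseteq [n]^{(r)}$. Then
$$\frac{|\partial\mathcal{A}|}{\binom{n}{r-1}} \ge \frac{|\mathcal{A}|}{\binom{n}{r}}.$$
We have equality if and only if $\mathcal{A} = \emptyset$ or $\mathcal{A} = [n]^{(r)}$.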

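Proof. Once more, we use double counting of edges in $Q_n$, here between $\mathcal{A}$ and $\partial\mathcal{A}$. Thus we are double counting elements of
$$E = \{(A, B) : A \in \mathcal{A},\ B \in \partial\mathcal{A},\ B \subseteq A\}.$$
Each element $A \in \mathcal{A}$ contains $r$ sets of size $r - 1$, so
$$|E| = r|\mathcal{A}|.$$
Each element $B \in \partial\mathcal{A}$ is contained in $n - r + 1$ sets of size $r$ (not all of which need be in $\mathcal{A}$). So
$$|E| \le (n - r + 1)|\partial\mathcal{A}|.$$
It follows that
$$(n - r + 1)|\partial\mathcal{A}| \ge r|\mathcal{A}|, \tag{2.3}$$
and so
$$\frac{|\mathcal{A}|}{\binom{n}{r}} \le |\partial\mathcal{A}| \cdot \frac{n - r + 1}{r} \cdot \frac{1}{\binom{n}{r}} = \frac{|\partial\mathcal{A}|}{\binom{n}{r-1}},$$
as required.
If we have equality then we must have equality in (2.3), so for every $B \in \partial\mathcal{A}$ and every $i \notin B$ we have $B \cup i \in \mathcal{A}$. If $\mathcal{A} \ne \emptyset, [n]^{(r)}$ then choose $r$-sets $A_1 \in \mathcal{A}$ and $A_2 \notin \mathcal{A}$ with $|A_1 \triangle A_2|$ as small as possible. We can choose $a_1 \in A_1 \setminus A_2$ and $a_2 \in A_2 \setminus A_1$: then $A_1 - a_1$ is in $\partial\mathcal{A}$ and so $A_3 := (A_1 - a_1) \cup a_2$ is in $\mathcal{A}$. But this gives a contradiction, as $|A_3 \triangle A_2| < |A_1 \triangle A_2|$. □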
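
To make the shadow and the Local LYM Inequality concrete, here is a small Python sketch. The helper names `shadow` and `local_lym_holds` are illustrative; the inequality is checked after cross-multiplying, to stay in integer arithmetic.

```python
from itertools import combinations
from math import comb


def shadow(family):
    """Lower shadow: all sets obtained by deleting one element from a member."""
    return {a - {x} for a in family for x in a}


def local_lym_holds(family, n, r):
    """Check |shadow| / C(n, r-1) >= |family| / C(n, r), cross-multiplied."""
    return len(shadow(family)) * comb(n, r) >= len(family) * comb(n, r - 1)


n, r = 5, 3
A = {frozenset(s) for s in ({1, 2, 3}, {1, 2, 4}, {3, 4, 5})}
full_layer = {frozenset(c) for c in combinations(range(1, n + 1), r)}

print(len(shadow(A)), local_lym_holds(A, n, r))
# For the full layer both sides are equal, matching the equality case.
print(len(shadow(full_layer)) * comb(n, r) == len(full_layer) * comb(n, r - 1))
```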

Remark: In the last paragraph, what we really used was the fact that $E$ is the edge set of a connected graph. Note also that we are being a little informal with notation in expressions like $(A_1 - a_1) \cup a_2$. As with leaving out brackets in listing set elements, this is fine as long as it is not ambiguous.
We use the Local LYM Inequality to prove the LYM Inequality.
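Proof of the LYM Inequality. Let $\mathcal{F} \subseteq \mathcal{P}(n)$ be an antichain and, for $i = 0, \dots, n$, define
$$\mathcal{F}_i := \mathcal{F} \cap [n]^{(i)}.$$
The proof proceeds by ‘pushing down one layer at a time’. Define $\mathcal{G}_r \subseteq [n]^{(r)}$ recursively by $\mathcal{G}_n = \mathcal{F}_n$ and, for $r < n$,
$$\mathcal{G}_r := \mathcal{F}_r \cup \partial\mathcal{G}_{r+1}.$$
Note that $\mathcal{F}_r$ and $\partial\mathcal{G}_{r+1}$ are disjoint, as every set in $\partial\mathcal{G}_{r+1}$ is properly contained in some element of $\mathcal{F}$, and $\mathcal{F}$ is an antichain.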
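We claim that, for $r = 0, \dots, n$,
$$\frac{|\mathcal{G}_r|}{\binom{n}{r}} \ge \sum_{i=r}^{n} \frac{|\mathcal{F}_i|}{\binom{n}{i}}. \tag{2.4}$$
We prove this by (downwards) induction. For $r = n$ it is immediate. Now suppose it is true for $r + 1$: we show it is true for $r$. We have
$$\begin{aligned}
\frac{|\mathcal{G}_r|}{\binom{n}{r}} &= \frac{|\mathcal{F}_r|}{\binom{n}{r}} + \frac{|\partial\mathcal{G}_{r+1}|}{\binom{n}{r}} && \text{as } \mathcal{F}_r,\ \partial\mathcal{G}_{r+1} \text{ are disjoint} \\
&\ge \frac{|\mathcal{F}_r|}{\binom{n}{r}} + \frac{|\mathcal{G}_{r+1}|}{\binom{n}{r+1}} && \text{by Local LYM} \\
&\ge \frac{|\mathcal{F}_r|}{\binom{n}{r}} + \sum_{i=r+1}^{n} \frac{|\mathcal{F}_i|}{\binom{n}{i}} && \text{by induction}
\end{aligned}$$
as required. We conclude that (2.4) holds for every $r$.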


Setting $r = 0$ in (2.4), we get
$$1 \ge \frac{|\mathcal{G}_0|}{\binom{n}{0}} \ge \sum_{i=0}^{n} \frac{|\mathcal{F}_i|}{\binom{n}{i}},$$
which gives (2.2) as required.

If we have equality at the end, we must have had equality at each application of Local LYM, and so we have $\mathcal{G}_r = \emptyset$ or $\mathcal{G}_r = [n]^{(r)}$ for each $r$. This implies that $\mathcal{F}_r = [n]^{(r)}$ for some $r$, and $\mathcal{F}_i = \emptyset$ for all other $i$. □
We now give a second proof of the LYM Inequality. It is very slick, but it does not identify the extremal set systems.
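Second proof of the LYM Inequality. Let $\mathcal{F}$ be an antichain in $\mathcal{P}(n)$. Choose a maximal chain $\mathcal{C} = (A_0, \dots, A_n)$, where $\emptyset = A_0 \subset A_1 \subset \dots \subset A_n = [n]$, uniformly at random from all $n!$ maximal chains. For $A \in \mathcal{F}$ with $|A| = k$ we have
$$\mathbb{P}[A \in \mathcal{C}] = \frac{k!(n - k)!}{n!} = \frac{1}{\binom{n}{k}}.$$
The events $(A \in \mathcal{C})_{A \in \mathcal{F}}$ are pairwise disjoint (a chain and an antichain meet in at most one element), so their probabilities sum to at most 1:
$$1 \ge \sum_{A \in \mathcal{F}} \mathbb{P}[A \in \mathcal{C}] = \sum_{A \in \mathcal{F}} \frac{1}{\binom{n}{|A|}},$$
which gives the LYM Inequality. □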
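
The probability used in this proof can be confirmed by brute force for small $n$. The sketch below assumes the standard correspondence between maximal chains and permutations of $[n]$ (each permutation lists the order in which elements are added); the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial


def maximal_chains(n):
    """Yield every maximal chain of P(n): the prefixes of each permutation of [n]."""
    for perm in permutations(range(1, n + 1)):
        yield [frozenset(perm[:k]) for k in range(n + 1)]


def prob_in_random_chain(a, n):
    """Exact probability that a uniformly random maximal chain contains the set a."""
    hits = sum(1 for chain in maximal_chains(n) if a in chain)
    return Fraction(hits, factorial(n))


n = 4
A = frozenset({1, 3})
print(prob_in_random_chain(A, n), Fraction(1, comb(n, len(A))))
```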


The LYM Inequality gives a second proof of Sperner’s Lemma. In fact we get a bit more, as it allows us to characterize the extremal set systems.

Corollary 7. Let $\mathcal{F} \subseteq \mathcal{P}(n)$ be an antichain. Then $|\mathcal{F}| \le \binom{n}{\lfloor n/2 \rfloor}$, with equality if and only if $\mathcal{F} = [n]^{(\lfloor n/2 \rfloor)}$ or $\mathcal{F} = [n]^{(\lceil n/2 \rceil)}$.
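Proof. Let $\mathcal{F} \subseteq \mathcal{P}(n)$ be an antichain. By the LYM Inequality, we have
$$\begin{aligned}
1 &\ge \sum_{i=0}^{n} \frac{|\mathcal{F} \cap [n]^{(i)}|}{\binom{n}{i}} \\
&\ge \sum_{i=0}^{n} \frac{|\mathcal{F} \cap [n]^{(i)}|}{\binom{n}{\lfloor n/2 \rfloor}} \\
&= \frac{|\mathcal{F}|}{\binom{n}{\lfloor n/2 \rfloor}},
\end{aligned}$$
where the second inequality follows because $\binom{n}{\lfloor n/2 \rfloor}$ is the largest binomial coefficient. If we have equality in the first line (where we applied LYM) then we must have $\mathcal{F} = [n]^{(i)}$ for some $i$. But then to get equality in the second line we must have $i = \lfloor n/2 \rfloor$ or $i = \lceil n/2 \rceil$. □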

Lemma 4 and Sperner’s Lemma together tell us that the minimum number
of chains in a partition of P(n) is equal to the maximum size of an antichain.
This is a special case of a more general theorem about partially ordered sets.
A partially ordered set or poset (P, ≤) is a set P with a relation ≤ such
that, for all a, b, c ∈ P , we have

• a ≤ a (reflexivity),

• if a ≤ b and b ≤ c, then a ≤ c (transitivity),

• if a ≤ b and b ≤ a, then a = b (antisymmetry).

We write $a < b$ to mean $a \le b$ and $a \ne b$.


Two elements a, b ∈ P are comparable if either a ≤ b or b ≤ a. Two
elements are incomparable if they are not comparable.
A set C ⊆ P is a chain if every pair of elements from C is comparable.
A set A ⊆ P is an antichain if every pair of distinct elements from A is
incomparable.
Any set system F becomes a poset under the containment relation ⊆.
You should check that the chains and antichains in this poset are the same
as in the earlier definition, and that a chain and an antichain can have at
most one common element.
We say that a collection of chains covers a poset P if every element of P
is contained in one of the chains. We can now state our main theorem about
posets.

Theorem 8. (Dilworth’s Theorem) Let $(P, \le)$ be a finite poset. The minimum number of chains needed to cover $P$ is equal to the maximum size of an antichain.

Proof. Since a chain and an antichain meet in at most one element, it is clear
that the number of chains in any cover is at least as large as the maximum
size of an antichain. So we need only prove that there is a cover with this
many chains.
We argue by induction on |P |. The statement is immediate for |P | = 0,
so we assume |P | > 0 and we have proved the statement for smaller posets.
Let m be the maximum size of an antichain and let C be a maximal chain
(i.e. a chain C such that C ∪ {a} is not a chain, for any a ∈ P \ C). [Note
that maximal chain does not necessarily mean chain of maximal size!]

If P \ C contains no antichain of size m then we are done by induction:
we can cover P \ C with m − 1 chains, and then add C to get a cover of P
with m chains.
Otherwise, there is an antichain of size $m$ in $P \setminus C$, say $A = \{a_1, \dots, a_m\}$. Let
$$S^+ := \{x \in P : x \ge a_i \text{ for some } a_i \in A\}$$
and
$$S^- := \{x \in P : x \le a_i \text{ for some } a_i \in A\}.$$
Then $S^+ \cap S^- = A$, as $A$ is an antichain. Also, $S^+ \cup S^- = P$ as $A$ is an antichain of maximum size (if there were $x \notin S^+ \cup S^-$ then $A \cup \{x\}$ would be a larger antichain).
We now check that $S^+$ and $S^-$ are both proper subsets of $P$. Let the maximal chain $C$ have elements $\{c_1, \dots, c_k\}$, where $c_1 < \dots < c_k$. Since $C$ is maximal, $c_k$ is a maximal element of $P$ and $c_1$ is a minimal element of $P$. Then:

• $c_k \notin S^-$, because $c_k$ is a maximal element of $P$ and $c_k \notin A$.

• $c_1 \notin S^+$, because $c_1$ is a minimal element of $P$ and $c_1 \notin A$.

Therefore, $S^+$ and $S^-$ are proper subsets of $P$, and so by induction we can partition $S^+$ into $m$ chains $C_1^+, \dots, C_m^+$, and we can partition $S^-$ into $m$ chains $C_1^-, \dots, C_m^-$.
Since $A \subseteq S^+$ and $A$ meets each chain $C_i^+$ in at most one element, we see that each $C_i^+$ must contain exactly one element of $A$. Relabelling if necessary, we may assume that $a_i \in C_i^+$, for $i = 1, \dots, m$. Similarly, we may assume that $a_i \in C_i^-$, for $i = 1, \dots, m$.
If $a_i$ is not maximal in $C_i^-$, there exists $b \in C_i^-$ with $b > a_i$. But then, since $b \in S^-$, we can find $a_j \in A$ with $b \le a_j$. This means $a_i < b \le a_j$, so $a_i < a_j$, which gives us a contradiction as $A$ is an antichain. So $a_i$ is maximal in $C_i^-$. Similarly, $a_i$ is minimal in $C_i^+$.
Finally, we glue $C_i^+$ and $C_i^-$ together (the ‘gluing point’ being $a_i$) to obtain a partition of $P$ into $m$ chains. □
There is a ‘dual’ of Dilworth’s Theorem: the minimum number of antichains needed to cover $P$ is equal to the maximum size of a chain. The proof of this is an exercise on the first example sheet.

2.2 Symmetric chains and Littlewood-Offord
If $z_1, \dots, z_n \in \mathbb{C}$ are such that $|z_i| \ge 1$, how many of the $2^n$ sums $\sum_{i \in I} z_i$, where $I \subseteq [n]$, can equal 0? This question was raised in 1938, by Littlewood and Offord. A few years later, Erdős found a neat solution in the real case.

Theorem 9. (Erdős) Suppose that $x_1, \dots, x_n \in \mathbb{R}$ satisfy $|x_i| \ge 1$ for all $i$. For every $\alpha \in \mathbb{R}$, there are at most $\binom{n}{\lfloor n/2 \rfloor}$ subsets $I \subseteq [n]$ such that $\sum_{i \in I} x_i \in [\alpha, \alpha + 1)$.

Note that the bound in this theorem is sharp: take $x_i = 1$ for all $i$, and set $\alpha = \lfloor n/2 \rfloor$.
Proof of Theorem 9. Consider first the effect of replacing $x_i$ with $-x_i$. Let $S_1, \dots, S_N$ denote the $N = 2^{n-1}$ sums corresponding to subsets $I \subseteq [n]$ that do not contain $i$. Then the full collection of sums of subsets is
$$S_1, \dots, S_N,\ S_1 + x_i,\ S_2 + x_i, \dots, S_N + x_i.$$
If we replace $x_i$ with $-x_i$, this becomes
$$S_1, \dots, S_N,\ S_1 - x_i,\ S_2 - x_i, \dots, S_N - x_i,$$
which is just a translation (and reordering) of the first collection. So replacing $x_i$ by $-x_i$ does not affect the truth of the theorem.
We may therefore assume that $x_i \ge 1$ for all $i$. But now, if we take any $\binom{n}{\lfloor n/2 \rfloor} + 1$ subsets of $[n]$, then by Sperner’s Lemma there must be some pair $I, J$ with $I \subsetneq J$. Then $\sum_{i \in J} x_i \ge \sum_{i \in I} x_i + 1$, so we cannot have both sums in $[\alpha, \alpha + 1)$. □
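
Theorem 9 is easy to test numerically on small inputs. The following Python sketch counts the subset sums in a window $[\alpha, \alpha + 1)$ by brute force; `sums_in_window` is an illustrative name, and integer inputs are used so that the comparisons are exact.

```python
from itertools import combinations
from math import comb


def sums_in_window(xs, alpha):
    """Number of index sets I with sum_{i in I} xs[i] in [alpha, alpha + 1)."""
    n = len(xs)
    return sum(
        1
        for r in range(n + 1)
        for idx in combinations(range(n), r)
        if alpha <= sum(xs[i] for i in idx) < alpha + 1
    )


n = 5
# The sharp example: x_i = 1 for all i and alpha = floor(n/2).
print(sums_in_window([1] * n, n // 2), comb(n, n // 2))
# An example with mixed signs and sizes, all with |x_i| >= 1.
print(sums_in_window([1, -2, 3, 1, -1], 0))
```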
We will also see a solution to the Littlewood-Offord problem in the complex case, but first let us think a bit more about chains.
A chain $C_1 \subseteq C_2 \subseteq \dots \subseteq C_m$ in $\mathcal{P}(n)$ is symmetric if $|C_{i+1}| = |C_i| + 1$, for all $i = 1, \dots, m - 1$, and $|C_1| + |C_m| = n$. We saw in Lemma 4 that $\mathcal{P}(n)$ can be partitioned into $\binom{n}{\lfloor n/2 \rfloor}$ chains, but the chains we obtained could be asymmetric.

Proposition 10. For $n \ge 1$, there is a partition of $\mathcal{P}(n)$ into symmetric chains.

Proof. We argue by induction on $n$. The case $n = 1$ is easy! So suppose that $n > 1$ and that $\mathcal{C}_1, \dots, \mathcal{C}_m$ is a partition of $\mathcal{P}(n - 1)$ into symmetric chains. For each chain $\mathcal{C}_i$, say $\mathcal{C}_i = \{A_1, \dots, A_k\}$ with $A_1 \subseteq \dots \subseteq A_k$, we define two chains in $\mathcal{P}(n)$:
$$\mathcal{C}_i' := \{A_1, A_2, \dots, A_k, A_k \cup n\}$$
and
$$\mathcal{C}_i'' := \{A_1 \cup n, \dots, A_{k-1} \cup n\}.$$
If $k = 1$ then $\mathcal{C}_i''$ is empty: we discard these empty chains. The resulting chains give a partition of $\mathcal{P}(n)$ into symmetric chains. □
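
The inductive construction in this proof translates directly into code. The sketch below is one possible Python rendering (the function name `symmetric_chains` is illustrative): each chain is a list of frozensets in increasing order, and step $m$ of the loop builds the chains of $\mathcal{P}(m)$ from those of $\mathcal{P}(m - 1)$.

```python
def symmetric_chains(n):
    """Symmetric chain decomposition of P(n), built as in the inductive proof."""
    chains = [[frozenset(), frozenset({1})]]  # the single chain for n = 1
    for m in range(2, n + 1):
        new_chains = []
        for chain in chains:
            # C' : the old chain, extended by adding m to its top set.
            new_chains.append(chain + [chain[-1] | {m}])
            # C'': all sets except the top one, each with m added.
            shifted = [a | {m} for a in chain[:-1]]
            if shifted:  # discard empty chains
                new_chains.append(shifted)
        chains = new_chains
    return chains


for chain in symmetric_chains(3):
    print([sorted(a) for a in chain])
```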

A symmetric chain decomposition of $\mathcal{P}(n)$ has $\binom{n}{\lfloor n/2 \rfloor}$ chains, each of which contains an element from the middle layer (or contains an element from each middle layer, if $n$ is odd). For each $i \le n/2$, there are
$$\binom{n}{i} - \binom{n}{i-1}$$
chains of size $n - 2i + 1$ (with the convention $\binom{n}{-1} = 0$), and these run from the $i$th layer to the $(n - i)$th layer.
We now return to the Littlewood-Offord problem, and in fact answer it
in any number of dimensions (note that the case k = 2 deals with the special
case of complex numbers).

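Theorem 11. Let $k, n \ge 1$ and suppose that $x_1, \dots, x_n \in \mathbb{R}^k$ satisfy $\|x_i\|_2 \ge 1$ for all $i$. Let $K \subseteq \mathbb{R}^k$ have diameter $\operatorname{diam}(K) < 1$. Then there are at most $\binom{n}{\lfloor n/2 \rfloor}$ subsets $I \subseteq [n]$ such that $\sum_{i \in I} x_i \in K$.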
Proof. Let us define, for $A \subseteq [n]$, $x_A = \sum_{i \in A} x_i$. We shall call a family $\mathcal{A} \subseteq \mathcal{P}(n)$ sparse if
$$\|x_A - x_B\|_2 \ge 1$$
for all distinct $A, B \in \mathcal{A}$.
We shall say that a partition $\mathcal{D}_1 \cup \dots \cup \mathcal{D}_m$ of $\mathcal{P}(n)$ is symmetric if it has the same number of sets of each size as a symmetric chain decomposition of $\mathcal{P}(n)$.
It is enough to show that $\mathcal{P}(n)$ has a symmetric partition into sparse families, as a symmetric partition has $\binom{n}{\lfloor n/2 \rfloor}$ parts, and if $\mathcal{A}$ is a sparse family then there is at most one $A \in \mathcal{A}$ with $x_A \in K$.

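We prove the existence of a symmetric partition into sparse families by induction on $n$. The case $n = 1$ is easy. So suppose that $n > 1$ and that $\mathcal{F}_1, \dots, \mathcal{F}_m$ is a symmetric partition of $\mathcal{P}(n - 1)$ into sparse families (with respect to the vectors $x_1, \dots, x_{n-1}$).
Suppose $\mathcal{F}_i = \{A_1, \dots, A_t\}$, where
$$\langle x_{A_1}, x_n \rangle \le \dots \le \langle x_{A_t}, x_n \rangle,$$
and $\langle \cdot, \cdot \rangle$ is the usual inner product on $\mathbb{R}^k$. Define two new families in $\mathcal{P}(n)$:
$$\mathcal{F}_i' := \{A_1, A_2, \dots, A_t, A_t \cup n\}$$
and
$$\mathcal{F}_i'' := \{A_1 \cup n, A_2 \cup n, \dots, A_{t-1} \cup n\},$$
discarding empty families (as in Proposition 10). We claim that this gives a symmetric partition of $\mathcal{P}(n)$ into sparse families.
It is clear that the construction gives the same number of sets of each size as in a symmetric chain decomposition of $\mathcal{P}(n)$; and that $\mathcal{F}_i''$ is sparse, since the corresponding sums are translations by $x_n$ of the sums corresponding to $\mathcal{F}_i$. To see that $\mathcal{F}_i'$ is also sparse, we note that $\mathcal{F}_i$ is sparse, so it is enough to compare $A_t \cup \{n\}$ with each $A_j$. For any $j \in [t]$, we have (writing $\hat{x}_n = x_n / \|x_n\|_2$ for the unit vector in direction $x_n$)
$$\begin{aligned}
\|x_{A_t \cup \{n\}} - x_{A_j}\|_2 &\ge \langle x_{A_t \cup \{n\}} - x_{A_j}, \hat{x}_n \rangle \\
&= \langle x_{A_t \cup \{n\}}, \hat{x}_n \rangle - \langle x_{A_j}, \hat{x}_n \rangle \\
&= \langle x_{A_t} + x_n, \hat{x}_n \rangle - \langle x_{A_j}, \hat{x}_n \rangle \\
&\ge 1 + \langle x_{A_t}, \hat{x}_n \rangle - \langle x_{A_j}, \hat{x}_n \rangle \\
&\ge 1,
\end{aligned}$$
since $\langle x_n, \hat{x}_n \rangle = \|x_n\|_2 \ge 1$ and $\langle x_{A_t}, x_n \rangle \ge \langle x_{A_j}, x_n \rangle$. □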

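2.3 Shadows and Kruskal-Katona
For $\mathcal{A} \subseteq [n]^{(r)}$, the Local LYM Inequality tells us that
$$\frac{|\partial\mathcal{A}|}{\binom{n}{r-1}} \ge \frac{|\mathcal{A}|}{\binom{n}{r}},$$
with equality iff $\mathcal{A} = \emptyset$ or $\mathcal{A} = [n]^{(r)}$. What happens in between?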


It will be helpful to define two orders on $[n]^{(r)}$: the lexicographic and colexicographic orders.
In lexicographic order or lex, we have

$A < B$ if $A \ne B$ and $\min(A \triangle B) \in A$.

Equivalently, for distinct $A, B \in [n]^{(r)}$, with elements $a_1 < \dots < a_r$ and $b_1 < \dots < b_r$, we have $A < B$ if $a_i < b_i$, where $i = \min\{j : a_j \ne b_j\}$. This is the familiar dictionary order.
In colexicographic order or colex,

$A < B$ if $A \ne B$ and $\max(A \triangle B) \in B$.

Equivalently, we have $A < B$ if
$$\sum_{i \in A} 2^i < \sum_{i \in B} 2^i.$$
This can be thought of as ‘binary’ order.

We write $A <_{\mathrm{lex}} B$ and $A <_{\mathrm{colex}} B$ to distinguish between the two orders.
Lex and colex are very different. For instance, if we order pairs of natural
numbers by colex we get

12, 13, 23, 14, 24, 34, 15, 25, 35, . . .

while in lex we have

12, 13, 14, . . . , 23, 24, 25, . . . , 34, 35, 36, . . . , . . . .
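
Both orders are easy to generate by sorting with a suitable key. In the Python sketch below (key names are illustrative), lex compares the sorted element lists, and colex uses the ‘binary’ characterization $\sum_{i \in A} 2^i$ given above.

```python
from itertools import combinations


def lex_key(a):
    """Dictionary order: compare the elements in increasing order."""
    return sorted(a)


def colex_key(a):
    """'Binary' order: A < B in colex iff sum of 2^i over A is smaller."""
    return sum(2 ** i for i in a)


def as_words(sets):
    """Write each set compactly, e.g. {1, 3} as '13' (single-digit elements only)."""
    return ["".join(str(x) for x in sorted(s)) for s in sets]


pairs = [frozenset(c) for c in combinations(range(1, 6), 2)]
print(as_words(sorted(pairs, key=colex_key)))
print(as_words(sorted(pairs, key=lex_key)))
```

Note that the colex order on $[5]^{(2)}$ begins with the colex order on $[4]^{(2)}$, whereas the lex order does not have this nesting property.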

The aim of this section is to prove the following theorem.


Theorem 12. (Kruskal-Katona Theorem) Let $\mathcal{F} \subseteq [n]^{(r)}$ and let $\mathcal{A}$ be the family consisting of the first $|\mathcal{F}|$ elements of $[n]^{(r)}$ in colex order. Then $|\partial\mathcal{F}| \ge |\partial\mathcal{A}|$.

In other words:

shadows are minimized by taking initial segments of colex.

Our strategy to prove the Kruskal-Katona Theorem is as follows: we replace $\mathcal{F}$ with a family $\mathcal{F}' \subseteq [n]^{(r)}$ such that
• $|\mathcal{F}'| = |\mathcal{F}|$
• $|\partial\mathcal{F}'| \le |\partial\mathcal{F}|$
• $\mathcal{F}'$ is ‘closer’ to an initial segment of $[n]^{(r)}$.
We repeat, making the family ‘nicer’ at each step, and (hopefully) end up with an initial segment of colex.
In order to carry out this strategy, we will employ compression operators.²
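For distinct $i, j \in [n]$, the compression operator $C_{ij}$ is the function from $\mathcal{P}(n)$ to $\mathcal{P}(n)$ defined by
$$C_{ij}(A) = \begin{cases} (A \setminus j) \cup i & \text{if } i \notin A,\ j \in A \\ A & \text{otherwise.} \end{cases}$$
For a set system $\mathcal{F}$, we define
$$C_{ij}(\mathcal{F}) := \{C_{ij}(A) : A \in \mathcal{F}\} \cup \{A \in \mathcal{F} : C_{ij}(A) \in \mathcal{F}\}.$$
If $i < j$, we sometimes refer to $C_{ij}$ as a left compression.

Note that, for any $A \subseteq [n]$ and any $\mathcal{F} \subseteq \mathcal{P}(n)$, we have:
• $|C_{ij}(A)| = |A|$
• $|C_{ij}(\mathcal{F})| = |\mathcal{F}|$
• $C_{ij}(C_{ij}(\mathcal{F})) = C_{ij}(\mathcal{F})$
• if $j \in A$ and $A \in C_{ij}(\mathcal{F})$ then $A \in \mathcal{F}$ and $C_{ij}(A) \in \mathcal{F}$.
You should check all of these as an exercise!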
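
The definitions above can be turned into a few lines of code, which is a convenient way to experiment with the listed properties. This is a minimal Python sketch; `compress_set` and `compress_family` are illustrative names for $C_{ij}$ acting on a set and on a family.

```python
def compress_set(a, i, j):
    """C_ij on a single set: replace j by i when j is in a and i is not."""
    if j in a and i not in a:
        return (a - {j}) | {i}
    return a


def compress_family(family, i, j):
    """C_ij on a family: move each set unless its image is already present."""
    family = set(family)
    moved = {compress_set(a, i, j) for a in family}
    stuck = {a for a in family if compress_set(a, i, j) in family}
    return moved | stuck


F = {frozenset({2, 3}), frozenset({2, 4}), frozenset({1, 3})}
G = compress_family(F, 1, 2)
# 24 moves to 14; 23 cannot move because 13 is already in F.
print(sorted(sorted(a) for a in G))
```

In this example the family keeps its size, and compressing a second time changes nothing, in line with the second and third properties.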
If compressions are to be useful, we need to know that they interact well
with shadows. This is indeed the case.
² Actually, these compression operators won’t quite be enough to get what we want. But we will shortly define a slightly more general compression operator, and those will be enough.

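Lemma 13. For $1 \le i < j \le n$ and $\mathcal{F} \subseteq [n]^{(r)}$, we have $|\partial C_{ij}(\mathcal{F})| \le |\partial\mathcal{F}|$.
Proof. Let $\mathcal{G} = C_{ij}(\mathcal{F})$: so we must show that $|\partial\mathcal{G}| \le |\partial\mathcal{F}|$. It will be enough to show the following.
Claim. Let $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$. Then
1. $i \in G'$, $j \notin G'$
2. $(G' \setminus i) \cup j \in \partial\mathcal{F} \setminus \partial\mathcal{G}$.
If the claim holds, then it implies that $C_{ji}$ gives an injection
$$C_{ji} : \partial\mathcal{G} \setminus \partial\mathcal{F} \to \partial\mathcal{F} \setminus \partial\mathcal{G}.$$
Indeed, (1) shows that $C_{ji}$ is injective on $\partial\mathcal{G} \setminus \partial\mathcal{F}$, and (2) shows that the image is contained in $\partial\mathcal{F} \setminus \partial\mathcal{G}$.
Thus we need only prove the claim. So consider $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$. There are $G \in \mathcal{G}$ and $x \in [n]$ such that $G = G' \cup x$. Since $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$, we must have $G \in \mathcal{G} \setminus \mathcal{F}$ and so $i \in G$ and $j \notin G$; we must also have $F := (G \setminus i) \cup j \in \mathcal{F} \setminus \mathcal{G}$.
If $x = i$ then $G' \subseteq F$ and so $G' \in \partial\mathcal{F}$, so we must have $x \ne i$. So $i \in G'$ and $j \notin G'$, which proves (1).
Let $F' = C_{ji}(G') = (G' \setminus i) \cup j$. Then $F' \subseteq F$, so $F' \in \partial\mathcal{F}$. All that remains is to show that
$$F' \notin \partial\mathcal{G}.$$
Suppose otherwise. Then there is $z$ such that
$$F' \cup z = (G' \setminus i) \cup j \cup z \in \mathcal{G}.$$
Two cases:
• $z \ne i$: since $C_{ij}(\mathcal{G}) = \mathcal{G}$, we have
$$C_{ij}(F' \cup z) = G' \cup z \in \mathcal{G}.$$
But then $F' \cup z$ and $C_{ij}(F' \cup z)$ are both in $\mathcal{G}$, and so both are in $\mathcal{F}$. This gives a contradiction, as then $G' \subseteq C_{ij}(F' \cup z)$, so $G' \in \partial\mathcal{F}$.

• $z = i$: then $F' \cup z = G' \cup j \in \mathcal{G}$. Since $i, j \in G' \cup j$ we also have $G' \cup j \in \mathcal{F}$, which again gives a contradiction as then $G' \in \partial\mathcal{F}$. □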


We say that a family $\mathcal{F} \subseteq \mathcal{P}(n)$ is left-compressed if $C_{ij}(\mathcal{F}) = \mathcal{F}$ for all $1 \le i < j \le n$.
Corollary 14. Let $\mathcal{F} \subseteq [n]^{(r)}$. Then there is a left-compressed family $\mathcal{A} \subseteq [n]^{(r)}$ such that $|\mathcal{A}| = |\mathcal{F}|$ and $|\partial\mathcal{A}| \le |\partial\mathcal{F}|$.
Proof. For $\mathcal{A} \subseteq \mathcal{P}(n)$, we define the function
$$f(\mathcal{A}) := \sum_{A \in \mathcal{A}} \sum_{a \in A} 2^a.$$
Then for any $i < j$, applying $C_{ij}$ either leaves $\mathcal{A}$ unchanged or strictly decreases the value of $f$.
Let $\mathcal{A} \subseteq [n]^{(r)}$ satisfy $|\mathcal{A}| = |\mathcal{F}|$, $|\partial\mathcal{A}| \le |\partial\mathcal{F}|$ and, subject to this, have $f(\mathcal{A})$ minimal. Then, by Lemma 13, $\mathcal{A}$ must be left-compressed. □
Any initial segment of $[n]^{(r)}$ in colex is left-compressed, so we might hope that Corollary 14 is enough to prove Kruskal-Katona. Unfortunately, not every left-compressed set system is an initial segment of colex: for instance $\{12, 13, 14\}$ is left-compressed but $23 <_{\mathrm{colex}} 14$.
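
The example $\{12, 13, 14\}$ can be checked mechanically, and it is instructive to compare its shadow with that of the colex initial segment of the same size. The Python sketch below is self-contained; all helper names are illustrative.

```python
def compress_family(family, i, j):
    """The compression C_ij applied to a family of frozensets."""
    family = set(family)

    def c(a):
        return (a - {j}) | {i} if j in a and i not in a else a

    return {c(a) for a in family} | {a for a in family if c(a) in family}


def is_left_compressed(family, n):
    """True if C_ij leaves the family unchanged for all 1 <= i < j <= n."""
    return all(
        compress_family(family, i, j) == set(family)
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
    )


def shadow(family):
    """Lower shadow: delete one element from a member in every possible way."""
    return {a - {x} for a in family for x in a}


star = {frozenset({1, 2}), frozenset({1, 3}), frozenset({1, 4})}
colex_segment = {frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})}

print(is_left_compressed(star, 4), len(shadow(star)))
print(is_left_compressed(colex_segment, 4), len(shadow(colex_segment)))
```

Both families are left-compressed, but the colex initial segment has the smaller shadow, so left-compression alone does not pin down the minimizer.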
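We will need a more general compression operator to prove the Kruskal-Katona Theorem.
Let $U, V \subseteq [n]$ satisfy $|U| = |V|$ and $U \cap V = \emptyset$. The $UV$-compression operator $C_{UV}$ is the function from $\mathcal{P}(n)$ to $\mathcal{P}(n)$ defined by
$$C_{UV}(A) = \begin{cases} (A \setminus V) \cup U & \text{if } U \cap A = \emptyset,\ V \subseteq A \\ A & \text{otherwise.} \end{cases}$$
For a set system $\mathcal{F}$, we define
$$C_{UV}(\mathcal{F}) := \{C_{UV}(A) : A \in \mathcal{F}\} \cup \{A \in \mathcal{F} : C_{UV}(A) \in \mathcal{F}\}.$$
A family $\mathcal{A}$ is $(U, V)$-compressed if $C_{UV}(\mathcal{A}) = \mathcal{A}$.

It is clear that $|C_{UV}(A)| = |A|$ for every set $A$ and $|C_{UV}(\mathcal{A})| = |\mathcal{A}|$ for every family $\mathcal{A}$. We will use the following technical lemma, which extends Lemma 13.
Lemma 15. Let $U, V \subseteq [n]$ be disjoint sets with $|U| = |V|$. Suppose that $\mathcal{F} \subseteq [n]^{(r)}$ satisfies
$$\forall u \in U\ \exists v \in V \text{ such that } \mathcal{F} \text{ is } (U \setminus u, V \setminus v)\text{-compressed.} \tag{2.5}$$
Then $|\partial C_{UV}(\mathcal{F})| \le |\partial\mathcal{F}|$.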

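Proof. We generalize the proof of Lemma 13. Let $\mathcal{G} = C_{UV}(\mathcal{F})$ and consider $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$. We will show:
Claim. Let $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$. Then

1. $U \subseteq G'$, $V \cap G' = \emptyset$

2. $(G' \setminus U) \cup V \in \partial\mathcal{F} \setminus \partial\mathcal{G}$.

As before, this implies that $C_{VU}$ gives an injection from $\partial\mathcal{G} \setminus \partial\mathcal{F}$ to $\partial\mathcal{F} \setminus \partial\mathcal{G}$, and the lemma follows.
Thus we need only prove the claim. So suppose we are given $G'$ as in the claim. There are $G \in \mathcal{G}$ and $x \in [n]$ such that $G = G' \cup x$. Since $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$, we must have $G \in \mathcal{G} \setminus \mathcal{F}$ and so $U \subseteq G$ and $V \cap G = \emptyset$; we must also have $F := (G \setminus U) \cup V \in \mathcal{F} \setminus \mathcal{G}$.
If $x \in U$ then by (2.5) there is $y \in V$ such that $\mathcal{F}$ is $(U \setminus x, V \setminus y)$-compressed, so
$$C_{U \setminus x, V \setminus y}(F) = (G \setminus x) \cup y \in \mathcal{F}.$$
But then $G' = G \setminus x \subseteq (G \setminus x) \cup y$, so $G' \in \partial\mathcal{F}$, which gives a contradiction. So we must have $x \notin U$. So $U \subseteq G'$ and $V \cap G' = \emptyset$, which proves (1).
Let $F' = C_{VU}(G') = (G' \setminus U) \cup V$. Then $F' \subseteq F$, so $F' \in \partial\mathcal{F}$. All that remains is to show that
$$F' \notin \partial\mathcal{G}.$$
Suppose otherwise. Then there is $z$ such that
$$F' \cup z = (G' \setminus U) \cup V \cup z \in \mathcal{G}.$$
Two cases:

• $z \notin U$: since $C_{UV}(\mathcal{G}) = \mathcal{G}$, we have
$$C_{UV}(F' \cup z) = G' \cup z \in \mathcal{G}.$$
But then $F' \cup z$ and $C_{UV}(F' \cup z)$ are both in $\mathcal{G}$, and so both are in $\mathcal{F}$. This gives a contradiction, as then $G' \subseteq C_{UV}(F' \cup z)$, so $G' \in \partial\mathcal{F}$.

• $z \in U$: then $F' \cup z \in \mathcal{G}$ and $F' \cup z$ meets both $U$ and $V$, so we must have $F' \cup z \in \mathcal{F}$. But by (2.5) there is $y \in V$ such that $\mathcal{F}$ is $(U \setminus z, V \setminus y)$-compressed, and so $C_{U \setminus z, V \setminus y}(F' \cup z) = G' \cup y \in \mathcal{F}$, which again gives a contradiction as then $G' \in \partial\mathcal{F}$. □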

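We can now prove the Kruskal-Katona Theorem:
Proof of Kruskal-Katona. Let $\mathcal{A} \subseteq [n]^{(r)}$ satisfy

• $|\mathcal{A}| = |\mathcal{F}|$

• $|\partial\mathcal{A}| \le |\partial\mathcal{F}|$

• subject to this, the weight $\sum_{A \in \mathcal{A}} \sum_{i \in A} 2^i$ is minimal.

Let
$$\Lambda = \{(U, V) : |U| = |V| > 0,\ U \cap V = \emptyset,\ \max U < \max V\}.$$
If $\mathcal{A}$ is $(U, V)$-compressed for all $(U, V) \in \Lambda$ then $\mathcal{A}$ is an initial segment of colex: if $A \in \mathcal{A}$ and $B <_{\mathrm{colex}} A$ then $\max(B \setminus A) < \max(A \setminus B)$ and so $\mathcal{A}$ is $(B \setminus A, A \setminus B)$-compressed, which implies $C_{B \setminus A, A \setminus B}(A) = B \in \mathcal{A}$.
Otherwise, pick $(U, V) \in \Lambda$ such that $\mathcal{A}$ is not $(U, V)$-compressed and $|U|$ is minimal. Then $\mathcal{A}$ is $(U \setminus u, V \setminus \min V)$-compressed for all $u \in U$, and so by Lemma 15 we have $|\partial C_{UV}(\mathcal{A})| \le |\partial\mathcal{A}|$. But $C_{UV}(\mathcal{A})$ has strictly smaller weight than $\mathcal{A}$, which contradicts the minimality of the weight of $\mathcal{A}$. □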
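
For very small parameters the Kruskal-Katona Theorem can be confirmed by exhaustive search, which is a useful check on one's understanding of colex. The Python sketch below (illustrative names; only feasible for tiny $n$, since it enumerates every family) compares the minimum shadow size over all families of each size with the shadow of the colex initial segment.

```python
from itertools import combinations


def shadow(family):
    """Lower shadow of a family of frozensets."""
    return {a - {x} for a in family for x in a}


def colex_initial_segment(n, r, m):
    """The first m sets of [n]^(r) in colex ('binary') order."""
    layer = [frozenset(c) for c in combinations(range(1, n + 1), r)]
    return sorted(layer, key=lambda a: sum(2 ** i for i in a))[:m]


def min_shadow_size(n, r, m):
    """Smallest shadow over all families of m sets from [n]^(r), by brute force."""
    layer = [frozenset(c) for c in combinations(range(1, n + 1), r)]
    return min(len(shadow(family)) for family in combinations(layer, m))


def kruskal_katona_holds(n, r):
    """Check that colex initial segments attain the minimum for every size m."""
    total = len(colex_initial_segment(n, r, 10 ** 9))
    return all(
        min_shadow_size(n, r, m) == len(shadow(colex_initial_segment(n, r, m)))
        for m in range(total + 1)
    )


print(kruskal_katona_holds(5, 3))
```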

Chapter 3

Intersections and traces

3.1 Erdős-Ko-Rado and the Two Families Theorem
A family $\mathcal{A} \subseteq \mathcal{P}(n)$ is intersecting if $A \cap B \ne \emptyset$, for all $A, B \in \mathcal{A}$.
What is the maximum size of an intersecting family in $\mathcal{P}(n)$? The set $\{A \subseteq [n] : 1 \in A\}$ is intersecting and has size $2^{n-1}$. It is easy to show that this is best possible.

Proposition 16. Let $\mathcal{A} \subseteq \mathcal{P}(n)$ be intersecting. Then $|\mathcal{A}| \le 2^{n-1}$.

Proof. $\mathcal{A}$ contains at most one set from each pair $(A, [n] \setminus A)$. □
A much more interesting question is: what is the largest intersecting family of $r$-sets in $\mathcal{P}(n)$? There are three regimes to consider:

• $r > n/2$: This case is trivial, as any two $r$-sets intersect and so we can take the entire layer $[n]^{(r)}$.

• $r = n/2$: (Obviously, this only happens when $n$ is even.) This case is easy: we can take at most one from each pair $(A, [n] \setminus A)$, and any system obtained in this way is intersecting. Thus the maximum is
$$\frac{1}{2}\binom{n}{n/2} = \binom{n-1}{n/2 - 1} = \binom{n-1}{r-1}.$$

• $r < n/2$: This is more interesting! One example is to take all $r$-sets containing a fixed element, say 1. This gives a system of size $\binom{n-1}{r-1}$. Very good. But we could also take all $r$-sets that contain at least two elements from $\{1, 2, 3\}$. This also gives an intersecting system, and a little calculation shows that it has size $3\binom{n-3}{r-2} + \binom{n-3}{r-3}$. A little more calculation shows that the first system is bigger, but of course we want to handle all possible systems.
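
The two example systems in the last regime can be compared directly for specific parameters. The following Python sketch (function names are illustrative) builds both families for $n = 7$, $r = 3$, checks that they are intersecting, and compares their sizes with the formulas above.

```python
from itertools import combinations
from math import comb


def star(n, r):
    """All r-subsets of [n] containing the element 1."""
    return [set(c) for c in combinations(range(1, n + 1), r) if 1 in c]


def two_of_three(n, r):
    """All r-subsets of [n] containing at least two elements of {1, 2, 3}."""
    return [
        set(c)
        for c in combinations(range(1, n + 1), r)
        if len(set(c) & {1, 2, 3}) >= 2
    ]


def is_intersecting(family):
    """True if every two members share at least one element."""
    return all(a & b for a, b in combinations(family, 2))


n, r = 7, 3
print(len(star(n, r)), comb(n - 1, r - 1))
print(len(two_of_three(n, r)), 3 * comb(n - 3, r - 2) + comb(n - 3, r - 3))
```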

This last case is where the Erdős-Ko-Rado Theorem comes in.

Theorem 17. (Erdős-Ko-Rado Theorem) For $r \le n/2$, if $\mathcal{A} \subseteq [n]^{(r)}$ is intersecting then $|\mathcal{A}| \le \binom{n-1}{r-1}$.

Remark: In fact, for r < n/2, we get equality only for the systems that
consist of all r-sets containing a fixed point. (We won’t prove this here.) For
r = n/2 it is easy to construct many nonisomorphic systems.
We shall give two proofs: the first uses Katona’s ingenious circle method;
the second uses the Kruskal-Katona Theorem.
First proof of the Erdős-Ko-Rado Theorem. Throughout this proof we write k
for the common size r of the sets in $\mathcal{A}$. Consider any bijection
$f : [n] \to \mathbb{Z}_n$. We say that A maps to an interval under f if
f(A) := {f(a) : a ∈ A} = {i, i + 1, . . . , i + k − 1}, for some
0 ≤ i ≤ n − 1 (where addition is modulo n). We will double count the number
N of pairs (f, A) such that $A \in \mathcal{A}$, $f : [n] \to \mathbb{Z}_n$
is a bijection and f(A) is an interval.
For any fixed f , we claim that at most k sets in A map to intervals under
f . Indeed, suppose A ∈ A and f (A) = {i, i + 1, . . . , i + k − 1}. Since A is
intersecting, any other interval that we get under f must be of form

{j, j − 1, . . . , j − (k − 1)}

or
{j + 1, j + 2, . . . , j + k},
for some j ∈ {i, i + 1, . . . , i + k − 2}. But for each j we can get at most one
of these two intervals (as they are disjoint, because 2k ≤ n). So we get at
most k − 1 such intervals, and hence at most k in total. Summing over all n!
bijections from
[n] to Zn , we see that
N ≤ kn!.
On the other hand, each A ∈ A is an interval under n(n−k)!k! bijections,
so
N = |A|n(n − k)!k!.

Combining these bounds on N gives
\[ |\mathcal{A}| \le \frac{k\,n!}{n\,(n-k)!\,k!} = \binom{n-1}{k-1}. \]
□
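The two counting facts used in this proof can be checked by brute force for small parameters. The following Python sketch is ours (the helper names are hypothetical, not from the notes): it counts, for each k-set A, the bijections f with f(A) an interval, and it finds the largest pairwise-intersecting collection of cyclic intervals of length k in $\mathbb{Z}_n$.

```python
from itertools import combinations, permutations
from math import factorial

def is_interval(image, n, k):
    """True if `image` (a set of residues mod n) is {i, i+1, ..., i+k-1} mod n."""
    return any(image == {(i + t) % n for t in range(k)} for i in range(n))

def count_interval_bijections(A, n):
    """Number of bijections f: [n] -> Z_n for which f(A) is an interval."""
    k = len(A)
    # perm[x - 1] plays the role of f(x)
    return sum(
        is_interval({perm[a - 1] for a in A}, n, k)
        for perm in permutations(range(n))
    )

def max_intersecting_intervals(n, k):
    """Largest number of pairwise-intersecting cyclic intervals of length k."""
    intervals = [frozenset((i + t) % n for t in range(k)) for i in range(n)]
    best = 0
    for mask in range(1 << n):
        chosen = [intervals[i] for i in range(n) if mask >> i & 1]
        if all(a & b for a, b in combinations(chosen, 2)):
            best = max(best, len(chosen))
    return best

n, k = 5, 2
counts = [count_interval_bijections(A, n) for A in combinations(range(1, n + 1), k)]
print(set(counts), n * factorial(k) * factorial(n - k))
print(max_intersecting_intervals(5, 2), max_intersecting_intervals(7, 3))
```

The first line should show that every k-set is an interval under exactly n(n − k)!k! bijections; the second illustrates the "at most k intervals" claim when 2k ≤ n.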

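Now for a proof involving shadows. Recall that $\partial\mathcal{A}$ is the shadow of the
set system $\mathcal{A}$. We write $\partial^{(2)}\mathcal{A} = \partial(\partial\mathcal{A})$,
$\partial^{(3)}\mathcal{A} = \partial(\partial^{(2)}\mathcal{A})$, and so on. Note
that $\partial^{(k)}\mathcal{A}$ is the collection of sets B that can be obtained from some
$A \in \mathcal{A}$ by deleting k elements.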
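Second proof of the Erdős-Ko-Rado Theorem. Let $\mathcal{A} \subseteq [n]^{(r)}$ be an
intersecting family, and set
\[ \mathcal{B} = \{A^c : A \in \mathcal{A}\} \subseteq [n]^{(n-r)}. \]
Since $\mathcal{A}$ is intersecting, no set $A \in \mathcal{A}$ is contained in any set
$B \in \mathcal{B}$. So
\[ \partial^{(n-2r)}\mathcal{B} \subseteq [n]^{(r)} \]
is disjoint from $\mathcal{A}$.
Now if $|\mathcal{A}| \ge \binom{n-1}{r-1}$ then
\[ |\mathcal{B}| = |\mathcal{A}| \ge \binom{n-1}{r-1} = \binom{n-1}{n-r}. \]
We now apply Kruskal-Katona repeatedly:
\[ |\partial\mathcal{B}| \ge |\partial [n-1]^{(n-r)}| = \binom{n-1}{n-r-1} \]
and so
\[ |\partial^{(2)}\mathcal{B}| \ge |\partial [n-1]^{(n-r-1)}| = \binom{n-1}{n-r-2}, \]
and so on, showing at each step that $|\partial^{(i)}\mathcal{B}| \ge \binom{n-1}{n-r-i}$,
until we get
\[ |\partial^{(n-2r)}\mathcal{B}| \ge |\partial [n-1]^{(r+1)}| = \binom{n-1}{n-r-(n-2r)} = \binom{n-1}{r}. \]
So if $|\mathcal{A}| > \binom{n-1}{r-1}$, we get
\[ \binom{n}{r} \ge |\mathcal{A} \cup \partial^{(n-2r)}\mathcal{B}| > \binom{n-1}{r-1} + \binom{n-1}{r} = \binom{n}{r}, \]
which gives a contradiction. □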

Theorem 18 (Liggett’s Theorem). Suppose that $Y_1, \dots, Y_n$ are independent
random variables with $P(Y_i = 1) = 1 - P(Y_i = 0) = p \ge 1/2$ for each i. Let
$\alpha_1, \dots, \alpha_n$ be non-negative numbers summing to 1. Then
\[ P\Big(\sum_i \alpha_i Y_i \ge 1/2\Big) \ge p. \]

Proof. Suppose first that no proper subset of the $\alpha_i$ sum to 1/2. Let
\[ \mathcal{A} = \Big\{A \subseteq [n] : \sum_{i \in A} \alpha_i > 1/2\Big\} \]
and let $\mathcal{A}_k = \mathcal{A} \cap [n]^{(k)}$. Let $N_k = |\mathcal{A}_k|$.
Then $N_k + N_{n-k} = \binom{n}{k}$ since for every set B of size k exactly one of
$\{B, B^c\}$ belongs to $\mathcal{A}$. Also, for each k the family $\mathcal{A}_k$ is
intersecting, so by the Erdős-Ko-Rado theorem we have that $N_k \le \binom{n-1}{k-1}$
when 2k ≤ n.
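We also have that
\[ P\Big(\sum_i \alpha_i Y_i \ge 1/2\Big) = \sum_{k=0}^{n} p^k (1-p)^{n-k} N_k \]
and (by binomial expansion of $(1 - p + p)^{n-1}$) we have
\[ p = \sum_{k=0}^{n} p^k (1-p)^{n-k} \binom{n-1}{k-1}. \]
So
\[ P\Big(\sum_i \alpha_i Y_i \ge 1/2\Big) - p
   = \sum_{k=0}^{n} p^k (1-p)^{n-k} \Big(N_k - \binom{n-1}{k-1}\Big) \]
\[ = \sum_{2k \le n} p^k (1-p)^{n-k} \Big(N_k - \binom{n-1}{k-1}\Big)
   + \sum_{2k > n} p^k (1-p)^{n-k} \Big(N_k - \binom{n-1}{k-1}\Big). \]
Changing variables in the second sum gives
\[ \sum_{2k < n} p^{n-k} (1-p)^{k} \Big(N_{n-k} - \binom{n-1}{k}\Big)
   = \sum_{2k < n} p^{n-k} (1-p)^{k} \Big(\binom{n-1}{k-1} - N_k\Big) \]
so we get that
\[ P\Big(\sum_i \alpha_i Y_i \ge 1/2\Big) - p
   \ge \sum_{2k < n} \Big(p^k (1-p)^{n-k} - p^{n-k} (1-p)^k\Big) \Big(N_k - \binom{n-1}{k-1}\Big) \]
which is non-negative (term-wise) for p ≥ 1/2.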


The case where a proper subset of the $\alpha_i$ sums to 1/2 follows either by
a limiting argument or a more careful version of the above argument
(non-examinable). □
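Liggett's Theorem is easy to test numerically for small n by summing over all $2^n$ outcomes. The Python sketch below is ours; the weights 5/11, 3/11, 2/11, 1/11 are a hypothetical choice, picked so that no subset sums to exactly 1/2, and exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import product

def prob_weighted_majority(alphas, p):
    """P(sum_i alpha_i * Y_i >= 1/2) for independent Bernoulli(p) variables Y_i."""
    total = Fraction(0)
    for ys in product((0, 1), repeat=len(alphas)):
        if sum(a * y for a, y in zip(alphas, ys)) >= Fraction(1, 2):
            ones = sum(ys)
            total += p ** ones * (1 - p) ** (len(ys) - ones)
    return total

# Hypothetical weights summing to 1; no subset of them sums to exactly 1/2.
alphas = [Fraction(5, 11), Fraction(3, 11), Fraction(2, 11), Fraction(1, 11)]
ps = [Fraction(1, 2), Fraction(3, 5), Fraction(3, 4), Fraction(9, 10)]
probs = {p: prob_weighted_majority(alphas, p) for p in ps}
for p, value in probs.items():
    print(p, value, value >= p)
```

At p = 1/2 the probability is exactly 1/2, since exactly one of each complementary pair of subsets has weight above 1/2.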
We next prove the Two Families Theorem, which is due to Bollobás.

Theorem 19. (Two Families Theorem) Let $A_1, \dots, A_k$ and $B_1, \dots, B_k$ be
finite sets such that, for all i,
\[ A_i \cap B_i = \emptyset \]
and, for all i ≠ j,
\[ A_i \cap B_j \ne \emptyset. \]
Then
\[ \sum_{i=1}^{k} \binom{|A_i| + |B_i|}{|A_i|}^{-1} \le 1. \]

If we specify the size of the sets, we get the following useful corollary.

Corollary 20. Let $A_1, \dots, A_k$ be a-sets and $B_1, \dots, B_k$ be b-sets such that,
for all i,
\[ A_i \cap B_i = \emptyset \]
and, for all i ≠ j,
\[ A_i \cap B_j \ne \emptyset. \]
Then
\[ k \le \binom{a+b}{a}. \]
Corollary 20 is an immediate consequence of the Two Families Theorem.

Proof of the Two Families Theorem. We may assume that all sets are subsets
of [n]. For a permutation π of [n], we write $A <_\pi B$ if
\[ \max \pi(A) < \min \pi(B), \]
where we write π(S) := {π(x) : x ∈ S}.
Let $\pi \in S_n$ be chosen uniformly at random from the set of permutations
of [n]. Then, for each i, as $A_i \cap B_i = \emptyset$, we have
\[ P(A_i <_\pi B_i) = \binom{|A_i| + |B_i|}{|A_i|}^{-1}. \]
On the other hand, if $A_i <_\pi B_i$ then $A_j \not<_\pi B_j$ for j ≠ i (as
$A_i \cap B_j$ and $A_j \cap B_i$ are both nonempty). So the events
$(A_i <_\pi B_i)_{i \in [k]}$ are disjoint, and so
\[ 1 \ge \sum_{i=1}^{k} P(A_i <_\pi B_i) = \sum_{i=1}^{k} \binom{|A_i| + |B_i|}{|A_i|}^{-1}, \]
which gives the required inequality. □
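The probability used in this proof can be confirmed by enumerating permutations. In the Python sketch below (ours; the example family is a hypothetical choice), the $A_i$ are all 2-subsets of [5] and $B_i = [5] \setminus A_i$, which satisfies the hypotheses and attains equality in the theorem.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb

def prob_before(A, B, n):
    """P(max pi(A) < min pi(B)) for a uniformly random permutation pi of [n].

    Assumes A and B are nonempty, disjoint subsets of {1..n}.
    """
    good = total = 0
    for perm in permutations(range(1, n + 1)):   # perm[x - 1] plays the role of pi(x)
        total += 1
        if max(perm[a - 1] for a in A) < min(perm[b - 1] for b in B):
            good += 1
    return Fraction(good, total)

n, a = 5, 2
ground = set(range(1, n + 1))
pairs = [(set(A), ground - set(A)) for A in combinations(sorted(ground), a)]
probabilities = [prob_before(A, B, n) for A, B in pairs]
print(probabilities[0], Fraction(1, comb(n, a)))
print(sum(probabilities))
```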
Let us see an application of the Two Families Theorem.
An r-uniform hypergraph H = (V, E) consists of a set V (of vertices) and
a set $E \subseteq V^{(r)}$ (of edges). The complete r-uniform hypergraph on k vertices
is $K_k^{(r)} := ([k], [k]^{(r)})$. Isomorphism is defined just as you expect. If H′ is
isomorphic to H we will often say that H′ is a copy of H.
Let H be an r-uniform hypergraph. We say that an r-uniform hypergraph
G is H-saturated if G does not contain a copy of H, but if we add any edge
to G then the resulting hypergraph contains a copy of H.
For instance, in the case of graphs, if $H = K_3$ then Turán’s Theorem
tells us that the maximal number of edges in an H-saturated graph on n
vertices is $\lfloor n^2/4 \rfloor$; it is an exercise to show that the minimum number of
edges is n − 1.
In general, it is very hard to determine the maximum number of edges in
a $K_k^{(r)}$-saturated hypergraph. Surprisingly, Bollobás determined the minimum
number precisely.
Theorem 21. Let G be an r-uniform hypergraph with vertex set [n], and
suppose that adding any edge to G creates a copy of $K_{r+s}^{(r)}$. Then G has at
least
\[ \binom{n}{r} - \binom{n-s}{r} \]
edges.

Proof. Let $\mathcal{A} = \{A_1, \dots, A_m\}$ be the non-edges of G. For each i, there is an
(r + s)-element set $K_i \supset A_i$ such that adding $A_i$ to G creates a copy of
$K_{r+s}^{(r)}$ with vertex set $K_i$. Let $B_i = [n] \setminus K_i$. Then

• |Ai | = r and |Bi | = n − r − s for each i;

• Ai ∩ Bi = ∅ for each i;

• for distinct i, j, we have $A_i \cap B_j \ne \emptyset$ (or else we would have
$A_i \subseteq [n] \setminus B_j = K_j$, and so G would be missing two edges $A_i$, $A_j$
from the complete r-graph with vertex set $K_j$).

So we can apply the Two Families Theorem to get
\[ m \le \binom{r + (n - r - s)}{r} = \binom{n-s}{r}. \]
□


The above bound is sharp: we can take the r-uniform hypergraph with
vertex set [n] and edges $\{F \in [n]^{(r)} : F \cap [s] \ne \emptyset\}$.
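The sharpness example can be verified directly for small parameters. The Python sketch below is ours (function names and the choice n = 6, r = 2, s = 2 are hypothetical): it builds the hypergraph of all r-sets meeting [s], checks that it contains no complete r-graph on r + s vertices, and checks that adding any missing edge creates one.

```python
from itertools import combinations
from math import comb

def contains_complete(edges, n, r, size):
    """True if some `size`-subset of [n] has all of its r-subsets among `edges`."""
    return any(
        all(frozenset(e) in edges for e in combinations(K, r))
        for K in combinations(range(1, n + 1), size)
    )

def extremal_example(n, r, s):
    """All r-subsets of [n] that meet [s] = {1, ..., s}."""
    return {
        frozenset(F)
        for F in combinations(range(1, n + 1), r)
        if any(x <= s for x in F)
    }

n, r, s = 6, 2, 2
G = extremal_example(n, r, s)
non_edges = [
    frozenset(F) for F in combinations(range(1, n + 1), r) if frozenset(F) not in G
]
print(len(G), comb(n, r) - comb(n - s, r))
print(contains_complete(G, n, r, r + s))
print(all(contains_complete(G | {e}, n, r, r + s) for e in non_edges))
```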

3.2 VC-dimension
Let F ⊆ P(X) and S ⊆ X. The trace of F on S is the set system

F|S := {A ⊆ S : there exists F ∈ F such that F ∩ S = A}.

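We set
\[ \mathrm{tr}_{\mathcal{F}}(S) = |\mathcal{F}|_S|, \]
i.e. the number of sets in the trace of $\mathcal{F}$ on S. We say that S is shattered by
$\mathcal{F}$ if $\mathcal{F}|_S = \mathcal{P}(S)$ (in other words, $\mathrm{tr}_{\mathcal{F}}(S) = 2^{|S|}$).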
The VC-dimension of F is max{|S| : S ⊆ X is shattered by F}. [VC
stands for Vapnik-Chervonenkis.]

Example 1. The family $[n]^{(\le d)}$ has VC-dimension d.
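For finite families these definitions translate directly into code. The following Python sketch is ours (the function names are hypothetical); it computes traces and VC-dimension by brute force, assuming the family is nonempty, and checks Example 1 for n = 5, d = 2.

```python
from itertools import combinations

def trace(family, S):
    """The trace of `family` on S: all intersections F ∩ S."""
    S = frozenset(S)
    return {frozenset(F) & S for F in family}

def is_shattered(family, S):
    return len(trace(family, S)) == 2 ** len(S)

def vc_dimension(family, ground):
    """Largest size of a shattered subset of `ground` (family assumed nonempty)."""
    best = 0   # the empty set is shattered by any nonempty family
    for size in range(1, len(ground) + 1):
        if any(is_shattered(family, S) for S in combinations(ground, size)):
            best = size
        else:
            break   # subsets of shattered sets are shattered, so we can stop
    return best

n, d = 5, 2
ground = list(range(1, n + 1))
small_sets = [frozenset(c) for k in range(d + 1) for c in combinations(ground, k)]
print(len(small_sets), vc_dimension(small_sets, ground))
```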


What is the VC-dimension of the (infinite) family $\mathcal{H}$ consisting of all
half-planes (in $\mathbb{R}^2$)? For instance, {(1, 1), (2, 1), (3, 1)} cannot be shattered
by $\mathcal{H}$ (there is no way to obtain the subset {(1, 1), (3, 1)} by intersecting
{(1, 1), (2, 1), (3, 1)} with half-planes!). However, {(0, 1), (1, 0), (1, 2)} is
easily seen to be shattered by $\mathcal{H}$. So the VC-dimension of $\mathcal{H}$ is at least 3.
(Tricky question: What is the VC-dimension of this system?)
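The Sauer-Shelah Theorem tells us that if a family $\mathcal{A} \subseteq \mathcal{P}(n)$ contains
more than $|[n]^{(\le d)}|$ sets, then its VC-dimension is greater than d:
Theorem 22. If $\mathcal{A} \subseteq \mathcal{P}(n)$ has VC-dimension at most d, then
\[ |\mathcal{A}| \le |[n]^{(\le d)}| = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}. \]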
We shall see a couple of proofs.
Proof 1 of the Sauer-Shelah Theorem. We argue by induction on n + d. Let
\[ f(n, d) = |[n]^{(\le d)}| = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}. \]
If n = 0 or d = 0, the result is trivial. If n, d > 0, let
\[ \mathcal{B} = \{A \setminus \{n\} : A \in \mathcal{A}\} \subseteq \mathcal{P}(n-1), \]
\[ \mathcal{C} = \{A \in \mathcal{A} : n \notin A,\ A \cup \{n\} \in \mathcal{A}\} \subseteq \mathcal{P}(n-1). \]
Then $\mathcal{B}$ has VC-dimension at most d, while $\mathcal{C}$ has VC-dimension at most d − 1
(as if S is shattered by $\mathcal{C}$, then S ∪ {n} is shattered by $\mathcal{A}$; so |S| ≤ d − 1).
Hence, by induction, $|\mathcal{B}| \le f(n-1, d)$ and $|\mathcal{C}| \le f(n-1, d-1)$ and
\[ |\mathcal{A}| = |\mathcal{B}| + |\mathcal{C}| \le f(n-1, d) + f(n-1, d-1) = f(n, d). \]
□


Proof 2 of the Sauer-Shelah Theorem. Define the i-compression operator by
\[ \pi_i(A) = A \setminus \{i\} \]
for a set A, and
\[ \pi_i(\mathcal{A}) = \{\pi_i(A) : A \in \mathcal{A}\} \cup \{A \in \mathcal{A} : \pi_i(A) \in \mathcal{A}\} \]
for a family $\mathcal{A}$. Then $\pi_i$ does not increase the VC-dimension of a set system
(exercise) and $|\mathcal{A}| = |\pi_i(\mathcal{A})|$. Thus we can repeatedly apply
compression operators until our family is i-compressed for all i ∈ [n] (this
terminates, as every compression either leaves the family unchanged or
decreases the quantity $\sum_{A \in \mathcal{A}} |A|$).
So consider $\mathcal{B}$, the family obtained from $\mathcal{A}$ in this way, which is
i-compressed for every i. If $\mathcal{B}$ contains any set B of size at least d + 1 then
$\mathcal{B}$ contains all subsets of B, and so has VC-dimension at least d + 1. Otherwise,
\[ |\mathcal{A}| = |\mathcal{B}| \le |[n]^{(\le d)}| = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}. \]
□
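The compression step is short to implement. The following Python sketch is ours (function names and the random test family are hypothetical): it applies the i-compressions until nothing changes, and checks on a random family that the size is preserved and that the resulting family is a downset, which is the property the proof uses.

```python
from itertools import combinations
import random

def compress(family, i):
    """i-compression: {A \\ {i}} together with every A whose A \\ {i} is already present."""
    family = set(family)
    return {A - {i} for A in family} | {A for A in family if A - {i} in family}

def fully_compress(family, n):
    """Apply compressions for i = 1..n repeatedly until the family is stable."""
    family = {frozenset(A) for A in family}
    changed = True
    while changed:
        changed = False
        for i in range(1, n + 1):
            new = compress(family, i)
            if new != family:
                family, changed = new, True
    return family

def is_downset(family):
    return all(A - {x} in family for A in family for x in A)

n = 5
random.seed(0)   # fixed seed so the example is reproducible
all_sets = [frozenset(c) for k in range(n + 1) for c in combinations(range(1, n + 1), k)]
family = random.sample(all_sets, 12)
compressed = fully_compress(family, n)
print(len(family), len(compressed), is_downset(compressed))
```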


3.3 A brief interlude on upsets and downsets


A family A is an upset if A ∈ A and A ⊆ B implies that B ∈ A. A is a
downset if A ∈ A and A ⊃ B implies that B ∈ A.
Theorem 23. (Kleitman’s Theorem) Let $\mathcal{A}, \mathcal{B} \subseteq \mathcal{P}(n)$ be downsets. Then
\[ |\mathcal{A} \cap \mathcal{B}| \ge \frac{|\mathcal{A}||\mathcal{B}|}{2^n}. \]
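Proof. We argue by induction on n. The case n = 1 is straightforward. For
n > 1, define
\[ \mathcal{A}^+ = \{A \subseteq [n-1] : A \cup \{n\} \in \mathcal{A}\} \]
and
\[ \mathcal{A}^- = \{A \subseteq [n-1] : A \in \mathcal{A}\}. \]
Define $\mathcal{B}^+$ and $\mathcal{B}^-$ similarly.
Since $\mathcal{A}$ is a downset, $\mathcal{A}^+$, $\mathcal{A}^-$ are downsets and
$\mathcal{A}^+ \subseteq \mathcal{A}^-$; similarly for $\mathcal{B}^+$ and $\mathcal{B}^-$. Then, by induction,
\[ |\mathcal{A} \cap \mathcal{B}| = |\mathcal{A}^+ \cap \mathcal{B}^+| + |\mathcal{A}^- \cap \mathcal{B}^-| \]
\[ \ge \frac{|\mathcal{A}^+||\mathcal{B}^+|}{2^{n-1}} + \frac{|\mathcal{A}^-||\mathcal{B}^-|}{2^{n-1}} \]
\[ = \frac{1}{2^n}(|\mathcal{A}^+| + |\mathcal{A}^-|)(|\mathcal{B}^+| + |\mathcal{B}^-|)
   + \frac{1}{2^n}(|\mathcal{A}^+| - |\mathcal{A}^-|)(|\mathcal{B}^+| - |\mathcal{B}^-|) \]
\[ \ge \frac{|\mathcal{A}||\mathcal{B}|}{2^n}, \]
since $|\mathcal{A}^+| - |\mathcal{A}^-| \le 0$ and $|\mathcal{B}^+| - |\mathcal{B}^-| \le 0$. □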

Some authors call the above Theorem “Harris’ Lemma” or “Harris-Kleitman
Lemma”.
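Kleitman's Theorem says that two downsets are positively correlated, and this is easy to test empirically. The Python sketch below is ours (the helper names and the way random downsets are generated are hypothetical choices): it builds downsets as down-closures of a few random generators and checks the inequality in the form $|\mathcal{A} \cap \mathcal{B}| \cdot 2^n \ge |\mathcal{A}||\mathcal{B}|$.

```python
from itertools import combinations
import random

def down_closure(generators):
    """Smallest downset containing the given sets."""
    closed = set()
    for G in generators:
        G = tuple(G)
        for size in range(len(G) + 1):
            closed.update(frozenset(c) for c in combinations(G, size))
    return closed

def random_downset(n, generators=3):
    gens = [
        random.sample(range(1, n + 1), random.randint(0, n))
        for _ in range(generators)
    ]
    return down_closure(gens)

def kleitman_holds(A, B, n):
    # Integer form of |A ∩ B| >= |A||B| / 2^n, avoiding division.
    return len(A & B) * 2 ** n >= len(A) * len(B)

random.seed(0)   # fixed seed so the example is reproducible
n = 6
trials = [(random_downset(n), random_downset(n)) for _ in range(200)]
print(all(kleitman_holds(A, B, n) for A, B in trials))
```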

3.4 More on intersecting families


A family A ⊆ P(n) is t-intersecting if |A ∩ B| ≥ t, for all A, B ∈ A.
When n is large enough, the Erdős-Ko-Rado Theorem generalises as fol-
lows.
Theorem 24. Let 1 < t ≤ k be positive integers. There exists an integer
$n_0 = n_0(k, t)$ such that the following holds. For all $n > n_0$, if
$\mathcal{A} \subseteq [n]^{(k)}$ is t-intersecting, then
\[ |\mathcal{A}| \le \binom{n-t}{k-t}, \]
with equality if and only if $\mathcal{A}$ is of the form $\{A \in [n]^{(k)} : T \subseteq A\}$,
where T is a t-element subset of [n].
Proof. We may assume that $\mathcal{A}$ is maximal and so there are $A, B \in \mathcal{A}$ with
|A ∩ B| = t (exercise).
Let us fix $A, B \in \mathcal{A}$ with |A ∩ B| = t. If A ∩ B ⊆ C, for all $C \in \mathcal{A}$, then
$|\mathcal{A}| \le \binom{n-t}{k-t}$ as required.
So suppose that there exists $C \in \mathcal{A}$ with A ∩ B ⊄ C. Every $D \in \mathcal{A}$ must
have at least t + 1 elements in A ∪ B ∪ C. Thus
\[ |\mathcal{A}| \le \binom{|A \cup B \cup C|}{t+1} \binom{n}{k-t-1} \le (3k)^{t+1} n^{k-t-1} < \binom{n-t}{k-t}, \]
provided n is large enough. □
What if we allow only intersection of a fixed size?
Theorem 25 (Fisher’s Inequality). Let k ≥ 1. Suppose that A ⊆ P(n)
satisfies |A ∩ B| = k for all distinct A, B ∈ A. Then |A| ≤ n.
Proof. If there exists A ∈ A with |A| = k, then A ⊆ B for all B ∈ A, and
the family {B \ A : B ∈ A} consists of pairwise disjoint sets, and so |A| ≤ n.
Otherwise, we may assume that |A| > k for all $A \in \mathcal{A}$. Let $\chi_A \in \mathbb{R}^n$
denote the characteristic vector of A, so
\[ \chi_A(i) = \begin{cases} 1 & i \in A \\ 0 & i \notin A. \end{cases} \]

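We claim that the vectors in $\{\chi_A : A \in \mathcal{A}\}$ are linearly independent. Then
$|\mathcal{A}| = |\{\chi_A : A \in \mathcal{A}\}| \le \dim(\mathbb{R}^n) = n$.
So suppose that $\sum_{A \in \mathcal{A}} \lambda_A \chi_A = 0$. For any $B \in \mathcal{A}$, since
$\langle \chi_A, \chi_B \rangle = |A \cap B| = k$ for A ≠ B and
$\langle \chi_B, \chi_B \rangle = |B|$, we have
\[ 0 = \Big\langle \sum_{A \in \mathcal{A}} \lambda_A \chi_A, \chi_B \Big\rangle
     = \lambda_B |B| + \sum_{A \ne B} \lambda_A k = \lambda_B (|B| - k) + \Lambda k, \]
where $\Lambda = \sum_{A \in \mathcal{A}} \lambda_A$. Thus (noting that |B| > k)
\[ \lambda_B = -\frac{\Lambda k}{|B| - k}. \]
If Λ = 0, then $\lambda_B = 0$ for all $B \in \mathcal{A}$. If Λ ≠ 0, then Λ and $\lambda_B$ have
opposite sign. But this holds for any $B \in \mathcal{A}$, which is impossible since
$\Lambda = \sum_{B \in \mathcal{A}} \lambda_B$. We conclude that the vectors $\chi_A$ are linearly
independent, as required. □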
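The linear-independence claim can be illustrated on a concrete family. The Python sketch below is ours: it uses the seven lines of the Fano plane on [7] (any two meet in exactly one point, so k = 1 and every set has size 3 > k) and computes the rank of the matrix of characteristic vectors over the rationals with a small hand-written Gaussian elimination.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a matrix over the rationals, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk

# The seven lines of the Fano plane, as subsets of {1, ..., 7}.
fano = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
n = 7
vectors = [[1 if i in A else 0 for i in range(1, n + 1)] for A in fano]
print(all(len(A & B) == 1 for A, B in combinations(fano, 2)))
print(rank(vectors), len(fano))
```

Here the family has exactly n = 7 members, so Fisher's Inequality is attained.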
We continue with modular intersection theorems, where the allowed sizes
of intersections are specified modulo p. Here is our first example.
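Theorem 26. (Oddtown Theorem) Let $\mathcal{A} \subseteq \mathcal{P}(n)$ be a family such that
• |A| is odd for all $A \in \mathcal{A}$
• |A ∩ B| is even for all distinct $A, B \in \mathcal{A}$.
Then $|\mathcal{A}| \le n$.
First proof of the Oddtown Theorem. We work over the field with two elements
$\mathbb{F}_2$. We identify each element of $\mathcal{A}$ with its characteristic vector in
$\mathbb{F}_2^n$. Then, for all $A, B \in \mathcal{A}$, we have
\[ \langle \chi_A, \chi_B \rangle = |A \cap B| = \begin{cases} 0 & \text{if } A \ne B \\ 1 & \text{if } A = B. \end{cases} \]
We claim that $\{\chi_A : A \in \mathcal{A}\}$ is linearly independent in $\mathbb{F}_2^n$. If
$\sum_{A \in \mathcal{A}} \lambda_A \chi_A = 0$, then, for all $B \in \mathcal{A}$, we have
\[ 0 = \Big\langle \sum_{A \in \mathcal{A}} \lambda_A \chi_A, \chi_B \Big\rangle = \lambda_B. \]
Hence the vectors $\{\chi_A : A \in \mathcal{A}\}$ are linearly independent and so
$|\mathcal{A}| \le n$. □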

Proof 2 of the Oddtown Theorem. We work over the field with two elements
$\mathbb{F}_2$. Let $\mathcal{A} = \{A_1, \dots, A_m\}$.
Let $M = (m_{ij})$ be the m × n incidence matrix where
\[ m_{ij} = \begin{cases} 1 & j \in A_i \\ 0 & j \notin A_i. \end{cases} \]
Then $N = M M^T$ is the m × m identity matrix. But then
$|\mathcal{A}| = m = \mathrm{rank}(N) \le \mathrm{rank}(M) \le n$. □
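The matrix argument can be traced on a small example. The Python sketch below is ours (the family of all 3-subsets of [4] is a hypothetical choice satisfying the Oddtown conditions): it forms the incidence matrix M, checks that $M M^T$ is the identity modulo 2, and computes the rank of M over $\mathbb{F}_2$ using bitmask elimination.

```python
def incidence_matrix(family, n):
    return [[1 if j in A else 0 for j in range(1, n + 1)] for A in family]

def gram_mod2(M):
    """The matrix M * M^T with entries reduced modulo 2."""
    return [[sum(a * b for a, b in zip(r1, r2)) % 2 for r2 in M] for r1 in M]

def rank_mod2(M):
    """Rank over F_2, treating each row as a bitmask."""
    rows = [int("".join(map(str, row)), 2) for row in M]
    rk = 0
    for bit in reversed(range(len(M[0]))):
        pivot = next((i for i in range(rk, len(rows)) if rows[i] >> bit & 1), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i] >> bit & 1:
                rows[i] ^= rows[rk]
        rk += 1
    return rk

# Odd sizes (3) and even pairwise intersections (2).
family = [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}]
n = 4
M = incidence_matrix(family, n)
identity = [[1 if i == j else 0 for j in range(len(family))] for i in range(len(family))]
print(gram_mod2(M) == identity, rank_mod2(M))
```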
For the next theorem, we need to introduce the multilinearization trick.
Given a polynomial f in one or more variables, we define $\tilde f$ to be the
polynomial obtained by replacing every occurrence $x^i$, i > 1, of each variable x
by just x. For instance if $f(x, y, z) = 4x^3 + xy + z^{10}$ then
$\tilde f(x, y, z) = 4x + xy + z$.
Here is the crucial observation: if we evaluate f and $\tilde f$ at a point where all
variables take values 0 or 1 then they both take the same value.
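To make the trick concrete, here is a small Python sketch (ours; the dictionary representation of polynomials is an assumption made for illustration). A polynomial is stored as a map from exponent tuples to coefficients, multilinearization caps every exponent at 1, and we check the crucial observation on the example above.

```python
from itertools import product

def multilinearize(poly):
    """poly maps exponent tuples to coefficients; cap every exponent at 1."""
    result = {}
    for exps, coeff in poly.items():
        key = tuple(min(e, 1) for e in exps)
        result[key] = result.get(key, 0) + coeff
    return result

def evaluate(poly, point):
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term *= x ** e   # note 0 ** 0 == 1 in Python
        total += term
    return total

# f(x, y, z) = 4x^3 + xy + z^10, the example from the text
f = {(3, 0, 0): 4, (1, 1, 0): 1, (0, 0, 10): 1}
f_tilde = multilinearize(f)
print(f_tilde)
print(all(evaluate(f, pt) == evaluate(f_tilde, pt) for pt in product((0, 1), repeat=3)))
```

Away from 0-1 points the two polynomials differ in general, for example at (2, 0, 0).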
Here is a good example of this trick in action.

Theorem 27 (Modular Frankl-Wilson Theorem). Let p be prime and
S ⊆ {0, 1, . . . , p − 1}. Suppose that $\mathcal{A} \subseteq \mathcal{P}(n)$ satisfies:

• |A| ∉ S (modulo p), for all $A \in \mathcal{A}$;

• |A ∩ B| ∈ S (modulo p), for all distinct $A, B \in \mathcal{A}$.

Then
\[ |\mathcal{A}| \le \sum_{i=0}^{|S|} \binom{n}{i}. \]

Proof. We work over the field $\mathbb{F}_p$ with a prime number p of elements, and
introduce variables $x = (x_1, \dots, x_n)$. For each $A \in \mathcal{A}$, we define the
polynomial
\[ f_A(x) = \prod_{s \in S} \Big(\sum_{i \in A} x_i - s\Big). \]
Then, for $B \in \mathcal{A}$, if B ≠ A we have
\[ f_A(\chi_B) = \prod_{s \in S} (|A \cap B| - s) = 0, \]
while if A = B we have
\[ f_A(\chi_B) = \prod_{s \in S} (|A| - s) \ne 0. \]
We now replace each polynomial $f_A$ by the corresponding multilinear
polynomial $\tilde f_A$. For any $\chi \in \{0, 1\}^n$, we have
\[ \tilde f_A(\chi) = f_A(\chi). \]
It follows that, for $B \in \mathcal{A}$, if B ≠ A we have
\[ \tilde f_A(\chi_B) = \prod_{s \in S} (|A \cap B| - s) = 0, \]
while if B = A we have
\[ \tilde f_A(\chi_A) = \prod_{s \in S} (|A| - s) = \alpha_A \ne 0. \]
The polynomials $\{\tilde f_A(x) : A \in \mathcal{A}\}$ are linearly independent. For if
$\sum_{A \in \mathcal{A}} \lambda_A \tilde f_A = 0$ then, for any $B \in \mathcal{A}$,
\[ 0 = \Big(\sum_{A \in \mathcal{A}} \lambda_A \tilde f_A\Big)(\chi_B) = \alpha_B \lambda_B, \]
thus $\lambda_B = 0$. The $\tilde f_A$ are therefore linearly independent, and lie in the
space of multilinear polynomials of degree at most |S|. This has dimension
\[ \sum_{i=0}^{|S|} \binom{n}{i} \]
(it is spanned by the monomials $\{\prod_{i \in A} x_i : |A| \le |S|\}$). Thus
$|\mathcal{A}| \le \sum_{i=0}^{|S|} \binom{n}{i}$. □

What if we drop the modular constraint?
For a set S ⊆ N, we say that a family A is S-intersecting if |A ∩ B| ∈ S
for all distinct A, B ∈ A.

Theorem 28 (Frankl-Wilson Theorem). Let $\mathcal{A} \subseteq \mathcal{P}(n)$ be S-intersecting.
Then
\[ |\mathcal{A}| \le \sum_{i=0}^{|S|} \binom{n}{i}. \]
We won’t prove the full version of F-W here: instead we prove it under
the additional assumption that |A| ∉ S for all $A \in \mathcal{A}$.
Proof. This is an easy deduction from the Modular Frankl-Wilson Theorem:
just apply the modular version of the theorem with a prime p > n. □
The next result is a uniform version of the Modular Frankl-Wilson The-
orem.

Theorem 29 (Ray-Chaudhuri-Wilson Theorem). Let $\mathcal{A} \subseteq [n]^{(k)}$ be an
S-intersecting system, where S ⊆ {0, . . . , k − 1}. Then
\[ |\mathcal{A}| \le \binom{n}{|S|}. \]

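Proof. Let $x = (x_1, \dots, x_n)$. For each $A \in \mathcal{A}$, we define
\[ f_A(x) = \prod_{s \in S} \Big(\sum_{i \in A} x_i - s\Big). \]
So for $A, B \in \mathcal{A}$ we have
\[ f_A(\chi_B) = \prod_{s \in S} (|A \cap B| - s) \begin{cases} = 0 & A \ne B \\ \ne 0 & A = B. \end{cases} \]
Now for B ⊆ [n], let
\[ p_B(x) = \prod_{i \in B} x_i. \]
(In particular $p_\emptyset = 1$.) We have
\[ p_A(\chi_B) = \begin{cases} 1 & A \subseteq B \\ 0 & A \not\subseteq B. \end{cases} \]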

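Finally, define
\[ q(x) = \sum_{i=1}^{n} x_i - k, \]
so $q(\chi_B) = 0$ for $B \in \mathcal{A}$, and $q(\chi_B) \ne 0$ if |B| < k.
Define the collection
\[ G = \{f_A : A \in \mathcal{A}\} \cup \{q \cdot p_B : |B| \le |S| - 1\}, \]
and let $\tilde G$ be the multilinearized collection
\[ \tilde G = \{\tilde f_A : A \in \mathcal{A}\} \cup \{\widetilde{q p_B} : |B| \le |S| - 1\}. \]
We claim that the collection $\tilde G$ is linearly independent. If this is true
then note that all the polynomials in $\tilde G$ are multilinear and have degree at
most |S|. Thus they lie in a space of dimension $\sum_{i=0}^{|S|} \binom{n}{i}$. However,
the set $\{\widetilde{q p_B} : |B| \le |S| - 1\}$ has size $\sum_{i=0}^{|S|-1} \binom{n}{i}$.
So the set $\mathcal{A}$ has size at most $\binom{n}{|S|}$.
All that remains is to prove our claim. Suppose that
\[ \sum_{A \in \mathcal{A}} \lambda_A \tilde f_A + \sum_{|B| \le |S| - 1} \mu_B \widetilde{q p_B} = 0. \]
For $C \in \mathcal{A}$, we have
\[ \widetilde{q p_B}(\chi_C) = q p_B(\chi_C) = 0, \]
so for all $C \in \mathcal{A}$ we have
\[ 0 = \Big(\sum_{A \in \mathcal{A}} \lambda_A \tilde f_A + \sum_{|B| \le |S| - 1} \mu_B \widetilde{q p_B}\Big)(\chi_C) = \lambda_C \tilde f_C(\chi_C), \]
and hence $\lambda_C = 0$ (as $\tilde f_C(\chi_C) \ne 0$). It follows that we must have
\[ \sum_{|B| \le |S| - 1} \mu_B \widetilde{q p_B} = 0. \]
Now if not all $\mu_C$ are 0 then let C be of minimal size with $\mu_C \ne 0$. Then
\[ \mu_B \widetilde{q p_B}(\chi_C) \begin{cases}
   = 0 & B \subseteq C,\ B \ne C \quad (\text{as } \mu_B = 0) \\
   = 0 & B \not\subseteq C \quad (\text{as } p_B(\chi_C) = 0) \\
   \ne 0 & B = C.
\end{cases} \]
So
\[ 0 = \sum_{|B| \le |S| - 1} \mu_B \widetilde{q p_B}(\chi_C) = \mu_C \widetilde{q p_C}(\chi_C) \ne 0, \]
which gives a contradiction. Thus the claim holds. □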

3.5 Borsuk’s Conjecture
In 1933, Borsuk conjectured that if $K \subseteq \mathbb{R}^d$ has diameter 1 then it can be
partitioned into d + 1 sets of diameter less than 1. (It’s easy to see that there
are sets for which d + 1 is necessary: consider the regular simplex.)
Borsuk’s conjecture remained open until 1993, when Kahn and Kalai
showed that it is false.

Theorem 30. Let k(d) be the smallest integer k such that every subset of
$\mathbb{R}^d$ of diameter 1 can be partitioned into k(d) sets of smaller diameter. Then
there exists c > 1 such that
\[ k(d) \ge c^{\sqrt{d}} \]
for infinitely many d.

In fact, Kahn and Kalai proved that (with a little more work) the bound
holds for all d. Surprisingly, their result uses the Modular Frankl-Wilson
Theorem.
We will need a preliminary lemma.

Lemma 31. Let p be a prime and suppose $\mathcal{A} \subseteq [4p]^{(2p)}$. Suppose there is no
pair of distinct $A, B \in \mathcal{A}$ with |A ∩ B| = p. Then $|\mathcal{A}| \le 4\binom{4p}{p-1}$.

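Proof. For x ∈ [4p], let $\mathcal{A}_x := \{A \in \mathcal{A} : x \in A\}$. We can choose x such that
$|\mathcal{A}_x| \ge \frac{1}{2}|\mathcal{A}|$ (each set contains half of the 4p elements, so an
average element lies in half of the sets). Then, for distinct $A, B \in \mathcal{A}_x$, we
have |A ∩ B| ≠ 0, p, 2p. So, for distinct $A, B \in \mathcal{A}_x$, we have
\[ |A \cap B| \not\equiv 0 \mod p. \]
On the other hand, for all $A \in \mathcal{A}_x$,
\[ |A| \equiv 0 \mod p. \]
We can therefore apply the Modular Frankl-Wilson Theorem (with
S = {1, . . . , p − 1}) to get
\[ |\mathcal{A}_x| \le \sum_{i=0}^{p-1} \binom{4p}{i} \le 2\binom{4p}{p-1}, \]
since $\binom{4p}{i} \ge 2\binom{4p}{i-1}$, for all i ≤ p − 1. The result follows. □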

Proof of the Kahn-Kalai Theorem. Let $d = \binom{4p}{2}$, where p is a prime. We
shall construct a set $K \subseteq \mathbb{R}^d$. In fact, let
\[ W = [4p]^{(2)} = E(K_{4p}), \]
so we identify $[4p]^{(2)}$ with the edges of the complete graph with vertex set
[4p]. We will work in $\mathbb{R}^W$. Note that

• Coordinates in $\mathbb{R}^W$ are indexed by edges of $K_{4p}$; and

• 0-1 vectors correspond to subgraphs of $K_{4p}$.

For each $A \in [4p]^{(2p)}$, let
\[ E_A = \{\{i, j\} \in W : |A \cap \{i, j\}| = 1\}; \]
in other words $E_A$ is the edge set of the complete bipartite graph with vertex
classes A and $A^c$. We set
\[ \mathcal{F} = \{E_A : A \in [4p]^{(2p)}\}. \]
Since $|A| = |A^c| = 2p$ and $E_A = E_{A^c}$, we have
\[ |\mathcal{F}| = \frac{1}{2}\binom{4p}{2p}. \]
We identify each element $E_A$ of $\mathcal{F}$ with the vector $v_A \in \mathbb{R}^W$ given by
\[ (v_A)_e = \begin{cases} 1 & e \in E_A \\ 0 & e \notin E_A. \end{cases} \]
Let K be the set of points in $\mathbb{R}^W$ corresponding to $\mathcal{F}$.
Now for $A, B \in [4p]^{(2p)}$,
\[ \|v_A - v_B\|_2 = \Big(\sum_e \big((v_A)_e - (v_B)_e\big)^2\Big)^{1/2}
   = \big(|E_A| + |E_B| - 2|E_A \cap E_B|\big)^{1/2}. \]
So $\|v_A - v_B\|_2$ increases as $|E_A \cap E_B|$ decreases. It is easy to check that
$|E_A \cap E_B|$ is minimal if and only if |A ∩ B| = p.
Now suppose L ⊆ K satisfies diam(L) < diam(K). In the set $\mathcal{A} \subseteq \mathcal{F}$
corresponding to L there can be no pair A, B with |A ∩ B| = p. So by the
lemma, $|\mathcal{A}| \le 4\binom{4p}{p-1}$, and so
\[ |\mathcal{F}| / |\mathcal{A}| \ge \frac{1}{8}\binom{4p}{2p} \Big/ \binom{4p}{p-1}. \]
For large p, an application of Stirling’s formula shows that this is at least
$1.68^p$, which is at least $1.2^{\sqrt{d}}$. Thus if we want to partition K into sets of
smaller diameter we need at least $1.2^{\sqrt{d}}$ sets. □
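Remark. A couple of people asked if I could provide a few more details for
that last application of Stirling. By Example sheet 1, question 5(b), we
have that $\binom{4p}{2p} = 2^{(1+o(1))4p}$ and $\binom{4p}{p-1} = 2^{(1+o(1))H(1/4)4p}$ where
$H(1/4) = -(1/4)\log_2(1/4) - (3/4)\log_2(3/4) \approx 0.81$. So, with constant factors
absorbed into the o(1) term,
\[ \frac{1}{2}\binom{4p}{2p} \Big/ \binom{4p}{p-1} = 2^{(1+o(1))(1-H(1/4))4p} > 2^{0.75p} \]
for large p (note that $4(1 - H(1/4)) \approx 0.755$). Then we observe that
$d = \binom{4p}{2} = 8(1+o(1))p^2$, so $p = \frac{1+o(1)}{2\sqrt{2}}\sqrt{d}$, so
$2^{0.75p} > 1.2^{\sqrt{d}}$.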

Chapter 4

Combinatorial Nullstellensatz

Let $\mathbb{F}$ be a field. It is an elementary fact that if $f \in \mathbb{F}[x]$ has degree t,
and $S \subseteq \mathbb{F}$ has |S| > t then there is s ∈ S with f(s) ≠ 0.
Theorem 32 (Combinatorial Nullstellensatz). Let $\mathbb{F}$ be a field and let
$f \in \mathbb{F}[x_1, \dots, x_n]$ be a polynomial of degree t. Suppose that the coefficient of
$\prod_{i=1}^{n} x_i^{t_i}$ in f is nonzero, where $t_1 + \cdots + t_n = t$. If $S_1, \dots, S_n$ are
subsets of $\mathbb{F}$ with $|S_i| \ge t_i + 1$ for each i then there is
$(s_1, \dots, s_n) \in S_1 \times \cdots \times S_n$ such that $f(s_1, \dots, s_n) \ne 0$.
Proof. We argue by induction on t = deg(f). If t = 1 then the result is
trivial. So suppose t > 1, and f(x) = 0 for all $x \in S_1 \times \cdots \times S_n$. Without
loss of generality, $t_1 > 0$. Choose $s_1 \in S_1$ and write f as
\[ f(x) = (x_1 - s_1) q(x) + r(x), \]
where deg(q) = t − 1, and q has a monomial $x_1^{t_1 - 1} x_2^{t_2} \cdots x_n^{t_n}$ with nonzero
coefficient, and $r(x) \in \mathbb{F}[x_2, \dots, x_n]$.
For any $(s_2, \dots, s_n) \in S_2 \times \cdots \times S_n$ we have
\[ f(s_1, \dots, s_n) = r(s_2, \dots, s_n) \]
and so r vanishes on $S_2 \times \cdots \times S_n$. Thus for any
$s' = (s'_1, \dots, s'_n) \in S_1 \times \cdots \times S_n$, we have
\[ f(s') = (s'_1 - s_1) q(s') + r(s') = (s'_1 - s_1) q(s'). \]
In particular, as $s'_1 - s_1 \ne 0$ for $s'_1 \in S_1 \setminus \{s_1\}$, we see that q vanishes on
$(S_1 \setminus \{s_1\}) \times S_2 \times \cdots \times S_n$. But this contradicts the inductive
hypothesis. □

Let us see some applications.
How many hyperplanes do we need to cover all vertices of a cube in $\mathbb{R}^n$?
This is trivial: two hyperplanes will do. But what if we want to cover all but
one vertex, and leave the last vertex uncovered? It is easy to show that n
hyperplanes suffice. It turns out that this is the minimum!

Theorem 33. Let $H_1, \dots, H_m$ be a family of m hyperplanes in $\mathbb{R}^n$ whose
union contains exactly $2^n - 1$ vertices from $\{0, 1\}^n$. Then m ≥ n.

Proof. We may assume the uncovered vertex is 0. Work over the reals. Each
hyperplane $H_i$ is defined by an equation of form
\[ \langle x, a_i \rangle = b_i. \]
Note that $b_i \ne 0$ as $0 \notin H_i$. So rescaling $a_i$ and $b_i$, we may assume $H_i$ is
given by
\[ \langle x, a_i \rangle = 1. \]
Define
\[ P(x) = (-1)^{m+n} \prod_{j=1}^{n} (x_j - 1) - \prod_{i=1}^{m} (\langle x, a_i \rangle - 1). \]
If m < n then the coefficient of $x_1 \cdots x_n$ is nonzero, so by the
Combinatorial Nullstellensatz there is s ∈ {0, 1} × · · · × {0, 1} such that P(s) ≠ 0.
But if s ≠ 0 then both parts of P(x) are 0; while if s = 0 then we again have
P(s) = 0. This gives a contradiction. □
This is essentially the last point that we reached in lectures. In the next
section are two more well-known applications of the Nullstellensatz, which
are nice to see but do not form part of the examinable material this year.

Non-examinable applications
Given sets A, B in an abelian group, we write

A + B = {a + b : a ∈ A, b ∈ B}.

If A, B ⊆ Z then |A + B| ≥ |A| + |B| − 1 (exercise). The same thing happens


in Zp .

Theorem 34 (Cauchy-Davenport). If p is a prime and $A, B \subseteq \mathbb{Z}_p$ then
\[ |A + B| \ge \min\{p, |A| + |B| - 1\}. \]

Proof. Work over $\mathbb{Z}_p$. Suppose first that |A| + |B| > p. Then for every $c \in \mathbb{Z}_p$,
the sets A, c − B must have a common element, and so there is a solution to
x + y = c with x ∈ A, y ∈ B.
Now suppose |A| + |B| ≤ p and |A + B| ≤ |A| + |B| − 2. Pick C ⊃ A + B
with |C| = |A| + |B| − 2, and set
\[ f(x, y) = \prod_{c \in C} (x + y - c). \]
This has degree |A| + |B| − 2, and the coefficient of $x^{|A|-1} y^{|B|-1}$ is
\[ \binom{|A| + |B| - 2}{|A| - 1}, \]
which is nonzero in $\mathbb{Z}_p$.
But now, applying the Combinatorial Nullstellensatz with $S_x = A$ and
$S_y = B$, we see that there must be (a, b) ∈ A × B with f(a, b) ≠ 0, which
implies that a + b ∉ C, a contradiction. □
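For small moduli the Cauchy-Davenport bound can be checked exhaustively. The Python sketch below is ours (function names are hypothetical); it tests every pair of nonempty subsets of $\mathbb{Z}_m$ and also illustrates that the bound can fail when m is not prime, for example with A = B = {0, 2, 4} in $\mathbb{Z}_6$.

```python
from itertools import combinations

def sumset(A, B, m):
    return {(a + b) % m for a in A for b in B}

def check_cauchy_davenport(m):
    """True if |A + B| >= min(m, |A| + |B| - 1) for all nonempty A, B in Z_m."""
    residues = range(m)
    subsets = [set(c) for k in range(1, m + 1) for c in combinations(residues, k)]
    return all(
        len(sumset(A, B, m)) >= min(m, len(A) + len(B) - 1)
        for A in subsets
        for B in subsets
    )

print(check_cauchy_davenport(5), check_cauchy_davenport(7))
print(check_cauchy_davenport(6), sorted(sumset({0, 2, 4}, {0, 2, 4}, 6)))
```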
Let’s prove a variant of Cauchy-Davenport. For sets A, B define
\[ A \mathbin{\hat{+}} B = \{a + b : a \in A,\ b \in B,\ a \ne b\}. \]
Note that if A = B = {0, . . . , a − 1} then $A \mathbin{\hat{+}} B = \{1, \dots, 2a - 3\}$, so
$|A \mathbin{\hat{+}} B| = |A| + |B| - 3$.
Theorem 35. Let p be prime and $A, B \subseteq \mathbb{F}_p$ be nonempty. Then
\[ |A \mathbin{\hat{+}} B| \ge \min\{p, |A| + |B| - 3\}. \]
Proof. We will prove the stronger result that
$|A \mathbin{\hat{+}} B| \ge \min\{p, |A| + |B| - 3\}$, and if |A| ≠ |B| then
$|A \mathbin{\hat{+}} B| \ge \min\{p, |A| + |B| - 2\}$.
We work over $\mathbb{F}_p$. Note first that if |A| + |B| ≥ p + 2 then for any $c \in \mathbb{Z}_p$
the set A ∩ (c − B) has size at least 2. So there are two pairs (a, b) ∈ A × B
with a + b = c, and one of these must have a ≠ b.
Now if |A| = 1 or |B| = 1 the result is immediate. Also, if |A| = |B| then
we may delete any element from A and use the stronger result. So we may
assume that |A| + |B| ≤ p + 1 and |A| ≠ |B|.

Choose C such that |C| = |A| + |B| − 3 and $A \mathbin{\hat{+}} B \subseteq C$. Define
\[ P(x, y) = (x - y) \prod_{c \in C} (x + y - c). \]
Then P has degree |A| + |B| − 2 and vanishes on A × B. On the other hand,
the coefficient of $x^{|A|-1} y^{|B|-1}$ is
\[ \binom{|A| + |B| - 3}{|A| - 2} - \binom{|A| + |B| - 3}{|A| - 1}
   = \frac{(|A| + |B| - 3)!}{(|A| - 1)!\,(|B| - 1)!} \big((|A| - 1) - (|B| - 1)\big) \]
which is nonzero in $\mathbb{F}_p$ (exercise), giving a contradiction. □
