Alex Scott's Combinatorics Notes
Part C
Alex Scott
Michaelmas 2020
Contents
4 Combinatorial Nullstellensatz
Chapter 1
1.1 Notes
These notes are intended to complement (but not replace) the Part C lecture
course in Combinatorics. In particular, they do not give all the details that
are explained in lectures.
The notes owe much to the book Combinatorics by Béla Bollobás, and to
notes from two courses given elsewhere by Imre Leader. Thanks are due to
Damiano Soardo and Albert Slawinksi, who contributed to earlier versions
of the notes.
Please send any further corrections/suggestions to scott at [Link].
For $n \ge 1$, we write $[n] := \{1, 2, \ldots, n\}$. We will usually write $\mathcal{P}(n)$
instead of $\mathcal{P}([n])$. Note that $|\mathcal{P}(n)| = 2^n$.
We will refer to sets of size $k$ as $k$-sets. For $k \ge 0$ we write
$X^{(k)} = \{A \in \mathcal{P}(X) : |A| = k\}$ for the set of all subsets of $X$ of size $k$. We define
$X^{(<k)}$, $X^{(\le k)}$, $X^{(>k)}$ and $X^{(\ge k)}$ in the obvious way. A family $\mathcal{F} \subseteq \mathcal{P}(X)$ is
$k$-uniform if $\mathcal{F} \subseteq X^{(k)}$.
We think of $X^{(0)}, X^{(1)}, \ldots$ as the layers of $\mathcal{P}(X)$, and refer to $X^{(i)}$ as the
$i$th layer. So the $i$th layer of $\mathcal{P}(n)$ is $[n]^{(i)}$, which has cardinality $\binom{n}{i}$. The
smallest layers of $\mathcal{P}(n)$ are the 0th layer and the $n$th layer: these are $\{\emptyset\}$
and $\{[n]\}$ respectively, and both have size 1. If $n$ is even, the largest layer is
$[n]^{(n/2)}$; if $n$ is odd there are two largest layers, $[n]^{(\lfloor n/2 \rfloor)}$ and $[n]^{(\lceil n/2 \rceil)}$.
There are many ways of looking at the power set P(n). For instance:
• We can turn P(n) into a graph: the discrete cube Qn is the graph with
vertex set P(n) and an edge between A and B if and only if |A△B| = 1.
Here A △ B is the symmetric difference: A △ B := (A \ B) ∪ (B \ A).
• We can turn $\mathcal{P}(n)$ into an abelian group or a vector space $\mathbb{F}_2^n$ by
identifying each $A \subseteq [n]$ with its characteristic vector $\chi_A$, where
\[ \chi_A(i) := \begin{cases} 1 & i \in A \\ 0 & i \notin A. \end{cases} \]
We will move back and forward between these perspectives throughout the
course.
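The two perspectives above can be made concrete in a few lines of Python. This is only an illustrative sketch: the helper names (`char_vector`, `add_mod2`, `power_set`, `cube_edges`) are our own and do not come from the course. It checks that adding characteristic vectors modulo 2 corresponds to taking symmetric differences, and lists the edges of the discrete cube.

```python
from itertools import combinations

def char_vector(A, n):
    """Characteristic vector of A ⊆ [n] as a tuple of 0/1 entries."""
    return tuple(1 if i in A else 0 for i in range(1, n + 1))

def add_mod2(u, v):
    """Coordinatewise sum of two 0/1 vectors over the field with two elements."""
    return tuple((a + b) % 2 for a, b in zip(u, v))

def power_set(n):
    """All subsets of [n], as frozensets."""
    ground = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(ground, k)]

def cube_edges(n):
    """Edges of the discrete cube: pairs of sets with symmetric difference of size 1."""
    sets = power_set(n)
    return [(A, B) for A, B in combinations(sets, 2) if len(A ^ B) == 1]

n = 4
A, B = frozenset({1, 2}), frozenset({2, 3})
# Vector addition over F_2 is the symmetric difference of the underlying sets.
assert add_mod2(char_vector(A, n), char_vector(B, n)) == char_vector(A ^ B, n)
print(len(power_set(n)), len(cube_edges(n)))
```

Each vertex of the cube has degree n, so the number of edges printed should be n times half the number of vertices.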
Chapter 2
Proof. Exercise.
How large can an antichain be? This is more interesting. We noted above
that the layers of $\mathcal{P}(n)$ are antichains: the largest of these has size $\binom{n}{\lfloor n/2 \rfloor}$.
and Y. The neighbourhood $\Gamma(v)$ of a vertex $v \in V$ is $\Gamma(v) = \{u : vu \in E(G)\}$,
and for $S \subseteq V$ we write $\Gamma(S) = \bigcup_{v \in S} \Gamma(v)$. A complete matching from X
to Y is a collection M of vertex-disjoint edges such that every vertex in X is
incident to some edge in M. Here is the result that we will need.
Theorem 3. (Hall’s Theorem) Let G = (V, E) be a bipartite graph with
vertex classes X and Y . Then G has a complete matching from X to Y if
and only if, for all S ⊆ X, we have
|Γ(S)| ≥ |S|. (2.1)
We will refer to (2.1) as Hall’s Condition. Note that if G has
a complete matching then (2.1) clearly holds, so the condition is necessary.
The point of Hall’s Theorem is that it is also sufficient.
Sperner’s Lemma will follow quickly from the following result.
Lemma 4. There is a partition of $\mathcal{P}(n)$ into $\binom{n}{\lfloor n/2 \rfloor}$ chains.
Proof. Consider the subgraph of the discrete cube Qn between the vertices
in two consecutive layers. We claim that:
1. for $r < n/2$, there is a complete matching from $[n]^{(r)}$ to $[n]^{(r+1)}$;
2. for $r > n/2$, there is a complete matching from $[n]^{(r)}$ to $[n]^{(r-1)}$.
If we glue these matchings together, we obtain a collection of $\binom{n}{\lfloor n/2 \rfloor}$ chains
that partition $\mathcal{P}(n)$.
It is therefore enough to prove that these matchings exist. For $r < n/2$, we
consider the bipartite subgraph G of $Q_n$ induced by $[n]^{(r)} \cup [n]^{(r+1)}$. This has
bipartition $([n]^{(r)}, [n]^{(r+1)})$ and an edge between $A \in [n]^{(r)}$ and $B \in [n]^{(r+1)}$ iff
$A \subseteq B$. In order to prove that there is a complete matching, it is enough to
verify Hall’s Condition (2.1). We will do this by a double counting argument.
Consider a set $S \subseteq [n]^{(r)}$, and let $T = \Gamma(S)$. Each $A \in S$ has degree $n - r$
in G (as there are $n - r$ ways to add an element to A to get an $(r + 1)$-set).
So the number of edges between S and T is
\[ e(S, T) = (n - r)|S|. \]
On the other hand, each $B \in [n]^{(r+1)}$ has degree $r + 1$ (as there are $r + 1$
choices for an element to delete from B to get an r-set). So we have
\[ e(S, T) \le (r + 1)|T|. \]
(Note that we may not have equality here, as there may be edges incident with T that
are not incident with S.)
Putting these two bounds together, we get
\[ |T| \ge \frac{n - r}{r + 1}\,|S| \ge |S|, \]
as $r < n/2$ (and so $r \le (n - 1)/2$). So Hall’s Condition is satisfied, and hence
there is a complete matching from $[n]^{(r)}$ to $[n]^{(r+1)}$.
For r > n/2, we can argue similarly. Alternatively we can consider the
effect of replacing every set by its complement.
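For small n, Hall’s Condition between consecutive layers can be checked directly by brute force. The following is a minimal sketch (the function names are our own, chosen for illustration); it tests every subset S of the lower layer, so it is only practical for very small n.

```python
from itertools import combinations

def layer(n, r):
    """The r-th layer of P(n): all r-subsets of {1, ..., n}."""
    return [frozenset(c) for c in combinations(range(1, n + 1), r)]

def neighbourhood(S, upper):
    """Sets in the upper layer that contain some member of S."""
    return {B for B in upper if any(A < B for A in S)}

def halls_condition_holds(n, r):
    """Check |Γ(S)| >= |S| for every nonempty S contained in the r-th layer."""
    lower, upper = layer(n, r), layer(n, r + 1)
    for size in range(1, len(lower) + 1):
        for S in combinations(lower, size):
            if len(neighbourhood(S, upper)) < size:
                return False
    return True

print(halls_condition_holds(5, 1))  # r < n/2
print(halls_condition_holds(5, 2))  # r < n/2
print(halls_condition_holds(5, 3))  # r > n/2: the upper layer is smaller
```

When r > n/2 the condition must fail in the upward direction, simply because the layer above is smaller; this is why the matching goes downwards in that range.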
Remarks:
1. Note that we cannot partition $\mathcal{P}(n)$ into fewer than $\binom{n}{\lfloor n/2 \rfloor}$ chains,
because no two sets from the antichain $[n]^{(\lfloor n/2 \rfloor)}$ can belong to the same
chain.
We will give two proofs of the LYM Inequality. The first one will use a
“local” version of the inequality.
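Let $\mathcal{F} \subseteq X^{(k)}$ be a k-uniform family on X. The (lower) shadow $\partial\mathcal{F}$ of $\mathcal{F}$
is
\[ \partial\mathcal{F} := \{B \in X^{(k-1)} : B \subseteq A \text{ for some } A \in \mathcal{F}\}. \]
\[ \frac{|\partial\mathcal{A}|}{\binom{n}{r-1}} \ge \frac{|\mathcal{A}|}{\binom{n}{r}}. \]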
|E| = r|A|.
as required.
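If we have equality then we must have equality in (2.3), so for every
$B \in \partial\mathcal{A}$ and every $i \notin B$ we have $B \cup i \in \mathcal{A}$. If $\mathcal{A} \ne \emptyset, [n]^{(r)}$ then choose
r-sets $A_1 \in \mathcal{A}$ and $A_2 \notin \mathcal{A}$ with $|A_1 \triangle A_2|$ as small as possible. We can choose
$a_1 \in A_1 \setminus A_2$ and $a_2 \in A_2 \setminus A_1$: then $A_1 - a_1$ is in $\partial\mathcal{A}$ and so $A_3 := (A_1 - a_1) \cup a_2$
is in $\mathcal{A}$. But this gives a contradiction, as $|A_3 \triangle A_2| < |A_1 \triangle A_2|$.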
Remark: In the last paragraph, what we really used was the fact that E
is the edge set of a connected graph. Note also that we are being a little
informal with notation in expressions like (A1 − a1 ) ∪ a2 . As with leaving out
brackets in listing set elements, this is fine as long as it is not ambiguous.
We use the Local LYM Inequality to prove the LYM Inequality.
Proof of the LYM Inequality. Let F ⊆ P(n) be an antichain and, for i =
0, . . . , n, define
Fi := F ∩ [n](i) .
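The proof proceeds by ‘pushing down one layer at a time’. Define $\mathcal{G}_i \subseteq [n]^{(i)}$
recursively by $\mathcal{G}_n = \mathcal{F}_n$ and, for $r < n$,
\[ \mathcal{G}_r := \mathcal{F}_r \cup \partial\mathcal{G}_{r+1}. \]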
Note that Fr and ∂Gr+1 are disjoint, as every set in ∂Gr+1 is contained in
some element of F, and F is an antichain.
We claim that, for r = 0, . . . , n,
\[ \frac{|\mathcal{G}_r|}{\binom{n}{r}} \ge \sum_{i=r}^{n} \frac{|\mathcal{F}_i|}{\binom{n}{i}}. \tag{2.4} \]
If we have equality at the end, we must have had equality at each appli-
cation of Local LYM, and so we have Gr = ∅ or Gr = [n](r) for each r. This
implies that Fr = [n](r) for some r, and Fi = ∅ for all other i.
We now give a second proof of the LYM Inequality. It is very slick, but
it does not give the extremal set systems.
Second proof of the LYM Inequality. Let F be an antichain in P(n). Choose
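a maximal chain $\mathcal{C} = (A_0, \ldots, A_n)$, where $\emptyset = A_0 \subseteq \cdots \subseteq A_n = [n]$, uniformly
at random from all $n!$ maximal chains. For $A \in \mathcal{F}$ with $|A| = k$ we have
\[ \mathbb{P}[A \in \mathcal{C}] = \frac{k!(n - k)!}{n!} = \frac{1}{\binom{n}{k}}. \]
The events $(A \in \mathcal{C})_{A \in \mathcal{F}}$ are pairwise disjoint (a chain contains at most one
set from an antichain), so their probabilities sum to at most 1:
\[ 1 \ge \sum_{A \in \mathcal{F}} \mathbb{P}[A \in \mathcal{C}] = \sum_{A \in \mathcal{F}} \frac{1}{\binom{n}{|A|}}, \]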
where the second inequality follows because $\binom{n}{\lfloor n/2 \rfloor}$ is the largest binomial
coefficient. If we have equality in the first line (where we applied LYM) then
we must have F = [n](i) for some i. But then to get equality in the second
line we must have i = ⌊n/2⌋ or i = ⌈n/2⌉.
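As a quick sanity check on the LYM Inequality, one can compute the sum of $1/\binom{n}{|A|}$ over an antichain exactly, using rational arithmetic. The sketch below is illustrative only; the helper names and the two example families are our own choices.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def is_antichain(family):
    """True if no set of the family is properly contained in another."""
    return all(not (A < B or B < A) for A, B in combinations(family, 2))

def lym_sum(family, n):
    """Sum of 1 / C(n, |A|) over the family, as an exact fraction."""
    return sum(Fraction(1, comb(n, len(A))) for A in family)

n = 4
# The middle layer of P(4), and a smaller antichain spread over two layers.
middle = [frozenset(c) for c in combinations(range(1, n + 1), 2)]
mixed = [frozenset({1}), frozenset({2, 3}), frozenset({2, 4}), frozenset({3, 4})]
for family in (middle, mixed):
    assert is_antichain(family)
    assert lym_sum(family, n) <= 1
    print(len(family), lym_sum(family, n))
```

The middle layer attains the value 1, in line with the characterisation of equality, while the mixed antichain falls strictly below it.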
Lemma 4 and Sperner’s Lemma together tell us that the minimum number
of chains in a partition of P(n) is equal to the maximum size of an antichain.
This is a special case of a more general theorem about partially ordered sets.
A partially ordered set or poset (P, ≤) is a set P with a relation ≤ such
that, for all a, b, c ∈ P , we have
• a ≤ a (reflexivity),
Proof. Since a chain and an antichain meet in at most one element, it is clear
that the number of chains in any cover is at least as large as the maximum
size of an antichain. So we need only prove that there is a cover with this
many chains.
We argue by induction on |P |. The statement is immediate for |P | = 0,
so we assume |P | > 0 and we have proved the statement for smaller posets.
Let m be the maximum size of an antichain and let C be a maximal chain
(i.e. a chain C such that C ∪ {a} is not a chain, for any a ∈ P \ C). [Note
that maximal chain does not necessarily mean chain of maximal size!]
If P \ C contains no antichain of size m then we are done by induction:
we can cover P \ C with m − 1 chains, and then add C to get a cover of P
with m chains.
Otherwise, there is an antichain of size m in P \ C, say A = {a1 , . . . , am }.
Let
\[ S^+ := \{x \in P : x \ge a_i \text{ for some } a_i \in A\} \]
and
\[ S^- := \{x \in P : x \le a_i \text{ for some } a_i \in A\}. \]
Then $S^+ \cap S^- = A$, as A is an antichain. Also, $S^+ \cup S^- = P$ as A is maximal
(if there were $x \notin S^+ \cup S^-$ then $A \cup \{x\}$ would be a larger antichain).
We now check that S + and S − are both proper subsets of P . Let the
maximal chain C have elements {c1 , . . . , ck }, where c1 < · · · < ck . Since C
is maximal, ck is a maximal element of P and c1 is a minimal element of P .
Then:
that each $C_i^+$ must contain exactly one element of A. Relabelling if necessary,
we may assume that $a_i \in C_i^+$, for $i = 1, \ldots, m$. Similarly, we may assume
that $a_i \in C_i^-$, for $i = 1, \ldots, m$.
If $a_i$ is not maximal in $C_i^-$, there exists $b \in C_i^-$ with $b > a_i$. But then,
since $b \in S^-$, we can find $a_j \in A$ with $b \le a_j$. This means $a_i < b \le a_j$, so
$a_i < a_j$, which gives us a contradiction as A is an antichain. So $a_i$ is maximal
in $C_i^-$. Similarly, $a_i$ is minimal in $C_i^+$.
Finally, we glue $C_i^+$ and $C_i^-$ together (the ‘gluing point’ being $a_i$) to
obtain a partition of P into m chains.
There is a ‘dual’ of Dilworth’s Theorem: the minimum number of
antichains in a cover of P is equal to the maximum size of a chain. The proof of
this is an exercise on the first example sheet.
2.2 Symmetric chains and Littlewood-Offord
If $z_1, \ldots, z_n \in \mathbb{C}$ are such that $|z_i| \ge 1$, how many of the $2^n$ sums $\sum_{i \in I} z_i$, where
$I \subseteq [n]$, can equal 0? This question was raised in 1938, by Littlewood and
Offord. A few years later, Erdős found a neat solution in the real case.
Note that the bound in this theorem is sharp: take xi = 1 for all i, and
set α = ⌊n/2⌋.
Proof of Theorem 9. Consider first the effect of replacing $x_i$ with $-x_i$. Let
$S_1, \ldots, S_N$ denote the $N = 2^{n-1}$ sums corresponding to subsets $I \subseteq [n]$ that
do not contain i. Then the full collection of sums of subsets is
\[ S_1, \ldots, S_N, S_1 + x_i, S_2 + x_i, \ldots, S_N + x_i. \]
\[ S_1, \ldots, S_N, S_1 - x_i, S_2 - x_i, \ldots, S_N - x_i, \]
Proof. We argue by induction on n. The case n = 1 is easy! So suppose that
n > 1 and that C1 , . . . , Cm is a partition of P(n − 1) into symmetric chains.
For each chain Ci , say Ci = {A1 , . . . Ak } with A1 ⊆ . . . ⊆ Ak , we define two
chains in P(n):
Ci′ := {A1 , A2 , . . . , Ak , Ak ∪ n}
and
Ci′′ := {A1 ∪ n, . . . , Ak−1 ∪ n}.
If k = 1 then Ci′′ is empty: we discard these empty chains. The resulting
chains give a partition of P(n) into symmetric chains.
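The inductive construction in this proof translates directly into a recursive procedure. The following sketch follows the two rules above; the function names are ours, and the check `is_symmetric_chain` assumes the usual definition of a symmetric chain (consecutive sizes, with the sizes of the first and last sets summing to n).

```python
from math import comb

def symmetric_chains(n):
    """Symmetric chain decomposition of P(n), built by induction on n."""
    if n == 1:
        return [[frozenset(), frozenset({1})]]
    chains = []
    for chain in symmetric_chains(n - 1):
        # First new chain: the old chain, extended by (top set) ∪ {n}.
        chains.append(chain + [chain[-1] | {n}])
        # Second new chain: all but the top set, each with n added.
        shifted = [A | {n} for A in chain[:-1]]
        if shifted:  # discard empty chains, which arise when the old chain has one set
            chains.append(shifted)
    return chains

def is_symmetric_chain(chain, n):
    """Nested sets of consecutive sizes whose first and last sizes sum to n."""
    sizes = [len(A) for A in chain]
    consecutive = all(b == a + 1 for a, b in zip(sizes, sizes[1:]))
    nested = all(A < B for A, B in zip(chain, chain[1:]))
    return consecutive and nested and sizes[0] + sizes[-1] == n

n = 5
chains = symmetric_chains(n)
assert all(is_symmetric_chain(c, n) for c in chains)
assert len(chains) == comb(n, n // 2)
print([len(c) for c in chains])
```

Every chain meets the middle layer (or layers), so the number of chains equals the largest binomial coefficient, as the final assertion confirms for n = 5.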
A symmetric chain decomposition of $\mathcal{P}(n)$ has $\binom{n}{\lfloor n/2 \rfloor}$ chains, each of
which contains an element from the middle layer (or contains an element
from each middle layer, if n is odd). For each $i \le n/2$, there are
\[ \binom{n}{i} - \binom{n}{i-1} \]
chains of size $n - 2i + 1$, and these run from the ith layer to the (n − i)th
layer.
We now return to the Littlewood-Offord problem, and in fact answer it
in any number of dimensions (note that the case k = 2 deals with the special
case of complex numbers).
We prove the existence of a symmetric partition into sparse families by
induction on n. The case n = 1 is easy. So suppose that n > 1 and that
F1 , . . . , Fm is a symmetric partition of P(n − 1) into sparse families (with
respect to the vectors x1 , . . . , xn−1 ).
Suppose Fi = {A1 , . . . , At }, where
\[ \langle x_{A_1}, x_n \rangle \le \cdots \le \langle x_{A_t}, x_n \rangle, \]
and $\langle \cdot, \cdot \rangle$ is the usual inner product on $\mathbb{R}^k$. Define two new families in $\mathcal{P}(n)$:
Fi′ := {A1 , A2 , . . . , At , At ∪ n}
and
Fi′′ := {A1 ∪ n, A2 ∪ n, . . . , At−1 ∪ n},
discarding empty families (as in Proposition 10). We claim that this gives a
symmetric partition of P(n) into sparse families.
It is clear that the construction gives the same number of sets of each size
as in a symmetric chain decomposition of P(n); and that Fi′′ is sparse, since
the corresponding sums are translations by xn of the sums corresponding to
Fi . To see that Fi′ is also sparse, we note that Fi is a sparse subset. Also, for
any j ∈ [r], we have (writing x̂n = xn /||xn ||2 for the unit vector in direction
xn )
2.3 Shadows and Kruskal-Katona
For $\mathcal{A} \subseteq [n]^{(r)}$, the Local LYM Inequality tells us that
\[ \frac{|\partial\mathcal{A}|}{\binom{n}{r-1}} \ge \frac{|\mathcal{A}|}{\binom{n}{r}}, \]
In other words:
Lemma 13. For 1 ≤ i < j ≤ n and F ⊆ [n](r) , we have |∂Cij (F)| ≤ |∂F|.
Proof. Let G = Cij (F): so we must show that |∂G| ≤ |∂F|. It will be enough
to show the following.
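Claim. Let $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$. Then
1. $i \in G'$, $j \notin G'$
2. $(G' \setminus i) \cup j \in \partial\mathcal{F} \setminus \partial\mathcal{G}$.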
If the claim holds, then it implies that Cji gives an injection
Cji : ∂G \ ∂F → ∂F \ ∂G.
Indeed, (1) shows that Cji is injective on ∂G \ ∂F, and (2) shows that the
image is contained in ∂F \ ∂G.
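Thus we need only prove the claim. So consider $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$. There are
$G \in \mathcal{G}$ and $x \in [n]$ such that $G = G' \cup x$. Since $G' \in \partial\mathcal{G} \setminus \partial\mathcal{F}$, we must have
$G \in \mathcal{G} \setminus \mathcal{F}$ and so $i \in G$ and $j \notin G$; we must also have $F := (G \setminus i) \cup j \in \mathcal{F} \setminus \mathcal{G}$.
If $x = i$ then $G' \subseteq F$, so we must have $x \ne i$. So $i \in G'$ and $j \notin G'$,
which proves (1).
Let $F' = C_{ji}(G') = (G' \setminus i) \cup j$. Then $F' \subseteq F$, so $F' \in \partial\mathcal{F}$. All that
remains is to show that
\[ F' \notin \partial\mathcal{G}. \]
Suppose otherwise. Then there is z such that
\[ F' \cup z = (G' \setminus i) \cup j \cup z \in \mathcal{G}. \]
Two cases:
• $z \ne i$: since $C_{ij}(\mathcal{G}) = \mathcal{G}$, we have
\[ C_{ij}(F' \cup z) = G' \cup z \in \mathcal{G}. \]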
We say that a family F ⊆ P(n) is left-compressed if Cij (F) = F for all
1 ≤ i < j ≤ n.
Corollary 14. Let F ⊆ [n](r) . Then there is a left-compressed family A ⊆
[n](r) such that |A| = |F| and |∂A| ≤ |∂F|.
Proof. For $\mathcal{A} \subseteq \mathcal{P}(n)$, we define the function
\[ f(\mathcal{A}) := \sum_{A \in \mathcal{A}} \sum_{a \in A} 2^a. \]
Then for any i < j, applying Cij either leaves A unchanged or strictly de-
creases the value of f .
Let A ⊆ P(n) satisfy |A| = |F|, |∂A| ≤ |∂F| and, subject to this, have
f (A) minimal. Then, by the lemma above, A must be left-compressed.
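The compression argument can be run mechanically on small families. The sketch below assumes the standard definition of the ij-compression (replace j by i in a set whenever j is present, i is absent, and the resulting set is not already in the family); the function names and the example family are our own.

```python
from itertools import combinations

def shadow(family):
    """Lower shadow: all sets obtained by deleting one element from a member."""
    return {A - {x} for A in family for x in A}

def compress_set(A, i, j):
    """Replace j by i when j is in A and i is not; otherwise leave A alone."""
    return (A - {j}) | {i} if j in A and i not in A else A

def compress_family(family, i, j):
    """ij-compression of a family: move a set only if its image is not already present."""
    family = set(family)
    return {A if compress_set(A, i, j) in family else compress_set(A, i, j)
            for A in family}

def left_compress(family, n):
    """Apply ij-compressions (i < j) until the family is left-compressed."""
    family = set(family)
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(1, n + 1), 2):
            new = compress_family(family, i, j)
            if new != family:
                family, changed = new, True
    return family

F = {frozenset(s) for s in [{1, 4}, {2, 3}, {3, 4}, {2, 5}]}
A = left_compress(F, 5)
assert len(A) == len(F)                    # compressions preserve the size
assert len(shadow(A)) <= len(shadow(F))    # and never increase the shadow
print(sorted(sorted(s) for s in A), len(shadow(F)), len(shadow(A)))
```

The loop terminates for the reason given in the proof: each compression that changes the family strictly decreases the weight f.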
Any initial segment of $[n]^{(r)}$ in colex is left-compressed, so we might hope
that Corollary 14 is enough to prove Kruskal-Katona. Unfortunately, not
every left-compressed set system is an initial segment of colex: for instance
{12, 13, 14} is left-compressed but is not an initial segment, as $23 <_{\mathrm{colex}} 14$.
We will need a more general compression operator to prove the
Kruskal-Katona Theorem.
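Let $U, V \subseteq [n]$ satisfy $|U| = |V|$ and $U \cap V = \emptyset$. The UV-compression
operator $C_{UV}$ is the function from $\mathcal{P}(n)$ to $\mathcal{P}(n)$ defined by
\[ C_{UV}(A) = \begin{cases} (A \setminus V) \cup U & \text{if } U \cap A = \emptyset,\ V \subseteq A \\ A & \text{otherwise.} \end{cases} \]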
Proof. We generalize the proof of Lemma 13. Let G = CU V (F) and consider
G′ ∈ ∂G \ ∂F. We will show:
Claim. Let G′ ∈ ∂G \ ∂F. Then
1. U ⊆ G′ , V ∩ G′ = ∅
2. (G′ \ U ) ∪ V ∈ ∂F \ ∂G.
F ′ ∪ z = (G′ \ U ) ∪ V ∪ z ∈ G.
Two cases:
CU V (F ′ ∪ z) = G′ ∪ z ∈ G.
We can now prove the Kruskal-Katona Theorem:
Proof of Kruskal-Katona. Let A ⊆ [n](r) satisfy
• |A| = |F|
• |∂A| ≤ |∂F|
• subject to this, $\sum_{A \in \mathcal{A}} \sum_{i \in A} 2^i$ is minimal.
Let
Λ = {(U, V ) : |U | = |V | > 0, U ∩ V = ∅, max U < max V }.
If A is (U, V )-compressed for all (U, V ) ∈ Λ then A is an initial segment of
colex: if A ∈ A and B <colex A then max(B \ A) < max(A \ B) and so A is
(B \ A, A \ B)-compressed, which implies CB\A,A\B (A) = B ∈ A.
Otherwise, pick (U, V ) ∈ Λ such that A is not (U, V )-compressed and |U |
is minimal. Then A is (U \ u, V \ min V )-compressed for all u ∈ U , and so
by Lemma 15 we have |∂CU V (A)| ≤ |∂A|. But CU V (A) has strictly smaller
weight than A, which contradicts the minimality of the weight of A.
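For small parameters, the conclusion that initial segments of colex minimise the shadow can be confirmed by exhaustive search. This is a brute-force sketch with names of our own choosing; the search over all families of a given size grows very quickly, so the parameters are kept small.

```python
from itertools import combinations

def shadow(family):
    """Lower shadow: all sets obtained by deleting one element from a member."""
    return {A - {x} for A in family for x in A}

def colex_initial_segment(n, r, m):
    """First m sets of the r-th layer of P(n) in colex order.

    For sets of equal size, comparing the elements in decreasing order
    decides the comparison by the largest element of the symmetric difference.
    """
    layer = [frozenset(c) for c in combinations(range(1, n + 1), r)]
    layer.sort(key=lambda A: sorted(A, reverse=True))
    return set(layer[:m])

def min_shadow_brute_force(n, r, m):
    """Smallest shadow over all families of m sets in the r-th layer."""
    layer = [frozenset(c) for c in combinations(range(1, n + 1), r)]
    return min(len(shadow(F)) for F in combinations(layer, m))

n, r = 6, 3
for m in range(1, 6):
    segment = colex_initial_segment(n, r, m)
    assert len(shadow(segment)) == min_shadow_brute_force(n, r, m)
    print(m, len(shadow(segment)))
```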
Chapter 3
Proof. A contains at most one set from each pair (A, [n] \ A).
A much more interesting question is: what is the largest intersecting
family of r-sets in P(n)? There are three regimes to consider:
• r > n/2: This case is trivial, as we can take the entire layer [n](r) .
• r < n/2: This is more interesting! One example is to take all r-sets
containing a fixed element, say 1. This gives a system of size $\binom{n-1}{r-1}$.
Very good. But we could also take all sets that contain at least two
elements from {1, 2, 3}. This also gives an intersecting system, and a
little calculation shows that it has size $3\binom{n-3}{r-2} + \binom{n-3}{r-3}$. A little more
calculation shows that the first system is bigger, but of course we want
to handle all possible systems.
Remark: In fact, for r < n/2, we get equality only for the systems that
consist of all r-sets containing a fixed point. (We won’t prove this here.) For
r = n/2 it is easy to construct many nonisomorphic systems.
We shall give two proofs: the first uses Katona’s ingenious circle method;
the second uses the Kruskal-Katona Theorem.
First proof of the Erdős-Ko-Rado Theorem. Consider any bijection f : [n] →
Zn . We say that A maps to an interval under f if f (A) := {f (a) : a ∈ A} =
{i, i + 1, . . . , i + k − 1}, for some 0 ≤ i ≤ n − 1 (where addition is modulo n).
We will double count the number N of pairs (f, A) such that f : [n] → Zn is
a bijection and f (A) is an interval.
For any fixed f , we claim that at most k sets in A map to intervals under
f . Indeed, suppose A ∈ A and f (A) = {i, i + 1, . . . , i + k − 1}. Since A is
intersecting, any other interval that we get under f must be of form
\[ \{j, j - 1, \ldots, j - (k - 1)\} \]
or
\[ \{j + 1, j + 2, \ldots, j + k\}, \]
for some j ∈ {i, i + 1, . . . , i + k − 2}. But for each j we can get at most one
of these two intervals (as they are disjoint). So we get at most k − 1 such
intervals, and hence at most k in total. Summing over all n! bijections from
[n] to Zn , we see that
N ≤ kn!.
On the other hand, each A ∈ A is an interval under n(n−k)!k! bijections,
so
N = |A|n(n − k)!k!.
Combining these bounds on N gives
\[ |\mathcal{A}| \le \frac{k\,n!}{n(n - k)!\,k!} = \binom{n-1}{k-1}. \]
Now for a proof involving shadows. Recall that ∂A is the shadow of the
set system A. We write ∂ (2) A = ∂(∂A), ∂ (3) A = ∂(∂ (2) A), and so on. Note
that ∂ (k) A is the collection of sets B that can be obtained from some A ∈ A
by deleting k elements.
Second proof of the Erdős-Ko-Rado Theorem. Let A ⊆ [n](r) be an intersect-
ing family, and set
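\[ \mathcal{B} = \{A^c : A \in \mathcal{A}\} \subseteq [n]^{(n-r)}. \]
Since $\mathcal{A}$ is intersecting, no set $A \in \mathcal{A}$ is contained in any set $B \in \mathcal{B}$. So
\[ \partial^{(n-2r)}\mathcal{B} \subseteq [n]^{(r)} \]
is disjoint from $\mathcal{A}$.
Now if $|\mathcal{A}| \ge \binom{n-1}{r-1}$ then
\[ |\mathcal{B}| = |\mathcal{A}| \ge \binom{n-1}{r-1} = \binom{n-1}{n-r}. \]
We now apply Kruskal-Katona repeatedly:
\[ |\partial\mathcal{B}| \ge |\partial [n-1]^{(n-r)}| = \binom{n-1}{n-r-1} \]
and so
\[ |\partial^{(2)}\mathcal{B}| \ge |\partial [n-1]^{(n-r-1)}| = \binom{n-1}{n-r-2}, \]
and so on, showing at each step that $|\partial^{(i)}\mathcal{B}| \ge \binom{n-1}{n-r-i}$, until we get
\[ |\partial^{(n-2r)}\mathcal{B}| \ge |\partial [n-1]^{(r+1)}| = \binom{n-1}{n-r-(n-2r)} = \binom{n-1}{r}. \]
So if $|\mathcal{A}| > \binom{n-1}{r-1}$, we get
\[ \binom{n}{r} \ge |\mathcal{A} \cup \partial^{(n-2r)}\mathcal{B}| > \binom{n-1}{r-1} + \binom{n-1}{r} = \binom{n}{r}, \]
which gives a contradiction.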
Theorem 18 (Liggett’s Theorem). Suppose that $Y_1, \ldots, Y_n$ are independent
random variables with $\mathbb{P}(Y_i = 1) = 1 - \mathbb{P}(Y_i = 0) = p \ge 1/2$ for each i. Let
$\alpha_1, \ldots, \alpha_n$ be non-negative numbers summing to 1. Then
\[ \mathbb{P}\Big(\sum_i \alpha_i Y_i \ge 1/2\Big) \ge p. \]
Proof. Suppose first that no proper subset of the $\alpha_i$ sum to 1/2. Let
\[ \mathcal{A} = \Big\{A \subseteq [n] : \sum_{i \in A} \alpha_i > 1/2\Big\}. \]
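So
\[ \mathbb{P}\Big(\sum_i \alpha_i Y_i \ge 1/2\Big) - p = \sum_{k=0}^{n} p^k (1 - p)^{n-k} \Big(N_k - \binom{n-1}{k-1}\Big) \]
\[ = \sum_{2k \le n} p^k (1 - p)^{n-k} \Big(N_k - \binom{n-1}{k-1}\Big) + \sum_{2k > n} p^k (1 - p)^{n-k} \Big(N_k - \binom{n-1}{k-1}\Big). \]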
so we get that
\[ \mathbb{P}\Big(\sum_i \alpha_i Y_i \ge 1/2\Big) - p \ge \sum_{2k < n} \Big(p^k (1 - p)^{n-k} - p^{n-k} (1 - p)^k\Big) \Big(N_k - \binom{n-1}{k-1}\Big) \]
Ai ∩ Bi = ∅
If we specify the size of the sets, we get the following useful corollary.
Proof of the Two Families Theorem. We may assume that all sets are sub-
sets of [n]. For a permutation π of [n], we write A <π B if
max π(A) < min π(B),
where we write π(S) := {π(x) : x ∈ S}.
Let π ∈ Sn be chosen uniformly at random from the set of permutations
of [n]. Then, for each i, as $A_i \cap B_i = \emptyset$, we have
\[ \mathbb{P}(A_i <_\pi B_i) = \binom{|A_i| + |B_i|}{|A_i|}^{-1}. \]
On the other hand, if $A_i <_\pi B_i$ then $A_j \not<_\pi B_j$ for $j \ne i$ (as $A_i \cap B_j$ and
$A_j \cap B_i$ are both nonempty). So the events $(A_i <_\pi B_i)_{i \in [k]}$ are disjoint, and
so
\[ 1 \ge \sum_{i=1}^{k} \mathbb{P}(A_i <_\pi B_i) = \sum_{i=1}^{k} \binom{|A_i| + |B_i|}{|A_i|}^{-1}, \]
which gives the required inequality.
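The probability used in this proof is easy to confirm by enumerating all permutations for small n. The sketch below is an illustration only; the function name and the example sets are our own, and A and B are assumed to be disjoint and nonempty.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

def prob_A_before_B(A, B, n):
    """Exact probability that max π(A) < min π(B) for a uniform permutation π of [n]."""
    good = total = 0
    for perm in permutations(range(1, n + 1)):
        pi = dict(zip(range(1, n + 1), perm))  # pi[x] is the image of x
        total += 1
        if max(pi[a] for a in A) < min(pi[b] for b in B):
            good += 1
    return Fraction(good, total)

A, B, n = {1, 2}, {3, 4, 5}, 6
expected = Fraction(1, comb(len(A) + len(B), len(A)))
assert prob_A_before_B(A, B, n) == expected
print(prob_A_before_B(A, B, n))
```

Only the relative order of the images of A ∪ B matters, which is why the answer does not depend on n.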
Let us see an application of the Two Families Theorem.
An r-uniform hypergraph H = (V, E) consists of a set V (of vertices) and
a set $E \subseteq V^{(r)}$ (of edges). The complete r-uniform hypergraph on k vertices
is $K_k^{(r)} := ([k], [k]^{(r)})$. Isomorphism is defined just as you expect. If $H'$ is
isomorphic to H we will often say that $H'$ is a copy of H.
Let H be an r-uniform hypergraph. We say that an r-uniform hypergraph
G is H-saturated if G does not contain a copy of H, but if we add any edge
to G then the resulting hypergraph contains a copy of H.
For instance, in the case of graphs, if $H = K_3$ then Turán’s Theorem
tells us that the maximum number of edges in an H-saturated graph on n
vertices is $\lfloor n^2/4 \rfloor$; it is an exercise to show that the minimum number of
edges is n − 1.
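Both extremes for $K_3$-saturation are witnessed by simple graphs: a star has n − 1 edges and a balanced complete bipartite graph has $\lfloor n^2/4 \rfloor$ edges. The sketch below checks saturation directly from the definition; the function names and the choice n = 6 are ours.

```python
from itertools import combinations

def has_triangle(n, edges):
    """True if the graph on vertices 1..n with the given edges contains a triangle."""
    E = {frozenset(e) for e in edges}
    return any(all(frozenset(p) in E for p in combinations(t, 2))
               for t in combinations(range(1, n + 1), 3))

def is_triangle_saturated(n, edges):
    """Triangle-free, but adding any missing edge creates a triangle."""
    E = {frozenset(e) for e in edges}
    if has_triangle(n, E):
        return False
    non_edges = [frozenset(p) for p in combinations(range(1, n + 1), 2)
                 if frozenset(p) not in E]
    return all(has_triangle(n, E | {e}) for e in non_edges)

n = 6
star = [(1, v) for v in range(2, n + 1)]
bipartite = [(u, v) for u in range(1, n // 2 + 1) for v in range(n // 2 + 1, n + 1)]
assert is_triangle_saturated(n, star) and len(star) == n - 1
assert is_triangle_saturated(n, bipartite) and len(bipartite) == n * n // 4
print(len(star), len(bipartite))
```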
In general, it is very hard to determine the maximum number of edges in
a $K_k^{(r)}$-saturated hypergraph. Surprisingly, Bollobás determined the minimum
number precisely.
Theorem 21. Let G be an r-uniform hypergraph with vertex set [n], and
suppose that adding any edge to G creates a copy of $K_{r+s}^{(r)}$. Then G has at
least
\[ \binom{n}{r} - \binom{n-s}{r} \]
edges.
• Ai ∩ Bi = ∅ for each i;
The above bound is sharp: we can take the r-uniform hypergraph with
vertex set [n] and edges $\{F \in [n]^{(r)} : F \cap [s] \ne \emptyset\}$.
3.2 VC-dimension
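Let $\mathcal{F} \subseteq \mathcal{P}(X)$ and $S \subseteq X$. The trace of $\mathcal{F}$ on S is the set system
\[ \mathcal{F}|_S := \{A \cap S : A \in \mathcal{F}\}. \]
We set
\[ \mathrm{tr}_{\mathcal{F}}(S) = |\mathcal{F}|_S|, \]
i.e. the number of sets in the trace of $\mathcal{F}$ on S. We say that S is shattered by
$\mathcal{F}$ if $\mathcal{F}|_S = \mathcal{P}(S)$ (in other words, $\mathrm{tr}_{\mathcal{F}}(S) = 2^{|S|}$).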
The VC-dimension of F is max{|S| : S ⊆ X is shattered by F}. [VC
stands for Vapnik-Chervonenkis.]
by H (there is no way to obtain the subset {(1, 1), (3, 1)} by intersecting
{(1, 1), (2, 1), (3, 1)} with half-planes!). However, {(0, 1), (1, 0), (1, 2)} is
easily seen to be shattered by H. So the VC-dimension of H is at least 3. (Tricky
question: What is the VC-dimension of this system?)
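The Sauer-Shelah Theorem tells us that if a family $\mathcal{A} \subseteq \mathcal{P}(n)$ contains
more than $|[n]^{(\le d)}|$ sets, then its VC-dimension is greater than d:
Theorem 22. If $\mathcal{A} \subseteq \mathcal{P}(n)$ has VC-dimension at most d, then
\[ |\mathcal{A}| \le |[n]^{(\le d)}| = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}. \]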
We shall see a couple of proofs.
Proof 1 of the Sauer-Shelah Theorem. We argue by induction on n + d. Let
\[ f(n, d) = |[n]^{(\le d)}| = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}. \]
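If n = 0 or d = 0, the result is trivial. If n, d > 0, let
\[ \mathcal{B} = \{A \setminus n : A \in \mathcal{A}\} \subseteq \mathcal{P}(n - 1), \]
\[ \mathcal{C} = \{A \in \mathcal{A} : n \notin A,\ A \cup \{n\} \in \mathcal{A}\} \subseteq \mathcal{P}(n - 1). \]
Then $\mathcal{B}$ has VC-dimension at most d, while $\mathcal{C}$ has VC-dimension at most d − 1
(as if S is shattered by $\mathcal{C}$, then $S \cup \{n\}$ is shattered by $\mathcal{A}$; so $|S| \le d - 1$).
Hence, by induction, $|\mathcal{B}| \le f(n - 1, d)$ and $|\mathcal{C}| \le f(n - 1, d - 1)$ and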
Proof 2 of the Sauer-Shelah Theorem. Define the i-compression operator by
\[ \pi_i(A) = A \setminus \{i\} \]
and
\[ \pi_i(\mathcal{A}) = \{\pi_i(A) : A \in \mathcal{A}\} \cup \{A \in \mathcal{A} : \pi_i(A) \in \mathcal{A}\}. \]
Then $\pi_i$ does not increase the VC-dimension of a set system (exercise) and
$|\mathcal{A}| = |\pi_i(\mathcal{A})|$. Thus we can repeatedly apply the i-compression operator
until our family is i-compressed for all $i \in [n]$ (this terminates, as every
compression either leaves the family unchanged or decreases the quantity
$\sum_{A \in \mathcal{A}} |A|$).
So consider B, the i-compressed family obtained from A. If B contains
any set B of size at least d + 1 then B contains all subsets of B, and so has
VC-dimension at least d + 1. Otherwise,
\[ |\mathcal{A}| = |\mathcal{B}| \le |[n]^{(\le d)}| = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}. \]
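The definitions of trace and shattering, and the bound $|[n]^{(\le d)}|$, can be explored computationally for small ground sets. The following sketch uses our own function names and two example families; it assumes a nonempty family, and it searches all subsets S, so it is exponential in n.

```python
from itertools import combinations
from math import comb

def trace(family, S):
    """Trace of the family on S: the distinct intersections A ∩ S."""
    return {A & S for A in family}

def vc_dimension(family, n):
    """Largest |S|, S ⊆ [n], such that the trace on S is all of P(S)."""
    best = 0
    for k in range(1, n + 1):
        for S in combinations(range(1, n + 1), k):
            if len(trace(family, frozenset(S))) == 2 ** k:
                best = k
                break
    return best

def sauer_bound(n, d):
    """Number of subsets of [n] of size at most d."""
    return sum(comb(n, i) for i in range(d + 1))

n = 5
# All sets of size at most 2, and all intervals {a, ..., b} together with ∅.
low_layers = [frozenset(c) for k in range(3) for c in combinations(range(1, n + 1), k)]
intervals = [frozenset()] + [frozenset(range(a, b + 1))
                             for a in range(1, n + 1) for b in range(a, n + 1)]
for family in (low_layers, intervals):
    d = vc_dimension(family, n)
    assert len(family) <= sauer_bound(n, d)
    print(len(family), d, sauer_bound(n, d))
```

Both example families have VC-dimension 2 and exactly 16 sets, so they show that the bound is attained by quite different systems.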
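\[ \begin{aligned}
|\mathcal{A} \cap \mathcal{B}| &= |\mathcal{A}^+ \cap \mathcal{B}^+| + |\mathcal{A}^- \cap \mathcal{B}^-| \\
&\ge \frac{|\mathcal{A}^+||\mathcal{B}^+|}{2^{n-1}} + \frac{|\mathcal{A}^-||\mathcal{B}^-|}{2^{n-1}} \\
&= \frac{1}{2^n}(|\mathcal{A}^+| + |\mathcal{A}^-|)(|\mathcal{B}^+| + |\mathcal{B}^-|) + \frac{1}{2^n}(|\mathcal{A}^+| - |\mathcal{A}^-|)(|\mathcal{B}^+| - |\mathcal{B}^-|) \\
&\ge \frac{|\mathcal{A}||\mathcal{B}|}{2^n},
\end{aligned} \]
since $(|\mathcal{A}^+| - |\mathcal{A}^-|) \le 0$ and $(|\mathcal{B}^+| - |\mathcal{B}^-|) \le 0$.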
Some authors call the above Theorem “Harris’ Lemma” or “Harris-Kleitman
Lemma”.
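We claim that the vectors in $\{\chi_A : A \in \mathcal{A}\}$ are linearly independent. Then
$|\mathcal{A}| = |\{\chi_A : A \in \mathcal{A}\}| \le \dim(\mathbb{R}^n) = n$.
So suppose that $\sum_{A \in \mathcal{A}} \lambda_A \chi_A = 0$. For any $B \in \mathcal{A}$, since $\langle \chi_A, \chi_B \rangle = |A \cap B| = k$
for $A \ne B$ and $\langle \chi_B, \chi_B \rangle = |B|$, we have
\[ 0 = \Big\langle \sum_{A \in \mathcal{A}} \lambda_A \chi_A, \chi_B \Big\rangle = \lambda_B |B| + \sum_{A \ne B} \lambda_A k = \lambda_B (|B| - k) + \Lambda k, \]
where $\Lambda = \sum_{A \in \mathcal{A}} \lambda_A$. Thus (noting that $|B| > k$)
\[ \lambda_B = -\frac{k\Lambda}{|B| - k}. \]
If $\Lambda = 0$, then $\lambda_B = 0$ for all $B \in \mathcal{A}$. If $\Lambda \ne 0$, then $\Lambda$ and $\lambda_B$ have
opposite sign. But this holds for any $B \in \mathcal{A}$, which is impossible since
$\Lambda = \sum_{B \in \mathcal{A}} \lambda_B$. We conclude that the vectors $\chi_A$ are linearly independent, as
required.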
We continue with modular intersection theorems, where the allowed sizes
of intersections are specified modulo p. Here is our first example.
Theorem 26. (Oddtown Theorem) Let A ⊆ P(n) be a family such that
• |A| is odd for all A ∈ A
• |A ∩ B| is even for all distinct A, B ∈ A.
Then |A| ≤ n.
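First proof of the Oddtown Theorem. We work over the field with two
elements $\mathbb{F}_2$. We identify each element of $\mathcal{A}$ with its characteristic vector in $\mathbb{F}_2^n$.
Then, for all $A, B \in \mathcal{A}$, we have
\[ \langle \chi_A, \chi_B \rangle = |A \cap B| = \begin{cases} 0 & \text{if } A \ne B \\ 1 & \text{if } A = B. \end{cases} \]
We claim that $\{\chi_A : A \in \mathcal{A}\}$ is linearly independent in $\mathbb{F}_2^n$. If $\sum_{A \in \mathcal{A}} \lambda_A \chi_A = 0$,
then, for all $B \in \mathcal{A}$, we have
\[ 0 = \Big\langle \sum_{A \in \mathcal{A}} \lambda_A \chi_A, \chi_B \Big\rangle = \lambda_B. \]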
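The linear independence over the field with two elements can be checked numerically for a concrete family. In the sketch below, sets are encoded as integer bitmasks and the rank is computed by Gaussian elimination modulo 2; the function names and the example family on [6] are our own choices.

```python
def gf2_rank(rows):
    """Rank over the field with two elements of vectors encoded as integer bitmasks."""
    rank = 0
    rows = list(rows)
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row.
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def mask(A):
    """Bitmask of the characteristic vector of A ⊆ {1, 2, ...}."""
    return sum(1 << (i - 1) for i in A)

def is_oddtown(family):
    """Every set has odd size and every two distinct sets meet in an even number of points."""
    sets = list(family)
    odd_sizes = all(len(A) % 2 == 1 for A in sets)
    even_meets = all(len(A & B) % 2 == 0
                     for t, A in enumerate(sets) for B in sets[t + 1:])
    return odd_sizes and even_meets

n = 6
family = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 2, 6}, {1, 3, 4, 5, 6}]
assert is_oddtown(family)
assert gf2_rank(mask(A) for A in family) == len(family) <= n
print(gf2_rank(mask(A) for A in family))
```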
Proof 2 of the Oddtown Theorem. We work over the field with two elements
$\mathbb{F}_2$. Let $\mathcal{A} = \{A_1, \ldots, A_m\}$.
Let $M = (m_{ij})$ be the m × n incidence matrix where
\[ m_{ij} = \begin{cases} 1 & j \in A_i \\ 0 & j \notin A_i. \end{cases} \]
• $|A| \notin S \bmod p$, for all $A \in \mathcal{A}$;
Then
\[ |\mathcal{A}| \le \sum_{i=0}^{|S|} \binom{n}{i}. \]
Proof. We work over the field $\mathbb{F}_p$ with p elements (p prime), and
introduce variables $x = (x_1, \ldots, x_n)$. For each $A \in \mathcal{A}$, we define the polynomial
\[ f_A(x) = \prod_{s \in S} \Big(\sum_{i \in A} x_i - s\Big). \]
while if A = B we have
\[ f_A(\chi_B) = \prod_{s \in S} (|A| - s) \ne 0. \]
while if B = A we have
\[ \tilde{f}_A(\chi_A) = \prod_{s \in S} (|A| - s) = \alpha_A \ne 0. \]
thus $\lambda_B = 0$. The $\tilde{f}_A$ are therefore linearly independent, and lie in the space
of multilinear polynomials of degree at most $|S|$. This has dimension
\[ \sum_{i=0}^{|S|} \binom{n}{i} \]
(it is spanned by the monomials $\{\prod_{i \in A} x_i : |A| \le |S|\}$). Thus $|\mathcal{A}| \le \sum_{i=0}^{|S|} \binom{n}{i}$.
What if we drop the modular constraint?
For a set S ⊆ N, we say that a family A is S-intersecting if |A ∩ B| ∈ S
for all distinct A, B ∈ A.
Theorem 28 (Frankl-Wilson Theorem). Let $\mathcal{A} \subseteq \mathcal{P}(n)$ be S-intersecting.
Then
\[ |\mathcal{A}| \le \sum_{i=0}^{|S|} \binom{n}{i}. \]
We won’t prove the full version of F-W here: instead we prove it under
the additional assumption that $|A| \notin S$ for all $A \in \mathcal{A}$.
Proof. This is an easy deduction from the Modular Frankl-Wilson Theorem:
just apply the modular version of the theorem with p > n.
The next result is a uniform version of the Modular Frankl-Wilson The-
orem.
So for $A, B \in \mathcal{A}$ we have
\[ f_A(\chi_B) = \prod_{s \in S} (|A \cap B| - s) \begin{cases} = 0 & A \ne B \\ \ne 0 & A = B. \end{cases} \]
Finally, define
\[ q(x) = \sum_{i=1}^{n} x_i - k, \]
so $q(\chi_B) = 0$ for $B \in \mathcal{A}$, and $q(\chi_B) \ne 0$ if $|B| < k$.
Define the collection
\[ G : \{f_A : A \in \mathcal{A}\} \cup \{q \cdot p_B : |B| \le |S| - 1\}, \]
and let $\tilde{G}$ be the multilinearized collection
\[ \tilde{G} : \{\tilde{f}_A : A \in \mathcal{A}\} \cup \{\widetilde{q p_B} : |B| \le |S| - 1\}. \]
We claim that the collection $\tilde{G}$ is linearly independent. If this is true
then note that all the polynomials in $\tilde{G}$ are multilinear and have degree at
most $|S|$. Thus they lie in a space of dimension $\sum_{i=0}^{|S|} \binom{n}{i}$. However, the set
$\{\widetilde{q p_B} : |B| \le |S| - 1\}$ has size $\sum_{i=0}^{|S|-1} \binom{n}{i}$. So the set $\mathcal{A}$ has size at most $\binom{n}{|S|}$.
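All that remains is to prove our claim. Suppose that
\[ \sum_{A \in \mathcal{A}} \lambda_A \tilde{f}_A + \sum_{|B| \le |S| - 1} \mu_B \widetilde{q p_B} = 0. \]
For $C \in \mathcal{A}$, we have
\[ \widetilde{q p_B}(\chi_C) = q p_B(\chi_C) = 0, \]
so for all $C \in \mathcal{A}$ we have
\[ 0 = \sum_{A \in \mathcal{A}} \lambda_A \tilde{f}_A(\chi_C) + \sum_{|B| \le |S| - 1} \mu_B \widetilde{q p_B}(\chi_C) = \lambda_C \tilde{f}_C(\chi_C), \]
and hence $\lambda_C = 0$, as $\tilde{f}_C(\chi_C) \ne 0$.
Now if not all $\mu_C$ are 0 then let C be of minimal size with $\mu_C \ne 0$. Then
\[ \mu_B \widetilde{q p_B}(\chi_C) \begin{cases} = 0 & B \subseteq C,\ B \ne C \text{ (as } \mu_B = 0\text{)} \\ = 0 & B \not\subseteq C \text{ (as } p_B(\chi_C) = 0\text{)} \\ \ne 0 & B = C. \end{cases} \]
So
\[ 0 = \sum_{|B| \le |S| - 1} \mu_B \widetilde{q p_B}(\chi_C) = \mu_C \widetilde{q p_C}(\chi_C) \ne 0, \]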
3.5 Borsuk’s Conjecture
In 1933, Borsuk conjectured that if K ⊆ Rd has diameter 1 then it can be
partitioned into d + 1 sets of diameter less than 1. (It’s easy to see that there
are sets for which d + 1 is necessary: consider the regular simplex.)
Borsuk’s conjecture remained open until 1993, when Kahn and Kalai
showed that it is false.
Theorem 30. Let k(d) be the smallest integer k such that every subset of
$\mathbb{R}^d$ of diameter 1 can be partitioned into k sets of smaller diameter. Then
there exists c > 1 such that
\[ k(d) \ge c^{\sqrt{d}} \]
for infinitely many d.
In fact, Kahn and Kalai proved that (with a little more work) the bound
holds for all d. Surprisingly, their result uses the Modular Frankl-Wilson
Theorem.
We will need a preliminary lemma.
|A ∩ B| 6≡ 0 mod p.
|A| ≡ 0 mod p.
4p
4p
since 2 i
≥ i−1
, for all i ≤ p − 1. The result follows.
Proof of the Kahn-Kalai Theorem. Let $d = \binom{4p}{2}$, where p is a prime. We
shall construct a set $K \subseteq \mathbb{R}^d$. In fact, let
\[ W = [4p]^{(2)} = E(K_{4p}), \]
so we identify $[4p]^{(2)}$ with the edges of the complete graph with vertex set
[4p]. We will work in $\mathbb{R}^W$. Note that
in other words $E_A$ is the edge set of the complete bipartite graph with vertex
classes A and $A^c$. We set
\[ \mathcal{F} = \{E_A : A \in [4p]^{(2p)}\}. \]
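Now suppose $L \subseteq K$ satisfies diam(L) < diam(K). In the set $\mathcal{A} \subseteq \mathcal{F}$
corresponding to L there can be no pair A, B with $|A \cap B| = p$. So by the
lemma, $|\mathcal{A}| \le \binom{4p}{p-1}$, and so
\[ |\mathcal{F}|/|\mathcal{A}| \ge \frac{1}{2} \binom{4p}{2p} \Big/ \binom{4p}{p-1}. \]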
Chapter 4
Combinatorial Nullstellensatz
where deg(q) = t − 1, and q has a monomial $x_1^{t_1 - 1} x_2^{t_2} \cdots x_n^{t_n}$ with nonzero
coefficient, and $r(x) \in \mathbb{F}[x_2, \ldots, x_n]$.
For any $(s_2, \ldots, s_n) \in S_2 \times \cdots \times S_n$ we have
\[ f(s_1, \ldots, s_n) = r(s_2, \ldots, s_n) \]
Let us see some applications.
How many hyperplanes do we need to cover all vertices of a cube in $\mathbb{R}^n$?
This is trivial: two hyperplanes will do. But what if we want to cover all but
one vertex, and leave the last vertex uncovered? It is easy to show that n
hyperplanes suffice. It turns out that this is the minimum!
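Two natural families of n hyperplanes that cover every vertex of the cube $\{0, 1\}^n$ except the origin are the coordinate hyperplanes $x_i = 1$ and the hyperplanes $x_1 + \cdots + x_n = k$ for $k = 1, \ldots, n$. The sketch below checks both for n = 4; the function name and the encoding of a hyperplane as a pair (a, b), meaning $\langle x, a \rangle = b$, are our own conventions.

```python
from itertools import product

def covers_all_but_origin(n, hyperplanes):
    """Each hyperplane is a pair (a, b) standing for the equation <x, a> = b."""
    def on_some_plane(x):
        return any(sum(xi * ai for xi, ai in zip(x, a)) == b for a, b in hyperplanes)
    origin = (0,) * n
    return (not on_some_plane(origin)
            and all(on_some_plane(x)
                    for x in product((0, 1), repeat=n) if x != origin))

n = 4
# The n hyperplanes x_i = 1.
coordinate_planes = [(tuple(int(i == j) for j in range(n)), 1) for i in range(n)]
# The n hyperplanes x_1 + ... + x_n = k, for k = 1, ..., n.
sum_planes = [((1,) * n, k) for k in range(1, n + 1)]

assert covers_all_but_origin(n, coordinate_planes)
assert covers_all_but_origin(n, sum_planes)
# Dropping one coordinate hyperplane leaves a unit vector uncovered.
assert not covers_all_but_origin(n, coordinate_planes[:-1])
```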
Proof. We may assume the uncovered vertex is 0. Work over the reals. Each
hyperplane $H_i$ is defined by an equation of form
\[ \langle x, a_i \rangle = b_i \]
Non-examinable applications
Given sets A, B in an abelian group, we write
A + B = {a + b : a ∈ A, b ∈ B}.
Theorem 34 (Cauchy-Davenport). If p is a prime and $A, B \subseteq \mathbb{Z}_p$ are nonempty then
\[ |A + B| \ge \min\{p, |A| + |B| - 1\}. \]
Proof. Work over $\mathbb{Z}_p$. Suppose first that $|A| + |B| > p$. Then for every $c \in \mathbb{Z}_p$,
the sets A, c − B must have a common element, and so there is a solution to
x + y = c with $x \in A$, $y \in B$.
Now suppose $|A| + |B| \le p$ and $|A + B| \le |A| + |B| - 2$. Pick $C \supseteq A + B$
with $|C| = |A| + |B| - 2$, and set
\[ f(x, y) = \prod_{c \in C} (x + y - c). \]
This has degree $|A| + |B| - 2$, and the coefficient of $x^{|A|-1} y^{|B|-1}$ is
\[ \binom{|A| + |B| - 2}{|A| - 1}, \]
which is nonzero in $\mathbb{Z}_p$ (as $|A| + |B| - 2 < p$).
But now, applying the Combinatorial Nullstellensatz with $S_x = A$ and
$S_y = B$, we see that there must be $(a, b) \in A \times B$ with $f(a, b) \ne 0$, which
implies that $a + b \notin C$, a contradiction.
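For a small prime p the Cauchy-Davenport bound, in its standard form $|A + B| \ge \min\{p, |A| + |B| - 1\}$ for nonempty A and B, can be verified over all pairs of subsets. The sketch below uses our own function names; it also shows that the bound can fail for a composite modulus.

```python
from itertools import combinations

def sumset(A, B, p):
    """The sumset A + B in the integers modulo p."""
    return {(a + b) % p for a in A for b in B}

def nonempty_subsets(p):
    """All nonempty subsets of {0, ..., p - 1}."""
    return [set(c) for k in range(1, p + 1) for c in combinations(range(p), k)]

def cauchy_davenport_holds(p):
    """Check |A + B| >= min(p, |A| + |B| - 1) for all nonempty A, B modulo p."""
    subsets = nonempty_subsets(p)
    return all(len(sumset(A, B, p)) >= min(p, len(A) + len(B) - 1)
               for A in subsets for B in subsets)

print(cauchy_davenport_holds(5))  # prime modulus
print(cauchy_davenport_holds(7))  # prime modulus
print(cauchy_davenport_holds(4))  # composite: A = B = {0, 2} is a subgroup
```

The failure modulo 4 comes from the subgroup {0, 2}, whose sumset with itself has only two elements; no such subgroups exist when the modulus is prime.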
Let’s prove a variant of Cauchy-Davenport. For sets A, B define
\[ A \hat{+} B = \{a + b : a \in A,\ b \in B,\ a \ne b\}. \]
Note that if $A = B = \{0, \ldots, a - 1\}$ then $A \hat{+} B = \{1, \ldots, 2a - 3\}$, so
$|A \hat{+} B| = |A| + |B| - 3$.
Theorem 35. Let p be prime and $A, B \subseteq \mathbb{F}_p$ be nonempty. Then
\[ |A \hat{+} B| \ge \min\{p, |A| + |B| - 3\}. \]
Proof. We will prove the stronger result that $|A \hat{+} B| \ge \min\{p, |A| + |B| - 3\}$,
and if $|A| \ne |B|$ then $|A \hat{+} B| \ge \min\{p, |A| + |B| - 2\}$.
We work over Fp . Note first that if |A| + |B| ≥ p + 2 then for any c ∈ Zp
the set A ∩ (c − B) has size at least 2. So there are two pairs (a, b) ∈ A × B
with a + b = c, and one of these must have a 6= b.
Now if |A| = 1 or |B| = 1 the result is immediate. Also, if |A| = |B| then
we may delete any element from A and use the stronger result. So we may
assume that |A| + |B| ≤ p + 1 and |A| 6= |B|.
Choose C such that $|C| = |A| + |B| - 3$ and $A \hat{+} B \subseteq C$. Define
\[ P(x, y) = (x - y) \prod_{c \in C} (x + y - c). \]
Then P has degree $|A| + |B| - 2$ and vanishes on $A \times B$. On the other hand,
the coefficient of $x^{|A|-1} y^{|B|-1}$ is
\[ \binom{|A| + |B| - 3}{|A| - 2} - \binom{|A| + |B| - 3}{|A| - 1} = \big((|A| - 1) - (|B| - 1)\big) \frac{(|A| + |B| - 3)!}{(|A| - 1)!\,(|B| - 1)!} \]
Compression operators play a pivotal role in extremal combinatorics by simplifying and transforming set families into configurations with desirable extremal properties. In the Kruskal-Katona Theorem, compression operators serve to iteratively modify a family F into a state that approaches an initial segment of the colex order, which is known to minimize shadows. This iterative process is fundamental for the proof, as it allows changes that preserve cardinality while progressively optimizing the shadow size, thus revealing the extremal properties of the family. The strategic use of these operators exemplifies their importance in extracting simplified, extremally optimal structures in complicated combinatorial systems .
The Kruskal-Katona Theorem addresses the challenge of determining the minimal possible shadow size (i.e., lower shadow) for a given family of sets, specifically for k-uniform families. Finding the minimal shadow size is significant as it gives insights into the structural properties of hypergraphs. The theorem asserts that the initial segment of sets ordered by colexicographic order minimizes the shadow size, which aids in understanding extremal configurations. Compression operators facilitate the proof by iteratively "compressing" the family F such that the set system becomes progressively closer to an initial segment of colex order. Compression maintains the family size while reducing the shadow size until it cannot be reduced further, effectively yielding an initial segment in colex order with minimal shadow properties .
The Erdős–Littlewood–Offord problem asks how many subset sums of a given sequence of real numbers can fall in a short interval (for instance, how many can be close to zero) when each number in the sequence has absolute value at least one. The problem is significant because it bounds how strongly such sums can concentrate around a point or in a small interval, which in turn controls the distribution of sums of independent random signs. Erdős's solution gives a sharp bound: for real numbers \(x_1, \dots, x_n\) with \(|x_i| \geq 1\), the number of subsets whose sums lie in any half-open interval of length 1 is at most \(\binom{n}{\lfloor n/2 \rfloor}\). The proof reduces the question to Sperner's Lemma: after flipping signs so that every \(x_i\) is positive, the subsets with sums in such an interval form an antichain.
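A small numerical sketch of the bound, counting subsets (not distinct values) whose sums fall in a window [a, a + 1). The function name and the sample sequences are hypothetical. The examples use values that are exactly representable as floats to avoid rounding issues.

```python
from itertools import combinations
from math import comb

def max_sums_in_unit_interval(xs):
    """Largest number of subsets of xs whose sums lie in some interval [a, a + 1)."""
    n = len(xs)
    # combinations works on positions, so repeated values give distinct subsets.
    sums = sorted(sum(c) for r in range(n + 1) for c in combinations(xs, r))
    best = 0
    for i, a in enumerate(sums):          # an optimal window can start at a sum
        count = sum(1 for s in sums[i:] if s < a + 1)
        best = max(best, count)
    return best

print(max_sums_in_unit_interval([1, 1, 1, 1]), comb(4, 2))      # prints 6 6
print(max_sums_in_unit_interval([1, -2, 3, 1.5]) <= comb(4, 2)) # prints True
```

The first example shows that the bound is attained when all the numbers are equal to 1, since the sums equal to 2 come from the middle layer.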
The LYM Inequality gives a more detailed analysis of antichains than Sperner's Lemma: it bounds not just the size of an antichain but a weighted count of its members layer by layer. Specifically, if F is an antichain in the power set P(n), then \( \sum_{i=0}^{n} \frac{|F \cap [n]^{(i)}|}{\binom{n}{i}} \leq 1 \). Since every \(\binom{n}{i}\) is at most \(\binom{n}{\lfloor n/2 \rfloor}\), this implies Sperner's bound \(|F| \leq \binom{n}{\lfloor n/2 \rfloor}\). Furthermore, equality holds in the LYM Inequality if and only if F is exactly \([n]^{(i)}\) for some i, which identifies the structure of the extremal antichains and not just their size.
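As an illustration, the LYM sum can be computed exactly with rational arithmetic. The helper names and the example antichain in P(4) below are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def is_antichain(family):
    """True if no member is a proper subset of another (A < B tests proper subset)."""
    return all(not (A < B or B < A) for A in family for B in family)

def lym_sum(family, n):
    """Sum over A in the family of 1 / C(n, |A|)."""
    return sum(Fraction(1, comb(n, len(A))) for A in family)

n = 4
F = [frozenset(s) for s in ({1}, {2, 3}, {2, 4}, {3, 4})]       # spans two layers
middle = [frozenset(c) for c in combinations(range(1, n + 1), 2)]

print(is_antichain(F), lym_sum(F, n))            # prints True 3/4
print(is_antichain(middle), lym_sum(middle, n))  # prints True 1
```

The first family is an antichain with LYM sum 1/4 + 3/6 = 3/4, strictly below 1, while a full layer attains equality.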
Comparability determines which elements of a poset can lie in the same chain. Two elements \(a\) and \(b\) of a poset P are comparable if either \(a \leq b\) or \(b \leq a\). A chain is a set of pairwise comparable elements, so the order restricted to a chain is a total order. Conversely, an antichain is a set in which no two distinct elements are related by \(\leq\). Chains are thus the linearly ordered subsets, while the size of the largest antichain is the width of the poset, which governs its covering properties as described by Dilworth's Theorem.
Dilworth's Theorem establishes a direct relationship between the size of the largest antichain in a poset and the minimum number of chains needed to cover the poset. The theorem states that the minimum number of chains required to cover a finite poset (P, ≤) equals the maximum size of an antichain within the poset. One direction is immediate: a chain meets an antichain in at most one element, so any cover by chains needs at least as many chains as the largest antichain has elements. The substance of the theorem is the converse, that this many chains always suffice.
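The equality can be checked on a small example. The sketch below computes the largest antichain by brute force and the minimum chain cover through a standard reduction to bipartite matching (the cover size is |P| minus a maximum matching on the pairs a < b). This reduction is swapped in for convenience and is not taken from the notes; it assumes `less` is a strict partial order, hence transitive. The divisibility poset on {1, ..., 12} and all names are hypothetical choices.

```python
from itertools import combinations

def max_antichain_size(elements, less):
    """Brute force: size of the largest subset with no two comparable elements."""
    for r in range(len(elements), 0, -1):
        for S in combinations(elements, r):
            if all(not less(a, b) and not less(b, a) for a, b in combinations(S, 2)):
                return r
    return 0

def min_chain_cover_size(elements, less):
    """|P| minus a maximum matching in the bipartite graph {(a, b) : a < b}."""
    match = {}                      # match[b] = a: b directly follows a in its chain

    def augment(a, seen):
        # Try to give a a successor, re-routing earlier choices if necessary.
        for b in elements:
            if less(a, b) and b not in seen:
                seen.add(b)
                if b not in match or augment(match[b], seen):
                    match[b] = a
                    return True
        return False

    matched = sum(augment(a, set()) for a in elements)
    return len(elements) - matched

elements = list(range(1, 13))
divides = lambda a, b: a != b and b % a == 0     # strict divisibility order
print(max_antichain_size(elements, divides), min_chain_cover_size(elements, divides))
# prints 6 6
```

Here {7, 8, ..., 12} is an antichain of size 6, and grouping the numbers by their odd part gives a cover by 6 chains.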
Lexicographic order and colexicographic order differ in which elements they compare first. Lexicographic order, akin to dictionary order, looks at the smallest elements first: for r-sets \(A = \{a_1 < \dots < a_r\}\) and \(B = \{b_1 < \dots < b_r\}\), \(A <_{lex} B\) if \(a_i < b_i\) at the first index \(i\) where the two sets differ, or equivalently if the smallest element of the symmetric difference \(A \Delta B\) lies in \(A\). Colexicographic order looks at the largest elements first: \(A <_{colex} B\) if the largest element of \(A \Delta B\) lies in \(B\). This distinction is crucial in the Kruskal-Katona Theorem, which asserts that initial segments of colex have the smallest shadows among families of the same size.
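To make the difference concrete, the following sketch lists the 2-subsets of {1, 2, 3, 4} in both orders. The key functions are illustrative helpers: sorting on the ascending element list gives lex, and sorting on the descending element list gives colex.

```python
from itertools import combinations

def lex_key(A):
    return sorted(A)                   # compare smallest elements first

def colex_key(A):
    return sorted(A, reverse=True)     # compare largest elements first

def colex_less(A, B):
    """Definition in terms of the symmetric difference: max(A xor B) lies in B."""
    return A != B and max(A ^ B) in B

two_sets = [frozenset(c) for c in combinations(range(1, 5), 2)]
lex = [sorted(A) for A in sorted(two_sets, key=lex_key)]
colex = [sorted(A) for A in sorted(two_sets, key=colex_key)]
print(lex)    # [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
print(colex)  # [[1, 2], [1, 3], [2, 3], [1, 4], [2, 4], [3, 4]]
```

Note that every initial segment of colex that fills up all 2-subsets of {1, ..., m} is reached before any set containing m + 1 appears, which is not true of lex.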
Sperner's Lemma states that the maximum size of an antichain in the power set \(P(n)\) is the largest binomial coefficient \(\binom{n}{\lfloor n/2 \rfloor}\). The middle layer \([n]^{(\lfloor n/2 \rfloor)}\) is an antichain of this size, since distinct sets of the same size are incomparable. An antichain need not lie in a single layer, so the upper bound requires an argument: \(P(n)\) can be partitioned into \(\binom{n}{\lfloor n/2 \rfloor}\) chains, and an antichain contains at most one element from each chain. Hence no antichain can have more elements than the middle layer.
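For small n the bound can be confirmed by exhaustive search. The sketch below grows antichains one set at a time and records the largest one found; the function name is hypothetical and the search is only practical for n up to about 5.

```python
from itertools import combinations
from math import comb

def largest_antichain(n):
    """Size of the largest antichain in P(n), by exhaustive search."""
    subsets = [frozenset(c) for r in range(n + 1)
               for c in combinations(range(1, n + 1), r)]

    def extend(chosen, start):
        # chosen is an antichain; try to add each later subset incomparable to it.
        best = len(chosen)
        for idx in range(start, len(subsets)):
            S = subsets[idx]
            if all(not (S <= T or T <= S) for T in chosen):
                best = max(best, extend(chosen + [S], idx + 1))
        return best

    return extend([], 0)

for n in range(1, 6):
    print(n, largest_antichain(n), comb(n, n // 2))
```

For each n the last two printed numbers should agree, for example 6 and 6 when n = 4.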
Covering a poset with chains is significant because it decomposes a partially ordered set into simpler, well-understood pieces: chains, in which every two elements are comparable. By Dilworth's Theorem, the minimum number of chains in such a cover equals the width of the poset, that is, the size of its largest antichain. The width therefore measures how far the poset is from being a linear order, and a minimal chain cover breaks the poset into that many linear orders which together contain every element.
In combinatorics, the shadow (or lower shadow) of a family F of k-sets is the family of all (k−1)-sets that can be obtained by removing a single element from some set in F. Shadows are central to the Kruskal-Katona Theorem because they describe how a family in one layer projects onto the layer below. The theorem identifies the families with the smallest shadows: among k-uniform families of a given size, initial segments of colex order have minimal shadow. This result underlies many bounds in extremal set theory and in the study of hypergraphs.
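The definition translates directly into code. A minimal sketch, with sets represented as frozensets and a hypothetical example family:

```python
def shadow(family):
    """Lower shadow: every set obtained by deleting one element from a member."""
    return {A - {x} for A in family for x in A}

F = {frozenset({1, 2, 3}), frozenset({1, 2, 4})}
print(sorted(sorted(A) for A in shadow(F)))
# prints [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4]]
```

The two 3-sets each contribute three pairs, and the pair {1, 2} is shared, so the shadow has five elements rather than six.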