analytic-number-theory-notes

These notes document a course on Analytic Number Theory taught by Kannan Soundararajan at Stanford in Fall 2017, focusing on advanced topics such as Vinogradov's three prime theorem and the Hardy-Littlewood circle method. The notes include mathematical proofs, heuristics, and exercises aimed at understanding the distribution of prime numbers and their representations. The author acknowledges that the notes may contain errors and encourages feedback from readers.

ANALYTIC NUMBER THEORY NOTES

AARON LANDESMAN

1. INTRODUCTION
Kannan Soundararajan taught a course (Math 249A) on Analytic
Number Theory at Stanford in Fall 2017.
These are my “live-TeXed“ notes from the course. Conventions
are as follows: Each lecture gets its own “chapter,” and appears in
the table of contents with the date.
Of course, these notes are not a faithful representation of the course,
either in the mathematics itself or in the quotes, jokes, and philo-
sophical musings; in particular, the errors are my fault. By the same
token, any virtues in the notes are to be credited to the lecturer and
not the scribe. 1 Please email suggestions to [email protected]

1 This introduction has been adapted from Akhil Mathew’s introduction to his
notes, with his permission.

2. 9/26/17
2.1. Overview. This will be somewhat of an introductory course in
analytic methods, but more like a second introduction. We’ll assume
familiarity with the prime number theorem, connecting contribu-
tions of primes from zeros of the zeta function. You might look at the
first half of Davenport’s book or so as a prerequisite. We’ll assume
the students know how to prove there are infinitely many primes in
arithmetic progressions.
To get started, we’ll do the first four or five lectures proving Vino-
gradov’s three prime theorem:
Theorem 2.1 (Vinogradov). Every large odd number is the sum of three
primes.
When we say “large,” one can actually compute the bound explic-
itly (i.e., it is effective).
Remark 2.2. Helfgott, a few years back, made the bound accessible
so that one could compute exactly which odd numbers are not
expressible as the sum of three primes. He showed that every odd
number greater than 5 can be written as the sum of three primes.
To start the proof, write N = p1 + p2 + p3 , and we’ll count the
number of ways to write N as such. In fact, we’ll consider

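\[ \sum_{N = n_1 + n_2 + n_3} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3), \]
where
\[ \Lambda(n) := \begin{cases} \log p & \text{if } n = p^k, \\ 0 & \text{else.} \end{cases} \]
If we define
\[ \Psi(x) := \sum_{n \le x} \Lambda(n), \]
then the prime number theorem says
\[ \Psi(x) = x + O\left(x e^{-c\sqrt{\log x}}\right). \]
This is equivalent to saying
\[ \pi(x) = \mathrm{li}(x) + O\left(x e^{-c\sqrt{\log x}}\right). \]
Here
\[ \mathrm{li}(x) = \int_2^x \frac{dt}{\log t}, \]
and li(x) is about $x/\log x$.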

2.2. Heuristic of proof. A first guess is that there are about $\pi(N)$
choices for each of $p_1, p_2, p_3$. Their sum must add up to a given
number N. The chance that $p_1 + p_2 + p_3$ is exactly N is roughly $1/N$.
Hence, the number of such ways is approximately
\[ \left(\frac{N}{\log N}\right)^3 \cdot \frac{1}{N} = \frac{N^2}{(\log N)^3}. \]
We can also estimate
\[ R_3(N) := \sum_{N = n_1 + n_2 + n_3} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3) \sim N^3/N = N^2 + O\left(N^{1+1/2}\log N\right), \]
where the error comes from contributions of powers of primes.
2.3. Hardy and Littlewood’s circle method. Let
\[ S(\alpha) := \sum_{n \le N} \Lambda(n) e(n\alpha), \]
where
\[ e(x) = e^{2\pi i x}. \]
Then,
\[ \int_0^1 S(\alpha)^3 e(-N\alpha)\,d\alpha = \sum_{n_1, n_2, n_3 \le N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3) \int_0^1 e\left((n_1 + n_2 + n_3 - N)\alpha\right) d\alpha = R_3(N). \]
To estimate this, note that $S(0) = \Psi(N) \sim N$. We’d like to show the
integral has size about $N^2$.
Also, S(1/2) is pretty big because
\[ S(1/2) = (\Lambda(2) + \Lambda(4) + \cdots) - \sum_{n \le N,\ n \text{ odd}} \Lambda(n), \]
because $e(n/2) = -1$ if n is odd.
Then, for all
\[ |\alpha| \le \frac{10^{-6}}{N}, \]
we have $\mathrm{Re}\, S(\alpha) > 0.99N$. Then,
\[ \int_{|\alpha| \le 10^{-6}/N} S(\alpha)^3 e(-N\alpha)\,d\alpha \gtrsim 10^{-6} N^2. \]
We could similarly make an argument in a small neighborhood of
1/2. This gives an analytic reason that the number of representations
might be on the scale of $N^2$. So there are portions of the integral
which give the correct answer.
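
As a sanity check on the orthogonality identity $R_3(N) = \int_0^1 S(\alpha)^3 e(-N\alpha)\,d\alpha$, here is a minimal Python sketch that is not part of the lecture. It compares the direct triple sum with the integral, where the integral is computed exactly (up to rounding) by averaging over the grid $\alpha = j/Q$ with $Q = 2N + 1$. The function names and the choice $N = 41$ are illustrative assumptions.

```python
import cmath
import math


def von_mangoldt_table(limit):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    return lam


def r3_direct(N, lam):
    """Sum Lambda(n1) Lambda(n2) Lambda(n3) over n1 + n2 + n3 = N, all n_i >= 1."""
    total = 0.0
    for n1 in range(1, N - 1):
        for n2 in range(1, N - n1):
            total += lam[n1] * lam[n2] * lam[N - n1 - n2]
    return total


def r3_circle(N, lam):
    """Evaluate the integral of S(alpha)^3 e(-N alpha) by averaging over alpha = j/Q."""
    # n1 + n2 + n3 - N lies in [3 - N, 2N], so with Q = 2N + 1 only the
    # frequency 0 is divisible by Q and the grid average equals the integral.
    Q = 2 * N + 1
    total = 0j
    for j in range(Q):
        S = sum(lam[n] * cmath.exp(2j * math.pi * ((n * j) % Q) / Q) for n in range(1, N + 1))
        total += S ** 3 * cmath.exp(-2j * math.pi * ((N * j) % Q) / Q)
    return (total / Q).real


N = 41  # illustrative choice
lam = von_mangoldt_table(N)
print(r3_direct(N, lam), r3_circle(N, lam))
```

The two printed numbers should agree up to floating point error.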
Exercise 2.3 (Waring’s problem). We want to know whether we can
write $N = x_1^k + x_2^k + \cdots + x_s^k$ (i.e., as a sum of four squares or nine
cubes, etc.)
(1) First, find a probabilistic guess for the number of such
representations.
(2) Use the circle method
\[ \int_0^1 \left( \sum_{n \le N^{1/k}} e(n^k \alpha) \right)^s e(-N\alpha)\,d\alpha. \]
Then, find portions of the integrand that correspond to the
right probabilistic answer.
Returning to our integral for three primes, let’s think about when
S(α) is big.

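Example 2.4. Let’s try $\alpha = 1/3$:
\[ S(1/3) = \sum_{n \le N} \Lambda(n) e(n/3). \]
We have a contribution from powers of 3 which is about log N, so
\[ \sum_{n \le N} \Lambda(n) e(n/3) = O(\log N) + e(1/3)\Psi(N; 3, 1) + e(2/3)\Psi(N; 3, 2) \sim -N/2, \]
since $e(1/3) + e(2/3) = -1$ and
\[ \Psi(N; 3, 1) \sim \Psi(N; 3, 2) \sim \frac{N}{\varphi(3)} = \frac{N}{2}. \]
Here $\Psi(N; a, b)$ counts, with weight $\Lambda$, the prime powers up to N
which are b mod a. In particular, $|S(1/3)| \sim N/2$.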
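
The size of $S(1/3)$ can be checked numerically. The following Python sketch (an illustration, not from the lecture) evaluates $S(a/q)$ at a rational point using exact residues $an \bmod q$ and compares $S(1/3)$ with $-N/2$. The helper names and the value $N = 20000$ are assumptions made for the example.

```python
import cmath
import math


def von_mangoldt_table(limit):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    return lam


def S_at_rational(a, q, N, lam):
    """Compute S(a/q) = sum_{n <= N} Lambda(n) e(a n / q)."""
    roots = [cmath.exp(2j * math.pi * r / q) for r in range(q)]
    return sum(lam[n] * roots[(a * n) % q] for n in range(1, N + 1))


N = 20000  # illustrative choice
lam = von_mangoldt_table(N)
value = S_at_rational(1, 3, N, lam)
print(value, -N / 2)
```

The real part should be close to $-N/2$ and the imaginary part comparatively small, reflecting that the primes are split evenly between the classes 1 and 2 mod 3.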
Remark 2.5. Note that
\[ S\left(\frac{a}{q}\right) \]
counts approximately the distribution of primes in progressions mod
q with (a, q) = 1. Sometimes when q is small, since we’re only counting
primes coprime to q, we will get an answer substantially away from
0.
We’ll later need to think through the uniformity of q in terms of N.

Remark 2.6 (Insight). S(α) is big near most rational numbers with
small denominators. It’s not big near 1/4, so we’re only saying it’s
big near certain ones. This might have something to do with whether
the denominator of the rational number is square free: if it is not
square free, you essentially get translates over roots of unity of that
prime whose square divides q, and things cancel out.
Exercise 2.7. Show that for $(a, q) = 1$,
\[ \sum^{*}_{k \bmod q} e\left(\frac{ak}{q}\right) = \mu(q), \]
where the ∗ means (k, q) = 1 and µ is the Möbius function.
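
A quick numerical check of this exercise, as a Python sketch outside the lecture: it sums $e(ak/q)$ over reduced residues k and compares with $\mu(q)$, assuming $(a, q) = 1$. The function names are illustrative.

```python
import cmath
import math


def mobius(n):
    """Moebius function via trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a square divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def reduced_residue_sum(a, q):
    """Sum of e(a k / q) over 1 <= k <= q with gcd(k, q) = 1."""
    return sum(
        cmath.exp(2j * math.pi * ((a * k) % q) / q)
        for k in range(1, q + 1)
        if math.gcd(k, q) == 1
    )


for q in range(1, 13):
    print(q, mobius(q), round(reduced_residue_sum(1, q).real, 6))
```

For non-squarefree q such as 4, 8, 9, 12 both columns are 0, which matches the remark that S(α) is not big near 1/4.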


Goal 2.8. If α is not near a rational number with small denominator
then |S(α)| is small.
To accomplish this, Hardy and Littlewood decided to split [0, 1]
into two parts: major arcs M and minor arcs m. The major arcs are
close to a/q for q small, and the minor arcs are the rest. The measure of
the minor arcs is big, while the measure of the major arcs is small.
That is, the minor arcs have nearly full measure. So, there
is a very small set on which the generating function is big. There is
also a big set on which the generating function is small.
There is a trivial bound |S(α)| ≤ Ψ( N ) ∼ N, using the triangle
inequality. One can also work out
Lemma 2.9. We have
\[ \int_0^1 |S(\alpha)|^2\,d\alpha \sim N \log N. \]
Proof.
\begin{align*}
\int_0^1 |S(\alpha)|^2\,d\alpha &= \sum_{n_1, n_2 \le N} \Lambda(n_1)\Lambda(n_2) \int_0^1 e\left((n_1 - n_2)\alpha\right) d\alpha \\
&= \sum_{n \le N} \Lambda(n)^2 \\
&= \sum_{n \le N} (\log n)\Lambda(n) + O(\sqrt{N}\log N) \\
&\sim N \log N.
\end{align*}

Exercise 2.10. Verify the above, where the $O(\sqrt{N}\log N)$ difference
is coming from prime powers. The idea for the last step is that most
numbers less than N are on the order of N. One might use partial
summation, which is integration by parts.
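
The two steps of the lemma can be illustrated numerically. This Python sketch (an illustration, not from the lecture) first checks that the mean square of S over the grid $\alpha = j/N$ equals $\sum_{n \le N} \Lambda(n)^2$ for a small N, and then compares $\sum_{n \le N} \Lambda(n)^2$ with $N \log N$ for a larger N. The function names and the sizes 150 and 100000 are assumptions.

```python
import cmath
import math


def von_mangoldt_table(limit):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    return lam


def mean_square_on_grid(N, lam):
    """Average |S(j/N)|^2 over j = 0..N-1; equals the integral since |n1 - n2| < N."""
    total = 0.0
    for j in range(N):
        S = sum(lam[n] * cmath.exp(2j * math.pi * ((n * j) % N) / N) for n in range(1, N + 1))
        total += abs(S) ** 2
    return total / N


def sum_of_squares(N, lam):
    """Return the sum of Lambda(n)^2 over n <= N."""
    return sum(value * value for value in lam[: N + 1])


small = 150
lam_small = von_mangoldt_table(small)
print(mean_square_on_grid(small, lam_small), sum_of_squares(small, lam_small))

large = 100000
lam_large = von_mangoldt_table(large)
print(sum_of_squares(large, lam_large) / (large * math.log(large)))
```

The final ratio tends to 1 only slowly, since $\sum_{n \le N} (\log n)\Lambda(n)$ is closer to $N \log N - N$.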

Usually,
\[ |S(\alpha)| \sim \sqrt{N \log N}. \]
If α is far from every rational number, such as the golden ratio, φ,
we might try to compute S(φ). We might expect that $S(\varphi) \ll N^{1/2 + \varepsilon}$.
We don’t know whether this is true, but we do know
\[ \sum_{n \le N} \Lambda(n) e(n\varphi) \ll N^{1 - \delta} \]
for some δ > 0 (and it will probably even be a pretty large δ). We will
now develop a technique saying that once you are far away from a
rational number, you can get this sort of power saving.
2.4. Strategy for determining asymptotics for R3 ( N ). We have the
integral
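\[ \int_0^1 S(\alpha)^3 e(-N\alpha)\,d\alpha = \left( \int_{\mathfrak{M}} + \int_{\mathfrak{m}} \right) S(\alpha)^3 e(-N\alpha)\,d\alpha. \]
We want the second part over the minor arcs to be smaller than $N^2$.
The idea is that we can bound
\[ \left| \int_{\mathfrak{m}} S(\alpha)^3 e(-N\alpha)\,d\alpha \right| \le \left( \sup_{\alpha \in \mathfrak{m}} |S(\alpha)| \right) \int_0^1 |S(\alpha)|^2\,d\alpha \ll (N \log N) \sup_{\alpha \in \mathfrak{m}} |S(\alpha)|. \]
So, it is enough to have
\[ \sup_{\alpha \in \mathfrak{m}} |S(\alpha)| \le \frac{\varepsilon N}{\log N}. \]
This will show the contribution from the minor arcs is less than that
of the major arcs, assuming we know the major arcs contribute $N^2$.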
Exercise 2.11 (Roth). For all δ > 0, there is $N_0 = N_0(\delta)$ so that for all
$N \ge N_0$, every subset $\mathcal{A} \subset [1, N]$ with $|\mathcal{A}| \ge \delta N$ has a
(nontrivial) three term arithmetic progression.
Letting
\[ \mathcal{A}(\alpha) = \sum_{a \in \mathcal{A}} e(a\alpha), \]
we obtain that
\[ \int_0^1 \mathcal{A}(\alpha)^2 \mathcal{A}(-2\alpha)\,d\alpha \]
counts the number of triples (x, y, z) in $\mathcal{A}$ with x + z = 2y. This includes
$|\mathcal{A}|$ trivial solutions, so we want to see this integral is larger. We
might expect $\delta^3 N^2$ solutions. But now, it’s a bit hard to see how to
actually bound this integral.
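Exercise 2.12 (Vague exercise). If, “away from 0,”
\[ |\mathcal{A}(\alpha)| \le \varepsilon |\mathcal{A}|, \]
then the contribution of that portion of the integral
\[ \int_0^1 \mathcal{A}(\alpha)^2 \mathcal{A}(-2\alpha)\,d\alpha \]
is bounded by $\varepsilon |\mathcal{A}|^2$. (We’d like to know something like $\varepsilon \le \delta/10^6$.)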
The idea is that either we have this bound above, or else we get
some additive structure in A which we exploit to get a bigger den-
sity set.
There are notes on this on Sound’s web-page from a course he
taught on additive combinatorics.
Now, we want to focus on showing that for some definition of the
minor arcs, the sum S(α) has a little bit of cancellation.
2.5. Vinogradov’s method. Here is the key idea from Vinogradov’s
method. This comes up many times throughout analytic number
theory. We’d like to understand the sequence
\[ S(\alpha) = \sum_{n \le N} \Lambda(n) e(n\alpha). \]
We could similarly study
\[ \sum_{n \le N} \Lambda(n) e(f(n)), \]
where, say, f(n) is $\sqrt{n}$ or $\sqrt{n} + (\log n)^2$. We could similarly
study
\[ \sum_{n \le N} e(\sqrt{n}) \]
or
\[ \sum_{n \le N} e(t \log n) \]
for looking at zeros of the zeta function. Let’s start with the simplest
version of these, where instead of summing over primes and prime
powers, we sum over all the integers. Say we want to consider
\[ \sum_{n \le N} e(n\alpha). \]
This is a geometric progression, so it is easy to sum:
\[ \sum_{n \le N} e(n\alpha) = \frac{e(\alpha)\left(1 - e(N\alpha)\right)}{1 - e(\alpha)} = \frac{x}{\sin \pi\alpha}, \]
where x is bounded by 2 in absolute value (in fact $|x| = |\sin \pi N \alpha|$).

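Exercise 2.13. Show
\[ \left| \sum_{n \le N} e(n\alpha) \right| \le \min\left( N, \frac{2}{|\sin \pi\alpha|} \right) \ll \min\left( N, \frac{1}{\|\alpha\|} \right), \]
letting $\|\alpha\|$ denote the distance from α to the nearest integer.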
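
The bound in this exercise is easy to test numerically. The Python sketch below (illustrative, not part of the notes) checks the closed form $|\sum_{n \le N} e(n\alpha)| = |\sin \pi N\alpha| / |\sin \pi\alpha|$ and the inequality $|\sum_{n \le N} e(n\alpha)| \le \min(N, 1/(2\|\alpha\|))$, which follows from $|\sin \pi\alpha| \ge 2\|\alpha\|$. The sample points $\alpha = j/997$ and $N = 300$ are arbitrary choices.

```python
import cmath
import math


def exponential_sum(N, alpha):
    """Return the sum of e(n * alpha) for 1 <= n <= N."""
    return sum(cmath.exp(2j * math.pi * n * alpha) for n in range(1, N + 1))


def distance_to_nearest_integer(x):
    """Return ||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))


N = 300
worst_ratio = 0.0
for j in range(1, 997):
    alpha = j / 997
    size = abs(exponential_sum(N, alpha))
    bound = min(N, 1.0 / (2.0 * distance_to_nearest_integer(alpha)))
    worst_ratio = max(worst_ratio, size / bound)
print(worst_ratio)  # never exceeds 1, up to rounding
```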
Let Φ be a smooth function. Then,
\[ \sum_n \Phi(n/N) e(n\alpha) \]
is some smooth version of what we are trying to approximate. We
might try to use the Poisson summation formula. We can write
\[ \sum_n \Phi(n/N) e(n\alpha) = N \sum_k \hat{\Phi}(N(k + \alpha)) \]
and work out the Poisson summation formula. For Φ smooth, the
Fourier transform is rapidly decreasing.

3. 9/28/17
Recall last time we had
\[ R_3(N) := \sum_{n_1 + n_2 + n_3 = N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3). \]
The goal was to show this is of size about $N^2$. We set
\[ S(\alpha) = \sum_{n \le N} \Lambda(n) e(n\alpha), \]
and found
\[ R_3(N) = \int_0^1 S(\alpha)^3 e(-N\alpha)\,d\alpha. \]
The idea is to show that S(α) is large only near rational numbers
with small denominators (the major arcs). On the complement (the
minor arcs $\mathfrak{m}$), we want to show |S(α)| is small, and then bound
\[ \left| \int_{\mathfrak{m}} S(\alpha)^3 e(-N\alpha)\,d\alpha \right| \le \left( \sup_{\alpha \in \mathfrak{m}} |S(\alpha)| \right) \int_0^1 |S(\alpha)|^2\,d\alpha. \]
We then could use Parseval’s identity to bound the last integral by N log N.


Toward the end of last time, we found
\[ \sum_{n \le N} e(n\alpha) \ll \min\left( N, \frac{1}{\|\alpha\|} \right). \]

Exercise 3.1. Count the number of ways of writing $N = n_1 + n_2 + n_3$
asymptotically by writing down the associated integral using the
circle method. The exponential sum will only be big for α near 0.
There is only one major arc in this case. The answer should be about
N 2 /2, and the point is to see where the 1/2 comes from.
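
For orientation, the exact count in this exercise can be computed directly. The short Python sketch below (illustrative only) counts ordered triples of positive integers with $n_1 + n_2 + n_3 = N$ and compares with the closed form $(N-1)(N-2)/2$. Heuristically, the 1/2 is the area of the triangle $\{x_1 + x_2 \le 1,\ x_1, x_2 \ge 0\}$ once $n_3$ is determined by $n_1$ and $n_2$.

```python
def count_representations(N):
    """Count ordered triples (n1, n2, n3) of positive integers with n1 + n2 + n3 = N."""
    count = 0
    for n1 in range(1, N - 1):
        for n2 in range(1, N - n1):
            count += 1  # n3 = N - n1 - n2 >= 1 is then determined
    return count


for N in (10, 100, 1000):
    exact = count_representations(N)
    print(N, exact, (N - 1) * (N - 2) // 2, exact / N ** 2)
```

The last column approaches 1/2 as N grows.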
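Recall from elementary number theory that $\Lambda(n) = \sum_{ab = n} \mu(a) \log b$.
To see this, look at the Dirichlet series
\[ -\frac{\zeta'}{\zeta}(s) = \frac{1}{\zeta(s)} \cdot \left(-\zeta'(s)\right), \]
where the left side has Dirichlet series coefficients Λ(n), $1/\zeta$ has
coefficients µ, and $-\zeta'(s)$ has coefficients log. So, the convolution of µ and
log is Λ. Next, by partial summation,
\begin{align*}
\sum_{n \le N} (\log n) e(n\alpha) &= \int_{1^-}^{N} \log t \; d\left( \sum_{n \le t} e(n\alpha) \right) \\
&= \log N \sum_{n \le N} e(n\alpha) - \int_{1^-}^{N} \frac{1}{t} \sum_{n \le t} e(n\alpha)\,dt \\
&\ll (\log N) \min\left( N, \frac{1}{\|\alpha\|} \right).
\end{align*}
Then,
\[ \sum_{n \le N} \Lambda(n) e(n\alpha) = \sum_{a} \mu(a) \sum_{b \le N/a} (\log b)\, e(ab\alpha). \]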
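
The convolution identity $\Lambda(n) = \sum_{ab = n} \mu(a) \log b$ can be confirmed numerically. The following Python sketch (not from the lecture; function names are illustrative) builds tables of µ and Λ with a sieve and compares both sides for all n up to a small limit.

```python
import math


def mobius_and_mangoldt(limit):
    """Return tables (mu, lam) with mu[n] = mu(n) and lam[n] = Lambda(n) for n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(p, limit + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] *= -1
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    return mu, lam


def mu_convolved_with_log(n, mu):
    """Return the sum of mu(a) * log(b) over factorizations a * b = n."""
    return sum(mu[a] * math.log(n // a) for a in range(1, n + 1) if n % a == 0)


limit = 500
mu, lam = mobius_and_mangoldt(limit)
worst = max(abs(mu_convolved_with_log(n, mu) - lam[n]) for n in range(1, limit + 1))
print(worst)  # of the size of rounding error
```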

Example 3.2. First, let’s try the case α = 0. Then,
\[ \sum_{n \le N} \Lambda(n) = \sum_{a} \mu(a) \sum_{b \le N/a} \log b. \]
If we knew $\sum_{a=1}^{\infty} \frac{\mu(a)}{a} = 0$ we could then prove the prime number
theorem. This is essentially equivalent to proving the prime number
theorem, so it would take some work. Things are good when a is
small, but there is a problem when a is big.
Goal 3.3. Our overall aim is to bound S(α).
3.1. Vinogradov’s idea. We’d like to somehow decompose Λ(n) into
pieces, where either we use a simple exponential sum, or using the
following idea. The idea has to do with bilinear forms.
We write m ∼ M to mean M < m ≤ 2M. Consider
\[ B(M, N) := \sum_{m \sim M} \sum_{n \sim N} a_m b_n \cdot f(m, n), \]
with $a_m, b_n$ arbitrary complex numbers, and f(m, n) an oscillatory
term, such as f(m, n) = e(mnα). Intuitively f(m, n) should have
some “cancellation.”
Goal 3.4. We’d like to bound the sum B(M, N) by something like
\[ \left( \sum_{m \sim M} |a_m|^2 \right)^{1/2} \left( \sum_{n \sim N} |b_n|^2 \right)^{1/2} \cdot \mathcal{N}_f, \]
where $\mathcal{N}_f$ is some sort of operator norm of the matrix f(m, n).


(1) We think of f (m, n) as something that cancels out. It does not
always have the same sign.
(2) We typically have | f (m, n)| small, e.g., ≤ 1.
(3) We might also imagine $|a_m|, |b_n| \le 1$.
We’d then like to compare the bound we obtain to the trivial bound
MN. We’d like to beat this trivial bound.
We cannot hope to beat the trivial bound if
(1) f(m, n) = 1,
(2) f(m, n) = α(m)β(n), or
(3) M or N is small; we need both M and N to be big (or at least
the associated matrix to have large rank).
In order to avoid these impossibilities, we will need to exploit that
f (m, n) is genuinely a 2-variable function, and does not decouple.
To obtain the bound, we will use Cauchy-Schwarz. We have
\[ \left| \sum_{m \sim M} \sum_{n \sim N} a_m b_n f(m, n) \right|^2 \le \left( \sum_{m \sim M} |a_m|^2 \right) \left( \sum_{m \sim M} \left| \sum_{n \sim N} b_n f(m, n) \right|^2 \right). \]
Let ∗ denote
\[ \sum_{m \sim M} \left| \sum_{n \sim N} b_n f(m, n) \right|^2. \]
Then, expanding the square,
\[ ∗ = \sum_{n_1, n_2 \sim N} b_{n_1} \overline{b_{n_2}} \sum_{m \sim M} f(m, n_1) \overline{f(m, n_2)}. \]
From this we have gained that we have replaced the unknown $a_m, b_n$
by inner products of our known matrix f(m, n).
If we knew the f(m, n) were orthogonal, then the sum amounts to
terms with $n_1 = n_2$ of the form
\[ \sum_{n \sim N} |b_n|^2 M. \]
Things will never be quite so good that we will precisely get
orthogonality. But, we might have some approximate orthogonality. For
example, if $n_1 \ne n_2$, maybe we can bound the correlation by 1. Then,
the off-diagonal terms are of the form
\[ \sum_{n_1 \ne n_2} |b_{n_1} b_{n_2}|. \]
Using Cauchy’s inequality, we obtain
\[ |b_{n_1} b_{n_2}| \ll |b_{n_1}|^2 + |b_{n_2}|^2. \]
Hence,
\[ \sum_{n_1 \ne n_2} |b_{n_1} b_{n_2}| \ll N \sum_{n \sim N} |b_n|^2. \]
In the above favorable circumstances, putting the above together, we
get a bound
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n f(m, n) \ll \left( \sum_{m \sim M} |a_m|^2 \right)^{1/2} \left( \sum_{n \sim N} |b_n|^2 \right)^{1/2} \left( \sqrt{M} + \sqrt{N} \right). \]
We might now try setting all $|a_m| = |b_n| = 1$, and then our bound
is $M\sqrt{N} + N\sqrt{M}$ instead of MN, so we save a factor of $\frac{1}{\sqrt{M}} + \frac{1}{\sqrt{N}}$. Again, this
bound holds under various assumptions that the f(m, n) are
approximately orthogonal.
This is the key strategy. We now want to implement the above
strategy in the situation we are in. The key point of the strategy
is that we have transferred the problem from understanding the
unknown $a_m, b_n$ to the known problem of understanding the correlation
of f(m, n).

3.2. Applying Vinogradov’s idea. We now want to bound
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha). \]

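Here one should think of the $a_m$ as something like $\mu(m)$ and the $b_n$ as something like $\log n$, as in $\Lambda(n) = \sum_{ab = n} \mu(a) \log b$. We then want to bound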

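\[ ∗ \le \sum_{n_1, n_2 \sim N} |b_{n_1} b_{n_2}| \left| \sum_{m \sim M} e\left(m(n_1 - n_2)\alpha\right) \right|. \]
Suppose we write $n_1 - n_2 =: k$. Then |k| ≤ N. We then have
\begin{align*}
∗ &\ll \sum_{n_1, n_2 \sim N} \left( |b_{n_1}|^2 + |b_{n_2}|^2 \right) \min\left( M, \frac{1}{\|(n_1 - n_2)\alpha\|} \right) \\
&\ll \left( \sum_{n_1 \sim N} |b_{n_1}|^2 \right) \sum_{|k| \le N} \min\left( M, \frac{1}{\|k\alpha\|} \right).
\end{align*}
We conclude
\[ \sum_{m, n} a_m b_n e(mn\alpha) \ll \left( \sum_{m} |a_m|^2 \right)^{1/2} \left( \sum_{n} |b_n|^2 \right)^{1/2} \left( \sum_{|k| \le N} \min\left( M, \frac{1}{\|k\alpha\|} \right) \right)^{1/2}. \]
We’d like to show we get something from this if α is not close to a
rational number with small denominator. So we keep in mind that α
might be irrational.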
We start with Dirichlet’s theorem:
Theorem 3.5 (Dirichlet). For all Q > 1 and all α ∈ R, there exists a
rational number a/q with (a, q) = 1 and q ≤ Q so that
\[ \left| \alpha - \frac{a}{q} \right| < \frac{1}{qQ}. \]
So, we can get pretty good approximations to irrational numbers
with small denominators. A crude version of this is
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}. \]
We can get approximations of this type by continued fraction
expansions.
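
To make Dirichlet's theorem concrete, here is a small Python sketch (an illustration, not the lecture's method). Instead of continued fractions it simply searches q = 1, ..., Q for the first q with $\|q\alpha\| < 1/Q$, which the pigeonhole principle guarantees exists. The function name and the sample values of α and Q are assumptions, and floating point arithmetic limits it to moderate Q.

```python
import math
from fractions import Fraction


def dirichlet_approximation(alpha, Q):
    """Return a reduced fraction a/q with q <= Q and |alpha - a/q| < 1/(q*Q)."""
    for q in range(1, Q + 1):
        a = round(q * alpha)
        if abs(q * alpha - a) < 1.0 / Q:
            return Fraction(a, q)  # Fraction reduces to lowest terms
    raise ValueError("no approximation found; check floating point precision")


for alpha in (math.sqrt(2), math.pi, (math.sqrt(5) - 1) / 2):
    approx = dirichlet_approximation(alpha, 100)
    q = approx.denominator
    print(approx, abs(alpha - approx), 1.0 / (q * 100))
```

In each printed row the middle number (the actual error) is smaller than the last one (the guaranteed bound $1/(qQ)$).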
Let (∗) denote
\[ (∗) := \sum_{|k| \le N} \min\left( M, \frac{1}{\|k\alpha\|} \right). \]
We should expect that if q is small, then we might revert to the trivial
bound MN. Perhaps there is some inverse relationship with q. So
maybe we get something like $MN/q$ or $MN/\sqrt{q}$. So, the larger q
gets, the more saving we should get over the trivial bound. So, very
small values of q are not good, but we’d like to show that if q is in
some intermediate range, we might hope to be in a good situation.
So, the bound we will write down will depend on the Diophantine
properties of α and the scale on which we are operating.
So, assume α has the rational approximation given by Dirichlet’s
theorem satisfying
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}. \]
Split the range of k into several intervals of length q. How does
$\|k\alpha\|$ vary on such an interval? There is essentially at most one
value which is very close to an integer, and the other values are
spaced out by about 1/q. Then, the sum over one such interval is
\[ \ll \sum_{0 \le a \le q} \min\left( M, \frac{q}{a} \right) \ll M + q \log q. \]
The log q is unimportant and we can remove it if we’d like, using
Poisson summation if we had a smooth function. It would then be
$\min(M, q/a^2)$ and we could remove the log.
We now want to bound the following by dividing the range |k| ≤ N into
$\ll N/q + 1$ intervals of length q:
\begin{align*}
(∗) = \sum_{|k| \le N} \min\left( M, \frac{1}{\|k\alpha\|} \right) &\ll (N/q + 1)(M + q \log q) \\
&\ll (M + q)(N + q) \frac{\log q}{q}.
\end{align*}
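
The bound $\sum_{|k| \le N} \min(M, 1/\|k\alpha\|) \ll (N/q + 1)(M + q \log q)$ can be explored numerically. In the Python sketch below (illustrative, not from the lecture) we take α to be the golden ratio minus 1, with the Fibonacci approximation a/q = 55/89, which satisfies $|\alpha - a/q| \le 1/q^2$. The values M = 1000 and N = 5000 are arbitrary, and the implied constant is not specified in the notes, so the code only prints the ratio of the two sides.

```python
import math


def distance_to_nearest_integer(x):
    """Return ||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))


def min_sum(M, N, alpha):
    """Return the sum over |k| <= N of min(M, 1/||k alpha||)."""
    total = 0.0
    for k in range(-N, N + 1):
        d = distance_to_nearest_integer(k * alpha)
        total += M if d * M <= 1.0 else 1.0 / d  # the k = 0 term contributes M
    return total


alpha = (math.sqrt(5) - 1) / 2
a, q = 55, 89  # Fibonacci convergent, |alpha - a/q| <= 1/q^2
M, N = 1000, 5000

actual = min_sum(M, N, alpha)
model = (N / q + 1) * (M + q * math.log(q))
trivial = (2 * N + 1) * M
print(actual, model, actual / model, trivial)
```

The ratio of the actual sum to the model is a modest constant, while the trivial bound is larger by a factor of several dozen.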

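We have proven:
Proposition 3.6. If $\left| \alpha - \frac{a}{q} \right| < \frac{1}{q^2}$ and (a, q) = 1, then
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha) \ll \left( \sum_{m} |a_m|^2 \right)^{1/2} \left( \sum_{n} |b_n|^2 \right)^{1/2} \left( \frac{\log q}{q} \right)^{1/2} \left( \sqrt{MN} + \sqrt{Mq} + \sqrt{Nq} + q \right). \]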
Question 3.7. Why is the above bound useful?
Just to summarize, it might be helpful to think of the case that
the $a_m, b_n$ are bounded in absolute value by 1. We then get approximately
(ignoring the log) a bound by
\[ \frac{\sqrt{MN}}{\sqrt{q}} \left( \sqrt{MN} + \sqrt{q}\left( \sqrt{M} + \sqrt{N} \right) + q \right) = \frac{MN}{\sqrt{q}} + MN \left( \frac{1}{\sqrt{M}} + \frac{1}{\sqrt{N}} \right) + \sqrt{q}\sqrt{MN}. \]
The middle term is what we would get from the orthogonality
relation. If q is small or large, we don’t beat the trivial bound, as
expected. This is the crucial bound for our particular bilinear form.
Next time, we’ll try and rewrite the coefficients as a bilinear form
as a function of multiple summands. The final ingredient is to write
a combinatorial identity to express Λ(n) in terms of things we un-
derstand. That is, we want to write it as something like

\[ \sum_{b} (\log b)\, e(b\alpha) + \sum_{m, n} a_m b_n e(mn\alpha), \]
where both m and n are large in the second sum.

4. 10/3/17
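4.1. Review. Last time, we were trying to bound sums like
\[ \left| \sum_{m} \sum_{n} a_m b_n f(m, n) \right|^2 \le \left( \sum_{m} |a_m|^2 \right) \sum_{m} \left| \sum_{n} b_n f(m, n) \right|^2, \]
where
\begin{align*}
\sum_{m} \left| \sum_{n} b_n f(m, n) \right|^2 &\le \sum_{n_1, n_2} |b_{n_1} b_{n_2}| \left| \sum_{m} f(m, n_1) \overline{f(m, n_2)} \right| \\
&\ll \sum_{n_1, n_2} \left( |b_{n_1}|^2 + |b_{n_2}|^2 \right) \left| \sum_{m} f(m, n_1) \overline{f(m, n_2)} \right|,
\end{align*}
and we hoped that the correlations, i.e., the terms $\sum_m f(m, n_1) \overline{f(m, n_2)}$,
were bounded by M if $n_1 = n_2$ and O(1) if $n_1 \ne n_2$. We then could
bound the above by $(M + N) \sum_n |b_n|^2$.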
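Recall last time, we were trying to bound sums like
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha), \]
where m ∼ M means M < m ≤ 2M. We had
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}, \]
with (a, q) = 1. We proved last time
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha) \ll \frac{\sqrt{\log q}}{\sqrt{q}} \left( \sum_{m} |a_m|^2 \right)^{1/2} \left( \sum_{n} |b_n|^2 \right)^{1/2} \left( \sqrt{MN} + \sqrt{q}\left( \sqrt{M} + \sqrt{N} \right) + q \right). \]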
m n

We had
\[ \Lambda(n) = \sum_{ab = n} \mu(a) \log b, \qquad S(\alpha) = \sum_{n \le N} \Lambda(n) e(n\alpha). \]
Our tools are
(1) The sum
\[ \sum_{n \le x} (\log n)\, e(n\alpha) \]
is, after factoring out a log, some sort of geometric progression,
which we understand.
(2) The sum
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha) \]
is well bounded using Vinogradov’s method discussed
last time (and above this lecture), assuming the covariances
are small for off-diagonal terms and on the order of the
number of elements for the diagonal terms.
Our goal is now the following combinatorial one:
Goal 4.1. Write Λ(n) in a form where we can use the above two tools.
Theorem 4.2 (Vaughan’s identity). We have
\[ \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = -\frac{\zeta'}{\zeta}(s), \]
\[ \frac{1}{\zeta(s)} = \sum_{a} \frac{\mu(a)}{a^s}, \]
\[ -\zeta'(s) = \sum_{b} \frac{\log b}{b^s}. \]
Proof. This follows from straightforward manipulations of Dirichlet
series; the first one comes from the derivative of log ζ(s). □

4.2. Mollifying ζ(s). We now want to mollify ζ(s). For this, we will
use Selberg’s sieve.
One way to study the zeta function could be an appropriate
truncation. We may consider
\[ M(s) = \sum_{n \le U} \frac{\mu(n)}{n^s}, \]
which is a sort of approximation to the inverse of the Riemann ζ
function, using the above identity that
\[ \frac{1}{\zeta(s)} = \sum_{a} \frac{\mu(a)}{a^s}. \]
One can compute
\[ \zeta(s) M(s) = \sum_{n=1}^{\infty} \frac{a(n)}{n^s}, \]
where a(n) is defined by
\[ a(n) = \sum_{d \mid n,\ d \le U} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } 1 < n \le U, \\ \text{bounded in absolute value by } d(n) & \text{if } n > U, \end{cases} \]
where d(n) is the number of divisors of n.
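
The three cases for the coefficients a(n) of ζ(s)M(s) can be checked directly. In this Python sketch (illustrative, not from the lecture) the cutoff U = 30 and the range n ≤ 2000 are arbitrary choices, and the function names are assumptions.

```python
def mobius(n):
    """Moebius function via trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def truncated_mobius_sum(n, U, mu):
    """a(n) = sum of mu(d) over divisors d of n with d <= U."""
    return sum(mu[d] for d in range(1, min(n, U) + 1) if n % d == 0)


def divisor_count(n):
    """d(n), the number of divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


U = 30
mu = [0] + [mobius(d) for d in range(1, U + 1)]
for n in (1, 2, 12, 30, 31, 62, 210, 1024):
    print(n, truncated_mobius_sum(n, U, mu), divisor_count(n))
```

For 1 < n ≤ U the truncation is no truncation at all, so a(n) is the full sum $\sum_{d \mid n} \mu(d) = 0$.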

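We have
\begin{align*}
-\frac{\zeta'}{\zeta}(s) &= -\frac{\zeta'}{\zeta}(s) \left( 1 - \zeta M + \zeta M \right) \\
&= -\zeta'(s) M(s) - \frac{\zeta'}{\zeta}(s) \left( 1 - \zeta(s) M(s) \right).
\end{align*}
First, we should be fairly happy with the term $-\zeta'(s) M(s)$, which
has Dirichlet series given by
\[ \left( \sum_{b} \frac{\log b}{b^s} \right) \left( \sum_{n \le U} \frac{\mu(n)}{n^s} \right), \]
and we’ll have a long sum in the b’s, where we can hope to get some
cancellation. Next, we want to understand
\[ \frac{\zeta'}{\zeta}(s) \left( 1 - \zeta(s) M(s) \right). \]
We try to think of this product as a sort of bilinear form. The terms
from $1 - \zeta M(s)$ only matter for n larger than U. Thinking of this
term as a bilinear term, we’re happy because $1 - \zeta M(s)$ only involves
large n. But, we have to ensure that $\zeta'/\zeta$ is not too “skinny.” To deal
with this, we can subtract out the small primes, and then later add them back.
To accomplish this, we define
\[ P(s) := \sum_{n \le V} \frac{\Lambda(n)}{n^s}. \]
Then,
\begin{align*}
-\frac{\zeta'}{\zeta}(s) &= \left( -\frac{\zeta'}{\zeta}(s) - P(s) \right) + P(s) \\
&= \sum_{n > V} \frac{\Lambda(n)}{n^s} + \sum_{n \le V} \frac{\Lambda(n)}{n^s}.
\end{align*}
Then,
\begin{align*}
-\frac{\zeta'}{\zeta}(s) &= -\zeta'(s) M(s) - \frac{\zeta'}{\zeta}(s) \left( 1 - \zeta M(s) \right) \\
&= -\zeta'(s) M(s) + \left( -\frac{\zeta'}{\zeta}(s) - P(s) \right) \left( 1 - \zeta M(s) \right) + P(s) \left( 1 - \zeta M(s) \right) \\
&= -\zeta'(s) M(s) + \left( -\frac{\zeta'}{\zeta}(s) - P(s) \right) \left( 1 - \zeta M(s) \right) + P(s) - \zeta(s) M(s) P(s).
\end{align*}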

The point of this breakdown is that we now have three terms we can
handle using our two tools we have.
The middle term decomposes into two parts, both of which are
big, which gives a bilinear form.
The last term has a long sum from the ζ(s) term in simple
coefficients. For P(s), we can just ignore it because V is small. The first
term is similarly a long sum from the $\zeta'$ term.
Remark 4.3. The first term which we handle via our first summation
technique is called a “type 1 sum” and the second handled via our
second bilinear form summation technique is called a “type 2 sum.”
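
Comparing Dirichlet coefficients, the decomposition of $-\zeta'/\zeta$ into $-\zeta' M$, $P$, $-\zeta M P$ and $(-\zeta'/\zeta - P)(1 - \zeta M)$ says that Λ(n) is a sum of four explicit convolutions. The Python sketch below (an illustration, not from the lecture) verifies this coefficient by coefficient. The cutoffs U = 5, V = 7 and the range n ≤ 300 are arbitrary, and all function names are assumptions.

```python
import math


def mobius_and_mangoldt(limit):
    """Return tables (mu, lam) with mu[n] = mu(n) and lam[n] = Lambda(n) for n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(p, limit + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] *= -1
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    return mu, lam


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def vaughan_terms(n, U, V, mu, lam):
    """Return the n-th coefficients of the four pieces of the decomposition."""
    # -zeta'(s) M(s): sum of mu(d) log(r) over d * r = n with d <= U
    zeta_prime_M = sum(mu[d] * math.log(n // d) for d in divisors(n) if d <= U)
    # P(s): Lambda(n) restricted to n <= V
    small_primes = lam[n] if n <= V else 0.0
    # -zeta(s) M(s) P(s): minus the sum of mu(d) Lambda(l) over d * l * r = n, d <= U, l <= V
    zeta_M_P = -sum(
        mu[d] * lam[l]
        for d in divisors(n) if d <= U
        for l in divisors(n // d) if l <= V
    )
    # (-zeta'/zeta - P)(1 - zeta M): sum of Lambda(k) c(m) over m * k = n with k > V,
    # where c(m) is the m-th coefficient of 1 - zeta(s) M(s)
    bilinear = 0.0
    for m in divisors(n):
        k = n // m
        if k > V:
            c_m = (1 if m == 1 else 0) - sum(mu[d] for d in divisors(m) if d <= U)
            bilinear += lam[k] * c_m
    return zeta_prime_M, small_primes, zeta_M_P, bilinear


U, V, limit = 5, 7, 300
mu, lam = mobius_and_mangoldt(limit)
worst = max(abs(sum(vaughan_terms(n, U, V, mu, lam)) - lam[n]) for n in range(1, limit + 1))
print(worst)  # of the size of rounding error
```

The bilinear piece only receives contributions from factorizations n = mk with m > U and k > V, which is what makes it a type 2 sum.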
Example 4.4. Let’s say we want to write
\begin{align*}
-\frac{\zeta'}{\zeta}(s) &= -\frac{\zeta'}{\zeta}(s) \left( (1 - \zeta M)^2 + 2\zeta M - \zeta^2 M^2 \right) \\
&= \left( -\frac{\zeta'}{\zeta} + \zeta' M \right) (1 - \zeta M) - 2\zeta' M + \zeta \zeta' M^2.
\end{align*}
The first is a type 2 sum, the second is a type 1. The product $\zeta\zeta'$ is
somewhat like a simple divisor function, because if the product of ζ and $\zeta'$
is summed over a long range then at least one of them must be summed over a
long range. So, the third term is also a type 1 sum.
Remark 4.5 (Heath-Brown identity, aka binomial theorem). Given
\[ -\frac{\zeta'}{\zeta}(s) (1 - \zeta M)^k, \]
one can try expanding this in k via the binomial theorem, and try to
bound various terms.
4.3. Proving Vinogradov’s theorem. Recall we have
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2} \]
with (a, q) = 1. Our goal is to bound S(α) in terms of q. Trivially we
know S(α) is bounded by N, and we want to save a bit more than
one log on the minor arcs, and then we’ll have to concentrate on the
major arcs.
We use Vaughan’s identity (where we have not yet specified U and
V). There are three type 1 sums and one type 2 sum (from the bilinear
form). Recall we have
\[ -\frac{\zeta'}{\zeta}(s) = -\zeta'(s) M(s) + \left( -\frac{\zeta'}{\zeta}(s) - P(s) \right) \left( 1 - \zeta M(s) \right) + P(s) - \zeta(s) M(s) P(s) \]

and we are trying to bound the four sums. First we deal with the
P(s) term, which is
\[ \sum_{n \le V} \Lambda(n) e(n\alpha) \ll V. \]
Next, we try to bound the first term, which is the contribution
from primes coming from $\zeta' M$. This term is
\[ \sum_{n \le U} \mu(n) \sum_{r \le N/n} (\log r)\, e(nr\alpha) \ll (\log N) \sum_{n \le U} \min\left( \frac{N}{n}, \frac{1}{\|n\alpha\|} \right). \]
It is convenient to split over dyadic blocks $2^k < n \le 2^{k+1}$.

Exercise 4.6. Carry out the argument from week 1 for dealing with
sums like
\[ \sum_{|n| \le N} \min\left( N, \frac{1}{\|n\alpha\|} \right). \]

There is a small lie in what we will next do, and your job is to fix it
by Thursday. You should check what happens for smaller n as well.

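Pretending that only the large range matters, we can bound
\begin{align*}
\sum_{n \le U} \mu(n) \sum_{r \le N/n} (\log r)\, e(nr\alpha) &\ll (\log N) \sum_{n \le U} \min\left( \frac{N}{n}, \frac{1}{\|n\alpha\|} \right) \\
&\ll (\log N) \left( \frac{U}{q} + 1 \right) \left( \frac{N}{U} + q \log q \right) \\
&\ll (\log N)^2 \left( \frac{N}{q} + U + \frac{N}{U} + q \right).
\end{align*}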

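Now, we’ll aim to attack the last type 1 sum, which is the term
corresponding to ζ(s)M(s)P(s), which is
\[ \sum_{n \le U} \sum_{\ell \le V} \mu(n) \Lambda(\ell) \sum_{r \le N/n\ell} e(n\ell r\alpha). \]
If we let $n\ell = a$ then a ≤ UV. Then, the terms in a are bounded
by something like $\sum_{n\ell = a} \Lambda(\ell) = \log a$ (using that the left hand side
is the convolution of ζ with $-\zeta'/\zeta$, which is $-\zeta'$, which has coefficients
given by log). Therefore,
\begin{align*}
\sum_{n \le U} \sum_{\ell \le V} \mu(n) \Lambda(\ell) \sum_{r \le N/n\ell} e(n\ell r\alpha) &\ll \sum_{a \le UV} (\log a) \left| \sum_{r \le N/a} e(ar\alpha) \right| \\
&\ll (\log N) \sum_{a \le UV} \min\left( \frac{N}{a}, \frac{1}{\|\alpha a\|} \right) \\
&\ll (\log N)^2 \left( \frac{N}{q} + q + UV + \frac{N}{UV} \right).
\end{align*}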

Exercise 4.7. Verify the above bounds using a method similar to the
type 1 bound of the first $\zeta' M$ term.
Adding up our three type 1 sums, and removing terms trivially
bounded by others, we get
\[ (\log N)^2 \left( \frac{N}{q} + q + UV + \frac{N}{U} \right). \]
This handles three of the four terms. The last term remaining to be
handled is the type 2 sum corresponding to
\[ \left( -\frac{\zeta'}{\zeta}(s) - P(s) \right) \left( 1 - \zeta M(s) \right). \]
The first factor only contains terms larger than V and the second only
contains terms larger than U. Using
\[ 1 - \zeta M(s) = \sum_{n} \frac{1}{n^s} \left( \sum_{d \mid n,\ d > U} \mu(d) \right), \]
we obtain the sum
\[ \sum_{n > V} \Lambda(n) \sum_{m > U} \left( \sum_{d \mid m,\ d > U} \mu(d) \right) e(mn\alpha) \]
with mn ≤ N.
Remark 4.8. We now have two terms with variables in our bilinear
form, both with large values. Both will range over dyadic intervals.
It starts to look like a bilinear form, though there is the caveat that the
two variables are connected by the condition that mn ≤ N. Hence,
we need some technical device to separate the variables.
Essentially, this is saying these are like points lying below a hy-
perbola and we would like to approximate the hyperbola by some
rectangle.
Morally, we have
\[ \sum_{n > V} \Lambda(n) \sum_{m > U} \left( \sum_{d \mid m,\ d > U} \mu(d) \right) e(mn\alpha) \]
with mn ≤ N. Ignoring the condition mn ≤ N (which we will fix
next time), the above sum is approximated by
\[ \sum_{m \sim A} \sum_{n \sim B} a(m) b(n) e(mn\alpha) \]
for A > U, B > V, AB ≤ N. This is the kind of bilinear form we
want for our type 2 sum.
Using the bilinear form estimate for type 2 sums, we get the
estimate
\[ \left( \sum_{m \sim A} a(m)^2 \right)^{1/2} \left( \sum_{n \sim B} b(n)^2 \right)^{1/2} \frac{\sqrt{\log q}}{\sqrt{q}} \left( \sqrt{AB} + \left( \sqrt{A} + \sqrt{B} \right) \sqrt{q} + q \right). \]
Note that the correlation is as needed because we have checked it for
the particular bilinear form $a_m b_n e(mn\alpha)$. Here, $b_n = \Lambda(n)$. Then,
\[ \sum_{n \sim B} \Lambda(n)^2 \ll B \log B. \]
Next,
\[ \sum_{m \sim A} a(m)^2 \ll \sum_{m \sim A} d(m)^2. \]
Exercise 4.9. Show
\[ \sum_{n \le x} d(n) \sim x \log x \]
(write this as $\sum_{n \le x} \sum_{d \mid n} 1$ and interchange the two sums).

It turns out $\sum_{n \le x} d(n)^2 \sim C x (\log x)^3$. We end up getting a bound
from
\[ \sum_{m \sim A} a(m)^2 \ll \sum_{m \sim A} d(m)^2 \ll A (\log A)^3. \]
So, we have some loose ends which we shall address next time
including
(1) thinking about these sum of divisor functions up to x,
(2) thinking through the type one bounds for the first and fourth
terms,
(3) and putting these things all together.

5. 10/5/17
5.1. Recap of last time. Recall that we have defined
\[ S(\alpha) := \sum_{n \le N} \Lambda(n) e(n\alpha). \]
Our goal is to bound these exponential sums. We assume
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2} \]
for (a, q) = 1. We had a way of approaching this bound with
exponential sums and bilinear forms. We used the combinatorial identity
\[ -\frac{\zeta'}{\zeta}(s) = P(s) - \zeta'(s) M(s) - \zeta(s) M(s) P(s) + \left( -\frac{\zeta'}{\zeta}(s) - P(s) \right) \left( 1 - \zeta(s) M(s) \right), \]
where
\[ P(s) = \sum_{n \le V} \frac{\Lambda(n)}{n^s} \]
and
\[ M(s) = \sum_{n \le U} \frac{\mu(n)}{n^s}. \]

The first three terms are type 1 sums, and the last term is a type 2
sum. Last time, we discussed the bound for the type 1 sums. We
saw they were bounded by
\[ \ll (\log N)^2 \left( \frac{N}{q} + q + UV + \frac{N}{U} \right). \]
For example, to bound the term $\zeta'(s) M(s)$, we had to bound
\[ \sum_{n \le U} \min\left( \frac{N}{n}, \frac{1}{\|n\alpha\|} \right). \]
We could split this into dyadic blocks and carry out the usual sum.
When 1 ≤ n ≤ q, we cannot split it into intervals of length q.
Exercise 5.1. For the blocks, we should take a dyadic sum over
intervals $2^k q \le n \le 2^{k+1} q$, and then we should pay attention to the case
1 ≤ n ≤ q, and we should get a bound around q log q or something
like that for the sum of the first q terms.

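At the end of last class, we were discussing the type 2 sums. There
were many small things we needed to keep track of. We wrote the
sum as
\[ \sum_{n > V} \Lambda(n) \sum_{m > U,\ mn \le N} \left( \sum_{d \mid m,\ d > U} \mu(d) \right) e(mn\alpha). \]
Last time, we divided this sum into dyadic intervals with m ∼ A
and n ∼ B.
Remark 5.2. We have to justify why the sum can be split
into dyadic blocks subject to the condition that mn ≤ N.
We then bounded the above by
\begin{align*}
&\left( \sum_{m \sim A} d(m)^2 \right)^{1/2} \left( \sum_{n \sim B} \Lambda(n)^2 \right)^{1/2} \frac{\sqrt{\log q}}{\sqrt{q}} \left( \sqrt{AB} + \left( \sqrt{A} + \sqrt{B} \right) \sqrt{q} + q \right) \\
&\ll \left( \sum_{m \sim A} d(m)^2 \right)^{1/2} (B \log B)^{1/2} \frac{\sqrt{\log q}}{\sqrt{q}} \left( \sqrt{AB} + \left( \sqrt{A} + \sqrt{B} \right) \sqrt{q} + q \right).
\end{align*}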

5.2. Bounding the sum of the divisor function. We can see
\begin{align*}
\sum_{n \le x} d(n) &= \sum_{a \le x} \sum_{b \le x/a} 1 \\
&= \sum_{a \le x} \left( \frac{x}{a} + O(1) \right) \\
&= x \log x + O(x).
\end{align*}
This is a wasteful O(1) when a is large. Dirichlet’s idea was to deal
with the hyperbola ab = x and count the points with a ≤ A and the
points with b ≤ B, where AB = x. One then corrects for the points
counted twice, namely those with both a ≤ A and b ≤ B. When one carries this
out, one gets an error term on the size of A + B, instead of x.
One can take $A = B = \sqrt{x}$. One ends up getting
\[ \sum_{n \le x} d(n) = x \log x + (2\gamma - 1) x + O(\sqrt{x}). \]
Exercise 5.3. Carry out Dirichlet’s idea and check this error term.
Then, we can compute
\[ d_k(n) = \sum_{a_1 \cdots a_k = n} 1 = \sum_{ab = n} d_{k-1}(b), \]
and use induction. If we knew $\sum_{b \le y} d_{k-1}(b)$, we could then use the
hyperbola method for a ≤ A, b ≤ B and choose the parameters A, B
with AB = x.
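
Dirichlet's hyperbola method for $\sum_{n \le x} d(n)$ can be written out in a few lines. The Python sketch below (an illustration, not from the lecture) uses the symmetric choice $A = B = \lfloor \sqrt{x} \rfloor$, compares the result with the direct sum $\sum_{a \le x} \lfloor x/a \rfloor$, and prints the difference from the main term $x \log x + (2\gamma - 1)x$. The function names and the value $x = 10^6$ are assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329


def divisor_sum_direct(x):
    """Sum of d(n) for n <= x, as the number of pairs (a, b) with a * b <= x."""
    return sum(x // a for a in range(1, x + 1))


def divisor_sum_hyperbola(x):
    """Same sum via the hyperbola method, using only about sqrt(x) terms."""
    y = math.isqrt(x)
    # pairs with a <= y, plus pairs with b <= y, minus pairs with both a, b <= y
    return 2 * sum(x // a for a in range(1, y + 1)) - y * y


x = 10 ** 6
exact = divisor_sum_hyperbola(x)
main_term = x * math.log(x) + (2 * EULER_GAMMA - 1) * x
print(exact, divisor_sum_direct(x), exact - main_term, math.sqrt(x))
```

The first two numbers agree exactly, and the third is small compared with $\sqrt{x}$.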
5.3. A second method for bounding the sum of the divisor
function. We now want a second method of calculating this. We start from
\[ \zeta(s)^2 = \sum_{n} \frac{d(n)}{n^s}. \]
We have
\[ \sum_{n \le x} d(n) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \zeta(s)^2 \frac{x^s}{s}\,ds \]
for c > 1.
Exercise 5.4. Show
\[ \frac{1}{2\pi i} \int_{(c)} y^s \frac{ds}{s} = \begin{cases} 1 & \text{if } y > 1, \\ 0 & \text{if } y < 1 \end{cases} \]
(see Davenport’s book), where the path (c) means the line from c − i∞ to
c + i∞. Essentially, one can prove this by noting the integral is 0 for
very small y, and the integrand has only one pole, at s = 0, which has residue 1.
We expand
\[ x^s = x \left( 1 + (s - 1) \log x + \cdots \right) \]
and
\[ \zeta(s) = \frac{1}{s - 1} + \gamma + O(s - 1). \]
The residue of $\zeta(s)^2 x^s / s$ at the pole s = 1 is
\[ x \log x + (2\gamma - 1) x. \]
It would be useful, and can be done easily, to have some bounds like
\[ |\zeta(s)| \ll (1 + |t|)^{1/2} \]
where s = σ + it, 0 ≤ σ ≤ 1.
We are trying to bound
\[ \frac{1}{2\pi i} \int_{(c)} \zeta(s)^2 \frac{x^s}{s}\,ds = \frac{1}{2\pi i} \int_{c - iT}^{c + iT} \zeta(s)^2 \frac{x^s}{s}\,ds + O\left( \frac{x^c}{T} \right). \]

We can then try to show this integral is something like $x \log x + (2\gamma - 1) x$,
similarly to the way done in Davenport’s book.
In the hope of formalizing this method, suppose we want to bound
\[ \sum_{n \le x} d_k(n), \]
and we consider
\[ \zeta(s)^k = \sum_{n=1}^{\infty} \frac{d_k(n)}{n^s}. \]
We then can compute these by examining
\[ \frac{1}{2\pi i} \int_{(c)} \zeta(s)^k \frac{x^s}{s}\,ds. \]
So, we want to know vaguely what the residue of this integrand at
s = 1 is. The residue is something like
\[ \frac{x (\log x)^{k-1}}{(k - 1)!} \]
with some lower order terms we can work out. The main term will
be x times a polynomial in log x of degree k − 1.
Exercise 5.5. Show we end up getting an asymptotic of the form
\[ \sum_{n \le x} d_k(n) = x P_k(\log x) + O(x^{1 - \delta_k}) \]
for $P_k$ a degree k − 1 polynomial.


Remark 5.6. Gauss’s circle problem is to show that the number of
lattice points in a circle of radius R is
\[ N(R) = \pi R^2 + O(R^{1/2 + \varepsilon}) \]
(the best error currently known is only about $R^{2/3 - \delta}$). Dirichlet’s divisor
problem is to show
\[ \sum_{n \le x} d(n) = x \log x + (2\gamma - 1) x + O(x^{1/4 + \varepsilon}). \]
In both cases, the main term is the area of the region you are
considering. The best error known is only about $O(x^{1/3 - \delta})$.

5.4. Calculating the sum of squares of the divisor function. Now,


we’d like to calculate
∑ d ( n )2 .
n≤ x
We will instead calculate

d ( n )2
 
4 9
∑ ns = ∏ 1 + ps + p2s + · · · .
n =1 p

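Note that this will converge absolutely whenever we are to the right
of 1. We have
\[ \zeta(s) = \prod_p \left(1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots\right). \]
We can factor
\[ \sum_{n=1}^{\infty} \frac{d(n)^2}{n^s} = \prod_p \left(1 + \frac{4}{p^s} + \frac{9}{p^{2s}} + \cdots\right) = \zeta(s)^4 F(s) \]
where
\[ F(s) = \prod_p \left(1 + \frac{\alpha}{p^{2s}} + \cdots\right) \]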
for some α, which converges absolutely if Re(s) > 1/2. One then
obtains that
\[ \sum_{n \le x} d(n)^2 \sim C x (\log x)^3, \]
using the expansion for $\zeta^4$ as $x P_4(\log x)$ with $P_4$ of degree 3, with
a power saving error term. We could similarly use the
hyperbola method, writing
\[ d(n)^2 = (d_4 * f)(n) \]
for a multiplicative function f with f(p) = 0.
Remark 5.7. When one actually calculates
\[ \sum_{n} \frac{d(n)^2}{n^s} \]
one finds
\[ \frac{\zeta(s)^4}{\zeta(2s)}, \]
although this identity is not relevant to finding the correct asymptotic formula.
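The identity in Remark 5.7 is equivalent to d(n)^2 = (d_4 * f)(n), where f(m^2) = μ(m) and f vanishes off the squares (so in particular f(p) = 0). The following sketch checks this coefficient by coefficient for small n. All function names here are illustrative helpers, not notation from the notes.

```python
import math


def divisor_counts(N):
    """d[n] = number of divisors of n, for 0 <= n <= N (d[0] is unused)."""
    d = [0] * (N + 1)
    for i in range(1, N + 1):
        for j in range(i, N + 1, i):
            d[j] += 1
    return d


def dirichlet_convolve(a, b, N):
    """(a * b)(n) = sum over i*j = n of a[i] * b[j], for n <= N."""
    c = [0] * (N + 1)
    for i in range(1, N + 1):
        if a[i] == 0:
            continue
        for j in range(1, N // i + 1):
            c[i * j] += a[i] * b[j]
    return c


def mobius_upto(M):
    """Sieve for the Moebius function mu[1..M] (mu[0] is unused)."""
    mu = [1] * (M + 1)
    is_prime = [True] * (M + 1)
    for p in range(2, M + 1):
        if is_prime[p]:
            for j in range(p, M + 1, p):
                if j > p:
                    is_prime[j] = False
                mu[j] = -mu[j]
            for j in range(p * p, M + 1, p * p):
                mu[j] = 0
    return mu


def check_identity(N):
    """Check d(n)^2 == (d_4 * f)(n) for all 1 <= n <= N."""
    d = divisor_counts(N)
    d4 = dirichlet_convolve(d, d, N)  # d_4 = d * d = 1 * 1 * 1 * 1
    root = math.isqrt(N)
    mu = mobius_upto(root)
    f = [0] * (N + 1)
    for m in range(1, root + 1):
        f[m * m] = mu[m]  # coefficients of 1 / zeta(2s)
    rhs = dirichlet_convolve(d4, f, N)
    return all(d[n] ** 2 == rhs[n] for n in range(1, N + 1))


if __name__ == "__main__":
    print(check_identity(2000))
```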
Exercise 5.8 (Fun exercise!). Let a(n) be the number of abelian groups
of order n. First make a guess for
∑ a(n)
n≤ x
asymptotically. Then compute the asymptotics. Hint: Use the identity for the partition function
\[ \sum_{n=0}^{\infty} p(n) x^n = \prod_{n=1}^{\infty} (1 - x^n)^{-1} \]
to get the constant in the asymptotics, which ends up being something like $\zeta(2) \cdot \zeta(3) \cdots$.
Remark 5.9. One might also try computing
\[ \sum_{n \le x} d_\pi(n), \qquad \sum_{n \le x} d_i(n), \]
where $d_\pi(n)$ are the coefficients of the Dirichlet series of $\zeta(s)^\pi$ (and similarly for $d_i$). Relatedly, one might try to count
\[ \sum_{\substack{n \le x \\ n = a^2 + b^2}} 1, \]
and one can work out
\[ \sum_{n = a^2 + b^2} \frac{1}{n^s} = \zeta(s)^{1/2} L(s, \chi_{-4})^{1/2} F(s), \]
where F(s) is regular somewhat to the left of 1.
It's not completely obvious how these functions continue analytically. We could make sense of
\[ \zeta(s)^\pi = \exp\left(\pi \log \zeta(s)\right), \]
which makes sense to the right of 1. But, it also can be extended to
regions where there are no zeros or poles of the ζ function. If we
understand the zero-free region of the zeta function, then we can
make sense of this function in this zero-free region.
In the region
\[ \sigma > 1 - \frac{c}{\log T}, \]
$\zeta(s) \ne 0$, as shown in Davenport.
It turns out this function has a singularity at s = 1 which is not a pole (nor
essential nor removable), and the resulting main term turns out to be something like
\[ \frac{x (\log x)^{\pi - 1}}{\Gamma(\pi)} \]
for the sum of the function $d_\pi(n)$.
This idea is called the Selberg-Delange method (or in a paper today on arXiv, the LSD method).
Remark 5.10. We only wanted an upper bound for
\[ \sum_{n \le x} d(n)^2. \]
We didn't need an asymptotic. For upper bounds there is a simpler device, which in analytic number theory is
called Rankin's method. For α > 1 we can bound
\[ \sum_{n \le x} d(n)^2 \le x^\alpha \sum_{n \le x} \frac{d(n)^2}{n^\alpha} \le x^\alpha \sum_{n=1}^{\infty} \frac{d(n)^2}{n^\alpha} = x^\alpha \prod_p \left(1 + \frac{4}{p^\alpha} + \cdots\right). \]
Then, we want to optimize to choose the best α. Making α close to 1,
the product blows up and $x^\alpha$ gets small. From calculus, there will be
some choice of α which minimizes the right hand side.
For example, if you guess $\alpha = 1 + \frac{1}{\log x}$, you find $x^\alpha$ is about x and
\[ \prod_p \left(1 + \frac{4}{p^\alpha} + \cdots\right) \sim \zeta\!\left(1 + \frac{1}{\log x}\right)^4. \]
This yields $x (\log x)^4$ as a bound, and you are only off by one log.
Exercise 5.11. Verify this.
Exercise 5.12. Let p(n) denote the number of partitions of n. Prove
that
\[ p(n) \le \exp\left(\pi \sqrt{2n/3}\right). \]
Moreover, find the optimal constant α so that $p(n) \le e^{\alpha \sqrt{n}}$. (Sound thinks
the constant above is optimal.)
Hardy and Ramanujan found
\[ p(n) \sim \frac{\exp\left(\pi \sqrt{2n/3}\right)}{4 n \sqrt{3}}. \]
Hint: Show
\[ \sum_{n=0}^{\infty} p(n) x^n = \prod_{n=1}^{\infty} (1 - x^n)^{-1}. \]
Then, for 0 < x < 1,
\[ p(N) \le x^{-N} \prod_{n=1}^{\infty} (1 - x^n)^{-1}. \]
Exercise 5.13 (Fun mathoverflow problem). Let N be a parameter.
How many subsets
\[ S \subset [1, N] \]
are there so that
\[ \sum_{s \in S} \frac{1}{s} < 1? \]
Obviously the answer is $\le 2^N$, and the exercise is to find a better
bound. Hint: This is not an application of what we've discussed, but
it is an application of the ideas we've discussed.
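5.5. Returning to our type 2 sum. Recall we had A > U, B > V. We
can now bound the type 2 sum by
\[
\begin{aligned}
&\ll \Big(\sum_{m \sim A} d(m)^2\Big)^{1/2} \Big(\sum_{n \sim B} \Lambda(n)^2\Big)^{1/2} (\log q) \left(\frac{\sqrt{AB}}{\sqrt{q}} + \sqrt{A} + \sqrt{B} + \sqrt{q}\right) \\
&\ll \left(A (\log A)^3\right)^{1/2} (B \log B)^{1/2} (\log q) \left(\frac{\sqrt{AB}}{\sqrt{q}} + \sqrt{A} + \sqrt{B} + \sqrt{q}\right) \\
&\ll (\log N)^3 \left(\frac{AB}{\sqrt{q}} + A\sqrt{B} + B\sqrt{A} + \sqrt{qAB}\right) \\
&\ll (\log N)^3 \left(\frac{N}{\sqrt{q}} + \frac{N}{\sqrt{V}} + \frac{N}{\sqrt{U}} + \sqrt{qN}\right),
\end{aligned}
\]
using for the last step that AB ≤ N. We are doing well here because
both variables A and B vary only in long ranges (i.e., U and V are
reasonably large). We then have to add the error from the type 1
sum which was
\[ \ll (\log N)^2 \left(\frac{N}{q} + q + UV + \frac{N}{U}\right). \]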
Adding these together, we get
\[ S(\alpha) \ll (\log N)^3 \left(\frac{N}{\sqrt{q}} + \frac{N}{\sqrt{V}} + \frac{N}{\sqrt{U}} + \sqrt{qN} + UV\right). \]
By symmetry, we may as well choose U = V, and so we should
optimize by choosing $N/\sqrt{U} = U^2$. Hence, $U = N^{2/5}$. Then, one
obtains
\[ S(\alpha) \ll (\log N)^3 \left(\frac{N}{\sqrt{q}} + \sqrt{qN} + N^{4/5}\right). \]
There is one small caveat: we must still explain how to separate
the variables m and n subject to the condition mn ≤ N. We'll have to
finish this next time. Believing this for the moment, we've proven the bound above.
It is nontrivial if
\[ \frac{N}{(\log N)^{10}} > q > (\log N)^{10}. \]
So, as long as we can approximate α by some rational a/q with q in this range,
we get a good bound. These α will be called the "minor arcs."
Theorem 5.14. Let φ be the golden ratio. Then,
\[ \sum_{n \le N} \Lambda(n) e(n\varphi) \ll N^{4/5} (\log N)^3. \]
Proof. Using Fibonacci number approximations to φ, we can take q to be around $\sqrt{N}$
in the above formula, and then the bound
is $(\log N)^3 N^{4/5}$.
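Remark 5.15. The bound also works well for bounding things like
\[ \sum_{n \le N} \Lambda(n) e^{in} \]
using that
\[ \left| \frac{1}{\pi} - \frac{a}{q} \right| \ge \frac{C}{q^{20}}, \]
so one can always find a pretty good approximation with a denominator of suitable size. So, we get
a bound of about
\[ \sum_{n \le N} \Lambda(n) e^{in} \ll N^{.99}. \]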
6. 10/10/17
6.1. Exercises and questions. Last time, we let a/q be a rational number with
\[ \left| \alpha - \frac{a}{q} \right| < \frac{1}{q^2} \]
and showed
\[ \sum_{n \le N} \Lambda(n) e(n\alpha) \ll (\log N)^3 \left(\frac{N}{\sqrt{q}} + \sqrt{qN} + N^{4/5}\right). \]

Exercise 6.1. Let φ be the Euler totient function, and consider
\[ \sum_{n \le N} \left(\frac{\phi(n)}{n}\right)^k. \]
Find asymptotics for this. Why might these asymptotics be interesting?
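6.2. Recapping what we have seen in the Proof of Vinogradov's
theorem. Last time, we had some sum in terms of m ∼ A, n ∼ B
with a condition mn ≤ N. We want to separate m and n. We have
\[ \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \left(\frac{N}{mn}\right)^s \frac{ds}{s} = \begin{cases} 1 & \text{if } mn \le N \\ 0 & \text{if } mn > N \end{cases} \]
(replacing N by N + 1/2 if necessary, so that mn = N does not occur).
When we plug this into our bilinear form
\[ \sum_{m \sim A} \sum_{n \sim B} f_1(m) f_2(n) e(mn\alpha), \]
(for appropriate $f_1, f_2$) we get
\[ \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \sum_{m \sim A} \sum_{n \sim B} \frac{f_1(m)}{m^s} \frac{f_2(n)}{n^s} e(mn\alpha) \, N^s \, \frac{ds}{s}. \]
This separates the variables at the cost of log N when we integrate
$\frac{ds}{s}$.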

Question 6.2 (Possibly open question). We have
\[ \sum_{n \le N} \Lambda(n) e(n\varphi) \ll N^{4/5+\varepsilon}, \]
for φ the golden ratio. Can one say something better? Presumably
the right answer is $N^{1/2}$, though that may be hard. Maybe one could
show something like $N^{3/4}$. The key is that we have rational approximations at every scale.
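Recapping what we have done so far, we were trying to evaluate
\[ \int_0^1 S(\alpha)^3 e(-N\alpha) \, d\alpha. \]
We split this up into major and minor arcs. On the minor arcs, we
bounded this by
\[ \left| \int_{\mathfrak{m}} S(\alpha)^3 e(-N\alpha) \, d\alpha \right| \le \left(\max_{\mathfrak{m}} |S(\alpha)|\right) \int_0^1 |S(\alpha)|^2 \, d\alpha \ll \left(\max_{\mathfrak{m}} |S(\alpha)|\right) N \log N. \]
We expect a main term on the order of $N^2$. We have a good bound
on S(α) so long as
\[ \frac{N}{(\log N)^{10}} \ge q \ge (\log N)^{10}. \]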
The minor arcs will be all points which satisfy an approximation of
this type, and the major arcs will be all points which do not satisfy
an approximation of this type.
Let $Q = \frac{N}{(\log N)^{10}}$. By Dirichlet's theorem, for all α ∈ (0, 1), there
exist a and q with (a, q) = 1, q ≤ Q and
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{qQ} \le \frac{1}{q^2}. \]

Definition 6.3. We say $\alpha \in \mathfrak{m}$ (in a minor arc) if there exists such an
approximation with
\[ q \ge (\log N)^{10}. \]
Otherwise, we say $\alpha \in \mathfrak{M}$ (in a major arc).
That is,
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{qQ} \]
with $q < (\log N)^{10}$. The major arcs $\mathfrak{M}$ are disjoint. The total measure
of the major arcs is
\[ |\mathfrak{M}| = \sum_{q \le (\log N)^{10}} \phi(q) \cdot \frac{2}{qQ} \le \frac{C (\log N)^{10}}{Q}, \]
which is roughly $(\log N)^{20}/N$.


We now wish to understand S(α) for α on a major arc. Let $\alpha = \frac{a}{q} + \beta$ for q small, $|\beta| \le \frac{1}{qQ}$. The idea is to first understand $S(\frac{a}{q})$. Let's
try to understand
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right). \]

6.3. Riemann hypothesis and counting primes. To start, let us recall what the Riemann hypothesis says about counting the number
of primes up to x. Let $\Psi(x) = \sum_{n \le x} \Lambda(n)$, which counts the primes (and prime powers) up to x with weight log p. The Riemann hypothesis
implies
\[ \Psi(x) = x + O\left(x^{1/2+\varepsilon}\right). \]
If one further assumes the generalized Riemann hypothesis, one finds
\[ \Psi(x; q, a) = \frac{x}{\phi(q)} + O(x^{1/2+\varepsilon}). \]
Further, the constant in $O(x^{1/2+\varepsilon})$ is independent of q. Since φ(q) is roughly of size q, the main term dominates the error term, and
we have a nice asymptotic, for $q \le x^{1/2-\varepsilon}$.

Conjecture 6.4 (Montgomery). We have
\[ \Psi(x; q, a) = \frac{x}{\phi(q)} + O\left(\frac{x^{1/2+\varepsilon}}{\sqrt{q}}\right). \]

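Plugging in the generalized Riemann hypothesis, we get (up to the negligible contribution of n not coprime to q)
\[
\begin{aligned}
\sum_{n \le x} \Lambda(n) \exp\left(\frac{na}{q}\right) &= \sum^{*}_{k \bmod q} \exp\left(\frac{ak}{q}\right) \sum_{\substack{n \le x \\ n \equiv k \bmod q}} \Lambda(n) \\
&= \sum^{*}_{k \bmod q} \exp\left(\frac{ak}{q}\right) \left(\frac{x}{\phi(q)} + O(x^{1/2+\varepsilon})\right) \\
&= \frac{\mu(q)}{\phi(q)} x + O(q x^{1/2+\varepsilon})
\end{aligned}
\]
where the superscript ∗ again means k is coprime to q and we are
using
Exercise 6.5. Show $\sum^{*}_{k \bmod q} \exp\left(\frac{k}{q}\right) = \mu(q)$.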
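A quick numerical check of Exercise 6.5 for small q can be sketched as follows. This is only a check, not a proof. It assumes that exp(x) in these formulas is read as e(x) = e^{2πix}, as in the earlier e(nα) notation; the function names are illustrative.

```python
import cmath
from math import gcd


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def ramanujan_sum(q, n=1):
    """c_q(n) = sum of e(a*n/q) over 1 <= a <= q with gcd(a, q) = 1."""
    return sum(e(a * n / q) for a in range(1, q + 1) if gcd(a, q) == 1)


def mobius(q):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= q:
        if q % p == 0:
            q //= p
            if q % p == 0:
                return 0  # a square divides the original q
            result = -result
        p += 1
    if q > 1:
        result = -result
    return result


if __name__ == "__main__":
    for q in range(1, 13):
        print(q, mobius(q), round(ramanujan_sum(q).real, 6))
```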

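Now, suppose (n, q) = 1 and we want to express exp(n/q) in
terms of characters χ mod q.
Using orthogonality of characters on $(\mathbb{Z}/q\mathbb{Z})^\times$ to detect the condition k ≡ n mod q, we can write
\[
\exp\left(\frac{n}{q}\right) = \sum^{*}_{k \bmod q} \exp\left(\frac{k}{q}\right) \frac{1}{\phi(q)} \sum_{\chi \bmod q} \chi(k) \overline{\chi}(n)
= \frac{1}{\phi(q)} \sum_{\chi \bmod q} \overline{\chi}(n) \tau(\chi),
\]
where
\[ \tau(\chi) = \sum_{k \bmod q} \chi(k) \exp\left(\frac{k}{q}\right). \]
Then,
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \tau(\chi) \sum_{n \le x} \Lambda(n) \overline{\chi}(an). \]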

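Define
\[ \Psi(x, \chi) := \sum_{n \le x} \Lambda(n) \chi(n). \]
Then,
\[ \Psi(x; q, a) = \sum_{\substack{n \le x \\ n \equiv a \bmod q}} \Lambda(n) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \overline{\chi}(a) \Psi(x, \chi). \]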

The generalized Riemann hypothesis (GRH) is essentially the statement that for $\chi = \chi_0$, we have $\Psi(x, \chi) = \Psi(x)$ up to a small error
(which is just the usual Riemann hypothesis) and for $\chi \ne \chi_0$, we
have
\[ |\Psi(x, \chi)| = O(x^{1/2+\varepsilon}). \]
In the case $\chi = \chi_0$, we get the main term with
\[ \tau(\chi_0) = \sum^{*}_{k \bmod q} \exp\left(\frac{k}{q}\right) = \mu(q). \]
So, the main term is
\[ \frac{\mu(q)}{\phi(q)} \Psi(x). \]
Exercise 6.6 (A bit tricky, perhaps). Using orthogonality of characters show that if χ is primitive mod q (meaning not having a period
that is a proper divisor of q) then $|\tau(\chi)| = \sqrt{q}$. Hint: See Davenport's section
on Gauss sums.
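The statement of Exercise 6.6 can be checked numerically in the simplest case of a prime modulus p, where every nontrivial character is primitive. The sketch below builds characters from a primitive root. It assumes exp(x) means e(x) = e^{2πix}; the helper names and the indexing of characters by j are illustrative choices.

```python
import cmath
import math


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def primitive_root(p):
    """Brute-force search for a primitive root modulo an odd prime p."""
    for g in range(2, p):
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:
            return g
    raise ValueError("no primitive root found; p should be an odd prime")


def gauss_sum_abs(p, j):
    """|tau(chi_j)| for the character chi_j(g^k) = e(j*k/(p-1)) modulo p."""
    g = primitive_root(p)
    total = 0j
    x = 1  # x runs over g^k mod p
    for k in range(p - 1):
        total += e(j * k / (p - 1)) * e(x / p)
        x = x * g % p
    return abs(total)


if __name__ == "__main__":
    p = 13
    for j in range(p - 1):
        print(j, round(gauss_sum_abs(p, j), 6), round(math.sqrt(p), 6))
```

For j = 0 the character is principal and the sum equals μ(p) = −1, matching the computation of τ(χ0) above; for every other j the absolute value is √p.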
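Plugging this in the above formulas and the GRH bounds, we see
a refined GRH bound
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} \left(x + O\left(x^{1/2+\varepsilon}\right)\right) + O\left(\sqrt{q} \, x^{1/2+\varepsilon}\right). \]
So, compared to our previous error bound of $O(q x^{1/2+\varepsilon})$, the error is now
only $O(\sqrt{q} \, x^{1/2+\varepsilon})$.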
We are not assuming GRH, rather we want an unconditional proof,
so the above discussion assuming GRH can now be ignored.
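We have
\[
\begin{aligned}
\sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) &= \frac{1}{\phi(q)} \sum_{\chi \bmod q} \overline{\chi}(a) \tau(\chi) \Psi(x, \overline{\chi}) \\
&= \frac{\mu(q)}{\phi(q)} \Psi(x, \chi_0) + O\Big(\frac{\sqrt{q}}{\phi(q)} \sum_{\chi \ne \chi_0} |\Psi(x, \chi)|\Big).
\end{aligned}
\]
We have
\[ \Psi(x, \chi_0) = \Psi(x) + O\left((\log x)^2\right). \]
From the prime number theorem, we have
\[ \Psi(x) = x + O\left(x \exp\left(-c\sqrt{\log x}\right)\right) \]
for some c > 0. The key step in the proof of this is that the region
$\sigma > 1 - \frac{c}{\log(2 + |t|)}$ has no zeros of the zeta function ζ(s), where s =
σ + it.
We therefore get
\[ \Psi(x, \chi_0) = \Psi(x) + O\left((\log x)^2\right) \sim x - \sum_{|\rho| \le T} \frac{x^\rho}{\rho} + O\left(\frac{x}{T}\right). \]
In the best case, we might have
\[ \Psi(x; \chi) \ll x \exp\left(-c\sqrt{\log x}\right), \]
for $\chi \ne \chi_0$ mod q and $q \le \exp\left(\sqrt{\log x}\right)$.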
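The conclusion is that if
\[ q \le \exp\left(c\sqrt{\log x}\right) \]
for c small, then
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x \exp\left(-c\sqrt{\log x}\right)\right). \]
In a similar way, one would like to show
\[ \Psi(x; q, a) = \frac{x}{\phi(q)} + O\left(x \exp\left(-c\sqrt{\log x}\right)\right). \]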
The short version of the story is that we can basically do this, but
with one important caveat, which is called a Landau-Siegel zero.
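6.4. Siegel Zeros. We want to understand Ψ(x; χ). If $\chi \ne \chi_0$, we
have
\[ \Psi(x, \chi) = - \sum_{\rho_\chi, \, |\rho_\chi| \le T} \frac{x^{\rho_\chi}}{\rho_\chi} + O\left(\frac{x (\log x)^2}{T}\right) \]
where $\rho_\chi$ are the zeros of
\[ L(s, \chi) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \]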
One can find proofs of all of these things in Davenport. If χ is
primitive then L(s, χ) has a functional equation in terms of the completed function
\[ \left(\frac{q}{\pi}\right)^{s/2} \Gamma\left(\frac{s + \alpha}{2}\right) L(s, \chi) \]
where α is either 0 or 1 depending on whether χ(−1) = 1 (so α = 0)
or χ(−1) = −1 (so α = 1). The functional equation relates the value at s to the value at 1 − s. One can
count the number of zeros of ζ(s) or L(s, χ) up to height T, which is
on the order of
\[ T \log qT. \]
It is also useful to know the Hadamard factorization, which, once
you know this is an order 1 function, tells you this has a factorization
in terms of its zeros. That is,
\[ L(s, \chi) = e^{A + Bs} \prod_\rho \left(1 - \frac{s}{\rho}\right) e^{s/\rho}. \]
So the sum of the reciprocals of the squares of the zeros converges, but
possibly not the sum of the reciprocals of the zeros.
For the zeta function we have
\[ \xi(s) = s(s-1) \pi^{-s/2} \Gamma(s/2) \zeta(s), \]
where the factor s − 1 kills the pole at s = 1. This satisfies the functional equation
\[ \xi(s) = \xi(1 - s). \]
For the main difference between the L-functions and the ζ function,
we will need to know something about the zero free region. For
ζ(s), there is a zero free region of the form
\[ \sigma > 1 - \frac{c}{\log(2 + |t|)}, \]
with σ = Re s. We want to know that the region
\[ \sigma > 1 - \frac{c}{\log\left(q(2 + |t|)\right)} \]
is free of zeros of L(s, χ). This would imply
\[ \Psi(x, \chi) = O\left(x \exp\left(-c\sqrt{\log x}\right)\right) \]
for $q \le \exp\left(c\sqrt{\log x}\right)$. This holds if χ is a complex character mod q.
But for quadratic characters χ mod q, there is the unfortunate possibility that there could be one exceptional real simple zero β.
Theorem 6.7 (Siegel). Let β be the possible zero of L(s, χ) as above. Then,
\[ \beta < 1 - \frac{C(\varepsilon)}{q^{\varepsilon}} \]
for any ε > 0 for some constant C(ε) which cannot be computed (i.e., the
proof is ineffective).
Next time we’ll say a bit more about Siegel’s theorem. It might
be helpful to review things about the prime number theorem in pro-
gressions which we will go over as needed on Thursday.
Sound also says he is happy to look at or discuss solutions if you
do end up solving problems.

7. 10/12/17
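7.1. Review. Let χ be some character $(\mathbb{Z}/q\mathbb{Z})^\times \to \mathbb{C}^\times$. We have
\[ \Psi(x, \chi) = \sum_{n \le x} \Lambda(n) \chi(n). \]
If $\chi = \chi_0$, we have
\[ \Psi(x, \chi_0) = \Psi(x) + O\left((\log x)^2\right) = x + O\left(x \exp\left(-c\sqrt{\log x}\right)\right). \]
If $\chi \ne \chi_0$, GRH implies
\[ |\Psi(x, \chi)| \ll x^{1/2+\varepsilon}. \]
We would like an unconditional bound around
\[ (7.1) \qquad |\Psi(x, \chi)| \ll x \exp\left(-c\sqrt{\log x}\right) \]
and we would like to say this when $q \le \exp\left(\sqrt{\log x}\right)$. We have
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \tau(\chi) \overline{\chi}(a) \Psi(x, \overline{\chi}). \]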

The main term comes from $\chi = \chi_0$ where $\tau(\chi_0) = \mu(q)$, using an
exercise on computing Gauss sums from last time. The error term,
assuming GRH, is of the form $q^{1/2} x^{1/2+\varepsilon}$. If Equation 7.1 holds,
then we obtain
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x \exp\left(-c\sqrt{\log x}\right)\right) \]
in the range $q \le \exp\left(c\sqrt{\log x}\right)$.
We don't actually know Equation 7.1, but for our application to
sums of three primes, we thought of q as only going up to $(\log x)^{10}$
and not all the way to $\exp\left(c\sqrt{\log x}\right)$.
If χ is complex (i.e., not a real character) then
\[ |\Psi(x, \chi)| \ll x \exp\left(-c\sqrt{\log x}\right) \]
for $q \le \exp\left(c\sqrt{\log x}\right)$, because then there are no zeros of L(s, χ) for
\[ \sigma > 1 - \frac{c}{\log\left(q(2 + |t|)\right)}, \]
with σ = Re s. If instead χ is real (i.e., quadratic), then the zero free
region above holds except possibly for one real simple zero.
Theorem 7.1 (Siegel). The real zero β (if it exists) must satisfy
\[ \beta < 1 - \frac{C(\varepsilon)}{q^{\varepsilon}} \]
for any ε > 0 and some ineffective constant C(ε).
Remark 7.2. If the zero does not exist, then we can obtain Equation 7.1. If there does exist a Siegel zero for χ mod q, then
\[ \Psi(x, \chi) = -\frac{x^\beta}{\beta} + O\left(x \exp\left(-c\sqrt{\log x}\right)\right). \]
If $q \le (\log x)^A$, then choosing ε small enough we can ensure $\beta < 1 - \frac{C}{\sqrt{\log x}}$, and then we can absorb the main term $-\frac{x^\beta}{\beta}$ for Ψ(x, χ)
into the error term. In the presence of the Siegel zero, we can only
get the desired uniform result for $q \le (\log x)^A$, but not for $q \le \exp\left(c\sqrt{\log x}\right)$. This is also ineffective.
Therefore, we obtain
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x \exp\left(-c\sqrt{\log x}\right)\right). \]

Remark 7.3. Suppose β is very close to 1. Pretend β = 1. Then there
is one character χ mod q so that Ψ(x, χ) is approximately −x.
Think of
\[ \Psi(x; q, a) = \frac{1}{\phi(q)} \sum_{\chi} \overline{\chi}(a) \Psi(x, \chi) \approx \frac{x}{\phi(q)} - \frac{x^\beta}{\beta} \frac{\chi(a)}{\phi(q)}, \]
and here χ is real so $\overline{\chi}(a) = \chi(a)$. Then, half of the progressions get
most of the primes and the other half get almost none of them (this happens
depending on whether χ(a) = ±1).
7.2. Proving Vinogradov’s theorem using Siegel’s theorem. For the
moment, we’ll assume Siegel’s theorem and finish the proof of Vino-
gradov’s theorem. We’ll later come back to discuss Siegel’s theorem.
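We have seen that if
\[ q \le (\log x)^A \]
(A is around 10) then
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x \exp\left(-c\sqrt{\log x}\right)\right). \]
The major arcs are of the form
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{qQ}, \]
with
\[ Q = \frac{N}{(\log N)^{10}} \]
for $q \le (\log N)^{10}$. Recall we have already bounded the minor arcs,
and we are now trying to evaluate the major arcs. Set $\alpha = \frac{a}{q} + \beta$. We
would like to understand
\[ S(\alpha) := \sum_{n \le N} \Lambda(n) \exp\left(\frac{an}{q}\right) \exp(n\beta). \]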
 
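We can think of this as the product of $\Lambda(n) \exp\left(\frac{an}{q}\right)$, whose partial sums
we understand, and exp(nβ), which doesn't vary very much. So,
\[
\begin{aligned}
S(\alpha) &= \sum_{n \le N} \Lambda(n) \exp\left(\frac{an}{q}\right) \exp(n\beta) \\
&= \int_1^N \exp(x\beta) \, d\Big(\sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right)\Big) \\
&= \frac{\mu(q)}{\phi(q)} \int_1^N \exp(x\beta) \, dx + O\left(N \exp\left(-c\sqrt{\log N}\right)\right) + O\Big(\int_1^N |\beta| \, x \exp\left(-c\sqrt{\log x}\right) dx\Big) \\
&= \frac{\mu(q)}{\phi(q)} \int_1^N \exp(x\beta) \, dx + O\left((1 + N|\beta|) \, N \exp\left(-c\sqrt{\log N}\right)\right) \\
&= \frac{\mu(q)}{\phi(q)} \int_1^N \exp(x\beta) \, dx + O\left(N \exp\left(-c\sqrt{\log N}\right)\right),
\end{aligned}
\]
where we used integration by parts to get the above bounds on the
error terms, and then we used that $|\beta| \le \frac{1}{qQ}$, and we might have to
adjust the constant c to absorb some factors of log N.
Remark 7.4. The above bound makes sense: If α is very close to $\frac{a}{q}$ (that is, β is very small),
we pick up the same error term we had before. But if α is further away,
then the error term should grow approximately proportionally to
N|β|, which indeed it does.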
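We now want to evaluate the major arc contribution
\[ \int_{\mathfrak{M}} S(\alpha)^3 e(-N\alpha) \, d\alpha. \]
We are hoping this is of size $N^2 \cdot C$ with C some constant we can
evaluate.
Indeed,
\[ \int_{\mathfrak{M}} S(\alpha)^3 e(-N\alpha) \, d\alpha = \sum_{q \le (\log N)^{10}} \; \sum^{*}_{a \bmod q} \int_{-1/qQ}^{1/qQ} S\left(\frac{a}{q} + \beta\right)^3 \exp\left(-N\left(\frac{a}{q} + \beta\right)\right) d\beta. \]
We know
\[ S\left(\frac{a}{q} + \beta\right)^3 = \frac{\mu(q)}{\phi(q)^3} \left(\int_0^N \exp(x\beta) \, dx\right)^3 + O\left(N^3 \exp\left(-c\sqrt{\log N}\right)\right). \]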
The error term in the integral over the major arcs is then
\[ O\Big(\sum_{q \le (\log N)^{10}} N^3 \exp\left(-c\sqrt{\log N}\right) \frac{1}{Q}\Big) = O\left(N^2 \exp\left(-c\sqrt{\log N}\right)\right). \]
So the error terms are under control. We now want to understand the
main term. The main term is almost independent of a except for the
factor $\exp\left(-N\left(\frac{a}{q} + \beta\right)\right)$. We want to understand the main term of
\[ \int_{\mathfrak{M}} S(\alpha)^3 e(-N\alpha) \, d\alpha, \]
which is
\[ \sum_{q \le (\log N)^{10}} \; \sum^{*}_{a \bmod q} \int_{-1/qQ}^{1/qQ} \frac{\mu(q)}{\phi(q)^3} \left(\int_0^N \exp(x\beta) \, dx\right)^3 \exp\left(-N\left(\frac{a}{q} + \beta\right)\right) d\beta. \]

Recall the Ramanujan sum
\[ c_q(N) := \sum_{(a, q) = 1} \exp\left(\frac{aN}{q}\right). \]
The main term is then
\[ \sum_{q \le (\log N)^{10}} \frac{\mu(q)}{\phi(q)^3} c_q(N) \int_{-1/qQ}^{1/qQ} \left(\int_0^N \exp(x\beta) \, dx\right)^3 e(-N\beta) \, d\beta. \]

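We can now replace
\[ \int_0^N \exp(x\beta) \, dx = N \int_0^1 \exp(Nx\beta) \, dx. \]
This yields
\[
\begin{aligned}
\int_{-1/qQ}^{1/qQ} \left(\int_0^N \exp(x\beta) \, dx\right)^3 e(-N\beta) \, d\beta
&= N^3 \int_{-1/qQ}^{1/qQ} \left(\int_0^1 \exp(Nx\beta) \, dx\right)^3 e(-N\beta) \, d\beta \\
&= N^2 \int_{-N/qQ}^{N/qQ} \left(\int_0^1 \exp(x\beta) \, dx\right)^3 e(-\beta) \, d\beta \\
&= N^2 \left[\int_{-\infty}^{\infty} \left(\int_0^1 \exp(x\beta) \, dx\right)^3 e(-\beta) \, d\beta + O\left(\frac{q^2 Q^2}{N^2}\right)\right]
\end{aligned}
\]
where the second step substitutes Nβ for β, and the last step uses that the inner integral is $O(1/|\beta|)$, so the tail is
\[ O\left(\int_{|\beta| > N/qQ} \frac{d\beta}{|\beta|^3}\right) = O\left(\frac{q^2 Q^2}{N^2}\right). \]
The integral over all of $\mathbb{R}$ above is called the singular integral. Plugging in this
remainder term, we get that the error contribution is
\[ O\Big(Q^2 \sum_{q \le (\log N)^{10}} \frac{1}{\phi(q)^3} \phi(q) q^2\Big) = O\left(Q^2 (\log N)^{10}\right) = O\left(\frac{N^2}{(\log N)^{10}}\right). \]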
So, we can replace our integral from −N/qQ to N/qQ by an integral going off to infinity. This integral is essentially computing the
number of ways to write N as a sum of three numbers, which is essentially $N^2/2$. But, we can also compute it since this is essentially a
Fourier transform. That is,
\[ N^2 \int_{-\infty}^{\infty} \left(\int_0^1 \exp(x\beta) \, dx\right)^3 e(-\beta) \, d\beta \]
is $N^2$ times the value at 1 of the convolution $\chi_{[0,1]} * \chi_{[0,1]} * \chi_{[0,1]}$, whose Fourier transform is the cube
above, and for this convolution we get
\[ \int_{t_1, t_2, t_3 \in [0,1]} \delta(t_1 + t_2 + t_3 = 1) = \frac{1}{2}. \]
So one can use Parseval's identity (Fourier inversion) to compute the singular integral. Here, Parseval is counting the number of ways of writing 1 as a sum of three real numbers in [0, 1]. Before we were writing N as a
sum of three integers.
Now, let's finish our calculation. We were trying to compute
\[ \sum_{q \le (\log N)^{10}} \frac{\mu(q)}{\phi(q)^3} c_q(N) \int_{-1/qQ}^{1/qQ} \left(\int_0^N \exp(x\beta) \, dx\right)^3 e(-N\beta) \, d\beta, \]
which is approximated, using our above discussion, by
\[ \sum_{q \le (\log N)^{10}} \frac{\mu(q)}{\phi(q)^3} c_q(N) \frac{N^2}{2}. \]
This sum is called the singular sum. The tail of this sum is roughly
\[ \sum_{q > (\log N)^{10}} O\left(\frac{1}{\phi(q)^2}\right) \ll (\log N)^{-10}. \]
Therefore, the main term is of the form
\[ \frac{N^2}{2} \sum_{q=1}^{\infty} \frac{\mu(q)}{\phi(q)^3} c_q(N). \]
Let
\[ \mathfrak{S}(N) := \sum_{q=1}^{\infty} \frac{\mu(q)}{\phi(q)^3} c_q(N). \]

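The summand is multiplicative in q: using the Chinese remainder theorem, $c_{p_1 p_2}(N) = c_{p_1}(N) c_{p_2}(N)$. So we have
\[ \mathfrak{S}(N) = \prod_p \left(1 - \frac{c_p(N)}{(p-1)^3}\right). \]
Then,
\[ c_p(N) = \sum_{a=1}^{p-1} \exp\left(\frac{aN}{p}\right) = \begin{cases} -1 & \text{if } p \nmid N \\ p - 1 & \text{if } p \mid N. \end{cases} \]
Then,
\[ \mathfrak{S}(N) = \prod_p \left(1 - \frac{c_p(N)}{(p-1)^3}\right) = \prod_{p \nmid N} \left(1 + \frac{1}{(p-1)^3}\right) \cdot \prod_{p \mid N} \left(1 - \frac{1}{(p-1)^2}\right). \]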
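The two formulas above are easy to test numerically. The sketch below checks the evaluation of c_p(N) directly and computes a truncated version of the Euler product for the singular series. It assumes exp(x) means e(x) = e^{2πix}; the function names and the cutoff `prime_bound` are illustrative choices, and truncating the product is an approximation.

```python
import cmath


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def primes_upto(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [p for p in range(2, limit + 1) if sieve[p]]


def ramanujan_sum_prime(p, N):
    """c_p(N) = sum_{a=1}^{p-1} e(a*N/p), computed directly (it is real)."""
    return sum(e(a * N / p) for a in range(1, p)).real


def local_factor(p, N):
    """Euler factor of the singular series at the prime p."""
    if N % p == 0:
        return 1 - 1 / (p - 1) ** 2
    return 1 + 1 / (p - 1) ** 3


def singular_series(N, prime_bound=10_000):
    """Product of the local factors over primes up to prime_bound."""
    product = 1.0
    for p in primes_upto(prime_bound):
        product *= local_factor(p, N)
    return product


if __name__ == "__main__":
    for N in (99, 100, 101, 105):
        print(N, round(singular_series(N), 6))
```

For even N the factor at p = 2 is exactly 0, so the product vanishes, while for odd N the factor at p = 2 equals 2 and the product stays bounded away from 0.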

Remark 7.5. We see that $\mathfrak{S}(N)$ vanishes when N is even, because of the factor at p = 2. When N is
even, the major arc at 0 is cancelled by the major arc at 1/2. And in
general, the major arc at a/q is canceled by a similar one at a/2q.
Finally, we have
\[ \int_0^1 S(\alpha)^3 e(-N\alpha) \, d\alpha = \sum_{n_1 + n_2 + n_3 = N} \Lambda(n_1) \Lambda(n_2) \Lambda(n_3) = \frac{N^2}{2} \mathfrak{S}(N) + O\left(\frac{N^2}{\log N}\right). \]
Because the contribution of prime squares and cubes is negligible,
we get that every sufficiently large odd number is the sum of three
primes (and in fact it is the sum of three primes in many ways),
where here we are using that
\[ \mathfrak{S}(N) \ge c \]
for all odd N, where c is some universal constant bounded below by
\[ 2 \cdot \prod_{p \ge 3} \left(1 - \frac{1}{(p-1)^2}\right). \]
This finishes the proof.


Remark 7.6 (Philosophy). In suitable situations, we'd like to say
we can get an answer by counting contributions at each place and
then multiplying them together.
For example, say we'd like to count the number of ways to write
$2N = p_1 + p_2$. We can try to do the same computation mod p (i.e.,
counting the number of (a, b) so that 2N ≡ a + b mod p for a, b
relatively prime to p, and similarly over the infinite place). We would
then expect
\[ \sum_{n_1 + n_2 = 2N} \Lambda(n_1) \Lambda(n_2) \sim \mathfrak{S}(N) \, 2N, \]
and we can try to use the circle method to evaluate $\int_0^1 S(\alpha)^2 e(-2N\alpha) \, d\alpha$.
But we can no longer use Parseval's identity to bound the minor arcs
because
\[ \int_0^1 |S(\alpha)|^2 \, d\alpha = \sum_{n \le 2N} \Lambda(n)^2 \sim 2N \log N, \]
which is already larger than the expected main term.

Remark 7.7. At the beginning of this course, we mentioned we could
try to count the number of ways to write N as a sum
\[ N = x_1^k + \cdots + x_s^k. \]
Letting $P = N^{1/k}$, this count is given by the integral
\[ \int_0^1 \Big(\sum_{n \le P} \exp\left(n^k \alpha\right)\Big)^s e(-N\alpha) \, d\alpha. \]
Then, we might expect about
\[ P^s / N = P^{s-k} \]
representations.
There might then be local obstructions (e.g., squares are always 0, 1 or 4 mod
8). Then, if s is large enough in terms of k, one might try
to show this can be done. Instead of trying to understand exponential sums over primes, we would want to understand exponential
sums over powers. But, once s ≥ k + 1 and there are no congruence
obstructions, this sort of result should hold. For example, every
large number should be a sum of four squares. But for three squares,
there is a congruence obstruction: a number that is 7 mod 8 can never be written as a
sum of three squares.
It turns out you can write all large numbers as sums of 7 cubes. But for
fifth powers, the problem turns out to be much harder.
Next time we’ll talk about effectivity and Siegel’s theorem.

8. 10/17/17
8.1. Exercises to solidify the ideas thus far. Here are two exercises,
which are a bit longer and harder than usual.
Exercise 8.1 (Difficult exercise). Assume GRH. Give a bound for
\[ \sum_{n \le x} \Lambda(n) \exp(n\alpha) \]
for $\left|\alpha - \frac{a}{q}\right| \le \frac{1}{q^2}$ without using bilinear forms, but instead using GRH
and thinking about the prime number theorem in arithmetic progressions. Hint: We discussed how to write
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{na}{q}\right) \sim \frac{1}{\phi(q)} \sum_{\chi \bmod q} \overline{\chi}(a) \tau(\chi) \Psi(x, \overline{\chi}), \]
and one can input information about Ψ using GRH.
One would then try to pass from $\exp\left(\frac{na}{q}\right)$ to exp(nα) by inserting the factor exp(nβ) with
\[ |\beta| \le \frac{1}{qQ} \]
and then one should try to obtain good minor arc estimates for this
summation using a "quasi-Riemann hypothesis" (i.e., assuming there
are no zeros with σ ≥ 2/3). Unconditionally, we know information
about primes in progressions up to some modulus. Assuming GRH,
we know estimates for primes in progressions with modulus up to $\sqrt{x}$. We can then find approximations for α with denominator up to $\sqrt{x}$. We can then
use the prime number theorem in progressions for everything, and then we won't
even have to worry about major and minor arcs; we can treat
both cases at once using GRH.
If one is more careful (via a result due to Hardy and Littlewood) it
is enough to assume there are no zeros with σ ≥ 3/4.
Remark 8.2. On GRH, one should be able to prove
\[ \sum_{n \le x} \Lambda(n) \exp(n\varphi) \ll x^{3/4+\varepsilon}, \]
whereas Vinogradov's method only gave $x^{4/5}$. To get the 3/4
estimate, we would need Hardy and Littlewood's refinement. This
refinement is a refinement of the Gauss
sum idea. One might decompose $\exp\left(\frac{an}{q}\right)$ as a sum of multiplicative
characters. This incurs a loss of $\sqrt{q}$. When one writes it as
\[ \frac{1}{\phi(q)} \sum_{\chi \bmod q} \tau(\chi) \overline{\chi}(an), \]
one rewrites a number on the order of 1 as a number on the order of
$\sqrt{q}$. For n ≤ x, one can write exp(nβ) in terms of an integral of the form
\[ \int_{|y| \le x} f(y) \, n^{iy} \, dy. \]
We try to replace this additive character in n by multiplicative characters $n^{iy}$. A Gauss sum is of the form
\[ \tau(\chi) = \sum_{n} \exp\left(\frac{n}{q}\right) \chi(n). \]
There will then be an integral which is an analog of a Gauss sum.
One then saves a factor of $\sqrt{q}$, instead of just writing it naively by
breaking it up into progressions.
Exercise 8.3 (Difficult exercise). This exercise is to prove a theorem
of Davenport. Let µ(n) be the Möbius function. Show
\[ \sup_{\alpha \in \mathbb{R}} \Big| \sum_{n \le x} \mu(n) \exp(n\alpha) \Big| \ll_A \frac{x}{(\log x)^A} \]
for any A > 0. You will have to figure out what happens when α is
on a major or minor arc.
When α is on a minor arc, you will want to use Vinogradov's
method and use a bilinear estimate; you will have terms like $x/\sqrt{q} + \sqrt{q} \cdot \sqrt{x}$.
We used Vaughan's identity for obtaining bilinear forms for $-\frac{\zeta'}{\zeta}(s)$,
and we would need an analog for $\frac{1}{\zeta}(s)$. One would take
ζ · M for M a mollifier, and play around with powers of that.
The case when α is on a major arc has to do with
understanding
\[ \sum_{n \le x} \mu(n) \exp\left(\frac{an}{q}\right), \]
for $q \le (\log x)^A$. One might rewrite this in terms of χ mod q. The
goal would then be to understand
\[ \sum_{n \le x} \mu(n) \chi(n). \]
Many results holding for prime numbers also hold for the Möbius
function. For studying primes we look at something like
\[ \frac{1}{2\pi i} \int_{(c)} -\frac{\zeta'}{\zeta}(s) \, x^s \, \frac{ds}{s}, \]
and in this case we would be looking at
\[ \frac{1}{2\pi i} \int_{(c)} \frac{1}{L(s, \chi)} \, x^s \, \frac{ds}{s}. \]
For primes there is a pole at s = 1, but the pole of ζ at s = 1 becomes a zero
of 1/ζ. So there is some cancellation.
On the major arcs, there are savings with powers of log, and on
the minor arcs there is another method using bilinear forms which
gives savings of powers of log.
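Exercise 8.4. Assuming GRH, show
\[ \sup_{\alpha \in \mathbb{R}} \Big| \sum_{n \le x} \mu(n) \exp(n\alpha) \Big| \ll x^{3/4+\varepsilon}. \]
Maybe 5/6 instead of 3/4 would be easier to prove. Presumably the
correct answer is $x^{1/2+\varepsilon}$. The supremum does attain size $x^{1/2}$ because
Parseval tells us
\[ \int_0^1 \Big| \sum_{n \le x} \mu(n) \exp(n\alpha) \Big|^2 d\alpha = \sum_{n \le x} \mu(n)^2. \]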
Remark 8.5. The minor arc technology gave us an estimate of the
form
\[ \frac{N}{\sqrt{q}} + \sqrt{qN} + E \]
where E is some error term endemic to the method. The first two
terms are optimized when q is on the order of $\sqrt{N}$, in which case the
first two terms give $N^{3/4}$, so you cannot really do better than $N^{3/4}$
with this minor arcs method. We can write exp(nα) in terms of
\[ \sum_{\chi} \int_t \chi(n) \, n^{it} f, \]
for some function f. Here n ≤ N, α ∈ (0, 1). One then writes $\alpha = \frac{a}{q} + \beta$. One does not want to use too many q's and too many t's. Roughly
one uses q characters χ and integrates over t up to roughly 1 +
|β|N. One needs to balance what weight to put on the sum and what
weight to put on the integral. You can always choose q ≤ Q, $|\beta| \le \frac{1}{qQ}$. Then, $1 + |\beta| N \le 1 + \frac{N}{Q}$. One can choose $Q \sim \sqrt{N}$ so the integral
goes up to $\sqrt{N}$ and the sum adds over $\sqrt{N}$ terms. One loses $N^{1/2}$
complexity when doing the above procedure. So it is very hard to
beat $N^{3/4}$ in these major and minor arc estimates.
8.2. Zeros of ζ and L-functions. It's good to have some intuition for
where the zeros come from and how you might prove these functions
have a zero-free region. We'd like to prove
\[ \zeta(1 + it) \ne 0, \qquad L(1, \chi) \ne 0, \]
where
\[ L(1, \chi) = \prod_p \left(1 - \frac{\chi(p)}{p}\right)^{-1}. \]
We can consider the product
\[ \zeta(1 + it) = \prod_p \left(1 - \frac{1}{p^{1+it}}\right)^{-1}. \]
How will you find t with
\[ |\zeta(1 + it)| \]
being large or small? The small primes have a much bigger impact
on this product above than the large primes. The maximum impact
occurs when the factors at the small primes are as big or small as possible.
To make
\[ |\zeta(1 + it)| \]
large, we would like
\[ p^{it} \sim 1 \]
and to make it small we would like
\[ p^{it} \sim -1 \]
for many small primes p.
Exercise 8.6 (Simple exercise). Show that for any N, there are qua-
dratic characters χ with χ( p) = 1 for all p ≤ N (and similarly
χ( p) = −1 for all p ≤ N).
Remark 8.7. Then, χ will occur to some modulus q, and q might be
very large in terms of N. If one tries to compute one of these via
the Chinese remainder theorem, one might then have the modulus
exponentially large in N.
Conjecture 8.8 (Vinogradov). If χ(p) = 1 for all p ≤ N, then $q > N^A$
for arbitrarily large A. Similarly, if χ(p) = −1 for all p ≤ N then
$q > N^A$ for all large A.
Example 8.9. The least quadratic non-residue must be smaller than
$q^\varepsilon$, if Conjecture 8.8 were to hold true.
Remark 8.10. If we think of the values χ(p) as coin tosses, the chance that the first N primes all land heads
is around $1/2^N$. So, one would expect the
modulus q would have to be exponentially large.
Every once in a while, there can be a surprise. For example, if
D = −163, then
\[ \left(\frac{-163}{p}\right) = -1 \]
for p < 41. There are 12 such primes. If you think of things as coin
tosses, there would only be a $1/2^{12}$ chance of this (since there are 12 primes up to
and including 37) but 163 is substantially smaller than $2^{12} = 4096$.
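The claim about D = −163 is easy to confirm by machine. The following sketch evaluates the symbol at each prime below 41 using Euler's criterion for odd primes, and the usual rule for the Kronecker symbol at 2 (for odd D it is +1 when D ≡ ±1 mod 8 and −1 otherwise). The function names are illustrative.

```python
SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def kronecker_at_2(D):
    """Kronecker symbol (D/2) for odd D."""
    return 1 if D % 8 in (1, 7) else -1


def symbols(D):
    """Values of the quadratic character attached to D at the primes below 41."""
    return {
        p: (kronecker_at_2(D) if p == 2 else legendre(D, p))
        for p in SMALL_PRIMES
    }


if __name__ == "__main__":
    print(symbols(-163))
    print("value at 41:", legendre(-163, 41))
```

All twelve values come out as −1, and 41 is the first prime where the symbol equals +1.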
So $L(1, \chi_{-163})$ should be very small. Indeed, the class number formula gives
\[ L(1, \chi_{-163}) = \frac{\pi \, h\left(\mathbb{Q}(\sqrt{-163})\right)}{\sqrt{163}} \sim \frac{\pi}{13}, \]
and the class number is 1 here (h denotes the class number). Recall
that the class number formula implies that for $\mathbb{Q}(\sqrt{-D})$,
\[ L(1, \chi_{-D}) = \frac{\pi \, h\left(\mathbb{Q}(\sqrt{-D})\right)}{\sqrt{D}}. \]
Goldfeld and Gross-Zagier's result implies
\[ L(1, \chi_{-D}) \ge \frac{C \log D}{\sqrt{D}}. \]
Siegel's theorem implies that if χ is a quadratic character mod q, then
\[ L(1, \chi) \ge C(\varepsilon) \, q^{-\varepsilon} \]
for all ε > 0.
Remark 8.11. The zero free region is determined as follows. If L(β, χ) =
0, then by the mean value theorem
\[ L(1, \chi) = L(1, \chi) - L(\beta, \chi) = (1 - \beta) L'(\sigma, \chi) \]
for some β ≤ σ ≤ 1.
Exercise 8.12. Prove that if
\[ 1 - \frac{1}{\log q} \le \sigma \le 1, \]
then
\[ L'(\sigma, \chi) \le C (\log q)^2. \]
See Davenport. Essentially you can make sense of the Dirichlet series a little left of
the 1 line. Then you can differentiate it and deduce this. So, a lower bound for L(1, χ) gives
a bound on how close a zero can be to 1. So, if there's a bad
Siegel zero it has to be bounded away from 1.
8.3. The zero-free region of the Zeta function. We'd like to instead
look at the completed zeta function
\[ \xi(s) = s(s-1) \pi^{-s/2} \Gamma(s/2) \zeta(s). \]
The functional equation says
\[ \xi(s) = \xi(1 - s). \]
The Hadamard product formula gives
\[ \xi(s) = e^{A + Bs} \prod_\rho \left(1 - \frac{s}{\rho}\right) e^{s/\rho}. \]
The trivial zeros of ζ come because the Γ(s/2) function has poles at s =
0, −2, −4, · · · .
Exercise 8.13. The Riemann hypothesis is equivalent to the statement that
\[ |\xi(\sigma + it)| \]
is monotonically increasing in σ ≥ 1/2. Hint: Show that, zero by zero, each factor of the
Hadamard product above is increasing.
Furthermore,
\[ |\xi(\sigma + it)| \]
is monotone increasing on σ ≥ 1.
Exercise 8.14. Let χ be an even character, i.e., χ(−1) = 1. Define
ξ (s, χ) := π −s/2 Γ(s/2) L(s, χ).
Then prove
|ξ (s, χ)|
is monotone increasing in σ > 1.
If ζ(1 + it) = 0, then
\[ p^{it} \sim -1 \]
for many small primes p. This implies
\[ p^{2it} \sim 1 \]
for many small primes p. This implies ζ(1 + 2it) is very big. This
relates to the classical inequality
\[ \zeta(\sigma)^3 \, |\zeta(\sigma + it)|^4 \, |\zeta(\sigma + 2it)| \ge 1. \]
Similarly, for L-functions,
\[ \chi(p) p^{it} \sim -1 \]
implies
\[ \chi(p)^2 p^{2it} \sim 1, \]
which implies
\[ L\left(1 + 2it, \chi^2\right) \]
is big. This would yield a contradiction unless
\[ \chi^2 = \chi_0 \]
and t = 0. In this case, we are considering the ζ function at 1, which
is big because it has a pole. This is the Siegel zero situation where
we have a quadratic character and want a lower bound for L(1, χ).
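For reference, the classical inequality quoted above can be derived as follows. This is a sketch of the standard argument (as in Davenport), valid for σ > 1, and is included only to make the heuristic about $p^{it}$ and $p^{2it}$ precise; here θ is shorthand introduced for this sketch.

```latex
% Take logarithms of the Euler product, for sigma > 1:
\[
\log |\zeta(\sigma + it)|
  = \operatorname{Re} \sum_{p} \sum_{m \ge 1} \frac{p^{-imt}}{m \, p^{m\sigma}}
  = \sum_{p} \sum_{m \ge 1} \frac{\cos(m t \log p)}{m \, p^{m\sigma}} .
\]
% Combine the three terms with weights 3, 4, 1, writing theta = m t log p:
\[
3 \log \zeta(\sigma) + 4 \log |\zeta(\sigma + it)| + \log |\zeta(\sigma + 2it)|
  = \sum_{p} \sum_{m \ge 1} \frac{3 + 4\cos\theta + \cos 2\theta}{m \, p^{m\sigma}} \ge 0,
\]
% because of the trigonometric identity
\[
3 + 4\cos\theta + \cos 2\theta = 2 (1 + \cos\theta)^2 \ge 0 .
\]
% Exponentiating gives zeta(sigma)^3 |zeta(sigma+it)|^4 |zeta(sigma+2it)| >= 1.
```

The identity shows exactly the mechanism in the text: if cos θ is close to −1 for many small primes, then cos 2θ is close to +1 for those primes.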
8.4. Siegel zero situation. We'll now discuss a proof due to Goldfeld of Siegel's theorem. We want to show a lower bound for
L(1, χ) of the form
\[ L(1, \chi) \ge C(\varepsilon) \, q^{-\varepsilon} \]
for χ a quadratic character mod q. We look at the region
\[ \left[1 - \frac{\varepsilon}{10}, 1\right]. \]
Either
(1) All quadratic Dirichlet L-functions have no zero in this region.
In this case we take $\beta = 1 - \frac{\varepsilon}{10}$, and we define Ψ to be some character mod 3.
(2) There is some quadratic character Ψ mod r for some r with
L(β, Ψ) = 0 with $1 \ge \beta \ge 1 - \frac{\varepsilon}{10}$.
Consider
\[ \zeta(s)\,L(s,\chi)\,L(s,\Psi)\,L(s,\chi\Psi) = \sum_{n} \frac{a(n)}{n^{s}}. \]

Exercise 8.15. Check a(n) ≥ 0 for all n by just expanding the defini-
tions of Dirichlet characters for the various L functions.
This function above is the Dedekind ζ function for the biquadratic
extension defined by χ and Ψ. Its coefficients are non-negative
at primes because
\[ a(p) = (1+\chi(p))\,(1+\Psi(p)) \ge 0, \]
as both χ and Ψ take values in {±1, 0}.
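As a numerical sanity check of Exercise 8.15, here is a minimal sketch. The choice of characters is an assumption made for illustration: χ is taken to be the quadratic character mod 3 and Ψ the quadratic character mod 4, and the helper names are hypothetical. The coefficients a(n) are computed as the Dirichlet convolution of 1, χ, Ψ, and χΨ.

```python
def chi3(n):
    """Primitive quadratic character mod 3 (illustrative choice of chi)."""
    return (0, 1, -1)[n % 3]

def chi4(n):
    """Primitive quadratic character mod 4 (illustrative choice of Psi)."""
    return (0, 1, 0, -1)[n % 4]

def dirichlet_convolve(f, g, N):
    """Dirichlet convolution of two sequences indexed 1..N (index 0 unused)."""
    h = [0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] == 0:
            continue
        for m in range(1, N // d + 1):
            h[d * m] += f[d] * g[m]
    return h

def biquadratic_coefficients(N):
    """Coefficients a(1..N) of zeta(s) L(s,chi) L(s,Psi) L(s,chi Psi)."""
    one = [0] + [1] * N
    c3 = [0] + [chi3(n) for n in range(1, N + 1)]
    c4 = [0] + [chi4(n) for n in range(1, N + 1)]
    c12 = [0] + [chi3(n) * chi4(n) for n in range(1, N + 1)]
    a = dirichlet_convolve(one, c3, N)
    a = dirichlet_convolve(a, c4, N)
    return dirichlet_convolve(a, c12, N)

a = biquadratic_coefficients(500)
print("smallest coefficient:", min(a[1:]))
print("a(1), a(5), a(13):", a[1], a[5], a[13])
```

At a prime p the coefficient is (1 + χ(p))(1 + Ψ(p)), so it is 4 when p splits completely and 0 when either character is −1.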
Then, for c > 1, consider
\[ I := \frac{1}{2\pi i}\int_{(c)} \zeta(s+\beta)\,L(s+\beta,\chi)\,L(s+\beta,\Psi)\,L(s+\beta,\chi\Psi)\,\Gamma(s)\,X^{s}\,ds \]
where X is some large parameter that we haven’t yet defined, which
is roughly $(qr)^{10}$.
Exercise 8.16. Show that
\[ \frac{1}{2\pi i}\int_{(c)} X^{s}\,\Gamma(s)\,ds = e^{-1/X}. \]
Look at the poles of Γ, compute the residues, and you will see the Taylor
expansion for $e^{-1/X}$.
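The residue computation in Exercise 8.16 can be sketched numerically. Γ has a simple pole at s = −n with residue $(-1)^n/n!$, so the residue of $X^s\Gamma(s)$ there is $(-1/X)^n/n!$. The function name below is hypothetical, and the truncation at 60 terms is an arbitrary choice.

```python
import math

def residue_sum(X, terms=60):
    """Sum of the residues of X^s * Gamma(s) at s = 0, -1, -2, ...

    Gamma has a simple pole at s = -n with residue (-1)^n / n!,
    so each residue is (-1/X)^n / n!.
    """
    return sum((-1.0 / X) ** n / math.factorial(n) for n in range(terms))

for X in (1.0, 2.5, 100.0):
    print(X, residue_sum(X), math.exp(-1.0 / X))
```

The two printed columns should agree to floating-point accuracy, which is the Taylor expansion of $e^{-1/X}$.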

Here we have nice absolutely convergent integrals. But instead
of picking up the characteristic function, we pick up a “smoothed”
version of the characteristic function.
Plugging in the Dirichlet series
\[ \sum_{n} \frac{a(n)}{n^{s+\beta}} \]
and applying Exercise 8.16 to each term $(X/n)^s$, we have
\[ I = \sum_{n=1}^{\infty} \frac{a(n)}{n^{\beta}}\,e^{-n/X} \ge e^{-1/X} \ge 1/2, \]
where the first inequality keeps only the n = 1 term, using a(1) = 1 and a(n) ≥ 0.

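Dirichlet L-functions of non-principal characters are entire. The only L-function with a pole is
the Riemann zeta function. For any other character, the values χ(n)
cancel out every q steps, so the partial sums of χ are bounded. For example, using integration by
parts,
\[ L(s,\chi) = \int_{1^-}^{\infty} \frac{1}{y^{s}}\, d\Bigl(\sum_{n\le y}\chi(n)\Bigr)
   = s\int_{1^-}^{\infty} \frac{1}{y^{s+1}} \Bigl(\sum_{n\le y}\chi(n)\Bigr)\, dy. \]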

Moving the line of integration to the left, we encounter a pole at
1 − β from the ζ function, and there are poles from the Γ function. We
take Re s = −β + 1/2, so this is negative, but not as negative as −1.
We therefore encounter poles only at
\[ s = 1-\beta, \qquad s = 0, \]
coming from ζ(s + β) and Γ(s). The residue at s = 1 − β is
\[ L(1,\chi)\,L(1,\Psi)\,L(1,\chi\Psi)\,X^{1-\beta}\,\Gamma(1-\beta). \]
The residue at 0 is given as follows: near 0, we have Γ(s) ∼ 1/s, using
that s · Γ(s) = Γ(s + 1), that Γ(1) = 1, and that Γ is smooth near 1.
So the residue at 0 is
\[ \zeta(\beta)\,L(\beta,\chi)\,L(\beta,\Psi)\,L(\beta,\chi\Psi) \le 0. \]

Indeed, in the first case no quadratic Dirichlet L-function has a zero in [β, 1], so L(β, χ),
L(β, Ψ), and L(β, χΨ) are positive, while ζ(β) is negative. In the second case,
L(β, Ψ) = 0.
We then have a lower bound for the residue at 1 − β. This is what
we want, because we want a lower bound for L(1, χ). We would be
done if we had upper bounds for the other Dirichlet L-functions L(1, Ψ) and L(1, χΨ). We
can get these using integration by parts,
\[ L(s,\chi) = \int_{1^-}^{\infty} \frac{1}{y^{s}}\, d\Bigl(\sum_{n\le y}\chi(n)\Bigr)
   = s\int_{1^-}^{\infty} \frac{1}{y^{s+1}} \Bigl(\sum_{n\le y}\chi(n)\Bigr)\, dy, \]
as above.
Exercise 8.17. Show that for χ a non-principal character mod q,
\[ |L(1,\chi)| \ll \log q. \]
Also show that the integral on Re(s) = −β + 1/2 is negligible for X a large power of qr.
Therefore, we would conclude a bound of the form
\[ L(1,\chi) \gg (qr)^{-\varepsilon}. \]
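To get a feel for the bound |L(1, χ)| ≪ log q in Exercise 8.17, here is a small numerical sketch. It assumes χ is the Legendre symbol mod an odd prime q (an illustrative choice), and it approximates L(1, χ) by a partial sum over 200 full periods; the function names are hypothetical.

```python
import math

def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)  # r is 0, 1, or q - 1
    return -1 if r == q - 1 else r

def L1_partial(q, periods=200):
    """Partial sum of chi(n)/n over n <= periods * q, approximating L(1, chi)."""
    return sum(legendre(n, q) / n for n in range(1, periods * q + 1))

for q in (3, 7, 163, 1009):
    print(q, L1_partial(q), math.log(q))
```

Since the sum of χ over a full period vanishes, stopping at a multiple of q keeps the truncation error of size O(1/periods). For q = 163 the value should be close to π/√163, by the class number formula with h(−163) = 1.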
We could get an effective bound, but we don’t know what r is. In
case 1, r = 3, so things would be fine. But, if there is some violation
to the Riemann hypothesis, then r depends on what the violation to
the Riemann hypothesis is. So this r is the source of the ineffectivity
in Siegel’s theorem.
Next time, we’ll discuss effectivity of the 3-prime theorem. Then
we’ll move on to discussing a theorem of Maynard:
Theorem 8.18. There are infinitely many primes with no 7 in their decimal
expansion.

9. 10/24/17
9.1. Quick recap of the proof of Siegel’s theorem. Recall that last
time we proved Siegel’s theorem:
Theorem 9.1 (Siegel). We have
\[ L(1,\chi) > \frac{C(\varepsilon)}{q^{\varepsilon}} \]
(with C(ε) ineffective).

The idea of the proof was to construct an auxiliary character Ψ mod
r. There were two cases. In the first case, all quadratic characters have no zeros
in [1 − ε/10, 1] and we took Ψ mod r = 3. In the second case we
assume there exists some Ψ mod r with a zero $\beta \ge 1 - \varepsilon/10$. The idea was to
consider
\[ \frac{1}{2\pi i}\int_{(c)} \zeta(s+\beta)\,L(s+\beta,\chi)\,L(s+\beta,\Psi)\,L(s+\beta,\chi\Psi)\,X^{s}\,\Gamma(s)\,ds
   = \sum_{n} \frac{a(n)\,e^{-n/X}}{n^{\beta}} \ge e^{-1/X}. \]
We then move the line of integration to Re(s) = 1/2 − β. Along the way we pass a
pole at s = 1 − β, where we obtain the residue
\[ L(1,\chi)\,L(1,\Psi)\,L(1,\chi\Psi)\,X^{1-\beta}\,\Gamma(1-\beta). \]
Then, at s = 0, we have the residue
\[ \zeta(\beta)\,L(\beta,\chi)\,L(\beta,\Psi)\,L(\beta,\chi\Psi) \le 0. \]
At the end of last time, we claimed
Lemma 9.2. The integral on Re(s) = 1/2 − β is negligible. That is,
\[ \frac{1}{2\pi i}\int_{(\frac12-\beta)} \zeta(s+\beta)\,L(s+\beta,\chi)\,L(s+\beta,\Psi)\,L(s+\beta,\chi\Psi)\,X^{s}\,\Gamma(s)\,ds \ll 1 \]
for appropriate values of X (we will take $X = (qr)^{20}$).


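Proof. Indeed,
\[ \begin{aligned}
 &\frac{1}{2\pi i}\int_{(\frac12-\beta)} \zeta(s+\beta)\,L(s+\beta,\chi)\,L(s+\beta,\Psi)\,L(s+\beta,\chi\Psi)\,X^{s}\,\Gamma(s)\,ds \\
 &\qquad \ll X^{1/2-\beta}\int_{-\infty}^{\infty} e^{-|t|}\,\bigl|\zeta(\tfrac12+it)\,L(\tfrac12+it,\chi)\,L(\tfrac12+it,\Psi)\,L(\tfrac12+it,\chi\Psi)\bigr|\,dt.
\end{aligned} \]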
We want some kind of polynomial bound to show this integral is
negligible. We have that
\[ \xi(s) = s(s-1)\,\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) \]
is entire of order 1 (meaning, roughly, that it doesn’t grow more than exponentially).
We want to use the maximum modulus principle in a complex
strip with real part between −1 and 2. It’s easy to bound ξ on Re(s) = 2
because ζ is a bounded function there. Similarly, we can
understand asymptotics of the other terms. By the functional equation,
we then also understand the value at Re(s) = −1. So, by this variant
of the maximum modulus principle, we can bound |ξ(1/2 + it)| by,
essentially, |ξ(2 + it)|. This cannot literally be true because it would
imply the Riemann hypothesis, but if we restrict to a rectangular
region, bounding things from above and below, we will have good
enough bounds. In any case, after making this precise, we can
bound Γ by Stirling’s approximation, and then bound
\[ |\zeta(1/2+it)| \ll (1+|t|). \]

Remark 9.3. If we instead carry this out between −ε and 1 + ε, one
can obtain the convexity bound
\[ |\zeta(1/2+it)| \ll (1+|t|)^{1/4+\varepsilon}. \]
The Lindelöf hypothesis says we can replace 1/4 + ε by any positive
exponent.
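Altogether, we can bound
\[ \begin{aligned}
 &\frac{1}{2\pi i}\int_{(\frac12-\beta)} \zeta(s+\beta)\,L(s+\beta,\chi)\,L(s+\beta,\Psi)\,L(s+\beta,\chi\Psi)\,X^{s}\,\Gamma(s)\,ds \\
 &\qquad \ll X^{1/2-\beta}\int_{-\infty}^{\infty} e^{-|t|}\,\bigl|\zeta(\tfrac12+it)\,L(\tfrac12+it,\chi)\,L(\tfrac12+it,\Psi)\,L(\tfrac12+it,\chi\Psi)\bigr|\,dt \\
 &\qquad \ll X^{1/2-\beta}\int_{-\infty}^{\infty} e^{-|t|}\,\bigl((1+|t|)\,qr\bigr)^{4}\,dt \\
 &\qquad \ll (qr)^{4}\,X^{-.4}.
\end{aligned} \]
Now, choose $X = (qr)^{20}$ so that $(qr)^4 X^{-.4} \ll 1$.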



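Using the above bound from the lemma, together with
\[ \frac{1}{2\pi i}\int_{(c)} \zeta(s+\beta)\,L(s+\beta,\chi)\,L(s+\beta,\Psi)\,L(s+\beta,\chi\Psi)\,X^{s}\,\Gamma(s)\,ds
   = \sum_{n} \frac{a(n)\,e^{-n/X}}{n^{\beta}} \ge e^{-1/X} \]
(with the last term bounded below by .9) and the fact that the residue at 0 is ≤ 0, we get
\[ \begin{aligned}
 L(1,\chi)\,L(1,\Psi)\,L(1,\chi\Psi) &\ge \frac{1}{3}\,\frac{X^{\beta-1}}{\Gamma(1-\beta)} \\
 &\ge \frac{1}{5}\,(1-\beta)\,X^{-\varepsilon/10} \\
 &= \frac{(1-\beta)}{5}\,(qr)^{-2\varepsilon}.
\end{aligned} \]
Then, note
\[ L(1,\Psi) \le c\log r, \qquad L(1,\chi\Psi) \le c\log qr. \]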

So, one obtains
\[ L(1,\chi) \ge C\,(1-\beta)\,(qr)^{-3\varepsilon}. \]
The constant C is computable, but the reason for the ineffectivity is
that we do not know what r and β are.
Remark 9.4. One can effectively prove
\[ L(1,\chi) \ge \frac{C}{\sqrt{q}}, \]
so β must be at least $1/\sqrt{r}$ away from 1, or something like that. So really
the constant above only depends on r, since we can get a bound on
β from that.
9.2. Effectivity of ternary Goldbach. Returning to ternary Goldbach,
we considered
\[ \sum_{n\le N} \Lambda(n)\exp\left(\frac{an}{q}\right). \]
Using $q \le (\log N)^{10}$, we found that
\[ \sum_{n\le N} \Lambda(n)\chi(n) \]
is bounded by something like $-N^{\beta}/\beta$, where a possible exceptional zero satisfies $\beta \le 1 - c/\sqrt{q}$. Then,
\[ N^{\beta} \le N^{1-c/\sqrt{q}} \]
is small compared to N only when $q \le (\log N)^{1.99}$. But, we wanted
q to go up to $(\log N)^{10}$ rather than $(\log N)^{2}$. So, we will have to use
Siegel’s theorem in some range. Even though Siegel’s theorem is not
effective, we can use that Siegel zeros are rare to still get effectivity
of ternary Goldbach.
Lemma 9.5. There cannot exist two primitive quadratic characters
\[ \chi_1 \ (\mathrm{mod}\ q_1), \qquad \chi_2 \ (\mathrm{mod}\ q_2) \]
with $Q \le q_1, q_2 \le Q^{100}$ and both L-functions having a real zero that is at least $1 - \frac{10^{-10}}{\log Q}$.

Proof. Suppose we have two such characters χ1 , χ2 . We’ll now play


these two characters against each other. Consider
ζ (s) L(s, χ1 ) L(s, χ2 ) L(s, χ1 χ2 ).
This is the Dedekind zeta function of a biquadratic field, so its Dirichlet
coefficients are all non-negative.

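Instead, consider
\[ \xi(s)\,\xi(s,\chi_1)\,\xi(s,\chi_2)\,\xi(s,\chi_1\chi_2). \]
Consider its logarithmic derivative and evaluate at some real number
σ > 1. We have
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2). \]
Using the Hadamard product formula, if
\[ \xi(s) = e^{A+Bs}\prod_{\rho}\left(1-\frac{s}{\rho}\right)e^{s/\rho} \]
then
\[ \frac{\xi'}{\xi}(s) = \sum_{\rho}\frac{1}{s-\rho}. \]
Then,
\[ \mathrm{Re}\,\frac{1}{s-\rho} = \frac{\sigma-\beta}{|s-\rho|^{2}} \]
with s = σ + it, ρ = β + iγ. On the one hand, the expression
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2) \]
is always positive. On the other hand, if we have two real zeros $\beta_1, \beta_2$, we
would obtain
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2)
   \ge \frac{1}{\sigma-\beta_1} + \frac{1}{\sigma-\beta_2}. \]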

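We also know
\[ \xi(\sigma) = \sigma(\sigma-1)\,\pi^{-\sigma/2}\,\Gamma(\sigma/2)\,\zeta(\sigma), \]
and with α equal to either 0 or 1,
\[ \xi(\sigma,\chi_1) = \left(\frac{q_1}{\pi}\right)^{\sigma/2}\Gamma\!\left(\frac{\sigma+\alpha}{2}\right)L(\sigma,\chi_1). \]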

We get similar expressions for the other two ξ functions. Then, we
obtain
\[ \begin{aligned}
 &\frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2) \\
 &\qquad = \frac{1}{\sigma-1} + \frac12\log q_1 + \frac12\log q_2 + \frac12\log q_1q_2 + O(1) \\
 &\qquad\quad + \frac{\zeta'}{\zeta}(\sigma) + \frac{L'}{L}(\sigma,\chi_1) + \frac{L'}{L}(\sigma,\chi_2) + \frac{L'}{L}(\sigma,\chi_1\chi_2).
\end{aligned} \]
Here
\[ \frac{\zeta'}{\zeta}(\sigma) + \frac{L'}{L}(\sigma,\chi_1) + \frac{L'}{L}(\sigma,\chi_2) + \frac{L'}{L}(\sigma,\chi_1\chi_2) \]
is given by
\[ -\sum_{n}\frac{\Lambda(n)}{n^{\sigma}}\bigl(1+\chi_1(n)+\chi_2(n)+\chi_1\chi_2(n)\bigr), \]
which has all Dirichlet coefficients non-positive. Therefore, we have
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2)
   \le \frac{1}{\sigma-1} + \log q_1q_2 + O(1). \]
If $\beta_1, \beta_2$ were close to 1, we would have a lower bound by something close
to 2/(σ − 1).
Exercise 9.6. If $q_1, q_2$ are comparable to each other, say $q_1 \le q_2^{100}$, $q_2 < q_1^{100}$,
then we can’t have both a lower bound by $\frac{1}{\sigma-\beta_1} + \frac{1}{\sigma-\beta_2}$ and an
upper bound by
\[ \frac{1}{\sigma-1} + \log(q_1q_2) + O(1). \]
Here, we are choosing σ to be around $1 + \frac{10^{-6}}{\log q_1q_2}$.



9.3. Discriminants of number fields. Let K be a number field over


Q. Then, some prime must be ramified in K because the discriminant
is more than 1. In general, if K has degree n over Q, what can we say
about $d_K := \operatorname{disc} K$?
Question 9.7 (Open question). Take f ∈ Z[ x ] of degree n irreducible.
How does disc( f ) grow?
Theorem 9.8 (Minkowski, Stark-Odlyzko). The discriminant of a number
field K of degree n is bounded below by $c^n$ with c > 1.
Remark 9.9. Minkowski got this by thinking about lattices and using
the geometry of numbers.
One way to think of this is the following idea going back to Stark:
Let $r_1$ be the number of real embeddings, $r_2$ be the number of pairs of complex
embeddings, so that $r_1 + 2r_2 = n$. Consider the completed Dedekind zeta
function
\[ \begin{aligned}
 \xi_K(s) &= s(s-1)\,d_K^{s/2}\,\zeta_K(s)\,\bigl(\pi^{-s/2}\Gamma(s/2)\bigr)^{r_1}\bigl((2\pi)^{-s}\Gamma(s)\bigr)^{r_2} \\
 &= (\cdots)\prod_{\rho_K}\left(1-\frac{s}{\rho_K}\right)(\cdots)
\end{aligned} \]
where · · · indicate factors we must include to make the product
converge. This will satisfy a functional equation $\xi_K(1-s) = \xi_K(s)$, and
will have a Hadamard product, and so on. Then,
\[ \frac{\xi_K'}{\xi_K}(\sigma) = \sum_{\rho_K}\frac{1}{\sigma-\rho_K} \ge 0. \]
Then,
\[ \frac{\xi_K'}{\xi_K}(\sigma) = \frac{1}{\sigma} + \frac{1}{\sigma-1} + \frac12\log d_K
   + r_1\left(-\frac12\log\pi + \frac{\Gamma'}{2\Gamma}(\sigma/2)\right) + r_2(\cdots) + \frac{\zeta_K'}{\zeta_K}(\sigma). \]
Using that the last term $\frac{\zeta_K'}{\zeta_K}(\sigma)$ is negative, and the whole sum is positive,
choosing σ near 1 optimally and knowledge of $\frac{\Gamma'}{\Gamma}(1/2)$ and
$\frac{\Gamma'}{\Gamma}(1)$ gives a lower bound for the discriminant. One must choose an
appropriate value of σ; Sound suggests something like $\sigma = 1 + \frac{c}{n}$.
So, if you have a field with small discriminant, this also means
there are not many primes of small norm. The zeros of such an L-function
are then also nicely behaved.
Exercise 9.10. Work out the details in the above remark.

There is a nice survey by Odlyzko (if one searches “discriminants
Odlyzko”) and also an article by Serre, “Minorations de discriminants”
(in French).
Remark 9.11. The ring of integers in a number field may not be
monogenic, so the discriminant of a polynomial may be much larger
than the discriminant of a number field. We don’t have a good lower
bound on the discriminant of a polynomial.
Suppose
\[ (\log N)^{1.9} \le q \le (\log N)^{10}, \]
and suppose there is some χ mod $q_0$ with a Siegel zero at $\beta_0$. All we have
to worry about are $\alpha = \frac{a}{q} + \beta$ with $q_0 \mid q$. We have an expression of
the form
\[ \sum_{n\le N}\Lambda(n)\exp\left(\frac{an}{q}\right) = \frac{N\mu(q)}{\varphi(q)} + (\cdots) + \frac{\tau(\chi)}{\varphi(q)}\,\chi(a)\,\Psi(N,\chi). \]
The last Ψ(N, χ) is bounded by something like $N^{\beta_0}/\beta_0$. So, the
above is approximated by
\[ \frac{N\mu(q)}{\varphi(q)} + \frac{\tau(\chi)}{\varphi(q)}\,\chi(a)\,\frac{N^{\beta_0}}{\beta_0} \]
and we can then approximate these things on the major and minor
arcs.
We then have to change whatever main term we had before with
this new main term coming from
\[ \frac{\tau(\chi)}{\varphi(q)}\,\chi(a)\,\frac{N^{\beta_0}}{\beta_0}. \]
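That is, we have
\[ \int_{\mathfrak{M}} S(\alpha)^3\exp(-N\alpha)\,d\alpha
   = \sum_{\substack{q\le(\log N)^{10}\\ q_0\mid q}}\ \sum_{a \bmod q}\ \int_{|\beta|\le\frac{1}{qQ}} S\!\left(\frac{a}{q}+\beta\right)^{3}\exp\!\left(-N\left(\frac{a}{q}+\beta\right)\right)d\beta. \]
Then, to find the contribution of the cube of this main term, we find
\[ \sum_{a \bmod q}\frac{\tau(\chi)^3}{\varphi(q)^3}\,\chi(a)\,\frac{N^{3\beta_0}}{\beta_0^{3}}\exp\!\left(\frac{-aN}{q}\right). \]
From $\chi(a)\exp\left(\frac{-aN}{q}\right)$ we get another Gauss sum, so
\[ \sum_{a \bmod q}\frac{\tau(\chi)^3}{\varphi(q)^3}\,\chi(a)\,\frac{N^{3\beta_0}}{\beta_0^{3}}\exp\!\left(\frac{-aN}{q}\right)
   \sim \frac{\tau(\chi)^4}{\varphi(q)^3}\,\frac{N^{3\beta_0-1}}{\beta_0^{3}}, \]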

where the −1 in $3\beta_0 - 1$ is coming from the integral. Then, τ(χ) is
bounded by $\sqrt{q}$. So, the above is bounded by
\[ \frac{q^{2}\,N^{3\beta_0-1}}{q^{3}\,\beta_0^{3}}. \]
So, in conclusion, we get a bound like
\[ \sum_{\substack{q\le(\log N)^{10}\\ q_0\mid q}}\frac{N^{2}}{q} \ll \frac{N^{2}}{q_0}\,\log\log N \]
for $q_0 > (\log N)^{1.9}$. Therefore, the proof is effective.

10. 10/26/17
Remark 10.1. Goldfeld-Gross-Zagier says
\[ L(1,\chi) \ge \frac{c\log|D|}{\sqrt{|D|}} \]
for imaginary quadratic fields. Then, effectively, h(−D) > c log |D|.

Remark 10.2 (Euler’s idoneal numbers). Consider $\mathbb{Q}(\sqrt{-D})$. Can it
be that all p ≤ |D| are either ramified or inert?
Gauss’ genus theorem tells us
\[ h(-D) \ge \#\,(\text{2-part of the class group}) = 2^{\#\{\text{primes } p \,\mid\, D\}} = d(|D|). \]
The divisor function grows as $O(|D|^{\varepsilon})$. The problem is to find all
discriminants −D < 0 where
\[ \mathrm{cl}\bigl(\mathbb{Q}(\sqrt{-D})\bigr) = (\mathbb{Z}/2\mathbb{Z})^{r}. \]
Remark 10.3. There is work of Biro on class numbers of fields of the
form $\mathbb{Q}\bigl(\sqrt{n^2+4}\bigr)$.

10.1. Primes with missing digits. In general it is quite hard to an-


swer questions of the form:
(1) If p is a prime, is p + 2 a prime?
(2) If n is even, when is $n^2 + 1$ prime?
Here are some theorems coming out of sieve methods.
Theorem 10.4 (Piatetski-Shapiro, 1950s). If 1 < α < 1.1, then there are
infinitely many primes of the form
\[ \lfloor n^{\alpha} \rfloor. \]

Remark 10.5. This is quite a sparse set of numbers: up to x, there are
only $x^{1/\alpha}$ such numbers.
Recall from Fermat that every p ≡ 1 mod 4 can be written as a
sum of two squares.
Theorem 10.6 (Fouvry and Iwaniec). Infinitely often, one can write p ≡
1 mod 4 as $p = n^2 + m^2$ with n a prime.
Theorem 10.7 (Friedlander and Iwaniec). There are infinitely many primes
p ≡ 1 mod 4 that can be written as $p = m^2 + n^4$.
Theorem 10.8 (Heath-Brown and Li). There are infinitely many primes
p ≡ 1 mod 4 with $p = m^2 + q^4$ for q prime.
Theorem 10.9 (Heath-Brown). There are infinitely many primes of the
form $a^3 + 2b^3$.
Remark 10.10. This answers an old question of Hardy and Little-
wood asking if there are infinitely many primes which are sums of
three cubes.
Friedlander and Iwaniec’s result only involves pairs m, n over sets of size
$x^{3/4} = x^{1/2}\cdot x^{1/4}$, and Heath-Brown’s result only involves a
set of size $x^{2/3}$ up to some x.
Question 10.11. Are there infinitely many primes $p = a^2 + b^3$?
Remark 10.12. This is analogous to the question of whether there are
infinitely many elliptic curves with prime discriminant or conductor
(since the discriminant of an elliptic curve in short Weierstrass form
is something like $4a^3 + 27b^2$).
The main result we’ll spend the next few lectures proving is of a
similar flavor.
Theorem 10.13 (Maynard). If q is a sufficiently large base (e.g. $q = 10^7$),
write $n = \sum_{j=0}^{k} n_j q^j$ with $0 \le n_j \le q-1$ as a base q expansion. Select a
forbidden digit $0 \le a_0 \le q-1$. Let
\[ \mathcal{A} := \{\, n \in \mathbb{N} : n \text{ does not have the digit } a_0 \text{ in base } q \,\}. \]
Then,
\[ \#\{\, n < q^k : n \in \mathcal{A} \,\} = (q-1)^k = \bigl(q^k\bigr)^{\log(q-1)/\log q}. \]
Then,
\[ \sum_{\substack{n<q^k\\ n\in\mathcal{A}}}\Lambda(n) \sim \kappa_{a_0}(q)\,(q-1)^k \]
for $\kappa_{a_0}(q)$ an explicit positive constant.
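The count $(q-1)^k$ in the theorem can be checked by brute force for small parameters. This is only a sketch: it uses the convention that n is padded with leading zeros to k digits (so the forbidden digit is taken to be nonzero here), and the function names are hypothetical.

```python
def digits(n, q, k):
    """The k base-q digits of n, least significant first, padded with zeros."""
    out = []
    for _ in range(k):
        n, r = divmod(n, q)
        out.append(r)
    return out

def count_missing_digit(q, k, a0):
    """Number of 0 <= n < q^k whose k padded base-q digits all avoid a0."""
    return sum(1 for n in range(q ** k) if a0 not in digits(n, q, k))

q, k, a0 = 10, 4, 7
print(count_missing_digit(q, k, a0), (q - 1) ** k)
```

Each of the k digit positions has q − 1 allowed values, which is where the count $(q-1)^k$ comes from.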

Remark 10.14. There is also a more involved version where one obtains
a lower bound of the form $\gg (q-1)^k$ for q = 10.

One can also find elements of A that are, say, squares.

Theorem 10.15 (Mauduit and Rivat). Write primes in binary $p = \sum_j a_j 2^j$.
Count $s(p) := \sum_j a_j$. Then, s(p) is equally likely to be 0 or 1 mod 2.

More generally, this can be done with any base replacing 2, with
obvious exceptions.
There is also the following cute result: One might ask if one can
find Fermat primes, with two 1’s in the binary expansion and all
other digits 0. This might be a hard problem because the set is quite
sparse, but, one can try to further ask if there are infinitely many
primes with k 1’s. One might try an easier problem asking if there
simply exist primes with exactly k 1’s in their binary expansion. The
following theorem shows the answer is yes.

Theorem 10.16 (Drmota, Mauduit, and Rivat). Let K be an integer and
let k be on the scale of K/2 (say $k - K/2 = O(\sqrt{k})$). Then,
\[ \#\{\, p < 2^K : p \text{ prime, there are } k \text{ digits equal to } 1 \,\} \]
has an asymptotic formula, with about 1/k of the numbers in this set prime.
Remark 10.17. One would expect about $\binom{K}{K/2} \sim \frac{2^K}{\sqrt{K}}$.

Theorem 10.18 (Bourgain). There are primes $p \le 2^K$ for which you can
specify any αK of the binary digits, for some fixed α > 0, where the last
digit must be 1 (so that the number is not even).

10.2. Beginning the proof of Maynard’s theorem. We now turn to
proving Maynard’s theorem on primes without a specified digit. Recall
we have fixed a base q, an integer k, and defined $\mathcal{A}$ as the set of
integers without the digit $a_0$ in base q.

We are trying to count
\[ \sum_{\substack{n<q^k\\ n\in\mathcal{A}}}\Lambda(n) = \sum_{n<q^k}\Lambda(n)\,1_{\mathcal{A}}(n) = \int_0^1 S(\alpha)\,A(-\alpha)\,d\alpha, \]
where
\[ S(\alpha) = \sum_{n<q^k}\Lambda(n)\exp(n\alpha) \]
and
\[ A(-\alpha) = \sum_{\substack{n<q^k\\ n\in\mathcal{A}}}\exp(-n\alpha). \]
In our situation, we can understand the Fourier transform of $\mathcal{A}$
so well that we can actually understand its $L^1$ norm. This is a binary problem
because we are intersecting two sets here: primes and integers without
a specified digit.
We could still use the circle method here, but it is a little easier to
apply the circle method in a discrete setting.
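10.3. The circle method in a discrete setting. Consider
\[ \frac{1}{q^k}\sum_{a=0}^{q^k-1} S\!\left(\frac{a}{q^k}\right)A\!\left(\frac{-a}{q^k}\right)
   = \sum_{\substack{m,n<q^k\\ m\equiv n \bmod q^k}}\Lambda(m)\,1_{\mathcal{A}}(n)
   = \sum_{n<q^k}\Lambda(n)\,1_{\mathcal{A}}(n). \]
The last equality holds because m, n < $q^k$. For the penultimate one,
write
\[ S\!\left(\frac{a}{q^k}\right) = \sum_{m}\Lambda(m)\exp\!\left(\frac{am}{q^k}\right) \]
and
\[ A\!\left(\frac{-a}{q^k}\right) = \sum_{n}1_{\mathcal{A}}(n)\exp\!\left(\frac{-na}{q^k}\right), \]
and note that after summing over a the only terms that survive are those with m ≡ n mod $q^k$. This is an excellent
approximation to the integral from the circle method.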
Exercise 10.19. Verify the above equalities.
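For Exercise 10.19, the discrete orthogonality identity can be checked numerically for small parameters. The sketch below assumes that exp(x) in these notes stands for $e(x) = e^{2\pi i x}$; the values q = 5, k = 2, $a_0$ = 3 and all function names are illustrative choices.

```python
import cmath
import math

def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def in_A(n, q, k, a0):
    """True if none of the k (zero-padded) base-q digits of n equals a0."""
    for _ in range(k):
        n, r = divmod(n, q)
        if r == a0:
            return False
    return True

def both_sides(q, k, a0):
    """Return (direct sum, discrete Fourier expression) for the identity."""
    N = q ** k
    lam = [von_mangoldt(n) for n in range(N)]
    ind = [1.0 if in_A(n, q, k, a0) else 0.0 for n in range(N)]
    direct = sum(l * i for l, i in zip(lam, ind))
    total = 0
    for a in range(N):
        S = sum(lam[m] * e(a * m / N) for m in range(N))
        A = sum(ind[n] * e(-a * n / N) for n in range(N))
        total += S * A
    return direct, total / N

direct, fourier = both_sides(5, 2, 3)
print(direct, fourier)
```

The two values should agree up to rounding, since summing e(a(m − n)/N) over all a mod N kills every pair with m ≢ n mod N.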

We now separate terms into major and minor arcs, as in the circle
method.
We now write
\[ \frac{a}{q^k} = \frac{\ell}{d} + \beta \]
with $d \le q^{k/2}$ and $|\beta| \le \frac{1}{d\,q^{k/2}}$.
We try to approximate $a/q^k$ using rational numbers with denominator
at most $q^{k/2}$. By Dirichlet’s theorem, we can always write numbers
in this form.
We write A for some large positive number. The major arcs are
those values of a with
\[ d \le \bigl(\log q^k\bigr)^{A}, \qquad |\beta| \le \frac{(\log q^k)^{A}}{q^k}. \]
The minor arcs are the remaining values of a. The major arcs are
disjoint because the denominators are small and we are taking small
intervals around each rational number.
We’ll first deal with the major arcs. The harder part will come later
when we deal with the minor arcs.

10.4. The major arc contribution. There are two cases:


(1) The denominator d is a small power of q
(2) The denominator d is not a small power of q.
The main terms will come from the first case. Consider the sum
S(α) on the major arcs of the first case, so
\[ \alpha = \frac{\ell}{d} + \frac{b}{q^k} \]
with d a power of q. Here b is small (at most d) because β is bounded.
Consider
\[ S(\ell/d) = \sum_{n<q^k}\Lambda(n)\exp\!\left(\frac{\ell n}{d}\right)
   = \frac{\mu(d)}{\varphi(d)}\,q^k + O\!\left(\frac{q^k}{(\log q^k)^{4A}}\right), \]
where we have proved asymptotic formulas for this in certain ranges
using Siegel’s theorem.
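Then,
\[ S\!\left(\frac{\ell}{d}+\frac{b}{q^k}\right) = \frac{\mu(d)}{\varphi(d)}\sum_{n<q^k}\exp\!\left(\frac{nb}{q^k}\right) + O\!\left(\frac{q^k}{(\log q^k)^{3A}}\right). \]
If b ≠ 0, the main term vanishes and $S\!\left(\frac{a}{q^k}\right) = O\!\left(\frac{q^k}{(\log q^k)^{3A}}\right)$. Also, μ(d) = 0 when $d = q^j$ with j ≥ 2. So,
we will only have to worry about the terms d = 1 or d = q.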
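Let’s now sum all these major arcs. The contribution from the error
terms of case 1 is
\[ \frac{1}{q^k}\,O\!\left(\frac{q^k}{(\log q^k)^{A}}\right)(q-1)^k \]
using that $|A(\alpha)| \le (q-1)^k$. We have main terms
\[ \frac{1}{q^k}\,q^k\,(q-1)^k \]
if d = 1 and b = 0, and
\[ \frac{1}{q^k}\sum_{\substack{1\le\ell\le q-1\\ d=q,\ b=0}} A\!\left(\frac{-\ell}{q}\right)\frac{-1}{q-1}\,q^k \]
if d = q and b = 0.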
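Now we have to think a bit about the Fourier transform. We have
\[ A(\alpha) = \sum_{\substack{0\le n_0,n_1,\ldots,n_{k-1}\le q-1\\ n_j\ne a_0}}\exp\Bigl(\sum_{j} n_j q^j\alpha\Bigr)
   = \prod_{j=0}^{k-1}\Bigl(\sum_{n_j\ne a_0}\exp\bigl(n_j q^j\alpha\bigr)\Bigr). \]
Then, splitting contributions from j = 1, . . . , k − 1 and j = 0, we get
\[ A\!\left(\frac{-\ell}{q}\right) = (q-1)^{k-1}\Bigl(\sum_{n_0=0}^{q-1} e\!\left(\frac{-n_0\ell}{q}\right) - \exp\!\left(\frac{-a_0\ell}{q}\right)\Bigr)
   = -\exp\!\left(\frac{-a_0\ell}{q}\right)(q-1)^{k-1}. \]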
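The product formula for the Fourier transform, and its value at α = −ℓ/q, can be checked numerically. This is a sketch with small illustrative parameters (q = 5, k = 3, $a_0$ = 2), assuming exp(x) denotes $e^{2\pi i x}$; the function names are hypothetical.

```python
import cmath
import math
from itertools import product

def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def A_direct(alpha, q, k, a0):
    """Sum of e(n*alpha) over all n < q^k whose base-q digits avoid a0."""
    allowed = [d for d in range(q) if d != a0]
    total = 0
    for ds in product(allowed, repeat=k):
        n = sum(d * q ** j for j, d in enumerate(ds))
        total += e(n * alpha)
    return total

def A_product(alpha, q, k, a0):
    """The same quantity, computed as a product over digit positions."""
    result = 1
    for j in range(k):
        result *= sum(e(d * q ** j * alpha) for d in range(q) if d != a0)
    return result

q, k, a0 = 5, 3, 2
print(A_direct(0.1234, q, k, a0), A_product(0.1234, q, k, a0))
for ell in range(1, q):
    print(ell, A_product(-ell / q, q, k, a0), -e(-a0 * ell / q) * (q - 1) ** (k - 1))
```

At α = −ℓ/q every factor with j ≥ 1 equals q − 1, and the j = 0 factor is a full sum of q-th roots of unity (which vanishes) minus the missing term.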
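Adding all these main terms together we get
\[ \begin{aligned}
 &\frac{1}{q^k}\,q^k\,(q-1)^k + \frac{1}{q^k}\sum_{1\le\ell\le q-1} A\!\left(\frac{-\ell}{q}\right)\frac{-1}{q-1}\,q^k \\
 &\qquad = (q-1)^k + \frac{(q-1)^{k-1}}{(q-1)}\sum_{\ell=1}^{q-1}\exp\!\left(\frac{-a_0\ell}{q}\right) \\
 &\qquad = \begin{cases}
   (q-1)^k\,\dfrac{q}{q-1} & \text{if } a_0 = 0, \\[2ex]
   (q-1)^k\left(1-\dfrac{1}{(q-1)^2}\right) & \text{if } a_0 \ne 0.
 \end{cases}
\end{aligned} \]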
11. 10/31/17
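11.1. Review. Let q be a large but fixed base and let $\mathcal{A}$ be the set of
numbers missing the digit $a_0 \in [0, q-1]$.
Goal 11.1. Count
\[ \sum_{\substack{n<q^k\\ n\in\mathcal{A}}}\Lambda(n) \]
and show it is asymptotically $c_{a_0}(q)\,(q-1)^k$ for some constant $c_{a_0}(q) > 0$.
Let
\[ A(\alpha) = A_k(\alpha) := \sum_{n<q^k} 1_{\mathcal{A}}(n)\,e(n\alpha), \]
with
\[ S(\alpha) = \sum_{n\le q^k}\Lambda(n)\,e(n\alpha). \]
Our goal was to compute the discrete Fourier transform
\[ \frac{1}{q^k}\sum_{a \bmod q^k} S\!\left(\frac{a}{q^k}\right)A\!\left(\frac{-a}{q^k}\right) = \int_0^1 S(\alpha)\,A(-\alpha)\,d\alpha. \]
We can approximate
\[ \left|\frac{a}{q^k}-\frac{\ell}{d}\right| < \frac{1}{d\,q^{k/2}} \]
with $d \le q^{k/2}$. Last time, we were trying to work out the major arc
contributions over intervals with $d \le (\log q^k)^{A}$ and
\[ \left|\frac{a}{q^k}-\frac{\ell}{d}\right| \le \frac{(\log q^k)^{A}}{q^k}. \]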
Last time we computed the major arc contribution in the case d was
a power of q less than $(\log q^k)^{A}$, where we got
\[ c_{a_0}(q)\,(q-1)^k, \qquad
   c_{a_0}(q) = \begin{cases}
     \dfrac{q}{q-1} & \text{if } a_0 = 0 \bmod q, \\[2ex]
     1-\dfrac{1}{(q-1)^2} & \text{if } a_0 \ne 0.
   \end{cases} \]

11.2. Remaining major arcs. We next deal with the remaining major
arcs. Namely, we show those centered at $\frac{\ell}{d}$ for d not a power of q are
negligible. We’ll put a trivial bound on S and the cancellation will
come from A(α).
We know
\[ |S(\alpha)| \le q^k\,(1+o(1)) \]
trivially. We now look for cancellation in
\[ A\!\left(\frac{a}{q^k}\right) = A\!\left(\frac{\ell}{d}+c\right) \]
for c at most $\frac{(\log q^k)^{A}}{q^k}$, as we are on a major arc.
We have
\[ A_k(\alpha) = \prod_{j=0}^{k-1}\Bigl(\sum_{\substack{n_j=0\\ n_j\ne a_0}}^{q-1} e\bigl(n_j q^j\alpha\bigr)\Bigr). \]
A crude $L^\infty$ bound will in fact work for us, as we now explain.
We can bound
\[ \begin{aligned}
 \Bigl|\sum_{n_j\ne a_0} e(n_j\theta)\Bigr| &\le (q-3) + |e(n\theta)+e((n+1)\theta)| \\
 &\le (q-3) + 2\,|\cos(\pi\theta)| \\
 &= (q-1) - 2\,(1-|\cos\pi\theta|) \\
 &\le (q-1)\exp\bigl(-c_q\,\|\theta\|^2\bigr)
\end{aligned} \]
for some small $c_q > 0$, where n and n + 1 are two consecutive allowed digits and
\[ \|x\| = \min_{n\in\mathbb{Z}}|x-n|. \]
Then,
\[ |A_k(\alpha)| \le (q-1)^k\exp\Bigl(-c_q\sum_{j=0}^{k-1}\|q^j\alpha\|^2\Bigr). \]
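Assuming $\alpha = \frac{\ell}{d} + O\!\left(\frac{(\log q^k)^{A}}{q^k}\right)$, we get
\[ \begin{aligned}
 |A_k(\alpha)| &\le (q-1)^k\exp\Bigl(-c_q\sum_{j=0}^{k-1}\|q^j\alpha\|^2\Bigr) \\
 &\ll (q-1)^k\exp\Bigl(-c_q\sum_{j=0}^{k/2}\Bigl\|q^j\,\frac{\ell}{d}\Bigr\|^2\Bigr).
\end{aligned} \]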

Remark 11.2. Note that if $\|\theta\| \le \frac{1}{2q}$ then $\|q\theta\| = q\,\|\theta\|$.
Lemma 11.3. For k in an interval of length $\frac{\log d}{\log q} + 1$, we can find $k_0$ with
\[ \Bigl\|q^{k_0}\,\frac{\ell}{d}\Bigr\| \ge \frac{1}{2q}. \]
Proof. We have
\[ \Bigl\|q^{k}\,\frac{\ell}{d}\Bigr\| \ge \frac{1}{d} \]
and now using Remark 11.2, we see that successive powers of q increase this distance, and
eventually the value “wraps around 1” so we get some term which
is not too small. □
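Lemma 11.3 can be tested in exact arithmetic for small parameters. The sketch below makes several assumptions for illustration: q = 7 is prime, d ranges over denominators below 60 that are not powers of q, and the window of exponents has M + 1 consecutive values, where M is the least integer with $q^M \ge d$ (one concrete reading of "length log d / log q + 1"). The function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def dist_to_nearest_integer(x):
    """||x|| for a non-negative Fraction x."""
    frac = x - (x.numerator // x.denominator)
    return min(frac, 1 - frac)

def is_power_of(d, q):
    """True if d is a power of q (including q^0 = 1)."""
    while d % q == 0:
        d //= q
    return d == 1

def window_length(d, q):
    """Smallest M with q**M >= d."""
    M = 0
    while q ** M < d:
        M += 1
    return M

def lemma_holds(q, d, ell, starts=range(0, 6)):
    """Check every window [j0, j0 + M] for some j with ||q^j ell/d|| >= 1/(2q)."""
    M = window_length(d, q)
    threshold = Fraction(1, 2 * q)
    for j0 in starts:
        if not any(dist_to_nearest_integer(Fraction(q ** j * ell, d)) >= threshold
                   for j in range(j0, j0 + M + 1)):
            return False
    return True

q = 7
ok = all(lemma_holds(q, d, ell)
         for d in range(2, 60) if not is_power_of(d, q)
         for ell in range(1, d) if gcd(ell, d) == 1)
print(ok)
```

Using `Fraction` avoids any floating-point doubt at the boundary case where the distance equals 1/(2q) exactly.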
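Therefore,
\[ |A_k(\alpha)| \ll (q-1)^k\exp\!\left(-c_1\,\frac{k}{\log k}\right). \]
Therefore, these other major arcs contribute
\[ \frac{1}{q^k}\sum_{\substack{d<(\log q^k)^{A}\\ (\ell,d)=1\\ d \text{ not a power of } q}} q^k\,(q-1)^k\exp\!\left(-c_q\,\frac{k}{\log k}\right)
   \ll (q-1)^k\,\bigl(\log q^k\bigr)^{2A}\exp\!\left(\frac{-c_q k}{\log k}\right), \]
which is negligible compared to the main term computed last class.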


11.3. The minor arcs. The real crux of the matter for dealing with
minor arcs is that it is possible to get good bounds for the $L^1$ norm
of |A(α)|. We want to bound either
\[ \sum_{a \bmod q^k}\Bigl|A_k\!\left(\frac{a}{q^k}\right)\Bigr| \]
or
\[ \int_0^1 |A_k(\alpha)|\,d\alpha. \]
There will be a huge amount of cancellation here.
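Let’s look at
\[ |A_k(\alpha)| = \prod_{j=0}^{k-1}\Bigl|\sum_{n=0}^{q-1} e(n q^j\alpha) - e(a_0 q^j\alpha)\Bigr|. \]
We’ll now try to bound
\[ \Bigl|\sum_{n=0}^{q-1} e(n q^j\alpha) - e(a_0 q^j\alpha)\Bigr|. \]
Let $\theta := q^j\alpha$. Then, we have
\[ \Bigl|\sum_{n=0}^{q-1} e(n\theta) - e(a_0\theta)\Bigr| \le \min\!\left(q-1,\ 1+\left|\frac{1-e(q\theta)}{1-e(\theta)}\right|\right). \]
We have
\[ \left|\frac{1-e(q\theta)}{1-e(\theta)}\right| \le \frac{2}{2\,|\sin\pi\theta|} \le \frac{1}{2\,\|\theta\|}. \]
Therefore,
\[ \Bigl|\sum_{n=0}^{q-1} e(n\theta) - e(a_0\theta)\Bigr| \le \min\!\left(q-1,\ 1+\left|\frac{1-e(q\theta)}{1-e(\theta)}\right|\right)
   \le \min\!\left(q-1,\ 1+\frac{1}{2\,\|\theta\|}\right). \]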

We now plug this in above for each $\theta = q^j\alpha$. We have
\[ |A_k(\alpha)| \le \prod_{j=0}^{k-1}\min\!\left(q-1,\ 1+\frac{1}{2\,\|q^j\alpha\|}\right). \]
Let’s write
\[ \alpha = \frac{b_1}{q} + \frac{b_2}{q^2} + \frac{b_3}{q^3} + \cdots \]
for $0 \le b_j \le q-1$. Multiplying α by $q^j$ yields
\[ z + \frac{b_{j+1}}{q} + \varepsilon_j \]
with z an integer and $0 < \varepsilon_j \le \frac{1}{q}$. Therefore, the distance to the
nearest integer of $q^j\alpha$ is well determined by $b_{j+1}$. We have good
bounds whenever $b_{j+1} \ne 0, q-1$, while at $b_{j+1} = 0, q-1$, we need
to use the rather poor bound of q − 1.
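Putting the above together, we have
\[ |A_k(\alpha)| \le \prod_{j=0}^{k-1}\min\!\left(q-1,\ 1+\frac{1}{2\,\|q^j\alpha\|}\right), \]
where the j-th factor is at most
\[ \begin{cases}
   q-1 & \text{if } b_{j+1} = 0 \text{ or } q-1, \\[1ex]
   1+\dfrac{q}{2\,b_{j+1}} & \text{if } 1 \le b_{j+1} \le \dfrac{q-1}{2}, \\[2ex]
   1+\dfrac{q}{2\,(q-1-b_{j+1})} & \text{if } \dfrac{q-1}{2} \le b_{j+1} \le q-2.
 \end{cases} \]
Plugging this in and summing over all possibilities for digits, we have
\[ \sum_{a \bmod q^k}\Bigl|A_k\!\left(\frac{a}{q^k}\right)\Bigr|
   \le \prod_{j=1}^{k}\Bigl(2\,(q-1) + \sum_{b=1}^{(q-1)/2}\Bigl(2+\frac{q}{b}\Bigr)\Bigr)
   \le \prod_{j=1}^{k}\bigl(3q+q\log q\bigr). \]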

From the above, we have deduced the following lemma.
Lemma 11.4. We have
\[ \sum_{a \bmod q^k}\Bigl|A\!\left(\frac{a}{q^k}\right)\Bigr| \le (3q+q\log q)^k \]
and
\[ \int_0^1 |A_k(\alpha)|\,d\alpha \le (3+\log q)^k. \]
So if q is large, there is a lot of cancellation, but if q is small, we
won’t get very much.
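The discrete $L^1$ bound of Lemma 11.4 can be compared against the trivial bound for small parameters. In this sketch q = 10, k = 3, $a_0$ = 7 are illustrative choices, and $A_k$ is evaluated through its product formula; the function names are hypothetical.

```python
import cmath
import math

def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def digit_factor(theta, q, a0):
    """|sum of e(n*theta) over digits n != a0|, one factor of |A_k|."""
    return abs(sum(e(n * theta) for n in range(q) if n != a0))

def l1_sum(q, k, a0):
    """Sum over a mod q^k of |A_k(a/q^k)|, via the product formula."""
    N = q ** k
    total = 0.0
    for a in range(N):
        prod = 1.0
        for j in range(k):
            prod *= digit_factor(q ** j * a / N, q, a0)
        total += prod
    return total

q, k, a0 = 10, 3, 7
s = l1_sum(q, k, a0)
print("L1 sum:       ", s)
print("lemma bound:  ", (3 * q + q * math.log(q)) ** k)
print("trivial bound:", (q * (q - 1)) ** k)
```

The trivial bound uses $|A_k| \le (q-1)^k$ at each of the $q^k$ points; the gap between it and the actual sum is the cancellation the lemma captures.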
Before continuing the proof, let’s motivate this. Recall that in our
sum of three primes problem, we had some bound of the form
\[ \sum_{n\le x}\Lambda(n)\,e(n\alpha) \ll x^{4/5} + \frac{x}{\sqrt{q}} + \sqrt{qx}. \]
Here $|\alpha - \frac{a}{q}| \le \frac{1}{q^2}$ for $x^{.9} \ge q > x^{.1}$. For the $L^1$ norm in this lemma,
we’re only using a very small power of x. As long as the denominators
are not too small or large, we are doing well on the minor
arcs.
Then, we want to bound
\[ \frac{1}{q^k}\sum_{a} A_k(a/q^k)\,S(-a/q^k) \]
where the latter is bounded by $q^k/(\log q^k)^{A}$ and the former is bounded
by the lemma. This will work out when $d \ge q^{.01k}$ in ℓ/d. So we
would be happy for large q.
The second idea will be used to estimate $\sum_{j=1}^{N}|A_k(\alpha_j)|$ for relatively
few values of $\alpha_j$. We’ll need an additional spacing condition
that $\|\alpha_i - \alpha_j\| \ge \delta$ if j ≠ i. This is natural because
\[ \left|\frac{\ell_1}{d_1}-\frac{\ell_2}{d_2}\right| \ge \frac{1}{d_1 d_2}. \]
Estimates like this are called large sieve estimates. These are usually
done in $L^2$, but here we’ll do an $L^1$ estimate.
Lemma 11.5. With the spacing condition that $\|\alpha_i - \alpha_j\| \ge \delta$ if j ≠ i, we
have
\[ \sum_{j=1}^{N}|A_k(\alpha_j)| \ll \left(\frac{1}{\delta}+q^k\right)(3+\log q)^k. \]
Our hope is to get a bound that is something like $N\int_0^1|A(\alpha)|\,d\alpha$.
Remark 11.6 (Sobolev inequality). We have
\[ f(t) = f(u) - \int_t^u f'(v)\,dv. \]
Integrating both sides over $u \in \left[t-\frac{\delta}{2},\, t+\frac{\delta}{2}\right]$, we have
\[ \delta\,|f(t)| \le \int_{t-\delta/2}^{t+\delta/2}|f(u)|\,du + \delta\int_{t-\delta/2}^{t+\delta/2}|f'(v)|\,dv. \]
Then
\[ |f(t)| \ll \frac{1}{\delta}\int_{t-\delta/2}^{t+\delta/2}|f(u)|\,du + \int_{t-\delta/2}^{t+\delta/2}|f'(v)|\,dv. \]
Since all points $\alpha_j$ were at least δ apart, these intervals will not overlap
when proving the lemma.
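Proof. We have the bound
\[ \begin{aligned}
 \sum_{j=1}^{N}|A_k(\alpha_j)| &\ll \frac{1}{\delta}\int_0^1|A_k(\alpha)|\,d\alpha + \int_0^1|A_k'(\alpha)|\,d\alpha \\
 &\ll \frac{1}{\delta}\,(3+\log q)^k + \int_0^1|A_k'(\alpha)|\,d\alpha.
\end{aligned} \]
Using
\[ A(\alpha) = \sum_{n<q^k} e(n\alpha)\,1_{\mathcal{A}}(n), \]
we get
\[ A'(\alpha) = 2\pi i\sum_{n<q^k} n\,e(n\alpha)\,1_{\mathcal{A}}(n). \]
Writing
\[ n = \sum_{j=0}^{k-1} n_j q^j \]
we have
\[ A'(\alpha) \ll \sum_{j=0}^{k-1}\Bigl|\sum_{n_j\ne a_0} n_j q^j\,e\bigl(n_j q^j\alpha\bigr)\Bigr|\prod_{\substack{i=0\\ i\ne j}}^{k-1}\Bigl|\sum_{n_i\ne a_0} e\bigl(n_i q^i\alpha\bigr)\Bigr|
   \ll q^k\cdot B \]
where B was the bound for A(α). Then, integrating it out, we again
get a bound $q^k\,(3+\log q)^k$. □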
The idea to finish the proof is to use the above lemma’s bound
with a different k. We will continue bounding the minor arcs next
time using our lemmas with other values of k.

12. 11/2/17
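12.1. Review. Last time, we wanted to evaluate
\[ \frac{1}{q^k}\sum_{a \bmod q^k} A_k\!\left(\frac{a}{q^k}\right)S\!\left(\frac{-a}{q^k}\right). \]
We had the major arcs which were
\[ \left\{\ \frac{a}{q^k} \ :\ \frac{a}{q^k} = \frac{\ell}{d}+\eta,\quad d \le \bigl(\log q^k\bigr)^{A},\quad |\eta| \le \frac{(\log q^k)^{A}}{q^k}\ \right\}. \]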
We gave an asymptotic formula for these major arcs of the form
\[ c_{a_0}(q)\,(q-1)^k. \]
This did not depend on the size of q and is true for any base at least
3 or so.
It remains to deal with the minor arcs. Last time, we saw we could
estimate the $L^1$ norm of $A_k$. We showed
\[ \frac{1}{q^k}\sum_{a}\Bigl|A_k\!\left(\frac{a}{q^k}\right)\Bigr| \ll (3+\log q)^k. \]
We also found
\[ \int_0^1|A_k(\alpha)|\,d\alpha \ll (3+\log q)^k. \]
By a Sobolev type argument, we saw that if
\[ \alpha_1,\ldots,\alpha_N \]
are δ-spaced (i.e., $\|\alpha_i - \alpha_j\| \ge \delta$ if i ≠ j), then
\[ \sum_{j=1}^{N}|A_k(\alpha_j)| \ll \left(\frac{1}{\delta}+q^k\right)(3+\log q)^k. \]
12.2. Bounding the minor arcs. Recall that by Dirichlet’s theorem with $Q = q^{k/2}$, we can write
\[ \left|\frac{a}{q^k}-\frac{\ell}{d}\right| \le \frac{1}{d\,q^{k/2}} \]
with $d \le q^{k/2}$. We can
assume $d \ge (\log q^k)^{A}$ as we are on the minor arcs.
For now, fix a choice of B and D (where we will split d into dyadic
intervals based on D and $q^k|\eta|$ into dyadic intervals based on B). We
will later range over different possibilities of D and B.
We will split this into terms with D ≤ d ≤ 2D. We won’t worry
about over-counting because we’ll ultimately estimate things by taking
absolute values. Write
\[ \frac{a}{q^k} = \frac{\ell}{d}+\eta \]
so that $B \le q^k|\eta| \le 2B$ and
\[ q^k\left(\eta+\frac{\ell}{d}\right) \in \mathbb{Z}. \]
We also need to consider the case where $q^k|\eta| \le 1$.
The number of choices for η with $q^k|\eta|$ between B and 2B is roughly
2B (since η can be negative). We have $D \le q^{k/2}$. Then $q^k|\eta| \ll q^{k/2}/D$. We can assume
\[ B \ll q^{k/2}/D. \]
Being on a minor arc means either
(1) $D \ge (\log q^k)^{A}$,
(2) or, if D is small, then $B \ge (\log q^k)^{A}$.
So, being on a minor arc means BD is somewhat large.
Goal 12.1. We now want to understand the contribution of one of
these dyadic blocks.
We want to understand
\[ \sum_{D\le d\le 2D}\ \sum_{(\ell,d)=1}\ \sum_{\substack{\eta\\ q^k|\eta|\sim B\\ q^k(\eta+\ell/d)\in\mathbb{Z}}}\Bigl|A_k\!\left(\frac{\ell}{d}+\eta\right)\Bigr|. \]
This is a sum containing about $D^2B$ terms. This number of terms in
the sum is then at most $q^{k/2}D \ll q^k$ since $B \ll q^{k/2}/D$.
Recall that we are trying to estimate the number of primes up to $q^k$
not containing the digit $a_0$. Now, our set $A_k$ is self-similar, meaning that the digits can be chosen independently in
\[ A_k\!\left(\frac{\ell}{d}+\eta\right) = \sum_{\substack{n_0,n_1,\ldots,n_{k-1}\\ 0\le n_i\le q-1\\ n_i\ne a_0}} e\Bigl(\bigl(n_0+n_1q+\cdots+n_{k-1}q^{k-1}\bigr)\alpha\Bigr) \]
where $\alpha = \frac{a}{q^k} = \frac{\ell}{d}+\eta$.
 
`
∑ ∑ ∑ Ak
d

D ≤d≤2D (`,d)=1 η
qk |η |∼ B
qk (η +`/d)∈Z
  αq
k 2
 ( q − 1) D B
where
 
q
log q −1 ( B + log q)
αq =
log q
ANALYTIC NUMBER THEORY NOTES 77

Then,
 
 αq q
D 2αq
= q k1
= (3 + log q)k1 .
q−1
where qk1 ∼ D2 , qk2 ∼ B.
Proof. We’ll now split the digits in this sum into
the first $k_1$ digits,
the middle $k - k_1 - k_2$ digits,
the last $k_2$ digits,
with $k_1 + k_2 \le k$.
The contribution of the first $k_1$ digits is $A_{k_1}(\alpha)$. For the middle digits,
there are $(q-1)^{k-k_1-k_2}$ terms, each bounded by 1, so we get $(q-1)^{k-k_1-k_2}$
as a bound. For the last digits, we get the phases
\[ q^{k-k_2}\bigl(n_{k-k_2} + n_{k-k_2+1}\,q + \cdots\bigr)\alpha. \]
Therefore, multiplying the contributions from all the digits, we get
\[ \Bigl|\sum e\Bigl(\bigl(n_0+n_1q+\cdots+n_{k-1}q^{k-1}\bigr)\alpha\Bigr)\Bigr|
   \le \bigl|A_{k_1}(\alpha)\bigr|\,(q-1)^{k-k_1-k_2}\,\bigl|A_{k_2}\bigl(q^{k-k_2}\alpha\bigr)\bigr|. \]

Remark 12.3. Thinking about what we are doing, there are $D^2$ points
of the form ℓ/d and about B points η near each ℓ/d. Given a fixed
ℓ/d, we are multiplying it by something corresponding to each of
the B well spaced intervals. Then, we choose $q^{k_1}$ on the scale of $D^2$
and $q^{k_2}$ on the scale of B.
We can bound
\[ \Bigl|A_k\!\left(\frac{\ell}{d}+\eta\right)\Bigr| \ll (q-1)^{k-k_1-k_2}\Bigl(\sup_{|\eta|\sim B/q^k}\Bigl|A_{k_1}\!\left(\frac{\ell}{d}+\eta\right)\Bigr|\Bigr)\,\Bigl|A_{k_2}\!\left(q^{k}\,\frac{1}{q^{k_2}}\left(\frac{\ell}{d}+\eta\right)\right)\Bigr|. \]
Note that the last $A_{k_2}$ term corresponds to B well spaced points mod
1. That is, there are B points that are $\frac{1}{q^{k_2}}$-spaced (since we chose $q^{k_2} \sim B$, and in fact we will
need $q^{k_2} < B$).
Now, we sum this over ℓ, d, η. By a lemma from last time, fixing
ℓ, d and summing over η, we get
\[ \sum_{\eta}\Bigl|A_{k_2}\!\left(q^{k}\,\frac{1}{q^{k_2}}\left(\frac{\ell}{d}+\eta\right)\right)\Bigr| \ll q^{k_2}\,(3+\log q)^{k_2}. \]
Then, we want to compute
\[ (q-1)^{k-k_1-k_2}\sum_{\ell,\,d,\,d\sim D}\ \sup_{|\eta|\sim B/q^k}\Bigl|A_{k_1}\!\left(\frac{\ell}{d}+\eta\right)\Bigr|
   \ll (q-1)^{k-k_1-k_2}\,q^{k_1}\,(3+\log q)^{k_1} \]
using the lemma from last time again and the fact that rational numbers
with denominator on the order of D are $\frac{1}{D^2}$-spaced. □
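The bound of the above proposition yields a useful bound when
$D^2B$ is small compared to $q^k$, where this bound starts to beat the $L^1$
bound.
We want
\[ \sum_{d\sim D}\sum_{\ell}\sum_{\eta,\ q^k|\eta|\sim B}\Bigl|A_k\!\left(\frac{\ell}{d}+\eta\right)\Bigr|\,\Bigl|S\!\left(-\left(\frac{\ell}{d}+\eta\right)\right)\Bigr|
   \ll (q-1)^k\,\bigl(D^2B\bigr)^{\alpha_q}\max_{d\sim D,\ q^k|\eta|\sim B}\Bigl|S\!\left(\frac{\ell}{d}+\eta\right)\Bigr|. \]
We want this to be small compared to $(q-1)^k\cdot q^k$. We therefore need to know,
for these points, the size of
\[ S\!\left(\frac{\ell}{d}+\eta\right). \]
Recall
\[ \Bigl|S\!\left(\frac{\ell}{d}+\eta\right)\Bigr| = \Bigl|\sum_{n\le q^k}\Lambda(n)\,e(n\alpha)\Bigr|. \]
Remark 12.4. Recall that from Vinogradov’s theorem, we found that
\[ \sum_{n\le x}\Lambda(n)\,e(n\alpha), \]
when we had an approximation of the form
\[ \left|\alpha-\frac{a}{q}\right| \le \frac{1}{q^2}, \]
was bounded by
\[ \left(x^{4/5} + \frac{x}{\sqrt{q}} + \sqrt{xq}\right)(\log x)^3. \]
We can use the same bound here.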
We have $d \le q^{k/2}$ and $|\eta| \le \frac{1}{d\,q^{k/2}}$. Therefore,
\[ S\!\left(\frac{\ell}{d}+\eta\right) \ll \left(q^{4k/5} + \frac{q^k}{\sqrt{D}} + \sqrt{q^k D}\right)\bigl(\log q^k\bigr)^3. \]
The only worry is that if D is small then there is an issue, because
$1/\sqrt{D}$ is big. In this case, we would like to look for savings in B. So,
this is not quite enough when D is small.
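Recall that the approximations $\frac{\ell}{d}$ to $\frac{a}{q^k}$ are convergents of the
continued fraction. We have
\[
\left|\frac{a}{q^k}-\frac{\ell}{d}\right| \sim \frac{B}{q^k}.
\]
Perhaps this approximation is not too good. We could try taking later
approximations from the continued fraction, taking the next convergent.
Choose a modulus $Q$ and pick an approximation $\frac{u}{v}$ with $v\le Q$ and
\[
\left|\frac{a}{q^k}-\frac{u}{v}\right| \le \frac{1}{vQ}.
\]
Arrange this so that $\frac{1}{dQ}\ll\frac{B}{10q^k}$. That is, choose $Q=\frac{10^3q^k}{BD}$. Then,
this $u/v$ is not the same as $\ell/d$ because it is a closer approximation.
Further,
\[
\frac{1}{dv} \le \left|\frac{\ell}{d}-\frac{u}{v}\right| \le \frac{1}{vQ}+\frac{2B}{q^k}.
\]
Further, $\frac{1}{vQ}$ is small compared to $\frac{1}{dv}$, and $\frac{1}{dv}\le 3\cdot\frac{2B}{q^k}$,
and we get
\[
\frac{10^3q^k}{BD} \ge v \ge \frac{1}{10}\,\frac{q^k}{BD}.
\]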
So, in this case, we can redo our previous argument with a larger
denominator. Using the bound from before, we see
\[
S\left(\frac{\ell}{d}+\eta\right) \ll
\left(q^{4k/5}+\frac{q^k}{\sqrt{BD}}\right)\left(\log q^k\right)^3,
\]
using that
\[
\frac{q^k\sqrt{BD}}{q^{k/2}} \ll q^{3k/4}
\]
and so we can absorb the third term into $q^{4k/5}$.

We are now basically done. We know $BD$ is at least some power
of $\log q^k$. We want to find
\[
\frac{1}{q^k}\sum_{\text{minor arcs}} A_k\left(\frac{a}{q^k}\right)S\left(\frac{-a}{q^k}\right)
\ll \left(\log q^k\right)^5(q-1)^k
\max_{D,B,\; DB\ge(\log q^k)^A}
\left(\frac{(D^2B)^{\alpha_q}}{q^{k/5}}+\frac{(D^2B)^{\alpha_q}}{\sqrt{DB}}\right).
\]
Now, $DB\le q^{k/2}$ and $D\le q^{k/2}$, so we just need $\alpha_q<\frac15$ for the first
term to be sufficiently small and we need $\alpha_q<\frac14$ for the second term
to be sufficiently small. So we just need to check $\alpha_q<\frac15$.
Let’s now examine this constraint. Recall
\[
\alpha_q = \frac{\log\left(\frac{q}{q-1}\,(B+\log q)\right)}{\log q}.
\]
If $q$ is sufficiently large, this will hold. Indeed, for $q>2\cdot10^6$ this will
hold.
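As a quick numerical sanity check of the threshold, here is a small sketch that evaluates $\alpha_q$. It assumes the constant written $B$ in the displayed formula is the 3 appearing in the lemma bound $(3+\log q)^k$ used earlier; that identification is an assumption made for illustration, not something these notes state.

```python
import math

def alpha_q(q, c=3.0):
    """alpha_q = log((q / (q - 1)) * (c + log q)) / log q.

    The constant c stands in for the symbol written B in the formula
    above; c = 3 is an assumption borrowed from the (3 + log q)^k lemma.
    """
    return math.log(q / (q - 1) * (c + math.log(q))) / math.log(q)

for q in (10**6, 2 * 10**6, 10**7):
    # Print q, alpha_q, and whether the constraint alpha_q < 1/5 holds.
    print(q, round(alpha_q(q), 4), alpha_q(q) < 0.2)
```

Under this assumption the constraint fails at $q=10^6$ and holds at $q=2\cdot10^6$, in line with the threshold quoted above.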
For example, if $\alpha_q=.19$, the first term is bounded by $q^{-.01k}$ and
the second term is bounded by $(DB)^{-.12}\le\left(\log q^k\right)^{-.12A}$, and we
can choose $A$ as large as we want so that this savings dominates
$\left(\log q^k\right)^5$.
Exercise 12.5. Work out any changes in the case that q is composite.
Hint: There is essentially no difference. We only used some sim-
plifications for computing the major arcs. We divided major arcs
into cases that the denominators are powers of the modulus q. One
would then have to work out differences when the denominator di-
vides q, or something like that.
Remark 12.6. For further ideas along this line, look at Piatetski-Shapiro,
yielding primes of the form $\lfloor n^{\alpha}\rfloor$ for $\alpha=1.01$.

13. 11/7/17
Today we’ll start talking about something new, the Bombieri-Vinogradov
Theorem. This tells us about the distribution of primes in arithmetic
progressions. Let $\Psi(x;q,a)$ denote the count of primes up to $x$
congruent to $a$ mod $q$, weighted by $\Lambda$; that is,
$\Psi(x;q,a)=\sum_{n\le x,\ n\equiv a \bmod q}\Lambda(n)$. Let $(a,q)=1$. We are looking to estimate
\[
E(x;q,a) := \Psi(x;q,a)-\frac{x}{\phi(q)}.
\]

The generalized Riemann hypothesis implies
\[
|E(x;q,a)| \ll x^{1/2}(\log x)^2,
\]
which is good for $q\le x^{1/2}/(\log x)^2$.
Unconditionally, we’ll need to account for Siegel zeros.
Theorem 13.1 (Bombieri-Vinogradov). For every $A>0$ there exists a
$B>0$ so that
\[
\sum_{q\le Q}\max_{(a,q)=1}\max_{y\le x}|E(y;q,a)| \ll \frac{x}{(\log x)^A}
\]
provided $Q\le\frac{x^{1/2}}{(\log x)^B}$.
Remark 13.2. The generalized Riemann hypothesis yields a bound
of the form $Q\cdot x^{1/2}(\log x)^2$, which is essentially the same.
Remark 13.3. There is a trivial bound of the form
\[
|E(x;q,a)| \ll \frac{x}{q}\log x,
\]
so one trivially obtains the bound $\ll x(\log x)^2$, and Bombieri-
Vinogradov lets us save arbitrary powers of $\log x$.
Remark 13.4. The key ideas in the proof are
(1) Bilinear forms and Vaughan’s identity
(2) Primes in progressions and Siegel zeros
(3) Large Sieve inequalities
13.1. Large Sieve for Additive characters. Suppose we have $\alpha_1,\ldots,\alpha_R\in\mathbb{R}/\mathbb{Z}$
which are $\delta$ well-spaced, i.e., $|\alpha_r-\alpha_s|\ge\delta$ for $r\ne s$.
Goal 13.5. Our goal is to bound
\[
\sum_{j=1}^{R}\left|\sum_{n=1}^{N}a_ne(n\alpha_j)\right|^2
\]
for $a_n\in\mathbb{C}$.
Instead of taking the sum in the above goal from $n=1$ to $N$ we
can re-parameterize the sum as
\[
\sum_{n=M+1}^{M+N}a_ne(n\alpha) = \sum_{n=1}^{N}a_{M+n}e((M+n)\alpha)
= \sum_{n=1}^{N}a_{M+n}e(M\alpha)\cdot e(n\alpha)
\]

so this is no more general.

Remark 13.6. We can bound
\[
\left(\sum_n|a_n|\right)^2 \le N\sum_n|a_n|^2
\]
by Cauchy-Schwarz and we shouldn’t expect anything better than
this.
Suppose, on the other hand, that the $a_n$ are “wiggling around in
all directions randomly” and all have norm 1. If the $a_n$ behave
independently for different values of $n$, we might
expect some kind of square-root cancellation. That is, we might have a bound like
\[
\left(\sum_n|a_n|^2\right)\cdot R.
\]
Maybe in place of $R$, we might get $\frac{1}{\delta}$, because if $R$ points were evenly
spaced, we would be using square root cancellation and averaging.

We now want to get estimates of the above form. We’ll prove
something stronger, but here’s a first pass:

Theorem 13.7. We have
\[
\sum_{j=1}^{R}\left|\sum_{n=1}^{N}a_ne(n\alpha_j)\right|^2 \ll
\left(N+\frac{1}{\delta}\right)\sum_{n=1}^{N}|a_n|^2.
\]

We give two proofs.

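First Proof. The first step to prove this is a Sobolev argument. We
have
\[
\left|\sum_{n=1}^{N}a_ne(n\alpha_j)\right|^2 \ll
\frac{1}{\delta}\int_{\alpha_j-\delta/2}^{\alpha_j+\delta/2}\left|\sum_na_ne(n\alpha)\right|^2d\alpha
+\int_{\alpha_j-\delta/2}^{\alpha_j+\delta/2}\left|\sum_na_ne(n\alpha)\right|\left|\sum_nna_ne(n\alpha)\right|d\alpha.
\]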

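Summing from 1 to $R$, we have
\[
\begin{aligned}
\sum_{j=1}^{R}\left|\sum_{n=1}^{N}a_ne(n\alpha_j)\right|^2
&\ll \frac{1}{\delta}\int_0^1\left|\sum_na_ne(n\alpha)\right|^2d\alpha
+\left(\int_0^1\left|\sum_na_ne(n\alpha)\right|^2d\alpha\right)^{1/2}
\left(\int_0^1\left|\sum_nna_ne(n\alpha)\right|^2d\alpha\right)^{1/2}\\
&\ll \frac{1}{\delta}\sum_n|a_n|^2+\left(\sum_n|a_n|^2\right)^{1/2}\left(\sum_n|na_n|^2\right)^{1/2}\\
&\ll \frac{1}{\delta}\sum_n|a_n|^2+\left(\sum_n|a_n|^2\right)^{1/2}\cdot N\left(\sum_n|a_n|^2\right)^{1/2},
\end{aligned}
\]
using Parseval’s identity. □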


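Second proof. This argument is based on duality. Say we have $(a_{mn})_{M\times N}$,
an $M\times N$ matrix. From this we can consider three kinds of statements.
(1)
\[
\sum_{m=1}^{M}\left|\sum_{n=1}^{N}a_{mn}y_n\right|^2 \le C\sum_{n=1}^{N}|y_n|^2
\]
(2)
\[
\left|\sum_{m=1}^{M}\sum_{n=1}^{N}a_{mn}x_my_n\right| \le
\sqrt{C}\left(\sum_m|x_m|^2\right)^{1/2}\left(\sum_n|y_n|^2\right)^{1/2}
\]
(3)
\[
\sum_{n=1}^{N}\left|\sum_{m=1}^{M}a_{mn}x_m\right|^2 \le C\sum_m|x_m|^2.
\]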

Exercise 13.8. Show one of the above three inequalities holds for all
choices of x, y if and only if the other two do. I.e., show the above
three statements are equivalent.
By duality, in order to give the desired bound, it suffices to bound
\[
\sum_{n=1}^{N}\left|\sum_{r=1}^{R}b_re(n\alpha_r)\right|^2
\]
in terms of the $L^2$ norm of $b$ for all choices of $b$. Expanding this out,
we get
\[
\sum_{n=1}^{N}\left|\sum_{r=1}^{R}b_re(n\alpha_r)\right|^2 =
\sum_{r,s\le R}b_r\overline{b_s}\sum_{n=1}^{N}e\left(n(\alpha_r-\alpha_s)\right).
\]

Since αr and αs are all well spaced, the terms in the exponentials
won’t be close to integers very often.
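There are two types of terms. First, there are those with $r=s$, in which case we
get a contribution of $N\sum_r|b_r|^2$.
There are also the off-diagonal terms with $r\ne s$. Here,
\[
\sum_{n=1}^{N}e(n\theta) \ll \frac{1}{\|\theta\|}
\]
where $\|\theta\|$ is the distance from $\theta$ to the nearest integer. We can then
estimate the sum of the off diagonal terms by
\[
\begin{aligned}
\sum_{r\ne s}\left(|b_r|^2+|b_s|^2\right)\frac{1}{\|\alpha_r-\alpha_s\|}
&\ll \sum_r|b_r|^2\sum_{s\ne r}\frac{1}{\|\alpha_r-\alpha_s\|}\\
&\ll \left(\sum_r|b_r|^2\right)\left(\sum_{j=1}^{R}\frac{1}{j\delta}\right)\\
&\ll \left(\sum_r|b_r|^2\right)\frac{1}{\delta}\log R\\
&\ll \left(\sum_r|b_r|^2\right)\frac{1}{\delta}\log\frac{1}{\delta},
\end{aligned}
\]
using symmetry to bound the $|b_r|^2+|b_s|^2$ by (an implicit factor of 2)
times $|b_r|^2$.
So, we have proved, in the dual form, that
\[
\sum_{n=1}^{N}\left|\sum_rb_re(n\alpha_r)\right|^2 \le
\left(N+O\left(\frac{1}{\delta}\log\frac{1}{\delta}\right)\right)\sum_r|b_r|^2.
\]
We now have an extra factor of log, and we will now explain how
to remove this factor of log. We will set it up, but won’t really carry
it out.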

Recall we were trying to estimate
\[
\sum_{n=1}^{N}\left|\sum_{r=1}^{R}b_re(n\alpha_r)\right|^2.
\]
Say we start with the characteristic function of $[1,N]$ and take
a smoothing $\Phi$ of this characteristic function, supported on a small
interval around $(1,N)$ and always positive. We instead try to estimate
\[
\sum_{n}\Phi(n)\left|\sum_{r=1}^{R}b_re(n\alpha_r)\right|^2.
\]
One could imagine smoothing on the interval
\[
\left(1-\frac{1}{\delta},\,N+\frac{1}{\delta}\right)
\]
so that $\Phi$ is supported on this interval. We have
\[
\sum_{n}\Phi(n)\left|\sum_{r=1}^{R}b_re(n\alpha_r)\right|^2
=\sum_{r,s}b_r\overline{b_s}\sum_n\Phi(n)e\left(n(\alpha_r-\alpha_s)\right)
=\sum_{r,s}b_r\overline{b_s}\sum_k\hat\Phi\left(k+\alpha_r-\alpha_s\right),
\]
using Poisson summation. The Fourier transform is large at 0 (around
$N+\frac{2}{\delta}$). You can get the rate of decay by integrating by parts many
times. One can learn about the decay from the derivatives of $\Phi$.
The Fourier transform is approximately supported on an interval of
length $\delta$, apart from some small fluctuations. Since $\alpha_r-\alpha_s$
never gets within $\delta$ of an integer when $r\ne s$, the value
$\hat\Phi(k+\alpha_r-\alpha_s)$ is always close to 0 when $r\ne s$.
Therefore, including the contribution at $r=s$, we get the sum is well
estimated by
\[
\hat\Phi(0)\sum_{r=1}^{R}|b_r|^2
\]
and we save the log term.
Exercise 13.9 (Involved exercise). Complete the above sketch into a
proof.

Remark 13.10. Here is a problem: Can one choose $\Phi\ge0$, $\Phi\ge\chi_{[1,N]}$
and $\hat\Phi$ supported in $(-\delta,\delta)$ minimizing $\hat\Phi(0)$?

There is a solution discovered by Beurling and Selberg. One
obtains something like $\hat\Phi(0)\le N+\frac{1}{\delta}-1$.

We will next deduce the large sieve from the above theorem. Say
we have
\[
\left\{\frac{a}{q} : q\le Q,\ (a,q)=1\right\}
\]
which is about $Q^2$ points, each $\frac{1}{Q^2}$ spaced. Then
\[
\sum_{q=1}^{Q}\ \sum^{*}_{a \bmod q}\left|\sum_{n=M+1}^{M+N}a_ne\left(\frac{an}{q}\right)\right|^2
\le \left(N+O(Q^2)\right)\sum_{n=M+1}^{M+N}|a_n|^2.
\]
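To see the inequality in action, the following sketch evaluates both sides for random complex coefficients. It takes the implied constant in $O(Q^2)$ to be 1, which is an assumption for illustration; it is consistent with the sharp shape $N+\frac{1}{\delta}-1$ mentioned above, since distinct fractions with denominators at most $Q$ are at least $\frac{1}{Q^2}$ apart. The values of `N`, `M`, `Q` and the random seed are arbitrary choices.

```python
import cmath
import math
import random

def e(t):
    """The additive character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * math.pi * t)

def large_sieve_lhs(coeffs, M, Q):
    """Sum over q <= Q and reduced residues b mod q of
    |sum_{n = M+1}^{M+N} a_n e(b n / q)|^2, where coeffs[i] = a_{M+1+i}."""
    total = 0.0
    for q in range(1, Q + 1):
        for b in range(1, q + 1):
            if math.gcd(b, q) == 1:
                s = sum(a_n * e(b * (M + 1 + i) / q)
                        for i, a_n in enumerate(coeffs))
                total += abs(s) ** 2
    return total

random.seed(0)
N, M, Q = 200, 17, 12
coeffs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
lhs = large_sieve_lhs(coeffs, M, Q)
rhs = (N + Q * Q) * sum(abs(a_n) ** 2 for a_n in coeffs)
print(lhs, rhs, lhs <= rhs)
```

For random coefficients the left side is typically far below the right side; the bound is closest to tight for coefficients correlated with one of the characters $e(an/q)$.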

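Example 13.11 (Important example). Take $a_n=1$ if $n\in[M+1,M+N]$
is prime and 0 otherwise. To examine the left hand side,
\[
\begin{aligned}
\sum^{*}_{a \bmod q}\ \sum_{\substack{n\in[M+1,M+N]\\ n \text{ prime}}}e\left(\frac{an}{q}\right)
&=\sum_{\substack{n\in[M+1,M+N]\\ n \text{ prime}}}\ \sum^{*}_{a \bmod q}e\left(\frac{an}{q}\right)\\
&=\sum_{\substack{n\in[M+1,M+N]\\ n \text{ prime}}}\mu(q)\\
&=\mu(q)\left(\pi(M+N)-\pi(M)\right)
\end{aligned}
\]
where $\pi(k)$ is the number of primes up to $k$. Using Cauchy-Schwarz,
we get
\[
\phi(q)\sum^{*}_{a \bmod q}\left|\sum_{n \text{ prime}}e\left(\frac{an}{q}\right)\right|^2
\ge \mu(q)^2\left(\pi(M+N)-\pi(M)\right)^2.
\]
Combining the above, the left hand side of the large sieve is bounded
below by
\[
\sum_{q=1}^{Q}\ \sum^{*}_{a \bmod q}\left|\sum_{n=M+1}^{M+N}a_ne\left(\frac{an}{q}\right)\right|^2
\ge \sum_{q\le Q}\frac{\mu(q)^2}{\phi(q)}\left(\pi(M+N)-\pi(M)\right)^2
\]
and the large sieve implies
\[
\sum_{q\le Q}\frac{\mu(q)^2}{\phi(q)}\left(\pi(M+N)-\pi(M)\right)^2
\le \left(N+O(Q^2)\right)\left(\pi(M+N)-\pi(M)\right).
\]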
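The step replacing the inner sum over reduced residues by $\mu(q)$ uses the evaluation of the Ramanujan sum when $n$ is coprime to $q$. A minimal numerical check of that evaluation is sketched below; the helper names and the choice $n=101$ are illustrative assumptions.

```python
import cmath
import math

def mobius(n):
    """Moebius function, computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a square divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result

def ramanujan_sum(q, n):
    """Sum of e(a n / q) over reduced residues a mod q (real part)."""
    total = sum(cmath.exp(2j * math.pi * a * n / q)
                for a in range(1, q + 1) if math.gcd(a, q) == 1)
    return total.real             # imaginary parts cancel in pairs a, q - a

n = 101  # a prime, hence coprime to every q below
for q in (1, 2, 3, 4, 6, 10, 15, 30):
    print(q, mobius(q), round(ramanujan_sum(q, n)))
```

Note that the identity needs $(n,q)=1$, so in the example it applies to the primes in the window that do not divide $q$.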

This implies that the number of primes in the interval $M$ to $M+N$
is bounded by
\[
\left(N+O\left(Q^2\right)\right)\left(\sum_{q\le Q}\frac{\mu(q)^2}{\phi(q)}\right)^{-1}.
\]
If we make $Q$ too big, this $O(Q^2)$ term will start to dominate. So,
we might want to take something like $Q=N^{1/2-\varepsilon}$. The bound then
becomes
\[
N(1+o(1))\,\frac{1}{\sum_{q\le N^{1/2-\varepsilon}}\mu(q)^2/\phi(q)}.
\]

Exercise 13.12. Show
\[
\sum_{n\le x}\frac{\mu(n)^2}{\phi(n)} \sim \log x.
\]
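Before attempting the exercise, it can help to look at the sum numerically. The sketch below computes it with simple sieves; the function name and the sample values of $x$ are illustrative choices, not part of the exercise.

```python
import math

def mu_squared_over_phi_sum(x):
    """Return the sum over n <= x of mu(n)^2 / phi(n), using simple sieves."""
    phi = list(range(x + 1))
    squarefree = [True] * (x + 1)
    for p in range(2, x + 1):
        if phi[p] == p:                       # untouched so far, so p is prime
            for m in range(p, x + 1, p):
                phi[m] -= phi[m] // p         # multiply phi[m] by (1 - 1/p)
            for m in range(p * p, x + 1, p * p):
                squarefree[m] = False         # p^2 divides m
    return sum(1.0 / phi[n] for n in range(1, x + 1) if squarefree[n])

for x in (10**3, 10**4, 10**5):
    print(x, round(mu_squared_over_phi_sum(x), 4), round(math.log(x), 4))
```

The printed values sit a bounded amount above $\log x$, which is compatible with the asymptotic since the difference is swamped by $\log x$ as $x$ grows.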

Then, the number of primes between $M$ and $M+N$ is bounded by
\[
N(1+o(1))\,\frac{1}{\sum_{q\le N^{1/2-\varepsilon}}\mu(q)^2/\phi(q)}
\le \frac{2N(1+o(1))}{\log N}.
\]
This yields
Theorem 13.13 (Brun-Titchmarsh theorem). We have
\[
\pi(M+N)-\pi(M) \le \frac{2(1+o(1))N}{\log N}.
\]
Remark 13.14. The constants in this inequality can be made explicit.
In fact, one can replace
\[
\pi(M+N)-\pi(M) \le \frac{2(1+o(1))N}{\log N}
\]
by
\[
\pi(M+N)-\pi(M) \le \frac{2N}{\log N}
\]
without any error terms. That is, the number of primes from $M$ to
$M+N$ is no more than about twice the number of primes from 1 to $N$.
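As an illustration of the explicit form $\pi(M+N)-\pi(M)\le\frac{2N}{\log N}$, the following sketch counts primes in a few windows with a sieve. The sieve limit and the sample windows $(M,N)$ are arbitrary choices made for this example.

```python
import math

def prime_flags(limit):
    """Sieve of Eratosthenes: flags[n] is True exactly when n is prime."""
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return flags

def primes_in_window(flags, M, N):
    """pi(M + N) - pi(M): the number of primes in (M, M + N]."""
    return sum(flags[M + 1 : M + N + 1])

flags = prime_flags(200_000)
windows = [(0, 1000), (10**5, 1000), (10**5, 100), (150_000, 50_000)]
for M, N in windows:
    count = primes_in_window(flags, M, N)
    print(M, N, count, round(2 * N / math.log(N), 1))
```

In each window the count is well below $2N/\log N$; the interest of the theorem is that the bound is uniform in $M$.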
Remark 13.15. In fact, one might expect π ( x ) + π (y) ≥ π ( x + y).
This contradicts a conjecture of Hardy and Littlewood, so is expected
to be false.

Exercise 13.16. Generalize the Brun-Titchmarsh theorem as follows.
Use the Large Sieve appropriately to show
\[
\pi(x;q,a) \le \frac{x(2+o(1))}{\phi(q)\log(x/q)}.
\]
For example, say $x=q^{1,000,000}$. Then, $\pi(x;q,a)$ is at most 2.00001 times
the expected number of primes. I.e.,
\[
\pi(x;q,a) \le (2.00001)\frac{x}{\phi(q)\log x}.
\]
This constant more than 2 is significant because of Siegel zeros. With a
Siegel zero $\beta$,
\[
\Psi(x;q,a) = \frac{x}{\phi(q)}-\chi(a)\frac{x^{\beta}}{\phi(q)\beta}
\]
for $\chi$ a quadratic character. If one could replace the 2 by 1.99 one
would imply there are no Siegel zeros.
Exercise 13.17 (What is large about the large sieve). For primes we
used that one residue class is forbidden and so we get some
imbalances. Now, more generally, suppose we have $S\subset[1,N]$ with
$|S \ (\mathrm{mod}\ p)|\le\frac{p+1}{2}$. Use the Large sieve to show
\[
|S| \le N^{1/2+\varepsilon}.
\]
Here, the sieve is large because we are forbidding a large number of
residue classes. Say here the primes $p$ range up to $p\le\sqrt{N}$.
Remark 13.18. This bound is tight because if we take $S$ to be the set
of squares, we get the claimed number of residue classes.
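The claim about squares is easy to check directly: the squares in $[1,N]$ occupy exactly $\frac{p+1}{2}$ residue classes mod an odd prime $p$ (including the class 0) once there are at least $p$ of them. A small sketch, with $N$ and the list of primes chosen arbitrarily:

```python
import math

def residues_of_squares(N, p):
    """Number of residue classes mod p hit by the squares in [1, N]."""
    return len({(k * k) % p for k in range(1, math.isqrt(N) + 1)})

N = 10**6                     # the squares in [1, N] are 1^2, ..., 1000^2
for p in (3, 5, 7, 11, 101, 997):
    print(p, residues_of_squares(N, p), (p + 1) // 2)
```

Here $|S|=\sqrt{N}$, matching the exponent $\frac12$ in the bound of the exercise.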
Remark 13.19. There is a conjecture of Helfgott and Venkatesh saying
that if a set misses half the residue classes and occupies the other half,
it should look like the values of some quadratic polynomial.

14. 11/9/17
Last time we discussed the Large sieve in its additive form. That
is, if $\alpha_1,\ldots,\alpha_R$ are $\delta$ well spaced, then
\[
\sum_{r=1}^{R}\left|\sum_{n=M+1}^{M+N}a_ne(n\alpha_r)\right|^2 \le
\left(N+O\left(\frac{1}{\delta}\right)\right)\sum|a_n|^2.
\]
One way we’ll apply this is by taking the points $\frac{a}{q}$ with $(a,q)=1$, $q\le Q$,
so that $R\sim Q^2$, $\delta\sim\frac{1}{Q^2}$, and obtaining
\[
\sum_{q\le Q}\ \sum^{*}_{a \bmod q}\left|\sum_{n=M+1}^{M+N}a(n)e\left(\frac{an}{q}\right)\right|^2
\le \left(N+O(Q^2)\right)\sum|a(n)|^2.
\]

14.1. A multiplicative version of the large sieve. We’ll now
formulate the large sieve in a multiplicative form in order to prove
Bombieri Vinogradov. We’ll average over all characters $\chi$ mod $q$ and
sum over $q\le Q$.
\[
\sum_{q\le Q}\ \sum^{*}_{\chi \bmod q}\left|\sum_{n=M+1}^{M+N}a(n)\chi(n)\right|^2
\le \left(N+O(Q^2)\right)\sum|a(n)|^2
\]

Remark 14.1. Here the term of size $N$ corresponds to a particular
“bad character,” and the $Q^2$ corresponds to the sum over the
remaining characters with square-root savings, bounded by some $L^2$
norm. We’d like to think characters of different moduli are
orthogonal to each other, but we don’t want to recount characters, so we
have the star on our sum to indicate we are summing over primitive
characters $\chi$ (not induced by characters of smaller modulus).
In fact, we’ll obtain something slightly more precise than the above.
We want to go from multiplicative characters to something involving
additive characters. We’ll want to pass between $\chi(n)$ and $e\left(\frac{n}{q}\right)$.
Let $\chi$ mod $q$ be a primitive character. Let
\[
\tau(\chi) = \sum_{a \bmod q}\chi(a)e\left(\frac{a}{q}\right)
\]
be the Gauss sum. Suppose $(n,q)=1$. Then, consider
\[
\sum_{a \bmod q}\chi(a)e\left(\frac{an}{q}\right)
=\sum_{a \bmod q}\chi(a)e\left(\frac{an}{q}\right)\chi(n)\overline{\chi}(n)
=\tau(\chi)\overline{\chi}(n),
\]
noting that $\chi(n)\overline{\chi}(n)=1$ if $n$ is coprime to $q$. Then,
\[
\chi(n) = \frac{1}{\tau(\overline{\chi})}\sum_{a \bmod q}\overline{\chi}(a)e\left(\frac{an}{q}\right).
\]
This holds for all $\chi$ mod $q$ so long as $(n,q)=1$.
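The two facts about Gauss sums used here, the twisting identity and $|\tau(\chi)|=\sqrt{q}$, can be checked numerically for a small prime modulus. The sketch below builds the characters mod a prime from a primitive root; the modulus $q=7$ and the primitive root $g=3$ are assumptions chosen for illustration, and for a prime modulus every non-principal character is primitive.

```python
import cmath
import math

def e(t):
    """The additive character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * math.pi * t)

def characters_mod_prime(q, g):
    """All Dirichlet characters mod a prime q, given a primitive root g.

    chi_j(g^k) = e(j k / (q - 1)) and chi_j(0) = 0; j = 0 is principal.
    Each character is returned as a list indexed by residues 0..q-1.
    """
    index = {pow(g, k, q): k for k in range(q - 1)}
    return [[0j] + [e(j * index[a] / (q - 1)) for a in range(1, q)]
            for j in range(q - 1)]

def gauss_sum(chi, q, n=1):
    """Sum over a mod q of chi(a) e(a n / q); n = 1 gives tau(chi)."""
    return sum(chi[a] * e(a * n / q) for a in range(q))

q, g = 7, 3                      # 3 is a primitive root mod 7
for chi in characters_mod_prime(q, g)[1:]:   # skip the principal character
    tau = gauss_sum(chi, q)
    worst = max(abs(gauss_sum(chi, q, n) - chi[n].conjugate() * tau)
                for n in range(q))
    print(round(abs(tau) ** 2, 6), worst < 1e-9)
```

The check also covers $n\equiv0$ mod $q$, where both sides vanish, which is the content of the exercise that follows for prime $q$.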



Exercise 14.2. If χ is primitive, then in fact the above equality is true


for all n. That is, if n has factor in common with q, then the left hand
side is 0, and we have to check the right hand side is also zero so
long as χ is primitive.
For example, consider q a prime. Then, every character except the
principal character has right hand side evaluating to 0 when q | n.

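We have
\[
\sum_{n=M+1}^{M+N}a(n)\chi(n) = \frac{1}{\tau(\overline{\chi})}\sum_{a \bmod q}\overline{\chi}(a)
\sum_{n=M+1}^{M+N}a(n)e\left(\frac{an}{q}\right).
\]
Let
\[
S\left(\frac{a}{q}\right) := \sum_{n=M+1}^{M+N}a(n)e\left(\frac{an}{q}\right).
\]
We wanted to bound
\[
\begin{aligned}
\sum^{*}_{\chi \bmod q}\left|\sum a(n)\chi(n)\right|^2
&=\frac{1}{q}\sum^{*}_{\chi \bmod q}\left|\sum_{a \bmod q}\overline{\chi}(a)S\left(\frac{a}{q}\right)\right|^2\\
&\le\frac{1}{q}\sum_{\chi \bmod q}\left|\sum_{a \bmod q}\overline{\chi}(a)S\left(\frac{a}{q}\right)\right|^2\\
&=\frac{\phi(q)}{q}\sum^{*}_{a \bmod q}\left|S\left(\frac{a}{q}\right)\right|^2,
\end{aligned}
\]
using that $|\tau(\chi)|=\sqrt{q}$.
So, using the above and the large sieve,
\[
\sum_{q\le Q}\frac{q}{\phi(q)}\sum^{*}_{\chi \bmod q}\left|\sum_{n=M+1}^{M+N}a(n)\chi(n)\right|^2
\le \sum_{q\le Q}\ \sum^{*}_{a \bmod q}\left|S\left(\frac{a}{q}\right)\right|^2
\le \left(N+O(Q^2)\right)\sum_{n=M+1}^{M+N}|a(n)|^2.
\]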

Remark 14.3. The idea is that we are estimating some quantity on


average, and one term is very bad and the rest of the terms have
square-root cancellation.

14.2. Proving Bombieri Vinogradov. Let $Q=\frac{\sqrt{x}}{(\log x)^B}$. Our goal is
to bound
\[
\sum_{q\le Q}\max_{(a,q)=1}\left|\Psi(x;q,a)-\frac{x}{\phi(q)}\right| \ll \frac{x}{(\log x)^A}.
\]
In our original statement, we also had a maximum over $y$ up to $x$,
which we will forget about, as it is not so important.
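Recall
\[
\begin{aligned}
\Psi(x;q,a) &= \frac{1}{\phi(q)}\sum_{\chi}\overline{\chi}(a)\Psi(x,\chi)\\
&=\frac{1}{\phi(q)}\sum_{\chi\ne\chi_0}\overline{\chi}(a)\Psi(x,\chi)+\frac{x}{\phi(q)}
+O\left(\frac{x}{\phi(q)(\log x)^{A+100}}\right)
\end{aligned}
\]
and this error term is bounded using
\[
\sum_{q\le Q}\frac{1}{\phi(q)} \ll \log x.
\]
Here we are using
\[
1_{n\equiv a \bmod q} = \frac{1}{\phi(q)}\sum_{\chi}\overline{\chi}(a)\chi(n)
\]
and so
\[
\Psi(x;q,a) = \frac{1}{\phi(q)}\sum_{\chi}\overline{\chi}(a)\sum_{n\le x}\Lambda(n)\chi(n).
\]
Then, up to the error term above, we get
\[
\max_{(a,q)=1}\left|\Psi(x;q,a)-\frac{x}{\phi(q)}\right| \le
\frac{1}{\phi(q)}\sum_{\chi\ne\chi_0}|\Psi(x,\chi)|.
\]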

Suppose $\chi$ mod $q$ is induced by some primitive character $\tilde\chi$ mod $\tilde q$.
We’ll assume $\tilde q>1$ so the principal character does not show up, and
then $\tilde q\mid q$. Then,
\[
\sum_{q\le Q}\frac{1}{\phi(q)}\sum_{\chi \bmod q,\ \chi\ne\chi_0}|\Psi(x,\chi)|
=\sum_{1<\tilde q\le Q}\ \sum^{*}_{\tilde\chi \bmod \tilde q}\ \sum_{q\le Q,\ \tilde q\mid q}\frac{1}{\phi(q)}|\Psi(x,\chi)|.
\]
We have $\chi(n)=\tilde\chi(n)$ if $(n,q)=1$. If $(n,q)>1$ but $(n,\tilde q)=1$ then
the two could be different.

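We have the bound
\[
|\Psi(x,\chi)-\Psi(x,\tilde\chi)| \le \sum_{\substack{n\le x\\ (n,q)>1\\ (n,\tilde q)=1}}\Lambda(n)
\ll \log x\cdot\#\{p\mid q : p\nmid\tilde q\} \ll (\log x)^2.
\]
Exercise 14.4. Show that we can bound
\[
\sum_{1<\tilde q\le Q}\ \sum^{*}_{\tilde\chi \bmod \tilde q}\ \sum_{q\le Q,\ \tilde q\mid q}\frac{1}{\phi(q)}|\Psi(x,\chi)|
\]
by
\[
\sum_{1<\tilde q\le Q}\ \sum^{*}_{\tilde\chi \bmod \tilde q}\ \sum_{q\le Q,\ \tilde q\mid q}\frac{1}{\phi(q)}|\Psi(x,\tilde\chi)|
+O\left(Q(\log x)^3\right) \tag{14.1}
\]
using the bound on the difference between $\Psi(x,\chi)$ and $\Psi(x,\tilde\chi)$ above,
and so it suffices to bound Equation 14.1.
The point is that the difference is bounded by $(\log x)^2$, the number
of characters is $\phi(\tilde q)$, which cancels out, and we get a factor of $Q$ and
three factors of log.
Then, bound the innermost sum by
\[
\sum_{r\le Q/\tilde q}\frac{1}{\phi(\tilde qr)} \ll \frac{1}{\phi(\tilde q)}\log x
\ll \frac{1}{\tilde q}(\log x)^2.
\]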
Warning 14.5. Now, we replace $\tilde q$ by $q$ to avoid writing lots of tildes.
Let’s break $q\le Q$ into dyadic blocks
\[
R\le q\le 2R
\]
with $R\le Q$. There are on the order of $\log x$ such blocks. We can
bound Equation 14.1 by
\[
(\log x)^3\max_{R\le Q}\frac{1}{R}\sum_{R\le q\le 2R}\ \sum^{*}_{\chi \bmod q}|\Psi(x,\chi)|. \tag{14.2}
\]
We now have two cases.
(1) $R$ is small, so $R\le(\log x)^{10A}$.
(2) $R$ is large (here we will use the large sieve).

Remark 14.6. We’d like to bound
\[
\sum_{q\le\sqrt{x}/(\log x)^B}\max_{a}|E(x;q,a)| \ll \frac{x}{(\log x)^A}.
\]
Even if we are only interested in $q\ge x^{1/3}$, we still will have to
deal with small moduli because of imprimitive characters. We could
avoid dealing with small moduli if we only sum over primes.
Exercise 14.7. Work out a Bombieri Vinogradov theorem in the range
$x^{1/3}$ to $Q=\frac{\sqrt{x}}{(\log x)^B}$ for integers, all of whose prime factors are bigger
than $x^{1/10}$.
14.3. The case R is small. Here we can use Siegel zeros and what
we know about zero-free regions. If $q\le(\log x)^{10A}$ then
\[
|\Psi(x,\chi)| \ll \frac{x}{(\log x)^{100A+100}}
\]
using Siegel’s theorem.
These easily yield a bound of Equation 14.2 by
\[
\frac{x}{(\log x)^{10A+10}}.
\]

14.4. The case R is large. Now, let’s deal with the range
\[
(\log x)^{10A} \le R \le \frac{\sqrt{x}}{(\log x)^B}.
\]
We now use the trick of decomposing $\Lambda(n)$ via Vaughan’s identity.
We have
\[
\sum\Lambda(n)e(n\alpha),
\]
which by Vaughan’s identity yields a good bound in terms of
\[
\sum_m\sum_na_mb_ne(mn\alpha).
\]
There is an issue that if we only wanted to estimate $\sum_n\Lambda(n)\chi(n)$
for one character $\chi$ we could get something like $\sum_{m,n}a_mb_n\chi(m)\chi(n)$.
But we’re only trying to bound an average over all characters with modulus $q$ ranging
around $R$. The idea is now to write down Vaughan’s identity and then use the
large sieve.

Recall Vaughan’s identity says that for
\[
P(s) = \sum_{n\le U}\frac{\Lambda(n)}{n^s},\qquad
M(s) = \sum_{n\le V}\frac{\mu(n)}{n^s}
\]
we have
\[
-\frac{\zeta'}{\zeta}(s) = \left(-\frac{\zeta'}{\zeta}(s)-P(s)\right)\left(1-\zeta(s)M(s)\right)
+P(s)-\zeta'(s)M(s)-\zeta(s)M(s)P(s)
\]
with the first term on the right a type 2 sum and the latter three terms
type 1 sums.
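Comparing Dirichlet coefficients on both sides turns the identity into a statement about $\Lambda(n)$ that can be checked by machine. The sketch below does this for small $n$; the cutoffs $U=V=5$ and the helper names are illustrative assumptions, and the divisor loops are deliberately naive.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(n):
    """log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n and n % p:
        p += 1
    if n % p:                 # no factor up to sqrt(n): n itself is prime
        return math.log(n)
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def vaughan_terms(n, U, V):
    """Coefficients at n of the four terms on the right of the identity."""
    divs = [d for d in range(1, n + 1) if n % d == 0]
    # (-zeta'/zeta - P)(1 - zeta M): Lambda(m) for m > U times the
    # coefficient of 1 - zeta M at k = n / m.
    type2 = sum(
        von_mangoldt(m) * ((n // m == 1)
                           - sum(mobius(d) for d in range(1, V + 1)
                                 if (n // m) % d == 0))
        for m in divs if m > U)
    p_term = von_mangoldt(n) if n <= U else 0.0          # P
    dzeta_m = sum(mobius(d) * math.log(n // d)           # -zeta' M
                  for d in divs if d <= V)
    zeta_m_p = -sum(mobius(d) * von_mangoldt(m)          # -zeta M P
                    for d in divs if d <= V
                    for m in divs if m <= U and (n // d) % m == 0)
    return type2, p_term, dzeta_m, zeta_m_p

U = V = 5
max_error = max(abs(sum(vaughan_terms(n, U, V)) - von_mangoldt(n))
                for n in range(1, 200))
print(max_error)
```

The printed maximum discrepancy is at the level of floating point rounding, since the identity is purely formal.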
We’d now like to try to bound what all these terms give us.
First, let’s consider the type 2 sum.

14.5. Bounding the type 2 sum. Recall we are trying to bound some
sum of terms of the form $\sum_n\Lambda(n)\chi(n)$.
Expanding $\Lambda(n)$ using Vaughan’s identity, we get some bound of
the form
\[
\sum_{R\le q\le 2R}\ \sum^{*}_{\chi \bmod q}\left|\sum_{m>U}\ \sum_{n>V,\ mn\le x}\Lambda(m)
\left(\sum_{d\mid n,\ d>V}\mu(d)\right)\chi(m)\chi(n)\right|.
\]
But note that the term
\[
\sum_{d\mid n,\ d>V}\mu(d)
\]
is bounded by $d(n)$, so we can essentially ignore this.

Exercise 14.8. Use a Perron type integral to separate the variables $m$
and $n$. Then, group them into dyadic blocks with $M\le m\le 2M$, $N\le n\le 2N$
with the conditions $M\ge U$, $N\ge V$, $MN\ll x$ to remove the
dependence $mn\le x$.
Then, using the above, show
\[
\begin{aligned}
&\sum_{R\le q\le 2R}\ \sum^{*}_{\chi \bmod q}\left|\sum_{m>U}\ \sum_{n>V,\ mn\le x}\Lambda(m)
\left(\sum_{d\mid n,\ d>V}\mu(d)\right)\chi(m)\chi(n)\right|\\
&\qquad\ll (\log x)^3\sum_{q\sim R}\ \sum^{*}_{\chi \bmod q}
\left|\sum_{m\sim M}\Lambda(m)\chi(m)\right|\left|\sum_{n\sim N}a(n)\chi(n)\right|.
\end{aligned}
\]

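We now use Cauchy-Schwarz and $\sum_{m\sim M}\Lambda(m)^2\ll M\log M$,
$\sum_{n\sim N}d(n)^2\ll N(\log N)^3$ to obtain
\[
\begin{aligned}
&(\log x)^3\sum_{q\sim R}\ \sum^{*}_{\chi \bmod q}
\left|\sum_{m\sim M}\Lambda(m)\chi(m)\right|\left|\sum_{n\sim N}a(n)\chi(n)\right|\\
&\qquad\ll (\log x)^3\left(\sum_q\sum^{*}_{\chi}\left|\sum_{m\sim M}\Lambda(m)\chi(m)\right|^2\right)^{1/2}
\left(\sum_q\sum^{*}_{\chi}\left|\sum_{n\sim N}a(n)\chi(n)\right|^2\right)^{1/2}\\
&\qquad\ll (\log x)^3\left(\left(M+R^2\right)\sum_{m\sim M}\Lambda(m)^2\right)^{1/2}
\left(\left(N+R^2\right)\sum_{n\sim N}d(n)^2\right)^{1/2}\\
&\qquad\ll (\log x)^5\left\{M^2+MR^2\right\}^{1/2}\left\{N^2+NR^2\right\}^{1/2}\\
&\qquad\ll (\log x)^5\left(MN+\frac{MN}{\sqrt{M}}R+\frac{MN}{\sqrt{N}}R+\sqrt{MN}R^2\right)\\
&\qquad\ll (\log x)^5\left(x+\frac{xR}{\sqrt{U}}+\frac{xR}{\sqrt{V}}+\sqrt{x}R^2\right).
\end{aligned}
\]
Then, taking $U=V=x^{1/10}$, we see
\[
\begin{aligned}
&\max_{(\log x)^{10A}\le R\le\sqrt{x}/(\log x)^B}\frac{1}{R}(\log x)^5
\left(x+\frac{xR}{\sqrt{U}}+\frac{xR}{\sqrt{V}}+\sqrt{x}R^2\right)\\
&\qquad\ll \frac{x}{(\log x)^A}+x(\log x)^5\left(\frac{1}{\sqrt{U}}+\frac{1}{\sqrt{V}}\right)
+\frac{x(\log x)^5}{(\log x)^B}.
\end{aligned}
\]
For $B>A+10$ or so we can bound all the terms by $x/(\log x)^A$.
So, this completes the type 2 sum case.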
It only remains to deal with the type 1 sum case. We’ll do the
trivial type 1 sum, which comes from $P(s)$. This is
\[
\sum_{n\le U}\frac{\Lambda(n)}{n^s}.
\]
We then have to bound
\[
\frac{1}{R}\sum_{q\sim R}\ \sum^{*}_{\chi \bmod q}\left|\sum_{n\le U}\Lambda(n)\chi(n)\right|
\ll UR \ll U\sqrt{x}.
\]
Since $U$ is small, around $x^{1/10}$, we have bounded this sum.



Next time we’ll deal with the other two terms. Let’s just give an
idea of how to deal with one of them now. We’re trying to bound the
term corresponding to
\[
\zeta(s)M(s)P(s).
\]
We want to bound
\[
\sum_{q\sim R}\ \sum^{*}_{\chi \bmod q}\left|\sum_{m\le U,\ n\le V}\Lambda(m)\mu(n)
\sum_{k\le x/mn}\chi(k)\chi(m)\chi(n)\right|.
\]
We then have the problem of evaluating
\[
\sum_{k\le x/mn}\chi(k),
\]
which is certainly bounded by $q$, and we’d like to even improve this
a bit to $\sqrt{q}$, plug it in, and take trivial estimates on everything.

15. 11/14/17
15.1. Brun-Titchmarsh. Recall a few classes ago, we were trying to
bound $\pi(M+N)-\pi(M)$. We computed
\[
\sum^{*}_{a \bmod q}\ \sum_{M+1\le p\le M+N}e\left(\frac{ap}{q}\right)
=\mu(q)\left(\pi(M+N)-\pi(M)\right).
\]
We used Cauchy-Schwarz to bound
\[
\mu(q)^2\left(\pi(M+N)-\pi(M)\right)^2 \le
\left(\sum^{*}_{a \bmod q}1\right)
\left(\sum^{*}_{a \bmod q}\left|\sum_{M+1\le p\le M+N}e\left(\frac{ap}{q}\right)\right|^2\right).
\]
One then gets a bound to which one can now apply the large sieve.

15.2. Back to Bombieri Vinogradov. Recall we have reduced the proof
to bounding
\[
(\log x)^3\max_{R\le\sqrt{x}/(\log x)^B}\frac{1}{R}\sum_{R\le q\le 2R}\ \sum^{*}_{\chi \bmod q}|\Psi(x;\chi)|.
\]
We had two ranges. If $R$ is small, like $R\le(\log x)^{10A}$, we could use our
bounds for $|\Psi(x,\chi)|$. To conclude, we only needed to deal with
\[
(\log x)^{10A} \le R \le \frac{\sqrt{x}}{(\log x)^B}
\]
using Vaughan’s identity and the large sieve.

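Recall
\[
\left(-\frac{\zeta'}{\zeta}(s)-P(s)\right)\left(1-\zeta(s)M(s)\right)
=-\frac{\zeta'}{\zeta}(s)-P(s)+\zeta'(s)M(s)+\zeta(s)M(s)P(s)
\]
with
\[
P = \sum_{n\le U}\frac{\Lambda(n)}{n^s},\qquad
M = \sum_{n\le V}\frac{\mu(n)}{n^s}.
\]
We were able to bound the type 2 sum by
\[
\ll (\log x)^5\left(\frac{x}{R}+\frac{x}{\sqrt{U}}+\frac{x}{\sqrt{V}}+\sqrt{x}R\right)
\ll \frac{x}{(\log x)^A}.
\]
We also bounded $P$ by $UR\ll x^{.6}$. Next, we bound the type 1 sum
$\zeta MP$ given by
\[
\frac{1}{R}\sum_{R\le q\le 2R}\ \sum^{*}_{\chi \bmod q}\left|\sum_{m\le U,\ n\le V}\Lambda(m)\mu(n)
\sum_{k\le x/mn}\chi(kmn)\right|.
\]
We’ll also bound this crudely, forgetting about the sums over $m$ and
$n$, and get cancellation from the sum over $k$.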

15.3. Polya Vinogradov theorem. We’ll prove the Polya Vinogradov
theorem:
Theorem 15.1 (Polya-Vinogradov Theorem). Suppose $\chi$ mod $q$ is
primitive. Then,
\[
\max_x\left|\sum_{n\le x}\chi(n)\right| \ll \sqrt{q}\log q.
\]
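For a concrete feel for the theorem, one can compute the largest partial sum of the Legendre symbol mod a prime $q$, which is a primitive character. The sketch below compares it with $\sqrt{q}\log q$; the sample primes are arbitrary choices, and the comparison uses implied constant 1, which is an assumption made for illustration.

```python
import math

def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    r = pow(n, (q - 1) // 2, q)
    return -1 if r == q - 1 else r      # r is 0, 1 or q - 1

def max_partial_sum(q):
    """Max over x of |sum_{n <= x} (n/q)|; one full period suffices."""
    best = running = 0
    for n in range(1, q + 1):
        running += legendre(n, q)
        best = max(best, abs(running))
    return best

for q in (101, 1009, 10007):
    print(q, max_partial_sum(q), round(math.sqrt(q) * math.log(q), 1))
```

The partial sums are much smaller than the trivial bound $q$, which is the cancellation exploited in the type 1 estimates.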

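Remark 15.2. Assuming Polya Vinogradov, we can bound the type
1 sum $\zeta MP$ by
\[
\frac{1}{R}\sum_{R\le q\le 2R}\ \sum^{*}_{\chi \bmod q}\left|\sum_{m\le U,\ n\le V}\Lambda(m)\mu(n)
\sum_{k\le x/mn}\chi(kmn)\right|
\ll \frac{1}{R}R^2\,UV\sqrt{R}\log R \ll x^{1-\varepsilon}.
\]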

Exercise 15.3. Show
\[
\sum_{n\le x}\chi(n)\log n \ll \sqrt{q}(\log q)(\log x)
\]
using partial summation. Hint: Consider
\[
\int_1^x\left(\sum_{n\le t}\chi(n)\right)\frac{dt}{t}
\]
and obtain a $\log n$ from this integral.

Exercise 15.4. Bound $\zeta'(s)M(s)$ in a similar way using the previous
exercise. In fact, one can bound this term by
\[
\zeta'(s)M(s) \ll \frac{1}{R}R^2\,V\sqrt{R}(\log R)(\log x).
\]
It only remains to prove the Polya-Vinogradov Theorem.
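Proof. The idea is to rewrite the character $\chi$ in terms of additive
characters.
\[
\sum_{a \bmod q}\chi(a)e\left(\frac{an}{q}\right)
=\overline{\chi}(n)\sum_{a \bmod q}\chi(a)\chi(n)e\left(\frac{an}{q}\right)
=\tau(\chi)\overline{\chi}(n).
\]
This yields
\[
\chi(n) = \frac{1}{\tau(\overline{\chi})}\sum_{a \bmod q}\overline{\chi}(a)e\left(\frac{an}{q}\right).
\]
Therefore, we have
\[
\sum_{n\le x}\chi(n) = \frac{1}{\tau(\overline{\chi})}\sum_{a \bmod q}\overline{\chi}(a)
\sum_{n\le x}e\left(\frac{an}{q}\right).
\]
Summing
\[
\sum_{n\le x}e\left(\frac{an}{q}\right)
\]
as a geometric progression, we have
\[
\left|\sum_{n\le x}e\left(\frac{an}{q}\right)\right| \le \min\left(x,\frac{1}{\left\|\frac{a}{q}\right\|}\right).
\]
We also have
\[
\frac{1}{|\tau(\overline{\chi})|} = O\left(\frac{1}{\sqrt{q}}\right).
\]
We know $\left|\sum_{n\le x}\chi(n)\right|\le x$. Say $-q/2\le a\le q/2$. Then, $\left\|\frac{a}{q}\right\|\sim\frac{|a|}{q}$
and we use the bound $x$ if $|a|\le q/x$ and the bound $q/|a|$ if $|a|>q/x$.
Therefore, we obtain a bound
\[
\sum_{n\le x}\chi(n) = \frac{1}{\tau(\overline{\chi})}\sum_{a \bmod q}\overline{\chi}(a)
\sum_{n\le x}e\left(\frac{an}{q}\right)
\ll \frac{1}{\sqrt{q}}(q\log q) = \sqrt{q}\log q. \qquad □
\]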

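Remark 15.5. Here is an alternate heuristic explanation of Polya-
Vinogradov. We have
\[
\begin{aligned}
\sum_n\chi(n)\Phi\left(\frac{n}{x}\right)
&=\frac{1}{\tau(\overline{\chi})}\sum_{a \bmod q}\overline{\chi}(a)\sum_n\Phi\left(\frac{n}{x}\right)e\left(\frac{an}{q}\right)\\
&=\frac{1}{\tau(\overline{\chi})}\sum_{a \bmod q}\overline{\chi}(a)\sum_kx\hat\Phi\left(x\left(k+\frac{a}{q}\right)\right)\\
&=\frac{1}{\tau(\overline{\chi})}\sum_{a \bmod q}\overline{\chi}(a)\sum_kx\hat\Phi\left(x\,\frac{kq+a}{q}\right)\\
&=\frac{1}{\tau(\overline{\chi})}\sum_m\overline{\chi}(m)\,x\hat\Phi\left(\frac{xm}{q}\right),
\end{aligned}
\]
where we let $m=kq+a$. So the left hand side is a sum over $\chi$ of
size $x$ and the right hand side is a sum over $\overline{\chi}$ of size $q/x$. This is an
involution. This explains why Polya Vinogradov holds. If $x<\sqrt{q}$,
we get a bound by $\sqrt{q}$. If not, do the flip and bound the right
hand side trivially, which gives a bound $\frac{x}{\sqrt{q}}\cdot\frac{q}{x}\ll\sqrt{q}$.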
 
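Say we want to understand $L\left(\frac12,\chi\right)$. We have
\[
\begin{aligned}
L\left(\frac12,\chi\right) &= \sum_{n=1}^{\infty}\frac{\chi(n)}{\sqrt{n}}\\
&=\int_1^{\infty}\frac{1}{\sqrt{y}}\,d\left(\sum_{n\le y}\chi(n)\right)\\
&=\frac12\int_1^{\infty}\frac{1}{y^{3/2}}\left(\sum_{n\le y}\chi(n)\right)dy
\end{aligned}
\]
where we can bound
\[
\sum_{n\le y}\chi(n) \ll \min\left(y,\sqrt{q}\log q\right)
\]
using Polya Vinogradov.

Exercise 15.6. Show the above is bounded by
\[
\ll q^{1/4}\log q.
\]
The kind of argument we were discussing in Remark 15.5 yields
\[
L\left(\frac12,\chi\right) = \sum_{n\le\sqrt{q}}\frac{\chi(n)}{\sqrt{n}}
+\varepsilon(\chi)\sum_{n\le\sqrt{q}}\frac{\overline{\chi}(n)}{\sqrt{n}}
\]
where $\varepsilon(\chi)$ is a complex number of size 1.
So, $q^{1/4}\log q$ is called the convexity bound for $L\left(\frac12,\chi\right)$.
The Riemann hypothesis implies the Lindelöf hypothesis, which
implies
\[
L\left(\frac12,\chi\right) \ll q^{\varepsilon}
\]
for any $\varepsilon>0$.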
In fact, we can slightly improve the above bound.
Theorem 15.7 (Burgess). We have
\[
L\left(\frac12,\chi\right) \ll q^{3/16+\varepsilon}
\]
for $q$ cube free.
If $\chi$ is quadratic, there is an even better result:
Theorem 15.8 (Conrey and Iwaniec). For $\chi$ quadratic,
\[
L\left(\frac12,\chi\right) \ll q^{1/6+\varepsilon}.
\]
Burgess has a result saying that if $q$ is cubefree, then
\[
\sum_{n\le x}\chi(n) = o(x)
\]
if $x\ge q^{1/4+\varepsilon}$.
Exercise 15.9. If $\chi$ is quadratic mod $q$ for $q$ a prime, then the
least quadratic non-residue (lqnr) mod $q$ is at most $q^{1/2}\log q$. Gauss
showed lqnr $\le q^{1/2}$. A trick of Vinogradov allows you to save a
factor of $\sqrt{e}$ in the exponent, and Polya Vinogradov yields
\[
\text{lqnr} \le q^{1/(2\sqrt{e})+\varepsilon}.
\]
Burgess implies
\[
\text{lqnr} \le q^{1/(4\sqrt{e})+\varepsilon}.
\]
15.4. Some more extended exercises.
Exercise 15.10 (Difficult, theorem of Goldfeld).
Question 15.11 (Open question, Sophie Germain). Are there
infinitely many primes $p$ with $p-1=2q$ with $q$ a prime?
A weakening of the above statement would be to find primes $p$ so
that $p-1$ has a large prime factor.
Let $P(x)$ be the largest prime dividing $x$. The exercise is to show that there are
lots of primes $p\le x$ so that
\[
P(p-1) \ge x^{1/2+\delta}
\]
for some $\delta>0$. Hint: The point of this exercise is to use Bombieri
Vinogradov. Here is the idea of the proof. Start from
\[
\sum_{p\le x}\ \sum_{q\mid p-1}\Lambda(q) \sim x,
\]
which holds because the inner sum is $\log(p-1)$.

Suppose $q\le x^{1/2+\delta}$ always. Let $Q=x^{1/2+\delta}$. We can now exchange
these two sums to obtain
\[
\sum_{q\le Q}\log q\,\left(\pi(x;q,1)\right).
\]
The range $q\le x^{1/2}/(\log x)^B$ can be handled by Bombieri Vinogradov.
So, we write
\[
\sum_{q\le Q}\log q\,\left(\pi(x;q,1)\right)
=\sum_{q\le x^{1/2}/(\log x)^B}\log q\,\left(\pi(x;q,1)\right)
+\sum_{x^{1/2}/(\log x)^B\le q\le Q}\log q\,\left(\pi(x;q,1)\right)
\]
and the first term is asymptotic to
\[
\sum_{q\le x^{1/2}/(\log x)^B}(\log q)\frac{\pi(x)}{\phi(q)} \sim x/2,
\]
where we can bound the error term by Bombieri Vinogradov. So,
there must be a large contribution from the second term. We don’t
know how to control $\pi(x;q,1)$ in this range. But we do have an upper
bound from Brun-Titchmarsh:
\[
\pi(x;q,1) \le \frac{2x(1+o(1))}{\phi(q)\log(x/q)}.
\]
Use this to show that the second term is at most $.49x$ if $\delta\le.01$. This
finishes the proof because then these two terms cannot add up to be
as big as $x$. This would yield a contradiction.
Remark 15.12. The above theorem was published under Morris Gold-
feld, but this is the same person as Dorian Goldfeld, who changed
his name after he published this.
Exercise 15.13 (Difficult exercise, Titchmarsh divisor problem). Pick
a number at random and say its largest prime factor. “Does p − 1
look in some ways like a random number?”
Prove the following lemma.
Lemma 15.14. We have
\[
\sum_{n\le x}d(n) \sim x\log x
\]
and
\[
\sum_{p\le x}d(p-1) \sim Cx
\]

for some $C>0$.

Sketch of proof. We know
\[
d(n) = 2\sum_{d\mid n,\ d\le\sqrt{n}}1
\]
(up to a correction when $n$ is a square) and so it suffices to bound
\[
\sum_{p\le x}\ \sum_{d\le\sqrt{x},\ d\mid p-1}1 = \sum_{d\le\sqrt{x}}\pi(x;d,1).
\]
One can solve this problem by combining it with Bombieri Vinogradov
in the range $d\le\sqrt{x}/(\log x)^B$. For the small region $\sqrt{x}/(\log x)^B\le d\le\sqrt{x}$,
try to bound $\pi(x;d,1)$ by Brun-Titchmarsh and hope it
becomes an error term. □
Question 15.15 (Open problem). Let $d_3(n)$ be defined by
\[
\zeta(s)^3 = \sum_{n=1}^{\infty}\frac{d_3(n)}{n^s},
\]
so $d_3(n)$ is the number of ways of writing $n=abc$. Bound
\[
\sum_{p\le x}d_3(p-1).
\]
One can keep track of small prime factors, but occasionally $p-1$ might
have very large prime factors.
Conjecture 15.16 (Montgomery’s conjecture). We have
\[
\Psi(x;q,a) = \frac{\Psi(x)}{\phi(q)}+O\left(\frac{x^{1/2+\varepsilon}}{\sqrt{q}}\right).
\]
Remark 15.17. The reasoning behind this is that the error term is
approximately
\[
\frac{1}{\phi(q)}\sum_{\chi \bmod q,\ \chi\ne\chi_0}\overline{\chi}(a)\Psi(x,\chi)
\]
and we can bound $\Psi(x,\chi)$ by $x^{1/2}$ and then get some cancellation in
the sums of the characters.
Montgomery’s conjecture would imply the Elliott-Halberstam
conjecture.

Conjecture 15.18 (Elliott-Halberstam conjecture). We have
\[
\sum_{q\le x^{1-\varepsilon}}\max_{(a,q)=1}\left|\Psi(x;q,a)-\frac{\Psi(x)}{\phi(q)}\right|
\ll \frac{x}{(\log x)^A}
\]
for any $A>0$.

16. 11/16/17
Today we’ll begin a discussion of gaps between primes. Let $p_n$ be
the $n$th prime. By the prime number theorem, $p_n\sim n\log n$. Hence, on
average, $p_{n+1}-p_n\sim\log n$.
Question 16.1 (Open question). What can we say about the
distribution of
\[
\frac{p_{n+1}-p_n}{\log n}
\]
as $n$ varies?
To make sense of this question, we can pick an interval $(\alpha,\beta)\subset\mathbb{R}_{>0}$
and ask about
\[
\lim_{N\to\infty}\frac{1}{N}\#\left\{n\le N : \frac{p_{n+1}-p_n}{\log p_n}\in(\alpha,\beta)\right\}.
\]
How might we guess this? There is a naive model called the Cramer
model. This is clearly bogus, but also reasonably successful. Define
the random variable $X_n$ by
\[
X_n := \begin{cases}
1 & \text{with probability } \frac{1}{\log n}\\
0 & \text{with probability } 1-\frac{1}{\log n}
\end{cases}
\]
for $n\ge3$.
Now, let’s count the probability
\[
\mathrm{Prob}\left(X_{n+1}=X_{n+2}=\cdots=X_{n+h-1}=0,\ X_{n+h}=1,\ \text{given } X_n=1\right).
\]
This is asking for the chance of a gap of size $h$.
To calculate this, thinking of $h$ as comparable in size to $\log n$, we see
this is approximately
\[
\left(1-\frac{1}{\log n}\right)^{h-1}\frac{1}{\log n} \sim \frac{1}{\log n}e^{-h/\log n}.
\]
Thinking of this a different way, looking at the interval $[n,n+h]$,
we can ask for the chance there are exactly $k$ values for which $X_m=1$.
We can also handle this quite easily. We can pick $k$ numbers to be
primes. Calculating using the binomial distribution, we see this chance
is
\[
\binom{h}{k}\left(\frac{1}{\log n}\right)^k\left(1-\frac{1}{\log n}\right)^{h-k}
\sim \left(\frac{h}{\log n}\right)^k\frac{1}{k!}e^{-h/\log n}.
\]

Example 16.2. So the guess one might obtain from this is that in the
interval
\[
[n,n+\log n]
\]
the chance of finding $k$ primes is about
\[
\frac{1}{k!}e^{-1}.
\]
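The passage from the binomial count to the Poisson shape can be checked numerically. The sketch below treats $\log n$ as a fixed large number across the window, which is an assumption of the toy model, and the specific value 10,000 and the choice $h=2\log n$ are arbitrary.

```python
import math

def cramer_count_probability(h, k, log_n):
    """Chance that exactly k of h independent trials succeed, each with
    success probability 1 / log n (held constant over the window)."""
    p = 1.0 / log_n
    return math.comb(h, k) * p**k * (1 - p) ** (h - k)

def poisson(lam, k):
    """Poisson probability of the value k with parameter lam."""
    return lam**k / math.factorial(k) * math.exp(-lam)

log_n = 10_000            # an artificially large stand-in for log n
h = 2 * log_n             # window of length h = lambda * log n, lambda = 2
for k in range(5):
    print(k, cramer_count_probability(h, k, log_n), poisson(h / log_n, k))
```

The two columns agree to several decimal places, which is the approximation used in the example with $\lambda=1$.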
So, we might make the following conjecture.
Conjecture 16.3. If $h=\lambda\log n$, then as $n\to\infty$ chosen randomly,
the number of primes in $[n,n+h]$ is Poisson with parameter $\lambda$.
That is,
\[
\frac{1}{N}\#\{n\le N : [n,n+h] \text{ contains } k \text{ primes}\} \sim \frac{\lambda^k}{k!}e^{-\lambda}.
\]
Remark 16.4. Saying this another way,
\[
\frac{1}{N}\#\left\{n\le N : \frac{p_{n+1}-p_n}{\log p_n}\in(\alpha,\beta)\right\}
\sim \int_{\alpha}^{\beta}e^{-x}\,dx.
\]

Example 16.5. One could ask the same sort of question about any
subset of the integers (or discrete subset of the real numbers). For
example, say we would like to know the spacing of the zeros of the
zeta function. Say they are of the form $1/2+i\gamma_n$ with
\[
\gamma_n \sim \frac{2\pi n}{\log n}.
\]
The normalized spacings satisfy
\[
\frac{\log n}{2\pi}\left(\gamma_{n+1}-\gamma_n\right) \sim 1
\]
on average. We can ask now about the distribution. These are not
expected to behave like a Poisson process.
Question 16.6. Why should we believe the above conjecture?
Well, there is in fact a better conjecture going back to Hardy and
Littlewood.

Definition 16.7. Let $\mathcal{H}=\{h_1,\ldots,h_k\}$ be a set of distinct integers.
The singular series of $\mathcal{H}$ is
\[
\mathfrak{S}(\mathcal{H}) := \prod_p\left(1-\frac{1}{p}\right)^{-k}\left(1-\frac{\nu_{\mathcal{H}}(p)}{p}\right).
\]
Remark 16.8. The singular series $\mathfrak{S}(\mathcal{H})$ is the constant in the expected asymptotic
\[
\sum_{n\le N}\Lambda(n+h_1)\cdots\Lambda(n+h_k) \sim \mathfrak{S}(\mathcal{H})N.
\]
Conjecture 16.9 (Hardy and Littlewood). Let $\mathcal{H}=\{h_1,\ldots,h_k\}$ be a
set of distinct integers. Then,
\[
\#\{n\le N : n+h_1,\ldots,n+h_k \text{ are all primes}\} \sim \mathfrak{S}(\mathcal{H})\frac{N}{(\log N)^k},
\]
with $\mathfrak{S}(\mathcal{H})$ the singular series of $\mathcal{H}$.
We next justify the above definition of singular series.
Remark 16.10. The Cramer model predicts
\[
\#\{n\le N : n+h_1,\ldots,n+h_k \text{ are all primes}\} \sim \frac{N}{(\log N)^k}.
\]
Exercise 16.11. Let n, n + 2 be both prime. We would like to conjec-
ture a value for the singular series S({0, 2}). We might expect an
approximation via the circle method like
! !
Z 1

0
∑ Λ(n)e(nα) ∑ Λ(m)e(−mα) e(2α)dα.
n≤ N m≤ N

Compute what the major arc contribution is. When one computes
this, one might have a guess as to what the answer should be. There
will be a major arc around 0 and a major arc around 1. Hint: Here,
take α close to qa with error roughly n1 . Consider
 
a
∑ Λ(n)e q n + nβ .
n

Assume Λ(n) behaves like log n from the prime number theorem,
and similarly put in an estimate from the prime number theorem on
arithmetic progressions. One should also check we get 0 as our main
term if we put e(α) instead of e(2α) above.

Exercise 16.12. Suppose $n, n + 2, n + 6$ are all primes. Then, we might want to compute
\[ \int_0^1 \Big( \sum_{n \le N} \Lambda(n) \Lambda(n + 2) e(n\alpha) \Big) \Big( \sum_{m \le N} \Lambda(m) e(-m\alpha) \Big) e(6\alpha)\, d\alpha. \]

Here again, compute an estimate for the major arcs.


16.1. A probabilistic argument for the distribution of primes. One
could also think probabilistically (which, according to Hardy and
Littlewood’s paper, is not a notion in mathematics but rather a no-
tion in physics or philosophy).
The idea is to add in by hand all the density information for any
prime p we have. That is, given a prime p, we ask
Question 16.13. What is the probability that $n + h_1, \dots, n + h_k$ are all coprime to $p$ for $n$ chosen randomly?
This is asking that $n$ not be in the classes $-h_1, \dots, -h_k \bmod p$. So, we need that $n$ does not lie in any of $\nu_{\mathcal{H}}(p)$ congruence classes mod $p$, where
\[ \nu_{\mathcal{H}}(p) = \#\{h_1 \bmod p, \dots, h_k \bmod p\}. \]
So, the probability that $n + h_1, \dots, n + h_k$ are all coprime to $p$ should be
\[ 1 - \frac{\nu_{\mathcal{H}}(p)}{p}. \]
We want to keep track of the difference between this probability and the Cramér model. The Cramér model only uses the fact that $k$ random numbers are all coprime to $p$ with probability $\left(1 - \frac{1}{p}\right)^k$.
Now, the guess is to take
\[ \mathfrak{S}(\mathcal{H}) := \prod_p \left(1 - \frac{1}{p}\right)^{-k} \left(1 - \frac{\nu_{\mathcal{H}}(p)}{p}\right). \]

If $p > \max(h_j)$ then $\nu_{\mathcal{H}}(p) = k$, so the factor at $p$ is $1 + O(1/p^2)$ for large $p$, and hence the product converges absolutely. This implies the above product is 0 if and only if one of the factors is equal to 0. This means there is some prime $p$ for which
\[ \nu_{\mathcal{H}}(p) = p, \]
that is, $\mathcal{H}$ occupies every residue class mod $p$.
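The quantities above are easy to experiment with. The following is a minimal sketch (the function names `nu`, `is_admissible` and `singular_series_truncated` are illustrative choices, not from the lecture) that computes $\nu_{\mathcal{H}}(p)$, tests whether any prime has $\nu_{\mathcal{H}}(p) = p$, and approximates the singular series by truncating the Euler product at an assumed bound.

```python
def primes_up_to(n):
    """Return all primes <= n using a simple sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def nu(H, p):
    """nu_H(p): number of distinct residue classes mod p occupied by H."""
    return len({h % p for h in H})


def is_admissible(H):
    """The singular series is nonzero iff nu_H(p) < p for every prime p.

    Only primes p <= len(H) can fail, because nu_H(p) <= len(H).
    """
    return all(nu(H, p) < p for p in primes_up_to(len(H)))


def singular_series_truncated(H, prime_bound):
    """Partial Euler product over p <= prime_bound (floating point)."""
    k = len(H)
    product = 1.0
    for p in primes_up_to(prime_bound):
        product *= (1 - 1 / p) ** (-k) * (1 - nu(H, p) / p)
    return product
```

For instance, `is_admissible([0, 2, 4])` is false because the set covers all classes mod 3, while `[0, 2, 6]` passes the test.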

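Remark 16.14. Hardy and Littlewood did not like this probabilistic argument because it assumes that the conditions at different primes are independent. That this fails can be seen by sieving by all primes up to $\sqrt{N}$: the independence heuristic would predict
\[ \pi(N) \sim N \prod_{p \le \sqrt{N}} \left(1 - \frac{1}{p}\right) \sim \frac{2 e^{-\gamma} N}{\log N}, \]
which is off from the prime number theorem by the factor $2e^{-\gamma} \approx 1.12$.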

Exercise 16.15. Show the above conjecture predicts the number of twin primes up to $N$ is roughly
\[ 1.32\, \frac{N}{(\log N)^2}. \]
That is, because $N, N + 1$ cannot both be prime, there is a little higher chance of $N$ and $N + 2$ being prime.
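Exercise 16.16 (Extended exercise, due to Gallagher). Show that
\[ \sum_{\substack{h_1, \dots, h_k \le H \\ h_i \text{ distinct}}} \mathfrak{S}(\{h_1, \dots, h_k\}) \sim \sum_{\substack{h_1, \dots, h_k \le H \\ h_i \text{ distinct}}} 1. \]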

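Exercise 16.17 (Lead in to previous exercise). Pick a prime $p$. Show
\[ \mathbb{E}\left[ 1 - \frac{\nu_{\mathcal{H}}(p)}{p} : \mathcal{H} \in \{0, \dots, p - 1\}^k \right] = \left(1 - \frac{1}{p}\right)^k, \]
where $\mathbb{E}$ denotes expected value. Hint: Use a sort of double counting argument.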
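The identity in the lead-in exercise can be checked exactly for small parameters before proving it. The sketch below (the name `expected_avoidance` is a hypothetical choice) averages $1 - \nu_{\mathcal{H}}(p)/p$ over all $k$-tuples of residues mod $p$, under the assumption that the tuple is ordered and repetitions are allowed.

```python
from fractions import Fraction
from itertools import product


def expected_avoidance(p, k):
    """Exact average of 1 - nu_H(p)/p over all k-tuples H in {0,...,p-1}^k."""
    total = Fraction(0)
    for H in product(range(p), repeat=k):
        # len(set(H)) is nu_H(p) when the entries are residues mod p
        total += 1 - Fraction(len(set(H)), p)
    return total / p ** k
```

Comparing `expected_avoidance(p, k)` with `(1 - Fraction(1, p)) ** k` for a few small values is a quick sanity check; it is not a substitute for the double counting argument.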
Let’s now construct some sets where the singular series is nonzero.

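Example 16.18. (1) Consider
\[ \{h_1, \dots, h_k\} := \{0, k!, 2\,k!, \dots, (k - 1)\,k!\}. \]
Then,
\[ \nu_{\mathcal{H}}(p) = \begin{cases} 1 & \text{if } p \le k \\ k & \text{if } p > k, \end{cases} \]
so $\nu_{\mathcal{H}}(p) < p$ for every prime $p$.
(2) Take $k$ distinct primes all larger than $k$ for $\mathcal{H}$. This has a nonzero singular series.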
Question 16.19. What is the largest set in $[1, 10^7]$ which is admissible (i.e., has a nonzero singular series)?
Question 16.20. If we find such a large set, what is it good for?
Say the set in $[1, 10^7]$ has size $k$. If we do find such a large set, then the Hardy-Littlewood conjecture (Conjecture 16.9) implies there are intervals $[n, n + 10^7]$ with at least $k$ primes. On the other hand, Hardy and Littlewood's other conjecture, $\pi(x + y) \le \pi(x) + \pi(y)$, would imply that the number of primes up to $10^7$ is an upper bound for the number of primes in $[n, n + 10^7]$.

Remark 16.21. Hensley and Richards studied this problem computationally (there is a nice paper by Richards in the Bulletin of the AMS in 1974, which is also a nice historical document on computing). For example, if $y = 20{,}000$ one can construct an admissible set in an interval of length $20{,}000$ with more than $\pi(y)$ elements. They showed that Hardy and Littlewood's conjecture above contradicts Hardy and Littlewood's conjecture that $\pi(x + y) \le \pi(x) + \pi(y)$.
Remark 16.22. The above is really a problem in sieve theory. In more
detail, given an interval [1, y], for each small prime p, remove one
progression a p mod p. The aim is to keep as many numbers as pos-
sible. Stop once the prime exceeds the number of remaining integers.
For example, we start at 2, and remove either 0 or 1 mod 2. Then, go to 3, and remove some residue mod 3. Then, we stop once there are fewer integers left than the prime we have reached. This is sort of like a greedy algorithm.
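The procedure just described can be sketched in a few lines. This is only one illustrative strategy, assuming we always remove the residue class containing the fewest surviving integers; the function name `greedy_admissible_subset` is hypothetical, and nothing here is claimed to be optimal or to be the method of Hensley and Richards.

```python
from collections import Counter


def primes():
    """Yield 2, 3, 5, ... by trial division (fine for small inputs)."""
    n = 2
    while True:
        if all(n % q for q in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1


def greedy_admissible_subset(y):
    """Sieve [1, y]: for each prime p remove one residue class mod p.

    Greedy assumption: remove the class with the fewest survivors.
    Stop once the prime exceeds the number of remaining integers.
    """
    survivors = list(range(1, y + 1))
    for p in primes():
        if p > len(survivors):
            break
        counts = Counter(n % p for n in survivors)
        # residue class (possibly already empty) with the fewest survivors
        a_p = min(range(p), key=lambda a: counts.get(a, 0))
        survivors = [n for n in survivors if n % p != a_p]
    return survivors
```

The output is admissible: every prime that was processed has an empty class, and every larger prime exceeds the size of the set.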
Remark 16.23. Here is another variant: Consider the interval [1, y].
For each prime p ≤ z, remove one progression mod p until nothing
is left. How small can one make z?
We did prove something interesting about Remark 16.22 using the
large sieve. Essentially, we get an upper bound, that the number is
always at most 2π (y), as follows from the large sieve.
Remark 16.24. One could use the above problems to show
\[ \limsup_{n \to \infty} \frac{p_{n+1} - p_n}{\log n} = \infty. \]
For a while, the best result was
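Theorem 16.25 (Erdős-Rankin). There is a constant $c > 0$ such that, for infinitely many $n$,
\[ p_{n+1} - p_n \ge \frac{c \log p_n \, \log_2 p_n}{(\log_3 p_n)^2} \log_4 p_n, \]
where $\log_j$ denotes the $j$th iterated logarithm, $\log_j = \log \log_{j-1}$.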
But, in 2014, there were some improvements:
Theorem 16.26 (Ford, Green, Konyagin, Tao, Maynard). For infinitely many $n$, one can bound
\[ p_{n+1} - p_n \ge \frac{c \log p_n \, \log_2 p_n}{\log_3 p_n} \log_4 p_n. \]
The other side of the problem is to ask whether
\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = 0. \]
In fact, this was shown by Goldston, Pintz, and Yildirim in 2005. However, their method did not show
\[ \liminf_{n \to \infty} \frac{p_{n+2} - p_n}{\log p_n} = 0. \]
This was the basis for an amazing result of Zhang in 2013 yielding bounded gaps between primes:
\[ \liminf_{n \to \infty} \, (p_{n+1} - p_n) < 7 \cdot 10^7. \]
The main ingredient was Zhang's version of Bombieri-Vinogradov: Let $a \ne 0$ be any integer. Then,
\[ \sum_{\substack{q \le x^{1/2 + \delta} \\ p \mid q \implies p \le x^{\delta}}} |E(x; q, a)| \ll \frac{x}{(\log x)^A}. \]
Using this and the work of Goldston Pintz and Yildirim, he was
able to get bounded gaps. However, in the same year, Maynard and
Tao showed that instead of getting two primes, one could get many
primes in a bounded interval. Further, one could use the original
version of Bombieri Vinogradov instead of Zhang’s variant.
Theorem 16.27 (Maynard, and independently, Tao). For any $\ell$ there exists $k$ such that for any admissible set $\mathcal{H}$ of size $k$ (admissible meaning there is no prime $p$ with all residues mod $p$ appearing in $\mathcal{H}$), there exist infinitely many $n$ with at least $\ell$ primes among $n + h_1, n + h_2, \dots, n + h_k$.
So, one can find 3 or 4 or more primes in a bounded interval. In
other words:
Corollary 16.28. For every $\ell$,
\[ \liminf_{n \to \infty} \, (p_{n+\ell} - p_n) < \infty. \]

17. 11/28/17
Last time, we were discussing the Hardy Littlewood conjecture:
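Conjecture 17.1 (Hardy-Littlewood). If you have a set
\[ \mathcal{H} := \{h_1, \dots, h_k\} \]
then
\[ \#\{n \le x : n + h \text{ is prime for all } h \in \mathcal{H}\} \sim \mathfrak{S}(\mathcal{H}) \frac{x}{(\log x)^k} \]
where
\[ \mathfrak{S}(\mathcal{H}) = \prod_p \left(1 - \frac{1}{p}\right)^{-k} \left(1 - \frac{\nu_{\mathcal{H}}(p)}{p}\right) \]
with
\[ \nu_{\mathcal{H}}(p) = \#(\mathcal{H} \bmod p). \]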

Note νH ( p) = k if p is large enough. At the end of last time we


stated recent work of Maynard and Tao:

Theorem 17.2 (Maynard-Tao). For any $s$, there exists a suitably large $k$ such that for every admissible set $\mathcal{H} = \{h_1, \dots, h_k\}$ (meaning $\mathfrak{S}(\mathcal{H}) \ne 0$) there are infinitely many $n$ with at least $s$ primes among $n + h_1, \dots, n + h_k$.

Remark 17.3. Sieve methods can give upper bounds for the number of prime $k$-tuples of the correct order of magnitude $\frac{x}{(\log x)^k}$. We have already seen one version of this, which is the large sieve.

Exercise 17.4 (Extended exercise). Use the large sieve to give such
an upper bound. Recall the large sieve looked at some exponential
sum. One can try to bound the number of inadmissible tuples over
all possible primes.

But, now we describe a different sieve method, known as Selberg’s


sieve.

17.1. Selberg’s sieve. Selberg’s sieve is based on the simple idea


that squares of real numbers are non-negative.
Consider
\[ \sum_{\sqrt{x} \le n \le x} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \]
for $\lambda_d \in \mathbb{R}$. Each summand is a square, so it is always non-negative, and we want to arrange that it is at least 1 if $n + h_1, \dots, n + h_k$ are all prime.
We will assume $\lambda_d = 0$ for $d > R$, with $R = R(x)$ chosen as some function of $x$, to be decided later. We should choose $\lambda_1 = 1$.
So, with these stipulations on $\lambda_d$, and with $R < \sqrt{x}$, we have
\[ \sum_{\sqrt{x} \le n \le x} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \ge \#\{\sqrt{x} \le n \le x : n + h_1, \dots, n + h_k \text{ are all prime}\}. \]
On the other hand, expanding, we get
\[ \sum_{\sqrt{x} \le n \le x} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 = \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \Big( \sum_{\substack{\sqrt{x} \le n \le x \\ [d_1, d_2] \mid (n + h_1) \cdots (n + h_k)}} 1 \Big), \]

where [d1 , d2 ] is the least common multiple of d1 and d2 . The paren-


thesized expression above is a quadratic form in λd ’s, and the prob-
lem is to minimize this quadratic form subject to the linear condition
that λ1 = 1.
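To see the basic inequality in action, here is a small numerical sketch. It assumes non-negative shifts, $R < \sqrt{x}$, and the illustrative (not optimized) weights $\lambda_d = \mu(d) (\log(R/d)/\log R)^k$; the function names are hypothetical. For every $n$ in range with all $n + h_i$ prime, only $d = 1$ contributes, so the weighted sum dominates the count of prime tuples.

```python
from math import isqrt, log, prod


def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def is_prime(n):
    return n >= 2 and all(n % q for q in range(2, isqrt(n) + 1))


def selberg_majorant(x, H, R):
    """Return (weighted_sum, tuple_count) over sqrt(x) < n <= x.

    weighted_sum = sum_n (sum_{d | prod(n+h), d <= R} lambda_d)^2
    tuple_count  = #{n : n + h is prime for every h in H}
    Assumes R < sqrt(x) and every h >= 0.
    """
    k = len(H)
    lam = {d: mobius(d) * (log(R / d) / log(R)) ** k for d in range(1, R + 1)}
    weighted_sum, tuple_count = 0.0, 0
    for n in range(isqrt(x) + 1, x + 1):
        P = prod(n + h for h in H)
        inner = sum(w for d, w in lam.items() if w != 0 and P % d == 0)
        weighted_sum += inner * inner
        tuple_count += all(is_prime(n + h) for h in H)
    return weighted_sum, tuple_count
```

Calling `selberg_majorant(2000, (0, 2), 10)` compares the weighted sum against the number of twin prime pairs in the range.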
We now try to minimize this quadratic form. Suppose $p \mid [d_1, d_2]$ and $[d_1, d_2] \mid (n + h_1) \cdots (n + h_k)$. Then, there is some $i$ with $n \equiv -h_i \bmod p$, so $n$ lies in one of $\nu_{\mathcal{H}}(p)$ residue classes mod $p$.
We define $f$ to be multiplicative with $f(p) = \nu_{\mathcal{H}}(p)$. It follows that $n$ lies in $f([d_1, d_2])$ residue classes mod $[d_1, d_2]$. So, the quadratic form is
\[ \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \left( \frac{f([d_1, d_2])}{[d_1, d_2]}\, x + O\big(f([d_1, d_2])\big) \right). \]
The function f is multiplicative by the Chinese remainder theorem,
though some annoying things might happen on squares of primes.
For simplicity, we'll make the additional assumption that $\lambda_d$ is 0 unless $d$ is squarefree. Then,
\[ O\big(f([d_1, d_2])\big) = O\big(k^{\omega([d_1, d_2])}\big) = O(x^{\varepsilon}). \]
Here $\omega(n)$ is the number of distinct prime factors of $n$. The above is useful if $[d_1, d_2] \le x^{1 - \varepsilon}$, which holds if $R \le x^{1/2 - \varepsilon}$. We will also assume $|\lambda_d| \ll d^{\varepsilon}$. We will justify this later.
In this case, the total contribution of the error terms is bounded by the number of terms times the bound on each term, which is $R^2 x^{\varepsilon} \le x^{1 - \varepsilon}$. The main term of our quadratic form is then $x$ times
\[ \sum_{d_1, d_2} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]} f([d_1, d_2]) \]
and we want to minimize this subject to the constraints
(1) $\lambda_1 = 1$
(2) $\lambda_d = 0$ unless $d \le R = x^{1/2 - \varepsilon}$ and $d$ is squarefree
(3) $|\lambda_d| \ll d^{\varepsilon}$.

We'd now like to diagonalize this quadratic form and read off the optimum from the diagonal entries by Cauchy-Schwarz. Now, $d_1, d_2$ are tied together by the lcm function. Let $a = (d_1, d_2)$ denote the gcd of $d_1, d_2$. Then, let $d_1 = a r_1$, $d_2 = a r_2$. We obtain
\[ \sum_a \sum_{\substack{r_1, r_2 \\ (r_1, r_2) = 1}} \frac{\lambda_{a r_1} \lambda_{a r_2}}{a r_1 r_2} f(a r_1 r_2). \]
By multiplicativity, we have
\[ f(a r_1 r_2) = f(a) f(r_1) f(r_2). \]
Next, use Möbius inversion to remove the coprimality condition. We have
\[ \sum_{b \mid (r_1, r_2)} \mu(b) = \begin{cases} 1 & \text{if } (r_1, r_2) = 1 \\ 0 & \text{else.} \end{cases} \]

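Write $r_1 = b s_1$, $r_2 = b s_2$. Define
\[ \xi_d = \sum_s \frac{\lambda_{ds} f(s)}{s}. \]
Then,
\[ \begin{aligned}
\sum_a \sum_{\substack{r_1, r_2 \\ (r_1, r_2) = 1}} \frac{\lambda_{a r_1} \lambda_{a r_2}}{a r_1 r_2} f(a r_1 r_2)
&= \sum_a \sum_b \frac{f(a)}{a} \frac{\mu(b) f(b)^2}{b^2} \sum_{s_1, s_2} \frac{\lambda_{a b s_1} \lambda_{a b s_2}}{s_1 s_2} f(s_1) f(s_2) \\
&= \sum_a \sum_b \frac{f(a)}{a} \frac{\mu(b) f(b)^2}{b^2} \Big( \sum_s \frac{\lambda_{abs}}{s} f(s) \Big)^2 \\
&= \sum_d \xi_d^2 \sum_{ab = d} \mu(b) \frac{f(a) f(b)^2}{a b^2} \\
&= \sum_d \xi_d^2\, \frac{f(d)}{d} \sum_{b \mid d} \mu(b) \frac{f(b)}{b}.
\end{aligned} \]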

Then, let
\[ h(d) := \sum_{b \mid d} \mu(b) \frac{f(b)}{b}. \]
We may observe
\[ h(d) = \prod_{p \mid d} \left(1 - \frac{f(p)}{p}\right). \]
This is starting to look related to the singular series. Therefore, our quadratic form can be written as
\[ \sum_d \xi_d^2\, \frac{f(d)}{d} \sum_{b \mid d} \mu(b) \frac{f(b)}{b} = \sum_d \frac{f(d)}{d} h(d)\, \xi_d^2. \]
We have now diagonalized our quadratic form, but we now need to transform our constraint $\lambda_1 = 1$ into a constraint in terms of the $\xi_d$. So, we'd like to invert our linear change of variables. We want to write down $\lambda_d$ in terms of things involving the $\xi_i$. To do this, using Möbius inversion, we have
\[ \begin{aligned}
\lambda_d &= \sum_s \frac{\lambda_{ds} f(s)}{s} \sum_{b \mid s} \mu(b) \\
&= \sum_b \mu(b) \sum_t \frac{\lambda_{dbt} f(bt)}{bt} \\
&= \sum_b \frac{\mu(b) f(b)}{b}\, \xi_{db}.
\end{aligned} \]
Then, we want $\xi_d = 0$ unless $d \le R$ and $d$ is squarefree. We also have $\lambda_1 = 1$ if and only if
\[ \sum_b \frac{\mu(b)}{b} f(b)\, \xi_b = 1. \]

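By Cauchy-Schwarz, we have
\[ 1 = \Big( \sum_b \frac{\mu(b)}{b} f(b)\, \xi_b \Big)^2 \le \Big( \sum_b \frac{\mu(b)^2 f(b)}{b\, h(b)} \Big) \Big( \sum_d \frac{f(d)}{d} h(d)\, \xi_d^2 \Big). \]
The equality case of Cauchy-Schwarz occurs when the vectors are proportional to each other, which occurs when
\[ \xi_b \propto \frac{\mu(b)}{h(b)}. \]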
Therefore, the minimum of the quadratic form given by
\[ \sum_d \frac{f(d)}{d} h(d)\, \xi_d^2 \]
is bounded below by
\[ \Big( \sum_{b \le R} \frac{\mu(b)^2 f(b)}{b\, h(b)} \Big)^{-1} \]
and this is the equality case of Cauchy-Schwarz, so it actually attains this bound. And further one can determine the constant of proportionality $c$ by
\[ 1 = c \sum_{b \le R} \frac{\mu(b)^2 f(b)}{b\, h(b)}. \]
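The diagonalization can be verified exactly on a small example. The sketch below (all names hypothetical) assumes $\mathcal{H}$ is admissible, takes $\xi_b = c\,\mu(b)/h(b)$ on squarefree $b \le R$, recovers $\lambda_d = \sum_b \mu(b) f(b) \xi_{db} / b$, and then evaluates the original quadratic form $\sum \lambda_{d_1} \lambda_{d_2} f([d_1, d_2]) / [d_1, d_2]$ directly with rational arithmetic.

```python
from fractions import Fraction
from math import gcd


def squarefree_prime_factors(n):
    """Prime factors of n as a list, or None if n is not squarefree."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return None
            factors.append(p)
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def selberg_optimum(H, R):
    """Return (lambda_1, quadratic form at the optimum, claimed minimum)."""
    def nu(p):
        return len({h % p for h in H})

    def f_of(primes):
        value = 1
        for p in primes:
            value *= nu(p)
        return value

    primes_of = {}
    for d in range(1, R + 1):
        pf = squarefree_prime_factors(d)
        if pf is not None:
            primes_of[d] = pf

    mu = {d: (-1) ** len(pf) for d, pf in primes_of.items()}
    f = {d: f_of(pf) for d, pf in primes_of.items()}
    h = {}
    for d, pf in primes_of.items():
        h[d] = Fraction(1)
        for p in pf:
            h[d] *= 1 - Fraction(nu(p), p)  # nonzero because H is admissible

    S = sum(Fraction(f[b], b) / h[b] for b in primes_of)  # mu(b)^2 = 1 here
    xi = {d: mu[d] / (h[d] * S) for d in primes_of}        # c = 1/S
    lam = {d: sum(Fraction(mu[b] * f[b], b) * xi[d * b]
                  for b in primes_of if d * b in primes_of)
           for d in primes_of}

    form = Fraction(0)
    for d1, pf1 in primes_of.items():
        for d2, pf2 in primes_of.items():
            lcm = d1 * d2 // gcd(d1, d2)
            form += lam[d1] * lam[d2] * Fraction(f_of(set(pf1) | set(pf2)), lcm)
    return lam[1], form, 1 / S
```

With exact fractions the three returned values can be compared by equality, which checks both the constraint $\lambda_1 = 1$ and the value of the minimum.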
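We obtain that
\[ \#\{n \le x : n + h_1, \dots, n + h_k \text{ all prime}\} \le \frac{x}{\sum_{b \le R} \frac{\mu(b)^2 f(b)}{b\, h(b)}} + O(x^{1 - \varepsilon}), \]
where one needs to verify
Exercise 17.5. Verify $\xi_d \ll d^{\varepsilon}$, $\lambda_d \ll d^{\varepsilon}$.
Here we take $R \le x^{1/2 - \varepsilon}$.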
Now, $f(p)$ is about $k$, so $f(n)$ is roughly $d_k(n)$, the $k$-divisor function of $n$. Next, $h(n)$ is roughly 1. Then,
\[ \sum_{b \le R} \frac{d_k(b)}{b} = \frac{1}{2\pi i} \int \zeta(s + 1)^k R^s \frac{ds}{s} \sim \frac{(\log R)^k}{k!}. \]
So, we should expect
\[ \sum_{b \le R} \frac{\mu(b)^2 f(b)}{b\, h(b)} \sim \frac{(\log R)^k}{k!} \]
up to multiplying by some convergent Euler factor to correct for our estimates above.
Exercise 17.6. Verify that this Euler factor is $\mathfrak{S}(\mathcal{H})^{-1}$, meaning
\[ \sum_{b \le R} \frac{\mu(b)^2 f(b)}{b\, h(b)} = \mathfrak{S}(\mathcal{H})^{-1} \frac{(\log R)^k}{k!} + \text{lower order terms}. \]
Combining the above, we conclude
Theorem 17.7.
\[ \#\{n \le x : n + h_1, \dots, n + h_k \text{ all prime}\} \le k!\, 2^k\, \mathfrak{S}(\mathcal{H}) \frac{x}{(\log x)^k} (1 + o(1)). \]
This is a typical application of Selberg’s sieve.
Remark 17.8. When $k = 2$ we can do a funny trick which gives a better bound. We can replace the $2! \cdot 2^2 = 8$ by a factor of 4.
Remark 17.9. Recall that the optimal choice is $\xi_d = \frac{\mu(d)}{h(d)}$, up to scaling. Then,
\[ \lambda_d = \sum_s \frac{\mu(s) f(s)}{s}\, \xi_{ds}. \]
Imagine that $\xi_{ds} \sim \mu(ds)$. Then, we get
\[ \lambda_d \sim \mu(d)\, \frac{\sum_{s \le R/d} \frac{\mu(s)^2 f(s)}{s\, h(s)}}{\sum_{s \le R} \frac{\mu(s)^2 f(s)}{s\, h(s)}}. \]
Then,
\[ \lambda_d \approx \mu(d) \left( \frac{\log R/d}{\log R} \right)^k. \]
Exercise 17.10. Show
\[ \sum_{d \mid n} \mu(d) \left( \log \frac{n}{d} \right)^k = 0 \]
unless $n$ has at most $k$ prime factors.
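A quick floating-point experiment makes the vanishing in Exercise 17.10 plausible. This is a sketch under the assumption that "prime factors" means distinct prime factors; the name `generalized_von_mangoldt` is an illustrative choice. Only squarefree divisors contribute, so the sum runs over subsets of the distinct primes dividing $n$.

```python
from itertools import combinations
from math import log, prod


def distinct_primes(n):
    """Distinct prime factors of n by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def generalized_von_mangoldt(n, k):
    """sum_{d | n} mu(d) (log(n/d))^k, summed over squarefree divisors d."""
    primes = distinct_primes(n)
    total = 0.0
    for size in range(len(primes) + 1):
        for subset in combinations(primes, size):
            d = prod(subset)  # squarefree divisor with mu(d) = (-1)^size
            total += (-1) ** size * log(n / d) ** k
    return total
```

For example, with $k = 2$ the value at $n = 30$ should vanish up to rounding, while at $n = 6$ it should be close to $2 \log 2 \log 3$.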

Remark 17.11. Consider the simplest case k = 1. This could be tricky


because we might be trying to count primes in a short interval.
Exercise 17.12. Check that one gets exactly the same upper bound
for π ( x + y) − π ( x ) as with the large sieve using Theorem 17.7. (Not
only asymptotically the same, but rather exactly the same expres-
sion.)
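Recall our quadratic form in the case of sieving out a 1-tuple is
\[ \sum_{d_1, d_2 \le R} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]} \]
with
\[ \lambda_d = \mu(d) \left( \frac{\log(R/d)}{\log R} \right). \]
The optimal answer ended up being
\[ \sum_{d_1, d_2 \le R} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]} = \frac{1}{\log R}. \]
Then,
\[ \sum_{d_1, d_2 \le R} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]} = \sum_{\substack{a, r, s \\ r \le R/a \\ s \le R/a}} \frac{\mu(a)^2}{a} \frac{\mu(r)}{r} \frac{\mu(s)}{s} \]
for $R/2 \le a \le R$.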
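Exercise 17.13 (difficult extended exercise). Show that
\[ \sum_{d_1, d_2 \le R} \frac{\mu(d_1) \mu(d_2)}{[d_1, d_2]} \to c \ne 0 \]
as $R \to \infty$.
The sieve accounts for the above by replacing
\[ \sum_{\substack{a, r, s \\ r \le R/a \\ s \le R/a}} \frac{\mu(a)^2}{a} \frac{\mu(r)}{r} \frac{\mu(s)}{s} \]
by
\[ \sum_{\substack{a, r, s \\ r \le R/a \\ s \le R/a}} \frac{\mu(a)^2}{a} \frac{\mu(r)}{r} \frac{\mu(s)}{s} \left( \frac{\log R/ar}{\log R} \right) \left( \frac{\log R/as}{\log R} \right). \]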
Here is a variant useful for what we will do next. We want to find when $n + h_1, \dots, n + h_k$ are all prime.
Fix $n + h_1 = p$ prime. Sieve $n + h_2, \dots, n + h_k$ using
\[ \sum_{n + h_1 = p \le x} \Big( \sum_{d \mid (n + h_2) \cdots (n + h_k)} \lambda_d \Big)^2 \]
with $\lambda_1 = 1$, $\lambda_d = 0$ unless $d \le R$ and $d$ is squarefree. This requires us to count primes in progressions,
\[ \pi(x; [d_1, d_2], a), \]
and we handle this using Bombieri-Vinogradov with $R \le x^{1/4 - \varepsilon}$.
Exercise 17.14. Show that
\[ \#\{n \le x : n + h_1, \dots, n + h_k \text{ all prime}\} \le (1 + o(1))\, \mathfrak{S}(\mathcal{H})\, 4^{k-1} (k - 1)!\, \frac{x}{(\log x)^k} \]
using that we have a $k - 1$ dimensional sieve. When $k = 2$ this gives a better bound with 4 instead of 8, but it is worse for large $k$, since the ratio of the two constants is $2^{k-2}/k$.

18. 11/30/17
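Last time, we discussed Selberg's sieve. We proved bounds like
\[ \#\{n \le x : n + h_1, \dots, n + h_k \text{ are all prime}\} \le (1 + o(1))\, 2^k k!\, \frac{\mathfrak{S}(\mathcal{H})\, x}{(\log x)^k}. \]
This was shown by considering a quadratic form
\[ \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \]
with
\[ \lambda_d \sim \mu(d) \left( \frac{\log R/d}{\log R} \right)^k \]
with $R = x^{1/2 - \varepsilon}$.
Exercise 18.1. Another way to find this is to consider
\[ \sum_{n \le x,\ n + h_1 \text{ prime}} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \]
taking $R = x^{1/4 - \varepsilon}$. Then, show that one gets a bound which is better when $k = 2$, but not for large $k$, of the form
\[ \sum_{n \le x,\ n + h_1 \text{ prime}} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \le (1 + o(1))\, 4^{k-1} (k - 1)!\, \frac{\mathfrak{S}(\mathcal{H})\, x}{(\log x)^k}. \]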
Exercise 18.2. Use Selberg's sieve to bound
\[ \#\{n \le x : n^2 + 1 \text{ is prime}\}. \]
Use sieve weights summing over polynomial values. That is, bound
\[ \#\{n \le x : n^2 + 1 \text{ is prime}\} \le \sum_{n \le x} \Big( \sum_{d \mid n^2 + 1} \lambda_d \Big)^2 \]
with $\lambda_1 = 1$. Expanding, we get the condition $n^2 + 1 \equiv 0 \bmod [d_1, d_2]$. Probably take $\lambda_d = 0$ for $d$ divisible by a prime which is 3 mod 4, since such primes won't divide $n^2 + 1$. Then diagonalize the quadratic form and see what you get. The number of solutions to this congruence will be given in terms of some multiplicative function $f$, in the form
\[ x\, \frac{f([d_1, d_2])}{[d_1, d_2]}, \]
where $f(p)$ is 2 if the prime $p$ is 1 mod 4 and 0 if it is 3 mod 4.
Derive a similar bound for other polynomials.
Theorem 18.3 (Goldston-Pintz-Yildirim).
\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = 0. \]
The idea of proof is relatively simple. Start with an admissible $k$-tuple, meaning $\mathfrak{S}(\mathcal{H}) \ne 0$. The idea is to look for a non-negative function $a(n) \ge 0$ so that we can make
\[ \sum_{n \le x} a(n) < \sum_{j=1}^{k} \sum_{n \le x,\ n + h_j \text{ prime}} a(n). \]
Then, there is some $n$ with at least two primes among $n + h_1, \dots, n + h_k$, essentially by the pigeonhole principle. Then, motivated by Selberg's sieve, we will take
\[ a(n) = \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \]
with $\lambda_1 = 1$ and $\lambda_d = 0$ unless $d \le R$ is squarefree. We'd like the desired inequality above with $k$ as small as possible.
We won't be able to achieve this (it would imply bounded gaps), but we can tweak it a bit to get Theorem 18.3.
In Selberg’s sieve, we wished to minimize a quadratic form given
a linear form. Here, the real problem is to maximize the ratio of
quadratic forms.
Let
\[ Q_1(\lambda) := \sum_{n \le x} a(n) \]
and
\[ Q_2(\lambda) := \sum_{j=1}^{k} \sum_{n \le x,\ n + h_j \text{ prime}} a(n). \]

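Then, we can write $Q_1(\lambda)$ as
\[ x \cdot \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \frac{f([d_1, d_2])}{[d_1, d_2]} + O(R^2 x^{\varepsilon}) \]
with
\[ f(p) := \nu_{\mathcal{H}}(p) \]
(recall $\nu_{\mathcal{H}}(p)$ is usually $k$). This is good if $R \le x^{1/2 - \varepsilon}$.
We can write $Q_2(\lambda)$ as
\[ k \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{n \le x \\ n + h_1 \text{ prime} \\ (n + h_2) \cdots (n + h_k) \equiv 0 \bmod [d_1, d_2]}} 1. \]
To evaluate this sum, for each prime $p$ we take all possible choices in $-\mathcal{H} \bmod p$, and rule out the single choice $n \equiv -h_1 \bmod p$. So, $n$ lies in $g([d_1, d_2])$ residue classes mod $[d_1, d_2]$, with $g$ multiplicative and $g(p) = f(p) - 1$.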
For our inner sum, we get an estimate of the form
\[ \frac{x}{\log x} \frac{g([d_1, d_2])}{\varphi([d_1, d_2])}. \]
Averaging over all $d_1, d_2$ and using Bombieri-Vinogradov, the error terms are under control so long as $R \le x^{1/4 - \varepsilon}$.
So, we can approximate $Q_2(\lambda)$ by
\[ k\, \frac{x}{\log x} \sum_{d_1, d_2} \frac{\lambda_{d_1} \lambda_{d_2}}{\varphi([d_1, d_2])}\, g([d_1, d_2]). \]
Let's simplify and assume that
\[ \lambda_d = \mu(d)\, P\left( \frac{\log R/d}{\log R} \right), \]
where $P$ is a polynomial vanishing to order $k$ at 0.
It is now just a calculation to figure out these two quadratic forms and see if we can find a suitable polynomial $P$.

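Again $Q_1(\lambda)$ is given, up to the factor $x$, by
\[ \begin{aligned}
\sum_a \frac{f(a)}{a} \sum_{\substack{s_1, s_2 \\ (s_1, s_2) = 1}} \frac{\lambda_{a s_1} \lambda_{a s_2} f(s_1) f(s_2)}{s_1 s_2}
&= \sum_{a, b} \frac{f(a)}{a} \frac{\mu(b) f(b)^2}{b^2} \Big( \sum_s \frac{\lambda_{abs} f(s)}{s} \Big)^2 \\
&= \sum_d \frac{f(d)}{d} \prod_{p \mid d} \left(1 - \frac{f(p)}{p}\right) \Big( \sum_s \frac{\lambda_{ds} f(s)}{s} \Big)^2.
\end{aligned} \]
Similarly, we can write $Q_2(\lambda)$ as
\[ \begin{aligned}
&\frac{kx}{\log x} \sum_{a, b} \frac{g(a) \mu(b) g(b)^2}{\varphi(a) \varphi(b)^2} \Big( \sum_s \frac{\lambda_{abs}\, g(s)}{\varphi(s)} \Big)^2 \\
&\qquad = \frac{kx}{\log x} \sum_d \frac{g(d)}{\varphi(d)} \prod_{p \mid d} \left(1 - \frac{g(p)}{\varphi(p)}\right) \Big( \sum_s \frac{\lambda_{ds}\, g(s)}{\varphi(s)} \Big)^2.
\end{aligned} \]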

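For both these cases, the first step is to understand the rightmost terms
\[ \Big( \sum_s \frac{\lambda_{ds} f(s)}{s} \Big)^2 \quad \text{and} \quad \Big( \sum_s \frac{\lambda_{ds}\, g(s)}{\varphi(s)} \Big)^2. \]
So, for $d \le R$, we want to evaluate
\[ \sum_{l \le R/d} \mu(dl)\, \frac{h(l)}{l}\, P\left( \frac{\log R/dl}{\log R} \right) \]
where $h(l)$ is either $f(l)$ or $\frac{g(l)\, l}{\varphi(l)}$. In both cases, $h$ is multiplicative. Usually $h(p) \sim w$ with $w = k - 1$ or $k$.
Take
\[ P(y) = \sum_{t \ge k} \frac{P^{(t)}(0)}{t!}\, y^t. \]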

Lemma 18.4. For $c > 0$ and $(c)$ the corresponding vertical line in the complex plane,
\[ \frac{1}{2\pi i} \int_{(c)} \frac{z^s}{s^{t+1}}\, ds = \begin{cases} \frac{(\log z)^t}{t!} & \text{if } z > 1 \\ 0 & \text{if } z < 1. \end{cases} \]
Proof. If $z > 1$, move the line of integration to the left, picking up the pole at $s = 0$. If $z < 1$, move the line of integration to the right, and there is no pole so the integral is 0.
We now want to understand
\[ \sum_{t \ge k} P^{(t)}(0)\, \frac{1}{2\pi i} \int_{(c)} \Big( \sum_{l = 1}^{\infty} \frac{\mu(dl)}{l^{1+s}} h(l) \Big) \left( \frac{R}{d} \right)^s \frac{ds}{s^{t+1}}\, \frac{1}{(\log R)^t}. \]
Using the lemma, we can evaluate this, which only gives a nonzero result if $d < R$. The sum
\[ \sum_{l = 1}^{\infty} \frac{\mu(dl)}{l^{1+s}} h(l) \]
can be approximated by something like $\frac{1}{\zeta(s + 1)^w}$ with $w$ either $k - 1$ or $k$, using that a power of $\zeta$ gives the series for the divisor function, and the Möbius function inverts this.
Then, we get $\zeta(s + 1)^{-w}$ up to some Euler product involving terms of primes squared, which can be thought of as quite tame. Note that $\frac{1}{\zeta(s + 1)^w} \frac{1}{s^{t+1}}$ has a pole of order $t - w + 1$ at $s = 0$. The idea to evaluate our desired integral is to move contours using the zero free region for $\zeta(s)$. We will pick up a contribution of the pole at $s = 0$. Then, we get
\[ \sum_{t \ge k} \frac{P^{(t)}(0)}{(\log R)^t} \frac{(\log R/d)^{t - w}}{(t - w)!} \cdot T|_{s = 0} \]
for $T$ the tame Euler factor from above, and $T|_{s = 0}$ denoting the evaluation of $T$ at $s = 0$.
We can simplify the above to
\[ \frac{T|_{s = 0}}{(\log R)^w}\, P^{(w)}\left( \frac{\log R/d}{\log R} \right). \]
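To finish this argument, we compute
\[ \begin{aligned}
T|_{s = 0} &= \mu(d) \prod_p \left(1 - \frac{1}{p}\right)^{-w} \prod_{p \nmid d} \left(1 - \frac{h(p)}{p}\right) \\
&= \mu(d) \prod_{p \mid d} \left(1 - \frac{1}{p}\right)^{-w} \prod_{p \nmid d} \left(1 - \frac{h(p)}{p}\right) \left(1 - \frac{1}{p}\right)^{-w}.
\end{aligned} \]
Let's plug this in to our first quadratic form and see what we get.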

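For $Q_1(\lambda)$, we get
\[ \frac{x}{(\log R)^{2k}} \sum_d \mu(d)^2 \frac{f(d)}{d} \prod_{p \mid d} \left(1 - \frac{f(p)}{p}\right) \left(1 - \frac{1}{p}\right)^{-2k} \cdot \prod_{p \nmid d} \left(1 - \frac{f(p)}{p}\right)^2 \left(1 - \frac{1}{p}\right)^{-2k} P^{(k)}\left( \frac{\log R/d}{\log R} \right)^2. \]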
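We want to understand
\[ \sum_n \frac{\mu(n)^2 f(n)}{n^{s+1}} \Big[ \prod_{p \mid n} \left(1 - \frac{f(p)}{p}\right) \left(1 - \frac{1}{p}\right)^{-2k} \Big] \Big[ \prod_{p \nmid n} \left(1 - \frac{f(p)}{p}\right)^2 \left(1 - \frac{1}{p}\right)^{-2k} \Big] \approx \zeta(s + 1)^k\, T_2 \]
near $s = 0$, with $T_2$ a tame Euler product, and then we want to understand the corresponding constant $T_2|_{s = 0}$. We have
\[ T_2|_{s = 0} = \prod_p \left(1 - \frac{1}{p}\right)^k \left\{ \left(1 - \frac{1}{p}\right)^{-2k} \left(1 - \frac{f(p)}{p}\right)^2 + \frac{f(p)}{p} \left(1 - \frac{f(p)}{p}\right) \left(1 - \frac{1}{p}\right)^{-2k} \right\}. \]
This should, hopefully, turn out to be the Hardy-Littlewood constant. Indeed, since $\left(1 - \frac{f(p)}{p}\right) + \frac{f(p)}{p} = 1$, the product is
\[ \prod_p \left(1 - \frac{1}{p}\right)^{-k} \left(1 - \frac{f(p)}{p}\right) = \mathfrak{S}(\mathcal{H}). \]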
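Then, by partial summation,
\[ \begin{aligned}
&\frac{x}{(\log R)^{2k}} \sum_d \mu(d)^2 \frac{f(d)}{d} \prod_{p \mid d} \left(1 - \frac{f(p)}{p}\right) \left(1 - \frac{1}{p}\right)^{-2k} \cdot \prod_{p \nmid d} \left(1 - \frac{f(p)}{p}\right)^2 \left(1 - \frac{1}{p}\right)^{-2k} P^{(k)}\left( \frac{\log R/d}{\log R} \right)^2 \\
&\qquad \sim \frac{x}{(\log R)^{2k}} \int_1^R P^{(k)}\left( \frac{\log R/z}{\log R} \right)^2 d\left( \mathfrak{S}(\mathcal{H}) \frac{(\log z)^k}{k!} \right) \\
&\qquad \sim \frac{x\, \mathfrak{S}(\mathcal{H})}{(\log R)^k} \int_0^1 P^{(k)}(1 - y)^2 \frac{y^{k-1}}{(k - 1)!}\, dy.
\end{aligned} \]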

One does a similar calculation for $Q_2(\lambda)$. One can similarly compute the Euler product, and one again gets
\[ Q_2(\lambda) = k\, \frac{x\, \mathfrak{S}(\mathcal{H})}{\log x\, (\log R)^{k-1}} \int_0^1 P^{(k-1)}(1 - y)^2 \frac{y^{k-2}}{(k - 2)!}\, dy. \]

All we need is to find a polynomial P to make the ratio Q2 (λ)/Q1 (λ) >
1 subject to the condition that R < x1/4 . It’s advantageous to make
R as large as possible in terms of x but R < x1/4 . These quadratic
forms we have only depend on the polynomial P. Next time, we’ll
finish this.
It will end up happening that when R = x1/4−ε you get this ratio
to be just under 1.
But, one can even get this ratio to tend to infinity thinking of this
as a higher dimensional problem, using an argument of Maynard.

19. 12/5/17
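Let $\mathcal{H} := \{h_1, \dots, h_k\}$ be an admissible tuple so that $\mathfrak{S}(\mathcal{H}) \ne 0$. We want to compare
\[ Q_1 := \sum_{n \le N} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \]
with
\[ Q_2 := \sum_{j=1}^{k} \sum_{n \le N,\ n + h_j \text{ prime}} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2. \]
We took $\lambda_d = 0$ unless $d \le R$ is squarefree, and we need $R \le N^{1/4 - \varepsilon}$ in order to apply Bombieri-Vinogradov. We can take
\[ \lambda_d = \mu(d)\, P\left( \frac{\log R/d}{\log R} \right) \]
for $P$ a polynomial vanishing to order $k$ at 0. We computed the main terms of these two sums.
We found that for $R \le N^{1/2 - \varepsilon}$
\[ Q_1 \sim \frac{\mathfrak{S}(\mathcal{H})\, N}{(\log R)^k} \int_0^1 P^{(k)}(1 - y)^2 \frac{y^{k-1}}{(k - 1)!}\, dy. \]
We found also that for $R \le N^{1/4 - \varepsilon}$
\[ Q_2 \sim \frac{k\, \mathfrak{S}(\mathcal{H})\, N}{(\log N) (\log R)^{k-1}} \int_0^1 P^{(k-1)}(1 - y)^2 \frac{y^{k-2}}{(k - 2)!}\, dy. \]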

Let $Q(y) = P^{(k-1)}(y)$ be a polynomial vanishing to order at least 1 at 0. Then, we want to know if
\[ (19.1) \qquad \frac{k \log R}{\log N} \int_0^1 Q(1 - y)^2 \frac{y^{k-2}}{(k - 2)!}\, dy > \int_0^1 Q'(1 - y)^2 \frac{y^{k-1}}{(k - 1)!}\, dy, \]
which is the statement that the ratio $Q_2/Q_1$ exceeds 1. In Selberg's sieve one typically takes $Q(y) = y$ so that $P \sim y^k$. In GPY, they took $Q(y) = y^l$ with $l$ chosen to be large.
We now have to compute these integrals, which are examples of
the β function. Recall
Lemma 19.1.
\[ \int_0^1 y^a (1 - y)^b\, dy = \frac{a!\, b!}{(a + b + 1)!}. \]
Proof. Take $n = a + b + 1$. We put down $n$ numbers at random
\[ x_1, \dots, x_n \in (0, 1) \]
independently and uniformly. We now ask what the chance is that $x_n$ is in position $a + 1$. It is easy to see, by symmetry, there is a $\frac{1}{n}$ chance.
On the other hand, we can order the $n$ objects, and we can integrate over the possible positions of the $n$th object such that it is in position $a + 1$. By choosing which $a$ of the other $a + b$ objects lie below it, we see this probability is
\[ \binom{a + b}{a} \int_0^1 x_n^a (1 - x_n)^b\, dx_n. \]
Therefore,
\[ \frac{1}{a + b + 1} = \binom{a + b}{a} \int_0^1 x_n^a (1 - x_n)^b\, dx_n, \]
which rearranges to the claim.


Simplifying (19.1) with $Q(y) = y^l$ using Lemma 19.1, we get that it suffices to check
\[ \frac{k \log R}{\log N} \frac{(2l)!}{(k - 1 + 2l)!} > \frac{l^2\, (2l - 2)!}{(k + 2l - 2)!}. \]
Simplifying further, we want to check
\[ \frac{k}{k - 1 + 2l}\, 2l (2l - 1)\, \frac{\log R}{\log N} > l^2. \]
Now say $k$ is large and $l$ is large but small compared to $k$ (say $l \sim \sqrt{k}$). We get roughly that it suffices to check
\[ (4 - \varepsilon) \frac{\log R}{\log N} > 1. \]
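The simplification above can be double checked with exact arithmetic. The sketch below (function names are illustrative) evaluates the beta integral of Lemma 19.1 by expanding $(1 - y)^b$, and computes the factor $\frac{k \cdot 2l(2l - 1)}{(k - 1 + 2l)\, l^2}$ multiplying $\frac{\log R}{\log N}$, assuming $Q(y) = y^l$ as in the text. The factor stays below 4 and approaches 4 when $l$ is large and $k$ is much larger than $l$.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a, b):
    """Exact integral of y^a (1-y)^b over [0, 1] via the binomial theorem."""
    return sum(Fraction((-1) ** j * comb(b, j), a + j + 1) for j in range(b + 1))


def gpy_ratio(k, l):
    """Factor c(k, l) such that (19.1) with Q(y) = y^l reads
    c(k, l) * log R / log N > 1."""
    return Fraction(k * 2 * l * (2 * l - 1), (k - 1 + 2 * l) * l * l)


def gpy_ratio_from_integrals(k, l):
    """Same factor, computed directly from the two integrals in (19.1)."""
    lhs = k * beta_integral(k - 2, 2 * l) / factorial(k - 2)
    rhs = l * l * beta_integral(k - 1, 2 * l - 2) / factorial(k - 1)
    return lhs / rhs
```

Since the factor never reaches 4, taking $R = N^{1/4 - \varepsilon}$ cannot make the product exceed 1, which matches the discussion that follows.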
But, we chose $R = N^{1/4 - \varepsilon}$. If we could choose $R$ larger than $N^{1/4}$ we could prove bounded gaps between primes. But, with Bombieri-Vinogradov, this barely fails to give bounded gaps between primes.
Exercise 19.2. Assuming Elliott-Halberstam, obtain a bound for $\liminf\, (p_{n+1} - p_n)$.
We'll next talk about Maynard's refinement. First, the additional idea in GPY is the following. We have
\[ Q_1 = \sum_{\substack{h_1, \dots, h_k \le H \\ \text{distinct}}} \sum_{n \le N} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \]
\[ Q_2 = \sum_{h \le H} \sum_{\substack{h_1, \dots, h_k \le H \\ \text{distinct}}} \sum_{n \le N,\ n + h \text{ prime}} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2. \]
If every interval $[n, n + H]$ contains at most one prime then $Q_2 \le Q_1$.
If $h \ne h_1, \dots, h_k$ the second form is of size
\[ \frac{N}{(\log N) (\log R)^k}. \]
One has many more possibilities coming from $h$ not in $h_1, \dots, h_k$. One has that $Q_2$ is almost $Q_1$ when $h \in \mathcal{H}$, but it is then pushed over by allowing $h \notin \mathcal{H}$: we multiply the size above by the number of such $h$, which is about $H = \delta \log N$. Therefore, we get enough extra help from elements of $[n, n + H]$ not lying in the tuple $\mathcal{H}$.
Here, k is very large, so we are looking at a high dimensional sieve.
This is a different optimization problem and has some surprises.
Maynard and Tao have a method which gives many primes in
bounded gaps. GPY gives only 2 primes in bounded gaps, but not
many.
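Let
\[ \mathcal{H} = \{h_1, \dots, h_k\} \]
be admissible. Consider
\[ \sum_{n \sim x} \Big( \sum_{\substack{d_1 \mid (n + h_1) \\ d_2 \mid (n + h_2) \\ \vdots \\ d_k \mid (n + h_k)}} \lambda_{d_1, \dots, d_k} \Big)^2 \]
and compare it to
\[ \sum_{\substack{n \sim x \\ n + h_j \text{ prime}}} \Big( \sum_{d_i \mid (n + h_i)} \lambda_{d_1, \dots, d_k} \Big)^2. \]
Before we had
\[ \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \quad \text{and} \quad \lambda_d = \sum_{d_1 \cdots d_k = d} \lambda_{d_1, \dots, d_k}, \]
where we might have in mind that $\lambda_{d_1, \dots, d_k}$ is $\mu(d_1) \cdots \mu(d_k)$ times a function
\[ F\left( \frac{\log d_1}{\log R}, \dots, \frac{\log d_k}{\log R} \right), \]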
whereas we are allowing F ( x1 , . . . , xk ) supported on x1 + · · · + xk ≤
1 rather than just the function G ( x1 + · · · xk ). So there is more flex-
ibility in allowing functions of many variables rather than just of a
single variable.
We now introduce the trick, previously known as using a small sieve, but after Green-Tao it is known as the W-trick.
The most naive sieve can be used to count
\[ \sum_{\substack{n \le x \\ p \mid n \implies p > z}} 1. \]
To count this, if
\[ \prod_{p \le z} p \]
is very small, we can easily sieve this. The product above is around $e^z$. For example if $z \le \frac{\log x}{10^6}$ this is very easy to sieve. For example, take
\[ W = \prod_{p \le \log \log \log x} p, \]
then $W = (\log \log x)^{O(1)}$. When studying these, we insist $n$ lies in some progression $\nu \bmod W$. That is, we want to understand
\[ \sum_{\substack{n \sim x \\ n \equiv \nu \bmod W}} \Big( \sum_{\substack{d_1 \mid (n + h_1) \\ d_2 \mid (n + h_2) \\ \vdots \\ d_k \mid (n + h_k)}} \lambda_{d_1, \dots, d_k} \Big)^2 \]
and compare it to
\[ \sum_{\substack{n \sim x \\ n + h_j \text{ prime} \\ n \equiv \nu \bmod W}} \Big( \sum_{d_i \mid (n + h_i)} \lambda_{d_1, \dots, d_k} \Big)^2. \]

Then, $(n + h_i, n + h_j) = 1$ for all $i \ne j$ when $n \equiv \nu \bmod W$ and $x$ is large, since any common prime factor divides $h_i - h_j$ and hence $W$.
So, we want to understand the quadratic form
\[ \sum_{\substack{n \sim x \\ n \equiv \nu \bmod W}} \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k \\ [d_i, e_i] \mid (n + h_i)}} \lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}. \]
We have $\lambda_{d_1, \dots, d_k} = 0$ unless
(1) $d_1, \dots, d_k \le R$ and are squarefree
(2) $d_1 \cdots d_k$ is coprime to $W$.
(3) $(d_i, d_j) = 1$ for $i \ne j$.
These above conditions are automatic, but there is an additional condition: Note that if $i \ne j$ we must have $(d_i, e_j) = 1$, as otherwise the sum over $n$ would be empty.

So, up to an error term, we have
\[ \sum_{\substack{n \sim x \\ n \equiv \nu \bmod W}} \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k \\ [d_i, e_i] \mid (n + h_i)}} \lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k} = \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k}} \lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k} \frac{x}{W \prod_{i=1}^{k} [d_i, e_i]}. \]
The error term is ok if $R \le x^{1/2 - \varepsilon}$.
If $d_i, e_i$ have a common factor this will only appear once in the denominator. But if $d_i$ and $e_j$ with $i \ne j$ have a common factor, this will appear in the denominator at least to the power 2. It turns out these terms form part of the tail of a convergent sum which will go to 0. Hence, we will ignore the condition that $i \ne j$ implies $(d_i, e_j) = 1$. To justify this, we need to check that the terms with $(d_i, e_j) > 1$ will contribute a small amount compared to the main term (assuming we have removed small prime factors).
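Then,
\[ \begin{aligned}
\sum_{\substack{n \sim x \\ n \equiv \nu \bmod W}} \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k \\ [d_i, e_i] \mid (n + h_i)}} \lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}
&= \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k}} \lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k} \frac{x}{W \prod_{i=1}^{k} [d_i, e_i]} \\
&= \frac{x}{W} \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k}} \frac{\lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}}{d_1 \cdots d_k\, e_1 \cdots e_k} \prod_{i=1}^{k} (d_i, e_i).
\end{aligned} \]
Then, we can write
\[ (d_i, e_i) = \sum_{f_i \mid (d_i, e_i)} \varphi(f_i). \]
We can write our quadratic form $Q_1$ as approximately
\[ Q_1 \sim \frac{x}{W} \sum_{f_1, \dots, f_k} \Big( \prod_{i=1}^{k} \varphi(f_i) \Big) \Big( \sum_{\substack{d_1, \dots, d_k \\ f_i \mid d_i}} \frac{\lambda_{d_1, \dots, d_k}}{d_1 \cdots d_k} \Big)^2. \]
We then set
\[ y_{f_1, \dots, f_k} = \Big( \prod_{i=1}^{k} \mu(f_i) \varphi(f_i) \Big) \Big( \sum_{f_i \mid d_i} \frac{\lambda_{d_1, \dots, d_k}}{d_1 \cdots d_k} \Big). \]
Recall that we have in mind
\[ \lambda_{d_1, \dots, d_k} \sim \mu(d_1) \cdots \mu(d_k). \]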

The Möbius functions of the $f_i$ then cancel out and the $f_i$ mostly cancel with the $\varphi(f_i)$. Therefore, the choice of $y_{f_1, \dots, f_k}$ will look like
\[ y_{f_1, \dots, f_k} \sim F\left( \frac{\log f_1}{\log R}, \dots, \frac{\log f_k}{\log R} \right). \]
Then, after an invertible change of variables,
\[ \lambda_{d_1, \dots, d_k} = \Big( \prod_j \mu(d_j) d_j \Big) \sum_{d_j \mid a_j} \frac{y_{a_1, \dots, a_k}}{\varphi(a_1) \cdots \varphi(a_k)}. \]
Then,
\[ Q_1 \sim \frac{x}{W} \sum_{f_1, \dots, f_k} \frac{y_{f_1, \dots, f_k}^2}{\varphi(f_1) \cdots \varphi(f_k)}. \]
So, $y_{f_1, \dots, f_k} = 0$ unless $f_1 \cdots f_k \le R$ is squarefree and coprime to $W$. We make the choice
\[ y_{f_1, \dots, f_k} = F\left( \frac{\log f_1}{\log R}, \dots, \frac{\log f_k}{\log R} \right). \]
Then, $Q_1$ is approximately
\[ \frac{x}{W} \left( \frac{\varphi(W)}{W} \right)^k (\log R)^k \int_{x_1, \dots, x_k} F(x_1, \dots, x_k)^2\, dx_1 \cdots dx_k \]
where
\[ F(x_1, \dots, x_k) = 0 \]
unless $x_1 + \cdots + x_k \le 1$.
Remark 19.3. The $(\log R)^k$ in the numerator (instead of the denominator) is a scaling fact relating to how we chose the $y_{f_1, \dots, f_k}$.
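Let's now start understanding $Q_2$, which we will finish on Thursday. This will be similar. We have
\[ \begin{aligned}
Q_2 &\sim k \sum_{\substack{n \sim x \\ n + h_k \text{ prime} \\ n \equiv \nu \bmod W}} \Big( \sum_{d_i \mid n + h_i} \lambda_{d_1, \dots, d_k} \Big)^2 \\
&= k \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k \\ d_k = 1,\ e_k = 1}} \lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k} \sum_{\substack{n \sim x \\ n \equiv \nu \bmod W \\ n + h_k \text{ prime} \\ n + h_i \equiv 0 \bmod [d_i, e_i]}} 1.
\end{aligned} \]
Again, on the last line above there will be a coprimality condition on $(d_i, e_j)$ which we can ignore as was done above. Then, we can simplify
\[ \sum_{\substack{n \sim x \\ n \equiv \nu \bmod W \\ n + h_k \text{ prime} \\ n + h_i \equiv 0 \bmod [d_i, e_i]}} 1 \sim \frac{x}{(\log x)\, \varphi(W) \prod_{i=1}^{k-1} \varphi([d_i, e_i])}. \]
We'll finish understanding this quadratic form on Thursday.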

20. 12/7/17
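Recall last time we had an admissible set
\[ \mathcal{H} = \{h_1, \dots, h_k\} \]
and we chose
\[ W = \prod_{p \le \log \log \log x} p \]
and a residue class
\[ \nu \bmod W \text{ with } (\nu + h_i, W) = 1 \text{ for all } i. \]
We had
\[ Q_1 = \sum_{\substack{n \le x \\ n \equiv \nu \bmod W}} \Big( \sum_{d_i \mid n + h_i} \lambda_{d_1, \dots, d_k} \Big)^2. \]
When $R \le x^{1/2 - \varepsilon}$, we evaluated the above as
\[ \frac{x}{W} \sum_{r_1, \dots, r_k} \frac{y_{r_1, \dots, r_k}^2}{\prod_j \varphi(r_j)} \]
with
\[ y_{r_1, \dots, r_k} = \Big( \prod_i \mu(r_i) \varphi(r_i) \Big) \sum_{r_i \mid d_i} \frac{\lambda_{d_1, \dots, d_k}}{d_1 \cdots d_k} \]
where
\[ \lambda_{d_1, \dots, d_k} = \Big( \prod_i \mu(d_i) d_i \Big) \sum_{d_i \mid r_i} \frac{y_{r_1, \dots, r_k}}{\varphi(r_1) \cdots \varphi(r_k)}. \]
We chose
\[ y_{r_1, \dots, r_k} = F\left( \frac{\log r_1}{\log R}, \dots, \frac{\log r_k}{\log R} \right) \]
with
\[ F(x_1, \dots, x_k) \]
supported on $x_1 + \cdots + x_k \le 1$.
We then get
\[ Q_1 = \frac{x}{W} \left( \frac{\varphi(W)}{W} \right)^k (\log R)^k \int_{x_1, \dots, x_k} F(x_1, \dots, x_k)^2\, dx_1 \cdots dx_k. \]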

We have
\[ \frac{1}{\varphi([d, e])} = \frac{\varphi((d, e))}{\varphi(d) \varphi(e)} \]
and we use that
\[ \varphi(n) = \sum_{d \mid n} g(d) \]
with $g$ multiplicative (on relatively prime inputs) with $g(p) = p - 2$.
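Both identities are elementary and can be sanity checked numerically on squarefree inputs, which is the only case needed here. The sketch below uses hypothetical helper names `phi`, `g`, `phi_is_sum_of_g` and `lcm_gcd_identity`.

```python
from math import gcd


def phi(n):
    """Euler's totient by trial division."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result


def g(d):
    """Product of (p - 2) over primes p dividing d (so g(p) = p - 2)."""
    value, p = 1, 2
    while d > 1:
        if d % p == 0:
            value *= p - 2
            while d % p == 0:
                d //= p
        p += 1
    return value


def phi_is_sum_of_g(n):
    """Check phi(n) = sum_{d | n} g(d); intended for squarefree n."""
    return phi(n) == sum(g(d) for d in range(1, n + 1) if n % d == 0)


def lcm_gcd_identity(d, e):
    """Check phi([d, e]) * phi((d, e)) = phi(d) * phi(e)."""
    common = gcd(d, e)
    return phi(d * e // common) * phi(common) == phi(d) * phi(e)
```

For example, `phi_is_sum_of_g(30)` compares $\varphi(30) = 8$ with $g(1) + g(2) + \cdots + g(30)$ over the divisors of 30.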


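We compare this $Q_1$ to (with the function $g$ defined above)
\[ \begin{aligned}
Q_2 &= \sum_{\substack{n \le x \\ n + h_k \text{ prime} \\ n \equiv \nu \bmod W}} \Big( \sum_{d_i \mid n + h_i} \lambda_{d_1, \dots, d_k} \Big)^2 \\
&= \frac{x}{\varphi(W) \log x} \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k \\ d_k = e_k = 1}} \frac{\lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}}{\prod_i \varphi([d_i, e_i])} \\
&= \frac{x}{\varphi(W) \log x} \sum_{f_1, \dots, f_k} \Big( \prod_{i=1}^{k} g(f_i) \Big) \sum_{\substack{d_i, e_i \\ f_i \mid d_i,\ f_i \mid e_i \\ d_k = e_k = 1}} \frac{\lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}}{\prod_j \varphi(d_j) \prod_j \varphi(e_j)}.
\end{aligned} \]
Note that above we had the condition $[e_i, d_i] \mid n + h_i$ and a coprimality condition $(d_i, e_j) = 1$ for $i \ne j$.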
Then, we let
\[ y^{(k)}_{f_1, \dots, f_k} = \Big( \prod_i \mu(f_i) g(f_i) \Big) \cdot \sum_{\substack{d_1, \dots, d_k \\ d_k = 1,\ f_i \mid d_i}} \frac{\lambda_{d_1, \dots, d_k}}{\varphi(d_1) \cdots \varphi(d_k)}. \]
By convention we set $y^{(k)}_{f_1, \dots, f_k} = 0$ unless $f_k = 1$.
We then have
\[ Q_2 = \frac{x}{\varphi(W) \log x} \sum_{f_1, \dots, f_k} \frac{\big( y^{(k)}_{f_1, \dots, f_k} \big)^2}{\prod_{j=1}^{k} g(f_j)}. \]

Lemma 20.1. Letting $f_k = d_k = 1$,
\[ y^{(k)}_{f_1, \dots, f_k} \sim \sum_{r_k} \frac{y_{f_1, \dots, f_{k-1}, r_k}}{\varphi(r_k)}. \]

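Proof. We have
\begin{align*}
y^{(k)}_{f_1,\ldots,f_k}
&= \Big( \prod_i \mu(f_i) g(f_i) \Big) \sum_{\substack{d_1,\ldots,d_k \\ f_i \mid d_i,\ d_k = 1}} \frac{\lambda_{d_1,\ldots,d_k}}{\phi(d_1) \cdots \phi(d_k)} \\
&= \Big( \prod_i \mu(f_i) g(f_i) \Big) \sum_{\substack{f_i \mid d_i \\ d_k = 1}} \frac{1}{\phi(d_1) \cdots \phi(d_k)} \Big( \prod_j \mu(d_j) d_j \Big) \sum_{\substack{r_1,\ldots,r_k \\ d_j \mid r_j}} \frac{y_{r_1,\ldots,r_k}}{\phi(r_1) \cdots \phi(r_k)} \\
&= \Big( \prod_j \mu(f_j) g(f_j) \Big) \sum_{\substack{r_1,\ldots,r_k \\ f_j \mid r_j}} \frac{y_{r_1,\ldots,r_k}}{\prod_j \phi(r_j)} \sum_{\substack{d_1,\ldots,d_k,\ d_k = 1 \\ f_i \mid d_i \mid r_i}} \prod_{j=1}^k \frac{\mu(d_j) d_j}{\phi(d_j)} .
\end{align*}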

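Fixing $f_i, r_j$ we want to compute
\[
\sum_{\substack{d_1,\ldots,d_k,\ d_k = 1 \\ f_i \mid d_i \mid r_i}} \prod_{j=1}^k \frac{\mu(d_j) d_j}{\phi(d_j)} .
\]
This sum factors as a product over $j$. If $j = k$, then the factor is 1. If $j < k$ the factor is
\[
\frac{f_j \mu(f_j)}{\phi(f_j)} \prod_{p \mid r_j / f_j} \Big( 1 - \frac{p}{p-1} \Big) .
\]
Since $1 - \frac{p}{p-1} = -\frac{1}{p-1}$ (and the moduli are coprime to $W$, so each such $p$ is large), the above factor is relatively small, unless $r_j = f_j$, in which case the product is the empty product and goes away.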
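The closed form for the $j < k$ factor can be checked in exact arithmetic. In the sketch below (the helper names are ours), $f$ and $r$ are given by lists of distinct primes, and we use that $\mu(d) d / \phi(d) = \prod_{p \mid d} \big( -\tfrac{p}{p-1} \big)$ for squarefree $d$.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def term(primes):
    """mu(d) * d / phi(d) for squarefree d = prod(primes)."""
    return prod((Fraction(-p, p - 1) for p in primes), start=Fraction(1))

def check(f_primes, extra_primes):
    """Return (lhs, rhs), where f = prod(f_primes), r = f * prod(extra_primes):
    lhs = sum over f | d | r of mu(d) d / phi(d),
    rhs = (f mu(f) / phi(f)) * prod over p | r/f of (1 - p/(p-1))."""
    lhs = Fraction(0)
    for m in range(len(extra_primes) + 1):
        for extra in combinations(extra_primes, m):
            lhs += term(list(f_primes) + list(extra))
    rhs = term(f_primes) * prod(
        (1 - Fraction(p, p - 1) for p in extra_primes), start=Fraction(1))
    return lhs, rhs

lhs, rhs = check((3, 5), (7, 11))
assert lhs == rhs
```

With a single large extra prime, say `check((), (101,))`, both sides equal $-1/100$, which illustrates why terms with $r_j \ne f_j$ contribute little.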

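Therefore, we can simplify
\begin{align*}
& \Big( \prod_j \mu(f_j) g(f_j) \Big) \sum_{\substack{r_1,\ldots,r_k \\ f_j \mid r_j}} \frac{y_{r_1,\ldots,r_k}}{\prod_j \phi(r_j)} \sum_{\substack{d_1,\ldots,d_k,\ d_k = 1 \\ f_i \mid d_i \mid r_i}} \prod_{j=1}^k \frac{\mu(d_j) d_j}{\phi(d_j)} \\
&\quad \sim \Bigg( \prod_j \frac{\mu(f_j) g(f_j) f_j \mu(f_j)}{\phi(f_j)} \Bigg) \sum_{r_k} \frac{y_{f_1,\ldots,f_{k-1},r_k}}{\phi(f_1) \cdots \phi(f_{k-1}) \phi(r_k)} \\
&\quad = \Bigg( \prod_j \frac{\mu(f_j) g(f_j) f_j \mu(f_j)}{\phi(f_j)^2} \Bigg) \sum_{r_k} \frac{y_{f_1,\ldots,f_{k-1},r_k}}{\phi(r_k)} \\
&\quad \sim \sum_{r_k} \frac{y_{f_1,\ldots,f_{k-1},r_k}}{\phi(r_k)} .
\end{align*}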
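The last step uses that $\mu(f)^2 g(f) f / \phi(f)^2 = \prod_{p \mid f} \big( 1 - \tfrac{1}{(p-1)^2} \big)$ for squarefree $f$, which is close to 1 when all prime factors of $f$ are large. Here is a quick numerical illustration; the function name `correction` and the sample primes are ours.

```python
from fractions import Fraction
from math import prod

def correction(primes):
    """g(f) * f / phi(f)^2 for squarefree f = prod(primes), with g(p) = p - 2."""
    return prod((Fraction(p * (p - 2), (p - 1) ** 2) for p in primes),
                start=Fraction(1))

print(float(correction([3, 5, 7])))            # small primes: noticeably below 1
print(float(correction([1009, 1013, 1019])))   # large primes: very close to 1
```

In the notes the relevant primes exceed $\log\log\log x$, so the correction factor tends to 1 (slowly) as $x \to \infty$.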

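Using the lemma, let’s continue to evaluate $Q_2$. Recall we chose $F$ so that
\[
y_{r_1,\ldots,r_k} = F\Big( \frac{\log r_1}{\log R}, \ldots, \frac{\log r_k}{\log R} \Big) .
\]
By Lemma 20.1,
\begin{align*}
y^{(k)}_{f_1,\ldots,f_k} &\sim \sum_{r_k} \frac{y_{f_1,\ldots,f_{k-1},r_k}}{\phi(r_k)} \\
&= \frac{\phi(W)}{W} (\log R) \int F\Big( \frac{\log f_1}{\log R}, \ldots, \frac{\log f_{k-1}}{\log R}, x_k \Big) \, dx_k .
\end{align*}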

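Plugging this in for $Q_2$, we have
\begin{align*}
Q_2 &= \frac{x}{\phi(W) \log x} \sum_{f_1,\ldots,f_k} \frac{\big( y^{(k)}_{f_1,\ldots,f_k} \big)^2}{\prod_{j=1}^k g(f_j)} \\
&= \frac{x}{\phi(W) \log x} \Big( \frac{\phi(W)}{W} \log R \Big)^2 \Big( \frac{\phi(W)}{W} \log R \Big)^{k-1} \int_{x_1,\ldots,x_{k-1}} \Big( \int_{x_k} F(x_1, \ldots, x_{k-1}, x_k) \, dx_k \Big)^2 dx_1 \cdots dx_{k-1} .
\end{align*}
(Here the sum only counts $n$ with $n + h_k$ prime. We are really multiplying by $k$ to account for each choice of a particular $h_i$ in place of $h_k$, and we are assuming $F(x_1, \ldots, x_k)$ is symmetric.)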

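So, for comparison, we have
\begin{align*}
Q_1 &= \frac{x}{W} \Big( \frac{\phi(W)}{W} \log R \Big)^k \int_{x_1,\ldots,x_k} F(x_1, \ldots, x_k)^2 \, dx_1 \cdots dx_k \\
Q_2 &= \frac{kx}{W} \Big( \frac{\phi(W)}{W} \log R \Big)^k \frac{\log R}{\log x} \int_{x_1,\ldots,x_{k-1}} \Big( \int_{x_k} F(x_1, \ldots, x_k) \, dx_k \Big)^2 dx_1 \cdots dx_{k-1} .
\end{align*}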

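We then see that $Q_2$ almost matches up with our first quadratic form $Q_1$: the prefactors agree up to the factor $k \log R / \log x$, and the only other difference is that the two expressions are two different quadratic forms in the function $F$ we are choosing. So, we have boiled everything down to a problem of optimizing $F$.
Recall we want $F(x_1, \ldots, x_k)$ to be symmetric and supported on $x_1 + \cdots + x_k \le 1$. We will choose
\[
F(x_1, \ldots, x_k) = \prod_{i=1}^k g(k x_i)
\]
on $x_1 + \cdots + x_k \le 1$ (and $F = 0$ elsewhere), for some fixed function $g$ of one variable. From here on, $g$ denotes this new function and not the multiplicative function $g$ used above.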


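After cancelling the common factor $\frac{x}{W} \big( \frac{\phi(W)}{W} \log R \big)^k$, the numerator of the ratio $Q_2/Q_1$ is given by
\begin{align*}
& k \, \frac{\log R}{\log x} \int_{x_1,\ldots,x_{k-1}} \Bigg( \int_{\substack{x_k \\ x_1 + \cdots + x_k \le 1}} \prod_{i=1}^k g(k x_i) \, dx_k \Bigg)^2 dx_1 \cdots dx_{k-1} \\
&\quad = \frac{k}{k^{k+1}} \, \frac{\log R}{\log x} \int_{u_1,\ldots,u_{k-1}} \Bigg( \int_{\substack{u_k \\ u_1 + \cdots + u_k \le k}} g(u_k) \, du_k \Bigg)^2 \prod_{j=1}^{k-1} g(u_j)^2 \, du_j
\end{align*}
with $k x_i = u_i$.
Then, the denominator is similarly given by
\[
\frac{1}{k^k} \int_{\substack{u_1,\ldots,u_k \\ u_1 + \cdots + u_k \le k}} \prod_{j=1}^k g(u_j)^2 \, du_j .
\]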

Next, we will give an upper bound on the denominator and a lower bound on the numerator. For the denominator, we have the upper bound
\[
\int_{\substack{u_1,\ldots,u_k \\ u_1 + \cdots + u_k \le k}} \prod_{j=1}^k g(u_j)^2 \, du_j \le \Big( \int_0^\infty g(u)^2 \, du \Big)^k .
\]
So, $g$ is a function on the positive reals, but we may as well take $g$ supported on $[0, k]$, since we are only integrating over $u_i$ with $\sum_i u_i \le k$. Let’s assume that $g$ is supported on $[0, B]$ with $B$ a bit smaller than $k$, say $B \sim k/100$. Now, let’s obtain a lower bound for the numerator.

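We’ll let $u_k$ go up to $B$. Restricting to $u_1 + \cdots + u_{k-1} \le k - B$, so that $u_k$ ranges over all of $[0, B]$, we then get a lower bound for the numerator:
\[
\int_{\substack{u_1,\ldots,u_{k-1} \\ u_1 + \cdots + u_{k-1} \le k - B}} \Big( \int_{u_k} g(u_k) \, du_k \Big)^2 \prod_{j=1}^{k-1} g(u_j)^2 \, du_j . \tag{20.1}
\]
If we ignore the restriction that $u_1 + \cdots + u_{k-1} \le k - B$, then $Q_2/Q_1$ is bounded below by, up to some constant,
\[
\frac{\big( \int g(u) \, du \big)^2}{\int g(u)^2 \, du} \cdot \frac{\log R}{\log x} .
\]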
Remark 20.2. But now, how can we ignore the restriction that $u_1 + \cdots + u_{k-1} \le k - B$? This might seem like a serious issue, but we now discuss the answer.
The key additional observation is the following. If we know $\int u g(u)^2 \, du \le \frac{1}{2} \int g(u)^2 \, du$, then the average value of $u$ with respect to the weight $g(u)^2 \, du$ is at most $1/2$. Then, $u_1 + \cdots + u_{k-1}$ is at most about $k/2$, most of the time. So we should then be able to ignore the condition $\sum_{i=1}^{k-1} u_i \le k - B$. Let’s now make this idea more precise.
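We now observe that
\begin{align*}
& \int_{\substack{u_1,\ldots,u_{k-1} \\ u_1 + \cdots + u_{k-1} \le k - B}} \Big( \int_{u_k} g(u_k) \, du_k \Big)^2 \prod_{j=1}^{k-1} g(u_j)^2 \, du_j \\
&\quad \ge \Big( \int g(u) \, du \Big)^2 \int_{u_1,\ldots,u_{k-1}} \Bigg( 1 - \Big( \frac{u_1 + \cdots + u_{k-1}}{k - B} \Big)^2 \Bigg) \prod_{j=1}^{k-1} g(u_j)^2 \, du_j \\
&\quad \ge \Big( \int g(u) \, du \Big)^2 \Bigg\{ \Big( \int g(u)^2 \, du \Big)^{k-1} - \frac{k-1}{(k-B)^2} \int u^2 g(u)^2 \, du \, \Big( \int g(u)^2 \, du \Big)^{k-2} \\
&\qquad\qquad - \frac{(k-1)(k-2)}{(k-B)^2} \Big( \int u g(u)^2 \, du \Big)^2 \Big( \int g(u)^2 \, du \Big)^{k-3} \Bigg\} .
\end{align*}
Here the first inequality uses that the indicator of $s \le k - B$ is at least $1 - \big( \frac{s}{k-B} \big)^2$ for $s \ge 0$, and the second expands the square of $u_1 + \cdots + u_{k-1}$.
Then, since $g$ is supported on $[0, B]$,
\[
\int u^2 g(u)^2 \, du \le B \int u g(u)^2 \, du \le \frac{B}{2} \int g(u)^2 \, du
\]
and, with $B \sim k/100$, the whole term
\[
\frac{k-1}{(k-B)^2} \int u^2 g(u)^2 \, du \, \Big( \int g(u)^2 \, du \Big)^{k-2} \le \frac{(k-1) B}{2 (k-B)^2} \Big( \int g(u)^2 \, du \Big)^{k-1} \approx \frac{1}{200} \Big( \int g(u)^2 \, du \Big)^{k-1} .
\]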

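Similarly, using $\int u g(u)^2 \, du \le \frac{1}{2} \int g(u)^2 \, du$, the last term in the braces is at most $\frac{(k-1)(k-2)}{4 (k-B)^2} \big( \int g(u)^2 \, du \big)^{k-1}$, which is about $\frac{1}{4} \big( \int g(u)^2 \, du \big)^{k-1}$. So, we can bound Equation 20.1 below by
\[
\frac{1}{2} \Big( \int g(u) \, du \Big)^2 \Big( \int g(u)^2 \, du \Big)^{k-1} .
\]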
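The constant $\frac{1}{2}$ has some slack. As a rough check, the sketch below bounds the two subtracted terms under our two standing assumptions ($g$ supported on $[0, B]$ and $\int u g^2 \le \frac{1}{2} \int g^2$) and reports what is left after dividing by $\big( \int g^2 \big)^{k-1}$. The function name is ours.

```python
def brace_lower_bound(k, B):
    """Lower bound for the braced expression divided by (int g^2)^(k-1),
    assuming int u g^2 <= (1/2) int g^2 and g supported on [0, B]."""
    # int u^2 g^2 <= B * int u g^2 <= (B/2) * int g^2
    second = (k - 1) * (B / 2) / (k - B) ** 2
    # (int u g^2)^2 <= (1/4) * (int g^2)^2
    third = (k - 1) * (k - 2) / (4 * (k - B) ** 2)
    return 1 - second - third

for k in (100, 1000, 10 ** 6):
    print(k, brace_lower_bound(k, k / 100))
```

With $B = k/100$ the two subtracted terms are roughly $\frac{1}{200}$ and $\frac{1}{4}$, so the remaining fraction stays comfortably above $\frac{1}{2}$.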
So, we now want
\[
\int u g(u)^2 \, du \le \frac{1}{2} \int g(u)^2 \, du
\]
and we want to make $\big( \int g(u) \, du \big)^2$ large in comparison to $\int g(u)^2 \, du$.
The condition vaguely means that most of the mass should appear on small numbers. You can start to guess what function might work for $g$ (or you can try to use calculus of variations). We can try $g(u) = \frac{1}{u}$, though this has a pole at 0. Let’s try
\[
g(u) = \begin{cases} \frac{1}{u} & \text{if } \frac{1}{A} \le u \le B \\ 0 & \text{else.} \end{cases}
\]
Then,
\begin{align*}
\int g(u) \, du &= \log AB \\
\int g(u)^2 \, du &\sim A \\
\int u g(u)^2 \, du &= \log AB .
\end{align*}
So, we need something like $\log AB \le A/2$. So, let’s take $B = k/100$. We then have $\log AB \sim \log k$. We can take $A = 3 \log k$. It then meets the conditions that $B \le k/100$ and
\[
\int u g(u)^2 \, du \le \frac{1}{2} \int g(u)^2 \, du .
\]
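For this choice of $g$ the three integrals have closed forms (with $\int g^2 = A - 1/B$ exactly), so the condition is easy to check for specific $k$. This is a sketch with our own function names; the sample values of $k$ are arbitrary.

```python
from math import log

def integrals(A, B):
    """Closed forms for g(u) = 1/u on [1/A, B] and g = 0 elsewhere."""
    int_g = log(A * B)      # integral of g
    int_g2 = A - 1 / B      # integral of g^2, which is ~ A for large B
    int_ug2 = log(A * B)    # integral of u * g^2
    return int_g, int_g2, int_ug2

def condition_holds(k):
    """Check int u g^2 <= (1/2) int g^2 for A = 3 log k, B = k / 100."""
    A, B = 3 * log(k), k / 100
    _, int_g2, int_ug2 = integrals(A, B)
    return int_ug2 <= int_g2 / 2

print([condition_holds(k) for k in (200, 10 ** 4, 10 ** 6, 10 ** 12)])
```

The point is that $\log AB = \log k + \log(3 \log k) - \log 100$, while $A/2 = \frac{3}{2} \log k$, so there is room to spare.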
To conclude, we now want to compute $Q_2/Q_1$. Indeed,
\begin{align*}
\frac{\big( \int g(u) \, du \big)^2}{\int g(u)^2 \, du} &\sim \frac{(\log AB)^2}{A} \\
&\sim \frac{(\log k)^2}{3 \log k} \\
&= \frac{\log k}{3} .
\end{align*}
And indeed, this goes to $\infty$ as $k \to \infty$.
So, in any tuple where $k$ is sufficiently large, where you expect to find $k$ primes, you can at least find on the order of $\log k$ primes.
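To get a feel for how slowly this grows, here is a small sketch evaluating $\big( \int g \big)^2 / \int g^2 = (\log AB)^2 / (A - 1/B)$ for $A = 3 \log k$, $B = k/100$, next to the asymptotic value $\frac{\log k}{3}$. The function name and sample values of $k$ are ours, and the constant factors relating this ratio to $Q_2/Q_1$ (such as $\log R / \log x$) are left out.

```python
from math import log

def ratio(k):
    """(int g)^2 / (int g^2) for g(u) = 1/u on [1/A, B], A = 3 log k, B = k/100."""
    A, B = 3 * log(k), k / 100
    return log(A * B) ** 2 / (A - 1 / B)

for k in (10 ** 3, 10 ** 6, 10 ** 12, 10 ** 24):
    print(k, ratio(k), log(k) / 3)
```

The ratio increases with $k$ but only logarithmically, matching the conclusion above.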
