An Introduction to Modular Forms

Henri Cohen
arXiv:1809.10907v1 [math.NT] 28 Sep 2018

Abstract
In this course we introduce the main notions relative to the classical theory of
modular forms. A complete treatise in a similar style can be found in the author’s
book joint with F. Strömberg [1].

1 Functional Equations

Let f be a complex function defined over some subset D of C. A functional equation
is some type of equation relating the value of f at any point z ∈ D to its value at
some other point, for instance f(z + 1) = f(z). If γ is some function from D to itself,
one can ask more generally that f(γ(z)) = f(z) for all z ∈ D (or even
f(γ(z)) = v(γ, z) f(z) for some known function v). It is clear that f(γ^m(z)) = f(z)
for all m ≥ 0, and even for all m ∈ Z if γ is invertible, and more generally the set
of bijective functions u such that f(u(z)) = f(z) forms a group.
Thus, the basic setting of functional equations (at least of the type that we
consider) is that we have a group of transformations G of D, that we ask that
f (u(z)) = f (z) (or more generally f (u(z)) = j(u, z) f (z) for some known j) for all
u ∈ G and z ∈ D, and we ask for some type of regularity condition on f such as
continuity, meromorphy, or holomorphy.
Note that there is a trivial but essential way to construct from scratch functions f
satisfying a functional equation of the above type: simply choose any function g and
set f(z) = ∑_{v∈G} g(v(z)). Since G is a group, it is clear that formally f(u(z)) = f(z)
for u ∈ G. Of course there are convergence questions to be dealt with, but this is a
fundamental construction, which we call averaging over the group.
We consider a few fundamental examples.

Henri Cohen
Institut de Mathématiques de Bordeaux, Université de Bordeaux, 351 Cours de la Libération,
33405 TALENCE Cedex, FRANCE, e-mail: [email protected]


1.1 Fourier Series

We choose D = R and G = Z acting on R by translations. Thus, we ask that
f(x + 1) = f(x) for all x ∈ R. It is well-known that this leads to the theory of Fourier
series: if f satisfies suitable regularity conditions (we need not specify them here
since in the context of modular forms they will be satisfied) then f has an expansion
of the type
\[ f(x) = \sum_{n\in\mathbb{Z}} a(n)\, e^{2\pi i n x} , \]
absolutely convergent for all x ∈ R, where the Fourier coefficients a(n) are given by
the formula
\[ a(n) = \int_0^1 e^{-2\pi i n x} f(x)\, dx , \]
which follows immediately from the orthonormality of the functions e^{2πimx} (you
may of course replace the integral from 0 to 1 by an integral from z to z + 1 for any
z ∈ R).
An important consequence of this, easily proved, is the Poisson summation
formula: define the Fourier transform of f by
\[ \hat{f}(x) = \int_{-\infty}^{\infty} e^{-2\pi i x t} f(t)\, dt . \]

We ignore all convergence questions, although of course they must be taken into
account in any computation.
Consider the function g(x) = ∑_{n∈Z} f(x + n), which is exactly the averaging
procedure mentioned above. Thus g(x + 1) = g(x), so g has a Fourier series, and an
easy computation shows the following (again omitting any convergence or regularity
assumptions):

Proposition 1.1 (Poisson summation). We have
\[ \sum_{n\in\mathbb{Z}} f(x+n) = \sum_{m\in\mathbb{Z}} \hat{f}(m)\, e^{2\pi i m x} . \]
In particular
\[ \sum_{n\in\mathbb{Z}} f(n) = \sum_{m\in\mathbb{Z}} \hat{f}(m) . \]
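As a numerical sanity check of Poisson summation, here is a minimal Python sketch. It assumes the standard fact that the Gaussian e^{−πt²} is its own Fourier transform; the function names and the truncation bound N are illustrative choices, not part of the text.

```python
import math

def gaussian(t):
    """f(t) = exp(-pi t^2); assumed here to be its own Fourier transform."""
    return math.exp(-math.pi * t * t)

def poisson_lhs(x, N=30):
    """Truncated left-hand side: sum of f(x + n) over |n| <= N."""
    return sum(gaussian(x + n) for n in range(-N, N + 1))

def poisson_rhs(x, N=30):
    """Truncated right-hand side: sum of f-hat(m) e^{2 pi i m x} over |m| <= N.
    The terms m and -m combine into a cosine, so the sum is real."""
    return gaussian(0) + 2 * sum(
        gaussian(m) * math.cos(2 * math.pi * m * x) for m in range(1, N + 1)
    )

for x in (0.0, 0.25, 0.4):
    print(x, poisson_lhs(x), poisson_rhs(x))
```

Both columns should agree to roughly machine precision, since the neglected terms are astronomically small.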

A typical application of this formula is to the ordinary Jacobi theta function: it
is well-known (prove it otherwise) that the function e^{−πx²} is invariant under Fourier
transform. This implies the following:

Proposition 1.2. If f(x) = e^{−aπx²} for some a > 0 then f̂(x) = a^{−1/2} e^{−πx²/a}.

Proof. Simple change of variable in the integral. □


Corollary 1.3. Define
\[ T(a) = \sum_{n\in\mathbb{Z}} e^{-a\pi n^2} . \]
We have the functional equation
\[ T(1/a) = a^{1/2}\, T(a) . \]

Proof. Immediate from the proposition and Poisson summation. □


This is historically the first example of modularity, which we will see in more
detail below.
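The functional equation T(1/a) = a^{1/2} T(a) is easy to test numerically. The following is only a sketch: `theta_T` and its truncation parameter are hypothetical names, and the sum is cut off where the terms underflow in double precision.

```python
import math

def theta_T(a, terms=200):
    """Partial sum of T(a) = sum over n in Z of exp(-a*pi*n^2), using |n| <= terms."""
    return 1.0 + 2.0 * sum(math.exp(-a * math.pi * n * n) for n in range(1, terms + 1))

for a in (0.5, 1.0, 2.0, 3.7):
    left = theta_T(1.0 / a)
    right = math.sqrt(a) * theta_T(a)
    print(a, left, right)
```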
Exercise 1.4. Set S = ∑_{n≥1} e^{−(n/10)²}.
1. Compute numerically S to 100 decimal digits, and show that it is apparently equal
to 5√π − 1/2.
2. Show that in fact S is not exactly equal to 5√π − 1/2, and using the above
corollary give a precise estimate for the difference.
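For the numerical part of this exercise, one possible setup uses Python's standard `decimal` module. This is a sketch under stated assumptions: the working precision of 130 digits, the helper `arctan_inv`, and the use of Machin's formula for π are choices made here, not taken from the text.

```python
from decimal import Decimal, getcontext

getcontext().prec = 130  # working precision, comfortably above 100 digits

def arctan_inv(k):
    """arctan(1/k) for an integer k >= 2, summed from its alternating Taylor series."""
    eps = Decimal(10) ** (-(getcontext().prec + 5))
    total, power, j = Decimal(0), Decimal(1) / k, 0
    while power > eps:
        term = power / (2 * j + 1)
        total += term if j % 2 == 0 else -term
        power /= k * k  # next odd power of 1/k
        j += 1
    return total

# Machin's formula: pi = 16 arctan(1/5) - 4 arctan(1/239)
PI = 16 * arctan_inv(5) - 4 * arctan_inv(239)

# S = sum_{n >= 1} exp(-(n/10)^2); terms with n > 200 are below 10^-170
S = sum((Decimal(-n * n) / 100).exp() for n in range(1, 201))

candidate = 5 * PI.sqrt() - Decimal(1) / 2
print(S)
print(abs(S - candidate))
```

At this precision the two numbers are indistinguishable, which is the point of part 1; the actual difference has to be estimated analytically, as part 2 asks.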

Exercise 1.5. 1. Show that the function f(x) = 1/cosh(πx) is also invariant under
Fourier transform.
2. In a manner similar to the corollary, define
\[ T_2(a) = \sum_{n\in\mathbb{Z}} \frac{1}{\cosh(\pi n a)} . \]
Show that we have the functional equation
\[ T_2(1/a) = a\, T_2(a) . \]
3. Show that in fact T_2(a) = T(a)² (this may be more difficult).
4. Do the same exercise as the previous one by noticing that S = ∑_{n≥1} 1/cosh(n/10)
is very close to 5π − 1/2.

Above we have mainly considered Fourier series of functions defined on R. We
now consider more generally functions f defined on C or a subset of C. We again
assume that f(z + 1) = f(z), i.e., that f is periodic of period 1. Thus (modulo
regularity) f has a Fourier series, but the Fourier coefficients a(n) now depend on
y = ℑ(z):
\[ f(x+iy) = \sum_{n\in\mathbb{Z}} a(n;y)\, e^{2\pi i n x} \quad\text{with}\quad a(n;y) = \int_0^1 f(x+iy)\, e^{-2\pi i n x}\, dx . \]

If we impose no extra condition on f, the functions a(n; y) are quite arbitrary.
But in almost all of our applications f will be holomorphic; this means that
∂f/∂z̄ = 0, or equivalently that (∂/∂x + i∂/∂y)(f) = 0. Replacing in the
Fourier expansion (recall that we do not worry about convergence issues) gives
\[ \sum_{n\in\mathbb{Z}} \bigl(2\pi i n\, a(n;y) + i\, a'(n;y)\bigr)\, e^{2\pi i n x} = 0 , \]
where the prime denotes differentiation with respect to y,
hence by uniqueness of the expansion we obtain the differential equation
a′(n; y) = −2πn a(n; y), so that a(n; y) = c(n) e^{−2πny} for some constant c(n). This
allows us to write cleanly the Fourier expansion of a holomorphic function in the form
\[ f(z) = \sum_{n\in\mathbb{Z}} c(n)\, e^{2\pi i n z} . \]

Note that if the function is only meromorphic, the region of convergence will be
limited by the closest pole. Consider for instance the function
f(z) = 1/(e^{2πiz} − 1) = e^{−πiz}/(2i sin(πz)). If we set y = ℑ(z) we have
|e^{2πiz}| = e^{−2πy}, so if y > 0 we have the Fourier expansion
f(z) = −∑_{n≥0} e^{2πinz}, while if y < 0 we have the different Fourier
expansion f(z) = ∑_{n≤−1} e^{2πinz}.
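The two expansions of 1/(e^{2πiz} − 1) in the two half-planes can be compared numerically. The sketch below uses hypothetical helper names and an arbitrary truncation N; it only illustrates that each series is valid on its own side of the real axis.

```python
import cmath

def f(z):
    """f(z) = 1 / (e^{2 pi i z} - 1)."""
    return 1 / (cmath.exp(2j * cmath.pi * z) - 1)

def upper_expansion(z, N=60):
    """-sum_{n=0}^{N} e^{2 pi i n z}; converges for Im(z) > 0."""
    q = cmath.exp(2j * cmath.pi * z)
    return -sum(q ** n for n in range(N + 1))

def lower_expansion(z, N=60):
    """sum_{n=-N}^{-1} e^{2 pi i n z}; converges for Im(z) < 0."""
    q = cmath.exp(2j * cmath.pi * z)
    return sum(q ** n for n in range(-N, 0))

z_up, z_down = 0.3 + 0.5j, 0.3 - 0.5j
print(abs(f(z_up) - upper_expansion(z_up)))
print(abs(f(z_down) - lower_expansion(z_down)))
```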

2 Elliptic Functions

The preceding section was devoted to periodic functions. We now assume that our
functions are defined on some subset of C and assume that they are doubly peri-
odic: this can be stated either by saying that there exist two R-linearly independent
complex numbers ω1 and ω2 such that f (z + ωi ) = f (z) for all z and i = 1, 2, or
equivalently by saying that there exists a lattice Λ in C (here Zω1 + Zω2 ) such that
for any λ ∈ Λ we have f (z + λ ) = f (z).
Note in passing that if ω1 /ω2 ∈ Q this is equivalent to (single) periodicity, and if
ω1 /ω2 ∈ R \ Q the set of periods would be dense so the only “doubly periodic” (at
least continuous) functions would essentially reduce to functions of one variable.
For a similar reason there do not exist nonconstant continuous functions which are
triply periodic.
In the case of simply periodic functions considered above there already existed
some natural functions such as e^{2πinx}. In the doubly-periodic case no such function
exists (at least on an elementary level), so we have to construct them, and for this
we use the standard averaging procedure seen and used above. Here the group is the
lattice Λ, so we consider functions of the type f(z) = ∑_{ω∈Λ} φ(z + ω). For this to
converge φ(z) must tend to 0 sufficiently fast as |z| tends to infinity, and since this is
a double sum (Λ is a two-dimensional lattice), it is easy to see by comparison with
an integral (assuming |φ(z)| is regularly decreasing) that |φ(z)| should decrease at
least like 1/|z|^α for α > 2. Thus a first reasonable definition is to set
\[ f(z) = \sum_{\omega\in\Lambda} \frac{1}{(z+\omega)^3} = \sum_{(m,n)\in\mathbb{Z}^2} \frac{1}{(z+m\omega_1+n\omega_2)^3} . \]

This will indeed be a doubly periodic function, and by normal convergence it is
immediate to see that it is a meromorphic function on C having only poles for z ∈ Λ,
so this is our first example of an elliptic function, which is by definition a doubly
periodic function which is meromorphic on C. Note for future reference that since
−Λ = Λ this specific function f is odd: f(−z) = −f(z).

However, this is not quite the basic elliptic function that we need. We can
integrate term by term, as long as we choose constants of integration such that the
integrated series continues to converge. To avoid stupid multiplicative constants, we
integrate −2f(z): all antiderivatives of −2/(z + ω)³ are of the form 1/(z + ω)² + C(ω)
for some constant C(ω), hence to preserve convergence we will choose C(0) = 0 and
C(ω) = −1/ω² for ω ≠ 0: indeed, |1/(z + ω)² − 1/ω²| is asymptotic to 2|z|/|ω|³
as |ω| → ∞, so we are again in the domain of normal convergence. We will thus
define:
\[ \wp(z) = \frac{1}{z^2} + \sum_{\omega\in\Lambda\setminus\{0\}} \left( \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right) , \]
the Weierstrass ℘-function.


By construction ℘′(z) = −2f(z), where f is the function constructed above, so
℘′(z + ω) = ℘′(z) for any ω ∈ Λ, hence ℘(z + ω) = ℘(z) + D(ω) for some constant
D(ω) depending on ω but not on z. Note a slightly subtle point here: we use the fact
that C \ Λ is connected. Do you see why?
Now as before it is clear that ℘(z) is an even function: thus, setting z = −ω/2 we
have ℘(ω/2) = ℘(−ω/2) + D(ω) = ℘(ω/2) + D(ω), so D(ω) = 0 hence
℘(z + ω) = ℘(z) and ℘ is indeed an elliptic function. There is a mistake in this
reasoning: do you see it?
Since ℘ has poles on Λ, we cannot reason as we do when ω/2 ∈ Λ. Fortunately,
this does not matter: since ω_i/2 ∉ Λ for i = 1, 2, we have shown at least that
D(ω_i) = 0 hence that ℘(z + ω_i) = ℘(z) for i = 1, 2, so ℘ is doubly periodic (so
indeed D(ω) = 0 for all ω ∈ Λ).
The theory of elliptic functions is incredibly rich, and whole treatises have been
written about them. Since this course is mainly about modular forms, we will simply
summarize the main properties, and emphasize those that are relevant to us. All are
proved using manipulation of power series and complex analysis, and all the proofs
are quite straightforward. For instance:

Proposition 2.1. Let f be a nonzero elliptic function with period lattice Λ as above,
and denote by P = P_a a “fundamental parallelogram”
P_a = {z = a + xω1 + yω2, 0 ≤ x < 1, 0 ≤ y < 1}, where a is chosen so that the
boundary of P_a does not contain any zeros or poles of f (see Figure 1).
1. The number of zeros of f in P is equal to the number of poles (counted with
multiplicity), and this number is called the order of f.
2. The sum of the residues of f at the poles in P is equal to 0.
3. The sum of the zeros minus the sum of the poles of f in P (counted with
multiplicity) belongs to Λ.
4. If f is nonconstant its order is at least 2.

Proof. For (2), (1), and (3), simply integrate f(z), f′(z)/f(z), and z f′(z)/f(z)
respectively along the boundary of P and use the residue theorem. For (4), we first
note that by (2) f cannot have order 1 since it would have a simple pole with residue 0.
But it also cannot have order 0: this would mean that f has no pole, so it is an entire
function, and since it is doubly-periodic its values are those taken in the closure of P,
which is compact, so f is bounded. By a famous theorem of Liouville (of which this is
a no less famous application) it implies that f is constant, contradicting the
assumption of (4). □

[Figure: parallelogram with vertices a, ω1 + a, ω2 + a, and ω1 + ω2 + a; the points ω1 and ω2 and the label C_a are also marked]

Fig. 1 Fundamental Parallelogram P_a

Note that clearly ℘ has order 2, and the last result shows that we cannot find an
elliptic function of order 1. Note however the following:

Exercise 2.2. 1. By integrating term by term the series defining −℘(z) show that if
we define the Weierstrass zeta function
\[ \zeta(z) = \frac{1}{z} + \sum_{\omega\in\Lambda\setminus\{0\}} \left( \frac{1}{z+\omega} - \frac{1}{\omega} + \frac{z}{\omega^2} \right) , \]
this series converges normally on any compact subset of C \ Λ and satisfies
ζ′(z) = −℘(z).
2. Deduce that there exist constants η1 and η2 such that ζ(z + ω1) = ζ(z) + η1 and
ζ(z + ω2) = ζ(z) + η2, so that if ω = mω1 + nω2 we have
ζ(z + ω) = ζ(z) + mη1 + nη2. Thus ζ (which would be of order 1) is not
doubly-periodic but only quasi-doubly periodic: this is called a quasi-elliptic function.
3. By integrating around the usual fundamental parallelogram, show the important
relation due to Legendre:
\[ \omega_1\eta_2 - \omega_2\eta_1 = \pm 2\pi i , \]
the sign depending on the ordering of ω1 and ω2.

The main properties of ℘ that we want to mention are as follows: First, for z
sufficiently small and ω ≠ 0 we can expand
\[ \frac{1}{(z+\omega)^2} = \sum_{k\ge 0} (-1)^k (k+1)\, z^k\, \frac{1}{\omega^{k+2}} , \]
so
\[ \wp(z) = \frac{1}{z^2} + \sum_{k\ge 1} (-1)^k (k+1)\, z^k\, G_{k+2}(\Lambda) , \]
where we have set
\[ G_k(\Lambda) = \sum_{\omega\in\Lambda\setminus\{0\}} \frac{1}{\omega^k} , \]
which are called Eisenstein series of weight k. Since Λ is symmetrical, it is clear
that G_k = 0 if k is odd, so the expansion of ℘(z) around z = 0 is given by
\[ \wp(z) = \frac{1}{z^2} + \sum_{k\ge 1} (2k+1)\, z^{2k}\, G_{2k+2}(\Lambda) . \]

Second, one can show that all elliptic functions are simply rational functions in
℘(z) and ℘′ (z), so we need not look any further in our construction.
Third, and this is probably one of the most important properties of ℘(z), it
satisfies a differential equation of order 1: the proof is as follows. Using the above
Taylor expansion of ℘(z), it is immediate to check that
\[ F(z) = \wp'(z)^2 - \bigl(4\wp(z)^3 - g_2(\Lambda)\wp(z) - g_3(\Lambda)\bigr) \]
has an expansion around z = 0 beginning with F(z) = c_1 z + · · · , where we have set
g_2(Λ) = 60G_4(Λ) and g_3(Λ) = 140G_6(Λ). In addition, F is evidently an elliptic
function, and since it has no pole at z = 0 it has no poles on Λ hence no poles at all,
so it has order 0. Thus by Proposition 2.1 (4) F is constant, and since by construction
it vanishes at 0 it is identically 0. Thus ℘ satisfies the differential equation
\[ \wp'(z)^2 = 4\wp(z)^3 - g_2(\Lambda)\wp(z) - g_3(\Lambda) . \]

A fourth and somewhat surprising property of the function ℘(z) is connected to
the theory of elliptic curves: the above differential equation shows that (℘(z), ℘′(z))
parametrizes the cubic curve y² = 4x³ − g_2 x − g_3, which is the general equation of
an elliptic curve (you do not need to know the theory of elliptic curves for what
follows). Thus, if z1 and z2 are in C \ Λ, the two points P_i = (℘(z_i), ℘′(z_i)) for
i = 1, 2 are on the curve, hence if we draw the line through these two points (the
tangent to the curve if they are equal), it is immediate to see from Proposition 2.1 (3)
that the third point of intersection corresponds to the parameter −(z1 + z2), and can of
course be computed as a rational function of the coordinates of P1 and P2. It follows
that ℘(z) (and ℘′(z)) possesses an addition formula expressing ℘(z1 + z2) in terms
of the ℘(z_i) and ℘′(z_i).
Exercise 2.3. Find this addition formula. You will have to distinguish the cases
z1 = z2, z1 = −z2, and z1 ≠ ±z2.
An interesting corollary of the differential equation for ℘(z), which we will prove
in a different way below, is a recursion for the Eisenstein series G_{2k}(Λ):
Proposition 2.4. We have the recursion for k ≥ 4:
\[ (k-3)(2k-1)(2k+1)\, G_{2k} = 3 \sum_{2\le j\le k-2} (2j-1)(2(k-j)-1)\, G_{2j}\, G_{2(k-j)} . \]

Proof. Taking the derivative of the differential equation and dividing by 2℘′ we
obtain ℘′′(z) = 6℘(z)² − g_2(Λ)/2. If we set by convention G_0(Λ) = −1 and
G_2(Λ) = 0, and for notational simplicity omit Λ which is fixed, we have
℘(z) = ∑_{k≥−1} (2k + 1) z^{2k} G_{2k+2}, so on the one hand
\[ \wp''(z) = \sum_{k\ge -1} (2k+1)(2k)(2k-1)\, z^{2k-2}\, G_{2k+2} , \]
and on the other hand ℘(z)² = ∑_{K≥−2} a(K) z^{2K} with
\[ a(K) = \sum_{k_1+k_2=K} (2k_1+1)(2k_2+1)\, G_{2k_1+2}\, G_{2k_2+2} . \]
Replacing in the differential equation it is immediate to check that the coefficients
agree up to z², and for K ≥ 2 we have the identification
\[ 6 \sum_{\substack{k_1+k_2=K \\ k_i\ge -1}} (2k_1+1)(2k_2+1)\, G_{2k_1+2}\, G_{2k_2+2} = (2K+3)(2K+2)(2K+1)\, G_{2K+4} , \]
which is easily seen to be equivalent to the recursion of the proposition using
G_0 = −1 and G_2 = 0. □

For instance
\[ G_8 = \frac{3}{7}\, G_4^2 , \qquad G_{10} = \frac{5}{11}\, G_4 G_6 , \qquad G_{12} = \frac{18\, G_4^3 + 25\, G_6^2}{143} , \]
and more generally this implies that G2k is a polynomial in G4 and G6 with rational
coefficients which are independent of the lattice Λ .
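The recursion of Proposition 2.4 can be run mechanically with exact rational arithmetic. In the sketch below, a polynomial in G_4 and G_6 is stored as a dictionary mapping exponent pairs (i, j) to the coefficient of G_4^i G_6^j; this representation and the function names are illustrative assumptions.

```python
from fractions import Fraction

def poly_mul(p, r):
    """Multiply two polynomials in (G4, G6), stored as {(i, j): coeff} for G4^i G6^j."""
    out = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in r.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return out

def eisenstein_polys(kmax):
    """Return {2k: polynomial for G_{2k}} for 2 <= k <= kmax, via Proposition 2.4."""
    G = {4: {(1, 0): Fraction(1)}, 6: {(0, 1): Fraction(1)}}
    for k in range(4, kmax + 1):
        total = {}
        for j in range(2, k - 1):  # 2 <= j <= k - 2
            weight = (2 * j - 1) * (2 * (k - j) - 1)
            for key, c in poly_mul(G[2 * j], G[2 * (k - j)]).items():
                total[key] = total.get(key, 0) + weight * c
        denom = (k - 3) * (2 * k - 1) * (2 * k + 1)
        G[2 * k] = {key: Fraction(3) * c / denom for key, c in total.items()}
    return G

G = eisenstein_polys(6)
print(G[8], G[10], G[12])
```

The output reproduces the three identities displayed above, and larger values of `kmax` give the analogous expressions for higher weights.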
As another corollary, we note that if we choose ω2 = 1 and ω1 = iT with T tending
to +∞, then the definition G_{2k}(Λ) = ∑_{(m,n)∈Z²\{(0,0)}} (mω1 + nω2)^{−2k} implies that
G_{2k}(Λ) will tend to ∑_{n∈Z\{0}} n^{−2k} = 2ζ(2k), where ζ is the Riemann zeta function.
It follows that for all k ≥ 2, ζ(2k) is a polynomial in ζ(4) and ζ(6) with rational
coefficients. Of course this is a weak but nontrivial result, since we know that ζ(2k)
is a rational multiple of π^{2k}.
To finish this section on elliptic functions and make the transition to modular
forms, we write explicitly Λ = Λ(ω1, ω2) and by abuse of notation
G_{2k}(ω1, ω2) := G_{2k}(Λ(ω1, ω2)), and we consider the dependence of G_{2k} on ω1 and ω2.
We note two evident facts: first, G_{2k}(ω1, ω2) is homogeneous of degree −2k: for any
nonzero complex number λ we have G_{2k}(λω1, λω2) = λ^{−2k} G_{2k}(ω1, ω2). In particular,
G_{2k}(ω1, ω2) = ω2^{−2k} G_{2k}(ω1/ω2, 1). Second, a general Z-basis of Λ is given by
(ω1′, ω2′) = (aω1 + bω2, cω1 + dω2) with a, b, c, d integers such that ad − bc = ±1.
If we choose an oriented basis such that ℑ(ω1/ω2) > 0 we in fact have ad − bc = 1.
Thus, G_{2k}(aω1 + bω2, cω1 + dω2) = G_{2k}(ω1, ω2), and using homogeneity this
can be written
\[ (c\omega_1 + d\omega_2)^{-2k}\, G_{2k}\!\left( \frac{a\omega_1 + b\omega_2}{c\omega_1 + d\omega_2},\, 1 \right) = \omega_2^{-2k}\, G_{2k}\!\left( \frac{\omega_1}{\omega_2},\, 1 \right) . \]

Thus, if we set τ = ω1/ω2 and by an additional abuse of notation abbreviate
G_{2k}(τ, 1) to G_{2k}(τ), we have by definition
\[ G_{2k}(\tau) = \sum_{(m,n)\in\mathbb{Z}^2\setminus\{(0,0)\}} (m\tau + n)^{-2k} , \]
and we have shown the following modularity property:

Proposition 2.5. For any \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\) ∈ SL2(Z), the group of 2 × 2 integer matrices of
determinant 1, and any τ ∈ C with ℑ(τ) > 0 we have
\[ G_{2k}\!\left( \frac{a\tau + b}{c\tau + d} \right) = (c\tau + d)^{2k}\, G_{2k}(\tau) . \]

This will be our basic definition of (weak) modularity.

3 Modular Forms and Functions

3.1 Definitions

Let us introduce some notation:

• We denote by Γ the modular group SL2(Z). Note that properly speaking the
modular group should be the group of transformations τ ↦ (aτ + b)/(cτ + d), which
is isomorphic to the quotient of SL2(Z) by the equivalence relation saying that M
and −M are equivalent, but for this course we will stick to this definition. If
γ = \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\) we will of course write γ(τ) for (aτ + b)/(cτ + d).
• The Poincaré upper half-plane H is the set of complex numbers τ such that
ℑ(τ) > 0. Since for γ = \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\) ∈ Γ we have ℑ(γ(τ)) = ℑ(τ)/|cτ + d|², we see that
Γ is a group of transformations of H (more generally so is SL2(R), there is nothing
special about Z).
• The completed upper half-plane \(\overline{H}\) is by definition
\(\overline{H}\) = H ∪ P¹(Q) = H ∪ Q ∪ {i∞}. Note that this is not the closure in the
topological sense, since we do not include any real irrational numbers.
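The identity ℑ(γ(τ)) = ℑ(τ)/|cτ + d|² used in the second item is easy to confirm numerically. This is a small sketch; the sample matrices and the helper names are arbitrary choices made here.

```python
def mobius(gamma, tau):
    """Apply gamma = ((a, b), (c, d)) to tau as (a*tau + b) / (c*tau + d)."""
    (a, b), (c, d) = gamma
    return (a * tau + b) / (c * tau + d)

def imaginary_parts(gamma, tau):
    """Return (Im(gamma(tau)), Im(tau) / |c*tau + d|^2); the two should coincide."""
    (a, b), (c, d) = gamma
    return mobius(gamma, tau).imag, tau.imag / abs(c * tau + d) ** 2

# A few matrices of determinant 1 with integer entries
samples = [((1, 1), (0, 1)), ((0, -1), (1, 0)), ((2, 1), (5, 3)), ((7, -3), (-2, 1))]
tau = 0.3 + 0.7j
for gamma in samples:
    print(gamma, imaginary_parts(gamma, tau))
```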

Definition 3.1. Let k ∈ Z and let F be a function from H to C.
1. We will say that F is weakly modular of weight k for Γ if for all
γ = \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\) ∈ Γ and all τ ∈ H we have
\[ F(\gamma(\tau)) = (c\tau + d)^k\, F(\tau) . \]
2. We will say that F is a modular form if, in addition, F is holomorphic on H and
if |F(τ)| remains bounded as ℑ(τ) → ∞.
3. We will say that F is a modular cusp form if it is a modular form such that F(τ)
tends to 0 as ℑ(τ) → ∞.

We make a number of immediate but important remarks.

Remarks 3.2 1. The Eisenstein series G_{2k}(τ) are basic examples of modular forms
of weight 2k, which are not cusp forms since G_{2k}(τ) tends to 2ζ(2k) ≠ 0 when
ℑ(τ) → ∞.
2. With the present definition, it is clear that there are no nonzero modular forms
of odd weight k, since if k is odd we have (−cτ − d)^k = −(cτ + d)^k and
γ(τ) = (−γ)(τ). However, when considering modular forms defined on subgroups of Γ
there may be modular forms of odd weight, so we keep the above definition.
3. Applying modularity to γ = T = \(\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\) we see that F(τ + 1) = F(τ), hence F has
a Fourier series expansion, and if F is holomorphic, by the remark made above in
the section on Fourier series, we have an expansion F(τ) = ∑_{n∈Z} a(n) e^{2πinτ} with
a(n) = e^{2πny} ∫_0^1 F(x + iy) e^{−2πinx} dx for any y > 0. Thus, if |F(x + iy)| remains
bounded as y → ∞ it follows that as y → ∞ we have |a(n)| ≤ B e^{2πny} for a suitable
constant B, so we deduce that a(n) = 0 whenever n < 0, since then e^{2πny} → 0. Thus if
F is a modular form we have F(τ) = ∑_{n≥0} a(n) e^{2πinτ}, hence lim_{ℑ(τ)→∞} F(τ) = a(0),
so F is a cusp form if and only if a(0) = 0.

Definition 3.3. We will denote by M_k(Γ) the vector space of modular forms of
weight k on Γ (M for Modular of course), and by S_k(Γ) the subspace of cusp forms
(S for the German Spitzenform, meaning exactly cusp form).

Notation: for any matrix γ = \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\) with ad − bc > 0, we will define the weight k
slash operator F|_k γ by
\[ F|_k\gamma(\tau) = (ad - bc)^{k/2}\, (c\tau + d)^{-k}\, F(\gamma(\tau)) . \]
The reason for the factor (ad − bc)^{k/2} is that λγ has the same action on H as γ, so
this makes the formula homogeneous. For instance, F is weakly modular of weight
k if and only if F|_k γ = F for all γ ∈ Γ.
We will also use the universal modular form convention of writing q for e^{2πiτ}, so
that a Fourier expansion is of the type F(τ) = ∑_{n≥0} a(n) q^n. We use the additional
convention that if α is any complex number, q^α will mean e^{2πiτα}.

Exercise 3.4. Let F(τ) = ∑_{n≥0} a(n) q^n ∈ M_k(Γ), and let γ = \(\begin{pmatrix} A & B \\ C & D \end{pmatrix}\) be a matrix in
M_2^+(Z), i.e., A, B, C, and D are integers and Δ = det(γ) = AD − BC > 0. Set
g = gcd(A, C), let u and v be such that uA + vC = g, set b = uB + vD, and finally let
ζ_Δ = e^{2πi/Δ}. Prove the matrix identity
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A/g & -v \\ C/g & u \end{pmatrix} \begin{pmatrix} g & b \\ 0 & \Delta/g \end{pmatrix} , \]
and deduce that we have the more general Fourier expansion
\[ F|_k\gamma(\tau) = \frac{g^{k}}{\Delta^{k/2}} \sum_{n\ge 0} \zeta_\Delta^{nbg}\, a(n)\, q^{n g^2/\Delta} , \]
which is of course equal to F if Δ = 1, since then g = 1.
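The matrix identity in this exercise can be tested on examples before proving it. The sketch below assumes the factorization into a determinant-1 matrix with columns (A/g, C/g) and (−v, u) times the upper triangular matrix with rows (g, b) and (0, Δ/g); the extended Euclidean helper and all function names are hypothetical.

```python
def ext_gcd(x, y):
    """Return (g, u, v) with u*x + v*y == g == gcd(x, y) and g >= 0."""
    old_r, r = x, y
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        quo = old_r // r
        old_r, r = r, old_r - quo * r
        old_u, u = u, old_u - quo * u
        old_v, v = v, old_v - quo * v
    if old_r < 0:
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v

def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def factor(A, B, C, D):
    """Split ((A, B), (C, D)) with positive determinant into (unimodular, upper triangular)."""
    delta = A * D - B * C
    g, u, v = ext_gcd(A, C)
    b = u * B + v * D
    left = ((A // g, -v), (C // g, u))
    right = ((g, b), (0, delta // g))
    return left, right

left, right = factor(4, 1, 6, 5)
print(left, right, mat_mul(left, right))
```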

3.2 Basic Results

The first fundamental result in the theory of modular forms is that these spaces
are finite-dimensional. The proof uses exactly the same method that we have used to
prove the basic results on elliptic functions. We first note that there is a “fundamental
domain” (which replaces the fundamental parallelogram) for the action of Γ on H,
given by
\[ \mathfrak{F} = \{ \tau \in H,\ -1/2 \le \Re(\tau) < 1/2,\ |\tau| \ge 1 \} . \]
The proof that this is a fundamental domain, in other words that any τ ∈ H has
a unique image by Γ belonging to 𝔉, is not very difficult and will be omitted. We
then integrate F′(z)/F(z) along the boundary of 𝔉, and using modularity we obtain
the following result:
Theorem 3.5 Let F ∈ M_k(Γ) be a nonzero modular form. For any τ0 ∈ H, denote
by v_{τ0}(F) the valuation of F at τ0, i.e., the unique integer v such that F(τ)/(τ − τ0)^v
is holomorphic and nonzero at τ0, and if F(τ) = G(e^{2πiτ}), define v_{i∞}(F) = v_0(G)
(i.e., the number of first vanishing Fourier coefficients of F). We have the formula
\[ v_{i\infty}(F) + \sum_{\tau\in\mathfrak{F}} \frac{v_\tau(F)}{e_\tau} = \frac{k}{12} , \]
where e_i = 2, e_ρ = 3, and e_τ = 1 otherwise (ρ = e^{2πi/3}).


[Figure: the fundamental domain, with the real axis marked at −1/2 and 1/2]

Fig. 2 The fundamental domain, 𝔉, of Γ

This theorem has many important consequences but, as already noted, the most
important is that it implies that Mk (Γ ) is finite dimensional. First, it trivially implies
that k ≥ 0, i.e., there are no modular forms of negative weight. In addition it easily
implies the following:

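Corollary 3.6. Let k ≥ 0 be an even integer. We have
\[ \dim(M_k(\Gamma)) = \begin{cases} \lfloor k/12 \rfloor & \text{if } k \equiv 2 \pmod{12} , \\ \lfloor k/12 \rfloor + 1 & \text{if } k \not\equiv 2 \pmod{12} , \end{cases} \]
\[ \dim(S_k(\Gamma)) = \begin{cases} 0 & \text{if } k < 12 , \\ \lfloor k/12 \rfloor - 1 & \text{if } k \ge 12,\ k \equiv 2 \pmod{12} , \\ \lfloor k/12 \rfloor & \text{if } k \ge 12,\ k \not\equiv 2 \pmod{12} . \end{cases} \]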
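The dimension formulas of Corollary 3.6 translate directly into code. In this sketch the function names are hypothetical, and the value 0 for odd or negative weights is taken from the remarks in the text rather than from the corollary itself.

```python
def dim_M(k):
    """Dimension of M_k(Gamma) for the full modular group."""
    if k < 0 or k % 2 == 1:
        return 0  # no nonzero forms of negative or odd weight
    return k // 12 if k % 12 == 2 else k // 12 + 1

def dim_S(k):
    """Dimension of the cusp form space S_k(Gamma)."""
    if k < 12 or k % 2 == 1:
        return 0
    return k // 12 - 1 if k % 12 == 2 else k // 12

for k in range(0, 26, 2):
    print(k, dim_M(k), dim_S(k))
```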

Since the product of two modular forms is clearly a modular form (of weight the
sum of the two weights), it is clear that M_∗(Γ) = ⊕_k M_k(Γ) (and similarly S_∗(Γ))
is an algebra, whose structure is easily described:

Corollary 3.7. We have M_∗(Γ) = C[G_4, G_6], and S_∗(Γ) = Δ M_∗(Γ), where Δ is the
unique generator of the one-dimensional vector space S_12(Γ) whose Fourier
expansion begins with Δ = q + O(q²).

Thus, for instance, M_0(Γ) = C, M_2(Γ) = {0}, M_4(Γ) = CG_4, M_6(Γ) = CG_6,
M_8(Γ) = CG_8 = CG_4², M_10(Γ) = CG_10 = CG_4G_6,
\[ M_{12}(\Gamma) = \mathbb{C}G_{12} \oplus \mathbb{C}\Delta = \mathbb{C}G_4^3 \oplus \mathbb{C}G_6^2 . \]
In particular, we recover the fact proved differently that G_8 is a multiple of G_4²
(the exact multiple being obtained by computing the Fourier expansions), G_10 is a
multiple of G_4G_6, G_12 is a linear combination of G_4³ and G_6². Also, we see that Δ is
a linear combination of G_4³ and G_6² (we will see this more precisely below).
A basic result on the structure of the modular group Γ is the following:

Proposition 3.8. Set T = \(\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\), which acts on H by the unit translation τ ↦ τ + 1,
and S = \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\), which acts on H by the symmetry-inversion τ ↦ −1/τ. Then Γ is
generated by S and T, with relations generated by S² = −I and (ST)³ = −I (I the
identity matrix).

There are several (easy) proofs of this fundamental result, which we do not give.
Simply note that this proposition is essentially equivalent to the fact that the set 𝔉
described above is indeed a fundamental domain.
A consequence of this proposition is that to check whether some function F
has the modularity property, it is sufficient to check that F(τ + 1) = F(τ) and
F(−1/τ) = τ^k F(τ).
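The relations S² = −I and (ST)³ = −I can be confirmed by direct multiplication, taking T with rows (1, 1), (0, 1) and S with rows (0, −1), (1, 0) as in Proposition 3.8. The helper names below are illustrative only.

```python
def mat_mul(X, Y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def act(gamma, tau):
    """Action of gamma on the upper half-plane: (a*tau + b) / (c*tau + d)."""
    (a, b), (c, d) = gamma
    return (a * tau + b) / (c * tau + d)

T = ((1, 1), (0, 1))
S = ((0, -1), (1, 0))
NEG_I = ((-1, 0), (0, -1))

ST = mat_mul(S, T)
print(mat_mul(S, S) == NEG_I)                 # S^2 = -I
print(mat_mul(ST, mat_mul(ST, ST)) == NEG_I)  # (ST)^3 = -I

tau = 0.2 + 1.3j
print(act(T, tau), act(S, tau))               # tau + 1 and -1/tau
```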

Exercise 3.9. (Bol’s identity). Let F be any continuous function defined on the
upper half-plane H, and define I_0(F, a) = F and for any integer m ≥ 1 and a ∈ H
set:
\[ I_m(F,a)(\tau) = \int_a^{\tau} F(z)\, \frac{(\tau - z)^{m-1}}{(m-1)!}\, dz . \]
1. Show that I_m(F, a)′(τ) = I_{m−1}(F, a)(τ), so that I_m(F, a) is an mth antiderivative
of F.
2. Let γ ∈ Γ, and assume that k ≥ 1 is an integer. Show that
\[ I_{k-1}(F,a)|_{2-k}\gamma = I_{k-1}(F|_k\gamma,\ \gamma^{-1}(a)) . \]
3. Deduce that if we set F_a^∗ = I_{k−1}(F, a) then
\[ D^{(k-1)}(F_a^*|_{2-k}\gamma) = F|_k\gamma , \]
where D = (1/2πi) d/dτ = q d/dq is the basic differential operator that we will
use (see Section 3.10).
4. Assume now that F is weakly modular of weight k ≥ 1 and holomorphic on H
(in particular if F ∈ M_k(Γ), but |F| could be unbounded as ℑ(τ) → ∞). Show
that
\[ (F_a^*|_{2-k}\gamma)(\tau) = F_a^*(\tau) + P_{k-2}(\tau) , \]
where P_{k−2} is the polynomial of degree less than or equal to k − 2 given by
\[ P_{k-2}(X) = \int_{\gamma^{-1}(a)}^{a} F(z)\, \frac{(X - z)^{k-2}}{(k-2)!}\, dz . \]

What this exercise shows is that the (k − 1)st derivative of some function which
behaves modularly in weight 2 − k behaves modularly in weight k, and conversely
that the (k − 1)st antiderivative of some function which behaves modularly in weight
k behaves modularly in weight 2 − k up to addition of a polynomial of degree at most
k − 2. This duality between weights k and 2 − k is in fact a consequence of the
Riemann–Roch theorem.
Note also that this exercise is the beginning of the fundamental theories of
periods and of modular symbols.
Also, it is not difficult to generalize Bol’s identity. For instance, applied to the
Eisenstein series G_4 and using Proposition 3.13 below we obtain:

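Proposition 3.10. 1. Set
\[ F_4^*(\tau) = -\frac{\pi^3}{180} \left( \frac{\tau}{i} \right)^3 + \sum_{n\ge 1} \sigma_{-3}(n)\, q^n . \]
We have the functional equation
\[ \tau^2 F_4^*(-1/\tau) = F_4^*(\tau) + \frac{\zeta(3)}{2}\, (1 - \tau^2) - \frac{\pi^3}{36}\, \frac{\tau}{i} . \]
2. Equivalently, if we set
\[ F_4^{**}(\tau) = -\frac{\pi^3}{180} \left( \frac{\tau}{i} \right)^3 - \frac{\pi^3}{72} \left( \frac{\tau}{i} \right) + \frac{\zeta(3)}{2} + \sum_{n\ge 1} \sigma_{-3}(n)\, q^n \]
we have the functional equation
\[ F_4^{**}(-1/\tau) = \tau^{-2} F_4^{**}(\tau) . \]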

Note that the appearance of ζ (3) comes from the fact that, up to a multiplicative
constant, the L-function associated to G4 is equal to ζ (s)ζ (s − 3), whose value at
s = 3 is equal to −ζ (3)/2.

3.3 The Scalar Product

We begin with the following exercise:

Exercise 3.11. 1. Denote by dµ = dx dy/y² a measure on H, where as usual x and
y are the real and imaginary part of τ ∈ H. Show that this measure is invariant
under SL2(R).
2. Let f and g be in M_k(Γ). Show that the function F(τ) = f(τ) \(\overline{g(\tau)}\) y^k is invariant
under the modular group Γ.

It follows in particular from this exercise that if F(τ) is any integrable function
which is invariant by the modular group Γ, the integral ∫_{Γ\H} F(τ) dµ makes sense
if it converges. Since 𝔉 is a fundamental domain for the action of Γ on H, this can
also be written ∫_𝔉 F(τ) dµ. Thus it follows from the second part that we can define
\[ \langle f, g \rangle = \int_{\Gamma\backslash H} f(\tau)\, \overline{g(\tau)}\, y^k\, \frac{dx\, dy}{y^2} , \]
whenever this converges.


It is immediate to show that a necessary and sufficient condition for convergence
is that at least one of f and g be a cusp form, i.e., lies in Sk (Γ ). In particular it is
clear that this defines a scalar product on Sk (Γ ) called the Petersson scalar product.
In addition, any cusp form in Sk (Γ ) is orthogonal to Gk with respect to this scalar
product. It is instructive to give a sketch of the simple proof of this fact:

Proposition 3.12. If f ∈ S_k(Γ) we have ⟨G_k, f⟩ = 0.

Proof. Recall that G_k(τ) = ∑_{(m,n)∈Z²\{(0,0)}} (mτ + n)^{−k}. We split the sum according
to the GCD of m and n: we let d = gcd(m, n), so that m = dm1 and n = dn1 with
gcd(m1, n1) = 1. It follows that
\[ G_k(\tau) = 2 \sum_{d\ge 1} d^{-k}\, E_k(\tau) = 2\zeta(k)\, E_k(\tau) , \]
where E_k(τ) = (1/2) ∑_{gcd(m,n)=1} (mτ + n)^{−k}. We thus need to prove that ⟨E_k, f⟩ = 0.
On the other hand, denote by Γ∞ the group generated by T, i.e., translations
\(\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}\) for b ∈ Z. This acts by left multiplication on Γ, and it is immediate to check
that a system of representatives for this action is given by matrices \(\begin{pmatrix} u & v \\ m & n \end{pmatrix}\), where
gcd(m, n) = 1 and u and v are chosen arbitrarily (but only once for each pair (m, n))
such that un − vm = 1. It follows that we can write
\[ E_k(\tau) = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} (m\tau + n)^{-k} , \]
where it is understood that γ = \(\begin{pmatrix} u & v \\ m & n \end{pmatrix}\) (the factor 1/2 has disappeared since γ and −γ
have the same action on H).
Thus
\[ \langle E_k, f \rangle = \int_{\Gamma\backslash H} \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} (m\tau + n)^{-k}\, \overline{f(\tau)}\, y^k\, \frac{dx\, dy}{y^2} = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \int_{\Gamma\backslash H} (m\tau + n)^{-k}\, \overline{f(\tau)}\, y^k\, \frac{dx\, dy}{y^2} . \]
Now note that by modularity f(τ) = (mτ + n)^{−k} f(γ(τ)), and since
ℑ(γ(τ)) = ℑ(τ)/|mτ + n|² it follows that
\[ (m\tau + n)^{-k}\, \overline{f(\tau)}\, y^k = \overline{f(\gamma(\tau))}\, \Im(\gamma(\tau))^k . \]
Thus, since dµ = dx dy/y² is an invariant measure we have
\[ \langle E_k, f \rangle = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \int_{\Gamma\backslash H} \overline{f(\gamma(\tau))}\, \Im(\gamma(\tau))^k\, d\mu = \int_{\Gamma_\infty\backslash H} \overline{f(\tau)}\, y^k\, \frac{dx\, dy}{y^2} . \]
Since Γ∞ is simply the group of integer translations, a fundamental domain for
Γ∞\H is simply the vertical strip [0, 1] × [0, ∞[, so that
\[ \langle E_k, f \rangle = \int_0^{\infty} y^{k-2}\, dy \int_0^1 \overline{f(x + iy)}\, dx , \]
which trivially vanishes since the inner integral is simply the conjugate of the
constant term in the Fourier expansion of f, which is 0 since f ∈ S_k(Γ). □

The above procedure (replacing the complicated fundamental domain of Γ\H
by the trivial one of Γ∞\H) is very common in the theory of modular forms and is
called unfolding.

3.4 Fourier Expansions

The Fourier expansions of the Eisenstein series G2k (τ ) are easy to compute. The
result is the following:

Proposition 3.13. For k ≥ 4 even we have the Fourier expansion

\[ G_k(\tau) = 2\zeta(k) + 2\, \frac{(2\pi i)^k}{(k-1)!} \sum_{n \ge 1} \sigma_{k-1}(n) q^n , \]

where σk−1 (n) = ∑d|n, d>0 d k−1 .

Since we know that when k is even 2ζ (k) = −(2π i)k Bk /k!, where Bk is the k-th
Bernoulli number defined by
\[ \frac{t}{e^t - 1} = \sum_{k \ge 0} \frac{B_k}{k!}\, t^k , \]

it follows that Gk = 2ζ (k)Ek , with

\[ E_k(\tau) = 1 - \frac{2k}{B_k} \sum_{n \ge 1} \sigma_{k-1}(n) q^n . \]

This is the normalization of Eisenstein series that we will use. For instance
\[
\begin{aligned}
E_4(\tau) &= 1 + 240 \sum_{n \ge 1} \sigma_3(n) q^n , \\
E_6(\tau) &= 1 - 504 \sum_{n \ge 1} \sigma_5(n) q^n , \\
E_8(\tau) &= 1 + 480 \sum_{n \ge 1} \sigma_7(n) q^n .
\end{aligned}
\]
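The constants 240, −504 and 480 are just the values of $-2k/B_k$. As a quick sanity check, here is a minimal sketch in Python that computes Bernoulli numbers exactly from the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for $m \ge 1$; the function names are ours and purely illustrative.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] with t/(e^t - 1) = sum B_k t^k / k!."""
    B = [Fraction(0)] * (n_max + 1)
    B[0] = Fraction(1)
    for m in range(1, n_max + 1):
        # Solve sum_{j=0}^{m} C(m+1, j) B_j = 0 for B_m.
        B[m] = -sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1)
    return B

def eisenstein_constant(k):
    """The factor -2k/B_k in front of sum sigma_{k-1}(n) q^n in E_k."""
    return Fraction(-2 * k) / bernoulli_numbers(k)[k]
```

Calling `eisenstein_constant(k)` for k = 4, 6, 8 reproduces the three constants displayed above.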

In particular, the relations given above which follow from the dimension formula
become much simpler and are obtained simply by looking at the first terms in the
Fourier expansion:

\[ E_8 = E_4^2 , \quad E_{10} = E_4 E_6 , \quad E_{12} = \frac{441 E_4^3 + 250 E_6^2}{691} , \quad \Delta = \frac{E_4^3 - E_6^2}{1728} . \]
Note that the relation $E_4^2 = E_8$ (and the others) implies a highly nontrivial relation
between the sum of divisors functions: if we set by convention $\sigma_3(0) = 1/240$, so that
$E_4(\tau) = 240 \sum_{n \ge 0} \sigma_3(n) q^n$, we have

\[ E_8(\tau) = E_4^2(\tau) = 240^2 \sum_{n \ge 0} q^n \sum_{0 \le m \le n} \sigma_3(m) \sigma_3(n - m) , \]

so that by identification $\sigma_7(n) = 120 \sum_{0 \le m \le n} \sigma_3(m) \sigma_3(n - m)$ for $n \ge 1$, so

\[ \sigma_7(n) = \sigma_3(n) + 120 \sum_{1 \le m \le n-1} \sigma_3(m) \sigma_3(n - m) . \]

It is quite difficult (but not impossible) to prove this directly, i.e., without using at
least indirectly the theory of modular forms.
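The identity is easy to test numerically. The following Python sketch (naive divisor sums, helper names of our choosing) compares both sides for small $n$; it is only a check, not a proof.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def sigma7_via_convolution(n):
    """Right-hand side sigma_3(n) + 120 * sum_{1<=m<=n-1} sigma_3(m) sigma_3(n-m)."""
    conv = sum(sigma(3, m) * sigma(3, n - m) for m in range(1, n))
    return sigma(3, n) + 120 * conv
```

For instance, for $n = 2$ the right-hand side is $9 + 120 = 129 = 1 + 2^7$.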

Exercise 3.14. Find a similar relation for σ9 (n) using E10 = E4 E6 .

This type of reasoning is one of the reasons for which the theory of modular
forms is so important (and lots of fun!): if you have a modular form F, you can
usually express it in terms of a completely explicit basis of the space to which it be-
longs since spaces of modular forms are finite-dimensional (in the present example,
the space is one-dimensional), and deduce highly nontrivial relations for the Fourier
coefficients. We will see a further example of this below for the number rk (n) of
representations of an integer n as a sum of k squares.

Exercise 3.15. 1. Prove that for any k ∈ C we have the identity

\[ \sum_{n \ge 1} \sigma_k(n) q^n = \sum_{n \ge 1} \frac{n^k q^n}{1 - q^n} , \]

the right-hand side being called a Lambert series.


2. Set F(k) = ∑n≥1 nk /(e2π n − 1). Using the Fourier expansions given above, com-
pute explicitly F(5) and F(9).
3. Using Proposition 3.10, compute explicitly F(−3).

4. Using Proposition 3.23 below, compute explicitly F(1).

Note that in this exercise we only compute F(k) for k ≡ 1 (mod 4). It is also
possible but more difficult to compute F(k) for k ≡ 3 (mod 4). For instance we
have:
\[ F(3) = \frac{\Gamma(1/4)^8}{80 (2\pi)^6} - \frac{1}{240} . \]
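The stated value of $F(3)$ can be compared with the defining series in floating point. This is a small Python sketch; the truncation at 40 terms is an arbitrary choice of ours (the terms decay like $e^{-2\pi n}$, so far fewer would do).

```python
import math

def lambert_F(k, terms=40):
    """Partial sum of F(k) = sum_{n>=1} n^k / (e^{2 pi n} - 1)."""
    return sum(n ** k / math.expm1(2 * math.pi * n) for n in range(1, terms + 1))

def F3_closed_form():
    """Gamma(1/4)^8 / (80 (2 pi)^6) - 1/240, as stated in the text."""
    return math.gamma(0.25) ** 8 / (80 * (2 * math.pi) ** 6) - 1 / 240
```

Both quantities should agree to within double-precision rounding.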

3.5 Obtaining Modular Forms by Averaging

We have mentioned at the beginning of this course that one of the ways to obtain
functions satisfying functional equations is to use averaging over a suitable group or
set: we have seen this for periodic functions in the form of the Poisson summation
formula, and for doubly-periodic functions in the construction of the Weierstrass
℘-function. We can do the same for modular forms, but we must be careful in two
different ways. First, we do not want invariance by Γ , but we want an automorphy
factor (cτ + d)k . This is easily dealt with by noting that (d/d τ )(γ (τ )) = (cτ + d)−2 :
indeed, if φ is some function on H we can define

\[ F(\tau) = \sum_{\gamma \in \Gamma} \phi(\gamma(\tau))\, ((d/d\tau)(\gamma(\tau)))^{k/2} . \]


Exercise 3.16. Ignoring all convergence questions, by using the chain rule $(f \circ g)' =
(f' \circ g)\, g'$ show that for all $\delta = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \Gamma$ we have

\[ F(\delta(\tau)) = (C\tau + D)^k F(\tau) . \]

But the second important way in which we must be careful is that the above
construction rarely converges. There are, however, examples where it does converge:

Exercise 3.17. Let $\phi(\tau) = \tau^{-m}$, so that

\[ F(\tau) = \sum_{\gamma = \left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right) \in \Gamma} \frac{1}{(a\tau + b)^m (c\tau + d)^{k-m}} . \]

Show that if $2 \le m \le k - 2$ and $m \ne k/2$ this series converges normally on any compact
subset of $\mathcal{H}$ (i.e., it is majorized by a convergent series with positive terms),
so defines a modular form in $M_k(\Gamma)$.

Note that the series converges also for m = k/2, but this is more difficult.
One of the essential reasons for non-convergence of the function F is the trivial
observation that for a given pair of coprime integers (c, d) there are infinitely many
elements γ ∈ Γ having (c, d) as their second row. Thus in general it seems more
reasonable to define
\[ F(\tau) = \sum_{\gcd(c,d)=1} \phi(\gamma_{c,d}(\tau)) (c\tau + d)^{-k} , \]

where $\gamma_{c,d}$ is any fixed matrix in $\Gamma$ with second row equal to $(c, d)$. However, we
need this to make sense: if $\gamma_{c,d} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ is one such matrix, it is clear that the
general matrix having second row equal to $(c, d)$ is $T^n \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a + nc & b + nd \\ c & d \end{pmatrix}$, and as
usual $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is translation by 1: $\tau \mapsto \tau + 1$. Thus, an essential necessary condition
for our series to make any kind of sense is that the function $\phi$ be periodic of period 1.
The simplest such function is of course the constant function 1:
Exercise 3.18. (See the proof of Proposition 3.12.) Show that

\[ F(\tau) = \sum_{\gcd(c,d)=1} (c\tau + d)^{-k} = 2 E_k(\tau) , \]

where Ek is the normalized Eisenstein series defined above.


But by the theory of Fourier series, we know that periodic functions of period 1
are (infinite) linear combinations of the functions e2π inτ . This leads to the definition
of Poincaré series:

\[ P_k(n; \tau) = \frac{1}{2} \sum_{\gcd(c,d)=1} \frac{e^{2\pi i n \gamma_{c,d}(\tau)}}{(c\tau + d)^k} , \]

where we note that we can choose any matrix γc,d with bottom row (c, d) since the
function e2π inτ is 1-periodic, so that Pk (n; τ ) ∈ Mk (Γ ).
Exercise 3.19. Assume that k ≥ 4 is even.
1. Show that if n < 0 the series defining Pk diverges (wildly in fact).
2. Note that Pk (0; τ ) = Ek (τ ), so that limτ →i∞ Pk (0; τ ) = 1. Show that if n > 0 the
series converges normally and that we have limτ →i∞ Pk (n; τ ) = 0. Thus in fact
Pk (n; τ ) ∈ Sk (Γ ) if n > 0.
3. By using the same unfolding method as in Proposition 3.12, show that if f =
∑n≥0 a(n)qn ∈ Mk (Γ ) and n > 0 we have

\[ \langle P_k(n), f \rangle = \frac{(k - 2)!}{(4\pi n)^{k-1}}\, a(n) . \]

It is easy to show that in fact the Pk (n) generate Sk (Γ ). We can also compute
their Fourier expansions as we have done for Ek , but they involve Bessel functions
and Kloosterman sums.

3.6 The Ramanujan Delta Function

Recall that by definition ∆ is the generator of the 1-dimensional space S12 (Γ ) whose
Fourier coefficient of q1 is normalized to be equal to 1. By simple computation, we

find the first terms in the Fourier expansion of ∆ :

∆ (τ ) = q − 24q2 + 252q3 − 1472q4 + · · · ,

with no apparent formula for the coefficients. The nth coefficient is denoted τ (n)
(no confusion with τ ∈ H ), and called Ramanujan’s tau function, and ∆ itself is
called Ramanujan’s Delta function.
Of course, using ∆ = (E43 − E62 )/1728 and expanding the powers, one can give
a complicated but explicit formula for τ (n) in terms of the functions σ3 and σ5 , but
this is far from being the best way to compute them. In fact, the following exercise
already gives a much better method.

Exercise 3.20. Let D be the differential operator (1/(2π i))d/d τ = qd/dq.


1. Show that the function F = 4E4 D(E6 ) − 6E6 D(E4 ) is a modular form of weight
12, then by looking at its constant term show that it is a cusp form, and finally
compute the constant c such that F = c · ∆ .
2. Deduce the formula
\[ \tau(n) = \frac{n}{12} (5\sigma_3(n) + 7\sigma_5(n)) + 70 \sum_{1 \le m \le n-1} (2n - 5m) \sigma_3(m) \sigma_5(n - m) . \]

3. Deduce in particular the congruences τ (n) ≡ nσ5 (n) ≡ nσ1 (n) (mod 5) and
τ (n) ≡ nσ3 (n) (mod 7).

Although there are much faster methods, this is already a very reasonable way to
compute τ (n).
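To illustrate, the formula of Exercise 3.20 translates directly into a few lines of Python. This is a naive sketch (divisor sums are recomputed each time, and the helper names are ours), meant only to show that the formula reproduces the first coefficients of $\Delta$.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def tau(n):
    """Ramanujan tau via the sigma_3 / sigma_5 formula of Exercise 3.20."""
    head = n * (5 * sigma(3, n) + 7 * sigma(5, n))
    quotient, remainder = divmod(head, 12)
    assert remainder == 0  # the first term is an integer since tau(n) is
    tail = sum((2 * n - 5 * m) * sigma(3, m) * sigma(5, n - m) for m in range(1, n))
    return quotient + 70 * tail
```

For $n = 2$ this gives $46 - 70 = -24$, in agreement with the expansion of $\Delta$ above.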
The cusp form ∆ is one of the most important functions in the theory of modular
forms. Its first main property, which is not at all apparent from its definition, is that
it has a product expansion:
Theorem 3.21 We have
\[ \Delta(\tau) = q \prod_{n \ge 1} (1 - q^n)^{24} . \]

Proof. We are not going to give a complete proof, but sketch a method which is one
of the most natural to obtain the result.
We start backwards, from the product R(τ ) on the right-hand side. The logarithm
transforms products into sums, but in the case of functions f , the logarithmic deriva-
tive f ′ / f (more precisely D( f )/ f , where D = qd/dq) also does this, and it is also
more convenient. We have
\[ D(R)/R = 1 - 24 \sum_{n \ge 1} \frac{n q^n}{1 - q^n} = 1 - 24 \sum_{n \ge 1} \sigma_1(n) q^n \]

as is easily seen by expanding 1/(1 − qn) as a geometric series. This is exactly the
case k = 2 of the Eisenstein series Ek , which we have excluded from our discussion
for convergence reasons, so we come back to our series $G_{2k}$ (we will divide by the
normalizing factor $2\zeta(2) = \pi^2/3$ at the end), and introduce a convergence factor
due to Hecke, setting

\[ G_{2,s}(\tau) = \sum_{(m,n) \in \mathbb{Z}^2 \setminus \{(0,0)\}} (m\tau + n)^{-2} |m\tau + n|^{-2s} . \]

As above this converges for $\Re(s) > 0$ and satisfies

\[ G_{2,s}(\gamma(\tau)) = (c\tau + d)^2 |c\tau + d|^{2s} G_{2,s}(\tau) , \]

hence in particular is periodic of period 1. It is straightforward to compute its Fourier
expansion, which we will not do here, and the Fourier expansion shows that $G_{2,s}$
has an analytic continuation to the whole complex plane. In particular, the limit as
$s \to 0$ makes sense; if we denote it by $G_2^*(\tau)$, by continuity it will of course satisfy
$G_2^*(\gamma(\tau)) = (c\tau + d)^2 G_2^*(\tau)$, and the analytic continuation of the Fourier expansion
that has been computed gives

\[ G_2^*(\tau) = \frac{\pi^2}{3} \left( 1 - \frac{3}{\pi \Im(\tau)} - 24 \sum_{n \ge 1} \sigma_1(n) q^n \right) . \]

Note the essential fact that there is now a nonanalytic term 3/(π ℑ(τ )). We will of
course set the following definition:

Definition 3.22. We define

\[ E_2(\tau) = 1 - 24 \sum_{n \ge 1} \sigma_1(n) q^n \quad \text{and} \quad E_2^*(\tau) = E_2(\tau) - \frac{3}{\pi \Im(\tau)} . \]

Thus $E_2(\tau) = D(R)/R$, $G_2^*(\tau) = (\pi^2/3) E_2^*(\tau)$, and we have the following:



Proposition 3.23. For any $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ we have $E_2^*(\gamma(\tau)) = (c\tau + d)^2 E_2^*(\tau)$.
Equivalently,
\[ E_2(\gamma(\tau)) = (c\tau + d)^2 E_2(\tau) + \frac{12}{2\pi i}\, c (c\tau + d) . \]
Proof. The first result has been seen above, and the second follows from the formula
$\Im(\gamma(\tau)) = \Im(\tau)/|c\tau + d|^2$. ⊓⊔

Exercise 3.24. Show that

\[ E_2(\tau) = -24 \left( -\frac{1}{24} + \sum_{m \ge 1} \frac{m}{q^{-m} - 1} \right) . \]

Proof of the theorem. We can now prove the theorem on the product expansion
of ∆ : noting that (d/d τ )γ (τ ) = 1/(cτ + d)2 , the above formulas imply that if we set
S = R(γ (τ )) we have
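\[
\begin{aligned}
\frac{D(S)}{S} &= \frac{D(R)}{R}(\gamma(\tau))\, (d/d\tau)(\gamma(\tau)) \\
&= (c\tau + d)^{-2} E_2(\gamma(\tau)) = E_2(\tau) + \frac{12}{2\pi i}\, \frac{c}{c\tau + d} \\
&= \frac{D(R)}{R}(\tau) + 12\, \frac{D(c\tau + d)}{c\tau + d} .
\end{aligned}
\]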
By integrating and exponentiating, it follows that

\[ R(\gamma(\tau)) = (c\tau + d)^{12} R(\tau) , \]

and since clearly $R$ is holomorphic on $\mathcal{H}$ and tends to 0 as $\Im(\tau) \to \infty$ (i.e., as $q \to 0$),
it follows that $R$ is a cusp form of weight 12 on $\Gamma$, and since $S_{12}(\Gamma)$ is 1-dimensional
and the coefficient of $q^1$ in $R$ is 1, we have $R = \Delta$, proving the theorem. ⊓⊔

Exercise 3.25. We have shown in passing that D(∆ ) = E2 ∆ . Expanding the Fourier
expansion of both sides, show that we have the recursion

\[ (n - 1) \tau(n) = -24 \sum_{1 \le m \le n-1} \sigma_1(m) \tau(n - m) . \]
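Taking the recursion for granted, it already gives a simple way to tabulate $\tau(n)$. The Python sketch below (function name ours) sieves $\sigma_1$ and then applies the recursion starting from $\tau(1) = 1$; it uses $O(B^2)$ operations and is meant as an illustration only.

```python
def tau_table_by_recursion(B):
    """Return [0, tau(1), ..., tau(B)] using (n-1) tau(n) = -24 sum sigma_1(m) tau(n-m)."""
    sigma1 = [0] * (B + 1)
    for d in range(1, B + 1):
        for multiple in range(d, B + 1, d):
            sigma1[multiple] += d
    tau = [0] * (B + 1)
    tau[1] = 1
    for n in range(2, B + 1):
        s = sum(sigma1[m] * tau[n - m] for m in range(1, n))
        tau[n] = (-24 * s) // (n - 1)  # exact division by the recursion
    return tau
```

For $n = 3$ the sum is $\tau(2) + 3\tau(1) = -21$, so $2\tau(3) = 504$ and $\tau(3) = 252$.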

Exercise 3.26. 1. Let $F \in M_k(\Gamma)$, and for some squarefree integer $N$ set

\[ G(\tau) = \sum_{d \mid N} \mu(d)\, d^{k/2} F(d\tau) , \]

where $\mu$ is the Möbius function. Show that $G|_k W_N = \mu(N) G$, where $W_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$
is the so-called Fricke involution.
2. Show that if $N > 1$ the same result is true for $F = E_2$, although $E_2$ is only quasi-modular.
3. Deduce that if $\mu(N) = (-1)^{k/2 - 1}$ we have $G(i/\sqrt{N}) = 0$.
4. Applying this to $E_2$ and using Exercise 3.24, deduce that if $\mu(N) = 1$ and $N > 1$
we have
\[ \sum_{\gcd(m,N)=1} \frac{m}{e^{2\pi m/\sqrt{N}} - 1} = \frac{\phi(N)}{24} , \]
where $\phi(N)$ is Euler's totient function.
5. Using directly the functional equation of $E_2^*$, show that for $N = 1$ there is an
additional term $-1/(8\pi)$, i.e., that
\[ \sum_{m \ge 1} \frac{m}{e^{2\pi m} - 1} = \frac{1}{24} - \frac{1}{8\pi} . \]

3.7 Product Expansions and the Dedekind Eta Function

We continue our study of product expansions. We first mention an important identity
due to Jacobi, the triple product identity, as well as some consequences:
Theorem 3.27 (Triple product identity) If $|q| < 1$ and $u \ne 0$ we have

\[ \prod_{n \ge 1} (1 - q^n)(1 - q^n u) \prod_{n \ge 0} (1 - q^n/u) = \sum_{k \ge 0} (-1)^k (u^k - u^{-(k+1)}) q^{k(k+1)/2} . \]

Proof. (sketch): denote by $L(q, u)$ the left-hand side. We have clearly $L(q, u/q) =
-u L(q, u)$, and since one can write $L(q, u) = \sum_{k \in \mathbb{Z}} a_k(q) u^k$ this implies the recursion
$a_k(q) = -q^k a_{k-1}(q)$, so $a_k(q) = (-1)^k q^{k(k+1)/2} a_0(q)$, and separating $k \ge 0$ and $k < 0$
this shows that

\[ L(q, u) = a_0(q) \sum_{k \ge 0} (-1)^k (u^k - u^{-(k+1)}) q^{k(k+1)/2} . \]

The slightly longer part is to show that $a_0(q) = 1$: this is done by setting $u = i/q^{1/2}$
and $u = 1/q^{1/2}$, which after a little computation implies that $a_0(q^4) = a_0(q)$, and from
there it is immediate to deduce that $a_0(q)$ is a constant, and equal to 1. ⊓⊔

To give the next corollaries, we need to define the Dedekind eta function $\eta(\tau)$,
by
\[ \eta(\tau) = q^{1/24} \prod_{n \ge 1} (1 - q^n) \]
(recall that $q^\alpha = e^{2\pi i \alpha \tau}$). Thus by definition $\eta(\tau)^{24} = \Delta(\tau)$. Since $\Delta(-1/\tau) =
\tau^{12} \Delta(\tau)$, it follows that $\eta(-1/\tau) = c \cdot (\tau/i)^{1/2} \eta(\tau)$ for some 24th root of unity
$c$ (where we always use the principal determination of the square root), and since
we see from the infinite product that $\eta(i) \ne 0$, replacing $\tau$ by $i$ shows that in fact
$c = 1$. Thus $\eta$ satisfies the two basic modular equations

\[ \eta(\tau + 1) = e^{2\pi i/24} \eta(\tau) \quad \text{and} \quad \eta(-1/\tau) = (\tau/i)^{1/2} \eta(\tau) . \]

Of course we have more generally

\[ \eta(\gamma(\tau)) = v_\eta(\gamma) (c\tau + d)^{1/2} \eta(\tau) \]

for any $\gamma \in \Gamma$, with a complicated 24th root of unity $v_\eta(\gamma)$, so $\eta$ is in some
(reasonable) sense a modular form of weight 1/2, similar to the function $\theta$ that we
introduced at the very beginning.
The triple product identity immediately implies the following two identities:

Corollary 3.28. We have

\[ \eta(\tau) = q^{1/24} \left( 1 + \sum_{k \ge 1} (-1)^k \left( q^{k(3k-1)/2} + q^{k(3k+1)/2} \right) \right) \quad \text{and} \]

\[ \eta(\tau)^3 = q^{1/8} \sum_{k \ge 0} (-1)^k (2k + 1) q^{k(k+1)/2} . \]

Proof. In the triple product identity, replace $(u, q)$ by $(1/q, q^3)$: we obtain

\[ \prod_{n \ge 1} (1 - q^{3n})(1 - q^{3n-1}) \prod_{n \ge 0} (1 - q^{3n+1}) = \sum_{k \ge 0} (-1)^k (q^{-k} - q^{k+1}) q^{3k(k+1)/2} . \]

The left-hand side is clearly equal to $\prod_{n \ge 1} (1 - q^n) = q^{-1/24} \eta(\tau)$, and the right-hand side to

\[
\begin{aligned}
& 1 - q + \sum_{k \ge 1} (-1)^k \left( q^{k(3k+1)/2} - q^{(k+1)(3k+2)/2} \right) \\
&\quad = 1 + \sum_{k \ge 1} (-1)^k q^{k(3k+1)/2} - q + \sum_{k \ge 2} (-1)^k q^{k(3k-1)/2} ,
\end{aligned}
\]

giving the formula for $\eta(\tau)$. For the second formula, divide the triple product identity
by $1 - 1/u$ and make $u \to 1$. ⊓⊔

Thus the first few terms are:

\[ \prod_{n \ge 1} (1 - q^n) = 1 - q - q^2 + q^5 + q^7 - q^{12} - q^{15} + \cdots \]

\[ \prod_{n \ge 1} (1 - q^n)^3 = 1 - 3q + 5q^3 - 7q^6 + 9q^{10} - 11q^{15} + \cdots . \]

The first identity was proved by L. Euler.
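Both expansions can be checked by brute force. The Python sketch below (all names are ours) expands the truncated products term by term and compares them with the sparse series of Corollary 3.28, without the fractional powers $q^{1/24}$ and $q^{1/8}$.

```python
def product_one_minus_qn(power, N):
    """Coefficients of prod_{n>=1} (1 - q^n)^power up to q^N."""
    coeffs = [1] + [0] * N
    for n in range(1, N + 1):
        for _ in range(power):
            # Multiply in place by (1 - q^n), from high degree to low degree.
            for i in range(N, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    return coeffs

def pentagonal_series(N):
    """Coefficients of 1 + sum_{k>=1} (-1)^k (q^{k(3k-1)/2} + q^{k(3k+1)/2}) up to q^N."""
    out = [1] + [0] * N
    k = 1
    while k * (3 * k - 1) // 2 <= N:
        sign = -1 if k % 2 else 1
        out[k * (3 * k - 1) // 2] += sign
        if k * (3 * k + 1) // 2 <= N:
            out[k * (3 * k + 1) // 2] += sign
        k += 1
    return out

def jacobi_cube_series(N):
    """Coefficients of sum_{k>=0} (-1)^k (2k+1) q^{k(k+1)/2} up to q^N."""
    out = [0] * (N + 1)
    k = 0
    while k * (k + 1) // 2 <= N:
        out[k * (k + 1) // 2] += (-1) ** k * (2 * k + 1)
        k += 1
    return out
```

Comparing `product_one_minus_qn(1, N)` with `pentagonal_series(N)` and `product_one_minus_qn(3, N)` with `jacobi_cube_series(N)` confirms the two identities up to any chosen order $N$.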

Exercise 3.29. 1. Show that $24 \Delta D(\eta) = \eta D(\Delta)$, and using the explicit Fourier
expansion of $\eta$, deduce the recursion
\[ \sum_{k \in \mathbb{Z}} (-1)^k (75k^2 + 25k + 2 - 2n)\, \tau\!\left( n - \frac{k(3k+1)}{2} \right) = 0 . \]

2. Similarly, from $8 \Delta D(\eta^3) = \eta^3 D(\Delta)$ deduce the recursion
\[ \sum_{k \in \mathbb{Z}} (-1)^k (2k + 1)(9k^2 + 9k + 2 - 2n)\, \tau\!\left( n - \frac{k(k+1)}{2} \right) = 0 . \]

Exercise 3.30. Define the q-Pochhammer symbol $(q)_n$ by $(q)_n = (1 - q)(1 - q^2) \cdots (1 - q^n)$.
1. Set $f(a, q) = \prod_{n \ge 1} (1 - a q^n)$, and define coefficients $c_n(q)$ by setting $f(a, q) =
\sum_{n \ge 0} c_n(q) a^n$. Show that $f(a, q) = (1 - aq) f(aq, q)$, deduce that $c_n(q)(1 - q^n) =
-q^n c_{n-1}(q)$ and finally the identity

\[ \prod_{n \ge 1} (1 - a q^n) = \sum_{n \ge 0} (-1)^n a^n q^{n(n+1)/2} / (q)_n . \]

2. Write in terms of the Dedekind eta function the identities obtained by specializing
to $a = 1$, $a = -1$, $a = -1/q$, $a = q^{1/2}$, and $a = -q^{1/2}$.
3. Similarly, prove the identity

\[ 1 / \prod_{n \ge 1} (1 - a q^n) = \sum_{n \ge 0} a^n q^n / (q)_n , \]

and once again write in terms of the Dedekind eta function the identities obtained
by specializing to the same five values of $a$.
4. By multiplying two of the above identities and using the triple product identity,
prove the identity
\[ \frac{1}{\prod_{n \ge 1} (1 - q^n)} = \sum_{n \ge 0} \frac{q^{n^2}}{(q)_n^2} . \]

Note that this last series is the generating function of the partition function p(n),
so if one wants to make a table of p(n) up to n = 10000, say, using the left-hand
side would require 10000 terms, while using the right-hand side only requires 100.
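The last identity is easy to test on truncated power series. In the Python sketch below (names ours), `partitions` expands the left-hand side one factor $1/(1 - q^n)$ at a time, while `durfee_side` sums the terms $q^{n^2}/(q)_n^2$ of the right-hand side, of which only those with $n^2 \le N$ contribute up to $q^N$.

```python
def partitions(N):
    """Coefficients p(0..N) of 1 / prod_{n>=1} (1 - q^n)."""
    p = [1] + [0] * N
    for part in range(1, N + 1):
        # Dividing by (1 - q^part) is a running sum with step `part`.
        for i in range(part, N + 1):
            p[i] += p[i - part]
    return p

def durfee_side(N):
    """Coefficients up to q^N of sum_{n>=0} q^{n^2} / (q)_n^2."""
    total = [0] * (N + 1)
    inv = [1] + [0] * N  # series of 1/(q)_n^2, starting with n = 0
    n = 0
    while n * n <= N:
        if n > 0:
            for _ in range(2):  # divide by (1 - q^n) twice
                for i in range(n, N + 1):
                    inv[i] += inv[i - n]
        for i in range(N + 1 - n * n):
            total[i + n * n] += inv[i]
        n += 1
    return total
```

The outer loop of `durfee_side` runs about $\sqrt{N}$ times, which is the saving alluded to above.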

3.8 Computational Aspects of the Ramanujan τ Function

Since its introduction, the Ramanujan tau function $\tau(n)$ has fascinated number theorists.
For instance there is a conjecture due to D. H. Lehmer that $\tau(n) \ne 0$, and an
even stronger conjecture (which would imply the former) that for every prime $p$ we
have $p \nmid \tau(p)$ (on probabilistic grounds, the latter conjecture is probably false).
To test these conjectures as well as others, it is an interesting computational chal-
lenge to compute τ (n) for large n (because of Ramanujan’s first two conjectures, i.e.,
Mordell’s theorem that we will prove in Section 4 below, it is sufficient to compute
τ (p) for p prime).
We can have two distinct goals. The first is to compute a table of τ (n) for n ≤ B,
where B is some (large) bound. The second is to compute individual values of τ (n),
equivalently of τ (p) for p prime.
Consider first the construction of a table. The use of the first recursion given
in the above exercise needs O(n1/2 ) operations per value of τ (n), hence O(B3/2 )
operations in all to have a table for n ≤ B.
However, it is well known that the Fast Fourier Transform (FFT) allows one to
compute products of power series in essentially linear time. Thus, using Corollary
3.28, we can directly write the power series expansion of η 3 , and use the FFT to
compute its eighth power η 24 = ∆ . This will require O(B log(B)) operations, so is
much faster than the preceding method; it is essentially optimal since one needs
O(B) time simply to write the result.
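As an illustration of this strategy, here is a Python sketch that writes down the sparse expansion of $\eta^3$ from Corollary 3.28 and raises it to the eighth power by three successive squarings. We deliberately use naive quadratic-time multiplication in place of the FFT, so only the structure of the computation, not its speed, matches the method described above; the function names are ours.

```python
def mul_trunc(a, b, N):
    """Product of two power series, truncated after q^N (naive, not FFT)."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                out[i + j] += ai * b[j]
    return out

def tau_table_from_eta_cubed(B):
    """Return [0, tau(1), ..., tau(B)] from (eta^3)^8 = Delta."""
    N = B - 1  # Delta / q = (q^{-1/8} eta^3)^8 is needed up to q^(B-1)
    s = [0] * (N + 1)
    k = 0
    while k * (k + 1) // 2 <= N:
        s[k * (k + 1) // 2] = (-1) ** k * (2 * k + 1)
        k += 1
    for _ in range(3):  # square three times to get the eighth power
        s = mul_trunc(s, s, N)
    return [0] + s  # shift by q so that index n holds tau(n)
```

Replacing `mul_trunc` by an FFT-based multiplication would give the essentially linear running time mentioned in the text.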
Using large computer resources, especially in memory, it is reasonable to construct
a table up to $B = 10^{12}$, but not much more. Thus, the problem of computing
individual values of $\tau(p)$ is important. We have already seen one such method in
Exercise 3.20 above, which gives a method for computing $\tau(n)$ in time $O(n^{1+\varepsilon})$ for
any $\varepsilon > 0$.
A deep and important theorem of B. Edixhoven, J.-M. Couveignes, et al., says
that it is possible to compute τ (p) in time polynomial in log(p), and in particular in
time O(pε ) for any ε > 0. Unfortunately this algorithm is not at all practical, and at
least for now, completely useless for us. The only practical and important applica-
tion is for the computation of τ (p) modulo some small prime numbers ℓ (typically
ℓ < 50, so far from being sufficient to apply the Chinese Remainder Theorem).
However, there exists an algorithm which takes time O(n1/2+ε ) for any ε > 0, so
much better than the one of Exercise 3.20, and which is very practical. It is based
on the use of the Eichler–Selberg trace formula, together with the computation of
Hurwitz class numbers H(N) (essentially the class numbers of imaginary quadratic
orders counted with suitable multiplicity): if we set H3 (N) = H(4N) + 2H(N) (note
that H(4N) can be computed in terms of H(N)), then for p prime

\[ \tau(p) = 28p^6 - 28p^5 - 90p^4 - 35p^3 - 1 - 128 \sum_{1 \le t < p^{1/2}} t^6 (4t^4 - 9pt^2 + 7p^2) H_3(p - t^2) . \]

See [1] Exercise 12.13 of Chapter 12 for details. Using this formula and a cluster, it
should be reasonable to compute $\tau(p)$ for $p$ of the order of $10^{16}$.
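For small primes the formula can be tried out directly. The Python sketch below makes one assumption that is ours and not spelled out in the text: $H(N)$ is taken to be the usual Hurwitz class number, i.e., the number of reduced positive definite binary quadratic forms $ax^2 + bxy + cy^2$ of discriminant $-N$, where forms proportional to $x^2 + y^2$ count $1/2$ and forms proportional to $x^2 + xy + y^2$ count $1/3$. The class numbers are obtained by naive enumeration, so this says nothing about the $O(n^{1/2+\varepsilon})$ running time, and we only exercise it on small odd primes.

```python
from fractions import Fraction

def hurwitz(N):
    """Hurwitz class number H(N) for N >= 1 (assumed standard normalization)."""
    if N % 4 in (1, 2):
        return Fraction(0)  # -N is not a discriminant
    total = Fraction(0)
    b = N % 2  # b has the same parity as N
    while 3 * b * b <= N:  # reduced forms satisfy |b| <= a <= c
        ac = (b * b + N) // 4
        a = max(b, 1)
        while a * a <= ac:
            if ac % a == 0:
                c = ac // a
                if b == 0 and a == c:
                    total += Fraction(1, 2)  # multiple of x^2 + y^2
                elif a == b and a == c:
                    total += Fraction(1, 3)  # multiple of x^2 + xy + y^2
                elif b == 0 or a == b or a == c:
                    total += 1  # (a, b, c) and (a, -b, c) are equivalent
                else:
                    total += 2  # (a, b, c) and (a, -b, c) are distinct
            a += 1
        b += 2
    return total

def tau_prime(p):
    """tau(p) for an odd prime p via the displayed trace-formula expression."""
    total = Fraction(0)
    t = 1
    while t * t < p:
        m = p - t * t
        h3 = hurwitz(4 * m) + 2 * hurwitz(m)
        total += t ** 6 * (4 * t ** 4 - 9 * p * t * t + 7 * p * p) * h3
        t += 1
    return 28 * p ** 6 - 28 * p ** 5 - 90 * p ** 4 - 35 * p ** 3 - 1 - 128 * total
```

For $p = 3$ only $t = 1$ occurs, with $H_3(2) = H(8) + 2H(2) = 1$, and the formula gives $5372 - 128 \cdot 40 = 252$.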

3.9 Modular Functions and Complex Multiplication

Although the terminology is quite unfortunate, we cannot change it. By definition, a
modular function is a function $F$ from $\mathcal{H}$ to $\mathbb{C}$ which is weakly modular of weight 0
(so that $F(\gamma(\tau)) = F(\tau)$, in other words is invariant under $\Gamma$, or equivalently defines
a function from $\Gamma \backslash \mathcal{H}$ to $\mathbb{C}$), meromorphic, including at $\infty$. This last statement
requires some additional explanation, but in simple terms, this means that the Fourier
expansion of $F$ has only finitely many Fourier coefficients for negative powers of $q$:
$F(\tau) = \sum_{n \ge n_0} a(n) q^n$, for some (possibly negative) $n_0$.
A trivial way to obtain modular functions is simply to take the quotient of two
modular forms having the same weight. The most important is the j-function defined
by
\[ j(\tau) = \frac{E_4^3(\tau)}{\Delta(\tau)} , \]
whose Fourier expansion begins by
\[ j(\tau) = \frac{1}{q} + 744 + 196884q + 21493760q^2 + \cdots \]

Indeed, one can easily prove the following theorem:


Theorem 3.31 Let F be a meromorphic function on H . The following are equiva-
lent:
1. F is a modular function.
2. F is the quotient of two modular forms of equal weight.
3. F is a rational function of j.
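The first Fourier coefficients of $j$ quoted above can be recovered from the definition $j = E_4^3/\Delta$ using exact integer power series. The following Python sketch (helper names ours) expands $\Delta/q = \prod (1 - q^n)^{24}$, inverts it, and multiplies by $E_4^3$.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def mul(a, b):
    """Product of two power series, truncated to len(a) coefficients."""
    N = len(a)
    out = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                out[i + j] += ai * b[j]
    return out

def inverse(a):
    """Inverse of a power series with constant term 1, same truncation."""
    inv = [1] + [0] * (len(a) - 1)
    for n in range(1, len(a)):
        inv[n] = -sum(a[i] * inv[n - i] for i in range(1, n + 1))
    return inv

def j_coefficients(terms):
    """Return [c(-1), c(0), ..., c(terms - 2)] where j = sum c(n) q^n."""
    e4 = [1] + [240 * sigma(3, n) for n in range(1, terms)]
    delta_over_q = [1] + [0] * (terms - 1)
    for n in range(1, terms):
        for _ in range(24):  # multiply by (1 - q^n) twenty-four times
            for i in range(terms - 1, n - 1, -1):
                delta_over_q[i] -= delta_over_q[i - n]
    return mul(mul(mul(e4, e4), e4), inverse(delta_over_q))
```

The list returned by `j_coefficients(4)` holds the coefficients of $q^{-1}$, $q^0$, $q^1$ and $q^2$.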

Exercise 3.32. 1. Noting that Theorem 3.5 is valid more generally for modular
functions (with vτ ( f ) = −r < 0 if f has a pole of order r at τ ) and using the
specific properties of j(τ ), compute vτ ( f ) for the functions j(τ ), j(τ ) − 1728,
and D( j)(τ ), at the points ρ = e2π i/3 , i, i∞, and τ0 for τ0 distinct from these three
special points.
2. Set f = f (a, b, c) = D( j)a /( jb ( j − 1728)c ). Show that f is a modular form if and
only if 2c ≤ a, 3b ≤ 2a, and b + c ≥ a, and give similar conditions for f to be a
cusp form.
3. Show that $E_4 = f(2, 1, 1)$, $E_6 = -f(3, 2, 1)$, and $\Delta = f(6, 4, 3)$, so that for instance
$D(j) = -E_{14}/\Delta = -E_4^2 E_6/\Delta$.

An important theory linked to modular functions is the theory of complex multiplication,
which deserves a course in itself. We simply mention one of the basic
results.
We will say that a complex number $\tau \in \mathcal{H}$ is a CM point (CM for Complex
Multiplication) if it belongs to an imaginary quadratic field, or equivalently if there
exist integers $a$, $b$, and $c$ with $a \ne 0$ such that $a\tau^2 + b\tau + c = 0$. The first basic
theorem is the following:
Theorem 3.33 If τ is a CM point then j(τ ) is an algebraic integer.
Note that this theorem has two parts: the first and most important part is that j(τ )
is algebraic. This is in fact easy to prove. The second part is that it is an algebraic
integer, and this is more difficult. Since any modular function f is a rational function
of j, it follows that if this rational function has algebraic coefficients then f (τ ) will
be algebraic (but not necessarily integral). Another immediate consequence is the
following:

Corollary 3.34. Let $\tau$ be a CM point and define $\Omega_\tau = \eta(\tau)^2$, where $\eta$ is as usual
the Dedekind eta function. For any modular form $f$ of weight $k$ (in fact $f$ can also be
meromorphic) the number $f(\tau)/\Omega_\tau^k$ is algebraic. In fact $E_4(\tau)/\Omega_\tau^4$ and $E_6(\tau)/\Omega_\tau^6$
are always algebraic integers.

But the importance of this theorem lies in algebraic number theory. We give the
following theorem without explaining the necessary notions:
Theorem 3.35 Let $\tau$ be a CM point, and $D = b^2 - 4ac$ its discriminant, where we
choose $\gcd(a, b, c) = 1$. Then $\mathbb{Q}(j(\tau))$ is the ring class field of discriminant $D$, and in
particular if $D$ is the discriminant of a quadratic field $K = \mathbb{Q}(\sqrt{D})$, then $K(j(\tau))$ is
the Hilbert class field of $K$. In particular, the degree of the minimal polynomial of the
algebraic integer $j(\tau)$ is equal to the class number $h(D)$ of the order of discriminant
$D$.
Examples:

\[
\begin{aligned}
j((1 + i\sqrt{3})/2) &= 0 = 1728 - 3(24)^2 \\
j(i) &= 1728 = 12^3 = 1728 - 4(0)^2 \\
j((1 + i\sqrt{7})/2) &= -3375 = (-15)^3 = 1728 - 7(27)^2 \\
j(i\sqrt{2}) &= 8000 = 20^3 = 1728 + 8(28)^2 \\
j((1 + i\sqrt{11})/2) &= -32768 = (-32)^3 = 1728 - 11(56)^2 \\
j((1 + i\sqrt{163})/2) &= -262537412640768000 = (-640320)^3 \\
&= 1728 - 163(40133016)^2 \\
j(i\sqrt{3}) &= 54000 = 2(30)^3 = 1728 + 12(66)^2 \\
j(2i) &= 287496 = (66)^3 = 1728 + 8(189)^2 \\
j((1 + 3i\sqrt{3})/2) &= -12288000 = -3(160)^3 = 1728 - 3(2024)^2 \\
j((1 + i\sqrt{15})/2) &= \frac{-191025 - 85995\sqrt{5}}{2} \\
&= \frac{1 - \sqrt{5}}{2} \left( \frac{75 + 27\sqrt{5}}{2} \right)^3 = 1728 - 3 \left( \frac{273 + 105\sqrt{5}}{2} \right)^2
\end{aligned}
\]

Note that we give the results in the above form since it can be shown that the
functions j1/3 and ( j − 1728)1/2 also have interesting arithmetic properties.
The example with D = −163 is particularly spectacular:

Exercise 3.36. Using the above table, show that

\[ (e^{\pi\sqrt{163}} - 744)^{1/3} = 640320 - \varepsilon , \]

with $0 < \varepsilon < 10^{-24}$, and more precisely that $\varepsilon$ is approximately equal to $65628\, e^{-(5/3)\pi\sqrt{163}}$
(note that $65628 = 196884/3$).
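The size of $\varepsilon$ is far below double precision, so a numerical check needs multiprecision arithmetic. Here is a sketch using Python's standard `decimal` module; the series for $\pi$ is the recipe given in the `decimal` documentation, and the working precision of 80 digits is an arbitrary choice of ours. This only observes the phenomenon numerically and is of course not a solution of the exercise.

```python
from decimal import Decimal, getcontext, localcontext

def decimal_pi():
    """pi to the current precision (series from the decimal module documentation)."""
    getcontext().prec += 2  # extra digits for intermediate steps
    three = Decimal(3)
    lasts, t, s, n, na, d, da = 0, three, 3, 1, 0, 0, 24
    while s != lasts:
        lasts = s
        n, na = n + na, na + 8
        d, da = d + da, da + 32
        t = (t * n) / d
        s += t
    getcontext().prec -= 2
    return +s  # unary plus rounds to the current precision

def epsilon_163(digits=80):
    """Return (epsilon, 65628 * exp(-(5/3) pi sqrt(163))) as Decimals."""
    with localcontext() as ctx:
        ctx.prec = digits
        x = decimal_pi() * Decimal(163).sqrt()  # pi * sqrt(163)
        cube_root = ((x.exp() - 744).ln() / 3).exp()
        eps = 640320 - cube_root
        approx = 65628 * (-(Decimal(5) / 3) * x).exp()
        return eps, approx
```

The two returned values should agree to many significant digits, both being of size roughly $6 \cdot 10^{-25}$.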

Exercise 3.37. 1. Using once again the example of 163, compute heuristically a
few terms of the Fourier expansion of $j$ assuming that it is of the form $1/q +
\sum_{n \ge 0} c(n) q^n$ with $c(n)$ reasonably small integers using the following method. Set
$q = -e^{-\pi\sqrt{163}}$, and let $J = (-640320)^3$ be the exact value of $j((-1 + i\sqrt{163})/2)$.
By computing $J - 1/q$, one notices that the result is very close to 744, so we guess
that $c(0) = 744$. We then compute $(J - 1/q - c(0))/q$ and note that once again
the result is close to an integer, giving $c(1)$, and so on. Go as far as you can with
this method.
2. Do the same for 67 instead of 163. You will find the same Fourier coefficients
(but you can go less far).
3. On the other hand, do the same for 58, starting with $J$ equal to the integer close
to $e^{\pi\sqrt{58}}$. You will find a different Fourier expansion: it corresponds in fact to
another modular function, this time defined on a subgroup of $\Gamma$, called a Hauptmodul.
4. Try to find other rational numbers $D$ such that $e^{\pi\sqrt{D}}$ is close to an integer, and do
the same exercise for them (an example where $D$ is not integral is 89/3).

3.10 Derivatives of Modular Forms

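If we differentiate the modular equation $f((a\tau + b)/(c\tau + d)) = (c\tau + d)^k f(\tau)$ with
$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ using the operator $D = (1/(2\pi i))\, d/d\tau$ (which gives simpler formulas
than $d/d\tau$ since $D(q^n) = nq^n$), we easily obtain

\[ D(f)\left( \frac{a\tau + b}{c\tau + d} \right) = (c\tau + d)^{k+2} \left( D(f)(\tau) + \frac{k}{2\pi i}\, \frac{c}{c\tau + d}\, f(\tau) \right) . \]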

Thus the derivative of a weakly modular form of weight k looks like one of weight
k + 2, except that there is an extra term. This term vanishes if k = 0, so the derivative
of a modular function of weight 0 is indeed modular of weight 2 (we have seen above
the example of j(τ ) which satisfies D( j) = −E14 /∆ ).
If k > 0 and we really want a true weakly modular form of weight k + 2 there are
two ways to do this. The first one is called the Serre derivative:

Exercise 3.38. Using Proposition 3.23, show that if f is weakly modular of weight
k then D( f ) − (k/12)E2 f is weakly modular of weight k + 2. In particular, if f ∈
Mk (Γ ) then SDk ( f ) := D( f ) − (k/12)E2 f ∈ Mk+2 (Γ ).
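Since $M_6(\Gamma)$ and $M_8(\Gamma)$ are one-dimensional, the Serre derivatives of $E_4$ and $E_6$ must be multiples of $E_6$ and $E_8 = E_4^2$ respectively, and the constant terms $-4/12$ and $-6/12$ give the multiples $-1/3$ and $-1/2$. The Python sketch below (helper names ours) checks these two consequences on truncated $q$-expansions with exact rational arithmetic.

```python
from fractions import Fraction

def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def eisenstein(k, constant, terms):
    """First `terms` coefficients of 1 + constant * sum sigma_{k-1}(n) q^n."""
    return [1] + [constant * sigma(k - 1, n) for n in range(1, terms)]

def mul(a, b):
    """Product of two power series, truncated to len(a) coefficients."""
    N = len(a)
    out = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                out[i + j] += ai * b[j]
    return out

def serre_derivative(f, k):
    """Coefficients of D(f) - (k/12) E_2 f, where D = q d/dq."""
    e2 = eisenstein(2, -24, len(f))
    e2f = mul(e2, f)
    return [Fraction(n * f[n]) - Fraction(k, 12) * e2f[n] for n in range(len(f))]
```

With `E4 = eisenstein(4, 240, 12)` and `E6 = eisenstein(6, -504, 12)`, the lists `serre_derivative(E4, 4)` and `serre_derivative(E6, 6)` can be compared with $-E_6/3$ and $-E_4^2/2$.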

The second method is to set $D^*(f) := D(f) - (k/(4\pi \Im(\tau))) f$ since by Proposition
3.23 we have $D^*(f) = SD_k(f) + (k/12) E_2^* f$. This loses holomorphy, but is very
useful in certain contexts.
Note that if more than one modular form is involved, there are more ways to
make new modular forms using derivatives:

Exercise 3.39. 1. For $i = 1, 2$ let $f_i \in M_{k_i}(\Gamma)$. By considering the modular function
$f_1^{k_2}/f_2^{k_1}$ of weight 0, show that

\[ k_2 f_2 D(f_1) - k_1 f_1 D(f_2) \in S_{k_1+k_2+2}(\Gamma) . \]

Note that this generalizes Exercise 3.20.
2. Compute constants $a$, $b$, and $c$ (depending on $k_1$ and $k_2$ and not all 0) such that

\[ [f_1, f_2]_2 = a D^2(f_1) f_2 + b D(f_1) D(f_2) + c f_1 D^2(f_2) \in S_{k_1+k_2+4}(\Gamma) . \]

This gives the first two of the so-called Rankin–Cohen brackets.


As an application of derivatives of modular forms, we give a proof of a theorem
of Siegel. We begin by the following:

Lemma 3.40. Let $a$ and $b$ be nonnegative integers such that $4a + 6b = 12r + 2$. The
constant term of the Fourier expansion of $F_r(a, b) = E_4^a E_6^b / \Delta^r$ vanishes.

Proof. By assumption $F_r(a, b)$ is a meromorphic modular form of weight 2. Since
$D(\sum_{n \ge n_0} a(n) q^n) = \sum_{n \ge n_0} n a(n) q^n$, it is sufficient to find a modular function $G_r(a, b)$
of weight 0 such that $F_r(a, b) = D(G_r(a, b))$ (recall that the derivative of a modular
function of weight 0 is still modular). We prove this by an induction first on $r$, then
on $b$. Recall that by Exercise 3.32 we have $D(j) = -E_{14}/\Delta = -E_4^2 E_6/\Delta$, and since
$4a + 6b = 14$ has only the solution $(a, b) = (2, 1)$ the result is true for $r = 1$. Assume
it is true for $r - 1$. We now do a recursion on $b$, noting that since $2a + 3b = 6r + 1$,
$b$ is odd. Note that $D(j^r) = r j^{r-1} D(j) = -r E_4^{3r-1} E_6/\Delta^r$, so the constant term of
$F_r(a, 1)$ indeed vanishes. However, since $E_4^3 - E_6^2 = 1728\Delta$, if $a \ge 3$ we have

\[ F_r(a - 3, b + 2) = E_4^{a-3} E_6^b (E_4^3 - 1728\Delta)/\Delta^r = F_r(a, b) - 1728 F_{r-1}(a - 3, b) , \]

proving that the result is true for $r$ by induction on $b$ since we assumed it true for
$r - 1$. ⊓⊔
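The lemma is easy to test on examples. Writing $\Delta^r = q^r \prod (1 - q^n)^{24r}$, the constant term of $F_r(a, b)$ is the coefficient of $q^r$ in $E_4^a E_6^b \prod (1 - q^n)^{-24r}$, which the following Python sketch computes with exact integers (helper names ours).

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def mul(a, b):
    """Product of two power series, truncated to len(a) coefficients."""
    N = len(a)
    out = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                out[i + j] += ai * b[j]
    return out

def constant_term(a, b, r):
    """Coefficient of q^0 in E_4^a E_6^b / Delta^r."""
    terms = r + 1
    e4 = [1] + [240 * sigma(3, n) for n in range(1, terms)]
    e6 = [1] + [-504 * sigma(5, n) for n in range(1, terms)]
    # 1 / prod (1 - q^n)^24: each division by (1 - q^n) is a running sum.
    inv_prod = [1] + [0] * r
    for n in range(1, terms):
        for _ in range(24):
            for i in range(n, terms):
                inv_prod[i] += inv_prod[i - n]
    series = [1] + [0] * r
    for factor, count in ((e4, a), (e6, b), (inv_prod, r)):
        for _ in range(count):
            series = mul(series, factor)
    return series[r]
```

For $r = 2$ the admissible pairs are $(a, b) = (5, 1)$ and $(2, 3)$. By contrast, $(a, b, r) = (3, 0, 1)$ has weight 0 and gives the constant term 744 of $j$.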

We can now prove (part of) Siegel’s theorem:


Theorem 3.41 For $r = \dim(M_k(\Gamma))$ define coefficients $c_i^k$ by

\[ \frac{E_{12r-k+2}}{\Delta^r} = \sum_{i \ge -r} c_i^k q^i , \]

where by convention we set $E_0 = 1$. Then for any $f = \sum_{n \ge 0} a(n) q^n \in M_k(\Gamma)$ we have
the relation
\[ \sum_{0 \le n \le r} c_{-n}^k a(n) = 0 . \]

In addition we have $c_0^k \ne 0$, so that $a(0) = -\sum_{1 \le n \le r} (c_{-n}^k / c_0^k) a(n)$ is a linear
combination with rational coefficients of the $a(n)$ for $1 \le n \le r$.

Proof. First note that by Corollary 3.6 we have $r \ge (k - 2)/12$ (with equality only
if $k \equiv 2 \pmod{12}$), so the definition of the coefficients $c_i^k$ makes sense. Note also
that since the Fourier expansion of $E_{12r-k+2}$ begins with $1 + O(q)$ and that of
$\Delta^r$ by $q^r + O(q^{r+1})$, that of the quotient begins with $q^{-r} + O(q^{1-r})$ (in particular
$c_{-r}^k = 1$). The proof of the first part is now immediate: the modular form $f E_{12r-k+2}$
belongs to $M_{12r+2}(\Gamma)$, so by Corollary 3.7 is a linear combination of $E_4^a E_6^b$ with
$4a + 6b = 12r + 2$. It follows from the lemma that the constant term of $f E_{12r-k+2}/\Delta^r$
vanishes, and this constant term is equal to $\sum_{0 \le n \le r} c_{-n}^k a(n)$, proving the first part of
the theorem. The fact that $c_0^k \ne 0$ (which is of course essential) is a little more difficult
and will be omitted, see [1] Theorem 9.5.1. ⊓⊔

This theorem has (at least) two consequences. First, a theoretical one: if one
can construct a modular form whose constant term is some interesting quantity and
whose Fourier coefficients $a(n)$ are rational, this shows that the interesting quantity
is also rational. This is what allowed Siegel to show that the values at negative integers
of Dedekind zeta functions of totally real number fields are rational, see Section
7.2. Second, a practical one: it allows one to compute explicitly the constant coefficient
$a(0)$ in terms of the $a(n)$, giving interesting formulas, see again Section 7.2.

4 Hecke Operators: Ramanujan’s discoveries

We now come to one of the most amazing and important discoveries on modular
forms due to S. Ramanujan, which has led to the modern development of the subject.
Recall that we set

\[ \Delta(\tau) = q \prod_{m \ge 1} (1 - q^m)^{24} = \sum_{n \ge 1} \tau(n) q^n . \]

We have $\tau(2) = -24$, $\tau(3) = 252$, and $\tau(6) = -6048 = -24 \cdot 252$, so that $\tau(6) =
\tau(2)\tau(3)$. After some more experiments, Ramanujan conjectured that if $m$ and $n$ are
coprime we have $\tau(mn) = \tau(m)\tau(n)$. Thus, by decomposing an integer into products
of prime powers, assuming this conjecture, we are reduced to the study of $\tau(p^k)$ for
$p$ prime.
Ramanujan then noticed that $\tau(4) = -1472 = (-24)^2 - 2^{11} = \tau(2)^2 - 2^{11}$, and
again after some experiments he conjectured that $\tau(p^2) = \tau(p)^2 - p^{11}$, and more
generally that $\tau(p^{k+1}) = \tau(p)\tau(p^k) - p^{11}\tau(p^{k-1})$. Thus $u_k = \tau(p^k)$ satisfies a linear
recurrence relation
\[ u_{k+1} - \tau(p) u_k + p^{11} u_{k-1} = 0 , \]
and since $u_0 = 1$ the sequence is entirely determined by the value of $u_1 = \tau(p)$.
It is well-known that the behavior of a linear recurrent sequence is determined by
its characteristic polynomial. Here it is equal to $X^2 - \tau(p)X + p^{11}$, and the third of
Ramanujan's conjectures is that the discriminant $\tau(p)^2 - 4p^{11}$ of this equation is always
negative, or equivalently that $|\tau(p)| < 2p^{11/2}$.
Note that if $\alpha_p$ and $\beta_p$ are the roots of the characteristic polynomial (necessarily
distinct since we cannot have $|\tau(p)| = 2p^{11/2}$), then $\tau(p^k) = (\alpha_p^{k+1} - \beta_p^{k+1})/(\alpha_p -
\beta_p)$, and the last conjecture says that $\alpha_p$ and $\beta_p$ are complex conjugate, and in
particular of modulus equal to $p^{11/2}$.
These conjectures are all true. The first two (multiplicativity and recursion) were
proved by L. Mordell only one year after Ramanujan formulated them, and indeed
the proof is quite easy (in fact we will prove them below). The third conjecture
$|\tau(p)| < 2p^{11/2}$ is extremely hard, and was only proved by P. Deligne in the early 1970s using
the whole machinery developed by the school of A. Grothendieck to solve the Weil
conjectures.
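Ramanujan's experiments are easy to repeat today. The Python sketch below (all names ours) tabulates $\tau(n)$ from the product expansion and tests, in a small range, multiplicativity, the recursion on prime powers, and negativity of the discriminant $\tau(p)^2 - 4p^{11}$. Such a check is of course only numerical evidence.

```python
from math import gcd

def tau_table(B):
    """Return [0, tau(1), ..., tau(B)] from Delta = q prod (1 - q^n)^24."""
    coeffs = [1] + [0] * (B - 1)
    for n in range(1, B):
        for _ in range(24):
            for i in range(B - 1, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    return [0] + coeffs

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def check_conjectures(B):
    """Return (multiplicativity, recursion, discriminant bound) for indices up to B."""
    tau = tau_table(B)
    multiplicative = all(
        tau[m * n] == tau[m] * tau[n]
        for m in range(1, B + 1)
        for n in range(1, B // m + 1)
        if gcd(m, n) == 1
    )
    recursion = True
    bound = True
    for p in filter(is_prime, range(2, B + 1)):
        bound = bound and tau[p] ** 2 < 4 * p ** 11
        pk = p
        while pk * p <= B:
            # tau(p^(k+1)) = tau(p) tau(p^k) - p^11 tau(p^(k-1))
            recursion = recursion and tau[pk * p] == tau[p] * tau[pk] - p ** 11 * tau[pk // p]
            pk *= p
    return multiplicative, recursion, bound
```

All three flags come out true for a bound such as $B = 100$.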

The main idea of Mordell, which was generalized later by E. Hecke, is to intro-
duce certain linear operators (now called Hecke operators) on spaces of modular
forms, to prove that they satisfy the multiplicativity and recursion properties (this is
in general much easier than to prove this on numbers), and finally to use the fact that
S12 (Γ ) = C∆ is of dimension 1, so that necessarily ∆ is an eigenform of the Hecke
operators whose eigenvalues are exactly its Fourier coefficients.
Although there are more natural ways of introducing them, we will define
the Hecke operator $T(n)$ on $M_k(\Gamma)$ directly by its action on Fourier expansions:
$T(n)(\sum_{m\ge0} a(m)q^m) = \sum_{m\ge0} b(m)q^m$, where
\[ b(m) = \sum_{d \mid \gcd(m,n)} d^{k-1}\, a(mn/d^2) . \]
Note that we can consider this definition as purely formal: apart from the presence of
the integer $k$, it is totally unrelated to the possible fact that $\sum_{m\ge0} a(m)q^m \in M_k(\Gamma)$.
A simple but slightly tedious combinatorial argument shows that these operators
satisfy
\[ T(n)T(m) = \sum_{d \mid \gcd(n,m)} d^{k-1}\, T(nm/d^2) . \]
In particular if $m$ and $n$ are coprime we have $T(n)T(m) = T(nm)$ (multiplicativity),
and if $p$ is a prime and $j \ge 1$ we have $T(p^j)T(p) = T(p^{j+1}) + p^{k-1}T(p^{j-1})$ (recursion;
here $k$ is the weight and $j$ the exponent). This shows that these operators are indeed
good candidates for proving the first two of Ramanujan's conjectures.
We need to show the essential fact that they preserve $M_k(\Gamma)$ and $S_k(\Gamma)$ (the latter
will follow from the former since by the above definition $b(0) = \sum_{d\mid n} d^{k-1}a(0) =
a(0)\sigma_{k-1}(n) = 0$ if $a(0) = 0$). By recursion and multiplicativity, it is sufficient to
show this for $T(p)$ with $p$ prime. Now if $F(\tau) = \sum_{m\ge0} a(m)q^m$, then $T(p)(F)(\tau) =
\sum_{m\ge0} b(m)q^m$ with $b(m) = a(mp)$ if $p \nmid m$, and $b(m) = a(mp) + p^{k-1}a(m/p)$ if $p \mid m$.
On the other hand, let us compute $G(\tau) = \sum_{0\le j<p} F((\tau+j)/p)$. Replacing
directly in the Fourier expansion we have
\[ G(\tau) = \sum_{m\ge0} a(m)q^{m/p} \sum_{0\le j<p} e^{2\pi i m j/p} . \]
The inner sum is a complete geometric sum which vanishes unless $p \mid m$, in which
case it is equal to $p$. Thus, changing $m$ into $pm$ we have $G(\tau) = p\sum_{m\ge0} a(pm)q^m$.
On the other hand, we have trivially $\sum_{p\mid m} a(m/p)q^m = \sum_{m\ge0} a(m)q^{pm} = F(p\tau)$.
Replacing both of these formulas in the formula for $T(p)(F)$ we see that
\[ T(p)(F)(\tau) = p^{k-1}F(p\tau) + \frac{1}{p}\sum_{0\le j<p} F\Bigl(\frac{\tau+j}{p}\Bigr) . \]

Exercise 4.1. Show more generally that
\[ T(n)(F)(\tau) = \sum_{ad=n} a^{k-1}\,\frac{1}{d}\sum_{0\le b<d} F\Bigl(\frac{a\tau+b}{d}\Bigr) . \]

It is now easy to show that $T(p)F$ is modular: replace $\tau$ by $\gamma(\tau)$ in the above
formula and make a number of elementary manipulations to prove modularity. In
fact, since $\Gamma$ is generated by $\tau \mapsto \tau+1$ and $\tau \mapsto -1/\tau$, it is immediate to check
modularity for these two maps on the above formula.
As mentioned above, the proof of the first two Ramanujan conjectures is now
immediate: since $T(n)$ acts on the one-dimensional space $S_{12}(\Gamma)$ we must have
$T(n)(\Delta) = c\cdot\Delta$ for some constant $c$. Replacing in the definition of $T(n)$, we thus
have for all $m$ that $c\,\tau(m) = \sum_{d\mid\gcd(n,m)} d^{11}\tau(nm/d^2)$. Choosing $m = 1$ and using
$\tau(1) = 1$ shows that $c = \tau(n)$, so that
\[ \tau(n)\tau(m) = \sum_{d\mid\gcd(n,m)} d^{11}\,\tau(nm/d^2) , \]
which implies (and is equivalent to) the first two conjectures of Ramanujan.
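The formal definition of $T(n)$ on Fourier expansions translates directly into code. The sketch below (function names are illustrative) applies the formula $b(m) = \sum_{d\mid\gcd(m,n)} d^{k-1}a(mn/d^2)$ to the coefficients of $\Delta$, computed under the assumption $\Delta = q\prod_{n\ge1}(1-q^n)^{24}$, and checks that $T(n)\Delta = \tau(n)\Delta$ on the first few coefficients.

```python
from math import gcd


def delta_coeffs(n_max):
    """Return [0, tau(1), ..., tau(n_max)], read off from Delta = q * prod (1 - q^n)^24."""
    poly = [1] + [0] * (n_max - 1)
    for n in range(1, n_max):
        for _ in range(24):
            for i in range(n_max - 1, n - 1, -1):
                poly[i] -= poly[i - n]  # multiply in place by (1 - q^n)
    return [0] + poly


def hecke(a, n, k, m_max):
    """Coefficients b(0), ..., b(m_max) of T(n)F in weight k.

    `a` must contain the coefficients a(0), ..., a(n * m_max) of F.
    """
    b = []
    for m in range(m_max + 1):
        g = gcd(m, n)  # gcd(0, n) = n, which matches the formula for b(0)
        b.append(sum(d ** (k - 1) * a[m * n // (d * d)]
                     for d in range(1, g + 1) if g % d == 0))
    return b


def delta_is_eigenform(n, m_max=20):
    """Compare T(n)Delta with tau(n) * Delta on the coefficients of q^0, ..., q^m_max."""
    tau = delta_coeffs(n * m_max)
    return hecke(tau, n, 12, m_max) == [tau[n] * tau[m] for m in range(m_max + 1)]


print([delta_is_eigenform(n) for n in (2, 3, 4, 6)])
```

Note that `hecke` needs the input coefficients up to index $n \cdot m_{\max}$, since $b(m)$ involves $a(mn)$.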
Denote by Pk (n) the characteristic polynomial of the linear map T (n) on Sk (Γ ).
A strong form of the so-called Maeda’s conjecture states that for n > 1 the polyno-
mial Pk (n) is irreducible. This has been tested up to very large weights.

Exercise 4.2. The above proof shows that the Hecke operators also preserve the
space of modular functions, so by Theorem 3.31 the image of $j(\tau)$ will be a rational
function in $j$:
1. Show for instance that
\[ T(2)(j) = j^2/2 - 744j + 81000 \quad\text{and}\quad T(3)(j) = j^3/3 - 744j^2 + 356652j - 12288000 . \]
2. Set $J = j - 744$, i.e., $j$ with no term in $q^0$ in its Fourier expansion. Deduce that
\[ T(2)(J) = J^2/2 - 196884 \quad\text{and}\quad T(3)(J) = J^3/3 - 196884J - 21493760 , \]
and observe that the coefficients that we obtain are exactly the Fourier coefficients of $J$.
3. Prove that $T(n)(j)$ is a polynomial in $j$. Does the last observation generalize?

5 Euler Products, Functional Equations

5.1 Euler Products

The case of ∆ is quite special, in that the modular form space to which it naturally
belongs, S12 (Γ ), is only 1-dimensional. As can easily be seen from the dimension
formula, this occurs (for cusp forms) only for k = 12, 16, 18, 20, 22, and 26 (there
are no nonzero cusp forms in weight 14 and the space is of dimension 2 in weight
24), and thus the evident cusp forms $\Delta E_{k-12}$ for these values of $k$ (setting $E_0 = 1$)
are generators of the space $S_k(\Gamma)$, so are eigenforms of the Hecke operators and
share exactly the same properties as $\Delta$, with $p^{11}$ replaced by $p^{k-1}$.
When the dimension is greater than 1, we must work slightly more. From the for-
mulas given above it is clear that the T (n) form a commutative algebra of operators
on the finite dimensional vector space Sk (Γ ). In addition, we have seen above that
there is a natural scalar product on Sk (Γ ). One can show the not completely trivial
fact that T (n) is Hermitian for this scalar product, hence in particular is diagonaliz-
able. It follows by an easy and classical result of linear algebra that these operators
are simultaneously diagonalizable, i.e., there exists a basis $F_i$ of forms in $S_k(\Gamma)$ such
that $T(n)F_i = \lambda_i(n)F_i$ for all $n$ and $i$. Identifying Fourier coefficients as we have done
above for $\Delta$ shows that if $F_i = \sum_{n\ge1} a_i(n)q^n$ we have $a_i(n) = \lambda_i(n)a_i(1)$. This implies
first that $a_i(1) \ne 0$, otherwise $F_i$ would be identically zero, so that by dividing
by $a_i(1)$ we can always normalize the eigenforms so that $a_i(1) = 1$, and second, as
for $\Delta$, that $a_i(n) = \lambda_i(n)$, i.e., the eigenvalues are exactly the Fourier coefficients. In
addition, since the $T(n)$ are Hermitian, these eigenvalues are real for any embedding
into $\mathbb{C}$, hence are totally real, in other words their minimal polynomial has only real
roots. Finally, using Theorem 3.5, it is immediate to show that the field generated
by the $a_i(n)$ is finite-dimensional over $\mathbb{Q}$, i.e., is a number field.

Exercise 5.1. Consider the space $S = S_{24}(\Gamma)$, which is the smallest weight where
the dimension is greater than 1, here 2. By the structure theorem given above, it is
generated for instance by $\Delta^2$ and $\Delta E_4^3$. Compute the matrix of the operator $T(2)$
on this basis of $S$, diagonalize this matrix, so find the eigenfunctions of $T(2)$ on $S$
(the prime number 144169 should occur). Check that these eigenfunctions are also
eigenfunctions of $T(3)$.

Thus, let $F = \sum_{n\ge1} a(n)q^n$ be a normalized eigenfunction for all the Hecke operators
in $S_k(\Gamma)$ (for instance $F = \Delta$ with $k = 12$), and consider the Dirichlet series
\[ L(F,s) = \sum_{n\ge1} \frac{a(n)}{n^s} , \]
for the moment formally, although we will show below that it converges for
$\Re(s)$ sufficiently large. The multiplicativity property of the coefficients ($a(nm) =
a(n)a(m)$ if $\gcd(n,m) = 1$, coming from that of the $T(n)$) is equivalent to the fact
that we have an Euler product (a product over primes)
\[ L(F,s) = \prod_{p\in P} L_p(F,s) \quad\text{with}\quad L_p(F,s) = \sum_{j\ge0} \frac{a(p^j)}{p^{js}} , \]
where we will always denote by $P$ the set of prime numbers.


The additional recursion property $a(p^{j+1}) = a(p)a(p^j) - p^{k-1}a(p^{j-1})$ is equivalent
to the identity
\[ L_p(F,s) = \frac{1}{1 - a(p)p^{-s} + p^{k-1}p^{-2s}} \]
(multiply both sides by the denominator to check this). We have thus proved the
following theorem:
Theorem 5.2 Let $F = \sum_{n\ge1} a(n)q^n \in S_k(\Gamma)$ be a normalized eigenfunction of all Hecke
operators. We have an Euler product
\[ L(F,s) = \sum_{n\ge1} \frac{a(n)}{n^s} = \prod_{p\in P} \frac{1}{1 - a(p)p^{-s} + p^{k-1}p^{-2s}} . \]
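As a numerical illustration of the Euler product, the following sketch compares a partial sum of the Dirichlet series of $\Delta$ with a truncated product over primes at $s = 10$, well inside the region of absolute convergence. The truncation point 300 and the helper names are arbitrary choices made here, and $\tau(n)$ is again computed assuming $\Delta = q\prod_{n\ge1}(1-q^n)^{24}$.

```python
def delta_coeffs(n_max):
    """Return [0, tau(1), ..., tau(n_max)], read off from Delta = q * prod (1 - q^n)^24."""
    poly = [1] + [0] * (n_max - 1)
    for n in range(1, n_max):
        for _ in range(24):
            for i in range(n_max - 1, n - 1, -1):
                poly[i] -= poly[i - n]  # multiply in place by (1 - q^n)
    return [0] + poly


def primes_up_to(n):
    """All primes <= n by trial division (enough for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def dirichlet_sum(tau, s):
    """Partial sum of L(Delta, s) over all available coefficients."""
    return sum(tau[n] / n ** s for n in range(1, len(tau)))


def euler_product(tau, s, k=12):
    """Product of the local factors 1 / (1 - a(p) p^-s + p^(k-1-2s)) over available primes."""
    prod = 1.0
    for p in primes_up_to(len(tau) - 1):
        prod /= 1 - tau[p] * p ** (-s) + p ** (k - 1 - 2 * s)
    return prod


tau = delta_coeffs(300)
print(dirichlet_sum(tau, 10), euler_product(tau, 10))
```

The two truncations are not term-by-term identical (the product also contains contributions from smooth integers larger than 300), so agreement is only expected up to a small tail.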

Note that we have not really used the fact that $F$ is a cusp form: the above theorem
is still valid if $F = F_k$ is the normalized Eisenstein series
\[ F_k(\tau) = -\frac{B_k}{2k}E_k(\tau) = -\frac{B_k}{2k} + \sum_{n\ge1} \sigma_{k-1}(n)q^n , \]
which is easily seen to be a normalized eigenfunction for all Hecke operators. In
fact:
Exercise 5.3. Let $a \in \mathbb{C}$ be any complex number and let as usual $\sigma_a(n) = \sum_{d\mid n} d^a$.
1. Show that
\[ \sum_{n\ge1} \frac{\sigma_a(n)}{n^s} = \zeta(s-a)\zeta(s) = \prod_{p\in P} \frac{1}{1 - \sigma_a(p)p^{-s} + p^a p^{-2s}} , \]
with $\sigma_a(p) = p^a + 1$.
2. Show that
\[ \sigma_a(m)\sigma_a(n) = \sum_{d\mid\gcd(m,n)} d^a\,\sigma_a\Bigl(\frac{mn}{d^2}\Bigr) , \]
so that in particular $F_k$ is indeed a normalized eigenfunction for all Hecke operators.

5.2 Analytic Properties of L-Functions

Everything that we have done up to now is purely formal, i.e., we do not need to
assume convergence. However in the sequel we will need to prove some analytic
results, and for this we need to prove convergence for certain values of $s$. We begin
with the following easy bound, due to Hecke:

Proposition 5.4. Let $F = \sum_{n\ge1} a(n)q^n \in S_k(\Gamma)$ be a cusp form (not necessarily an
eigenform). There exists a constant $c > 0$ (depending on $F$) such that for all $n$ we
have $|a(n)| \le c\,n^{k/2}$.

Proof. The trick is to consider the function $g(\tau) = |F(\tau)\Im(\tau)^{k/2}|$: since we have
seen that $\Im(\gamma(\tau)) = \Im(\tau)/|c\tau+d|^2$, it follows that $g(\tau)$ is invariant under $\Gamma$. It
follows that $\sup_{\tau\in\mathcal{H}} g(\tau) = \sup_{\tau\in\mathcal{F}} g(\tau)$, where $\mathcal{F}$ is the fundamental domain used
above. Now because of the Fourier expansion and the fact that $F$ is a cusp form,
$|F(\tau)| = O(e^{-2\pi\Im(\tau)})$ as $\Im(\tau) \to \infty$, so $g(\tau)$ tends to 0 also. It immediately follows
that $g$ is bounded on $\mathcal{F}$, hence on $\mathcal{H}$, so that there exists a constant $c_1 > 0$ such that
$|F(\tau)| \le c_1\Im(\tau)^{-k/2}$ for all $\tau$.
We can now easily prove Hecke's bound: from the Fourier series section we know
that for any $y > 0$
\[ a(n) = e^{2\pi ny}\int_0^1 F(x+iy)e^{-2\pi inx}\,dx , \]
so that $|a(n)| \le c_1 e^{2\pi ny}y^{-k/2}$, and choosing $y = 1/n$ proves the proposition with
$c = e^{2\pi}c_1$. □

The following corollary is now clear:

Corollary 5.5. The L-function of a cusp form of weight k converges absolutely (and
uniformly on compact subsets) for ℜ(s) > k/2 + 1.

Remark 5.6. Deligne's deep result mentioned above on the third Ramanujan conjecture
implies that we have the following optimal bound: there exists $c > 0$ such that
$|a(n)| \le c\,\sigma_0(n)n^{(k-1)/2}$, and in particular $|a(n)| = O(n^{(k-1)/2+\varepsilon})$ for all $\varepsilon > 0$. This
implies that the L-function of a cusp form converges absolutely and uniformly on
compact subsets in fact also for $\Re(s) > (k+1)/2$.
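For $F = \Delta$, Deligne's bound at a prime $p$ reads $|\tau(p)| < 2p^{11/2}$, which can be tested in exact integer arithmetic as $\tau(p)^2 < 4p^{11}$. The sketch below does this for the primes up to an arbitrarily chosen limit; the helper names are illustrative and $\tau$ is computed assuming $\Delta = q\prod_{n\ge1}(1-q^n)^{24}$.

```python
def delta_coeffs(n_max):
    """Return [0, tau(1), ..., tau(n_max)], read off from Delta = q * prod (1 - q^n)^24."""
    poly = [1] + [0] * (n_max - 1)
    for n in range(1, n_max):
        for _ in range(24):
            for i in range(n_max - 1, n - 1, -1):
                poly[i] -= poly[i - n]  # multiply in place by (1 - q^n)
    return [0] + poly


def is_prime(p):
    """Trial-division primality test, sufficient for small p."""
    return p >= 2 and all(p % q for q in range(2, int(p ** 0.5) + 1))


def satisfies_deligne(limit=150):
    """True if tau(p)^2 < 4 p^11, i.e. |tau(p)| < 2 p^(11/2), for every prime p <= limit."""
    tau = delta_coeffs(limit)
    return all(tau[p] ** 2 < 4 * p ** 11
               for p in range(2, limit + 1) if is_prime(p))


print(satisfies_deligne())
```

Squaring both sides avoids floating-point comparisons with the irrational number $p^{11/2}$.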

Exercise 5.7. Define for all $s \in \mathbb{C}$ the function $\sigma_s(n)$ by $\sigma_s(n) = \sum_{d\mid n} d^s$ if $n \in \mathbb{Z}_{>0}$,
$\sigma_s(0) = \zeta(-s)/2$ (and $\sigma_s(n) = 0$ otherwise). Set
\[ S(s_1,s_2;n) = \sum_{0\le m\le n} \sigma_{s_1}(m)\,\sigma_{s_2}(n-m) . \]
1. Compute $S(s_1,s_2;n)$ exactly in terms of $\sigma_{s_1+s_2+1}(n)$ for $(s_1,s_2) = (3,3)$ and
$(3,5)$, and also for $(s_1,s_2) = (1,1)$, $(1,3)$, $(1,5)$, and $(1,7)$ by using properties
of the function $E_2$.
2. Using Hecke's bound for cusp forms, show that if $s_1$ and $s_2$ are odd positive
integers the ratio $S(s_1,s_2;n)/\sigma_{s_1+s_2+1}(n)$ tends to a limit $L(s_1,s_2)$ as $n \to \infty$, and
compute this limit in terms of Bernoulli numbers. In addition, give an estimate
for the error term $|S(s_1,s_2;n)/\sigma_{s_1+s_2+1}(n) - L(s_1,s_2)|$.
3. Using the values of the Riemann zeta function at even positive integers in terms
of Bernoulli numbers, show that if $s_1$ and $s_2$ are odd positive integers we have
\[ L(s_1,s_2) = \frac{\zeta(s_1+1)\,\zeta(s_2+1)}{(s_1+s_2+1)\binom{s_1+s_2}{s_1}\,\zeta(s_1+s_2+2)} . \]
4. (A little project.) Define $L(s_1,s_2)$ by the above formula for all $s_1$, $s_2$ in $\mathbb{C}$ for
which it makes sense, interpreting $\binom{s_1+s_2}{s_1}$ as $\Gamma(s_1+s_2+1)/(\Gamma(s_1+1)\Gamma(s_2+1))$.
Check on a computer whether it still seems to be true that
\[ S(s_1,s_2;n)/\sigma_{s_1+s_2+1}(n) \to L(s_1,s_2) . \]
Try to prove it for $s_1 = s_2 = 2$, and then for general $s_1$, $s_2$. If you succeed, give
also an estimate for the error term analogous to the one obtained above.

We now do some (elementary) analysis.

Proposition 5.8. Let $F \in S_k(\Gamma)$. For $\Re(s) > k/2 + 1$ we have
\[ (2\pi)^{-s}\Gamma(s)L(F,s) = \int_0^\infty F(it)\,t^{s-1}\,dt . \]

Proof. Using $\Gamma(s) = \int_0^\infty e^{-t}t^{s-1}\,dt$, this is trivial by uniform convergence, which
ensures that we can integrate term by term. □

Corollary 5.9. The function $L(F,s)$ is a holomorphic function which can be analytically
continued to the whole of $\mathbb{C}$. In addition, if we set $\Lambda(F,s) = (2\pi)^{-s}\Gamma(s)L(F,s)$
we have the functional equation $\Lambda(F,k-s) = i^{-k}\Lambda(F,s)$.

Note that in our case $k$ is even, so that $i^{-k} = (-1)^{k/2}$, but we prefer writing the
constant as above so as to be able to use a similar result in odd weight, which occurs
in more general situations.

Proof. Indeed, splitting the integral at 1, changing $t$ into $1/t$ in one of the integrals,
and using modularity shows immediately that
\[ (2\pi)^{-s}\Gamma(s)L(F,s) = \int_1^\infty F(it)\bigl(t^{s-1} + i^k t^{k-1-s}\bigr)\,dt . \]
Since the integral converges absolutely and uniformly for all $s$ (recall that $F(it)$
tends exponentially fast to 0 when $t \to \infty$), this immediately implies the corollary. □

As an aside, note that the integral formula used in the above proof is a very
efficient numerical method to compute L(F, s), since the series obtained on the right
by term by term integration is exponentially convergent. For instance:

Exercise 5.10. Let $F(\tau) = \sum_{n\ge1} a(n)q^n$ be the Fourier expansion of a cusp form of
weight $k$ on $\Gamma$. Using the above formula, show that the value of $L(F,k/2)$ at the
center of the “critical strip” $0 \le \Re(s) \le k$ is given by the following exponentially
convergent series
\[ L(F,k/2) = \bigl(1 + (-1)^{k/2}\bigr)\sum_{n\ge1} \frac{a(n)}{n^{k/2}}\,e^{-2\pi n}P_{k/2}(2\pi n) , \]
where $P_{k/2}(X)$ is the polynomial
\[ P_{k/2}(X) = \sum_{0\le j<k/2} X^j/j! = 1 + X/1! + X^2/2! + \cdots + X^{k/2-1}/(k/2-1)! . \]
Note in particular that if $k \equiv 2 \pmod 4$ we have $L(F,k/2) = 0$. Prove this directly.

Exercise 5.11. 1. Prove that if $F$ is not necessarily a cusp form we have $|a(n)| \le
c\,n^{k-1}$ for some $c > 0$.
2. Generalize the proposition and the integral formulas so that they are also valid
for non-cusp forms; you will have to add polar parts of the type $1/s$ and $1/(s-k)$.
3. Show that $L(F,s)$ still extends to the whole of $\mathbb{C}$ with functional equation, but
that it has a pole, simple, at $s = k$, and compute its residue. In passing, show that
$L(F,0) = -a(0)$.

5.3 Special Values of L-Functions

A general “paradigm” on L-functions, essentially due to P. Deligne, is that if some
“natural” L-function has both an Euler product and functional equations similar to
the above, then for suitable integral “special points” the value of the L-function
should be a certain (a priori transcendental) number $\omega$ times an algebraic number.
In the case of modular forms, this is a theorem of Yu. Manin:
Theorem 5.12 Let $F$ be a normalized eigenform in $S_k(\Gamma)$, and denote by $K$ the
number field generated by its Fourier coefficients. There exist two nonzero complex
numbers $\omega_+$ and $\omega_-$ such that for $1 \le j \le k-1$ integral we have
\[ \Lambda(F,j)/\omega_{(-1)^j} \in K , \]
where we recall that $\Lambda(F,s) = (2\pi)^{-s}\Gamma(s)L(F,s)$.
In addition, $\omega_\pm$ can be chosen such that $\omega_+\omega_- = \langle F,F\rangle$.
In other words, for $j$ odd we have $\Lambda(F,j)/\omega_- \in K$ while for $j$ even we have
$\Lambda(F,j)/\omega_+ \in K$.
For instance, in the case $F = \Delta$, if we choose $\omega_- = \Lambda(F,3)$ and $\omega_+ = \Lambda(F,2)$,
we have
\[ (\Lambda(F,j))_{1\le j\le 11,\ j\ \mathrm{odd}} = (1620/691,\ 1,\ 9/14,\ 9/14,\ 1,\ 1620/691)\,\omega_- , \]
\[ (\Lambda(F,j))_{1\le j\le 11,\ j\ \mathrm{even}} = (1,\ 25/48,\ 5/12,\ 25/48,\ 1)\,\omega_+ , \]
and $\omega_+\omega_- = (8192/225)\langle F,F\rangle$.
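These rational ratios can be observed numerically. The sketch below evaluates $\Lambda(\Delta,s)$ at integers $1 \le s \le 11$ by integrating term by term the formula $\Lambda(F,s) = \int_1^\infty F(it)(t^{s-1} + i^k t^{k-1-s})\,dt$ from the proof of Corollary 5.9 (with $k = 12$, so $i^k = 1$). It uses the closed form $\Gamma(m,x) = (m-1)!\,e^{-x}\sum_{j<m} x^j/j!$ of the incomplete gamma function for integer $m \ge 1$. The function names, the number of terms, and the use of floating point are choices made here, not part of the text.

```python
from math import exp, factorial, pi


def delta_coeffs(n_max):
    """Return [0, tau(1), ..., tau(n_max)], read off from Delta = q * prod (1 - q^n)^24."""
    poly = [1] + [0] * (n_max - 1)
    for n in range(1, n_max):
        for _ in range(24):
            for i in range(n_max - 1, n - 1, -1):
                poly[i] -= poly[i - n]  # multiply in place by (1 - q^n)
    return [0] + poly


def g(m, x):
    """x^(-m) * Gamma(m, x) for an integer m >= 1, i.e. the integral of e^(-x t) t^(m-1) over t >= 1."""
    return factorial(m - 1) * exp(-x) * sum(x ** (j - m) / factorial(j) for j in range(m))


def completed_L_delta(s, terms=30):
    """Lambda(Delta, s) for an integer 1 <= s <= 11, from the exponentially convergent series."""
    tau = delta_coeffs(terms)
    return sum(tau[n] * (g(s, 2 * pi * n) + g(12 - s, 2 * pi * n))
               for n in range(1, terms + 1))


odd = [completed_L_delta(j) / completed_L_delta(3) for j in (1, 3, 5, 7, 9, 11)]
even = [completed_L_delta(j) / completed_L_delta(2) for j in (2, 4, 6, 8, 10)]
print(odd)   # to be compared with 1620/691, 1, 9/14, 9/14, 1, 1620/691
print(even)  # to be compared with 1, 25/48, 5/12, 25/48, 1
```

Because the terms decay like $e^{-2\pi n}$, a few dozen coefficients already give the ratios to nearly full double precision.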



Exercise 5.13. (see also Exercise 3.9). For $F \in S_k(\Gamma)$ define the period polynomial
$P(F;X)$ by
\[ P(F;X) = \int_0^{i\infty} (X-\tau)^{k-2}F(\tau)\,d\tau . \]
1. For $\gamma \in \Gamma$ show that
\[ P(F;X)|_{2-k}\gamma = \int_{\gamma^{-1}(0)}^{\gamma^{-1}(i\infty)} (X-\tau)^{k-2}F(\tau)\,d\tau . \]
2. Show that $P(F;X)$ satisfies
\[ P(F;X)|_{2-k}S + P(F;X) = 0 \quad\text{and}\quad P(F;X)|_{2-k}(ST)^2 + P(F;X)|_{2-k}(ST) + P(F;X) = 0 . \]
3. Show that
\[ P(F;X) = -\sum_{j=0}^{k-2} (-i)^{k-1-j}\binom{k-2}{j}\Lambda(F,k-1-j)\,X^j . \]
4. If $F = \Delta$, using Manin's theorem above show that up to the multiplicative constant
$\omega_+$, $\Re(P(F;X))$ factors completely in $\mathbb{Q}[X]$ as a product of linear polynomials,
and show a similar result for $\Im(P(F;X))$ after omitting the extreme terms
involving 691.

5.4 Nonanalytic Eisenstein Series and Rankin–Selberg

If we replace the expression (cτ + d)k by |cτ + d|2s for some complex number s, we
can also obtain functions which are invariant by Γ , although they are nonanalytic.
More precisely:

Definition 5.14. Write as usual $y = \Im(\tau)$. For $\Re(s) > 1$ we define
\[ G(s)(\tau) = \sum_{(c,d)\in\mathbb{Z}^2\setminus\{(0,0)\}} \frac{y^s}{|c\tau+d|^{2s}} \quad\text{and} \]
\[ E(s)(\tau) = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \Im(\gamma(\tau))^s = \frac{1}{2}\sum_{\gcd(c,d)=1} \frac{y^s}{|c\tau+d|^{2s}} . \]

This is again an averaging procedure, and it follows that $G(s)$ and $E(s)$ are invariant
under $\Gamma$. In addition, as in the case of the holomorphic Eisenstein series $G_k$
and $E_k$, factoring out $\gcd(c,d)$ shows that, with the normalizations written here,
$G(s) = 2\zeta(2s)E(s)$. One can also easily compute their Fourier expansion, and the
result is as follows:

Proposition 5.15. Set $\Lambda(s) = \pi^{-s/2}\Gamma(s/2)\zeta(s)$. We have the Fourier expansion
\[ \Lambda(2s)E(s) = \Lambda(2s)y^s + \Lambda(2-2s)y^{1-s} + 4y^{1/2}\sum_{n\ge1} \frac{\sigma_{2s-1}(n)}{n^{s-1/2}}\,K_{s-1/2}(2\pi ny)\cos(2\pi nx) . \]

In the above, $K_\nu(x)$ is a K-Bessel function which we do not define here. The
main properties that we need are that it tends to 0 exponentially (more precisely
$K_\nu(x) \sim (\pi/(2x))^{1/2}e^{-x}$ as $x \to \infty$) and that $K_{-\nu} = K_\nu$. It follows from the above
Fourier expansion that $E(s)$ has an analytic continuation to the whole complex
plane, that it satisfies the functional equation $\mathcal{E}(1-s) = \mathcal{E}(s)$, where we set
$\mathcal{E}(s) = \Lambda(2s)E(s)$, and that $E(s)$ has a unique pole, at $s = 1$, which is simple with
residue $3/\pi$, independent of $\tau$.
Exercise 5.16. Using the properties of the Riemann zeta function $\zeta(s)$, show this
last property, i.e., that $E(s)$ has a unique pole, at $s = 1$, which is simple with residue
$3/\pi$, independent of $\tau$.
There are many reasons for introducing these nonholomorphic Eisenstein series,
but for us the main reason is that they are fundamental in unfolding methods. Recall
that using unfolding, in Proposition 3.12 we showed that Ek (or Gk ) was orthogonal
to any cusp form. In the present case, we obtain a different kind of result called a
Rankin–Selberg convolution. Let f and g be in Mk (Γ ), one of them being a cusp
form. Since E(s) is invariant by Γ the scalar product < E(s) f , g > makes sense, and
the following proposition gives its value:
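Proposition 5.17. Let $f(\tau) = \sum_{n\ge0} a(n)q^n$ and $g(\tau) = \sum_{n\ge0} b(n)q^n$ be in $M_k(\Gamma)$,
with at least one being a cusp form. For $\Re(s) > 1$ we have
\[ \langle E(s)f, g\rangle = \frac{\Gamma(s+k-1)}{(4\pi)^{s+k-1}}\sum_{n\ge1} \frac{a(n)\overline{b(n)}}{n^{s+k-1}} . \]

Proof. We essentially copy the proof of Proposition 3.12 so we skip the details:
setting temporarily $F(\tau) = f(\tau)\overline{g(\tau)}y^k$, which is invariant by $\Gamma$, we have
\begin{align*}
\langle E(s)f, g\rangle &= \int_{\Gamma\backslash\mathcal{H}} \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \Im(\gamma(\tau))^s F(\gamma(\tau))\,d\mu \\
&= \int_{\Gamma_\infty\backslash\mathcal{H}} \Im(\tau)^s F(\tau)\,d\mu \\
&= \int_0^\infty y^{s+k-2}\int_0^1 f(x+iy)\overline{g(x+iy)}\,dx\,dy .
\end{align*}
The inner integral is equal to the constant term in the Fourier expansion (in $x$) of
$f\overline{g}$, hence is equal to $\sum_{n\ge1} a(n)\overline{b(n)}e^{-4\pi ny}$ (note that by assumption one of $f$ and $g$
is a cusp form, so the term $n = 0$ vanishes), and the proposition follows from
$\int_0^\infty y^{s+k-2}e^{-4\pi ny}\,dy = \Gamma(s+k-1)/(4\pi n)^{s+k-1}$. □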

Corollary 5.18. For $\Re(s) > k$ set
\[ R(f,g)(s) = \sum_{n\ge1} \frac{a(n)\overline{b(n)}}{n^s} . \]
1. $R(f,g)(s)$ has an analytic continuation to the whole complex plane and satisfies
the functional equation $\mathcal{R}(2k-1-s) = \mathcal{R}(s)$ with
\[ \mathcal{R}(s) = \Lambda(2s-2k+2)(4\pi)^{-s}\Gamma(s)R(f,g)(s) . \]
2. $R(f,g)(s)$ has a single pole, which is simple, at $s = k$ with residue
\[ \frac{3}{\pi}\,\frac{(4\pi)^k}{(k-1)!}\,\langle f,g\rangle . \]

Proof. This immediately follows from the corresponding properties of $E(s)$: we
have
\[ \Lambda(2s-2k+2)(4\pi)^{-s}\Gamma(s)R(f,g)(s) = \langle \mathcal{E}(s-k+1)f, g\rangle , \]
and the right-hand side has an analytic continuation to $\mathbb{C}$ and is invariant when changing
$s$ into $2k-1-s$. In addition by the proposition $E(s-k+1) = \mathcal{E}(s-k+1)/\Lambda(2s-2k+2)$
has a single pole, which is simple, at $s = k$, with residue $3/\pi$, so $R(f,g)(s)$
also has a single pole, which is simple, at $s = k$ with residue
$\dfrac{3}{\pi}\,\dfrac{(4\pi)^k}{(k-1)!}\,\langle f,g\rangle$. □


It is an important fact (see Theorem 7.9 of my notes on L-functions in the present
volume) that L-functions having analytic continuation and standard functional equa-
tions can be very efficiently computed at any point in the complex plane (see the
note after the proof of Corollary 5.9 for the special case of L(F, s)). Thus the above
corollary gives a very efficient method for computing Petersson scalar products.
Note that the holomorphic Eisenstein series Ek (τ ) can also be used to give
Rankin–Selberg convolutions, but now between forms of different weights:
Exercise 5.19. Let $f = \sum_{n\ge0} a(n)q^n \in M_\ell(\Gamma)$ and $g = \sum_{n\ge0} b(n)q^n \in M_{k+\ell}(\Gamma)$, at
least one being a cusp form. Using exactly the same unfolding method as in the
above proposition or as in Proposition 3.12, show that
\[ \langle E_k f, g\rangle = \frac{(k+\ell-2)!}{(4\pi)^{k+\ell-1}}\sum_{n\ge1} \frac{a(n)\overline{b(n)}}{n^{k+\ell-1}} . \]

6 Modular Forms on Subgroups of Γ

6.1 Types of Subgroups

We have used as basic definition of (weak) modularity F|k γ = F for all γ ∈ Γ . But
there is no reason to restrict to Γ : we could very well ask the same modularity
condition for some group G of transformations of H different from Γ .
There are many types of such groups, and they have been classified: for us, we
will simply distinguish three types, with no justification. For any such group G we
can talk about a fundamental domain, similar to F that we have drawn above (I do
not want to give a rigorous definition here). We can distinguish essentially three
types of such domains, corresponding to three types of groups.
The first type is when the domain (more precisely its closure) is compact: we say
in that case that G is cocompact. It is equivalent to saying that it does not have any
“cusp” such as i∞ in the case of Γ. These groups are very important, but we will not
consider them here.
The second type is when the domain is not compact (i.e., it has cusps), but it has
finite volume for the measure d µ = dxdy/y2 on H defined in Exercise 3.11. Such a
group is said to have finite covolume, and the main example is G = Γ that we have
just considered, hence also evidently all the subgroups of Γ of finite index.
Exercise 6.1. Show that the covolume of the modular group Γ is finite and equal to
π /3.
The third type is when the volume is infinite: a typical example is the group
$\Gamma_\infty$ generated by integer translations, i.e., the set of matrices $\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$ with $n \in \mathbb{Z}$. A fundamental
domain is then any vertical strip in H of width 1, which can trivially be shown
to have infinite volume. These groups are not important (at least for us) for the
following reason: they would have “too many” modular forms. For instance, in the
case of Γ∞ a “modular form” would simply be a holomorphic periodic function of
period 1, and we come back to the theory of Fourier series, much less interesting.
We will therefore restrict to groups of the second type, which are called Fuchsian
groups of the first kind. In fact, for this course we will even restrict to subgroups G
of Γ of finite index.
However, even with this restriction, it is still necessary to distinguish two types
of subgroups: the so-called congruence subgroups, and the others, of course called
non-congruence subgroups. The theory of modular forms on non-congruence sub-
groups is quite a difficult subject and active research is being done on them. One
annoying aspect is that they apparently do not have a theory of Hecke operators.
Thus we will restrict even more to congruence subgroups. We give the following
definitions:
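Definition 6.2. Let $N \ge 1$ be an integer.
1. We define
\[ \Gamma(N) = \Bigl\{ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma,\ \gamma \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \pmod N \Bigr\} , \]
\[ \Gamma_1(N) = \Bigl\{ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma,\ \gamma \equiv \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix} \pmod N \Bigr\} , \]
\[ \Gamma_0(N) = \Bigl\{ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma,\ \gamma \equiv \begin{pmatrix} * & * \\ 0 & * \end{pmatrix} \pmod N \Bigr\} , \]
where the congruences are component-wise and $*$ indicates that no congruence
is imposed.
2. A subgroup of $\Gamma$ is said to be a congruence subgroup if it contains $\Gamma(N)$ for
some $N$, and the smallest such $N$ is called the level of the subgroup.

It is clear that $\Gamma(N) \subset \Gamma_1(N) \subset \Gamma_0(N)$, and it is trivial to prove that $\Gamma(N)$
is normal in $\Gamma$ (hence in any subgroup of $\Gamma$ containing it), that $\Gamma_1(N)/\Gamma(N) \simeq
\mathbb{Z}/N\mathbb{Z}$ (with the map $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto b \bmod N$), and that $\Gamma_1(N)$ is normal in $\Gamma_0(N)$ with
$\Gamma_0(N)/\Gamma_1(N) \simeq (\mathbb{Z}/N\mathbb{Z})^*$ (with the map $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto d \bmod N$).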
If G is a congruence subgroup of level N we have Γ (N) ⊂ G, so (whatever the
definition) a modular form on G will in particular be on Γ (N). Because of the above
isomorphisms, it is not difficult to reduce the study of forms on Γ (N) to those on
Γ1 (N), and the latter to forms on Γ0 (N), except that we have to add a slight “twist”
to the modularity property. Thus for simplicity, we will restrict to modular forms on
Γ0 (N).

6.2 Modular Forms on Subgroups

In view of the definition given for $\Gamma$, it is natural to say that $F$ is weakly modular
of weight $k$ on $\Gamma_0(N)$ if for all $\gamma \in \Gamma_0(N)$ we have $F|_k\gamma = F$, where we recall that
if $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ then $F|_k\gamma(\tau) = (c\tau+d)^{-k}F(\gamma(\tau))$. To obtain a modular form, we need
also to require that $F$ is holomorphic on $\mathcal{H}$, plus some additional technical condition
“at infinity”. In the case of the full modular group $\Gamma$, this condition was that
$F(\tau)$ remains bounded as $\Im(\tau) \to \infty$. In the case of a subgroup, this condition is
not sufficient (it is easy to show that if we do not require an additional condition
the corresponding space will in general be infinite-dimensional). There are several
equivalent ways of giving the additional condition. One is the following: writing as
usual $\tau = x + iy$, we require that there exists $N$ such that in the strip $-1/2 \le x \le 1/2$,
we have $|F(\tau)| \le y^N$ as $y \to \infty$ and $|F(\tau)| \le y^{-N}$ as $y \to 0$ (since $F$ is 1-periodic,
there is no loss of generality in restricting to the strip).
It is easily shown that if F is weakly modular and holomorphic, then the above
inequalities imply that |F(τ )| is in fact bounded as y → ∞ (but in general not as
y → 0), so the first condition is exactly the one that we gave in the case of the full
modular group.
Similarly, we can define a cusp form by asking that in the above strip |F(τ )| tends
to 0 as y → ∞ and as y → 0.

Exercise 6.3. If $F \in M_k(\Gamma)$ show that the second condition $|F(\tau)| \le y^{-N}$ as $y \to 0$
is satisfied.

Now that we have a solid definition of modular form, we can try to proceed as
in the case of the full modular group. A number of things can easily be generalized.
It is always convenient to choose a system of representatives $(\gamma_j)$ of right cosets for
$\Gamma_0(N)$ in $\Gamma$, so that
\[ \Gamma = \bigsqcup_j \Gamma_0(N)\gamma_j . \]
For instance, if $\mathcal{F}$ is the fundamental domain of $\Gamma$ seen above, one can choose $D =
\bigsqcup_j \gamma_j(\mathcal{F})$ as fundamental domain for $\Gamma_0(N)$. The theorem that we gave on valuations
generalizes immediately:
\[ \sum_{\tau\in\overline{D}} \frac{v_\tau(F)}{e_\tau} = \frac{k}{12}\,[\Gamma : \Gamma_0(N)] , \]
where $\overline{D}$ is $D$ to which is added a finite number of “cusps” (we do not explain this;
it is not the topological closure), $e_\tau = 2$ (resp., 3) if $\tau$ is $\Gamma$-equivalent to $i$ (resp., to
$\rho$), and $e_\tau = 1$ otherwise, and we can then deduce the dimension of $M_k(\Gamma_0(N))$ and
$S_k(\Gamma_0(N))$ as we did for $\Gamma$:
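Theorem 6.4 We have $M_0(\Gamma_0(N)) = \mathbb{C}$ (i.e., the only modular forms of weight 0 are
the constants) and $S_0(\Gamma_0(N)) = \{0\}$. For $k \ge 2$ even, we have
\[ \dim(M_k(\Gamma_0(N))) = A_1 - A_{2,3} - A_{2,4} + A_3 \quad\text{and} \]
\[ \dim(S_k(\Gamma_0(N))) = A_1 - A_{2,3} - A_{2,4} - A_3 + \delta_{k,2} , \]
where $\delta_{k,2}$ is the Kronecker symbol (1 if $k = 2$, 0 otherwise) and the $A_i$ are given as
follows:
\[ A_1 = \frac{k-1}{12}\,N\prod_{p\mid N}\Bigl(1 + \frac{1}{p}\Bigr) , \]
\[ A_{2,3} = \Bigl(\frac{k-1}{3} - \Bigl\lfloor\frac{k}{3}\Bigr\rfloor\Bigr)\prod_{p\mid N}\Bigl(1 + \Bigl(\frac{-3}{p}\Bigr)\Bigr) \ \text{if } 9 \nmid N,\ 0 \text{ otherwise,} \]
\[ A_{2,4} = \Bigl(\frac{k-1}{4} - \Bigl\lfloor\frac{k}{4}\Bigr\rfloor\Bigr)\prod_{p\mid N}\Bigl(1 + \Bigl(\frac{-4}{p}\Bigr)\Bigr) \ \text{if } 4 \nmid N,\ 0 \text{ otherwise,} \]
\[ A_3 = \frac{1}{2}\sum_{d\mid N} \phi(\gcd(d, N/d)) . \]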
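The dimension formulas are straightforward to implement. The following sketch (helper names are illustrative) evaluates the theorem with exact rational arithmetic; the symbols $\bigl(\frac{-3}{p}\bigr)$ and $\bigl(\frac{-4}{p}\bigr)$ are taken to be Kronecker symbols, so that they vanish at $p = 3$ and $p = 2$ respectively, which is an assumption about the intended convention.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(n):
    """Sorted list of the primes dividing n (naive, for small n)."""
    return [p for p in range(2, n + 1) if n % p == 0 and all(p % q for q in range(2, p))]


def euler_phi(n):
    """Euler's totient function, by direct counting."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)


def kronecker(disc, p):
    """Kronecker symbol (disc/p) for disc in {-3, -4} and p prime (the only cases needed)."""
    if disc % p == 0:
        return 0
    return 1 if p % (-disc) == 1 else -1


def dims(N, k):
    """(dim M_k(Gamma_0(N)), dim S_k(Gamma_0(N))) for even k >= 2, following Theorem 6.4."""
    primes = prime_divisors(N)
    a1 = Fraction(k - 1, 12) * N
    for p in primes:
        a1 *= Fraction(p + 1, p)
    a23 = Fraction(0)
    if N % 9 != 0:
        a23 = Fraction(k - 1, 3) - k // 3
        for p in primes:
            a23 *= 1 + kronecker(-3, p)
    a24 = Fraction(0)
    if N % 4 != 0:
        a24 = Fraction(k - 1, 4) - k // 4
        for p in primes:
            a24 *= 1 + kronecker(-4, p)
    a3 = Fraction(sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0), 2)
    dim_m = a1 - a23 - a24 + a3
    dim_s = a1 - a23 - a24 - a3 + (1 if k == 2 else 0)
    assert dim_m.denominator == 1 and dim_s.denominator == 1  # both must be integers
    return int(dim_m), int(dim_s)


print(dims(1, 12), dims(1, 24), dims(4, 2), dims(11, 2))
```

For $N = 1$ this recovers the dimensions quoted earlier (for instance $S_{12}(\Gamma)$ of dimension 1 and $S_{24}(\Gamma)$ of dimension 2), and for $N = 4$, $k = 2$ it gives the two-dimensional space used in the four-squares exercise.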

6.3 Examples of Modular Forms on Subgroups

We give a few examples of modular forms on subgroups. First note the following
easy lemma:

Lemma 6.5. If F ∈ Mk (Γ0 (N)) then for any m ∈ Z≥1 we have F(mτ ) ∈ Mk (Γ0 (mN)).

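Proof. Trivial since when $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(mN)$ one can write $m(a\tau+b)/(c\tau+d) =
(a(m\tau)+mb)/((c/m)(m\tau)+d)$, and $\begin{pmatrix} a & mb \\ c/m & d \end{pmatrix} \in \Gamma_0(N)$. □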

Thus we can already construct many forms on subgroups, but in a sense they are
not very interesting, since they are “old” in a precise sense that we will define below.

A second more interesting example is Eisenstein series: there are more general
Eisenstein series than those that we have seen for Γ , but we simply give the follow-
ing important example: using a similar proof to the above lemma we can construct
Eisenstein series of weight 2 as follows. Recall that E2 (τ ) = 1 − 24 ∑n≥1 σ1 (n)qn is
not quite modular, and that E2∗ (τ ) = E2 (τ ) − 3/(π ℑ(τ )) is weakly modular (but of
course non-holomorphic). Consider the function F(τ ) = NE2 (N τ ) − E2 (τ ), analo-
gous to the construction of the lemma with a correction term.
We have the evident but crucial fact that we also have F(τ ) = NE2∗ (N τ ) − E2∗ (τ )
(since ℑ(τ ) is multiplied by N), so F is also weakly modular on Γ0 (N), but since it
is holomorphic we have thus constructed a (nonzero) modular form of weight 2 on
Γ0 (N).
A third important example is provided by theta series. This would require a book
in itself, so we restrict to the simplest case. We have seen in Corollary 1.3 that the
function $T(a) = \sum_{n\in\mathbb{Z}} e^{-a\pi n^2}$ satisfies $T(1/a) = a^{1/2}T(a)$, which looks like (and is)
a modularity condition. This was for $a > 0$ real. Let us generalize and for $\tau \in \mathcal{H}$
set
\[ \theta(\tau) = \sum_{n\in\mathbb{Z}} q^{n^2} = \sum_{n\in\mathbb{Z}} e^{2\pi i n^2\tau} , \]
so that for instance we simply have $T(a) = \theta(ia/2)$. The proof of the functional
equation for $T$ that we gave using Poisson summation is still valid in this more
general case and shows that
\[ \theta(-1/(4\tau)) = (2\tau/i)^{1/2}\,\theta(\tau) . \]
On the other hand, the definition trivially shows that $\theta(\tau+1) = \theta(\tau)$. If we denote
by $W_4$ the matrix $\begin{pmatrix} 0 & -1 \\ 4 & 0 \end{pmatrix}$ corresponding to the map $\tau \mapsto -1/(4\tau)$ and as usual $T =
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, we thus have $\theta|_{1/2}W_4 = c\,\theta$ and $\theta|_{1/2}T = \theta$ for some 8th root of unity $c$.
(Note: we always use the principal determination of the square roots; if you are
uncomfortable with this, simply square everything, this is what we will do below
anyway.) This implies that if we let $\Gamma_\theta$ be the intersection of $\Gamma$ with the group
generated by $W_4$ and $T$ (as transformations of $\mathcal{H}$), then for all $\gamma \in \Gamma_\theta$ we will have
$\theta|_{1/2}\gamma = c(\gamma)\theta$ for some 8th root of unity $c(\gamma)$, but in fact $c(\gamma)$ is a 4th root of unity
which we will give explicitly below.
One can easily describe this group $\Gamma_\theta$, and in particular show that it contains $\Gamma_0(4)$
as a subgroup of index 2. This implies that $\theta^4 \in M_2(\Gamma_0(4))$, and more generally of
course $\theta^{4m} \in M_{2m}(\Gamma_0(4))$.
As one of the most famous applications of the finite-dimensionality of modular
form spaces, solve the following exercise:

Exercise 6.6. 1. Using the dimension formulas, show that $2E_2(2\tau) - E_2(\tau)$ together
with $4E_2(4\tau) - E_2(\tau)$ form a basis of $M_2(\Gamma_0(4))$.
2. Using the Fourier expansion of $E_2$, deduce an explicit formula for the Fourier
expansion of $\theta^4$, and hence that $r_4(n)$, the number of representations of $n$ as
a sum of 4 squares (in $\mathbb{Z}$, all permutations counted) is given for $n \ge 1$ by the
formula
\[ r_4(n) = 8\bigl(\sigma_1(n) - 4\sigma_1(n/4)\bigr) , \]
where it is understood that $\sigma_1(x) = 0$ if $x \notin \mathbb{Z}$. In particular, show that this trivially
implies Lagrange's theorem that every integer is a sum of four squares.
3. Similarly, show that $r_8(n)$, the $n$th Fourier coefficient of $\theta^8$, is given for $n \ge 1$ by
\[ r_8(n) = 16\bigl(\sigma_3(n) - 2\sigma_3(n/2) + 16\sigma_3(n/4)\bigr) . \]
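Both formulas can be checked against a direct count before attempting the proof. The sketch below (helper names are illustrative) obtains $r_4(n)$ and $r_8(n)$ as coefficients of truncated powers of $\theta = \sum_{n\in\mathbb{Z}} q^{n^2}$ and compares them with the divisor-sum expressions for small $n$.

```python
def theta_power_coeffs(power, n_max):
    """Coefficients of q^0, ..., q^n_max in theta^power, where theta = sum over n in Z of q^(n^2)."""
    theta = [0] * (n_max + 1)
    theta[0] = 1
    m = 1
    while m * m <= n_max:
        theta[m * m] = 2  # n and -n both contribute
        m += 1
    result = [1] + [0] * n_max
    for _ in range(power):
        # truncated product of power series
        result = [sum(result[i] * theta[n - i] for i in range(n + 1))
                  for n in range(n_max + 1)]
    return result


def sigma(a, num, den=1):
    """sigma_a(num/den), with the convention that it is 0 when num/den is not an integer."""
    if num % den != 0:
        return 0
    n = num // den
    return sum(d ** a for d in range(1, n + 1) if n % d == 0)


def formulas_hold(n_max=60):
    """Compare the counted values of r_4 and r_8 with the closed formulas for 1 <= n <= n_max."""
    r4 = theta_power_coeffs(4, n_max)
    r8 = theta_power_coeffs(8, n_max)
    return all(
        r4[n] == 8 * (sigma(1, n) - 4 * sigma(1, n, 4))
        and r8[n] == 16 * (sigma(3, n) - 2 * sigma(3, n, 2) + 16 * sigma(3, n, 4))
        for n in range(1, n_max + 1)
    )


print(theta_power_coeffs(4, 5), formulas_hold())
```

Since $\sigma_1(n) - 4\sigma_1(n/4)$ is the sum of the divisors of $n$ not divisible by 4, it is at least 1, which is the positivity behind Lagrange's theorem.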

Remark 6.7. Using more general methods one can give “closed” formulas for rk (n)
for k = 1, 2, 3, 4, 5, 6, 7, 8, and 10, see e.g., [1].

6.4 Hecke Operators and L-Functions

We can introduce the same Hecke operators as before, but to have a reasonable
definition we must add a coprimality condition: we define $T(n)(\sum_{m\ge0} a(m)q^m) =
\sum_{m\ge0} b(m)q^m$, with
\[ b(m) = \sum_{\substack{d\mid\gcd(m,n) \\ \gcd(d,N)=1}} d^{k-1}\,a(mn/d^2) . \]
This additional condition $\gcd(d,N) = 1$ is of course automatically satisfied if $n$ is
coprime to $N$, but not otherwise.
One then shows exactly like in the case of the full modular group that
\[ T(n)T(m) = \sum_{\substack{d\mid\gcd(n,m) \\ \gcd(d,N)=1}} d^{k-1}\,T(nm/d^2) , \]

that they preserve modularity, so in particular the T (n) form a commutative algebra
of operators on Sk (Γ0 (N)). And this is where the difficulties specific to subgroups of
Γ begin: in the case of Γ we stated (without proof nor definition) that the T (n) were
Hermitian with respect to the Petersson scalar product, and deduced the existence of
eigenforms for all Hecke operators. Unfortunately here the same proof shows that
the T (n) are Hermitian when n is coprime to N, but not otherwise.
It follows that there exist common eigenforms for the T (n), but only for n co-
prime to N, which creates difficulties.
An analogous problem occurs for Dirichlet characters: if χ is a Dirichlet charac-
ter modulo N, it may in fact come by natural extension from a character modulo M
for some divisor M | N, M < N. The characters which have nice properties, in par-
ticular with respect to the functional equation of their L-functions, are the primitive
characters, for which such an M does not exist.
A similar but slightly more complicated thing can be done for modular forms.
It is clear that if M | N and F ∈ Mk (Γ0 (M)), then of course F ∈ Mk (Γ0 (N)). More
generally, by Lemma 6.5, for any d | N/M we have F(d τ ) ∈ Mk (Γ0 (N)). Thus we
want to exclude such “oldforms”. However it is not sufficient to say that a newform
is not an oldform. The correct definition is to define a newform as a form which is
orthogonal to the space of oldforms with respect to the scalar product, and of course
the new space is the space of newforms. Note that in the case of Dirichlet characters
this orthogonality condition (for the standard scalar product of two characters) is
automatically satisfied so need not be added.
This theory was developed by Atkin–Lehner–Li, and the new space $S_k^{\mathrm{new}}(\Gamma_0(N))$
can be shown to have all the nice properties that we require. Although not trivial,
one can prove that it has a basis of common eigenforms for all Hecke operators, not
only those with n coprime to N. More precisely, one shows that in the new space
an eigenform for the T (n) for all n coprime to N is automatically an eigenform for
any operator which commutes with all the T (n), such as, of course, the T (m) for
gcd(m, N) > 1.
In addition, we have not really lost anything by restricting to the new space, since
it is easy to show that
\[ S_k(\Gamma_0(N)) = \bigoplus_{M \mid N}\ \bigoplus_{d \mid N/M} B(d)\,S_k^{\mathrm{new}}(\Gamma_0(M)), \]

where B(d) is the operator sending F(τ ) to F(d τ ). Note that the sums in the above
formula are direct sums.
Exercise 6.8. The above formula shows that

\[ \dim(S_k(\Gamma_0(N))) = \sum_{M \mid N} \sigma_0(N/M)\,\dim(S_k^{\mathrm{new}}(\Gamma_0(M))), \]
where $\sigma_0(n)$ is the number of divisors of $n$.

1. Using the Möbius inversion formula, show that if we define an arithmetic function
$\beta$ by $\beta(p) = -2$, $\beta(p^2) = 1$, and $\beta(p^k) = 0$ for $k \ge 3$, and extend by
multiplicativity ($\beta(\prod p_i^{v_i}) = \prod \beta(p_i^{v_i})$), we have the following dimension
formula for the new space:
\[ \dim(S_k^{\mathrm{new}}(\Gamma_0(N))) = \sum_{M \mid N} \beta(N/M)\,\dim(S_k(\Gamma_0(M))). \]


2. Using Theorem 6.4, deduce a direct formula for the dimension of the new space.
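The key point of item 1 is that $\beta$ is the inverse of $\sigma_0$ for Dirichlet convolution. The sketch below checks this numerically; the function names are hypothetical. The small table of values of $\dim S_2(\Gamma_0(M))$ used at the end is an assumption quoted from the standard genus values of $X_0(M)$, not something computed or stated in the text.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def sigma0(n):
    return len(divisors(n))


def beta(n):
    """Multiplicative function with beta(p) = -2, beta(p^2) = 1, beta(p^k) = 0 for k >= 3."""
    value = 1
    p = 2
    while n > 1:
        if n % p == 0:
            v = 0
            while n % p == 0:
                n //= p
                v += 1
            value *= {1: -2, 2: 1}.get(v, 0)
        p += 1
    return value


def convolution(n):
    """(beta * sigma0)(n); it should be 1 for n = 1 and 0 otherwise."""
    return sum(beta(d) * sigma0(n // d) for d in divisors(n))


inverse_ok = convolution(1) == 1 and all(convolution(n) == 0 for n in range(2, 500))

# Illustration with assumed (standard) values of dim S_2(Gamma_0(M)) for M | 22.
dim_S2 = {1: 0, 2: 0, 11: 1, 22: 2}


def dim_new(N):
    return sum(beta(N // M) * dim_S2[M] for M in divisors(N))


print(inverse_ok, dim_new(11), dim_new(22))
```

With these assumed inputs the new space of level 22 and weight 2 comes out as zero-dimensional: the two-dimensional space $S_2(\Gamma_0(22))$ is spanned by $F(\tau)$ and $F(2\tau)$ for the level 11 form $F$.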

Proposition 6.9. Let $F \in S_k(\Gamma_0(N))$ and $W_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$.
1. We have $F|_k W_N \in S_k(\Gamma_0(N))$, where
\[ F|_k W_N(\tau) = N^{-k/2}\,\tau^{-k}\,F(-1/(N\tau)). \]
2. If $F$ is an eigenform (in the new space) then $F|_k W_N = \pm F$ for a suitable sign $\pm$.
Proof. (1): this simply follows from the fact that $W_N$ normalizes $\Gamma_0(N)$:
$W_N^{-1}\Gamma_0(N)W_N = \Gamma_0(N)$, as can easily be checked, and the same result would be
true for any other normalizing operator such as the Atkin–Lehner operators, which we will
not define. The operator $W_N$ is called the Fricke involution.
(2): It is easy to show that $W_N$ commutes with all Hecke operators $T(n)$ when
$\gcd(n, N) = 1$, so by what we have mentioned above, if $F$ is an eigenform in the new
space it is automatically an eigenform for $W_N$, and since $W_N$ acts as an involution,
its eigenvalues are $\pm1$. □
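The sign in part 2 can be observed numerically. As a sketch, we take for $F$ the eta quotient $\eta(\tau)^2\eta(11\tau)^2 \in S_2(\Gamma_0(11))$ (it appears among the examples of Section 6.7) and evaluate both sides of the definition of $F|_k W_N$ at an arbitrary point of the upper half-plane. The function names, the truncation of the eta product and the sample point are assumptions made for illustration.

```python
import cmath


def eta(tau, terms=200):
    """Dedekind eta function: q^(1/24) * prod_{n>=1} (1 - q^n), with q = exp(2*pi*i*tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    product = 1
    for n in range(1, terms + 1):
        product *= 1 - q ** n
    return cmath.exp(2j * cmath.pi * tau / 24) * product


def F(tau):
    return eta(tau) ** 2 * eta(11 * tau) ** 2


def fricke(form, tau, N, k):
    """(form |_k W_N)(tau) = N^(-k/2) * tau^(-k) * form(-1/(N*tau))."""
    return N ** (-k / 2) * tau ** (-k) * form(-1 / (N * tau))


tau0 = complex(0.13, 0.35)
ratio = fricke(F, tau0, 11, 2) / F(tau0)
print(ratio)
```

The ratio should be close to $-1$, in agreement with the transformation $\eta(-1/\tau) = (\tau/i)^{1/2}\eta(\tau)$, which gives $F|_2 W_{11} = -F$ for this form.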

The eigenforms can again be normalized with $a(1) = 1$, and their L-function has
an Euler product, of a slightly more general shape:
\[ L(F,s) = \prod_{p \nmid N} \frac{1}{1 - a(p)p^{-s} + p^{k-1}p^{-2s}}\ \prod_{p \mid N} \frac{1}{1 - a(p)p^{-s}}. \]

Proposition 5.8 is of course still valid, but is not the correct normalization to obtain
a functional equation. We replace it by
\[ N^{s/2}(2\pi)^{-s}\Gamma(s)L(F,s) = \int_0^\infty F(it/N^{1/2})\,t^{s-1}\,dt, \]

which of course is trivial from the proposition by replacing t by t/N 1/2 . Indeed,
thanks to the above proposition we split the integral at t = 1, and using the action of
WN we deduce the following proposition:

Proposition 6.10. Let $F \in S_k^{\mathrm{new}}(\Gamma_0(N))$ be an eigenform for all Hecke operators,
and write $F|_k W_N = \varepsilon F$ for some $\varepsilon = \pm1$. The L-function $L(F,s)$ extends to a
holomorphic function in $\mathbb{C}$, and if we set $\Lambda(F,s) = N^{s/2}(2\pi)^{-s}\Gamma(s)L(F,s)$ we have the
functional equation
\[ \Lambda(F, k-s) = \varepsilon\, i^{-k}\,\Lambda(F,s). \]

Proof. Indeed, the trivial change of variable $t$ into $1/t$ proves the formula
\[ N^{s/2}(2\pi)^{-s}\Gamma(s)L(F,s) = \int_1^\infty F(it/N^{1/2})\bigl(t^{s-1} + \varepsilon\, i^k\, t^{k-1-s}\bigr)\,dt, \]
from which the result follows. □


Once again, we leave to the reader to check that if $F(\tau) = \sum_{n\ge1} a(n)q^n$ we have
\[ L(F, k/2) = \bigl(1 + \varepsilon(-1)^{k/2}\bigr) \sum_{n\ge1} \frac{a(n)}{n^{k/2}}\, e^{-2\pi n/N^{1/2}}\, P_{k/2}(2\pi n/N^{1/2}). \]

6.5 Modular Forms with Characters

Consider again the problem of sums of squares, in other words of the powers of
θ (τ ). We needed to raise it to a power which is a multiple of 4 so as to have a pure
modularity property as we defined it above. But consider the function $\theta^2(\tau)$. The
same proof that we mentioned for $\theta^4$ shows that for any
$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(4)$ we have
\[ \theta^2(\gamma(\tau)) = \left(\frac{-4}{d}\right)(c\tau+d)\,\theta^2(\tau), \]
where $\left(\frac{-4}{d}\right)$ is the Legendre–Kronecker character (in this specific case equal to
$(-1)^{(d-1)/2}$ since $d$ is odd, being coprime to $c$). Thus it satisfies a modularity
property, except that it is “twisted” by $\left(\frac{-4}{d}\right)$. Note that the equation makes sense
since if we change $\gamma$ into $-\gamma$ (which does not change $\gamma(\tau)$), then $(c\tau+d)$ is changed
into $-(c\tau+d)$, and $\left(\frac{-4}{d}\right)$ is changed into $\left(\frac{-4}{-d}\right) = -\left(\frac{-4}{d}\right)$. It is thus essential
that the multiplier that we put in front of $(c\tau+d)^k$, here $\left(\frac{-4}{d}\right)$, has the same parity as $k$.
We mentioned above that the study of modular forms on $\Gamma_1(N)$ could be reduced
to those on $\Gamma_0(N)$ “with a twist”. Indeed, more precisely it is trivial to show that
\[ M_k(\Gamma_1(N)) = \bigoplus_{\chi(-1) = (-1)^k} M_k(\Gamma_0(N), \chi), \]

where χ ranges through all Dirichlet characters modulo N of the specified parity,
and where Mk (Γ0 (N), χ ) is defined as the space of functions F satisfying

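\[ F(\gamma(\tau)) = \chi(d)\,(c\tau+d)^k\,F(\tau) \]
for all $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N)$, plus the usual holomorphy and conditions at the cusps
(note that $\gamma \mapsto \chi(d)$ is the group homomorphism from $\Gamma_0(N)$ to $\mathbb{C}^*$ which induces
the above-mentioned isomorphism from $\Gamma_0(N)/\Gamma_1(N)$ to $(\mathbb{Z}/N\mathbb{Z})^*$).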

Exercise 6.11. 1. Show that a system of coset representatives of $\Gamma_1(N)\backslash\Gamma_0(N)$ is
given by matrices $M_d = \begin{pmatrix} u & -v \\ N & d \end{pmatrix}$, where $0 \le d < N$ such that $\gcd(d, N) = 1$ and $u$
and $v$ are such that $ud + vN = 1$.
2. Let $f \in M_k(\Gamma_1(N))$. Show that in the above decomposition of $M_k(\Gamma_1(N))$ we have
$f = \sum_{\chi(-1) = (-1)^k} f_\chi$ with
\[ f_\chi = \sum_{0 \le d < N,\ \gcd(d,N)=1} \chi(d)\, f|_k M_d. \]

These spaces are just as nice as the spaces $M_k(\Gamma_0(N))$ and share exactly the same
properties. They have finite dimension (which we do not give), there are Eisenstein
series, Hecke operators, newforms, Euler products, L-functions, etc... An excellent
rule of thumb is simply to replace any formula containing $d^{k-1}$ (or $p^{k-1}$)
by $\chi(d)d^{k-1}$ (or $\chi(p)p^{k-1}$). In fact, in the Euler product of the L-function of an
eigenform we do not need to distinguish $p \nmid N$ and $p \mid N$ since we have
\[ L(F,s) = \prod_{p \in P} \frac{1}{1 - a(p)p^{-s} + \chi(p)p^{k-1-2s}}, \]
and $\chi(p) = 0$ if $p \mid N$ since $\chi$ is a character modulo $N$.

Thus, for instance $\theta^2 \in M_1(\Gamma_0(4), \chi_{-4})$, more generally $\theta^{4m+2} \in M_{2m+1}(\Gamma_0(4), \chi_{-4})$,
where we use the notation $\chi_D$ for the Legendre–Kronecker symbol $\left(\frac{D}{\cdot}\right)$.
The space $M_1(\Gamma_0(4), \chi_{-4})$ has dimension 1, generated by the single Eisenstein
series
\[ 1 + 4\sum_{n\ge1} \sigma_0^{(-4)}(n)\,q^n, \quad\text{where}\quad \sigma_{k-1}^{(D)}(n) = \sum_{d \mid n} \left(\frac{D}{d}\right) d^{k-1}, \]
according to our rule of thumb (which does not tell us the constant 4). Comparing
constant coefficients, we deduce that $\theta^2$ is equal to this series, so that
$r_2(n) = 4\sigma_0^{(-4)}(n)$ for $n \ge 1$, where as usual $r_2(n)$ is the
number of representations of $n$ as a sum of two squares. This formula was in essence
discovered by Fermat.
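The two-squares formula is easy to test by counting lattice points directly. This is a minimal sketch with hypothetical helper names; `chi_minus4` implements the character $\left(\frac{-4}{d}\right)$ as described above.

```python
from math import isqrt


def chi_minus4(d):
    """Kronecker character (-4/d): 0 for even d, +1 if d = 1 mod 4, -1 if d = 3 mod 4."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1


def sigma_chi(k_minus_1, n):
    """sigma_{k-1}^{(-4)}(n) = sum over d | n of (-4/d) * d^(k-1)."""
    return sum(chi_minus4(d) * d ** k_minus_1 for d in range(1, n + 1) if n % d == 0)


def r2_count(n):
    """Number of (x, y) in Z^2 with x^2 + y^2 = n, by direct enumeration."""
    m = isqrt(n)
    return sum(1 for x in range(-m, m + 1) for y in range(-m, m + 1) if x * x + y * y == n)


mismatches = [n for n in range(1, 301) if r2_count(n) != 4 * sigma_chi(0, n)]
print(mismatches)
```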
For $r_6(n)$ we must work slightly more: $\theta^6 \in M_3(\Gamma_0(4), \chi_{-4})$, and this space has
dimension 2, generated by two Eisenstein series. The first is the natural “rule of
thumb” one (which again does not give us the constant)
\[ F_1 = 1 - 4\sum_{n\ge1} \sigma_2^{(-4)}(n)\,q^n, \]
and the second is
\[ F_2 = \sum_{n\ge1} \sigma_2^{(-4,*)}(n)\,q^n, \]
where
\[ \sigma_{k-1}^{(D,*)}(n) = \sum_{d \mid n} \left(\frac{D}{n/d}\right) d^{k-1}, \]
a sort of dual to $\sigma_{k-1}^{(D)}$ (this is my notation). Since $\theta^6 = 1 + 12q + \cdots$, comparing
the Fourier coefficients of $1$ and $q$ shows that $\theta^6 = F_1 + 16F_2$, so we deduce that
\[ r_6(n) = -4\sigma_2^{(-4)}(n) + 16\sigma_2^{(-4,*)}(n) = \sum_{d \mid n} \left(16\left(\frac{-4}{n/d}\right) - 4\left(\frac{-4}{d}\right)\right) d^2. \]
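As a sanity check on the constants $-4$ and $16$, the formula for $r_6(n)$ can be compared with the coefficients of $\theta^6$ computed by series multiplication. The helper names below are hypothetical.

```python
def chi_minus4(d):
    """Kronecker character (-4/d)."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1


def theta_power(k, limit):
    """Return [r_k(0), ..., r_k(limit)], the q-expansion coefficients of theta^k."""
    theta = [0] * (limit + 1)
    x = 0
    while x * x <= limit:
        theta[x * x] = 1 if x == 0 else 2
        x += 1
    power = [1] + [0] * limit
    for _ in range(k):
        product = [0] * (limit + 1)
        for i, a in enumerate(power):
            if a:
                for j in range(limit + 1 - i):
                    product[i + j] += a * theta[j]
        power = product
    return power


def r6_formula(n):
    """Sum over d | n of (16 * (-4/(n/d)) - 4 * (-4/d)) * d^2."""
    return sum((16 * chi_minus4(n // d) - 4 * chi_minus4(d)) * d * d
               for d in range(1, n + 1) if n % d == 0)


LIMIT = 60
r6 = theta_power(6, LIMIT)
mismatches = [n for n in range(1, LIMIT + 1) if r6[n] != r6_formula(n)]
print(r6[1:4], mismatches)
```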

6.6 Remarks on Dimension Formulas and Galois Representations

The explicit dimension formulas alluded to above are valid for k ∈ Z except for
k = 1; in addition, thanks to the theorems mentioned below, we also have explicit
dimension formulas for k ∈ 1/2 + Z. Thus, the theory of modular forms of weight 1
is very special, and their general construction more difficult.
This is also reflected in the construction of Galois representations attached to
modular eigenforms, which is an important and deep subject that we will not men-
tion in this course, except to say the following: in weight k ≥ 2 these representations
are ℓ-adic (or modulo ℓ), i.e., with values in GL2 (Qℓ ) (or GL2 (Fℓ )), while in weight
1 they are complex representations, i.e., with values in GL2 (C). The construction
in weight 2 is quite old, and comes directly from the construction of the so-called
Tate module T (ℓ) attached to an Abelian variety (more precisely the Jacobian of a
modular curve), while the construction in higher weight, due to Deligne, is much
deeper since it implies the third Ramanujan conjecture $|\tau(p)| < 2p^{11/2}$. Finally, the
case of weight 1 is due to Deligne–Serre, in fact using the construction for k ≥ 2
and congruences.

6.7 Origins of Modular Forms

Modular forms are all-pervasive in mathematics, physics, and combinatorics. We
just want to mention the most important constructions:
• Historically, the first modular forms were probably theta functions (this dates
back to J. Fourier at the beginning of the 19th century in his treatment of the heat
equation) such as θ (τ ) seen above, and more generally theta functions associated to
lattices. These functions can have integral or half-integral weight (see below)
depending on whether the number of variables which occur (equivalently, the
dimension of the lattice) is even or odd. Later, these theta functions were gener-
alized by introducing spherical polynomials associated to the lattice.
For example, the theta function associated to the lattice $\mathbb{Z}^2$ is simply
$f(\tau) = \sum_{(x,y)\in\mathbb{Z}^2} q^{x^2+y^2}$, which is clearly equal to $\theta^2$, so belongs to
$M_1(\Gamma_0(4), \chi_{-4})$. But we can also consider for instance
\[ f_5(\tau) = \sum_{(x,y)\in\mathbb{Z}^2} (x^4 - 6x^2y^2 + y^4)\,q^{x^2+y^2}, \]
and show that $f_5 \in S_5(\Gamma_0(4), \chi_{-4})$:


Exercise 6.12. 1. Using the notation and results of Exercise 3.39, show that
$[\theta, \theta]_2 = c f_5$ for a suitable constant $c$, so that in particular $f_5 \in S_5(\Gamma_0(4), \chi_{-4})$.
2. Show that the polynomial $P(x, y) = x^4 - 6x^2y^2 + y^4$ is a spherical polynomial,
in other words that $D(P) = 0$, where $D$ is the Laplace differential operator
$D = \partial^2/\partial x^2 + \partial^2/\partial y^2$.
• The second occurrence of modular forms is probably Eisenstein series, which in
fact are the first that we encountered in this course. We have only seen the most
basic Eisenstein series Gk (or normalized versions) on the full modular group
and a few on Γ0 (4), but there are very general constructions over any space such
as Mk (Γ0 (N), χ ). Their Fourier expansions can easily be explicitly computed and
are similar to what we have given above. More difficult is the case when k is only
half-integral, but this can also be done.
As we have seen, an important generalization of Eisenstein series is given by Poincaré
series, which can also be defined over any space as above.
• A third important construction of modular forms comes from the Dedekind eta
function $\eta(\tau)$ defined above. In itself it has a complicated multiplier system,
but if we define an eta quotient as $F(\tau) = \prod_{m\in I} \eta(m\tau)^{r_m}$ for a certain set $I$
of positive integers and exponents $r_m \in \mathbb{Z}$, then it is not difficult to write
necessary and sufficient conditions for $F$ to belong to some $M_k(\Gamma_0(N), \chi)$. The
first example that we have met is of course the Ramanujan delta function
$\Delta(\tau) = \eta(\tau)^{24}$. Other examples are for instance $\eta(\tau)\eta(23\tau) \in S_1(\Gamma_0(23), \chi_{-23})$,
$\eta(\tau)^2\eta(11\tau)^2 \in S_2(\Gamma_0(11))$, and $\eta(2\tau)^{30}/\eta(\tau)^{12} \in S_9(\Gamma_0(8), \chi_{-4})$.
• Closely related to eta quotients are q-identities involving the q-Pochhammer
symbol (q)n and generalizing those seen in Exercise 3.30, many of which give
modular forms not related to the eta function.
• A much deeper construction comes from algebraic geometry: by the modular-
ity theorem of Wiles et al., to any elliptic curve defined over Q is associated a
modular form in S2 (Γ0 (N)) which is a normalized Hecke eigenform, where N
is the so-called conductor of the curve. For instance the eta quotient of level 11
just seen above is the modular form associated to the isogeny class of the elliptic
curve of conductor 11 with equation y2 + y = x3 − x2 − 10x − 20.
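The last item can be illustrated by a small experiment. For a prime $p \ne 11$, the coefficient $a(p)$ of the form attached to the curve is expected to equal $p + 1 - \#E(\mathbb{F}_p)$ (this standard description of the correspondence is recalled here as an assumption, since the text does not spell it out). The sketch below expands $\eta(\tau)^2\eta(11\tau)^2 = q\prod_{n\ge1}(1-q^n)^2(1-q^{11n})^2$ and counts points naively; the function names are hypothetical.

```python
def eta_quotient_level11(limit):
    """Return [a(0), ..., a(limit)] for eta(tau)^2 * eta(11*tau)^2."""
    series = [1] + [0] * limit
    for n in range(1, limit + 1):
        for step in (n, 11 * n):
            if step > limit:
                continue
            for _ in range(2):                       # multiply twice by (1 - q^step)
                for i in range(limit, step - 1, -1):
                    series[i] -= series[i - step]
    return [0] + series[:limit]                      # overall factor q


def count_points(p):
    """Points of y^2 + y = x^3 - x^2 - 10x - 20 over F_p, including the point at infinity."""
    count = 1
    for x in range(p):
        rhs = (x ** 3 - x ** 2 - 10 * x - 20) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                count += 1
    return count


a = eta_quotient_level11(50)
primes = [2, 3, 5, 7, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
agreement = all(a[p] == p + 1 - count_points(p) for p in primes)
print(a[1:8], agreement)
```

The same coefficient list also displays the multiplicativity coming from the Euler product, for instance $a(6) = a(2)a(3)$.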

7 More General Modular Forms

In this brief section, we will describe modular forms of a more general kind than
those seen up to now.

7.1 Modular Forms of Half-Integral Weight

Coming back again to the function θ , the formulas seen above suggest that θ itself
must be considered a modular form, of weight 1/2. We have already mentioned that
\[ \theta^2(\gamma(\tau)) = \left(\frac{-4}{d}\right)(c\tau+d)\,\theta^2(\tau). \]

But what about θ itself? For this, we must be very careful about the determination
of the square root:
Notation: $z^{1/2}$ will always denote the principal determination of the square root,
i.e., such that $-\pi/2 < \operatorname{Arg}(z^{1/2}) \le \pi/2$. For instance $(2i)^{1/2} = 1 + i$, $(-1)^{1/2} = i$.
Warning: we do not in general have $(z_1z_2)^{1/2} = z_1^{1/2}z_2^{1/2}$, but only up to sign. As a
second notation, when $k$ is odd, $z^{k/2}$ will always denote $(z^{1/2})^k$ and not $(z^k)^{1/2}$ (for
instance $(2i)^{3/2} = (1 + i)^3 = -2 + 2i$, while $((2i)^3)^{1/2} = 2 - 2i$).
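These conventions coincide with the branch used by the standard `cmath.sqrt` function of Python, so the examples above can be reproduced directly. The helper `half_power` is a hypothetical name for the second convention.

```python
import cmath


def half_power(z, k):
    """z^(k/2) for odd k in the convention of the text: (z^(1/2))^k, principal square root."""
    return cmath.sqrt(z) ** k


root_2i = cmath.sqrt(2j)                 # principal square root of 2i
root_minus1 = cmath.sqrt(-1)             # principal square root of -1
product_of_roots = cmath.sqrt(-1) * cmath.sqrt(-1)
root_of_product = cmath.sqrt((-1) * (-1))
first = half_power(2j, 3)                # ((2i)^(1/2))^3
second = cmath.sqrt((2j) ** 3)           # ((2i)^3)^(1/2)
print(root_2i, root_minus1, product_of_roots, root_of_product, first, second)
```

The pair `product_of_roots`, `root_of_product` shows the sign ambiguity of the warning, and `first`, `second` show why the order of the two operations in $z^{k/2}$ has to be fixed.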


Thus, let us try and take the square root of the modularity equation for $\theta^2$:
\[ \theta(\gamma(\tau)) = v(\gamma,\tau)\left(\frac{-4}{d}\right)^{1/2}(c\tau+d)^{1/2}\,\theta(\tau), \]
where $v(\gamma,\tau) = \pm1$ and may depend on $\gamma$ and $\tau$. A detailed study of Gauss sums
shows that $v(\gamma,\tau) = \left(\frac{-4c}{d}\right)$, the general Kronecker symbol, so that the modularity
equation for $\theta$ is, for any $\gamma \in \Gamma_0(4)$:
\[ \theta(\gamma(\tau)) = v_\theta(\gamma)\,(c\tau+d)^{1/2}\,\theta(\tau) \quad\text{with}\quad v_\theta(\gamma) = \left(\frac{c}{d}\right)\left(\frac{-4}{d}\right)^{-1/2}. \]

Note that there is something very subtle going on here: this complicated theta mul-
tiplier system vθ (γ ) must satisfy a complicated cocycle relation coming from the
trivial identity θ ((γ1 γ2 )(τ )) = θ (γ1 (γ2 (τ ))) which can be shown to be equivalent to
the general quadratic reciprocity law.
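The modularity equation for $\theta$, including the choice of square roots, can be tested numerically on a few matrices of $\Gamma_0(4)$. The sketch below is restricted to matrices with $d > 0$, so that $\left(\frac{c}{d}\right)$ is an ordinary Jacobi symbol; the function names, the truncation of the theta series and the sample point are assumptions made for illustration.

```python
import cmath


def theta(tau, terms=80):
    """theta(tau) = sum over n in Z of exp(2*pi*i*n^2*tau), truncated to |n| <= terms."""
    return sum(cmath.exp(2j * cmath.pi * n * n * tau) for n in range(-terms, terms + 1))


def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd integer n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def v_theta(c, d):
    """Theta multiplier (c/d) * (-4/d)^(-1/2) for odd d > 0, principal square root."""
    chi = 1 if d % 4 == 1 else -1            # (-4/d) for odd d
    return jacobi(c, d) / cmath.sqrt(chi)


def defect(a, b, c, d, tau):
    """|theta(gamma(tau)) - v_theta(gamma) * (c*tau + d)^(1/2) * theta(tau)|."""
    lhs = theta((a * tau + b) / (c * tau + d))
    rhs = v_theta(c, d) * cmath.sqrt(c * tau + d) * theta(tau)
    return abs(lhs - rhs)


tau0 = complex(0.2, 0.7)
matrices = [(1, 0, 4, 1), (3, 2, 4, 3), (3, -2, -4, 3), (3, 1, 8, 3), (1, 1, 4, 5)]
defects = [defect(a, b, c, d, tau0) for (a, b, c, d) in matrices]
print(max(defects))
```

All defects should be at the level of rounding errors. Replacing `cmath.sqrt(c * tau + d)` by the other determination of the square root would break the check for every matrix, which is one way to see how rigid the multiplier system is.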
The following definition is due to G. Shimura:

Definition 7.1. Let $k \in 1/2 + \mathbb{Z}$. A function $F$ from $\mathcal{H}$ to $\mathbb{C}$ will be said to be a
modular form of (half integral) weight $k$ on $\Gamma_0(N)$ with character $\chi$ if for all
$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N)$ we have
\[ F(\gamma(\tau)) = v_\theta(\gamma)^{2k}\,\chi(d)\,(c\tau+d)^k\,F(\tau), \]
and if the usual holomorphy and conditions at the cusps are satisfied (equivalently
if $F^2 \in M_{2k}(\Gamma_0(N), \chi^2\chi_{-4})$).

Note that if k ∈ 1/2 + Z we have vθ (γ )4k = χ−4 , which explains the extra factor
χ−4 in the above definition.
Since vθ (γ ) is defined only for γ ∈ Γ0 (4) we need Γ0 (N) ⊂ Γ0 (4), in other words
4 | N. In addition, by definition vθ (γ )(cτ + d)1/2 = θ (γ (τ ))/θ (τ ) is invariant if we
change γ into −γ , so if k ∈ 1/2 + Z the same is true of vθ (γ )2k (cτ + d)k , hence it
follows that in the above definition we must have χ (−d) = χ (d), i.e., χ must be an
even character (χ (−1) = 1).
As usual, we denote by Mk (Γ0 (N), χ ) and Sk (Γ0 (N), χ ) the spaces of modular
and cusp forms. The theory is more difficult than the theory in integral weight, but
is now well developed. We mention a few items:
1. There is an explicit but more complicated dimension formula due to J. Oesterlé
and the author.
2. By a theorem of Serre–Stark, modular forms of weight 1/2 are simply linear
combinations of unary theta functions generalizing the function θ above.
3. One can easily construct Eisenstein series, but the computation of their Fourier
expansion, due to Shimura and the author, is more complicated.
4. As usual, if we can express $\theta^m$ solely in terms of Eisenstein series, this leads
to explicit formulas for $r_m(n)$, the number of representations of $n$ as a sum of $m$
squares. Thus, we obtain explicit formulas for $r_3(n)$ (due to Gauss), $r_5(n)$ (due to
Smith and Minkowski), and $r_7(n)$, so if we complement the formulas in integral
weight, we have explicit formulas for $r_m(n)$ for $1 \le m \le 8$ and $m = 10$.
5. The deeper part of the theory, which is specific to the half-integral weight case,
is the existence of Shimura lifts from $M_k(\Gamma_0(N), \chi)$ to $M_{2k-1}(\Gamma_0(N/2), \chi^2)$, the
description of the Kohnen subspace $S_k^+(\Gamma_0(N), \chi)$, which both allows the Shimura
lift to go down to level $N/4$ and makes it possible to define a suitable Atkin–Lehner
type new space, and the deep results of Waldspurger, which nicely complement the work
of Shimura on lifts.
We could try to find other types of interesting modularity properties than those
coming from θ . For instance, we have seen that the Dedekind eta function is a
modular form of weight 1/2 (not in Shimura’s sense), and more precisely it satisfies
the following modularity equation, now for any γ ∈ Γ :

η (γ (τ )) = vη (γ )(cτ + d)1/2η (τ ) ,

where $v_\eta(\gamma)$ is a very complicated 24th root of unity. We could of course define
$\eta$-modular forms of half-integral weight $k \in 1/2 + \mathbb{Z}$ by requiring
$F(\gamma(\tau)) = v_\eta(\gamma)^{2k}(c\tau+d)^kF(\tau)$, but it can be shown that this would not lead to any
interesting theory (more precisely, the only interesting functions would be eta quotients
$F(\tau) = \prod_m \eta(m\tau)^{r_m}$, which can be studied directly without any new theory).
Note that there are functional relations between η and θ :

Proposition 7.2. We have
\[ \theta(\tau) = \frac{\eta^2(\tau + 1/2)}{\eta(2\tau + 1)} = \frac{\eta^5(2\tau)}{\eta^2(\tau)\,\eta^2(4\tau)}. \]

Exercise 7.3. 1. Prove these relations in the following way: first show that the
right-hand sides satisfy the same modularity equations as $\theta$ for
$T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $W_4 = \begin{pmatrix} 0 & -1 \\ 4 & 0 \end{pmatrix}$,
so in particular that they are weakly modular on $\Gamma_0(4)$, and second show
that they are really modular forms, in other words that they are holomorphic on
$\mathcal{H}$ and at the cusps.
2. Using the definition of η , deduce two product expansions for θ (τ ).
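Before proving Proposition 7.2, one can at least compare q-expansions. In the second expression the powers of $q^{1/24}$ cancel (since $5\cdot2 - 2\cdot1 - 2\cdot4 = 0$), so it is a plain product of factors $(1-q^n)^{\pm1}$. The sketch below, with the hypothetical helper `eta_quotient_series`, expands this product and compares it with $1 + 2\sum_{n\ge1} q^{n^2}$.

```python
def eta_quotient_series(exponents, limit):
    """Coefficients up to q^limit of prod_m prod_{n>=1} (1 - q^(m*n))^(r_m).

    `exponents` maps m to r_m; the overall power of q^(1/24) is assumed to cancel.
    """
    series = [1] + [0] * limit
    for m, r in exponents.items():
        for n in range(1, limit // m + 1):
            step = m * n
            for _ in range(abs(r)):
                if r > 0:      # multiply by (1 - q^step)
                    for i in range(limit, step - 1, -1):
                        series[i] -= series[i - step]
                else:          # multiply by 1/(1 - q^step) = 1 + q^step + q^(2*step) + ...
                    for i in range(step, limit + 1):
                        series[i] += series[i - step]
    return series


def theta_series(limit):
    coeffs = [0] * (limit + 1)
    x = 0
    while x * x <= limit:
        coeffs[x * x] = 1 if x == 0 else 2
        x += 1
    return coeffs


LIMIT = 200
quotient = eta_quotient_series({2: 5, 1: -2, 4: -2}, LIMIT)
same = quotient == theta_series(LIMIT)
print(quotient[:10], same)
```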

We could also try to study modular forms of fractional or even real weight k
not integral or half-integral, but this would lead to functions with no interesting
arithmetical properties.
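In a different direction, we can relax the condition of holomorphy (or meromorphy)
and ask that the functions be eigenfunctions of the hyperbolic Laplace operator
\[ \Delta = -y^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) = -4y^2\,\frac{\partial^2}{\partial\tau\,\partial\overline{\tau}} \]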

which can be shown to be invariant under Γ (more generally under SL2 (R)) to-
gether with suitable boundedness conditions. This leads to the important theory of
Maass forms. The case of the eigenvalue 0 reduces to ordinary modular forms since
∆ (F) = 0 is equivalent to F being a linear combination of a holomorphic and an-
tiholomorphic (i.e., conjugate to a holomorphic) function, each of which will be
modular or conjugate of modular.
The case of the eigenvalue 1/4 also leads to functions having nice arithmetical
properties. All other eigenvalues give functions with (conjecturally) transcendental
coefficients, although these functions are useful in number theory for other reasons
which we cannot explain here. Note that a famous conjecture of Selberg asserts
that for congruence subgroups there are no eigenvalues λ with 0 < λ < 1/4.
For instance, for the full modular group, the smallest nonzero eigenvalue is λ =
91.1413 · · ·, which is quite large.

Exercise 7.4. Using the fact that $\Delta$ is invariant under $\Gamma$ show that
$\Delta(\Im(\gamma(\tau))^s) = s(1 - s)\,\Im(\gamma(\tau))^s$ and deduce that the nonholomorphic Eisenstein
series $E(s)$ introduced in Definition 5.14 is an eigenfunction of the hyperbolic Laplace
operator with eigenvalue $s(1 - s)$ (note that it does not satisfy the necessary
boundedness conditions, so it is not a Maass form: the functions $E(s)$ with $\Re(s) = 1/2$
constitute what is called the continuous spectrum, and the Maass forms the discrete
spectrum of $\Delta$ acting on $\Gamma\backslash\mathcal{H}$).
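The eigenvalue equation of the exercise can be observed with finite differences. This is only a numerical sketch: the function names, the step size `h`, the real exponent `s`, the matrix and the sample point are arbitrary choices, and central differences only approximate the second derivatives.

```python
def im_gamma_power(x, y, s, gamma):
    """Im(gamma(x + i*y))^s for gamma = (a, b, c, d) in SL_2(Z) and real s."""
    a, b, c, d = gamma
    tau = complex(x, y)
    return ((a * tau + b) / (c * tau + d)).imag ** s


def hyperbolic_laplacian(f, x, y, h=1e-3):
    """Approximate -y^2 * (f_xx + f_yy) at (x, y) by central differences."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    return -y ** 2 * (fxx + fyy)


s = 1.7
gamma = (2, 1, 5, 3)                     # determinant 2*3 - 1*5 = 1
x0, y0 = 0.3, 0.8


def f(x, y):
    return im_gamma_power(x, y, s, gamma)


numeric = hyperbolic_laplacian(f, x0, y0)
expected = s * (1 - s) * f(x0, y0)
relative_error = abs(numeric - expected) / abs(expected)
print(numeric, expected, relative_error)
```

Taking `gamma = (1, 0, 0, 1)` reduces the check to the elementary computation $\Delta(y^s) = s(1-s)y^s$, from which the general case follows by invariance.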

7.2 Modular Forms in Several Variables

The last generalization that we want to mention (there are many more!) is to several
variables. The natural idea is to consider holomorphic functions from $\mathcal{H}^r$ to $\mathbb{C}$,
now for some $r > 1$, satisfying suitable modularity properties. If we simply ask
that $\gamma \in \Gamma$ (or some subgroup) acts component-wise, we will not obtain anything
interesting. The right way to do it, introduced by Hilbert–Blumenthal, is to consider
a totally real number field $K$ of degree $r$, and denote by $\Gamma_K$ the group of matrices
$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}_K)$, where $\mathbb{Z}_K$ is the ring of algebraic integers of $K$ (we could
also consider the larger group $\mathrm{GL}_2(\mathbb{Z}_K)$, which leads to a very similar theory). Such
a $\gamma$ has $r$ embeddings $\gamma_i$ into $\mathrm{SL}_2(\mathbb{R})$, which we will denote by
$\gamma_i = \begin{pmatrix} a_i & b_i \\ c_i & d_i \end{pmatrix}$, and the correct definition is to ask that
\[ F(\gamma_1(\tau_1), \dots, \gamma_r(\tau_r)) = (c_1\tau_1 + d_1)^k \cdots (c_r\tau_r + d_r)^k\, F(\tau_1, \dots, \tau_r). \]

Note that the restriction to totally real number fields is due to the fact that for γi to
preserve the upper-half plane it is necessary that γi ∈ SL2 (R). Note also that the γi
are not independent, they are conjugates of a single γ ∈ SL2 (ZK ).
A holomorphic function satisfying the above is called a Hilbert–Blumenthal modular
form (of parallel weight $k$, one can also consider forms where the exponents
for the different embeddings are not equal), or more simply a Hilbert modular form
(note that there are no “conditions at infinity”, since one can prove that they are
automatically satisfied unless $K = \mathbb{Q}$).
Since $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}_K)$ is equal to all its conjugates, such modular forms
have Fourier expansions, but using the action of $\begin{pmatrix} 1 & \alpha \\ 0 & 1 \end{pmatrix}$ with $\alpha \in \mathbb{Z}_K$ it is easy to
show that these expansions are of a special type, involving the codifferent $\mathfrak{d}^{-1}$ of $K$,
which is the fractional ideal of $x \in K$ such that $\mathrm{Tr}(x\mathbb{Z}_K) \subset \mathbb{Z}$, where $\mathrm{Tr}$ denotes the
trace.
One can construct Eisenstein series, here called Hecke–Eisenstein series, and
compute their Fourier expansion. One of the important consequences of this com-
putation is that it gives an explicit formula for the value ζK (1 − k) of the Dedekind
zeta function of K at negative integers (hence by the functional equation of ζK , also
at positive even integers), and in particular it proves that these values are rational
numbers, a theorem due to C.-L. Siegel as an immediate consequence of Theorem
3.41. An example is as follows:

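Proposition 7.5. Let $K = \mathbb{Q}(\sqrt{D})$ be a real quadratic field with $D$ a fundamental
discriminant. Then:
1. We have
\[ \zeta_K(-1) = \frac{1}{60} \sum_{|s| < \sqrt{D}} \sigma_1\!\left(\frac{D - s^2}{4}\right), \qquad
   \zeta_K(-3) = \frac{1}{120} \sum_{|s| < \sqrt{D}} \sigma_3\!\left(\frac{D - s^2}{4}\right). \]
2. We also have formulas such as
\[ \sum_{|s| < \sqrt{D}} \sigma_1(D - s^2) = 60\left(9 - 2\left(\frac{D}{2}\right)\right)\zeta_K(-1), \qquad
   \sum_{|s| < \sqrt{D}} \sigma_3(D - s^2) = 120\left(129 - 8\left(\frac{D}{2}\right)\right)\zeta_K(-3). \]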

We can of course reformulate these results in terms of L-functions by using
$L(\chi_D, -1) = -12\zeta_K(-1)$ and $L(\chi_D, -3) = 120\zeta_K(-3)$, where as usual $\chi_D$ is the
quadratic character modulo $D$.
Exercise 7.6. Using Exercise 6.6 and the above formulas, show that the number
$r_5(D)$ of representations of $D$ as a sum of 5 squares is given by
\[ r_5(D) = 480\left(5 - 2\left(\frac{D}{2}\right)\right)\zeta_K(-1) = -40\left(5 - 2\left(\frac{D}{2}\right)\right)L(\chi_D, -1). \]
Note that this formula can be generalized to arbitrary $D$, and is due to Smith and
(much later) to Minkowski. There also exists a similar formula for $r_7(D)$: when $-D$
(not $D$) is a fundamental discriminant
\[ r_7(D) = -28\left(41 - 4\left(\frac{D}{2}\right)\right)L(\chi_{-D}, -2). \]
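The formulas of Proposition 7.5 and Exercise 7.6 can be cross-checked without knowing any zeta value: eliminating $\zeta_K(-1)$ and $\zeta_K(-3)$ between parts 1 and 2 gives identities between divisor sums, and the formula for $r_5(D)$ becomes $r_5(D) = 8\bigl(5 - 2\left(\frac{D}{2}\right)\bigr)\sum_{|s|<\sqrt{D}}\sigma_1((D-s^2)/4)$. The sketch below tests these for a few small fundamental discriminants; the function names are hypothetical, and $\sigma_k(x) = 0$ for non-integral $x$ as before.

```python
from math import isqrt


def sigma(k, n):
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)


def kronecker_2(D):
    """(D/2): 0 for even D, +1 if D = 1 or 7 mod 8, -1 if D = 3 or 5 mod 8."""
    if D % 2 == 0:
        return 0
    return 1 if D % 8 in (1, 7) else -1


def sum_quarter(k, D):
    """Sum of sigma_k((D - s^2)/4) over |s| < sqrt(D), non-integral arguments contributing 0."""
    m = isqrt(D)                          # D is not a square, so |s| < sqrt(D) means |s| <= m
    return sum(sigma(k, (D - s * s) // 4) for s in range(-m, m + 1) if (D - s * s) % 4 == 0)


def sum_full(k, D):
    """Sum of sigma_k(D - s^2) over |s| < sqrt(D)."""
    m = isqrt(D)
    return sum(sigma(k, D - s * s) for s in range(-m, m + 1))


def r5_count(n):
    """Number of representations of n as a sum of 5 squares, by series multiplication."""
    theta = [0] * (n + 1)
    x = 0
    while x * x <= n:
        theta[x * x] = 1 if x == 0 else 2
        x += 1
    power = [1] + [0] * n
    for _ in range(5):
        product = [0] * (n + 1)
        for i, a in enumerate(power):
            if a:
                for j in range(n + 1 - i):
                    product[i + j] += a * theta[j]
        power = product
    return power[n]


results = {}
for D in (5, 8, 12, 13, 17):              # small positive fundamental discriminants
    chi2 = kronecker_2(D)
    results[D] = (
        sum_full(1, D) == (9 - 2 * chi2) * sum_quarter(1, D),
        sum_full(3, D) == (129 - 8 * chi2) * sum_quarter(3, D),
        r5_count(D) == 8 * (5 - 2 * chi2) * sum_quarter(1, D),
    )
print(results)
```

For $D = 5$, for instance, the first sum is $2$, so part 1 gives $\zeta_K(-1) = 1/30$ and the exercise gives $r_5(5) = 112$.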

Note also that if we restrict to the diagonal $\tau_1 = \cdots = \tau_r$, a Hilbert modular form
of (parallel) weight $k$ gives rise to an ordinary modular form of weight $kr$.
We finish this section with some terminology with no explanation: if K is not
a totally real number field, one can also define modular forms, but they will not
be defined on products of the upper-half plane H alone, but will also involve the
hyperbolic 3-space H3 . Such forms are called Bianchi modular forms.
A different generalization, close to the Weierstrass ℘-function seen above, is the
theory of Jacobi forms, due to M. Eichler and D. Zagier. One of the many interesting
aspects of this theory is that it mixes in a nontrivial way properties of forms of
integral weight with forms of half-integral weight.
Finally, we mention Siegel modular forms, introduced by C.-L. Siegel, which are
defined on higher-dimensional symmetric spaces, on which the symplectic groups
Sp2n (R) act. The case n = 1 gives ordinary modular forms, and the next simplest,
n = 2, is closely related to Jacobi forms since the Fourier coefficients of Siegel
modular forms of degree 2 can be expressed in terms of Jacobi forms.

8 Some Pari/GP Commands

There exist three software packages which are able to compute with modular forms:
Magma, Sage, and, since the spring of 2018, Pari/GP. We give here some basic
Pari/GP commands with little or no explanation (which is available by typing ?
or ??): we encourage the reader to read the tutorial tutorial-mf available with
the distribution and to practice with the package, since it is an excellent way to learn
about modular forms. All commands begin with the prefix mf, with the exception
of lfunmf which more properly belongs to the L-function package.
Creation of modular forms: mfDelta (Ramanujan Delta), mfTheta (ordinary
theta function), mfEk (normalized Eisenstein series Ek ), more generally mfeisenstein,
mffrometaquo (eta quotients), mffromqf (theta function of lattices with or
without spherical polynomial), mffromell (from elliptic curves over Q), etc...
Arithmetic operations: mfcoefs (Fourier coefficients at infinity), mflinear
(linear combination, so including addition/subtraction and scalar multiplication),
mfmul, mfdiv, mfpow (clear), etc...
Modular operations: mfbd, mftwist, mfhecke, mfatkin, mfderivE2,
mfbracket, etc...
Creation of modular form spaces: mfinit, mfdim (dimension of the space),
mfbasis (random basis of the space), mftobasis (decomposition of a form on
the mfbasis), mfeigenbasis (basis of normalized eigenforms).
Searching for modular forms with given Fourier coefficients:
mfeigensearch, mfsearch.
Expansion of F|k γ : mfslashexpansion.
Numerical functions: mfeval (evaluation at a point in H or at a cusp), mfcuspval
(valuation at a cusp), mfsymboleval (computation of integrals over paths in the
completed upper-half plane), mfpetersson (Petersson scalar product), lfunmf
(L-function associated to a modular form), etc...
Note that for now Pari/GP is the only package for which these last functions
(beginning with mfslashexpansion) are implemented.
9 Suggestions for Further Reading

The literature on modular forms is vast, so I will only mention the books which I
am familiar with and that in my opinion will be very useful to the reader. Note that
the classic book [4] is absolutely remarkable, but may be difficult for a beginning
course.
In addition to the recent book [1] by F. Strömberg and the author (which of course
I strongly recommend !!!), I also highly recommend the paper [5], which is essen-
tially a small book. Perhaps the most classical reference is [3]. The more recent book
[2] is more advanced since its ultimate goal is to explain the modularity theorem of
Wiles et al.

References

1. H. Cohen and F. Strömberg, Modular Forms: A Classical Approach, Graduate Studies in
Math. 179, American Math. Soc. (2017).
2. F. Diamond and J. Shurman, A first course in modular forms, Graduate Texts in Math. 228,
Springer (2005).
3. T. Miyake, Modular Forms, Springer (1989).
4. G. Shimura, Introduction to the arithmetic theory of automorphic functions, Publ. Math. Soc.
Japan 11, Princeton University Press (1994) (reprinted from the 1971 original).
5. D. Zagier, Elliptic modular forms and their applications, in “The 1-2-3 of modular forms”,
Universitext, Springer (2008), pp. 1–103.
