Factorization
Unique and Otherwise
Steven H. Weintraub
Lehigh University
Copyright © 2008 A K Peters, Ltd.
All rights reserved. No part of the material protected by this copyright notice
may be reproduced or utilized in any form, electronic or mechanical, including
photocopying, recording, or by any information storage and retrieval system,
without written permission from the copyright owner.
Weintraub, Steven H.
Factorization : unique and otherwise / Steven H. Weintraub.
p. cm. -- (CMS Treatises in mathematics)
Includes index.
ISBN 978-1-56881-241-0 (alk. paper)
1. Factorization (Mathematics). 2. Rings of integers. 3. Rings
(Algebra). I. Title.
QA161.F3W45 2008
512.7 2--dc22
2007049328
Printed in Canada
12 11 10 09 08 10 9 8 7 6 5 4 3 2 1
Contents
Preface
Introduction
1 Basic Notions
1.1 Integral Domains
1.2 Quadratic Fields
1.3 Exercises
2 Unique Factorization
2.1 Euclidean Domains
2.2 The GCD-L Property and Euclid’s Algorithm
2.3 Ideals and Principal Ideal Domains
2.4 Unique Factorization Domains
2.5 Nonunique Factorization: The Case D < 0
2.6 Nonunique Factorization: The Case D > 0
2.7 Summing Up
2.8 Exercises
B Congruences
B.1 The Notion of Congruence
B.2 Linear Congruences
B.3 Quadratic Congruences
B.4 Proof of the Law of Quadratic Reciprocity
B.5 Primitive Roots
B.6 Exercises
Index
Preface
Steven H. Weintraub
Bethlehem, PA, USA
August 2007
Introduction
We shall here be concerned with the circle of ideas that surrounds the
Fundamental Theorem of Arithmetic.
First we recall the usual definition of a prime: a prime number is a
positive integer, other than 1, that has no divisors except itself and 1. For
example 2 and 3 are primes, but 6 = 2 · 3 and 10 = 2 · 5 are not.
Then the Fundamental Theorem of Arithmetic states that every posi-
tive integer can be factored into primes in an essentially unique way. For
example,
1 = 1,
2 = 2,
6 = 2 · 3,
10 = 2 · 5,
15 = 3 · 5,
$2499 = 3 \cdot 7^2 \cdot 17$.
$3 = 3$,
$5 = (2 + \sqrt{-1})(2 - \sqrt{-1})$,
$7 = 7$,
$11 = 11$,
$13 = (3 + 2\sqrt{-1})(3 - 2\sqrt{-1})$,
$17 = (4 + \sqrt{-1})(4 - \sqrt{-1})$.
On the other hand, let us consider numbers of the form $a + b\sqrt{-5}$ with
a and b integers. Numbers of this form do not have unique factorization.
For example, we have the following two factorizations of 6 into irreducibles:
$6 = (2)(3) = (1 + \sqrt{-5})(1 - \sqrt{-5})$.
We can also consider numbers of the form $a + b\sqrt{10}$ with a and b integers.
Numbers of this form also do not have unique factorization. For example,
we have the following two factorizations of 10 into irreducibles:
$10 = (2)(5) = (\sqrt{10})^2$.
We have used the word “irreducible” rather than “prime” here as that turns
out to be the correct mathematical language.
In fact, we will prove the Fundamental Theorem of Arithmetic in a way
that enables us to establish it in many cases simultaneously, including the
two we have mentioned: the ordinary integers, and numbers of the form
$a + b\sqrt{-1}$ with a and b integers.
On the other hand, we will also be able to systematically show that in
many cases unique factorization does not hold, including the two we have
mentioned: numbers of the form $a + b\sqrt{-5}$ with a and b integers, and
numbers of the form $a + b\sqrt{10}$ with a and b integers.
As we will see, instead of unique factorization being the norm and non-
unique factorization the exception, the situation is reversed! It is really
a very special property, though a crucially important one, of the ordinary
integers that the Fundamental Theorem of Arithmetic holds for them.
Chapters 1 and 2 of this book are basically devoted to proving unique
and nonunique factorization for ordinary integers and for numbers of the
form $a + b\sqrt{D}$. (Here a and b are not always integers, but this is a technical
point we will defer until later.)
In Chapter 3, we investigate numbers of the form $a + b\sqrt{-1}$ with a and
b integers. Numbers of this form are called the Gaussian integers. As we
the units are. We have given the answer for the Gaussian integers, but we
can ask the same question in other cases as well. Here we ask the question
for numbers of the form $a + b\sqrt{D}$ for D positive and not a perfect square.
If D = 2 we have units
$1 = (3 + 2\sqrt{2})(3 - 2\sqrt{2})$,
$1 = (17 + 12\sqrt{2})(17 - 12\sqrt{2})$,
$1 = (99 + 70\sqrt{2})(99 - 70\sqrt{2})$,
$1 = (577 + 408\sqrt{2})(577 - 408\sqrt{2})$.
Note that a factorization $1 = (a + b\sqrt{D})(a - b\sqrt{D})$ gives a solution of
the equation $a^2 - b^2 D = 1$, and vice versa. Thus the search for units is
intimately related to the search for solutions of the equation $a^2 - b^2 D = 1$.
The units above correspond to solutions for D = 2:
$1 = 3^2 - 2^2 \cdot 2 = 17^2 - 12^2 \cdot 2 = 99^2 - 70^2 \cdot 2 = 577^2 - 408^2 \cdot 2$.
But we can consider this equation for other values of D as well. For example,
for D = 61 we have the solution
$1 = (1766319049)^2 - (226153980)^2 \cdot 61$
and for D = 109 we have the solution
$1 = (158070671986249)^2 - (15140424455100)^2 \cdot 109$.
In fact, for any such D there are infinitely many solutions (and hence
infinitely many units). We shall prove this in Chapter 4 where we investigate
the equation $a^2 - b^2 D = 1$, known as Pell’s equation. Our proof
is a variant of the cakravala method experimentally developed by Indian
mathematicians between the ninth and twelfth centuries. This is also a
result known to Fermat, and his proof of this result may well have been
along the lines of ours, as our proof uses a method of “composition” very
closely related to our method in Chapter 3. Also, our proof is constructive,
enabling us to find solutions by hand for values of D that are not too large.
The above solutions for D = 61 and for D = 109 were known to have been
found by Fermat (by hand, obviously, since computers did not exist in the
seventeenth century).
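Readers who want to reproduce such solutions by machine can do so in a few lines. The sketch below is not the composition method described above; it swaps in the standard continued-fraction expansion of $\sqrt{D}$, whose convergents eventually satisfy $a^2 - b^2 D = 1$. The function name `pell_fundamental` is hypothetical and chosen only for this illustration.

```python
from math import isqrt


def pell_fundamental(D):
    """Smallest positive (a, b) with a*a - D*b*b == 1, for D > 0 not a perfect square.

    Uses the continued-fraction expansion of sqrt(D), not the method of the text.
    """
    a0 = isqrt(D)
    if D <= 0 or a0 * a0 == D:
        raise ValueError("D must be positive and not a perfect square")
    # Current complete quotient is (m + sqrt(D)) / d with partial quotient `term`.
    m, d, term = 0, 1, a0
    # Two consecutive convergents p/q of sqrt(D).
    p_prev, p = 1, a0
    q_prev, q = 0, 1
    while p * p - D * q * q != 1:
        m = d * term - m
        d = (D - m * m) // d
        term = (a0 + m) // d
        p_prev, p = p, term * p + p_prev
        q_prev, q = q, term * q + q_prev
    return p, q


if __name__ == "__main__":
    for D in (2, 61, 109):
        a, b = pell_fundamental(D)
        print(D, a, b, a * a - D * b * b)
```

Python integers have arbitrary precision, so the large values that arise for D = 61 and D = 109 are computed exactly.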
Our investigations in Chapters 1 through 4 can be considerably gener-
alized. To use the appropriate technical language, in these chapters we are
considering quadratic fields, and we can consider analogous problems for
algebraic number fields. Indeed, our treatment here parallels the historical
development of the subject. Quadratic fields were investigated first, and
the phenomena that arose there motivated the development of the general
Chapter 1
Basic Notions
(6) R is closed under multiplication, i.e., if a and b are any two elements
of R, then ab is an element of R.
(7) Multiplication is commutative, i.e., if a and b are any two elements of R, then
ab = ba.
(10) Multiplication distributes over addition, i.e., if a, b, and c are any three
elements of R, then a(b + c) = ab + ac and (b + c)a = ba + ca.
(11) R has no zero divisors, i.e., if a and b are any two non-zero elements
of R, then their product ab is also non-zero. Note that, by taking the
contrapositive, this condition may be equivalently rephrased as follows:
if a and b are any two elements of R with ab = 0, then a = 0 or b = 0.
(1) If a + b = a + c then b = c.
Proof:
(1) a+b=a+c
−a + (a + b) = −a + (a + c)
(−a + a) + b = (−a + a) + c
0+b = 0+c
b=c
(2) ab = ac
ab + a(−c) = ac + a(−c)
a(b − c) = a(c − c)
a(b − c) = a0
a(b − c) = 0
Example 1.3.
(1) The ordinary integers Z form an integral domain. (Indeed, the term
“integral domain” has its origin in this fact.)
(2) The rational numbers Q form an integral domain. Q is just the set
of fractions {a/b} with a and b integers, with the usual addition and
multiplication of fractions. (Note that Q includes Z, as the integer a
is equal to the fraction a/1.)
Then R is an integral domain. Let us examine R a little more carefully.
First, if $\alpha = a + b\sqrt{D}$ and $\beta = c + d\sqrt{D}$ are in R, then
$\alpha + \beta = (a + b\sqrt{D}) + (c + d\sqrt{D}) = (a + c) + (b + d)\sqrt{D}$
is in R, and
$\alpha\beta = (a + b\sqrt{D})(c + d\sqrt{D}) = (ac + bdD) + (ad + bc)\sqrt{D}$
(4) Fix an integer D ≡ 1 (mod 4) that is not a perfect square, and let
$R = \{(a + b\sqrt{D})/2 \mid a, b \text{ integers, either both even or both odd}\}$.
(5) Let D be a fixed integer that is not a perfect square, and let
$R = \{a + b\sqrt{D} \mid a, b \text{ rational numbers}\}$.
The integral domains in Example 1.3(3) look pretty natural, but the
integral domains in Example 1.3(4) look rather artificial. It turns out to
be the case that, depending on the value of D, we sometimes want to
consider the former and sometimes the latter. See the exercises for why
this is the case.
We now make a further definition.
$ab = 0$
$a^{-1}(ab) = a^{-1}(0)$
$(a^{-1}a)b = 0$
$1b = 0$
$b = 0$,
so b = 0 as required.
R∗ = {units of R}.
Remark 1.7. Note that an integral domain R is a field if and only if every
nonzero element of R is a unit.
In particular, $\alpha^{-1}$ is of the form $e + f\sqrt{D}$ where e and f are rational
numbers (to be precise, $e = a/(a^2 - b^2 D)$ and $f = -b/(a^2 - b^2 D)$), so
In computing with quadratic fields, there are some quantities that are
extremely useful.
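Definition 1.11. Let $\alpha = a + b\sqrt{D}$ be an element of $Q(\sqrt{D})$. Then its
conjugate $\bar{\alpha}$ is defined by
$\bar{\alpha} = a - b\sqrt{D}$,
its norm N(α) is defined by
$N(\alpha) = \alpha\bar{\alpha} = a^2 - b^2 D$,
and its trace Tr(α) is defined by
$\mathrm{Tr}(\alpha) = \alpha + \bar{\alpha} = 2a$.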
Proof: (1), (2), and (3) are easy to check by direct computation, and we
leave them as exercises. We prove (4).
Let $\alpha = a + b\sqrt{D}$ and suppose N(α) = 0. Then
$0 = N(\alpha) = a^2 - b^2 D$
Lemma 1.13. For any element x of $O(\sqrt{D})$, N(x) is an integer.
Proof: If $x = a + b\sqrt{D}$, then $N(x) = a^2 - b^2 D$, so if a and b are integers, N(x)
is certainly an integer. Thus the only case we need to check is that of
$x = (a + b\sqrt{D})/2$ where a and b are both odd and D ≡ 1 (mod 4). In this case
we write a = 2m + 1, b = 2n + 1, D = 4E + 1. Then
$N(x) = (a^2 - b^2 D)/4 = ((2m + 1)^2 - (2n + 1)^2 (4E + 1))/4 = m^2 + m - 4n^2 E - 4nE - E - n^2 - n$
is an integer.
Lemma 1.14. Let $R = O(\sqrt{D})$. Then the units of R are precisely those
elements x of R with |N(x)| = 1.
Proof: First suppose |N(x)| = 1. Then N(x) = ±1. But $N(x) = x\bar{x}$. Thus
either $x\bar{x} = 1$, in which case x has inverse $x^{-1} = \bar{x}$, or $x\bar{x} = -1$, in which
case x has inverse $x^{-1} = -\bar{x}$, so in either case x is a unit.
Conversely, suppose that x is a unit. Then there is an element y of
R with xy = 1. Then on the one hand $\overline{xy} = \bar{1} = 1$, and on the other
hand $\overline{xy} = \bar{x}\bar{y}$, by Lemma 1.12. Thus $\bar{x}\bar{y} = 1$. Multiplying, we see that
$x\bar{x}y\bar{y} = 1$, i.e., that N(x) N(y) = 1. However, by Lemma 1.13, N(x) and
N(y) are both integers. Therefore, we must have either N(x) = N(y) = 1
or N(x) = N(y) = −1. But in either case, we conclude |N(x)| = 1.
Let us use Lemma 1.14 to try to find the units in $O(\sqrt{D})$.
Corollary 1.15.
(1) The units in $O(\sqrt{-1})$ are {±1, ±i}.
(2) The units in $O(\sqrt{-3})$ are $\{\pm 1, \pm(1 + \sqrt{-3})/2, \pm(1 - \sqrt{-3})/2\}$.
(3) For any other negative value of D, the units in $O(\sqrt{D})$ are {±1}.
a solution, other than x = ±1, and, even if we know there are solutions, it
is completely unclear how to find them.
Nevertheless, let us experiment a bit, by taking small values of D.
Example 1.16.
(1) Let D = 2. Then $N(1 + \sqrt{2}) = 1^2 - 1^2 \cdot 2 = -1$ so $x = 1 + \sqrt{2}$ is a unit
and its inverse is $-\bar{x} = -(1 - \sqrt{2}) = -1 + \sqrt{2}$. Since $x x^{-1} = 1$,
$x^k x^{-k} = (x x^{-1})^k = 1^k = 1$, so $x^k$ is also a unit for any k. So, for example,
$x^2 = (1 + \sqrt{2})^2 = 3 + 2\sqrt{2}$ is a unit, as is $x^3 = (1 + \sqrt{2})^3 = 7 + 5\sqrt{2}$, etc.
are units. Again, $\{\ldots, \pm x^{-3}, \pm x^{-2}, \pm x^{-1}, \pm 1, \pm x, \pm x^2, \pm x^3, \ldots\}$ is an
infinite set of distinct units.
(3) Let D = 5. Then $N((1 + \sqrt{5})/2) = (1^2 - 1^2 \cdot 5)/4 = -1$ so $x = (1 + \sqrt{5})/2$
is a unit and its inverse is $-\bar{x} = -(1 - \sqrt{5})/2 = (-1 + \sqrt{5})/2$. Again, $x^k$ is a unit
for any k, so for example, $x^2 = (3 + \sqrt{5})/2$ and $x^3 = 2 + \sqrt{5}$ are
units, and once again $\{\ldots, \pm x^{-3}, \pm x^{-2}, \pm x^{-1}, \pm 1, \pm x, \pm x^2, \pm x^3, \ldots\}$
is an infinite set of distinct units.
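The powers quoted in this example are easy to recompute. The following is a minimal sketch, assuming we represent $a + b\sqrt{D}$ as a pair `(a, b)` of exact fractions; the helper names `mul`, `norm`, and `powers` are hypothetical and not part of the text.

```python
from fractions import Fraction


def mul(x, y, D):
    """Product of x = (a, b) and y = (c, d), read as a + b*sqrt(D) and c + d*sqrt(D)."""
    a, b = x
    c, d = y
    return (a * c + b * d * D, a * d + b * c)


def norm(x, D):
    """N(a + b*sqrt(D)) = a^2 - b^2 * D."""
    a, b = x
    return a * a - b * b * D


def powers(x, D, n):
    """The list [x, x^2, ..., x^n]."""
    out, cur = [], (Fraction(1), Fraction(0))
    for _ in range(n):
        cur = mul(cur, x, D)
        out.append(cur)
    return out


if __name__ == "__main__":
    x2 = (Fraction(1), Fraction(1))        # 1 + sqrt(2)
    x5 = (Fraction(1, 2), Fraction(1, 2))  # (1 + sqrt(5)) / 2
    for x, D in ((x2, 2), (x5, 5)):
        for k, p in enumerate(powers(x, D, 3), start=1):
            print(D, k, p, norm(p, D))
```

Each power printed has norm ±1, which by Lemma 1.14 is exactly the condition for being a unit.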
1.3 Exercises
Exercise 1.1. Let R be an integral domain and let S be a subset of R that
satisfies the following four conditions:
(1) 1 is in S;
(2) if a is in S, then −a is in S;
(3) if a and b are in S, then a + b is in S;
(4) if a and b are in S, then ab is in S.
(b) Give examples to show that if S satisfies any three of these four condi-
tions then S may not be an integral domain. (Thus you will need four
examples, one for each omitted condition.)
Exercise 1.2. Show that the usual rules of signs hold in any integral do-
main R:
Exercise 1.3.
(b) Suppose that b is a unit. Show that for any a, bx = a has the solution
x = ab−1 .
Exercise 1.4. Recall from Remark 1.9 that if a and b are elements of an
integral domain R, we say that b divides a if there is an element x satisfying
the equation bx = a, in which case we write x = a/b. Show that with this
definition, the usual rules of fractions hold:
(Note that in some cases the right-hand side of the above equalities may
be defined when the left-hand side is not. We mean these equalities to hold
when both sides are defined.)
$O(\sqrt{D})$ is the set of elements of $Q(\sqrt{D})$ that are roots of a
monic quadratic with integer coefficients (i.e., roots of a quadratic
polynomial $f(x) = x^2 + mx + n$ with m and n integers).
(a) Verify that this is true for the following elements α of $O(\sqrt{D})$:
(i) $\alpha = 3 + 8\sqrt{6}$;
(ii) $\alpha = 7 - 10\sqrt{11}$;
(iii) $\alpha = 2 + 9\sqrt{5}$;
(iv) $\alpha = 4 + 5\sqrt{-2}$;
(v) $\alpha = -6 + 11\sqrt{-5}$;
(vi) $\alpha = \frac{3}{2} + \frac{7}{2}\sqrt{-3}$.
(b) Show that this is the case in general. That is, show that
(i) if D ≡ 2 or 3 (mod 4), then $\alpha = c + d\sqrt{D}$ is a root of a monic
quadratic with integer coefficients if and only if c and d are both
integers;
(ii) if D ≡ 1 (mod 4), then $\alpha = c + d\sqrt{D}$ is a root of a monic quadratic
with integer coefficients if and only if either c and d are both
integers or c = a/2 and d = b/2 with a and b both odd integers.
In the text of this book, we treat integral domains of the form $O(\sqrt{D})$.
But many of the statements we make have analogs for polynomials, and
we leave the treatment of the polynomial situation to the exercises. Here
is the first case: Let R be an integral domain. Then
(In considering R[X] you may assume the usual properties of polynomial
arithmetic. The cases we will be most concerned with here are R = Q and
R = Z and indeed for the purposes of this book you may confine your
attention to these.)
Exercise 1.14. Show that R[X] is an integral domain.
Exercise 1.15. Show that R[X]∗ = R∗ (i.e., that the units of R[X] are
the constant polynomials {a} for those values of a that are units of R).
In particular, if R is a field, the units of R[X] are the nonzero constant
polynomials.
Chapter 2
Unique Factorization
We now embark on the proof that a number of the integral domains we are
interested in satisfy unique factorization. We have written “proof” rather
than “proofs” as it is our goal to establish a framework that will enable us
to come up with one proof that handles all these cases simultaneously. To
be precise, our strategy will be as follows:
Step 1b. Prove that certain integral domains are Euclidean domains.
Step 2b. Prove that every Euclidean domain is a principal ideal domain.
Step 3b. Prove that every principal ideal domain is a unique factorization
domain.
Thus, putting all of these steps together, we see that certain integral
domains are unique factorization domains.
The obvious question now is: “Which ones?” As we shall see, these
include the integers Z, and the integral domains $O(\sqrt{D})$ for some (definitely
not all!) values of D.
Indeed, the first part of this chapter will be devoted to the general
argument we have just described, and to proving that $O(\sqrt{D})$ is a unique
factorization domain in many cases. However, once we accomplish that we
will turn our attention to the opposite phenomenon, and will prove that
$O(\sqrt{D})$ is not a unique factorization domain in many other cases.
We will not be able to settle the issue in all cases, and in fact, in
complete generality the answer is unknown. We will describe our (that is,
mathematicians’) present state of knowledge about this question.
$\|a\| \le \|ab\|$.
Remark 2.2.
(1) Note that we do not require $\|0\|$ to be defined, though it may be.
(2) Note that under this definition it is possible that $\|a\| = 0$ even though
$a \ne 0$.
Proof:
(2) This follows from earlier work we have done. Let us see this.
Property (1): We showed in Lemma 1.13 that N(α) is an integer, so
$\|\alpha\| = |N(\alpha)|$ is a nonnegative integer.
Remark 2.4. Unfortunately, the word “norm” is used to mean two slightly
different things. We called N(α) a norm in Chapter 1. Note that N(α) is
defined on $Q(\sqrt{D})$, may be negative, and need not be an integer. In our
definition here, the norm is required to be a nonnegative integer, and so we
must consider |N(α)|, and only for α in $O(\sqrt{D})$. In this chapter, we will
always use a norm in the sense of Definition 2.1, and we will always denote
such a norm by $\|\cdot\|$.
Remark 2.6.
(1) This is a familiar property for the integers, which you probably learned
in elementary school: 75 = 17 · 4 + 7, 93 = 11 · 8 + 5, 105 = 21 · 5 + 0.
Nevertheless, it requires proof! We shall prove it momentarily.
(2) Note we are not claiming that the quotient and remainder are unique.
For example, 100 = 3 · 33 + 1 = 3 · 34 + (−2) both work.
(3) Strictly speaking, the definition of a Euclidean domain includes the
integral domain R and the norm $\|\cdot\|$. We usually say, however, “R is
a Euclidean domain” when the norm is understood.
Proof: We are claiming that for any integer a, and any integer $b \ne 0$, there
is an integer q and an integer r with a = bq + r and r = 0 or $\|r\| < \|b\|$.
For each fixed value of b, we prove this claim by complete induction on
a. We shall prove this claim in the case a ≥ 0 and b > 0 here.
So suppose a ≥ 0 and b > 0. Note then that $\|a\| = |a| = a$ and
$\|b\| = |b| = b$.
If a = 0, then $\|a\| = 0$, and this claim is certainly true: a = b · 0 + 0 so
q = 0 and r = 0.
Also, if 0 < a < b, then $0 < \|a\| < \|b\|$ and this claim is also true:
a = b · 0 + a so q = 0 and r = a satisfies $\|r\| < \|b\|$.
Now assume that this claim is true for all integers $a'$ with $0 \le a' < a$.
Consider a. We have just proved this claim if a = 0 or 0 < a < b,
so we may restrict our attention to the case that b ≤ a. But in this
case 0 < b ≤ a so 0 ≤ a − b < a. Set $a' = a - b$. Then we can apply the
inductive hypothesis to $a'$ to conclude that $a' = bq' + r'$ for some $q'$ and $r'$ with
$r' = 0$ or $\|r'\| < \|b\|$. Substituting, we see that $a - b = bq' + r'$ and hence
that $a = b(q' + 1) + r' = bq + r$ with $q = q' + 1$ and $r = r'$. But then also
r = 0 or $\|r\| < \|b\|$, as required.
Hence our claim is true for a, and so by complete induction we may
conclude that our claim is true for every a ≥ 0.
Thus, we have proved the lemma in this case. We leave the remaining
cases as exercises.
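The induction in this proof is really a procedure: as long as b ≤ a, pass from a to a − b and add 1 to the quotient. The sketch below unrolls that induction into a loop, under the same assumption a ≥ 0 and b > 0; the function name `divide` is hypothetical.

```python
def divide(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < b, assuming a >= 0 and b > 0."""
    if a < 0 or b <= 0:
        raise ValueError("this sketch covers only the case a >= 0 and b > 0")
    q = 0
    while a >= b:   # the case b <= a of the proof: replace a by a - b ...
        a -= b
        q += 1      # ... and the quotient grows by 1
    return q, a     # now 0 <= a < b, the base cases of the proof


if __name__ == "__main__":
    for a, b in ((75, 17), (93, 11), (100, 3)):
        q, r = divide(a, b)
        print(f"{a} = {b} * {q} + {r}")
```

This procedure always returns the nonnegative remainder, which is one valid choice among possibly several.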
D = −11, −7, −3, −2, −1, 2, 3, 5, 6, 7, 11, 13, 17, 21, and 29.
Proof: This is a very long proof, so let us begin by describing our strategy.
We are trying to investigate an algebraic question (when is $O(\sqrt{D})$
Euclidean?), but we will convert this question to a geometric question.
Then we will solve this question, only using basic analytic geometry. The
geometric idea is simple, but we will work very hard at it and obtain our
results. Thus, this proof is an illustration of the fact that often in mathe-
matics one may start with a simple idea and by pushing it hard enough go
a long way with it.
$= \frac{ac - bdD}{c^2 - d^2 D} + \frac{-ad + bc}{c^2 - d^2 D}\sqrt{D} = (e + f\sqrt{D})$,
$\alpha = \beta\gamma_0 = \beta\gamma + \beta(\gamma_0 - \gamma) = \beta\gamma + \rho$
where we set $\rho = \beta(\gamma_0 - \gamma)$. Then $\|\rho\|_D = \|\beta(\gamma_0 - \gamma)\|_D = \|\beta\|_D \cdot \|\gamma_0 - \gamma\|_D$
and, since $\|\gamma_0 - \gamma\|_D < 1$, we have $\|\rho\|_D < \|\beta\|_D$, so we have found values
for γ and ρ that satisfy the conditions of a Euclidean domain. (We are
assuming that γ is an element of R, and then we see that ρ = α − βγ is
also an element of R, and furthermore we see that (and this is the crucial
point!) $\|\rho\|_D < \|\beta\|_D$.)
Hence we have reduced our problem to the problem of showing that for
any $e + f\sqrt{D}$ in $Q(\sqrt{D})$, there is an $s + t\sqrt{D}$ in $O(\sqrt{D})$ with
$\|(e + f\sqrt{D}) - (s + t\sqrt{D})\|_D < 1$. Translating this into our geometric language, we need
to show that for any point (e, f) in the plane with rational coordinates,
there is a point (s, t) in the plane corresponding to an element of R, with
$\|(e, f) - (s, t)\|_D < 1$.
We will need to know which points of the plane correspond to elements
of $R = O(\sqrt{D})$, so let us determine that now. The answer depends on D.
If D ≡ 2 or 3 (mod 4), then $s + t\sqrt{D}$ is in R when both s and t are integers,
so the points in the plane corresponding to elements of R are the points
(s, t) with both coordinates integers. If D ≡ 1 (mod 4), then $s + t\sqrt{D}$ is in
R when both s and t are integers or when both s and t are half-integers, so
the points in the plane corresponding to elements of R are the points (s, t)
new metric, the one in which we are really interested.) A little thought
shows that we have simply translated (i.e., shifted) the problem to points
apparently nearest the origin, and we can always translate the problem in
this way. Thus, if we can show every Q-point apparently nearest the origin
is within a distance of 1 from some O-point (s, t), this will be true of all
Q-points in the plane, and again we will be done.
Thus, we have reduced our problem to considering the Q-points that
are apparently nearest the origin. We shall denote this region by 0 . Now
the real work begins.
Tracing the argument back, we see that for the Q-point $\gamma_0 = (e, f)$,
corresponding to the element $e + f\sqrt{D}$ of $Q(\sqrt{D})$, if $\gamma_1 = (s, t)$ is the O-point,
corresponding to the element $s + t\sqrt{D}$ of $R = O(\sqrt{D})$, apparently
nearest to $\gamma_0$, then choosing $\gamma = \gamma_1$ we have found an O-point γ with
$\|\gamma_0 - \gamma\|_D < 1$, as required, in the cases D = −1, −2, or −3, completing
the proof in these cases.
Now we consider the case D > 0, and again ask for what points (x, y) we
have $\|(x, y)\|_D < 1$. Here $\|(x, y)\|_D = |x^2 - Dy^2|$. We recognize $|x^2 - Dy^2| = 1$
as the equation of two pairs of hyperbolas. The equation $x^2 - Dy^2 = 1$
gives a pair of hyperbolas, one opening to the right and one opening to the
left, having vertices 1 unit to the right and 1 unit to the left of the origin,
respectively, and the equation $x^2 - Dy^2 = -1$ gives a pair of hyperbolas,
one opening up and one opening down, having vertices $1/\sqrt{D}$ units above
and $1/\sqrt{D}$ units below the origin, respectively. (We shall say these two
pairs of hyperbolas are centered at the origin and have semi-major axis
1 and semi-minor axis $1/\sqrt{D}$, although those terms are usually just used
for ellipses.) Now the points (x, y) with $\|(x, y)\|_D < 1$ are the points with
$|x^2 - Dy^2| < 1$, i.e., with $-1 < x^2 - Dy^2 < 1$, so they are the points
“inside” of these hyperbolas. That is, they are the points in the region
bounded by all four of these curves, consisting of a rectangular area in the
center adjoined by four tails that go apparently infinitely far out toward
the northeast, northwest, southeast, and southwest. (Actually, the exact
direction they go out depends on D. To be precise, they go out around the
asymptotes $y = x/\sqrt{D}$ and $y = -x/\sqrt{D}$.) But for D = 2 or 3, every point
apparently nearest the origin is in this region, as we see from Figures 2.6
and 2.7.
Again, tracing the argument back, we see that for the Q-point $\gamma_0 = (e, f)$,
corresponding to the element $e + f\sqrt{D}$ of $Q(\sqrt{D})$, if $\gamma_1 = (s, t)$
is the O-point, corresponding to the element $s + t\sqrt{D}$ of $R = O(\sqrt{D})$,
apparently nearest to $\gamma_0$, then choosing $\gamma = \gamma_1$ we have found an O-point
γ with $\|\gamma_0 - \gamma\|_D < 1$, as required, in the cases D = 2 or 3, completing the
proof in these cases as well.
This concludes the argument in the (relatively) easy cases. As we men-
tioned above, the other cases are handled in Appendix C, to which we refer
the interested reader.
We have just proved that, for some values of D, $O(\sqrt{D})$ is a Euclidean
domain. In fact, the only negative values of D for which $O(\sqrt{D})$ is a
Euclidean domain are the ones we have given, and we shall prove that now.
Before we do, we will remark that our list for positive D is not complete.
Also, it is much harder to prove that $O(\sqrt{D})$ is not a Euclidean domain for
a positive value of D. The problem here is that the “tails” of the hyperbolas
apparently go infinitely far out and so it is possible that some point γ of
$O(\sqrt{D})$ apparently very far away from $\gamma_0$ will really be within a distance
of 1 from it (or perhaps even closer). We saw some examples of this in the
part of the proof of Theorem 2.8 that appears in Appendix C, but with
more work we can come up with very dramatic examples. For example, taking
D = 41, we have that $\gamma = 46 - 7\sqrt{41}$, which is apparently very far away
from $\gamma_0 = (23/125)\sqrt{41}$, is actually very close to $\gamma_0$. Calculation shows
$\|\gamma_0 - \gamma\|_{41} = 64/15625 \approx 0.004$!
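The distance in this example can be recomputed exactly with rational arithmetic. The sketch below assumes the convention used in this proof, that the point (x, y) stands for $x + y\sqrt{D}$ and that its norm is $|x^2 - Dy^2|$; the helper name `norm_D` is hypothetical.

```python
from fractions import Fraction


def norm_D(x, y, D):
    """|x^2 - D*y^2| for the point (x, y), i.e. for the number x + y*sqrt(D)."""
    return abs(x * x - D * y * y)


# gamma0 = (23/125) * sqrt(41) and gamma = 46 - 7 * sqrt(41), as coordinate pairs.
gamma0 = (Fraction(0), Fraction(23, 125))
gamma = (Fraction(46), Fraction(-7))

diff = (gamma0[0] - gamma[0], gamma0[1] - gamma[1])
dist = norm_D(diff[0], diff[1], 41)

if __name__ == "__main__":
    print(dist, float(dist))
```

Using `Fraction` avoids floating-point cancellation, which matters here because the two terms $x^2$ and $Dy^2$ are each about 2116 while their difference is tiny.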
Lemma 2.9. If D < 0 and D ≠ −1, −2, −3, −7, or −11, then $R = O(\sqrt{D})$
is not a Euclidean domain with respect to its norm $\|\cdot\|$.
Proof: We shall continue to use the language and notation of the proof of
Theorem 2.8.
To show that $R = O(\sqrt{D})$ is not a Euclidean domain with respect to
its norm, we need only find a point $\gamma_0$ of 0 that is not within a distance
of 1, in the norm $\|\cdot\|_D$, of any point γ of R, i.e., which is not in the interior
of an ellipse centered at any γ.
First, suppose D ≡ 2 or 3 (mod 4). We are excluding D = −1 or −2,
so we have D ≤ −5, i.e., |D| ≥ 5. Now each ellipse has semiminor axis
$1/\sqrt{-D} < 1/2$ and is centered at a point $\gamma = s + t\sqrt{D}$ where both s and t, and
in particular t, are integers. Thus, in order for $\gamma_0 = e + f\sqrt{D}$ to be in such
an ellipse, we must have f within a distance of $1/\sqrt{-D}$ of some integer.
But f = 1/2 is a distance of $1/2 > 1/\sqrt{-D}$ from the nearest integer, and
hence any integer, so $\gamma_0 = (1/2)\sqrt{D}$ is a suitable point.
Now suppose D ≡ 1 (mod 4). We are excluding D = −3, −7, or −11,
so we have D ≤ −15, i.e., |D| ≥ 15. Suppose in fact that D ≠ −15, so
|D| ≥ 17. The argument here is very similar to the previous case. Each
ellipse has semiminor axis $1/\sqrt{-D} < 1/4$ and is centered at a point $\gamma = s + t\sqrt{D}$
where both s and t, and in particular t, are integers or half-integers. Thus,
in order for $\gamma_0 = e + f\sqrt{D}$ to be in such an ellipse, we must have f
within a distance of $1/\sqrt{-D}$ of some integer or half-integer. But f = 1/4 is a
distance of $1/4 > 1/\sqrt{-D}$ from the nearest integer or half-integer, so
$\gamma_0 = (1/4)\sqrt{D}$ is a suitable point.
Thus, we have proved the lemma for every value of D except for D =
−15, and our proof does not work in that case, for the point $\gamma_0 = (1/4)\sqrt{-15}$
is indeed within a distance of 1 from γ = 0, as $\|(0, 1/4)\|_{-15} = 15/16 < 1$.
But here we make a different choice of $\gamma_0$. Here we choose $\gamma_0 = (3/11)\sqrt{-15}$.
If γ = 0, then $\|\gamma_0 - \gamma\|_{-15} = \|\gamma_0\|_{-15} = \|(0, 3/11)\|_{-15} = 135/121 > 1$;
if $\gamma = (\pm 1/2) + (1/2)\sqrt{-15}$ then $\|\gamma_0 - \gamma\|_{-15} = \|(\pm 1/2, -5/22)\|_{-15} = 496/484 > 1$;
and for any other value of γ we see
that $\|\gamma_0 - \gamma\|_{-15}$ is even larger (as the difference of the y-coordinates is
larger), so $\gamma_0$ is a suitable point.
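The D = −15 case can be checked numerically. The sketch below is an illustration, not part of the proof: it enumerates the O-points in a small window around the origin (an assumption that is harmless here, since points farther away only have larger norm) and computes the smallest value of $x^2 + 15y^2$ for the difference. The names `norm_D` and `o_points` are hypothetical.

```python
from fractions import Fraction

D = -15


def norm_D(x, y):
    """x^2 - D*y^2, which is x^2 + 15*y^2 here and hence nonnegative."""
    return x * x - D * y * y


def o_points(bound):
    """O-points (s, t) with |s|, |t| <= bound for D = 1 (mod 4):
    both coordinates integers, or both half-integers."""
    pts = []
    for a in range(-2 * bound, 2 * bound + 1):
        for b in range(-2 * bound, 2 * bound + 1):
            if (a - b) % 2 == 0:  # numerators of the same parity
                pts.append((Fraction(a, 2), Fraction(b, 2)))
    return pts


gamma0 = (Fraction(0), Fraction(3, 11))
closest = min(norm_D(gamma0[0] - s, gamma0[1] - t) for s, t in o_points(3))

if __name__ == "__main__":
    print(closest)                         # smallest norm of gamma0 - gamma
    print(norm_D(Fraction(0), Fraction(1, 4)))  # the earlier choice f = 1/4
```

The minimum is attained at $\gamma = (\pm 1/2) + (1/2)\sqrt{-15}$ and exceeds 1, in agreement with the argument above.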
Remark 2.10.
(1) Actually, the complete list of positive values of D for which $R = O(\sqrt{D})$
is a Euclidean domain with respect to its norm $\|\cdot\|$ is known. We state
this result without proof. These values of D are D = 2, 3, 5, 6, 7, 11,
13, 17, 19, 21, 29, 33, 37, 41, 57, and 73.
(2) To be precise, what we showed is that for the values of D in Lemma 2.9,
$O(\sqrt{D})$ is not a Euclidean domain with respect to the norm $\|\cdot\|$ that
we have defined. This leaves open the possibility that $O(\sqrt{D})$ is a
Euclidean domain with respect to some different norm. We shall not
investigate this question.
Remark 2.11. We are left with a final question: where does the name “Euclidean”
come from? The answer is that in a Euclidean domain, we may
perform Euclid’s algorithm. We shall save this for the next section, when
we will learn not only how to do it, but also what it is good for.
Lemma 2.12. Let R be a Euclidean domain and let β be any nonzero element
of R. Then $\|\beta\| \ge 1$, and $\|\beta\| = 1$ if and only if β is a unit.
But by assumption, $\|\beta\| = 1$, and by what we have just proved, there
are no nonzero elements r of R with $\|r\| < 1$, so we must have r = 0 and
then $1 = \beta\beta'$ with $\beta' = q$, so β is a unit in R.
common divisor g = gcd(a, b) is the unique positive integer with the prop-
erty that (1) g divides both a and b; and (2) if d is any integer dividing
both a and b, then d divides g.
We should point out that a priori the gcd may not exist. We are claiming
that there is one and only one positive integer with a certain property, and
a priori there may be no such integer or more than one such integer.
But for the positive integers the gcd does indeed exist and can be found
by taking the common prime factors of a and b. For example, if $a = 360 = 2^3 \cdot 3^2 \cdot 5$
and $b = 2268 = 2^2 \cdot 3^4 \cdot 7$, then $g = 2^2 \cdot 3^2 = 36$. If a = 37 and
b = 143 = 11 · 13, then g = 1 (as they have no prime factors in common).
If a = 280067 = 229 · 1223 and $b = 227168 = 2^5 \cdot 31 \cdot 229$, then g = 229.
This is not really a satisfactory answer, however, because this assumes
unique factorization, which we have not shown yet. In fact, we will use the
gcd to prove unique factorization, not the other way around. (It is also not
really satisfactory from a practical viewpoint either, since this method of
finding the gcd requires us to factor a and b into a product of primes, and
this is not so easy, unless a and b are small.) Moreover, we will see that we
have a gcd in any Euclidean domain. Thus, since we have already shown
that $O(\sqrt{-1})$ is Euclidean, we can consider elements of that ring.
For example, if a = 23 − i and b = 24 + 6i, then g = −1 + i. This
comes from the prime factorizations 23 − i = −(−1 + i)(2 + i)(7 + 2i) and
24 + 6i = −(1 + i)(−1 + i)(3)(4 + i), and as difficult as it may be to find prime
factorizations in Z, it is more difficult to find them in $O(\sqrt{-1})$. (Actually, I
have exaggerated here to make a point. We will develop a lot more theory,
which will tell us how to do prime factorization in $O(\sqrt{-1})$, and we will
see that it is not too much more difficult than in Z.)
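The gcd in this example can also be found without factoring anything, by repeated division with remainder. The following is a minimal sketch, assuming Gaussian integers are stored as pairs `(a, b)` meaning a + bi and that the quotient is obtained by rounding the exact quotient to the nearest Gaussian integer; the function names are hypothetical.

```python
def g_mul(x, y):
    """(a + bi)(c + di) as a pair."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)


def g_norm(x):
    """Norm a^2 + b^2 of a + bi."""
    return x[0] * x[0] + x[1] * x[1]


def g_divmod(x, y):
    """Return (q, r) with x = y*q + r and g_norm(r) < g_norm(y), for y != 0."""
    (a, b), (c, d) = x, y
    n = g_norm(y)
    # x / y = x * conj(y) / n; round each coordinate to a nearest integer.
    re, im = a * c + b * d, b * c - a * d
    q = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    yq = g_mul(y, q)
    return q, (a - yq[0], b - yq[1])


def g_gcd(x, y):
    """A gcd of x and y (determined only up to multiplication by a unit)."""
    while y != (0, 0):
        _, r = g_divmod(x, y)
        x, y = y, r
    return x


if __name__ == "__main__":
    g = g_gcd((23, -1), (24, 6))
    print(g, g_norm(g))
```

The result is determined only up to a unit ±1, ±i, so the program may return an associate of −1 + i rather than −1 + i itself.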
What we shall use is not only the property that elements α and β of
R have a gcd, but in addition, that the gcd can be written as a linear
combination of α and β. That is, if γ is the gcd of α and β, then there are
elements δ and ε of R with γ = αδ + βε. For example,
Definition 2.13. Let R be an integral domain and let {αi } be a set of ele-
ments of R, not all of which are zero. Then an element γ of R is a greatest
common divisor (gcd) of {αi }, γ = gcd({αi }) if
In general, a gcd may or may not exist. We shall soon explore the
question of when it does. But for the moment, let us assume that a gcd
does exist and explore the consequences of that assumption.
Lemma 2.14.
(1) Let γ be a gcd of $\{\alpha_i\}$ and let ε be any unit of R. Then $\gamma' = \gamma\varepsilon$ is also
a gcd of $\{\alpha_i\}$.
(2) If γ and $\gamma'$ are any two gcd’s of $\{\alpha_i\}$, then $\gamma' = \gamma\varepsilon$ for some
unit ε.
Proof:
Proof: We must show that 1 has the two properties of a gcd of $\{\alpha_i'\}$. Now
1 has property (1) of a gcd of $\{\alpha_i'\}$ as 1 certainly divides each $\alpha_i'$.
Suppose now that ζ is any element of R that divides each $\alpha_i'$. Then γζ
divides each $\gamma\alpha_i'$. But $\gamma\alpha_i' = \alpha_i$ so γζ divides each $\alpha_i$. By property (2) of
a gcd of $\{\alpha_i\}$, we have that γζ divides γ, and hence ζ divides 1. Thus we
see that 1 also has property (2) of a gcd of $\{\alpha_i'\}$, so we conclude that 1 is
a gcd of $\{\alpha_i'\}$.
To see the distinction between these two definitions, let us consider the
set of integers {6, 10, 15}. This set is relatively prime, as it has a gcd of
1, but is not pairwise relatively prime. Looking at pairs of elements, we
see that 6 and 10 have a gcd of 2, that 6 and 15 have a gcd of 3, and that
10 and 15 have a gcd of 5. Thus in this set, no two distinct elements are
relatively prime.
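For sets of integers, the distinction between the two notions can be checked mechanically. The following is a small Python sketch using the standard-library math.gcd; the helper names is_relatively_prime and is_pairwise_relatively_prime are ours, introduced only for illustration.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def is_relatively_prime(nums):
    """True if the gcd of the whole set is 1."""
    return reduce(gcd, nums) == 1

def is_pairwise_relatively_prime(nums):
    """True if every pair of distinct elements has gcd 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(nums, 2))

A = [6, 10, 15]
print(is_relatively_prime(A))            # True
print(is_pairwise_relatively_prime(A))   # False
# gcd of each pair: (6, 10) -> 2, (6, 15) -> 3, (10, 15) -> 5
print([(a, b, gcd(a, b)) for a, b in combinations(A, 2)])
```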
We have been proceeding a bit hypothetically, assuming a gcd exists
and exploring the consequences of that assumption. Now let us turn to the
question of when a gcd actually does exist.
We shall formulate a stronger property than the mere existence of the
gcd, and investigate that.
Definition 2.18. An integral domain R has the GCD-L property if the
following is true:
(1) Every set of elements A = {αi} in R, not all zero, has a gcd γ, and
(2) this gcd γ can be expressed as a linear combination of elements of A.
Here is our main theorem. First, we will give a very short and slick (but
nonconstructive) proof of this theorem. Then we will give a constructive
proof that will lead us to Euclid’s algorithm.
Theorem 2.20. Every Euclidean domain R has the GCD-L property.
First Proof: Let R be a Euclidean domain with norm ‖·‖, and let A = {αi}
be any set of elements of R not all of which are zero.
Let S be the set of all linear combinations of the elements of A,
S = { Σ αi βi | each βi is in R, and only finitely many βi ≠ 0 }.
T = { ‖α‖ | α is in S }.
Lemma 2.23. Let α1 and α2 be any two elements of R, and let δ be any
element of R. Set α′2 = α2 + δα1. Then D({α1, α2}) = D({α1, α′2}).
Lemma 2.24. Let A = {αi} be any set of elements of R, not all of which
are zero, and suppose that A has a gcd γ. Let α′ be any element of R. If
{α′, γ} has a gcd γ′, then γ′ is also a gcd of the set A′ = {α′} ∪ A. (In
particular, the set A′ has a gcd.)
Proof: We shall show that D({α′, γ}) = D({α′} ∪ A). In light of
Remark 2.11, this proves the lemma.
Again, we show that these two sets are equal by showing that every
element of one of them is also an element of the other.
Suppose δ is in D({α′, γ}). Then δ divides α′ and δ divides γ. Since γ
divides each αi, we see that δ divides each αi, so δ is in D({α′} ∪ A).
On the other hand, suppose δ is in D({α′} ∪ A). Then δ divides α′,
and δ divides each αi. But then, by the definition of the gcd of A, δ
divides γ. Hence δ is in D({α′, γ}).
With these results in hand, we can now give our second proof of Theo-
rem 2.20. This proof only applies, however, to the case that A = {αi } is a
finite set.
Second Proof of Theorem 2.20: Suppose that A = {αi } = {α1 , . . . , αn } is
a finite set consisting of n elements. We prove the theorem by induction
on n.
First, suppose n = 1. Then the gcd of {α1 } is clearly γ = α1 . (α1
divides α1 and any β that divides α1 divides α1 .) So {α1 } has a gcd, and
furthermore α1 = α1 · 1 so we see that both conditions in Definition 2.18
are satisfied.
Next, suppose n = 2. This is the crucial case. To handle this case we
employ a procedure known as Euclid’s algorithm. Consider {α1 , α2 }. If
α2 = 0 then (as every element of R divides 0), the gcd of {α1 , α2 } is the
gcd of {α1 }, which we have just observed is α1 . Also, α1 = α1 · 1 + α2 · 0.
So in this case we are done. Similarly, if α1 = 0 then by the same logic the
gcd of {α1 , α2 } is α2 , and α2 = α1 · 0 + α2 · 1, and we are again done. Now
suppose α1 and α2 are both nonzero.
To avoid notational confusion, we shall set θ1 = α1 and θ2 = α2 . We
may use the division algorithm in the Euclidean domain R to write
Keep going.
θ_i = θ_{i−2} − θ_{i−1} δ_{i−2}
= (θ1 ε1 + θ2 ε2) − (θ1 ζ1 + θ2 ζ2) δ_{i−2}
= θ1 (ε1 − ζ1 δ_{i−2}) + θ2 (ε2 − ζ2 δ_{i−2}),
γ = α1 β1 + α2 β2 + . . . + αn−1 βn−1
γ′ = αn ζ1 + γζ2.
36 = 360 + (108)(−3)
= 360 + (2268 + 360(−6))(−3)
= 2268(−3) + 360(19).
552 = 36 · 15 + 12,
36 = 12 · 3,
so the gcd is 12, and furthermore, also using our work above,
12 = 552 + 36(−15)
= 552 + (2268(−3) + 360(19))(−15)
= 2268(45) + 360(−285) + 552(1).
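The computation above handles three integers by first finding a gcd of two of them and then folding in the third, which is the pattern of the induction step. As a hedged sketch (in Python, with function names ext_gcd and ext_gcd_list chosen by us for illustration), the same bookkeeping can be automated for any finite list of integers:

```python
def ext_gcd(a, b):
    """Return (g, s, t) with g = a*s + b*t, by Euclid's algorithm."""
    r0, r1, s0, s1, t0, t1 = a, b, 1, 0, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return r0, s0, t0

def ext_gcd_list(nums):
    """Return (g, coeffs) with g = sum(n * c), folding in one element at a time."""
    g, coeffs = nums[0], [1]
    for n in nums[1:]:
        # new gcd = (old gcd) * s + n * t, so rescale the old coefficients by s
        g, s, t = ext_gcd(g, n)
        coeffs = [c * s for c in coeffs] + [t]
    return g, coeffs

nums = [2268, 360, 552]
g, coeffs = ext_gcd_list(nums)
print(g, coeffs)
print(sum(n * c for n, c in zip(nums, coeffs)) == g)
```

For these three integers the sketch yields the gcd 12 with coefficients 45, −285, and 1, matching the hand computation. In general the coefficients are not unique, so a different order of operations may give a different (equally valid) linear combination.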
143 = 37 · 3 + 32,
37 = 32 · 1 + 5,
32 = 5 · 6 + 2,
5 = 2 · 2 + 1,
2 = 1 · 2,
1 = 5 + 2(−2)
= 5 + (32 + 5(−6))(−2)
= 32(−2) + 5(13)
= 32(−2) + (37 + 32(−1))(13)
= 37(13) + 32(−15)
= 37(13) + (143 + 37(−3))(−15)
= 143(−15) + 37(58).
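The two phases of this computation, the chain of divisions followed by back-substitution, translate directly into a short recursive routine. The following Python sketch is ours (the names euclid_steps and bezout are illustrative, not from the text) and assumes positive integer inputs so that the remainders agree with those written above.

```python
def euclid_steps(a, b):
    """Yield (dividend, divisor, quotient, remainder) for each division step."""
    while b != 0:
        q, r = divmod(a, b)
        yield a, b, q, r
        a, b = b, r

def bezout(a, b):
    """Return (g, s, t) with g = a*s + b*t, found by back-substitution."""
    if b == 0:
        return a, 1, 0
    q, r = divmod(a, b)
    g, s, t = bezout(b, r)      # g = b*s + r*t, where r = a - b*q
    return g, t, s - q * t      # hence g = a*t + b*(s - q*t)

for a, b, q, r in euclid_steps(143, 37):
    print(f"{a} = {b} * {q} + {r}")
print(bezout(143, 37))          # (1, -15, 58)
```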
143 = 37 · 4 + (−5),
37 = −5(−7) + 2,
−5 = 2(−3) + 1,
2 = 1 · 2,
1 = −5 + 2(3)
= −5 + (37 + (−5)7)(3)
= 37(3) + (−5)22
= 37(3) + (143 + 37(−4))22
= 143(22) + 37(−85).
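In this second computation the remainders are allowed to be negative, and at each step the quotient is chosen so that the remainder is as small as possible in absolute value. A minimal Python sketch of that variant follows; the names nearest_divmod and bezout_nearest are ours, and the tie-breaking rule (keep the smaller quotient when the two candidate remainders have equal absolute value) is an assumption made for definiteness.

```python
def nearest_divmod(a, b):
    """Return (q, r) with a = b*q + r and |r| <= |b|/2."""
    q, r = divmod(a, b)        # Python: r has the sign of b and |r| < |b|
    if 2 * abs(r) > abs(b):    # remainder too large: use the next quotient
        q, r = q + 1, r - b
    return q, r

def bezout_nearest(a, b):
    """Return (g, s, t) with g = a*s + b*t, using least-absolute remainders."""
    r0, r1, s0, s1, t0, t1 = a, b, 1, 0, 0, 1
    while r1 != 0:
        q, r2 = nearest_divmod(r0, r1)
        r0, r1 = r1, r2
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return r0, s0, t0

print(nearest_divmod(143, 37))   # (4, -5)
print(bezout_nearest(143, 37))   # (1, 22, -85)
```

Both variants express 1 as a linear combination of 143 and 37, but with different coefficients, which illustrates that the coefficients in such an expression are not unique.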
= 12345638(37) + 91(−5019655)
= 12345638(37) + (24691367 + 12345638(−2))(−5019655)
= 24691367(−5019655) + 12345638(10039347)
= 24691367(−5019655) + (111111106 + 24691367(−4))(10039347)
= 111111106(10039347) + 24691367(−45177043)
= 111111106(10039347) + (135802473 + 111111106(−1))(−45177043)
= 135802473(−45177043) + 111111106(55216390)
= 135802473(−45177043) + (246913579 + 135802473(−1))(55216390)
= 246913579(55216390) + 135802473(−100393433)
= 246913579(55216390) + (876543210 + 246913579(−3))(−100393433)
= 876543210(−100393433) + 246913579(356396689)
= 876543210(−100393433) + (1123456789 + 876543210(−1))(356396689)
= 1123456789(356396689) + 876543210(−456790122).
Example 2.26.
(1) Now we do an example with R = O(√−1). In finding the quotient
(and remainder) at each step, we follow the strategy of the proof of
Theorem 2.8: if (a + b√D)/(c + d√D) = e + f√D with e, f in Q, we
let the quotient be s + t√D where s and t are integers closest to e and
f, respectively (and then the remainder is forced).
Let α1 = 24 + 6i and α2 = 23 − i. Then
24 + 6i = (23 − i)(1) + (1 + 7i)
(as (24 + 6i)/(23 − i) = (546 + 162i)/530),
23 − i = (1 + 7i)(−3i) + (2 + 2i)
(as (23 − i)/(1 + 7i) = (16 − 162i)/50),
1 + 7i = (2 + 2i)(2 + i) + (−1 + i)
(as (1 + 7i)/(2 + 2i) = (16 + 12i)/8),
2 + 2i = (−1 + i)(−2i),
so the gcd is −1 + i, and then
−1 + i = (1 + 7i) + (2 + 2i)(−2 − i)
= (1 + 7i) + ((23 − i) + (1 + 7i)(3i))(−2 − i)
= (23 − i)(−2 − i) + (1 + 7i)(4 − 6i)
= (23 − i)(−2 − i) + ((24 + 6i) + (23 − i)(−1))(4 − 6i)
= (24 + 6i)(4 − 6i) + (23 − i)(−6 + 5i).
(By way of further explanation, (24 + 6i)/(23 − i) = (546 + 162i)/530 =
(546/530) + (162/530)i, which is nearest to 1; (23 − i)/(1 + 7i) = (16 −
162i)/50 = (16/50) + (−162/50)i, which is nearest to −3i; and (1 + 7i)/(2 +
2i) = (16 + 12i)/8 = 2 + (3/2)i, which is equally near to 2 + i and 2 + 2i,
and we choose 2 + i.)
(2) In our next example, R = O(√−7). The logic of this example depends
on the proof of Theorem 2.8 in the case D = −7. Since we deferred the
proof of that case of Theorem 2.8 to Appendix C, we similarly defer
this example to Appendix C.
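The computation in part (1) can be reproduced mechanically. The following Python sketch represents a + bi as the pair (a, b) of integers and rounds each coordinate of the exact quotient to a nearest integer. All function names are ours, introduced for illustration, and the rule for breaking ties (round a half up) is an assumption. Because of that choice, the sketch may take 2 + 2i rather than 2 + i as the third quotient, and so may return an associate of −1 + i, such as 1 − i, rather than −1 + i itself.

```python
def g_add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def g_sub(x, y):
    return (x[0] - y[0], x[1] - y[1])

def g_mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

def nearest(n, d):
    """Integer nearest to n/d for d > 0 (a tie is rounded up)."""
    return (2 * n + d) // (2 * d)

def g_divmod(x, y):
    """Quotient and remainder in Z[i], with norm(remainder) < norm(y)."""
    (a, b), (c, d) = x, y
    n = c * c + d * d                      # norm of y
    # x / y = x * conj(y) / n = ((ac + bd) + (bc - ad) i) / n
    q = (nearest(a * c + b * d, n), nearest(b * c - a * d, n))
    return q, g_sub(x, g_mul(q, y))

def g_ext_gcd(x, y):
    """Return (g, s, t) with g = x*s + y*t and g a gcd of x and y."""
    r0, r1 = x, y
    s0, s1, t0, t1 = (1, 0), (0, 0), (0, 0), (1, 0)
    while r1 != (0, 0):
        q, r2 = g_divmod(r0, r1)
        r0, r1 = r1, r2
        s0, s1 = s1, g_sub(s0, g_mul(q, s1))
        t0, t1 = t1, g_sub(t0, g_mul(q, t1))
    return r0, s0, t0

x, y = (24, 6), (23, -1)
g, s, t = g_ext_gcd(x, y)
print(g, s, t)
print(g_add(g_mul(x, s), g_mul(y, t)) == g)   # the linear combination checks out
```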
Remark 2.27. Note that the gcd is only defined up to multiplication by a
unit in R (i.e., if γ is a gcd of α1 and α2, so is γε for any unit ε of R;
compare Lemma 2.14), so it would be, strictly speaking, better to speak of
“a” gcd rather than “the” gcd. In particular, if R = Z, with units {±1},
then, for example, −36 is also a gcd of 2268 and 360, and
−36 = 2268(3) + 360(−19).
If R = O(√−1), the units are {±1, ±i}, so 1 − i = −(−1 + i), −1 − i =
i(−1 + i), and 1 + i = −i(−1 + i) are also gcd’s of 24 + 6i and 23 − i, and,
for example, 1 + i = (24 + 6i)(−6 − 4i) + (23 − i)(5 + 6i).
(2) I = {0}. (In this case, I is called the zero ideal. Otherwise, I is called
a nonzero ideal.) Note that {0} is a principal ideal as {0} = I0 . (The
only multiple of 0 is 0.)
Let us return to Definition 2.29. On the one hand, we see that the ideal
generated by α is simply the set of multiples of α. On the other hand, we
see that the set of multiples of any element α of R is an ideal, and in fact a
principal ideal. So you may ask: are all ideals of R of this form? Excellent
question! But we shall defer the answer to this question for a while. Right
now, we continue with a general development of properties of ideals.
Lemma 2.32. Let I be an ideal of R. The following are equivalent:
(1) 1 is in I.
(2) I = R.
Proof:
Let us see that IA is always an ideal, and in fact that we get every ideal
this way.
Lemma 2.35.
Proof:
In case A = {α} (i.e., if this set consists of the single element α), then
IA is nothing other than the principal ideal Iα that we have already
considered. Suppose instead, for example, that A = {α1 , α2 }. Then we
have the ideal Iα1 ,α2 , and this is not of the form Iα , so is not a priori a
principal ideal. But it may turn out that in fact Iα1 ,α2 = Iα0 for some α0 ,
i.e., that it is indeed a principal ideal. In fact, it may turn out that this
is always the case. This is a very important situation to which we give a
name.
Definition 2.36. An integral domain R is a principal ideal domain (PID)
if every ideal I in R is principal.
Recall that we have earlier defined the GCD-L property (Definition 2.18).
Proposition 2.37. Let R be an integral domain. Then R is a PID if and
only if R has the GCD-L property.
Proof: This follows directly from Theorem 2.20 and Proposition 2.37.
(1) Let A be any set of elements of R, not all zero. Let γ = gcd(A). Then
IA = Iγ .
Proof: (1) is the claim in the first paragraph of the proof of Proposi-
tion 2.37, and (2) is the claim in the fourth paragraph of the proof of
Proposition 2.37.
Remark 2.40. One can ask whether, conversely, every PID is a Euclidean
domain. The answer is no, but the examples are not particularly illumi-
nating (or useful), so we shall not give any.
We have just shown that any set {αi } of elements of a PID R has a
gcd γ. Recall Definition 2.15: if {αi } has a gcd of 1, then {αi } is relatively
prime.
The following is a classical and extremely useful result.
Lemma 2.41 (Euclid’s Lemma). Let R be a PID and let α be any nonzero
element of R. Let β and γ be elements of R and suppose that α divides βγ.
If α and β are relatively prime, then α divides γ.
1 = αζ + βθ
γ = αγζ + βγθ.
so α divides γ.
γ = αδ = α(βζ) = (αβ)ζ,
so αβ divides γ.
had two conditions, (1) and (2), and we used both of them. The second
proof, on the other hand, only used condition (1) of Definition 2.18 (that
the relevant elements of R had a gcd), but did not use condition (2) of
Definition 2.18 (that the gcd could be expressed as a linear combination of
those elements). Hence the second proof works more generally. Thus, we
see that
Remark 2.46. You may have been a bit surprised by parts (3) and (4) of
this definition. The usual definition of a positive integer a being prime is
that a ≠ 1 and if a = bc for some positive integers b and c, then b = 1 or
c = 1. The generalization of that to an integral domain R is in part (3) of
Definition 2.45, but we are not calling this generalization prime. Rather,
we are calling it irreducible, and using the term prime to denote something
else, defined in part (4). Although surprising, this turns out to be the right
thing to do. As we will show (see Lemma 2.48), in the integers, or in any
integral domain with unique factorization, these two notions turn out to
be equivalent, so it does not matter whether we use notion (3) or (4) in
that case, but in general (4) is the right notion to use.
We will observe that (3) may be more practical to check. In the case of
the positive integers, to check (3) we only have to try divisors of a, and we
know any such divisor must be less than or equal to a, so there are only
finitely many numbers to check.
On the other hand, to check (4), we must look at any number d and
see if d is divisible by a. If so, we must look at any factorization d = bc
of d, and see if a divides one of the factors b or c, and here there are
infinitely many numbers to check. This difference, as well as millennia of
mathematical tradition, are the reasons the usual definition is preferred for
the positive integers.
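To make the contrast concrete, here is a small Python sketch for positive integers. The function is_irreducible carries out the finite check for notion (3). The function looks_prime tests notion (4) only for factors b and c up to an arbitrary cutoff, since the full check ranges over infinitely many products. Both names and the cutoff are our own illustrative choices.

```python
def is_irreducible(a):
    """Notion (3): a != 1 and a has no divisor b with 1 < b < a."""
    if a < 2:
        return False
    return all(a % b != 0 for b in range(2, a))

def looks_prime(a, bound):
    """Notion (4), tested only for products b*c with 1 <= b <= c <= bound."""
    if a < 2:
        return False
    return all(b % a == 0 or c % a == 0
               for b in range(1, bound + 1)
               for c in range(b, bound + 1)
               if (b * c) % a == 0)

# On this small range the two notions pick out the same integers.
print([a for a in range(1, 31) if is_irreducible(a)])
print(all(is_irreducible(a) == looks_prime(a, 30) for a in range(1, 31)))
```

A bounded search of this kind can show that an integer fails notion (4), but passing it for one cutoff proves nothing by itself; that the two notions agree for the integers is a theorem, not something the search establishes.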
Remark 2.50. Observe that essential uniqueness is the best we can hope
for. For example, in the integers, we have 6 = (1)(2)(3) = (−1)(−2)(3) =
(−1)(2)(−3) = (1)(−2)(−3) = (1)(3)(2) = (−1)(−3)(2) = (−1)(3)(−2) =
(1)(−3)(−2).
α = up1 · · · pk = vq1 · · · qℓ
(Lemma 2.48), which tells us that in the PID R, every irreducible element
is prime.
Consider p1. It is a prime and divides the product vq1 · · · qℓ = (vq1)
(q2 · · · qℓ) and hence divides one of the factors vq1 or (q2 · · · qℓ). If it
divides vq1, and hence q1 (as v is a unit), fine.
Otherwise, it divides q2 · · · qℓ = q2(q3 · · · qℓ). If it divides q2, fine.
Otherwise it divides q3 · · · qℓ = q3(q4 · · · qℓ). Continue in this fashion to
conclude that p1 divides some qi. Reordering the qi’s, if necessary, we may
assume that p1 divides q1. But q1 is irreducible, so p1 and q1 must be
associates, so q1 = u′p1 for some unit u′. Then
α = up1 p2 · · · pk = vq1 q2 · · · qℓ = vu′p1 q2 · · · qℓ,
so, setting v′ = vu′,
α′ = up2 · · · pk = v′q2 · · · qℓ.
Now apply the same argument to p2 and α′ to conclude that p2 divides,
and hence is an associate of, some one of q2, . . . , qℓ, which by renumbering
we may assume is q2, and then
α″ = up3 · · · pk = v″q3 · · · qℓ.
Continuing in this way, we see that, until the process stops, and possibly
after reordering the qi’s, p1 is an associate of q1, p2 is an associate of q2,
. . . . We claim that when the process stops, we have used all the pi’s and
all the qi’s, so k = ℓ and we are just left with a unit on each side, proving
the theorem.
Otherwise, either we have used all the pi’s but not all the qi’s, so ℓ > k,
and we are left with
u = wq_{k+1} · · · qℓ
for some unit w, which is impossible, as the left-hand side u is a unit but
the right-hand side wq_{k+1} · · · qℓ is not, or vice versa, so k > ℓ and we are
left with
up_{ℓ+1} · · · pk = w,
which is similarly impossible.
(2) O(√D), the ring of integers in the quadratic field Q(√D), for D =
−11, −7, −3, −2, −1, 2, 3, 5, 6, 7, 11, 13, 17, 21, and 29.
(2) for every such prime p, if p^e is the highest power of p dividing α, and
if p^f is the highest power of p dividing β, then f ≥ e.
Proof: First, let us suppose that conditions (1) and (2) are satisfied. Then,
for some primes p1, . . . , pk, q1, . . . , qℓ and some exponents, and some units
u and v, we have
α = u p1^{e1} p2^{e2} · · · pk^{ek},
β = v p1^{f1} p2^{f2} · · · pk^{fk} q1^{g1} q2^{g2} · · · qℓ^{gℓ}.
Then, setting
γ = w p1^{f1−e1} p2^{f2−e2} · · · pk^{fk−ek} q1^{g1} q2^{g2} · · · qℓ^{gℓ}
with w = vu^{−1}, we have
β = αγ,
so in this situation α divides β.
On the other hand, suppose α divides β, so β = αγ.
Factor α and γ into primes, where we allow the exponents d1, . . . , dk to
be zero:
α = u p1^{e1} · · · pk^{ek},
γ = w p1^{d1} · · · pk^{dk} q1^{g1} · · · qℓ^{gℓ}.
Proof: Since α and β1 are relatively prime, they have no common prime
factors. Thus, we have prime factorizations
α = p1^{e1} · · · pk^{ek},
β1 = q1^{g1} · · · qℓ^{gℓ}.
Then, by Lemma 2.58, applied first to α and γ and then to β and γ, we see
that the prime factorization of γ must contain every pi^{ei} and every
qj^{gj}, so it contains p1^{e1} · · · pk^{ek} q1^{g1} · · · qℓ^{gℓ} and hence αβ divides γ.
α = p1^{e1} · · · pk^{ek}.
Proof:
(2) R has an element α that is not divisible by p, but with ‖α‖ divisible
by p,
Proof: Write ‖α‖ = pq, for some q > 1. By Lemma 2.3, ‖α‖ = |αᾱ|, so
‖α‖ = pq gives
αᾱ = ±pq.
By Lemma 2.64, p is irreducible. Now p does not divide α, by assumption,
and then p does not divide ᾱ either. Thus, p divides the product αᾱ
without dividing either factor, so p is not a prime. Now in a UFD, every
irreducible is prime (Lemma 2.55), so R cannot be a UFD.
Remark 2.66. It is easy to tell when p divides α. Suppose p is odd and
α = a + b√D or (a + b√D)/2. Then α/p = a/p + (b/p)√D or ((a/p) +
(b/p)√D)/2, so p divides α if and only if p divides a and p divides b. The
case p = 2 is a little more complicated. If D ≡ 2 or 3 (mod 4), then 2
divides α = a + b√D if and only if a and b are both even. If D ≡ 1 (mod 4),
then 2 divides α = a + b√D with a and b integers if and only if either a
and b are both even or both odd integers. (This justifies the claim in the
proof of Corollary 2.65 that if p does not divide α = a + b√D, then p does
not divide ᾱ = a − b√D either.)
Proof: We divide the proof into two cases: (1) D ≡ 2 or 3 (mod 4), and (2)
D ≡ 1 (mod 4).
(1) Suppose D ≡ 2 or 3 (mod 4). Then any element β of O(√D) is of the
form β = a + b√D with a and b integers. Then
Then
‖α‖ = 1 − D.
Then
‖α‖ = (1 − D)/4.
Proof:
(1) Let β be an element of R. Then β = a + b√D where a and b are
integers. Suppose ‖β‖ = p. Then
p = ‖β‖ = a² − b²D = a² + b²(−D)
Proof:
(1) We claim that if D < 0, D ≡ 1 (mod 4), and |D| is composite, then D
has a prime factor p with p < |D|/4.
(To see this, note that if D < 0, D ≡ 1 (mod 4), and |D| is composite,
then |D| ≥ 15. If |D| = 15, then D is divisible by 3 and 3 < 15/4.
Otherwise |D| > 15. Let p be the smallest prime dividing |D|. Then
|D|/p ≠ 1, so |D|/p is divisible by some prime p′ ≥ p. We now argue
Then
‖α‖ = |D|,
so we see hypothesis (2) of Corollary 2.65 holds as well. Hence, by
Corollary 2.65 (the Non-UFD Test), we conclude that O(√D) is not a
UFD.
Corollary 2.71. For D ≡ 5 (mod 8), D < 0, and |D| < 1000, O(√D) is
not a UFD except (possibly) for the cases D = −3, −11, −19, −43, −67,
and −163 and the cases D = −211, −283, −331, −547, −691, −787, and
−907.
Proof: For every value of D with D ≡ 5 (mod 8), D < 0, and |D| < 1000,
except for those listed in the statement of the corollary, |D| is composite
or m = (1 − D)/4 is composite, so by Proposition 2.70, O(√D) is not a
UFD.
Actually, using the same idea, we can get a stronger test than that in
Proposition 2.70, and use it to strengthen Corollary 2.71.
(2) If there is a nonnegative odd integer a < (1/2)√(D² + 4D) with
(a² − D)/4 composite, then O(√D) is not a UFD.
Proof:
(1) Let
α = a + √D.
Then
‖α‖ = a² − D.
Now a < (1/4)√(D² + 16D) implies, by simple algebra, that a² − D <
(D/4)², so a² − D must have a prime factor p less than |D|/4. Thus, by
Lemma 2.69, we see that hypothesis (1) of Corollary 2.65 holds. Since
a is even, a² − D is odd, so p is odd and hence α is not divisible by p.
Thus, we see that hypothesis (2) of Corollary 2.65 holds as well. Hence
we conclude that, by Corollary 2.65 (the Non-UFD Test), O(√D) is
not a UFD.
(2) Let
α = (a + √D)/2,
and note that α is in O(√D) as a is odd. Then
‖α‖ = (a² − D)/4.
Then, similarly, a < (1/2)√(D² + 4D) implies that (a² − D)/4 < (D/4)²,
so (a² − D)/4 must have a prime factor p less than |D|/4. Certainly α
is not divisible by p. Thus, we see that hypothesis (2) of Corollary 2.65
holds as well. Hence we conclude that, by Corollary 2.65 (the Non-UFD
Test), O(√D) is not a UFD.
Corollary 2.75. For D ≡ 5 (mod 8), D < 0, and |D| < 1,000,000,000,
O(√D) is not a UFD except (possibly) for the cases D = −3, −11, −19,
−43, −67, and −163.
Proof: A computer computation using Proposition 2.72 rules out all values
of D except for those in the statement of the corollary. (As a matter of
curiosity, the largest value of a we have to consider for D in this range is
a = 11, which occurs for D = −543,764,323.)
Remark 2.76. What about the exceptions in Corollary 2.75? It turns out
that O(√D) is a UFD for the cases D = −3, −11, −19, −43, −67, and
−163. Given the large range of values of D, the reader may (strongly)
suspect that these are the only such values of D for which O(√D) is a
UFD. This turns out to be true, and is a very deep (and famous) theorem.
Proof: Set q = p if condition (1) holds and set q = p′p″ if condition (2)
holds.
Once again, we divide the proof into two cases: (1) D ≡ 2 or 3 (mod 4),
and (2) D ≡ 1 (mod 4).
(1) Suppose D ≡ 2 or 3 (mod 4). Then any element β of O(√D) is of the
form β = a + b√D with a and b integers. Suppose β = a + b√D with
‖β‖ = 2. Then
a² − Db² = ±2,
so
a² ≡ ±2 (mod D),
and hence
a² ≡ ±2 (mod q).
But it is a result from number theory (see Corollary B.37(2) in the
situation of condition (1) and Corollary B.38 in the situation of condition
(2)) that the congruence a² ≡ ±2 (mod q) does not have a solution.
(2) Suppose D ≡ 1 (mod 4). Then any element β of O(√D) is of the form
(2a) β = a + b√D with a and b integers or (2b) β = (a + b√D)/2 =
(a/2) + (b/2)√D with a and b odd integers. In case (2a) the argument
is exactly the same as above. We consider case (2b). Suppose β =
(a + b√D)/2 with ‖β‖ = 2. Then
x ≡ a1 (mod b1 ),
x ≡ a2 (mod b2 ),
with b1 and b2 relatively prime, always has a solution, and that this solution
is unique (mod b1 b2 ), or in other words, that this pair of congruences is
equivalent to a single congruence
x ≡ a3 (mod b1 b2 )
for some a3 . (There are various methods for finding a3 . If b1 and b2 are
small, trial and error is as good as any.) We will use this in the statement
of our results.
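The trial-and-error method just mentioned is easy to sketch in Python. The function name crt_pair is ours; it steps through the numbers congruent to a1 (mod b1) until one is also congruent to a2 (mod b2), which must happen within b2 steps when b1 and b2 are relatively prime.

```python
def crt_pair(a1, b1, a2, b2):
    """Return a3 with 0 <= a3 < b1*b2, a3 ≡ a1 (mod b1), and a3 ≡ a2 (mod b2)."""
    for k in range(b2):
        x = a1 % b1 + k * b1          # x runs over the class of a1 mod b1
        if x % b2 == a2 % b2:
            return x
    raise ValueError("no solution found: are b1 and b2 relatively prime?")

# The pair x ≡ 0 (mod 5), x ≡ 2 (mod 8) is equivalent to x ≡ 10 (mod 40).
print(crt_pair(0, 5, 2, 8))           # 10
```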
We next apply this to the case D ≡ 2 (mod 8), so we have the pair of
simultaneous congruences
D ≡ 0 (mod 5) and D ≡ 2 (mod 8),
(4) Next, we consider condition (2) of Theorem 2.78. To apply this
condition we need to find a pair of primes p′ and p″ with p′ ≡ 3 (mod 8)
and p″ ≡ 7 (mod 8). The first such pair is p′ = 3 and p″ = 7. Then we
have the pair of simultaneous congruences
Proof: Once again we divide the proof into two cases: (1) D ≡ 2 or
3 (mod 4), and (2) D ≡ 1 (mod 4).
(1) Suppose D ≡ 2 or 3 (mod 4). Then any element β of O(√D) is of the
form β = a + b√D with a and b integers. Suppose β = a + b√D with
‖β‖ = p1. Then
a² − Db² = ±p1,
so
a² ≡ ±p1 (mod D),
and hence
a² ≡ ±p1 (mod p2).
(2) Suppose D ≡ 1 (mod 4). Then any element β of O(√D) is of the form
(a) β = a + b√D with a and b integers or (b) β = (a + b√D)/2 = (a/2) +
(b/2)√D with a and b odd integers. In case (a), the argument is exactly
the same as above. We consider case (b). Suppose β = (a + b√D)/2
with ‖β‖ = p1. Then
Remark 2.81. Let p2 be an odd prime and let q be any integer relatively
prime to p2. If p2 ≡ 3 (mod 4), then, by Corollary B.34, exactly one of
the congruences x² ≡ q (mod p2) and x² ≡ −q (mod p2) has a solution,
so the hypothesis of Lemma 2.80 is never satisfied. If p2 ≡ 1 (mod 4),
then, by Corollary B.34, either both of the congruences x² ≡ q (mod p2)
and x² ≡ −q (mod p2) have a solution, or neither does, so in this case the
hypothesis in Lemma 2.80 that x² ≡ ±p1 (mod p2) does not have a solution
is equivalent to the simpler hypothesis that x² ≡ p1 (mod p2) does not have
a solution.
We also claim that O(√D) has an element α not divisible by p1 but with
‖α‖ divisible by p1. By hypothesis (2), there is an a with a² ≡ D (mod p1).
Let
α = a + √D.
Clearly α is not divisible by p1 as α/p1 = (a + √D)/p1 is not in O(√D).
(Note that p1 is odd.) But then, by hypothesis (2),
‖α‖ = |a² − D|
is divisible by p1. Thus, by Corollary 2.65 (the Non-UFD Test), O(√D) is
not a UFD.
Now let us give some examples of the use of Theorem 2.82. Once again
we will formulate our results with the help of the Chinese Remainder The-
orem.
Example 2.83. We wish to apply Theorem 2.82. For this we need to find
primes p2 with p2 ≡ 1 (mod 4). The first such primes are p2 = 5 and
p2 = 13.
(1) Consider p2 = 5. We want to find odd primes p1 with x² ≡ p1 (mod 5)
not having a solution. Since the squares (mod 5) are 0, 1, and 4, this
means that p1 ≡ 2 or 3 (mod 5). But also p1 ≡ 1 (mod 2) (as p1 is
odd), so by the Chinese Remainder Theorem, the condition on p1 in
this case is p1 ≡ 3 or 7 (mod 10). The first such primes are p1 = 3 and
p1 = 7.
D ≡ 0 or 10 (mod 15).
So for these values of D, by Theorem 2.82, we have that O(√D)
is not a UFD. Remembering that D must be square-free, the first
few such positive values of D are D = 10, 15, 30, 55, 70, 85, 105,
115, 130, 145, 165, and 195.
(2) Consider p2 = 13. We want to find odd primes p1 with x² ≡ p1 (mod 13)
not having a solution. Since the squares (mod 13) are 0, 1, 3, 4, 9, 10,
and 12, this means that p1 ≡ 2, 5, 6, 7, 8, or 11 (mod 13). But also
p1 ≡ 1 (mod 2), so by the Chinese Remainder Theorem, the condition
on p1 in this case is p1 ≡ 5, 7, 11, 15, 19, or 21 (mod 26). The first
such primes are p1 = 5 and p1 = 7.
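The lists of squares and the resulting conditions on p1 in this example can be recomputed with a few lines of Python. The sketch below is ours, and the names squares_mod and odd_nonsquare_classes are illustrative. It lists the squares modulo p2, then the odd residue classes modulo 2·p2 whose reduction modulo p2 is not a square, which is what combining the two congruences by the Chinese Remainder Theorem amounts to here.

```python
def squares_mod(p):
    """Sorted list of the squares modulo p."""
    return sorted({(x * x) % p for x in range(p)})

def odd_nonsquare_classes(p2):
    """Odd classes r modulo 2*p2 such that r is not a square modulo p2."""
    squares = set(squares_mod(p2))
    return [r for r in range(1, 2 * p2, 2) if r % p2 not in squares]

print(squares_mod(5), odd_nonsquare_classes(5))
# [0, 1, 4] [3, 7]
print(squares_mod(13), odd_nonsquare_classes(13))
# [0, 1, 3, 4, 9, 10, 12] [5, 7, 11, 15, 19, 21]
```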
x² ≡ p1 (mod p2),
x² ≡ p2 (mod p1),
where p1 and p2 are distinct odd primes. It states that if at least one of
p1 and p2 is congruent to 1 (mod 4), then either both of these congruences
have a solution or neither does, while if both p1 and p2 are congruent to
3 (mod 4), then exactly one of these congruences has a solution. Since in
Theorem 2.82 we require that p2 ≡ 1 (mod 4), we are in the first of these
cases, and so we may replace hypothesis (1) in Theorem 2.82 by hypothesis
(1′):
(1′) The congruence x² ≡ p2 (mod p1) does not have a solution.
Example 2.85. We wish to apply Theorem 2.82 with hypothesis (1)
replaced by hypothesis (1′). For this we need to find an odd prime p1. The
first such prime is p1 = 3.
Remark 2.86. The techniques in Example 2.83 and Example 2.85 produce the
same pairs (p1 , p2 ) and hence the same values of D, but they produce them
in a different order. Depending on circumstances, either one may be more
convenient to use.
We now give an example of a rather trickier application of Corollary 2.65
(the Non-UFD Test) that enables us to show that O(√D) is not a UFD in
some additional cases.
Theorem 2.87. Let D be congruent to 2 (mod 8) and suppose that D is
divisible by a prime p ≡ 3 (mod 8). Then O(√D) is not a UFD.
Proof: We shall show that O(√D) does not have an element of norm p.
We prove this by contradiction.
Suppose β = a + b√D with ‖β‖ = p. Then |a² − Db²| = p, so a² − Db² =
±p and hence a² = ±p + Db². Since the right-hand side of this equation
is divisible by p, the left-hand side must be divisible by p as well, and
so a = pc for some c. Substituting into this equation and dividing each
term by p, we obtain the equation pc² = ±1 + db², where d = D/p. Since
D ≡ 2 (mod 8) and p ≡ 3 (mod 8), we see that d ≡ 6 (mod 8). Thus, this
equation yields the congruence
3c² ≡ ±1 + 6b² (mod 8).
But, as you can easily check, for any integer x, x² ≡ 0, 1, or 4 (mod 8).
Substituting these possibilities for c² and b², we see that this congruence
has no solution. Let α = √D. Then α is not divisible by p, but ‖α‖ = |D|
is divisible by p. Hence, by Corollary 2.65 (the Non-UFD Test), O(√D) is
not a UFD.
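The two "easy checks" in this proof are finite computations modulo 8, so they can be confirmed by brute force. The following short Python sketch (ours, not part of the original argument) lists the squares modulo 8 and searches all residues b and c and both signs for a solution of the congruence.

```python
# Squares modulo 8
print(sorted({(x * x) % 8 for x in range(8)}))   # [0, 1, 4]

# Search for solutions of 3c^2 ≡ ±1 + 6b^2 (mod 8)
solutions = [(b, c, sign)
             for b in range(8)
             for c in range(8)
             for sign in (1, -1)
             if (3 * c * c - (sign + 6 * b * b)) % 8 == 0]
print(solutions)                                 # []
```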
Table 2.1. The values of D between 2 and 499 for which our methods show that
O(√D) is not a UFD.
Example 2.88. We wish to apply Theorem 2.87. For this we need primes
congruent to 3 (mod 8). The first such primes are p = 3 and p = 11.
(1) Consider p = 3. Then the hypotheses of Theorem 2.87 give the pair of
simultaneous congruences
(2) Consider p = 11. Then the hypotheses of Theorem 2.87 give the pair
of simultaneous congruences
Example 2.89. Table 2.1 is a table of values of D between 2 and 499 for
which we can show that O(√D) is not a UFD by using Theorem 2.78,
Theorem 2.82, or Theorem 2.87. For each value of D we give a pair (q1 , q2 )
that provides the argument. (In case q1 = 2, we are using Theorem 2.78
with q2 = q, as in Example 2.79. In case q1 is an odd prime, we are using
Theorem 2.82, as in Example 2.83 or Example 2.85, or Theorem 2.87, as
in Example 2.88. In this last case we have simply set q2 = D. For many
values of D, there is more than one pair (q1 , q2 ) that work. In those cases,
we have simply chosen one.)
2.7 Summing Up
In the preceding sections of this chapter, we have shown that certain
integral domains O(√D) are or are not unique factorization domains. In this
section we will sum up our work and also report on some interesting results
that are beyond our ability to prove here.
We will state the results for the cases of imaginary quadratic fields
(D < 0) and real quadratic fields (D > 0) separately.
First, imaginary quadratic fields.
Theorem 2.90. Let D < 0, and let R = O(√D).
Proof: (1) is Lemma 2.9; (2) is Theorem 2.68; and (3) is part of Proposi-
tion 2.70.
Note that this theorem leaves an infinite number of cases open, those with
D ≡ 5 (mod 8), |D| prime. Some of these cases we have dealt with in
Proposition 2.72 (applied in Corollary 2.75).
The following is a very deep theorem.
Theorem 2.91. There are exactly nine values of D < 0 for which R =
O(√D) is a UFD. They are D = −1, −2, −3, −7, −11, −19, −43, −67,
and −163.
Proof: (1) is Lemma 2.9 and (2) is Theorem 2.78 combined with Theo-
rem 2.87.
Conjecture 2.93 (Gauss). There are an infinite number of values of D > 0
for which O(√D) is a UFD.
2.8 Exercises
Exercise 2.1. Complete the proof of Lemma 2.7. (You must prove Lemma 2.7
in the cases not explicitly done in the text, i.e., in the cases where a ≤ 0
and b > 0; where a ≥ 0 and b < 0; and where a ≤ 0 and b < 0. You may
do so by proving each of these cases from scratch, adapting the proof of
the case a ≥ 0 and b > 0 given in the text to these other cases, but even
better would be a proof that simply reduces these other cases to the case
a ≥ 0 and b > 0 and uses the fact that we know Lemma 2.7 is true in that
case.)
Exercise 2.2. Lemma 2.7 shows that for any integer a and any nonzero
integer b, there exist integers q and r with a = bq + r and −|b| + 1 ≤ r ≤
|b| − 1. Show that for any integer a and any nonzero integer b, there exist
unique integers q and r with a = bq + r and 0 ≤ r ≤ |b| − 1.
Exercise 2.5. Let α and β be elements of R. Show that α and β are
relatively prime if and only if α² and β² are relatively prime.
gcd(α, β, γ) ≅ gcd(gcd(α, β), γ).
(b) if λ and λ′ are any two lcm’s of {αi}, then λ′ = λε for some unit ε of
R.
Exercise 2.8. Let α and β be elements of R, not both of which are zero.
Suppose that α and β are relatively prime. Show that αβ is an lcm of α
and β.
Exercise 2.9. More generally, let α and β be any two elements of R, not
both of which are zero. Let γ be a gcd of α and β. Show that λ = αβ/γ is
an lcm of α and β. Thus we see that if γ is a gcd of α and β, and λ is an
lcm of α and β, then γλ ≅ αβ.
lcm(α, β, γ) ≅ lcm(lcm(α, β), γ).
Exercise 2.11. Use the result of the preceding exercise and induction to
show that every finite set of elements of R, not all of which are zero, has
an lcm.
Exercises 2.3′–2.11′. Do Exercises 2.3–2.11 for a UFD R by using the
results of Section 2.4. Exercises 2.3′–2.11′ are easier than Exercises
2.3–2.11, but that is because we are using more background: the prime
factorization of elements of R.
Exercise 2.15. Let {a1 /b1 , . . . , an /bn } be a set of fractions in lowest terms.
(By this we mean that ai and bi are relatively prime, for each i.)
Exercise 2.16. In each case, find the gcd of the following set of integers,
and express the gcd as a linear combination of those integers:
Exercise 2.17. Find the lcm of each of the sets of integers in Exercise 2.16.
Exercise 2.18. Let R = O(√−1). Set i = √−1. In each case, find a gcd
of the following sets of elements of R, and express that gcd as a linear
combination of those elements:
(b) {5 + i, 7 + 2i},
(e) {1 + i, 1 − i},
Exercise 2.19.
Exercise 2.21.
(a) Consider the following factorizations in $\mathcal{O}(\sqrt{-2})$:
$51 = 3 \cdot 17 = (1 + 5\sqrt{-2}) \cdot (1 - 5\sqrt{-2}) = (7 + \sqrt{-2})(7 - \sqrt{-2}).$
(c) Consider the following factorizations in $\mathcal{O}(\sqrt{6})$:
$6 = \sqrt{6} \cdot \sqrt{6} = 2 \cdot 3.$
Exercise 2.22.
(c) Let p and q be primes with pq ≡ 3 (mod 4) and suppose that D = 1−pq
is square-free. Show that
$pq = (1 - \sqrt{D})(1 + \sqrt{D}) = p \cdot q$
are two factorizations of $pq$ into irreducibles in $\mathcal{O}(\sqrt{D})$. (This applies
to D = −14, −34, −38, −94, −118, −142, . . . .)
Exercise 2.25. Verify that Corollary 2.75 is true. (This requires the use of
a computer.)
Exercise 2.28.
Exercise 2.30. With the above definition of $\|\cdot\|$ on R[X], show that, for
any two nonzero polynomials f(X) and g(X),
(The reason for not defining the norm of the 0 polynomial in R[X] is
to make these equations hold. If we defined the norm of the 0 polynomial
to be 0, as you might think, we would have to make exceptions to (a) and
(b) in order to make them hold.)
Exercise 2.32.
(a) Show that Z[X] is not a Euclidean domain with the above norm.
(b) More generally, let R be any integral domain that is not a field. Show
that R[X] is not a Euclidean domain with the above norm.
Exercise 2.33. Find the gcd of each of the following sets of polynomials in
Q[X], and express the gcd as a linear combination of those polynomials:
(a) $\{X^2 + X + 1,\ X + 1\}$;
(b) $\{X^3 + X + 1,\ X + 2\}$;
(c) $\{X^3 + 2X^2 + X + 2,\ X^4 + 5X^2 + 4\}$.
$R = \{f(X) = a_0 + a_2 X^2 + a_3 X^3 + \cdots + a_n X^n \text{ in } \mathbb{Q}[X]\}$,
i.e., R consists of those polynomials in Q[X] that do not have an “X” term.
(a) Show that $X^2$ and $X^3$ are irreducible but not prime elements of R.
(b) This implies that R is not a UFD. Find an explicit element of R that
has two distinct factorizations into irreducibles. (You may have already
done (b) in doing (a).)
$I = \{f(X) = a_n X^n + \cdots + a_0 \mid a_0 \text{ is even}\}$,
Thus, we see that Z[X] is not a PID. (It turns out that Z[X] is a UFD,
so Z[X] gives an example of a UFD that is not a PID.)
Chapter 3
In this chapter we will investigate $\mathcal{O}(\sqrt{-1})$, known as the Gaussian integers. We recall that
$\mathcal{O}(\sqrt{-1}) = \{a + b\sqrt{-1} \mid a \text{ and } b \text{ are integers}\} = \{a + bi \mid a \text{ and } b \text{ are integers}\}$,
and we have shown that $\mathcal{O}(\sqrt{-1})$ is a UFD. We will begin by proving a justly famous theorem of Fermat: every prime congruent to 1 (mod 4) can be written as a sum of squares of two positive integers, $p = x^2 + y^2$, uniquely up to the order of the summands. (For example, $5 = 2^2 + 1^2$, $13 = 3^2 + 2^2$, $17 = 4^2 + 1^2$, $29 = 5^2 + 2^2$, $37 = 6^2 + 1^2$, $41 = 5^2 + 4^2$.)
We shall present three proofs of this theorem in the first section of this
chapter.
The first proof, due to Euler, is believed to be Fermat’s original proof.
(Fermat did not write his proof down, but left a hint as to his approach.)
This is the longest and most difficult proof. (Actually, the proof we write
down is a variant of Euler’s proof, looking ahead to Chapter 4.)
The second proof is a twentieth-century proof due to Thue. It is of
medium length and difficulty.
The third proof uses the fact that the Gaussian integers are a UFD,
and, given that fact, is short and easy!
The fact that $\mathcal{O}(\sqrt{-1})$ is a UFD gives us unique factorization into
primes, but by itself does not tell us what the primes are (or how to find a
factorization). Using Fermat’s theorem, we can concretely and completely
answer this question, and we do so in the second section of this chapter.
Now for the numbers we wish to rule in. We begin with an observation:
if each of two numbers is representable as a sum of two squares, so is their
product. For example, since $5 = 2^2 + 1^2$ and $13 = 3^2 + 2^2$, we can conclude
that 65 is also a sum of two squares, and in fact $65 = 8^2 + 1^2 = 7^2 + 4^2$.
That this is true is a simple algebraic fact.
Lemma 3.2. If $m = a^2 + b^2$ and $n = c^2 + d^2$, then
$mn = e^2 + f^2$
$mn = e^2 + f^2$
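The algebraic fact behind Lemma 3.2 can be checked numerically. The following Python sketch assumes the classical composition formulas $e = ac - bd$, $f = ad + bc$ and $e = ac + bd$, $f = ad - bc$ (the same combinations that appear as $s$ and $t$ in the case analysis later in this section). The function name `compose_two_squares` is ours and purely illustrative.

```python
def compose_two_squares(a, b, c, d):
    """Return two pairs (e, f) with (a^2 + b^2)(c^2 + d^2) = e^2 + f^2.

    Assumed formulas (classical two-square composition):
      first  pair: e = ac - bd, f = ad + bc
      second pair: e = ac + bd, f = ad - bc
    """
    first = (a * c - b * d, a * d + b * c)
    second = (a * c + b * d, a * d - b * c)
    return first, second


# 5 = 2^2 + 1^2 and 13 = 3^2 + 2^2, so 65 should be a sum of two squares.
m, n = 2**2 + 1**2, 3**2 + 2**2
for e, f in compose_two_squares(2, 1, 3, 2):
    assert e * e + f * f == m * n
    print(sorted((abs(e), abs(f))))  # prints [4, 7], then [1, 8]
```

The two pairs recover exactly the two representations $65 = 7^2 + 4^2 = 8^2 + 1^2$ quoted above.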
Fermat’s proof contained several brilliant ideas. The first was to turn
this lemma around and to show that if m and mn are sums of two squares,
then, under proper conditions, so is n. The second was not to try to show
directly that n is a sum of two squares, but instead to find some multiple
mn of n that is, and to apply the first idea. Of course, to make that work
Fermat had to know that m is a sum of two squares and for that he applied
his “method of descent,” a variant of mathematical induction.
Now let us look at Fermat’s proof precisely. We will break it up into a
number of steps.
Lemma 3.4.
Proof:
$x^2 + y^2 = a^2 + 1^2 \equiv -1 + 1 \equiv 0 \pmod p.$
$x^2 + y^2 \equiv 0 \pmod p$
$a^2(x^2 + y^2) \equiv 0 \pmod p$
$(ax)^2 + (ay)^2 \equiv 0 \pmod p$
$1 + (ay)^2 \equiv 0 \pmod p$
$(ay)^2 \equiv -1 \pmod p,$
Case (I): p divides the first factor −sb + ta. Write −sb + ta = pd and
consider the first representation of N p,
Now p divides the left-hand side and the last term on the
right-hand side, so it must divide the first term as well, so we
may write sa + tb = pc. Then
$Np = Mp^2 = (pc)^2 + (pd)^2 = p^2c^2 + p^2d^2$,
so
$M = c^2 + d^2$
sa + tb = pc
−sb + ta = pd.
s = ac − bd
t = ad + bc
Now p divides the left-hand side and the last term on the
right-hand side, so it must divide the first term as well, so we
may write sa − tb = pc. Then once again
$Np = Mp^2 = (pc)^2 + (pd)^2 = p^2c^2 + p^2d^2$,
so
$M = c^2 + d^2$
sa − tb = pc
sb + ta = pd.
s = ac + bd
t = ad − bc
so M < p/2.
In particular, every prime factor of M is less than p/2. Let q be any
prime factor of M. We claim $q \not\equiv 3 \pmod 4$. For suppose $q \equiv 3 \pmod 4$.
Then $N = x^2 + y^2$ and N is a multiple of q, so $x^2 + y^2 \equiv 0 \pmod q$ and
hence, by Lemma 3.4, x ≡ y ≡ 0 (mod q). In other words, x and y are each
divisible by q, which contradicts our assumption that x and y are relatively
prime.
Thus we see that, for any prime factor q of N , q = 2 or q ≡ 1 (mod 4).
Also, let us observe that we may write q = 2 as a sum of two squares,
$2 = 1^2 + 1^2$.
With this in hand, we prove the theorem by induction. Assume that
every prime q ≡ 1 (mod 4), q < p, can be written as a sum of two squares,
and consider p. We have written $N = Mp$ as a sum of two squares, $N = x^2 + y^2$. But
$N = (2 \cdot 2 \cdots 2 \cdot q_1 \cdot q_2 \cdots q_k)\,p$
where we have a certain number of factors of 2 (perhaps none) and a certain
number of odd prime factors $q_1, q_2, \ldots, q_k$ (perhaps none), with $q_1 < p$,
$q_2 < p$, ..., $q_k < p$. But now, since each of these factors is a sum of two
squares, we may apply Lemma 3.5 repeatedly, eliminating one factor of M
each time, until finally we obtain a representation of p as a sum of two
squares, as claimed.
Now we must show the uniqueness part of the theorem.
Suppose $p = a^2 + b^2 = s^2 + t^2$. Apply Lemma 3.5 with N = p to conclude
that $p = s^2 + t^2$ is obtained from $p = a^2 + b^2$ and a representation of
M = p/p = 1 by composition, or from $p = a^2 + (-b)^2$ and a representation
of M = p/p = 1 by composition. But obviously the only representations
We now present our second proof, and begin with a key lemma.
Lemma 3.8. (Thue). Let p be a prime and let a be any integer relatively
prime to p. Then there are integers $x_0$ and $y_0$ with $ax_0 \equiv y_0 \pmod p$ and
$0 < |x_0| < \sqrt{p}, \quad 0 < |y_0| < \sqrt{p}.$
Proof: Let $k = [\sqrt{p}]$, for convenience, and consider
$S = \{ax - y \mid 0 \le x \le k,\ 0 \le y \le k\}.$
There are k + 1 choices for x and k + 1 choices for y, so the set S has
$(k + 1)^2 > p$ elements. Since there are only p congruence classes (mod p),
by the Pigeonhole Principle there must be two different elements of S that
are congruent (mod p),
and so
$a(x_1 - x_2) \equiv y_1 - y_2 \pmod p.$
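The pigeonhole argument is effective for small p: we can simply enumerate the pairs (x, y) until two of them give the same residue. The Python sketch below does this. The function name `thue_pair` and the dictionary-based search are our own illustrative choices, not part of the text's proof.

```python
from math import isqrt


def thue_pair(a, p):
    """Find (x0, y0) with a*x0 ≡ y0 (mod p) and 0 < |x0|, |y0| < sqrt(p).

    Assumes p is prime and a is relatively prime to p.
    """
    k = isqrt(p)  # k = [sqrt(p)]
    seen = {}     # residue of a*x - y (mod p) -> the pair (x, y) producing it
    for x in range(k + 1):
        for y in range(k + 1):
            r = (a * x - y) % p
            if r in seen:
                # Two different pairs with the same residue: take differences.
                x1, y1 = seen[r]
                return x1 - x, y1 - y
            seen[r] = (x, y)
    raise ValueError("no collision found; check that p is prime and gcd(a, p) = 1")


# Since 5^2 ≡ -1 (mod 13), the pair found for a = 5 satisfies
# x0^2 + y0^2 ≡ 0 (mod 13) with 0 < x0^2 + y0^2 < 26, so the sum is 13.
x0, y0 = thue_pair(5, 13)
assert (5 * x0 - y0) % 13 == 0
assert x0 * x0 + y0 * y0 == 13
```

Because there are $(k + 1)^2 > p$ pairs and only p residues, the loop always returns before it is exhausted.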
In particular, ad = bc. Then a divides the product bc. But a and b
are relatively prime, so by Euclid’s Lemma a divides c. Similarly, c divides
the product ad, and c and d are relatively prime, so by Euclid’s Lemma c
divides a. Hence a and c divide each other so a = c, and then b = d as
well. (In the other case we get u = 0 and v = 1, which yields the other
order a = d and b = c.)
$N = (x + i)(x - i) = Mp.$
$\alpha = a + bi.$
Then
$p = \|\alpha\| = \alpha\bar{\alpha} = (a + bi)(a - bi) = a^2 + b^2$,
and so p is written as a sum of squares, as claimed. (Also, we find that
$\beta = \bar{\alpha} = a - bi$.)
Uniqueness: Suppose $p = a^2 + b^2 = c^2 + d^2$. Then
Now each of these four factors has norm p, and p is an ordinary prime, so
each of these factors is irreducible (Lemma 2.64). In a UFD, primes are
irreducible and vice versa (Lemma 2.55), so each of these factors is prime.
Thus, since $a + b\sqrt{-1}$ is prime and divides the product, it must divide one
of the factors on the right-hand side.
In general, if α divides β, β = αγ, then $\|\beta\| = \|\alpha\| \cdot \|\gamma\|$, and here α and
β each have norm p, so $\|\gamma\| = 1$, which implies γ is a unit (Lemma 1.14).
But by Corollary 1.15 we already know all the units in $\mathcal{O}(\sqrt{-1})$. They are
{±1, ±i}. Thus, we have the possibilities
or
c − di = (a + bi)γ with γ = 1, −1, i, or − i.
Solving for c and d, we see that these give the possibilities (c, d) = (±a, ±b)
or (±b, ±a), where the signs can be chosen independently, so up to order
and requiring both entries positive there is a unique solution.
Remark 3.9. Combining the easy Lemma 3.1 and the trivial observation
$2 = 1^2 + 1^2$ with Theorem 3.6, we see that Fermat’s Theorem can be
rephrased as follows:
Theorem (Fermat). Let p be a prime. Then p can be written as a sum
of squares of integers, $p = x^2 + y^2$ for some integers x and y, if and only
if p = 2 or p ≡ 1 (mod 4), in which case x and y are essentially unique.
Here by “essentially unique” we mean unique up to sign and up to
interchanging the order of x and y.
(We mention this not because it adds anything to what we have already
proved but rather for comparison with the exercises for this chapter.)
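The rephrased theorem is easy to test by brute force for small primes. The following Python sketch lists the representations of n as $x^2 + y^2$ with $0 \le x \le y$ (that is, up to sign and order) and checks that a prime below 200 has exactly one such representation when p = 2 or p ≡ 1 (mod 4), and none otherwise. The helper names `is_prime` and `two_square_reps` are ours; this is a numerical check, not a proof.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def two_square_reps(n):
    """All pairs (x, y) with 0 <= x <= y and x^2 + y^2 = n."""
    reps = []
    for x in range(isqrt(n) + 1):
        y = isqrt(n - x * x)
        if x <= y and x * x + y * y == n:
            reps.append((x, y))
    return reps


for p in filter(is_prime, range(2, 200)):
    expected = 1 if p == 2 or p % 4 == 1 else 0
    assert len(two_square_reps(p)) == expected

print(two_square_reps(41))  # prints [(4, 5)]
```

For composite n the count can exceed one; for instance `two_square_reps(65)` returns both (1, 8) and (4, 7).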
For our purposes, Theorem 3.6 is all we need. But not only did Fermat
show which primes could be represented as sums of two squares, he also
showed which integers could be represented as sums of squares. Since this
is also an interesting result, and since the rest of the work is relatively easy,
we shall finish this section by showing that.
There are now two possibilities: If s < t, then $u^2$ is not divisible by q but
$q^{2(t-s)} v^2$ is divisible by q, so their sum is not divisible by q. If s = t, then
u and v are not divisible by q, and q ≡ 3 (mod 4), so again, by Lemma 3.4,
$u^2 + v^2$ is not divisible by q. Thus, in either case the highest power of q
dividing n is 2s, which is even, as claimed.
(3) for any ordinary prime p ≡ 3 (mod 4), p and its associates pi, −p, and
−pi.
Remark 3.12.
(1) Recall that the associates of α are αε for any unit ε of $\mathcal{O}(\sqrt{-1})$. Since
we determined in Corollary 1.15 that the units of $\mathcal{O}(\sqrt{-1})$ are ±1 and
±i, we have just multiplied the first prime in each list by these units
to obtain the others.
(2) Note that 1 + i and its conjugate 1 − i are associates, but if a and b
are as in Theorem 3.11(2) then a + bi and its conjugate a − bi are not
associates.
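Both parts of this remark can be checked mechanically. In the Python sketch below a Gaussian integer a + bi is stored as the pair (a, b), and the associates of an element are obtained by multiplying by the four units. The names `gmul`, `UNITS`, `associates`, and `are_associates` are ours, chosen for illustration.

```python
def gmul(z, w):
    """Multiply Gaussian integers stored as (real, imaginary) pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # 1, -1, i, -i


def associates(z):
    """The set of associates of z, i.e. z times each unit."""
    return {gmul(z, u) for u in UNITS}


def are_associates(z, w):
    return w in associates(z)


# 1 + i and 1 - i are associates: 1 - i = -i(1 + i).
assert are_associates((1, 1), (1, -1))
# 5 = 2^2 + 1^2, and 2 + i is not an associate of its conjugate 2 - i.
assert not are_associates((2, 1), (2, -1))
```

The same check with (3, 2) and (3, -2), coming from $13 = 3^2 + 2^2$, also reports that the element and its conjugate are not associates.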
Proof: In this theorem we are making two claims: first, that every Gaussian
integer on this list is a prime, and second, that every prime in the Gaussian
integers is on this list. Both of these claims are consequences of our earlier
work.
If $\alpha = 1 + i$ then $\|\alpha\| = 2$ and if $\alpha = a + bi$ as in (2a) or $\alpha = a - bi$
as in (2b) then $\|\alpha\| = p$. Thus, in either case $\|\alpha\|$ is an ordinary prime, so
by Lemma 2.64(1) α is irreducible. If p ≡ 3 (mod 4), then $\mathcal{O}(\sqrt{-1})$ has no
element β of norm p (as if $\beta = x + yi$, $p = \|\beta\| = x^2 + y^2$, and we know
from Lemma 3.1 that this is impossible). Then, setting γ = p, we have
$\|\gamma\| = p^2$, so γ is irreducible by Lemma 2.64(2). Finally, since $\mathcal{O}(\sqrt{-1})$ is
a UFD, the primes and the irreducibles are the same (Lemma 2.55), so we
have proved our first claim.
Now suppose α is prime, or equivalently, irreducible. Let $\alpha = a + bi$.
Then $\bar{\alpha} = a - bi$ is also irreducible, or equivalently prime. (If we had
$\bar{\alpha} = \beta\gamma$ then we would have $\alpha = \bar{\beta}\bar{\gamma}$.) Let
$N = \|\alpha\| = \alpha\bar{\alpha} = a^2 + b^2.$
Remark 3.14. Let us now shift our point of view from factoring in the
Gaussian integers and ask “what happens” to ordinary primes when we go
from the ordinary integers to the Gaussian integers. Looking at Theorem
2.1, we can see three sorts of behavior:
(1) The ordinary prime 2 is (up to a unit) the square of a prime in $\mathcal{O}(\sqrt{-1})$:
$2 = -i(1 + i)^2$. The ordinary prime 2 is said to ramify in $\mathcal{O}(\sqrt{-1})$.
3.3 Exercises
Compare Exercises 3.1–3.5 with Remark 3.9, and Exercises 3.6–3.10 with
Theorem 3.10. In Exercises 3.11, 3.12, and 3.13 “essentially” means up to
sign and order and in the remaining exercises “essentially” means up to
sign.
Exercise 3.1. Use the fact that $\mathcal{O}(\sqrt{-2})$ is a UFD and the fact that, for
a prime $p \ne 2$, −2 is a quadratic residue (mod p) if and only if p ≡ 1 or
3 (mod 8), as shown in Corollary B.37, to prove the following:
Exercise 3.2.
(a) Use the fact that $\mathcal{O}(\sqrt{-3})$ is a UFD and the fact that for an odd prime
$p \ne 3$, −3 is a quadratic residue (mod p) if and only if p ≡ 1 (mod 3),
by Corollary B.41, a corollary of the Law of Quadratic Reciprocity
(Theorem B.40), to prove the following:
Exercise 3.3.
(a) Use the fact that $\mathcal{O}(\sqrt{-7})$ is a UFD and the fact that for an odd
prime $p \ne 7$, −7 is a quadratic residue (mod p) if and only if p ≡ 1, 2,
or 4 (mod 7), by Corollary B.41, a corollary of the Law of Quadratic
Reciprocity (Theorem B.40), to prove the following:
Exercise 3.4.
(a) Use the fact that $\mathcal{O}(\sqrt{2})$ is a UFD and the fact that, for a prime $p \ne 2$,
2 is a quadratic residue (mod p) if and only if p ≡ 1 or 7 (mod 8), as
shown in Corollary B.37, to prove the following:
Exercise 3.5.
(a) Use the fact that $\mathcal{O}(\sqrt{5})$ is a UFD, the fact that 2 is not a quadratic
residue (mod 5), and the fact that for an odd prime $p \ne 5$, 5 is a
quadratic residue (mod p) if and only if p ≡ 1 or 4 (mod 5), by the Law
of Quadratic Reciprocity (Theorem B.40), to prove the following:
Exercise 3.11. Let n be a positive integer. Show that the number of es-
sentially different ways of writing n as a sum of squares of two integers is
equal to the number of essentially different ways of writing 2n as a sum of
squares of two integers.
(b) For a nonnegative integer k, show that $p^{2k}$ and $p^{2k+1}$ can each be
written as a sum of squares of two integers in k + 1 essentially different
ways.
Exercise 3.13.
Exercise 3.14. Let n be a positive integer. Show that the number of essentially different ways of writing n in the form $x^2 + 2y^2$ for integers x and
y is equal to the number of essentially different ways of writing 2n in the
form $x^2 + 2y^2$ for integers x and y.
(b) For a nonnegative integer k, show that $p^{2k}$ and $p^{2k+1}$ can each be
written in the form $x^2 + 2y^2$ for integers x and y in k + 1 essentially
different ways.
Exercise 3.16.
Clearly, Exercises 3.14, 3.15, and 3.16 are the analogs for D = −2 of
Exercises 3.11, 3.12, and 3.13 for D = −1, and clearly, there are analogs
of these exercises for D = −3 and D = −7. There are no such analogs
for D = 5, as in this case there are infinitely many representations (a
consequence of our study of Pell’s equation in Chapter 4).
Chapter 4
Pell’s Equation
$a^2 - b^2 D = 1$,
a = 1766319049, b = 226153980.
Pell’s equation has a long history. A method for solving it, called the
cakravala method, was developed by Indian mathematicians in the seventh
to twelfth centuries. Fermat (who was certainly unaware of this work) was
the first western mathematician both to find a method and to prove that
it always works. The full theory of Pell’s equation is due to Lagrange,
using the method of continued fractions. There is also a twentieth-century
approach using Diophantine approximation. (Pell, however, had nothing
to do with it. His name is attached to this equation because of Euler’s
mistaken belief that he did.)
The approach using Diophantine approximation gives a very quick proof
that Pell’s equation always has infinitely many solutions, but it has the
serious disadvantage that it does not provide any method, other than trial
and error, for finding them.
While the method of continued fractions gives the full theory, including
an effective method for finding all solutions, it requires developing the the-
while
(3) By definition,
while
Definition 4.4. The reduced composition (a, b) ∗r (c, d) of (a, b) and (c, d) is
(In other words, to obtain the reduced composition of (a, b) and (c, d), we
first compose (a, b) and (c, d) and then reduce the result.)
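As a concrete illustration of "compose, then reduce", here is a short Python sketch. It assumes the composition formula $(a, b) * (c, d) = (ac + bdD,\ ad + bc)$, which is consistent with the computation $(a, -b) * (c, d) = (ac - bdD,\ ad - bc)$ used later in this chapter, and it takes reduction to mean division of both entries by their gcd. The function names are ours.

```python
from math import gcd


def compose(p, q, D):
    """Assumed composition: (a, b) * (c, d) = (ac + bdD, ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d * D, a * d + b * c)


def reduce_pair(p):
    """Divide both entries by their gcd (entries assumed not both zero)."""
    e, f = p
    s = gcd(e, f)
    return (e // s, f // s)


def reduced_compose(p, q, D):
    """Reduced composition: compose first, then reduce."""
    return reduce_pair(compose(p, q, D))


# With D = 13: (4, 1) represents 3 and (5, 1) represents 12.
print(compose((4, 1), (5, 1), 13))          # prints (33, 9), representing 36
print(reduced_compose((4, 1), (5, 1), 13))  # prints (11, 3), representing 4
```

Here the composite (33, 9) has gcd 3, and dividing it out replaces the represented integer 36 by 36/9 = 4.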
We now give several properties of reduced composition.
Lemma 4.5.
(3) If (a, b) and (c, d) are reduced and (a, b)∗r (c, d) = ±(1, 0), then (c, d) =
±(a, b).
Proof:
(1) By definition,
(2) $(a, b) *_r (c, d) = ((a, b) * (c, d))_{\mathrm{red}} = ((c, d) * (a, b))_{\mathrm{red}} = (c, d) *_r (a, b)$.
(3) If (a, b) ∗r (c, d) = ±(1, 0), then (a, b) ∗ (c, d) = (e, 0) for some e, so in
particular ad + bc = 0.
Then ad = −bc, so in particular a divides −bc. Now a and b are
relatively prime, as (a, b) is assumed to be reduced. So, by Euclid’s
Lemma, a divides c, i.e., c = ka for some integer k. Then ad = −bc = −b(ka),
so d = −bk. Thus (c, d) = (ka, −kb) = k(a, −b). But (c, d) is assumed
to be reduced, so k = ±1.
(4) Let (e, f) = (a, b) ∗ (c, d). Then (te, tf) = (a, b) ∗ t(c, d). Now $(e, f)_{\mathrm{red}} = (a, b) *_r (c, d) = (e/s, f/s)$ where $s = \gcd(e, f)$, while $(te, tf)_{\mathrm{red}} = (a, b) *_r t(c, d) = (te/s', tf/s')$ where $s' = \gcd(te, tf)$. But then we
have that $s' = \gcd(te, tf) = t \gcd(e, f) = ts$, so
Corollary 4.6.
and
((a, b) ∗r (c, d)) ∗r (e, f ) = (a, b) ∗r ((c, d) ∗r (e, f )).
(2) If (a, b) is reduced and (a, b) ∗r (x, y) = (a, b), then (x, y)red = ±(1, 0).
Proof:
(1) The first claim follows immediately from Lemma 4.5(3). Let
Let us call attention to what Corollary 4.6 says. The first claim is that
if we wish to compute the reduced composition of (a, b), (c, d), and (e, f ),
we can either reduce at each stage or simply compute the composition at
each stage and reduce at the end—it makes no difference. The same holds
for any number of representations:
The second claim is that reduced composition is associative, and the third
claim is a cancellation result for reduced composition, up to sign.
In developing this theory of composition and reduction, we should not
lose sight of our goal: to solve Pell’s equation $a^2 - b^2 D = 1$. In our
language, this is finding a representation $1 = a^2 - b^2 D$. Of course, 1
has the representations $1 = 1^2 - 0^2 D = (-1)^2 - 0^2 D$. We call these
two representations trivial representations and all other representations
of 1 nontrivial representations. So in fact, we are looking for nontrivial
representations of 1.
Lemma 4.7. Let (a, b) and (c, d) both be reduced and suppose that (a, b) and
(c, d) represent the same integer m. Suppose also that a ≡ c (mod m) and
b ≡ d (mod m). Let
Proof: Let $(E, F) = \overline{(a, b)} * (c, d) = (a, -b) * (c, d) = (ac - bdD,\ ad - bc)$.
Then (E, F) represents $m^2$.
We claim that gcd(E, F) = m.
Let us begin by seeing that each of E and F is divisible by m.
First F : Since a ≡ c (mod m) and d ≡ b (mod m), ad ≡ cb (mod m),
ad − bc ≡ 0 (mod m), i.e., m divides F = ad − bc.
Next E: Since (a, b) represents m, $m = a^2 - b^2 D$, so $a^2 - b^2 D \equiv 0 \pmod m$, $a(a) - b(b)D \equiv 0 \pmod m$. But c ≡ a (mod m) and d ≡
b (mod m), so ac − bdD ≡ 0 (mod m), i.e., m divides E = ac − bdD.
Thus we see that (E/m, F/m) represents 1. But, as we observed, this
automatically shows that (E/m, F/m) is reduced, i.e., gcd(E/m, F/m) =
1, and so m = gcd(E, F ). But then
In the next section, we will show that we can always find (a, b) and
(c, d) satisfying the hypotheses of Lemma 4.7, and that will give us a single
nontrivial solution of Pell’s equation. But in fact, once we have a single
nontrivial solution, we have infinitely many, as we see from the next lemma.
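A small numerical instance shows the mechanism of Lemma 4.7 at work. For D = 13 the pairs (4, 1) and (4936, 1369) both represent 3 and are congruent entry by entry mod 3. Composing the conjugate of the first with the second, as in the proof, and dividing by the gcd gives a solution of Pell's equation. The Python sketch below verifies this; the function names are ours, and the formula $(ac - bdD,\ ad - bc)$ is taken from the proof above.

```python
from math import gcd


def represents(p, D):
    """The integer a^2 - b^2 D represented by the pair (a, b)."""
    a, b = p
    return a * a - b * b * D


def conj_reduced_compose(p, q, D):
    """Reduce (a, -b) * (c, d) = (ac - bdD, ad - bc) by its gcd."""
    (a, b), (c, d) = p, q
    E, F = a * c - b * d * D, a * d - b * c
    s = gcd(E, F)
    return (E // s, F // s)


D, p, q = 13, (4, 1), (4936, 1369)
m = represents(p, D)
assert m == represents(q, D) == 3                                  # same integer m
assert (p[0] - q[0]) % m == 0 and (p[1] - q[1]) % m == 0          # congruent mod m

solution = conj_reduced_compose(p, q, D)
print(solution, represents(solution, D))  # prints (649, 180) 1
```

Here (E, F) = (1947, 540) represents $m^2 = 9$ and has gcd 3, so dividing by 3 yields $649^2 - 13 \cdot 180^2 = 1$.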
Lemma 4.8. Let (e, f) represent 1 nontrivially (i.e., $(e, f) \ne \pm(1, 0)$). Let
Then
$(e, f),\ (e, f)^2,\ (e, f)^3,\ (e, f)^4,\ \ldots$
$\overline{(e, f)},\ \overline{(e, f)}^2,\ \overline{(e, f)}^3,\ \overline{(e, f)}^4,\ \ldots$
all represent 1 nontrivially and are all distinct.
Proof: If (e, f) represents 1, then (e, f) ∗ (e, f) represents 1 · 1, i.e., $(e, f)^2$
represents 1, and then $(e, f)^2 * (e, f)$ represents 1 · 1, i.e., $(e, f)^3$ represents 1,
etc., so $(e, f)^n$ represents 1 for every n ≥ 1. But then $\overline{(e, f)}^n$ also represents
1 for every n ≥ 1.
We must show that they are all nontrivial and all distinct.
Replacing (e, f) by (e, −f), (−e, f), or (−e, −f), if necessary, we may
assume that e > 0 and f > 0. Write $(e_k, f_k) = (e, f)^k$. We claim that
$f_1, f_2, f_3, \ldots$ are a strictly increasing sequence of positive integers (and
hence that $e_1, e_2, e_3, \ldots$ are a strictly increasing sequence of positive integers), which shows that $(e, f), (e, f)^2, (e, f)^3, \ldots$ are all nontrivial and
distinct. But also $\overline{(e, f)}^k = (e_k, -f_k)$, so all of these are nontrivial and
distinct.
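The growth of the powers is easy to observe numerically. The Python sketch below again assumes the composition formula $(a, b) * (c, d) = (ac + bdD,\ ad + bc)$ and lists the first few powers of the solution (3, 2) for D = 2. The function names are ours.

```python
def compose(p, q, D):
    """Assumed composition: (a, b) * (c, d) = (ac + bdD, ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d * D, a * d + b * c)


def powers(sol, D, count):
    """Return [sol, sol^2, ..., sol^count] under composition."""
    out, cur = [], sol
    for _ in range(count):
        out.append(cur)
        cur = compose(cur, sol, D)
    return out


D = 2
sols = powers((3, 2), D, 4)
print(sols)  # prints [(3, 2), (17, 12), (99, 70), (577, 408)]

# Every power represents 1, and the second entries increase strictly.
assert all(e * e - f * f * D == 1 for e, f in sols)
assert all(f1 < f2 for (_, f1), (_, f2) in zip(sols, sols[1:]))
```

Negating the second entry of each pair gives the conjugate powers $(e_k, -f_k)$, which are again solutions.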
We show this claim by direct computation, using induction. By as-
sumption, in case k = 1, (e1 , f1 ) = (e, f ) and e and f are positive integers.
Now suppose ek and fk are positive integers. Then
so
$a = b\sqrt{D + 1/b^2}$
$a/b = \sqrt{D + 1/b^2}$.
Now for an arbitrary choice of b, $D + 1/b^2$ will not have a rational square
root, so a/b will not be a rational number. We want to find a value of b
for which it does, but there is no a priori guarantee that such a value of b
exists. However, we do observe that for a solution (a, b), the ratio a/b will
be close to $\sqrt{D}$, which we write as $a/b \sim \sqrt{D}$, and furthermore, as b gets
larger the ratio a/b gets closer to $\sqrt{D}$.
So let’s start with a guess for a and b that has a/b reasonably close to
$\sqrt{D}$. Indeed, let us start with a pair (a, b) and set $e = a^2 - b^2 D$, so (a, b)
represents e. We would like to get e = 1, but let’s settle for the moment for
keeping |e| small. (What “small” means turns out to be a delicate question,
but we will save that for later.)
We would like to get a better guess (A, B) and we will try to do so
(with hindsight gained from working many examples) by setting
B = a + bx
A = ax + bD.
$a' = \left|\dfrac{ax + bD}{e}\right|,$
$b' = \left|\dfrac{a + bx}{e}\right|.$
Then $a'$ and $b'$ are relatively prime, nonnegative integers.
(2) With this value of x, $a'$ and $b'$ are relatively prime, nonnegative integers.
$a + by \equiv 0 \pmod e$
has exactly one solution $y_0$ (mod e). But since $x_0, x_0 + 1, \ldots, x_0 + |e| - 1$
are |e| consecutive integers, exactly one of them is congruent to $y_0$ (mod e).
Choose that one and call it x.
Now for the second claim.
Let A = ax + bD and B = a + bx. By our choice of x, B ≡ 0 (mod e)
so $b' = |B/e|$ is a nonnegative integer. But also
A = ax + bD and B = a + bx,
and so
$e' = (A/e)^2 - (B/e)^2 D$
$= (A^2 - B^2 D)/e^2$
$= (1/e)(x^2 - D).$
$(\sqrt{D} - |e|/2)^2 < x^2 < (\sqrt{D} + |e|/2)^2,$
$D - |e|\sqrt{D} + |e|^2/4 < x^2 < D + |e|\sqrt{D} + |e|^2/4,$
$-|e|\sqrt{D} + |e|^2/4 < x^2 - D < |e|\sqrt{D} + |e|^2/4.$
In case (1),
$|e'| = \left|\dfrac{x^2 - D}{e}\right| = \dfrac{|x^2 - D|}{|e|} < \sqrt{D} + |e|/4 < \sqrt{D} + 2\sqrt{D}/4 < 2\sqrt{D},$
Proof: We follow the notation of the proof of Lemma 4.9 and Lemma 4.10.
We have by Lemma 4.9 that
Then, by definition,
where d = gcd(A, B). But we have seen that |e| divides both A and B, and
by the properties of the gcd,
as claimed.
Finally, if $|e| = |a^2 - b^2 D| < 2\sqrt{D}$, then also $|e'| = |(a')^2 - (b')^2 D| < 2\sqrt{D}$ by Lemma 4.10.
We think of using Lemma 4.9 recursively. That is, we start with a pair
(a, b), apply Lemma 4.9 to (a, b) to get a pair $(a', b')$, apply Lemma 4.9
again to $(a', b')$ to get a pair $(a'', b'')$, . . . . This works for Lemma 4.10, or
for Corollary 4.11, in a similar way, but here we must start out with a pair
(a, b) with $e = a^2 - b^2 D$ having $|e| < 2\sqrt{D}$. It may seem difficult to find
such a pair, but in fact it is trivial: we choose (a, b) = (1, 0)!
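The recursion can be sketched in a few lines of Python. The code below assumes the recipe of Lemma 4.9 as we read it from the surrounding text: among the |e| integers x with $\sqrt{D} - |e|/2 < x < \sqrt{D} + |e|/2$, pick the one with $a + bx \equiv 0 \pmod e$, then set $a' = |ax + bD|/|e|$ and $b' = |a + bx|/|e|$. The function names, the integer-only computation of the search range, and the stopping rule (stop the first time 1 is represented) are our own choices.

```python
from math import isqrt


def pell_step(a, b, D):
    """One step (a, b) -> (a', b'); also return the x that was used."""
    e = a * a - b * b * D
    n = abs(e)
    # Smallest integer x with x > sqrt(D) - n/2, i.e. 2x + n > sqrt(4D).
    # D is not a perfect square, so this means 2x + n >= isqrt(4D) + 1.
    x_lo = (isqrt(4 * D) + 2 - n) // 2
    for x in range(x_lo, x_lo + n):  # exactly n consecutive candidates
        if (a + b * x) % n == 0:
            return abs(a * x + b * D) // n, abs(a + b * x) // n, x
    raise ValueError("no admissible x found; is (a, b) reduced?")


def pell_solution(D, max_steps=1000):
    """Iterate from (1, 0) until a pair representing 1 appears."""
    a, b = 1, 0
    for _ in range(max_steps):
        a, b, _x = pell_step(a, b, D)
        if a * a - b * b * D == 1:
            return a, b
    raise RuntimeError("no solution found within max_steps")


print(pell_solution(13))  # prints (649, 180)
```

For D = 13 the successive pairs are (4, 1), (11, 3), (18, 5), (137, 38), (393, 109), (649, 180), with the values of x being 4, 5, 3, 4, 5, 3.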
Let us establish some notation.
Definition 4.12. Let (a, b) and $(a', b')$ be as in Lemma 4.9. We write
$(a', b') = \mathcal{P}(a, b).$
If furthermore $(a'', b'') = \mathcal{P}(a', b')$ we write $(a'', b'') = \mathcal{P}^2(a, b)$, etc.
$(a_i, b_i) = \mathcal{P}^i(1, 0)$
$e_i = a_i^2 - b_i^2 D.$
Corollary 4.11 gives a value of x for which $(a_{i+1}, b_{i+1}) = (a_i, b_i) *_r (x, 1)$,
and we denote that value of x by $x_{i+1}$, so
As a practical matter, to employ Lemma 4.9 we must search among the
|e| values of x with $\sqrt{D} - |e|/2 < x < \sqrt{D} + |e|/2$ for the one value of x
that makes a + bx ≡ 0 (mod e). But if we employ this method recursively,
we may use the following relation.
Lemma 4.14. In the above notation,
Proof: We are looking for a value of $x_{i+1}$ that in particular satisfies the
congruence $a_i + b_i x_{i+1} \equiv 0 \pmod{e_i}$, a congruence with a unique solution
mod $e_i$. But we know that
Theorem 4.15. Let D be any positive integer that is not a perfect square.
Then Pell’s equation $a^2 - b^2 D = 1$ has infinitely many solutions.
are all distinct. To see this, suppose we choose any two nonnegative integers
s < t. Recall that, by Corollary 4.11, and by Corollary 4.6(1),
where
$(X, Y) = (x_{s+1}, 1) *_r (x_{s+2}, 1) *_r \cdots *_r (x_t, 1).$
Note that each $x_i > 0$, so $Y \ne 0$ (by the formula for composition) and
hence $(X, Y) \ne \pm(1, 0)$. Then, by the contrapositive of Corollary 4.6(2),
$(a_t, b_t) \ne (a_s, b_s)$.
Now we shall show that Pell’s equation has a single nontrivial solution.
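For each i, let $\bar{a}_i \equiv a_i \pmod{e_i}$ with $0 \le \bar{a}_i < |e_i|$ and let $\bar{b}_i \equiv b_i \pmod{e_i}$
with $0 \le \bar{b}_i < |e_i|$. Consider the triples of integers
$\alpha^2 - \beta^2 D = 1$
$(\alpha, \beta),\ (\alpha, \beta)^2 = (\alpha, \beta) * (\alpha, \beta),\ (\alpha, \beta)^3 = (\alpha, \beta)^2 * (\alpha, \beta),\ \ldots$
Then, setting
$(\alpha_i, \beta_i) = (a_{s_0}, b_{s_0}) *_r (a_{s_i}, b_{s_i})$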
Definition 4.16. For a fixed value of D, the period of the sequence $\{e_i\}$,
i.e., the smallest value of k for which $e_{i+k} = e_i$ for every i ≥ 0, is called
the small period of D, denoted by k = p(D).
Definition 4.17. For a fixed value of D, the period of the sequence $\{(e_i, \bar{a}_i, \bar{b}_i)\}$,
where $\bar{a}_i$ and $\bar{b}_i$ denote the residues of $a_i$ and $b_i$ mod $|e_i|$,
i.e., the smallest value of k for which $(e_{i+k}, \bar{a}_{i+k}, \bar{b}_{i+k}) = (e_i, \bar{a}_i, \bar{b}_i)$ for every
i ≥ 0, is called the large period of D, denoted by k = P(D).
i   a_i   b_i   e_i   (a_i mod |e_i|)   (b_i mod |e_i|)   x_i
0 1 0 1 0 0
1 4 1 3 1 1 4
2 11 3 4 3 3 5
3 18 5 −1 0 0 3
4 137 38 −3 2 2 4
5 393 109 −4 1 1 5
6 649 180 1 0 0 3
7 4936 1369 3 1 1 4
8 14159 3927 4 3 3 5
9 23382 6485 −1 0 0 3
10 177833 49322 −3 2 2 4
11 510117 141481 −4 1 1 5
12 842401 233640 1 0 0 3
$(a_{i+1}, b_{i+1}) = \left(\left|\dfrac{a_i x_{i+1} + b_i D}{e_i}\right|,\ \left|\dfrac{a_i + b_i x_{i+1}}{e_i}\right|\right) = (a_i, b_i) *_r (x_{i+1}, 1).$
Suppose we had not used absolute values and instead had defined
$(a_{i+1}, b_{i+1}) = \left(\dfrac{a_i x_{i+1} + b_i D}{e_i},\ \dfrac{a_i + b_i x_{i+1}}{e_i}\right).$
Then we would have had $(a_{i+1}, b_{i+1}) = (a_i, b_i) *_r (x_{i+1}, 1)$ or $(a_{i+1}, b_{i+1}) = (a_i, b_i) *_r -(x_{i+1}, 1)$. This would not have changed the small period p(D),
but might have changed the large period. Call the new value $P'(D)$.
Then it could only have changed P(D) by at most a factor of two, i.e.,
$P'(D) = P(D)$, $P'(D) = 2P(D)$, or $P'(D) = (1/2)P(D)$, and all three
possibilities occur.
i   a_i   b_i   e_i   (a_i mod |e_i|)   (b_i mod |e_i|)   x_i
0 1 0 1 0 0
1 4 1 −3 1 1 4
2 13 3 −2 1 1 5
3 61 14 −3 1 2 5
4 170 39 1 0 0 4
5 1421 326 −3 2 2 4
6 4433 1017 −2 1 1 5
7 20744 4759 −3 2 1 5
8 57799 13260 1 0 0 4
9 483136 110839 −3 1 1 4
10 1507207 345777 −2 1 1 5
11 7052899 1618046 −3 1 2 5
12 19651490 4508361 1 0 0 4
13 164264819 37684934 −3 2 2 4
14 512445947 117563163 −2 1 1 5
15 2397964916 550130881 −3 2 1 5
16 6681448801 1532829480 1 0 0 4
i   a_i   b_i   e_i   (a_i mod |e_i|)   (b_i mod |e_i|)   x_i
0 1 0 1 0 0
1 5 1 4 1 1 5
2 9 2 −3 0 2 3
3 32 7 −5 2 2 6
4 55 12 1 0 0 4
5 527 115 4 3 3 5
6 999 218 −3 0 2 3
7 3524 769 −5 4 4 6
8 6049 1320 1 0 0 4
9 57965 12649 4 1 1 5
10 109881 23978 −3 0 2 3
11 387608 84583 −5 3 3 6
12 665335 145188 1 0 0 4
13 6375623 1391275 4 3 3 5
14 12085911 2637362 −3 0 2 3
15 42633356 9303361 −5 1 1 6
16 73180801 15969360 1 0 0 4
i   a_i   b_i   e_i   (a_i mod |e_i|)   (b_i mod |e_i|)   x_i
0 1 0 1 0 0
1 5 1 −4 1 1 5
2 16 3 −5 1 3 7
3 27 5 4 3 1 3
4 70 13 −1 0 0 5
5 727 135 4 3 3 5
6 2251 418 5 1 3 7
7 3775 701 −4 3 1 3
8 9801 1820 1 0 0 5
9 101785 18901 −4 1 1 5
10 315156 58523 −5 1 3 7
11 528527 98145 4 3 1 3
12 1372210 254813 −1 0 0 5
13 14250627 2646275 4 3 3 5
14 44124091 8193638 5 1 3 7
15 73997555 13741001 −4 3 1 3
16 192119201 35675640 1 0 0 5
17 1995189565 370497401 −4 1 1 5
18 6177687896 1147167843 −5 1 3 7
19 10360186227 1923838285 4 3 1 3
20 26898060350 4994844413 −1 0 0 5
21 279340789727 51872282415 4 3 3 5
22 864920429531 160611691658 5 1 3 7
23 1450500069335 269351100901 −4 3 1 3
24 3765920568201 699313893460 1 0 0 5
25 39109705751345 7262490035501 −4 1 1 5
26 121095037822236 22486783999963 −5 1 3 7
27 203080369893127 37711077964425 4 3 1 3
28 527255777608490 97908939928813 −1 0 0 5
29 5475638145978027 1016800477252555 4 3 3 5
30 16954170215542571 3148310371686478 5 1 3 7
31 28432702285107115 5279820266120401 −4 3 1 3
32 73819574785756801 13707950903927280 1 0 0 5
i   a_i   b_i   e_i   (a_i mod |e_i|)   (b_i mod |e_i|)   x_i
0 1 0 1 0 0
1 6 1 5 1 1 6
2 11 2 −3 2 2 4
3 39 7 2 1 1 5
4 206 37 −3 2 1 5
5 863 155 −6 5 5 7
6 1520 273 1 0 0 5
7 17583 3158 5 3 3 6
8 33646 6043 −3 1 1 4
9 118521 21287 2 1 1 5
10 626251 112478 −3 1 2 5
11 2623525 471199 −6 1 1 7
12 4620799 829920 1 0 0 5
13 53452314 9600319 5 4 4 6
14 102283829 18370718 −3 2 2 4
15 360303801 64712473 2 1 1 5
16 1903802834 341933083 −3 2 1 5
17 7975515137 1432444805 −6 5 5 7
18 14047227440 2522956527 1 0 0 5
19 162495016977 29184966602 5 2 2 6
20 310942806514 55846976677 −3 1 1 4
21 1095323436519 196725896633 2 1 1 5
22 5787559989109 1039476459842 −3 1 2 5
23 24245563392955 4354631736001 −6 1 1 7
24 42703566796801 7669787012160 1 0 0 5
i   a_i   b_i   e_i   (a_i mod |e_i|)   (b_i mod |e_i|)   x_i
0 1 0 1 0 0
1 8 1 6 2 1 8
2 23 3 7 2 3 10
3 61 8 9 7 8 11
4 99 13 −1 0 0 7
5 1546 203 −6 4 5 8
6 4539 596 −7 3 1 10
7 12071 1585 −9 2 1 11
8 19603 2574 1 0 0 7
9 306116 40195 6 2 1 8
10 898745 118011 7 1 5 10
11 2390119 313838 9 7 8 11
12 3881493 509665 −1 0 0 7
13 60612514 7958813 −6 4 5 8
14 177956049 23366774 −7 5 4 10
15 473255633 62141509 −9 2 1 11
16 768555217 100916244 1 0 0 7
17 12001583888 1575885169 6 2 1 8
18 35236196447 4626739263 7 4 6 10
19 93707005453 12304332620 9 7 8 11
20 152177814459 19981925977 −1 0 0 7
21 2376374222338 312033222275 −6 4 5 8
22 6976944852555 916117740848 −7 6 2 10
23 18554460335327 2436320000269 −9 1 1 11
24 30131975818099 3956522259690 1 0 0 7
i   a_i   b_i   e_i   (a_i mod |e_i|)   (b_i mod |e_i|)   x_i
0 1 0 1 0 0
1 8 1 3 2 1 8
2 39 5 −4 3 1 7
3 164 21 −5 4 1 9
4 453 58 5 3 3 6
5 1523 195 4 3 3 9
6 5639 722 −3 2 2 7
7 29718 3805 −1 0 0 8
8 469849 60158 −3 1 2 8
9 2319527 296985 4 3 1 7
10 9747957 1248098 5 2 3 9
11 26924344 3447309 −5 4 4 6
12 90520989 11590025 −4 1 1 9
13 335159612 42912791 3 2 2 7
14 1766319049 226153980 1 0 0 8
15 27925945172 3575550889 3 2 1 8
16 137863406811 17651600465 −4 3 1 7
17 579379572416 74181952749 −5 1 4 9
18 1600275310437 204894257782 5 2 2 6
19 5380205503727 688864726095 4 3 3 9
20 19920546704471 2550564646598 −3 2 2 7
21 104982939026082 13441687959085 −1 0 0 8
22 1659806477712841 212516442698762 −3 1 2 8
23 8194049449538123 1049140525534725 4 3 1 7
24 34436004275865333 4409078544837662 5 3 2 9
25 95113963378057876 12178095108978261 −5 1 1 6
26 319777894410038961 40943363871772445 −4 1 1 9
27 1183997614262097968 151595360378111519 3 2 2 7
28 6239765965720528801 798920165762330040 1 0 0 8
i   a_i   b_i   e_i   (a_i mod |e_i|)   (b_i mod |e_i|)   x_i
0 1 0 1 0 0
1 10 1 9 1 1 10
2 19 2 −3 1 2 8
3 124 13 −3 1 1 10
4 849 89 −10 9 9 11
5 1574 165 1 0 0 9
6 30755 3224 9 2 2 10
7 59936 6283 −3 2 1 8
8 390371 40922 −3 2 2 10
9 2672661 280171 −10 1 1 11
10 4954951 519420 1 0 0 9
11 96816730 10149151 9 4 4 10
12 188678509 19778882 −3 1 2 8
13 1228887784 128822443 −3 1 1 10
14 8413535979 881978219 −10 9 9 11
15 15598184174 1635133995 1 0 0 9
16 304779035285 31949524124 9 8 8 10
17 593959886396 62263914253 −3 2 1 8
18 3868538353661 405533009642 −3 2 2 10
19 26485808589231 2776467153241 −10 1 1 11
20 49103078824801 5147401296840 1 0 0 9
21 959444306260450 100577091793201 9 7 7 10
22 1869785533696099 196006782289562 −3 1 2 8
23 12178157508437044 1276617785530573 −3 1 1 10
24 83377317025363209 8740317716424449 −10 9 9 11
25 154576476542289374 16204017647318325 1 0 0 9
26 3020330371328861310 316616653015472624 9 5 5 10
27 5886084266115433250 617029288383626923 −3 2 1 8
28 38336835968021460851 4018792383317234162 −3 2 2 10
29 262471767510034792701 27514517394837012211 −10 1 1 11
30 486606699052048124551 51010242406356790260 1 0 0 9
Table 4.9. Small and large periods for values of D between 2 and 99.
Lemma 4.20.
represents 1.
represents 1.
Proof:
Lemma 4.21. Let (a, b) and (c, d) both be reduced and suppose that, for
some integer m, (a, b) represents m and (c, d) represents −m. Suppose also
that a ≡ c (mod m) and b ≡ d (mod m). Then (a, b) ∗r (c, d)
represents −1.
Proof: This is almost identical to the proof of Lemma 4.7. $(E, F) = (a, b) * (c, d)$ represents $-m^2$ and then $\gcd(E, F) = |m|$, so $(e, f) = \frac{1}{|m|}((a, b) * (c, d)) = (a, b) *_r (c, d)$ represents $-1$.
We also make the trivial but useful observation that if (a, b) represents
m, then (a, −b), (−a, b), and (−a, −b) also represent m.
Now let us look at some examples (and we refer the reader to Tables 4.1–
4.8).
Example 4.22.
4.4 Units in $O(\sqrt{D})$
In this section we assume that $D$ is a square-free positive integer, $D \neq 1$.
Our objective is to find all units of $R = O(\sqrt{D})$.
Definition 4.23. Let $D$ be a square-free positive integer, $D \neq 1$, and let
$R = O(\sqrt{D})$. Among all units $\varepsilon = c + d\sqrt{D}$ of $R$, let $\varepsilon_0$ be defined as
follows: $\varepsilon_0 = c_0 + d_0\sqrt{D}$ where $d_0$ is the minimum positive value of $d$ and
$c_0$ is positive. In almost all cases this determines $c_0$ uniquely, but if not,
Let us observe that this definition makes sense. First note that, since
Pell’s equation $x^2 - y^2 D = 1$ has a nontrivial solution, there is a unit
$c + d\sqrt{D}$ of $R$ with $c$ and $d$ both positive (and in fact, there are infinitely
many such units). Next note that $\varepsilon = c + d\sqrt{D} = (a + b\sqrt{D})/2$ where $a$
and $b$ are both even integers, if $D \equiv 2$ or $3 \pmod 4$, or where $a$ and $b$ are
either both even integers or both odd integers, if $D \equiv 1 \pmod 4$, and in
each case the possible values of $b$ (and if necessary, the possible values of
$a$) are well-ordered, as we require that $b$ and $a$ be positive.
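The definition can be made concrete with a small search. The following Python sketch is an illustration only, not a method taken from the text: it writes a unit as $(a + b\sqrt{D})/2$ with positive integers $a$ and $b$ and looks for the smallest $b$ for which $a^2 - Db^2 = \pm 4$, preferring the smaller $a$ when both signs work. The function name `fundamental_unit` is hypothetical, and the code assumes that $D > 1$ is square-free.

```python
from math import isqrt

def fundamental_unit(D):
    """Return (a, b) such that (a + b*sqrt(D))/2 is the unit with least b > 0.

    Assumes D > 1 is square-free. For such D, positive integers a and b give
    a unit of O(sqrt(D)) exactly when a*a - D*b*b equals 4 or -4.
    """
    b = 1
    while True:
        # Try -4 first: for a fixed b it gives the smaller value of a.
        for rhs in (-4, 4):
            n = D * b * b + rhs
            if n > 0:
                a = isqrt(n)
                if a * a == n:
                    return a, b
        b += 1

print(fundamental_unit(2))   # (2, 2), i.e. 1 + sqrt(2)
print(fundamental_unit(5))   # (1, 1), i.e. (1 + sqrt(5))/2
print(fundamental_unit(61))  # (39, 5), i.e. (39 + 5*sqrt(61))/2
```

The search terminates because Pell's equation always has a nontrivial solution, which is the same fact used above to see that the definition makes sense.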
Proof: First let us observe that it is not a priori clear that there is a smallest
unit greater than 1, as there could be an infinite sequence of units
$\varepsilon, \varepsilon', \varepsilon'', \ldots$ with $\varepsilon > \varepsilon' > \varepsilon'' > \ldots > 1$. But in fact there is such a smallest
unit.
Note that the statement of the lemma is equivalent to the following
statement:
$\{\ldots, \pm\varepsilon_0^{-3}, \pm\varepsilon_0^{-2}, \pm\varepsilon_0^{-1}, \pm 1, \pm\varepsilon_0, \pm\varepsilon_0^{2}, \pm\varepsilon_0^{3}, \ldots\}$.
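Proof: Let $\varepsilon = c + d\sqrt{D}$ be any unit of $R$. Note that $\bar{\varepsilon} = c - d\sqrt{D}$,
$-\varepsilon = -c - d\sqrt{D}$, and $-\bar{\varepsilon} = -c + d\sqrt{D}$ are all units of $R$. Also note that
$\bar{\varepsilon} = \pm\varepsilon^{-1}$ ($+$ if $\varepsilon\bar{\varepsilon} = 1$ and $-$ if $\varepsilon\bar{\varepsilon} = -1$). Thus in order to prove the
theorem, it suffices to prove that if $\varepsilon = c + d\sqrt{D}$ is any unit of $R$ with $c$ and $d$
positive, then $\varepsilon = \varepsilon_0^k$ for some positive integer $k$.
Since $\varepsilon = c + d\sqrt{D}$ with $c$ and $d$ positive, we see that $\varepsilon > 1$. Also, $\varepsilon_0 > 1$,
so the sequence $1, \varepsilon_0, \varepsilon_0^2, \varepsilon_0^3, \ldots$ of nonnegative powers of $\varepsilon_0$ is a (strictly)
increasing sequence that diverges to $+\infty$. Hence there is some positive
integer $k$ with $\varepsilon_0^{k-1} < \varepsilon \leq \varepsilon_0^k$. But then $1 < \varepsilon' \leq \varepsilon_0$ where $\varepsilon' = \varepsilon/\varepsilon_0^{k-1}$.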
Remark 4.26. For a given value of $D$, let $\varepsilon_{\mathrm{Pell}}$ be the unit $\varepsilon_{\mathrm{Pell}} = a + b\sqrt{D}$
obtained from the smallest solution to Pell’s equation $a^2 - b^2 D = 1$ in
positive integers $a$ and $b$. Then sometimes $\varepsilon_{\mathrm{Pell}}$ is the fundamental unit $\varepsilon_0$
of $O(\sqrt{D})$ and sometimes not. Table 4.10 is a table of $\varepsilon_0$, $\varepsilon_{\mathrm{Pell}}$, and the
relation between them for selected values of $D$.
4.5 Exercises
In Exercises 4.1 and 4.2, $D$ is a positive integer that is not a perfect square.
In the remaining exercises, $D$ is a square-free positive integer, $D \neq 1$.
Exercise 4.1.
(b) In particular, use our method to find, by hand computation, the smallest
solution $a = 1766319049$ and $b = 226153980$ of Pell’s equation
$a^2 - 61b^2 = 1$ in positive integers $a$ and $b$. (Compare Table 4.7 and
Example 4.22.) If Fermat could do this computation by hand, so can
you.
Exercise 4.3. The Archimedes cattle problem was a famous problem posed
by Archimedes. Search the Internet for this problem. (You will find lots
of references.) The heart of this problem is solving Pell’s equation
$a^2 - 4729494b^2 = 1$. Write a computer program to implement our method of
solving Pell’s equation that can handle relatively large values of a, b, and
D and apply it to the case D = 4729494. Show that at step 60 it produces
the following solution:
a = 109931986732829734979866232821433543901088049,
b = 50549485234315033074477819735540408986340.
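As a cross-check on the computations in these exercises, a proposed solution can be verified with exact integer arithmetic. The Python sketch below is not the composition method of this chapter; it uses the standard continued fraction expansion of $\sqrt{D}$ instead, and the function names `pell_value` and `smallest_pell_solution` are hypothetical.

```python
from math import isqrt

def pell_value(a, b, D):
    """Return a^2 - D*b^2 using exact integer arithmetic."""
    return a * a - D * b * b

def smallest_pell_solution(D):
    """Smallest positive (a, b) with a^2 - D*b^2 = 1, for non-square D > 0.

    Walks through the convergents p/q of the continued fraction of sqrt(D)
    until one of them satisfies the equation.
    """
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, d, term = 0, 1, a0      # state of the expansion of sqrt(D)
    p_prev, p = 1, a0          # numerators of successive convergents
    q_prev, q = 0, 1           # denominators of successive convergents
    while pell_value(p, q, D) != 1:
        m = d * term - m
        d = (D - m * m) // d
        term = (a0 + m) // d
        p_prev, p = p, term * p + p_prev
        q_prev, q = q, term * q + q_prev
    return p, q

print(pell_value(29718, 3805, 61))   # -1
print(smallest_pell_solution(61))    # (1766319049, 226153980)
```

Python integers have arbitrary precision, so the same function can be run with D = 4729494 and its result compared against the values quoted in Exercise 4.3.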
The next few problems explain why many of the results in Section 4.1
are true, by relating these results to arithmetic in the field $Q(\sqrt{D})$. Given
a pair of rational numbers $(a, b)$, let $(a, b)$ correspond to the element $\alpha =
a + b\sqrt{D}$ of $Q(\sqrt{D})$. We write this as $(a, b) \leftrightarrow \alpha = a + b\sqrt{D}$.
Exercise 4.4.
(b) Show that if (a, b) ↔ α and (c, d) ↔ β, then (a, b) ∗ (c, d) = αβ.
Thus, composition of representations (as defined in Definition 4.2)
corresponds to multiplication in $Q(\sqrt{D})$.
Exercise 4.6.
(b) Use the fact that $N(\alpha\beta) = N(\alpha)N(\beta)$ for any two elements $\alpha$ and $\beta$ of
$Q(\sqrt{D})$ to prove Lemma 4.1.
Exercise 4.7. Let $(a, b) \leftrightarrow \alpha$. Observe that, if $\alpha \neq \pm 1$, then the powers of
$\alpha$ are all distinct. Use this to prove Lemma 4.8.
Exercise 4.8. We have often had occasion to compute (a, b) ∗ (c, d). If
(a, b) ↔ α and (c, d) ↔ β, show that (a, b) ∗ (c, d) ↔ N(α)β/α, so compo-
sition with (a, b) corresponds to multiplication by N(α)/α.
Exercise 4.13. Use the results of Exercises 4.10 and 4.11 to prove Lemma 4.24.
Exercise 4.14. Let $R = O(\sqrt{D})$. Suppose that $D \equiv 1 \pmod 8$. Show that
every unit $\varepsilon$ of $R$ is of the form $\varepsilon = a + b\sqrt{D}$ with $a$ and $b$ integers. (Of
course, this is automatically true for $D \equiv 2, 3, 6,$ or $7 \pmod 8$ as in those
cases, every element of $R$ is of that form. In case $D \equiv 5 \pmod 8$, all units
of $R$ may or may not be of this form, as we see from Table 4.10.)
Exercise 4.15. Let $R = O(\sqrt{D})$. Suppose that $D \equiv 3, 6,$ or $7 \pmod 8$.
Show that every unit $\varepsilon$ of $R = O(\sqrt{D})$ has norm $N(\varepsilon) = 1$.
Exercise 4.16.
(b) Extend this table to include all values of D between 26 and 47.
(b) Suppose that $D \equiv 1$ or $2 \pmod 8$. Show that $\varepsilon_{\mathrm{Pell}} = \varepsilon_0^k$ for $k = 1$ or 2.
(c) Suppose that $D \equiv 5 \pmod 8$. Show that $\varepsilon_{\mathrm{Pell}} = \varepsilon_0^k$ for $k = 1, 2, 3,$ or
6. (All of these possibilities occur, as illustrated in Table 4.10.)
Exercise 4.18. Let $R = O(\sqrt{D})$.
Chapter 5
Example 5.2.
(2) It is relatively easy to show that there are transcendental numbers, but
not so easy to show that any particular number is transcendental. It
is a famous theorem of Hermite that $e$ is transcendental, a famous
theorem of Lindemann that $\pi$ is transcendental, and a case of the
famous Gelfond-Schneider theorem that $2^{\sqrt{2}}$ is transcendental.
At first glance it may not be obvious what this has to do with K being
“algebraic,” but here is the connection.
Now that we know what an algebraic number field is, we need to know
what an algebraic integer is.
Note that this lemma is far from obvious. It is a nontrivial fact that
if mα (X) has integer coefficients and mβ (X) has integer coefficients, then
mα+β (X) and mαβ (X) have integer coefficients, but it turns out to be true.
But having this lemma we may make the following definition.
Lemma 5.11. Let K be an algebraic number field and let α be any element
of K. Then there is an integer m such that mα is an algebraic integer.
Proof: Let $m_\alpha(X) = X^d + a_{d-1}X^{d-1} + \ldots + a_1 X + a_0$ and let each $a_i$ be the
rational number $a_i = r_i/s_i$. Let $m = \mathrm{lcm}(s_0, \ldots, s_{d-1})$ and let $\beta = m\alpha$.
Then $\alpha = \beta/m$, so $0 = m_\alpha(\alpha) = m_\alpha(\beta/m) = [\beta/m]^d + a_{d-1}[\beta/m]^{d-1} + \ldots +
a_1[\beta/m] + a_0$. Multiplying through by $m^d$ we see that $\beta^d + a_{d-1}m\beta^{d-1} +
\ldots + a_1 m^{d-1}\beta + a_0 m^d = 0$, so $m_\beta(X) = X^d + a_{d-1}mX^{d-1} + \ldots + a_1 m^{d-1}X +
a_0 m^d$ and all of the coefficients of $m_\beta(X)$ are integers.
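The scaling step in this proof is easy to carry out mechanically. The Python sketch below is an illustration under stated assumptions: the helper name `integral_multiple` is hypothetical, the input is the list $[a_0, \ldots, a_{d-1}]$ of rational coefficients of a monic polynomial, and the output is $m$ together with the coefficients $a_i m^{d-i}$ of the monic polynomial satisfied by $\beta = m\alpha$.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9 and later

def integral_multiple(coeffs):
    """Clear denominators as in the proof of Lemma 5.11.

    coeffs = [a_0, ..., a_{d-1}] for X^d + a_{d-1} X^{d-1} + ... + a_0.
    Returns (m, scaled) with scaled[i] = a_i * m**(d - i), all integers.
    """
    coeffs = [Fraction(c) for c in coeffs]
    d = len(coeffs)
    # m is the least common multiple of the denominators s_0, ..., s_{d-1}.
    m = lcm(*(c.denominator for c in coeffs))
    scaled = [c * m ** (d - i) for i, c in enumerate(coeffs)]
    assert all(c.denominator == 1 for c in scaled)
    return m, [int(c) for c in scaled]

# alpha is a root of X^2 + (1/2) X + 1/3; then beta = 6*alpha is a root of
# X^2 + 3 X + 12.
m, beta_coeffs = integral_multiple([Fraction(1, 3), Fraction(1, 2)])
print(m, beta_coeffs)  # 6 [12, 3]
```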
$O(\sqrt{D}) = \{[a + b\sqrt{D}]/2 \mid a$ and $b$ are integers, and either they are
both even or they are both odd$\}$.
In this case, we see that $O(\sqrt{D})$ is a free $Z$-module of rank 2 with basis
$\{1, [1 + \sqrt{D}]/2\}$.
Thus, we conclude that $\alpha'\beta'$ is an element of $(\gamma)$. But we can also see that
every element of $(\gamma)$ is of this form. For if $\gamma'$ is an arbitrary element of $(\gamma)$,
then $\gamma' = \gamma\nu$ for some element $\nu$ of $R$, and then
Of course, we must check that this does define an ideal, and this is
indeed the case. The proof of this is similar to the proof of Lemma 2.35.
We leave the proof of the following useful lemma to the reader.
Lemma 5.16. Let R be an integral domain and let I = (α1 , . . . , αi ) and
J = (β1 , . . . , βj ) be ideals in R.
(1) I ⊆ J if and only if αk is an element of J for each k = 1, . . . , i.
Consequently, I = J if and only if αk is an element of J for each
k = 1, . . . , i and βk is an element of I for each k = 1, . . . , j.
K = (α1 β1 , . . . , α1 βj , α2 β1 , . . . , α2 βj , . . . , αi β1 , . . . , αi βj ).
Remark 5.17. Note that R = (1) and that IR = I for any ideal I of R.
Also note that multiplication of ideals is certainly commutative, and that
Lemma 5.16(2) implies that multiplication of ideals is also associative.
Now let us think about the analog of divisibility. Suppose that the
element α of R divides the element β of R. Then β is a multiple of α, and
then every multiple of β is a multiple of α. In other words, the principal
ideal (α) contains the principal ideal (β). So, here we do not need to make
another definition. We simply regard the analog of divisibility as being that
the ideal I contains the ideal J, I ⊇ J. Now recall from Definition 2.45
that an element α of R is a prime if α dividing βγ implies that α divides
β or α divides γ. With this analogy and definition in mind, we can define
a prime ideal in general.
We just noted that the analog of α dividing β for two ideals I and J was
that I ⊇ J. But we could have asked for something stronger. Namely, if α
divides β, then β = αγ for some γ. Thus, passing to ideals, we might ask
that the analog be that there is an ideal K with J = IK. This is indeed
stronger in general, as the next proposition shows.
(It is easy to check that F is a field, and that the “usual” laws of
fractions work in F. The reader may worry that the definition of F is
circular in that we are already assuming that we can divide elements of R
in defining F, but this is not the case. In the definition of F “/” is just a
symbol, but once F is defined, we can give it its usual interpretation, so
that b[a/b] = a.)
Example 5.27.
(2) Let K be an algebraic number field and let R = O(K) be the ring of
integers of K. Then K is the quotient field of R. To see this, let α be
any element of K. By Lemma 5.11, there is an integer n with β = nα
in R, and then α = β/n, and n is in R by Remark 5.9.
$I^{-1} = \{\alpha \text{ in } K \mid \alpha I \subseteq R\}$.
Here is the main general result about Dedekind domains, and the reason
why we introduced them.
(In this case, we can also factor every nonzero fractional ideal essentially
uniquely as $I = P_1^{e_1} P_2^{e_2} \cdots P_{k_1}^{e_{k_1}} P_{k_1+1}^{e_{k_1+1}} P_{k_1+2}^{e_{k_1+2}} \cdots P_{k_1+k_2}^{e_{k_1+k_2}}$
with the $P_i$’s mutually distinct prime ideals and with $e_1, \ldots, e_{k_1}$ positive integers and
$e_{k_1+1}, \ldots, e_{k_1+k_2}$ negative integers.)
Lemma 5.39. Let K be an algebraic number field and let R = O(K). Let
n = degK/Q .
(1) If I = (m), the principal ideal generated by the nonzero integer m, then
$\#(R/I) = |m|^n$.
Proof:
(2) We claim that $I$ contains a nonzero integer $m$. To see this, choose any
nonzero element $\alpha$ of $I$ and consider its minimum polynomial $m_\alpha(X)$. We
know that $m_\alpha(X) = X^d + a_{d-1}X^{d-1} + \ldots + a_1 X + a_0$ with all the
coefficients integers, and $m_\alpha(\alpha) = 0$. Note that $a_0 \neq 0$ as otherwise
$m_\alpha(X)$ would have $X$ as a factor, and $m_\alpha(X)/X$ would be a polynomial
of lower degree having $\alpha$ as a root. But then we can solve for $a_0$:
$a_0 = -\alpha(\alpha^{d-1} + a_{d-1}\alpha^{d-2} + \ldots + a_1)$,
and $m = a_0$ is in $I$.
Let J = (m). Then I ⊇ J, so R/I is a quotient of R/J. But by (1),
R/J is a finite set, so R/I must be a finite set as well.
Proof of Theorem 5.37: We must verify that R satisfies the three conditions
for a Dedekind domain in Definition 5.22.
(2) Observe that for any two ideals $I$ and $J$ of $R$ with $I$ a proper subset of
$J$, $R/J$ is a proper quotient of $R/I$, and hence $\|J\|$ divides $\|I\|$. Now
let $I_1 \subset I_2 \subset I_3 \ldots$ be a sequence of ideals in $R$. Then $\|I_1\| > \|I_2\| >
\|I_3\| > \ldots$ is a strictly decreasing sequence of positive integers and so
must be finite.
and let
Definition 5.42. Let K be an algebraic number field. The ideal class group
of K is the quotient group
h(K) = #(C(K)),
(1) h(K) = 1.
Proof: (1) and (2) are equivalent by definition, and (2) and (3) are equiv-
alent by Corollary 5.36.
We have the projection map π : I(K) → C(K) and we let [I] = π(I),
and call [I] the ideal class of the ideal I. Thus the ideal class [I] of I is
trivial if and only if I is a principal ideal.
Theorem 5.43 and Corollary 5.45 tell us that, for any algebraic number
field K, even if unique factorization of elements does not hold in O(K), in
some sense it only misses by a finite amount.
Remark 5.46. There is an effective procedure for finding C(K) for any al-
gebraic number field K.
Lemma 5.47. Let $K$ be an algebraic number field and let $I$ and $J$ be any
two ideals of $R = O(K)$. Then $\|IJ\| = \|I\| \cdot \|J\|$.
Proof: From abelian group theory, we know that R/I is isomorphic to the
quotient of R/IJ by I/IJ, so #(R/IJ) = #(R/I) · #(I/IJ). Since R is a
Dedekind domain, I/IJ is isomorphic to R/J.
Proof: If I = (0) we are done, and if I is a principal ideal then I = (α) for
some α in R and we are done.
Suppose $I = (n, \alpha)$ for some integer $n$. Let $g = \gcd(n, \|\alpha\|)$. Then for
some integers $a$ and $b$, $g = na + \|\alpha\|b = na \pm \alpha\bar{\alpha}b$ is in $I$, and then $I = (g, \alpha)$
and we are done.
Thus, to complete the proof we must show that every nonzero ideal
$I$ in $R$ is of the form $I = (\alpha)$ or $I = (n, \alpha)$. To see this, choose any
nonzero element $\beta_0$ of $I$. Then $I$ contains the nonzero integer $\beta_0\bar{\beta}_0$, and
5.5. Prime Ideals in $O(\sqrt{D})$
then $I$ contains $\beta_0\bar{\beta}_0\sqrt{D}$ as well. Let $S_1 = \{\,|n'| \neq 0 \mid n' \text{ is an integer in } I\,\}$
and let $S_2 = \{\,|b'| \neq 0 \mid a' + b'\sqrt{D} \text{ is in } I\,\}$. Then $S_1$ is a nonempty set
of positive integers and so has a smallest element $n$. Similarly, $S_2$ has
a smallest element $b$ (which may be a half-integer). Let $\alpha = a + b\sqrt{D}$
be in $I$. Now if $\beta = c + d\sqrt{D}$ is any element of $I$, it follows from the
division algorithm that $d$ is a multiple of $b$, $d = jb$ for some integer $j$. Then
$m = \beta - j\alpha$ is an integer, and it again follows from the division algorithm
that $m$ is a multiple of $n$, $m = n\ell$ for some integer $\ell$. Thus we see that
$\beta = n\ell + \alpha j$ and so $I = (n, \alpha)$.
Remark 5.50.
(2) We see from (1) that if $I = (\alpha)$ is a principal ideal, then $I = (g, \alpha)$ is
a principal ideal for $g = \pm\|\alpha\|$. On the other hand, if $g = \pm\|\alpha\|$, then
$I = (g, \alpha) = (\pm\|\alpha\|, \alpha) = (\pm\alpha\bar{\alpha}, \alpha) = (\alpha)$ is a principal ideal.
(2) P = (p) for some prime p where R does not have an element that is
not divisible by p but whose norm is divisible by p; or
Proof: We prove the theorem by ruling out every ideal that is not of one
of the above forms.
Let $I$ be an ideal of $R$. By Remark 5.50, we may assume $I$ is of the
form $(g, \alpha)$ with $g$ an integer dividing $\|\alpha\|$.
First, suppose $g$ has more than one prime factor. Write $g = g_1 g_2$ with
$g_1$ and $g_2$ relatively prime, $g_1 \neq \pm 1$, $g_2 \neq \pm 1$. Let $I_1 = (g_1, \alpha)$ and
$I_2 = (g_2, \alpha)$. Then $I_1 I_2 = (g, g_1\alpha, g_2\alpha, \alpha^2) \subseteq (g, \alpha) = I$ but $I_1 \not\subseteq I$ and $I_2 \not\subseteq I$, as
$I_1$ has the element $g_1$ with $\|g_1\| = g_1^2$ not divisible by $g$, and $I_2$ has the
element $g_2$ with $\|g_2\| = g_2^2$ not divisible by $g$. Thus $I$ is not a prime ideal.
Proof:
(1) in case (1), there are two ideals of this form if $P \neq \bar{P}$ and a unique
ideal of this form if $P = \bar{P}$;
(3) in case (3), there are two ideals of this form if $P \neq \bar{P}$ and a unique
ideal of this form if $P = \bar{P}$.
Proof:
(1) Suppose α0 is any element of R with α0 = p. Note by Lemma 5.48
that α0 is divisible by (α0 ) or by α0 , and that α0 is divisible by α0 or
by α0 , which readily implies that (α0 ) = (α0 ) or (α0 ) = (α0 ).
(3) Suppose α1 is any element of R with α1 divisible by p but with α1
not divisible by p. Then,√in the notation of the proof of Lemma 5.48, we
must have α1 = a√ 1 + b1 D with b1 ≡ 0 (mod p) and a1 ≡ kb1 (mod p),
and α1 = a1 + b1 D with b1 ≡ 0 (mod p) and a1 ≡ k b1 (mod p), with
(2) if p divides eD D, case (2) does not occur, and P = P in cases (1)
and (3).
Proof: Assembling Propositions 5.51, 5.52, 5.53, and 5.54 gives us almost
all of this theorem. There is only one thing left to do. Proposition 5.51
shows that every prime ideal must be of the form above. To complete the
proof we must show that every ideal of the above form is indeed a prime
ideal. Thus let P be as in the statement of the theorem, and suppose that
I and J are ideals with P ⊇ IJ. We must show that P ⊇ I or P ⊇ J.
To begin with, we note that every element of $P$ has norm divisible by $p$.
Also, by Lemma 5.49 we may assume that $I = (m, \beta)$ and $J = (n, \gamma)$ with
$m$ dividing $\|\beta\|$ and $n$ dividing $\|\gamma\|$. Then $IJ = (mn, n\beta, m\gamma, \beta\gamma)$. Since
$P \supseteq IJ$ every element of $IJ$ must have norm divisible by $p$. In particular,
$\|mn\| = m^2 n^2$ is divisible by $p$. Thus at least one of $m$ and $n$ is divisible
by $p$. We shall assume that $m$ is divisible by $p$. (Otherwise interchange $I$
and $J$.) To proceed further, we must break the proof up into several cases.
We number the cases as in the statement of the theorem.
Case (2)(b): Here $P = (p)$. Now $p$ divides $m$ and $m$ divides $\|\beta\|$, so $p$
divides $\|\beta\|$ and hence $p$ divides $\beta$. Then $P = (p) \supseteq I$.
Case (1): If $p$ divides $\beta$ then $P \supseteq (p) \supseteq I$ and we are done, so assume
not. By adding a “redundant” generator if necessary, we may assume that
$P = (p, \alpha)$ with $\alpha$ not divisible by $p$ but with $\|\alpha\|$ divisible by $p$. Write
$\alpha = a + b\sqrt{D}$. Note that $b \not\equiv 0 \pmod p$ and define $k$ by $a \equiv kb \pmod p$.
Then, as in the proof of Proposition 5.53, every element of $P$ is of the form
$a' + b'\sqrt{D}$ with $a' \equiv kb' \pmod p$. Similarly, $\beta = c + d\sqrt{D}$ with $d \not\equiv 0 \pmod p$
and $c \equiv kd \pmod p$ or $c \equiv -kd \pmod p$. In the former case, again as in
the proof of Proposition 5.53, $\beta$ is in $P$ and so $P \supseteq I$. In the latter case,
by the same argument, $\beta$ is in $\bar{P}$ and so $\bar{P} \supseteq I$. But here $P = \bar{P}$, so $P \supseteq I$.
Case (2)(a): In the event that $D \equiv 1 \pmod 4$, we have a preliminary
step. In this event $p$ must be odd, so we may replace $\beta$ and $\gamma$ by $2\beta$ and
$2\gamma$, if necessary, to ensure that $\beta = a + b\sqrt{D}$ and $\gamma = c + d\sqrt{D}$ with $a$, $b$,
$c$, and $d$ integers.
By the argument for Case (1), $P \supseteq I$ except possibly in the situation
that $\beta$ is in $\bar{P}$, so suppose we are in that situation. Then $n\beta$ is in $P$, as
$P \supseteq IJ$. If $n$ is not divisible by $p$, that gives $\beta$ in $P$, and then $P = \bar{P}$,
which is impossible. Thus, we must have $n$ divisible by $p$. Now consider the
ideal $J = (n, \gamma)$. By the same argument as above, $P \supseteq J$ except possibly
in the situation that $\gamma$ is in $\bar{P}$. Thus, the final situation to consider is
Remark 5.56. In the situation of Theorem 5.55, let p be an odd prime not
dividing D. From the proof of Lemma 5.48, we can see that if D is a
quadratic residue (mod p) then case (2)(a) occurs, while if D is a quadratic
nonresidue (mod p) then case (2)(b) occurs.
Lemma 5.57. In the situation of Theorem 5.55, let D ≡ 1 (mod 4) and let
p = 2. If D ≡ 1 (mod 8) then case (2)(a) occurs, while if D ≡ 5 (mod 8)
then case (2)(b) occurs.
Proof: If $\alpha = a + b\sqrt{D}$ with $a$ and $b$ integers and with $\|\alpha\|$ even, then either
$a$ and $b$ are both even or they are both odd. In either event, $\alpha$ is divisible
by 2. Thus, the only possibility for $\alpha_1$ is $\alpha_1 = a_1 + b_1\sqrt{D}$ with $a_1$ and
$b_1$ both half-integers, i.e., $\alpha_1 = [a + b\sqrt{D}]/2$ with $a$ and $b$ odd integers.
Using the fact that $c^2 \equiv 1 \pmod 8$ for any odd integer $c$, we see that in
this case $\|\alpha_1\| \equiv [1 - D]/4 \pmod 2$. Thus, if $D \equiv 1 \pmod 8$, we may choose
$\alpha_1$ to be any element of $R$ of this form, and we are in case 2(a), while if
$D \equiv 5 \pmod 8$, there is no possible choice for $\alpha_1$ and we are in case 2(b).
(5) In case (1), suppose that $R$ has an element $\beta$, which is not divisible by
2, with $\|\beta\| = 4$. Then $I^2 = (\beta)$, a principal ideal.
Proof: We have proved (1), (2), (3), and (4) in Theorem 5.55 and Lemma 5.57.
As for (5), in this case we must have $\beta = [a + b\sqrt{D}]/2$ with $a$ and $b$ odd
integers. Replacing $\beta$ by $\bar{\beta}$, if necessary, we may assume that $a \equiv b \pmod 4$.
Computation then shows that $a \equiv b \pmod 8$ if $D \equiv 1 \pmod{16}$ and that
$a \equiv b + 4 \pmod 8$ if $D \equiv 9 \pmod{16}$.
Let $\alpha_1 = [1 + \sqrt{D}]/2$. Then $I^2 = (4, 2\alpha_1, \alpha_1^2) = (4, 2\alpha_1 - 2\alpha_1^2, \alpha_1^2) =
(4, [1 - D]/2, \alpha_1^2) = (4, \alpha_1^2)$. Certainly $\beta$ divides 4, and a long but routine
computation shows that $\beta$ divides $\alpha_1^2$. Thus $I^2 \subseteq (\beta)$. Another computation
shows that $b\beta - \alpha_1^2 = 4c$ for some integer $c$, so $b\beta$ is in $I^2$. But certainly
$4\beta$ is in $I^2$, so $\beta$ is in $I^2$. Thus $I^2 \supseteq (\beta)$. Hence $I^2 = (\beta)$.
Corollary 5.60. Let $D \equiv 1, 2, 3, 6,$ or $7 \pmod 8$ and let $R = O(\sqrt{D})$. Let
$I$ be the ideal of Proposition 5.59.
(1) If $D < 0$, $D \neq -1, -2,$ or $-7$, then $I$ is a nonprincipal ideal of $R$.
Consequently, $h(D) > 1$.
(2) If $D$ is divisible by a prime congruent to $5 \pmod 8$, or if $D$ is divisible
by a prime congruent to $3 \pmod 8$ and by a prime congruent to
$7 \pmod 8$, then $I$ is a nonprincipal ideal of $R$. Consequently, $h(D) > 1$.
(3) If $D \equiv 2, 3, 6,$ or $7 \pmod 8$ and $D$ is as in (1) or (2), then $h(D)$ is
even. If $D \equiv 1 \pmod 8$, $D$ is as in (1) or (2), and $R$ has an element
$\beta$ that is not divisible by 2, with $\|\beta\| = 4$, then $h(D)$ is even.
5.6. Examples of Ideals in $O(\sqrt{D})$
Proof: It is immediate from Lemma 2.67, Lemma 2.77, Theorem 5.55, and
Proposition 5.59 that $I$ is a nonprincipal ideal of $O(\sqrt{D})$, and that, if $D$
is as in (3), $I^2$ is a principal ideal. Thus, in this case, $[I]$ is an element of
order 2 of $C(Q(\sqrt{D}))$, and so this group has even order.
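Example 5.61. If $D = -1$, then $\alpha = 1 + \sqrt{D}$ is an element of $O(\sqrt{D})$ with
$\|\alpha\| = 2$. If $D = -2$, then $\alpha = \sqrt{D}$ is an element of $O(\sqrt{D})$ with $\|\alpha\| = 2$.
If $D = -7$, then $\alpha = [1 + \sqrt{D}]/2$ is an element of $O(\sqrt{D})$ with $\|\alpha\| = 2$.
Here are examples of real quadratic fields $Q(\sqrt{D})$ having elements $\alpha$ of
$O(\sqrt{D})$ with $\|\alpha\| = 2$:
$D = 2$: $\alpha = \sqrt{2}$,
$D = 3$: $\alpha = 1 + \sqrt{3}$,
$D = 6$: $\alpha = 2 + \sqrt{6}$,
$D = 7$: $\alpha = 3 + \sqrt{7}$,
$D = 11$: $\alpha = 3 + \sqrt{11}$,
$D = 14$: $\alpha = 4 + \sqrt{14}$,
$D = 17$: $\alpha = [5 + \sqrt{17}]/2$,
$D = 19$: $\alpha = 13 + 3\sqrt{19}$,
$D = 22$: $\alpha = 14 + 3\sqrt{22}$,
$D = 23$: $\alpha = 5 + \sqrt{23}$,
$D = 31$: $\alpha = 39 + 7\sqrt{31}$,
$D = 33$: $\alpha = [5 + \sqrt{33}]/2$,
$D = 34$: $\alpha = 6 + \sqrt{34}$,
$D = 38$: $\alpha = 6 + \sqrt{38}$,
$D = 41$: $\alpha = [7 + \sqrt{41}]/2$,
$D = 43$: $\alpha = 59 + 9\sqrt{43}$,
$D = 46$: $\alpha = 156 + 23\sqrt{46}$,
$D = 47$: $\alpha = 7 + \sqrt{47}$,
$D = 51$: $\alpha = 7 + \sqrt{51}$,
$D = 57$: $\alpha = [7 + \sqrt{57}]/2$,
$D = 59$: $\alpha = 23 + 3\sqrt{59}$,
$D = 62$: $\alpha = 8 + \sqrt{62}$,
$D = 66$: $\alpha = 8 + \sqrt{66}$,
$D = 67$: $\alpha = 221 + 27\sqrt{67}$,
$D = 71$: $\alpha = 59 + 7\sqrt{71}$,
$D = 73$: $\alpha = [9 + \sqrt{73}]/2$,
$D = 79$: $\alpha = 9 + \sqrt{79}$,
$D = 86$: $\alpha = 102 + 11\sqrt{86}$,
$D = 89$: $\alpha = [9 + \sqrt{89}]/2$,
$D = 94$: $\alpha = 1464 + 151\sqrt{94}$,
$D = 97$: $\alpha = [69 + 7\sqrt{97}]/2$.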
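Each entry in this list can be checked by direct computation. The short Python sketch below does so with exact rational arithmetic; the tuple encoding `(D, a, b, denom)` for $(a + b\sqrt{D})/\text{denom}$ and the helper name `norm` are illustrative choices, not notation from the text. It confirms that $\alpha\bar{\alpha} = (a^2 - Db^2)/\text{denom}^2$ equals 2 in absolute value in every case.

```python
from fractions import Fraction

# (D, a, b, denom) stands for alpha = (a + b*sqrt(D)) / denom.
EXAMPLES = [
    (2, 0, 1, 1), (3, 1, 1, 1), (6, 2, 1, 1), (7, 3, 1, 1),
    (11, 3, 1, 1), (14, 4, 1, 1), (17, 5, 1, 2), (19, 13, 3, 1),
    (22, 14, 3, 1), (23, 5, 1, 1), (31, 39, 7, 1), (33, 5, 1, 2),
    (34, 6, 1, 1), (38, 6, 1, 1), (41, 7, 1, 2), (43, 59, 9, 1),
    (46, 156, 23, 1), (47, 7, 1, 1), (51, 7, 1, 1), (57, 7, 1, 2),
    (59, 23, 3, 1), (62, 8, 1, 1), (66, 8, 1, 1), (67, 221, 27, 1),
    (71, 59, 7, 1), (73, 9, 1, 2), (79, 9, 1, 1), (86, 102, 11, 1),
    (89, 9, 1, 2), (94, 1464, 151, 1), (97, 69, 7, 2),
]

def norm(D, a, b, denom):
    """Product of alpha with its conjugate, as an exact rational number."""
    return Fraction(a * a - D * b * b, denom * denom)

for D, a, b, denom in EXAMPLES:
    # The sign varies with D, but the absolute value is always 2.
    assert abs(norm(D, a, b, denom)) == 2, D
print("all", len(EXAMPLES), "entries have norm of absolute value 2")
```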
Thus the values of D in this example are not covered by Corollary 5.60.
Proposition 5.63.
(1) Let p1 and p2 be odd primes. Suppose that p2 ≡ 1 (mod 4) and that p1
is a quadratic nonresidue (mod p2 ).
As we shall now see, for $D$ negative we can easily get stronger information
on the ideal class group, and hence the class number, of the field
$Q(\sqrt{D})$.
Theorem 5.64. Let $p_1, \ldots, p_k$ be distinct primes and set $D = -p_1 \cdots p_k$.
Then $C(Q(\sqrt{D}))$ has a subgroup isomorphic to $(Z/2Z)^{k-1}$. Consequently,
$h(D)$ is divisible by $2^{k-1}$. If $D \equiv 3 \pmod 4$, then $C(Q(\sqrt{D}))$ has a subgroup
isomorphic to $(Z/2Z)^k$. Consequently, $h(D)$ is divisible by $2^k$.
Theorem 5.66. Let $n$ be an arbitrary integer and let $q > 1$ be odd. Let
$a$ be a positive integer with $a$ and $q$ relatively prime. Let $D$ be the unique
square-free integer defined by $b^2 D = a^2 - q^n$, where $b$ is an integer. Suppose
that $q^j$ and $a + b\sqrt{D}$ do not have a common nonunit factor in $O(\sqrt{D})$ for
in $I^{k+1}$, as is $[\sqrt{D}][2b\sqrt{D}][a + b\sqrt{D}] = 2bD[a + b\sqrt{D}]$. Thus $I^{k+1}$ contains
both $q[a + b\sqrt{D}]$ and $2bD[a + b\sqrt{D}]$. Since $q$ and $2bD$ are relatively prime
integers, there are integers $x$ and $y$ with $qx + 2bDy = 1$, which implies that
$(a + b\sqrt{D})$ is in $I^{k+1}$, as required.
Now $I^n = (q^n, a + b\sqrt{D}) = (a + b\sqrt{D})$ as $q^n = \pm[a + b\sqrt{D}][a - b\sqrt{D}]$,
so $I^n$ is a principal ideal. Thus $[I]$ has order $j$ in $C(Q(\sqrt{D}))$ for some $j$
dividing $n$. We want to show $j = n$.
Suppose $j < n$, in which case $j$ is a proper factor of $n$. If $I^j$ is principal,
then $I^j = (\gamma_j)$ for some element $\gamma_j$ of $O(\sqrt{D})$. Then $\gamma_j$ divides both $q^j$
and $a + b\sqrt{D}$, so by hypothesis $\gamma_j$ is a unit, and then $I^j = O(\sqrt{D})$. But
Suppose that $a^2 < q^n$ in Theorem 5.66. Then $D < 0$. Note then that
in any particular case we can check the hypothesis in Theorem 5.66 that
$q^j$ and $a + b\sqrt{D}$ do not have a common nonunit factor in $O(\sqrt{D})$ for any
proper factor $j$ of $n$. For any such common factor must have norm dividing
$q^n$, and, for $D$ negative, $O(\sqrt{D})$ has only finitely many elements of norm $N$
for any integer $N$. But, more interestingly, we can give general conditions
that ensure that this hypothesis holds.
Thus $\gamma_n$ and $\gamma_j^{n/j}$ are associates, so they have the same norm. But the
norm is multiplicative, and that implies that $\gamma_j$ has norm $q^j$.
Let $\gamma_j = c + d\sqrt{D}$. Since $a$, $b$, and $q$ are relatively prime, $d$ cannot be
0, and then $c$ cannot be 0 either. Note that $|d| \geq 1$ if $D \equiv 2$ or $3 \pmod 4$
and $|d| \geq 1/2$ if $D \equiv 1 \pmod 4$, and similarly for $|c|$. Now $\gamma_j$ has norm
$q^j = c^2 - d^2 D = c^2 + d^2(-D)$, so we see that $q^j \geq 1 - D$ if $D \equiv 2$ or
$3 \pmod 4$ and that $q^j \geq (1 - D)/4$ if $D \equiv 1 \pmod 4$. But, as a little
algebra shows, this contradicts our hypothesis on $q$ and $D$.
Remark 5.69.
(2) More precisely, let $I = (g, \alpha)$ with $\|\alpha\|$ divisible by $g$ but with $g$ and
$\alpha$ having no common integer divisor other than $\pm 1$. We saw in Theorem 5.55
that if $g = p$ is a prime, then $I\bar{I} = (p)$, and hence $I\bar{I} = (g)$
for an arbitrary integer $g$.
Now we use the work we have done to examine the situation in $O(\sqrt{D})$
for specific values of $D$.
Example 5.70. Throughout this example $R = O(\sqrt{D})$ and $K = Q(\sqrt{D})$.
The assertions here can all be verified by direct computation and we shall
omit the details.
Thus, we see that we have the following factorization of the ideal (27)
into a product of prime ideals:
$(27) = I^3 \bar{I}^3$.
Thus, we see that we have the following factorization of the ideal (27)
into a product of prime ideals:
$(27) = I^3 \bar{I}^3$.
Remark 5.72. For any particular value of D, h(D) can be computed from
Dirichlet’s class number formula, which we shall not present here. The
nature of this formula, however, is not such that we can draw general con-
clusions about the behavior or properties of h(D). As you might imagine,
the formula is more complicated for positive values of D than for nega-
tive values of D. The class number formula evidently yields an integer for
D < 0, but it is not a priori evident that this number is positive. The class
number formula evidently yields a real number for D > 0, but it is not
even a priori evident that this real number is an integer!
Remark 5.73. Tables of the values of h(D) for all D, positive and negative,
with |D| < 500, can be found in the book Number Theory by Z. I. Bore-
vich and I. R. Shafarevich, Academic Press, New York, 1966. (Note the
following error in the tables: the correct value for h(−485) is 20.)
Lemma 5.75. Let P be any prime ideal of R. Then P divides (p)K for
exactly one prime number p.
Proof: It is easy to check that for any two integers $m_1$ and $m_2$, $(m_1 m_2)_K =
(m_1)_K (m_2)_K$. Now consider $P$. By the proof of Lemma 5.39, we know
that $P$ contains an integer $m$. Let $m$ have prime factorization $m =
p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$. Then $P$ divides $(m)_K = ((p_1)_K)^{a_1} ((p_2)_K)^{a_2} \cdots ((p_k)_K)^{a_k}$.
By the definition of a prime ideal, this implies that $P$ divides $(p_i)_K$ for
some $i$. Suppose that $P$ divides $(p_j)_K$ as well, for some $p_j \neq p_i$. Then $p_i$
is in $P$ and $p_j$ is in $P$, so $1 = p_i x + p_j y$ for some integers $x$ and $y$ is in $P$,
and then $P = R$, a contradiction.
In the situation of Lemma 5.75, we say that P lies over p. Thus we see
that every prime ideal of R lies over some prime number p. Our object
in this section is to investigate the prime ideals lying over an arbitrary
prime number p. To this end, we consider any prime number p. By unique
factorization of ideals in $R$, we have
$(p)_K = P_1^{e_1} P_2^{e_2} \cdots P_g^{e_g}$
with each $P_i$ a prime ideal. Note that $\|(p)_K\| = p^n$ and $P_i$ divides
$(p)_K$, so we must have $\|P_i\| = p^{f_i}$ for some $f_i$. The integer $e_i$ is called
the ramification index of Pi and the integer fi is called the residue class
field degree of Pi . (Note that R/Pi is a field as Pi is a prime ideal and hence
a maximal ideal of the Dedekind domain R.) Here is our basic result.
$n = \sum_{i=1}^{g} e_i f_i$.
Example 5.77. Let us examine the possible behavior for a quadratic field
K. Here n = 2 so we have the following possibilities:
(1) g = 1, e1 = 2, f1 = 1,
(2a) g = 2, e1 = e2 = 1, f1 = f2 = 1,
(2b) g = 1, e1 = 1, f1 = 2.
We have numbered the cases as above as they correspond to the cases with
the same number in Theorem 5.55. (In Theorem 5.55 we divided case (1)
into cases (1a) and (1b) depending on whether P was a principal ideal, but
that distinction is not relevant here.) In case (1), p is said to ramify in K.
In case (2a), p is said to split in K. In case (2b), p is said to be inert in K.
as above. Then ei > 1 for some i if and only if p divides ΔK . (In this
case, we say that Pi is ramified.)
√
Example 5.79. In the case of a quadratic field K = Q( D),
ΔK = D if D ≡ 1 (mod 4),
ΔK = 4D if D ≡ 2 or 3 (mod 4).
We see that the general theory specializes to the result we had in Theo-
rem 5.55, as, in the notation of that theorem, p divides eD D if and only if
p divides ΔK .
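The criteria collected so far (ramification exactly when $p$ divides the discriminant, Remark 5.56 for odd primes, and Lemma 5.57 for $p = 2$) can be combined into a small classifier. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, $D \neq 1$ is assumed square-free, $p$ is assumed prime, and quadratic residues are detected with Euler's criterion, a standard tool that is not discussed in this section.

```python
def discriminant(D):
    """Discriminant of Q(sqrt(D)) for square-free D, as in Example 5.79."""
    return D if D % 4 == 1 else 4 * D

def behavior_of_prime(p, D):
    """Classify the prime p in Q(sqrt(D)) as ramified, split, or inert."""
    if discriminant(D) % p == 0:
        return "ramified"
    if p == 2:
        # Here D = 1 (mod 4), so D mod 8 is 1 or 5 (compare Lemma 5.57).
        return "split" if D % 8 == 1 else "inert"
    # Odd p not dividing D: Euler's criterion tests whether D is a
    # quadratic residue mod p (compare Remark 5.56).
    return "split" if pow(D % p, (p - 1) // 2, p) == 1 else "inert"

# (g, e, f) for each behavior, following Example 5.77; in every case the
# sum of e_i * f_i over the g primes lying over p equals n = 2.
INVARIANTS = {"ramified": (1, 2, 1), "split": (2, 1, 1), "inert": (1, 1, 2)}

for p in (2, 3, 5, 7, 11):
    kind = behavior_of_prime(p, -5)
    print(p, kind, INVARIANTS[kind])
```

Python's `%` operator returns a nonnegative remainder for a positive modulus, so the congruence tests behave as intended for negative $D$ as well.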
In fact, even for quadratic fields the situation can be very intricate. We
close this section by citing the following theorem:
Theorem 5.80. Let $m$ be any integer and let $S_i$, $S_r$, and $S_s$ be any three
finite disjoint sets of primes. Then there are infinitely many positive values
of $D$, and infinitely many negative values of $D$, for which, setting $K =
Q(\sqrt{D})$,
I = (α ) ∩ K.
(In other words, the original ideal I consists of those elements of the
ideal (α ) that are in K.)
are also algebraic integers as they are both roots of the monic polynomial
$X^4 + 4X^2 + 9$. Since $1 + \sqrt{-5}$ and $1 - \sqrt{-5}$ are both multiples of $\sqrt{2}$, it
then readily follows that $\sqrt{2}$ is an ideal generator of $I_1$.
Next consider the ideals I2 and I2 . Now let K be the field K =
√ √
Q( 3, −5). Then we have the following factorizations of elements in
O(K ):
2
3 = [ 3]
√ √
√ √ 3 + −15
1 + −5 = 3 ,
√ 3
√
√ √ 3 − −15
1 − −5 = 3 .
3
√
Again we need to check that all the factors are algebraic integers. Now 3
√as it√is a root of the monic polynomial X − 3, and
2
is√an algebraic integer,
√
( 3 + −15)/3 and ( 3 − −15)/3 are also algebraic integers, as they are
both √roots of the monic polynomial X 4 + 8X 2 + 4. It then readily follows
that 3 is an ideal generator of both I2 and I2 .
r + 2s = n.
Theorem 5.88 says that, given an “abstract” algebraic number field of de-
gree n, as we have defined it here, there are n ways to identify it with a
“concrete” algebraic number field, as we have previously defined it.
Example 5.89.
Example 5.91.
(1) Let $K = Q(\sqrt{D})$ for $D < 0$ be an imaginary quadratic field. From
Example 5.89(1) we see that $r + s - 1 = 0$, so that $U_K$ is isomorphic to
the finite group $R_K$. For $D = -1$, $R_K = \{\pm 1, \pm i\}$ is a group of order
4. For $D = -3$, $R_K = \{\pm 1, \pm\omega, \pm\omega^2\}$ is a group of order 6. Otherwise,
$R_K = \{\pm 1\}$ is a group of order 2.
(2) Let $K = Q(\sqrt{D})$ for $D > 0$ be a real quadratic field. Then $R_K = \{\pm 1\}$
is a group of order 2. From Example 5.89(1) we see that $r + s - 1 = 1$,
so that $F_K$ is isomorphic to $Z$. Let $\varepsilon_0$ be a fundamental unit of $K$.
5.10 Exercises
Exercise 5.1. Let $R = O(\sqrt{D})$.
(c) Show that I and J are relatively prime if and only if they have no
common prime ideal factor. In this case, show that L = IJ.
Exercise 5.7. Use the descriptions of the ideals P in Remark 5.58 to show
that these ideals are maximal and hence prime, thus providing another
proof of Theorem 5.55.
Exercise 5.10. Let R = Z[X], the ring of polynomials in the variable “X”
with integer coefficients. R is known to be a UFD. Note that parts (a)
and (b) below show that R is not a Dedekind domain.
(c) Find a pair of ideals I and J such that J ⊃ I but such that there is no
ideal K with I = JK.
(e) Let I = (2, X). Show that, for any positive integer n, the minimum
number of elements in a generating set for $I^n$ is n + 1. In particular,
no power of I is principal.
(a) Show that R does not satisfy the ascending chain condition (ACC).
Appendix A
Mathematical Induction
Stated in this way, there is clearly nothing special about F (n), and we
may substitute any proposition. This leads us to the principle of mathe-
matical induction.
Axiom A.1 (The Principle of Mathematical Induction). Let P (n) be any propo-
sition about positive integers. Suppose that
Let us emphasize that the point of (2) is that we do not need to prove
P (n), but rather that we can assume P (n) and use it to prove P (n+ 1). (In
a typical proof, at the point we invoke the truth of P (n), we often state
“By the inductive hypothesis . . . .”)
As a practical matter, when using mathematical induction to prove a
proposition, it is usually the case that verifying (1) is easy but verifying
(2) takes work. Occasionally it is the case that verifying (2) is easy but
verifying (1) takes work. Rarely it is the case that verifying both (1) and
(2) takes work. (And it is virtually never the case that verifying both (1)
and (2) is easy—that would be getting something for nothing.)
There is a variant of mathematical induction called complete induction.
Again, we will first introduce it in the context of dominoes, so again let
us suppose we have an infinite number of dominoes numbered 1, 2, 3, . . . .
Again, let us suppose that the first domino falls. But now let us suppose
that the dominoes are arranged a bit differently, so that it is not necessarily
the case that if domino n falls, then domino n+1 falls, for every n. Suppose
it is instead the case that if dominoes 1 through n all fall, then domino n + 1
falls, for every n. What will happen? Again, all the dominoes will fall.
There is another possibility. Assume the first domino falls, and the
arrangement of the dominoes is that domino n+1 is not necessarily knocked
down by domino n, but by domino k for some k between 1 and n, for every
n. What will happen? Again, all the dominoes will fall. But in this
third situation, we generally do not know which domino k will knock down
domino n + 1. So we handle that lack of information by simply assuming
that we are back in the second case, that is, if dominoes 1 through n all
fall (and hence the mysterious domino k falls, whatever k may happen to
be), domino n + 1 will also fall, for every n.
There are some other possibilities as well, where the first domino falls,
and where domino n + 1 falls if some (perhaps known, perhaps mysterious)
combination of the preceding n dominoes falls. But again, we can handle
this by simply assuming that we are back in the second case, that is, if
dominoes 1 through n all fall, domino n + 1 will also fall, for every n.
Axiom A.2 (The Principle of Complete Induction). Let P (n) be any propo-
sition about positive integers. Suppose that
(2) for each positive integer n, if P (k) is true for all integers k between 1
and n, then P (n + 1) is true.
We now show that all three of these variants of induction are logically
equivalent. The proof of this is rather subtle and tricky.
(1) S(1) is true, for if the set T contains the positive integer 1, then 1 is
certainly the smallest positive integer in T (as 1 is the smallest positive
integer, period).
Next we claim that
Theorem A.7 (The Pigeonhole Principle). If m objects are sorted into n cat-
egories, and m > n, then at least one category contains more than one
object.
(1) P (1) is true: If there is only one category, then all m > 1 objects are
in that category.
(2) P (n) implies P (n + 1): Suppose m > n + 1 objects are sorted into n + 1
categories. Pick a category. If that category has k > 1 objects, we
are done. So suppose not. Then it has k = 0 or 1 objects. Consider
the remaining n categories and the remaining m − k objects. If k = 0,
m−k = m > n+1 > n, while if k = 1, m−k = m−1 > (n+1)−1 = n,
so in either event m − k > n and by the inductive hypothesis at least
one of the remaining categories contains more than one object. So in
any case P (n + 1) is true.
Then, by induction, we conclude that P (n) is true for every positive inte-
ger n.
Theorem A.8. If m objects are sorted into n categories and m < n, then at
least one category is empty.
(1) P (1) is true: Pick a category. If that category is empty, we are done.
Otherwise, the single object is in that category. In that case, every
other category is empty.
(2) P (m) implies P (m + 1): Suppose m + 1 < n objects are sorted into n
categories. Pick a category. If that category is empty, we are done. So
suppose not. Then it contains j ≥ 1 objects. Consider the remaining
n − 1 categories and the remaining $m' = m + 1 - j$ objects. Since
m + 1 < n, $m' = m + 1 - j < n - j \le n - 1$. Then, by the inductive
hypothesis, at least one of the remaining categories is empty. So in any
case P (m + 1) is true.
Theorem A.9. Suppose that n objects are sorted into n categories. The
following are equivalent.
Proof: If n = 1, then the single object is in the single category, and all
three statements are true, and so are equivalent.
Assume henceforth that n ≥ 2. We begin by observing that (3) is
logically equivalent to (1) and (2). Hence (3) implies (1) and (3) implies
(2). If we show that (1) implies (2), that will show that (1) implies (3),
and if we show that (2) implies (1), that will show that (2) implies (3). So
we must prove these two implications.
(1) implies (2): We prove the contrapositive: not-(2) implies not-(1).
Suppose (2) is false and some category is empty. Then the remaining
n − 1 categories contain n objects, so by the original pigeonhole principle
(Theorem A.7) some category must contain more than one object, and (1)
is false.
(2) implies (1): We prove the contrapositive: not-(1) implies not-(2).
Suppose (1) is false and some category contains j > 1 objects. Then the
remaining n − 1 categories contain m = n − j < n − 1 objects, so by the
above variant on the pigeonhole principle (Theorem A.8), some category
must be empty, and (2) is false. (We may have m = 0, but then all of the
other categories are empty and (2) is certainly false.)
A.3 Exercises
Exercise A.1. A geometric progression with first term a and ratio r is a
sequence of the form $a, ar, ar^2, ar^3, \ldots$. Show that, for $r \ne 1$, the sum
of the first n terms of this progression is $a(r^n - 1)/(r - 1)$, i.e., show that
$$\sum_{i=0}^{n-1} ar^i = a\,\frac{r^n - 1}{r - 1}.$$
(Of course, if r = 1, then the progression is constant and the sum of its
first n terms is an.)
Exercise A.4.
(a) Let $x = m_1 + \sqrt{m_1^2 - 1}$ for some positive integer $m_1$. Show that,
for every n ≥ 1, $x^n = m_n + \sqrt{m_n^2 - 1}$ for some positive integer
$m_n$. (For example, if $x = 2 + \sqrt{3}$, then $x^2 = 7 + 4\sqrt{3} = 7 + \sqrt{48}$,
$x^3 = 26 + 15\sqrt{3} = 26 + \sqrt{675}$, . . ..)
(b) More generally, let $x = m_1 + \sqrt{m_1^2 - N}$ for some positive integer $m_1$
and some integer N. Show that, for every n ≥ 1, $x^n = m_n + \sqrt{m_n^2 - N^n}$
for some positive integer $m_n$. (For example, if $x = 2 + \sqrt{6}$, in which case
N = −2, then $x^2 = 10 + 4\sqrt{6} = 10 + \sqrt{96}$, $x^3 = 44 + 18\sqrt{6} = 44 + \sqrt{1944}$,
. . ..)
(a) Show that e(d + 1) = 3e(d) + o(d) and that o(d + 1) = 3o(d) + e(d) for
every positive integer d. (Hint: think about extending a characteristic
of degree d − 1 to a characteristic of degree d.)
(b) Show that $e(d) = 2^{d-1}(2^d + 1)$ and $o(d) = 2^{d-1}(2^d - 1)$ for every positive
integer d.
(e) Experiment with arbitrary values of r, s, and t and come up with, and
prove, a formula for $a_n$ in general.
n 1 2 3 4 5 6 7 8 9 10
fn 1 1 2 3 5 8 13 21 34 55
n 1 2 3 4 5 6 7 8 9 10
$\ell_n$ 1 3 4 7 11 18 29 47 76 123
Observe that $\ell_1 = f_1$. Show that $\ell_n = 2f_{n-1} + f_n$ and that $f_n = (2/5)\ell_{n-1} +
(1/5)\ell_n$ for every n ≥ 2.
n 0 1 2 3 4 5 6
pn 1 1 3 7 17 41 99
qn 0 1 2 5 12 29 70
Exercise A.14. Fix a rational number D that is not a perfect square. Choose
nonzero rational numbers a and b and set $N = a^2 - b^2 D$. Define sequences
$\{p_0, p_1, p_2, \ldots\}$ and $\{q_0, q_1, q_2, \ldots\}$ by
(b) Show that $p_n = (-N q_{n-1} + a q_n)/b$ and that $q_n = (-N p_{n-1} + a p_n)/(bD)$.
Exercise A.17. You may be familiar with magic squares. A magic square
is a square array with the sums of its rows, columns, and diagonals all the
Appendix B
Congruences
x ≡ a (mod n)
x ≡ a (mod n)
Thus, for example, the integers x with x ≡ 0 (mod 2) are the integers
of the form x = 2k, i.e., the even integers, and the integers x with x ≡
1 (mod 2) are the integers of the form x = 1 + 2k, i.e., the odd integers.
We think of two integers that are congruent modulo n as being equiv-
alent in a certain way, or, technically speaking, that congruence modulo n
is an equivalence relation. That is the content of the next proposition.
Proposition B.3.
(2) For any two integers a and b, if a ≡ b (mod n), then b ≡ a (mod n).
(3) For any three integers a, b, and c, if a ≡ b (mod n) and b ≡ c (mod n),
then a ≡ c (mod n).
Proof:
(3) If a ≡ b (mod n), then a − b = nk1 for some k1 . If b ≡ c (mod n), then
b − c = nk2 for some k2 . But then
a1 + a2 ≡ b1 + b2 (mod n),
a1 − a2 ≡ b1 − b2 (mod n),
a1 a2 ≡ b1 b2 (mod n).
a1 + a2 ≡ b1 + b2 (mod n),
and also
a1 − a2 ≡ b1 − b2 (mod n),
and, finally,
a1 a2 ≡ b1 b2 (mod n).
Next we have the following result, which states, in fancier language, that
for a positive integer n, the integers 0, 1, . . ., n − 1 are a complete set of representatives of
the congruence classes modulo n. But we state the result in a much more
down-to-earth way. Note, however, that in order to prove this result we
need to use our work in Section 2.1.
But even before we state it, we observe that this encapsulates the usual
result of division. When we divide the integer x by the positive integer n,
we get a quotient (which we do not care about here) and a remainder. We
can always get a remainder between 0 and n − 1, and when we impose this
restriction on the remainder, it is unique.
Theorem B.5. Let n be a positive integer. For any integer x, the congruence
x ≡ a (mod n)
Proof: First we will show that x ≡ a (mod n) for some such integer a
between 0 and n − 1, and then we will show that only one such integer a
works.
We begin by applying Lemma 2.7, which states that the integers are a
Euclidean domain, and we refer to Definition 2.5 to see what that means.
Doing so, we see that, given x, there are integers k0 and a0 with
Thus, we have that the congruence is true for some a between 0 and n − 1.
Now we must show that there is only one such a. We do this by assuming
there is another value $a'$ and showing that in fact it must just be a.
So suppose
Proof:
(1) Since x ≡ a (mod n), we have that, for some k, x − a = nk = (n1 d)k =
n1 (dk), so x ≡ a (mod n1 ).
x = a + n1 k = a + n1 (b + dj) = a + n1 b + n1 dj = a + n1 b + nj
and hence
x ≡ a + n1 b (mod n).
Proof: Part (1) of this corollary follows immediately from part (1) of Propo-
sition B.7, and part (2) of this corollary follows immediately from part (2)
of Proposition B.7.
x ≡ a1 (mod n1 ),
x ≡ a (mod n).
Proof: By definition, ax1 ≡ ax2 (mod n) means that n divides ax1 − ax2 =
a(x1 − x2 ). By assumption, a and n are relatively prime, so we may apply
Euclid’s Lemma (Lemma 2.41) to conclude that n divides x1 − x2 , which
means that x1 ≡ x2 (mod n).
Theorem B.11. Suppose that a and n are relatively prime. Then for any b,
the congruence
ax ≡ b (mod n)
has a solution, and this solution is unique (mod n).
ci ≡ ai (mod n) and 0 ≤ ci ≤ n − 1.
Second Proof: This proof is more involved than our first proof but has the
virtue of applying to more general situations (to any PID, in the language
of Chapter 2).
We are assuming that a and n are relatively prime, i.e., have a gcd of
1, so from Lemma 2.7, Definition 2.18, and Theorem 2.20, we know that
there are integers $a'$ and $n'$ with
$$aa' + nn' = 1,$$
Set $x = a'b$. Then
Corollary B.12. Suppose that a and n are relatively prime. Then for any
b, the congruence
ax ≡ b (mod n)
x ≡ c (mod n)
Now let us ask how to go about solving a congruence ax ≡ b (mod n), with
a and n relatively prime, in practice.
First, let us suppose n is relatively small. Then we can directly apply
Corollary B.12. The congruence ax ≡ b (mod n) is equivalent to x ≡
c (mod n), and so there are n possibilities for the congruence class of c,
namely integers 0, 1, 2, . . ., n − 1, and we may simply proceed by trial
and error. For example, suppose we wish to solve the congruence 7x ≡
2 (mod 10). Then we need only try x = 0, 1, 2, . . ., 9. We see, in order,
that 7 · 0 = 0 ≡ 0 (mod 10); 7 · 1 = 7 ≡ 7 (mod 10); 7 · 2 = 14 ≡ 4 (mod 10);
7 · 3 = 21 ≡ 1 (mod 10); 7 · 4 = 28 ≡ 8 (mod 10); 7 · 5 = 35 ≡ 5 (mod 10);
7 · 6 = 42 ≡ 2 (mod 10). Thus the solution to our congruence is x ≡
6 (mod 10).
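The trial-and-error search just described can be sketched in Python. The function name `solve_by_trial` is ours, not the text's, and the sketch assumes a and n are relatively prime, so that exactly one residue between 0 and n − 1 works.

```python
def solve_by_trial(a, b, n):
    """Return the x in 0..n-1 with a*x ≡ b (mod n), assuming gcd(a, n) = 1."""
    for x in range(n):
        # n divides a*x - b exactly when a*x ≡ b (mod n)
        if (a * x - b) % n == 0:
            return x
    raise ValueError("no solution found; a and n are not relatively prime")


print(solve_by_trial(7, 2, 10))  # 6, as in the worked example
```

The loop makes at most n multiplications, which is why this approach is only reasonable when n is small.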
This is fine if we want to solve a single congruence ax ≡ b (mod n). But
if we want to be able to solve multiple congruences with the same a and
n (i.e., ax ≡ b1 (mod n), ax ≡ b2 (mod n), ax ≡ b3 (mod n), etc.) there
is a better way to proceed. In the notation of Corollary B.12, we first
find the solution $x = a'$ of the congruence ax ≡ 1 (mod n), and then the
solution of the congruence ax ≡ b (mod n) is given by $x \equiv a'b \pmod{n}$.
For example, suppose we want to solve the congruences 5x ≡ b (mod 14)
for different values of b. We first solve 5x ≡ 1 (mod 14) by trial and error,
letting x = 0, 1, 2, . . ., 13. We see, in order, that 5 · 0 = 0 ≡ 0 (mod 14);
5 · 1 = 5 ≡ 5 (mod 14); 5 · 2 = 10 ≡ 10 (mod 14); 5 · 3 = 15 ≡ 1 (mod 14).
Thus $a' = 3$. Then the solution to 5x ≡ 2 (mod 14) is x = 3·2 = 6 (mod 14);
the solution to 5x ≡ 3 (mod 14) is x = 3 · 3 = 9 (mod 14); the solution to
5x ≡ 4 (mod 14) is x = 3 · 4 = 12 (mod 14); the solution to 5x ≡ 5 (mod 14)
is x = 3 · 5 = 15 ≡ 1 (mod 14) (this one is obvious); the solution to
5x ≡ 6 (mod 14) is x = 3 · 6 = 18 ≡ 4 (mod 14); etc.
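As a rough illustration of this "invert once, reuse many times" idea, the following Python sketch finds the inverse of 5 modulo 14 by trial and then multiplies it by each right-hand side b. The helper name `inverse_by_trial` is hypothetical.

```python
def inverse_by_trial(a, n):
    """Return x in 0..n-1 with a*x ≡ 1 (mod n), found by trial and error."""
    for x in range(n):
        if (a * x) % n == 1:
            return x
    raise ValueError("a has no inverse modulo n")


a, n = 5, 14
a_inv = inverse_by_trial(a, n)  # 3, since 5 * 3 = 15 ≡ 1 (mod 14)

# Solve 5x ≡ b (mod 14) for several b with a single multiplication each.
solutions = {b: (a_inv * b) % n for b in range(2, 7)}
print(a_inv, solutions)  # 3 {2: 6, 3: 9, 4: 12, 5: 1, 6: 4}
```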
Obviously this method is practical only if n is small. But in fact we
have already derived a method for finding $a'$, which works effectively for
1 = 37(58) + 143(−15),
so if n = 143 and a = 37, then $a' = 58$, and our congruence has the solution
Of course, we live in an age where 143 is a very small number for a com-
puter. But for n large, Euclid’s algorithm is much more efficient than
trial and error, and for n very large it is the only practical method, even
for a computer. Here is an example with larger numbers. Consider the
congruence
1 = 1123456789(356396689) + 876543210(−456790122),
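Identities of this kind come from running Euclid's algorithm while keeping track of the coefficients. The following is a minimal Python sketch of that bookkeeping, not the book's own presentation; the names `extended_gcd` and `inverse_mod` are ours.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t = g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each step keeps the invariants old_r = a*old_s + b*old_t, r = a*s + b*t.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(a, n):
    """Return the inverse of a modulo n, assuming gcd(a, n) = 1."""
    g, s, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a and n are not relatively prime")
    return s % n


print(extended_gcd(37, 143))  # (1, 58, -15), i.e. 1 = 37(58) + 143(-15)
print(inverse_mod(37, 143))   # 58
```

The number of loop iterations grows only with the number of digits of the inputs, which is why this remains practical for numbers such as 1123456789 and 876543210.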
Lemma B.13. Suppose that a and n have gcd d, and set n1 = n/d. If
then
x1 ≡ x2 (mod n1 ).
(1) If b is divisible by d then this congruence has a solution, and this solu-
tion is unique (mod n1 ).
(2) If b is not divisible by d then this congruence does not have a solution.
Proof: Let us write a = da1 and n = dn1 . We know that a1 and n1 are
relatively prime by Lemma 2.16.
(1) Suppose that b is divisible by d, and write b = db1 . Then our original
congruence is
da1 x ≡ db1 (mod dn1 ),
i.e., da1 x − db1 is divisible by dn1 , so da1 x − db1 = (dn1 )k for some k.
But then d(a1 x − b1 ) = da1 x − db1 = (dn1 )k = d(n1 k), so a1 x − b1 =
n1 k, and hence
a1 x ≡ b1 (mod n1 ).
But a1 and n1 are relatively prime, so we may apply Theorem B.11 to
conclude this congruence has a unique solution (mod n1 ).
ax ≡ b (mod n).
a1 x ≡ b1 (mod n1 ),
x ≡ c1 (mod n1 )
Remark B.16. Note that (in the notation of Corollary B.12 and
Corollary B.15) if $aa' + nn' = d$, then, dividing by d, we have $a_1 a' + n_1 n' = 1$,
so $a_1 a' \equiv 1 \pmod{n_1}$ and so we may choose $a_1' = a'$.
Let us see how to apply this theorem and corollary. For example, con-
sider the congruence
360x ≡ 324 (mod 2268).
In Example 2.25(1), we found that gcd(2268, 360) = 36. Since 324 is divis-
ible by 36 (as 324 = 36 · 9), this congruence has a solution and it is unique
If we want to solve for x (mod 2268), we then find that x can be congruent
to any one of the following 36 values (mod 2268):
x= 45 + 63 · 0 ≡ 45 (mod 2268),
x= 45 + 63 · 1 ≡ 108 (mod 2268),
x= 45 + 63 · 2 ≡ 171 (mod 2268),
⋮
x= 45 + 63 · 35 ≡ 2250 (mod 2268).
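The whole procedure for a congruence ax ≡ b (mod n) with gcd(a, n) = d can be sketched as follows. This is an illustrative sketch: the function name is hypothetical, and it relies on Python 3.8 or later, where the built-in `pow(a1, -1, n1)` returns a modular inverse.

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return every x in 0..n-1 with a*x ≡ b (mod n); an empty list if none."""
    d = gcd(a, n)
    if b % d != 0:
        return []  # no solution when d does not divide b
    a1, b1, n1 = a // d, b // d, n // d
    # a1 and n1 are relatively prime, so a1 is invertible modulo n1.
    c1 = (pow(a1, -1, n1) * b1) % n1
    # The unique solution mod n1 lifts to d solutions mod n.
    return [c1 + n1 * k for k in range(d)]


xs = solve_linear_congruence(360, 324, 2268)
print(len(xs), xs[:3], xs[-1])  # 36 [45, 108, 171] 2250
```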
Our final topic is the Chinese Remainder Theorem, which deals with the
solution of simultaneous linear congruences. The case of two simultaneous
congruences is the crucial one, and we shall do that one separately first to
pave the way for the general situation.
Lemma B.17. Let n1 and n2 be relatively prime, and let b1 and b2 be arbi-
trary. Then the system of simultaneous congruences
x ≡ b1 (mod n1 ),
x ≡ b2 (mod n2 )
x ≡ b1 (mod n1 ),
x ≡ b2 (mod n2 ),
⋮
x ≡ bk (mod nk )
x ≡ c (mod nk−1 nk )
for some c. Then, replacing these two original congruences by this new one,
we obtain the equivalent system
x ≡ b1 (mod n1 ),
x ≡ b2 (mod n2 ),
⋮
x ≡ bk−2 (mod nk−2 ),
x≡c (mod nk−1 nk ).
Let us see how to apply this theorem. If the numbers are relatively small,
trial and error works again. For example, consider the pair of simultaneous
congruences
x ≡ 8 (mod 9),
x ≡ 5 (mod 7).
Then we set x = 8 + 9y and we try y = 0, 1, . . . , 6 until we find a value
that works: y = 0, x = 8 ≡ 1 (mod 7); y = 1, x = 17 ≡ 3 (mod 7);
y = 2, x = 26 ≡ 5 (mod 7). Thus this system has the solution
x ≡ 26 (mod 63).
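A small Python sketch of this trial-and-error step, under the assumption that the two moduli are relatively prime (the function name `crt_by_trial` is ours):

```python
def crt_by_trial(b1, n1, b2, n2):
    """Find x with x ≡ b1 (mod n1) and x ≡ b2 (mod n2) by trying x = b1 + n1*y."""
    for y in range(n2):
        x = b1 + n1 * y  # automatically satisfies the first congruence
        if x % n2 == b2 % n2:
            return x
    raise ValueError("no solution found; are n1 and n2 relatively prime?")


print(crt_by_trial(8, 9, 5, 7))  # 26
```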
We solve the last two congruences first. (Actually, it does not matter what
order we do it in.) We just did that, so we use our answer and note that
the original system is equivalent to
If the numbers get larger, we once again have to use Euclid’s algorithm. We
will first give a formula for the solution of two simultaneous congruences,
and then deal with the general case.
Recall that, for $n_1$ and $n_2$ relatively prime, we used Euclid's algorithm
to find $n_1'$ and $n_2'$ with $n_1 n_1' + n_2 n_2' = 1$. Then $n_1 n_1' \equiv 1 \pmod{n_2}$ and
$n_2 n_2' \equiv 1 \pmod{n_1}$.
Lemma B.19. Let n1 and n2 be relatively prime and let b1 and b2 be arbi-
trary. Then the pair of simultaneous congruences
x ≡ b1 (mod n1 ),
x ≡ b2 (mod n2 )
Proof: We simply have to check that the given value of x satisfies both
congruences. We see that
and
$$n_2 n_2' b_1 + n_1 n_1' b_2 \equiv (n_1 n_1') b_2 \equiv 1(b_2) \equiv b_2 \pmod{n_2}.$$
Thus this value of x $\pmod{n_1 n_2}$ is a solution, and any $x' \equiv x \pmod{n_1 n_2}$
is also a solution. Furthermore, this is the only solution $\pmod{n_1 n_2}$ as we
have already shown that the solution is unique $\pmod{n_1 n_2}$.
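The value checked in this proof, $n_2 n_2' b_1 + n_1 n_1' b_2$, can be computed directly once $n_1'$ and $n_2'$ are known. The sketch below obtains them from a recursive form of Euclid's algorithm; the names `bezout` and `crt_pair` are our own choices.

```python
def bezout(a, b):
    """Return (s, t) with a*s + b*t = gcd(a, b), via Euclid's algorithm."""
    if b == 0:
        return 1, 0
    s, t = bezout(b, a % b)
    # gcd = b*s + (a % b)*t = a*t + b*(s - (a // b)*t)
    return t, s - (a // b) * t


def crt_pair(b1, n1, b2, n2):
    """Solve x ≡ b1 (mod n1), x ≡ b2 (mod n2) for relatively prime n1, n2."""
    n1p, n2p = bezout(n1, n2)  # n1*n1p + n2*n2p = 1
    if n1 * n1p + n2 * n2p != 1:
        raise ValueError("moduli must be relatively prime")
    return (n2 * n2p * b1 + n1 * n1p * b2) % (n1 * n2)


print(crt_pair(8, 9, 5, 7))       # 26
print(crt_pair(19, 37, 91, 143))  # 1092
```

Any pair of coefficients with $n_1 n_1' + n_2 n_2' = 1$ gives the same answer modulo $n_1 n_2$, so the particular pair returned by `bezout` does not matter.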
1 = 37(58) + 143(−15).
Thus, we have $n_1 = 37$, $n_1' = 58$, $n_2 = 143$, $n_2' = -15$, $b_1 = 19$, and
$b_2 = 91$, so
Set $N = n_1 n_2 \cdots n_k$. For each i, let $m_i = N/n_i$ and let $m_i'$ be such that
$m_i m_i' \equiv 1 \pmod{n_i}$. Then this system has the solution
Proof: We must check that this value of x satisfies all of the congruences.
We will simply check that it satisfies the first one. The others are the same
except for the subscripts.
The key thing to note is that n1 divides m2 = N/n2 = n1 n3 · · · nk and
similarly that $n_1$ divides $m_3, \ldots, m_k$. Thus each term $m_2 m_2' b_2$, $m_3 m_3' b_3$,
. . ., $m_k m_k' b_k$ is divisible by $n_1$, i.e., is congruent to 0 $\pmod{n_1}$. Then we
have
$$\begin{aligned}
m_1 m_1' b_1 + m_2 m_2' b_2 + \cdots + m_k m_k' b_k &\equiv m_1 m_1' b_1 + 0 + \cdots + 0 \pmod{n_1} \\
&\equiv m_1 m_1' b_1 \pmod{n_1} \\
&\equiv (m_1 m_1') b_1 \pmod{n_1} \\
&\equiv 1(b_1) \pmod{n_1} \\
&\equiv b_1 \pmod{n_1}
\end{aligned}$$
as $m_1 m_1' \equiv 1 \pmod{n_1}$ by the definition of $m_1'$.
Thus, x is indeed a solution and, just as before, any $x' \equiv x \pmod{N}$ is
also a solution, while by uniqueness $\pmod{N}$ this is the only solution.
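The sum $m_1 m_1' b_1 + \cdots + m_k m_k' b_k$ verified above translates almost line for line into code. The following Python sketch assumes pairwise relatively prime moduli and Python 3.8 or later (for `math.prod` and for `pow` with exponent −1); the function name `crt` is ours.

```python
from math import prod


def crt(residues, moduli):
    """Solve x ≡ b_i (mod n_i) for pairwise relatively prime moduli n_i."""
    N = prod(moduli)
    x = 0
    for b_i, n_i in zip(residues, moduli):
        m_i = N // n_i
        # m_i' in the text; pow raises ValueError if m_i is not invertible mod n_i.
        m_i_inv = pow(m_i, -1, n_i)
        x += m_i * m_i_inv * b_i
    return x % N


print(crt([8, 5], [9, 7]))  # 26, matching the earlier two-congruence example
```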
Lemma B.21. Let p be an odd prime. For any $a \not\equiv 0 \pmod{p}$, the
congruence $x^2 \equiv a \pmod{p}$ either has no solutions or two solutions.
i.e., $x_0^2 - y_0^2$ is divisible by p. But $x_0^2 - y_0^2 = (x_0 - y_0)(x_0 + y_0)$, and now
we can apply Euclid's Lemma: since p divides this product, it must divide
one of the factors, so p divides $x_0 - y_0$, in which case $y_0 \equiv x_0 \pmod{p}$, or
p divides $x_0 + y_0$, in which case $y_0 \equiv -x_0 \pmod{p}$.
Table B.1. Quadratic residues and nonresidues for some small odd primes.
Table B.1 is a table of the first few odd primes and their quadratic
residues and nonresidues. Note in Definition B.22 that whether a is a
quadratic residue (mod p) only depends on the congruence class of a (mod p),
so we only list one representative of each congruence class, and in fact we
list the representative a with 1 ≤ a ≤ p − 1.
Inspection of this table shows that for these values of p there are the
same number of quadratic residues as there are nonresidues, namely (p −
1)/2 of each. Let us prove that this is true in general.
Remark B.26. We should observe that the proof of Lemma B.25 shows us
how to find the quadratic residues (mod p), and hence the entries in the left
column of Table B.1: Simply square the integers between 1 and (p − 1)/2,
and find integers between 1 and p − 1 congruent to these squares. (Those
integers will simply be the remainders when these squares are divided by
p.) Then the entries in the right column of this table will be the remaining
integers between 1 and p − 1. For example,
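p = 5: $1^2 \equiv 1 \pmod{5}$, $2^2 \equiv 4 \pmod{5}$;
p = 7: $1^2 \equiv 1 \pmod{7}$, $2^2 \equiv 4 \pmod{7}$, $3^2 \equiv 2 \pmod{7}$;
p = 11: $1^2 \equiv 1 \pmod{11}$, $2^2 \equiv 4 \pmod{11}$, $3^2 \equiv 9 \pmod{11}$,
$4^2 \equiv 5 \pmod{11}$, $5^2 \equiv 3 \pmod{11}$.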
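The recipe in this remark (square 1 through (p − 1)/2, reduce modulo p, and take whatever is left over as the nonresidues) is easy to mechanize. A short Python sketch, with a function name of our own choosing:

```python
def quadratic_residues(p):
    """Return (residues, nonresidues) among 1..p-1 for an odd prime p."""
    half = (p - 1) // 2
    residues = sorted({(x * x) % p for x in range(1, half + 1)})
    nonresidues = [a for a in range(1, p) if a not in residues]
    return residues, nonresidues


for p in (5, 7, 11):
    print(p, *quadratic_residues(p))
# 5 [1, 4] [2, 3]
# 7 [1, 2, 4] [3, 5, 6]
# 11 [1, 3, 4, 5, 9] [2, 6, 7, 8, 10]
```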
There is a second, less obvious pattern that can be found in the table.
Lemma B.27. Let p be an odd prime and let a and b be integers with
$a \not\equiv 0 \pmod{p}$ and $b \not\equiv 0 \pmod{p}$. Let c be an integer with $c \equiv ab \pmod{p}$.
(1) If a and b are both quadratic residues (mod p), then c is a quadratic
residue (mod p).
(3) If a and b are both quadratic nonresidues (mod p), then c is a quadratic
residue (mod p).
Proof:
(1) Since a is a quadratic residue (mod p), by definition there is an integer
x with $x^2 \equiv a \pmod{p}$. Similarly, there is an integer y with
$y^2 \equiv b \pmod{p}$. Set z = xy. Then
$$z^2 = (xy)^2 = x^2 y^2 \equiv ab \equiv c \pmod{p},$$
so by definition c is a quadratic residue (mod p).
(2) Suppose a is a quadratic residue (mod p), and let x be an integer with
$x^2 \equiv a \pmod{p}$. We will prove the lemma in this case by contradiction.
So suppose c is a quadratic residue (mod p), and let z be an integer
with $z^2 \equiv c \pmod{p}$. Since $a \not\equiv 0 \pmod{p}$, we see that $x \not\equiv 0 \pmod{p}$
(as p is a prime), and then we know that there is an integer w with
$wx \equiv 1 \pmod{p}$. Then $(wx)^2 \equiv 1^2 \equiv 1 \pmod{p}$. But $(wx)^2 = w^2 x^2 \equiv
w^2 a$, so we see that $w^2 a \equiv 1 \pmod{p}$. Now let y = wz. Then
$$y^2 = (wz)^2 = w^2 z^2 \equiv w^2 c \equiv w^2 (ab) \equiv (w^2 a)b \equiv (1)(b) \equiv b \pmod{p},$$
so we see that b is a quadratic residue (mod p), contradicting our
hypothesis.
(3) As we have seen in Lemma B.25, there are (p − 1)/2 quadratic residues
(mod p) between 1 and p − 1. Call them $d_1, d_2, \ldots, d_{(p-1)/2}$. For each
i between 1 and (p − 1)/2, let $e_i$ be an integer between 1 and p − 1
with $e_i \equiv a d_i \pmod{p}$. Then no two $e_i$'s are equal, as if $e_{i_1} = e_{i_2}$,
then $a d_{i_1} \equiv a d_{i_2} \pmod{p}$, and since p is a prime and $a \not\equiv 0 \pmod{p}$,
by Lemma B.10 this implies that $d_{i_1} \equiv d_{i_2} \pmod{p}$, and hence that
$d_{i_1} = d_{i_2}$ (as they are both between 1 and p − 1), contradicting our
choice of the $d_i$'s.
Now by part (2), each $e_i$ is a quadratic nonresidue (mod p) (as it is
congruent to the product of a, a quadratic nonresidue (mod p), and
$d_i$, a quadratic residue (mod p)). But also, by Lemma B.25, there
are (p − 1)/2 quadratic nonresidues (mod p) between 1 and p − 1,
so $e_1, \ldots, e_{(p-1)/2}$ must be all of them. Since b is assumed to be a
quadratic nonresidue (mod p), it must be congruent to one of the $e_i$'s.
So let $b \equiv e_{i_0} \equiv a d_{i_0} \pmod{p}$. Since $d_{i_0}$ is a quadratic residue (mod p),
there is an integer y with $y^2 \equiv d_{i_0} \pmod{p}$. Then, setting z = ay, we see that
$$z^2 = (ay)^2 = a^2 y^2 \equiv a(a d_{i_0}) \equiv a e_{i_0} \equiv ab \equiv c \pmod{p},$$
so by definition c is a quadratic residue (mod p).
There is a third, far less obvious, pattern that can be found in the table.
We ask when p − 1, or, equivalently, when −1 (as −1 ≡ p − 1 (mod p)), is
a quadratic residue (mod p). Looking at the table, we see that this is true
for p = 5, 13, 17, and 29 and false for p = 3, 7, 11, 19, and 23. We will
be able to generalize this and to prove that it is true for p ≡ 1 (mod 4)
and false for p ≡ 3 (mod 4). This will take considerable work. Along the
way, we will prove two famous theorems, Fermat’s Little Theorem (of great
importance in itself) and Wilson’s theorem (whose main importance is in
precisely the use we shall put it to).
Theorem B.29 (Fermat's Little Theorem). Let p be an odd prime. For any
$a \not\equiv 0 \pmod{p}$,
$$a \cdot 2a \cdot 3a \cdots (p-1)a = 1 \cdot 2 \cdot 3 \cdots (p-1) \cdot a \cdot a \cdot a \cdots a = F a^{p-1}.$$
$$a^{p-1} \equiv 1 \pmod{p}$$
as claimed.
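A quick numerical check of this statement, together with the fact that the multiples a, 2a, . . ., (p − 1)a reduce modulo p to a rearrangement of 1, 2, . . ., p − 1 (which is what makes the product identity above useful), might look as follows. Both function names are hypothetical, and the sketch only tests small cases; it is not a proof.

```python
def multiples_are_a_rearrangement(a, p):
    """Check that a, 2a, ..., (p-1)a reduce mod p to 1, ..., p-1 in some order."""
    return sorted((k * a) % p for k in range(1, p)) == list(range(1, p))


def fermat_holds(a, p):
    """Check a^(p-1) ≡ 1 (mod p) using fast modular exponentiation."""
    return pow(a, p - 1, p) == 1


p = 11
print(all(multiples_are_a_rearrangement(a, p) and fermat_holds(a, p)
          for a in range(1, p)))  # True
```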
i.e.,
$$x^{p-1} \equiv -1 \pmod{p},$$
since, as we have just observed, (p − 1)/2 is odd, and −1 raised to an odd
power is −1. But by Fermat's Little Theorem, regardless of the value of x
(which is certainly $\not\equiv 0 \pmod{p}$),
Now we turn to the second half of our goal. In the proof of
Theorem B.29, we simply needed to know $F \not\equiv 0 \pmod{p}$. Now we will need to
find the exact value of F (mod p).
Theorem B.31 (Wilson’s Theorem). Let p be a prime. Then
(p − 1)! ≡ −1 (mod p).
Proof: First, we note that this is true for p = 2 as 1! = 1 ≡ −1 (mod 2).
Henceforth we assume p is an odd prime. Then p − 1 is even, so there
are an even number p − 1 of integers between 1 and p − 1, i.e., in the set
T = {1, . . . , p − 1}. Let us exclude the two integers 1 and p − 1, to obtain
the set
S = {2, 3, . . . , p − 2},
containing an even number p − 3 of integers.
We claim that for every element z of S there is an element y of S with
$y \ne z$, and $zy \equiv 1 \pmod{p}$.
We already know that for any z in T there is a y in T with
$zy \equiv 1 \pmod{p}$ (Theorem B.11). In particular, since S is a subset of T, we know
that for any z in S, there is a y in T with $zy \equiv 1 \pmod{p}$. We cannot have
y = 1, as then $zy \equiv z \not\equiv 1 \pmod{p}$, and we cannot have y = p − 1, as then
$zy \equiv -z \not\equiv 1 \pmod{p}$. Thus y is in S. Finally, we cannot have y = z as
then $z^2 \equiv 1 \pmod{p}$. But we know that $x^2 \equiv 1 \pmod{p}$ can have at most
two solutions (Lemma B.21), and it certainly has the solutions x = ±1, so
it cannot have the solution x = z as well.
Now let us consider the product F0 = 2 · 3 · . . . · (p − 2) of the elements
of S. Each element of S pairs up with another element of S where the
product of the two is congruent to 1 (mod p), and there are (p − 3)/2 such
pairs of elements, so
$$F_0 \equiv (1)^{(p-3)/2} = 1 \pmod{p}.$$
But now
F = (p − 1)!
= 1 · 2 · 3 · . . . · (p − 1)
= 1 · (2 · 3 · . . . · (p − 2)) · (p − 1)
= 1 · F0 · (p − 1)
≡ 1 · 1 · (p − 1) (mod p)
≡ −1 (mod p),
as claimed.
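Both the statement and the pairing argument in this proof can be checked by machine for small primes. The sketch below is only an illustration; the names are ours, and `pow(z, -1, p)` assumes Python 3.8 or later.

```python
from math import factorial


def wilson_holds(p):
    """Check (p-1)! ≡ -1 (mod p), writing -1 as p - 1."""
    return factorial(p - 1) % p == p - 1


def inverse_pairs(p):
    """Pair each z in S = {2, ..., p-2} with its inverse y (y != z) modulo p."""
    pairs = set()
    for z in range(2, p - 1):
        y = pow(z, -1, p)
        pairs.add((min(z, y), max(z, y)))
    return sorted(pairs)


print(wilson_holds(7), inverse_pairs(7))  # True [(2, 4), (3, 5)]
```

For p = 7 the set S = {2, 3, 4, 5} splits into (p − 3)/2 = 2 pairs, each with product congruent to 1, exactly as the proof describes.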
Now we observe that there are (p − 1)/2 terms on the left-hand side, so
pulling out the factors of −1 we see that there are (p − 1)/2 of them, and
$(-1)^{(p-1)/2} = 1$ since (p − 1)/2 is even, so we see
$$(-1)^{(p-1)/2}(1^2)(2^2)(3^2) \cdots (((p-1)/2)^2) \equiv -1 \pmod{p}$$
$$(1^2)(2^2)(3^2) \cdots (((p-1)/2)^2) \equiv -1 \pmod{p}$$
$$(1 \cdot 2 \cdot 3 \cdots (p-1)/2)^2 \equiv -1 \pmod{p},$$
as claimed.
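The last congruence says that, when (p − 1)/2 is even (that is, when p ≡ 1 (mod 4)), the number x = ((p − 1)/2)! satisfies x² ≡ −1 (mod p). A minimal Python sketch of that computation, with a hypothetical function name:

```python
from math import factorial


def sqrt_of_minus_one(p):
    """For a prime p ≡ 1 (mod 4), return x = ((p-1)/2)! reduced mod p."""
    if p % 4 != 1:
        raise ValueError("this construction needs p ≡ 1 (mod 4)")
    return factorial((p - 1) // 2) % p


x = sqrt_of_minus_one(13)
print(x, (x * x) % 13)  # 5 12, and 12 ≡ -1 (mod 13)
```

Computing a factorial takes about p/2 multiplications, so this is a way to see the result in action for small primes rather than an efficient method.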
(1) If p ≡ 1 (mod 4), then either a and −a are both quadratic residues
(mod p) or they are both quadratic nonresidues (mod p).
Proof: This follows immediately from Lemma B.27, Corollary B.30, and
Corollary B.33.
Lemma B.36.
Proof:
Now q has a prime factorization, of course, and if all of the prime factors
of q were congruent to 1 or 7 (mod 8), then q itself would be congruent
to 1 or 7 (mod 8), which is not the case. Hence q must be divisible by
some prime $p'$ with $p' \equiv 3$ or 5 (mod 8). Since 1 ≤ x ≤ p − 1, $x^2 < p^2$,
so q < p, and in particular $p' < p$. But
$$x^2 = pq + 2 = p'(pq/p') + 2 \equiv 2 \pmod{p'}$$
and so 2 is a quadratic residue (mod $p'$), contradicting our inductive
hypothesis.
(2a) First we handle the case p ≡ 7 (mod 8). We proceed as in the proof
of part (1) to show the following claim: if p ≡ 5 or 7 (mod 8), then
−2 is a quadratic nonresidue (mod p).
For p = 5 this claim is certainly true. Now suppose p is prime,
p ≡ 5 (mod 8) or p ≡ 7 (mod 8), and suppose the claim is true for all
primes $p' < p$ with $p' \equiv 5 \pmod{8}$ or $p' \equiv 7 \pmod{8}$.
Again we proceed by contradiction. Suppose −2 is a quadratic residue
(mod p), and let x be an integer with $x^2 \equiv -2 \pmod{p}$. Again we may
assume 1 ≤ x ≤ p − 1 and x is odd, so $x^2 = pq - 2$ with q odd and
(again using Lemma B.35) $pq = x^2 + 2 \equiv 1 + 2 = 3 \pmod{8}$. If
p ≡ 5 (mod 8) this forces q ≡ 7 (mod 8), and if p ≡ 7 (mod 8) this
forces q ≡ 5 (mod 8). Then the prime factors of q are all less than p,
but not all of them can be congruent to 1 or 3 (mod 8), as if they were
we would have q ≡ 1 or 3 (mod 8). Hence q has a factor $p' < p$ with
$p' \equiv 5$ or 7 (mod 8), and $x^2 \equiv -2 \pmod{p'}$, and so −2 is a quadratic
residue (mod $p'$), contradicting our inductive hypothesis.
Now suppose p ≡ 5 (mod 8). Then p ≡ 1 (mod 4), so by Corol-
lary B.33, −1 is a quadratic residue (mod p). Then we can apply
Lemma B.27 to conclude that 2 = (−2)(−1) is a quadratic nonresidue
(mod p).
(2b) Now we handle the case p ≡ 1 (mod 8). So as not to interrupt the
flow of the argument, we use a result that we shall prove below.
We claim that the congruence $x^4 + 1 \equiv 0 \pmod{p}$ has a solution.
Assuming that claim, let $x = a_0$ be a solution. Now consider
$y = a_0^2 + 1$. Then
$$y^2 = (a_0^2 + 1)^2 = a_0^4 + 2a_0^2 + 1 \equiv 2a_0^2 \pmod{p},$$
(1) If p ≡ 1 (mod 8), then 2 and −2 are both quadratic residues (mod p).
(3) If p ≡ 5 (mod 8), then 2 and −2 are both quadratic nonresidues (mod p).
Proof: This follows directly from Lemma B.36, Corollary B.33, Corol-
lary B.30, and Lemma B.27.
Proof: On the one hand, suppose 2 is a quadratic residue (mod n). Then
$x^2 \equiv 2 \pmod{n}$ for some x. But then $x^2 \equiv 2 \pmod{p_2}$, i.e., 2 is a quadratic
residue (mod $p_2$), which is impossible by Corollary B.37(4).
On the other hand, suppose −2 is a quadratic residue (mod n). Then
$x^2 \equiv -2 \pmod{n}$ for some x. But then $x^2 \equiv -2 \pmod{p_1}$, i.e., −2 is a
quadratic residue (mod $p_1$), which is impossible by Corollary B.37(2).
(1) If at least one of p and q is congruent to 1 (mod 4), then one of the
following is true:
(2) If both p and q are congruent to 3 (mod 4), then one of the following
is true:
Proof: First suppose q ≡ 1 (mod 4). Then by Theorem B.40 (The Law of
Quadratic Reciprocity), q is a quadratic residue (mod p) if and only if p is a
quadratic residue (mod q). But, since q ≡ 1 (mod 4), p is a quadratic residue
(mod q) if and only if −p is a quadratic residue (mod q), by Corollary B.34,
yielding the result.
Next suppose q ≡ 3 (mod 4). Then by Theorem B.40 (The Law of
Quadratic Reciprocity), q is a quadratic residue (mod p) if and only if p is
a quadratic nonresidue (mod q). But, since q ≡ 3 (mod 4), p is a quadratic
nonresidue (mod q) if and only if −p is a quadratic residue (mod q), by
Corollary B.34, again yielding the result.
Proposition B.44 (Euler). Let p be an odd prime and let a be relatively prime
to p. Then $a^{(p-1)/2} \equiv \pm 1 \pmod{p}$. In fact, $a^{(p-1)/2} \equiv (a/p) \pmod{p}$, i.e.,
Lemma B.46 (Gauss's Lemma). Let p be an odd prime and let a be relatively
prime to p. Let $S = \{a, 2a, \ldots, ((p-1)/2)a\}$. Let
$$n = \#(\{\ell \text{ in } S \mid \tilde{\ell} > p/2\}).$$
Then $(a/p) = (-1)^n$.
Proof: Observe that all of the elements of S are relatively prime to p and
that no two of them are congruent (mod p). Let m = #({ in S | ˜ < p/2}).
Since, for any relatively prime to p, either ˜ < p/2 or ˜ > p/2, we have
m + n = p.
Denote by q1 , . . . , qm those remainders less than p/2 that appear when
elements of S are divided by p, and by r1 , . . . , rn those remainders greater
than p/2 that appear when elements of T are divided by p. Clearly the
elements of T = {q1 , . . . , qm , r1 , . . . , rn } are all distinct. We claim that in
fact the elements of U = {q1 , . . . , qm , p − r1 , . . . , p − rn } are all distinct.
To see this, suppose that qi = p − rj for some i and j. By definition,
qi ≡ va (mod p) and rj ≡ wa (mod p) for some integers v and w between 1
and (p − 1)/2. Then
0 ≡ p = qi + rj ≡ va + wa = (v + w)a (mod p),
which is impossible as 2 ≤ v + w ≤ p − 1, and so, in particular, v + w ≡
0 (mod p).
Observe that U is a set of (p − 1)/2 distinct integers, all between 1
and (p − 1)/2, so in fact we must have U = {1, . . . , (p − 1)/2} (where the
elements of U appear in some unpredictable but irrelevant order).
Let $\Pi_S$ be the product of the elements of S, $\Pi_T$ the product of the elements of T, and $\Pi_U$ the product of the elements of U. Let us calculate these numbers (mod p). First, from the definition of S, we see that
$$\Pi_S = a \cdot 2a \cdots ((p-1)/2)a = (a \cdots a)(1 \cdot 2 \cdots ((p-1)/2)) = a^{(p-1)/2}((p-1)/2)!.$$
Next, since T simply consists of the remainders when the elements of S are divided by p, we certainly have
$$\Pi_T = q_1 \cdots q_m r_1 \cdots r_n \equiv \Pi_S \pmod{p}.$$
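Finally, the punch line: We calculate $\Pi_U$ two ways. On the one hand, from the definition of U,
$$\Pi_U = q_1 \cdots q_m (p - r_1) \cdots (p - r_n) \equiv q_1 \cdots q_m (-r_1) \cdots (-r_n) = (-1)^n q_1 \cdots q_m r_1 \cdots r_n \equiv (-1)^n \Pi_T \pmod{p}.$$
On the other hand, since $U = \{1, \ldots, (p-1)/2\}$, we have $\Pi_U = ((p-1)/2)!$. Combining these with $\Pi_T \equiv \Pi_S = a^{(p-1)/2}((p-1)/2)! \pmod{p}$ gives $((p-1)/2)! \equiv (-1)^n a^{(p-1)/2}((p-1)/2)! \pmod{p}$, and since $((p-1)/2)!$ is relatively prime to p we may cancel it to obtain $a^{(p-1)/2} \equiv (-1)^n \pmod{p}$.
Euler's Criterion (Proposition B.44) tells us that $a^{(p-1)/2} \equiv (a/p) \pmod{p}$. Using that we then obtain $(a/p) = (-1)^n$.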
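The counting in Gauss's Lemma can be carried out directly for small primes. The sketch below is illustrative only: `gauss_n` and `legendre_euler` are hypothetical helper names, and the Legendre symbol is obtained from Euler's Criterion rather than from any routine in the text.

```python
def gauss_n(a, p):
    """Number of elements of S = {a, 2a, ..., ((p-1)/2)a} whose
    remainder on division by p is greater than p/2."""
    return sum(1 for k in range(1, (p - 1) // 2 + 1) if 2 * ((k * a) % p) > p)


def legendre_euler(a, p):
    """Legendre symbol (a/p) via Euler's Criterion, as +1 or -1."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


# Check (a/p) = (-1)^n for every a prime to p, for a few small odd primes.
for p in (3, 5, 7, 11, 13, 17, 19):
    for a in range(1, p):
        assert (-1) ** gauss_n(a, p) == legendre_euler(a, p)
```

For instance, with a = 3 and p = 7 the set S = {3, 6, 9} has remainders 3, 6, 2, of which only 6 exceeds 7/2, so n = 1 and (3/7) = −1.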
In the following lemma, [·] denotes the greatest integer function, as usual.
Lemma B.47. Let p be an odd prime and let a be an odd integer. Let n be as in the statement of Gauss's Lemma. Then
$$n \equiv n' = \sum_{k=1}^{(p-1)/2} [ka/p] \pmod{2}$$
and thus $(a/p) = (-1)^{n'}$.
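Writing $\widetilde{ka}$ for the remainder when $ka$ is divided by p, so that $\widetilde{ka} = ka - p[ka/p]$, and using that a and p are both odd, we have
$$\sum_{k=1}^{(p-1)/2} \widetilde{ka} \equiv \sum_{k=1}^{(p-1)/2} [ka/p] + \sum_{k=1}^{(p-1)/2} k \pmod{2}.$$
On the other hand, with $q_1, \ldots, q_m$ the remainders less than p/2 and $s_1, \ldots, s_n$ the remainders greater than p/2,
$$\begin{aligned}
\sum_{k=1}^{(p-1)/2} \widetilde{ka} &= \sum_{i=1}^{m} q_i + \sum_{j=1}^{n} s_j \\
&\equiv \sum_{i=1}^{m} q_i - \sum_{j=1}^{n} s_j \pmod{2} \\
&= -np + \sum_{i=1}^{m} q_i + \sum_{j=1}^{n} (p - s_j) \\
&\equiv n + \sum_{i=1}^{m} q_i + \sum_{j=1}^{n} (p - s_j) \pmod{2}.
\end{aligned}$$
Since $\{q_1, \ldots, q_m, p - s_1, \ldots, p - s_n\} = \{1, \ldots, (p-1)/2\}$, this gives
$$\sum_{k=1}^{(p-1)/2} \widetilde{ka} \equiv n + \sum_{k=1}^{(p-1)/2} k \pmod{2},$$
and comparing the two expressions for $\sum_{k=1}^{(p-1)/2} \widetilde{ka}$ yields the result.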
Proof: Let R be the rectangle in the xy-plane whose vertices are (0, 0), (p/2, 0), (0, q/2), and (p/2, q/2). Let D be the diagonal of R running from (0, 0) to (p/2, q/2). D divides R into two triangles. Let $T^+$ be the triangle lying above D and $T^-$ be the triangle lying below D. Consider the lattice points (i.e., points with integer coordinates) strictly inside R (i.e., in R but not on the boundary of R). Note that there are r = ((p − 1)/2)((q − 1)/2) such lattice points. Let $n^+$ be the number of these lattice points that are in $T^+$ and let $n^-$ be the number of these lattice points that are in $T^-$. Note that the line D has the equation y = (q/p)x, and so none of these lattice points lie on D. Hence we see that $r = n^+ + n^-$, and so
$$(-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} = (-1)^r = (-1)^{n^+ + n^-} = (-1)^{n^+} (-1)^{n^-}.$$
$$n^- = \sum_{k=1}^{(p-1)/2} n_k^-$$
where $n_k^- = \#(\{\text{lattice points in } T^- \text{ with } x\text{-coordinate } k\})$, i.e.,
$$n_k^- = \#(\{(k, y) \mid y \text{ is an integer with } 0 < y < (q/p)k\}) = [kq/p],$$
so $n^- = \sum_{k=1}^{(p-1)/2} [kq/p]$. Similarly, starting on the y-axis and moving horizontally to the right until we hit D we obtain that $n^+ = \sum_{k=1}^{(q-1)/2} [kp/q]$. But by Lemma B.47, $(q/p) = (-1)^{n^-}$ and $(p/q) = (-1)^{n^+}$, and we are done.
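The lattice-point count in this proof can be checked numerically. The following sketch is an illustration under stated assumptions: `lattice_counts` and `legendre_euler` are hypothetical names, the two sums are the floor sums appearing in the proof, and the Legendre symbols are computed from Euler's Criterion.

```python
def lattice_counts(p, q):
    """Return (n_minus, n_plus): the numbers of interior lattice points of the
    rectangle lying below and above the diagonal y = (q/p)x, as floor sums."""
    n_minus = sum((k * q) // p for k in range(1, (p - 1) // 2 + 1))
    n_plus = sum((k * p) // q for k in range(1, (q - 1) // 2 + 1))
    return n_minus, n_plus


def legendre_euler(a, p):
    """Legendre symbol (a/p) via Euler's Criterion, as +1 or -1."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


# For a few pairs of distinct odd primes, check the total count and both symbols.
for p, q in ((3, 5), (3, 7), (5, 7), (7, 11), (11, 13), (3, 11)):
    n_minus, n_plus = lattice_counts(p, q)
    assert n_minus + n_plus == ((p - 1) // 2) * ((q - 1) // 2)
    assert legendre_euler(q, p) == (-1) ** n_minus
    assert legendre_euler(p, q) == (-1) ** n_plus
```

With p = 3 and q = 7, for example, the counts are $n^- = 2$ and $n^+ = 1$, which add up to the 3 interior lattice points of the rectangle.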
(1) $\operatorname{ord}_p a = p - 1$.
Proof: We will prove that (1) is false if and only if (2) is false.
Suppose that (1) is false. Then, by Corollary B.52, $a^k \equiv 1 \pmod{p}$ for some k with 1 ≤ k < p − 1. But then $a^0$ and $a^k$ are not distinct (mod p), and (2) is false. On the other hand, suppose that (2) is false. Then $a^i \equiv a^j \pmod{p}$ with $i \ne j$ and i and j both between 0 and p − 2. We may assume that i < j. Set k = j − i. Then $a^k \equiv 1 \pmod{p}$ with 1 ≤ k < p − 1, and (1) is false.
Our goal is to show that every prime has a primitive root. We build up
to this by proving two results that are useful in themselves.
$$(a_1 a_2)^{k_1 k_2} = (a_1^{k_1})^{k_2} (a_2^{k_2})^{k_1} \equiv 1^{k_2} 1^{k_1} \equiv 1 \pmod{p},$$
so j divides $k_1 k_2$.
On the other hand, $a_1^j a_2^j = (a_1 a_2)^j \equiv 1 \pmod{p}$ gives $a_1^j \equiv a_2^{-j} \equiv (a_2^j)^{-1} \pmod{p}$. Then
Proof: Factor k as $k = p_1^{e_1} \cdots p_m^{e_m}$ with the $p_i$ distinct primes. Then, for each i, there is an element $b_i$ whose order $k_i$ is divisible by $p_i^{e_i}$, say $k_i = p_i^{e_i} q_i$ for some $q_i$. Set $a_i = b_i^{q_i}$. Then $\operatorname{ord}_p a_i = p_i^{e_i}$. Let $a = a_1 \cdots a_m$. Then, by Lemma B.55, $\operatorname{ord}_p a = k$.
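The way orders multiply in this construction can be seen in a small case. The sketch below assumes, as the surrounding argument suggests, that Lemma B.55 states that elements of relatively prime orders $k_1$ and $k_2$ have a product of order $k_1 k_2$; the function name `order` and the choice p = 31 are illustrative only.

```python
from math import gcd


def order(a, p):
    """Multiplicative order of a modulo the prime p (a not divisible by p),
    found by repeated multiplication."""
    k, t = 1, a % p
    while t != 1:
        t = (t * a) % p
        k += 1
    return k


p = 31
# Whenever two orders are relatively prime, the order of the product is
# the product of the orders (the property assumed of Lemma B.55).
for a1 in range(1, p):
    for a2 in range(1, p):
        k1, k2 = order(a1, p), order(a2, p)
        if gcd(k1, k2) == 1:
            assert order(a1 * a2, p) == k1 * k2

# Building an element of order 30 = 2 * 3 * 5 from elements of orders 2, 3, 5.
assert (order(30, p), order(5, p), order(2, p)) == (2, 3, 5)
assert order(30 * 5 * 2, p) == 30
```

Here 30 · 5 · 2 = 300 ≡ 21 (mod 31), so this computation exhibits 21 as a primitive root (mod 31).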
Lemma B.58. Let p be a prime and let r be a primitive root (mod p). Let a be relatively prime to p. Then $a \equiv r^k \pmod{p}$ for some k, and k is well defined (mod (p − 1)).
Proof: First let us note that the statement of the corollary makes sense, as if p is an odd prime then p − 1 is even. By Lemma B.58, k is well defined (mod (p − 1)) and so is certainly well defined (mod 2).
Suppose that $a \equiv r^k \pmod{p}$ with k even. Let k = 2j. Then $a \equiv r^k = r^{2j} = (r^j)^2 \pmod{p}$ and a is a quadratic residue (mod p). Conversely, suppose that a is a quadratic residue (mod p), so $a \equiv b^2 \pmod{p}$ for some b. Then $b \equiv r^j \pmod{p}$ for some j, so $a \equiv b^2 \equiv (r^j)^2 = r^{2j} = r^k \pmod{p}$ with k = 2j even.
(1) Let r be a primitive root (mod p). For any a relatively prime to p, let $a \equiv r^k \pmod{p}$. Then $(a/p) = (-1)^k$.
(2) There are (p − 1)/2 quadratic residues (mod p) and (p − 1)/2 quadratic nonresidues (mod p).
Proof:
(3) Let $b \equiv a^{(p-1)/2} \pmod{p}$. Then $b^2 \equiv a^{p-1} \equiv 1 \pmod{p}$. The congruence $0 \equiv x^2 - 1 = (x - 1)(x + 1) \pmod{p}$ has only the two solutions $x \equiv \pm 1 \pmod{p}$, so we see $b \equiv \pm 1 \pmod{p}$. Now $a \equiv r^k \pmod{p}$ for some k and then $b \equiv r^{k(p-1)/2} \pmod{p}$. If a is a quadratic residue (mod p), then k is even and k(p − 1)/2 is divisible by p − 1 and so $b \equiv 1 \pmod{p}$. If a is a quadratic nonresidue (mod p), then k is odd and k(p − 1)/2 is not divisible by p − 1 and so $b \not\equiv 1 \pmod{p}$, in which case we must have $b \equiv -1 \pmod{p}$.
(5) Let $a \equiv r^k \pmod{p}$ and $b \equiv r^\ell \pmod{p}$. Then $ab \equiv r^{k+\ell} \pmod{p}$. By part (1), $(a/p) = (-1)^k$ (mod p), $(b/p) = (-1)^\ell$ (mod p), and $(ab/p) = (-1)^{k+\ell}$ (mod p). But $(-1)^{k+\ell} = (-1)^k (-1)^\ell$.
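The relation between the parity of the exponent k and the Legendre symbol can be tabulated for a small prime. This is a sketch under assumptions not in the text: the helper `index_table` is hypothetical, and the choice p = 13 with primitive root r = 2 is only an example.

```python
def index_table(r, p):
    """Map each a in 1..p-1 to the exponent k in 0..p-2 with a ≡ r^k (mod p).
    Raises AssertionError if r is not a primitive root (mod p)."""
    table, t = {}, 1
    for k in range(p - 1):
        table[t] = k
        t = (t * r) % p
    assert len(table) == p - 1, "r is not a primitive root (mod p)"
    return table


p, r = 13, 2
ind = index_table(r, p)
squares = {(x * x) % p for x in range(1, p)}

# (a/p) = (-1)^k where a ≡ r^k (mod p).
for a in range(1, p):
    legendre = 1 if a in squares else -1
    assert legendre == (-1) ** ind[a]

# There are (p-1)/2 quadratic residues, hence (p-1)/2 nonresidues.
assert len(squares) == (p - 1) // 2

# Multiplicativity: exponents add, so the signs multiply.
for a in range(1, p):
    for b in range(1, p):
        assert (-1) ** ind[(a * b) % p] == (-1) ** ind[a] * (-1) ** ind[b]
```

Any other primitive root (mod 13) would give a different table of exponents but the same parities, as Lemma B.58 guarantees.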
B.6 Exercises
Exercise B.1. Fix a positive integer n. Let k be any integer and consider the set S = {k, k + 1, k + 2, . . . , k + n − 1}. Show that, for any integer x, the congruence x ≡ a (mod n) holds for exactly one of the integers a in S. (Compare Corollary B.6.) A set S with this property is called a complete system of residues (mod n).
Exercise B.2. Fix a positive integer n. Let k be any integer relatively prime
to n and consider the set S = {0, k, 2k, . . . , (n − 1)k}. Show that S is a
complete system of residues (mod n).
Exercise B.3. Let n be any positive integer and let S be any complete sys-
tem of residues (mod n). Show that S has n elements.
Exercise B.6. Use Euclid’s Algorithm (from Chapter 2) to solve the follow-
ing congruences:
(a) b(a/b) ≡ a;
(b) (a/b)(b/a) ≡ 1;
(In all cases we assume that the denominators are relatively prime to n.)
$$x^2 + y^2 \equiv c \pmod{p}$$
has a solution.
(b) More generally, suppose that $a \not\equiv 0 \pmod{p}$ and $b \not\equiv 0 \pmod{p}$. Show that for any c, the congruence
$$ax^2 + by^2 \equiv c \pmod{p}$$
has a solution.
Exercise B.11. Use the properties of the Legendre symbol to find the value
of each of the following Legendre symbols with a minimum of hand
computation:
(a) (9767/9931);
(b) (9803/9967);
(c) (−210/991);
(d) (−210/983);
(e) (2747/2897);
(f) (2747/2837).
(All of the odd integers above are prime, except for 2747, which is
composite.)
Exercise B.12.
(b) Let p = 2 and let a be an odd integer. Observe that the congruence $x^2 \equiv a \pmod{2}$ always has a solution, and that the congruence $x^2 \equiv a \pmod{4}$ has a solution if and only if $a \equiv 1 \pmod{4}$. Let n ≥ 3 be an arbitrary integer. Show that the congruence $x^2 \equiv a \pmod{2^n}$ has a solution if and only if $a \equiv 1 \pmod{8}$.
Exercise B.13. Let m and n be relatively prime. Show that the congruence $x^2 \equiv a \pmod{mn}$ has a solution if and only if the congruences $x^2 \equiv a \pmod{m}$ and $x^2 \equiv a \pmod{n}$ have solutions.
Exercise B.14. Let $b = 2^{e_0} p_1^{e_1} \cdots p_k^{e_k}$ with $\{p_i\}$ distinct odd primes, $e_0 \ge 0$ and $e_i > 0$ for i > 0. Let a be relatively prime to b. Show that the congruence $x^2 \equiv a \pmod{b}$ has a solution if and only if a is a quadratic residue (mod $p_i$) for each i and
Exercise B.15. Use Gauss’s Lemma (Lemma B.46) to prove Theorem B.49,
i.e., that for an odd prime p
Exercise B.16.
(a) Use Gauss's Lemma (Lemma B.46) directly to prove the Law of Quadratic Reciprocity for p = 3. (Hint: consider q (mod 12). There are four cases.)
(b) Use Gauss's Lemma (Lemma B.46) directly to prove the Law of Quadratic Reciprocity for p = 5. (Hint: consider q (mod 20). There are eight cases.)
Exercise B.17. Let b > 1 be an odd integer and let a be relatively prime to b. Let $b = p_1^{e_1} \cdots p_k^{e_k}$ be the prime factorization of b. The Jacobi symbol (a/b) is defined by $(a/b) = (a/p_1)^{e_1} \cdots (a/p_k)^{e_k}$ where the symbols on the right are Legendre symbols.
(a) Show that if (a/b) = −1, then the congruence $x^2 \equiv a \pmod{b}$ does not have a solution.
(g) Suppose that a > 1 and b > 1 are relatively prime odd integers. Show that $(a/b)(b/a) = (-1)^{((a-1)/2)((b-1)/2)}$.
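For experimenting with this exercise, the Jacobi symbol can be computed straight from its definition. The following is a minimal sketch, not a method from the text: all function names are hypothetical, the factorization uses plain trial division, and the Legendre symbols are computed from Euler's Criterion.

```python
def prime_factorization(b):
    """Return {prime: exponent} for an integer b > 1, by trial division."""
    factors, d = {}, 2
    while d * d <= b:
        while b % d == 0:
            factors[d] = factors.get(d, 0) + 1
            b //= d
        d += 1
    if b > 1:
        factors[b] = factors.get(b, 0) + 1
    return factors


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def jacobi(a, b):
    """Jacobi symbol (a/b) from the definition, for odd b > 1 prime to a."""
    result = 1
    for p, e in prime_factorization(b).items():
        result *= legendre(a, p) ** e
    return result


def is_square_mod(a, b):
    """True if x^2 ≡ a (mod b) has a solution (brute-force search)."""
    return any((x * x - a) % b == 0 for x in range(b))


# (a/b) = -1 rules out a solution, as in part (a) ...
assert jacobi(7, 15) == -1 and not is_square_mod(7, 15)
# ... but (a/b) = 1 does not guarantee one: (2/15) = (2/3)(2/5) = (-1)(-1) = 1.
assert jacobi(2, 15) == 1 and not is_square_mod(2, 15)
```

The second check is a reminder that only the implication in part (a) holds; the converse fails already for b = 15.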
Exercise B.18.
(a) For each prime number p between 2 and 19, find all primitive roots
(mod p).
(b) For each prime number p between 23 and 47, find at least one primitive
root (mod p).
Exercise B.19. Let p be a prime and let r be a primitive root (mod p). Let
d be a positive integer.
(d) In general, let g = gcd(d, p − 1). Show that $a^d \equiv 1 \pmod{p}$ has exactly g solutions, and that these are given by $a \equiv r^{k(p-1)/g} \pmod{p}$ for k = 0, 1, . . . , g − 1.
Appendix C
is a point γ with $\|\gamma_0 - \gamma\|_6 < 1$, so our first choice γ = 1 will do, and we
do not have to go out apparently this far.)
We have previously noted that we may restrict our attention to 0 ,
the region consisting of points that are apparently closest to the origin. In
fact, we will begin by considering + 0 , the portion of this region in the first
quadrant. (In case D ≡ 2 or 3 (mod 4), this is the square with vertices
(0, 0), (1/2, 0), (1/2, 1/2), and (0, 1/2), and in case D ≡ 1 (mod 4), this is
the right triangle with vertices (0, 0), (1/2, 0), and (0, 1/2).)
Again we begin with D < 0, where we can consider ellipses.
First consider D = −7. Then the points (x, y) with $\|(x, y) - (0, 0)\|_{-7} < 1$ are the interior of an ellipse centered at (0, 0), and the points (x, y) with $\|(x, y) - (1/2, 1/2)\|_{-7} < 1$ are the interior of an ellipse centered at (1/2, 1/2), and these two ellipses completely cover + 0 . See Figure C.1. (You should check this. Verify that the ellipse centered at (0, 0) crosses the y-axis at the point $(0, 1/\sqrt{7})$ and the line x + y = 1/2 at the point (1/8, 3/8), and the ellipse centered at (1/2, 1/2) crosses the y-axis at the point $(0, 1/2 - (1/2)\sqrt{3/7})$ and the line x + y = 1/2 at the point (3/8, 1/8), so the “top” of the first ellipse lies above the “bottom” of the second ellipse in + 0 .)
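The check suggested above can be done by machine as well as by hand. This sketch assumes that the distance in question is $|x^2 - Dy^2|$ for the point (x, y) representing $x + y\sqrt{D}$, which for D = −7 is $x^2 + 7y^2$; the function name `norm` is hypothetical.

```python
from fractions import Fraction as F
from math import sqrt


def norm(x, y, D):
    """|x^2 - D*y^2| for the point (x, y); an assumed form of the distance."""
    return abs(x * x - D * y * y)


D = -7
half = F(1, 2)

# Rational points on the line x + y = 1/2, checked exactly.
assert norm(F(1, 8), F(3, 8), D) == 1                 # ellipse centered at (0, 0)
assert norm(F(3, 8) - half, F(1, 8) - half, D) == 1   # ellipse centered at (1/2, 1/2)

# Crossings of the y-axis are irrational, so they are checked in floating point.
top_of_first = 1 / sqrt(7)
bottom_of_second = 0.5 - 0.5 * sqrt(3 / 7)
assert abs(norm(0.0, top_of_first, D) - 1) < 1e-12
assert abs(norm(0.0 - 0.5, bottom_of_second - 0.5, D) - 1) < 1e-12

# The "top" of the first ellipse lies above the "bottom" of the second.
assert top_of_first > bottom_of_second
```

Using `Fraction` keeps the rational checks exact; only the two crossings of the y-axis need a floating-point tolerance.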
the line x + y = 1/2 at $((\sqrt{21} - 1)/8, (5 - \sqrt{21})/8)$, so the picture is as
shown.) Again, by symmetry, once we have covered + 0 we can cover 0.
The situation for D = 13 is very similar. Again the hyperbolic regions
centered at (0, 0) and (1/2, 1/2) cover + 0 . See Figure C.4. (Again you
should verify this, figuring out the coordinates of the intersection points for
yourself.) Again, by symmetry, once we have covered + 0 we can cover 0.
Now we turn to D = 6. Here the interiors of the hyperbolic regions centered at (0, 0) and (−1, 0) cover all of + 0 . (See Figure C.5.) There is one subtlety here, however, that we need to remark on. The top curve $x^2 - 6y^2 = -1$ and the right-hand curve $(x + 1)^2 - 6y^2 = 1$ intersect at the point $(1/2, \sqrt{5/24})$ on the right-hand border of + 0 . But remember that we are only concerned with Q-points (e, f ), i.e., with points (e, f ) with both coordinates rational numbers (as these are the points that correspond to elements $e + f\sqrt{D}$ of $\mathbb{Q}(\sqrt{D})$) and the y-coordinate $\sqrt{5/24}$ of this point is not a rational number, so this point is not a Q-point and we have indeed covered + 0 , and again by symmetry we cover 0.
The situation for D = 7 is very similar. See Figure C.6. Again, using hyperbolas centered at (0, 0) and (−1, 0) we cover all of + 0 . We have the same subtlety. The two hyperbolas intersect at the point $(1/2, \sqrt{5/28})$, but this is not a Q-point. Again, by symmetry, once we have covered + 0 we can cover 0 , so we are done.
To recapitulate, in the cases D = −1, −2, −3, 2, and 3 we could find a single value of γ so that the associated region covered + 0 , and in the cases D = −7, −11, 5, 13, 6, and 7 we could find two values of γ so that the two associated regions together covered + 0 . We can handle other values of D if we use more values of γ for each D. For D = 17 we can choose γ = 0, $(1/2) + (1/2)\sqrt{17}$, or −1. For D = 21 we can choose γ = 0, $(1/2) + (1/2)\sqrt{21}$, or −1. For D = 29 we can choose γ = 0, $(1/2) + (1/2)\sqrt{29}$, −1, or $4 + \sqrt{29}$. For D = 11 we can choose γ = 0, −1, $2 + \sqrt{11}$, or $-5 - \sqrt{11}$. We leave the details for the exercises.
which are the apparently, and also the actually, nearest lattice points. We may choose either one of them, and we choose the point (1, 1) representing 1 + j. (Once again we check that the actual distance between this point and the original point is $\|(1, 1) - (3/4, 3/4)\|_{-7} = \|(1/4, 1/4)\|_{-7} = |(1/4)^2 + 7(1/4)^2| = 8/16 < 1$, as we expect.)
C.3 Exercises
Exercise C.1. Fill in the details of the proof of Theorem 2.8:
(a) {2 + j, 13 + j};