Abstract Algebra
Dmitriy Rumynin∗
January 6, 2011
The lecture notes are split into 27 sections. Each section will be discussed
in one lecture, making every lecture self-contained. This means that the
material in a section may be reshuffled or even skipped for the lecture,
although the numeration of propositions will be consistent. The remaining
(up to 3) lectures will be spent on revisions and exercises including past
exams.
These written notes are the official curriculum: anything in them except
vistas can appear on the exam. Each section contains exercises that you
should do. To encourage you to do them, I will use some of the exercises in
the exam.
Vista sections are not assessed or examined in any way. Skip them if
you are allergic to nuts or psychologically fragile! The vistas are food for
further contemplation. A few of them are sky blue, but most are second year
material that we don’t have time to cover. You are encouraged to expand
one of them into your second year essay.
The main recommended book is Concrete Abstract Algebra by Lauritzen.
It is reasonably priced (£25 new, £11 used on Amazon), mostly relevant
(except chapter 5) and quite thin. The downside of the book is brevity of
exposition and some students prefer more substantial books. An excellent
UK-style textbook is Introduction to Algebra by Cameron (from £17 on
Amazon). Another worthy book is Algebra by Artin (£75 on Amazon for
the new edition but you can get older editions for around £45). This one
has all the details you need and much more than you can bear. Most
of the algebraists I asked started with this book and absolutely love it.
A similarly priced (and better according to some) alternative to Artin is
Abstract Algebra by Dummit and Foote (£55). My own first book was
Algebra by van der Waerden (£30 for each volume on Amazon) but it appears
that most of the mathematicians who hate Algebra in their later life
started with it.
∗ © Dmitriy Rumynin 2007
An alternative strategy is to get two books: one for rings and one for
groups. Virtually any pair of books will cover all the topics in these lecture
notes, although some interaction between subjects will be missing.
If you see any errors, misprints or oddities of my English, send me an
email. If you think that some bits require better explanation, write to me as
well. All contributions will be acknowledged. I would like to thank the
second year students of 2007 Rupesh Bhudia, Iain Embrey, Alexander
Illingworth, Matthew Hutton, Philip Jackson, Sebastian Jorn, Karl Pountney,
Jack Shaw-Dunn, Gareth Speight, Mohamed Swed and Jason Warnett for
valuable suggestions on how to improve these lecture notes.
Contents
1 Groups and subgroups
8 Normal subgroups
9 Homomorphisms
10 Quotient groups
13 Actions, stabilisers and alternating group
14 Orbits
15 Fixed points
16 Conjugacy classes
19 Euclidean domains
22 Gaussian primes
(a) (Identity) for all g ∈ G, e ◦ g = g; and
(b) (Inverse) for all g ∈ G there exists h ∈ G such that h ◦ g = e.
(Actually Property (i) does not really need stating, because it is implied
by the fact that ◦ : G × G → G is a binary operation on G. But it is
traditionally the first of the four group axioms, so we have included it here!)
The number of elements in G is called the order of G and is denoted by
|G|. This may be finite or infinite.
An element e ∈ G satisfying (iii) of the definition is called an identity
element of G, and for g ∈ G, an element h that satisfies (iii)(b) of the
definition (h ◦ g = e) is called an inverse element of g.
We shall immediately prove two technical lemmas, which are often in-
cluded as part of the definition of a group.
Lemma 1.2 Let G be a group. Then G has a unique identity element, and
any g ∈ G has a unique inverse.
Proof: Let e and f be two identity elements of G. Then, e ◦ f = f , but
by Lemma 1.1, we also have e ◦ f = e, so e = f and the identity element is
unique.
Let h and h′ be two inverses for g. Then h ◦ g = h′ ◦ g = e, but by
Lemma 1.1 we also have g ◦ h = e, so
h = e ◦ h = (h′ ◦ g) ◦ h = h′ ◦ (g ◦ h) = h′ ◦ e = h′
• multiplicative groups, where we omit the ◦ sign ($g \circ h$ becomes just $gh$),
we denote the identity element by 1 rather than by $e$, and we denote
the inverse of $g \in G$ by $g^{-1}$; or
• additive groups, where we replace ◦ by +, we denote the identity element
by 0, and we denote the inverse of $g$ by $-g$.
If there is more than one group around, and we need to distinguish
between the identity elements of $G$ and $H$ say, then we will denote them by
$1_G$ and $1_H$ (or $0_G$ and $0_H$).
Additive groups will always be commutative, but multiplicative groups
may or may not be commutative. The default will be to use the multiplica-
tive notation.
The proof of the next lemma is not in the lecture. Try proving it yourself
before reading my proof. From now on, this result will be used freely and
without explicit reference.
It is straightforward to check that $G \times H$ is a group under this operation.
Note that the identity element is $(1_G, 1_H)$, and the inverse of $(g, h)$ is just
$(g^{-1}, h^{-1})$.
1.4 Multiplication Table
A convenient way to describe a group is by writing its multiplication
table. For instance, the Klein four group $K_4 = \langle a, b \mid a^2, b^2, (ab)^2 \rangle$ is the set
$\{1, a, b, c = ab\}$ with the multiplication table:
      | 1  a  b  c
    --+-----------
    1 | 1  a  b  c
    a | a  1  c  b
    b | b  c  1  a
    c | c  b  a  1
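The table can be checked mechanically against the group axioms. Here is a minimal Python sketch; the names `TABLE`, `mul`, `is_group` and `is_commutative` are my own illustrative choices and not part of the notes.

```python
from itertools import product

# Multiplication table of K4 = {1, a, b, c}, copied from the text:
# TABLE[x][y] is the product x*y (row x, column y).
ELEMENTS = ["1", "a", "b", "c"]
TABLE = {
    "1": {"1": "1", "a": "a", "b": "b", "c": "c"},
    "a": {"1": "a", "a": "1", "b": "c", "c": "b"},
    "b": {"1": "b", "a": "c", "b": "1", "c": "a"},
    "c": {"1": "c", "a": "b", "b": "a", "c": "1"},
}


def mul(x, y):
    """Product x*y read off the table."""
    return TABLE[x][y]


def is_group(elements, mul, identity):
    """Check associativity, left identity and left inverses by brute force."""
    assoc = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                for x, y, z in product(elements, repeat=3))
    ident = all(mul(identity, x) == x for x in elements)
    inverses = all(any(mul(y, x) == identity for y in elements)
                   for x in elements)
    return assoc and ident and inverses


def is_commutative(elements, mul):
    """Check xy = yx for all pairs."""
    return all(mul(x, y) == mul(y, x)
               for x, y in product(elements, repeat=2))
```

Brute force needs about $|G|^3$ table look-ups for associativity, which is fine for 4 elements and hopeless for large groups.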
However, if the group is infinite, or finite but large, the multiplication
table approach is not practical. For instance, the famous Monster
group has approximately $8 \times 10^{53}$ elements. Do you think it is a good idea
to write its multiplication table? Nevertheless, this group can be explicitly
described as a subgroup of a larger group, which can be well understood.
1.5 Definition of a subgroup
Definition. A subset H of a group G is called a subgroup of G if it forms a
group under the same operation as that of G.
be true. Conversely, if (i) and (ii) hold, then we need to show that the
other two axioms, ‘Associativity’ and ‘Identity’ hold in $H$. Associativity
holds because it holds in $G$, and $H$ is a subset of $G$. Since we are assuming
that $H$ is nonempty, there exists $h \in H$, and then $h^{-1} \in H$ by (ii), and
$hh^{-1} = 1 \in H$ by (i), and so ‘Identity’ holds, and $H$ is a subgroup. □
1.9 Vista: other products of groups
Direct products are not the only products of groups available. A
close relative is the semidirect product. You can twist direct and semidirect
products by a cocycle to get twisted and crossed products. There are also
bicrossed products and knit products. Amazingly enough, all of them are
various group structures on the set $G \times H$ for a pair of groups.
If you are willing to consider more general sets, I have further products
up my sleeve. If you can solve Rubik’s cube, you have experience with the
wreath product! It is used to build the Rubik’s cube group out of two symmetric
and two abelian groups. Topologists use free products, HNN-extensions and
amalgams. Any of these constructions could be a nice topic for a second
year essay.
such things show up in the course, we would call them rings without identity.
A natural question is whether 1 should be different from 0.
Observe that $R[X_1, \ldots, X_n]$ is commutative if and only if $R$ is commutative.
2.5 Subrings
Definition. A subset S of a ring R is called a subring of R if it forms a ring
under the same operations as those of R with the same identity element.
The identity element of a ring gives us trouble again. It is possible
for a subset to be a ring under the same operations but with a different
identity element (see Section 2.9 for an example). As a logician would put
it, we consider rings with identity in the signature.
Definition. An element $x$ of a ring $R$ is called a unit if there exists an
element $x' \in R$ such that $xx' = x'x = 1_R$.
Lemma 2.5 All the units in a ring $R$ form a group under multiplication.
Proof: Let us denote by $R^\times$ the set of all units in $R$. The set $R^\times$ is closed
under the product: if $x, y \in R^\times$ then $(xy)(y'x') = 1_R = (y'x')(xy)$, so $xy$ is a unit.
The product on $R^\times$ is associative because the product on $R$ is associative.
The identity element of $R^\times$ is $1_R$ and the inverse of $x$ is $x'$. □
In particular, $x'$ is unique and will be denoted $x^{-1}$ to be consistent with
the rest of the notation. The group $R^\times$ is called the group of units of the
ring $R$ or the multiplicative group of $R$. For example, the multiplicative
group $M_n(R)^\times$ of the matrix ring $M_n(R)$ is called the general linear group
and denoted $GL_n(R)$.
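For a small finite ring the group of units can be listed by brute force. The following Python sketch does this for $\mathbb{Z}_n$; the function names `units` and `is_closed` are hypothetical helpers of mine, not notation from these notes.

```python
def units(n):
    """Units of Z_n: those x for which some y in Z_n has x*y = 1 in Z_n.

    The identity of Z_n is written 1 % n so that the zero ring Z_1,
    where 1 = 0, is handled as well.
    """
    return [x for x in range(n)
            if any((x * y) % n == 1 % n for y in range(n))]


def is_closed(us, n):
    """Check that the product of any two listed elements is listed again."""
    listed = set(us)
    return all((x * y) % n in listed for x in us for y in us)
```

For instance `units(8)` lists the four odd residues, and `units(1)` returns the single element of the zero ring.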
2.8 Fields
Definition. A field is a commutative ring $K$ such that $K^\times = K \setminus \{0\}$. A
subfield is a subring of a field which is a field under the same operations.
Let us look at some familiar rings. For integers, $\mathbb{Z}^\times = \{\pm 1\} \neq \mathbb{Z} \setminus \{0\}$,
so $\mathbb{Z}$ is not a field. For complex numbers, $\mathbb{C}^\times = \mathbb{C} \setminus \{0\}$, so $\mathbb{C}$ is a field. For
the zero ring, $\mathbb{Z}_1^\times = \{0\} \neq \mathbb{Z}_1 \setminus \{0\} = \emptyset$, so $\mathbb{Z}_1$ is not a field. Thus, a field has
at least two distinct elements $0 \neq 1$.
2.9 Direct product of rings
Direct products of rings are very similar to direct products of groups.
Definition. Let $R$ and $S$ be two rings. We define the direct product $R \times S$
of $R$ and $S$ to be the set $\{(r, s) \mid r \in R, s \in S\}$ of ordered pairs of elements
from $R$ and $S$, with the obvious component-wise addition and multiplication
$(r_1, s_1) + (r_2, s_2) = (r_1 + r_2, s_1 + s_2)$, $(r_1, s_1)(r_2, s_2) = (r_1 r_2, s_1 s_2)$
for $r_1, r_2 \in R$ and $s_1, s_2 \in S$.
It is straightforward to check that $R \times S$ is a ring under these operations.
Note that the identity element is $(1_R, 1_S)$ but $R$ and $S$ are not subrings of
$R \times S$, in general. Indeed, $R$ can be thought of as the set of elements of the form
$(r, 0_S)$ but it does not contain the identity element of $R \times S$.
2.10 Exercises
(i) Which of the rings $\mathbb{Z}_2, \mathbb{Z}_3, \mathbb{Z}_4, \mathbb{Z}_5, \mathbb{Z}_6, \mathbb{Z}_7, \mathbb{Z}_8$ are fields?
(ii) Prove that $\mathbb{Z}_n$ is a field if and only if $n$ is prime.
(iii) Find $R^+$ and $R^\times$ for the direct product ring $R = \mathbb{Z}_2 \times \mathbb{Z}_2$.
(iv) What is the intersection of the subgroups $\mathbb{R}^\times$ and $\{z \mid |z| = 1\}$ of the
multiplicative group of the complex numbers $\mathbb{C}^\times$?
(v) Consider a ring $R$ without identity element. Define $\widehat{R} = R \times \mathbb{Z}$ with
operations given by $(r, n) + (s, m) = (r + s, n + m)$ and
$(r, n) \cdot (s, m) = (rs + mr + ns, mn)$. Prove that this is a ring.
(vi) An element $p \in R$ such that $p^2 = p$ is called an idempotent. Prove that
a field contains exactly two distinct idempotents.
(vii) Describe all idempotents in $M_2(\mathbb{R})$.
(viii) Consider the set $C(\mathbb{R}, \mathbb{R})$ of all functions from real numbers $\mathbb{R}$ to
real numbers $\mathbb{R}$, continuous at all but finitely many points. Using the
fact from Analysis that the sum and the product of two continuous
functions are continuous, prove that $C(\mathbb{R}, \mathbb{R})$ is a ring.
(ix) Which of the following subsets form a subring of $C(\mathbb{R}, \mathbb{R})$: smooth
functions $C^\infty(\mathbb{R}, \mathbb{R})$, compactly supported functions $C_c(\mathbb{R}, \mathbb{R})$, polynomial
functions $f(X)$ such that $f'(0) = 0$?
(x) Let $V = U \oplus W$ be three vector spaces over $\mathbb{R}$ such that both $V$ and $U$
are of countable dimension. We consider the ring $R = L_{\mathbb{R}}(V)$ of all
linear maps $V \to V$ with the composition of maps as a multiplication.
Choose a linear isomorphism $f : V \to U$ that becomes an element of
$R$ now. Prove that there are infinitely many $x \in R$ such that $xf = 1_R$.
Conclude that $f$ is not a unit and there is no $y \in R$ such that
$fy = 1_R$.
2.11 Vista: path rings
Besides the ring of commutative polynomials $R[X_1, \ldots, X_n]$ there is also
a ring of noncommutative polynomials $R\langle X_1, \ldots, X_n \rangle$. Inside this ring
$X_1 X_2 \neq X_2 X_1$. This ring is of crucial importance in Physics and Engineering
as its elements are tensors!
The general algebraic object is the tensor ring of a bimodule. An important
subclass consists of path algebras or quiver algebras¹. The elements of a path
algebra are formal linear combinations of paths in a directed graph (quiver).
The product of two paths is their concatenation if one path ends where the
other one starts, or zero if not. For instance, $R\langle X_1, \ldots, X_n \rangle$ is the path
algebra of the graph with 1 vertex and $n$ loops at this vertex.
groups.
3.1 Isomorphisms
Later on, we shall be considering the more general case of homomor-
phisms, but for now we just introduce the important special case of isomor-
phisms.
Definition. An isomorphism $\varphi : G \to H$ between two groups $G$ and $H$ is a
bijection from $G$ to $H$ such that $\varphi(g_1 g_2) = \varphi(g_1)\varphi(g_2)$ for all $g_1, g_2 \in G$. Two
groups $G$ and $H$ are called isomorphic if there is an isomorphism between
them. In this case we write $G \cong H$.
Isomorphic groups may be considered to be essentially the same group:
$H$ can be obtained from $G$ simply by relabelling the elements of $G$. A
ring isomorphism is defined in the same way.
Definition. An isomorphism $\varphi : R \to T$ between two rings $R$ and $T$ is
a bijection from $R$ to $T$ such that $\varphi(r_1 r_2) = \varphi(r_1)\varphi(r_2)$ and
$\varphi(r_1 + r_2) = \varphi(r_1) + \varphi(r_2)$ for all $r_1, r_2 \in R$. Two rings $R$ and $T$ are called isomorphic if
there is an isomorphism between them. In this case we write $R \cong T$.
In Section 2.6, we saw that the ring of complex numbers $\mathbb{C}$ is isomorphic
to a subring of $M_2(\mathbb{R})$. Here is our first example of a group isomorphism.
The groups $C_2 \times \mathbb{R}^+$ and $\mathbb{R}^\times$ are isomorphic. To write it explicitly, it is
convenient to think of $C_2$ as its isomorphic group $\mathbb{Z}^\times = \{1, -1\}$. The map
$\varphi : C_2 \times \mathbb{R}^+ \to \mathbb{R}^\times$ defined by $\varphi(\varepsilon, x) = \varepsilon e^x$ is an isomorphism.
The isomorphism $\varphi : C_2 \times \mathbb{R}^+ \to \mathbb{R}^\times$ illustrates the fact that often it is
important that isomorphic groups (or rings) are not equal. Restricted to the
subgroup $\mathbb{R}^+$, $\varphi$ is the exponential function $e^x : \mathbb{R} \to \mathbb{R}_{>0}$. You won’t get far in
your analysis exam if you think of it as an equality!
3.2 Elementary Properties – Orders of Elements
First some more notation. In a multiplicative group $G$, we define $g^2 = gg$,
$g^3 = ggg$, $g^4 = gggg$, etc. Formally, for $n \in \mathbb{N}$, we define $g^n$ inductively,
by $g^1 = g$ and $g^{n+1} = g g^n$ for $n \geq 1$. We also define $g^0$ to be the identity
element 1, and $g^{-n}$ to be the inverse of $g^n$. Then $g^{x+y} = g^x g^y$ for all $x, y \in \mathbb{Z}$.
In an additive group, $g^n$ becomes $ng$, where $0g = 0$, and $(-n)g = -(ng)$.
Definition. Let $g \in G$. Then the order of $g$, denoted by $|g|$, is the least $n > 0$
such that $g^n = 1$, if such an $n$ exists. If there is no such $n$, then $g$ has infinite
order, and we write $|g| = \infty$.
Note that if $g$ has infinite order, then the elements $g^x$ are distinct for
distinct values of $x$, because if $g^x = g^y$ with $x < y$, then $g^{y-x} = 1$ and $g$ has
finite order.
Similarly, if $g$ has finite order $n$, then the $n$ elements $g^0 = 1$, $g^1 = g$, $\ldots$,
$g^{n-1} = g^{-1}$ are all distinct, and for any $x \in \mathbb{Z}$, $g^x$ is equal to exactly
one of these $n$ elements. Proofs of the next three lemmas are left as exercises
for the reader.
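To see these statements in a concrete case, one can compute orders in the multiplicative group $\mathbb{Z}_n^\times$. The Python sketch below is only an illustration under that assumption; `element_order` and `powers` are names I made up for it.

```python
def element_order(g, n):
    """Order of g in the group of units of Z_n.

    Assumes gcd(g, n) = 1, so that g is a unit and some power of g is 1;
    otherwise the loop would never stop.
    """
    power, k = g % n, 1
    while power != 1 % n:
        power = (power * g) % n   # g^(k+1) = g * g^k
        k += 1
    return k


def powers(g, n):
    """The list g^0, g^1, ..., g^(|g|-1) computed in Z_n."""
    return [pow(g, x, n) for x in range(element_order(g, n))]
```

With $g = 3$ in $\mathbb{Z}_7^\times$ the order is 6, the six powers are distinct, and any other power $g^x$ coincides with the entry at position $x$ modulo 6 of the list.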
(1, 3), whereas φ2 φ1 = (2, 3). This example shows that Sym(X) is not in
general a commutative group. (In fact it is commutative only when |X| ≤ 2.)
The inverse of a permutation can be calculated easily by just reversing all
of the cycles. For example, the inverse of (1, 5, 3, 6)(2, 8, 7) is (6, 3, 5, 1)(7, 8, 2),
which is the same as (1, 6, 3, 5)(2, 7, 8). (The cyclic representation is not
unique: (a1 , a2 , . . . , ar ) = (a2 , a3 , . . . , ar , a1 ), etc.)
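The rule of reversing the cycles is easy to try out. A small Python sketch follows, in which a permutation is stored as a list of disjoint cycles; this representation and the helper names `cycles_to_map`, `inverse_cycles` and `normalise` are my own assumptions.

```python
def cycles_to_map(cycles):
    """Turn a product of disjoint cycles into a dict point -> image."""
    perm = {}
    for cyc in cycles:
        for i, x in enumerate(cyc):
            perm[x] = cyc[(i + 1) % len(cyc)]
    return perm


def inverse_cycles(cycles):
    """Invert a product of disjoint cycles by reversing each cycle."""
    return [tuple(reversed(cyc)) for cyc in cycles]


def normalise(cyc):
    """Rotate a cycle so that its smallest entry comes first."""
    i = cyc.index(min(cyc))
    return tuple(cyc[i:] + cyc[:i])
```

Applied to $(1, 5, 3, 6)(2, 8, 7)$, `inverse_cycles` gives $(6, 3, 5, 1)(7, 8, 2)$, and `normalise` rewrites the two cycles as $(1, 6, 3, 5)$ and $(2, 7, 8)$.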
Proposition 3.4 Let X and Y be two sets with |X| = |Y |. Then the groups
Sym(X) and Sym(Y ) of all permutations of X and Y are isomorphic.
Proof: Let $\psi : X \to Y$ be a bijection. The map $\mathrm{Sym}(X) \to \mathrm{Sym}(Y)$
defined by $f \mapsto \psi f \psi^{-1}$ is an isomorphism. □
The notation Sym(n) or $S_n$ is standard for the symmetric group on a set
$X$ with $|X| = n$. By default, we take $X = \{1, 2, 3, \ldots, n\}$.
3.4 Exercises
(i) Show that the relationship between groups of being isomorphic satisfies
the conditions of an equivalence relation; that is, $G \cong G$,
$G \cong H \Rightarrow H \cong G$, and $G \cong H, H \cong K \Rightarrow G \cong K$.
(ii) Prove Lemma 3.3.
(iii) If $X$ is finite, what is the order of Sym(X) as a function of $|X|$?
(iv) Let $n$ be a positive integer and $H_n = \{z \in \mathbb{C} \mid z^n = 1\}$. Prove that
$H_n$ is a subgroup of $\mathbb{C}^\times$, isomorphic to $C_n$.
(v) A matrix $M \in GL_n(R)$ is a permutation matrix if in each row and
column $n - 1$ entries are $0_R$ and the remaining entry is $1_R$. Prove that
permutation matrices form a subgroup of $GL_n(R)$ isomorphic to $S_n$.
3.5 Vista: linear groups
A group $G$ is called linear if it is isomorphic to a subgroup of $GL_n(F)$
for some $n$ and some field $F$. A few things could prevent a group from being linear. For
instance, Sym(X) is not linear if $X$ is infinite. Observe that Sym(X) has at
least countably many pairwise commuting (i.e. $xy = yx$) elements of order 2. Let us
assume that $1 + 1 \neq 0$ in $F$, which implies that any matrix of order 2 can be
diagonalised. Now finitely many commuting diagonalisable matrices can be
simultaneously diagonalised. But $GL_n(F)$ can accommodate at most $2^n - 1$
diagonal matrices of order 2.
There are other obstacles. In 1902 Burnside proved that a finitely generated
linear periodic³ group must be finite. In 1964 Golod and Shafarevich
constructed an infinite finitely generated periodic group⁴. These groups cannot
be linear by Burnside’s theorem.
³ Periodic means that every element has finite order.
Given an infinite group, it can be a non-trivial challenge to establish
whether it is linear. For instance, it was a long-standing problem whether
the braid group⁵ is linear. It was solved affirmatively in 2000 by Warwick
mathematician Daan Krammer.
Lemma 4.1 $S_n$ is generated⁶ by all transpositions.
3. You have seen in Linear Algebra that any invertible matrix is a
product of elementary matrices. This means that the group $GL_n(K)$ for a
field $K$ is generated by elementary matrices.
4.2 Cyclic Groups
Definition. A group $G$ is called cyclic if it is generated by one element. A
cyclic subgroup of $G$ is a subgroup $\langle g \rangle$ generated by any $g \in G$.
Essentially, a cyclic group $G$ consists of the integral powers of a single
element. In other words, there exists an element $g$ in $G$ with the property
that, for all $h \in G$, there exists $x \in \mathbb{Z}$ with $g^x = h$. We call the element $g$ a
cyclic generator of $G$.
We have already seen the cyclic groups $C_n = \mathbb{Z}_n^+$ and $C_\infty = \mathbb{Z}^+$ in
Section 1.2. Are there any other cyclic groups? Let us look at generators first.
Lemma 4.2 In an infinite cyclic group, any generator g has infinite order.
In a finite cyclic group of order n, generators are exactly elements of order
n.
Proof: Let $G$ be a cyclic group, $g \in G$ a generator. If $|g| = k < \infty$ then
$g^m = g^{m+k} = g^{m+tk}$ for all $t, m \in \mathbb{Z}$. Hence, $g^m = g^{(m)_k}$ and $\{g^m \mid m \in \mathbb{Z}\}$
contains at most $k$ elements. This proves the first statement.
For the second statement, we use Lemma 3.2 to conclude that the set
$\{g^m \mid m \in \mathbb{Z}\}$ contains exactly $k$ elements. The second statement follows. □
Let us use the additive notation to describe generators in the cyclic
groups we know. The group $\mathbb{Z}^+$ has infinitely many elements of infinite
order but only two generators: 1 and −1. An element $x \in \mathbb{Z}_n^+$ is a generator
if and only if it has order $n$, i.e., the smallest positive integer $m$ such that
$n \mid mx$ is $n$. This means that $x$ is a generator if and only if $x$ and $n$ are
coprime. The number⁷ of possible generators of $\mathbb{Z}_n^+$ is denoted $\phi(n)$. The
function $\phi : \mathbb{N} \to \mathbb{N}$ is called Euler’s totient function. We will compute it in
Section 12.2.
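The description of generators can be tested experimentally. In the Python sketch below, `additive_order`, `generators` and `coprime_residues` are illustrative names of mine; the first follows the definition of the order literally and the last uses `math.gcd`.

```python
from math import gcd


def additive_order(x, n):
    """Smallest m > 0 with n | m*x, that is, the order of x in Z_n^+."""
    m = 1
    while (m * x) % n != 0:
        m += 1
    return m


def generators(n):
    """Elements of Z_n^+ whose order is n."""
    return [x for x in range(n) if additive_order(x, n) == n]


def coprime_residues(n):
    """Residues between 0 and n - 1 that are coprime to n."""
    return [x for x in range(n) if gcd(x, n) == 1]
```

The two lists agree for every $n$, and their common length is $\phi(n)$; for example `generators(12)` has four entries.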
The following fact is an easy corollary of Lemma 4.2, which we state for
future reference. Its proof is left as an exercise.
⁶ In fact there are much smaller generating sets for $S_n$; see the exercise below.
⁷ It is equal to the number of integers between 0 and $n - 1$ that are coprime to $n$, as we have
just proved.
Proposition 4.3 The order $|g|$ of an element $g \in G$ is equal to the order
$|\langle g \rangle|$ of the cyclic subgroup $\langle g \rangle$ generated by $g$.
Now we are ready to establish an identification test for an infinite cyclic
group.
that $\mathbb{Z}_p^\times \cong C_{p-1}$. Known sneaky ways of finding $n$ and $m$ will need the prime
factorisation of $p - 1$, which is known to be computationally hard.
Suppose a computer can make one group multiplication in $G$ every
microsecond ($10^{-6}$ seconds). How long will it take to compute $g^n$? If $n = 2m$ is
even, one uses $g^n = g^m \cdot g^m$. If $n = 2m + 1$ is odd, one uses $g^n = g^m \cdot g^m \cdot g$.
Hence, one computes $g^n$ using between $\log_2 n$ and $2 \log_2 n$ operations. If
$n \approx 2^{1000}$, i.e. has 1000 digits in binary representation, one needs at most
2000 multiplications. Thus, Alice and Bob will make their computations in
at most 2 milliseconds ($2 \cdot 10^{-3}$ seconds).
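The halving trick can be written as a short loop that also counts the multiplications. This Python sketch is only an illustration: `fast_power` and `mul_mod` are hypothetical names, and the group is taken to be the units modulo an arbitrarily chosen modulus.

```python
def mul_mod(p):
    """Return the multiplication of Z_p as a two-argument function."""
    return lambda x, y: (x * y) % p


def fast_power(g, n, mul):
    """Compute g^n for n >= 1; return (g^n, number of multiplications).

    The binary digits of n after the leading 1 are read from left to
    right: every digit costs one squaring (m -> 2m), and a digit 1 costs
    one extra multiplication by g (2m -> 2m + 1).
    """
    result, count = g, 0
    for bit in bin(n)[3:]:          # bin(n) looks like '0b1...'
        result = mul(result, result)
        count += 1
        if bit == "1":
            result = mul(result, g)
            count += 1
    return result, count
```

For an exponent with 1000 binary digits the count never exceeds 1998, in agreement with the bound of 2000 multiplications above.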
Let us consider how Eve can try to break the key in the most
straightforward way. She needs to solve the equation $g^n = a$ for $n$, which can be
done by computing the successive powers of $g$ until she hits $a$. Using $g^x = g g^{x-1}$,
every power requires one multiplication. As $|g| \approx 2^{1000}$, Eve needs at most
$2^{1000} = (2^{10})^{100} \approx (10^3)^{100} = 10^{300}$ multiplications, which can be done in at
most $10^{294}$ seconds. It is worth mentioning at this point that $10^8$ seconds
constitute approximately 3 years, so Eve would need to wait for $3 \times 10^{286}$
years for the code to be broken.
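Eve's straightforward attack is equally short to write down, which makes the contrast in running time easy to observe on toy examples. The sketch below assumes the group $\mathbb{Z}_p^\times$ for a small prime $p$; the function name `brute_force_log` is my own.

```python
def brute_force_log(g, a, p):
    """Least n >= 0 with g^n = a in Z_p, found by trying successive powers.

    Returns (n, multiplications used), or (None, multiplications used)
    when a is not a power of g. Assumes g is a unit modulo p, so that the
    powers of g eventually return to 1.
    """
    power, n = 1 % p, 0
    while True:
        if power == a % p:
            return n, n               # n multiplications were needed
        power = (power * g) % p       # g^x = g * g^(x-1)
        n += 1
        if power == 1 % p:
            return None, n            # went through all of <g> without hitting a
```

With $p = 101$ and $g = 2$, recovering the exponent 77 takes 77 multiplications, whereas computing $2^{77}$ by repeated squaring takes about ten.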
4.4 Pauli’s matrices and quaternionic group
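The following matrices in $GL_2(\mathbb{C})$ are known as Pauli’s matrices (we use
$i$ to denote the imaginary unit):
\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
   \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
   \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]
In Physics they are used to describe spin, but we will need their scalar
multiples in the reverse order:
\[ I = i\sigma_z = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad
   J = i\sigma_y = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad
   K = i\sigma_x = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \]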
and its multiplication table is
         |  1   −1    I   −I    J   −J    K   −K
     ----+---------------------------------------
       1 |  1   −1    I   −I    J   −J    K   −K
      −1 | −1    1   −I    I   −J    J   −K    K
       I |  I   −I   −1    1    K   −K   −J    J
      −I | −I    I    1   −1   −K    K    J   −J
       J |  J   −J   −K    K   −1    1    I   −I
      −J | −J    J    K   −K    1   −1   −I    I
       K |  K   −K    J   −J   −I    I   −1    1
      −K | −K    K   −J    J    I   −I    1   −1
Proof: Computing with matrices, we need to establish that $I^2 = J^2 = -1$,
$IJ = K$ and $JI = -K$. Using matrices, we observe that $-1$ is a scalar
matrix so it commutes with any other matrix: $(-1)X = X(-1) = -X$.
To locate all the elements we start with $1, I, J, K \in Q_8$, then $-1 = I^2 \in Q_8$,
then $-I = (-1)I$, $-J = (-1)J$, $-K = (-1)K \in Q_8$. To show that
these are indeed all of $Q_8$, we have to prove that these 8 matrices are closed
under multiplication (closure under inverses follows because they have
finite order). We do this by filling in the multiplication table. So far we know
the following part of the table:
         |  1   −1    I   −I    J   −J    K   −K
     ----+---------------------------------------
       1 |  1   −1    I   −I    J   −J    K   −K
      −1 | −1    1   −I    I   −J    J   −K    K
       I |  I   −I   −1    1    K   −K
      −I | −I    I    1   −1   −K    K
       J |  J   −J   −K    K   −1    1
      −J | −J    J    K   −K    1   −1
       K |  K   −K
      −K | −K    K
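The matrix identities used so far, and the fact that exactly 8 matrices arise, can be confirmed by direct computation. Here is a plain Python sketch with $2 \times 2$ matrices stored as nested tuples of complex numbers; this representation and the helpers `matmul`, `neg` and `generate` are my own choices.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def neg(A):
    """The matrix (-1)A."""
    return tuple(tuple(-x for x in row) for row in A)


ONE = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))
J = ((0, 1), (-1, 0))
K = ((0, 1j), (1j, 0))


def generate(gens):
    """All products of the given matrices, found by repeated multiplication.

    This terminates only if the generated set is finite, as it is here.
    """
    elems = {ONE}
    frontier = [ONE]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = matmul(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems
```

All entries are 0, ±1 or ±i, so the arithmetic is exact and the comparison of matrices is safe.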
Proof: Since $G$ is generated by $a$ and $b$, any element of $G$ can be written as
a product of the generators $a, b, a^{-1}, b^{-1}$. Since $a^4 = 1$ and $b^4 = (a^2)^2 = 1$, we have
$a^{-1} = a^3$ and $b^{-1} = b^3$, so any element of $G$ is a product of several $a$
and $b$. Furthermore, we can use the equation $ba = a^{-1}b = a^3 b$ to move
all occurrences of $a$ in the product to the left of the expression, so that
$G = \{a^k b^l \mid k, l \in \mathbb{Z}\}$. Since $z = a^2 = b^2$ commutes with both $a$ and $b$
(indeed, $zb = b^3 = bz$ and the same for $a$) and $z^2 = a^4 = 1$, we get a
description $G = \{1, a, b, ab, z, za, zb, zab\}$ with all the elements distinct since
$|G| = 8$.
An isomorphism $\varphi : G \to Q_8$ is uniquely determined by $\varphi(a) = I$ and
$\varphi(b) = J$; for instance, $\varphi(z) = \varphi(a^2) = \varphi(a)^2 = -1$ and $\varphi(ab) = \varphi(a)\varphi(b) = K$.
One way to finish the proof now is to fill out the multiplication table of
$G$ and see that it is the same as that of $Q_8$.
A more elegant way is to observe that we filled the multiplication table
of $Q_8$ formally starting with the relations $I^2 = J^2 = -1$, $IJ = K$, $JI = -K$
and $(-1)X = X(-1) = -X$ and, without explicitly mentioning it, $(-1)^2 = 1$.
In $G$ we know the same relations: $a^2 = b^2 = z$, $ba = a^{-1}b = a^3 b = zab$, $z$
commutes with everything, and $z^2 = 1$. Hence, we are bound to arrive at
the same multiplication table. □
4.5 Exercises
(i) Prove that $\langle g_1, g_2, \ldots, g_r \rangle$ is equal to the intersection of all
subgroups containing every $g_i$.
(ii) Is $\mathbb{Q}^+$ a finitely generated group? Justify your answer.
(iii) Prove an improved version of Lemma 4.1: $S_n$ is generated by the $n - 1$
transpositions $(k, k + 1)$ for $1 \leq k \leq n - 1$.
(iv) Prove Proposition 4.3.
(v) Prove that $Q_8$ has 6 elements of order 4 and one element of order 2.
(vi) Prove that any subgroup of a cyclic group is cyclic itself.
4.6 Vista: free groups
In Algebra-1 you have seen the free abelian group $\mathrm{Ab}\langle X \rangle$ on a set $X$.
It consists of all formal $\mathbb{Z}$-linear combinations of elements of $X$. Similarly,
there is a free (nonabelian) group on a set $X$. Let us consider
(finite) words in the alphabet $\{x, x^{-1} \mid x \in X\}$. For instance, the empty word $\emptyset$,
$xx^{-1}y^{-1}y$ and $xxxx$ are all words but $x^2$ is not. A word $w$ is irreducible if the words
$xx^{-1}$ or $x^{-1}x$ are not subwords of $w$. Now the free group $\mathrm{Gr}\langle X \rangle$ consists
of all irreducible words in the alphabet. The multiplication is concatenation
followed by reduction; for instance, the product of $xyzx^{-1}$ and $xz^{-1}y$ is not
$xyzx^{-1}xz^{-1}y$, since it is not irreducible, but its reduction $xyy$.
If $|X| = n$, one writes $F_n$ for $\mathrm{Gr}\langle X \rangle$. We have seen a couple of
them: $F_0$ is the trivial group $C_1$, $F_1$ is the infinite cyclic group $C_\infty$. The
next group $F_2$ is brand new and exciting. Its algebraic properties are crucial
to establishing the Banach-Tarski paradox⁸ where one cuts a 3-sphere into
4 disjoint pieces and makes two identical 3-spheres by combining 2 pairs of
pieces.
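Multiplication in a free group, concatenation followed by reduction, is pleasant to program. In the following Python sketch a letter is a pair `(symbol, sign)` with sign +1 for $x$ and −1 for $x^{-1}$; this encoding and the names `reduce_word`, `multiply` and `inverse` are assumptions of the sketch.

```python
def reduce_word(word):
    """Cancel adjacent pairs x x^-1 and x^-1 x until none remain.

    A stack suffices: a new letter either cancels the letter on top of
    the stack or is pushed on top of it.
    """
    stack = []
    for letter, sign in word:
        if stack and stack[-1] == (letter, -sign):
            stack.pop()
        else:
            stack.append((letter, sign))
    return stack


def multiply(u, v):
    """Product in the free group: concatenate, then reduce."""
    return reduce_word(list(u) + list(v))


def inverse(w):
    """Inverse word: reverse the order and invert every letter."""
    return [(letter, -sign) for letter, sign in reversed(w)]
```

Multiplying the encodings of $xyzx^{-1}$ and $xz^{-1}y$ returns the encoding of $xyy$, as in the text.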
this generality. We lack the standard properties of the determinants for
matrices over commutative rings, i.e. the multiplicative property
$\det(AB) = \det(A)\det(B)$ and the minor formula $AM(A) = M(A)A = \det(A)I$ where
$M(A)$ is the matrix of minors in $A$. Notice that these properties imply that
$A \in GL_n(R)$ if and only if $\det(A) \in R^\times$.
5.2 Orthogonal group of size 2 over real numbers
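We would like to pinpoint the structure of the group $O_2(\mathbb{R})$. Let us
consider the following two matrices
\[ R_\alpha = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}
   \quad \text{and} \quad
   S_\alpha = \begin{pmatrix} \cos\alpha & \sin\alpha \\ \sin\alpha & -\cos\alpha \end{pmatrix}. \]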
           |  R_β        S_β
     ------+----------------------
      R_α  |  R_{α+β}    S_{α+β}
      S_α  |  S_{α−β}    R_{α−β}
Observe that rotations form a subgroup $SO_2(\mathbb{R})$ (called the special orthogonal
group) that looks suspiciously similar to $\mathbb{R}^+$; indeed, $R_\alpha R_\beta = R_{\alpha+\beta}$. There
is a natural map $R : \mathbb{R}^+ \to SO_2(\mathbb{R})$ that sends $\alpha$ to $R_\alpha$, which is not an
isomorphism, because $R_\alpha = R_\beta$ whenever $\alpha - \beta = 2n\pi$ for some $n \in \mathbb{Z}$. The
map $R$ is an example of a homomorphism, a concept discussed in Section 9.
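The four product rules for rotations and reflections, $R_\alpha R_\beta = R_{\alpha+\beta}$, $R_\alpha S_\beta = S_{\alpha+\beta}$, $S_\alpha R_\beta = S_{\alpha-\beta}$ and $S_\alpha S_\beta = R_{\alpha-\beta}$, can be checked numerically. The Python sketch below uses floating point, so equality is tested up to a small tolerance; `R`, `S`, `matmul` and `close` are names chosen for this sketch.

```python
from math import cos, sin, pi, isclose


def R(alpha):
    """Rotation matrix R_alpha."""
    return ((cos(alpha), -sin(alpha)), (sin(alpha), cos(alpha)))


def S(alpha):
    """Reflection matrix S_alpha."""
    return ((cos(alpha), sin(alpha)), (sin(alpha), -cos(alpha)))


def matmul(A, B):
    """Product of two 2x2 real matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def close(A, B, tol=1e-12):
    """Entrywise comparison up to rounding errors."""
    return all(isclose(A[i][j], B[i][j], abs_tol=tol)
               for i in range(2) for j in range(2))
```

The same comparison shows that $R_{2\pi}$ and $R_0$ agree, which is why the map $\alpha \mapsto R_\alpha$ is not injective.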
5.3 Dihedral group
Let $n \in \mathbb{N}$ with $n \geq 2$. The dihedral group $D_{2n}$ is the group generated
by $S_0$ and $R_{2\pi/n}$. Unfortunately, some books denote this group by $D_n$ and
others by $D_{2n}$, which can be confusing! We prefer $D_{2n}$ because it tells us
the number of its elements:
Lemma 5.4 The order of $D_{2n}$ is $2n$ and all its elements are $S_{2k\pi/n}$, $R_{2k\pi/n}$
for $k \in \mathbb{Z}$, $0 \leq k \leq n - 1$.
Proof: Using the multiplication table, we can easily find the elements
$R_{2k\pi/n} = R_{2\pi/n}^k$ and $S_{2k\pi/n} = R_{2k\pi/n} S_0 = R_{2\pi/n}^k S_0$ in $D_{2n} = \langle R_{2\pi/n}, S_0 \rangle$.
Let $G = \{S_{2k\pi/n}, R_{2k\pi/n}\}$. It remains to observe that $G$ is a subgroup. $G$ is
obviously nonempty. $G$ is closed under inverses because $R_{2k\pi/n}^{-1} = R_{2(n-k)\pi/n}$
and $S_{2k\pi/n}^{-1} = S_{2k\pi/n}$. Finally, $G$ is closed under multiplication because
$G = \{S_{2k\pi/n}, R_{2k\pi/n} \mid k \in \mathbb{Z}\}$ and $\frac{2\pi}{n}\mathbb{Z}$ is a subgroup of $\mathbb{R}^+$ (look at the
multiplication table). □
In fact, it is convenient at this point to denote the rotations by $a^k = R_{2k\pi/n}$
where $k \in \mathbb{Z}_n$. If we denote the reflection $S_0$ by $b$, a typical reflection becomes
$a^k b = S_{2k\pi/n}$ and the multiplication table of $D_{2n}$ can be written using addition
in $\mathbb{Z}_n$:
             |  a^l          a^l b
     --------+--------------------------
      a^k    |  a^{k+l}      a^{k+l} b
      a^k b  |  a^{k−l} b    a^{k−l}
We need a recognition test for the dihedral group.
end up with a word in the form $a^k b^l$. Using $a^n = b^2 = 1$ again, we can
assume that $0 \leq k < n$ and $0 \leq l < 2$. This leaves us with precisely $2n$
different words $a^k b^l$, and since we are told that $|G| = 2n$, these words must
all represent distinct elements of $G$. We have now shown that
$G = \{a^k \mid 0 \leq k < n\} \cup \{a^k b \mid 0 \leq k < n\}$, exactly as in $D_{2n}$.
Using $ba = a^{-1}b$ twice, we get $ba^2 = (ba)a = a^{-1}ba = a^{-1}a^{-1}b = a^{-2}b$,
and similarly $ba^k = a^{-k}b$ for all $k \geq 0$, and since $a^{-k} = a^{n-k}$, we have
$ba^k = a^{n-k}b$ for $0 \leq k < n$. Now formally we recover the multiplication
table of $G$, which turns out to be the same as that of $D_{2n}$:
(i) $(a^k)(a^l) = a^{k+l}$ ($k + l < n$) or $a^{k+l-n}$ ($k + l \geq n$);
(ii) $(a^k)(a^l b) = a^{k+l} b$ ($k + l < n$) or $a^{k+l-n} b$ ($k + l \geq n$);
(iii) $(a^k b)(a^l) = a^k a^{n-l} b = a^{k+n-l} b$ ($k < l$) or $a^{k-l} b$ ($k \geq l$);
(iv) $(a^k b)(a^l b) = a^k a^{n-l} bb = a^{k+n-l}$ ($k < l$) or $a^{k-l}$ ($k \geq l$).
Hence $\varphi(a^k b^l) = a^k b^l$ is the required isomorphism (from $G$ to $D_{2n}$ or in the
opposite direction, it does not matter). □
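The four rules above amount to a compact formula: $(a^k b^l)(a^m b^j) = a^{k \pm m} b^{l+j}$, with the sign − exactly when $l = 1$, the exponent of $a$ read modulo $n$ and that of $b$ modulo 2. A minimal Python sketch, in which the pair `(k, l)` stands for $a^k b^l$ and the function names are my own:

```python
from itertools import product


def dihedral_mul(x, y, n):
    """Product (a^k b^l)(a^m b^j) in D_2n, with elements stored as pairs."""
    k, l = x
    m, j = y
    sign = -1 if l else 1          # moving b past a^m turns it into a^(-m)
    return ((k + sign * m) % n, (l + j) % 2)


def dihedral_elements(n):
    """All 2n normal forms a^k b^l with 0 <= k < n and 0 <= l < 2."""
    return [(k, l) for k in range(n) for l in range(2)]


def dihedral_pow(x, e, n):
    """x^e for e >= 0, by repeated multiplication."""
    result = (0, 0)
    for _ in range(e):
        result = dihedral_mul(result, x, n)
    return result
```

With $a$ = `(1, 0)` and $b$ = `(0, 1)` one recovers $ba = a^{-1}b$, and associativity can be confirmed by running over all triples.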
5.4 Dihedral group as a subgroup of symmetric group
Let $n \in \mathbb{N}$ with $n \geq 2$ and let $P$ be a regular $n$-sided polygon in the
plane whose vertices are $v_k = v_{2k\pi/n}$, $k \in \mathbb{Z}$. It is clear from the formulas
$S_\alpha v_\beta = v_{\alpha-\beta}$ and $R_\alpha v_\beta = v_{\alpha+\beta}$ in the previous section that $XP = P$ for
any $X \in D_{2n}$. In fact, the converse is true: if $XP = P$ for some $X \in O_2(\mathbb{R})$
then $X \in D_{2n}$. Thus, $D_{2n}$ is the group of symmetries of a regular $n$-gon.
It is instructive to stop thinking of $P$ as a 2-dimensional figure. Let us
think of $P$ as the set of its vertices. This allows us to think of $D_{2n}$ as a subgroup
of $S_n$. More precisely, for each $X \in D_{2n}$ there exists a unique $\sigma_X \in S_n$ such
that $X v_k = v_{\sigma_X(k)}$. For instance, $\sigma_a = (1, 2, 3, \ldots, n)$ rotates the vertices
counterclockwise. Similarly, $b$ is the reflection through the bisector of $P$ that
passes through the vertex $v_0 = v_n$. Then $b$ interchanges the vertices $v_1$ and
$v_{n-1}$ that are adjacent to $v_n$, and similarly it interchanges $v_2$ and $v_{n-2}$, $v_3$
and $v_{n-3}$, etc., so we have $\sigma_b = (1, n - 1)(2, n - 2)(3, n - 3) \ldots$. For example,
when $n = 5$, $\sigma_b = (1, 4)(2, 3)$ and when $n = 6$, $\sigma_b = (1, 5)(2, 4)$. Notice the
difference between the odd and even cases. When $n$ is odd, $b$ fixes no vertex
other than $v_0$, but when $n$ is even, $b$ fixes one other vertex, namely $v_{n/2}$. We
summarise this discussion in the following proposition.
5.5 Exercises
(i) Verify the multiplicative property $\det(AB) = \det(A)\det(B)$ and the
minor formula $AM(A) = M(A)A = \det(A)I$ for $2 \times 2$ matrices over a
commutative ring.
(ii) Let $R$ be a commutative ring. Prove that $A \in GL_2(R)$ if and only if
$\det(A) \in R^\times$.
(iii) Compute the order of the general linear groups $GL_2(\mathbb{Z}_2)$ and $GL_2(\mathbb{Z}_4)$.
(iv) Compute the order of the orthogonal groups $O_2(\mathbb{Z}_2)$ and $O_2(\mathbb{Z}_4)$.
(v) What is the subgroup of $O_2(\mathbb{R})$ generated by $S_\alpha$ and $S_0$?
(vi) Prove that $D_4$ is isomorphic to $K_4$.
(vii) Prove that $D_6$ is isomorphic to $S_3$.
(viii) Prove that $D_8$ has 5 elements of order 2 and 2 elements of order 4.
Conclude that $D_8$ is not isomorphic to $Q_8$.
(ix) Go to the online Magma calculator http://magma.maths.usyd.edu.au/calc/
and play around with groups. For instance, try the code
    G<a,b,c> := Group< a,b,c | a^k, b^l, c^m, a*b*c >; G;
    Order(G);
for some particular $k \geq l \geq m \geq 2$. Determine experimentally,
by running the code, which of these groups are finite.
5.6 Vista: defining relations and Burnside’s problem
The equations $\{a^n = 1, b^2 = 1, ba = a^{-1}b\}$ are called defining relations
for $D_{2n}$. One formally writes $D_{2n} = \mathrm{Gr}\langle a, b \mid a^n = b^2 = 1, ba = a^{-1}b \rangle$.
This is a great way of describing groups. Roughly it means that $D_{2n}$ is the
largest group generated by two elements $a$ and $b$ that satisfy these equations.
More precisely, it means that $D_{2n}$ is a quotient group of the free group
$F_2 = \mathrm{Gr}\langle a, b \mid \emptyset \rangle$ by a subgroup determined by these relations.
Another example of a group determined by relations is the von Dyck group
$D_{k,l,m} = \mathrm{Gr}\langle a, b, c \mid a^k = b^l = c^m = abc = 1 \rangle$ with which you worked in
the last exercise. Working out properties of a group from its presentation by
generators and relations is far from straightforward: apart from MAGMA
there are the packages GAP and SAGE that can help you with it. In Warwick Derek
Holt is actively involved with them and has actually written the code for many
of the functions.
Here is another chance for you to get immediate recognition as a
mathematical genius on a par with Galois, Gauss and Perelman. The group
$B(n, m) = \mathrm{Gr}\langle x_1, x_2, \ldots, x_n \mid w^m = 1 \rangle$ is known as the Burnside group. By $w$ here I
mean all possible words in $x_1, \ldots, x_n$. By Proposition 7.6, $B(n, 2)$ is abelian,
of order $2^n$. Using MAGMA, you can observe that
$B(2, 3) \cong \mathrm{Gr}\langle a, b \mid a^3 = b^3 = (ab)^3 = (ab^2)^3 = 1 \rangle$ has order 27.
What about $B(2, 5)$? Beware that your laptop is not of big help here.
If this group is finite, it has exactly $5^{34}$ (approximately $6 \times 10^{23}$) elements.
But I would bet that it is infinite. If any betting agency accepts, please let
me know.
This problem is known as Burnside’s problem⁹. In 1994, E. Zelmanov
received the Fields Medal for proving that $B(k, n)$ admits a unique maximal
finite quotient $B_0(k, n)$. This was known as the restricted Burnside problem.
In particular, the order of $B_0(2, 5)$ is $5^{34}$.
Examples. 1. Let $X = \mathbb{Z}$, $n \in \mathbb{Z}$, $n \neq 0$. We say that $x \equiv_n y$ if $n$
divides $x - y$. This equivalence relation, called congruence modulo $n$, appears
in Foundations.
2. We say that $x \sim y$ for $x, y \in \mathbb{R}^+$ if $R_x = R_y \in SO_2(\mathbb{R})$. Clearly, $x \sim y$ if and
only if $x - y \in 2\pi\mathbb{Z}$.
3. Let $X = F^{n \times m}$ be the set of all $n \times m$-matrices over a field $F$. The
equivalence relation equivalent appears in Linear Algebra. Let us recall that
$A \sim B$ if and only if $A$ and $B$ have the same rank if and only if $A$ can
be transformed to $B$ by elementary row and column transformations if and
only if there exist $P \in GL_n(F)$, $Q \in GL_m(F)$ such that $PAQ = B$ if and
only if $A$ and $B$ represent the same linear map $f : F^m \to F^n$ in different
bases of the two vector spaces.
4. Let $X = F^{n \times n}$ be the set of all $n \times n$-matrices over a field $F$. The
equivalence relation similar appears in Linear Algebra. Let us recall that
$A \sim B$ if and only if $A$ and $B$ have the same Jordan normal form if and only
if there exists $P \in GL_n(F)$ such that $PAP^{-1} = B$ if and only if $A$ and $B$
represent the same linear map $f : F^n \to F^n$ in different bases of the vector
space.
5. Let $X = S(\mathbb{R}^{n \times n})$ be the set of all symmetric $n \times n$-matrices over the
real numbers $\mathbb{R}$. This equivalence relation without a special name appears
in Algebra-1. In this relation $A \sim B$ if and only if $A$ and $B$ have the same
signature if and only if there exists $P \in GL_n(\mathbb{R})$ such that $PAP^T = B$ if and
only if $A$ and $B$ represent the same quadratic form $q : \mathbb{R}^n \to \mathbb{R}$ in different
bases of the vector space.
Definition. Given an equivalence relation R on X and a ∈ X, the equiva-
lence class of a is the following set [a] = {x ∈ X | xRa}.
Corollary 6.2 Two equivalence classes [a] and [b] are either equal or dis-
joint. Hence, the equivalence classes form a partition of X.
Proof: If [a] and [b] are not disjoint, then there exists an element c ∈ [a]∩[b].
So by Proposition 6.1 [a] = [c] = [b]. □
Corollary 6.3 The equivalence relation can be uniquely recovered from its
partition into equivalence classes.
Proof: This follows immediately from Proposition 6.1 as aRb if and only
if they belong to the same class. □
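The passage from an equivalence relation to its partition can be sketched in a few lines of Python. This is only an illustration on a finite set; the function name `equivalence_classes` and the sample set are assumptions of the sketch, and the relation used is congruence modulo 4 from Example 1.

```python
def equivalence_classes(elements, related):
    """Partition `elements` into classes of the equivalence relation `related`.

    Each new element is compared with one representative of every class found
    so far; this is enough precisely because the relation is an equivalence.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

n = 4
X = range(-6, 7)
classes = equivalence_classes(X, lambda x, y: (x - y) % n == 0)
for cls in classes:
    print(cls)
```

The classes are pairwise disjoint and cover the whole set, as Corollary 6.2 predicts.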
Finally, we define the quotient set X/R as a collection of equivalence
classes. We will see the further usefulness of this later on but here are the
first three examples which you already saw in various subjects.
Examples. 6. Let ≡n be the congruence modulo n we saw in examples
today. The quotient set Z/ ≡n is the ring Zn of residues modulo n.
7. Let $X$ be the set of all Cauchy sequences in $\mathbb{Q}$. Recall that a sequence
$(a_n)$ is Cauchy if for any $\varepsilon > 0$ there exists $N$ such that $|a_m - a_n| < \varepsilon$ for
all $m, n > N$. Two Cauchy sequences $(a_n)$ and $(b_n)$ are equivalent if their
difference $a_n - b_n$ tends to zero. The significance of this equivalence relation
is that the quotient set $X/\sim$ is the set of real numbers.
8. Various function spaces in analysis are also quotient sets where one
identifies functions that differ by a “negligible” function. For instance, let $X$
be the set of all functions $f : [0, 1] \to \mathbb{R}$ such that the Lebesgue integral
$\int_0^1 |f(x)|^2\,dx$ is well-defined and finite. Two functions $f$ and $g$ in $X$ are
equivalent if $\int_0^1 |f(x) - g(x)|^2\,dx = 0$. The quotient set $X/\sim$ is the $L^2$ space
$L^2([0, 1], \mathbb{R})$.
6.3 Cosets
Given a group G and a subgroup H, we define a binary relation ∼H on
G. We set x ∼H y if there exists h ∈ H such that x = hy. Notice that for
G = Z and H = nZ, this is the congruence modulo n.
Similarly to $x \sim_H y$, there is an equivalence relation $x \mathrel{{}_H{\sim}} y$. We say
that $x \mathrel{{}_H{\sim}} y$ if and only if $xh = y$ for some $h \in H$ if and only if $x^{-1}y \in H$.
Definition. Let $g \in G$, $H \leq G$. The right coset of $H$ containing $g$ is the
equivalence class ${}_H[g]$ of $\sim_H$. Similarly, the left coset is the equivalence class
$[g]_H$ of ${}_H{\sim}$.
The rightness of a right coset is the position of g with respect to H.
Notice that H [g] = {hg | h ∈ H} = Hg. In fact, Hg is the standard
notation used in most of the books. We will use both notations since they
are convenient in different situations.
Similarly, [g]H is the left coset gH = {gh | h ∈ H}. If H and the side
are clear from the context, we just write [g]. In the case of additive groups,
the standard notation becomes H + g rather than Hg but we may still use
[g].
Example. If V is a vector space and W is a subspace, the cosets of W are
affine subspaces v + W parallel to W .
Example. Let G = S3 be the symmetric group. Then G consists of the 6
permutations (), (1,2,3), (1,3,2), (1,2), (1,3), (2,3), where () represents the
identity permutation.
Let us first choose H = {(), (1, 2, 3), (1, 3, 2)} to be the cyclic subgroup
generated by a = (1, 2, 3). If we put b = (2, 3), then we find that H [b] =
Hb = {(1, 2), (1, 3), (2, 3)}. In fact any right coset of H is equal to either
H itself or to H [b] = Hb = G \ H. Furthermore, [b]H = H [b], and indeed
H [g] = [g]H , for all g ∈ G, so the right and left cosets are the same in this
example.
Now let us choose $H = \{(), (2, 3)\}$ to be the cyclic subgroup generated
by $b = (2, 3)$. With $a = (1, 2, 3)$, we have ${}_H[a] = Ha = \{(1, 2, 3), (1, 3)\}$ and
${}_H[a^2] = \{(1, 3, 2), (1, 2)\}$, but $[a]_H = aH = \{(1, 2, 3), (1, 2)\}$ and $[a^2]_H =
a^2H = \{(1, 3, 2), (1, 3)\}$, so the right and left cosets are not the same in this
case.
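The coset computations in this example are easy to reproduce by machine. The following Python sketch is an illustration only: it stores a permutation of {1, 2, 3} as the tuple of its images and assumes that a product $gh$ means "apply $h$ first, then $g$", which is the convention that reproduces the cosets listed above. The function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """(p∘q)(x) = p(q(x)); a permutation is the tuple of images of 1, 2, 3."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

G = set(permutations((1, 2, 3)))
e = (1, 2, 3)          # identity
a = (2, 3, 1)          # the 3-cycle (1,2,3): 1->2, 2->3, 3->1
b = (1, 3, 2)          # the transposition (2,3)

def right_coset(H, g):
    """Hg = {hg | h in H}."""
    return frozenset(compose(h, g) for h in H)

def left_coset(H, g):
    """gH = {gh | h in H}."""
    return frozenset(compose(g, h) for h in H)

def cosets_agree(H, G):
    """True if Hg = gH for every g in G."""
    return all(right_coset(H, g) == left_coset(H, g) for g in G)

H1 = {e, a, compose(a, a)}   # cyclic subgroup generated by (1,2,3)
H2 = {e, b}                  # cyclic subgroup generated by (2,3)

print(cosets_agree(H1, G))               # left and right cosets coincide
print(cosets_agree(H2, G))               # they differ for this subgroup
print(sorted(right_coset(H2, a)), sorted(left_coset(H2, a)))
```

In this encoding (2, 3, 1), (3, 2, 1) and (2, 1, 3) stand for (1,2,3), (1,3) and (1,2) respectively.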
The following two corollaries are immediate consequences of Corollary 6.2
and Proposition 6.4.
Corollary 6.5 Two right cosets H [g1 ] and H [g2 ] of H in G are either
equal or disjoint.
(ii) Find a binary relation on the set R that is both symmetric and tran-
sitive but not reflexive.
(iii) Find a binary relation on the set R that is both reflexive and symmetric
but not transitive.
(iv) Describe and draw the cosets of R+ in C+ .
(v) Describe and draw the cosets of R× in C× .
(vi) Describe and draw the cosets of H = {z ∈ C||z| = 1} in C× .
(vii) Prove that, if |G : H| is finite, then |G : H| is also equal to the number
of distinct left cosets of H in G. This is clear if G is finite, because
both numbers are equal to |G|/|H|, but it is not quite so easy if G is
infinite.
(viii) Let H be a subgroup of a group G and let Hg be a right coset of H
in G. Prove that the set {k−1 | k ∈ Hg} is a left coset of H in G,
and deduce that there is a bijection between the sets of left and right
cosets of H in G.
(ix) Consider solutions of a homogeneous system of linear equations: $H =
\{X \in \mathbb{R}^m \mid AX = 0\}$. Show that $H$ is a subgroup.
Now consider solutions of a non-homogeneous system of linear equa-
tions: W = {X ∈ Rm |AX = B}. Show that W is a coset of H.
6.5 Vista: groupoids
The notion of groupoid is a common generalisation of a group and an
equivalence relation. If one develops the theory of groupoids first, then
one can say that a group is a groupoid with one object and an equivalence
relation is a groupoid where homs have at most one element. Find out more
by exploring references on its wiki http://en.wikipedia.org/wiki/Groupoid
Proposition 7.1 If the subgroup H is finite, then all right cosets have ex-
actly |H| elements.
Proof: Since $h_1 g = h_2 g \Rightarrow h_1 = h_2$ by the cancellation law, it follows that
the map $\phi : H = {}_H[1] \to {}_H[g]$ defined by $\phi(h) = hg$ is a bijection, and the
result follows. □
Of course, all of the above results apply with appropriate minor changes
to left cosets.
Corollary 6.6 and Proposition 7.1 together imply:
Proposition 7.3 Let G be a finite group. Then for any g ∈ G, the order
|g| of g divides the order |G| of G.
Proof: Let $|g| = n$. The powers $\{g^x \mid x \in \mathbb{Z}\}$ of $g$ form a subgroup
$H$ of $G$, and we saw in Subsection 3.2 that the distinct powers of $g$ are
$\{g^x \mid 0 \leq x < n\}$. Hence $|H| = n$ and the result follows from Lagrange’s
Theorem. □
We finish with the following technical lemma for future use.
Proposition 7.6 Let $G$ be a finite group where every element has order 2 or 1.
Then $G \cong (C_2)^n$ and $|G| = 2^n$ for some $n \in \mathbb{N}$.
Proof: Since $x^2 = y^2 = (yx)^2 = 1$ for all $x, y \in G$, $xy = y^2xyx^2 =
y(yx)(yx)x = yx$, so the group is abelian. Moreover, it has a natural structure
of a vector space over $\mathbb{Z}_2$: $1 \cdot x = x$ and $0 \cdot x = 1_G$ for all $x \in G$. Since $G$ is
finite, it admits a finite basis (as a vector space over $\mathbb{Z}_2$) and the statement
follows. □
Proposition 7.6 gives a method for classifying groups of order 4:
Theorem 7.10 (Euler’s Theorem) Let $a$ and $n$ be coprime integers. Then
$n \mid (a^{\varphi(n)} - 1)$.
Proof: Let $b = (a)_n$ be the residue. Since the numbers are coprime, $b \in \mathbb{Z}_n^\times$.
By Proposition 7.3, $|b|$ divides $|\mathbb{Z}_n^\times| = \varphi(n)$. Hence $b^{\varphi(n)} = 1$ in $\mathbb{Z}_n^\times$. Consequently,
$a^{\varphi(n)} - 1 = (a^{\varphi(n)} - b^{\varphi(n)}) + (b^{\varphi(n)} - 1)$ is divisible by $n$ in $\mathbb{Z}$. □
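Euler's theorem is easy to test numerically. The sketch below is an illustration only: it computes $\varphi(n)$ naively by counting residues coprime to $n$ and checks the congruence for every unit; the names `phi` and `euler_holds` are made up here.

```python
from math import gcd

def phi(n):
    """Euler's function: the number of 1 <= k <= n coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def euler_holds(n):
    """Check a^phi(n) = 1 (mod n) for every a in 1..n coprime to n."""
    exponent = phi(n)
    return all(
        pow(a, exponent, n) == 1 % n      # 1 % n handles the trivial case n = 1
        for a in range(1, n + 1)
        if gcd(a, n) == 1
    )

print(all(euler_holds(n) for n in range(1, 200)))
```

The three-argument `pow` performs modular exponentiation, so the check stays fast even though the powers themselves are huge.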
The following fact follows easily.
8 Normal subgroups
We introduce normal subgroups. Using these tools, we classify groups of
order up to 8.
8.1 Normal Subgroups
We need the notion of normal subgroup to carry on.
Definition. A subgroup H of a group G is called normal in G if the left
and right cosets [g]H = gH and H [g] = Hg are equal for all g ∈ G.
The standard notation for “$H$ is a normal subgroup of $G$” is $H \trianglelefteq G$ or
$H \triangleleft G$. ($H \triangleleft G$ is sometimes but not always used to mean that $H$ is a proper
normal subgroup of $G$, i.e. $H \neq G$.)
Examples. 1. The two standard subgroups G and {1} of any group G are
both normal.
2. Any subgroup of an abelian group is normal.
3. In the example G = D6 in Subsection 7.1, we saw that the sub-
group {(), (1, 2, 3), (1, 3, 2)} (= {1, a, a2 }) is normal in G, but the subgroup
{(), (2, 3)} = {1, b} is not normal in G.
4. More generally, SO2 (R) is a normal subgroup of O2 (R). From its
multiplication table, one coset consists of rotations Rα and another coset
consists of reflections Sα .
In examples 3 and 4, the normal subgroup has index 2, for instance,
H = {1, a, a2 } and |G|/|H| = 6/3 = 2 in example 3.
8.2 Groups of order 8
Proposition 8.3 Let $G$ be a group of order 8. Then $G$ is isomorphic to
one of $C_8$, $C_4 \times C_2$, $C_2 \times C_2 \times C_2$, $D_8$ and $Q_8$.
Proof: By Proposition 7.3, non-identity elements of $G$ may have order 2, 4
or 8. If all non-identity elements have order 2, then $G \cong C_2 \times C_2 \times C_2$ by
Proposition 7.6.
Otherwise, there is an element $a \in G$ of order 4; for instance, if $|x| = 8$
then $a = x^2$ is of order 4. By Proposition 8.1, $N = \langle a \rangle = \{1, a, a^2, a^3\}$ is a
normal subgroup. Pick any $b \in G \setminus N$. Since $bab^{-1} \in N$ and $|bab^{-1}| = |a| = 4$
by Lemma 7.4, $bab^{-1}$ must be either $a$ or $a^{-1}$.
Also we cannot have $b^2 \in Nb$, as this would imply $b^2 = nb$ for some
$n \in N$ and $b = n \in N$. Hence, $b^2 \in N$ since $G = N \cup Nb$.
Before we analyse the eight possibilities, we observe that $G$ is generated by
$a$ and $b$ since $\langle a, b \rangle$ properly contains $N$; hence, it has index strictly less
than 2. In particular, this implies that $bab^{-1} = a$ makes $G$ abelian while
$bab^{-1} = a^3$ makes it nonabelian.
Let us assume $bab^{-1} = a^3$ and analyse four nonabelian possibilities:
(iii) $b^2 = a^2$, then $G \cong Q_8$ by Proposition 4.7
(iv) $b^2 = a^3$ is impossible as this means that $a = (a^3)^3 = b^6$ and $a^3 =$
□
It is worth pointing out why these five groups of order 8 are nonisomorphic.
Three of them are abelian, two are nonabelian. The nonabelian
$D_8$ and $Q_8$ have different numbers of elements of order 2: 5 and 1
respectively. The abelian ones are distinct by the fundamental theorem of abelian
groups from Algebra-1. Alternatively, you can just count the number of
elements of order 2 in each of them.
8.3 Groups of order 2p
Let $p$ be a prime. We know two groups of order $2p$: the cyclic group $C_{2p}$
and the dihedral group $D_{2p}$. It is worth pointing out what happens at small
primes. If $p = 2$ then $D_4 \cong K_4$ is abelian, non-isomorphic to $C_4$. If $p = 3$
then $D_6$ is nonabelian, isomorphic to the symmetric group $S_3$.
(ii) Show that O2 (Z3 ) is a group of order 8 and determine which of the
groups in Proposition 8.3 it is isomorphic to.
(iii) For a commutative ring R we denote Tn (R) the group of triangular
n × n-matrices with coefficients in R, i.e. Tn (R) = {(aij ) ∈ GLn (R) |
aij = 0 whenever i > j}. Show that T3 (Z2 ) is a group of order 8 and
determine which of the groups in Proposition 8.3 it is isomorphic to.
(iv) Let $ST_n(R)$ be the subgroup of $T_n(R)$ of matrices with determinant
1. Show that $ST_2(\mathbb{Z}_3)$ is a group of order 6 and determine which of the
groups in Proposition 8.4 it is isomorphic to.
(v) Show that ST2 (Z4 ) is a group of order 8 and determine which of the
groups in Proposition 8.3 it is isomorphic to.
8.5 Vista: classification of groups
We have advanced quite far in classification of groups. To facilitate the
discussion, let f (n) be the number of non-isomorphic groups of order n. We
know so far that if $p$ is prime then $f(p) = 1$, $f(2p) = 2$, $f(8) = 5$. Later
on, we are going to classify groups of order $p^2$: they are all abelian, hence
$f(p^2) = 2$.
The next interesting order is 12: $f(12) = 5$, only 2 of which are abelian. Out
of the 3 nonabelian ones, we know only $D_{12}$ so far, but we are introducing the two
remaining ones later in this course, although we are not proving that the list is
exhaustive. One needs Sylow’s theorem to handle 15, $f(15) = 1$. The next
number 16 is the first case where I would not know what to do: $f(16) = 14$
and I don’t know most of them$^{10}$. The next interesting case is 18: $f(18) = 5$.
Four of the groups are straightforward: $C_{18}$, $C_3 \times C_6$, $D_{18}$, $C_3 \times D_6$ but the
remaining one requires semidirect products. In general, if $n$ has no large
prime powers, $f(n)$ is easy to handle. $f(p^3) = 5$ with all 5 groups easy to
construct: three are abelian, U T3 (Zp ) (subgroup of T3 consisting of matrices
with 1 on the main diagonal) and Prime powers11 behave badly: f (32) = 51,
f (64) = 267, . . . , f (1024) = 49487365422.
While classifying all finite groups appears hopeless, it may be possible
to understand generating functions $\sum_n f(n)z^n$ or $\sum_n f(p^n)z^n$.
9 Homomorphisms
We introduce the notion of a homomorphism, its image and its kernel.
10 although see M. Wild, Groups of order 16 made easy, American Mathematical Monthly, 112 (2005), 20–31.
11 Check out the book with exciting title The groups of order $2^n$ ($n \leq 6$) by Hall and Senior.
9.1 Definition and examples of homomorphisms
Definition. Let G, H be groups, R, S rings. A group homomorphism φ
from G to H is a function φ : G → H such that φ(g1 g2 ) = φ(g1 )φ(g2 ) for all
g1 , g2 ∈ G. A ring homomorphism φ from R to S is a function φ : R → S
such that φ(1R ) = 1S , φ(r1 r2 ) = φ(r1 )φ(r2 ), and φ(r1 + r2 ) = φ(r1 ) + φ(r2 )
for all r1 , r2 ∈ R.
Notice that the ring homomorphism requires an extra condition concerning
the identity. For instance, the natural map $R \to R \times S$, $r \mapsto (r, 0_S)$ is
not a ring homomorphism. Such surprises don’t happen with groups.
Otherwise, consider the complex conjugation $x \mapsto x^*$, $\mathbb{C} \to \mathbb{C}$. It is a group
homomorphism but not a linear map of vector spaces over $\mathbb{C}$, although it is
a linear map of vector spaces over $\mathbb{R}$.
5. Let $R$ be a commutative ring of characteristic $p$, that is, $px = 0$ for
any $x \in R$ where $p$ is a prime number. The ring $R$ admits a Frobenius
homomorphism, $F : R \to R$ defined by $F(x) = x^p$. The tricky part is the
preservation of addition. The identity $(x + y)^p = x^p + y^p$ is sometimes called
the freshman’s dream binomial formula. It holds because the commutativity of
$x$ and $y$ implies that
$$(x + y)^p = x^p + y^p + \sum_{k=1}^{p-1} \frac{p!}{k!(p-k)!}\, x^k y^{p-k}$$
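The middle terms of this expansion vanish in characteristic $p$ because $p$ divides each binomial coefficient $\frac{p!}{k!(p-k)!}$ with $0 < k < p$. Here is a quick numerical check in Python, offered only as an illustration (the function name is hypothetical); it also shows that the divisibility fails for some composite exponents.

```python
from math import comb

def middle_binomials_vanish(p):
    """True if p divides C(p, k) for every 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

for n in (2, 3, 4, 5, 6, 7, 9, 11, 13):
    print(n, middle_binomials_vanish(n))
```

For instance $\binom{4}{2} = 6$ is not divisible by 4, so the "dream" really does need a prime characteristic.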
The composition $\det \circ\, \Omega$ is a particularly interesting homomorphism, called
the sign homomorphism. Since $\sigma^m = 1$ for some $m > 1$, $(\det \circ\, \Omega(\sigma))^m = \det \circ\, \Omega(\sigma^m) =
1$. There are only two real numbers, 1 and −1, that have finite order, hence
we have a homomorphism $\mathrm{sign} : S_n \to \{-1, 1\} \cong C_2$ which distinguishes
odd and even permutations. The former have $\mathrm{sign}(\sigma) = -1$ and the latter
satisfy $\mathrm{sign}(\sigma) = 1$.
Please note that there is a risk of a circular argument here. It all depends
on how you define the determinant! If you define it algebraically by $\det(a_{i,j}) =
\sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n}$ then you are in real trouble since you
are using the determinant to define the sign of a permutation and vice versa. We
will break this circle in Section 13.3.
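To see concretely that the sign can be computed without any determinant, one can count inversions. This is an assumption of the sketch below, not necessarily the route taken in Section 13.3, and the function names are made up. The code checks by brute force that the resulting map $S_4 \to \{-1, 1\}$ is multiplicative.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation of 0..n-1, via the parity of its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    """(p∘q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

S4 = list(permutations(range(4)))

# sign(pq) = sign(p) sign(q) for all 24 * 24 ordered pairs.
is_homomorphism = all(
    sign(compose(p, q)) == sign(p) * sign(q) for p in S4 for q in S4
)
even = [p for p in S4 if sign(p) == 1]
print(is_homomorphism, len(even))
```

A brute-force check is of course not a proof for general $n$, but it is a useful guard against sign conventions going wrong.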
9.2 Image
The image im(φ) of a homomorphism is just its image as a function, and
the following propositions are straightforward to prove.
9.3 Kernels
Definition. Let φ : G → H be a homomorphism. Then the kernel ker(φ)
of φ is defined to be the set of elements of G that map onto 1H ; that is,
ker(φ) = {g | g ∈ G, φ(g) = 1H }.
ker(φ) = {g | g ∈ G, φ(g) = 0H }.
Proof: Checking that K = ker(φ) is a subgroup of G is straightforward,
using Proposition 1.5. If g ∈ G then
so K is normal. □
Examples. 10. Here is an example of a homomorphism from an additive
group to a multiplicative group. Let us define $\phi : \mathbb{C}^+ \to \mathbb{C}^\times$ by $\phi(g) =
\exp(g)$. Then $\phi(g_1 + g_2) = \phi(g_1)\phi(g_2)$, which says that $\phi$ is a homomorphism.
In fact $\phi$ is surjective but not injective. The kernel of $\phi$ is $2\pi i\mathbb{Z}$ since
$\exp(x + iy) = \exp(x)(\cos(y) + i\sin(y))$ for $x, y \in \mathbb{R}$.
11. A close relative of example 10 is the homomorphism R : R+ →
O2 (R), essentially defined in Section 5.2. Recall that R(α) = Rα , the rota-
tion by α matrix. It is a homomorphism since Rα Rβ = Rα+β . It is neither
surjective, nor injective. Its image is SO2 (R) and its kernel 2πZ.
12. Let $G = H = D_{12}$, the dihedral group of order 12. We saw in
Subsection 5.3 that $G = \{a^k \mid 0 \leq k < 6\} \cup \{a^k b \mid 0 \leq k < 6\}$. We define
$\phi : G \to H$ by $\phi(a^k) = a^{2k}$ and $\phi(a^k b) = a^{2k} b$ for $0 \leq k < 6$. We claim
that $\phi$ is a homomorphism. It seems at first sight as though we need to
check that $\phi(gh) = \phi(g)\phi(h)$ for all 144 ordered pairs $g, h \in G$, but we can
group these tests into the four distinct types listed in Subsection 5.3. We
will make free use of the fact that $a^m = 1$ when $6 \mid m$.
(i) $\phi(a^k a^l) = \phi(a^{k+l})$ or $\phi(a^{k+l-6}) = a^{2(k+l)}$ or $a^{2(k+l-6)} = a^{2k} a^{2l} =
\phi(a^k)\phi(a^l)$;
(ii) $\phi(a^k (a^l b)) = \phi(a^k)\phi(a^l b)$, which is similar to (i);
(iii) $\phi((a^k b)a^l) = \phi(a^{k-l} b)$ or $\phi(a^{k-l+6} b) = a^{2(k-l)} b$ or $a^{2(k-l+6)} b = a^{2k} a^{-2l} b =
a^{2k} b a^{2l} = \phi(a^k b)\phi(a^l)$;
(iv) $\phi((a^k b)(a^l b)) = \phi(a^k b)\phi(a^l b)$, which is similar to (iii).
So $\phi$ really is a homomorphism. We can check that the only elements
of $G$ with $\phi(g) = 1$ are $g = 1$ and $g = a^3$, so $\ker(\phi) = \{1, a^3\}$, which is
the normal subgroup that we considered in Example 3 of Subsection 10.2.
$\mathrm{im}(\phi)$ consists of the 6 elements $1, a^2, a^4, b, a^2 b, a^4 b$ of $G$.
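Checking all 144 ordered pairs is tedious by hand but trivial by machine. In the following Python sketch (an illustration; the encoding is an assumption of the sketch) the element $a^k b^e$ of $D_{12}$ is stored as the pair `(k, e)` with $0 \leq k < 6$ and $e \in \{0, 1\}$, and the product uses the relation $b a^l = a^{-l} b$.

```python
n = 6
G = [(k, e) for e in (0, 1) for k in range(n)]   # (k, e) stands for a^k b^e

def mul(x, y):
    """(a^k b^e)(a^l b^f) = a^(k ± l) b^(e+f), with minus when e = 1."""
    (k, e), (l, f) = x, y
    return ((k + (-l if e else l)) % n, (e + f) % 2)

def phi(x):
    """The map a^k b^e -> a^(2k) b^e from Example 12."""
    k, e = x
    return ((2 * k) % n, e)

is_hom = all(phi(mul(g, h)) == mul(phi(g), phi(h)) for g in G for h in G)
kernel = sorted(g for g in G if phi(g) == (0, 0))
image = sorted({phi(g) for g in G})

print(is_hom)    # all 144 pairs pass
print(kernel)    # the pairs encoding 1 and a^3
print(image)     # the pairs encoding 1, b, a^2, a^2 b, a^4, a^4 b
```

The kernel has 2 elements and the image has 6, in agreement with $|G| = 12$.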
In general, if φ : G → H is a homomorphism and J is a subset of
H, then we define the complete inverse image of J under φ to be the set
φ−1 (J) = {g ∈ G | φ(g) ∈ J}. It is easy to check, using Proposition 1.5,
that if J is a subgroup of H, then φ−1 (J) is a subgroup of G.
Here is a final statement, which will be useful later.
Proof: Since $1_G \in \ker(\phi)$, if $\phi$ is injective, then we must have $\ker(\phi) =
\{1_G\}$. Conversely, suppose that $\ker(\phi) = \{1_G\}$, and let $g_1, g_2 \in G$ with
$\phi(g_1) = \phi(g_2)$. Then $1_H = \phi(g_1)^{-1}\phi(g_2) = \phi(g_1^{-1} g_2)$ (by Lemma 9.1), so
$g_1^{-1} g_2 \in \ker(\phi)$ and hence $g_1^{-1} g_2 = 1_G$ and $g_1 = g_2$. So $\phi$ is injective. □
9.4 Exercises
(i) Let V and W be vector spaces over the field Q of rational numbers.
Show that any group homomorphism ψ : V → W is a linear map.
(ii) A homomorphism φ : A → B is called an epimorphism if for any pair
of homomorphisms α, β : B → C the equality αφ = βφ implies that
α = β. Prove that any surjective homomorphism is an epimorphism.
(iii) Prove that the natural embedding Z → Q is an epimorphism of rings
but not surjective.
(iv) Let G be a Klein Four Group. How many distinct homomorphisms
φ : G → G are there? How many of these are isomorphisms?
(v) Let φ : R → S be a ring homomorphism. Prove that if K is a subring
of S, then φ−1 (K) is a subring of R.
Prove that if A is a subring of R, then φ(A) is a subring of S.
Prove that if I is an ideal of S, then φ−1 (I) is an ideal of R.
Give an example where J is an ideal of R but φ(J) is not an ideal of
S.
(vi) Let us consider the following setup. For each natural number $n$ we are
given a group $G_n$ and a group homomorphism $\phi_n : G_n \to G_{n+1}$. Prove
that
$$G_\infty = \{(x_1, x_2, \ldots) \in \prod_{n=1}^{\infty} G_n \mid \forall i\ \phi_i(x_i) = x_{i+1}\}$$
is a subgroup of $\prod_{n=1}^{\infty} G_n$.
(viii) Let $p$ be a prime number. Let $G_n = C_{p^n}$ be the cyclic group of order
$p^n$ with a generator $x_n$. We define $\phi_n : G_n \to G_{n+1}$ by $\phi_n(x_n^a) = x_{n+1}^{pa}$.
Using the above construction we obtain a group $G$, called a quasicyclic
group and usually denoted $C_{p^\infty}$. Prove that $C_{p^\infty}$ is isomorphic to $H$
where $H = \{z \in \mathbb{C}^\times \mid \exists n\ z^{p^n} = 1\}$.
(ix) Describe all distinct group homomorphisms from Cn to Cm and com-
pute their number. Which of them are ring homomorphisms from Zn
to Zm ?
9.5 Vista: Jacobian conjecture
Let $R = \mathbb{C}[x_1, \ldots, x_n]$ be the ring of complex polynomials. All $\mathbb{C}$-linear
ring homomorphisms $\phi : R \to R$ are easy to describe. Such a homomorphism
gives $n$ polynomials $f_i = \phi(x_i)$. In the other direction, any $n$-tuple of polynomials
$f_i$ defines a linear ring homomorphism $\phi(F(x_1, \ldots, x_n)) = F(f_1, \ldots, f_n)$.
The fun starts when we want to decide which of them are isomorphisms
and $n \geq 2$. Let us consider the Jacobian of $\phi$:
10 Quotient groups
We introduce quotient groups and prove the isomorphism theorem. We
use it to prove Cayley’s theorem.
10.1 Quotient Groups
Definition. If A and B are subsets of a group G, then we define their
product AB = {ab | a ∈ A, b ∈ B}.
The definition of quotient group depends on the following technical re-
sult.
12 read http://sbseminar.wordpress.com/2009/05/27/how-not-to-prove-the-jacobian-conjecture/ before you write one of these http://arxiv.org/abs/0912.1924v1
Proposition 10.2 Let N be a normal subgroup of a group G. Then the set
G/N of cosets [g] = N g of N in G forms a group under multiplication of
sets.
Proof: We have just seen that $[g][h] = [gh]$, so we have closure, and
associativity follows easily from associativity of $G$. Since $[1][g] = [1g] = [g]$ for
all $g \in G$, $[1]$ is an identity element, and since $[g^{-1}][g] = [g^{-1}g] = [1]$, $[g^{-1}]$
is an inverse to $[g]$ for all cosets $[g]$. Thus the four group axioms are satisfied
and $G/N$ is a group. □
Definition. The group G/N is called the quotient group (or the factor
group) of G by N .
Notice that if G is finite, then |G/N | = |G : N | = |G|/|N |. Let us finish
with the following fact.
namely $G = D_{12}$ and $N = \{1, a^3\}$. Then $|G/N| = |G|/|N| = 6$. Since
$a^3 \in N$, we see that
$[a]^3 = [a^3] = Na^3 = N = [1]$
is the identity of $G/N$. We also have $[b]^2 = [1]$ and $[b][a] = [a^{-1}][b]$, because
these relations are inherited from the corresponding relations of $G$. Thus
$G/N$ is a group of order 6 satisfying the three relations $[a]^3 = 1$, $[b]^2 = 1$,
$[b][a] = [a]^{-1}[b]$, and, by Proposition 5.5, $G/N \cong D_6$.
It might be helpful in understanding this example to see the full multipli-
cation table of G (cf. Subsection 5.3), with the elements arranged according
to their cosets. Notice that all elements in each 2 × 2 block of this table lie
in the same coset of N in G. We can then see the multiplication table of
G/N by regarding these 2 × 2 blocks as single elements (i.e. cosets) in the
quotient group.
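             | N           | Na          | Na^2        | Nb          | Nab         | Na^2b
             | 1     a^3   | a     a^4   | a^2   a^5   | b     a^3b  | ab    a^4b  | a^2b  a^5b
-------------+-------------+-------------+-------------+-------------+-------------+------------
N      1     | 1     a^3   | a     a^4   | a^2   a^5   | b     a^3b  | ab    a^4b  | a^2b  a^5b
       a^3   | a^3   1     | a^4   a     | a^5   a^2   | a^3b  b     | a^4b  ab    | a^5b  a^2b
Na     a     | a     a^4   | a^2   a^5   | a^3   1     | ab    a^4b  | a^2b  a^5b  | a^3b  b
       a^4   | a^4   a     | a^5   a^2   | 1     a^3   | a^4b  ab    | a^5b  a^2b  | b     a^3b
Na^2   a^2   | a^2   a^5   | a^3   1     | a^4   a     | a^2b  a^5b  | a^3b  b     | a^4b  ab
       a^5   | a^5   a^2   | 1     a^3   | a     a^4   | a^5b  a^2b  | b     a^3b  | ab    a^4b
Nb     b     | b     a^3b  | a^5b  a^2b  | a^4b  ab    | 1     a^3   | a^5   a^2   | a^4   a
       a^3b  | a^3b  b     | a^2b  a^5b  | ab    a^4b  | a^3   1     | a^2   a^5   | a     a^4
Nab    ab    | ab    a^4b  | b     a^3b  | a^5b  a^2b  | a     a^4   | 1     a^3   | a^5   a^2
       a^4b  | a^4b  ab    | a^3b  b     | a^2b  a^5b  | a^4   a     | a^3   1     | a^2   a^5
Na^2b  a^2b  | a^2b  a^5b  | ab    a^4b  | b     a^3b  | a^2   a^5   | a     a^4   | 1     a^3
       a^5b  | a^5b  a^2b  | a^4b  ab    | a^3b  b     | a^5   a^2   | a^4   a     | a^3   1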
The reason that this is not obvious is that we can have ${}_K[g] = {}_K[h]$
with $g \neq h$, and when that happens we need to be sure that $\phi(g) = \phi(h)$.
This is called checking that the map $\bar\phi$ is well-defined. In fact, once you
have understood what needs to be checked, then doing it is quite easy,
because ${}_K[g] = {}_K[h] \Rightarrow g = kh$ for some $k \in K = \ker(\phi)$, and then
$\phi(g) = \phi(k)\phi(h) = \phi(h)$.
Clearly $\mathrm{im}(\bar\phi) = \mathrm{im}(\phi)$, and it is straightforward to check that $\bar\phi$ is a
homomorphism. Finally,
the quotient map p(α) = [α], the isomorphism ψ([α]) = Rα and the embed-
ding ι(T ) = T .
Let us illustrate this theorem using Example 12 from Section 9. Note
that the elements of G = D12 are listed in two separate columns in the
diagram, in different orders, once for the domain and once for the codomain
of φ. The elements of imφ are printed slightly to the left of those not in
im(φ) in the codomain column.
                            φ
1,    a^3   = [1]      ->   1
                                a
a,    a^4   = [a]      ->   a^2
                                a^3
a^2,  a^5   = [a^2]    ->   a^4
                                a^5
b,    a^3b  = [b]      ->   b
                                ab
ab,   a^4b  = [ab]     ->   a^2b
                                a^3b
a^2b, a^5b  = [a^2b]   ->   a^4b
                                a^5b
10.4 Cayley Theorem
As an application, let us prove Cayley’s theorem.
(ii) Prove that if G is a simple group and H is a subgroup of index is simple
if it has exactly two normal subgroups: G and 1. Prove that a simple
abelian group is a cyclic group of prime order.
(iii) Let $N \trianglelefteq G$. Prove that the subgroups of $G/N$ are precisely the quotient
groups $I/N$, for subgroups $I$ of $G$ that contain $N$.
(iv) Let $H$ be any subgroup and let $K$ be a normal subgroup of a group $G$.
Then $H \cap K$ is a normal subgroup of $H$ and $H/(H \cap K) \cong HK/K$.
(v) Let $K \subseteq H \subseteq G$, where $H$ and $K$ are both normal subgroups of $G$.
Then $(G/K)/(H/K) \cong G/H$.
10.6 Vista: simple groups
The classification of finite simple groups was both a major success and a
major tragedy of 20th-century mathematics. While the theorem is
beautiful, the proof is spread over 30,000 journal pages. And up until now
we have not got an acceptably written proof. You can read more about
this topic on Wikipedia; search for “Classification of finite simple groups”.
Currently, mathematicians work on second and third generation proofs. Inna
Capdeboscq at Warwick is actively involved in both projects.
If you are thinking of writing an essay, it may be a good idea to describe
all simple groups of order up to 1000. Roger Carter used to teach an MA4**
level module with exactly this topic (and title). Besides 168 cyclic groups
of prime order, there are only five nonisomorphic groups: the alternating
groups $A_5$ of order 60 and $A_6$ of order 360 as well as the linear groups $GL_3(\mathbb{F}_2)$
of order 168, $PGL_2(\mathbb{F}_8)$ of order 504 and $PSL_2(\mathbb{F}_{11})$ of order 660. While
for most numbers $n$ it is relatively easy to show that there is no simple
group of order $n$, the numbers 120, 540 and 720 will require special attention$^{13}$.
two properties are the same. For a non-commutative ring, one sometimes
introduces left ideals satisfying $xI \subseteq I$ and right ideals satisfying $Ix \subseteq I$.
Examples. 1. Any ring R has two boring ideals: zero ideal {0} and the
ring R itself.
2. For any n ∈ Z, the numbers in Z divisible by n form an ideal nZ. Notice
that 1Z = Z and 0Z = {0}.
3. Let R be a non-zero ring, n ≥ 2. Let I be the set of all matrices in
Mn (R) that vanish outside one particular row. Then I is a right ideal but
not an ideal. In particular, I is not a left ideal. Likewise, let J be the set
of all matrices in Mn (R) that vanish outside one particular column. Then
J is a left ideal but not an ideal. In particular, J is not a right ideal.
We won’t take any interest in this course in the left and right ideals.
They will be studied in the third year module Rings and Modules.
calculation
$$(r_{ij}) = \sum_{k,l} (r_{k,l} E_{k,1}) E_{1,1} E_{1,l}$$
11.4 The isomorphism theorem
The following theorem is also called the first isomorphism theorem, see
the exercises for the second and the third ones.
(i) Let $I, J$ be ideals in $R$. Show that $I \cap J$, $I + J = \{x + y \mid x \in I, y \in J\}$
and $IJ = \{\sum_i x_i y_i \mid x_i \in I, y_i \in J\}$ are all ideals of $R$ and that
$IJ \subseteq I \cap J \subseteq I + J$.
(ii) Let $S$ be any subring and let $I$ be an ideal in a ring $R$. Then $S \cap I$ is
an ideal in $S$ and $S/(S \cap I) \cong (S + I)/I$.
(iii) Let $I \subseteq J \subseteq R$, where $I$ and $J$ are both ideals in $R$. Then $J/I$ is an
ideal in $R/I$ and $(R/I)/(J/I) \cong R/J$.
(iv) Compute the ring Z[i]/(1 + 3i).
(v) Compute the ring Z[i]/(5 + 3i).
(vi) Let Ra = R[x]/(x2 − a). Show that R0 , R1 and R−1 are pairwise
non-isomorphic rings.
(vii) Rf = R[x]/(f (x)) where f (x) is a quadratic polynomial. Show that
Rf is isomorphic to either R0 , or R1 , or R−1 .
(viii) Let $R$ be a ring with an ideal $I_i$ for each natural $i$. Prove that if
$I_i \subseteq I_{i+1}$ then the union $J = \cup_{n=1}^{\infty} I_n$ is an ideal.
Proof: It follows easily from the properties of the quotient maps that $\psi$ is
a ring homomorphism. Now $x \in \ker \psi \Leftrightarrow \forall k\ x + I_k = 0 + I_k \Leftrightarrow \forall k\ x \in I_k \Leftrightarrow
x \in \cap_k I_k$. □
The isomorphism theorem tells us that there is an induced injective
homomorphism of rings $\psi : R/\cap_k I_k \to \prod_k R/I_k$. In general, a Chinese
Remainder Theorem (CRT) is any statement that concludes that $\psi$ is an
isomorphism. We are going to prove it for the ring $\mathbb{Z}$.
Lemma 12.3 If R and S are rings, the groups (R × S)× and R× × S × are
isomorphic.
Proof: Both groups are subsets of the ring R × S. Let us observe that
these subsets are equal: (r, s) ∈ (R × S)× if and only if ∃(a, b) ∈ R ×
S (a, b)(r, s) = (1, 1) if and only if ∃a ∈ R, b ∈ S ar = 1, bs = 1 if and only if
r ∈ R× & s ∈ S × if and only if (r, s) ∈ R× × S × . The required isomorphism
is the identity map, which clearly preserves the multiplication. □
We are ready to compute it now.
Corollary 12.5 If $m = p_1^{a_1} \cdots p_k^{a_k}$ where $p_i$ are distinct primes then $\varphi(m) =
\prod_{i=1}^{k} (p_i^{a_i} - p_i^{a_i - 1})$.
Proof: By Corollary 12.4, $\varphi(m) = \prod_{i=1}^{k} \varphi(p_i^{a_i})$. By Lemma 12.3, $m \in \mathbb{Z}_{p_i^{a_i}}$
is a unit if and only if $m$ is coprime to $p_i^{a_i}$. The latter is equivalent to not
being divisible by $p_i$. The remainders divisible by $p_i$ are of the form $xp_i$ for
some $x$, so there are exactly $p_i^{a_i - 1}$ of them. Hence $\varphi(p_i^{a_i}) = p_i^{a_i} - p_i^{a_i - 1}$. □
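The product formula can be compared with the definition of $\varphi$ by direct counting. The Python sketch below is an illustration only; `factorise` uses plain trial division, which is an assumption made for simplicity and is adequate only for small numbers.

```python
from math import gcd

def phi_by_counting(m):
    """phi(m) straight from the definition: residues 1..m coprime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def factorise(m):
    """Prime factorisation of m as a dict {prime: exponent}, by trial division."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def phi_by_formula(m):
    """phi(m) as the product of p^a - p^(a-1) over the prime powers of m."""
    result = 1
    for p, a in factorise(m).items():
        result *= p**a - p**(a - 1)
    return result

print(phi_by_formula(360), phi_by_counting(360))
print(all(phi_by_formula(m) == phi_by_counting(m) for m in range(1, 500)))
```

For $m = 360 = 2^3 \cdot 3^2 \cdot 5$ the formula gives $(8 - 4)(9 - 3)(5 - 1) = 96$.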
12.3 RSA
Suppose that you run a website so popular with customers that every
minute thousands of customers are queueing up to pay you with their credit
cards. The key exchange with each of them will slow you down and you
need an asymmetric key solution instead. You need two keys: a private
key used and known only to you and a public key that you would give to
any customer, including hackers trying to compromise the system. A more
serious industrial application of asymmetric keys is the PGP system, which
allows you to exchange messages securely with any number of people on the
internet by giving away your public key.
The source of the key is two large (approximately 1000 bits or so) primes
$p$ and $q$. The public key consists of the product $n = pq$ and the exponent $e$
such that $\gcd(e, \varphi(n)) = 1$. Popular choices of public exponent are $65537 =
2^{16} + 1$ or $17 = 2^4 + 1$. The private key consists of the private exponent $d$
such that $(ed)_{\varphi(n)} = 1$. This number is precomputed once and stored.
The public key is available to anybody shopping online. The credit card
number is padded with bits upfront to form a message $m$. Padding ensures
the security condition $m^e \gg n$ while making sure $m$ is not divisible by $p$ and
$q$, usually by taking $m < p$ and $m < q$. The encoded message $x = (m^e)_n$ is sent over the
internet. The choice of public exponent usually enables fast calculation. For
instance, $m^{65537}$ is computed using 17 multiplications.
The vendor receives the message and decrypts it by $m = (m^{ed})_n = (x^d)_n$
with the first equality thanks to Euler’s theorem. It is more computationally
intensive but doable: since $d < \varphi(n) < n$, the exponent $d$ could have up to about
2,000 binary bits, so $x^d$ may require up to about 4,000 multiplications.
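The whole round trip fits in a few lines of Python. The sketch below uses toy primes that are far too small to be secure and omits padding entirely; the parameters and the names `encrypt` and `decrypt` are chosen for illustration only. The modular inverse via `pow(e, -1, phi_n)` needs Python 3.8 or later.

```python
from math import gcd

# Toy parameters: hopelessly insecure, chosen only so the numbers stay readable.
p, q = 61, 53
n = p * q                      # public modulus
phi_n = (p - 1) * (q - 1)      # phi(n) for a product of two distinct primes
e = 17                         # public exponent
assert gcd(e, phi_n) == 1

d = pow(e, -1, phi_n)          # private exponent: e*d = 1 modulo phi(n)

def encrypt(m):
    """x = m^e mod n, computable by anyone who knows the public key (n, e)."""
    return pow(m, e, n)

def decrypt(x):
    """m = x^d mod n, computable only with the private exponent d."""
    return pow(x, d, n)

m = 65
x = encrypt(m)
recovered = decrypt(x)
print(d, x, recovered)
```

Python's three-argument `pow` uses repeated squaring internally, which is why the number of multiplications grows with the number of bits of the exponent rather than with the exponent itself.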
A hacker can easily collect the following ingredients: $x$, $e$, $n$. To get to the
credit card number $m$, the hacker needs $d$ or $\varphi(n)$. He or she will need either
to decompose $n$ into the product of $p$ and $q$, which gives $\varphi(n)$, or compute
$d$ directly in the group $\mathbb{Z}_n^\times$ by some other means. With such a large $n$ any
known method is going to take thousands of years. Mathematicians know
at least one way to do it faster by using quantum computers. It is the engineers’
turn to figure out how to build them.
12.4 CRT, elementary form
We recall that one writes x ≡n y if n divides x − y. The Chinese Re-
mainder Theorem can be formulated on the level of systems of comparisons.
□
Our proof of CRT is not constructive. We do not know from the proof
how to solve the system of comparisons. On the other hand, one can easily
derive a constructive proof from the solution method we are about to
describe. The key is the number $N_i = N/n_i = \prod_{k \neq i} n_k$. It is divisible by any
$n_j$, $j \neq i$ and coprime to $n_i$. So, it is a generator of $C_{n_i}$ inside $\mathbb{Z}_N^+ \cong \prod_j C_{n_j}$.
Let us look at a concrete example. Let us solve x ≡_7 6, x ≡_11 5,
x ≡_13 4. In this case, N = 7 · 11 · 13 = 1001 and the generators are
N_1 = 143 ≡_7 3, N_2 = 91 ≡_11 3, N_3 = 77 ≡_13 12 ≡_13 −1. In Z_7 we have
6 = 2 · 3, in Z_11 we have 5 = 9 · 3 and in Z_13 we have 4 = 9 · (−1). Hence,
x = 2N_1 + 9N_2 + 9N_3 = 2 · 143 + 9 · 91 + 9 · 77 = 286 + 819 + 693 = 1798
is a solution. Any other solution is 1798 + 1001k. In particular,
1798 − 1001 = 797 is the minimal positive solution.
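The method of this example can be turned into a short program. The Python sketch below is our own illustration (the function name crt is an assumption, and pow(N_i, -1, n_i) requires Python 3.8 or later). For each i it finds the coefficient c_i with r_i = c_i · N_i in Z_{n_i} and adds up the terms c_i N_i.

```python
def crt(residues, moduli):
    """Solve x ≡ residues[i] (mod moduli[i]) for pairwise coprime moduli.

    Returns (x, x mod N, N), where x is the sum of c_i * N_i as in the
    worked example and x mod N is the minimal non-negative solution.
    """
    N = 1
    for n_i in moduli:
        N *= n_i
    x = 0
    for r_i, n_i in zip(residues, moduli):
        N_i = N // n_i                        # divisible by every n_j, j != i
        c_i = r_i * pow(N_i, -1, n_i) % n_i   # r_i = c_i * N_i in Z_{n_i}
        x += c_i * N_i
    return x, x % N, N


print(crt([6, 5, 4], [7, 11, 13]))  # (1798, 797, 1001)
```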
12.5 Exercises
Exercises (v)-(x) and above take you through two general versions of
CRT.
(i) Let p_k be the k-th prime number, I_k = (p_k). Prove that the natural
map ψ : Z/∩_k I_k → ∏_k Z/I_k is not an isomorphism. Conclude that
CRT fails for infinite sets of ideals.
(ii) Explain how to compute m^65537 using 17 multiplications.
(iii) Explain why the security condition m^e ≫ n is necessary.
(iv) Find the smallest positive x such that x ≡_7 3, x ≡_13 4, x ≡_19 5.
(v) Let I, J ⊴ R. We say that I and J are comaximal if I + J = R. Prove
that the natural map ψ : R/(I ∩ J) → R/I × R/J is an isomorphism if
and only if I and J are comaximal.
(vi) Let I_1, . . . , I_n be a collection of ideals in R. Suppose I_k and ∩_{j<k} I_j are
comaximal for each k = 2, 3, . . . , n. Using induction, prove that the
natural homomorphism ψ : R/∩_k I_k → ∏_k R/I_k is an isomorphism.
(vii) Let R be commutative, I, J ⊴ R. Prove that (I + J)(I ∩ J) ⊆ IJ.
(viii) Let R be commutative, I, J ⊴ R comaximal. Prove that I ∩ J = IJ.
(ix) Let R be commutative, I_k, k = 1, . . . , n pairwise comaximal ideals of
R. Using induction, prove that I_n and ∩_{k<n} I_k are comaximal.
(x) Let R be commutative, I_k, k = 1, . . . , n pairwise comaximal ideals of R.
Prove that the natural map ψ : R/∩_k I_k → ∏_k R/I_k is an isomorphism.
(xi) Let I, J ⊴ R be comaximal, R commutative. Prove that I^n and J^m
are comaximal for all n and m.
12.6 Vista: prime factorisation
Read more about prime numbers and their factorisation in Lauritzen.
The now defunct RSA challenge [14] used to pay you money for factoring one
of the numbers of the form pq. This is all related to the P=NP problem, one
of the millennium problems with a one million dollar bounty [15]. You may also
want to find out more about quantum computers. Imagine the notoriety you
can achieve if you find a way of factoring large numbers: you may become
the top criminal of the century by compromising all secure internet traffic
with payment information.
Proposition 13.1 Let · be an action of the group G on the set X. For g ∈
G, define the map φ(g) : X → X by φ(g)(x) = g · x. Then φ(g) ∈ Sym(X),
and φ : G → Sym(X) is a homomorphism.
Proof: Property (ii) in the definition says that φ(1_G) is the identity map
I_X : X → X, and then Property (i) implies that φ(g)φ(g^{−1}) = φ(gg^{−1}) =
I_X, and similarly φ(g^{−1})φ(g) = I_X. So φ(g) and φ(g^{−1}) are inverse maps,
which proves that φ(g) : X → X is a bijection. Hence φ(g) ∈ Sym(X), and
then Property (i) implies immediately that φ is a homomorphism. □
The opposite is also true: a homomorphism φ : G → Sym(X) defines an
action g · x = φ(g)(x). In fact, it gives a bijection between the set of actions
G × X → X and the set of homomorphisms φ : G → Sym(X) (see exercises
and Example 1).
The kernel of an action · of G on X is defined to be the kernel K = ker(φ)
of the homomorphism φ : G → Sym(X) defined in Proposition 13.1. So
and φ(b) = (d_1, d_2). This action is not faithful, and its kernel is the normal
subgroup {1, a^3} of G that we have already studied. The image is isomorphic
to D_6.
3. There is a faithful action called the regular action, defined for any
group G. The regular G-set X is G itself, considered as a set. The action g · x
is defined to be gx for all g ∈ G, x ∈ X. Conditions (i) and (ii) of the
definition obviously hold, so we have defined an action. If g is in the kernel
K of the action, then gx = x for all x ∈ X, which implies g = 1 by the
cancellation law, so the action is faithful. From Proposition 13.2, we can
deduce Cayley’s Theorem (Theorem 10.5).
4. Similarly to the regular G-set, there exists an antiregular G-set for any
group G. The antiregular G-set is X = G, as a set. The action is defined via
g · x = xg^{−1} for all g ∈ G, x ∈ X.
13.2 Stabilisers
Definition. Let X be a G-set, x ∈ X. The stabiliser of x in G, denoted by
G_x or Stab_G(x), is {g ∈ G | g · x = x}.
The proof of the following proposition is left as an exercise.
Let us consider the action of the symmetric group S_n on the set R =
Z[X_1, . . . , X_n] defined by the obvious formula σ · F(X_1, . . . , X_n) = F(X_{σ(1)}, . . . , X_{σ(n)}).
For example, if σ = (1, 2, 3) and F = 1 + X_1 + X_2^3 + X_3 X_4 then σ · X_1 = X_2,
σ · X_2 = X_3, σ · X_3 = X_1, σ · X_4 = X_4 and σ · F = F(X_2, X_3, X_1, X_4) =
1 + X_2 + X_3^3 + X_1 X_4. Let us point out that the action of S_n on the set R is
actually an action on the ring R.
Definition. The alternating group A_n is the stabiliser of the Vandermonde
polynomial Ω = ∏_{i>j} (X_i − X_j) ∈ Z[X_1, . . . , X_n].
We need the following calculation.
Proof: Observe that Ω ≠ −Ω since 2Ω ≠ 0. If σ = τ_n · · · τ_1 then σ · Ω =
(−1)^n Ω by Proposition 13.4. Hence, σ is even (odd) if and only if σ · Ω = Ω
(σ · Ω = −Ω correspondingly). □
The next two corollaries are immediate.
Corollary 13.6 The sign function sign : S_n → Z^× defined by sign(σ) = (σ · Ω)/Ω
is a well-defined group homomorphism.
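The sign function can be computed directly from this definition. The Python sketch below is our own illustration, not part of the notes: it evaluates Ω = ∏_{i>j} (X_i − X_j) and σ · Ω at the sample point X_i = i (any pairwise distinct values would do) and takes the quotient. Permutations are stored 0-based as tuples, which is an assumption of the sketch.

```python
from itertools import permutations


def sign(sigma):
    """sign(sigma) = (sigma · Ω)/Ω, with sigma[i] the image of i (0-based)."""
    n = len(sigma)
    x = list(range(1, n + 1))       # sample values for X_1, ..., X_n
    omega = 1
    sigma_omega = 1
    for i in range(n):
        for j in range(i):          # all pairs with i > j
            omega *= x[i] - x[j]
            sigma_omega *= x[sigma[i]] - x[sigma[j]]
    return sigma_omega // omega     # exact: the two products agree up to sign


def compose(s, t):
    """(s∘t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))


S4 = list(permutations(range(4)))
print(sum(1 for s in S4 if sign(s) == 1))  # 12, the order of A_4
```

The homomorphism property sign(στ) = sign(σ) sign(τ) can then be checked over all pairs in S_4 with compose.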
on the ring S: (a_ij) · X_k = ∑_i a_ik X_i extends to a unique ring automorphism.
Similarly, S_n acts via σ · X_j = X_{σ(j)}. Now, the key is the top degree
element Ω = X_1 X_2 . . . X_n. It gives both the sign and the determinant by
(a_ij) · Ω = det(a_ij)Ω and σ · Ω = sign(σ)Ω.
The Grassmann ring is not commutative in general but always supercom-
mutative. Find out more about superalgebras, supergroups and supervector
spaces and how they are used in Physics.
14 Orbits
We introduce the quotient G-set G/H and prove the orbit-stabiliser theorem.
We look at the Riemann sphere and the Hopf fibration.
14.1 Homomorphisms of G-sets
Let X and Y be G-sets. A function φ : X → Y is a homomorphism of
G-sets if φ(g · x) = g · φ(x) for all g ∈ G, x ∈ X. A bijective homomorphism
will be referred to as an isomorphism. The G-sets X and Y are called
isomorphic if there exists an isomorphism between them.
Examples. 1. For any G-set X and any x ∈ X, the orbit map β_x : G → X
defined by β_x(g) = g · x is a homomorphism from the regular G-set to X.
Indeed, β_x(h · g) = β_x(hg) = hg · x = h · (g · x) = h · β_x(g).
2. Let X = G be the antiregular G-set. The orbit map β_1 is the inverse
map: β_1(g) = g · 1 = 1g^{−1} = g^{−1}. Thus, regular and antiregular G-sets are
isomorphic.
14.2 Orbits
Let · be an action of G on X. We define a relation ∼ on X by x ∼ y
if and only if there exists a g ∈ G with y = g · x. Then ∼ is an equivalence
relation – the proof is left as an exercise.
Definition. The equivalence classes of ∼ are called the orbits of G on X.
In particular, the orbit of a specific element x ∈ X, which is denoted by
G · x or by Orb_G(x), is
{ y ∈ X | ∃g ∈ G with g · x = y }.
Observe that the orbit Orb_G(x) is the image of the orbit map β_x.
As with general equivalence relations, it is instructive to consider
the quotient set X/∼, i.e. the set of orbits. As the equivalence relation is
carried out by G, we denote this quotient set X/G. If there is a single orbit,
the action (as well as the G-set) is called transitive.
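For a finite group given by generators acting on a finite set, the orbits can be found by a simple closure procedure. The following Python sketch is our own illustration; the function name orbits and the representation of each generator as a dictionary point → point are assumptions of the sketch. It relies on the fact that in a finite group every element is a product of generators.

```python
def orbits(generators, points):
    """Orbits on `points` of the finite group generated by `generators`.

    Each generator is a dict sending a point to its image.
    """
    remaining = set(points)
    result = []
    while remaining:
        start = remaining.pop()
        orbit = {start}
        frontier = [start]
        while frontier:                 # close the orbit under all generators
            x = frontier.pop()
            for g in generators:
                y = g[x]
                if y not in orbit:
                    orbit.add(y)
                    frontier.append(y)
        remaining -= orbit
        result.append(orbit)
    return result


# Rotations of the vertices 0, ..., 5 of a hexagon.
rot1 = {i: (i + 1) % 6 for i in range(6)}   # rotation by one step
rot2 = {i: (i + 2) % 6 for i in range(6)}   # rotation by two steps
print(len(orbits([rot1], range(6))))  # 1, the action is transitive
print(len(orbits([rot2], range(6))))  # 2, the action is not transitive
```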
Example. 3. Let X = G be the antiregular G-set, H a subgroup of G. We
restrict the antiregular action to H: G is an H-set under
h · g = gh^{−1}. The orbits of this action are left cosets [g]_H = gH. The
quotient set G/H is the set of all left cosets. In particular, if H is normal in
G, the quotient set admits a group structure which we called the quotient
group. If H is not necessarily normal, the quotient set G/H still carries
an action of G (or a G-set structure): g · [a]_H = [ga]_H. The axioms of the
action are apparent but one needs to verify that this action is well defined:
[a]_H = [b]_H ⟹ ∃h ∈ H a = bh ⟹ ga = gbh ⟹ g · [a]_H = [ga]_H = [gb]_H =
g · [b]_H. See also exercise (iii).
4. Smith’s normal form in Linear Algebra deals with the action of G =
GL_n(K) × GL_m(K) on the set X = K^{n×m} of n × m matrices: (g, h) · x =
gxh^{−1}. Observe that x ∈ Orb_G(y) if and only if x and y are equivalent, if
and only if they can be moved one to another by a sequence of elementary
row and column transformations, if and only if x and y have the same rank.
Recall that an elementary transformation is just a multiplication by an
elementary matrix, and that the elementary matrices together generate the
general linear group. Thus,
|X/G| = min(n, m) + 1 while the Smith normal form of x is a particularly
nice element in Orb_G(x).
5. The Jordan normal form story is slightly more complicated: G = GL_n(C)
acts on X = M_n(C) via g · x = gxg^{−1}. Similarly, x ∈ Orb_G(y) if and only if
x and y are similar if and only if they have the same Jordan form. The set
X/G is infinite but its precise structure can be pinpointed.
6. Classification of quadratic forms (over R) involves two groups G =
GL_n(R) ≥ H = O_n(R) acting on the same set X of real symmetric n × n
matrices: g · x = gxg^T. A G-orbit is determined by the rank and the
signature of the form and admits a diagonal representative with 0, ±1 on the
diagonal. Since there are r + 1 distinct forms of rank r, |X/G| = ∑_{r=0}^{n} (r + 1) =
(n + 1)(n + 2)/2. Each G-orbit is a union of H-orbits. The latter are
determined by the eigenvalues and also admit a diagonal representative
with the eigenvalues on the diagonal.
14.3 Orbit-Stabiliser Theorem
The next theorem is fundamental in group theory: it is an analogue of
the isomorphism theorem for G-sets.
Proof: We consider a function ψ : G/G_x → Orb_G(x) defined by ψ([g]) =
β_x(g) = g · x. Let us observe that this is well defined: [g] = [h] ⟹ ∃a ∈
G_x ga = h ⟹ ψ([h]) = h · x = ga · x = g · (a · x) = g · x = ψ([g]).
It is a homomorphism of G-sets: ψ(g · [h]) = ψ([gh]) = gh · x = g · (h · x) =
g · ψ([h]).
For any y ∈ Orb_G(x) there exists a g ∈ G with g · x = y. Hence
ψ([g]) = y. For an element g′ ∈ G, we have
g′ · x = y ⟺ g′ · x = g · x ⟺ g^{−1}g′ · x = x ⟺ g^{−1}g′ ∈ G_x ⟺ g′ ∈ gG_x.
So the elements g′ with ψ([g′]) = y are precisely the elements of the coset
gG_x. Hence, ψ is a bijection. □
Examples. 7. In Example 2 of Section 13.1, D_12 acts transitively on sets
P, E and D of vertices, edges and diagonals of the regular hexagon. Let
us see what the orbit-stabiliser theorem does for the vertices. Let us pick a
vertex v_0. Its stabiliser is H = {1, S_0 = b}. The quotient set is D_12/H =
{[1]_H, [a]_H, [a^2]_H, [a^3]_H, [a^4]_H, [a^5]_H} with [a^k]_H = {a^k, a^k b}. Now the orbit-
stabiliser theorem gives a bijection ψ : D_12/H → P given explicitly by
ψ([a^k]_H) = a^k · v_0 = v_k.
If one chooses another point the picture changes slightly. For instance,
the stabiliser of v_2 is A = a^2 H a^{−2} = {1, a^4 b}. The quotient set is D_12/A =
{[1]_A, [a]_A, [a^2]_A, [a^3]_A, [a^4]_A, [a^5]_A} with [a^k]_A = {a^k, a^{k+4} b}. The orbit-
stabiliser theorem bijection ψ′ : D_12/A → P changes to ψ′([a^k]_A) = a^k · v_2 =
v_{2+k}.
8. Let F be a field. The projective n-space X = PF^n consists of lines Fa in
F^{n+1}. The group G = GL_{n+1}(F) acts on X transitively by A · Fa = F(Aa).
The action has a kernel Z ⊴ G that consists of scalar matrices. The quotient
group PGL_{n+1}(F) = G/Z, called the projective linear group, acts faithfully and
transitively. The stabiliser of Fe_1 in G is the group T_{n+1}(F) of block triangular
matrices whose first column is a multiple of e_1 (these are the upper triangular
matrices when n = 1). Thus, we have a bijection between G/T_{n+1}(F) and X.
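The bijection of Example 7 can be checked by machine. In the Python sketch below (our own illustration, not part of the notes) the vertices v_0, ..., v_5 of the hexagon are labelled 0, ..., 5. We assume, as in Example 7, that a is the rotation v_i ↦ v_{i+1} and that b is the reflection fixing v_0; the helper names compose and generate are hypothetical.

```python
def compose(g, h):
    """(g∘h)(i) = g(h(i)) for permutations stored as tuples."""
    return tuple(g[h[i]] for i in range(len(h)))


def generate(gens):
    """All elements of the finite group generated by `gens`."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


a = tuple((i + 1) % 6 for i in range(6))    # rotation v_i -> v_{i+1}
b = tuple((-i) % 6 for i in range(6))       # reflection fixing v_0
G = generate([a, b])                        # the dihedral group D_12

x = 0                                       # the vertex v_0
stabiliser = {g for g in G if g[x] == x}
orbit = {g[x] for g in G}
cosets = {frozenset(compose(g, h) for h in stabiliser) for g in G}
# psi sends the coset gG_x to g·x; every coset has exactly one image.
psi = {coset: {g[x] for g in coset} for coset in cosets}
print(len(G), len(stabiliser), len(cosets), len(orbit))  # 12 2 6 6
```

Each value of psi is a one-element set, which is the statement that ψ is well defined, and distinct cosets have distinct images, which is the injectivity of ψ.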
14.4 Exercises
(i) Consider the identity map I : G → G, I(g) = g. Show that I is
a homomorphism of G-sets from the regular G-set to the antiregular
G-set if and only if G is a group of exponent 2.
(ii) Prove that ∼, defined in Section 14.2, is an equivalence relation.
(iii) Let X be a G × H-set. Prove that X/H is a G-set with the action
g · [x] = [g · x] for all g ∈ G, [x] ∈ X/H. How does it apply to the
construction of the quotient set G/H?
(iv) Prove that the stabiliser of [x]_H ∈ G/H is xHx^{−1}.
(v) Let G be a simple group (i.e. it has exactly two normal subgroups), H
its proper subgroup of finite index. Prove that G is finite (hint: what
is the kernel of the action on G/H?).
(vi) Derive a precise estimate in (v): if |G : H| = n then |G| ≤ n!/2.
(vii) Prove that the quotient G-sets G/A and G/B are isomorphic if and
only if there exists x ∈ G such that xAx^{−1} = B.
(viii) Considering the action on PF_3^1, prove that PGL_2(F_3) is isomorphic to
S_4.
14.5 Vista: Riemann Sphere and Mobius Group
In 1872 Klein proposed the Erlangen Program, where he suggested classifying
geometries according to their groups of symmetries. Let us take a look at how
it works on the Riemann sphere.
The Riemann sphere is the projective 1-space over complex numbers
S = PC^1. Geometrically, it is a 2-sphere. It is customary to represent
2
The action of GL_2(C) is not faithful: the kernel consists of scalar matrices
    ( α  0 )
    ( 0  α ).
The group PGL_2(C) acting faithfully is called the Mobius group. The
transformations z ↦ A · z for A ∈ GL_2(C) are called Mobius transformations.
Mobius transformations are conformal, i.e. they preserve angles between
curves. In fact, PGL_2(C) is the group of all conformal transformations of
PC^1 and PGL_2(C) underlies the conformal geometry of PC^1.
Recall that a matrix A = (a_ij) is unitary if and only if A^{−1} = A^∗ where
A^∗ = (a^∗_ji) is the conjugate transpose matrix. Unitary matrices form the unitary group
U_n(C), while unitary matrices with determinant 1 form the special unitary
group SU_n(C). The group SU_2(C) consists of matrices
    ( α  −β^∗ )
    ( β   α^∗ )
with |α|^2 + |β|^2 = 1. Hence geometrically, SU_2(C) is the unit 3-sphere S^3.
The Mobius transformations from SU_2(C) preserve spherical distances.
This corresponds to the metric geometry of PC^1 in Klein’s program. The
orbit map β_x : SU_2(C) → PC^1 is a Hopf fibration β_x : S^3 → S^2 with all
fibres β_x^{−1}(y) being unit circles.
15 Fixed points
We introduce fixed points. We prove three counting formulae and discuss
their applications to combinatorics.
15.1 Fixed points
Definition. Let T ⊆ G be a subset of G. The fixed points (or the fixed
point set) is defined as X^T = {x ∈ X | ∀g ∈ T g · x = x}. In particular, we
are interested in X^g = X^{{g}} for g ∈ G and X^G.
Notice that in the above example X^g = ∅ unless g = 1, in which case
X^1 = X. Such actions are called fixed point free or simply free.
15.2 Formulae
We would like to establish three useful formulae underlying the combinatorics
of the group action. The first is an immediate consequence of
Theorem 14.1.
where the sum is taken over the representatives of all orbits containing more
than 1 element.
Proof: X is a disjoint union of orbits. One element orbits form X^G. The
number of elements in the larger orbits is ∑_x |Orb_G(x)| = ∑_x |G|/|Stab_G(x)|
using Proposition 15.1. □
On the other hand,
|A| = ∑_{x∈X} |Stab_G(x)| = ∑_{x∈X} |G|/|Orb_G(x)| = |G| ∑_{orbits} ∑_{x ∈ an orbit} 1/|Orb_G(x)| =
= |G| ∑_{orbits} 1 = |G||X/G|. □
15.3 Necklaces and bracelets
The formulae allow us to count necklaces and bracelets. We want to
make a necklace of k beads. We have n different types of beads with at least
k beads of each type. How many necklaces can we make?
Mathematically, we consider the set P_k of vertices of the regular k-gon. The
dihedral group D_{2k} acts on P_k. Now we consider the set X = X_{k,n} of all
untied necklaces, i.e. functions F : P_k → Z_n. The dihedral group D_{2k} and
its rotation subgroup C_k act on X_{k,n}: (g · F)(v) = F(g^{−1} · v) where g ∈ D_{2k},
F ∈ X, v ∈ P_k. The cardinality of X_{k,n} is n^k.
Definition. A necklace is a C_k-orbit in X_{k,n}. A bracelet is a D_{2k}-orbit in
X_{k,n}.
Let us count bracelets and necklaces for k = 5. With necklaces we count
the number of C_5-orbits. Observe that 1 fixes every function f ∈ X while a
non-trivial rotation fixes only the constant functions. Hence, by Burnside’s
formula |X/C_5| = (n^5 + n + n + n + n)/5 = (n^5 + 4n)/5.
With bracelets we count the number of D_10-orbits. A reflection R fixes
functions that are constant on the orbits of the reflection, i.e. f(Rv) = f(v)
for each vertex v. A reflection has 3 orbits, so there are n^3 such functions.
Hence, by Burnside’s formula |X/D_10| = (n^5 + 4n + 5n^3)/10.
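The two counts for k = 5 can be confirmed by brute force. The Python sketch below is our own check, not part of the notes; the function name count_orbits is an assumption. An untied necklace is stored as a tuple of k colours, rotations act by cyclic shifts, and the reflections are obtained by reversing the shifted tuples.

```python
from itertools import product


def count_orbits(k, n, with_reflections):
    """Number of C_k-orbits (or D_2k-orbits) on colourings of k beads."""
    seen = set()
    orbits = 0
    for f in product(range(n), repeat=k):
        if f in seen:
            continue
        orbits += 1                          # f starts a new orbit
        for shift in range(k):
            rotated = f[shift:] + f[:shift]
            seen.add(rotated)
            if with_reflections:
                seen.add(rotated[::-1])      # a rotation followed by a flip
    return orbits


for n in (1, 2, 3):
    necklaces = count_orbits(5, n, with_reflections=False)
    bracelets = count_orbits(5, n, with_reflections=True)
    # compare with (n^5 + 4n)/5 and (n^5 + 4n + 5n^3)/10
    print(n, necklaces, bracelets)
```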
N(k, n) = (1/k) ∑_{t|k} ϕ(t) n^{k/t}
and when g acts on P_k it has d orbits each of size k/d. Thus, |X_{k,n}^g| = n^d.
Burnside’s formula gives
N(k, n) = (1/k) ∑_{g∈G} n^{k/|g|}
Since g = (a^d)^{m′}, it follows that <g> ⊆ <a^d>. Since |<g>| = k/d =
|<a^d>|, it follows that <g> = <a^d> and there is a single subgroup
<a^d> of order t = k/d. It has ϕ(t) elements of order t. Hence, C_k contains
ϕ(t) elements of order t for each t|k and the formula for N(k, n) follows.
If k = 2t − 1 is odd then each reflection fixes one vertex and has further
t − 1 orbits of two vertices each. Hence, |X_{k,n}^g| = n^t for each of k reflections.
The formula for B(2t − 1, n) follows.
If k = 2t is even then a reflection through a vertex fixes two vertices
and has further t − 1 orbits of two vertices each. Hence, |X_{k,n}^g| = n^{t+1} for
t reflections. A reflection through the middle of an edge has t orbits of
two vertices each. Hence, |X_{k,n}^g| = n^t for the remaining t reflections. The
formula for B(2t, n) follows. □
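The necklace formula N(k, n) = (1/k) ∑_{t|k} ϕ(t) n^{k/t} can be compared with the direct Burnside sum over the k rotations. The Python sketch below is our own illustration; the function names are assumptions, and ϕ is computed by a naive count. It uses the fact from the proof that the rotation by m steps has gcd(m, k) orbits on P_k.

```python
from math import gcd


def phi(t):
    """Euler's function by direct count."""
    return sum(1 for i in range(1, t + 1) if gcd(i, t) == 1)


def necklaces(k, n):
    """N(k, n) = (1/k) * sum over t | k of phi(t) * n^(k/t)."""
    total = sum(phi(t) * n ** (k // t) for t in range(1, k + 1) if k % t == 0)
    return total // k


def necklaces_burnside(k, n):
    """The same count as a Burnside sum over the rotations a^0, ..., a^(k-1)."""
    return sum(n ** gcd(m, k) for m in range(k)) // k


print(necklaces(6, 2), necklaces_burnside(6, 2))  # 14 14
```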
If the supply of a certain type of beads is limited or any other restriction
is imposed, one can still do the count but one has to deal with a subset of
X_{k,n}. Some of these situations are covered in exercises and the vista section.
15.4 Exercises
(i) Check that the alternative formula (g · F)(v) = F(g · v) with g ∈ D_{2k},
F ∈ X, v ∈ P_k does not define an action.
(ii) Let a group G act on a ring R. Prove that the set of fixed points R^G
is a subring.
(iii) Write an explicit formula derived from Proposition 15.4 for the numbers
of bracelets and necklaces if k = p^2 (p is prime), n is arbitrary.
(iv) Count the number of necklaces and bracelets one can make from 4
identical white and 3 identical black beads. (All the beads are used.)
(v) Count the number of necklaces and bracelets one can make from 4
identical white and 4 identical black beads. (All the beads are used.)
15.5 Vista: aperiodic necklaces
You must have seen the exponential identity
1 + αz + α^2 z^2/2! + α^3 z^3/3! + . . . = lim_{n→∞} (1 + αz/n)^n
where α ∈ R, z is a variable. Have you seen its close relative, the cyclotomic
identity
1 + αz + α^2 z^2 + α^3 z^3 + . . . = ∏_{k=1}^{∞} (1/(1 − z^k))^{A(k,α)}
where A(k, α) = (1/k) ∑_{t|k} µ(t) α^{k/t} and µ is the Mobius function?
We say that a necklace in X_{k,n} is aperiodic if its stabiliser in C_k is
trivial. The function A(k, n) counts the number of aperiodic necklaces.
This could be fused together into a nice second year essay as there are several
different proofs (http://en.wikipedia.org/wiki/Cyclotomic_identity and
http://www.stat.wisc.edu/~callan/notes/cyclotomic/cyclo.pdf).
16 Conjugacy classes
We use the G-action on G to study group theory.
16.1 Definition
In Examples 3 and 4 of Subsection 13.1, a group G was made to act on
the set of its own elements by the regular g · x = gx and antiregular
g · x = xg^{−1} actions for g, x ∈ G.
There is a third important action of G on X = G, which is defined by
g · x = gxg^{−1} for g, x ∈ G.
It is easy to check that conditions (i) and (ii) of the definition hold, so this
does indeed define an action. This action is called conjugation. The orbits
of the action are called the conjugacy classes of G, and elements in the same
conjugacy class are said to be conjugate in G. So g, h ∈ G are conjugate if
and only if there exists f ∈ G with h = fgf^{−1}. We will write Cl_G(g) for the
orbit of g; that is the conjugacy class containing g. We have seen already in
Proposition 7.4 that conjugate elements have the same order.
What is Stab_G(g) for this action? By definition it consists of the elements
f ∈ G for which f · g = g; that is, fgf^{−1} = g, or equivalently fg = gf.
In other words, it consists of those f that commute with g. It is called the
centraliser of g in G and is written as C_G(g). Notice that the fixed point set
of g also consists of all f such that gfg^{−1} = f, i.e. those f that commute with g. Hence
G^g = C_G(g).
Applying the formulae 15.1, 15.2, 15.3 from the last lecture we get:
(i) |Cl_G(g)| = |G|/|C_G(g)|
(ii) |G| = |Z(G)| + ∑_x |G|/|C_G(x)|
(iii) |G/G| = (1/|G|) ∑_{g∈G} |C_G(g)|
The kernel K of the action consists of those f ∈ G that fix and hence
commute with all g ∈ G. This is called the centre of G and is denoted by
Z(G). So we have
Z(G) = {f ∈ G | f g = gf ∀g ∈ G}.
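The three counting formulae for conjugation, namely |Cl_G(g)| = |G|/|C_G(g)|, |G| = |Z(G)| + ∑_x |G|/|C_G(x)| and |G/G| = (1/|G|) ∑_g |C_G(g)|, can be verified on a small example. The Python sketch below is our own illustration for G = S_4, with permutations stored as tuples; the helper names are assumptions.

```python
from itertools import permutations


def compose(g, h):
    """(g∘h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(len(h)))


def inverse(g):
    inv = [0] * len(g)
    for i, image in enumerate(g):
        inv[image] = i
    return tuple(inv)


G = list(permutations(range(4)))                      # the group S_4


def conj_class(g):
    """Cl_G(g), the orbit of g under conjugation."""
    return frozenset(compose(compose(f, g), inverse(f)) for f in G)


def centraliser(g):
    """C_G(g), the stabiliser of g under conjugation."""
    return [f for f in G if compose(f, g) == compose(g, f)]


classes = {conj_class(g) for g in G}
centre = [f for f in G if len(centraliser(f)) == len(G)]
print(sorted(len(c) for c in classes))                # [1, 3, 6, 6, 8]
print(sum(len(centraliser(g)) for g in G) // len(G))  # 5, the number of classes
```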
a^l b for 0 ≤ l < n. (For example, ab = a^{2l} b with l = (n + 1)/2.) So the set
{a^k b | 0 ≤ k < n} forms a single conjugacy class. Geometrically, this is not
surprising, because these n elements are all reflections about lines that pass through one
vertex and the centre of the polygon P of which G is the group of isometries.
Now suppose that n is even. Then, when k = n/2, we have a^k = a^{−k},
and so {a^{n/2}} is a conjugacy class of size 1 (and hence a^{n/2} ∈ Z(G)). We
also have the classes {a^k, a^{n−k}} of size 2 for 1 ≤ k ≤ (n − 2)/2. In this case,
the reflections a^k b split up into two conjugacy classes of size n/2, namely
{a^{2k} b | 0 ≤ k < n/2} and {a^{2k+1} b | 0 ≤ k < n/2}. Geometrically these
correspond to the two different types of reflections: those about lines that
pass through two vertices of the regular n-gon P and those about lines that
bisect two edges of P.
16.3 Classification of groups up to order 11
As an application we extend our classification of groups to the order 11.
The only outstanding order is 9.
Proposition 16.3 Let p be a prime number. There are two groups of order
p^2 up to an isomorphism: C_p × C_p and C_{p^2}.
Proof: These two groups are non-isomorphic by Lemma 3.3: C_{p^2} has an
element of order p^2 but C_p × C_p hasn’t.
Let us start by proving that G of order p^2 is abelian. By Proposition 16.2,
Z(G) ≠ 1. By Lagrange’s Theorem, |Z(G)| is either p or p^2. In the latter
case G is abelian. Suppose the former case. Pick x ∈ G \ Z(G) and consider
C_G(x). It clearly contains Z(G) and x. Thus |C_G(x)| is bigger than p, hence
it is p^2. Thus C_G(x) = G. It is a contradiction as we conclude x ∈ Z(G).
If G admits an element a of order p^2, G is a cyclic group.
If G has no such element, all non-identity elements have order p. As in
the homework problem, G admits a vector space structure over the field Z_p.
Choosing a basis forces an isomorphism G ≅ C_p × C_p. □
16.4 Exercises
(i) Show that the centre of a group is an abelian subgroup.
(ii) Find the centre of D_{2n}.
(iii) Find the centraliser of each element of D_{2n}.
(iv) Let p be an odd prime. In the series of the next exercises, we
describe conjugacy classes in GL_2(Z_p). Prove that |GL_2(Z_p)| = (p^2 −
1)(p^2 − p).
(v) Thanks to Jordan normal form, we know that if a matrix A ∈
GL_2(Z_p) has an eigenvalue then it is conjugate to one of the matrices
    ( α  0 )    ( α  0 )    ( α  1 )
    ( 0  α ) ,  ( 0  β ) ,  ( 0  α )
with α ≠ β ∈ Z_p. Compute the centraliser of each of these matrices.
Using these computations, list corresponding conjugacy classes
and their sizes. Verify that these classes contain (p − 1)^2 p(p + 2)/2
elements overall.
(vi) Consider a matrix A ∈ GL_2(Z_p) without eigenvalues in Z_p with
characteristic polynomial z^2 − αz + β. Verify that for a sufficiently
general v ∈ Z_p^2, A looks like
        ( 0  −β )
    B = ( 1   α ) ,
in the basis v, Av. Compute the centraliser of B and the size of the
conjugacy class of A.
(vii) How many conjugacy classes of matrices without eigenvalues are
there? [16]
16.5 Vista: Sylow’s Theorems
The Norwegian mathematician Sylow proved a number of theorems about
subgroups of prime power order in finite groups in 1872. The first Sylow’s theorem
tells us that if a prime power p^n divides |G| then G has a subgroup of order
p^n.
Of particular interest are the maximal prime powers dividing |G|. If
|G| = p^n m with p and m coprime, the subgroups of order p^n are called Sylow’s
p-subgroups. The second Sylow’s theorem states that any two Sylow’s
p-subgroups are conjugate.
This tells us that the set of Sylow’s p-subgroups is a G-set under conjugation
with a single orbit. The stabiliser of a point is the normaliser of the
Sylow’s subgroup: it contains the subgroup itself. In particular, the number s_p
of Sylow’s p-subgroups divides m. The third Sylow’s theorem tells us that s_p
is also 1 modulo p.
[16] It is equal to the number of quadratic equations z^2 − αz + β = 0 without solutions in
Z_p. Every such equation has two distinct solutions in the finite field F_{p^2} of p^2 elements.
This field is unique, so the number of such equations is (|F_{p^2}| − |Z_p|)/2 = (p^2 − p)/2.
These theorems are very powerful for dealing with finite groups. For
instance, let G be a group of order 15. In such a group s_3 should divide 5
and be 1 modulo 3. Hence s_3 = 1 and G contains a unique normal subgroup
of order 3. Similarly, s_5 should divide 3 and be 1 modulo 5. Hence s_5 = 1
and G contains a unique normal subgroup of order 5. It follows that G ≅
C_3 × C_5 ≅ C_15.
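The arithmetic constraints on s_p are easy to enumerate. The Python sketch below is our own illustration; the function name sylow_counts is an assumption. It lists the numbers that divide m and are 1 modulo p, where |G| = p^n m with p and m coprime. These are only candidates: the list says nothing about which of them are realised by an actual group.

```python
def sylow_counts(order, p):
    """Candidates for s_p: divisors of m that are 1 modulo p."""
    m = order
    while m % p == 0:        # strip the maximal power of p from the order
        m //= p
    return [s for s in range(1, m + 1) if m % s == 0 and s % p == 1]


print(sylow_counts(15, 3), sylow_counts(15, 5))  # [1] [1]
print(sylow_counts(12, 2), sylow_counts(12, 3))  # [1, 3] [1, 4]
```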
(1, 15)(2, 4, 6, 8, 7)(5, 9)(3, 11, 12, 13, 10)(14, 15, 16)
For example, S_3 has three conjugacy classes, corresponding to cycle-types
1, 2^1, 3^1, and S_4 has five conjugacy classes, corresponding to cycle-types 1,
2^1, 2^2, 3^1, 4^1.
17.2 Conjugacy classes in subgroups
If H is a subgroup of G, the conjugacy classes of H are obviously subsets
of conjugacy classes of G. We would like to make two observations on their
interaction.
are the cyclic groups of prime order. There are also infinitely many finite
nonabelian simple groups. These were eventually completely classified into
a number of infinite families, together with 26 examples known as sporadic
groups, that do not belong to an infinite family. The work on this proof went
on for decades, the completion was announced in 1981 but a complete proof
is yet to appear.
One of the infinite families of finite nonabelian simple groups consists of
the alternating groups A_n for n ≥ 5. The aim of this section will be to prove
that A_5 is simple.
The conjugacy classes of A_n can be described using Proposition 17.3. We
need it for A_5. The classes of S_5 correspond to cycle-types 1, 2^1, 2^2, 3^1, 2^1 3^1, 4^1, 5^1,
and of these, the permutations of cycle-types 1, 2^2, 3^1 and 5^1 are even
permutations and hence lie in A_5.
There is 1 permutation of cycle-type 1, 15 of type 2^2, 20 of type 3^1, and
24 of type 5^1, making 60 elements in total.
The problem is that these are classes in S_5, and two permutations could
conceivably be conjugate in S_5 but not in A_5, in which case the corresponding
class would split up into more than one conjugacy class in A_5.
In fact, the 15 permutations of cycle-type 2^2 form a single class in A_5.
Using Proposition 17.3, g = (x_1, x_2)(x_3, x_4)(x_5) commutes with h = (x_1, x_2).
Similarly, the 20 permutations of cycle-type 3^1 are all conjugate in A_5,
because g = (x_1, x_2, x_3)(x_4)(x_5) commutes with h = (x_4, x_5).
However, for the cycle-type 5^1, g = (1, 2, 3, 4, 5) does not commute
with any odd permutation. The size of its conjugacy class is 4! = 24, so the size
of its centraliser is 120/24 = 5. It already commutes with 1, g, g^2, g^3, g^4, so
it cannot commute with anything else.
Alternatively, you can argue that 24 does not divide |A_5| = 60, so the
S_5-conjugacy class must split into two A_5-conjugacy classes.
Summing up, we have:
Lemma 17.5 A_5 has 5 conjugacy classes, of sizes 1, 15, 20, 12, 12.
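The class sizes in Lemma 17.5 can be confirmed by brute force. The Python sketch below is our own check, not part of the notes: it lists the 60 even permutations of {0, ..., 4} as tuples and conjugates each of them by all elements of A_5. The helper names are assumptions.

```python
from itertools import permutations


def compose(g, h):
    """(g∘h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(len(h)))


def inverse(g):
    inv = [0] * len(g)
    for i, image in enumerate(g):
        inv[image] = i
    return tuple(inv)


def is_even(g):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(1 for i in range(len(g)) for j in range(i) if g[j] > g[i])
    return inversions % 2 == 0


A5 = [g for g in permutations(range(5)) if is_even(g)]

classes = []
seen = set()
for g in A5:
    if g not in seen:
        cls = {compose(compose(f, g), inverse(f)) for f in A5}
        seen |= cls
        classes.append(cls)

print(sorted(len(c) for c in classes))  # [1, 12, 12, 15, 20]
```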
17.4 Exercises
(i) Verify that Proposition 17.3 holds for C_n inside D_{2n}. Find precisely
which conjugacy classes in D_{2n} split into two and which ones do not.
(ii) Show that if a permutation f ∈ A_n contains an independent cycle
of even length then f commutes with an odd permutation.
(iii) Show that if a permutation f ∈ A_n contains two independent
cycles of the same length then f commutes with an odd permutation.
(iv) Show that if a permutation f ∈ A_n consists of independent cycles of
pairwise distinct odd lengths then f does not commute with any odd
permutation.
(v) Count the number of conjugacy classes in S_n and A_n for 1 ≤ n ≤ 9.
Compute the sizes of the conjugacy classes.
17.5 Vista: groups of order 12
This would be too little for a second year essay but you may consider
classifying groups up to order 30 for an essay. As far as order 12 is concerned,
you already know 4 of the five groups: C_12, C_6 × C_2, D_12 ≅ D_6 × C_2, A_4.
For the fifth group, consider embeddings D_6 ≤ O_2(R) ≤ SO_3(R) and the
surjection ψ : SU_2(C) → SO_3(R) from Vista Section 14.5. The last group
BD_12, called the binary dihedral group, is ψ^{−1}(D_6). It is different from the
other four groups because it has a single element of order 2: the matrix
    ( −1   0 )
    (  0  −1 )
is the only element of order 2 in SU_2(C).
One can sort them out using Sylow’s theorem. By Sylow’s theorem, s_2
is either 1 or 3, while s_3 is either 1 or 4. Now Sylow’s 2-subgroup could be
either C_4 or K_4. You can do the classification by considering the following
cases:
(s_2 = 1, s_3 = 1, C_4) the group is C_3 × C_4 ≅ C_12,
(s_2 = 1, s_3 = 1, K_4) the group is C_3 × K_4 ≅ C_6 × C_2,
(s_2 = 1, s_3 = 4, C_4) no group, C_3 cannot act nontrivially on C_4,
(s_2 = 1, s_3 = 4, K_4) the group is A_4,
(s_2 = 3, s_3 = 1, C_4) the group is BD_12,
(s_2 = 3, s_3 = 1, K_4) the group is D_12,
(s_2 = 3, s_3 = 4) no group, Sylow’s 3-subgroups contain (3−1)·4+1 = 9
elements leaving space only for one Sylow’s 2-subgroup.
discussing divisibility in domains. We introduce and motivate principal ideal
domains.
18.1 Domains
Definition. Two non-zero elements a, b in a ring R such that ab = 0_R are
called zero divisors. A domain is a non-zero commutative ring without zero
divisors.
There are good reasons not to call the zero ring a domain. Unfortunately,
I could not think of a short convenient definition that automatically excludes
it. Could you?
The following proposition has a straightforward proof but is useful for
producing domains.
18.2 Divisibility
We are working in a domain R. Let us try to replicate techniques known
to you in Number Theory.
Definition. Let x, y ∈ R. We say that x divides y and write x|y if y = xr
for some r ∈ R.
The following lemma is obvious.
(i) x|y.
(ii) y ∈ (x).
(iii) (x) ⊇ (y).
Then x = ry = r(tx) and (1 − rt)x = 0. Because R is a domain, 1 − rt = 0
and q = r ∈ R^×.
The other implications are obvious. □
In Z divisibility is usual: x ∼ y if and only if x = ±y since Z^× =
{1, −1}. In a general domain, divisibility properties are invariant under the
equivalence relation of being associate. In other words, if x satisfies a certain
divisibility property then so does any y such that x ∼ y. For instance, it is easy
to observe (and left as an exercise) that any two greatest common divisors
are associate.
Definition. Let x, y ∈ R. The greatest common divisor gcd(x, y) is an element
d ∈ R such that d|x, d|y, and if z|x and z|y then z|d. The least common multiple
lcm(x, y) is an element l ∈ R such that x|l, y|l, and if x|z and y|z then l|z.
Uniqueness of lcm(x, y) and gcd(x, y) (up to an associate element) is
established in the exercises. Existence is a bit trickier. In general, they may
not exist.
18.3 PID-s
Recall that an ideal I ⊴ R is principal if I = (a) = aR for some a ∈ R
(see Section 11.2).
Definition. A domain R is called a principal ideal domain (abbreviated
PID) if any ideal of R is principal.
Example. 1. The ring of integers Z is a PID. Any ideal I ⊴ Z is a subgroup of Z^+ and
any subgroup of Z is cyclic. Hence, I = <n> as an additive group, implying
that I = (n) as an ideal.
In the next lecture we will give more examples of PID-s but for now we
will use their properties.
Proposition 18.4 If R is a PID then lcm(x, y) and gcd(x, y) exist for any
pair of elements x, y ∈ R.
Proof: Pick d, l ∈ R such that (d) = (x) + (y) and (l) = (x) ∩ (y). We
claim that d is the greatest common divisor and l is the least common
multiple. Indeed, (x) ⊆ (d) ⊇ (y) and whenever (x) ⊆ (z) ⊇ (y) it follows
that (z) ⊇ (x) + (y) = (d). Similarly, (x) ⊇ (l) ⊆ (y) and whenever
(x) ⊇ (z) ⊆ (y) it follows that (z) ⊆ (x) ∩ (y) = (l). □
Note that (x) + (y) = {rx + sy}, hence gcd(x, y) = rx + sy for some
r, s ∈ R as soon as R is a PID.
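In the PID Z the coefficients r and s can be found explicitly. The Python sketch below is our own illustration and uses the extended Euclidean algorithm, which is an assumption on our part: it is not described in this section. The function name extended_gcd is hypothetical.

```python
def extended_gcd(x, y):
    """Return (d, r, s) with d = gcd(x, y) = r*x + s*y in Z."""
    # Invariant: d0 = r0*x + s0*y and d1 = r1*x + s1*y.
    r0, s0, d0 = 1, 0, x
    r1, s1, d1 = 0, 1, y
    while d1 != 0:
        q = d0 // d1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
        d0, d1 = d1, d0 - q * d1
    return d0, r0, s0


d, r, s = extended_gcd(143, 91)
print(d, r, s)          # 13 2 -3, indeed 2*143 - 3*91 = 13
print(143 * 91 // d)    # 1001, a least common multiple of 143 and 91
```

Here (13) = (143) + (91) and (1001) = (143) ∩ (91), in agreement with the proof of Proposition 18.4.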
18.4 Prime and irreducible elements
There are two different ways to say what a prime number is. We are
going to see that these lead to two different notions in an arbitrary domain
R.
Definition. Let us consider r ∈ R \ (R× ∪ {0}). We say that r is
irreducible if r = ab implies that a ∈ R× or b ∈ R×. We say that p ∈ R
is prime if p ∈ R \ (R× ∪ {0}) and p|xy implies that p|x or p|y.
18.5 Exercises
(i) Prove Lemma 18.2.
(ii) Let d and d′ be both the greatest common divisor gcd(x, y). Prove
that d and d′ are associate.
(iii) Let l and l′ be both the least common multiple lcm(x, y). Prove
that l and l′ are associate.
(iv) Let p be prime. Show that if p | a_1 · a_2 ··· a_n then p divides a_i for some i.
(v) Prove Proposition 18.1.
(vi) Show that if R is a domain then R[x1 . . . xn ] is a domain.
18.6 Vista: noetherian rings and group rings
Noetherian domains are a natural generalisation of principal ideal
domains. A ring R (not necessarily commutative) is noetherian if every ideal
is finitely generated. Hilbert’s basis theorem states that R[X] is noetherian
whenever R is noetherian. A quotient ring of a noetherian ring is noetherian
(but a subring need not be, in general). This gives plenty of examples of
noetherian rings.
Here is another chance for you to become a famous algebraist next week-
end. All you need to do is to figure out when the group ring is noetherian.
Let G be a group, F a field. The group ring FG is a vector space with basis
Eg , g ∈ G. The multiplication is F-bilinear with Eg · Eh = Egh on the basis
elements. When is FG noetherian?
A group G is called polycyclic if it admits a finite chain of subgroups
G_k ≤ G, k = 0, ..., n such that G_n = G, G_0 = {1}, G_k ⊴ G_{k+1} and G_{k+1}/G_k
is cyclic for all k. A group G is called virtually polycyclic if it has a polycyclic
subgroup of finite index. It is known (at the level of a hard exercise) that if
G is virtually polycyclic then FG is noetherian. It is one of the biggest
conjectures in ring theory that the converse holds: is it true that
if FG is noetherian then G is virtually polycyclic?
19 Euclidean domains
First, we discuss what prime elements do on the level of quotient rings.
Then we introduce Euclidean domains and give new examples of PID-s.
19.1 Quotient rings and primes
Proposition 19.1 Let R be a domain. A nonzero element p ∈ R is prime
if and only if R/(p) is a domain.
Proof: We write [x] for the coset x + (p). Now [x] ≠ 0 translates into p ∤ x.
Thus, [x][y] = 0 =⇒ [x] = 0 ∨ [y] = 0 translates into the prime element
definition p | xy =⇒ p | x ∨ p | y. □
The following proposition tells us a bit more.
\[ X^7 + 1 \;\overset{X^4}{\leadsto}\; X^4(X-1) + 1 = X^5 - X^4 + 1 \;\overset{X^2}{\leadsto}\; X^2(X-1) - X^4 + 1 = -X^4 + X^3 - X^2 + 1 \]
\[ \overset{-X}{\leadsto}\; -X(X-1) + X^3 - X^2 + 1 = X^3 - 2X^2 + X + 1 \;\overset{1}{\leadsto}\; (X-1) - 2X^2 + X + 1 \]
Hence, X^7 + 1 = (X^4 + X^2 − X + 1)(X^3 − X + 1) + (−2X^2 + 2X).
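The division above can be replayed mechanically. Here is a minimal Python sketch of division with remainder in Q[X]; the representation (a list of coefficients, lowest degree first) and the names trim and poly_divmod are assumptions made for this illustration only.

```python
from fractions import Fraction


def trim(p):
    """Drop zero coefficients of highest degree; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(a, b):
    """Return (q, r) with a = q*b + r and deg r < deg b in Q[X]."""
    r = trim(Fraction(c) for c in a)
    b = trim(Fraction(c) for c in b)
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(r) - len(b) + 1, 0)
    while len(r) >= len(b):
        shift = len(r) - len(b)
        coeff = r[-1] / b[-1]          # cancel the leading term of r
        q[shift] = coeff
        for i, c in enumerate(b):
            r[shift + i] -= coeff * c
        r = trim(r)
    return q, r


# X^7 + 1 divided by X^3 - X + 1
q, r = poly_divmod([1, 0, 0, 0, 0, 0, 0, 1], [1, -1, 0, 1])
print(q, r)
```

Each pass of the while loop is one rewriting step: the leading term of the current remainder is replaced by terms of lower degree.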
19.3 ED and PID
Theorem 19.3 A euclidean domain is a principal ideal domain.
Proof: Let I be a nonzero ideal in a euclidean domain R (the zero ideal is
(0)). Choose b ∈ I \ {0} with the smallest possible norm. Obviously, (b) ⊆ I.
Let us now prove the opposite inclusion. For an arbitrary a ∈ I we can write
a = bq + r with either r = 0 or ν(b) > ν(r). If r ≠ 0 then r = a − bq ∈ I and
has a smaller norm than b. This contradiction proves that a = bq ∈ (b). □
Rings Z[α] = {∑_k a_k α^k | a_k ∈ Z} for α ∈ C are domains but other
properties are harder to predict. We will explain the following examples
later in the course. For now you should take my word for them.
Examples. 3. The domain Z[(1 + √−19)/2] is a PID but not an ED.
4. The domain Z[√−5] is not a PID as shown in Section 18.4. Hence, it
is not an ED either.
5. Gaussian integers Z[i] = {a + bi ∈ C | a, b ∈ Z} form a subring of
C. Hence, Z[i] is a domain. It is euclidean with the norm function ν(x) =
|x|^2. The first property is clear. The second property follows from the fact
that Z[i]∗ = {1, −1, i, −i}, which follows from q^{−1} = q∗/|q|^2 where q∗ =
Re(q) − Im(q)i is the conjugate of q. For the third property choose
the Gaussian integer q nearest to a/b. Observe that |q − a/b| ≤ 1/√2. Let
r = a − qb. As soon as r ≠ 0, ν(r) = |a − qb|^2 = |q − a/b|^2 |b|^2 ≤ ν(b)/2 < ν(b).
The last calculation seems to make sense even for r = 0. What about
our exclusive disjunction “either . . . or”? The answer to this question is that
the function ν is not defined at zero [17].
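The division step in Example 5 is easy to try out. In the Python sketch below a Gaussian integer is a pair (re, im) of integers; this representation and the names nu and gauss_divmod are assumptions made for the illustration. The quotient is obtained by rounding both coordinates of a/b to the nearest integers.

```python
def nu(x):
    """Norm |x|^2 of the Gaussian integer x = (re, im)."""
    return x[0] * x[0] + x[1] * x[1]


def gauss_divmod(a, b):
    """Return (q, r) with a = q*b + r and nu(r) <= nu(b)/2, for b != 0."""
    (a1, a2), (b1, b2) = a, b
    n = nu(b)
    # a/b = a * conj(b) / nu(b) = (u + v*i) / n
    u = a1 * b1 + a2 * b2
    v = a2 * b1 - a1 * b2
    # nearest integers to u/n and v/n, in exact integer arithmetic
    q1 = (2 * u + n) // (2 * n)
    q2 = (2 * v + n) // (2 * n)
    r1 = a1 - (q1 * b1 - q2 * b2)
    r2 = a2 - (q1 * b2 + q2 * b1)
    return (q1, q2), (r1, r2)


q, r = gauss_divmod((27, 23), (8, 1))
print(q, r, nu(r), nu((8, 1)))
```

Repeating this step, as in the usual Euclidean algorithm, computes greatest common divisors in Z[i].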
19.4 Minimal polynomials
Principal ideals have several applications, which you may have seen
already. The idea is always the same. Let us start with the minimal polynomial
of a matrix. Let K be a field, A ∈ M_n(K) a matrix. It defines a ring
homomorphism f_A : K[X] → M_n(K) by f_A(∑_k α_k X^k) = ∑_k α_k A^k. This
ring homomorphism is sometimes called the evaluation homomorphism because
f_A(F(X)) = F(A). The homomorphism f_A is a linear map from an
infinite dimensional vector space to a finite dimensional one. Hence, the kernel
is non-zero. Since K[X] is a PID, the kernel is an ideal (m_A) for some polynomial
m_A ∈ K[X]. Multiplying m_A by a nonzero scalar does not change the ideal, thus,
without loss of generality, m_A is monic (the highest degree term has
coefficient 1). This m_A is called the minimal polynomial of A. The kernel of
f_A consists of all polynomials F(X) such that F(A) = 0. Thus, m_A is the
monic polynomial F of minimal degree such that F(A) = 0.
[17] Alternatively, one needs to set ν(0) = −∞ to keep the precious property ν(xy) =
ν(x) + ν(y).
In a similar way, a complex number α ∈ C defines an evaluation
ring homomorphism f_α : Q[X] → C by f_α(F(X)) = F(α). The kernel is
(m_α). If m_α = 0 the number α is called transcendental: it does not satisfy
any nonzero polynomial with rational coefficients. If m_α ≠ 0 the number α is
called algebraic. The unique monic m_α is called the minimal polynomial of α.
The last application of this sort is the characteristic of a ring. Any ring
R has a natural homomorphism fR : Z → R defined by fR (n) = n1R . The
kernel of this homomorphism is (n) for some n ≥ 0. This number n is the
characteristic of R.
19.5 Exercises
(i) Show that the ring Q_p = {x/p^n | x ∈ Z} ≤ Q, where p is prime, is a
Euclidean domain.
(ii) Divide X^8 + X ∈ Q[X] by X^4 − X − 1 with remainder.
(iii) Divide X^6 − X^5 + 1 ∈ Q[X] by X^3 − X^2 + 1 with remainder.
(iv) Find the minimal polynomial of (1 + √D)/2 over Q where D is a
square-free integer.
(v) Prove that the characteristic of a domain is always zero or a prime number.
(vi) Describe all rings of characteristic 1.
19.6 Vista: √−19 and quadratic integers
You may be surprised by the lack of details on why Z[(1 + √−19)/2] is
a non-euclidean PID. It is actually easy to see that the usual norm ν(x) = |x|^2
fails axiom (iii) of a Euclidean domain. It is slightly harder to verify a certain
weaker version of axiom (iii). This weaker axiom still ensures that an ideal
I is generated by a smallest non-zero element. It is somewhat trickier to
show that Z[(1 + √−19)/2] has no other euclidean norm [18].
It is more interesting to try to understand what is so special about
√−19. Let D ≠ 1 be a square-free integer. The quadratic field Q[√D] has
a natural subring O_D of all algebraic integers. Observe that if D ≡_4 1 then
O_D = Z[(1 + √D)/2], and O_D = Z[√D] otherwise. Which of these rings
are ED or PID? The following statements summarize what is known about
imaginary (i.e. D < 0) quadratic integers:
(i) ν(x) = |x|^2 is a Euclidean norm on O_D if and only if
D ∈ {−1, −2, −3, −7, −11},
(ii) O_D is a PID if and only if D ∈ {−1, −2, −3, −7, −11, −19, −43, −67, −163},
(iii) if D ∈ {−19, −43, −67, −163} then O_D is not an ED.
[18] O. A. Campoli, The American Mathematical Monthly, Vol. 95 (9), 1988, pp. 868–871
contains full details and an elementary treatment.
Less is known about real (D > 1) quadratic integers:
(i) ν(x) = |x|^2 is a Euclidean norm on O_D if and only if
D ∈ {2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73},
(ii) for D < 100, OD is PID if and only if D ∈ {2, 3, 5, 6, 7, 11, 13, 14, 17,
19, 21, 22, 23, 29, 31, 33, 37, 38, 41, 43, 46, 47, 53, 57, 59, 61, 62, 67, 69,
71, 73, 77, 83, 86, 89, 93, 94, 97},
(iii) it is conjectured by Gauss (open as of 2010) that there are infinitely
many PID-s among OD ,
(iv) if extended Riemann’s hypothesis holds, then OD is PID implies
that OD is ED.
Thus, O_{−19} is the “smallest” PID that is not an ED!
If you want to know more, you should consider taking MA3A6, Algebraic
Number Theory, where you will learn that all O_D are Dedekind
domains and a certain finite group, called the ideal class group of O_D,
controls whether O_D is a PID (this group must be trivial)!
another factorisation ab = xr · s1 · · · st . By the UFD property, x is associate
to ri for some i. If i ≤ k then x|a. If i > k then x|b.
The if part follows by a standard induction on n, the length of one of
the factorisations x = r_1 · r_2 ··· r_n = s_1 · s_2 ··· s_m. If n = 1 then x = r_1 is
irreducible and everything follows. If we are done for n − 1, we observe that
r_n | s_1 · s_2 ··· s_m. Since r_n is prime it divides some s_i. Hence, r_n = q s_i for
some unit q. Now we use the induction assumption on (q r_1) · r_2 ··· r_{n−1} =
s_1 ··· s_{i−1} · s_{i+1} ··· s_m. □
20.2 Principal ideals give unique factorisation
Theorem 20.2 A PID is a UFD.
Proof: Using Propositions 18.6 and 20.1, it suffices to show that R is FD.
We have to factorise an arbitrary x ∈ R \ (R∗ ∪ {0}). If x is irreducible then
we are done. If not we can write x = x_{1,1} · x_{1,2} where x_{1,i} are not units.
We are going to repeat this step over and over again. Step n + 1
starts with x = x_{n,1} · x_{n,2} ··· x_{n,k} where none of the x_{n,i} are units. If x_{n,i} is
irreducible for all i, we have arrived at a factorisation of x and we terminate
the process. If not, pick all of the x_{n,i} which are not irreducible and write each
of them as a product of two non-units x_{n,i} = x_{n+1,j} · x_{n+1,j+1}. In this case, we
write x = x_{n+1,1} · x_{n+1,2} ··· x_{n+1,t} and continue with the process.
If this process terminates for all x, we are done: R is FD. Now suppose
the process does not terminate for some particular x; we are after some
sort of contradiction. The process goes on forever and produces a set of
decompositions x = x_{n,1} · x_{n,2} ··· x_{n,k}, one decomposition for each natural
number n. The latter statement seems to be obvious but there is a
set-theoretic issue: we use recursion, which is some sort of induction, to construct
a set. Why can we do this? The answer is the Recursion Theorem in Set
Theory. Let us not get any further into this now.
The next step requires some abstract thinking. To facilitate it, think of
all these decompositions as a binary tree. The root of the tree is the element
x. The nodes at level n are the elements x_{n,i} for all i. If x_{n,i} is irreducible, it
does not have any upward edges. If x_{n,i} = x_{n+1,j} · x_{n+1,j+1}, it has two upward
edges going to x_{n+1,j} and x_{n+1,j+1}. Since the process has not terminated,
the tree is infinite. This means there is an infinite path in this tree starting
from the root and going upward. Let y_n = x_{n,i} be the element of this infinite
path at level n. In particular, y_0 = x. Observe that ... y_{n+1} | y_n ... y_1 | y_0.
We have done all this hard work to obtain the ascending chain of ideals
... (y_{n+1}) ⊃ (y_n) ... (y_1) ⊃ (y_0) with all the inclusions proper. The trick is
that their union I = ∪_{n=1}^∞ (y_n) is an ideal. This is true because each of the
ideal conditions can be checked inside one particular (y_n) (do the exercises
below if you have difficulties with it). Since R is a PID, I = (d) for some particular
d ∈ R. Then d ∈ (y_n) for some n. This implies that I = (d) ⊆ (y_n).
Consequently, I = (d) = (y_n) = (y_{n+1}) = ... = (y_{n+i}), which contradicts all
the ideal inclusions being proper. □
Examples. 1. All our ED-s Z, Z[i], F[X] are UFD-s.
2. Z[i√5] is FD but not UFD. We have seen that it is not UFD since
6 = 2 · 3 = (1 + i√5)(1 − i√5) are two distinct factorisations. We won’t
prove in this course that this ring is FD.
3. Z[X] is UFD, which will be proved later, but not PID: (2, X) is not
principal.
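Example 2 can be checked by a short computation with the norm ν(a + bi√5) = a^2 + 5b^2, which is multiplicative. A proper factor of 2, 3 or 1 ± i√5 (norms 4, 9 and 6) would have norm 2 or 3, so it is enough to see that no element has such a norm. The Python sketch below does this by brute force; the pair representation (a, b) and the function names are assumptions made for the illustration.

```python
from math import isqrt


def norm(a, b):
    """Norm of a + b*i*sqrt(5)."""
    return a * a + 5 * b * b


def elements_of_norm(n):
    """All (a, b) with a^2 + 5*b^2 == n; both |a| and |b| are at most sqrt(n)."""
    bound = isqrt(n)
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm(a, b) == n]


def mul(u, v):
    """Product of a + b*i*sqrt(5) and c + d*i*sqrt(5)."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)


print(elements_of_norm(1))   # the units
print(elements_of_norm(2), elements_of_norm(3))
print(mul((2, 0), (3, 0)), mul((1, 1), (1, -1)))
```

Since the only units are ±1, neither 2 nor 3 is associate to 1 ± i√5, so the two factorisations of 6 are genuinely different.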
20.3 Exercises
(i) A ring is noetherian if every ideal is finitely generated. Prove that
R is noetherian if and only if every ascending chain of ideals in R
terminates.
(ii) Prove that every noetherian domain is FD.
(iii) Prove that (2, X) ⊴ Z[X] is not principal.
20.4 Vista: the birth of ring theory
Ring theory appeared as a result of an accident, Lamé’s 1847 mistake
(see http://www.mathpages.com/home/kmath447.htm ). Let ω_p =
exp(2πi/p) where p > 2 is a prime number. Lamé essentially proved
that if Z[ω_p] = {∑_{i=0}^{p−1} a_i ω_p^i ∈ C | a_i ∈ Z} is a UFD then Fermat’s Last
Theorem holds for p, i.e. the equation x^p + y^p = z^p has no nontrivial
integral solutions. Lamé did not give enough thought to the issue and just used
the UFD property of Z[ω_p]. Kummer corrected this mistake and gave
a criterion in terms of Bernoulli numbers for Z[ω_p] to be UFD. A prime p is
called regular (correspondingly irregular) if Z[ω_p] is UFD (correspondingly
not UFD). Looking at small primes, it appears that all are regular. In fact,
the first irregular prime is 37; then 59, 67, 101, 103, 131, 149 are irregular.
On the other hand, it has been proved that there are infinitely many
irregular primes. It is expected that irregular primes constitute about 39% of all
the primes but it is still an open problem whether there are infinitely many
regular primes.
The ring O_{−3} = Z[ω_3] is called the ring of Eisenstein integers. If you were
thinking about Gaussian primes for your second year essay, consider switching to
Eisenstein primes.
21 Polynomials over fields
We study F [X] over a field F . Our knowledge of this ring leads to a
number of nontrivial observations about the field F itself.
21.1 Remainder theorem
Proposition 21.1 (Remainder Theorem) Let f = f (X) ∈ F [X]. If f (a) =
0 for some a ∈ F then X − a divides f .
Proof: Divide f (X) by X − a with a remainder:
f (X) = g(X)(X − a) + r.
Notice that r must have degree less than 1, so r ∈ F (a constant polynomial).
Substituting X = a, we arrive at 0 = f(a) = r. □
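The division by X − a in this proof is what Horner's scheme computes: the last number produced is the remainder r = f(a) and the earlier numbers are the coefficients of g. Below is a small Python sketch; the coefficient-list convention (lowest degree first) and the name divide_by_linear are assumptions of this illustration.

```python
def divide_by_linear(f, a):
    """Divide f(X) by X - a; return (g, r) with f = g*(X - a) + r and r = f(a)."""
    if not f:
        return [], 0
    carry = 0
    partial = []
    for c in reversed(f):          # from the leading coefficient down
        carry = carry * a + c
        partial.append(carry)
    r = partial.pop()              # the last value is f(a)
    return partial[::-1], r


# f = X^3 - 6X^2 + 11X - 6 = (X - 1)(X - 2)(X - 3)
f = [-6, 11, -6, 1]
print(divide_by_linear(f, 2))      # remainder 0, so X - 2 divides f
print(divide_by_linear(f, 4))      # nonzero remainder f(4)
```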
Definition. A field F is algebraically closed if for any f(X) ∈ F[X] of degree
at least 1 there exists a ∈ F such that f(a) = 0.
When we want to classify primes in a ring R, we are after a complete
list of primes, which means that each element on the list is prime and
any other prime is associate to exactly one prime on the list. For example,
X + a ∼ bX + ab for any b ≠ 0 and we list just one of them.
Corollary 21.4 Z_p^× is a cyclic group of order p − 1.
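Corollary 21.4 says that Z_p^× has a generator, i.e. an element of order p − 1. For small p one can simply search for the generators; the following Python sketch does this by brute force. The function names are chosen for this illustration and the method is only sensible for small primes.

```python
def multiplicative_order(a, p):
    """Order of a in the group of units modulo the prime p (a not divisible by p)."""
    order, x = 1, a % p
    while x != 1:
        x = x * a % p
        order += 1
    return order


def generators(p):
    """All generators of the cyclic group of units modulo the prime p."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]


for p in (5, 7, 11, 13):
    print(p, generators(p))
```

A cyclic group of order p − 1 has exactly ϕ(p − 1) generators, which the output confirms for these primes.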
In other words, complex numbers form an algebraically closed field.
There are numerous proofs using Algebra, Analysis or Topology including
the one done in Foundations along the lines of the original Gauss’ argument.
My two favourite proofs are via Liouville’s Theorem (MA3B8 Complex Anal-
ysis) or Open Mapping Theorem
21.5 Derivatives and square-free elements
If R is a domain, r ∈ R is square-free if r is not a unit and if x^2 | r then
x is a unit. I don’t know how to determine whether r ∈ Z is square-free
except by decomposing r into primes.
Let D : F[X] → F[X] be an F-linear map defined on the monomials
by D(X^n) = nX^{n−1}. You must have recognized the usual derivative except
that we define it over any field and do not use limits. We denote D(f) = f′
to play on our usual intuition.
21.6 Exercises
(i) Let F be a field, f = f (X) ∈ F [X] irreducible. Prove that X +(f ) ∈
F [X]/(f ) is a root of the polynomial f (X).
(ii) Improve Theorem 21.3 by showing that any two subgroups in F ×
of order N are equal.
[21] The key word here is separable.
(iii) Prove that {X − a, X^2 + bX + c | a, b, c ∈ R, b^2 − 4c < 0} is a
complete list of primes in R[X].
(iv) For which p is X^2 + X + 1 prime in Z_p[X]?
(v) For which p is X^2 − X + 1 prime in Z_p[X]?
(vi) For which p is X^2 + X + 2 ∈ Z_p[X] square-free?
(vii) For which p is X^p + X + 1 ∈ Z_p[X] square-free?
(viii) Prove that if f ∈ F [X] is square-free and F is algebraically closed
then gcd(f, f ′ ) = 1.
(ix) Prove Corollary 21.10 .
21.7 Vista: Artin’s conjecture
Let D > 1 be a square-free integer. Let S(D) be the set of primes p such
that |[D]| = p − 1 where [D] = D + (p) ∈ Z_p^×. Emil Artin conjectured in 1927
that this set is infinite. A positive answer follows from the Generalised
Riemann Hypothesis (which you should never confuse with the Extended Riemann
Hypothesis). Find out more in M. Ram Murty, Mathematical Intelligencer
10 (4), 1988, 59-67.
22 Gaussian primes
We classify primes in Z[i] = Z[ω4 ] = O−1 and derive some consequences.
22.1 Preliminary observations
Primes in Z[i] are called gaussian primes. Let us recall that ν(x) = |x|2 .
It is useful to remember that if x|y in Z[i] then ν(x)|ν(y) in Z.
Proposition 22.3 Let q ∈ Z[i] be a gaussian prime. Then ν(q) is either a
prime or the square of a prime.
Proof: Let n = ν(q) = qq∗. Take a decomposition of n into primes in Z, say
n = p_1 ··· p_t. Since q | n and q is prime, q | p_j in Z[i] for some j. Thus,
n = ν(q) | ν(p_j) = p_j^2. □
Theorem 22.6 The prime elements in Z[i] are obtained from the prime
elements of Z. Each prime p ∈ Z congruent to 3 modulo 4 is a gaussian prime.
The prime p = 2 gives rise to a gaussian prime q such that 2 ∼ q^2. Each
prime p ∈ Z congruent to 1 modulo 4 gives rise to two nonassociate gaussian
primes q and q∗ such that p = qq∗.
Proof: By Proposition 19.1, a prime p ∈ Z is a gaussian prime if and only
if Z[i]/(p) ≅ Z_p[X]/(X^2 + 1) (Lemma 22.5) is a domain. By Corollary 21.6,
this is equivalent to p ≡_4 3.
Now using Proposition 22.3, we can distinguish the two cases for a
gaussian prime q. In the first case, q is a gaussian prime such that p^2 = ν(q)
for a prime p. Hence, q | p in Z[i]. Pick s ∈ Z[i] such that p = qs. Then
|s| = |p|/|q| = 1 and s is a unit (s^{−1} = s∗/|s|^2). Hence q is associate to p
and p is forced to be 3 modulo 4.
In the second case, q is a gaussian prime such that p = ν(q) is a prime.
As x ↦ x∗ is a ring automorphism, q∗ is also a prime. Thus we observe
two primes q, q∗ such that p = qq∗ = ν(q). As p is not a gaussian prime, p is
forced to be 1 or 2 modulo 4.
Now q = x + yi ∼ q∗ = x − yi if and only if |x| = |y| or x = 0 or
y = 0. The latter two cases are impossible and in the former case, we
must have |x| = |y| = 1. Thus, we get 4 associate primes ±1 ± i and
p = ν(1 + i) = 1^2 + 1^2 = 2.
If p is 1 modulo 4, we get two groups of associate primes {q = x + yi, −y +
xi, −x − yi, y − xi} and {q∗ = x − yi, y + xi, −x + yi, −y − xi}. The primes
in the different groups are not associate. □
22.3 Applications
Corollary 22.7 (Fermat) Every prime p congruent to 1 modulo 4 is a sum of
two integer squares in a unique way.
Proof: Theorem 22.6 provides existence: p = qq∗ for a prime q = x + iy,
hence p = qq∗ = x^2 + y^2. If p = x^2 + y^2 = a^2 + b^2 then p = (x + iy)(x − iy) =
(a + ib)(a − ib) are two prime decompositions in Z[i]. Everything follows
from the UFD property of Z[i]. □
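Corollary 22.7 is pleasant to test numerically. The Python sketch below lists all representations p = x^2 + y^2 with 1 ≤ x ≤ y by brute force (so uniqueness is understood up to the order and signs of x and y); the function names are assumptions made for this illustration.

```python
from math import isqrt


def two_squares(p):
    """All pairs (x, y) with 1 <= x <= y and x*x + y*y == p."""
    bound = isqrt(p)
    return [(x, y)
            for x in range(1, bound + 1)
            for y in range(x, bound + 1)
            if x * x + y * y == p]


def is_prime(n):
    """Trial division, good enough for small n."""
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))


for p in range(3, 60):
    if is_prime(p):
        print(p, p % 4, two_squares(p))
```

Primes congruent to 3 modulo 4 produce an empty list, in agreement with Theorem 22.6: they stay prime in Z[i].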
23.1 Fields of Fractions
Let R be a domain. We consider the set W = R × (R \ {0}) = {(x, y) ∈
R × R | y ≠ 0}. It admits an equivalence relation where (a, b) ∼ (c, d)
whenever ad = bc. I leave it as an exercise to show that this is, indeed, an
equivalence relation. An equivalence class of (a, b) is called a fraction and
denoted a/b. Let Q = Q(R) be the set of all the equivalence classes on W.
The list of axioms of a field is long and we have to go and check
them all. But we are in good shape because we know that the operations
are well-defined, so we can use our usual intuition about fractions. The
associativity of addition is probably the hardest axiom to check:
\[ \Big(\frac{a}{b} + \frac{c}{d}\Big) + \frac{e}{f} = \frac{ad + bc}{bd} + \frac{e}{f} = \frac{adf + (bcf + bde)}{bdf} = \frac{a}{b} + \frac{cf + de}{df} = \frac{a}{b} + \Big(\frac{c}{d} + \frac{e}{f}\Big). \]
The commutativity of addition is easier:
\[ \frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd} = \frac{c}{d} + \frac{a}{b}. \]
The zero and the additive inverse are usual: 0 = 0/1 and −(a/b) = (−a)/b
with all the checks routine. The associativity of multiplication is
straightforward
\[ \Big(\frac{a}{b}\cdot\frac{c}{d}\Big)\cdot\frac{e}{f} = \frac{ac}{bd}\cdot\frac{e}{f} = \frac{ace}{bdf} = \frac{a}{b}\cdot\frac{ce}{df} = \frac{a}{b}\cdot\Big(\frac{c}{d}\cdot\frac{e}{f}\Big) \]
as well as the commutativity:
\[ \frac{a}{b}\cdot\frac{c}{d} = \frac{ac}{bd} = \frac{c}{d}\cdot\frac{a}{b}. \]
The unity and the multiplicative inverse are usual: 1 = 1/1 and (a/b)^{−1} =
b/a with all the checks routine. It is worth noticing though why a ≠ 0 here.
Indeed, a = 0 if and only if a · 1 = b · 0 if and only if a/b = 0/1 = 0. Finally, we
have to check distributivity but it suffices to do it on one side only because
the multiplication is commutative:
\[ \Big(\frac{a}{b}+\frac{c}{d}\Big)\cdot\frac{e}{f} = \frac{ad+bc}{bd}\cdot\frac{e}{f} = \frac{ade+bce}{bdf} = \frac{ade}{bdf}+\frac{bce}{bdf} = \frac{a}{b}\cdot\frac{e}{f}+\frac{c}{d}\cdot\frac{e}{f}. \]
Proof: Let a_1 be the least common multiple of all the denominators of the
coefficients of g(X), a_2 the greatest common divisor of all the coefficients
of a_1 g(X), a = a_1/a_2 ∈ Q(R). We define g̃ = ag ∈ R[X]. Similarly,
h̃ = bh ∈ R[X]. Notice that g̃ and h̃ are primitive. Consequently,
\[ f = \frac{u}{v}\,\tilde{g}\tilde{h} \quad\text{and}\quad vf = u\,\tilde{g}\tilde{h} \]
for some u, v ∈ R. Moreover, the greatest common divisor of u and v is 1.
As soon as we prove that v is a unit in R we conclude by setting ĝ = u g̃
and ĥ = v^{−1} h̃. Let us suppose that it is not a unit. Then there exists a
prime element p ∈ R that divides v. Let us consider a ring homomorphism
\[ \pi : R[X] \to R/(p)[X], \qquad \pi\Big(\sum_k a_k X^k\Big) = \sum_k (a_k + (p)) X^k. \]
Since p divides v, we have π(v) = 0 and therefore
\[ 0 = \pi(v)\pi(f) = \pi(u)\pi(\tilde{g})\pi(\tilde{h}). \]
Q(R) can be represented as
\[ r + \sum_j \frac{r_j}{p_j^{n_j}} \]
where r, r_j, p_j ∈ R, n_j ∈ N, and the p_j are pairwise non-associate primes.
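\[ \frac{X+1}{X^3+X} = \frac{aX+b}{X^2+1} + \frac{c}{X} = \frac{cX^2 + c + aX^2 + bX}{X^3+X} \]
Hence, b = c = 1, a = −1 and
\[ \int \frac{X+1}{X^3+X}\,dX = \int \Big(\frac{-X+1}{X^2+1} + \frac{1}{X}\Big)\,dX = -\frac{1}{2}\int \frac{dX^2}{X^2+1} + \int \frac{dX}{X^2+1} + \int \frac{dX}{X} = \]
\[ = -\frac{1}{2}\ln(X^2+1) + \arctan(X) + \ln(X) + C \]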
Does it ring a bell?
4. Let f = (X − a_1) ··· (X − a_n) ∈ C[X] where a_j ≠ a_k for j ≠ k and
g ∈ C[X] of degree less than n. Trying to guess the coefficients, we write
\[ \frac{g}{f} = \sum_{j=1}^{n} \frac{t_j}{X - a_j} \]
and
\[ g(X) = \sum_{j=1}^{n} \frac{t_j f(X)}{X - a_j} = \sum_{j=1}^{n} t_j (X-a_1)\cdots(X-a_{j-1})\cdot(X-a_{j+1})\cdots(X-a_n). \]
\[ = \sum_{j=1}^{n} s_j \frac{(X-a_1)\cdots(X-a_{j-1})\cdot(X-a_{j+1})\cdots(X-a_n)}{(a_j-a_1)\cdots(a_j-a_{j-1})\cdot(a_j-a_{j+1})\cdots(a_j-a_n)}. \]
It will be the polynomial of the smallest possible degree such that g(a_j) =
s_j.
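The interpolation formula above translates directly into a computation. The following Python sketch works over Q with exact fractions rather than over C; the list-of-coefficients convention (lowest degree first) and the name lagrange are assumptions made for this illustration.

```python
from fractions import Fraction


def lagrange(points):
    """Coefficients (lowest degree first) of the g with g(a_j) = s_j.

    points is a list of pairs (a_j, s_j) with pairwise distinct a_j.
    """
    n = len(points)
    g = [Fraction(0)] * n
    for j, (aj, sj) in enumerate(points):
        num = [Fraction(1)]        # product of (X - a_k) over k != j
        denom = Fraction(1)        # product of (a_j - a_k) over k != j
        for k, (ak, _) in enumerate(points):
            if k == j:
                continue
            num = [Fraction(0)] + num          # multiply by X ...
            for i in range(len(num) - 1):
                num[i] -= ak * num[i + 1]      # ... and subtract a_k * (old num)
            denom *= aj - ak
        for i, c in enumerate(num):
            g[i] += sj * c / denom
    return g


print(lagrange([(0, 1), (1, 2), (2, 5)]))      # the values of X^2 + 1 at 0, 1, 2
```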
Proposition 24.1 If R is a UFD then there are two kinds of primes in R[X]:
prime elements in R; primitive elements in R[X] that are prime in Q[X].
Moreover, R[X] is a UFD.
Proof: It immediately follows from Theorem 23.2 that all elements listed
are irreducible. Let us establish that any f ∈ R[X] can be factorised into
them. We can factorise f = f_1 ··· f_n in Q[X]. Getting rid of denominators
and common divisors of numerators, we get f = a f̃_1 ··· f̃_n for some a ∈
Q, f̃_j = a_j f_j primitive in R[X]. Factorising a in R, we arrive at the required
factorisation of f. Thus, every irreducible element of R[X] is associate to
either a prime in R or a primitive element in R[X], prime in Q[X].
Now we proceed to prove that this factorisation is unique. Let us consider
two factorisations
f = p_1 ··· p_k f_1 ··· f_n = q_1 ··· q_t g_1 ··· g_m ∈ R[X], p_j, q_j ∈ R, f_j, g_j ∉ R
α p_1 ··· p_k = β q_1 ··· q_t ∈ R
Proof: We proceed by induction on n. If n = 1 then F [X] is ED, hence
PID, hence UFD. If we have proved it for n − 1 we observe that
F[X_1, ..., X_n] ≅ F[X_1, ..., X_{n−1}][X_n]
We assume that there exists a prime p ∈ R such that p divides all ak for
k < n but does not divide an and p2 does not divide a0 . If the greatest
common divisor of all the coefficients is 1 then f (X) is irreducible in R[X].
Proof: A factorisation with one polynomial of zero degree is impossible
because the coefficients have no common divisors. Suppose
\[ f(X) = \sum_{k=0}^{n} a_k X^k = \Big(\sum_{k=0}^{m} b_k X^k\Big)\Big(\sum_{k=0}^{t} c_k X^k\Big) \]
with both polynomials of non-zero degree. Then a_k = ∑_{r+s=k} b_r c_s for all k.
Since p | a_0 = b_0 c_0, it divides either b_0 or c_0 but not both since p^2 does not
divide a_0. Without loss of generality, p divides b_0 but not c_0.
This serves as a basis of induction. We prove that p divides b_j for each
0 ≤ j ≤ m < n. Suppose we have done it for all j < l. Then
\[ b_l c_0 = a_l - \sum_{r<l} b_r c_{l-r}. \]
Since p divides every term in the right hand side, it divides b_l c_0. Since it
does not divide c_0, it divides b_l.
Hence, p divides a_n = b_m c_t which is a contradiction. □
Notice that if f (X) admits p as in Eisenstein’s criterion but its coef-
ficients are not relatively prime then f (X) is not irreducible in R[X] but
irreducible in Q[X] where Q = Q(R).
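For R = Z the hypotheses of Eisenstein's criterion are easy to test by machine. The Python sketch below returns all primes p that work for a given polynomial of degree at least 1; the coefficient-list convention (lowest degree first) and the function names are assumptions made for this illustration. An empty answer proves nothing: the criterion is sufficient, not necessary.

```python
def prime_divisors(n):
    """Prime divisors of the integer n (empty for n = 0 and n = ±1)."""
    n = abs(n)
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def eisenstein_witnesses(coeffs):
    """Primes p dividing a_0, ..., a_{n-1} but not a_n, with p^2 not dividing a_0."""
    *lower, lead = coeffs
    a0 = lower[0]
    return [p for p in prime_divisors(a0)
            if all(c % p == 0 for c in lower)
            and lead % p != 0
            and a0 % (p * p) != 0]


print(eisenstein_witnesses([5, 10, 0, 0, 1]))   # X^4 + 10X + 5
print(eisenstein_witnesses([4, 0, 1]))          # X^2 + 4: no prime works
```

To conclude irreducibility in Z[X], and not only in Q[X], one should also check that the coefficients have greatest common divisor 1.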
Examples. 1. If p is prime in R then X^n + p is prime in R[X] for any
n. In particular, f(X) = X^p − t is prime in R[X] where R = F[t] (F is a
field) and, consequently, by Gauss’ lemma, prime in Q[X] = F(t)[X]. See
Section 21.5 where it was used.
2. If p is prime in Z then f(X) = X^{p−1} + X^{p−2} + ... + X + 1 is prime in
Z[X]. Consider the shift automorphism
where h(X) is the product of all Φ_d(X) over divisors d of n other than n
itself. By the inductive hypothesis, h(X) is monic and has coefficients in Z.
So Φ_n(X) is the result of dividing X^n − 1 by h(X).
The process of dividing one polynomial by another consists of
rewritings X^k ⇝ (X^k − h(X)) where k is the degree of h(X). Every time
αX^m, α ∈ Z, is rewritten, αX^{m−k} goes to the result and αX^{m−k}(X^k − h(X))
appears. Both have integer coefficients, hence the quotient Φ_n(X) = (X^n −
1)/h(X) is monic with integer coefficients. □
The polynomial Φ_k is actually irreducible in Z[X] and you can find more
about it in the vista section below. It is an interesting fact that an early
version of a manual for the computer system Maple stated that all
coefficients of Φ_k are ±1 and 0. The smallest counterexample is Φ_105(X) =
X^48 + X^47 + X^46 − X^43 − X^42 − 2X^41 − X^40 − X^39 + X^36 + X^35 + X^34 +
X^33 + X^32 + X^31 − X^28 − X^26 − X^24 − X^22 − X^20 + X^17 + X^16 + X^15 +
X^14 + X^13 + X^12 − X^9 − X^8 − 2X^7 − X^6 − X^5 + X^2 + X + 1
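The claim about Φ_105 can be verified with the division process from the proof above. The Python sketch below computes Φ_n by dividing X^n − 1 successively by Φ_d for the proper divisors d of n (which amounts to dividing by their product h); the coefficient convention (lowest degree first) and the function names are assumptions made for this illustration.

```python
from functools import lru_cache


def divide_exactly(a, h):
    """Quotient of a by a monic h in Z[X]; coefficients, lowest degree first."""
    r = list(a)
    k = len(h) - 1                       # degree of h
    q = [0] * (len(a) - k)
    for m in range(len(a) - 1, k - 1, -1):
        alpha = r[m]                     # rewrite the term alpha * X^m
        q[m - k] = alpha                 # alpha * X^(m-k) goes to the result
        for i, c in enumerate(h):
            r[m - k + i] -= alpha * c
    assert not any(r), "h does not divide a"
    return q


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial as a tuple."""
    poly = [-1] + [0] * (n - 1) + [1]    # X^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = divide_exactly(poly, cyclotomic(d))
    return tuple(poly)


print(cyclotomic(6))
print(sorted(set(cyclotomic(105))))
```

Only integer additions and multiplications occur, which is exactly why the quotient has integer coefficients.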
24.4 Exercises
(i) Prove Corollary 24.3
(ii) Prove that F [X1 , . . . Xn ] is not PID if n > 1.
(iii) Prove that Z[X1 , . . . Xn ] is not PID.
(iv) Use Eisenstein’s criterion to produce 6 new irreducible polynomials
in Z[X].
(v) Compute the cyclotomic polynomial Φk for all k < 16.
(vi) Pick two of your favourite polynomials in Z[X] and find the great-
est common divisor.
(vii) Let f = ∑_k α_k X^k ∈ R[X] be monic, R any domain, I ⊴ R. Show
that if f̄ = ∑_k [α_k] X^k ∈ R/I[X] is irreducible then f is irreducible.
24.5 Vista: irreducibility of cyclotomic polynomials
Irreducibility of the cyclotomic polynomial Φ_n(X) was part of this
module last year. You can find it in the lecture notes on Mathstuff and can use
it for your essay. The idea is to try to reduce the coefficients modulo some
prime p, coprime to n. If the reduction of Φ_n(X) in Z_p[X] is irreducible then
so is Φ_n(X) in Z[X] (Exercise (vii)). Unfortunately, life is not so simple:
Φ_n(X) ∈ Z_p[X] is irreducible if and only if |p + (n)| = ϕ(n) in the group Z_n^×
(prove it as a part of your essay, a very good advanced exercise). This means that
Z_n^× must be cyclic for Φ_n(X) to have a shot at irreducibility modulo p. Look at
Z_15^× ≅ Z_3^× × Z_5^× ≅ C_2 × C_4. Hence, Φ_15(X) ∈ Z_p[X] is not irreducible
for any prime p!
The trick is to use several different primes. Look up the details. A
beautiful consequence of the irreducibility of Φ_n is an elementary proof that
there are infinitely many primes which are 1 modulo n.
25 Algebras and division rings
We introduce algebras fusing rings and vector spaces. We discuss division
rings and prove little Wedderburn’s theorem.
25.1 Algebras, their homomorphisms and ideals
Definition. An algebra is a pair (R, F) such that F is a field, R is both a
ring and a vector space over F such that these two structures share the same
addition and α(ab) = (αa)b = a(αb) for all α ∈ F, a, b ∈ R.
It is common to say that R is an algebra over F or simply an F-algebra.
Many of the rings we introduced are algebras.
Examples. 1. Any field K is an algebra over any subfield F ≤ K.
2. If F is a field, Mn (F) and F[X] are algebras over F.
3. If R is an algebra over K and F ≤ K is a subfield then R is an algebra
over F.
Algebras have analogues of subrings, ideals and homomorphisms. A
subalgebra of (R, F) is a subring S ≤ R, which is also a vector subspace. An
algebra ideal of (R, F) is an ideal I ⊴ R, which is also a vector subspace. An
algebra homomorphism from A = (R, F) to B = (S, F), where both algebras
are over the same field F, is a ring homomorphism from R to S
which is also an F-linear map. The isomorphism theorem holds for algebras.
25.2 Algebras and centres
To understand example 5, we need to recall the notion of the centre.
The centre of a ring R is Z(R) = {a ∈ R | ∀x ∈ R xa = ax}. Similarly to
groups, the centraliser of x ∈ R is C(x) = CR (x) = {a ∈ R | xa = ax}.
Proposition 25.4 Let (R, F) be an algebra such that R is finite but nonzero.
Then |F| = p^m for some prime p and positive integer m while |R| = |F|^k for
some positive integer k.
Proof: Since ψ : F → R, ψ(f) = f 1_R is a nonzero ring homomorphism, its
kernel must be zero (F is a field). Thus, F is finite.
The characteristic of F (see Section 19.4) must be a prime p. Hence Z_p is
a subring of F. The natural action a · x = ax, a ∈ Z_p, x ∈ F makes F into a
vector space over Z_p. Since F is finite, the vector space is finite dimensional,
say of dimension m. Hence, |F| = |Z_p^m| = p^m.
Similarly, if k is the dimension of R over F then |R| = |F^k| = |F|^k. □
Now we are ready for little disappointment.
The integer Φ_k(q) divides the right hand side, hence, it divides q − 1. We
claim that |Φ_k(q)| > q − 1 for k > 1. Indeed, if ξ = e^{2πi/k} then the set of
primitive k-th roots of unity is {ξ^t | t ∈ Z_k^×}. Thus,
\[ |\Phi_k(q)|^2 = \prod_t |q - \xi^t|^2 = \prod_t \big[(q - \mathrm{Re}(\xi^t))^2 + \mathrm{Im}(\xi^t)^2\big] \]
and since |ξ| = 1, each real part is certainly between −1 and +1, so
q − Re(ξ^t) > q − 1 unless Re(ξ^t) = 1, which happens only if ξ^t = 1, which
can happen only if k = 1. □
25.4 Exercises
(i) Prove Proposition 25.1.
(ii) Prove that any two F-algebra structures on a field F are isomorphic
as algebras.
(iii) Prove that if R is a commutative ring then Z(Mn (R)) = {αIn |
α ∈ R}.
(iv) Prove that there are no ring homomorphisms C → R. (Hint: where
should i go?)
(v) Prove that if V is a vector space over a field F then the set of all
linear operators E_F(V) is an F-algebra with S · T = ST and αT : v ↦
α(Tv) = T(αv). Prove that Z(E_F(V)) = {αI_V | α ∈ F}.
(vi) Let R be a ring, F a field. Prove that there is a bijection between
the set A = {(R, F) | (R, F) is an algebra } of algebra structures and
the set B of algebra homomorphisms from F to Z(R).
25.5 Vista: finite fields
To understand finite division rings, it remains to describe finite fields.
We already know that a finite field must have p^n elements for some prime
power p^n. In fact, for each prime power p^n there exists a unique (up to an
isomorphism) field F_{p^n} of order p^n. Existence follows from Theorem 21.7.
Let Z̄_p be the algebraic closure of Z_p. Then F_{p^n} is the subset of Z̄_p that
consists of roots of z^{p^n} − z. You need the freshman’s dream binomial formula
(Section 9.1) to prove that F_{p^n} is a subfield of Z̄_p.
Uniqueness follows from the UFD property of Z_p[z]. Let f(z) be any
prime factor of Φ_{p^n−1}(z) in Z_p[z]. If F is a field of order p^n then both z^{p^n} − z
and Φ_{p^n−1}(z) split over F into linear factors. Let ξ ∈ F be a root of f(z)
(a primitive root of unity of order p^n − 1). The evaluation homomorphism
Z_p[z] → F, h(z) ↦ h(ξ) gives rise to an isomorphism Z_p[z]/(f) ≅ F, proving
that any two such fields are indeed isomorphic.
Proposition 26.1 The R-span of 1, I, J and K is an R-subalgebra of
M_2(C).
Proof: The space is an R-vector subspace containing 1. It suffices to check
that it is closed under multiplication. We have already seen how to multiply
these matrices (Proposition 4.6):

      ·  |  1    I    J    K
     ----+-------------------
      1  |  1    I    J    K
      I  |  I   −1    K   −J
      J  |  J   −K   −1    I
      K  |  K    J   −I   −1

We are done because the multiplication is bilinear: (∑_i α_i E_i)(∑_j β_j E_j) =
∑_{i,j} α_i β_j E_i E_j. □
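The multiplication table can be double-checked by machine. The matrices I, J, K of Proposition 4.6 are not reproduced in this section, so the Python sketch below assumes one standard choice, I = diag(i, −i), J = ((0, 1), (−1, 0)), K = ((0, i), (i, 0)); if Proposition 4.6 uses a different convention, substitute its matrices. The helper names are also mine.

```python
# Assumed matrices (one standard convention, possibly not that of Proposition 4.6).
ONE = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))
J = ((0, 1), (-1, 0))
K = ((0, 1j), (1j, 0))
BASIS = {"1": ONE, "I": I, "J": J, "K": K}


def mul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2))
                 for i in range(2))


def neg(a):
    return tuple(tuple(-x for x in row) for row in a)


def name_of(m):
    """Recognise m as plus or minus one of the basis elements."""
    for label, e in BASIS.items():
        if m == e:
            return label
        if m == neg(e):
            return "-" + label
    raise ValueError("not a signed basis element")


table = {(r, c): name_of(mul(BASIS[r], BASIS[c])) for r in BASIS for c in BASIS}
for r in BASIS:
    print(r, [table[(r, c)] for c in BASIS])
```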
Definition. The Hamilton24 quaternions H is the R-algebra described in
Proposition 26.1.
Notice that H is not a C-subalgebra of M2 (C). The C-span of these
elements is the whole M2 (C). Moreover, H is not a C-algebra at all because
Z(H) = R and there are no ring homomorphisms C → R (see exercises).
26.2 Real and imaginary quaternions
A quaternion α1 is called real. A quaternion αI + βJ + γK is called
imaginary. Imaginary quaternions form a three-dimensional subspace H0 ,
while real quaternions form a subalgebra R of H. For each quaternion x =
α1 + βI + γJ + δK ∈ H, analogously to complex numbers we define its
conjugate
x∗ = α1 − βI − γJ − δK,
its real part
Re(x) = (x + x∗ )/2 = α1 ∈ R,
its imaginary part
Im(x) = (x − x∗ )/2 = βI + γJ + δK ∈ H0 ,
We treat the space of the imaginary quaternions H0 as the standard 3-space
with the standard dot and cross products:
26.3 Multiplicative group of quaternions
H is not a field because IJ ≠ JI but it is a division ring.
Thus every imaginary quaternion a ∈ H0 of norm 1 is an imaginary unit.
Imaginary units form a 2-sphere S 2 ⊆ H0 . Each imaginary unit q ∈ S 2
defines a homomorphism C → H, α + βi ↦ α + βq. Thus, in a strange
interplay between Algebra and Geometry the set of all homomorphisms from
C to H is a 2-sphere.
26.5 Hopf fibration
The Hopf fibration has been described in Vista Section 14.5. Before we give
a simpler description of it, we need to understand 3D-rotations. Let $R^x_\beta$
be the anticlockwise rotation by the angle β in the plane orthogonal to x:
Lemma 26.10 If a = cos θ + x sin θ for some imaginary unit x then
$R^x_{2\theta}(w) = awa^{-1}$ for each $w \in H_0$.
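Proof: Choose $y, z \in H_0$ so that x, y, z is a positively oriented
orthonormal basis. From Theorem 26.2, it follows that $x^2 = y^2 = z^2 = -1$,
$xy = -yx = z$, $yz = -zy = x$ and $zx = -xz = y$.
It suffices to check the lemma on the basis because both parts of the
equality $R^x_{2\theta}(w) = awa^{-1}$ are results of linear maps applied to w.
Notice that $a^{-1} = \cos\theta - x\sin\theta$ and that a commutes with x, so
$axa^{-1} = x = R^x_{2\theta}(x)$. Next,
$$aya^{-1} = (\cos\theta + x\sin\theta)\,y\,(\cos\theta - x\sin\theta) = (y\cos\theta + z\sin\theta)(\cos\theta - x\sin\theta)$$
$$= ((\cos\theta)^2 - (\sin\theta)^2)\,y + (2\cos\theta\sin\theta)\,z = y\cos 2\theta + z\sin 2\theta = R^x_{2\theta}(y)$$
and finally
$$aza^{-1} = (\cos\theta + x\sin\theta)\,z\,(\cos\theta - x\sin\theta) = (z\cos\theta - y\sin\theta)(\cos\theta - x\sin\theta)$$
$$= ((\cos\theta)^2 - (\sin\theta)^2)\,z - (2\cos\theta\sin\theta)\,y = z\cos 2\theta - y\sin 2\theta = R^x_{2\theta}(z).$$ □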
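Lemma 26.10 is easy to test numerically. The sketch below compares conjugation by a = cos θ + x sin θ with a rotation by 2θ about x computed independently by Rodrigues' rotation formula (a standard formula that is swapped in here for comparison; the notes do not use it). Quaternions are stored as 4-tuples (real, I, J, K) and the names `qmul`, `conjugate_by` and `rodrigues` are hypothetical.

```python
import math

def qmul(p, q):
    """Hamilton product; (a, b, c, d) stands for a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def conjugate_by(theta, x, w):
    """Return a w a^{-1} for a = cos(theta) + x sin(theta); x is a unit 3-vector."""
    c, s = math.cos(theta), math.sin(theta)
    a = (c, s * x[0], s * x[1], s * x[2])
    a_inv = (c, -s * x[0], -s * x[1], -s * x[2])  # a has norm 1
    return qmul(qmul(a, (0.0, w[0], w[1], w[2])), a_inv)

def rodrigues(phi, x, w):
    """Rotate w anticlockwise by phi about the unit vector x."""
    dot = sum(xi * wi for xi, wi in zip(x, w))
    cross = (
        x[1] * w[2] - x[2] * w[1],
        x[2] * w[0] - x[0] * w[2],
        x[0] * w[1] - x[1] * w[0],
    )
    c, s = math.cos(phi), math.sin(phi)
    return tuple(c * wi + s * ci + (1 - c) * dot * xi
                 for wi, ci, xi in zip(w, cross, x))

if __name__ == "__main__":
    x = (1 / 3, 2 / 3, 2 / 3)   # an imaginary unit
    w = (0.3, -1.2, 0.5)        # an arbitrary imaginary quaternion
    print(conjugate_by(0.7, x, w))
    print(rodrigues(1.4, x, w))
```

The real part of the conjugate comes out as zero up to rounding, and its imaginary part agrees with the rotation by twice the angle.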
The sphere $S^3$ is the group of norm 1 quaternions. The sphere $S^2$ is
Theorem 26.11 The U(H)-set $S^2$ has one orbit. The stabiliser of $x \in S^2$
is $U(H) \cap R(x)$.
Proof: If $x, y \in S^2$ then $R^z_{2\theta}(x) = y$ where $z \in H_0$ is a
suitably oriented unit vector orthogonal to both x and y and 2θ is the angle
between x and y. By Lemma 26.10, $a \cdot x = axa^{-1} = y$ where
$a = \cos\theta + z\sin\theta \in S^3 = U(H)$. Thus, the U(H)-set $S^2$ has one
orbit.
To compute the stabiliser of $x \in S^2$, observe that an arbitrary element
$a \in U(H)$ can be written as $a = \cos\theta + y\sin\theta$ where $y \in S^2$.
By Lemma 26.10, $a \in Stab(x)$ if and only if $a \cdot x = R^y_{2\theta}(x) = x$.
For this to happen we need x and y to be parallel (then y = ±x and
$a \in U(H) \cap R(x)$) or 2θ = 2nπ for some n ∈ Z (then θ = nπ and
a = cos nπ + y sin nπ = ±1). □
Geometrically, the stabiliser $U(H) \cap R(x)$ is the unit circle in R(x) ≅ C.
Choosing a particular quaternion $x \in S^2$, its orbit map
$\beta_x : U(H) \to S^2$, $\beta_x(g) = gxg^{-1}$ is the Hopf fibration
$S^3 \to S^2$: the inverse image $\beta_x^{-1}(y)$ is a coset of the stabiliser
$U(H) \cap R(x)$, i.e. geometrically a circle.
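The orbit map is short enough to experiment with. The following sketch, with hypothetical names `qmul`, `qconj`, `orbit_map` and `circle_element`, evaluates g ↦ g x g^{-1} on a unit quaternion g and illustrates that moving g along the circle of unit elements of R(x) does not change the image, so that the fibre through g is a circle. Quaternions are 4-tuples (real, I, J, K).

```python
import math

def qmul(p, q):
    """Hamilton product; (a, b, c, d) stands for a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(p):
    """Quaternionic conjugate; it is the inverse when p has norm 1."""
    return (p[0], -p[1], -p[2], -p[3])

def orbit_map(g, x):
    """The orbit map g -> g x g^{-1} for a unit quaternion g."""
    return qmul(qmul(g, x), qconj(g))

def circle_element(x, t):
    """cos t + x sin t for an imaginary unit x given as (0, x1, x2, x3)."""
    return (math.cos(t), math.sin(t) * x[1], math.sin(t) * x[2], math.sin(t) * x[3])

if __name__ == "__main__":
    n = math.sqrt(30.0)
    g = (1 / n, 2 / n, 3 / n, 4 / n)   # a point of the 3-sphere
    x = (0.0, 1.0, 0.0, 0.0)           # the imaginary unit I
    print(orbit_map(g, x))
    print(orbit_map(qmul(g, circle_element(x, 1.1)), x))  # same point of the 2-sphere
```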
26.6 Exercises
(i) Prove that $C_H(K) = R + RK = R(K)$.
(ii) Prove that Z(H) = R.
(iii) Prove that $U(H) = SU_2(C)$ as subgroups of $GL_2(C)$ (see Vista
Section 14.5 for the definition of $SU_2(C)$).
(iv) Prove that a × b = (ab − ba)/2 for all a, b ∈ H0 .
(v) Using Exercise (iv), prove Jacobi’s identity a × (b × c) + b × (c ×
a) + c × (a × b) = 0 for all a, b, c ∈ H0 .
(vi) Prove Schwarz’s inequality ||x|| · ||y|| ≥ |x • y|.
26.7 Vista: from multiplication tensors to superstring theory
We have successfully used a multiplication table to describe quaternionic
multiplication. Pushing this through for a general algebra leads to tensors.
Let (R, F) be an algebra. Pick elements $e_i \in R$ constituting a basis of R
as a vector space. We define multiplication for basis elements
$$e_i \cdot e_j = \sum_k m^k_{i,j} e_k \qquad (1)$$
for uniquely determined $m^k_{i,j} \in F$. These numbers are called structure
constants. Together they form a (2,1)-tensor on the vector space R.
We extend this formula by bilinearity, so the multiplication in R is
distributive. Notice that if $a = \sum_i \alpha^i e_i$, $b = \sum_j \beta^j e_j$,
$c = \sum_k \gamma^k e_k$ then
$$(ab)c = \sum_{i,j,k} \alpha^i \beta^j \gamma^k (e_i \cdot e_j) \cdot e_k, \qquad a(bc) = \sum_{i,j,k} \alpha^i \beta^j \gamma^k e_i \cdot (e_j \cdot e_k).$$
Similarly,
$$e_i \cdot (e_j \cdot e_k) = \sum_s m^s_{j,k}\, e_i \cdot e_s = \sum_{s,t} m^s_{j,k}\, m^t_{i,s}\, e_t.$$
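Expanding $(e_i \cdot e_j) \cdot e_k$ in the same way and comparing coefficients of $e_t$ turns associativity into a finite set of polynomial identities in the structure constants. The sketch below checks these identities for the quaternions, with the structure constants read off from the Hamilton product on the basis 1, I, J, K. The names `qmul`, `BASIS`, `m` and `is_associative` are hypothetical, and `m[i][j][k]` stands for $m^k_{i,j}$.

```python
def qmul(p, q):
    """Hamilton product; (a, b, c, d) stands for a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

# e_0, e_1, e_2, e_3 = 1, I, J, K
BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

# m[i][j][k] is the coefficient of e_k in the product e_i * e_j.
m = [[list(qmul(ei, ej)) for ej in BASIS] for ei in BASIS]

def is_associative(m):
    """Compare the e_t-coefficients of (e_i e_j) e_k and e_i (e_j e_k)."""
    r = range(len(m))
    return all(
        sum(m[i][j][s] * m[s][k][t] for s in r)
        == sum(m[j][k][s] * m[i][s][t] for s in r)
        for i in r for j in r for k in r for t in r
    )

if __name__ == "__main__":
    print(m[1][2])            # coefficients of I * J, i.e. those of K
    print(is_associative(m))
```

The same function applied to the structure constants of the cross product on 3-space returns False, so the test does detect a nonassociative multiplication.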
27 Quaternions and spinors
Using quaternions, we describe 3D and 4D spinors.
27.1 Exponents of quaternions
In Algebra-I, you have studied matrix exponents. Since H is an R-
subalgebra of $M_2(C)$, one can formally define
$$e^a = \sum_{n=0}^{\infty} \frac{1}{n!} a^n = 1 + a + a^2/2! + \dots$$
Proposition 27.2 The element $e^{2\pi x/n}$ has order n in $H^\times$ for any
imaginary unit $x \in U(H) \cap H_0$.
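Proposition 27.2 can be illustrated numerically by summing the exponential series directly. The sketch below is an approximation in floating point, not a proof: `qexp` truncates the series after a fixed number of terms and `order` compares powers with 1 up to a tolerance. Both names, as well as `qmul`, are hypothetical helpers, and quaternions are 4-tuples (real, I, J, K).

```python
import math

def qmul(p, q):
    """Hamilton product; (a, b, c, d) stands for a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qexp(a, terms=40):
    """Partial sum of the series 1 + a + a^2/2! + ... for a quaternion a."""
    total = (1.0, 0.0, 0.0, 0.0)
    term = (1.0, 0.0, 0.0, 0.0)
    for n in range(1, terms):
        term = tuple(c / n for c in qmul(term, a))   # a^n / n!
        total = tuple(s + c for s, c in zip(total, term))
    return total

def order(g, max_order=1000, tol=1e-9):
    """Smallest k >= 1 with g^k equal to 1 up to tol, or None if none is found."""
    power = g
    for k in range(1, max_order + 1):
        if max(abs(c - e) for c, e in zip(power, (1.0, 0.0, 0.0, 0.0))) < tol:
            return k
        power = qmul(power, g)
    return None

if __name__ == "__main__":
    x = (0.0, 1 / 3, 2 / 3, 2 / 3)   # an imaginary unit
    for n in (2, 5, 12):
        g = qexp(tuple(2 * math.pi / n * c for c in x))
        print(n, order(g))
```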
27.2 3D spinors
Let us play more with the action of U (H) on H0 given by a · x = axa−1 .
In the last lecture we have realized that the orbit map of this action is a
Hopf fibration. Now we would like to study the action map.
$R^x_{2\theta}(w) = awa^{-1}$ for each $w \in H_0$ by Lemma 26.10. Hence, the
action map is a group homomorphism $\phi : U(H) \to SO(H_0)$.
If $A \in SO(H_0) \cong SO_3(R)$, the characteristic polynomial $\chi_A(z)$ has
degree 3, so A has a real eigenvalue λ with an eigenvector v. Since ||Av|| =
||v||, λ = ±1. Moreover, A preserves the orthogonal complement $v^\perp$ and
$A|_{v^\perp}$ is orthogonal. If λ = 1 then $\det(A|_{v^\perp}) = 1$ and
$A|_{v^\perp}$ is a rotation. Thus, A is a rotation $R^v_\theta$ by some angle θ.
If λ = −1 then $\det(A|_{v^\perp}) = -1$ and $A|_{v^\perp}$ is a reflection. Thus,
A has two more eigenvectors, including an eigenvector w with eigenvalue 1.
Hence, A is a rotation $R^w_\pi$. By Lemma 26.10, every rotation $R^v_\theta$
about a unit vector v is the image of cos(θ/2) + v sin(θ/2). It follows that
$\phi : U(H) \to SO(H_0)$ is surjective.
Finally, a = cos θ + x sin θ is in the kernel of φ if and only if
$R^x_{2\theta}(w) = awa^{-1} = w$ for each $w \in H_0$ if and only if 2θ = 2nπ if
and only if θ = nπ if and only if a = ±1. □
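The two-to-one nature of the action map can be observed on explicit matrices. The sketch below writes the map w ↦ a w a^{-1} on H0 as a 3x3 real matrix in the basis I, J, K and lets one compare the matrices of a and −a. The names `qmul`, `rotation_matrix` and `det3` are hypothetical, and the code assumes that a has norm 1.

```python
import math

def qmul(p, q):
    """Hamilton product; (a, b, c, d) stands for a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def rotation_matrix(a):
    """Matrix of w -> a w a^{-1} on H0 in the basis I, J, K (a of norm 1)."""
    a_inv = (a[0], -a[1], -a[2], -a[3])
    images = []
    for e in ((0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)):
        image = qmul(qmul(a, e), a_inv)
        images.append(image[1:])       # drop the real part, which is zero
    # images[j][i] is the i-th coordinate of the image of the j-th basis vector
    return [[images[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

if __name__ == "__main__":
    n = math.sqrt(30.0)
    a = (1 / n, 2 / n, 3 / n, 4 / n)
    minus_a = tuple(-c for c in a)
    print(rotation_matrix(a))
    print(rotation_matrix(minus_a))    # the same matrix
    print(det3(rotation_matrix(a)))    # 1 up to rounding
```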
Using the action homomorphism any SO3 (R)-set becomes a U (H)-set.
The opposite is not true: a U (H)-set is an SO3 (R)-set if and only if −1 lies
in the kernel of the action. A 3D-spinor is an element of a U (H)-set25 which
is not an SO3 (R)-set.
Examples. 1. Any quaternion x ∈ H is a spinor. The action is left multi-
plication in H: a · x = ax.
2. Physicists like illustrating the spinors using the following device.
Let A be the set of all positions of this device such that the centre of
the cork remains in the centre of the cube. We say that two positions are
equivalent if one can be moved to another by adjusting elastic thread only.
The quotient set X = A/ ∼ is a U (H)-set: an element a acts on X by
25
Wikipedia (http://en.wikipedia.org/wiki/Spinor) defines a spinor as an element of a
representation, a U (H)-vector space rather than merely a U (H)-set. This difference is
immaterial: a vector space is a set and any set is a subspace of a vector space, the space
of formal linear combinations of the elements.
φ(a) rotating the cork. It is not an SO3 (R)-set because a 360-degree rotation
tangles the thread. It is less obvious that a 720-degree rotation does not
tangle the thread (see http://www.youtube.com/watch?v=O7wvWJ3-t44).
3. A variation of example 2 is Feynman dance:
              |  $x^l$          $x^l J$
    ----------+----------------------------
     $x^k$    |  $x^{k+l}$      $x^{k+l} J$
     $x^k J$  |  $x^{k-l} J$    $x^{k-l+n}$
Nevertheless, these groups are not isomorphic. For instance, $BD_4 \cong C_4$
while $D_4 \cong K_4$ or $BD_8 \cong Q_8$.
structure of SO(H). An element x ∈ U (H) gives rise to a left scroll Lx (y) =
xy and a right scroll Rx (y) = yx−1 . Right and left scrolls commute: Lx Ry =
Ry Lx . Finally, every element of SO(H) is a composition of a left and a
right scroll (surjectivity of φ in Theorem 27.6).
27.5 Regular polytopes
You probably know that there exist 5 regular polyhedra (3D-polytopes),
often called platonic solids. This is in drastic contrast with regular
polygons (2D-polytopes), of which there are infinitely many. What about
regular nD-polytopes? The answer is surprising: if n ≥ 5 there are 3 regular
nD-polytopes, but there are 6 regular 4D-polytopes26 . Our aim is to sketch
the construction of the higher dimensional regular polytopes.
Let us start with the three that exist in any dimension. The n-hypercube is
the easiest one to imagine: its $2^n$ vertices have coordinates
(±1, ±1, . . . , ±1). The n-hypercube is the convex hull of them.
The next one is the n-simplex: it is the convex hull of n + 1 pairwise
equidistant points. The 2-simplex is a regular triangle and the 3-simplex is
a regular tetrahedron.
The last universal one is the n-orthoplex. It is the dual27 polytope of the
n-cube. In another language, the 2n vertices of the n-orthoplex are the
centres of the (n − 1)-dimensional faces of an n-cube. The n-orthoplex itself
is the convex hull of its vertices. The 4-orthoplex (often called the
16-cell) has a particularly nice structure: its vertices are the elements of
the group $BD_8 = Q_8$!
This gives us an idea: take a finite group G ⊆ U (H) and consider its
convex hull. The resulting 4-polytope is bound to have a high degree of
symmetry: left and right scrolls with respect to the elements of G are
symmetries of the resulting polytope. Unfortunately, no other $BD_{4n}$ gives a
regular solid. It is instructive to realize why the hull of $BD_{16}$ is not a
4-cube: the 2D-faces of a 4-cube are squares while two of the 2D-faces of the
hull of $BD_{16}$ are octagons.
The key is to find more finite subgroups. One gets them by lifting
rotational symmetries of other platonic solids. The first platonic solid is
the tetrahedron. The group of its rotational symmetries has order 12. Its
inverse image in U (H) is called the binary tetrahedral group:
$$BT = \langle\, e^{(I+J+K)\pi/3\sqrt{3}} = \frac{1+I+J+K}{2},\ e^{(I-J+K)\pi/3\sqrt{3}},\ I \,\rangle \cong \langle a, b, c \mid a^3 = b^3 = c^2 = abc \rangle$$
26
Wikipedia has numerous interlinked pages (http://en.wikipedia.org/wiki/Regular polytope)
with a wealth of information related to this lecture.
27
$X^* = \{ v \mid \forall a \in X\ \langle a, v \rangle \le 1 \}$
Observe that BT has 24 elements but it is not isomorphic to $S_4$ (again BT
has only 1 element of order 2). In fact, BT is isomorphic to $SL_2(Z_3)$. It is
more essential for us that the convex hull of BT is a new self-dual regular
4D-polytope, called the 24-cell. Self-duality manifests in the fact that it
has 24 vertices and 24 3D-faces.
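The claims about the size of BT and its elements of order 2 can be confirmed by brute force in exact arithmetic. The sketch below closes the three generators under multiplication, using the fact that $e^{\theta x} = \cos\theta + x\sin\theta$ for an imaginary unit x, so the two exponentials are (1 + I + J + K)/2 and (1 + I − J + K)/2. The names `qmul`, `generate`, `GENS`, `BT` and `order_two` are hypothetical, and quaternions are 4-tuples of `Fraction` values (real, I, J, K).

```python
from fractions import Fraction

def qmul(p, q):
    """Hamilton product; (a, b, c, d) stands for a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

ONE = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))

def generate(gens):
    """Closure of gens under multiplication; terminates only for a finite group."""
    group = {ONE}
    frontier = [ONE]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = qmul(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

HALF = Fraction(1, 2)
GENS = [
    (HALF, HALF, HALF, HALF),                                # (1 + I + J + K)/2
    (HALF, HALF, -HALF, HALF),                               # (1 + I - J + K)/2
    (Fraction(0), Fraction(1), Fraction(0), Fraction(0)),    # I
]

BT = generate(GENS)
# Elements g != 1 with g * g = 1.
order_two = [g for g in BT if g != ONE and qmul(g, g) == ONE]

if __name__ == "__main__":
    print(len(BT))
    print(order_two)
```

The elements found are the 8 quaternions ±1, ±I, ±J, ±K together with the 16 quaternions (±1 ± I ± J ± K)/2, which are the vertices of the 24-cell.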
The cube and the octahedron are dual to each other. They have the same
rotational symmetry group of order 24 (in fact it is isomorphic to $S_4$). Its
inverse image in U (H) is called the binary octahedral group:
$$BO = \langle\, e^{I\pi/4} = \frac{1+I}{\sqrt{2}},\ e^{(I+J+K)\pi/3\sqrt{3}} = \frac{1+I+J+K}{2},\ I \,\rangle \cong \langle a, b, c \mid a^4 = b^3 = c^2 = abc \rangle$$
This group of order 48 is unusual in many respects. In particular, it has no
other notable description as a group. Its convex hull fails to be a regular
solid but has some interesting properties.
We are left with the icosahedron and the dodecahedron, which are dual to each
other. Their rotational symmetry group has order 60 (in fact it is isomorphic
to $A_5$). Its inverse image in U (H) is called the binary icosahedral group:
$$BI = \langle\, e^{(I\cos\pi/3 + K\sin\pi/3)\pi/5},\ e^{(I\cos\pi/5 + K\sin\pi/5)\pi/3},\ I \,\rangle \cong \langle a, b, c \mid a^5 = b^3 = c^2 = abc \rangle$$
This group of order 120 is isomorphic to $SL_2(Z_5)$. The convex hull of BI
is a new regular 4D-polytope, called the 600-cell. Its 600 3D-faces are
regular tetrahedra.
The remaining regular 4D-polytope is the dual of 600-cell. It is called
120-cell. Its 120 3D-faces are dodecahedra.
27.6 Exercises
(i) Prove that if ab = ba for some a, b ∈ H then there exists an imagi-
nary unit x ∈ U (H) ∩ H0 such that a, b ∈ R(x) (cf. Exercise 26.6(i)).
(ii) Prove that if AB = BA for some $A, B \in M_n(C)$ then28
$e^{A+B} = e^A e^B$.
(iii) Show that $JxJ^{-1} = x^{-1}$ if $x = e^{\pi I/n}$.
(iv) Prove Lemma 27.7.
(v) Using Lemma 27.7 prove that Sx Sy = φ(xy, yx).
(vi) Prove that the group of rotational symmetries of a tetrahedron is
isomorphic to A4 .
28
A general formula $e^{A+B} = F(e^A, e^B)$ is called the Baker-Campbell-Hausdorff formula.
27.7 Vista
There is no vista for you now: your limit is the sky now.
On a more serious note, if you were thinking of writing an essay29 on
quaternions, you can still do it. You can expand the last section where
no proofs were given. You can also discuss integer quaternions or the
4-square theorem. Another alternative is to describe quaternions by applying
the Cayley process to complex numbers. You can apply the Cayley process to
quaternions to obtain octonions, the so-called Cayley numbers. They are still
a division algebra, albeit nonassociative. Another direction is Clifford
algebras and spinors in dimension n.
Certainly, you can write an essay on Physics or Geometry as we have
not touched any of those issues.
29
Consult On quaternions and octonions by Conway and Smith or Regular Polytopes by
Coxeter.