
ALGEBRA II: RINGS AND MODULES.

KEVIN MCGERTY.
1. RINGS
The central characters of this course are algebraic objects known as rings. A ring is any mathematical structure where you can add and multiply, and can be thought of as generalising the properties of Z, the integers. Formally speaking we have:

Definition 1.1. A ring is a datum (R, +, ·, 0, 1) where R is a set, 1, 0 ∈ R and +, · are binary operations on R such that
(1) R is an abelian group under + with identity element 0.
(2) The binary operation · is associative and 1 · x = x · 1 = x for all x ∈ R.[1]
(3) Multiplication distributes over addition:

    x · (y + z) = (x · y) + (x · z),
    (x + y) · z = (x · z) + (y · z),    for all x, y, z ∈ R.

Just as for multiplication of real numbers or integers, we will tend to suppress the symbol for the operation ·, and write x.y or simply xy. If the operation · is commutative (i.e. if x.y = y.x for all x, y ∈ R) then we say R is a commutative ring[2]. Sometimes people consider rings which do not have a multiplicative identity[3]. We won't. It is also worth noting that some texts require an additional axiom asserting that 1 ≠ 0. In fact it is easy to see from the other axioms that if 1 = 0 then the ring has only one element. We will refer to this ring as the zero ring; it is a somewhat degenerate object, but it seems unnecessary to me to exclude it.
Example 1.2.
i) The integers Z form the fundamental example of a ring. In some sense much of the course will be about finding an interesting class of rings which behave a lot like Z. Similarly if n ∈ Z then Z/nZ, the integers modulo n, forms a ring with the usual addition and multiplication.
ii) The subset Z[i] = {a + ib ∈ C : a, b ∈ Z} is easily checked to be a ring under the normal operations of addition and multiplication of complex numbers. It is known as the ring of Gaussian integers. We shall see later that it shares many properties with the ring Z of ordinary integers.
iii) Any field, e.g. Q, R, C, is a ring: the only difference between the axioms for a field and for a ring is that in the case of a ring we do not require the existence of multiplicative inverses (and that, for fields, one insists that 1 ≠ 0, so that the smallest field has two elements).
iv) If k is a field, and n ∈ N, then the set M_n(k) of n × n matrices with entries in k is a ring, with the usual addition and multiplication of matrices.
v) Saying the previous example in a slightly more abstract way: if V is a vector space over a field k, then End(V), the space of linear maps from V to V, is a ring. In this case the multiplication is given by composition of linear maps, and hence is not commutative. We will mostly focus on commutative rings in this course.
vi) Example iv) also lets us construct new rings from old, in that there is no need to start with a field k: given any ring R, the set M_n(R) of n × n matrices with entries in R is again a ring.
vii) Polynomials in any number of indeterminates form a ring: if we have n variables t_1, t_2, . . . , t_n and k is a field, then we write k[t_1, . . . , t_n] for the ring of polynomials in the variables t_1, . . . , t_n with coefficients in k.
Date: March, 2014.
1
That is, R is a monoid under with identity element 1 if you like collecting terminology.
2
We will try and use the letter R as our default symbol for a ring, in some books the default letter is A. This is the fault of the French,
as you probably guess.
3
In some texts they use the (rather hideous) term rng for such an object.
viii) Just as in vi), there is no reason the coefficients of our polynomials have to lie in a field: if R is a ring, we can build a new ring R[t] of polynomials in t with coefficients in R in the obvious way. What is important to note in both this and the previous example is that polynomials are no longer functions: given a polynomial f ∈ R[t] we may evaluate it at an r ∈ R and thus associate to f a function from R to R, but this function may not determine f. For example, if R = Z/2Z then clearly there are only finitely many functions from R to itself, but R[t] still contains infinitely many polynomials. See below for more details on polynomial rings.
ix) If we have two rings R and S, then we can form the direct sum of the rings R ⊕ S: this is the ring whose elements are pairs (r, s) where r ∈ R and s ∈ S, with addition and multiplication given componentwise.
x) Another way to construct new rings from old is to consider, for a ring R, functions taking values in R. The simplest example of this is R^n = {(a_1, . . . , a_n) : a_i ∈ R}, where we add and multiply coordinatewise. This is just[4] the ring of R-valued functions on the set {1, 2, . . . , n}. We can generalise this and consider, for any set X, the set R^X = {f : X → R} of functions from X to R, and make it a ring by adding and multiplying values (exactly as we define the sum of two R- or C-valued functions).
xi) To make the previous example more concrete, the set of all functions f : R → R is a ring. Moreover, the set of all continuous (or differentiable, infinitely differentiable, ...) functions also forms a ring, by standard algebra of limits results.
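The point in viii), that a polynomial over Z/2Z is not determined by the function it defines, can be checked directly. In the sketch below (the helper name `evaluate` is mine, not from the notes) the distinct polynomials t and t² induce the same function on Z/2Z:

```python
# Polynomials over Z/nZ represented as coefficient lists, index = degree.
def evaluate(coeffs, x, n=2):
    """Evaluate a polynomial with coefficients in Z/nZ at x (mod n)."""
    return sum(c * pow(x, i, n) for i, c in enumerate(coeffs)) % n

t = [0, 1]             # the polynomial t
t_squared = [0, 0, 1]  # the polynomial t^2

# As polynomials the coefficient sequences differ...
print(t != t_squared)  # True
# ...but they induce the same function Z/2Z -> Z/2Z, since 0^2 = 0 and 1^2 = 1.
print(all(evaluate(t, x) == evaluate(t_squared, x) for x in range(2)))  # True
```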
Definition 1.3. If R is a ring, a subset S ⊆ R is said to be a subring if it inherits the structure of a ring from R: thus we must have 0, 1 ∈ S, and moreover S must be closed under the addition and multiplication operations of R, so that in particular (S, +) is a subgroup of (R, +).

For example, the integers Z are a subring of Q, and the ring of differentiable functions from R to itself is a subring of the ring of all functions from R to itself. The ring of Gaussian integers is a subring of C, as are Q and R (the latter two being fields of course). Recall that for a group G containing a subset H, the subgroup criterion says that H is a subgroup if and only if it is nonempty and whenever h_1, h_2 ∈ H we have h_1 h_2^{−1} ∈ H (here I'm writing the group operation on G multiplicatively). We can use this to give a similar criterion for a subset of a ring to be a subring.
Lemma 1.4 (Subring criterion). Let R be a ring and S a subset of R. Then S is a subring if and only if 1 ∈ S and for all s_1, s_2 ∈ S we have s_1 − s_2, s_1 s_2 ∈ S.

Proof. The condition that s_1 − s_2 ∈ S for all s_1, s_2 ∈ S implies that S is an additive subgroup by the subgroup test (note that as 1 ∈ S we know that S is nonempty). The other conditions for a subring hold directly. □
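As an illustration of how the subring criterion is applied, here is a finite spot-check that the Gaussian integers of Example 1.2 ii) satisfy its conditions inside C. A sample of elements cannot of course prove closure in general; this is only a sanity check, and the helper name is mine:

```python
# Spot-check of Lemma 1.4 for Z[i] inside C: 1 lies in the set, and a grid of
# sample elements is closed under subtraction and multiplication.
def is_gaussian_integer(z):
    return z.real == int(z.real) and z.imag == int(z.imag)

sample = [complex(a, b) for a in range(-3, 4) for b in range(-3, 4)]

assert is_gaussian_integer(complex(1, 0))  # 1 is in Z[i]
for s1 in sample:
    for s2 in sample:
        assert is_gaussian_integer(s1 - s2)  # closed under subtraction
        assert is_gaussian_integer(s1 * s2)  # closed under multiplication
```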
When studying any kind of algebraic object[5] it is natural to consider maps between those kinds of objects which respect their structure. For example, for vector spaces the natural class of maps is the linear maps, and for groups the natural class is the group homomorphisms. The natural class of maps to consider for rings is defined similarly:
Definition 1.5. A map f : R → S between rings R and S is said to be a (ring) homomorphism if
(1) f(1_R) = 1_S,
(2) f(r_1 + r_2) = f(r_1) + f(r_2),
(3) f(r_1.r_2) = f(r_1).f(r_2),
where strictly speaking we might have written +_R and +_S for the addition operations in the two different rings R and S, and similarly for the multiplication operations[6]. Apart from the fact that things then become hard to read, the required operation is always clear from context, so hopefully this (conventional) sloppiness in notation will not bother anyone. Note that it follows from (2) that f(0) = 0.
[4] Recall, for example, that sequences of real numbers are defined to be functions a : N → R; we just tend to write a_n for the value of a at n (and refer to it as the n-th term) rather than a(n).
[5] Or more generally any mathematical structure: if you're taking Topology this term then continuous maps are the natural maps to consider between topological spaces; similarly in Integration you consider measurable functions. Loosely speaking, you want to consider maps which play nicely with the structures your objects have, be that a topology, a vector space structure, a ring structure or a measure.
[6] Though since I've already decided to suppress the notation for multiplication, it's hard to distinguish the two when you suppress both...
It is easy to see that the image of a ring homomorphism f : R → S, that is, {s ∈ S : ∃ r ∈ R, f(r) = s}, is a subring of S. If it is all of S we say f is surjective, and f : R → S is an isomorphism if there is a homomorphism g : S → R such that f ∘ g = id_S and g ∘ f = id_R. It is easy to check that f is an isomorphism if and only if it is a bijection (that is, the set-theoretic inverse of f is automatically a ring homomorphism; you probably did a similar check for linear maps between vector spaces before).
Example 1.6.
i) For each positive integer n, there is a natural map from Z to Z/nZ which just takes an integer to its equivalence class modulo n. The standard calculations that check modular arithmetic makes sense exactly show that this map is a ring homomorphism.
ii) Let V be a k-vector space and let α ∈ End_k(V). Then Φ : k[t] → End_k(V) given by Φ(∑_{i=0}^{n} a_i t^i) = ∑_{i=0}^{n} a_i α^i is a ring homomorphism. Ring homomorphisms of this type will reveal the connection between the study of the ring k[t] and linear algebra. (In a sense you saw this last term when defining things like the minimal polynomial of a linear map, but we will explore this more fully in this course.)
iii) Obviously the inclusion map i : S → R of a subring S into a ring R is a ring homomorphism.
iv) Let A = { [ a  −b ; b  a ] : a, b ∈ R }. It is easy to check that A is a subring of Mat_2(R). The map φ : C → A given by a + ib ↦ [ a  −b ; b  a ] is a ring isomorphism. (This homomorphism arises by sending a complex number z to the map of the plane to itself given by multiplication by z.)
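Example 1.6 iv) can be verified numerically. The sketch below (the function names `phi` and `mat_mul` are mine) checks that sending a + ib to the matrix [ a  −b ; b  a ] respects addition, multiplication and the identity:

```python
# phi sends a + ib to the 2x2 real matrix [[a, -b], [b, a]]; complex
# arithmetic then matches matrix arithmetic.
def phi(z):
    a, b = z.real, z.imag
    return [[a, -b], [b, a]]

def mat_mul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

z, w = complex(1, 2), complex(3, -1)
assert phi(z * w) == mat_mul(phi(z), phi(w))   # multiplicative
assert phi(z + w) == [[phi(z)[i][j] + phi(w)[i][j] for j in range(2)]
                      for i in range(2)]       # additive
assert phi(complex(1, 0)) == [[1, 0], [0, 1]]  # sends 1 to the identity
```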
The first of the above examples has an important generalisation which shows that any ring R in fact has a smallest subring: for n ∈ Z_{≥0} set n_R = 1 + 1 + · · · + 1 (that is, 1 added to itself n times), and for n a negative integer set n_R = −(−n)_R. The problem sheet asks you to check that {n_R : n ∈ Z} is a subring of R, and indeed that the map n ↦ n_R gives a ring homomorphism φ : Z → R. Since a ring homomorphism is in particular a homomorphism of the underlying abelian groups under addition, using the first isomorphism theorem for abelian groups we see that {n_R : n ∈ Z}, as an abelian group, is isomorphic to Z/dZ for some d ∈ Z_{≥0}. Since any subring S of R contains 1, and hence, since it is closed under addition and additive inverses, contains n_R for all n ∈ Z, we see that S contains the image of φ, so that this image is indeed the smallest subring of R.

Definition 1.7. The integer d defined above is called the characteristic of the ring R.
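For a concrete instance, the characteristic of Z/nZ can be computed exactly as in the discussion above, by adding 1 to itself until 0 is reached (a small illustrative sketch; the helper name is mine):

```python
# Brute-force characteristic of Z/nZ: the least d > 0 with d_R = 0.
# (For R = Z the loop would never terminate, matching char(Z) = 0.)
def characteristic_of_Zn(n):
    total, d = 1 % n, 1
    while total != 0:
        total = (total + 1) % n
        d += 1
    return d

assert characteristic_of_Zn(6) == 6
assert characteristic_of_Zn(7) == 7
```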
Remark 1.8. The remark above, that in general polynomials with coefficients in a ring cannot always be viewed as functions, might have left you wondering what such a polynomial actually is. In other words, what do we mean when we say k[t] is a ring, where t is an "indeterminate"? The answer is a bit like the definition of the complex numbers: to construct them you just define an addition and a multiplication on R^2 and then check that the definitions do what you want them to, so that (0, 1) becomes i. For polynomials, we just start with the sequence of coefficients, and do the same thing. More precisely, if R is a ring, consider the set R[t] of sequences[7] (a_n)_{n∈N} where all but finitely many of the a_n are zero, that is, there is some N ∈ N such that a_n = 0 for all n ≥ N. Then if (a_n), (b_n) are two such sequences, define

    (a_n) + (b_n) = (a_n + b_n);
    (a_n).(b_n) = (∑_{k=0}^{n} a_k b_{n−k}).

It is easy to see that if a_n = 0 for all n ≥ N and b_n = 0 for all n ≥ M, then a_n + b_n = 0 if n ≥ max{N, M}, while ∑_{k=0}^{n} a_k b_{n−k} = 0 if n ≥ N + M, since then at least one of a_k or b_{n−k} must be zero (otherwise k < N and n − k < M, so that n = k + (n − k) < N + M). It is then routine to check that R[t] forms a ring. The indeterminate t is just the sequence (0, 1, 0, . . .). In fact it is easy to check that t^n = (0, . . . , 0, 1, 0, . . .), where the 1 is in position n, and thus if (a_n) is a sequence as above with a_n = 0 for all n > N, then (a_n) = ∑_{n=0}^{N} a_n t^n. Note also that there is a natural inclusion i : R → R[t] which sends r ∈ R to the sequence (a_n) where a_0 = r and a_n = 0 for all n > 0, which is readily checked to be a ring homomorphism (it is the map which views an element of R as a constant polynomial).
In fact, the set of all sequences (a_n)_{n∈N} forms a ring with the same definitions of addition and multiplication. This is known as the ring of formal power series in t, and is denoted R[[t]]. (The name comes from the fact that we view elements of R[[t]] as infinite sums ∑_{n≥0} a_n t^n.) Perhaps surprisingly, it turns out that, say, C[[t]] has a simpler structure in many ways than C[t].

[7] In this course, N will denote the non-negative integers, unless it's obviously supposed to denote the positive integers.
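The sequence definition of R[t] is easy to model directly. In the sketch below (with R = Z and my own helper name `poly_mul`), a polynomial is a finite list of coefficients and multiplication is the convolution formula above:

```python
# R[t] for R = Z: a polynomial is a finitely supported coefficient sequence,
# and (a_n).(b_n) = (sum_{k=0}^{n} a_k b_{n-k}) is convolution.
def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

t = [0, 1]  # the indeterminate t = (0, 1, 0, ...)
assert poly_mul(t, t) == [0, 0, 1]              # t^2 has its 1 in position 2
assert poly_mul([1, 1], [1, -1]) == [1, 0, -1]  # (1 + t)(1 - t) = 1 - t^2
```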
One of the basic properties of polynomial rings is that they have natural "evaluation" homomorphisms: to specify a homomorphism from a polynomial ring R[t] to a ring S you only need to say what happens to the elements of R (the coefficients) and what happens to t. We formalise this in the following lemma.

Lemma 1.9 (Evaluation homomorphisms). Let R, S be rings and θ : R → S a ring homomorphism. If s ∈ S then there is a unique ring homomorphism Θ : R[t] → S such that Θ ∘ i = θ (where i : R → R[t] is the inclusion of R into R[t]) and Θ(t) = s.
Proof. Any element of R[t] has the form ∑_{i=0}^{n} a_i t^i (a_i ∈ R); hence if Θ is any homomorphism satisfying Θ ∘ i = θ and Θ(t) = s, we see that

    Θ(∑_{i=0}^{n} a_i t^i) = ∑_{i=0}^{n} Θ(a_i t^i) = ∑_{i=0}^{n} Θ(a_i) Θ(t)^i = ∑_{i=0}^{n} θ(a_i) s^i.

Hence Θ is uniquely determined. To check there is indeed such a homomorphism, we just have to check that the function Θ(∑_{i=0}^{n} a_i t^i) = ∑_{i=0}^{n} θ(a_i) s^i is indeed a homomorphism, but this is straightforward from the definitions. □
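As a sanity check of the evaluation homomorphism, one can take R = Q, S = M_2(Q), and θ the map sending a scalar to the corresponding scalar matrix, and verify on examples that evaluation at a fixed matrix s is multiplicative. This is an illustrative sketch, not part of the notes; all helper names are mine:

```python
# Evaluation of polynomials over Q at a 2x2 rational matrix s, checking
# Theta(f.g) = Theta(f).Theta(g) on an example.
from fractions import Fraction

def mat_mul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(m, n):
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]

def evaluate(coeffs, s):
    """Theta(sum a_i t^i) = sum a_i s^i, computed by Horner's rule."""
    result = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
    for a in reversed(coeffs):
        result = mat_mul(result, s)
        result = mat_add(result, [[a, Fraction(0)], [Fraction(0), a]])
    return result

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

s = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(1)]]
f = [Fraction(1), Fraction(2)]                 # 1 + 2t
g = [Fraction(-1), Fraction(0), Fraction(1)]   # -1 + t^2
# The evaluation map is multiplicative: Theta(f.g) = Theta(f).Theta(g).
assert evaluate(poly_mul(f, g), s) == mat_mul(evaluate(f, s), evaluate(g, s))
```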
2. BASIC PROPERTIES

From now on, unless we explicitly state otherwise, all rings will be assumed to be commutative.

Now that we have seen some examples of rings, we will discuss some basic properties of rings and their elements. Note that it is a routine exercise[8] in axiom grubbing to check that, for any ring R, we have a.0 = 0 for all a ∈ R. The next definition records the class of rings for which this is the only case in which the product of two elements is zero.

Definition 2.1. If R is a ring, then an element a ∈ R\{0} is said to be a zero-divisor if there is some b ∈ R\{0} such that a.b = 0. A ring which is not the zero ring and has no zero-divisors is called an integral domain. Thus if a ring is an integral domain and a.b = 0, then one of a or b is equal to zero.

Another way to express the fact that a ring is an integral domain is to observe that it is exactly the condition which permits cancellation[9]: that is, if x.y = x.z then in an integral domain you can conclude that either y = z or x = 0. This follows immediately from the definition of an integral domain and the fact that x.y = x.z ⟺ x.(y − z) = 0, which follows from the distributive axiom.
Example 2.2. If R is a ring, then R² is again a ring, and (a, 0).(0, b) = (0, 0), so that (a, 0) and (0, b) are zero-divisors whenever a and b are nonzero. The (noncommutative) ring of n × n matrices M_n(k) for a field k also has lots of zero-divisors, even though the field k does not. The integers modulo n have zero-divisors whenever n is not prime.

On the other hand, it is easy to see that a field has no zero-divisors. The integers Z are an integral domain (and not a field). Slightly more interestingly, if R is an integral domain, then R[t] is again an integral domain. Moreover, the same is true of R[[t]].

Exercise 2.3. Show that if R is an integral domain then R[t] is also.
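The zero-divisors of Z/nZ mentioned in Example 2.2 can be enumerated by brute force (a quick illustrative sketch):

```python
# Zero-divisors in Z/nZ: nonzero classes a with a.b = 0 mod n for some
# nonzero b. They exist exactly for composite n (and never for prime n).
def zero_divisors(n):
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

print(zero_divisors(12))  # [2, 3, 4, 6, 8, 9, 10]
print(zero_divisors(7))   # []
```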
Recall the characteristic of a ring, defined in the last lecture.

Lemma 2.4. Suppose that R is an integral domain. Then any subring S of R is also an integral domain. Moreover char(R), the characteristic of R, is either zero or a prime p ∈ Z.

Proof. It is clear from the definition that a subring of an integral domain must again be an integral domain. Now from the definition of the characteristic of a ring, if char(R) = n > 0 then Z/nZ is a subring of R. Clearly if n = a.b where a, b ∈ Z are both greater than 1, then a_R.b_R = 0 in R with neither a_R nor b_R zero, thus both are zero-divisors. It follows that if R is an integral domain then char(R) is zero or a prime. □

[8] It's a good idea to try to check that the axioms for a ring do indeed imply that you can perform the standard algebraic manipulations you are used to, so that things like 0.x = 0 hold in any ring. None of the checks you have to do are very exciting, so it's best to pick a few such statements. One operation you have to be careful about, however, is cancellation (but then again you should already be aware of this from matrix algebra).
[9] Except for the assertion that the ring is not the zero ring, the zero ring having cancellation vacuously.
Note that in particular, the characteristic of a field is always zero or a prime.

Recall that in a ring we do not require that nonzero elements have a multiplicative inverse[10]. Nevertheless, because the multiplication operation is associative and there is a multiplicative identity, the elements which happen to have multiplicative inverses form a group:

Definition 2.5. Let R be a ring. The subset

    R^× = {r ∈ R : ∃ s ∈ R, r.s = 1}

is called the group of units in R; it is a group under the multiplication operation with identity element 1.

Example 2.6. The units in Z form the group {±1}. On the other hand, if k is a field, then the units are k^× = k\{0}. If R = M_n(k) then the group of units is GL_n(k).

In our example of Z/nZ, notice that this ring either has zero-divisors (when n is composite) or is a field (when n is prime). In fact this dichotomy holds more generally.
Lemma 2.7. Let R be an integral domain which has finitely many elements. Then R is a field.

Proof. We need to show that if a ∈ R\{0} then a has a multiplicative inverse, that is, we need to show there is a b ∈ R with a.b = 1. Consider the map m_a : R → R given by left multiplication by a, so that m_a(x) = a.x. We claim that m_a is injective: indeed if m_a(x) = m_a(y) then we have

    a.x = a.y ⟹ a.(x − y) = 0,

and since R is an integral domain and a ≠ 0, it follows that x − y = 0, that is, x = y. But now since R is finite, an injective map must be surjective, and hence there is some b ∈ R with m_a(b) = 1, that is, a.b = 1 as required. □

Remark 2.8. Note that the argument in the proof which shows that multiplicative inverses exist does not use the assumption that the ring R is commutative (we only need that in order to conclude that R is a field). A noncommutative ring in which every nonzero element has an inverse is called a division ring (or sometimes a skew field). Perhaps the most famous example of a division ring is the ring of quaternions.
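The counting argument in the proof of Lemma 2.7 is effective, and can be run directly for R = Z/pZ (an illustrative sketch; the function name is mine):

```python
# For a != 0 in Z/pZ the multiplication map m_a is injective, hence
# surjective on a finite set, so some b satisfies a.b = 1.
def inverse_via_surjectivity(a, p):
    image = [(a * x) % p for x in range(p)]  # the map m_a, as a list
    assert len(set(image)) == p              # m_a injective, hence surjective
    return image.index(1)                    # the b with a.b = 1

p = 11
for a in range(1, p):
    b = inverse_via_surjectivity(a, p)
    assert (a * b) % p == 1
```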
2.1. The field of fractions. If R is an integral domain which is infinite, it does not have to be a field (e.g. consider the integers Z). However, generalising the construction of the rational numbers from the integers, we may build a field F(R) from R by, loosely speaking, taking ratios: the elements of F(R) are "fractions" a/b where a, b ∈ R and b ≠ 0, where we multiply in the obvious way and add by taking common denominators. The field F(R) will have the property that it is, in a sense we will shortly make precise, the smallest field into which you can embed the integral domain R.

To do this a bit more formally, define a relation ∼ on R × R\{0} by setting (a, b) ∼ (c, d) if a.d = b.c (to see where this comes from, note that it expresses the equation a/b = c/d without using division).
Lemma 2.9. The relation ∼ is an equivalence relation.

Proof. The only thing which requires work to check is that the relation is transitive. Indeed suppose that (a, b) ∼ (c, d) and (c, d) ∼ (e, f). Then we have ad = bc and cf = de, and we need to check that (a, b) ∼ (e, f), that is, af = be. But this holds since

    af − be = 0 ⟺ d.(af − be) = 0 ⟺ (ad).f − b.(de) = 0 ⟺ (bc).f − b.(cf) = 0,

and the last statement clearly holds (note that in the first equivalence we used the fact that R is an integral domain and that d ≠ 0). □

Write a/b for the equivalence class of the pair (a, b), and denote the set of equivalence classes by F(R).
Lemma 2.10. The binary operations (R × R\{0}) × (R × R\{0}) → R × R\{0} given by:

    ((a, b), (c, d)) ↦ (ad + bc, bd),
    ((a, b), (c, d)) ↦ (ac, bd),

induce binary operations on F(R).

[10] As noted above, the axioms for a ring imply that 0.x = 0 for all x ∈ R; thus the additive identity cannot have a multiplicative inverse, hence the most we can ask for is that every element of R\{0} has one, and this is exactly what you demand in the axioms for a field.
Proof. Note first that since R is an integral domain and b, d are nonzero, bd is also nonzero, hence the above formulas do indeed define binary operations on R × R\{0}. To check that they induce binary operations on F(R) we need to check that the equivalence class of the pair on the right-hand side depends only on the equivalence classes of the two pairs on the left-hand side. We check this for the first operation (the second one being similar but easier).

Suppose that (a_1, b_1) ∼ (a_2, b_2) and (c_1, d_1) ∼ (c_2, d_2), so that a_1 b_2 = a_2 b_1 and c_1 d_2 = c_2 d_1. Then we need to show that (a_1 d_1 + b_1 c_1, b_1 d_1) ∼ (a_2 d_2 + b_2 c_2, b_2 d_2), which holds if and only if

    (a_1 d_1 + b_1 c_1)(b_2 d_2) = (a_2 d_2 + b_2 c_2)(b_1 d_1) ⟺ a_1 b_2 d_1 d_2 + b_1 b_2 c_1 d_2 = a_2 b_1 d_1 d_2 + b_1 b_2 c_2 d_1;

but a_1 b_2 = a_2 b_1, so a_1 b_2 d_1 d_2 = a_2 b_1 d_1 d_2, and c_1 d_2 = c_2 d_1, so b_1 b_2 c_1 d_2 = b_1 b_2 c_2 d_1, and we are done. □
Let + and · denote the binary operations which the first and second operations above induce on F(R). Thus we have

    a/b + c/d = (ad + bc)/bd,    (a/b) · (c/d) = ac/bd.
Theorem 2.11. The above formulas give well-defined addition and multiplication operations on F = F(R), the set of equivalence classes {a/b : a, b ∈ R, b ≠ 0}, and F is a field with respect to these operations, with additive identity 0/1 and multiplicative identity 1/1. Moreover there is a natural injective homomorphism θ : R → F(R) sending a ↦ a/1.

Proof. (Non-examinable.) One just has to check that the axioms for a field are satisfied. The ring axioms are routine to check by calculating in R × R\{0}. To see that F(R) is a field, note that (a, b) ∼ (0, 1) if and only if a = 0. Thus if a/b ≠ 0 then a ≠ 0, and so b/a ∈ F(R); by definition (a/b) · (b/a) = (a.b)/(a.b) = 1/1, so the multiplicative inverse of a/b is b/a.

The map a ↦ a/1 is certainly injective, since (a, 1) ∼ (b, 1) if and only if a.1 = b.1, that is, a = b. It is then immediate from the definitions that this map is a homomorphism, as required. □

Definition 2.12. The field F(R) is known as the field of fractions of R.

Remark 2.13. All of this may look a little formal, but it is really no more than you have to do to construct the rational numbers from the integers. You should think of it as no more or less difficult (or, to be fair, interesting) than that construction: essentially it just notices that all you needed to construct the rationals was the cancellation property, which is the defining property of integral domains.
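To make the construction concrete, here is a minimal model of F(Z), mirroring the definitions above: pairs are compared via the relation (a, b) ∼ (c, d) ⟺ ad = bc rather than by reducing to lowest terms. (The class name is mine; Python's fractions.Fraction plays the same role in the standard library.)

```python
# A toy field of fractions of Z: pairs (a, b), b != 0, with the addition and
# multiplication of Lemma 2.10 and equality given by the equivalence relation.
class Frac:
    def __init__(self, a, b):
        assert b != 0
        self.a, self.b = a, b
    def __eq__(self, other):  # (a, b) ~ (c, d) iff ad = bc
        return self.a * other.b == self.b * other.a
    def __add__(self, other):  # common denominators
        return Frac(self.a * other.b + self.b * other.a, self.b * other.b)
    def __mul__(self, other):
        return Frac(self.a * other.a, self.b * other.b)

half = Frac(1, 2)
assert half == Frac(2, 4)               # equivalent pairs are equal in F(Z)
assert half + Frac(1, 3) == Frac(5, 6)  # 1/2 + 1/3 = 5/6
assert half * Frac(2, 3) == Frac(1, 3)  # (1/2)(2/3) = 1/3
```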
Finally we make precise the sense in which F(R) is the smallest field containing R.

Proposition 2.14. Let k be a field and let θ : R → k be an embedding (that is, an injective homomorphism). Then there is a unique injective homomorphism Θ : F(R) → k extending θ (in the sense that Θ(a/1) = θ(a) for all a ∈ R).

Proof. (Non-examinable.) Suppose that f : F(R) → k is such a homomorphism. Then by assumption f(a/1) = θ(a), and since homomorphisms of rings respect multiplicative inverses this forces f(1/b) = θ(b)^{−1} for b ≠ 0. But then, again because f is supposed to be a homomorphism, we must have f(a/b) = f((a/1).(1/b)) = f(a/1).f(1/b) = θ(a).θ(b)^{−1}. Thus if f exists, it has to be given by this formula.

The rest of the proof consists of checking that this recipe indeed works: given (a, b) ∈ R × R\{0}, first define Θ₀(a, b) = θ(a).θ(b)^{−1}. Then it is easy to check that Θ₀ is constant on the equivalence classes of the relation defining F(R), so that it induces a map Θ : F(R) → k. Finally it is straightforward to see that this map is a homomorphism extending θ, as required. □

Remark 2.15. Notice that this proposition implies that any field k of characteristic zero contains a (unique) copy of the rationals. Indeed by definition of characteristic, the unique homomorphism from Z to k is an embedding, and the above proposition shows that it therefore extends uniquely to an embedding of Q into k, as claimed.
3. HOMOMORPHISMS AND IDEALS

From now on we will assume all our rings are commutative. In this section we study the basic properties of ring homomorphisms, and establish an analogue of the first isomorphism theorem which you have already seen for groups. Just as for homomorphisms of groups, homomorphisms of rings have kernels and images.

Definition 3.1. Let f : R → S be a ring homomorphism. The kernel of f is

    ker(f) = {r ∈ R : f(r) = 0},

and the image of f is

    im(f) = {s ∈ S : ∃ r ∈ R, f(r) = s}.

Just as for groups, the image of a homomorphism is a subring of the target ring. For kernels the situation is a little different. In the case of groups, kernels of homomorphisms are subgroups, but not every subgroup is a kernel: the kernels are characterised intrinsically by the property of being normal (i.e. preserved by the conjugation action of the group). We will show that the kernels of ring homomorphisms can similarly be characterised intrinsically, but the situation, because we have two binary operations, is slightly different: a kernel is both more and less than a subring. Indeed since homomorphisms are required to send 1 to 1, the kernel never contains 1 unless it is the entire ring, thus a kernel is not a subring. However, it is closed under addition and multiplication (as is straightforward to check) and, because 0.x = 0 for any x, it in fact obeys a stronger kind of closure with respect to multiplication:[11] if x ∈ ker(f) and r ∈ R is any element of R, then f(x.r) = f(x).f(r) = 0.f(r) = 0, so that x.r ∈ ker(f). This motivates the following definition:

Definition 3.2. Let R be a ring. A subset I ⊆ R is called an ideal if it is a subgroup of (R, +) and moreover for any a ∈ I and r ∈ R we have a.r ∈ I.

Lemma 3.3. If f : R → S is a homomorphism, then ker(f) is an ideal.

Proof. This is immediate from the definitions. □
3.1. Basic properties of ideals. Note that if I is an ideal of R which contains 1 then I = R. We will shortly see that in fact any ideal is the kernel of a homomorphism. First let us note a few basic properties of ideals:

Lemma 3.4. Let R be a ring, and I, J ideals in R. Then I + J, I ∩ J and IJ are ideals, where

    I + J = {i + j : i ∈ I, j ∈ J};    IJ = {∑_{k=1}^{n} i_k j_k : i_k ∈ I, j_k ∈ J, n ∈ N}.

Moreover we have IJ ⊆ I ∩ J and I, J ⊆ I + J.
Proof. For I + J it is clear that this is an abelian subgroup of R, while if i ∈ I, j ∈ J and r ∈ R, then r(i + j) = (r.i) + (r.j) ∈ I + J as both I and J are ideals; hence I + J is an ideal. Checking that I ∩ J is an ideal is similar but easier. To see that IJ is an ideal, note that it is clear that the sum of two elements of IJ is of the same form, and if ∑_{k=1}^{n} x_k y_k ∈ IJ then

    −∑_{k=1}^{n} x_k y_k = ∑_{k=1}^{n} (−x_k).y_k ∈ IJ,

since if x_k ∈ I then −x_k ∈ I. Thus IJ is an abelian subgroup. It is also straightforward to check the multiplicative condition. The containments are all clear once you note that if i ∈ I and j ∈ J then i.j is in I ∩ J, because both I and J are ideals. □
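For R = Z, where (as we will see) every ideal is principal, the constructions of Lemma 3.4 can be identified explicitly: (a) + (b) = (gcd(a, b)), (a) ∩ (b) = (lcm(a, b)) and (a)(b) = (ab). The sketch below compares these ideals inside a finite window of integers, so it is an illustration rather than a proof:

```python
# Ideals of Z restricted to a window [-N, N]: (g) is the set of multiples of g.
from math import gcd

a, b = 12, 18
N = 200

def ideal(g):
    return {x for x in range(-N, N + 1) if x % g == 0}

# I + J consists of all sums i + j; within the window it matches (gcd(a, b)).
sum_ideal = {i + j for i in ideal(a) for j in ideal(b) if abs(i + j) <= N}
assert sum_ideal == ideal(gcd(a, b))                     # I + J = (gcd)
assert ideal(a) & ideal(b) == ideal(a * b // gcd(a, b))  # I ∩ J = (lcm)
```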
In fact, given a collection of ideals {I_α : α ∈ A} in a ring R, their intersection ⋂_{α∈A} I_α is easily seen to again be an ideal. This easy fact is very useful for the following reason:

Definition 3.5. Given any subset T of R, one can define

    (T) = ⋂_{T⊆I} I,

the intersection being taken over all ideals I of R containing T; it is called the ideal generated by T. We can also give a more explicit, "from the ground up", description of the ideal generated by a subset T:

Lemma 3.6. Let T ⊆ R. Then we have

    (T) = {∑_{k=1}^{n} r_k t_k : r_k ∈ R, t_k ∈ T, n ∈ N}.

[11] This is analogous to the fact that kernels of group homomorphisms are "more closed" than arbitrary subgroups.
Proof. Let I denote the right-hand side. It is enough to check that I is an ideal and that I is contained in any ideal which contains T. We first check that I is an ideal; the proof is very similar to the proof that IJ is an ideal when I and J are. The multiplicative property is immediate: if r ∈ R and ∑_{k=1}^{n} r_k x_k ∈ I, then r(∑_{k=1}^{n} r_k x_k) = ∑_{k=1}^{n} (r.r_k) x_k ∈ I. Moreover the sum of two such elements is certainly of the same form, and I is closed under additive inverses since −∑_{k=1}^{n} r_k x_k = ∑_{k=1}^{n} (−r_k).x_k, so that it is an additive subgroup of R.

It remains to show that if J is an ideal containing T then J contains I. But if x_1, . . . , x_n ∈ T ⊆ J and r_1, . . . , r_n ∈ R, then since J is an ideal certainly r_k x_k ∈ J and hence ∑_{k=1}^{n} r_k x_k ∈ J. Since the x_k, r_k and n ∈ N were arbitrary, it follows that I ⊆ J as required. □

This is completely analogous to the notion of the span of a subset in a vector space. If I and J are ideals, it is easy to see that I + J = (I ∪ J). In the case where T = {a} consists of a single element, we often write aR for (a).
Remark 3.7. Note that in the above, just as for span in a vector space, there is no need for the set T to be finite.

Remark 3.8. Note that if T ⊆ R is a subset of a ring R, we can also consider the subring which it generates: the intersection of subrings is again a subring[12], so we may set

    (T)_s = ⋂_{T⊆S} S,

where the intersection is taken over subrings S containing T, and the subscript s is supposed to denote "subring". I leave it as an exercise to find a "ground up" description of (T)_s.
Definition 3.9. If an ideal is generated by a single element we say it is principal. Two elements a, b ∈ R are said to be associates if there is a unit u ∈ R^× such that a = u.b. (This is an equivalence relation on elements of R.)

If I = (a) then just knowing I does not quite determine a, but it almost does, at least if R is an integral domain. For example, in Z we will see that the ideals are all of the form (n), and the integer n is determined up to sign by the ideal (n) (the units in Z being exactly {±1}). The notion of associate elements lets us make this precise.

Lemma 3.10. Let R be an integral domain. Then if I is a principal ideal, the generators[13] of I are associates, and any associate of a generator is again a generator. Thus the generators of a principal ideal form a single equivalence class of associate elements of R.

Proof. If I = {0} = (0) the claim is immediate, so assume I ≠ {0}, and hence that any generator is nonzero. Let a, b ∈ R be generators of I, so that I = (a) = (b). Since a ∈ (b), there is some r ∈ R with a = r.b, and similarly as b ∈ (a) there is some s with b = s.a. It follows that a = r.b = r.(s.a) = (r.s).a, hence a.(1 − r.s) = 0, and so since a ≠ 0 and R is an integral domain, r.s = 1, that is, r and s are units.

Finally, if I = (a) and b = u.a where u ∈ R^×, then certainly b ∈ (a) = I, so that (b) ⊆ I; but also if x ∈ I, then x = r.a for some r ∈ R and hence x = r.(u^{−1}.b) = (r.u^{−1}).b, so that x ∈ (b), and hence I ⊆ (b). It follows that I = (b) as required. □
3.2. Quotients. In order to show that ideals and kernels of ring homomorphisms are the same thing, we now study a notion of quotient for rings, similar to the quotients of groups and vector spaces which you have already seen.

Suppose that R is a ring and that I is an ideal in R. Then since (I, +) is a subgroup of the abelian group (R, +), we may form the quotient group (R/I, +). We briefly recall (in perhaps a slightly different fashion to what you have seen before) the construction of the group (R/I, +): the relation r ∼ s if r − s ∈ I is easily checked to be an equivalence relation, and the equivalence classes are the cosets {r + I : r ∈ R}. We want
12
Note also that this is a pretty general way of dening the widget generated by a subset of a given object: provided the
intersection of widgets is again a widget, then if S is some subset of your object, the widget it generates is the intersection of all
widgets which contain S , and is thus the smallest widget containing S .
13
i.e. the elements a R such that I = (a).
to endow R/I with the structure of an (abelian) group. To do this, suppose that C_1, C_2 ∈ R/I. Then we may define:

(3.1)    C_1 + C_2 = {c_1 + c_2 : c_i ∈ C_i, i = 1, 2}.
Now a priori C
1
+ C
2
is just a subset of R. We claim it is in fact a single coset of I. Certainly, it is a union of
cosets, since if c
1
+c
2
r for any r R, we have r = (c
1
+c
2
) +i for some i I and so r = (c
1
+i) +c
2
C
1
+C
2
.
To see that it is a single coset, fix elements d_i ∈ C_i (i = 1, 2). Then if c_i ∈ C_i we have c_i = d_i + k_i where k_i ∈ I for i = 1, 2, and hence
c_1 + c_2 = (d_1 + k_1) + (d_2 + k_2) = (d_1 + d_2) + (k_1 + k_2) ∈ (d_1 + d_2) + I.
It follows that (3.1) defines a binary operation on R/I, and then it is routine to check that R/I becomes an abelian group with this operation (where the identity element is the coset I itself).
To give R/I the structure of a ring, we need to produce a multiplication operation. We can do this in a similar fashion: given two cosets C_1, C_2 define:
C_1 · C_2 = {c_1.c_2 : c_i ∈ C_i, i = 1, 2}.
We claim that the set C_1 · C_2 is contained in[14] a single coset of I, that is, that C_1 · C_2 + I is a single I-coset.
To do this suppose c_i, d_i (i = 1, 2) are as above. Then we have
c_1.c_2 = (d_1 + k_1).(d_2 + k_2) = d_1.d_2 + (d_1.k_2 + k_1.d_2 + k_1.k_2) ∈ d_1.d_2 + I.
Note here we used the fact that d_1.k_2 and k_1.d_2 are both in I, which would not be true if I was just a subring of R: the definition of an ideal is exactly what we need in order for the cosets R/I to inherit a ring structure from R. Thus we get a binary operation ∗ on R/I by setting:
C_1 ∗ C_2 = C_1 · C_2 + I.
Theorem 3.11. The datum (R/I, +, ∗, 0 + I, 1 + I) defines a ring structure on R/I, and moreover the map q: R → R/I given by q(r) = r + I is a surjective ring homomorphism.
Proof. Checking each axiom is an easy consequence of the fact that the binary operations +, ∗ on R/I are defined by picking arbitrary representatives of the cosets, computing up in the ring R and then taking the coset of the answer (the important part of the definitions being that this last step is well-defined). Thus,
for example, to check ∗ is associative, let C_1, C_2, C_3 be elements of R/I and choose a_1, a_2, a_3 ∈ R such that C_i = a_i + I for i = 1, 2, 3. Then
C_1 ∗ (C_2 ∗ C_3) = (a_1 + I) ∗ ((a_2 + I) ∗ (a_3 + I))
= (a_1 + I) ∗ (a_2.a_3 + I) = a_1.(a_2.a_3) + I
= (a_1.a_2).a_3 + I = (a_1.a_2 + I) ∗ (a_3 + I)
= ((a_1 + I) ∗ (a_2 + I)) ∗ (a_3 + I) = (C_1 ∗ C_2) ∗ C_3,
where in going from the second to the third line we use the associativity of multiplication in R. Checking the other axioms is similarly straightforward. Finally, the map q: R → R/I is clearly surjective, and that it is a homomorphism is also immediate from the definitions. □
The map q: R → R/I is called the quotient homomorphism (or quotient map). The next corollary establishes the answer to the question we started this section with: which subsets of R are kernels of homomorphisms from R? We already noted that any kernel is an ideal, but the construction of quotients now gives us the converse:
Corollary 3.12. Any ideal is the kernel of a ring homomorphism.
Proof. If I is an ideal and q: R → R/I is the quotient map then clearly ker(q) = I. □
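The construction in the proof of Theorem 3.11 can be made concrete for R = Z and I = (n). The following sketch is illustrative (the helper names are my own, not from the notes); it checks that the coset operations do not depend on the representatives chosen, and that q is a ring homomorphism.

```python
# Illustrative sketch (not from the notes): Theorem 3.11 for R = Z, I = (n).
# A coset is named by its canonical representative in {0, ..., n-1}.
n = 6  # work in Z/6Z

def coset(r):
    """The coset r + nZ, named by its canonical representative."""
    return r % n

def add(c1, c2, rep1=0, rep2=0):
    """Add two cosets using the representatives c1 + n*rep1 and c2 + n*rep2.
    Well-definedness means the answer ignores the choice of rep1, rep2."""
    return coset((c1 + n * rep1) + (c2 + n * rep2))

def mul(c1, c2, rep1=0, rep2=0):
    """Multiply two cosets, again via arbitrary representatives."""
    return coset((c1 + n * rep1) * (c2 + n * rep2))

# Different representatives of 2 + 6Z and 5 + 6Z give the same sum and product.
assert add(2, 5, rep1=3, rep2=-7) == add(2, 5)
assert mul(2, 5, rep1=3, rep2=-7) == mul(2, 5)
# q(r) = r + nZ is a ring homomorphism: q(a) + q(b) = q(a+b), q(a)q(b) = q(ab).
assert add(coset(4), coset(5)) == coset(4 + 5)
assert mul(coset(4), coset(5)) == coset(4 * 5)
```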
Next we want to compare ideals in a quotient ring with ideals in the original ring.
[14] In lectures I mistakenly said that it was again a union of cosets and hence a single coset, but this isn't true: in general C_1 · C_2 can be strictly smaller than a single coset. For example take R = Z and I = (6); then if C is the coset 2 + 6Z we have C · C = 4 + 12Z, which is contained in, but smaller than, the coset 4 + 6Z.
Lemma 3.13. Let R be a ring, I an ideal in R and q: R → R/I the quotient map. If J is an ideal in R then q(J) is an ideal in R/I, and if K is an ideal in R/I then q^{-1}(K) = {r ∈ R : q(r) ∈ K} is an ideal in R which contains I. Moreover, these correspondences give a bijection between the ideals in R/I and the ideals in R which contain I.
Proof. This follows directly by working through the definitions. Let q: R → R/I denote the quotient homomorphism. If J is an ideal in R, let us check that q(J) is an ideal in R/I. Since q is a homomorphism, q(J) is certainly a subgroup of R/I under addition, so we just need to check the multiplicative property. But any element of q(J) is of the form q(j) for some j ∈ J, and since q is surjective any element of R/I is of the form q(r) for some r ∈ R. But since q(r).q(j) = q(r.j) and r.j ∈ J it follows immediately that q(J) is an ideal in R/I.
It is similarly straightforward to check that if K ⊆ R/I is an ideal then q^{-1}(K) is an ideal of R. Indeed if t_1, t_2 ∈ q^{-1}(K) then q(t_1 + t_2) = q(t_1) + q(t_2) ∈ K since K is closed under addition, so that t_1 + t_2 ∈ q^{-1}(K). Similarly if t ∈ q^{-1}(K) and r ∈ R then q(r.t) = q(r).q(t) ∈ K since q(t) ∈ K and K is an ideal of R/I. It follows that r.t ∈ q^{-1}(K) and so q^{-1}(K) is indeed an ideal of R. (Note also that q^{-1}(K) contains I, since q(i) = 0 ∈ K for all i ∈ I.) Thus we see that given an ideal in R we get an ideal in R/I by taking its image under q, while given an ideal in R/I we get an ideal in R by taking its preimage under q.
To see that this induces a bijection between ideals in R/I and ideals of R which contain I, note first that q(q^{-1}(K)) = K (this is true for any surjective map q of sets). If J is an ideal of R then we claim that the composition in the other order, q^{-1}(q(J)), is the ideal J + I. Assuming this, we see that if J ⊇ I then q^{-1}(q(J)) = J, since in this case we have J + I = J. Hence q is bijective when restricted to ideals containing I, as required.
Thus it remains to check that q^{-1}(q(J)) = J + I. An element x ∈ q^{-1}(q(J)) must have q(x) = q(j) for some j ∈ J, and so q(x − j) = 0, that is, x − j ∈ I. But then x = j + (x − j) ∈ J + I, so that q^{-1}(q(J)) ⊆ J + I. The reverse inclusion is clear as q(j + i) = (j + i) + I = j + I ∈ q(J) for any j ∈ J, i ∈ I. □
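Lemma 3.13 can be verified by brute force in a small case. The sketch below is my own illustration (not from the notes): it takes R = Z, I = 12Z, and checks that the ideals of Z/12Z are exactly the images of the ideals dZ with d | 12.

```python
# Illustrative brute-force check of Lemma 3.13 for R = Z, I = 12Z:
# ideals of Z/12Z correspond to the ideals dZ of Z with d | 12.
n = 12
R_mod_I = range(n)

def is_ideal(S):
    """S ⊆ Z/12Z is an ideal: contains 0, closed under + (enough in a finite
    group), and closed under multiplication by every element of Z/12Z."""
    return (0 in S
            and all((a + b) % n in S for a in S for b in S)
            and all((r * a) % n in S for r in R_mod_I for a in S))

# Every ideal of Z/12Z is principal (it is a quotient of Z, where every ideal
# is principal), so it suffices to list the principal ideals q(d)(Z/12Z).
ideals = sorted({frozenset((d * r) % n for r in R_mod_I) for d in range(n)}, key=len)
assert all(is_ideal(S) for S in ideals)

# They biject with the ideals dZ ⊇ 12Z, i.e. with the divisors d of 12.
divisors = [d for d in range(1, n + 1) if n % d == 0]
assert len(ideals) == len(divisors) == 6   # d = 1, 2, 3, 4, 6, 12
```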
3.3. The isomorphism theorems. We can now state the first isomorphism theorem.
Theorem 3.14. Let f: R → S be a homomorphism of rings, and let I = ker(f). Then f induces an isomorphism f̄: R/I → im(f) given by:
f̄(r + I) = f(r).
Proof. First note that if r − s ∈ I, then f(r − s) = 0 and hence f(r) = f(s), so that f takes a single value on each coset r + I, and hence f̄ is well-defined. Clearly from the definition of the ring structure on R/I it is a ring homomorphism, so it only remains to check that it is an isomorphism. It is clearly surjective (if s ∈ im(f) then s = f(r) for some r ∈ R and hence s = f̄(r + I)). To see that f̄ is injective it is enough[15] to check that ker(f̄) = 0, but this is also obvious! If f̄(r + I) = 0 then f(r) = 0, hence r ∈ I so that r + I = I. □
Despite being essentially trivial to prove, the first isomorphism theorem is very useful! We now list some consequences which generalise things you may have seen before (or may see next term in other courses). Let R be a ring and A and B subsets of R. Then as above we let A + B = {a + b : a ∈ A, b ∈ B}.
Theorem 3.15. (Second isomorphism theorem): Let R be a ring and A a subring of R, B an ideal of R. Then A + B is a subring of R and the restriction of the quotient map q: R → R/B to A induces an isomorphism
A/(A ∩ B) ≅ (A + B)/B.
(Universal property of quotients): If f: R → S is a homomorphism and I ⊆ ker(f) is an ideal, then there is a unique homomorphism f̄: R/I → S such that f = f̄ ∘ q, where q: R → R/I is the quotient map.
(Third isomorphism theorem): If I ⊆ J are ideals of R, then J/I = {j + I : j ∈ J} is an ideal in R/I and
(R/I)/(J/I) ≅ R/J.
Proof. You checked in Problem set 1 that A + B is a subring of R, and it is easy to see that A ∩ B is an ideal in A, and B is an ideal in A + B, so the two quotients A/(A ∩ B) and (A + B)/B certainly exist. Let q: R → R/B be the quotient map. It restricts to a homomorphism p from A to R/B, whose image is clearly (A + B)/B, so by the first isomorphism theorem it is enough to check that the kernel of p is A ∩ B. But this is clear: if a ∈ A has p(a) = 0 then a + B = 0 + B so that a ∈ B, and so a ∈ A ∩ B.
For the universal property, note that since q is surjective the requirement that f̄(q(r)) = f(r) uniquely determines f̄ if it exists. But if r_1 − r_2 ∈ I then since I ⊆ ker(f), we have 0 = f(r_1 − r_2) = f(r_1) − f(r_2) and
[15] This is just the same calculation as for linear maps: if f(r) = f(s) then f(r − s) = 0 so that r − s ∈ ker(f).
hence f(r_1) = f(r_2). It follows f is constant on the I-cosets, and so does indeed induce a unique map[16] f̄: R/I → S such that f̄ ∘ q = f. It is then immediate from the definitions of the ring structure on R/I that f̄ is a homomorphism.
For the third isomorphism theorem, let q_I: R → R/I and q_J: R → R/J be the two quotient maps. Since I ⊆ J = ker(q_J), the universal property applied to the quotient q_I: R → R/I and the homomorphism q_J shows that there is a homomorphism q̄_J: R/I → R/J induced by q_J, with q̄_J ∘ q_I = q_J. Clearly q̄_J is surjective (since q_J is), and if q̄_J(r + I) = 0 then, since r + I = q_I(r) so that q̄_J(r + I) = q_J(r), we have r ∈ J. Thus ker(q̄_J) = J/I and the result follows by the first isomorphism theorem. □
Example 3.16. Suppose that V is a finite-dimensional k-vector space and α ∈ End(V). Then we saw before the map Φ: k[t] → End(V) given by Φ(f) = f(α). It is easy to see that this map is a homomorphism, and hence we see that im(Φ) is isomorphic to k[t]/I where I = ker(Φ) is a principal ideal. The monic polynomial generating I is the minimal polynomial of α, as was studied in Algebra I.
An interesting special case of these results is a general version of the Chinese Remainder Theorem. To state it, recall from Example 1.2 ix) the direct sum construction for rings: if R and S are rings, then R ⊕ S is defined to be the ring of ordered pairs (r, s) where r ∈ R, s ∈ S, with addition and multiplication done componentwise.
Theorem 3.17. Let R be a ring, and I, J ideals of R such that I + J = R. Then
R/(I ∩ J) ≅ R/I ⊕ R/J.
Proof. We have quotient maps q_1: R → R/I and q_2: R → R/J. Define q: R → R/I ⊕ R/J by q(r) = (q_1(r), q_2(r)). By the first isomorphism theorem, it is enough to show that q is surjective and that ker(q) = I ∩ J. The latter is immediate: if q(r) = 0 then q_1(r) = 0 and q_2(r) = 0, whence r ∈ I and r ∈ J, that is, r ∈ I ∩ J. To see that q is surjective, suppose (r + I, s + J) ∈ R/I ⊕ R/J. Then since R = I + J we may write r = i_1 + j_1 and s = i_2 + j_2, where i_1, i_2 ∈ I, j_1, j_2 ∈ J. But then r + I = j_1 + I and s + J = i_2 + J, so that q(j_1 + i_2) = (r + I, s + J). □
Suppose that R = I + J where I and J are ideals as above, and moreover that I ∩ J = 0. Then each r ∈ R can be written uniquely in the form i + j where i ∈ I and j ∈ J (the proof is exactly the same as it is for subspaces in a vector space). In this situation we write[17] R = I ⊕ J. Note that since I.J ⊆ I ∩ J it follows that i.j = 0 for any i ∈ I, j ∈ J; thus if i_1, i_2 ∈ I and j_1, j_2 ∈ J we see (i_1 + j_1).(i_2 + j_2) = i_1.i_2 + j_1.j_2. Writing 1 = e_1 + e_2 where e_1 ∈ I and e_2 ∈ J, it follows that (I, +, ., 0, e_1) is a ring, as is (J, +, ., 0, e_2), and it is easy to see that these rings are isomorphic to R/J and R/I respectively. This gives a more explicit description of the isomorphism R ≅ R/I ⊕ R/J provided by the Chinese Remainder Theorem in this case.
Note also that if we start with two rings S_1, S_2, and define R = S_1 ⊕ S_2 as in Example 1.2 ix), then the copies S_1^R, S_2^R of S_1 and S_2 inside R (that is, the elements {(s, 0) : s ∈ S_1} and {(0, t) : t ∈ S_2} respectively) are ideals in R (not subrings, because they do not contain the multiplicative identity element (1, 1)) and clearly their intersection is {(0, 0)}, so that R = S_1^R ⊕ S_2^R; thus the external notion of direct sum we saw in lecture 1 is compatible with the internal direct sum notation we used above (that is, when we write R = I ⊕ J to denote that I, J are ideals in R with I + J = R and I ∩ J = 0).
When R = Z, and I = nZ = {nd : d ∈ Z}, J = mZ, then you can check that the condition that I + J = Z is exactly that n and m are coprime, and then it also follows that I ∩ J = (n.m)Z (the problem sheet asks you to work out the details of this), and so we recover the classical Chinese Remainder Theorem: if m, n are coprime integers, then Z/(nm)Z ≅ (Z/nZ) ⊕ (Z/mZ). For example, if R = Z/6Z then R = 3̄R ⊕ 4̄R (writing n̄ for n + 6Z), and this gives the identification R ≅ Z/2Z ⊕ Z/3Z.
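The identification Z/6Z ≅ Z/2Z ⊕ Z/3Z, and the idempotents e_1 = 3̄, e_2 = 4̄ realising the internal decomposition, can be verified directly. The following check is my own illustration and assumes nothing beyond the statements above.

```python
# Illustrative check of Z/6Z ≅ Z/2Z ⊕ Z/3Z (Theorem 3.17) and of the
# internal decomposition via the idempotents 3 and 4.

def q(r):
    """q = (q_1, q_2): Z/6Z → Z/2Z ⊕ Z/3Z."""
    return (r % 2, r % 3)

assert len({q(r) for r in range(6)}) == 6   # q is injective, hence bijective
for a in range(6):
    for b in range(6):                      # q respects + and · componentwise
        assert q((a + b) % 6) == ((q(a)[0] + q(b)[0]) % 2, (q(a)[1] + q(b)[1]) % 3)
        assert q((a * b) % 6) == ((q(a)[0] * q(b)[0]) % 2, (q(a)[1] * q(b)[1]) % 3)

# e1 = 3, e2 = 4 satisfy e1 + e2 = 1 and e1.e2 = 0, and each is idempotent,
# so Z/6Z = 3R ⊕ 4R with identities 3 and 4 in the two summands.
assert (3 + 4) % 6 == 1 and (3 * 4) % 6 == 0
assert (3 * 3) % 6 == 3 and (4 * 4) % 6 == 4
assert {(3 * r) % 6 for r in range(6)} == {0, 3}        # 3R, isomorphic to Z/2Z
assert {(4 * r) % 6 for r in range(6)} == {0, 2, 4}     # 4R, isomorphic to Z/3Z
```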
4. PRIME IDEALS, MAXIMAL IDEALS AND CONSTRUCTIONS OF RINGS.
The quotient construction gives us a powerful way to build new rings and elds. What kind of rings
we obtain as quotients depends on the properties of the ideals we quotient by. In this section we begin
studying two important classes of ideals.
[16] Note that this part of the argument is really just about equivalence relations: if you have a set X with an equivalence relation F and q: X → X/F denotes the map from X to the set of equivalence classes X/F which sends an element to the equivalence class it lies in, then a function f: X → Y induces a function f̄: X/F → Y such that f̄ ∘ q = f if and only if f is constant on the equivalence classes of the relation F.
[17] The notation is compatible with the direct sum notation used in the first lecture; see the next paragraph.
Definition 4.1. Let R be a ring, and I an ideal of R. We say that I is a maximal ideal if I ≠ R and I is not strictly contained in any proper ideal of R. We say that I is a prime ideal if I ≠ R and for all a, b ∈ R, whenever a.b ∈ I then either a ∈ I or b ∈ I.
Lemma 4.2. An ideal I in a ring R is prime if and only if R/I is an integral domain[18]. It is maximal if and only if R/I is a field. In particular, a maximal ideal is prime.
Proof. Suppose that a, b ∈ R. Note that (a + I)(b + I) = 0 + I if and only if a.b ∈ I. Thus if R/I is an integral domain, (a + I)(b + I) = 0 forces either a + I = 0 or b + I = 0, that is, a or b lies in I, which shows I is prime. The converse is similar.
For the second part, note that a field is a ring which has no nontrivial ideals (check this!). The claim then follows immediately from the correspondence between ideals in the quotient ring and the original ring given in Lemma 3.13. Since fields are obviously integral domains, the "in particular" claim follows immediately. □
Remark 4.3. You can also give a direct proof that a maximal ideal is prime. Indeed suppose I is maximal, a.b ∈ I, and b ∉ I. Then the ideal J = I + bR generated by I and b is strictly larger than I, and so since I is maximal it must be all of R. But then 1 = i + b.r for some i ∈ I and r ∈ R, and hence a = a.1 = a.i + (a.b).r ∈ I since i, a.b ∈ I, as required.
Example 4.4. Let R = Z. Then it is easy to see that any ideal I is principal, that is, generated by a single element, thus we may write I = nZ. The ideal nZ is then prime if and only if n is 0 or a prime number, and since a finite integral domain is a field, the nonzero prime ideals are also maximal (notice that in any integral domain the ideal {0} is always a prime ideal, so certainly it is a prime ideal in Z).
We now consider a more substantial example, that of polynomials in one variable over a field. Recall from Constructive Maths in Prelims that the Euclidean algorithm works for polynomials just as it does for the integers. Although the case of field coefficients is the only one we really need for the moment, the following Lemma captures, for polynomials with coefficients in a general ring, when you can do long division with remainder in polynomial rings.
Lemma 4.5. (Division Algorithm). Let R be a ring and f = Σ_{i=0}^{n} a_i t^i ∈ R[t], where a_n ∈ R^×. Then if g ∈ R[t] is any polynomial, there are unique polynomials q, r ∈ R[t] such that either r = 0 or deg(r) < deg(f), and g = q.f + r.
Proof. This is straightforward to prove by induction on deg(g). Since a_n ∈ R^×, if h ∈ R[t]\{0} it is easy to see[19] that deg(f.h) = deg(f) + deg(h). It follows that if deg(g) < deg(f) we must take q = 0 and thus r = g. If m = deg(g) ≥ n = deg(f), then writing g = Σ_{j=0}^{m} b_j t^j where b_m ≠ 0, the polynomial
h = g − a_n^{-1} b_m t^{m−n}.f
has deg(h) < deg(g), and so there are unique q′, r′ with h = q′.f + r′. Setting q = a_n^{-1} b_m t^{m−n} + q′ and r = r′, it follows that g = q.f + r. Since q and r are clearly uniquely determined by q′ and r′, they are also unique as required. □
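The induction in the proof is effectively an algorithm. Here is a hedged sketch of it for k = Q; the coefficient-list representation and the name `polydiv` are my own choices, not from the notes.

```python
# A sketch of the division algorithm of Lemma 4.5 for polynomials with
# rational coefficients, represented as lists [a_0, a_1, ..., a_n].
from fractions import Fraction

def polydiv(g, f):
    """Return (q, r) with g = q*f + r and deg r < deg f (or r = []),
    assuming the leading coefficient of f is a unit (here: nonzero)."""
    g = [Fraction(c) for c in g]
    f = [Fraction(c) for c in f]
    q = [Fraction(0)] * max(len(g) - len(f) + 1, 1)
    r = g[:]
    while len(r) >= len(f) and any(r):
        # subtract (lead(r)/lead(f)) * t^(deg r - deg f) * f, as in the proof
        shift = len(r) - len(f)
        c = r[-1] / f[-1]
        q[shift] += c
        for i, fi in enumerate(f):
            r[shift + i] -= c * fi
        while r and r[-1] == 0:   # drop the now-zero leading term
            r.pop()
    return q, r

# g = t^3 + 2t + 1, f = t^2 + 1  gives  q = t, r = t + 1
quot, rem = polydiv([1, 2, 0, 1], [1, 0, 1])
assert quot == [0, 1] and rem == [1, 1]
```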
It follows from the previous Lemma that if k is a field, then we have the division algorithm for all nonzero polynomials. This allows us to prove that all ideals in k[t] are principal:
Lemma 4.6. Let I be a nonzero ideal in k[t]. Then there is a unique monic polynomial f such that I = (f). Thus all ideals in k[t] are principal.
Proof. Since I is nonzero we may pick an f ∈ I of minimal degree, and rescale it if necessary to make it monic. We claim I = (f). Indeed if g ∈ I, then using the division algorithm, we may write g = q.f + r where either r = 0 or deg(r) < deg(f). But then r = g − q.f ∈ I, and thus by the minimality of the degree of f we must have r = 0 and so g = q.f as required. The uniqueness follows[20] from the fact that if I = (f) and I = (f′) then we would have f = a.f′ and f′ = b.f for some polynomials a, b ∈ k[t]. But then f = a.f′ = (a.b).f, so
[18] Note that this is why one wants to exclude R from being a prime ideal: I defined an integral domain to be a ring which was not the zero ring and had no zero divisors.
[19] The key here is that a unit is never a zero-divisor: if a.b = 0 and a is a unit, then b = (a^{-1}.a).b = a^{-1}.(a.b) = a^{-1}.0 = 0.
[20] This also follows from the fact that generators of a principal ideal are all associates, and the fact (which you proved in the first problem sheet) that the units in k[t] are exactly k^×.
that a and b must have degree zero, that is, a, b ∈ k. Since we required f and f′ to be monic, it follows that a = b = 1 and so f = f′ as required. □
Running the Division Algorithm exactly as we do for the integers, one can prove a version of Bezout's Lemma, i.e. given any two polynomials f_1, f_2 ∈ k[t] there is a polynomial g ∈ k[t] which divides both f_1 and f_2 and can be written in the form a.f_1 + b.f_2 for some a, b ∈ k[t]. Thus any common divisor of f_1 and f_2 must divide g, and we call g the highest common factor of f_1 and f_2. In fact, g is only defined up to rescaling by a constant, thus to make it unique we usually require g to be monic, that is, we require the leading coefficient of g to be 1. We can also give a less computational proof as follows:
Lemma 4.7. Let f, g ∈ k[t] be nonzero polynomials. Then there is a unique monic polynomial d ∈ k[t] such that d divides f and g and there exist a, b ∈ k[t] such that d = a.f + b.g. We write d = h.c.f.(f, g), the highest common factor of f, g, as if c divides f and g then c divides d = a.f + b.g.
Proof. Let I = (f, g). By Lemma 4.6 it follows that I = (d) for a unique monic polynomial d. Then certainly f ∈ (f) ⊆ I = (d) so that d divides f, and similarly d divides g. Since I = (f, g) = {r.f + s.g : r, s ∈ k[t]} and d ∈ I, it is also clear that we may find polynomials a, b with d = a.f + b.g as required. □
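Computationally the h.c.f. is found by the Euclidean algorithm rather than through the ideal (f, g). The sketch below is my own illustration, working in F_5[t] with coefficient lists; tracking the quotients through the same loop would also produce the Bezout coefficients a, b.

```python
# Illustrative sketch: the Euclidean algorithm in F_5[t]. A polynomial is a
# coefficient list [c_0, c_1, ...]; all names here are my own choices.
p = 5  # coefficients live in the field F_5

def trim(u):
    """Normalise coefficients mod p and drop zero leading terms."""
    u = [c % p for c in u]
    while u and u[-1] == 0:
        u.pop()
    return u

def polymod(dividend, divisor):
    """Remainder of division in F_5[t] (divisor nonzero), as in Lemma 4.5."""
    r, f = trim(dividend), trim(divisor)
    inv_lead = pow(f[-1], p - 2, p)          # inverse of lead(f), as p is prime
    while len(r) >= len(f):
        shift, c = len(r) - len(f), (r[-1] * inv_lead) % p
        for i, fi in enumerate(f):           # subtract c * t^shift * f
            r[shift + i] = (r[shift + i] - c * fi) % p
        r = trim(r)
        if not r:
            break
    return r

def hcf(f, g):
    """Monic highest common factor of f, g (not both zero) in F_5[t]."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, polymod(f, g)
    inv = pow(f[-1], p - 2, p)
    return [(c * inv) % p for c in f]        # rescale to make the answer monic

# f = t^2 - 1 = (t-1)(t+1) and g = t^2 + 3t + 2 = (t+1)(t+2) in F_5[t]
assert hcf([4, 0, 1], [2, 3, 1]) == [1, 1]   # h.c.f. = t + 1
```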
We can also determine which ideals are prime. Recall that a nonzero polynomial f ∈ k[t] is said to be irreducible if f is not a unit and whenever f = g.h for g, h ∈ k[t] then one of g, h is a unit (which is the same, for k[t], as being a nonzero scalar).
Lemma 4.8. Let k be a field and I = (f) a nonzero ideal in k[t]. Then I is prime if and only if f ∈ k[t] is an irreducible polynomial. Moreover every such ideal is in fact maximal, and all maximal ideals are of this form. (Thus the nonzero prime ideals are exactly the maximal ideals.)
Proof. If (f) is a nonzero prime ideal (so that f ≠ 0) and f = g.h, then f|g or f|h, say f|g. But then g = f.u and so f = (f.u).h, whence f.(1 − u.h) = 0. But k[t] is a domain, so it follows that u.h = 1 and h is a unit. Thus f is irreducible as claimed. Conversely, suppose that f is irreducible and suppose that f divides a product g.h. We must show that it divides one of g or h. But if f does not divide g, then the highest common factor of f and g must be 1. But then by Bezout's Lemma we have 1 = a.f + b.g for some a, b ∈ k[t], and so
h = h.1 = h.(a.f + b.g) = f.(a.h) + b.(g.h),
so that f clearly divides h as required.
To see the "moreover" part, suppose that M is a maximal ideal. Then it is certainly prime, and so by Lemma 4.6 and the above, M = (f) for some irreducible f. On the other hand, if I = (f) is a prime ideal, then suppose that I ⊆ J for some proper ideal J. Then the ideal J must be principal by Lemma 4.6 again, so that J = (g). But then f = g.h for some h ∈ k[t], where since J is proper, deg(g) > 0. But then since f is irreducible, we must have h ∈ k, and so h is a unit, and I = J as required. □
Remark 4.9. As noted before, if R is an integral domain then R[t] is also (this is easy to see by considering the degree function). It follows that {0} is a prime ideal in k[t]. The above Lemma then shows that this is the only prime ideal which is not maximal.
Now let us consider what the quotients of k[t] look like. We know any nonzero ideal I is of the form (f) for a monic polynomial f. By the division algorithm, any polynomial g can be written uniquely as g = q.f + r where r = 0 or deg(r) < deg(f). Thus the polynomials of degree strictly less than d = deg(f) form a complete set of representatives for the I-cosets: every coset contains a unique representative r of degree strictly less than deg(f).
Since 1, t, ..., t^{deg(f)−1} form a basis of the k-vector space of polynomials of degree less than deg(f), this means that if we let q: k[t] → k[t]/I be the quotient map, and α = q(t), then 1, α, ..., α^{d−1} form a k-basis for k[t]/I, and we multiply in k[t]/I using the rule
α^d = −a_0 − a_1 α − ... − a_{d−1} α^{d−1}, where f(t) = t^d + Σ_{i=0}^{d−1} a_i t^i.
In particular, k[t]/(f) is a k-vector space of dimension deg(f). We can therefore view the quotient k[t]/(f) as a way of building a new ring out of k and an additional element α which satisfies the relation f(α) = 0, or rather, the quotient construction gives us a rigorous way of doing this: for example, when k = R intuitively we build C out of R and an element i which satisfies i^2 + 1 = 0. Via the quotient construction this simply says that we want to set C = R[t]/(t^2 + 1), and indeed this is a field because t^2 + 1 is irreducible[21] in R[t].
[21] In general it is not so easy to decide if a polynomial f ∈ k[t] is irreducible, but in the case where deg(f) ≤ 3, f is reducible if and only if it has a root in k, which can (sometimes) be easy to check.
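The multiplication rule α^d = −a_0 − ... − a_{d−1}α^{d−1} is easy to implement. The following sketch (my own illustration, with my own names) realises the construction of C as the quotient by (t^2 + 1), working over Q for exact arithmetic.

```python
# A sketch (my own illustration) of arithmetic in k[t]/(f) for k = Q and
# f = t^2 + 1: the construction of C from R described above, done exactly.
from fractions import Fraction

a0, a1 = Fraction(1), Fraction(0)        # f = t^2 + a_1 t + a_0 = t^2 + 1

def mul(u, v):
    """Multiply u = u_0 + u_1 α and v = v_0 + v_1 α, reducing via
    α^2 = -a_0 - a_1 α."""
    prod0 = u[0] * v[0]
    prod1 = u[0] * v[1] + u[1] * v[0]
    prod2 = u[1] * v[1]                  # coefficient of α^2, to be reduced
    return [prod0 + prod2 * (-a0), prod1 + prod2 * (-a1)]

alpha = [Fraction(0), Fraction(1)]       # α = q(t), playing the role of i
assert mul(alpha, alpha) == [Fraction(-1), Fraction(0)]   # α^2 = -1
# (1 + α)(1 - α) = 1 - α^2 = 2, just as (1 + i)(1 - i) = 2 in C
assert mul([Fraction(1), Fraction(1)], [Fraction(1), Fraction(-1)]) == [Fraction(2), Fraction(0)]
```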
Remark 4.10. In fact with a little more care[22] it is straightforward to check that if R is any ring and f ∈ R[t] is a monic polynomial of degree d, and we let Q = R[t]/(f) and α = q(t) (where q: R[t] → R[t]/(f) is the quotient map as before), then any element of Q can be written uniquely in the form r_0 + r_1 α + ... + r_{d−1} α^{d−1}, where the multiplication in Q is given by the same rule as above.
5. AN INTRODUCTION TO FIELDS
The example of constructing C from R as the quotient C ≅ R[t]/(t^2 + 1) clearly generalises substantially. In this section we use the quotient construction we have developed to construct some examples of fields, and develop a little of their basic properties. The basic fact we use is Lemma 4.8, which shows that if f ∈ k[t] is any irreducible polynomial then k[t]/(f) is a field, and moreover by the above discussion it is clearly a k-vector space of dimension deg(f).
If E, F are any fields and F ⊆ E then we may view E as an F-vector space. If E is finite dimensional as an F-vector space, then we write [E : F] = dim_F(E) for this dimension and call it the degree of the field extension E/F. Although it probably seems a very crude notion, as it forgets a lot of the structure of E, it is nevertheless very useful. Since the definition of the characteristic of a ring and the embedding property of fields of fractions show that any field contains either a copy of Q or F_p for some prime p, we will focus on finite extensions of these fields, that is, fields which are either finite dimensional as Q-vector spaces or finite dimensional as F_p-vector spaces.
The following Lemma is both easy to state and to prove, but is surprisingly useful.
Lemma 5.1. Let E/F be a field extension and let d = [E : F] < ∞. Then if V is an E-vector space, we may view V as an F-vector space; V is finite dimensional as an F-vector space if and only if it is as an E-vector space, and moreover dim_F(V) = [E : F] dim_E(V).
Proof. Certainly if V is an E-vector space then by restricting the scalar multiplication map to the subfield F it follows that V is an F-vector space. Moreover, if V is finite dimensional as an F-vector space it is so as an E-vector space (a finite F-spanning set will certainly be a finite E-spanning set). Conversely, suppose that V is a finite dimensional E-vector space. Let x_1, x_2, ..., x_d be an F-basis of E, and let e_1, ..., e_n be an E-basis of V. To finish the proof it is enough to check that {x_j e_i : 1 ≤ j ≤ d, 1 ≤ i ≤ n} is an F-basis of V. Indeed if v ∈ V, then since e_1, ..., e_n is an E-basis of V there are λ_i ∈ E (1 ≤ i ≤ n) such that v = Σ_{i=1}^{n} λ_i e_i. Moreover, since x_1, ..., x_d is an F-basis of E, for each λ_i there are elements μ_{ij} ∈ F (1 ≤ j ≤ d) such that λ_i = Σ_{j=1}^{d} μ_{ij} x_j. Thus we have
v = Σ_{i=1}^{n} λ_i e_i = Σ_{i=1}^{n} (Σ_{j=1}^{d} μ_{ij} x_j) e_i = Σ_{1≤i≤n, 1≤j≤d} μ_{ij} (x_j e_i),
whence the set {x_j e_i : 1 ≤ i ≤ n, 1 ≤ j ≤ d} spans V as an F-vector space (and in particular we have already established that V is finite dimensional as an F-vector space). To see that this set is linearly independent, and hence establish the dimension formula, just notice that in the above equation, v = 0 if and only if each λ_i = 0 by the linear independence of the vectors e_1, ..., e_n, and λ_i = 0 if and only if each μ_{ij} = 0 for 1 ≤ j ≤ d by the linear independence of the x_j's. □
Example 5.2. Let V be a C-vector space with basis e_1, ..., e_n. Then since {1, i} is an R-basis of C, it follows that e_1, ..., e_n, ie_1, ..., ie_n is an R-basis of V.
We record a particularly useful case of the above Lemma:
Corollary 5.3. (Tower Law) Let F ⊆ E ⊆ K be fields. Then [K : F] is finite if and only if both degrees [E : F], [K : E] are, and when they are finite we have [K : F] = [E : F][K : E].
Proof. Apply the previous Lemma to the E-vector space K. □
Example 5.4. Let E be a finite field. Then E has characteristic p for some prime p ∈ N (since otherwise E contains a copy of Z and is hence infinite). Thus E contains the subfield F_p ≅ Z/pZ. In particular we can view it as an F_p-vector space, and since it is finite, it must certainly be finite-dimensional. But then if d = dim_{F_p}(E), clearly there are p^d elements in E. Thus we see that a finite field must have prime-power
[22] In particular, one needs to use the general statement of the division algorithm as given in Lemma 4.5.
order. In fact there is exactly one finite field (up to isomorphism) of order p^d for every prime p and positive integer d.
We can also give a construction of such a field: take for example p = 3. Then it is easy to check that t^2 + 1 is irreducible in F_3[t] (you just need to check it does not have a root in F_3, and there are only 3 possibilities!). But then by our discussion above E = F_3[t]/(t^2 + 1) is a field of dimension 2 over F_3, and hence E is a finite field with 9 elements.
Note that if we can find an irreducible polynomial f of degree d in F_p[t], the quotient F_p[t]/(f) will be a finite field of order p^d. In the exercises we will show that for each d there is an irreducible polynomial of degree d in F_p[t], showing at least that finite fields of any prime-power order exist.
Definition 5.5. Let α ∈ C. We say that α is algebraic over Q if there is a field E which is a finite extension of Q containing α. Otherwise we say that α is transcendental. Notice that since the intersection of subfields is again a subfield[23], given any set T ⊆ C there is always a smallest subfield which contains it. This is called the field generated by T, and is denoted Q(T) (recall that any subfield of C contains Q, since it contains Z and hence Q, because Q is the field of fractions of Z). In the case where T has just a single element α we write Q(α) rather than Q({α}) and we say the field extension is simple. Note that an element α ∈ C is algebraic if and only if Q(α) is a finite extension of Q. Slightly more generally, if F is any subfield of C and α ∈ C we let F(α) = Q(F ∪ {α}) be the smallest subfield of C containing both F and α, and one says α is algebraic over F if F(α)/F is a finite extension.
The next Lemma shows that simple extensions are exactly the kind of fields our quotient construction builds.
Lemma 5.6. Suppose that E/F is a finite extension of fields (both, say, subfields of C) and let α ∈ E\F. Then there is a unique monic irreducible polynomial f ∈ F[t] such that F(α) ≅ F[t]/(f).
Proof. The field K = F(α) is a finite extension of F since it is a subfield of E (and hence a sub-F-vector space of the finite dimensional F-vector space E). Let d = [K : F] = dim_F(K). Since the set {1, α, α^2, ..., α^d} has d + 1 elements, it must be linearly dependent, so that there exist λ_i ∈ F (0 ≤ i ≤ d), not all zero, such that Σ_{i=0}^{d} λ_i α^i = 0. But then if g = Σ_{i=0}^{d} λ_i t^i ∈ F[t]\{0}, we see that g(α) = 0. It follows that the kernel I of the homomorphism Φ: F[t] → E given by Φ(Σ_{j=0}^{m} c_j t^j) = Σ_{j=0}^{m} c_j α^j is nonzero. Now any nonzero ideal in F[t] is generated by a unique monic polynomial, thus I = (f) for some such polynomial f, say. By the first isomorphism theorem, the image S of Φ is isomorphic to F[t]/I. Now S is a subring of a field, so certainly an integral domain, hence (f) must be a prime ideal, and by our description of prime ideals in F[t] thus in fact maximal, so that S is therefore a field. Finally, any subfield of C containing F and α must clearly contain S (as the elements of S are F-linear combinations of powers of α), so it follows that S = F(α). □
Definition 5.7. Given an algebraic α ∈ C, the polynomial f associated to α by the previous Lemma, that is, the monic irreducible polynomial for which Q(α) ≅ Q[t]/(f), is called the minimal polynomial of α over Q. Note that our description of the quotient Q[t]/(f) shows that [Q(α) : Q] = deg(f); hence the degree of the simple field extension Q(α) is just the degree of the minimal polynomial of α.
Remark 5.8. (Non-examinable) For simplicity let us suppose that all our fields are subfields of C. It is in fact the case that any finite extension E/F is simple, that is, E = F(α) for some α ∈ E (this is known as the primitive element theorem, which is proved in next year's Galois theory course). Moreover it turns out that given any finite extension E/F of a field F there are in fact only finitely many fields K between E and F. Neither statement is obvious, but you should think about how the two facts are clearly closely related: if you accept the statement about finitely many subfields between E and F then it is not hard to believe the primitive element theorem: you should just pick an element of E which does not lie in any proper subfield, and to see such an element exists one just has to show that the union of finitely many proper subfields of E cannot be the whole field E. On the other hand, if E/F is a finite field extension and E = F(α) by the primitive element theorem for some α ∈ E, then we have E ≅ F[t]/(f) where f ∈ F[t] is the minimal polynomial of α over F. If K is a field with F ⊆ K ⊆ E, then certainly E = K(α) also, and it follows that E ≅ K[t]/(g), where g ∈ K[t] is irreducible. Thus we see that g divides f in K[t], so viewing f, g as polynomials in C[t] we see that g corresponds to some subset of the (complex) roots of f, and hence (provided you believe that there
[23] Just as for subspaces of a vector space, subrings of a ring, ideals in a ring, etc.
are only finitely many subfields of C isomorphic to a given field, which if you think about it follows from the fact that a polynomial only has finitely many roots in C) then there can be only finitely many subfields between E and F.
Example 5.9. (1) Consider √3 ∈ C. There is a unique ring homomorphism φ: Q[t] → C such that
φ(t) = √3. Clearly the ideal (t^2 − 3) lies in ker(φ), and since t^2 − 3 is irreducible in Q[t], so that
(t^2 − 3) is a maximal ideal, we see that ker(φ) = (t^2 − 3), and hence im(φ) ≅ Q[t]/(t^2 − 3). Now the
quotient Q[t]/(t^2 − 3) is a field, hence im(φ) is also. Moreover, any subfield of C which contains √3
clearly contains im(φ), so we see that im(φ) = Q(√3). In particular, since the images of 1, t form a
basis of the quotient Q[t]/(t^2 − 3) by our description of quotients of polynomial rings in the previous
section, and under the isomorphism induced by φ these map to 1 and √3 respectively, we see that
Q(√3) = {a + b√3 : a, b ∈ Q}, a degree two extension of Q. (Note that one can also just directly check
that the right-hand side of this equality is a field; I didn't do that because I wanted to point out the
existence of the isomorphism with Q[t]/(t^2 − 3).)
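The identification Q(√3) ≅ Q[t]/(t^2 − 3) also makes exact computation easy: represent a + b√3 by the pair (a, b) of rationals and multiply using the relation (√3)^2 = 3. A small Python sketch of this (not part of the notes; the helper names are ours):

```python
from fractions import Fraction

# Elements of Q(sqrt(3)) = Q[t]/(t^2 - 3) modelled as pairs (a, b) <-> a + b*sqrt(3).
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (a + b s)(c + d s) = (ac + 3bd) + (ad + bc) s, using the relation s^2 = 3
    a, b = x
    c, d = y
    return (a * c + 3 * b * d, a * d + b * c)

def inverse(x):
    # 1/(a + b s) = (a - b s)/(a^2 - 3 b^2); the denominator is nonzero for
    # (a, b) != (0, 0) since sqrt(3) is irrational, so every nonzero element
    # is invertible: the quotient really is a field.
    a, b = Fraction(x[0]), Fraction(x[1])
    n = a * a - 3 * b * b
    return (a / n, -b / n)

s = (0, 1)                        # the class of t, i.e. sqrt(3)
x = (1, 2)                        # 1 + 2*sqrt(3)
assert mul(s, s) == (3, 0)        # s^2 = 3
assert mul(x, inverse(x)) == (1, 0)
```

The inverse formula is just "rationalising the denominator", which is exactly where the field structure of the quotient shows up concretely.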
(2) Exactly the same strategy[24] shows that Q(2^{1/3}) is isomorphic to Q[t]/(t^3 − 2), and hence Q(2^{1/3}) is
a 3-dimensional Q-vector space with basis {1, 2^{1/3}, 2^{2/3}}, again given by the image of the standard
basis we defined in the quotient Q[t]/(t^3 − 2).
(3) Now let T = {√3, 2^{1/3}}. Let us figure out what E = Q(T) looks like. Certainly it contains the subfields
E_1 = Q(√3) and E_2 = Q(2^{1/3}). Now [E : E_2] = [E_2(√3) : E_2] (which is just saying Q(2^{1/3})(√3) =
Q(2^{1/3}, √3)) and hence by Lemma 5.6 we see that this degree is just the degree of the minimal
polynomial of √3 over Q(2^{1/3}). But we know √3 is a root of t^2 − 3 ∈ Q[t] ⊆ Q(2^{1/3})[t], so the
degree of this minimal polynomial is at most 2, and hence [E : E_2] ≤ 2. It follows from the tower
law that [E : Q] = [E : E_2][E_2 : Q] = [E : E_2].3 ≤ 6. On the other hand, we then also have
[E : Q] = [E : E_1][E_1 : Q] = 2[E : E_1]. It follows that [E : Q] is divisible by 2 and by 3 and hence by 6,
and moreover [E : Q] ≤ 2.3 = 6, so that [E : Q] = 6. With a little more work you can then check that
{1, 2^{1/3}, 2^{2/3}, √3, 2^{1/3}√3, 2^{2/3}√3} is actually a Q-basis of E.
6. UNIQUE FACTORISATION
Throughout this section unless otherwise explicitly stated all rings are integral domains.
For the integers Z, any integer can be written as a product of prime numbers in an essentially unique
way. We will show in this section that this property holds for any integral domain all of whose ideals are
principal. We also show how to produce examples of such rings by abstracting the Euclidean algorithm.
Definition 6.1. Let R be an integral domain. If a, b ∈ R we say that a divides b, or a is a factor of b, and write
a|b, if there is some c ∈ R such that b = a.c. Note that we can also write this in terms of the ideals a and b
generate: in fact a|b if and only if bR ⊆ aR, as you can see immediately from the definitions.
It also makes sense to talk about least common multiples and highest common factors in an integral
domain:
Definition 6.2. Let R be an integral domain. We say c ∈ R is a common factor of a, b ∈ R if c|a and c|b, and
that c is the highest common factor, and write c = h.c.f.(a, b), if whenever d is a common factor of a and b we
have d|c. In the same way, we can define the least common multiple of a, b ∈ R: a common multiple is an
element k ∈ R such that a|k and b|k, and the least common multiple is a common multiple which is a factor of
every common multiple.

Note that these definitions can be rephrased in terms of principal ideals: c is a common factor of a, b
if and only if a, b ∈ cR. An element g is the highest common factor of a, b if and only if gR is minimal
among principal ideals containing a, b, that is, if a, b ∈ cR then gR ⊆ cR. Similarly, l is the least
common multiple of a, b if and only if lR is maximal among principal ideals which lie in aR ∩ bR.
Lemma 6.3. If a, b ∈ R where R is an integral domain, then if a highest common factor h.c.f.(a, b) exists, it is unique
up to units. Similarly, when it exists, the least common multiple is also unique up to units.

[24] We just need to check that t^3 − 2 ∈ Q[t] is irreducible; since it has degree 3, this follows because it does not have a root in Q.
Proof. This is immediate from our description of the highest common factor in terms of ideals. Indeed if
g_1, g_2 are two highest common factors, then we must have g_1R ⊆ g_2R (since g_1 is a highest common factor
and g_2 is a common factor) and symmetrically g_2R ⊆ g_1R. But then g_1R = g_2R, and so since R is an integral
domain this implies g_1, g_2 are associates, i.e. they differ by a unit. The proof for least common multiples is
analogous.
Remark 6.4. Note that although the definition makes sense in any integral domain[25], it does not follow that
the highest common factor necessarily exists. We will see shortly that it does always exist for a number of
interesting classes of integral domains.
Definition 6.5. Let R be an integral domain. We say that R is a principal ideal domain (or P.I.D.) if every ideal
in R is principal, that is, every ideal is generated by a single element.

If R is an integral domain, we say it is a Euclidean domain (or E.D.) if there is a function N: R\{0} → N
(the norm) such that given any a ∈ R, b ∈ R\{0} there exist q, r ∈ R such that a = bq + r and either r = 0 or
N(r) < N(b).
Remark 6.6. Some texts require that the norm N satisfies additional properties, for example one might re-
quire N(a.b) = N(a).N(b) (in which case the norm is said to be multiplicative) or N(a.b) = N(a) + N(b), or
something slightly weaker, such as N(a) ≤ N(a.b) for all a, b ∈ R. Such additional properties are often very
useful for studying other questions about a ring (e.g. if R is a Euclidean domain with such a norm you can
check that R^× = {a ∈ R : N(a) = N(1)}) but are not necessary if one just wants to know the ring is a PID.
Example 6.7. (1) The integers are an example of a Euclidean domain where N(a) = |a|.
(2) If k is a field, then k[t] is a Euclidean domain where N(p) = deg(p). (This is a case where it's easier
not to have to define N(0).)
(3) Let R = {a + ib ∈ C : a, b ∈ Z}, the Gaussian integers. Then N(a + ib) = a^2 + b^2 can be shown
to be a norm on R, so R is a Euclidean domain, and hence, by Lemma 6.8 below, a principal ideal domain.
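For the Gaussian integers of (3), a quotient and remainder can be produced explicitly: divide in C and round each coordinate of a/b to the nearest integer, which forces N(r) ≤ N(b)/2 < N(b). A Python sketch of this (the helper names are ours, not from the notes):

```python
# Division with remainder in the Gaussian integers Z[i], with norm
# N(a + ib) = a^2 + b^2. Python complex numbers with integer coordinates
# stand in for elements of Z[i].
def norm(z):
    return int(z.real) ** 2 + int(z.imag) ** 2

def gauss_divmod(a, b):
    # Divide exactly in C, then round each coordinate to the nearest integer.
    z = a / b
    q = complex(round(z.real), round(z.imag))
    r = a - b * q
    # Each coordinate of a/b - q has absolute value at most 1/2, so
    # N(r) <= N(b) * (1/4 + 1/4) < N(b), as the norm axiom requires.
    return q, r

a, b = complex(27, 23), complex(8, 1)
q, r = gauss_divmod(a, b)
assert a == b * q + r and norm(r) < norm(b)
```

The rounding step is the whole proof that this norm works: the nearest lattice point to a/b is at distance at most 1/√2.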
Note that an integral domain is a Euclidean domain exactly when we have a version of the division
algorithm, and in any such ring we can therefore run the Euclidean algorithm. The following Lemma
should thus not be surprising.

Lemma 6.8. If R is a Euclidean domain, then R is a principal ideal domain.
Proof. Let I be an ideal. If I = {0} there is nothing to show; otherwise pick a ∈ I with N(a) minimal. We claim
I = (a). To see this suppose that s ∈ I. Using the property of the norm, we may write s = a.q + r where r = 0
or N(r) < N(a). Since r = s − aq ∈ I it follows from the minimality of N(a) that we must have r = 0, and so
s ∈ (a) and thus I ⊆ (a). Since clearly (a) ⊆ I it follows I = (a) is principal as required.
Next we show that the h.c.f. always exists in a PID:
Lemma 6.9. Let R be a PID. Then if a, b ∈ R their highest common factor h.c.f.(a, b) exists.

Proof. In fact there is no reason to restrict to two elements: given a_1, a_2, . . . , a_n ∈ R we may consider the
ideal I = (a_i : 1 ≤ i ≤ n). Since R is a PID, this ideal is principal, say I = (d). Then clearly d divides each of the a_i,
and moreover there are elements r_i ∈ R (1 ≤ i ≤ n) such that d = Σ_{i=1}^n r_i a_i, hence any element which divides
each a_i also divides d.
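For R = Z this proof is effective: the extended Euclidean algorithm produces both the generator d and the coefficients r_i with d = Σ r_i a_i. A Python sketch of this (helper names ours, working only in Z):

```python
def ext_gcd(a, b):
    # Returns (g, r, s) with g = h.c.f.(a, b) >= 0 and g = r*a + s*b.
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, r, s = ext_gcd(b, a % b)
    # g = r*b + s*(a % b) and a % b = a - (a // b)*b, so regroup:
    return (g, s, r - (a // b) * s)

def hcf_list(nums):
    # h.c.f.(a_1, ..., a_n) together with coefficients r_i such that
    # h.c.f. = sum_i r_i * a_i, as in the proof of Lemma 6.9.
    g, coeffs = nums[0], [1]
    for a in nums[1:]:
        g, r, s = ext_gcd(g, a)
        coeffs = [r * c for c in coeffs] + [s]
    return g, coeffs

g, coeffs = hcf_list([12, 20, 45])
assert g == 1
assert sum(c * a for c, a in zip(coeffs, [12, 20, 45])) == 1
```

The same recursion works verbatim in any Euclidean domain once `%` and `//` are replaced by that ring's division algorithm.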
Definition 6.10. Let R be an integral domain. A nonzero element a ∈ R\{0} is said to be prime if aR is a
prime ideal, that is, aR ≠ R and whenever a divides r.s it divides at least one of r and s. Note that, in terms
of elements, the condition aR ≠ R = 1.R says that a and 1 are not associates, or in other words, a is not a
unit. Note also that it follows by induction on m that if p is prime and p|a_1.a_2 . . . a_m then p|a_j for some j
(1 ≤ j ≤ m).

A nonzero element a ∈ R\{0} is said to be irreducible if it is not a unit, and whenever a = b.c either b or c
is a unit.
In general in an integral domain the notions of a prime and irreducible element are not equivalent
(though we have effectively checked that they are in the case of k[t]). On the other hand, in any integral
domain we have the following:
[25] In fact really in any ring, though divisibility is not well-behaved for rings which are not integral domains.
Lemma 6.11. If R is an integral domain then any prime element is irreducible.

Proof. Suppose that p ∈ R\{0} is prime and p = a.b. Then p divides a.b, so that p divides one of a or b, say a.
But then we can write a = r.p and hence p = (r.b)p, so that cancelling we see 1 = r.b, i.e. b is a unit.
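The converse of Lemma 6.11 fails in general. A standard example (not treated in these notes) is Z[√−5] = {a + b√−5 : a, b ∈ Z}: here 2 is irreducible, since the norm N(a + b√−5) = a^2 + 5b^2 is multiplicative, N(2) = 4, and no element has norm 2; but 2 is not prime, since it divides 6 = (1 + √−5)(1 − √−5) without dividing either factor. Both facts can be checked by brute force:

```python
# Elements of Z[sqrt(-5)] as pairs (a, b) <-> a + b*sqrt(-5).
def mul(x, y):
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)

def norm(x):
    return x[0] ** 2 + 5 * x[1] ** 2

# 2 is irreducible: a proper factorisation 2 = x.y would force
# N(x) = N(y) = 2 (elements of norm 1 are the units +/-1), but
# a^2 + 5b^2 = 2 has no integer solutions.
assert all(norm((a, b)) != 2 for a in range(-2, 3) for b in range(-2, 3))

# 2 is not prime: it divides the product (1 + s)(1 - s) = 6, but
# (1 +/- s)/2 has non-integer coordinates, so 2 divides neither factor.
assert mul((1, 1), (1, -1)) == (6, 0)
```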
Lemma 6.12. Let R be an integral domain and let a ∈ R. The element a is irreducible if and only if the ideal (a)
is maximal among (proper) principal ideals in R, that is, if and only if whenever aR ⊆ bR either b is a unit so that
bR = R, or bR = aR, that is, a = u.b for some unit u ∈ R. In particular, if R is a PID any irreducible element is prime
and all nonzero prime ideals are maximal.

Proof. If a ∈ R and aR ⊆ bR then we have a = b.c for some c ∈ R. If c is a unit then aR = bR since a and b
are associates, otherwise we have aR ⊊ bR. Now if a ∈ R is irreducible it clearly follows that any principal
ideal containing aR is either aR or all of R. Conversely, if aR is maximal among proper principal ideals and
a = b.c then aR is contained in bR and cR. Since a is not a unit, one of b or c must not be a unit[26], say b. But
then aR ⊆ bR ⊊ R and so by maximality aR = bR, whence a = b.u for some unit u ∈ R, and so b.c = b.u and
hence c = u is a unit as required.

If R is a PID then every ideal is principal, so that if a ∈ R is irreducible, (a) is, by the above, a maximal
ideal, and hence certainly prime. Moreover, any nonzero prime ideal in R is of the form (p) for some prime
element p ∈ R\{0}. But since R is a domain, prime elements are irreducible and hence by the above (p) is in
fact maximal. It follows all nonzero prime ideals are maximal as required.
Definition 6.13. An integral domain R is said to be a unique factorisation domain (or UFD) if every element
of R\{0} is either a unit, or can be written as a product of prime elements, and moreover the factorisation
into primes is unique up to reordering and units. More explicitly, if R is a UFD and r ∈ R is nonzero and
not a unit, then there are prime elements p_1, . . . , p_k such that r = p_1 p_2 . . . p_k, and whenever r = q_1 q_2 . . . q_l is
another such prime factorisation for r, then k = l and the q_j's can be reordered so that q_j = u_j p_j, where u_j ∈ R
is a unit.
In fact the next Lemma shows that the uniqueness we require of the factorisation follows automatically
from the definition of prime elements.
Lemma 6.14. Suppose R is an integral domain and that every element a ∈ R can be written as a product of prime
elements. Then this factorisation is unique up to reordering and units. More precisely, if a = p_1 p_2 . . . p_k and a =
q_1.q_2 . . . q_l is another such expression, then k = l and there are units u_i ∈ R^× such that, after reordering the q_i's, we
have p_i = u_i q_i (1 ≤ i ≤ k).
Proof. We use induction on the minimal number M(a) of primes in an expression for a ∈ R as a product
of primes. If M(a) = 1 then a is prime and uniqueness is clear since primes are irreducible in an integral
domain. Now suppose that M = M(a) > 1 and a = p_1 p_2 . . . p_M = q_1 q_2 . . . q_k for primes p_i, q_j and k ≥ M. Now
it follows that p_1 | q_1 . . . q_k, and so since p_1 is prime there is some q_j with p_1 | q_j. Since q_j is prime and hence
irreducible, this implies that q_j = u_1 p_1 for some unit u_1 ∈ R. Reordering so that j = 1 and cancelling p_1, we
see that (u_1^{−1} p_2)p_3 . . . p_M = q_2 q_3 . . . q_k, and by induction it follows that k − 1 = M − 1, i.e. k = M, and
moreover the irreducibles occurring are equal up to reordering and units as required.
We are now going to show that unique factorisation holds in any PID. By the above, it is enough to show
that any element has some factorisation into prime elements, or equivalently for PIDs, some factorisation
into irreducibles. At first sight this seems like it should be completely obvious: if an element a ∈ R is
irreducible, then we're done, otherwise it has a factorisation a = b.c where b, c are proper factors (that is,
b|a and c|a and neither is an associate of a). If b, c are not irreducible then we can find a proper factorisation
of them and keep going until we reach a factorisation of a into irreducibles. The trouble with this argument
is that we need to show the process we describe stops after finitely many steps. Again intuitively this
seems clear, because the proper factors of a should be getting smaller, but again a priori they might just
keep getting smaller and smaller. The next Proposition, which although at first sight may seem rather
technical, shows this cannot happen. The idea is to rephrase things in terms of ideals: Recall that b|a if and
only if aR ⊆ bR, and b is a proper factor of a (i.e. b divides a and is not an associate of a) if and only if aR ⊊ bR,
that is, aR is strictly contained in bR. Thus if our PID R contained an element which could be factored into
[26] Since R^× is a group under multiplication!
smaller and smaller factors this would mean in terms of ideals that we could produce a nested sequence of
ideals each of which strictly contains the previous ideal.
Proposition 6.15. Let R be a PID and suppose that {I_n : n ∈ N} is a sequence of ideals such that I_n ⊆ I_{n+1}. Then the
union I = ∪_{n∈N} I_n is an ideal and there exists an N ∈ N such that I_n = I_N = I for all n ≥ N.

Proof. Let I = ∪_{n∈N} I_n. Given any two elements p, q ∈ I, we may find k, l ∈ N such that p ∈ I_k and q ∈ I_l.
It follows that for any r ∈ R we have r.p ∈ I_k ⊆ I, and taking n = max{k, l} we see that p, q ∈ I_n so that
p + q ∈ I_n ⊆ I. It follows that I is an ideal. Since R is a PID, we have I = (c) for some c ∈ R. But then there
must be some N such that c ∈ I_N, and hence I = (c) ⊆ I_N ⊆ I, so that I = I_N = I_n for all n ≥ N as required.
Remark 6.16. A ring which satisfies the condition that any nested ascending chain of ideals stabilizes is
called a Noetherian ring. The condition is a very important finiteness condition in ring theory. (Note that
the proof that the chain of ideals stabilizes generalises readily if you just know every ideal is generated by
finitely many elements, rather than a single element.)
Theorem 6.17. Let R be a PID. Then R is a UFD.
Proof. As discussed above, it follows from the fact that irreducibles are prime in a PID and Lemma 6.14 that
we need only show any element can be factored as a product of irreducible elements. Thus suppose for the
sake of a contradiction that there is some a = a_1 ∈ R which is not a product of irreducible elements. Clearly
a cannot be irreducible, so we may write it as a = b.c where neither b nor c is a unit. If both b and c could be
written as a product of irreducible elements, then multiplying these expressions together we would see that a is also,
hence at least one of b or c cannot be written as a product of irreducible elements. Pick one, and denote it a_2.
Note that if we set I_k = (a_k) (for k = 1, 2) then I_1 ⊊ I_2. As before a_2 cannot be irreducible, so we may find
an a_3 such that I_2 = (a_2) ⊊ (a_3) = I_3. Continuing in this fashion we get a nested sequence of ideals I_k each
strictly bigger than the previous one. But by Proposition 6.15 this cannot happen if R is a PID, thus no such
a exists.
Remark 6.18. (Non-examinable). The annoying "up to units" qualification for prime factorisation in a PID
vanishes if you are willing to live with ideals rather than elements: in a PID any proper ideal I can be written
as a product of nonzero prime ideals I = P_1 P_2 . . . P_k where the prime ideals occurring in this factorisation
are unique up to reordering. Indeed this is just the statement that two elements of an integral domain are
associates if and only if they generate the same principal ideal. However, if you do Algebraic Number
Theory next year you'll see this idea extended to rings where unique factorisation of elements fails (in
particular the rings are not PIDs!) but where nevertheless unique factorisation of ideals continues to hold.
Remark 6.19. (Again non-examinable, but perhaps illuminating.) In special cases the proof that any element is
a product of irreducibles can be simplified: more precisely, suppose that R is a Euclidean domain with
a norm N which satisfies the condition that N(a) ≤ N(a.b) for all a, b ∈ R\{0}. We will call[27] such a norm
weakly multiplicative. (This holds for example if the norm satisfies something like N(a.b) = N(a).N(b) or
N(a.b) = N(a) + N(b).) In this case we can replace the use of Proposition 6.15 with a more concrete inductive
argument. In order to make the induction work however, we will need to know that when we factorise
an element as a product of two proper factors (i.e. so neither factor is a unit) then the norms of the factors
are strictly smaller than the norm of the element. Of course if we have an explicit description of the norm (as
we do say for k[t] or Z) this may be easy to check directly, but it is in fact a consequence of the weakly
multiplicative property. More precisely we have:
Claim: Let R be an ED with a weakly multiplicative norm. If a, b ∈ R\{0} satisfy b|a and N(a) = N(b) then a
and b are associates.

Proof: To prove the claim, suppose that N(a) = N(b) and a = b.c. We must show that c is a unit. By the
division algorithm we have b = q.a + r where r = 0 or N(r) < N(a) = N(b). Substituting a = b.c and
rearranging we get b(1 − q.c) = r, and hence if r ≠ 0 then N(r) = N(b.(1 − q.c)) ≥ N(b) = N(a), which is a
contradiction. Thus r = 0 and so, since b ≠ 0, 1 − q.c = 0 and so c is a unit as required.
[27] I don't know if there is a standard name for this property: multiplicative would suggest something like N(a.b) = N(a).N(b).
Submultiplicative might be another reasonable term, but it sounds pretty awful.
We now show how, in any Euclidean domain R with a weakly multiplicative norm, a nonunit a ∈ R\{0} is
a product of irreducibles using induction on the norm N(a). Note that N(1) ≤ N(1.a) = N(a) for all a ∈ R\{0},
so that the minimum value of N is N(1). But by what we have just done, if N(a) = N(1) then a is a unit (since
1 divides any a ∈ R). If N(a) > N(1) then either a is an irreducible element, in which case we are done, or
a = b.c, where neither b nor c is a unit. But then by the claim we must have N(b), N(c) < N(a), and hence by
induction they can be expressed as products of irreducibles, and so multiplying these expressions together
we see so can a. It follows every a ∈ R\{0} is a unit or a product of irreducibles as required.
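For R = Z with N(a) = |a| this induction is easy to run as a procedure: repeatedly split off a smallest divisor greater than 1 (necessarily irreducible), and the norm of the cofactor strictly decreases. A Python sketch of this elementary special case:

```python
def factor_into_irreducibles(n):
    # Factor an integer n >= 2 into irreducibles (primes), mirroring the
    # induction on the norm N(a) = |a|: the smallest divisor d > 1 of n is
    # irreducible, and the cofactor n // d has strictly smaller norm.
    factors = []
    while n > 1:
        d = 2
        while d * d <= n and n % d != 0:
            d += 1
        if d * d > n:        # no proper divisor found: n itself is irreducible
            d = n
        factors.append(d)
        n //= d
    return factors

assert factor_into_irreducibles(60) == [2, 2, 3, 5]
assert factor_into_irreducibles(97) == [97]
```

Termination is exactly the point of the argument in the remark: each pass replaces n by a cofactor of strictly smaller norm.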
A ring may be a UFD without being a PID: in fact we will now show that Z[t] is a UFD, even though it is
not a PID. The idea is to use the fact that we understand factorisation in the rings Z and Q[t] well, because
they are both PIDs, thus studying the inclusion of Z[t] into Q[t] will allow us to see it is a UFD.
The next definition and Lemma are the key to our understanding of factorisation in Z[t].
Definition 6.20. If f ∈ Z[t] then define the content c(f) of f to be the highest common factor of the coefficients
of f. That is, if f = Σ_{i=0}^n a_i t^i then c(f) = h.c.f.(a_0, a_1, . . . , a_n). Note that in a general integral domain the highest
common factor is only defined up to units, but in the case of Z if we insist c(f) ≥ 0 then it is unique (since
the units in Z are just ±1).
Lemma 6.21. (Gauss). Let f, g ∈ Z[t]. Then c(f.g) = c(f).c(g).

Proof. Suppose first f, g ∈ Z[t] have c(f) = c(g) = 1. Let p ∈ N be a prime. We have for each such prime
a homomorphism φ_p : Z[t] → F_p[t] given by φ_p(Σ_{i=0}^n a_i t^i) = Σ_{i=0}^n ā_i t^i, where ā_i denotes a_i + pZ ∈ F_p. It is immediate
that ker(φ_p) = pZ[t], so that we see p|c(f) if and only if φ_p(f) = 0. But since F_p is a field, F_p[t] is an integral
domain, and so as φ_p is a homomorphism we see that

p|c(f.g) ⟺ φ_p(f.g) = 0 ⟺ φ_p(f).φ_p(g) = 0 ⟺ φ_p(f) = 0 or φ_p(g) = 0 ⟺ p|c(f) or p|c(g),

whence it is clear that c(f.g) = 1 if c(f) = c(g) = 1.

Now let f, g ∈ Z[t] be arbitrary, and write f = a.f′, g = b.g′ where f′, g′ ∈ Z[t] have c(f′) = c(g′) = 1 (so that c(f) =
a, c(g) = b). Then clearly f.g = (a.b).(f′.g′) and since c(f′.g′) = 1 it follows that c(f.g) = c(f).c(g) as required.
Alternative proof. If you found the above proof of the fact that c(f.g) = 1 if c(f) = c(g) = 1 a bit too slick, then
a more explicit version of essentially the same argument goes as follows: Let f = Σ_{i=0}^n a_i t^i and[28] g = Σ_{i=0}^n b_i t^i,
and write f.g = Σ_{k=0}^{2n} c_k t^k. It is enough to show that no prime divides all the coefficients of f.g, so
suppose that d is a prime dividing all the coefficients of f.g. Since c(f) = 1,
there must be a smallest k such that d does not divide a_k, and similarly since c(g) = 1 there is a smallest l
such that d does not divide b_l. Consider

c_{k+l} = Σ_{i+j=k+l} a_i b_j.

Now d divides every term on the right-hand side except for a_k b_l (since every other term has one of i < k or
j < l), while d does not divide a_k b_l as d is prime; but then d does not divide the sum, contradicting the
assumption that d divides c_{k+l}. Thus we have a contradiction and so c(f.g) = 1 as required.
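Gauss's Lemma is easy to test experimentally: representing a polynomial in Z[t] by its list of coefficients, the sketch below (helper names ours) computes contents and checks c(f.g) = c(f).c(g) on an example:

```python
from math import gcd
from functools import reduce

def content(f):
    # c(f): the highest common factor of the coefficients, taken >= 0.
    return reduce(gcd, (abs(a) for a in f))

def poly_mul(f, g):
    # Coefficient lists: f[i] is the coefficient of t^i.
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

f = [6, 4, 2]     # 6 + 4t + 2t^2, content 2
g = [9, 3]        # 9 + 3t,        content 3
assert content(f) == 2 and content(g) == 3
assert content(poly_mul(f, g)) == content(f) * content(g)
```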
We can now extend the definition of content to arbitrary nonzero elements of Q[t].
Lemma 6.22. Suppose f ∈ Q[t] is nonzero[29]. Then there is a unique λ ∈ Q_{>0} such that f = λ.f′ where f′ ∈ Z[t] and
c(f′) = 1. We write c(f) = λ. Moreover, if f, g ∈ Q[t] then c(f.g) = c(f).c(g).

Proof. Let f = Σ_{i=0}^n a_i t^i where a_i = b_i/c_i for b_i, c_i ∈ Z with h.c.f.(b_i, c_i) = 1 for all i, 0 ≤ i ≤ n. Then pick d ∈ Z_{>0}
such that d.a_i ∈ Z for all i (0 ≤ i ≤ n), so that d.f ∈ Z[t] (for example you can take d = l.c.m.{c_i : 0 ≤ i ≤ n} or
Π_{i=0}^n c_i). Let c = c(d.f) ∈ Z, so that if λ = c/d we have f = λ.f′ where f′ ∈ Z[t] and c(f′) = 1 as required.

To show uniqueness, suppose that f = λ_1 f_1 = λ_2 f_2, where λ_1, λ_2 ∈ Q_{>0} and f_1, f_2 ∈ Z[t] are such that c(f_1) =
c(f_2) = 1. Then λ_2^{−1}λ_1 f_1 = f_2 ∈ Z[t]. But since f_1 has content 1, we must then have λ_2^{−1}λ_1 ∈ Z (since an
irreducible p occurring in λ_1/λ_2 to a negative power could not divide all the coefficients of f_1, because c(f_1) = 1).
Similarly, writing λ_1^{−1}λ_2 f_2 = f_1, we see that λ_1^{−1}λ_2 ∈ Z, so that λ_1^{−1}λ_2 ∈ Q_{>0} ∩ Z^× = {1} and hence λ_1 = λ_2 as
required. The moreover part follows immediately from Gauss's Lemma: if f, g ∈ Q[t], writing f = λ.f′, g = μ.g′,
we see that f.g = (λ.μ)(f′.g′) and by Gauss's Lemma c(f′.g′) = 1, hence c(f.g) = λ.μ = c(f).c(g).
[28] Note that so long as we do not assume that b_n or a_n is nonzero we may take the same upper limit in the sums.
[29] In lecture I allowed f to be zero, in which case c(f) = 0 and the statement of the Lemma then holds except that now λ ∈ Q_{≥0}.
However since we only use the Lemma for nonzero polynomials, I decided to exclude the zero polynomial for simplicity.
We can now examine irreducibles in Z[t].
Lemma 6.23. (1) Suppose that f ∈ Z[t] ⊆ Q[t] is nonzero, and that f = g.h where g, h ∈ Q[t]. Then there exists
λ ∈ Q such that (λ.g), (λ^{−1}.h) ∈ Z[t]. Thus f = (λ.g)(λ^{−1}.h) is a factorisation of f in Z[t].
(2) Suppose that f ∈ Q[t] is irreducible and c(f) = 1. Then f is a prime element of Z[t].
(3) Let p ∈ Z be a prime number. Then p is a prime element in Z[t].

Proof. For the first part, by Lemma 6.22 we may write g = c.g′ and h = d.h′ where g′, h′ ∈ Z[t] have content
1. Then c(f) = c.d, so that as f ∈ Z[t] we have c.d ∈ Z. Setting λ = d we see that f = (λ.g).(λ^{−1}.h) where
λ.g = (c.d).g′ and λ^{−1}.h = h′ both lie in Z[t] as required.

For the second part, first note that if f ∈ Q[t] has c(f) = 1 then by definition f must lie in Z[t] (and has
content 1). To see that such an f is prime, we need to show that if g, h ∈ Z[t] and f | g.h in Z[t] then f | g or
f | h in Z[t]. Now if f | g.h in Z[t], certainly it does so in Q[t]. Since Q[t] is a PID, irreducibles are prime and
so either f | g or f | h in Q[t]. Suppose that f | g (the argument being identical for h). Then we have g = f.k for
some k ∈ Q[t]. Now by Lemma 6.22 we may write k = c(k).k′ where k′ ∈ Z[t]. Moreover by the same Lemma,
c(g) = c(f).c(k) = c(k) since c(f) = 1. But g ∈ Z[t], hence c(g) = c(k) ∈ Z and so in fact k ∈ Z[t], so that f divides g
in Z[t] as required.

For the final part, we have already seen that the homomorphism φ_p : Z[t] → F_p[t] has kernel pZ[t], and so
since F_p[t] is an integral domain, the ideal pZ[t] is prime, that is, p is a prime element of Z[t].
Theorem 6.24. The ring Z[t] is a UFD.

Proof. Since Z[t] is an integral domain (as Z is), by Lemma 6.14 it is enough to show that any element of Z[t]
is a product of primes. Let f ∈ Z[t]. We may write f = a.f′ where c(f′) = 1, and since Z is a UFD we may
factorise a into a product of prime elements of Z, which we have just seen are prime in Z[t]. Thus we may
assume c(f) = 1. But then viewing f as an element of Q[t] we can write it as a product of prime elements in
Q[t], say f = p_1 p_2 . . . p_k. Now using Lemma 6.22, each p_i can be written as a_i q_i where a_i ∈ Q and q_i ∈ Z[t] and
c(q_i) = 1. But then by Lemma 6.23, q_i is prime in Z[t], and f = (a_1 . . . a_k)q_1 . . . q_k. Comparing contents
we see that (a_1 . . . a_k) must be a unit in Z, and so we are done.
Remark 6.25. It is easy to see from this that in fact all primes in Z[t] are either primes in Z or primes in Q[t]
which have content 1.
Remark 6.26. In fact one can show directly (see the problem set) that if R is a UFD then highest common
factors exist (that is, given elements a_1, . . . , a_n ∈ R there is an element d such that d|a_i for all i (1 ≤ i ≤ n), and
if c|a_i for all i also, then c|d). This, and the fact that R, just because it is a domain, has a field of fractions F
and F[t] is a PID for any field F, is all we need to run the above proof that Z[t] is a UFD. It follows that if
R is any UFD, then so is R[t], using exactly the same strategy[30] as above; hence for example Q[x, y] is a UFD
(and in fact by induction, so is the polynomial ring k[t_1, t_2, . . . , t_n] for any n ∈ N and any field k). It is not
hard to see that these rings are not PIDs, so the class of rings which are UFDs is strictly larger than the class
of PIDs. In fact not every PID is a Euclidean domain either, so there are strict containments: EDs ⊊ PIDs ⊊
UFDs. Finding a PID which is not a Euclidean domain is a bit subtle, and we won't do it here.
6.1. Irreducible polynomials. In this section we develop some techniques for showing a polynomial f ∈
Q[t] is irreducible. By what we have done above, if f ∈ Q[t] is irreducible, we may write f = c(f).g
where g ∈ Z[t] has content 1 and is a prime in Z[t]. Since f and g are associates in Q[t], it follows that to
understand irreducible elements in Q[t] it is enough to understand the prime elements in Z[t] of positive
degree (or equivalently, the irreducibles f ∈ Q[t] with c(f) = 1).

This is useful for the following reason: Recall that for any prime p ∈ Z we have the homomorphism[31]
φ_p : Z[t] → F_p[t]. This allows us to transport questions about factorisation in Z[t] to questions about factori-
sation in F_p[t]: if f ∈ Z[t] with c(f) = 1 is reducible in Z[t], then both factors have positive degree, and
provided p does not divide the leading coefficient of f (so that the degrees of the factors are preserved
mod p), the image of f in F_p[t] will be reducible. Since the rings F_p[t] are smaller than either Z[t] or Q[t]
this can give us ways of testing irreducibility.

[30] There is a little more book-keeping because now the content is only defined up to units.
[31] Note that there is no homomorphism from Q[t] to F_p[t] for any prime p. This is why we have to pass through Z[t].
Example 6.27. Suppose that f = t^3 − 349t + 19 ∈ Z[t]. If f is reducible in Q[t], it is reducible in Z[t] and hence
its image under φ_p in F_p[t] will be reducible. But since f has degree 3 it follows it is reducible if and only if
it has a degree 1 factor, and similarly for its image in F_p[t], which would therefore mean it has a root in F_p.
But taking p = 2 we see that φ_2(f) = t^3 + t + 1 ∈ F_2[t], and it is easy to check that this polynomial takes the
value 1 at both 0 and 1 in F_2, so it does not have a root, and hence f must be irreducible. Note on the other
hand t^2 + 1 is irreducible in Z[t] but in F_2[t] we have t^2 + 1 = (t + 1)^2, so φ_p(f) can be reducible even when
f is irreducible.
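The root check used in this example is mechanical: reduce the coefficients mod p and try every element of F_p. A short sketch (helper name ours), applied to f = t^3 − 349t + 19; note this test only detects degree 1 factors, which suffices for polynomials of degree 2 or 3:

```python
def has_root_mod_p(coeffs, p):
    # coeffs[i] is the coefficient of t^i; try every element of F_p.
    return any(sum(c * x ** i for i, c in enumerate(coeffs)) % p == 0
               for x in range(p))

f = [19, -349, 0, 1]            # t^3 - 349t + 19
assert not has_root_mod_p(f, 2)
# A monic cubic whose reduction mod 2 has no root in F_2 has no degree 1
# factor mod 2, hence is irreducible mod 2, hence irreducible over Q.
# As the example notes, the converse fails: t^2 + 1 is irreducible over Q
# but factors as (t + 1)^2 in F_2[t], since it has the root 1 there.
assert has_root_mod_p([1, 0, 1], 2)
```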
Lemma 6.28. (Eisenstein's criterion.) Suppose that f ∈ Z[t] and f = t^n + a_{n−1}t^{n−1} + . . . + a_1 t + a_0. Then if there is a
prime p ∈ Z such that p|a_i for all i, 0 ≤ i ≤ n − 1, but p^2 does not divide a_0, then f is irreducible in Z[t] and Q[t].

Proof. Clearly c(f) = 1, so irreducibility in Z[t] and Q[t] are equivalent. Let φ_p : Z[t] → F_p[t] be the quotient
map. Suppose that f = g.h were a factorisation of f in Z[t] into non-units; since c(f) = 1 both factors have
positive degree, say deg(g) = k > 0. Then we have φ_p(f) = φ_p(g).φ_p(h). By assumption φ_p(f) = t^n, hence
since F_p[t] is a UFD and t is irreducible, we must have φ_p(g) = t^k and φ_p(h) = t^{n−k} (up to units). But then it
follows the constant terms of both g and h must be divisible by p, and hence a_0 must be divisible by p^2,
contradicting our assumption.
Example 6.29. This gives an easy way to see that 2^{1/3} ∉ Q: if it were rational, t^3 − 2 would be reducible, but
we see this is not the case by applying Eisenstein's criterion with p = 2. (It also gives a proof that √2, √3 are
not rational.)
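Checking the hypotheses of Eisenstein's criterion is equally mechanical once a candidate prime p is chosen. A sketch for monic polynomials (helper name ours), applied to t^3 − 2 with p = 2:

```python
def eisenstein(coeffs, p):
    # coeffs[i] is the coefficient of t^i; the polynomial is assumed monic.
    # Criterion: p divides a_i for all i < n, but p^2 does not divide a_0.
    *lower, lead = coeffs
    return (lead == 1
            and all(a % p == 0 for a in lower)
            and lower[0] % (p * p) != 0)

assert eisenstein([-2, 0, 0, 1], 2)       # t^3 - 2: irreducible by Lemma 6.28
assert not eisenstein([4, 2, 1], 2)       # fails: 2^2 divides the constant term 4
```

A False answer says nothing either way: the criterion is sufficient for irreducibility, not necessary.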
One can also use Eisenstein's Criterion in more cunning ways. For example, it might be that the Criterion
does not apply to f(t) but it does to f(t + 1), as the next example shows:
Example 6.30. Suppose that p ∈ N is prime, and f = 1 + t + . . . + t^{p−1} ∈ Z[t]. Then we claim f is irreducible.
Let g = f(t + 1). Then if g were reducible, say g = h_1.h_2, it would follow that f(t) = g(t − 1) = h_1(t − 1)h_2(t − 1) is
reducible, and similarly if g is irreducible so is f. Thus f is irreducible if and only if g is. But as f = (t^p − 1)/(t − 1) we
see that

g = ((t + 1)^p − 1)/t = Σ_{i=0}^{p−1} C(p, i+1) t^i,

where C(p, j) denotes the binomial coefficient. But it is well known that p divides C(p, i+1) for any i, 0 ≤ i ≤ p − 2,
while the constant term C(p, 1) = p is not divisible
by p^2, so Eisenstein's Criterion shows g and hence f is irreducible.
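The binomial coefficient computation above can be verified directly: the coefficient of t^i in g = ((t + 1)^p − 1)/t is C(p, i + 1). A quick check in Python for p = 7 (any prime works):

```python
from math import comb

p = 7
# g = ((t + 1)^p - 1)/t has coefficient comb(p, i + 1) on t^i.
g = [comb(p, i + 1) for i in range(p)]

assert g[p - 1] == 1                         # monic of degree p - 1
assert all(c % p == 0 for c in g[:p - 1])    # p divides every lower coefficient
assert g[0] == p and g[0] % (p * p) != 0     # constant term p, not divisible by p^2
```

These three assertions are exactly the hypotheses of Eisenstein's criterion for the prime p.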
Remark 6.31. (Non-examinable.) You might be worried[32] about what substituting t + 1 for t means for poly-
nomials with coefficients in an arbitrary ring, where we cannot think of them as functions. (In the case of Z[t]
this is not a problem, since Z[t] embeds into Q[t], which can be viewed as a subring of the ring of functions
from Q to itself since Q is infinite.) In fact we've done enough to make sense of this already: Recall from
Lemma 1.9, which characterised homomorphisms from polynomial rings, that for a polynomial ring R[t],
given a homomorphism φ: R → S and an element s ∈ S there is a unique homomorphism from R[t] to S
taking t to s and r ∈ R to φ(r). Thus there is a unique homomorphism ψ: R[t] → R[t] which is the identity[33]
on R and sends t to t + 1. Since the corresponding homomorphism given by t ↦ t − 1 is clearly an inverse to
ψ, we see that ψ is an isomorphism from Z[t] to itself. It follows that f is irreducible if and only if ψ(f) is.
7. MODULES
In the remainder of the course all rings R will be integral domains. (In fact, predominantly they will be PIDs or EDs.)
In this section we begin the study of linear algebra over rings. Recall a vector space is just an abelian
group with an action of a field of scalars obeying some standard rules. The definition of a module is
exactly the same, except now we allow our scalars to belong to an arbitrary ring, rather than insisting they
belong to a field. Formally, we say the following:
Definition 7.1. Let R be a ring with identity 1_R. A module over R is an abelian group (M, +) together with a
multiplication action of R on M satisfying:
(1) 1_R.m = m, for all m ∈ M;

[32] You might also not be worried; I don't know which group is better off in life in general.
[33] In the case R = Z, the identity is the only ring homomorphism from Z to itself, so in that case you don't need to explicitly require
this.
(2) (r_1.r_2).m = r_1.(r_2.m), for all r_1, r_2 ∈ R, m ∈ M;
(3) (r_1 + r_2).m = r_1.m + r_2.m for all r_1, r_2 ∈ R and m ∈ M;
(4) r.(m_1 + m_2) = r.m_1 + r.m_2 for all r ∈ R and m_1, m_2 ∈ M.
Remark 7.2. Just as with vector spaces, we write the addition in the abelian group M and the addition in the
ring R with the same symbol +, and similarly the multiplication action of R on M is written in the same way
as the multiplication in the ring R, since the axioms ensure that there is no ambiguity in doing so.
Remark 7.3. Note that the definition makes perfectly good sense for a noncommutative ring (when it would
normally be described as a left module since the action of the ring is on the left). Next year's course on
Representation Theory will study certain noncommutative rings called group algebras, and modules over
them. In this course we will focus on modules over integral domains and all our main results will be for
modules over a PID, though even then in some cases we will only give proofs for the case where our ring
is a Euclidean domain.
Example 7.4. Let's give a few examples:
(1) As mentioned above, if R is a field, the definition is exactly that of a vector space over R, so modules
over a field are just vector spaces over that field.
(2) At the other end of the spectrum in a sense, if A is an abelian group, then it has a natural structure
of Z-module: if n is a positive integer, then set n.a = a + a + . . . + a (n times) and if n is a negative
integer, set n.a = −(a + a + . . . + a) (where this time we add a to itself |n| times). It's easy to check this
makes A a Z-module, and moreover, the conditions (1), (2), (3) in fact force this definition on us, so
that this Z-module structure is unique[34]. Thus we see that Z-modules are just abelian groups.
(3) Suppose that R is a ring. Then R is a module over itself in the obvious way.
(4) If R is a ring and I is an ideal in R, then it follows directly from the denitions that I is an R-module.
(5) Again if I is an ideal in R then R/I is naturally an R-module where the multiplication action is given
via the quotient homomorphism q: R R/I, that is if m R/I and r R we set r.m = q(r).m (the
multiplication on the righthand side being inside the ring R/I). Indeed the properties (1), (2) and (3)
all follow immediately from the fact that q is a ring homomorphism.
(6) Generalising the previous example somewhat, if : R S is a homomorphism of rings, and M is
an S -module, then we can give M the structure of an R-module by setting r.m = (r).m (where the
action on the right-hand side comes from the S -module structure. Thus for example any if I is an
ideal of R then any R/I-module automatically has the structure of an R-module via the quotient map
q: R R/I.
(7) Generalising the example of R being a module over itself in a slightly different way, given our ring R and a positive integer n, we may consider the module R^n = {(r_1, r_2, ..., r_n) : r_i ∈ R} of n-tuples of elements of R (written as row vectors or column vectors; different books prefer different conventions), where the addition and the multiplication by scalars are done componentwise. (This is exactly the way we define the vector space R^n when R is a field.) Such a module is an example of a free module over R.
(8) To give a more substantial example, suppose that V is a complex vector space and ψ: V → V is a linear map. Then we can make V into a C[t]-module by setting p(t).v = p(ψ)(v) for any v ∈ V and p(t) ∈ C[t] (that is, just evaluate the polynomial p on the linear map ψ). Indeed, by Lemma 1.9 a homomorphism Θ from C[t] to End(V) is uniquely determined by its restriction to the scalars C and the image of t. Here we define Θ by the conditions that it sends the complex number λ ∈ C ⊆ C[t] to λ.id_V, and t to ψ. The fact that the assignment f.v = Θ(f)(v) for v ∈ V, f ∈ C[t] makes[35] V into a C[t]-module follows directly from the fact that Θ is a homomorphism. Conversely, if we are given a C[t]-module M, we can view it as a complex vector space, where the multiplication by scalars is given to us by viewing complex numbers as degree zero polynomials. The action of multiplication by t is then a C-linear map from M to itself. Thus C[t]-modules are just C-vector spaces equipped with an endomorphism.
[34] Writing down all the details of a proof of this is very similar to the exercise on the problem sheets in which you showed that given any ring R there is a unique homomorphism from Z to R.
[35] I don't know if this will help anyone, but given an abelian group M, to make it into an R-module, you just need to give a ring homomorphism from R to End_Z(M), the ring of all group homomorphisms from M to itself (the multiplication being given by composition).
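Example (8) can be made concrete. The sketch below (in Python; the helper names are ours, not the notes') realises the C[t]-action p(t).v = p(ψ)(v) for a 2x2 matrix ψ:

```python
# A sketch of example (8): V = C^2, psi a linear map (2x2 matrix),
# and the C[t]-module action p(t).v = p(psi)(v).

def apply(psi, v):
    """Apply a 2x2 matrix psi (list of rows) to a vector v."""
    return [psi[0][0]*v[0] + psi[0][1]*v[1],
            psi[1][0]*v[0] + psi[1][1]*v[1]]

def poly_act(coeffs, psi, v):
    """The module action p(t).v = p(psi)(v), where p(t) = c0 + c1*t + ...
    is given by its coefficient list coeffs = [c0, c1, ...]."""
    result = [0, 0]
    power = list(v)                      # psi^0(v) = v
    for c in coeffs:
        result = [result[0] + c*power[0], result[1] + c*power[1]]
        power = apply(psi, power)        # advance to the next power of psi
    return result

psi = [[0, 1], [0, 0]]                   # a nilpotent endomorphism: psi^2 = 0
v = [1, 1]
assert poly_act([0, 1], psi, v) == apply(psi, v)     # t.v = psi(v)
assert poly_act([0, 0, 1], psi, v) == [0, 0]         # (t^2).v = 0 since psi^2 = 0
```

The assertions check that t acts as ψ, and that the relation ψ² = 0 is inherited by the action of t².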
7.1. Submodules, generation and linear independence.
Definition 7.5. If M is an R-module, a subset N ⊆ M is called a submodule[36] if it is an abelian subgroup of M and whenever r ∈ R and n ∈ N then r.n ∈ N. If {N_i : i ∈ I} is a collection of submodules then their intersection ⋂_{i∈I} N_i is also a submodule. This allows us to define (just as we did for ideals, subrings, subfields etc.) for a set X ⊆ M the submodule generated by X,
⟨X⟩ = ⋂_{N⊇X} N,
where N runs over the submodules of M which contain X. Explicitly, it is the subset
{∑_{i=1}^k r_i x_i : r_i ∈ R, x_i ∈ X}.
The proof is exactly the same[37] as the proof for ideals in a ring. Following the terminology used in linear algebra, we will say X generates or spans the submodule ⟨X⟩. We say a module is finitely generated if there is a finite set which generates it. If N_1, N_2 are submodules then the submodule they generate is their sum N_1 + N_2 = {n_1 + n_2 : n_1 ∈ N_1, n_2 ∈ N_2}. To prove this, one first checks that the right-hand side is indeed a submodule, and then that any submodule containing N_1 and N_2 must contain all the elements of N_1 + N_2 (and these two steps both follow directly from the definitions). Note that this generalises the fact, which we have already seen, that the ideal generated by the union of two ideals I ∪ J is just their sum I + J.
One more notion that extends from linear algebra is that of linear independence:
Definition 7.6. If M is a module over R, we say a set S ⊆ M is linearly independent if whenever we have an equation r_1 s_1 + r_2 s_2 + ... + r_k s_k = 0 for r_i ∈ R, s_i ∈ S (1 ≤ i ≤ k) we have r_1 = r_2 = ... = r_k = 0. Finally, we can say that a set S is a basis for a module M if and only if it is linearly independent and it spans M. Any module which has a basis is called a free module.
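Over R = Z this can be tested concretely: since Z sits inside its field of fractions Q, vectors in Z^n are Z-linearly independent exactly when they are Q-linearly independent, so ordinary Gaussian elimination decides it. A minimal sketch (the helper name rank_over_Q is ours):

```python
from fractions import Fraction

def rank_over_Q(vectors):
    """Row-reduce integer vectors over Q. Since Z is an integral domain,
    vectors in Z^n are Z-linearly independent iff they are Q-linearly
    independent, so a rank computation over Q detects independence."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rank, col, n = 0, 0, len(vectors[0])
    while rank < len(rows) and col < n:
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                f = rows[r][col] / rows[rank][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
        col += 1
    return rank

# {(2,0), (0,3)} is linearly independent in the Z-module Z^2,
# while {(2,4), (1,2)} satisfies the relation 1*(2,4) - 2*(1,2) = 0.
assert rank_over_Q([(2, 0), (0, 3)]) == 2
assert rank_over_Q([(2, 4), (1, 2)]) == 1
```

Note that {(2, 0), (0, 3)} is linearly independent but is not a basis of Z^2: it spans only a proper submodule, unlike the situation for vector spaces.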
8. QUOTIENT MODULES AND ISOMORPHISM THEOREMS.
All rings R are integral domains.
Just as for vector spaces, given a module together with a submodule there is a natural notion of a quotient module. (If you've understood quotients of rings and quotients of vector spaces, everything here should look very familiar, as the constructions mimic those cases; in fact they are word for word the same as for quotient vector spaces.)
Definition 8.1. If N is a submodule of M, then in particular it is a subgroup of an abelian group, so we can form the quotient M/N. The condition that N is a submodule is then precisely what is needed for the multiplication on M to induce a module structure on M/N: if r ∈ R and m + N ∈ M/N then define r.(m + N) = r.m + N. This is well defined because if m_1 + N = m_2 + N we have m_1 - m_2 ∈ N, and so r.(m_1 - m_2) ∈ N, whence r.m_1 + N = r.m_2 + N. The module M/N is called the quotient module of M by N.
There is also a natural analogue of linear maps for modules: if M_1, M_2 are modules, we say that φ: M_1 → M_2 is a module homomorphism (or just homomorphism) if:
(1) φ(m_1 + m_2) = φ(m_1) + φ(m_2), for all m_1, m_2 ∈ M_1,
(2) φ(r.m) = r.φ(m), for all r ∈ R, m ∈ M_1,
that is, φ respects the addition and multiplication by ring elements. An isomorphism of R-modules is a homomorphism which is a bijection (and you can check, just as for groups, that this implies the inverse map of sets is also a homomorphism of modules). Just as the kernel and image of a linear map between vector spaces are subspaces, it is easy to see that ker(φ) = {m ∈ M_1 : φ(m) = 0} and im(φ) = {φ(m) : m ∈ M_1} are submodules of M_1 and M_2 respectively.
Example 8.2. When R is a field, module homomorphisms are exactly linear maps. When R is Z, a Z-module homomorphism is just a homomorphism of abelian groups. As another important example, it is easy to see that if M is an R-module and N is a submodule of M, then the definition of the module structure on M/N ensures precisely that the map q: M → M/N given by q(m) = m + N is a (surjective) module homomorphism.
[36] Note the definitions in this subsection are exactly the same as for the case of a vector space.
[37] As should come as no surprise given example (3) above: a subset of R viewed as an R-module is a submodule if and only if it is an ideal.
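For a Z-module quotient such as Z/nZ, the well-definedness argument in Definition 8.1 can be checked mechanically. A minimal sketch (the class name ZmodN is ours):

```python
# Quotient of the Z-module Z by the submodule nZ: cosets m + nZ.
class ZmodN:
    def __init__(self, m, n):
        self.n = n
        self.m = m % n          # normalise the coset representative
    def __add__(self, other):
        return ZmodN(self.m + other.m, self.n)
    def act(self, r):
        """The induced Z-action r.(m + nZ) = r*m + nZ."""
        return ZmodN(r * self.m, self.n)
    def __eq__(self, other):
        return self.n == other.n and self.m == other.m

# Well-definedness: 2 + 5Z and 7 + 5Z are the same coset, so acting by
# r = 3 on either representative gives the same answer.
assert ZmodN(2, 5).act(3) == ZmodN(7, 5).act(3)
```

The quotient map q(m) = m + nZ is then just `ZmodN(m, n)`, and the assertion is the check in Definition 8.1 for a particular pair of representatives.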
Lemma 8.3. (Submodule correspondence.) Let M be an R-module and N a submodule. Let q: M → M/N be the quotient map. If S is a submodule of M then q(S) is a submodule of M/N, while if T is a submodule of M/N then q^{-1}(T) is a submodule of M. Moreover the map T ↦ q^{-1}(T) gives an injective map from submodules of M/N to the submodules of M which contain N; thus submodules of M/N correspond bijectively to submodules of M which contain N.
Proof. That q(S) and q^{-1}(T) are submodules of M/N and M respectively follows directly from the definitions; we give the argument for q^{-1}(T), since the argument for q(S) follows exactly the same pattern. If m_1, m_2 ∈ q^{-1}(T) then q(m_1), q(m_2) ∈ T, and it follows, since T is a submodule, that q(m_1) + q(m_2) = q(m_1 + m_2) ∈ T, which says precisely that m_1 + m_2 ∈ q^{-1}(T). Similarly if r ∈ R then q(r.m_1) = r.q(m_1) ∈ T since q(m_1) ∈ T and T is a submodule, so that r.m_1 ∈ q^{-1}(T). Thus q^{-1}(T) is a submodule of M as required.
Now if T is any subset of M/N we have q(q^{-1}(T)) = T, simply because q is surjective. Since we have just checked that q^{-1}(T) is always a submodule of M, this immediately implies that the map S ↦ q(S) is a surjective map from submodules of M to submodules of M/N and that T ↦ q^{-1}(T) is an injective map; moreover, since q(N) = 0 ∈ T for any submodule T of M/N, we have N ⊆ q^{-1}(T), so that the image of the map T ↦ q^{-1}(T) consists of submodules of M which contain N. Hence it only remains to check that the submodules of M of the form q^{-1}(T) are precisely these submodules. To see this, suppose that S is an arbitrary submodule of M, and consider q^{-1}(q(S)). By definition this is
q^{-1}(q(S)) = {m ∈ M : q(m) ∈ q(S)} = {m ∈ M : there is s ∈ S such that m + N = s + N} = {m ∈ M : m ∈ S + N},
where the right-hand side is just another way of writing the submodule S + N. But if S contains N then we have S + N = S, and hence q^{-1}(q(S)) = S, so any submodule S which contains N is indeed the preimage of a submodule of M/N as required. □
Remark 8.4. If N ⊆ M is a submodule and q: M → M/N is the quotient map, then for a submodule Q of M containing N we will usually write Q/N for the submodule q(Q) of M/N.
Theorem 8.5. (Universal property of quotients.) Suppose that φ: M → N is a homomorphism of R-modules, and S is a submodule of M with S ⊆ ker(φ). Then there is a unique homomorphism φ̄: M/S → N such that φ = φ̄ ∘ q, where q: M → M/S is the quotient homomorphism; that is, the following diagram commutes:

    M ---φ---> N
     \        ^
      q      / φ̄
       \    /
        M/S

Moreover ker(φ̄) is the submodule ker(φ)/S = {m + S : m ∈ ker(φ)}.
Proof. Since q is surjective, the formula φ̄(q(m)) = φ(m) uniquely determines the values of φ̄, so that φ̄ is unique if it exists. But if m - m' ∈ S then, since S ⊆ ker(φ), it follows that 0 = φ(m - m') = φ(m) - φ(m'), and hence φ is constant on the S-cosets, and therefore induces a map φ̄(m + S) = φ(m). The fact that φ̄ is a homomorphism then follows directly from the definition of the module structure on the quotient M/S, and clearly φ = φ̄ ∘ q by definition. To see what the kernel of φ̄ is, note that φ̄(m + S) = φ(m) = 0 if and only if m ∈ ker(φ), and hence ker(φ̄) = {m + S : m ∈ ker(φ)} = ker(φ)/S as required. □
Corollary 8.6. Let M be an R-module. We have the following isomorphisms.
(1) (First isomorphism theorem.) If φ: M → N is a homomorphism then φ induces an isomorphism φ̄: M/ker(φ) → im(φ).
(2) (Second isomorphism theorem.) If N_1, N_2 are submodules of M then
(N_1 + N_2)/N_2 ≅ N_1/(N_1 ∩ N_2).
(3) (Third isomorphism theorem.) Suppose that N_1 ⊆ N_2 are submodules of M. Then we have
(M/N_1)/(N_2/N_1) ≅ M/N_2.
Proof. The proofs are exactly the same as for rings. For the first isomorphism theorem, apply the universal property to S = ker(φ). Since in this case ker(φ̄) = ker(φ)/ker(φ) = 0, it follows that φ̄ is injective and hence induces an isomorphism onto its image, which from the equation φ̄ ∘ q = φ must be exactly im(φ).
For the second isomorphism theorem, let q: M → M/N_2 be the quotient map. It restricts to a homomorphism p from N_1 to M/N_2, whose image is clearly (N_1 + N_2)/N_2, so by the first isomorphism theorem it is enough to check that the kernel of p is N_1 ∩ N_2. But this is clear: if n ∈ N_1 has p(n) = 0 then n + N_2 = 0 + N_2, so that n ∈ N_2, and so n ∈ N_1 ∩ N_2.
For the third isomorphism theorem, let q_i: M → M/N_i (i = 1, 2) be the quotient maps. By the universal property for q_1, we see that there is a homomorphism q̄_2: M/N_1 → M/N_2 induced by the map q_2: M → M/N_2, with kernel ker(q_2)/N_1 = N_2/N_1 and q̄_2 ∘ q_1 = q_2. Thus q̄_2 is surjective (since q_2 is) and hence the result follows by the first isomorphism theorem. □
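For submodules aZ and bZ of the Z-module Z, the second isomorphism theorem can be checked numerically: aZ + bZ = gcd(a,b)Z and aZ ∩ bZ = lcm(a,b)Z, so both sides of the isomorphism are finite cyclic groups whose orders must agree. A sketch (helper names ours):

```python
from math import gcd

# Second isomorphism theorem for the submodules aZ, bZ of Z:
# (aZ + bZ)/bZ = gcd(a,b)Z/bZ has b/gcd(a,b) cosets, while
# aZ/(aZ ∩ bZ) = aZ/lcm(a,b)Z has lcm(a,b)/a cosets.
def order_sum_quotient(a, b):
    return b // gcd(a, b)

def order_intersection_quotient(a, b):
    lcm = a * b // gcd(a, b)
    return lcm // a

for a, b in [(4, 6), (3, 5), (12, 18)]:
    assert order_sum_quotient(a, b) == order_intersection_quotient(a, b)
```

For instance with a = 4, b = 6 both quotients are cyclic of order 3, matching (4Z + 6Z)/6Z = 2Z/6Z ≅ Z/3Z ≅ 4Z/12Z.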
9. FREE, TORSION AND TORSION-FREE MODULES
All rings R in this section are integral domains unless otherwise stated.
The fact that the nonzero elements of a ring do not have to be invertible means that modules behave less uniformly than vector spaces do. For example, you know that any vector space has a basis, and hence in the above terminology it is free. However, over Z there are many modules which are not free: indeed a finite abelian group is certainly a Z-module, but it cannot be free, since a free module must contain infinitely many elements (if an element m is part of a basis, it is easy to check that the elements n.m must all be distinct for n ∈ Z). In fact if M is a finite abelian group, every element is of finite order by Lagrange's theorem, which means that for every element m ∈ M there is a positive integer n such that n.m = 0. This is one important way in which a module over a general ring can be different from the case of vector spaces: we may have a nonzero scalar r and a nonzero element m of our module M whose product r.m is nevertheless equal to zero. This is similar to the fact that a general ring may contain zero-divisors.
Definition 9.1. Let M be an R-module and suppose that m ∈ M. Then the annihilator of m, denoted Ann_R(m), is {r ∈ R : r.m = 0}. A direct check shows that Ann_R(m) is an ideal in R. When Ann_R(m) is nonzero we say that m ∈ M is a torsion element.
We say that a module M is torsion if every m ∈ M is a torsion element. On the other hand, if a module M has no nonzero torsion elements we say that M is torsion-free. Note that a ring is an integral domain if and only if it is torsion-free as a module over itself, i.e. the nonzero torsion elements in the R-module R itself are exactly the zero-divisors in R.
Remark 9.2. If M is an R-module and m ∈ M, then the submodule R.m of M generated by m is isomorphic as an R-module to R/Ann_R(m). Indeed the map r ↦ r.m defines an R-module homomorphism from R to M (this uses the commutativity of the multiplication in R only) whose image is exactly R.m. Since the kernel of the map is evidently Ann_R(m), the isomorphism follows from the first isomorphism theorem. (Note this also shows Ann_R(m) is an ideal, though this is also completely straightforward to see directly.) A module which is generated by a single element is known as a cyclic module. It follows from what we have just said that any cyclic module is isomorphic to a module of the form R/I where I is an ideal of R (corresponding to the annihilator of a generator of the cyclic module).
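As a concrete instance, in the Z-module Z/nZ the annihilator of a class m is the ideal generated by n/gcd(n, m), so the cyclic submodule Z.m is isomorphic to Z/(n/gcd(n, m))Z. A sketch (the helper name is ours):

```python
from math import gcd

def annihilator_generator(n, m):
    """In the Z-module Z/nZ, Ann_Z(m + nZ) is the ideal generated by
    n // gcd(n, m): the smallest positive r with r*m ≡ 0 (mod n)."""
    return n // gcd(n, m)

# In Z/12Z the element 8 is killed exactly by the multiples of 3,
# so the cyclic submodule Z.8 is isomorphic to Z/3Z.
n, m = 12, 8
d = annihilator_generator(n, m)
assert d == 3
assert all((r * m) % n == 0 for r in range(0, 10 * d, d))   # multiples of d kill m
assert all((r * m) % n != 0 for r in range(1, d))           # nothing smaller does
```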
Recall from above that we say a module M is free if it has a basis S. The case where S is finite is the one of most interest to us. Then, just as picking a basis of a vector space gives you coordinates[38] for the vector space, the basis S allows us to write down an isomorphism φ: M → R^n where n = |S|. Indeed if S = {s_1, s_2, ..., s_n} and m ∈ M, then we may write m = ∑_{i=1}^n r_i s_i for a unique n-tuple (r_1, r_2, ..., r_n) ∈ R^n, and we set φ(m) = (r_1, ..., r_n). It is straightforward to check that φ is then an isomorphism of modules.
It is easy to see that when R is an integral domain a free module must be torsion-free, but the converse need not be true in general, as the next example shows. On the other hand, for principal ideal domains, whose modules will be our main focus, we will shortly see that finitely generated torsion-free modules are actually free.
Example 9.3. Let R = C[x, y] be the ring of polynomials in two variables. Then the ideal I = (x, y) is a module for R. It is torsion-free because R is an integral domain (and I is a submodule of R), but it is a good exercise[39] to see that it is not free. The study of modules over a polynomial ring with many variables is a basic ingredient in algebraic geometry, which you can study in later years.
[38] That is, if V is an n-dimensional R-vector space, the fact that a choice of basis for V gives you an isomorphism from V to R^n is just a formal way of saying that picking a basis gives you coordinates for V.
[39] Which is on Problem Set 4!
Let R be a PID and suppose that M is a free module, so that M has a basis, that is, a linearly independent spanning set X. Just as for vector spaces, the size of a basis for a free module is uniquely determined (even though a free module may have many different bases, just as for vector spaces).

Lemma 9.4. Let M be a finitely generated free R-module. Then the size of a basis for M is uniquely determined, and is known as the rank rk(M) of M.

Proof. Let X = {x_1, ..., x_n} be a basis of M. Pick a maximal ideal[40] I in R.
Let N_I = {∑_{i=1}^n r_i x_i : r_i ∈ I}. Since I is an ideal, it follows that N_I is a submodule of M. In fact, since X generates M, any element of the form r.m where r ∈ I and m ∈ M lies in N_I, and so the submodule IM generated by the set of all such elements must lie in N_I. On the other hand, certainly IM contains r_i x_i for any r_i ∈ I and i ∈ {1, 2, ..., n}, and so all sums of the form ∑_{i=1}^n r_i x_i, so that N_I ⊆ IM. Hence we see N_I = IM. Notice that in particular this means the submodule N_I = IM does not depend on the choice of the basis X.
Let q: M → M/IM be the quotient map. The quotient module M/IM is a module not just for R but in fact[41] for the quotient field k = R/I, via the action (r + I).q(m) = r.q(m). Indeed we just need to check this definition does not depend on the choice of representative r of r + I. But if r - r' ∈ I then r.m - r'.m = (r - r').m ∈ IM, and so r'.q(m) = q(r'.m) = q(r.m) = r.q(m) as claimed.
We now claim that if X is a basis for M then q(X) is a basis for the k-vector space M/IM. Note that if we assume the claim then |X| = dim_k(M/IM), and the right-hand side is clearly independent of X (since we have checked that the submodule IM is), so this will finish the proof of the Lemma. To prove the claim, first note that since X generates M and q is surjective, it follows that q(X) generates (i.e. spans) M/IM. Now suppose we have ∑_{i=1}^n c_i q(x_i) = 0 in M/IM, where c_i ∈ k. Picking any representatives r_i ∈ R for the c_i ∈ R/I we see that
0 = ∑_{i=1}^n c_i q(x_i) = ∑_{i=1}^n r_i q(x_i) = ∑_{i=1}^n q(r_i x_i) = q(∑_{i=1}^n r_i x_i),
where the second equality follows from the definition of the R/I-action, and the remaining ones from the fact that q is an R-module homomorphism. But then it follows that y = ∑_{i=1}^n r_i x_i ∈ ker(q) = IM. But since IM = N_I, this means that r_i ∈ I for each i, that is, c_i = 0. It follows that q(X) is linearly independent and hence a k-basis of M/IM as required. □
We will shortly see that any finitely generated module is a quotient of a free module R^n for some n. It is thus interesting to study submodules of free modules. If R is a PID (as we will from now on assume) then the submodules of free modules are particularly well behaved.

Proposition 9.5. Let R be a PID, and let F be a free module with basis X, a finite set. Then any nonzero submodule N of F is free, on a basis with at most |X| elements.
Proof. We prove this by induction on k = |X|. When k = 1, the free module F is isomorphic to R itself, and a submodule of R is just an ideal. Since R is a PID, any ideal (submodule) I of R is principal, i.e. is generated by a single element, d say, and since R is a domain, so that d is not a zero divisor, it follows that R ≅ Rd via the map r ↦ r.d, so that I = Rd is a free module.
Next suppose |X| = k > 1. Using the coordinates given by the basis we may assume that F = R^k and we have N ⊆ R^k. Consider the projection π: R^k → R^{k-1}, where we view R^{k-1} as the submodule of R^k consisting of the elements whose final entry is zero. Then clearly π is a surjective homomorphism, and the image π(N) is a submodule of R^{k-1}. By induction, we know that π(N) ⊆ R^{k-1} is a free module whose bases all have the same size, which is at most k - 1.
Now if L = ker(π) then the kernel of π restricted to N is just N ∩ L, so that by the first isomorphism theorem, N/(N ∩ L) ≅ π(N). Now if N ∩ L = 0 we see that N ≅ π(N) and so we are done by induction. Otherwise, note that L is just the submodule of elements of R^k which are zero except for their last entry, which is clearly a copy of the free module R itself, and so N ∩ L, by the k = 1 case, is free on one generator, m say. Since π(N) is free, we may pick a basis S = {π(n_1), π(n_2), ..., π(n_s)}, where again by induction 0 ≤ s ≤ k - 1
[40] In a PID we know that maximal ideals exist: if R is a field then we take I = 0, otherwise we take aR for a ∈ R an irreducible element. In a general ring maximal ideals also always exist if you assume the axiom of choice.
[41] This is exactly what the submodule IM is cooked up to do: if you like, M/IM is the largest quotient of M on which R/I acts.
and n_1, n_2, ..., n_s ∈ N (if π(N) = 0 then we just take S = ∅, so that s = 0). Now if n ∈ N is an arbitrary element of N, then since S is a basis of π(N) we may write
π(n) = ∑_{i=1}^s c_i π(n_i)
uniquely for some c_1, c_2, ..., c_s ∈ R. But then, using the fact that π is a homomorphism, we see that π(n - ∑_{i=1}^s c_i n_i) = 0, that is, n - ∑_{i=1}^s c_i n_i ∈ ker(π). Thus we see that n - ∑_{i=1}^s c_i n_i ∈ N ∩ L, and so n - ∑_{i=1}^s c_i n_i = d.m for some d ∈ R. Thus we see that N is generated by {n_1, ..., n_s, m}. We claim this set is in fact a basis of N.
For convenience let n_{s+1} = m. Then if we have ∑_{i=1}^{s+1} a_i n_i = 0, applying π we see that ∑_{i=1}^s a_i π(n_i) = 0 (since π(n_{s+1}) = π(m) = 0), and hence, since {π(n_1), ..., π(n_s)} is a basis of π(N), it follows that a_i = 0 for all i (1 ≤ i ≤ s). But then we have a_{s+1} n_{s+1} = 0, and since F is torsion-free and n_{s+1} = m is nonzero, it follows that a_{s+1} = 0. Thus the set {n_1, ..., n_{s+1}} is linearly independent as required. □
Example 9.6.[42] Suppose that M ⊆ Z^3 is the submodule
M = {(a, b, c) ∈ Z^3 : a + b + c ∈ 2Z}.
Proposition 9.5 tells us that M must be free of rank at most 3, but let's use the strategy of the proof to actually find a basis. Let π: Z^3 → Z^2 be the projection (a, b, c) ↦ (a, b). Now in our case it is clear that π(M) is all of Z^2. Thus if we pick[43], say, n_1 = (1, 0, 1) and n_2 = (0, 1, 99), then n_1, n_2 ∈ M and π(n_1) = (1, 0) and π(n_2) = (0, 1) certainly form a basis for π(M). Thus given any m ∈ M, we can find a, b ∈ Z so that m - a.n_1 - b.n_2 ∈ ker(π): explicitly, if m = (a, b, c) then clearly we have m - a.n_1 - b.n_2 ∈ ker(π). Now if L = ker(π) then L is the copy {(0, 0, z) : z ∈ Z} of Z inside Z^3, and clearly M ∩ L = {(0, 0, 2n) : n ∈ Z}, so that (0, 0, 2) is a generator of L ∩ M, and so, putting everything together, {(1, 0, 1), (0, 1, 99), (0, 0, 2)} must be a basis of M as required.
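The basis just found can be verified mechanically: each proposed basis vector lies in M, and every element of M decomposes uniquely over it, with the coordinates read off exactly as in the proof. A sketch (helper names ours):

```python
def in_M(v):
    """Membership test for M = {(a,b,c) in Z^3 : a+b+c even}."""
    return sum(v) % 2 == 0

basis = [(1, 0, 1), (0, 1, 99), (0, 0, 2)]
assert all(in_M(v) for v in basis)

def coords(v):
    """Decompose v in M over the basis, following the proof: the first
    two coordinates are read off by the projection pi(a,b,c) = (a,b),
    and the remainder lands in M ∩ ker(pi) = Z.(0,0,2)."""
    a, b, c = v
    rem = c - a * basis[0][2] - b * basis[1][2]   # last entry of v - a.n1 - b.n2
    assert rem % 2 == 0                           # v - a.n1 - b.n2 lies in M ∩ L
    return (a, b, rem // 2)

# Spot-check: reconstruct several elements of M from their coordinates.
for v in [(2, 0, 0), (1, 1, 4), (3, 5, -2)]:
    assert in_M(v)
    x, y, z = coords(v)
    rebuilt = tuple(x * p + y * q + z * r for p, q, r in zip(*basis))
    assert rebuilt == v
```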
Recall we also had the notion of a torsion element in a module.
Lemma 9.7. Let M be an R-module. Then M_tor = {m ∈ M : Ann_R(m) ≠ 0} is a submodule of M. Moreover, the quotient module M/M_tor is a torsion-free module.
Proof. Let x, y ∈ M_tor. Then there are nonzero s, t ∈ R such that s.x = t.y = 0. But then s.t ∈ R\{0}, since R is an integral domain, and (s.t).(x + y) = t.(s.x) + s.(t.y) = 0; also, clearly, if r ∈ R then s.(r.x) = r.(s.x) = 0. It follows that M_tor is a submodule of M as required.
To see the 'moreover' part, suppose that x + M_tor is a torsion element in M/M_tor. Then there is a nonzero r ∈ R such that r.(x + M_tor) = 0 + M_tor, that is, r.x ∈ M_tor. But then by definition there is a nonzero s ∈ R such that s.(r.x) = 0. But then s.r ∈ R is nonzero (since R is an integral domain) and (s.r).x = 0, so that x ∈ M_tor and hence x + M_tor = 0 + M_tor, so that M/M_tor is torsion-free as required. □
10. PRESENTATIONS OF FINITELY GENERATED MODULES OVER A PID
In this section all rings R are PIDs.
We now show how to describe a finitely generated module, at least up to isomorphism, in terms of an R-valued matrix. To see this, we try to describe M in successively more concrete terms. The following Lemma really is just a rephrasing of the definition of finite generation for a module.
Lemma 10.1.
i) Let F be a finitely generated free R-module with basis X = {x_1, ..., x_n}, and M any R-module. If f: X → M is any function, there is a unique homomorphism φ: F → M extending f, that is, such that φ(x) = f(x) for all x ∈ X.
ii) Let M be a finitely generated module. Then there is an n ∈ N and a surjective homomorphism φ: R^n → M.
iii) Let M be a finitely generated module. Then there is an n ∈ N such that M is isomorphic to R^n/N for some submodule N of R^n.
[42] Not done in lectures, but might be helpful to read or discuss in tutorials.
[43] n_2 might not have been your first choice for an element of M which maps to (0, 1), but I chose it to emphasize that it doesn't matter which choice you make.
Proof. For the first part, given f we define φ as follows: for any m ∈ F we may write m uniquely in the form m = ∑_{i=1}^n r_i x_i where r_i ∈ R (1 ≤ i ≤ n). Then define φ(m) = ∑_{i=1}^n r_i f(x_i). It is straightforward to check that the fact that X is a basis ensures φ is a homomorphism. (This is precisely the same argument as the one used in linear algebra to show that a linear map is completely determined by its action on a basis.)
For the second part, given any finite subset {m_1, m_2, ..., m_n} of M, by the first part, if {e_1, ..., e_n} is a basis of R^n (say the standard basis, consisting of elements all of whose entries are zero except for a single entry equal to 1), the map f(e_i) = m_i extends to a homomorphism φ: R^n → M. Clearly the condition that {m_1, m_2, ..., m_n} is a generating set is then equivalent to the map φ being surjective, since both assert that any element of M can be written in the form ∑_{i=1}^n r_i m_i = φ(∑_{i=1}^n r_i e_i) (r_i ∈ R, 1 ≤ i ≤ n).
For the final part, note that if φ: R^n → M is a surjective homomorphism as in part ii), then by the first isomorphism theorem we have R^n/ker(φ) ≅ M (i.e. take N = ker(φ)). □
The Lemma shows that, since we only wish to understand modules up to isomorphism, we just need to understand the quotients R^k/N, where R^k is a finitely generated free module and N is a submodule of R^k. Thus essentially our task is to describe the submodules of R^k (k ∈ N). The key to doing this is given to us by Proposition 9.5: the submodule N is again free of rank at most k, so that picking a basis {n_1, ..., n_m} (where m ≤ k) we get an isomorphism θ: R^m → N given by (r_i)_{i=1}^m ↦ ∑_{i=1}^m r_i n_i. But now since N ⊆ R^k, composing with this inclusion we can view θ as an injective map θ: R^m → R^k. Now notice that the map θ: R^m → R^k captures the module M (up to isomorphism at least): the submodule N is just the image of θ, and our original module M is therefore isomorphic to R^k/im(θ). We formalise this as follows:
Definition 10.2. Let M be a finitely generated R-module. A presentation of M is an injective homomorphism θ: R^m → R^k such that R^k/im(θ) ≅ M. We often specify the isomorphism φ̄: R^k/im(θ) → M, and then view the presentation as a chain of maps

    R^m --θ--> R^k --φ--> M

where θ is injective and φ is surjective, and the image of θ is exactly the kernel of φ = φ̄ ∘ q, with q the quotient map R^k → R^k/im(θ). Remembering the whole chain remembers the module M on the nose, whereas remembering just the part θ: R^m → R^k remembers M up to isomorphism.
To see why, in more concrete terms, one calls this a presentation, let's make explicit what we have done. If {e_1, ..., e_m} is the standard basis of R^m and {f_1, ..., f_k} is the standard basis of R^k, then just as in linear algebra we may write
θ(e_j) = ∑_{i=1}^k a_{ij} f_i
for some a_{ij} ∈ R, and the resulting matrix A = (a_{ij})_{1≤i≤k, 1≤j≤m} encodes the homomorphism θ (formally speaking, this is a consequence of part i) of Lemma 10.1). Describing a module M as the quotient R^k/im(θ) says that M has generators m_1, ..., m_k (the images of the elements f_i + im(θ) ∈ R^k/im(θ) under the isomorphism from R^k/im(θ) to M), and the linear dependences these generators satisfy are all consequences of the equations
∑_{i=1}^k a_{ij} m_i = 0 (j = 1, 2, ..., m);
that is, the map φ: R^k → M picks out the generators we use for M, and the map θ records the relations, or linear dependencies, among these generators. Thus up to isomorphism the structure of the module M is captured by the matrix of relations A = (a_{ij}). As a result, we will be able to obtain a structure theorem for finitely generated modules over a Euclidean domain R by analysing matrices with entries in R.
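For instance, over R = Z the relation matrix A = [[2, 0], [0, 3]] presents the module Z^2/im(θ) with generators m_1, m_2 and relations 2m_1 = 0, 3m_2 = 0, i.e. Z/2Z x Z/3Z. A sketch counting its cosets directly (helper names ours):

```python
from itertools import product

# Presentation of a Z-module: theta: Z^2 -> Z^2 with relation matrix
# A = [[2, 0], [0, 3]] presents M = Z^2 / im(theta) = Z/2Z x Z/3Z.
A = [[2, 0], [0, 3]]

def reduce(v):
    """Canonical coset representative of v + im(theta): since the
    relations here are diagonal, reduce each entry mod the diagonal."""
    return (v[0] % A[0][0], v[1] % A[1][1])

# Enumerate cosets met by a box of integer vectors.
cosets = {reduce(v) for v in product(range(-10, 10), repeat=2)}
assert len(cosets) == 6          # |Z/2Z x Z/3Z| = 2 * 3 = 6
```

For a non-diagonal relation matrix the cosets are harder to enumerate by hand, which is exactly the point of the normal form developed in the next section.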
11. MATRICES OVER R AND NORMAL FORMS
All rings R in this section are Euclidean Domains.
The previous section shows that to understand finitely generated modules, we need to study homomorphisms θ: R^k → R^m between free modules. (Strictly speaking, we only need to understand injective homomorphisms, but the slightly more general question turns out to be no harder.) Concretely, we would like to find bases of R^k and R^m with respect to which the matrix for θ is as simple as possible.
In the case of vector spaces (i.e. when R is a field), this is just the way the proof of rank-nullity proceeds: you find bases of R^k and R^m such that the basis of R^k contains a basis for the kernel of θ (and hence the image of the remaining basis vectors gives a basis for the image), and then extend the basis of the image to a basis of the whole of R^m. The matrix of θ with respect to the resulting bases of R^k and R^m is then diagonal with 1s and 0s on the diagonal, where the rank of θ is the number of 1s and the nullity the number of 0s. Now the key to this argument is the Steinitz Exchange Lemma, which says that in a vector space you can extend any linearly independent set to a basis. However, this Lemma is obviously false for modules: for example 2Z is free inside Z, with basis {2}, but the only bases for Z are {1} and {-1}; thus we cannot extend the basis {2} to a basis of Z. We will show in this section that, in a sense, this is the only thing that fails for modules over a Euclidean domain: that is, we will show that if N is a submodule of R^m then there is always a basis e_1, ..., e_m of R^m for which suitable multiples of a subset of the basis, say c_1 e_1, ..., c_k e_k, give a basis for N.
To begin with, we recall some of the mechanics of representing a homomorphism θ: R^k → R^m by a matrix. Given bases of R^k and R^m, the matrix A of θ with respect to these bases, defined in the previous section, encodes the homomorphism θ. If we change the basis we take of R^k, then the matrix of θ becomes PA, where P is the k×k change of basis matrix, and similarly if we change the basis of R^m the matrix A becomes AQ, where Q is the change of basis matrix for the bases in R^m. Thus the homomorphism θ corresponds to the equivalence class of A in Mat_{k,m}(R), where X and Y are equivalent if there are invertible matrices P and Q such that Y = PXQ. Thus, concretely speaking, to get a simple description of θ, we want to find simple representative matrices for these equivalence classes.
The most explicit way to prove rank-nullity for linear maps between vector spaces is to use row and
column operations (which correspond to particularly simple changes of the basis of the source and target
of the linear map respectively). We will use the same idea for modules over a Euclidean domain.
Definition 11.1. Let A ∈ Mat_{m,k}(R) be a matrix, and let r_1, r_2, ..., r_m be the rows of A, which are row vectors in R^k. An elementary row operation on a matrix A ∈ Mat_{m,k}(R) is an operation of the form:
(1) Swap two rows r_i and r_j.
(2) Replace one row, row i say, with a new row r'_i = r_i + c.r_j for some c ∈ R and j ≠ i.
In the same way, viewing A as a list of k column vectors, we define elementary column operations.
Note that the row operations correspond to multiplying A by elementary matrices on the left, and the column operations correspond to multiplying A by elementary matrices on the right. Indeed, if we let E_{ij} denote the matrix with (i, j)-th entry equal to 1 and all other entries zero, then the matrix corresponding to the first row operation is S_{ij} = I_m - E_{ii} - E_{jj} + E_{ij} + E_{ji}, while the second elementary row operation is given by multiplying on the left by X_{ij}(c) = I_m + c.E_{ij}. The column operations are given by multiplying on the right by the analogous matrices.
Pictorially, S_{ij} is the identity matrix with the i-th and j-th diagonal entries replaced by 0 and with 1s inserted in positions (i, j) and (j, i), while X_{ij}(c) is the identity matrix with a single extra entry c in position (i, j).
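These elementary matrices can be written down and checked over R = Z. A sketch (helper names ours) verifying that left multiplication by S_{ij} and X_{ij}(c) really performs the two row operations:

```python
# Elementary matrices over Z: S swaps rows i and j, and X adds c times
# row j to row i, when multiplied on the left.

def identity(n):
    return [[1 if a == b else 0 for b in range(n)] for a in range(n)]

def matmul(A, B):
    return [[sum(A[a][t] * B[t][b] for t in range(len(B)))
             for b in range(len(B[0]))] for a in range(len(A))]

def S(n, i, j):
    """Swap matrix: identity with rows i and j exchanged."""
    M = identity(n)
    M[i][i] = M[j][j] = 0
    M[i][j] = M[j][i] = 1
    return M

def X(n, i, j, c):
    """Shear matrix: identity plus the entry c in position (i, j)."""
    M = identity(n)
    M[i][j] = c
    return M

A = [[2, 4], [1, 3], [0, 5]]
assert matmul(S(3, 0, 1), A) == [[1, 3], [2, 4], [0, 5]]       # rows 0 and 1 swapped
assert matmul(X(3, 0, 1, -2), A) == [[0, -2], [1, 3], [0, 5]]  # row0 += -2 * row1
```

Right multiplication by the corresponding k×k matrices performs the analogous column operations.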
Definition 11.2. If $A, B \in \mathrm{Mat}_{k,m}(R)$ we say that $A$ and $B$ are equivalent if $B = PAQ$ where $P \in \mathrm{Mat}_{k,k}(R)$ and $Q \in \mathrm{Mat}_{m,m}(R)$ are invertible matrices.⁴⁴ We will say that $A$ and $B$ are ERC equivalent if one can be obtained from the other by a sequence of elementary row and column operations. Since row and column operations correspond to pre- and post-multiplying a matrix by elementary matrices, it is clear that two ERC equivalent matrices are equivalent. (In fact, if you also allow the elementary row and column operations which simply rescale a row or column by a unit then you can show the converse too, but we do not need that here.)
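These identities are easy to verify by direct computation. Here is a small illustrative sketch over $R = \mathbb{Z}$ (the function names `S`, `X` and `matmul` are our own, not from the notes), checking that left multiplication by $S_{ij}$ swaps rows $i$ and $j$, and left multiplication by $X_{ij}(c)$ adds $c$ times row $j$ to row $i$:

```python
# Illustrative sketch (not from the notes): row operations as left
# multiplication by the elementary matrices S_ij and X_ij(c), over Z.

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def S(n, i, j):
    """Swap matrix: I - E_ii - E_jj + E_ij + E_ji."""
    M = identity(n)
    M[i][i] = M[j][j] = 0
    M[i][j] = M[j][i] = 1
    return M

def X(n, i, j, c):
    """Shear matrix I + c*E_ij: on the left, adds c times row j to row i."""
    M = identity(n)
    M[i][j] = c
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 5, 3], [8, 6, 4], [3, 1, 0]]
# Swap rows 1 and 3 (indices 0 and 2):
assert matmul(S(3, 0, 2), A) == [[3, 1, 0], [8, 6, 4], [2, 5, 3]]
# Add -4 times row 1 to row 2 (indices 0 and 1):
assert matmul(X(3, 1, 0, -4), A) == [[2, 5, 3], [0, -14, -8], [3, 1, 0]]
```

Right multiplication by the same matrices performs the corresponding column operations.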
Recall that we write $N \colon R \setminus \{0\} \to \mathbb{N}$ for the norm function of our Euclidean domain $R$.
Theorem 11.3. Suppose that $A \in \mathrm{Mat}_{k,m}(R)$ is a matrix. Then $A$ is ERC equivalent (and hence equivalent) to a diagonal matrix $D$ where
$$D = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & d_m \\ \vdots & & & \vdots \\ 0 & \cdots & \cdots & 0 \end{pmatrix}$$
and each successive $d_i$ divides the next (thus possibly $d_s = d_{s+1} = \dots = d_m = 0$ for some $s$, $1 \le s \le m$).
Proof. We claim that by using row and column operations we can find a matrix equivalent to $A$ which is of the form
(11.1) $$B = \begin{pmatrix} b_{11} & 0 & \cdots & 0 \\ 0 & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ 0 & b_{k2} & \cdots & b_{km} \end{pmatrix}$$
where $b_{11}$ divides all the entries $b_{ij}$ in the matrix. Factoring out $b_{11}$ from each entry, we may then apply induction to the submatrix $B' = (b_{ij}/b_{11})_{i,j \ge 2}$ to obtain the proposition. (Note that row and column operations on $B'$ correspond to row and column operations on $B$ because $b_{11}$ is the only nonzero entry in the first row and column of $B$.)
To prove the claim, we use a sequence of steps which either put $A$ into the required form or reduce the norm of the $(1,1)$ entry; hence, iterating them, we must stop with a matrix of the required form.
Step 1: By using row and column operations, ensure the entry $a_{11}$ has $N(a_{11}) \le N(a_{ij})$ for all $i, j$. By the division algorithm, we can write $a_{ij} = q_{ij}a_{11} + r_{ij}$ where $r_{ij} = 0$ or $N(r_{ij}) < N(a_{11})$. Subtract $q_{1j}$ times column $1$ from column $j$, and $q_{i1}$ times row $1$ from row $i$, for each $i$ and $j$ (in any order; we only care at the moment about what happens in the first row and column).
Step 2: The resulting matrix now has all nonzero entries in the first row and column of strictly smaller norm than the $(1,1)$-entry. Thus either we have a matrix of the form (11.1), or we can repeat step 1.
Now since at each iteration of steps 1 and 2 the minimum norm of the nonzero entries of our matrix strictly decreases, the process must terminate with a matrix of the required form, at least in the sense that all the entries in the first row and column are zero except for the $(1,1)$ entry. Let $C$ denote this matrix. If $c_{11}$ does divide all the entries of $C$ then we are done. Otherwise:
Step 3: Pick an entry $c_{ij}$ not divisible by $c_{11}$, so that $c_{ij} = q_{ij}c_{11} + r_{ij}$ where $r_{ij} \neq 0$ and $N(r_{ij}) < N(c_{11})$. But adding $q_{ij}$ times column $1$ to column $j$ and then subtracting row $1$ from row $i$, we get a new matrix $C'$ with $(i,j)$-entry equal to $r_{ij}$, and hence the minimum norm of the nonzero entries of $C'$ has again strictly decreased.
Repeat steps 1 and 2 now on it until we obtain a matrix again of the form (11.1). If its $(1,1)$-entry still does not divide all the entries of the matrix we may repeat step 3. Ultimately,⁴⁵ since the minimum norm of the nonzero entries of the matrix keeps strictly decreasing, we must terminate at a matrix of the required form. □
⁴⁴People often write $\mathrm{GL}_n(R)$ for the group of invertible matrices with coefficients in $R$. Note that the description of $A^{-1}$ in terms of the adjoint matrix works for any commutative ring, so $A \in \mathrm{Mat}_{k,k}(R)$ is invertible if and only if $\det(A)$ is invertible, that is, if and only if $\det(A) \in R^\times$. Thus for example $A \in \mathrm{Mat}_{n,n}(\mathbb{Z})$ is invertible if and only if $\det(A) = \pm 1$.
⁴⁵The strategy of this proof could be viewed as something like "if at first you don't succeed...".
Example 11.4. The above proposition is really an algorithm, so let's use it in an example, taking $R = \mathbb{Z}$. Let
$$A = \begin{pmatrix} 2 & 5 & 3 \\ 8 & 6 & 4 \\ 3 & 1 & 0 \end{pmatrix}$$
The entry of smallest norm is the $(3,2)$ entry, so we swap it to the $(1,1)$ position (by swapping rows 1 and 3 and then columns 1 and 2, say) to get
$$A_1 = \begin{pmatrix} 1 & 3 & 0 \\ 6 & 8 & 4 \\ 5 & 2 & 3 \end{pmatrix}$$
Now since the $(1,1)$ entry is a unit, there will be no remainders when dividing, so we get
$$A_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -10 & 4 \\ 0 & -13 & 3 \end{pmatrix}$$
Next we must swap the $(3,3)$-entry, which now has smallest norm, to the $(2,2)$ position to get:
$$A_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & -13 \\ 0 & 4 & -10 \end{pmatrix}$$
Dividing and repeating our row and column operations, now on the second row and column (this time we do get remainders), gives:
$$A_4 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & -13 \\ 0 & 1 & 3 \end{pmatrix} \sim A_5 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 2 \\ 0 & 1 & 8 \end{pmatrix}$$
(where $\sim$ denotes ERC equivalence). Now moving the $(3,2)$ entry to the $(2,2)$ position and dividing again gives:
$$A_6 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 8 \\ 0 & 3 & 2 \end{pmatrix} \sim A_7 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & -22 \end{pmatrix} \sim A_8 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -22 \end{pmatrix}$$
which is in the required normal form.
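The proof above is genuinely an algorithm, and for $R = \mathbb{Z}$ (where $N(x) = |x|$) it can be sketched in code as follows. This is our own illustrative implementation, not part of the notes; it returns the nonzero diagonal entries up to sign:

```python
def smith_diagonal(M):
    """Nonzero diagonal entries d_1 | d_2 | ... (up to sign) produced by
    row/column reduction of an integer matrix, following Theorem 11.3."""
    M = [row[:] for row in M]
    diag = []
    while M and any(any(row) for row in M):
        # Step 1: move a nonzero entry of smallest absolute value to (0, 0).
        _, pi, pj = min((abs(M[i][j]), i, j)
                        for i in range(len(M)) for j in range(len(M[0]))
                        if M[i][j])
        M[0], M[pi] = M[pi], M[0]
        for row in M:
            row[0], row[pj] = row[pj], row[0]
        p = M[0][0]
        # Clear the first column and row by the division algorithm.
        changed = False
        for i in range(1, len(M)):
            if M[i][0]:
                q = M[i][0] // p
                M[i] = [a - q * b for a, b in zip(M[i], M[0])]
                changed = changed or M[i][0] != 0
        for j in range(1, len(M[0])):
            if M[0][j]:
                q = M[0][j] // p
                for row in M:
                    row[j] -= q * row[0]
                changed = changed or M[0][j] != 0
        if changed:            # a smaller remainder appeared: back to step 1
            continue
        # Step 3: make the pivot divide every remaining entry.
        bad = next(((i, j) for i in range(1, len(M))
                    for j in range(1, len(M[0])) if M[i][j] % p), None)
        if bad:
            for row in M:      # add the offending column to column 1
                row[0] += row[bad[1]]
            continue
        diag.append(abs(p))
        M = [row[1:] for row in M[1:]]
    return diag

print(smith_diagonal([[2, 5, 3], [8, 6, 4], [3, 1, 0]]))  # → [1, 1, 22]
```

On the matrix of Example 11.4 this recovers the invariant factors $1, 1, 22$ found above.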
12. THE CANONICAL FORM FOR FINITELY GENERATED MODULES.
In this section, all rings are Euclidean Domains, although all results stated here actually also hold more generally for
PIDs.
Combining our results we can now readily prove a structure theorem for nitely generated modules
over a Euclidean domain.
Theorem 12.1. Suppose that M is a nitely generated module over a Euclidean domain R. Then there is an integer s
and nonzero nonunits c
1
, c
2
, . . . , c
r
R such that c
1
c
2
. . . c
r
such that:
M R
s
(
r

i=1
R/c
i
R.
Proof. Since $R$ is a PID we may find a presentation for $M$, that is, an injection $\alpha \colon R^m \to R^k$ (so that $m \le k$) and a surjection $R^k \to M$, so that $M \cong R^k/\mathrm{im}(\alpha)$. Now if $A$ is the matrix of $\alpha$ with respect to the standard bases of $R^k$ and $R^m$, by Theorem 11.3, which gives a normal form for matrices over a Euclidean domain, we know we can transform $A$ into a diagonal matrix $D$ with diagonal entries $d_1 \mid d_2 \mid \dots \mid d_m$. But since row and column operations correspond to pre- and post-multiplying $A$ by invertible matrices, and these correspond to changing bases in $R^k$ and $R^m$ respectively, it follows that there are bases of $R^k$ and $R^m$ respectively with
ALGEBRA II: RINGS AND MODULES. 33
respect to which $\alpha$ has matrix $D$. But then if $f_1, \ldots, f_k$ denotes the basis of $R^k$, we see that the image of $\alpha$ has basis $d_1f_1, \ldots, d_mf_m$. But now define a map $\theta \colon R^k \to \big(\bigoplus_{i=1}^m R/d_iR\big) \oplus R^{k-m}$ by setting, for any $n = \sum_{i=1}^k a_if_i \in R^k$,
$$\theta\Big(\sum_{i=1}^k a_if_i\Big) = (a_1 + d_1R, \ldots, a_m + d_mR, a_{m+1}, \ldots, a_k).$$
It is then clear that $\theta$ is surjective and $\ker(\theta)$ is exactly the submodule generated by $\{d_if_i : 1 \le i \le m\}$, that is, $\mathrm{im}(\alpha)$, and so by the first isomorphism theorem $M \cong R^k/\mathrm{im}(\alpha) \cong \bigoplus_{i=1}^m (R/d_iR) \oplus R^{k-m}$. Now since $\alpha$ is injective it follows that each of the $d_i$ is nonzero. On the other hand, if $d_i$ is a unit (and so all $d_j$ for $j \le i$ are also) then $R/d_iR = 0$, so this summand can be omitted from the direct sum. The result now follows. □
Remark 12.2. The elements $c_1, c_2, \ldots, c_r$ are in fact unique up to units. We won't have time to show this here (the problem sheets ask you to show uniqueness of $c_1$ and of the product $c_1 \cdots c_r$, at least given a presentation). The integer $s$ is also unique, as we now show as a consequence of an important corollary to the structure theorem, which says that a finitely generated torsion-free $R$-module is free.
Corollary 12.3. Let $M$ be a finitely generated torsion-free module over $R$. Then $M$ is free. In general, if $M$ is a finitely generated $R$-module, the rank $s$ of the free part of $M$ given in the structure theorem is $\mathrm{rk}(M/M_{\mathrm{tor}})$, and hence is unique.
Proof. By the above structure theorem, $M$ is isomorphic to a module of the form $R^s \oplus \bigoplus_{i=1}^r R/c_iR$, thus we can assume $M$ is actually equal to a module of this form. Let $F = R^s$ and $N = \bigoplus_{i=1}^r R/c_iR$, so that $M = F \oplus N$. We claim that $N = M_{\mathrm{tor}}$. Certainly if $a \in R/c_iR$ then since $c_i \mid c_r$ we see that $c_r \cdot a = 0$. But then if $m \in N$, say $m = (a_1, \ldots, a_r)$ where $a_i \in R/c_iR$, it follows that $c_r \cdot (a_1, \ldots, a_r) = (c_ra_1, \ldots, c_ra_r) = (0, \ldots, 0)$, so $N$ is torsion. On the other hand, if $m = (f, n)$ with $f \in F$ and $n \in N$ is torsion, say $x \cdot (f, n) = (x.f, x.n) = (0, 0)$ for some nonzero $x \in R$, then we must have $f = 0$, since a free module is torsion-free. Thus $M_{\mathrm{tor}} = N$ as claimed. It follows that $M$ is torsion-free if and only if $M = F$ is free. Moreover, by the second isomorphism theorem $F \cong M/M_{\mathrm{tor}}$, so that $s = \mathrm{rk}(F) = \mathrm{rk}(M/M_{\mathrm{tor}})$. □
(Note that Problem sheet 4 gives an alternative proof that a torsion-free module over a PID is free using just Proposition 9.5.)
Just to make it explicit, notice that since an abelian group is just a $\mathbb{Z}$-module, our structure theorem gives us a classification theorem for finitely generated abelian groups.
Corollary 12.4. (Structure theorem for finitely generated abelian groups) Let $A$ be a finitely generated abelian group. Then there exist uniquely determined integers $r \in \mathbb{Z}_{\ge 0}$ and $c_1, c_2, \ldots, c_k \in \mathbb{Z}_{>0}$ with $c_1 \mid c_2 \mid \dots \mid c_k$ such that
$$A \cong \mathbb{Z}^r \oplus (\mathbb{Z}/c_1\mathbb{Z}) \oplus \dots \oplus (\mathbb{Z}/c_k\mathbb{Z}).$$
Proof. This is simply a restatement of the previous theorem, except that once we insist the $c_i$ are positive, the ambiguity caused by the unit group $\mathbb{Z}^\times = \{\pm 1\}$ is removed. □
Suppose that $a, b \in R$ are coprime, that is, $\mathrm{h.c.f.}(a,b) = 1$, and hence $R = Ra + Rb$ and $aR \cap bR = (ab)R$. Then the Chinese Remainder Theorem shows that $R/(ab)R \cong R/aR \oplus R/bR$. If $a_1, \ldots, a_m$ are pairwise coprime, so that $\mathrm{h.c.f.}(a_i, a_j) = 1$ for all $i \neq j$, then $\mathrm{h.c.f.}(a_i, a_{i+1} \cdots a_m) = 1$, since if $p$ is a prime dividing $a_i$ and $a_{i+1} \cdots a_m$ then $p$ divides some $a_j$ for $j > i$, and hence $\mathrm{h.c.f.}(a_i, a_j) \neq 1$. Thus if $c = a_1a_2 \cdots a_m$, iteratively we see that
$$R/cR \cong R/a_1R \oplus R/(a_2 \cdots a_m)R \cong R/a_1R \oplus R/a_2R \oplus R/(a_3 \cdots a_m)R \cong \dots \cong R/a_1R \oplus R/a_2R \oplus \dots \oplus R/a_mR.$$
In particular, if $c = p_1^{n_1} \cdots p_r^{n_r}$ is the prime factorisation of $c$, where the $p_i$ are distinct primes, we see that
(12.1) $$R/cR \cong \bigoplus_{i=1}^r R/p_i^{n_i}R.$$
This allows us to give an alternative statement of the structure theorem:
Theorem 12.5. (Structure theorem in primary decomposition form) Let $R$ be a Euclidean domain and suppose that $M$ is a finitely generated $R$-module. Then there is an integer $s \ge 0$, irreducibles $p_1, \ldots, p_k \in R$, unique up to units, and integers $r_i$, $1 \le i \le k$, such that
$$M \cong R^s \oplus \bigoplus_{i=1}^k (R/p_i^{r_i}R).$$
Moreover, the integers $r_i$ are uniquely determined. (Note that the $p_i$s are not necessarily distinct.)
Proof. This follows immediately by applying the decomposition (12.1) to each of the cyclic modules $R/c_iR$ in the statement of our first structure theorem. □
Example 12.6. Suppose that $A \cong \mathbb{Z}/44\mathbb{Z} \oplus \mathbb{Z}/66\mathbb{Z}$. Then the first structure theorem would write $A$ as:
$$A \cong \mathbb{Z}/22\mathbb{Z} \oplus \mathbb{Z}/132\mathbb{Z}.$$
Indeed the generators corresponding to the direct sum decomposition give a presentation of $A$ as $\mathbb{Z}^2 \to \mathbb{Z}^2 \to A$, where the first map is given by the matrix
$$\begin{pmatrix} 44 & 0 \\ 0 & 66 \end{pmatrix}$$
and, as $66 = 1 \cdot 44 + 22$, we see that row and column operations allow us to show this matrix is equivalent to:
$$\begin{pmatrix} 44 & 0 \\ 0 & 66 \end{pmatrix} \sim \begin{pmatrix} 44 & 44 \\ 0 & 66 \end{pmatrix} \sim \begin{pmatrix} 44 & 44 \\ -44 & 22 \end{pmatrix} \sim \begin{pmatrix} 22 & -44 \\ 44 & 44 \end{pmatrix} \sim \begin{pmatrix} 22 & 0 \\ 0 & 132 \end{pmatrix}.$$
On the other hand, for the primary decomposition we write $A$ as:
$$A \cong \big((\mathbb{Z}/2\mathbb{Z}) \oplus (\mathbb{Z}/2^2\mathbb{Z})\big) \oplus (\mathbb{Z}/3\mathbb{Z}) \oplus (\mathbb{Z}/11\mathbb{Z})^2.$$
Notice that the prime $2$ appears twice, to two different powers. Intuitively, you should think of the primary decomposition as decomposing a module into as many direct summands as possible, while the canonical form decomposes the module into as few direct summands as possible.
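Passing from the invariant factor form to the primary decomposition is just factorisation, as in (12.1). Here is a sketch over $\mathbb{Z}$ (our own illustrative code, not from the notes, using naive trial division):

```python
# Illustrative sketch (not from the notes): primary decomposition of a
# finite abelian group given by its invariant factors, as in (12.1).

def prime_power_parts(c):
    """Factor c > 1 as a product of prime powers p^n (trial division)."""
    parts, p = [], 2
    while p * p <= c:
        if c % p == 0:
            q = 1
            while c % p == 0:
                c //= p
                q *= p
            parts.append(q)
        p += 1
    if c > 1:
        parts.append(c)
    return parts

def primary_decomposition(invariant_factors):
    """Orders of the cyclic prime-power summands Z/p^n Z, sorted."""
    return sorted(q for c in invariant_factors for q in prime_power_parts(c))

# Z/22 ⊕ Z/132 has primary decomposition Z/2 ⊕ Z/3 ⊕ Z/4 ⊕ (Z/11)^2:
print(primary_decomposition([22, 132]))  # → [2, 3, 4, 11, 11]
```

Applied to Example 12.6 this recovers the five prime-power summands found above.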
Remark 12.7. Note that the first structure theorem gives a canonical form which can be obtained algorithmically, while the second requires one to be able to factorise elements of the Euclidean domain, which, for example in $\mathbb{C}[t]$, is not an automatically computable operation.
13. APPLICATION TO RATIONAL AND JORDAN CANONICAL FORMS
The structure theorem also allows us to recover structure theorems for linear maps: if $V$ is a $k$-vector space and $T \colon V \to V$ is a linear map, then we view $V$ as a $k[t]$-module by setting $t.v = T(v)$ for $v \in V$.
Lemma 13.1. Let $M$ be a finitely generated $k[t]$-module. Then $M$ is finite dimensional as a $k$-vector space if and only if $M$ is a torsion $k[t]$-module. Moreover, a subspace $U$ of $V$ is a $k[t]$-submodule if and only if $U$ is $T$-invariant, i.e. $T(U) \subseteq U$.
Proof. Given $M$ we can apply the structure theorem to see that
$$M \cong k[t]^s \oplus k[t]/(c_1) \oplus \dots \oplus k[t]/(c_k).$$
Now $k[t]$ is infinite dimensional as a $k$-vector space while $k[t]/(f)$ is $\deg(f)$-dimensional as a $k$-vector space, so it follows that $M$ is torsion if and only if $s = 0$, if and only if $M$ is finite dimensional as a $k$-vector space. For the final statement, notice that a subspace $U$ of $V$ is $T$-invariant if and only if it is $p(T)$-invariant for every $p \in k[t]$. □
The Lemma shows that pairs $(V, T)$ consisting of a finite dimensional $k$-vector space $V$ and a linear map $T \colon V \to V$ correspond to finitely generated torsion $k[t]$-modules under our correspondence above. We can use this to give structure theorems for endomorphisms⁴⁶ of a vector space.
⁴⁶An endomorphism of a vector space $V$ is just a linear map from $V$ to itself.
Definition 13.2. For a monic polynomial $f \in k[t]$, $f = t^n + \sum_{i=0}^{n-1} a_it^i$, the matrix
$$C_f = \begin{pmatrix} 0 & \cdots & \cdots & 0 & -a_0 \\ 1 & 0 & & \vdots & -a_1 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & 0 & -a_{n-2} \\ 0 & \cdots & 0 & 1 & -a_{n-1} \end{pmatrix}$$
is called the companion matrix of $f$.
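A quick sanity check (our own sketch, not from the notes): we build $C_f$ from the coefficients of $f$ and verify that $f(C_f) = 0$, as the Cayley–Hamilton theorem predicts, since the characteristic polynomial of a companion matrix is $f$ itself:

```python
# Illustrative sketch (not from the notes): the companion matrix C_f of a
# monic f = t^n + a_{n-1}t^{n-1} + ... + a_0, and a check that f(C_f) = 0.

def companion(coeffs):
    """coeffs = [a_0, ..., a_{n-1}]: subdiagonal 1s, last column -a_i."""
    n = len(coeffs)
    return [[(1 if i == j + 1 else 0) if j < n - 1 else -coeffs[i]
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def poly_at(coeffs, M):
    """Evaluate f(M) = M^n + a_{n-1}M^{n-1} + ... + a_0*I by Horner's rule."""
    n = len(M)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    result = I  # start from the leading coefficient 1
    for a in reversed(coeffs):
        result = matmul(result, M)
        result = [[result[i][j] + a * I[i][j] for j in range(n)]
                  for i in range(n)]
    return result

C = companion([1, 2, 3])          # f = t^3 + 3t^2 + 2t + 1
assert C == [[0, 0, -1], [1, 0, -2], [0, 1, -3]]
assert poly_at([1, 2, 3], C) == [[0, 0, 0]] * 3
```

This mirrors the module-theoretic statement below: $f$ annihilates the cyclic module $k[t]/(f)$, on which multiplication by $t$ acts by $C_f$.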
Theorem 13.3. Suppose $V$ is a finite dimensional $k$-vector space and $T \colon V \to V$ is a linear map. Then there are nonconstant polynomials $f_1, \ldots, f_k \in k[t]$ such that $f_1 \mid f_2 \mid \dots \mid f_k$ and a basis of $V$ with respect to which $T$ has a matrix which is block diagonal with blocks $C(f_i)$:
$$\begin{pmatrix} C(f_1) & 0 & \cdots & 0 \\ 0 & C(f_2) & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & C(f_k) \end{pmatrix}$$
Proof. By the canonical form theorem and the previous Lemma, there is an isomorphism $V \cong \bigoplus_{i=1}^k k[t]/(f_i)$ of $k[t]$-modules, where $f_1 \mid f_2 \mid \dots \mid f_k$ and the $f_i$ are nonunits (hence nonconstant polynomials). But now for any monic polynomial $g = t^n + \sum_{i=0}^{n-1} a_it^i$, the division algorithm shows that the $k[t]$-module $k[t]/(g)$ has $k$-basis $1 + (g), t + (g), \ldots, t^{n-1} + (g)$, and the matrix for multiplication by $t$ with respect to this basis is just $C(g)$, since $t \cdot t^{n-1} \equiv -\sum_{i=0}^{n-1} a_it^i \bmod (g)$. The union of these bases for each summand $k[t]/(f_i)$ gives a basis for the direct sum, and the matrix of $T$ with respect to the corresponding basis of $V$ (obtained via the inverse of the isomorphism above) has the asserted form. □
This matrix form for a linear map given by the previous theorem is known as the Rational Canonical Form of $T$. Notice that this form, unlike the Jordan canonical form, makes sense for a linear map on a vector space over any field, not just an algebraically closed field like $\mathbb{C}$.
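As an illustration (our own sketch, not part of the notes), the rational canonical form can be assembled mechanically from the invariant factors $f_1 \mid \dots \mid f_k$, each represented by its list of coefficients $[a_0, \ldots, a_{n-1}]$:

```python
# Illustrative sketch (not from the notes): assemble the rational canonical
# form as a block diagonal matrix of companion matrices C(f_1), ..., C(f_k).

def companion(coeffs):
    """Companion matrix of the monic f = t^n + a_{n-1}t^{n-1} + ... + a_0,
    where coeffs = [a_0, ..., a_{n-1}]."""
    n = len(coeffs)
    return [[(1 if i == j + 1 else 0) if j < n - 1 else -coeffs[i]
             for j in range(n)] for i in range(n)]

def rational_canonical_form(factors):
    """factors: coefficient lists for the invariant factors f_1 | ... | f_k."""
    blocks = [companion(f) for f in factors]
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                M[offset + i][offset + j] = entry
        offset += len(b)
    return M

# f_1 = t - 1, f_2 = t^2 - 1 (note f_1 | f_2):
assert rational_canonical_form([[-1], [-1, 0]]) == [
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
]
```

Note that this only assembles the form; computing the invariant factors themselves from a given matrix is the subject of Section 14.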
We can also recover the Jordan canonical form for linear maps of $\mathbb{C}$-vector spaces from the second, primary decomposition, version of our structure theorem, which expresses each module in terms of cyclic modules $k[t]/(f^r)$ where $f$ is irreducible. The monic irreducibles over $\mathbb{C}$ are exactly the polynomials $t - \lambda$ for $\lambda \in \mathbb{C}$. Thus the second structure theorem tells us that, for $V$ a finite dimensional complex vector space and $T \colon V \to V$, we may write $V = V_1 \oplus V_2 \oplus \dots \oplus V_k$ where each $V_i$ is isomorphic to $\mathbb{C}[t]/((t - \lambda)^r)$ for some $\lambda \in \mathbb{C}$ and $r \in \mathbb{N}$.
It is thus enough to pick a standard basis for a vector space $V$ with $T \colon V \to V$ where as $\mathbb{C}[t]$-modules we have an isomorphism $\theta \colon V \to \mathbb{C}[t]/((t - \lambda)^r)$. Here we pick the basis
$$1 + ((t - \lambda)^r),\ (t - \lambda) + ((t - \lambda)^r),\ (t - \lambda)^2 + ((t - \lambda)^r),\ \ldots,\ (t - \lambda)^{r-1} + ((t - \lambda)^r)$$
of $\mathbb{C}[t]/((t - \lambda)^r)$ and take its preimage $v_1, \ldots, v_r$, numbered so that
$$\theta(v_i) = (t - \lambda)^{r-i} + ((t - \lambda)^r).$$
Clearly $(T - \lambda)v_i = v_{i-1}$, that is $T(v_i) = \lambda v_i + v_{i-1}$, if $i > 1$, and $T(v_1) = \lambda v_1$. Thus the matrix of $T$ with respect to this basis is just the Jordan block of size $r$ with eigenvalue $\lambda$, and we recover the Jordan normal form.
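This computation can be checked mechanically. The sketch below (our own illustration, not from the notes) writes out the matrix of multiplication by $t$ on $\mathbb{C}[t]/((t-\lambda)^r)$ in the basis $v_i \mapsto (t-\lambda)^{r-i}$ and compares it with the Jordan block:

```python
# Illustrative sketch (not from the notes): multiplication by t on
# C[t]/((t - lam)^r) in the basis v_i = (t - lam)^{r-i} is the r x r
# Jordan block with eigenvalue lam.

def mult_by_t_matrix(lam, r):
    """Column i records t * v_i = lam * v_i + v_{i-1} (with v_0 = 0,
    since (t - lam)^r vanishes in the quotient)."""
    M = [[0] * r for _ in range(r)]
    for i in range(1, r + 1):        # basis vector v_i = (t - lam)^{r-i}
        M[i - 1][i - 1] = lam        # lam * v_i term
        if i > 1:
            M[i - 2][i - 1] = 1      # v_{i-1} term on the superdiagonal
    return M

def jordan_block(lam, r):
    return [[lam if i == j else 1 if j == i + 1 else 0
             for j in range(r)] for i in range(r)]

assert mult_by_t_matrix(5, 4) == jordan_block(5, 4)
```

The agreement of the two matrices is exactly the statement $T(v_i) = \lambda v_i + v_{i-1}$, $T(v_1) = \lambda v_1$ proved above.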
Remark 13.4. Next term I will post an additional note summarizing the main ideas and results in the course,
which hopefully will help when reviewing the course.
14. REMARK ON COMPUTING RATIONAL CANONICAL FORM
Not covered in lectures; not examinable!
It is also worth considering how one can explicitly compute the decomposition that yields the rational canonical form: our proof of the existence of the canonical form is constructive, so if we can find a presentation of the $k[t]$-module given by a linear map acting on a $k$-vector space $V$ then we should be able to compute it. The following proposition shows one way to do this.
Proposition 14.1. Let $V$ be an $n$-dimensional $k$-vector space and $\alpha \colon V \to V$ a linear map. If $\alpha$ has matrix $A \in \mathrm{Mat}_n(k)$ with respect to a basis $e_1, \ldots, e_n$ of $V$, then the $k[t]$-module corresponding to $(V, \alpha)$ has a presentation
$$k[t]^n \xrightarrow{\ r\ } k[t]^n \xrightarrow{\ f\ } V$$
where the homomorphism $r$ between the free $k[t]$-modules is given by the matrix $tI_n - A \in \mathrm{Mat}_n(k[t])$, and the map $f \colon k[t]^n \to V$ is given by $(f_1, \ldots, f_n) \mapsto \sum_{i=1}^n f_i(A)(e_i)$.
Proof. Sketch: Since $t$ acts by $\alpha$ on $V$, and $\alpha$ has matrix $A$, it follows that the image $N$ of the map $r$ lies in the kernel of $f$. It thus suffices to check that $r$ is injective and that $N$ is the whole kernel of $f$. To see that it is the whole kernel, let $F = k^n \subseteq k[t]^n$ be the copy of $k^n$ embedded as the degree zero polynomials. It follows immediately from the definitions that $f$ restricts to a $k$-linear isomorphism from $F$ to $V$, and thus it is enough to show that $N + F = k[t]^n$ and $N \cap F = 0$, both of which can be checked directly. Finally, since the quotient $k[t]^n/N$ is torsion, $N$ must have rank $n$, and hence $r$ does not have a kernel (since the kernel would have to be free of positive rank, and hence the image would have rank less than $n$). □
Example 14.2. Let $A$ be the matrix
$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -1 & -2 & -3 \end{pmatrix}$$
Then we have
$$tI_3 - A = \begin{pmatrix} t & -1 & 0 \\ 0 & t & -1 \\ 1 & 2 & 3+t \end{pmatrix} \sim \begin{pmatrix} 1 & 2 & 3+t \\ 0 & t & -1 \\ t & -1 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 \\ 0 & t & -1 \\ 0 & -1-2t & -3t-t^2 \end{pmatrix}$$
$$\sim \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & t \\ 0 & -3t-t^2 & -1-2t \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -(t^3+3t^2+2t+1) \end{pmatrix}$$
so that, up to units, $tI_3 - A$ is equivalent to $\mathrm{diag}(1, 1, t^3+3t^2+2t+1)$, and $(\mathbb{Q}^3, A)$ is isomorphic as a $\mathbb{Q}[t]$-module to $\mathbb{Q}[t]/(g)$ where $g = t^3 + 3t^2 + 2t + 1$.
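The conclusion of the example is easy to verify numerically. The following sketch (ours, not from the notes) checks that $g(A) = 0$ and that $e_1$ is a cyclic vector, so that $g$ really is the minimal (equivalently, here, characteristic) polynomial of $A$:

```python
# Illustrative check (not from the notes) of Example 14.2: g(A) = 0 for
# g = t^3 + 3t^2 + 2t + 1, and e_1, A e_1, A^2 e_1 are linearly independent,
# so no monic polynomial of degree < 3 annihilates A and (Q^3, A) ≅ Q[t]/(g).

A = [[0, 1, 0], [0, 0, 1], [-1, -2, -3]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

A2 = matmul(A, A)
A3 = matmul(A2, A)

# g(A) = A^3 + 3A^2 + 2A + I is the zero matrix:
gA = [[A3[i][j] + 3 * A2[i][j] + 2 * A[i][j] + I3[i][j]
       for j in range(3)] for i in range(3)]
assert gA == [[0, 0, 0]] * 3

# e_1, A e_1, A^2 e_1 form a basis (nonzero determinant), so the minimal
# polynomial has degree 3 and must equal g:
a, b, c = [1, 0, 0], [row[0] for row in A], [row[0] for row in A2]
det = (a[0] * (b[1] * c[2] - b[2] * c[1])
       - b[0] * (a[1] * c[2] - a[2] * c[1])
       + c[0] * (a[1] * b[2] - a[2] * b[1]))
assert det != 0
```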
MATHEMATICAL INSTITUTE, OXFORD.