pseudo-syllabus
Prerequisites While Math 259 will proceed at a pace appropriate for a graduate-level course,
its prerequisites are perhaps surprisingly few: complex analysis at the level of Math 113, and
linear algebra and basic number theory (up to say arithmetic in the field Z/p and Quadratic
Reciprocity). Some considerably deeper results (e.g. estimates on Kloosterman sums) will
be cited but may be regarded as black boxes for our purposes. If you know about algebraic
number fields or modular forms or curves over finite fields, you'll get more from the course at
specific points, but these points will be in the nature of scenic detours that are not required
for the main journey.
Texts There is no textbook for the class; lecture notes will be handed out periodically. This
class is an introduction to several different flavors of analytic methods in number theory, and
I know of no one work that covers all this material. Supplementary readings such as Serre's
A Course in Arithmetic and Titchmarsh's The Theory of the Riemann Zeta-Function will be
suggested as we approach their respective territories.
Office Hours 335 Sci Ctr, Thursdays 2:30–4 PM; or e-mail me at elkies@math to ask questions
or set up an alternative meeting time.
Grading There will be no required homework, though the lecture notes will contain recommended
exercises. If you are taking Math 259 for a grade (i.e. are not a post-Qual math
graduate student exercising your EXC option), tell me so we can work out an evaluation and
grading procedure. This will most likely be either an expository final paper or an in-class
presentation on some aspect of analytic number theory related to but just beyond what we
cover in class. Which grading method is appropriate will be determined once the class size
has stabilized after "Shopping Period". The supplementary references will be a good source
for paper or presentation topics.
Math 259: Introduction to Analytic Number Theory
Introduction: What is analytic number theory?
L-functions such as
$$L(s) := 1 - 3^{-s} - 7^{-s} + 9^{-s} + 11^{-s} - 13^{-s} - 17^{-s} + 19^{-s} + - - + \cdots$$
Prove that the "Kloosterman sum"
$$K(p; a, b) := \sum_{x=1}^{p-1} \exp\left(\frac{2\pi i}{p}\,(ax + bx^{-1})\right)$$
(with $x^{-1}$ being the inverse of $x$ mod $p$) has absolute value at most $2\sqrt{p}$.
Show that if a function $f : \mathbf{R} \to \mathbf{R}$ satisfies reasonable smoothness conditions
then for large $N$ the absolute value of the exponential sum
$$\sum_{n=1}^{N} \exp(i f(n))$$
grows no faster than $N^\theta$ for some $\theta < 1$ (with $\theta$ depending on the conditions
imposed on $f$).
Investigate the coefficients of modular forms such as
$$\eta^8 \eta_2^8 = q \prod_{n=1}^{\infty} (1 - q^n)^8 (1 - q^{2n})^8 = q - 8q^2 + 12q^3 + 64q^4 - 210q^5 - 96q^6 \cdots.$$
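The first few coefficients quoted above can be checked by multiplying out the product as a truncated power series. The following is only a sketch; the function name `eta_product_coeffs` is ours and is not part of the notes.

```python
def eta_product_coeffs(num_terms):
    """Coefficients of q^1 .. q^num_terms in q * prod_{n>=1} (1-q^n)^8 (1-q^(2n))^8."""
    # series[k] is the coefficient of q^k in the infinite product (without the leading q),
    # truncated at degree num_terms - 1.
    series = [0] * num_terms
    series[0] = 1
    for n in range(1, num_terms):
        for step in (n, 2 * n):
            if step >= num_terms:
                continue  # this factor cannot affect the retained coefficients
            for _ in range(8):
                # multiply in place by (1 - q^step), highest degree first
                for k in range(num_terms - 1, step - 1, -1):
                    series[k] -= series[k - step]
    return series  # series[k] is the coefficient of q^(k+1) in the modular form


if __name__ == "__main__":
    print(eta_product_coeffs(6))
```

Factors with exponent at least `num_terms` are skipped because they only change higher-order terms.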
(Fortunately it will turn out that the route from (say) $\pi(x)$ to $\zeta(s)$ is not nearly
as long and tortuous as that from $x^n + y^n = z^n$ to deformations of Galois
representations...¹)
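For readers who like to experiment, the Kloosterman bound $|K(p; a, b)| \le 2\sqrt{p}$ stated among the sample problems can be tested numerically for small primes. This is a brute-force sketch, not a proof; the function name is ours, and the inverse is computed by Fermat's little theorem, which assumes $p$ is prime.

```python
import cmath
import math


def kloosterman(p, a, b):
    """Brute-force Kloosterman sum K(p; a, b) for a prime p (assumed, not checked)."""
    total = 0j
    for x in range(1, p):
        x_inv = pow(x, p - 2, p)  # inverse of x mod p via Fermat's little theorem
        total += cmath.exp(2j * cmath.pi * (a * x + b * x_inv) / p)
    return total


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        worst = max(abs(kloosterman(p, a, b))
                    for a in range(1, p) for b in range(1, p))
        print(p, worst, 2 * math.sqrt(p))
```

The sum is real, since replacing $x$ by $-x$ conjugates each term, so the imaginary part printed by such experiments is rounding noise.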
The techniques of analytic number theory. A hallmark of analytic number
theory is the treatment of number-theoretical problems (usually enumerative,
as noted above) by methods often relegated to the domain of "applied mathematics":
elementary but clever manipulation of sums and integrals; asymptotic
and error analysis; Fourier series and transforms; contour integrals and residues.
While there is still good new work to be done along these lines, much contemporary
analytic number theory also uses advanced tools from within and outside
number theory (e.g. modular forms beyond the upper half-plane, Laplacian spectral
theory). Nevertheless, in this introductory course we shall emphasize the
classical methods characteristic of analytic number theory, on the grounds that
they are rarely treated in this Department's courses, while our program already
offers ample exposure to the algebraic/geometric tools. As already noted in
the pseudo-syllabus, we shall on a few occasions invoke results that depend on
deep (non-analytic) techniques, but we shall treat them as deus ex mathematica,
developing only their analytic applications.
The style of analytic number theory. It has often been said that there
are two kinds² of mathematicians: theory builders and problem solvers. In
the mathematics of our century these two styles are epitomized respectively by
A. Grothendieck and P. Erdős. The Harvard math curriculum leans heavily towards
the systematic, theory-building style; analytic number theory as usually
practiced falls in the problem-solving camp. This is probably why, despite its
illustrious history (Euclid, Euler, Riemann, ...) and present-day vitality, analytic
number theory has rarely been taught here: in the past twelve years there
have been only a handful of undergraduate seminars and research/Colloquium
talks, and no Catalog-listed courses at all. Now we shall see that there is more
to analytic number theory than a bag of unrelated ad-hoc tricks, but it is true
that fans of contravariant functors, adelic tangent sheaves, and étale cohomology
will not find them in the present course. Still I believe that even ardent
structuralists will benefit from this course. First, specific results of analytic
number theory often enter as necessary ingredients in the only known proofs
of important structural results. Consider for example the arithmetic of elliptic
curves: the many applications of Dirichlet's theorem on primes in arithmetic
progression, and its generalization to Čebotarev's density theorem,³ include the
recent work of Kolyvagin and Wiles; in [Serre 1981] sieve methods are elegantly
applied to investigate the distribution of traces of an elliptic curve;⁴ in [Merel
1996] a result (Lemme 5) on the $x_1 x_2 \equiv c \bmod p$ problem is required to bound
the torsion of elliptic curves over number fields. Second, the ideas and techniques
apply widely. Sieve inequalities, for instance, are also used in probability
to analyze nearly independent variables; the "stationary phase" methods that
we'll use to estimate the partition function are also used to estimate oscillatory
integrals in quantum physics, special functions, and elsewhere; even the
van der Corput estimates on exponential sums have recently found application
in enumerative combinatorics [CEP 1996]. Third, the habit of looking for
asymptotic results and error terms is a healthy complement to the usual quest
for exact answers that we can tend to focus on too exclusively. Finally, an
ambitious theory-builder should regard the absence thus far of a Grand Unified
Theory of analytic number theory not as an insult but as a challenge. Both
machinery- and problem-motivated mathematicians should note that some of
the more exciting recent work in number theory depends critically on contributions
from both sides of the stylistic fence. This course will introduce the main
analytic techniques needed to appreciate, and ultimately to extend, this work.

¹ See for instance [Stevens 1994].
² Actually there are three kinds of mathematician: those who can count, and those who
cannot. [Attributed to John Conway]
³ We shall describe Čebotarev's theorem briefly in the course but not develop it in detail.
Given Dirichlet's theorem and the asymptotic formula for $\pi(x, a \bmod q)$, the extra work needed
to get Čebotarev is not analytic but algebraic: the development of algebraic number theory
and the arithmetic of characters of finite groups. Thus a full treatment of Čebotarev does not
alas belong in this course.
⁴ My thesis work on the case of trace zero (see e.g. [Elkies 1987]) also used Dirichlet's
theorem.

References
[CEP 1996] Cohn, H., Elkies, N.D., Propp, J.: Local statistics for random
domino tilings of the Aztec diamond, Duke Math. J. 85 #1 (Oct. 1996), 117–166.
[Elkies 1987] Elkies, N.D.: The existence of infinitely many supersingular primes
for every elliptic curve over Q. Invent. Math. 89 (1987), 561–567.
[Merel 1996] Merel, L.: Bornes pour la torsion des courbes elliptiques sur les
corps de nombres. Invent. Math. 124 (1996), 437–449.
[Serre 1981] Serre, J.-P.: Quelques applications du théorème de densité de Chebotarev.
IHES Publ. Math. 54 (1981), 123–201.
[Stevens 1994] Stevens, G.: Fermat's Last Theorem, PROMYS T-shirt, Boston
University 1994.
[Wilf 1982] Wilf, H.S.: What is an Answer? Amer. Math. Monthly 89 (1982),
289–292.
Math 259: Introduction to Analytic Number Theory
Elementary approaches I: Variations on a theme of Euclid¹
The first asymptotic question to ask about $\pi(x)$ is whether $\pi(x) \to \infty$ as $x \to \infty$,
that is, whether there are infinitely many primes. That the answer is Yes was
first shown by the justly famed argument in Euclid. While often presented as a
proof by contradiction, the argument can readily be recast as an effective (albeit
rather inefficient) construction: given primes $p_1, p_2, \ldots, p_n$, let $P_n = \prod_{k=1}^n p_k$,
define $N_n = P_n + 1$, and let $p_{n+1}$ be the smallest factor of $N_n$ greater than 1. Then $p_{n+1}$ is
a prime no larger than $N_n$ and different from $p_1, \ldots, p_n$. Thus $\{p_k\}_{k>1}$ is an
infinite sequence of distinct primes, Q.E.D.
Moreover this argument also gives an explicit upper bound on $p_n$, and thus a
lower bound on $\pi(x)$. Indeed we may take $p_1 = 2$ and observe that
$$p_{n+1} \le N_n = 1 + \prod_{k=1}^n p_k \le 2 \prod_{k=1}^n p_k.$$
If equality were satisfied at each step we would have $p_n = 2^{2^{n-1}}$. Thus by
induction we see that²
$$p_n \le 2^{2^{n-1}}$$
(and of course the inequality is strict once $n > 1$). Therefore if $x \ge 2^{2^{n-1}}$ then
$p_k < x$ for $k = 1, 2, \ldots, n$ and so $\pi(x) \ge n$, so we conclude³
$$\pi(x) > \log_2 \log_2 x.$$
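Euclid's construction is easy to run for the first few steps. This sketch starts from $p_1 = 2$ and uses trial division, so it is only practical for a handful of terms; the function names are ours.

```python
def smallest_factor(n):
    """Smallest factor of n greater than 1 (n >= 2), by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n itself is prime


def euclid_primes(count):
    """First `count` primes produced by p_{n+1} = smallest factor of P_n + 1, with p_1 = 2."""
    primes = [2]
    product = 2
    while len(primes) < count:
        p = smallest_factor(product + 1)
        primes.append(p)
        product *= p
    return primes


if __name__ == "__main__":
    for n, p in enumerate(euclid_primes(7), start=1):
        print(n, p, p <= 2 ** (2 ** (n - 1)))
```

The printed flag checks the bound $p_n \le 2^{2^{n-1}}$ at each step; the primes that appear fall far below it, which is one way to see how wasteful the resulting estimate on $\pi(x)$ is.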
The $P_n + 1$ trick can even be used to prove other special cases of the result that
$(a, q) = 1 \Rightarrow \pi(x, a \bmod q) \to \infty$ as $x \to \infty$. Of course the case 1 mod 2 is trivial
given Euclid. For $-1 \bmod q$ with $q = 3, 4, 6$, start with $p_1 = q - 1$ and define
$N_n = qP_n - 1$. More generally, for any quadratic character $\chi$ there are infinitely
many primes $p$ with $\chi(p) = -1$; e.g. given an odd prime $q_0$, there are infinitely
many primes $p$ which are quadratic nonresidues of $q_0$. [I'm particularly fond of
this argument because I was able to adapt it as the punchline of my doctoral
thesis; see [El].] The case of $\chi(p) = +1$, e.g. the result that $\pi(x, 1 \bmod 4) \to \infty$,
is only a bit trickier.⁴ For that case, let $p_1 = 5$ and $N_n = 4P_n^2 + 1$, and appeal
to Fermat's theorem on the prime factors of $x^2 + y^2$. Again this even yields an
explicit lower bound on $\pi(x, 1 \bmod 4)$, namely⁵
$$\pi(x, 1 \bmod 4) > C \log \log x$$
for some positive constant $C$. [Exercises: Exhibit an explicit value of $C$. Use
cyclotomic polynomials to show more generally that for any $q_0$, prime or not,
there exist infinitely many primes congruent to 1 mod $q_0$,⁶ and that indeed the
number of such primes $< x$ grows at least as fast as some multiple of $\log \log x$.
Modify this trick to show that there are infinitely many primes congruent to 4
mod 5, again with a log log lower bound.]

¹ Specifically, [Eu, IX, 20]. For more on the history of work on the distribution of primes
up to about 1900, see [Di, XVIII].
² Curiously the same bound is obtained from a much later elementary proof: between 1
and $2^{2^n}$ there are $n$ pairwise coprime numbers, namely the $n$ Fermat numbers $2^{2^m} + 1$ for
$0 \le m < n$ [why are they pairwise coprime?], so necessarily at least $n$ primes as well.
³ Q: What sound does a drowning analytic number theorist make? A: log log log log
... [R. Murty, via B. Mazur]
⁴ But enough so that a problem from a recent Qualifying Exam for our graduate students
asked to prove that there are infinitely many primes congruent to 1 mod 4.
But Euclid's approach and its variations, however elegant, are not sufficient for
our purposes. For one thing, numerical evidence suggests (and we shall soon
prove) that $\log_2 \log_2 x$ is a gross underestimate on $\pi(x)$. For another, one
cannot⁷ prove all cases of $(a, q) = 1 \Rightarrow \pi(x, a \bmod q) \to \infty$ using only variations
on the Euclid argument. Our next elementary approaches will address at least
the first deficiency.
References
[Di] Dickson, L.E.: History of the Theory of Numbers, Vol. I: Divisibility and
Primality. Washington: Carnegie Inst., 1919.
[Eu] Euclid, Elements.
[El] Elkies, N.D.: The existence of infinitely many supersingular primes for every elliptic curve
over Q, Invent. Math. 89 (1987), 561–567; see also: Supersingular primes for
elliptic curves over real number fields, Compositio Math. 72 (1989), 165–172.

⁵ Even a drowning analytic number theorist knows that $\log \log$ and $\log_2 \log_2$ are asymptotically
within a constant factor of each other. What is that factor?
⁶ A result attributed to Euler in [Di, XVIII].
⁷ This is not a theorem, of course; for one thing how does one precisely define "variation of
the Euclid argument"? But I'll be quite impressed if you can even find a Euclid-style argument
for the infinitude of primes congruent to 2 mod 5 or mod 7.
Math 259: Introduction to Analytic Number Theory
Elementary approaches II: the Euler product
It was Euler who first went significantly beyond Euclid's proof, by recasting
another highlight of ancient Greek number theory, this time unique factorization
into primes, as a generating-function identity. That is, from the fact that every
positive integer $n$ may be written uniquely as $\prod_{p\ \mathrm{prime}} p^{c_p}$, with $c_p$ a nonnegative
integer that vanishes for all but finitely many $p$, Euler obtained:
$$\sum_{n=1}^{\infty} n^{-s} = \prod_{p\ \mathrm{prime}} \left( \sum_{c_p=0}^{\infty} p^{-c_p s} \right) = \prod_{p\ \mathrm{prime}} \frac{1}{1 - p^{-s}}. \qquad (E)$$
The sum on the left-hand side of (E) is now called the zeta function
$$\zeta(s) = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots = \sum_{n=1}^{\infty} n^{-s};$$
the formula (E) is called the Euler product for $\zeta(s)$. So far we have only proved
(E) as a formal identity. But, since all the terms and factors in the infinite series
and products are positive, (E) actually holds as an identity between convergent
series and products provided that $\sum_{n=1}^\infty n^{-s}$ converges. By comparison with
$\int_1^\infty x^{-s}\,dx$ (i.e. by the "Integral Test" of elementary calculus) we see that
this happens if and only if $s > 1$. Moreover, from
$$\int_x^{x+1} y^{-s}\,dy = \frac{x^{1-s} - (x+1)^{1-s}}{s - 1}$$
and the fact that $y^{-s}$ is decreasing, summing over $x = 1, 2, 3, \ldots$ gives the bounds¹
$$\frac{1}{s-1} < \zeta(s) < \frac{s}{s-1}. \qquad (*)$$
In particular $\zeta(s) \to \infty$ as $s \to 1$ from above. This yields Euler's proof² of the
infinitude of primes: if there were only finitely many then the product (E)
would remain finite for all $s > 0$.
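As a numerical sanity check of (E), one can compare a truncated Euler product with a known value of the zeta function. The sketch below uses Euler's value $\zeta(2) = \pi^2/6$; the helper names and the cutoff $10^4$ are our own choices.

```python
import math


def primes_below(limit):
    """All primes p < limit, by the sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i in range(limit) if sieve[i]]


def euler_product(s, limit):
    """Product of 1/(1 - p^-s) over primes p < limit (a truncation of (E))."""
    result = 1.0
    for p in primes_below(limit):
        result /= 1.0 - p ** (-s)
    return result


if __name__ == "__main__":
    approx = euler_product(2.0, 10 ** 4)
    exact = math.pi ** 2 / 6
    print(approx, exact, exact - approx)
```

Every omitted factor exceeds 1, so the truncated product approaches $\zeta(2)$ from below.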
¹ In fact more accurate estimates are available from the "Euler-Maclaurin formula", as we
shall see in due course.
² Actually Euler simply substituted $s = 1$ in (E) to claim a contradiction with the hypothesis
that there are only finitely many primes, but it is easy enough to convert that idea into this
legitimate proof. If you happen to know that $\pi$ is transcendental, or only that $\pi^2$ is irrational,
then from $\pi^2 = 6\zeta(2)$ (another famous Euler identity) you may also obtain the infinitude of
primes, though naturally the resulting bounds on $\pi(x)$ are much worse.
In fact that product is actually infinite for $s$ as large as 1, a fact that yields much
tighter estimates on $\pi(x)$ than were available starting from Euclid's proof. For
instance we cannot have constants $C, \theta$ with $\theta < 1$ such that $\pi(x) < Cx^\theta$ for all
$x$, because then the Euler product would converge for $s > \theta$. To go further along
these lines it is convenient to take logarithms in (E), converting the product to
a sum:
$$\log \zeta(s) = -\sum_p \log(1 - p^{-s}),$$
and to estimate both sides as $s \to 1+$. By (*) the left-hand side $\log \zeta(s)$ is between
$\log 1/(s - 1)$ and $\log s/(s - 1)$; since $0 < \log s < s - 1$, we conclude that $\log \zeta(s)$
is within $s - 1$ of $\log 1/(s - 1)$. In the right-hand side we approximate each
summand $-\log(1 - p^{-s})$ by $p^{-s}$; since $p^{-s} \le 1/2$, the error is at most $p^{-2s}$, which summed
over all primes is less than $\sum_p p^{-2} < \zeta(2)$. The point is not the numerical
bound on the error but the fact that it remains finitely bounded as $s \to 1+$. We
thus have:
$$\sum_p p^{-s} = \log \frac{1}{s-1} + O(1) \qquad (1 < s < 2). \qquad (E')$$
p
The $O(1)$ here is our first encounter with the "Big Oh" notation; in general if
$g$ is nonnegative then "$f = O(g)$" is shorthand for "there exists a constant $C$
such that $|f| \le Cg$ for each allowed evaluation of $f, g$". Thus $O(1)$ is a bounded
function, so (E') means that there exists a constant $C$ such that $\sum_p p^{-s}$ is
within $C$ of $\log \frac{1}{s-1}$ for all $s \in (1, 2)$. [The upper bound of 2 on $s$ is not crucial,
since we're concerned with $s$ near 1, but we must put some upper bound on $s$
for (E') to hold; do you see why?] An equivalent notation, more convenient in
some circumstances, is $f \ll g$ (or $g \gg f$). For instance, a linear map $T$ between
Banach spaces is continuous iff $Tv = O(|v|)$ iff $|v| \gg |Tv|$. Each instance of $O(\cdot)$
or $\ll$ or $\gg$ is presumed to carry its own implicit constant $C$. If the constant
depends on some parameter that parameter appears as a subscript; e.g. for any
$\epsilon > 0$ we have $\log x = O_\epsilon(x^\epsilon)$ (equivalently $\log x \ll_\epsilon x^\epsilon$) on $x \in [1, \infty)$. For
basic properties of $O(\cdot)$ see the Exercises at the end of this section.
Now $\pi(x)$ still does not occur explicitly in the sum in (E'). We thus rewrite
this sum as follows. Express the summand $p^{-s}$ as an integral $s \int_p^\infty y^{-1-s}\,dy$.
Summing over all $p$ we find that $y$ occurs in the interval of integration $(p, \infty)$
iff $p < y$, i.e. with multiplicity $\pi(y)$. Thus the sum in (E') becomes an integral
involving $\pi(y)$, and we find:
$$s \int_1^\infty \pi(y)\, y^{-1-s}\,dy = \log \frac{1}{s-1} + O(1) \qquad (1 < s < 2). \qquad (E'')$$
Two remarks are in order here. First, that the transformation from (E') to (E'')
is an example of a method we shall use often, known either as partial summation
or integration by parts. To explain the latter name, consider that the sum in
(E') may be regarded as the Stieltjes integral $\int_1^\infty y^{-s}\,d\pi(y)$, which integrated
by parts yields (E''); that is how we shall write this transformation henceforth.
Second, that the integral in (E'') is just $s$ times the Laplace transform of the
function $\pi(e^u)$ on $u \in [0, \infty)$: via the change of variable $y = e^u$ that integral
becomes $\int_0^\infty \pi(e^u) e^{-su}\,du$. In general if $f(u)$ is a nonnegative function whose
Laplace transform $Lf(s) := \int_0^\infty f(u) e^{-su}\,du$ converges for $s > s_0$ then the
behavior of $Lf(s)$ as $s \to s_0+$ detects the behavior of $f(u)$ as $u \to \infty$. In our case,
$s_0 = 1$, so we expect that (E'') will give us information on the behavior of $\pi(x)$
for large $x$.
We note now that (E'') is consistent with $\pi(x) \sim x/\log x$, that is, that³
$$\int_2^\infty \frac{y^{-s}}{\log y}\,dy = \log \frac{1}{s-1} + O(1) \qquad (1 < s < 2).$$
Let the integral be $I(s)$. Differentiating under the integral sign we find that
$I'(s) = -\int_2^\infty y^{-s}\,dy = 2^{1-s}/(1 - s) = 1/(1 - s) + O(1)$. Thus for $1 < s < 2$ we
have
$$I(s) = I(2) - \int_s^2 I'(\sigma)\,d\sigma = \int_s^2 \frac{d\sigma}{\sigma - 1} + O(1) = \log \frac{1}{s-1} + O(1)$$
as claimed. This does not prove the Prime Number Theorem, but it does show
that, for instance, if $c < 1 < C$ then there are arbitrarily large $x, x'$ such that
$\pi(x) > cx/\log x$ and $\pi(x') < Cx'/\log x'$.
such that $x_i < C$ for all $i$; and indeed if our suspicion that every $x_i \le 1$ is
correct then we'll be able to find $C$.
We'll encounter this sort of unpleasant ineffectivity (where it takes at least two outliers
to get a contradiction) in Siegel's lower bound on $L(1, \chi)$; it arises elsewhere too,
notably in Faltings' proof of the Mordell conjecture, where the number of rational
points on a given curve of genus $> 1$ can be effectively bounded but their size cannot.
Math 259: Introduction to Analytic Number Theory
Primes in arithmetic progressions: Dirichlet characters and L-functions
We introduce this with the example of the distribution of primes mod 4, i.e. of
$\pi(x, 1 \bmod 4)$ and $\pi(x, 3 \bmod 4)$. The sum of these is of course $\pi(x) - 1$ once
$x > 2$, and we've obtained
$$s \int_1^\infty \pi(y)\, y^{-1-s}\,dy = \log \frac{1}{s-1} + O(1) \qquad (1 < s < 2) \qquad (E'')$$
from the Euler product for $\zeta(s)$. If the factor $(1 - 2^{-s})^{-1}$ is omitted from that
product we get a product formula for $(1 - 2^{-s})\zeta(s) = 1 + 3^{-s} + 5^{-s} + 7^{-s} + \cdots$. A
similar formula involving $\pi(\cdot, 1 \bmod 4)$ (or $\pi(\cdot, 3 \bmod 4)$) would require summing
$n^{-s}$ over the integers all of whose prime factors are congruent to 1 (or 3)
mod 4, which is hard to deal with. However, we can deal with the difference
$\pi(x, 1 \bmod 4) - \pi(x, 3 \bmod 4)$ using an Euler product for the L-series
$$L(s, \chi_4) := 1 - \frac{1}{3^s} + \frac{1}{5^s} - \frac{1}{7^s} + - \cdots = \sum_{n=1}^\infty \chi_4(n)\, n^{-s}.$$
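Here $\chi_4$ is the function
$$\chi_4(n) = \begin{cases} +1, & \text{if } n \equiv +1 \bmod 4; \\ -1, & \text{if } n \equiv -1 \bmod 4; \\ 0, & \text{if } 2 \mid n. \end{cases}$$
This function is (strongly¹) multiplicative:
$$\chi_4(mn) = \chi_4(m)\chi_4(n) \qquad (m, n \in \mathbf{Z}). \qquad (M)$$
Therefore $L(s, \chi_4)$ factors as did $\zeta(s)$:
$$L(s, \chi_4) = \prod_{p\ \mathrm{prime}} \left( \sum_{c_p=0}^\infty \chi_4(p^{c_p})\, p^{-c_p s} \right) = \prod_{p\ \mathrm{prime}} \frac{1}{1 - \chi_4(p) p^{-s}}. \qquad (E_4)$$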
¹ Sometimes a function $f$ is called multiplicative when $f(mn) = f(m)f(n)$ only for coprime
$m, n$.

By comparison with the Euler product for $\zeta(s)$ we see that the manipulations in
(E₄) are valid for $s > 1$. Unlike the case of $\zeta(s)$, the function $L(s, \chi_4)$ remains
bounded as $s \to 1+$, because the sum $\sum_{n=1}^\infty \chi_4(n) n^{-s}$ may be grouped as
$$\left(1 - \frac{1}{3^s}\right) + \left(\frac{1}{5^s} - \frac{1}{7^s}\right) + \left(\frac{1}{9^s} - \frac{1}{11^s}\right) + \cdots$$
in which the $n$-th term is $O(n^{-(s+1)})$ (why?). Indeed this regrouping lets us
extend $L(\cdot, \chi_4)$ to a continuous function on $(0, \infty)$. [Exercise: Show that in fact
the resulting function is infinitely differentiable on $s > 0$.] Moreover each term
$(1 - 3^{-s})$, $(5^{-s} - 7^{-s})$, $(9^{-s} - 11^{-s})$, ... is positive, so $L(s, \chi_4) > 0$ for all $s > 0$,
in particular for $s = 1$ (you probably already know that $L(1, \chi_4) = \pi/4$). So,
starting from (E₄) and arguing as we did to get from (E) to (E'') we obtain
$$s \int_1^\infty \pi(y, \chi_4)\, y^{-1-s}\,dy = O(1) \qquad (1 < s < 2), \qquad (E''_4)$$
where
$$\pi(y, \chi_4) := \pi(y, 1 \bmod 4) - \pi(y, 3 \bmod 4) = \sum_{p\ \mathrm{prime},\ p < y} \chi_4(p).$$
Averaging (E''₄) with (E'') we find that
$$s \int_1^\infty \pi(y, 1 \bmod 4)\, y^{-1-s}\,dy = \frac12 \log \frac{1}{s-1} + O(1) \qquad (1 < s < 2);$$
$$s \int_1^\infty \pi(y, 3 \bmod 4)\, y^{-1-s}\,dy = \frac12 \log \frac{1}{s-1} + O(1) \qquad (1 < s < 2).$$
This is consistent with $\pi(x, \pm 1 \bmod 4) \sim \frac12 x/\log x$, and corroborates our expectation
that there should be on the average as many primes congruent to
$+1 \bmod 4$ as $-1 \bmod 4$. To phrase it another way, the sets of 1 mod 4 and
3 mod 4 primes both have logarithmic density 1/2, where a set $S$ of primes is
said to have logarithmic density $\delta$ if $\sum_{p \in S} p^{-s} \sim \delta \log \frac{1}{s-1}$ as $s \to 1+$.
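The pairing argument for $L(s, \chi_4)$ is easy to try numerically. This sketch sums the grouped series $(1 - 3^{-s}) + (5^{-s} - 7^{-s}) + \cdots$ and compares the value at $s = 1$ with $\pi/4$; the function names and the number of pairs are our own choices.

```python
import math


def chi4(n):
    """The character chi_4: +1, -1, 0 according as n is 1 mod 4, 3 mod 4, or even."""
    return (0, 1, 0, -1)[n % 4]


def l_chi4_paired(s, num_pairs):
    """Partial sum of L(s, chi_4) with terms grouped in pairs; every pair is positive for s > 0."""
    return sum((4 * k + 1) ** (-s) - (4 * k + 3) ** (-s) for k in range(num_pairs))


if __name__ == "__main__":
    print(l_chi4_paired(1.0, 10 ** 5), math.pi / 4)
    print(l_chi4_paired(0.5, 10 ** 5))  # still positive for an s between 0 and 1
```

At $s = 1$ the $k$-th pair is $2/((4k+1)(4k+3))$, so the grouped series converges absolutely even though the ungrouped one does not.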
We can similarly treat $\pi(x, \pm 1 \bmod 3)$ and, with a tad more work, $\pi(x, a \bmod 8)$
($a$ odd) and $\pi(x, a \bmod 12)$ ($a = 1, 5, 7, 11$) in the same way. [To prove that
the L-series $1 - 3^{-s} - 5^{-s} + 7^{-s} + - - + \cdots$ and $1 - 5^{-s} - 7^{-s} + 11^{-s} + - - + \cdots$
are positive on $s > 0$, group them in fours rather than pairs.] In
each case we find that the two or four congruence classes of primes have equal
logarithmic densities. What about $\pi(x, a \bmod 5)$? We have the quadratic character
taking $n$ to $+1$ or $-1$ if $n \equiv \pm 1$ or $\pm 2 \bmod 5$ (and to 0 if $5 \mid n$), but this only lets us
separate quadratic from non-quadratic residues mod 5. To get at the individual
nonzero residue classes mod 5 we need also two $\chi$'s taking complex values: the
multiplicative functions taking $n \equiv 0, 1, 2, 3, 4 \bmod 5$ to $0, 1, \pm i, \mp i, -1$. The
resulting L-functions are then complex, but the crucial fact that they do not
vanish at $s = 1$ can still be shown by using the pairing trick on either the real or
the imaginary part. We thus find again that each of the four nonzero congruence
classes of primes mod 5 has the same logarithmic density, and in particular that
there are infinitely many primes congruent to $a$ mod 5 for each $a \in (\mathbf{Z}/5)^*$
(which we have so far been unable to do for $a = \pm 2$).
How to generalize this to treat $\pi(x, a \bmod q)$ for all $q > 1$ and $a$ coprime with $q$?
We'll use linear combinations of Dirichlet characters. These are defined as
follows: for a positive integer $q$, a Dirichlet character mod $q$ is a function
$\chi : \mathbf{Z} \to \mathbf{C}$ which is
• $q$-periodic, i.e. $n \equiv n' \bmod q \Rightarrow \chi(n) = \chi(n')$;
• supported on the integers coprime to $q$ and on no smaller subset of $\mathbf{Z}$, i.e.
$(n, q) = 1 \Leftrightarrow \chi(n) \ne 0$; and
• multiplicative: $\chi(m)\chi(n) = \chi(mn)$ for all integers $m, n$.
To such a character is associated the Dirichlet L-series
$$L(s, \chi) := \sum_{n=1}^\infty \chi(n)\, n^{-s} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - \chi(p) p^{-s}} \qquad (s > 1).$$
Examples: The trivial character $\chi_0$ mod $q$ is defined by $\chi(n) = 1$ if $(n, q) = 1$ and
$\chi(n) = 0$ otherwise, associated with the L-series $L(s, \chi_0) = \prod_{p \mid q} (1 - p^{-s}) \cdot \zeta(s)$.
If $l$ is prime then the Legendre symbol $(\cdot/l)$, defined by $(n/l) = 0, 1, -1$ according
as $n$ is zero, a nonzero square, or not a square mod $l$, is a character mod $l$ or $4l$ by
Quadratic Reciprocity. If $\chi$ is a Dirichlet character mod $q$ then so is its complex
conjugate $\overline{\chi}$ (defined of course by $\overline{\chi}(n) = \overline{\chi(n)}$), with $L(s, \overline{\chi}) = \overline{L(s, \chi)}$ for $s > 1$.
If $\chi_1, \chi_2$ are characters mod $q_1, q_2$ then $\chi_1\chi_2$ is a character mod $\mathrm{lcm}(q_1, q_2)$. In
particular the characters mod $q$ constitute a group under multiplication, with
identity $\chi_0$ and inverse $\chi^{-1} = \overline{\chi}$.
What is this group? Since a Dirichlet character mod $q$ is just a homomorphism
from $(\mathbf{Z}/q)^*$ to the unit circle (extended by zero to a function on $\mathbf{Z}/q$ and
lifted to $\mathbf{Z}$), the group of such characters is just the Pontrjagin dual of $(\mathbf{Z}/q)^*$.
Pontrjagin duality for finite abelian groups like $(\mathbf{Z}/q)^*$ is easy: it's basically
just the discrete Fourier transform by another name. We recall the basic facts:
For any finite abelian group $G$ let $\hat G$ be its dual. Then the dual of $G \times H$ is
$\hat G \times \hat H$, and the dual of $\mathbf{Z}/m$ is a cyclic group of order $m$. Since any finite
abelian group is a product of cyclic groups, it follows that $\hat G$ is isomorphic (not
in general canonically so!) with $G$, and the canonical homomorphism from $G$
to the dual of $\hat G$, taking $g \in G$ to the map $\chi \mapsto \chi(g)$, is an isomorphism. The
characters of $G$ are orthogonal:
$$\sum_{g \in G} \overline{\chi_1}(g)\, \chi_2(g) = \begin{cases} |G|, & \text{if } \chi_1 = \chi_2; \\ 0, & \text{if } \chi_1 \ne \chi_2. \end{cases}$$
In particular, they are linearly independent; since there are $|G|$ of them, they
form a basis for the vector space of complex-valued functions on $G$. The decomposition
of an arbitrary such function $f : G \to \mathbf{C}$ as a linear combination of
characters is achieved by the inverse Fourier transform:
$$f = \sum_{\chi \in \hat G} f_\chi\, \chi, \quad \text{where } f_\chi := \frac{1}{|G|} \sum_{g \in G} \overline{\chi}(g) f(g).$$
In particular the characteristic function of any $g_0 \in G$ is $|G|^{-1} \sum_\chi \overline{\chi}(g_0)\, \chi$.
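To make the duality concrete, here is a sketch that builds all Dirichlet characters modulo a prime $q$ from a primitive root and checks orthogonality and the formula for a characteristic function. The restriction to prime $q$, the function names, and the choice $q = 5$, $g = 2$ in the demo are assumptions made for the example.

```python
import cmath


def characters_mod_prime(q, g):
    """All q-1 Dirichlet characters mod a prime q, given a primitive root g mod q."""
    order = q - 1
    index = {}  # discrete logarithm table: index[g^k mod q] = k
    x = 1
    for k in range(order):
        index[x] = k
        x = x * g % q
    if len(index) != order:
        raise ValueError("g is not a primitive root mod q")

    def make(idx):
        def chi(n):
            r = n % q
            if r == 0:
                return 0j  # characters vanish on multiples of q
            return cmath.exp(2j * cmath.pi * idx * index[r] / order)
        return chi

    return [make(idx) for idx in range(order)]


def inner_product(chi1, chi2, q):
    """Sum over a in (Z/q)^* of conj(chi1(a)) * chi2(a)."""
    return sum(chi1(a).conjugate() * chi2(a) for a in range(1, q))


def indicator(chars, g0, n):
    """|G|^{-1} * sum over chi of conj(chi(g0)) * chi(n): should be 1 if n = g0 mod q, else 0."""
    return sum(chi(g0).conjugate() * chi(n) for chi in chars) / len(chars)


if __name__ == "__main__":
    chars = characters_mod_prime(5, 2)
    print([[round(abs(inner_product(c1, c2, 5)), 6) for c2 in chars] for c1 in chars])
    print([chars[1](n) for n in range(5)])  # approximately 0, 1, i, -i, -1
```

With $q = 5$ and $g = 2$ the character of index 1 takes $0, 1, 2, 3, 4$ to approximately $0, 1, i, -i, -1$, one of the two complex characters mod 5 described earlier.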
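What does all this tell us about Dirichlet L-functions and distribution of primes
mod $q$? First, that if we define $\pi(\cdot, \chi)$ by
$$\pi(x, \chi) := \sum_{a \bmod q} \chi(a)\, \pi(x, a \bmod q) = \sum_{p\ \mathrm{prime},\ p < x} \chi(p)$$
then, for all $a$ coprime to $q$,
$$\pi(x, a \bmod q) = \frac{1}{\varphi(q)} \sum_{\chi \bmod q} \overline{\chi}(a)\, \pi(x, \chi),$$
where as usual $\varphi(q) = |(\mathbf{Z}/q)^*|$ is the Euler phi (a.k.a. "totient") function.
Second, that
$$s \int_1^\infty \pi(y, \chi)\, y^{-1-s}\,dy = \log L(s, \chi) + O(1) \qquad (1 < s < 2). \qquad (D)$$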
Since the Euler product for $L(s, \chi)$ converges for $s > 1$, we know that $L(s, \chi) \ne 0$
for $s > 1$. If $\chi = \chi_0$ then $L(s, \chi)$ is essentially $\zeta(s)$ so $\log L(s, \chi) = \log(1/(s - 1)) + O(1)$
as $s \to 1+$. Otherwise the sum defining $L(s, \chi)$ actually converges
for $s > 0$ (though not absolutely once $s \le 1$): as a special case of character
orthogonality, if $\chi \ne \chi_0$ then $\sum_{a \bmod q} \chi(a) = 0$, so $S_\chi(x) := \sum_{0 < n < x} \chi(n)$ is a
bounded function and
$$\sum_{n=M}^{N} \frac{\chi(n)}{n^s} = \int_M^N x^{-s}\,dS_\chi(x) = S_\chi(x)\, x^{-s} \Big|_M^N + s \int_M^N x^{-1-s} S_\chi(x)\,dx \ll M^{-s} + N^{-s},$$
which for fixed $s > 0$ tends to zero as $M, N \to \infty$. As with the special case $\chi = \chi_4$,
we can in fact show [Exercise: do it.] that $L(s, \chi)$ is infinitely differentiable in
$(0, \infty)$. From (D) we see that the crucial question is whether $L(1, \chi)$ is nonzero
or zero: the right-hand side is $O(1)$ if $L(1, \chi) \ne 0$ but $\le -\log(1/(s - 1)) + O(1)$
if $L(1, \chi) = 0$. Our experience with small $q$, and our expectation that the primes
should not favor one congruence class in $(\mathbf{Z}/q)^*$ to another, both suggest that
$L(1, \chi)$ will not vanish. Our methods thus far do not let us prove this in general
(try doing it for $\chi = (\cdot/67)$ or $(\cdot/163)$!), so let us for the time being assume:
$$\chi \ne \chi_0 \Rightarrow L(1, \chi) \ne 0. \qquad (\ne)$$
Then, by multiplying (D) by $\overline{\chi}(a)$ and averaging over the characters $\chi$ mod $q$ we obtain:
Let $a, q$ be coprime positive integers. Assume that ($\ne$) holds for all Dirichlet
characters mod $q$. Then
$$s \int_1^\infty \pi(y, a \bmod q)\, y^{-1-s}\,dy = \frac{1}{\varphi(q)} \log \frac{1}{s-1} + O(1) \qquad (1 < s < 2). \qquad (D)$$
Thus the primes congruent to $a$ mod $q$ have logarithmic density $1/\varphi(q)$; in particular
the arithmetic progression $\{n : n \equiv a \bmod q\}$ contains infinitely many
primes.
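The equidistribution suggested by this result can be watched in small cases by simply counting primes in each residue class. The sketch below counts primes $p < x$ by class mod $q$ with a sieve; it illustrates the statement but proves nothing, and the names and cutoffs are our own.

```python
from math import gcd


def primes_below(limit):
    """All primes p < limit, by the sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i in range(limit) if sieve[i]]


def prime_counts_by_class(limit, q):
    """Number of primes p < limit in each residue class a mod q with gcd(a, q) = 1."""
    counts = {a: 0 for a in range(q) if gcd(a, q) == 1}
    for p in primes_below(limit):
        r = p % q
        if r in counts:  # skips the finitely many primes dividing q
            counts[r] += 1
    return counts


if __name__ == "__main__":
    for q in (4, 5, 8, 12):
        print(q, prime_counts_by_class(10 ** 5, q))
```

For each modulus the counts come out close to one another, roughly $\pi(x)/\varphi(q)$ apiece.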
In fact ($\ne$) was proved by Dirichlet, from which followed (D), Dirichlet's celebrated
theorem on primes in arithmetic progressions. At least three proofs are
now known. These three proofs all start with the product of all $\varphi(q)$ L-functions
associated to Dirichlet characters mod $q$:
$$\prod_{\chi \bmod q} L(s, \chi) = \prod_{p\ \mathrm{prime}} \left( \prod_{\chi \bmod q} (1 - \chi(p) p^{-s}) \right)^{-1}.$$
The inner product can be evaluated with the following cyclotomic identity:
Let $G$ be a finite abelian group and $g \in G$ an element of order $m$. Then
$$\prod_{\chi \in \hat G} (1 - \chi(g) z) = (1 - z^m)^{|G|/m}$$
holds identically for all $z$.
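The cyclotomic identity can be checked numerically in the simplest case, a cyclic group. This sketch takes $G = \mathbf{Z}/n$ written additively, whose characters are $\chi_j(g) = e^{2\pi i jg/n}$; restricting to cyclic $G$ is a simplification made for the example, and the function name is ours.

```python
import cmath
import math


def cyclotomic_identity_sides(n, g, z):
    """Both sides of prod_chi (1 - chi(g) z) = (1 - z^m)^(n/m) for G = Z/n, g of order m."""
    m = n // math.gcd(n, g)  # order of g in the additive group Z/n
    lhs = 1 + 0j
    for idx in range(n):
        # chi_idx(g) = exp(2 pi i * idx * g / n) runs over all characters of Z/n
        lhs *= 1 - cmath.exp(2j * cmath.pi * idx * g / n) * z
    rhs = (1 - z ** m) ** (n // m)
    return lhs, rhs


if __name__ == "__main__":
    z = 0.3 + 0.4j
    for g in range(12):
        lhs, rhs = cyclotomic_identity_sides(12, g, z)
        print(g, abs(lhs - rhs))
```

For general finite abelian $G$ one would take products of such characters over the cyclic factors.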
The identity is an easy consequence of the factorization of $1 - z^m$ together with
the fact that any character of a subgroup $H \subseteq G$ extends in $[G : H]$ ways to a
character of $G$ (in our case $H$ will be the cyclic subgroup generated by $g$).
Let $m_p$, then, be the multiplicative order of $p$ mod $q$ (for all but the finitely
many primes $p$ dividing $q$). Then we get
$$\prod_{\chi \bmod q} L(s, \chi) = \prod_{p \nmid q} (1 - p^{-m_p s})^{-\varphi(q)/m_p}. \qquad ()$$
The left-hand side contains the factor $L(s, \chi_0)$, which is $C/(s - 1) + O(1)$ as
$s \to 1+$ for some $C > 0$ [in fact $C = \varphi(q)/q$]. Since the remaining factors are
differentiable at $s = 1$, if any of them were to vanish there the product would
remain bounded as $s \to 1+$. So we must show that this cannot happen.
Dirichlet's original approach was to observe that () is, up to a few factors²
$1 - n^{-s}$ with $n \mid q$, the "zeta function of the cyclotomic number field $\mathbf{Q}(e^{2\pi i/q})$".
He then proved that the zeta function $\zeta_K(s)$ of any number field $K$ is $\sim C/(s - 1)$
as $s \to 1+$ for some positive constant $C$ (and gave an exact formula for $C$, which
includes the class number of $K$ and is thus called the "Dirichlet class number
formula"). That is undoubtedly the best way to go about it, but it requires
more algebraic number theory than I want to assume here. Fortunately there
are at least two ad-hoc simplifications available.

² In fact if we replace each $\chi$ by its underlying primitive character (see the Exercises) the
product is exactly the zeta function of $\mathbf{Q}(e^{2\pi i/q})$. This is the prototypical example of the
factorization of a zeta function as a product of Artin L-functions, and the fact that the "Artin
L-functions" for 1-dimensional representations are Dirichlet series is a prototype for class field
theory. Needless to say, Math 259 is not the course where these remarks may be properly
explained.
The first is that we need only worry about real characters. If $L(1, \chi) = 0$ then
also $L(1, \overline{\chi}) = 0$. So if $\chi \ne \overline{\chi}$ but $L(1, \chi) = 0$ then there are at least two factors
in the left-hand side of () that vanish at $s = 1$; since they are differentiable
there, the product would be not only bounded as $s \to 1+$, but approach zero
there, which is impossible because the right-hand side is $> 1$ for all $s > 1$.
But if $\chi$ is a real character then $L(s, \chi_0) L(s, \chi)$ is (again within a few factors
$1 - n^{-s}$ of) the L-function of a quadratic number field. Developing the algebraic
number theory of quadratic number fields takes considerably less work than is
needed for the full Dirichlet class number formula, and if we only want to get
unboundedness as $s \to 1+$ it is even easier; for instance, if $\chi(-1) = -1$ then the
right-hand side of () is dominated by the zeta function of a binary quadratic
form, which is easily seen to be $\gg 1/(s - 1)$. However, even this easier proof is
beyond the scope of what I want to assume or develop in this class.
Fortunately there is a way to circumvent any $\zeta_K$ beyond $K = \mathbf{Q}$, using the fact
that the right-hand side of () also dominates the series $\zeta(\varphi(q) \cdot s)$, which blows
up not at $s = 1$ but at $s = 1/\varphi(q)$. Since this $s$ is still positive, we can still get
a proof of ($\ne$) from it, but only by appealing to the magic of complex analysis.
We thus defer the proof until we have considered $\zeta(s)$ and more generally $L(s, \chi)$
as functions of a complex variable $s$, which we shall have to do anyway to obtain
the Prime Number Theorem and results on the density (not just logarithmic
density) of primes in arithmetic progressions.
Exercises: Show that the integers $q$ modulo which all the Dirichlet characters
are real (take on only the values $0, \pm 1$) are precisely 24 and its factors. If you
know Quadratic Reciprocity, show that every real Dirichlet character is either
$\prod_{l \in L} (\cdot/l)$ or $\chi_4 \prod_{l \in L} (\cdot/l)$ for some possibly empty finite set $L$ of primes.
If $q_1 \mid q$ and $\chi_1$ is a character mod $q_1$ then a character $\chi$ mod $q$ is obtained from $\chi_1$
by multiplying it by the trivial character mod $q$. Express $L(s, \chi)$ in terms of
$L(s, \chi_1)$. Conclude that ($\ne$) holds for $\chi$ if and only if it holds for $\chi_1$.
A character mod $q$ that cannot be obtained in this way from any character mod
a proper factor $q_1 \mid q$ is called primitive. Show that any Dirichlet character $\chi$
comes from a unique primitive character $\chi_1$. [The modulus of this $\chi_1$ is called
the conductor of $\chi$.] Show that the number of primitive characters mod $n$ is
$n \prod_{p \mid n} \alpha_p$, where $\alpha_p = ((p - 1)/p)^2$ if $p^2 \mid n$ and $(p - 2)/p$ if $p \,\|\, n$. NB there are
no primitive characters mod $n$ when $2 \,\|\, n$.
Deduce the fact that for any $q$ at most one character mod $q$ fails $(\neq)$ starting from (D) together with the fact that $\pi(x, a \bmod q) \ge 0$ for all $x, a, q$. [In the final analysis this is not much different from our proof using the product of L-series.]
Using either this approach or the one based on (), prove that there is at most one primitive Dirichlet character of any modulus whose L-function vanishes at $s = 1$. (Assume there were two, and obtain two different imprimitive characters to the same modulus for which $(\neq)$ fails, which we've already shown impossible. We shall encounter this trick again when we come to Siegel's ineffective lower bound on $L(1,\chi)$.)
Čebyšev (and von Mangoldt and Stirling)
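It is well known¹ that for any prime $p$ and positive integer $x$ the exponent of $p$ in $x!$ is
$$c_p(x) := \left\lfloor \frac{x}{p}\right\rfloor + \left\lfloor \frac{x}{p^2}\right\rfloor + \left\lfloor \frac{x}{p^3}\right\rfloor + \cdots = \sum_{k=1}^{\infty}\left\lfloor \frac{x}{p^k}\right\rfloor,$$
the sum being finite because eventually $p^k > x$. It was Čebyšev's insight that one could extract information about $\pi(\cdot)$ from the resulting formula
$$x! = \prod_p p^{c_p(x)},$$
or equivalently
$$\log x! = \sum_p c_p(x)\log p = \sum_{n=1}^{\infty}\left\lfloor \frac{x}{n}\right\rfloor \Lambda(n), \qquad (Č)$$
where $\Lambda(n)$ is the von Mangoldt function
$$\Lambda(n) := \begin{cases}\log p, & \text{if } n = p^k \text{ for some positive integer } k \text{ and prime } p;\\ 0, & \text{otherwise.}\end{cases}$$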
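As a quick check of the two formulas above, here is a minimal sketch (not from the notes; the function names are made up for illustration) that computes $c_p(x)$ by the floor sum and compares $\log x!$ with $\sum_n \lfloor x/n\rfloor \Lambda(n)$.

```python
from math import factorial, log

def legendre_exponent(x, p):
    """Exponent of the prime p in x!, i.e. sum over k >= 1 of floor(x / p^k)."""
    total, pk = 0, p
    while pk <= x:
        total += x // pk
        pk *= p
    return total

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no divisor up to sqrt(n), so n itself is prime

def log_factorial_via_lambda(x):
    """Right-hand side of the identity: sum over n of floor(x/n) * Lambda(n)."""
    return sum((x // n) * von_mangoldt(n) for n in range(1, x + 1))

print(legendre_exponent(1998, 5))                      # trailing zeros of 1998!
print(log(factorial(50)), log_factorial_via_lambda(50))
```

The first line answers the footnote's perennial problem, since the number of trailing zeros of $x!$ is $c_5(x)$.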
For instance, Čebyšev was able to come close enough to the Prime Number Theorem to prove "Bertrand's Postulate": every interval $(x, 2x)$ with $x > 1$ contains a prime. (See [H&W 1996, p.343–4] for Erdős's simplification of Čebyšev's proof; this simplified proof is also on the Web: http://forum.swarthmore.edu/dr.math/problems/kuropatwa.4.3.97.html.)
To make use of (Č) we need to estimate
$$\log x! = \sum_{n=1}^{x}\log n$$
for large $x$. We do this by in effect applying the first few steps of symmetrized Euler-Maclaurin summation. For any $C^2$ function $f$ we have (by integrating by parts twice)
$$\int_{-1/2}^{1/2} f(y)\,dy = f(0) + \frac12\left[\int_{-1/2}^{0} f''(y)\,(y+1/2)^2\,dy + \int_{0}^{1/2} f''(y)\,(y-1/2)^2\,dy\right] = f(0) + \frac12\int_{-1/2}^{1/2} f''(y)\,\langle y+\tfrac12\rangle^2\,dy,$$
where $\langle z\rangle$ is the distance from $z$ to the nearest integer.
¹ If only thanks to the perennial problems along the lines of "how many zeros end 1998!?".
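Thus
$$\sum_{k=1}^{N} f(k) = \int_{1/2}^{N+1/2} f(y)\,dy - \frac12\int_{1/2}^{N+1/2} f''(y)\,\langle y+1/2\rangle^2\,dy.$$
Taking $f(y) = \log y$ and $N = x$, we have $f''(y) = -1/y^2$. The second term is then
$$\frac12\int_{1/2}^{\infty}\langle y+1/2\rangle^2\,\frac{dy}{y^2} + O(1/x),$$
and the other terms are
$$(x+\tfrac12)\log x - x + \tfrac12(\log 2 + 1) + O(1/x).$$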
Thus we have
$$\log x! = (x+\tfrac12)\log x - x + C + O(1/x) \qquad (S)$$
for some absolute constant $C$. Except for the determination of $C$ (which turns out to be $\frac12\log(2\pi)$, as we shall see about a week hence), this formula is of course Stirling's approximation to $x!$. Stirling's approximation extends to an asymptotic series for $x!/[(x/e)^x\sqrt{2\pi x}]$ in inverse powers of $x$, but for our purposes $\log x! = (x+\frac12)\log x - x + O(1)$ is more than enough. In fact, since for the time being we're really dealing with $\log\lfloor x\rfloor!$ and not $\log\Gamma(x+1)$, the best honest error term we can use is $O(\log x)$.
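The constant in (S) can be watched converging numerically. This is only an illustrative sketch: it uses the standard-library `math.lgamma` for $\log x!$ and the value $C = \frac12\log(2\pi)$ quoted in the text; the helper name is hypothetical.

```python
from math import lgamma, log, pi

def stirling_remainder(x):
    """log x! - [(x + 1/2) log x - x] for a positive integer x."""
    return lgamma(x + 1) - ((x + 0.5) * log(x) - x)

C = 0.5 * log(2 * pi)
for x in (10, 100, 1000, 10000):
    # The difference from C shrinks like a constant over x, as (S) predicts.
    print(x, stirling_remainder(x), stirling_remainder(x) - C)
```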
Now let
$$\psi(x) := \sum_{1\le n\le x}\Lambda(n).$$
Then from (Č) and (S) we have
$$\sum_{k=1}^{\infty}\psi(x/k) = (x+\tfrac12)\log x - x + O(1).$$
This certainly suggests that $\psi(x)\sim x$, and lets us prove upper and lower bounds on $\psi(x)$ proportional to $x$, for instance
$$\psi(x) \le \log x! - \sum_{m=1}^{\infty}\log\lfloor x/2^m\rfloor! = \left[\sum_{m=1}^{\infty} 2^{-m}\,m\log 2\right]x + O(\log^2 x) = (2\log 2)\,x + O(\log^2 x)$$
(since $\lfloor x\rfloor \ge 1 + \sum_{m=1}^{\infty}\lfloor x/2^m\rfloor$ for $x\ge 1$) and²
$$\psi(x) \ge \sum_{k=1}^{\infty}(-1)^{k-1}\psi(x/k) = \log\frac{x!}{(x/2)!^2} = (\log 2)\,x + O(\log x).$$
It is true that we're ultimately interested in $\pi(x)$, not $\psi(x)$. But it is easy to get from one to the other. For one thing the contribution to $\psi(x)$ of prime powers $p^k$ with $k > 1$ is negligible: certainly less than $\sum_{k=2}^{\infty}\lfloor x^{1/k}\rfloor\log x \ll x^{1/2}\log x$. The remaining sum, $\sum_{p<x}\log p$, can be expressed in terms of $\pi(x)$ and vice versa using partial summation, and we find:
$$\psi(x) = \log(x)\,\pi(x) - \int_2^x \pi(y)\,\frac{dy}{y} + O(x^{1/2}\log x),$$
$$\pi(x) = \frac{\psi(x)}{\log x} + \int_2^x \psi(y)\,\frac{dy}{y\log^2 y} + O(x^{1/2}\log x).$$
It follows that the Prime Number Theorem $\pi(x)\sim x/\log x$ holds if and only if $\psi(x)\sim x$, and good error terms on one side imply good error terms on the other. It turns out that we can more readily get at $\psi(x)$ than at $\pi(x)$; for instance $\psi(x)$ is quite well approximated by $x$, while the "right" estimate for $\pi(x)$ is not $x/\log x$ but that plus $\int^x dy/\log^2 y$, i.e. the "logarithmic integral" $\int^x dy/\log y$. It is in the form $\psi(x)\sim x$ that we'll actually prove the Prime Number Theorem.
Exercises:
Since our upper and lower asymptotic bounds $\log 2$, $\log 4$ on $\psi(x)/x$ are within a factor of 2 of each other, they do not quite suffice to prove Bertrand's Postulate. But any improvement would prove that $\pi(2x) > \pi(x)$ for sufficiently large $x$, from which the proof for all $x$ follows by just exhibiting a few suitably spaced primes. It turns out that better bounds are available starting from (Č). For instance, show that $\psi(x) < (\frac12\log 12)\,x + O(\log^2 x)$. Can you obtain Čebyšev's bounds of 0.9 and 1.1? In fact it is known that the upper and lower bounds can be brought arbitrarily close to 1, but alas the only known proof of that fact depends on the Prime Number Theorem!
Here is another elementary approach: let $P(u)$ be any nonzero polynomial with integer coefficients of degree $d$; then
Z
f (u)2n du 1=lcm(1; 2; : : : ; 2dn + 1) = exp(? (2dn + 1)) > Q 1 p :
1
0 p<2dn
² This is essentially the same tactic of factoring $\binom{2n}{n}$ that Čebyšev used to prove $\pi(2x) > \pi(x)$.
Thus X
min1 1=jP (u)j:
log p > 2n log 0<u<
p<2dn
For instance, taking $f(u) = u - u^2$ we find (at least for $4 | x$) that $\sum_{p<x}\log p < x\log 4$. This is essentially the same (why?) as Čebyšev's trick of factoring $\binom{2n}{n}$, but suggests different sources of improvement; try $f(u) = (u-u^2)(1-2u)$ for example. [Unfortunately here the upper bound cannot be brought down to $1+\epsilon$; see [Montgomery 1994, Chapter 10]. Thanks to Madhav Nori for bringing this to my attention.]
Show that
$$\sum_{p\le x}\log p = \psi(x) - \psi(x^{1/2}) - \psi(x^{1/3}) - \psi(x^{1/5}) + \psi(x^{1/6}) - \cdots = \sum_{k=1}^{\infty}\mu(k)\,\psi(x^{1/k}),$$
where $\mu$ is the Möbius function taking the product of $r\ge 0$ distinct primes to $(-1)^r$ and any non-square-free integer to 0.
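The Möbius-inversion identity in the last exercise is easy to test by machine before proving it. The following is a rough sketch with hypothetical helper names; it uses exact integer $k$-th roots, which suffices because $\psi$ only changes at integers.

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]

def psi(x):
    """Sum of Lambda(n) over n <= x: log p counted once per prime power p^k <= x."""
    total = 0.0
    for p in primes_up_to(x):
        pk = p
        while pk <= x:
            total += log(p)
            pk *= p
    return total

def theta(x):
    """Sum of log p over primes p <= x."""
    return sum(log(p) for p in primes_up_to(x))

def mobius(k):
    result, p = 1, 2
    while p * p <= k:
        if k % p == 0:
            k //= p
            if k % p == 0:
                return 0
            result = -result
        p += 1
    return -result if k > 1 else result

def iroot(x, k):
    """Largest integer r with r**k <= x, corrected in exact arithmetic."""
    r = int(round(x ** (1.0 / k)))
    while r ** k > x:
        r -= 1
    while (r + 1) ** k <= x:
        r += 1
    return r

def theta_via_inversion(x):
    total, k = 0.0, 1
    while 2 ** k <= x:  # for larger k, x^(1/k) < 2 and psi vanishes
        total += mobius(k) * psi(iroot(x, k))
        k += 1
    return total

for x in (10, 1000, 50000):
    print(x, theta(x), theta_via_inversion(x))
```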
References
[H&W 1996] Hardy, G.H., Wright, E.M.: An Introduction to the Theory of
Numbers, 5th ed. Oxford: Clarendon Press, 1988 [AB 9.88.10 / QA241.H37].
[Montgomery 1994] Montgomery, H.L.: Ten lectures on the interface between
analytic number theory and harmonic analysis. Providence: AMS, 1994 [AB
9.94.9].
The contour integral formula for $\psi(x)$
Lemma. For $y, c, T > 0$ we have
$$\frac{1}{2\pi i}\int_{c-iT}^{c+iT} y^s\,\frac{ds}{s} = \begin{cases} 1 + O\bigl(y^c\min(1, \frac{1}{T|\log y|})\bigr), & \text{if } y\ge 1;\\[2pt] O\bigl(y^c\min(1, \frac{1}{T|\log y|})\bigr), & \text{if } y\le 1,\end{cases} \qquad (I_T)$$
the implied O-constant being effective and uniform in $y, c, T$.
(In fact the error's magnitude is less than both $y^c$ and $y^c/\pi T|\log y|$. Of course if $y$ equals 1 then the error term is regarded as $O(1)$ and is valid for both approximations 0, 1 to the integral.)
Proof: Complete the contour of integration to a rectangle extending to real part $-M$ if $y\ge 1$ or $+M$ if $y\le 1$. The resulting contour integral is 1 or 0 respectively by the residue theorem. We may let $|M|\to\infty$ and bound the horizontal integrals by $(\pi T)^{-1}\int_0^{\infty} y^{c\mp r}\,dr$; this gives the estimate $y^c/\pi T|\log y|$. Using a circular arc centered at the origin instead of a rectangle yields the same residue with a remainder of absolute value $< y^c$. □
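The Lemma can be illustrated numerically. The sketch below is not part of the notes: it approximates the truncated integral by a composite Simpson rule along the vertical segment (writing $s = c + it$, $ds = i\,dt$) and compares the result with the bound $y^c/\pi T|\log y|$ from the proof. The function name and step count are arbitrary choices.

```python
import cmath
from math import log, pi

def perron_integral(y, c, T, steps=20000):
    """Simpson approximation to (1/(2 pi i)) * integral of y^s ds/s over [c-iT, c+iT]."""
    if steps % 2:
        steps += 1
    h = 2 * T / steps
    log_y = log(y)

    def integrand(t):
        s = complex(c, t)
        return cmath.exp(s * log_y) / s

    total = integrand(-T) + integrand(T)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(-T + j * h)
    # ds = i dt cancels the i in 1/(2 pi i); the imaginary parts cancel by symmetry.
    return (total * h / 3 / (2 * pi)).real

for y in (2.0, 0.5):
    value = perron_integral(y, c=1.0, T=50.0)
    bound = y ** 1.0 / (pi * 50.0 * abs(log(y)))
    print(y, value, bound)
```

For $y = 2$ the value is close to 1 and for $y = 1/2$ close to 0, in both cases within the stated bound.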
This Lemma will let us approximate $\sum_{n<x} a_n$ by $(2\pi i)^{-1}\int_{c-iT}^{c+iT} x^s F(s)\,ds/s$, where $F(s) = \sum_n a_n n^{-s}$. We shall eventually choose some $T$ and exploit the analytic continuation of $F$ to shift the contour of integration past the region of absolute convergence to obtain nontrivial estimates.
The next question is, which $F$ should we choose? Consider for instance $\zeta(s)$. We have in effect seen already that if we take $F(s) = \log\zeta(s)$ then the sum of the resulting $a_n$ over $n < x$ closely approximates $\pi(x)$. Unfortunately, while $\zeta(s)$ continues meromorphically to $\sigma\le 1$, its logarithm does not: it has essential logarithmic singularities at the pole $s = 1$, and at zeros of $\zeta(s)$ to be described later. So we use the logarithmic derivative of $\zeta(s)$ instead, which at each pole/zero of $\zeta$ has a simple pole with a known residue and thus a predictable effect on our contour integral.
What are the coefficients $a_n$ for this logarithmic derivative? It is convenient to use not $\zeta'/\zeta$ but $-\zeta'/\zeta$, which has positive coefficients. Using the Euler product we find
$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_p \frac{d}{ds}\log(1-p^{-s}) = \sum_p \log p\,\frac{p^{-s}}{1-p^{-s}} = \sum_p \log p\sum_{k=1}^{\infty} p^{-ks}.$$
That is,
$$-\frac{\zeta'}{\zeta}(s) = \sum_{n=1}^{\infty}\Lambda(n)\,n^{-s}.$$
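One way to test this Dirichlet series identity without evaluating $\zeta$ is to compare coefficients in $-\zeta'(s) = \zeta(s)\cdot(-\zeta'/\zeta)(s)$, which amounts to the convolution identity $\sum_{d|n}\Lambda(d) = \log n$. The check below is an illustrative sketch with made-up function names.

```python
from math import log

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)

def lambda_convolved_with_one(n):
    """Coefficient of n^(-s) in zeta(s) * sum of Lambda(m) m^(-s)."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

for n in (1, 12, 97, 360):
    # log n is the n^(-s) coefficient of -zeta'(s)
    print(n, lambda_convolved_with_one(n), log(n))
```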
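So the $n^{-s}$ coefficient is none other than the von Mangoldt function which arose in the $x!$ factorization! Thus our contour integral
$$\frac{1}{2\pi i}\int_{c-iT}^{c+iT} -\frac{\zeta'}{\zeta}(s)\,x^s\,\frac{ds}{s} \qquad (c > 1)$$
approximates $\psi(x)$. The error can be estimated by our Lemma $(I_T)$: since $|\Lambda(n)|\le\log n$ the error is of order at most
$$\sum_{n=1}^{\infty}(x/n)^c\,\log n\,\min\Bigl(1, \frac{1}{T|\log(x/n)|}\Bigr),$$
which is $O(T^{-1}x^c\log^2 x)$ provided $1 < T < x$. [Exercises: Verify this; explain why the bound need not hold if $T$ is large compared to $x$. Use $(I_T)$ to show that nevertheless
$$\psi(x) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} -\frac{\zeta'}{\zeta}(s)\,x^s\,\frac{ds}{s} \qquad (\textstyle\int)$$
for all $x, c > 1$ in the principal value sense, i.e. as the limit as $T\to\infty$ of the integral over $[c-iT, c+iT]$.] Taking $c = 1 + A/\log x$ so $x^c\ll x$ we find:
$$\psi(x) = \frac{1}{2\pi i}\int_{1+\frac{A}{\log x}-iT}^{1+\frac{A}{\log x}+iT} -\frac{\zeta'}{\zeta}(s)\,x^s\,\frac{ds}{s} + O_A\Bigl(\frac{x\log^2 x}{T}\Bigr). \qquad ()$$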
Similarly for any Dirichlet character $\chi$ we obtain a formula for
$$\psi(x,\chi) := \sum_{n<x}\chi(n)\Lambda(n)$$
by replacing $\zeta(s)$ in () by $L(s,\chi)$.
To make use of this we'll want to shift the line of integration to the left, where $|x^s|$ is smaller. As we do so we shall encounter poles at $s = 1$ and at zeros of $\zeta(s)$, and will have to estimate $|\zeta'/\zeta|$ over the resulting contour. This is why we are interested in the analytic continuation of $\zeta(s)$ [and likewise $L(s,\chi)$] and its zeros. We investigate these next.
Remark: we can already surmise that $\psi(x)$ will be approximated by $x - \sum_\rho x^\rho/\rho$, the sum running over zeros $\rho$ of $\zeta(s)$ counted with multiplicity, and thus that the Prime Number Theorem is tantamount to the nonvanishing of $\zeta(s)$ on $\mathrm{Re}(s) = 1$. That $\zeta(1+it)\ne 0$ is also the key step in various "elementary" proofs of the Prime Number Theorem such as [Newman 1980] (see also [Zagier 1997]).
Exercise: Show that $\sum_{n=1}^{\infty}\mu(n)n^{-s} = 1/\zeta(s)$, with $\mu$ being the Möbius function defined in the last exercise of the previous lecture notes. Deduce an integral formula for $\sum_{n<x}\mu(n)$ analogous to $(\int)$, and an approximate integral formula analogous to () but with error only $O(T^{-1}x\log x)$ instead of $O(T^{-1}x\log^2 x)$.
References
[Newman 1980] Newman, D.J.: Simple Analytic Proof of the Prime Number Theorem, Amer. Math. Monthly 87 (1980), 693–696.
[Zagier 1997] Zagier, D.: Newman's Short Proof of the Prime Number Theorem, Amer. Math. Monthly 104 (1997), 705–708.
The Riemann zeta function and its functional equation
for $\sigma > 0$. Then we have:
Theorem (Riemann): The function $\xi$ extends to a meromorphic function on $\mathbf{C}$, regular except for simple poles at $s = 0, 1$, which satisfies the functional equation
$$\xi(s) = \xi(1-s). \qquad (R)$$
It follows that $\zeta$ also extends to a meromorphic function on $\mathbf{C}$, which is regular except for a simple pole at $s = 1$, and that this analytic continuation of $\zeta$ has simple zeros at the negative even integers $-2, -4, -6, \ldots$, and no other zeros outside the closed critical strip $0\le\sigma\le 1$.
[The zeros $-2, -4, -6, \ldots$ of $\zeta$ outside the critical strip are called its trivial zeros.]
The proof has two ingredients: properties of $\Gamma(s)$ as a meromorphic function of $s\in\mathbf{C}$, and the Poisson summation formula. We treat $\Gamma$ first.
The Gamma function was defined for real $s > 0$ by Euler³ as the integral
$$\Gamma(s) := \int_0^{\infty} x^s e^{-x}\,\frac{dx}{x}. \qquad (\Gamma)$$
We have $\Gamma(1) = \int_0^{\infty} e^{-x}\,dx = 1$ and, integrating by parts,
$$s\Gamma(s) = \int_0^{\infty} e^{-x}\,d(x^s) = -\int_0^{\infty} x^s\,d(e^{-x}) = \Gamma(s+1) \qquad (s > 0),$$
so by induction $\Gamma(n) = (n-1)!$ for positive integers $n$. Since $|x^s| = x^\sigma$, the integral $(\Gamma)$ defines an analytic function on $\sigma > 0$, which still satisfies the recursion $s\Gamma(s) = \Gamma(s+1)$ (proved either by repeating the integration by parts or by analytic continuation from the positive real axis). That recursion then extends $\Gamma$ to a meromorphic function on $\mathbf{C}$, analytic except for simple poles at $0, -1, -2, -3, \ldots$. (What are the residues at those poles?) For $s, s'$ in the right half-plane $\sigma > 0$ the Beta function⁴ $B(s,s')$, defined by the integral
$$B(s,s') := \int_0^1 x^{s-1}(1-x)^{s'-1}\,dx,$$
is related with $\Gamma$ by
$$\Gamma(s+s')\,B(s,s') = \Gamma(s)\,\Gamma(s') \qquad (B)$$
(this is proved by the standard trick of evaluating $\int_0^{\infty}\!\int_0^{\infty} x^{s-1}y^{s'-1}e^{-x-y}\,dx\,dy$ in two different ways). Since $\Gamma(s) > 0$ for real positive $s$ it readily follows that $\Gamma$ has no zeros in $\sigma > 0$, and thus none in the complex plane.
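A numerical spot check of (B) is straightforward for real arguments. This sketch is not from the notes: it assumes $s, s'\ge 1$ so that the Beta integrand is bounded, integrates with a composite Simpson rule, and compares against the standard-library `math.gamma`. The function name `beta_integral` is hypothetical.

```python
from math import gamma

def beta_integral(s, t, steps=20000):
    """Simpson rule for the integral over [0, 1] of x^(s-1) (1-x)^(t-1), assuming s, t >= 1."""
    h = 1.0 / steps  # steps must be even for Simpson's rule

    def f(x):
        return x ** (s - 1) * (1 - x) ** (t - 1)

    total = f(0.0) + f(1.0)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3

for s, t in ((2, 3), (2.5, 3.5), (1.0, 4.0)):
    # Left column: numerical B(s, t); right column: Gamma(s) Gamma(t) / Gamma(s + t)
    print(s, t, beta_integral(s, t), gamma(s) * gamma(t) / gamma(s + t))
```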
³ Actually Euler used $\Pi(s-1)$ for what we call $\Gamma(s)$; thus $\Pi(n) = n!$ for $n = 0, 1, 2, \ldots$.
⁴ a.k.a. "Euler's first integral", $(\Gamma)$ being "Euler's second integral".
This is enough to derive the poles and trivial zeros of $\zeta$ from the functional equation (R). [Don't take my word for it; do it!] But where does (R) come from? There are several known ways to prove it; we shall use Riemann's original method, which generalizes to L-series associated to modular forms. Riemann expresses $\xi(s)$ as an integral involving the theta function
$$\theta(u) := \sum_{n=-\infty}^{\infty} e^{-\pi n^2 u} = 1 + 2(e^{-\pi u} + e^{-4\pi u} + e^{-9\pi u} + \cdots),$$
the sum converging absolutely to an analytic function on the right half-plane $\mathrm{Re}(u) > 0$. Then
$$2\xi(s) = \int_0^{\infty}(\theta(u)-1)\,u^{s/2}\,\frac{du}{u} \qquad (\sigma > 1)$$
(integrate $(\theta(u)-1)u^{s/2}\,du/u = 2\sum_{n=1}^{\infty} e^{-\pi n^2 u}u^{s/2}\,du/u$ termwise). But we shall see:
Lemma: The function $\theta(u)$ satisfies the identity
$$\theta(1/u) = u^{1/2}\,\theta(u). \qquad ()$$
Assume this for the time being. We then rewrite our integral for $2\xi(s)$ as
$$\int_0^1(\theta(u)-1)\,u^{s/2}\,\frac{du}{u} + \int_1^{\infty}(\theta(u)-1)\,u^{s/2}\,\frac{du}{u} = -\frac{2}{s} + \int_0^1\theta(u)\,u^{s/2}\,\frac{du}{u} + \int_1^{\infty}(\theta(u)-1)\,u^{s/2}\,\frac{du}{u},$$
and use the change of variable $u\leftrightarrow 1/u$ to find
$$\int_0^1\theta(u)\,u^{s/2}\,\frac{du}{u} = \int_1^{\infty}\theta(u^{-1})\,u^{-s/2}\,\frac{du}{u} = \int_1^{\infty}\theta(u)\,u^{(1-s)/2}\,\frac{du}{u} = \frac{2}{s-1} + \int_1^{\infty}(\theta(u)-1)\,u^{(1-s)/2}\,\frac{du}{u}.$$
Thus
$$\xi(s) + \frac1s + \frac{1}{1-s} = \frac12\int_1^{\infty}(\theta(u)-1)\,(u^{s/2} + u^{(1-s)/2})\,\frac{du}{u},$$
which is manifestly symmetrical under $s\leftrightarrow 1-s$, and analytic since $\theta(u)-1$ decreases exponentially as $u\to\infty$. This concludes the proof of the functional equation and analytic continuation of $\xi$, assuming our lemma ().
And where does () come from? It is the special case $f(x) = e^{-\pi u x^2}$ of the following
Theorem (Poisson Summation Formula): Let $f:\mathbf{R}\to\mathbf{C}$ be a twice-differentiable function such that $(|x|^r + 1)(|f(x)| + |f''(x)|)$ is bounded for some $r > 1$, and let $\hat f$ be its Fourier transform
$$\hat f(y) = \int_{-\infty}^{+\infty} e^{2\pi i x y} f(x)\,dx.$$
Then
$$\sum_{m=-\infty}^{\infty} f(m) = \sum_{n=-\infty}^{\infty}\hat f(n), \qquad (P)$$
the sums converging absolutely.
Proof: Define $F:\mathbf{R}/\mathbf{Z}\to\mathbf{C}$ by
$$F(x) := \sum_{m=-\infty}^{\infty} f(x+m),$$
the sum converging absolutely to a twice-differentiable function by the assumption on $f$. Thus the Fourier series of $F$ converges absolutely to $F$, so in particular
$$F(0) = \sum_{n=-\infty}^{\infty}\int_0^1 e^{2\pi i n x}F(x)\,dx.$$
But $F(0)$ is just the left-hand side of (P), and the integral is $\hat f(n)$, so its sum over $n\in\mathbf{Z}$ yields the right-hand side of (P), Q.E.D.
Now let $f(x) = e^{-\pi u x^2}$. The hypotheses are handily satisfied for any $r$, so (P) holds. The left-hand side is just $\theta(u)$. To evaluate the right-hand side, we need the Fourier transform of $f$, which is $u^{-1/2}e^{-\pi u^{-1}y^2}$. [Contour integration reduces this claim to $\int_{-\infty}^{\infty} e^{-\pi u x^2}\,dx = u^{-1/2}$, which is the well-known Gauss integral; see the Exercises.] Thus the right-hand side is $u^{-1/2}\theta(1/u)$. Multiplying both sides by $u^{1/2}$ we then obtain () and finally complete the proof of the analytic continuation and functional equation for $\xi(s)$.
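The theta identity $\theta(1/u) = u^{1/2}\theta(u)$ obtained here is striking enough to deserve a numerical look. The following minimal sketch (real $u > 0$ only, with an arbitrary truncation of the rapidly convergent series; names are hypothetical) compares the two sides.

```python
from math import exp, pi, sqrt

def theta(u, terms=60):
    """theta(u) = 1 + 2 * sum over n >= 1 of exp(-pi n^2 u), truncated; intended for u >= 0.05 or so."""
    return 1.0 + 2.0 * sum(exp(-pi * n * n * u) for n in range(1, terms + 1))

for u in (0.1, 0.5, 1.0, 2.0, 7.3):
    # The two columns should agree to rounding error.
    print(u, theta(1.0 / u), sqrt(u) * theta(u))
```

Note how lopsided the identity is in practice: for small $u$ the left side needs many terms, while the right side is essentially $u^{-1/2}$ after one term.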
Remark: We noted already that to each number field $K$ there corresponds a zeta function
$$\zeta_K(s) := \sum_I \mathrm{Nm}(I)^{-s} = \prod_\wp (1-\mathrm{Nm}(\wp)^{-s})^{-1} \qquad (\sigma > 1),$$
in which the sum and product extend respectively over ideals and prime ideals of the ring of integers $O_K$, and their equality expresses unique factorization. As in our case of $K = \mathbf{Q}$, this extends to a meromorphic function on $\mathbf{C}$, regular except for a simple pole at $s = 1$. Moreover it satisfies a functional equation $\xi_K(s) = \xi_K(1-s)$, where
$$\xi_K(s) := \Gamma(s/2)^{r_1}\,\Gamma(s)^{r_2}\,(4^{-r_2}\pi^{-n}|d|)^{s/2}\,\zeta_K(s),$$
in which $n = r_1 + 2r_2 = [K:\mathbf{Q}]$, $r_1, r_2$ are the numbers of real and complex embeddings of $K$, and $d$ is the discriminant of $K/\mathbf{Q}$. The factors $\Gamma(s/2)^{r_1}$, $\Gamma(s)^{r_2}$ may be regarded as factors corresponding to the "archimedean places" of $K$, as the factor $(1-\mathrm{Nm}(\wp)^{-s})^{-1}$ corresponds to the finite place $\wp$. The functional equation can be obtained from generalized Poisson summation as in [Tate 1950]. Most of our results for $\zeta = \zeta_{\mathbf{Q}}$ carry over to these $\zeta_K$, and yield a Prime Number Theorem for primes of $K$; L-series generalize too, though the proper generalization requires some thought when the class and unit groups need no longer be trivial and $\{\pm 1\}$ as for $\mathbf{Q}$. See for instance H. Heilbronn's "Zeta-Functions and L-Functions", Chapter VIII of [CF 1967].
Exercises:
Show that if $\chi:\mathbf{Z}\to\mathbf{C}$ is a function such that $\sum_{m=1}^{n}\chi(m) = O(1)$ (for instance, if $\chi$ is a nontrivial Dirichlet character) then $\sum_{n=1}^{\infty}\chi(n)n^{-s}$ converges uniformly (albeit not absolutely) in compact subsets of $\{\sigma + it : \sigma > 0\}$ and thus defines an analytic function on that half-plane. Apply this to
$$(1-2^{1-s})\,\zeta(s) = 1 - \frac{1}{2^s} + \frac{1}{3^s} - \frac{1}{4^s} + - \cdots$$
(with $\chi(n) = (-1)^{n-1}$) and $(1-3^{1-s})\,\zeta(s)$ to obtain a different proof of the analytic continuation of $\zeta$ to $\sigma > 0$.
Show that $\zeta(-1) = -1/12$, and (if you know or are willing to derive the formula for $\zeta(2m)$) that more generally $\zeta(1-2m) = -B_{2m}/2m$ where $B_k$ is the $k$-th Bernoulli number defined by the generating function $t/(e^t-1) = \sum_{k=0}^{\infty} B_k t^k/k!$. What is $\zeta(0)$? [It is known that in general $\zeta_K(-m)\in\mathbf{Q}$ ($m = 0, 1, 2, \ldots$) for any number field $K$. In fact the functional equation for $\zeta_K$ indicates that once $[K:\mathbf{Q}] > 1$ all the $\zeta_K(-m)$ vanish unless $K$ is totally real and $m$ is odd, in which case the rationality of $\zeta_K(-m)$ was obtained in [Siegel 1969].]
If you've never seen it yet, or have done it once but forgotten, prove (B) by starting from the integral representation of the right-hand side as
$$\int_0^{\infty}\!\!\int_0^{\infty} x^{s-1}y^{s'-1}e^{-x-y}\,dx\,dy$$
and applying the change of variable $(x,y) = (uz, (1-u)z)$. [We will probably have little use for the Beta function in Math 259, but an analogous transformation will show up later in the formula relating Gauss and Jacobi sums.]
Now take $s = s' = 1/2$ to prove that $\Gamma(1/2) = \sqrt\pi$, and thus to obtain the Gauss integral
$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt\pi.$$
Then take $s' = s$ and use the change of variable $u = (1-2x)^2$ in the integral defining $B(s,s)$ to obtain $B(s,s) = 2^{1-2s}B(s,1/2)$, and thus the duplication formula
$$\Gamma(2s) = \pi^{-1/2}\,2^{2s-1}\,\Gamma(s)\,\Gamma(s+\tfrac12).$$
Use Poisson summation to evaluate $\sum_{n=1}^{\infty} 1/(n^2+c^2)$ for $c > 0$. [Evaluating the Fourier transform of $1/(x^2+c^2)$ is a standard exercise in contour integration.] Verify that your answer approaches $\zeta(2) = \pi^2/6$ as $c\to 0$.
Let $\chi_8$ be the Dirichlet character mod 8 defined by $\chi_8(\pm 1) = 1$, $\chi_8(\pm 3) = -1$. Show that if $f$ is a function satisfying the hypothesis of Poisson summation then
$$\sum_{m=-\infty}^{\infty}\chi_8(m)f(m) = 8^{-1/2}\sum_{n=-\infty}^{\infty}\chi_8(n)\hat f(n/8).$$
Letting $f(x) = e^{-\pi u x^2}$, obtain an identity analogous to (), and deduce a functional equation for $L(s,\chi_8)$.
Now let $\chi_4$ be the character mod 4 defined by $\chi_4(\pm 1) = \pm 1$. Show that, again under the Poisson hypotheses,
$$\sum_{m=-\infty}^{\infty}\chi_4(m)f(m) = \frac{1}{2i}\sum_{n=-\infty}^{\infty}\chi_4(n)\hat f(n/4).$$
But now taking $f(x) = e^{-\pi u x^2}$ does not accomplish much! Use $f(x) = xe^{-\pi u x^2}$ instead to find a functional equation for $L(s,\chi_4)$.
For light relief after all this hard work, differentiate the identity () with respect to $u$, set $u = 1$, and conclude that
$$e^{\pi} > 8\pi - 2.$$
What is the approximate size of the difference?
References
[CF 1967] Cassels, J.W.S., Fröhlich, A., eds.: Algebraic Number Theory. London: Academic Press 1967. [AB 9.67.2 / QA 241.A42]
[GR 1980] Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products. New York: Academic Press 1980. [D 9.80.1 / basement reference QA55.G6613]
[Siegel 1969] Siegel, C.L.: Berechnung von Zetafunktionen an ganzzahligen Stellen, Gött. Nach. 10 (1969), 87–102.
[Tate 1950] Tate, J.T.: Fourier Analysis in Number Fields and Hecke's Zeta-Functions. Thesis, 1950; Chapter XV of [CF 1967].
More about the Gamma function
The product formula for $\Gamma(s)$. Recall that $\Gamma(s)$ has simple poles at $s = 0, -1, -2, \ldots$ and no zeros. We readily concoct a product that has the same behavior: let
$$g(s) := \frac1s\prod_{k=1}^{\infty}\frac{e^{s/k}}{1+\frac sk},$$
the product converging uniformly in compact subsets of $\mathbf{C} - \{0, -1, -2, \ldots\}$ because $e^x/(1+x) = 1 + O(x^2)$. Then $\Gamma/g$ is an entire function with neither poles nor zeros. What about $g(s+1)/g(s)$? That is the limit as $N\to\infty$ of
$$\frac{s}{s+1}\prod_{k=1}^{N} e^{1/k}\,\frac{1+\frac sk}{1+\frac{s+1}{k}} = \frac{s}{s+1}\exp\Bigl(\sum_{k=1}^{N}\frac1k\Bigr)\prod_{k=1}^{N}\frac{k+s}{k+s+1} = s\cdot\frac{N}{N+s+1}\exp\Bigl(-\log N + \sum_{k=1}^{N}\frac1k\Bigr).$$
Now the factor $N/(N+s+1)$ approaches 1, while the exponent $-\log N + \sum_{k=1}^{N}\frac1k$ tends to Euler's constant $\gamma = 0.57721566490\ldots$ Thus $g(s+1) = se^{\gamma}g(s)$, and if we define
$$\Gamma^\star(s) := e^{-\gamma s}g(s) = \frac{e^{-\gamma s}}{s}\prod_{k=1}^{\infty}\frac{e^{s/k}}{1+\frac sk} = \frac1s\lim_{N\to\infty} N^s\prod_{k=1}^{N}\frac{k}{s+k} \qquad (P)$$
which on integration by parts becomes $2|t|\int_0^{\infty} dx/(x^2+1) = \pi|t|$. This proves (Q), and thus shows that (P) is a product formula for $\Gamma(s)$.
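The limit form of (P), $\Gamma(s) = \frac1s\lim_{N\to\infty} N^s\prod_{k=1}^N k/(s+k)$, converges slowly but is simple to evaluate. The sketch below is an illustration only (the function name and the cutoff $N$ are arbitrary); it sums logarithms to avoid overflow and accepts complex $s$ away from the poles.

```python
import cmath
import math

def gamma_product(s, N=100000):
    """N-th approximant (1/s) * N^s * prod_{k=1..N} k/(s+k) of the product formula."""
    s = complex(s)
    log_value = s * math.log(N) - cmath.log(s)
    for k in range(1, N + 1):
        log_value += math.log(k) - cmath.log(s + k)
    # Branch ambiguities of the summed logarithms are multiples of 2*pi*i,
    # which the exponential removes.
    return cmath.exp(log_value)

for s in (0.5, 2.5, 4.0):
    print(s, gamma_product(s).real, math.gamma(s))
```

The relative error of the $N$-th approximant is roughly $s(s+1)/2N$, so a few digits is all one should expect from this cutoff.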
Consequences of the product formula. First among these is the Stirling approximation¹ to $\log\Gamma(s)$. Fix $\epsilon > 0$ and let $R_\epsilon$ be the region
$$\{s\in\mathbf{C} : |s| > \epsilon,\ |\mathrm{Im}(\log s)| < \pi - \epsilon\}.$$
Then $R_\epsilon$ is a simply-connected region containing none of the poles of $\Gamma$, so there is an analytic function $\log\Gamma$ on $R_\epsilon$, real on $R_\epsilon\cap\mathbf{R}$, and given by the above product formula:
$$\log\Gamma(s) = \lim_{N\to\infty}\Bigl(s\log N + \log N! - \sum_{k=0}^{N}\log(s+k)\Bigr). \qquad (L)$$
We estimate the sum as we did for $\log x!$ in obtaining the original form of Stirling's approximation: the sum differs from
$$\int_{-1/2}^{N+1/2}\log(s+x)\,dx = (N+\tfrac12+s)\log(N+\tfrac12+s) - (s-\tfrac12)\log(s-\tfrac12) - N - 1$$
$$= (N+\tfrac12+s)\log N + (N+\tfrac12+s)\log\bigl(1+\tfrac1N(s+\tfrac12)\bigr) - (s-\tfrac12)\log(s-\tfrac12) - N - 1$$
by
$$\frac12\int_{-1/2}^{N+1/2}\frac{\langle x+\frac12\rangle^2}{(s+x)^2}\,dx \ll_\epsilon |s|^{-1}.$$
with the duplication formula), and that the $O_\epsilon(|s|^{-1})$ error can be expanded in an asymptotic series in inverse powers of $s$ (which come from further terms in the Euler-Maclaurin expansion of the sum in (L)).
The logarithmic derivative of our product formula for $\Gamma(s)$ is
$$\frac{\Gamma'(s)}{\Gamma(s)} = -\gamma - \frac1s + \sum_{k=1}^{\infty}\Bigl(\frac1k - \frac{1}{s+k}\Bigr) = \lim_{N\to\infty}\Bigl(\log N - \sum_{k=0}^{N}\frac{1}{s+k}\Bigr).$$
¹ Originally only for $n! = \Gamma(n+1)$, but we need it for complex $s$ as well. As we shall see it can also be obtained from the stationary-phase expansion of Euler's integral, the critical point of $x^{s-1}e^{-x}$ being $x = s-1$.
Either by differentiating² (S) or by applying the same Euler-Maclaurin trick to $\sum_0^N 1/(s+k)$ we find that
$$\frac{\Gamma'(s)}{\Gamma(s)} = \log s - \frac{1}{2s} + O_\epsilon(|s|^{-2}). \qquad (S')$$
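The approximation $\Gamma'(s)/\Gamma(s) = \log s - 1/2s + O(|s|^{-2})$ can be probed on the positive real axis. This sketch is only a numerical illustration: it approximates the logarithmic derivative by a central difference of the standard-library `math.lgamma` (the step `h` is an arbitrary choice), and rescales the discrepancy by $s^2$ to exhibit the $O(|s|^{-2})$ behavior.

```python
from math import lgamma, log

def digamma(s, h=1e-4):
    """Central-difference approximation to Gamma'(s)/Gamma(s) for real s > h."""
    return (lgamma(s + h) - lgamma(s - h)) / (2 * h)

for s in (5.0, 20.0, 80.0):
    approx = log(s) - 1 / (2 * s)
    # The rescaled discrepancy stays bounded; it settles near -1/12,
    # the coefficient of the next term of the standard asymptotic series.
    print(s, digamma(s), approx, s * s * (digamma(s) - approx))
```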
Show that this gives the analytic continuation of $\zeta$ to a meromorphic function on $\mathbf{C}$; shift the line of integration to the left to obtain the functional equation relating $\zeta(s)$ to $\zeta(1-s)$ for $\sigma < 0$, and thus by continuation for all $s$.
3. (behavior on vertical lines) Deduce from (S) that for fixed $\sigma\in\mathbf{R}$
$$\mathrm{Re}\,\log\Gamma(\sigma+it) = (\sigma-\tfrac12)\log|t| - \frac{\pi}{2}|t| + C_\sigma + O_\sigma(|t|^{-1})$$
as $|t|\to\infty$. Check that for $\sigma = 0, 1/2$ this agrees with the exact formulas
$$|\Gamma(it)|^2 = \frac{\pi}{t\sinh\pi t}, \qquad |\Gamma(1/2+it)|^2 = \frac{\pi}{\cosh\pi t}$$
obtained from (*).
Reference
[All this material is standard; one basic reference is Ch. XII (pages 235–264) of [WW 1940]. One reason for not just citing Whittaker & Watson is that some of the results concerning Euler's integrals $B$ and $\Gamma$ have close analogues in the Gauss and Jacobi sums associated to Dirichlet characters, and we'll need these analogues shortly.]
[WW 1940] Whittaker, E.T., Watson, G.N.: A Course of Modern Analysis…³ (fourth edition). Cambridge University Press, 1940 (reprinted 1963). [HA 9.40 / QA295.W38]
³ The full title is 26 words long, which was not out of line when the book first appeared in 1902. You can find the title in Hollis.
Functions of finite order: product formula and logarithmic derivative
[See for instance Chapter 11 of [Davenport 1967], keeping in mind that Davenport uses "integral function" for what we call an "entire function"; Davenport treats only the case of order (at most) 1, which is all that we need, but it is scarcely harder to deal with any finite order as we do here.]
The order of an entire function $f(\cdot)$ is the smallest $\alpha\in[0,+\infty]$ such that $f(z)\ll_\epsilon\exp|z|^{\alpha+\epsilon}$ for all $\epsilon > 0$. Entire functions of finite order have nice infinite products. We have seen already the cases of $\sin z$ and $1/\Gamma(z)$, both of order 1. As we shall see, $(s^2-s)\xi(s)$ also has order 1 (as do analogous functions we'll obtain from Dirichlet L-series). From the product formula for $\xi(s)$ we shall obtain a partial-fraction decomposition of $\zeta'(s)/\zeta(s)$ which we shall use to analyze the contour-integral formula for $\psi(x)$.
Suppose first that $f$ has finite order and no zeros. Then $f = e^g$ for some entire function $g$. We claim that $g$ is a polynomial. Indeed the real part of $g$ is $< O(|z|^{\alpha+\epsilon})$. But then the same is true of $|g(z)|$. Let $h = g - g(0)$. Then $h(0) = 0$. Let $M = \sup_{|z|\le 2R}\mathrm{Re}\,h(z)$; by assumption $M\ll R^{\alpha+\epsilon}$. Then $h_1 := h/(2M-h)$ is analytic in $C := \{z\in\mathbf{C} : |z|\le 2R\}$, with $h_1(0) = 0$ and $|h_1(z)|\le 1$ in $C$. Consider now the analytic function $\phi(z) := 2Rh_1(z)/z$ on $C$. On the boundary of that circle, $|\phi(z)|\le 1$. Thus by the maximum principle the same is true for all $z\in C$. In particular if $|z|\le R$ then $|h_1(z)|\le 1/2$. But then $|h(z)|\le 2M$. Thus $|g(z)|\le 2M + |g(0)|\ll|z|^{\alpha+\epsilon}$ and $g$ is a polynomial in $z$ as claimed. Moreover the degree of that polynomial is just the order of $f$.
We shall reduce the general case to this by dividing a given function $f$ of finite order (not vanishing identically on $\mathbf{C}$) by a product whose zeros match those of $f$. For this product to converge we'll need to bound the number of zeros of $f$ in a circle. Let $z_1, z_2, \ldots$ be the zeros of $f$, listed with the correct multiplicity in increasing order of $|z_k|$. Consider first $f(z)$ in $|z| < 1$. Let $z_n$ be the last zero of $f$ there. Let $g$ be the Blaschke product $\prod_{k=1}^{n}(z-z_k)/(1-\bar z_k z)$, designed to have the same zeros but with $|g(z)| = 1$ on $|z| = 1$. Then $f_1 := f/g$ is analytic on $|z|\le 1$, and $|f(z)| = |f_1(z)|$ on the boundary $|z| = 1$. Therefore by the maximum principle $|f_1(0)|\le\max_{|z|=1}|f(z)|$, so
$$|f(0)| = |g(0)f_1(0)| = \prod_{k=1}^{n}|z_k|\cdot|f_1(0)| \le \prod_{k=1}^{n}|z_k|\,\max_{|z|=1}|f(z)|.$$
It follows that if $z_k$ ($1\le k\le n(R)$) are the zeros of $f$ in $|z| < R$ then¹
$$|f(0)| \le \prod_{k=1}^{n(R)}\frac{|z_k|}{R}\,\max_{|z|=R}|f(z)|.$$
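Now suppose for convenience that $f(0)\ne 0$ [otherwise apply the following argument to $f/z^r$ where $r$ is the order of the vanishing of $f$ at $z = 0$]. Then we may take logarithms to find
$$\log\max_{|z|=R}|f(z)| \ge \log|f(0)| + \sum_{k=1}^{n(R)}\log\frac{R}{|z_k|} = \log|f(0)| + \int_0^R n(r)\,\frac{dr}{r}.$$
If $f$ has order at most $\alpha < \infty$ then the LHS of this is $O_\epsilon(R^{\alpha+\epsilon})$, and we conclude that
$$n(R) = n(R)\int_R^{eR}\frac{dr}{r} \le \int_0^{eR} n(r)\,\frac{dr}{r} \ll_\epsilon R^{\alpha+\epsilon}.$$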
R 0
It follows that 1 ?
k=1 jzk j converges if > , since the sum is
P
Z 1 Z 1 Z 1
?
r dn(r) = r ? ?1
n(r) dr r+??1 dr < 1
0 jz1 j jz1j
for any positive < ? . Therefore the product
1 a 1 z m
(1 ? zzk ) exp
Y X
P (z ) := (a = bc)
k=1 m=1 m zk
converges for all z 2 C. Moreover the convergence is uniform in compact subsets
of C, because on jz j R log(1?z=zk )+ am=1 (z=zk )m =m (z=zk )a+1 zk?a?1
P
uniformly once k > n(2R). Thus P (z ) is an entire function, with the same zeros
and multiplicities as f .
It follows that $f/P$ is an entire function without zeros. We claim that it too has order at most $\alpha$, and is thus $\exp g(z)$ for some polynomial $g$ of degree at most $a$. To do this it is enough to prove that for each $\epsilon > 0$ there exists $C$ such that for all $R\ge 1$ there exists $r\in(R,2R)$ such that $|P(z)|\ge\exp(-CR^{\alpha+\epsilon})$ for all $z\in\mathbf{C}$ with $|z| = r$. Write $P = P_1P_2$, with $P_1, P_2$ being a product over $k\le n(4R)$ and $k > n(4R)$ respectively. The $k$-th factor of $P_2(z)$ is $\exp O(|z/z_k|^{a+1})$, so
$$\log|P_2(z)| \ll R^{a+1}\sum_{k>n(4R)}|z_k|^{-a-1} = R^{a+1}\int_{4R}^{\infty} r^{-a-1}\,dn(r) \ll R^{\alpha+\epsilon},$$
using integration by parts and $n(r)\ll r^{\alpha+\epsilon}$ in the last step (check this!). As to $P_1$, it is a finite product, which is $e^{h(z)}\prod_{k\le n(4R)}(1-z/z_k)$ where $h(z)$ is the polynomial
$$h(z) = \sum_{k=1}^{n(4R)}\sum_{m=1}^{a}\frac1m\Bigl(\frac{z}{z_k}\Bigr)^m$$
of degree at most $a$. Thus $h(z)\ll n(4R) + R^a\sum_{k\le n(4R)}|z_k|^{-a}$, which readily yields $h(z)\ll R^{\alpha+\epsilon}$ (carry out the required partial integration and estimates). So far, our lower bounds on the factors of $P(z)$ held for all $z$ in the annulus $R < |z| < 2R$, but we cannot expect the same for $P_3(z) := \prod_{k\le n(4R)}(1-z/z_k)$, since it may vanish at some points of the annulus. However, we can prove that some $r$ works by estimating the average
¹ Since the resulting function $f_1$ has no zeros in $|z| < R$, it follows that $\log|f_1(z)|$ is a harmonic function on that circle, whence Jensen's formula: if $f(0)\ne 0$ then $\frac{1}{2\pi}\int_0^{2\pi}\log|f(Re^{i\theta})|\,d\theta = \log|f(0)| + \sum_{k=1}^{n(R)}\log\frac{R}{|z_k|}$.
R nX
(2R) R
R jzj=r k=1 R R k
The integral is elementary, if not pretty, and at the end we conclude that the average is again $\ll R^{\alpha+\epsilon}$. This shows that for some $r\in(R,2R)$ the desired lower bound holds, and we have finally proved the product formula
$$f(z) = P(z)\,e^{g(z)} = e^{g(z)}\prod_{k=1}^{\infty}\Bigl(1-\frac{z}{z_k}\Bigr)\exp\Bigl(\sum_{m=1}^{a}\frac1m\Bigl(\frac{z}{z_k}\Bigr)^m\Bigr).$$
Taking logarithmic derivatives we deduce
$$\frac{f'(z)}{f(z)} = g'(z) + \frac{P'(z)}{P(z)} = g'(z) + \sum_{k=1}^{\infty}\frac{(z/z_k)^a}{z-z_k}.$$
We note too that if $\alpha > 0$ and $\sum_k|z_k|^{-\alpha} < \infty$ then there exists a constant $C$
$f(s) = 1/\Gamma(s)$. [This may appear circular because it is proved from the product formula for $\Gamma(s)$, but it need not be; see Exercise 3 below.] As we shall see, the same is true for $f(s) = (s^2-s)\xi(s)$; it will follow that $\xi$, and thus $\zeta$, has infinitely many nontrivial zeros with real part in $[0,1]$, and in fact that the sum of their reciprocals' norms diverges.
Exercises:
2. Find an entire function $f(z)$ of order 1 such that $|f(z)|\ll\exp O(|z|)$ but $\sum_{k=1}^{\infty}|z_k|^{-1} = \infty$. [Hint: you don't have to look very far.]
3. Supply the missing steps in our proof of the product formula.
4. Show that $1/\Gamma(s)$ is an entire function of order 1, using only the following tools available to Euler: the integral formulas for $\Gamma(s)$ and $B(s,s')$, and the identities $B(s,s') = \Gamma(s)\Gamma(s')/\Gamma(s+s')$ and $\Gamma(s)\Gamma(1-s) = \pi/\sin\pi s$. [The hard part is getting an upper bound for $1/|\Gamma(s)|$ on a vertical strip; remember how we showed that $\Gamma(s)\ne 0$, and use the formula for $|\Gamma(1/2+it)|^2$ to get a better lower bound on $|\Gamma(s)|$.] Use this to recover the product formula for $\Gamma(s)$, up to a factor $e^{A+Bs}$ which may be determined from the behavior of $\Gamma(s)$ at $s = 0, 1$.
5. Prove that if $f(z)$ is an entire function of order $\alpha > 0$ then
$$\iint_{|z|<r}|f'(z)/f(z)|\,dx\,dy \ll_\epsilon r^{\alpha+1+\epsilon} \qquad (z = x+iy)$$
as $r\to\infty$. [Note that the integral is improper (except in the trivial case that $f$ has no zeros) but still converges: if $\phi$ is a meromorphic function on a region $U\subseteq\mathbf{C}$ with simple but no higher-order poles then $|\phi|$ is integrable on compact subsets $K\subset U$, even $K$ that contain poles of $\phi$.]
Reference
[Davenport 1967] Davenport, H.: Multiplicative Number Theory. Chicago:
Markham, 1967; New York: Springer-Verlag, 1980 (GTM 74). [9.67.6 & 9.80.6
/ QA 241.D32]
The product formula for $\xi(s)$ and $\zeta(s)$; vertical distribution of zeros
[Check that for $\sigma = 0, 1/2$ this agrees with the exact formulas
$$|\Gamma(it)|^2 = \frac{\pi}{t\sinh\pi t}, \qquad |\Gamma(1/2+it)|^2 = \frac{\pi}{\cosh\pi t}$$
obtained from the formula for $\Gamma(s)\Gamma(1-s)$.] For $\sigma > 1$, the Euler product for $\zeta(s)$ shows that $\log|\zeta(\sigma+it)|\ll_\sigma 1$; indeed we have the upper and lower bounds
$$\zeta(\sigma) \ge |\zeta(\sigma+it)| > \prod_p(1+p^{-\sigma})^{-1} = \zeta(2\sigma)/\zeta(\sigma).$$
Thus $|\xi(\sigma+it)|$ is within a constant factor of $|t|^{(\sigma-1)/2}e^{-\pi|t|/4}$ for large $|t|$. The functional equation $\xi(s) = \xi(1-s)$ then gives the same result for $|\xi(1-\sigma+it)|$, and we conclude for each $\sigma < 0$ that $|\zeta(\sigma+it)|$ is within a constant factor of $|t|^{1/2-\sigma}$ for large $t$.
What about $|\zeta(\sigma+it)|$ for $\sigma\in[0,1]$, i.e. within the critical strip? Generalizing our formula for analytically continuing $\zeta(s)$ we find for $\sigma > 0$
$$\zeta(s) = \sum_{n=1}^{N-1} n^{-s} + \frac{N^{1-s}}{s-1} + \sum_{n=N}^{\infty}\int_n^{n+1}(n^{-s}-x^{-s})\,dx,$$
which for large $t, N$ is $\ll N^{1-\sigma} + |t|N^{-\sigma}$, uniformly at least for $\sigma\ge 1/2$. Taking $N = |t| + O(1)$ we find $\zeta(\sigma+it)\ll|t|^{1-\sigma}$ there, so by the functional equation also $\zeta(\sigma+it)\ll t^{1/2}$ for $\sigma > 0$. In fact either the "approximate functional equation" for $\zeta(s)$ (usually attributed to Siegel, but now known to have been used by Riemann) or general convexity results (variations on the "Three Lines Theorem") tell us that $\zeta(\sigma+it)\ll_\epsilon|t|^{(1-\sigma)/2+\epsilon}$ for $\sigma\in[0,1]$. For our present purposes any bound $|t|^{O(1)}$ will do, but the Lindelöf conjecture asserts that in fact $\zeta(\sigma+it)\ll_\epsilon|t|^{\epsilon}$ for all $\sigma\ge 1/2$ (excluding a neighborhood of the pole $s = 1$), and thus by the functional equation that also $\zeta(\sigma+it)\ll_\epsilon|t|^{1/2-\sigma+\epsilon}$ for all $\sigma\le 1/2$. We shall see that this is implied by the Riemann hypothesis. However, the best upper bound currently proved on
$$\limsup_{|t|\to\infty}\frac{\log|\zeta(1/2+it)|}{\log|t|}$$
is only a bit smaller than 1/6; when we get to exponential sums later this term we shall derive the upper bound of 1/6.
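The displayed formula for $\zeta(s)$ in $\sigma > 0$ is also a workable, if slow, way to compute $\zeta$ inside the critical strip. The sketch below is an illustration rather than a recommended algorithm: it evaluates each integral $\int_n^{n+1}(n^{-s}-x^{-s})\,dx$ in closed form and simply cuts the infinite sum off at a hypothetical cutoff `M`, which leaves an error of size about $M^{-\sigma}/2$.

```python
import math

def zeta_truncated(s, N=10, M=100000):
    """zeta(s) for Re(s) > 0, s != 1, from the formula with the tail sum cut off at M."""
    s = complex(s)
    head = sum(n ** (-s) for n in range(1, N))
    pole_term = N ** (1 - s) / (s - 1)
    tail = 0j
    for n in range(N, M):
        # closed form of the integral of (n^-s - x^-s) over [n, n+1]
        tail += n ** (-s) - ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)
    return head + pole_term + tail

print(zeta_truncated(2.0).real, math.pi ** 2 / 6)
print(zeta_truncated(0.5).real)  # the real value zeta(1/2) is about -1.46035
```

The same function accepts complex arguments, so it can be used to watch $|\zeta(1/2+it)|$ dip toward zero near the first zero at $t\approx 14.1347$.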
A remark about our choice of $N\asymp|t|$ in the bound $\zeta(\sigma+it)\ll N^{1-\sigma} + |t|N^{-\sigma}$: of course we wanted to choose $N$ to make the bound as good as possible, i.e. to minimize $N^{1-\sigma} + |t|N^{-\sigma}$. In calculus we learned to do this by setting the derivative equal to zero. That would give $N$ proportional to $|t|$, but we arbitrarily set the constant of
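Moreover $\sum_\rho |\rho|^{-1-\epsilon} < \infty$ for all $\epsilon > 0$ but $\sum_\rho |\rho|^{-1} = \infty$. The logarithmic
derivative of the product formula $(\xi)$ is
\[ \frac{\xi'}{\xi}(s) = B - \frac{1}{s} - \frac{1}{s-1} + \sum_\rho \Bigl[ \frac{1}{s-\rho} + \frac{1}{\rho} \Bigr]; \qquad (\xi') \]
since $\xi(s) = \pi^{-s/2}\Gamma(s/2)\zeta(s)$ we also get a product formula for $\zeta(s)$, and a
partial-fraction expansion of its logarithmic derivative:
\[ \frac{\zeta'}{\zeta}(s) = B - \frac{1}{s-1} + \frac12\log\pi - \frac12\frac{\Gamma'}{\Gamma}\Bigl(\frac{s}{2}+1\Bigr) + \sum_\rho \Bigl[ \frac{1}{s-\rho} + \frac{1}{\rho} \Bigr]. \qquad (\zeta') \]
(We have shifted from $\Gamma(s/2)$ to $\Gamma(s/2+1)$ to absorb the term $-1/s$; note that
$\zeta(s)$ does not have a pole or zero at $s = 0$.)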
Vertical distribution of zeros. Since the zeros $\rho$ of $\xi(s)$ are limited to a strip
we can find much more precise information about the distribution of their sizes
than the convergence and divergence of $\sum_\rho|\rho|^{-1-\epsilon}$ and $\sum_\rho|\rho|^{-1}$. Let $N(T)$ be
the number of zeros in the rectangle $\sigma \in [0,1]$, $t \in [0,T]$ (this is essentially half
of what we called $n(T)$ in the context of the general product formula). We shall
prove a theorem of von Mangoldt: as $T \to \infty$,
\[ N(T) = \frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T). \qquad (N) \]
We follow chapter 15 of [Davenport 1967], keeping track of the fact that Davenport's
$\xi$ and ours differ by a factor of $(s^2 - s)/2$.
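As a quick sanity check on the size of the main term in (N), the following minimal Python sketch evaluates it at $T = 100$. The function name and the comparison value 29 (the commonly tabulated number of zeros with $0 < t < 100$) are not from these notes and are introduced only for illustration.

```python
import math

def von_mangoldt_main_term(T):
    """Main term (T/2pi) log(T/2pi) - T/2pi of the zero-counting formula (N)."""
    u = T / (2 * math.pi)
    return u * math.log(u) - u

# Assumption for illustration: 29 is the commonly tabulated count of zeros
# of zeta with imaginary part in (0, 100); it is not taken from these notes.
TABULATED_N_100 = 29

def discrepancy_at_100():
    """Difference between the tabulated count and the main term at T = 100."""
    return TABULATED_N_100 - von_mangoldt_main_term(100.0)
```

The discrepancy is of size $O(1)$ here, comfortably inside the $O(\log T)$ allowed by the theorem.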
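We may assume that $T$ does not equal the imaginary part of any zero of $\zeta(s)$.
Then
\[ 2N(T) - 2 = \frac{1}{2\pi i}\oint_{C_R} \frac{\xi'}{\xi}(s)\,ds = \frac{1}{2\pi i}\oint_{C_R} d(\log\xi(s)) = \frac{1}{2\pi}\oint_{C_R} d(\operatorname{Im}\log\xi(s)), \]
where $C_R$ is the boundary of the rectangle $\sigma \in [-1,2]$, $t \in [-T,T]$. Since
$\xi(s) = \xi(1-s) = \overline{\xi(\bar s)}$, we may by symmetry evaluate the last integral by
integrating over a quarter of $C_R$ and multiplying by 4. We use the top right
quarter, going from $2$ to $2+iT$ to $1/2+iT$. At $s = 2$, $\log\xi(s)$ is real, so we
have
\[ \pi\,(N(T)-1) = \operatorname{Im}\log\xi(\tfrac12+iT) = \operatorname{Im}\bigl(\log\Gamma(\tfrac14+\tfrac{iT}{2})\bigr) - \tfrac{T}{2}\log\pi + \operatorname{Im}\bigl(\log\zeta(\tfrac12+iT)\bigr). \]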
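By Stirling, the first term is within $O(T^{-1})$ of
\[ \operatorname{Im}\Bigl[\bigl(\tfrac{iT}{2}-\tfrac14\bigr)\log\bigl(\tfrac{iT}{2}+\tfrac14\bigr)\Bigr] - \tfrac{T}{2}
 = \tfrac{T}{2}\log\bigl|\tfrac{iT}{2}+\tfrac14\bigr| - \tfrac14\operatorname{Im}\log\bigl(\tfrac{iT}{2}+\tfrac14\bigr) - \tfrac{T}{2}
 = \tfrac{T}{2}\bigl(\log\tfrac{T}{2} - 1\bigr) + O(1). \]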
A zero-free region for $\zeta(s)$
We first show, as promised, that $\zeta(s)$ does not vanish on $\sigma = 1$. As usual nowadays
we give Mertens' elegant version of the original arguments of Hadamard and (independently)
de la Vallée Poussin. Recall that
\[ -\frac{\zeta'}{\zeta}(s) = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} \]
has a simple pole at $s = 1$ with residue $+1$. If $\zeta(s)$ were to vanish at some $1+it$ then
$-\zeta'/\zeta$ would have a simple pole with residue $-1$ (or $-2, -3, \ldots$) there. The idea is
that $\sum_n \Lambda(n)/n^s$ converges for $\sigma > 1$, and as $\sigma \to 1+$ all the terms contribute towards
the positive-residue pole. As $s$ approaches $1+it$ from the right, the corresponding
terms have the same magnitude but are multiplied by $n^{it}$, so to get a pole with residue
$-1$ "almost all" the phases $n^{it}$ would be near $-1$. But then near $1+2it$ the phases
$n^{2it}$ would again approximate $(-1)^2 = +1$, yielding a pole of positive residue, which
is not possible because then $\zeta$ would have another pole besides $s = 1$.
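To make precise the idea that if $n^{it} \approx -1$ then $n^{2it} \approx +1$, we use the identity
\[ 2(1+\cos\theta)^2 = 3 + 4\cos\theta + \cos 2\theta, \]
from which it follows that the right-hand side is nonnegative. Thus if $\theta = t\log n$ we have
\[ 3 + 4\operatorname{Re}(n^{it}) + \operatorname{Re}(n^{2it}) \ge 0. \]
Multiplying by $\Lambda(n)/n^\sigma$ and summing over $n$ we find
\[ 3\Bigl[-\frac{\zeta'}{\zeta}(\sigma)\Bigr] + 4\operatorname{Re}\Bigl[-\frac{\zeta'}{\zeta}(\sigma+it)\Bigr] + \operatorname{Re}\Bigl[-\frac{\zeta'}{\zeta}(\sigma+2it)\Bigr] \ge 0 \qquad (*) \]
for all $\sigma > 1$ and $t \in \mathbf{R}$. Fix $t \ne 0$. As $\sigma \to 1+$, the first term in the LHS of this
inequality is $3/(\sigma-1) + O(1)$, and the remaining terms are bounded below. If $\zeta$ had
a zero of order $r > 0$ at $1+it$, the second term would be $-4r/(\sigma-1) + O(1)$. Thus
the inequality yields $4r \le 3$. Since $r$ is an integer this is impossible, and the proof is
complete.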
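The trigonometric identity and the resulting phase inequality are easy to spot-check numerically. The sketch below is only an illustration; the function names are hypothetical and not part of the notes.

```python
import math

def mertens_combination(theta):
    """Return 3 + 4 cos(theta) + cos(2 theta)."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)

def squared_form(theta):
    """Return 2 (1 + cos(theta))^2, which the identity says is the same number."""
    return 2 * (1 + math.cos(theta)) ** 2

def phase_inequality(n, t):
    """Return 3 + 4 Re(n^{it}) + Re(n^{2it}), using theta = t log n."""
    return mertens_combination(t * math.log(n))
```

Since the combination is a perfect square up to the factor 2, it vanishes only when $\cos\theta = -1$, which is exactly the "all phases near $-1$" scenario the proof rules out.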
Two remarks about this proof are in order. First, the only properties of $\Lambda(n)$
we used are the facts that $\Lambda(n) \ge 0$ for all $n$ and that $\sum_n \Lambda(n)/n^s$ has an analytic
continuation with a simple pole at $s = 1$ and no other poles of real part $\ge 1$. Thus the
same argument exactly will show that $\prod_{\chi \bmod q} L(s,\chi)$, and thus each of the factors
$L(s,\chi)$, has no zero on the line $\sigma = 1$. Second, the $3 + 4\cos\theta + \cos 2\theta$ trick is
worth remembering since it has been adapted to other uses. For instance we shall
revisit and generalize it when we develop the Drinfeld-Vlăduţ upper bounds on points
of a curve over a finite field and the Odlyzko-Stark lower bounds on discriminants of
number fields. See also the Exercise below.
Returning to (*), we next use it together with the partial-fraction formula
\[ -\frac{\zeta'}{\zeta}(s) = \frac{1}{s-1} + B_1 + \frac12\frac{\Gamma'}{\Gamma}\Bigl(\frac{s}{2}+1\Bigr) - \sum_\rho\Bigl[\frac{1}{s-\rho} + \frac{1}{\rho}\Bigr] \]
to show that even the existence of a zero close to $1+it$ is not possible. How close
depends on $t$; specifically, we shall show that there is a constant $c > 0$ such that if
$|t| > 2$ and $\zeta(\sigma+it) = 0$ then¹
\[ \sigma < 1 - \frac{c}{\log|t|}. \qquad (Z) \]
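Now let $\sigma \in [1,2]$ and² $|t| \ge 2$ in the partial-fraction formula. Then the $B_1$ and $\Gamma'/\Gamma$
terms are $O(\log|t|)$, and each of the terms $1/(s-\rho)$, $1/\rho$ has positive real part as
noted in connection with von Mangoldt's theorem on $N(T)$. Therefore³
\[ -\operatorname{Re}\frac{\zeta'}{\zeta}(\sigma+2it) < O(\log|t|), \]
and if some $\rho = 1-\delta+it$ then, keeping only that zero's term in the sum,
\[ -\operatorname{Re}\frac{\zeta'}{\zeta}(\sigma+it) < O(\log|t|) - \frac{1}{\sigma+\delta-1}. \]
Thus (*) yields
\[ \frac{4}{\sigma+\delta-1} < \frac{3}{\sigma-1} + O(\log|t|). \]
In particular, taking⁴ $\sigma = 1+4\delta$ yields $1/(20\delta) < O(\log|t|)$. Thus $\delta \gg (\log|t|)^{-1}$, and
our claim (Z) follows.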
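The arithmetic behind the choice $\sigma = 1 + 4\delta$ can be checked with exact rational numbers. In the sketch below (names are illustrative only), `gap_coefficient(alpha)` is the coefficient of $1/\delta$ in $4/(\sigma+\delta-1) - 3/(\sigma-1)$ when $\sigma = 1 + \alpha\delta$.

```python
from fractions import Fraction

def gap_coefficient(alpha):
    """Coefficient of 1/delta in 4/(sigma+delta-1) - 3/(sigma-1) at sigma = 1 + alpha*delta.

    Substituting gives 4/((alpha+1)*delta) - 3/(alpha*delta), so the
    coefficient is 4/(alpha+1) - 3/alpha = (alpha-3)/(alpha*(alpha+1)).
    """
    alpha = Fraction(alpha)
    return Fraction(4) / (alpha + 1) - Fraction(3) / alpha
```

The coefficient is positive exactly when $\alpha > 3$, and $\alpha = 4$ gives the $1/20$ used in the text.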
Once we obtain the functional equation and partial-fraction decomposition for Dirichlet
L-functions $L(s,\chi)$, the same argument will show that (Z) also gives a zero-free region
for $L(s,\chi)$, though with the implied constant depending on $\chi$.
Exercise: Show that for each $\alpha > 2$ there exists $t \in \mathbf{R}$ such that
\[ \int_{-\infty}^{\infty} \exp(-|x|^\alpha + itx)\,dx < 0. \]
(Yes, this is related to the present topic; see [EOR 1991, p.633]. The integral is known
to be positive for all $t \in \mathbf{R}$ when $\alpha \in (0,2]$, see for instance [EOR 1991, Lemma 5].)
References
For the zero-free region see for instance chapter 13 of Davenport's book [Davenport
1967] cited earlier.
[EOR 1991] Elkies, N.D., Odlyzko, A., Rush, J.A.: On the packing densities of superballs
and other bodies, Invent. Math. 105 (1991), 613-639.
[Montgomery 1971] Montgomery, H.L.: Topics in Multiplicative Number Theory.
Berlin: Springer, 1971. [LNM 227 / QA3.L28 #227]
[Walfisz 1963] Walfisz, A.: Weylsche Exponentialsummen in der neueren Zahlentheorie.
Berlin: Deutscher Verlag der Wissenschaften, 1963. [AB 9.63.5 / Sci 885.110(15,16)]
¹ This classical bound has been improved; the current record of $1-\sigma \gg_\epsilon \log^{-2/3-\epsilon}|t|$, due to
Korobov and perhaps Vinogradov, has stood for 40 years. See [Walfisz 1963] or [Montgomery 1971,
Chapter 11].
² Any lower bound $> 1$ would do; the only reason we cannot go lower is that our bounds are
in terms of $\log|t|$, so we do not want to allow $\log|t| = 0$.
³ Note that we write $< O(\log|t|)$, not $= O(\log|t|)$, to allow the possibility of an arbitrarily large
negative multiple of $\log|t|$.
⁴ $1 + \alpha\delta$ will do for any $\alpha > 3$. This requires that $\alpha\delta \le 1$, e.g. $\delta \le 1/4$ for our choice of $\alpha = 4$, else
$\sigma > 2$; but we're only concerned with $\delta$ near zero anyway.
Proof of the Prime Number Theorem
to $\psi(x)$. Assume that $T$ does not coincide with the imaginary part of any $\rho$.
Shifting the line of integration leftwards, say to real part $-1$, yields
\[ \psi(x) - \Bigl( x - \sum_{|\operatorname{Im}(\rho)|<T} \frac{x^\rho}{\rho} \Bigr) = I_1 + I_2 - \frac{\zeta'}{\zeta}(0) + O\Bigl(\frac{x\log^2 x}{T}\Bigr), \qquad (\Psi) \]
in which $I_1, I_2$ are the integrals of $-(\zeta'(s)/\zeta(s))\,x^s\,ds/s$ over the vertical line
$\sigma = -1$, $|t| < T$ and the horizontal lines $\sigma \in [-1, 1+1/\log x]$, $t = \pm T$ respectively.
We next show that $I_1, I_2$ are small. The vertical integral $I_1$ is clearly
\[ \ll \frac{\log T}{x}\,\sup_{|t|<T}\Bigl|\frac{\zeta'}{\zeta}(-1+it)\Bigr| \ll \frac{\log^2 T}{x}. \]
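The horizontal integrals in $I_2$ are
\[ \ll \frac{1}{T}\int_{-1}^{1+1/\log x} x^\sigma\,d\sigma \cdot \sup_{\sigma\in[-1,2]}\Bigl|\frac{\zeta'}{\zeta}(\sigma+iT)\Bigr|, \]
the factor $1/T$ coming from $|1/s| \le 1/T$ on those lines.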
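Since
\[ \sum_{|\operatorname{Im}(\rho)|<T} \frac{1}{|\rho|} < \sum_{|\operatorname{Im}(\rho)|<T} \frac{1}{|\operatorname{Im}\rho|}
 = 2\int_1^T \frac{dN(t)}{t} = \frac{2N(T)}{T} + 2\int_1^T \frac{N(t)\,dt}{t^2} \ll \int_1^T \frac{\log t\,dt}{t} \ll \log^2 T, \]
we thus have
\[ \sum_{|\operatorname{Im}(\rho)|<T} \frac{x^\rho}{\rho} \ll x\log^2 T\,\exp\Bigl(-c\,\frac{\log x}{\log T}\Bigr). \]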
Therefore
\[ \frac{\psi(x)}{x} - 1 \ll \Bigl(\frac{1}{T} + \exp\Bigl(-c\,\frac{\log x}{\log T}\Bigr)\Bigr)\log^2 x. \]
We choose $T$ so that the logarithms $-\log T$, $-\log x/\log T$ of the two terms are
equal (up to the constant $c$). That is, we take $T = \exp\sqrt{\log x}$. We then absorb the $\log^2 x$ factor into
the resulting estimate $\exp(-C\log^{1/2}x + O(1))$ by slightly decreasing $C$, and at
last obtain the Prime Number Theorem with error estimate: there exists $C > 0$
such that
\[ \psi(x) = x + O\bigl(x\exp(-C\sqrt{\log x})\bigr). \]
The equivalent result for $\pi(x)$ follows by partial summation:
\[ \pi(x) = \operatorname{li}(x) + O\bigl(x\exp(-C\sqrt{\log x})\bigr). \]
(Recall that $\operatorname{li}(x) = \int_2^x dy/\log y = x/\log x + O(x/\log^2 x)$.)
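To get a feel for how close $\psi(x)$ is to $x$ in practice, one can compute $\psi$ directly from its definition as a sum of $\log p$ over prime powers. The following is a minimal sketch with a plain sieve; the function name is illustrative, and it sums over prime powers $\le x$ (the distinction between $<$ and $\le$ does not matter for the asymptotics).

```python
import math

def chebyshev_psi(x):
    """psi(x): sum of log p over all prime powers p^k <= x, via a simple sieve."""
    n = int(x)
    if n < 2:
        return 0.0
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # cross out multiples of p starting at p^2
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    total = 0.0
    for p in range(2, n + 1):
        if sieve[p]:
            # count the powers p, p^2, ... that do not exceed n
            k, pk = 0, p
            while pk <= n:
                k += 1
                pk *= p
            total += k * math.log(p)
    return total
```

For example, `chebyshev_psi(10)` equals $\log(2^3\cdot 3^2\cdot 5\cdot 7) = \log 2520$, and at $x = 10^5$ the difference $\psi(x) - x$ is tiny compared with $x$.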
Consequences of the Riemann Hypothesis (RH). Suppose now that RH
holds. Then we may take $T = x$ in $(\Psi)$ to find $\psi(x) = x + O(x^{1/2}\log^2 x)$.
More generally if it is known for some $\theta \in [\frac12, 1)$ that $\operatorname{Re}\rho \le \theta$ for all zeros $\rho$
of $\zeta$ then the same argument yields $\psi(x) = x + O(x^\theta\log^2 x)$. The equivalent
result for $\pi(x)$ is $\pi(x) = \operatorname{li}(x) + O(x^\theta\log x)$. Note that since necessarily $\theta \ge 1/2$
the $O(x^{1/2})$ difference between $\psi(x)$ and $\sum_{p<x}\log p$ is absorbed into the error
estimates.
But note that, under RH, that difference, which is $\psi(x) - \sum_{p<x}\log p \sim \psi(x^{1/2}) \sim x^{1/2}$,
is exactly of the same asymptotic order as the terms $x^\rho/\rho$ of $(\Psi)$, and much larger
than each single term because $|\rho|^{-1} < 1/14$. Thus one expects $x$ to exceed $\sum_{p<x}\log p$ more
often than not. Since $\sum_\rho 1/|\rho|$ diverges, it is possible for $-\sum_\rho x^\rho/\rho$ to exceed $x^{1/2}$,
but not often; indeed it was thought that $\sum_{p<x}\log p$ might always be $< x$, and thus
that $\pi(x) < \operatorname{li}(x)$, but Littlewood showed that the difference changes sign infinitely
often in both cases. The first such sign change has yet to be found, though. The first
explicit bound was the (in)famously astronomical "Skewes' number" [Skewes 1933];
that bound has since fallen, but still stands at several hundred digits, too large to
reach directly even with the best algorithms known for computing $\pi(x)$, algorithms
that also depend on the analytical formulas such as (*); see [LO 1982].
A converse implication also holds: if for some $\theta \in [\frac12, 1)$ and all $\epsilon > 0$ we have
$\psi(x) = x + O_\epsilon(x^{\theta+\epsilon})$, or equivalently $\pi(x) = \operatorname{li}(x) + O_\epsilon(x^{\theta+\epsilon})$, then $\zeta(s)$ has no
zeros of real part $> \theta$. So, for instance, RH is equivalent to the assertion that
$\pi(x) = \operatorname{li} x + O(x^{1/2}\log x)$. To see this, write $-\zeta'(s)/\zeta(s) = \sum_n \Lambda(n)n^{-s}$ as a
Stieltjes integral and integrate by parts to find
\[ -\frac{\zeta'}{\zeta}(s) = s\int_1^\infty \psi(x)\,x^{-s-1}\,dx = \frac{s}{s-1} + s\int_1^\infty (\psi(x) - x)\,x^{-s-1}\,dx \qquad (\sigma > 1). \]
If $\psi(x) - x \ll_\epsilon x^{\theta+\epsilon}$ then the resulting integral for $s/(s-1) + \zeta'(s)/\zeta(s)$ extends
to an analytic function on $\sigma > \theta$, whence that half-plane contains no zeros
of $\zeta(s)$. Note the amusing consequence that an estimate $\psi(x) = x + O_\epsilon(x^{\theta+\epsilon})$
would automatically improve to $\psi(x) = x + O(x^\theta\log^2 x)$, and similarly for $\pi(x)$.
Exercises.
$L(s,\chi)$ as an entire function; Gauss sums
We first give, as promised, the analytic proof of the nonvanishing of $L(1,\chi)$ for a
Dirichlet character $\chi$ mod $q$; this will complete our proof of Dirichlet's theorem
that there are infinitely many primes in the arithmetic progression $\{qm + a : m \in \mathbf{Z}_{>0}\}$
whenever $(a,q) = 1$, and that the logarithmic density of such primes is $1/\varphi(q)$.¹
We follow [Serre 1973, Ch. VI §2]. Functions such as $\zeta(s)$, $L(s,\chi)$ and their
products are special cases of what Serre calls "Dirichlet series": functions
\[ f(s) := \sum_{n=1}^\infty a_n e^{-\lambda_n s} \qquad (D) \]
with $a_n \in \mathbf{C}$ and $0 \le \lambda_n < \lambda_{n+1} \to \infty$. [For instance $L(s,\chi)$ has $\lambda_n = \log n$
and $a_n = \chi(n)$.] We assume that the $a_n$ are small enough that the sum in (D)
converges for some $s \in \mathbf{C}$. We are particularly interested in cases such as
\[ \zeta_q(s) := \prod_{\chi \bmod q} L(s,\chi) \]
whose coefficients $a_n$ are nonnegative. Then if (D) converges at some real $\sigma_0$,
it converges uniformly on $\sigma \ge \sigma_0$, and $f(s)$ is analytic on $\sigma > \sigma_0$. Thus a series
(D) has a maximal open half-plane of convergence (if we agree to regard $\mathbf{C}$ itself
as an open half-plane for this purpose), namely $\sigma > \sigma_0$ where $\sigma_0$ is the infimum
of the real parts of $s \in \mathbf{C}$ at which (D) converges. This $\sigma_0$ is then called the
"abscissa of convergence" of (D).
We claim that if $\sigma_0$ is finite then it is a singularity of $f$; equivalently, that if
for some $\rho \in \mathbf{R}$ the series (D) converges in $\sigma > \rho$ and extends to an analytic
function in a neighborhood of $\rho$ then $\sigma_0 < \rho$. Since $f(s+\rho)$ is again of the form
(D) with nonnegative coefficients it is enough to prove this claim for $\rho = 0$.
Since $f$ is analytic in $\sigma > 0$ and also in $|s| < \delta$ for some $\delta > 0$, it is analytic in
$|s-1| \le 1+\epsilon$ for sufficiently small $\epsilon$, specifically any $\epsilon < \sqrt{1+\delta^2} - 1$. Expand $f$
in a Taylor series about $s = 1$. Since (D) converges uniformly in a neighborhood
of that point, it may be differentiated termwise, and we find that its $p$-th derivative there
is
\[ f^{(p)}(1) = \sum_{n=1}^\infty (-\lambda_n)^p a_n e^{-\lambda_n}. \]
Thus taking $s = -\epsilon$ we obtain the convergent sum
\[ f(-\epsilon) = \sum_{p=0}^\infty \frac{(-1-\epsilon)^p}{p!} f^{(p)}(1) = \sum_{p=0}^\infty \frac{(1+\epsilon)^p}{p!} \Bigl[ \sum_{n=1}^\infty (+\lambda_n)^p a_n e^{-\lambda_n} \Bigr]. \]
But since all the terms in the inner sum are nonnegative, the sum converges
absolutely and may be summed in reverse order, yielding
\[ f(-\epsilon) = \sum_{n=1}^\infty a_n \Bigl[ \sum_{p=0}^\infty \frac{(1+\epsilon)^p \lambda_n^p}{p!} \Bigr] e^{-\lambda_n}. \]
But the new inner sum is just a Taylor series for $e^{(1+\epsilon)\lambda_n}$. So we have shown that
(D) converges at $s = -\epsilon$, and thus that $\sigma_0 \le -\epsilon < 0 = \rho$, as claimed.
¹ Davenport gives a simpler proof of Dirichlet's theorem, also involving L-functions but
not yet obtaining even the logarithmic density, in Chapter 4, attributing the basic idea to
Landau 1905.
We can now prove:
Theorem. Let $\chi$ be a nontrivial character mod $q$. Then $L(1,\chi) \ne 0$.
Proof: We know already that $L(s,\chi)$ extends to a function on $\sigma > 0$ analytic
except for the simple pole of $L(s,\chi_0)$ at $s = 1$. If any $L(s,\chi)$ vanished at $s = 1$
then
\[ \zeta_q(s) := \prod_{\chi \bmod q} L(s,\chi) \]
would extend to an analytic function on $\sigma > 0$. But we observed already that
$\zeta_q(s)$ is a Dirichlet series $\sum_n a_n n^{-s}$ converging at least in $\sigma > 1$ with $a_n \ge 0$
for all $n$, and thus converging in $\sigma > 0$. But we also have $a_n \ge 1$ if $n = k^{\varphi(q)}$
for some $k$ coprime to $q$. Thus $\sum_n a_n n^{-\sigma}$ diverges for $\sigma \le 1/\varphi(q)$. This is a
contradiction, and we are done.
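The nonvanishing can be seen concretely in the two smallest cases. The sketch below (helper names are illustrative) sums the series for the nontrivial characters mod 4 and mod 3, whose values at $s = 1$ are the classical $\pi/4$ and $\pi/(3\sqrt3)$; the code represents a character by its list of values on $0, 1, \ldots, q-1$.

```python
import math

def L_partial_sum(chi_values, s, N):
    """Partial sum of chi(n) n^{-s} for n = 1..N, where chi(n) = chi_values[n % q]."""
    q = len(chi_values)
    return sum(chi_values[n % q] / n ** s for n in range(1, N + 1))

CHI_MOD_4 = [0, 1, 0, -1]   # the nontrivial character mod 4
CHI_MOD_3 = [0, 1, -1]      # the nontrivial character mod 3
```

With `N = 10**5` the partial sums agree with the closed forms to about four decimal places; convergence at $s = 1$ is only conditional, so the error decays like $1/N$.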
So we have Dirichlet's theorem; but we want more than logarithmic density:
we're after asymptotics of $\pi(x, a \bmod q)$, or equivalently of $\pi(x,\chi)$. As with the
Prime Number Theorem, it will be enough to estimate
\[ \psi(x,\chi) := \sum_{n<x} \chi(n)\Lambda(n), \]
for which we have an integral approximation
\[ \psi(x,\chi) = \frac{1}{2\pi i}\int_{1+\frac{1}{\log x}-iT}^{1+\frac{1}{\log x}+iT} -\frac{L'}{L}(s,\chi)\,x^s\,\frac{ds}{s} + O\Bigl(\frac{x\log^2 x}{T}\Bigr) \qquad (T \in [1,x]). \]
So we proceed to develop a partial-fraction decomposition for $L'/L$, which in
turn requires us to prove an analytic continuation and functional equation for
$L(s,\chi)$.
Our key tool in proving the functional equation for $\zeta(s)$ was the Poisson summation
formula, which we got from the Fourier series of
\[ F(x) := \sum_{m=-\infty}^{\infty} f(x+m) \]
by setting $x = 0$. We now need this Fourier series
\[ F(x) = \sum_{n=-\infty}^{\infty} \hat f(n)\,e^{-2\pi i n x} \]
for fractional $x$. (As before, $\hat f(y)$ is the normalization
\[ \hat f(y) = \int_{-\infty}^{+\infty} e^{2\pi i x y} f(x)\,dx \qquad (F) \]
of the Fourier transform of $f$.) Let $\chi$ be a character mod $q$; evaluate $F(x)$ at
$x = a/q$, multiply by $\chi(a)$, and sum over $a$ mod $q$ to obtain
\[ \sum_{m=-\infty}^{\infty} \chi(m) f(m/q) = \sum_{n=-\infty}^{\infty} \Bigl[ \sum_{a \bmod q} \chi(a)\,e^{-2\pi i n a/q} \Bigr] \hat f(n). \qquad (P'_\chi) \]
Consider the inner sum
\[ \tau_{-n}(\chi) := \sum_{a \bmod q} \chi(a)\,e^{-2\pi i n a/q}. \]
Assume henceforth that $\chi$ is primitive. We then claim:
\[ \tau_n(\chi) = \bar\chi(n)\,\tau_1(\chi). \qquad (\tau) \]
Indeed, if $(n,q) = 1$ then we may replace $a$ by $n^{-1}a$, from which $\tau_n(\chi) =
\bar\chi(n)\tau_1(\chi)$ follows; note that this part did not require primitivity. If $(n,q) > 1$
then $\bar\chi(n) = 0$, so we want to show $\tau_n(\chi) = 0$. Let $d = (n,q)$ and $q_0 = q/d$, and
rearrange the $\tau_n(\chi)$ sum according to $a$ mod $q_0$:
\[ \tau_n(\chi) = \sum_{a \bmod q} \chi(a)\,e^{2\pi i n a/q} = \sum_{a_0 \bmod q_0} e^{2\pi i n a_0/q} \Bigl[ \sum_{\substack{a \bmod q \\ a \equiv a_0 \bmod q_0}} \chi(a) \Bigr]. \]
We claim that the inner sum vanishes. This is clear unless $(a_0, q_0) = 1$. In that
case the inner sum is
\[ \chi(a_1) \sum_{\substack{a \bmod q \\ a \equiv 1 \bmod q_0}} \chi(a), \]
for any $a_1 \equiv a_0 \bmod q_0$. But this last sum is the sum of a character on the group
of units mod $q$ congruent to 1 mod $q_0$, and so vanishes unless that character is
trivial; and if $\chi(a) = 1$ whenever $a \equiv 1 \bmod q_0$ then $\chi$ comes from a character
mod $q_0$ (why?) and is not primitive! This proves $(\tau)$. We generally abbreviate
$\tau_1(\chi)$ as $\tau(\chi)$, and call that number
\[ \tau(\chi) := \sum_{a \bmod q} \chi(a)\,e^{2\pi i a/q} \]
the Gauss sum of the character $\chi$. Then $(P'_\chi)$ becomes
\[ \sum_{m=-\infty}^{\infty} \chi(m) f(m/q) = \tau(\chi) \sum_{n=-\infty}^{\infty} \bar\chi(-n)\,\hat f(n). \qquad (P_\chi) \]
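Assuming that $f$ is a function such that both $f, \hat f$ satisfy the Poisson hypotheses,
we may apply this twice to find that $(\tau(\chi)\tau(\bar\chi) - \chi(-1)q) \sum_{m\in\mathbf{Z}} \chi(m) f(m/q) =
0$. The sum does not vanish for every suitable $f$ (e.g. $f(x) = \exp(-C(x-1/q)^2)$
for large $C$), so $\tau(\chi)\tau(\bar\chi) = \chi(-1)q$. Moreover
\[ \tau(\bar\chi) := \sum_{a \bmod q} \bar\chi(a)\,e^{2\pi i a/q} = \overline{\sum_{a \bmod q} \chi(a)\,e^{-2\pi i a/q}} = \chi(-1)\,\overline{\tau(\chi)}; \]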
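Note that since $\chi(0) = 0$ once $q > 1$ it follows that $\theta_\chi(u)$ is rapidly decreasing
as $u \to 0+$. Now we have as before
\[ 2\pi^{-s/2}\,\Gamma(s/2)\,L(s,\chi) = \int_0^\infty \theta_\chi(u)\,u^{s/2}\,\frac{du}{u}. \]
This gives the analytic continuation of $L(s,\chi)$ to an entire function with zeros
at the poles $s = 0, -2, -4, -6, \ldots$ of $\Gamma(s/2)$. Moreover by $(\theta_\chi)$ the integral is
also
\[ \frac{\tau(\chi)}{q}\int_0^\infty \theta_{\bar\chi}(1/q^2u)\,u^{(s-1)/2}\,\frac{du}{u}
 = \frac{\tau(\chi)}{q}\int_0^\infty \theta_{\bar\chi}(u)\,(q^2u)^{(1-s)/2}\,\frac{du}{u}
 = \frac{\tau(\chi)}{q^s}\int_0^\infty \theta_{\bar\chi}(u)\,u^{(1-s)/2}\,\frac{du}{u}. \]
This last expression is $2\tau(\chi)\,q^{-s}\,\Gamma(\frac12(1-s))\,\pi^{(s-1)/2}\,L(1-s,\bar\chi)$ for $\sigma \in [0,1]$, and
thus by analytic continuation for all $s \in \mathbf{C}$. We can write the functional equation
symmetrically by setting
\[ \xi(s,\chi) := (\pi/q)^{-s/2}\,\Gamma(s/2)\,L(s,\chi), \]
which is now an entire function: $\xi(s,\chi)$ is related with $\xi(s,\bar\chi)$ by
\[ \xi(s,\chi) = \frac{\tau(\chi)}{\sqrt q}\,\xi(1-s,\bar\chi). \qquad (L_+) \]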
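The theta identity behind $(L_+)$ can be spot-checked numerically. The sketch below assumes the normalization $\theta_\chi(u) = \sum_{n\in\mathbf{Z}} \chi(n)e^{-\pi n^2 u}$ (the definition consistent with the integral formula above) and the resulting relation $\theta_\chi(u) = \tau(\chi)\,q^{-1}u^{-1/2}\,\theta_{\bar\chi}(1/q^2u)$, and tests it for the Legendre symbol mod 5, which is real, even and primitive with Gauss sum $\sqrt5$. All names in the code are illustrative.

```python
import math

Q = 5
CHI = [0, 1, -1, -1, 1]   # Legendre symbol mod 5: real, even, primitive
TAU = math.sqrt(5)        # its Gauss sum

def theta(u, terms=200):
    """theta_chi(u) = sum over n in Z of chi(n) exp(-pi n^2 u), truncated to |n| <= terms."""
    return sum(CHI[n % Q] * math.exp(-math.pi * n * n * u)
               for n in range(-terms, terms + 1))

def theta_dual(u, terms=200):
    """Right-hand side tau(chi) / (q sqrt(u)) * theta_chi(1/(q^2 u)).

    chi is real here, so chi-bar equals chi and the same table is reused.
    """
    return TAU / (Q * math.sqrt(u)) * theta(1.0 / (Q * Q * u), terms)
```

At $u = 1/q = 1/5$ the transformation $u \mapsto 1/q^2u$ fixes $u$ and the prefactor equals 1, which is a convenient consistency check on the normalization.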
What about odd $\chi$? The same definition of $\theta_\chi$ would yield zero. We already
indicated (in the exercises on the functional equation for $\zeta$ and $\xi$) the way
around this problem: we apply the $\chi$-twisted Poisson summation formula $(P_\chi)$
not to the Gaussian $e^{-\pi u(qx)^2}$ but to its derivative, which is proportional to
$x e^{-\pi u(qx)^2}$. Using the general fact that the Fourier transform of $f'$ is $-2\pi i y \hat f(y)$
(integrate by parts in the definition (F) of $\widehat{f'}$) we see that the Fourier transform
of $x e^{-\pi u(qx)^2}$ is $(iy/(u^{1/2}q)^3)\,e^{-\pi u^{-1}(y/q)^2}$. So, if we define⁴
\[ \vartheta_\chi(u) := \sum_{n=-\infty}^{\infty} n\,\chi(n)\,e^{-\pi n^2 u}, \]
we obtain
\[ \vartheta_\chi(u) = \frac{\tau(\chi)}{i\,q^2 u^{3/2}}\,\vartheta_{\bar\chi}(1/q^2u). \qquad (\vartheta_\chi) \]
This time we must multiply $\vartheta_\chi$ by $u^{(s+1)/2}\,du/u$ to cancel the extra factor of $n$,
and so obtain the integral formula
\[ 2\pi^{-(s+1)/2}\,\Gamma((s+1)/2)\,L(s,\chi) = \int_0^\infty \vartheta_\chi(u)\,u^{(s+1)/2}\,\frac{du}{u} \]
for $L(s,\chi)$. Again $(\vartheta_\chi)$ together with $\chi(0) = 0$ tells us that $\vartheta_\chi(u)$ vanishes
rapidly as $u \to 0+$, and thus that our integral extends to an entire function of $s$;
note however that the resulting trivial zeros of $L(s,\chi)$ are at the negative odd
integers. The functional equation $(\vartheta_\chi)$ again gives us a relation between $L(s,\chi)$
and $L(1-s,\bar\chi)$, which this time has the symmetrical form
\[ \xi(s,\chi) = \frac{\tau(\chi)}{i\sqrt q}\,\xi(1-s,\bar\chi), \qquad (L_-) \]
with
\[ \xi(s,\chi) := (\pi/q)^{-(s+1)/2}\,\Gamma((s+1)/2)\,L(s,\chi). \]
[Exercise: Complete the missing steps in the proof of $(L_-)$.]
⁴ Our $\vartheta_\chi(u)$ is Davenport's $\psi_1(qu, \chi)$.
We may combine $(L_+)$ with $(L_-)$ by introducing an integer $a$ depending on $\chi$:
\[ a := \begin{cases} 0, & \text{if } \chi(-1) = +1; \\ 1, & \text{if } \chi(-1) = -1. \end{cases} \]
That is, $a = 0$ or $1$ according as $\chi$ is even or odd. Then
To sum the terms for fixed nonzero $b$, let $a = cb$ and $a' = (1-c)b$ to find
\[ e^{2\pi i b/q}\,\chi\chi'(b) \sum_{c \bmod q} \chi(c)\chi'(1-c) = e^{2\pi i b/q}\,\chi\chi'(b)\,J(\chi,\chi'). \]
Thus if $\chi' = \bar\chi$ we have
\[ \tau(\chi)\tau(\bar\chi) = \chi(-1)(q-1) - J(\chi,\bar\chi) \]
and thus $J(\chi,\bar\chi) = -\chi(-1)$, a fact that can also be obtained directly from
$\chi(c)\bar\chi(1-c) = \bar\chi(c^{-1}-1)$. [This in turn yields an alternative proof of $|\tau(\chi)| = \sqrt q$
in the prime case.] Otherwise we find
\[ J(\chi,\chi') = \frac{\tau(\chi)\tau(\chi')}{\tau(\chi\chi')}. \qquad (J) \]
In particular it follows that $|J(\chi,\chi')| = \sqrt q$ if each of $\chi, \chi', \chi\chi'$ is nontrivial.
The formula (J) is the beginning of a long and intricate chapter of the arithmetic
of cyclotomic number fields; it can also be used to count points on Fermat
curves mod $q$, showing for instance that if $q \equiv 1 \bmod 3$ then the number of
$c \ne 0, 1$ in $\mathbf{Z}/q$ such that both $c$ and $1-c$ are cubes is $q/9 + O(\sqrt q)$.
What of our promised Poisson-free proof of $|\tau(\chi)| = \sqrt q$? Well, our formula
for $\tau_n(\chi)$ states in effect that $\tau(\chi)\bar\chi$ is the discrete Fourier transform of $\chi$. It
follows from Parseval that
\[ \sum_{a \bmod q} |\tau(\chi)\bar\chi(a)|^2 = q \sum_{a \bmod q} |\chi(a)|^2. \]
But the LHS is $|\tau(\chi)|^2\varphi(q)$, and the RHS is $q\,\varphi(q)$, so we're done.
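The Gauss and Jacobi sum identities above are easy to test by brute force for a small prime modulus, where every nontrivial character is primitive. The following sketch builds the characters mod an odd prime $q$ from a primitive root; the helper names and the indexing of characters by an exponent $j$ are conventions introduced here, not notation from the notes.

```python
import cmath

def primitive_root(q):
    """Brute-force search for a generator of (Z/q)^*; q is assumed to be an odd prime."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("no primitive root found")

def character(q, j):
    """Table of the character with chi(g^k) = exp(2 pi i j k/(q-1)) and chi(0) = 0."""
    g = primitive_root(q)
    chi = [0j] * q
    for k in range(q - 1):
        chi[pow(g, k, q)] = cmath.exp(2j * cmath.pi * j * k / (q - 1))
    return chi

def gauss_sum(chi, n=1):
    """tau_n(chi) = sum over a mod q of chi(a) exp(2 pi i n a/q)."""
    q = len(chi)
    return sum(chi[a] * cmath.exp(2j * cmath.pi * n * a / q) for a in range(q))

def jacobi_sum(chi1, chi2):
    """J(chi1, chi2) = sum over c mod q of chi1(c) chi2(1 - c)."""
    q = len(chi1)
    return sum(chi1[c] * chi2[(1 - c) % q] for c in range(q))
```

For $q = 7$ one can then confirm $|\tau(\chi)|^2 = 7$, the relation $\tau_n(\chi) = \bar\chi(n)\tau(\chi)$ for every $n$ mod 7, formula (J), and $J(\chi,\bar\chi) = -\chi(-1)$.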
Further Exercises:
1. Consider a series (D) in which $a_n$ need not be positive reals. Of course this
series still has an abscissa of absolute convergence. Less obvious, but still true,
is that it also has an abscissa of ordinary convergence. Show that if the sum
(D) converges in the usual sense of $\lim_{N\to\infty}\sum_1^N$ at some $s_0$ then it converges
also in $\sigma > \operatorname{Re}(s_0)$, the convergence being uniform in $|\arg(s-s_0)| \le \alpha$ for each
$\alpha < \pi/2$. Deduce that (D) defines an analytic function on $\sigma > \operatorname{Re}(s_0)$. (Since
$f(s+s_0)$ is again of the form (D) it is enough to prove this claim for $s_0 = 0$.
Assume then that $\sum_{n=1}^\infty a_n$ converges, and let $A(x) = -\sum_{\lambda_n>x} a_n \to 0$ as $x \to \infty$;
for large $M, N$ write
\[ \sum_{n=M}^{N} a_n e^{-\lambda_n s} = \int_{\lambda_M}^{\lambda_N} e^{-\lambda s}\,dA(\lambda), \]
etc. This is equivalent to the route taken by Serre, but probably more transparent
to us.)
2. Suppose $\chi$ is a real character. Then (L) relates $\xi(s,\chi)$ with $\xi(1-s,\chi)$. Deduce
that $L(s,\chi)$ has a zero of even or odd multiplicity at $s = 1/2$ according as
$\tau(\chi) = +i^a\sqrt q$ or $\tau(\chi) = -i^a\sqrt q$. In particular, in the minus case $L(1/2,\chi) = 0$.
[But it is known that in fact the minus case never occurs, a fact first proved by
Gauss after much work. (Davenport proves this in the special case of prime $q$ in
Chapter 2, using a later method of Dirichlet that relies on Poisson summation.)
It follows that the order of vanishing of $L(s,\chi)$ at $s = 1/2$ is even; it is conjectured,
but not proved, that in fact $L(1/2,\chi) > 0$ for all such Dirichlet characters $\chi$.
More complicated number fields are known whose zeta functions do vanish at
$s = 1/2$.]
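3. Obtain a formula for the generalized Jacobi sum
\[ J(\chi_1, \ldots, \chi_n) := \sum \cdots \sum \chi_1(a_1) \cdots \chi_n(a_n), \]
the sum extending over $a_i \bmod q$ with $a_1 + \cdots + a_n = 1$ (as in the case $n = 2$),
under suitable hypotheses on the $\chi_i$. What is the analogous formula for definite
integrals?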
4. Let $\chi$ be the Legendre symbol modulo an odd prime $q$. Evaluate $\tau(\chi)^n$ in
two ways to count the number of solutions mod $q$ of $x_1^2 + \cdots + x_n^2 = 1$.
5. Can you find a $\tau$ analog of the duplication formula for the Gamma function?
References
[Davenport 1967] Davenport, H.: Multiplicative Number Theory. Chicago:
Markham, 1967; New York: Springer-Verlag, 1980 (GTM 74). [9.67.6 & 9.80.6
/ QA 241.D32]
[Serre 1973] Serre, J.-P.: A Course in Arithmetic. New York: Springer, 1973
(GTM 7). [AB 9.70.4 (reserve case) / QA243.S4713]
The asymptotic formula for primes in arithmetic progressions
Now that we have the functional equation for $L(s,\chi)$, the asymptotics for
$\psi(x,\chi)$, and thus also for $\psi(x, a \bmod q)$ and $\pi(x, a \bmod q)$, follow just as they
did for $\psi(x)$ and $\pi(x)$, at least if we are not very concerned with how the
implied constants depend on $q$. Again all this is found in [Davenport 1967].
Let $\chi$ be a primitive character mod $q > 1$. We readily adapt our argument
showing that $(s^2-s)\xi(s)$ is an entire function of order 1 to show that $\xi(s,\chi)$ is
an entire function of order 1, and thus has a Hadamard product
\[ \xi(s,\chi) = \xi(0,\chi)\,e^{Bs} \prod_\rho (1 - s/\rho)\,e^{s/\rho}, \qquad (\xi_\chi) \]
the product ranging over the zeros $\rho$ of $\xi(s,\chi)$ counted with multiplicity, which
are just the zeros of $L(s,\chi)$ with $\sigma \in [0,1]$. Thus
\[ \frac{\xi'}{\xi}(s,\chi) = B + \sum_\rho \Bigl[ \frac{1}{s-\rho} + \frac{1}{\rho} \Bigr]. \qquad (\xi'_\chi) \]
How are these zeros distributed? We noted already that their real parts lie in
$[0,1]$. If $L(\rho,\chi) = 0$ then by the functional equation $0 = L(1-\rho,\bar\chi) = \overline{L(1-\bar\rho,\chi)}$.
Thus the zeros are symmetrical about the line $\sigma = 1/2$, but not (unless $\chi$ is
real) about the real axis. So the proper analog of $N(T)$ is half of $N(T,\chi)$, where
$N(T,\chi)$ is defined as the number of zeros of $L(s,\chi)$ in $\sigma \in (0,1)$, $|t| < T$, counted
with multiplicity. [NB this excludes the trivial zero at $s = 0$ which occurs for
even $\chi$.] Again we evaluate this by integrating $\xi'/\xi$ around a rectangle, finding
that for (say) $T \ge 2$
\[ \frac12 N(T,\chi) = \frac{T}{2\pi}\log\frac{qT}{2\pi} - \frac{T}{2\pi} + O(\log qT). \qquad (N_\chi) \]
[Here the extra term $(T/2\pi)\log q$ enters via the new factor $q^{s/2}$ in $\xi(s,\chi)$. That
factor is also responsible for the new term $-\frac12\log q$ in $(L'_\chi)$ and thus forces us to
subtract $O(\log q)$ from our lower bound on the real part of $(L'/L)(s,\chi)$, finding
that
\[ \frac{L'}{L}(s,\chi) = \sum_{|\operatorname{Im}(s-\rho)|<1} \frac{1}{s-\rho} + O(\log|qt|) \]
($\sigma \in [-1,2]$), the sum comprising $O(\log|qt|)$ terms, with $\log|qt|$ in place of
$\log|t|$; that is the source of the error $O(\log qT)$ instead of $O(\log T)$ in $(N_\chi)$.]
To isolate the primes in arithmetic progressions mod $q$, we need also nonprimitive
characters, such as $\chi_0$. Let $\chi_1$ be the primitive character mod $q_1 \mid q$ underlying
a nonprimitive $\chi$ mod $q$. Then
\[ L(s,\chi) = \prod_{p \mid q} \bigl(1 - \chi_1(p)\,p^{-s}\bigr) \cdot L(s,\chi_1). \]
The elementary factor $\prod_{p\mid q}$ has, for each $p$ dividing $q$ but not $q_1$, $(T/\pi)\log p +
O(1)$ purely imaginary zeros of absolute value $< T$. It follows that the RHS of
$(N_\chi)$ is an upper bound on $\frac12 N(T,\chi)$ even when $\chi$ is nonprimitive.
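The count $(T/\pi)\log p + O(1)$ comes from the fact that the zeros of a single factor $1 - \chi_1(p)p^{-s}$ form an arithmetic progression on the imaginary axis with spacing $2\pi/\log p$. The sketch below illustrates this in the simplest case $\chi_1(p) = 1$ (an assumption made only to fix the position of the progression; a different unit value of $\chi_1(p)$ merely shifts it). Function names are hypothetical.

```python
import math

def euler_factor_zero_count(p, T):
    """Number of zeros s = 2 pi i k / log p of 1 - p^{-s} with |Im s| < T."""
    spacing = 2 * math.pi / math.log(p)
    k_max = math.ceil(T / spacing) - 1   # largest integer k with k * spacing < T
    return 2 * k_max + 1                 # k = -k_max, ..., k_max (k = 0 is s = 0)

def predicted_count(p, T):
    """The main term (T/pi) log p from the text."""
    return T * math.log(p) / math.pi
```

The two functions always differ by at most 1, which is the $O(1)$ in the text.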
The horizontal distribution of $\rho$ is subtler. We noted already that the logarithmic
derivative of
\[ \zeta_q(s) := \prod_{\chi \bmod q} L(s,\chi) \]
is a Dirichlet series $-\sum_n \Lambda_q(n)\,n^{-s}$ with $\Lambda_q(n) \ge 0$ for all $n$, and thus that
the $3 + 4\cos\theta + \cos 2\theta$ trick shows that $\zeta_q$, and thus each factor $L(s,\chi)$, does
not vanish at $s = 1+it$. We can then adapt the proof of the classical zero-free
region for $\zeta(s)$; since, however, $\zeta_q(s)$ is the product of $\varphi(q)$ L-series, each
of which contributes $O(\log|qt|)$ to the bound on $(\zeta_q'/\zeta_q)(\sigma+it)$, the resulting
zero-free region is not $1-\sigma < c/\log|t|$ or even $1-\sigma < c/\log|qt|$ but $1-\sigma <
c/(\varphi(q)\log|qt|)$. Moreover, the fact that this only holds for say $|t| > 2$ is newly
pertinent: unlike $\zeta(s)$, the L-series might have zeros of small imaginary part.
[Indeed it is known that there are Dirichlet characters whose L-series vanish
arbitrarily close to the real axis.] Still, for every $q$ there are only finitely many
zeros with $|t| \le 2$. So our formula
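\[ \psi(x,\chi) = \frac{1}{2\pi i}\int_{1+\frac{1}{\log x}-iT}^{1+\frac{1}{\log x}+iT} -\frac{L'}{L}(s,\chi)\,x^s\,\frac{ds}{s} + O\Bigl(\frac{x\log^2 x}{T}\Bigr) \qquad (T \in [1,x]) \]
yields an estimate as before, with only the difference that when $\chi \ne \chi_0$ there is
no "main term" coming from a pole at $s = 1$. We thus find
\[ \psi(x,\chi) \ll_\chi x\exp(-C_\chi\sqrt{\log x}) \qquad (\psi_\chi) \]
for some constant $C_\chi > 0$. Multiplying by $\bar\chi(a)$ and averaging over $\chi$ (including
in the average $\chi_0$, for which $\psi(x,\chi_0) = x + O(x\exp(-C\sqrt{\log x}))$ instead of $(\psi_\chi)$),
we obtain
\[ \psi(x, a \bmod q) = \frac{1}{\varphi(q)}\,x + O_q\bigl(x\exp(-C_q\sqrt{\log x})\bigr), \qquad (\psi_q) \]
and thus
\[ \pi(x, a \bmod q) = \frac{1}{\varphi(q)}\operatorname{li}(x) + O_q\bigl(x\exp(-C_q\sqrt{\log x})\bigr). \qquad (\pi_q) \]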
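The equidistribution statement can be observed directly by counting primes in each residue class. Here is a minimal sketch using a plain sieve; the function names are illustrative and the choice of modulus 4 and bound $10^5$ in the usage note is arbitrary.

```python
import math

def primes_up_to(n):
    """List of primes <= n by the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def count_primes_in_class(x, a, q):
    """pi(x, a mod q): number of primes p <= x with p congruent to a mod q."""
    return sum(1 for p in primes_up_to(x) if p % q == a % q)
```

For instance, comparing `count_primes_in_class(10**5, 1, 4)` with `count_primes_in_class(10**5, 3, 4)` shows two counts that together account for every odd prime up to $10^5$ and differ from each other by a small amount relative to their size.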
Note however that the dependence of the error terms on $q$ is unpredictable. The
zero-free region depends explicitly on $q$ (though as we shall see it need not shrink
nearly as fast as $1/(\varphi(q)\log q)$, a factor which alone would make $C_q$ proportional
to $(\varphi(q)\log q)^{-1/2}$), but it excludes the neighborhood of the real axis; it would
then seem that to specify $C_\chi$ and $C_q$ we would have to compute for each $\chi$ the
largest $\operatorname{Re}(\rho)$. There's also the matter of the contribution of the $B_\chi$'s from $(L'_\chi)$.
Consider, by comparison, the consequences of the Extended Riemann Hypothesis
(ERH), i.e. the conjecture that each nontrivial zero of an L-series associated
to a primitive Dirichlet character $\chi$ has real part 1/2. Our analysis of $\psi(x)$
under RH then carries over almost verbatim to show that $\psi(x,\chi) \ll x^{1/2}\log^2 x$
as long as $q < x$, with an absolute and effective implied constant, and thus that
\[ \psi(x, a \bmod q) = \frac{x}{\varphi(q)} + O(x^{1/2}\log^2 x), \qquad (*) \]
again with the O-constant effective and independent of $q$. It would also follow
that
\[ \pi(x, a \bmod q) = \frac{\operatorname{li}(x)}{\varphi(q)} + O(x^{1/2}\log x), \qquad (**) \]
with similar comments about the effect of the difference between $\psi(x)$ and
$\sum_{p<x}\log p$ [RS 1994].
Exercises.
1. Show that the real part of the term $B$ of $(\xi'_\chi)$ is $-\sum_\rho \operatorname{Re}(1/\rho)$. Conclude that
$\operatorname{Re}(B) < 0$. [Davenport 1973, page 85.]
2. Complete the missing steps in the proof of $(N_\chi)$.
3. Complete the missing steps in the proof of $(\psi_\chi)$, $(\psi_q)$, and $(\pi_q)$.
4. Verify that under ERH the O-constant in (*) does not depend on $q$. Obtain
an analogous estimate on the weaker assumption that $\zeta_q$ has no zeros of real part
$> \theta$ for some $\theta \in (\frac12, 1)$. Show that if for some $q$ we have $\psi(x, a \bmod q) - x/\varphi(q) \ll_\epsilon x^{\theta+\epsilon}$
for all $a \in (\mathbf{Z}/q)^*$ then all the $L(s,\chi)$ for Dirichlet characters $\chi$ mod $q$ are
nonzero on $\sigma > \theta$.
Reference
[RS 1994] Rubinstein, M., Sarnak, P.: Chebyshev's Bias. Experimental Mathematics
3 (1994) #3, 173-197. [ftp://math.princeton.edu/pub/user/miker/Chebyshev.ps]
A nearly zero-free region for $L(s,\chi)$, and Siegel's theorem
Now we see why the case $\chi^2 = \chi_0$ will give us trouble near $s = 1$: for such $\chi$ the last
term in (*) is essentially $(\zeta'/\zeta)(\sigma+2it)$, whose pole at $s = 1$ will undo us for small $t$.
Let us see how far (*) does take us. (See for instance Chapter 14 of Davenport.) The
first term is
\[ < 3\Bigl[-\frac{\zeta'}{\zeta}(\sigma)\Bigr] < \frac{3}{\sigma-1} + O(1). \]
For the remaining terms, we use the partial-fraction expansion
\[ -\frac{L'}{L}(s,\chi) = \frac12\log\frac{q}{\pi} + \frac12\frac{\Gamma'}{\Gamma}((s+a)/2) - B_\chi - \sum_\rho \Bigl[ \frac{1}{s-\rho} + \frac{1}{\rho} \Bigr]. \]
To eliminate the contributions of $B_\chi$ and $\sum_\rho 1/\rho$ we evaluate this at $s = 2$ and subtract.
Since $(L'/L)(2,\chi)$ is bounded, we get
\[ -\frac{L'}{L}(s,\chi) = \frac12\frac{\Gamma'}{\Gamma}((s+a)/2) - \sum_\rho \Bigl[ \frac{1}{s-\rho} - \frac{1}{2-\rho} \Bigr] + O(1). \]
Next take real parts. For $\rho$ of real part in $[0,1]$ we have $\operatorname{Re}(1/(2-\rho)) \asymp |2-\rho|^{-2}$.
To estimate the sum of this over all $\rho$, we may apply Jensen's theorem to $\xi(2+s,\chi)$,
finding that the number (with multiplicity) of $\rho$ at distance at most $r$ from 2 is
$O(r\log qr)$, and thus that $\sum_\rho |2-\rho|^{-2} \ll \log q$. We estimate the real part of the $\Gamma'/\Gamma$
term by Stirling as usual, and find
\[ \operatorname{Re}\Bigl[-\frac{L'}{L}(s,\chi)\Bigr] < O(\log q(|t|+2)) - \operatorname{Re}\sum_\rho \frac{1}{s-\rho} = O(\mathcal{L}) - \operatorname{Re}\sum_\rho \frac{1}{s-\rho}, \]
where we have introduced the convenient abbreviation
\[ \mathcal{L} := \log q(|t|+2). \]
Again each of the $\operatorname{Re}\frac{1}{s-\rho}$ is nonnegative, so the estimate remains true if we only include
some or none of the $\rho$.
In particular it follows that
\[ \operatorname{Re}\Bigl[-\frac{L'}{L}(\sigma+2it, \chi^2)\Bigr] < O(\mathcal{L}), \qquad (L_2) \]
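Next suppose that $\chi$ is a real character and fix some $\delta > 0$. We have a zero-free
region for $|t| > \delta/\log q$. To deal with zeros of small imaginary part, let $s = \sigma$ in (*), or,
more simply, in the analogous inequality resulting from $1 + \cos\theta \ge 0$ (i.e. positivity of
$-\zeta_K'/\zeta_K$, with $\zeta_K$ the zeta function of the quadratic number field corresponding to $\chi$),
to find
\[ \sum_{|\operatorname{Im}(\rho)|<\delta/\log q} \operatorname{Re}\bigl(1/(\sigma-\rho)\bigr) < \frac{1}{\sigma-1} + O(\log q), \]
the implied O-constant not depending on $\delta$. Note that the LHS is real since the $\rho$'s
come in complex conjugate pairs. Now $\operatorname{Re}(1/(\sigma-\rho)) = \operatorname{Re}(\sigma-\rho)/|\sigma-\rho|^2$. Choosing
$\sigma = 1 + 2\delta/\log q$ we find that $|\operatorname{Im}(\rho)| < \frac12(\sigma-1) < \frac12\operatorname{Re}(\sigma-\rho)$, and thus that
$|\sigma-\rho|^2 < \frac54\operatorname{Re}(\sigma-\rho)^2$. Therefore $\operatorname{Re}(1/(\sigma-\rho)) > \frac45/\operatorname{Re}(\sigma-\rho)$. So,
\[ \frac45 \sum_{|\operatorname{Im}(\rho)|<\delta/\log q} \frac{1}{1 - \operatorname{Re}(\rho) + \frac{2\delta}{\log q}} < \frac{\log q}{2\delta} + O(\log q). \]
Thus if $c$ is small enough we conclude that at most one $\rho$ can have real part $> 1 -
c/\log q$. Since $\rho$'s are counted with multiplicity and come in complex conjugate pairs,
it follows that this exceptional zero, if it exists, is real and simple. It is usually denoted
by $\beta$.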
In fact² such a $\beta$ may occur for at most one character mod $q$. Since $\chi$ need not be
primitive, it follows that in fact it cannot occur for different $q$ if we set the threshold
low enough: there is a constant $c > 0$ such that for any two distinct real characters
$\chi_1, \chi_2$ to (not necessarily distinct) moduli $q_1, q_2$ at most one of their L-functions has
an exceptional zero $\beta > 1 - c/\log q_1q_2$. The point is that $\chi_1\chi_2$ is also a nontrivial
Dirichlet character, so $-(L'/L)(\sigma, \chi_1\chi_2) < O(\log q_1q_2)$ for $\sigma > 1$. The sum of the
negative logarithmic derivatives of $\zeta(s)$, $L(s,\chi_1)$, $L(s,\chi_2)$, and $L(s,\chi_1\chi_2)$ is a positive
Dirichlet series
\[ \sum_{n=1}^\infty (1+\chi_1(n))(1+\chi_2(n))\,\Lambda(n)\,n^{-s}, \]
which is thus positive for real $s > 1$. Arguing as before we find that if $\beta_i$ are exceptional
zeros of $L(s,\chi_i)$ then
\[ \frac{1}{\sigma-\beta_1} + \frac{1}{\sigma-\beta_2} < \frac{1}{\sigma-1} + O(\log q_1q_2); \]
if $\beta_i > 1-\delta$ then we may take $\sigma = 1+2\delta$ to find $1/(6\delta) < O(\log q_1q_2)$, whence $1/\delta \ll \log q_1q_2$
as claimed.
In particular, given $q$ at most one real character mod $q$ has an L-series with an
exceptional zero $\beta > 1 - c/\log q$. A typical estimate on primes in arithmetic progressions
that we can now deduce by our standard methods [see e.g. Davenport, Chapter 20]
is: suppose $x > \exp(C\log^2 q)$ for some absolute constant $C > 0$; then there exists a
constant $c > 0$ depending only on $C$ such that $\psi(x, a \bmod q)$ is
\[ \begin{cases}
 \Bigl(1 - \chi(a)\dfrac{x^{\beta-1}}{\beta} + O\bigl(\exp(-c\sqrt{\log x})\bigr)\Bigr)\dfrac{x}{\varphi(q)}, & \text{if } \chi \bmod q \text{ has the exceptional zero } \beta; \\[1ex]
 \Bigl(1 + O\bigl(\exp(-c\sqrt{\log x})\bigr)\Bigr)\dfrac{x}{\varphi(q)}, & \text{if no } \chi \bmod q \text{ has an exceptional zero.}
\end{cases} \]
² Davenport attributes the next result to Landau in Göttinger Nachrichten 1918, 285-295.
Just how close can this $\beta$ come to 1? We first observe that very small $1-\beta$ imply
small $L(1,\chi)$. To see this we need an upper bound on $|L'(\sigma,\chi)|$ for $\sigma$ near 1, and such
a bound (also for complex $\chi$) is
\[ 0 \le 1-\sigma \le \frac{1}{\log q} \;\Longrightarrow\; |L'(\sigma,\chi)| \ll \log^2 q. \]
Indeed we have $-L'(\sigma,\chi) = \sum_{n=1}^\infty \chi(n)(\log n)\,n^{-\sigma} = \sum_{n=1}^{q} + \sum_{n>q}$; the first sum is
bounded termwise by $e\log n/n$ and is thus $\ll \log^2 q$; the rest can be bounded by partial
summation together with the crude estimate $|\sum_{q}^{N} \chi(n)| < q$, yielding a bound $\ll \log q$. It follows
that if $1-\beta < 1/\log q$ then $L(1,\chi) \ll (1-\beta)\log^2 q$. But the Dirichlet class number
formula for the quadratic number field corresponding to $\chi$ gives $L(1,\chi) \gg q^{-1/2}$; thus
\[ 1-\beta \gg \frac{1}{q^{1/2}\log^2 q}. \]
Note for later use that our method of proving $|L'(\sigma,\chi)| \ll \log^2 q$ in $0 \le 1-\sigma \le 1/\log q$
also yields $|L(\sigma,\chi)| \ll \log q$ in the same interval, and in particular at $\sigma = 1$.
Siegel proved³ that in fact
\[ L(1,\chi) \gg_\epsilon q^{-\epsilon} \]
for all $\epsilon > 0$. It follows that $1-\beta \gg_\epsilon q^{-\epsilon}$. However, the implied constant was, and
still is sixty years later, ineffective for every $\epsilon < 1/2$. So, for instance, we know that
the class number of an imaginary quadratic field is $\gg_\epsilon |D|^{1/2-\epsilon}$, but only with much
effort was it shown (by Stark and, independently and differently, Heegner) that the
class number exceeds 1 for all $|D| > 163$, and even an effective lower bound of $c\log|D|$
for prime $D$ was big news and remains the best effective bound known!
The problem is that again we need more than one counterexample to get a contradiction. We follow Chapter 21 of [Davenport 1967]. Let $\chi_1, \chi_2$ be different primitive real characters to moduli $q_1, q_2 > 1$, and let
\[ F(s) = \zeta(s)\,L(s,\chi_1)\,L(s,\chi_2)\,L(s,\chi_1\chi_2) \]
(the zeta function of a biquadratic number field) and
\[ \lambda = L(1,\chi_1)\,L(1,\chi_2)\,L(1,\chi_1\chi_2) = (s-1)F(s)\big|_{s=1}. \]
We shall use an estimate
\[ s \in (\theta, 1) \;\Longrightarrow\; F(s) > A - \frac{B\lambda}{1-s}\,(q_1 q_2)^{C(1-s)} \qquad (F) \]
for some universal constants $\theta < 1$ and positive $A, B, C$ (specifically $\theta = 9/10$ and $A = 1/2$, $C = 8$). Assume (F) for the time being. We shall find $\chi_1$ and $\beta \in (\theta, 1)$ such that $F(\beta) \le 0$ and conclude that
\[ \lambda > \frac{A}{B}\,(1-\beta)\,(q_1 q_2)^{-C(1-\beta)}. \qquad (*) \]
If $L(\beta_1,\chi_1) = 0$ for some real $\chi_1$ and $1-\beta_1 < \epsilon/2C$ then we use that $\chi_1$ and $\beta = \beta_1$. Otherwise we choose any $\chi_1$ and any $\beta$ with $0 < 1-\beta < \epsilon/2C$, since $\zeta(s) < 0$ for $0 < s < 1$ while the other three factors of $F(s)$ are positive for $s > \beta$. Then for any primitive $\chi_2$ mod $q_2 > q_1$ we use (*) together with $L(1,\chi) \ll \log q$ to obtain
\[ L(1,\chi_2) > c\, q_2^{-C(1-\beta)} / \log q_2, \]
with $c$ depending only (but ineffectively!) on $\epsilon$ via $\chi_1$ and $\beta$. Since $C(1-\beta) < \epsilon/2$, Siegel's theorem follows.
3 In the 1935 inaugural volume of Acta Arithmetica.
It remains to prove (F), for which we follow Estermann's simplification of Siegel's original proof. Since $F(s)$ has a nonnegative Dirichlet series, its Taylor series about $s = 2$ is
\[ F(s) = \sum_{m=0}^\infty b_m (2-s)^m \]
with $b_0 = F(2) > 1$ and all $b_m > 0$. Thus
\[ F(s) - \frac{\lambda}{s-1} = \sum_{m=0}^\infty (b_m - \lambda)(2-s)^m, \]
this being valid in the circle $|s-2| < 2$. Consider this on $|s-2| = 3/2$. We have there the crude bounds $L(s,\chi_1) \ll q_1$, $L(s,\chi_2) \ll q_2$, $L(s,\chi_1\chi_2) \ll q_1 q_2$, and of course $\zeta(s)$ is bounded on that circle. So $F(s) \ll (q_1 q_2)^2$ on that circle, and thus the same is true of $F(s) - \lambda/(s-1)$. Thus
\[ |b_m - \lambda| \ll (2/3)^m (q_1 q_2)^2. \]
So for fixed $\theta \in (1/2, 1)$ we find for all $s \in (\theta, 1)$
\[ \sum_{m=M}^\infty |b_m - \lambda|(2-s)^m \ll (q_1 q_2)^2 \Bigl(\frac{2(2-\theta)}{3}\Bigr)^M. \]
Thus (remember $b_0 \ge 1$, $b_m \ge 0$)
\[ F(s) - \frac{\lambda}{s-1} \ge 1 - \lambda\,\frac{(2-s)^M - 1}{1-s} - O\Bigl((q_1 q_2)^2 \Bigl(\frac{2(2-\theta)}{3}\Bigr)^M\Bigr). \]
Let $M$ be the largest integer such that the error term $O((q_1 q_2)^2 (2(2-\theta)/3)^M)$ is $< 1/2$. Then
\[ F(s) > \frac12 - \frac{\lambda}{1-s}\,(2-s)^M. \]
But
\[ (2-s)^M = \exp(M \log(2-s)) < \exp M(1-s), \]
and $\exp M \ll (q_1 q_2)^{O(1)}$, which completes the proof of (F) and thus of Siegel's theorem.
Math 259: Introduction to Analytic Number Theory
Formulas for $L(1,\chi)$
(if we choose the representative of $a$ mod $q$ with $0 < a < q$). Either the real or the imaginary part will disappear depending on whether $\chi$ is odd or even.
Assume first that $\chi$ is even. Then the terms $\bar\chi(a)(1 - 2a/q)$ cancel in $(a, q-a)$ pairs. Moreover the terms $\bar\chi(a) \log 2$ sum to zero, and we have
\[ L(1,\chi) = -\frac{1}{\tau(\bar\chi)} \sum_{a \bmod q} \bar\chi(a) \log \sin\frac{a\pi}{q}. \qquad (L_+) \]
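For example if $\chi$ is a real character then
\[ \sqrt{q}\, L(1,\chi) = 2 \log \epsilon \]
where
\[ \epsilon = \prod_{a=1}^{\lfloor q/2 \rfloor} \Bigl(\sin\frac{a\pi}{q}\Bigr)^{-\chi(a)} \]
is a cyclotomic unit of $\mathbf{Q}(\sqrt q)$. The Dirichlet class number formula then asserts in effect that $\epsilon = \epsilon_0^h$ where $\epsilon_0$ is the fundamental unit of that real quadratic field and $h$ is its class number.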
If on the other hand $\chi$ is odd then it is the logarithm terms that cancel in symmetrical pairs. Using again the fact that $\sum_{a \bmod q} \bar\chi(a) = 0$ we simplify (L) to
\[ L(1,\chi) = -\frac{\pi i}{q\,\tau(\bar\chi)} \sum_{a=1}^{q-1} a\,\bar\chi(a). \qquad (L_-) \]
In particular if $\chi$ is real then (again using the sign of $\tau(\chi)$ for real characters)
\[ L(1,\chi) = -\pi q^{-3/2} \sum_{a=1}^{q-1} a\,\chi(a). \]
Thus the sum is negative, and by Dirichlet equals $-q$ times the class number of the imaginary quadratic field $\mathbf{Q}(\sqrt{-q})$, except for $q = 3, 4$ when that field has extra roots of unity.
(Exercise: Show directly that the sum is a multiple of $q$, at least when $q$ is prime.)
Let us concentrate on the case of real characters to prime modulus $q \equiv -1 \bmod 4$. That $\sum_{a=1}^{q-1} a\chi(a) < 0$ suggests that the quadratic residues mod $q$ tend to be more numerous in the interval $[1, q/2]$ than in $[q/2, q]$. We can prove this by evaluating the sum
\[ S(N) := \sum_{n=1}^N \chi(n) \]
at $N = (q-1)/2$. We noted already that for any nontrivial character $\chi$ mod $q$ we have $S(q) = 0$ and thus $|S(N)| < q$ for all $N$. In fact, using the Gauss-sum formula for $\chi(n)$ we have
\[ S(N) = \frac{1}{\tau(\bar\chi)} \sum_{a \bmod q} \bar\chi(a) \sum_{n=1}^N e^{2\pi i n a/q} = \frac{1}{\tau(\bar\chi)} \sum_{a \bmod q} \bar\chi(a)\, \frac{1 - e^{2\pi i N a/q}}{e^{-2\pi i a/q} - 1}, \]
from which Pólya and Vinogradov's estimate
\[ S(N) \ll q^{1/2} \sum_{a=1}^{\lfloor q/2 \rfloor} \frac1a \ll q^{1/2} \log q \]
follows immediately. Now let $\chi$ be the quadratic character modulo a prime $q \equiv -1 \bmod 4$ and let $N = (q-1)/2$. (What would happen for $q \equiv +1 \bmod 4$?) Then
\[ S((q-1)/2) = \sum_{n=1}^{q-1} \chi(n)\,\phi(n/q) \]
where $\phi(x)$ is the periodic function defined by
\[ \phi(x) = \begin{cases} 0, & \text{if } 2x \in \mathbf{Z}; \\ +1/2, & \text{if } 0 < x - \lfloor x \rfloor < 1/2; \\ -1/2, & \text{otherwise} \end{cases} \]
("square wave"). This has the Fourier series
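\[ \phi(x) = \frac{2}{\pi}\Bigl(\sin 2\pi x + \frac13 \sin 6\pi x + \frac15 \sin 10\pi x + \frac17 \sin 14\pi x + \cdots\Bigr). \]
We thus have
\[ S((q-1)/2) = \frac{1}{\pi i} \sum_{\substack{m=1 \\ m \text{ odd}}}^\infty \frac1m \sum_{a=1}^{q-1} \chi(a)\bigl(e^{2\pi i m a/q} - e^{-2\pi i m a/q}\bigr). \]
The inner sum is
\[ \tau(\chi)\bigl(\chi(m) - \chi(-m)\bigr) = 2i\sqrt{q}\,\chi(m). \]
Thus our final formula for $S((q-1)/2)$ is
\[ \frac{2\sqrt q}{\pi} \sum_{m \text{ odd}} \frac{\chi(m)}{m} = \frac{(2 - \chi(2))\sqrt q}{\pi}\, L(1,\chi). \]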
It follows, as claimed, that there are more quadratic residues than nonresidues in $[1, q/2]$; in fact once $q > 3$ the difference between the counts is either $h$ or $3h$ according as $\chi(2) = 1$ or $-1$, i.e. according as $q$ is 7 or 3 mod 8. Even the positivity of $S((q-1)/2)$ has yet to be proven without resort to such analytic methods!
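The two class-number statements above are easy to test numerically for small primes. The following is only a sketch under stated assumptions: the function names are ours, $q$ is assumed to be a prime with $q \equiv 3 \bmod 4$ and $q > 3$, and the Legendre symbol is computed by Euler's criterion rather than by anything in these notes.

```python
def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r  # r is 0, 1 or q-1


def half_sum(q):
    """S((q-1)/2): residues minus nonresidues among 1, ..., (q-1)/2."""
    return sum(legendre(n, q) for n in range(1, (q - 1) // 2 + 1))


def class_number_from_sum(q):
    """h = -(1/q) * sum_{a=1}^{q-1} a*chi(a), assuming q prime, q = 3 mod 4, q > 3."""
    total = sum(a * legendre(a, q) for a in range(1, q))
    assert total % q == 0  # the sum is a multiple of q
    return -total // q


if __name__ == "__main__":
    for q in (7, 11, 19, 23, 31, 43, 47):
        h = class_number_from_sum(q)
        predicted = h if q % 8 == 7 else 3 * h
        print(q, h, half_sum(q), predicted)
```

For each listed prime the last two printed columns should agree, which is the "$h$ or $3h$" statement in miniature.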
Exercises: What can you say of $S(\lfloor q/4 \rfloor)$? What about the sums $\sum_{a=1}^{q-1} a^m \chi(a)$ for $m = 2, 3, \ldots$?
Math 259: Introduction to Analytic Number Theory
Introduction to exponential sums; Weyl equidistribution
A sequence $c_1, c_2, c_3, \ldots$ of real numbers is said to be equidistributed mod 1 if the fractional parts $\langle c_n \rangle$ cover each interval in $\mathbf{R}/\mathbf{Z}$ in proportion to its length, i.e. if
\[ \lim_{N\to\infty} \frac1N\, \#\{n \le N : a \le \langle c_n \rangle \le b\} = b - a \qquad (1) \]
for all $a, b$ such that $0 \le a \le b \le 1$. What does this have to do with exponential sums? Consider the following theorem of Weyl: for a sequence $\{c_n\}_{n=1}^\infty$ in $\mathbf{R}$ (or equivalently in $\mathbf{R}/\mathbf{Z}$), the following are equivalent:
(i) Condition (1) holds for all $a, b$ such that $0 \le a \le b \le 1$;
(ii) For any continuous function $f : (\mathbf{R}/\mathbf{Z}) \to \mathbf{C}$,
\[ \lim_{N\to\infty} \frac1N \sum_{n=1}^N f(c_n) = \int_0^1 f(t)\,dt; \qquad (2) \]
$|\sum_{n=1}^N e_m(c_n)| \le N$ by a factor that tends to $\infty$ with $N$. For instance, we have Weyl's original application of this theorem: For $r \in \mathbf{R}$ the sequence $\{nr\}$ is equidistributed mod 1 if and only if $r \notin \mathbf{Q}$. Indeed if $r$ is rational then $\langle nr \rangle$ only takes on finitely many values; but if $r$ is irrational then for each $m$ we have $e_m(r) \ne 1$ and thus
\[ \sum_{n=1}^N e_m(nr) = \frac{e_m((N+1)r) - e_m(r)}{e_m(r) - 1} = O_m(1). \]
In general we cannot reasonably hope that $\sum_{n=1}^N e_m(c_n)$ is bounded for each $m$, but we will be able to use our techniques to get a bound² $o(N)$. For instance, we'll see that if $P \in \mathbf{R}[x]$ is a polynomial at least one of whose nonconstant coefficients is irrational then $\{P(n)\}$ is equidistributed mod 1 (the example of $\{nr\}$ being the special case of linear polynomials). We'll also be able to show this for $\{\log_{10}(n!)\}$ and thus obtain the distribution of the first $d$ digits of $n!$ for each $d$.
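The boundedness of $\sum_{n \le N} e_m(nr)$ for irrational $r$ can be watched numerically. This is a minimal sketch, assuming $e_m(x) = e^{2\pi i m x}$; the function names are ours, and the bound $1/|\sin \pi m r|$ used below is just the geometric-series estimate $2/|1 - e_m(r)|$.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def weyl_sum(r, m, N):
    """sum_{n=1}^N e(m*n*r), computed term by term."""
    return sum(e(m * n * r) for n in range(1, N + 1))


def geometric_bound(r, m):
    """2/|1 - e(m*r)| = 1/|sin(pi*m*r)|, valid when m*r is not an integer."""
    return 1.0 / abs(math.sin(math.pi * m * r))


if __name__ == "__main__":
    r = math.sqrt(2)
    for m in (1, 2, 3):
        sizes = [abs(weyl_sum(r, m, N)) for N in (10, 100, 1000, 5000)]
        print(m, max(sizes), geometric_bound(r, m))
    # For rational r the sum can grow like N: here every term equals 1.
    print(abs(weyl_sum(0.25, 4, 1000)))
```

For $r = \sqrt2$ the sums stay below the geometric bound no matter how large $N$ is, while for $r = 1/4$ and $m = 4$ the sum has size $N$.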
Exercises:
Given a positive function $g$, the functions $f$ such that $f = o(g)$ constitute a vector space.
4. (Effective and ineffective $o(\cdot)$.) An estimate $f = o(g)$ is said to be effective if for each $\epsilon > 0$ we can compute a specific point past which $|f| < \epsilon g$; otherwise it is ineffective. Show that the transformations in the previous exercise preserve effectivity. Give an example of an ineffective $o(\cdot)$.
References
[Körner 1988] Körner, T.W.: Fourier Analysis. Cambridge, England: Cambridge University Press, 1988. [HA 9.88.14 / QA403.5.K67]
[Schmidt 1976] Schmidt, W.M.: Equations over Finite Fields: An Elementary Approach. Berlin: Springer, 1976 (LNM 536). [QA3.L28 #536]
Math 259: Introduction to Analytic Number Theory
Exponential sums II: the Kuzmin and Montgomery-Vaughan estimates
While proving that an arithmetic progression with irrational step size is equidistributed mod 1, we encountered the estimate
\[ \Bigl|\sum_{n=1}^N e(cn)\Bigr| \le \frac{2}{|1 - e(c)|} = \frac{1}{|\sin \pi c|} \ll \{c\}^{-1}, \]
where $\{c\}$ is the distance from $c$ to the nearest integer. Kuzmin showed (1927) that more generally if for $0 < n \le N$ the sequence $\{c_n - c_{n-1}\}$ of differences is monotonic and contained in $[k + \lambda, k + 1 - \lambda]$ for some $k \in \mathbf{Z}$ and $\lambda > 0$ then
\[ \Bigl|\sum_{n=0}^N e(c_n)\Bigr| \le \cot\frac{\pi\lambda}{2}. \]
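Indeed, let $\delta_n = c_n - c_{n-1}$ and
\[ \zeta_n = \frac{1}{1 - e(\delta_n)} = \frac{e(c_{n-1})}{e(c_{n-1}) - e(c_n)}. \]
Note that the $\zeta_n$ are collinear:
\[ \zeta_n = (1 + i \cot \pi\delta_n)/2; \]
since $\{\delta_n\}$ is monotonic, the $\zeta_n$ are positioned consecutively on the vertical line $\mathrm{Re}(\zeta) = 1/2$. Now our exponential sum is
\[ \sum_{n=0}^N e(c_n) = e(c_N) + \sum_{n=1}^{N} \bigl(e(c_{n-1}) - e(c_n)\bigr)\zeta_n
 = (1 - \zeta_N)\,e(c_N) + \zeta_1\,e(c_0) + \sum_{n=1}^{N-1} e(c_n)(\zeta_{n+1} - \zeta_n). \]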
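Thus
\[ \Bigl|\sum_{n=0}^N e(c_n)\Bigr| \le |\zeta_1| + \sum_{n=1}^{N-1} |\zeta_{n+1} - \zeta_n| + |1 - \zeta_N| = |\zeta_1| + |\zeta_N - \zeta_1| + |\zeta_N|, \]
where in the last step we used the monotonicity of $\mathrm{Im}(\zeta_n)$ and the fact that $\mathrm{Re}(\zeta_n) = 1/2$. The conclusion of the proof,
\[ |\zeta_1| + |\zeta_N - \zeta_1| + |\zeta_N| \le \frac{1}{\sin \pi\lambda} + \frac{1}{\tan \pi\lambda} = \cot\frac{\pi\lambda}{2}, \]
is an exercise in trigonometry.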
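A numerical sanity check of Kuzmin's bound, offered as a sketch rather than as part of the argument: the test sequence $c_n = 0.3n + 0.0001n^2$ and the function names are our own choices, picked so that the differences increase from about $0.3$ to about $0.5$ and hence lie in $[\lambda, 1 - \lambda]$ with $\lambda = 0.3$.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def kuzmin_check(c, lam):
    """Return (|sum e(c_n)|, cot(pi*lam/2)) after verifying Kuzmin's hypotheses.

    c is the list c_0, ..., c_N; the differences must be monotonic and lie
    in [k + lam, k + 1 - lam] for a single integer k.
    """
    diffs = [c[n] - c[n - 1] for n in range(1, len(c))]
    increasing = all(a <= b for a, b in zip(diffs, diffs[1:]))
    decreasing = all(a >= b for a, b in zip(diffs, diffs[1:]))
    k = math.floor(diffs[0])
    in_window = all(k + lam <= d <= k + 1 - lam for d in diffs)
    if not ((increasing or decreasing) and in_window):
        raise ValueError("hypotheses of Kuzmin's inequality are not satisfied")
    total = abs(sum(e(x) for x in c))
    return total, 1.0 / math.tan(math.pi * lam / 2)


if __name__ == "__main__":
    c = [0.3 * n + 0.0001 * n * n for n in range(1001)]
    print(kuzmin_check(c, 0.3))
```

The first number printed should not exceed the second, whatever the length of the sequence, as long as the hypotheses hold.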
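For instance, it follows that for $t/\pi < N_1 < N_2$ we have
\[ \sum_{n=N_1}^{N_2} n^{-it} \ll N_2/t, \]
since we are dealing with $c_n = -(t \log n)/2\pi$ and thus $\delta_n \approx -t/2\pi n$. By partial summation it follows that
\[ \sum_{n=N_1}^{N_2} n^{-1/2-it} \ll \frac1t \int_{N_1}^{N_2} n^{-3/2}\, n\, dn \ll t^{-1} N_2^{1/2}, \]
and thus
\[ \zeta(1/2 + it) = \sum_{n=1}^{\lfloor |t|/\pi \rfloor} n^{-1/2-it} + O(1). \]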
With some more work, we can (and soon will) push the upper limit of the sum further down, but not (yet?) all the way to $t^\epsilon$; as $n$ decreases, the phase $e((t \log n)/2\pi)$ varies more erratically, making the sum harder to control. Still, if we sum random complex numbers of norm $c_n$, the variance of the sum is $\sum_n |c_n|^2$, so we expect that the sum would grow as the square root of that, which for $\zeta(1/2+it)$ would make it $\log^{1/2} |t|$ "on average". We next prove a series of general mean-square results along these lines, in which the summands are not independent variables but complex exponentials with different frequencies:
\[ f(t) = \sum_{\mu \in A} c_\mu\, e(\mu t) \]
for some finite set $A \subset \mathbf{R}$ and coefficients $c_\mu \in \mathbf{C}$. The easiest estimate is that given $A$ and $c_\mu$ we have
\[ \int_{T_1}^{T_2} |f(t)|^2\,dt = (T_2 - T_1) \sum_{\mu \in A} |c_\mu|^2 + O(1). \]
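How does the "$O(1)$" depend on $A, c_\mu, T_1, T_2$? We easily find
\[ \int_{T_1}^{T_2} |f(t)|^2\,dt - (T_2 - T_1) \sum_{\mu \in A} |c_\mu|^2 = Q_A(\vec c_1) - Q_A(\vec c_2), \]
where $Q_A$ is the sesquilinear form on $\mathbf{C}^A$ defined by
\[ Q_A(\vec x) = \frac{1}{2\pi i} \mathop{\sum\sum}_{\substack{\mu,\nu \in A \\ \mu \ne \nu}} \frac{\bar x_\mu x_\nu}{\mu - \nu} \]
and $\vec c_j \in \mathbf{C}^A$ ($j = 1, 2$) are the vectors with $\mu$ coordinate $c_\mu e(\mu T_j)$. The termwise estimate $|Q_A(\vec x)| \le \pi^{-1} \sum\sum_{\mu > \nu} |x_\mu x_\nu|/(\mu - \nu)$ is already sufficient to prove $T^{-1} \int_0^T |\zeta(1/2 + it)|^2\,dt \ll \log^2 T$. But remarkably a tighter estimate holds in this general setting: let
\[ \delta(\mu) = \min_\nu |\mu - \nu|, \]
the minimum taken over all $\nu \in A$ other than $\mu$ itself. Then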
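Theorem (Montgomery-Vaughan Hilbert Inequality). For any finite $A \subset \mathbf{R}$ and $\vec c \in \mathbf{C}^A$ we have
\[ |Q_A(\vec c)| \ll \sum_{\mu \in A} \frac{|c_\mu|^2}{\delta(\mu)}, \]
and thus
\[ \int_{T_1}^{T_2} \Bigl|\sum_{\mu \in A} c_\mu\, e(\mu t)\Bigr|^2 dt = \sum_{\mu \in A} \bigl(T_2 - T_1 + O(\delta(\mu)^{-1})\bigr)\, |c_\mu|^2. \]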
Why "Hilbert Inequality" and not simply "Inequality"? Because this is a grand generalization of the original Hilbert inequality, which is the special case $A = \{\mu \in \mathbf{Z} : |\mu| < M\}$. In that case our function $f(t)$ is $\mathbf{Z}$-periodic, and as Schur observed the inequality $|Q_A(\vec c)| < (1/2) \sum |c_\mu|^2$ follows from the integral formula $Q_A(\vec c) = -\int_0^1 (t - \frac12)\,|f(t)|^2\,dt$ (though in that case the resulting estimate on $\int_{T_1}^{T_2} |f(t)|^2\,dt$ is even easier than the upper bound on $|Q_A(\vec c)|$).
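The original Hilbert inequality is simple to test on random data. The sketch below assumes the convention $Q(\vec x) = (2\pi i)^{-1} \sum\sum_{m \ne n} \bar x_m x_n/(m - n)$ over consecutive integer frequencies (only differences of frequencies matter, so indexing from 0 is harmless); the function name and the random test vector are ours.

```python
import math
import random


def hilbert_form(x):
    """Q(x) = (1/(2*pi*i)) * sum_{m != n} conj(x_m) * x_n / (m - n).

    The frequencies are the integers 0, 1, ..., len(x)-1.
    """
    total = 0j
    for m, xm in enumerate(x):
        for n, xn in enumerate(x):
            if m != n:
                total += xm.conjugate() * xn / (m - n)
    return total / (2j * math.pi)


if __name__ == "__main__":
    random.seed(0)
    x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(40)]
    q_value = hilbert_form(x)
    norm_sq = sum(abs(z) ** 2 for z in x)
    # Q should be real, and |Q| should stay below half the squared norm.
    print(q_value, norm_sq / 2)
```

The form comes out real up to rounding, and its absolute value stays below $\frac12 \sum |x_m|^2$, in line with Schur's observation.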
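To start the proof of Montgomery-Vaughan, consider $\mathbf{C}^A$ as a finite-dimensional complex Hilbert space with inner product
\[ \langle \vec c, \vec c\,' \rangle := \sum_{\mu \in A} \bar c_\mu c'_\mu / \delta(\mu). \]
Then $Q_A(\vec x) = \langle \vec x, L\vec x \rangle$ where $L$ is the Hermitian operator taking $\vec x$ to the vector with $\mu$ coordinate $(2\pi i)^{-1} \delta(\mu) \sum_{\nu \ne \mu} x_\nu/(\mu - \nu)$, and we want to show that $\langle \vec c, L\vec c \rangle \ll \langle \vec c, \vec c \rangle$ for all $\vec c \in \mathbf{C}^A$. But this is equivalent to the condition that $L$ have norm $O(1)$ as an operator on that Hilbert space, and since the operator is Hermitian it is enough to check that $[Q_A(\vec c) =]\ \langle \vec c, L\vec c \rangle \ll 1$ holds when $\vec c$ is a normalized eigenvector. Thus it is enough to prove that $Q_A(\vec c) \ll 1$ for all $A, \vec c$ such that
\[ \sum_{\mu \in A} |c_\mu|^2/\delta(\mu) = 1 \]
and there exists some $\lambda \in \mathbf{R}$ such that
\[ \delta(\mu) \sum_{\nu \ne \mu} c_\nu/(\mu - \nu) = i\lambda c_\mu \]
for each $\mu \in A$, in which case $\lambda = 2\pi Q_A(\vec c)$.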
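Now for any $\vec c$ we have
\[ |2\pi Q(\vec c)|^2 = \Bigl|\sum_\mu \bar c_\mu \sum_{\nu \ne \mu} \frac{c_\nu}{\mu - \nu}\Bigr|^2 \le \Bigl[\sum_\mu \frac{|c_\mu|^2}{\delta(\mu)}\Bigr] \Bigl[\sum_\mu \delta(\mu) \Bigl|\sum_{\nu \ne \mu} \frac{c_\nu}{\mu - \nu}\Bigr|^2\Bigr]. \]
By assumption $\sum_\mu |c_\mu|^2/\delta(\mu) = 1$. For the other factor, we expand
\[ \Bigl|\sum_{\nu \ne \mu} \frac{c_\nu}{\mu - \nu}\Bigr|^2 = \sum_{\nu \ne \mu} \frac{|c_\nu|^2}{(\mu - \nu)^2} + \mathop{\sum\sum}_{\nu_1 \ne \nu_2} \frac{\bar c_{\nu_1} c_{\nu_2}}{(\nu_1 - \mu)(\nu_2 - \mu)}. \]
The single sum contributes
\[ \sum_\nu |c_\nu|^2 \sum_{\mu \ne \nu} \frac{\delta(\mu)}{(\mu - \nu)^2}. \]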
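And now we get to use the eigenvalue hypothesis to show that the $S(\nu_j)$ terms cancel each other. Indeed we have
\[ \mathop{\sum\sum}_{\nu_1 \ne \nu_2} \frac{\bar c_{\nu_1} c_{\nu_2}}{\nu_2 - \nu_1}\, S(\nu_1) = \sum_{\nu_1} \bar c_{\nu_1} S(\nu_1) \sum_{\nu_2 \ne \nu_1} \frac{c_{\nu_2}}{\nu_2 - \nu_1} \]
and the inner sum is just $-i\lambda c_{\nu_1}/\delta(\nu_1)$, so
\[ \mathop{\sum\sum}_{\nu_1 \ne \nu_2} \frac{\bar c_{\nu_1} c_{\nu_2}}{\nu_2 - \nu_1}\, S(\nu_1) = -i\lambda \sum_\nu \frac{S(\nu)}{\delta(\nu)}\, |c_\nu|^2. \]
The same computation shows that
\[ \mathop{\sum\sum}_{\nu_1 \ne \nu_2} \frac{\bar c_{\nu_1} c_{\nu_2}}{\nu_2 - \nu_1}\, S(\nu_2) = -i\lambda \sum_\nu \frac{S(\nu)}{\delta(\nu)}\, |c_\nu|^2, \]
so the $S(\nu_j)$ terms indeed drop out! Collecting the surviving terms, we are thus left with
\[ |2\pi Q(\vec c)|^2 \le \sum_{\nu \in A} |c_\nu|^2\, T(\nu) + \mathop{\sum\sum}_{\nu_1 \ne \nu_2} \bar c_{\nu_1} c_{\nu_2}\, \frac{\delta(\nu_1) + \delta(\nu_2)}{(\nu_2 - \nu_1)^2}. \]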
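By now all the coefficients are positive, so we will have no further magic cancellations and will have to just estimate how big things can get. We'll need some lemmas (which are the only place we actually use the definition of $\delta(\mu)$!): first, for each $k = 2, 3, \ldots$,
\[ \nu \in A \;\Longrightarrow\; \sum_{\mu \ne \nu} \frac{\delta(\mu)}{(\mu - \nu)^k} \ll_k \delta(\nu)^{1-k}; \qquad (1) \]
second,
\[ \nu_1, \nu_2 \in A \;\Longrightarrow\; \sum_{\mu \ne \nu_1, \nu_2} \frac{\delta(\mu)}{(\nu_1 - \mu)^2 (\nu_2 - \mu)^2} \ll \frac{[\delta(\nu_1)^{-1}] + [\delta(\nu_2)^{-1}]}{(\nu_1 - \nu_2)^2}. \qquad (2) \]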
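by the case $k = 2$ of (1). The second sum will be Cauchy-Schwarzed. That sum is bounded by twice
\[ B := \mathop{\sum\sum}_{\nu_1 \ne \nu_2} |c_{\nu_1} c_{\nu_2}|\, \frac{\delta(\nu_1)}{(\nu_2 - \nu_1)^2} = \mathop{\sum\sum}_{\mu \ne \nu} |c_\mu c_\nu|\, \frac{\delta(\mu)}{(\mu - \nu)^2}. \]
Since $\sum_\nu |c_\nu|^2/\delta(\nu) = 1$, we have
\[ |B|^2 \le \sum_\nu \delta(\nu) \Bigl[\sum_{\mu \ne \nu} \frac{|c_\mu|\, \delta(\mu)}{(\mu - \nu)^2}\Bigr]^2. \]
Expanding and switching $\sum$'s we rewrite this as
\[ |B|^2 \le \mathop{\sum\sum}_{\nu_1, \nu_2} |c_{\nu_1} c_{\nu_2}|\, \delta(\nu_1)\, \delta(\nu_2) \sum_{\mu \ne \nu_1, \nu_2} \frac{\delta(\mu)}{(\nu_1 - \mu)^2 (\nu_2 - \mu)^2}. \]
When $\nu_1 = \nu_2$, the inner sum is $\ll \delta(\nu)^{-3}$ (by (1) with $k = 4$), so the contribution of those terms is $\ll \sum_\nu |c_\nu|^2/\delta(\nu) = 1$. When $\nu_1 \ne \nu_2$ we apply (2), and the resulting estimate on the sum of the cross-terms is twice the double sum defining $B$! So, we've shown (modulo the proofs of (1), (2)) that $B^2 \ll 1 + B$. Thus $B \ll 1$ and we're finally done.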
For instance, if $A = \{\log n/2\pi : n = 1, 2, 3, \ldots, N\}$ we find that
\[ \int_{T_1}^{T_2} \Bigl|\sum_{n=1}^N c_n n^{it}\Bigr|^2 dt = \sum_{n=1}^N \bigl(T_2 - T_1 + O(n)\bigr)\, |c_n|^2. \]
Taking $T_1 = -T$, $T_2 = 0$, $c_n = n^{-1/2}$ we thus have
\[ \int_0^T \Bigl|\sum_{n=1}^N n^{-1/2-it}\Bigr|^2 dt = T \log N + O(T + N). \]
It follows that
\[ \int_0^T |\zeta(1/2 + it)|^2\,dt = T \log T + O(T \sqrt{\log T}). \]
Exercises:
1. Prove that the constant in the original Hilbert inequality is best possible, and show that it holds even if $c_\mu$ is allowed to be nonzero for every integer $\mu$ (this is in fact what Hilbert originally proved).
2. Complete the proof of Montgomery-Vaughan by verifying the inequalities (1), (2).
3. Let $\chi$ be a character (primitive or not) mod $q$. Obtain an asymptotic formula for $\int_0^T |L(1/2 + it, \chi)|^2\,dt$. How does the error term depend on $q$? (It is conjectured that $L(1/2 + it, \chi) \ll_\epsilon (q|t|)^\epsilon$; as usual this problem is still wide open.)
Math 259: Introduction to Analytic Number Theory
Exponential sums III: the van der Corput inequalities
Let $f(x)$ be a sufficiently differentiable function, and $S = \sum_{n=1}^N e(f(n))$. The Kuzmin inequality tells us in effect that
I. If $f'(x)$ is monotonic and $\lambda_1 < \{f'(x)\} < 1 - \lambda_1$ for $x \in [1, N]$ then $S \ll 1/\lambda_1$.
We shall use this to deduce van der Corput's estimates on $S$ in terms of $N$ and higher derivatives of $f$. In each case the inequality is only useful if $f$ has a derivative $f^{(k)}$ of constant sign which is significantly smaller than 1.
II. If $c\lambda_2 < f'' < C\lambda_2$ for some constants $c, C$ such that $0 < c < C$ then $S \ll_{c,C} N\lambda_2^{1/2} + \lambda_2^{-1/2}$.
III. If $c\lambda_3 < f''' < C\lambda_3$ for some constants $c, C$ such that $0 < c < C$ then $S \ll_{c,C} N\lambda_3^{1/6} + N^{1/2}\lambda_3^{-1/6}$.
but we'll only make use of van der Corput II and III.
Here is a typical application, due to van der Corput: $\zeta(1/2 + it) \ll |t|^{1/6} \log |t|$. We have seen that
\[ \zeta(1/2 + it) = \sum_{n=1}^{\lfloor |t|/\pi \rfloor} n^{-1/2-it} + O(1). \]
We shall break up the sum into segments $\sum_{n=N}^{N_1}$ with $N < N_1 \le 2N$. We shall use $f(x) = (t \log x)/2\pi$, so $\lambda_k \approx t/N^k$. Then II and III give
\[ \sum_{n=N}^{N'} n^{it} \ll |t|^{1/2} + N/|t|^{1/2}, \qquad \sum_{n=N}^{N'} n^{it} \ll N^{1/2} |t|^{1/6} + N/|t|^{1/6} \]
for $N < N' < N_1$. By partial summation it follows that
\[ \sum_{n=N}^{N'} n^{-1/2-it} \ll (|t|/N)^{1/2} + (N/|t|)^{1/2}, \qquad \sum_{n=N}^{N'} n^{-1/2-it} \ll |t|^{1/6} + N^{1/2}/|t|^{1/6}. \]
Choosing the first estimate for $N \gg |t|^{2/3}$ and the second for $N \ll |t|^{2/3}$ we find that the sum is $\ll |t|^{1/6}$ in either case. Since the total number of $(N, N']$ segments is $O(\log |t|)$, the inequality $\zeta(1/2 + it) \ll |t|^{1/6} \log |t|$ follows.
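For readers who want the exponent bookkeeping spelled out, the substitutions behind the two pairs of estimates can be written as follows; this is just arithmetic with $\lambda_2 \approx |t|/N^2$ and $\lambda_3 \approx |t|/N^3$, under the standing assumption $N \ll |t|$.

```latex
\begin{align*}
\text{II:}\quad & N\lambda_2^{1/2} + \lambda_2^{-1/2}
   \approx N\,\frac{|t|^{1/2}}{N} + \frac{N}{|t|^{1/2}}
   = |t|^{1/2} + \frac{N}{|t|^{1/2}}, \\
\text{III:}\quad & N\lambda_3^{1/6} + N^{1/2}\lambda_3^{-1/6}
   \approx N\,\frac{|t|^{1/6}}{N^{1/2}} + N^{1/2}\,\frac{N^{1/2}}{|t|^{1/6}}
   = N^{1/2}|t|^{1/6} + \frac{N}{|t|^{1/6}}. \\
\intertext{Dividing by $N^{1/2}$ (partial summation) and comparing at the crossover $N = |t|^{2/3}$:}
N \ge |t|^{2/3}:\quad & (|t|/N)^{1/2} \le |t|^{1/6}, \qquad (N/|t|)^{1/2} \ll 1; \\
N \le |t|^{2/3}:\quad & N^{1/2}/|t|^{1/6} \le |t|^{1/3 - 1/6} = |t|^{1/6}.
\end{align*}
```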
The inequality II is an easy consequence of Kuzmin's I. [NB the following is not van der Corput's original proof, for which see for instance Lecture 3 of [Montgomery 1994]. The proof we give is much more elementary, but does not as readily yield the small further reductions of the exponents that are available with these methods.] We may assume that $f''(x) < 1/4$ on $[1, N]$, else $\lambda_2 \gg 1$ and the inequality is trivial. Split $[1, N]$ into $O(N\lambda_2 + 1)$ intervals on which $\lfloor f' \rfloor$ is constant. Let $\lambda_1$ be a small positive number to be determined later, and take out $O(N\lambda_2 + 1)$ subintervals of length $O(\lambda_1/\lambda_2 + 1)$ on which $f'$ is within $\lambda_1$ of an integer. On each excised interval, estimate the sum trivially by its length; on the remaining intervals, use Kuzmin. This yields
\[ S \ll (N\lambda_2 + 1)(\lambda_1^{-1} + \lambda_1/\lambda_2 + 1). \]
Taking $\lambda_1 = \lambda_2^{1/2}$ completes the proof of II.
For III and higher van der Corput bounds, use Weyl's trick: for $H \le N$,
\[ |S| \ll \Bigl( \frac{N}{H} \sum_{h=0}^{H} \Bigl|\sum_{n=1}^{N-h} e\bigl(f(n+h) - f(n)\bigr)\Bigr| \Bigr)^{1/2}. \qquad (W) \]
If $f(x)$ has small positive $k$-th derivative then each $f(x+h) - f(x)$ has small $(k-1)$-st derivative, which is positive except for $h = 0$ when the inner sum is $N$. This will let us prove III from II, and so on by induction (see the first Exercise below).
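To prove (W) define $z_n$ for $n \in \mathbf{Z}$ by $z_n = e(f(n))$ for $1 \le n \le N$ and $z_n = 0$ otherwise. Then
\[ S = \sum_{n=-\infty}^{\infty} z_n = \frac1H \sum_{n=-\infty}^{\infty} \Bigl(\sum_{h=1}^{H} z_{n+h}\Bigr), \]
in which fewer than $N + H$ of the inner sums are nonzero. Thus by Cauchy-Schwarz
\[ |S|^2 \le \frac{N+H}{H^2} \sum_{n=-\infty}^{\infty} \Bigl|\sum_{h=1}^{H} z_{n+h}\Bigr|^2 \ll \frac{N}{H^2} \sum_{h_1, h_2 = 1}^{H} \sum_{n \in \mathbf{Z}} z_{n+h_1} \bar z_{n+h_2}. \]
But the absolute value of the inner sum depends only on $|h_1 - h_2|$, and each possible $h := h_1 - h_2$ occurs at most $H$ times. So,
\[ |S|^2 \ll \frac{N}{H} \sum_{h=0}^{H} \Bigl|\sum_{n \in \mathbf{Z}} z_{n+h} \bar z_n\Bigr|, \]
from which (W) follows.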
Now to prove III: we may assume $N^{-3} < \lambda_3 < 1$, else the inequality is trivial. Apply (W), and to each of the inner sums with $h \ne 0$ apply II with $\lambda_2 = h\lambda_3$. This yields
\[ |S|^2 \ll \frac{N^2}{H} + \frac{N}{H} \sum_{h=1}^{H} \bigl[N (h\lambda_3)^{1/2} + (h\lambda_3)^{-1/2}\bigr] \ll N^2 \bigl((H\lambda_3)^{1/2} + H^{-1}\bigr) + N/(H\lambda_3)^{1/2}. \]
Now make the first two terms equal by taking $H = \lfloor \lambda_3^{-1/3} \rfloor$:
\[ |S|^2 \ll N^2 \lambda_3^{1/3} + N \lambda_3^{-1/3}. \]
Extracting square roots yields III.
Exercises:
Math 259: Introduction to Analytic Number Theory
How many points can a curve of genus $g$ have over $\mathbf{F}_q$?
extending over all primes $p$, where $Z = q^{-s}$. Then
\[ \log \zeta_C(s) = \sum_p \sum_{m=1}^\infty Z^{d_p m}/m = \sum_{n=1}^\infty Z^n \sum_{d_p | n} \frac{d_p}{n} = \sum_{n=1}^\infty \frac{Z^n}{n} \sum_{d_p | n} d_p. \]
But the inner sum is just the number $N_n = N_n(C)$ of points of $C$ rational over the field of $q^n$ elements. Note that¹ $N_n \ll_C q^n$, so the sum and thus the Euler product converge for $|Z| < 1/q$, i.e. for $\sigma > 1$.
As in the number-field case, $\zeta_C$ satisfies a functional equation relating its values at $s$ and $1-s$:
\[ \zeta_C(1-s) = q^{(2-2g)(\frac12 - s)}\, \zeta_C(s) = (qZ^2)^{1-g}\, \zeta_C(s); \]
equivalently,
\[ \xi_C(s) := q^{(g-1)(s - \frac12)}\, \zeta_C(s) \]
is invariant under $s \leftrightarrow 1-s$. Moreover, $\zeta_C(s)$ is of the form
\[ \zeta_C(s) = P(Z)/(1-Z)(1-qZ) \]
for some polynomial $P$ of degree $2g$ with $P(0) = 1$; it then follows from the functional equation that $P(1/qZ) = P(Z)/(qZ^2)^g$, which is to say that we can factor $P(Z)$ as
\[ P(Z) = \prod_{j=1}^{g} (1 - \alpha_j Z)(1 - \alpha_{g+j} Z) \]
for some complex numbers $\alpha_1, \ldots, \alpha_{2g}$ such that
\[ \alpha_j \alpha_{g+j} = q \]
for $j = 1, \ldots, g$. Comparing this with our formula for $N_n$ we find
\[ N_n = q^n + 1 - \sum_{j=1}^{2g} \alpha_j^n. \]
1 Fix a nonconstant function $f : C \to \mathbf{P}^1$. Then $N_n(C) \le (\deg f)\, N_n(\mathbf{P}^1) = (\deg f)(q^n + 1) \ll_f q^n$.
(Fortunately this agrees with our known formula for $N_n$ when $g = 0$...) The analogue of the Dirichlet class number formula is the fact that $P(1) = \prod_j (1 - \alpha_j)$, which is essentially the residue of $\zeta_C(s)$ at its pole $s = 1$, is the size of the "Jacobian" $J_C(k)$ of $C$ over $k$.
So far all this can be proved by more-or-less elementary means, and even extends to varieties over $k$ of any dimension [Dwork 1960]. A much harder, but known, result is that the Riemann hypothesis holds: $P(q^{-s})$ can vanish only for $s$ such that $\sigma = 1/2$, i.e. $|Z| = q^{-1/2}$; thus all the $\alpha_j$ have absolute value $q^{1/2}$, and $\alpha_{g+j} = \bar\alpha_j$. This theorem of Weil, and its generalization by Deligne to varieties of arbitrary dimension over finite fields, is at least to some tastes the strongest evidence so far for the truth of the original Riemann hypothesis and its various generalizations.
It also has numerous applications. For instance, it follows immediately that the number $N_1 = N_1(C)$ of $k$-rational points on $C$ is approximated by $q + 1$:
\[ |N_1 - (q+1)| \le 2g\sqrt q. \qquad (W) \]
Equality can hold in this Weil bound at least for small $g$, though already for $g = 1$ there are surprises; for instance for $q = 128$ (W) allows $N_1$ to be as large as 151 and as small as 107, but in fact the most and least possible are 150 and 108. See [Serre 1982-4] for much more about this. We ask however what happens for fixed $q$ as $g \to \infty$: how large can $N_1(C)$ grow as a function of $g$? This is not only a compelling problem in its own right, but has applications to coding theory and similar combinatorial problems, see for instance [Goppa 1981,3; Tsfasman 1996]. We shall see that the bound $N_1 < 2g\sqrt q + O_q(1)$ coming from (W) cannot be sharp, and obtain an improved bound, the Drinfeld-Vlăduţ bound
\[ N_1 < (\sqrt q - 1 + o(1))\, g, \qquad (DV) \]
[DV 1983], that turns out to be best possible for square $q$ [Ihara 1981, TVZ 1982]. Moreover we shall adapt Weyl's equidistribution argument to obtain the asymptotic distribution of the $\alpha_i$ on the circle $|\alpha|^2 = q$ for curves attaining that bound.
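The Weil bound (W) can be checked by brute force in the simplest nontrivial case $g = 1$. The sketch below makes assumptions that are not in the text: it restricts to prime fields $\mathbf{F}_p$ with $p > 3$ and to curves in short Weierstrass form $y^2 = x^3 + ax + b$, and the function names are ours.

```python
import math


def count_points(a, b, p):
    """Number of projective points on y^2 = x^3 + a*x + b over F_p.

    Assumes p is a prime > 3 and the curve is nonsingular (genus 1).
    """
    if (4 * a**3 + 27 * b**2) % p == 0:
        raise ValueError("singular curve")
    # square_count[v] = number of y in F_p with y^2 = v
    square_count = {}
    for y in range(p):
        v = y * y % p
        square_count[v] = square_count.get(v, 0) + 1
    affine = sum(square_count.get((x**3 + a * x + b) % p, 0) for x in range(p))
    return affine + 1  # add the single point at infinity


def weil_gap(a, b, p):
    """Return (|N_1 - (p+1)|, 2*sqrt(p)); (W) with g = 1 says first <= second."""
    return abs(count_points(a, b, p) - (p + 1)), 2 * math.sqrt(p)


if __name__ == "__main__":
    p = 101
    worst = max(weil_gap(a, b, p)[0]
                for a in range(p) for b in range(p)
                if (4 * a**3 + 27 * b**2) % p != 0)
    print(worst, 2 * math.sqrt(p))
```

Running the loop over all nonsingular $(a, b)$ for a fixed small $p$ shows how close $|N_1 - (p+1)|$ actually gets to $2\sqrt p$.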
The key idea is much the same as what we used to prove that $\zeta(1 + it) \ne 0$. To start with, note that if the Weil upper bound $N_1 \le q + 1 + 2g\sqrt q$ is attained then each $\alpha_j = -\sqrt q$. This can actually happen: for instance, let $q = q_0^2$ and let $C$ be the $(q_0 + 1)$-st Fermat curve, i.e. the smooth plane curve $x^{q_0+1} + y^{q_0+1} + z^{q_0+1} = 0$ of degree $q_0 + 1$ and thus of genus $(q_0^2 - q_0)/2$. Then $C$ has $q_0^3 + 1$ points over $k$, the maximum allowed by (W) [check this!]. But now consider this curve over the quadratic extension $\mathbf{F}_{q^2}$ of $k$: we have
\[ N_2 = q^2 + 1 - \sum_{j=1}^{2g} \alpha_j^2 = q^2 + 1 - 2gq = q^{3/2} + 1 = N_1, \]
i.e. every point rational over $\mathbf{F}_{q^2}$ is already defined over $k$! [It is an amusing problem to verify this directly, without invoking the Riemann hypothesis for $C$.] It follows that if $g$ were any larger than $(q - q_0)/2$ and all the $\alpha_j$ were $-q_0$ then $N_2$ would actually be smaller than $N_1$, which is impossible.
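The point count $q_0^3 + 1$ can be confirmed by machine for small $q_0$. The sketch below is restricted by assumption to primes $q_0 = p \equiv 3 \bmod 4$, so that $\mathbf{F}_{p^2}$ can be modelled as $\mathbf{F}_p[i]$ with $i^2 = -1$; in that model $x \mapsto x^{p+1}$ is the norm $a + bi \mapsto a^2 + b^2$, which is all the code uses. The function name is ours.

```python
from itertools import product


def fermat_point_count(p):
    """Projective points of x^(p+1) + y^(p+1) + z^(p+1) = 0 over F_{p^2}.

    Assumes p is a prime with p = 3 mod 4, so F_{p^2} = F_p[i] and
    x^(p+1) is the norm a^2 + b^2 of x = a + b*i.
    """
    if p % 4 != 3:
        raise ValueError("this sketch needs p = 3 mod 4")
    # One norm value per element of F_{p^2}.
    norms = [(a * a + b * b) % p for a in range(p) for b in range(p)]
    affine = sum(1 for nx, ny, nz in product(norms, repeat=3)
                 if (nx + ny + nz) % p == 0)
    # Discard (0,0,0), then identify nonzero scalar multiples.
    return (affine - 1) // (p * p - 1)


if __name__ == "__main__":
    for p in (3, 7):
        print(p, fermat_point_count(p), p**3 + 1)
```

The last two printed columns should coincide, matching the count claimed for the field of $q_0^2$ elements.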
So, we have
\[ 0 \le N_2 - N_1 = q^2 - q + \sum_{j=1}^{2g} (\alpha_j - \alpha_j^2) \]
and likewise
\[ 0 \le N_n - N_1 = q^n - q + \sum_{j=1}^{2g} (\alpha_j - \alpha_j^n) \]
for each $n = 2, 3, 4, \ldots$ (We also have inequalities $N_{dn} \ge N_n$ but these do not help us asymptotically.) How to best combine them? For given $q, g$ this is not an easy problem, but if we fix $q$ and only care about asymptotics as $g \to \infty$ then all we need do is use the inequality
\[ 0 \le \Bigl|\sum_{m=1}^{M} (\alpha_j/\sqrt q)^m\Bigr|^2 = M + \sum_{m=1}^{M-1} (M - m)\, q^{-m/2} (\alpha_j^m + \alpha_{j+g}^m) \]
for each $M$. Summing this over $j \le g$ we find
\begin{align*}
0 &\le Mg + \sum_{m=1}^{M-1} (M - m)\, q^{-m/2} (q^m + 1 - N_m) \\
  &\le Mg + \sum_{m=1}^{M-1} (M - m)\, q^{-m/2} (q^m + 1 - N_1) \\
  &= Mg + O_M(1) - \sum_{m=1}^{M-1} (M - m)\, q^{-m/2} N_1.
\end{align*}
Thus
\[ N_1 < \frac{g}{\sum_{m=1}^{M-1} (1 - \frac{m}{M})\, q^{-m/2}} + O_M(1). \]
For each $\epsilon > 0$, the sum can be brought within $\epsilon$ of
\[ \sum_{m=1}^\infty q^{-m/2} = 1/(\sqrt q - 1) \]
by taking $M$ large enough. We thus have for each $\epsilon > 0$
\[ N_1 < (\sqrt q - 1 + \epsilon)\, g + O_\epsilon(1), \]
from which (DV) follows.
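How fast the coefficient of $g$ approaches $\sqrt q - 1$ as $M$ grows can be tabulated directly. This is a small illustrative sketch; the function name is ours, and it evaluates only the displayed quotient, ignoring the $O_M(1)$ term.

```python
import math


def dv_coefficient(q, M):
    """1 / sum_{m=1}^{M-1} (1 - m/M) * q^(-m/2), the multiple of g bounding N_1."""
    s = sum((1 - m / M) * q ** (-m / 2) for m in range(1, M))
    return 1.0 / s


if __name__ == "__main__":
    q = 4
    for M in (2, 3, 5, 10, 50, 200):
        print(M, dv_coefficient(q, M), math.sqrt(q) - 1)
```

For $M = 2$ the coefficient is $2\sqrt q$, the same constant as in the Weil bound; it then decreases toward $\sqrt q - 1$.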
What is required for asymptotic equality as $C$ ranges over a sequence of curves with $g \to \infty$? Let $\alpha_j = \sqrt q\, e(x_j)$ for $x_j \in \mathbf{R}/\mathbf{Z}$ with $x_{j+g} = -x_j$. Then
\[ N_n = -q^{n/2} \sum_{j=1}^{2g} e(n x_j) + q^n + 1. \]
Since $N_n \ge N_1$ is used for each $n$, we must have $N_n = N_1 + o_n(g)$, and thus
\[ \sum_{j=1}^{2g} e(n x_j) = q^{(1-n)/2} \sum_{j=1}^{2g} e(x_j) + o_n(g). \]
Moreover
\[ \sum_{j=1}^{2g} e(x_j) = -(1 - q^{-1/2})\, g + o(g). \]
Adapting the Weyl equidistribution argument (see especially exercise 2 of the Weyl handout) we conclude that the $x_j$ approach the distribution whose $n$-th Fourier moment ($n \ne 0$) is $-(1 - q^{-1/2})/2q^{(|n|-1)/2}$, i.e. $\mu_q(x)\,dx$ where the density $\mu_q$ is
\[ 1 - (1 - q^{-1/2}) \sum_{n=1}^\infty q^{(1-n)/2}\, \frac{e(nx) + e(-nx)}{2}. \]
Since
\[ (1 - q^{-1/2}) \sum_{n=1}^\infty q^{(1-n)/2} = 1, \]
this density is nonnegative, so it can be attained and (DV) is asymptotically the best inequality that can be obtained from $N_n \ge N_1$. In fact it is known [Ihara 1981, TVZ 1982] that when $q$ is a square there are curves with arbitrarily large $g$ for which $N_1 \ge (\sqrt q - 1)\, g$; our proof of (DV) gives the asymptotic distribution of $\alpha_j$ on the circle $|\alpha|^2 = q$ for any such sequence. It also lets us compute the size $\#J = \prod_{j=1}^{2g} (1 - \alpha_j)$ of the Jacobian in a logarithmic asymptotic sense:
\[ g^{-1} \log \#J \to \log q + 2\int_0^1 \log |1 - q^{-1/2} e(x)|\, \mu_q(x)\,dx. \qquad (J) \]
Exercises.
1. Verify that if $q_0$ is a prime power then the Fermat curve of degree $q_0 + 1$ has $q_0^3 + 1$ rational points over the field of $q_0^2$ elements.
2. Suppose $C$ is the Fermat curve $x^r + y^r + z^r = 0$ over $\mathbf{F}_q$ (not assuming the special case $q = (r-1)^2$). Write $N_n$ in terms of characters on $\mathbf{F}_q$ and identify the eigenvalues of Frobenius with Jacobi sums. (This yields an elementary proof of $|\alpha_j|^2 = q$ for Fermat curves.)
3. What is the best upper bound that can be obtained on $N_1$ using only the inequality $N_1 \le N_2$?
4. Compute $\mu_q(x)$ and the integral (J) in closed form. Generalize to obtain, for each $s \in \mathbf{C}$ of real part $\ne 1/2$, a closed form for $\lim_{g \to \infty} \zeta_C(s)$ as $C$ ranges over a family of curves over $\mathbf{F}_q$ with $N_1(C)/g(C) \to \sqrt q - 1$.
References
[DV 1983] Drinfeld, V.G., Vlăduţ, S.: Number of points of an algebraic curve, Func. Anal. 17 (1983), 53-54.
[Dwork 1960] Dwork, B.M.: On the rationality of the zeta function of an algebraic variety. Amer. J. Math. 82 (1960), 631-648.
[Goppa 1981,3] Goppa, V.D.: Codes on algebraic curves, Soviet Math. Dokl. 24 (1981), 170-172; Algebraico-geometric codes, Math. USSR Izvestiya 24 (1983), 75-91.
[Ihara 1981] Ihara, Y.: Some remarks on the number of rational points of algebraic curves over finite fields. J. Fac. Sci. Tokyo 28 (1981), 721-724.
[Serre 1982-4] Serre, J.-P.: Sur le nombre des points rationnels d'une courbe algébrique sur un corps fini; Nombres de points des courbes algébriques sur $\mathbf{F}_q$; Résumé des cours de 1983-1984: reprinted as ##128,129,132 in his Collected Works III [O 9.86.1 (III) / QA3.S47]
[Tsfasman 1996] Tsfasman, M.A.: Algebraic Geometry Lattices and Codes, pages 385-390 in the proceedings of ANTS-II (second Algorithmic Number Theory Symposium), ed. H. Cohen, Lecture Notes in Computer Science 1122 [QA75.L4 #1122 in the McKay Applied Science Library].
[TVZ 1982] Tsfasman, M.A., Vlăduţ, S.G., Zink, T.: Modular curves, Shimura curves and Goppa codes better than the Varshamov-Gilbert bound. Math. Nachr. 109 (1982), 21-28.
Math 259: Introduction to Analytic Number Theory
Stark's lower bound on $|\mathrm{disc}(K/\mathbf{Q})|$
Proposition: Let $K$ be a number field of absolute degree $n = r_1 + 2r_2$ which satisfies the Generalized Riemann Hypothesis. Then
\[ \log |D_K| > (\log 8\pi + C - o(1))\, n + (\pi/2)\, r_1 \qquad (S) \]
as $n \to \infty$, where $C = -\Gamma'(1) = .577\ldots$ is Euler's constant.
Proof: Start from (D), and use the fact that $-\zeta_K'/\zeta_K$ and its derivatives of even order with respect to $s$ are positive for $s > 1$, and the derivatives of odd order negative. Thus by differentiating (D) $m$ times ($m = 1, 2, 3, \ldots$) we find
\[ 0 > (-1)^m \Bigl[ r_1 \frac{d^m}{ds^m} \bigl(\log \pi - \psi(s/2)\bigr) + 2 r_2 \frac{d^m}{ds^m} \bigl(\log 2\pi - \psi(s)\bigr) \Bigr] + m! \Bigl[ 2 \sum_\rho \mathrm{Re}\, \frac{1}{[s - (\frac12 + i\gamma)]^{m+1}} - \frac{2}{(s-1)^{m+1}} - \frac{2}{s^{m+1}} \Bigr], \qquad (>) \]
where $\rho = 1/2 + i\gamma$. Our idea is now that for fixed $s > 1$ and large $n$ the term in $(s-1)^{-(m+1)}$ is negligible, and so by dividing the rest of (>) by $2^m m!$ and summing over $m$ we obtain (D) with $s$ replaced by $s - 1/2$ (Taylor expansion about $s$); since $\mathrm{Re}(1/(s - 1 - i\gamma))$ is still positive, we then find by bringing $s$ arbitrarily close to 1 that
\[ \log |D_K| > r_1 (\log \pi - \psi(1/4)) + 2 r_2 (\log 2\pi - \psi(1/2)) - o(n), \]
and thus obtain our Proposition from the known¹ special values
\[ \psi(1/2) = -\log 4 - C, \qquad \psi(1/4) = -\log 8 - \pi/2 - C. \]
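The two special values, and the numerical size of the resulting constants, can be checked with a short computation. This sketch assumes the series $\psi(s) + C = \sum_{n \ge 0} [1/(n+1) - 1/(n+s)]$ and simply truncates it, so the values are accurate only to about $|s - 1|$ divided by the number of terms; the function name and the hard-coded value of Euler's constant are ours.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant C, hard-coded


def psi(s, terms=10**6):
    """Digamma function via psi(s) = -C + sum_{n>=0} [1/(n+1) - 1/(n+s)].

    The series is truncated after `terms` terms; the error is roughly
    |s - 1| / terms, so this is a check, not a precise evaluation.
    """
    total = 0.0
    for n in range(terms):
        # 1/(n+1) - 1/(n+s) combined over a common denominator
        total += (s - 1.0) / ((n + 1.0) * (n + s))
    return total - EULER_C


def stark_constants():
    """exp of the per-degree constants in (S): complex and real places."""
    complex_part = math.exp(math.log(8 * math.pi) + EULER_C)
    real_part = math.exp(math.log(8 * math.pi) + EULER_C + math.pi / 2)
    return complex_part, real_part


if __name__ == "__main__":
    print(psi(0.5), -math.log(4) - EULER_C)
    print(psi(0.25), -math.log(8) - math.pi / 2 - EULER_C)
    print(stark_constants())
```

In terms of the root-discriminant $|D_K|^{1/n}$, the Proposition therefore gives asymptotic lower bounds of roughly 44.8 for totally complex fields and 215.3 for totally real fields, under GRH.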
To make this rigorous, we argue as follows: for any small $\epsilon > 0$, take $s_0 = 1 + \epsilon$, and pick an integer $M$ so large that (i) the values at $s = s_0 - 1/2$ of the $M$-th partial sums of the Taylor expansions of $\psi(s)$ and $\psi(s/2)$ about $s = s_0$ are within $\epsilon$ of $\psi(s_0 - 1/2)$ and $\psi(s_0/2 - 1/4)$ respectively (this is possible because both functions are analytic in a circle of radius $1 > 1/2$ about $s_0$); (ii) the value at $s = s_0 - 1/2$ of the $M$-th partial sum of the Taylor expansion of $\mathrm{Re}(1/(s - 1/2 - i\gamma))$ about $s = s_0$ is positive for all $\gamma > 0$ (note that since $\mathrm{Re}(1/(s_0 - 1 - i\gamma)) = \epsilon/(\epsilon^2 + \gamma^2)$, and the value of the $M$-th partial sum of the Taylor expansion differs from this by
\[ \mathrm{Re}\, \frac{1}{[1 + 2(\epsilon - i\gamma)]^M (\epsilon - i\gamma)} \ll (1 + \epsilon^2 + \gamma^2)^{-M/2}, \]
it's clear that the positive value $\epsilon/(\epsilon^2 + \gamma^2)$ dominates the error $(1 + \epsilon^2 + \gamma^2)^{-M/2}$ for all $\gamma$ once $M$ is sufficiently large). Now divide (>) by $2^m m!$, sum from $m = 0$ to $M - 1$ (using (D) for the $m = 0$ term), and set $s = s_0$ to obtain
\[ \log |D_K| > r_1 \bigl(\log \pi - \psi(s_0/2 - 1/4) - \epsilon\bigr) + 2 r_2 \bigl(\log 2\pi - \psi(s_0 - 1/2) - \epsilon\bigr) + O(1); \]
since $\epsilon$ was arbitrarily small and $s_0$ arbitrarily close to 1, we're done.
1 From the infinite product for $\Gamma(s)$ we have $\psi(s) + C = \sum_{n=0}^\infty [1/(n+1) - 1/(s+n)]$. Thus $\psi(1/2) + C = -2 \log 2$, while $\psi(1/4) - \psi(3/4) = -\pi$ and $\psi(1/4) + \psi(3/4) + 2C = -6 \log 2$ (why?), from which $\psi(1/4) = -\log 8 - \pi/2 - C$ follows.
See [Odlyzko 1975] for using Stark's method to obtain good lower bounds on $|D_K|$ for specific finite $r_1, r_2$.
Exercises.
1. Find a constant $B$ such that (O) holds under the weaker hypothesis that $\zeta_K$ has no zeros $\rho$ with $\mathrm{Re}\,\rho \ne 1/2$ and $0 < |\mathrm{Im}\,\rho| < B$.
2. Assume that the rational prime 2 splits completely in $K$, i.e. that the Euler product for $\zeta_K$ contains a factor $(1 - 2^{-s})^{-n}$. Assuming GRH for $K$, find a lower
bound (L) with larger ; . [NB even under this more restrictive condition,
class field theory yields towers of fields $K$ whose root-discriminant $|D_K|^{1/n}$ is bounded, indeed constant.] Generalize.
References
[CF 1967] Cassels, J.W.S., Fröhlich, A., eds.: Algebraic Number Theory. London: Academic Press 1967. [AB 9.67.2 (Reserve) / QA 241.A42]
[Marcus 1977] Marcus, D.A.: Number Fields. New York: Springer 1977. [AB 9.77.1 (Reserve) / QA247.M346]
[Odlyzko 1975] Odlyzko, A.M.: Some Analytic Estimates of Class Numbers and Discriminants. Inventiones Math. 29 (1975), 275-286.
[Serre 1975] Serre, J.-P.: Minorations de discriminants. #106 (pages 240-243) in his Collected Works III [O 9.86.1 (III) / QA3.S47]. See also page 710 for references to results concerning specific $(r_1, r_2)$ with or without GRH, and page 660 for the analogy with the Drinfeld-Vlăduţ estimates.
[Stark 1974] Stark, H.M.: Some Effective Cases of the Brauer-Siegel Theorem. Inventiones Math. 23 (1974), 135-152.
Math 259: Introduction to Analytic Number Theory
An application of Kloosterman sums
As promised, here is the analytic lemma from [Merel 1996]. The algebraic exponential sum that arises naturally here will also arise in our investigation of the coefficients of modular forms.
Fix a prime $p$ and a nonzero $c$ mod $p$. [More generally we might ask the same question for any integer $N$ and $c \in (\mathbf{Z}/N)^*$; see Exercise 1 below.] Let $I, J \subset (\mathbf{Z}/p)$ be intervals of size $A, B < p$. How many solutions $(x, y) \in I \times J$ are there to $xy \equiv c \bmod p$?
As usual we cannot reasonably hope for a meaningful exact answer, but on probabilistic grounds we expect the number to be roughly $AB/p$, and can analytically bound the difference between this and the actual number:
Lemma 1. The number $M$ of solutions $(x, y) \in I \times J$ of $xy \equiv c \bmod p$ is $AB/p + O(p^{1/2} \log^2 p)$.
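Proof: Let $\alpha, \beta : (\mathbf{Z}/p) \to \mathbf{C}$ be the characteristic functions of $I, J$. Then the number of solutions to our equation is
\[ M = \sum_{n \in (\mathbf{Z}/p)^*} \alpha(n)\, \beta(cn^{-1}). \]
As it stands, this "formula" for $M$ is just restating the problem. But we may expand $\alpha, \beta$ in discrete Fourier series:
\[ \alpha(x) = \sum_{a \bmod p} \hat\alpha(a)\, e_p(ax), \qquad \beta(x) = \sum_{b \bmod p} \hat\beta(b)\, e_p(bx), \]
where NB for $t \in (\mathbf{Z}/p)$ the notation $e_p(t)$ now means $e(t/p) = e^{2\pi i t/p}$, not $e(pt)$ as before. So, we have
\[ M = \sum_{x \in (\mathbf{Z}/p)^*} \mathop{\sum\sum}_{a, b \bmod p} \hat\alpha(a) \hat\beta(b)\, e_p(ax + bcx^{-1}) = \mathop{\sum\sum}_{a, b \bmod p} \hat\alpha(a) \hat\beta(b)\, K_p(a, bc), \qquad (M) \]
where for $a, b$ mod $p$ the Kloosterman sum $K_p(a, b)$ is defined by
\[ K_p(a, b) := \sum_{n \in (\mathbf{Z}/p)^*} e_p(an + bn^{-1}). \]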
[This comes from an interpretation of $K_p(a, b)$ as $\lambda + \bar\lambda$ where $\lambda$ is an eigenvalue of Frobenius for the "Artin-Schreier curve" $Y^p - Y = aX + b/X$, though even the connection with that curve is nontrivial; see [Weil 1948c] again, which as usual generalizes to finite fields which need not have prime order.] Putting this into (M) we find
\[ |M - \hat\alpha(0) \hat\beta(0) (p - 1)| < 2\sqrt p \sum_{a \bmod p} |\hat\alpha(a)| \sum_{b \bmod p} |\hat\beta(b)|. \]
But $\hat\alpha(0) = A/p$ and $\hat\beta(0) = B/p$. For nonzero $a, b$ mod $p$ we obtain $\hat\alpha(a), \hat\beta(b)$ as sums of geometric series and find (as in Pólya-Vinogradov)
\[ p\,|\hat\alpha(a)| \ll \{a/p\}^{-1}, \qquad p\,|\hat\beta(b)| \ll \{b/p\}^{-1}. \]
Thus $\sum_a |\hat\alpha(a)|, \sum_b |\hat\beta(b)| \ll \log p$, and Lemma 1 is proved. □
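Both ingredients of the proof are easy to experiment with for small primes. The sketch below is illustrative only: the function names are ours, $p$ is assumed prime, and modular inverses use Python's built-in `pow(n, -1, p)` (available from Python 3.8). It evaluates $K_p(a, b)$ directly and counts solutions of $xy \equiv c \bmod p$ in a box for comparison with $AB/p$.

```python
import cmath
import math


def kloosterman(a, b, p):
    """K_p(a, b) = sum over n in (Z/p)^* of e_p(a*n + b*n^{-1}), p prime."""
    return sum(
        cmath.exp(2j * math.pi * ((a * n + b * pow(n, -1, p)) % p) / p)
        for n in range(1, p)
    )


def count_solutions(I, J, c, p):
    """Number of (x, y) in I x J with x*y = c mod p, for c nonzero mod p."""
    in_J = {y % p for y in J}
    return sum(1 for x in I if x % p and (c * pow(x, -1, p)) % p in in_J)


if __name__ == "__main__":
    p = 1009
    K = kloosterman(1, 1, p)
    print(K, 2 * math.sqrt(p))          # K is real and |K| < 2*sqrt(p)
    A = B = 500
    M = count_solutions(range(1, A + 1), range(1, B + 1), 1, p)
    print(M, A * B / p)                 # actual count against the heuristic AB/p
```

The printed Kloosterman sum is real up to rounding and smaller than $2\sqrt p$ in absolute value, and the solution count is visibly close to $AB/p$.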
Corollary. ("Lemme 5" of [Merel 1996]) If $AB$ is a sufficiently high multiple
of $p^{3/2} \log^2 p$ then there are $x \in I$, $y \in J$ such that $xy \equiv c \bmod p$.
For instance it is enough for $A, B$ to both be sufficiently high multiples of
$p^{3/4} \log p$. Presumably $p^{1/2+\epsilon}$ suffices, but as far as I know even $p^\theta$ for any
$\theta < 3/4$ is a difficult problem. We can, however, remove the log factors from
the Corollary:
Lemma 2. Suppose $I, J \subset (\mathbf{Z}/p)$ are intervals of sizes $A, B$ with
$AB \ge 8p^{5/2}/(p-1)$. Then there are $x \in I$, $y \in J$ such that $xy \equiv c \bmod p$.
Proof: The idea is to replace $\alpha, \beta$ by functions $f, g : (\mathbf{Z}/p) \to [0,1]$ supported
on $I, J$ whose discrete Fourier coefficients decay more rapidly than $1/(p\{a/p\})$, and
sum to $O(1)$ instead of $O(\log p)$. This will yield an estimate on
\[ M' := \sum_{n \in (\mathbf{Z}/p)^*} f(n)\, g(cn^{-1}) \]
Thus if $M' = 0$ then $A'B' < 2p^{5/2}/(p-1)$ and $AB \le 4A'B' < 8p^{5/2}/(p-1)$,
Q.E.D.
Exercises.
0. Show that (unless $a, b$ both vanish mod $p$) the Kloosterman sum $K_p(a,b)$
depends only on $ab \bmod p$. In particular $K_p(a,b) \in \mathbf{R}$.
1. For any integer $N$ and any $a, b \in \mathbf{Z}/N\mathbf{Z}$, define
\[ K_N(a,b) := \sum_{n \in (\mathbf{Z}/N)^*} e_N(an + bn^{-1}). \]
Prove that if $N$ is squarefree then $K_N(a,b) = \prod_{p|N} K_p(a,b)$. Deduce results
analogous to our Lemmas 1, 2 for composite moduli. What can you say about
$K_{p^r}(a,b)$ for $r > 1$, and $K_N(a,b)$ for general $N$?
2. Show using only the "Riemann hypothesis" for elliptic curves over $\mathbf{F}_p$ that
$K_p(a,b) \ll p^{3/4}$. [Expand $|K_p(a,b)|^2$ and collect like terms. The point is that
while the bound is worse than (K), it is still effectively $o(p)$, which suffices for
many purposes (including Merel's), while the proof is more elementary in that
RH for elliptic curves is easier to prove (and was already done by Deuring)
and the resulting bound on $K_p(a,b)$ is obtained more directly than the one in
[Weil 1948].]
3. Let $p$ be an odd prime, and $\chi = (\cdot/p)$ the nontrivial real character mod $p$.
Evaluate the Salié sum
\[ S_p(a,b) := \sum_{n=1}^{p-1} \chi(n)\, e_p(an + bn^{-1}) \]
in closed form.
As with Gauss sums, there is an analogy between Kloosterman sums and certain
definite integrals, in this case the integral $\int_0^\infty \exp(-ax - b/x)\, dx/x$ which gives twice the
Bessel function $K_0(2\sqrt{ab})$. The Salié sum is analogous to $\int_0^\infty \exp(-ax - b/x)\, dx/\sqrt{x}$,
which involves a Bessel function $K_{1/2}$ of half-integer order and so (unlike the $K_\nu$ for
$\nu \in \mathbf{Z}$) is known in closed form. See for instance [GR 1980, 3.471 9. and 8.468] for the
relevant formulas.
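The Kloosterman sums $K_p(a,b)$ defined above are easy to tabulate for small $p$. The sketch below (the function name kloosterman and the sample values of p, a, b are illustrative assumptions) evaluates the defining sum directly in floating point. One can use it to observe, though of course not to prove, that the values are real, that pairs with the same product ab agree, and that the bound $|K_p(a,b)| < 2\sqrt{p}$ used in the proof of Lemma 1 holds.

```python
import cmath
import math

def kloosterman(a, b, p):
    """K_p(a,b) = sum over n in (Z/p)^* of e_p(a*n + b*n^(-1)), by direct summation."""
    total = 0j
    for n in range(1, p):
        n_inv = pow(n, -1, p)  # inverse of n mod p (Python 3.8+)
        t = (a * n + b * n_inv) % p
        total += cmath.exp(2j * cmath.pi * t / p)
    return total

p = 101  # a small prime, chosen only for illustration
for a, b in [(1, 1), (2, 3), (6, 1), (5, 7)]:
    K = kloosterman(a, b, p)
    print(f"K_{p}({a},{b}) = {K.real:+.6f}, |imaginary part| = {abs(K.imag):.1e}, "
          f"2*sqrt(p) = {2 * math.sqrt(p):.6f}")
```

The pairs (2, 3) and (6, 1) have the same product mod p, so their rows can be compared in light of Exercise 0.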
References
[GR 1980] Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products.
New York: Academic Press 1980. [D 9.80.1 / basement reference
QA55.G6613]
[Merel 1996] Merel, L.: Bornes pour la torsion des courbes elliptiques sur les
corps de nombres. Invent. Math. 124 (1996), 437-449.
[Weil 1948] Weil, A.: On some exponential sums. Item 1948c (pages 386-389)
in his Collected Papers I [O 9.79.1 (I) / QA3.W43].
Math 259: Introduction to Analytic Number Theory
An upper bound on the coefficients of a $\mathrm{PSL}_2(\mathbf{Z})$ cusp form
Fix an integer $k > 1$, and let $M_k^0$ be the space of cusp forms of weight $2k$ for
$G = \mathrm{PSL}_2(\mathbf{Z})$. We have seen that this is a finite-dimensional vector space. Moreover
it carries a Hermitian (Petersson) pairing [Serre 1973, VII 5.6.1 (p.105)]:
\[ \langle f, g \rangle = \iint_{H/G} f(z)\, \overline{g(z)}\, y^{2k-2}\, dx\, dy. \]
Now for each integer $n > 0$ the map taking a cusp form $f = \sum_{m=1}^\infty a_m q^m$ to $a_n$
is a linear functional on $M_k^0$. Thus there is a unique $P_n \in M_k^0$ that represents
this functional:
\[ \langle f, P_n \rangle = a_n(f) \]
for all $f \in M_k^0$. Moreover the $P_n$ for $n \le \dim M_k^0$ constitute a basis for $M_k^0$:
indeed the orthogonal complement of the linear span of these $P_n$ is the subspace
of $f \in M_k^0$ whose first $\dim M_k^0$ coefficients vanish, and we have seen that 0 is the
only such $f$. So, an upper bound $a_r(P_n) \ll_n r^\theta$ for all $n \le \dim M_k^0$ will yield
$a_r(f) \ll_f r^\theta$ for all $f \in M_k^0$.
Remarkably we can obtain $P_n$ and its $q$-expansion in an explicit enough form (a
"Poincaré series") to obtain such an inequality for all $\theta > k - \frac14$, and the proof
uses Weil's bound [Weil 1948] on Kloosterman sums! (See for instance [Selberg
1965, §3]; thanks to Peter Sarnak for this reference and for first introducing
me to this approach. It is now known that in fact the correct $\theta$ is $k - \frac12 + \epsilon$,
but Deligne's proof of this is quite deep, and is not as generally applicable: the
Poincaré-series method still yields the sharpest bounds known for many other
kinds of modular forms.)
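To get a feeling for the two exponents, one can tabulate the coefficients of the weight-12 cusp form $\Delta = q \prod_{m \ge 1} (1 - q^m)^{24}$ (so $2k = 12$, $k = 6$). The following is only a numerical sketch; the function names and the cutoff N are illustrative assumptions. It expands the product and compares $|a_r|$ with $r^{k-1/4} = r^{5.75}$ and with Deligne's bound in the standard form $|a_r| \le d(r)\, r^{11/2}$, where $d(r)$ is the number of divisors of $r$.

```python
def delta_coefficients(N):
    """Return [a_1, ..., a_N], the q-expansion coefficients of q * prod_{m>=1} (1 - q^m)^24."""
    # coeffs[j] = coefficient of q^j in prod (1 - q^m)^24, truncated at degree N-1
    coeffs = [0] * N
    coeffs[0] = 1
    for m in range(1, N):
        for _ in range(24):
            # multiply in place by (1 - q^m), from the highest degree down
            for j in range(N - 1, m - 1, -1):
                coeffs[j] -= coeffs[j - m]
    return coeffs  # shifting by the factor q, coeffs[j] is a_{j+1}

def num_divisors(r):
    return sum(1 for t in range(1, r + 1) if r % t == 0)

N = 30  # illustrative cutoff
tau = delta_coefficients(N)
for r in range(1, N + 1):
    a_r = tau[r - 1]
    print(r, a_r,
          round(abs(a_r) / r ** 5.75, 4),                    # versus r^(k - 1/4)
          round(abs(a_r) / (num_divisors(r) * r ** 5.5), 4))  # versus d(r) r^(k - 1/2)
```

The last column never exceeds 1, while the third column drifts toward 0, in line with the remark that $k - \frac12 + \epsilon$ is the correct exponent.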
We begin by observing that for any $f(z) = \sum_{n=1}^\infty a_n q^n$ the coefficient $a_n$ may
be isolated from the absolutely convergent double integral
\[ \iint_{0<x<1} \overline{e^{2\pi i n z}}\, f(z)\, y^{2k-2}\, dx\, dy = a_n \int_0^\infty e^{-4\pi n y}\, y^{2k-2}\, dy = \frac{(2k-2)!}{(4\pi n)^{2k-1}}\, a_n. \]
Now the region we're integrating over is a fundamental domain for the action
of $\langle T \rangle$ on $H$. We decompose this as the union (with only boundary overlaps) of
$G$-images of the fundamental domain for the action of $G$. That is, we split up
the integral as
\[ \sum_g \iint_D \overline{e^{2\pi i n g(z)}}\, f(g(z))\, y^{2k}(g(z))\, \frac{dx\, dy}{y^2} \]
where $D$ is a fundamental domain for $H/G$ and $g$ ranges over coset representatives
of $\langle T \rangle \backslash G$. But these cosets amount to coprime pairs $(c,d)$ of integers with
$c > 0$ or $c = 0, d = 1$. Moreover, we have for $g(z) = (az+b)/(cz+d)$
\[ f(g(z))\, y^{2k}(g(z)) = (cz+d)^{2k}\, y^{2k}(g(z))\, f(z) = f(z)\, y^{2k}(z) / \overline{(cz+d)}^{\,2k}. \]
So, we find
\[ \frac{(2k-2)!}{(4\pi n)^{2k-1}}\, a_n = \iint_D f(z)\, \overline{\sum_{c,d} (cz+d)^{-2k} \exp\Bigl(2\pi i n\, \frac{az+b}{cz+d}\Bigr)}\, y^{2k-2}\, dx\, dy. \]
Therefore the double sum is $(4\pi n)^{1-2k} (2k-2)!\, P_n$, provided we can show that it
is in fact a cusp form, which, however, is surprisingly easy. [To do away with
the requirement that $c > 0$ or $(c,d) = (0,1)$ we may sum over all coprime pairs
$(c,d)$, then divide by 2. The sum converges absolutely because it is dominated
by the sum defining the Eisenstein series $E_k$: the factors $e(ng(z))$ all have
absolute value $< 1$.] We thus have:
\[ \frac{(2k-2)!}{(4\pi n)^{2k-1}}\, P_n(z) = \sum_{c,d} (cz+d)^{-2k} \exp\Bigl(2\pi i n\, \frac{az+b}{cz+d}\Bigr). \tag{$\Sigma\Sigma$} \]
(Note that the exponential factor does not depend on the choice of $a, b \in \mathbf{Z}$ such
that $ad - bc = 1$.)
We next determine the $q$-expansion of the Poincaré series $P_n$. The term $(c,d) =
(0,1)$ contributes $q^n$ to the sum. We group the remaining terms according to $c$
and $d \bmod c$. [The existence of a $q$-expansion is equivalent to $T$-invariance, so to
obtain the $q$-expansion we collect the $(az+b)/(cz+d)$ into $\langle T \rangle$-orbits, which is to
say that we now consider $P_n$ as a sum over the double coset space $\langle T \rangle \backslash G / \langle T \rangle$.]
Fix coprime $c, d_0$ with $c > 0$, and $a_0, b_0$ such that $a_0 d_0 - b_0 c = 1$. Then the terms
of the sum ($\Sigma\Sigma$) with $d \equiv d_0 \bmod c$ have $(a,b,c,d) = (a_0, b_0 + m a_0, c, d_0 + mc)$
for $m \in \mathbf{Z}$, and thus contribute
\[ \sum_{m \in \mathbf{Z}} (c(z+m) + d_0)^{-2k} \exp\Bigl(2\pi i n\, \frac{a_0 (z+m) + b_0}{c(z+m) + d_0}\Bigr). \tag{$\Sigma$} \]
By Poisson summation this is $\sum_{r \in \mathbf{Z}} u_r$ where
\[ u_r := \int_{-\infty}^{\infty} (c(z+t) + d_0)^{-2k} \exp\Bigl(2\pi i n\, \frac{a_0 (z+t) + b_0}{c(z+t) + d_0}\Bigr)\, e^{-2\pi i r t}\, dt. \tag{$\int$} \]
with the contour of integration $C$ passing above the essential singularity at
$w = 0$. Note that the integral depends only on $n, r, c$ but not on $d_0$; the
dependence on $d_0$ is entirely contained in the factor $e^{2\pi i (n a_0 + r d_0)/c}$, in which $a_0$ is
Exercises
1. Show that the modular cusp form $\Delta(q) = q - 24q^2 + 252q^3 \cdots$ of weight 12
(called $(2\pi)^{-12} \Delta$ in [Serre 1973]) is given by the formula
\[ \Delta(q) = \Bigl( \sum_{n=1}^\infty n\, \chi_4(n)\, q^{n^2/8} \Bigr)^8 = (q^{1/8} - 3q^{9/8} + 5q^{25/8} - 7q^{49/8} + - \cdots)^8. \]
[NB the sum is essentially the modified theta series $\vartheta_{\chi_4}$ that we used to prove
the functional equation for $L(s, \chi_4)$. Note that it is a "modular cusp form
of weight 3/2" for some arithmetic subgroup of $G$ whose $q^{n/8}$ coefficients are
$O(n^{1/4})$ in mean square but only $O(n^{1/2})$, not $O(n^{1/4+\epsilon})$, individually; so
here the $P_n$-type bound is essentially best possible.] Using the Jacobi product
for $\Delta$, conclude that
\[ \prod_{n=1}^\infty (1 - q^n)^3 = 1 - 3q + 5q^3 - 7q^6 + 9q^{10} - + \cdots. \]
2. Let $f = \sum_{n>0} a_n q^n$ be a cusp form of weight $2k$. Use the boundedness of
$y^{2k} |f(z)|^2$ to prove that $\sum_{n<N} |a_n|^2 \ll_f N^{2k}$. [In other words $a_n \ll n^{k-1/2}$ in
mean square. Note that Hecke's estimate $|a_n| \ll_f n^k$ follows immediately.]
3. Let $f = \sum_{n>0} a_n q^n$ be a cusp form of weight $2k$, and let $L_f(s) = \sum_n a_n n^{-s}$
be the associated L-function (called $\Phi_f(s)$ in [Serre 1973, p.103]). Use the
integral representation of $L_f$ to prove that $L_f(\sigma + it) \ll_{f,\sigma} |t|^{\theta_k(\sigma)}$ for some
$\theta_k(\sigma) < \infty$. How small a $\theta_k(\sigma)$ can you obtain? [As usual, it is conjectured à la
Lindelöf that $L_f(\sigma + it) \ll_{f,\epsilon} |t|^\epsilon$ for all $\sigma \ge k$.]
4. Verify that in fact $P_n \in M_k^0$.
5. Verify that our final estimate on () follows from the $J_{2k-1}$ asymptotics
cited. Since $|K_c(n,r)| / \sqrt{nc}$ is actually $\ll \prod_{p|c} 2$, which in turn is bounded
by the number of factors of $c$, we can make the $r^\epsilon$ factor more precise; show
that in fact $\log r$ suffices, i.e. the $q^r$ coefficient of a cusp form of weight $2k$ is
$O(r^{k-1/4} \log r)$.
6. For each even $k = 2, 4, 6, \ldots$ there is a unique $f = \sum_{n=0}^\infty a_n q^n \in M_k$ of the
form $1 + O(q^{\lfloor k/6 \rfloor + 1})$, i.e. such that $a_0 = 1$ and $a_1 = a_2 = \cdots = a_{\lfloor k/6 \rfloor} = 0$.
(Why?) Prove that $a_{\lfloor k/6 \rfloor + 1} > 0$. [This is a bit tricky, requiring the residue
formula and the fact that $1/\Delta = q^{-1} + 24 + 324q + 3200q^2 + \cdots$ has positive
coefficients, a fact that can be deduced from the Jacobi product for $\Delta$.]
Conclude that an even unimodular lattice in dimension $4k$ has a vector of norm
at most $2(\lfloor k/6 \rfloor + 1)$.
Can the minimal norm be that large? Such lattices exist for several small $k$, including
$k = 2, 4, 6, \ldots, 16$, but it is known that for all but finitely many $k$ the minimal norm
is always strictly smaller, indeed $< 2(\lfloor k/6 \rfloor - \mu)$ once $k > k(\mu)$ for some effectively
computable $k(\mu)$. This is shown by proving that there is no suitable modular form all
of whose coefficients are nonnegative. Still many open questions remain; for instance
it is not even known whether there is an even unimodular lattice of dimension 72 and
minimal norm 8. How many minimal vectors would such a lattice have? See [CS 1993]
for more along these lines, especially p.194 and thereabouts.
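Two of the $q$-series facts quoted in Exercises 1 and 6 can at least be checked numerically before one sets out to prove them. The sketch below (the function name euler_product_power and the truncation orders are illustrative assumptions) expands $\prod_{m \ge 1} (1 - q^m)^e$ for positive or negative integer $e$, compares the case $e = 3$ with the series $1 - 3q + 5q^3 - 7q^6 + \cdots$, and lists the first coefficients of $q/\Delta = \prod (1 - q^m)^{-24}$.

```python
def euler_product_power(exponent, N):
    """Coefficients of prod_{m>=1} (1 - q^m)^exponent up to q^(N-1); exponent is any integer."""
    coeffs = [0] * N
    coeffs[0] = 1
    for m in range(1, N):
        for _ in range(abs(exponent)):
            if exponent > 0:
                # multiply by (1 - q^m), highest degree first
                for j in range(N - 1, m - 1, -1):
                    coeffs[j] -= coeffs[j - m]
            else:
                # divide by (1 - q^m), i.e. multiply by 1 + q^m + q^(2m) + ...
                for j in range(m, N):
                    coeffs[j] += coeffs[j - m]
    return coeffs

N = 16
cube = euler_product_power(3, N)

# Right-hand side of the identity in Exercise 1: sum of (-1)^m (2m+1) q^(m(m+1)/2)
jacobi = [0] * N
m = 0
while m * (m + 1) // 2 < N:
    jacobi[m * (m + 1) // 2] = (-1) ** m * (2 * m + 1)
    m += 1
print("cube identity holds up to q^%d:" % (N - 1), cube == jacobi)

# q / Delta = prod (1 - q^m)^(-24): coefficients of q^(-1), q^0, q^1, ... in 1/Delta
inverse = euler_product_power(-24, 8)
print("1/Delta coefficients:", inverse)
```

Agreement of finitely many coefficients is of course no substitute for the proofs the exercises ask for.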
References
[CS 1993] Conway, J.H., Sloane, N.J.A.: Sphere Packings, Lattices and Groups.
New York: Springer 1993.
[GR 1980] Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products.
New York: Academic Press 1980. [D 9.80.1 / basement reference
QA55.G6613]
[Selberg 1965] Selberg, A.: On the estimation of Fourier coefficients of modular
forms, #33 (506-520) in his Collected Papers I [O 9.89.2 (I)].
[Serre 1973] Serre, J.-P.: A Course in Arithmetic. New York: Springer, 1973
(GTM 7). [AB 9.70.4 (reserve case) / QA243.S4713]
[Watson 1944] Watson, G.N.: A treatise on the theory of Bessel functions (2nd
ed.) New York: Macmillan 1944. [HA 9.44.6 / QA408.W2]
[Weil 1948] Weil, A.: On some exponential sums. Item 1948c (pages 386-389)
in his Collected Papers I [O 9.79.1 (I) / QA3.W43].
Math 259: Introduction to Analytic Number Theory
The Selberg (quadratic) sieve and some applications
For our last topic we return to an elementary and indeed naive approach to the
distribution of primes: an integer $p$ is prime if and only if it is not divisible
by the primes $\le \sqrt{p}$; but half the integers are odd, 2/3 are not multiples of 3,
4/5 not multiples of 5, etc., and divisibility by any prime is independent of
divisibility by finitely many other primes, so... Moreover if $p$ is restricted
to an arithmetic progression $a \bmod q$ with $(a,q) = 1$ then the same factors
$(l-1)/l$ arise except those for which $l | q$, whence the appearance of $q/\phi(q)$ in
the asymptotic formula for $\pi(qx, a \bmod q)$.
The problem with estimating $\pi(x)$ etc. this way is that the divisibilities aren't
quite independent. This is already implicit in our trial-division test for primality:
if $p$ is known to contain no primes $\le \sqrt{p}$, the conditional probability that it be
a multiple of some other prime $l \in (\sqrt{p}, p)$ is not $1/l$ but zero. Already for
small $l$, the number of $n < x$ divisible by $l$ is not quite $x/l$ but $x/l + O(1)$,
and similarly for $n$ divisible by a product of distinct primes; so if we try to use
inclusion-exclusion to recover the number of primes, or even of $n$ not divisible
by the first $r$ primes, we get an estimate of $x \prod_p (1 - \frac1p)$ as expected, but with
an "error term" $O(2^r)$ that swamps the estimate long before $r$ can get usefully
large.
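The inclusion-exclusion count just described can be carried out literally for a handful of primes. In the following sketch (the function name sifted_count and the sample values are illustrative assumptions, and we count $n \le x$ rather than $n < x$) each of the $2^r$ terms $\pm \lfloor x/d \rfloor$ differs from $\pm x/d$ by less than 1, which is exactly where the error term $O(2^r)$ comes from.

```python
from itertools import combinations
from math import prod

def sifted_count(x, primes):
    """Exact number of 1 <= n <= x divisible by none of the given primes,
    computed by inclusion-exclusion over all 2^r squarefree products."""
    total = 0
    for size in range(len(primes) + 1):
        for subset in combinations(primes, size):
            d = prod(subset)  # the empty product is 1
            total += (-1) ** size * (x // d)
    return total

# Illustrative values (assumptions, not from the notes)
x = 1000
primes = [2, 3, 5, 7, 11, 13]

count = sifted_count(x, primes)
main_term = x * prod(1 - 1 / p for p in primes)
print("exact count       :", count)
print("x * prod(1 - 1/p) :", main_term)
print("2^r               :", 2 ** len(primes))
```

With r = 6 the bound 2^r = 64 is already a sizable fraction of the main term, and each further prime doubles it.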
Still, in "sieve" situations like this, where we have a set $S$ of $A$ integers such
that $\#(S \cap D\mathbf{Z})/A$ is approximated by a multiplicative function $\alpha(D)$ of the
squarefree integer $D$ (for instance if $S$ is an interval then $\alpha(D) = 1/D$), there
are various ways of bounding from above the size of the set of $n \in S$ not divisible
by any of a given set of primes. These different sieve inequalities use a variety of
methods, but curiously yield similar bounds in many important contexts, often
bounds that asymptotically exceed by a factor of 2 the expected number. We
shall develop one of the most general such inequalities, due to Selberg, and give
some typical examples of its use in analytic number theory. While we state
Selberg's sieve in the context of divisibility, in fact all that we are using is that
each prime $p$ sifts out a subset of $S$ and that the probabilities that a random
$n \in S$ survives these tests for different $p$ are approximately independent; thus
Selberg's sieve has a counterpart in the context of probability theory, for which
see the final exercise herein. Selberg's and many other sieves are collected in
[Selberg 1969]; nice applications of sieve inequalities to other kinds of problems
in number theory are interspersed throughout [Serre 1992].
Assume, then, that $a_n$ ($n \in \mathbf{Z}$) are nonnegative real numbers with
$\sum_{n \in \mathbf{Z}} a_n = A < \infty$. For each squarefree $d > 0$ let
\[ A_d := \sum_{m \in \mathbf{Z}} a_{md} = A \alpha(d) + r(d) \]
where $\alpha$ is a multiplicative function with $0 \le \alpha(d) \le 1$ for each $d$ (equivalently,
for each prime $d$). We are interested in
\[ A(D) := \sum_{(n,D)=1} a_n. \]
We hope that $A(D)$ is approximately $A \prod_{p|D} (1 - \alpha(p))$, with an error that is
usefully small if the $r(d)$ are. What we can show is
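Theorem (Selberg): For each $z \ge 1$ we have
\[ A(D) \le \frac{A}{S(D,z)} + R(D,z), \tag{S} \]
where $S, R$ are defined by
\[ S(D,z) := \sum_{d|D,\ d \le z}\ \prod_{p|d} \frac{\alpha(p)}{1 - \alpha(p)}, \qquad R(D,z) := \sum_{d|D,\ d < z^2} 3^{\omega(d)} |r(d)| \]
and $\omega(d) := \sum_{p|d} 1$.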
Remarks: as $z$ grows, $R(D,z)$ increases while $1/S(D,z)$ decreases, with $S(D,z)$
tending as $z \to \infty$ to
\[ S(D,D) = \sum_{d|D}\ \prod_{p|d} \frac{\alpha(p)}{1 - \alpha(p)} = \prod_{p|D} \Bigl( 1 + \frac{\alpha(p)}{1 - \alpha(p)} \Bigr) = \prod_{p|D} \frac{1}{1 - \alpha(p)}, \]
so $1/S(D,z) \to \prod_{p|D} (1 - \alpha(p))$ as expected. Note, however, that (S) is only an
upper bound: we do not claim that $|A(D) - A/S(D,z)| \le R(D,z)$.
Typically we will let $D = D(y) = \prod_{p \le y} p$. For instance, if $a_n$ is the characteristic
function of an arithmetic progression of length $A$ with common difference $q$ then
$A(D(y))$ is an upper bound on $\pi(x_0 + Aq; a \bmod q) - \pi(x_0; a \bmod q)$. Of course
we are interested in the case $(a,q) = 1$. We take $\alpha(n) = 1/n_1$ where $n_1$ is
the largest factor of $n$ coprime with $q$. Then $|r(d)| \le 1$ for each $d$, and so
$R(D,z)$ is bounded by the sum of the $n^{-s}$ coefficients of $\zeta^3(s)$ for $n \le z^2$, so is
$\ll (z \log z)^2$. [An equivalent and more elementary way to handle $\sum_{n \le x} 3^{\omega(n)}$ is
to note that $3^{\omega(n)}$ is at most the number of representations $n = n_1 n_2 n_3$ of $n$ as
a product of three positive integers.] As to $S(D,z)$, we expand $\alpha/(1 - \alpha)$ in a
geometric series to find
\[ S(D,z) > \sum_{n<z,\ (n,q)=1} 1/n = \frac{\phi(q)}{q} \log z + O(1). \]
Thus Selberg's bound (S) is $(q/\phi(q)) A / \log z + O(z^2 \log^2 z)$, and by choosing
$z = A^{1/2} / \log^3 A$ we find the upper bound
\[ \frac{q}{\phi(q)} \Bigl( 2 + O\Bigl( \frac{\log\log A}{\log A} \Bigr) \Bigr) \frac{A}{\log A}. \]
The implied constant depends on $q$, but tractably so, without invoking zeros
of L-functions and the like; if the coefficient 2 were any smaller this would be
enough to banish the Siegel zero!
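It is instructive to see how much room this upper bound leaves in practice. The sketch below is only an illustration under stated assumptions: the function names and the sample values of x0, A, q, a are ours, and the comparison is with the leading term $2 (q/\phi(q)) A / \log A$ alone, ignoring the $O(\log\log A / \log A)$ correction. It counts the primes among the $A$ terms $n \equiv a \bmod q$ with $x_0 < n \le x_0 + Aq$.

```python
from math import gcd, log

def prime_flags(limit):
    """Sieve of Eratosthenes: flags[n] is True exactly when n is prime, for 0 <= n <= limit."""
    flags = [True] * (limit + 1)
    flags[0] = False
    if limit >= 1:
        flags[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, limit + 1, i):
                flags[j] = False
    return flags

def euler_phi(q):
    return sum(1 for t in range(1, q + 1) if gcd(t, q) == 1)

def primes_in_progression(x0, A, q, a, flags):
    """Number of primes n = a (mod q) with x0 < n <= x0 + A*q (a window of exactly A terms)."""
    first = x0 + 1 + ((a - (x0 + 1)) % q)  # smallest n > x0 in the progression
    return sum(1 for n in range(first, x0 + A * q + 1, q) if flags[n])

# Illustrative values (assumptions, not from the notes)
x0, A, q, a = 10 ** 5, 2000, 7, 3
flags = prime_flags(x0 + A * q)

count = primes_in_progression(x0, A, q, a, flags)
leading_term = 2 * (q / euler_phi(q)) * A / log(A)
print("primes in the window      :", count)
print("2 (q/phi(q)) A / log A    :", leading_term)
```

The leading term depends only on the length of the window and not on its starting point x0, so for windows far out the actual count sits well below it.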
Proof of (S): Let $\lambda_d$ ($d | D$) be arbitrary real parameters with $\lambda_1 = 1$ (and
eventually $\lambda_d = 0$ once $d > z$). Then
\[ A(D) \le \sum_n a_n \Bigl( \sum_{d | (n,D)} \lambda_d \Bigr)^2 = \sum_{d_1, d_2 | D} \lambda_{d_1} \lambda_{d_2} \sum_{[d_1,d_2] | n} a_n, \]
where $[d_1,d_2] := \mathrm{lcm}(d_1,d_2)$. The inner sum is just $A_{[d_1,d_2]}$, so we have
\[ A(D) \le \sum_{d_1, d_2 | D} \lambda_{d_1} \lambda_{d_2} \bigl( A \alpha([d_1,d_2]) + r([d_1,d_2]) \bigr) \le AQ + R, \]
where $Q$ is the quadratic form
\[ Q := \sum_{d_1, d_2 | D} \alpha([d_1,d_2])\, \lambda_{d_1} \lambda_{d_2} \]
in the $\lambda_d$, and
\[ R := \sum_{d_1, d_2 | D} |\lambda_{d_1} \lambda_{d_2}|\, |r([d_1,d_2])|. \]
Now for $d | D$ the number of pairs $d_1, d_2$ such that $d = [d_1,d_2]$ is $3^{\omega(d)}$ (why?);
thus (S) will follow from the following
Lemma: The minimum of the quadratic form $Q$ subject to the conditions
$\lambda_1 = 1$ and $d > z \Rightarrow \lambda_d = 0$ is $1/S(D,z)$ and is attained by $\lambda_d$ with $|\lambda_d| \le 1$.
Proof of Lemma: by continuity we may assume that $0 < \alpha(d) < 1$ for all $d | D$.
(In fact for our purpose we can exclude from the start the possibilities $\alpha(d) = 0$
or 1; do you see why?) Since $\alpha$ is multiplicative and $[d_1,d_2] (d_1,d_2) = d_1 d_2$
we have
\[ Q = \sum_{d_1, d_2 | D} \frac{[\alpha(d_1) \lambda_{d_1}]\, [\alpha(d_2) \lambda_{d_2}]}{\alpha((d_1,d_2))}. \]
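Writing $1/\alpha(d) = \sum_{e|d} \beta(e)$ and $x(e) := \sum_{e | d | D} \alpha(d) \lambda_d$, this becomes
$Q = \sum_{e|D} \beta(e)\, x(e)^2$.
By Möbius inversion we find
\[ \beta(e) = \prod_{p|e} \frac{1 - \alpha(p)}{\alpha(p)}, \qquad \lambda_d = \frac{1}{\alpha(d)} \sum_{d|e} \mu(e/d)\, x(e). \]
Our conditions on the $\lambda_d$ then become
\[ \sum_{e|D} \mu(e)\, x(e) = \alpha(1) \lambda_1 = 1, \qquad e > z \Rightarrow x(e) = 0. \]
By Schwarz, the minimum of $Q$ subject to these conditions is
\[ \Bigl[ \sum_{e|D,\ e \le z} \frac{1}{\beta(e)} \Bigr]^{-1} = 1/S(D,z), \]
and is attained at $x(e) = \mu(e) / (\beta(e) S(D,z))$ for $e \le z$. This yields
\[ S(D,z)\, \lambda_d = \frac{\mu(d)}{\alpha(d)} \sum_{d | e | D,\ e \le z} \frac{1}{\beta(e)}. \]
But we have
\[ \frac{1}{\alpha(d) \beta(d)} = \sum_{e|d} \frac{1}{\beta(e)} \]
since $\alpha, \beta$ are both multiplicative. Thus we have
\[ S(D,z)\, \lambda_d = \mu(d) \sum_{e | d}\ \sum_{f | (D/d),\ df \le z} \frac{1}{\beta(ef)}, \]
with each $ef \le z$ and no $ef$ values repeated. Thus the sum is at most $S(D,z)$,
so $|\lambda_d| \le 1$ as claimed. This concludes the proof of the Lemma, and thus also
of Selberg's inequality (S).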
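The Lemma can be checked in exact arithmetic for a small $D$. The sketch below is an illustration under our own assumptions (the function name selberg_weights, the choice $\alpha(p) = 1/p$, and the sample primes and $z$ are not from the notes). It represents each divisor of $D$ by its set of prime factors, builds $x(e) = \mu(e)/(\beta(e) S(D,z))$ for $e \le z$, recovers the $\lambda_d$ by Möbius inversion, and evaluates the quadratic form $Q$ directly from its definition.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def selberg_weights(primes, z, alpha):
    """For squarefree D = prod(primes) and alpha(p) in (0,1), return (value, lam, S, Q):
    value[s] is the divisor with prime set s, lam[s] its weight lambda_d,
    S = S(D,z), and Q the quadratic form evaluated at these weights."""
    subsets = [frozenset(c) for r in range(len(primes) + 1)
               for c in combinations(primes, r)]
    value = {s: prod(s) for s in subsets}
    alpha_of = {s: prod((alpha(p) for p in s), start=Fraction(1)) for s in subsets}
    beta_of = {s: prod(((1 - alpha(p)) / alpha(p) for p in s), start=Fraction(1))
               for s in subsets}
    S = sum(1 / beta_of[s] for s in subsets if value[s] <= z)
    # x(e) = mu(e) / (beta(e) S) for e <= z, and 0 otherwise; mu(e) = (-1)^(number of primes)
    x = {s: (Fraction((-1) ** len(s)) / (beta_of[s] * S) if value[s] <= z else Fraction(0))
         for s in subsets}
    # lambda_d = (1/alpha(d)) * sum over e divisible by d of mu(e/d) x(e)
    lam = {d: sum((-1) ** (len(e) - len(d)) * x[e] for e in subsets if d <= e) / alpha_of[d]
           for d in subsets}
    # Q = sum of alpha(lcm(d1,d2)) lambda_d1 lambda_d2; lcm corresponds to union of prime sets
    Q = sum(alpha_of[d1 | d2] * lam[d1] * lam[d2] for d1 in subsets for d2 in subsets)
    return value, lam, S, Q

# Illustrative choice (an assumption): D = 2*3*5*7, z = 10, alpha(p) = 1/p
value, lam, S, Q = selberg_weights([2, 3, 5, 7], 10, lambda p: Fraction(1, p))
print("S(D,z) =", S, "  Q =", Q, "  1/S =", 1 / S)
for s in sorted(lam, key=lambda s: value[s]):
    print("d =", value[s], " lambda_d =", lam[s])
```

Because everything is done with Fraction, the equality $Q = 1/S(D,z)$ and the bound $|\lambda_d| \le 1$ can be read off exactly rather than up to rounding.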
Exercises.
1. Prove that for each integer $n > 0$ the number of primes $p < x$ such that $p + 2n$
is also prime is $O_n(x / \log^2 x)$. In particular, conclude that the sum
\[ \frac13 + \frac15 + \frac15 + \frac17 + \frac1{11} + \frac1{13} + \frac1{17} + \frac1{19} + \frac1{29} + \frac1{31} + \frac1{41} + \frac1{43} + \cdots \]
of the reciprocals of twin primes converges.
2. Prove that there are at most $((8/\pi) + o(1))\, x / \log x$ integers $n < x$ such that
$n^2 + 1$ is a prime. Generalize. [It is of course an outstanding open problem to
find a similar lower bound on the number of such $n$, or even to prove that it is
unbounded as $x \to \infty$, i.e. to prove that there are infinitely many primes of the
form $n^2 + 1$, or more generally $P(n)$ where $P$ is an irreducible polynomial
such that $P(n) \not\equiv 0 \bmod p$ has a solution mod $p$ for each prime $p$. Dirichlet's
theorem is the case of linear polynomials; the conjecture has yet to be proven
for a single polynomial $P(n)$ of degree 2 or greater.]
3. Let $p_i$ ($i \in [m] := \{1, 2, \ldots, m\}$) be probabilities, i.e. real numbers in $[0,1]$,
and let $E_1, \ldots, E_m$ be events approximating independent events with those
probabilities, i.e. such that for each $I \subseteq [m]$ the probability that $E_i$ occurs for
all $i \in I$ is $\prod_{i \in I} p_i + r(I)$. Obtain upper bounds on the probability that none
of the $E_i$ occurs, bounds which correspond to and/or generalize Selberg's (S).
References
[Selberg 1969] Selberg, A.: Lectures on Sieves, pages 66-247 of his Collected
Papers II [O 9.89.2 (II)]
[Serre 1992] Serre, J.-P.: Topics in Galois Theory. Boston: Jones and Bartlett
1992. [BB 9.92.12 / QA214.S47]