
Stability of Flows

S Friedlander, University of Illinois-Chicago, Chicago, IL, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

This article gives a brief discussion of a topic with an enormous literature, namely the stability/instability of fluid flows. Following the seminal observations and experiments of Reynolds in 1883, the issue of stability of a fluid flow became one of the central problems in fluid dynamics: stable flows are robust under inevitable disturbances in the environment, while unstable flows may break up, sometimes rapidly. These possibilities were demonstrated in a relatively simple experiment where flow in a pipe is examined at increasing speeds. As a dimensionless parameter (now known as the Reynolds number) increases, the flow completely changes its nature from a stable flow to a completely different regime that is irregular in space and time. Reynolds called this “turbulence” and observed that the transition from the simple flow to the chaotic flow was caused by the phenomenon of instability.

Even though the topic has been the subject of intense study over more than a century, Reynolds’ experiment is still not fully explained by current theory. Although there is no rigorous proof of stability of the simple flow (known as Poiseuille flow in a circular pipe), analytical and numerical investigations of the equations suggest theoretical stability for all Reynolds numbers. However, experiments show instability for sufficiently large Reynolds numbers. A plausible explanation for this phenomenon is the instability of such flows with respect to small but finite disturbances combined with their stability to infinitesimal disturbances.

The issue of fluid stability, in contexts much more complex than the fundamental experiment of Reynolds, arises in a multitude of branches of science, including engineering, physics, astrophysics, oceanography, and meteorology. It is far beyond the scope of this short article to even touch upon most of the extensive literature. In the bibliography we list just a few of the substantive books where classical results can be found (Chandrasekhar 1961, Drazin and Reid 1981, Gershuni and Zhukhovitskii 1976, Joseph 1976, Lin 1967, Swinney and Gollub 1985). Recent extensive bibliographies on mathematical aspects of fluid instability are given in several articles in the Handbook of Mathematical Fluid Dynamics (Friedlander and Serre 2003) and the compendium of articles on hydrodynamics and nonlinear instabilities in Godreche and Manneville (1998).

The Equations of Motion

The Navier–Stokes equations for the motion of an incompressible, constant density, viscous fluid are

\[ \frac{\partial q}{\partial t} + (q \cdot \nabla) q = -\frac{1}{\rho} \nabla P + \nu \nabla^2 q \qquad [1a] \]

\[ \operatorname{div} q = 0 \qquad [1b] \]

where $q(x, t)$ denotes the velocity vector, $P(x, t)$ the pressure, and the constants $\rho$ and $\nu$ are the density and kinematic viscosity, respectively. This system is considered in three (or sometimes two) spatial dimensions with a specified initial velocity field

\[ q(x, 0) = q_0(x) \qquad [1c] \]

and physically appropriate boundary conditions: for example, zero velocity on a rigid boundary, or periodicity conditions for flow on a torus. This nonlinear system of partial differential equations (PDEs) has proved to be remarkably challenging, and in three dimensions the fundamental issues of existence and uniqueness of physically reasonable solutions are still open problems.

It is often useful to consider the Navier–Stokes equations in nondimensional form by scaling the velocity and length by some intrinsic scale in the problem, for example, in Reynolds’ experiment by the mean speed $U$ and the diameter of the pipe $d$. This leads to the nondimensional equations

\[ \frac{\partial q}{\partial t} + (q \cdot \nabla) q = -\nabla P + \frac{1}{R} \nabla^2 q \qquad [2a] \]

\[ \operatorname{div} q = 0 \qquad [2b] \]

where the Reynolds number $R$ is

\[ R = \frac{U d}{\nu} \qquad [3] \]

In many situations, the size of $R$ has a crucial influence on stability. Roughly speaking, when $R$ is small the flow is very sluggish and likely to be stable. However, the effects of viscosity are actually very complicated: not only is viscosity able to smooth and stabilize fluid motions, sometimes it also destroys and destabilizes flows.

The Euler equations, which predate the Navier–Stokes equations by many decades, neglect the effects of viscosity and are obtained from [1a] by setting the viscosity parameter $\nu$ to zero. Since this
removes the highest-derivative term from the equations, the nature of the Euler equations is fundamentally different from that of the Navier–Stokes equations and the limit of vanishing viscosity (or infinite Reynolds number) is a very singular limit. Since all real fluids are at least very weakly viscous, it could be argued that only the Navier–Stokes equations are physically relevant. However, many important physical phenomena, such as turbulence, involve flows at very high Reynolds numbers ($10^4$ or higher). Hence, an understanding of turbulence is likely to involve the asymptotics of the Navier–Stokes equations as $R \to \infty$. The first step towards the construction of such asymptotics is the study of inviscid fluids governed by the Euler equations:

\[ \frac{\partial q}{\partial t} + (q \cdot \nabla) q = -\nabla P \qquad [4a] \]

\[ \operatorname{div} q = 0 \qquad [4b] \]

Stability issues for the Euler equations are in many respects distinct from those of the Navier–Stokes equations and in this article we will briefly touch upon stability results for both systems.

Comments on Some “Classical” Instabilities

To illustrate the complexity of the structure of instabilities that can arise in the Navier–Stokes equations, we mention one classical example, namely the centrifugal instabilities called Taylor–Couette instabilities. Consider a fluid between two concentric cylinders rotating with different angular velocities. If the inner cylinder rotates sufficiently faster than the outer one, the centrifugal force is stronger on inside particles than outside particles and a disturbance which exchanges the radial position of particles is enhanced, that is, the configuration is unstable. As the angular velocity of the inner cylinder is increased above a certain critical rate, the instability is manifested in a series of small toroidal (Taylor) vortices that fill the space between the cylinders. There follows a hierarchy of successive instabilities: azimuthal traveling waves, twisting regimes, and quasiperiodic regimes until chaotic solutions appear. Such a sequence of bifurcations is a scenario for a transition to turbulence postulated by Ruelle–Takens. Details concerning bifurcation theory and fluid behavior can be found in the book of Chossat and Iooss (1994).

We note that phenomena of successive bifurcations connected with loss of stability, such as regimes of Taylor–Couette instabilities, occur at moderately large Reynolds numbers. Fully developed turbulence is a phenomenon associated with very high Reynolds numbers. These are parameter regimes basically inaccessible in current numerical investigations of the Navier–Stokes equations and turbulence models. The Euler equations lie at the limit as $R \to \infty$. It is an interesting observation that results at the limit of infinite Reynolds number are sometimes also applicable and consistent with experiments for flows with only moderate Reynolds number.

There is a huge diversity of forces that couple with fluid motion to produce instability. We will merely mention a few of these which an interested reader could pursue in consultation with texts listed in the “Further reading” section and references therein.

1. The so-called Bénard problem of convective instability concerns a horizontal layer of fluid between parallel plates and subject to a temperature gradient. The governing equations are the Navier–Stokes equation for a nonconstant density fluid and the heat equation. In this problem, the critical parameter governing the onset of instability is called the Rayleigh number. The patterns that can develop as a result of instability are strongly influenced by the boundary conditions in the horizontal coordinates. With lattice type conditions, bifurcating solutions include rolls, rectangles, and hexagons. Convection rolls are themselves subject to secondary instabilities that may break the translation symmetry and deform the rolls into meandering shapes. Further refinements of convective instabilities include doubly diffusive convection, where the density depends on concentration as well as temperature. Competition between stabilizing diffusivity and destabilizing diffusivity can lead to the so-called “salt-finger” instabilities.
2. Of considerable interest in astrophysics and plasma physics are the instabilities that occur in electrically conducting fluids. Here the fluid equations are coupled with Maxwell’s equations. Much work has been done on the topic of magnetohydrodynamical (MHD) stability, which was developed to address various important physical issues such as thermonuclear fusion, stellar and planetary interiors, and dynamo theory. For example, dynamo theory addresses the issue of how a magnetic field can be generated and sustained by the motion of an electrically conducting fluid. In the simplest scenario, the fluid motion is assumed to be a given divergence-free vector field and the study of
the instabilities that may occur in the evolution of the magnetic field is called the kinematic dynamo problem. This gives rise to interesting problems in dynamical systems and actually is closely analogous to the topic of vorticity generation in the three-dimensional (3D) fluid equations in the absence of MHD effects.

In the next section we discuss certain mathematical results that have been rigorously proved for particular problems in the stability of fluid flows. We restrict our attention to the “basic” equations, that is, [2a] and [2b], [4a] and [4b], observing that even in rather simple configurations there are still more open problems than precise rigorous results.

The Navier–Stokes Equations: Mathematical Definitions of Stability/Instability

Instability occurs when there is some disturbance of the internal or external forces acting on the fluid and, loosely speaking, the question of stability or instability considers whether there exist disturbances that grow with time. There are many mathematical definitions of stability of a solution to a PDE. Most of these definitions are closely related but they may not be equivalent. Because of the distinctly different nature of the Navier–Stokes equations for a viscous fluid and the Euler equations for an inviscid fluid, we will adopt somewhat different precise definitions of stability for the two systems of PDEs. Both definitions are related to the concept known as Lyapunov stability. A steady state described by a velocity field $U_0(x)$ is called Lyapunov stable if every state $q(x, t)$ “close” to $U_0(x)$ at $t = 0$ stays close for all $t > 0$. In mathematical terms, “closeness” is defined by considering metrics in a normed space $X$. While in finite-dimensional systems the choice of norm is not significant because all Banach norms are equivalent, in infinite-dimensional systems, such as a fluid configuration, this choice is crucial. The point was emphasized by Yudovich (1989) and it is a version of the definition of stability given in this book that we will adopt in connection with the parabolic Navier–Stokes equations.

Definitions for a General Nonlinear Evolution Equation

Consider an evolution equation for $u(x, t)$ whose phase space is a Banach space $X$:

\[ \frac{\partial u}{\partial t} = L u + N(u, u) \]

We assume that if the initial value $u(x, 0) \in X$ is given, the future evolution $u(x, t)$, $t > 0$, of the equation is uniquely defined (at least for sufficiently small initial data). Without loss of generality, we can assume that zero is a steady state.

We define a version of Lyapunov (nonlinear) stability and its converse instability.

Definition 1 Let $(X, Z)$ be a pair of Banach spaces. The zero steady state is called $(X, Z)$ nonlinearly stable if, no matter how small $\varepsilon > 0$, there exists $\delta > 0$ so that $u(x, 0) \in X$ and

\[ \| u(x, 0) \|_Z < \delta \]

imply the following two assertions:

(i) there exists a global in time solution such that $u(x, t) \in C([0, \infty); X)$;
(ii) $\| u(x, t) \|_Z < \varepsilon$ for a.e. $t \in [0, \infty)$.

The zero state is called nonlinearly unstable if either of the above assertions is violated. We note that under this strong definition of stability, loss of existence of a solution is a particular case of instability. The concept of existence that we will invoke in considering the Navier–Stokes equations is the existence of “mild” solutions introduced by Kato and Fujita (1962). Local-in-time existence of mild solutions is known in $X = L^q$ for $q \ge n$, where $n$ denotes the space dimension. ($L^q$ denotes the usual Lebesgue space.)

We now state two theorems for the Navier–Stokes equations [2a] and [2b]. The theorems are valid in any space dimension $n$ and in finite or infinite domains. Of course, the most physically relevant cases are $n = 3$ or $2$. Both theorems relate properties of the spectrum of the linearized Navier–Stokes equations to stability or instability of the full nonlinear system. Let $U_0(x)$, $P_0(x)$ be a steady state flow:

\[ (U_0 \cdot \nabla) U_0 = -\nabla P_0 + \frac{1}{R} \nabla^2 U_0 + \frac{1}{R} F \qquad [5a] \]

\[ \nabla \cdot U_0 = 0 \qquad [5b] \]

where $U_0 \in C^\infty$ vanishes on the boundary of the domain $D$ and $F$ is a suitable external force. We write [2a] and [2b] in perturbation form as

\[ q(x, t) = U_0(x) + u(x, t) \qquad [6] \]

where

\[ \frac{\partial u}{\partial t} = L_{NS} u + N(u, u) \qquad [7a] \]

\[ \nabla \cdot u = 0 \qquad [7b] \]
with

\[ L_{NS} u \equiv -(U_0 \cdot \nabla) u - (u \cdot \nabla) U_0 + \frac{1}{R} \nabla^2 u - \nabla P_1 \qquad [8] \]

\[ N(u, u) \equiv -(u \cdot \nabla) u - \nabla P_2 \qquad [9] \]

Here $P_1$ and $P_2$ are, respectively, the portions of the pressure required to ensure that $L_{NS} u$ and $N(u, u)$ remain divergence free. The operators $L_{NS}$ and $N$ act on the space of divergence-free vector-valued functions in the closure of the Sobolev space $W^{s,p}$ that vanish on the boundary of $D$.

We note that the spectrum of the elliptic linear operator $L_{NS}$ with appropriate boundary conditions in a bounded domain is purely discrete: that is, it consists of a countable number of eigenvalues of finite multiplicity with the sole limit point being at infinity.

Theorem 2 (Nonlinear instability). Let $1 < p < \infty$ be arbitrary. Suppose that the operator $L_{NS}$ over $L^p$ has spectrum in the right half of the complex plane. Then the flow $U_0(x)$ is $(L^q, L^p)$ nonlinearly unstable for any $q > \max(p, n)$.

Theorem 3 (Asymptotic Lyapunov stability). Let $q > n$ be arbitrary. Assume that the operator $L_{NS}$ over $L^q$ has spectrum confined to the left half of the complex plane. Then the flow $U_0(x)$ is $(L^q, L^q)$ nonlinearly stable.

A recent proof of these theorems is given in Friedlander et al. (2006) using a bootstrap type argument. In Theorem 2, the space $L^q$, $q > n$, is used as an auxiliary space in which the norm of the nonlinear term is controlled, while the final instability result is proved in $L^p$ for $p \in (1, \infty)$. We note that this includes the most physically relevant case of instability in the $L^2$ energy norm. An earlier proof of the theorems under the restriction $p \ge n$ was given by Yudovich (1989).

To apply Theorem 2 or 3 to conclude nonlinear instability or stability of a given flow $U_0$, it is necessary to have information concerning the spectrum of the linear operator $L_{NS}$. Obtaining such information has been the goal of much of the literature concerning fluid stability (see the bibliography and the references therein). However, except in the case of some relatively simple flows, the eigenvalues of $L_{NS}$ have not yet been calculated explicitly. Perhaps the most tractable example is plane parallel shear flow. Here the eigenvalue problem is governed by an ordinary differential equation (ODE) known as the Orr–Sommerfeld equation, which has been the subject of extensive analytical and numerical investigations.

Consider the parallel flow $U_0 = (U(z), 0, 0)$ in the strip $-1 \le z \le 1$. For disturbances of the form

\[ \phi(z) \, e^{i(k_1 x + k_2 y)} e^{\lambda t} \qquad [10] \]

the eigenvalue $\lambda$ is determined by the following equation with $k^2 = k_1^2 + k_2^2$:

\[ \left( U - \frac{i\lambda}{k} \right) \left[ \frac{d^2}{dz^2} - k^2 \right] \phi - U'' \phi = \frac{1}{ikR} \left[ \frac{d^2}{dz^2} - k^2 \right]^2 \phi \qquad [11] \]

with boundary conditions $\phi = 0$ at $z = \pm 1$. We note that the discreteness of the spectrum is preserved if periodicity conditions are imposed in the $(x, y)$ plane.

The complexity of the spectral problem [11] is apparent even for the simple case $U(z) = 1 - z^2$ (known as plane Poiseuille flow). Unstable eigenvalues exist but only in certain regions of $(k, R)$ parameter space. There is a critical Reynolds number, $R_c = 5772$, below which $\mathrm{Re}\, \lambda < 0$ for all wave numbers $k$. For $R > R_c$, instability occurs in a band of wave numbers and the thickness of this band shrinks to zero as $R \to \infty$ (i.e., the inviscid limit). Hence, Poiseuille flow with $R < R_c$ can be considered as an example where the stability Theorem 3 can be applied, that is, the flow is nonlinearly stable to infinitesimal disturbances. However, extremely careful experiments are needed to obtain agreement with the theoretical value of $R_c = 5772$. Rather, it is more usual in an experiment with $R \approx 2000$ that the flow exhibits instability in the form of streamwise streaks that appear near the walls. These structures do not look like traveling waves of the form given by expression [10]; rather, they are finite-amplitude effects of nonmodal growth. Such linear growth of disturbances, along with energy growth and pseudospectra, have recently been investigated extensively.

An example where Theorem 2, proving nonlinear instability, can be applied is the so-called Kolmogorov flow. This is also a shear flow with the spectral problem for the linearized operator given by eqn [11]. In this example, the profile is oscillatory in $z$ with $U(z) = \sin mz$. In an elegant paper, Meshalkin and Sinai (1961) used continued fractions to prove the existence of a real unstable positive eigenvalue. It is interesting, and in some sense surprising, that the particular case of sinusoidal profiles leads to a nonconstant-coefficient eigenvalue problem, where it is possible to construct in explicit form the transcendental characteristic equation that relates the eigenvalues $\lambda$ and the wave numbers. Usually,
this can be done only for constant-coefficient equations. In the case $U(z) = \sin mz$, a Fourier series representation for the eigenfunctions leads to a tridiagonal infinite matrix for the algebraic system satisfied by the Fourier coefficients. This is amenable to examination using continued fractions. Analysis of the characteristic equation shows that there exist real eigenvalues $\lambda > 0$ provided $R$ is larger than some critical value for each wave number $k$ with $k^2 < m^2$.

The Euler Equation: Linear and Nonlinear Stability/Instability

We conclude this brief article with some discussion of instabilities in the inviscid Euler equations whose existence is likely to be important as a “trigger” for the development of instabilities in high-Reynolds-number viscous flows. As we mentioned, the Euler equations are very different from the Navier–Stokes equations in their mathematical structure. The Euler equations are degenerate and nonelliptic. As such, the spectrum of the linearized operator $L_E$ is not amenable to standard spectral theory of elliptic operators. For example, unlike the Navier–Stokes operator, the spectrum of $L_E$ is not purely discrete even in bounded domains. To define $L_E$ we consider a steady Euler flow $\{U_0(x), P_0(x)\}$, where

\[ U_0 \cdot \nabla U_0 = -\nabla P_0 \qquad [12a] \]

\[ \nabla \cdot U_0 = 0 \qquad [12b] \]

We assume that $U_0 \in C^\infty$. For the Euler equations, appropriate boundary conditions include zero normal component of $U_0$ on a rigid boundary, or periodicity conditions (i.e., flow on a torus) or suitable decay at infinity in an unbounded domain. The theorems that we will be describing have been proved mainly in the cases of the second and third conditions stated above. There are many classes of vector fields $U_0(x)$, in two and three dimensions, that satisfy [12a] and [12b]. We write [4a] and [4b] in perturbation form as

\[ q(x, t) = U_0(x) + u(x, t) \qquad [13] \]

with

\[ \frac{\partial u}{\partial t} = L_E u + N(u, u) \qquad [14a] \]

\[ \nabla \cdot u = 0 \qquad [14b] \]

Here

\[ L_E u \equiv -(U_0 \cdot \nabla) u - (u \cdot \nabla) U_0 - \nabla P_1 \qquad [15] \]

\[ N(u, u) \equiv -(u \cdot \nabla) u - \nabla P_2 \qquad [16] \]

Linear (spectral) instability of a steady Euler flow $U_0(x)$ concerns the structure of the spectrum of $L_E$. Assuming $U_0 \in C^\infty(T^n)$, the linear equation

\[ \frac{\partial u}{\partial t} = L_E u, \qquad \nabla \cdot u = 0 \qquad [17] \]

defines a strongly continuous group in every Sobolev space $W^{s,p}$ with generator $L_E$. We denote this group by $\exp\{L_E t\}$. For the issue of spectral instability of the Euler equation it proves useful to study not only the spectrum of $L_E$ but also the spectrum of the evolution operator $\exp\{L_E t\}$. This permits the development of an explicit formula for the growth rate of a small perturbation due to the essential (or continuous) spectrum. It was proved by Vishik (1996) that a quantity $\Lambda$, referred to as a “fluid Lyapunov exponent,” gives the maximum growth rate of the essential spectrum of $\exp\{L_E t\}$. This quantity is obtained by computing the exponential growth rate of a certain vector that satisfies a specific system of ODEs over the trajectories of the flow $U_0(x)$. This proves to be an effective mechanism for detecting instabilities in the essential spectrum that result from high-spatial-frequency perturbations. For example, for this reason any flow $U_0(x)$ with a hyperbolic fixed point is linearly unstable with growth in the sense of the $L^2$-norm. In two dimensions, $\Lambda$ is equal to the maximal classical Lyapunov exponent (i.e., the exponential growth of a tangent vector over the ODE $\dot{x} = U_0(x)$). In three dimensions, the existence of a nonzero classical Lyapunov exponent implies that $\Lambda > 0$. However, in three dimensions there are also examples where the classical Lyapunov exponent is zero and yet $\Lambda > 0$. We note that the delicate issue of the unstable essential spectrum is strongly dependent on the function space for the perturbations and that $\Lambda$, for a given $U_0$, will vary with this function space. More details and examples of instabilities in the essential spectrum can be found in references in the bibliography.

In contrast with instabilities in the essential spectrum, the existence of discrete unstable eigenvalues is independent of the norm in which growth is measured. From this point of view, such instabilities can be considered as “strong.” However, for most flows $U_0(x)$ we do not know the existence of such unstable eigenvalues. For fully 3D flows there are no examples, to our knowledge, where such unstable eigenvalues have been proved to exist for flows with standard metrics. The case that has received the most attention in the literature is the “relatively simple” case of plane parallel shear flow. The eigenvalue problem is governed by the Rayleigh
equation (which is the inviscid version of the Orr–Sommerfeld equation [11]):

\[ \left( U - \frac{i\lambda}{k} \right) \left[ \frac{d^2}{dz^2} - k^2 \right] \phi - U'' \phi = 0, \qquad \phi = 0 \ \text{at} \ z = \pm 1 \qquad [18] \]

The celebrated Rayleigh stability criterion says that a sufficient condition for the eigenvalues $\lambda$ to be pure imaginary is the absence of an inflection point in the shear profile $U(z)$. It is more difficult to prove the converse; however, there have been several recent results that show that oscillating profiles indeed produce unstable eigenvalues. For example, if $U(z) = \sin mz$ the continued fraction proof of Meshalkin and Sinai can be adapted to exhibit the full unstable spectrum for [18]. We note the “fluid Lyapunov exponent” $\Lambda$ is zero for all shear flows; thus the only way the unstable spectrum can be nonempty for shear flows is via discrete unstable eigenvalues.

As we have discussed, it is possible to show that many classes of steady Euler flows are linearly unstable, either due to a nonempty unstable essential spectrum (i.e., cases where $\Lambda > 0$) or due to unstable eigenvalues or possibly for both reasons. It is natural to ask what this means about the stability/instability of the full nonlinear Euler equations [14]–[16]. The issue of nonlinear stability is complex and there are several natural precise definitions of nonlinear stability and its converse instability.

One definition is to consider nonlinear stability in the energy norm $L^2$ and the enstrophy norm $H^1$, which are natural function spaces to measure growth of disturbances but are not “correct” spaces for the Euler equations in terms of proven properties of existence and uniqueness of solutions to the nonlinear equation. Falling under this definition is the most frequently employed method to prove nonlinear stability, which is an elegant technique developed by Arnol’d (cf. Arnol’d and Khesin (1998) and references therein). This is based on the existence of the so-called energy-Casimirs. The vorticity curl $q$ is transported by the motion of the fluid so that at time $t$ it is obtained from the vorticity at time $t = 0$ by a volume-preserving diffeomorphism. In the terminology of Arnol’d, the velocity fields obtained in this manner at any two times are called isovortical. For a given field $U_0(x)$, the class of isovortical fields is an infinite-dimensional manifold $M$, which is the orbit of the group of volume-preserving diffeomorphisms in the space of divergence-free vector fields. The steady flows are exactly the critical points of the energy functional $E$ restricted to $M$. If a critical point is a strict local maximum or minimum of $E$, then the steady flow is nonlinearly stable in the space $J_1$ of divergence-free vectors $u(x, t)$ (satisfying the boundary conditions) that have finite norm,

\[ \| u \|_{J_1} \equiv \| u \|_{L^2} + \| \operatorname{curl} u \|_{L^2} \qquad [19] \]

This theory can be applied, for example, to show that any shear flow with no inflection points in the profile $U(z)$ is nonlinearly stable in the function space $J_1$, that is, the classical Rayleigh criterion implies not only spectral stability but also nonlinear stability.

We note that Arnol’d’s stability method cannot be applied to the Euler equations in three dimensions because the second variation of the energy defined on the tangent space to $M$ is never definite at a critical point $U_0(x)$. This result is suggestive, but does not prove, that most Euler flows in three dimensions are nonlinearly unstable in the Arnol’d sense. To quote Arnol’d, in the context of the Euler equations “there appear to be an infinitely great number of unstable configurations.”

In recent years, there have been a number of results concerning nonlinear instability for the Euler equation. Most of these results prove nonlinear instability under certain assumptions on the structure of the spectrum of the linearized Euler operator. To date, none of the approaches prove the definitive result that in general linear instability implies nonlinear instability. As we have remarked, this is a much more delicate issue for Euler than for Navier–Stokes because of the existence, for a generic Euler flow, of a nonempty essential unstable spectrum. To give a flavor of the mathematical treatment of nonlinear instability for the Euler equations, we present one recent result and refer the interested reader to articles listed in the “Further reading” section for further results and discussions.

In the context of Euler equations in two dimensions, we adopt the following definition of Lyapunov stability.

Definition 4 An equilibrium solution $U_0(x)$ is called Lyapunov stable if for every $\varepsilon > 0$ there exists $\delta > 0$ so that for any divergence-free vector $u(x, 0) \in W^{1+s,p}$, $s > 2/p$, such that $\| u(x, 0) \|_{L^2} < \delta$ the unique solution $u(x, t)$ to [14]–[16] satisfies

\[ \| u(x, t) \|_{L^2} < \varepsilon \quad \text{for} \ t \in [0, \infty) \]

We note that we require the initial value $u(x, 0)$ to be in the Sobolev space $W^{1+s,p}$, $s > 2/p$, since it is known that the 2D Euler equations are globally in time well posed in this function space.
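
As a small illustration of the Rayleigh criterion discussed earlier in this section, the following sketch tests whether a shear profile U(z) on the strip from z = -1 to z = 1 has an inflection point, taken here to mean a sign change of U''. It is only a sampled finite-difference check under stated assumptions, not a proof: the helper names, the grid size, the tolerance, and the choice m = pi for the sinusoidal profile are all hypothetical and do not come from the article. Recall also that the absence of an inflection point gives spectral stability, whereas its presence does not by itself establish instability.

```python
import math

def second_derivative(U, z, h=1e-3):
    """Central finite-difference estimate of U''(z)."""
    return (U(z + h) - 2.0 * U(z) + U(z - h)) / (h * h)

def has_inflection_point(U, n=400, tol=1e-6):
    """Return True if U'' changes sign on a uniform grid inside (-1, 1).

    Grid values with |U''| <= tol are skipped, so a profile whose
    second derivative merely touches zero is not reported.
    """
    last_sign = 0
    for j in range(1, n):
        z = -1.0 + 2.0 * j / n          # interior grid point
        d2 = second_derivative(U, z)
        if abs(d2) <= tol:
            continue
        sign = 1 if d2 > 0 else -1
        if last_sign != 0 and sign != last_sign:
            return True
        last_sign = sign
    return False

# Plane Poiseuille profile: U'' = -2 everywhere, so no inflection point.
poiseuille = lambda z: 1.0 - z * z

# Sinusoidal (Kolmogorov-type) profile with the hypothetical choice m = pi:
# U'' = -pi^2 sin(pi z) changes sign at z = 0.
kolmogorov = lambda z: math.sin(math.pi * z)

print(has_inflection_point(poiseuille))   # False
print(has_inflection_point(kolmogorov))   # True
```

A coarse grid can miss a sign change confined to a very narrow interval, so in practice one would refine n or use the analytic second derivative when it is available.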

Definition 5 Any steady flow $U_0(x)$ for which the conditions of Definition 4 are violated is called nonlinearly unstable in $L^2$.

Observe that the open issues (in three dimensions) of nonuniqueness or nonexistence of solutions to [14]–[16] would, under Definition 5, be scenarios for instability.

Theorem 6 (Nonlinear instability for 2D Euler flows). Let $U_0(x) \in C^\infty(T^2)$ satisfy [12]. Let $\Lambda$ be the maximal Lyapunov exponent to the ODE $\dot{x} = U_0(x)$. Assume that there exists an eigenvalue $\lambda$ in the $L^2$ spectrum of the linear operator $L_E$ given by [15] with $\mathrm{Re}\, \lambda > \Lambda$. Then in the sense of Definition 5, $U_0(x)$ is Lyapunov unstable with respect to growth in the $L^2$-norm.

The proof of this result is given in Vishik and Friedlander (2003) and uses a so-called “bootstrap” argument whose origins can be found in references in that article. We remark that the above result gives nonlinear instability with respect to growth of the energy of a perturbation, which seems to be a physically reasonable measure of instability.

In order to apply Theorem 6 to a specific 2D flow it is necessary to know that the linear operator $L_E$ has an eigenvalue with $\mathrm{Re}\, \lambda > \Lambda$. As we have discussed, such knowledge is lacking for a generic flow $U_0(x)$. Once again, we turn to shear flows. Since, as we noted, $\Lambda = 0$ for shear flows, any shear profile for which unstable eigenvalues have been proved to exist provides an example of nonlinear instability with respect to growth in the energy.

We conclude with the observation that it is tempting to speculate that, given the complexity of flows in three dimensions, most, if not all, such inviscid flows are nonlinearly unstable. It is clear from the concept of the fluid Lyapunov exponent that stretching in a flow is associated with instabilities and there are more mechanisms for stretching in three, as opposed to two, dimensions. However, to date there are virtually no mathematical results for the nonlinear stability problem for fully 3D flows and many challenging issues remain entirely open.

Acknowledgments

The author is very grateful to IHES and ENS-Cachan for their hospitality during the writing of this paper. She thanks Misha Vishik for much helpful advice.

The work is partially supported by NSF grant DMS-0202767.

See also: Compressible Flows: Mathematical Theory; Incompressible Euler Equations: Mathematical Theory; Magnetohydrodynamics; Newtonian Fluids and Thermohydraulics; Non-Newtonian Fluids; Topological Knot Theory and Macroscopic Physics.

Further Reading

Arnol’d VI and Khesin B (1998) Topological Methods in Hydrodynamics. New York: Springer.
Chandrasekhar S (1961) Hydrodynamic and Hydromagnetic Stability. Oxford: Oxford University Press.
Chossat P and Iooss G (1994) The Couette–Taylor Problem. Berlin: Springer.
Drazin PG and Reid WH (1981) Hydrodynamic Stability. Cambridge: Cambridge University Press.
Friedlander S, Pavlovic N, and Shvydkoy R (2006) Nonlinear instability for the Navier–Stokes equations (to appear in Communications in Mathematical Physics).
Friedlander S and Serre D (eds.) (2003) Handbook of Mathematical Fluid Dynamics, vol. 2. Amsterdam: Elsevier.
Friedlander S and Yudovich VI (1999) Instabilities in fluid motion. Notices of the American Mathematical Society 46: 1358–1367.
Gershuni GZ and Zhukhovitskii EM (1976) Convective Instability of Incompressible Fluids. Jerusalem: Keter Publishing House.
Godreche C and Manneville P (eds.) (1998) Hydrodynamics and Nonlinear Instabilities. Cambridge: Cambridge University Press.
Joseph DD (1976) Stability of Fluid Motions, 2 vols. Berlin: Springer.
Kato T and Fujita H (1962) On the nonstationary Navier–Stokes system. Rend. Sem. Mat. Univ. Padova 32: 243–260.
Lin CC (1967) The Theory of Hydrodynamic Stability. Cambridge: Cambridge University Press.
Meshalkin LD and Sinai IaG (1961) Investigation of stability for a system of equations describing the plane movement of an incompressible viscous liquid. App. Math. Mech. 25: 1700–1705.
Swinney H and Gollub L (eds.) (1985) Hydrodynamic Instabilities and Transition to Turbulence. New York: Springer.
Vishik MM (1996) Spectrum of small oscillations of an ideal fluid and Lyapunov exponents. Journal de Mathématiques Pures et Appliquées 75: 531–557.
Vishik M and Friedlander S (2003) Nonlinear instability in 2 dimensional ideal fluids: the case of a dominant eigenvalue. Communications in Mathematical Physics 243: 261–273.
Yudovich VI (1989, US translation) Linearization Method in Hydrodynamical Stability Theory, Transl. Math. Monog. vol. 74. Providence: American Mathematical Society.
8 Stability of Matter

Stability of Matter
J P Solovej, University of Copenhagen, Copenhagen, Denmark
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The theorem on stability of matter is one of the most celebrated results in mathematical physics. It is one of the rare cases where a result of such great importance to our understanding of the world around us appeared first in a completely rigorous formulation.

Issues of stability are, of course, extremely important in physics. One of the major triumphs of the theory of quantum mechanics is the explanation it gives of the stability of the hydrogen atom (and the complete description of its spectrum). Quantum mechanics or, more precisely, the uncertainty principle explains not only the stability of tiny microscopic objects, but also the stability of gigantic stellar objects such as white dwarfs. Chandrasekhar's famous theory on the stability of white dwarfs required, however, not only the usual uncertainty principle, but also the Pauli exclusion principle for the fermionic electrons.

Whereas both the stability of atoms and the stability of white dwarfs were early triumphs of quantum mechanics, it, surprisingly, took nearly 40 years before the question of stability of everyday macroscopic objects was even raised (Fisher and Ruelle 1966). The rigorous answer to the question came shortly thereafter in what came to be known as the "theorem on stability of matter" proved first by Dyson and Lenard (1967).

Both the stability of hydrogen and the stability of white dwarfs simply mean that the total energy of the system cannot be arbitrarily negative. If there were no such lower bound to the energy, one would have a system from which it would be possible, in principle, to extract an infinite amount of energy. One often refers to this kind of stability as stability of the first kind.

Stability of matter is somewhat different. Stability of the first kind for atoms generalizes, as noted later, to objects of macroscopic size. The question arises as to how the lowest possible energy depends on the size or, more precisely, on the (macroscopic) number of particles in the object. Stability of matter in its precise mathematical formulation is the requirement that the lowest possible energy depends at most linearly on the number of particles. Put differently, the lowest possible energy calculated per particle cannot be arbitrarily negative as the number of particles increases. This is often referred to as "stability of the second kind." If stability of the second kind does not hold, one would be able to extract an arbitrarily large amount of energy by adding a single atomic particle to a sufficiently large macroscopic object.

A perhaps more intuitive notion of stability is related to the volume occupied by a macroscopic object. More precisely, the volume of the object, when its total energy is close to the lowest possible energy, grows at least linearly in the number of particles. This volume dependence is a fairly simple consequence of stability of matter as formulated above.

The first mention of stability of the second kind for a charged system is perhaps by Onsager (1939), who studied a system of charged classical particles with a hard core and proved the stability of the second kind. The proof of stability of matter by Dyson and Lenard, which does not rely on any hard-core assumption, but rather on the properties of fermionic quantum particles, used results from Onsager's paper.

The real relevance of the notion of stability of the second kind was first realized by Fisher and Ruelle (1966) in an attempt to understand the thermodynamic properties of matter and to give meaning to thermodynamic quantities such as the energy density (energy per volume). Stability of matter is a necessary ingredient in explaining the existence of thermodynamics, that is, that the energy per volume has a well-defined limit as the volume and number of particles tend to infinity, with the ratio (i.e., the density of particles) kept fixed. The existence of this limit is, however, not just a simple consequence of stability of matter. The existence of the thermodynamic limit for ordinary charged matter was proved rigorously by Lieb and Lebowitz (1972) using the result on stability of matter as an input.

After the original proof of stability of matter by Dyson and Lenard, several other proofs were given (see, e.g., reviews by Lieb (1976, 1990, 2004) for detailed references). Lieb and Thirring (1975) in particular presented an elegant and simple proof relying on an uncertainty principle for fermions. As explained in a later section, the best mathematical formulation of the usual uncertainty principle is in terms of a Sobolev inequality. The method of Lieb and Thirring is related to a Sobolev type inequality for antisymmetric functions. The Lieb–Thirring inequality is discussed later. The proof by Dyson
and Lenard gave a very poor bound on the lowest possible energy per particle. The proof by Lieb and Thirring gave a much more realistic bound on this quantity (see below). Two proofs of stability of matter will be sketched here. Both proofs rely on the Lieb–Thirring inequality. The first proof described is mathematically simple to explain, whereas the second proof (Lieb–Thirring) is based on the Thomas–Fermi theory. It is mathematically somewhat more involved but, from a physical point of view, more intuitive.

As in the case of white dwarfs, stability of matter relies on the fermionic property of electrons. Dyson (1967) proved that the stability of the second kind fails if we ignore the Pauli exclusion principle. In physics textbooks, the importance of the Pauli exclusion principle for the stability of white dwarfs is often emphasized. Its importance for the stability of everything around us is usually ignored.

As mentioned above, the result on stability of matter appeared from the beginning as a completely rigorously proved theorem. In contrast, the stability of white dwarfs was only derived rigorously by Lieb and Thirring (1984) and Lieb and Yau (1987) over 50 years after the original work of Chandrasekhar.

The original formulation of stability of matter, which is given in the next section, dealt with charged matter consisting of electrons and nuclei interacting only through electrostatic interactions and being described by nonrelativistic quantum mechanics. Over the years, many generalizations of stability of matter have been derived in order to include relativistic effects and electromagnetic interactions. Some of these generalizations will be discussed in this article. A complete understanding of stability of matter in quantum electrodynamics (QED) does not exist as yet, which is intimately related to the fact that this theory still awaits a mathematically satisfactory formulation.

The Formulation of Stability of Matter

Consider $K$ nuclei with nuclear charges $z_1, \ldots, z_K > 0$ at positions $r_1, \ldots, r_K \in \mathbb{R}^3$, and $N$ electrons with charges $-1$ (this amounts to a choice of units) at positions $x_1, \ldots, x_N \in \mathbb{R}^3$. In order to discuss stability, it turns out that one can consider the nuclei as fixed in space, whereas the electrons are dynamic. More precisely, this means that the kinetic energy of the nuclei is ignored. It is important to realize that if stability holds for static nuclei, it also holds for dynamic nuclei. This is simply because the kinetic energy is positive, so that the effect of ignoring it is to lower the total energy.

Since we consider only electrostatic interactions, the quantum Hamiltonian describing this system is
\[
H_N = \sum_{i=1}^{N} T_i - \sum_{k=1}^{K}\sum_{i=1}^{N} \frac{z_k}{|x_i - r_k|} + \sum_{1\le i<j\le N} \frac{1}{|x_i - x_j|} + \sum_{1\le k<\ell\le K} \frac{z_k z_\ell}{|r_k - r_\ell|} \tag{1}
\]
The kinetic energy operator $T_i$ is (half) the Laplacian in the variable $x_i$, i.e., $T_i = -\tfrac12\Delta_i$. Atomic units are used, where not only the electron charge is $-1$, but the mass of the electron is also 1 and $\hbar = 1$. The unit of energy is then 2 Ry.

The Hamiltonian $H_N$ depends on the parameters $z = (z_1, \ldots, z_K)$ and $r = (r_1, \ldots, r_K)$. It acts on the Hilbert space of fermionic, that is, antisymmetric wave functions. More precisely, the fermionic Hilbert space is
\[
\mathcal{H}^F_N = \bigwedge^{N} L^2(\mathbb{R}^3; \mathbb{C}^2)
\]
Here the target space is $\mathbb{C}^2$, in order to describe spin-1/2 particles. One can, of course, also consider the Hamiltonian $H_N$ on the full Hilbert space,
\[
\mathcal{H}_N = \bigotimes^{N} L^2(\mathbb{R}^3; \mathbb{C}^2) = L^2(\mathbb{R}^{3N}; \mathbb{C}^{2^N})
\]
of which $\mathcal{H}^F_N$ is a subspace.

The quantity of interest is the ground-state energy
\[
E^F(z, N, K) = \inf_r \inf \operatorname{spec}_{\mathcal{H}^F_N} H_N = \inf_r \inf \left\{ (\psi, H_N \psi) \;\middle|\; \psi \in \mathcal{H}^F_N \cap C^\infty(\mathbb{R}^{3N}; \mathbb{C}^{2^N}),\ \|\psi\| = 1 \right\} \tag{2}
\]
and likewise for the ground-state energy $E(z, N, K)$ on the full space $\mathcal{H}_N$. Clearly, $E^F(z, N, K) \ge E(z, N, K)$. It turns out that the energy $E(z, N, K)$ is the same as one would get by restricting to symmetric functions instead of antisymmetric ones. Therefore, the energy $E(z, N, K)$ is often referred to as the lowest possible energy for bosonic particles.

The Hamiltonian $H_N$ is an unbounded operator and we must discuss its domain to be able to talk about its spectrum. Also, it should be self-adjoint. It turns out that these questions are intimately related to stability. The operator $H_N$ is well defined on smooth (i.e., $C^\infty$) functions. Thus, the last definition of $E^F(z, N, K)$ in [2] is meaningful. If this ground-state energy is finite (i.e., not $-\infty$), then the Hamiltonian has an extension, the Friedrichs' extension, to a
self-adjoint operator with the property that the second equality in [2] holds.

In the definition of $E^F$, we have minimized over all the positions $r$ of the nuclei. Even though the nuclear dynamics is not considered, one is still interested in finding the lowest possible energy independent of where they are located.

Theorem 1 (Stability of the first kind). For all $N$, $K$, and $z$, we have
\[
E(z, N, K) > -\infty
\]

Theorem 2 (Stability of matter). There exists a constant $C_{|z|} > 0$ depending only on $|z| = \max\{z_1, \ldots, z_K\}$ such that
\[
E^F(z, N, K) \ge -C_{|z|}(N + K)
\]

The constant $C_{|z|}$ bounds the binding energy per particle. In the case of hydrogen atoms, when $|z| = 1$, Dyson and Lenard arrived at a bound with $C_1$ of order $10^{14}$ Ry. Lieb and Thirring arrive at $C_1$ of about 5 atomic units, that is, 10 Ry. Since the binding energy of a single hydrogen atom is 1 Ry, it is easy to see that one must have $C_1 \ge 1/4$. Over the years, there have been some improvements on the estimated value of this constant in the theory of stability of matter.

That the Pauli exclusion principle, that is, the fermionic character of the electrons, is necessary for stability of matter is a consequence of the next theorem.

Theorem 3 (the $N^{5/3}$ law for bosons). If $N = K$ and $z_1 = \cdots = z_K = z > 0$, then there exist constants $C_\pm > 0$ depending on $z$ such that
\[
-C_- N^{5/3} < E(z, N, N) < -C_+ N^{5/3}
\]

It is the superlinear (exponent 5/3) behavior in $N$ of the upper bound that violates stability of matter. This upper bound was proved by Lieb (1979) by a fairly simple variational argument. The lower bound above, which shows that the exponent 5/3 is optimal, was proved by Dyson and Lenard (1968) in their original paper on stability of matter.

This theorem leaves open the possibility that the stability of matter could be recovered by introducing finite nuclear masses. That this, indeed, is not the case was proved by Dyson (1967) by a complicated variational argument based on the Bogolubov pair theory for superfluid helium. We now add the kinetic energy $\sum_{k=1}^{K} -\tfrac12\Delta_{r_k}$ of the nuclei (assuming, for simplicity, that they have the same mass as the electrons) to the Hamiltonian $H_N$ and consider the case where $z_1 = z_2 = \cdots = z_K = 1$. We denote the ground-state energy over the space $L^2(\mathbb{R}^{3(N+K)})$ (ignoring spin) by $\widetilde{E}(N, K)$. Then, Dyson proved that
\[
\min_{N+K=M} \widetilde{E}(N, K) \le -C M^{7/5}
\]
for some constant $C > 0$. It was later shown by Conlon et al. (1988) that the exponent 7/5 is indeed optimal. Dyson (1967) made a conjecture for the precise asymptotic behavior of this energy. This conjecture, which was proved by Lieb and Solovej (2005) and Solovej (2004), is given in the next theorem.

Theorem 4 (Dyson's 7/5-law for the charged Bose gas).
\[
\lim_{M\to\infty} \min_{N+K=M} \frac{\widetilde{E}(N, K)}{M^{7/5}} = \inf\left\{ \frac12\int |\nabla\Phi|^2 - J\int \Phi^{5/2} \;\middle|\; \Phi \ge 0,\ \int \Phi^2 = 1 \right\} \tag{3}
\]
where
\[
J = \left(\frac{4}{\pi}\right)^{3/4} \frac{\Gamma(1/2)\,\Gamma(3/4)}{5\,\Gamma(5/4)}
\]

Generalizations of Stability of Matter

Over the years, generalizations of stability of matter including relativistic effects and interactions with the electromagnetic field have been attempted. Since the relativistic Dirac operator is not bounded below, we cannot simply replace the standard nonrelativistic kinetic energy operator $T_j = -\tfrac12\Delta_j$ by the free Dirac operator.

Relativistic effects have been included by considering the (pseudo) relativistic kinetic energy
\[
T_j^{\mathrm{Rel}} = \sqrt{-c^2\Delta_j + c^4} - c^2
\]
In the units used in this article, the physical value of the speed of light $c$ is approximately 137 or, more precisely, the reciprocal of the fine-structure constant $\alpha$.

For this relativistic kinetic energy, Lieb and Yau (1988) proved that stability of matter holds in the sense formulated in Theorem 2 if $\alpha$ ($= c^{-1}$) is small enough and $\max_j\{z_j\}\alpha \le 2/\pi$. It is known here that the value $2/\pi$ is the best possible, since it is so for the one-atom case. The one-atom case had been studied by Herbst. The corresponding case of a one-electron molecule was studied by Lieb and Daubechies. Less optimal results on the stability of matter with relativistic kinetic energy had been
obtained prior to the work of Lieb and Yau by Conlon and later by Fefferman and de la Llave. References to these works can be found in the work of Lieb and Yau (1988).

The relativistic kinetic energy $T_j^{\mathrm{Rel}}$ agrees with the free Dirac operator on the positive spectral subspace of the free Dirac operator (i.e., a subspace of $L^2(\mathbb{R}^3; \mathbb{C}^4)$). Therefore, the stability of matter follows if $T_j$ is replaced by the free Dirac operator and if one restricts to the Hilbert space obtained as in [2] but with $L^2(\mathbb{R}^3; \mathbb{C}^2)$ replaced by the positive spectral subspace of the free Dirac operator. This formulation is often referred to as the "no-pair" model. In the usual Dirac picture, the negative spectral subspace, the Dirac sea, is occupied. As long as one ignores pair creation, only the positive spectral subspace is available.

Magnetic fields may be included by considering the "magnetic kinetic energy"
\[
T_j^{\mathrm{Mag}} = \tfrac12\left(-i\nabla_j - c^{-1}A(x_j)\right)^2
\]
It turns out that the stability of matter theorem (Theorem 2) holds for all magnetic vector potentials $A : \mathbb{R}^3 \to \mathbb{R}^3$ with a constant $C_{|z|}$ independent of $A$. This is, therefore, also the case if we consider the magnetic field (or rather the vector potential) as a dynamic variable and add the (positive) field energy
\[
U = \frac{1}{8\pi}\int_{\mathbb{R}^3} |\nabla\times A(x)|^2\,dx \tag{4}
\]
to the Hamiltonian. The resulting Hamiltonian describes a charged spinless particle interacting with a classical electromagnetic field.

A more complicated situation is described by the "magnetic Pauli kinetic energy"
\[
T_j^{\mathrm{Pauli}} = \tfrac12\left(\left(-i\nabla_j - c^{-1}A(x_j)\right)\cdot\boldsymbol{\sigma}_j\right)^2
\]
where the coupling of the spin to the magnetic field is included through the vector of $2\times 2$ Pauli matrices acting on the spin components of particle $j$, that is, $\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$, with
\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]
For the Pauli kinetic energy, stability of matter will not hold independently of the magnetic field (or even for a fixed unbounded magnetic field) unless the field energy $U$ in [4] is included in the Hamiltonian. If the field energy is included, stability of matter holds independently of the magnetic field, that is, even if one minimizes over the dynamic variable $A$, if $\alpha$ ($= c^{-1}$) and $\max_j\{z_j\}\alpha^2$ are small enough. This was proved by Fefferman (1997) and by Lieb et al. (1995). The latter result includes the physical value of $\alpha$. The fact that a bound on $\alpha$ is needed had been proved by Loss and Yau. Stability for a one-electron atom had been proved in this model by Fröhlich, Lieb, and Loss. The many-electron atom and the one-electron molecule had been studied by Lieb and Loss. Most relevant references may be found in the work of Lieb et al. (1995).

The possibility of quantizing the magnetic field has also been studied. In this case, one must introduce an ultraviolet cutoff in the momentum modes of the vector potential. Stability of matter in the resulting model of (ultraviolet cutoff) QED coupled to nonrelativistic matter was proved by Fefferman et al. (1997), improving results of Bugliaro, Fröhlich, and Graf.

Finally, one may include both relativistic effects and electromagnetic interactions. Let us first discuss the case of classical electromagnetic fields. If instead of the Pauli kinetic energy one uses the Dirac operator with a magnetic vector potential then there would be no lower bound on the energy. But, as previously described, one can study a no-pair formulation of relativistic particles coupled to electromagnetic fields. The question arises which subspace of $L^2(\mathbb{R}^3; \mathbb{C}^4)$ one should restrict to (i.e., which subspace is filled and which one is available). There are two obvious choices. Either one should, as before, restrict to the positive spectral subspace of the free Dirac operator or one should restrict to the positive spectral subspace of the magnetic Dirac operator. It is proved by Lieb et al. (1997) that the former choice leads to instability, whereas stability of matter holds for the latter choice under some conditions on $\alpha$ and $\max_j\{z_j\}$. Stability requires that the field energy $U$ is included in the Hamiltonian. It then holds independently of the magnetic field. This final stability result also holds if the magnetic field is quantized with an ultraviolet cutoff as proved by Lieb and Loss (2002).

The no-pair model even with the ultraviolet cutoff quantized field is not fully relativistically invariant. As mentioned above, there is still no mathematical formulation of QED, a fully relativistically invariant model for quantum particles interacting with electromagnetic fields.

The Proof of Stability of the First Kind

The proof of stability of the first kind will now be sketched for charged quantum systems.

As mentioned in the introduction, stability of the first kind is a consequence of the uncertainty principle. Contrary to what is often stated in physics textbooks, stability does not follow from the Heisenberg formulation of the uncertainty principle.
A mathematically more flexible formulation is provided by the classical Sobolev inequality, which states that for all square-integrable functions $\psi \in L^2(\mathbb{R}^3)$, one has
\[
\int |\nabla\psi|^2 \ge C_S \left(\int |\psi|^6\right)^{1/3} \tag{5}
\]
for $C_S > 0$. It follows from this inequality that for any attractive potential $-V$ (with $V \ge 0$), there is a lower bound on the energy expectation
\[
\begin{aligned}
\left(\psi, \left(-\tfrac12\Delta - V\right)\psi\right)
&= \frac12\int |\nabla\psi|^2 - \int V|\psi|^2 \\
&\ge \frac12 C_S\left(\int |\psi|^6\right)^{1/3} - \left(\int V^{5/2}\right)^{2/5}\left(\int |\psi|^2\right)^{2/5}\left(\int |\psi|^6\right)^{1/5} \\
&\ge -C\int V^{5/2}\int |\psi|^2
\end{aligned}
\]
for some $C > 0$. The first inequality uses [5] for the kinetic term and Hölder's inequality for the potential term; the second follows by minimizing over the value of $\int |\psi|^6$. Thus, the lowest possible energy of one particle moving in the potential $-V$ is bounded below by $-C\int V^{5/2}$. For $N$ (noninteracting) particles, the lower bound is $-CN\int V^{5/2}$. This holds whether or not the particles have spin. If, more generally, the potential can be written as $V = U + W$, $U, W \ge 0$, where $\int U^{5/2} < \infty$ and $W$ is bounded, $W \le \|W\|_\infty$, then the energy of $N$ noninteracting particles moving in the potential $-V$ is bounded below by
\[
-NC\int U^{5/2} - N\|W\|_\infty \tag{6}
\]
For the Hamiltonian $H_N$ from [1], one can get a lower bound on the energy $E(z, N, K)$ by ignoring all the positive potential terms, that is, the last two sums in [1]. The remaining Hamiltonian describes $N$ independent particles moving in the potential
\[
-V(x) = -\sum_{k=1}^{K} \frac{z_k}{|x - r_k|} = -\sum_{k=1}^{K} (U_k + W_k)
\]
where $U_k$ is the restriction of $z_k/|x - r_k|$ to the set $|x - r_k| < R$ for some $R > 0$ and $W_k$ is the restriction to the complementary set. Using [6], one can easily see that the energy expectation is bounded below by
\[
-C N K^{5/2} \max_k\{z_k\}^{5/2} R^{1/2} - N K \max_k\{z_k\} R^{-1} = -C' N K^2 \max_k\{z_k\}^2
\]
where we have made the optimal choice $R \sim (K \max_k\{z_k\})^{-1}$.

This finite lower bound on the energy proves the stability of the first kind, but it clearly does not have the form required for the stability of the second kind.

The Proof of Stability of Matter

The proof of stability of the first kind presented in the previous section must be improved in two ways in order to conclude the stability of matter.

For fermions, it turns out that the lower bound in [6] can be improved in such a way that there is no factor $N$ in the first term. This is the content of the bound of Lieb and Thirring discussed in the introduction.

Theorem 5 (Lieb–Thirring inequality 1975). The sum of all the negative eigenvalues of the operator $-\tfrac12\Delta - V(x)$ is bounded below by
\[
-L_{LT}\int V^{5/2}
\]
for some constant $L_{LT} > 0$.

For $N$ noninteracting fermions moving in the potential $-V$, the lowest possible energy is given by the sum of the $N$ lowest eigenvalues of the operator in the above theorem. Thus, the theorem gives a lower bound on this energy independently of $N$.

The second point where the argument from the previous section has to be improved is the control of the electrostatic energy. In the above discussion, all repulsive terms have simply been ignored. For stability of matter, a much more delicate bound is needed. Many versions of such bounds have been given going back to the work of Onsager (1939). Here, a result of Baxter (1980) will be used.

Theorem 6 (Baxter's correlation estimate). For all positions $x_1, \ldots, x_N, r_1, \ldots, r_K \in \mathbb{R}^3$ and all charges $z_1, \ldots, z_K > 0$, we have the pointwise inequality
\[
-\sum_{k=1}^{K}\sum_{i=1}^{N} \frac{z_k}{|x_i - r_k|} + \sum_{1\le i<j\le N} \frac{1}{|x_i - x_j|} + \sum_{1\le k<\ell\le K} \frac{z_k z_\ell}{|r_k - r_\ell|} \ge -\sum_{i=1}^{N} V(x_i)
\]
where $V(x) = (1 + 2\max_k\{z_k\}) \max_k\{|x - r_k|^{-1}\}$.

This theorem simply states that, for a lower bound, one can replace the full electrostatic Coulomb energy by the energy of independent electrons moving in the potential where they always see only the closest nucleus (with a modified charge). Baxter (1980) used probabilistic techniques to prove the inequality. An improved version of the inequality was given by Lieb and Yau (1988), with an analytic proof.
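As a numerical sanity check on Baxter's correlation estimate (Theorem 6), and not as part of any proof, the pointwise inequality can be tested on random configurations. The sketch below assumes Python 3.8 or later (for `math.dist`); the function names, the sampling box, and the range of charges are illustrative choices that do not come from the article.

```python
import math
import random


def coulomb_energy(xs, rs, zs):
    """Left side of Theorem 6: attraction plus both repulsion sums."""
    energy = 0.0
    # Electron-nucleus attraction: -sum_k sum_i z_k / |x_i - r_k|
    for x in xs:
        for r, z in zip(rs, zs):
            energy -= z / math.dist(x, r)
    # Electron-electron repulsion: sum_{i<j} 1 / |x_i - x_j|
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            energy += 1.0 / math.dist(xs[i], xs[j])
    # Nucleus-nucleus repulsion: sum_{k<l} z_k z_l / |r_k - r_l|
    for k in range(len(rs)):
        for l in range(k + 1, len(rs)):
            energy += zs[k] * zs[l] / math.dist(rs[k], rs[l])
    return energy


def baxter_bound(xs, rs, zs):
    """Right side of Theorem 6: -sum_i V(x_i), where
    V(x) = (1 + 2 max_k z_k) * max_k 1/|x - r_k|."""
    prefactor = 1.0 + 2.0 * max(zs)
    return -sum(prefactor / min(math.dist(x, r) for r in rs) for x in xs)


def random_points(n, rng, box=1.0):
    """n points drawn uniformly from the cube [-box, box]^3 (illustrative)."""
    return [tuple(rng.uniform(-box, box) for _ in range(3)) for _ in range(n)]


def check_baxter(trials=200, seed=0):
    """Return the smallest value of (left side - right side) seen over
    random configurations; Theorem 6 says it is never negative."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(trials):
        xs = random_points(rng.randint(1, 8), rng)
        rs = random_points(rng.randint(1, 5), rng)
        zs = [rng.uniform(0.5, 3.0) for _ in rs]
        gap = coulomb_energy(xs, rs, zs) - baxter_bound(xs, rs, zs)
        worst = min(worst, gap)
    return worst


if __name__ == "__main__":
    print("smallest gap found:", check_baxter())
```

For a single electron at distance $d$ from a single nucleus of charge $z$, the left side is $-z/d$ and the right side is $-(1+2z)/d$, so the gap is $(1+z)/d > 0$; the random test probes less trivial configurations, where the repulsive terms matter.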

Similarly to the argument in the previous section, one can write $V(x) = U(x) + W(x)$, where $U$ is the restriction of $V$ to the set where $\min_k\{|x - r_k|\} < R$ for some $R > 0$ and $W$ is the restriction to the complementary set. It then follows from Baxter's correlation estimate and the Lieb–Thirring inequality that the lowest eigenvalue of the Hamiltonian $H_N$ on the fermionic Hilbert space $\mathcal{H}^F_N$ is bounded below by
\[
\begin{aligned}
-L_{LT}\int U^{5/2} - N\left(1 + 2\max_k\{z_k\}\right)R^{-1}
&\ge -C\left(1 + 2\max_k\{z_k\}\right)^{5/2} K R^{1/2} - N\left(1 + 2\max_k\{z_k\}\right)R^{-1} \\
&= -C'\left(1 + 2\max_k\{z_k\}\right)^{2}(N + K)
\end{aligned}
\]
where $R \sim (1 + 2\max_k\{z_k\})^{-1}$. This lower bound is linear in the total particle number $N + K$, as required by stability of matter.

From Thomas–Fermi Theory to Stability of Matter

In this final section, the proof of stability of matter by Lieb and Thirring (1975), where they use the Thomas–Fermi theory, is discussed briefly. First note that there is a dual formulation of the Lieb–Thirring inequality theorem (Theorem 5), which makes the connection to the Sobolev inequality [5] much more transparent.

Theorem 7 (Lieb–Thirring inequality as a kinetic energy bound). For any normalized antisymmetric (fermionic) wave function $\psi \in \mathcal{H}^F_N$ we have with $C_{LT} = \tfrac35\left(\tfrac25 L_{LT}^{-1}\right)^{2/3}$ the following lower bound on the kinetic energy:
\[
\sum_{i=1}^{N} \frac12\int_{\mathbb{R}^{3N}} \|\nabla_i\psi(x_1, \ldots, x_N)\|^2\,dx_1\cdots dx_N \ge C_{LT}\int_{\mathbb{R}^3} \rho(x)^{5/3}\,dx
\]
where $\|\cdot\|$ is the norm in spin space ($\mathbb{C}^{2^N}$) and the one-electron density is given by
\[
\rho(x) = N\int_{\mathbb{R}^{3(N-1)}} \|\psi(x, x_2, \ldots, x_N)\|^2\,dx_2\cdots dx_N
\]
This estimate follows immediately from Theorem 5, which implies that
\[
\sum_{i=1}^{N} \frac12\int \|\nabla_i\psi\|^2 - \int V\rho \ge -L_{LT}\int V^{5/2}
\]
To arrive at Theorem 7, simply choose $V = \left(\tfrac25 L_{LT}^{-1}\rho\right)^{2/3}$.

One should compare the Lieb–Thirring kinetic energy bound with the expression $\tfrac{3}{10}(3\pi^2)^{2/3}\rho^{5/3}$ for the (thermodynamic) energy density of a free Fermi gas. One of the yet unproven conjectures is that the Lieb–Thirring bound holds with $C_{LT}$ replaced by the free Fermi constant $\tfrac{3}{10}(3\pi^2)^{2/3}$.

The idea in the Lieb–Thirring proof of stability of matter is to bound the energy below by an expression depending only on the one-electron density. Theorem 7 achieves this for the kinetic energy. What is missing is a lower bound on the electrostatic Coulomb energy depending only on the density. One can show (see Lieb (1976) or Lieb and Thirring (1975)) that, except for an error of the form "$-\,\mathrm{const}\,N$," the total energy expectation $(\psi, H_N\psi)$ may be bounded below by
\[
C_{LT}\int \rho^{5/3} - \sum_{k=1}^{K}\int \frac{z_k}{|x - r_k|}\rho(x)\,dx + \frac12\iint \frac{\rho(x)\rho(y)}{|x - y|}\,dx\,dy + \sum_{1\le k<\ell\le K} \frac{z_k z_\ell}{|r_k - r_\ell|} \tag{7}
\]
Here, as before, $\rho$ is the one-electron density of the $N$-body wave function $\psi$. The expression [7] is the famous Thomas–Fermi energy functional. It has been studied rigorously by Lieb and Simon (1977). The Thomas–Fermi energy is the infimum of the expression [7] over all $\rho$ with $\int\rho = N$. One of the important results about the Thomas–Fermi energy is Teller's no-binding theorem (Lieb and Simon 1977). It states that in Thomas–Fermi theory atoms do not bind to form molecules. This means that the Thomas–Fermi energy is greater than the sum of the individual atomic energies (these energies in turn depend only on the nuclear charges).

The above Thomas–Fermi lower bound on the energy expectation $(\psi, H_N\psi)$ together with the no-binding theorem implies stability of matter.

The generalizations to stability of matter discussed earlier are proved in a way similar to the proof presented in the previous section.

See also: h-Pseudodifferential Operators and Applications; Quantum Statistical Mechanics: Overview; Schrödinger Operators.

Further Reading

Baxter JR (1980) Inequalities for potentials of particle systems. Illinois Journal of Mathematics 24: 645–652.
Conlon JG, Lieb EH, and Yau H-T (1988) The $N^{7/5}$ law for charged bosons. Communications in Mathematical Physics 116: 417–448.
Dyson FJ (1967) Ground state energy of a finite system of charged particles. Journal of Mathematical Physics 8: 1538–1545.
Dyson FJ and Lenard A (1967) Stability of matter. I. Journal of Mathematical Physics 8: 423–434.
Dyson FJ and Lenard A (1968) Stability of matter. II. Journal of Mathematical Physics 9: 698–711.
Fefferman CL (1997) Stability of matter with magnetic fields. CRM Proceedings and Lecture Notes 12: 119–133.
Fefferman C, Fröhlich J, and Graf GM (1997) Stability of ultraviolet cutoff quantum electrodynamics with non-relativistic matter. Communications in Mathematical Physics 190: 309–330.
Fisher M and Ruelle D (1966) The stability of many-particle systems. Journal of Mathematical Physics 7: 260–270.
Lieb EH and Lebowitz JL (1972) The constitution of matter: existence of thermodynamics for systems composed of electrons and nuclei. Advances in Mathematics 9: 316–398.
Lieb EH and Thirring WE (1975) Bound for the kinetic energy of fermions which proves the stability of matter. Physical Review Letters 35: 687–689.
Lieb EH (1976) The stability of matter. Reviews of Modern Physics 48: 553–569.
Lieb EH and Simon B (1977) Thomas–Fermi theory of atoms, molecules and solids. Advances in Mathematics 23: 22–116.
Lieb EH (1979) The $N^{5/3}$ law for bosons. Physics Letters A 70: 71–73.
Lieb EH and Thirring WE (1984) Gravitational collapse in quantum mechanics with relativistic kinetic energy. Annals of Physics, NY 155: 494–512.
Lieb EH and Yau H-T (1987) The Chandrasekhar theory of stellar collapse as the limit of quantum mechanics. Communications in Mathematical Physics 112: 147–174.
Lieb EH and Yau H-T (1988) The stability and instability of relativistic matter. Communications in Mathematical Physics 118: 177–213.
Lieb EH (1990) The stability of matter: from atoms to stars. 1989 Gibbs Lecture. Bulletin of the American Mathematical Society 22: 1–49.
Lieb EH, Loss M, and Solovej JP (1995) Stability of matter in magnetic fields. Physical Review Letters 75: 985–989.
Lieb EH, Siedentop H, and Solovej JP (1997) Stability and instability of relativistic electrons in classical electromagnetic fields. Journal of Statistical Physics 89: 37–59.
Lieb EH and Loss M (2002) Stability of a model of relativistic quantum electrodynamics. Communications in Mathematical Physics 228: 561–588.
Lieb EH (2004) The stability of matter and quantum electrodynamics, Proceedings of the Heisenberg Symposium, Munich, Dec. 2001. In: Buschhorn G and Wess J (eds.) Fundamental Physics – Heisenberg and Beyond, pp. 53–68. Berlin: Springer (arXiv math-ph/0209034).
Lieb EH and Solovej JP Ground state energy of the two-component charged Bose gas. Communications in Mathematical Physics (to appear).
Onsager L (1939) Electrostatic interaction of molecules. Journal of Physical Chemistry 43: 189–196.
Solovej JP (2004) Upper bounds to the ground state energies of the one- and two-component charged Bose gases. Preprint.

Stability of Minkowski Space


S Klainerman, Princeton University, Princeton, NJ, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The Minkowski space, which is the simplest solution of the Einstein field equations in vacuum, that is, in the absence of matter, plays a fundamental role in modern physics as it provides the natural mathematical background of the special theory of relativity. It is most reasonable to ask whether it is stable under small perturbations. In other words, can arbitrarily small perturbations of flat initial conditions lead to developments which are radically different, in the large, from the flat Minkowski space? It turns out to be a highly nontrivial problem as the Einstein equations are of a quasilinear hyperbolic character. Typical systems of this type, in three space dimensions, do form singularities in finite time even for small disturbances of their trivial initial data. To avoid finite-time singularities, we must require that sufficiently small perturbations of Minkowski space are geodesically complete. This, however, is not enough; one should also insist that the corresponding spacetimes become flat along all possible directions, that is, globally asymptotically flat. This is measured by the decay of the curvature tensor to zero. The precise rate of decay is also of interest. One expects that various null-frame components of the curvature tensor decay at different rates along outgoing null hypersurfaces; this goes under the name of "peeling estimates." It turns out in fact that we cannot prove geodesic completeness without establishing at the same time sufficiently fast rates of decay to flatness corresponding to at least some peeling.

The problem of stability of Minkowski space is intimately related to that of describing the asymptotic properties of the gravitational field at large distances from an isolated, weakly radiating physical system. Precise laws of gravitational radiation can be deduced from the assumption that the spacetime $(M, g)$ under consideration can be conformally compactified by adding a boundary S, called skry, to $M$ so that an appropriate conformal rescaling of $g$ can be extended smoothly to the new manifold $(\hat{M}, \hat{g})$ with boundary. In reality, the compactified spacetime cannot be smooth at the particular point $i^0$ corresponding to spacelike infinity. A spacetime
(M, g) is called asymptotically simple (AS) if its conformal completion is smooth everywhere except $i^0$ and every null geodesic intersects S at precisely two endpoints. The AS assumption allows one to derive precise decay asymptotics for various curvature components of (M, g) along null geodesics, which are referred to as strong peeling. The obvious questions raised by this procedure are: do there exist nontrivial AS spacetimes and, if so, do they contain a sufficiently large class of radiating spacetimes, including those which appear in all relevant applications?

Clearly, the two problems mentioned above are related but not equivalent. Asymptotically simple spacetimes verify strong peeling; in particular they are globally asymptotically flat, that is, their curvature tensor tends to zero along all geodesics. Yet, it is perfectly possible that arbitrarily small perturbations of the Minkowski space are geodesically complete and globally asymptotically flat without being asymptotically simple.

The first global stability result of the Minkowski metric was proved by Christodoulou and Klainerman (1993). Their result proves sufficiently strong peeling estimates to allow one to derive the most important properties of gravitational radiation, such as the Bondi mass-law formula, but not as strong as those consistent with asymptotic simplicity. A companion result was proved by Klainerman and Nicolò (2003). Recently, Rodnianski and Lindblad (submitted) have obtained a surprising global stability of Minkowski result for the Einstein vacuum equations in the Lorentz gauge, which provides considerably weaker peeling than Christodoulou and Klainerman (1993) and Klainerman and Nicolò (1999) but is much easier to prove.

The goal of this article is to describe various results obtained since the early 1980s concerning both aspects of the problem of stability of Minkowski mentioned above.

Initial Data Formulation

The proper mathematical context for the stability of Minkowski is that provided by the initial-value problem for vacuum solutions to the Einstein field equations, that is, Ricci flat spacetimes (M, g), $R_{\alpha\beta} = 0$. We recall the following simple definitions:

Definition 1 An initial data set is a triplet $(\Sigma, g, k)$ consisting of a three-dimensional complete Riemannian manifold $(\Sigma, g)$ and a 2-covariant symmetric tensor k on $\Sigma$ satisfying the constraint equations:

$$\nabla^j k_{ij} - \nabla_i \operatorname{tr}_g k = 0, \qquad R - |k|^2 + (\operatorname{tr}_g k)^2 = 0$$

where $\nabla$ is the covariant derivative and R the scalar curvature of $(\Sigma, g)$. An initial data set is said to be maximal if $\operatorname{tr}_g k = 0$. This is a gauge condition which can be imposed without loss of generality. For simplicity we shall assume, throughout this article, that all initial data sets we consider are maximal.

Definition 2 An initial data set is said to be flat, or trivial, if it corresponds to a complete spacelike hypersurface in Minkowski space with its induced metric and second fundamental form. An initial data set is said to be asymptotically flat if there exists a system of coordinates $(x^1, x^2, x^3)$ defined in a neighborhood of infinity on $\Sigma$, with $r = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}$, relative to which the metric g approaches the Euclidean metric and k approaches zero as $r \to \infty$. We assume, for simplicity, that $\Sigma$ has only one end. A neighborhood of infinity means the complement of a sufficiently large compact set on $\Sigma$.

Remark 1 Because of the constraint equations, the asymptotic behavior cannot be arbitrarily prescribed. A precise definition of asymptotic flatness has to involve the ADM mass of $(\Sigma, g)$. Taking the mass into account, we write

$$g_{ij} = \left(1 + \frac{2M}{r}\right)\delta_{ij} + o(r^{-1})$$

According to the positive-mass theorem, $M \ge 0$, and $M = 0$ implies that the initial data set is flat.

Definition 3 We say that an initial data set is strongly asymptotically flat if, for some $\gamma \ge 1/2$, relative to the coordinate system mentioned above,

$$g_{ij} - \left(1 + \frac{2M}{r}\right)\delta_{ij} = O(r^{-1-\gamma}), \qquad k_{ij} = O(r^{-2-\gamma}) \qquad \text{as } r \to \infty$$

Moreover, every derivative of $g - (1 + 2M/r)\delta$ and k improves the asymptotics by one.

Definition 4 A Cauchy development of an initial data set $(\Sigma, g, k)$ is a spacetime manifold (M, g) satisfying the Einstein equations together with an embedding $i: \Sigma \to M$ such that $i_*(g)$, $i_*(k)$ are the first and second fundamental forms of $i(\Sigma)$ in M. A development is also required to be globally hyperbolic (which means that $i(\Sigma)$ is a Cauchy hypersurface, i.e., each causal curve in M intersects $i(\Sigma)$ at precisely one point) in order to assure the unique dependence of solutions on the data. A future development of $(\Sigma, g, k)$ consists of a globally hyperbolic manifold (M, g) with boundary, satisfying the Einstein equations, and an embedding i as before which identifies $\Sigma$ to the boundary of M.
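As a concrete illustration of Definitions 1 to 3 and Remark 1, the following worked example checks the constraint equations and the mass expansion for the time-symmetric slice of the Schwarzschild solution in isotropic coordinates. The example is not taken from the discussion above: the conformal factor $\psi$ and the formula for the scalar curvature of a conformally flat metric are standard facts brought in for the illustration. Note also that this slice lives on $\mathbb{R}^3 \setminus \{0\}$ and has two asymptotic ends, so it falls outside the one-end simplification assumed in Definition 2.

```latex
% Time-symmetric, conformally flat data on \Sigma = \mathbb{R}^3 \setminus \{0\}:
\[
  g_{ij} = \psi^4 \delta_{ij}, \qquad k_{ij} = 0, \qquad \psi = 1 + \frac{M}{2r}.
\]
% With k = 0 the momentum constraint and the maximal condition tr_g k = 0
% hold trivially, and the Hamiltonian constraint reduces to R = 0.
% For a conformally flat 3-metric, R = -8 \psi^{-5} \Delta\psi, where \Delta
% is the flat Laplacian, and 1/r is harmonic away from the origin:
\[
  \Delta \psi = \frac{M}{2}\,\Delta\!\left(\frac{1}{r}\right) = 0 \quad (r > 0)
  \;\Longrightarrow\; R = 0 .
\]
% Expanding the conformal factor recovers the form written in Remark 1:
\[
  \psi^4 = 1 + \frac{2M}{r} + \frac{3M^2}{2r^2} + \frac{M^3}{2r^3} + \frac{M^4}{16 r^4},
  \qquad
  g_{ij} - \Bigl(1 + \frac{2M}{r}\Bigr)\delta_{ij} = O(r^{-2}).
\]
```

Since $O(r^{-2})$ decays faster than $r^{-3/2}$ and k vanishes identically, these data are comfortably within the decay rates required of strongly asymptotically flat data, with ADM mass M.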

The most primitive question asked about the initial-value problem, solved in a satisfactory way for very large classes of evolution equations, is that of local existence and uniqueness of solutions. For the Einstein equations, this type of result was first established by Bruhat (1952) with the help of wave coordinates, which allowed her to cast the Einstein equations in the form of a system of nonlinear wave equations to which one can apply the standard theory of symmetric hyperbolic systems. A stronger result, due to Hughes et al. (1976), states the following:

Theorem 1 Let $(\Sigma, g, k)$ be an initial data set for the Einstein vacuum equations. Assume that $\Sigma$ can be covered by a locally finite system of coordinate charts $U_\alpha$ related to each other by $C^1$ diffeomorphisms, such that $(g, k) \in H^s_{loc}(U_\alpha) \times H^{s-1}_{loc}(U_\alpha)$ with $s > 5/2$. Then there exists a unique (up to an isometry) globally hyperbolic, Hausdorff, development (M, g) for which $\Sigma$ is a Cauchy hypersurface.

In Theorem 1, the uniqueness up to an isometry requires additional regularity, $s > (5/2) + 1$, on the data. One has uniqueness, however, without additional regularity for the reduced Einstein equations system in wave coordinates.

Remark 2 In the case of nonlinear systems of differential equations, the local existence and uniqueness result leads, through a straightforward extension argument, to a global result. The formulation of the same type of result for the Einstein equations is a little more subtle; it was done by Bruhat and Geroch.

Theorem 2 (Bruhat–Geroch). For each smooth initial data set, there exists a unique maximal future development.

Thus, any construction obtained by an evolutionary approach from a specific initial data set must necessarily be contained in its maximal development. This may be said to solve the problem of global existence and uniqueness in general relativity. This is of course misleading: for equations defined on a fixed background, a global solution is one which exists for all time. In general relativity, however, we have no such background, as the spacetime itself is the unknown. The connection with the classical meaning of a global solution requires a special discussion concerning the proper time of timelike geodesics; all further questions may be said to concern the qualitative properties of the maximal development. The central issue is that of existence and character of singularities. First, we can define a regular maximal development as one which is complete in the sense that all future timelike and null geodesics can be indefinitely extended relative to their proper time (or affine parameter in the case of null geodesics). If the initial data set is sufficiently far off from the trivial one, the corresponding future development may not be regular. This is the content of the following well-known theorem of Penrose (1979).

Theorem 3 If the manifold support of an initial data set is noncompact and contains a closed trapped surface, the corresponding maximal development is incomplete.

Stability of Minkowski Space

At the opposite end of Penrose's trapped-surface condition, the problem of stability of Minkowski space concerns the development of asymptotically flat initial data sets which are sufficiently close to the trivial one. Although it may be reasonable to expect the existence of a sufficiently small neighborhood of the trivial initial data set, in an appropriate topology, such that all corresponding developments are geodesically complete and globally asymptotically flat, such a result was by no means preordained. First, all known explicit asymptotically flat solutions of the Einstein vacuum equations, that is, the Kerr family, are singular. The attempts to construct nonexplicit, dynamic solutions based on the conformal compactification method, due to Penrose (1962), were obstructed by the irregular behavior of initial data sets at $i^0$. (The problem is that the singularity at $i^0$ could propagate and thus destroy the expected smoothness of scri. This problem has been recently solved by constructing initial data sets which are precisely stationary at spacelike infinity.) Finally, the attempts to extend the classical local result of Bruhat using hyperbolic partial differential equation methods ran into the usual difficulties of establishing global-in-time existence of solutions of quasilinear hyperbolic systems. Indeed, as mentioned above, the wave coordinate gauge allows one to express the Einstein vacuum equations in the form of a system of nonlinear wave equations which does not satisfy Klainerman's null condition, and which was thus thought to lead to formation of singularities. (The null condition (Klainerman 1983, 1986) identifies an important class of quasilinear systems of wave equations in four spacetime dimensions for which one can prove global-in-time existence of small solutions. The conjectured singular behavior of wave coordinates was thought, however, to reflect only the instability of the specific choice of gauge condition and not a true singularity of the equations.) According to Bruhat (personal communication),

Einstein himself had reasons to believe that the Minkowski space may not be stable. The problem of stability of the Minkowski space was first settled by Christodoulou and Klainerman (1990).

Theorem 4 (Global stability of Minkowski). Any asymptotically flat initial data set which is sufficiently close to the trivial one has a complete maximal future development.

A related result (Theorem 5), proved recently by Klainerman and Nicolò (2003a), solves the problem of radiation for arbitrary asymptotically flat initial data sets; a proof of the result below can also be derived, indirectly, from Christodoulou and Klainerman (1993). The proof of Klainerman and Nicolò (2003a) avoids, however, a great deal of the technical complications of this proof.

Theorem 5 For any, suitably defined, asymptotically flat initial data set $(\Sigma, g, k)$ with maximal future development (M, g), one can find a suitable domain $\Sigma_0 \subset \Sigma$ with compact closure in $\Sigma$ such that the boundary $D^+_0$ of its domain of influence $C^+(\Sigma_0)$, or causal future of $\Sigma_0$, in M has complete null geodesic generators with respect to the corresponding affine parameters.

Both the results of Christodoulou–Klainerman and Klainerman–Nicolò prove in fact a lot more than stated above. They provide a wealth of information concerning the behavior of null hypersurfaces as well as the rate at which various components of the Riemann curvature tensor approach zero along timelike and null geodesics. Here are more precise versions of Theorems 4 and 5.

Theorem 4 (Expanded version). Assume that $(\Sigma, g, k)$ is maximal and strongly asymptotically flat, $g - (1 + 2M/r)\delta = O(r^{-3/2})$, $k = O(r^{-5/2})$, plus an appropriate global smallness assumption. We can construct a complete spacetime (M, g) together with a maximal foliation $\Sigma_t$, given by the level hypersurfaces of a time function t, and a null foliation $C_u$, given by the level hypersurfaces of an outgoing optical function u, such that relative to an adapted null frame $e_4 = L$, $e_3 = \underline{L}$, and $(e_a)_{a=1,2}$ we have, along the null hypersurfaces $C_u$, the weak peeling decay

$$\begin{aligned}
\alpha_{ab} &= R(L, e_a, L, e_b) = O(r^{-7/2}) \\
2\beta_a &= R(L, \underline{L}, L, e_a) = O(r^{-7/2}) \\
4\rho &= R(\underline{L}, L, \underline{L}, L) = O(r^{-3}) \\
4\sigma &= {}^{*}R(\underline{L}, L, \underline{L}, L) = O(r^{-3}) \\
2\underline{\beta}_a &= R(\underline{L}, L, \underline{L}, e_a) = O(r^{-2}) \\
\underline{\alpha}_{ab} &= R(\underline{L}, e_a, \underline{L}, e_b) = O(r^{-1})
\end{aligned} \qquad [1]$$

as $r \to \infty$, with $4\pi r^2 = \mathrm{Area}(S_{t,u})$, $S_{t,u} = \Sigma_t \cap C_u$. Also, $\rho - \bar{\rho},\ \sigma = O(r^{-7/2})$, with $\bar{\rho}$ the average of $\rho$ over the compact 2-surfaces $S_{t,u} = \Sigma_t \cap C_u$.

Three points are noteworthy. (1) The outgoing optical function u refers to the solution of the eikonal equation $g^{\alpha\beta}\,\partial_\alpha u\,\partial_\beta u = 0$ whose level hypersurfaces $C_u$ intersect $\Sigma_t$ in expanding wave fronts for increasing t; (2) the generators L and $\underline{L}$ are given by: $L = g^{\alpha\beta}\,\partial_\beta u\,\partial_\alpha$, the null geodesic generator of $C_u$; $\underline{L}$ is then the null conjugate of L, perpendicular to $S_{t,u} = C_u \cap \Sigma_t$; and (3) $e_a$ is an orthonormal frame on $S_{t,u}$.

Theorem 5 (Expanded version). For any asymptotically flat initial data set $(\Sigma, g, k)$ verifying the same asymptotic flatness conditions as in Theorem 4, one can find a suitable domain $\Sigma_0 \subset \Sigma$ with compact closure in $\Sigma$ such that its future domain of influence $C^+(\Sigma_0)$ can be foliated by two null foliations: one outgoing, $C(u)$, whose leaves are complete towards the future, and a second one, $\underline{C}(\underline{u})$, which is incoming. Let $S(u, \underline{u}) = C(u) \cap \underline{C}(\underline{u})$ denote the compact 2-surfaces of intersection between the outgoing and incoming null hypersurfaces, whose area is denoted by $4\pi r^2$, and consider an adapted null frame $L$, $\underline{L}$, $(e_a)_{a=1,2}$ at every point along an outgoing null cone $C(u)$ (that is, L is the geodesic null generator of $C(u)$, $\underline{L}$ its null conjugate perpendicular to $S(u, \underline{u})$, and $e_a$ an orthonormal frame on $S(u, \underline{u})$). Then, denoting by $\alpha, \beta, \rho, \sigma, \underline{\beta}, \underline{\alpha}$ the null components of the curvature tensor, as in Theorem 4, we have, along $C(u)$ as $r \to \infty$,

$$\alpha,\ \beta,\ \rho - \bar{\rho},\ \sigma = O(r^{-7/2}), \qquad \underline{\beta} = O(r^{-2}), \qquad \underline{\alpha} = O(r^{-1}) \qquad [2]$$

Observe that the rates of decay in [1] and [2] are the same. This will be referred to as weak peeling, to distinguish it from the rates of decay compatible with asymptotic simplicity, that is,

$$\alpha = O(r^{-5}), \quad \beta = O(r^{-4}), \quad \rho,\ \sigma = O(r^{-3}), \quad \underline{\beta} = O(r^{-2}), \quad \underline{\alpha} = O(r^{-1}) \qquad [3]$$

to which we shall refer as strong peeling. We shall discuss these further in the next section, following a review of a recent result of Lindblad–Rodnianski.

Even the expanded forms of Theorems 4 and 5 stated here do not exhaust all the information provided by the global stability results in Christodoulou and Klainerman (1993) and Klainerman and Nicolò (2003a). Of particular interest are the main asymptotic conclusions which can be derived with the help of this information, the most

important being the Bondi mass-law formula, which calculates the gravitational energy radiated at null infinity.

The simplest gauge condition in which the hyperbolic character of the Einstein field equations is easiest to exhibit is the wave coordinate condition; that is, one solves the Einstein vacuum equations relative to a special system of coordinates $x^\alpha$ which satisfy the equation $\Box_g x^\alpha = 0$. Then, denoting $h_{\alpha\beta} = g_{\alpha\beta} - m_{\alpha\beta}$, with m the standard Minkowski metric, we obtain the following system of quasilinear wave equations in h,

$$g^{\alpha\beta}\,\partial_\alpha \partial_\beta h_{\mu\nu} = N_{\mu\nu}(h, \partial h) \qquad [4]$$

with $N(h, \partial h)$ a nonlinear term, quadratic in $\partial h$, which can be exhibited explicitly. This form of the Einstein field equations, called the wave coordinates reduced Einstein equations, is precisely the one which allowed Bruhat (1952) to prove the first local existence result. Later, she also pointed out that the first nontrivial iterate of [4] behaves like $t^{-1} \log t$ rather than $t^{-1}$, as expected from the decay properties of solutions to $\Box h = 0$ in Minkowski space. This seems to indicate that the wave coordinates may not be suitable to study the long-time behavior of solutions to the Einstein field equations. This negative conclusion is also consistent with the fact that the equations [4] do not verify Klainerman's null condition. (Klainerman's null condition (Klainerman 1983) is an algebraic condition on systems of nonlinear wave equations in (1 + 3) dimensions, similar to [4], which allows one to extend all local solutions, corresponding to small initial data, for all time. Moreover, these solutions decay at the rate of $t^{-1}$ as $t \to \infty$, consistent with the decay of free waves.) Lindblad and Rodnianski (2003) were able to isolate a new condition, which they call the weak null condition, verified by the wave coordinates reduced Einstein equations [4], for which one can prove a small data global existence result consistent with the weaker decay rates suggested by the linear asymptotic analysis of Bruhat. Although the new result provides far weaker peeling information than [1], it is much simpler to prove than both Theorems 4 and 5. Moreover, the result seems to apply to a broader class of initial data than in Theorems 4 and 5. It remains an intriguing open problem whether the result of Lindblad–Rodnianski can be used as a stepping stone towards the more complete results of Theorems 4 and 5; that is, once a complete solution with limited peeling is known to exist, whether one can improve it to the weak peeling properties of [1], using the more precise techniques employed in Theorems 4 and 5 minus an important part of their technical complications.

Strong Peeling

The weak peeling properties [1] derived in Theorems 4 and 5 are consistent, from a scaling point of view, with the strongly asymptotically flat (SAF) condition. To derive strong peeling, see [3], one needs stronger asymptotic conditions. Recently, Corvino–Schoen and Chruściel and Delay (2002) have proved the existence of a large class of asymptotically flat initial data sets $(\Sigma, g, k)$ which are precisely stationary, $g = g_{kerr}$, $k = k_{kerr}$, outside a sufficiently large compact set (here $g_{kerr}$, $k_{kerr}$ are the initial data of a Kerr solution in standard coordinates). Moreover, they have proved the existence of sufficiently small solutions in this class which satisfy the requirements needed in Friedrich's conformal compactification method (see Friedrich (2002) and the references within) to produce asymptotically simple spacetimes, that is, spacetimes satisfying Penrose's regular compactification condition (Penrose 1962). Simultaneously, Klainerman and Nicolò (1999) were able to refine the methods used in the proof of Theorem 5 to prove the following:

Theorem 6 Assume that the initial data set $(\Sigma, g, k)$ of Theorem 5 satisfies the stronger assumption

$$g - g_S = O(r^{-(3/2+\gamma)}), \qquad k = O(r^{-(5/2+\gamma)}) \qquad [5]$$

for some $\gamma > 3/2$. Here

$$g_S = \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\,(d\theta^2 + \sin^2\theta\, d\phi^2)$$

denotes the restriction of the Schwarzschild metric to t = 0 in standard polar coordinates. Then, in addition to the results reported in Theorem 5, we have the strong peeling estimates

$$\alpha = O(r^{-5}), \qquad \beta = O(r^{-4})$$

as $r \to \infty$ along the outgoing null leaves $C(u)$. Moreover, the same conclusions hold true if [5] is replaced by

$$g - g_{kerr} = O(r^{-(3/2+\gamma)}), \qquad k - k_{kerr} = O(r^{-(5/2+\gamma)}) \qquad [6]$$

for some $\gamma > 5/2$.

The first part of the theorem was proved in Klainerman and Nicolò (2003b). The second part is work in progress by Klainerman and Nicolò. The existence of initial conditions of the type required in Theorem 6 was established in the works of Corvino (2000) and Chruściel and Delay (2002).

Open Problems

Problem 1 Extend results of Theorems 5 and 6 to the whole domain of dependence, for small data sets.

The results of Theorems 5 and 6 give a satisfactory description of gravitational radiation of general classes of asymptotically flat initial data sets outside the domain of dependence of a sufficiently large compact set. It would be desirable to extend these results to the whole domain of dependence of initial data sets which satisfy an additional global smallness assumption similar to that of Theorem 4.

Problem 2 Is strong peeling (and implicitly asymptotic simplicity) consistent with physically relevant data? If not, is weak peeling a good substitute?

Damour and Christodoulou (2000) have given conclusive evidence that under a no-incoming-radiation condition the future null infinity cannot be smooth. In fact, $\beta = O(r^{-4} \log r)$ as $r \to \infty$.

Problem 3 Can one weaken the AF conditions to include, for example, initial data sets with infinite ADM angular momentum?

It is reasonable to expect a global stability of Minkowski result for small initial data sets which verify, for arbitrarily small $\epsilon$,

$$g - \left(1 + \frac{2M}{r}\right)\delta = O(r^{-1-\epsilon}), \qquad k = O(r^{-2-\epsilon})$$

One expects in this case that the top null components $\alpha$ and $\beta$ decay only like $O(r^{-3})$ as $r \to \infty$ along the null hypersurfaces $C(u)$. It seems that the methods of Lindblad–Rodnianski can treat this case but can only give decay estimates for $\alpha$, $\beta$ of the form $O(r^{-3+\epsilon})$.

Problem 4 Is the Kerr solution in the exterior of the black hole stable?

The problem remains wide open.

See also: Asymptotic Structure and Conformal Infinity; Classical Groups and Homogeneous Spaces; Critical Phenomena in Gravitational Collapse; Einstein Equations: Exact Solutions; Geometric Analysis and General Relativity; Supergravity.

Further Reading

Bruhat YC (1952) Théorème d'existence pour certains systèmes d'équations aux dérivées partielles non linéaires. Acta Mathematica 88: 141–225.

Bruhat YCh and Geroch RP Global aspects of the Cauchy problem in general relativity. Communications in Mathematical Physics 14: 329–335.

Christodoulou D (2000) The Global Initial Value Problem in General Relativity, Lecture Given at the Ninth Marcel Grossmann Meeting. 2–8 July, Rome.

Christodoulou D and Klainerman S (1990) Asymptotic properties of linear field equations in Minkowski space. Communications in Pure and Applied Mathematics XLIII: 137–199.

Christodoulou D and Klainerman S (1993) The global non-linear stability of the Minkowski space. Princeton Mathematical Series.

Chruściel PT and Delay E (2002) Existence of non-trivial, vacuum, asymptotically simple spacetimes. Classical and Quantum Gravity 19: L71–L79.

Corvino J (2000) Scalar curvature deformation and a gluing construction for the Einstein constraint equations. Communications in Mathematical Physics 214: 137–189.

Friedrich H (2002) Conformal Einstein evolution. In: Frauendiener J and Friedrich H (eds.) The Conformal Structure of Space-Time, Lecture Notes in Physics. Springer.

Hughes T, Kato T, and Marsden J (1976) Well-posed quasilinear second-order hyperbolic systems. Archive for Rational Mechanics and Analysis 63: 273–294.

Hawking SW and Ellis GFR (1973) The large scale structure of spacetime. Cambridge Monographs on Mathematical Physics.

Klainerman S (1983) Long-Time Behavior of Solutions to Non-linear Wave Equations. Proc. ICM Warszawa.

Klainerman S (1986) The null condition and global existence to non-linear wave equations. Lectures in Applied Mathematics 23: 293–326.

Klainerman S and Nicolò F (1999) On local and global aspects of the Cauchy problem in general relativity. Classical and Quantum Gravity 16: R73–R157.

Klainerman S and Nicolò F (2003a) The evolution problem in general relativity. Progress in Mathematical Physics, vol. 25. Boston: Birkhäuser.

Klainerman S and Nicolò F (2003b) Peeling properties of asymptotically flat solutions to the Einstein vacuum equations. Classical and Quantum Gravity.

Lindblad H and Rodnianski I (2003) The weak null condition for Einstein's equations. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences, Paris 336(11): 901–906.

Lindblad H and Rodnianski I Global existence for the Einstein vacuum equations in wave coordinates. Annals of Mathematics (submitted).

Penrose R (1965) Zero rest-mass fields including gravitation: asymptotic behaviour. Proceedings of the Royal Society of London. Series A 284: 159–203.

Penrose R (1979) Singularities and time asymmetry in general relativity – an Einstein centenary survey. In: Hawking S and Israel W (eds.) Cambridge: Cambridge University Press.

Penrose R (1962) Zero rest mass fields including gravitation: asymptotic behavior. Proceedings of the Royal Society of London. Series A 270: 159–203.

Wald R (1984) General Relativity. University of Chicago Press.
20 Stability Problems in Celestial Mechanics

Stability Problems in Celestial Mechanics


A Celletti, Università di Roma "Tor Vergata," Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The long-term stability of planets and satellites might be inferred from the regular dynamics that we constantly observe. However, the ultimate fate of the solar system is an intriguing question, which has puzzled scientists since antiquity. In the past centuries, the common belief of a regular motion of the main planets was strengthened by the discovery of a simple law, due to J D Titius and J E Bode (eighteenth century), which provides a recipe to compute the approximate distances of the planets from the Sun. Adopting astronomical units as a measure of the distance, the Titius–Bode law can be stated as

$$d_n = 0.4 + 0.3 \times 2^n \ \text{AU} \qquad [1]$$

where the index n must be selected as provided in Table 1, which compares the distances computed according to [1] with the observed values. Titius and Bode already noticed that it was necessary to skip one unit in n from Mars to Jupiter; indeed, the quantity $d_3 = 2.8$ AU might correspond to an average distance of some minor bodies of the asteroid belt, which had been discovered since the beginning of the nineteenth century. The studies of the N-body problem, namely the dynamics of N mutually attracting bodies (according to Newton's law), inspired several mathematical and physical theories: from the development of perturbation methods to the discovery of chaotic systems, as attested by the masterly work of H Poincaré (1892). In particular, perturbation theory had relevant applications in celestial mechanics; for example, it led to the prediction of the existence of Neptune in the nineteenth century by J C Adams and U Leverrier and later to the discovery of Pluto by C Tombaugh, as a result of unexplained perturbations on Uranus and Neptune, respectively. Modern advances in perturbation theories have been provided by the Kolmogorov–Arnol'd–Moser (KAM) and Nekhoroshev theorems, which find broad applications in celestial mechanics insofar as simple model problems are concerned.

Table 1 Titius–Bode law and observed data

| Planet | Index n (of [1]) | Distance computed from [1] (AU) | Observed distance (AU) |
|---|---|---|---|
| Mercury | −∞ | 0.4 | 0.39 |
| Venus | 0 | 0.7 | 0.72 |
| Earth | 1 | 1 | 1 |
| Mars | 2 | 1.6 | 1.52 |
| Jupiter | 4 | 5.2 | 5.2 |
| Saturn | 5 | 10 | 9.54 |
| Uranus | 6 | 19.6 | 19.19 |

The stability of the solar system can also be approached through numerical investigations, which allow one to predict the motion of the celestial bodies using more realistic models. The results of the numerical integrations undermine in some cases the apparent regularity of the solar system: in the following sections, we shall review many examples of regular and chaotic motions in different contexts of celestial mechanics, from the N-body problem to the rotational dynamics.

The Restricted Three-Body Problem

Let $P_1, \ldots, P_N$ be N bodies with masses $m_1, \ldots, m_N$, which interact through Newton's law. Let $u^{(i)} \in \mathbb{R}^3$, $i = 1, 2, \ldots, N$, denote the position of the bodies in an inertial reference frame. Normalizing the gravitational constant to 1, the equations of motion of the N-body problem have the form

$$\frac{d^2 u^{(i)}}{dt^2} = -\sum_{j=1,\ j \ne i}^{N} \frac{m_j\,(u^{(i)} - u^{(j)})}{|u^{(i)} - u^{(j)}|^3}, \qquad i = 1, \ldots, N \qquad [2]$$

In the case N = 2, one reduces to the two-body problem, which can be explicitly solved by means of Kepler's laws as follows. Consider, for example, the Earth–Sun case: for negative values of the energy, the trajectory of the Earth is an ellipse with one focus coinciding with the barycenter, which can practically be identified with the Sun; the Earth–Sun radius vector describes equal areas in equal times; the cube of the semimajor axis is proportional to the square of the period of revolution.

Consider now an extension to the study of three bodies such that in the Keplerian approximation $P_2$ and $P_3$ move around $P_1$ and such that the semimajor axis of $P_2$ is greater than that of $P_3$ (an example is obtained by identifying $P_1$ with the Sun, $P_2$ with Jupiter, and $P_3$ with an asteroid of the main belt). The three-body problem is described by [2] setting N = 3; a special case is given by the restricted three-body problem, which describes the evolution of a "zero-mass" body under the gravitational attraction exerted by an assigned two-body system. Setting N = 3 and $m_3 = 0$ in [2], the

equations governing the restricted three-body problem are given by

$$\begin{aligned}
\frac{d^2 u^{(1)}}{dt^2} &= -\frac{m_2\,(u^{(1)} - u^{(2)})}{|u^{(1)} - u^{(2)}|^3} \\
\frac{d^2 u^{(2)}}{dt^2} &= -\frac{m_1\,(u^{(2)} - u^{(1)})}{|u^{(2)} - u^{(1)}|^3} \\
\frac{d^2 u^{(3)}}{dt^2} &= -\frac{m_1\,(u^{(3)} - u^{(1)})}{|u^{(3)} - u^{(1)}|^3} - \frac{m_2\,(u^{(3)} - u^{(2)})}{|u^{(3)} - u^{(2)}|^3}
\end{aligned}$$

The first two equations concern the motion of the primaries $P_1$ and $P_2$ and they correspond to a Keplerian two-body problem, whose solution can be inserted in the equation for $u^{(3)}$, which becomes a periodically forced second-order equation. The restricted three-body problem can be conveniently described in terms of suitable action-angle coordinates, known as Delaunay variables. The present discussion is restricted to the planar case, namely we assume that the motion of the three bodies takes place on the same plane. The corresponding Delaunay variables, say $(L, G, \ell, \omega) \in \mathbb{R}^2 \times \mathbb{T}^2$, are defined as follows (Szebehely 1967). Let a and e be, respectively, the semimajor axis and the eccentricity of the osculating orbit of $P_3$ and let $\mu = 1/m_1^{2/3}$; then Delaunay's action variables are given by

$$L = \mu \sqrt{m_1 a}, \qquad G = L \sqrt{1 - e^2}$$

Next, introduce the angle variables: we denote by $\psi$ and $\varphi$ the longitudes of Jupiter and of the asteroid; let $\omega$ be the argument of perihelion, namely the angle formed by the periapsis direction with a preassigned reference line, and let u denote the eccentric anomaly, which can be defined through

$$\tan\frac{\varphi - \omega}{2} = \sqrt{\frac{1 + e}{1 - e}}\ \tan\frac{u}{2} \qquad [3]$$

Let $\ell$ be the mean anomaly, which is related to the eccentric anomaly by means of Kepler's equation

$$\ell = u - e \sin u \qquad [4]$$

Delaunay's angle variables are represented by the mean anomaly $\ell$ and by the argument of perihelion $\omega$. For completeness, it should be remarked that the distance r between the minor body $P_3$ and the primary $P_1$ is related to the longitude and to the eccentric anomaly by means of the relations

$$r = \frac{a(1 - e^2)}{1 + e \cos(\varphi - \omega)} = a(1 - e \cos u) \qquad [5]$$

In a reference frame centered at one of the primaries, say $P_1$, let $H = H(L, G, \ell, \omega, \psi)$ denote the Hamiltonian function describing the planar problem; notice that $H(L, G, \ell, \omega, \psi)$ has two degrees of freedom and an explicit time dependence through the longitude $\psi$ of $P_2$. If the primaries are assumed to move in circular orbits around their common center of mass, the Hamiltonian function reduces to two degrees of freedom, where a new variable g is introduced as the difference between the argument of perihelion $\omega$ and the longitude $\psi$ of the primary. Normalizing the units of measure so that the distance between the primaries and the sum of their masses is unity, the Hamiltonian function H describing the circular, planar, restricted three-body problem is given by

$$H(L, G, \ell, g) = -\frac{1}{2L^2} - G + \varepsilon F(L, G, \ell, g) \qquad [6]$$

where $\varepsilon = m_2$. The perturbing function takes the form

$$F = r \cos(f + g) - \frac{1}{\sqrt{1 + r^2 - 2r \cos(f + g)}}$$

where $f = \varphi - \omega$ represents the true anomaly, namely the angle formed by the instantaneous orbital radius with the periapsis line. Notice that the quantities r and f are functions of the Delaunay variables through the relations [3]–[5]. As a consequence, one can expand the perturbing function in the form (Delaunay 1860)

$$F(L, G, \ell, g) = \sum_{j,k \ge 0} F_{jk}(\ell, g)\, e^j a^k$$

where $F_{jk}$ are cosine terms with arguments given by a linear combination of the variables $\ell$ and g. For example, the first few terms of the series development are given by the following expression:

$$\begin{aligned}
F(L, G, \ell, g) = {} & -1 - \frac{L^4}{4} - \frac{9}{64} L^8 + \frac{L^4 e}{2} \cos \ell \\
& - \left(\frac{3}{8} L^6 + \frac{15}{64} L^{10}\right) \cos(\ell + g) \\
& + \frac{9}{4} L^4 e \cos(\ell + 2g) \\
& - \left(\frac{3}{4} L^4 + \frac{5}{16} L^8\right) \cos(2\ell + 2g) \\
& - \frac{3}{4} L^4 e \cos(3\ell + 2g) \\
& - \left(\frac{5}{8} L^6 + \frac{35}{128} L^{10}\right) \cos(3\ell + 3g) \\
& - \frac{35}{64} L^8 \cos(4\ell + 4g) \\
& - \frac{63}{128} L^{10} \cos(5\ell + 5g) + \cdots
\end{aligned} \qquad [7]$$
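Kepler's equation [4] cannot be inverted in closed form for the eccentric anomaly, so in practice the relations [3] to [5] are evaluated numerically. The following is a minimal sketch, assuming an elliptic orbit with eccentricity 0 ≤ e < 1 and a mean anomaly given in radians within [0, 2π); the function names and the choice of Newton's method are illustrative assumptions, not something prescribed by the text.

```python
import math


def solve_kepler(mean_anomaly: float, e: float,
                 tol: float = 1e-12, max_iter: int = 50) -> float:
    """Solve Kepler's equation [4], l = u - e*sin(u), for the eccentric
    anomaly u by Newton's method (assumes 0 <= e < 1)."""
    # Simple starting guess; pi is a safer start for very eccentric orbits.
    u = mean_anomaly if e < 0.8 else math.pi
    for _ in range(max_iter):
        residual = u - e * math.sin(u) - mean_anomaly
        step = residual / (1.0 - e * math.cos(u))  # derivative is > 0 for e < 1
        u -= step
        if abs(step) < tol:
            break
    return u


def true_anomaly(u: float, e: float) -> float:
    """True anomaly f from relation [3]: tan(f/2) = sqrt((1+e)/(1-e)) tan(u/2).
    atan2 keeps the correct quadrant without dividing by cos(u/2)."""
    return 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(u / 2.0),
                            math.sqrt(1.0 - e) * math.cos(u / 2.0))


def radius_from_true_anomaly(a: float, e: float, f: float) -> float:
    """First form of relation [5]: r = a(1 - e^2) / (1 + e cos f)."""
    return a * (1.0 - e * e) / (1.0 + e * math.cos(f))


def radius_from_eccentric_anomaly(a: float, e: float, u: float) -> float:
    """Second form of relation [5]: r = a(1 - e cos u)."""
    return a * (1.0 - e * math.cos(u))


if __name__ == "__main__":
    a, e, mean_anomaly = 1.0, 0.5, 1.0  # illustrative values only
    u = solve_kepler(mean_anomaly, e)
    f = true_anomaly(u, e)
    print(u, f,
          radius_from_true_anomaly(a, e, f),
          radius_from_eccentric_anomaly(a, e, u))
```

The two radius functions implement the two sides of [5], so their agreement on the same orbit point is a quick consistency check on the solver and on relation [3].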

where the eccentricity is a function of the actions through $e = \sqrt{1 - G^2/L^2}$. We remark that the Hamiltonian [6] is nearly integrable with perturbing parameter $\varepsilon$; indeed, for $\varepsilon = 0$ one recovers the two-body problem describing the interaction between $P_1$ and $P_3$, which can be explicitly solved according to Kepler’s laws.

KAM Stability

Classical perturbation theory, as developed by Laplace, Lagrange, Delaunay, Poincaré, etc., does not allow investigation of the stability of the N-body problem, since the series defining the solution are generally divergent. In order to justify this statement, let us start by rewriting the unperturbed Hamiltonian in [6] as

$$h(L, G) = -\frac{1}{2L^2} - G \qquad [8]$$

so that [6] becomes $H(L, G, \ell, g) = h(L, G) + \varepsilon F(L, G, \ell, g)$. In order to remove the perturbation to the second order in the perturbing parameter, one looks for a change of variables $(L, G, \ell, g) \to (L', G', \ell', g')$ close to the identity, that is,

$$L = L' + \varepsilon \frac{\partial \Phi}{\partial \ell}(L', G', \ell, g)$$
$$G = G' + \varepsilon \frac{\partial \Phi}{\partial g}(L', G', \ell, g)$$
$$\ell' = \ell + \varepsilon \frac{\partial \Phi}{\partial L'}(L', G', \ell, g)$$
$$g' = g + \varepsilon \frac{\partial \Phi}{\partial G'}(L', G', \ell, g)$$

where $\Phi(L', G', \ell, g)$ is the generating function of the transformation. Let

$$\frac{\partial h}{\partial L}(L, G) = \frac{1}{L^3} \equiv \omega(L)$$

In order to perform a first-order perturbation theory, we look for a generating function $\Phi(L', G', \ell, g)$, such that the transformed Hamiltonian is integrable up to $O(\varepsilon^2)$, namely

$$h\!\left(L' + \varepsilon \frac{\partial \Phi}{\partial \ell},\; G' + \varepsilon \frac{\partial \Phi}{\partial g}\right) + \varepsilon F\!\left(L' + \varepsilon \frac{\partial \Phi}{\partial \ell},\; G' + \varepsilon \frac{\partial \Phi}{\partial g},\; \ell,\; g\right)$$
$$= h(L', G') + \varepsilon \left( \omega(L') \frac{\partial \Phi}{\partial \ell} - \frac{\partial \Phi}{\partial g} + F(L', G', \ell, g) \right) + O(\varepsilon^2) \equiv h_1(L', G') + O(\varepsilon^2)$$

where all derivatives of $\Phi$ are evaluated at $(L', G', \ell, g)$ and $h_1(L', G')$ is the new unperturbed Hamiltonian. If we denote by $F_0(L', G')$ the average of the perturbing function over the angle variables, the new unperturbed Hamiltonian takes the form

$$h_1(L', G') = h(L', G') + \varepsilon F_0(L', G')$$

Expanding $F$ in Fourier series as $F(L, G, \ell, g) = \sum_{n, m \in \mathbb{Z}} F_{nm}(L, G)\, e^{i(n\ell + mg)}$, the generating function is given by the following expression:

$$\Phi(L', G', \ell, g) = i \sum_{n, m \in \mathbb{Z} \setminus \{0\}} \frac{F_{nm}(L', G')}{\omega(L')\, n - m}\, e^{i(n\ell + mg)}$$

Indeed, with this choice $\omega(L')\, \partial\Phi/\partial\ell - \partial\Phi/\partial g = -(F - F_0)$, so the angle-dependent terms of order $\varepsilon$ cancel. The occurrence of small divisors of the form

$$\frac{1}{\omega(L)\, n - m}, \qquad n, m \in \mathbb{Z}$$

might prevent the convergence of the series defining the generating function. In particular, we remark that zero divisors occur whenever $\omega(L) = m/n$. This situation, which is called an m : n orbit–orbit resonance, implies that during a given interval of time the body $P_3$ makes m revolutions, whereas $P_2$ makes exactly n orbits about $P_1$.

The control of the occurrence of the small divisors was obtained through a theorem by A N Kolmogorov, who made a major breakthrough in the study of nearly integrable systems. He proved, under general assumptions, that some regions of the phase space are almost filled by maximal invariant tori. The theorem provides a constructive algorithm to give estimates on the perturbing parameter, ensuring the existence of some invariant surfaces. Kolmogorov’s theorem was later extended by V I Arnol’d and J Moser, giving rise to the so-called KAM theory. More precisely, the KAM theorem can be stated as follows (see, e.g., Arnol’d et al. (1997)): consider a real-analytic, nearly integrable Hamiltonian function and fix a rationally independent frequency vector $\omega$; if the unperturbed Hamiltonian is not degenerate and if the frequency satisfies a strong nonresonance assumption (called the diophantine condition), for sufficiently small values of the perturbing parameter, there exists an invariant torus on which a quasiperiodic motion with frequency $\omega$ takes place. A preliminary investigation of the stability of the N-body problem by means of KAM theory (Arnol’d et al. 1997) leads to the existence of large regions filled by quasiperiodic motions, provided the masses of the planets are sufficiently small. Arnol’d’s version of the KAM theorem has been applied by J Laskar and P Robutel to the spatial three-body planetary problem (the planetary problem concerns the study of the
dynamics of two bodies with comparable masses, moving in the gravitational field of a larger primary) and the existence of quasiperiodic motions has been proved for values of the ratio of semimajor axis less than 0.8 and for inclinations up to about 1°.

Concrete estimates on the strength of the perturbation were given by M Hénon: in the context of the three-body problem, the application of the original version of Arnol’d’s theorem allows one to prove the existence of invariant tori for values of the perturbing parameter (representing the Jupiter–Sun mass ratio) smaller than $10^{-333}$, while the implementation of Moser’s theorem provides an estimate of $10^{-50}$. We remark that the astronomical value of the Jupiter–Sun mass ratio amounts to about $10^{-3}$, showing a relevant discrepancy between KAM results and physical measurements.

More recently, KAM estimates have been refined and adapted to the study of significant problems of celestial mechanics (Celletti and Chierchia 1995). Strong improvements have been obtained combining accurate estimates with a computer-assisted implementation, where the computer is used to perform long computations concerning the development of the perturbing series and the check of KAM estimates. The numerical errors are controlled through the implementation of a suitable technique, known as interval arithmetic. In the framework of the planar, circular, restricted three-body problem, the stability of some asteroids has been proved by A Celletti and L Chierchia for realistic values of the perturbing parameter (e.g., for $\varepsilon = 10^{-3}$). A suitable approximation of the disturbing function (namely, a finite truncation of the series development as in [7]) has been considered. The result relies on an implementation of a computer-assisted isoenergetic KAM theorem and on the following remark: in the four-dimensional phase space, on a fixed energy level the invariant two-dimensional surfaces separate the phase space, providing the stability of the actions for all motions trapped between any two invariant tori. Since the action variables are related to the semimajor axis and to the eccentricity of the orbit, one obtains that the elliptic elements remain close to their initial values.

A computer-assisted KAM theorem has been applied by A Giorgilli and U Locatelli to the planetary (Jupiter–Saturn) problem. Using a suitable secular approximation, it can be shown that this model admits two invariant tori, which bound the orbits corresponding to the initial data of Jupiter and Saturn.

Nekhoroshev Stability

A different approach in order to study the stability of nearly integrable systems is provided by Nekhoroshev’s theorem (see, e.g., Arnol’d et al. (1997)), which guarantees, under smallness requirements, the stability of the motions on an open set of initial conditions for exponentially long times. Consider a Hamiltonian function of the form

$$H(y, x) = h(y) + \varepsilon f(y, x), \qquad (y, x) \in B \times \mathbb{T}^n \qquad [9]$$

where $B$ is an open subset of $\mathbb{R}^n$. We assume that $h$ and $f$ are analytic functions and that the integrable Hamiltonian $h$ satisfies a geometric condition, called steepness. We remark that functions such as $h(L, G)$ in [8] satisfy the steepness condition. For sufficiently small values of $\varepsilon$, Nekhoroshev’s theorem states that any motion $(y(t), x(t))$ satisfying Hamilton’s equations associated with [9] is bounded for a finite (but exponentially long) time, that is,

$$\| y(t) - y(0) \| \le y_0\, \varepsilon^{a}, \qquad \text{for } |t| \le t_0\, e^{(\varepsilon_0/\varepsilon)^{b}}$$

where $y_0$, $t_0$, $\varepsilon_0$, $a$, and $b$ are suitable positive constants.

Nekhoroshev’s theorem can be conveniently applied to the three-body problem, where it provides a confinement of the action variables, representing the semimajor axis and the eccentricity of the osculating orbit. Interesting applications of Nekhoroshev’s theorem concern the investigation of the triangular Lagrangian points in the spatial, restricted three-body problem. (The Lagrangian points are five equilibrium positions of the planar, restricted three-body problem in a synodic reference frame, which rotates with the angular velocity of the primaries. Two of such positions are called triangular, since the configuration of the three bodies is an equilateral triangle in the orbital plane.) Effective estimates were developed by A Giorgilli and C Skokos, showing the existence of a stability region around the Lagrangian point $L_4$, large enough to include some known asteroids. In the same framework, the exponential stability was proven by G Benettin, F Fassò, and M Guzzo for all values of the mass-ratio parameter, except for a few values of the reduced mass $\mu$ up to $\mu \simeq 0.038$.

Numerical Results

The study of the stability of the N-body problem can be investigated by performing numerical integrations of the equations of motion. The dynamics of the outer planets of the solar system (from Jupiter to Pluto) has been explored by Sussman and Wisdom (1992) using a dedicated computer, the Digital Orrery. The integration of the equations of motion was performed over 845 million years; the results provided evidence of the stability of the major planets and a chaotic behavior of Pluto. An
alternative approach, based on an average of the equations of motion over fast angles, was adopted by Laskar (1995), where the perturbing function of the spatial problem was expanded up to the second order in the masses and up to the fifth powers of the eccentricity and the inclination. The dynamics of all planets (excluding Pluto) was investigated by means of frequency analysis over a time span ranging from −15 Gyr to +10 Gyr. The numerical integrations provided evidence of the regularity of the external planets (from Jupiter to Neptune), a moderate chaotic behavior of Venus and the Earth, and a marked chaotic dynamics of Mercury and Mars. The computations show that the inner solar system is chaotic, with a Lyapunov time of about 5 Myr, thus preventing any prediction of the evolution over 100 Myr.

The Spin–Orbit Problem

The dynamics of the bodies of the solar system results from a combination of a revolution around a primary body and a rotation about an internal axis. A simple mathematical model describing the spin–orbit interaction can be introduced as follows. Let S be a triaxial ellipsoidal satellite, which moves about a central planet P. We denote by $T_{rev}$ and $T_{rot}$ the periods of revolution and rotation. A p : q spin–orbit resonance occurs if

$$\frac{T_{rev}}{T_{rot}} = \frac{p}{q}, \qquad \text{for } p, q \in \mathbb{N},\; q \neq 0$$

Whenever p = q = 1, the satellite always points the same face to the host planet. Most of the evolved satellites or planets are trapped in a 1 : 1 resonance, with the only exception of Mercury, which is observed in a nearly 3 : 2 resonance. In order to introduce a simple mathematical model which describes the spin–orbit interaction, we assume that:

1. the satellite moves on a Keplerian orbit around the planet (with semimajor axis a and eccentricity e);
2. the spin axis is perpendicular to the orbit plane;
3. the spin axis coincides with the shortest physical axis; and
4. dissipative effects as well as perturbations due to other planets or satellites are neglected.

We denote by A < B < C the principal moments of inertia of the satellite and by r and f, respectively, the instantaneous orbital radius and the true anomaly of the Keplerian orbit. Let x be the angle between the longest axis of the ellipsoid and a preassigned reference line. From standard Euler’s equations for rigid body, the equation of motion in normalized units (i.e., assuming that the period of revolution is $2\pi$) takes the form

$$\ddot{x} + \frac{\varepsilon}{r^3} \sin(2x - 2f) = 0 \qquad [10]$$

where $\varepsilon \equiv \frac{3}{2}(B - A)/C$. This equation is integrable whenever A = B or in the case of zero orbital eccentricity. Due to the assumption of Keplerian motion, both r and f are known functions of the time. Therefore, we can expand [10] in Fourier series as

$$\ddot{x} + \varepsilon \sum_{m \neq 0,\; m = -\infty}^{\infty} W\!\left(\frac{m}{2}, e\right) \sin(2x - mt) = 0 \qquad [11]$$

where the coefficients $W(m/2, e)$ decay as $W(m/2, e) \propto e^{|m - 2|}$. A further simplification of the model is obtained as follows. According to assumption (4), we neglected the dissipative forces and perturbations due to other bodies. The most important dissipative contribution is due to the nonrigidity of the satellite, provoking a tidal torque caused by the internal friction. The size of the dissipative effects is significantly small compared to the gravitational terms. Therefore, we decide to retain in [11] only those terms which are of the same order or larger than the average effect of the tidal torque. The following equation results:

$$\ddot{x} + \varepsilon \sum_{m \neq 0,\; m = N_1}^{N_2} \tilde{W}\!\left(\frac{m}{2}, e\right) \sin(2x - mt) = 0 \qquad [12]$$

where $N_1$ and $N_2$ are suitable integers, which depend on the physical and orbital parameters of the satellite, while $\tilde{W}(m/2, e)$ are suitable truncations of the coefficients $W(m/2, e)$. We remark that eqn [12] can be derived from Hamilton’s equations associated with a one-dimensional, time-dependent, nearly integrable Hamiltonian function with perturbing parameter $\varepsilon$ and a trigonometric disturbing function.

Analytical Results

The phase space associated with [12] admits a Poincaré map showing a pendulum-like structure: the periodic orbits are surrounded by librational curves and the chaotic separatrix divides the librational regime from the region where rotational motions can take place. The three-dimensional phase space is separated by KAM rotational tori into invariant regions, providing a strong stability property for all motions confined between any pair of KAM rotational tori. Let us denote by $P(p/q)$ a periodic orbit associated with the p : q resonance; in the context of the model associated with [12], the
stability of the periodic orbit $P(p/q)$ is obtained by showing the existence of two invariant tori $\mathcal{T}(\omega_1)$ and $\mathcal{T}(\omega_2)$ with $\omega_1 < p/q < \omega_2$. A refined computer-assisted KAM theorem has been implemented (Celletti 1990) with the aim of proving the existence of trapping invariant surfaces. Realistic estimates, in agreement with the physical values of the parameters (namely, the equatorial oblateness $\varepsilon$ and the eccentricity e), have been obtained in several examples of spin–orbit commensurabilities, like the 1 : 1 Moon–Earth interaction or the 3 : 2 Mercury–Sun resonance.

Concerning Nekhoroshev-type estimates, the classical D’Alembert problem has been studied by Biasco and Chierchia (2002). In particular, an equatorially symmetric oblate planet moving on a Keplerian orbit around a primary body has been investigated; the model does not assume any further constraint on the spin axis. Although the Hamiltonian describing this model is properly degenerate, it is shown that Nekhoroshev-like results apply to the D’Alembert problem in the proximity of a 1 : 1 resonance.

Numerical Results

The model introduced in [10]–[12] often represents an unrealistic simplification of the spin–orbit dynamics. In particular, assumption (1) implies that secular perturbations of the orbital parameters are neglected, whereas the hypothesis (2) corresponds to disregarding the spin–orbit obliquity, namely the angle formed by the rotational axis with the normal to the orbital plane. Due to the presence of an equatorial bulge, the gravitational attraction of the other bodies of the solar system induces a torque, resulting in a precessional motion. It is also important to take into account the changes of the obliquity angle, whose variations might affect the climatic behavior.

A realistic model for the precession and the variation of the obliquity has been presented by Laskar (1995). The numerical simulations and the frequency-map analysis show that the Earth’s obliquity is actually stable, although a large chaotic region is found in the interval between 60° and 90°. Since the present obliquity of the Earth amounts to about 23.3°, the Earth is outside the dangerous region. An interesting simulation was performed to evaluate the role played by the Moon. Without the Moon, the extent of the chaotic region would greatly increase, eventually preventing the birth of an evolved life. Among the inner planets, Mars’ obliquity shows a larger chaotic extent, which leads to variations from 0° to 60° in a few million years. On the contrary, the external planets do not show significant chaotic regions and their obliquities are essentially stable.

See also: Averaging Methods; Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Gravitational N-Body Problem (Classical); Hamiltonian Systems: Stability and Instability Theory; KAM Theory and Celestial Mechanics; Multiscale Approaches; Stability Theory and KAM.

Further Reading

Arnol’d VI, Kozlov VV, and Neishtadt AI (1997) Mathematical Aspects of Classical and Celestial Mechanics. Berlin: Springer.
Biasco L and Chierchia L (2002) Nekhoroshev stability for the D’Alembert problem of celestial mechanics. Atti Accademia Nazionale Lincei Classe Scienze Fisiche Matematiche Naturali Rendiconti Lincei (9) Matematica Applicata 13(2): 85–89.
Celletti A (1990) Analysis of resonances in the spin–orbit problem in celestial mechanics: the synchronous resonance (Part I). Journal of Applied Mathematics and Physics (ZAMP) 41: 174–204.
Celletti A and Chierchia L (1995) A Constructive Theory of Lagrangian Tori and Computer-Assisted Applications. Dynamics Reported, New Series, vol. 4, pp. 60–129. Berlin: Springer.
Delaunay C (1860) Théorie du Mouvement de la Lune, Mémoires de l’Académie des Sciences 1, Tome XXVIII, Paris.
Laskar J (1995) Large scale chaos and marginal stability in the solar system. In: XIth ICMP Meeting (Paris, July 1994), pp. 75–120. Cambridge, MA: International Press.
Poincaré H (1892) Les Méthodes Nouvelles de la Mécanique Céleste. Paris: Gauthier-Villars.
Sussman GJ and Wisdom J (1992) Chaotic evolution of the solar system. Science 257: 56–62.
Szebehely V (1967) Theory of Orbits. New York: Academic Press.
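As a numerical companion to the spin–orbit model, the following is a minimal sketch of how equation [10] could be integrated. It assumes normalized units with semimajor axis a = 1 and mean anomaly ℓ = t, solves Kepler’s equation [4] by Newton’s method, and obtains r and f from the eccentric anomaly through relations [5] and [3] (with the angle in [3] measured from perihelion). The function names, the fourth-order Runge–Kutta scheme, and the parameter values are illustrative choices made here, not part of the article.

```python
import math


def solve_kepler(mean_anomaly, e, tol=1e-13, max_iter=50):
    """Solve Kepler's equation  l = u - e*sin(u)  for u by Newton's method."""
    u = mean_anomaly if e < 0.8 else math.pi  # common starting guess
    for _ in range(max_iter):
        delta = (u - e * math.sin(u) - mean_anomaly) / (1.0 - e * math.cos(u))
        u -= delta
        if abs(delta) < tol:
            break
    return u


def orbit(t, e):
    """Orbital radius r and true anomaly f at time t (a = 1, period 2*pi)."""
    u = solve_kepler(t, e)
    r = 1.0 - e * math.cos(u)  # r = a (1 - e cos u)
    # tan(f/2) = sqrt((1+e)/(1-e)) * tan(u/2), written with atan2 for robustness
    f = 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(u / 2.0),
                         math.sqrt(1.0 - e) * math.cos(u / 2.0))
    return r, f


def acceleration(t, x, eps, e):
    """Right-hand side of [10]:  x'' = -(eps / r^3) * sin(2x - 2f)."""
    r, f = orbit(t, e)
    return -(eps / r**3) * math.sin(2.0 * x - 2.0 * f)


def integrate_spin_orbit(x0, v0, eps, e, t_end, n_steps):
    """Integrate [10] from t = 0 to t_end with classical RK4; return (x, x')."""
    h = t_end / n_steps
    t, x, v = 0.0, x0, v0
    for _ in range(n_steps):
        k1x, k1v = v, acceleration(t, x, eps, e)
        k2x = v + 0.5 * h * k1v
        k2v = acceleration(t + 0.5 * h, x + 0.5 * h * k1x, eps, e)
        k3x = v + 0.5 * h * k2v
        k3v = acceleration(t + 0.5 * h, x + 0.5 * h * k2x, eps, e)
        k4x = v + h * k3v
        k4v = acceleration(t + h, x + h * k3x, eps, e)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += h
    return x, v


if __name__ == "__main__":
    # Hypothetical values: eps = 0.1, Mercury-like eccentricity, one revolution.
    x, v = integrate_spin_orbit(0.0, 1.5, 0.1, 0.2056, 2.0 * math.pi, 4000)
    print(x, v)
```

For zero eccentricity the orbit is circular (r = 1, f = t), so the synchronous rotation x(t) = t is an exact solution of [10]; this gives a simple sanity check of the integrator.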
Stability Theory and KAM


G Gentile, Università degli Studi “Roma Tre,” Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A Hamiltonian system is a dynamical system whose equations of motion can be written in terms of a scalar function, called the Hamiltonian of the system: if one uses coordinates $(p, q)$ in a domain (phase space) $D \subset \mathbb{R}^{2N}$, where N is the number of independent variables one needs to identify a configuration of the system (degrees of freedom), there is a function $H(p, q)$ such that $\dot{p} = -\partial H/\partial q$ and $\dot{q} = \partial H/\partial p$. An integrable (Hamiltonian) system is a Hamiltonian system which, in suitable coordinates $(A, \boldsymbol{\alpha}) \in \mathcal{A} \times \mathbb{T}^N$, where $\mathcal{A}$ is an open subset of $\mathbb{R}^N$ and $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$ is the standard torus, can be described by a Hamiltonian $H_0(A)$, that is, depending only on $A$. The coordinates $(A, \boldsymbol{\alpha})$ are called action-angle variables. In such a case the dynamics is trivial: any initial condition $(A_0, \boldsymbol{\alpha}_0)$ evolves in such a way that the action variables are constants of motion (i.e., $A(t) = A_0$ for all $t \in \mathbb{R}$), while the angles grow linearly in time as $\boldsymbol{\alpha}(t) = \boldsymbol{\alpha}_0 + \boldsymbol{\omega} t$, where $\boldsymbol{\omega} = \boldsymbol{\omega}(A_0) \equiv \partial_A H_0(A_0)$ is called the rotation (or frequency) vector. An integrable system can be thought of as a collection of decoupled (i.e., independent) rotators: the entire phase space $\mathcal{A} \times \mathbb{T}^N$ is foliated into invariant tori and all motions are quasiperiodic. Integrable systems are stable, in the sense that nearby initial conditions separate at most linearly in time (in particular, the actions do not separate at all): mathematically, this is expressed by the fact that all the Lyapunov exponents are nonpositive.

An example of an integrable system is any one-dimensional conservative mechanical system, in any region of phase space in which motions are bounded. By increasing the number of degrees of freedom, exhibiting nontrivial integrable systems can become a difficult task. The problem of studying the effects of even small Hamiltonian perturbations on integrable systems and of understanding if the latter remain stable, in the aforementioned sense, was considered by Poincaré to be the fundamental problem of dynamics. For a long time, it was commonly thought that all motions could be reduced to superpositions of periodic motions, hence to quasiperiodic motions, but at the end of the nineteenth century it was realized by Boltzmann and Poincaré that such a picture was too naive, and that in reality more complicated motions were possible. As a consequence of this, it became a widespread belief that, even when starting from an integrable system, the introduction of an arbitrarily small perturbation would break integrability.

This belief was strengthened by the work of Poincaré (1898), who showed that the series describing the solution in a perturbation theory approach are in general divergent. The source of divergence in perturbation series is the presence of small divisors, that is, of denominators of the kind of $\boldsymbol{\omega} \cdot \boldsymbol{\nu}$, where $\boldsymbol{\omega}$ is the rotation vector that should characterize the invariant torus (if existent) and $\boldsymbol{\nu}$ is any integer vector. Despite this, however, perturbation series (known as Lindstedt series) continued to be extensively used by astronomers in problems of celestial mechanics, such as the study of planetary motions, for the simple reason that they provided predictions in good agreement with the observations. But the feeling that the underlying mathematical tools were unsatisfactory persisted.

In fact, the well-known Fermi–Pasta–Ulam numerical experiment, in 1955, was originally conceived in the spirit of confirming that integrability would in general be easily lost. Consider a chain with N harmonic oscillators, with, say, periodic boundary conditions, coupled with cubic and quartic two-body potentials, so that the Hamiltonian is

$$H(p, q) = \sum_{i=1}^{N} \left[ \frac{1}{2} p_i^2 + W(q_{i+1} - q_i) \right], \qquad W(x) = \frac{1}{2} x^2 + \frac{\alpha}{3} x^3 + \frac{\beta}{4} x^4 \qquad [1]$$

for $\alpha, \beta$ real parameters and $(p, q) \in \mathbb{R}^N \times \mathbb{R}^N$. One can introduce new variables such that the Hamiltonian, for $\alpha = \beta = 0$, can be written as

$$H_0(A) = \frac{1}{2} \sum_{k=1}^{N} \left( P_k^2 + \omega_k^2 Q_k^2 \right) = \boldsymbol{\omega} \cdot A \qquad [2]$$

for a suitable rotation vector $\boldsymbol{\omega} = (\omega_1, \ldots, \omega_N) \in \mathbb{R}^N$ (an explicit computation gives $\omega_k = 2 \sin(k\pi/N)$).

Consider an initial condition in which all the energy is confined to a few modes, that is, $A_k \neq 0$ at $t = 0$ only for a few values of k. For $\alpha = \beta = 0$, the system is integrable, so that $A_k(t) = 0$ for all $t \in \mathbb{R}$ and for all k such that $A_k(0) = 0$. If the system ceases to be integrable when the perturbation is switched on, the energy is likely to start to be shared among the various modes, and after a long enough time has
elapsed, an equidistribution of the energy among all modes (thermalization) might be expected. At least this behavior was expected by Fermi, Pasta, and Ulam, but it was not what they found numerically: on the contrary, all the energy seemed to remain associated with the modes close to the few initially excited ones.

At about the same time, Kolmogorov (1954) published a breakthrough paper going exactly in the opposite direction: if one perturbs an integrable system, under some mild conditions on the integrable part, most of the tori are preserved, although slightly deformed. A more precise statement is the following.

Theorem 1 Let an N-degree-of-freedom Hamiltonian system be described by an analytic Hamiltonian of the form

$$H(A, \boldsymbol{\alpha}) = H_0(A) + \varepsilon f(A, \boldsymbol{\alpha}) \qquad [3]$$

with $\varepsilon$ a real parameter (perturbation parameter), $f$ a $2\pi$-periodic function of each angle variable (potential or perturbation), and $H_0(A)$ satisfying the nondegeneracy condition $\det \partial_A^2 H_0(A) \neq 0$ (anisochrony condition). If $\boldsymbol{\omega} = \boldsymbol{\omega}(A) \equiv \partial_A H_0(A)$ is fixed to satisfy the Diophantine condition

$$|\boldsymbol{\omega} \cdot \boldsymbol{\nu}| > \frac{C_0}{|\boldsymbol{\nu}|^{\tau}} \qquad \forall \boldsymbol{\nu} \in \mathbb{Z}^N \setminus \{0\} \qquad [4]$$

for some constants $C_0 > 0$ and $\tau > N - 1$ (here $|\boldsymbol{\nu}| = |\nu_1| + \cdots + |\nu_N|$ and $\cdot$ denotes the standard inner product: $\boldsymbol{\omega} \cdot \boldsymbol{\nu} = \omega_1 \nu_1 + \cdots + \omega_N \nu_N$), then there is an invariant torus with rotation vector $\boldsymbol{\omega}$ for $\varepsilon$ small enough, say for $\varepsilon$ smaller than some value $\varepsilon_0$ depending on $C_0$ and $\tau$ (and on the function $f$).

By saying that there is an invariant torus with rotation vector $\boldsymbol{\omega}$, one means that there is an invariant surface in phase space on which, in suitable coordinates, the dynamics is the same as in the unperturbed case, and the conjugation (i.e., the change of variables which leads to such coordinates) is analytic in the angle variables and in the perturbation parameter. One also says that the torus of an integrable system ($\varepsilon = 0$) is preserved (or even persists) under a small perturbation.

Note that, a posteriori, this proves convergence of the perturbation series: however, a direct check of convergence was performed only recently by Eliasson (1996). Kolmogorov’s proof was based on a completely different idea, that is, by performing iteratively a sequence of canonical transformations (which are changes of coordinates preserving the Hamiltonian structure of the equations of motion) such that at each step the size of the perturbation is reduced. Of course, on the basis of Poincaré’s result, this iterative procedure cannot work for all initial conditions (e.g., when $\boldsymbol{\omega}$ does not satisfy [4]). The key point in Kolmogorov’s scheme is to fix the rotation vector $\boldsymbol{\omega}$ of the torus one is looking for, in such a way that the small divisors are controlled through the Diophantine condition [4] and the exponentially fast convergence of the algorithm.

New proofs and extensions of Kolmogorov’s theorem were given later by Arnol’d (1962) and by Moser (1962); hence, the acronym KAM to denote such a theorem. Arnol’d gave a more detailed (and slightly different) proof compared to the original one by Kolmogorov, and applied the result to the planar three-body problem, thus showing that physical applications of the theorem were possible. Moser, on the other hand, proposed a modified method using a technique introduced by Nash (which approximates smooth functions with analytical ones) to deal with the case of systems with finite smoothness.

For fixed small enough $\varepsilon$, the surviving invariant tori cover a large portion of the phase space, called the Kolmogorov set; the relative measure of the region of phase space which is not filled by such tori tends to zero at least as $\sqrt{\varepsilon}$ for $\varepsilon \to 0$. A system described by a Hamiltonian like [3] is then called a quasi-integrable Hamiltonian system.

The excluded region of phase space corresponds to the unperturbed tori which are destroyed by the perturbation: the rotation vectors of such tori are close to a resonance, that is, to a value $\boldsymbol{\omega}$ such that $\boldsymbol{\omega} \cdot \boldsymbol{\nu} = 0$ for some integer vector $\boldsymbol{\nu}$, and these are exactly the vectors which do not satisfy the Diophantine condition [4] for any value $C_0$. A subset of phase space of this kind is called a resonance region.

At first sight, this would seem to provide an explanation for the results found by Fermi, Pasta, and Ulam, but this is not quite the case. First, the threshold value $\varepsilon_0$ depends on N, and goes to zero very fast as $N \to \infty$ (in general as $N!^{-\alpha}$ for some $\alpha > 0$); however, the results of the numerical experiments apparently were insensitive to the number N of oscillators. Second, the KAM theorem deals with maximal tori, that is, tori characterized by rotation vectors which have as many components as the number of degrees of freedom, while the rotation vectors of the numerical quasiperiodic solutions seem to involve just a small number of components.

Finally, as an extra problem, the validity of the nondegeneracy condition for the unperturbed
Hamiltonian is violated, because the unperturbed Hamiltonian is linear in the action variables (one says that the Hamiltonian is isochronous). Recently, Rink (2001), by continuing the work by Nishida, showed that in the Fermi–Pasta–Ulam problem it is possible to perform a canonical change of coordinates such that in the new variables the Hamiltonian becomes anisochronous: one uses part of the perturbation to remove isochrony. But the other two obstacles remain.

Lower-Dimensional Tori

A natural question is what happens to the invariant tori corresponding to rotation vectors which are not rationally independent, that is, vectors satisfying n resonance conditions, such as $\boldsymbol{\omega} \cdot \boldsymbol{\nu}_i = 0$ for n independent vectors $\boldsymbol{\nu}_1, \ldots, \boldsymbol{\nu}_n$, with $1 \le n \le N - 2$ (the case $n = N - 1$ corresponds to periodic orbits and is comparatively easy); for instance, one can take $\boldsymbol{\omega} = (\omega_1, \ldots, \omega_n, 0, \ldots, 0)$ and, by a suitable linear change of coordinates, one can always make the reduction to a case of this kind. In particular, one can ask if a result analogous to the KAM theorem holds for these tori. Such a problem for the model [3] has not been studied very widely in the literature. What has usually been considered is a system of n rotators coupled with a system with $s = N - n$ degrees of freedom near an equilibrium point: then one calls normal coordinates the coordinates describing the latter, and the role of the parameter $\varepsilon$ is played by the size of the normal coordinates (if their initial conditions are chosen near the equilibrium point). In the absence of perturbation (i.e., for $\varepsilon = 0$), one has either hyperbolic or elliptic or, more generally, mixed tori, according to the nature of the equilibrium points: one refers to these tori as lower-dimensional tori, as they represent n-dimensional invariant surfaces in a system with N degrees of freedom. Then one can study the preservation of such tori.

One can prove that, in such a case, at least if certain generic conditions are satisfied, in suitable coordinates, n angles rotate with frequencies $\omega_1, \ldots, \omega_n$, respectively, while the remaining $N - n$ angles have to be fixed close to some values corresponding to the extremal points of the function obtained by averaging the potential over the rotating angles.

The case of hyperbolic tori is easier, as in the case of elliptic tori one has to exclude some values of $\varepsilon$ to avoid some further resonance conditions between the rotation vector $\boldsymbol{\omega}$ and the normal frequencies $\lambda_k$ (i.e., the eigenvalues of the linearized system corresponding to the normal coordinates), known as the first and second Mel’nikov conditions:

$$|\boldsymbol{\omega} \cdot \boldsymbol{\nu} \pm \lambda_k| > \frac{C_0}{|\boldsymbol{\nu}|^{\tau}} \qquad \forall \boldsymbol{\nu} \in \mathbb{Z}^N \setminus \{0\},\; \forall\, 1 \le k \le s$$
$$|\boldsymbol{\omega} \cdot \boldsymbol{\nu} \pm \lambda_k \pm \lambda_{k'}| > \frac{C_0}{|\boldsymbol{\nu}|^{\tau}} \qquad \forall \boldsymbol{\nu} \in \mathbb{Z}^N \setminus \{0\},\; \forall\, 1 \le k, k' \le s \qquad [5]$$

Such conditions appear, with the values of the normal frequencies slightly modified by terms depending on $\varepsilon$, at each iterative step, and at the end only for values of $\varepsilon$ belonging to some Cantor set one can have elliptic lower-dimensional tori.

The second Mel’nikov conditions are not really necessary, and in fact they can be relaxed as Bourgain (1994) has shown; this is an important fact, as it allows degenerate normal frequencies, which were forbidden in the previous works by Kuksin (1987), Eliasson (1988), and Pöschel (1989).

Similar results also apply in the case of lower-dimensional tori for the model [3], which represents a sort of degenerate situation, as the normal frequencies vanish for $\varepsilon = 0$. Again, one has to use part of the perturbation to remove the complete degeneracy of normal frequencies.

Quasiperiodic Solutions in Partial Differential Equations

For explaining the Fermi–Pasta–Ulam experiment, one has to deal with systems with arbitrarily many degrees of freedom. Hence, it is natural to investigate systems which have ab initio infinitely many degrees of freedom, such as the nonlinear wave equation, $u_{tt} - u_{xx} + V(x)u = \varphi(u)$, the nonlinear Schrödinger equation, $i u_t - u_{xx} + V(x)u = \varphi(u)$, the nonlinear Korteweg–de Vries equation $u_t + u_{xxx} - 6 u_x u = \varphi(u)$, and other systems of nonlinear partial differential equations (PDEs); the continuum limit of the Fermi–Pasta–Ulam model gives indeed a nonlinear Korteweg–de Vries equation, as shown by Zabusky and Kruskal (1965). Here $(t, x) \in \mathbb{R} \times [0, \pi]^d$, if d is the space dimension, and either periodic ($u(0, t) = u(\pi, t)$) or Dirichlet ($u(0, t) = u(\pi, t) = 0$) boundary conditions can be considered; $\varphi(u)$ is a function analytic in $u$ and starting from orders strictly higher than one, while $V(x)$ is an analytic function of $x$, depending on extra parameters $\xi_1, \ldots, \xi_n$. Such a function is introduced essentially for technical reasons, as we shall see that the eigenvalues $\lambda_k$ of the Sturm–Liouville operator $-\partial_x^2 + V(x)$ must satisfy some Diophantine conditions. If we set $V(x) = \mu \in \mathbb{R}$ in the nonlinear wave equation, we obtain the Klein–Gordon equation, which, in the particular case $\mu = 0$,
reduces to the string equation. Again, the role of the perturbation parameter is played by the size of the solution itself.

Small-amplitude periodic and quasiperiodic solutions for PDE systems have been extensively studied, among others, by Kuksin, Wayne, Craig, Pöschel, and Bourgain. Results for such systems read as follows. Consider for concreteness the one-dimensional nonlinear wave equation with Dirichlet boundary conditions and with $\varphi(u) = u^3 + O(u^5)$. When the nonlinear function $\varphi(u)$ is absent, any solution of the linear wave equation $u_{tt} - u_{xx} + V(x)u = 0$ is a superposition of either finitely or infinitely many periodic solutions with frequencies $\lambda_k$ determined by the function V(x). Let $u_0(\boldsymbol{\omega} t, x)$ be a quasiperiodic solution of the linear wave equation with rotation vector $\boldsymbol{\omega} \in \mathbb{R}^n$, where $\omega_k = \lambda_{m_k}$, for some n-tuple $\{m_1, \ldots, m_n\}$. Then for $\varepsilon$ small enough there exists a subset $\Xi_\varepsilon$ of the space of parameters with large Lebesgue measure (more precisely, with complementary Lebesgue measure which tends to zero when $\varepsilon \to 0$) such that for all $\xi = (\xi_1, \ldots, \xi_n) \in \Xi_\varepsilon$ there is a solution $u_\varepsilon(t, x)$ of the nonlinear wave equation and a rotation vector $\boldsymbol{\omega}_\varepsilon$ satisfying the conditions

$$|u_\varepsilon(t, x) - \sqrt{\varepsilon}\, u_0(\boldsymbol{\omega}_\varepsilon t, x)| \le C\varepsilon, \qquad |\boldsymbol{\omega}_\varepsilon - \boldsymbol{\omega}| < C\varepsilon$$ [6]

for some positive constant C.

The case n = 1 (periodic solutions) is not as easy as the finite-dimensional case, because there are infinitely many normal frequencies, so that there are small divisor problems which for finite-dimensional systems appear only for $n \ge 2$.

For the nonlinear wave equation and the Schrödinger equation, if $n \ge 1$, one can take $V(x) = \mu$, but one needs $\mu \ne 0$; for n > 1, one can take $V(x) = \mu$, as one can perform a preliminary transformation leading to an equation in which a function depending on parameters naturally appears, as shown by Kuksin and Pöschel (1996). For n = 1, the case $\mu = 0$ has been very recently solved by Gentile et al. (2005).

Statements for more general situations can also be obtained, while extensions to space dimensions $d \ge 2$ are not trivial and have been obtained only recently by Bourgain (1998). The above result also holds if the number of components of the rotation vector is less than the number of parameters: one uses such parameters because one needs to impose some Diophantine conditions such as [5], now for all the frequencies $\lambda_k = \omega_k$, $k \notin \{m_1, \ldots, m_n\}$. Again, the second Mel’nikov conditions were shown by Bourgain to be unnecessary, and this is an essential ingredient for the higher-dimensional case.

Even if systems of the type considered above have been widely studied, they remain significantly different from a discrete system such as the chain of oscillators [1] for N large enough (also in the limit $N \to \infty$), so that the results which have been found for PDE systems do not really provide an explanation for the numerical findings.

Also in the case of lower-dimensional tori for finite-dimensional systems the main problem is that, even if such tori exist, it is not clear what relevance they can have for the dynamics (a case in which hyperbolic tori play a role is considered later). An important feature of maximal tori is that they fill most of the phase space, a property which certainly does not hold for lower-dimensional tori, which lie outside the Kolmogorov set.

In the Fermi–Pasta–Ulam experiment, one considers initial conditions close to lower-dimensional tori; hence, an interesting problem is to study their stability, that is, how fast the trajectories starting from such initial conditions drift away.

Arnol’d Diffusion and Nekhoroshev’s Theorem

Consider again the maximal tori. For N = 2, the preservation of most of the invariant tori prevents the possibility of diffusion in phase space: the tori represent two-dimensional surfaces in a three-dimensional space (as dynamics occur on the level surfaces of the energy in a four-dimensional space), so that, if an initial condition is trapped in a gap between two tori, the corresponding trajectory remains confined forever between them. The situation is quite different for $N \ge 3$: in such a case, the tori do not represent a topological obstruction to diffusion any more.

That mechanisms of diffusion are really possible was shown by Arnol’d (1964). Because of the perturbation, lower-dimensional hyperbolic tori appear inside the resonance regions, with their stable and unstable manifolds (whiskers). It is possible that these manifolds of the same torus intersect with a nonvanishing angle (homoclinic angle); as a consequence, the angles between the stable and unstable manifolds of nearby tori (heteroclinic angles) can also be different from zero, and one can find a set of hyperbolic lower-dimensional tori such that the unstable manifold of each of them intersects the stable manifold of the torus next to it: one says that such tori form a transition chain of heteroclinic connections. Then there can be trajectories moving along such connections, producing at the end a drift of order 1 (in $\varepsilon$) in the action variables. Such a phenomenon is referred to as Arnol’d diffusion.
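The dimension count behind the distinction between N = 2 and N ≥ 3 can be written out explicitly. The following is a short worked sketch rather than part of the original argument; it assumes only the standard facts that an energy level set of a system with N degrees of freedom has dimension 2N − 1 and that a maximal invariant torus has dimension N.

```latex
\begin{align*}
\dim(\text{phase space})      &= 2N, \\
\dim(\text{energy surface})   &= 2N - 1, \\
\dim(\text{maximal torus})    &= N, \\
\operatorname{codim}(\text{torus in energy surface}) &= (2N - 1) - N = N - 1 .
\end{align*}
% N = 2:   codimension 1, so each torus divides the energy surface into two parts.
% N >= 3:  codimension >= 2, so the complement of the tori stays connected.
```

For N = 2 the codimension is 1, so a trajectory lying between two tori cannot leave the gap. For N ≥ 3 the codimension is at least 2, and a trajectory can pass around the tori, much as a curve can pass around a line in three-dimensional space.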

Of course, diffusing trajectories should be located in the region of phase space where there are no invariant tori (hence, a very small region when $\varepsilon$ is small), but an important consequence is that, unlike what happens in the unperturbed case, not all motions are stable: in particular, the action variables can change by a large amount over long times.

Providing interesting examples of Hamiltonian systems in which Arnol’d diffusion can occur is not so easy: in fact, for the diffusion to really occur, one needs a lower bound on the homoclinic angles, and to evaluate these angles can be difficult. For instance, Arnol’d’s (1964) original example, which describes a system near a resonance region, is a two-parameter system given by

$$\frac{1}{2}\left(A_1^2 + A_2^2\right) + A_3 + \mu(\cos\alpha_1 - 1) + \varepsilon\mu(\cos\alpha_1 - 1)(\sin\alpha_2 + \cos\alpha_3)$$ [7]

and the angles can be proved to be bounded from below only by assuming that the perturbation parameter $\varepsilon$ is exponentially small with respect to the other parameter $\mu$, which in turn implies a situation not really convincing from a physical point of view. More generally, for all the examples which are discussed in the literature, the relation with physics (such as the d’Alembert problem on the possibility for a planet to change the inclination of the precession cone) is not obvious.

So the question naturally arises as to how fast such a mechanism of diffusion can be, and how relevant it is for practical purposes. A first answer is provided by a theorem of Nekhoroshev (1977), which states the following result.

Theorem 2. Suppose we have an N-degree-of-freedom quasi-integrable Hamiltonian system, where the unperturbed Hamiltonian satisfies some condition such as convexity (or a weaker one, known as steepness, which is rather involved to state in a concise way); for concreteness consider a function $H_0(A)$ in [2] which is quadratic in A. Then there are two positive constants a and b such that for times t up to $O(\exp(\varepsilon^{-b}))$ the variations of the action variables cannot be larger than $O(\varepsilon^{a})$.

The constants a and b depend on N, and they tend to zero when $N \to \infty$; Lochak and Neĭshtadt (1992) and Pöschel (1993) found estimates $a = b = 1/(2N)$, which are probably in general optimal. Nekhoroshev’s theorem is usually stated in the form above, but it provides more information than that explicitly written: the trajectories, when trapped into a resonance region, drift away and come close to some invariant torus, and then they behave like quasiperiodic motions, up to very small corrections, for a long time, until they enter some other resonance region, and so on. Of course, for initial conditions on some invariant torus, the KAM theorem applies, but the new result concerns initial conditions which do not belong to any tori.

Nekhoroshev’s theorem gives a lower bound for the diffusion time, that is, the time required for a drift of order 1 to occur in the action variables. But, of course, an upper bound would also be desirable. The diffusion times are related to the amplitude of the homoclinic angles, which are very small (and difficult to estimate as stated before). The strongest results in this direction have been obtained with variational methods, for instance, by Bessi, Bernard, Berti, and Bolle: at best, for the diffusion time, one finds an estimate $O(\sigma^{-1} \log \sigma^{-1})$, if $\sigma$ is the amplitude of the homoclinic angles (which in turn are exponentially small in some inverse power of $\varepsilon$, as one can expect as a consequence of Nekhoroshev’s theorem).

Then one can imagine that the results of the Fermi–Pasta–Ulam experiment can also be interpreted in the light of Nekhoroshev’s theorem. The solutions one finds numerically certainly do not correspond to maximal tori, but one could expect that they could be solutions which appear to be quasiperiodic for long but finite times (e.g., moving near some lower-dimensional torus determined by the initial conditions), and that if one really insists on observing the time evolution for a very long time, then deviations from quasiperiodic behavior could be detected. This is an appealing interpretation, and the most recent numerical results make it plausible: Galgani and Giorgilli (2003) have found numerically that the energy, even if initially confined to the lower modes, tends to be shared among all the other modes, and the higher the modes, the longer the time needed for the energy to flow to them. Of course, this does not settle the problem, as there is still the issue of the large number of degrees of freedom; furthermore, for large N the spacing between the frequencies is small, and they become almost degenerate. Hence, the problem still has to be considered as open.

Stability versus Chaos

The main problem in applying the KAM theorem seems to be related to the small value of the threshold $\varepsilon_0$ which is required. In general, when the size of the perturbation parameter is very large, the region of phase space filled with invariant tori decreases (or even disappears), and chaotic motions appear. By the latter, one generally means motions which are highly sensitive to the initial conditions: a small variation of the initial conditions produces a catastrophic variation in the corresponding trajectories (this is due to the appearance of strictly positive Lyapunov exponents).
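Sensitivity to initial conditions can be quantified numerically. The sketch below estimates the largest Lyapunov exponent by following a tangent vector along an orbit and averaging its logarithmic growth. The system used, the Chirikov standard map, is not discussed in this article; it is an assumed stand-in for a perturbed integrable system, and the function name, the parameter `k` (perturbation size), and the initial conditions are all illustrative choices.

```python
import math

TWO_PI = 2.0 * math.pi


def standard_map_lyapunov(k, theta0, p0, n_steps=10000):
    """Estimate the largest Lyapunov exponent of the standard map
    p' = p + k*sin(theta), theta' = theta + p' (both taken mod 2*pi).

    The map is an assumed example, not a system taken from the article.
    """
    theta, p = theta0, p0
    # Arbitrary nonzero tangent vector, normalised to unit length.
    d_theta, d_p = 1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)
    log_growth = 0.0
    for _ in range(n_steps):
        # Tangent (linearised) map, using the Jacobian at the current point.
        c = k * math.cos(theta)
        d_p = d_p + c * d_theta
        d_theta = d_theta + d_p
        # The map itself.
        p = (p + k * math.sin(theta)) % TWO_PI
        theta = (theta + p) % TWO_PI
        # Accumulate the growth factor and renormalise to avoid overflow.
        norm = math.hypot(d_theta, d_p)
        log_growth += math.log(norm)
        d_theta /= norm
        d_p /= norm
    return log_growth / n_steps


if __name__ == "__main__":
    for k in (0.0, 10.0):
        print(k, standard_map_lyapunov(k, theta0=1.0, p0=0.5))
```

In the unperturbed case (`k = 0`) the tangent vector grows only linearly in time, so the estimate decays towards zero as `n_steps` increases. For a large perturbation the estimate settles at a clearly positive value, which is the numerical signature of the chaotic motions described above.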

A natural question is then how such a result as the KAM theorem is meaningful in physical situations: in other words, for which systems the KAM theorem can really apply.

One of the main motivations to study such a problem was to explain astronomical observations and to study the stability of the solar system. In order to apply the KAM theorem to the solar system, one has to interpret the gravitational forces between the planets as perturbations of a collection of several decoupled two-body systems (each planet with the Sun). One can write the masses of the planets as $\varepsilon m_i$, and $\varepsilon$ plays the role of the perturbation parameter. The corresponding Hamiltonian (after suitable reductions and scalings) is

$$\sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2\mu_i} - \sum_{i=1}^{N} \frac{m_i m_0}{|\mathbf{q}_i|} + \varepsilon \sum_{1 \le i < j \le N} \frac{\mathbf{p}_i \cdot \mathbf{p}_j}{m_0} - \varepsilon \sum_{1 \le i < j \le N} \frac{m_i m_j}{|\mathbf{q}_i - \mathbf{q}_j|}$$ [8]

where i = 0 corresponds to the Sun, while i = 1, ..., N correspond to the planets (hence N = 9), $m_0$ is the mass of the Sun, and $\varepsilon\mu_i$ are the reduced masses ($\mu_i^{-1} = m_i^{-1} + \varepsilon m_0^{-1}$); here $(\mathbf{q}_i, \mathbf{p}_i) \in \mathbb{R}^3 \times \mathbb{R}^3$, i = 0, ..., N, the inner product in $\mathbf{p}_i \cdot \mathbf{p}_j$ is in $\mathbb{R}^3$, and the norm $|\cdot|$ is the Euclidean one.

A first difficulty is that the solar system is a properly degenerate system; that is, the unperturbed Hamiltonian does not depend on all the action variables. But such a degeneracy can be removed by performing a canonical change of coordinates which produces a new Hamiltonian in which the integrable part contains new terms of order $\varepsilon$ depending on all action variables and is nondegenerate, while the perturbation becomes of order $\varepsilon^2$: the angle variables corresponding to the actions not originally appearing in the unperturbed Hamiltonian are called the slow variables, while the others are called the fast variables.

However, a naive implementation of the KAM theorem, in general, even for simplified but still realistic systems, would provide a preposterously small value of the threshold $\varepsilon_0$. The problem could be just a computational one: in principle, a very refined estimate of the threshold could give a better value, so that it is very difficult to decide analytically if the real values of the planetary masses allow the solar system to fall inside the regime of applicability of the KAM theorem. Results in this direction have been obtained, but only for special situations: for instance, by considering the restricted planar circular three-body problem (which provides a simplified description of the system “Sun + Jupiter + asteroid”), Celletti and Chierchia (1997) found analytical bounds on the perturbation parameters comparable with the physical values. Of course, this is not at all conclusive for the general situation in which all planets (with their satellites and the asteroids) are considered together; in particular, it does not shed light on the problem of the stability of the entire solar system.

On the contrary, extensive numerical simulations performed by Laskar (starting from 1989) seem to suggest that the solar system is unstable. Deflections from the current orbits could be produced to such an extent that collisions between planets could not be avoided: Mercury could collide with Venus and be ejected from the solar system. An important issue is to consider the times over which such phenomena can occur. Laskar’s numerical simulations show that such times are less than the estimated age of the solar system, and that one can make accurate predictions for the planetary motions only for a finite amount of time (about 100 Myr). Furthermore, the assumed partial instability of the solar system has also been used by Laskar (2004) to explain some observed phenomena such as the evolution of the obliquity (which is the angle between equator and orbital plane) of some planets. Of course, these simulations have been carried out with several approximations, such as that of averaging over the fast variables, which allows one to use a large integration step in the numerical integration of the equations of motion for the resulting system. This is the so-called secular system introduced by Lagrange: instead of the fast motion of the planets, one describes the slow deformations of the planetary orbits (imagining the planets as regions of mass spread along their orbits).

See also: Averaging Methods; Bifurcation Theory; Billiards in Bounded Convex Domains; Diagrammatic Techniques in Perturbation Theory; Dynamical Systems and Thermodynamics; Gravitational N-Body Problem (Classical); Hamiltonian Systems: Stability and Instability Theory; Hamilton–Jacobi Equations and Dynamical Systems: Variational Aspects; Integrable Systems and Discrete Geometry; KAM Theory and Celestial Mechanics; Localization for Quasiperiodic Potentials; Stability Problems in Celestial Mechanics; Synchronization of Chaos; Weakly Coupled Oscillators.

Further Reading

Arnol’d VI (1963) Proof of a theorem of A. N. Kolmogorov on the preservation of conditionally periodic motions under a small perturbation of the Hamiltonian. Russian Mathematical Surveys 18: 85–192.
Arnol’d VI (1964) Instability of dynamical systems with many degrees of freedom. Soviet Mathematics Doklady 5: 581–585.

Arnol’d VI, Kozlov VV, and Neĭshtadt AI (1988) Dynamical Systems III. Encyclopedia of Mathematical Sciences, vol. 3. Berlin: Springer.
Bourgain J (1994) Construction of quasi-periodic solutions for Hamiltonian perturbations of linear equations and applications to nonlinear PDE. International Mathematics Research Notices 1994(11): 475–497.
Bourgain J (1998) Quasi-periodic solutions of Hamiltonian perturbations of 2D linear Schrödinger equations. Annals of Mathematics 148: 363–439.
Bourgain J (2005) Green’s Function Estimates for Lattice Schrödinger Operators and Applications. Princeton: Princeton University Press.
Celletti A and Chierchia L (1997) On the stability of realistic three-body problems. Communications in Mathematical Physics 186: 413–449.
Eliasson LH (1996) Absolutely convergent series expansions for quasi periodic motions. Mathematical Physics Electronic Journal 2, paper 4 (electronic). Preprint 1988.
Eliasson LH (1988) Perturbations of stable invariant tori for Hamiltonian systems. Annali della Scuola Normale Superiore di Pisa 15: 115–147.
Ford J (1992) The Fermi–Pasta–Ulam problem: paradox turns discovery. Physics Reports 213: 271–310.
Galgani L and Giorgilli A (2003) Recent results on the Fermi–Pasta–Ulam problem. Rossiĭskaya Akademiya Nauk. Sankt-Peterburgskoe Otdelenie. Matematicheskiĭ Institut im. V. A. Steklova. Zapiski Nauchnykh Seminarov (POMI) 300: 145–154.
Gallavotti G (1986) Quasi-integrable mechanical systems. In: Phénomènes critiques, systèmes aléatoires, théories de jauge (Les Houches, 1984), pp. 539–624. Amsterdam: North-Holland.
Gallavotti G, Bonetto F, and Gentile G (2004) Aspects of Ergodic, Qualitative and Statistical Theory of Motion. Berlin: Springer.
Gentile G, Mastropietro V, and Procesi M (2005) Periodic solutions for completely resonant nonlinear wave equations with Dirichlet boundary conditions. Communications in Mathematical Physics 257: 319–362.
Kolmogorov AN (1954) On conservation of conditionally periodic motions for a small change in Hamilton’s function. Doklady Akademii Nauk SSSR 98: 527–530 (Russian).
Kuksin SB (1987) Hamiltonian perturbations of infinite-dimensional linear systems with imaginary spectrum. Functional Analysis and its Applications 21: 192–205.
Kuksin SB (1993) Nearly Integrable Infinite-Dimensional Hamiltonian Systems, Lecture Notes in Mathematics, vol. 1556. Berlin: Springer.
Kuksin SB and Pöschel J (1996) Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Annals of Mathematics 143: 149–179.
Laskar J (2004) Chaos in the solar system. In: Iagolnitzer D, Rivasseau V, and Zinn-Justin J (eds.) Proceedings of the International Conference on Theoretical Physics TH2002 (Paris, 2002). Basel: Birkhäuser.
Lochak P and Neĭshtadt AI (1992) Estimates of stability time for nearly integrable systems with a quasiconvex Hamiltonian. Chaos 2: 495–499.
Moser J (1962) On invariant curves of area-preserving mappings of an annulus. Nachrichten der Akademie der Wissenschaften in Göttingen 1962: 1–20.
Moser J (1973) Stable and Random Motions in Dynamical Systems, Annals of Mathematical Studies. Princeton: Princeton University Press.
Nekhorošev NN (1977) An exponential estimate of the time of stability of nearly integrable Hamiltonian systems. Russian Mathematical Surveys 32: 1–65.
Pöschel J (1989) On elliptic lower-dimensional tori in Hamiltonian systems. Mathematische Zeitschrift 202: 559–608.
Pöschel J (1993) Nekhoroshev estimates for quasi-convex Hamiltonian systems. Mathematische Zeitschrift 213: 187–216.
Rink B (2001) Symmetry and resonance in periodic FPU chains. Communications in Mathematical Physics 218: 665–685.
Zabusky NJ and Kruskal MD (1965) Interaction of “solitons” in a collisionless plasma and the recurrence of initial states. Physical Review Letters 15: 240–243.

Standard Model of Particle Physics


G Altarelli, CERN, Geneva, Switzerland
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The standard model (SM) is a consistent, finite, and, within the limitations of our present technical ability, computable theory of fundamental microscopic interactions that successfully explains most of the known phenomena in elementary particle physics. The SM describes strong, electromagnetic, and weak interactions. All microscopic phenomena observed to date can be attributed to one or the other of these interactions. For example, the forces that hold together the protons and the neutrons in the atomic nuclei are due to strong interactions; the binding of electrons to nuclei in atoms or of atoms in molecules is caused by electromagnetism; and the energy production in the Sun and the other stars occurs through nuclear reactions induced by weak interactions. In principle, gravitational forces should also be included in the list of fundamental interactions but their impact on fundamental particle processes at accessible energies is totally negligible.

The structure of the SM is a generalization of that of quantum electrodynamics (QED), in the sense that it is a renormalizable field theory based on a local symmetry (i.e., separately valid at each spacetime point x) that extends the gauge invariance of electrodynamics to a larger set of

conserved currents and charges. There are eight strong charges, called “color” charges, and four electroweak charges (which, in particular, include the electric charge). The commutators of these charges form the $SU(3) \otimes SU(2) \otimes U(1)$ algebra. In QED, the interaction between two matter particles with electric charges (e.g., two electrons) is mediated by the exchange of one (or more) photons emitted by one electron and reabsorbed by the second. In the SM the matter fields, all of spin 1/2, are the quarks, the constituents of protons, neutrons, and all hadrons, endowed with both color and electroweak charges, and the leptons (the electron $e^-$, the muon $\mu^-$, the tauon $\tau^-$, plus the three associated neutrinos $\nu_e$, $\nu_\mu$, and $\nu_\tau$) with no color but with electroweak charges. The matter fermions come in three generations or families with identical quantum numbers but different masses. The pattern is as follows:

$$\begin{pmatrix} u & u & u & \nu_e \\ d & d & d & e \end{pmatrix}, \quad \begin{pmatrix} c & c & c & \nu_\mu \\ s & s & s & \mu \end{pmatrix}, \quad \begin{pmatrix} t & t & t & \nu_\tau \\ b & b & b & \tau \end{pmatrix}$$ [1]

Each family contains a weakly charged doublet of quarks, in three color replicas, and a colorless weakly charged doublet with a neutrino and a charged lepton. At present, there is no explanation for this triple repetition of fermion families. The force carriers, of spin 1, are the photon $\gamma$, the weak interaction gauge bosons $W^+$, $W^-$, and $Z^0$ and the eight gluons g that mediate the strong interactions. The photon and the gluons have zero masses as a consequence of the exact conservation of the corresponding symmetry generators, the electric charge and the eight color charges. The weak bosons $W^+$, $W^-$, and $Z^0$ have large masses ($m_W \approx 80.4$ GeV, $m_Z = 91.2$ GeV), signaling that the corresponding symmetries are badly broken. In the SM, the spontaneous breaking of the electroweak gauge symmetry is induced by the Higgs mechanism, which predicts the presence of one (or more) spin 0 particles in the physical spectrum, the Higgs boson(s), not yet experimentally observed. A tremendous experimental effort is underway or planned to reveal the Higgs sector as the last crucial missing link in the SM verification.

Quantum Chromodynamics

The statement that quantum chromodynamics (QCD) is a renormalizable gauge theory based on the group SU(3) with color triplet quark matter fields fixes the QCD Lagrangian density to be

$$\mathcal{L} = -\frac{1}{4} \sum_{A=1}^{8} F^{A\mu\nu} F^{A}_{\mu\nu} + \sum_{j=1}^{n_f} \bar{q}_j \left( i \not\!\!D - m_j \right) q_j$$ [2]

Here $q_j$ are the quark fields (of $n_f$ different flavors) with mass $m_j$; $\not\!\!D = D_\mu \gamma^\mu$, where $\gamma^\mu$ are the Dirac matrices and $D_\mu$ is the covariant derivative

$$D_\mu = \partial_\mu - i e_s \sum_{A} t^A g^A_\mu$$ [3]

$e_s$ is the gauge coupling (in analogy with QED,

$$\alpha_s = \frac{e_s^2}{4\pi}$$ [4]

here and throughout this article natural units, $\hbar = c = 1$, are used); $g^A_\mu$, A = 1, ..., 8, are the gluon fields, and $t^A$ are the SU(3) group generators in the triplet representation of quarks (i.e., $t^A$ are $3 \times 3$ matrices acting on q); the generators obey the commutation relations $[t^A, t^B] = i C_{ABC} t^C$, where $C_{ABC}$ are the complete antisymmetric structure constants of SU(3) (the normalization of $C_{ABC}$ and of $e_s$ is specified by $\mathrm{tr}[t^A t^B] = \frac{1}{2}\delta^{AB}$);

$$F^A_{\mu\nu} = \partial_\mu g^A_\nu - \partial_\nu g^A_\mu - e_s C_{ABC}\, g^B_\mu g^C_\nu$$ [5]

The physical vertices in QCD include the gluon–quark–antiquark vertex, analogous to the QED photon–fermion–antifermion coupling, but also the three-gluon and four-gluon vertices, of order $e_s$ and $e_s^2$, respectively, which have no analog in an abelian theory like QED. In QED, the photon (a neutral particle) is coupled to all electrically charged particles. In QCD, the gluons are colored, hence self-coupled. This is reflected in the fact that in QED $F_{\mu\nu}$ is linear in the gauge field, so that the term $F_{\mu\nu}^2$ in the Lagrangian is a pure kinetic term, while in QCD $F^A_{\mu\nu}$ is quadratic in the gauge field, so that in $F^{A\,2}_{\mu\nu}$ we find cubic and quartic vertices beyond the kinetic term.

The QCD Lagrangian in eqn [2] has a simple structure but a very rich dynamical content, including the observed complex spectroscopy with a large number of hadrons. The most prominent properties of QCD are asymptotic freedom and confinement. In field theory, the effective coupling of a given interaction vertex is modified by the interaction. As a result, the measured intensity of the force depends on the transferred (four)momentum squared, $Q^2$, among the participants. In QCD, the relevant coupling parameter that appears in physical processes is $\alpha_s$ (see eqn [4]). Asymptotic freedom means that the effective coupling becomes a function of

$Q^2$: $\alpha_s(Q^2)$ decreases for increasing $Q^2$ and vanishes asymptotically. Thus, the QCD interaction becomes very weak in processes with large $Q^2$, called hard processes or deep inelastic processes (i.e., with a final-state distribution of momenta and a particle content very different from that in the initial state). One can prove that in four spacetime dimensions all gauge theories based on a noncommuting group of symmetry are asymptotically free, and conversely. The effective coupling decreases very slowly at large momenta with the inverse logarithm of $Q^2$: $\alpha_s(Q^2) = 1/[b \log(Q^2/\Lambda^2)]$, where b is a known constant and $\Lambda$ is an energy of the order of a few hundred MeV. Since in quantum mechanics large momenta imply short wavelengths, the result is that at short distances the potential between two color charges is similar to the Coulomb potential, that is, proportional to $\alpha_s(r)/r$, with an effective color charge which is small at short distances. On the contrary, the interaction strength becomes large at large distances or small transferred momenta, of order $Q \lesssim \Lambda$. In fact, the observed hadrons are tightly bound composite states of quarks, with compensating color charges so that they are overall neutral in color.

The property of confinement is the impossibility of separating color charges, like individual quarks and gluons. This is because in QCD the interaction potential between color charges increases, at long distances, linearly in r. When we try to separate the quark and the antiquark that form a color-neutral meson, the interaction energy grows until pairs of quarks and antiquarks are created from the vacuum and new neutral mesons are coalesced instead of free quarks. For example, consider the process $e^+ e^- \to q\bar{q}$ at large center-of-mass energies. The final-state quark and antiquark have large energies, so they separate in opposite directions very fast. But the color-confinement forces create new pairs in between them. Two back-to-back jets of colorless hadrons are observed with a number of slow pions that make the exact separation of the two jets impossible. In some cases, a third well-separated jet of hadrons is also observed: these events correspond to the radiation of an energetic gluon from the parent quark–antiquark pair.

Electroweak Interactions

We split the electroweak Lagrangian into two parts by separating the Higgs boson couplings:

$$\mathcal{L} = \mathcal{L}_{\mathrm{symm}} + \mathcal{L}_{\mathrm{Higgs}}$$ [6]

We start by specifying $\mathcal{L}_{\mathrm{symm}}$, which involves only gauge bosons and fermions (a sum over all flavors of quarks and leptons, generally indicated by $\psi$, is understood):

$$\mathcal{L}_{\mathrm{symm}} = -\frac{1}{4} \sum_{A=1}^{3} F^A_{\mu\nu} F^{A\mu\nu} - \frac{1}{4} B_{\mu\nu} B^{\mu\nu} + \bar{\psi}_L i \gamma^\mu D_\mu \psi_L + \bar{\psi}_R i \gamma^\mu D_\mu \psi_R$$ [7]

This is the Yang–Mills Lagrangian for the gauge group $SU(2) \otimes U(1)$ with fermion matter fields. Here

$$B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu, \qquad F^A_{\mu\nu} = \partial_\mu W^A_\nu - \partial_\nu W^A_\mu - g\, \epsilon_{ABC} W^B_\mu W^C_\nu$$ [8]

are the gauge antisymmetric tensors constructed out of the gauge field $B_\mu$ associated with U(1), and $W^A_\mu$ corresponding to the three SU(2) generators; $\epsilon_{ABC}$ are the group structure constants (see eqn [11]), which, for SU(2), coincide with the totally antisymmetric Levi-Civita tensor (recall the familiar angular-momentum commutators).

The fermion fields are described through their left- and right-hand components:

$$\psi_{L,R} = [(1 \mp \gamma_5)/2]\,\psi, \qquad \bar{\psi}_{L,R} = \bar{\psi}\,[(1 \pm \gamma_5)/2]$$ [9]

Note that, as given in eqn [9],

$$\bar{\psi}_L = \psi_L^\dagger \gamma_0 = \psi^\dagger [(1 - \gamma_5)/2] \gamma_0 = \bar{\psi}\,[\gamma_0 (1 - \gamma_5)/2\, \gamma_0] = \bar{\psi}\,[(1 + \gamma_5)/2]$$

The matrices $P_\pm = (1 \pm \gamma_5)/2$ are projectors. They satisfy the relations $P_\pm P_\pm = P_\pm$, $P_\pm P_\mp = 0$, $P_+ + P_- = 1$.

The standard electroweak theory is a chiral theory, in the sense that $\psi_L$ and $\psi_R$ behave differently under the gauge group. In particular, all $\psi_R$ are singlets and all $\psi_L$ are doublets in the minimal SM (MSM). Thus, mass terms for fermions (of the form $\bar{\psi}_L \psi_R + \text{h.c.}$) are forbidden in the symmetric limit. Fermion masses are introduced, together with $W^\pm$ and Z masses, by the mechanism of symmetry breaking. The covariant derivatives $D_\mu \psi_{L,R}$ are explicitly given by

$$D_\mu \psi_{L,R} = \left[ \partial_\mu + i g \sum_{A=1}^{3} t^A_{L,R} W^A_\mu + i g' \frac{1}{2} Y_{L,R} B_\mu \right] \psi_{L,R}$$ [10]

where $t^A_{L,R}$ and $\frac{1}{2} Y_{L,R}$ are the SU(2) and U(1) generators, respectively, in the reducible representations $\psi_{L,R}$. The commutation relations of the SU(2) generators are given by

$$[t^A_L, t^B_L] = i \epsilon_{ABC}\, t^C_L \quad \text{and} \quad [t^A_R, t^B_R] = i \epsilon_{ABC}\, t^C_R$$ [11]

We use the normalization $\mathrm{tr}[t^A t^B] = \frac{1}{2}\delta^{AB}$ in the fundamental representation of SU(2). The electric

charge generator Q (in units of e, the positron transfer squared can be neglected with respect to
charge) is given by m2W in the propagator of Born diagrams with single
W exchange, from eqn [14], we can write
Q ¼ tL3 þ 1=2YL ¼ tR3 þ 1=2YR ½12
2
 
LCC
eff ’ g =8mW
2  ð1  5 Þtþ
L
All fermion couplings to the gauge bosons can be  
derived directly from eqns [7] and [10]. The charged-    ð1  5 Þt L ½19
current (CC) couplings are the simplest. From By specializing further in the case of doublet fields
  nh  pffiffiffii such as e  e or    , we obtain the tree-level
g t1 W1 þ t2 W2 ¼ g t1 þ it2 = 2
relation of g with the Fermi coupling constant
h  pffiffiffii o
 W1  iW2 = 2 þ h:c: GF measured from  decay (GF = 1.16639(2)
nh  pffiffiffii o 105 GeV2 ):
¼g tþ W = 2 þ h:c: ½13 pffiffiffi
GF = 2 ¼ g2 =8m2W ½20
where $t^\pm = t^1 \pm i t^2$ and $W^\pm = (W^1 \pm i W^2)/\sqrt{2}$, we obtain the vertex

$$V_{\bar\psi\psi W} = g\,\bar\psi\gamma_\mu\left[\left(t_L^+/\sqrt{2}\right)(1-\gamma_5)/2 + \left(t_R^+/\sqrt{2}\right)(1+\gamma_5)/2\right]\psi\,W^-_\mu + \text{h.c.} \qquad [14]$$

In the neutral-current (NC) sector, the photon $A_\mu$ and the mediator $Z_\mu$ of the weak NC are orthogonal and normalized linear combinations of $B_\mu$ and $W^3_\mu$:

$$A_\mu = \cos\theta_W\,B_\mu + \sin\theta_W\,W^3_\mu, \qquad Z_\mu = -\sin\theta_W\,B_\mu + \cos\theta_W\,W^3_\mu \qquad [15]$$

Equations [15] define the weak mixing angle $\theta_W$. The photon is characterized by equal couplings to left and right fermions with a strength equal to the electric charge. Recalling eqn [12] for the charge matrix $Q$, we immediately obtain

$$g\sin\theta_W = g'\cos\theta_W = e \qquad [16]$$

or, equivalently,

$$\tan\theta_W = g'/g \qquad [17]$$

Once $\theta_W$ has been fixed by the photon couplings, it is a simple matter of algebra to derive the $Z$ couplings, with the result

$$V_{\bar\psi\psi Z} = \frac{g}{2\cos\theta_W}\,\bar\psi\gamma_\mu\left[t_L^3(1-\gamma_5) + t_R^3(1+\gamma_5) - 2Q\sin^2\theta_W\right]\psi\,Z^\mu \qquad [18]$$

where $V_{\bar\psi\psi Z}$ is a notation for the vertex. In the MSM, $t_R^3 = 0$ and $t_L^3 = \pm 1/2$. Note that the CC and NC weak couplings do not conserve P (parity) and C (charge conjugation).

In order to derive the effective four-fermion interactions that are equivalent, at low energies, to the CC and NC couplings given in eqns [14] and [18], we anticipate that large masses, as experimentally observed, are provided for $W^\pm$ and $Z$ by $\mathcal{L}_{\text{Higgs}}$. For left–left CC couplings, when the momentum

By recalling that $g\sin\theta_W = e$, we can also cast this relation in the form

$$m_W = \mu_{\text{Born}}/\sin\theta_W \qquad [21]$$

with

$$\mu_{\text{Born}} = \left(\frac{\pi\alpha}{\sqrt{2}\,G_F}\right)^{1/2} \simeq 37.2802\ \text{GeV} \qquad [22]$$

where $\alpha$ is the fine-structure constant of QED ($\alpha \equiv e^2/4\pi = 1/137.036$).

In the same way, for neutral currents we obtain, in Born approximation, from eqn [18], the effective four-fermion interaction given by

$$\mathcal{L}^{\text{NC}}_{\text{eff}} \simeq \sqrt{2}\,G_F\,\rho_0\,\bar\psi\gamma_\mu[\ldots]\psi\;\bar\psi\gamma^\mu[\ldots]\psi \qquad [23]$$

where

$$[\ldots] \equiv t_L^3(1-\gamma_5) + t_R^3(1+\gamma_5) - 2Q\sin^2\theta_W \qquad [24]$$

and

$$\rho_0 = \frac{m_W^2}{m_Z^2\cos^2\theta_W} \qquad [25]$$

All couplings given in this section are obtained at tree level and are modified in higher orders of perturbation theory. In particular, the relations between $m_W$ and $\sin\theta_W$ (eqns [21] and [22]) and the observed values of $\rho$ ($\rho = \rho_0$ at tree level) in different NC processes are altered by computable small electroweak radiative corrections.

The gauge-boson self-interactions can be derived from the $F_{\mu\nu}$ term in $\mathcal{L}_{\text{symm}}$, by using eqn [15] and $W^\pm = (W^1 \pm i W^2)/\sqrt{2}$. For the three-gauge-boson vertex $W^+W^-V$ with $V = Z, \gamma$, we obtain

$$V_{W^-W^+V} = i\,g_{W^-W^+V}\left[g_{\mu\nu}(q-p)_\lambda + g_{\mu\lambda}(p-r)_\nu + g_{\nu\lambda}(r-q)_\mu\right] \qquad [26]$$

with

$$g_{W^-W^+\gamma} = g\sin\theta_W = e \quad\text{and}\quad g_{W^-W^+Z} = g\cos\theta_W \qquad [27]$$
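The tree-level relations [16], [21], [22], and [27] can be checked numerically. The following is a minimal sketch rather than a definitive calculation: the value of $\alpha$ is the one quoted in the text, whereas the Fermi constant `G_FERMI` and the value `SIN2_THETA_W` are assumed inputs that are not given in this section, and all function names are hypothetical. Since radiative corrections modify these relations, the numbers it produces are Born-level estimates only.

```python
import math

ALPHA = 1.0 / 137.036      # fine-structure constant, as quoted in the text
G_FERMI = 1.16637e-5       # Fermi constant in GeV^-2 (assumed input)
SIN2_THETA_W = 0.2312      # illustrative value of sin^2(theta_W) (assumed input)


def mu_born(alpha, g_fermi):
    """mu_Born = (pi * alpha / (sqrt(2) * G_F))**(1/2) in GeV, eqn [22]."""
    return math.sqrt(math.pi * alpha / (math.sqrt(2.0) * g_fermi))


def tree_level_w_mass(alpha, g_fermi, sin2_theta_w):
    """m_W = mu_Born / sin(theta_W), eqn [21]; tree level only."""
    return mu_born(alpha, g_fermi) / math.sqrt(sin2_theta_w)


def gauge_couplings(alpha, sin2_theta_w):
    """Return (e, g, g') with g sin(theta_W) = g' cos(theta_W) = e, eqn [16]."""
    e = math.sqrt(4.0 * math.pi * alpha)
    sin_w = math.sqrt(sin2_theta_w)
    cos_w = math.sqrt(1.0 - sin2_theta_w)
    return e, e / sin_w, e / cos_w


def triple_gauge_couplings(alpha, sin2_theta_w):
    """Return (g_WWgamma, g_WWZ) = (g sin(theta_W), g cos(theta_W)), eqn [27]."""
    _, g, _ = gauge_couplings(alpha, sin2_theta_w)
    return g * math.sqrt(sin2_theta_w), g * math.sqrt(1.0 - sin2_theta_w)


if __name__ == "__main__":
    print(f"mu_Born          = {mu_born(ALPHA, G_FERMI):.4f} GeV")
    print(f"m_W (tree level) = {tree_level_w_mass(ALPHA, G_FERMI, SIN2_THETA_W):.2f} GeV")
    e, g, g_prime = gauge_couplings(ALPHA, SIN2_THETA_W)
    print(f"e = {e:.4f}, g = {g:.4f}, g' = {g_prime:.4f}")
    print("g_WWgamma, g_WWZ =", triple_gauge_couplings(ALPHA, SIN2_THETA_W))
```

With these assumed inputs the Born-level $W$ mass comes out a few GeV below the measured value, which illustrates why the radiative corrections mentioned above matter for precision tests.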

This form of the triple gauge vertex is very special: in general, there could be departures from the above SM expression, even restricting us to $SU(2)\otimes U(1)$ gauge symmetric and C and P invariant couplings. In fact, some small corrections are already induced by the radiative corrections. The SM form of the triple gauge vertex has been experimentally confirmed by measuring the cross section $e^+e^- \to W^+W^-$ at LEP.

We now turn to the Higgs sector of the electroweak Lagrangian. The Higgs Lagrangian is specified by the gauge principle and the requirement of renormalizability to be

$$\mathcal{L}_{\text{Higgs}} = (D_\mu\phi)^\dagger(D^\mu\phi) - V(\phi^\dagger\phi) - \bar\psi_L\Gamma\psi_R\phi - \bar\psi_R\Gamma^\dagger\psi_L\phi^\dagger \qquad [28]$$

where $\phi$ is a column vector including all Higgs scalar fields; it transforms as a reducible representation of the gauge group. The quantities $\Gamma$ (which include all coupling constants) are matrices that make the Yukawa couplings invariant under the Lorentz and gauge groups. The potential $V(\phi^\dagger\phi)$, symmetric under $SU(2)\otimes U(1)$, contains, at most, quartic terms in $\phi$ so that the theory is renormalizable:

$$V(\phi^\dagger\phi) = -\tfrac{1}{2}\mu^2\phi^\dagger\phi + \tfrac{1}{4}\lambda(\phi^\dagger\phi)^2 \qquad [29]$$

Spontaneous symmetry breaking is induced if the minimum of $V$, which is the classical analog of the quantum-mechanical vacuum state (both are the states of minimum energy), is obtained for nonvanishing $\phi$ values. This occurs because we have taken $\mu^2$ and $\lambda$ positive in $V$ (note the "wrong" sign of the mass term). Precisely, we denote the vacuum expectation value (VEV) of $\phi$, that is, the position of the minimum, by $v$:

$$\langle 0|\phi(x)|0\rangle = v \neq 0 \qquad [30]$$

The fermion mass matrix is obtained from the Yukawa couplings by replacing $\phi(x)$ by $v$:

$$M = \bar\psi_L\,\mathcal{M}\,\psi_R + \bar\psi_R\,\mathcal{M}^\dagger\,\psi_L \qquad [31]$$

with

$$\mathcal{M} = \Gamma\cdot v \qquad [32]$$

In the SM, where all left fermions, $\psi_L$, are doublets and all right fermions, $\psi_R$, are singlets, only Higgs doublets can contribute to fermion masses. There are enough free couplings in $\Gamma$, so that one single complex Higgs doublet is indeed sufficient to generate the most general fermion mass matrix. It is important to observe that by a suitable change of basis we can always make the matrix $\mathcal{M}$ Hermitian, $\gamma_5$-free and diagonal. In fact, we can make separate unitary transformations on $\psi_L$ and $\psi_R$ according to

$$\psi_L' = U\psi_L, \qquad \psi_R' = V\psi_R \qquad [33]$$

and consequently

$$\mathcal{M} \to \mathcal{M}' = U^\dagger\mathcal{M}V \qquad [34]$$

This transformation does not alter the general structure of the fermion couplings in $\mathcal{L}_{\text{symm}}$.

If only one Higgs doublet is present, the change of basis that makes $\mathcal{M}$ diagonal will at the same time diagonalize also the fermion–Higgs Yukawa couplings. Thus, in this case, no flavor-changing neutral Higgs exchanges are present. This is not true, in general, when there are several Higgs doublets. But one Higgs doublet for each electric charge sector, that is, one doublet coupled only to u-type quarks, one doublet to d-type quarks, one doublet to charged leptons would also be satisfactory, because the mass matrices of fermions with different charges are diagonalized separately. In fact, at the moment, the simplest model with only one Higgs doublet seems adequate for describing all observed phenomena.

Weak charged currents are the only tree-level interactions in the SM that change flavor: by emission of a $W$, a u-type quark is turned into a d-type quark, or a $\nu_l$ neutrino is turned into an $l^-$ charged lepton (all fermions are left-handed). If we start from a u-type quark that is a mass eigenstate, emission of a $W$ turns it into a d-type quark state $d'$ (the weak isospin partner of $u$) that in general is not a mass eigenstate. In general, the mass eigenstates and the weak eigenstates do not coincide and a unitary transformation connects the two sets:

$$\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} = V\begin{pmatrix} d \\ s \\ b \end{pmatrix} \qquad [35]$$

or, in shorthand, $D' = VD$, where $V$ is the Cabibbo–Kobayashi–Maskawa (CKM) matrix. Thus, in terms of mass eigenstates the charged weak current of quarks is of the form

$$J^+_\mu \propto \bar u\,\gamma_\mu(1-\gamma_5)\,V D \qquad [36]$$

Since $V$ is unitary (i.e., $VV^\dagger = V^\dagger V = 1$) and commutes with $T^2$, $T_3$, and $Q$ (because all d-type quarks have the same isospin and charge) the neutral current couplings are diagonal both in the primed and unprimed basis (if the $Z$ d-type quark current is abbreviated as $\bar D'\Gamma D'$ then by changing basis we get $\bar D V^\dagger\Gamma V D$ and $V$ and $\Gamma$ commute because, as seen from eqn [24], $\Gamma$ is made of Dirac matrices and $T_3$ and $Q$ generator matrices). It follows that $\bar D'\Gamma D' = \bar D\Gamma D$. This is the Glashow–Iliopoulos–Maiani (GIM)

mechanism that ensures natural flavor conservation of the neutral current couplings at the tree level. For three generations of quarks, the CKM matrix depends on four physical parameters: three mixing angles and one phase. This phase is the unique source of CP violation in the SM.

We now consider the gauge-boson masses and their couplings to the Higgs. These effects are induced by the $(D_\mu\phi)^\dagger(D^\mu\phi)$ term in $\mathcal{L}_{\text{Higgs}}$ (eqn [28]), where

$$D_\mu\phi = \left[\partial_\mu + ig\sum_{A=1}^{3} t^A W^A_\mu + ig'(Y/2)B_\mu\right]\phi \qquad [37]$$

Here $t^A$ and $\tfrac{1}{2}Y$ are the $SU(2)\otimes U(1)$ generators in the reducible representation spanned by $\phi$. Not only doublets but all non-singlet Higgs representations can contribute to gauge-boson masses. The condition that the photon remains massless is equivalent to the condition that the vacuum is electrically neutral:

$$Q|v\rangle = \left(t^3 + \tfrac{1}{2}Y\right)|v\rangle = 0 \qquad [38]$$

The charged $W$ mass is given by the quadratic terms in the $W$ field arising from $\mathcal{L}_{\text{Higgs}}$, when $\phi(x)$ is replaced by $v$. We obtain

$$m_W^2\,W^+_\mu W^{-\mu} = g^2\left|\left(t^+ v/\sqrt{2}\right)\right|^2 W^+_\mu W^{-\mu} \qquad [39]$$

whilst for the $Z$ mass we get (recalling eqn [15])

$$\tfrac{1}{2}m_Z^2\,Z_\mu Z^\mu = \left|\left[g\cos\theta_W\,t^3 - g'\sin\theta_W\,(Y/2)\right]v\right|^2 Z_\mu Z^\mu \qquad [40]$$

where the factor of 1/2 on the left-hand side is the correct normalization for the definition of the mass of a neutral field. For Higgs doublets

$$\phi = \begin{pmatrix}\phi^+ \\ \phi^0\end{pmatrix}, \qquad v = \begin{pmatrix} 0 \\ v\end{pmatrix} \qquad [41]$$

we obtain

$$m_W^2 = \tfrac{1}{2}g^2v^2, \qquad m_Z^2 = \tfrac{1}{2}g^2v^2/\cos^2\theta_W \qquad [42]$$

Note that by using eqn [20] we obtain

$$v = 2^{-3/4}\,G_F^{-1/2} = 174.1\ \text{GeV} \qquad [43]$$

It is also evident that for Higgs doublets

$$\rho_0 = \frac{m_W^2}{m_Z^2\cos^2\theta_W} = 1 \qquad [44]$$

This relation is typical of one or more Higgs doublets and would be spoiled by the existence of, for example, Higgs triplets. This result is valid at the tree level and is modified by calculable small electroweak radiative corrections. The $\rho_0$ parameter has been measured from the intensity of NC interactions (recall eqn [25]) and confirmed to be close to unity at the level of a few per mille.

In MSM only one Higgs doublet is present. Then the fermion–Higgs couplings are in proportion to the fermion masses. In fact, from the Yukawa couplings $g_{\phi\bar f f}(\bar f_L\phi f_R + \text{h.c.})$, the mass $m_f$ is obtained by replacing $\phi$ by $v$, so that $m_f = g_{\phi\bar f f}\,v$. In MSM, three out of the four Hermitian fields are removed from the physical spectrum by the Higgs mechanism and become the longitudinal modes of $W^+$, $W^-$, and $Z$, which acquire a mass. The fourth neutral Higgs is physical and should be found. If more doublets are present, two more charged and two more neutral Higgs scalars should be around for each additional doublet.

The couplings of the physical Higgs $H$ to the gauge bosons can be simply obtained from $\mathcal{L}_{\text{Higgs}}$, by the replacement

$$\phi(x) = \begin{pmatrix}\phi^+(x) \\ \phi^0(x)\end{pmatrix} \to \begin{pmatrix} 0 \\ v + (H/\sqrt{2})\end{pmatrix} \qquad [45]$$

(so that $(D_\mu\phi)^\dagger(D^\mu\phi) = \tfrac{1}{2}(\partial_\mu H)^2 + \cdots$), with the result

$$\mathcal{L}[H,W,Z] = g^2\left(v/\sqrt{2}\right)W^+_\mu W^{-\mu}H + \left(g^2/4\right)W^+_\mu W^{-\mu}H^2 + \left[g^2 v\,Z_\mu Z^\mu/\left(2\sqrt{2}\cos^2\theta_W\right)\right]H + \left[g^2/\left(8\cos^2\theta_W\right)\right]Z_\mu Z^\mu H^2$$

In MSM, the Higgs mass $m_H^2 \sim \lambda v^2$ is of order of the weak scale $v$ but cannot be predicted because the value of $\lambda$ is not fixed. The dominant decay mode of the Higgs is in the $b\bar b$ channel below the $WW$ threshold, while the $W^+W^-$ channel is dominant for sufficiently large $m_H$. The width is small below the $WW$ threshold, not exceeding a few MeV, but increases steeply beyond the threshold, reaching the asymptotic value of $\Gamma \sim \tfrac{1}{2}m_H^3$ at large $m_H$, where all energies and masses are in TeV.

A central role in the experimental verification of the standard electroweak theory has been played by CERN, the European Laboratory for Particle Physics, located near Geneva, between France and Switzerland. The indirect effects of the $Z^0$, that is, the occurrence of weak processes induced by the neutral current, were first observed in 1974 at CERN by the Gargamelle collaboration (named after the bubble chamber used in the experiment). Later, in 1982, the $W^\pm$ and the $Z^0$ were, for the first time, directly produced and observed in proton–antiproton collisions by the UA1 and UA2 collaborations and then further studied with the same technique both at CERN and subsequently at the Tevatron of Fermilab near Chicago. LEP, the large $e^+e^-$ collider, operated at CERN from 1989 until 2000. In the LEP circular ring of circumference 27 km, electrons and

positrons were accelerated in opposite directions to an equal energy in the range between 45 and 103 GeV. The beams were made to cross and collide at four experimental areas where the ALEPH, DELPHI, L3, and OPAL detectors were located to study the final states produced in the collisions. In its first phase, called LEP1, from 1989 to 1995, the LEP operation was completely dedicated to a precise study of the $Z^0$ properties (mass, lifetime, and decay modes) in order to accurately test the predictions of the SM. The main lessons of the precision tests of the standard electroweak theory can be summarized as follows. It has been checked that the couplings of quarks and leptons to the weak gauge bosons $W^\pm$ and $Z$ are indeed precisely those prescribed by the gauge symmetry. The accuracy of a few tenths of 1% for these tests implies that not only the tree level, but also the structure of quantum corrections has been verified. Then, from the end of 1995, the energy of LEP was increased and the phase of LEP2 was started. The total energy was gradually increased up to 206 GeV. The main physics goals of LEP2 were the search for the Higgs and for possible new particles, the precise measurement of $m_W$, and the experimental study of the triple gauge vertices $WW\gamma$ and $WWZ^0$. The Higgs particle of the SM could in principle be produced at LEP2 in the reaction $e^+e^- \to Z^0H$, which proceeds by $Z^0$ exchange. The nonobservation of the Higgs particle at LEP2 has made it possible to establish a lower limit on its mass: $m_H \gtrsim 114$ GeV. Indirect indications on the Higgs mass were also obtained from the precision tests of the SM, as the radiative effects depend logarithmically on $m_H$. The indication is that the Higgs mass cannot be too heavy if the SM is valid: $m_H \lesssim 219$ GeV at 95% c.l. In 2001, LEP was dismantled and, in its tunnel, a new double ring of superconducting magnets is being installed. The new accelerator, the LHC (Large Hadron Collider), will be a proton–proton collider of total center-of-mass energy 14 TeV. Two large experiments, ATLAS and CMS, will continue to search for the Higgs starting in the year 2007. The sensitivity of LHC experiments to the SM Higgs will go up to masses $m_H$ of 1 TeV.

See also: Effective Field Theories; Electric–Magnetic Duality; Electroweak Theory; General Relativity: Experimental Tests; Noncommutative Geometry and the Standard Model; Perturbative Renormalization Theory and BRST; Quantum Chromodynamics; Quantum Electrodynamics and its Precision Tests; Quantum Field Theory: a Brief Introduction; Relativistic Wave Equations Including Higher Spin Fields; Renormalization: General Theory; Supersymmetric Particle Models.

Further Reading

Altarelli G (2000) The Standard Electroweak Theory and Beyond, Proceedings of the Summer School on Phenomenology of Gauge Interactions. Zuoz, Switzerland, hep-ph/0011078.
Altarelli G (2001) A QCD Primer, Proceedings of the 2001 European School of High Energy Physics. Beatenberg, Switzerland, hep-ph/0204179.
Close F, Marten M, and Sutton C (2002) The Particle Odyssey. Oxford: Oxford University Press.
Fraser G (ed.) (1998) The Particle Century. London: The Institute of Physics.
Martin BR and Shaw G (1997) Particle Physics, 2nd edn. Chichester: Wiley.
Particle Data Group (2004) Review of particle physics. Physics Letters B 592: 1.
Perkins DH (2000) Introduction to High Energy Physics. Reading: Addison-Wesley.

Stationary Black Holes


R Beig, Universität Wien, Vienna, Austria
P T Chruściel, Université de Tours, Tours, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

This article treats a specific class of stationary solutions to the Einstein field equations which read

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = \frac{8\pi G}{c^4}T_{\mu\nu} \qquad [1]$$

Here $R_{\mu\nu}$ and $R = g^{\mu\nu}R_{\mu\nu}$ are, respectively, the Ricci tensor and the Ricci scalar of the spacetime metric $g_{\mu\nu}$, $G$ the Newton constant, and $c$ the speed of light. The tensor $T_{\mu\nu}$ is the stress–energy tensor of matter. Spacetimes, or regions thereof, where $T_{\mu\nu} = 0$ are called vacuum.

Stationary solutions are of interest for a variety of reasons. As models for compact objects at rest, or in steady rotation, they play a key role in astrophysics. They are easier to study than nonstationary systems because stationary solutions are governed by elliptic rather than hyperbolic equations. Finally, like in any field theory, one expects that large classes of dynamical solutions approach ("settle down to") a stationary state in the final stages of their evolution.

The simplest stationary solutions describing compact isolated objects are the spherically symmetric

ones. In the vacuum region, these are all given by the Schwarzschild family. A theorem of Birkhoff shows that in the vacuum region any spherically symmetric metric, even without assuming stationarity, belongs to the family of Schwarzschild metrics, parametrized by a positive mass parameter $m$. Thus, regardless of possible motions of the matter, as long as they remain spherically symmetric, the exterior metric is the Schwarzschild one for some constant $m$. This has the following consequence for stellar dynamics: imagine following the collapse of a cloud of pressureless fluid ("dust"). Within Newtonian gravity, this dust cloud will, after finite time, contract to a point at which the density and the gravitational potential diverge. However, this result cannot be trusted as a sensible physical prediction because, even if one supposes that Newtonian gravity is still valid at very high densities, a matter model based on noninteracting point particles is certainly not. Consider, next, the same situation in the Einstein theory of gravity: here a new question arises, related to the form of the Schwarzschild metric outside of the spherically symmetric body:

$$g = -V^2\,dt^2 + V^{-2}\,dr^2 + r^2\,d\Omega^2, \qquad V^2 = 1 - \frac{2Gm}{rc^2}, \qquad t \in \mathbb{R},\ r \in \left(\frac{2Gm}{c^2}, \infty\right) \qquad [2]$$

Here $d\Omega^2$ is the line element of the standard 2-sphere. Since the metric [2] seems to be singular as $r = 2m$ is approached (from now on, we use units in which $G = c = 1$), there arises the need to understand what happens at the surface of the star when the radius $r = 2m$ is reached. One thus faces the need of a careful study of the geometry of the metric [2] when $r = 2m$ is approached, and crossed.

The first key feature of the metric [2] is its stationarity, of course, with Killing vector field $X$ given by $X = \partial_t$. A Killing field, by definition, is a vector field the local flow of which generates isometries. A spacetime (the term spacetime denotes a smooth, paracompact, connected, orientable, and time-orientable Lorentzian manifold) is called stationary if there exists a Killing vector field $X$ which approaches $\partial_t$ in the asymptotically flat region (where $r$ goes to $\infty$; see below for precise definitions) and generates a one-parameter group of isometries. A spacetime is called static if it is stationary and if the stationary Killing vector $X$ is hypersurface orthogonal, that is, $X^\flat \wedge dX^\flat = 0$, where $X^\flat = X_\mu\,dx^\mu = g_{\mu\nu}X^\nu\,dx^\mu$. A spacetime is called axisymmetric if there exists a Killing vector field $Y$, which generates a one-parameter group of isometries and which behaves like a rotation in the asymptotically flat region, with all orbits $2\pi$-periodic. In asymptotically flat spacetimes, this implies that there exists an axis of symmetry, that is, a set on which the Killing vector vanishes. Killing vector fields which are a nontrivial linear combination of a time translation and of a rotation in the asymptotically flat region are called stationary rotating, or helical.

There exists a technique, due independently to Kruskal and Szekeres, of attaching together two regions $r > 2m$ and two regions $r < 2m$ of the Schwarzschild metric, as in Figure 1, to obtain a manifold with a metric which is smooth at $r = 2m$. In the extended spacetime, the hypersurface $\{r = 2m\}$ is a null hypersurface $\mathcal{E}$, the Schwarzschild event horizon. The stationary Killing vector $X = \partial_t$ extends to a Killing vector in the extended spacetime which becomes tangent to and null on $\mathcal{E}$. The global properties of the Kruskal–Szekeres extension of the exterior Schwarzschild spacetime make this spacetime a natural model for a nonrotating black hole. It is worth noting here that the exterior Schwarzschild spacetime [2] admits an infinite number of nonisometric vacuum extensions, even in the class of maximal, analytic, simply connected ones. The Kruskal–Szekeres extension is singled out by the properties that it is maximal, vacuum, analytic, simply connected, with all maximally extended geodesics either complete, or with the area $4\pi r^2$ of the orbits of the isometry groups tending to zero along them.

We can now come back to the problem of the contracting dust cloud according to the Einstein theory. For simplicity, we take the density of the dust to be uniform (the so-called Oppenheimer–Snyder solution). It then turns out that, in the course of collapse, the surface of the dust will eventually cross the Schwarzschild radius, leaving behind a Schwarzschild black hole. If one follows the dust cloud further, a singularity will eventually form, but will not be visible from the "outside region" where $r > 2m$. For a collapsing body of the mass of the Sun, say, one has $2m \approx 3$ km. Thus, standard phenomenological matter models such as that for dust can still be trusted, so that the previous objection to the Newtonian scenario does not apply.

There is a rotating generalization of the Schwarzschild metric, namely the two-parameter family of exterior Kerr metrics, which in Boyer–Lindquist coordinates takes the form

$$g = -\frac{\Delta - a^2\sin^2\theta}{\Sigma}\,dt^2 - \frac{2a\sin^2\theta\,(r^2 + a^2 - \Delta)}{\Sigma}\,dt\,d\varphi + \frac{(r^2 + a^2)^2 - \Delta a^2\sin^2\theta}{\Sigma}\,\sin^2\theta\,d\varphi^2 + \frac{\Sigma}{\Delta}\,dr^2 + \Sigma\,d\theta^2 \qquad [3]$$

[Figure 1: diagram with axes T and X showing regions I, II, III, and IV, the horizons r = 2m, the curves r = constant > 2m and r = constant < 2m, a line t = constant, and the two singularities at r = 0.]

Figure 1 The Kruskal–Szekeres extension of the Schwarzschild solution. (Adapted with permission from Nicolas J-P (2002) Dirac fields on asymptotically flat space-times. Dissertationes Mathematicae 408: 1–85.)

with $0 \le a < m$. Here $\Sigma = r^2 + a^2\cos^2\theta$, $\Delta = r^2 + a^2 - 2mr$, and $r_+ < r < \infty$ where $r_+ = m + (m^2 - a^2)^{1/2}$. When $a = 0$, the Kerr metric reduces to the Schwarzschild metric. The Kerr metric is again a vacuum solution, and it is stationary with $X = \partial_t$ the asymptotic time translation, as well as axisymmetric with $Y = \partial_\varphi$ the generator of rotations. Similarly to the Schwarzschild case, it turns out that the metric can be smoothly extended across $r = r_+$, with $\{r = r_+\}$ being a smooth null hypersurface $\mathcal{E}$ in the extension. The null generator $K$ of $\mathcal{E}$ is the limit of the stationary-rotating Killing field $X + \omega Y$, where $\omega = a/(2mr_+)$. On the other hand, the Killing vector $X$ is timelike only outside the hypersurface $\{r = m + (m^2 - a^2\cos^2\theta)^{1/2}\}$, on which $X$ becomes null. In the region between $r_+$ and $r = m + (m^2 - a^2\cos^2\theta)^{1/2}$, which is called the ergoregion, $X$ is spacelike. It is also spacelike on and tangent to $\mathcal{E}$, except where the axis of rotation meets $\mathcal{E}$, where $X$ is null. Based on the above properties, the Kerr family provides natural models for rotating black holes.

Unfortunately, as opposed to the spherically symmetric case, there are no known explicit collapsing solutions with rotating matter, in particular no known solutions having the Kerr metric as final state.

The aim of the theory outlined below is to understand the general geometrical features of stationary black holes, and to give a classification of models satisfying the field equations.

Model-Independent Concepts

Some of the notions used informally in the introductory section will now be made more precise. The mathematical notion of black hole is meant to capture the idea of a region of spacetime which cannot be seen by "outside observers." Thus, at the outset, one assumes that there exists a family of physically preferred observers in the spacetime under consideration. When considering isolated physical systems, it is natural to define the "exterior observers" as observers which are "very far" away from the system under consideration. The standard way of making this mathematically precise is by using conformal completions, discussed in more detail in the article about asymptotic structure in this encyclopedia: a pair $(\tilde{\mathcal{M}}, \tilde g)$ is called a conformal completion at infinity, or simply conformal completion, of $(\mathcal{M}, g)$ if $\tilde{\mathcal{M}}$ is a manifold with boundary such that:

1. $\mathcal{M}$ is the interior of $\tilde{\mathcal{M}}$;
2. there exists a function $\Omega$, with the property that the metric $\tilde g$, defined as $\Omega^2 g$ on $\mathcal{M}$, extends by continuity to the boundary of $\tilde{\mathcal{M}}$, with the

extended metric remaining of Lorentzian signature; and
3. $\Omega$ is positive on $\mathcal{M}$, differentiable on $\tilde{\mathcal{M}}$, and vanishes on the boundary

$$\mathscr{I} := \tilde{\mathcal{M}} \setminus \mathcal{M}$$

with $d\Omega$ nowhere vanishing on $\mathscr{I}$.

The boundary $\mathscr{I}$ of $\tilde{\mathcal{M}}$ is called Scri, a phonetic shortcut for "script I." The idea here is the following: forcing $\Omega$ to vanish on $\mathscr{I}$ ensures that $\mathscr{I}$ lies infinitely far away from any physical object, which is a mathematical way of capturing the notion "very far away." The condition that $d\Omega$ does not vanish is a convenient technical condition which ensures that $\mathscr{I}$ is a smooth three-dimensional hypersurface, instead of some, say, one- or two-dimensional object, or of a set with singularities here and there. Thus, $\mathscr{I}$ is an idealized description of a family of observers at infinity.

To distinguish between various points of $\mathscr{I}$, one sets

$$\mathscr{I}^+ = \{\text{points in } \mathscr{I} \text{ which are to the future of the physical spacetime}\}$$

$$\mathscr{I}^- = \{\text{points in } \mathscr{I} \text{ which are to the past of the physical spacetime}\}$$

(Recall that a point $q$ is to the future, respectively to the past, of $p$ if there exists a future directed, respectively past directed, causal curve from $p$ to $q$. Causal curves are curves $\gamma$ such that their tangent vector $\dot\gamma$ is causal everywhere, $g(\dot\gamma, \dot\gamma) \le 0$.) One then defines the black hole region $\mathcal{B}$ as

$$\mathcal{B} := \{\text{points in } \mathcal{M} \text{ which are not in the past of } \mathscr{I}^+\} \qquad [4]$$

By definition, points in the black hole region cannot thus send information to $\mathscr{I}^+$; equivalently, observers on $\mathscr{I}^+$ cannot see points in $\mathcal{B}$. The white-hole region $\mathcal{W}$ is defined by changing the time orientation in [4].

A key notion related to the concept of a black hole is that of future ($\mathcal{E}^+$) and past ($\mathcal{E}^-$) event horizons,

$$\mathcal{E}^+ := \partial\mathcal{B}, \qquad \mathcal{E}^- := \partial\mathcal{W} \qquad [5]$$

Under mild assumptions, event horizons in stationary spacetimes with matter satisfying the null-energy condition,

$$T_{\mu\nu}\ell^\mu\ell^\nu \ge 0 \quad \text{for all null vectors } \ell^\mu \qquad [6]$$

are smooth null hypersurfaces, analytic if the metric is analytic.

In order to develop a reasonable theory, one also needs a regularity condition for the interior of spacetime. This has to be a condition which does not exclude singularities (otherwise the Schwarzschild and Kerr black holes would be excluded), but which nevertheless guarantees a well-behaved exterior region. One such condition, assumed in all the results described below, is the existence in $\mathcal{M}$ of an asymptotically flat spacelike hypersurface $\mathcal{S}$ with compact interior. Further, either $\mathcal{S}$ has no boundary or the boundary of $\mathcal{S}$ lies on $\mathcal{E}^+ \cup \mathcal{E}^-$. To make things precise, for any spacelike hypersurface let $g_{ij}$ be the induced metric, and let $K_{ij}$ denote its extrinsic curvature. A spacelike hypersurface $\mathcal{S}_{\text{ext}}$ diffeomorphic to $\mathbb{R}^3$ minus a ball will be called asymptotically flat if the fields $(g_{ij}, K_{ij})$ satisfy the fall-off conditions

$$|g_{ij} - \delta_{ij}| + r|\partial_\ell g_{ij}| + \cdots + r^k|\partial_{\ell_1\cdots\ell_k}g_{ij}| + r|K_{ij}| + \cdots + r^k|\partial_{\ell_1\cdots\ell_{k-1}}K_{ij}| \le Cr^{-1} \qquad [7]$$

for some constants $C$, $k \ge 1$. A hypersurface $\mathcal{S}$ (with or without boundary) will be said to be asymptotically flat with compact interior if $\mathcal{S}$ is of the form $\mathcal{S}_{\text{int}} \cup \mathcal{S}_{\text{ext}}$, with $\mathcal{S}_{\text{int}}$ compact and $\mathcal{S}_{\text{ext}}$ asymptotically flat.

There exists a canonical way of constructing a conformal completion with good global properties for stationary spacetimes which are asymptotically flat in the sense of [7], and which are vacuum sufficiently far out in the asymptotic region. This conformal completion is referred to as the standard completion and will be assumed from now on.

Returning to the event horizon $\mathcal{E} = \mathcal{E}^+ \cup \mathcal{E}^-$, it is not very difficult to show that every Killing vector field $X$ is necessarily tangent to $\mathcal{E}$. Since the latter set is a null Lipschitz hypersurface, it follows that $X$ is either null or spacelike on $\mathcal{E}$. This leads to a preferred class of event horizons, called Killing horizons. By definition, a Killing horizon associated with a Killing vector $K$ is a null hypersurface which coincides with a connected component of the set

$$H(K) := \{p \in \mathcal{M} : g(K,K)(p) = 0,\ K(p) \neq 0\} \qquad [8]$$

A simple example is provided by the "boost Killing vector field" $K = z\,\partial_t + t\,\partial_z$ in Minkowski spacetime: $H(K)$ has four connected components,

$$H_{\epsilon\delta} := \{t = \epsilon z,\ \delta t > 0\}, \qquad \epsilon, \delta \in \{\pm 1\}$$

The closure $\bar H$ of $H$ is the set $\{|t| = |z|\}$, which is not a manifold, because of the crossing of the null hyperplanes $\{t = \pm z\}$ at $t = z = 0$. Horizons of this type are referred to as bifurcate Killing horizons, with the set $\{K(p) = 0\}$ being called the bifurcation surface of $H(K)$. The bifurcate horizon structure in the Kruskal–Szekeres–Schwarzschild spacetime can be clearly seen in Figures 1 and 2.
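The boost example can be made concrete with a few lines of code. The sketch below assumes Cartesian coordinates on the $(t, z)$ plane of Minkowski spacetime with metric $-dt^2 + dz^2$ (the remaining directions play no role); the function names are hypothetical. It evaluates $g(K,K) = t^2 - z^2$, confirms that the symmetrized derivatives of $K_\mu$ vanish (the Killing equation), and labels the component $H_{\epsilon\delta}$ on which a given point lies.

```python
def boost_killing_vector(t, z):
    """Contravariant components (K^t, K^z) of K = z d/dt + t d/dz."""
    return (z, t)


def lowered(t, z):
    """Covariant components (K_t, K_z) for the metric -dt^2 + dz^2."""
    k_up_t, k_up_z = boost_killing_vector(t, z)
    return (-k_up_t, k_up_z)


def norm_squared(t, z):
    """g(K, K) = -(K^t)^2 + (K^z)^2 = t^2 - z^2."""
    k_up_t, k_up_z = boost_killing_vector(t, z)
    return -k_up_t * k_up_t + k_up_z * k_up_z


def killing_residuals(t, z):
    """Return (2 d_t K_t, 2 d_z K_z, d_t K_z + d_z K_t); all vanish for a Killing field.

    K is linear in (t, z), so unit-step differences give the exact derivatives.
    """
    kt0, kz0 = lowered(t, z)
    kt_dt, kz_dt = lowered(t + 1, z)
    kt_dz, kz_dz = lowered(t, z + 1)
    return (2 * (kt_dt - kt0), 2 * (kz_dz - kz0), (kz_dt - kz0) + (kt_dz - kt0))


def horizon_component(t, z):
    """Return (epsilon, delta) if the point lies on {t = epsilon z, delta t > 0}.

    Returns None off the set H(K), including at the bifurcation point
    t = z = 0, where K vanishes.
    """
    if norm_squared(t, z) != 0 or (t == 0 and z == 0):
        return None
    epsilon = 1 if t == z else -1
    delta = 1 if t > 0 else -1
    return (epsilon, delta)


if __name__ == "__main__":
    for point in [(1, 1), (1, -1), (-2, -2), (-2, 2), (0, 0), (2, 1), (1, 2)]:
        print(point, norm_squared(*point), horizon_component(*point))
```

Points with $|t| > |z|$ give $g(K,K) > 0$, so $K$ is spacelike there, while points with $|t| < |z|$ give a timelike $K$; the four null half-lines separate these regions.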

The Vishveshwara–Carter lemma shows that if a Killing vector $K$ is hypersurface orthogonal, $K^\flat \wedge dK^\flat = 0$, then the set $H(K)$ defined in [8] is a union of smooth null hypersurfaces, with $K$ being tangent to the null geodesics threading $H$ ("$H$ is generated by $K$"), and so is indeed a Killing horizon. It has been shown by Carter that the same conclusion can be reached if the hypothesis of hypersurface orthogonality is replaced by that of existence of two linearly independent Killing vector fields.

In stationary-axisymmetric spacetimes, a Killing vector $K$ tangent to the generators of a Killing horizon $H$ can be normalized so that $K = X + \omega Y$, where $X$ is the Killing vector field which asymptotes to a time translation in the asymptotic region, and $Y$ is the Killing vector field which generates rotations in the asymptotic region. The constant $\omega$ is called the angular velocity of the Killing horizon $H$.

On a Killing horizon $H(K)$, one necessarily has

$$\nabla^\mu(K^\nu K_\nu) = -2\kappa K^\mu \qquad [9]$$

Assuming the so-called dominant-energy condition on $T_{\mu\nu}$ (see Positive Energy Theorem and Other Inequalities in GR), it can be shown that $\kappa$ is constant (recall that Killing horizons are always connected in the terminology used in this article); it is called the surface gravity of $H$. A Killing horizon is called degenerate when $\kappa = 0$, and nondegenerate otherwise; by an abuse of terminology, one similarly talks of degenerate black holes, etc. In Kerr spacetimes we have $\kappa = 0$ if and only if $m = a$. A fundamental theorem of Boyer shows that degenerate horizons are closed. This implies that a horizon $H(K)$ such that $K$ has zeros in $\bar H$ is nondegenerate, and is of bifurcate type, as described above. Further, a nondegenerate Killing horizon with complete geodesic generators always contains zeros of $K$ in its closure. However, it is not true that existence of a nondegenerate horizon implies that of zeros of $K$: take the Killing vector field $z\,\partial_t + t\,\partial_z$ in Minkowski spacetime from which the 2-plane $\{z = t = 0\}$ has been removed. The universal cover of that last spacetime provides a spacetime in which one cannot restore the points which have been artificially removed, without violating the manifold property.

The domain of outer communications (DOC) of a black hole spacetime is defined as

$$\langle\langle\mathcal{M}\rangle\rangle := \mathcal{M} \setminus \{\mathcal{B} \cup \mathcal{W}\} \qquad [10]$$

Thus, $\langle\langle\mathcal{M}\rangle\rangle$ is the region lying outside of the white-hole region and outside of the black hole region; it is the region which can both be seen by the outside observers and influenced by those.

The subset of $\langle\langle\mathcal{M}\rangle\rangle$ where $X$ is spacelike is called the ergoregion. In the Schwarzschild spacetime, $\omega = 0$ and the ergoregion is empty, but neither of these is true in Kerr with $a \neq 0$.

A very convenient method for visualizing the global structure of spacetimes is provided by the Carter–Penrose diagrams. An example of such a diagram is given in Figure 2.
[Figure 2: Carter–Penrose diagram showing regions I, II, III, and IV, the horizons r = 2M, the curves r = constant > 2M and r = constant < 2M, lines t = constant, the boundaries r = infinity, the points i+, i0, and i–, and the two singularities at r = 0.]

Figure 2 The Carter–Penrose diagram for the Kruskal–Szekeres spacetime. There are actually two asymptotically flat regions, with corresponding $\mathscr{I}^\pm$ and $\mathcal{E}^\pm$ defined with respect to the second region, but not indicated on this diagram. Each point in this diagram represents a two-dimensional sphere, and coordinates are chosen so that light cones have slopes $\pm 1$. Regions are numbered as in Figure 1. (Adapted with permission from Nicolas J-P (2002) Dirac fields on asymptotically flat space-times. Dissertationes Mathematicae 408: 1–85.)
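The horizon quantities introduced so far for the Kerr family are simple closed-form expressions and can be tabulated directly. The sketch below works in the geometric units $G = c = 1$ used in the text and evaluates $r_+ = m + (m^2 - a^2)^{1/2}$, $\omega = a/(2mr_+)$, and the outer boundary of the ergoregion. For the surface gravity it uses the standard Kerr expression $\kappa = (m^2 - a^2)^{1/2}/(2mr_+)$, which vanishes exactly when $m = a$. The SI constants used to recover the "$2m \approx 3$ km" figure for a solar-mass body are assumed values, and all function names are hypothetical.

```python
import math

G_NEWTON = 6.674e-11       # m^3 kg^-1 s^-2 (assumed SI value)
C_LIGHT = 2.998e8          # m/s (assumed SI value)
SOLAR_MASS_KG = 1.989e30   # kg (assumed value)


def schwarzschild_radius_km(mass_kg):
    """Schwarzschild radius 2Gm/c^2 in kilometres."""
    return 2.0 * G_NEWTON * mass_kg / C_LIGHT ** 2 / 1000.0


def kerr_horizon(m, a):
    """Return (r_plus, omega, kappa) for Kerr parameters m > 0, 0 <= a <= m.

    Geometric units G = c = 1. The case a = m is the degenerate horizon.
    """
    if m <= 0 or not (0 <= a <= m):
        raise ValueError("need m > 0 and 0 <= a <= m")
    root = math.sqrt(m * m - a * a)
    r_plus = m + root
    omega = a / (2.0 * m * r_plus)     # angular velocity of the horizon
    kappa = root / (2.0 * m * r_plus)  # surface gravity
    return r_plus, omega, kappa


def ergosurface_radius(m, a, theta):
    """Radius r = m + (m^2 - a^2 cos^2(theta))^(1/2) on which X = d/dt is null."""
    return m + math.sqrt(m * m - (a * math.cos(theta)) ** 2)


if __name__ == "__main__":
    print(f"2Gm/c^2 for the Sun: {schwarzschild_radius_km(SOLAR_MASS_KG):.2f} km")
    for spin in (0.0, 0.5, 0.9, 1.0):
        r_plus, omega, kappa = kerr_horizon(1.0, spin)
        print(f"a = {spin}: r+ = {r_plus:.4f}, omega = {omega:.4f}, kappa = {kappa:.4f}")
```

For $a = 0$ the function returns the Schwarzschild values $r_+ = 2m$, $\omega = 0$, and $\kappa = 1/(4m)$, and on the rotation axis the ergosurface radius coincides with $r_+$, consistent with the statement that $X$ is null where the axis meets the horizon.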

A corollary of the topological censorship theorem of Friedman, Schleich, and Witt is that DOCs of regular black hole spacetimes satisfying the dominant-energy condition are simply connected. This implies that connected components of event horizons in stationary spacetimes have $\mathbb{R} \times S^2$ topology.

The discussion of the concepts associated with stationary black hole spacetimes can be concluded by summarizing the properties of the Schwarzschild and Kerr geometries: the extended Kerr spacetime with $m > a$ is a black hole spacetime with the hypersurface $\{r = r_+\}$ forming a nondegenerate, bifurcate Killing horizon generated by the vector field $X + \omega Y$ and surface gravity given by

$$\kappa = \frac{(m^2 - a^2)^{1/2}}{2m\left[m + (m^2 - a^2)^{1/2}\right]}$$

In the case $a = 0$, where the angular velocity $\omega$ vanishes, $X$ is hypersurface orthogonal and becomes the generator of $H$. The bifurcation surface in this case is the totally geodesic 2-sphere, along which the four regions in Figure 1 are joined.

Classification of Stationary Solutions ("No-Hair Theorems")

Based on the facts below, it is expected that the DOCs of appropriately regular, stationary, vacuum black holes are isometrically diffeomorphic to those of Kerr black holes:

1. The rigidity theorem (Hawking). Event horizons in regular, nondegenerate, stationary, analytic vacuum black holes are either Killing horizons for $X$, or there exists a second Killing vector in $\langle\langle\mathcal{M}\rangle\rangle$.
2. The Killing horizons theorem (Sudarsky–Wald). Nondegenerate stationary vacuum black holes such that the event horizon is the union of Killing horizons of $X$ are static.
3. The Schwarzschild black holes exhaust the family of static regular vacuum black holes (Israel, Bunting and Masood-ul-Alam, Chruściel).
4. The Kerr black holes satisfying

$$m^2 > a^2 \qquad [11]$$

exhaust the family of nondegenerate, stationary-axisymmetric, vacuum, connected black holes. Here $m$ is the total Arnowitt–Deser–Misner (ADM) mass, while the product $am$ is the total ADM angular momentum. (Of course, these quantities generalize the constants $a$ and $m$ appearing in the Kerr metric.) The framework for the proof has been set up by Carter, and the statement above is due to Robinson.

The above results are collectively known under the name of no-hair theorems, and they have not provided the final answer to the problem so far. There are no a priori reasons known for the analyticity hypothesis in the rigidity theorem. Further, degenerate horizons have been completely understood in the static case only.

Yet another key open question is that of the existence of nonconnected regular stationary-axisymmetric vacuum black holes. The following result is due to Weinstein: let $\partial\mathcal{S}_a$, $a = 1, \ldots, N$, be the connected components of $\partial\mathcal{S}$. Let $X^\flat = g_{\mu\nu}X^\mu\,dx^\nu$, where $X^\mu$ is the Killing vector field which asymptotically approaches the unit normal to $\mathcal{S}_{\text{ext}}$. Similarly, set $Y^\flat = g_{\mu\nu}Y^\mu\,dx^\nu$, $Y^\mu$ being the Killing vector field associated with rotations. On each $\partial\mathcal{S}_a$, there exists a constant $\omega_a$ such that the vector $X + \omega_a Y$ is tangent to the generators of the Killing horizon intersecting $\partial\mathcal{S}_a$. The constant $\omega_a$ is called the angular velocity of the associated Killing horizon. Define

We confine attention to the "outside region" of black holes, the DOC. (Except for the degenerate case discussed later, the "inside" (black hole) region is not stationary, so that this restriction already follows from the requirement of stationarity.) For reasons of space, we only consider vacuum solutions; there exists a similar theory for electro-vacuum black holes. (There is a somewhat less developed theory for black hole spacetimes in the presence of nonabelian gauge fields.) In connection with a collapse scenario, the vacuum condition begs the question: collapse of what? The answer is twofold: first, there are large classes of solutions of Einstein equations describing pure gravitational waves. It is believed that sufficiently strong such solutions will form black holes. (Whether or not they will do that is related to the cosmic censorship conjecture, see Spacetime Topology, Causal Structure and Singularities.) Consider, next, a
dynamical situation in which matter is initially present. Z
1
The conditions imposed in this section correspond ma ¼ 
dX[ ½12
8 @sa
then to a final state in which matter has either been
radiated away to infinity, or has been swallowed by Z
1
the black hole (as in the spherically symmetric La ¼ 
dY [ ½13
Oppenheimer–Snyder collapse described above). 4 @sa
Such integrals are called Komar integrals. One usually thinks of $L_a$ as the angular momentum of each connected component of the black hole. Set

\[
\mu_a = m_a - 2\omega_a L_a \qquad [14]
\]

Weinstein shows that one necessarily has $\mu_a > 0$. The problem at hand can be reduced to a harmonic-map equation, also known as the Ernst equation, involving a singular map from $\mathbb{R}^3$ with Euclidean metric $\delta$ to the two-dimensional hyperbolic space. Let $r_a > 0$, $a = 1, \ldots, N - 1$, be the distance in $\mathbb{R}^3$ along the axis between neighboring black holes as measured with respect to the (unphysical) metric $\delta$. Weinstein proved that for nondegenerate regular black holes the inequality [11] holds, and that the metric on $\langle\langle \mathcal{M} \rangle\rangle$ is determined up to isometry by the $3N - 1$ parameters

\[
(\mu_1, \ldots, \mu_N, L_1, \ldots, L_N, r_1, \ldots, r_{N-1}) \qquad [15]
\]

just described, with $r_a, \mu_a > 0$. These results by Weinstein contain the no-hair theorem of Carter and Robinson as a special case. Weinstein also shows that, for every $N \geq 2$ and for every set of parameters [15] with $\mu_a, r_a > 0$, there exists a solution of the problem at hand. It is known that for some sets of parameters [15] the solutions will have ‘‘strut singularities’’ between some pairs of neighboring black holes, but the existence of the ‘‘struts’’ for all sets of parameters as above is not known, and is one of the main open problems in our understanding of stationary-axisymmetric electro-vacuum black holes. The existence and uniqueness results of Weinstein remain valid when strut singularities are allowed in the metric at the outset, although such solutions do not fall into the category of regular black holes discussed here.

Acknowledgments

This work was partially supported by a Polish Research Committee grant 2 P03B 073 24 and by the Erwin Schroedinger Institute, Vienna.

See also: Asymptotic Structure and Conformal Infinity; Black Hole Mechanics; Critical Phenomena in Gravitational Collapse; Einstein Equations: Exact Solutions; Einstein Equations: Initial Value Formulation; Geometric Flows and the Penrose Inequality; Spacetime Topology, Causal Structure and Singularities.

Further Reading

Carter B (1973) Black hole equilibrium states. In: de Witt C and de Witt B (eds.) Black Holes, Proceedings of the Les Houches Summer School. New York: Gordon and Breach.
Chruściel PT (2002) Black holes. In: Friedrich H and Frauendiener J (eds.) Proceedings of the Tübingen Workshop on the Conformal Structure of Space-Times, Springer Lecture Notes in Physics, vol. 604, pp. 61–102 (gr-qc/0201053). Springer: Heidelberg.
Hawking SW and Ellis GFR (1973) The Large Scale Structure of Space–Time. Cambridge: Cambridge University Press.
Heusler M (1996) Black Hole Uniqueness Theorems. Cambridge: Cambridge University Press.
Heusler M (1998) Stationary black holes: uniqueness and beyond. Living Reviews Relativity 1 (http://www.livingreviews.org/lrr-1988-6).
Nicolas J-P (2002) Dirac fields on asymptotically flat space-times. Dissertationes Mathematicae 408: 1–85.
O’Neill B (1995) The Geometry of Kerr Black Holes. Wellesley, MA: A.K. Peters.
Volkov MS and Gal’tsov DV (1999) Gravitating non-abelian solitons and black holes with Yang–Mills fields. Physics Reports 319: 1–83 (hep-th/9810070).
Wald RM (1984) General Relativity. Chicago: University of Chicago Press.
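As a small numerical illustration of the Kerr surface gravity formula quoted earlier in this article, the following sketch evaluates $\kappa$ for a few values of $a$ at fixed $m$. The function name `kerr_surface_gravity` and the use of geometric units are assumptions made for this example only; they do not come from the article. For $a = 0$ the formula reduces to the Schwarzschild value $1/(4m)$, and $\kappa$ tends to zero as $a^2$ approaches $m^2$, which is the degenerate case excluded by [11].

```python
import math


def kerr_surface_gravity(m: float, a: float) -> float:
    """Surface gravity of the Kerr horizon {r = r_+}.

    Implements kappa = sqrt(m^2 - a^2) / (2 m [m + sqrt(m^2 - a^2)]).
    Geometric units are assumed (an assumption of this sketch).
    """
    if m <= 0:
        raise ValueError("the mass parameter m must be positive")
    if a * a > m * m:
        raise ValueError("the formula requires a^2 <= m^2")
    root = math.sqrt(m * m - a * a)
    return root / (2.0 * m * (m + root))


if __name__ == "__main__":
    # kappa decreases from 1/(4m) at a = 0 to 0 at a = m.
    for a in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(f"a = {a:4.2f}   kappa = {kerr_surface_gravity(1.0, a):.6f}")
```

The vanishing of $\kappa$ at $a^2 = m^2$ is why the strict inequality [11] is the nondegeneracy condition in the uniqueness statement.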

Stationary Phase Approximation


J J Duistermaat, Universiteit Utrecht, Utrecht, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Introduction

An oscillatory integral is an integral of the form

\[
I(\omega) = \int e^{i\omega\varphi(\theta)}\, a(\theta)\, d\theta \qquad [1]
\]

Here the integration is over a smooth $k$-dimensional manifold $\Theta$ which is provided with a smooth density $d\theta$. The real variable $\omega$ plays the role of a frequency variable, whereas the real-valued smooth function $\varphi$ on $\Theta$ is called the phase function. The amplitude function $a$ is assumed to be a compactly supported complex (vector-) valued smooth function on $\Theta$. The topic of this article is the asymptotic behavior of the oscillatory integral $I(\omega)$ as the frequency $\omega$ tends to infinity.

When the manifold $\Theta$ is not compact and the amplitude function is not compactly supported, then a smooth cutoff function may be used to write the integral as the sum of an integral with a compactly supported amplitude and one with an amplitude which is equal to zero in a large compact subset of $\Theta$. The

latter integral can be studied if suitable assumptions are made about the asymptotic behavior of the phase function and the amplitude at infinity, but this is not the subject of this article. The use of the exponential function with purely imaginary argument instead of the sine and the cosine is just a matter of convenience.

The first observation about oscillatory integrals in the next section is the principle of stationary phase, which states that the contributions to the integral which are not rapidly decreasing as $\omega \to \infty$ only come from the stationary points of $\varphi$, the points $\theta \in \Theta$ where the total derivative $d\varphi(\theta)$ of $\varphi$ is equal to zero. This principle is closely related to the observation that a superposition of waves is maximal at points where the waves are in phase, an observation which goes back to Huygens (1690).

Assume that $\theta_0$ is a nondegenerate stationary point of $\varphi$. That is, $d\varphi(\theta_0) = 0$ and the Hessian $D^2\varphi(\theta_0)$ of $\varphi$ at $\theta_0$ is nondegenerate. Then $\theta_0$ is an isolated stationary point of $\varphi$, and the contribution to $I(\omega)$ of a neighborhood of $\theta_0$ has an asymptotic expansion of the form

\[
I(\omega) \sim e^{i\omega\varphi(\theta_0)} \sum_{r=0}^{\infty} c_r\, \omega^{-k/2 - r}, \qquad \omega \to \infty
\]

Here the leading coefficient $c_0$ is the product of $a(\theta_0)$ with a nonzero constant which only depends on $D^2\varphi(\theta_0)$ and the density $d\theta$ at $\theta_0$. For increasing $r$ the coefficients $c_r$ depend on the derivatives of $\varphi$ and $a$ at $\theta_0$ of increasing order (see the section ‘‘The method of stationary phase’’).

Usually, even if all the objects are analytic in a neighborhood of $\theta_0$, the asymptotic power series does not converge. However, there are exceptional cases where the stationary phase approximation is exact. Assume, for instance, that $\Theta$ is a compact manifold provided with a symplectic form $\sigma$, $\varphi$ is the Hamiltonian function of a Hamiltonian circle action on $\Theta$ with isolated fixed points, and $a(\theta)\, d\theta = \sigma^k / k!$. Then the stationary points of $\varphi$ are the fixed points of the circle action, each stationary point of $\varphi$ is nondegenerate and $I(\omega)$ is equal to the sum over the finitely many stationary points of only the leading terms of the asymptotic expansions at the stationary points. This Duistermaat–Heckman formula is a consequence of a more general localization formula in equivariant cohomology (see the section ‘‘Exact stationary phase’’).

For the purpose of applications, but also in the analysis of oscillatory integrals, it is worthwhile to allow complex-valued phase functions, but with a local minimum for the imaginary part at the stationary point $\theta_0$ of the real parts. That is, the real part of the exponent $i\omega\varphi(\theta)$ has a local maximum at $\theta_0$. An extreme case occurs when $\varphi(\theta) = i\psi(\theta)$ for a real-valued function $\psi$ which has a nondegenerate local minimum at $\theta_0$, in which case the integrand is a sharply peaking Gaussian density at $\theta_0$. When $\varphi$ and $a$ are analytic near $\theta_0$, then the method of steepest descent consists of deforming the path of integration in the complex domain in such a way that the integrand becomes such a sharply peaking Gaussian density. During the deformation, the integral does not change because of Cauchy’s integral theorem.

An important extension of the theory occurs if the real-valued phase function and the amplitude are allowed to depend smoothly on additional parameters $x$, which vary in an $n$-dimensional smooth manifold $M$. The amplitude is also allowed to depend on $\omega$, with an asymptotic expansion of the form

\[
a(x, \theta, \omega) \sim \sum_{r=0}^{\infty} a_r(x, \theta)\, \omega^{m + (k/2) - r} \quad \text{as } \omega \to \infty \qquad [2]
\]

The expansion is supposed to be locally uniform in $(x, \theta)$ and to allow termwise differentiations of any order with respect to the variables $(x, \theta)$. Then the integral

\[
I(x, \omega) = \int e^{i\omega\varphi(x, \theta)}\, a(x, \theta, \omega)\, d\theta
\]

is called an oscillatory integral of order $m$. Here the function $x \mapsto I(x, \omega)$ is viewed as a continuous superposition of the $\theta$-dependent family of oscillatory functions $x \mapsto e^{i\omega\varphi(x, \theta)} a(x, \theta)$.

The example which formed the point of departure of Airy (1838) is that $e^{i\omega\varphi(x, \theta)} a(x, \theta, \omega)$ is the wave which arrives at the points $x$ in spacetime which is sent out by a point $\theta$ on a reflecting mirror. That is, at $x$ one collects (= integrates over $\theta$) all the waves sent out by the various points $\theta$ of the mirror $\Theta$. The main point of the theory, however, is that in great generality the solutions of linear partial differential equations, such as classical wave equations or quantum mechanical Schrödinger equations, can be represented, as functions of $x$, as oscillatory integrals. This construction has led to decisive progress in the general theory of linear partial differential equations with smoothly varying coefficients.

According to the principle of stationary phase, the main asymptotic contributions to the integral come from the points $\theta$ such that $\partial\varphi(x, \theta)/\partial\theta = 0$. The phase function $\varphi \in C^\infty(M \times \Theta)$ is called nondegenerate if the $(n + k) \times k$-matrix

\[
\frac{\partial^2 \varphi(x, \theta)}{\partial(x, \theta)\, \partial\theta} \ \text{ has rank } k \text{ when } \ \frac{\partial\varphi(x, \theta)}{\partial\theta} = 0 \qquad [3]
\]

This is the natural condition to ensure that the set

\[
S_\varphi := \left\{ (x, \theta) \in M \times \Theta \;\middle|\; \frac{\partial\varphi(x, \theta)}{\partial\theta} = 0 \right\}
\]

is a smooth $n$-dimensional submanifold of $M \times \Theta$. The condition [3], moreover, implies that the mapping

\[
\iota_\varphi : S_\varphi \ni (x, \theta) \mapsto \left( x, \frac{\partial\varphi(x, \theta)}{\partial x} \right) \in T^*M
\]

is a smooth immersion from $S_\varphi$ into the cotangent bundle $T^*M$ of $M$. Note that $\xi = \partial\varphi(x, \theta)/\partial x$ is coordinate invariantly defined as a linear form on the tangent space $T_xM$ of $M$ at the point $x$. That is, $\xi \in (T_xM)^*$ = the dual space of $T_xM$, and $(T_xM)^*$ is the fiber of $T^*M$ over $x$. In classical mechanics, $T^*M$ is the phase space of the position space $M$, and a linear form $\xi$ on $T_xM$ is called a momentum vector at the position $x$. If $\sigma$ denotes the canonical symplectic form on $T^*M$, then $\iota_\varphi^*\sigma = 0$. The immersion $\iota_\varphi$ locally embeds $S_\varphi$ onto a smooth $n$-dimensional submanifold $\Lambda_\varphi$ of $T^*M$, which is a Lagrangian manifold in $T^*M$, which by definition means that $\sigma|_{\Lambda_\varphi} = 0$.

Oscillatory integrals with very different phase functions and amplitudes can define the same $\omega$-dependent functions on $M$. The theory of Hörmander (1971, section 3.1) says that the germs of the Lagrangian manifolds $\Lambda_\varphi$ and $\Lambda_\psi$ are the same if and only if $\varphi$ and $\psi$ define the same class of oscillatory integrals. Moreover, every Lagrangian submanifold $\Lambda$ of $T^*M$ is locally of the form $\Lambda_\varphi$ for some nondegenerate phase function $\varphi$. In this way, the mapping $\varphi \mapsto \Lambda_\varphi$ defines a bijection between the set of equivalence classes of germs of nondegenerate phase functions and the set of germs of Lagrangian submanifolds of $T^*M$. Let $\Lambda$ be an immersed Lagrangian submanifold of $T^*M$. A global oscillatory integral of order $m$ on $M$, defined by $\Lambda$, is a locally finite sum $u(x, \omega)$ of oscillatory integrals of order $m$ with nondegenerate phase functions $\varphi$ such that $\Lambda_\varphi \subset \Lambda$. The leading terms of the amplitudes correspond to a section $s$ of a canonically defined complex line bundle $\mathbb{L}$ over $\Lambda$, which is called the principal symbol of $u$ (see the section ‘‘The principal symbol on the Lagrangian manifold’’).

If $P$ is a linear partial differential operator, such as the wave operators, in which the coefficients may depend in a smooth way on $x$ and in a polynomial way on $\omega$, then the condition that $Pu$ is asymptotically small implies that $p = 0$ on $\Lambda$, in which $p$ is a smooth function on $T^*M$, called the principal symbol of $P$. Because $\Lambda$ is a Lagrangian manifold, the equation $p = 0$ implies that $\Lambda$ is invariant under the flow of the Hamiltonian system with Hamilton function equal to $p$. Furthermore, the principal symbol $s$ of $u$ satisfies a homogeneous first-order ordinary differential equation along the solution curves of the Hamiltonian system. Conversely, these properties can be used to construct global oscillatory integrals $u$ which asymptotically satisfy $Pu = 0$ and have prescribed initial values. This theory, due to Maslov (1972), may be viewed as a far-reaching generalization of the WKB method.

Let $\pi : T^*M \to M : (x, \xi) \mapsto x$ denote the canonical projection from $T^*M$ onto $M$. The projections into $M$ of the solution curves in a Lagrangian submanifold $\Lambda$ of $T^*M$, of a Hamiltonian system which leaves $\Lambda$ invariant, are the ray bundles of geometrical optics. If $\Lambda$ is not transversal to the fiber of $T^*M$ at $(x, \xi)$, then the ray bundle exhibits a caustic at the point $x \in M$, and the oscillatory integral is asymptotically of larger order than $\omega^m$ near $x$. Applying the theory of unfoldings of singularities to the phase function, one can determine the structurally stable caustics and obtain normal forms of the oscillatory integrals in the structurally stable cases (see the section ‘‘Caustics’’).

If we also integrate over the frequency variable $\omega$, then we obtain the Fourier integral distributions $u$ of Hörmander (1971, sections 1.2 and 3.2). In this case the corresponding Lagrangian manifold $\Lambda$ is conic in the sense that if $(x, \xi) \in \Lambda$, then $(x, \tau\xi) \in \Lambda$ for every $\tau > 0$. The wave front set of $u$, which is the microlocal singular locus of the distribution $u$, is contained in $\Lambda$, with equality if the principal symbol of $u$ is not equal to zero at the corresponding stationary points of the phase function. Fourier integral operators are defined as the linear operators acting on distributions, of which the distribution kernels are Fourier integral distributions. Under a suitable transversality condition for the Lagrangian manifolds of the distribution kernels, the composition of two Fourier integral operators is again a Fourier integral operator, and the principal symbol of the composition is a product of the principal symbols. The proof is an application of the method of stationary phase. Fourier integral operators are a very powerful tool in the analysis of linear partial differential operators with smoothly varying coefficients (see Hörmander (1985)).

The Principle of Stationary Phase

The principle of stationary phase says that if the phase function $\varphi$ has no stationary points in the support of the amplitude function $a$, then the

oscillatory integral [1] is rapidly decreasing, in the sense that for every $N$ we have $I(\omega) = O(\omega^{-N})$ as $\omega \to \infty$. For the proof, one introduces a vector field $v$ on $\Theta$ such that $v\varphi = 1$ on a neighborhood of the support of $a$. Then $e^{i\omega\varphi} = (i\omega)^{-1} v(e^{i\omega\varphi})$, and an integration by parts in [1] yields that

\[
I(\omega) = \frac{1}{i\omega} \int e^{i\omega\varphi(\theta)}\, ({}^t v\, a)(\theta)\, d\theta
\]

where ${}^t v$ denotes the transpose of the linear partial differential operator $v$. Iterating this, the rapid decrease of $I(\omega)$ follows.

Using cutoff functions, $I(\omega)$ is, modulo a rapidly decreasing function, equal to an oscillatory integral with phase function $\varphi$ and an amplitude which has support in an arbitrarily small neighborhood of the set of stationary points of $\varphi$. In this sense, the contributions to the integral which are not rapidly decreasing come only from the stationary points of $\varphi$.

The Method of Stationary Phase

Assume that $\theta_0$ is a nondegenerate stationary point of $\varphi$. Then $\theta_0$ is an isolated stationary point of $\varphi$. Using local coordinates near $\theta_0$, the contribution to [1] from the neighborhood of $\theta_0$ can be written as an oscillatory integral with $\Theta = \mathbb{R}^k$ and a phase function $\varphi$ which has a nondegenerate stationary point at $0$. Write $Q = D^2\varphi(0)$. According to the Morse lemma, there is a smooth substitution of variables $\theta = T(y)$ such that $T(0) = 0$, $DT(0) = I$, and $\varphi(T(y)) = \varphi(0) + \langle Qy, y \rangle / 2$ for all $y$ in a neighborhood of $0$ in $\mathbb{R}^k$. Applying this substitution of variables to [1] we obtain

\[
I(\omega) = e^{i\omega\varphi(0)} \int_{\mathbb{R}^k} e^{i\omega\langle Qy, y \rangle / 2}\, b(y)\, dy
\]

where $b$ is a compactly supported smooth function on $\mathbb{R}^k$ with $b(0) = a(0)$. Now the Fourier transform of the function $y \mapsto e^{i\omega\langle Qy, y \rangle / 2}$ is equal to the function

\[
\eta \mapsto \left[ \det\left( \frac{\omega}{2\pi i}\, Q \right) \right]^{-1/2} e^{-i\omega^{-1} \langle Q^{-1}\eta, \eta \rangle / 2} \qquad [4]
\]

Both in the definition of the square root of the determinant and in the proof one uses the analytic continuation to the domain of complex-valued symmetric bilinear forms $Q$ for which the imaginary part of $Q$ is positive definite. For purely imaginary $Q$ we have the familiar formula for the Fourier transform of a Gaussian density (see Hörmander (1990, theorem 7.6.1)). The Taylor expansion of the exponential factor in [4] then yields that

\[
\int_{\mathbb{R}^k} e^{i\omega\langle Qy, y \rangle / 2}\, b(y)\, dy \sim \left[ \det\left( \frac{\omega}{2\pi i}\, Q \right) \right]^{-1/2} \sum_{r=0}^{\infty} \frac{1}{r!}\, (2i\omega)^{-r} \left( -\left\langle Q^{-1} \frac{\partial}{\partial y}, \frac{\partial}{\partial y} \right\rangle \right)^{r} b(y) \Big|_{y=0}
\]

as $\omega \to \infty$ (see Hörmander (1990, lemma 7.7.3)).

It is important for the applications that, if the phase function and amplitude depend smoothly on parameters, all the constructions can be made to depend smoothly on the parameters.

Exact Stationary Phase

Suppose that we have given an action of a Lie group $G$ on the manifold $\Theta$. Let $\mathfrak{g}$ denote the Lie algebra of $G$. For any $g \in G$ and $X \in \mathfrak{g}$ the corresponding diffeomorphism of $\Theta$ and vector field on $\Theta$ is denoted by $g_\Theta$ and $X_\Theta$, respectively. If $\Omega(\Theta)$ denotes the algebra of smooth differential forms on $\Theta$, then we consider the algebra $S\mathfrak{g}^* \otimes \Omega(\Theta)$ of all $\Omega(\Theta)$-valued polynomials on $\mathfrak{g}$, where $S\mathfrak{g}^*$ denotes the algebra of all polynomial functions on $\mathfrak{g}$. On $S\mathfrak{g}^* \otimes \Omega(\Theta)$ we have the action of $g \in G$ which sends $\alpha$ to $X \mapsto g_\Theta^*(\alpha(\mathrm{Ad}\, g\, X))$. Let $A = (S\mathfrak{g}^* \otimes \Omega(\Theta))^G$ denote the subalgebra of all $G$-invariant elements of $S\mathfrak{g}^* \otimes \Omega(\Theta)$. The equivariant exterior derivative $D$ is defined by

\[
(D\alpha)(X) = d(\alpha(X)) - i_{X_\Theta}(\alpha(X))
\]

If $\alpha$ is homogeneous as a differential form of degree $p$ and homogeneous as a polynomial on $\mathfrak{g}$ of degree $q$, then $r = p + 2q$ is called the total degree of $\alpha$. Let $A^r$ denote the space of sums of such $\alpha \in A$ of total degree $r$. Then $D^r = D : A^r \to A^{r+1}$ and $D^r \circ D^{r-1} = 0$. The space $H^r_G(\Theta) := \ker D^r / \mathrm{Im}\, D^{r-1}$ is called the equivariant cohomology in degree $r$, in the model of Cartan (1950).

Assume that $\Theta$ is compact and oriented, and that the action of $G$ preserves the orientation. If $\alpha \in A$, then we denote by $\alpha(X)_{[k]}$ the volume part of the differential form $\alpha(X)$, and

\[
\left( \int \alpha \right)(X) := \int_\Theta \alpha(X)_{[k]}, \qquad X \in \mathfrak{g}
\]

defines an $\mathrm{Ad}\, G$-invariant function $\int \alpha$ on $\mathfrak{g}$. Now $\alpha = D\beta$ implies that $\alpha(X)_{[k]}$ is equal to the exterior derivative of $\beta(X)_{[k-1]}$, and therefore $\int \alpha = 0$, in view of Stokes’ theorem. It follows that integration over $\Theta$ yields a linear mapping $\int$ from $H_G(\Theta)$ to $(S\mathfrak{g}^*)^{\mathrm{Ad}\, G}$, which is called integration in equivariant cohomology.
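The expansion obtained in the section ‘‘The Method of Stationary Phase’’ can be checked numerically in the simplest case $k = 1$, $Q = q$. The sketch below is only an illustration under stated assumptions: it uses the Gaussian amplitude $b(y) = e^{-\epsilon y^2/2}$, which is rapidly decreasing rather than compactly supported, and the function names, the trapezoid rule, and the parameter values are choices made for this example and not part of the article. For this amplitude the integral is known in closed form, $\sqrt{2\pi/(\epsilon - i\omega q)}$, so the quadrature and the partial sums of the expansion can both be compared with it.

```python
import cmath
import math


def quadratic_phase_integral(omega, q=1.0, eps=1.0, half_width=10.0, n=20_000):
    """Trapezoid approximation of the integral over the real line of
    exp(i*omega*q*y^2/2) * exp(-eps*y^2/2), truncated to [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n + 1):
        y = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp((1j * omega * q - eps) * y * y / 2.0)
    return total * h


def stationary_phase_sum(omega, q=1.0, eps=1.0, terms=2):
    """Partial sum of the stationary phase expansion for k = 1, Q = q (q nonzero)
    and the assumed amplitude b(y) = exp(-eps*y^2/2)."""
    # Leading factor (omega*q / (2*pi*i))^(-1/2), principal branch.
    prefactor = cmath.sqrt(2j * math.pi / (omega * q))
    total = 0j
    for r in range(terms):
        # (2r)-th derivative of b at y = 0 equals (-eps/2)^r * (2r)! / r!.
        b_derivative = (-eps / 2.0) ** r * math.factorial(2 * r) / math.factorial(r)
        # (1/r!) * (2*i*omega)^(-r) * (-(1/q) d^2/dy^2)^r b(0)
        #   = (1/r!) * (i / (2*omega*q))^r * b^(2r)(0).
        total += (1j / (2.0 * omega * q)) ** r * b_derivative / math.factorial(r)
    return prefactor * total


if __name__ == "__main__":
    omega = 50.0
    numeric = quadratic_phase_integral(omega)
    for terms in (1, 2, 3):
        error = abs(stationary_phase_sum(omega, terms=terms) - numeric)
        print(f"{terms} term(s): absolute error {error:.3e}")
```

Each additional term should reduce the error by roughly a factor of order $1/\omega$ while $\omega$ is large compared with $\epsilon$, which is the behavior expected of an asymptotic expansion in powers of $\omega^{-1}$.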

Now assume that also the Lie group $G$ is compact, and let $X \in \mathfrak{g}$. Then the zero-set $Z_X$ of $X_\Theta$ in $\Theta$ has finitely many connected components $F$, each of which is a smooth and compact submanifold of $\Theta$. In general, the $F$’s can have different dimensions. The linearization $L_X$ of the vector field $X_\Theta$ along $F$ acts linearly on the normal bundle $NF$ of $F$. If $R$ is the curvature form of $NF$, then

\[
\varepsilon(X) := \det\nolimits_{\mathbb{C}} \left( \frac{i}{2\pi} (L_X - R) \right)
\]

is called the equivariant Euler form of $NF$. $\varepsilon(X)$ is an invertible element in the algebra $\Omega^{\mathrm{even}}(F)$. The localization formula of Berline–Vergne (1982) and Atiyah–Bott (1984) now says that if $D\alpha = 0$ then

\[
\left( \int \alpha \right)(X) = \sum_F \int_F \left[ \frac{i_F^*\, \alpha(X)}{\varepsilon(X)} \right]_{[\dim F]}
\]

Assume that $\sigma$ is a symplectic form on $\Theta$, which implies that $k = 2l$ is even. Furthermore, assume that the infinitesimal action of $\mathfrak{g}$ on $\Theta$ is Hamiltonian, which means that there exists a $G$-equivariant smooth mapping $\mu : \Theta \to \mathfrak{g}^*$, called the momentum mapping, such that $i_{X_\Theta}\sigma = -d(\mu(X))$ for every $X \in \mathfrak{g}$. Here $\mu$ is viewed as an element of $(\mathfrak{g}^* \otimes \Omega^0(\Theta))^G \subset A$. Then $\hat{\sigma}(X) := \sigma - \mu(X)$ defines an element $\hat{\sigma} \in A$ such that $D\hat{\sigma} = 0$. In turn, this implies that the form

\[
\alpha(X) := e^{i\omega\hat{\sigma}(X)} = e^{-i\omega\mu(X)} \sum_{r=0}^{l} \frac{(i\omega\sigma)^r}{r!}
\]

is equivariantly closed, and the localization formula of equivariant cohomology applied to this case yields the Duistermaat–Heckman (1982, 1983) formula. Because $\alpha(X)_{[k]} = e^{-i\omega\mu(X)} (i\omega\sigma)^l / l!$, its integral over $\Theta$ is an oscillatory integral with phase function $-\mu(X)$. The stationary points of $\mu(X)$ are the zeros of $X_\Theta$ and the stationary points of $\mu(X)$ are nondegenerate if and only if the zeros of $X_\Theta$ are isolated. It follows that in this case the oscillatory integral is equal to the leading term in the stationary-phase approximation.

The Principal Symbol on the Lagrangian Manifold

Let $u(x, \omega)$ be a global oscillatory integral of order $m$ defined by $\Lambda$, and let $(x_0, \xi_0) \in \Lambda$. One way to define the principal symbol of $u$ at $(x_0, \xi_0) \in \Lambda$ is to test $u$ with an oscillatory function of the form $e^{i\omega\psi(x)} b(x)$, in which $d\psi(x_0) = \xi_0$, the support of $b$ is contained in a small neighborhood of $x_0$, and $b(x_0) = 1$. If $u$ is locally represented by the phase function $\varphi$ and amplitude $a$, and $(x_0, \xi_0) = \iota_\varphi(x_0, \theta_0)$, then the phase function $\varphi(x, \theta) - \psi(x)$ in the oscillatory integral

\[
\langle u, e^{-i\omega\psi} b \rangle = \int_M \int_\Theta e^{i\omega(\varphi(x, \theta) - \psi(x))}\, a(x, \theta)\, b(x)\, d\theta\, dx
\]

has a stationary point at $(x_0, \theta_0)$, which means that $\Lambda$ and $d\psi$ intersect at $(x_0, \xi_0)$. Here the 1-form $d\psi$ on $M$, which is a section of $\pi : T^*M \to M$, is viewed as a submanifold of $T^*M$. Locally the Lagrangian submanifolds of $T^*M$ which are transversal to the fibers of $\pi : T^*M \to M$ are precisely the manifolds of the form $d\psi$. The stationary point of $\varphi - \psi$ is nondegenerate if and only if $L := T_{(x_0, \xi_0)}\Lambda$ and $L_\psi := T_{(x_0, \xi_0)}(d\psi)$ are transversal. In this case, the method of stationary phase can be applied in order to obtain an asymptotic expansion in terms of powers of $\omega$. The coefficient of the leading term of order $\omega^m$ depends only on the Lagrangian plane $L_\psi$, which is transversal to both $L$ and the tangent space of the fiber of $T^*M$, and not on the other data of $\psi$ and $b$. If $\mathcal{L}$ denotes the set of all Lagrangian planes in $T_{(x_0, \xi_0)}(T^*M)$ which are transversal to both $L$ and the fiber, then the complex-valued functions on $\mathcal{L}$ which arise in this way form a one-dimensional complex vector space $\mathbb{L}_{(x_0, \xi_0)}$. The $\mathbb{L}_{(x_0, \xi_0)}$ for $(x_0, \xi_0) \in \Lambda$ form a complex line bundle $\mathbb{L}$ over $\Lambda$ which is canonically isomorphic to the tensor product of the line bundle of half-densities and the Maslov line bundle, a line bundle with structure group $\mathbb{Z}/4\mathbb{Z}$ (see Duistermaat (1974, section 1.2)). In this way, the principal symbol $s$ of $u$ can be viewed as a section of the line bundle $\mathbb{L}$ over $\Lambda$.

Caustics

Let $(x_0, \xi_0)$ be a point in the Lagrangian submanifold $\Lambda$ of $T^*M$. The restriction to $\Lambda$ of the projection $\pi : T^*M \to M$ is a diffeomorphism from an open neighborhood of $(x_0, \xi_0)$ in $\Lambda$ onto an open neighborhood of $x_0$ in $M$, if and only if $\Lambda$ is transversal to the fiber of $T^*M$ at $(x_0, \xi_0)$. If $\Lambda = \Lambda_\varphi$ for a nondegenerate phase function $\varphi$, $(x_0, \theta_0) \in S_\varphi$ and $(x_0, \xi_0) = \iota_\varphi(x_0, \theta_0)$, then this condition is in turn equivalent to the condition that $\theta_0$ is a nondegenerate stationary point of $\theta \mapsto \varphi(x_0, \theta)$. An application of the method of stationary phase shows that in this case the oscillatory integral is equal to a progressing wave of the form $e^{i\omega\psi(x)} b(x, \omega)$. Here $\psi(x) = \varphi(x, \theta(x))$, where $\theta(x)$ is the stationary point of $\theta \mapsto \varphi(x, \theta)$, and $b(x, \omega)$ has an asymptotic expansion as in [2] with $k = 0$.

If $\theta_0$ is a degenerate stationary point of $\theta \mapsto \varphi(x_0, \theta)$ and $a_0(x_0, \theta_0) \neq 0$, then the oscillatory integral is not of order $O(\omega^m)$. That is, it is of larger order than at points where we have a nondegenerate

stationary point. For this reason, the points $(x_0, \xi_0)$ at which $\Lambda$ is not transversal to the fibers of $\pi : T^*M \to M$ are called the caustic points of $\Lambda$. Their projections $x_0 \in M$ form the caustic set in $M$.

In the theory of unfoldings of singularities, the germs of the families of functions $x \mapsto (\theta \mapsto \varphi(x, \theta))$ and $y \mapsto (\eta \mapsto \psi(y, \eta))$ are called equivalent if there exists a germ of a diffeomorphism of the form $H : (x, \theta) \mapsto (y(x), \eta(x, \theta))$ and a smooth function $\chi(x)$ such that $\psi(y(x), \eta(x, \theta)) = \varphi(x, \theta) + \chi(x)$. If $J(y, \omega)$ is an oscillatory integral with phase function $\psi$, integration variable $\eta$ and parameter $y$, then the substitution of variables $\eta = \eta(x, \theta)$ in the integral, followed by the substitution of variables $y = y(x)$ in the parameters, yields that $J(y, \omega) = e^{i\omega\chi(x)} I(x, \omega)$, in which $I(x, \omega)$ is an oscillatory integral with phase function $\varphi$ and an amplitude function of the same order as the amplitude function of $J$. The germ $\varphi$ is called stable if every nearby germ $\psi$ is equivalent to $\varphi$. The Morse lemma with parameters implies that this is the case if $\theta \mapsto \varphi(x_0, \theta)$ has a nondegenerate stationary point at $\theta_0$. However, the theory of unfoldings of singularities of Thom and Mather shows that there are many stable germs with degenerate critical points. Moreover, in dimension $n \leq 5$ the generic germ is stable, and is equivalent to a germ in a finite list of normal forms.

The simplest example of a normal form with degenerate critical points is $\varphi(x, \theta) = \theta^3 + x_1\theta$. Here we have taken $k = 1$, but still allowed an arbitrary dimension $n \geq 1$ of $M$. In this normal form, the stationary points correspond to $3\theta^2 + x_1 = 0$, which is a manifold which over the $x$-space folds over at $x_1 = 0$. The stationary point is degenerate if and only if $6\theta = 0$, hence $x_1 = 0$, which means that $x_1 = 0$ is the caustic set. If the amplitude is equal to 1, then the oscillatory integral is equal, up to constant factors in the prefactor and in the argument, to $\omega^{-1/3} \mathrm{Ai}(\omega^{2/3} x_1)$, in which $\mathrm{Ai}(z)$ denotes the Airy function. If the amplitude is nonzero at a degenerate critical point, then the oscillatory integral near the corresponding caustic point is asymptotically of the same order as $\omega^{-1/3} \mathrm{Ai}(\omega^{2/3} x_1)$, which implies that the oscillatory integral is a factor $\omega^{1/6}$ larger at these caustic points than at the points away from the caustic set. In Airy (1838), where the Airy function was introduced, Airy considered light in a neighborhood of a caustic as an oscillatory integral. Then, under suitable genericity conditions, he brought the phase function into the normal form $\theta^3 + x_1\theta$. Even for stable normal forms in low dimensions, the interference patterns near the caustic points can be very intricate (see, e.g., Berry et al. (1979)). A survey of the application of the theory of unfoldings to caustics in oscillatory integrals can be found in Duistermaat (1974).

See also: Equivariant Cohomology and the Cartan Model; Feynman Path Integrals; Functional Integration in Quantum Physics; Hamiltonian Group Actions; h-Pseudodifferential Operators and Applications; Multiscale Approaches; Normal Forms and Semiclassical Approximation; Optical Caustics; Path Integrals in Noncommutative Geometry; Perturbation Theory and its Techniques; Schrödinger Operators; Singularity and Bifurcation Theory; Wave Equations and Diffraction.

Further Reading

Airy GB (1838) On the intensity of light in a neighborhood of a caustic. Transactions of the Cambridge Philosophical Society 6: 379–403.
Atiyah MF and Bott R (1984) The moment map and equivariant cohomology. Topology 23: 1–28.
Berline N and Vergne M (1982) Classes caractéristiques équivariantes. Formules de localisation en cohomologie équivariante. Comptes Rendus Hebdomadaires des Seances de l’Academie des Sciences, Paris 295: 539–541.
Berry MV, Nye JF, and Wright FJ (1979) The elliptic umbilic diffraction catastrophe. Philosophical Transactions of the Royal Society of London A291: 453–484.
Cartan H (1950) Notion d’algèbre différentielle; applications ou opère un groupe de Lie, and: La transgression dans un groupe de Lie et dans un fibré principal. In: Colloque de Topologie, pp. 15–27, 57–71. Bruxelles: C.B.R.M.
Duistermaat JJ (1974) Oscillatory integrals, Lagrange immersions and unfoldings of singularities. Communications on Pure and Applied Mathematics 27: 207–281.
Duistermaat JJ and Heckman GJ (1982) On the variation in the cohomology of the symplectic form of the reduced phase space. Inventiones Mathematicae 69: 259–268.
Duistermaat JJ and Heckman GJ (1983) On the variation in the cohomology of the symplectic form of the reduced phase space. Inventiones Mathematicae 72: 153–158.
Hörmander L (1971) Fourier integral operators I. Acta Mathematica 127: 79–183.
Hörmander L (1983, 1990) The Analysis of Linear Partial Differential Operators I. Berlin: Springer.
Hörmander L (1985) The Analysis of Linear Partial Differential Operators IV. Berlin: Springer.
Huygens C (1690) Traité de la Lumière, Leyden: Van der Aa. (English translation: Treatise on Light, 1690, reprint by Dover Publications, New York, 1962.)
Maslov VP (1965) Perturbation Theory and Asymptotic Methods (In Russian), Moscow: Moskov. Gos. University. (French translation by Lascoux J and Sénéor R (1972) Paris: Dunod.)
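The fold normal form $\theta^3 + x_1\theta$ discussed in the section ‘‘Caustics’’ can be explored numerically. The following sketch is an illustration under stated assumptions, not part of the article: it replaces the compactly supported amplitude by the Gaussian $a(\theta) = e^{-\theta^2/2}$ so that a plain trapezoid rule suffices, and the function names, step counts, and the value of $\omega$ are choices made for this example. It compares the integral at the caustic $x_1 = 0$ with the leading term $2\pi (3\omega)^{-1/3} \mathrm{Ai}(0)\, a(0)$, using the known value $\mathrm{Ai}(0) = 3^{-2/3}/\Gamma(2/3)$, and evaluates the integral at $x_1 = 1$, where the phase has no real stationary points.

```python
import cmath
import math


def fold_integral(omega, x1, half_width=8.0, n=200_000):
    """Trapezoid approximation of the integral over theta of
    exp(i*omega*(theta^3 + x1*theta)) * exp(-theta^2/2).
    The Gaussian amplitude is an assumption of this sketch."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n + 1):
        theta = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0
        phase = omega * (theta ** 3 + x1 * theta)
        total += weight * cmath.exp(1j * phase - theta * theta / 2.0)
    return total * h


# Ai(0) = 3^(-2/3) / Gamma(2/3)
AIRY_AT_ZERO = 3.0 ** (-2.0 / 3.0) / math.gamma(2.0 / 3.0)


def caustic_leading_term(omega):
    """Leading term 2*pi*(3*omega)^(-1/3)*Ai(0)*a(0) at x1 = 0, with a(0) = 1."""
    return 2.0 * math.pi * (3.0 * omega) ** (-1.0 / 3.0) * AIRY_AT_ZERO


if __name__ == "__main__":
    omega = 200.0
    print("at the caustic x1 = 0 :", abs(fold_integral(omega, 0.0)))
    print("leading Airy term     :", caustic_leading_term(omega))
    print("no stationary point   :", abs(fold_integral(omega, 1.0)))
```

At $x_1 = 0$ the computed value should follow the $\omega^{-1/3}$ law of the Airy normal form, while at $x_1 = 1$ the integral is negligibly small, in line with the principle of stationary phase.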

Statistical Mechanics and Combinatorial Problems


R Zecchina, International Centre for Theoretical Physics (ICTP), Trieste, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Equilibrium statistical mechanics and combinatorial optimization, which is viewed here as a branch of discrete mathematics and theoretical computer science, have common roots. Phase transitions are mathematical phenomena which are not limited to physical systems but are typical of many combinatorial problems, one famous example being the percolation transition in random graphs. Similarly, the understanding of relevant physical problems, such as three-dimensional lattice statistics or two-dimensional quantum statistical mechanics problems, is strictly related to the question of purely combinatorial origin of solving counting problems over nonplanar lattices. Most of the tools and concepts which have made it possible to solve problems in one field have a natural counterpart in the other. While the possibility of solving exactly physical models is always connected to the presence of some algebraic properties which guarantee integrability, in the combinatorial approach the emphasis is more on algorithms that can be applied to problem instances in which the symmetries behind integrability might be absent. Also at the level of out-of-equilibrium phenomena, there exists a deep connection between physics and combinatorics: just like physical processes, local algorithms have to deal with an exponentially large set of possible configurations and their out-of-equilibrium analysis constitutes a theory of how problems are actually solved.

Computational complexity theory deals with classifying problems in terms of the computational resources, typically time, required for their solution. What can be measured (or computed) is the time that a particular algorithm uses to solve the problem. This time in turn depends on the implementation of the algorithm as well as on the computer the program is running on. The theory of

grouped in a larger class called NP, where NP stands for ‘‘nondeterministic polynomial time.’’ These problems are such that a potential solution can be checked rapidly in polynomial time, while finding a solution may require exponential time in the worst case. In turn, the hardest problems in NP belong to a subclass called NP-complete: an efficient algorithm for solving one NP-complete problem could be easily modified to effectively solve any problem in NP. By now, a huge number of NP-complete problems has been identified, and the lack of such an algorithm corroborates the widespread conjecture P ≠ NP, that is, that no such algorithm exists. However, NP-complete problems are not always hard: when their resolution complexity is measured with respect to some underlying probability distribution of problem instances, NP-complete problems are often easy to solve on average. To deepen the understanding of the average-case complexity (and of the huge variability of running times observed in numerical experiments), computer scientists, mathematicians, and physicists have focused their attention on the study of random instances of hard combinatorial problems, seeking a link between the onset of exponential-time complexity and some intrinsic (i.e., algorithm independent) properties of the randomized NP-complete problems. These types of questions have merged combinatorial optimization with statistical physics of disordered systems.

Computational complexity theory can also be formulated for counting problems: similarly to optimization problems, equivalence classes can be defined which separate polynomially solvable counting problems from the hard ones (the so-called #P and #P-complete problems). Complexity theory for counting problems makes the connections with statistical mechanics even more direct in that counting solutions is nothing but a computation of a partition function.

Two simple theorems by Jerrum and Sinclair (1989) (which can be easily extended to many combinatorial problems) can help in clarifying these connections.

The first theorem tells us that any randomized
computational complexity provides us with a notion algorithm (e.g., Monte Carlo) for approximating the
of complexity that is largely independent of imple- partition function of a generic spin glass model – the
mentational details and the computer at hand. This so-called spin glass problem – could be used to solve
is not surprising, since it is related to a highly all the other NP combinatorial problems. The
nontrivial question, that is: what do we mean by second theorem tells us that an algorithm for
saying that a combinatorial problem is solvable? evaluating exactly the partition function of the
Problems which can be solved in polynomial time ferromagnetic Ising model over a general graph
are considered to be tractable and compose the so- would again solve any other problem in the class #P,
called polynomial (P) class. The harder problems are which, as mentioned above, is the generalization of
Statistical Mechanics and Combinatorial Problems 51

the NP class to counting problems and obviously contains the class NP as a particular case.

Let us consider the following slightly simplified definition of the Ising and the spin glass problems.

Problem instance A symmetric matrix $J_{ij}$ with entries in $\{-1, 0, 1\}$ and an inverse temperature $\beta$.

Output The partition function $Z = \sum_{\{\sigma_i\}} 2^{-\beta H(\boldsymbol{\sigma})}$, where $H(\boldsymbol{\sigma}) = -\sum_{i,j} J_{ij}\,\sigma_i \sigma_j$ with $\sigma_i = \pm 1$.

Moreover, let us define the fully polynomial randomized approximation scheme (FPRAS) for counting and decision problems. An FPRAS for a function $f$ from problem instances to real numbers is a probabilistic algorithm that, in time polynomial in the problem size $n$ and in $1/\epsilon$, where $\epsilon \in (0, 1]$ is the relative error, outputs with high probability a number which approximates $f(n)$ within a ratio $1 + \epsilon$. Given the above definitions, the theorems can be stated as follows:

Theorem 1 There can be no FPRAS for the spin glass problem unless P = NP, that is, unless all problems in NP turn out to be solvable in polynomial time.

Theorem 2 The Ising problem is #P-complete even when the matrix $J_{ij}$ is non-negative, that is, an algorithm which outputs in polynomial time the exact Ising partition function for an arbitrary graph could be used to solve any other counting problem in #P.

The above theorems hold for arbitrary graphs, in particular for those graph or lattice realizations which are particularly hard to analyze, the so-called worst cases. There exist no similar proofs of computational hardness for more restricted and realistic structures, such as, for instance, three-dimensional regular lattices for the Ising problem or finite connectivity random graphs for spin glasses.

As a final introductory remark, it is worth mentioning that the connection between worst-case and average-case complexity is a building block of modern cryptography and communication theory. On the one hand, the so-called RSA cryptosystem is based on factoring large integers, a problem which is believed to be hard on average while it is not known to be so in the worst case. On the other hand, alternative cryptographic systems have been proposed which rely on a worst-case/average-case equivalence (see, e.g., the theorem of Ajtai (1996) concerning some hidden vector problems in high-dimensional lattices).

As far as communication theory is concerned, average-case complexity is indeed crucial: while Shannon's theorem (1948) provides a very general result stating that many optimal codes do exist (in fact, random codes are optimal), the decoding problem is in general NP-complete and therefore potentially intractable. However, since the choice of the coding scheme is part of the design, what matters is the average-case behavior of the decoding algorithm (and its large deviations), and very efficient codes which can solve on average the decoding problem close to Shannon's bounds are known.

In what follows, we will limit the discussion to two basic examples of combinatorial and counting problems which are representative and central to both computer science and statistical physics.

Constraint Satisfaction Problems

Combinatorial problems are usually written as constraint satisfaction problems (CSPs): $n$ discrete variables are given which have to satisfy $m$ constraints, all at the same time. Each constraint can take different forms depending on the problem under study: famous examples are the K-satisfiability (K-SAT) problem, in which constraints are an “OR” function of K variables in the ensemble (or their negations), and the graph Q-coloring problem, in which constraints simply enforce the condition that the endpoints of the edges in the graph must not have the same color (among the Q possible ones). Quite generally, a CSP can be written as the problem of finding a zero-energy ground state of an appropriate energy function, and its analysis amounts to performing a zero-temperature statistical physics study. Hard combinatorial problems are those which correspond to frustrated physical model systems.

Given an instance of a CSP, one wants to know whether there exists a solution, that is, an assignment of the variables which satisfies all the constraints (e.g., a proper coloring). When it exists, the instance is called SAT, and one wants to find a solution. Most of the interesting CSPs are NP-complete: in the worst case, the number of operations needed to decide whether an instance is SAT or not is expected to grow exponentially with the number of variables. But recent years have seen an upsurge of interest in the theory of typical-case complexity, where one tries to identify random ensembles of CSPs which are hard to solve, and the reason for this difficulty. As already mentioned, random ensembles of CSPs are also of great theoretical and practical importance in communication theory, since some of the best modern error-correcting codes (the so-called low-density parity check codes) are based on such constructions.
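As a rough illustration of the energy-function view of a CSP described in this section, the following Python sketch counts the clauses violated by an assignment and searches exhaustively for a zero-energy ground state. It is only a sketch under stated assumptions: the signed-integer clause encoding (in the style of the DIMACS convention) and the names `energy`, `ground_state`, and `is_sat` are hypothetical choices made here, not part of the article, and the brute-force search over all $2^n$ assignments is usable only for very small $n$.

```python
from itertools import product

# Assumed encoding (not from the article): a clause is a tuple of nonzero
# integers, where +i stands for the variable x_i and -i for its negation.

def energy(assignment, clauses):
    """Return the number of clauses violated by `assignment` (dict: index -> bool)."""
    violated = 0
    for clause in clauses:
        # A clause is an OR of literals: it is satisfied if any literal is true.
        satisfied = any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        if not satisfied:
            violated += 1
    return violated

def ground_state(n, clauses):
    """Exhaustively scan all 2**n assignments; return (minimum energy, one minimizer)."""
    best_energy, best_assignment = None, None
    for values in product([False, True], repeat=n):
        assignment = {i + 1: v for i, v in enumerate(values)}
        e = energy(assignment, clauses)
        if best_energy is None or e < best_energy:
            best_energy, best_assignment = e, assignment
            if e == 0:
                break  # a zero-energy ground state is a solution
    return best_energy, best_assignment

def is_sat(n, clauses):
    """An instance is SAT exactly when its ground-state energy is zero."""
    return ground_state(n, clauses)[0] == 0
```

For example, `is_sat(3, [(1, 2, 3), (-1, 2, 3), (1, -2, 3)])` checks a small 3-SAT formula, while the formula containing all eight sign patterns over three variables has a strictly positive ground-state energy, since every assignment violates exactly one of its clauses.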
Satisfiability and Spin Glass Models

The archetypical example of a CSP is satisfiability (SAT). This is a core problem in computational complexity: it is the first one to have been shown NP-complete, and since then thousands of problems have been shown to be computationally equivalent to it. Yet it is not so easy to find difficult instances. The main ensemble which has been used for this goal is the random K-SAT ensemble (for K > 2, K-SAT is NP-complete).

The SAT problem is defined as follows. Given a vector of $\{0, 1\}$ Boolean variables $x = \{x_i\}_{i \in I}$, where $I = \{1, \ldots, n\}$, consider a SAT formula defined by

\[ F(x) = \bigwedge_{a \in A} C_a(x) \]

where $A$ is an arbitrary finite set (disjoint from $I$) labeling the clauses $C_a$; $C_a(x) = \bigvee_{i \in I(a)} J_{a,i}(x_i)$; any literal $J_{a,i}(x_i)$ is either $x_i$ or $\bar{x}_i$ (“not” $x_i$); and finally, $I(a) \subset I$ for every $a \in A$. Similarly to $I(a)$, we can define the set $A(i) \subset A$ as $A(i) = \{a : i \in I(a)\}$, that is, the set of clauses containing variable $x_i$ or its negation.

Given a formula $F$, the problem of finding a variable assignment $s$ such that $F(s) = 1$, if it exists, can also be written as a spin glass problem as follows: if we consider a set of $n$ Ising spins, $\sigma_i \in \{\pm 1\}$, in place of the Boolean variables ($\sigma_i = -1, 1 \leftrightarrow x_i = 0, 1$), we may write the energy function associated to each clause as follows:

\[ E_a = \prod_{r=1}^{K} \frac{1 + J_{a,i_r}\,\sigma_{i_r}}{2} \]

where $J_{a,i} = -1$ (resp. $J_{a,i} = 1$) if $x_i$ (resp. $\bar{x}_i$) appears in clause $a$. The total energy of a configuration, $E = \sum_{a=1}^{|A|} E_a$, is nothing but a K-spin spin glass model.

Random K-SAT is a version of SAT in which each clause is taken to involve exactly K distinct variables, randomly chosen and negated with uniform distribution. Its energy function corresponds to a spin glass system over a finite connectivity (diluted) random graph.

In recent years random K-SAT has attracted much interest in computer science and in statistical physics. The interesting limit is the thermodynamic limit $n \to \infty$, $m = |A| \to \infty$ at fixed clause density $\alpha = m/n$.

Its most striking feature is certainly its sharp threshold. It is strongly believed that there exists a phase transition for this problem: numerical and heuristic analytical arguments are in support of the so-called satisfiability threshold conjecture:

Conjecture There exists $\alpha_c(K)$ such that with high probability:

• if $\alpha < \alpha_c(K)$, a random instance is satisfiable;
• if $\alpha > \alpha_c(K)$, a random instance is unsatisfiable.

Although this conjecture remains unproven, the existence of a nonuniform sharp threshold has been established by Friedgut (1999). A lot of effort has been devoted to understanding this phase transition. This is interesting from both the physics and the computer science points of view, because the random instances with $\alpha$ close to $\alpha_c$ are the hardest to solve. There exist rigorous results that give bounds for the threshold $\alpha_c(K)$: using these bounds, it was shown that $\alpha_c(K)$ scales as $2^K \ln 2$ when $K \to \infty$.

On the statistical physics side, the cavity method (which is the generalization, to disordered systems characterized by ergodicity breaking, of the iterative method used to solve exactly physical models on the Bethe lattice) is a powerful tool which is claimed to be able to compute the exact value of the threshold, giving for instance $\alpha_c(3) \simeq 4.2667\ldots$ It is a nonrigorous method, but the self-consistency of its results has been checked by a “stability analysis,” and it has also led to the development of a new family of algorithms, the so-called “survey propagation,” which can solve efficiently very large instances at clause densities which are very close to the threshold (for technical details see Mézard and Zecchina (2002) and Braunstein et al. (2005) and references therein).

Figure 1 A pictorial representation of the clustering transition in random K-SAT. (Horizontal axis: m/n; regions labeled Easy SAT, Hard SAT, and UnSAT; annotation: O(n).)

The main hypothesis on which the cavity analysis of random K-SAT relies is the existence, in a region of clause density $[\alpha_d, \alpha_c]$ close to the threshold, of an intermediate phase called the “hard-SAT” phase; see Figure 1. In this phase the set $S$ of solutions (a subset of the vertices in an $n$-dimensional hypercube) is supposed to split into many disconnected clusters $S = S_1 \cup S_2 \cup \cdots$. If one considers two solutions $X$, $Y$ in the same cluster $S_j$, it is
possible to walk from $X$ to $Y$ (staying in $S$) by flipping at each step a finite number of variables. If, on the other hand, $X$ and $Y$ are in different clusters, in order to walk from $X$ to $Y$ (staying in $S$), at least one step will involve an extensive number (i.e., $\propto n$) of flips. This clustered phase is held responsible for entrapping many local-search algorithms in nonoptimal metastable states. This phenomenon is not exclusive to random K-SAT. It is also predicted to appear in many other hard SAT and optimization problems such as “coloring,” and corresponds to the so-called “one-step replica symmetry breaking” (1RSB) phase in the language of statistical physics. It is also a crucial limiting feature for decoding algorithms in some error-correcting codes.

The only CSP for which the existence of the clustering phase has been established rigorously is the polynomial problem of solving random linear equations in GF(2) (Motwani and Raghavan 2000). For random K-SAT, rigorous probabilistic bounds can be used to prove the existence of the clustering phenomenon, for large enough K, in some region of $\alpha$ included in the interval $[\alpha_d(K), \alpha_c(K)]$ predicted by the statistical physics analysis.

In the analysis of CSPs like K-SAT, two main questions are in order. The first is of algorithmic nature and asks for an algorithm which decides whether, for a given CSP instance, all the constraints can be simultaneously satisfied or not. The second question is more theoretical and deals with large random instances, for which one wants to know the structure of the solution space and predict the typical behavior of classes of algorithms.

Message-Passing Algorithms from Statistical Physics

The algorithmic contributions of statistical mechanics to combinatorial optimization are numerous and important (a representative example being the celebrated “simulated annealing algorithm”). For the sake of brevity, here we limit the discussion to the so-called “message-passing algorithms,” which are also of great interest in coding theory.

The statistical analysis of the cavity equations makes it possible to study the average properties of ensembles of problems, and it is totally equivalent to the replica method, in which the average over the ensemble is the first step in any calculation. The survey propagation (SP) equations are a formulation of the cavity equations which is valid for each specific instance and is able to provide information about the statistical behavior of the individual variables in the stable and metastable states of a given energy density (i.e., a given fraction of violated constraints).

The single-sample SP equations are nicely described in terms of the factor graph representation used in information theory to characterize error-correcting codes. In the factor graph, the $N$ variables $i, j, k, \ldots$ are represented by circular “variable nodes,” whereas the $M$ clauses $a, b, c, \ldots$ are represented by square “function nodes.” For random K-SAT, the function nodes have connectivity $K$, while the variable nodes have a Poisson-distributed connectivity of average $K\alpha$.

The iterative SP equations are examples of message-passing procedures. In message-passing algorithms such as the so-called “belief propagation (BP) algorithm” used in error-correcting codes and statistical inference problems, the unknowns which are self-consistently evaluated by iteration are the marginals over the solution space of the variables characterizing the combinatorial problem (the probability space is the set of all solutions sampled with uniform measure). According to the physical interpretation, the quantities that are evaluated by SP are the probability distributions of local fields over the set of clusters. That is, while BP performs a “white” average over solutions, SP takes care of cluster-to-cluster fluctuations, telling us the probability of picking a cluster at random and finding a given variable biased in a certain direction (or unfrozen if it is paramagnetic in the cluster). SP computes quantities which are probabilities over different pure states: the order parameter which is evaluated as a fixed point of the SP equations is a probability measure in a space of functions or, for finite $n$, the full list of probability densities describing the cluster-to-cluster fluctuations of the variables.

In both SP and BP one assumes knowledge of the marginals of all variables in the temporary absence of one of them and then writes the marginal probability induced on this “cavity” variable in the absence of another, third variable interacting with it (i.e., the so-called Bethe lattice approximation for the problem). These relations define a closed set of equations for such cavity marginals that can be solved iteratively (this is known as the message-passing technique). The equations become exact if the cavity variables acting as inputs are uncorrelated. They are conjectured to be an asymptotically exact approximation over random locally tree-like structures such as, for instance, the random K-SAT factor graph. Both BP and SP can be derived in a variational framework.

Complexity of Counting Problems

In order to describe the nature of computational complexity of counting in physical models, it is enough to consider the classical Ising problem. The
computation of the Ising partition function or, more generally, of the weighted matching polynomial, is the root problem of lattice statistics.

For planar graphs such as, for example, two-dimensional regular lattices, counting problems can often be solved by a variety of different methods, for example, transfer matrices and Pfaffians, which require a number of operations which is polynomial in the number of vertices.

The complexity of the counting problems changes if one considers nonplanar graphs, that is, graphs with a nontrivial topological genus. In discrete mathematics, such problems are classified as #P-complete, meaning that the existence of an exact polynomial algorithm for the evaluation of the generating functions would imply the polynomial solvability of many known counting combinatorial problems, the most famous one being the evaluation of the permanent of 0–1 matrices. In statistical mechanics and mathematical chemistry, the interest in nonplanar lattices is obviously related to their $D > 2$ character: the three-dimensional cubic lattice is nothing but a nonplanar graph of topological genus $g = 1 + N/4$, where $N$ is the number of sites.

The planar two-dimensional Ising model was solved in 1944 by Onsager using the algebraic transfer matrix method. Subsequently, alternative exact solutions have been proposed which resorted to simple combinatorial and geometrical reasoning. As is well known, the underlying idea of the combinatorial methods consists in recasting the sum over spin configurations of the Boltzmann weights as a sum over closed curves (loops) weighted by the activity of their bonds. Double counting is avoided by a proper cancellation mechanism which takes care of the different intrinsic topologies of loops which give rise to the same contribution in the partition function. Such an approach was first developed by Kac and Ward (1952) and provides a direct way of taking the field theoretic continuum limit. In $D > 2$, the generalization of the above method encounters enormous difficulties due to the variety of intrinsic topologies of surfaces immersed in $D > 2$ lattices.

Another combinatorial method, proposed in the 1960s by Kasteleyn, is the so-called Pfaffian method. It consists in writing the weighted sum over loops as a dimer covering or perfect matching generating function. Once the relationship between loop counting and dimer coverings (or perfect matchings) over a suitably decorated and properly oriented lattice is established, the Pfaffian method turns out to be a simple technique for the derivation of exact solutions or for the definition of polynomial algorithms over planar lattices which are applicable also to the two-dimensional Ising spin glass.

The generalization of the Pfaffian construction to the nonplanar case must deal with the ambiguity of orienting the homology cycles of the graph. Such a problem can be formally solved in full generality for any orientable lattice and leads to an expression of the Ising partition function or the dimer coverings generating function given as a sum over all possible inequivalent orientations of the lattice (or its embedding surface): for a graph of genus $g$, the homology basis is composed of $2g$ cycles and, therefore, there are $2^{2g}$ inequivalent orientations. It is only for graphs of logarithmic genus that the generalized Pfaffian formalism provides a polynomial algorithm.

Counting perfect matchings can be thought of as the problem of evaluating the permanent of 0–1 matrices over properly constructed bipartite graphs, which is among the oldest and most famous #P-complete problems.

The Pfaffian formalism, when applied to the permanent problem, leads to a simple general result, that is, it provides a general formula for writing the permanent of a matrix in terms of a number of determinants which is exponential in the genus of the underlying graph.

See also: Combinatorics: Overview; Determinantal Random Fields; Dimer Problems; Phase Transitions in Continuous Systems; Spin Glasses; Two-Dimensional Ising Model.

Further Reading

Achlioptas D, Naor A, and Peres Y (2005) Rigorous location of phase transitions in hard optimization problems. Nature 435: 759–764.
Ajtai M (1996) Generating hard instances of lattice problems. Electronic Colloquium on Computational Complexity (ECCC) 7: 3.
Braunstein A, Mézard M, and Zecchina R (2005) Survey propagation: an algorithm for satisfiability. Random Structures and Algorithms 27: 201–226.
Cocco S and Monasson R (2004) Heuristic average-case analysis of the backtrack resolution of random 3-satisfiability instances. Theoretical Computer Science A 320: 345.
Distler J (1992) A note on the 3D Ising model as a string theory. Nuclear Physics B 388: 648.
Dubois O, Monasson R, Selman B, and Zecchina R (eds.) (2001) NP-hardness and Phase Transitions (Special Issue), Theoretical Computer Science, vol. 265 (1–2). Elsevier.
Friedgut E (1999) Sharp threshold of graph properties, and the KSat problem. Journal of the American Mathematical Society 12: 1017–1054.
Jerrum M and Sinclair A (1989) Approximating the permanent. SIAM Journal on Computing 18: 1149.
Lovász L and Plummer MD (1986) Matching Theory. North-Holland Mathematics Studies 121, Annals of Discrete Mathematics (29). New York: North-Holland.
Mézard M, Mora T, and Zecchina R (2005) Clustering of solutions in the random satisfiability problem. Physical Review Letters 94: 197205.
Mézard M and Parisi G (2001) The Bethe lattice spin glass revisited. European Physical Journal B 20: 217.
Mézard M, Parisi G, and Virasoro MA (1987) Spin Glass Theory and Beyond. Singapore: World Scientific.
Mézard M, Parisi G, and Zecchina R (2002) Analytic and algorithmic solution of random satisfiability problems. Science 297: 812–815.
Mézard M and Zecchina R (2002) Random K-satisfiability: from an analytic solution to a new efficient algorithm. Physical Review E 66: 056126.
Monasson R, Zecchina R, Kirkpatrick S, Selman B, and Troyansky L (1999) Determining computational complexity from characteristic “phase transitions”. Nature 400: 133.
Motwani R and Raghavan P (2000) Randomized Algorithms. Cambridge: Cambridge University Press.
Nishimori H (2001) Statistical Physics of Spin Glasses and Information Processing. Oxford: Oxford University Press.
Papadimitriou CH (1994) Computational Complexity. Addison-Wesley.
Regge T and Zecchina R (2000) Combinatorial and topological approach to the 3D Ising model. Journal of Physics A: Mathematical and General 33: 741.
Richardson T and Urbanke R (2001) An introduction to the analysis of iterative coding systems. In: Marcus B and Rosenthal J (eds.) Codes, Systems, and Graphical Models. New York: Springer.

Statistical Mechanics of Interfaces


S Miracle-Solé, Centre de Physique Théorique, CNRS, Marseille, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

When a fluid is in contact with another fluid, or with a gas, a portion of the total free energy of the system is proportional to the area of the surface of contact, and to a coefficient, the surface tension, which is specific for each pair of substances. Equilibrium will accordingly be obtained when the free energy of the surfaces in contact is a minimum.

Suppose that we have a drop of some fluid, b, over a flat substrate, w, while both are exposed to air, a. We have then three different surfaces of contact, and the total free energy of the system consists of three parts, associated to these three surfaces. A drop of fluid b will exist provided its own two surface tensions exceed the surface tension between the substrate w and the air, that is, provided that

\[ \tau_{wb} + \tau_{ba} > \tau_{wa} \]

If equality is attained, then a film of fluid b is formed, a situation which is known as perfect, or complete wetting (see Figure 1).

Figure 1 Partial and complete wetting. (Labels in the figure: a, air; b, fluid; w, substrate.)

When one of the substances involved is anisotropic, such as a crystal, the contribution to the total free energy of each element of area depends on its orientation. The minimum surface free energy for a given volume then determines the ideal form of the crystal in equilibrium.

It is only in recent times that equilibrium crystals have been produced in the laboratory, first, in negative crystals (vapor bubbles) of organic substances. Most crystals grow under nonequilibrium conditions and it is a subsequent relaxation of the macroscopic crystal that restores the equilibrium.

An interesting phenomenon that can be observed on these crystals is the roughening transition, characterized by the disappearance of the facets of a given orientation when the temperature attains a certain particular value. The best observations have been made on helium crystals, in equilibrium with superfluid helium, since the transport of matter and heat is then extremely fast. Crystals grow to sizes of 1–5 mm and relaxation times vary from milliseconds to minutes. Roughening transitions for three different types of facets have been observed (see, e.g., Wolf et al. (1983)).

These are some classical examples among a variety of interesting phenomena connected with the behavior of the interface between two phases in a physical system. The study of the nature and properties of the interfaces, at least for some simple systems in statistical mechanics, is also an interesting subject of mathematical physics. Some aspects of this study will be discussed in the present article.

We assume that the interatomic forces can be modeled by a lattice gas, and consider, as a simple example, the ferromagnetic Ising model. In a typical two-phase equilibrium state, there is a dense
component, which can be interpreted as a solid or liquid phase, and a dilute phase, which can be interpreted as the vapor phase. Considering certain particular cases of such situations, we first introduce a precise definition of the surface tension and then proceed to the mathematical analysis of some preliminary properties of the corresponding interfaces. The next topic concerns the wetting properties of the system, and the final section is devoted to the associated equilibrium crystal.

Pure Phases and Surface Tension

The Ising model is defined on the cubic lattice $\mathcal{L} = \mathbb{Z}^3$, with configuration space $\Omega = \{1, -1\}^{\mathcal{L}}$. If $\sigma \in \Omega$, the value $\sigma(i) = 1$ or $-1$ is the spin at the site $i = (i_1, i_2, i_3) \in \mathcal{L}$, and corresponds to an empty or an occupied site in the lattice gas version of the model. The system is first considered in a finite box $\Lambda \subset \mathcal{L}$, with fixed values of the spins outside.

In order to simplify the exposition, we shall mainly consider the three-dimensional Ising model, though some of the results to be discussed hold in any dimension $d \geq 2$. We shall also, sometimes, refer to the two-dimensional model, it being understood that the definitions have been adapted in the obvious way. We assume that the box $\Lambda$ is a parallelepiped, centered at the origin of $\mathcal{L}$, of sides $L_1, L_2, L_3$, parallel to the axes.

A configuration of spins on $\Lambda$, $(\sigma(i), i \in \Lambda)$, denoted $\sigma_\Lambda$, has an energy defined by the Hamiltonian

\[ H_\Lambda(\sigma_\Lambda \mid \bar\sigma) = -J \sum_{\langle i,j \rangle \cap \Lambda \neq \emptyset} \sigma(i)\,\sigma(j) \tag{1} \]

where $J$ is a positive constant (ferromagnetic or attractive interaction). The sum runs over all nearest-neighbor pairs $\langle i, j \rangle \subset \mathcal{L}$ such that at least one of the sites belongs to $\Lambda$, and one takes $\sigma(i) = \bar\sigma(i)$ when $i \notin \Lambda$, the configuration $\bar\sigma \in \Omega$ being the given boundary condition. The probability of the configuration $\sigma_\Lambda$, at the inverse temperature $\beta = 1/kT$, is given by the Gibbs measure

\[ \mu_\Lambda(\sigma_\Lambda \mid \bar\sigma) = Z^{\bar\sigma}(\Lambda)^{-1} \exp\bigl(-\beta H_\Lambda(\sigma_\Lambda \mid \bar\sigma)\bigr) \tag{2} \]

where $Z^{\bar\sigma}(\Lambda)$ is the partition function

\[ Z^{\bar\sigma}(\Lambda) = \sum_{\sigma_\Lambda} \exp\bigl(-\beta H_\Lambda(\sigma_\Lambda \mid \bar\sigma)\bigr) \tag{3} \]

Local properties at equilibrium can be described by the correlation functions between the spins on finite sets of sites,

\[ \mu_\Lambda^{\bar\sigma}(\sigma(A)) = \sum_{\sigma_\Lambda} \mu_\Lambda(\sigma_\Lambda \mid \bar\sigma) \prod_{i \in A} \sigma(i) \tag{4} \]

The measures [2] determine (by the Dobrushin–Lanford–Ruelle equations) the set of Gibbs states of the infinite system, as measures on the set $\Omega$ of all configurations. If a Gibbs state happens to be equal to $\lim \mu_\Lambda(\cdot \mid \bar\sigma)$, when $L_1, L_2, L_3 \to \infty$, under a fixed boundary condition $\bar\sigma$, we shall call it the Gibbs state associated to the boundary condition $\bar\sigma$. One also says that this state exists in the thermodynamic limit. Then, equivalently, the correlation functions [4] converge to the corresponding expectation values in this state.

This model presents, at low temperatures (i.e., for $\beta > \beta_c$, where $\beta_c$ is the critical inverse temperature), two different thermodynamic pure phases, a dense and a dilute phase in the lattice gas language (called here the positive and the negative phase). This means two extremal translation-invariant Gibbs states, $\mu^+$ and $\mu^-$, obtained as the Gibbs states associated with the boundary conditions $\bar\sigma$, respectively equal to the ground configurations $\bar\sigma(i) = 1$ and $\bar\sigma(i) = -1$, for all $i \in \mathcal{L}$. The spontaneous magnetization

\[ m^*(\beta) = \mu^+(\sigma(i)) = -\mu^-(\sigma(i)) \tag{5} \]

is then strictly positive. On the other hand, if $\beta \leq \beta_c$, then the Gibbs state is unique and $m^* = 0$.

Each configuration inside $\Lambda$ can be described in a geometric way by specifying the set of Peierls contours which indicate the boundaries between the regions of spin 1 and the regions of spin $-1$. Unit-square faces are placed midway between the pairs of nearest-neighbor sites $i$ and $j$, perpendicularly to these bonds, whenever $\sigma(i)\sigma(j) = -1$. The connected components of this set of faces are the Peierls contours. Under the boundary conditions $(+)$ and $(-)$, the contours form a set of closed surfaces. They describe the defects of the considered configuration with respect to the ground states of the system (the constant configurations 1 and $-1$), and are a basic tool for the investigation of the model at low temperatures.

In order to study the interface between the two pure phases, one needs to construct a state describing the coexistence of these phases. This can be done by means of a new boundary condition. Let $n = (n_1, n_2, n_3)$ be a unit vector in $\mathbb{R}^3$, such that $n_3 > 0$, and introduce the mixed boundary condition $(\pm, n)$, for which

\[ \bar\sigma(i) = \begin{cases} 1 & \text{if } i \cdot n \geq 0 \\ -1 & \text{if } i \cdot n < 0 \end{cases} \tag{6} \]

This boundary condition forces the system to produce a defect going transversally through the box $\Lambda$, a large Peierls contour that can be
interpreted as the microscopic interface (also called a domain wall). The other defects that appear above and below the interface can be described by closed contours inside the pure phases.

The free energy per unit area due to the presence of the interface is the surface tension. It is defined by

$$\tau(n) = \lim_{L_1, L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{n_3}{\beta L_1 L_2} \ln \frac{Z^{\pm,n}(\Lambda)}{Z^{+}(\Lambda)} \qquad [7]$$

In this expression the volume contributions proportional to the free energy of the coexisting phases, as well as the boundary effects, cancel, and only the contributions to the free energy due to the interface are left. The existence of such a quantity indicates that the macroscopic interface, separating the regions occupied by the pure phases in a large volume $\Lambda$, has a microscopic thickness and can therefore be regarded as a surface in a thermodynamic approach.

Theorem 1 The interfacial free energy per unit area, $\tau(n)$, exists, is bounded, and its extension by positive homogeneity, $f(x) = |x|\,\tau(x/|x|)$, is a convex function on $R^3$. Moreover, $\tau(n)$ is strictly positive for $\beta > \beta_c$, and vanishes if $\beta \le \beta_c$.

The existence of $\tau(n)$ and also the last statement were proved by Lebowitz and Pfister (1981), in the particular case $n = (0, 0, 1)$, with the help of correlation inequalities. A complete proof of the theorem was given later with similar arguments. The convexity of $f$ is equivalent to the fact that the surface tension $\tau$ satisfies a thermodynamic stability condition known as the pyramidal inequality (see Messager et al. (1992)).

Gibbs States and Interfaces

In this section we consider the $(\pm, n_0)$ boundary condition, also simply denoted $(\pm)$, associated to the vertical direction $n_0 = (0, 0, 1)$,

$$\bar\sigma(i) = 1 \text{ if } i_3 \ge 0; \qquad \bar\sigma(i) = -1 \text{ if } i_3 < 0 \qquad [8]$$

The corresponding surface tension is $\tau = \tau(n_0)$. We shall first recall some classical results which concern the Gibbs states and interfaces at low temperatures.

According to the geometrical description of the configurations introduced in the last section, we observe that

$$Z^{\pm,n}(\Lambda) / Z^{+}(\Lambda) = \sum_{\lambda} \exp(-2\beta J |\lambda| - U_\Lambda(\lambda)) \qquad [9]$$

where the sum runs over all microscopic interfaces $\lambda$ compatible with the boundary condition and $|\lambda|$ is the number of faces of $\lambda$ (inside $\Lambda$). The term $U_\Lambda(\lambda)$ equals $-\ln Z^{+}(\Lambda, \lambda)/Z^{+}(\Lambda)$, the sum in the partition function $Z^{+}(\Lambda, \lambda)$ being extended to all configurations whose associated contours do not intersect $\lambda$. Each term in sum [9] gives a weight proportional to the probability of the corresponding microscopic interface.

At low (positive) temperatures, we expect the microscopic interface corresponding to this boundary condition, which at zero temperature coincides with the plane $i_3 = -1/2$, to be modified by small deformations. Each microscopic interface $\lambda$ can then be described by its defects, with respect to the interface at $\beta = \infty$. To this end, one introduces some objects, called walls, which form the boundaries between the horizontal plane portions of the microscopic interface, also called the ceilings of the interface.

More precisely, one says that a face of $\lambda$ is a ceiling face if it is horizontal and such that the vertical line passing through its center does not have other intersections with $\lambda$. Otherwise, one says that it is a wall face. The set of wall faces splits into maximal connected components. The set of walls, associated to $\lambda$, is the set of these components, each component being identified by its geometric form and its projection on the plane $i_3 = -1/2$. Every wall $\omega$, with projection $\pi(\omega)$, increases the energy of the interface by a quantity $2J\|\omega\|$, where $\|\omega\| = |\omega| - |\pi(\omega)|$, and two walls are compatible if their projections do not intersect. In this way, the microscopic interfaces may be interpreted as a "gas of walls" on the two-dimensional lattice.

Dobrushin, who developed the above analysis, also proved the dilute character of this "gas" at low temperatures. This implies that the microscopic interface is essentially flat, or rigid. One can understand this fact by noticing first that the probability of a wall is less than $\exp(-2\beta J\|\omega\|)$ and, second, that in order to create a ceiling in $\lambda$, which is not in the plane $i_3 = -1/2$, one needs to surround it by a wall, that one has to grow when the ceiling is made over a larger area.

Using correlation inequalities one proves that the Gibbs state $\mu^{\pm}$, associated to the $(\pm)$ boundary conditions, always exists, and that it is invariant under horizontal translations of the lattice, that is, $\mu^{\pm}(\sigma(A + a)) = \mu^{\pm}(\sigma(A))$ for all $a = (a_1, a_2, 0)$. It is also an extremal Gibbs state. Let $m(z)$ be the magnetization $\mu^{\pm}(\sigma(z))$ at the site $z = (0, 0, z)$. The function $m(z)$ is monotone increasing and satisfies the symmetry property $m(z) = -m(-(z + 1))$. Some consequences of Dobrushin's work are the following properties.

Theorem 2 If the temperature is low enough, that is, if $\beta J \ge c_1$, where $c_1$ is a given constant, then

$$m(0) \text{ is strictly positive} \qquad [10]$$

$$m(z) \to m^{*}, \text{ when } z \to \infty, \text{ exponentially fast} \qquad [11]$$

Equation [10] is just another way of saying that the interface is rigid and that the state $\mu^{\pm}$ is non-translation invariant (in the vertical direction). Then, the correlation functions $\mu^{\pm}(\sigma(A))$ describe the local properties, or local structure, of the macroscopic interface. In particular, the function $m(z)$ represents the magnetization profile. Then statement [11], together with the symmetry property, tells us that the thickness of this interface is finite, with respect to the unit lattice spacing.

The statistics of interfaces has been rewritten in terms of a gas of walls and this system may further be studied by cluster expansion techniques. There is an interaction between the walls, coming from the term $U_\Lambda(\lambda)$ in eqn [9], but a convenient mathematical description of this interaction can be obtained by applying the standard low-temperature cluster expansion, in terms of contours, to the regions above and below the interface.

This method was introduced by Gallavotti in his study (mentioned below) of the two-dimensional Ising model. It has been applied by Bricmont and co-workers to examine the interface structure in the present case. As a consequence, it follows that the surface tension, more exactly $\beta\tau(\beta)$, and also the correlation functions, are analytic functions at low temperatures. They can be obtained as explicit convergent series in the variable $\zeta = e^{-2\beta J}$.

The same analysis applied to the two-dimensional model shows a very different behavior at low temperatures. In this case, the microscopic interface $\lambda$ is a polygonal line and the walls belong to the one-dimensional lattice. One can then increase the size of a ceiling without modifying the walls attached to it. Indeed, Gallavotti turned this observation into a proof that the Gibbs state $\mu^{\pm}$ is now translation invariant. The line $\lambda$ undergoes large fluctuations of order $\sqrt{L_1}$, and disappears from any finite region of the lattice, in the thermodynamic limit. In particular, we have then $\mu^{\pm} = (1/2)(\mu^{+} + \mu^{-})$, a result that extends to all boundary conditions $(\pm, n)$.

Using these results Bricmont and co-workers also studied the local structure of the interface at low temperatures and showed that its intrinsic thickness is finite. To study the global fluctuations, one can compute the magnetization profile by introducing, before taking the thermodynamic limit, a change of scale: $\mu^{\pm}_\Lambda(\sigma(z L_1^{\delta}))$, with $\delta = 1/2$ or near to this value. This is an exact computation that has been done by Abraham and Reed.

Let us come back to the three-dimensional Ising model where we know that the interface orthogonal to a lattice axis is rigid at low temperatures.

Question 1 At higher temperatures, but before reaching the critical temperature, do the fluctuations of this interface become unbounded, in the thermodynamic limit, so that the corresponding Gibbs state is translation invariant?

One says then that the interface is rough, and it is believed that, effectively, the interface becomes rough when the temperature is raised, undergoing a roughening transition at an inverse temperature $\beta_R > \beta_c$.

It is known that $\beta_R \le \beta_c^{d=2}$, the critical inverse temperature of the two-dimensional Ising model, since van Beijeren proved, using correlation inequalities, that above this value, the state $\mu^{\pm}$ is not translation invariant. Recalling that the rigid interface may be viewed as a two-dimensional system, the system of walls, a representation that would become inappropriate for a rough interface, one might think that the phase transition of the two-dimensional Ising model is relevant for the roughening transition, and that $\beta_R$ is somewhere near $\beta_c^{d=2}$. Indeed, approximate methods, used by Weeks and co-workers, give some evidence for the existence of such a $\beta_R$ and suggest a value slightly smaller than $\beta_c^{d=2}$, as shown in Table 1. To this day, however, there appears to be no proof of the fact that $\beta_R > \beta_c$, that is, that the roughening transition for the three-dimensional Ising model really occurs.

Table 1 Some temperature values

d = 3    $\beta_c J \approx 0.22$    approximate critical temperature
d = 3    $\beta_R J \approx 0.41$    conjectured roughening temperature
d = 2    $\beta_c J = 0.44$    exact critical temperature

At present one is able to study the roughening transition rigorously only for some simplified models with a restricted set of admissible microscopic interfaces. Moreover, the closed contours, describing the defects above and below $\lambda$, are neglected, so that these two regions have the constant configurations $1$ or $-1$, and one has $U_\Lambda(\lambda) = 0$ in eqn [9].

The best known of these models is the classic SOS (solid-on-solid) model in which the interfaces $\lambda$ have the property of being cut only once by all vertical lines of the lattice. This means that $\lambda$ is the graph of a function that can equivalently be used to define the possible configurations of $\lambda$. If $\lambda$ contains the horizontal face with center $(i_1, i_2, i_3 - 1/2)$, then the value at $(i_1, i_2)$ of the associated function is $\phi(i_1, i_2) = i_3$.
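The height-function description of an SOS interface can be made concrete with a small sketch. It counts the faces $|\lambda|$ of the interface defined by an integer height array and returns the energy $2J|\lambda|$ that enters eqn [9] when $U_\Lambda(\lambda) = 0$. The function names, the rectangular patch, and the choice of pinning the height to 0 outside the patch are assumptions made for this illustration only.

```python
def sos_face_count(heights):
    """Count the unit faces of the SOS interface given by heights[x][y].

    Assumption of this sketch: outside the rectangular patch the height
    is pinned to 0, which plays the role of a flat boundary condition.
    """
    nx, ny = len(heights), len(heights[0])

    def phi(x, y):
        # Height inside the patch, 0 outside (assumed boundary condition).
        return heights[x][y] if 0 <= x < nx and 0 <= y < ny else 0

    horizontal = nx * ny  # exactly one horizontal face above each column
    vertical = 0
    # Vertical faces between neighbouring columns in the first direction.
    for x in range(-1, nx):
        for y in range(ny):
            vertical += abs(phi(x + 1, y) - phi(x, y))
    # Vertical faces between neighbouring columns in the second direction.
    for x in range(nx):
        for y in range(-1, ny):
            vertical += abs(phi(x, y + 1) - phi(x, y))
    return horizontal + vertical


def sos_energy(heights, J=1.0):
    """Energy 2 J |lambda| of the interface."""
    return 2.0 * J * sos_face_count(heights)


flat = [[0] * 3 for _ in range(3)]
bump = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
excess_energy = sos_energy(bump) - sos_energy(flat)  # cost of the unit bump
```

In this sketch a unit bump on a flat interface adds four vertical faces, so its excess energy is $8J$; this is the kind of local defect whose Boltzmann weight is controlled by $e^{-2\beta J}$ per extra face.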
The proof that the SOS model with the boundary condition $(\pm)$ has a roughening transition is a highly nontrivial result due to Fröhlich and Spencer. When $\beta$ is small enough, the fluctuations of $\lambda$ are of order $\sqrt{\ln L}$ (in a cubic box of side $L$).

Moreover, other interface models, with additional conditions on the allowed microscopic interfaces, are exactly solvable. The BCSOS (body-centered SOS) model, introduced by van Beijeren, belongs to this class. It is, in fact, the first model for which the existence of a roughening transition has been proved. More recently, also the TISOS (triangular Ising SOS) model, introduced by Blöte and Hilhorst and further studied by Nienhuis and co-workers, has been considered in this context.

The interested reader can find more information and references, concerning the subject of this section, in the review article by Abraham (1986).

Wetting Phenomena

Next we consider the Ising model over a plane horizontal substrate (also called a wall) and study the difference of surface tensions which governs the wetting properties of this system.

We first describe the approach developed by Fröhlich and Pfister (1987) and briefly report some results of their study. We consider the model on the semi-infinite lattice

$$\mathcal{L}' = \{ i \in Z^3 : i_3 \ge 0 \} \qquad [12]$$

A magnetic field, $K \ge 0$, is added on the boundary sites, $i_3 = 0$, which describes the interaction with the substrate, supposed to occupy the complementary region $\mathcal{L} \setminus \mathcal{L}'$.

We constrain the model in the finite box $\Lambda' = \Lambda \cap \mathcal{L}'$, with $\Lambda$ as above, and impose the value of the spins outside. The Hamiltonian becomes

$$H^{w}_{\Lambda'}(\sigma_{\Lambda'} \mid \bar\sigma) = -J \sum_{\langle i,j\rangle \cap \Lambda' \neq \emptyset} \sigma(i)\sigma(j) - K \sum_{i \in \Lambda',\, i_3 = 0} \sigma(i) \qquad [13]$$

Here $\sigma_{\Lambda'}$ represents the configuration inside $\Lambda'$, the pairs $\langle i,j\rangle$ are contained in $\mathcal{L}'$, and $\sigma(i) = \bar\sigma(i)$ when $i \notin \Lambda'$, the configuration $\bar\sigma$ being the given boundary condition (see Figure 2). The corresponding partition function is denoted by $Z^{w\bar\sigma}(\Lambda')$.

Figure 2 Boundary conditions for the cubic lattice. Above, the box $\Lambda$ with the $(\pm)$ and (step) boundary conditions. Below, the box $\Lambda'$ and the wall $W$ with the $(w-)$ boundary conditions.

Since there are two pure phases in the model, we must consider two surface free energies, or surface tensions, $\tau^{w+}$ and $\tau^{w-}$, between the wall and the positive or negative phase present in the bulk. They are defined through the choice of the boundary condition, $\bar\sigma(i) = 1$ or $\bar\sigma(i) = -1$, for all $i \in \mathcal{L}'$. Let us consider first the case of the $(-)$ boundary condition. The surface free energy contribution per unit area due to the presence of the wall, when we have the negative phase in the bulk, is

$$\tau^{w-}(\beta, K) = \lim_{L_1, L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{1}{\beta L_1 L_2} \ln \frac{Z^{w-}(\Lambda')}{Z^{-}(\Lambda)^{1/2}} \qquad [14]$$

The division by $Z^{-}(\Lambda)^{1/2}$ allows us to subtract from the total free energy, $\ln Z^{w-}(\Lambda')$, the bulk term and all boundary terms which are not related to the presence of the wall. The existence of limit [14] follows from correlation inequalities, and we have $\tau^{w-} \ge 0$.

One can prove, as well, the existence of the Gibbs state $\mu^{w-}$ of the semi-infinite system, associated to the $(-)$ boundary condition. This state is the limit of the finite volume Gibbs measures $\mu_{\Lambda'}(\sigma_{\Lambda'} \mid (-))$ defined by the Hamiltonian [13]. It describes the local equilibrium properties of the system near the wall, when deep inside the bulk the system is in the negative phase. Similar definitions give the surface tension $\tau^{w+}$ and the Gibbs state $\mu^{w+}$, corresponding to the boundary condition $\bar\sigma(i) = 1$, for all $i \in \mathcal{L}'$.

We remark that the states $\mu^{w+}$ and $\mu^{w-}$ are invariant by translations parallel to the plane $i_3 = 0$, and introduce the magnetizations, $m^{w-}(z) = \mu^{w-}(\sigma(z))$, where $z$ denotes the site $(0, 0, z)$, $m^{w-} = m^{w-}(0)$, and similarly $m^{w+}(z)$ and $m^{w+}$. Their connection with the surface free energies is given by the formula

$$\tau^{w-}(\beta, K) - \tau^{w+}(\beta, K) = \int_0^K \left( m^{w+}(\beta, s) - m^{w-}(\beta, s) \right) ds \qquad [15]$$

We mention in the following theorem some results of Fröhlich and Pfister's study. Here $\tau$ is, as before, the usual surface tension between the two pure phases of the system, for a horizontal interface.

Theorem 3 With the above definitions, we have

$$\tau^{w-}(\beta, K) - \tau^{w+}(\beta, K) \le \tau(\beta) \qquad [16]$$

$$m^{w+}(\beta, K) - m^{w-}(\beta, K) \ge 0 \qquad [17]$$

and the difference in [17] is a monotone decreasing function of the parameter $K$. Moreover, if $m^{w+} = m^{w-}$, then the Gibbs states $\mu^{w+}$ and $\mu^{w-}$ coincide.

The proof is a subtle application of correlation inequalities. Since, from Theorem 3, the integrand in eqn [15] is a positive and decreasing function, the difference $\Delta\tau = \tau^{w-} - \tau^{w+}$ is a monotone increasing and concave (and hence continuous) function of the parameter $K$. On the other hand, one can prove that $\Delta\tau = \tau$, if $K \ge J$. This justifies the following definition:

$$K_w(\beta) = \min \{ K : \Delta\tau(\beta, K) = \tau(\beta) \} \qquad [18]$$

In the thermodynamic description of wetting, the partial-wetting regime is characterized by the strict inequality in [16] or, equivalently, by $K < K_w(\beta)$. We must have then $m^{w+} \neq m^{w-}$, because of eqn [15]. This shows that, in the case of partial wetting, $\mu^{w+}$ and $\mu^{w-}$ are different Gibbs states.

The complete-wetting regime is characterized by the equality in [16], that is, by $K \ge K_w(\beta)$. Then, we have $m^{w+} = m^{w-}$, and taking into account the last statement in Theorem 3, also $\mu^{w+} = \mu^{w-}$. This last result implies that there is only one Gibbs state. Thus, complete wetting corresponds to unicity of the Gibbs state.

In this case, we also have $\lim m^{w-}(z) = m^{*}$, when $z \to \infty$, because this is always true for $m^{w+}(z)$. This indicates that we are in the positive phase of the system although we have used the $(-)$ boundary condition, so that the bulk negative phase cannot reach the wall anymore. The film of positive phase, which wets the wall completely, has an infinite thickness with respect to the unit lattice spacing, in the thermodynamic limit.

When $\beta = \infty$, only a few particular ground configurations contribute to the partition functions, such as the configuration $\sigma(i) = -1$ for the partition function $Z^{w-}$, etc., and we obtain $\Delta\tau = 2K$ and $\tau = 2J$. For nonzero but low temperatures, the small perturbations of these ground states have to be considered, a problem that can be treated by the method of cluster expansions. In fact, the corresponding defects can be described by closed contours as in the case of pure phases.

Theorem 4 For $K < J$, the functions $\tau^{w-}(\beta, K)$ and $\tau^{w+}(\beta, K)$ are analytic at low temperatures, that is, provided that $\beta(J - K) \ge c_2$, where $c_2$ is a given constant. Moreover, $m^{w+}(z)$ and $m^{w-}(z)$ tend, respectively, to $m^{*}$ and to $-m^{*}$, when $z \to \infty$, exponentially fast.

The last statement in Theorem 4 tells us that the wall affects only a layer of finite thickness (with respect to the lattice spacing). From a macroscopic point of view, the negative phase reaches the wall, and we are in the partial-wetting regime. Indeed, a strict inequality holds in [16].

Thus, for $K < J$ there is always partial wetting at low temperatures. Then the following question arises:

Question 2 Is there a situation of complete wetting at higher temperatures? It is understood here that $K$ takes a fixed value, characteristic of the substrate, such that $0 < K < J$.

This is known to be the case in dimension $d = 2$, where the exact value of $K_w(\beta)$ can be obtained from Abraham's solution of the model:

$$\cosh 2\beta K_w = \cosh 2\beta J - e^{-2\beta J} \sinh 2\beta J$$

Then complete wetting occurs for $\beta$ in the interval $\beta_c < \beta \le \beta_w(K)$, where $\beta_c$ is the critical inverse temperature and $\beta_w(K)$ is the solution of $K_w(\beta) = K$. The case $d = 2$ has been reviewed in Abraham (1986).

To our knowledge, the above question remains an open problem for the Ising model in dimension $d = 3$. The problem has, however, been solved for the simpler case of a SOS interface model. In this case, a nice and rather brief proof of the following result has been given by Chalker (1982): one has $m^{w+} = m^{w-}$, and hence complete wetting, if

$$2\beta(J - K) < -\ln(1 - e^{-8\beta J})$$

It is very plausible that a similar statement is valid for the semi-infinite Ising model and also that Chalker's method could play a role for extending the proof to this case, provided an additional assumption is made, namely, that $\beta$ is sufficiently large, and hence $J - K$ small enough, in order to ensure the convergence of the cluster expansions and to be able to use them.
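The two explicit wetting criteria quoted above can be evaluated numerically. The following is a minimal sketch, with hypothetical function names, that solves Abraham's two-dimensional relation for $K_w(\beta)$ and checks Chalker's sufficient condition for the SOS model. The clamp to 1 before `acosh` is only a guard against rounding at $\beta_c$ and is not part of the formula.

```python
import math


def wetting_field_2d(beta, J=1.0):
    """K_w(beta) from cosh(2 beta K_w) = cosh(2 beta J) - exp(-2 beta J) sinh(2 beta J)."""
    x = 2.0 * beta * J
    rhs = math.cosh(x) - math.exp(-x) * math.sinh(x)
    # rhs equals 1 at beta_c; clamp to avoid a domain error from rounding.
    return math.acosh(max(rhs, 1.0)) / (2.0 * beta)


def chalker_complete_wetting(beta, K, J=1.0):
    """True if 2 beta (J - K) < -ln(1 - exp(-8 beta J)) (SOS model criterion)."""
    return 2.0 * beta * (J - K) < -math.log1p(-math.exp(-8.0 * beta * J))


# Critical inverse temperature of the two-dimensional model for J = 1,
# defined by sinh(2 beta_c J) = 1.
beta_c = math.asinh(1.0) / 2.0
```

With $J = 1$, `wetting_field_2d(beta_c)` vanishes up to rounding and `wetting_field_2d(beta)` increases towards $J$ as $\beta$ grows, in line with the statement that $\Delta\tau = \tau$ for $K \ge J$ and with the partial-wetting result for $K < J$ at low temperatures.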

Equilibrium Crystals

The shape of an equilibrium crystal is obtained, according to thermodynamics, by minimizing the surface free energy between the crystal and the medium, for a fixed volume of the crystal phase. Given the orientation-dependent surface tension $\tau(n)$, the solution to this variational problem, known under the name of Wulff construction, is the following set:

$$W = \{ x \in R^3 : x \cdot n \le \tau(n) \text{ for all } n \} \qquad [19]$$

Notice that the problem is scale invariant, so that if we solve it for a given volume of the crystal, we get the solution for other volumes by an appropriate scaling. We notice also that the symmetry $\tau(n) = \tau(-n)$ is not required for the validity of formula [19]. In the present case, $\tau(n)$ is obviously a symmetric function, but nonsymmetric situations are also physically interesting and appear, for instance, in the case of a drop on a wall discussed in the last section.

The surface tension in the Ising model between the positive and negative phases has been defined in eqn [7]. In the two-dimensional case, this function $\tau(n)$ has (as shown by Abraham) an exact expression in terms of some Onsager's function. It follows (as explained in Miracle-Sole (1999)) that the Wulff shape $W$, in the plane $(x_1, x_2)$, is given by

$$\cosh \beta x_1 + \cosh \beta x_2 \le \cosh^2 2\beta J / \sinh 2\beta J$$

This shape reduces to the empty set for $\beta \le \beta_c$, since the critical $\beta_c$ satisfies $\sinh 2J\beta_c = 1$. For $\beta > \beta_c$, it is a strictly convex set with smooth boundary.

In the three-dimensional case, only certain interface models can be exactly solved (see the section "Gibbs states and interfaces"). Consider the Ising model at zero temperature. The ground configurations have only one defect, the microscopic interface $\lambda$, imposed by the boundary condition $(\pm, n)$. Then, from eqn [9], we may write

$$\tau(n) = \lim_{L_1, L_2 \to \infty} \frac{n_3}{L_1 L_2} \left( E_\Lambda(n) - \beta^{-1} \ln N_\Lambda(n) \right) \qquad [20]$$

where $E_\Lambda = 2J|\lambda|$ is the energy (all $\lambda$ have the same minimal area) and $N_\Lambda$ the number of ground states. Every such $\lambda$ has the property of being cut only once by all straight lines orthogonal to the diagonal plane $i_1 + i_2 + i_3 = 0$, provided that $n_k > 0$, for $k = 1, 2, 3$. Each $\lambda$ can then be described by an integer function defined on a triangular plane lattice, the projection of the cubic lattice $\mathcal{L}$ on the diagonal plane. The model defined by this set of admissible microscopic interfaces is precisely the TISOS model. A similar definition can be given for the BCSOS model that describes the ground configurations on the body-centered cubic lattice.

From a macroscopic point of view, the roughness or the rigidity of an interface should be apparent when considering the shape of the equilibrium crystal associated with the system. A typical equilibrium crystal at low temperatures has smooth plane facets linked by rounded edges and corners. The area of a particular facet decreases as the temperature is raised and the facet finally disappears at a temperature characteristic of its orientation. It can be argued that the disappearance of the facet corresponds to the roughening transition of the interface whose orientation is the same as that of the considered facet.

The exactly solvable interface models mentioned above, for which the function $\tau(n)$ has been computed, are interesting examples of this behavior, and provide valuable information on several aspects of the roughening transition. This subject has been reviewed by Abraham (1986), van Beijeren and Nolden (1987), and Kotecky (1989).

For example, we show in Figure 3 the shape predicted by the TISOS model (one-eighth of the shape because of the condition $n_k > 0$). In this model, the interfaces orthogonal to the three coordinate axes are rigid at low temperatures.

Figure 3 Cubic equilibrium crystal shown in a projection parallel to the (1,1,1) direction. The three regions (1, 2, and 3) indicate the facets and the remaining area represents a curved part of the crystal surface.

For the three-dimensional Ising model at positive temperatures, the description of the microscopic interface, for any orientation $n$, appears as a very difficult problem. It has been possible, however, to analyze the interfaces which are very near to the particular orientations $n_0$, discussed in the section "Gibbs states and interfaces." This analysis allows us to determine the shape of the facets in a rigorous way.

We first observe that the appearance of a facet in the equilibrium crystal shape is related, according to the Wulff construction, to the existence of a discontinuity in the derivative of the surface tension with respect to the orientation. More precisely, assume that the surface tension satisfies the convexity condition of Theorem 1, and let this function $\tau(n) = \tau(\theta, \varphi)$ be expressed in terms of the spherical coordinates of $n$, the vector $n_0$ being taken as the $x_3$-axis. A facet orthogonal to $n_0$ appears in the Wulff shape if and only if the derivative $\partial\tau(\theta, \varphi)/\partial\theta$ is discontinuous at the point $\theta = 0$, for all $\varphi$. The facet $F \subset \partial W$ consists of the points $x \in R^3$ belonging to the plane $x_3 = \tau(n_0)$ and such that, for all $\varphi$ between $0$ and $2\pi$,

$$x_1 \cos\varphi + x_2 \sin\varphi \le \partial\tau(\theta, \varphi)/\partial\theta \,|_{\theta = 0^{+}} \qquad [21]$$

The step free energy is expected to play an important role in the facet formation. It is defined as the free energy associated with the introduction of a step of height 1 on the interface, and can be regarded as an order parameter for the roughening transition. Let $\Lambda$ be a parallelepiped as in the section "Pure phases and surface tension," and introduce the (step, $m$) boundary conditions (see Figure 2), associated to the unit vectors $m = (\cos\varphi, \sin\varphi) \in R^2$, by

$$\bar\sigma(i) = \begin{cases} 1 & \text{if } i_3 > 0 \text{ or if } i_3 = 0 \text{ and } i_1 m_1 + i_2 m_2 \ge 0 \\ -1 & \text{otherwise} \end{cases} \qquad [22]$$

Then, the step free energy per unit length for a step orthogonal to $m$ (with $m_2 > 0$) on the horizontal interface, is

$$\tau^{\mathrm{step}}(\varphi) = \lim_{L_1 \to \infty} \lim_{L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{\cos\varphi}{\beta L_1} \ln \frac{Z^{\mathrm{step},m}(\Lambda)}{Z^{\pm,n_0}(\Lambda)} \qquad [23]$$

A first result concerning this point was obtained by Bricmont and co-workers, by proving a correlation inequality which establishes $\tau^{\mathrm{step}}(0)$ as a lower bound to the one-sided derivative $\partial\tau(\theta, 0)/\partial\theta$ at $\theta = 0^{+}$ (the inequality extends also to $\varphi \neq 0$). Thus, when $\tau^{\mathrm{step}} > 0$, a facet is expected.

Using the perturbation theory of the horizontal interface, it is possible to also study the microscopic interfaces associated with the (step, $m$) boundary conditions. When considering these configurations, the step may be viewed as an additional defect on the rigid interface described in the section "Pure phases and surface tension." It is, in fact, a long wall going from one side to the other side of the box $\Lambda$. The step structure at low temperatures can then be analyzed with the help of a new cluster expansion. As a consequence of this analysis, we have the following theorem.

Theorem 5 If the temperature is low enough, that is, if $\beta J \ge c_3$, where $c_3$ is a given constant, then the step free energy, $\tau^{\mathrm{step}}(\varphi)$, exists, is strictly positive, and extends by positive homogeneity to a strictly convex function. Moreover, $\beta\tau^{\mathrm{step}}(\varphi)$ is an analytic function of $\zeta = e^{-2\beta J}$, for which an explicit convergent series expansion can be found.

Using the above results on the step structure, similar methods allow us to evaluate the increment in surface tension of an interface tilted by a very small angle $\theta$ with respect to the rigid horizontal interface. This increment can be expressed in terms of the step free energy, and one obtains the following relation.

Theorem 6 For $\beta J \ge c_3$, we have

$$\partial\tau(\theta, \varphi)/\partial\theta \,|_{\theta = 0^{+}} = \tau^{\mathrm{step}}(\varphi) \qquad [24]$$

This relation, together with eqn [21], implies that one obtains the shape of the facet by means of the two-dimensional Wulff construction applied to the step free energy. The reader will find a detailed discussion on these points, as well as the proofs of Theorems 5 and 6, in Miracle-Sole (1995).

From the properties of $\tau^{\mathrm{step}}$ stated in Theorem 5, it follows that the Wulff equilibrium crystal presents well-defined boundary lines, smooth and without straight segments, between a rounded part of the crystal surface and the facets parallel to the three main lattice planes.

It is expected, but not proved, that at a higher temperature, but before reaching the critical temperature, the facets associated with the Ising model undergo a roughening transition. It is then natural to believe that the equality [24] is true for any $\beta$ larger than $\beta_R$, allowing us to determine the facet shape from eqns [21] and [24], and that for $\beta \le \beta_R$, both sides in this equality vanish, and thus, the disappearance of the facet is involved. However, the condition that the temperature is low enough is needed in the proofs of Theorems 5 and 6.

See also: Dimer Problems; Phase Transitions in Continuous Systems; Phase Transition Dynamics; Two-Dimensional Ising Model; Wulff Droplets.
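As a numerical companion to the two-dimensional Wulff shape given in the section on equilibrium crystals, the following minimal sketch evaluates the inequality $\cosh \beta x_1 + \cosh \beta x_2 \le \cosh^2 2\beta J / \sinh 2\beta J$. The function names are hypothetical, and the clamp before `acosh` only guards against rounding at $\beta_c$.

```python
import math


def wulff_rhs(beta, J=1.0):
    """Right-hand side cosh^2(2 beta J) / sinh(2 beta J) of the Wulff inequality."""
    x = 2.0 * beta * J
    return math.cosh(x) ** 2 / math.sinh(x)


def in_wulff_shape(x1, x2, beta, J=1.0):
    """True if (x1, x2) satisfies the two-dimensional Wulff inequality."""
    return math.cosh(beta * x1) + math.cosh(beta * x2) <= wulff_rhs(beta, J)


def wulff_half_width(beta, J=1.0):
    """Largest x1 of the shape on the axis x2 = 0."""
    # On the axis the inequality reads cosh(beta x1) <= rhs - 1.
    return math.acosh(max(wulff_rhs(beta, J) - 1.0, 1.0)) / beta


beta_c = math.asinh(1.0) / 2.0  # sinh(2 beta_c J) = 1 with J = 1
```

In this sketch the half-width shrinks to zero as $\beta$ decreases to $\beta_c$ and approaches $2J$ at low temperatures, the zero-temperature value of the surface tension along a lattice axis.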

Further Reading

Abraham DB (1986) Surface structures and phase transitions. In: Domb C and Lebowitz JL (eds.) Critical Phenomena, vol. 10, pp. 1–74. London: Academic Press.
Chalker JT (1982) The pinning of an interface by a planar defect. Journal of Physics A: Mathematical and General 15: L481–L485.
Fröhlich J and Pfister CE (1987) Semi-infinite Ising model. I. Communications in Mathematical Physics 109: 493–523.
Fröhlich J and Pfister CE (1987) Semi-infinite Ising model. II. Communications in Mathematical Physics 112: 51–74.
Gallavotti G (1999) Statistical Mechanics: A Short Treatise. Berlin: Springer.
Kotecky R (1989) Statistical mechanics of interfaces and equilibrium crystal shapes. In: Simon B, Truman A, and Davies IM (eds.) IX International Congress of Mathematical Physics, pp. 148–163. Bristol: Adam Hilger.
Lebowitz JL and Pfister CE (1981) Surface tension and phase coexistence. Physical Review Letters 46: 1031–1033.
Messager A, Miracle-Sole S, and Ruiz J (1992) Convexity properties of the surface tension and equilibrium crystals. Journal of Statistical Physics 67: 449–470.
Miracle-Sole S (1995) Surface tension, step free energy and facets in the equilibrium crystal shape. Journal of Statistical Physics 79: 183–214.
Miracle-Sole S (1999) Facet shapes in a Wulff crystal. In: Miracle-Sole S, Ruiz J, and Zagrebnov V (eds.) Mathematical Results in Statistical Mechanics, pp. 83–101. Singapore: World Scientific.
van Beijeren H and Nolden I (1987) The roughening transition. In: Schommers W and von Blackenhagen P (eds.) Topics in Current Physics, vol. 43. pp. 259–300. Berlin: Springer.
Widom B (1991) Interfacial phenomena. In: Hansen JP, Levesque D, and Zinn-Justin J (eds.) Liquid, Freezing and the Glass Transition, pp. 506–546. Amsterdam: North-Holland.
Wolf PE, Balibar S, and Gallet F (1983) Experimental observations of a third roughening transition in hcp 4He crystals. Physical Review Letters 51: 1366–1369.

Stochastic Differential Equations


F Russo, Université Paris 13, Villetaneuse, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Stochastic differential equations (SDEs) appear today as a modeling tool in several sciences, such as telecommunications, economics, finance, biology, and quantum field theory.

An SDE is essentially a classical differential equation which is perturbed by a random noise. When nothing else is specified, SDE means in fact ordinary SDE; in that case it corresponds to the perturbation of an ordinary differential equation. Stochastic partial differential equations (SPDEs) are obtained as random perturbations of partial differential equations (PDEs).

One of the most important differences between deterministic and stochastic ordinary differential equations is described by the so-called Peano type phenomenon. A classical differential equation with continuous and linear growth coefficients admits global existence but not uniqueness, as classical calculus textbooks illustrate by studying equations of the type

$$\frac{dX}{dt}(t) = \sqrt{X(t)}, \qquad X(0) = 0$$

However, if one perturbs the right member of the equality with an additive Gaussian white noise $(\xi_t)$ (even with very small intensity), then the problem becomes well stated. A similar phenomenon happens with linear PDEs of evolution type perturbed with a spacetime white noise.

SDEs constitute a vast subject and account for an incredible amount of relevant contributions. We try to orient the reader along the main axes and to indicate references to the different subfields. We will prefer to refer to monographs when available, instead of articles.

Motivation and Preliminaries

In the whole article $T$ will be a strictly positive real number. Let us consider continuous functions $b : R_{+} \times R^{d} \to R^{d}$, $a : R_{+} \times R^{d} \to R^{d \times m}$ and $x_0 \in R^{d}$. We consider a differential problem of the following type:

$$\frac{dX_t}{dt} = b(t, X_t), \qquad X_0 = x_0 \qquad [1]$$

Let $(\Omega, \mathcal{F}, P)$ be a complete probability space. Suppose that the previous equation is perturbed by a random noise $(\xi_t)_{t \ge 0}$. For modeling reasons it could be reasonable to suppose that $(\xi_t)_{t \ge 0}$ satisfies the following properties.

1. It is a family of independent random variables (r.v.'s)
2. $(\xi_t)_{t \ge 0}$ is "stationary", that is, for any positive integer $n$, positive reals $h, t_0, t_1, \ldots, t_n$, the law of $(\xi_{t_0 + h}, \ldots, \xi_{t_n + h})$ does not depend on $h$.
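On a discrete time grid, a noise with the two properties above is easy to write down, and summing it produces a path whose increments are independent and stationary. The sketch below is only an illustration under stated assumptions: the noise samples are taken Gaussian with variance $1/dt$ (a scaling chosen here so that the sum over a time interval has variance equal to its length), and all function names are hypothetical.

```python
import random


def discretized_noise_path(T=1.0, n_steps=200, rng=random):
    """Return the grid values B_0, ..., B_n with B_{k+1} = B_k + xi_k * dt.

    Assumption of this sketch: the xi_k are i.i.d. N(0, 1/dt), so they are
    independent and have the same law at every step.
    """
    dt = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0) / dt ** 0.5  # discrete stand-in for the noise
        path.append(path[-1] + xi * dt)
    return path


def sample_variance_at_T(n_paths=2000, T=1.0, n_steps=200, seed=0):
    """Empirical variance of the end point B_T over many independent paths."""
    rng = random.Random(seed)
    ends = [discretized_noise_path(T, n_steps, rng)[-1] for _ in range(n_paths)]
    mean = sum(ends) / n_paths
    return sum((e - mean) ** 2 for e in ends) / (n_paths - 1)
```

Under these assumptions the end point is Gaussian with variance $T$, so `sample_variance_at_T()` should be close to 1 for $T = 1$, up to sampling error.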

More precisely we perturb eqn [1] as follows:

$$\frac{dX_t}{dt} = b(t, X_t) + a(t, X_t)\,\xi_t, \qquad X_0 = x_0 \qquad [2]$$

We suppose for a moment that $d = m = 1$. In reality no reasonable real-valued process $(\xi_t)_{t \ge 0}$ fulfilling previous assumptions exists. In particular, if process $(\xi_t)$ exists (resp. $(\xi_t)$ exists and each $\xi_t$ is a square-integrable r.v.), then the process cannot have continuous paths (resp. it cannot be measurable with respect to $\Omega \times R_{+}$). However, suppose that such a process exists; we set $B_t = \int_0^t \xi_s\, ds$. In that case, properties (1) and (2) can be translated into the following on $(B_t)$.

(P1) It has independent increments, which means that for any $t_0, \ldots, t_n$, $h \ge 0$, $B_{t_1 + h} - B_{t_0 + h}, \ldots, B_{t_n + h} - B_{t_{n-1} + h}$ are independent r.v.'s.
(P2) It has stationary increments, which means that for any $t_0, \ldots, t_n$, $h \ge 0$, the law of $(B_{t_1 + h} - B_{t_0 + h}, \ldots, B_{t_n + h} - B_{t_{n-1} + h})$ does not depend on $h$.

On the other hand, it is natural to require that

(C1) $B_0 = 0$ a.s.,
(C2) it is a continuous process, that is, it has continuous paths a.s.

Equation [2] should be rewritten in some integral form

$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t a(s, X_s)\, dB_s \qquad [3]$$

Clearly the paths of process $(B_t)$ cannot be differentiable, so one has to give meaning to integral $\int_0^t a(s, X_s)\, dB_s$. This will be intended in the "Itô" sense, see considerations below.

There one can find basic concepts of the theory of stochastic processes as the concept of adapted, progressively measurable process. An adapted process is also said to be nonanticipating towards the filtration $(\mathcal{F}_t)$ which represents the state of the information at each time $t$. A process $(X_t)$ is said to be adapted if for any $t$, $X_t$ is $\mathcal{F}_t$-measurable. The notion of progressively measurable process is a slight refinement of the notion of adapted process.

Definition 2
(i) A (continuous) $(\mathcal{F}_t)$ adapted process $(W_t)$ is called (classical) $(\mathcal{F}_t)$-Brownian motion if $W_0 = 0$, if for any $s < t$, $W_t - W_s$ is an $N(0, t - s)$ distributed r.v. which is independent of $\mathcal{F}_s$.
(ii) An $(\mathcal{F}_t)$-$m$-dimensional Brownian motion is a vector $(W^1, \ldots, W^m)$ of $(\mathcal{F}_t)$-classical independent Brownian motions.

From now on, we will consider a probability space $(\Omega, \mathcal{F}, P)$ equipped with a filtration $(\mathcal{F}_t)_{t \ge 0}$ fulfilling the usual conditions. From now on all the considered filtrations will have that property.

Let $W = (W_t)_{t \ge 0}$ be an $(\mathcal{F}_t)_{t \ge 0}$-$m$-dimensional classical Brownian motion. In Karatzas and Shreve (1991, chapter 3) and Revuz and Yor (1999, chapter 4), one introduces the notion of stochastic Itô integral announced before. Let $Y = (Y^1, \ldots, Y^m)$ be a progressively measurable $m$-dimensional process such that $\int_0^T \|Y_s\|^2\, ds < \infty$, then the Itô integral $\int_0^T Y_s \cdot dW_s$ is well defined. In particular the indefinite integral $\int_0^{\cdot} Y_s\, dW_s$ is an $(\mathcal{F}_t)$-progressively measurable continuous process. If $Y$ is an $R^{d \times m}$ matrix-valued process, the integral $\int_0^t Y_s\, dW_s$ is componentwise defined and it will be a vector in $R^{d}$. The analogous of differential calculus in the framework of stochastic processes is Itô calculus, see again Karatzas and Shreve (1991, chapter 3) and Revuz and Yor (1999, chapter 4). Important tools are the concept of quadratic variation $[X]$ of a stochastic process when it exists. For instance, the quadratic

An important result of probability theory says that a stochastic process $(B_t)$ fulfilling properties P1,
P2 and C1, C2 is essentially a ‘‘Brownian motion’’. variationR [W]t of a classical Brownian Rt motion equals t.
t
More precisely, there are real constants b,  such If Mt = 0 Ys dWs , then [M]t = 0 kYs k2 ds. One cele-
that Bt = bt þ Wt , where (Wt ) is a classical Brow- brated theorem of P Lévy states the following: if (Mt )
nian motion defined below. defines a continuous (F t )-local martingale such that
[Mt ]  t, then M is an (F t )-classical Brownian motion.
Definition 1
That theorem is called the ‘‘Lévy characterization
(i) A (continuous) stochastic process (Wt ) is called theorem of Brownian motion.’’ Itô formula constitutes
classical ‘‘Brownian motion’’ if W0 = 0 a.s., the natural generalization of fundamental theorem of
it has independent increments and the law of differential calculus to the stochastic calculus. Another
Wt  Ws is a Gaussian N(0, t  s) r.v. significant tool is Girsanov theorem; it states essen-
(ii) A m-dimensional Brownian motion is a vector tially the following: suppose that the following so-
(W 1 , . . . , W m ) of independent classical Brow- called ‘‘Novikov condition’’ is verified:
nian motions.
  Z T  
Let (F t )t0 be a filtration fulfilling the usual 1
E exp kYt k2 dt < 1
conditions, see (Karatzas and Shreve (1991, section 1.1). 2 0
Then the process $\tilde W_t = W_t + \int_0^t Y_s\,ds$, $t \in [0, T]$, is again an m-dimensional $(\mathcal{F}_t)$-classical Brownian motion under a new probability measure $Q$ on $(\Omega, \mathcal{F}_T)$ defined by

$$dQ = dP\,\exp\left(-\int_0^T Y_s\,dW_s - \frac{1}{2}\int_0^T \|Y_s\|^2\,ds\right)$$

Let $\xi$ be an $\mathcal{F}_0$-measurable r.v., for instance, $\xi \equiv x \in \mathbb{R}^d$. We are interested in the SDE

$$dX_t = a(t, X_t)\,dW_t + b(t, X_t)\,dt, \qquad X_0 = \xi \qquad [4]$$

Definition 3 A progressively measurable process $(X_t)_{t \in [0, T]}$ is said to be solution of [4] if a.s.

$$X_t = \xi + \int_0^t a(s, X_s)\,dW_s + \int_0^t b(s, X_s)\,ds, \qquad \forall t \in [0, T] \qquad [5]$$

provided that the right-hand side member makes sense. In particular, such a solution is continuous.

The function $a$ (resp. $b$) is called the diffusion (drift) coefficient of the SDE. $a$ and $b$ may sometimes be allowed to be random; however, this dependence has to be progressively measurable. Clearly, we can define the notion of solution $(X_t)_{t \ge 0}$ on the whole positive real axis.

We remark that those equations are called Itô SDEs. A solution of previous equation is named diffusion process.

The Lipschitz Case

The most natural framework for studying the existence and uniqueness for SDEs appears when the coefficients are Lipschitz.

A function $\gamma : [0, T] \times \mathbb{R}^m \to \mathbb{R}^d$ is said to have "polynomial growth" (with respect to $x$ uniformly in $t$), if for some $n$ there is a constant $C > 0$ with

$$\sup_{t \in [0, T]} \|\gamma(t, x)\| \le C(1 + \|x\|^n) \qquad [6]$$

The same function is said to have "linear growth" if [6] holds with $n = 1$. A function $\gamma : \mathbb{R}_+ \times \mathbb{R}^m \to \mathbb{R}^d$ is said to be "locally Lipschitz" (with respect to $x$ uniformly in $t$), if for every $T > 0$, $K > 0$, the restriction $\gamma|_{[0, T] \times [-K, K]^m}$ is Lipschitz (with respect to $x$ uniformly with respect to $t$).

Let $a : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^{d \times m}$, $b : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$, be Borel functions, $\xi$ an $\mathbb{R}^d$-valued r.v. $\mathcal{F}_0$-measurable and $(W_t)_{t \ge 0}$ be a m-dimensional $(\mathcal{F}_t)$-Brownian motion.

Classical fixed-point theorems allow to establish the following classical result.

Theorem 1 We suppose $a$ and $b$ locally Lipschitz with linear growth. Let $\xi$ be a square-integrable r.v. that is $\mathcal{F}_0$-measurable. Then [4] has a unique solution $X$. Moreover,

$$E\left(\sup_{t \le T} |X_t|^2\right) < \infty$$

Remark 1
(i) Equation [4] can be settled similarly by putting initial condition $x$ at some time $s$. In that case the problem is again well stated. If $\xi \equiv x$ is a deterministic point of $\mathbb{R}^d$, then we will often denote by $X^{s,x}$ the solution of that problem.
(ii) If the coefficients are only locally Lipschitz, the equation may be solved until a stopping time. If $d = 1$, it is possible to state necessary and sufficient conditions for nonexplosion (Feller test).
(iii) The theorem above admits several generalizations. For instance, the Brownian motion can be replaced by general semimartingales (possibly with jumps as Lévy processes).

An important role of diffusion processes is the fact that they provide probabilistic representation to PDEs of parabolic (and even elliptic) type. We will only mention here the parabolic framework.

We denote $A(t, x) = a(t, x)\,a(t, x)^{*}$, where $*$ means transposition for matrices. $(t, x) \mapsto A(t, x) = (A_{ij}(t, x))$ is a $d \times d$ matrix-valued function. Let us consider also continuous functions $k : [0, T] \times \mathbb{R}^d \to \mathbb{R}$, $g : [0, T] \times \mathbb{R}^d \to \mathbb{R}$ with polynomial growth or non-negative. Given a solution of [4], we can associate its generator $(L_t, t \in [0, T])$ setting

$$L_t f(x) = \frac{1}{2}\sum_{i,j=1}^{d} A_{ij}(t, x)\,\partial^2_{ij} f(x) + b(t, x) \cdot \nabla f(x)$$

Feynman–Kac theorem is stated below and it provides probabilistic representation of an associated parabolic linear PDE.

Theorem 2 Suppose there is a function $v : [0, T] \times \mathbb{R}^d \to \mathbb{R}$, continuous with polynomial growth, of class $C^{1,2}([0, T[ \times \mathbb{R}^d)$, satisfying the following Cauchy problem:

$$\partial_t v + L_t v - k v = g, \qquad v(T, x) = f(x) \qquad [7]$$

Then

$$v(s, x) = E\left( f(X_T)\exp\left(-\int_s^T k(\theta, X_\theta)\,d\theta\right) - \int_s^T g(t, X_t)\exp\left(-\int_s^t k(\theta, X_\theta)\,d\theta\right)dt \right)$$
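The representation in Theorem 2 suggests a Monte Carlo approximation of $v(s, x)$. The sketch below is only an illustration under stated assumptions: it treats the scalar case $d = m = 1$, it replaces the exact diffusion by an Euler–Maruyama time discretization (a swapped-in numerical approximation, not part of the theorem), and the name `feynman_kac_mc` with all of its parameters is hypothetical.

```python
import numpy as np

def feynman_kac_mc(s, x, T, a, b, k, g, f, n_paths=50000, n_steps=200, seed=0):
    """Hypothetical Monte Carlo estimate of v(s, x) in the scalar case.

    a, b, k, g are callables (t, x) -> value, f is a callable x -> value;
    all must accept a NumPy array for x.
    """
    rng = np.random.default_rng(seed)
    dt = (T - s) / n_steps
    X = np.full(n_paths, float(x))   # all paths start from x at time s
    int_k = np.zeros(n_paths)        # running integral of k(theta, X_theta)
    int_g = np.zeros(n_paths)        # running integral of g * exp(-int_k)
    t = s
    for _ in range(n_steps):
        # Left-point Riemann sums for the two time integrals.
        int_g += g(t, X) * np.exp(-int_k) * dt
        int_k += k(t, X) * dt
        # Euler-Maruyama step for dX = a dW + b dt.
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        X = X + b(t, X) * dt + a(t, X) * dW
        t += dt
    # Sample mean of the functional appearing in the representation.
    return np.mean(f(X) * np.exp(-int_k) - int_g)

# Test case chosen here: a = 1, b = 0, g = 0, constant k = c, f(x) = x^2.
c, T, s0, x0 = 0.5, 1.0, 0.0, 1.0
estimate = feynman_kac_mc(
    s0, x0, T,
    a=lambda t, x: 1.0, b=lambda t, x: 0.0,
    k=lambda t, x: c, g=lambda t, x: 0.0,
    f=lambda x: x ** 2,
)
exact = np.exp(-c * (T - s0)) * (x0 ** 2 + (T - s0))
print(estimate, exact)
```

For this choice of coefficients one can check by direct differentiation that $v(s, x) = e^{-c(T-s)}(x^2 + T - s)$ solves [7], which provides a reference value for the estimate; the two printed numbers should agree up to Monte Carlo error.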
This representation holds for $(s, x) \in [0, T] \times \mathbb{R}^d$, where $X = X^{s,x}$. In particular, such a solution is unique.

Remark 2
(i) In order to obtain "classical solutions" of the above Cauchy problem, one needs some conditions. It is the case, for instance, when the following ellipticity condition holds on $A$:

$$\exists c > 0,\ \forall (t, x) \in [0, T] \times \mathbb{R}^d,\ \forall (\xi_1, \ldots, \xi_d) \in \mathbb{R}^d: \qquad \sum_{i,j=1}^{d} A_{ij}(t, x)\,\xi_i \xi_j \ge c \sum_{i=1}^{d} |\xi_i|^2 \qquad [8]$$

In the degenerate case, it is possible to deal with viscosity solutions, in the sense of P L Lions. This theorem establishes an important link between deterministic PDEs and SDEs.
(ii) A natural generalization of Feynman–Kac theorem comes from the system of forward–backward SDEs in the sense of Pardoux and Peng.
(iii) Other types of probabilistic representation do appear in stochastic control theory through the so-called verification theorems, see for instance, Fleming and Soner (1993) and Yong and Zhou (1999). In that case, the (nonlinear) Hamilton–Jacobi–Bellman deterministic equation is represented by a controlled SDE.
(iv) Another bridge between nonlinear PDEs and diffusions can be provided in the framework of interacting particle systems with chaos propagation, see Graham et al. (1996) for a survey on those problems. Among the most significant nonlinear PDEs investigated probabilistically, we quote the case of porous media equations. For instance, for a positive integer $m$, a solution to

$$\partial_t u = \frac{1}{2}\,\partial^2_{xx}\left(u^{2m+1}\right) \qquad [9]$$

can be represented by a (nonlinear) diffusion of the type, see Benachour et al. (1996),

$$dX_t = u^m(t, X_t)\,dW_t, \qquad u(t, \cdot) = \text{law density of } X_t \qquad [10]$$

Different Notions of Solutions

Let $a$ and $b$ as at the beginning of the previous section. Let $(\Omega, \mathcal{F}, P)$ be a probability space, a filtration $(\mathcal{F}_t)_{t \ge 0}$ fulfilling the usual conditions, an $(\mathcal{F}_t)_{t \ge 0}$-classical Brownian motion $(W_t)_{t \ge 0}$. Let $\xi$ be an $\mathcal{F}_0$-measurable r.v. In the section "Motivation and preliminaries," we defined the notion of solution of the following equation:

$$dX_t = b(t, X_t)\,dt + a(t, X_t)\,dW_t, \qquad X_0 = \xi \qquad [11]$$

This equation will be denoted by $E(a, b)$ (without initial condition). However, as we will see, the general concept of solution of an SDE is more sophisticated and subtle than in the deterministic case. We distinguish several variants of existence and uniqueness.

Definition 4 (Strong existence). We will say that equation $E(a, b)$ admits strong existence if the following holds. Given any probability space $(\Omega, \mathcal{F}, P)$, a filtration $(\mathcal{F}_t)_{t \ge 0}$, an $(\mathcal{F}_t)_{t \ge 0}$-Brownian motion $(W_t)_{t \ge 0}$, an $\mathcal{F}_0$-measurable and square-integrable r.v. $\xi$, there is a process $(X_t)_{t \ge 0}$ solution to $E(a, b)$ with $X_0 = \xi$ a.s.

Definition 5 (Pathwise uniqueness). We will say that equation $E(a, b)$ admits pathwise uniqueness if the following property is fulfilled. Let $(\Omega, \mathcal{F}, P)$ be a probability space, a filtration $(\mathcal{F}_t)_{t \ge 0}$, an $(\mathcal{F}_t)_{t \ge 0}$-Brownian motion $(W_t)_{t \ge 0}$. If two processes $X, \tilde X$ are two solutions such that $X_0 = \tilde X_0$ a.s., then $X$ and $\tilde X$ coincide.

Definition 6 (Existence in law or weak existence). Let $\nu$ be a probability law on $\mathbb{R}^d$. We will say that $E(a, b; \nu)$ admits weak existence if there is a probability space $(\Omega, \mathcal{F}, P)$, a filtration $(\mathcal{F}_t)_{t \ge 0}$, an $(\mathcal{F}_t)_{t \ge 0}$-Brownian motion $(W_t)_{t \ge 0}$, and a process $(X_t)_{t \ge 0}$ solution of $E(a, b)$ with $\nu$ being the law of $X_0$. We say that $E(a, b)$ admits weak existence if $E(a, b; \nu)$ admits weak existence for every $\nu$.

Definition 7 (Uniqueness in law). Let $\nu$ be a probability law on $\mathbb{R}^d$. We say that $E(a, b; \nu)$ has a unique solution in law if the following holds. We consider an arbitrary probability space $(\Omega, \mathcal{F}, P)$ and a filtration $(\mathcal{F}_t)_{t \ge 0}$ on it; we consider also another probability space $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde P)$ equipped with another filtration $(\tilde{\mathcal{F}}_t)_{t \ge 0}$; we consider an $(\mathcal{F}_t)_{t \ge 0}$-Brownian motion $(W_t)_{t \ge 0}$, and an $(\tilde{\mathcal{F}}_t)_{t \ge 0}$-Brownian motion $(\tilde W_t)_{t \ge 0}$; we suppose having a process $(X_t)_{t \ge 0}$ (resp. a process $(\tilde X_t)_{t \ge 0}$) solution of $E(a, b)$ on the first (resp. on the second) probability space such that both the law of $X_0$ and $\tilde X_0$ are identical to $\nu$. Then $X$ and $\tilde X$ must have the same law as r.v. with values in $E = C(\mathbb{R}_+)$ (or $C[0, T]$).

We say that $E(a, b)$ has a unique solution in law if $E(a, b; \nu)$ has a unique solution in law for every $\nu$.

There are important theorems which establish bridges among the preceding notions. One of the most celebrated is the following.

Proposition 1 (Yamada–Watanabe). Consider the equation $E(a, b)$.
(i) Pathwise uniqueness implies uniqueness in law.
(ii) Weak existence and pathwise uniqueness imply strong existence.
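Strong existence in the sense of Definition 4 asks for a solution built on the given probability space from the given Brownian motion. A rough numerical picture of this, in the Lipschitz scalar case, is the Euler–Maruyama scheme, which takes the increments of a prescribed Brownian path as input and returns an approximate path of $X$. The scheme is a standard discretization brought in here for illustration, not a construction used in the article; the name `euler_maruyama`, the geometric test equation and the parameter values are assumptions of this sketch.

```python
import numpy as np

def euler_maruyama(a, b, x0, dW, dt):
    """Hypothetical helper: approximate path of dX = b dt + a dW (scalar case).

    dW holds the increments of one given Brownian path on a grid of step dt,
    so the output is a deterministic function of that path.
    """
    X = np.empty(len(dW) + 1)
    X[0] = x0
    t = 0.0
    for i, dw in enumerate(dW):
        X[i + 1] = X[i] + b(t, X[i]) * dt + a(t, X[i]) * dw
        t += dt
    return X

# Illustrative Lipschitz coefficients: a(t, x) = sigma * x, b(t, x) = mu * x.
mu, sigma, x0, T, n_steps = 0.1, 0.2, 1.0, 1.0, 10000
dt = T / n_steps
rng = np.random.default_rng(1)
dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)   # one fixed Brownian path

a = lambda t, x: sigma * x
b = lambda t, x: mu * x

path1 = euler_maruyama(a, b, x0, dW, dt)
path2 = euler_maruyama(a, b, x0, dW, dt)          # same driving path again

# Known closed form for this linear equation, evaluated on the same path.
exact_T = x0 * np.exp((mu - 0.5 * sigma ** 2) * T + sigma * dW.sum())
print(path1[-1], exact_T)
```

Feeding the same increments twice gives the same output, which mirrors at the discrete level the idea that, under pathwise uniqueness, the solution is determined by the initial condition and the driving Brownian motion. For the linear test equation the final value can be compared with the explicit solution $x_0 \exp((\mu - \sigma^2/2)T + \sigma W_T)$.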
A version can be stated for $E(a, b; \nu)$ where $\nu$ is a fixed probability law.

Remark 3
(i) If $a$ and $b$ are locally Lipschitz with linear growth, Theorem [1] implies that $E(a, b)$ admits strong existence and pathwise uniqueness.
(ii) If $a$ and $b$ are only locally Lipschitz, then pathwise uniqueness is fulfilled.

Existence and Uniqueness in Law

A way to create weak solutions of $E(1, b)$ when $(t, x) \mapsto b(t, x)$ is Borel with linear growth is the Girsanov theorem. Suppose $d = 1$ for simplicity. Let us consider an $(\mathcal{F}_t)$-classical Brownian motion $(X_t)$. We set

$$W_t = X_t - \int_0^t b(s, X_s)\,ds$$

Under some suitable probability $Q$, $(W_t)$ is an $(\mathcal{F}_t)$-classical Brownian motion. Therefore, $(X_t)$ provides a solution to $E(1, b; \delta_0)$.

We continue with an example where $E(a, b)$ does not admit pathwise uniqueness, even though it admits uniqueness in law.

Example 1 We consider the stochastic equation

$$X_t = \int_0^t \mathrm{sign}(X_s)\,dW_s \qquad [12]$$

with

$$\mathrm{sign}(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ -1 & \text{if } x < 0 \end{cases}$$

It corresponds to $E(a, b; \delta_0)$ with $b = 0$ and $a(x) = \mathrm{sign}(x)$.

If $(W_t)_{t \ge 0}$ is an $(\mathcal{F}_t)$-classical Brownian motion, then $(X_t)_{t \ge 0}$ is an $(\mathcal{F}_t)_{t \ge 0}$-continuous local martingale vanishing at zero such that $[X]_t \equiv t$. According to Lévy characterization theorem stated earlier, $X$ is an $(\mathcal{F}_t)_{t \ge 0}$-classical Brownian motion. This shows in particular that $E(a, b; \delta_0)$ admits uniqueness in law. In the sequel, we will show that $E(a, b; \delta_0)$ also admits weak existence.

Let now $(\Omega, \mathcal{F}, P)$ be a probability space, an $(\mathcal{F}_t)_{t \ge 0}$-classical Brownian motion with respect to a filtration and $(X_t)_{t \ge 0}$ such that [12] is verified. Then $\tilde X_t = -X_t$ can also be shown to be a solution. Therefore, $E(a, b; \delta_0)$ does not admit pathwise uniqueness.

We continue stating a result true in the multidimensional case.

Proposition 3 (Stroock–Varadhan). Let $\nu$ be a probability on $\mathbb{R}^d$ such that

$$\int_{\mathbb{R}^d} \|x\|^{2m}\,\nu(dx) < +\infty \qquad [13]$$

for a certain $m > 1$. We suppose that $a, b$ are continuous with linear growth. Then $E(a, b; \nu)$ admits weak existence.

From now on, a function $\gamma : [0, T] \times \mathbb{R}^m \to \mathbb{R}^d$ will be said Hölder-continuous if it is Hölder-continuous in the space variable $x \in \mathbb{R}^m$ uniformly with respect to the time variable $t \in [0, T]$.

Stroock and Varadhan (1979) also provide the following result, which is an easy consequence of their theorem 7.2.1.

Proposition 4 We suppose $a, b$ both Hölder-continuous, bounded such that condition [8] is fulfilled. Then SDE $E(a, b; \nu)$ admits weak uniqueness.

Remark 4
(i) The Hölder condition and [8] in Proposition 4 may be relaxed and replaced with the solvability of a Cauchy problem of a parabolic PDE with suitable terminal value.
(ii) In the case $d = 1$, if $a, b$ are bounded and just Borel with [8] for $x$ on each compact, then $E(a, b; \nu)$ admits weak existence and uniqueness in law. See Stroock and Varadhan (1979, exercises 7.3.2 and 7.3.3).
(iii) If $d = 2$, the same holds as at previous point provided that moreover $a$ does not depend on time.

We proceed with some more specifically unidimensional material stating some results from K J Engelbert and W Schmidt, who furnished necessary and sufficient conditions for weak existence and uniqueness in law of SDEs.

For a Borel function $\sigma : \mathbb{R} \to \mathbb{R}$, we first define

$$Z(\sigma) = \{x \in \mathbb{R} \mid \sigma(x) = 0\}$$

then we define the set $I(\sigma)$ as the set of real numbers $x$ such that

$$\int_{x-\varepsilon}^{x+\varepsilon} \frac{dy}{\sigma^2(y)} = \infty, \qquad \forall \varepsilon > 0$$

Proposition 5 (Engelbert–Schmidt criterion). Suppose that $a : \mathbb{R} \to \mathbb{R}$, that is, does not depend on time and we consider the equation without drift $E(a, 0)$.

(i) $E(a, 0)$ admits weak existence (without explosion) if and only if

$$I(a) \subset Z(a) \qquad [14]$$
(ii) $E(a, 0)$ admits weak existence and uniqueness in law if and only if

$$I(a) = Z(a) \qquad [15]$$

Remark 5
(i) If $a$ is continuous then [14] is always verified. Indeed, if $a(x) \ne 0$, there is $\varepsilon > 0$ such that

$$|a(y)| > 0, \qquad \forall y \in [x - \varepsilon, x + \varepsilon]$$

Therefore, $x$ cannot belong to $I(a)$.
(ii) Equation [14] is verified also for some discontinuous functions as, for instance, $a(x) = \mathrm{sign}(x)$. This confirms what was affirmed previously, that is, the weak existence (and uniqueness in law) for $E(a, 0)$.
(iii) If $a(x) = 1_{\{0\}}(x)$, [14] is not verified.
(iv) If $a(x) = |x|^{\alpha}$, $\alpha \ge 1/2$, then

$$Z(a) = I(a) = \{0\}$$

So there is at most one solution in law for $E(a, 0)$.
(v) The proof is technical and makes use of Lévy characterisation theorem of Brownian motion.

Results on Pathwise Uniqueness

Proposition 6 (Yamada–Watanabe). Let $a, b : \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}$ and consider again $E(a, b)$. Suppose $b$ globally Lipschitz and $h : \mathbb{R}_+ \to \mathbb{R}_+$ strictly increasing continuous such that
(i) $h(0) = 0$;
(ii) $\int_0^{\varepsilon} \frac{1}{h^2(y)}\,dy = \infty$, $\forall \varepsilon > 0$; and
(iii) $|a(t, x) - a(t, y)| \le h(|x - y|)$.

Then pathwise uniqueness is verified.

Remark 6
(i) In Proposition 6, one typical choice is $h(u) = u^{\alpha}$, $\alpha \ge 1/2$.
(ii) Pathwise uniqueness for $E(a, b)$ holds therefore if $b$ is globally Lipschitz and $a$ is Hölder-continuous with parameter equal to 1/2.

Corollary 1 Suppose that the assumptions of Proposition 6 are verified and $a, b$ continuous with linear growth. Then $E(a, b; \nu)$ admits strong existence and pathwise uniqueness, whenever $\nu$ verifies condition [13].

Proof It follows from Propositions 6 and 3 together with Proposition 1 (ii). $\square$

Remark 7 Suppose $d = 1$. Pathwise uniqueness for $E(a, b)$ also holds under the following assumptions.
(i) $a, b$ are bounded, $a$ is time independent and $a \ge \text{const.} > 0$, $h$ as in Proposition 6. This result has an analogous form in the case of spacetime white noise driven SPDEs of parabolic type, as proved by Bally, Gyongy, and Pardoux in 1994.
(ii) $a$ independent on time, $b$ bounded and $a \ge \text{const.} > 0$; moreover, $|a(x) - a(y)|^2 \le |f(y) - f(x)|$ and $f$ is increasing and bounded.

For illustration we provide some significant examples.

Example 2

$$X_t = \int_0^t |X_s|^{\alpha}\,dW_s, \qquad t \ge 0 \qquad [16]$$

We set $a(x) = |x|^{\alpha}$, $0 < \alpha < 1$. This is equation $E(a, 0)$ with $a(x) = |x|^{\alpha}$. According to Engelbert–Schmidt notations, we have $Z(a) = \{0\}$. Moreover
(i) If $\alpha \ge 1/2$, then $I(a) = \{0\}$.
(ii) If $\alpha < 1/2$ then $I(a) = \emptyset$.

Therefore, according to Proposition 5, $E(a, 0)$ admits weak existence. On the other hand, if $\alpha \ge 1/2$,

$$|x^{\alpha} - y^{\alpha}| \le h(|x - y|) \qquad [17]$$

where $h(z) = z^{\alpha}$. According to Proposition 6, [16] admits pathwise uniqueness and by Corollary [1], also strong existence. The unique solution is $X \equiv 0$. If $\alpha < 1/2$, $X \equiv 0$ is always a solution. This is not the only one; even uniqueness in law is not true.

Example 3 Let $a(x) = \sqrt{|x|}$, $b$ Lipschitz. Then $E(a, b)$ admits strong existence and pathwise uniqueness. In fact, $a$ is Hölder-continuous with parameter 1/2 and the second item of Remark 6 applies; so pathwise uniqueness holds. Strong existence is a consequence of Propositions 3 and 1 (ii).

An interesting particular case is provided by the following equation. Let $x_0, \sigma, \delta \ge 0$, $k \in \mathbb{R}$. The following equation admits strong existence and pathwise uniqueness.

$$Z_t = x_0 + \sigma \int_0^t \sqrt{|Z_s|}\,dW_s + \int_0^t (\delta - k Z_s)\,ds, \qquad t \in [0, T] \qquad [18]$$

Equation [18] is widely used in mathematical finance and it constitutes the model of Cox–Ingersoll–Ross: the solution of the mentioned equation represents the short interest rate.

Consider now the particular case where $k = 0$, $\sigma = 2$. According to some comparison theorem for SDEs, the solution $Z$ is always non-negative and
therefore the absolute value may be omitted. The equation becomes

$$Z_t = x_0 + 2\int_0^t \sqrt{Z_s}\,dW_s + \delta t \qquad [19]$$

Definition 8 The unique solution $Z$ to

$$Z_t = x_0 + 2\int_0^t \sqrt{Z_s}\,dW_s + \delta t \qquad [20]$$

is called "square $\delta$-dimensional Bessel process" starting at $x_0$; it is denoted by $\mathrm{BESQ}^{\delta}(x_0)$; for fine properties of this process, see Revuz and Yor (1999, ch. IX.3).

Since $Z \ge 0$, we call $\delta$-dimensional Bessel process starting from $x_0$ the process $X = \sqrt{Z}$. It is denoted by $\mathrm{BES}^{\delta}(x_0)$.

Remark 8 Let $d \ge 1$. Let $W = (W^1, \ldots, W^d)$ be a classical d-dimensional Brownian motion. We set $X_t = \|W_t\|$. $(X_t)_{t \ge 0}$ is a d-dimensional Bessel process.

Remark 9 If $\delta > 1$, it is possible to see that

$$X_t = W_t + \frac{\delta - 1}{2}\int_0^t \frac{ds}{X_s}$$

The Case with Distributional Drift

Pioneering work about diffusions with generalized drift was presented by N I Portenko, but in the framework of semimartingale processes. Recently, some work was done characterizing solutions in the class of the so-called Dirichlet processes, with some motivations in random irregular environment.

A useful transformation in the theory of SDE is the so-called "Zvonkin transformation." Let $(W_t)$ be an $(\mathcal{F}_t)$-classical Brownian motion. Let $a$ (resp. $b$) $: \mathbb{R} \to \mathbb{R}$ (resp. $C^1$) be locally bounded. We suppose moreover $a > 0$. We fix $x_0 \in \mathbb{R}$. Let $(X_t)_{t \ge 0}$ be a solution of

$$X_t = x_0 + \int_0^t b(X_s)\,ds + \int_0^t a(X_s)\,dW_s \qquad [21]$$

We set

$$\Sigma(x) = \int_0^x \frac{2b}{a^2}(y)\,dy$$

and we define $h : \mathbb{R} \to \mathbb{R}$ such that

$$h(0) = 0, \qquad h' = e^{-\Sigma}$$

$h$ is strictly increasing. We set $\tilde a(x) = (a h')(h^{-1}(x))$, where $h^{-1}$ is the inverse of $h$. We set $Y_t = h(X_t)$.

Without entering into details, the classical Itô formula allows to show that $(Y_t)$ defines a solution of

$$dY_t = \tilde a(Y_t)\,dW_t, \qquad Y_0 = h(x_0) \qquad [22]$$

Now, eqn [22] fulfills the requirements of the Engelbert–Schmidt criterion so that it admits weak existence and uniqueness in law. Consequently, unless explosion, one can easily establish the same well-posedness for [21].

Zvonkin transformation also allows to prove strong existence and pathwise uniqueness results for [21]; for instance, when $a$ has linear growth, and

$$y \mapsto \int_0^y \frac{b(s)}{a^2(s)}\,ds$$

is a bounded function.

In fact, problem [22] satisfies pathwise uniqueness and strong existence since the coefficients are Lipschitz with linear growth. Therefore, one can deduce the same for [21].

Veretennikov generalized Zvonkin transformation to the d-dimensional case in some cases which include the case $a = 1$ and $b$ bounded Borel.

Zvonkin's procedure suggests also to consider a formal equation of the type

$$dX_t = dW_t + \beta'(X_t)\,dt \qquad [23]$$

where $\beta$ is only a continuous function and so $b = \beta'$ is a Schwartz distribution; $\beta$ could be, for instance, the realization of an independent Brownian motion of $W$. Therefore, eqn [23] is motivated by the study of irregular random media. When $a = 1$, $b = \beta'$, SDE [22], $h' = e^{-2\beta}$ still makes sense.

Using the Engelbert–Schmidt criterion, one can see that problem [22] still admits weak existence and uniqueness in the sense of distribution laws. If $Y$ is a solution of [22], $X = h^{-1}(Y)$ provides a natural candidate solution for [21]. R F Bass, Z-Q Chen and F Flandoli, F Russo, and J Wolf investigated generalized SDEs as [23]: in particular, they made previous reasoning rigorous, respectively, in the case of strong and weak solutions, see Flandoli et al. (2003).

Connected Topics

We aim here at giving some basic references about topics which are closely connected to SDEs.

Stochastic Partial Differential Equations (SPDEs)

If a SDE is a random perturbation of an ordinary differential equation, an SPDE is a random
perturbation of a PDE. Several studies were performed in the parabolic (evolution equation) and hyperbolic case (wave equation). Most of the work was done in the case of a fixed underlying probability space. We only quote two basic monographies which should be consulted at first before getting into the subject: the one of Walsh (1986) and the one of Da Prato and Zabczyk (1992).

However, it was possible to establish some results about weak existence and uniqueness in law for SPDEs. One possible tool was a generalization of Girsanov theorem to the case of Gaussian spacetime white noise. Weak existence for the stochastic quantization equation was proved with the help of infinite-dimensional Dirichlet forms by S Albeverio and M Röckner.

We also indicate a beautiful recent monography by Da Prato (2004) which pays particular attention to Kolmogorov equations with infinitely many variables.

Numerical Approximations

Relevant work was done in numerical approximation of solutions to SDEs and related approximations of solutions to linear parabolic equations via Feynman–Kac probabilistic representation (see Theorem 2). It seems that the stochastic simulations (of improved Monte Carlo type and related topics) for solving deterministic problems are efficient when the space dimension is greater than 4.

Malliavin Calculus

Malliavin calculus is a wide topic (see Malliavin Calculus). Relevant applications of it concern stochastic (ordinary and partial) differential equations. We only quote a monography of Nualart (1995) on those applications. Two main objects were studied.

Given a solution of an SDE, $(X_t)$, sufficient conditions so that $X_t$, $t > 0$, has a (smooth) density $p(t, \cdot)$. Small-time asymptotics of this density, when $t \to 0$, and small-drift perturbation were performed, refining Freidlin–Ventsell large-deviation estimates.

Coming back to SDE [11], one can conceive to consider coefficients $a, b$ nonadapted with respect to the underlying filtration $(\mathcal{F}_t)$. On the other hand, the initial condition $\xi$ may be anticipating, that is, not $\mathcal{F}_0$-measurable. In that case, the Itô integral $\int_0^t a(s, X_s)\,dW_s$ is not defined. A replacement tool is the so-called "Skorohod integral."

Rough Paths Approach

A very successful and significant research field is the rough path theory. In the case of dimension $d = 1$, Doss–Sussmann method allows to transform the solution of an SDE into the solution of an ordinary (random) differential equation. In particular, that solution can be seen as depending (pathwise) continuously from the driving Brownian motion $(W_t)$ with respect to the usual topology of $C([0, T])$. Unless exceptions, this continuity does not hold in case of general dimension $d > 1$. Rough paths theory, introduced by T Lyons, allows to recover somehow this lack of continuity and establishes a true pathwise stochastic integration.

SDEs Driven by Non-semimartingales

At the moment, there is a very intense activity towards SDEs driven by processes which are not semimartingales. In this perspective, we list SDEs driven by fractional Brownian motion with the help of rough paths theory, using fractional and Young type integrals and involving finite cubic variation processes. Among the contributors in that area we quote L Coutin, R Coviello, M Errami, M Gubinelli, Z Qian, F Russo, P Vallois, and M Zähle.

See also: Fractal Dimensions in Dynamics; Image Processing: Mathematics; Interacting Stochastic Particle Systems; Lagrangian Dispersion (Passive Scalar); Malliavin Calculus; Path Integrals in Noncommutative Geometry; Quantum Dynamical Semigroups; Quantum Fields with Indefinite Metric: Non-Trivial Models; Random Dynamical Systems; Random Walks in Random Environments; Stochastic Hydrodynamics; Stochastic Resonance.

Further Reading

Bally V, Gyongy I, and Pardoux E (1994) White noise driven parabolic SPDEs with measurable drift. Journal of Functional Analysis 120: 484–510.
Benachour S, Chassaing P, Roynette B, and Vallois P (1996) Processus associés à l'équation des milieux poreux. Annali Scuola Normale Superiore di Pisa, Classe di Scienze 23(4): 793–832.
Bouleau N and Lépingle D (1994) Numerical Methods for Stochastic Processes. New York: Wiley.
Da Prato G (2004) Kolmogorov Equations for Stochastic PDEs. Basel: Birkhäuser.
Da Prato G and Zabczyk J (1992) Stochastic Equations in Infinite Dimensions, Encyclopedia of Mathematics and its Applications, vol. 44. Cambridge: Cambridge University Press.
Flandoli F, Russo F, and Wolf J (2003) Some stochastic differential equations with distributional drift. Part I. General calculus. Osaka Journal of Mathematics 40(2): 493–542.
Fleming WH and Soner M (1993) Controlled Markov Processes and Viscosity Solutions. New York: Springer.
Graham C, Kurtz Th G, Méléard S, Protter Ph E, Pulvirenti M, and Talay D (1996) Probabilistic models for nonlinear partial
differential equations. In: Talay and Tubaro L (eds.) Lectures given at the 1st Session and Summer School held in Montecatini Terme, 22–30 May 1995, Lecture notes in Mathematics, vol. 1627. Springer-Verlag. Centro Internazionale Matematico Estivo (C.I.M.E.), Florence, 1996.
Karatzas I and Shreve SE (1991) Brownian Motion and Stochastic Calculus, 2nd edn. New York: Springer.
Kloeden PE and Platen E (1992) Numerical Solutions of Stochastic Differential Equations. Berlin: Springer.
Lamberton D and Lapeyre B (1997) Introduction au Calcul Stochastique et Applications à la Finance. Paris: Collection Ellipses.
Ma J and Yong J (1999) Forward–Backward Stochastic Differential Equations and Their Applications, Lecture Notes in Mathematics, vol. 1702. Berlin: Springer.
Nualart D (1995) The Malliavin Calculus and Related Topics. New York: Springer.
Øksendal B (2003) Stochastic Differential Equations. An Introduction with Applications. Sixth edition, Universitext. Berlin: Springer-Verlag.
Pardoux E (1998) Backward stochastic differential equations and viscosity solutions of systems of semilinear parabolic and elliptic PDEs of second order. Stochastic analysis and related topics, VI (Geilo, 1996), 79–127, Progr. Probab., 42, Boston: Birkhäuser, 1998.
Protter P (1992) Stochastic Integration and Differential Equations. A new approach. Berlin: Springer.
Lyons T and Qian Z (2002) Controlled Systems and Rough Paths. Oxford: Oxford University Press.
Revuz D and Yor M (1999) Continuous Martingales and Brownian Motion, Third edition. Berlin: Springer-Verlag.
Stroock D and Varadhan SRS (1979) Multidimensional Diffusion Processes. Berlin–New York: Springer.
Walsh JB (1986) An introduction to stochastic partial differential equations. In: Ecole d'été de probabilités de Saint–Flour, XIV – 1984, pp. 265–439. Lecture Notes in Mathematics. vol. 118, Springer.
Yong J and Zhou XY (1999) Stochastic Controls: Hamiltonian Systems and HJB Equations. New York: Springer.

Stochastic Hydrodynamics
B Ferrario, Università di Pavia, Pavia, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Mathematical models in hydrodynamics are introduced to describe the motion of fluids. The basic equations for Newtonian incompressible fluids are the Euler and the Navier–Stokes equations, for inviscid and viscous fluids, respectively. For a given set of body forces acting on the fluid, these nonlinear partial differential equations (PDEs) model the evolution in time of the velocity and pressure at each point of the fluid, given the initial velocity and suitable boundary conditions (see Partial Differential Equations: Some Examples). The equations of hydrodynamics offer challenging mathematical problems, like proving the existence and uniqueness of solutions, determining their regularity, their asymptotic behavior for large time, and their stability. To gain some insight into the behavior of fluids, stochastic analysis is introduced into hydrodynamics. In fact, there are various attempts to describe turbulent regime (see Turbulence Theories). But, analyzing individual solutions that determine the flow at any time, for a given initial condition, is a desperate task, since the dynamics in a turbulent regime is chaotic and highly unstable. This is a particular chaotic motion with some characteristic statistical properties (see Monin and Yaglom (1987)). The aim of a statistical description of turbulent flow is to single out some relevant collective properties of the flow that, hopefully, make it possible to grasp the salient features of the dynamics. In this sense, stochastic hydrodynamics is germane to the kinetic gas theory. In the next section we shall review a typical topic of stochastic hydrodynamics, the evolution of probability measures. Results on stationary probability measures will be given in the subsequent sections. Another characteristic of turbulent flows is the lack of space regularity of the velocity field. We shall introduce in the section "The stochastic Navier–Stokes equations" a stochastic model of turbulence, which exhibits lack of regularity of the solutions.

The Euler equations are a singular limit of the Navier–Stokes equations, since they are first order, instead of second-order PDEs. It is little surprise if they involve different mathematical techniques. A full section will be devoted to a discussion of Euler equations and another to the Navier–Stokes equations. Statistics of an inviscid flow, when approximated by vortex motion, will be described in the final section.

Statistical Solutions

Let $u(t, x)$ be the fluid velocity at time $t$ and point $x \in D \subset \mathbb{R}^d$; since the initial velocity is always affected by experimental errors, it is reasonable to assign a measure $\mu$ determining the probability that the initial velocity belongs to a Borel set $\omega$ of the space $H$ of all admissible velocity fields $u = u(x)$.

A spatial statistical solution is a family of probability measures $\mu(t, \omega)$, $t \ge 0$, each supported
72 Stochastic Hydrodynamics

on the set $H$ such that, given any Borel set $\Gamma$ in $H$, we have

$$\mathrm{Prob}\{u(t,x) \in \Gamma\} = \mu(t,\Gamma), \qquad \forall t > 0 \qquad [1]$$

with the initial condition $\mu(0,\Gamma) = \mu(\Gamma)$. The construction and analysis of statistical solutions $\mu(t,\cdot)$ is one of the crucial mathematical problems in stochastic hydrodynamics (see, e.g., Vishik and Fursikov (1988)).
Hopf gave the first mathematical formulation of the problem of describing turbulent flows by statistical solutions. The first result on the existence of statistical solutions is by Foias in 1973. Hopf (1952) presented an equation in variational derivatives satisfied by the characteristic functional $\chi(t,\xi)$ of the family of measures $\mu(t,\cdot)$ associated with the Navier–Stokes equations. The characteristic functional $\chi(t,\xi)$ is the Fourier transform of the measure $\mu(t,\cdot)$:

$$\chi(t,\xi) = \int_H e^{i\langle \xi, u\rangle}\, \mu(t,du) \qquad [2]$$

defined for any smooth test function $\xi$.
We now derive the evolution equation for $\chi(t,\xi)$, by assuming that the dynamics takes place in the phase space $H$ and follows the nonlinear equation

$$\frac{du}{dt} = F(u) \qquad [3]$$

If $u^v(t)$ is the solution started from $v$ at time $t = 0$, then its probability distribution is represented by the time-evolved measure $\mu(t,\cdot)$. Therefore, we have that

$$\int_H e^{i\langle \xi, u\rangle}\, \mu(t,du) = \int_H e^{i\langle \xi, u^v(t)\rangle}\, \mu(0,dv) \qquad [4]$$

Differentiating in time, we obtain

$$\frac{d}{dt}\chi(t,\xi) = \int_H e^{i\langle \xi, u^v(t)\rangle}\, i\langle \xi, F(u^v(t))\rangle\, \mu(0,dv) = i\int_H e^{i\langle \xi, v\rangle}\, \langle \xi, F(v)\rangle\, \mu(t,dv) \qquad [5]$$

The last integral is uniquely determined by $\chi$, since the measure $\mu(t,\cdot)$ is uniquely determined by $\chi(t,\cdot)$. We denote by $\Phi(t,\xi)$ the last integral in [5]. The evolution equation thus obtained for the characteristic functional $\chi$ is

$$\frac{d}{dt}\chi(t,\xi) = i\,\Phi(t,\xi), \qquad \forall \xi \qquad [6]$$

This is called the Hopf equation associated with the dynamical system [3].
Another way to analyze the evolution of measures is through the moments; instead of the measure $\mu(t,\cdot)$ describing the spatial statistical solution, we deal with the moments of $\mu(t,\cdot)$ of any order. For a nonlinear dynamics [3], the moments equations are an infinite chain of coupled equations, the so-called Friedman–Keller equations.
A prominent role among statistical solutions is played by stationary solutions. They contain all the statistical information in the case of equilibrium in time. The characteristic functional of an invariant measure is constant in time. Therefore,

$$\frac{d}{dt}\chi(t,\xi) = 0$$

Bearing in mind equation [5], this is equivalent to saying that the signed measure $\langle \xi, F(v)\rangle\, \mu(t,dv)$ vanishes, for any test function $\xi$ and time $t$. Setting $t = 0$, we obtain that an invariant measure $\mu$ in the space $H$ satisfies the Liouville equation

$$\int_H \langle \phi(v), F(v)\rangle\, d\mu(v) = 0 \qquad [7]$$

for appropriate test functions $\phi$. This equation is also called the relation of infinitesimal invariance and the measure $\mu$ is said to be infinitesimally invariant.
The stationary measures are natural candidates to describe the statistical asymptotic behavior of the system when $t \to \infty$. Notice that, in a chaotic system, two motions that are arbitrarily close to one another at $t = 0$ can evolve in completely different ways. So, to describe the dynamics satisfactorily, we take averages over a large number of experiments. This is the so-called ensemble average. These averages are assumed to be with respect to an invariant measure $\mu$. The invariant measures must exist and either they are unique or at most one has physical meaning and enters in the functional integral defining the ensemble average. According to the ergodic principle (an assumption not yet proved in hydrodynamics), ensemble averages replace long-time averages: for every initial velocity field $v$, except for a set of initial values negligible in some sense, the time average of an observable tends, as time goes to infinity, to the ensemble average

$$\lim_{T\to\infty} \frac{1}{T}\int_0^T \phi(u^v(t))\, dt = \int_H \phi\, d\mu \qquad [8]$$

However, it is extremely difficult to prove the existence of stationary probability measures for the Navier–Stokes equations by solving equation [7] directly. The situation is formally the same as in equilibrium statistical mechanics, where the Liouville equation is in fact solved, leading to the Boltzmann–Gibbs distribution. However, the results in statistical hydrodynamics are far from being satisfactory.
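The ergodic principle [8] can be illustrated on a toy problem far simpler than the Navier–Stokes equations. The following is a minimal sketch, not a hydrodynamic computation: it assumes a one-dimensional Ornstein–Uhlenbeck process $dX = -X\,dt + \sqrt{2}\,dW$, whose invariant measure is the standard Gaussian, and compares the time average of the observable $\phi(x) = x^2$ along a single Euler–Maruyama path with the ensemble average of $\phi$ against that invariant measure. The process, the observable, the step size, and all function names are illustrative assumptions that do not come from the text.

```python
import math
import random


def time_average(phi, total_time=2000.0, dt=0.01, x0=3.0, seed=0):
    """Approximate (1/T) * integral_0^T phi(X_t) dt along one sample path.

    The path is an Euler-Maruyama discretization of the toy model
    dX = -X dt + sqrt(2) dW (an assumption made for illustration only).
    """
    rng = random.Random(seed)
    n_steps = round(total_time / dt)
    noise_scale = math.sqrt(2.0 * dt)
    x = x0
    running_sum = 0.0
    for _ in range(n_steps):
        running_sum += phi(x)
        x += -x * dt + noise_scale * rng.gauss(0.0, 1.0)
    return running_sum / n_steps


def ensemble_average(phi, cutoff=8.0, n_cells=4000):
    """Integrate phi against the standard Gaussian density (the invariant
    measure of the toy model) with the trapezoidal rule on [-cutoff, cutoff]."""
    h = 2.0 * cutoff / n_cells
    total = 0.0
    for i in range(n_cells + 1):
        x = -cutoff + i * h
        weight = 0.5 if i in (0, n_cells) else 1.0
        total += weight * phi(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)


def observable(x):
    """Hypothetical observable phi(x) = x^2."""
    return x * x


if __name__ == "__main__":
    print("time average    :", time_average(observable))
    print("ensemble average:", ensemble_average(observable))
```

With these default parameters the two numbers should agree up to the statistical error of a single long run. The starting point `x0` only affects a short transient, which mirrors the statement that the time average does not depend on the initial datum outside a negligible set.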
Recent studies to prove the existence of invariant measures for the Navier–Stokes equations are based on stochastic models (see the section “The stochastic Navier–Stokes equations”). On the other hand, for the Euler equations it is possible to construct formally invariant measures, by means of invariant quantities of the classical motion (see the next section).
Finally, we point out that there are techniques using invariant measures to show some results for the time evolution (e.g., the motion exists for almost all initial values with respect to an invariant measure).

The Euler Equations

We start by recalling some basic facts on Euler equations (see Incompressible Euler Equations: Mathematical Theory).
The motion of an inviscid, incompressible, and homogeneous fluid is described by the Euler equations, which in Eulerian coordinates read as

$$\begin{aligned} \frac{\partial u}{\partial t} + (u\cdot\nabla)u + \nabla p &= f && \text{in } D \\ \nabla\cdot u &= 0 && \text{in } D \\ u\cdot n &= 0 && \text{on } \partial D \end{aligned} \qquad [9]$$

where, at time $t \ge 0$ and position $x \in D$, $u = u(t,x)$ is the vector velocity, $p = p(t,x)$ the hydrodynamic pressure. The units have been chosen so that the mass density $\rho = 1$. $\nabla$ denotes the nabla vector operator so

$$u\cdot\nabla = \sum_{j=1}^d u_j \frac{\partial}{\partial x_j}, \qquad \nabla\cdot u = \sum_{j=1}^d \frac{\partial u_j}{\partial x_j}, \qquad \nabla p = \left(\frac{\partial p}{\partial x_1}, \ldots, \frac{\partial p}{\partial x_d}\right)$$

Finally, $f$ denotes the external force. If the spatial domain $D$ has a boundary $\partial D$, then the velocity is assumed to be tangent to the boundary ($n$ denotes the exterior normal vector to the boundary). Some initial condition $u_0$ at time $t = 0$ is assigned.
When $f = 0$, there are invariant quantities for system [9]. In the literature, there are many works suggesting a Gaussian stationary statistics (see, e.g., the paper by Kraichnan and Montgomery (1980)). We consider invariants that are quadratic in the velocity so as to construct (formally) invariant measures of Gibbs type: the energy

$$E(u) := \frac{1}{2}\int_D |u|^2\, dx$$

and, only in the two-dimensional case ($d = 2$), the enstrophy

$$S(u) := \frac{1}{2}\int_D |\mathrm{curl}\, u|^2\, dx$$

(with $\mathrm{curl}\, u = \nabla^\perp\cdot u \equiv \partial u_2/\partial x_1 - \partial u_1/\partial x_2$ for $d = 2$).
It is natural to look for velocity fields in the following function spaces: the space $H^0$ of finite kinetic energy and the space $H^1$ of finite enstrophy. Clearly, the admissible fields should also obey the boundary conditions and divergence-free condition. If $P$ is the projection operator onto the space of divergence-free vectors, and $B$ is the bilinear form $B(u,v) := -P[(u\cdot\nabla)v]$, the Euler equations can be given the structure of an evolution,

$$\frac{du}{dt} = B(u,u) \qquad [10]$$

obtained by applying the projection operator $P$ to the first equation in [9]. The pressure disappears and can be regarded as a Lagrange multiplier associated with the divergence-free constraint ($\nabla\cdot u = 0$); it can be fully recovered once the velocity field is known.
The dynamics is considered in the phase space of divergence-free velocity vectors $H$ (a large space containing $H^0$ and $H^1$), which is an infinite-dimensional functional space. More precisely, identifying $H^0$ with its dual $(H^0)'$, we introduce the Gelfand triplet

$$H^1 \subset H^0 \subset (H^1)' = H^{-1}$$

The spaces $H^\sigma$, with $\sigma = 1, 2, \ldots$, are the usual Sobolev spaces but with the additional divergence-free and boundary conditions. For $\sigma > 0$ noninteger, the spaces $H^\sigma$ are defined by interpolation, whereas those with $\sigma < 0$ by duality. As usual, regularity in space is related to the spaces $H^\sigma$ with higher exponent $\sigma$. We have that $H = \cup_{\sigma\in\mathbb{R}} H^\sigma$.
Invariance of $E$ and $S$ can be proved resorting to eqn [9] and assuming that $u$ is a smooth vector field. For instance,

$$\frac{d}{dt}E(u(t)) = \frac{d}{dt}\frac{1}{2}\int_D |u|^2\, dx = \int_D u\cdot\frac{\partial u}{\partial t}\, dx = -\int_D u\cdot[(u\cdot\nabla)u]\, dx - \int_D u\cdot\nabla p\, dx$$
By integrating by parts and bearing in mind the divergence-free condition and the boundary condition, we conclude that

$$\frac{d}{dt}E(u) = 0$$

In the same way, the invariance of $S$ can be proved.
As a consequence, the following Gibbs measures, which are defined on the space $H$,

$$\mu_E(du) = \frac{1}{Z_E}\, e^{-E(u)}\, du, \qquad \mu_S(du) = \frac{1}{Z_S}\, e^{-S(u)}\, du \qquad [11]$$

are heuristically invariant in time. In [11], $Z_\cdot$ are the partition functions, that is, they are normalization constants needed to guarantee that $\mu_E$ and $\mu_S$ are genuine probability measures (e.g., $Z_E = \int_H e^{-E(u)}\, du$).
Actually, these measures $\mu$ solve the Liouville equation

$$\int_H \langle \phi(u), B(u,u)\rangle\, d\mu(u) = 0 \qquad [12]$$

for any test function $\phi$, cylindrical, infinitely differentiable, bounded, and with bounded derivatives. On the other hand, the (global and not only infinitesimal) invariance means that if there exists a global flow in time which is well defined in a phase space of full measure $\mu$, then the measure $\mu$ is invariant under this dynamics. The measures $\mu_E$ and $\mu_S$ are centered Gaussian measures whose support is in a space larger than $H^0$, as can be proved by standard methods in the theory of Gaussian measures on infinite-dimensional spaces. By the very definition, $\mu_E$ is a cylindrical measure in $H^0$ and $\mu_S$ is cylindrical in $H^1$. Then the support of $\mu_E$ is any Hilbert space $\tilde H$ such that $H^0 \subset \tilde H$ is a Hilbert–Schmidt embedding, and the support of $\mu_S$ is any space $\tilde H$ such that $H^1 \subset \tilde H$ is a Hilbert–Schmidt embedding. When the spatial dimension $d$ is 2, $\mathrm{supp}(\mu_E) = \cap_{\sigma<-1} H^\sigma$ and $\mathrm{supp}(\mu_S) = \cap_{\sigma<0} H^\sigma$. When $d$ is 3, $\mathrm{supp}(\mu_E) = \cap_{\sigma<-3/2} H^\sigma$.
Moreover, $\mu_E(H^0) = \mu_S(H^0) = 0$, that is, the space of finite energy $H^0$ is negligible with respect to these measures. Let us show this property for the “enstrophy measure” $\mu_S$ when $d = 2$. Let $\{e_j\}_{j=1}^\infty$ be a complete orthonormal system in $H^0$. Hence, for $u = \sum_j u_j e_j$, we have $\|u\|^2_{H^0} = \sum_j |u_j|^2$ and $\|u\|^2_{H^1} = \sum_j \lambda_j |u_j|^2$ (with $0 < \lambda_1 \le \lambda_2 \le \cdots$ and $\lambda_j \sim j$ as $j \to \infty$). Keeping in mind its definition, the measure $\mu_S$ can be considered as a measure on the space of the sequences $\{u_j\}_j$ and written as an infinite product of one-dimensional centered Gaussian measures

$$\mu_S(du) = \otimes_j \frac{1}{\sqrt{2\pi\lambda_j^{-1}}}\, e^{-(\lambda_j/2)|u_j|^2}\, du_j \qquad [13]$$

The energy is

$$E(u) = \frac{1}{2}\sum_j |u_j|^2$$

and the renormalized energy is

$$:E:(u) = \frac{1}{2}\sum_j \left( |u_j|^2 - \int |u_j|^2\, \mu_S(du) \right)$$

Since, as can be easily shown, $\int (:E:(u))^2\, \mu_S(du) < \infty$, $:E:(u)$ is finite for $\mu_S$-almost every $u$. On the contrary, since $\sum_j \int |u_j|^2\, \mu_S(du) = \sum_j \lambda_j^{-1} = +\infty$, $E(u)$ is infinite for $\mu_S$-almost every $u$.
We also note in passing that, for any $\gamma > 0$ and $\beta > -\gamma$,

$$\int_H e^{-\beta :E:(u)}\, \frac{e^{-\gamma S(u)}}{Z}\, du < \infty$$

so that

$$\mu_S^{(\beta),(\gamma)}(du) = \frac{e^{-\beta :E(u): - \gamma S(u)}\, du}{\int e^{-\beta :E(u): - \gamma S(u)}\, du} \qquad [14]$$

is a probability measure, which is infinitesimally invariant for the Euler flow.
Since the space of finite-energy velocity is negligible with respect to these measures, it is necessary to replace the classical solutions having finite energy with generalized solutions. This is not an easy task in the three-dimensional case, whereas some results have been proved for the two-dimensional problem, where the following existence result holds. Let us analyze the quadratic term $B(u,u) = -P[(u\cdot\nabla)u]$. The term $(u\cdot\nabla)u$ can be rewritten as $\nabla\cdot(u\otimes u)$, taking into account the divergence-free condition. Trivially, we have that $\nabla\cdot(u\otimes u) = \nabla\cdot(u\otimes u - :u\otimes u:)$, where $:u\otimes u: = \int u\otimes u\, \mu_S(du)$. We consider the quadratic expression $(u\otimes u - :u\otimes u:)$. This is integrable with respect to the measure $\mu_S$ in the sense that

$$\int \| u\otimes u - :u\otimes u: \|^2_{H^{-\varepsilon}}\, \mu_S(du) < \infty \qquad [15]$$

for any $\varepsilon > 0$. We remark that this property is similar to the integrability of the renormalized energy, which is a quadratic expression as well. This implies that the $H^{-1-\varepsilon}$-norm of $\nabla\cdot(u\otimes u)$ is integrable with respect to the measure $\mu_S$. Therefore, $B(u,u)$ is defined for $\mu_S$-a.e. $u$.
Now, let us replace eqn [10] with an infinite system of equations for all the components $u_j$ with respect to the orthonormal basis $\{e_j\}_j$, obtained by taking the scalar product with $e_j$ of both sides of eqn [10]:

$$\frac{du_j}{dt} = B_j(u,u), \qquad j = 1, 2, \ldots \qquad [16]$$
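The contrast between the divergent energy and the finite renormalized energy under $\mu_S$ can be checked numerically from the product structure [13]. The sketch below rests on assumptions that are not in the text: the domain is the flat torus $[0,2\pi]^2$, the modes are labeled by wave vectors $k \in \mathbb{Z}^2\setminus\{0\}$ with $\lambda_k = |k|^2$, and the coordinates $u_k$ are independent real Gaussians with variance $1/\lambda_k$. Under these assumptions the mean of the truncated energy is $\frac{1}{2}\sum_k \lambda_k^{-1}$ and the variance of the truncated renormalized energy is $\frac{1}{2}\sum_k \lambda_k^{-2}$. The function name is hypothetical.

```python
def truncated_sums(cutoff):
    """Return (mean_energy, var_renormalized_energy) over modes 0 < |k| <= cutoff.

    Assumes eigenvalues lambda_k = |k|^2 on the torus [0, 2*pi]^2 and
    independent real Gaussian coordinates u_k with variance 1 / lambda_k
    (illustrative assumptions, not taken from the text).
    """
    mean_energy = 0.0
    var_renormalized = 0.0
    for kx in range(-cutoff, cutoff + 1):
        for ky in range(-cutoff, cutoff + 1):
            lam = kx * kx + ky * ky
            if lam == 0 or lam > cutoff * cutoff:
                continue
            mean_energy += 0.5 / lam               # E[u_k^2] / 2
            var_renormalized += 0.5 / (lam * lam)  # Var(u_k^2) / 4
    return mean_energy, var_renormalized


if __name__ == "__main__":
    for cutoff in (8, 16, 32, 64):
        mean_energy, var_renormalized = truncated_sums(cutoff)
        print(f"|k| <= {cutoff:3d}: mean energy = {mean_energy:8.4f}, "
              f"variance of renormalized energy = {var_renormalized:.6f}")
```

The first column keeps growing, roughly like $\pi \log K$ in the cutoff $K$, while the second settles to a finite limit, in line with $E(u) = +\infty$ and $:E:(u)$ finite for $\mu_S$-almost every $u$.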
Each component $B_j(u,u)$ is defined for $\mu_S$-a.e. $u$. These estimates lead to the definition of a weak solution (see Albeverio and Cruzeiro (1990)):

Theorem 1 Let $d = 2$. There exists a flow $U(t,\omega)$ defined on a probability space $(\Omega, \mathcal{F}, P)$ with values in $H^{-1-\varepsilon}$ for any $\varepsilon > 0$, $U(\cdot,\omega) \in C(\mathbb{R}, H^{-1-\varepsilon})$ $P$-a.e. $\omega$, such that for each component $U_j$ we have

$$U_j(t,\omega) = U_j(0,\omega) + \int_0^t B_j(U(s,\omega), U(s,\omega))\, ds, \qquad P\text{-a.e. } \omega,\ \forall t \in \mathbb{R}$$

Moreover, the measure $\mu_S$ is invariant under this flow.

We point out that uniqueness is an open problem also for $d = 2$. But already in the classical analysis of the Euler equations in a bounded domain, uniqueness for initial velocity of finite energy is not known. Working with the measure $\mu_E$ is even worse, especially when $d = 3$, because its support is a larger space within which more irregular velocity vectors live. The more irregular the spaces where the flow lives, the more difficult it is to handle the nonlinear term $B(u,u)$.
On the other hand, for $d = 1$, the mathematical analysis is much easier. For instance, it can be proved (see Robert (2003)) that the one-dimensional inviscid Burgers equation on the line

$$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\left(\frac{1}{2}u^2\right) = 0 \qquad [17]$$

has an intrinsic invariant statistical solution, given by a class of Lévy processes with negative jumps.

The Stochastic Navier–Stokes Equations

The Navier–Stokes equations describe advection with velocity $u$ and diffusion with kinematic viscosity $\nu > 0$ (see Viscous Incompressible Fluids: Mathematical Theory)

$$\begin{aligned} \frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p &= f && \text{in } D \\ \nabla\cdot u &= 0 && \text{in } D \\ u &= 0 && \text{on } \partial D \end{aligned} \qquad [18]$$

where $\Delta$ is the Laplace operator. Nonslip boundary conditions are assumed. Although the Euler equations [9] are formally obtained from [18] by setting $\nu = 0$, the presence of the second-order operator $\Delta$ makes the analysis needed to prove the existence, uniqueness, and regularity of solutions easier than for the Euler equations. However, at variance with the Euler equations, the Navier–Stokes equations do not possess invariants, since the viscosity dissipates energy. Hence, it is difficult to find explicit expressions of invariant measures for the deterministic Navier–Stokes equations, except the trivial invariant measures concentrated on a stationary solution. However, as soon as a stochastic force is introduced in these equations, it is possible to have nontrivial invariant measures. It is impossible to review here the wide literature concerning the stochastic Navier–Stokes equations and we confine ourselves to a few remarks. Most results are concerned with proving the existence and/or uniqueness of an invariant measure $\mu$, without giving an explicit representation, apart from some attempts like Gallavotti (2002), where a formal representation of stationary distributions is given in terms of functional integrals. Some properties of these non-explicit invariant measures are known, for instance, estimates of moments and exponential convergence of the statistical solution for large time.
Stochastic forces can enter in the Navier–Stokes equations in different ways. We can consider randomness in the forcing term, so that the force $f$ in [18] has a deterministic component, which represents its slowly varying mean, and a stochastic one, which accounts for small, very rapidly varying fluctuations around the mean. Alternatively, since the molecules are not rigidly connected to one another in the fluid, they are subjected to fluctuations. A complete description of fluctuations relating the microscopic and macroscopic motion is not achieved at present. However, we shall introduce some models for which rigorous mathematical results can be proved.
The first part of this section concerns the Navier–Stokes equations with noise $n$:

$$\begin{aligned} \frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p &= n \\ \nabla\cdot u &= 0 \end{aligned} \qquad [19]$$

for which invariant measures exist, one of which can be ergodic provided that the noise is suitably chosen. In the second part, a Navier–Stokes-type stochastic system is described, which has irregular solutions, as expected in turbulence.
Let us introduce the stochastic Navier–Stokes equations with time white noise. The first equation in [19] is an Itô equation:

$$\partial_t u + [-\nu\Delta u + (u\cdot\nabla)u + \nabla p] = \partial_t w \qquad [20]$$

Here $w = (w^{(1)}, \ldots, w^{(d)})$ is a Brownian motion, that is, its time derivative $n = \partial w/\partial t$ is a Gaussian
stochastic field with zero mean and correlation function given by

$$\mathbb{E}[n^{(j)}(t,x)\, n^{(k)}(t',x')] = \delta_{jk}\, q(x - x')\, \delta(t - t') \qquad [21]$$

for $j, k = 1, \ldots, d$.
We shall use the differential form for the Itô equation [20], always understood in the integral form

$$u(t) - u(0) + \int_0^t [-\nu\Delta u(s) + (u(s)\cdot\nabla)u(s) + \nabla p(s)]\, ds = w(t) \qquad [22]$$

Modeling perturbations by a white noise process represents the first step to understand how a random perturbation acts in the mathematical equations, rather than a good physical or numerical model. The first results are in a paper by Bensoussan and Temam (1973).
Obviously, the regularity of the solutions depends on the spatial covariance $q$ of the noise.
Let us consider the following cases.

$q = \delta$: the noise is white also in space.
An invariant measure is known explicitly. Indeed, assume periodic boundary conditions on the square ($d = 2$) or the cube ($d = 3$) $D$, which makes the spatial domain a torus. In this case, the Euler and Navier–Stokes equations are set in the same functional spaces. The generator of the stochastic Navier–Stokes equations [20] corresponds to the sum of the generator of the Euler equations [9] and of the stochastic Stokes equations

$$\begin{aligned} \partial_t u &= [\nu\Delta u - \nabla p] + \partial_t w \\ \nabla\cdot u &= 0 \end{aligned} \qquad [23]$$

Since the first equation in [23] is linear in the unknown velocity $u$, the Stokes system has a unique invariant measure which is a centered Gaussian measure. In particular, when the noise is a space-time white noise and $d = 2$, this is the invariant measure [14] of the enstrophy:

$$\mu_S^{(0),(2\nu)}(du) = \frac{1}{Z}\, e^{-2\nu S(u)}\, du$$

On a bidimensional torus, it is proved that this measure is not only infinitesimally invariant, but also globally invariant for a unique flow [20] defined for $\mu_S^{(0),(2\nu)}$-a.e. initial velocity. We recall that initial velocities of finite energy are negligible with respect to the measure $\mu_S^{(0),(2\nu)}$.

$q$ more regular than above, that is, the noise is colored in space.
As soon as the forcing term is more regular in space, the Navier–Stokes system has a solution of finite energy. These are solutions close to those of the deterministic equation. Techniques similar to those used to prove the existence and/or uniqueness of solutions for the deterministic equations work also in the stochastic case with an additive noise (or even a multiplicative noise) to get weak or strong solutions. Global existence in the space $H^0$ is proved for $d = 2, 3$ and uniqueness only for $d = 2$, as is the case for the deterministic Navier–Stokes equations. The interesting feature is that by adding a noise which acts on all the components with respect to a Hilbert basis (or at least on many components), the stochastic Navier–Stokes system has a unique invariant measure, which is ergodic. This is proved for the spatial dimension $d = 2$. By means of the Krylov–Bogoliubov method, existence of at least one invariant measure is proved by compactness of a family of averaged measures; the limit measures are stationary measures. But, when many modes are perturbed by a noise, there is a mixing effect on the dynamics, which prevents the existence of many stationary measures. For the spatial dimension $d = 2$, the best result in this context is in Hairer and Mattingly (2004), where the noise acts on very few modes. For the spatial dimension $d = 3$, the result in Da Prato and Debussche (2003) shows the existence of an invariant measure; even though uniqueness of the solutions is not known (as in the deterministic case), by a selection principle, they construct a transition semigroup, which has a unique invariant measure, ergodic and strongly mixing.
Mathematical proofs are given for very different noises. (The reader is urged to consult, among others, the papers by E, Mattingly and Sinai; Flandoli and Maslowski; Mikulevicius and Rozovskii; Vishik and Fursikov. The latter authors also study statistical solutions in two and three dimensions. For a kick noise $n = \sum_k \delta(t - k)\, q_k(x)$ in equations [19], there are results for $d = 2$ by Bricmont, Kupiainen and Lefevere; Kuksin and Shirikyan.)
We conclude that, as far as invariant measures and their ergodicity are concerned, the stochastic Navier–Stokes equations have richer results than the deterministic Navier–Stokes equations. It is appealing to investigate the limit as the intensity of the noise goes to zero, so as to recover the deterministic equation. Now, think of equation [19] with a noise $\varepsilon n$, for $n$ fixed and $\varepsilon \to 0$. Due to the sensitive dependence on initial conditions, even a small noise may have important effects on the dynamics. A conjecture by Kolmogorov is that the unique invariant measure $\mu_\varepsilon$ tends, when $\varepsilon \to 0$, to a specific measure, the so-called Kolmogorov measure, which
would enter into the ergodic principle. This is a difficult problem, not yet solved.
We also mention the analysis of the inviscid limit. Kuksin (2004) showed that the solution $u_\nu$ of the two-dimensional stochastic Navier–Stokes equations

$$\frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p = \sqrt{\nu}\, n, \qquad 0 < \nu \ll 1 \qquad [24]$$

on the torus converges in distribution to a stationary solution of the Euler equations. Here $n$ is a random force white in time and smooth in space. More precisely, for each subsequence $u_{\nu_j}$,

$$\lim_{\nu_j \to 0}\, \lim_{T\to\infty} u_{\nu_j}(T + t) = U(t) \qquad [25]$$

and almost every trajectory of the nontrivial limit process $U$ solves the Euler equations [9] without the forcing term. Moreover, the process $U$ keeps memory of some features of the noise force $n$, since the mean values of the enstrophy and of the energy of $U$ depend on the noise $n$.
We now present the second part on stochastic models for viscous fluids. In his 1884 paper, Reynolds introduced the decomposition of turbulent flow into mean and fluctuating flows. The equations obtained are difficult to study. We shall show now a tractable model for a one-dimensional problem ($d = 1$) with a suitable model of fluctuations.
Decompose the velocity field into the sum of a mean flow $\bar u$ and a fluctuation $u'$

$$u = \bar u + u'$$

The fluctuation is assumed to be highly irregular; it is reasonable to model it by a stochastic process. If we choose

$$u' = b\, \frac{dw}{dt}$$

where $b$ is a given velocity field and $dw/dt$ is white noise, then the motion of the fluid is governed by a stochastic equation of Itô type. Indeed, the Navier–Stokes equations are balance equations of linear momentum:

$$\frac{Du}{Dt} = \nu\Delta u - \nabla p \qquad [26]$$

where $Du/Dt$ is the material time derivative along the trajectory of a particle which is at time $t$ in position $x(t)$ moving with velocity $u$ (so $u(x(t)) = (dx/dt)(t)$):

$$\frac{Du}{Dt} = \frac{d}{dt}u(t, x(t)) = \frac{\partial u}{\partial t} + (u\cdot\nabla)u \qquad [27]$$

According to the mathematical model for the fluctuation, we have

$$dx(t) = \bar u(t, x(t))\, dt + b(x(t))\, dw(t) \qquad [28]$$

Therefore, $D\bar u$ is computed by means of Itô's formula

$$D\bar u(t, x(t)) = \frac{\partial \bar u}{\partial t}\, dt + \sum_{k=1}^d \frac{\partial \bar u}{\partial x_k}\, dx_k(t) + \frac{1}{2}\sum_{k,s=1}^d \frac{\partial^2 \bar u}{\partial x_k \partial x_s}\, b_k b_s\, dt \qquad [29]$$

This leads to the stochastic Navier–Stokes-type equations (we neglect the overline symbol)

$$\begin{aligned} d_t u + \left[-\nu\Delta u + (u\cdot\nabla)u + \nabla p + \tfrac{1}{2} Q u\right] dt &= -(b\cdot\nabla)u\, dw(t) \\ \nabla\cdot u &= 0 \end{aligned} \qquad [30]$$

where $Q$ is the second-order differential operator given by the last term in [29].
Rigorous mathematical results for the above equations have been proved for the one-dimensional case, that is, the Burgers equations on the line. Given an initial velocity of finite energy $u_0 \in H^0$, there exists a unique solution $u \in C([0,T]; H^0) \cap L^2(0,T; H^1)$ ($P$-a.s.). But it can be shown that for a more regular initial velocity there is no higher regularity of the solution of eqn [30], if $b \ne 0$. This means that these stochastic Burgers equations cannot have too regular nontrivial solutions, as expected in turbulent motion.

Statistics of Vortices and Bidimensional Turbulence

Onsager (1949) proposed to investigate bidimensional turbulent flows, extending in a rigorous way to hydrodynamics the statistical mechanics approach of Boltzmann. If we are interested in flows of finite energy, the results of the section “The Euler equations” provide no answer to the problem.
Another way to proceed is by approximating the Euler equations in a suitable way. Actually, in a two-dimensional turbulent flow, there appears a large-scale organization leading to coherent structures. These are hydrodynamical vortices, whose dynamics is governed by the Euler equations. Onsager suggested approximating the continuous Euler equations by a great (but finite) number of point vortices. This leads to a finite-dimensional Hamiltonian system, to which the methods of statistical mechanics can be successfully applied. Of course, the crucial point is to pass to the limit, to
recover the continuous system. But there are many different ways to approximate a continuous vorticity by a cloud of point vortices and different approximations may lead to very different statistical equilibrium states.
We present here the approach of Lions (1997). To get an idea of a completely different approximation, see, for example, Robert (2003).
Let $D$ be a bounded open smooth simply connected subset of $\mathbb{R}^2$. Then there exists a function $\psi$ (the stream function) such that $u = \nabla^\perp\psi$ and $\psi|_{\partial D} = 0$. Given the velocity $u$, we recover the stream function by means of the vorticity $\omega = \mathrm{curl}\, u = -\Delta\psi$, so $\psi(x) = \int_D g(x,y)\,\omega(y)\, dy$ (here $g$ is the Green's function of the Laplacian $-\Delta$ and $x$, $y$ are points in $D$). The Euler equations can be written as

$$\begin{aligned} \frac{\partial \omega}{\partial t} + u\cdot\nabla\omega &= 0 \\ \omega &= \mathrm{curl}\, u \end{aligned} \qquad [31]$$

Consider now a solution given by vorticity concentrated in a finite number $N$ of points:

$$\omega = \sum_{i=1}^N \alpha_i\, \delta_{x_i(t)} \qquad [32]$$

Here the vortex intensities $\alpha_i$ are real values and $x_i(t)$ are distinct points in $D$ for $i = 1, \ldots, N$. According to the Euler equations, these points evolve as follows (see also Marchioro and Pulvirenti (1994)):

$$\frac{d}{dt}x_j(t) = \nabla^\perp_{x_j} \sum_{l=1,\, l\ne j}^N \alpha_l\, g(x_j(t), x_l(t)) + \alpha_j \nabla^\perp_{x_j}\tilde g(x_j), \qquad j = 1, \ldots, N \qquad [33]$$

where $\tilde g$ is related to the Green's function $g$. This is a Hamiltonian system in $D^N$. Hereafter, we shall suppose that the vortex intensities are the same ($\alpha_i = \alpha\ \forall i$), so that the Hamiltonian is

$$H(x_1, \ldots, x_N) = \frac{1}{2}\sum_{l,j=1,\, l\ne j}^N g(x_j, x_l) + \sum_{j=1}^N \tilde g(x_j) \qquad [34]$$

By means of $H$, we define the canonical Gibbs measure

$$\mu_N(dx_1\, dx_2 \cdots dx_N) = \frac{1}{Z(N)}\, e^{-\tilde\beta H(x_1, \ldots, x_N)}\, dx_1\, dx_2 \cdots dx_N \qquad [35]$$

where $Z(N)$ is the partition function. If $Z(N) < \infty$, then $\mu_N$ is a well-defined probability measure on $D^N$ and, by construction, it is an invariant measure for system [33]. We can prove that $Z(N)$ is finite for $\tilde\beta \in (-8\pi/N, 4\pi)$, so that it is natural to choose as a scaling $\tilde\beta N = \beta$. Hence,

$$\mu_N(dx_1\, dx_2 \cdots dx_N) = \frac{1}{Z(N)}\, e^{-(\beta/N) H}\, dx_1\, dx_2 \cdots dx_N \qquad [36]$$

is considered for $-8\pi < \beta \le 0$, or $\beta > 0$ with $N > \beta/4\pi$.
Bearing in mind the Onsager approach to approximate the turbulent Euler motion by means of point vortices, we are interested in the limit as $N$ goes to $+\infty$, for fixed $\beta$ in $(-8\pi, +\infty)$. It turns out that, when the number of point vortices becomes very large, their statistical behavior corresponds to a very large number of independent particles moving in a mean force field that they create.
More precisely, consider $\alpha = 1/N$, $\tilde\beta = \beta$. The empirical measure

$$\frac{1}{N}\sum_{i=1}^N \delta_{x_i(t)}$$

describing the vorticity, weakly converges to a probability density $\rho$ and each correlation function

$$\rho_N^j(x_1, \ldots, x_j) = \int_D \cdots \int_D dx_{j+1} \cdots dx_N\, \frac{1}{Z(N)}\, e^{-(\beta/N) H}, \qquad j = 1, \ldots, N - 1 \qquad [37]$$

weakly converges to $\otimes_{i=1}^j \rho = \prod_{i=1}^j \rho(x_i)$.
The equation satisfied by $\rho$, also called the mean-field equation, is

$$\rho(x) = \frac{e^{-\beta U(x)}}{\int_D e^{-\beta U(y)}\, dy}, \qquad \text{with } U(x) = \int_D g(x,y)\,\rho(y)\, dy \qquad [38]$$

The relation between $U$ and $\rho$ can also be written as $-\Delta U = \rho$ in $D$, $U = 0$ on $\partial D$. We point out that $u = \nabla^\perp U$ is a stationary solution of the Euler equations. Indeed, $\omega = -\Delta U = \rho$ and $\rho$ is a function of $U$, let us say $\rho = F(U)$. This gives that $\nabla\omega = \nabla U\, F'(U)$ and thus, since $\nabla^\perp U\cdot\nabla U = 0$, the term $u\cdot\nabla\omega$ in the Euler equation [31] vanishes.
It can be proved that there exists a solution of the mean-field equation when $\beta \ge 0$ or when $\beta < 0$ and $D$ is simply connected. Uniqueness is known in some cases, for instance, when $D$ is a bounded open smooth simply connected domain and the velocity is assumed tangent to the boundary.
There is numerical evidence supporting this approximation approach (see references in Lions (1997) referring to the periodic case). They show that for
large time and large Reynolds number (viscosity $\nu$ close to 0), the vorticity of the solution of the Navier–Stokes equations appears in a simple and organized structure. This stays intact until the viscous dissipation damps it. The important observation is that the organized structure is described quite precisely by the solution of the mean-field equation for some specific $\beta$.
Actually, to say that a fluid is inviscid is an approximation (which may be justified in many contexts), since every fluid displays some kind of viscosity. But turbulence is a phenomenon occurring at very small viscosity. In this sense, the above result provides a description of stationary regime in an ideal fluid, which is a good approximation of some numerical simulations of real fluids. Besides this good agreement with numerical simulations, there is no proof on how to deduce the mean-field equation from the Euler equations (e.g., which parameter $\beta$ has to be chosen in eqn [38]?).

Remark The extension of this analysis to three-dimensional flows involves vortex filaments, instead of point vortices. There are attempts to describe interacting vortex filaments as proposed by Chorin. Idealizations of behavior of vortices are introduced to have a tractable mathematical model. The reader is referred to Lions (1997) for a description of nearly parallel vortex filaments and to Flandoli and Bessaih (2003) for more realistic filaments which fold.

See also: Cauchy Problem for Burgers-Type Equations; Hamiltonian Fluid Dynamics; Incompressible Euler Equations: Mathematical Theory; Malliavin Calculus; Non-Newtonian Fluids; Partial Differential Equations: Some Examples; Stochastic Differential Equations; Turbulence Theories; Viscous Incompressible Fluids: Mathematical Theory; Vortex Dynamics.

Further Reading

Albeverio S and Cruzeiro AB (1990) Global flows with invariant (Gibbs) measures for Euler and Navier–Stokes two dimensional fluids. Communications in Mathematical Physics 129: 431–444.
Bensoussan A and Temam R (1973) Équations stochastiques du type Navier–Stokes. Journal of Functional Analysis 13: 195–222.
Bricmont J, Kupiainen A, and Lefevere R (2001) Ergodicity of the 2D Navier–Stokes equations with random forcing. Communications in Mathematical Physics 224(1): 65–81.
Da Prato G and Debussche A (2003) Ergodicity for the 3D stochastic Navier–Stokes equations. Journal de Mathématiques Pures et Appliquées 82(8): 877–947.
E Weinan, Mattingly JC, and Sinai Y (2001) Gibbsian dynamics and ergodicity for the stochastically forced Navier–Stokes equation. Communications in Mathematical Physics 224(1): 83–106.
Flandoli F and Bessaih H (2003) A mean field result for 3D vortex filaments. In: Davies IM, Jacob N, Truman A, Hassan O, Morgan K, and Weatherill NP (eds.) Probabilistic Methods in Fluids, pp. 22–34. River Edge, NJ: World Scientific.
Flandoli F and Maslowski B (1995) Ergodicity of the 2-D Navier–Stokes equation under random perturbations. Communications in Mathematical Physics 172(1): 119–141.
Frisch U (1995) Turbulence. The Legacy of A. N. Kolmogorov. Cambridge: Cambridge University Press.
Gallavotti G (2002) Foundations of Fluid Dynamics. Berlin: Springer.
Hairer M and Mattingly JC (2004) Ergodic properties of highly degenerate 2D stochastic Navier–Stokes equations. Comptes Rendus Mathématique. Académie des Sciences. Paris 339(12): 879–888 (see also the paper “Ergodicity of the 2D Navier–Stokes equations with degenerate stochastic forcing” to appear in Annals of Mathematics).
Hopf E (1952) Statistical hydromechanics and functional calculus. Journal of Rational Mechanics and Analysis 1: 87–123.
Kraichnan RH and Montgomery D (1980) Two-dimensional turbulence. Reports on Progress in Physics 43(5): 547–619.
Kuksin SB (2004) The Eulerian limit for 2D statistical hydrodynamics. Journal of Statistical Physics 115(1/2): 469–492.
Kuksin S and Shirikyan A (2000) Stochastic dissipative PDEs and Gibbs measures. Communications in Mathematical Physics 213(2): 291–330.
Lions P-L (1997) On Euler Equations and Statistical Physics, Pubblicazione della Scuola Normale Superiore. Pisa: Cattedra Galileiana.
Marchioro C and Pulvirenti M (1994) Mathematical Theory of Incompressible Nonviscous Fluids. Applied Mathematical Science, vol. 96. New York: Springer.
Mikulevicius R and Rozovskii BL (2004) Stochastic Navier–Stokes equations for turbulent flows. SIAM Journal on Mathematical Analysis 35(5): 1250–1310.
Monin AS and Yaglom AM (1987) Statistical Fluid Mechanics: Mechanics of Turbulence. Cambridge, MA: MIT Press.
Onsager L (1949) Statistical hydrodynamics. Nuovo Cimento 6(suppl. 2): 279–287.
Robert R (2003) Statistical hydrodynamics (Onsager revisited). In: Friedlander S and Serre D (eds.) Handbook of Mathematical Fluid Dynamics, vol. II, pp. 1–54. Amsterdam: North-Holland.
Vishik MJ and Fursikov AV (1988) Mathematical Problems of Statistical Hydromechanics. Dordrecht: Kluwer Academic Publishers.
80 Stochastic Loewner Evolutions

Stochastic Loewner Evolutions


G F Lawler, Cornell University, Ithaca, NY, USA
© 2006 Gregory F Lawler. Published by Elsevier Ltd. All rights reserved.

Introduction

The stochastic Loewner evolution or Schramm–Loewner evolution (SLE) is a family of random curves that appear as scaling limits of curves or cluster boundaries of discrete statistical mechanical models in two dimensions at criticality. The stochastic Loewner evolution was introduced by Oded Schramm as a candidate for the limit of loop-erased random walk and the boundary of percolation clusters, and it is now believed that SLE curves appear in most planar critical systems whose scaling limit satisfies conformal invariance. The curves are defined by solving a Loewner differential equation with a random input.

Definition

There are three major one-parameter families of SLE curves – chordal, radial, and whole-plane – which correspond to curves connecting two boundary points in a domain, a boundary point and an interior point in a domain, and two points in $\mathbb{C}$, respectively. The parameter is usually denoted $\kappa > 0$. The starting point for defining SLE is to write down the assumptions that one expects from a scaling limit, assuming that the limit is conformally invariant.

In the chordal case, we assume that there is a family of probability measures $\{\mu_D(z, w)\}$, indexed by simply connected proper domains $D \subset \mathbb{C}$ and distinct boundary points $z, w \in \partial D$, supported on continuous curves $\gamma : [0, t_\gamma] \to \overline{D}$ with $\gamma(0) = z$, $\gamma(t_\gamma) = w$, which satisfies the following:

• Conformal invariance. If $f : D \to D'$ is a conformal transformation, then the image of $\mu_D(z, w)$ under $f$ is the same as $\mu_{D'}(f(z), f(w))$, up to a time change.
• Conformal Markov property for $\mu_D(z, w)$. Suppose $\gamma[0, t]$ is known, and let $g_t$ be a conformal transformation of the slit domain $D \setminus \gamma[0, t]$ onto $D$ with $g_t(\gamma(t)) = z$, $g_t(w) = w$ (see Figure 1). Then the conditional distribution on $g_t \circ \gamma[t, t_\gamma]$, given $\gamma[0, t]$, is the same, up to a change of parametrization, as the original distribution. (Implicit in this is the assumption that $\gamma(t)$ is on the boundary of $D \setminus \gamma(0, t]$, which will be true, e.g., if $\gamma$ is non-self-intersecting and $\gamma(0, t_\gamma) \subset D$.)

Figure 1 The map $g_t$ from $D \setminus \gamma[0, t]$ onto $D$.

Using the Riemann mapping theorem, one can see that such a family $\{\mu_D(z, w)\}$ is determined (up to reparametrization) by $\mu_{\mathbb{H}}(0, \infty)$, where $\mathbb{H} = \{x + iy : y > 0\}$ denotes the upper half-plane. Suppose $\gamma : [0, \infty) \to \mathbb{C}$ is a simple (i.e., no self-intersections) curve with $\gamma(0) = 0$, $\gamma(0, \infty) \subset \mathbb{H}$, and $\sup_t \operatorname{Im}[\gamma(t)] = \infty$. Let $H_t = \mathbb{H} \setminus \gamma[0, t]$. There is a unique conformal transformation $g_t : H_t \to \mathbb{H}$ whose expansion at infinity is

$$g_t(z) = z + \frac{b(t)}{z} + O(|z|^{-2}), \qquad z \to \infty$$

(see Figure 2). The coefficient $b(t)$, which is sometimes called the half-plane capacity of $\gamma[0, t]$ and denoted $\operatorname{hcap}[\gamma[0, t]]$, is continuous, strictly increasing, and tending to $\infty$. In fact,

$$b(t) = \lim_{y \to \infty} y \, E\left[\operatorname{Im}[X_\tau] \mid X_0 = iy\right]$$

where $X_s$ denotes a complex Brownian motion and $\tau = \tau_{\gamma[0, t]}$ is the first time $s$ such that $X_s \in \mathbb{R} \cup \gamma[0, t]$. By reparametrizing $\gamma$, $b(t) = 2t$. With this parametrization, the maps $g_t$ satisfy the Loewner differential equation

$$\dot{g}_t(z) = \frac{2}{g_t(z) - U_t}, \qquad g_0(z) = z$$

where $U : [0, \infty) \to \mathbb{R}$ is a continuous function with $U_0 = 0$. In fact, $U_t = g_t(\gamma(t))$. Schramm observed that the measure $\mu_{\mathbb{H}}(0, \infty)$, at least if it were supported on simple curves and the curves were parametrized using half-plane capacity, would produce a random $U_t$. If the assumptions above on $\{\mu_D(z, w)\}$ are translated into assumptions on the "driving function" $U_t$, one shows readily that $U_t$ must be a driftless Brownian motion, that is, $U_t = \sqrt{\kappa}\, B_t$, for a standard one-dimensional Brownian motion $B_t$. Chordal $\mathrm{SLE}_\kappa$ (in $\mathbb{H}$ connecting $0$ and $\infty$) is defined to be the random collection of conformal maps $g_t$ obtained by solving the initial-value problem

$$\dot{g}_t(z) = \frac{2}{g_t(z) - \sqrt{\kappa}\, B_t}, \qquad g_0(z) = z \qquad [1]$$
where $B_t$ is a standard one-dimensional Brownian motion. Equation [1] is often given in terms of the inverse $f_t = g_t^{-1}$:

$$\dot{f}_t(z) = -f_t'(z)\, \frac{2}{z - \sqrt{\kappa}\, B_t}$$

This equation describes a random evolution of conformal maps $f_t$ from $\mathbb{H}$ into subdomains of $\mathbb{H}$. For each $z \in \overline{\mathbb{H}}$, the solution of [1] is defined up to a time $T_z \in [0, \infty]$ with $T_z > 0$ for $z \neq 0$. For fixed $t$, $g_t$ is the unique conformal transformation of $H_t := \{z \in \mathbb{H} : T_z > t\}$ onto $\mathbb{H}$ with expansion

$$g_t(z) = z + \frac{2t}{z} + \cdots, \qquad z \to \infty$$

Figure 2 The map $g_t$ from $\mathbb{H} \setminus \gamma[0, t]$ onto $\mathbb{H}$.

The chordal $\mathrm{SLE}_\kappa$ path is the random curve $\gamma : [0, \infty) \to \overline{\mathbb{H}}$ such that for each $t$, $H_t$ is the unbounded component of $\mathbb{H} \setminus \gamma[0, t]$. It is not immediate from the definition that such a curve exists, but its existence has been proved. If $G_t = g_{t/\kappa}$, then we can write eqn [1] as

$$\dot{G}_t(z) = \frac{a}{G_t(z) + W_t} \qquad [2]$$

where $a = 2/\kappa$ and $W_t := -\sqrt{\kappa}\, B_{t/\kappa}$ is a standard Brownian motion. Then $Z_t^z := G_t(z) + W_t$ satisfies the Bessel stochastic differential equation

$$dZ_t^z = \frac{a}{Z_t^z}\, dt + dW_t, \qquad Z_0^z = z \qquad [3]$$

This equation is valid up to time $T_z$, which is the first time that $Z_t^z = 0$.

Although chordal $\mathrm{SLE}_\kappa$ is defined with a particular parametrization, one generally thinks of it as a measure on curves modulo reparametrization. The scaling properties of Brownian motion imply that this measure is invariant under dilations of $\mathbb{H}$. If $D$ is a simply connected domain and $z, w$ are distinct boundary points of $D$, chordal $\mathrm{SLE}_\kappa$ in $D$ connecting $z$ and $w$ is defined to be the conformal image of $\mathrm{SLE}_\kappa$ in $\mathbb{H}$ from $0$ to $\infty$ under a conformal transformation of $\mathbb{H}$ onto $D$ taking $0$ to $z$ and $\infty$ to $w$. There is a one-parameter family of such transformations, but the scale invariance of $\mathrm{SLE}_\kappa$ in $\mathbb{H}$ shows that the image measure is independent of the choice of transformation.

The geometric and fractal properties of the curve $\gamma$ vary greatly as the parameter $\kappa$ changes:

• if $\kappa \le 4$, $\gamma$ is a simple curve;
• if $4 < \kappa < 8$, $\gamma$ has self-intersections, but is not space filling; and
• if $\kappa \ge 8$, $\gamma$ is a space filling curve.

To see this, one notes that the conformal Markov property implies that there can be double points with positive probability if and only if $T_x < \infty$ occurs with positive probability for $x > 0$. In addition, the curve is space filling if and only if $T_z < \infty$ for all $z$ and $T_w \neq T_z$ for $w \neq z$. The problem is then reduced to a problem about the Bessel equation [3] for which the following holds:

• if $a \ge 1/2$ and $z \neq 0$, the probability that $T_z < \infty$ is zero. If $a < 1/2$, this probability equals 1.
• if $1/4 < a < 1/2$, and $w, z$ are distinct points in $\mathbb{H}$, then there is a positive probability that $T_w = T_z$.
• if $0 < a \le 1/4$, then with probability 1, $T_w \neq T_z$ for all $w \neq z$.

This kind of argument is typical when studying SLE – geometric properties of the curve are established by analyzing a stochastic differential equation. The Hausdorff dimension of the path $\gamma$ is given by

$$\dim[\gamma[0, \infty)] = \min\left\{1 + \frac{\kappa}{8},\ 2\right\}$$

The radial Loewner equation describes the evolution of a curve from the boundary of the unit disk $\mathbb{D} = \{z : |z| < 1\}$ to the origin. Suppose $\gamma : [0, \infty) \to \overline{\mathbb{D}}$ is a simple curve with $\gamma(0) = 1$, $\gamma(0, \infty) \subset \mathbb{D} \setminus \{0\}$, and $\gamma(t) \to 0$ as $t \to \infty$. Let $g_t$ be the unique conformal transformation of $\mathbb{D} \setminus \gamma[0, t]$ onto $\mathbb{D}$ such that $g_t(0) = 0$, $g_t'(0) > 0$. One can check that $g_t'(0)$ is continuous and strictly increasing in $t$, and hence we can parametrize $\gamma$ in such a way that $g_t'(0) = e^t$. Using this reparametrization, there is a continuous
$U_t : [0, \infty) \to \mathbb{R}$ with $U_0 = 0$ such that $g_t$ satisfies the radial Loewner equation

$$\dot{g}_t(z) = g_t(z)\, \frac{e^{iU_t} + g_t(z)}{e^{iU_t} - g_t(z)}, \qquad g_0(z) = z$$

If $z \neq 0$, then we can define $h_t(z) = -i \log g_t(z)$ locally near $z$, and this equation becomes

$$\dot{h}_t(z) = \cot\left[\frac{h_t(z) - U_t}{2}\right]$$

Radial $\mathrm{SLE}_\kappa$ (connecting $1$ and $0$ in $\mathbb{D}$) is obtained by setting $U_t = \sqrt{\kappa}\, B_t$. If $D$ is a simply connected domain, $z \in D$, $w \in \partial D$, then radial $\mathrm{SLE}_\kappa$ in $D$ connecting $w$ and $z$ is obtained by conformal transformation using the unique transformation $f$ of $\mathbb{D}$ onto $D$ with $f(0) = z$, $f(1) = w$. Again, we think of this as being defined modulo time change. If $a = 2/\kappa$ and $v_t = h_{at/2}$, then

$$\dot{v}_t(z) = \frac{a}{2} \cot\left[\frac{v_t(z) + W_t}{2}\right] \qquad [4]$$

where $W_t := -\sqrt{\kappa}\, B_{t/\kappa}$ is a standard Brownian motion. If $L_t^z = v_t(z) + W_t$, then we get

$$dL_t^z = \frac{a}{2} \cot\left[\frac{L_t^z}{2}\right] dt + dW_t$$

Radial and chordal $\mathrm{SLE}_\kappa$ are closely related. In fact, if $\gamma$ is a chordal $\mathrm{SLE}_\kappa$ path in $\mathbb{H}$ from $0$ to $\infty$, $\tilde{\gamma}$ is a radial $\mathrm{SLE}_\kappa$ path in $\mathbb{D}$ from $1$ to $0$, and $\gamma^* = -i \log \tilde{\gamma}$, then for small $t$ the distribution of $\gamma^*$ is absolutely continuous with respect to the distribution of a (random time change of) $\gamma$. Showing this involves understanding the behavior of the Loewner equation under conformal transformations. Suppose $\gamma$, $\tilde{\gamma}$ have been parametrized as in [2] and [4] with $a = 2/\kappa$. Let $g_t^*$ be the conformal transformation of $\mathbb{H} \setminus \gamma^*[0, t]$ onto $\mathbb{H}$ such that

$$g_t^*(z) = z + \frac{a^*(t)}{z} + \cdots, \qquad z \to \infty$$

and let $U_t^*$ be the Loewner driving function such that

$$\dot{g}_t^*(z) = \frac{\dot{a}^*(t)}{g_t^*(z) - U_t^*}$$

Here $a^*(t) = \operatorname{hcap}[\gamma^*[0, t]]$. If we consider a time change $\sigma$ such that $a^*(\sigma(t)) = at$ and let $\tilde{U}_t = U^*_{\sigma(t)}$ be the time-changed driving function, Itô's formula can be used to show that

$$d\tilde{U}_t = \tfrac{1}{2}(1 - 3a)\, F_t\, dt + d\tilde{W}_t \qquad [5]$$

where the $F_t$ in the drift term depends on $\gamma[0, t]$ and is independent of $a$, and $\tilde{W}$ is a standard Brownian motion. Girsanov's theorem implies that Brownian motions with the same variance but different drifts have absolutely continuous distributions. In particular, qualitative properties such as existence of double points or Hausdorff dimension of paths are the same for radial and chordal $\mathrm{SLE}_\kappa$. $\tilde{U}_t$ is a driftless Brownian motion if $a = 1/3$, $\kappa = 6$.

Whole-plane $\mathrm{SLE}_\kappa$ from $0$ to $\infty$ is a path $\gamma : (-\infty, \infty) \to \mathbb{C}$ with $\gamma(-\infty) = 0$, $\gamma(\infty) = \infty$, such that given $\gamma(-\infty, t]$, the distribution of $\gamma(t, \infty)$ is that of radial $\mathrm{SLE}_\kappa$ from boundary point $\gamma(t)$ to interior point $\infty$ in the domain $\mathbb{C} \setminus \gamma[-\infty, t]$. One can define whole-plane $\mathrm{SLE}_\kappa$ connecting two distinct points in $\mathbb{C}$ by conformal transformation.

Locality and Restriction

There are two special values of $\kappa$: $\kappa = 6$, $a = 1/3$, which satisfies the "locality" property, and $\kappa = 8/3$, $a = 3/4$, which satisfies the "restriction" property. Suppose $\gamma$ is a chordal $\mathrm{SLE}_\kappa$ curve from $0$ to $\infty$ in $\mathbb{H}$ parametrized as in [2]. Suppose $\Phi : N \to \mathbb{H}$ is a conformal map taking a neighborhood $N$ of $0$ in $\mathbb{H}$ to $\Phi(N)$ and that locally maps $\mathbb{R}$ into $\mathbb{R}$. Let $\tilde{\gamma}(t) = \Phi \circ \gamma(t)$, which is defined for sufficiently small $t$. Let $g_t^*$ be the conformal transformation of $\mathbb{H} \setminus \tilde{\gamma}[0, t]$ onto $\mathbb{H}$ with

$$g_t^*(z) = z + \frac{a^*(t)}{z} + \cdots$$

and let $\tilde{U}_t$ be the driving function such that

$$\dot{g}_t^*(z) = \frac{\dot{a}^*(t)}{g_t^*(z) - \tilde{U}_t}$$

Here $a^*(t) = \operatorname{hcap}[\tilde{\gamma}[0, t]]$. If we change time, $\gamma^*_t = \tilde{\gamma}_{\sigma(t)}$, so that $a^*(\sigma(t)) = at$, then an application of Itô's formula shows that $U_t^* := \tilde{U}_{\sigma(t)}$ satisfies

$$dU_t^* = \frac{1}{2}(3a - 1)\, \frac{\Phi''_{\sigma(t)}(W_{\sigma(t)})}{\Phi'_{\sigma(t)}(W_{\sigma(t)})^2}\, dt + d\tilde{W}_t$$

Here $\tilde{W}_t$ is a standard Brownian motion, $\Phi_t = g_t^* \circ \Phi \circ g_t^{-1}$, and $g_t$ is the conformal map associated to $\gamma$. In particular, if $a = 1/3$, $\kappa = 6$, $U_t^*$ is a standard Brownian motion; hence, $\gamma^*$ has the distribution of $\mathrm{SLE}_6$. The locality property for $\mathrm{SLE}_6$ can be stated as "the conformal image of $\mathrm{SLE}_6$ is (a time change of) $\mathrm{SLE}_6$." Intuitively, the $\mathrm{SLE}_6$ path in a restricted domain does not feel the boundary of the domain until it reaches it. Radial $\mathrm{SLE}_6$ satisfies a similar locality property. Moreover, [5] can be used to show that the image of chordal $\mathrm{SLE}_6$ under the exponential map is the same (for small time $t$) as radial $\mathrm{SLE}_6$. The locality property explains why
$\mathrm{SLE}_6$ is a natural candidate for the boundary of percolation clusters.

If $\kappa \le 4$, $\mathrm{SLE}_\kappa$ paths are simple, that is, with no self-intersections. Suppose $A \subset \overline{\mathbb{H}} \setminus \{0\}$ is a compact set such that $\mathbb{H} \setminus A$ is simply connected. Let $\gamma$ denote a chordal $\mathrm{SLE}_\kappa$ in $\mathbb{H}$ connecting $0$ and $\infty$ and let $E_A$ be the event $E_A = \{\gamma(0, \infty) \cap A = \emptyset\}$. Let $\Phi_A : \mathbb{H} \setminus A \to \mathbb{H}$ be the unique conformal transformation with $\Phi_A(0) = 0$, $\Phi_A(\infty) = \infty$, $\Phi_A'(\infty) = 1$. On the event $E_A$, we can define $\tilde{\gamma}(t) = \Phi_A \circ \gamma(t)$. Chordal $\mathrm{SLE}_\kappa$ is said to satisfy the restriction property if the conditional distribution of $\tilde{\gamma}$ given $E_A$ is the same as (a time change of) $\gamma$. The only $\kappa \le 4$ that satisfies this property is $\kappa = 8/3$. The proof of this fact also establishes the formula: if $\gamma$ is a chordal $\mathrm{SLE}_{8/3}$ curve in $\mathbb{H}$ from $0$ to $\infty$, then

$$P\{\gamma(0, \infty) \cap A = \emptyset\} = \Phi_A'(0)^{5/8} \qquad [6]$$

There is a similar formula for radial $\mathrm{SLE}_{8/3}$, which establishes a radial restriction property. Suppose $A \subset \overline{\mathbb{D}} \setminus \{0, 1\}$ is a compact set such that $\mathbb{D} \setminus A$ is simply connected. Let $\Phi_A$ be the unique conformal transformation of $\mathbb{D} \setminus A$ onto $\mathbb{D}$ with $\Phi_A(0) = 0$, $\Phi_A'(0) > 0$. Then, if $\gamma$ is a radial $\mathrm{SLE}_{8/3}$ curve from $1$ to $0$ in $\mathbb{D}$,

$$P\{\gamma(0, \infty) \cap A = \emptyset\} = \Phi_A'(0)^{5/48}\, |\Phi_A'(1)|^{5/8}$$

The restriction property makes $\mathrm{SLE}_{8/3}$ the candidate for the scaling limit of self-avoiding walks.

Relation to Conformal Field Theory

The Schramm–Loewner evolution is one of the tools used to rigorously prove predictions made using powerful, yet nonrigorous, arguments of conformal field theory. In conformal field theory, there is a parameter $c$, called the central charge, which classifies theories. To each $c \le 1$, there corresponds a $\kappa \le 4$ and a "dual" $\kappa' = 16/\kappa \ge 4$:

$$c = c_\kappa = \frac{(8 - 3\kappa)(\kappa - 6)}{2\kappa}$$

In particular, $\kappa = 8/3$, $\kappa' = 6$ corresponds to central charge zero. It is expected, and has been proved in a number of cases, that $\mathrm{SLE}_\kappa$ or $\mathrm{SLE}_{\kappa'}$ curves will appear in scaling limits of systems with central charge $c_\kappa$. These systems can also be parametrized by the boundary scaling exponent or conformal weight

$$\alpha = \alpha_\kappa = \frac{6 - \kappa}{2\kappa}$$

For $\kappa = 8/3$, $\alpha = 5/8$, which is the exponent in [6].

In studying the relationship between $\mathrm{SLE}_\kappa$ and conformal field theories, two other probabilistic objects, restriction measures and the (Brownian) loop soup, arise. An $\mathbb{H}$-hull (connecting $0$ and $\infty$) is an unbounded, connected, closed set $K \subset \overline{\mathbb{H}}$ with $K \cap \mathbb{R} = \{0\}$ and such that $\mathbb{H} \setminus K$ consists of two connected components, one whose boundary includes the positive reals and the other whose boundary includes the negative reals. A (chordal) restriction measure on hulls $K$ is a probability measure with the property that for any $A$ as in [6], the distribution of $\Phi_A \circ K$ given $\{K \cap A = \emptyset\}$ is the same as the original measure. The (Brownian) loop measure is a measure on unrooted loops derived from Brownian bridges. It is the scaling limit of the measure on random walk loops that gives each unrooted simple random walk loop of length $2n$ measure $4^{-2n}$. The loop measure in a bounded domain is obtained by restricting to loops that stay in that domain. We can consider this as a measure on "hulls" by filling in the bounded holes (so that the complement of the hull is connected). By doing this we get a family of infinite measures on hulls, indexed by domains $D$, and this family satisfies conformal invariance and the restriction property. The loop soup with parameter $\lambda$ is a Poissonian realization from this measure with parameter $\lambda$.

The set of all restriction measures is parametrized by $\alpha \ge 5/8$; the $\alpha$-restriction measure has the property that

$$P\{K \cap A = \emptyset\} = \Phi_A'(0)^{\alpha}$$

For $\alpha = 5/8$, $K$ is given by the path of $\mathrm{SLE}_{8/3}$. For integer $\alpha$, the hull $K$ can be constructed by taking $\alpha$ independent Brownian excursions in $\mathbb{H}$ (Brownian motions starting at $0$ conditioned to stay in $\mathbb{H}$ for all times), and letting $K$ be the hull obtained by taking the union of the paths and filling in the bounded holes. If $\kappa \le 8/3$, $c_\kappa \le 0$, then the restriction measure with exponent $\alpha_\kappa \ge 5/8$ can also be constructed as follows: take a chordal $\mathrm{SLE}_\kappa$ path and an independent realization of the loop soup with intensity $\lambda = -c_\kappa$; add to the $\mathrm{SLE}_\kappa$ path all the loops in the soup that intersect the $\mathrm{SLE}_\kappa$ curve; and then fill in all the bounded holes. The limiting case $\alpha = 5/8$, $\lambda = 0$ gives the only measure supported on simple curves that is also a restriction measure, $\mathrm{SLE}_{8/3}$.

For $8/3 < \kappa \le 4$, $0 < c_\kappa \le 1$, it is conjectured, and proved for small $c_\kappa$, that $\mathrm{SLE}_\kappa$ curves can be found by taking a loop soup with parameter $\lambda = c_\kappa$ and looking at connected curves in the fractal set given by the complement of the union of all the hulls generated by the loops.
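The relations between $\kappa$, the dual value $16/\kappa$, the central charge, the boundary scaling exponent, and the Hausdorff dimension quoted in this article can be checked with exact rational arithmetic. The following is a minimal sketch; the function names are illustrative choices and do not come from the article.

```python
from fractions import Fraction


def dual(kappa):
    """Dual parameter kappa' = 16 / kappa."""
    return 16 / Fraction(kappa)


def central_charge(kappa):
    """Central charge c_kappa = (8 - 3 kappa)(kappa - 6) / (2 kappa)."""
    kappa = Fraction(kappa)
    return (8 - 3 * kappa) * (kappa - 6) / (2 * kappa)


def boundary_exponent(kappa):
    """Boundary scaling exponent alpha_kappa = (6 - kappa) / (2 kappa)."""
    kappa = Fraction(kappa)
    return (6 - kappa) / (2 * kappa)


def hausdorff_dimension(kappa):
    """Dimension of the SLE_kappa path: min(1 + kappa / 8, 2)."""
    return min(1 + Fraction(kappa) / 8, Fraction(2))


# Tabulate the values of kappa that appear in the article.
for kappa in (Fraction(2), Fraction(8, 3), Fraction(4), Fraction(6), Fraction(8)):
    print(
        f"kappa={kappa}  dual={dual(kappa)}  c={central_charge(kappa)}  "
        f"alpha={boundary_exponent(kappa)}  dim={hausdorff_dimension(kappa)}"
    )
```

Because $c_\kappa$ is unchanged when $\kappa$ is replaced by $16/\kappa$, the pairs $(8/3, 6)$ and $(2, 8)$ each share a central charge, which is the duality described above.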
Examples

The scaling limit of simple random walk, Brownian motion, is known to be conformally invariant. A two-dimensional Brownian bridge or loop is a Brownian motion, $B_t$, $0 \le t \le 1$, conditioned so that $B_0 = B_1$. The frontier or outer boundary of the Brownian motion is the boundary of the unbounded component of the complement. Benoit Mandelbrot first observed numerically that the outer boundary of Brownian motion had fractal dimension 4/3. Gregory Lawler, Oded Schramm, and Wendelin Werner used SLE to prove that the boundary has Hausdorff dimension 4/3. In fact, the outer boundary can be considered as an $\mathrm{SLE}_{8/3}$ loop.

$\mathrm{SLE}_6$ and $\mathrm{SLE}_{8/3}$ arise in the scaling limit of critical percolation on the triangular lattice. Suppose that each vertex in the upper half-plane triangular lattice is colored black or white, each with probability 1/2. Suppose the real line gives a boundary condition of black on the negative real line and white on the positive real line. Then if we represent the vertices in the lattice as hexagons as in the figure, a curve is formed which is the boundary between the black and white clusters. This curve is called the "percolation exploration process." Stanislav Smirnov proved that the scaling limit of this curve is conformally invariant, and from this it can be concluded that the curve is chordal $\mathrm{SLE}_6$. In particular, the Hausdorff dimension is 7/4 and the scaling limit has double points. In the scaling limit, the "outer boundary" of this curve has Hausdorff dimension 4/3 and its distribution is absolutely continuous with respect to that of $\mathrm{SLE}_{8/3}$. While this result is expected for other critical percolation models, such as bond percolation in $\mathbb{Z}^2$ with critical probability 1/2, it has only been proved for the triangular lattice. Percolation has central charge 0 and the "locality" property can be seen in the lattice model. The outer boundary of the curve has the same distribution as the outer boundary of a Brownian motion that is reflected at angle $\pi/3$ off the real line. Locally, the outer boundary of percolation, the outer boundary of complex Brownian motion, and $\mathrm{SLE}_{8/3}$ all look the same, and it is expected that this will also be true for the scaling limit of self-avoiding walks.

There are three models derived in some way from simple random walk that have been proved to have scaling limits of $\mathrm{SLE}_\kappa$. The loop-erased random walk (LERW) in a finite subset $V$ of $\mathbb{Z}^2$ connecting two distinct points is obtained by taking a simple random walk from one point to the other and erasing loops chronologically. The LERW is closely related to uniform spanning trees; in fact, if one chooses a spanning tree of $V$ from the uniform distribution on all spanning trees, then the distribution of the unique path connecting the two points is exactly that of the LERW (see Figure 3). Another description of the LERW is as the Laplacian random walk: the LERW from $z$ to $w$ in $V$ chooses a new step weighted by the value of the function that is harmonic on the complement of $w$ and the path up to that point with boundary values 0 on the path and 1 on $w$. The LERW in the discrete upper half-plane can be obtained by erasing loops from a simple random walk excursion. The LERW and the uniform spanning tree are systems with central charge $c = -2$. It has been proved that the scaling limit of the LERW is $\mathrm{SLE}_2$; hence, the paths have Hausdorff dimension 5/4.

There is another path associated to spanning trees given by the one-to-one correspondence between spanning trees and Hamiltonian walks on a corresponding directed (Manhattan) lattice on the dual graph (see Figure 4). If the spanning trees, or equivalently the Hamiltonian walks, are chosen using the uniform distribution, then the scaling limit of this walk is the space-filling curve $\mathrm{SLE}_8$. Note that 2 and 8 are the dual values of $\kappa$ associated to $c = -2$.

Figure 3 A spanning tree and the path between two vertices. If the tree has the uniform distribution, the path has the distribution of the LERW.

Figure 4 A spanning tree and the corresponding Hamiltonian walk.
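The chronological loop erasure that defines the LERW can be sketched in a few lines. The example below is illustrative only: the function names, the choice of an $n \times n$ grid as the finite set $V$, and the seed are assumptions made for this sketch, not part of the article.

```python
import random


def loop_erase(path):
    """Erase the loops of a finite path in the order in which they are created."""
    erased = []      # current self-avoiding path
    position = {}    # vertex -> index of that vertex in `erased`
    for vertex in path:
        if vertex in position:
            # The walk has returned to `vertex`: discard the loop just closed.
            keep = position[vertex] + 1
            for removed in erased[keep:]:
                del position[removed]
            del erased[keep:]
        else:
            position[vertex] = len(erased)
            erased.append(vertex)
    return erased


def random_walk(start, target, n, rng):
    """Simple random walk on the n x n grid graph, run until it first hits target."""
    x, y = start
    walk = [start]
    while (x, y) != target:
        neighbors = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < n and 0 <= y + dy < n
        ]
        x, y = rng.choice(neighbors)
        walk.append((x, y))
    return walk


rng = random.Random(0)  # fixed seed only so that the run is repeatable
walk = random_walk((0, 0), (7, 7), 8, rng)
lerw = loop_erase(walk)
print(len(walk), len(lerw))
```

The dictionary keeps each lookup at constant cost, so the erasure runs in time linear in the length of the walk. The result is a self-avoiding nearest-neighbor path with the same endpoints as the walk.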
Another discrete process derived from simple random walk, the harmonic explorer, has a scaling limit of $\mathrm{SLE}_4$. There is a particular property of $\mathrm{SLE}_4$ that leads to the definition of this discrete process. Consider a chordal $\mathrm{SLE}_\kappa$ curve, let $z \in \mathbb{H}$, and let $Z_t^z$ be as in [3] with $a = 2/\kappa$. Itô's formula shows that $\Theta_t := \arg(Z_t^z)$ satisfies

$$d\Theta_t = \left(\frac{1}{2} - a\right) \frac{\sin(2\Theta_t)}{|Z_t^z|^2}\, dt - \frac{\sin \Theta_t}{|Z_t^z|}\, dW_t$$

In particular, $\Theta_t$ is a martingale if and only if $a = 1/2$, $\kappa = 4$. The probability that a complex Brownian motion starting at $z \in \mathbb{H}$ first hits $\mathbb{R}$ on the negative half-line can be shown to be $\arg(z)/\pi$. If $\kappa \le 4$, then we can see that $\Theta_\infty$ equals $0$ or $\pi$, depending on whether $z$ is on the right or left side of the path $\gamma(0, \infty)$. For the martingale case $\kappa = 4$, $\Theta_t/\pi$ represents the probability that $z$ is on the left side of $\gamma(0, \infty)$, given $\gamma(0, t]$. The harmonic explorer is a process on the hexagonal lattice defined to have this property. In a way similar to the percolation process, the walk is defined as the boundary between black and white hexagons on the triangular lattice. However, when an unexplored hexagon is reached in the harmonic explorer, it is colored black with probability $q$, where $q$ is the probability that a simple random walk on the triangular lattice starting at that hexagon (considered as a vertex in the triangular lattice) hits a black hexagon before hitting a white hexagon. It is not difficult to show that this process has the property that for $z$ away from the curve, the "probability of $z$ ending on the left given the curve of $n$ steps" is a martingale.

There are many other models for which SLE curves are expected in the limit, but this has not been established. The most difficult part is to show the existence of a limit that is conformally invariant. One example is the self-avoiding walk (SAW). It is an open problem to establish that there exists a scaling limit of the uniform measure on SAWs and to establish conformal invariance of the limit. However, the nature of the discrete model is such that if the limit exists, it must satisfy the restriction property. Hence, under the assumption of conformal invariance, the only possible limit is $\mathrm{SLE}_{8/3}$. Numerical simulations strongly support the conjecture that $\mathrm{SLE}_{8/3}$ is the limit of SAWs, and this gives strong evidence for the conformal invariance conjecture for SAWs. Critical exponents for SAWs (as well as critical exponents for many other models) can be predicted nonrigorously from rigorous scaling exponents for the corresponding SLE paths.

Generalizations

One of the reasons that the theory of SLE is nice for simply connected domains is that a simply connected domain with an arc connected to the boundary of the domain removed is again simply connected. For nonsimply connected domains, it is more difficult to describe because the conformal type of the slit domain changes as time evolves. In the case of a curve crossing an annulus, this can be done with an added parameter referring to the conformal type of the annulus (two annuli of the form $\{z : r_j < |z| < s_j\}$ are conformally equivalent if and only if $r_1/s_1 = r_2/s_2$). It is not immediately obvious what the correct definition of SLE should be in general domains and, more generally, on Riemann surfaces. One possibility for $\kappa \le 4$ is to consider a configurational (equilibrium statistical mechanics) view of SLE. Consider a family of measures $\{\mu_D(z, w)\}$, where $D$ ranges over domains and $z, w$ are distinct boundary points at which $\partial D$ is locally analytic, supported on simple curves from $z$ to $w$ (modulo time change). Let $\mu_D^\#(z, w) = \mu_D(z, w)/|\mu_D(z, w)|$ be the corresponding probability measures, which may be defined even if $\partial D$ is not smooth at $z, w$. Then the following axioms should hold:

• Conformal invariance. If $f : D \to D'$ is a conformal transformation, $f \circ \mu_D^\#(z, w) = \mu_{D'}^\#(f(z), f(w))$.
• Conformal Markov property.
• Perturbation of domains. Suppose $D_1 \subset D$ and $\partial D_1$, $\partial D$ agree near $z, w$. Then $\mu_{D_1}(z, w)$ should be absolutely continuous with respect to $\mu_D(z, w)$. Let $Y$ denote the Radon–Nikodym derivative of $\mu_{D_1}(z, w)$ with respect to $\mu_D(z, w)$. Then

$$Y(\gamma) = 1\{\gamma(0, t_\gamma) \subset D_1\}\, F_c(D; \gamma, D \setminus D_1)$$

where $F_c$ is to be determined. In the case where $D$, $D_1$ are simply connected, $F_c(D; \gamma, D \setminus D_1) = J(\gamma, D, D_1)^{c}$, where $J(\gamma, D, D_1)$ denotes the probability that there is a loop in the Brownian loop soup in $D$ that intersects both $\gamma$ and $D \setminus D_1$. (There is no problem defining this quantity in nonsimply connected domains, but it is not clear that it is the right quantity.) Here $c = c_\kappa$. The restriction property tells us that $F_0 \equiv 1$.
• Conformal covariance. If $f$ is as above, and $\partial D$, $\partial D'$ are smooth near $z, w$ and $f(z), f(w)$, respectively, then

$$f \circ \mu_D(z, w) = |f'(z)|^{\alpha}\, |f'(w)|^{\alpha}\, \mu_{D'}(f(z), f(w))$$

Here $\alpha = \alpha_\kappa$ is the boundary scaling exponent.

See also: Boundary Conformal Field Theory; Percolation Theory; Random Walks in Random Environments.
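As a numerical companion to the chordal Loewner equation [1] from the Definition section, the flow $t \mapsto g_t(z)$ of a single point can be approximated with an explicit Euler scheme. This is a rough sketch under stated assumptions: the function name, the step count, and the rule that treats a point as swallowed once its imaginary part is no longer positive are choices made here, not part of the article. For $\kappa = 0$ the driving function vanishes and the exact solution is $g_t(z) = \sqrt{z^2 + 4t}$, which gives a simple consistency check.

```python
import cmath
import math
import random


def loewner_flow(z, kappa, t_final, n_steps, rng):
    """Euler scheme for dg/dt = 2 / (g - sqrt(kappa) * B_t) with g_0 = z.

    z must lie in the upper half-plane (z.imag > 0).
    Returns g at time t_final, or None if the discretized point leaves the
    upper half-plane first (a crude stand-in for the time T_z).
    """
    dt = t_final / n_steps
    g = complex(z)
    u = 0.0  # driving function U_t = sqrt(kappa) * B_t, started at 0
    for _ in range(n_steps):
        g = g + dt * 2.0 / (g - u)
        if g.imag <= 0.0:
            return None
        # Brownian increment over a step of length dt, scaled by sqrt(kappa).
        u += math.sqrt(kappa * dt) * rng.gauss(0.0, 1.0)
    return g


# Deterministic check with kappa = 0 against the closed form sqrt(z^2 + 4t).
approx = loewner_flow(1 + 1j, 0.0, 1.0, 10_000, random.Random(1))
exact = cmath.sqrt((1 + 1j) ** 2 + 4.0)
print(abs(approx - exact))

# One random sample of g_t(z) for kappa = 2.
print(loewner_flow(1 + 1j, 2.0, 0.1, 1_000, random.Random(0)))
```

An explicit Euler step is inaccurate near the singularity $g_t(z) = U_t$, so this sketch is only suitable for points that stay away from the curve over the time window considered.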
86 Stochastic Resonance

Further Reading

Kager W and Nienhuis B (2004) A guide to stochastic Loewner evolution and its applications. Journal of Statistical Physics 115: 1149–1229.
Lawler G (2004) Conformally invariant processes in the plane. In: School and Conference on Probability Theory, ICTP Lecture Notes, vol. 17, pp. 305–351. Trieste, Italy: ICTP.
Lawler G (2005) Conformally Invariant Processes in the Plane. Providence, RI: American Mathematical Society.
Lawler G and Werner W (2004) The Brownian loop soup. Probability Theory Related Fields 128: 565–588.
Lawler G, Schramm O, and Werner W (2004a) Conformal restriction, the chordal case. The Journal of American Mathematical Society 16: 917–955.
Lawler G, Schramm O, and Werner W (2004b) Conformal invariance of planar loop-erased random walks and uniform spanning trees. Annals of Probability 32: 939–995.
Schramm O (2000) Scaling limits of loop-erased random walks and uniform spanning trees. Israel Journal of Mathematics 118: 221–288.
Smirnov S (2001) Critical percolation in the plane: conformal invariance, Cardy's formula, scaling limits. Comptes rendus de l'Academie des sciences – series i – mathematics 333: 239–244.
Werner W (2003) SLEs as boundaries of clusters of Brownian loops. Comptes rendus de l'Academie des sciences – series i – mathematics 337: 481–486.
Werner W (2004) Random planar curves and Schramm–Loewner evolutions. In: Ecole d'Eté de Probabilités de Saint-Flour XXXII – 2002, Lecture Notes in Mathematics, vol. 1840, pp. 113–195. Berlin: Springer.

Stochastic Resonance
S Herrmann, Université Henri Poincaré, Nancy 1, Vandoeuvre-lès-Nancy, France
P Imkeller, Humboldt Universität zu Berlin, Berlin, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The concept of stochastic resonance was introduced by physicists. It originated in a toy model designed for a qualitative description of periodicity phenomena in the recurrences of glacial eras in Earth's history. It spread its popularity over numerous areas of natural sciences: neuronal response to periodic stimuli, variations of magnetization in a ferromagnetic system, voltage variations in the simple Schmitt trigger electronic circuit or in more complicated devices, behavior of lasers in optical bi-stability, etc. The interest in this ubiquitous phenomenon is enhanced by signal analysis: an optimal dose of noise in some system can essentially boost signal transduction. Noise in this context does not enter the system as an impurity perturbing its performance, but on the contrary as a catalyst triggering amplified stochastic response to weak periodic signals.

The Climate Paradigm

The phenomenon of stochastic resonance was first discovered in an elementary climate model serving in an explanation of major transitions in paleoclimatic time series confining glacial cycles. Data collected for instance from ice or deep sea cores allow one to deduce estimates of the average temperature on Earth over the last 700 000 years. They exhibit periodic switching between ice and warm ages with fast spontaneous transitions. The average periodicity of the glaciation time series obtained is $10^5$ years. In order to explain temperature variations, Benzi et al. (1981) introduced random perturbations into an energy balance model of the Budyko–Sellers type. This model describes the evolution of the seasonal and global average temperature $X$ caused by defects in the balance between incoming and outgoing radiation

$$c\, \frac{dX(t)}{dt} = E_{\mathrm{in}} - E_{\mathrm{out}}$$

where $c$ is the active thermal inertia of the system. The incoming energy is modeled as proportional to the "solar constant" $Q$:

$$E_{\mathrm{in}} = Q\left(1 + A \cos\frac{2\pi t}{T}\right), \qquad \text{with } T \approx 92\,000 \text{ years}$$

and $A \approx 0.1\%$ of $Q$. This exceedingly small variation of the solar constant is caused by a modulation of the orbital eccentricity of the Earth's trajectory (Figure 1). The outgoing radiation $E_{\mathrm{out}}$ is composed

Figure 1 Modulation of the orbital eccentricity.
Stochastic Resonance 87

of two essential parts. The first part a(X)Ein is of the random system to weak perturbations with
dominated by the albedo a(X) representing the long periods.
proportion of energy reflected back to space. It is a
decreasing function of temperature, due to the
higher rate of reflection from a brighter Earth at Strongly Damped Brownian Particle
low temperatures implying a bigger volume of ice.
It is useful to roughly compare solutions of
The second part of the outgoing radiation comes from the fact that the Earth radiates energy like a black body, and is given by the Stefan–Boltzmann law $\sigma X^4$, where $\sigma$ is the Stefan constant. Describing the balance of energy terms as a slowly and weakly time-varying gradient of a potential $U$, the balance model can be expressed by

$$\frac{dX(t)}{dt} = -\frac{\partial U}{\partial x}\left(\frac{t}{T}, X(t)\right)$$

where the time period 1 is blown up to (large) $T$ by time scaling. The roles of deep and shallow wells switch periodically (Figure 2). Since the variation of the solar constant is extremely small, we can assume that the height of the barrier between the two wells is lower-bounded by a positive constant. The system then admits three steady states, two of which are stable and separated by roughly 10 K. Like the solar constant, they fluctuate slowly and very weakly. Therefore, this deterministic system cannot account for climate changes with temperature variations of about 10 K. They can only be explained by allowing transitions between the two steady states, which become possible by adding noise to the system. In general, short timescale phenomena such as annual fluctuations in solar radiation are modeled by Gaussian white noise of intensity $\varepsilon$ and lead to equations of the type

$$dX^\varepsilon_t = -\frac{\partial U}{\partial x}\left(\frac{t}{T}, X^\varepsilon_t\right) dt + \sqrt{\varepsilon}\, dW_t \qquad [1]$$

which are generic for studying stochastic resonance in numerous physical and biological models. Generally, the input of noise amplifies a weak periodic signal by creating trajectories that fluctuate in a randomly periodic way between meta-stable states. An optimal tuning of noise intensity to period length ("stochastic resonance") significantly enhances the response

stochastic differential equations and motions of Brownian particles in double-well landscapes (Figure 3) in order to understand properties of their trajectories (see Schweitzer 2003, Mazo 2002). As in the previous section, let us concentrate on a one-dimensional setting, remarking that we shall give a treatment that easily generalizes to the finite-dimensional setting. Due to Newton's law, the motion of a particle is governed by the impact of all forces acting on it. Let us denote by $F$ the sum of these forces, $m$ the mass, $x$ the space coordinate, and $v$ the velocity of the particle. Then

$$m \dot v = F$$

Let us first assume the potential to be switched off. In their pioneering work at the turn of the twentieth century, Marian v. Smoluchowski and Paul Langevin introduced stochastic concepts to describe the Brownian particle motion by claiming that at time $t$

$$F(t) = -\gamma_0 v(t) + \sqrt{2 k_B T \gamma_0}\, \dot W_t$$

The first term results from friction $\gamma_0$ and is velocity dependent. An additional stochastic force represents random interactions between Brownian particles and their simple molecular random environment. The white noise $\dot W$ (formal derivative of the Wiener process) plays the crucial role. The diffusion coefficient (standard deviation of the random impact) is composed of Boltzmann's constant $k_B$, friction, and environmental temperature $T$. It satisfies the condition of the fluctuation–dissipation theorem expressing the balance of energy loss due to friction and energy gain resulting from noise. The equation of motion becomes

$$\frac{dx(t)}{dt} = v(t)$$

$$dv(t) = -\frac{\gamma_0}{m} v(t)\, dt + \frac{\sqrt{2 k_B T \gamma_0}}{m}\, dW_t$$

Figure 2 Deep and shallow wells switching periodically.

Figure 3 Brownian particle in a double-well landscape.
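The velocity equation of the free Brownian particle can be explored numerically. The following is a minimal Euler–Maruyama sketch, not a definitive implementation; the function name `simulate_velocity` and all parameter values (mass, friction, $k_B T$, step size) are illustrative assumptions. By the fluctuation–dissipation balance, the long-run variance of the velocity should be close to $k_B T / m$.

```python
import numpy as np

def simulate_velocity(m=1.0, gamma0=2.0, kBT=0.5, v0=0.0,
                      dt=1e-3, n_steps=500_000, seed=0):
    """Euler-Maruyama scheme for dv = -(gamma0/m) v dt + sqrt(2 kBT gamma0)/m dW.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    beta = gamma0 / m                          # friction rate gamma0/m
    sigma = np.sqrt(2.0 * kBT * gamma0) / m    # noise amplitude
    dW = np.sqrt(dt) * rng.standard_normal(n_steps)  # Wiener increments
    v = np.empty(n_steps + 1)
    v[0] = v0
    for k in range(n_steps):
        v[k + 1] = v[k] - beta * v[k] * dt + sigma * dW[k]
    return v

v = simulate_velocity()
burn_in = 20_000                    # discard the initial relaxation phase
empirical_var = v[burn_in:].var()   # time average over the stationary regime
theoretical_var = 0.5 / 1.0         # kBT / m for the default parameters
```

Comparing `empirical_var` with `theoretical_var` gives a quick sanity check of the balance between friction and noise.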

Figure 4 Resonance pictures for diffusions.
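Trajectories of the kind shown in such resonance pictures can be sketched by discretizing an equation of type [1]. The example below is a hedged illustration: the potential $U(s,x) = x^4/4 - x^2/2 + A\cos(2\pi s)\,x$, the amplitude $A$, and the helper names are assumptions made for this sketch and are not taken from the article. Without noise the particle never leaves its starting well; with noise it switches wells.

```python
import numpy as np

def dU_dx(s, x, amplitude=0.1):
    """Slope of the assumed potential U(s, x) = x^4/4 - x^2/2 + amplitude*cos(2 pi s)*x."""
    return x**3 - x + amplitude * np.cos(2.0 * np.pi * s)

def simulate_diffusion(eps, T, n_periods=4, dt=0.01, x0=-1.0, seed=0):
    """Euler-Maruyama scheme for dX = -dU/dx(t/T, X) dt + sqrt(eps) dW."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(n_periods * T / dt))
    kicks = np.sqrt(eps * dt) * rng.standard_normal(n_steps)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] - dU_dx(k * dt / T, x[k]) * dt + kicks[k]
    return x

def count_well_switches(x, threshold=0.5):
    """Count passages between the regions x < -threshold and x > threshold."""
    state = -1 if x[0] < 0 else 1
    switches = 0
    for value in x:
        if state == -1 and value > threshold:
            state, switches = 1, switches + 1
        elif state == 1 and value < -threshold:
            state, switches = -1, switches + 1
    return switches

frozen = simulate_diffusion(eps=0.0, T=50.0)   # deterministic: stays in the left well
noisy = simulate_diffusion(eps=0.5, T=50.0)    # noise makes barrier crossings possible
```

Varying `eps` at fixed `T` and plotting the paths gives a rough feeling for the three regimes: too little noise, near-periodic switching, and erratic switching.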

In the stationary regime, the velocity equation is solved by the stationary Ornstein–Uhlenbeck process

$$v(t) = v(0)\, e^{-(\gamma_0/m)t} + \frac{\sqrt{2 k_B T \gamma_0}}{m} \int_0^t e^{-(\gamma_0/m)(t-s)}\, dW_s$$

The ratio $\beta := \gamma_0/m$ determines the dynamic behavior. Let us focus on the over-damped situation with large friction and very small mass. Then for $t \gg 1/\beta = \tau$ (relaxation time), the first term in the expression for velocity can be neglected, while the stochastic integral represents a Gaussian process. By integrating, we obtain in the over-damped limit ($\beta \to \infty$) that $v$ and thus $x$ is Gaussian with almost constant mean

$$m(t) = x(0) + \frac{1 - e^{-\beta t}}{\beta}\, v(0) \approx x(0)$$

and covariance close to the covariance of Brownian motion (see Nelson 1967):

$$K(s,t) = \frac{2 k_B T}{\gamma_0} \min(s,t) + \frac{k_B T}{\gamma_0 \beta}\left(-2 + 2e^{-\beta t} + 2e^{-\beta s} - e^{-\beta|t-s|} - e^{-\beta(t+s)}\right) \approx \frac{2 k_B T}{\gamma_0} \min(s,t)$$

Hence, the time-dependent change of the velocity of the Brownian particle can be neglected: the velocity rapidly thermalizes ($\dot v \approx 0$), while the spatial coordinate remains far from equilibrium. In the so-called adiabatic transformation, the evolution of the particle's position is thus given by the transformed Langevin equation

$$dx(t) = \sqrt{\frac{2 k_B T}{\gamma_0}}\, dW_t$$

Let us next suppose that we have a Brownian particle in an external field of force (see Figure 3), generating a potential $U(t,x)$. This leads to the Langevin equation

$$\frac{dx(t)}{dt} = v(t)$$

$$m\, dv(t) = -\gamma_0 v(t)\, dt - \frac{\partial U}{\partial x}(t, x(t))\, dt + \sqrt{2 k_B T \gamma_0}\, dW_t$$

In the over-damped limit, after relaxation time, the adiabatic elimination of the fast variables (Gardiner 2004) leads to an equation similar to the one encountered in the previous section:

$$dx(t) = -\frac{1}{\gamma_0} \frac{\partial U}{\partial x}(t, x(t))\, dt + \sqrt{\frac{2 k_B T}{\gamma_0}}\, dW_t$$

In the particular case of some double-well potential $x \mapsto U(t,x)$ with slow periodic variation, the following patterns of behavior of the solution trajectories will be experienced. If temperature is high, noise has a predominant influence on the motion, and the particle often crosses the barrier separating the two wells during one period. The behavior of the particle does not seem to be periodic but rather chaotic. If temperature is small, the particle stays for a very long time in the starting well, fluctuating weakly around the equilibrium position. It has too low energy to follow the periodic variation of the potential. So in this case too, the trajectories do not look periodic. Between these two extreme situations, there exists a regime of noise intensities for which the energy transmitted by the noise is sufficient to cross the barrier almost twice per period. The parameters are then near the resonance point and the motion exhibits periodic switching (Figure 4).

Transition Criteria and Quasideterministic Motion

Studying stochastic resonance accordingly means looking for the range of regimes for which periodic behavior is enhanced and eventually optimal. The optimal relation between period $T$ and noise

intensity $\varepsilon$ emerges in the small noise limit. To explain this, let us focus on the basic indicator for periodic transitions: the time the Brownian particle needs to exit from the starting well, say the left one. In the "frozen" case, that is, if the time variation of the potential term is eliminated just by freezing it at some time $s$, the asymptotics of the exit time is derived from the classical large deviation theory of randomly perturbed dynamical systems (see Freidlin and Wentzell 1998). Let us assume that $U$ is locally Lipschitz. We denote by $D_l$ (resp. $D_r$) the domain corresponding to the left (resp. right) well and by $\chi$ their common boundary. The law of the first exit time $\tau^\varepsilon_{D_l} = \inf\{t \ge 0,\ X^\varepsilon_t \notin D_l\}$ is described by some particular functional related to large deviation. For $t > 0$, we introduce the "action functional" on the space of continuous functions $C([0,t])$ on $[0,t]$ by

$$S^s_t(\varphi) = \begin{cases} \dfrac{1}{2} \displaystyle\int_0^t \left[\dot\varphi_u + \frac{\partial U}{\partial x}(s, \varphi_u)\right]^2 du, & \text{if } \varphi \text{ is absolutely continuous} \\ +\infty & \text{otherwise} \end{cases}$$

which is non-negative and vanishes on the set of solutions of the ordinary differential equation $\dot x = -(\partial U/\partial x)(s, x)$. Let $x$ and $y \in \mathbb{R}$. In relation with the action functional, we define the quasipotential

$$V_s(x,y) = \inf\{S^s_t(\varphi) : \varphi \in C([0,t]),\ \varphi_0 = x,\ \varphi_t = y,\ t \ge 0\}$$

It represents the minimal work the diffusion starting in $x$ has to do in order to reach $y$. To switch wells, the Brownian particle starting in the left well's bottom $x_l$ has to overcome the barrier. So we let

$$\overline{V}_s = \inf_{y \in \chi} V_s(x_l, y)$$

This minimal work needed to exit from the left well can be computed explicitly, and is seen to be equal to twice its depth. The asymptotic behavior of the exit time is expressed by

$$\lim_{\varepsilon \to 0} \varepsilon \ln E[\tau^\varepsilon_{D_l}] = \overline{V}_s$$

and

$$\lim_{\varepsilon \to 0} P_x\left(e^{(\overline{V}_s - \delta)/\varepsilon} < \tau^\varepsilon_{D_l} < e^{(\overline{V}_s + \delta)/\varepsilon}\right) = 1 \quad \text{for any } \delta > 0$$

The prefactor for the exponential rate, derived by Freidlin and Wentzell (1998), was first given by Eyring and Kramers and then by Bovier et al. (2004).

Let us now assume that the left well is the deeper one at time $s$. If the Brownian particle has enough time to cross the barrier, that is, if $T > e^{\overline{V}_s/\varepsilon}$, then whatever the starting point is, Freidlin (2000) proved that it should stay near $x_l$ in the following sense:

$$\Lambda\left(t \in [0,1] : |X^\varepsilon_{tT} - x_l| > \delta\right) \to 0$$

in probability as $\varepsilon \to 0$. Here $\Lambda$ denotes Lebesgue measure on $\mathbb{R}$. If $T < e^{\overline{V}_s/\varepsilon}$, the time left is not long enough for crossings: the particle stays in the starting well, near the stable equilibrium point:

$$\Lambda\left(t \in [0,1] : |X^\varepsilon_{tT} - (x_l 1_{\{x \in D_l\}} + x_r 1_{\{x \in D_r\}})| > \delta\right) \to 0$$

This observation is at the basis of Freidlin's law of quasideterministic periodic motion discussed in the subsequent section. The lesson it teaches is this: to observe switching of the position to the energetically most favorable well, $T$ should be larger than some critical level $e^{\lambda/\varepsilon}$. Measuring time in exponential scales by $\mu$ through the equation $T^\varepsilon = e^{\mu/\varepsilon}$, the condition becomes $\mu > \lambda$.

Stochastic Resonance for Landscapes Frozen on Half-Periods

This particular case has analytical advantages, since it allows one to employ classical techniques of semigroup and operator theory. The situation is the following: let $U$ be a double-well potential with minima $x_l = -1$ and $x_r = 1$ and a saddle point at the origin. We assume that $U(x) \to \infty$ as $|x| \to \infty$ and $U(-1) = -V/2 = -V_l/2$, $U(1) = -v/2 = -V_r/2$, $U(0) = 0$, and $0 < v < V$. We define the 1-periodic potential by $U(t,x) = U(t + 1/2, -x)$. Hence on each half-period the corresponding diffusion is time homogeneous. The critical level is then easily defined by $\lambda = v$, that is, twice the depth of the shallow well. By letting

$$\phi(t) = \begin{cases} -1 & \text{for } t \in [k, k + \tfrac{1}{2}) \\ 1 & \text{for } t \in [k + \tfrac{1}{2}, k + 1), \quad k = 0, 1, 2, \ldots \end{cases}$$

which describes the location of the global minimum of the potential, we get in the small noise limit

$$\Lambda\left(t \in [0,1] : |X^\varepsilon_{tT} - \phi(t)| > \delta\right) \to 0$$

in probability as $\varepsilon \to 0$. This result expresses Freidlin's law of quasideterministic motion: for large periods, the trajectories of the particle approach a periodic deterministic function. But the sense in which this notion measures periodicity does not take into account that for large periods short excursions to the wrong well may occur in an erratic way without counting much for Lebesgue measure of time. In fact, if the period is too large, that is, $\mu > V$, the time available in one period permits the exit of not only the shallow well but also that of the deep well. So, whatever the starting position of

the particle is, the number of observed transitions in one half period becomes very large. Indeed the first time $\sigma$ the particle starting in $x_l$ hits again $x_l$ after visiting the position $x_r$ satisfies

$$E(\sigma) \approx e^{v/\varepsilon} + e^{V/\varepsilon} < T^\varepsilon = e^{\mu/\varepsilon}$$

The motion of the particle appears more chaotic than periodic: noise intensity is too large compared to period length. We avoid this range of chaotic spontaneous transitions by defining the resonance interval $I_R = [v, V]$ as the range of admissible energy parameters $\mu$ for randomly periodic behavior. In this regime, the trajectories possess periodicity properties. In these terms the resonance point describes the tuning rate $\mu_R \in I_R$ for which the stochastic response to weak external periodic forcing is optimal. To make sense, this point has to refer to some measure of quality for periodicity of random trajectories. In the huge physics literature concerning resonance, two families of criteria can be distinguished. The first one is based on invariant measures and spectral properties of the infinitesimal generator associated with the diffusion $X^\varepsilon$. Now, $X^\varepsilon$ is not Markovian and consequently does not admit invariant measures. But by taking into account deterministic motion of time in the interval of periodicity and considering the process $Z_t = (t \bmod T^\varepsilon, X_t)$, we obtain a Markov process with an invariant measure $\nu_t(x)\,dx$. In other words, the law of $X_t \sim \nu_t(x)\,dx$ and the law of $X_{t+T} \sim \nu_{t+T}(x)\,dx$, under this measure, are the same for all $t \ge 0$. Let us present the most important ones:

- the spectral power amplification (SPA), which plays an eminent role in the physics literature, describes the energy carried by the spectral component of the averaged trajectories of $X^\varepsilon$ corresponding to the period:

$$M_{SPA}(\varepsilon, T) = \left|\int_0^1 E_\nu[X^\varepsilon_{sT}]\, e^{2\pi i s}\, ds\right|^2$$

- the SPA-to-noise ratio, giving the ratio of the amplitude of the response and the noise intensity, which is also related to the signal-to-noise ratio:

$$M_{SPN}(\varepsilon, T) = M_{SPA}(\varepsilon, T)/\varepsilon^2$$

- the total energy of the averaged trajectories

$$M_{EN}(\varepsilon, T) = \int_0^1 \left(E_\nu[X_{sT}]\right)^2 ds$$

The second family of criteria is more probabilistic. It refers to quality measures based on transition times between the domains of attraction of the local minima, residence times distributions measuring the time spent in one well between two transitions, or interspike times. This family is certainly less popular in the physics community.

Figure 5 Resonance pictures for Markov chain.

However, measures related to invariant measures may suffer from robustness deficiency (Imkeller and Pavlyukevich 2002). To explain what we mean by robustness, let us introduce a model reduction first discussed by McNamara and Wiesenfeld (1989). Instead of studying the diffusion $X^\varepsilon$ in the double-well landscape, they introduce a two-state Markov chain $Y^\varepsilon$ (Figure 5), the dynamics of which just takes account of the domain of attraction the diffusion is in, and therefore with state space $\{-1, 1\}$. A reasonable choice of the infinitesimal generator should retain the dynamics of the diffusion's transitions characterized by Kramers' rate. We may take

$$Q(t) = \begin{pmatrix} -\varphi & \varphi \\ \psi & -\psi \end{pmatrix}, \quad 0 \le t < \frac{T}{2}$$

$$Q(t) = \begin{pmatrix} -\psi & \psi \\ \varphi & -\varphi \end{pmatrix}, \quad \frac{T}{2} \le t < T$$

periodically continued on $\mathbb{R}_+$. Here, $\varphi = p\, e^{-V/\varepsilon}$ and $\psi = q\, e^{-v/\varepsilon}$. The prefactors of subexponential order are beyond the scope of large deviation theory. They are related to the curvature of the potential in the

minima and the saddle point of the landscape and given by

$$p = \frac{1}{2\pi} \sqrt{U''(-1)\,|U''(0)|}, \qquad q = \frac{1}{2\pi} \sqrt{U''(1)\,|U''(0)|}$$

On the intervals $[kT/2, (k+1)T/2[$, $k \ge 0$, the Markov chain $Y^\varepsilon$ is time-homogeneous and its transition probabilities can be expressed in terms of $\varphi$ and $\psi$. For instance, the probability with which the chain jumps from state $-1$ to state $+1$ in the time window $[t, t+h]$ equals $\varphi h + o(h)$, if this time interval is contained in $[kT/2, (k+1)T/2[$ for some even $k$. The stationary measure of the Markov chain, denoted by $\nu$, can be explicitly calculated, and so can the classical quality measures based on the spectral notions. For instance, the spectral power amplification coefficient equals

$$M_{SPA}(\varepsilon, T) = \left|\int_0^1 E_\nu[Y^\varepsilon_{sT}]\, e^{2\pi i s}\, ds\right|^2 = \frac{4}{\pi^2}\, \frac{T^2 (\varphi - \psi)^2}{(\varphi + \psi)^2 T^2 + \pi^2}$$

This simple expression admits asymptotically a unique maximum which exhibits the resonance point:

$$T^\varepsilon_{opt} = \frac{\pi}{\sqrt{2pq}} \sqrt{\frac{v}{V - v}}\; e^{(V+v)/2\varepsilon} \left\{1 + O\left(e^{-(V-v)/\varepsilon}\right)\right\}$$

The optimal period is then exponentially large, as was suggested by large deviation theory, and the growth rate is the sum of the two wells' depths. The simple Markov chain model is popular since the usual physical quantities are easily computable and since it is believed to mimic the dynamics of a Brownian particle in the corresponding double-well landscape. However, the models are not as similar as expected (Freidlin 2003). Indeed, in a reasonably large time window around the resonance point for $Y^\varepsilon$, the tuning picture of the spectral power amplification for the diffusion is different. Under weak regularity conditions on the potential, it exhibits strict monotonicity in the window. Hence, optimal tuning points for diffusion and Markov chain differ essentially. In other words, the SPA tuning behavior of the diffusion is not robust for passage to the reduced model. This strange deficiency is difficult to explain. The main reason for this subtle effect appears to be that the diffusive nature of the Brownian particle is neglected in the reduced model. In order to point out this feature, we may compute the SPA coefficient of $g(X^\varepsilon)$, where $g$ is a particular function designed to cut out the small fluctuations of the diffusion in the neighborhood of the bottoms of the wells, by identifying all states there. So $g(x) = -1$ (resp. $1$) in some neighborhood of $-1$ (resp. $1$) and otherwise $g$ is the identity. This results in

$$\widetilde{M}_{SPA}(\varepsilon, T) = \left|\int_0^1 E_\nu[g(X^\varepsilon_{sT})]\, e^{2\pi i s}\, ds\right|^2$$

In the small noise limit this quality function admits a local maximum close to the resonance point of the reduced model: the growth rate of $T^\varepsilon_{opt}$ is also given by the sum of the wells' depths. So the lack of robustness seems to be due to the small fluctuations of the particle in the wells' bottoms. In any case, this clearly calls for other quality measures to be used to transfer properties of the reduced model to the original one. Our discussion indicates that due to their emphasis on the pure transition dynamics, the second family of quality measures should be used. For these notions there is no need to restrict to landscapes frozen in time-independent potential states on half-period intervals.

Stochastic Resonance for Continuously Varying Landscapes

From now on the potential $U(t,x)$ is supposed to be continuously varying in $(t,x)$. For simplicity, its local minima are assumed to be located at $\pm 1$, and its only saddle point at 0, independently of time. So the only meta-stable states on the whole time axis are $\pm 1$. Let us denote by $\Delta_-(t)$ (resp. $\Delta_+(t)$) the depth of the left (resp. right) well at time $t$. Together with $U$, these functions are continuous and 1-periodic. Assume that they are strictly monotone between their global extrema. Let us now come back to the motion of a Brownian particle in this landscape. The exit time law by Eyring–Kramers–Freidlin entails that trajectories get close to the global minimum, if the period is large enough. Stated as before in exponential rates $T = e^{\mu/\varepsilon}$, with $\mu \ge \max_{i=\pm} \sup_{t \ge 0} 2\Delta_i(t)$, that is, $\mu$ exceeds the maximal work needed to cross the barrier, the particle often switches between the two wells and should stay close to the deepest position in the landscape. This position being described by the function $\phi(t) = 2 \cdot 1_{\{\Delta_+(t) > \Delta_-(t)\}} - 1$, we get in the small noise limit

$$\Lambda\left(t \in [0,1] : |X^\varepsilon_{tT} - \phi(t)| > \delta\right) \to 0$$

in probability. But on these long timescales, many short excursions to the wrong well are observed, and trajectories look chaotic instead of periodic. So we

have to look at smaller periods even at the cost that the particle may not stay close to the global minimum. Let us study the transition dynamics. Assume that the starting point is $-1$, corresponding to the bottom of the deep well. If the depth of the well is always larger than $\mu = \varepsilon \log T^\varepsilon$, the particle has too little time during one period to climb the barrier, and should stay in the starting well. If, on the contrary, the minimal work to leave the starting well, given by $2\Delta_-(s)$, becomes smaller than $\mu$ at some time $s$, then the transition can and will happen. More formally, for $\mu \in [\inf_{t \ge 0} 2\Delta_-(t), \sup_{t \ge 0} 2\Delta_-(t)]$, we define (Figure 6)

$$a^-_\mu(s) = \inf\{t \ge s : 2\Delta_-(t) \le \mu\}$$

The first transition time from $-1$ to $1$, denoted $\tau_+$, has the following asymptotic behavior as $\varepsilon \to 0$: $\tau_+/T^\varepsilon \to a^-_\mu(0)$. At the second transition the particle returns to the starting well. If $a^+_\mu$ is defined analogously with respect to the depth function $\Delta_+$, this transition will occur near the deterministic time $a^+_\mu(a^-_\mu(0))\,T^\varepsilon$. In order to observe periodicity, and to exclude chaoticity from all parts of its trajectories, the particle has to stay for some time in the other well before returning. This will happen under the assumption $2\Delta_+(a^-_\mu(0)) > \mu$, that is, the right well is the deep one at transition time. In fact, we can define the resonance interval $I_R$ (Figure 7), as the set of all scales $\mu$ for which trajectories exhibit periodicity in the small noise limit, by

$$I_R = \left[\max_{i=\pm} \inf_{t \ge 0} 2\Delta_i(t),\ \inf_{t \ge 0} \max_{i=\pm} 2\Delta_i(t)\right]$$

Figure 6 Definition of $a^-_\mu$.

Figure 7 Resonance interval.

On this interval they get close to deterministic periodic ones. Again, periodicity is quantified by a quality measure, to be maximized in order to obtain resonance as the best possible response to periodic forcing. One interesting measure is based on the probability that random transitions happen in some small time window around a deterministic time, in the small noise limit (Herrmann and Imkeller 2005). Formally, for $h > 0$, the measure gives

$$M_h(\varepsilon, T) = \min_{i=\pm} P_i\left(\tau_{-i}/T^\varepsilon \in [a^i_\mu - h, a^i_\mu + h]\right)$$

where $P_i$ is the law of the diffusion starting in $i$. In the small noise limit, this quality measure tends to 1, and optimal tuning can be related to the exponential rate at which this happens. This is due to the following large deviations principle:

$$\lim_{\varepsilon \to 0} \varepsilon \log(1 - M_h(\varepsilon, T)) = \max_{i=\pm} \{\mu - 2\Delta_i(a^i_\mu - h)\}$$

for $\mu \in I_R$, with uniform convergence on each compact subset of $I_R$. The result is established using classical large-deviation techniques applied to locally time homogeneous approximations of the diffusion. Maximizing the transition probability in the time window position means minimizing the default rate obtained by the large deviations principle. This can be easily achieved. In fact, if the window length $2h$ is small, then $\mu - 2\Delta_i(a^i_\mu - h) \approx 2h\Delta'_i(a^i_\mu)$, since $2\Delta_i(a^i_\mu) = \mu$ by definition. The value $\Delta'_i(a^i_\mu)$ is negative, so we have to find the position where its absolute value is maximal. In this position the depth of the starting well has the most rapid drop under the level $\mu$, characterizing the link between the noise intensity and the period. So the transition time is best concentrated around it.

It is clear that a good candidate for the resonance point is given by the limit, if it exists, of the global minimizer $\mu_R(h)$ as the window length $h$ tends to 0. This limit is therefore called the resonance point of the diffusion with time-periodic landscape $U$. Let us note that for sinusoidal depth functions

$$\Delta_-(t) = \frac{V + v}{4} + \frac{V - v}{4} \cos(2\pi t)$$

and $\Delta_+$ given by a phase shift of $\Delta_-$, the optimal tuning is given by $T^\varepsilon = \exp(\mu_R/\varepsilon)$ with $\mu_R = (v + V)/2$. This optimal rate is equivalent to the optimal rate given by the SPA coefficient of the reduced dynamics' Markov chain in the preceding section.

The big advantage of the quality measure $M_h$ is its robustness. Indeed, consider the reduced model

consisting of a two-state Markov chain with infinitesimal generator

$$Q(t) = \begin{pmatrix} -\varphi(t) & \varphi(t) \\ \psi(t) & -\psi(t) \end{pmatrix}$$

where $\varphi(t) = \exp\{-2\Delta_-(t/T)/\varepsilon\}$ and $\psi(t) = \exp\{-2\Delta_+(t/T)/\varepsilon\}$. The law of transition times of this Markov chain is readily computed from Laplace transforms. Normalized by $T^\varepsilon$ it converges to $a^i_\mu$. This calculation even reveals a rigorous underlying pattern for the second- and higher-order transition times interpreting the interspike distributions of the physics literature. The dynamics of diffusion and Markov chain are similar. Resonance points provided by $M_h$ for the diffusion and its analog for the Markov chain agree.

Related Notions: Synchronization

In the preceding sections, we interpreted stochastic resonance as optimal response of a randomly perturbed dynamical system to weak periodic forcing, in the spirit of the physics literature (see Gammaitoni et al. (1998)). Our crucial assumption concerned the barrier heights a Brownian particle has to overcome in the potential landscape of the dynamical system: it is uniformly lower bounded in time. Measures for the quality of tuning were based on essentially two concepts: one concerning spectral criteria, with the spectral power amplification as most prominent member, the other one concerning the pure transitions dynamics between the domains of attraction of the local minima. A number of different criteria can be used to create an optimal tuning between the intensity of the noise perturbation and the large period of the dynamical system. The relations have to be of an exponential type $T = \exp(\mu/\varepsilon)$, since the Brownian particle needs exponentially long times to cross the barrier separating the wells according to the Eyring–Kramers–Freidlin transition law. Our barrier height assumption seems natural in many situations, but can fail in others. If it becomes small periodically, and eventually scales with the noise-intensity parameter, the Brownian particle does not need to wait an exponentially long time to climb it. So periodicity obtains for essentially smaller timescales. In this setting, the slowness of periodic forcing may also be assumed to be essentially subexponential in the noise intensity.

If it is fast enough to allow for substantial changes before large deviation effects can take over, we are in the situation of Berglund and Gentz (2002). They in fact consider the case in which the barrier between the wells becomes low twice per period, to the effect of modulating periodically a bifurcation parameter: at time zero the right-hand well becomes almost flat, and at the same time the bottom of the well and the saddle approach each other; half a period later, a spatially symmetric scenario is encountered. In this situation, there is a threshold value for the noise intensity under which transitions become unlikely. Above this threshold, the trajectories typically contain two transitions per period. Results are formulated in terms of concentration properties for random trajectories. The intuitive picture is this: with overwhelming probability, sample paths will be concentrated in spacetime sets scaling with the small parameters of the problem. In higher dimensions, these sets may be given by adiabatic or center manifolds of the deterministic system, which allow model reduction of higher-dimensional systems to lower-dimensional ones. Asymptotic results hold for any choice of the small parameters in a whole parameter region. A passage to the small noise limit as for optimal tuning in the preceding sections is not needed.

Related problems studied by Berglund and Gentz in the multidimensional case concern the noise-induced passage through periodic orbits, where unexpected phenomena arise. Here, as opposed to the classical Freidlin–Wentzell theory, the distribution of first-exit points depends nontrivially on the noise intensity. Again aiming at results valid for small but nonvanishing parameters in subexponential scale ranges, they investigate the density of first-passage times in a large regime of parameter values, and obtain insight into the transition from the stochastic resonance regime into the synchronization regime.

See also: Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Magnetic Resonance Imaging; Spectral Theory for Linear Operators; Stochastic Differential Equations.

Further Reading

Benzi R, Sutera A, and Vulpiani A (1981) The mechanism of stochastic resonance. Journal of Physics A 14: L453–L457.
Berglund N and Gentz B (2002) A sample-paths approach to noise-induced synchronization: stochastic resonance in a double-well potential. Annals of Applied Probability 12(4): 1419–1470.
Bovier A, Eckhoff M, Gayrard V, and Klein M (2004) Metastability in reversible diffusion processes. I. Sharp asymptotics for capacities and exit times. Journal of the European Mathematical Society 6(4): 399–424.
Freidlin MI (2000) Quasi-deterministic approximation, metastability and stochastic resonance. Physica D 137(3–4): 333–352.
Freidlin M (2003) Noise sensitivity of stochastic resonance and other problems related to large deviations. In: Sri Namachchivaya and Lin YK (eds.) IUTAM Symposium on Nonlinear Stochastic Dynamics, Solid Mech. Appl, vol. 110, pp. 43–55. Dordrecht: Kluwer Academic.
Freidlin MI and Wentzell AD (1998) Random Perturbations of Dynamical Systems, 2nd edn. New York: Springer.
Gammaitoni L, Hänggi P, Jung P, and Marchesoni F (1998) Stochastic resonance. Reviews of Modern Physics 70(1): 223–287.
Gardiner CW (2004) Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences. Springer Series in Synergetics, 3rd edn., vol. 13. Berlin: Springer.
Herrmann S and Imkeller P (2005) The exit problem for diffusions with time-periodic drift and stochastic resonance. Annals of Applied Probability 15(1A): 39–68.
Imkeller P and Pavlyukevich I (2002) Model reduction and stochastic resonance. Stochastics and Dynamics 2(4): 463–506.
Mazo RM (2002) Brownian Motion: Fluctuations, Dynamics, and Applications. International Series of Monographs on Physics, vol. 112. New York: Oxford University Press.
McNamara B and Wiesenfeld K (1989) Theory of stochastic resonance. Physical Review A 39(9): 4854–4869.
Nelson E (1967) Dynamical Theories of Brownian Motion. Princeton, NJ: Princeton University Press.
Schweitzer F (2003) Brownian Agents and Active Particles, Springer Series in Synergetics, (Collective dynamics in the natural and social sciences, With a foreword by J. Doyne Farmer). Berlin: Springer.
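As a numerical companion to the article, the closed-form spectral power amplification coefficient of the two-state Markov chain and the stated optimal period $T^\varepsilon_{opt}$ can be checked against each other. The sketch below is illustrative only: the function names and the parameter values ($V = 2$, $v = 1$, $p = q = 1$) are assumptions, and the prefactors are not computed from any actual potential. It fixes the period at $T^\varepsilon_{opt}$ for a reference noise level and then searches a grid for the noise intensity that maximizes the coefficient.

```python
import numpy as np

def spa_two_state(eps, T, V=2.0, v=1.0, p=1.0, q=1.0):
    """SPA coefficient of the two-state chain with rates phi = p e^{-V/eps}, psi = q e^{-v/eps}."""
    phi = p * np.exp(-V / eps)
    psi = q * np.exp(-v / eps)
    return (4.0 / np.pi**2) * T**2 * (phi - psi)**2 / ((phi + psi)**2 * T**2 + np.pi**2)

def optimal_period(eps, V=2.0, v=1.0, p=1.0, q=1.0):
    """Leading-order optimal period: pi/sqrt(2 p q) * sqrt(v/(V - v)) * exp((V + v)/(2 eps))."""
    return np.pi / np.sqrt(2.0 * p * q) * np.sqrt(v / (V - v)) * np.exp((V + v) / (2.0 * eps))

eps_ref = 0.1                              # reference noise intensity (assumed)
T = optimal_period(eps_ref)                # period tuned to eps_ref
eps_grid = np.linspace(0.05, 0.2, 15001)   # candidate noise intensities
eps_best = eps_grid[np.argmax(spa_two_state(eps_grid, T))]
```

If the two formulas are consistent, `eps_best` should lie very close to `eps_ref`; the remaining gap reflects the grid resolution and the exponentially small correction term.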

Strange Attractors see Lyapunov Exponents and Strange Attractors

String Field Theory


L Rastelli, Princeton University, Princeton, NJ, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

String field theory (SFT) is the second-quantized approach to string theory. In the usual, first-quantized, formulation of string perturbation theory, one postulates a recipe for the string S-matrix in terms of a sum over two-dimensional (2D) world sheets embedded in spacetime. Very schematically,

$$\langle\langle V_1(k_1) \ldots V_n(k_n) \rangle\rangle = \sum_{\text{topologies}} g_s^{-\chi} \int [d\mu_\alpha]\, \langle V_1(k_1) \ldots V_n(k_n) \rangle_{\{\mu_\alpha\}} \qquad [1]$$

Here the left-hand side stands for the S-matrix of the physical string states $\{V_a(k_a)\}$. The symbol $\langle \ldots \rangle_{\{\mu_\alpha\}}$ denotes a correlation function on the 2D world sheet, which is a punctured Riemann surface of Euler number $\chi$ and given moduli $\{\mu_\alpha\}$. In SFT, one aims to recover this standard prescription from the Feynman rules of a second-quantized spacetime action $S[\Phi]$. The string field $\Phi$, the fundamental dynamical variable, can be thought of as an infinite-dimensional array of spacetime fields $\{\phi^i(x^\mu)\}$, one field for each basis state in the Fock space of the first-quantized string.

The most straightforward way to construct $S[\Phi]$ uses the unitary light-cone gauge. Light-cone SFT is an almost immediate transcription of Mandelstam's light-cone diagrams in a second-quantized language. While often useful as a bookkeeping device, light-cone SFT seems unlikely to represent a real improvement over the first-quantized approach. By contrast, from our experience in ordinary quantum field theory, we should expect Poincaré-covariant SFTs to give important insights into the issues of vacuum selection, background independence and the nonperturbative definition of string theory.

Covariant SFT actions are well established for the open (Witten 1986), closed (Zwiebach 1993) and open/closed (Zwiebach 1998) bosonic string. These theories are based on the BRST formalism, where the world sheet variables include the $bc$ ghosts introduced in gauge-fixing the world sheet metric to the conformal gauge $g_{ab} \sim \delta_{ab}$. (An alternative approach (Hata et al.), based on covariantizing light-cone SFT, will not be described in this article.) Much less is presently known for the superstring: classical actions have been established for the Neveu-Schwarz sector of the open superstring (Berkovits 2001) and for the heterotic string (Berkovits et al. 2004).

During the first period of intense activity in SFT (1985–1992), the covariant bosonic actions were constructed and shown to pass the basic test of reproducing the S-matrix [1] to each order in the perturbative expansion. The more recent revival of the subject (since 1999) was triggered by the realization that SFT contains nonperturbative information as well: D-branes emerge as solitonic solutions of the classical equations of motion in

open SFT (OSFT). We can hope that the nonperturbative string dualities will also be understood in the framework of SFT, once covariant SFTs for the superstring are better developed.

In this article, we review the basic formalism of covariant SFT, using for illustration purposes the simplest model: cubic bosonic OSFT. We then briefly sketch the generalization to bosonic SFTs that include closed strings. Finally, we turn to the subjects of classical solutions in OSFT and the physics of the open-string tachyon.

Open Bosonic SFT

The standard formulation of string theory starts with the choice of an on-shell spacetime background where strings propagate. In the bosonic string, the closed-string background is described by a conformal field theory of central charge 26 (the "matter" CFT). The total world sheet CFT is the direct sum of this matter CFT and of the universal ghost CFT, of central charge $-26$. To describe open strings, we must further specify boundary conditions for the string endpoints. The open-string background is encoded in a boundary CFT (BCFT), a CFT defined in the upper-half plane, with conformal boundary conditions on the real axis (see Boundary Conformal Field Theory in this encyclopedia). In modern language, the choice of BCFT corresponds to specifying a D-brane state.

In classical OSFT, we fix the closed-string background (the bulk CFT) and consider varying the D-brane configuration (the boundary conditions). To lowest order in $g_s$, we can neglect the backreaction of the D-brane on the closed-string fields, since this is a quantum effect from the open-string viewpoint. Let us prepare the ground by recalling the standard $\sigma$-model philosophy. To describe off-shell open-string configurations, we should allow for general (not necessarily conformal) boundary conditions. We can imagine proceeding as follows:

1. We choose an initial open-string background, a reference BCFT that we shall call BCFT$_0$. For example, a D$p$ brane in flat 26 dimensions (Neumann boundary conditions on $p+1$ coordinates, Dirichlet on $25-p$ coordinates).

2. We then write a basis of boundary perturbations around this background. Taking, for example, BCFT$_0$ to be a D25 brane in flat space, the world sheet action $S_{WS}$ takes the schematic form

$$S_{WS} = \frac{1}{2\pi\alpha'} \int_{\mathrm{UHP}} \partial X_\mu \bar\partial X^\mu + \int_{\mathbb{R}} \left[ \tilde T(x^\mu) + \tilde A_\mu(x^\mu)\,\partial X^\mu + \tilde B_\mu(x^\mu)\,\partial^2 X^\mu + \cdots \right]$$ [2]

Here to the standard free bulk action (integrated over the upper-half complex plane UHP) we have added perturbations localized on the real axis $\mathbb{R}$. Notice that the basis of perturbations depends on the chosen BCFT$_0$.

3. We interpret the coefficients $\{\tilde\phi_i(x^\mu)\}$ of the perturbations as spacetime fields. (The tilde on $\tilde\phi_i(x)$ serves as a reminder that these fields are not quite the same as the fields $\phi_i(x)$ that will appear in the OSFT action.) We are after a spacetime action $S[\{\tilde\phi_i\}]$ such that solutions of its classical equations of motion correspond to conformal boundary conditions:

$$\frac{\delta S}{\delta \tilde\phi_i} = 0 \ \ (\text{spacetime}) \quad \Longleftrightarrow \quad \beta_i[\{\tilde\phi_j\}] = 0 \ \ (\text{world sheet})$$ [3]

We recognize in [2] the familiar open-string tachyon $\tilde T(x)$ and gauge field $\tilde A_\mu(x)$, which are the lowest modes in an infinite tower of fields. Relevant perturbations on the world sheet (with conformal dimension $h < 1$) correspond to tachyonic fields in spacetime ($m^2 < 0$), whereas marginal world sheet perturbations ($h = 1$) give massless spacetime fields. To achieve a complete description, we must include all the higher massive open-string modes as well, which correspond to nonrenormalizable boundary perturbations ($h > 1$). In the traditional $\sigma$-model approach, this appears like a daunting task. The formalism of OSFT will automatically circumvent this difficulty.

The Open-String Field

In covariant SFT the reparametrization ghosts play a crucial role. The ghost CFT consists of the Grassmann odd fields $b(z)$, $c(z)$, $\bar b(\bar z)$, $\bar c(\bar z)$, of dimensions $(2, 0)$, $(-1, 0)$, $(0, 2)$, $(0, -1)$, respectively. The boundary conditions on the real axis are $b = \bar b$, $c = \bar c$. The state space $\mathcal{H}_{\mathrm{BCFT}_0}$ of the full matter + ghost BCFT can be broken up into subspaces of definite ghost number,

$$\mathcal{H}_{\mathrm{BCFT}_0} = \bigoplus_{G=-\infty}^{\infty} \mathcal{H}^{(G)}_{\mathrm{BCFT}_0}$$ [4]

We use conventions where the $SL(2, \mathbb{R})$ vacuum $|0\rangle$ carries zero ghost number, $G(|0\rangle) = 0$, while $G(c) = +1$ and $G(b) = -1$. As is familiar from the first-quantized treatment, physical open-string states are identified with $G = +1$ cohomology classes of the BRST operator,

$$Q|V_{\rm phys}\rangle = 0, \qquad |V_{\rm phys}\rangle \sim |V_{\rm phys}\rangle + Q|\Lambda\rangle, \qquad G(|V_{\rm phys}\rangle) = +1$$ [5]
where the nilpotent BRST operator $Q$ has the standard expression

$$Q = \frac{1}{2\pi i} \oint \left( c\, T_{\rm matter} + {:}\, b c \partial c\, {:} \right)$$ [6]

Though not a priori obvious, it turns out that the simplest form of the OSFT action is achieved by taking as the fundamental off-shell variable an arbitrary $G = +1$ element of the first-quantized Fock space,

$$|\Phi\rangle \in \mathcal{H}^{(1)}_{\mathrm{BCFT}_0}$$ [7]

By the usual state–operator correspondence of CFT, we can also represent $|\Phi\rangle$ as a local (boundary) vertex operator acting on the vacuum,

$$|\Phi\rangle = \Phi(0)|0\rangle$$ [8]

The open-string field $|\Phi\rangle$ is really an infinite-dimensional array of spacetime fields. We can make this transparent by expanding it as

$$|\Phi\rangle = \sum_i \int d^{p+1}k \; |\Phi_i(k)\rangle\, \phi_i(k_\mu)$$ [9]

where $\{|\Phi_i(k)\rangle\}$ is some convenient basis of $\mathcal{H}^{(1)}_{\mathrm{BCFT}_0}$ that diagonalizes the momentum $k_\mu$. The fields $\phi_i$ are a priori complex. This is remedied by imposing a suitable reality condition on the string field, which will be stated momentarily. Notice that there are many more elements in $\{|\Phi_i(k)\rangle\}$ than in the physical subspace (the cohomology classes of $Q$). Some of the extra fields will turn out to be nondynamical and could be integrated out, but at the price of making the OSFT action look much more complicated.

It is often useful to think of the string field in terms of its Schrödinger representation, that is, as a functional on the configuration space of open strings. Consider the unit half-disk in the upper-half plane, $D_H \equiv \{|z| \le 1,\ \Im z \ge 0\}$, with the vertex operator $\Phi(0)$ inserted at the origin. Impose BCFT$_0$ open-string boundary conditions for the fields $X(z, \bar z)$ on the real axis (here $X(z, \bar z)$ is a short-hand notation for all matter and ghost fields), and boundary conditions $X(\sigma) = X_b(\sigma)$ on the curved boundary of $D_H$, $z = \exp(i\sigma)$, $0 \le \sigma \le \pi$. The path integral over $X(z, \bar z)$ in the interior of the half-disk assigns a complex number to any given $X_b(\sigma)$, so we obtain a functional $\Phi[X_b(\sigma)]$. This is the Schrödinger wave function of the state $\Phi(0)|0\rangle$. Thus, we can think of open-string functionals $\Phi[X_b(\sigma)]$ as the fundamental variables of OSFT. This is as it should be: the first-quantized wave functions are promoted to dynamical fields in the second-quantized theory.

Finally, let us quote the reality condition for the string field, which takes a compact form in the Schrödinger representation:

$$\Phi[X^\mu(\sigma), b(\sigma), c(\sigma)] = \Phi^*[X^\mu(\pi - \sigma), b(\pi - \sigma), c(\pi - \sigma)]$$ [10]

where the superscript $*$ denotes complex conjugation.

The Classical Action

With all the ingredients in place, it is immediate to write the quadratic part of the OSFT action. The linearized equations of motion must reproduce the physical-state condition [5]. This suggests

$$S \sim \langle \Phi | Q | \Phi \rangle$$ [11]

Here $\langle \cdot | \cdot \rangle$ is the usual BPZ inner product of BCFT$_0$, which is defined in terms of a two-point correlator on the disk, as we review below. The ghost anomaly implies that on the disk we must have $G_{\rm tot} = +3$, which happily is the case in [11]. Moreover, since the inner product is nondegenerate, variation of [11] gives

$$Q|\Phi\rangle = 0$$ [12]

as desired. The equivalence relation $|V_{\rm phys}\rangle \sim |V_{\rm phys}\rangle + Q|\Lambda\rangle$ is interpreted in the second-quantized language as the spacetime gauge invariance

$$\delta_\Lambda |\Phi\rangle = Q|\Lambda\rangle, \qquad |\Lambda\rangle \in \mathcal{H}^{(0)}_{\mathrm{BCFT}_0}$$ [13]

valid for the general off-shell field. This equation is a very compact generalization of the linearized gauge invariance for the massless gauge field. Indeed, focusing on the level-zero components, $|\Phi\rangle \sim A_\mu(x)(c\,\partial X^\mu)(0)|0\rangle$ and $|\Lambda\rangle \sim \lambda(x)|0\rangle$, we find $\delta A_\mu(x) = \partial_\mu \lambda(x)$. It is then plausible to guess that the nonlinear gauge invariance should take the form

$$\delta_\Lambda |\Phi\rangle = Q|\Lambda\rangle + |\Phi\rangle * |\Lambda\rangle - |\Lambda\rangle * |\Phi\rangle$$ [14]

where $*$ is some suitable product operation that conserves ghost number

$$* : \ \mathcal{H}^{(n)}_{\mathrm{BCFT}_0} \otimes \mathcal{H}^{(m)}_{\mathrm{BCFT}_0} \to \mathcal{H}^{(n+m)}_{\mathrm{BCFT}_0}$$ [15]

Based on a formal analogy with 3D nonabelian Chern–Simons theory, Witten proposed the cubic action

$$S = -\frac{1}{g_o^2} \left( \frac{1}{2} \langle \Phi | Q | \Phi \rangle + \frac{1}{3} \langle \Phi | \Phi * \Phi \rangle \right)$$ [16]

The string field $|\Phi\rangle$ is analogous to the Chern–Simons gauge potential $A = A_i\, dx^i$, the $*$ product to the $\wedge$ product of differential forms, $Q$ to the exterior derivative $d$, and the ghost number $G$ to the degree
of the form. The analogy also suggests a number of algebraic identities:

$$\begin{aligned}
Q^2 &= 0 \\
\langle QA | B \rangle &= -(-1)^{G(A)} \langle A | QB \rangle \\
Q(A * B) &= (QA) * B + (-1)^{G(A)} A * (QB) \\
\langle A | B \rangle &= (-1)^{G(A) G(B)} \langle B | A \rangle \\
\langle A | B * C \rangle &= \langle B | C * A \rangle \\
A * (B * C) &= (A * B) * C
\end{aligned}$$ [17]

Note in particular the associativity of the $*$-product. It is straightforward to check that this algebraic structure implies the gauge invariance of the cubic action under [14]. A $*$-product satisfying all required formal properties can indeed be defined. The most intuitive presentation is in the functional language. Given an open-string curve $X(\sigma)$, $0 \le \sigma \le \pi$, we single out the string mid-point $\sigma = \pi/2$ and define the left and right "half-string" curves

$$X_L(\sigma) \equiv X(\sigma) \ \ \text{for } 0 \le \sigma \le \frac{\pi}{2}, \qquad X_R(\sigma) \equiv X(\pi - \sigma) \ \ \text{for } \frac{\pi}{2} \le \sigma \le \pi$$ [18]

A functional $\Phi[X(\sigma)]$ can, of course, be regarded as a functional of the two half-strings, $\Phi[X] \to \Phi[X_L, X_R]$. We define

$$(\Phi_1 * \Phi_2)[X_L, X_R] \equiv \int [dY]\; \Phi_1[X_L, Y]\, \Phi_2[Y, X_R]$$ [19]

where $[dY]$ is meant as the functional integral over the space of half-strings $Y(\sigma)$, with $Y(\pi/2) = X_L(\pi/2) = X_R(\pi/2)$. As shown in Figure 1a, two open strings interact (to form a single open string) if and only if the right half of the first string precisely overlaps with the left half of the second string. Associativity is transparent (Figure 1b).

We can now translate this formal construction into the precise CFT language. Very generally, an $n$-point vertex of open strings can be defined by specifying an $n$-punctured disk, that is, a disk with marked points on the boundary (punctures) and a choice of local coordinates around each puncture. Local coordinates are essential since we are dealing with off-shell open-string states. The BPZ inner product (two-point vertex) is given by

$$\langle \Phi_1 | \Phi_2 \rangle \equiv \langle I \circ \Phi_1(0)\; \Phi_2(0) \rangle_{\mathrm{UHP}}, \qquad I(z) = -\frac{1}{z}$$ [20]

The symbol $f \circ \Phi(0)$, where $f$ is a complex map, means the conformal transform of $\Phi(0)$ by $f$. For example, if $\Phi$ is a dimension-$d$ primary field, then $f \circ \Phi(0) = f'(0)^d\, \Phi(f(0))$. If $\Phi$ is nonprimary, the transformation rule will be more complicated and involve extra terms with higher derivatives of $f$. By performing the $SL(2, \mathbb{C})$ transformation

$$w = h(z) \equiv \frac{1 + iz}{1 - iz}$$ [21]

we can represent the two-point vertex as a correlator on the unit disk $D = \{|w| \le 1\}$,

$$\langle \Phi_1 | \Phi_2 \rangle = \langle f_1 \circ \Phi_1(0),\; f_2 \circ \Phi_2(0) \rangle_D, \qquad f_1(z_1) = -h(z_1), \quad f_2(z_2) = h(z_2)$$ [22]

The vertex operators are inserted at $w = -1$ and $w = +1$ on $D$ (see Figure 2a) and correspond to the two open strings at (Euclidean) world sheet time $\tau = -\infty$ (we take $z = \exp(i\sigma + \tau)$). The left half of $D$ is the world sheet of the first open string; the right half of $D$ is the world sheet of the second string. The two strings meet at $\tau = 0$ on the imaginary $w$ axis.

The three-point Witten vertex is given by

$$\langle \Phi_1, \Phi_2, \Phi_3 \rangle \equiv \langle g_1 \circ \Phi_1(0)\; g_2 \circ \Phi_2(0)\; g_3 \circ \Phi_3(0) \rangle_D$$ [23]

where

$$g_1(z_1) = e^{2\pi i/3} \left( \frac{1 + i z_1}{1 - i z_1} \right)^{2/3}, \qquad g_2(z_2) = \left( \frac{1 + i z_2}{1 - i z_2} \right)^{2/3}, \qquad g_3(z_3) = e^{-2\pi i/3} \left( \frac{1 + i z_3}{1 - i z_3} \right)^{2/3}$$ [24]

Figure 1 Midpoint overlaps of open strings. Panels (a) and (b).

Figure 2 Representation of the quadratic and cubic vertices as 2- and 3-punctured unit disks. Panels (a) and (b).
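The maps in [21], [22] and [24] are explicit enough to check numerically. The following is a minimal sketch, not part of the original construction: the function names are illustrative, and the 2/3 power is taken on Python's principal branch, which is an assumption of this sketch (it is harmless on the upper half-disk, where $h(z)$ has non-negative real part). It evaluates where the punctures sit, where the string mid-point $z = i$ lands, and the opening angle of the wedge covered by one local coordinate patch.

```python
import cmath
import math


def h(z):
    """Map [21]: sends the upper half-plane to the unit disk."""
    return (1 + 1j * z) / (1 - 1j * z)


def f1(z):
    """Local coordinate of the first string in the two-point vertex [22]."""
    return -h(z)


def f2(z):
    """Local coordinate of the second string in the two-point vertex [22]."""
    return h(z)


# Phases multiplying h(z)^(2/3) in [24].
PHASES = {
    1: cmath.exp(2j * math.pi / 3),
    2: 1.0 + 0.0j,
    3: cmath.exp(-2j * math.pi / 3),
}


def g(k, z):
    """Local coordinate g_k (k = 1, 2, 3) of the three-point vertex [24].

    Uses the principal branch of the 2/3 power (an assumption of this sketch).
    """
    return PHASES[k] * h(z) ** (2.0 / 3.0)


if __name__ == "__main__":
    print("two-point punctures:", f1(0), f2(0))
    print("three-point punctures:", [g(k, 0) for k in (1, 2, 3)])
    print("string mid-point z = i maps to:", [g(k, 1j) for k in (1, 2, 3)])
    z = 0.4 + 0.3j  # a sample point inside the upper half-disk
    print("phase of g_2 at sample point (degrees):",
          math.degrees(cmath.phase(g(2, z))))
```

In this sketch the three punctures sit at the cube roots of unity, all three mid-points land on $w = 0$, and each half-disk is mapped into a wedge of opening angle $2\pi/3$, so that the three wedges tile the unit disk.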
The 3-punctured disk is depicted in Figure 2b, and describes the symmetric mid-point overlap of the three strings at $\tau = 0$. Finally, the relation between the three-point vertex and the $*$-product is

$$\langle \Phi_1 | \Phi_2 * \Phi_3 \rangle \equiv \langle \Phi_1, \Phi_2, \Phi_3 \rangle$$ [25]

Knowledge of the right-hand side (RHS) in [25] for all $\Phi_1$ allows one to reconstruct the $*$-product. All formal properties [17] are easily shown to hold in the CFT language. This completes the definition of the OSFT action.

Evaluation of the classical action is completely algorithmic and can be carried out for arbitrary massive states, with no fear of divergences, since in all required correlators the operators are inserted well apart from each other.

Quantization

Quantization is defined by the path integral over the second-quantized string field. The first step is to deal with the gauge invariance [14] of the classical action. The gauge symmetry is reducible: not all gauge parameters $\Lambda^{(0)}$ (the superscript labels ghost number) lead to a gauge transformation. This is clear at the linearized level; indeed, if $\Lambda^{(0)} = Q\Lambda^{(-1)}$, then $\delta_{\Lambda^{(0)}} \Phi^{(1)} = Q^2 \Lambda^{(-1)} = 0$. Thus, the set $\{\Lambda^{(0)}\}$ gives a redundant parametrization of the gauge group. Characterizing this redundancy is somewhat subtle, since fields of the form $\Lambda^{(-1)} = Q\Lambda^{(-2)}$ do not really lead to a redundancy in $\Lambda^{(0)}$, and so on, ad infinitum. It is clear that we need to introduce an infinite tower of (second-quantized) ghosts for ghosts.

The Batalin–Vilkovisky formalism is a powerful way to handle the problem. The basic object is the master action $S(\phi^s, \phi^*_s)$, which is a function of the "fields" $\phi^s$ and of the "antifields" $\phi^*_s$. Each field is paired with a corresponding antifield of opposite Grassmanality. ("Grassmanality" is defined to be even or odd: a Grassmann even (odd) field is a commuting (anticommuting) field.) The master action must obey the boundary condition of reducing to the classical action when the antifields are set to zero. (Note that in general the set of fields $\phi^s$ will be larger than the set of fields $\phi_i$ that appear in the classical action.) Independence of the S-matrix from the gauge-fixing procedure is equivalent to the BV master equation

$$\frac{1}{2} \{S, S\} = -\hbar\, \Delta S$$ [26]

The antibracket $\{\ ,\ \}$ and the $\Delta$ operator are defined as

$$\{A, B\} \equiv \frac{\partial_r A}{\partial \phi^s} \frac{\partial_l B}{\partial \phi^*_s} - \frac{\partial_r A}{\partial \phi^*_s} \frac{\partial_l B}{\partial \phi^s}, \qquad \Delta \equiv \frac{\partial_r}{\partial \phi^s} \frac{\partial_l}{\partial \phi^*_s}$$ [27]

where $\partial_l$ and $\partial_r$ are derivatives from the left and from the right. It is often convenient to expand $S$ in powers of $\hbar$, $S = S_0 + \hbar S_1 + \hbar^2 S_2 + \cdots$, with

$$\{S_0, S_0\} = 0, \qquad \{S_0, S_1\} + \{S_1, S_0\} = -2\Delta S_0, \quad \ldots$$ [28]

With these definitions in place, we shall simply describe the answer, which is extremely elegant. In OSFT the full set of fields and antifields is packaged in a single string field $|\Phi\rangle$ of unrestricted ghost number. If we write

$$|\Phi\rangle = |\Phi_-\rangle + |\Phi_+\rangle \quad \text{with } G(\Phi_-) \le 1 \text{ and } G(\Phi_+) \ge 2$$ [29]

all the fields are contained in $|\Phi_-\rangle$ and all the antifields in $|\Phi_+\rangle$. To make the pairing explicit, we pick a basis $\{|\Phi_s\rangle\}$ of $\mathcal{H}_{\mathrm{BCFT}_0}$, and define a conjugate basis $\{|\Phi^C_s\rangle\}$ by

$$\langle \Phi^C_r | \Phi_s \rangle = \delta_{rs}$$ [30]

Clearly, $G(\Phi^C_s) + G(\Phi_s) = 3$. Then

$$|\Phi_-\rangle = \sum_{G(\Phi_s) \le 1} |\Phi_s\rangle\, \phi^s, \qquad |\Phi_+\rangle = \sum_{G(\Phi_s) \le 1} |\Phi^C_s\rangle\, \phi^*_s$$ [31]

Basis states $|\Phi_s\rangle$ with even (odd) ghost number $G(\Phi_s)$ are defined to be Grassmann even (odd). The full string field $|\Phi\rangle$ is declared to be Grassmann odd. It follows that $\phi^s$ is Grassmann even (odd) for $G(\Phi_s)$ odd (even), and that the corresponding antifield $\phi^*_s$ has the opposite Grassmanality of $\phi^s$, as it must be. With this understanding of $|\Phi\rangle$, the classical master action $S_0$ is identical in form to the Witten action [16]! The boundary condition is satisfied; indeed, setting $|\Phi_+\rangle = 0$, the ghost number anomaly implies that only the terms with $G = +1$ survive. The equation $\{S_0, S_0\} = 0$ follows from straightforward manipulations using the algebraic identities [17]. On the other hand, the issue of whether $\Delta S_0 = 0$, or whether instead quantum corrections are needed to satisfy the full BV master equation, is more subtle and has never been fully resolved. The $\Delta$ operator receives singular contributions from the same region of moduli space responsible for the appearance of closed-string poles, which are discussed below. (See Thorn (1989) for a classic statement of this issue.) It seems possible to choose a basis in $\mathcal{H}_{\mathrm{BCFT}_0}$ such that there are no quantum corrections to $S_0$ (Erler and Gross 2004). In the following we shall derive the Feynman rules implied by $S_0$ alone.
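The ghost-number and Grassmann-parity bookkeeping behind the field/antifield pairing can be spelled out in a few lines. The sketch below is only an illustration of the counting rules stated above (conjugate ghost numbers add up to 3, fields sit at ghost number at most 1, and the full string field is Grassmann odd); the function names are hypothetical and nothing here computes an actual BPZ inner product.

```python
def conjugate_ghost_number(g):
    """Ghost number of the conjugate basis state: G(Phi^C_s) + G(Phi_s) = 3."""
    return 3 - g


def is_field_sector(g):
    """Basis states with G <= 1 multiply fields; their conjugates (G >= 2) multiply antifields."""
    return g <= 1


def parity(n):
    """Grassmann parity label of an integer grading."""
    return "even" if n % 2 == 0 else "odd"


def component_parities(g):
    """Parities of the field phi^s and antifield phi*_s for a basis state with G = g <= 1.

    The full string field is Grassmann odd, so each coefficient has the parity
    opposite to that of the basis state it multiplies.
    """
    if not is_field_sector(g):
        raise ValueError("expected a basis state in the field sector (G <= 1)")
    field = parity(g + 1)                              # coefficient of |Phi_s>
    antifield = parity(conjugate_ghost_number(g) + 1)  # coefficient of |Phi^C_s>
    return field, antifield


if __name__ == "__main__":
    for g in range(1, -4, -1):
        print(g, conjugate_ghost_number(g), component_parities(g))
```

For $G(\Phi_s) = 1$ this reproduces the expected statement that the classical fields are commuting while their antifields are anticommuting; for the ghost at $G(\Phi_s) = 0$ the roles are exchanged.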
SFT Diagrams and Minimal Area Metrics

Imposing the Siegel gauge condition $b_0 \Phi = 0$, one finds the gauge-fixed action

$$S_{gf} = -\frac{1}{g_o^2} \left( \frac{1}{2} \langle \Phi | c_0 L_0 | \Phi \rangle + \frac{1}{3} \langle \Phi | \Phi * \Phi \rangle \right) + \langle \beta | b_0 | \Phi \rangle$$ [32]

where $\beta$ is a Lagrange multiplier. The propagator reads

$$\frac{b_0}{L_0} = b_0 \int_0^\infty dT\; e^{-T L_0}$$ [33]

Since $L_0$ is the first-quantized open-string Hamiltonian, $e^{-T L_0}$ is the operator that evolves the open-string wave functions $\Phi[X(\sigma)]$ by Euclidean world sheet time $T$. It can be visualized as a flat rectangular strip of "horizontal" width $\pi$ and "vertical" height $T$. Each propagator comes with an antighost insertion

$$b_0 = \int_0^\pi b(\sigma)$$ [34]

integrated on a horizontal trajectory.

The only elementary interaction vertex is the mid-point three-string overlap, visualized in Figure 3. We are instructed to draw all possible diagrams with given external legs (represented as semi-infinite strips), and to integrate over all Schwinger parameters $T_i \in [0, \infty)$ associated with the internal propagators. The claim is that this prescription reproduces precisely the first-quantized result [1]. This follows if we can show that (1) the OSFT Feynman rules give a unique cover of the moduli space of open Riemann surfaces; (2) the integration measure agrees with the measure in [1]. The latter property holds because the antighost insertion [34] is precisely the one prescribed by the Polyakov formalism for integrating over the moduli $T_i$. To show point (1), we introduce the concept of minimal-area metrics, which has proved very fruitful. (Here and below, our discussion of minimal-area metrics will summarize ideas developed mainly by Zwiebach.) Quite generally, the Feynman rules of an SFT provide us with a cell decomposition of the appropriate moduli space of Riemann surfaces, a way to construct surfaces in terms of vertices and propagators. Given a Riemann surface (for fixed values of its complex moduli), the SFT must associate with it one and only one string diagram. The diagram has more structure than the Riemann surface: it defines a metric on it. In all known covariant SFTs, this is the metric of minimal area obeying suitable length conditions. Consider the following:

Figure 3 The cubic vertex represented as the mid-point gluing of three strips.

Minimal-area problem for open SFT Let $R_o$ be a Riemann surface with at least one boundary component and possibly punctures on the boundary. Find the (conformal) metric of minimal area on $R_o$ such that all nontrivial Jordan open curves have length greater than or equal to $\pi$. (A curve is said to be nontrivial if it cannot be continuously shrunk to a point without crossing a puncture.)

An OSFT diagram (for fixed values of its $T_i$) defines a Riemann surface $R_o$ endowed with a metric solving this minimal-area problem. This is the metric implicit in its picture: flat everywhere except at the conical singularities of defect angle $(n - 2)\pi$ when $n$ propagators meet symmetrically. (For $n = 3$, these are the elementary cubic vertices; for $n > 3$, they are effective vertices, obtained when propagators joining cubic vertices collapse to zero length.) It is not difficult to see both that the length conditions are obeyed, and that the area of the metric cannot be made smaller without violating a length condition. Conversely, any surface $R_o$ endowed with a minimal-area metric corresponds to an OSFT diagram. The idea is that the minimal-area metric must have open geodesics ("horizontal trajectories") of length $\pi$ foliating the surface. The geodesics intersect on a set of measure zero, the "critical graph" where the propagators are glued. Bands of open geodesics of infinite height are the external legs of the diagram, while bands of finite height are the internal propagators.

The single cover of moduli space is then ensured by an existence and uniqueness theorem for metrics solving the minimal-area problem for OSFT. These metrics are seen to arise from Jenkins–Strebel quadratic differentials. Existence shows that the Feynman rules of OSFT generate each Riemann surface $R_o$ at least once. Uniqueness shows that there is no overcounting: since different diagrams correspond to different metrics (by inspection of their picture), no Riemann surface can be generated twice.
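The Schwinger representation [33] of the propagator can be checked on a single mode, where $L_0$ is just a number. The following is a rough numerical sketch under stated assumptions: the cutoff `t_max`, the midpoint rule and the function name are choices made here for illustration, not part of the formalism. It also illustrates a standard caveat: the integral converges only for $L_0 > 0$, so for a mode such as the zero-momentum tachyon, with $L_0 = -1$, the strip integral grows without bound as the cutoff is raised and $1/L_0$ has to be understood by analytic continuation.

```python
import math


def schwinger_integral(l0, t_max=60.0, steps=100_000):
    """Midpoint-rule estimate of the strip integral int_0^{t_max} exp(-T * l0) dT.

    l0 stands for the L0 eigenvalue of a single mode; t_max is a cutoff on the
    strip height T (both are conventions of this sketch).
    """
    dt = t_max / steps
    return sum(math.exp(-(i + 0.5) * dt * l0) for i in range(steps)) * dt


if __name__ == "__main__":
    for l0 in (0.5, 1.0, 2.0):
        print(f"L0 = {l0}: integral = {schwinger_integral(l0):.6f}, 1/L0 = {1 / l0:.6f}")
    # A mode with negative L0: the estimate keeps growing with the cutoff.
    for t_max in (10.0, 20.0):
        print(f"L0 = -1, t_max = {t_max}: integral = {schwinger_integral(-1.0, t_max):.3e}")
```

Long strips ($T \to \infty$) dominate exactly when $L_0 \to 0^+$, which is the numerical counterpart of the statement that poles of the amplitude correspond to propagators of infinite length.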
Closed Strings in OSFT

As is familiar, the open-string S-matrix contains poles due to the exchange of on-shell open and closed strings. The closed-string poles are present in nonplanar loop amplitudes. We have seen that OSFT reproduces the standard S-matrix. Factorization over the open-string poles is manifest: it corresponds to propagator lengths $T_i$ going to infinity. Surprisingly, the closed-string poles are also correctly reproduced, despite the fact that OSFT treats only the open strings as fundamental dynamical variables. In some sense, closed strings must be considered as derived objects in OSFT. Factorizing the amplitudes over the closed-string poles, one finds that on-shell closed-string states can be represented, at least formally, as certain singular open-string fields with $G = +2$, closely related to the (formal) identity string field. The picture is that of a folded open string, whose left and right halves precisely overlap, with an extra closed-string vertex operator inserted at the mid-point. The corresponding open/closed vertex is given by

$$\langle \Psi_{\rm phys} | \Phi \rangle_{OC} \equiv \langle \Psi_{\rm phys}(0)\; I \circ \Phi(0) \rangle_D, \qquad I(z) \equiv \left( \frac{1 + iz}{1 - iz} \right)^2$$ [35]

and describes the coupling to the open-string field of a nondynamical, on-shell closed string $|\Psi_{\rm phys}\rangle$. It is possible to add this open/closed vertex to the OSFT action. Remarkably, the resulting Feynman rules give a single cover of the moduli space of Riemann surfaces with at least one boundary, with open and closed punctures. This is shown using the same minimal-area problem as above, but now allowing for surfaces with closed punctures as well.

We should finally mention that the structure of OSFT emerges frequently in topological string theory, in contexts where open/closed duality plays a central role. Two examples are the interpretation of Chern–Simons theory as the OSFT for the A-model on the conifold, and the interpretation of the Kontsevich matrix integral for topological gravity as the OSFT on FZZT branes in (2, 1) minimal string theory.

Closed Bosonic SFT

The generalization to covariant closed SFT is nontrivial, essentially because the requisite closed-string decomposition of moduli space is much more complicated.

The free theory parallels the open case, with a minor complication in the treatment of the CFT zero modes. The closed-string field is taken to live in a subspace of the matter + ghost state space, $|\Psi\rangle \in \tilde{\mathcal{H}}_{\mathrm{CFT}_0}$, where the tilde means that we impose the subsidiary conditions

$$b_0^- |\Psi\rangle = L_0^- |\Psi\rangle = 0, \qquad b_0^- \equiv b_0 - \bar b_0, \qquad L_0^- \equiv L_0 - \bar L_0$$ [36]

In the classical theory, the string field carries ghost number $G = +2$, since it is the off-shell extension of the familiar closed-string physical states, and the quadratic action reads

$$S \sim \langle \Psi, Q_c \Psi \rangle$$ [37]

Here $Q_c$ is the usual closed BRST operator. The inner product $\langle\ ,\ \rangle$ is defined in terms of the BPZ inner product, with an extra insertion of $c_0^- \equiv c_0 - \bar c_0$,

$$\langle A, B \rangle \equiv \langle A | c_0^- | B \rangle$$ [38]

In [37] $G_{\rm tot} = +6$, as it should be. Without the extra ghost insertion and the subsidiary conditions [36] it would not be possible to write a quadratic action. The linearized equations of motion and gauge invariance,

$$Q_c |\Psi\rangle = 0, \qquad |\Psi\rangle \sim |\Psi\rangle + Q_c |\Lambda\rangle, \qquad |\Lambda\rangle \in \tilde{\mathcal{H}}^{(1)}_{\mathrm{CFT}_0}$$ [39]

give the expected cohomological problem. The fact that the cohomology is computed in the semirelative complex, $b_0^- |\Psi\rangle = b_0^- |\Lambda\rangle = 0$, well known from the operator formalism of the first-quantized theory, is recovered naturally in the second-quantized treatment.

The interacting action is constructed iteratively, by demanding that the resulting Feynman rules give a (unique) cover of moduli space. This requires the introduction of infinitely many elementary string vertices $\mathcal{V}^{g,n}$, where $n$ is the number of closed-string punctures and $g$ the genus. This decomposition of moduli space is more intricate than the decomposition that arises in OSFT, but is in fact analogous to it, when characterized in terms of the following.

Minimal-area problem for closed SFT Let $R_c$ be a closed Riemann surface, possibly with punctures. Find the (conformal) metric of minimal area on $R_c$ such that all nontrivial Jordan closed curves have length greater than or equal to $2\pi$.

The minimal-area metric induces a foliation of $R_c$ by closed geodesics of length $2\pi$. In the classical theory ($g = 0$), the minimal-area metrics arise from Jenkins–Strebel quadratic differentials (as in the open case), and geodesics intersect on a measure-zero set. For $g > 0$, however, there can be foliation bands of geodesics that cross. By staring at the foliation, we can break up the surface into vertices and propagators. In correspondence with each puncture, there is a band of
infinite height, a flat semi-infinite cylinder of circumference $2\pi$, which we identify as an external leg of the diagram. We mark a closed geodesic on each semi-infinite cylinder, at a distance $\pi$ from its boundary. Bands of finite height (internal bands not associated to punctures) correspond to propagators if their height is greater than $2\pi$, otherwise they are considered part of an elementary vertex. Along any internal cylinder of height greater than $2\pi$, we mark two closed geodesics, at a distance $\pi$ from the boundary of the cylinder. If we now cut open all the marked curves, the surface decomposes into a number of semi-infinite cylinders (external legs), finite cylinders (internal propagators) and surfaces with boundaries (elementary interactions). Each elementary interaction of genus $g$ and with $n$ boundaries is an element of $\mathcal{V}^{g,n}$. A crucial point of this construction is that we took care of leaving a "stub" of length $\pi$ attached to each boundary. Stubs ensure that sewing of surfaces preserves the length condition on the metric (no closed curve shorter than $2\pi$).

These geometric data can be translated into an iterative algebraic construction of the full quantum action $S[\Psi]$. The $\mathcal{V}^{g,n}$ satisfy geometric recursion relations whose algebraic counterpart is the quantum BV master equation for $S[\Psi]$. Remarkably, the singularities of the $\Delta$ operator encountered in OSFT are absent here, precisely because of the presence of the stubs. We refer to Zwiebach (1993) for a complete discussion of closed SFT.

Open/Closed SFT

There is also a covariant SFT that includes both open and closed strings as fundamental variables. The Feynman rules arise from the following problem.

Minimal-area problem for open/closed SFT Let $R_{oc}$ be a Riemann surface, with or without boundaries, possibly with open and closed punctures. Find the (conformal) metric of minimal area on $R_{oc}$ such that all nontrivial Jordan open curves have length greater than or equal to $l_o = \pi$, and all nontrivial Jordan closed curves have length greater than or equal to $l_c = 2\pi$.

The surface $R_{oc}$ is decomposed in terms of elementary vertices $\mathcal{V}^{g,n}_{b,m}$ (of genus $g$, $b$ boundary components, $n$ closed-string punctures and $m$ open-string punctures) joined by open and closed propagators. Degenerations of the surface always correspond to propagators becoming of infinite length: factorization is manifest both in the open and in the closed channel.

The SFT described in the section "Closed Strings in OSFT" (Witten OSFT augmented with the single open/closed vertex [35]) corresponds to taking $l_o = \pi$ and $l_c = 0$. Varying $l_c \in [0, 2\pi]$, we find a whole family of interpolating SFTs. This construction clarifies the special status of the Witten theory: moduli space is covered by a single cubic open overlap vertex, with no need to introduce dynamical closed strings, but at the price of a somewhat singular formulation.

Classical Solutions in Open SFT

In the present formulation of SFT, a background (a classical solution of string theory) must be chosen from the outset. The very definition of the string field requires specifying a (B)CFT$_0$. Intuitively, the string field lives in the "tangent" to the "theory space" at a specific point, where "theory space" is some notion of a "space of 2D (boundary) quantum field theories," not necessarily conformal. In the early 1990s independence from the choice of background was demonstrated for infinitesimal deformations: the SFT actions written using neighboring (B)CFTs are indeed related by a field redefinition. In recent years, it has become apparent that at least the open-string field reaches out to open-string backgrounds a finite distance away, possibly covering the whole of theory space. (Classical solutions of closed SFT are beginning to be investigated at the time of this writing (2005).)

The OSFT action written using BCFT$_0$ data is just the full world volume action of the D-brane with BCFT$_0$ boundary conditions. Which classical solutions should we expect in this OSFT? In the bosonic string, D$p$ branes carry no conserved charge and are unstable. This instability is reflected in the presence of a mode with $m^2 = -1/\alpha'$, the open-string tachyon $T(x^\mu)$, $\mu = 0, \ldots, p$. From this physical picture, Sen argued that:

1. the tachyon potential, obtained by eliminating the higher modes of the string field by their equations of motion, must admit a local minimum corresponding to the vacuum with no D-brane at all (henceforth, the tachyon vacuum, $T(x^\mu) = T_0$);

2. the value of the potential at $T_0$ (measured with respect to the BCFT$_0$ point $T = 0$) must be exactly equal to minus the tension of the brane with BCFT$_0$ boundary conditions;

3. there must be no perturbative open-string excitations around the tachyon vacuum; and

4. there must be space-dependent "lump" solutions corresponding to lower-dimensional branes. For example, a lump localized along one world volume direction, say $x^1$, such that $T(x^1) \to T_0$ as $x^1 \to \pm\infty$, is identified with a D$(p-1)$ brane.
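Conjectures 1 and 2 can be illustrated in the crudest possible approximation, in which only the zero-momentum tachyon mode $t\, c_1|0\rangle$ is kept in the cubic action. The sketch below assumes the normalization commonly quoted in the level-truncation literature, $f(t) = 2\pi^2 \left( -\tfrac12 t^2 + \tfrac13 K^3 t^3 \right)$ with $K = 3\sqrt{3}/4$, where $f$ is the potential in units of the brane tension and $\alpha' = 1$. Neither this formula nor the function names below appear in the present article, so they should be read as assumptions of the example. With these conventions conjecture 2 predicts $f = -1$ at the minimum.

```python
import math

# Constant multiplying the cubic coupling in the assumed normalization.
K = 3 * math.sqrt(3) / 4


def f(t):
    """Tachyon-only potential in units of the brane tension (assumed normalization)."""
    return 2 * math.pi ** 2 * (-0.5 * t ** 2 + (K ** 3 / 3) * t ** 3)


def f_prime(t):
    """First derivative of f."""
    return 2 * math.pi ** 2 * (-t + K ** 3 * t ** 2)


def f_second(t):
    """Second derivative of f."""
    return 2 * math.pi ** 2 * (-1 + 2 * K ** 3 * t)


# Nontrivial stationary point of f: f'(t) = 0 with t != 0.
T_CRIT = 1 / K ** 3

if __name__ == "__main__":
    print("stationary point t_c =", T_CRIT)
    print("f'(t_c) =", f_prime(T_CRIT), " f''(t_c) =", f_second(T_CRIT))
    print("f(t_c) =", f(T_CRIT), "(conjectured exact value: -1)")
```

In this truncation the stationary point is a local minimum, as conjecture 1 requires, and the depth of the potential comes out at roughly two-thirds of the conjectured value, with the remainder to be supplied by the higher modes of the string field.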
Sen's conjectures have all been verified in OSFT. (See Sen (2004) and Taylor and Zwiebach (2003) for reviews.) The deceptively simple-looking equations of motion (in Siegel gauge)

$$L_0 |\Phi\rangle + b_0 \left( |\Phi\rangle * |\Phi\rangle \right) = 0$$ [40]

are really an infinite system of coupled equations, and no analytic solutions are known. Turning on a vacuum expectation value (VEV) for the tachyon drives into condensation an infinite tower of modes. Fortunately, the approximation technique of "level truncation" is surprisingly effective. The string field is restricted to modes with an $L_0$ eigenvalue smaller than a prescribed maximal level $L$. For any finite $L$, the truncated OSFT contains a finite number of fields and numerical computations are possible. Numerical results for various classical solutions converge quite rapidly as the level $L$ is increased. The most important solution is the string field $|T\rangle$ that corresponds to the tachyon vacuum. A remarkable feature of $|T\rangle$ is universality: it can be written as a linear combination of modes obtained by acting on the tachyon $c_1|0\rangle$ with ghost oscillators and matter Virasoro operators,

$$|T\rangle = T_0\, c_1|0\rangle + u\, L^m_{-2} c_1|0\rangle + v\, c_{-1}|0\rangle + \cdots$$

This implies that the properties of $|T\rangle$ are independent of any detail of BCFT$_0$, since all computations involving $|T\rangle$ can be reduced to purely combinatorial manipulations involving the ghosts and the Virasoro algebra. The numerical results strongly confirm Sen's conjectures, and indicate that the tachyon vacuum is located at a non-singular point in configuration space. Numerical solutions describing lower-dimensional branes and exactly marginal deformations are also available. For example, the full family of solutions interpolating between a D1 and a D0 brane at the self-dual radius has been found. There is increasing evidence that the open-string field provides a faithful map of the open-string landscape.

Vacuum SFT: D-branes as Projectors

In the absence of a closed-form expression for $|T\rangle$, we are led to guesswork. When expanded around $|T\rangle$, the OSFT is still cubic, only with a different kinetic term $\mathcal{Q}$,

$$S = -\kappa_0 \left( \frac{1}{2} \langle \Phi | \mathcal{Q} | \Phi \rangle + \frac{1}{3} \langle \Phi | \Phi * \Phi \rangle \right)$$ [41]

The operator $\mathcal{Q}$ must obey all the formal properties [17], must be universal (constructed from ghosts and matter Virasoro operators), and must have trivial cohomology at $G = +1$. Another constraint comes from requiring that [41] admits classical solutions in Siegel gauge. The choice

$$\mathcal{Q} = \frac{1}{2i} \left( c(i) - c(-i) \right) = c_0 - (c_2 + c_{-2}) + (c_4 + c_{-4}) - \cdots$$ [42]

satisfies all these requirements. The conjecture (Rastelli et al. 2001) is that, by a field redefinition, the kinetic term around the tachyon vacuum can be cast into this form. This "purely ghost" $\mathcal{Q}$ is somewhat singular (it acts at the delicate string mid-point), and presumably should be regarded as the leading term of a more complicated operator that includes matter pieces as well. The normalization constant $\kappa_0$ is formally infinite. Nevertheless, a regulator (e.g., level truncation) can be introduced, and physical observables are finite and independent of the regulator. The vacuum SFT ([41]–[42]) appears to capture the correct physics, at least at the classical level. Taking a matter/ghost factorized ansatz

$$|\Phi_g\rangle \otimes |\Phi_m\rangle$$ [43]

and assuming that the ghost part is universal for all D-brane solutions, the equations of motion reduce to the following equations for the matter part:

$$|\Phi_m\rangle * |\Phi_m\rangle = |\Phi_m\rangle$$ [44]

A solution $|\Phi_m\rangle$ can be regarded as a projector acting in "half-string space." Recall that the $*$-product looks formally like a matrix multiplication [19]: the matrices are the string fields, whose "indices" run over the half-string curves. These projector equations have been exactly solved by many different techniques (see Rastelli (2004) for a review). In particular, there is a general BCFT construction that shows that one can obtain solutions corresponding to any D-brane configuration, including multiple branes: the rank of the projector is the number of branes. A rank-one projector corresponds to an open-string functional which is left/right split, $\Phi[X(\sigma)] = F_L(X_L)\, F_R(X_R)$. There is also a clear analogy between these solutions and the soliton solutions of noncommutative field theory. The analogy can be made sharper using a formalism that rewrites the open-string $*$-product as the tensor product of infinitely many Moyal products. (See Bars (2002) and references therein.)

It is unclear whether or not multiple-brane solutions (should) exist in the original OSFT; they are yet to be found in level truncation. Understanding this and other issues, like the precise role of closed strings in the quantum theory, seems to require a precise characterization of the allowed
space of open-string functionals. In principle, the path integral over such functionals would define the theory at the full nonperturbative level. This remains a challenge for the future.

Note Added in Proof Very recently, M Schnabl, building on previous work on star algebra projectors and related surface states (Rastelli L (2004) and references therein), was able to find the exact solution for the universal tachyon condensate in OSFT. This breakthrough is likely to lead to rapid new developments in SFT.

See also: Boundary Conformal Field Theory; BRST Quantization; Chern–Simons Models: Rigorous Results; Fedosov Quantization; The Jones Polynomial; Large-N and Topological Strings; Large-N Dualities; Noncommutative Geometry from Strings; Noncommutative Tori, Yang–Mills, and String Theory; Operads; Superstring Theories; Topological Quantum Field Theory: Overview; Two-Dimensional Conformal Field Theory and Vertex Operator Algebras.

Further Reading

Bars I (2002) MSFT: Moyal Star Formulation of String Field Theory. Proceedings of 3rd International Sakharov Conference on Physics, Moscow, Russia, 24–29 Jun, arXiv:hep-th/0211238.
Berkovits N (2001) Review of open superstring field theory, arXiv:hep-th/0105230.
Berkovits N, Okawa Y, and Zwiebach B (2004) WZW-like action for heterotic string field theory. Journal of High Energy Physics 0411: 038 (arXiv:hep-th/0409018).
Bordes J, Chan HM, Nellen L, and Tsou ST (1991) Half string oscillator approach to string field theory. Nuclear Physics B 351: 441.
Erler TG and Gross DJ (2004) Locality, causality, and an initial value formulation for open string field theory, arXiv:hep-th/0406199.
Ohmori K (2001) A Review on Tachyon Condensation in Open String Field Theories. Master’s thesis, University of Tokyo (arXiv:hep-th/0102085).
Okawa Y (2002) Open string states and D-brane tension from vacuum string field theory. Journal of High Energy Physics 0207: 003 (arXiv:hep-th/0204012).
Rastelli L (2004) Open string fields and D-branes. Fortschritte der Physik 52: 302.
Rastelli L, Sen A, and Zwiebach B Vacuum String Field Theory, Proceedings of the Strings 2001 Conference, TIFR, Mumbai, India, arXiv:hep-th/0106010.
Schnabl M (2005) Analytic solution for tachyon condensation in open string theory. arXiv:hep-th/0511286.
Sen A (2004) Tachyon dynamics in open string theory, arXiv:hep-th/0410103.
Shatashvili SL On Field Theory of Open Strings, Tachyon Condensation and Closed Strings, Proceedings of the Strings 2001 Conference, TIFR, Mumbai, India, arXiv:hep-th/0105076.
Siegel W (1988) Introduction To String Field Theory. Advanced Series in Mathematical Physics 8: 1–244 (arXiv:hep-th/0107094).
Taylor W and Zwiebach B (2001) D-branes, tachyons, and string field theory. In: Gubser SS and Lykken JD (eds.) Boulder 2001, Strings, Branes and Extra Dimensions, TASI 2001 Lectures, pp. 641–759. Singapore: World Scientific.
Thorn CB (1989) String field theory. Physics Reports 175: 1.
Witten E (1986) Noncommutative geometry and string field theory. Nuclear Physics B 268: 253.
Zwiebach B (1993) Closed string field theory: quantum action and the B-V master equation. Nuclear Physics B 390: 33 (arXiv:hep-th/9206084).
Zwiebach B (1998) Oriented open–closed string theory revisited. Annals of Physics 267: 193 (arXiv:hep-th/9705241).

String Theory: Phenomenology


A M Uranga, Consejo Superior de Investigaciones Cientificas, Madrid, Spain
© 2006 Elsevier Ltd. All rights reserved.

String Theory and Compactification

The string theory provides a setup in which gauge and gravitational interactions can be described in a unified framework consistently at the quantum level. As such, it provides a candidate theory in which to describe the standard model of particle physics (describing quarks and leptons and their strong and electroweak interactions) and gravity within the same quantum theory.

The string theory has a unique fundamental scale $M_s$, fixed by the string tension, often encoded in the parameter $\alpha'$ of dimension (length)$^2$. All other scales are derived from this one and are background dependent.

Most of the string theory phenomenological model building has centered on the critical superstrings, which are ten dimensional (10D) and involve spacetime (as well as world-sheet) supersymmetry. There are five such different 10D theories: type IIA, type IIB, type I, and the $E_8 \times E_8$ and SO(32) heterotic theories. The heterotic theories include nonabelian gauge fields and charged fermions in ten dimensions; hence, they constitute a promising setup to embed the standard model. On the other hand, the possibility of including D-branes
(which carry nonabelian gauge symmetries and charged matter) in compactifications of type II theories (and orientifolds thereof, like the type I theory itself) makes the latter reasonable alternative setups to embed the standard model as a brane world. The different 10D theories (as well as the 11D M-theory) are related by diverse dualities, upon compactification. This suggests that they are just different limits of a unique underlying theory. For 4D models, this implies that the different classes of constructions are ultimately related by dualities, and that often a given model may be realized using different string theory constructions as starting points.

In order to recover 4D physics at low energies, compactification of the theory is required. In geometrical terms, the theory is required to propagate on a spacetime with geometry $M_4 \times X_6$, where $M_4$ is a 4D Minkowski space, and $X_6$ is a compact manifold. This description is valid in the regime of a large compactification volume, $\alpha'/R^2 \ll 1$ (where $R$ is the overall scale of the compact manifold), where $\alpha'$ string theory corrections are negligible. Other 4D string models may be constructed using abstract conformal field theories. They may often be regarded as extrapolations of geometric compactifications to the regime of sizes comparable with the string length, where string theory corrections are relevant and the classical geometric picture does not hold.

In the simplest situation of geometrical compactification, not including additional backgrounds beyond the metric, the requirement of 4D spacetime supersymmetry (useful for the stability of the model, as well as of phenomenological interest) implies that the space $X_6$ is endowed with an SU(3) holonomy metric. Existence of such metrics is guaranteed for Calabi–Yau spaces, namely Kähler manifolds with vanishing first Chern class.

There are a very large number of 4D supersymmetric string models that can be constructed using different starting string theories and different compactification manifolds. They lead to different 4D spectra, often including nonabelian gauge symmetries and charged chiral fermions (but only rarely resembling the actual standard model). In addition, for each given model, there exist, in general, a large number of massless 4D scalars, known as moduli, whose vacuum expectation values are not fixed. They parametrize different choices of the compactification data in a given topological sector (e.g., Kähler and complex structure moduli of the internal Calabi–Yau space). All physical parameters of the 4D theory vary continuously with the vacuum expectation values of these scalars.

All such models are on equal footing from the point of view of the theory. Hence, 4D string models suffer from a large arbitrariness. Although the breaking of supersymmetry clearly changes the picture qualitatively (e.g., flat directions associated to moduli are lifted by radiative corrections), it is also difficult to evaluate this impact.

In this situation, most of the research in string theory phenomenology has centered on the study of generic properties of certain classes of compactifications, with the potential to lead to realistic structures (such as $\mathcal{N} = 1$ or no supersymmetry, nonabelian gauge symmetries with replicated sets of charged chiral fermions). Within each class, explicit models (as close as possible to the standard model) have also been constructed. Generic predictions or expectations for phenomenology can be obtained within each setup, but quantitative results, even for explicit models, are always functions of undetermined moduli vacuum expectation values. Tractable mechanisms for moduli stabilization are under active research, although only preliminary results are available presently.

The better-studied classes of models are compactifications of heterotic theories on Calabi–Yau spaces, and compactifications of type II theories (or orientifolds thereof) with D-branes. Other possibilities include the heterotic M-theory, the M-theory on $G_2$ holonomy varieties, the F-theory on Calabi–Yau 4-folds, etc. As already mentioned, different classes (or even explicit models) are often related by string duality.

Heterotic String Phenomenology

A large class of phenomenologically interesting string vacua, which has been explored in depth, is provided by 4D compactifications of (any of the two) perturbative heterotic string theories. Compactification on large volume manifolds can be described in the supergravity approximation. As described by Candelas, Horowitz, Strominger, and Witten, the requirement of 4D $\mathcal{N} = 1$ supersymmetry requires the internal manifold to be of SU(3) holonomy, a condition which is satisfied by Calabi–Yau manifolds. In the presence of a curvature, the Bianchi identity for the Kalb–Ramond 2-form $B$ is modified, so that, in general, it reads

$$ dH = \mathrm{tr}\, R^2 - \frac{1}{30}\, \mathrm{tr}\, F^2 $$ [1]

where $H$ is the field strength 3-form, $R$ is the Ricci 2-form, and $F$ is the field strength, in the adjoint representation, of the 10D gauge fields. Regarding the above equation in cohomology leads to a
consistency condition, forcing the background gauge bundle $V$ to be topologically nontrivial, with

$$ c_2(V) = c_2(TX_6) $$ [2]

where $c_2$ denotes the second Chern class, and $TX_6$ is the compactification tangent space.

The condition of supersymmetry implies that the gauge fields must be solutions of the Donaldson–Uhlenbeck–Yau equations. Existence of such a solution is guaranteed for holomorphic and stable gauge bundles. The simplest solution to these conditions is the so-called standard embedding, where the gauge connection is locally identical to the spin connection, but more general solutions exist and have been characterized for particular classes of Calabi–Yau manifolds (e.g., when they are elliptically fibered).

The gauge background bundle $V$, with structure group $H$, breaks the 10D gauge symmetry $G$ to its commutant subgroup $G_{4D}$. The latter corresponds to the 4D gauge symmetry. Moreover, the background bundle modifies the Kaluza–Klein reduction of the 10D charged fermions, leading to a nonzero number of replicated 4D chiral fermions. Decomposing the adjoint representation of $G$ (in which 10D fermions transform) with respect to $G_{4D} \times H$,

$$ \mathrm{Adj}\, G = \bigoplus_i \left( R_{G_{4D},i},\, R_{H,i} \right) $$ [3]

the net number of 4D chiral fermions in the representation $R_{G_{4D},i}$ is given by the index of the Dirac operator coupled to $V$ in the representation $R_{H,i}$. Condition [1] implies proper cancellation of chiral anomalies in the resulting theory. A simple and well-studied class is provided by standard embedding compactifications of the $E_8 \times E_8$ heterotic string theory, whose unbroken 4D gauge group is $E_6 \times E_8$. The number of families (i.e., chiral multiplets in the representation $27$ of $E_6$) and conjugate families (in the $\overline{27}$) are given by the Hodge numbers

$$ n_{27} = h^{1,1}(X_6), \qquad n_{\overline{27}} = h^{2,1}(X_6) $$ [4]

More specifically, the harmonic representatives in each cohomology class represent the internal profile of the corresponding 4D fields. The net number of families is thus determined by the Euler characteristic $\chi(X_6)$

$$ n_{\mathrm{fam}} = |h^{1,1} - h^{2,1}| = \frac{1}{2}\, |\chi(X_6)| $$ [5]

Recently, much progress in heterotic model building has been achieved in nonstandard embedding compactifications by the detailed construction of holomorphic stable bundles and the computation of the diverse indexes. In particular, explicit models with just the minimal supersymmetric standard model spectrum have been constructed.

The above geometric approach has several limitations. On the technical side, the construction of explicit holomorphic and stable gauge bundles is nontrivial from the mathematical viewpoint. On the more fundamental side, it allows one to explore only the large volume limit of heterotic compactifications.

Further insight into the latter aspect can be obtained via constructions based on exactly solvable conformal field theories (CFTs), which describe the world-sheet string dynamics in compactifications, including all $\alpha'$ corrections, and, therefore, allowing one to enter the small volume regime. The simplest such compactifications are provided by toroidal orbifolds, which describe string propagation in quotients of toroidal compactifications by a discrete group $\Gamma$. From the world-sheet viewpoint, they are described by 2D free CFT, but which include sectors of closed strings with boundary conditions twisted by elements of $\Gamma$. The resulting 4D theory contains chiral fermions, arising from the untwisted and twisted sectors. In the former, the nonchiral spectrum of toroidal compactification suffers a projection onto the $\Gamma$-invariant states and leads to chirality. Twisted sectors are localized at the fixed points of the orbifold action, where the local supersymmetry is reduced, leading naturally to chiral fermions.

Many of these models can be regarded as limits of compactifications on Calabi–Yau spaces in the limit in which they become locally flat and develop conical singularities (and similarly, their gauge bundles become locally flat and with curvature localized near the singular points). Indeed, flat directions involving moduli fields in the twisted sector often exist, which correspond to geometric blow-ups of the singular point that resolve the conical singularities to yield a smooth Calabi–Yau. The theories remain simple and solvable for any value of the untwisted moduli (namely moduli of the underlying toroidal compactification). This allows the discussion of their low-energy effective action including the explicit dependence on the untwisted moduli, while only partial results for the dependence on twisted moduli are known.

Other approaches, such as free fermion constructions or Gepner models, also provide exact descriptions of compactifications, although only at a point of the moduli space, deep inside the small volume regime.

Exact CFT constructions provide a small volume description of Calabi–Yau compactifications, at least for particular models. Moreover, their consistency conditions (modular invariance of the partition function) provide a stringy version of the large volume geometric condition implied by eqn [2]. The
constructions also show the existence of full-fledged string theory constructions with properties similar to geometric compactifications, but incorporating all $\alpha'$ corrections.

Within the general class of perturbative heterotic string models, a certain number of phenomenologically interesting statements are quite generic.

• The 4D Planck scale $M_P$ and gauge couplings $g_{YM}$ (at the string scale) are related to the fundamental string scale by

$$ M_s = M_P\, g_{YM} $$ [6]

This implies that the string scale is close to the 4D Planck scale. In this situation, supersymmetry can stabilize the electroweak scale against radiative corrections.

• 4D heterotic models contain certain U(1) symmetries, whose gauge bosons actually get Stückelberg masses due to $B \wedge F$ couplings to components of the 2-form. Such U(1)’s would correspond to global symmetries, but are violated at tree level by $\alpha'$ nonperturbative effects, namely world-sheet instantons. Hence, no continuous global symmetries exist, even perturbatively, in these models. Proton decay might, however, be avoided by discrete global symmetries. In any event, even without such symmetries, the large fundamental scale suppresses the processes mediating proton decay. Thus, the proton lifetime is naturally larger than present experimental bounds.

• Gauge coupling constants for the different gauge factors in the standard model unify at the string scale. This agrees with extrapolation from their electroweak values, assuming the minimal supersymmetric standard model content between the electroweak and string scale, up to a mismatch of scales (by a factor of 20). The latter may be addressed in diverse ways, such as threshold corrections, intermediate scales, or in the heterotic M-theory.

• Yukawa couplings are, in principle, computable. Explicit computations have been carried out in standard embedding geometric compactifications (where they amount to the overlap integral of the internal profiles of the 4D fields, namely a topological intersection number), and in orbifold models. They are in general moduli dependent, so their quantitative analysis is involved. Qualitatively, however, interesting patterns, such as hierarchical structures, are possible, for example, in specific orbifold models.

Heterotic models have been studied beyond the perturbative regime. For instance, the construction of compactifications including nonperturbative objects, namely 5-branes, has been pursued; so has been the strong coupling limit of the $E_8 \times E_8$ heterotic, described by compactifications of the M-theory on an interval (the so-called heterotic M-theory or Hořava–Witten theory). The strong coupling phenomena of the SO(32) heterotic theory can be addressed using dual type I (or other type II orientifold) constructions.

D-Brane Phenomenology

A different setup for realistic string theory compactifications, within the so-called brane-world constructions, is provided by compactifications of type II string theories containing D-branes, or quotients thereof. A particularly relevant class of quotients involves quotienting out by world-sheet parity, accompanied by some $\mathbb{Z}_2$ geometric action. The resulting theories are denoted type II orientifolds, and contain orientifold planes, subspaces fixed under the geometric action, corresponding to regions where the orientation of a string can flip. Type II compactifications with D-branes filling the noncompact dimensions must satisfy a set of consistency conditions, known as RR tadpole cancellation. This is the condition that, in the compact space, the charge of D-branes and orientifold planes under the different RR forms must cancel. For the $\mathbb{Z}$-valued charges, the conditions read

$$ \sum_a N_a Q_a + Q_{Op} = 0 $$ [7]

where $N_a$ denotes the multiplicity of D-branes with charge vector $Q_a$ under the RR fields, and $Q_{Op}$ is the charge vector of the orientifold planes. Additional discrete conditions may be present if the relevant K-theory group (classifying D-brane charges in the corresponding background) contains torsion pieces.

The most familiar example of these constructions is provided by the type I string theory, which is an orientifold quotient of the type IIB theory by world-sheet parity (with no geometric action). The model can be regarded as containing one orientifold 9-plane and 32 D9 branes (all filling out 10D spacetime), such that their RR charges with respect to the (nondynamical) RR 10-form cancel.

Supersymmetric geometric compactifications of type II theories and orientifolds must correspond to compactification on Calabi–Yau spaces in order to have a preserved spinor. Models with D-branes filling the noncompact dimensions may be broadly classified into two classes: type IIB compactifications with D$(3 + 2p)$-branes, wrapped on holomorphic $2p$-cycles, and carrying holomorphic and stable
world-volume gauge bundles, and type IIA compactifications with D6 branes wrapped on special Lagrangian 3-cycles (in general, models with D4 and D8 branes are not allowed since Calabi–Yau spaces do not have nontrivial 1- or 5-cycles on which to wrap the branes). This classification is a large volume realization of the general classification of supersymmetric configurations of D-branes into two classes, denoted A and B.

Intersecting Brane Worlds

Type IIA compactifications with A-branes correspond to compactifications of type IIA theory (or orientifolds thereof) with D6 branes wrapped on 3-cycles of the internal Calabi–Yau space. In these models, each stack of $N$ D6 branes generically leads to a U($N$) gauge factor. Chirality arises from open strings stretched between pairs of branes at the corresponding intersections. The chiral fermions from an open string stretched between branes $a$ and $b$ transform in the bifundamental representation $(\square_a, \overline{\square}_b)$ of the gauge factors U($N_a$) $\times$ U($N_b$) of the intersecting D6 brane stacks. In general, two 3-cycles in a 6D manifold intersect at points of the internal space. Hence, such fermions arise in several families, whose (net) number is given by the (net) number of intersections of the corresponding 3-cycles $\Pi_a$, $\Pi_b$, namely the topological invariant intersection number of their homology classes

$$ I_{ab} = [\Pi_a] \cdot [\Pi_b] $$ [8]

Simple modifications of the above rules arise in some sectors in the presence of orientifold planes (e.g., the reduction of the gauge symmetry from unitary to orthogonal or symplectic factors for branes on top of orientifold planes).

The RR tadpole cancellation conditions specify that the total homological charge carried by the D6 branes (and the orientifold 6-planes) cancel. They imply automatic cancellation of cubic nonabelian anomalies, and the cancellation of mixed U(1) anomalies by a Green–Schwarz mechanism mediated by 4D scalars from the RR closed-string sector.

Explicit models with SM spectrum have been constructed in orientifolds of toroidal compactifications in the nonsupersymmetric case, and in orbifolds thereof in supersymmetric cases. The generalization of the above construction beyond toroidal situations is, in principle, possible, but difficult, due to the mathematically challenging task of constructing special Lagrangian submanifolds for general Calabi–Yau manifolds.

Certain phenomenologically interesting quantities, such as gauge couplings and their threshold corrections, Yukawa couplings, and other diverse correlation functions have been computed in toroidal cases, where the corresponding correlators are computable exactly in $\alpha'$. Particularly interesting is the computation of Yukawa couplings, or, in general, of couplings involving only fields at intersections. These couplings arise from open-string world-sheet instantons, namely disks with boundaries on the D-branes corresponding to those intersections.

Type IIB Orientifolds

Type IIB compactifications with B-type branes contain several familiar classes of 4D models, for instance, compactifications of type I string theory on smooth Calabi–Yau spaces (whose description may be carried out using the effective supergravity action, in close analogy with the heterotic compactifications). Compactifications of type I string theory on orbifolds can be regarded as a particular realization of this, easily described using exact CFTs (although from the viewpoint of the general description as B-branes, the appearance of lower-dimensional branes requires their mathematical description to involve coherent sheaves). Since open strings at orbifolds do not have twisted boundary conditions, chirality arises from the orbifold projection of the toroidally compactified theory on the spectrum.

Another example within this kind is provided by the so-called magnetized D-brane models. These correspond to toroidal compactifications of type I theory, with D9 branes carrying constant magnetic backgrounds for the internal components of the world-volume gauge fields. In this kind of model, although the closed-string sector is highly supersymmetric, the open-string spectrum has reduced supersymmetry, or no supersymmetry (if the bundle stability condition is relaxed). Chirality arises from the nontrivial index of the Dirac operator for open strings ending on D-branes with different world-volume magnetic fields. Explicit models have mainly centered on nonsupersymmetric models from orientifolds of $T^6$, and on supersymmetric models from orientifolds of the $T^6/(\mathbb{Z}_2 \times \mathbb{Z}_2)$ orbifold. In both contexts, models with semirealistic spectra have been obtained: concretely nonsupersymmetric models with just the standard model spectrum, or supersymmetric models with the minimal supersymmetric standard model spectrum, plus nonchiral matter. Further, properties of the gauge coupling constants and the computation of the Yukawa couplings have been studied as functions of undetermined moduli.
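
The net family counting of eqn [8] is easy to make concrete in the toroidal setting used by the explicit models above. The following is a minimal sketch, not taken from the article: it assumes a factorized torus $T^6 = T^2 \times T^2 \times T^2$ and 3-cycles that are products of 1-cycles, each labelled by integer wrapping numbers $(n^i, m^i)$ on the $i$-th 2-torus, for which the intersection number is the product $\prod_i (n_a^i m_b^i - m_a^i n_b^i)$. The names `Cycle`, `intersection_number`, `stack_a` and `stack_b` are hypothetical, the wrapping numbers are illustrative only, and the overall sign (which chirality counts as positive) is a convention.

```python
from typing import Tuple

# Assumed parametrization: a factorizable 3-cycle on T^2 x T^2 x T^2 is given by
# three pairs of wrapping numbers (n_i, m_i), one pair per 2-torus.
Cycle = Tuple[Tuple[int, int], Tuple[int, int], Tuple[int, int]]


def intersection_number(a: Cycle, b: Cycle) -> int:
    """Return I_ab = [Pi_a] . [Pi_b] for two factorizable 3-cycles.

    On each 2-torus the 1-cycles (n_a, m_a) and (n_b, m_b) intersect
    n_a * m_b - m_a * n_b times (counted with sign); the 3-cycle
    intersection number is the product over the three 2-tori.
    """
    result = 1
    for (n_a, m_a), (n_b, m_b) in zip(a, b):
        result *= n_a * m_b - m_a * n_b
    return result


# Illustrative (hypothetical) wrapping numbers for two D6-brane stacks.
stack_a: Cycle = ((1, 0), (1, 0), (1, 0))
stack_b: Cycle = ((1, 3), (1, 1), (1, 1))

# Net number of bifundamental families between the two stacks.
print(intersection_number(stack_a, stack_b))
```

Because each 2-torus factor is antisymmetric under exchange of the two cycles, the product of three factors is antisymmetric as well, so $I_{ab} = -I_{ba}$ and $I_{aa} = 0$ in this sketch. Orientifold images and non-factorizable cycles are deliberately left out.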
Finally, a second large class of models constructed using B-type branes are given by lower-dimensional D-branes, for example, D3 branes, located at singular points in the internal compactification space. Since the massless sector of open strings is determined only in terms of the local structure of the singularity, these models have been mostly studied in noncompact setups. Resulting spectra can be encoded in quiver diagrams, related to those in the mathematical literature on the McKay correspondence. Semirealistic three-family models have been constructed based on systems of D3 and D7 branes at the $\mathbb{C}^3/\mathbb{Z}_3$ orbifold singularity.

Type IIB orientifold compactifications are also intimately related to F-theory compactifications on Calabi–Yau 4-folds, which provide a nonperturbative completion for such models.

Mirror symmetry exchanges type IIB and IIA compactifications with B- and A-type branes. Hence, it provides a map between the above two kinds of compactifications. This shows that type IIB orientifold models lead to spectra with structure similar to that of intersecting-branes worlds, and that they share many of their general properties.

As a particular example, toroidal models of intersecting D6 branes are mapped under mirror symmetry to models of magnetized D9 branes. This mirror map has been exploited to construct the same theories from both starting points and to recover certain quantities, such as the $\alpha'$-exact Yukawa couplings in the IIA picture from a purely classical (no $\alpha'$ corrections) computation in the mirror IIB model. This is a particular application of the general proposal of homological mirror symmetry in compactifications with branes.

Type II orientifold compactifications with D-branes have also been explored beyond the geometric regime, using exact CFTs to describe the (analog of the) internal space, and crosscap and boundary states to describe (the analogs of) orientifold planes and D-branes. Formal developments in the construction of the latter in Gepner models have been successfully applied to obtain large classes of semirealistic 4D string models in this setup.

As compared with heterotic compactifications, the setup of D-brane models leads to several generic features:

• Since gauge sectors are localized on D-branes, and have a dilaton dependence different from gravitational interactions, the relation between the fundamental string scale and the 4D Planck scale and gauge coupling reads

$$ M_P^2\, g_{YM}^2 = \frac{M_s^{11-p}\, V_T}{g_s} $$ [9]

where $V_T$ is a measure of the volume in the directions transverse to the brane, and $g_s$ is the 10D string coupling. The above relation shows that it is possible to achieve large 4D Planck mass with a lower fundamental string scale by adjusting the transverse volume and the string coupling. This has been proposed by Antoniadis, Arkani-Hamed, Dimopoulos, and Dvali as an alternative to explain the Planck/weak hierarchy without supersymmetry.

• The compactifications contain several U(1) gauge symmetries. For some of the corresponding gauge bosons, the 4D effective theory contains Stückelberg masses of order $M_s$, due to $B \wedge F$ couplings to fields in the RR sector. These couplings make the U(1) gauge bosons massive; hence, they are absent from the low-energy physics. Nevertheless, the U(1)’s remain as global symmetries exact in $\alpha'$ and to all orders in the perturbation theory in $g_s$. They are violated by D-brane instantons, which are nonperturbative in $g_s$. In many realistic models, the baryon number is one such global symmetry, and it prevents proton decay, even if the string scale is not large.

• In general, each gauge factor in the standard model arises from a different brane stack, and their gauge couplings at the string scale are controlled by different moduli. This implies that, generically, it is not natural to have gauge coupling unification in D-brane models. Particular models may enjoy enhanced discrete global symmetries at special points in moduli space where unification is achieved, thus making unification appear more natural in such examples. Similar statements apply for constructions which realize complete or partial unification of gauge groups at large scales (like string models of grand unification or of Pati–Salam type).

• As already mentioned, important quantities such as Yukawa couplings are, in principle, computable, although quantitative expressions have been derived only in a few examples, mostly in toroidal compactifications or quotients thereof. The results are moduli dependent, making it difficult to derive model-independent patterns.

M-Theory Phenomenology

Most of the phenomenological models from the M-theory have been constructed using the Hořava–Witten theory (compactification of M-theory on $S^1/\mathbb{Z}_2$) as starting point. This theory provides a description of the strong coupling regime of the $E_8 \times E_8$ heterotic theory, and many of its basic features are similar to those in the perturbative regime. In particular, the techniques used in model
building involve the construction of stable and holomorphic vector bundles and the computation of the relevant indexes to obtain the 4D gauge group and charged matter content. An important difference is that gauge interactions propagate only over the 10D boundaries of spacetime, while gravity propagates over the 11 dimensions. This makes the setup share some features of brane-world constructions, and, in particular, it allows one to lower the fundamental scale of the theory (the 11D Planck scale) to reconcile it with the traditional unification scale.

A different setup for M-theory phenomenology involves the compactification of the 11D theory on a 7-manifold of $G_2$ holonomy $X_7$, in order to lead to N = 1 supersymmetry in four dimensions. Although a fundamental formulation of M-theory is lacking, duality arguments and indirect evidence can be used to show that nonabelian gauge symmetries of the A–D–E classical groups arise if $X_7$ contains 3-cycles of codimension-4 singularities, locally of the form $\mathbb{C}^2/\Gamma$, with $\Gamma$ an A–D–E Kleinian subgroup of SU(2). Similarly, it can be shown that chiral multiplets charged under these gauge symmetries arise if $X_7$ contains certain codimension-7 singularities. The local geometry of the latter has been explicitly described, and can be regarded as lying at the intersections of codimension-4 singularities.

The direct construction of such singular $G_2$ holonomy manifolds is very difficult, and there are no known topological conditions that guarantee the existence of such a metric for a fixed topology. However, the existence of large classes of such models can be indirectly shown by using duality arguments. Namely, any type IIA model of intersecting D6 branes and O6 planes, preserving N = 1 supersymmetry, lifts to an M-theory compactification on a singular $G_2$ holonomy manifold. In fact, the local structure of the codimension-4 and -7 singularities agrees in particular cases with the local structure of D6 branes on 3-cycles and D6 brane intersections.

Further Topics

Some additional topics related to the phenomenology of string theory, but not covered by the above model building description, are discussed in the following.

Effective Actions

The construction of effective actions for such classes of models has been carried out in general in supersymmetric compactifications, using the parametrization of the general 4D N = 1 supergravity action in terms of the Kähler potential for the moduli and matter fields, the gauge kinetic functions, and the superpotential. The moduli action is quite universal, at least for geometric compactifications and for untwisted moduli in orbifold compactifications. For instance, the Kähler potential for the 4D dilaton multiplet S and the modulus T controlling the size of the internal manifold, in the large volume and weak coupling regime, reads

$$K = -\log(S + \bar{S}) - 3\log(T + \bar{T}) \qquad [10]$$

The corresponding expression including matter fields is more model dependent, but known within each particular class.

Moduli Stabilization and Supersymmetry Breaking

Both issues are often related. Although moduli stabilization preserving supersymmetry is possible, it often occurs that the potential stabilizing moduli has its origin in mechanisms related to supersymmetry breaking.

The description of purely string theoretical mechanisms to break supersymmetry is difficult, and most approaches rely on field-theoretical mechanisms in the effective action. One of the better-studied mechanisms, mostly in the heterotic string setup (but also in type II compactifications), is gaugino condensation in a strongly coupled hidden sector, interacting with the standard model sector via gravitational (or perhaps additional gauge) interactions. Although explicit models with such hidden sectors and strong dynamics exist, they often result in runaway potentials for moduli. Racetrack scenarios where several condensates balance each other are possible but contrived.

A second mechanism to break supersymmetry, mostly explored in type IIB/F-theory compactifications, is the introduction of field-strength fluxes for p-form fields. Interestingly, such fluxes lead to nontrivial potentials depending on moduli, and generically breaking supersymmetry. The existence of several remnant flat directions in the leading $\alpha'$, $g_s$ approximation leaves unanswered the question of possible runaway moduli potentials in those directions. However, evidence for nonperturbative contributions stabilizing the remaining moduli at finite distance has been proposed. Preliminary results in the analysis of flux stabilized vacua have been obtained in simple examples of (still unrealistic) Calabi–Yau compactifications with a small number of moduli.

Most explored mechanisms propose supersymmetry breaking below the Kaluza–Klein compactification

scale, and, therefore, can be described in the 4D effective theory. They can be nicely parametrized in terms of vacuum expectation values for the dilaton and geometric moduli of the compactification. This description allows for a computation of the soft terms using the expansion of the N = 1 supergravity formulas in components. Concrete patterns, such as the universality of squark masses, or the complex phases of diverse soft terms, can be explored using this approach.

Alternative mechanisms of breaking supersymmetry at higher scales, such as the introduction of antibranes or nonsupersymmetric compactifications, lead to generic difficulties with stability.

Related to the question of supersymmetry breaking is the question of the cosmological constant. Unfortunately, there is no manifest mechanism in string theory that explains the smallness of the observed value of this scale. Given that many aspects of both quantum gravity in string theory and realistic model building (with proper supersymmetry breaking and moduli stabilization) are still in progress, an open-minded point of view on this problem and the proposed solutions is kept.

Cosmology

Although somewhat different from the traditional focus of string phenomenology, recent progress in observational cosmology has triggered much interest in string theory realizations of inflationary models (or alternatives such as pre-big bang scenarios). Most inflationary models have centered on using moduli as the inflaton field, due to their flat potentials. A simple setup in type II compactifications, known as brane inflation models, uses the modulus controlling a brane position as the inflaton field, which has a flat enough potential with a moderate fine-tuning. Such setups may lead to interesting additional features, such as a moderate but potentially observable density of cosmic strings created in the reheating process.

On the other hand, many interesting questions in string cosmology await further understanding of time-dependent backgrounds in string theory.

Retrospect

It is remarkable that the formal framework of string theory admits tractable solutions with reasonable resemblance to the structure of the standard model. In particular, features such as nonabelian gauge symmetry and chirality, coupled to gravity, are generic in 4D compactifications. This is already a success. In addition, much progress has been made in the general description of the relevant mathematical tools, and physical mechanisms and ingredients involved in these vacua, as well as in the explicit construction of models with the standard model spectrum (or supersymmetric extensions of it). Yet, many questions remain open and much more work is needed in order to make contact with the physics observed in nature.

See also: Brane Worlds; Compactification of Superstring Theory; Cosmology: Mathematical Aspects; Superstring Theories.

Further Reading

Acharya B and Witten E (2001) Chiral fermions from manifolds of G(2) holonomy, hep-th/0109152.
Aldazabal G, Ibáñez LE, Quevedo F, and Uranga AM (2000) D-branes at singularities: a bottom up approach to the string embedding of the standard model. Journal of High Energy Physics 0008: 002.
Angelantonj C and Sagnotti A (2002) Open strings. Physics Reports 371: 1–150.
Angelantonj C and Sagnotti A (2003) Open strings – erratum. Physics Reports 376: 339–405.
Antoniadis I, Arkani-Hamed N, Dimopoulos S, and Dvali GR (1998) New dimensions at a millimeter to a Fermi and superstrings at a TeV. Physics Letters B 436: 257–263.
Bachas C (1995) A way to break supersymmetry, hep-th/9503030.
Blumenhagen R, Cvetič M, Langacker P, and Shiu G (2005) Toward realistic intersecting D-brane models, hep-th/0502005.
Candelas P, Horowitz GT, Strominger A, and Witten E (1985) Vacuum configurations for superstrings. Nuclear Physics B 258: 46–74.
Donagi R, He Y-H, Ovrut BA, and Reinbacher R (2004) The spectra of heterotic standard model vacua, hep-th/0411156.
Green MB, Schwarz JH, and Witten E (1987) Superstring Theory. Cambridge Monographs On Mathematical Physics, vols. 1 and 2. Cambridge: Cambridge University Press.
Ibáñez LE (1987) The search for a standard model SU(3) × SU(2) × U(1) superstring: an introduction to orbifold constructions. Seoul Sympos. 1986, 46.
Polchinski J (1998) String Theory. vols. 1 and 2. Cambridge: Cambridge University Press.
Uranga AM (2003) Chiral four-dimensional string compactifications with intersecting D-branes. Classical and Quantum Gravity 20: S373–S394.
Witten E (1996) Strong coupling expansion of Calabi–Yau compactification. Nuclear Physics B 471: 135–158.
reasonable resemblance to the structure of the
String Topology: Homotopy and Geometric Perspectives 111

String Topology: Homotopy and Geometric Perspectives

R L Cohen, Stanford University, Stanford, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

String topology is a new field of study involving the geometric and algebraic topology of spaces of loops and paths in manifolds. The subject was initiated in the important work of Chas and Sullivan (1999) who uncovered previously unknown algebraic structure in the homology and equivariant homology of loop spaces. While the structure is purely topological, it was motivated by formalisms in quantum field theory and string theory. Since that time this subject has attracted the attention of many mathematicians, but one of the main lines of research continues to be motivated by the attempt to understand the relation between this structure (and its generalizations) and topological and conformal field theories.

In order to describe some of the recent advances in this field, we begin with some notation. Throughout this article $M^n$ will denote a closed, n-dimensional, oriented manifold. LM will denote the free loop space,

$$LM = \mathrm{Map}(S^1, M)$$

For $D_1, D_2 \subset M$ closed submanifolds, $\mathcal{P}_M(D_1, D_2)$ will denote the space of paths in M that start at $D_1$ and end at $D_2$,

$$\mathcal{P}_M(D_1, D_2) = \{\gamma : [0,1] \to M,\ \gamma(0) \in D_1,\ \gamma(1) \in D_2\}$$

The paths and loops we consider will always be assumed to be piecewise smooth. Such spaces of paths and loops are well known to be infinite-dimensional manifolds, and roughly speaking, string topology is the study of the intersection theory in these manifolds.

Recall that for closed, oriented manifolds, there is an intersection pairing,

$$H_r(M) \times H_s(M) \to H_{r+s-n}(M)$$

which is defined to be Poincaré dual to the cup product,

$$H^{n-r}(M) \times H^{n-s}(M) \xrightarrow{\ \cup\ } H^{2n-r-s}(M)$$

The geometric significance of this pairing is that if the homology classes are represented by submanifolds, $P^r$ and $Q^s$ with transverse intersection, then the image of the intersection pairing is represented by the geometric intersection, $P \cap Q$.

The remarkable result of Chas and Sullivan says that even without Poincaré duality, there is an intersection type product

$$\mu : H_p(LM) \times H_q(LM) \to H_{p+q-n}(LM)$$

that is compatible with both the intersection product on $H_*(M)$ via the map $ev : LM \to M$ ($\gamma \mapsto \gamma(0)$), and with the Pontrjagin product in $H_*(\Omega M)$.

The construction of this pairing involves consideration of the diagram,

$$LM \xleftarrow{\ \gamma\ } \mathrm{Map}(8, M) \xrightarrow{\ e\ } LM \times LM \qquad [1]$$

Here Map(8, M) is the mapping space from the figure 8 to M, which can be viewed as the subspace of $LM \times LM$ consisting of those pairs of loops that agree at the basepoint. $\gamma : \mathrm{Map}(8, M) \to LM$ is the map on mapping spaces induced by the pinch map $S^1 \to S^1 \vee S^1$.

Chas and Sullivan constructed this pairing by studying intersections of chains in loop spaces. A more homotopy-theoretic viewpoint was taken by Cohen and Jones (2002) who viewed $e : \mathrm{Map}(8, M) \to LM \times LM$ as an embedding, and showed there is a tubular neighborhood homeomorphic to a normal bundle given by the pullback bundle, $ev^*(TM)$, where $ev : LM \to M$ is the evaluation map mentioned above. They then constructed a Pontrjagin–Thom collapse map whose target is the Thom space of the normal bundle, $\tau_e : LM \times LM \to \mathrm{Map}(8, M)^{ev^*(TM)}$. Computing $\tau_e$ in homology and applying the Thom isomorphism defines an "umkehr map,"

$$e_! : H_*(LM \times LM) \to H_{*-n}(\mathrm{Map}(8, M))$$

The Chas–Sullivan loop product is defined to be the composition

$$\mu_* = \gamma_* \circ e_! : H_*(LM \times LM) \to H_{*-n}(\mathrm{Map}(8, M)) \to H_{*-n}(LM)$$

Notice that the umkehr map $e_!$ can be defined for a generalized homology theory $h_*$ whenever one has a Thom isomorphism of the tangent bundle, TM, which is to say a generalized homology theory $h_*$ for which the representing spectrum is a ring spectrum, and which supports an orientation of M.

By twisting the Pontrjagin–Thom construction by the virtual bundle $-TM$, one obtains a map of spectra,

$$\tau_e : LM^{-TM} \wedge LM^{-TM} \to \mathrm{Map}(8, M)^{ev^*(-TM)}$$

where $LM^{-TM}$ is the Thom spectrum of the pullback of the virtual bundle $ev^*(-TM)$. Now we can compose, to obtain a multiplication,

$$LM^{-TM} \wedge LM^{-TM} \xrightarrow{\ \tau_e\ } \mathrm{Map}(8, M)^{ev^*(-TM)} \xrightarrow{\ \gamma\ } LM^{-TM}$$

The following was proved by Cohen and Jones (2002).

Theorem 1 Let M be a closed manifold, then $LM^{-TM}$ is a ring spectrum. If M is orientable the ring structure on $LM^{-TM}$ induces the Chas–Sullivan loop product on $H_*(LM)$ by applying homology and the Thom isomorphism.

The ring structure on the spectrum $LM^{-TM}$ was also observed by Dwyer and Miller using different methods.

Cohen and Godin (2004) generalized the loop product in the following way. Observe that the figure 8 is homotopy equivalent to the pair of pants surface P, which we think of as a genus 0 cobordism between two circles and one circle.

Furthermore, diagram [1] is homotopic to the diagram of mapping spaces,

$$LM \xleftarrow{\ \rho_{out}\ } \mathrm{Map}(P, M) \xrightarrow{\ \rho_{in}\ } (LM)^2$$

where $\rho_{in}$ and $\rho_{out}$ are restriction maps to the "incoming" and "outgoing" boundary components of the surface P. So the loop product can be viewed as a composition,

$$\mu = \mu_P = (\rho_{out})_* \circ (\rho_{in})_! : (H_*(LM))^{\otimes 2} \to H_*(\mathrm{Map}(P, M)) \to H_*(LM)$$

where using the figure 8 to replace the surface P can be viewed as a technical device that allows one to define the umkehr map $(\rho_{in})_!$.

In general if one considers a surface of genus g, viewed as a cobordism from p incoming circles to q outgoing circles, $\Sigma_{g,p+q}$, one gets a similar diagram (Figure 2)

$$(LM)^q \xleftarrow{\ \rho_{out}\ } \mathrm{Map}(\Sigma_{g,p+q}, M) \xrightarrow{\ \rho_{in}\ } (LM)^p$$

Cohen and Godin (2004) used the theory of "fat" or "ribbon" graphs to represent surfaces as developed by Harer (1985), Penner (1987), and Strebel (1984), in order to define Pontrjagin–Thom maps,

$$\tau_{\Sigma_{g,p+q}} : (LM)^p \to \mathrm{Map}(\Sigma_{g,p+q}, M)^{\nu(\Sigma_{g,p+q})}$$

where $\nu(\Sigma_{g,p+q})$ is the appropriately defined normal bundle of $\rho_{in}$. By applying (perhaps generalized) homology and the Thom isomorphism, they defined the umkehr map,

$$(\rho_{in})_! : H_*((LM)^p) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}(\mathrm{Map}(\Sigma_{g,p+q}, M))$$

where $\chi(\Sigma_{g,p+q}) = 2 - 2g - p - q$ is the Euler characteristic. Cohen and Godin then defined the string topology operation to be the composition,

$$\mu_{\Sigma_{g,p+q}} = (\rho_{out})_* \circ (\rho_{in})_! : H_*((LM)^p) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}(\mathrm{Map}(\Sigma_{g,p+q}, M)) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}((LM)^q)$$

They proved that these operations respect gluing of surfaces,

$$\mu_{\Sigma_1 \# \Sigma_2} = \mu_{\Sigma_2} \circ \mu_{\Sigma_1}$$

where $\Sigma_1 \# \Sigma_2$ is the glued surface as shown in Figure 3.

The coherence of these operations is summarized in the following theorem.

Theorem 2 (Cohen and Godin 2004). Let $h_*$ be any multiplicative generalized homology theory that supports an orientation of M. Then the assignment

$$\Sigma_{g,p+q} \mapsto \mu_{\Sigma_{g,p+q}} : h_*((LM)^p) \to h_*((LM)^q)$$

is a positive boundary topological quantum field theory. "Positive boundary" refers to the fact that the number of outgoing boundary components, q, must be positive.
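
The degree bookkeeping behind these operations can be checked with a few lines of code. The following is a minimal sketch, not part of the original constructions: the function names and the (g, p, q) triple encoding of a surface are hypothetical, and both surfaces are assumed to be connected. It uses only the formula $\chi = 2 - 2g - p - q$ and the degree shift $\chi \cdot n$ stated above, and illustrates that the shift is additive under gluing, as the gluing compatibility of the operations requires.

```python
"""Degree bookkeeping for string topology operations (illustrative sketch)."""


def euler_characteristic(g: int, p: int, q: int) -> int:
    """Euler characteristic of a connected genus-g surface with
    p incoming and q outgoing boundary circles."""
    if g < 0 or p < 0 or q < 0:
        raise ValueError("genus and boundary counts must be non-negative")
    return 2 - 2 * g - p - q


def degree_shift(g: int, p: int, q: int, n: int) -> int:
    """Homological degree shift chi * n for a closed n-dimensional manifold M."""
    return euler_characteristic(g, p, q) * n


def glue(first, second):
    """Glue the outgoing circles of `first` to the incoming circles of `second`.

    Surfaces are (g, p, q) triples. Both are assumed connected, so gluing
    along q >= 1 circles gives a connected surface of genus g1 + g2 + q - 1.
    """
    g1, p, q1 = first
    g2, q2, r = second
    if q1 != q2 or q1 < 1:
        raise ValueError("circle counts must match and be positive")
    return (g1 + g2 + q1 - 1, p, r)


if __name__ == "__main__":
    n = 3                  # dimension of M, chosen only for illustration
    pants = (0, 2, 1)      # pair of pants: two incoming circles, one outgoing
    copants = (0, 1, 2)    # the same surface read in the other direction
    glued = glue(pants, copants)
    print("pair of pants shift:", degree_shift(*pants, n))
    print("glued surface:", glued, "shift:", degree_shift(*glued, n))
```

For the pair of pants the shift is $-n$, which matches the degree of the loop product $H_p(LM) \times H_q(LM) \to H_{p+q-n}(LM)$.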
Figure 1 Pair of pants P.
Figure 2 $\Sigma_{g,p+q}$.
Figure 3 $\Sigma_1 \# \Sigma_2$.

A theory with open strings was initiated by Sullivan (2004) and developed further by A Ramirez (2005) and by Harrelson (2004). In this

setting one has a collection of submanifolds, $D_i \subset M$, referred to as "D-branes." This theory studies intersections in the path spaces $\mathcal{P}_M(D_i, D_j)$.

A theory with D-branes involves "open–closed cobordisms" which are cobordisms between compact one-dimensional manifolds whose boundary is partitioned into three parts:

1. Incoming circles and intervals.
2. Outgoing circles and intervals.
3. The rest is the "free boundary" which is itself a cobordism between the boundary of the incoming and boundary of the outgoing intervals. Each connected component of the "free boundary" is labeled by a D-brane (see Figure 4).

Figure 4 Open–closed cobordism (free boundary components labeled by D-branes $D_1, \ldots, D_6$).

In a topological field theory with D-branes, one associates to each boundary circle a vector space $V_{S^1}$ (in our case $V_{S^1} = H_*(LM)$) and to an interval whose endpoints are labeled by $D_i$, $D_j$, one associates a vector space $V_{D_i,D_j}$ (in our case $V_{D_i,D_j} = H_*(\mathcal{P}_M(D_i, D_j))$).

To an open–closed cobordism as above, one associates an operation from the tensor product of these vector spaces corresponding to the incoming boundaries to the tensor product of the vector spaces corresponding to the outgoing boundaries. Of course, these operations have to respect the relevant gluing of open–closed cobordisms.

By developing a theory of fat graphs that encode the open–closed boundary data, Ramirez was able to prove that there are string topology operations that form a positive boundary, topological quantum field theory with D-branes (Ramirez 2005).

We end these notes by a discussion of three applications of string topology to classifying spaces of groups.

Example 1 Application to Poincaré duality groups (Abbaspour et al. to appear). For G any discrete group, one has that the loop space of the classifying space satisfies

$$LBG \simeq \coprod_{[g]} BC_g$$

where [g] is the conjugacy class determined by $g \in G$, and $C_g < G$ is the centralizer of g.

When BG is represented by a closed manifold, or more generally, when G is a Poincaré duality group, the Chas–Sullivan loop product then defines pairings among the homologies of the centralizer subgroups. Abbaspour et al. describe this loop product entirely in terms of group homology, thus giving structure to the homology of Poincaré duality groups that previously had not been known.

Example 2 Applications to 3-manifolds (Abbaspour 2005). Let $\iota : H_*(M) \to H_*(LM)$ be induced by inclusion of constant loops. This is a split injection of rings. Write $H_*(LM) = H_*(M) \oplus A_M$. We say $H_*(LM)$ has nontrivial extended loop products if the composition

$$A_M \otimes A_M \hookrightarrow H_*(LM) \otimes H_*(LM) \xrightarrow{\ \mu\ } H_*(LM)$$

is nontrivial.

Let M be a closed, irreducible 3-manifold. In a remarkable piece of work, Abbaspour showed the relationship between having a trivial extended loop product and M being "algebraically hyperbolic." This means that M is a $K(\pi, 1)$ and its fundamental group has no rank-2 abelian subgroup. (If the geometrization conjecture is true, this is equivalent to M admitting a complete hyperbolic metric.)

Example 3 The string topology of classifying spaces of compact Lie groups (Gruher (to appear) and Gruher and Salvatore (to appear)). The goal of Gruher's work is to construct string topological invariants of $LBG \simeq EG \times_G G$, where G acts on itself via conjugation. Ultimately, one would like to understand the relationship between this structure and the work of Freed (2003) on twisted equivariant K-theory, $K^\tau_G(G)$, and the Verlinde algebra.

The first observation in this program was to notice that the key ingredient in the forming of the Chas–Sullivan loop product is that the fibration $ev : LM \to M$ is a fiberwise monoid over a closed oriented manifold. The fiber is $\Omega M$, which has the usual Pontrjagin product.

The following was proved by Gruher and Salvatore:

Lemma 3 Let $G \to E \to M$ be a fiberwise monoid over a closed manifold M. Then $E^{-TM}$ is a ring spectrum.
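
For a finite group G, the conjugation action that appears in the description of LBG in Example 3 splits into orbits, one per conjugacy class, and these orbits index the components $BC_g$ of Example 1. The sketch below enumerates those components for the symmetric group on three letters. It is only an illustration under stated assumptions: the helper names are hypothetical, permutations are encoded as tuples, and the code checks nothing beyond the orbit–stabilizer count $|[g]| \cdot |C_g| = |G|$.

```python
"""Conjugacy classes and centralizers of a small finite group (illustrative sketch)."""
from itertools import permutations


def compose(a, b):
    """Composition of permutations: (a o b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(b)))


def inverse(a):
    """Inverse permutation of a."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)


def symmetric_group(n):
    """All permutations of {0, ..., n-1} as tuples."""
    return [tuple(p) for p in permutations(range(n))]


def conjugacy_classes(group):
    """Return a list of (representative, class) pairs, one per conjugacy class."""
    remaining = set(group)
    classes = []
    for g in group:
        if g not in remaining:
            continue
        cls = {compose(compose(h, g), inverse(h)) for h in group}
        classes.append((g, cls))
        remaining -= cls
    return classes


def centralizer(group, g):
    """Elements of the group commuting with g."""
    return [h for h in group if compose(h, g) == compose(g, h)]


if __name__ == "__main__":
    G = symmetric_group(3)
    for rep, cls in conjugacy_classes(G):
        C = centralizer(G, rep)
        # Each class contributes one component BC_g, with |class| * |C_g| = |G|.
        print(rep, "class size", len(cls), "centralizer order", len(C))
```

Each printed line corresponds to one component of the disjoint union in Example 1, so for this group the decomposition has three components.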

The following construction gives a large supply of examples of such fiberwise monoids over manifolds. Let $G \to P \to M$ be a principal G-bundle over a closed manifold M. We can construct the corresponding adjoint bundle,

$$\mathrm{Ad}(P) = P \times_G G \to M$$

It is an easy observation that $G \to \mathrm{Ad}(P) \to M$ is a fiberwise monoid.

Theorem 4 $\mathrm{Ad}(P)^{-TM}$ is a ring spectrum. This ring structure is natural with respect to maps of principal G-bundles.

Let BG be the classifying space of a compact Lie group G. It is possible to construct a filtration of BG,

$$M_1 \hookrightarrow M_2 \hookrightarrow \cdots \hookrightarrow M_i \hookrightarrow M_{i+1} \hookrightarrow \cdots \hookrightarrow BG$$

where the $M_i$'s are compact, closed manifolds. An example of this is filtering BU(n) by Grassmannians.

Let $G \to P_i \to M_i$ be the restriction of $EG \to BG$. By the above theorem one obtains an inverse system of ring spectra

$$P_1^{-TM_1} \leftarrow P_2^{-TM_2} \leftarrow \cdots \leftarrow P_i^{-TM_i} \leftarrow P_{i+1}^{-TM_{i+1}} \leftarrow \cdots$$

Theorem 5 The homotopy type of this pro-ring-spectrum is a well-defined invariant of BG. It is referred to as the "string topology of BG."

Potential Application: Twisted K-theory and the Verlinde Algebra

Let G be a connected, compact Lie group. Using the observation that the loop space of a classifying space is the classifying space of the loop group, $L(BG) \simeq B(LG)$, the string topology gives new structure on the classifying space of these loop groups. In particular, one has new structure on the K-theory of these classifying spaces. Now classical results of Atiyah and Segal suggest that K-theory of classifying spaces should be related to the representation theory of the group. In this case, the representation theory of loop groups has been widely studied and is very important in conformal field theory. Understanding the precise relationship between the string topology of the classifying space and this representation theory is an interesting area of current research. To motivate this, first recall that the loop space, LBG, has a well-known description as

$$LBG \simeq EG \times_{\mathrm{Ad}} G$$

where the right-hand side refers to the homotopy orbit space of the conjugation (or adjoint) action of G on itself. Thus, the homology $H_*(LBG)$ is the equivariant homology $H_*^G(G)$. Similarly, the K-theory $K^*(LBG)$ maps to the equivariant K-theory, $K^*_G(G)$. Now in recent work of Freed (2003) twisted equivariant K-homology, $K^\tau_G(G)$, was shown to be isomorphic to the Verlinde algebra. This algebra is a space of representations of the loop group, LG. The multiplication in this algebra is the "fusion product," coming from conformal field theory. One topic of current research is to understand the relationship between the multiplicative structure coming from the string topology of BG and this fusion product in the Verlinde algebra. More generally, the goal is to bring to bear the considerable calculational techniques of algebraic topology that are available in string topology, to understand the recently uncovered field theoretic structure of twisted K-theory (Freed 2003), and its applications to string theory.

Acknowledgment

The author was partially supported by a grant from the NSF.

See also: Mathematical Knot Theory; Topological Defects and Their Homotopy Classification.

Further Reading

Abbaspour H (2005) On string topology of three manifolds. Topology 44: 1059–1091.
Abbaspour H, Cohen RL, and Gruher K String Topology of Poincaré Duality Groups (in preparation).
Chas M and Sullivan D (1999) String topology. To appear in Annals of Mathematics. Preprint: math.GT/9911159.
Cohen RL and Godin V (2004) A polarized view of string topology. In: Topology, Geometry, and Quantum Field Theory, London Math. Soc. Lecture Notes, vol. 308, pp. 127–154.
Cohen RL and Jones JDS (2002) A homotopy theoretic realization of string topology. Mathematische Annalen 324: 773–798.
Freed D, Hopkins M, and Teleman C Twisted K-theory and loop group representations. Preprint: math.AT/0312155.
Godin V (2004) A Category of Bordered Fat Graphs and the Mapping Class Group of a Bordered Surface. Ph.D. thesis, Stanford University.
Gruher K Ph.D. thesis, Stanford University (in preparation).
Gruher K and Salvatore P String Topology of Classifying Spaces (in preparation).
Harrelson E (2004) On the homology of open–closed string theory. Preprint: math.AT/0412249.
Harer JL (1985) Stability of the homology of the mapping class groups of orientable surfaces. Annals of Mathematics 121: 215–249.
Penner R (1987) The decorated Teichmüller space of punctured surfaces. Communications in Mathematical Physics 113: 299–339.
Ramirez A (2005) Open–Closed String Topology. Ph.D. thesis, Stanford University.
Strebel K (1984) Quadratic Differentials. Berlin: Springer.
Sullivan D (2004) Open and closed string field theory interpreted in classical algebraic topology. In: Tillmann U (ed.) Topology, Geometry, and Quantum Field Theory, London Math. Soc. Lecture Notes, vol. 308, pp. 344–357. Cambridge: Cambridge University Press.
Superfluids 115

Superfluids

D Einzel, Bayerische Akademie der Wissenschaften, Garching, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Superfluidity has been known to exist since the 1930s. This widespread phenomenon occurs in many-particle Bose and Fermi systems as different as liquid $^4$He, liquid $^3$He, atomic gases like Rb and Li, atomic nuclei, pulsars and last, but not least, in metals, where the itinerant electrons may become superfluid. This article is devoted to a unifying theoretical description of Bose and Fermi superfluidity. The mechanisms leading to superfluidity include Bose–Einstein condensation (BEC) and Bardeen, Cooper, and Schrieffer (BCS)–Leggett pairing correlations. We hope to be able to demonstrate why this fascinating phenomenon is, even roughly 80 years after its experimental discovery and its first theoretical explanation, still a subject of intensive research.

The phenomenon of superfluidity is closely connected with the apparent lack of any measurable flow resistance, which scales with the shear viscosity of the fluid. Its complete absence implies that the system moves without friction, that is, with zero viscosity. The observation of superfluidity is usually precluded by the solidification of most liquids as the temperature is lowered. Only systems with particularly light atoms (like the helium isotopes $^4$He and $^3$He) stay liquid down to the lowest temperatures. These systems are referred to as "quantum liquids," since their liquid state is caused by the quantum-mechanical zero-point motion of the atoms. It should be noted that the helium isotopes belong to two different kinds of particles which can be distinguished by their statistics: $^4$He is a spin-0 boson and $^3$He a spin-1/2 fermion.

In 1924, Satyendra Nath Bose and Albert Einstein proposed that below a characteristic degeneracy temperature $T_B$, a macroscopic number of bosons can condense into the state of lowest energy k = 0. In the 1930s, Fritz London and Heinz London showed that this so-called Bose–Einstein condensate can be described by a macroscopic quantum-mechanical wave function like the one for a single elementary particle, but with the probability density replaced by the density of the condensed particles. By the end of the 1930s, the experimental results of Allen, Kamerlingh Onnes, Keesom, Kapitza, Misener, Wolfke, and others accumulated the evidence that liquid $^4$He undergoes a second-order phase transition at T = 2.17 K to a state referred to as a superfluid, since the liquid could flow without any sign of a flow resistance. This superfluid state was interpreted in terms of Bose condensation of the $^4$He atoms in the liquid (London 1938).

In Figure 1 the P–T phase diagram of liquid $^4$He is shown with a normal liquid phase, a solid phase and the superfluid phase below the $\lambda$-line at about 2 K.

Figure 1 The phase diagram of liquid $^4$He (pressure in MPa versus temperature in K, showing the solid, normal liquid, superfluid, and gas phases and the $\lambda$-line). Courtesy of Erkki Thuneberg.

Fermions cannot condense in a way similar to the BEC, due to the Pauli exclusion principle. In 1957 Bardeen, Cooper, and Schrieffer came up with their ingenious proposal that the superfluidity of the electron system (usually referred to as superconductivity) comes about through the formation of fermion pairs (quasibosons) in k-space in a spin-singlet state. In 1971, several superfluid phases of liquid $^3$He at a few mK were discovered by Lee, Osheroff, and Richardson at Cornell University. Experimental aspects connected with the spin degrees of freedom of the quantum liquid gave strong evidence for Cooper pairing of the $^3$He atoms in a spin-triplet state. In Figure 2 the zero-field P–T phase diagram of liquid $^3$He is shown with a normal (Fermi) liquid phase, a solid phase and the superfluid A and B phases.

Immediately after this discovery, Anthony J Leggett applied the BCS ideas to liquid $^3$He and introduced a generalized scheme that allowed for triplet-pairing correlations. His theory turned out to describe a large variety of experimental results accurately. A new and exciting development set in when Bose–Einstein condensates were discovered for the first time in dilute gases of alkali atoms in 1995 by Cornell and Wieman et al. (Rb), Ketterle et al. (Na), and Hulet et al. (Li).

Figure 2 The phase diagram of liquid $^3$He (pressure in MPa versus temperature in mK, showing the solid, superfluid A, superfluid B, and normal fluid phases). Courtesy of Erkki Thuneberg.

Boson and Fermion Degeneracy

In what follows, the energy dispersion of Bose and Fermi systems is denoted as $\epsilon_k$ (free bosons/fermions would be represented by $\epsilon_k = \hbar^2 k^2/2m$). A large number of bosons can occupy Bose quantum states $|k\rangle$; the average occupation is dictated by the Bose–Einstein distribution

$$n_k = \frac{1}{e^{(\epsilon_k - \mu)/k_B T} - 1} \qquad [1]$$

For Bose systems, the chemical potential is negative, $\mu = -\alpha k_B T$, and $\alpha$ is fixed by the condition

$$n = \frac{1}{V} {\sum_k}' n_k = \frac{1}{\lambda_T^3} B_{3/2}(\alpha) \qquad [2]$$

where the prime indicates the summation over excited states $|k| > 0$. In [2], $\lambda_T = h/\sqrt{2\pi m k_B T}$ denotes the thermal de Broglie wavelength which provides a criterion for the importance of quantum effects or degeneracy through $n\lambda_T^3 \approx O(1)$. The Bose integrals $B_\nu(\alpha)$ originate from the conversion of the momentum sum into an energy integral and read for parabolic dispersion:

$$B_\nu(\alpha) = \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{dy\, y^{\nu-1}}{e^{y+\alpha} - 1} = \sum_{\ell=1}^{\infty} \frac{e^{-\ell\alpha}}{\ell^{\nu}} \qquad [3]$$

with $B_\nu(0) = \zeta(\nu)$, $\Gamma$ the Euler $\Gamma$-function and $\zeta$ denoting the Riemann $\zeta$-function. It is important to understand that in order to have a constant total density n, $B_{3/2}(\alpha)$ has to increase $\propto T^{-3/2}$ in the same way as $\lambda_T^3$. This is, however, impossible at all temperatures since the chemical potential of the Bose gas vanishes ($\alpha \to 0$) at a finite temperature $T_B$ given by

$$T_B = \frac{2\pi\hbar^2}{m k_B} \left( \frac{n}{\zeta(3/2)} \right)^{2/3} \qquad [4]$$

for which $n\lambda_{T_B}^3 = B_{3/2}(0) = \zeta(3/2) = 2.612\ldots$.

In sharp contrast, fermions obey the Pauli exclusion principle, which states that only one fermion can occupy a quantum state $|k, \sigma\rangle$ specified in addition by the spin projection $\sigma$. The average statistical occupation is given by the Fermi–Dirac distribution

$$f_k = \frac{1}{e^{(\epsilon_k - \mu)/k_B T} + 1} \qquad [5]$$

Figure 3 shows a comparison of Bose–Einstein and Fermi–Dirac momentum distributions $n_k$ plotted vs. $\epsilon_k$. The chemical potential is shown for fermions only, $\mu_F = \alpha k_B T$ is always positive and the total density can be expressed as

$$n = \frac{1}{V} \sum_{k,\sigma} f_k = \frac{2}{\lambda_T^3} F_{3/2}(\alpha) \qquad [6]$$

where the factor of 2 originates from the spin degeneracy. For parabolic dispersion, the Fermi integral reads:

$$F_\nu(\alpha) = \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{dy\, y^{\nu-1}}{e^{y-\alpha} + 1} \;\xrightarrow{\ T \to 0\ }\; \frac{(\mu/k_B T)^{\nu}}{\Gamma(\nu + 1)} \qquad [7]$$

One recognizes that the degeneracy condition $n\lambda_T^3 \gg 1$ corresponds to the limit $T \ll T_F = \mu(0)/k_B$, which is connected with the formation of a "Fermi sea," with $\mu(0) \equiv E_F$ the Fermi energy:

$$\mu \;\xrightarrow{\ T \to 0\ }\; \frac{\hbar^2}{2m} (3\pi^2 n)^{2/3} = E_F \qquad [8]$$

Figure 3 The Fermi and Bose momentum distribution ($n_k$ versus $\epsilon_k$, with the fermion chemical potential $\mu_F$ marked on the energy axis).

To summarize, quantum behavior in Bose and Fermi systems sets in below the degeneracy temperature $T^*$, defined through $n\lambda_{T^*}^3 = O(1)$. For bosons, $T^* = T_B$ is the temperature at which the chemical potential vanishes, whereas for fermions $T^* = T_F$ is the Fermi temperature.
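
As a rough numerical illustration of the two degeneracy temperatures, the sketch below evaluates the ideal-gas expressions [4] and [8] for the helium liquids. Everything beyond the formulas themselves is an assumption made for illustration: the function names are hypothetical, the liquid densities (about 145 kg/m³ for $^4$He and about 82 kg/m³ for $^3$He) are approximate input values, bare atomic masses are used, and interactions are ignored, so the results are order-of-magnitude estimates rather than predictions of the measured transition temperatures.

```python
"""Ideal-gas degeneracy temperatures for the helium liquids (illustrative sketch)."""
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J / K
AMU = 1.66053906660e-27   # atomic mass unit, kg


def zeta(s: float, terms: int = 1000) -> float:
    """Riemann zeta function for s > 1: partial sum plus an
    Euler-Maclaurin estimate of the remaining tail."""
    partial = sum(l ** -s for l in range(1, terms + 1))
    return partial + terms ** (1.0 - s) / (s - 1.0) - 0.5 * terms ** -s


def thermal_wavelength(T: float, m: float) -> float:
    """Thermal de Broglie wavelength h / sqrt(2 pi m k_B T)."""
    return HBAR * math.sqrt(2.0 * math.pi / (m * K_B * T))


def bose_temperature(n: float, m: float) -> float:
    """Eq. [4]: temperature at which the Bose gas chemical potential vanishes."""
    return (2.0 * math.pi * HBAR**2 / (m * K_B)) * (n / zeta(1.5)) ** (2.0 / 3.0)


def fermi_temperature(n: float, m: float) -> float:
    """Eq. [8] divided by k_B: Fermi temperature of a spin-1/2 gas."""
    return HBAR**2 / (2.0 * m * K_B) * (3.0 * math.pi**2 * n) ** (2.0 / 3.0)


# Assumed input values (approximate liquid densities, bare atomic masses).
M_HE4 = 4.002602 * AMU
M_HE3 = 3.016029 * AMU
N_HE4 = 145.0 / M_HE4     # number density of liquid 4He, 1 / m^3
N_HE3 = 82.0 / M_HE3      # number density of liquid 3He, 1 / m^3

if __name__ == "__main__":
    t_b = bose_temperature(N_HE4, M_HE4)
    t_f = fermi_temperature(N_HE3, M_HE3)
    print(f"zeta(3/2)           = {zeta(1.5):.6f}")
    print(f"T_B (ideal 4He gas) = {t_b:.2f} K")
    print(f"T_F (ideal 3He gas) = {t_f:.2f} K")
    print(f"n lambda^3 at T_B   = {N_HE4 * thermal_wavelength(t_b, M_HE4) ** 3:.6f}")
```

Under these assumptions the ideal-gas $T_B$ comes out at a few kelvin, the same order as the observed transition of liquid $^4$He at 2.17 K, and the last line reproduces the degeneracy condition $n\lambda_{T_B}^3 = \zeta(3/2)$.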

London Quantum Hydrodynamics

For a general treatment of the quantum-mechanical origin of the equations describing Bose and Fermi superfluidity, it is convenient to introduce a parameter $\nu$ which describes single bosons ($\nu = 1$) or fermion pairs ($\nu = 2$) of mass $M = \nu m$. The basic assumption (London 1938) is that the laws of quantum mechanics are applicable also to a macroscopic number of single ($\nu = 1$) or composite ($\nu = 2$) particles of density $\rho_s/\nu m$, the so-called condensate, which is represented by a macroscopic wave function $\psi(r, t)$. $\psi$ has the property

$$\psi^*(r, t)\, \psi(r, t) = \frac{\rho_s(r, t)}{\nu m}, \qquad \nu = 1, 2$$

The dynamics of the condensate is governed by the Schrödinger equation

$$i \frac{\hbar}{\nu} \frac{\partial \psi}{\partial t} = \left( -\frac{\hbar^2 \nabla^2}{2 \nu^2 m} + \mu \right) \psi \qquad [9]$$

in which $\mu$ represents the condensate's chemical potential. After performing a Madelung transformation (Madelung 1926):

$$\psi = a e^{i\varphi}, \qquad a^2 = \frac{\rho_s}{\nu m}$$

one arrives at two coupled hydrodynamic equations, the first of which reads

$$\frac{\partial \rho_s}{\partial t} + \nabla \cdot j_s^m = 0, \qquad j_s^m = \rho_s v_s, \qquad v_s = \frac{\hbar}{\nu m} \nabla \varphi \qquad [10]$$

Equation [10] can be interpreted as a continuity equation, which represents the conservation law for the condensate mass density $\rho_s$. The second equation

$$-\frac{\hbar}{\nu} \frac{\partial \varphi}{\partial t} = \frac{1}{2} m v_s^2 + \mu + O(\hbar^2 \nabla^2) \qquad [11]$$

assumes the form of the Hamilton–Jacobi equation for the action field of classical mechanics $\hbar\varphi$, if the quasiclassical limit (terms $\propto O(\hbar^2 \nabla^2) \to 0$) is taken. From [10] and [11] a condensate acceleration equation can be derived, which resembles the Euler equation of classical hydrodynamics ($\mu = \mu_0 + \delta\mu$):

$$\frac{\partial v_s}{\partial t} + (v_s \cdot \nabla) v_s = -\frac{1}{m} \nabla \delta\mu \qquad [12]$$

The physical nature of the driving force becomes evident after applying the Gibbs–Duhem relation $n\, \delta\mu = \delta P - \sigma_0\, \delta T$. Finally, the acceleration of the mass supercurrent $j_s^m$ is of the form

$$\frac{\partial j_s^m}{\partial t} = -\frac{\rho_s}{\rho} \nabla (\delta P - \sigma_0\, \delta T) \qquad [13]$$

It turns out that the London equations [10] and [13], in which $\rho_s$ is an unknown phenomenological parameter, explain many experimental observations such as persistent currents, U-tube oscillations, thermomechanical (e.g., fountain) effects, beaker flow phenomena, and many others.

Bose–Einstein Condensation (BEC)

In order to understand the macroscopic quantum state in the case of Bose systems, we consider first the simple case of a Bose gas. Let us decompose the energy eigenstates $\epsilon_k$ into those with $\epsilon_{k=0} = 0$ (condensate) and average occupation number

$$n_0 = \frac{N_0}{V} = \frac{1}{V} \frac{1}{e^{\alpha} - 1} \qquad [14]$$

and those with $\epsilon_k > 0$ (excited states) and average occupation number

$$n_{ex} = \frac{N_{ex}}{V} = \frac{1}{V} {\sum_k}' n_k = n\, \frac{B_{3/2}(\alpha)}{B_{3/2}(0)} \left( \frac{T}{T_B} \right)^{3/2} \qquad [15]$$

with the total density $n = n_{ex} + n_0$. The consequence of the chemical potential vanishing at $T_B$ clearly is a macroscopic occupation of the ground state of the Bose gas:

$$N_0 = \frac{1}{1 + \alpha + \cdots - 1} \;\xrightarrow{\ \alpha \to 0\ }\; \frac{1}{\alpha} \to \infty \qquad [16]$$

This phenomenon is referred to as BEC. Below $T_B$, $\alpha = 0$ and from [15] we see that

$$n_{ex}^{\text{free bosons}} = n \left( \frac{T}{T_B} \right)^{3/2}, \qquad T < T_B \qquad [17]$$

The average occupation of the ground state is given by

$$n_0(T) = n - n_{ex}(T), \qquad T < T_B \qquad [18]$$

It is important to understand that the number density of condensed particles $n_0$ has nothing to do with the current response function $\rho_s$ (eqn [10]). A derivation of $\rho_s$ will be given in the section "Local response of condensates and excitation gases."

Let us now discuss the structure of the excitation spectrum, which will turn out to be crucial for the observability of superfluidity, in some more detail. Suppose that a macroscopic object of mass M moves through the superfluid. Then one may ask the question, at what velocity does this motion cause the creation of an excitation of energy $E_p$ and momentum p. The condition can be formulated in terms of the velocity
difference $\mathbf{v}_i - \mathbf{v}_f$ as $E_p = M(v_i^2 - v_f^2)/2$ and $\mathbf{p} = M(\mathbf{v}_i - \mathbf{v}_f)$. Eliminating $\mathbf{v}_f$ yields $E_p = \mathbf{p}\cdot\mathbf{v}_i + O(M^{-1})$, so that the condition for the creation of an excitation leads to the so-called Landau critical velocity

$$v_L = \min\left(\frac{E_p}{|\mathbf{p}|}\right) > 0 \qquad [19]$$

It is immediately clear that for free bosons $v_L = 0$. This means that a free Bose gas can never be a superfluid, since drag forces on moving objects will start to act even at the smallest velocities.

It turns out that interaction effects can drastically modify the nature of the elementary excitations. In 1947, Nikolai Bogoliubov showed (for the first time using the method of second quantization) that even in the limit of weak repulsive interactions the excitation spectrum is phonon-like, $E_p = c|\mathbf{p}|$, with $c$ the sound velocity. Lev Landau and Richard Feynman investigated the situation for superfluid $^4$He, where the interactions between the atoms are far from weak. Landau (1947) postulated the following form for the excitation spectrum, for which Feynman (1953) gave the microscopic justification. At low momenta, the spectrum is phonon-like and linear in $p$:

$$\lim_{p \to 0} E_p = E_p^{\rm phon} = c|\mathbf{p}| \qquad [20]$$

At higher momenta, the spectrum is reminiscent of that of crystal phonons in that $E_p$ passes through a maximum, and then, at a characteristic momentum $p_0$, approaches the next minimum, which, however, is located at a finite energy $\Delta$. Feynman called this part of the spectrum the "roton" (mass $m_r$) in an analogy with a "smoke ring," since it is connected with the forward motion of a particle accompanied by a ring of back-flowing other particles:

$$\lim_{|\mathbf{p}| \to p_0} E_p = E_p^{\rm rot} = \Delta + \frac{(|\mathbf{p}| - p_0)^2}{2m_r} \qquad [21]$$

Figure 4 shows a sketch of the phonon–roton spectrum of superfluid $^4$He. Clearly, the Landau critical velocity for the phonon–roton spectrum is characterized by the roton minimum and is given by $v_L \approx \Delta/p_0$.

Figure 4 The phonon–roton spectrum ($E_p$ versus $p$, with the phonon branch and the roton minimum of energy $\Delta$ at $p_0$).

BCS–Leggett Pair Condensation

The key assumptions of the weak-coupling mean-field BCS–Leggett pairing model can be summarized as follows: one first assumes that at sufficiently low temperatures it is energetically favorable that a temperature-dependent part of the fermions forms so-called Cooper pairs. This pair formation is caused by an attractive interaction in k-space near the Fermi surface:

$$\Gamma^{(s)}_{kp} < 0; \quad |\xi_k|, |\xi_p| < \epsilon_c$$

Here $\xi_k = \epsilon_k - \mu$ measures the energy from the chemical potential. The index $s$ denotes the total spin of the pair. Classical superconductors have pairs in a relative singlet state $s = 0$, $m_s = 0$, whereas the superfluid phases of liquid $^3$He have pairs in a relative spin-triplet state $s = 1$, $m_s = 0, \pm 1$, with $m_s$ the magnetic quantum number. The amplitude of spontaneous pair formation is

$$g_{k\sigma_1\sigma_2} \equiv \langle \hat{c}_{-k\sigma_1}\hat{c}_{k\sigma_2} \rangle \neq 0; \quad T \le T_c \qquad [22]$$

with $\mathbf{k} = \mathbf{k}_1 - \mathbf{k}_2$ the relative momentum of the pair. The attractive interaction that drives the Cooper-pair formation connects the pairing amplitude $g_{k\sigma_1\sigma_2}$ with a new energy scale, the so-called pair potential

$$\Delta_{k\sigma_1\sigma_2} = \sum_p \Gamma^{(s)}_{kp}\, g_{p\sigma_1\sigma_2} \qquad [23]$$

As a consequence of triplet pairing the spin part of the pair potential is "even" upon interchange of $\sigma_1$ and $\sigma_2$: $\Delta_{k\sigma_2\sigma_1} = \Delta_{k\sigma_1\sigma_2}$. Then the Pauli principle requires that $\Delta_{k\sigma_1\sigma_2}$ must be "odd" with respect to the interchange of $\mathbf{k}_1$ and $\mathbf{k}_2$ or, equivalently, $\mathbf{k} \to -\mathbf{k}$. The k-dependence can now be classified by an orbital quantum number $\ell$ with the special cases of $\ell = 1$ (p-wave) pairing, $\ell = 3$ (f-wave) pairing, etc. All superfluid phases of $^3$He are characterized by p-wave orbital symmetry.

The transition temperature $T_c$ from [23] reads

$$k_B T_c = \frac{2e^{\gamma}}{\pi}\,\epsilon_c\, e^{1/(N_F\Gamma^{(s)})}$$

with $N_F = 3n/2E_F$ the density of states at the Fermi level and $\gamma = 0.577\ldots$ the Euler constant. The energies $\xi_k$ can trivially be divided into particle-like ($\xi_k > 0$) and hole-like ($\xi_k < 0$) terms. The presence of the pair potential $\Delta_k$ leads to a mixing of particle- and hole-like contributions to the energy, which
becomes a matrix in particle–hole, or Nambu space (Nambu 1960), and generates what is referred to as off-diagonal long-range order (ODLRO):

$$\underline{\xi}_k = \begin{pmatrix} \xi_k \mathbf{1} & \Delta_k \\ \Delta_k^\dagger & -\xi_k \mathbf{1} \end{pmatrix} \qquad [24]$$

As usual, the diagonalization of $\underline{\xi}_k$ (Bogoliubov 1958) leads to the energy dispersion of the relevant thermal excitations of the superfluid state, the so-called Bogoliubov quasiparticles or "bogolons":

$$E_k = \sqrt{\xi_k^2 + \Delta_k^2}; \quad \Delta_k^2 = \Delta_k\cdot\Delta_k^\dagger \qquad [25]$$

In Figure 5, the dispersion $E_p$ of Bogoliubov quasiparticles vs. $|\mathbf{p}|$ is shown. It turns out that the superfluid phases (A and B) of liquid $^3$He in zero magnetic field are characterized by unitary matrices $\Delta_k$, so that the scalar quantity $\Delta_k$ can be interpreted as the energy gap in the bogolon spectrum, which, in general, may be anisotropic in k-space.

Figure 5 The bogolon energy dispersion ($E_p$ versus $p$, with the gap $\Delta$ at $p_F$).

The energy gap $\Delta_k$ of the superfluid B-phase can be represented in the simple nodeless (pseudoisotropic) and BCS-like form (Balian and Werthamer (BW), 1963):

$$\Delta_k = \Delta(T); \quad \frac{\Delta(0)}{k_B T_c} = \frac{\pi}{e^{\gamma}} \qquad [26]$$

Its spin structure is characterized by the presence of all three triplet components $m_s = 0, \pm 1$ and will be discussed further with respect to the magnetization response (see next section). The gap symmetry of $^3$He-A is uniaxial with respect to an axis $\hat{\ell}$ (Anderson and Morel 1960; Anderson and Brinkman 1973)

$$\Delta_k = \Delta_0(T)\sin\theta_k; \quad \frac{\Delta_0(0)}{k_B T_c} = \frac{\pi e^{5/6}}{2e^{\gamma}} \qquad [27]$$

where $\cos\theta_k = \hat{\mathbf{k}}\cdot\hat{\ell}$, and characterized by two point nodes of $\Delta_k$ at the zeros ($\theta_k = 0, \pi$) on the Fermi surface. It has furthermore turned out that only the $m_s = \pm 1$ components of the spin triplet contribute to its spin dependence (equal spin pairing (ESP)).

Local Response of Condensates and Excitation Gases

In the previous sections we have seen that the structure (energy dispersion, statistics, critical flow velocity) of the relevant thermal excitations is of crucial importance for the superfluidity. We can now aim at a generalized statistical description of bosonic (phonons, rotons) and fermionic (bogolons) excitation gases, by introducing a generalized momentum distribution

$$n_\lambda\{E_k\} = \frac{1}{e^{E_k/k_B T} - \lambda} \qquad [28]$$

and its energy derivative

$$\varphi_{k,\lambda} = -\frac{\partial n_\lambda\{E_k\}}{\partial E_k} = \frac{1}{2k_B T\,[\cosh(E_k/k_B T) - \lambda]} \qquad [29]$$

Special cases are

$$\lambda = \begin{cases} +1, & \text{Bose (phonons, rotons)} \\ -1, & \text{Fermi (bogolons)} \end{cases}$$

Introducing the spin $s = (1 - \lambda)/4$, the total momentum density response to the presence of a superfluid velocity

$$\mathbf{v}_s = \frac{\hbar\nabla\varphi}{(2s + 1)m}; \quad s = 0, \tfrac{1}{2}$$

and a normal fluid velocity $\mathbf{v}_n$ can be written in the general form

$$\mathbf{j}^m = \frac{2s + 1}{V}\sum_k \mathbf{p}\, n_\lambda\{E_k + \delta E_k\} + \rho\,\mathbf{v}_s \qquad [30]$$

After Taylor-expanding $n_\lambda$ with respect to the small energy shifts $\delta E_k = \mathbf{p}\cdot(\mathbf{v}_s - \mathbf{v}_n)$, one may introduce the so-called normal fluid density tensor

$$\rho^n_{ij} = \frac{2s + 1}{V}\sum_k \varphi_{k,\lambda}\, p_i p_j \qquad [31]$$

and the momentum density assumes the form

$$\mathbf{j}^m = \boldsymbol{\rho}_s\cdot\mathbf{v}_s + \boldsymbol{\rho}_n\cdot\mathbf{v}_n; \quad \boldsymbol{\rho}_s = \rho\mathbf{1} - \boldsymbol{\rho}_n \qquad [32]$$

Equation [32] forms the central result of this essay because it represents the microscopic counterpart of the generalized London equation [10]. It is clearly seen how the phenomenon of superfluidity originates from $\rho_s > 0$ due to a qualitative change in the dispersion of the elementary excitations, which may in particular be characterized by a gap in
the excitation spectrum. Equation [32] is more general than [10] in that it introduces a two-fluid picture in which the mass supercurrent $\mathbf{j}_s^m = \rho_s\mathbf{v}_s$ (eqn [10]) is complemented by a normal (excitation) mass current $\mathbf{j}_n^m = \rho_n\mathbf{v}_n$ in the presence of a macroscopic velocity field $\mathbf{v}_n$ of the excitation gas obeying arbitrary statistics. The temperature dependence of $\rho_s(T)$ can now be computed via [31] and the result depends on the dispersion of the thermal excitation under consideration. Figure 6 shows the temperature dependence of the normal and superfluid density of superfluid $^4$He.

Figure 6 The normal and superfluid density for He-II ($\rho_s(T)/\rho$ and $\rho_n(T)/\rho$ versus $T/T_\lambda$).

The normal fluid density of superfluid $^3$He is, in general, a tensor quantity

$$\rho^n_{ij} = \begin{cases} \rho^n_\parallel\,\hat{\ell}_i\hat{\ell}_j + \rho^n_\perp(\delta_{ij} - \hat{\ell}_i\hat{\ell}_j), & ^3\text{He-A} \\ \rho^n\,\delta_{ij}, & ^3\text{He-B} \end{cases} \qquad [33]$$

The short-range Fermi liquid interaction leads to a quasiparticle mass enhancement $m^*/m = 1 + F_1^s/3$ characterized by the pressure-dependent dimensionless Landau parameter $F_1^s$. In Figure 7, the normal fluid density ($\rho^n_{\parallel,\perp}$ for $^3$He-A, $\rho^n$ for $^3$He-B) is shown as a function of reduced temperature at a pressure of 27 bar, where $F_1^s = 12.53$.

Figure 7 The normal fluid density for $^3$He-A, B ($\rho_n(T)/\rho$ versus $T/T_c$ at 27 bar; curves A$\parallel$, A$\perp$, and B).

The entropy density of an excitation system of arbitrary statistics below the transition can be written as

$$\sigma_0 = k_B\,\frac{2s + 1}{V}\sum_k P_k; \quad P_k = \lambda(1 + \lambda n_\lambda)\ln(1 + \lambda n_\lambda) - n_\lambda\ln n_\lambda \qquad [34]$$

with $n_\lambda = n_\lambda\{E_k\}$, from which one may derive the specific heat capacity

$$T\,\delta\sigma = \frac{2s + 1}{V}\sum_k E_k\, n_\lambda\left\{\frac{E_k + \frac{\partial E_k}{\partial T}\delta T}{k_B(T + \delta T)}\right\} = c_V\,\delta T \qquad [35]$$

After a Taylor expansion of $n_\lambda$ with respect to the small local temperature change $\delta T$, the result for $c_V(T)$ reads

$$c_V = \frac{2s + 1}{V}\sum_k \varphi_{k,\lambda}\left[\frac{E_k^2}{T} - E_k\frac{\partial E_k}{\partial T}\right] \qquad [36]$$

In Figures 8 and 9 we show the cusp-like specific heat of a Bose gas as compared with the specific heat of $^3$He-A, B, which display discontinuities at $T_c$.

Figure 8 The specific heat capacity of a Bose gas ($C(T)/Nk_B$ versus $T/T_B$).

Finally, the superfluid phases of $^3$He are characterized in addition by the spin degrees of freedom, reflected by the bogolon spin magnetization response to an external magnetic field $B$:

$$M^n = \frac{\gamma\hbar}{2V}\sum_{k,\sigma}\sigma\, n_{-1}\{E_k - \sigma\gamma\hbar B/2\} = \chi_0 B \qquad [37]$$

where $\gamma$ denotes the gyromagnetic ratio of the fermions. The bogolon spin susceptibility $\chi_0$ is obtained after a Taylor expansion of $n_{-1}$ with respect to $B$ as

$$\chi_0 = \left(\frac{\gamma\hbar}{2}\right)^2\frac{1}{V}\sum_{k,\sigma}\varphi_{k,-1} \equiv \left(\frac{\gamma\hbar}{2}\right)^2 N_F\, Y(T) \qquad [38]$$
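The temperature dependence carried by the function $Y(T)$ in eqn [38] can be made concrete with a small numerical sketch. It assumes an isotropic (B-phase-like) gap and replaces the momentum sum by an energy integral, $\frac{1}{V}\sum_{k,\sigma} \to N_F\int d\xi$; both are assumptions of this illustration. Using the Fermi case of eqn [29], $1/(2k_BT[\cosh(E/k_BT) + 1]) = 1/(4k_BT\cosh^2(E/2k_BT))$, the quantity evaluated is $Y = \int_{-\infty}^{\infty} dx\, [4\cosh^2(E/2)]^{-1}$ with $x = \xi/k_BT$, $d = \Delta/k_BT$ and $E = \sqrt{x^2 + d^2}$. The function name `yosida` and the integration parameters are hypothetical.

```python
import math


def yosida(delta_over_kT, x_max=40.0, steps=20000):
    """Trapezoidal estimate of Y for an isotropic gap (illustrative sketch).

    delta_over_kT: gap divided by k_B T (dimensionless, >= 0).
    x_max, steps:  cutoff and resolution of the integral over x = xi / k_B T.
    """
    d = delta_over_kT
    h = x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        e = math.hypot(x, d)                           # E / k_B T
        f = 1.0 / (4.0 * math.cosh(0.5 * e) ** 2)      # Fermi case of eqn [29]
        weight = 0.5 if i in (0, steps) else 1.0       # trapezoid end points
        total += weight * f
    return 2.0 * total * h                             # integrand is even in x


if __name__ == "__main__":
    for d in (0.0, 1.0, 2.0, 5.0):
        print(d, yosida(d))
```

In the normal state (zero gap) the integral equals one, and it decays roughly exponentially once the gap exceeds $k_BT$, which is the qualitative behavior behind the suppression of the bogolon contribution at low temperatures.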
Figure 9 The specific heat of $^3$He-A, B ($C(T)/C_N$ versus $T/T_c$).

Figure 10 The spin susceptibility of $^3$He-A, B ($\chi(T)/\chi_N$ versus $T/T_c$).

Note that eqn [38] accounts only for the $m_s = 0$ (bogolon) contribution to the spin-triplet susceptibility, the temperature dependence of which is given by the so-called Yosida function $Y(T) = (N_F V)^{-1}\sum_{k,\sigma}\varphi_{k,-1}$. The total susceptibility reads

$$\chi_{\rm tot} = \underbrace{\chi_0}_{\text{bogolons}} + \underbrace{\chi_{+1} + \chi_{-1}}_{\text{condensate}} \qquad [39]$$

with the condensate contributing through $m_s = \pm 1$ a fraction of 2/3 of the normal state Pauli susceptibility. In Figure 10, the reduced spin susceptibility $\chi/\chi_N$ of $^3$He-A, B is plotted vs. reduced temperature. While the constant susceptibility is characteristic of the ESP pairing state, the reduction of the B-phase susceptibility is due to the lack of the nonmagnetic $m_s = 0$ contribution to the spin triplet in the low-temperature limit. Exchange interaction effects, characterized by the dimensionless Landau parameter $F_0^a$, lead to a further reduction of the Balian–Werthamer (BW)-state susceptibility, which is shown for 27 bar, where $F_0^a = -0.755$. Note that the theoretical picture reflected in Figure 10, and also in Figures 6, 7, and 9, is in quantitative agreement with experimental observations.

In summary, superfluidity is a quantum-mechanical phenomenon seen on a macroscopic scale. It occurs below the degeneracy temperature $T^* \propto n^{2/3}/m$ of both Bose and Fermi many-particle systems (like liquid $^4$He and $^3$He) and is a property of a macroscopic number of particles, the condensate. The role of (weak or strong) interactions is manifested in the structure of the relevant elementary excitations, which always exist in addition to the condensate at finite temperatures and above certain critical velocities. These excitations form a gas, referred to as the normal fluid, since it gives rise to temperature-dependent thermodynamic and response functions and contributes to the entropy and the flow dissipation. Superfluidity is now well understood using various aspects of the concept of the macroscopic wave function. On the microscopic level, the mechanisms of BEC and BCS–Leggett pair formation have been successfully invoked to understand the fascinating properties of Bose and Fermi superfluids.

See also: Bose–Einstein Condensates; Bosons and Fermions in External Fields; High Tc Superconductor Theory; Topological Knot Theory and Macroscopic Physics; Variational Techniques for Ginzburg–Landau Energies; Vortex Dynamics.

Further Reading

Anderson PW and Brinkman WF (1973) Physical Review Letters 30: 1911.
Anderson PW and Morel P (1960) Physical Review Letters 5: 136.
Annett J (2004) Superconductivity, Superfluids and Condensates. Oxford: Oxford University Press.
Balian R and Werthamer NR (1963) Physical Review 131: 1553.
Bogoliubov NN (1947) Journal of Physics USSR 11: 23.
Bogoliubov NN (1958) Soviet Physics JETP 7: 794.
Dobbs ER (2000) Helium Three. Oxford: Oxford University Press.
Feynman RP (1953) Physical Review 91: 1301.
Keller WE (1969) Helium-3 and Helium-4. New York: Plenum.
Landau LD (1947) Journal of Physics USSR 11: 91.
London F (1938) Physical Review 54: 947.
Madelung E (1926) Z. Physik 40: 322.
Nambu Y (1960) Physical Review 117: 648.
Tilley DR and Tilley J (1990) Superfluidity and Superconductivity. Adam Hilger.
Tisza L (1938) Journal de Physique et Radium 1: 164.
Tsuneto T (1998) Superconductivity and Superfluidity. Cambridge: Cambridge University Press.
Vollhardt D and Wölfle P (1990) The Superfluid Phases of Helium 3. Taylor and Francis.
Supergravity

K S Stelle, Imperial College, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction: Minimal D = 4 Supergravity

The essential idea of supersymmetry is an extension of the relativistic structure group of spacetime, which in ordinary four-dimensional physics in the absence of gravity is the Poincaré group ISO(3, 1). In a minimal supersymmetric theory in flat D = 4 spacetime, the minimal supersymmetry algebra (the "graded Poincaré algebra") adds spinorial generators $Q_\alpha$ to the Lorentz generators $M_{mn}$ and the translational generators (momenta) $P_m$, where $m = 0, 1, 2, 3$. The core relation is the "anticommutator" of two $Q_\alpha$:

$$\{Q_\alpha, \bar{Q}_\beta\} = 2\gamma^m_{\alpha\beta} P_m \qquad [1]$$

where $\bar{Q} = Q^\dagger\gamma^0$ and the $\gamma^m$ are the Dirac gamma matrices. In the minimal D = 4 supersymmetry algebra, the spinor generator $Q_\alpha$ is taken to be Majorana: $Q = C\bar{Q}^T$, where $C$ is the charge-conjugation matrix and $A^T$ denotes the transpose of the matrix $A$. The full supersymmetry algebra adjoins to the anticommutation relation [1] the usual commutation relations among the Lorentz generators and the commutators of the Lorentz generators with the momenta and the spinors $Q_\alpha$; the latter express respectively the vectorial and spinorial characters of $P_m$ and $Q_\alpha$:

$$i[M_{mn}, M_{pq}] = \eta_{np}M_{mq} - \eta_{mp}M_{nq} - (p \leftrightarrow q) \qquad [2]$$

$$i[M_{mn}, P_q] = \eta_{nq}P_m - \eta_{mq}P_n \qquad [3]$$

$$i[M_{mn}, Q_\alpha] = \tfrac{1}{2}(\gamma_{mn}Q)_\alpha \qquad [4]$$

where $\gamma_{mn} = (1/2)(\gamma_m\gamma_n - \gamma_n\gamma_m)$ and $\eta_{mn} = \mathrm{diag}(-1, 1, 1, 1)$ is the Minkowski metric. The final relation in the supersymmetry algebra expresses the flatness of Minkowski space:

$$[P_m, P_n] = 0 \qquad [5]$$

This algebra has been considered as an extension of the symmetry algebra of particle physics since the work of Gol'fand and Likhtman in 1971, and especially since the linearly realized supersymmetric model of Wess and Zumino in 1974. That model contains a pair of D = 4 scalar fields and a D = 4 Majorana spinor, so the numbers of bosonic and fermionic degrees of freedom are equal; this is a fundamental characteristic of supersymmetric theories.

The work of Wess and Zumino led to an explosion of interest in supersymmetry, especially once it was realized that renormalizable supersymmetric models display a cancellation of some of the divergences that have plagued relativistic quantum field theory since its inception in the 1930s. In particular, in renormalizable flat-space field theory models, divergences quadratic in a high-momentum cutoff vanish as a result of cancellations between virtual bosonic and fermionic particles. This is a very attractive feature for control of the "hierarchy problem" in particle physics, especially for the instability inherent in having vastly different scales within the same theory, for example, the TeV scale of ordinary electroweak physics and the $10^{16}$ GeV scale where unification with the strong interactions might come in.

When one includes gravity, the stability problems of particle physics become much more severe. Einstein's theory of general relativity is itself nonrenormalizable, that is, its ultraviolet divergences are of different forms from the terms present in the original "classical" action and there is no acceptable finite set of correction terms that can be added to it to remove this defect. Moreover, when otherwise tolerably behaved matter field theories that are renormalizable in a flat-spacetime context are coupled to general relativity, the gravitational couplings pollute the matter theories with nonrenormalizable divergences. This is a key aspect of the great difficulty that has been encountered in interpreting gravity as a quantum theory.

Supersymmetry, with its divergence-canceling powers, was thus a very attractive option in the struggle to formulate a quantum theory of gravity, and the creation of a supergravity theory was thus a very high priority task. This was achieved in 1976 by Freedman, Ferrara, and Van Nieuwenhuizen using the technique of iterative Noether coupling to build up this nonlinear theory order-by-order in powers of the fermionic fields. The fermionic partner of the massless spin-2 "graviton" field is a massless fermionic spin-3/2 field that has come to be called the "gravitino." A second 1976 paper by Deser and Zumino soon followed, emphasizing how supergravity manages to circumvent the well-known problems of coupling spins higher than 1 to gravity. A key point in achieving this result is the role played by the local version of the supersymmetry algebra [1]–[5]. As one can see from the translations occurring on the right-hand side of [1], when one replaces translation symmetry by local general coordinate invariance in a gravitational context, the supersymmetry transformations must themselves become local as well. Local symmetries allow for transformation parameters
that are local in the spacetime coordinates $x^m$, and in interacting theories they require coupling of the corresponding "gauge field" to a conserved current. In the case of supergravity, the gravitino field $\psi_{m\alpha}$ plays this gauge-field role, and its coupling to the conserved current of supersymmetry is the key to allowing a consistent coupling between the spin-2 graviton and the spin-3/2 gravitino.

The Minimal Supergravity Action

The action for minimal supergravity in D = 4 dimensions can be written, using the vierbein formalism where the metric is expressed as a quadratic expression in a nonsymmetric $4 \times 4$ vierbein matrix $e^a_m$, $g_{mn} = e^a_m e^b_n \eta_{ab}$, as

$$I = \frac{1}{2\kappa^2}\int d^4x\, \det(e)\, R(e, \omega(e) + K(\psi)) - \frac{i}{2}\int d^4x\, \epsilon^{mnpq}\,\bar\psi_m\gamma_5\gamma_n D_p(e, \omega(e) + K(\psi))\,\psi_q \qquad [6]$$

where $\kappa = \sqrt{8\pi G}$ is the gravitational coupling constant,

$$\omega_m^{ab}(e) = \tfrac{1}{2}e^{na}\left(e^b_{n,m} - e^b_{m,n}\right) - \tfrac{1}{2}e^{nb}\left(e^a_{n,m} - e^a_{m,n}\right) + \tfrac{1}{2}e^{na}e^{rb}\left(e_{nc,r} - e_{rc,n}\right)e^c_m \qquad [7]$$

is the usual vierbein formalism spin connection (in which $e^b_{n,m} = \partial_m e^b_n$ and $e^m_a$ is the matrix inverse of $e^a_m$), and

$$K_m^{ab}(\psi) = \frac{i\kappa^2}{4}\left(\bar\psi_m\gamma^a\psi^b + \bar\psi^a\gamma_m\psi^b - \bar\psi_m\gamma^b\psi^a\right) \qquad [8]$$

is the fermionic contorsion, an additional part of the covariant derivative $D_m(e, \omega(e) + K(\psi))$ appearing in the action [6]. (Indices $m, n$ are taken to be "world" indices while indices $a, b$ are "tangent space" indices; one can convert from one type to another using the vierbein $e^a_m$ and its inverse, e.g., $\gamma_a = e^m_a\gamma_m$.)

Keeping the terms in the action grouped as above, using the nonstandard spin connection $\omega_m^{ab} + K_m^{ab}$ in the covariant derivative, is what has been called the "1.5 order formalism": this greatly simplifies the writing and analysis of the supergravity action [6]. In the action [6], one has the Ricci scalar $R(e, \omega(e) + K(\psi))$ written in terms of this generalized torsional spin connection. One may of course expand out all the $\omega_m^{ab} + K_m^{ab}$ combinations and write the nonlinear fermionic terms separately. Doing this produces a quartic term

$$L_4 = \frac{\kappa^2}{32}\left[\bar\psi^b\gamma^a\psi^c\left(\bar\psi_b\gamma_a\psi_c + 2\bar\psi_a\gamma_b\psi_c\right) - 4\left(\bar\psi_a\gamma^b\psi_b\right)\left(\bar\psi^a\gamma^c\psi_c\right)\right]$$

showing the highly nonlinear nature of supergravity theory: when expanded out, the theory becomes much more cumbersome to study. The 1.5 order formalism trick is one of a large number of algebraic simplifications that had to be developed in order to master the technical aspects of supergravity. It also reveals a characteristic physical feature: this theory naturally involves a connection with torsion built from the fermionic fields.

In terms of the torsional covariant derivative $D_m\epsilon(x) = \left(\partial_m + (1/4)(\omega_m^{ab}(e) + K_m^{ab}(\psi))\gamma_{ab}\right)\epsilon(x)$ of the infinitesimal supersymmetry parameter $\epsilon_\alpha(x)$, the local supersymmetry transformations which leave the action [6] invariant (up to the integral of a total derivative) are

$$\delta e^a_m = i\kappa\,\bar\epsilon\gamma^a\psi_m \qquad [9]$$

$$\delta\psi_m = 2\kappa^{-1}D_m\epsilon \qquad [10]$$

The inhomogeneous part $2\kappa^{-1}\partial_m\epsilon$ in the gravitino transformation [10] demonstrates the gauge-field nature of the gravitino field. For a distribution of "supermatter" fields (e.g., Wess–Zumino model scalars and spinors), the integrated "charge" that one would get from a Gauss's law surface integral at spatial infinity using the gravitino gauge field is the total supercharge $Q_\alpha$, which in turn plays the role of the supersymmetry generator in the original matter-sector supersymmetry algebra [1].

Both the gravitational field and the gravitino field are thus effectively gauge fields, albeit not of a standard Yang–Mills type. The local algebra is a deformation of the rigid supersymmetry algebra [1]–[5], generalizing the relation between general covariance and flat-space Poincaré symmetry. Some basic consequences of the flat-space algebra are preserved, however. An extremely important instance of this is energy positivity. As one can see by multiplying [1] by $\gamma^0$ and then contracting on the spinor index,

$$E = P^0 = \frac{1}{2}\sum_\alpha\{Q_\alpha, Q_\alpha^\dagger\}$$

The right-hand side is manifestly non-negative provided the theory is quantized in a positive-metric Hilbert space. One can see this even more explicitly in a Majorana spinor basis, where $Q_\alpha^\dagger = Q_\alpha$. Accordingly, for flat-space supersymmetric theories, one obtains directly the result that energy is non-negative. This carries over to the local algebra of supergravity, where the total energy is obtained from a Gauss's law integral over the sphere at spatial infinity.

In general relativity, an integrated energy can be defined with respect to an asymptotic timelike Killing vector at spatial infinity. Showing that this
energy is non-negative remained for decades a famously unsolved problem in gravitational physics; it was ultimately proven in Yau's positive-energy theorem. The algebraic structure of supergravity makes energy positivity much more transparent, however. Since pure general relativity can be obtained by setting the gravitino field to zero, this result is inherited by pure Einstein theory as a consequence of its being embeddable into supergravity. Energy positivity can thus be proved even at the classical level using ideas taken from supergravity, as was done by Witten and later streamlined by Nester, in an argument much simpler than Yau's proof. This argument writes the energy as an integral over a positive-semidefinite expression quadratic in a commuting spinor field which is analogous to the (anticommuting) spinor parameter of supergravity in the transformations [9] and [10].

Auxiliary Fields and Superspace

Supergravity shares with flat-space supersymmetric theories a curious technical feature that gives a hint of a new underlying geometry. Standard counting of the gauge-invariant continuous degrees of freedom of the graviton and the gravitino in momentum space yields the same result per momentum value: two bosonic degrees of freedom and two fermionic degrees of freedom. This accords with the general requirement in supersymmetric theories that the numbers of bosonic and fermionic degrees of freedom match. This count follows from the Einstein and spin-3/2 equations of motion, or "on-shell." If one compares the count of nongauge degrees of freedom without using the equations of motion (i.e., "off-shell"), one obtains an imbalance, however: six nongauge graviton versus 12 nongauge fermion fields. This is directly related to another puzzling feature of the supergravity realization of local supersymmetry: the local supersymmetry algebra closes onto a finite set of transformations only when the equations of motion are imposed.

As in flat-space supersymmetry, the cure for this problem is to add nondynamical "auxiliary" fields to the action. In the supergravity case, the imbalance in the off-shell Bose–Fermi field count indicates that an additional six bosonic fields are needed. In the minimal set of auxiliary fields, these organize into a vector $b_m$ and a scalar-pseudoscalar pair $M, N$; the additional terms in the action [6] are simply

$$\int d^4x\, \det(e)\left(-\tfrac{1}{3}M^2 - \tfrac{1}{3}N^2 + \tfrac{1}{3}b_m b^m\right)$$

while the local supersymmetry transformations are changed to include the auxiliary fields, e.g., the gravitino transformation becomes

$$\delta\psi_m = 2\kappa^{-1}D_m(\omega, K)\,\epsilon + \gamma_5\left(b_m - \tfrac{1}{3}\gamma_m\gamma^n b_n\right)\epsilon - \tfrac{1}{3}\gamma_m\left(M + \gamma_5 N\right)\epsilon$$

while the auxiliary fields transform into expressions that vanish on-shell. Since the field equations for the auxiliary fields are algebraic in character and since for source-free supergravity they have the simple solution $b_m = M = N = 0$, one can directly regain the on-shell formalism by algebraically eliminating the auxiliary fields.

The inclusion of auxiliary fields is not an empty trick, however. The local supersymmetry transformations including the auxiliary fields form a closed set without the use of equations of motion ("off-shell closure"). This standardizes the form of the supersymmetry transformations so that they remain the same even when supermatter is coupled to supergravity, instead of needing a case-by-case Noether construction as in the case without the auxiliary fields. In this way, a standard set of coupling rules can be drawn up, known as the "tensor calculus." This tensor calculus is of great importance as it allows for the construction of general models of supergravity coupled to supermatter (Wess–Zumino multiplets and super Yang–Mills multiplets consisting of spin-1 gauge fields and spin-1/2 "gaugino" fields). These general couplings form the basis for essentially all supersymmetric phenomenology, and in particular for the formulation of the Minimal Supersymmetric Standard Model. Since supersymmetry is not directly observed in low-energy physics, it must be spontaneously broken, like many other gauge symmetries. As it happens, the physically realistic mechanisms of supersymmetry breaking all originate from supergravity couplings derived using the tensor calculus.

Given the regular set of tensor calculus rules for coupling supergravity to supermatter, one is led to suspect that a geometrical structure lies in the background. This is indeed the case; the corresponding construction is known as "superspace."

The basic idea of superspace is a generalization of the coset space construction of Minkowski space as the coset space given by the Poincaré group divided by the Lorentz group: $M_4(x^m) = \mathrm{ISO}(3, 1)/\mathrm{SO}(3, 1)$. For supersymmetric theories, one analogously constructs $\mathrm{Superspace}(x^m, \theta^\alpha) = \text{Graded Poincaré}/\mathrm{SO}(3, 1)$. The basic ideas of superspace were introduced by Akulov and Volkov in 1972, while the idea of expanding in "functions" on this space, thus yielding "superfields," was introduced by Salam and Strathdee
in 1974. This led to a formulation of the Wess–Zumino model in terms of a chiral superfield $\Phi(x, \theta)$, which is subjected to a covariant superspace constraint.

In order to manage the formalism of superspace more efficiently, it is convenient to use a two-component spinor formalism corresponding to the Weyl basis for the Dirac gamma matrices, in which the Majorana spinor coordinate $\theta$ is represented as

$$\theta = \begin{pmatrix} \theta_\alpha \\ \bar\theta^{\dot\alpha} \end{pmatrix}$$

where two-component indices $\alpha, \dot\alpha = 1, 2$ are raised and lowered with the covariant two-index antisymmetric tensors $\epsilon_{\alpha\beta}$, $\epsilon_{\dot\alpha\dot\beta}$, which both take the numerical value $i\sigma_2$. The flat-space fermionic covariant derivatives are then

$$D_\alpha = \frac{\partial}{\partial\theta^\alpha} + i\sigma^m_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_m; \quad \bar D_{\dot\alpha} = \frac{\partial}{\partial\bar\theta^{\dot\alpha}} + i\theta^\alpha\sigma^m_{\alpha\dot\alpha}\partial_m \qquad [11]$$

where the $\sigma^m_{\alpha\dot\alpha} = (1, \sigma^i)$ for $m = (0, i)$ (where $\sigma^i$ are the Pauli matrices) are the Van der Waerden matrices which establish the mapping between vector indices and (chiral, antichiral) spinor index pairs. The Wess–Zumino multiplet is then described by a complex chiral superfield satisfying the constraint $\bar D_{\dot\alpha}\Phi = 0$. Unlike the situation in Minkowski space, where the only Lorentz-covariant solution to a constraint that sets to zero the $\partial/\partial x^m$ derivatives is a constant, superspace has a reducible set of coordinates $(x^m, \theta^\alpha, \bar\theta^{\dot\alpha})$ and, as a result, requiring $\Phi$ to be annihilated by $\bar D_{\dot\alpha}$ does not require the whole superfield to be a constant.

Since the fermionic coordinates of superspace $\theta^\alpha, \bar\theta^{\dot\alpha}$ are anticommuting (i.e., they are elements of a Grassmann algebra), and since $\alpha, \dot\alpha = 1, 2$ have an index range of two, powers of them higher than the second order necessarily vanish. As a result, superfields like $\Phi$ can be expanded into sets of component fields, each of which is an ordinary field in Minkowski space. In this way, a chiral superfield expands into $(A(x), B(x), \chi_\alpha(x), \bar\chi_{\dot\alpha}(x), F(x), G(x))$, where the fields $A$, $B$, $\chi_\alpha$, and $\bar\chi_{\dot\alpha}$ are the physical fields of the Wess–Zumino model, while $F$ and $G$ are dimension-2 auxiliary fields. In this way, the auxiliary fields of supersymmetry naturally fit into a superspace formalism as higher components in a superfield expansion. It is in this sense that they point toward the superspace formulations of supersymmetric theories.

For supergravity, there are a number of different approaches to realizing the theory in superspace, and these correspond naturally to the various possible choices of auxiliary-field sets. With the minimal set, the supergravity multiplet is described by a superfield carrying a vector index, $H_m(x, \theta, \bar\theta)$; this superfield is called the prepotential of supergravity. Note that since the divisor group in the coset-space construction of superspace is the Lorentz group, superfields may carry indices corresponding to any Lorentz representation. The component-field expansion of the $H_m$ superfield yields the physical fields $(e^a_m, \psi_{m\alpha}, \bar\psi_{m\dot\alpha})$ and auxiliary fields $(b_m, M, N)$ together with a number of other components of dimension lower than those of the physical fields. This is not, however, all that surprising: even the physical fields $e^a_m, \psi_{m\alpha}, \bar\psi_{m\dot\alpha}$ contain components that are not directly related to the physical modes because we are dealing with a gauge theory. What occurs in superspace is a redundant expression of the supergravity multiplet with the presence of various component gauge fields.

The full expression of local supersymmetry in superspace can be given in a number of different formalisms. Suffice it here to indicate the transformation of the linearized theory expanded in small fluctuations about empty flat superspace. Converting the vector index of $H_m$ into a (chiral, antichiral) spinor index pair via $H_{\alpha\dot\alpha} = \sigma^m_{\alpha\dot\alpha}H_m$, the linearized local symmetry transformation of the supergravity multiplet is

$$\delta H_{\alpha\dot\alpha} = D_\alpha\bar L_{\dot\alpha} - \bar D_{\dot\alpha}L_\alpha \qquad [12]$$

where the transformation parameter superfield $L_\alpha$ carrying a spinor index is antichiral: $D_\beta L_\alpha = 0$ (while the conjugate parameter superfield $\bar L_{\dot\alpha}$ is chiral). Expanding in component fields and comparing with the expansion of $H_m$, one sees that the chiral spinor superfield contains precisely the components needed to provide the standard gauge symmetries of $e^a_m$ and $\psi_{m\alpha}, \bar\psi_{m\dot\alpha}$ and also to transform the other gauge components of $H_m$ as well. One can then make various gauge choices according to taste in a given context.

One frequently encountered superspace gauge choice sets to zero all the fields in $H_m$ except for the physical and auxiliary fields $(e^a_m, \psi_{m\alpha}, \bar\psi_{m\dot\alpha}, b_m, M, N)$. This is called a Wess–Zumino gauge following the analogy to a similar construction for super Maxwell theory (containing spins 1 and 1/2). Wess–Zumino gauge choices are not, however, supersymmetrically covariant. This shows up when one works out the supersymmetry algebra in such a gauge: the presence of auxiliary fields gives closure, as required, without use of the equations of motion, but the anticommutator of two supersymmetry
transformations when acting on a gauge field such as the Maxwell field or the vierbein gives a combination of the anticipated translation with an admixture of a gauge transformation with a field-dependent parameter.

The prepotential superfield of minimal supergravity can itself be fit into larger formalisms in superspace that are analogous to standard differential geometry, with supervielbeins, superspin connections and so forth. An unavoidable feature of these more seemingly geometric constructions, however, is their high degree of redundancy: superspace vielbeins and spin connections carrying Lorentz indices have many component fields in addition to those found in the prepotential. This redundancy is then cut down in turn by imposing superspace constraints on the geometrical superfields, for example, on the components of the torsion tensor in superspace.

Extended Supergravities and Supergravities in Higher Dimensions

The possible graded extensions of the Poincaré algebra allow for more than one spinorial generator. Thus, one can have N supersymmetry generators $Q^i_\alpha$, $\bar Q_{\dot\alpha j}$, i, j = 1, ..., N, with basic anticommutators (in Lorentz two-component notation)

\[ \{Q^i_\alpha, \bar Q_{\dot\alpha j}\} = 2\,\delta^i_j\,\sigma^m_{\alpha\dot\alpha}\,P_m \qquad [13] \]

\[ \{Q^i_\alpha, Q^j_\beta\} = 2\,\epsilon_{\alpha\beta}\,a^{\ell\,ij}\,Z_\ell \qquad [14] \]

\[ \{\bar Q_{\dot\alpha i}, \bar Q_{\dot\beta j}\} = 2\,\epsilon_{\dot\alpha\dot\beta}\,a^{\ell}_{ij}\,Z_\ell \qquad [15] \]

The right-hand sides of [14] and [15] allow for the possibility of nonvanishing anticommutators between supersymmetry generators of the same chirality. As one can see from the overall symmetry in the pairs of indices $(\alpha i)$, $(\beta j)$, the coefficients $a^{\ell\,ij}$ must be antisymmetric in the i, j indices, so such nonvanishing same-chirality anticommutators cannot occur for N = 1. The corresponding abelian generators $Z_\ell$ are called central charges since they must commute with all the other ($Q^i_\alpha$, $\bar Q_{\dot\alpha j}$, $P_m$) elements of the algebra.

The i, j indices may be endowed with a symmetry meaning as well, although this is not obligatory in every model. When the central charges are absent, $Z_\ell = 0$, one has U(N) (or SU(N)) as the maximal such external automorphism; the choice of index placement on $Q^i_\alpha$ and $\bar Q_{\dot\alpha j}$ anticipates this. If such a symmetry is realized in a given model, the fact that the $Q^i_\alpha$, $\bar Q_{\dot\alpha j}$ carry representations both for that symmetry and for the spacetime Poincaré symmetry demonstrates how supersymmetry evades the no-go theorem barring unified spacetime and internal symmetries. This theorem (the Coleman–Mandula theorem) can be evaded, since at the time it was written, graded Lie symmetry algebras were not yet considered. For nonzero central charges, the external automorphism algebra becomes a subalgebra of U(N) determined by the requirement that invariant antisymmetric tensors $a^{\ell\,ij}$ exist.

The representations of the algebra [13]–[14] span an increasing range of spins as the number N of D = 4 supersymmetries increases. For massive representations without central charges, the spins of the smallest supersymmetry representation extend from states of spin 0 (scalars) up to spin N/2; with central charges, the spin range can be shortened down to a minimum range of N/4. For massless representations, the range of helicities in a PCT (parity, charge conjugation, time reversal) symmetric multiplet is from −N/4 to N/4. This spin range has an important implication for the maximal extension of supersymmetry that can be realized in an interacting supersymmetric field theory, because no interacting theories with a finite set of spins exist for spins > 2. Accordingly, the maximal extension of supersymmetry is N = 8 for massless theories, and in order to have massive states with spins that do not exceed spin 2 in an N = 8 theory, the central charges have to be active for maximal multiplet shortening.

The N = 8 supergravity theory, found by Cremmer and Julia in 1978, is thus the largest possible supergravity in D = 4 dimensions. It contains the following "spin" range (allowing for a certain imprecision of expression: for massless fields one should really speak only of helicities):

N = 8 supergravity spins

Spin           2     3/2   1     1/2   0
Multiplicity   1     8     28    56    70

In order to realize the automorphism SU(8) symmetry, one has to consider the field strengths for the 28 spin-1 fields, separated into complex self-dual and anti-self-dual parts in their antisymmetric Lorentz indices. These complex field strengths can then be endowed with a complex 28-dimensional representation of SU(8). The 70 scalars, on the other hand, fit precisely into the four-index antisymmetric self-dual representation of SU(8), $\phi_{i_1 i_2 i_3 i_4} = \frac{1}{4!}\,\epsilon_{i_1 i_2 i_3 i_4 j_1 j_2 j_3 j_4}\,\bar\phi^{\,j_1 j_2 j_3 j_4}$. It is the use of the eight-index epsilon tensor here that restricts the automorphism group to SU(8) instead of U(8).

The SU(8) automorphism symmetry of N = 8 supergravity theory is linearly realized. It plays an important role in another symmetry of this theory which is highly nonlinear. This theory has a
remarkable nonlinear $E_7$ symmetry. In fact, the 70 scalars form a nonlinear sigma model with the fields taking their values in the coset space $E_7/SU(8)$ (of dimension 133 − 63 = 70), where the SU(8) divisor is the linearly realized automorphism group discussed above.

The extended supergravities point to another aspect of supergravity theory: the existence of higher-dimensional supergravities, from which the extended theories in D = 4 spacetime can be derived by Kaluza–Klein dimensional reduction. If one considers a $D_0$ dimensional massless theory in a spacetime where d dimensions form a compact d-torus, then the theory can be viewed as a $D = D_0 - d$ dimensional theory in which the discrete Fourier modes arising from the periodicity requirements on the d-torus give rise to towers of equally spaced massive Kaluza–Klein states, plus a massless sector in $D_0 - d$ dimensions corresponding to the modes with no dependence on the d-torus coordinates.

Importantly, N = 8 supergravity in four-dimensional spacetime can be obtained in this way from a supergravity theory that exists in 11 spacetime dimensions. Upon dimensional reduction on a 7-torus to four dimensions, one obtains N = 8, D = 4 supergravity at the massless level, plus an infinite tower of massive N = 8 supermultiplets with central charges so that their spin range extends only up to spin 2. This D = 11 supergravity was in fact found before the N = 8 theory by Cremmer, Julia, and Scherk, with the details of the more complicated N = 8, D = 4 theory being worked out via the techniques of Kaluza–Klein dimensional reduction.

The fields of the D = 11 theory include an exotic field type not encountered in D = 4 theories: the bosonic fields of the theory comprise the graviton $e_M^A$ plus a three-index antisymmetric tensor gauge field $C_{MNP}$. Counting the number of propagating modes of these fields for a given momentum value gives 44 + 84 = 128 bosonic degrees of freedom. This precisely balances the 128 fermionic degrees of freedom coming from the D = 11 gravitino $\psi_M$.

Supergravity Effective Theories, Strings and Branes

The hope for a cancellation of the ultraviolet divergences in a supersymmetric theory of gravity turned out to be ephemeral, although there is in fact a postponement of the divergence onset until a higher order in quantum field loops. There is agreement that the nonmaximal supergravities diverge at the three-loop order. For the N = 8, D = 4 theory, the situation remains unclear, but divergences are nonetheless expected to occur at some finite loop order.

This persistence of nonrenormalizability in D = 4 supergravity theories is no longer seen as a disaster, however, because these theories are now seen as effective theories for the massless modes arising from a deeper microscopic quantum theory. In addition, the theories that are most directly connected to this underlying quantum theory are, surprisingly, the maximal supergravities in spacetime dimensions 10 and 11. D = 11 supergravity can be dimensionally reduced on a 1-torus (i.e., a circle) to D = 10 where the massless sector yields type IIA supergravity theory. This theory is the effective theory for a consistent quantum theory of type IIA superstrings in D = 10. Theories of relativistic strings (i.e., one-dimensional extended objects) have strikingly different properties from theories of point particles. In particular, the spread-out nature of the interactions leads to a damping out of the quantum field theory divergences, while the underlying supersymmetry causes a cancellation of other infinities that could have arisen owing to the two-dimensional nature of the string world sheets. This gives, for the first time, a perturbatively well-defined quantum theory including gravity.

In addition to the type IIA theory, there are four other consistent superstring theories in D = 10, and these are in turn related to various D = 10 supergravity effective theories for the massless modes: type IIB, $E_8 \times E_8$ heterotic, SO(32) heterotic, and SO(32) type I. Remarkably, the maximal D = 11 supergravity enters into this picture as well, as a consequence of a pattern of duality symmetries that have been found among the superstring theories.

The dualities of string theory are directly related to the nonlinear symmetries of the dimensionally reduced supergravities in D = 4. The string quantum corrections do not respect the $E_7$ symmetry of the classical N = 8 theory, but they do respect a discrete subgroup of this symmetry in which the $E_7$ group elements are required to take integer values: $E_7(\mathbb{Z})$. This quantum-level restriction to a discrete subgroup can be seen from another phenomenon characteristic of superstring theories: the existence of "electric" and "magnetic" brane solutions. The antisymmetric-tensor (or "form") fields of the higher-dimensional supergravities naturally give rise to solitonic solutions in which p + 1 dimensions form a flat Poincaré invariant subspace. This can be interpreted as the world volume of an infinite p-brane extended object. In the D = 11 supergravity theory, the branes that emerge in this way are a 2-brane and a 5-brane. The three-dimensional world volume of the 2-brane naturally couples to the
3-form field $C_{MNP}$, just as an ordinary Maxwell vector field couples to the one-dimensional world line of a point particle (or 0-brane). The 2-brane is thus naturally electrically charged with respect to the 3-form field; its charge can be obtained, in a direct generalization of the Maxwell case, from a Gauss' law integral of the field strength $H_{[4]} = dC_{[3]}$ over a 7-sphere at spatial infinity in the eight directions transverse to the brane worldvolume. The 5-brane, on the other hand, has a magnetic type charge; it is the 7-form dual to $H_{[4]}$ that is integrated to give its charge. In addition to these static infinite p-branes, the theory contains dynamical finite-extent branes as well, although for these one generally does not have explicit solutions.

As one reduces a higher-dimensional supergravity to lower and lower dimensions, there is a proliferation of solitonic brane solutions of varying dimensionality, and of both electric and magnetic charge types. In a quantum theory context, these electrically and magnetically charged branes pair up in ways that must satisfy a generalization of the Dirac quantization condition for D = 4 electric and magnetic point particles. This ends up requiring all the supergravity solitonic brane charges to lie on a charge lattice. It is the requirement that this discrete brane-charge lattice be respected that restricts the classical supergravity nonlinear symmetry groups to discrete duality subgroups.

The dualities relate brane solutions within a given theory and also between different string theories. They include transformations that invert the radii of compactifying tori, giving a large–small compactification scale duality. They also include transformations that invert the string coupling constant, thus interchanging strong and weak coupling. The type IIB theory, for example, is self-dual under strong–weak coupling duality. In the case of the type IIA theory, however, something remarkable happens. The strong coupling limit of this theory turns out to be related by duality, not to another string theory, but to the maximal D = 11 supergravity. The role of the Kaluza–Klein massive modes for the 11 to 10 reduction is played by an infinite tower of extremal charged black holes.

Thus, even D = 11 supergravity theory has a role to play in the effective theory of the underlying quantum dynamics. This underlying theory has been dubbed "M-theory." It is still only partially understood, but many of its most important properties are presaged by the remarkable nonlinear structure of the classical supergravities.

See also: Brane Construction of Gauge Theories; Brane Worlds; Branes and Black Hole Statistical Mechanics; Random Algebraic Geometry, Attractors and Flux Vacua; Renormalization: General Theory; Spinors and Spin Coefficients; Stability of Minkowski Space; Supermanifolds; Superstring Theories; Supersymmetric Particle Models; Symmetries and Conservation Laws; Symmetries in Quantum Field Theory: Algebraic Aspects.

Further Reading

Buchbinder IL and Kuzenko SM (1998) Ideas and Methods of Supersymmetry and Supergravity. Bristol: IoP Publishing Ltd.
Stelle KS (1998) BPS branes in supergravity, Trieste 1987 School of High-Energy Physics and Cosmology, arXiv:hep-th/9803116.
Van Nieuwenhuizen P (1981) Supergravity. Physics Reports 68: 189–398.
Wess J and Bagger J (1983) Supersymmetry and Supergravity. Princeton: Princeton University Press.
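The counting statements made in the Supergravity article above (the N = 8 multiplicities 1, 8, 28, 56, 70, the balance of 44 + 84 = 128 bosonic against 128 fermionic modes in D = 11, and the coset dimension 133 − 63 = 70) can be checked with a short Python sketch. This is an illustrative check rather than part of the original article: the function names are hypothetical, the binomial rule for helicity states is the standard counting for a massless multiplet built from N supercharges, and the figure of 16 on-shell real components for a D = 11 Majorana spinor is supplied as an assumption, not derived.

```python
from math import comb


def n8_helicity_counts(n_susy=8):
    """Helicity -> number of states for a PCT-symmetric massless multiplet.

    Assumes the standard counting: acting with k of the n_susy supercharges
    on the top-helicity state gives comb(n_susy, k) states of helicity
    n_susy/4 - k/2.
    """
    top = n_susy / 4
    return {top - k / 2: comb(n_susy, k) for k in range(n_susy + 1)}


def d11_onshell_counts(dim=11, spinor_components=16):
    """On-shell mode counts (graviton, 3-form, gravitino) in `dim` dimensions.

    spinor_components=16 is an assumed input: the number of on-shell real
    components of a Majorana spinor in D = 11.
    """
    transverse = dim - 2                                 # little group SO(dim - 2)
    graviton = transverse * (transverse + 1) // 2 - 1    # symmetric traceless tensor
    three_form = comb(transverse, 3)                     # antisymmetric 3-index tensor
    gravitino = (transverse - 1) * spinor_components     # gamma-traceless vector-spinor
    return graviton, three_form, gravitino


counts = n8_helicity_counts()
bosons = sum(c for h, c in counts.items() if float(h).is_integer())
fermions = sum(c for h, c in counts.items() if not float(h).is_integer())

print(counts)                        # helicities 2, 3/2, ..., -2 with their multiplicities
print(bosons, fermions)              # bosonic and fermionic state totals
print(d11_onshell_counts())          # (graviton, 3-form, gravitino) modes in D = 11
print(133 - (8**2 - 1), comb(8, 4))  # dim E7 - dim SU(8), and the number of scalars
```

The multiplicities in the table are field counts for helicities 2 down to 0, which is why the 28 vectors, for example, account for the states at both helicity +1 and helicity −1 in the dictionary returned above.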

Supermanifolds

F A Rogers, King's College London, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A supermanifold is a generalization of a classical manifold to include coordinates that are in some sense anticommuting. Much of the motivation for the study of supermanifolds comes from supersymmetric physics, where it is useful to have a formalism which treats fermions and bosons in the same way. The underlying reason for the effectiveness of supermanifolds is that anticommuting coordinates allow the fermionic canonical anticommutation relations to be handled in a way analogous to the bosonic canonical commutation relations. Supersymmetric methods have proved immensely effective in fundamental physics; they also play a considerable role in geometrical index theory in mathematics. In this article we describe supermanifolds from two points of view, geometric and algebraic, and consider some of the standard features of manifold calculus, including integration since this is an area where the distinctive features of this generalized geometry are particularly apparent.

One situation where supermanifolds are used in physics is in the superspace formulation of supergravity, where the physical fields are found in the component fields in the Taylor expansion of functions on the supermanifold in anticommuting variables. More fundamentally, the symmetry groups of supersymmetric theories have commuting and anticommuting generators, and are examples of super Lie groups, which are supermanifolds with a compatible group structure.

Some Algebraic Preliminaries

The coordinates of a supermanifold have particular algebraic features which are best understood by introducing some of the basic concepts of superalgebra. (The word super here does not imply superiority, simply the extension of some classical concept to have odd as well as even, anticommuting as well as commuting, elements.) A "super vector space" is a vector space V together with a direct sum decomposition

\[ V = V_0 \oplus V_1 \qquad [1] \]

The subspaces $V_0$ and $V_1$ are referred to, respectively, as the even and odd parts of V. A general element v of V thus has the unique decomposition $v = v_0 + v_1$ with $v_0$ in $V_0$ and $v_1$ in $V_1$. We will normally consider homogeneous elements, that is, elements v which are either even or odd, with parity denoted by |v|, so that |v| = i if v is in $V_i$, i = 0, 1. (Arithmetic of parity indices i = 0, 1 is always modulo 2.) A superalgebra is a super vector space whose elements can be multiplied together in such a way that the product of an even element with an even element and that of an odd element with an odd element are both even, while the product of an odd element with an even element is odd; more formally:

Definition 1
(i) A "superalgebra" is a super vector space $A = A_0 \oplus A_1$ which is also an algebra which satisfies $A_i A_j \subset A_{i+j}$.
(ii) The superalgebra is "supercommutative" if, for all homogeneous a, b in A, $ab = (-1)^{|a||b|}\,ba$.

If the algebra is supercommutative then odd elements anticommute, and the square of an odd element is zero. The basic supercommutative superalgebra used is the real Grassmann algebra with generators $1, \beta_1, \beta_2, \ldots$ and relations

\[ 1\beta_i = \beta_i 1 = \beta_i, \qquad \beta_i\beta_j = -\beta_j\beta_i \qquad [2] \]

A typical element of this algebra is then

\[ a = a_\emptyset 1 + \sum_i a_i\beta_i + \sum_{i<j} a_{ij}\beta_i\beta_j + \cdots \qquad [3] \]

This algebra, which is denoted $\mathbb{R}_S$, is a superalgebra with $\mathbb{R}_S := \mathbb{R}_{S,0} \oplus \mathbb{R}_{S,1}$, where $\mathbb{R}_{S,0}$ consists of linear combinations of products of even numbers of the anticommuting generators, while $\mathbb{R}_{S,1}$ is built similarly from odd products.

The Grassmann algebra $\mathbb{R}_S$ is used to build the (m, n)-dimensional superspace $\mathbb{R}^{m,n}_S$ in the following way:

Definition 2. An (m, n)-dimensional superspace is the space

\[ \mathbb{R}^{m,n}_S = \underbrace{\mathbb{R}_{S,0}\times\cdots\times\mathbb{R}_{S,0}}_{m\ \text{copies}} \times \underbrace{\mathbb{R}_{S,1}\times\cdots\times\mathbb{R}_{S,1}}_{n\ \text{copies}} \qquad [4] \]

A typical element of $\mathbb{R}^{m,n}_S$ is written as $(x^1, \ldots, x^m; \xi^1, \ldots, \xi^n)$, where the convention is used that lower case Latin letters represent even objects and lower case Greek letters represent odd objects, while small capitals are used for objects of mixed or unspecified parity.

As will be described in more detail below, in the geometric approach supermanifolds are spaces locally modeled on $\mathbb{R}^{m,n}_S$. In order to define a supermanifold, we will need to define a topology on this space, and to have some notion of differentiation. Consider first multilinear functions of purely anticommuting variables. If there are n such variables, $\xi^1, \ldots, \xi^n$, then a multilinear function F can be expressed in the form

\[ F(\xi^1, \ldots, \xi^n) = F_\emptyset + \sum_{i=1}^{n} F_i\,\xi^i + \sum_{1\le i<j\le n} F_{ij}\,\xi^i\xi^j + \cdots + F_{1\ldots n}\,\xi^1\cdots\xi^n \qquad [5] \]

where the coefficients $F_\emptyset$, $F_i$ and so on are real numbers. Such functions will be known (anticipating the terminology for functions of both odd and even variables) as supersmooth. (A useful notation will be to write

\[ F(\xi^1, \ldots, \xi^n) = \sum_\mu F_\mu\,\xi^\mu \qquad [6] \]

with $\mu$ a multi-index $\mu = \mu_1\cdots\mu_k$ and $\xi^\mu = \xi^{\mu_1}\cdots\xi^{\mu_k}$, the empty multi-index giving $\xi^\emptyset = 1$. The set of multi-indices is restricted to those where $1 \le \mu_1 < \cdots < \mu_k \le n$.) More general supersmooth functions, with the coefficients $F_\emptyset, \ldots$ taking values in $\mathbb{C}$, $\mathbb{R}_S$, or some other algebra are also possible.

Differentiation of supersmooth functions of anticommuting variables is defined by linearity together with the rule

\[ \frac{\partial(\xi^{\mu_1}\xi^{\mu_2}\cdots\xi^{\mu_r})}{\partial\xi^j} = \begin{cases} (-1)^{k-1}\,\xi^{\mu_1}\cdots\widehat{\xi^{\mu_k}}\cdots\xi^{\mu_r} & \text{if } j = \mu_k \\ 0 & \text{otherwise} \end{cases} \qquad [7] \]

where the caret indicates an omitted factor.

In order to extend the notion of supersmoothness to functions on the more general superspace $\mathbb{R}^{m,n}_S$, we should strictly take note of the fact that an even Grassmann variable is not simply a real or complex variable, as explained in the appendix. Assuming this done, a supersmooth function on the general superspace $\mathbb{R}^{m,n}_S$ can then be defined as a function of the form

\[ F(x^1, \ldots, x^m, \xi^1, \ldots, \xi^n) = \sum_\mu F_\mu(x^1, \ldots, x^m)\,\xi^\mu \qquad [8] \]

with each coefficient function $F_\mu$ a smooth function on $\mathbb{R}^m$.

The final preparatory idea needed is the topology on the superspace $\mathbb{R}^{m,n}_S$. It turns out that a coarse, non-Hausdorff topology leads to most of the supermanifolds used in physics. In order to define this topology, we introduce a mapping

\[ \epsilon : \mathbb{R}_S \to \mathbb{R} \]

defined by

\[ \epsilon\Big(a_\emptyset 1 + \sum_i a_i\beta_i + \sum_{i<j} a_{ij}\beta_i\beta_j + \cdots\Big) = a_\emptyset \qquad [9] \]

and the related mapping

\[ \epsilon^{m,n} : \mathbb{R}^{m,n}_S \to \mathbb{R}^m \]

defined by

\[ \epsilon^{m,n}\big((x^1, \ldots, x^m; \xi^1, \ldots, \xi^n)\big) = \big(\epsilon(x^1), \ldots, \epsilon(x^m)\big) \qquad [10] \]

These maps project out all the nilpotent Grassmann generators, leaving simply the real part. The topology involves the inverse of these projection maps: a subset U of $\mathbb{R}^{m,n}_S$ is said to be open if and only if there exists an open set V in $\mathbb{R}^m$ such that $U = (\epsilon^{m,n})^{-1}(V)$. Thus, an open set is unlimited in the nilpotent directions.

In the sequel, where we consider integration, the superdeterminant of the matrix M of an endomorphism of a super vector space V will be useful. If V is an (m, n)-dimensional super vector space (so that $V_0$ has dimension m and $V_1$ dimension n), then M will have the block form

\[ M = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix} \]

where the entries of $M_{00}$ and $M_{11}$ are even, whereas those of $M_{01}$ and $M_{10}$ are odd. If $N = M^{-1}$ has block form

\[ N = \begin{pmatrix} N_{00} & N_{01} \\ N_{10} & N_{11} \end{pmatrix} \]

then the superdeterminant of M is defined by

\[ \operatorname{Sdet} M = \det M_{00}\,\det N_{11} \]

It can be shown that the superdeterminant obeys the product rule, unlike the obvious generalization of the determinant to the super case.

The Geometric Approach to Supermanifolds

A manifold is a space locally modeled on the topological space $\mathbb{R}^m$, where m is the dimension of the manifold. Thus, each point in a manifold has a neighborhood which is essentially a neighborhood in $\mathbb{R}^m$. The most geometrically intuitive approach to supermanifolds is to generalize this directly by modeling a space locally on an extension of $\mathbb{R}^m$ to include anticommuting variables; the most straightforward space with the required algebraic property is the superspace $\mathbb{R}^{m,n}_S$ built from a Grassmann algebra, leading to a supermanifold of dimension (m, n). (The dimension of a supermanifold is a pair of integers, indicating the numbers of even and odd coordinates of each point.)

The formal definition of a supermanifold will now be given in a manner very closely analogous to that of a classical manifold.

Definition 3. Let M be a set.

(i) An (m, n) open chart on M is a pair $(U, \psi)$ such that U is a subset of M and $\psi$ is an injective map of U into $\mathbb{R}^{m,n}_S$, with the image $\psi(U)$ an open set in $\mathbb{R}^{m,n}_S$.
(ii) An (m, n) atlas on M is a collection $\{(U_\alpha, \psi_\alpha)\}$ of (m, n) charts on M such that the $U_\alpha$ cover M and, whenever $U_\alpha \cap U_\beta$ is not empty, the change of coordinate function $\psi_\alpha \circ \psi_\beta^{-1}$ is supersmooth.

An (m, n)-dimensional supermanifold is a set M together with a maximal (m, n) atlas on M.

The space M is given a topology by defining $U \subset M$ to be open if and only if, for each $\alpha$ such that $U \cap U_\alpha$ is not empty, the set $\psi_\alpha(U \cap U_\alpha)$ is an open subset of $\mathbb{R}^{m,n}_S$.

Examples of supermanifolds include $\mathbb{R}^{m,n}_S$ itself, and also supermanifolds constructed from the data of a vector bundle over a classical manifold in a manner which will now be described. If N is a classical m-dimensional real manifold and E is an n-dimensional vector bundle over N, then an (m, n)-dimensional supermanifold can be constructed in the following way: suppose that $\{(V_\alpha, \phi_\alpha)\}$ is an atlas of charts on N, so that each $V_\alpha$ is an open subset of N and each $\phi_\alpha$ is an injective map of $V_\alpha$ onto an open subset of $\mathbb{R}^m$, with $\phi_\alpha \circ \phi_\beta^{-1}$ smooth. Suppose further that the $V_\alpha$ are also local trivialization neighborhoods of the bundle E with transition functions $g_{\alpha\beta} : V_\alpha \cap V_\beta \to GL(n)$. Then we build the supermanifold M by patching together the sets $(\epsilon^{m,0})^{-1}(\phi_\alpha(V_\alpha)) \times \mathbb{R}^{0,n}_S$ in a consistent way. This leads to a supermanifold with coordinate change functions

\[ \psi_\alpha \circ \psi_\beta^{-1}\big(x^1_\beta, \ldots, x^m_\beta; \xi^1_\beta, \ldots, \xi^n_\beta\big) = \big(x^1_\alpha, \ldots, x^m_\alpha; \xi^1_\alpha, \ldots, \xi^n_\alpha\big) \]

where

\[ \big(x^1_\alpha, \ldots, x^m_\alpha\big) = \phi_\alpha \circ \phi_\beta^{-1}\big(x^1_\beta, \ldots, x^m_\beta\big), \qquad \xi^j_\alpha = \sum_{k=1}^{n} {g_{\alpha\beta}}^{j}{}_{k}\big(x^1_\beta, \ldots, x^m_\beta\big)\,\xi^k_\beta \qquad [11] \]

(Here again we refer to the appendix for the way in which functions of even Grassmann variables, as opposed simply to real numbers, are handled.) Particular examples of this construction are the tangent bundle over N and bundles of spinors over N. It was actually shown by Batchelor that all real, supersmooth supermanifolds are of this form.

A similar definition may be made of a complex supermanifold using a complex Grassmann algebra, with the coordinate transition functions required to be superanalytic. In this case, supermanifolds which
be superanalytic. In this case, supermanifolds which
are not related to vector bundles in the manner described above are possible, basically because partitions of unity do not exist in the analytic setting. An example is the twisted supertorus, which is built over the standard torus and has transition functions $(z, \theta) \to (z + 1, \theta)$ and $(z, \theta) \to (z + a + \theta\delta, \theta + \delta)$, extending the standard torus with transition functions $z \to z + 1$, $z \to z + a$. (Here a, $\delta$ are, respectively, even and odd constants.) This supermanifold is an example of a super Riemann surface; such surfaces play an important role in the quantization of the spinning string.

As with classical manifolds, a natural class of functions can be defined on a supermanifold: a function f on an open subset U of the supermanifold M is said to be supersmooth if, for each $\alpha$ such that $U \cap U_\alpha$ is nonempty, the function $f \circ \psi_\alpha^{-1}$ is supersmooth on $\psi_\alpha(U \cap U_\alpha)$. In local coordinates supersmooth functions are such that $f(x^1, \ldots, x^m, \xi^1, \ldots, \xi^n) = \sum_\mu f_\mu(x^1, \ldots, x^m)\,\xi^\mu$ with each $f_\mu$ a smooth function.

The Algebraic Approach to Supermanifolds

In the algebraic approach to supermanifolds, it is the algebra of functions, rather than the manifold itself, which is extended to include anticommuting elements. In this approach an (m, n)-dimensional supermanifold is defined to be a pair $(N, \mathcal{A})$, where N is an m-dimensional classical manifold and $\mathcal{A}$ is a sheaf of superalgebras over N with various properties, described below. The statement that $\mathcal{A}$ is a sheaf of algebras over N means that corresponding to each open subset U of N there is an algebra $\mathcal{A}(U)$; also, if $V \subset U$, there is a "restriction map" $\rho_{U,V}$ mapping $\mathcal{A}(U)$ into $\mathcal{A}(V)$, and the various restriction maps obey certain consistency conditions. A particular example of such a sheaf (with trivial odd part) is the sheaf $\mathcal{A}_\emptyset$ of real-valued functions on N, with $\mathcal{A}_\emptyset(U) = C^\infty(U)$, the set of real-valued smooth functions on U, and $\rho_{U,V}$ mapping a function in $C^\infty(U)$ to its restriction in $C^\infty(V)$. The defining property of the sheaf corresponding to an (m, n)-dimensional supermanifold is that there is a cover $\{U_\alpha\}$ of N for which the algebras $\mathcal{A}(U_\alpha)$ have the form $\mathcal{A}(U_\alpha) \cong C^\infty(U_\alpha) \otimes \Lambda(\mathbb{R}^n)$, so that a typical element f of $\mathcal{A}(U_\alpha)$ may be expressed as $f = \sum_\mu f_\mu\,\xi^\mu$, where $f_\mu \in C^\infty(U_\alpha)$ and $\xi^1, \ldots, \xi^n$ are generators of $\Lambda(\mathbb{R}^n)$. The notation here is chosen to emphasize the close correspondence with the algebra of smooth functions described at the end of the previous section. This makes it clear that, despite an apparent difference, the two approaches lead to essentially equivalent supermanifolds.

The advantage of the algebraic approach is its mathematical elegance and economy (there is no need to introduce the auxiliary Grassmann algebra $\mathbb{R}_S$ in which coordinate functions take values), but from the point of view of physicists, the geometric point of view has two advantages: first, it is closer to the standard manifold picture and thus easier to grasp, and, second, it allows a wider class of supermanifolds, because Grassmann constants are allowed; for instance, the twisted supertorus described above cannot be included in the algebraic approach without either introducing an auxiliary algebra or moving to the more difficult concept of a family of supermanifolds.

While there have been various attempts to develop infinite-dimensional supermanifolds, most of the constructions have been developed for very specific purposes, such as path integration and functional integration methods for theories with fermions. Even the question of defining a basic infinite-dimensional superalgebra with the necessary analytic properties, such as a Hilbert–Banach superalgebra, requires sophisticated procedures, so that the development of a theory of infinite-dimensional supermanifolds becomes extremely technical.

Calculus on Supermanifolds

Much of the calculus of functions on supermanifolds proceeds in simple analogy to that of classical manifolds, with additional sign factors occurring whenever two odd quantities are transposed. For instance, a vector field on M may be described as a superderivation of the algebra of supersmooth functions on M, that is, a linear mapping of this space obeying the super Leibniz rule $X(fg) = (Xf)\,g + (-1)^{|X||f|}\,f\,(Xg)$. Standard examples of vector fields (defined locally) are coordinate derivatives $\partial/\partial x^i$ and $\partial/\partial\xi^j$, defined by $(\partial/\partial x^i)f = \partial_i(f \circ \psi^{-1})$ and $(\partial/\partial\xi^j)f = \partial_{j+m}(f \circ \psi^{-1})$ with $\psi$ the coordinate function corresponding to the coordinates $(x^1, \ldots, x^m; \xi^1, \ldots, \xi^n)$. Equipped with this concept of vector field, much of differential calculus on manifolds can be directly generalized to supermanifolds in a relatively straightforward way. However, in the case of integration the situation is quite different. The standard approach to integration of anticommuting variables is the Berezin integral, which is a formal, algebraic integral that is not an antiderivative and has no measure-theoretic features. There are various reasons why such an integral is used: for instance, even the simple function $\xi$ of a single anticommuting variable has no antiderivative, while the topology on $\mathbb{R}^{m,n}_S$ does not allow open sets which discriminate in
odd directions. Additionally, when changing variables on $\mathbb{R}_S^{m,n}$ it is the superdeterminant of the Jacobian matrix which must be used. In the purely odd sector, differentials thus transform the "wrong" way.

The Berezin integral of a function f of n anticommuting variables is defined by

$$\int d^n\xi \Big( \sum_{\lambda} f_{\lambda}\, \xi^{\lambda} \Big) = f_{1 \ldots n} \qquad [12]$$

In other words, Berezin integration simply picks out the coefficient of the highest-order term, thus resembling differentiation more than integration in the classical sense. Nonetheless, the Berezin integral has very useful properties, in particular allowing direct analogues of Fourier transformations and integral kernels. Given that it is the algebra of functions, and the operators acting on these algebras, which is the key element in supergeometry, these are vital properties of the integral.

The transformation rule under change of variable is the inverse of that which one expects. For instance, in the case of a single variable, if one makes the transformation $\xi \to \eta = a\xi + \beta$ with $a$ and $\beta$ constants, a direct calculation shows that the integral is invariant provided that one sets $d\xi = a\, d\eta$.

Integration on $\mathbb{R}_S^{m,n}$ is essentially defined by combining classical integration for the even variables with Berezin integration for odd variables, giving

$$\int_{\epsilon^{-1}(V)} d^m x\, d^n\xi \Big( \sum_{\lambda} f_{\lambda}(x^1, \ldots, x^m)\, \xi^{\lambda} \Big) = \int_V d^m x\, f_{1 \ldots n}(x^1, \ldots, x^m) \qquad [13]$$

This also defines integration on supermanifolds, provided that we can find a rule for the change of variable. This, as indicated above, may be done by using the superdeterminant of the Jacobian matrix. Suppose that $(y, \eta)$ are a new set of coordinates on our supermanifold. Then an invariant definition of integral is obtained if we set

$$d^m y\, d^n\eta = \mathrm{Sdet} \begin{pmatrix} \partial y / \partial x & \partial y / \partial \xi \\ \partial \eta / \partial x & \partial \eta / \partial \xi \end{pmatrix} d^m x\, d^n\xi \qquad [14]$$

Appendix

We now describe the device which allows functions of even Grassmann variables to be handled simply as functions of conventional variables. The necessary class of functions is captured by defining supersmooth functions on $\mathbb{R}_S^{m,0}$ as extensions by Taylor expansion from smooth functions on $\mathbb{R}^m$.

Definition 4. The function $F : \mathbb{R}_S^{m,0} \to \mathbb{R}_S$ is said to be supersmooth if there exists a smooth function $\tilde F : \mathbb{R}^m \to \mathbb{R}$, such that

$$F(x^1, \ldots, x^m) = \tilde F(\epsilon(x)) + \sum_{i=1}^{m} \big(x^i - \epsilon(x^i)1\big) \frac{\partial \tilde F}{\partial x^i}(\epsilon(x)) + \frac{1}{2} \sum_{i,j=1}^{m} \big(x^i - \epsilon(x^i)1\big)\big(x^j - \epsilon(x^j)1\big) \frac{\partial^2 \tilde F}{\partial x^i \partial x^j}(\epsilon(x)) + \cdots \qquad [15]$$

(Although this Taylor series will in general be infinite, it gives well-defined coefficients for each basis element in the expansion [3], so that the value of F is a well-defined element of $\mathbb{R}_S$.) A number of different classes of function can be obtained, by varying the space in which the function $\tilde F$ takes its value.

See also: Batalin–Vilkovisky Quantization; BRST Quantization; Graded Poisson Algebras; Path-Integrals in Non Commutative Geometry; Random Matrix Theory in Physics; Supergravity; Superstring Theories; Supersymmetric Particle Models; Supersymmetric Quantum Mechanics.

Further Reading

Batchelor M (1979) The structure of supermanifolds. Transactions of the American Mathematical Society 253: 329–338.
Batchelor M (1980) Two approaches to supermanifolds. Transactions of the American Mathematical Society 258: 257–270.
Berezin FA (1987) Introduction to Superanalysis. Dordrecht: Reidel.
Berezin FA and Leǐtes DA (1976) Supermanifolds. Soviet Maths Doklady 16: 1218–1222.
Crane L and Rabin JM (1988) Super Riemann surfaces: uniformization and Teichmüller theory. Communications in Mathematical Physics 113: 601–623.
DeWitt BS (1992) Supermanifolds. Cambridge: Cambridge University Press.
Howe PS (1979) Super Weyl transformations in two dimensions. Journal of Physics A 12: 393–402.
Jadczyk A and Pilch K (1981) Superspace and supersymmetries. Communications in Mathematical Physics 78: 373–390.
Kostant B (1977) Graded manifolds, graded Lie theory and prequantization. In: Differential Geometric Methods in Mathematical Physics. Lecture Notes in Mathematics, Springer.
Kupsch J and Smolyanov O (2000) Hilbert norms for graded algebras. Proceedings of the American Mathematical Society 128: 1647–1653.
Polchinski J (1998) String Theory, vol. II. Cambridge: Cambridge University Press.
Rogers A (2003) Supersymmetry and Brownian motion on supermanifolds. Infinite Dimensional Analysis, Quantum Probability and Related Topics 6(suppl. 1): 83–102.
Salam A and Strathdee J (1974) Super-gauge transformations. Nuclear Physics B 76: 477–482.
Voronov AA (1992) Geometric integration theory on supermanifolds. Soviet Scientific Reviews C: Mathematical Physics Reviews 9: 1–138.
Wess J and Bagger J (1983) Supersymmetry and Supergravity. Princeton: Princeton University Press.
West P (1990) Introduction to Supersymmetry and Supergravity. Singapore: World Scientific.
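
As an illustration of the Berezin integral of eqn [12], the following minimal Python sketch stores an element of the Grassmann algebra as a dictionary from multi-indices to ordinary coefficients, and implements the integral as the extraction of the top coefficient. The dictionary representation and the names `multiply` and `berezin_integral` are assumptions made for this example; they are not notation from the article.

```python
def multiply(f, g):
    """Multiply two Grassmann-algebra elements.

    Each element is a dict mapping a strictly increasing tuple of generator
    indices (the multi-index lambda) to an ordinary numerical coefficient.
    """
    out = {}
    for a, fa in f.items():
        for b, gb in g.items():
            if set(a) & set(b):
                continue  # a repeated odd generator squares to zero
            merged = a + b
            # Sorting the generators costs one sign per inversion.
            inversions = sum(
                1
                for i in range(len(merged))
                for j in range(i + 1, len(merged))
                if merged[i] > merged[j]
            )
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inversions * fa * gb
    return {k: v for k, v in out.items() if v != 0}


def berezin_integral(f, n):
    """Berezin integral over n odd variables: the coefficient f_{1...n}."""
    return f.get(tuple(range(1, n + 1)), 0)


xi1, xi2 = {(1,): 1}, {(2,): 1}

# f = 3 + 2 xi^1 - 5 xi^1 xi^2: only the top coefficient survives.
f = {(): 3, (1,): 2, (1, 2): -5}
top = berezin_integral(f, 2)

# Odd generators anticommute, so xi^2 xi^1 = - xi^1 xi^2.
swapped = berezin_integral(multiply(xi2, xi1), 2)

# Single variable with eta = a xi: f(eta) = f0 + f1 eta has xi-coefficient
# a * f1, so the two integrals agree only if the measure is rescaled by 1/a.
a, f0, f1 = 2, 7, 5
in_xi = berezin_integral({(): f0, (1,): a * f1}, 1)
in_eta = berezin_integral({(): f0, (1,): f1}, 1)
```

The last three lines mirror the single-variable change of variable discussed in the text: the integral taken in the old variable differs from the one in the new variable by the factor a, which is the "inverse" transformation rule of the measure.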

Superstring Theories

C Bachas and J Troost, Ecole Normale Supérieure, Paris, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

String theory postulates that all elementary particles in nature correspond to different vibration states of an underlying relativistic string. In the quantum theory both the frequencies and the amplitudes of vibration are quantized, so that the quantum states of a string are discrete. They can be characterized by their mass, spin, and various gauge charges. One of these states has zero mass and spin equal to $2\hbar$, and can be identified with the messenger of gravitational interactions, the graviton. Thus, string theory is a candidate for a unified theory of all fundamental interactions, including quantum gravity.

In this article, we discuss the theory of superstrings as consistent theories of quantum gravity. The aim is to provide a quick (mostly lexicographic and bibliographic) entry to some of the salient features of the subject for a nonspecialist audience. Our treatment is thus neither complete nor comprehensive; for a full account there exist several excellent expert books, in particular by Green, et al. (1987) and by Polchinski (1998). An introductory textbook by Zwiebach (2004) is also highly recommended for beginners. Several other complementary reviews on various aspects of superstring theories are available on the internet (see the "Further reading" section); some more will be given as we proceed.

The Five Superstring Theories

Theories of relativistic extended objects are tightly constrained by anomalies, that is, quantum violations of classical symmetries. These arise because the classical trajectory of an extended p-dimensional object (or "p-brane") is described by the embedding $X^{\mu}(\zeta^a)$, where $\zeta^{a = 0, \ldots, p}$ parametrize the brane world volume, and $X^{\mu = 0, \ldots, D-1}$ are coordinates of the target space. The quantum mechanics of a single p-brane is therefore a (p + 1)-dimensional quantum field theory, and as such suffers a priori from ultraviolet divergences and anomalies. The case p = 1 is special in that these problems can be exactly handled. The story for higher values of p is much more complicated, as will become apparent later on.

The theory of ordinary loops in space is called closed bosonic string theory. The classical trajectory of a bosonic string extremizes the Nambu–Goto action (proportional to the invariant area of the world sheet)

$$S_{NG} = -\frac{1}{2\pi\alpha'} \int d^2\zeta\, \sqrt{-\det\big(G_{\mu\nu}\, \partial_a X^{\mu} \partial_b X^{\nu}\big)} \qquad [1]$$

where $G_{\mu\nu}(X)$ is the target-space metric, and $\alpha'$ is the Regge slope (which is inversely proportional to the string tension and has dimensions of length squared). In flat spacetime, and for a conformal choice of world-sheet parameters $\sigma^{\pm} = \zeta^0 \pm \zeta^1$, the equations of motion read:

$$\partial_+ \partial_- X^{\mu} = 0 \quad \text{and} \quad \eta_{\mu\nu}\, \partial_{\pm} X^{\mu}\, \partial_{\pm} X^{\nu} = 0 \qquad [2]$$

with $\eta_{\mu\nu}$ the Minkowski metric. The $X^{\mu}$ are thus free two-dimensional fields, subject to quadratic phase-space constraints known as the Virasoro conditions. These can be solved consistently at the quantum level in the critical dimension D = 26. Otherwise, the symmetries of eqns [2] are anomalous: either Lorentz invariance is broken, or there is a conformal anomaly leading to unitarity problems. (For D < 26, unitary noncritical string theories in highly curved rather than in the originally flat background can be constructed.)

Even for D = 26, bosonic string theory is, however, sick because its lowest-lying state is a tachyon,

Figure 1 A four-particle and a four-string interaction.
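
The equations of motion [2] can be checked numerically on an explicit configuration. The sketch below uses a pulsating circular closed string in three flat target-space dimensions, $X = (\zeta^0, \cos\zeta^0 \cos\zeta^1, \cos\zeta^0 \sin\zeta^1)$. This particular solution, the signature (−, +, +), the finite-difference step, and all function names are assumptions chosen for illustration and do not come from the article.

```python
import math

ETA = (-1.0, 1.0, 1.0)  # assumed Minkowski signature (-, +, +)
H = 1e-3                # finite-difference step


def X(z0, z1):
    """Illustrative closed-string embedding X^mu(zeta^0, zeta^1)."""
    return (z0, math.cos(z0) * math.cos(z1), math.cos(z0) * math.sin(z1))


def d0(z0, z1):
    """Central-difference derivative with respect to zeta^0."""
    return tuple((p - m) / (2 * H) for p, m in zip(X(z0 + H, z1), X(z0 - H, z1)))


def d1(z0, z1):
    """Central-difference derivative with respect to zeta^1."""
    return tuple((p - m) / (2 * H) for p, m in zip(X(z0, z1 + H), X(z0, z1 - H)))


def wave_residual(z0, z1):
    """Largest component of (d_0^2 - d_1^2) X, which equals 4 d_+ d_- X."""
    c = X(z0, z1)
    dd0 = [(p - 2 * q + m) / H**2 for p, q, m in zip(X(z0 + H, z1), c, X(z0 - H, z1))]
    dd1 = [(p - 2 * q + m) / H**2 for p, q, m in zip(X(z0, z1 + H), c, X(z0, z1 - H))]
    return max(abs(u - v) for u, v in zip(dd0, dd1))


def virasoro_residual(z0, z1):
    """Largest of |eta (d_0 X +/- d_1 X)^2|, i.e. 4 |eta d_pm X d_pm X|."""
    a, b = d0(z0, z1), d1(z0, z1)
    worst = 0.0
    for sign in (1.0, -1.0):
        v = [u + sign * w for u, w in zip(a, b)]
        worst = max(worst, abs(sum(e * c * c for e, c in zip(ETA, v))))
    return worst


# Sample the world sheet and record the largest violation of each equation.
grid = [(0.3 * i, 0.5 * j) for i in range(10) for j in range(13)]
max_wave = max(wave_residual(z0, z1) for z0, z1 in grid)
max_virasoro = max(virasoro_residual(z0, z1) for z0, z1 in grid)
```

Both residuals should vanish up to discretization and rounding error: the first tests the free wave equation, the second the two quadratic (Virasoro) constraints.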
The finiteness of string perturbation theory has been, strictly speaking, only established up to two loops (for a recent review see D'Hoker and Phong (2002)). However, even though the technical problem is open and hard, the qualitative case for all-order finiteness is convincing. It can be illustrated with the torus diagram which makes a one-loop contribution to string amplitudes. The thin torus of Figure 2 could be traced either by a short, light string propagating (virtually) for a long time, or by a long, heavy string propagating for a short period of time. In conventional field theory, these two virtual trajectories would have made distinct contributions to the amplitude, one in the infrared and the second in the ultraviolet region. In string theory, on the other hand, they are related by a modular transformation (that exchanges $\zeta^0$ with $\zeta^1$) and must not, therefore, be counted twice. A similar kind of argument shows that all potential divergences of string theory are infrared: they are therefore kinematical (i.e., occur for special values of the external momenta), or else they signal an instability of the vacuum and should cancel if one expands around a stable ground state.

Figure 2 The same torus diagram viewed in two different channels. (Axes: Time, Space.)

The low-energy limit of the heterotic and type I string theories is N = 1 supergravity plus super Yang–Mills. In addition to the N = 1 graviton multiplet, the massless spectrum now also includes gauge bosons and their associated gauginos. The two-derivative effective action in the heterotic case reads:

$$\begin{aligned} S_{het} = \frac{1}{2\kappa^2} \int d^{10}x\, \sqrt{-G}\, e^{-2\Phi} \Big[ & R + 4\, \partial_{\mu}\Phi\, \partial^{\mu}\Phi + \frac{\kappa^2}{g_{YM}^2}\, \mathrm{tr}(F_{\mu\nu} F^{\mu\nu}) \\ & - \frac{1}{2} \Big| dB_2 - \frac{\kappa^2}{g_{YM}^2}\, \omega_3^{gauge} \Big|^2 \Big] + \text{fermions} \end{aligned} \qquad [7]$$

where $\omega_3^{gauge} = \mathrm{tr}(A\, dA + (2/3) A^3)$ is the Chern–Simons gauge 3-form. Again, supersymmetry fixes completely the above action: the only freedom is in the choice of the gauge group and of the Yang–Mills coupling $g_{YM}$. Thus, up to redefinitions of the fields, the type I theory has necessarily the same low-energy limit.

The D = 10 supergravity plus super Yang–Mills has a hexagon diagram that gives rise to gauge and gravitational anomalies, similar to the triangle anomaly in D = 4. It turns out that for the two special groups $E_8 \times E_8$ and SO(32), the structure of these anomalies is such that they can be canceled by a combination of local counter-terms. One of them is of the form $\int B_2 \wedge X_8(F, R)$, where $X_8$ is an 8-form quartic in the curvature and/or Yang–Mills field strength. The other is already present in the lower line of expression [7], with the replacement $\omega_3^{gauge} \to \omega_3^{gauge} - \omega_3^{Lorentz}$, where the second Chern–Simons form is built out of the spin connection. Note that these modifications of the effective action involve terms with more than two derivatives, and are not required by supersymmetry at the classical level. The discovery by Green and Schwarz that string theory produces precisely these terms (from integrating out the massive string modes) was called the "first superstring revolution."

D-Branes

Figure 3 D-branes and open strings.

A large window into the nonperturbative structure of string theory has been opened by the discovery of D(irichlet)-branes, and of strong/weak-coupling duality symmetries. A Dp brane is a solitonic p-dimensional excitation, defined indirectly by the property that open string endpoints can attach to its world volume (see Figure 3). Stable Dp branes exist in the type IIA and type IIB theories for p even, respectively, odd, and in the type I theory for p = 1 and 5. They are charged under the R–R (p + 1)-form potential or, for p > 4, under its magnetic dual. Strictly speaking, only for $0 \le p \le 6$ do D-branes resemble regular solitons (the word stands for "solitary waves"). The D7 branes are more like
cosmic strings, the D8 branes are domain walls, while the D9 branes are spacetime filling. Indeed, type I string theory can be thought of as arising from type IIB through the introduction of an orientifold 9-plane (required for tadpole cancelation) and of 32 D9 branes.

The low-energy dynamics of a Dp brane is described by a supersymmetric abelian gauge theory, reduced from ten down to p + 1 dimensions. The gauge field multiplet includes 9 − p real scalars, plus gauginos in the spinor representation of the R-symmetry group SO(9 − p). These are precisely the massless states of an open string with endpoints moving freely on a hyperplane. The real scalar fields are Goldstone modes of the broken translation invariance, that is, they are the transverse coordinate fields $\vec{Y}(\zeta^a)$ of the D-brane. The bosonic part of the low-energy effective action is the sum of a Dirac–Born–Infeld (DBI) and a Chern–Simons (CS) like term:

$$I_p = -T_p \int d^{p+1}\zeta\, e^{-\Phi} \sqrt{-\det\big(\hat{G}_{ab} + \mathcal{F}_{ab}\big)} - \rho_p \int \sum_n \hat{C}_n \wedge e^{\mathcal{F}} \qquad [8]$$

where $\mathcal{F}_{ab} = \hat{B}_{ab} + 2\pi\alpha' F_{ab}$, hats denote pullbacks on the brane of bulk tensor fields (e.g., $\hat{G}_{ab} = G_{\mu\nu}\, \partial_a Y^{\mu} \partial_b Y^{\nu}$), $F_{ab}$ is the field strength of the world-volume gauge field, and in the CS term one is instructed to keep the (p + 1)-form of the expression under the integration sign. The constants $T_p$ and $\rho_p$ are the tension and charge density of the D-brane. As was the case for the effective supergravities, the above action receives curvature corrections that are higher order in the $\alpha'$ expansion. Note however that a class of higher-order terms have been already resummed in expression [8]. These involve arbitrary powers of $F_{ab}$, and are closely related (more precisely T-dual, see later) to relativistic effects which can be important even in the weak-acceleration limit. When referring to the D9 branes of the type I superstring, the action [8] includes the GS terms required to cancel the gauge anomaly.

The tension and charge density of a Dp brane can be extracted from its coupling to the (closed-string) graviton and R–R (p + 1)-form, with the result:

$$T_p^2 = \rho_p^2 = \frac{\pi}{\kappa^2}\, (4\pi^2\alpha')^{3-p} \qquad [9]$$

The equality of tension and charge follows from unbroken supersymmetry, and is also known as a Bogomol'nyi–Prasad–Sommerfeld (BPS) condition. It implies that two or more identical D-branes exert no net static force on each other, because their R–R repulsion cancels exactly their gravitational attraction. A nontrivial check of the result [9] comes from the Dirac quantization condition (generalized to extended objects by Nepomechie and Teitelboim). Indeed, a Dp brane and a D(6 − p)-brane are dual excitations, like electric and magnetic charges in four dimensions, so their couplings must obey

$$2\kappa^2 \rho_p\, \rho_{6-p} = 2\pi k \quad \text{where } k \in \mathbb{Z} \qquad [10]$$

This ensures that the Dirac singularity of the long-range R–R fields of the branes does not lead to an observable Bohm–Aharonov phase. The couplings [9] obey this condition with k = 1, so that D-branes carry the smallest allowed R–R charges in the theory.

A simple but important observation is that open strings living on a collection of n identical D-branes have matrix-valued wave functions $\psi_{ij}$, where i, j = 1, . . . , n label the possible endpoints of the string. The low-energy dynamics of the branes is thus described by a nonabelian gauge theory, with group U(n) if the open strings are oriented, and SO(n) or Sp(n) if they are not. We have already encountered such Chan–Paton factors in our discussion of the type I superstring. More generally, this simple property of D-branes has led to many insights on the geometric interpretation and engineering of gauge theories, which are reviewed in the articles Brane Construction of Gauge Theories and Gauge Theories from Strings. It has also placed on a firmer footing the idea of a brane world, according to which the fields and interactions of the standard model would be confined to a set of D-branes, while gravitons are free to propagate in the bulk (for reviews, see Brane Worlds and reference Lust (2004)). It has, finally, inspired the gauge/string theory or AdS/CFT correspondence (see AdS/CFT Correspondence and Aharony et al. (2000)) on which we will comment later.

Dualities and M Theory

One other key role of D-branes has been to provide evidence for the various nonperturbative duality conjectures. Dual descriptions of the same physics arise also in conventional field theory. A prime example is the Montonen–Olive duality of four-dimensional, N = 4 supersymmetric Yang–Mills, which is the low-energy theory describing the dynamics of a collection of D3 branes. The action
for the gauge field and six associated scalars $\Phi^I$ (all in the adjoint representations of the gauge group G) is

$$S_{N=4} = -\frac{1}{4g^2} \int d^4x\, \mathrm{tr} \Big( F_{\mu\nu} F^{\mu\nu} + 2 \sum_I D_{\mu}\Phi^I D^{\mu}\Phi^I + 2 \sum_{I<J} [\Phi^I, \Phi^J]^2 \Big) - \frac{\theta}{32\pi^2} \int d^4x\, \mathrm{tr}\, F_{\mu\nu}\, {}^{*}F^{\mu\nu} + \text{fermionic terms} \qquad [11]$$

Consider for simplicity the case G = SU(2). The scalar potential has flat directions along which the six $\Phi^I$ commute. By an SO(6) R-symmetry rotation, we can set all but one of them to zero, and let $\langle \mathrm{tr}(\Phi^1 \Phi^1) \rangle = v^2$ in the vacuum. In this "Coulomb phase" of the theory, a U(1) gauge multiplet stays massless, while the charged states become massive by the Higgs effect. The theory admits furthermore smooth magnetic-monopole and dyon solutions, and there is an elegant formula for their mass:

$$M = v\, |n_{el} + \tau\, n_{mg}|, \quad \text{where } \tau = \frac{\theta}{2\pi} + \frac{4\pi i}{g^2} \qquad [12]$$

and $n_{el}$ ($n_{mg}$) denotes the quantized electric (magnetic) charge. This is a BPS formula that receives no quantum corrections. It exhibits the SL(2, Z) covariance of the theory,

$$\tau \to \frac{a\tau + b}{c\tau + d} \quad \text{and} \quad (n_{el}, n_{mg}) \to (n_{el}, n_{mg}) \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} \qquad [13]$$

Here a, b, c, d are integers subject to the condition ad − bc = 1. Of special importance is the transformation $\tau \to -1/\tau$, which exchanges electric and magnetic charges and (at least for $\theta = 0$) the strong- with the weak-coupling regimes. For more details see the review by Harvey (1996).

The extension of these ideas to string theory can be illustrated with the strong/weak-coupling duality between the type I theory, and the $\mathrm{Spin}(32)/Z_2$ heterotic string. Both have the same massless spectrum and low-energy action, whose form is dictated entirely by supersymmetry. The only difference lies in the relations between the string and supergravity parameters. Eliminating the latter, one finds

$$\lambda_{het} = \frac{1}{2\lambda_I} \quad \text{and} \quad \alpha'_{het} = \sqrt{2}\, \lambda_I\, \alpha'_I \qquad [14]$$

It is thus tempting to conjecture that the strongly coupled type I theory has a dual description as a weakly coupled heterotic string. These are, indeed, the only known ultraviolet completions of the theory [7]. Furthermore, for $\lambda_I \gg 1$, the D1 brane of the type I theory becomes light, and could be plausibly identified with the heterotic string. This conjecture has been tested successfully by comparing various supersymmetry-protected quantities (such as the tensions of BPS excitations and special higher-derivative terms in the effective action), which can be calculated exactly either semiclassically, or at a given order in the perturbative expansion. Testing the duality for nonprotected quantities is a hard and important problem, which looks currently out of reach.

The other three string theories have also well-motivated dual descriptions at strong coupling $\lambda$. The type IIB theory is believed to have an SL(2, Z) symmetry, similar to that of the N = 4 super Yang–Mills. (Note that $\lambda$ is a dynamical parameter, that changes with the vacuum expectation value of the dilaton $\langle \Phi \rangle$. Thus, dualities are discrete gauge symmetries of string theory.) The type IIA theory has a more surprising strong-coupling limit: it grows one extra dimension (of radius $R_{11} = \lambda \sqrt{\alpha'}$), and can be approximated at low energy by the maximal 11-dimensional supergravity of Cremmer, Julia, and Scherk. The latter is a very economical theory: its massless bosonic fields are only the graviton and a 3-form potential $A_3$. The bosonic part of the action reads

$$S_{11D} = \frac{1}{2\kappa_{11}^2} \int d^{11}x\, \sqrt{-G}\, \Big( R - \frac{1}{2} |F_4|^2 \Big) - \frac{1}{12\kappa_{11}^2} \int A_3 \wedge F_4 \wedge F_4 \qquad [15]$$

The electric and magnetic charges of the 3-form are a (fundamental?) membrane and a solitonic 5-brane. Standard Kaluza–Klein reduction on a circle maps $S_{11D}$ to the IIA supergravity action [6], where $G_{\mu\nu}$, $\Phi$, and $C_1$ descend from the 11-dimensional graviton, and $B_2$ and $C_3$ from the 3-form $A_3$. Furthermore, all BPS excitations of the type IIA string theory have a counterpart in 11 dimensions, as summarized in Table 1. Finally, if one compactifies the eleventh dimension on an interval (rather than a circle), one finds the conjectured strong-coupling limit of the $E_8 \times E_8$ heterotic string.

The web of duality relations can be extended by compactifying further to $D \le 9$ dimensions. Readers interested in more details should consult Polchinski (1998) or one of the many existing reviews of the subject (Townsend (1996), see also "Further Reading" section). In nine dimensions, in particular, the two type II theories, as well as the two heterotic superstrings, are pairwise T-dual. T-duality is a perturbative symmetry (thus firmly established, not
only conjectured) which exchanges momentum and winding modes. Putting together all the links one arrives at the fully connected web of Figure 4. This makes the point that all five consistent superstrings, and also 11-dimensional supergravity, are limits of a unique underlying structure called M theory. (For lack of a better definition, "M" is sometimes also used to denote the D = 11 supergravity plus supermembranes, as in Figure 4.) A background-independent definition of M theory has remained elusive. Attempts to define it as a matrix model of D0 branes, or by quantizing a fundamental membrane, proved interesting but incomplete. A difficulty stems from the fact that in a generic background, or in D = 11 Minkowski spacetime, there is only a dimensionful parameter fixing the scale at which the theory becomes strongly coupled.

Table 1 BPS excitations of type IIA string theory, and their counterparts in M theory compactified on a circle of radius $R_{11}$

Tension                                                   Type IIA      M on $S^1$           Tension
$(\sqrt{\pi}/\kappa_{10})\,(2\pi\sqrt{\alpha'})^{3}$      D0 brane      K–K excitation       $1/R_{11}$
$T_F = (2\pi\alpha')^{-1}$                                String        Wrapped membrane     $2\pi R_{11}\,(2\pi^2/\kappa_{11}^2)^{1/3}$
$(\sqrt{\pi}/\kappa_{10})\,(2\pi\sqrt{\alpha'})$          D2 brane      Membrane             $T_2^M = (2\pi^2/\kappa_{11}^2)^{1/3}$
$(\sqrt{\pi}/\kappa_{10})\,(2\pi\sqrt{\alpha'})^{-1}$     D4 brane      Wrapped 5-brane      $R_{11}\,(2\pi^2/\kappa_{11}^2)^{2/3}$
$(\pi/\kappa_{10}^2)\,(2\pi\alpha')$                      NS-5-brane    5-brane              $(1/2\pi)\,(2\pi^2/\kappa_{11}^2)^{2/3}$
$(\sqrt{\pi}/\kappa_{10})\,(2\pi\sqrt{\alpha'})^{-3}$     D6 brane      K–K monopole         $2\pi^2 R_{11}^2/\kappa_{11}^2$

From Bachas CP (1997) Lectures on D-branes. In: Olive DI and West PC (eds.) Duality and Supersymmetric Theories, Proceedings, Easter School, Newton Institute, Euroconference, Cambridge, UK, April 7–18. With permission of Cambridge University Press.

Figure 4 Web of dualities in nine dimensions. (Diagram labels: IIB, IIA, HE, HO, I and M; links marked T, $S^1$, $S^1/Z_2$ and strong/weak.) From Bachas CP (1997) Lectures on D-branes. In: Olive DI and West PC (eds.) Duality and Supersymmetric Theories, Proceedings, Easter School, Newton Institute, Euroconference, Cambridge, UK, April 7–18. With permission of Cambridge University Press.

Other Developments and Outlook

We have not discussed in this brief review some important developments covered in other contributions to the encyclopedia. For the reader's convenience, and for completeness, we enumerate (some of) them giving the appropriate cross-references:

Compactification. To make contact with the standard model of particle physics, one has to compactify string theory on a six-dimensional manifold. There is an embarrassment of riches, but no completely realistic vacuum and, more significantly, no guiding dynamical principle to help us decide (see Compactification of Superstring Theory). The controlled (and phenomenologically required) breaking of spacetime supersymmetry is also a problem.

Conformal field theory and quantum geometry. The algebraic tools of 2D conformal field theory, both bulk and boundary (see Two-Dimensional Conformal Field Theory and Vertex Operator Algebras), play an important role in string theory. They allow, in certain cases, a resummation of $\alpha'$ effects, thereby probing the regime where classical geometric notions do not apply.

Microscopic models of black holes. Charged extremal black holes can be modeled in string theory by BPS configurations of D-branes. This has led to the first microscopic derivation of the Bekenstein–Hawking entropy formula, a result expected from any consistent theory of quantum gravity. As with the tests of duality, the extension of these results to neutral black holes is a difficult open problem (see Branes and Black Hole Statistical Mechanics).

AdS/CFT and holography. A new type of (holographic) duality is the one that relates supersymmetric gauge theories in four dimensions to string theory in asymptotically anti-de Sitter spacetimes. The sharpest and best-tested version of this duality relates N = 4 super Yang–Mills to string theory in $AdS_5 \times S^5$. Solving the $\sigma$-model in this latter background is one of the keys to further progress in the subject (see AdS/CFT Correspondence).

String phenomenology. Finding an experimental confirmation of string theory is clearly one of the most pressing outstanding questions. There exist several interesting possibilities for this: cosmic strings, large extra dimensions, modifications of gravity, primordial cosmology (see String Theory: Phenomenology for a
review). Here we point out the one supporting piece of experimental evidence: the unification of the gauge couplings of the (supersymmetric, minimal) standard model at a scale close to, but below the Planck scale, as illustrated in Figure 5. This is a generic "prediction" of string theory, especially in its heterotic version.

Figure 5 The unification of couplings. (Curves: $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\kappa$; horizontal axis: log E (GeV), marked at 3 and 17.)

See also: AdS/CFT Correspondence; Boundary Conformal Field Theory; Brane Construction of Gauge Theories; Brane Worlds; Branes and Black Hole Statistical Mechanics; Compactification of Superstring Theory; Derived Categories; Electroweak Theory; Gauge Theories from Strings; Noncommutative Geometry from Strings; Supermanifolds; String Field Theory; String Theory: Phenomenology; Supergravity; Two-Dimensional Conformal Field Theory and Vertex Operator Algebras; Wheeler–DeWitt Theory.

Further Reading

Aharony O, Gubser SS, Maldacena JM, Ooguri H, and Oz Y (2000) Large N field theories, string theory and gravity. Physics Report 323: 183 (arXiv:hep-th/9905111).
Angelantonj C and Sagnotti A (2002) Open strings. Physics Report 371: 1.
Angelantonj C and Sagnotti A (2003) Open strings – erratum. Physics Report 376: 339 (arXiv:hep-th/0204089).
Bachas CP (1997) Lectures on D-branes. In: Olive DI and West PC (eds.) Duality and Supersymmetric Theories, Proceedings, Easter School, Newton Institute, Euroconference, Cambridge, UK, April 7–18, arXiv:hep-th/9806199.
Berkovits N (2002) ICTP lectures on covariant quantization of the superstring, arXiv:hep-th/0209059.
D'Hoker E and Phong DH (2002) Lectures on two-loop superstrings, arXiv:hep-th/0211111.
Green MB, Schwarz JH, and Witten E (1987) Superstring Theory. Vol. 1: Introduction; Vol. 2: Loop Amplitudes, Anomalies and Phenomenology. Cambridge: Cambridge University Press.
Harvey JA (1996) Magnetic monopoles, duality, and supersymmetry. arXiv:hep-th/9603086.
Kiritsis E (2004) D-branes in standard model building, gravity and cosmology. Fortschritte der Physik 52: 200 (arXiv:hep-th/0310001).
Lust D (2004) Intersecting brane worlds: A path to the standard model? Classical and Quantum Gravity 21: S1399 (arXiv:hep-th/0401156).
Obers NA and Pioline B (1999) U-duality and M-theory. Physics Report 318: 113 (arXiv:hep-th/9809039).
Polchinski J (1998) String Theory. Vol. 1: An Introduction to the Bosonic String; Vol. 2: Superstring Theory and Beyond. Cambridge: Cambridge University Press.
Salam A and Sezgin E (1989) Supergravities in Diverse Dimensions. Amsterdam: North-Holland/World Scientific.
Sen A (1997) An introduction to non-perturbative string theory, arXiv:hep-th/9802051.
The following URL of the HEP-SPIRES database: http://www.slac.stanford.edu collects many popular string theory reviews.
Townsend PK (1996) Four lectures on M-theory, arXiv:hep-th/9612121.
Zwiebach B (2004) A First Course in String Theory. Cambridge: Cambridge University Press.
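
The statement in the D-Branes section that the couplings [9] satisfy the Dirac quantization condition [10] with k = 1 can be verified with a few lines of arithmetic. The Python sketch below is illustrative only: the function names and the sample values of $\kappa$ and $\alpha'$ are assumptions, while the two formulas are eqns [9] and [10], namely $\rho_p^2 = (\pi/\kappa^2)(4\pi^2\alpha')^{3-p}$ and $2\kappa^2 \rho_p \rho_{6-p} = 2\pi k$.

```python
import math


def charge_density(p, kappa, alpha_prime):
    """rho_p from eqn [9]: rho_p^2 = (pi / kappa^2) (4 pi^2 alpha')^(3 - p)."""
    return math.sqrt(math.pi) / kappa * (4 * math.pi**2 * alpha_prime) ** ((3 - p) / 2)


def dirac_integer(p, kappa, alpha_prime):
    """Solve eqn [10], 2 kappa^2 rho_p rho_{6-p} = 2 pi k, for k."""
    rho_p = charge_density(p, kappa, alpha_prime)
    rho_dual = charge_density(6 - p, kappa, alpha_prime)
    return 2 * kappa**2 * rho_p * rho_dual / (2 * math.pi)


# Arbitrary (hypothetical) values: the result should not depend on them,
# because the powers of alpha' cancel between a brane and its dual.
kappa, alpha_prime = 0.37, 1.9
k_values = [dirac_integer(p, kappa, alpha_prime) for p in range(7)]
```

Each entry of `k_values` should equal 1 up to rounding error, for every $0 \le p \le 6$ and independently of the chosen $\kappa$ and $\alpha'$.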

Supersymmetric Particle Models

S Pokorski, Warsaw University, Warsaw, Poland
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Supersymmetric quantum field theories (see Supergravity) are characterized by the existence of one (N = 1 supersymmetry) or several (N > 1 extended supersymmetry) conserved Noether-like charges $Q_A$, A = 1, . . . , N, which establish symmetry links between particle states of different spin. Supersymmetry ensures equal numbers of bosonic and fermionic particle states. If it is exact, bosons and fermions related by supersymmetry transformations have equal masses. Moreover, supersymmetry imposes stringent relations between interactions which involve particles of different spin. This gives rise to a special ultraviolet behavior of supersymmetric theories. Their ultraviolet divergences are much softer than in nonsupersymmetric theories. In particular, N = 4 supersymmetric quantum field theories are finite and for any N they are free from quadratic divergences plaguing ordinary theories with elementary scalars. N > 4 supersymmetric theories necessarily involve particles of spin higher than 1 and are not renormalizable. Supersymmetry promoted to a local symmetry includes gravity.

Only N = 1 supersymmetric theories allow for chiral fermions which are the fundamental objects in elementary particle interactions (see Standard Model of Particle Physics). This is because parity and
charge conjugation symmetries are violated in weak interactions. Therefore, N > 1 theories may not be of immediate phenomenological relevance. However, they may be useful for constructing supersymmetric theories in more than four dimensions (more than three spatial dimensions). Chiral (effective) theory in four dimensions can be then obtained after compactification of extra dimensions. For instance, N = 2 theory in five dimensions $(x^\mu, y)$ compactified on a circle with reflection symmetry $y \to -y$ (orbifold compactification) gives chiral N = 1 theory in four dimensions.

Absence of quadratic divergences in supersymmetric theories is the main argument supporting the belief that fundamental interactions of elementary particles at energies not higher than O(1 TeV) should be described by an (approximately) N = 1 supersymmetric extension of the standard model (SM). Indeed, supersymmetric models elegantly solve the so-called hierarchy problem of the SM. At present, supersymmetry remains a theoretical hypothesis. No experimental evidence for it has been found yet (for experimental lower bounds on the masses of supersymmetric particles see Eidelman et al. (2004)). Supersymmetric models will be tested experimentally at the Large Hadron Collider at CERN (Geneva), after the completion of its construction in 2007. Supergravity theories may be physically relevant as an intermediate step in constructing phenomenologically viable models from superstring theories.

The essence of the hierarchy problem of the standard model (SM) – the successful $SU(3)_c \times SU(2)_L \times U(1)_Y$ gauge theory of interactions of quarks and leptons at energies up to about 100 GeV – is the following. By itself, the SM does not explain the value of the Fermi scale v of the electroweak $SU(2)_L \times U(1)_Y$ symmetry breaking ($v \approx G_F^{-1/2}$ where $G_F$ is the Fermi constant determined by the life time of the muon). Indeed, in the SM, the electroweak symmetry breaking is realized by an elementary Higgs field H (an SU(2) doublet) with a potential

$$V = m^2 H^\dagger H + \frac{\lambda}{2}\,(H^\dagger H)^2 \qquad [1]$$

where m and $\lambda$ are free parameters of the SM. When $m^2 < 0$ is chosen, the minimum of the potential occurs when

$$\langle H^\dagger H \rangle = -\frac{m^2}{\lambda} \equiv \frac{v^2}{2} \qquad [2]$$

that is, the Higgs doublet acquires $SU(2) \times U(1)_Y$ breaking vacuum expectation value v which is just the Fermi scale. The masses of the intermediate vector bosons $W^\pm$ and $Z^0$ are proportional to v and depend also on the gauge couplings. Within the SM understood as a theory with the momentum cut-off $\Lambda_{SM}$, quantum corrections to the mass parameter $m^2$ in eqn [1] are quadratically divergent:

$$\delta m^2 = \frac{3}{64\pi^2}\left(3g_2^2 + g_1^2 + \lambda - 8y_t^2\right)\Lambda_{SM}^2 + \cdots \qquad [3]$$

Here, $g_1$, $g_2$, and $y_t$ are the gauge couplings of the groups $U(1)_Y$, $SU(2)_L$, and the top-quark Yukawa coupling, respectively. This means that if, above the energy scale $\Lambda_{SM}$, the SM is replaced by some more fundamental theory, in which there are particles of masses $M \gtrsim \Lambda_{SM}$, the quantum corrections to $m^2$ are quadratically dependent on the new mass scale M. For $M \gg v$, this is very unnatural even if the original parameter $m^2$ remains a free parameter of this underlying theory and particularly difficult to accept if in the underlying theory $m^2$ is fixed by some more fundamental considerations. If the SM was the correct theory up to, for example, the mass scale suggested by the see-saw mechanism for the neutrino masses, $\Lambda_{SM} \sim 10^{15}$ GeV

$$|\delta m^2| \sim 10^{28}\ \mathrm{GeV}^2 \sim 10^{24}\, v^2\ !$$

Clearly, this excludes the possibility of understanding the magnitude of the Fermi scale v in any sensible way. Thus, for naturalness of the Higgs mechanism in the SM there should exist a new mass scale $M \gtrsim v$, say only one order of magnitude higher than v and the theory describing the physics above that scale should be free of quadratic divergences. (Approximate) supersymmetry is at present the most elegant and theoretically most complete solution to the hierarchy problem of the SM.

Supersymmetric Extensions of the SM

In supersymmetry, the gauge fields $A^a_\mu$ are promoted to vector superfields $\hat V^a = (A^a_\mu, \lambda^a, D^a)$, one for each gauge symmetry group generator, where $\lambda^a$'s are Weyl fermions (called gauginos) and $D^a$'s are nondynamical auxiliary fields. A renormalizable supersymmetric gauge theory is completely defined (see, e.g., Sohnius (1985) and Wess and Bagger (1992)) by specifying the gauge group, the set of chiral supermultiplets $\hat\Phi_i = (\phi_i, \psi_i, F_i)$ representing matter fields, and the superpotential – a holomorphic polynomial function of at most third order in the chiral superfields which determines Yukawa couplings of the fermions $\psi_i$ and scalars $\phi_i$. Auxiliary fields $D^a$ and $F_i$ can be eliminated via their (algebraic) equations of motion.

The so-called minimal supersymmetric SM (MSSM) encodes the main features of any supersymmetric extension of the SM. Its gauge group is

$SU(3) \times SU(2) \times U(1)$ – the same as in the SM – and the chiral superfields are associated to each of the SM quark and lepton fields. Thus, quarks and leptons get scalar spin zero superpartners, the squarks and sleptons, carrying the same quantum numbers as their corresponding fermions and the vector superfields provide spin 1/2 superpartners for the gauge fields – the gluinos, the winos, and the bino. The SM Higgs doublet with weak hypercharge Y = 1/2 becomes a scalar component of a chiral superfield $\hat H_u$ which contains in addition one doublet of Weyl fermions – the Higgsinos. The chiral anomaly cancelation condition requires that there be also a second Higgs chiral superfield $\hat H_d$ with Y = −1/2. Such a superfield is also required for giving masses to all flavors of quarks; because of the holomorphicity of the superpotential the same Higgs doublet cannot couple simultaneously to all quarks.

With the MSSM superfield content, the most general renormalizable superpotential consistent with the gauge symmetry has the form

$$W = Y_u \hat U^c \hat Q \hat H_u + Y_d \hat D^c \hat Q \hat H_d + Y_l \hat E^c \hat L \hat H_d + \mu \hat H_d \hat H_u$$
$$\quad + \lambda_1 \hat D^c \hat Q \hat L + \lambda_2 \hat E^c \hat L \hat L + \lambda_3 \hat U^c \hat D^c \hat D^c + \lambda_4 \hat L \hat H_u \qquad [4]$$

(flavor indices are suppressed) where the superfield $\hat Q$ contains the SU(2) quark doublet Q and its scalar superpartner $\tilde Q$ and similarly for the lepton doublet $\hat L$, quark singlets $\hat U$, $\hat D$, and lepton singlet $\hat E$ superfields. The three first terms in [4] give the SM-like Yukawa couplings of quarks and leptons to the Higgs fields together with Yukawa couplings of the corresponding superpartners. The fourth term has no SM analogy; it gives supersymmetric masses to the Higgs scalar and Higgsinos. The interactions in the second line do not conserve baryon and lepton numbers, respectively B and L, and should be forbidden (or strongly suppressed) by some additional symmetry of the theory as they would lead to rapid proton decay. A discrete symmetry, called R-parity $R = (-1)^{2S+3(B-L)}$, where S is the spin of the field, is an interesting possibility. R-parity acts differently on the different components of the superfields: it is even for all SM particles and odd for their superpartners. Its conservation implies that superpartners must appear in pairs in any interaction vertex. Thus, with R-parity imposed, the lightest supersymmetric particle is stable and it is an excellent candidate for the dark matter in the universe.

Supersymmetry cannot be an exact symmetry of nature because there do not exist elementary fermions and bosons degenerate in mass. The superpotential [4] does not break supersymmetry spontaneously but even if it did the elementary fermions and bosons would on average have equal masses (they would satisfy some mass sum rule) which is also contradicted by the experimental data. Therefore, in the MSSM, supersymmetry has to be broken explicitly but in such a way that the soft ultraviolet behavior remains intact. Remarkably, the supersymmetry breaking terms which can be added to the MSSM Lagrangian without reintroducing quadratic divergences make heavy just those fields which are opposite statistics superpartners of the SM gauge bosons and fermions. These so-called soft terms are:

$$L_{soft} = -\tfrac12 M_{\tilde G}\,\tilde G^a \tilde G^a - \tfrac12 M_{\tilde W}\,\tilde W^a \tilde W^a - \tfrac12 M_{\tilde B}\,\tilde B \tilde B$$
$$\quad - m_Q^2 |\tilde Q|^2 - m_U^2 |\tilde U^c|^2 - m_D^2 |\tilde D^c|^2 - m_L^2 |\tilde L|^2 - m_E^2 |\tilde E^c|^2$$
$$\quad - m_{H_u}^2 |H_u|^2 - m_{H_d}^2 |H_d|^2 - m_3^2 (H_u H_d + \mathrm{c.c.})$$
$$\quad + A_U \tilde U^c \tilde Q H_u + A_D \tilde D^c \tilde Q H_d + A_E \tilde E^c \tilde L H_d \qquad [5]$$

and yield gaugino (gluino $\tilde G$, wino $\tilde W$, and bino $\tilde B$) and scalar mass terms as well as explicit trilinear couplings between scalars (scalar mass terms and A-terms are 3 × 3 matrices in the flavor space). As a result, supersymmetry is broken in the mass spectra but not in the dimensionless couplings.

The origin of the soft supersymmetry breaking remains an open issue. Terms [5] are most probably remnants of the spontaneous supersymmetry breaking in the so-called "hidden" sector – a hypothetical set of fields that do not interact directly with the MSSM fields. For example, in the popular scenario, they interact with the MSSM fields only gravitationally and spontaneous supersymmetry breaking in the hidden sector is communicated to the MSSM sector by gravitational interactions giving rise to terms [5]. Several other mechanisms of supersymmetry breaking transmission have also been proposed (gauge mediation, anomaly mediation, etc.).

The mass parameters and A-terms in [5] are free parameters of the low-energy supersymmetric theory and, combined with the interactions like $Q\tilde Q\tilde G$ originating from supersymmetric kinetic terms, may be a new, troublesome, source of flavor changing neutral currents and of CP violation.

Higgs Sector of the MSSM

The MSSM Higgs potential reads

$$V = m_1^2 |H_d|^2 + m_2^2 |H_u|^2 + m_3^2 (H_u H_d + \mathrm{c.c.}) + \frac{g_1^2 + g_2^2}{8}\left(|H_d|^2 - |H_u|^2\right)^2 \qquad [6]$$

Its quartic part is uniquely determined by the structure of the supersymmetric gauge theory. The parameters $m_1^2$, $m_2^2$, and $m_3^2$ are determined by

the soft supersymmetry breaking Higgs boson masses [5] and the $\mu$ parameter in [4]. The potential [6] is bounded from below for $m_1^2 + m_2^2 > 2m_3^2$, and for $m_1^2 m_2^2 - m_3^4 < 0$ it has the electroweak symmetry breaking minimum at $v_u = \langle H_u^0 \rangle \neq 0$, $v_d = \langle H_d^0 \rangle \neq 0$. The ratio $v_u/v_d \equiv \tan\beta$ is then phenomenologically a very important parameter.

Quantum corrections to the mass parameters in [6] are controlled by the mass scale $M_{soft}$ of the supersymmetry breaking terms [5]; at the one-loop level instead of [3], one finds

$$\delta m_{1,2}^2 \sim \frac{1}{16\pi^2}\left(3g_2^2 + g_1^2 - 12 y_{b,t}^2\right) M_{soft}^2 \ln\frac{\Lambda_{NEW}^2}{M_{soft}^2} \qquad [7]$$

where $y_b$ and $y_t$ are the bottom- and top-quark Yukawa couplings, respectively and $\Lambda_{NEW}$ is the scale at which the soft supersymmetry breaking terms are generated by the putative supersymmetry breaking transmission mechanism. In gravity mediation scenarios, $\Lambda_{NEW} \sim M_{Pl}$. In gauge mediation scenarios, $\Lambda_{NEW}$ is low but it is a new scale, introduced by hand.

In the softly broken supersymmetric models, the hierarchy problem is solved for $M_{soft} \lesssim O(10)\,v$. Moreover, eqn [7] shows that via quantum corrections the large top-quark Yukawa coupling $y_t$ drives the mass parameter $m_2^2$ to a negative value, inducing the electroweak symmetry breaking. This means that in supersymmetric models the electroweak scale is calculable in terms of the known coupling constants and the (unknown) scales $M_{soft}$ and cutoff scale $\Lambda_{NEW}$ to the MSSM. If $M_{soft} \lesssim O(10)\,v$, the correct electroweak scale is obtained for $\Lambda_{NEW} \sim M_{GUT}$. This nicely fits with unification of the gauge couplings.

In supersymmetric models, the quartic couplings in the Higgs potential are restricted. This typically leads to a strong upper bound on the mass of the lightest Higgs particle. In the minimal model with the potential [6], at the tree level

$$M_{Higgs} < M_Z \approx 91\ \mathrm{GeV} \qquad [8]$$

This bound is substantially modified by quantum corrections. They depend quadratically on the top-quark mass and logarithmically on the stop mass scale $M_{\tilde t} \sim M_{soft}$:

$$M_{Higgs}^2 < \lambda v^2 \qquad [9]$$

where $\lambda$ is given by

$$\lambda = \tfrac18\left(g_2^2 + g_1^2\right)\cos^2 2\beta + \Delta\lambda \quad\text{with}\quad \Delta\lambda = \frac{3 g_2^2 m_t^4}{8\pi^2 v^2 M_W^2}\,\ln\frac{M_{\tilde t}^2}{m_t^2} \qquad [10]$$

For $M_{\tilde t} < 1$ TeV, $M_{Higgs} < 130$ GeV.

The minimal-model bound on the Higgs mass can be relaxed in models with extended Higgs sector. For instance, if an additional gauge group singlet chiral superfield couples to the Higgs doublets, the Higgs self-coupling $\lambda$ in [9] receives additional contributions. Explicit calculations show that in such and other models, with $M_{soft} \lesssim 1$ TeV, the bound on the Higgs mass cannot be raised above 150 GeV if one wants to preserve perturbative gauge coupling unification.

Supersymmetric Grand Unified Theories

There are two striking aspects of the matter spectrum in the SM. One is the chiral anomalies cancelation (Weinberg 1996–2000, Pokorski 2000), which is necessary for a unitary (and renormalizable) theory, and occurs thanks to certain conspiracy between quarks and leptons suggesting a deeper link between them. The second one is that the spectrum fits into simple representations of the SU(5) and SO(10) groups (Ross 1985). Indeed, each generation of the SM matter fills $\bar 5 + 10 + 1$ (if the right-handed neutrino is included into the spectrum) representations of SU(5) and for SO(10), $16 = \bar 5 + 10 + 1$. The assignment of fermions to the SU(5) or SO(10) representations fixes the normalization of the $U(1)_Y$ generator. Both facts suggest unification of strong and electroweak elementary forces in a grand unified theory with some bigger gauge symmetry group. Such unification implies that all the SM gauge forces become of equal strength at some unification scale. Their strength is measured by the running gauge couplings $\alpha_i = g_i^2/4\pi$, i = 1, 2, 3, of the three group factors $SU(3)_c \times SU(2)_L \times U(1)_Y$. The energy scale dependence of $\alpha_i$ is governed by the renormalization group equations. In the first nontrivial approximation, they read:

$$\frac{1}{\alpha_i(Q)} = \frac{1}{\alpha_i(M_Z)} - \frac{b_0^{(i)}}{2\pi}\,\ln\frac{Q}{M_Z} \qquad [11]$$

Here, $1/\alpha_i(M_Z) = (58.98 \pm 0.04,\ 29.57 \pm 0.03,\ 8.40 \pm 0.14)$ are the experimental values of the gauge couplings at the Fermi scale and $b_0^{(i)}$ are the coefficients which depend on the matter content of the theory. They are

$$b_0 = \left(\tfrac{1}{10} + \tfrac43 N_g,\ -\tfrac{43}{6} + \tfrac43 N_g,\ -11 + \tfrac43 N_g\right)$$

in the SM and

$$b_0 = \left(\tfrac35 + 2N_g,\ -5 + 2N_g,\ -9 + 2N_g\right)$$

in the MSSM, where $N_g$ is the number of fermion generations. In the SM, the running gauge couplings

approach each other at high scale of order $10^{13}$ GeV but never unify.
be consistent with but close to the present experimental limits.
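The statement about one-loop running can be checked with a few lines of arithmetic. The sketch below evaluates eqn [11] with the central values of $1/\alpha_i(M_Z)$ and the $b_0$ coefficients quoted in the text for $N_g = 3$. The value of `M_Z`, the function names, and the choice to run the MSSM coefficients all the way down to $M_Z$ (ignoring the threshold at $M_{soft}$) are assumptions made for illustration only.

```python
import math

M_Z = 91.19                           # GeV, assumed Z mass
INV_ALPHA_MZ = (58.98, 29.57, 8.40)   # central values of 1/alpha_i(M_Z) from the text
N_G = 3                               # number of fermion generations


def b0_sm(ng):
    """One-loop coefficients b0 in the SM, as quoted in the text."""
    return (1 / 10 + 4 / 3 * ng, -43 / 6 + 4 / 3 * ng, -11 + 4 / 3 * ng)


def b0_mssm(ng):
    """One-loop coefficients b0 in the MSSM, as quoted in the text."""
    return (3 / 5 + 2 * ng, -5 + 2 * ng, -9 + 2 * ng)


def inv_alpha(q, b0):
    """Eqn [11]: 1/alpha_i(Q) = 1/alpha_i(M_Z) - b0_i/(2 pi) * ln(Q/M_Z)."""
    t = math.log(q / M_Z)
    return tuple(a - b / (2 * math.pi) * t for a, b in zip(INV_ALPHA_MZ, b0))


def crossing_12(b0):
    """Scale Q (GeV) at which alpha_1(Q) = alpha_2(Q)."""
    t = 2 * math.pi * (INV_ALPHA_MZ[0] - INV_ALPHA_MZ[1]) / (b0[0] - b0[1])
    return M_Z * math.exp(t)


def mismatch(b0):
    """Return (Q, 1/alpha_3 - 1/alpha_1) at the scale where alpha_1 = alpha_2."""
    q = crossing_12(b0)
    a1, _, a3 = inv_alpha(q, b0)
    return q, a3 - a1


if __name__ == "__main__":
    for name, b0 in (("SM", b0_sm(N_G)), ("MSSM", b0_mssm(N_G))):
        q, d = mismatch(b0)
        print(f"{name}: alpha_1 = alpha_2 at Q ~ {q:.1e} GeV, "
              f"1/alpha_3 - 1/alpha_1 = {d:+.2f}")
```

Under these assumptions the SM couplings $\alpha_1$ and $\alpha_2$ meet near $10^{13}$ GeV with $1/\alpha_3$ still several units away, whereas with the MSSM coefficients the crossing moves to roughly $10^{16}$ GeV and the mismatch shrinks to a few tenths of a unit.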
In the MSSM, with sparticle spectrum characterized by $M_{soft} \approx 1$ TeV and for the initial Fermi scale values given above, the three gauge couplings unify with high precision at the scale $M_{GUT} \sim 10^{16}$ GeV. Therefore, the MSSM can be embedded into supersymmetric grand unified theories with no hierarchy problem for the Fermi scale (it is stable with respect to radiative corrections generated by particles with masses $\sim M_{GUT}$) and no conflict with the measured values of the gauge couplings.

In the SM, the baryon number is (perturbatively) conserved since there are no renormalizable couplings violating this symmetry. Experimental search for proton decay, for example, $p \to e^+\pi^0$, $p \to K^+\bar\nu$, is one of the most fundamental tests for particle physics. The present limit on the proton life time is $\tau_p > 10^{33}$ yr. In grand unified theories, baryon number conservation is violated by interactions mediated by the heavy gauge bosons corresponding to the enlarged gauge symmetry (e.g., SU(5)), spontaneously broken at $M_{GUT}$ to the SM gauge symmetry. Such interactions manifest themselves at low energy as additional, nonrenormalizable interactions added to the SM Lagrangian. Proton decay is then induced by the set of dimension-6 operators of the form

$$O_i^{(6)} = \frac{c_i^{(6)}}{M_{(6)}^2}\; qqql \qquad [12]$$

where q, l denote quarks and leptons, respectively. For $c_i^{(6)} \sim \alpha_{GUT} \approx 1/25$, the experimental limit on $\tau_p$ requires $M_{(6)} \gtrsim 10^{15}$ GeV, consistently with $M_{GUT} = 10^{16}$ GeV in supersymmetric GUTs. However, in supersymmetric GUTs, there is still another, genuinely supersymmetric, source of contributions to the proton decay amplitudes. These are the dimension-5 operators

$$O_i^{(5)} = \frac{c_i^{(5)}}{M_{(5)}}\; qq\tilde q\tilde l \qquad [13]$$

where $\tilde q$, $\tilde l$ denote squarks and sleptons, respectively. Such operators originate from the exchange of the color triplet scalars present in the Higgs boson GUT multiplets, with $M_{(5)} \sim M_{GUT} \sim 10^{16}$ GeV, and $c^{(5)} \gtrsim 10^{-7}$ is given by the Yukawa couplings. Inserted into diagrams with gaugino exchanges they give rise to dimension-6 operators of the form [12]. One then gets $c^{(6)} = \alpha_{GUT}\, c^{(5)}$, $M_{(6)}^2 = M_{(5)} M_{SUSY}$. Given various uncertainties, for example, in the unknown squark, gaugino, and heavy Higgs boson mass spectrum, such contributions in supersymmetric GUT models predict the proton life time to

Summary

Supersymmetry is distinct in several very important points from all other proposed solutions to the hierarchy problem. First of all, it provides a general theoretical framework which allows one to address many physical questions. Supersymmetric models, like the MSSM or its simple extensions, satisfy a very important criterion of "perturbative calculability." In particular, they are easily consistent with the precision electroweak data. The SM is their low-energy approximation in the sense of the Appelquist–Carazzone decoupling, so most of the successful structure of the SM is built into supersymmetric models. The quadratically divergent quantum corrections to the Higgs mass parameter (the origin of the hierarchy problem in the SM) are absent in any order of perturbation theory. Therefore, the cutoff to a supersymmetric theory can be as high as the Planck scale, and "small" scale of the electroweak breaking is still natural. Supersymmetry is not only consistent with grand unification of elementary forces but, in fact, makes it very successful. And, finally, supersymmetry is needed for string theory.

However, there are also some problems to be solved: the hierarchy problem of the electroweak scale is solved but the origin of the soft supersymmetry breaking scale $M_{soft}$ remains an open question: spontaneous supersymmetry breaking and its transmission to the visible sector is a difficult problem and a fully satisfactory mechanism which would yield $M_{soft}$ hierarchically smaller than the Planck (string) scale has not yet been found. On the phenomenological side, there are new potential sources of flavor-changing neutral current transitions and of CP violation, and baryon and lepton numbers are not automatically conserved by the renormalizable couplings. But even those problems can at least be discussed in a concrete quantitative way.

See also: Brane Construction of Gauge Theories; Perturbation Theory and its Techniques; Seiberg–Witten Theory; Standard Model of Particle Physics; Supergravity; Supermanifolds.

Further Reading

Eidelman S et al. (The Particle Data Group) (2004) Review of particle physics. Physics Letters B 592: 1.
Kane GL (ed.) (1998a) Perspectives in Supersymmetry. Singapore: World Scientific.
Kane GL (ed.) (1998b) Perspectives on Higgs Physics II. Singapore: World Scientific.
Nilles H-P (1984) Physics Reports C 110: 1.

Pokorski S (2000) Gauge Field Theories, Cambridge Monographs on Mathematical Physics, 2nd edn. Cambridge: Cambridge University Press.
Ross GG (1985) Grand Unified Theories. Redwood City, CA: Addison-Wesley.
Sohnius M-F (1985) Physics Reports C 128: 39.
Weinberg S (1996–2000) The Quantum Theory of Fields, vols. I–III. Cambridge: Cambridge University Press.
Wess J and Bagger J (1992) Supersymmetry and Supergravity, Princeton Series in Physics, 2nd edn. Princeton: Princeton University Press.

Supersymmetric Quantum Mechanics


J-W van Holten, NIKHEF, Amsterdam, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Supersymmetric quantum mechanics is a specific extension of quantum mechanics with fermionic degrees of freedom. In quantum field theory and many-body theory, a fermionic degree of freedom is one which is subject to Pauli's principle: any nondegenerate quantum state associated with a fermionic degree of freedom can be occupied at most once at any time. Similarly, in quantum mechanics, one associates a fermionic degree of freedom with an observable, the eigenvalue spectrum of which is restricted to the discrete set (0, 1).

The simplest example of a purely fermionic quantum system is the fermionic oscillator. It is represented by conjugate operators $(f, f^\dagger)$ such that

$$f^2 = 0, \qquad f^{\dagger 2} = 0, \qquad f f^\dagger + f^\dagger f = 1 \qquad [1]$$

with a Hamiltonian H given by the bilinear expression

$$H_f = \varepsilon_f + \hbar\omega\, f^\dagger f \qquad [2]$$

The state space of this system is spanned by two independent state vectors $|0\rangle$ and $|1\rangle$, such that

$$f|0\rangle = 0, \quad f^\dagger|0\rangle = |1\rangle, \quad f|1\rangle = |0\rangle, \quad f^\dagger|1\rangle = 0 \qquad [3]$$

By construction, the states $|n_f\rangle$ are eigenstates of fermion number,

$$N_f = f^\dagger f, \qquad N_f^2 = N_f \qquad [4]$$

with eigenvalue $n_f = (0, 1)$; this implements the Pauli principle. The states have energy eigenvalues

$$E_{n_f} = \varepsilon_f + n_f\,\hbar\omega, \qquad n_f = (0, 1) \qquad [5]$$

differing in energy by $\Delta E = \hbar\omega$. Physically, the system can be identified with a single fixed magnetic dipole in an external magnetic field, the only polarization states of the dipole being spin up or spin down.

In the Schrödinger representation of quantum mechanics (wave mechanics), fermionic degrees of freedom are represented by anticommuting Grassmann variables. These have no immediate classical analog, but can be used to construct quasiclassical observables like spin.

A supersymmetric quantum system is a system possessing both fermionic and bosonic degrees of freedom, characterized by a degeneracy between states with even and odd fermion number. In the Schrödinger representation, this is manifest in a symmetry transforming bosonic (Grassmann-even) into fermionic (Grassmann-odd) variables. The generators of the supersymmetry transformations square to the Hamiltonian of the system.

The Supersymmetric Oscillator

An elementary example of a supersymmetric quantum system is the supersymmetric oscillator. It is a physical system combining a standard bosonic quantum oscillator with a fermionic oscillator of the same frequency. The ordinary harmonic oscillator is described by the pair of lowering and raising operators $(b, b^\dagger)$, with commutator

$$b b^\dagger - b^\dagger b = 1 \qquad [6]$$

and the Hamiltonian

$$H_b = \varepsilon_b + \hbar\omega\, b^\dagger b \qquad [7]$$

In this case, the eigenvalue spectrum of the occupation number

$$N_b = b^\dagger b \qquad [8]$$

consists of all non-negative integers $n_b = 0, 1, 2, \ldots$, with corresponding energy eigenvalues. To construct the supersymmetric oscillator, the harmonic oscillator is combined with a fermionic oscillator [2] of the same frequency:

$$H_s = \varepsilon_0 + \hbar\omega\left(b^\dagger b + f^\dagger f\right) \qquad [9]$$

where $\varepsilon_0 = \varepsilon_b + \varepsilon_f$. The ground state of this system is the state annihilated by both b and f:

$$b|0, 0\rangle = f|0, 0\rangle = 0 \qquad [10]$$

The full set of energy eigenstates of the system is constructed by taking

$$|n_b, n_f\rangle = \frac{1}{\sqrt{n_b!}}\,(b^\dagger)^{n_b} (f^\dagger)^{n_f}\, |0, 0\rangle, \qquad n_b = (0, 1, 2, \ldots), \quad n_f = (0, 1) \qquad [11]$$

with the energy eigenvalue spectrum

$$E(n_b, n_f) = \varepsilon_0 + n\,\hbar\omega, \qquad n = n_b + n_f \qquad [12]$$

Clearly, there is a degeneracy in energy between the states $|n_b + 1, 0\rangle$ and $|n_b, 1\rangle$, which have the same total occupation number n, but differ in the bosonic and fermionic occupation number by one unit. This is illustrated in Figure 1. Such pairs of states which are degenerate in energy can be transformed into each other by the operators

$$Q = \sqrt{2\hbar\omega}\; b^\dagger f, \qquad Q^\dagger = \sqrt{2\hbar\omega}\; f^\dagger b \qquad [13]$$

The explicit transformations are

$$|n_b + 1, 0\rangle = \frac{1}{\sqrt{2(n_b + 1)\hbar\omega}}\; Q\,|n_b, 1\rangle, \qquad |n_b, 1\rangle = \frac{1}{\sqrt{2(n_b + 1)\hbar\omega}}\; Q^\dagger\,|n_b + 1, 0\rangle \qquad [14]$$

The operations [14] are called supersymmetry transformations, and the operators Q and $Q^\dagger$ are called supercharges.

As the zero point of energy is arbitrary in systems without gravitational interactions, it is customary to take $\varepsilon_0 = 0$, that is, $\varepsilon_f = -\varepsilon_b$; with the normalization [13], the Hamiltonian H is then the symmetrized absolute square of the supercharges:

$$Q Q^\dagger + Q^\dagger Q = 2H \qquad [15]$$

Figure 1 Spectrum of states of the supersymmetric oscillator. States $(n_b, n_f)$ at each level $E/\hbar\omega$: 0: (0, 0); 1: (0, 1), (1, 0); 2: (1, 1), (2, 0); 3: (2, 1), (3, 0); 4: (3, 1), (4, 0).

whilst

$$Q^2 = Q^{\dagger 2} = 0 \qquad [16]$$

The above relations suffice to guarantee that the supercharges $(Q, Q^\dagger)$ are conserved:

$$[Q, H] = [Q^\dagger, H] = 0 \qquad [17]$$

a result re-expressing the degeneracy between states with the same n but different $n_b$ and $n_f$. The real form of the supercharges is

$$Q_1 = \frac12\left(Q + Q^\dagger\right), \qquad Q_2 = \frac{1}{2i}\left(Q - Q^\dagger\right) \qquad [18]$$

In this representation

$$H = Q_1^2 + Q_2^2 \qquad [19]$$

An important observation is that the ground state is the only state annihilated by both supersymmetry operators:

$$Q|0, 0\rangle = 0, \qquad Q^\dagger|0, 0\rangle = 0 \qquad [20]$$

Indeed, it is the only state with zero energy eigenvalue, and only such a state can be an invariant supersinglet; all other states have positive energy and they necessarily occur in supersymmetry pairs.

Anticommuting Variables

Fermionic degrees of freedom can be described in a pseudoclassical formulation by anticommuting variables $\theta$ taking values in an infinite-dimensional Grassmann algebra:

$$\theta\theta' + \theta'\theta = 0 \qquad [21]$$

With an anticommuting variable $\theta$, we can associate a derivative operator $\partial/\partial\theta$, which is an element of another Grassmann algebra such that

$$\left[\frac{\partial}{\partial\theta}, \theta\right]_+ = \frac{\partial}{\partial\theta}\,\theta + \theta\,\frac{\partial}{\partial\theta} = 1, \qquad \frac{\partial^2}{\partial\theta^2} = 0 \qquad [22]$$

This extends the original Grassmann algebra to a Clifford algebra. Integration with respect to an anticommuting variable is defined in the same way:

$$\int d\theta\; \theta = 1, \qquad \int d\theta\; 1 = 0 \qquad [23]$$

that is, integration is the same as differentiation for anticommuting variables. With these definitions, we can represent the fermionic raising and lowering operators in terms of anticommuting variables as

$$f^\dagger \to \theta, \qquad f \to \frac{\partial}{\partial\theta} \qquad [24]$$

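and the states by

$$|0\rangle \to 1, \qquad |1\rangle \to \theta \qquad [25]$$

Then an arbitrary state takes the form of a linear superposition

$$|\psi\rangle = \psi_0|0\rangle + \psi_1|1\rangle \to \psi(\theta) = \psi_0 + \psi_1\theta \qquad [26]$$

and the standard positive-semidefinite inner product on the state space is represented on the wave functions by the double integral

$$\langle\phi|\psi\rangle = \int d\theta\, d\bar\theta\; e^{\bar\theta\theta}\; \bar\phi(\bar\theta)\,\psi(\theta) = \bar\phi_0\psi_0 + \bar\phi_1\psi_1 \qquad [27]$$

By construction, $f^\dagger = \theta$ and $f = \partial/\partial\theta$ are conjugates with respect to this inner product:

$$\int d\theta\, d\bar\theta\; e^{\bar\theta\theta}\; \bar\phi(\bar\theta)\,\theta\,\psi(\theta) = \int d\theta\, d\bar\theta\; e^{\bar\theta\theta}\; \frac{\partial\bar\phi(\bar\theta)}{\partial\bar\theta}\,\psi(\theta) \qquad [28]$$

The real (self-conjugate) forms of the fermion operators are, therefore, defined by

$$\sigma_1 = \theta + \frac{\partial}{\partial\theta}, \qquad \sigma_2 = i\left(\theta - \frac{\partial}{\partial\theta}\right) \qquad [29]$$

which satisfy the Pauli–Dirac anticommutation relations

$$\sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij} \qquad [30]$$

By taking the product, we obtain

$$\sigma_3 = -i\sigma_1\sigma_2 = 1 - 2\theta\frac{\partial}{\partial\theta} = 1 - 2N_f \quad\Leftrightarrow\quad N_f = \frac12\left(1 - \sigma_3\right) \qquad [31]$$

Thus, we may think of the wave functions as two-component spinors, the components being labeled either by the eigenvalues of the spin operator $\sigma_3$, or equivalently by the fermion number $N_f$, which is a projection operator on the states with negative spin.

The action of the Hamiltonian on a wave function $\psi(\theta)$ is represented by the integral

$$[H\psi](\theta) = \int d\theta'\, d\bar\theta\; e^{(\theta - \theta')\bar\theta}\; H(\theta, \bar\theta)\,\psi(\theta') \qquad [32]$$

where $H(\theta, \bar\theta)$ is the ordered symbol of the Hamiltonian:

$$H(\theta, \bar\theta) = \varepsilon_f + \hbar\omega\,\theta\bar\theta \qquad [33]$$

This expression is to be considered as the classical Hamiltonian of the system. In particular, the exponent of the action

$$S = \int_1^2 dt\,\left(i\hbar\,\bar\theta\dot\theta - H(\theta, \bar\theta)\right) = \hbar\int_1^2 dt\,\left(i\bar\theta\dot\theta + \omega\,\bar\theta\theta\right) - \varepsilon_f\,(t_2 - t_1) \qquad [34]$$

provides the integrand for the path-integral representation of the evolution operator in the quantum theory. The proof is not given here; the reader is referred to the literature. In passing, note that as the anticommuting variables $(\theta, \bar\theta)$ are taken to be dimensionless, one actually should identify the momentum conjugate to $\theta$ with $\pi_\theta = -i\hbar\bar\theta$; in the quantum theory, this is replaced by the operator $-i\hbar\,\partial/\partial\theta$.

Classical Supersymmetry

The classical action for the supersymmetric oscillator with bosonic amplitude x and fermionic amplitude $\psi$ is

$$S = \int_1^2 dt\,\left(\frac12\dot x^2 - \frac{\omega^2}{2}x^2 + i\bar\psi\dot\psi + \omega\,\bar\psi\psi\right) \qquad [35]$$

As inferred from the quantum theory, it is a combination of a linear harmonic oscillator and a fermionic oscillator of the same frequency. A factor $\sqrt{\hbar}$ is also absorbed in $\psi$ and $\bar\psi$; equivalently, we can use natural units in which $\hbar = 1$. In the following, we use this convention.

The action [35] is invariant under infinitesimal symmetry transformations

$$\delta x = -i\left(\bar\epsilon\psi + \epsilon\bar\psi\right), \qquad \delta\psi = (\dot x + i\omega x)\,\epsilon, \qquad \delta\bar\psi = (\dot x - i\omega x)\,\bar\epsilon \qquad [36]$$

with $(\epsilon, \bar\epsilon)$ Grassmann-odd parameters. The Noether theorem then implies that there are conserved fermionic charges

$$Q = (p - i\omega x)\,\psi, \qquad \bar Q = (p + i\omega x)\,\bar\psi \qquad [37]$$

with the momentum defined by $p = \dot x$. The other conserved quantity is the energy, represented by the Hamiltonian

$$H = \frac12\left(p^2 + \omega^2 x^2\right) + \omega\,\psi\bar\psi \qquad [38]$$

The canonical phase-space formulation is obtained by defining brackets of two functions (A, B) on the phase space $(x, p;\ \psi, \bar\psi)$ by

$$\{A, B\} = \frac{\partial A}{\partial x}\frac{\partial B}{\partial p} - \frac{\partial A}{\partial p}\frac{\partial B}{\partial x} + i(-1)^A\left(\frac{\partial A}{\partial\psi}\frac{\partial B}{\partial\bar\psi} + \frac{\partial A}{\partial\bar\psi}\frac{\partial B}{\partial\psi}\right) \qquad [39]$$

where $(-1)^A$ is the Grassmann parity of A. In terms of these brackets, the time evolution and supersymmetry transformations take the form

$$\dot A = -\{H, A\}, \qquad \delta A = i\left\{\bar\epsilon Q + \epsilon\bar Q,\ A\right\} \qquad [40]$$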
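For comparison with the bracket relations of the classical theory, the operator relations [13], [15], and [16] of the quantum oscillator can be checked directly on the number basis $|n_b, n_f\rangle$. The following is a minimal sketch, not taken from the article: the function names are hypothetical, units are chosen so that $\hbar\omega = 1$ with $\varepsilon_0 = 0$, and states are represented simply as tuples `(nb, nf)`.

```python
import math

HBAR_OMEGA = 1.0  # assumed units: hbar * omega = 1, zero-point energy eps_0 = 0


def apply_Q(state):
    """Q = sqrt(2 hbar omega) b^dagger f on |nb, nf>.

    Returns (coefficient, new_state), or None if the state is annihilated.
    """
    nb, nf = state
    if nf == 0:
        return None  # f annihilates the fermionic vacuum
    return math.sqrt(2 * HBAR_OMEGA * (nb + 1)), (nb + 1, 0)


def apply_Qdag(state):
    """Q^dagger = sqrt(2 hbar omega) f^dagger b on |nb, nf>."""
    nb, nf = state
    if nf == 1 or nb == 0:
        return None  # f^dagger kills nf = 1, b kills nb = 0
    return math.sqrt(2 * HBAR_OMEGA * nb), (nb - 1, 1)


def energy(state):
    """E(nb, nf) = hbar omega (nb + nf), eqn [12] with eps_0 = 0."""
    nb, nf = state
    return HBAR_OMEGA * (nb + nf)


def anticommutator_eigenvalue(state):
    """Diagonal value of Q Q^dagger + Q^dagger Q on the basis state."""
    total = 0.0
    for first, second in ((apply_Qdag, apply_Q), (apply_Q, apply_Qdag)):
        step = first(state)
        if step is None:
            continue
        c1, middle = step
        back = second(middle)
        if back is None:
            continue
        c2, final = back
        assert final == state  # the product maps the basis state to itself
        total += c1 * c2
    return total


if __name__ == "__main__":
    for nb in range(4):
        for nf in (0, 1):
            s = (nb, nf)
            print(s, anticommutator_eigenvalue(s), 2 * energy(s))
```

On every basis state the two printed numbers agree, which is relation [15]; applying `apply_Q` twice always returns `None` at the second step, which is relation [16]; and only `(0, 0)` is annihilated by both operators, as stated in [20].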

Moreover, the charges Q and $\bar Q$ satisfy the bracket algebra

$$\{Q, \bar Q\} = -2iH, \qquad \{Q, H\} = \{\bar Q, H\} = 0 \qquad [41]$$

Thus, the action [35] is the classical counterpart of the quantum theory [9]–[17] in the correspondence limit $i\{A, B\} \to [A, B] = AB - BA$. For these theories, supersymmetry is rooted in the classical transformations [36].

Supersymmetric Quantum Mechanics

The construction for the supersymmetric oscillator can be generalized to other dynamical systems in two ways. First, the nature of the interactions as represented by the potential can be modified. Second, the number of degrees of freedom can be varied. This section presents a generalization of the supersymmetric oscillator to anharmonic interactions, obtained by modification of the supercharges [37] with a general function $\Phi(x)$ as follows:

$$Q = (p - i\Phi(x))\,\psi, \qquad \bar Q = (p + i\Phi(x))\,\bar\psi \qquad [42]$$

The brackets [39] imply the supersymmetry algebra [41] with the Hamiltonian

$$H = \frac{i}{2}\{Q, \bar Q\} = \frac12 p^2 + \frac12\Phi^2(x) + \frac12\Phi'(x)\left(\psi\bar\psi - \bar\psi\psi\right) \qquad [43]$$

In quantum mechanics, the supercharges become operators Q and $Q^\dagger$ upon reinterpretation of (x, p) as canonically conjugate operators, and the replacement $\psi \to f^\dagger$ and $\bar\psi \to f$; this procedure involves no ordering ambiguity. The Hamiltonian operator defined by the anticommutator of Q and $Q^\dagger$ then takes the operator form associated with [43]. With the identification

$$A = \frac{1}{\sqrt2}\,(p - i\Phi), \qquad A^\dagger = \frac{1}{\sqrt2}\,(p + i\Phi) \qquad [44]$$

and making use of the (anti)commutation relations

$$A A^\dagger - A^\dagger A = \Phi'(x), \qquad f f^\dagger + f^\dagger f = 1 \qquad [45]$$

this Hamilton operator can be written in normal-ordered form as

$$H = \frac12\left(Q Q^\dagger + Q^\dagger Q\right) = A^\dagger A + \Phi'(x)\, f^\dagger f \qquad [46]$$

It is positive-semidefinite by construction. All results for the supersymmetric oscillator are reproduced upon taking $\Phi(x) = \omega x$.

As the Hamiltonian commutes with the fermion number operator $N_f$, we can label all stationary states $|E, n_f\rangle$ by the energy E and the fermion number $n_f = (0, 1)$. Moreover, all states of positive energy are degenerate with respect to fermion number, as they form pairs related by supersymmetry:

$$Q|E, 0\rangle = \sqrt{2E}\,|E, 1\rangle, \qquad Q^\dagger|E, 1\rangle = \sqrt{2E}\,|E, 0\rangle \qquad [47]$$

Only ground states with $E_0 = 0$ can occur as singlets under supersymmetry. The existence of such a ground state with fermion number $n_f$ amounts to the existence of a state $|0, n_f\rangle$ satisfying

$$A^\dagger f\,|0, n_f\rangle = A f^\dagger\,|0, n_f\rangle = 0 \qquad [48]$$

The corresponding wave functions are of the form

$$|0, 0\rangle \to \Psi_0(x, \theta) = \varphi_-(x), \qquad |0, 1\rangle \to \Psi_1(x, \theta) = \theta\,\varphi_+(x) \qquad [49]$$

where $\varphi_\pm(x)$ are solutions of the equations

$$A\,\varphi_- = 0, \qquad A^\dagger\varphi_+ = 0 \qquad [50]$$

These functions are formally given by the expressions

$$\varphi_\pm(x) = C\, e^{\pm\int_0^x \Phi(y)\, dy} \qquad [51]$$

For a zero-energy ground state to exist, one of these functions must be normalizable. For example, if $\Phi(x)$ is a polynomial of positive odd degree $2k - 1$, then, depending on the sign of the coefficient of $x^{2k-1}$, one of the exponents is bounded, approaching zero for $x \to \pm\infty$, and as a result becomes square integrable.

If no normalizable wave functions of the form [51] exist, the ground state cannot have zero energy ($E_0 > 0$) and all states necessarily belong to superdoublets.

Spinning-Particle Mechanics

Minimal supersymmetric classical or quantum mechanics requires equal number of bosonic and fermionic coordinates in configuration space $(x_i, \psi_i)$, rather than equal number of bosonic and fermionic degrees of freedom in phase space. Specifically, minimal free supersymmetric particle mechanics in n dimensions is described by the classical Lagrangian

$$L = \frac12\dot x_i^2 + \frac{i}{2}\,\psi_i\dot\psi_i, \qquad i = 1, \ldots, n \qquad [52]$$

It is invariant modulo a total time derivative under infinitesimal supersymmetry transformations

$$\delta x_i = -i\epsilon\psi_i, \qquad \delta\psi_i = \dot x_i\,\epsilon \qquad [53]$$
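The invariance claim can be verified in a few lines. The worked variation below assumes the sign conventions $\delta x_i = -i\epsilon\psi_i$ and $\delta\psi_i = \dot x_i\,\epsilon$ for the transformations [53] (the signs are an assumption, chosen so that the result is consistent with the Lagrangian [52]), with $\epsilon$ a constant Grassmann-odd parameter that anticommutes with $\psi_i$ and commutes with $x_i$.

```latex
\begin{align*}
\delta L &= \dot{x}_i\,\delta\dot{x}_i
            + \tfrac{i}{2}\left(\delta\psi_i\,\dot{\psi}_i + \psi_i\,\delta\dot{\psi}_i\right) \\
         &= -i\epsilon\,\dot{x}_i\dot{\psi}_i
            + \tfrac{i}{2}\left(\dot{x}_i\,\epsilon\,\dot{\psi}_i + \psi_i\,\ddot{x}_i\,\epsilon\right) \\
         &= -i\epsilon\,\dot{x}_i\dot{\psi}_i
            + \tfrac{i}{2}\,\epsilon\left(\dot{x}_i\dot{\psi}_i - \ddot{x}_i\psi_i\right)
            \qquad (\psi_i\,\epsilon = -\epsilon\,\psi_i) \\
         &= -\tfrac{i}{2}\,\epsilon\,\frac{d}{dt}\left(\dot{x}_i\psi_i\right)
\end{align*}
```

The variation is a total time derivative, so the action changes only by boundary terms; the quantity $\dot x_i\psi_i$ appearing in that boundary term is the same combination that serves as the conserved supercharge.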

The canonical phase-space formulation is phrased in terms of the free-particle momentum and Hamiltonian

\[ p_i = \dot{x}_i, \qquad H = \tfrac{1}{2}\,p_i^2 \tag{54} \]

and the brackets

\[ \{A, B\} = \frac{\partial A}{\partial x_i}\frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i}\frac{\partial B}{\partial x_i} + i(-1)^{A}\,\frac{\partial A}{\partial \psi_i}\frac{\partial B}{\partial \psi_i} \tag{55} \]

The supersymmetry transformations are generated by the supercharge

\[ Q = p_i\psi_i, \qquad \delta A = i\epsilon\,\{Q, A\} \tag{56} \]

with the supersymmetry algebra

\[ i\{Q, Q\} = 2H, \qquad \{Q, H\} = 0 \tag{57} \]

An important quantity in these models is the bilinear (Grassmann-even) antisymmetric tensor

\[ \sigma_{ij} = -i\,\psi_i\psi_j \tag{58} \]

For a free particle, it is a set of constants of motion forming a representation of $so(n)$, the Lie algebra of $n$-dimensional rotations:

\[ \{\sigma_{ij}, \sigma_{kl}\} = \delta_{jk}\sigma_{il} - \delta_{jl}\sigma_{ik} - \delta_{ik}\sigma_{jl} + \delta_{il}\sigma_{jk} \tag{59} \]

Therefore, the physical interpretation of $\sigma_{ij}$ is that it represents the particle spin. For this reason, supersymmetric particle mechanics is often called spinning-particle mechanics.

Quantum mechanics of the spinning particle has the same algebraic structure, with $(x_i, p_i)$ the standard canonically conjugate operators, and the fermionic coordinates $\psi_i$ represented by the generators of a Clifford algebra; the irreducible representation in terms of Pauli–Dirac matrices of dimension $2^{[n/2]} \times 2^{[n/2]}$ is

\[ \psi_i \to \frac{1}{\sqrt{2}}\,\gamma_i, \qquad \gamma_i\gamma_j + \gamma_j\gamma_i = 2\delta_{ij} \tag{60} \]

It follows that the wave functions have $2^{[n/2]}$ components, describing different polarization states. Furthermore, in minimal supersymmetric quantum mechanics, the supersymmetry operator is represented by the Dirac operator:

\[ Q \to \frac{1}{\sqrt{2}}\,\gamma\cdot p, \qquad (\gamma\cdot p)^2 = p_i^2 = 2H \tag{61} \]

Hence, the stationary states of the system solve the Dirac equation

\[ \gamma\cdot p\,\Psi = \sqrt{2E}\,\Psi \tag{62} \]

The models can, without difficulty, be extended to include interactions with external fields. As an example, we consider the coupling to a magnetic field described by a vector potential $A_i(x)$. An extension of the free-particle action [52], invariant under the same supersymmetry transformations [53], is

\[ S = \int dt \left( \tfrac{1}{2}\,\dot{x}_i^2 + \tfrac{i}{2}\,\psi_i\dot{\psi}_i + qA_i(x)\,\dot{x}_i - \tfrac{iq}{2}\,F_{ij}(x)\,\psi_i\psi_j \right) \tag{63} \]

where $F_{ij} = \nabla_i A_j - \nabla_j A_i$ is the field strength. The canonical momentum in this model is

\[ p_i = \dot{x}_i + qA_i(x) \tag{64} \]

with the result that the canonical expressions for the Hamiltonian and supercharge become

\[ H = \tfrac{1}{2}\left(p_i - qA_i(x)\right)^2, \qquad Q = \left(p_i - qA_i(x)\right)\psi_i \tag{65} \]

In the quantum theory, these constants of motion become the covariant Laplacian and Dirac operator in an external vector potential $A_i(x)$. Observe that supersymmetry requires the spin to couple to the magnetic field with gyromagnetic ratio $g = 2$. Explicitly, the equation of motion for $\psi$ can be transformed into an equation for the spin precession:

\[ \dot{\psi}_i = qF_{ij}\psi_j \;\Rightarrow\; \dot{\sigma}_{ij} = q\left(F_{ik}\sigma_{kj} - \sigma_{ik}F_{kj}\right) \tag{66} \]

In three dimensions, this is equivalent to an equation in terms of axial vectors:

\[ F_{ij} = \varepsilon_{ijk}B_k, \quad \sigma_{ij} = \varepsilon_{ijk}s_k \;\Rightarrow\; \dot{\mathbf{s}} = q\,\mathbf{s}\times\mathbf{B} \tag{67} \]

showing that the precession rate of $\mathbf{s}$ is given by twice the Larmor frequency.

Extended Supersymmetry

It is possible to construct theories with more supersymmetries by associating with every bosonic coordinate several fermionic coordinates. An example is the supersymmetric oscillator and its generalizations considered earlier, which has an equal number of bosonic and fermionic degrees of freedom in phase space, rather than an equal number of bosonic and fermionic coordinates in configuration space. The classical phase space, spanned by variables $(x_i, p_i; \psi_i, \bar\psi_i)$ with $i = 1, \ldots, n$, then has double the number of fermionic variables compared to the minimal supersymmetric particle models. Such models can be constructed for systems with an $n$-dimensional bosonic configuration space. Their supercharges take the form

\[ Q = \left(p_i - i\Phi_i(x)\right)\psi_i, \qquad \bar{Q} = \left(p_i + i\Phi_i(x)\right)\bar\psi_i, \qquad x = (x_1, \ldots, x_n) \tag{68} \]

whilst the Hamiltonian becomes

\[ H = \tfrac{1}{2}\,p_i^2 + \tfrac{1}{2}\,\Phi_i^2(x) + \tfrac{1}{4}\left(\nabla_j\Phi_i + \nabla_i\Phi_j\right)\left(\psi_i\bar\psi_j - \bar\psi_i\psi_j\right) \tag{69} \]

The supercharges are conserved if the curl of $\Phi_i(x)$ vanishes: $\nabla_i\Phi_j - \nabla_j\Phi_i = 0$. It follows that at least locally there exists a single function $W(x)$ such that

\[ \Phi_i(x) = \nabla_i W(x) \tag{70} \]

$W(x)$ is called the superpotential. Defining the operators

\[ A_i = p_i - i\Phi_i(x), \qquad A_i^\dagger = p_i + i\Phi_i(x), \qquad A_iA_j^\dagger - A_j^\dagger A_i = \nabla_i\Phi_j + \nabla_j\Phi_i \tag{71} \]

the supersymmetric quantum theory is defined by

\[ Q = A_i f_i^\dagger, \qquad Q^\dagger = A_i^\dagger f_i, \qquad H = \tfrac{1}{2}\left(QQ^\dagger + Q^\dagger Q\right) \tag{72} \]

The Hamiltonian is the direct operator translation of the classical expression [69]; its normal-ordered form is

\[ H = \tfrac{1}{2}\,A_i^\dagger A_i + \tfrac{1}{2}\left(\nabla_i\Phi_j + \nabla_j\Phi_i\right) f_i^\dagger f_j \tag{73} \]

The total fermion number operator

\[ N_f = f_i^\dagger f_i \tag{74} \]

(summed over $i$), satisfying the commutation relations

\[ \left[N_f, f_j^\dagger\right] = f_j^\dagger, \qquad \left[N_f, f_j\right] = -f_j \tag{75} \]

commutes with the Hamiltonian. Hence, the stationary states can be labeled by the energy $E$ and the total fermion number $n_f = (0, \ldots, n)$. The energy spectrum being positive semidefinite, all positive-energy states occur in pairs of fermion number $(n_f, n_f + 1)$; zero-energy states exist only if the equations

\[ A_i f_i^\dagger\,|0, n_f\rangle = A_i^\dagger f_i\,|0, n_f\rangle = 0 \tag{76} \]

admit a normalizable solution. In this context, the vanishing of the curl of $\Phi_i(x)$ is important, as it is a necessary condition for the formal solutions

\[ \Psi_\pm(x) = C \exp\left(\pm\int_0^x \Phi(y)\cdot dy\right) = C'\,e^{\pm W(x)} \tag{77} \]

to be single-valued. If one of them is normalizable, there exists a zero-energy ground state with $n_f = 0$ or $n_f = n$, represented by a wave function:

\[ |0, 0\rangle \to \Psi_0(x, \xi) = \Psi_-(x), \qquad |0, n\rangle \to \Psi_n(x, \xi) = \Psi_+(x)\,\xi_1 \cdots \xi_n \tag{78} \]

Alternatively, we can represent the wave functions as spinors of dimension $2^n$, on which the fermion operators $f_i^\dagger$ and $f_i$ act as a $2^n$-dimensional matrix representation of the Clifford algebra with generators $\Gamma_a$, $a = 1, \ldots, 2n$, defined by

\[ \Gamma_i = f_i + f_i^\dagger, \qquad \Gamma_{i+n} = i\left(f_i - f_i^\dagger\right) \tag{79} \]

These operators indeed satisfy the anticommutation rule

\[ \Gamma_a\Gamma_b + \Gamma_b\Gamma_a = 2\delta_{ab} \tag{80} \]

Thus, the wave functions have $2^n$ components, as compared to the $2^{[n/2]}$ polarization states of the minimal models.

The Witten Index

We have noted that for supersymmetric quantum systems, like the harmonic and anharmonic supersymmetric oscillator, states exist in pairs of different fermion number, degenerate in energy, except for possibly one or more zero-energy states which are superinvariant in the sense that

\[ Q\,|0, n\rangle = \bar{Q}\,|0, n\rangle = 0 \;\Leftrightarrow\; H\,|0, n\rangle = 0 \tag{81} \]

In the Schrödinger representation, these states are characterized as zero modes of the Dirac operator:

\[ \gamma\cdot D\,\Psi = 0 \tag{82} \]

where $D_i$ is an ordinary or field-dependent (e.g., covariant) derivative. Clearly, the existence of such states can, in some cases, be guaranteed if there is no state which can pair up with a given state to form a superdoublet. Witten developed a topological characterization of this condition, encoded in an index defined by

\[ I = \mathrm{tr}\,(-1)^{N_f} = n_b(E = 0) - n_f(E = 0) \tag{83} \]

where $N_f$ is the fermion number operator, and $n_{b,f}(E = 0)$ are the numbers of bosonic and fermionic zero-energy states. The trace is taken over the complete space of states, but as all nonzero energy states occur in pairs of a bosonic and a fermionic state, their contributions to the trace cancel, having opposite sign. Therefore, the trace is actually only over the zero-energy states, and counts the number of bosonic states with positive sign, and the number of fermionic states with negative sign. If the index vanishes, $I = 0$, then any zero-energy states necessarily exist in equal numbers of bosonic and fermionic states; under perturbations of the potential, these states can form pairs and change their energy to a positive value. However, if the index does not vanish, $I \neq 0$, then there are states which have no

partner of complementary fermion number; these states can never get a nonzero energy under changes in the parameters of the potential, as long as the changes respect supersymmetry. Such systems, therefore, necessarily possess exact zero-energy states which are invariant under all supersymmetries.

Deformations of the potential respecting supersymmetry are those obtained by changing the parameters in the superpotential. The usefulness of this concept is, therefore, that the index for models with complicated superpotentials can be computed by comparing them with models with simple superpotentials having similar topological properties.

Counting the number of states is not always a simple procedure, in particular when the spectrum includes continuum states. Therefore, in practice one often needs a regularization procedure, by taking the trace over the full state space of the exponentially damped quantity

\[ I(\beta) = \mathrm{tr}\,(-1)^{N_f}\,e^{-\beta H} \tag{84} \]

and taking the limit $\beta \to 0$. The quantity [84] can be computed in terms of a path integral with periodic boundary conditions for the fermionic degrees of freedom.

Finally, as the wave function representation of supersymmetric quantum mechanics [82] links the Witten index to the space of zero modes of a Dirac operator, in particular cases it can be used to describe topological aspects of sigma models and gauge theories, and related mathematical quantities such as the Atiyah–Singer index.

More details and references to the original literature can be found in the reviews listed in the Further Reading section.

See also: Path-Integrals in Non Commutative Geometry; Supermanifolds.

Further Reading

Cooper F, Khare A, and Sukhatme U (1995) Supersymmetry and quantum mechanics. Physics Reports 251: 267.
De Witt BS (1984, 1992) Supermanifolds. Cambridge: Cambridge University Press.
Shifman MA (1999) ITEP Lectures on Particle Physics and Field Theory, vol. 1, ch. 4. Singapore: World Scientific.
van Holten JW (1996) D = 1 Supergravity and spinning particles. In: Jancewicz B and Sobczyk J (eds.) From Field Theory to Quantum Groups, p. 173. Singapore: World Scientific.
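As a closing illustration of the index [83] and its regularized form [84], the pairing argument can be tried out on a finite-dimensional toy model. This is only a sketch under stated assumptions, not the Schrödinger-operator setting of the article: the supercharge is taken to be a block matrix built from an arbitrary rectangular matrix `A` that maps a $k$-dimensional sector with fermion number 0 to an $m$-dimensional sector with fermion number 1. The helper names (`matmul`, `dagger`, `expm`, `susy_hamiltonian`, `witten_index`) are hypothetical and chosen for this example. Because the nonzero eigenvalues of $A^\dagger A$ and $AA^\dagger$ coincide, the damped trace should equal $k - m$ for every $\beta$.

```python
def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def dagger(X):
    """Hermitian conjugate of a nested-list matrix."""
    return [[complex(X[i][j]).conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]


def expm(X, terms=40, squarings=10):
    """exp(X) from a Taylor series of X / 2**squarings, then repeated squaring."""
    n = len(X)
    scale = 2.0 ** squarings
    Y = [[X[i][j] / scale for j in range(n)] for i in range(n)]
    result = [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[t / k for t in row] for row in matmul(term, Y)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def susy_hamiltonian(A):
    """Return (H, parity) for the toy supercharge Q = [[0, 0], [A, 0]].

    H = (1/2)(Q Q^dagger + Q^dagger Q); parity holds the diagonal of (-1)^{N_f}.
    """
    m, k = len(A), len(A[0])
    dim = k + m
    Q = [[0j] * dim for _ in range(dim)]
    for r in range(m):
        for c in range(k):
            Q[k + r][c] = complex(A[r][c])   # Q raises the fermion number by one
    Qd = dagger(Q)
    QQd, QdQ = matmul(Q, Qd), matmul(Qd, Q)
    H = [[0.5 * (QQd[i][j] + QdQ[i][j]) for j in range(dim)] for i in range(dim)]
    parity = [1.0] * k + [-1.0] * m
    return H, parity


def witten_index(A, beta):
    """Regularized index I(beta) = tr (-1)^{N_f} exp(-beta H) for the toy model."""
    H, parity = susy_hamiltonian(A)
    damp = expm([[-beta * h for h in row] for row in H])
    return sum(parity[i] * damp[i][i].real for i in range(len(H)))


if __name__ == "__main__":
    # Arbitrary 2 x 3 example: k = 3 bosonic and m = 2 fermionic basis states.
    A = [[1.0, 2.0j, 0.5], [0.3, -1.0, 1.0 + 1.0j]]
    for beta in (0.1, 1.0, 5.0):
        print(beta, witten_index(A, beta))
```

In this toy setting the result does not depend on the entries of `A`, only on its shape, which mirrors the statement that the index is insensitive to deformations that respect supersymmetry. For a square `A` the index is zero, so a nonzero index here needs sectors of different dimension.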

Supersymmetry Methods in Random Matrix Theory


M R Zirnbauer, Universität Köln, Köln, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A prominent theme of modern condensed matter physics is electronic transport, in particular the electrical conductivity, of disordered metallic systems at very low temperatures. From the Landau theory of weakly interacting Fermi liquids, one expects the essential aspects of the situation to be captured by the single-electron approximation. Mathematical models that have been proposed and studied in this context include random Schrödinger operators and band random matrices.

If the physical system has infinite size, two distinct possibilities exist: the quantum single-electron motion may either be bounded or unbounded. In the former case, the disordered electron system is an insulator; in the latter case, a metal with finite conductivity (if the electron motion is not critical but diffusive). Metallic behavior is expected for weakly disordered systems in three dimensions; insulating behavior sets in when the disorder strength is increased or the space dimension reduced.

The main theoretical tool used in the physics literature on the subject is the "supersymmetry method" pioneered by Wegner and Efetov (1979–83). Over the past 20 years, physicists have applied the method in many instances, and a rather complete picture of weakly disordered metals has emerged. Several excellent reviews of these developments are available in print.

From the perspective of mathematics, however, the method has not always been described correctly, and what is sorely lacking at present is an exposition of how to implement the method rigorously. (Unfortunately, the correct exposition by Schäfer and Wegner (1980) was largely ignored or forgotten by later authors.) In this article, an attempt is made to help remedy the situation, by giving a careful review of the Wegner–Efetov supersymmetry method for the case of Hermitian band random matrices.

Gaussian Ensembles

Let $V$ be a unitary vector space of finite dimension. A Hermitian random matrix model on $V$ is defined

by some probability distribution on $\mathrm{Herm}(V)$, the Hermitian linear operators on $V$. We may fix some orthonormal basis of $V$ and represent the elements $H$ of $\mathrm{Herm}(V)$ by Hermitian square matrices.

Quite generally, probability distributions are characterized by their Fourier transform or characteristic function. In the present case this is

\[ \Omega(K) = \left\langle e^{-i\,\mathrm{tr}\,HK} \right\rangle \]

where the Fourier variable $K$ is some other linear operator on $V$, and $\langle\ldots\rangle$ denotes the expectation value with respect to the probability distribution for $H$. Later, it will be important that, if $\Omega(K)$ is an analytic function of $K$, the matrix entries of $K$ need not be from $\mathbb{R}$ or $\mathbb{C}$ but can be taken from the even part of some exterior algebra.

The probability distributions to be considered in this article are Gaussian with zero mean, $\langle H\rangle = 0$. Their Fourier transform is also Gaussian:

\[ \Omega(K) = e^{-(1/2)\,J(K,K)} \]

with $J$ some quadratic form. We now describe $J$ for a large family of hierarchical models that includes the case of band random matrices.

Let $V$ be given a decomposition by orthogonal vector spaces:

\[ V = V_1 \oplus V_2 \oplus \cdots \oplus V_{|\Lambda|} \]

We should imagine that every vector space $V_i$ corresponds to one site $i$ of some lattice $\Lambda$, and the total number of sites is $|\Lambda|$. For simplicity, we take all dimensions to be equal: $\dim V_1 = \cdots = \dim V_{|\Lambda|} = N$. Thus, the dimension of $V$ is $N|\Lambda|$. The integer $N$ is called the number of orbitals per site.

If $\Pi_i$ is the orthogonal projector on the linear subspace $V_i \subset V$, we take the bilinear form $J$ to be

\[ J(K, K') = \sum_{i,j=1}^{|\Lambda|} J_{ij}\,\mathrm{tr}\left(\Pi_i K \Pi_j K'\right) \]

where the coefficients $J_{ij}$ are real, symmetric, and positive. This choice of $J$ implies invariance under the group $\mathcal{U}$ of unitary transformations in each subspace:

\[ \mathcal{U} = U(V_1) \times U(V_2) \times \cdots \times U(V_{|\Lambda|}) \]

Clearly, $\Omega(K) = \Omega(UKU^{-1})$ or, equivalently, the probability distribution for $H$ is invariant under conjugation $H \mapsto UHU^{-1}$, for $U \in \mathcal{U}$.

If $\{e_i^a\}_{a=1,\ldots,N}$ is an orthonormal basis of $V_i$, we define linear operators $E_{ij}^{ab} : V_j \to V_i$ by $E_{ij}^{ab}\,e_j^b = e_i^a$. By evaluating $J(E_{ij}^{ab}, E_{j'i'}^{b'a'}) = J_{ij}\,\delta_{ii'}\delta_{jj'}\delta^{aa'}\delta^{bb'}$, one sees that the matrix entries of $H$ all are statistically independent.

By varying the lattice $\Lambda$, the number of orbitals $N$, and the variances $J_{ij}$, one obtains a large class of Hermitian random matrix models, two prominent subclasses of which are the following:

1. For $|\Lambda| = 1$, one gets the Gaussian Unitary Ensemble (GUE). Its symmetry group is $\mathcal{U} = U(N)$, the largest one possible in dimension $N = \dim V$.
2. If $|i - j|$ denotes a distance function for $\Lambda$, and $f$ a rapidly decreasing positive function on $\mathbb{R}_+$ of width $W$, the choice $J_{ij} = f(|i - j|)$ with $N = 1$ gives an ensemble of band random matrices with bandwidth $W$ and symmetry group $\mathcal{U} = U(1)^{|\Lambda|}$.

Beyond being real, symmetric, and positive, the variances $J_{ij}$ are required to have two extra properties in order for all of the following treatment to go through:

• They must be positive as a quadratic form. This is to guarantee the existence of an inverse, which we denote by $w_{ij} = (J^{-1})_{ij}$.
• The off-diagonal matrix entries of the inverse must be nonpositive: $w_{ij} \le 0$ for $i \neq j$.

Basic Tools

Green's Functions

A major goal of random matrix theory is to understand the statistical behavior of the spectrum and the eigenstates of a random Hamiltonian $H$. Spectral and eigenstate information can be extracted from the Green's function, that is, from matrix elements of the operator $(z - H)^{-1}$ with complex parameter $z \in \mathbb{C}\setminus\mathbb{R}$. For the models at hand, the good objects to consider are averages of $\mathcal{U}$-invariant observables such as

\[ G_i^{(1)}(z) = \left\langle \mathrm{tr}\;\Pi_i (z - H)^{-1} \right\rangle \tag{1} \]

\[ G_{ij}^{(2)}(z_1, z_2) = \left\langle \mathrm{tr}\;\Pi_i (z_1 - H)^{-1}\,\Pi_j (z_2 - H)^{-1} \right\rangle \tag{2} \]

The discontinuity of $G_i^{(1)}(z)$ across the real $z$-axis yields the local density of states. In the limit of infinite volume ($|\Lambda| \to \infty$), the function $G_{ij}^{(2)}(z_1, z_2)$ for $z_1 = E + i\varepsilon$, $z_2 = E - i\varepsilon$, real energy $E$, and $\varepsilon > 0$ going to zero, gives information on transport, for example, the electrical conductivity by the Kubo–Greenwood formula.

Mathematically speaking, if $G_{ij}^{(2)}(E + i\varepsilon, E - i\varepsilon)$ is bounded (for infinite volume) in $\varepsilon$ and decreases algebraically with distance $|i - j|$ at $\varepsilon = 0+$, the

spectrum is absolutely continuous and the eigenstates are extended at energy $E$. On the other hand, a pure point spectrum and localized eigenstates are signaled by the behavior $G_{ij}^{(2)} \sim \varepsilon^{-1} e^{-\lambda|i-j|}$ with positive Lyapunov exponent $\lambda$.

Green's Functions from Determinants

For any pair of linear operators $A$, $B$ on a finite-dimensional vector space $V$, the following formula from basic linear algebra holds if $A$ has an inverse:

\[ \frac{d}{dt}\det(A + tB)\Big|_{t=0} = \det(A)\,\mathrm{tr}\left(A^{-1}B\right) \]

Using it with $A = z - H$ and $z \in \mathbb{C}\setminus\mathbb{R}$, all Green's functions can be expressed in terms of determinants; for example,

\[ G_{ij}^{(2)}(w, z) = \sum_{a,b=1}^{N} \frac{\partial^2}{\partial s\,\partial t} \left\langle \frac{\det(w - H)\,\det(z - H + tE_{ij}^{ab})}{\det(w - H - sE_{ji}^{ba})\,\det(z - H)} \right\rangle \Bigg|_{s=t=0} \]

It is clear that, given a formula of this kind, what one wants is a method to handle ensemble averages of ratios of determinants. This is what is reviewed in the sequel.

Determinants as Gaussian Integrals

Let the Hermitian scalar product of the unitary vector space $V$ be written as $\varphi_1, \varphi_2 \mapsto (\bar\varphi_1, \varphi_2)$, and denote the adjoint or Hermitian conjugate of a linear operator $A$ on $V$ by $A^*$. If $\mathrm{Re}\,A := (1/2)(A + A^*) > 0$, the standard Lebesgue integral of the Gaussian function $\varphi \mapsto e^{-(\bar\varphi, A\varphi)}$ makes sense and gives

\[ \int e^{-(\bar\varphi, A\varphi)} = \det A^{-1} \tag{3} \]

where it is understood that we are integrating with the Lebesgue measure on (the normed vector space) $V$ normalized by $\int e^{-(\bar\varphi, \varphi)} = 1$. The same integral with anticommuting $\psi$ instead of the (commuting) $\varphi \in V$ gives

\[ \int e^{-(\bar\psi, A\psi)} = \det A \tag{4} \]

This basic formula from the field theory of fermionic particles is a consequence of the integration over anticommuting variables actually being differentiation:

\[ \int d\bar\psi^1\,d\psi^1\, f(\bar\psi^1, \psi^1, \ldots) := \frac{\partial^2}{\partial\bar\psi^1\,\partial\psi^1}\, f(\bar\psi^1, \psi^1, \ldots) \]

Fermionic Variant

The supersymmetry method of random matrix theory is a theme with many variations. The first variation to be described is the "fermionic" one. To optimize the notation, we now write $d\mu_{N,J}(H)$ for the density of the Gaussian probability distribution of $H$:

\[ \langle F(H)\rangle = \int F(H)\,d\mu_{N,J}(H) \]

All determinants and traces appearing below will be taken over vector spaces that are clear from the context.

Let $z_1, \ldots, z_n$ be any set of $n$ complex numbers, put $z := \mathrm{diag}(z_1, \ldots, z_n)$ for later purposes, and consider

\[ \Omega^{\mathrm{ferm}}_{n,N}(z, J) = \int \prod_{\alpha=1}^{n} \det(z_\alpha - H)\,d\mu_{N,J}(H) \tag{5} \]

The supersymmetry method expresses this average of a product of determinants in an alternative way, by integrating over a "dual" measure as follows. Introducing an auxiliary unitary vector space $\mathbb{C}^n$, one associates with every site $i$ of the lattice $\Lambda$ an object $Q_i \in \mathrm{Herm}(\mathbb{C}^n)$, the space of Hermitian $n \times n$ matrices. If $dQ_i$ for $i = 1, \ldots, |\Lambda|$ are Lebesgue measures on $\mathrm{Herm}(\mathbb{C}^n)$, one puts $DQ = \mathrm{const.} \times \prod_i dQ_i$ and

\[ d\nu_{n,J}(Q) := e^{-(1/2)\sum_{i,j}(J^{-1})_{ij}\,\mathrm{tr}\,Q_iQ_j}\,DQ \tag{6} \]

The multiplicative constant in $DQ$ is fixed by requiring the density to be normalized: $\int d\nu_{n,J}(Q) = 1$. By completing the square, this Gaussian probability measure has the characteristic function

\[ \int e^{-i\sum_j \mathrm{tr}\,Q_jK_j}\,d\nu_{n,J}(Q) = e^{-(1/2)\sum_{ij} J_{ij}\,\mathrm{tr}\,K_iK_j} \]

where the Fourier variables $K_1, \ldots, K_{|\Lambda|}$ are $n \times n$ matrices with matrix entries taken from $\mathbb{C}$ or another commutative algebra.

The key relation of the fermionic variant of the supersymmetry method is that the expectation of the product of determinants [5] has another expression as

\[ \Omega^{\mathrm{ferm}}_{n,N}(z, J) = \int \prod_{j=1}^{|\Lambda|} \det{}^{N}(z - iQ_j)\,d\nu_{n,J}(Q) \tag{7} \]

($i = \sqrt{-1}$). The strategy of the proof is quite simple: one writes the determinants in both expressions for $\Omega^{\mathrm{ferm}}_{n,N}$ as Gaussian integrals over $nN|\Lambda|$ complex fermionic variables $\psi^1, \ldots, \psi^n$ (each $\psi^\alpha$ is a vector in $V$ with anticommuting coefficients), using the basic

formula [4]. The integrals then encountered are essentially the Fourier transforms of the distributions $d\mu_{N,J}(H)$ resp. $d\nu_{n,J}(Q)$. The result is

\[ \int e^{-\sum_\alpha z_\alpha(\bar\psi^\alpha, \psi^\alpha)}\; e^{-(1/2)\sum_{ij} J_{ij}\sum_{\alpha\beta}(\bar\psi^\alpha, \Pi_i\psi^\beta)(\bar\psi^\beta, \Pi_j\psi^\alpha)} \]

for both expressions of $\Omega^{\mathrm{ferm}}_{n,N}$. In other words, although the probability distributions $d\mu_{N,J}(H)$ and $d\nu_{n,J}(Q)$ are distinct (they are defined on different spaces), their characteristic functions coincide when evaluated on the Fourier variables $K = \sum_\alpha \psi^\alpha(\bar\psi^\alpha, \cdot\,)$ for $H$ and $(K_i)^{\alpha\beta} = (\bar\psi^\alpha, \Pi_i\psi^\beta)$ for $Q_i$. This establishes the claimed equality of the expressions [5] and [7] for $\Omega^{\mathrm{ferm}}_{n,N}(z, J)$.

What is the advantage of passing to the alternative expression by $d\nu_{n,J}(Q)$? The answer is that, while $H$ is made up of independent random variables, the new variables $Q_i$, called the Hubbard–Stratonovich field, are correlated: they interact through the "exchange" constants $w_{ij} = (J^{-1})_{ij}$. If that interaction creates enough collectivity, a kind of mean-field behavior results.

For the simple case of GUE ($|\Lambda| = 1$, $w_{11} = N/\lambda^2$) with $z_1 = \cdots = z_n = E$, one gets the relation

\[ \left\langle \det{}^{n}(E - H) \right\rangle = \int \det{}^{N}(E - iQ)\; e^{-(N/2\lambda^2)\,\mathrm{tr}\,Q^2}\,dQ \]

the right-hand side of which is easily analyzed by the steepest descent method in the limit of large $N$.

For band random matrices in the so-called ergodic regime, the physical behavior turns out to be governed by the constant mode $Q_1 = \cdots = Q_{|\Lambda|}$, a fact that can be used to establish GUE universality in that regime.

Bosonic Variant

The bosonic variant of the present method, due to Wegner, computes averages of products of determinants placed in the denominator:

\[ \Omega^{\mathrm{bos}}_{n,N}(z, J) = \int \prod_{\alpha=1}^{n} \det{}^{-1}(z_\alpha - H)\,d\mu_{N,J}(H) \tag{8} \]

where we now require $\mathrm{Im}\,z_\alpha \neq 0$ for all $\alpha = 1, \ldots, n$. Complications relative to the fermionic case arise from the fact that the integrand in [8] has poles. If one replaces the anticommuting vectors $\psi^\alpha$ by commuting ones $\varphi^\alpha$, and then simply repeats the previous calculation in a naive manner, one arrives at

\[ \Omega^{\mathrm{bos}}_{n,N}(z, J) \overset{?}{=} \int \prod_{j=1}^{|\Lambda|} \det{}^{-N}(z - Q_j)\,d\nu_{n,J}(Q) \tag{9} \]

where the integral is still over $Q_j \in \mathrm{Herm}(\mathbb{C}^n)$. The calculation is correct, and relation [9] therefore holds true, provided that the parameters $z_1, \ldots, z_n$ all lie in the same half (upper or lower) of the complex plane. To obtain information on transport properties, however, one needs parameters in both the upper and lower halves; see the paragraph following [2]. The general case to be addressed below is $\mathrm{Im}\,z_\alpha > 0$ for $\alpha = 1, \ldots, p$, and $\mathrm{Im}\,z_\alpha < 0$ for $\alpha = p + 1, \ldots, n$. Careful inspection of the steps leading to eqn [9] reveals a convergence problem for $0 < p < n$. In fact, [9] with $Q_j$ in $\mathrm{Herm}(\mathbb{C}^n)$ turns out to be false in that range. Learning how to resolve this problem is the main step toward mathematical mastery of the method. Let us therefore give the details.

If $s_\alpha := \mathrm{sgn}\,\mathrm{Im}\,z_\alpha$, the good (meaning convergent) Gaussian integral to consider is

\[ \int e^{\,i\sum_\alpha s_\alpha(\bar\varphi^\alpha, (z_\alpha - H)\varphi^\alpha)} = \prod_{\alpha=1}^{n} \det{}^{-1}\left(-is_\alpha(z_\alpha - H)\right) \]

To avoid carrying around trivial constants, we now assume $i^{(n-2p)N|\Lambda|} = 1$. Use of the characteristic function of the distribution for $H$ then gives

\[ \Omega^{\mathrm{bos}}_{n,N}(z, J) = \int e^{\,i\sum_\alpha s_\alpha z_\alpha(\bar\varphi^\alpha, \varphi^\alpha)}\; e^{-\frac{1}{2}\sum_{ij} J_{ij}\sum_{\alpha\beta} s_\alpha(\bar\varphi^\alpha, \Pi_i\varphi^\beta)\,s_\beta(\bar\varphi^\beta, \Pi_j\varphi^\alpha)} \tag{10} \]

The difficulty of analyzing this expression stems from the "hyperbolic" nature (due to the indefiniteness of the signs $s_\alpha = \pm 1$) of the term quartic in the $\varphi^\alpha$, $\bar\varphi^\alpha$.

Fyodorov's Method

The integrand for $\Omega^{\mathrm{bos}}$ is naturally expressed in terms of $n \times n$ matrices $M_i$ with matrix elements $(M_i)^{\alpha\beta} = (\bar\varphi^\alpha, \Pi_i\varphi^\beta)$. These matrices lie in $\mathrm{Herm}^+(\mathbb{C}^n)$, that is, they are non-negative as well as Hermitian. Fyodorov's idea was to introduce them as the new variables of integration. To do that step, recall the basic fact that, given two differentiable spaces $X$ and $Y$ and a smooth map $\psi : X \to Y$, a distribution $\mu$ on $X$ is pushed forward to a distribution $\psi(\mu)$ on $Y$ by $\psi(\mu)[f] := \mu[f \circ \psi]$, where $f$ is any test function on $Y$.

We apply this universal principle to the case at hand by identifying $X$ with $V^n$, and $Y$ with $(\mathrm{Herm}^+(\mathbb{C}^n))^{|\Lambda|}$, and $\psi$ with the mapping that sends

\[ (\varphi^1, \ldots, \varphi^n) \in X \quad\text{to}\quad (M_1, \ldots, M_{|\Lambda|}) \in Y \]

by $(M_i)^{\alpha\beta} = (\bar\varphi^\alpha, \Pi_i\varphi^\beta)$. On $X = V^n$ we are integrating with the product Lebesgue measure normalized by $\int e^{-\sum_\alpha(\bar\varphi^\alpha, \varphi^\alpha)} = 1$. We now want the push-forward of this flat measure (or distribution) by the mapping $\psi$. In general, the push-forward of a measure is not

guaranteed to have a density but may be singular (like a Dirac $\delta$-distribution). This is in fact what happens if $N < n$. The matrices $M_i$ then have less than the maximal rank, so they fail to be positive but possess zero eigenvalues, which implies that the flat measure on $X$ is pushed forward by $\psi$ into the boundary of $Y$. For $N \ge n$, on the other hand, the push-forward measure does have a density on $Y$; and that density is $\prod_{i=1}^{|\Lambda|}(\det M_i)^{N-n}\,dM_i$, as is seen by transforming to the eigenvalue representation and comparing Jacobians. The $dM_i$ are Lebesgue measures on $\mathrm{Herm}(\mathbb{C}^n)$, normalized by the condition

\[ \int_{M_i > 0} e^{-\mathrm{tr}\,M_i}\,(\det M_i)^{N-n}\,dM_i = \int e^{-\sum_\alpha(\bar\varphi^\alpha, \Pi_i\varphi^\alpha)} = 1 \]

Assembling the sign information for $\mathrm{Im}\,z_\alpha$ in a diagonal matrix $s := \mathrm{diag}(s_1, \ldots, s_n)$, and pushing the integral over $X$ forward to an integral over $Y$ with measure $DM := \prod_i dM_i$, we obtain Fyodorov's formula:

\[ \Omega^{\mathrm{bos}}_{n,N}(z, J) = \int_Y e^{-(1/2)\sum_{ij} J_{ij}\,\mathrm{tr}(sM_i sM_j)}\; e^{\sum_k \mathrm{tr}\left(iszM_k + (N-n)\ln M_k\right)}\,DM \tag{11} \]

This formula has a number of attractive features. One is ease of derivation, another is ready generalizability to the case of non-Gaussian distributions. The main disadvantage of the formula is that it does not apply to the case of band random matrices (because of the restriction $N \ge n$); nor does it combine nicely with the fermionic formula [7] to give a supersymmetric formalism, as one formula is built on $J_{ij}$ and the other on $w_{ij}$.

Note that [11] clearly displays the dependence on the signature of $\mathrm{Im}\,z$: you cannot remove the $s_1, \ldots, s_n$ from the integrand without changing the domain of integration $Y = (\mathrm{Herm}^+(\mathbb{C}^n))^{|\Lambda|}$. This important feature is missing from the naive formula [9].

Setting $q = n - p$, let $U(p, q)$ be the pseudounitary group of complex $n \times n$ matrices $T$ with inverse $T^{-1} = sT^*s$. Since $|\det T| = 1$ for $T \in U(p, q)$, the integration domain $Y$ and density $DM = \prod_i dM_i$ of Fyodorov's formula are invariant under $U(p, q)$ transformations $M_i \mapsto TM_iT^*$, and so is actually the integrand in the limit where all parameters $z_1, \ldots, z_n$ become equal. Thus, the elements of $U(p, q)$ are global symmetries in that limit. This observation holds the key to another method of transforming the expression [10].

The Method of Schäfer and Wegner

To rescue the naive formula [9], what needs to be abandoned is the integration domain $\mathrm{Herm}(\mathbb{C}^n)$ for the matrices $Q_i$. The good domain to use was constructed by Schäfer and Wegner, but was largely forgotten in later physics work.

Writing $(M_k)^{\alpha\beta} = (\bar\varphi^\alpha, \Pi_k\varphi^\beta)$ as before, consider the function

\[ F_M(Q) = e^{(1/2)\sum_{ij} w_{ij}\,\mathrm{tr}(sQ_i + iz)(sQ_j + iz) - \sum_k \mathrm{tr}\,M_kQ_k} \tag{12} \]

viewed as a holomorphic function of

\[ Q = (Q_1, \ldots, Q_{|\Lambda|}) \in \mathrm{End}(\mathbb{C}^n)^{|\Lambda|} \]

If the Gaussian integral $\int_Q F_M(Q)\,DQ$ with holomorphic density $DQ = \prod_i dQ_i$ is formally carried out by completing the square, one gets the integrand of [10]. This is just what we want, as it would allow us to pass to a $Q$-matrix formulation akin to the one of the previous section. But how can that formal step be made rigorous? To that end, one needs to (1) construct a domain on which $|F_M(Q)|$ decreases rapidly so that the integral exists, and (2) justify completion of the square and shifting of variables.

To begin, take the absolute value of $F_M(Q)$. Putting $(1/2)(Q_j + Q_j^*) =: \mathrm{Re}\,Q_j$ and $(1/2i)(Q_j - Q_j^*) =: \mathrm{Im}\,Q_j$, we have $|F_M| = e^{-(1/4)(f_1 + f_2 + f_3)}$ with

\[ f_1(Q) = \sum_{ij} w_{ij}\,\mathrm{tr}(s\,\mathrm{Im}\,Q_i + z)(s\,\mathrm{Im}\,Q_j + z) + \mathrm{c.c.} \]

\[ f_2(Q) = -2\sum_{ij} w_{ij}\,\mathrm{tr}(s\,\mathrm{Re}\,Q_i)(s\,\mathrm{Re}\,Q_j) \]

\[ f_3(Q) = 4\sum_i \mathrm{tr}\Big(M_i + s\,\mathrm{Im}\,z\sum_j w_{ij}\Big)\mathrm{Re}\,Q_i \]

These expressions suggest making the following choice of integration domain for $Q_i$ ($i = 1, \ldots, |\Lambda|$). Pick some real constant $\lambda > 0$ and put

\[ \mathrm{Re}\,Q_i = \lambda\,T_iT_i^*, \qquad \mathrm{Im}\,Q_i = P_i := \begin{pmatrix} P_i^+ & 0 \\ 0 & P_i^- \end{pmatrix} \]

with $T_i \in U(p, q)$, $P_i^+ \in \mathrm{Herm}(\mathbb{C}^p)$, $P_i^- \in \mathrm{Herm}(\mathbb{C}^q)$. The set of matrices $Q_i$ so defined is referred to as the Schäfer–Wegner domain $X_\lambda^{p,q}$. The range of the field $Q = (Q_1, \ldots, Q_{|\Lambda|})$ is the direct product $X := (X_\lambda^{p,q})^{|\Lambda|}$.

To show that this is a good choice of domain, we first of all show convergence of the integral $\int_X F_M(Q)\,DQ$. The matrices $P_i$ commute with $s$, so

\[ f_1(Q)\big|_X = 2\,\mathrm{Re}\sum_{ij} w_{ij}\,\mathrm{tr}(P_i + sz)(P_j + sz) \]

Since the coefficients $w_{ij}$ are positive as a quadratic form, this expression is convex (with a positive Hessian) in the Hermitian matrices $P_i$. Second, the function

\[ f_2(Q)\big|_X = -2\lambda^2\sum_{ij} w_{ij}\,\mathrm{tr}\;T_iT_i^*\left(T_jT_j^*\right)^{-1} \]

is bounded from below by the constant $-2\lambda^2 n\sum_i w_{ii}$. This holds true because $w_{ij}$ is nonpositive for $i \neq j$, and because $T_iT_i^* > 0$ and the trace of a product of two positive Hermitian matrices is always positive. Third,

\[ f_3(Q)\big|_X = 4\lambda\sum_i \mathrm{tr}\Big(M_i + s\,\mathrm{Im}\,z\sum_j w_{ij}\Big)T_iT_i^* \]

is positive, as $(\ldots)$ is positive Hermitian. As long as $s\,\mathrm{Im}\,z > 0$, the function $f_3$ goes to infinity for all possible directions of taking the $T_i$ to infinity on $U(p, q)$.

Thus, when the matrices $Q_i$ are taken to vary on the Schäfer–Wegner domain $X_\lambda^{p,q}$, the absolute value $|F_M| = e^{-(1/4)(f_1 + f_2 + f_3)}$ decreases rapidly at infinity. This establishes the convergence of $\int_X F_M(Q)\,DQ$.

Next, let us count dimensions. The mapping $T \mapsto TT^*$ for $T \in U(p, q) =: G$ is invariant under right multiplication of $T$ by elements of the unitary subgroup $H := U(p) \times U(q)$; it is called the "Cartan embedding" of $G/H$ into $G$. The real manifold $G/H$ has dimension $2pq$ and so does its image under the Cartan embedding. Augmenting this by the dimension of $\mathrm{Herm}(\mathbb{C}^p)$ and $\mathrm{Herm}(\mathbb{C}^q)$ (from $P_i$), one gets $\dim X_\lambda^{p,q} = 2pq + p^2 + q^2 = (p + q)^2 = n^2$, which is as it should be.

Finally, why can one shift variables and do the Gaussian integral over $Q$ (with translation-invariant $DQ$) by completing the square? This question is legitimate as the Schäfer–Wegner domain $X_\lambda^{p,q}$ lacks invariance under the required shift, which is $Q_i \mapsto Q_i - isz + \sum_j J_{ij}\,sM_js$.

To complete the square in [12], introduce a parameter $t \in [0, 1]$ and consider the family of shifts

\[ Q_i \mapsto Q_i + t\Big(-isz + \sum_j J_{ij}\,sM_js\Big) \]

For fixed $t$, this shift takes $X = (X_\lambda^{p,q})^{|\Lambda|}$ into another domain, $X(t)$. Inspection shows that the function [12] still decreases rapidly (uniformly in the $M_i$) on $X(t)$, as long as $t < 1$. Without changing the integral, one can add pieces to $X(t)$ (for $t < 1$) at infinity to arrange for the chain $X - X(t)$ to be a cycle. Because $X(t)$ is homotopic to $X(0) = X$, this cycle is a boundary: there exists a manifold $Y(t)$ of dimension $\dim X + 1$ such that $\partial Y(t) = X - X(t)$. Viewed as a holomorphic differential form of degree $(n^2|\Lambda|, 0)$ in the complex space $\mathrm{End}(\mathbb{C}^n)^{|\Lambda|}$, the integrand $\omega := F_M(Q)\,DQ$ is closed (i.e., $d\omega = 0$). Therefore, by Stokes' theorem,

\[ \int_X \omega - \int_{X(t)} \omega = \int_{\partial Y(t)} \omega = \int_{Y(t)} d\omega = 0 \]

which proves $\int_{X(t)} F_M(Q)\,DQ = \int_X F_M(Q)\,DQ$, independent of $t$. (This argument does not go through for the nonrigorous choice $sQ_i := T_iP_iT_i^{-1}$ usually made!)

In the limit $t \to 1$, one encounters the expression

\[ \int_{X(1)} F_M(Q)\,DQ = \int_X d\nu_{n,J}(-isQ) \times e^{-(1/2)\sum_{ij} J_{ij}\,\mathrm{tr}(sM_i sM_j) + i\sum_k \mathrm{tr}(szM_k)} \]

with $d\nu_{n,J}$ as in [6]. The normalization integral over $X$ is defined by taking the Hermitian matrices $P_i$ to be the inner variables of integration. The outer integrals over the $T_i$ then demonstrably exist, and one can fix the (otherwise arbitrary) normalization of $DQ$ by setting $\int_X d\nu_{n,J}(-isQ) = 1$. Making that choice, and comparing with [10], one has proved

\[ \Omega^{\mathrm{bos}}_{n,N} = \int_{\varphi, \bar\varphi}\left(\int_X F_{(M_i)^{\alpha\beta} = (\bar\varphi^\alpha, \Pi_i\varphi^\beta)}(Q)\,DQ\right) \]

The final step is to change the order of integration over the $Q$- and $\varphi$-variables, which is permitted since the $Q$-integral converges uniformly in $\varphi$. Doing the Gaussian $\varphi$-integral and shifting $Q_k \to Q_k - isz$, one arrives at the Schäfer–Wegner formula for $\Omega^{\mathrm{bos}}_{n,N}$:

\[ \Omega^{\mathrm{bos}}_{n,N}(z, w^{-1}) = \int_X e^{(1/2)\sum_{ij} w_{ij}\,\mathrm{tr}(sQ_i sQ_j)}\; e^{-N\sum_k \mathrm{tr}\ln(Q_k - isz)}\,DQ \tag{13} \]

which is a rigorous version of the naive formula [9]. Compared to Fyodorov's formula, it has the disadvantage of not being manifestly invariant under global hyperbolic transformations $Q_i \mapsto TQ_iT^*$ (the integration domain $X$ is not invariant). Its best feature is that it does apply to the case of band random matrices with one orbital per site ($N = 1$).

Supersymmetric Variant

We are now in a position to tackle the problem of averaging ratios of determinants. For concreteness, we shall discuss the case where the number of determinants is two for both the numerator and the denominator, which is what is needed for the calculation of the function $G_{ij}^{(2)}(z_1, z_2)$ defined in eqn [2]. We will consider the case of relevance for the electrical conductivity: $z_1 = E + i\eta$, $z_2 = E - i\eta$, with $E \in \mathbb{R}$ and $\eta > 0$.

A $Q$-integral formula for $G_{ij}^{(2)}(z_1, z_2)$ can be derived by combining the fermionic method for

\[ \left\langle \det(z_1 - H)\,\det\left(z_2 - H + t_2E_{ij}^{ab}\right) \right\rangle \]

with the Schäfer–Wegner bosonic formalism for

$$
\left\langle \det{}^{-1}\!\left(z_1 - H - t_1 E^{ba}_{ji}\right) \det{}^{-1}(z_2 - H) \right\rangle
$$

and eventually differentiating with respect to $t_1, t_2$ at $t_1 = t_2 = 0$ and summing over $a, b$; see the subsection "Green's functions from determinants." All steps are formally the same as before, but with traces and determinants replaced by their supersymmetric analogs. Having given a great many technical details in the last two sections, we now just present the final formula along with the necessary definitions and some indication of the new elements involved in the proof.
Let each of $Q_{BB}$, $Q_{FF}$, $Q_{BF}$, and $Q_{FB}$ stand for a $2 \times 2$ matrix. If the first two matrices have commuting entries and the last two anticommuting ones, they combine to a $4 \times 4$ supermatrix:

$$
Q = \begin{pmatrix} Q_{BB} & Q_{BF} \\ Q_{FB} & Q_{FF} \end{pmatrix}
$$

Relevant operations on supermatrices are the supertrace,

$$
\mathrm{Str}\, Q = \mathrm{tr}\, Q_{BB} - \mathrm{tr}\, Q_{FF}
$$

and the superdeterminant,

$$
\mathrm{Sdet}\, Q = \frac{\det(Q_{BB})}{\det\!\left(Q_{FF} - Q_{FB}\, Q_{BB}^{-1}\, Q_{BF}\right)}
$$

These are related by the identity $\mathrm{Sdet} = \exp \circ\, \mathrm{Str} \circ \ln$ whenever the superdeterminant exists and is nonzero.
In the process of applying the method described earlier, a supermatrix $Q_i$ gets introduced at every site $i$ of the lattice $\Lambda$. The domain of integration for each of the matrix blocks $(Q_i)_{BB}$ ($i = 1, \ldots, |\Lambda|$) is taken to be the Schäfer–Wegner domain $X^{\lambda}_{1,1}$ (with some choice of $\lambda > 0$); the integration domain for each of the $(Q_i)_{FF}$ is the space of Hermitian $2 \times 2$ matrices, as before.
Let $E^{11}_{BB}$ be the $4 \times 4$ (super)matrix with unit entry in the upper-left corner and zeros elsewhere; similarly, $E^{22}_{FF}$ has unity in the lower-right corner and zeros elsewhere. Putting $s = \mathrm{diag}(1, -1, 1, 1)$ and $z = \mathrm{diag}(z_1, z_2, z_1, z_2)$, the supersymmetric Q-integral formula for the generating function of $G^{(2)}_{ij}$ (obtained by combining the Schäfer–Wegner bosonic method with the fermionic variant) is written as

$$
\left\langle \frac{\det(z_1 - H)\,\det\!\left(z_2 - H + t_2 E^{ab}_{ij}\right)}{\det\!\left(z_1 - H - t_1 E^{ba}_{ji}\right)\det(z_2 - H)} \right\rangle
= \int DQ\; e^{\frac{1}{2}\sum_{kl} w_{kl}\,\mathrm{Str}(sQ_k sQ_l)}
\times e^{-\mathrm{Str}\ln\left(\sum_{r,c}(Q_r - \mathrm{i}sz)\otimes E^{cc}_{rr} \,+\, \mathrm{i}t_1 E^{11}_{BB}\otimes E^{ba}_{ji} \,-\, \mathrm{i}t_2 E^{22}_{FF}\otimes E^{ab}_{ij}\right)} \tag{14}
$$

where the second supertrace includes a sum over sites and orbitals, and on setting $t_1 = t_2 = 0$ becomes

$$
e^{-N\sum_r \mathrm{Str}\ln(Q_r - \mathrm{i}sz)} = \prod_r \mathrm{Sdet}^{-N}(Q_r - \mathrm{i}sz)
$$

The superintegral "measure" $DQ = \prod_r DQ_r$ is the flat Berezin form, that is, the product of differentials for all the commuting matrix entries in $(Q_r)_{BB}$ and $(Q_r)_{FF}$, times the product of derivatives for all the anticommuting matrix entries in $(Q_r)_{BF}$ and $(Q_r)_{FB}$.
To prove the formula [14], two new tools are needed, a brief account of which is as follows.

Gaussian Superintegrals

There exists a supersymmetric generalization of the Gaussian integration formulas given in the subsection "Determinants as Gaussian integrals": if $A, D$ ($B, C$) are linear operators or matrices with commuting (resp., anticommuting) entries, and $\mathrm{Re}\,A > 0$, one has

$$
\mathrm{Sdet}^{-1}\begin{pmatrix} A & B \\ C & D \end{pmatrix}
= \int e^{-(\bar\varphi, A\varphi) - (\bar\varphi, B\psi) - (\bar\psi, C\varphi) - (\bar\psi, D\psi)}
$$

Verification of this formula is straightforward. Using it, one writes the last factor in [14] as a Gaussian superintegral over four vectors: $\varphi_1$, $\varphi_2$, $\psi_1$, and $\psi_2$. The integrand then becomes Gaussian in the matrices $Q_r$.

Shifting Variables

The next step in the proof is to do the "Gaussian" integral over the supermatrices $Q_r$. By definition, in a superintegral, one first carries out the Fermi integral, and afterwards the ordinary integrations. The Gaussian integral over the anticommuting parts $(Q_r)_{BF}$ and $(Q_r)_{FB}$ is readily done by completing the square and shifting variables using the fact that fermionic integration is differentiation:

$$
\int d\xi\, f(\xi - \xi_0) = \frac{\partial}{\partial \xi} f(\xi - \xi_0) = \int d\xi\, f(\xi)
$$

Similarly, the Gaussian integral over the Hermitian matrices $(Q_r)_{FF}$ is done by completing the square and shifting. The integral over $(Q_r)_{BB}$, however, is not Gaussian, as the domain is not $\mathbb{R}^n$ but the Schäfer–Wegner domain. Here, more advanced calculus is required: these integrations are done by using a supersymmetric change-of-variables theorem due to Berezin to make the necessary shifts by nilpotents. (There is not enough space to describe this here, so please consult Berezin's (1987) book.) Without difficulty, one finds the result to agree with the left-hand side of eqn [14], thereby establishing that formula.
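The shift invariance of the fermionic (Berezin) integral used in the subsection on shifting variables can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the helper names (`gmul`, `gadd`, `gscale`, `berezin`) and the test function $f(x) = 2 + 3x + 5\eta x$ are hypothetical and not taken from the article. It stores an element of a Grassmann algebra as a dictionary from bitmasks (sets of generators, written in increasing order) to coefficients, and implements the integral over one generator as the left derivative.

```python
def gmul(a, b):
    """Multiply two Grassmann elements stored as {bitmask: coefficient}."""
    out = {}
    for ma, ca in a.items():
        for mb, cb in b.items():
            if ma & mb:                 # a repeated generator squares to zero
                continue
            # Sign = parity of the pairs (i in ma, j in mb) with i > j.
            swaps, rest = 0, mb
            while rest:
                low = rest & -rest      # lowest generator still in mb
                swaps += bin(ma & ~(2 * low - 1)).count("1")
                rest ^= low
            key = ma | mb
            out[key] = out.get(key, 0) + (-1) ** swaps * ca * cb
    return {k: c for k, c in out.items() if c != 0}


def gadd(*terms):
    """Add Grassmann elements, dropping zero coefficients."""
    out = {}
    for term in terms:
        for k, c in term.items():
            out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}


def gscale(c, a):
    """Multiply a Grassmann element by an ordinary number."""
    return {k: c * v for k, v in a.items() if c * v != 0}


def berezin(a, bit):
    """Berezin integral over one generator, i.e. the left derivative."""
    out = {}
    for m, c in a.items():
        if m & bit:
            # Anticommute the derivative past the generators standing to the left.
            sign = (-1) ** bin(m & (bit - 1)).count("1")
            out[m ^ bit] = out.get(m ^ bit, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}


# Generators: xi (bit 1), xi0 (bit 2), eta (bit 4); ONE is the unit element.
ONE, XI, XI0, ETA = {0: 1}, {1: 1}, {2: 1}, {4: 1}


def f(x):
    """Hypothetical test function f(x) = 2 + 3 x + 5 eta x."""
    return gadd(gscale(2, ONE), gscale(3, x), gscale(5, gmul(ETA, x)))


shifted = gadd(XI, gscale(-1, XI0))     # xi - xi0
lhs = berezin(f(shifted), 1)            # integral of f(xi - xi0) over xi
rhs = berezin(f(XI), 1)                 # integral of f(xi) over xi
print(lhs == rhs)
```

Both integrals evaluate to $3 - 5\eta$: the terms introduced by the shift contain no factor of $\xi$ and are annihilated by the derivative, which is the mechanism behind completing the square in the anticommuting blocks.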

Approximations

All manipulations so far have been exact and, in fact, rigorous (or can be made so with little extra effort). Now we turn to a sequence of approximations that have been used by physicists to develop a quantitative understanding of weakly disordered quantum dots, wires, films, etc. While physically satisfactory, not all of these approximations are under full mathematical control. We will briefly comment on their validity as we go along.

Saddle-Point Manifold

We continue to consider $G^{(2)}_{kl}(E + \mathrm{i}\varepsilon, E - \mathrm{i}\varepsilon)$ and focus on $E = 0$ (the center of the energy band) for simplicity. By varying the exponent on the right-hand side of [14] and setting the variation to zero one obtains, for $t_1 = t_2 = 0$,

$$
\sum_j w_{ij}\, sQ_j s - N Q_i^{-1} = 0
$$

which is called the saddle-point equation.
Let us now assume translational invariance, $w_{ij} = f(|i - j|)$. Then, if $\lambda = \sqrt{N / \sum_j w_{ij}}$, the saddle-point equation has $i$-independent solutions of the form

$$
Q_i = \lambda \begin{pmatrix} q_{BB} & 0 \\ 0 & q_{FF} \end{pmatrix}
$$

where for $q_{FF}$ there are three possibilities: two isolated points $q_{FF} = \pm 1$ (unit matrix) coexist with a manifold

$$
q_{FF} = \begin{pmatrix} \cos\theta_1 & \sin\theta_1\, e^{\mathrm{i}\phi_1} \\ \sin\theta_1\, e^{-\mathrm{i}\phi_1} & -\cos\theta_1 \end{pmatrix} \tag{15}
$$

which is two-dimensional; whereas the solution space for $q_{BB}$ consists of a single connected 2-manifold:

$$
q_{BB} = \begin{pmatrix} \cosh\theta_0 & \sinh\theta_0\, e^{\mathrm{i}\phi_0} \\ \sinh\theta_0\, e^{-\mathrm{i}\phi_0} & \cosh\theta_0 \end{pmatrix} \tag{16}
$$

The solutions $q_{FF} = \pm 1$ are usually discarded in the physics literature. (The argument is that they break supersymmetry and therefore get suppressed by fermionic zero modes. For the simpler case of the one-point function [1] and in three space dimensions, such suppression has recently been proved by Disertori, Pinson, and Spencer.) Other solutions for $q_{BB}$ are ruled out by the requirement $\mathrm{Re}\, Q_i > 0$ for the Schäfer–Wegner domain.
The set of matrices [16] and [15] (the "saddle-point manifold") is diffeomorphic to the product of a 2-hyperboloid $H^2$ with a 2-sphere $S^2$. Moving along that manifold $M := H^2 \times S^2$ leaves the Q-field integrand [14] unchanged (for $z_1 = z_2 = t_1 = t_2 = 0$).
One can actually anticipate the existence of such a manifold from the symmetries at hand. These are most transparent in the starting point of the formalism as given by the characteristic function $\langle e^{-\mathrm{i}K_H} \rangle$ with

$$
K_H = (\bar\varphi_1, H\varphi_1) - (\bar\varphi_2, H\varphi_2) + (\bar\psi_1, H\psi_1) + (\bar\psi_2, H\psi_2)
$$

The signs of this quadratic expression are what is encoded in the signature matrix $s = \mathrm{diag}(1, -1, 1, 1)$ (recall that the first two entries are forced by $\mathrm{Im}\, z_1 > 0$ and $\mathrm{Im}\, z_2 < 0$). The Hermitian form $K_H$ is invariant under the product of two Lie groups: $U(1,1)$ acting on the $\varphi$'s, and $U(2)$ acting on the $\psi$'s. This invariance gets transferred by the formalism to the Q-side; the saddle-point manifold $M$ is in fact an "orbit" of the group action of $G := U(1,1) \times U(2)$ on the Q-field. In the language of physics, the degrees of freedom of $M$ correspond to the Goldstone bosons of a broken symmetry.
$K_H$ also has some supersymmetries, mixing $\varphi$'s with $\psi$'s. At the infinitesimal level, these combine with the generators of $G$ to give a Lie superalgebra of symmetries $\mathfrak{g} := \mathfrak{u}(1,1|2)$. One therefore expects some kind of saddle-point supermanifold, say $\mathcal{M}$, on the Q-side.
$\mathcal{M}$ can be constructed by extending the above solution $q_0 := \mathrm{diag}(q_{BB}, q_{FF})$ of the dimensionless saddle-point equation $sqs = q^{-1}$ to the full $4 \times 4$ supermatrix space. Putting $q = q_0 + q_1$ with

$$
q_1 = \begin{pmatrix} 0 & q_{BF} \\ q_{FB} & 0 \end{pmatrix}
$$

and linearizing in $q_1$, one gets

$$
s q_1 s = -q_0^{-1}\, q_1\, q_0^{-1} \tag{17}
$$

The solution space of this linear equation for $q_1$ has dimension 4 for all $q_0 \in M$. Based on it, one expects four Goldstone fermions to emerge along with the four Goldstone bosons of $M$.
For the simple case under consideration, one can introduce local coordinates and push the analysis to nonlinear order, but things get quickly out of hand (when done in this way) for more challenging, higher-rank cases. Fortunately, there exists an alternative, coordinate-independent approach, as the mathematical object to be constructed is completely determined by symmetry!

Riemannian Symmetric Superspace

The linear equation [17] associates with every point $x \in M$ a four-dimensional vector space of solutions,

$V_x$. As the point $x$ moves on $M$ the vector spaces $V_x$ turn and twist; thus, they form what is called a vector bundle $V$ over $M$. (The bundle at hand turns out to be nontrivial, i.e., there exists no global choice of coordinates for it.)
A section of $V$ is a smooth mapping $\nu : M \to V$ such that $\nu(x) \in V_x$ for all $x \in M$. The sections of $V$ are to be multiplied in the exterior sense, as they represent anticommuting degrees of freedom; hence the proper object to consider is the exterior bundle, $\wedge V$.
It is a beautiful fact that there exists a unique action of the Lie superalgebra $\mathfrak{g}$ on the sections of $\wedge V$ by first-order differential operators, or derivations for short. (Be advised however that this canonical $\mathfrak{g}$-action is not well known in physics or mathematics.)
The manifold $M$ is a symmetric space, that is, a Riemannian manifold with $G$-invariant geometry. Its metric tensor, $g$, uniquely extends to a second-rank tensor field (still denoted by $g$) which maps pairs of derivations of $\wedge V$ to sections of $\wedge V$, and is invariant with respect to the $\mathfrak{g}$-action. This collection of objects (the symmetric space $M$, the exterior bundle $\wedge V$ over it, the action of the Lie superalgebra $\mathfrak{g}$ on the sections of $\wedge V$, and the $\mathfrak{g}$-invariant second-rank tensor $g$) forms what the author calls a "Riemannian symmetric superspace," $\mathcal{M}$.

Nonlinear Sigma Model

According to the Landau–Ginzburg–Wilson (LGW) paradigm of the theory of phase transitions, the large-scale physics of a statistical mechanical system near criticality is expected to be controlled by an effective field theory for the long-wavelength excitations of the order parameter of the system. Wegner is credited with the profound insight that the LGW paradigm applies to the random matrix situation at hand, with the role of the order parameter being taken by the matrix $Q$. He argued that transport observables (such as the electrical conductivity) are governed by slow spatial variations of the Q-field inside the saddle-point manifold. Efetov skilfully implemented this insight in a supersymmetric variant of Wegner's method.
While the direct construction of the effective continuum field theory by gradient expansion of [14] is not an entirely easy task, the outcome of the calculation is predetermined by symmetry. On general grounds, the effective field theory has to be a nonlinear sigma model for the Goldstone bosons and fermions of $\mathcal{M}$: if $\{\phi^A\}$ are local coordinates for the bundle $V$ with metric $g_{AB}(\phi) = g(\partial/\partial\phi^A, \partial/\partial\phi^B)$, the action functional is

$$
S = \sigma \int d^d x\; \partial_\mu \phi^A\, g_{AB}(\phi)\, \partial_\mu \phi^B
$$

The coupling parameter $\sigma$ has the physical meaning of bare (i.e., unrenormalized) conductivity. In the present model $\sigma = N W^2 a^{2-d}$, where $W$ is essentially the width of the band random matrix in units of the lattice spacing $a$ (the short-distance cutoff of the continuum field theory). $S$ is the effective action in the limit $z_1 = z_2$. For a finite frequency $\omega = z_1 - z_2$, a symmetry-breaking term of the form $\mathrm{i}\omega\nu \int d^d x\, f(\phi)$, where $\nu = N(\pi\lambda)^{-1} a^{-d}$ is the local density of states, has to be added to $S$.
By perturbative renormalization group analysis, that is, by integrating out the rapid field fluctuations, one finds for $d = 2$ that $\sigma$ decreases on increasing the cutoff $a$. This property is referred to as "asymptotic freedom" in field theory. On its basis one expects exponentially decaying correlations, and hence localization of all states, in two dimensions. However, a mathematical proof of this conjecture is not currently available.
In three dimensions and for a sufficiently large bare conductivity, the renormalization flow goes toward the metallic fixed point ($\sigma \to \infty$), where $G$-symmetry is broken spontaneously. A rigorous proof of this important conjecture (existence of disordered metals in three space dimensions) is not available either.

Zero-Mode Approximation

For a system in a box of linear size $L$, the cost of exciting fluctuations in the sigma model field is estimated as the Thouless energy $E_{\mathrm{Th}} = \sigma / \nu L^2$. In the limit of small frequency, $|\omega| \ll E_{\mathrm{Th}}$, the physical behavior is dominated by the constant modes $\phi^A(x) = \phi^A$ (independent of $x$). By computing the integral over these modes, Efetov found the energy-level correlations in the small-frequency limit to be those of the GUE.

See also: Random Matrix Theory in Physics; Symmetry Classes in Random Matrix Theory.

Further Reading

Berezin FA (1987) Introduction to Superanalysis. Dordrecht: Reidel.
Disertori M, Pinson H, and Spencer T (2002) Density of states of random band matrices. Communications in Mathematical Physics 232: 83–124.
Efetov KB (1997) Supersymmetry in Disorder and Chaos. Cambridge: Cambridge University Press.
Fyodorov YV (2002) Negative moments of characteristic polynomials of random matrices: Ingham–Siegel integral as an alternative to Hubbard–Stratonovich transformation. Nuclear Physics B 621: 643–674.
Mirlin AD (2000) Statistics of energy levels and eigenfunctions in disordered systems. Physics Reports 326: 260–382.
Schäfer L and Wegner F (1980) Disordered system with n orbitals per site: Lagrange formulation, hyperbolic symmetry, and Goldstone modes. Zeitschrift für Physik B 38: 113–126.
Wegner F (1979) The mobility edge problem: continuous symmetry and a conjecture. Zeitschrift für Physik B 35: 207–210.
Zirnbauer MR (1996) Riemannian symmetric superspaces and their origin in random matrix theory. Journal of Mathematical Physics 37: 4986–5018.

Symmetric Hyperbolic Systems and Shock Waves


S Kichenassamy, Université de Reims Champagne-Ardenne, Reims, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Many systems of partial differential equations arising in mathematical physics and differential geometry are quasilinear: the top-order derivatives enter only linearly. They may be cast in the form of first-order systems by introducing, if needed, derivatives of the unknowns as additional unknowns. For such systems, the theory of symmetric–hyperbolic (SH) systems provides a unified framework for proving the local existence of smooth solutions if the initial data are smooth. It is also convenient for constructing numerical schemes, and for studying shock waves. Despite what the name suggests, the impact of the theory of SH systems is not limited to hyperbolic problems, two examples being Tricomi's equation, and equations of Cauchy–Kowalewska type.
Application of the SH framework usually requires a preliminary reduction to SH form ("symmetrization"). After comparing briefly the theory of SH systems with other functional-analytic approaches, we collect basic definitions and notation. We then present two general rules, for symmetrizing conservation laws and strictly hyperbolic equations, respectively. We next turn to special features possessed by linear SH systems, and give a general procedure to prove existence, which covers both linear and nonlinear systems. We then summarize those results on shock waves, and on blow-up singularities, which are related to SH structure. Examples and applications are collected in the last section.
The advantages of SH theory are: a standardized procedure for constructing solutions; the availability of standard numerical schemes; a natural way to prove that the speed of propagation of support is finite. On the other hand, the symmetrization process is sometimes ad hoc, and does not respect the physical or geometric nature of the unknowns; to obviate this defect to some extent, we remark that symmetrizers may be viewed as introducing a new Riemannian metric on the space of unknowns. The search for a comprehensive criterion for identifying equations and boundary conditions compatible with SH structure is still the object of current research.
The most important fields of application of the theory today are general relativity and fluid dynamics, including magnetohydrodynamics.

Context of SH Theory in Modern Terms

The basic reason why the theory works may be summarized as follows for the modern reader; the history of the subject is, however, more involved.
Let $H$ be a real Hilbert space. Consider a linear initial-value problem $du/dt + Au = 0$; $u(0) = u_0 \in H$, where $A$ is unbounded, with domain $D(A)$. By Stone's theorem, one can solve it in a generalized sense, if the unbounded operator $A$ satisfies $A + A^* = 0$. This condition contains two ingredients: a symmetry condition on $A$, and a maximality condition on $D(A)$, which incorporates boundary conditions (von Neumann, Friedrichs). Semigroup theory (Hille and Yosida, Phillips, and many others) handles more general operators $A$: it is possible to solve this problem in the form $u(t) = S(t)u_0$ for $t > 0$, where $\{S(t)\}_{t \ge 0}$ is a continuous contraction semigroup, if and only if $(Au, u) \ge 0$, and equation $x + Ax = y$ has a solution for every $y$ in $H$ (this is a maximality condition on $D(A)$). One then says that $A$ is maximal monotone. For such operators, $A + A^* \ge 0$. SH systems are systems $Qu_t + Au = F$, satisfying two algebraic conditions ensuring formally that $A + A^*$ is bounded, and that $Q$ is symmetric and positive definite. This algebraic structure enables one to solve the problem directly, without explicit reference to semigroup theory. Precise definitions are given next.
We assume throughout that all coefficients, nonlinearities, and data are smooth unless otherwise specified.
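The role of the two conditions for a contraction semigroup can be made concrete in finite dimensions. The following sketch is an illustration only, under assumptions that are not in the article: a hypothetical $2 \times 2$ matrix $A$ with $(Au, u) \ge 0$, and the backward-Euler approximation $S(t) \approx (I + (t/n)A)^{-n}$. Each step solves the resolvent equation $x + kAx = y$, which is the finite-dimensional counterpart of the maximality condition, and monotonicity of $A$ guarantees that the norm never increases.

```python
import math


def norm(u):
    """Euclidean norm of a vector in R^2."""
    return math.hypot(u[0], u[1])


def resolvent_step(A, u, k):
    """Solve (I + k A) x = u for a 2x2 matrix A: one backward-Euler step."""
    a, b = 1 + k * A[0][0], k * A[0][1]
    c, d = k * A[1][0], 1 + k * A[1][1]
    det = a * d - b * c
    return ((d * u[0] - b * u[1]) / det, (a * u[1] - c * u[0]) / det)


def evolve(A, u0, t, steps):
    """Approximate S(t) u0 by repeated resolvent steps; also record the norms."""
    k = t / steps
    u, norms = u0, [norm(u0)]
    for _ in range(steps):
        u = resolvent_step(A, u, k)
        norms.append(norm(u))
    return u, norms


# Hypothetical monotone matrix: (Au, u) = 0.5 * u[0]**2 >= 0
# (a skew-symmetric rotation part plus a non-negative symmetric part).
A = ((0.5, 2.0), (-2.0, 0.0))
u_final, norms = evolve(A, (1.0, 0.0), t=1.0, steps=200)
print(norms[0], norms[-1])
```

The recorded norms are non-increasing, as expected for a contraction. Taking the symmetric part of `A` to be zero would correspond to the case $A + A^* = 0$ covered by Stone's theorem, where the exact flow conserves the norm.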

Definitions

Consider a quasilinear system

$$
M^{A\mu}_{B}(x, u)\, \partial_\mu u^B = N^A(x, u) \tag{1}
$$

where $u = (u^A)_{A = 1, \ldots, m}$, $x = (x^\mu)_{\mu = 0, \ldots, n}$, and $\partial_\mu = \partial/\partial x^\mu$. The components of $u$ may be real or complex. We follow the summation convention on repeated indices in different positions; $x^0 = t$ may be thought of as the evolution variable; we write $x = (t, \mathbf{x})$, with $\mathbf{x} = (x^1, \ldots, x^n)$. Indices $A, B, \ldots$ run from 1 to $m$, indices $j, k, \ldots$ from 1 to $n$, and Greek indices from 0 to $n$. The complex conjugate of $u^A$ is written $\bar{u}^A$.

• Equation [1] is symmetrizable if there are functions $\sigma_{AB}(x, u)$ such that $M^{\mu}_{AB} := \sigma_{AC} M^{C\mu}_{B}$ satisfies the condition $M^{\mu}_{AB} = \bar{M}^{\mu}_{BA}$ for every $\mu$.
• It is symmetric if it is symmetrizable with $\sigma_{AB} = \delta_{AB}$.
• It is symmetric-hyperbolic with respect to $k_\mu$ if it is symmetric and if $k_\mu M^{\mu}_{AB}$ is positive definite: $k_\mu M^{\mu}_{AB}\, \xi^A \bar{\xi}^B > 0$ for $\xi = (\xi^A) \neq 0$.

Thus, a symmetrizer $(\sigma_{AB})$ gives rise to a Riemannian metric $(k_\mu \sigma_{AC} M^{C\mu}_{B})$ on the space of unknowns, independent of any Riemannian structure on $x$-space. The system is SH with respect to $x^0$ if $k_\mu = \delta^0_\mu$.
The simplest class of SH systems is provided by real semilinear systems of the form

$$
A^0(x)\, \partial_t u + A^i(x)\, \partial_i u = N(x, u) \tag{2}
$$

where the $A^\mu$ are real symmetric matrices, $A^0$ is symmetric and positive definite, and $k_\mu = \delta^0_\mu$. Writing $A^0 = P^2$, with $P$ symmetric and positive definite, one finds that $v = Pu$ solves a SH system with $A^0 = I$ (identity matrix).
Conservation laws (with "reaction" or "source" term $N^A$) are usually defined as quasilinear systems of the form

$$
\partial_\mu f^{A\mu}(x, u) = N^A(x, u) \tag{3}
$$

They are common in fluid dynamics and combustion. They are limiting cases of nonlinear diffusion equations of the typical form

$$
\partial_\mu f^{A\mu}(x, u) = N^A(x, u) + \varepsilon\, \partial_j\!\left(B^{Ajk}_{B}\, \partial_k u^B\right) \tag{4}
$$

The determination of the form of the coefficients $B^{Ajk}_{B}$ is a nontrivial modeling issue; they may reflect varied physical processes such as heat conduction, viscosity, or bulk viscosity. They may depend on $x$, $u$, and the derivatives of $u$. The simplest case is $B^{Ajk}_{B} = D^{jk}\delta^{A}_{B}$ with $(D^{jk})$ diagonal. Some authors require the symmetry condition

$$
\sigma_{AC} B^{Cjk}_{B} = \sigma_{BC} B^{Ckj}_{A} \tag{5}
$$

Equations in which $f^{A\mu} = u^A \delta^{\mu}_{0}$ are called reaction–diffusion equations; they arise in physical and biological problems in which chemical reactions and diffusion phenomena are combined, and in population dynamics.
A conservation law is symmetric if and only if $\partial f^{A\mu}/\partial u^B$ is symmetric in $A$ and $B$, which means that there are, locally, functions $g^\mu(x, u)$ such that $f^{A\mu} = \partial g^\mu/\partial u^A$.
A more fundamental derivation of conservation laws would take us beyond the scope of this survey.

Symmetrization

Two general procedures for symmetrization are available: one for conservation laws, the other for semilinear strictly hyperbolic problems.

Conservation Laws with a Convex Entropy

Consider, for simplicity, a conservation law of the form

$$
\partial_t u^A + \partial_j f^{Aj}(u) = 0 \tag{6}
$$

We, therefore, assume that the $f^{A\mu} = f^{A\mu}(u)$ and $f^{A0}(u) = u^A$. We show that the following three statements are equivalent locally: (1) there is a strictly convex function $U(u)$ such that $\sigma_{AB} = \partial^2 U/\partial u^A \partial u^B$ is a symmetrizer; (2) eqn [6] implies a scalar relation of the form $\partial_\mu U^\mu = 0$, with $U^0$ strictly convex; and (3) there is a change of unknowns $v_A = v_A(u)$ such that the system satisfied by $v = (v_A)$ is SH and $(\partial v_A/\partial u^B)$ is positive definite.
In fluid dynamics, $U^0$ may sometimes be related to specific entropy, and $U^j$ to entropy flux. For this reason, if (2) holds, one says that $U^0$ is an entropy for eqn [6], and that $(U^0, U^j)$ is an entropy pair. A system may have several entropies in this sense; this fact is sometimes useful in studying convergence properties of approximate solutions of eqn [6].
Let us now prove the equivalence of these properties.
Assume first (3): there are new unknowns $v_A = v_A(u)$ and functions $g^\mu(v)$ such that $f^{A\mu} = \partial g^\mu/\partial v_A$. One finds that if eqn [6] holds,

$$
\partial_\mu U^\mu = 0 \quad \text{where} \quad U^\mu = v_A \frac{\partial g^\mu}{\partial v_A} - g^\mu \tag{7}
$$

Furthermore, we have $f^{A0} = u^A$; therefore, eqn [7] gives: $U^0 = v_A u^A - g^0$, so that $U^0$ is the Legendre

transform (familiar from mechanics) of $g^0$. It follows that $v_A = \partial U^0/\partial u^A$. Finally, $(\partial v_A/\partial u^B) = (\partial^2 U^0/\partial u^A \partial u^B)$ is positive definite, and $U^0$ is strictly convex.
We have proved that (3) implies (2). Next, assume (2): the entropy equality $U_t + \partial_j U^j = 0$ holds identically, and not just for the solution at hand. Using [6], we find

$$
0 = \frac{\partial U}{\partial u^A}\, \partial_t u^A + \frac{\partial U^j}{\partial u^B}\, \partial_j u^B
= \left( -\frac{\partial U}{\partial u^A} \frac{\partial f^{Aj}}{\partial u^B} + \frac{\partial U^j}{\partial u^B} \right) \partial_j u^B
$$

Assumption (2), therefore, means that $U$ is strictly convex and satisfies

$$
\frac{\partial U}{\partial u^A} \frac{\partial f^{Aj}}{\partial u^B} = \frac{\partial U^j}{\partial u^B} \tag{8}
$$

Now, letting $v_A = \partial U/\partial u^A$ and $g^j(v) = v_A f^{Aj} - U^j$, we find

$$
\frac{\partial g^j}{\partial v_A} = f^{Aj} + \left( \frac{\partial U}{\partial u^B} \frac{\partial f^{Bj}}{\partial u^C} - \frac{\partial U^j}{\partial u^C} \right) \frac{\partial u^C}{\partial v_A} = f^{Aj} \tag{9}
$$

Let $\sigma_{AB} = \partial^2 U/\partial u^A \partial u^B$. Since $U$ is strictly convex, $(\sigma_{AB})$ is positive definite, and so is its inverse. We have now proved (3). Note that $u^A = \partial g^0/\partial v_A$, where $g^0(v) = u^A v_A - U(u)$ is the Legendre transform of $U$.
Next, using eqn [9], and the relations $\sigma_{AB} = \partial v_A/\partial u^B = \partial v_B/\partial u^A$, we find

$$
\begin{aligned}
0 &= \sigma_{AB}\left[ \partial_t u^B + \partial_j f^{Bj} \right] \\
  &= \sigma_{AB}\, \partial_t u^B + \frac{\partial v_B}{\partial u^A} \frac{\partial^2 g^j}{\partial v_B\, \partial u^C}\, \partial_j u^C \\
  &= \sigma_{AB}\, \partial_t u^B + \frac{\partial^2 g^j}{\partial u^A\, \partial u^C}\, \partial_j u^C
\end{aligned}
$$

which is SH; therefore, $\sigma_{AB}$ is a symmetrizer for eqn [6], and (1) is proved. Thus, (2) implies (1) and (3).
Finally, if (1) holds, $\sigma_{AC}\, \partial f^{Cj}/\partial u^B$ is symmetric in $A$ and $B$. It follows that

$$
\frac{\partial}{\partial u^C}\left( \frac{\partial U}{\partial u^A} \frac{\partial f^{Aj}}{\partial u^B} \right)
= \sigma_{AC} \frac{\partial f^{Aj}}{\partial u^B} + \frac{\partial U}{\partial u^A} \frac{\partial^2 f^{Aj}}{\partial u^B\, \partial u^C}
$$

is symmetric in $B$ and $C$, so that there are, locally, functions $U^j$ such that eqn [8] holds. Therefore, $(U, U^j)$ is an entropy pair, and we see that (1) implies (2).
This completes the proof of the equivalence of (1), (2), and (3).

Strictly Hyperbolic Equations

Consider the scalar equation $Pf = g(t, x)$, where $P$ is the linear operator

$$
P = \partial_t^N - \sum_{j=0}^{N-1} p_{N-j}(t, x)\, \partial_t^j
$$

of order $N$. Let $\Lambda = (1 - \Delta)^{1/2}$, where $\Delta$ is the Laplace operator on the space variables. Then $u = (u^A)$, where $u^A = \partial_t^{A-1} \Lambda^{N-A} f$ for $A = 1, \ldots, N$, solves a first-order pseudodifferential system of the form

$$
u_t - Lu = G
$$

If $P$ is strictly hyperbolic, the principal symbol $a_1(t, x, \xi)$ of $L$ has a diagonal form with real eigenvalues $\lambda_j(t, x, \xi)$, and there are projectors $p_j(t, x, \xi)$ ($p_j^2 = p_j$) which commute with $a_1$, such that $1 = \sum_j p_j$, and $a_1 = \sum_j \lambda_j p_j$. Let $r_0 = \sum_j p_j^* p_j$, and $r_0(D)$ the corresponding operator. Equation

$$
r_0(D)\, \partial_t u - r_0(D)\, L u = r_0(D)\, G
$$

is formally SH in the following sense: $r_0$ is positive definite and $r_0 a_1$ is Hermitian.

Linear Problems

Consider a linear system

$$
Lu := Q(t, x)\, \partial_t u + A^j(t, x)\, \partial_j u + B(t, x)\, u = f(t, x) \tag{10}
$$

We assume that $Q$ and the $A^j$ are real and symmetric, $Q \ge c$ with $c$ positive, and all coefficients and their first-order derivatives are bounded.

Energy Identity

Multiplying the equation by $u^T$ (transpose of $u$), one derives the "energy identity"

$$
\partial_t (u^T Q u) + \partial_j (u^T A^j u) + u^T C u = 2 u^T f(t, x) \tag{11}
$$

where $C = 2B - \partial_t Q - \partial_j A^j$. $C$ is not necessarily positive. However, $v := u \exp(-\lambda t)$ satisfies a linear SH system for which $C$ is positive definite if $\lambda$ is large enough.

Propagation of Support

A basic property of wave-like equations is finite speed of propagation of support: if the right-hand side vanishes, and if the solution at time 0 is localized in the ball of radius $r$, then the solution at time $t$ is localized in the ball of radius $r + ct$ for a suitable constant $c$.
This property also holds for SH systems. To see this, let us consider the set where a solution $u$

vanishes: if the initial condition vanishes for $|x| \le R$, we claim that $u$ at some later time vanishes for $|x| \le R - t/a$, for $a$ large enough.
Indeed, let us integrate the energy identity on a truncated cone $\Gamma := \{|x| \le a(t_0 - t)/t_0;\ 0 \le t \le t_1\}$ with $t_1 < t_0$. The boundary of $\Gamma$ consists of three parts: $\partial\Gamma = \Gamma_0 \cup \Gamma_1 \cup S$, where $\Gamma_0$ and $\Gamma_1$ represent the portions of the boundary on which $t = 0$ and $t_1$, respectively. The outer normal to $S$ is proportional to $(a, t_0 x_j/|x|)$. Let $E(s)$ denote the integral of $u^T Q u$ on $\Gamma \cap \{t = s\}$. Integrating eqn [11] by parts, we obtain

$$
E(t_1) - E(0) + \int_S u^T \Phi\, u\, ds = \iint_\Gamma \left( 2 u^T f - u^T C u \right) dt\, dx \tag{12}
$$

where $\Phi$ is proportional to $aQ + t_0 \sum_j x_j A^j/|x|$. Take $a$ so large that $\Phi$ is positive definite. The integral over $S$ is then non-negative. If $C$ is positive definite and $f \equiv 0$, so that $E(0) = 0$, we find that $E(t_1) \le 0$. Since $Q$ is positive definite, this implies $u \equiv 0$ on $\Gamma_1$, as claimed.

A Numerical Scheme

System $Lu = f$ may be discretized, for example, by the Lax–Friedrichs method: let $h$ be the discretization step in space, and $k$ the time step; write $\tau_j u(t, x) = u(t, x_1, \ldots, x_j + h, \ldots, x_n)$ (translation in the $j$ direction). One replaces $\partial_j u$ by the centered difference in the $j$ direction: $(\tau_j u - \tau_j^{-1} u)/2h$; and the time derivative by

$$
\left[ u(t + k, x) - \frac{1}{2n} \sum_j \left( \tau_j u(t, x) + \tau_j^{-1} u(t, x) \right) \right] \Big/ k \tag{13}
$$

For consistency of the scheme, we require $k/h = \lambda > 0$ to be fixed as $k$ and $h$ tend to zero; stability then holds if $\lambda$ is small.

Nonlinear Problems and Singularities

We give a simple setup for proving the existence of smooth solutions to SH systems for small times. Such solutions may develop singularities. We limit ourselves to two types of singularities, on which SH structure provides some information: jump discontinuities and blow-up patterns. Caustic formation is not considered.

Construction of a Smooth Solution

Consider a real SH system (eqn [1]). Recall that a function of $x$ belongs to the Sobolev space $H^s$ if its derivatives of order $s$ or less are square-integrable. One constructs a solution defined for $t$ small, which is in $H^s$, $s > n/2 + 1$, as a function of $x$, by the following procedure:

(1) Replace spatial derivatives by regularized operators, which should be bounded in Sobolev spaces; the regularized equation is an ODE in $H^s$; let $u_\varepsilon$ be its solution.
(2) Write the equation satisfied by derivatives of order $s$ of $u_\varepsilon$, and apply the energy identity to it.
(3) Find a positive $T$ such that the solution is bounded in $H^s$ for $|t| \le T$, uniformly in $\varepsilon$; this implies a $C^1$ bound.
(4) Prove the convergence of the approximations in $L^2$.
(5) Prove the continuity in time of the $H^s$ norm; conclude that the $u_\varepsilon$ tend to a solution in $C(-T, T; H^s)$.

The result admits a local version, in which Sobolev spaces are replaced by Kato's "uniformly local" spaces. Uniqueness of the solution is proved along similar lines. We do not attempt to identify the infimum of the values of $s$ for which the Cauchy problem is well-posed.

Jump Discontinuities: Shock Waves

A "shock wave" is a weak solution of a system of conservation laws admitting a jump discontinuity. By definition, weak solutions satisfy, for any smooth function $\phi_A(x)$ with compact support,

$$
\iint \left\{ f^{A\mu}\, \partial_\mu \phi_A + N^A \phi_A \right\} dt\, dx = 0
$$

The theory of shock waves is an attempt to understand solutions of conservation laws which are limits of solutions of diffusion equations; the hope is that the influence of second-derivative terms is appreciable only near shocks, and that, for given initial data, there is a unique weak solution of the conservation law which may be obtained as such a limit, if modeling has been done correctly. This problem may be difficult already for a single shock ("shock structure").
The theory of shock waves follows the one-dimensional theory closely. We therefore describe the main facts for a conservation law in one space dimension ($u = u(t, x)$):

$$
\partial_t u + \partial_x f(u) = 0
$$

If a shock travels at speed $c$, the weak formulation of the equations gives the Rankine–Hugoniot relation $c[u] = [f(u)]$, where square brackets denote jumps. There may be several weak solutions having the same initial condition. One restricts solutions by

making two further requirements: (1) the system one can write the solution as the sum of a singular
admits an entropy pair (U, F) with a convex entropy part, known in closed form, and a regular part. If
and (2) to be admissible, weak solutions must be the singularity locus is represented by t = 0, the
limits of ‘‘viscous approximations’’ regular part solves a renormalized equation of the
typical form
@t u þ @x f ðuÞ ¼ "@x2 u
tMu þ Au ¼ t" N ½14
as " ! 0. One then finds easily that the entropy
equality (@t U þ @x F = 0) must be replaced, for such where Mu = 0 is SH. Under natural conditions, for
weak solutions, by the entropy condition: @t U þ any initial condition u0 such that Au0 = 0, there is a
@x F  0 in the weak sense. This condition admits a unique solution of eqn [14] defined for small t.
concrete interpretation if the gradient of each The upshot is an asymptotic representation of
characteristic speed is never orthogonal to the solutions which renders the same services as an
corresponding right eigenvector (‘‘genuine nonli- exact solution, and is valid precisely where numeri-
nearity’’); in that case, characteristics must impinge on the shock (‘‘shock inequalities’’).
For the equations of gas dynamics with polytropic law ($pv^\gamma = \text{const.}$), there is a unique solution with initial condition $u = u_l$ for $x < 0$, $u = u_r$ for $x > 0$, where $u_l$ and $u_r$ are constant (‘‘Riemann problem’’) which satisfies the entropy condition, provided $|u_l - u_r|$ is small. More generally, if the equation of state $p = p(v, s) > 0$ satisfies $\partial p/\partial v < 0$ and $\partial^2 p/\partial v^2 > 0$, the shock inequalities are equivalent to the fact that the entropy increases after the passage of a shock with $|u_l - u_r|$ small.
On the numerical side, one should mention: (1) the widely used idea of upstream differencing; (2) the Lax–Wendroff scheme, the complete analysis of which requires tools from soliton theory; and (3) the availability of general results for dissipative schemes for SH systems.
Recent trends include: (1) admissibility conditions when genuine nonlinearity does not hold and (2) other approximations of shock wave problems, most notably kinetic formulations.
Some of the ideas of shock wave theory have been applied to Hamilton–Jacobi equations and to motion by mean curvature, with applications to front propagation problems and ‘‘computer vision.’’

Stronger Singularities: Blow-Up Patterns

The amplitude of a solution may also grow without bound. Examples include optical pulse propagation in Kerr media and singularities in general relativity. The phenomenon is common when reaction terms are allowed. As we now explain, this phenomenon is reducible to SH theory in many cases of interest.
Blow-up singularities are usually not governed by the characteristic speeds defined by the principal part, because top-order derivatives are balanced by lower-order terms. In many applications, a systematic process (Fuchsian reduction) enables one to identify the correct model near blow-up; as a result,

cal computation breaks down.
Fuchsian reduction enables one in particular to study (1) the blow-up time; (2) how the singularity locus varies when Cauchy data, prescribed in the smooth region, are varied; and (3) expressions which remain finite at blow-up. It is the only known general procedure for constructing analytically singular spacetimes involving arbitrary functions, rather than arbitrary parameters, and is therefore relevant to the search for alternatives to the big bang.

Examples and Applications

Wave Equation with Variable Coefficients

Consider the equation
\[ \partial_{tt} u + 2a^j(x)\,\partial_{jt} u - a^{jk}(x)\,\partial_{jk} u = f(t, x, u, \nabla u) \]
with $(a^{jk})$ positive definite. Letting $v = (v_0, \dots, v_{n+1}) := (u, \partial_j u, \partial_t u)$, we find the system
\[ \partial_t v_0 = v_{n+1} \]
\[ \partial_t v_k - \partial_k v_{n+1} = 0 \]
\[ \partial_t v_{n+1} + 2a^k \partial_k v_{n+1} - a^{jk} \partial_k v_j = f \]
It is symmetrizable, using the quadratic form $v_0^2 + a^{jk} v_j v_k + v_{n+1}^2$. One proves directly that, if $v_j = \partial_j v_0$ for $t = 0$, this relation remains true for all $t$.

Maxwell’s Equations

Maxwell’s equations may be split into six evolution equations: $\partial_t E - \operatorname{curl} B + j = 0$ and $\partial_t B + \operatorname{curl} E = 0$, and two ‘‘constraints’’ $\operatorname{div} E - \rho = 0$, $\operatorname{div} B = 0$. The system of evolution equations is already in symmetric form; the associated quadratic form is here $|E|^2 + |B|^2$.
Symmetric Hyperbolic Systems and Shock Waves 165

Compressible Fluids

Consider first the case of a polytropic gas:
\[ \partial_t v + (v \cdot \nabla) v + \rho^{-1} \nabla p = 0, \qquad \partial_t \rho + \operatorname{div}(\rho v) = 0 \tag{15} \]
with $p$ proportional to $\rho^\gamma$. Taking $(p, v)$ as unknowns, one readily finds the SH system
\[ \frac{1}{\gamma p}\,\partial_t p + \frac{1}{\gamma p}\,(v \cdot \nabla) p + \operatorname{div} v = 0 \tag{16} \]
\[ \rho\,\partial_t v + \nabla p + \rho\,(v \cdot \nabla) v = 0 \tag{17} \]
Symmetrization for more general compressible fluids with dissipation, including bulk viscosity, so as to satisfy the additional condition [5], may be achieved if we take as thermodynamic variables $\rho$ and $T$, and assume that pressure $p$ and internal energy $\varepsilon$ satisfy $\partial p/\partial \rho > 0$ and $\partial \varepsilon/\partial T > 0$, by taking as unknowns $(\rho, \rho v, \rho(\varepsilon + |v|^2/2))$. The specific entropy $s$ satisfies $d\varepsilon = T\,ds - p\,d(1/\rho)$. If the viscosity and heat conduction coefficients are positive, one finds that $U = -\rho s$ is a convex entropy (in the sense of SH theory) on the set where $\rho > 0$, $T > 0$.

Einstein’s Equations

The computation of solutions of Einstein’s equations over long times, in particular in the study of coalescence of binary stars, has recently led to unexplained difficulties in the standard Arnowitt–Deser–Misner (ADM) formulation of the initial-value problem in general relativity. One way to tackle these difficulties is to rewrite the field equations in SH form; we focus on this particular aspect of recent research.
Recall the problem: find a four-dimensional metric $g_{ab}$ with Lorentzian signature, such that $R_{ab} - \tfrac12 R g_{ab} = T_{ab}$, with $\nabla^a T_{ab} = 0$, combined with an equation of state if necessary. $R_{ab}$ is the Ricci tensor and $R = g^{ab} R_{ab}$ is the scalar curvature; they depend on derivatives of the metric up to order 2. In addition to the metric, $T_{ab}$ involves physical quantities such as fluid 4-velocity or an electromagnetic field. The conservation laws of classical mathematical physics are all contained in the relation $\nabla^a T_{ab} = 0$.
Now, the field equations cannot be solved for $\partial_t^2 g_{ab}$, and, as a consequence, the Taylor series of $g_{ab}$ with respect to time cannot be determined, even formally, from the values of $g_{ab}$ and $\partial_t g_{ab}$ for $t = 0$ (i.e., the Cauchy data). Furthermore, these data must satisfy four constraint equations. If the constraints are satisfied initially, they ‘‘propagate.’’ But in numerical computation, these constraints are never exactly satisfied, and the computed solution may deviate considerably from the exact solution. Also, numerical computations depend heavily on the way Einstein’s equations are formulated.
The simplest way to derive an SH system is to replace $R_{ab}$ by $R^{(h)}_{ab} = R_{ab} - \tfrac12 [g_{bc}\,\partial_a F^c + g_{ac}\,\partial_b F^c]$, where $F^c := g^{ab}\Gamma^c_{ab}$. It turns out that $R^{(h)}_{ab} = -\tfrac12 g^{cd}\,\partial_{cd} g_{ab} + H_{ab}(g, \partial g)$, where the expression of $H_{ab}$ is immaterial. Applying to each component of the metric the treatment of the first example above (wave equation with variable coefficients), one easily derives an SH system of 50 equations for 50 unknowns: the ten independent components of the metric, and their 40 first-order derivatives. Now, if the $F^c$ are initially zero (coordinates are ‘‘harmonic’’), they remain so at later times.
Unfortunately, the harmonic coordinate condition does not seem to be stable in the large. More recent formulations start with one of the standard setups (ADM formalism, conformal equations, tetrad formalism, Newman–Penrose formalism) and proceed by adding combinations of the constraints to the equations, multiplied by parameters adjusted so as to ensure hyperbolicity or symmetric–hyperbolicity if needed. Another recent idea is to add a new unknown $\lambda$ which monitors the failure of the constraint equations; one adds to the equations a new relation of the form $\partial_t \lambda = \alpha C - \beta \lambda$, where $C = 0$ is equivalent to the constraints, and $\alpha$ and $\beta$ are parameters. One then adds coupling terms to make the extended system SH. It is expected that the set of constraints acts as an attractor.
Reported computations indicate that these methods have resulted in an improvement of the time over which numerical computations are valid.

Tricomi’s Equation

Let $\varphi(x, y)$ solve $(y\,\partial_x^2 - \partial_y^2)\varphi = 0$. Letting $u = e^{-\lambda x}(\partial_x \varphi, \partial_y \varphi)$, one finds a symmetric system $Lu = 0$, with
\[ L = \begin{pmatrix} y & 0 \\ 0 & 1 \end{pmatrix} (\partial_x + \lambda) - \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \partial_y \]
If
\[ Z = \begin{pmatrix} 1 & y \\ 1 & 1 \end{pmatrix} \]
we find that $K = ZL = A^1 \partial_x + A^2 \partial_y + B$, where
\[ B - \tfrac12 (\partial_x A^1 + \partial_y A^2) = \begin{pmatrix} \tfrac12 + \lambda y & \lambda y \\ \lambda y & \lambda \end{pmatrix} \]
is positive definite if $y$ is bounded, of arbitrary sign, and $\lambda$ is small and positive.
166 Symmetries and Conservation Laws

Cauchy–Kowalewska Systems

Consider a complex system
\[ \partial_t u = A^j(z, t, u)\,\frac{\partial u}{\partial z_j} + B(z, t, u) \tag{18} \]
where $u = (u_A)$, $z = (z_1, \dots, z_n)$. The coefficients are analytic in their arguments when $z$ and $t$ are close to the origin and $u$ is bounded by some constant $K$.
The Cauchy–Kowalewska theorem ensures that, for any analytic initial condition near the origin, this system has a unique analytic solution near $z = 0$, even without any symmetry assumption on the $A^j$. This result is a consequence of SH theory (Garabedian).
Indeed, write $z_j = x_j + i y_j$, $\partial_{z_j} = (1/2)(\partial_{x_j} - i\,\partial_{y_j})$, and $\partial_{\bar z_j} = (1/2)(\partial_{x_j} + i\,\partial_{y_j})$. Recall that analytic functions of $z$ satisfy the Cauchy–Riemann equations $\partial_{\bar z_j} u = 0$. Adding $(\bar A^j)^T \partial_{\bar z_j} u$ to [18], and using the definition of $\partial_{z_j}$ and $\partial_{\bar z_j}$, we find the symmetric system
\[ u_t = \frac12 \big(A^j + (\bar A^j)^T\big)\,\partial_{x_j} u + \frac{1}{2i} \big(A^j - (\bar A^j)^T\big)\,\partial_{y_j} u + B \tag{19} \]
Solving this system, we find a candidate $u$ for a solution of eqn [18]. To show that $u$ is analytic if the data are, we solve a second SH system for $w = w_{(j)} := \partial_{\bar z_j} u$. If the data are analytic, $w$ vanishes initially, and therefore remains zero for all $t$. Therefore, $u$ is indeed analytic.

See also: Computational Methods in General Relativity: The Theory; Einstein Equations: Initial Value Formulation; Evolution Equations: Linear and Nonlinear; Magnetohydrodynamics; Partial Differential Equations: Some Examples; Semilinear Wave Equations; Shock Wave Refinement of the Friedman–Robertson–Walker Metric.

Further Reading

Dautray R and Lions J (1988) Mathematical Analysis and Numerical Methods for Science and Technology, vols. 1–6. (transl. from French). New York: Springer.
Friedrichs KO (1986) In: Morawetz CS (ed.) Selecta, two volumes: collection of reprints with comments. Boston: Birkhäuser.
Godlewski E and Raviart P-A (1996) Numerical Approximation of Hyperbolic Systems of Conservation Laws. Berlin: Springer.
Gustafsson B, Kreiss H-O, and Oliger J (1996) Time-Dependent Problems and Difference Methods. New York: Wiley-Interscience.
John F (1982) Partial Differential Equations, 4th edn. Berlin: Springer.
Kato T (1975a) Quasilinear equations of evolution with applications to partial differential equations. Lecture Notes in Mathematics 448: 25–70.
Kato T (1975b) The Cauchy problem for quasilinear symmetric hyperbolic systems. Archive for Rational Mechanics and Analysis 58: 181–205.
Kawashima S and Shizuta Y (1988) On the normal form of the symmetric hyperbolic–parabolic systems associated with the conservation laws. Tôhoku Mathematical Journal 40: 449–464.
Kichenassamy S (1996) Nonlinear Wave Equations. New York: Dekker.
Lax PD (1973) Hyperbolic Conservation Laws and the Mathematical Theory of Shock Waves. Regional Conference Series in Applied Mathematics. Philadelphia: SIAM.
Lax PD and Phillips RS (1989) Scattering Theory, (revised edition). Boston: Academic Press.
Majda A (1984) Compressible Fluid Flow and Systems of Conservation Laws. Berlin: Springer.
Serre D (1999/2000) Systems of Conservation Laws, I & II. Cambridge: Cambridge University Press.
Smoller J (1983) Shock Waves and Reaction-Diffusion Equations. Berlin: Springer.
Taylor M (1991) Pseudodifferential Operators and Nonlinear P.D.E., Progress in Mathematics, vol. 100. Boston: Birkhäuser.

Symmetries and Conservation Laws

L H Ryder, University of Kent, Canterbury, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction: Spacetime Symmetries

Symmetries have played, and continue to play, an important role in fundamental physics, but the part they play is today seen as more complicated and many-sided than it was in the early days of particle physics, just after the Second World War. The area in which symmetries have had their most dramatic consequences is elementary particle physics, or high-energy physics, and the majority of this article is concerned with this subject. The article concludes with some observations about symmetries and conservation laws in general relativity.
In the early days, considerations of symmetry were almost limited to Lorentz transformations: we begin by reviewing this crucially important topic. Invariance of the laws of nature under translations in space and time is actually necessary for the existence of science itself; if experiments did not yield the same results today and tomorrow, and in Paris and Moscow and on the Moon, then in effect there would be no laws of nature. Almost as strong a statement could be made about invariance under rotations; if space were not isotropic, experimental results would depend on which direction the apparatus was aligned in, and again any laws would be extremely hard to find. Turning to the question of motion, Newton and Galileo realized that the laws of dynamics are the same in all inertial frames in relative motion. In the Newton–Galileo scheme, the rule for relating the space and time coordinates of two frames of reference is (for relative motion along the common x-axis)
\[ x' = x - vt, \qquad t' = t \tag{1} \]
This principle of relativity was reaffirmed by Einstein, but with the crucial modification that the rules for relating coordinates in two frames are given by Lorentz transformations, so that [1] is replaced by
\[ x' = \gamma (x - vt), \qquad t' = \gamma \left( t - \frac{vx}{c^2} \right) \tag{2} \]
Time is absolute in [1] but relative in [2]. Einstein was of course motivated by the fact that Maxwell’s equations are covariant under Lorentz transformations, but not under Newton–Galileo ones.
The above considerations reveal that the laws of nature should be covariant under ten types of transformation: three translations in space, one in time, three parameters (angles) for rotations and three velocities. These transformations together form a group, the inhomogeneous Lorentz, or Poincaré group. It is a nonabelian group whose ten generators correspond to 4-momentum, angular momentum, and Lorentz boosts. The seminal work on the significance of this group in fundamental physics is that of Wigner in 1939. Assuming that the states of fundamental quantum systems (particles, atoms, molecules) form the basis states for representations of this group, these entities are described by two quantities, mass and spin. Spin, moreover, which was already familiar from earlier investigations in quantum physics, was described by the rotation group (SU(2), which is homomorphic to SO(3)) only for states with timelike momentum. For photons, for example, with null momentum, spin is described by the (noncompact) Euclidean group in the plane, with the consequence that there are only two polarization states for this massless particle.
Noether’s theorem provides the crucial link between symmetries and conservation laws, via the principle of least action. Noether showed that the invariance of the action under a continuous symmetry operation implied the existence of a conserved quantity. The conserved quantities corresponding to invariance under translation in space and time are momentum and energy; conservation of angular momentum follows from invariance under rotations, and invariance under Lorentz transformations gives rise to conservation of motion of the center of mass.

Gauge Theories: Electromagnetism and Yang–Mills Theories

A quantity whose conservation has been well known for a long time is electric charge. The question may then be asked: invariance under what symmetry gives rise to conservation of electric charge? A classical complex field has the Lagrangian density
\[ \mathcal{L} = (\partial_\mu \phi)(\partial^\mu \phi^*) - m^2 \phi^* \phi \tag{3} \]
which is invariant under
\[ \phi \to \exp(iQ\Lambda)\,\phi \tag{4} \]
$\Lambda$ being the parameter for the transformation. Noether’s theorem then yields conservation of $Q$, interpreted as electric charge. With $\Lambda$ a constant, as above, the Lagrangian possesses a ‘‘global’’ symmetry. This becomes a ‘‘local’’ symmetry when $\Lambda$ becomes space and time dependent, $\Lambda(\mathbf{r}, t)$ or $\Lambda(x^\mu)$. In that case, however, the Lagrangian [3] is no longer invariant under [4], because of the derivative terms. To preserve invariance an extra field $A_\mu$ must be introduced, so that [4] then becomes
\[ \phi \to \exp(iQ\Lambda(x^\mu))\,\phi, \qquad A_\mu \to A_\mu + \frac{1}{Q}\,\partial_\mu \Lambda \tag{5} \]
and the Lagrangian acquires extra terms, involving $A_\mu$. The field $A_\mu$ is called a gauge field and is identified with the electromagnetic potential. The transformation [5] is called a gauge transformation, and since the phase factor $\exp(iQ\Lambda)$ may be regarded as a unitary 1 × 1 matrix, we have here a theory with U(1) gauge invariance, which describes electromagnetism and conservation of charge.
The notion of isospin had been introduced by Heisenberg in 1932. Isospin (then called isotopic spin) was a vector-like quantity conserved in strong (nuclear) interactions. Yang and Mills in 1954 made the pioneering suggestion that isospin conservation could also be recast as a gauge theory, by enlarging the U(1) group of electromagnetism to SU(2) (corresponding to rotations in ‘‘isospin space’’), and at the same time treating the rotation angles as functions of spacetime. Then, eqn [4] will change: if
for example $\psi$ is an isospinor field, then local isospin rotations are given by
\[ \psi(x) \to \exp\left\{ i\,\frac{\boldsymbol{\tau}}{2} \cdot \boldsymbol{\Lambda}(x) \right\} \psi(x) = U(x)\,\psi(x) \tag{6} \]
where $\boldsymbol{\tau}$ are the Pauli matrices: $\boldsymbol{\tau}/2$ are the generators of SU(2). The gauge field then has three components $A^i_\mu$ ($i = 1, 2, 3$) which may be written as a matrix
\[ A_\mu = A^i_\mu\,\frac{\tau^i}{2} \]
transforming as
\[ A_\mu \to A'_\mu = U(x)\,A_\mu\,U^{-1}(x) - \frac{i}{g}\,(\partial_\mu U(x))\,U^{-1}(x) \tag{7} \]
where $g$ is the coupling constant, analogous to electric charge. The problem with this idea was that the isospin gauge field, analogous to the photon in electrodynamics, should, like the photon, be massless and have polarization states ±1 (commonly, but inaccurately – see the work of Wigner (1939) – called spin 1); whereas the Yukawa particle, identified as the $\pi$ meson, was massive and had spin 0, so could not act as the isospin gauge field.
The Yang–Mills idea really came into its own with the standard model (SM) of particle physics. This (gauge) model has an invariance group SU(2) × U(1) × SU(3), the first two groups corresponding to electroweak interactions (a unification of weak interactions and electromagnetism) and the final SU(3) to quantum chromodynamics (QCD), the gauge theory describing quark interactions, which ‘‘glues’’ them together to make hadrons – protons, neutrons, pions, etc. This model is a dramatically successful one. The QCD sector of the theory requires essentially no further elaboration on the Yang–Mills idea than replacing the group SU(2) by SU(3). This is a straightforward matter of replacing the generators $\boldsymbol{\tau}/2$ of SU(2) with the eight generators (3 × 3 matrices) of SU(3). $U(x)$ then also becomes a 3 × 3 matrix. The three degrees of freedom are the three quark ‘‘colors,’’ for which there is good experimental evidence, and the gluons, the quanta of the gauge fields, are indeed massless and have good experimental support. In the electroweak sector, however, the gauge fields, the W and Z bosons, were found with the predicted masses of 80.3 and 91.2 GeV respectively (the proton mass, for comparison, is 0.94 GeV). They are certainly not massless, as the straightforward Yang–Mills theory would require, and the explanation for this requires the introduction of the concept of spontaneous symmetry breaking.

Spontaneous Symmetry Breaking

The general idea of spontaneous symmetry breaking is that the vacuum – the state of lowest energy – is not invariant under the symmetry in question. A simple and common illustration is a pencil balanced vertically on its tip on a horizontal plane. The pencil is in unstable equilibrium but the system has a symmetry under rotations in the plane about the axis coincident with the pencil. Eventually, the pencil will fall into its lowest-energy state (vacuum), lying on the table in some direction – and the rotational symmetry is then lost. In fact, under rotations the actual lowest-energy (vacuum) state will be changed into another such state. There is a degenerate vacuum.
A similar scenario may be constructed in a complex scalar field theory. Consider such a theory with a Lagrangian given by
\[ \mathcal{L} = (\partial_\mu \phi)(\partial^\mu \phi^*) - m^2 \phi^* \phi - \lambda (\phi^* \phi)^2 \tag{8} \]
that is, with a potential energy function given by
\[ V(\phi, \phi^*) = m^2 \phi^* \phi + \lambda (\phi^* \phi)^2 \tag{9} \]
where $m$ is the mass of the field (quantum) and $\lambda$ is the coupling of its self-interaction. The ground state is obtained by minimizing $V$, hence $\partial V/\partial \phi = 0$, giving (assuming that $m^2 > 0$) a minimum at $\phi = \phi^* = 0$. If, however, $m^2 < 0$, there is a local maximum at $\phi = 0$ and a minimum at $|\phi|^2 = -m^2/2\lambda > 0$. In quantum theory language, the vacuum expectation value $\langle 0|\phi|0 \rangle$ of the field is nonzero. Goldstone showed that this implied the presence of a massless scalar particle – a Goldstone boson. There was some interest in this result in particle physics, where the hypothesis of ‘‘partial conservation of the axial vector current’’ (PCAC) might result in a Goldstone boson that could be identified with the pion; although not massless, the pion is the lightest hadron, so ‘‘almost’’ massless.
Higgs analyzed what happens to the Goldstone model if electromagnetism is included. The Lagrangian [8] is invariant under the global transformation [4], but if this is made local, as in [5], a gauge field must be introduced and it is found that the massless Goldstone boson disappears and the massless gauge field (photon) becomes massive. Thus, spontaneous symmetry breaking of a gauge theory results in the appearance of a massive, rather than massless, gauge particle. (It is relevant to remark that a massless photon possesses two polarization states, but a massive one possesses three, so the number of spin-polarization states is preserved – the massless photon ‘‘eats’’ the Goldstone boson and becomes massive.) The Higgs model was generalized to the
case of a nonabelian symmetry group by Guralnik, Hagen, and Kibble and invoked by Weinberg in his 1967 model for the electroweak interaction in which the gauge quanta were massive.
Higgs’ work was motivated by the theory of superconductivity, where the Meissner effect (expulsion of magnetic flux from a superconductor), when relativistic, implies that the effective mass of a photon in a superconductor is nonzero – this is the ‘‘reason’’ that the flux does not penetrate. In the theory of Bardeen, Cooper, and Schrieffer (BCS), a superconductor is described by an effective scalar field, a composite of electron pairs (though paired in momentum space rather than coordinate space), and this provides a physical analogy with the model above. The SM of particle physics postulates a Higgs scalar field analogous to the BCS composite scalar field. If this field exists, Higgs particles should also exist, but they have not yet been found. This is an outstanding problem for the SM.

Baryon and Lepton Numbers

The fact that the proton $p$ does not decay into positron plus photon, $e^+ + \gamma$, or muon plus photon, $\mu^+ + \gamma$, implies a conservation law of baryon number $B$ (the proton possessing $B = 1$ and the others $B = 0$). Furthermore, the stability of $\mu^-$ and $\tau^-$ against decay into $e^- + \gamma$ implies conservation of lepton numbers $L_e$, $L_\mu$, and $L_\tau$. These are regarded as global, not local, symmetries, so there are no associated gauge fields or interactions. Interestingly, however, these symmetries are not built into the SM, so are not guaranteed by it. More interestingly, these symmetries are actually destroyed in one attempt to go beyond the SM. This is the hypothesis that QCD may be unified with electroweak interactions to produce a ‘‘grand unified’’ theory (GUT). The simplest GUT is the one in which the SU(2) × U(1) × SU(3) symmetry is assumed to be a subgroup of the much tighter symmetry SU(5), and in that theory the proton is unstable:
\[ p \to e^+ + \pi^0 \tag{10} \]
The predicted lifetime is $10^{30 \pm 1}$ years, while a recent estimate of the lifetime for this decay mode is $> 5 \times 10^{32}$ years. It may be that GUTs do not exist in nature, but since the decay [10] violates conservation of the quantities $B$ and $L_e$, even entertaining the idea that the decay might take place begs the question, ‘‘are these conservation laws sacrosanct?’’
Another recent development which leads to the same question is the subject of neutrino oscillations. A strong motivation for this is the solar neutrino problem; this is the problem that the number of electron neutrinos detected on Earth, originating in the Sun, is less than the number predicted, by a factor close to 3. The mismatch could be at least partly, and perhaps completely, explained if electron neutrinos ‘‘oscillated’’ into muon and/or tau neutrinos on their passage from the Sun to the Earth, since the reaction which detects the neutrinos on Earth is sensitive only to electron neutrinos, and not to the other species. But oscillation is only permitted if $L_e$, $L_\mu$, and $L_\tau$ are not separately conserved quantities. Oscillation can also only take place if the masses of the different neutrinos are different – the oscillation rate depends on $\Delta m^2$ – hence not all the neutrinos may be massless.

Discrete Symmetries

Ever since parity violation was discovered in weak interactions (nuclear beta decay) by Wu in 1957, the whole subject of discrete symmetries has presented problems which are still not resolved. The symmetries in question are

P (space inversion): $(x, y, z) \to (-x, -y, -z)$
T (time reversal): $t \to -t$
C (particle–antiparticle conjugation): particle $\leftrightarrow$ antiparticle

Are the laws of physics invariant under these operations? The Wu experiment revealed that weak interactions are not invariant under P, but what about other interactions and other operations? In this context, the CPT theorem is highly important. According to this theorem (based on very general assumptions), all laws of nature must be invariant under the combined operation CPT, so that, for example, the fact that weak interactions are not invariant under P means that they are not invariant under the product CT either.
The violation of P invariance in beta decay was soon related to the fact that the neutrino involved (the electron neutrino – or, to be precise, antineutrino) was massless. Spin-1/2 particles like the electron and neutrino obey the Dirac equation, which may be written out as a pair of coupled equations for left- and right-handed states. In the case $m = 0$, however, these equations decouple so it is possible to have a massless spin-1/2 particle which is either left-handed or right-handed. Any interaction involving this particle would automatically violate parity (which turns a left-handed state into a right-handed one). Experiments have verified that the neutrino is indeed left-handed. The SM incorporates this in the sense that the left-handed electron $e^-_L$ and the electron neutrino $\nu_e$ are assigned to a
weak isospin SU(2) doublet, while the right-handed electron $e^-_R$ transforms as a singlet. A similar pattern is repeated for the $\mu$ and $\tau$ particles and their neutrinos. The phenomenon of neutrino oscillations, on the other hand, does not allow all the neutrino states also to be purely left-handed (since they cannot be massless). This poses a potential problem for the SM.
For a few years after 1957 it was believed that beta decay violated C as well as P, but conserved the product CP; and indeed that all weak interactions were CP invariant. In 1964, however, it was found that there is a small element of CP violation in $K^0$ decay. CP-violating effects are also expected in $B^0$ decays. The physical origin of CP violation is still not understood, but its importance is that it implies T violation, so that in (at least some) weak interactions, there is an ‘‘arrow of time’’ on the subnuclear scale. (Such an arrow of time is, of course, familiar in thermodynamics.) This is used in a cosmological context to explain baryon–antibaryon asymmetry in the Universe.

Baryon–Antibaryon Asymmetry

In the standard model of cosmology it is shown that applying the known laws of physics to the early Universe (the first few minutes) leads to the conclusion that at an age of 226 s nuclear fusion reactions took place resulting in a mixture of 74% protons and 26% $\alpha$ particles, so that, hundreds of thousands of years later, when galactic condensation took place, it would involve precisely this admixture of hydrogen and helium gases. Just this amount of helium has been found in the Sun, giving great confidence to the ‘‘big bang’’ model. Assuming that at extremely small times the baryon number of the Universe was zero, $B = 0$, and assuming also (a big assumption, but one nevertheless made by cosmologists) that the Universe is made of matter and not antimatter, we may then ask, why is this – where has the antimatter gone?
Surprisingly, this question was addressed as early as 1966 by Sakharov, who showed that, starting with an initial state with $B = 0$, it would be possible to reach a state with $B \neq 0$ as long as three conditions obtained: B violating interactions, CP and C violating interactions, and lack of thermal equilibrium. GUTs and ordinary weak interactions already provide possibilities for the first two of these conditions. Breakdown of thermal equilibrium will be expected to occur as the Universe expands. When the particle density is high, reactions such as $p + \bar{p} \leftrightarrow \gamma + \gamma$ will ensure an equal population of baryons and antibaryons, even in the presence of B violating interactions, but as the density decreases and this reaction rate becomes less than the expansion rate, thermal equilibrium can no longer be maintained. Thus, GUTs offer an explanation of why there is no antimatter in the Universe. It might be thought that this sort of explanation is implausible, since the B-violating and CP-violating forces are so weak, but actually this is not a problem, since the ratio of baryon number to photon number in the Universe is of the order $N_B/N_\gamma \approx 10^{-9}$; so we may conjure up a scenario in which the B and CP violating forces give rise to a volume of space in which there are, say, $10^9$ antibaryons, $10^9 + 1$ baryons and approximately the same number of photons. Then, all the antibaryons become annihilated leaving one baryon and $10^9$ photons – as observed.
A recent development in the area of discrete symmetries has been the suggestion by Kostelecky and coworkers that there might exist spontaneous violation of CPT and Lorentz symmetry.

Topological Charges

Conserved quantities of a quite different type have received a lot of attention in recent decades. Their conservation is a consequence of nontrivial boundary conditions for the fields. A famous example is the sine-Gordon ‘‘kink.’’ The sine-Gordon equation
\[ \frac{\partial^2 \phi}{\partial t^2} - \frac{\partial^2 \phi}{\partial x^2} + \frac{1}{b^2} \sin(b\phi) = 0 \tag{11} \]
describes a scalar field in one space and one time dimension. It is a nonlinear equation which possesses, among others, the interesting solution
\[ f(\xi) = \frac{4}{b} \arctan \exp\left[ \pm (\gamma/\sqrt{b})\,\xi \right] \]
where $\xi = x - vt$ and $\gamma = (1 - v^2)^{-1/2}$. This corresponds to a solitary wave which moves, preserving its shape and size – in distinction to usual waves, which spread out and dissipate. Waves of this type are called solitons, and solitons have in fact been observed moving along canals. In this case, they are solutions to the Korteweg–de Vries equation. Equation [11] clearly possesses the constant solutions
\[ \phi = \frac{2\pi n}{b}, \qquad n = 0, \pm 1, \pm 2, \dots \]
which, it may be shown, all have zero energy. We may then construct a solution of the above type, but with $n = 0$ as $x \to -\infty$ and $n = N$ as $x \to +\infty$. This so-called ‘‘kink’’ solution has finite energy and is not continuously deformable into a solution with $n = 0$ everywhere, since this would involve overcoming an
infinite energy barrier. The ‘‘kink number’’ may be characterized as a charge: defining the current
\[ J^\mu = \frac{b}{2\pi}\,\varepsilon^{\mu\nu}\,\partial_\nu \phi \]
with $\varepsilon^{\mu\nu}$ the totally antisymmetric symbol, it is clear that this is identically conserved, $\partial_\mu J^\mu = 0$. This is a consequence of the definition of $\varepsilon^{\mu\nu}$; it is not a consequence of invariance of the sine-Gordon Lagrangian under a symmetry operation, so the current $J^\mu$ is not a Noether current. The associated conserved charge is
\[ Q = \int J^0\,dx = \frac{b}{2\pi} \int \frac{\partial \phi}{\partial x}\,dx = \frac{b}{2\pi}\,[\phi(\infty) - \phi(-\infty)] = N \]
Models of the above type may be written down in a spacetime with more than two dimensions. In that case the above solution depends only on one coordinate, so represents an infinite planar ‘‘domain wall,’’ on the two sides of which the field assumes different values. Such domain walls, as well as ‘‘cosmic strings,’’ are considered as serious possibilities in cosmology.
Nonabelian gauge theories and the sigma model also provide a fertile ground for topological excitations – field configurations which for topological reasons do not decay. Gauge theories with spontaneous symmetry breaking have two-dimensional solutions corresponding to vortex lines and three-dimensional solutions corresponding to magnetic monopoles. In spacetime (3 + 1 dimensions), there is a solution to the gauge field equations, with no spontaneous symmetry breaking, corresponding to an ‘‘instanton,’’ a finite-energy field configuration, localized in time as well as in space (hence the name). The gauge group here is SU(2), whose group space is $S^3$. Spacetime is ‘‘Euclideanized’’ into $\mathbb{R}^4$, whose boundary is then $S^3$. Asymptotic field configurations may then be characterized by mappings of $S^3$ in field space into $S^3$ in parameter space, and since the third homotopy group of $S^3$ is nontrivial, $\pi_3(S^3) = \mathbb{Z}$, these field configurations belong to different classes and are not deformable into each other. These define ‘‘degenerate vacua’’ of the gauge field equations. In quantum theory, tunneling between these vacua is allowed and ’t Hooft has shown how this may give rise to deuteron decay $d \to e^+ + \bar{\nu}_\mu$. Other examples of topologically nontrivial configurations are so-called sphalerons, which may also contribute to baryon number violation in the early Universe, and skyrmions, constructs in the nonlinear sigma model which serve as a model for baryon number.

Supersymmetry

Supersymmetry is a fermion–boson symmetry, postulating that multiplets of fundamental particles contain both fermions and bosons. Thus, for example, since electrons exist there should also be ‘‘selectrons’’ – ‘‘scalar’’ electrons, with spin 0. There should also be photinos, with spin 1/2, to take their place alongside photons, and so on. If supersymmetry were exact, these particles would have the same mass as their partners and would have all been found, but in fact none have yet been discovered, so presumably supersymmetry is a broken symmetry. The feature that makes supersymmetry attractive is that it holds some promise for solving divergence problems in quantum field theory, since the radiative corrections from fermion and boson loops are opposite in sign and may exactly cancel. Supersymmetric models can also help to solve the so-called hierarchy problem in quantum field theory.
If supersymmetry is made into a local symmetry, rather than simply a global one, extra fields must be introduced (as the photon field was introduced above), and it turns out that one of these is a spin-2 field, which may be identified with the graviton. Local supersymmetry thus becomes supergravity.

General Relativity

Symmetries and conservation laws take on new aspects when general relativity is considered. Einstein’s field equations relate the energy–momentum tensor of matter (and radiation) to the Einstein tensor, which is built from the Ricci tensor of spacetime. The Einstein tensor has vanishing covariant divergence, which means that the energy–momentum tensor possesses the same property, but conservation of energy and momentum requires that it is the ordinary derivative, not the covariant one, of this tensor that should vanish. It might be expected that this problem could be alleviated by including the contribution of the gravitational field itself in the energy–momentum tensor. This is quite reasonable, but then problems of interpretation arise, since at any one point in a general spacetime, a coordinate system might be found which is inertial (this is the force of the equivalence principle), corresponding to no gravitational field, and therefore no energy. The usual procedure is to introduce an energy–momentum ‘‘pseudotensor,’’ and to conclude that energy in a gravitational field is not localizable.
The role of symmetries in general relativity is rather different from their role in particle physics, which is set in Minkowski spacetime. In a general spacetime there are no symmetries, but many examples of particular spacetimes with their own symmetries are now known. The symmetry operations involved are
isometries, with corresponding groups of motion (so that the isometry group of Minkowski space is the Poincaré group). These groups are an important subject of study in cosmology; for example, there is a classification of homogeneous cosmological models, labeled according to the Bianchi classification.

See also: Cotangent Bundle Reduction; Effective Field Theories; Electroweak Theory; General Relativity: Overview; Infinite-Dimensional Hamiltonian Systems; Noncommutative Geometry and the Standard Model; Quantum Field Theory: A Brief Introduction; Quasiperiodic Systems; Sine-Gordon Equation; Supergravity; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Symmetry and Symplectic Reduction; Symmetry Classes in Random Matrix Theory; Topological Defects and Their Homotopy Classification.

Further Reading

Aitchison IJ and Hey AJ (1981) Gauge Theories in Particle Physics. Bristol: Adam Hilger.
Cheng T-P and Li L-F (1984) Gauge Theory of Elementary Particle Physics. Oxford: Clarendon Press.
Eguchi T, Gilkey PB, and Hanson AJ (1980) Gravitation, gauge theories and differential geometry. Physics Reports 66: 213.
Huang K (1998) Quantum Field Theory: From Operators to Path Integrals. New York: Wiley.
Kostelecky VA (ed.) (2004) CPT and Lorentz Symmetry: Proceedings of the Third Meeting. Singapore: World Scientific.
Landau LD and Lifshitz EM (1971) The Classical Theory of Fields. Oxford: Pergamon Press.
Manton N and Sutcliffe P (2004) Topological Solitons. Cambridge: Cambridge University Press.
Perkins DH (2000) Introduction to High Energy Physics, 4th edn. Cambridge: Cambridge University Press.
Review of Particle Properties (2002), Physical Review D 66: 01002, July 2002, Part 1.
Rubakov V (2002) Classical Theory of Gauge Fields. Princeton: Princeton University Press.
Ryder LH (1996) Quantum Field Theory, 2nd edn. Cambridge: Cambridge University Press.
Stephani H (2004) Relativity: An Introduction to Special and General Relativity, 3rd edn. Cambridge: Cambridge University Press.
Weinberg S (1983) The First Three Minutes. London: Fontana.
Wess J and Bagger J (1983) Supersymmetry and Supergravity. Princeton: Princeton University Press.
Wigner E (1939) On unitary representations of the inhomogeneous Lorentz group. Annals of Mathematics 40: 149.

Symmetries in Quantum Field Theory of Lower Spacetime Dimensions

J Mund, Universidade de São Paulo, São Paulo, Brazil
K-H Rehren, Universität Göttingen, Göttingen, Germany
© 2006 Elsevier Ltd. All rights reserved.

Symmetries in Quantum Field Theory

Symmetries have proved to be one of the most powerful concepts in quantum theory, and in quantum field theory in particular. From the beginnings of quantum mechanics, it is well known that the presence of a symmetry allows one to predict relations between different measurements, to classify spectra (energy or other), and to understand the Pauli exclusion principle, to name only a few applications. Much more remarkably, in modern relativistic quantum field theory, designed to describe the interactions of elementary particles, fundamental interactions have been found to be induced by the principle of local gauge invariance.

One distinguishes spacetime symmetries (Poincaré or conformal transformations), which change the position and orientation of the system in space and time, and internal symmetries, which preserve the localization, acting on certain internal degrees of freedom. The Coleman–Mandula (1967) theorem states that internal and spacetime symmetries cannot be mixed, in the sense that the generators of internal symmetries must be Lorentz scalars, hence the total group of symmetries factorizes into a direct product. Supersymmetries are an exception to this theorem because their generators do not form a Lie algebra, and they were in fact designed to circumvent the Coleman–Mandula theorem.

It is well known that the structure of symmetries of quantum systems in low-dimensional spacetime differs significantly from that in four-dimensional spacetime. (“Low” means in our context two or three, depending on the type of charge localization, cf. below.) To name some examples:

• Two-dimensional quantum systems may have much higher symmetries than four-dimensional ones:
  – In two dimensions, there exist massive integrable models with infinitely many conservation laws and factorizable scattering matrices (see Integrability and Quantum Field Theory). These models exhibit solitonic superselection sectors, cf. below.
  – The conformal group of two-dimensional spacetime is infinite dimensional, allowing for
the exact computation of correlation functions with the help of Ward identities (Belavin, Polyakov, and Zamolodchikov 1984). Only the finite-dimensional Möbius group, however, is also a symmetry of the vacuum state. Möbius covariance implies that the theory contains two subtheories of chiral fields defined on the light rays $t - x = \text{constant}$, resp. $t + x = \text{constant}$, and that these can be extended to fields defined on a circle, by adding a “point at infinity” to the light ray (Lüscher and Mack 1976). One arrives thus at one-dimensional chiral quantum field theories on a circle, which will play an important role in the discussion below.
• Continuous symmetries cannot be spontaneously broken in two dimensions. The latter is true not only for relativistic quantum field theory (Coleman 1973), but also in quantum statistical mechanics (Mermin and Wagner 1966) where it is responsible for the absence of ferromagnetism (see Symmetry Breaking in Field Theory). Spontaneous symmetry breakdown requires long-range order which is overcome by thermal fluctuations down to zero temperature, because these diverge logarithmically (in the thermodynamical limit) in two dimensions. This theorem thus illustrates how the spacetime dimension-dependent size of phase space has an effect on internal symmetries of quantum systems. A detailed mathematical analysis of the balance between phase space (thermal fluctuations) and long-range order (symmetry breakdown) has been given in a recent discussion of the Goldstone theorem (Buchholz, Doplicher, Longo and Roberts 1992).
• The Coleman–Mandula theorem, excluding a mixing between internal and spacetime symmetries (see above), is valid only in higher dimensions.

In more recent times, it has become apparent that low-dimensional quantum systems do not only admit more symmetries, but they may exhibit internal symmetries of an entirely new type, not describable by groups of transformations. In this article, we shall focus on the various ways in which the new symmetries can arise, and how they can be understood. In order to properly appreciate these issues, let us first recall some basic symmetry concepts in the conventional case.

In the traditional setting, symmetries arise in the form of groups of transformations of the quantum system which leave observable quantities (e.g., vacuum expectation values and correlation functions) invariant. The symmetries form a group of *-automorphisms of the algebra of fields:

$$\alpha_g(\phi_1 \phi_2) = \alpha_g(\phi_1)\,\alpha_g(\phi_2), \qquad \alpha_g(\phi^*) = \alpha_g(\phi)^*, \qquad \alpha_{g_1} \alpha_{g_2} = \alpha_{g_1 g_2}$$   [1]

(typically given by linear transformations of field multiplets). In the strongest case, the automorphisms are implemented by unitary operators on the state space

$$U(g)\,\phi\,U(g)^* = \alpha_g(\phi)$$   [2]

The implementers form a representation of the group of automorphisms,

$$U(g_1)\,U(g_2) = U(g_1 g_2)$$   [3]

and there is an invariant vector state (a ground state, or the vacuum state in relativistic quantum field theory),

$$U(g)\,\Omega = \Omega$$   [4]

However, depending on the dynamics of the quantum system, these relations cannot always be fully realized. One therefore considers several weaker or more general notions of symmetries relevant in four dimensions:

• Spontaneously broken symmetries. The transformations are given as automorphisms of an algebra, but are not unitarily implemented in a given irreducible representation of the algebra. Invariant pure states do not exist.
• Projective representations. The symmetries are unitarily implemented, but the implementers fail to satisfy the group law [3]. They give rise to ray (projective) representations or representations of a covering group. In particular, an invariant state vector as in [4] cannot exist in an irreducible representation.
• Infinitesimal symmetries. Lie algebras of infinitesimal transformations, given as derivations of an algebra, which cannot be integrated to finite transformations. Derivations may or may not be implemented in a given representation of the algebra by commutators with self-adjoint generators.
• Supersymmetry. The infinitesimal transformations form a graded Lie algebra.
• Local gauge symmetries form an infinite-dimensional group which is, however, not realized as automorphisms of the quantum algebra. Quantization of classical gauge interactions usually proceeds by breaking the gauge invariance in some way and restoring it at a later stage.
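
As a small illustration of relations [1]–[4] and of the notion of a projective representation, the following Python sketch works through a finite-dimensional toy example. The example is an assumption chosen for illustration and is not taken from the article: the group is Z2 × Z2, the "fields" are arbitrary complex 2×2 matrices, and the implementers U(g) are the Pauli matrices. The function names (`alpha`, `cocycle`, `invariant_subspace_dimension`) are hypothetical helpers.

```python
import itertools
import numpy as np

# Illustrative (assumed) example: the group Z2 x Z2, implemented on C^2 by Pauli matrices.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

GROUP = [(0, 0), (1, 0), (0, 1), (1, 1)]
U = {(0, 0): I2, (1, 0): X, (0, 1): Z, (1, 1): Y}


def mul(g, h):
    """Group law of Z2 x Z2: componentwise addition mod 2."""
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)


def alpha(g, phi):
    """Automorphism implemented as in [2]: alpha_g(phi) = U(g) phi U(g)^*."""
    return U[g] @ phi @ U[g].conj().T


def satisfies_relations_1(trials=5, seed=0):
    """Check the three relations [1] on random complex 2x2 matrices."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
        b = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
        for g, h in itertools.product(GROUP, repeat=2):
            if not np.allclose(alpha(g, a @ b), alpha(g, a) @ alpha(g, b)):
                return False
            if not np.allclose(alpha(g, a.conj().T), alpha(g, a).conj().T):
                return False
            if not np.allclose(alpha(g, alpha(h, a)), alpha(mul(g, h), a)):
                return False
    return True


def cocycle(g, h):
    """Phase omega(g, h) defined by U(g) U(h) = omega(g, h) U(gh).

    The group law [3] would require omega(g, h) = 1 for all g, h.
    """
    product = U[g] @ U[h]
    target = U[mul(g, h)]
    omega = np.trace(target.conj().T @ product) / 2  # target is unitary on C^2
    assert np.allclose(product, omega * target)
    return omega


def invariant_subspace_dimension():
    """Dimension of the space of vectors psi with U(g) psi = psi for all g, cf. [4]."""
    stacked = np.vstack([U[g] - I2 for g in GROUP])
    return 2 - np.linalg.matrix_rank(stacked)


if __name__ == "__main__":
    print(satisfies_relations_1())
    print(cocycle((1, 0), (0, 1)), cocycle((0, 1), (1, 0)))
    print(invariant_subspace_dimension())
```

In this toy model the automorphisms still compose according to the group law, because the phases cancel in U(g) φ U(g)*, while the implementers obey [3] only up to a phase and no nonzero vector is invariant under all of them. This is the finite-dimensional analogue of the "projective representations" item in the list above.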
The Connection between Symmetry and Superselection Sectors

It is often convenient to describe a model in terms of localized fields which do not represent an observable (in the sense of quantum mechanics that an operator corresponds to some measurement prescription). For example, Fermi fields which violate the principle of causality because they anticommute with each other at spacelike distance rather than commute are not observables. Only fields which are quadratic in the Fermi fields (densities of charge, current, energy) are observables. This means that an internal symmetry is used in order to distinguish the observables as those operators which are invariant under the symmetry: in the example, the symmetry transformation multiplies each Fermi field by −1 (by the spin-statistics theorem, this transformation coincides with the univalence of the Lorentz group). We characterize this situation by writing

$$A(O) = F(O)^G$$   [5]

where A(O) and F(O) stand for the algebras of observables and fields localized in some spacetime region O, respectively, G is the internal symmetry group acting by automorphisms on each F(O) without affecting the localization, and $F(O)^G \equiv \{a \in F(O) : \alpha_g(a) = a \text{ for all } g \in G\}$ denotes the subalgebra of invariants. The internal symmetry group G which distinguishes the observables according to [5] is usually called the “(global) gauge group.”

If the gauge symmetry G is unbroken in the vacuum state, then there is a well-known connection between symmetry and superselection rules (see Symmetries and Conservation Laws): namely, the observables act reducibly on the vacuum Hilbert space representation of F because they commute with the unitary operators which implement the symmetry (or with their infinitesimal generators, usually called charges). As a consequence, the validity of the superposition principle is restricted because two eigenstates of different eigenvalues of the charges cannot exhibit interference. In other words, they belong to different superselection sectors. Wick, Wightman, and Wigner (1952) were the first to point out this relation. We therefore call this scenario the “WWW scenario” for brevity.

In the WWW scenario, the decomposition of the Hilbert space is determined by the central decomposition of the internal symmetry group (the eigenvalues of the Casimir operators). In this way, the superselection sectors are in one-to-one correspondence with the irreducible representations of the internal symmetry group.

Superselection sectors of two-dimensional models do not follow this scheme expected by the WWW scenario (see below). This was most strikingly demonstrated through the classification of the unitary highest-weight representations of the Virasoro algebra (Friedan, Qiu, and Shenker) which is nothing other than the classification of the superselection sectors of the observable algebra generated by the chiral stress–energy tensor, and through the determination of their fusion rules by Belavin, Polyakov, and Zamolodchikov (1984).

In two dimensions, one is therefore lacking a compelling a priori ansatz, like the WWW scenario, for describing the system in terms of auxiliary nonobservable charged fields. At this point, one may argue that from an operational point of view, a quantum field theory, and in particular its symmetries, should be understood entirely in terms of its observables. (This viewpoint is emphasized in the algebraic approach to QFT, see Algebraic Approach to Quantum Field Theory.) We shall therefore now ask the opposite question: suppose we are given an algebra A of local observables (without knowledge of a field algebra and its gauge group). We define the superselection sectors intrinsically as (the unitary equivalence classes of) the positive-energy representations of A. Then the question is: do these sectors arise through a WWW scenario from some field algebra and a gauge symmetry, and if so, can the latter be reconstructed from the given observables alone?

The answer in four dimensions is positive, thanks to a deep result due to Doplicher and Roberts (1990). Let us sketch the line of reasoning leading to this result in some detail, because it shows how the connection between (global) gauge symmetry on the one hand and spacetime geometry on the other hand emerges through the principle of causality (locality) of relativistic quantum field theory, and because it makes apparent what is different in low-dimensional spacetime.

The analysis is based on the general structure theory of superselection sectors due to Doplicher, Haag, and Roberts (DHR, 1971). The latter starts with a selection criterion invoking the concept of a localized charge: a superselection sector which by measurements within the causal complement of some spacetime region O cannot be distinguished from the vacuum sector. The heuristic idea is, of course, that the sector is obtained from the vacuum sector by placing some charge in the region O (e.g., by the application of a localized charged field operator to the vacuum vector).

It has been shown (Buchholz and Fredenhagen 1982) that positive-energy representations of
massive theories always satisfy this selection criterion with a localization region O of the form of a narrow cone extending in spacelike direction. (In massless theories with long-range interactions, such as QED, the situation is more complicated because the charge creates an electric field whose flux at infinity does not vanish (Gauss’ law) and is not Lorentz invariant.) DHR assume that the localization region is even compact, and can be chosen arbitrarily within the unitary equivalence class of the representation.

Exploiting a strong version of locality (Haag duality) for the vacuum representation of the observables, DHR proceed to define an associative composition (or fusion) law for positive-energy representations. This law is commutative only up to unitary equivalence. The crucial point is that the unitary intertwiner establishing this equivalence (the statistics operator) can be chosen in a unique way provided any pair of spacelike disconnected localization regions can be continuously deformed into any other such pair.

This point marks the separation between high and low dimensions. In two dimensions, in each pair of spacelike disconnected regions, one region is to the left of the other, thus distinguishing the pair $(O_1, O_2)$ from $(O_2, O_1)$. Consequently, they cannot be deformed into each other, and there arise two statistics operators. The same holds in three dimensions when the localization regions are spacelike cones, and $O_1$, $O_2$ are taken within (the causal complement of) some larger spacelike cone. If the spacetime dimension is at least 4, or if in three dimensions the localization regions are compact, then the statistics operator is unique and, as a consequence, coincides with its inverse.

The (non-)uniqueness of the statistics operator has far-reaching consequences concerning our original question about the underlying gauge symmetry. Namely, the DHR analysis proceeds to show that the set of positive-energy representations equipped with the composition law, and the linear spaces of intertwiners between different representations, together form the mathematical structure of a C* tensor category. The statistics operators which are distinguished intertwiners give additional structure to this category: this structure is called a (permutation) symmetry if the statistics operators coincide with their inverse, and it is called a braiding otherwise. (It gives rise to a representation of the permutation group or the braid group, respectively.) In other words, the spacetime topology, through the intervention of the uniqueness of the statistics operator, causes the tensor category to be symmetric in high dimensions, and braided in low dimensions.

At a more elementary level, one may think of statistics operators as reflecting commutation relations between the searched-for charged fields. Making an ansatz for the commutation relations at spacelike separation, essentially the same topological argument as before implies, together with Poincaré invariance, that the coefficients appearing in this relation should form a representation of the permutation group, or of the braid group, respectively. The DHR approach, however, is entirely intrinsic, avoiding any a priori assumption of charged fields.

The duality theorem due to Doplicher and Roberts (1990) now states that every symmetric C* tensor category (with some further qualifications valid in the DHR setting) is isomorphic to the category of unitary representations of a compact group, in which the composition law is the tensor product and the (permutation) symmetry is the natural one. Moreover, the category uniquely determines the group, and by a crossed product construction (an action of the category on the algebra A) one reconstructs a field algebra F such that [5] holds. If fermionic sectors are present, then there is some arbitrariness in the commutation relations among the corresponding fermionic fields, which can be exploited to produce the normal commutation relations (fermionic fields anticommute among each other, and bosonic fields commute with any field at spacelike separation). This fixes the field algebra F up to unitary equivalence. The conclusion is that the WWW scenario is the most general in four dimensions (apart from the reservations due to long-range forces, see above).

Generalized Symmetries in Low Dimensions

In view of the success of this program in four dimensions and the advantage of the WWW scenario for model building, the obvious challenge is to search for an analogous understanding of superselection sectors (charges) in low dimensions in terms of an algebra of charged fields and a gauge symmetry distinguishing the observables. This gauge symmetry cannot, in general, be a group for several reasons:

• As stated before, the tensor category of superselection sectors possesses only a braiding, rather than a (permutation) symmetry, hence the duality theorem fails.
• One can associate a (statistical) dimension $d_\rho$ to each superselection sector $[\rho]$ which is multiplicative under the composition law (fusion), and additive under direct sums. In a symmetric
category, the dimensions are necessarily positive integers. Indeed, in the WWW scenario, they coincide with the naive dimension of the associated representation of the gauge group. But in the low-dimensional models, the dimensions turn out to be noninteger in general.
• Moore and Seiberg (1988) have axiomatized the superselection structure of chiral and two-dimensional conformal field theories in terms of a system of recoupling and braiding coefficients controlling the fusion of sectors and its noncommutativity. (In fact, this system is basically equivalent to the DHR category.) For models such as SU(2) current algebras at level k, these coefficients turn out to coincide with the recoupling and braiding coefficients one can associate with a quantum group deformation (Drinfel’d 1986) of SU(2) with deformation parameter $q = \exp(i\pi/k)$. Representations of quantum groups (quasitriangular Hopf algebras, see Hopf Algebras and q-Deformation Quantum Groups) have a tensor product defined in terms of a noncocommutative coproduct. Moreover, they possess a quantum dimension which is a q-deformation of an integer. The quantum dimensions precisely match the statistical dimensions of the superselection sectors. All this strongly suggests that quantum groups appear as generalized symmetries in two dimensions, at least in a large class of models.

A natural testing ground for the search for appropriate generalized symmetry concepts in low dimensions is the abundance of models in chiral and two-dimensional conformal QFT (see Two-Dimensional Models). As mentioned before, conformal symmetry in two dimensions has far-reaching consequences, especially the existence of chiral quantum fields which are defined on a one-dimensional light ray. As a null direction in the two-dimensional spacetime, this ray unites both the spacelike property of carrying a causal structure, and the timelike property that the generator of translations has positive spectrum (energy). These two features together with Möbius covariance are so powerful that they allow for the exact construction of large classes of models. The most elementary ones (minimal models) are completely described by the chiral stress–energy density field, that is, the local generator of the conformal symmetry. Other models also contain currents which are the local generators of internal symmetries. These models exhibit many nontrivial superselection structures, which illustrate the wide range of possible deviations from higher-dimensional QFT, and at the same time exhibit possible approaches to appropriate symmetry concepts in low dimensions.

Attempts to classify the possible algebraic structures of generalized internal symmetries in a model-independent setting start from the idea that the representation category of the internal symmetries of a given model should be equivalent to the tensor category of its superselection sectors. Several algebraic structures have been proposed as candidates, complying with this idea. They all assume specific modifications or deformations of eqns [1]–[5] above, highly constrained by self-consistency. Among these proposals are:

• quantum groups (see e.g., Fröhlich and Kerler 1993),
• weak quasiquantum groups (Mack and Schomerus 1992) and rational Hopf algebras (Fuchs et al. 1994),
• weak C* Hopf algebras (Rehren 1997, Böhm and Szlachányi 1996) or quantum groupoids (Nikshych and Vainerman 1998), and
• braided groups (Majid 1991).

In several cases, the respective “symmetry algebra” can be reconstructed from the tensor category of superselection sectors, and a field algebra with linear transformation behavior can be constructed which contains the observables as invariant elements as in [5]. However, the situation is unsatisfactory for various reasons. First, the class of QFT models for which these constructions have been performed is quite restricted (most constructions work only for rational models, i.e., models with a finite set of charges); second, the reconstructed symmetry algebra is not unique; and finally, the constructed field algebras have features which diverge significantly from the WWW scenario. For example, it is not always guaranteed that the quantum symmetries are consistent with the *-structure, indispensable for Hilbert space positivity (a necessary prerequisite for the probability interpretation of quantum theory). Moreover, typically there are global gauge transformations which are implemented by localized field operators, thus exhibiting a mixing of local and global concepts. It also happens that this holds for elements in the center of the symmetry algebra, which implies that the field algebra is not local relative to its gauge-invariant elements, that is, the charged fields do not commute with the gauge-invariant elements at spacelike separation. In other constructions, the field algebra is not associative, or there are no finite field multiplets.

Historically, the first candidate for a “symmetry algebra” compatible with braid group statistics has
been the structure of a quantum group, as mentioned above. However, in physically interesting models, the quantum group is not semisimple and thus has too many (namely, indecomposable) representations. Solutions to this problem have been:

1. A BRS approach in an indefinite-metric framework (Hadjiivanov et al. 1991),
2. “Truncation,” that is, discarding the “unphysical” representations. Fröhlich and Kerler (1993) have done this consistently in a categorical framework. In fact, they have given a complete classification of the possible braided tensor categories generated by a single irreducible object with statistical dimension d satisfying 1 < d < 2, in terms of categories constructed from the “truncated” representations of $U_q(sl_2)$. Truncation can also be performed by dividing the quantum group itself through the ideal which is annihilated by all “physical” representations, leading to a weak quasiquantum group (Mack and Schomerus 1992).
3. Relaxing the axioms, thus admitting the more general structures mentioned above.

All the above approaches assume a given generalized symmetry concept and show to what extent field algebras complying with it can be constructed. They thus concern nonobservable objects, and it is no contradiction if different symmetry concepts can be associated with the same observable data.

A more radical concept of global gauge symmetry, applicable to the low-dimensional case, has been developed by Longo and Rehren (1995). Its point of departure is the notion of a conditional expectation, which has the same abstract properties as a group average. In the WWW scenario, the Haar measure of the compact gauge group defines an average

$$\mu : F \ni \phi \mapsto \int d\mu(g)\, \alpha_g(\phi) \in A$$   [6]

which is a positive linear map respecting the localization, and the observables are invariant, $\mu(a) = a$. In fact, the observables are exactly the image of this map, that is, [5] is equivalently formulated, but without reference to the group transformations, as

$$A(O) = \mu(F(O))$$   [7]

Turning to the observables A of a quantum field theory in low dimensions, one looks for a quantum field theory F, containing A and equipped with a conditional expectation $\mu$ such that [7] holds, and which preserves the vacuum state. F may not satisfy local commutativity, but it should be local relative to the observables in the sense mentioned before. In rational chiral CFT, such extensions can be classified (and indeed constructed) in terms of the superselection category of A, giving direct access to the decomposition of the vacuum Hilbert space of F into superselection sectors of A. The advantage here is that no problems with Hilbert space structure can arise (because the approach is entirely in terms of operator algebras); a drawback is that in general F is not unique, and nonvacuum representations of F also have to be considered in order to generate all sectors of A.

The method can be used to classify and construct both nonlocal chiral extensions as candidates for sector-generating field algebras for a theory A of chiral observables, and local two-dimensional quantum field theories containing two given chiral subtheories, that is, observable algebras of two-dimensional models (Kawahigashi and Longo 2004). The chiral sector structure of the latter models is described by a “modular invariant.” In many cases, this means that their thermal partition functions are invariant under the group $PSL(2, \mathbb{Z})$ of modular transformations of the temperature (see below).

At this point, another link between spacetime and internal symmetries may be noted. The modular theory of von Neumann algebras (see Tomita–Takesaki Modular Theory) associates a one-parameter group of automorphisms (called the “modular group”) with a state and an algebra “in standard position.” In quantum field theory, for the vacuum state and an algebra of observables localized in certain wedge regions of Minkowski spacetime, this group can be identified with a boost subgroup of the Lorentz group (Bisognano and Wichmann 1975). Similarly, in chiral CFT on the circle, the modular group associated with the observables in an interval and the vacuum coincides with a subgroup of the Möbius group. For nonlocal theories, there may be an obstruction, however. On the other hand, if a subalgebra is stable under the modular group of some algebra, then there is a conditional expectation from the larger algebra onto the smaller algebra. Combining these general theorems, the Möbius covariance of the inclusions $A(O) \subset F(O)$ implies the existence of a conditional expectation, that is, the above generalization of the average over the internal symmetry. Moreover, assuming a generalized notion of compactness (“finite index”) for the generalized internal symmetry, the Bisognano–Wichmann property holds also for nonlocal theories (Longo and Rehren 2004).

Of course, there is also a WWW scenario in chiral theories, that is, one may restrict a local theory to its invariants under some group of internal gauge
symmetries ("orbifold models"). It then happens that the invariants not only have the expected superselection sectors in correspondence with the representations of the gauge group, but in addition "twisted" sectors appear which, together with the former, constitute a "quantum double" structure. The twisted sectors arise by restriction of solitonic sectors of the original theory, which are in one-to-one correspondence with the elements of the gauge group (Müger 2005). Solitonic sectors are localizable with respect to two different vacua, and do not admit an unrestricted composition law.

Special Issues

A particularly simple situation is the case of anyons, that is, when all sectors have statistical dimension 1. Then the sectors form an abelian group $\hat{G}$ under fusion, and one can construct a WWW scenario with global gauge group $G$ the dual of $\hat{G}$. The ensuing quantum fields satisfy generalized commutation relations at spacelike separation, given by an abelian representation of the braid group, where the coefficients can be arbitrary complex phases (responsible for the name "anyons"). However, it is known that there can arise an obstruction, which enforces the "local" global gauge transformations (mentioned before) to be present. In this case, the gauge symmetry can also be described by a quasiquantum group. It is noteworthy that free anyon fields have been constructed in two-dimensional spacetime, while in three dimensions there can be no (cone-)localized massive anyon fields which are free in the sense that they generate only single-particle states from the vacuum (Mund 1998).

The charge structure of massive quantum field theories in two dimensions is very different both from that encountered in conformal quantum field theories, and from the charge structure in high dimensions. It has been observed long ago that, in contrast to four dimensions, the strong locality property (Haag duality) which is necessary to set up the DHR analysis of superselection sectors, fails for the algebra of invariants under an internal gauge group in two dimensions. This algebraic feature can be traced back to the fact that the causal complement of a point is disconnected in two dimensions, or, in physical terms, that "a charge cannot be transported around a detector" without passing through its region of causal dependence. Müger (1998) has shown that any algebra of observables which satisfies Haag duality, cannot possess any nontrivial DHR superselection sectors at all, and that the only sectors which can exist are solitonic sectors. This general result nicely complies with the experience with integrable models, as mentioned before.

There are also some results giving interesting insight, which can be obtained intrinsically in terms of the observables. One of them concerns "central" observables (generalized Casimir operators). Casimir operators in the WWW scenario are functions of the generators of the internal symmetry which usually are integrals over densities belonging to the field algebra $F$ (Noether's theorem). Since they also commute with the generators, they can be approximated by local observables, and are therefore defined in each representation of the latter. By Schur's lemma, they are multiples of the identity in each irreducible sector. Since the eigenvalues of Casimir operators distinguish the representations of the gauge group, they also distinguish the sectors.

In chiral CFT extended to the circle (see above), one can find global "charge measuring operators" $C_i$, one for each sector $\pi_i$, in the center of the observable algebra (Fredenhagen et al. 1992) which have similar properties. They arise as a consequence of an algebraic obstruction to define the charged sectors on the circle, related to a nontrivial effect if a charge is "transported once around the circle," and form an operator representation of the fusion rules within the global algebra of observables. Under rather natural conditions clarified by Kawahigashi, Longo, and Müger (2001), the matrix of eigenvalues $\pi_j(C_i)$ is nondegenerate, that is, the generalized Casimir operators completely distinguish the superselection sectors. In this case, the superselection category is a modular category (see Braided and Modular Tensor Categories): the matrix with entries $d_j\,\pi_j(C_i)$ and the diagonal matrix with entries $\pi_j(U)$ (where $U$ is the Möbius rotation by $2\pi$) are multiples of the generators $S$ and $T$ of the "modular group" $PSL(2, \mathbb{Z})$, in a matrix representation labeled by the superselection sectors of the chiral observables. The physical significance of this matrix representation is that it relates thermal expectation values for different values of the temperature (Cardy 1986, Kac and Peterson 1984, Verlinde 1988).

These examples, together with the failure of the Coleman–Mandula theorem, may illustrate the intricate relations among spacetime geometry, covariance, and internal symmetry (charge structure) in low dimensions. In relativistic quantum field theory, the link is provided by the principle of locality, which "turns geometry into algebra."

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Braided and Modular Tensor Categories; Hopf Algebras and q-Deformation

Quantum Groups; Integrability and Quantum Field Theory; Quantum Field Theory: A Brief Introduction; Quantum Fields with Topological Defects; Symmetries and Conservation Laws; Symmetries in Quantum Field Theory: Algebraic Aspects; Symmetry Breaking in Field Theory; Tomita–Takesaki Modular Theory; Two-Dimensional Conformal Field Theory and Vertex Operator Algebras; Two-Dimensional Models.

Further Reading

Böhm G and Szlachányi K (1996) A coassociative C*-quantum group with non-integral dimensions. Letters in Mathematical Physics 35: 437–448.
Coleman S and Mandula J (1967) All possible symmetries of the S-matrix. Physical Review 159: 1251–1256.
Doplicher S and Roberts JE (1990) Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Communications in Mathematical Physics 131: 51–107.
Fredenhagen K, Rehren K-H, and Schroer B (1992) Superselection sectors with braid group statistics and exchange algebras II: geometric aspects and conformal covariance. Reviews in Mathematical Physics SI1: 113–157.
Fröhlich J and Kerler T (1993) Quantum Groups, Quantum Categories, and Quantum Field Theory. Lecture Notes in Mathematics, vol. 1542. Berlin: Springer.
Fuchs J, Ganchev A, and Vecsernyés P (1994) On the quantum symmetry of rational field theories. Theoretical Mathematical Physics 98: 266–276.
Hadjiivanov LK, Paunov RR, and Todorov IT (1991) Quantum group extended chiral p-models. Nuclear Physics B 356: 387–438.
Longo R and Rehren K-H (1995) Nets of subfactors. Reviews in Mathematical Physics 7: 567–597.
Mack G and Schomerus V (1992) Quasi Hopf quantum symmetry in quantum theory. Nuclear Physics B 370: 185–230.
Majid S (1991) Braided groups and algebraic quantum field theories. Letters in Mathematical Physics 22: 167–175.
Moore G and Seiberg N (1989) Classical and quantum conformal field theory. Communications in Mathematical Physics 123: 177–254.
Müger M (1998) Superselection structure of massive quantum field theories in 1 + 1 dimensions. Reviews in Mathematical Physics 10: 1147–1170.
Mund J (1998) No-go theorem for "free" relativistic anyons in d = 2 + 1. Letters in Mathematical Physics 43: 319–328.
Rehren K-H (1997) Weak C* Hopf symmetry. In: Doebner H-D and Dobrev VK (eds.) Group Theoretical Methods in Physics, pp. 62–69. (q-alg/9611007). Sofia: Heron Press.
Schomerus V (1995) Construction of field algebras with quantum symmetries from local observables. Communications in Mathematical Physics 169: 193–236.
Wick GC, Wightman AS, and Wigner EP (1952) The intrinsic parity of elementary particles. Physical Review 88: 101–105.

Symmetries in Quantum Field Theory: Algebraic Aspects


J E Roberts, Università di Roma "Tor Vergata," Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

This article treats the most important results and concepts relating to symmetry and conservation laws in quantum field theory. It includes such results as Wigner's theorem, Goldstone's theorem, the Bisognano–Wichmann theorem, the quantum Noether theorem, and the theorem on the existence of gauge groups and a field net. It is written within the framework of algebraic quantum field theory, this being the simplest setting capable of expressing all these concepts and results.

Symmetries come in many guises. They are to a physical system what automorphisms are to a mathematical theory. In fact, when a physical system is described in mathematical terms, its symmetries correspond to the automorphisms of the mathematical structure and in particular form a group, its symmetry group. The reader should bear in mind this simple picture throughout its diverse variations. Readers unfamiliar with the mathematical terminology should consult the appendix.

Elementary Quantum Mechanics

Before turning to quantum field theory, let us comment on symmetries in elementary quantum mechanics. These systems have the density matrices, that is, positive operators of trace 1, on an infinite-dimensional separable Hilbert space as states, the self-adjoint operators as observables. The expectation value of the bounded observable $A$ in the state determined by $\rho$ is given by $\operatorname{tr} \rho A$. Having specified the mathematical structure, the notion of symmetry follows. With a suggestive notation, it is a pair of mappings $A \mapsto \alpha A$, $\rho \mapsto \rho\alpha^{-1}$ such that

$$\operatorname{tr}\, \rho\alpha^{-1}\, \alpha A = \operatorname{tr}\, \rho A$$

for all observables $A$ and states $\rho$.

If we take $\rho$ and $A$ to be the projections onto $\mathbb{C}\Phi$ and $\mathbb{C}\Psi$ for unit vectors $\Phi$ and $\Psi$, then the above condition corresponds to the conservation of

transition probabilities $|(\Phi, \Psi)|^2$. This formed the starting point for Wigner's analysis, who concluded:

Theorem Every symmetry is of the form $A \mapsto UAU^{-1}$ and $\rho \mapsto U\rho U^{-1}$, where $U$ is a unitary or antiunitary operator.

As could have been foreseen from the outset, this simple result in no way distinguishes one elementary quantum-mechanical system from another. A more useful notion of symmetry results if the Hamiltonian is reckoned as part of the information describing the system and, therefore, has to be left invariant by a symmetry. The operator $U$ above must therefore satisfy the condition $UHU^{-1} = H$, that is, it commutes with the Hamiltonian. As the Hamiltonian is the generator of time translations, $U$ is a constant of motion. This is the genesis of the relation between symmetries and conservation laws.

Quantum Field Theories

The simplest types of quantum field theories can be described by von Neumann algebras $\mathcal{A}(\mathcal{O})$ depending on double cones $\mathcal{O}$ and subject to

$$\mathcal{O}_1 \subset \mathcal{O}_2 \;\Rightarrow\; \mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)$$

a structure referred to as the net of observables. An alternative approach would be to use the Wightman formalism. This would need a discussion of pointlike fields and the domains of definition of unbounded operators, thus complicating a general exposition of symmetry.

Comparing this description of a quantum field theory with that of an elementary quantum-mechanical system, the net clearly takes the place of the observables, but nothing has yet been said about states. Since the set of double cones is directed under inclusion, the union of the $\mathcal{A}(\mathcal{O})$ is a $*$-algebra $\mathcal{A}$ and a state of our system is a state on this algebra.

Most states are of no physical relevance. A characterization of the states of physical relevance, even say to elementary particle physics, is not known although some progress has been made.

The net structure is the hallmark of a field theory and allows us to distinguish two important classes of symmetries. An internal symmetry $\alpha$ satisfies the condition

$$\alpha(\mathcal{A}(\mathcal{O})) = \mathcal{A}(\mathcal{O})$$

for all double cones $\mathcal{O}$. By contrast, a spacetime symmetry is an automorphism $\alpha_L$ implementing a Poincaré transformation $L$ and hence satisfying the condition

$$\alpha_L(\mathcal{A}(\mathcal{O})) = \mathcal{A}(L\mathcal{O})$$

for every double cone $\mathcal{O}$. It is usually the case that internal symmetries commute with spacetime symmetries.

The state of prime relevance to elementary particle physics is the vacuum state $\omega_0$. The corresponding Gelfand–Naimark–Segal (GNS) representation $\pi_0$ is called the vacuum representation. Now the vacuum state of a quantum field theory is typically unique and as such invariant under a symmetry of the system, $\omega_0\alpha^{-1} = \omega_0$.

Spacetime Symmetries

Since the vacuum state is invariant, we have a unitary representation of the Poincaré group implementing the spacetime symmetries in the vacuum representation. To illustrate the role of representations up to a factor, we take instead the GNS representation of a pure state corresponding to a particle of half-integral spin. Here we need a unitary representation of the covering group of the Poincaré group, inhomogeneous $SL(2, \mathbb{C})$, to implement the symmetries. The situation for the subgroup of rotations is the same.

The most important property of these representations is positivity of the energy. More precisely, in a representation of relevance to elementary particle physics such as the vacuum representation, the generator $P_0$ of time translations is a positive operator, $P_0 \geq 0$. Expressed in a frame-independent way, the spectrum of spacetime translations is contained in the closed forward light cone. It is one of the basic principles to be exploited in applying quantum field theory to elementary particle physics. Notice that the principle is no longer valid for an equilibrium state.

A similar situation arises in conformal field theory. Here the role of double cones in Minkowski space is played by intervals on the circle and that of the Poincaré group by the Möbius group on the circle $PSL(2, \mathbb{R})$. Again, the Möbius group cannot always be unitarily implemented and conformal invariance is defined via a continuous unitary representation of its covering group. Most importantly, there is an analog of positivity of the energy. The generator of rotations of the circle is a positive operator.

A remarkable aspect of spacetime symmetries was discovered by Bisognano and Wichmann in an application of modular theory in the field-theoretical context looking not at double cones but at wedges. A wedge $W$ is a Poincaré transform of the standard wedge $x^1 > |x^0|$. They found that the modular automorphisms of $\mathcal{A}(W)$ and the vacuum vector $\Omega_0$

have a geometric significance. For the standard wedge, they got the following result.

Theorem If the net is derived from Wightman fields, the modular operator is $e^{-2\pi K}$, where $K$ is the generator of boosts in the 1-direction and the modular conjugation is $Z\Theta R$, where $\Theta$ is the TCP-operator, $R$ is the rotation through $\pi$ about the 1-axis, and $Z$ is the unitary operator equal to 1 on the Bose subspace and $i$ on the Fermi subspace.

The modular data for $\mathcal{A}(\mathcal{O})$ and $\Omega_0$ also admit a geometric interpretation for the free massless scalar field.

These facts enhance our understanding of spacetime symmetries. The ideas have meanwhile been applied to curved spacetime to select a state with vacuum-like properties using the principle of the geometric action of the modular conjugation.

Gauge Symmetry

Gauge symmetries do not fit into our scheme in that they act trivially on the observable algebra $\mathcal{A}$. To exhibit a gauge symmetry we need a larger net $\mathcal{F}$ called the field net. The gauge group $G$ will be the group of automorphisms of $\mathcal{F}$ leaving the subnet $\mathcal{A}$ pointwise fixed and $\mathcal{A}$ the subnet of $\mathcal{F}$ of fixed points under $G$. This has the merit of indicating the mathematical framework for gauge symmetry but otherwise begs important questions. A priori one does not know what properties $\mathcal{F}$ should have nor how it should be constructed.

The right approach is to understand what intrinsic structure of $\mathcal{A}$ governs the existence of a nontrivial gauge group. This brings us back to the states or representations relevant to elementary particle physics. A condition for selecting some of these relevant representations is that asymptotically they be like the vacuum in spacelike directions. More precisely, $\pi$ must be unitarily equivalent to the vacuum representation $\pi_0$ on the spacelike complement of every double cone.

The resulting theory of superselection sectors hinges on the property of Haag duality that, for each double cone $\mathcal{O}$,

$$\mathcal{A}(\mathcal{O}) = \mathcal{A}(\mathcal{O}')'$$

where $\mathcal{O}'$ denotes the spacelike complement of $\mathcal{O}$. It implies that every representation satisfying the selection criterion is unitarily equivalent to one of the form $\pi_0\rho$, where $\rho$ is an endomorphism of $\mathcal{A}$ localized in some fixed but arbitrary double cone, that is, $\rho(A) = A$ if $A \in \mathcal{A}(\mathcal{O}')$. The endomorphisms thus obtained are closed under composition and hence the objects of a full tensor subcategory $\mathcal{T}$ of the category of all endomorphisms and their intertwiners. There is a dimension function $d$ defined on the objects of $\mathcal{T}$, $d(\rho) = 1, 2, \ldots, \infty$. If $\mathcal{T}_f$ denotes the full subcategory whose objects have finite dimension, then the following result holds.

Theorem $\mathcal{T}_f$ is equivalent to the tensor category of finite-dimensional continuous unitary representations of a canonical compact group $G$. There is a canonical field net $\mathcal{F}$ with Bose–Fermi commutation relations extending $\mathcal{A}$ such that $G$ is the group of automorphisms of $\mathcal{F}$ leaving $\mathcal{A}$ pointwise fixed.

The first step in the proof is to define and analyze the statistics of the representations in question. The statistics of an irreducible representation $\rho$ can be classified as being para-Bose or para-Fermi of order $d(\rho)$. The second step is to show that each $\rho$ of finite dimension has a well-defined conjugate up to equivalence. The third and most difficult step is showing that $\mathcal{T}_f$ can be embedded in the tensor category of Hilbert spaces.

The Local Implementation of Symmetries

Gauge symmetry has its associated conservation laws in that the different sectors of the last section are labeled by conserved quantities such as baryon number, lepton number, or electric charge, generically called charges. The theory is built round the idea of creating charge and elements of the field net carry charges. But there should be a dual approach based on measuring charges. One would like to prove the existence of local conserved currents corresponding to these charges. This has not proved possible but there is a good substitute, described below, which can be regarded as a weak version of a quantum Noether theorem.

If $\mathcal{O}_1 \subset \mathcal{O}_2$ is a strict inclusion of double cones, then the theory is said to satisfy the split property if there is a type I factor $\mathcal{M}$ such that

$$\mathcal{A}(\mathcal{O}_1) \subset \mathcal{M} \subset \mathcal{A}(\mathcal{O}_2)$$

where a type I factor is a von Neumann algebra isomorphic to some $B(\mathcal{H})$. In this case $\mathcal{M}$ can be chosen in a canonical fashion and there is an isomorphism $\psi$ called the universal localizing map of $B(\mathcal{H})$ onto $\mathcal{M}$, where $\mathcal{H}$ is the underlying Hilbert space. We have $\psi(A) = A$ for $A \in \mathcal{A}(\mathcal{O}_1)$.

Theorem If $U$ is an implementing representation of the internal symmetry group $G$, $\psi(U)$ will be a representation of $G$ in $\mathcal{M}$ that continues to implement the symmetry on $\mathcal{A}(\mathcal{O}_1)$. If $G$ is a Lie group

then the infinitesimal generators in the representation are an analog of locally integrated current densities.

Spontaneously Broken Symmetry

The standard physical example of a spontaneously broken symmetry is magnetization. Despite the overall rotational symmetry, a magnet picks out a preferred direction as its direction of magnetization. The chosen state breaks the symmetry.

The phenomenon of spontaneously broken symmetry involves an interplay of symmetries and certain classes of states, vacuum states, ground states, or equilibrium states. If such an $\omega$ is induced by a vector cyclic and separating for a local algebra $\mathcal{A}(\mathcal{O})$, then, as explained in the appendix, given $\mathcal{O}$, modular theory yields a canonical unitary representation $V$ of the internal symmetry group $G$:

$$\alpha_g(A) = V_g A V_g^{*}, \quad A \in \mathcal{A}(\mathcal{O})$$

The results concern the breaking of a one-parameter group $\lambda \mapsto \alpha_\lambda$ of symmetries. More precisely, one asks whether $\omega\delta = 0$ or not, where $\delta$ is the infinitesimal generator of $\lambda \mapsto \alpha_\lambda$,

$$\delta(F) = \lim_{\lambda \to 0} \lambda^{-1}\big(\alpha_\lambda(F) - F\big)$$

where norm convergence is understood and holds on a dense domain. $\delta$, the derivation, is an infinitesimal symmetry. Goldstone first showed that the spontaneous breaking of such symmetries requires the presence of massless bosons. The following result is taken from a more modern treatment. $\mathcal{O}_R$ here denotes the double cone whose base is the ball in $t = 0$ of radius $R$ centered on the origin and $D$ the domain of $\delta$.

Theorem Let $\delta$ be a derivation on a field net $\mathcal{F}$ in $s > 1$ spatial dimensions such that for $F \in \mathcal{F}(\mathcal{O}_R) \cap D$

$$|\omega_0\delta(F)| \leq c_{R,\varepsilon}\big(\|F\Omega\| + \|F^{*}\Omega\|\big) + \varepsilon\|F\|$$

(i) If $\liminf_{R \to \infty} c_{R,\varepsilon} R^{(s-1)/2} = 0$, then $\omega_0\delta = 0$.
(ii) If $\liminf_{R \to \infty} c_{R,\varepsilon} R^{(s-1)/2} < \infty$, then $\omega_0\delta \neq 0$ is only possible if the spectrum of the translations coincides with the forward light cone $V_+$ and the boundary $\partial V_+ \setminus \{0\}$ has non-trivial spectral measure (i.e., there are massless particles in the theory).
(iii) If $c_{R,\varepsilon}$ is polynomially bounded in $R$, then $\omega_0\delta \neq 0$ is only possible if the spectrum of translations coincides with $V_+$ but there are not necessarily any massless particles.

Symmetries of the S-matrix

Scattering theory not only allows one to construct the multiparticle scattering states but also shows that internal symmetries and spacetime symmetries continue to act on these states and are therefore symmetries of the S-matrix. We can, however, ask what are all the symmetries of the S-matrix. An answer was provided by Coleman and Mandula, who showed that, when there is nontrivial scattering, there are no further symmetries of the S-matrix.

Appendix

In an effort to make this article more self-contained, this appendix collects together a few simple pertinent concepts and results from the theory of operator algebras. A C*-algebra is a $*$-algebra $\mathcal{A}$ with a norm $\|\cdot\|$ making it into a Banach algebra and satisfying

$$\|A^{*}A\| = \|A\|^2$$

for every $A \in \mathcal{A}$. Any C*-algebra can be realized as a norm closed $*$-subalgebra of the C*-algebra $B(\mathcal{H})$ of all bounded operators on a Hilbert space $\mathcal{H}$. A von Neumann algebra $\mathcal{R}$ is a C*-algebra that is the dual space of a Banach space. This Banach space $\mathcal{R}_*$, the predual of $\mathcal{R}$, is intrinsically defined. The topology on $\mathcal{R}$ determined by duality with $\mathcal{R}_*$ is called the $\sigma$-topology. $B(\mathcal{H})$ is a von Neumann algebra and its predual is the set of trace class operators. Any von Neumann algebra can be realized as a $\sigma$-closed unital $*$-subalgebra of some $B(\mathcal{H})$.

A state on a C*-algebra $\mathcal{A}$ is a positive linear functional $\omega$ of norm 1. If $\mathcal{A}$ has a unit $I$ the normalization condition can be expressed as $\omega(I) = 1$. Of fundamental importance is the relation between representations and states. A representation $\pi$ of $\mathcal{A}$ on a Hilbert space $\mathcal{H}$ is just a structure-preserving mapping or morphism of $\mathcal{A}$ into $B(\mathcal{H})$. For simplicity, we suppose that $\mathcal{A}$ has a unit. Given a state $\omega$, there is an associated representation $\pi_\omega$ defined by a vector $\Omega$ such that $\pi_\omega(\mathcal{A})\Omega$ is dense in the Hilbert space in question, that is, it is a cyclic vector for the representation and

$$\omega(A) = (\Omega, \pi_\omega(A)\Omega), \quad A \in \mathcal{A}$$

that is, the cyclic vector implements the given state. This is referred to as the GNS construction. Given any two such representations, there is a unique unitary operator mapping the one cyclic vector onto the other and realizing the equivalence of the representations.
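
The GNS construction described in this appendix can be made concrete in finite dimensions. The following worked example is a standard textbook illustration rather than part of the original article: the algebra $M_n(\mathbb{C})$, the invertible density matrix $\rho$, and the Hilbert–Schmidt inner product are assumptions introduced only for this sketch, with the inner product taken linear in its second argument.

```latex
\begin{align*}
\mathcal{A} &= M_n(\mathbb{C}), \qquad
  \omega(A) = \operatorname{tr}\rho A, \qquad \rho > 0,\ \operatorname{tr}\rho = 1, \\
\mathcal{H} &= M_n(\mathbb{C}) \ \text{with inner product}\ (X, Y) = \operatorname{tr} X^{*}Y, \\
\pi_\omega(A)X &= AX, \qquad \Omega = \rho^{1/2}, \\
(\Omega, \pi_\omega(A)\Omega) &= \operatorname{tr}\rho^{1/2} A \rho^{1/2}
  = \operatorname{tr}\rho A = \omega(A), \\
\pi_\omega(\mathcal{A})\Omega &= M_n(\mathbb{C})\,\rho^{1/2} = M_n(\mathbb{C}) = \mathcal{H}.
\end{align*}
```

The last line uses the invertibility of $\rho^{1/2}$, which is what makes $\Omega$ cyclic here; for a non-invertible $\rho$ one would instead restrict to the subspace $M_n(\mathbb{C})\,\rho^{1/2}$.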

A state of a von Neumann algebra is said to be normal if it is continuous in the $\sigma$-topology. If $\omega$ is normal, then $\pi_\omega(\mathcal{R})$ is $\sigma$-closed.

An inclusion of unital von Neumann algebras has the split property if there is an intermediate type I factor, that is, if it has the form $\mathcal{R}_1 \subset B(\mathcal{H}) \subset \mathcal{R}_2$.

The following elementary observation is often used in treating symmetries. If $\alpha$ is an automorphism of $\mathcal{A}$ with $\omega\alpha^{-1} = \omega$, there is a unique unitary operator $U$ leaving the cyclic vector $\Omega$ invariant and inducing $\alpha$ in the representation $\pi_\omega$. In other words, $U\Omega = \Omega$ and

$$U\pi_\omega(A)U^{-1} = \pi_\omega(\alpha A)$$

If we apply the above observation to a group $G$ of symmetries leaving a state invariant, it yields a group $U(g)$ of unitaries satisfying the condition

$$U(gh) = U(g)U(h), \quad g, h \in G$$

since $U(g)$ is uniquely defined by the above conditions.

When there is no invariant state, the situation is more complicated. Suppose there is a group $G$ of symmetries and a representation $\pi$ of $\mathcal{A}$ where each $\alpha_g$ is unitarily implemented. Thus, there is a unitary $U(g)$ with

$$U(g)\pi(A)U(g)^{-1} = \pi(\alpha_g A), \quad A \in \mathcal{A}$$

All we can now conclude is that

$$U(gh) = Z(g, h)U(g)U(h)$$

where $Z(g, h)$ is a unitary in $\pi(\mathcal{A})'$, the commutant of $\pi(\mathcal{A})$, satisfying the 2-cocycle identity

$$Z(gh, k)Z(g, h) = Z(g, hk)\,\beta_g Z(h, k)$$

where $\beta_g X = U(g)XU(g)^{-1}$. $U$ is said to be a representation up to a factor. It can be chosen to be a representation if the cocycle $Z$ is a coboundary, that is, if there is a unitary $Y(g)$ in $\pi(\mathcal{A})'$ such that

$$Y(g)\,\beta_g Y(h) = Y(gh)Z(g, h)$$

In general, little is known about solving problems of this kind, but there are a number of results when $\pi$ is irreducible and the unitary group of its commutant reduces to the circle.

We turn now to consider the modular theory of von Neumann algebras. A vector $\Omega$ is said to be separating for a von Neumann algebra $\mathcal{R}$ if $A\Omega = 0$ and $A \in \mathcal{R}$ implies $A = 0$. If $\Omega$ is both cyclic and separating, there is a uniquely determined closed antilinear involution $S$ with $SA\Omega = A^{*}\Omega$ for $A \in \mathcal{R}$. If $S = J\Delta^{1/2}$ is the polar decomposition of $S$, then the unitary operators $\Delta^{it}$ induce automorphisms $\sigma_t$ of $\mathcal{R}$ and $J\mathcal{R}J = \mathcal{R}'$. $J$ is called the modular conjugation, $\Delta$ the modular operator, and $\sigma_t$ the modular automorphisms. The closure of $\{\Delta^{1/4}A\Omega : A \in \mathcal{R}, A \geq 0\}$ is a cone, called the natural cone. Every normal state of $\mathcal{R}$ is implemented by a unique vector in the natural cone. If $\alpha$ is an automorphism of $\mathcal{R}$, there is therefore a unique vector $\Omega_\alpha$ in the natural cone such that, for every $A \in \mathcal{R}$,

$$(\Omega, \alpha^{-1}(A)\Omega) = (\Omega_\alpha, A\Omega_\alpha)$$

There is now a canonical unitary operator $V_\alpha$ defined by

$$V_\alpha A\Omega = \alpha(A)\Omega_\alpha$$

$V_\alpha$ maps the natural cone into itself and $\alpha \mapsto V_\alpha$ is an implementing representation of the group of automorphisms of $\mathcal{R}$. Under these circumstances, we do not have to deal with representations up to a factor.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Boundary Conformal Field Theory; Current Algebra; Quantum Fields with Topological Defects; Supergravity; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Two-Dimensional Models.

Further Reading

Bisognano JJ and Wichmann EH (1976) On the duality condition for quantum fields. Journal of Mathematical Physics 17: 303–321.
Borchers HJ and Buchholz D (1985) The energy–momentum spectrum in local field theories with broken Lorentz symmetry. Communications in Mathematical Physics 97: 169–185.
Buchholz D (1982) The physical state space of quantum electrodynamics. Communications in Mathematical Physics 85: 49–71.
Buchholz D, Doplicher S, and Longo R (1986) On Noether's theorem in quantum field theory. Annals of Physics 170: 1–17.
Buchholz D, Doplicher S, Longo R, and Roberts JE (1992) A new look at Goldstone's theorem. Reviews in Mathematical Physics (Special Issue): 49–83.
Buchholz D, Dreyer O, Florig M, and Summers SJ (2000) Geometric modular action and spacetime symmetry groups. Reviews in Mathematical Physics 12: 475–560.
Coleman S and Mandula J (1967) All possible symmetries of the S-matrix. Physical Review 159: 1251–1256.
Doplicher S and Roberts JE (1990) Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Communications in Mathematical Physics 131: 51–107.
Guido D and Longo R (1996) The conformal spin and statistics theorem. Communications in Mathematical Physics 181: 11–35.
Lopuszański J (1991) An Introduction to Symmetry and Supersymmetry in Quantum Field Theory. Singapore: World Scientific.
O'Raifeartaigh L (1965) Lorentz invariance and internal symmetry. Physical Review 139: B1052–B1062.

Symmetry and Symmetry Breaking in Dynamical Systems


I Melbourne, University of Surrey, Guildford, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The same symmetries may underlie diverse contexts such as phase transitions of crystals (Landau theory), fluid dynamics, and problems in biology and chemical engineering. Hence, seemingly unrelated systems may exhibit similar phenomena in regard to symmetries of patterns and transitions between patterns (spontaneous symmetry breaking). It is natural to focus attention on aspects of pattern formation that are universal or model independent, that is, aspects depending on underlying symmetries rather than model-specific details.

The general framework is that the underlying system is governed by an evolution equation

$$\dot{x} = f(x) \qquad [1]$$

with symmetry group $\Gamma$. To avoid technicalities, we assume that [1] is an ordinary differential equation (ODE), the vector field $f : \mathbb{R}^n \to \mathbb{R}^n$ is as smooth as desired, and $\Gamma$ is a compact Lie group acting linearly on $\mathbb{R}^n$. An inner product may be chosen so that $\Gamma$ acts orthogonally. The vector field in [1] is $\Gamma$-equivariant if

$$f(\gamma x) = \gamma f(x) \quad \text{for all } x \in \mathbb{R}^n,\ \gamma \in \Gamma \qquad [2]$$

Equivalently, if $x(t)$ is a solution and $\gamma \in \Gamma$, then $\gamma x(t)$ is a solution.

In this article, we are interested in the dynamics to be expected for equivariant vector fields, and transitions that arise as parameters are varied. The symmetry group $\Gamma$ is taken as given, whereas $f$ is a general $\Gamma$-equivariant vector field. (Other features such as energy conservation or time reversibility must be built into the general setup, but are excluded in this article.)

Isotropy Subgroups and Commuting Linear Maps

Let $\Gamma$ be a compact Lie group acting linearly on $\mathbb{R}^n$. The isotropy subgroup of $x \in \mathbb{R}^n$ is defined to be

$$\Sigma_x = \{\gamma \in \Gamma : \gamma x = x\}$$

Note that $\Sigma_{\gamma x} = \gamma\Sigma_x\gamma^{-1}$ for all $x \in \mathbb{R}^n$, $\gamma \in \Gamma$.

Given an isotropy subgroup $\Sigma \subset \Gamma$, define the fixed-point subspace

$$\operatorname{Fix} \Sigma = \{y \in \mathbb{R}^n : \sigma y = y \text{ for all } \sigma \in \Sigma\}$$

If $f : \mathbb{R}^n \to \mathbb{R}^n$ is a $\Gamma$-equivariant vector field, then $f(\operatorname{Fix} \Sigma) \subset \operatorname{Fix} \Sigma$ for each isotropy subgroup $\Sigma$. Hence $\operatorname{Fix} \Sigma$ is flow invariant.

The normalizer $N(\Sigma) = \{\gamma \in \Gamma : \gamma\Sigma\gamma^{-1} = \Sigma\}$ is the largest subgroup of $\Gamma$ that acts on $\operatorname{Fix} \Sigma$, and $f_\Sigma = f|_{\operatorname{Fix} \Sigma}$ is $(N(\Sigma)/\Sigma)$-equivariant.

An isotropy subgroup $\Sigma$ is axial if $\dim \operatorname{Fix} \Sigma = 1$, and then $N(\Sigma)/\Sigma \cong \mathbb{Z}_2$ or $1$. More generally, $\Sigma$ is maximal if there are no isotropy subgroups $T$ with $\Sigma \subset T \subset \Gamma$ other than $T = \Sigma$ and $T = \Gamma$. Then $N(\Sigma)/\Sigma$ acts fixed-point freely on $\operatorname{Fix} \Sigma$ and the connected component of the identity $(N(\Sigma)/\Sigma)^0 \cong 1$, $SO(2)$ or $SU(2)$. Correspondingly $\Sigma$ is called real, complex, or quaternionic. In the complex case $\dim \operatorname{Fix} \Sigma$ is even; in the quaternionic case $\dim \operatorname{Fix} \Sigma \equiv 0 \bmod 4$.

The dihedral group $\Gamma = D_m$ of order $2m$ is the symmetry group of the regular $m$-gon, $m \geq 3$. Its standard action on $\mathbb{R}^2$ is generated by

$$\rho = \begin{pmatrix} \cos 2\pi/m & -\sin 2\pi/m \\ \sin 2\pi/m & \cos 2\pi/m \end{pmatrix}, \qquad \kappa = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

For $m$ even, the isotropy subgroups up to conjugacy are

$$D_m, \quad \mathbb{Z}_2(\kappa), \quad \mathbb{Z}_2(\rho\kappa), \quad 1$$

where $\mathbb{Z}_j(g)$ denotes the cyclic group of order $j$ generated by $g$. The maximal isotropy subgroups $\Sigma = \mathbb{Z}_2(\kappa), \mathbb{Z}_2(\rho\kappa)$ are axial with $N(\Sigma)/\Sigma \cong \mathbb{Z}_2$. For $m$ odd, $\mathbb{Z}_2(\kappa)$ is conjugate to $\mathbb{Z}_2(\rho\kappa)$ leaving three conjugacy classes of isotropy subgroups, and $\Sigma = \mathbb{Z}_2(\kappa)$ is axial with $N(\Sigma)/\Sigma = 1$.

The space of commuting linear maps

$$\operatorname{Hom}_\Gamma(\mathbb{R}^n) = \{L : \mathbb{R}^n \to \mathbb{R}^n \text{ linear} : L\gamma = \gamma L \text{ for all } \gamma \in \Gamma\}$$

is completely described representation-theoretically. Recall that $\Gamma$ acts irreducibly on $\mathbb{R}^n$ if the only $\Gamma$-invariant subspaces of $\mathbb{R}^n$ are $\mathbb{R}^n$ and $\{0\}$. Then $\operatorname{Hom}_\Gamma(\mathbb{R}^n)$ is a real division ring (skew field) $\mathcal{D} \cong \mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$. The representation is called absolutely irreducible when $\mathcal{D} = \mathbb{R}$ and nonabsolutely irreducible when $\mathcal{D} = \mathbb{C}$ or $\mathbb{H}$.
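
The isotropy subgroups of the dihedral group discussed in this section can be checked numerically. The sketch below is illustrative only: it assumes the standard action of $D_m$ on $\mathbb{R}^2$ generated by a rotation through $2\pi/m$ and a reflection in the horizontal axis, and all function names and labels (`dihedral_group`, `isotropy_subgroup`, `fix_dimension`, `rho^k`, `kappa`) are hypothetical helpers, not notation from the article. The dimension of the fixed-point subspace is computed with the standard trace formula for finite groups, $\dim \operatorname{Fix}\Sigma = |\Sigma|^{-1}\sum_{\sigma\in\Sigma}\operatorname{tr}\sigma$, which is not stated in the text.

```python
import math


def rotation(m, k=1):
    """Matrix of the rotation through 2*pi*k/m, stored as a tuple of rows."""
    a = 2.0 * math.pi * k / m
    return ((math.cos(a), -math.sin(a)), (math.sin(a), math.cos(a)))


KAPPA = ((1.0, 0.0), (0.0, -1.0))  # reflection in the horizontal axis


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def act(g, x):
    """Apply the 2x2 matrix g to the point x in R^2."""
    return tuple(g[i][0] * x[0] + g[i][1] * x[1] for i in range(2))


def dihedral_group(m):
    """All 2m elements of D_m (rotations and reflections), keyed by a label."""
    elements = {}
    for k in range(m):
        elements[f"rho^{k}"] = rotation(m, k)
        elements[f"rho^{k} kappa"] = matmul(rotation(m, k), KAPPA)
    return elements


def isotropy_subgroup(group, x, tol=1e-9):
    """Sorted labels of the group elements gamma with gamma x = x."""
    return sorted(
        name
        for name, g in group.items()
        if all(abs(u - v) < tol for u, v in zip(act(g, x), x))
    )


def fix_dimension(group, labels):
    """dim Fix(Sigma) as the average of the traces of the elements of Sigma."""
    total = sum(group[name][0][0] + group[name][1][1] for name in labels)
    return round(total / len(labels))


if __name__ == "__main__":
    d4 = dihedral_group(4)
    samples = [
        (0.0, 0.0),                                      # the origin
        (1.0, 0.0),                                      # on the axis of kappa
        (math.cos(math.pi / 4), math.sin(math.pi / 4)),  # on the axis of rho kappa
        (1.0, 0.3),                                      # a generic point
    ]
    for x in samples:
        sigma = isotropy_subgroup(d4, x)
        print(x, sigma, "dim Fix =", fix_dimension(d4, sigma))
```

For $m = 4$ the four sample points realize the four conjugacy classes listed for even $m$: the whole group at the origin, the two axial subgroups generated by a single reflection, and the trivial subgroup at a generic point.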

If the action of Γ is not irreducible, write ℝⁿ = V₁ ⊕ ⋯ ⊕ V_k (nonuniquely) as a sum of irreducible subspaces. Summing together irreducible subspaces that are isomorphic to form isotypic components W_j gives the (unique) isotypic decomposition ℝⁿ = W₁ ⊕ ⋯ ⊕ W_ℓ. If L ∈ Hom_Γ(ℝⁿ), then L(W_j) ⊂ W_j for each j, hence Hom_Γ(ℝⁿ) = Hom_Γ(W₁) ⊕ ⋯ ⊕ Hom_Γ(W_ℓ). Each W_j consists of k_j isomorphic copies of an irreducible representation with division ring D_j. Let M_k(D) denote the space of k × k matrices with entries in D. Then

\[ \mathrm{Hom}_\Gamma(\mathbb{R}^n) \cong M_{k_1}(D_1) \oplus \cdots \oplus M_{k_\ell}(D_\ell) \qquad [3] \]

Spectral properties of commuting linear maps can be recovered from the decomposition [3], paying due attention to multiplicity and complex conjugates of eigenvalues.

Equivariant Dynamics

The dynamics of equivariant systems includes (relative) equilibria and periodic solutions, robust heteroclinic cycles/networks, and symmetric chaotic attractors.

Equilibria

Consider the ODE [1] with Γ-equivariant vector field f satisfying [2]. If x(t) ≡ x₀ is an equilibrium, f(x₀) = 0, then there is a group orbit Γx₀ of equilibria.

Let Σ = Σ_{x₀} be the isotropy subgroup of x₀. If dim Σ = dim Γ, then generically (for an open dense set of Γ-equivariant vector fields), the eigenvalues of (df)_{x₀} have nonzero real part, hence x₀ is hyperbolic. If the eigenvalues all have negative real part, then x₀ is asymptotically stable. If at least one eigenvalue has positive real part, then x₀ is unstable. Hyperbolic equilibria are isolated and persist under perturbations of f; the perturbed equilibria continue to have isotropy Σ. Since (df)_{x₀} ∈ Hom_Σ(ℝⁿ), decomposition [3] for the action of Σ on ℝⁿ facilitates stability computations for x₀.

If dim Σ < dim Γ, then Γx₀ is a continuous group orbit of equilibria. Generically, dim ker (df)_{x₀} = dim Γ − dim Σ and ker (df)_{x₀} = {ξx₀ : ξ ∈ LΓ}, where LΓ is the Lie algebra of Γ. The remaining k = n − dim Γ + dim Σ eigenvalues generically have nonzero real part so Γx₀ is normally hyperbolic. If all k eigenvalues have negative real part, then Γx₀ is asymptotically stable. If at least one has positive real part, then Γx₀ is unstable. When N(Σ)/Σ is finite, generically x₀ is an isolated equilibrium in Fix Σ and persists as an equilibrium with isotropy Σ under perturbation.

Relative Equilibria and Skew Products

A point x₀ ∈ ℝⁿ (or the corresponding group orbit Γx₀) is a relative equilibrium if f(x₀) ∈ T_{x₀}Γx₀ = LΓ x₀. If x₀ has isotropy Σ, then x₀ is a relative equilibrium if f(x₀) ∈ LD x₀, where D = (N(Σ)/Σ)⁰. Write f(x₀) = ξx₀, where ξ ∈ LD. The closure of the one-parameter subgroup exp(tξ) is a maximal torus in D for almost every ξ. All maximal tori are conjugate with common dimension d = rank D. The solution x(t) = exp(tξ)x₀ is typically a d-dimensional quasiperiodic motion. "Typically" holds in both the topological and probabilistic sense and there is no phase-locking. When d = 1, x(t) is periodic, often called a rotating wave.

Choose a Σ-invariant local cross section X to the group orbit Γx₀ at x₀. There is a Γ-invariant neighborhood of Γx₀ that is Γ-equivariantly diffeomorphic to (Γ × X)/Σ, where Σ acts freely on Γ × X by

\[ \sigma \cdot (\gamma, x) = (\gamma\sigma^{-1}, \sigma x) \]

and Γ acts by left multiplication on the first factor. The Γ-equivariant ODE on (Γ × X)/Σ lifts to a (Γ × Σ)-equivariant skew product on Γ × X

\[ \dot{\gamma} = \gamma\,\xi(x), \qquad \dot{x} = h(x) \qquad [4] \]

where ξ : X → LΓ, h : X → X satisfy the Σ-equivariance conditions

\[ \xi(\sigma x) = \mathrm{Ad}_\sigma\, \xi(x) = \sigma\, \xi(x)\, \sigma^{-1}, \qquad h(\sigma x) = \sigma h(x) \]

and h(x₀) = 0.

Thus, dynamics near the relative equilibrium Γx₀ ⊂ ℝⁿ reduces to dynamics near the ordinary equilibrium x₀ ∈ X for the Σ-equivariant vector field h : X → X, coupled with Γ drifts. In particular, the stability of Γx₀ is determined by (dh)_{x₀}.

Periodic Solutions

A nonequilibrium solution x(t) is periodic if x(t + T) = x(t) for some T > 0. The least such T is the (absolute) period. The spatial symmetry group Δ is the isotropy subgroup of x(t) for some, and hence all, t ∈ ℝ. The periodic solution P = {x(t) : 0 ≤ t < T} lies inside Fix Δ. Define the spatiotemporal symmetry group Σ = {γ ∈ Γ : γP = P}. Note that Δ is a normal subgroup of Σ and either Σ/Δ ≅ S¹ (P is a rotating wave) or Σ/Δ ≅ Z_q and P is called a standing wave or a discrete rotating wave. For each σ ∈ Σ, there exists T_σ ∈ [0, T) such that σx(t) = x(t + T_σ). The relative period of x(t) is the least T > 0 such that x(T) ∈ Γx₀.

If dim Δ = dim Γ, then generically P is hyperbolic, hence isolated, the stability of P is determined by its
Floquet exponents, and P persists under perturbation as a periodic solution with spatial symmetry Δ and spatiotemporal symmetry Σ. For Γ infinite and N(Δ)/Δ finite, generically P is isolated in Fix Δ and the neutral Floquet exponent has multiplicity dim Γ − dim Δ + 1.

Relative Periodic Solutions

A solution x(t) is a relative periodic solution if it is not a relative equilibrium and x(T) ∈ Γx(0) for some T > 0. The least such T is the relative period. The spatial symmetry group Δ = Σ_{x(t)} for some, hence all, t. The spatiotemporal symmetry group Σ is the closed subgroup of Γ generated by Δ and σ, where x(T) = σx(0), and generically Σ/Δ ≅ T^d × Z_q is a maximal topologically cyclic (Cartan) subgroup of N(Δ)/Δ containing σΔ. Then x(t) is a (d + 1)-dimensional quasiperiodic motion.

The dynamics near the relative periodic solution is again governed by a skew product. There exists n ≥ 1 such that σⁿ = exp(nξ), where ξ ∈ LZ(Σ) and Z(Σ) ⊂ Γ is the centralizer of Σ. Define α = exp(−ξ)σ. Form a semidirect product Δ ⋊ Z_{2n} by adjoining to Δ an element Q of order 2n such that QδQ⁻¹ = αδα⁻¹ for δ ∈ Δ.

In a comoving frame with velocity ξ, a neighborhood of the relative periodic orbit is Γ-equivariantly diffeomorphic to (Γ × X × S¹)/(Δ ⋊ Z_{2n}), where X is a (Δ ⋊ Z_{2n})-invariant cross section, S¹ = ℝ/2nℤ and Δ ⋊ Z_{2n} acts on Γ × X × S¹ as

\[ \delta \cdot (\gamma, x, \theta) = (\gamma\delta^{-1}, \delta x, \theta), \qquad Q \cdot (\gamma, x, \theta) = (\gamma\alpha^{-1}, Qx, \theta + 1) \]

The Γ-equivariant ODE on (Γ × X × S¹)/(Δ ⋊ Z_{2n}) lifts to a Γ × (Δ ⋊ Z_{2n})-equivariant skew product

\[ \dot{\gamma} = \gamma\,\xi(x, \theta), \qquad \dot{x} = h(x, \theta), \qquad \dot{\theta} = 1 \qquad [5] \]

where ξ : X × S¹ → LΓ, h : X × S¹ → X satisfy appropriate (Δ ⋊ Z_{2n})-equivariance conditions.

Robust Heteroclinic Cycles

Heteroclinic cycles, degenerate in systems without symmetry, arise robustly in equivariant systems. Let x₁, …, x_m ∈ ℝⁿ be saddles with W^u(x_i) − {x_i} ⊂ W^s(x_{i+1}) (where m + 1 = 1). If Σ₁, …, Σ_m ⊂ Γ are isotropy subgroups, W^u(x_i) ⊂ Fix Σ_i, and x_{i+1} is a sink in Fix Σ_i, then saddle–sink connections from x_i to x_{i+1} persist for nearby Γ-equivariant flows. The union ⋃_{i=1}^{m} W^u(x_i) forms a robust heteroclinic cycle (see the subsection "Dynamics" for an example). Such cycles, when asymptotically stable, are a mechanism for intermittency or bursting, notably in rotating Rayleigh–Bénard convection (where rolls disappear and reorient themselves at approximately 60°), and provide a possible intrinsic explanation for irregular reversals of the Earth's magnetic field. Asymmetric perturbations (deterministic or noisy) destroy the cycles, but the perturbed attractors inherit the bursting behavior.

Establishing the existence of heteroclinic connections is often straightforward when dim Fix Σ_i = 2 and nontrivial with dim Fix Σ_i ≥ 3. Criteria for asymptotic stability of heteroclinic cycles are given in terms of real parts of eigenvalues of (df)_{x_i}, and depend on the geometry of the representation of Γ. Robust cycles exist also between more complicated dynamical states such as periodic solutions or chaotic sets (cycling chaos). When W^u(x_i) connects to two or more distinct states, the collection of unstable manifolds forms a heteroclinic network leading to competition between various subnetworks.

Symmetric Attractors

Suppose that Γ is a finite group acting linearly on ℝⁿ. A closed subset A ⊂ ℝⁿ has symmetry groups Δ = {γ ∈ Γ : γx = x for all x ∈ A}, Σ = {γ ∈ Γ : γA = A}. Here, Δ is an isotropy subgroup and Δ ⊂ Σ ⊂ N(Δ). In applications, Δ corresponds to instantaneous symmetry and Σ to symmetry on average.

If A is an attractor (a Lyapunov stable ω-limit set) for a Γ-equivariant vector field f : ℝⁿ → ℝⁿ, then Σ fixes a connected component of Fix Δ − L, where L is the union of proper fixed-point spaces in Fix Δ.

Provided dim Fix Δ ≥ 3, all pairs Δ, Σ satisfying the above restrictions arise as symmetry groups of a nonperiodic attractor A. If dim Fix Δ ≥ 5, then A is realized by a uniformly hyperbolic (Axiom A) attractor.

If dim Fix Δ ≥ 3 and Σ fixes a connected component of Fix Δ − L, then A is realized by a periodic sink provided Σ/Δ is cyclic. If dim Fix Δ = 2, then in addition either Σ = Δ or Σ = N(Δ).

Suppose A is an attractor and γ ∈ Γ − Σ. Then γA ∩ A = ∅. Varying a parameter, A may undergo a symmetry-increasing bifurcation: A grows until it collides with γA producing a larger attractor with symmetry on average generated by Σ and γ.

Determining symmetries of an attractor by inspection is often infeasible. A detective is a Γ-equivariant polynomial φ : ℝⁿ → V where every subgroup of Γ is an isotropy subgroup for the action on V, and each component of φ is nonzero. Suppose that A ⊂ ℝⁿ is an attractor with physical (Sinai–Ruelle–Bowen) measure μ. By ergodicity, the time average

\[ \varphi_A = \lim_{T \to \infty} \frac{1}{T} \int_0^T \varphi(x(t))\, dt \in V \]
is well defined for almost every trajectory x(t) in supp μ. Generically, Σ_{φ_A} = Σ_A so computing the symmetry of A reduces to computing the symmetry of a point.

degeneracies can be treated using singularity theory. The equilibria and their stability determine the local dynamics. All bifurcating equilibria have isotropy Γ, so there is no symmetry breaking.
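The detective idea, reducing the symmetry of an attractor to the isotropy of a time-averaged point, can be sketched numerically in the simplest setting Γ = Z₂ acting on ℝ² by x ↦ −x, with V = ℝ on which the generator acts as −1 and φ(x) = x₁. This is a minimal illustration under stated assumptions: the two vector fields (a van der Pol oscillator and a damped Duffing oscillator), the helper names, and the tolerance are choices made here, and the attractors are a periodic orbit and an equilibrium rather than the chaotic attractors with SRB measures that the text has in mind.

```python
import numpy as np

def rk4_time_average(field, observable, x0, dt=0.02, transient=50.0, horizon=200.0):
    """Average observable(x(t)) over a window after a transient, using RK4 steps."""
    x = np.asarray(x0, dtype=float)
    n_skip = int(round(transient / dt))
    n_avg = int(round(horizon / dt))
    total = 0.0
    for step in range(n_skip + n_avg):
        k1 = field(x)
        k2 = field(x + 0.5 * dt * k1)
        k3 = field(x + 0.5 * dt * k2)
        k4 = field(x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        if step >= n_skip:
            total += observable(x)
    return total / n_avg

# Both fields commute with x -> -x, so both are Z_2-equivariant.
def van_der_pol(x):
    return np.array([x[1], (1.0 - x[0] ** 2) * x[1] - x[0]])

def damped_duffing(x):
    return np.array([x[1], x[0] - x[0] ** 3 - x[1]])

def phi(x):
    # Equivariant polynomial into V = R; nonzero points of V have trivial isotropy.
    return x[0]

def detected_symmetry(average, tol=0.1):
    # Isotropy of the averaged point in V: Z_2 if it is (numerically) zero.
    return "Z2" if abs(average) < tol else "trivial"

avg_cycle = rk4_time_average(van_der_pol, phi, [0.5, 0.0])
avg_well = rk4_time_average(damped_duffing, phi, [0.5, 0.0])
print(detected_symmetry(avg_cycle), detected_symmetry(avg_well))
```

The van der Pol limit cycle is mapped to itself by x ↦ −x, so its average vanishes up to truncation error, whereas the Duffing trajectory settles in one well and its average has trivial isotropy.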
If Γ is an infinite compact Lie group, and A is an ω-limit set containing points of trivial isotropy, then A cannot be uniformly hyperbolic. Hence partially hyperbolic flows arise naturally in systems with continuous symmetry. Consider the skew product [4] where Σ = 1 and h : X → X possesses a hyperbolic basic set Λ ⊂ X with equilibrium measure μ (for a Hölder potential). Let ν denote Haar measure on Γ. Then Γ × Λ is partially hyperbolic, and ν × μ is ergodic (even Bernoulli) for an open dense set of equivariant flows. Such stably ergodic flows possess strong statistical properties (rapid decay of correlations, central-limit theorem); a possible explanation for hypermeander (Brownian-like motion) of spiral waves in planar excitable media.

Forced Symmetry Breaking

In applications, symmetry is not perfect and account should be taken of Γ₀-equivariant perturbations of [1] for Γ₀ a subgroup of Γ (including Γ₀ = 1). This topic is not discussed in this article, except in the subsections "Robust heteroclinic cycles" and "Branching patterns and finite determinacy."

Equivariant Bifurcation Theory

Consider families of ODEs ẋ = f(x, λ), with bifurcation parameter λ ∈ ℝ and vector field f : ℝⁿ × ℝ → ℝⁿ satisfying f(0, 0) = 0 and the Γ-equivariance condition

\[ f(\gamma x, \lambda) = \gamma f(x, \lambda) \quad \text{for all } x \in \mathbb{R}^n,\ \lambda \in \mathbb{R},\ \gamma \in \Gamma \]

A local bifurcation from the equilibrium x = 0 occurs if (df)_{0,0} is nonhyperbolic. The center subspace E^c is the sum of generalized eigenspaces corresponding to eigenvalues on the imaginary axis, and is Γ-invariant. By center manifold theory, local dynamics ((x, λ) near (0, 0)) are captured by the center manifold W^c. After center manifold reduction (or Lyapunov–Schmidt reduction if the focus is on equilibria), it may be assumed that ℝⁿ = E^c.

If (df)_{0,0} possesses zero eigenvalues, then there is a steady-state bifurcation. Generically, (df)_{0,0} = 0 and E^c is absolutely irreducible. There are two subcases.

If Γ acts trivially on ℝⁿ, then n = 1 and generically there is a saddle–node (or limit point) bifurcation where the zero sets of f(x, λ) and x² − λ are diffeomorphic for (x, λ) near (0, 0). Higher-order

From now on, consider the remaining subcase where Γ acts absolutely irreducibly and nontrivially on ℝⁿ. Then Fix Γ = {0}, f(0, λ) ≡ 0, and (df)_{0,λ} = c(λ)I_n where generically c′(0) ≠ 0. Assume that c′(0) > 0, so the "trivial solution" x = 0 is asymptotically stable subcritically (λ < 0) and unstable supercritically (λ > 0). Bifurcating solutions lie outside Fix Γ and hence there is spontaneous symmetry breaking.

Axial Isotropy Subgroups

The "equivariant branching lemma" guarantees branches of equilibria with isotropy Σ for each axial isotropy subgroup. There are three associated branching patterns, see Figure 1.

If N(Σ)/Σ = Z₂, then f is odd. Generically, ∂ₓ³f(0, 0) ≠ 0, since (x₁² + ⋯ + x_n²)x is Γ-equivariant, and there are two branches of equilibria bifurcating supercritically or subcritically together, and lying on the same group orbit. The branches form a symmetric pitchfork whose direction of branching is determined by sgn ∂ₓ³f(0, 0).

If N(Σ)/Σ ≅ 1, then generically f is even. If all quadratic Γ-equivariant maps vanish on Fix Σ, then the bifurcation is sub/supercritical depending on sgn ∂ₓ³f(0, 0) but the branches lie on distinct group orbits. This is an asymmetric pitchfork.

If ∂ₓ²f(0, 0) ≠ 0, then the equilibria exist transcritically: for λ < 0 and λ > 0.

The natural actions of D_m on ℝ² are absolutely irreducible. The axial branches are symmetric pitchforks for m ≥ 4 even, asymmetric pitchforks for m ≥ 5 odd, and transcritical for m = 3.

The actions of D_m, m ≥ 5 odd, provide the simplest instances of hidden symmetries, where certain N(Σ)/Σ-equivariant mappings on Fix Σ do not extend to smooth Γ-equivariant mappings on ℝⁿ.

Figure 1 Axial branches: (a) supercritical symmetric pitchfork, (b) supercritical asymmetric pitchfork, and (c) transcritical branches.

Nonaxial Maximal Isotropy Subgroups

For Σ a real maximal isotropy subgroup, dim Fix Σ odd, there exist branches of equilibria with isotropy
Σ. When dim Fix Σ is even, there are examples where equilibria exist and examples where no equilibria exist. For Σ complex or quaternionic, there exist branches of rotating waves with isotropy Σ. In the quaternionic case, the rotating waves foliate the SU(2) group orbits according to the Hopf fibration.

Submaximal Isotropy Subgroups

It has been conjectured falsely that steady-state bifurcation leads generically to equilibria only with maximal isotropy. The simplest counterexample is the 24-element group Γ = Z₃ ⋉ Z₂³ generated by

\[ \rho = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \kappa = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \]

(Alternatively, Γ = T ⊕ Z₂(−I₃), where T ⊂ SO(3) is the tetrahedral group.)

The isotropy subgroup Σ = Z₂(κ) has two-dimensional fixed-point subspace Fix Σ = {(x, y, 0)}. The only one-dimensional fixed-point spaces contained in Fix Σ are the x- and y-axes. The general Γ-equivariant vector field is

\[ \dot{x} = g(x^2, y^2, z^2, \lambda)\,x, \qquad \dot{y} = g(y^2, z^2, x^2, \lambda)\,y, \qquad \dot{z} = g(z^2, x^2, y^2, \lambda)\,z \]

After scaling,

\[ g(x^2, y^2, z^2, \lambda) = \lambda - x^2 - a y^2 - b z^2 + o(x^2, y^2, z^2, \lambda) \qquad [6] \]

Restricting to Fix Σ and dividing out the axial solutions x = 0 and y = 0 yields at lowest order the equations λ = x² + ay² = y² + bx². Submaximal solutions exist provided sgn(a − 1) = sgn(b − 1).

In general, the existence of equilibria with submaximal isotropy must be treated on a case-by-case basis (for each absolutely irreducible representation of Γ and isotropy subgroup Σ).

Branching Patterns and Finite Determinacy

The following notion of finite determinacy is based on equivariant transversality theory. Assume Γ acts absolutely irreducibly. Consider the set F of Γ-equivariant vector fields f : ℝⁿ × ℝ → ℝⁿ satisfying (df)_{0,0} = 0. For an open dense subset of F, branches of relative equilibria near (0, 0) are normally hyperbolic. The collection of branches of relative equilibria, together with their isotropy type, direction of branching, and stability properties, is called a branching pattern. These persist under small perturbations and are finitely determined: there exist q ≥ 2 and an open dense subset U(q) ⊂ F such that the branching patterns of f and f + g are identical for f ∈ U(q), g ∈ F, provided g(x, λ) = o(‖x‖^q).

Furthermore, branching patterns are strongly finitely determined: there exist d ≥ 2 and an open dense subset S(d) ⊂ F such that the branching patterns of f and f + g are identical for f ∈ S(d) and all (not necessarily equivariant) g satisfying g(x, λ) = o(‖x‖^d).

For example, consider the hyperoctahedral group S_n ⋉ Z₂ⁿ, n ≥ 1. Here S_n acts by permutations of the coordinates (x₁, …, x_n) and Z₂ⁿ consists of diagonal matrices with entries ±1. Let Γ = T ⋉ Z₂ⁿ, where T ⊂ S_n is a transitive subgroup. Then Γ acts absolutely irreducibly on ℝⁿ and is strongly 3-determined. Submaximal branches of equilibria exist except when T = S_n, T = A_n and, if n = 6, T = PGL₂(F₅).

Dynamics

Absolutely irreducible representations have arbitrarily high dimension, so steady-state bifurcation leads to rich dynamics. The group Γ = Z₃ ⋉ Z₂³ with sgn(a − 1) ≠ sgn(b − 1) and a + b > 2 in [6] yields asymptotically stable heteroclinic cycles with planar connections connecting equilibria in the x-, y- and z-axes (see Figure 2). In ℝ⁴, there is the possibility of instant chaos where chaotic dynamics bifurcates directly from the equilibrium 0.
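The heteroclinic cycle for the Z₃ ⋉ Z₂³ system can be watched numerically. The sketch below integrates the truncation of [6] (the o(·) terms are dropped) with a hand-written RK4 step and logs which axis equilibrium the trajectory is near. The coefficients a = 0.6, b = 1.8, λ = 1, the initial point, the neighbourhood radius, and all helper names are assumptions chosen for this illustration.

```python
import numpy as np

A_COEF, B_COEF, LAM = 0.6, 1.8, 1.0   # sgn(a-1) != sgn(b-1) and a + b > 2
DT = 0.01

def field(v):
    # Truncated Z3 x| Z2^3 equivariant vector field from [6].
    x, y, z = v
    return np.array([
        (LAM - x * x - A_COEF * y * y - B_COEF * z * z) * x,
        (LAM - y * y - A_COEF * z * z - B_COEF * x * x) * y,
        (LAM - z * z - A_COEF * x * x - B_COEF * y * y) * z,
    ])

def integrate(v0, t_end=100.0):
    v = np.asarray(v0, dtype=float)
    path = [v.copy()]
    for _ in range(int(round(t_end / DT))):
        k1 = field(v)
        k2 = field(v + 0.5 * DT * k1)
        k3 = field(v + 0.5 * DT * k2)
        k4 = field(v + DT * k3)
        v = v + (DT / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        path.append(v.copy())
    return np.array(path)

def visit_log(path, radius=0.05):
    """Sequence of axis equilibria visited (0 = x, 1 = y, 2 = z) and time spent near each."""
    equilibria = np.sqrt(LAM) * np.eye(3)
    gaps = np.linalg.norm(path[:, None, :] - equilibria[None, :, :], axis=2)
    nearest = np.argmin(gaps, axis=1)
    close = np.min(gaps, axis=1) < radius
    visits, durations = [], []
    for idx, is_close in zip(nearest, close):
        if not is_close:
            continue
        if visits and visits[-1] == int(idx):
            durations[-1] += DT
        else:
            visits.append(int(idx))
            durations.append(DT)
    return visits, durations

path = integrate([1.0, 0.01, 0.02])
visits, durations = visit_log(path)
print(visits, [round(d, 1) for d in durations])
```

With these coefficients the trajectory leaves the x-axis equilibrium along the z direction, then moves on to the y-axis equilibrium, and the time spent near successive equilibria grows, which is the bursting signature of an attracting cycle.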
Asymptotic Stability

Subcritical and axial transcritical branches are automatically unstable. Moreover, the existence of a quadratic Γ-equivariant mapping q : ℝⁿ → ℝⁿ and x ∈ Fix Σ such that (dq)_x has eigenvalues with nonzero real part guarantees that branches of equilibria with axial isotropy Σ are generically unstable (even when q|Fix Σ ≡ 0).

There are no general results for asymptotic stability, and calculations must be done on a case-by-case basis. (The remarks in the subsection "Equilibria" are useful here.)

Figure 2 Robust heteroclinic cycle for the group Γ = Z₃ ⋉ Z₂³.
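One such case-by-case calculation can be sketched for the submaximal equilibria of the Z₃ ⋉ Z₂³ example. The code below works with the truncation of [6] (higher-order terms dropped), checks equivariance under ρ and κ at a sample point, solves λ = x² + ay² = y² + bx² in closed form, and estimates the eigenvalues of the linearization by central differences. The parameter values and helper names are assumptions made for illustration.

```python
import numpy as np

RHO = np.array([[0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [1.0, 0.0, 0.0]])
KAPPA = np.diag([1.0, 1.0, -1.0])

def field(v, lam, a, b):
    # Truncation of [6]: g(u, p, q) = lam - u - a p - b q.
    x, y, z = v
    g = lambda u, p, q: lam - u - a * p - b * q
    return np.array([g(x * x, y * y, z * z) * x,
                     g(y * y, z * z, x * x) * y,
                     g(z * z, x * x, y * y) * z])

def submaximal_equilibrium(lam, a, b):
    """Solve lam = x^2 + a y^2 = y^2 + b x^2 with x, y > 0 (requires a*b != 1).

    Returns None when no such solution exists for this sign of lam.
    """
    x2 = lam * (1.0 - a) / (1.0 - a * b)
    y2 = lam * (1.0 - b) / (1.0 - a * b)
    if x2 <= 0.0 or y2 <= 0.0:
        return None
    return np.array([np.sqrt(x2), np.sqrt(y2), 0.0])

def jacobian(v, lam, a, b, h=1e-6):
    # Central-difference approximation of (df)_v.
    cols = []
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        cols.append((field(v + e, lam, a, b) - field(v - e, lam, a, b)) / (2.0 * h))
    return np.column_stack(cols)

lam, a, b = 1.0, 0.5, 0.5
eq = submaximal_equilibrium(lam, a, b)
eigs = np.sort(np.linalg.eigvals(jacobian(eq, lam, a, b)).real)
print(eq, eigs)
```

For a = b = 0.5 the equilibrium in Fix Σ has one positive eigenvalue (in the z direction), so this particular submaximal branch is unstable; other coefficient choices need their own computation.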
In the absence of quadratic equivariants, the invariant-sphere theorem gives an open set of equivariant vector fields for which an attracting normally hyperbolic flow-invariant (n − 1)-dimensional sphere bifurcates supercritically. This simplifies computations of nontrivial dynamics.

Hopf Bifurcation and Mode Interactions

Equivariant Hopf Bifurcation

The setting is the same as in the last section, except that L = (df)_{0,0} has imaginary eigenvalues ±iω of algebraic and geometric multiplicity n/2. Generically, ℝⁿ = E^c is Γ-simple: either the direct sum of two isomorphic absolutely irreducible subspaces, or nonabsolutely irreducible.

By Birkhoff normal-form theory (see below), for any k ≥ 1 there is a Γ-equivariant change of coordinates after which f(x, λ) = f_k(x, λ) + o(‖x‖^k), where f_k is (Γ × S¹)-equivariant. Here S¹ = {exp(tL) : t ∈ ℝ} acts freely on ℝⁿ and Γ × S¹ acts complex irreducibly (D = ℂ). Hence, dim Fix J is even for each isotropy subgroup J ⊂ Γ × S¹, and N(J)/J ≅ S¹ when J is maximal. The equivariant Hopf theorem guarantees, generically, branches of rotating waves with absolute period approximately 2π/ω for each maximal isotropy subgroup J.

The notions of finite and strong finite determinacy extend to complex irreducible representations and the rotating waves persist as periodic solutions for the original Γ-equivariant vector field f. Define the spatial and spatiotemporal symmetry groups Δ ⊂ Σ ⊂ Γ as in the subsection "Periodic solutions." Then J = {(σ, θ(σ)) : σ ∈ Σ} is a twisted subgroup, with θ : Σ → S¹ a homomorphism and Δ = J ∩ Γ = ker θ.

In the non-symmetry-breaking case, where Γ acts trivially on ℝ², phase–amplitude reduction leads to Z₂-equivariant amplitude equations on ℝ and higher-order degeneracies are amenable to Z₂-equivariant singularity theory. Similar comments apply to O(2)-equivariant Hopf bifurcation where the amplitude equations are D₄-equivariant. The technique fails for general groups Γ.

Mode Interactions and Birkhoff Normal Form

Steady-state and Hopf bifurcations are codimension 1 and occur generically in one-parameter families of Γ-equivariant vector fields. Multiparameter families may undergo higher-codimension bifurcations called mode interactions. Suppressing parameters, steady-state/steady-state bifurcation occurs when ℝⁿ = E^c = V₁ ⊕ V₂, where V₁ and V₂ are absolutely irreducible and L = (df)₀ has zero eigenvalues. If V₁ and V₂ are nonisomorphic then L = 0, otherwise L is nilpotent and there is an equivariant Takens–Bogdanov bifurcation. Similarly, there are codimension-2 steady-state/Hopf and Hopf/Hopf bifurcations.

Write L = S + N (uniquely), where S is semisimple, N is nilpotent, and SN = NS. Then the closure of {exp tS : t ∈ ℝ} is a torus T^p, where p ≥ 0 is the number of rationally independent eigenvalues for L.

For each k ≥ 1, there is a Γ-equivariant degree-k polynomial change of coordinates P : ℝⁿ → ℝⁿ satisfying P(0) = 0, (dP)₀ = I transforming f to Birkhoff normal form f_k + o(‖x‖^k), where f_k is (Γ × T^p)-equivariant.

If N ≠ 0, then {exp tNᵀ : t ∈ ℝ} ≅ ℝ and f_k can be chosen so that the nonlinear terms are (Γ × T^p × ℝ)-equivariant. The linear terms are not ℝ-equivariant.

The study of mode interactions proceeds by first analyzing (Γ × T^p)-equivariant normal forms, then considering exponentially small effects of the Γ-equivariant tail. Versions of the equivariant branching lemma and equivariant Hopf theorem establish existence of certain solutions. There are numerous examples of robust heteroclinic cycles connecting (relative) equilibria and periodic solutions, symmetric chaos, and symmetry-increasing bifurcations.

Bifurcations from Relative Equilibria and Periodic Solutions

Using the skew product [4], bifurcations from a relative equilibrium with isotropy Σ for a Γ-equivariant vector field reduce to bifurcations from a fully symmetric equilibrium for a Σ-equivariant vector field h coupled with Γ drifts. If h possesses (relative) equilibria or periodic solutions, then the drift is determined generically as in the subsections "Relative equilibria and skew products" and "Relative periodic solutions." Nevertheless, solving the drift equation can be useful for understanding behavior in physical space. This is facilitated by making equivariant polynomial changes of coordinates (Q(x), P(x)) putting h into Birkhoff normal form and simplifying ξ.

Bifurcations from (relative) periodic solutions also reduce, mainly, to bifurcations from equilibria (with enlarged symmetry group). Based on the discussion in the subsection "Relative periodic solutions," it suffices to consider bifurcations from isolated periodic solutions P = {x(t)} with spatial symmetry Δ and spatiotemporal symmetry Σ. Write x(T) = σx(0), where T is the relative period and σ is chosen so that the automorphism δ ↦ σ⁻¹δσ, δ ∈ Δ, has finite order k. Form the semidirect product Δ ⋊ Z_{2k} by adjoining to Δ an element τ of order
2k such that τ⁻¹δτ = σ⁻¹δσ, for δ ∈ Δ. Codimension-1 bifurcations from P are in one-to-one correspondence (modulo tail terms) with bifurcations from fully symmetric equilibria for a (Δ ⋊ Z_{2k})-equivariant vector field. In particular, period-preserving and period-doubling bifurcations from P reduce to steady-state bifurcations, and Naimark–Sacker bifurcations reduce to Hopf bifurcations. This framework incorporates issues such as suppression of period doubling. Similar results hold for higher-codimension bifurcations.

The skew products [4] and [5] are valid for proper actions of certain noncompact Lie groups Γ provided the spatial symmetries are compact, leading to explanations of spiral and scroll wave phenomena in excitable media.

When the spatial symmetry group is noncompact, E^c may be infinite-dimensional and center manifold reduction may break down due to continuous-spectrum issues. For Euclidean symmetry, there is a theory of modulation or Ginzburg–Landau equations.

See also: Bifurcation Theory; Bifurcations in Fluid Dynamics; Bifurcations of Periodic Orbits; Central Manifolds, Normal Forms; Chaos and Attractors; Electroweak Theory; Finite Group Symmetry Breaking; Hyperbolic Dynamical Systems; Quantum Spin Systems; Quasiperiodic Systems; Singularity and Bifurcation Theory.

Further Reading

Chossat P and Lauterbach R (2000) Methods in Equivariant Bifurcations and Dynamical Systems. Advanced Series in Nonlinear Dynamics, vol. 15. Singapore: World Scientific.
Crawford JD and Knobloch E (1991) Symmetry and symmetry-breaking in fluid dynamics. In: Lumley JL, Van Dyke M, and Reed HL (eds.) Annual Review of Fluid Mechanics, vol. 23, pp. 341–387. Palo Alto, CA: Annual Reviews.
Fiedler B and Scheel A (2003) Spatio-temporal dynamics of reaction-diffusion patterns. In: Kirkilionis M, Krömker S, Rannacher R, and Tomi F (eds.) Trends in Nonlinear Analysis, pp. 23–152. Berlin: Springer.
Field M (1996a) Lectures on Bifurcations, Dynamics and Symmetry. Pitman Research Notes in Mathematics Series, vol. 356. Harlow: Addison Wesley Longman.
Field M (1996b) Symmetry Breaking for Compact Lie Groups. Memoirs of the American Mathematical Society, vol. 574. Providence, RI: American Mathematical Society.
Golubitsky M and Stewart IN (2002) The Symmetry Perspective. Progress in Mathematics, vol. 200. Basel: Birkhäuser.
Golubitsky M, Stewart IN, and Schaeffer D (1988) Singularities and Groups in Bifurcation Theory, Vol. II, Applied Mathematical Sciences, vol. 69. New York: Springer.
Lamb JSW and Melbourne I (1999) Bifurcation from periodic solutions with spatiotemporal symmetry. In: Golubitsky M, Luss D, and Strogatz SH (eds.) Pattern Formation in Continuous and Coupled Systems, IMA Volumes in Mathematics and its Applications, vol. 115, pp. 175–191. New York: Springer.
Lamb JSW, Melbourne I, and Wulff C (2003) Bifurcation from periodic solutions with spatiotemporal symmetry, including resonances and mode interactions. Journal of Differential Equations 191: 377–407.
Melbourne I (2000) Ginzburg–Landau theory and symmetry. In: Debnath L and Riahi DN (eds.) Nonlinear Instability, Chaos and Turbulence, Vol 2, Advances in Fluid Mechanics, vol. 25, pp. 79–109. Southampton: WIT Press.
Michel L (1980) Symmetry defects and broken symmetry. Configurations. Hidden symmetry. Reviews of Modern Physics 52: 617–651.
Sandstede B, Scheel A, and Wulff C (1999) Dynamical behavior of patterns with Euclidean symmetry. In: Golubitsky M, Luss D, and Strogatz SH (eds.) Pattern Formation in Continuous and Coupled Systems, IMA Volumes in Mathematics and its Applications, vol. 115, pp. 249–264. New York: Springer.

Symmetry and Symplectic Reduction


J-P Ortega, Université de Franche-Comté, Besançon, France
T S Ratiu, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The use of symmetries in the quantitative and qualitative study of dynamical systems has a long history that goes back to the founders of mechanics. In most cases, the symmetries of a system are used to implement a procedure generically known under the name of "reduction" that restricts the study of its dynamics to a system of smaller dimension. This procedure is also used in a purely geometric context to construct new nontrivial manifolds having various additional structures.

Most of the reduction methods can be seen as constructions that systematize the techniques of elimination of variables found in classical mechanics. These procedures consist basically of two steps. First, one restricts the dynamics to flow-invariant submanifolds of the system in question and, second, one projects the restricted dynamics onto the symmetry orbit quotients of the spaces constructed in the first step. Sometimes, the
flow-invariant manifolds appear as the level sets of a momentum map induced by the symmetry of the system.

Symmetry Reduction

The Symmetries of a System

The standard mathematical fashion to describe the symmetries of a dynamical system (see Dynamical Systems in Mathematical Physics: An Illustration from Water Waves) X ∈ 𝔛(M) defined on a manifold M (𝔛(M) denotes the Lie algebra of smooth vector fields on M endowed with the Jacobi–Lie bracket [·, ·]) consists in studying its invariance properties with respect to a smooth Lie group Φ : G × M → M (continuous symmetries) or Lie algebra φ : 𝔤 → 𝔛(M) (infinitesimal symmetry) action. Recall that Φ is a (left) action if the map g ∈ G ↦ Φ(g, ·) ∈ Diff(M) is a group homomorphism, where Diff(M) denotes the group of smooth diffeomorphisms of the manifold M. The map φ is a (left) Lie algebra action if the map ξ ∈ 𝔤 ↦ φ(ξ) ∈ 𝔛(M) is a Lie algebra antihomomorphism and the map (m, ξ) ∈ M × 𝔤 ↦ φ(ξ)(m) ∈ TM is smooth. The vector field X is said to be G-symmetric whenever it is equivariant with respect to the G-action Φ, that is, X ∘ Φ_g = TΦ_g ∘ X, for any g ∈ G. The space of G-symmetric vector fields on M is denoted by 𝔛(M)^G. The flow F_t of a G-symmetric vector field X ∈ 𝔛(M)^G is G-equivariant, that is, F_t ∘ Φ_g = Φ_g ∘ F_t, for any g ∈ G. The vector field X is said to be 𝔤-symmetric if [φ(ξ), X] = 0, for any ξ ∈ 𝔤. If 𝔤 is the Lie algebra of the Lie group G (see Lie Groups: General Theory) then the infinitesimal generators ξ_M ∈ 𝔛(M) of a smooth G-group action defined by

\[ \xi_M(m) := \left.\frac{d}{dt}\right|_{t=0} \Phi(\exp t\xi, m), \qquad \xi \in \mathfrak{g},\ m \in M \]

constitute a smooth Lie algebra 𝔤-action and we denote in this case φ(ξ) = ξ_M.

If m ∈ M, the closed Lie subgroup G_m := {g ∈ G | Φ(g, m) = m} is called the isotropy or symmetry subgroup of m. Similarly, the Lie subalgebra 𝔤_m := {ξ ∈ 𝔤 | φ(ξ)(m) = 0} is called the isotropy or symmetry subalgebra of m. If 𝔤 is the Lie algebra of G and the Lie algebra action is given by the infinitesimal generators, then 𝔤_m is the Lie algebra of G_m. The action is called free if G_m = {e} for every m ∈ M and locally free if 𝔤_m = {0} for every m ∈ M. We will write interchangeably Φ(g, m) = Φ_g(m) = Φ^m(g) = g · m, for m ∈ M and g ∈ G.

In this article we will focus mainly on continuous symmetries induced by proper Lie group actions. The action Φ is called proper whenever for any two convergent sequences {m_n}_{n∈ℕ} and {g_n · m_n := Φ(g_n, m_n)}_{n∈ℕ} in M, there exists a convergent subsequence {g_{n_k}}_{k∈ℕ} in G. Compact group actions are obviously proper.

Symmetry Reduction of Vector Fields

Let M be a smooth manifold and G a Lie group acting properly on M. Let X ∈ 𝔛(M)^G and F_t be its (necessarily equivariant) flow. For any isotropy subgroup H of the G-action on M, the H-isotropy type submanifold M_H := {m ∈ M | G_m = H} is preserved by the flow F_t. This property is known as the law of conservation of isotropy. The properness of the action guarantees that G_m is compact and that the (connected components of) M_H are embedded submanifolds of M for any closed subgroup H of G. The manifolds M_H are, in general, not closed in M. Moreover, the quotient group N(H)/H (where N(H) denotes the normalizer of H in G) acts freely and properly on M_H. Hence, if π_H : M_H → M_H/(N(H)/H) denotes the projection onto orbit space and i_H : M_H ↪ M is the injection, the vector field X induces a unique vector field X^H on the quotient M_H/(N(H)/H) defined by X^H ∘ π_H = Tπ_H ∘ X ∘ i_H, whose flow F_t^H is given by F_t^H ∘ π_H = π_H ∘ F_t ∘ i_H. We will refer to X^H ∈ 𝔛(M_H/(N(H)/H)) as the H-isotropy type reduced vector field induced by X.

This reduction technique has been widely exploited in handling specific dynamical systems. When the symmetry group G is compact and we are dealing with a linear action, the construction of the quotient M_H/(N(H)/H) can be implemented in a very explicit and convenient manner by using the invariant polynomials of the action and the theorems of Hilbert and Schwarz–Mather.

Symplectic Reduction

Symplectic or Marsden–Weinstein reduction is a procedure that implements symmetry reduction for the symmetric Hamiltonian systems defined on a symplectic manifold (M, ω). The particular case in which the symplectic manifold is a cotangent bundle is dealt with separately (see Cotangent Bundle Reduction). We recall that the Hamiltonian vector field X_h ∈ 𝔛(M) associated to the Hamiltonian function h ∈ C^∞(M) is uniquely determined by the equality ω(X_h, ·) = dh. In this context, the symmetries Φ : G × M → M of interest are given by symplectic or canonical transformations, that is, Φ_g^* ω = ω, for any g ∈ G. For canonical actions each G-invariant function h ∈ C^∞(M)^G has an associated G-symmetric Hamiltonian vector field X_h. A Lie
algebra action $\varphi$ is called symplectic or canonical if $\pounds_{\varphi(\xi)}\,\omega = 0$ for all $\xi \in \mathfrak{g}$, where $\pounds$ denotes the Lie derivative operator. If the Lie algebra action is induced from a canonical Lie group action by taking its infinitesimal generators, then it is also canonical.

Momentum Maps

The symmetry reduction described in the previous section for general vector fields does not produce a well-adapted answer for symplectic manifolds $(M, \omega)$ in the sense that the reduced spaces $M_H/(N(H)/H)$ are, in general, not symplectic. To solve this problem one has to use the conservation laws associated to the canonical action, which often appear as momentum maps.

Let $G$ be a Lie group acting canonically on the symplectic manifold $(M, \omega)$. Suppose that for any $\xi \in \mathfrak{g}$, the vector field $\xi_M$ is Hamiltonian, with Hamiltonian function $J^\xi \in C^\infty(M)$ and that $\xi \in \mathfrak{g} \mapsto J^\xi \in C^\infty(M)$ is linear. The map $J : M \to \mathfrak{g}^*$ defined by the relation $\langle J(z), \xi \rangle = J^\xi(z)$, for all $\xi \in \mathfrak{g}$ and $z \in M$, is called a momentum map of the $G$-action (see Hamiltonian Group Actions). Momentum maps, if they exist, are determined up to a constant in $\mathfrak{g}^*$ for any connected component of $M$.

Examples 1

(i) (Linear momentum) The phase space of an $N$-particle system is the cotangent space $T^*\mathbb{R}^{3N}$ endowed with its canonical symplectic structure. The additive group $\mathbb{R}^3$, whose Lie algebra is abelian and is also equal to $\mathbb{R}^3$, acts canonically on it by spatial translation on each factor: $v \cdot (q_i, p_i) = (q_i + v, p_i)$, with $i = 1, \ldots, N$. This action has an associated momentum map $J : T^*\mathbb{R}^{3N} \to \mathbb{R}^3$, where we identified the dual of $\mathbb{R}^3$ with itself using the Euclidean inner product, which coincides with the classical linear momentum $J(q_i, p_i) = \sum_{i=1}^{N} p_i$.
(ii) (Angular momentum) Let $SO(3)$ act on $\mathbb{R}^3$ and then, by lift, on $T^*\mathbb{R}^3$, that is, $A \cdot (q, p) = (Aq, Ap)$. This action is canonical and has as associated momentum map $J : T^*\mathbb{R}^3 \to \mathfrak{so}(3)^* \cong \mathbb{R}^3$, the classical angular momentum $J(q, p) = q \times p$.
(iii) (Lifted actions on cotangent bundles) The previous two examples are particular cases of the following situation. Let $\Phi : G \times Q \to Q$ be a smooth Lie group action. The (left) cotangent lifted action of $G$ on $T^*Q$ is given by $g \cdot \alpha_q := T^*_{g \cdot q}\Phi_{g^{-1}}(\alpha_q)$ for $g \in G$ and $\alpha_q \in T^*Q$. Cotangent lifted actions preserve the canonical 1-form on $T^*Q$ and hence are canonical. They admit an associated momentum map $J : T^*Q \to \mathfrak{g}^*$ given by $\langle J(\alpha_q), \xi \rangle = \alpha_q(\xi_Q(q))$, for any $\alpha_q \in T^*Q$ and any $\xi \in \mathfrak{g}$.
(iv) (Symplectic linear actions) Let $(V, \omega)$ be a symplectic linear space and let $G$ be a subgroup of the linear symplectic group, acting naturally on $V$. By the choice of $G$ this action is canonical and has a momentum map given by $\langle J(v), \xi \rangle = \tfrac{1}{2}\,\omega(\xi_V(v), v)$, for $\xi \in \mathfrak{g}$ and $v \in V$ arbitrary.

Properties of the Momentum Map

The main feature of the momentum map that makes it of interest for use in reduction is that it encodes conservation laws for $G$-symmetric Hamiltonian systems. Noether's theorem states that the momentum map is a constant of the motion for the Hamiltonian vector field $X_h$ associated to any $G$-invariant function $h \in C^\infty(M)^G$ (see Symmetries and Conservation Laws).

The derivative $TJ$ of the momentum map satisfies the following two properties: $\operatorname{range}(T_m J) = (\mathfrak{g}_m)^\circ$ and $\ker T_m J = (\mathfrak{g} \cdot m)^\omega$, for any $m \in M$, where $(\mathfrak{g}_m)^\circ$ denotes the annihilator in $\mathfrak{g}^*$ of the isotropy subalgebra $\mathfrak{g}_m$ of $m$, $\mathfrak{g} \cdot m := T_m(G \cdot m) = \{\xi_M(m) \mid \xi \in \mathfrak{g}\}$ is the tangent space at $m$ to the $G$-orbit that contains this point, and $(\mathfrak{g} \cdot m)^\omega$ is the symplectic orthogonal space to $\mathfrak{g} \cdot m$ in the symplectic vector space $(T_m M, \omega(m))$. The first relation is sometimes called the bifurcation lemma since it establishes a link between the symmetry of a point and the rank of the momentum map at that point.

The existence of the momentum map for a given canonical action is not guaranteed. A momentum map exists if and only if the linear map $\rho : [\xi] \in \mathfrak{g}/[\mathfrak{g}, \mathfrak{g}] \mapsto [\omega(\xi_M, \cdot)] \in H^1(M, \mathbb{R})$ is identically zero. Thus, if $H^1(M, \mathbb{R}) = 0$ or $\mathfrak{g}/[\mathfrak{g}, \mathfrak{g}] = H^1(\mathfrak{g}, \mathbb{R}) = 0$ then $\rho \equiv 0$. In particular, if $\mathfrak{g}$ is semisimple, the "first Whitehead lemma" states that $H^1(\mathfrak{g}, \mathbb{R}) = 0$ and therefore a momentum map always exists for canonical semisimple Lie algebra actions.

A natural question to ask is when the map $(\mathfrak{g}, [\cdot,\cdot]) \to (C^\infty(M), \{\cdot,\cdot\})$ defined by $\xi \mapsto J^\xi$, $\xi \in \mathfrak{g}$, is a Lie algebra homomorphism, that is, $J^{[\xi, \eta]} = \{J^\xi, J^\eta\}$, $\xi, \eta \in \mathfrak{g}$. Here $\{\cdot,\cdot\} : C^\infty(M) \times C^\infty(M) \to C^\infty(M)$ denotes the Poisson bracket associated to the symplectic form $\omega$ of $M$ defined by $\{f, h\} := \omega(X_f, X_h)$, $f, h \in C^\infty(M)$. This is the case if and only if $T_z J(\xi_M(z)) = -\operatorname{ad}^*_\xi J(z)$, for any $\xi \in \mathfrak{g}$, $z \in M$, where $\operatorname{ad}^*$ is the dual of the adjoint representation $\operatorname{ad} : (\xi, \eta) \in \mathfrak{g} \times \mathfrak{g} \mapsto [\xi, \eta] \in \mathfrak{g}$ of $\mathfrak{g}$ on itself. A momentum map that satisfies this relation is called infinitesimally equivariant. The reason behind this terminology is that this is the infinitesimal version of global or coadjoint equivariance: $J$ is $G$-equivariant if $\operatorname{Ad}^*_{g^{-1}} \circ J = J \circ \Phi_g$ or, equivalently,
$J^{\operatorname{Ad}_g \xi}(g \cdot z) = J^\xi(z)$, for all $g \in G$, $\xi \in \mathfrak{g}$, and $z \in M$; $\operatorname{Ad}^*$ denotes the dual of the adjoint representation $\operatorname{Ad}$ of $G$ on $\mathfrak{g}$. Actions admitting infinitesimally equivariant momentum maps are called Hamiltonian actions and Lie group actions with coadjoint equivariant momentum maps are called globally Hamiltonian actions. If the symmetry group $G$ is connected then global and infinitesimal equivariance of the momentum map are equivalent concepts. If $\mathfrak{g}$ acts canonically on $(M, \omega)$ and $H^1(\mathfrak{g}, \mathbb{R}) = \{0\}$ then this action admits at most one infinitesimally equivariant momentum map.

Since momentum maps are not uniquely defined, one may ask whether one can choose them to be equivariant. It turns out that if the momentum map is associated to the action of a compact Lie group, this can always be done. Momentum maps of cotangent lifted actions are also equivariant as are momentum maps defined by symplectic linear actions. Canonical actions of semisimple Lie algebras on symplectic manifolds admit infinitesimally equivariant momentum maps, since the "second Whitehead lemma" states that $H^2(\mathfrak{g}, \mathbb{R}) = 0$ if $\mathfrak{g}$ is semisimple. We shall identify below a specific element of $H^2(\mathfrak{g}, \mathbb{R})$ which is the obstruction to the equivariance of a momentum map (assuming it exists).

Even though, in general, it is not possible to choose a coadjoint equivariant momentum map, it turns out that when the symplectic manifold is connected there is an affine action on the dual of the Lie algebra with respect to which the momentum map is equivariant. Define the nonequivariance 1-cocycle associated to $J$ as the map $\sigma : G \to \mathfrak{g}^*$ given by $g \mapsto J(\Phi_g(z)) - \operatorname{Ad}^*_{g^{-1}}(J(z))$. The connectivity of $M$ implies that the right-hand side of this equality is independent of the point $z \in M$. In addition, $\sigma$ is a (left) $\mathfrak{g}^*$-valued 1-cocycle on $G$ with respect to the coadjoint representation of $G$ on $\mathfrak{g}^*$, that is, $\sigma(gh) = \sigma(g) + \operatorname{Ad}^*_{g^{-1}}\sigma(h)$ for all $g, h \in G$. Relative to the affine action $\Theta : G \times \mathfrak{g}^* \to \mathfrak{g}^*$ given by $(g, \mu) \mapsto \operatorname{Ad}^*_{g^{-1}}\mu + \sigma(g)$, the momentum map $J$ is equivariant. The "reduction lemma," the main technical ingredient in the proof of the reduction theorem, states that for any $m \in M$ we have

$$\mathfrak{g}_{J(m)} \cdot m = \mathfrak{g} \cdot m \cap \ker T_m J = \mathfrak{g} \cdot m \cap (\mathfrak{g} \cdot m)^\omega$$

where $\mathfrak{g}_{J(m)}$ is the Lie algebra of the isotropy group $G_{J(m)}$ of $J(m) \in \mathfrak{g}^*$ with respect to the affine action of $G$ on $\mathfrak{g}^*$ induced by the nonequivariance 1-cocycle of $J$.

The Symplectic Reduction Theorem

The symplectic reduction procedure that we now present consists of constructing a new symplectic manifold out of a given symmetric one in which the conservation laws encoded in the form of a momentum map and the degeneracies associated to the symmetry have been eliminated. This strategy allows the reduction of a symmetric Hamiltonian dynamical system to a dimensionally smaller one. This reduction procedure preserves the symplectic category, that is, if we start with a Hamiltonian system on a symplectic manifold, the reduced system is also a Hamiltonian system on a symplectic manifold. The reduced symplectic manifold is usually referred to as the symplectic or Marsden–Weinstein reduced space.

Theorem 2 Let $\Phi : G \times M \to M$ be a free proper canonical action of the Lie group $G$ on the connected symplectic manifold $(M, \omega)$. Suppose that this action has an associated momentum map $J : M \to \mathfrak{g}^*$, with nonequivariance 1-cocycle $\sigma : G \to \mathfrak{g}^*$. Let $\mu \in \mathfrak{g}^*$ be a value of $J$ and denote by $G_\mu$ the isotropy of $\mu$ under the affine action of $G$ on $\mathfrak{g}^*$. Then:

(i) The space $M_\mu := J^{-1}(\mu)/G_\mu$ is a regular quotient manifold and, moreover, it is a symplectic manifold with symplectic form $\omega_\mu$ uniquely characterized by the relation

$$\pi_\mu^* \omega_\mu = i_\mu^* \omega$$

The maps $i_\mu : J^{-1}(\mu) \hookrightarrow M$ and $\pi_\mu : J^{-1}(\mu) \to J^{-1}(\mu)/G_\mu$ denote the inclusion and the projection, respectively. The pair $(M_\mu, \omega_\mu)$ is called the symplectic point reduced space.
(ii) Let $h \in C^\infty(M)^G$ be a $G$-invariant Hamiltonian. The flow $F_t$ of the Hamiltonian vector field $X_h$ leaves the connected components of $J^{-1}(\mu)$ invariant and commutes with the $G$-action, so it induces a flow $F_t^\mu$ on $M_\mu$ defined by $\pi_\mu \circ F_t \circ i_\mu = F_t^\mu \circ \pi_\mu$.
(iii) The vector field generated by the flow $F_t^\mu$ on $(M_\mu, \omega_\mu)$ is Hamiltonian with associated reduced Hamiltonian function $h_\mu \in C^\infty(M_\mu)$ defined by $h_\mu \circ \pi_\mu = h \circ i_\mu$. The vector fields $X_h$ and $X_{h_\mu}$ are $\pi_\mu$-related. The triple $(M_\mu, \omega_\mu, h_\mu)$ is called the reduced Hamiltonian system.
(iv) Let $k \in C^\infty(M)^G$ be another $G$-invariant function. Then $\{h, k\}$ is also $G$-invariant and $\{h, k\}_\mu = \{h_\mu, k_\mu\}_{M_\mu}$, where $\{\cdot,\cdot\}_{M_\mu}$ denotes the Poisson bracket associated to the symplectic form $\omega_\mu$ on $M_\mu$.

Reconstruction of Dynamics

We pose now the question converse to the reduction of a Hamiltonian system. Assume that an integral curve $c_\mu(t)$ of the reduced Hamiltonian system $X_{h_\mu}$
on $(M_\mu, \omega_\mu)$ is known. Let $m_0 \in J^{-1}(\mu)$ be given. One can determine from this data the integral curve of the Hamiltonian system $X_h$ with initial condition $m_0$. In other words, one can reconstruct the solution of the given system knowing the corresponding reduced solution. The general method of reconstruction is the following. Pick a smooth curve $d(t)$ in $J^{-1}(\mu)$ such that $d(0) = m_0$ and $\pi_\mu(d(t)) = c_\mu(t)$. Then, if $c(t)$ denotes the integral curve of $X_h$ with $c(0) = m_0$, we can write $c(t) = g(t) \cdot d(t)$ for some smooth curve $g(t)$ in $G_\mu$ that is obtained in two steps. First, one finds a smooth curve $\xi(t)$ in $\mathfrak{g}_\mu$ such that $\xi(t)_M(d(t)) = X_h(d(t)) - \dot{d}(t)$. With the $\xi(t) \in \mathfrak{g}_\mu$ just obtained, one solves the nonautonomous differential equation $\dot{g}(t) = T_e L_{g(t)}\,\xi(t)$ on $G_\mu$ with $g(0) = e$.

The Orbit Formulation of the Symplectic Reduction Theorem

There is an alternative approach to the reduction theorem which consists of choosing as numerator of the symplectic reduced space the group invariant saturation of the level sets of the momentum map. This option produces as a result a space that is symplectomorphic to the Marsden–Weinstein quotient but presents the advantage of being more appropriate in the context of quantization problems. Additionally, this approach makes it easier to compare the symplectic reduced spaces corresponding to different values of the momentum map, which is important in the context of Poisson reduction (see Poisson Reduction). In carrying out this construction, one needs to use the natural symplectic structures that one can define on the orbits of the affine action of a group on the dual of its Lie algebra and that we now quickly review.

Let $G$ be a Lie group, $\sigma : G \to \mathfrak{g}^*$ a coadjoint 1-cocycle, and $\mu \in \mathfrak{g}^*$. Let $\mathcal{O}_\mu$ be the orbit through $\mu$ of the affine $G$-action on $\mathfrak{g}^*$ associated to $\sigma$. If $\Sigma : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ defined by

$$\Sigma(\xi, \eta) := \left.\frac{d}{dt}\right|_{t=0} \langle \sigma(\exp(t\xi)), \eta \rangle$$

is a real-valued Lie algebra 2-cocycle (which is always the case if $\Sigma$ is the derivative of a smooth real-valued group 2-cocycle or if $\sigma$ is the nonequivariance 1-cocycle of a momentum map), that is, $\Sigma : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ is skew-symmetric and $\Sigma([\xi, \eta], \zeta) + \Sigma([\eta, \zeta], \xi) + \Sigma([\zeta, \xi], \eta) = 0$ for all $\xi, \eta, \zeta \in \mathfrak{g}$, then the affine orbit $\mathcal{O}_\mu$ is a symplectic manifold with $G$-invariant symplectic structure $\omega^\pm_{\mathcal{O}_\mu}$ given by

$$\omega^\pm_{\mathcal{O}_\mu}(\nu)\bigl(\xi_{\mathfrak{g}^*}(\nu), \eta_{\mathfrak{g}^*}(\nu)\bigr) = \pm\langle \nu, [\xi, \eta] \rangle \mp \Sigma(\xi, \eta) \qquad [1]$$

for arbitrary $\nu \in \mathcal{O}_\mu$, and $\xi, \eta \in \mathfrak{g}$. The symbol $\xi_{\mathfrak{g}^*}(\nu) := -\operatorname{ad}^*_\xi \nu + \Sigma(\xi, \cdot)$ denotes the infinitesimal generator of the affine action on $\mathfrak{g}^*$ associated to $\xi \in \mathfrak{g}$. The symplectic structures $\omega^\pm_{\mathcal{O}_\mu}$ on $\mathcal{O}_\mu$ are called the $(\pm)$-orbit or Kostant–Kirillov–Souriau (KKS) symplectic forms.

This symplectic form can be obtained from Theorem 2 by considering the symplectic reduction of the cotangent bundle $T^*G$ endowed with the magnetic symplectic structure $\omega_\Sigma := \omega_{\mathrm{can}} - \pi^* B_\Sigma$, where $\omega_{\mathrm{can}}$ is the canonical symplectic form on $T^*G$, $\pi : T^*G \to G$ is the projection onto the base, and $B_\Sigma \in \Omega^2(G)^G$ is a left-invariant 2-form on $G$ whose value at the identity is the Lie algebra 2-cocycle $\Sigma : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$. Since $\Sigma$ is a cocycle, it follows that $B_\Sigma$ is closed and hence $\omega_\Sigma$ is a symplectic form. Moreover, the lifting of the left translations on $G$ provides a canonical $G$-action on $T^*G$ that has a momentum map given by $J(g, \mu) = \Theta(g, \mu)$, $(g, \mu) \in G \times \mathfrak{g}^* \simeq T^*G$, where the trivialization $G \times \mathfrak{g}^* \simeq T^*G$ is obtained via left translations. Symplectic reduction using these ingredients yields symplectic reduced spaces that are naturally symplectically diffeomorphic to the affine orbits $\mathcal{O}_\mu$ with the symplectic form [1].

Theorem 3 (Symplectic orbit reduction). Let $\Phi : G \times M \to M$ be a free proper canonical action of the Lie group $G$ on the connected symplectic manifold $(M, \omega)$. Suppose that this action has an associated momentum map $J : M \to \mathfrak{g}^*$, with nonequivariance 1-cocycle $\sigma : G \to \mathfrak{g}^*$. Let $\mathcal{O}_\mu := G \cdot \mu \subset \mathfrak{g}^*$ be the $G$-orbit of the point $\mu \in \mathfrak{g}^*$ with respect to the affine action of $G$ on $\mathfrak{g}^*$ associated to $\sigma$. Then the set $M_{\mathcal{O}_\mu} := J^{-1}(\mathcal{O}_\mu)/G$ is a regular quotient symplectic manifold with the symplectic form $\omega_{\mathcal{O}_\mu}$ uniquely characterized by the relation $i^*_{\mathcal{O}_\mu}\omega = \pi^*_{\mathcal{O}_\mu}\omega_{\mathcal{O}_\mu} + J^*_{\mathcal{O}_\mu}\omega^+_{\mathcal{O}_\mu}$, where $J_{\mathcal{O}_\mu}$ is the restriction of $J$ to $J^{-1}(\mathcal{O}_\mu)$ and $\omega^+_{\mathcal{O}_\mu}$ is the $(+)$-symplectic structure on the affine orbit $\mathcal{O}_\mu$. The maps $i_{\mathcal{O}_\mu} : J^{-1}(\mathcal{O}_\mu) \hookrightarrow M$ and $\pi_{\mathcal{O}_\mu} : J^{-1}(\mathcal{O}_\mu) \to M_{\mathcal{O}_\mu}$ are the natural injection and the projection, respectively. The pair $(M_{\mathcal{O}_\mu}, \omega_{\mathcal{O}_\mu})$ is called the symplectic orbit reduced space.

Statements similar to (ii)–(iv) in Theorem 2 can be formulated for the orbit reduced spaces $(M_{\mathcal{O}_\mu}, \omega_{\mathcal{O}_\mu})$.

We emphasize that given a momentum value $\mu \in \mathfrak{g}^*$, the reduced spaces $M_\mu$ and $M_{\mathcal{O}_\mu}$ are symplectically diffeomorphic via the projection to the quotients of the inclusion $J^{-1}(\mu) \hookrightarrow J^{-1}(\mathcal{O}_\mu)$.

Reduction at a general point can be replaced by reduction at zero at the expense of enlarging the manifold by the affine orbit. Consider the canonical diagonal action of $G$ on the symplectic difference $M \ominus \mathcal{O}^+_\mu$, which is the manifold $M \times \mathcal{O}_\mu$ with the symplectic form $\pi_1^*\omega - \pi_2^*\omega^+_{\mathcal{O}_\mu}$, where $\pi_1 : M \times \mathcal{O}_\mu \to M$ and $\pi_2 : M \times \mathcal{O}_\mu \to \mathcal{O}_\mu$ are the projections.
A momentum map for this action is given by $J \circ \pi_1 - \pi_2 : M \ominus \mathcal{O}^+_\mu \to \mathfrak{g}^*$. Let $(M \ominus \mathcal{O}^+_\mu)_0 := \bigl((J \circ \pi_1 - \pi_2)^{-1}(0)/G,\ (\omega \ominus \omega^+_{\mathcal{O}_\mu})_0\bigr)$ be the symplectic point reduced space at zero.

Theorem 4 (Shifting theorem). Under the hypotheses of the symplectic orbit reduction theorem (Theorem 3), the symplectic orbit reduced space $M_{\mathcal{O}_\mu}$, the point reduced spaces $M_\mu$, and $(M \ominus \mathcal{O}^+_\mu)_0$ are symplectically diffeomorphic.

Singular Reduction

In the previous section we carried out symplectic reduction for free and proper actions. The freeness guarantees via the bifurcation lemma that the momentum map $J$ is a submersion and hence the level sets $J^{-1}(\mu)$ are smooth manifolds. Freeness and properness ensure that the orbit spaces $M_\mu := J^{-1}(\mu)/G_\mu$ are regular quotient manifolds. The theory of singular reduction studies the properties of the orbit space $M_\mu$ when the hypothesis on the freeness of the action is dropped. The main result in this situation shows that these quotients are symplectic Whitney stratified spaces, in the sense that the strata are symplectic manifolds in a very natural way; moreover, the local properties of this Whitney stratification make it into what is called a cone space. This statement is referred to as the "symplectic stratification theorem" and adapts to the symplectic symmetric context the stratification theorem of the orbit space of a proper Lie group action by using its orbit type manifolds. In order to present this result, we review the necessary definitions and results on stratified spaces (see Singularity and Bifurcation Theory for more information on singularity theory).

Stratified Spaces

Let $\mathcal{Z}$ be a locally finite partition of the topological space $P$ into smooth manifolds $S_i \subset P$, $i \in I$. We assume that the manifolds $S_i \subset P$, $i \in I$, with their manifold topology are locally closed topological subspaces of $P$. The pair $(P, \mathcal{Z})$ is a decomposition of $P$ with pieces in $\mathcal{Z}$ when the following condition is satisfied:

Condition (DS) If $R, S \in \mathcal{Z}$ are such that $R \cap \bar{S} \neq \emptyset$, then $R \subset \bar{S}$. In this case we write $R \preceq S$. If, in addition, $R \neq S$ we say that $R$ is incident to $S$ or that it is a boundary piece of $S$ and write $R \prec S$.

The above condition is called the frontier condition and the pair $(P, \mathcal{Z})$ is called a decomposed space. The dimension of $P$ is defined as $\dim P = \sup\{\dim S_i \mid S_i \in \mathcal{Z}\}$. If $k \in \mathbb{N}$, the $k$-skeleton $P^k$ of $P$ is the union of all the pieces of dimension smaller than or equal to $k$; its topology is the relative topology induced by $P$. The depth $\operatorname{dp}(z)$ of any $z \in (P, \mathcal{Z})$ is defined as

$$\operatorname{dp}(z) := \sup\{k \in \mathbb{N} \mid \exists\, S_0, S_1, \ldots, S_k \in \mathcal{Z} \text{ with } z \in S_0 \prec S_1 \prec \cdots \prec S_k\}$$

Since for any two elements $x, y \in S$ in the same piece $S \in \mathcal{Z}$ we have $\operatorname{dp}(x) = \operatorname{dp}(y)$, the depth $\operatorname{dp}(S)$ of the piece $S$ is well defined by $\operatorname{dp}(S) := \operatorname{dp}(x)$, $x \in S$. Finally, the depth $\operatorname{dp}(P)$ of $(P, \mathcal{Z})$ is defined by $\operatorname{dp}(P) := \sup\{\operatorname{dp}(S) \mid S \in \mathcal{Z}\}$.

A continuous mapping $f : P \to Q$ between the decomposed spaces $(P, \mathcal{Z})$ and $(Q, \mathcal{Y})$ is a morphism of decomposed spaces if, for every piece $S \in \mathcal{Z}$, there is a piece $T \in \mathcal{Y}$ such that $f(S) \subset T$ and the restriction $f|_S : S \to T$ is smooth. If $(P, \mathcal{Z})$ and $(P, \mathcal{T})$ are two decompositions of the same topological space we say that $\mathcal{Z}$ is coarser than $\mathcal{T}$ or that $\mathcal{T}$ is finer than $\mathcal{Z}$ if the identity mapping $(P, \mathcal{T}) \to (P, \mathcal{Z})$ is a morphism of decomposed spaces. A topological subspace $Q \subset P$ is a decomposed subspace of $(P, \mathcal{Z})$ if, for all pieces $S \in \mathcal{Z}$, the intersection $S \cap Q$ is a submanifold of $S$ and the corresponding partition $\mathcal{Z} \cap Q$ forms a decomposition of $Q$.

Let $P$ be a topological space and $z \in P$. Two subsets $A$ and $B$ of $P$ are said to be equivalent at $z$ if there is an open neighborhood $U$ of $z$ such that $A \cap U = B \cap U$. This relation constitutes an equivalence relation on the power set of $P$. The class of all sets equivalent to a given subset $A$ at $z$ will be denoted by $[A]_z$ and called the set germ of $A$ at $z$. If $A \subset B \subset P$, we say that $[A]_z$ is a subgerm of $[B]_z$, and denote $[A]_z \subset [B]_z$.

A stratification of the topological space $P$ is a map $\mathcal{S}$ that associates to any $z \in P$ the set germ $\mathcal{S}(z)$ of a closed subset of $P$ such that the following condition is satisfied:

Condition (ST) For every $z \in P$ there is a neighborhood $U$ of $z$ and a decomposition $\mathcal{Z}$ of $U$ such that for all $y \in U$ the germ $\mathcal{S}(y)$ coincides with the set germ of the piece of $\mathcal{Z}$ that contains $y$.

The pair $(P, \mathcal{S})$ is called a stratified space. Any decomposition of $P$ defines a stratification of $P$ by associating to each of its points the set germ of the piece in which it is contained. The converse is, by definition, locally true.

The Strata

Two decompositions $\mathcal{Z}_1$ and $\mathcal{Z}_2$ of $P$ are said to be equivalent if they induce the same stratification of $P$. If $\mathcal{Z}_1$ and $\mathcal{Z}_2$ are equivalent decompositions of $P$ then, for all $z \in P$, we have that $\operatorname{dp}_{\mathcal{Z}_1}(z) = \operatorname{dp}_{\mathcal{Z}_2}(z)$. Any stratified space $(P, \mathcal{S})$ has a unique decomposition $\mathcal{Z}_{\mathcal{S}}$ associated with the following maximality property: for any open subset $U \subset P$ and any
decomposition $\mathcal{Z}$ of $P$ inducing $\mathcal{S}$ over $U$, the restriction of $\mathcal{Z}_{\mathcal{S}}$ to $U$ is coarser than the restriction of $\mathcal{Z}$ to $U$. The decomposition $\mathcal{Z}_{\mathcal{S}}$ is called the canonical decomposition associated to the stratification $(P, \mathcal{S})$. It is often denoted by $\mathcal{S}$ and its pieces are called the strata of $P$. The local finiteness of the decomposition $\mathcal{Z}_{\mathcal{S}}$ implies that for any stratum $S$ of $(P, \mathcal{S})$ there are only finitely many strata $R$ with $S \prec R$. Henceforth, the symbol $\mathcal{S}$ in the stratification $(P, \mathcal{S})$ will denote both the map that associates to each point a set germ and the set of pieces associated to the canonical decomposition induced by the stratification of $P$.

Stratified Spaces with Smooth Structure

Let $(P, \mathcal{S})$ be a stratified space. A singular or stratified chart of $P$ is a homeomorphism $\phi : U \to \phi(U) \subset \mathbb{R}^n$ from an open set $U \subset P$ to a subset of $\mathbb{R}^n$ such that for every stratum $S \in \mathcal{S}$ the image $\phi(U \cap S)$ is a submanifold of $\mathbb{R}^n$ and the restriction $\phi|_{U \cap S} : U \cap S \to \phi(U \cap S)$ is a diffeomorphism. Two singular charts $\phi : U \to \phi(U) \subset \mathbb{R}^n$ and $\varphi : V \to \varphi(V) \subset \mathbb{R}^m$ are compatible if for any $z \in U \cap V$ there exist an open neighborhood $W \subset U \cap V$ of $z$, a natural number $N \geq \max\{n, m\}$, open neighborhoods $O, O' \subset \mathbb{R}^N$ of $\phi(U) \times \{0\}$ and $\varphi(V) \times \{0\}$, respectively, and a diffeomorphism $\psi : O \to O'$ such that $i_m \circ \varphi|_W = \psi \circ i_n \circ \phi|_W$, where $i_n$ and $i_m$ denote the natural embeddings of $\mathbb{R}^n$ and $\mathbb{R}^m$ into $\mathbb{R}^N$ by using the first $n$ and $m$ coordinates, respectively. The notion of singular or stratified atlas is the natural generalization for stratifications of the concept of atlas existing for smooth manifolds. Analogously, we can talk of compatible and maximal stratified atlases. If the stratified space $(P, \mathcal{S})$ has a well-defined maximal atlas, then we say that this atlas determines a smooth or differentiable structure on $P$. We will refer to $(P, \mathcal{S})$ as a smooth stratified space.

The Whitney Conditions

Let $M$ be a manifold and $R, S \subset M$ two submanifolds. We say that the pair $(R, S)$ satisfies the Whitney condition (A) at the point $z \in R$ if the following condition is satisfied:

Condition (A) For any sequence of points $\{z_n\}_{n \in \mathbb{N}}$ in $S$ converging to $z \in R$ for which the sequence of tangent spaces $\{T_{z_n} S\}_{n \in \mathbb{N}}$ converges in the Grassmann bundle of $(\dim S)$-dimensional subspaces of $TM$ to $\tau \subset T_z M$, we have that $T_z R \subset \tau$.

Let $\phi : U \to \mathbb{R}^n$ be a smooth chart of $M$ around the point $z$. The Whitney condition (B) at the point $z \in R$ with respect to the chart $(U, \phi)$ is given by the following statement:

Condition (B) Let $\{x_n\}_{n \in \mathbb{N}} \subset R \cap U$ and $\{y_n\}_{n \in \mathbb{N}} \subset S \cap U$ be two sequences with the same limit

$$z = \lim_{n \to \infty} x_n = \lim_{n \to \infty} y_n$$

and such that $x_n \neq y_n$, for all $n \in \mathbb{N}$. Suppose that the set of connecting lines $\overline{\phi(x_n)\phi(y_n)} \subset \mathbb{R}^n$ converges in projective space to a line $L$ and that the sequence of tangent spaces $\{T_{y_n} S\}_{n \in \mathbb{N}}$ converges in the Grassmann bundle of $(\dim S)$-dimensional subspaces of $TM$ to $\tau \subset T_z M$. Then, $(T_z \phi)^{-1}(L) \subset \tau$.

If the condition (A) (respectively (B)) is verified for every point $z \in R$, the pair $(R, S)$ is said to satisfy the Whitney condition (A) (respectively (B)). It can be verified that Whitney's condition (B) does not depend on the chart used to formulate it. A stratified space with smooth structure such that, for every pair of strata, Whitney's condition (B) is satisfied is called a Whitney space.

Cone Spaces and Local Triviality

Let $P$ be a topological space. Consider the equivalence relation $\sim$ in the product $P \times [0, \infty)$ given by $(z, a) \sim (z', a')$ if and only if $a = a' = 0$. We define the cone $CP$ on $P$ as the quotient topological space $P \times [0, \infty)/\!\sim$. If $P$ is a smooth manifold then the cone $CP$ is a decomposed space with two pieces, namely, $P \times (0, \infty)$ and the vertex which is the class corresponding to any element of the form $(z, 0)$, $z \in P$, that is, $P \times \{0\}$. Analogously, if $(P, \mathcal{Z})$ is a decomposed (stratified) space then the associated cone $CP$ is also a decomposed (stratified) space whose pieces (strata) are the vertex and the sets of the form $S \times (0, \infty)$, with $S \in \mathcal{Z}$. This implies, in particular, that $\dim CP = \dim P + 1$ and $\operatorname{dp}(CP) = \operatorname{dp}(P) + 1$.

A stratified space $(P, \mathcal{S})$ is said to be locally trivial if for any $z \in P$ there exist a neighborhood $U$ of $z$, a stratified space $(F, \mathcal{S}^F)$, a distinguished point $0 \in F$, and an isomorphism of stratified spaces

$$\psi : U \to (S \cap U) \times F$$

where $S$ is the stratum that contains $z$ and $\psi$ satisfies $\psi^{-1}(y, 0) = y$, for all $y \in S \cap U$. When $F$ is given by a cone $CL$ over a compact stratified space $L$ then $L$ is called the link of $z$.

An important corollary of "Thom's first isotopy lemma" guarantees that every Whitney stratified space is locally trivial. A converse to this implication needs the introduction of cone spaces. Their definition is given by recursion on the depth of the space.
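
As a small worked instance of the cone construction described in this section, consider the circle. The choice $P = S^1$, taken with its trivial one-piece decomposition (so $\dim S^1 = 1$ and $\operatorname{dp}(S^1) = 0$), is ours and serves only as an illustration of the dimension and depth formulas:

```latex
% Cone over the circle, identified with the plane.
\[
  CS^1 = \bigl(S^1 \times [0,\infty)\bigr)/\!\sim
  \;\longrightarrow\; \mathbb{R}^2,
  \qquad [z,t] \longmapsto t\,z
  \quad (z \in S^1 \subset \mathbb{R}^2,\; t \ge 0).
\]
% This map is a homeomorphism: every nonzero point of the plane is
% t z for a unique pair (z,t) with t > 0, and the whole class
% S^1 x {0} (the vertex) is sent to the origin.
%
% Pieces of the decomposition of CS^1:
\[
  S^1 \times (0,\infty) \;\cong\; \mathbb{R}^2 \setminus \{0\}
  \quad (\text{dimension } 2),
  \qquad
  \text{vertex} \;\cong\; \{0\}
  \quad (\text{dimension } 0).
\]
% The origin lies in the closure of R^2 \ {0}, so the vertex is
% incident to the open piece: vertex \prec S^1 x (0,\infty).
\[
  \dim CS^1 = 2 = \dim S^1 + 1,
  \qquad
  \operatorname{dp}(CS^1) = 1 = \operatorname{dp}(S^1) + 1 .
\]
```

The chain vertex $\prec S^1 \times (0, \infty)$ has length one and no longer chain exists, which is why the depth increases by exactly one when passing to the cone.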
Definition 5 Let $m \in \mathbb{N} \cup \{\infty, \omega\}$. A cone space of class $C^m$ and depth 0 is the union of countably many $C^m$ manifolds together with the stratification whose strata are the unions of the connected components of equal dimension. A cone space of class $C^m$ and depth $d + 1$, $d \in \mathbb{N}$, is a stratified space $(P, \mathcal{S})$ with a $C^m$ differentiable structure such that for any $z \in P$ there exists a connected neighborhood $U$ of $z$, a compact cone space $L$ of class $C^m$ and depth $d$ called the link, and a stratified isomorphism

$$\psi : U \to (S \cap U) \times CL$$

where $S$ is the stratum that contains the point $z$, the map $\psi$ satisfies $\psi^{-1}(y, 0) = y$, for all $y \in S \cap U$, and $0$ is the vertex of the cone $CL$.

If $m \neq 0$ then $L$ is required to be embedded into a sphere via a fixed smooth global singular chart $\varphi : L \to S^l$ that determines the smooth structure of $CL$. More specifically, the smooth structure of $CL$ is generated by the global chart $\tau : [z, t] \in CL \mapsto t\varphi(z) \in \mathbb{R}^{l+1}$. The maps $\psi : U \to (S \cap U) \times CL$ and $\varphi : L \to S^l$ are referred to as a cone chart and a link chart, respectively. Moreover, if $m \neq 0$ then $\psi$ and $\psi^{-1}$ are required to be differentiable of class $C^m$ as maps between stratified spaces with a smooth structure.

The Symplectic Stratification Theorem

Let $(M, \omega)$ be a connected symplectic manifold acted canonically and properly upon by a Lie group $G$. Suppose that this action has an associated momentum map $J : M \to \mathfrak{g}^*$ with nonequivariance 1-cocycle $\sigma : G \to \mathfrak{g}^*$. Let $\mu \in \mathfrak{g}^*$ be a value of $J$, $G_\mu$ the isotropy subgroup of $\mu$ with respect to the affine action $\Theta : G \times \mathfrak{g}^* \to \mathfrak{g}^*$ determined by $\sigma$, and let $H \subset G$ be an isotropy subgroup of the $G$-action on $M$. Let $M_H^z$ be the connected component of the $H$-isotropy type manifold that contains a given element $z \in M$ such that $J(z) = \mu$ and let $G_\mu M_H^z$ be its $G_\mu$-saturation. Then the following hold:

1. The set $J^{-1}(\mu) \cap G_\mu M_H^z$ is a submanifold of $M$.
2. The set $M_\mu^{(H)} := [J^{-1}(\mu) \cap G_\mu M_H^z]/G_\mu$ has a unique quotient differentiable structure such that the canonical projection $\pi_\mu^{(H)} : J^{-1}(\mu) \cap G_\mu M_H^z \to M_\mu^{(H)}$ is a surjective submersion.
3. There is a unique symplectic structure $\omega_\mu^{(H)}$ on $M_\mu^{(H)}$ characterized by

$$i_\mu^{(H)*}\omega = \pi_\mu^{(H)*}\omega_\mu^{(H)}$$

where $i_\mu^{(H)} : J^{-1}(\mu) \cap G_\mu M_H^z \hookrightarrow M$ is the natural inclusion. The pairs $(M_\mu^{(H)}, \omega_\mu^{(H)})$ will be called singular symplectic point strata.
4. Let $h \in C^\infty(M)^G$ be a $G$-invariant Hamiltonian. Then the flow $F_t$ of $X_h$ leaves the connected components of $J^{-1}(\mu) \cap G_\mu M_H^z$ invariant and commutes with the $G_\mu$-action, so it induces a flow $F_t^\mu$ on $M_\mu^{(H)}$ that is characterized by $\pi_\mu^{(H)} \circ F_t \circ i_\mu^{(H)} = F_t^\mu \circ \pi_\mu^{(H)}$.
5. The flow $F_t^\mu$ is Hamiltonian on $M_\mu^{(H)}$, with reduced Hamiltonian function $h_\mu^{(H)} : M_\mu^{(H)} \to \mathbb{R}$ defined by $h_\mu^{(H)} \circ \pi_\mu^{(H)} = h \circ i_\mu^{(H)}$. The vector fields $X_h$ and $X_{h_\mu^{(H)}}$ are $\pi_\mu^{(H)}$-related.
6. Let $k : M \to \mathbb{R}$ be another $G$-invariant function. Then $\{h, k\}$ is also $G$-invariant and $\{h, k\}_\mu^{(H)} = \{h_\mu^{(H)}, k_\mu^{(H)}\}_{M_\mu^{(H)}}$, where $\{\cdot,\cdot\}_{M_\mu^{(H)}}$ denotes the Poisson bracket induced by the symplectic structure on $M_\mu^{(H)}$.

Theorem 6 (Symplectic stratification theorem). The quotient $M_\mu := J^{-1}(\mu)/G_\mu$ is a cone space when considered as a stratified space with strata $M_\mu^{(H)}$.

As was the case for regular reduction, this theorem can also be formulated from the orbit reduction point of view. Using that approach one can conclude that the orbit reduced spaces $M_{\mathcal{O}_\mu}$ are cone spaces symplectically stratified by the manifolds $M_{\mathcal{O}_\mu}^{(H)} := G \cdot (J^{-1}(\mu) \cap M_H^z)/G$ that have symplectic structure uniquely determined by the expression

$$i_{\mathcal{O}_\mu}^{(H)*}\omega = \pi_{\mathcal{O}_\mu}^{(H)*}\omega_{\mathcal{O}_\mu}^{(H)} + J_{\mathcal{O}_\mu}^{(H)*}\omega^+_{\mathcal{O}_\mu}$$

where $i_{\mathcal{O}_\mu}^{(H)} : G \cdot (J^{-1}(\mu) \cap M_H^z) \hookrightarrow M$ is the inclusion, $J_{\mathcal{O}_\mu}^{(H)} : G \cdot (J^{-1}(\mu) \cap M_H^z) \to \mathcal{O}_\mu$ is obtained by restriction of the momentum map $J$, and $\omega^+_{\mathcal{O}_\mu}$ is the $(+)$-symplectic form on $\mathcal{O}_\mu$. Analogous statements to (4)–(6) above with obvious modifications are valid.

See also: Cotangent Bundle Reduction; Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Graded Poisson Algebras; Hamiltonian Group Actions; Lie Groups: General Theory; Poisson Reduction; Singularity and Bifurcation Theory; Symmetries and Conservation Laws.
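
The angular momentum example and Noether's theorem from the discussion of momentum maps can be checked numerically. The following is a minimal sketch under our own assumptions, not part of the original article: the helper names (`cross`, `rotate`, `momentum_map`, `oscillator_flow`, `close`), the sample point, and the choice of the isotropic harmonic oscillator $h(q, p) = (|p|^2 + |q|^2)/2$ as an $SO(3)$-invariant Hamiltonian are all illustrative. It tests equivariance, $J(Aq, Ap) = A\,J(q, p)$, and conservation of $J(q, p) = q \times p$ along the exact flow of $h$.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))


def rotate(axis, angle, v):
    """Rotate v about the unit vector `axis` by `angle` (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i] * c + k_cross_v[i] * s + axis[i] * k_dot_v * (1.0 - c)
                 for i in range(3))


def momentum_map(q, p):
    """Angular momentum J(q, p) = q x p, with so(3)* identified with R^3."""
    return cross(q, p)


def oscillator_flow(q, p, t):
    """Exact flow of h = (|p|^2 + |q|^2)/2, i.e. q' = p, p' = -q."""
    c, s = math.cos(t), math.sin(t)
    q_t = tuple(c * qi + s * pi for qi, pi in zip(q, p))
    p_t = tuple(-s * qi + c * pi for qi, pi in zip(q, p))
    return q_t, p_t


def close(a, b, tol=1e-12):
    """Componentwise comparison up to a small tolerance."""
    return all(abs(x - y) < tol for x, y in zip(a, b))


# Illustrative sample data (arbitrary choices).
q0 = (1.0, 0.5, -0.3)
p0 = (0.2, -1.0, 0.7)
axis = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)  # unit vector
angle = 0.8

# Equivariance: J(Aq, Ap) = A J(q, p) for the rotation A.
Aq, Ap = rotate(axis, angle, q0), rotate(axis, angle, p0)
equivariant = close(momentum_map(Aq, Ap),
                    rotate(axis, angle, momentum_map(q0, p0)))

# Noether: J is constant along the flow of the SO(3)-invariant Hamiltonian.
q1, p1 = oscillator_flow(q0, p0, 2.5)
conserved = close(momentum_map(q1, p1), momentum_map(q0, p0))

print(equivariant, conserved)
```

The oscillator was chosen because its flow is known in closed form, so the conservation check does not depend on the accuracy of a numerical integrator; any other rotation-invariant Hamiltonian would serve equally well in principle.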

Symmetry Breaking in Field Theory

T W B Kibble, Imperial College, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Spontaneous symmetry breaking in its simplest form occurs when there is a symmetry of a dynamical system that is not manifest in its ground state or equilibrium state. It is a common feature of many classical and quantum systems. In quantum field theories, in the infinite-volume limit, there are new features, the appearance of unitarily inequivalent representations of the canonical commutation relations, and the possibility of a true phase transition, that is, a point in the phase space where the thermodynamic free energy is nonanalytic. The spontaneous breaking of a continuous global symmetry implies the existence of massless particles, the Goldstone bosons, while in the local-symmetry case some or all of these may be eliminated by the Higgs mechanism. Spontaneous symmetry breaking in gauge theories is however a more elusive concept.

Breaking of Global Symmetries

In a quantum-mechanical system a (time-independent) symmetry is represented by a unitary operator $\hat{U}$ acting on the Hilbert space of quantum states which commutes with the Hamiltonian $\hat{H}$. If the ground state $|0\rangle$ of the system is not invariant under $\hat{U}$, then $|0'\rangle = \hat{U}|0\rangle \neq c|0\rangle$ is also a ground state. In other words, the ground state is degenerate.

For a system with a finite number of degrees of freedom, whose states are represented by vectors in a separable Hilbert space $\mathcal{H}$, symmetry breaking of an abelian symmetry group $G$ is impossible, unless there are additional accidental symmetries. Consider, for example, a particle in a double-well potential

$$V = \frac{m\omega^2}{8a^2}\,(x^2 - a^2)^2 \qquad [1]$$

which has the discrete symmetry group $G = \mathbb{Z}_2$; the inversion symmetry operator $\hat{U}$ satisfies $\hat{U}^2 = 1$. There are then two approximate ground states $|0\rangle$ and $|0'\rangle = \hat{U}|0\rangle$, with wave functions proportional to $\exp[-\tfrac{1}{2} m\omega (x \mp a)^2]$. However, there is an overlap between these, and the off-diagonal matrix element $\langle 0|\hat{H}|0'\rangle$ is nonzero, although exponentially small, so the true energy eigenstates are, approximately, $|0_\pm\rangle = (1/\sqrt{2})(|0\rangle \pm |0'\rangle)$. (More accurate energy eigenfunctions and eigenvalues may be found by using the WKB approximation.)

Of course, if the symmetry group is nonabelian, and the ground state belongs to a nontrivial representation, then degeneracy is unavoidable. For example, if $G$ is the rotation group $SO(3)$ (or $SU(2)$)
and the ground state has angular momentum $j \neq 0$, then it is $(2j+1)$-fold degenerate.

The situation is different, however, in a quantum field theory. In the infinite-volume limit, even abelian symmetries can be spontaneously broken. Take, for example, a real scalar field with Lagrangian

$$\mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - V = \tfrac{1}{2}\dot{\phi}^2 - \tfrac{1}{2}(\nabla\phi)^2 - V \qquad [2]$$

(where we set $c = \hbar = 1$), again with a double-well potential

$$V = \tfrac{1}{8}\lambda(\phi^2 - \eta^2)^2 \qquad [3]$$

exhibiting a $Z_2$ symmetry under which $\phi(x) \mapsto -\phi(x)$.

At least in the semiclassical or tree approximation, there are two degenerate vacuum states $|0\rangle$ and $|0'\rangle$, with

$$\langle 0|\hat{\phi}(x)|0\rangle \approx \eta \quad\text{and}\quad \langle 0'|\hat{\phi}(x)|0'\rangle \approx -\eta \qquad [4]$$

If we quantize the system in a box of finite volume $\mathcal{V}$, then, as earlier, there is an off-diagonal matrix element of the Hamiltonian connecting the two states, so the true ground state is (approximately) $(1/\sqrt{2})(|0\rangle + |0'\rangle)$. However, this matrix element goes to zero exponentially as $\mathcal{V} \to \infty$. Even for large but finite volume, the rate of transitions from $|0\rangle$ to $|0'\rangle$ is exponentially slow.

Similarly, we can consider a complex scalar field theory with a sombrero potential:

$$\mathcal{L} = |\dot{\phi}|^2 - |\nabla\phi|^2 - V, \qquad V = \tfrac{1}{2}\lambda\left(|\phi|^2 - \tfrac{1}{2}\eta^2\right)^2 \qquad [5]$$

This model is invariant under the U(1) group of phase transformations, $\phi(x) \mapsto \phi(x)e^{i\alpha}$, so we now have a continuously infinite set of degenerate vacuum states $|0_\alpha\rangle$ labeled by an angle $\alpha$, and satisfying

$$\langle 0_\alpha|\hat{\phi}(x)|0_\alpha\rangle \approx \frac{1}{\sqrt{2}}\,\eta\,e^{i\alpha} \qquad [6]$$

Once again, one finds that in the infinite-volume limit there are no matrix elements connecting the different vacuum states. Moreover, in this limit no polynomial formed from the field operators $\hat{\phi}(x)$ in a finite volume can have nonzero matrix elements between $|0_\alpha\rangle$ and $|0_\beta\rangle$ for $\alpha \neq \beta$. Applying the operators $\hat{\phi}(x)$ to any one of these vacuum states $|0_\alpha\rangle$, we can construct a Fock space $\mathcal{H}_\alpha$, and the representations of the canonical commutation relations on these separate Hilbert spaces are unitarily inequivalent. Formally, we can introduce operators $\hat{U}_\alpha$ that perform the symmetry transformations:

$$\hat{U}_\alpha\,\hat{\phi}(x)\,\hat{U}_\alpha^{-1} = \hat{\phi}(x)\,e^{i\alpha} \qquad [7]$$

However, these are not unitary operators on the spaces $\mathcal{H}_\alpha$, but rather maps from one space to another, $\hat{U}_\alpha : \mathcal{H}_\beta \to \mathcal{H}_{\alpha+\beta}$, or, alternatively, operators on the nonseparable Hilbert space $\mathcal{H} = \bigoplus_\alpha \mathcal{H}_\alpha$.

So far, our discussion has been restricted to the tree approximation. For a full quantum treatment, $V(\phi)$ must be replaced by the effective potential $V_{\mathrm{eff}}(\phi)$, which may be defined as the minimum value of the mean energy density in all states in which the field $\hat{\phi}$ has the uniform expectation value $\langle\hat{\phi}(x)\rangle = \phi$. $V_{\mathrm{eff}}$ may be computed by summing vacuum loop diagrams.

A point to note is that although the degenerate vacua $|0_\alpha\rangle$ are mathematically distinct, in the absence of any external definition of phase, they are physically identical. There is no internal observational test that will distinguish them.

Symmetry-Breaking Phase Transitions

Spontaneous symmetry breaking often occurs in the context of a phase transition. At high temperature, $T \gg \eta$, there are large fluctuations in $\phi$ and the central hump of the potential is unimportant. Then the equilibrium state is symmetric, with $\langle\hat{\phi}\rangle = 0$. However, as the temperature falls, it becomes less probable that the field will fluctuate over the top of the hump. It will tend to fall into the trough, and acquire a nonzero average value $\langle\hat{\phi}\rangle$ (the order parameter for the phase transition), thus breaking the symmetry. The direction of symmetry breaking (e.g., the phase of $\phi$ in the U(1) model) is random, determined in practice by small preexisting fluctuations or interactions with the environment.

One way of studying this process is to compute the temperature-dependent effective potential $V_{\mathrm{eff}}(\phi, T)$. In the one-loop approximation, at high temperature, the leading corrections to the zero-temperature effective potential $V_{\mathrm{eff}}(\phi, 0)$ are of the form

$$V_{\mathrm{eff}}(\phi, T) = V_{\mathrm{eff}}(\phi, 0) - \frac{\pi^2}{90} N_* T^4 + \frac{1}{24} M_*^2(\phi)\,T^2 + O(T) \qquad [8]$$

where $N_*$ is the total number of helicity states of light particles (those with masses $\ll T$), and $M_*^2$, which depends on $\phi$, is the sum of their squared masses. (Fermions if present contribute to $N_*$ with a factor of 7/8 and to $M_*^2$ with a factor of 1/2.) In the simplest case, where we have only a multiplet $\boldsymbol{\phi} = (\phi_a)_{a=1,\dots,N}$ of real scalar fields, $N_* = N$ and $M_*^2 = M^2_{aa}$
(summation over $a$ implied), where the mass-squared matrix is

$$M^2_{ab} = \frac{\partial^2 V}{\partial\phi_a\,\partial\phi_b} \qquad [9]$$

For example, in an O(N) theory, with $V = \tfrac{1}{8}\lambda(\boldsymbol{\phi}^2 - \eta^2)^2$, where $\boldsymbol{\phi}^2 = \phi_a\phi_a$, one has

$$M^2_{ab} = \tfrac{1}{2}\lambda(\boldsymbol{\phi}^2 - \eta^2)\,\delta_{ab} + \lambda\,\phi_a\phi_b \qquad [10]$$

whence

$$V_{\mathrm{eff}}(\boldsymbol{\phi}, T) \approx \tfrac{1}{8}\lambda(\boldsymbol{\phi}^2 - \eta^2)^2 - \frac{\pi^2}{90} N T^4 + \frac{1}{48}\lambda T^2\left[(N+2)\boldsymbol{\phi}^2 - N\eta^2\right] \qquad [11]$$

It is then easy to see that the minimum occurs at $\boldsymbol{\phi} = 0$ for $T > T_c$, where in this approximation $T_c^2 = 12\eta^2/(N+2)$, while below the critical temperature the minimum is at

$$\boldsymbol{\phi}^2 = \eta_{\mathrm{eq}}^2(T) \equiv \eta^2 - \frac{N+2}{12}\,T^2 \qquad [12]$$

As $T \to 0$, the equilibrium state approaches one of the vacuum states $|0_{\mathbf{n}}\rangle$, labeled by an N-dimensional unit vector $\mathbf{n}$, such that $\langle 0_{\mathbf{n}}|\hat{\boldsymbol{\phi}}|0_{\mathbf{n}}\rangle = \eta\,\mathbf{n}$.

It is often convenient to introduce a classical symmetry-breaking potential. For example, in the O(N) model, we may take $V_{\mathrm{sb}} = -\mathbf{j}\cdot\boldsymbol{\phi}(x)$, where $\mathbf{j}$ is a constant N-vector. This has the effect of tilting the potential, thus removing the degeneracy. A characteristic of spontaneous symmetry breaking is that the limits $\mathbf{j} \to 0$ and $\mathcal{V} \to \infty$ do not commute. If (for $T < T_c$) we take the infinite-volume limit first, and then let $\mathbf{j} \to 0$, we get different equilibrium states, depending on the direction from which $\mathbf{j}$ approaches zero; if we fix $\mathbf{n}$ and let $\mathbf{j} = j\mathbf{n}$, $j \to 0$, then we find

$$\lim_{j\to 0}\,\lim_{\mathcal{V}\to\infty} \langle\hat{\boldsymbol{\phi}}(x)\rangle_{j\mathbf{n}} = \eta_{\mathrm{eq}}(T)\,\mathbf{n} \qquad [13]$$

We may also regard $\mathbf{j}$ as representing an interaction with the external environment (e.g., other fields). If such a term is present during the cooling of the system through the phase transition, it will constrain the direction of the spontaneous symmetry breaking. Note that one always arrives in this way at one of the degenerate vacua $|0_{\mathbf{n}}\rangle$, not a linear combination of them.

Goldstone Bosons

The Goldstone theorem states that spontaneous breaking of any continuous global symmetry leads inevitably (except, as we discuss later, in the presence of long-range forces) to the appearance of massless modes, the Goldstone bosons.

The proof is straightforward. Associated with any continuous symmetry there is a Noether current satisfying the continuity equation $\partial_\mu\hat{j}^\mu = 0$ and such that infinitesimal symmetry transformations are generated by the spatial integral of $\hat{j}^0$. The fact that the symmetry is broken means that there is some scalar field $\hat{\phi}(x)$ whose vacuum expectation value $\langle 0|\hat{\phi}(0)|0\rangle$ is not invariant under the symmetry transformation. Hence,

$$\lim_{\mathcal{V}\to\infty} i\int_{\mathcal{V}} d^3x\,\langle 0|[\hat{j}^0(x), \hat{\phi}(0)]|0\rangle\Big|_{x^0=0} \neq 0 \qquad [14]$$

Moreover, the time derivative of this integral is

$$\lim_{\mathcal{V}\to\infty} i\int_{\mathcal{V}} d^3x\,\langle 0|[\partial_0\hat{j}^0(x), \hat{\phi}(0)]|0\rangle\Big|_{x^0=0} = -\lim_{\mathcal{V}\to\infty} i\int_{\partial\mathcal{V}} dS_k\,\langle 0|[\hat{j}^k(x), \hat{\phi}(0)]|0\rangle\Big|_{x^0=0} = 0 \qquad [15]$$

where $\partial\mathcal{V}$ is the bounding surface of $\mathcal{V}$. This vanishes because the surface integral is zero: in a relativistic theory, because the commutator vanishes at spacelike separation, and more generally in the absence of long-range interactions because it tends rapidly to zero at large spatial separation.

Now, inserting a complete set of momentum eigenstates $|n, \mathbf{p}\rangle$ in [14], we can see that there must exist states such that $\langle n, \mathbf{p}|\hat{\phi}(0)|0\rangle \neq 0$, with $p^0 \to 0$ in the limit $|\mathbf{p}| \to 0$, that is, massless modes.

One can see this more directly in the U(1) model above. Consider a vacuum state $|0\rangle$ such that $\langle 0|\hat{\phi}|0\rangle = \eta/\sqrt{2}$ is real. Then it is useful to shift the origin of $\phi$ by writing

$$\phi(x) = \frac{1}{\sqrt{2}}\left[\eta + \varphi_1(x) + i\varphi_2(x)\right] \qquad [16]$$

where $\varphi_1$ and $\varphi_2$ are real. Then the Lagrangian becomes

$$\mathcal{L} = \tfrac{1}{2}\Big[\dot{\varphi}_1^2 - (\nabla\varphi_1)^2 + \dot{\varphi}_2^2 - (\nabla\varphi_2)^2 - \lambda\eta^2\varphi_1^2 - \lambda\eta\,\varphi_1\left(\varphi_1^2 + \varphi_2^2\right) - \tfrac{1}{4}\lambda\left(\varphi_1^2 + \varphi_2^2\right)^2\Big] \qquad [17]$$

Evidently, the field $\varphi_1$, corresponding to radial oscillations in $\phi$, is massive, with mass $\sqrt{\lambda}\,\eta$. But there is no term in $\varphi_2^2$, so $\varphi_2$ is massless.

In the case of spontaneous symmetry breaking of nonabelian symmetries, there may be several Goldstone bosons, one for each broken component of the continuous symmetry. In our theory with symmetry group $G = \mathrm{O}(N)$, the possible values of the vacuum expectation value at $T = 0$ are $\langle 0_{\mathbf{n}}|\hat{\boldsymbol{\phi}}(0)|0_{\mathbf{n}}\rangle \approx \eta\,\mathbf{n}$,
where $\mathbf{n}$ is an arbitrary unit vector. In this case, for given $\mathbf{n}$, there is an unbroken symmetry subgroup

$$H = \{R \in \mathrm{O}(N) : R\mathbf{n} = \mathbf{n}\} = \mathrm{O}(N-1) \qquad [18]$$

and the number of broken symmetries is

$$\dim G - \dim H = N - 1 \qquad [19]$$

Thus, the radial component of $\boldsymbol{\phi}$ is massive, and there are $N-1$ Goldstone bosons, the $N-1$ transverse components.

Spontaneously Broken Gauge Theories

As we shall see, symmetry breaking in gauge theories is a more problematic concept but, for the moment, these complications are ignored and the present discussion will continue with an approach similar to that used above.

The simplest local gauge symmetry theory is a U(1) Higgs model, a model of a complex scalar field $\phi(x)$ interacting with a gauge potential $A_\mu(x)$, described by the Lagrangian

$$\mathcal{L} = D_\mu\phi^*\,D^\mu\phi - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - V(|\phi|) \qquad [20]$$

where $V$ is a sombrero potential as in [5], while the covariant derivative $D_\mu\phi$ and gauge field $F_{\mu\nu}$ are given by

$$D_\mu\phi = \partial_\mu\phi + ieA_\mu\phi, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \qquad [21]$$

The model is invariant under the local U(1) gauge transformations

$$\phi(x) \mapsto \phi(x)\,e^{i\alpha(x)}, \qquad A_\mu(x) \mapsto A_\mu(x) - \frac{1}{e}\,\partial_\mu\alpha(x) \qquad [22]$$

The Goldstone theorem does not apply to local-symmetry theories. The problem is that to have a Hilbert space containing only physical states one must eliminate the gauge freedom by choosing a gauge condition (e.g., in the U(1) case the Coulomb gauge $\partial_k A_k(x) = 0$, which has the effect of restricting the number of polarization states of photons to two). This necessarily breaks manifest Lorentz invariance, although the theory is, of course, still fully Lorentz invariant. The proof of the theorem fails because the current is no longer local; the long-range Coulomb interaction makes the commutator fall off only like $1/r^2$, so the surface integral no longer vanishes in the infinite-volume limit. (The theorem also fails for nonrelativistic models with long-range forces.)

Again, consider a vacuum state $|0\rangle$ in which $\langle 0|\hat{\phi}|0\rangle = \eta/\sqrt{2}$, and make the same decomposition, [16]. Then, if we set

$$A'_\mu = A_\mu + \frac{1}{e\eta}\,\partial_\mu\varphi_2 \qquad [23]$$

we find that the kinetic term for $\varphi_2$ has been absorbed into a mass term $\tfrac{1}{2}e^2\eta^2 A'_\mu A'^\mu$ for the vector field. We have a model with only massive fields: the "Higgs field" $\varphi_1$ with mass $\sqrt{\lambda}\,\eta$ and the gauge field $A'_\mu$ with mass $e\eta$. The Goldstone bosons have been "eaten up" by the vector field to provide its longitudinal mode. This is the Higgs mechanism, first noted by Anderson in the context of the photon in a plasma becoming a massive plasmon.

A more elegant way of seeing this is to note that we can always make a gauge transformation to ensure that $\phi$ is real (at least so long as $\phi \neq 0$; where it is zero, there may be problems). This means that $\phi(x) = (1/\sqrt{2})(\eta + \varphi_1)$; $\varphi_2$ disappears altogether, and its kinetic term reduces to $\tfrac{1}{2}e^2 A_\mu A^\mu(\eta + \varphi_1)^2$, which includes the mass term for $A_\mu$ as well as cubic and quartic interaction terms.

As before, the discussion can be generalized to nonabelian theories, although there are additional problems to be discussed later. If we have a local symmetry group $G$ that breaks spontaneously to leave an unbroken subgroup $H$, then the gauge fields associated with $H$ remain massless. Each of the $(\dim G - \dim H)$ complementary fields "eats up" one of the Goldstone bosons, becoming massive in the process. We are left only with other, "radial" components of $\boldsymbol{\phi}$, the massive Higgs fields.

Consider, for example, a local SO(3) model, with scalar fields $\boldsymbol{\phi} = (\phi_a)_{a=1,2,3}$ and gauge potentials $\mathbf{A}_\mu = (A^a_\mu)$. The infinitesimal gauge transformations are

$$\delta\boldsymbol{\phi} = \delta\boldsymbol{\omega}\times\boldsymbol{\phi}, \qquad \delta\mathbf{A}_\mu = \delta\boldsymbol{\omega}\times\mathbf{A}_\mu - \frac{1}{e}\,\partial_\mu\delta\boldsymbol{\omega} \qquad [24]$$

where $\delta\boldsymbol{\omega}$ is the gauge parameter. The Lagrangian is

$$\mathcal{L} = \tfrac{1}{2}D_\mu\boldsymbol{\phi}\cdot D^\mu\boldsymbol{\phi} - \tfrac{1}{4}\mathbf{F}_{\mu\nu}\cdot\mathbf{F}^{\mu\nu} - \tfrac{1}{8}\lambda(\boldsymbol{\phi}^2 - \eta^2)^2 \qquad [25]$$

where the covariant derivative and gauge field are

$$D_\mu\boldsymbol{\phi} = \partial_\mu\boldsymbol{\phi} + e\mathbf{A}_\mu\times\boldsymbol{\phi}, \qquad \mathbf{F}_{\mu\nu} = \partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu + e\mathbf{A}_\mu\times\mathbf{A}_\nu \qquad [26]$$

If we take $\langle\hat{\boldsymbol{\phi}}\rangle$ in the 3-direction, the fields $A^1_\mu$ and $A^2_\mu$ absorb the Goldstone fields $\varphi_1$, $\varphi_2$ to become massive. As in the abelian case, we can use the local SO(3) invariance to rotate $\boldsymbol{\phi}$ everywhere to the 3-direction, and write $\boldsymbol{\phi} = (0, 0, \eta + \varphi_3)$. In this gauge the kinetic term $\tfrac{1}{2}(e\mathbf{A}_\mu\times\boldsymbol{\phi})^2$ gives a mass $e\eta$ to the fields $A^1_\mu$, $A^2_\mu$ while $A^3_\mu$ remains massless, and the Higgs field $\varphi_3$ again has mass $\sqrt{\lambda}\,\eta$.
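
The counting of fields in this section can be checked with some simple bookkeeping. The sketch below is only illustrative: the function name `higgs_counting` and its arguments are assumptions introduced here, not notation from the text. It assumes real scalar components, two helicity states for each massless vector field, and three for each massive one, and checks that the number of physical states is unchanged when the Goldstone modes are absorbed.

```python
def higgs_counting(n_scalars, dim_g, dim_h):
    """Count physical field modes before and after a gauge symmetry
    breaks from G to H.

    Assumptions (illustrative, not from the article): n_scalars counts
    real scalar components; a massless vector field carries 2 helicity
    states and a massive one carries 3.
    """
    broken = dim_g - dim_h                  # generators of G not in H
    if broken < 0 or broken > n_scalars:
        raise ValueError("inconsistent group dimensions or scalar count")
    before = n_scalars + 2 * dim_g          # symmetric phase: all vectors massless
    higgs = n_scalars - broken              # scalars left once Goldstone modes are absorbed
    after = higgs + 3 * broken + 2 * dim_h  # massive vectors gain a longitudinal mode
    return {"broken": broken, "higgs": higgs, "before": before, "after": after}


# U(1) Higgs model: one complex scalar (2 real components), G = U(1), H trivial.
u1 = higgs_counting(n_scalars=2, dim_g=1, dim_h=0)

# Local SO(3) model with a scalar triplet: G = SO(3), H = SO(2).
so3 = higgs_counting(n_scalars=3, dim_g=3, dim_h=1)

print(u1)
print(so3)
```

For the SO(3) model this gives one Higgs field, two massive vector fields, and one massless vector field, for nine states in total both before and after symmetry breaking.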
Elitzur's Theorem; the Role of Gauge Fixing

The concept of spontaneous symmetry breaking in the context of a local symmetry requires further discussion, in particular because of Elitzur's theorem, proved in 1975, which states in essence that "spontaneous breaking of a local symmetry is impossible." In the light of this theorem, it may seem that a "spontaneously broken gauge theory" is an oxymoron. In fact, it means something rather different, although even that is not unproblematic.

The theorem was proved in the context of lattice gauge theory, where the spatial continuum is replaced by a discrete lattice. The scalar field is then represented by values $\boldsymbol{\phi}_x$ at each lattice site, and the gauge potential by values $\mathbf{A}_{x,\mu}$ on the links of the lattice. This is significant because on the lattice one can use a manifestly gauge-invariant formalism. Expectation values of gauge-invariant physical variables can be found, for example, by a Monte Carlo algorithm that effectively averages over all possible gauges. In this context, it is possible to show that the expectation value of any gauge-noninvariant operator (such as $\hat{\boldsymbol{\phi}}_x$) necessarily vanishes identically.

To be more specific, suppose we incorporate a symmetry-breaking term of the form $-\mathbf{j}\cdot\sum_x\boldsymbol{\phi}_x$, and consider the limits $\mathcal{V} \to \infty$ followed by $\mathbf{j} \to 0$. In the global-symmetry case, as we noted earlier, this yields the nonzero result [13]. However, in the case of a local gauge symmetry, one can show rigorously that

$$\lim_{j\to 0}\,\lim_{\mathcal{V}\to\infty} \langle\hat{\boldsymbol{\phi}}_x\rangle_{j\mathbf{n}} = 0 \qquad [27]$$

The essential reason for this is that we can make a gauge transformation in the neighborhood of the point $x$ to make $\boldsymbol{\phi}_x$ have any value we like without changing the energy by more than a very small amount that goes to zero as $\mathbf{j} \to 0$. Within this manifestly gauge-invariant formalism, it is clear that the expectation value of a gauge-noninvariant operator such as $\hat{\boldsymbol{\phi}}$ is not an appropriate order parameter. One must instead look for a gauge-invariant order parameter.

It is important to note, however, that this result applies only in the context of a manifestly gauge-invariant formalism. But, in general, gauge theories cannot be quantized in a manifestly gauge-invariant way. In a path-integral formalism, the action functional, which appears in the exponent, is constant along the orbits of the gauge-group action. Consequently, the integral contains an infinite factor, the volume of the (infinite-dimensional) gauge group. There are corresponding divergences in the perturbation series. As is well known, this problem can be dealt with by introducing a gauge-fixing term, which explicitly breaks the gauge symmetry, and renders Elitzur's theorem inapplicable. But this procedure leaves a global symmetry unbroken, and it is in fact that global symmetry that is broken spontaneously.

One example is the Landau–Ginzburg model of a superconductor, which is essentially just the nonrelativistic limit of the abelian Higgs model, although there is one significant difference: here the field $\hat{\phi}$ annihilates a Cooper pair, a bound pair of electrons with equal and opposite momenta and spins, so $e$ above is replaced by the charge $2e$ of a Cooper pair. The appearance of a condensate of Cooper pairs in the low-temperature superconducting phase corresponds to a state in which $\langle\hat{\phi}\rangle$ is nonzero. This would not be possible without fixing a gauge. In the nonrelativistic context, the obvious gauge to choose is the Coulomb gauge, defined by the condition $\partial_k A_k = 0$. This gauge-fixing condition breaks the local symmetry explicitly, but it leaves unbroken the global symmetry $\phi(x) \to \phi(x)e^{i\alpha}$ with constant $\alpha$. It is that global symmetry that is spontaneously broken when $\langle\hat{\phi}\rangle \neq 0$.

For a model with nonabelian local symmetry the standard procedure used to derive a perturbation expansion is that of Faddeev and Popov. Consider, for example, the SO(3) gauge theory discussed in the preceding section. To fix the gauge, we can choose a set of functions $\mathbf{F} = (F_a)$ of the fields, and introduce into the path integral a gauge-fixing term of the form

$$\mathcal{L}_{\mathrm{gf}} = -\frac{1}{2\xi}\,\mathbf{F}^2 \qquad [28]$$

where $\xi$ is an arbitrary real constant. However, to ensure that this does not bias the integral, so that the gauge-fixed theory is at least formally equivalent to the original gauge-invariant theory, one must also include the determinant of the Jacobian matrix

$$J_{ab}(x, y) = \frac{\delta F_a(x)}{\delta\omega_b(y)} \qquad [29]$$

The easiest way to do this is to introduce Faddeev–Popov ghost fields $\mathbf{C}$, $\bar{\mathbf{C}}$, which are scalar Grassmann variables, and an appropriate term in the Lagrangian

$$\mathcal{L}_{\mathrm{FP}} = \bar{\mathbf{C}}\cdot J\cdot\mathbf{C} \qquad [30]$$

For the SO(3) model, a convenient choice of gauge is the $R_\xi$ gauge defined by

$$\mathbf{F} = \partial_\mu\mathbf{A}^\mu - \xi e\eta\,\mathbf{n}\times\boldsymbol{\phi} \qquad [31]$$
where $\mathbf{n}$ is an arbitrarily chosen unit vector. It is clear that the full Lagrangian $\mathcal{L} + \mathcal{L}_{\mathrm{gf}} + \mathcal{L}_{\mathrm{FP}}$ is no longer invariant under the full SO(3) gauge group, although there is a residual U(1) gauge invariance corresponding to rotations about $\mathbf{n}$. In this gauge, the arbitrary choice of $\mathbf{n}$ means that the global SO(3) symmetry is also broken. However, for other choices, such as the Lorentz gauge $\mathbf{F} = \partial_\mu\mathbf{A}^\mu$ or axial gauge $\mathbf{F} = \mathbf{A}_3$, the Lagrangian is invariant under global SO(3) rotations of all the fields. This global symmetry is then spontaneously broken, with $\hat{\boldsymbol{\phi}}$ acquiring as before a nonzero expectation value of the form $\langle\hat{\boldsymbol{\phi}}(x)\rangle = \eta\,\mathbf{n}$.

It is interesting to look again at the particle content of this model. By setting $\boldsymbol{\phi}(x) = \eta\,\mathbf{n} + \boldsymbol{\varphi}(x)$ with $\mathbf{n} = (0, 0, 1)$, one finds that in the quadratic part of the Lagrangian, the cross-terms between $\mathbf{A}_\mu$ and $\boldsymbol{\varphi}$ combine to form a total divergence which can be dropped. As before, $\varphi_3$ is the Higgs field, with $m^2 = \lambda\eta^2$, $A^3_\mu$ is the massless gauge field corresponding to the unbroken gauge symmetry, and the three transverse components of $A^1_\mu$ and $A^2_\mu$ represent the massive vector fields, with $m^2 = e^2\eta^2$. There are, however, also unphysical fields with $\xi$-dependent masses: $\varphi_{1,2}$, $C_{1,2}$, $\bar{C}_{1,2}$, and the longitudinal components $\partial_\mu A^\mu_{1,2}$ all have $m^2 = \xi e^2\eta^2$. We can now compute the effective potential $V_{\mathrm{eff}}(T, \boldsymbol{\varphi})$. One point that should be noted in performing this calculation is that the ghost fields $\mathbf{C}$, $\bar{\mathbf{C}}$ contribute negatively. Obviously, $V_{\mathrm{eff}}$, being $\xi$-dependent, is not itself physically meaningful. Nevertheless, it can be shown that the stationary points of $V_{\mathrm{eff}}$ are physical, and correspond to the possible equilibrium states of the theory. Moreover, the extremal values of $V_{\mathrm{eff}}$ are independent of $\xi$ and give correctly the thermodynamic potential in the corresponding equilibrium states. The negative contributions from the ghost fields to $N_*$ and $M_*^2$ ensure that the $\xi$ dependence cancels out, and we find as expected $N_* = 9$ and $M_*^2 = (\lambda + 6e^2)\eta^2$.

Phase Transitions and Crossovers

Our discussion so far has for the most part been restricted to a semiclassical or mean-field approximation. It is important to bear in mind, however, that this approximation does not suffice to determine whether a phase transition (where the thermodynamic free energy is nonanalytic) exists, or what its nature is. Determining the detailed characteristics of phase transitions requires other methods, such as the renormalization group or lattice simulations. In many cases, it is far from trivial to establish the order of the transition, or even whether a true phase transition actually exists.

Gauge theories pose particular problems because of the infrared divergences in the thermal field theory at high temperature, and because in asymptotically free nonabelian theories the coupling becomes large at very low energy. Even when they appear to exhibit spontaneous symmetry breaking, they do not necessarily undergo a true phase transition. Lattice gauge theory calculations have led to the conclusion that in nonabelian gauge theories with the Higgs field in the fundamental representation, there are values of the coupling constants for which there is no phase transition, only a rapid but smooth crossover from one type of behavior to another, so that the high- and low-temperature phases are analytically connected. If the coupling constant is small, there is a first-order phase transition, and for moderate values the theory exhibits a very rapid crossover that looks quite similar to a symmetry-breaking phase transition. Nevertheless, the analytic connection between the two phases implies that there cannot exist an order parameter that is strictly zero above the transition and nonzero below it.

In particular, it appears that for physical values of the Higgs mass, the electroweak theory does not in fact undergo a true phase transition. It is somewhat ironic that the most famous example of a spontaneously broken gauge theory probably does not, strictly speaking, exhibit a symmetry-breaking phase transition!

Conclusions

We have discussed the main features of spontaneous symmetry breaking in both the global- and local-symmetry cases, especially the appearance of Goldstone bosons when a continuous global symmetry breaks, and their elimination in the local-symmetry case by the Higgs mechanism, as well as the problems attaching to the concept of spontaneous symmetry breaking in gauge theories.

See also: Abelian Higgs Vortices; Effective Field Theories; Electroweak Theory; Finite Group Symmetry Breaking; Lattice Gauge Theory; Noncommutative Geometry and the Standard Model; Phase Transitions in Continuous Systems; Quantum Central Limit Theorems; Quantum Spin Systems; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Topological Defects and their Homotopy Classification.

Further Reading

Anderson PW (1963) Plasmons, gauge invariance and mass. Physical Review 130: 439–442.
Coleman S (1985) Aspects of Symmetry, ch. 5. Cambridge: Cambridge University Press.
Elitzur S (1975) Impossibility of spontaneously breaking local symmetries. Physical Review D 12: 3978–3982.
Englert F and Brout R (1964) Broken symmetry and the mass of gauge vector bosons. Physical Review Letters 13: 321–323.
Fradkin E and Shenker SH (1979) Phase diagrams of lattice gauge theories with Higgs fields. Physical Review D 19: 3682–3697.
Goldstone J, Salam A, and Weinberg S (1962) Broken symmetries. Physical Review 127: 965–970.
Guralnik GS, Hagen CR, and Kibble TWB (1964) Global conservation laws and massless particles. Physical Review Letters 13: 585–587.
Higgs PW (1964) Broken symmetries, massless particles and gauge fields. Physics Letters 12: 132–133.
Kibble TWB (1967) Symmetry-breaking in non-abelian gauge theories. Physical Review 155: 1554–1561.
Weinberg S (1996) The Quantum Theory of Fields, Vol. II. Modern Applications, chs. 19 and 21. Cambridge: Cambridge University Press.

Symmetry Classes in Random Matrix Theory


M R Zirnbauer, Universität Köln, Köln, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A classification of random matrix ensembles by symmetries was first established by Dyson, in an influential 1962 paper with the title "the threefold way: algebraic structure of symmetry groups and ensembles in quantum mechanics." Dyson's threefold way has since become fundamental to various areas of theoretical physics, including the statistical theory of complex many-body systems, mesoscopic physics, disordered electron systems, and the field of quantum chaos.

Over the last decade, a number of random matrix ensembles beyond Dyson's classification have come to the fore in physics and mathematics. On the physics side, these emerged from work on the low-energy Dirac spectrum of quantum chromodynamics (QCD) and from the mesoscopic physics of low-energy quasiparticles in disordered superconductors. In the mathematical research area of number theory, the study of statistical correlations in the values of Riemann zeta and similar functions has prompted some of the same generalizations.

In this article, Dyson's fundamental result will be reviewed from a modern perspective, and the recent extension of Dyson's threefold way will be motivated and described. In particular, it will be explained why symmetry classes are associated with large families of symmetric spaces.

The Framework

Random matrices have their physical origin in the quantum world, more precisely in the statistical theory of strongly interacting many-body systems such as atomic nuclei. Although random matrix theory is nowadays understood to be of relevance to numerous areas of physics (see Random Matrix Theory in Physics), quantum mechanics is still where many of its applications lie. Quantum mechanics also provides a natural framework in which to classify random matrix ensembles.

Following Dyson, the mathematical setting for classification consists of two pieces of data:

• A finite-dimensional complex vector space $V$ with a Hermitian scalar product $\langle\cdot,\cdot\rangle$, called a "unitary structure" for short. (In physics applications, $V$ will usually be the truncated Hilbert space of a family of quantum Hamiltonian systems.)
• On $V$ there acts a group $G$ of unitary and antiunitary operators (the joint symmetry group of the multiparameter family of quantum systems).

Given this setup, one is interested in the linear space of self-adjoint operators on $V$ (the Hamiltonians $H$) with the property that they commute with the $G$-action. Such a space is reducible in general, that is, the matrix of $H$ decomposes into blocks. The goal of classification is to list all of the irreducible blocks that occur.

Symmetry Groups

Basic to classification is the notion of a symmetry group in quantum Hamiltonian systems, a notion that will now be explained.

In classical mechanics, the symmetry group $G_0$ of a Hamiltonian system is understood to be the group of canonical transformations that commute with the phase flow of the system. An important example is the rotation group for systems in a central field.

In passing from classical to quantum mechanics, one replaces the classical phase space by a quantum-mechanical Hilbert space $V$ and assigns to the symmetry group $G_0$ a (projective) representation by unitary $\mathbb{C}$-linear operators on $V$. Besides the one-parameter continuous subgroups, whose significance is highlighted by Noether's theorem, the components of $G_0$ not connected with the identity play an
important role. A prominent example is provided by the operator for space reflection. Its eigenspaces are the subspaces of states with positive and negative parity; these reduce the matrix of any reflection-invariant Hamiltonian to two blocks.

Not all symmetries of a quantum-mechanical system are of the canonical, unitary kind: the prime counterexample is the operation of inverting the time direction, called time reversal for short. In classical mechanics, this operation reverses the sign of the symplectic structure of phase space; in quantum mechanics, its algebraic properties reflect the fact that inverting the time direction, $t \mapsto -t$, amounts to sending $i = \sqrt{-1}$ to $-i$. Indeed, time $t$ enters in the Dirac, Pauli, or Schrödinger equation as $i\hbar\,d/dt$. Therefore, time reversal is represented in the quantum theory by an antiunitary operator $T$, which is to say that $T$ is complex antilinear:

$$T(zv) = \bar{z}\,Tv \qquad (z \in \mathbb{C},\ v \in V)$$

and preserves the Hermitian scalar product or unitary structure up to complex conjugation:

$$\langle Tv_1, Tv_2\rangle = \overline{\langle v_1, v_2\rangle} = \langle v_2, v_1\rangle$$

Another operation of this kind is charge conjugation in relativistic theories such as the Dirac equation.

By the symmetry group $G$ of a quantum-mechanical system with Hamiltonian $H$, one then means the group of all unitary and antiunitary transformations $g$ of $V$ that leave the Hamiltonian invariant: $gHg^{-1} = H$. We denote the unitary subgroup of $G$ by $G_0$, and the set of antiunitary operators in $G$ by $G_1$ (not a group). If $V$ carries extra structure, as will be the case for some extensions of Dyson's basic scheme, the action of $G$ on $V$ has to be compatible with that structure.

The set $G_1$ may be empty. When it is not, the composition of any two elements of $G_1$ is unitary, so every $g \in G_1$ can be obtained from a fixed element of $G_1$, say $T$, by right multiplication with some $U \in G_0$: $g = TU$. In other words, when $G_1$ is nonempty the coset space $G/G_0$ consists of exactly two elements, $G_0$ and $T\cdot G_0 = G_1$. We shall assume that $T$ represents some inversion symmetry such as time reversal or charge conjugation. $T$ must then be a (projective) involution, that is, $T^2 = z \times \mathrm{Id}$ with $z$ a complex number of unit modulus, so that conjugation by $T^2$ is the identity operation. Since $T$ is complex antilinear, the associative law $T^2\cdot T = T\cdot T^2$ forces $z$ to be real, and hence $T^2 = \pm\mathrm{Id}$.

Finding the total symmetry group of a Hamiltonian system need not always be straightforward, but this complication will not be an issue here: we take the symmetry group $G$ and its action on the Hilbert space $V$ as fundamental and given, and then ask what are the corresponding symmetry classes, meaning the irreducible spaces of Hamiltonians on $V$ that commute with $G$.

For technical reasons, we assume the group $G_0$ to be compact; this is an assumption that covers most (if not all) of the cases of interest in physics. The noncompact group of space translations can be incorporated, if necessary, by wrapping the system around a torus, whereby translations are turned into compact torus rotations.

While the primary objects to classify are the spaces of Hamiltonians $H$, we shall focus for convenience on the spaces of time evolutions $U_t = e^{-itH/\hbar}$ instead. This change of focus results in no loss, as the Hamiltonians can always be retrieved by linearizing in $t$ at $t = 0$.

Symmetric Spaces

We appropriate a few basic facts from the theory of symmetric spaces.

Let $M$ be a connected $m$-dimensional Riemannian manifold and $p$ a point of $M$. In some open subset $N_p$ of a neighborhood of $p$ there exists a map $s_p : N_p \to N_p$, the geodesic inversion with respect to $p$, which sends a point $x \in N_p$ with normal coordinates $(x_1, \ldots, x_m)$ to the point with normal coordinates $(-x_1, \ldots, -x_m)$. The Riemannian manifold $M$ is called locally symmetric if the geodesic inversion is an isometry, and is called globally symmetric if $s_p$ extends to an isometry $s_p : M \to M$, for all $p \in M$. A globally symmetric Riemannian manifold is called a symmetric space for short.

The Riemann curvature tensor of a symmetric space is covariantly constant, which leads one to distinguish between three cases: the scalar curvature can be positive, zero, or negative, and the symmetric space is said to be of compact type, Euclidean type, or noncompact type, respectively. (In mesoscopic physics, each type plays a role: the first provides us with the scattering matrices and time evolutions, the second with the Hamiltonians, and the third with the transfer matrices.) The focus in the current article will be on compact type, as it is this type that houses the unitary time evolution operators of quantum mechanics. The compact symmetric spaces are subdivided into two major subtypes, both of which occur naturally in the present context, as follows.

Type II

Consider first the case where the antiunitary component $G_1$ of the symmetry group is empty, so the data are $(V, G)$ with $G = G_0$. Let $\mathcal{U}(V)$ denote the group of all complex linear transformations that
leave the structure of the vector space $V$ invariant. Thus, $\mathrm{U}(V)$ is a group of unitary transformations if $V$ carries no more than the usual Hermitian scalar product; and is some subgroup of the unitary group if $V$ does have extra structure (as is the case for the Nambu space of quasiparticle excitations in a superconductor). The symmetry group $G_0$, by acting on $V$ and preserving its structure, is contained as a subgroup in $\mathrm{U}(V)$.

Let now $H$ be any Hamiltonian with the prescribed symmetries. Then the time evolution $t \mapsto U_t = e^{-itH/\hbar}$ generated by $H$ is a one-parameter subgroup of $\mathrm{U}(V)$ which commutes with the $G_0$-action. The total set of transformations $U_t$ that arise in this way is called the (connected part of the) "centralizer" of $G_0$ in $\mathrm{U}(V)$, and is denoted by $Z$. This is the "good" set of unitary time evolutions – the set compatible with the given symmetries of an ensemble of quantum systems.

The centralizer $Z$ is obviously a group: if $U$ and $U'$ belong to $Z$, then so do their inverses and their product. What can one say about the structure of the group $Z$?

Since $G_0$ is compact by assumption, its group action on $V$ is completely reducible and $V$ is guaranteed to have an orthogonal decomposition

$$V = \bigoplus_\lambda V_\lambda$$

where the sum runs over isomorphism classes of irreducible $G_0$-representations $\lambda$, and the vector spaces $V_\lambda$ are called the $G_0$-isotypic components of $V$. For example, if $G_0$ is the rotation group $\mathrm{SO}_3$, the $G_0$-isotypic component $V_\lambda$ of $V$ is the subspace spanned by all the states with total angular momentum $\lambda$.

Consider now any $U \in Z$. Since $U$ commutes with the $G_0$-action, it does not connect different $G_0$-isotypic components. (Indeed, in the example of $\mathrm{SO}_3$-invariant dynamics, angular momentum is conserved and transitions between different angular momentum sectors are forbidden.) Thus, every $G_0$-isotypic component $V_\lambda$ is an invariant subspace for the action of $Z$ on $V$, and $Z$ decomposes as $Z = \prod_\lambda Z_\lambda$ with blocks $Z_\lambda = Z|_{V_\lambda}$.

To say more, fix a standard irreducible $G_0$-module $R_\lambda$ of isomorphism class $\lambda$ and consider

$$L_\lambda := \mathrm{Hom}_{G_0}(R_\lambda, V_\lambda)$$

the linear space of $\mathbb{C}$-linear maps $l : R_\lambda \to V_\lambda$ that intertwine the $G_0$-actions on $R_\lambda$ and $V_\lambda$. An element of $L_\lambda$ is called a $G_0$-equivariant homomorphism. By Schur's lemma, $L_\lambda \cong \mathbb{C}$ if $V_\lambda$ is $G_0$-irreducible. More generally, $\dim L_\lambda =: m_\lambda$ counts the multiplicity of occurrence of $R_\lambda$ in $V_\lambda$; for example, in the case of $G_0 = \mathrm{SO}_3$ we take $R_\lambda$ to be the standard irreducible module of dimension $2\lambda + 1$; and $m_\lambda$ then is the number of times a multiplet of states with total angular momentum $\lambda$ occurs in $V_\lambda$.

The natural mapping $L_\lambda \otimes R_\lambda \to V_\lambda$ by $l \otimes r \mapsto l(r)$ is an isomorphism,

$$V_\lambda \cong L_\lambda \otimes R_\lambda$$

and using it we can transfer the entire discussion from $V_\lambda$ to $L_\lambda \otimes R_\lambda$. The group $G_0$ acts trivially on $L_\lambda \cong \mathbb{C}^{m_\lambda}$ and irreducibly on $R_\lambda$. Therefore, the component $Z_\lambda$ of the centralizer $Z$ is the unitary group

$$Z_\lambda \cong \mathrm{U}(L_\lambda) \cong \mathrm{U}_{m_\lambda}$$

if $V$ is a unitary vector space with no extra structure. In the presence of extra structure (which, by compatibility with the $G_0$-action, restricts to every subspace $V_\lambda$), the factor $Z_\lambda$ is some subgroup of $\mathrm{U}_{m_\lambda}$. In all cases, $Z$ is a direct product of connected compact Lie groups $Z_\lambda$.

To make the connection with symmetric spaces, write $M := Z_\lambda$. Since $M$ is a group, the operation of taking the inverse, $U \mapsto U^{-1}$, makes sense for all $U \in M$. Moreover, being a compact Lie group, the manifold $M$ admits a left- and right-invariant Riemannian structure in which the inversion $U \mapsto U^{-1}$ is an isometry. By translation, one gets an isometry $s_{U_1} : U \mapsto U_1 U^{-1} U_1$ for every $U_1 \in M$. All of these maps $s_{U_1}$ are globally defined, and the restriction of $s_{U_1}$ to some neighborhood of $U_1$ coincides with the geodesic inversion with respect to $U_1$. Thus, $M$ is a symmetric space by the definition given above. Symmetric spaces of this kind are called type II.

Type I

Consider next the case of $G_1 \neq \emptyset$, where some antiunitary symmetry $T$ is present. As before, let $Z$ be the connected component of the centralizer of $G_0$ in $\mathrm{U}(V)$. Conjugation by $T$,

$$U \mapsto \tau(U) := T U T^{-1}$$

is an automorphism of $\mathrm{U}(V)$ and, owing to $T^2 = \pm\mathrm{Id}$, $\tau$ is involutive. Because $G_0 \subset G$ is a normal subgroup, $\tau$ restricts to an involutive automorphism (still denoted by $\tau$) of $Z$. Now recall that $T$ is complex antilinear and the good Hamiltonians are subject to $T H T^{-1} = H$. The good time evolutions $U_t = e^{-itH/\hbar}$ clearly satisfy $\tau(U_t) = U_{-t} = U_t^{-1}$. Thus, the good set to consider is $M := \{U \in Z \mid U = \tau(U)^{-1}\}$. The set $M$ is a manifold, but in general is not a Lie group.

Further details depend on what $\tau$ does with the factorization $Z = \prod_\lambda Z_\lambda$. If $V_\lambda$ is a $G_0$-isotypic
component of $V$, then so is $TV_\lambda$, since $T$ normalizes $G_0$. Thus, either $V_\lambda \cap TV_\lambda = 0$, or $TV_\lambda = V_\lambda$. In the former case, the involutive automorphism $\tau$ just relates $U \in Z_\lambda$ with $\tau(U) \in Z_{TV_\lambda}$, whence no intrinsic constraint on $Z_\lambda$ results, and the time evolutions $(U, \tau(U)^{-1}) \in Z_\lambda \times Z_{TV_\lambda}$ constitute a type-II symmetric space, as before.

A novel situation occurs when $TV_\lambda = V_\lambda$, in which case $\tau$ restricts to an automorphism of $Z_\lambda$. Let therefore $TV_\lambda = V_\lambda$, put $K \equiv Z_\lambda$ for short, and consider

$$M := \{U \in K \mid U = \tau(U)^{-1}\}$$

Note that if two elements $p$, $p_0$ of $K$ are in $M$, then so is the product $p_0 p^{-1} p_0$. The group $K$ acts on $M \subset K$ by

$$k \cdot U = k\, U\, \tau(k)^{-1} \qquad (k \in K)$$

and this group action is transitive, that is, every $U \in M$ can be written as $U = k\,\tau(k)^{-1}$ with some $k \in K$. (Finding $k$ for a given $U$ is like taking a square root, which is possible since $\exp : \mathrm{Lie}\,K \to K$ is surjective.) There exists such a $K$-invariant Riemannian structure for $M$ that for all $p_0 \in M$ the mapping $s_{p_0} : M \to M$ defined by

$$s_{p_0}(p) = p_0\, p^{-1} p_0$$

is the geodesic inversion with respect to $p_0 \in M$. Thus, in this natural geometry $M$ is a globally symmetric Riemannian manifold and hence a symmetric space. The present kind of symmetric space is called type I. If $K_\tau$ is the set of fixed points of $\tau$ in $K$, the symmetric space $M$ is analytically diffeomorphic to the coset space $K/K_\tau$ by

$$K/K_\tau \to M \subset K, \qquad U K_\tau \mapsto U\,\tau(U)^{-1}$$

which we call the "Cartan embedding" of $K/K_\tau$ into $K$.

In summary, the solution to the problem of finding the set of unitary time evolution operators that are compatible with a given symmetry group $G$ and structure of Hilbert space $V$ is always a symmetric space. This is a valuable insight, as symmetric spaces are rigid objects and have been completely classified by Cartan.

If the dimension of $V$ is kept variable, the irreducible symmetric spaces that occur belong to one of the large families listed in Table 1.

Table 1 The large families of symmetric spaces. The form of H in the header applies to the last seven families

Family | Symmetric space | Form of $H = \begin{pmatrix} W & Z \\ Z^\dagger & -W^{T} \end{pmatrix}$
A | $\mathrm{U}_N$ | Complex Hermitian
AI | $\mathrm{U}_N/\mathrm{O}_N$ | Real symmetric
AII | $\mathrm{U}_{2N}/\mathrm{USp}_{2N}$ | Quaternion self-adjoint
C | $\mathrm{USp}_{2N}$ | $Z$ complex symmetric, $W = W^\dagger$
CI | $\mathrm{USp}_{2N}/\mathrm{U}_N$ | $Z$ complex symmetric, $W = 0$
D | $\mathrm{SO}_{2N}$ | $Z$ complex skew, $W = W^\dagger$
DIII | $\mathrm{SO}_{2N}/\mathrm{U}_N$ | $Z$ complex skew, $W = 0$
AIII | $\mathrm{U}_{p+q}/\mathrm{U}_p \times \mathrm{U}_q$ | $Z$ complex $p \times q$, $W = 0$
BDI | $\mathrm{SO}_{p+q}/\mathrm{SO}_p \times \mathrm{SO}_q$ | $Z$ real $p \times q$, $W = 0$
CII | $\mathrm{USp}_{2p+2q}/\mathrm{USp}_{2p} \times \mathrm{USp}_{2q}$ | $Z$ quaternion $2p \times 2q$, $W = 0$

Dyson's Threefold Way

Recall the goal: given a Hilbert space $V$ and a symmetry group $G$ acting on it, one wants to classify the (irreducible) spaces of time evolution operators $U$ that are "compatible" with $G$, meaning

$$U = g_0\, U g_0^{-1} = g_1\, U^{-1} g_1^{-1} \qquad (\text{for all } g_\sigma \in G_\sigma)$$

As we have seen, the spaces that arise in this way are symmetric spaces of type I or II depending on the nature of the time reversal (or other antiunitary symmetry) $T$.

An even stronger statement can be made when more information about the Hilbert space $V$ is specified. In Dyson's classification, the Hermitian scalar product of $V$ is assumed to be the only invariant structure that exists on $V$. With that assumption, only three large families of symmetric spaces arise; these correspond to what we call the "Wigner–Dyson symmetry classes."

Class A

Recall that in Dyson's case, the connected part of the centralizer of $G_0$ in $\mathrm{U}(V)$ is a direct product of unitary groups, each factor being associated with one $G_0$-isotypic component $V_\lambda$ of $V$. The type-II situation occurs when the set $G_1$ of antiunitary symmetries is either empty or else exchanges different $V_\lambda$. In both cases, the set of good time evolution operators restricted to one $G_0$-isotypic component $V_\lambda$ is a unitary group $\mathrm{U}_{m_\lambda}$, with $m_\lambda$ being the multiplicity of the irreducible $G_0$-representation $\lambda$ in $V_\lambda$.

The unitary groups $\mathrm{U}_{N = m_\lambda}$ or, to be precise, their simple parts $\mathrm{SU}_N$, are called type-II symmetric spaces of the A family or A series – hence the name class A. The Hamiltonians $H$, the generators of time evolutions $U_t = e^{-itH/\hbar}$, in this class are represented by
complex Hermitian $N \times N$ matrices. By putting a $\mathrm{U}_N$-invariant Gaussian probability measure

$$\exp\!\left(-\mathrm{tr}\, H^2 / 2\sigma^2\right) dH \qquad (\sigma \in \mathbb{R})$$

on that space, one gets what is called the GUE – the Gaussian unitary ensemble – which defines the Wigner–Dyson universality class of unitary symmetry.

Classes AI and AII

Consider next the case $G_1 \neq \emptyset$, with antiunitary generator $T$. Let $V_\lambda = TV_\lambda$ be any $G_0$-isotypic component of $V$ invariant under $T$ (the type-I situation). The mapping $U \mapsto T U T^{-1} = \tau(U)$ then is an automorphism of the groups $\mathrm{U}(V_\lambda)$, $G_0$ and $K = Z_\lambda \cong \mathrm{U}_{m_\lambda}$. If $K_\tau$ is the subgroup of fixed points of $\tau$ in $K$, the space of good time evolutions can be identified with the symmetric space $K/K_\tau$ by the Cartan embedding. Our task is to determine $K_\tau$.

To simplify the notation let us write $V_\lambda \equiv V$, $R_\lambda \equiv R$, and $L_\lambda \equiv L$. We now ask what happens with $T : V \to V$ in the process of transfer to $L \otimes R \cong V$. The answer, so we claim, is that $T$ transfers to a pure tensor made from antiunitary maps $\alpha : L \to L$ and $\beta : R \to R$,

$$T = \alpha \otimes \beta$$

To prove this claim, let $C$ be the antilinear map from $V$ to the dual vector space $V^*$ by $v \mapsto \langle v, \cdot \rangle$. Because the elements of $G_0$ are represented by unitaries, the $\mathbb{C}$-linear operator $CT : V \to V^*$ intertwines $G_0$-actions:

$$CT\, a(g) = (g^{-1})^{\mathrm{t}}\, CT \qquad (g \in G_0)$$

where $a$ is the automorphism $a(g) = T^{-1} g T$. From the irreducibility of $R$ it follows that the space of intertwiners $R \to R^*$ is one dimensional here (Schur's lemma). Therefore, $CT : L \otimes R \to L^* \otimes R^*$ must be a pure tensor (as opposed to a sum of such tensors), and since $C$ is clearly a pure tensor, so is $T$. This completes the proof.

By the involutive property $T^2 = \epsilon_T\, \mathrm{Id}_V$ ($\epsilon_T = \pm 1$), the two antiunitary factors of $T = \alpha \otimes \beta$ cannot but square to $\alpha^2 = \epsilon_\alpha\, \mathrm{Id}_L$ and $\beta^2 = \epsilon_\beta\, \mathrm{Id}_R$ where $\epsilon_\alpha, \epsilon_\beta = \pm 1$ are related by $\epsilon_\alpha \epsilon_\beta = \epsilon_T$. The factor $\alpha$ determines a nondegenerate complex bilinear form $Q : L \times L \to \mathbb{C}$ by

$$Q(l_1, l_2) = \langle \alpha l_1, l_2 \rangle_L \qquad (l_1, l_2 \in L)$$

Since $\alpha$ is antiunitary one has the exchange symmetry

$$Q(l_1, l_2) = \overline{\langle \alpha^2 l_1, \alpha l_2 \rangle}_L = \epsilon_\alpha\, Q(l_2, l_1)$$

Thus, the complex bilinear form (or pairing) $Q$ is symmetric for $\epsilon_\alpha = +1$ and alternating for $\epsilon_\alpha = -1$.

Knowing the sign of $\epsilon_\alpha = \pm 1$ we know the group $K_\tau$. Indeed, an element $k \in K_\tau$ commutes with $T$ and after transfer from $V$ to $L$ still commutes with $\alpha$. But since $K_\tau$ is a subgroup of $K = \mathrm{U}_{m_\lambda}$, this means that $k \in K_\tau$ preserves $Q$. In the case of $\epsilon_\alpha = +1$, what is preserved is a symmetric pairing, and therefore $K_\tau \cong \mathrm{O}_{m_\lambda}$. For $\epsilon_\alpha = -1$, the multiplicity $m_\lambda$ must be even and $K_\tau$ preserves an alternating pairing (or symplectic structure); in that case $K_\tau \cong \mathrm{USp}_{m_\lambda}$, the unitary symplectic group.

Thus, there is a dichotomy for the sets of good time evolutions $M \cong K/K_\tau$:

$$\text{Class AI}: \quad K/K_\tau \cong \mathrm{U}_N/\mathrm{O}_N \qquad (N = m_\lambda)$$
$$\text{Class AII}: \quad K/K_\tau \cong \mathrm{U}_{2N}/\mathrm{USp}_{2N} \qquad (2N = m_\lambda)$$

Again we are referring to symmetric spaces by the names they – or rather their simple parts $\mathrm{SU}_N/\mathrm{SO}_N$ and $\mathrm{SU}_{2N}/\mathrm{USp}_{2N}$ – have in the Cartan classification.

In general, there is no immediate means of predicting the parity $\epsilon_\alpha$, and one has no choice but to go through the steps of constructing $\alpha$. If $\beta : R \to R$ happens to be $G_0$-invariant, however, the situation simplifies. In that case $\beta$ determines a $G_0$-invariant pairing $R \times R \to \mathbb{C}$ (in the same way as $\alpha$ determines $Q : L \times L \to \mathbb{C}$ above). On general grounds, an irreducible $G_0$-representation space admits at most one such pairing. If that pairing is symmetric, then, as we have seen, $\epsilon_\beta = 1$; if it is alternating, then $\epsilon_\beta = -1$. The parity $\epsilon_\alpha$ is given by $\epsilon_\alpha \epsilon_\beta = \epsilon_T$.

Example. Consider any physical system with spin-rotation symmetry ($G_0 = \mathrm{SU}_2$) and time-reversal symmetry. The physical operation of time reversal, $T$, commutes with spin rotations and, hence, here is a case where the factor $\beta$ in $T = \alpha \otimes \beta$ is $G_0$-invariant. On fundamental physics grounds one has $T^2 = (-1)^{2S}$ on states with spin $S$. The spin-$S$ representation of $\mathrm{SU}_2$ is known to carry an invariant pairing which is symmetric or skew depending on whether the integer $2S$ is even or odd. Therefore, $\epsilon_T = \epsilon_\beta$ and $\epsilon_\alpha = +1$ in all cases.

Thus, $T$-invariant systems with no symmetries other than energy and spin invariably are class AI. By breaking spin-rotation symmetry ($G_0 = \{\mathrm{Id}\}$, $\epsilon_\beta = +1$) while maintaining $T$-symmetry for states with half-integer spin (say single electrons, which carry spin $S = 1/2$), one gets $\epsilon_\alpha = -1$, thereby realizing class AII.

The Hamiltonians. By passing to the tangent space of $K/K_\tau$ at unity one obtains Hermitian matrices with entries that are real numbers (class AI) or real quaternions (class AII). When $K_\tau$-invariant Gaussian
probability measures (called GOE resp., GSE) are put on these spaces, one gets the Wigner–Dyson universality classes of orthogonal resp., symplectic symmetry. In mesoscopic physics, these are realized in disordered metals with time-reversal invariance (absence of magnetic fields and magnetic impurities). Spin-rotation symmetry is broken by strong spin–orbit scatterers such as gold impurities.

Warning

The word "symmetry class" is not synonymous with "universality class." Indeed, inside a symmetry class many different types of physical behavior are possible. For example, random matrix models for disordered metallic grains with time-reversal symmetry belong to the symmetry class of the example above (class AI), and so do Anderson tight-binding models with real hopping. The former are known to exhibit energy level statistics of universal GOE type, whereas the latter have localized eigenfunctions and hence level statistics which is expected to approach the Poisson limit when the system size goes to infinity.

Disordered Superconductors

When Dirac first wrote down his famous equation in 1928, he assumed that he was writing an equation for the wave function of the electron. Later, because of the instability caused by negative-energy solutions, the Dirac equation was reinterpreted (via second quantization) as an equation for the fermionic field operators of a quantum field theory. A similar change of viewpoint is carried out in reverse in the Hartree–Fock–Bogoliubov mean-field description of quasiparticle excitations in superconductors. There, one starts from the equations of motion for linear superpositions of the electron creation and annihilation operators, and reinterprets them as a unitary quantum dynamics for what might be called the quasiparticle "wave function."

In both cases – the Dirac equation and the quasiparticle dynamics of a superconductor – there enters a structure not present in the standard quantum mechanics underlying Dyson's classification: the field operators for fermionic particles are subject to a set of relations called the "canonical anticommutation relations," and these are preserved by the quantum dynamics. Therefore, whenever second quantization is undone (assuming it can be undone) to return from field operators to wave functions, the wave-function dynamics is required to preserve some extra structure. This puts a linear constraint on the good Hamiltonians $H$. For our purposes, the best viewpoint to take is to attribute the extra invariant structure to the Hilbert space $V$, thereby turning it into a Nambu space.

Nambu Space

Adopting the standard physics conventions of second quantization, consider some set of single-particle creation and annihilation operators $c_i^\dagger$ and $c_i$, where $i = 1, 2, \ldots$ labels an orthonormal system of single-particle states. Such operators are subject to the canonical anticommutation relations (CARs)

$$c_i^\dagger c_j + c_j c_i^\dagger = \delta_{ij}, \qquad c_i^\dagger c_j^\dagger + c_j^\dagger c_i^\dagger = 0 = c_i c_j + c_j c_i \qquad [1]$$

When written in terms of $c_j + c_j^\dagger$ and $i(c_j - c_j^\dagger)$, these become the standard defining relations of a Clifford algebra over $\mathbb{R}$. Field operators are linear combinations $\psi = \sum_i (u_i c_i + f_i c_i^\dagger)$ with complex coefficients $u_i$ and $f_i$.

Now take $H$ to be some Hamiltonian which is quadratic in the creation and annihilation operators:

$$H = \sum_{i,j} W_{ij}\, c_i^\dagger c_j + \frac{1}{2} \sum_{i,j} \left( Z_{ij}\, c_i^\dagger c_j^\dagger + \bar{Z}_{ij}\, c_j c_i \right)$$

and let $H$ act on field operators $\psi$ by the commutator: $H \cdot \psi \equiv [H, \psi]$. The time evolution of $\psi$ is then determined by the Heisenberg equation of motion

$$i\hbar\, \frac{d\psi}{dt} = H \cdot \psi \qquad [2]$$

which integrates to $\psi(t) = e^{-itH/\hbar} \cdot \psi(0)$, and is easily verified to preserve the CARs [1].

The dynamical equation [2] is equivalent to a system of linear differential equations for the amplitudes $u_i$ and $f_i$. If these are assembled into vectors, and the $W_{ij}$ and $Z_{ij}$ into matrices, eqn [2] becomes

$$i\hbar\, \frac{d}{dt} \begin{pmatrix} f \\ u \end{pmatrix} = \begin{pmatrix} W & Z \\ Z^\dagger & -W^{T} \end{pmatrix} \begin{pmatrix} f \\ u \end{pmatrix}$$

The Hamiltonian matrix on the right-hand side has some special properties due to $Z_{ij} = -Z_{ji}$ (from $c_i c_j = -c_j c_i$) and $W_{ij} = \bar{W}_{ji}$ (from $H$ being self-adjoint as an operator in Fock space). To keep track of these properties while imposing some unitary and antiunitary symmetries, it is best to put everything in invariant form.

So, let $U$ be the unitary vector space of annihilation operators $u = \sum_i u_i c_i$, and view the creation operators $f = \sum_i f_i c_i^\dagger$ as lying in the dual vector space $U^*$. The field operators $\psi = u + f$ then are elements of the direct sum $U \oplus U^* =: V$, called "Nambu
space." On $V$ there exists a canonical unitary structure expressed by

$$\langle \psi, \tilde\psi \rangle = \sum_i \left( \bar{u}_i \tilde{u}_i + \bar{f}_i \tilde{f}_i \right)$$

A second canonical structure on $V = U \oplus U^*$ is given by the symmetric complex bilinear form

$$\{\psi, \tilde\psi\} = \sum_i \left( \tilde{f}_i u_i + f_i \tilde{u}_i \right) = \tilde{f}(u) + f(\tilde{u})$$

where the last expression uses the meaning of $f$ as a linear function $f : U \to \mathbb{C}$. Note that $\{\psi, \tilde\psi\}$ agrees with the anticommutator of the field operators, $\psi\tilde\psi + \tilde\psi\psi$.

Now recall that the quantum dynamics is determined by a Hamiltonian $H$ that acts on $\psi$ by the commutator $H \cdot \psi = [H, \psi]$. The one-parameter groups $t \mapsto e^{-itH/\hbar}$ generated by this action (the time evolutions) preserve the symmetric pairing:

$$\{\psi, \tilde\psi\} = \{e^{-itH/\hbar}\psi,\; e^{-itH/\hbar}\tilde\psi\}$$

since the anticommutation relations [1] do not change with time. They also preserve the unitary structure,

$$\langle \psi, \tilde\psi \rangle = \langle e^{-itH/\hbar}\psi,\; e^{-itH/\hbar}\tilde\psi \rangle$$

because probability in Nambu space is conserved. (Physically speaking, this holds true as long as $H$ is quadratic, i.e., many-body interactions are negligible.)

One can now pose Dyson's question again: given Nambu space $V$ and a symmetry group $G$ acting on it, what is the set of time evolution operators that preserve the structure of $V$ and are compatible with $G$? From the section "The framework," we know the answer to be some symmetric space, but which are the symmetric spaces that occur?

Class D

Consider a superconductor with no symmetries in its quasiparticle dynamics, so $G = \{\mathrm{Id}\}$. (A concrete example would be a disordered spin-triplet superconductor in the vortex phase.) The time evolutions $U_t = e^{-itH/\hbar}$ are then constrained only by invariance of the unitary structure and the symmetric pairing $\{\,,\}$ of Nambu space. These two structures are consistent; they are related by particle–hole conjugation $C$:

$$\{\psi, \tilde\psi\} = \langle C\psi, \tilde\psi \rangle$$

which is an antiunitary operator with square $C^2 = +\mathrm{Id}$.

Let $V_{\mathbb{R}} \subset V$ denote the real vector space of fixed points of $C$. (The field operators in $V_{\mathbb{R}}$ are called "Majorana" type in physics.) The condition $\{\psi, \tilde\psi\} = \{U_t\psi, U_t\tilde\psi\}$ selects a complex orthogonal group $\mathrm{SO}(V)$, and imposing unitarity yields a real orthogonal subgroup $\mathrm{SO}(V_{\mathbb{R}})$ with $\dim V_{\mathbb{R}} \in 2\mathbb{N}$ – a symmetric space of the D family.

When expressed in some basis of Majorana fermions (meaning a basis of $V_{\mathbb{R}}$), the matrix of the time evolution generator $iH \in \mathrm{so}(V_{\mathbb{R}})$ is real skew, and that of $H$ imaginary skew. The simplest random matrix model for class D, the SO-invariant Gaussian ensemble of imaginary skew matrices, is analyzed in the second edition of Mehta's (1991) book. From the expressions given by Mehta it is seen that the level correlation functions at high energy coincide with those of the Wigner–Dyson universality class of unitary symmetry. The level correlations at low energy, however, show different behavior defining a separate universality class. This universal behavior at low energies has immediate physical relevance, as it is precisely the low-energy quasiparticles that determine the thermal transport properties of the superconductor at low temperatures.

Class DIII

Let now magnetic fields and magnetic impurities be absent, so that time reversal $T$ is a symmetry of the quasiparticle system: $G = \{\mathrm{Id}, T\}$. Following the section "The framework," the set of good time evolutions is $M \cong K/K_\tau$ with $K = \mathrm{SO}(V_{\mathbb{R}})$ and $K_\tau$ the set of fixed points of $U \mapsto \tau(U) = T U T^{-1}$ in $K$. What is $K_\tau$?

The square of the time-reversal operator is $T^2 = -\mathrm{Id}$ (for particles with spin 1/2), and $T$ commutes with particle–hole conjugation $C$, which makes $P := iCT$ a useful operator to consider. Since $C$ by definition commutes with the action of $K$, and hence also with that of $K_\tau$, the subgroup $K_\tau$ has an equivalent description as

$$K_\tau = \{k \in \mathrm{U}(V) \mid k = P k P^{-1} = \tau(k)\}$$

The operator $P$ is easily seen to have the following properties: (1) $P$ is unitary, (2) $P^2 = \mathrm{Id}$, and (3) $\mathrm{tr}_V P = 0$. Consequently, $P$ possesses two eigenspaces $V_\pm$ of equal dimension, and the condition $k = P k P^{-1}$ fixes a subgroup $\mathrm{U}(V_+) \times \mathrm{U}(V_-)$ of $\mathrm{U}(V)$. Since $P$ contains a factor $i = \sqrt{-1}$ in its definition, it anticommutes with the antilinear operator $T$. Therefore, the automorphism $\tau$ exchanges $\mathrm{U}(V_+)$ with $\mathrm{U}(V_-)$, and the fixed-point set $K_\tau$ is the same as $\mathrm{U}(V_+) \cong \mathrm{U}_{2N}$. Thus,

$$M \cong K/K_\tau \cong \mathrm{SO}_{4N}/\mathrm{U}_{2N} \qquad (\dim V_+ = 2N)$$

which is a symmetric space in the DIII family. Note that for particles with spin 1/2 the dimension of $V_+$ has to be even.
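
The class D statement above, that an imaginary skew $H$ generates a real orthogonal time evolution, can be checked numerically. What follows is a minimal sketch in plain Python, assuming $\hbar = 1$, a 4 × 4 Majorana basis, and a randomly drawn real skew matrix `A` with `H = iA`. The helper names (`mat_mul`, `mat_exp`, `random_real_skew`) and the truncated Taylor series used for the matrix exponential are illustrative choices, not part of the article.

```python
import random


def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(a, terms=40):
    """Truncated Taylor series for exp(a); adequate for small matrices of modest norm."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # After this step, term holds a^k / k!.
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[r + s for r, s in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def random_real_skew(n, rng):
    """Real matrix with a[j][i] = -a[i][j] (hypothetical sample generator)."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = rng.uniform(-1.0, 1.0)
            a[j][i] = -a[i][j]
    return a


rng = random.Random(0)
n, t = 4, 0.7
A = random_real_skew(n, rng)
H = [[1j * x for x in row] for row in A]   # imaginary skew, hence Hermitian

# Time evolution U_t = exp(-i t H), computed with complex arithmetic (hbar = 1).
U = mat_exp([[-1j * t * h for h in row] for row in H])

max_imag = max(abs(complex(u).imag) for row in U for u in row)
U_real = [[complex(u).real for u in row] for row in U]
U_real_T = [list(col) for col in zip(*U_real)]
UUt = mat_mul(U_real, U_real_T)
orth_error = max(abs(UUt[i][j] - (1.0 if i == j else 0.0))
                 for i in range(n) for j in range(n))

print("largest imaginary part of U_t:", max_imag)
print("largest deviation of U_t U_t^T from the identity:", orth_error)
```

Both printed numbers should be zero up to rounding error, which is the numerical counterpart of $iH \in \mathrm{so}(V_{\mathbb{R}})$ exponentiating into $\mathrm{SO}(V_{\mathbb{R}})$. For larger matrices or longer times a library routine for the matrix exponential would be a more robust choice than the truncated series.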
By realizing the algebra of involutions $C$, $T$ as $C\psi = (1_{2N} \otimes i\sigma_x)\,\bar\psi$ and $T\psi = (1_{2N} \otimes i\sigma_y)\,\bar\psi$, the Hamiltonians $H$ in class DIII are brought into the standard form

$$H = \begin{pmatrix} 0 & Z \\ -\bar{Z} & 0 \end{pmatrix}$$

where the $2N \times 2N$ matrix $Z$ is complex and skew.

Class C

Next let the spin of the quasiparticles be conserved, as is the case for a spin-singlet superconductor with no spin–orbit scatterers present, and let time-reversal invariance be broken by a magnetic field. The symmetry group of the quasiparticle system then is the spin-rotation group: $G = G_0 = \mathrm{Spin}_3 = \mathrm{SU}_2$.

Nambu space $V$ can be arranged to be a tensor product $V = L \otimes R$ so that $G_0$ acts trivially on $L$ and by the spinor representation on the spinor space $R \cong \mathbb{C}^2$. Since two spinors combine to give a scalar, the latter comes with an alternating bilinear form $a : R \times R \to \mathbb{C}$. In a suitable basis, the anticommutation relations [1] factor on particle–hole and spin indices. The symmetric bilinear form $\{\,,\}$ of $V$ correspondingly factors under the tensor product decomposition $V = L \otimes R$ as

$$\{l_1 \otimes r_1,\; l_2 \otimes r_2\} = [l_1, l_2]\; a(r_1, r_2)$$

where $[\,,]$ is an alternating form on $L$, giving $L$ the structure of a complex symplectic vector space.

The good set $M$ now consists of the time evolutions that, in addition to preserving the structure of Nambu space, commute with the spin-rotation group $\mathrm{SU}_2$:

$$M = \{U \in \mathrm{U}(V) \mid UC = CU;\; \forall g \in \mathrm{SU}_2 : gU = Ug\}$$

By the last condition, all time evolutions act trivially on the factor $R$. The condition $UC = CU$, which expresses invariance of the symmetric form of $V$, then implies that time evolutions preserve the alternating form of $L$. Time evolutions therefore are unitary symplectic transformations of $L$, hence $M = \mathrm{USp}(L) \cong \mathrm{USp}_{2N}$ – a symmetric space of the C family. The Hamiltonian matrices in class C have the standard form

$$H = \begin{pmatrix} W & Z \\ Z^\dagger & -W^{T} \end{pmatrix}$$

with $W$ being Hermitian and $Z$ complex and symmetric.

Class CI

The next class is obtained by taking the time reversal $T$ as well as the spin rotations $g \in \mathrm{SU}_2$ to be symmetries of the quasiparticle system. By arguments that should be familiar by now, the set of good time evolutions is a symmetric space $M \cong K/K_\tau$ with $K = \mathrm{USp}(L)$ and $K_\tau$ the set of fixed points of $\tau$ in $K$. Once again, the question to be answered is: what is $K_\tau$? The situation here is very similar to the one for class DIII, with $L$ and $\mathrm{USp}(L)$ taking the roles of $V$ and $\mathrm{SO}(V_{\mathbb{R}})$. By adapting the previous argument to the present case, one shows that $K_\tau$ is the same as $\mathrm{U}(L_+) \cong \mathrm{U}_N$, where $L_+$ is the positive eigenspace of $P = iCT$ viewed as a unitary operator on $L$. Thus,

$$M \cong K/K_\tau \cong \mathrm{USp}_{2N}/\mathrm{U}_N$$

Dirac Fermions: The Chiral Classes

Three large families of symmetric spaces remain to be implemented. Although these, too, occur in mesoscopic physics, their most natural realization is by 4D Dirac fermions in a random gauge field background.

Consider the Lagrangian $L$ for the Euclidean spacetime version of QCD with $N_c \geq 3$ colors of quarks coupled to an $\mathrm{SU}_{N_c}$ gauge field $A_\mu$:

$$L = i\bar\psi \gamma^\mu (\partial_\mu - A_\mu)\psi + i m \bar\psi \psi$$

The massless Dirac operator $D = i\gamma^\mu(\partial_\mu - A_\mu)$ anticommutes with $\gamma_5 = \gamma^0\gamma^1\gamma^2\gamma^3$. Therefore, in a basis of eigenstates of $\gamma_5$ the matrix of $D$ takes the form

$$D = \begin{pmatrix} 0 & Z \\ Z^\dagger & 0 \end{pmatrix} \qquad [3]$$

If the gauge field carries topological charge $\nu \in \mathbb{Z}$, the Dirac operator $D$ has at least $|\nu|$ zero modes by the index theorem. To make a simple model of the challenging situation where $A_\mu$ is distributed according to Yang–Mills measure, one takes the matrices $Z$ to be complex rectangular, of size $p \times q$ with $p - q = \nu$, and puts a Gaussian probability measure on that space. This random matrix model for $D$ captures the universal features of the QCD Dirac spectrum in the massless limit.

The exponential of the truncated Dirac operator, $e^{itD}$ (where $t$ is not the time), lies in a space equivalent to $\mathrm{U}_{p+q}/\mathrm{U}_p \times \mathrm{U}_q$ – a symmetric space of the AIII family. We therefore say that the universal behavior of the QCD Dirac spectrum is that of symmetry class AIII.

But hold on! Why are we entitled to speak of a symmetry class here? By definition, symmetries
always commute with the Hamiltonian, never do they anticommute! (The relation $D = -\gamma_5 D \gamma_5$ is not a symmetry in the sense of Dyson, nor is it a symmetry in our sense.)

Class AIII

To incorporate the massless QCD Dirac operator into the present classification scheme, we adapt it to the Nambu space setting. This is done by reorganizing the four-component Dirac spinor $\psi, \bar\psi$ as an eight-component Majorana spinor $\Psi$, to write

$$L_{m=0} = \frac{i}{2}\, \Psi\, \Gamma^\mu (\partial_\mu - \mathcal{A}_\mu)\, \Psi$$

The $8 \times 8$ matrices $\Gamma^\mu$ are real symmetric besides satisfying the Clifford relations $\Gamma^\mu \Gamma^\nu + \Gamma^\nu \Gamma^\mu = 2\delta^{\mu\nu}$. A possible tensor product realization is

$$\Gamma^0 = 1 \otimes \sigma_z \otimes 1, \qquad \Gamma^1 = \sigma_x \otimes \sigma_y \otimes \sigma_y$$
$$\Gamma^2 = \sigma_y \otimes \sigma_y \otimes 1, \qquad \Gamma^3 = \sigma_z \otimes \sigma_y \otimes \sigma_y$$

The gauge field in this Majorana representation is $\mathcal{A}_\mu = 1 \otimes 1 \otimes (A_\mu^{(-)} - A_\mu^{(+)} \sigma_y)$ where $A_\mu^{(\pm)} = (1/2)(A_\mu \pm A_\mu^{\mathrm{t}})$ are the symmetric and skew parts of $A_\mu \in \mathrm{su}(N_c)$.

The operator $H = i\Gamma^\mu(\partial_\mu - \mathcal{A}_\mu)$ is imaginary skew, therefore $e^{itH}$ is real orthogonal. This means that there exists a Nambu space $V$ with unitary structure $\langle\,,\rangle$ and symmetric pairing $\{\,,\}$, both of which are preserved by the action of $e^{itH}$. No change of physical meaning or interpretation is implied by the identical rewriting from Dirac $D$ to Majorana $H$. The fact that Dirac fermions are not truly Majorana is encoded in a $\mathrm{U}_1$-symmetry $H e^{i\theta Q} = e^{i\theta Q} H$ generated by $Q = 1 \otimes 1 \otimes \sigma_y$.

Now comes the essential point: since $H$ obeys $\bar{H} = -H$, the chiral "symmetry" $H = -\Gamma_5 H \Gamma_5$ with $\Gamma_5 = 1 \otimes \sigma_x \otimes 1$ can be recast as a true symmetry:

$$H = +\Gamma_5 \bar{H} \Gamma_5 = T H T^{-1}$$

with antilinear $T : \Psi \mapsto \Gamma_5 \bar\Psi$. Thus, the massless QCD Dirac operator is indeed associated with a symmetry class in the present, post-Dyson sense: that is class AIII, realized by self-adjoint operators on Nambu space with Dirac $\mathrm{U}_1$-symmetry and an antiunitary symmetry $T$.

Classes BDI and CII

Consider Hamiltonians $D$ still of the form [3] but now with matrix entries taken from either the real numbers or the real quaternions. Their one-parameter groups $e^{itD}$ belong to two further families of symmetric spaces, namely the classes BDI and CII of Table 1. These large families are known to be realized as symmetry classes by the massless Dirac operator with gauge group $\mathrm{SU}_2$ (for BDI), or with fermions in the adjoint representation (for CII). For the details we must refer to Verbaarschot's (1994) paper and the recent article by Heinzner et al. (2005).

See also: Classical Groups and Homogeneous Spaces; Compact Groups and Their Representations; Determinantal Random Fields; Dirac Fields in Gravitation and Nonabelian Gauge Theory; Dirac Operator and Dirac Field; High Tc Superconductor Theory; Integrable Systems in Random Matrix Theory; Lie Groups: General Theory; Random Matrix Theory in Physics; Random Partitions; Supersymmetry Methods in Random Matrix Theory; Symmetries and Conservation Laws.

Further Reading

Altland A, Simons BD, and Zirnbauer MR (2002) Theories of low-energy quasi-particle states in disordered d-wave superconductors. Physics Reports 359: 283–354.
Altland A and Zirnbauer MR (1997) Non-standard symmetry classes in mesoscopic normal-/superconducting hybrid systems. Physical Review B 55: 1142–1161.
Dyson FJ (1962) The threefold way: algebraic structure of symmetry groups and ensembles in quantum mechanics. Journal of Mathematical Physics 3: 1199–1215.
Heinzner P, Huckleberry A, and Zirnbauer MR (2005) Symmetry classes of disordered fermions. Communications in Mathematical Physics 257: 725–771.
Helgason S (1978) Differential Geometry, Lie Groups and Symmetric Spaces. New York: Academic Press.
Mehta ML (1991) Random Matrices. New York: Academic Press.
Weyl H (1939) The Classical Groups: Their Invariants and Representations. Princeton: Princeton University Press.
Verbaarschot JJM (1994) The spectrum of the QCD Dirac operator and chiral random matrix theory: the threefold way. Physical Review Letters 72: 2531–2533.
Synchronization of Chaos
M A Aziz-Alaoui, Université du Havre, Le Havre, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction: Chaotic Systems Can Synchronize

Synchronization is a ubiquitous phenomenon characteristic of many processes in natural systems and (nonlinear) science. It has permanently remained an objective of intensive research and is today considered as one of the basic nonlinear phenomena studied in mathematics, physics, engineering, or life science. This word has a Greek root, syn = common and chronos = time, which means to share the common time or to occur at the same time, that is, correlation or agreement in time of different processes (Boccaletti et al. 2002). Thus, synchronization of two dynamical systems generally means that one system somehow traces the motion of another. Indeed, it is well known that many coupled oscillators have the ability to adjust some common relation that they have between them due to weak interaction, which leads to a situation in which a synchronization-like phenomenon takes place.

The original work on synchronization involved periodic oscillators. Indeed, observations of (periodic) synchronization phenomena in physics go back at least as far as C Huygens (1673), who, during his experiments on the development of improved pendulum clocks, discovered that two very weakly coupled pendulum clocks become synchronized in phase: two clocks hanging from a common support (on the same beam of his room) were found to oscillate with exactly the same frequency and opposite phase due to the (weak) coupling in terms of the almost imperceptible oscillations of the beam generated by the clocks.

Since this discovery, periodic synchronization has found numerous applications in various domains, for instance, in biological systems and living nature where synchronization is encountered on different levels. Examples range from the modeling of the heart to the investigation of the circadian rhythm, phase locking of respiration with a mechanical ventilator, synchronization of oscillations of human insulin secretion and glucose infusion, neuronal information processing within a brain area and communication between different brain areas. Also, synchronization plays an important role in several neurological diseases such as epilepsies and pathological tremors, or in different forms of cooperative behavior of insects, animals, or humans (Pikovsky et al. 2001).

This process may also be encountered in celestial mechanics, where it explains the locking of the revolution periods of planets and satellites.

The view of synchronization was strongly broadened with the developments in radio engineering and acoustics, due to the work of Eccles and Vincent, 1920, who found synchronization of a triode generator. Appleton, Van der Pol, and Van der Mark, 1922–27, extended it experimentally and theoretically and worked on radio tube oscillators, where they observed entrainment when driving such oscillators sinusoidally, that is, the frequency of a generator can be synchronized by a weak external signal of a slightly different frequency.

But even though the original notion and theory of synchronization imply periodicity of oscillators, during the last decades the notion of synchronization has been generalized to the case of interacting chaotic oscillators. Indeed, the discovery of deterministic chaos introduced new types of oscillating systems, namely the chaotic generators.

Chaotic oscillators are found in many dynamical systems of various origins; the behavior of such systems is characterized by instability and, as a result, limited predictability in time.

Roughly speaking, a system is chaotic if it is deterministic, has a long-term aperiodic behavior, and exhibits sensitive dependence on initial conditions on a closed invariant set (chaos theory is discussed in more detail elsewhere in this encyclopedia; see Chaos and Attractors).

Consequently, for a chaotic system, trajectories starting arbitrarily close to each other diverge exponentially with time, and quickly become uncorrelated. It follows that two identical chaotic systems cannot synchronize. This means that they cannot produce identical chaotic signals, unless they are initialized at exactly the same point, which is in general physically impossible. Thus, at first sight, synchronization of chaotic systems seems to be rather surprising because one may intuitively (and naively) expect that the sensitive dependence on initial conditions would lead to an immediate breakdown of any synchronization of coupled chaotic systems. This scenario in fact led to the belief that chaos is uncontrollable and thus unusable. Despite this, in the last decades, the search for synchronization has moved to chaotic systems. Significant research has been done and, as a result, Yamada and Fujisaka (1983), Afraimovich et al. (1986), and Pecora and Carroll (1990) showed that
214 Synchronization of Chaos

two chaotic systems could be synchronized by coupling them: synchronization of chaos is real, and chaos could therefore be exploited. Ever since, many researchers have discussed the theory and the design or applications of synchronized motion in coupled chaotic systems. A broad variety of applications has emerged, for example, to increase the power of lasers, to synchronize the output of electronic circuits, to control oscillations in chemical reactions, or to encode electronic messages for secure communications.

The publication of the seminal paper of Pecora and Carroll (1990) had a very strong impact in the domain of chaos theory and chaos synchronization, and their applications. It stimulated very intense research activity, and the related studies continue to attract great attention. Many authors have contributed to developing this domain, theoretically or experimentally (Boccaletti et al. 2002, Pecora et al. 1997, and references therein).

However, the special features of chaotic systems make it impossible to directly apply the methods developed for synchronization of periodic oscillators. Moreover, in the study of coupled chaotic systems, many different phenomena that are usually referred to as synchronization exist and have been studied for over a decade. Thus, more precise descriptions of such systems are desirable.

Several different regimes of synchronization have been investigated. In the following, the focus is on the essentials of this large topic, subdivided into four basic types of synchronization of coupled or forced chaotic systems that have been found and have received much attention, with emphasis on the first three:

- identical (or complete) synchronization (IS), which is defined as the coincidence of states of interacting systems;
- generalized synchronization (GS), which extends the IS phenomenon and implies the presence of some functional relation between two coupled systems; if this relationship is the identity, we recover the IS;
- phase synchronization (PS), which means entrainment of phases of chaotic oscillators, whereas their amplitudes remain uncorrelated; and
- lag synchronization (LS), which appears as a coincidence of time-shifted states of two systems.

Other regimes exist; some of them are briefly pointed out at the end of this article. We also briefly discuss the very relevant issue of the stability of synchronous motions.

Our discussion and examples are based on unidirectionally coupled continuous systems; most of the ideas presented can easily be extended to discrete systems.

Let us also emphasize that the same year, 1990, saw the publication of another seminal paper, by Ott, Grebogi, and Yorke (OGY), on the control of chaos (Ott et al. 1990). Recently, it has been realized that synchronization and control of chaos share a common root in nonlinear control theory. Both topics have been presented by many authors in a unified framework. However, synchronization of chaos has evolved in its own right, even if it is nowadays regarded as a part of nonlinear control theory.

Synchronization and Stability

For the basic master–slave configuration, where an autonomous chaotic system (the master)

$$\frac{dX}{dt} = F(X), \quad X \in \mathbb{R}^n \tag{1}$$

drives another system (the slave),

$$\frac{dY}{dt} = G(X, Y), \quad Y \in \mathbb{R}^m \tag{2}$$

synchronization takes place when $Y$ asymptotically copies, in a certain manner, a subset $X_p$ of $X$. That is, there exists a relation between the two coupled systems, which could be a smooth invertible function $\psi$, which transforms the trajectories on the attractor of the first system into those on the attractor of the second system. In other words, if we know, after a transient regime, the state of the first system, we can predict the state of the second: $Y(t) = \psi(X(t))$. Generally, it is assumed that $n \ge m$; however, for ease of reading (even if this is not a necessary restriction), only the case $n = m$ will be considered; thus, $X_p = X$. Henceforth, if we denote the difference $Y - \psi(X)$ by $X_\perp$, in order to arrive at a synchronized motion, it is expected that

$$\|X_\perp\| \to 0 \quad \text{as } t \to +\infty \tag{3}$$

If $\psi$ is the identity function, the process is called IS.

Definition of IS  System [2] synchronizes with system [1] if the set $M = \{(X, Y) \in \mathbb{R}^n \times \mathbb{R}^n,\ Y = X\}$ is an attracting set with a basin of attraction $B$ ($M \subset B$) such that $\lim_{t \to \infty} \|X(t) - Y(t)\| = 0$ for all $(X(0), Y(0)) \in B$.

Thus, this regime corresponds to the situation where all the variables of two (or more) coupled chaotic systems converge.

If $\psi$ is not the identity function, the phenomenon is more general and is referred to as GS.

Definition of GS  System [2] synchronizes with system [1], in the generalized sense, if there exist a transformation $\psi: \mathbb{R}^n \to \mathbb{R}^m$, a manifold $M = \{(X, Y) \in \mathbb{R}^{n+m},\ Y = \psi(X)\}$, and a subset $B$ ($M \subset B$), such that for all $(X_0, Y_0) \in B$, the trajectory based on the initial conditions $(X_0, Y_0)$ approaches $M$ as time goes to infinity. This is explained further in the following.

Henceforth, in the case of IS, eqn [3] above means that a certain hyperplane $M$, called the synchronization manifold, within $\mathbb{R}^{2n}$, is asymptotically stable. Consequently, to establish synchronous motion, we have to prove that the origin of the transverse system $X_\perp = Y - X$ is asymptotically stable, that is, that the motion transverse to the synchronization manifold dies out.

Significant progress has been made by mathematicians and physicists in studying the stability of synchronous motions. Two main tools are used in the literature for this purpose: conditional Lyapunov exponents and asymptotic stability. In the examples given below, we will essentially formulate conditions for synchronization in terms of Lyapunov exponents, which play a central role in chaos theory. These quantities measure the sensitive dependence on initial conditions of a dynamical system and also quantify synchronization of chaos.

The Lyapunov exponents associated with the variational equation corresponding to the transverse system $X_\perp$,

$$\frac{dX_\perp}{dt} = DF(X)\, X_\perp \tag{4}$$

where $DF(X)$ is the Jacobian of the vector field evaluated along the driving trajectory $X$, are referred to as transverse or conditional Lyapunov exponents (CLEs).

In the case of IS, it appears that the condition $L^{\perp}_{\max} < 0$ is sufficient to ensure synchronization, where $L^{\perp}_{\max}$ is the largest CLE. Indeed, eqn [4] gives the dynamics of the motion transverse to the synchronization manifold; therefore, the CLEs indicate whether this motion dies out or not, and hence whether the synchronization state is stable or not. Consequently, if $L^{\perp}_{\max}$ is negative, the synchronized state is stable. This is best explained using the two examples below.

Even if there exist other approaches for studying synchronization, one may ask whether this condition on $L^{\perp}_{\max}$ is valid in general. To answer this question, mathematicians have recently formulated it in terms of properties of manifolds (or synchronization hyperplanes). Some rigorous results on (generalized) synchronization, when the system is smooth, are given by Josic (2000). This approach relies on the Fenichel theory of normally hyperbolic invariant manifolds and on quantities that resemble Lyapunov exponents, and is referred to as differentiable GS.

However, many situations correspond to the case where, in some range of the coupling parameters, the function $\psi$ is only continuous but not smooth, that is, the graph of $\psi$ is a complicated geometrical object. This kind of synchronization is called nonsmooth GS (Afraimovich et al. 2001).

Furthermore, the mathematical theory of IS often assumes the coupled oscillators to be identical, even if, in practice, no two oscillators are exact copies of each other. This leads to small differences in system parameters and hence to synchronization errors. These errors have been studied by many authors (see, e.g., Illing (2002), and references therein).

Identical Synchronization

Perhaps the best way to explain synchronization of chaos is through IS, also referred to as conventional or complete synchronization (Boccaletti et al. 2002). It is the simplest form of chaos synchronization and generalizes to the complete replacement, which is explained below. It is also the most typical form of chaotic synchronization, often observable in two identical systems.

There are various processes leading to synchronization; depending on the particular coupling configuration used, these processes could be very different. So, one has to distinguish between the following two main situations, even if they are, in some sense, similar: unidirectional and bidirectional coupling. Indeed, synchronization of chaotic systems is often studied for schemes of the form

$$\begin{aligned}
\frac{dX}{dt} &= F(X) + kN(X - Y) \\
\frac{dY}{dt} &= G(Y) + kM(X - Y)
\end{aligned} \tag{5}$$

where $F$ and $G$ act in $\mathbb{R}^n$, $(X, Y) \in (\mathbb{R}^n)^2$, $k$ is a scalar, and $M$ and $N$ are coupling matrices belonging to $\mathbb{R}^{n \times n}$. If $F = G$ the two subsystems $X$ and $Y$ are identical. Moreover, when both matrices are nonzero the coupling is called bidirectional, while it is referred to as unidirectional if one is the zero matrix and the other is nonzero.

Constructing Pairs of Synchronized Systems: Complete Replacement

Pecora and Carroll (1990) proposed the use of stable subsystems of given chaotic systems to

construct pairs of unidirectionally coupled synchronizing systems. Since then, generalizations of this approach have been developed, and various methods now exist to synchronize systems (Wu 2002, Hasler 1998).

One way to build a pair of synchronized systems is to use the basic construction method introduced by Pecora and Carroll, who made an important observation. They found that, when they make a replica of part of a chaotic system and send a system variable from the original system (transmitter) to drive this replica (receiver), sometimes the replica subsystem and the original chaotic one lock in step and evolve together chaotically in synchrony. This method can be described as follows. Consider the autonomous $n$-dimensional dynamical system,

$$\frac{du}{dt} = F(u) \tag{6}$$

divide this system into two subsystems ($u = (v, w)$),

$$\begin{aligned}
\frac{dv}{dt} &= G(v, w) \\
\frac{dw}{dt} &= H(v, w)
\end{aligned} \tag{7}$$

where $v = (u_1, \ldots, u_m)$, $w = (u_{m+1}, \ldots, u_n)$, $G = (F_1, \ldots, F_m)$, and $H = (F_{m+1}, \ldots, F_n)$. Next, create a new subsystem $w'$ identical to the $w$-subsystem. This yields a $(2n - m)$-dimensional system:

$$\begin{aligned}
\frac{dv}{dt} &= G(v, w) \\
\frac{dw}{dt} &= H(v, w) \\
\frac{dw'}{dt} &= H(v, w')
\end{aligned} \tag{8}$$

The first state-variable component $v(t)$ of the $(v, w)$ system is then used as the input to the $w'$-system. The coupling is unidirectional, and the $(v, w)$ subsystem is referred to as the driving (or master) system, the $w'$-subsystem as the response (or slave) system. In this context, the following notions and results are useful.

Definition  If $\lim_{t \to +\infty} \|w'(t) - w(t)\| = 0$ and $w'(t)$ continues to remain in step with $w(t)$ in the course of time, the two subsystems are said to be synchronized.

Definition  The Lyapunov exponents of the response subsystem ($w'$) for a particular driven trajectory $v(t)$ are called CLEs.

Let $w(t)$ be a chaotic trajectory with initial condition $w(0)$, and $w'(t)$ be a trajectory started at a nearby point $w'(0)$. The basic idea of the Pecora–Carroll approach is to establish the asymptotic stability of the solutions of the $w'$-subsystem by means of CLEs. They have shown the following result (Pecora and Carroll 1990):

Theorem  A necessary and sufficient condition for the two subsystems, $w$ and $w'$, to be synchronized is that all of the CLEs be negative.

Note that only a finite number of possible decompositions (or couplings) $v$–$w$ exist; this is bounded by the number of different possible subsystems, namely $N(N - 1)/2$. (For a description and mathematical analysis of various coupling schemes see Wu (2002).) Furthermore, if the main system [6] is split in a different way, (complete) synchronization may fail to occur. Indeed, in general, only a few of the possible response subsystems possess negative CLEs, and may thus be used to implement synchronizing systems using the Pecora–Carroll method. In fact, it has been pointed out in the literature that in some cases, the CLE criterion is not as practical as some other criteria.

For simplicity, the idea will now be developed on the following simple three-dimensional autonomous system, which belongs to the class of dynamical systems called generalized Lorenz systems (see Derivière and Aziz-Alaoui (2003), and references therein):

$$\begin{aligned}
\dot x &= -9x - 9y \\
\dot y &= -17x - y - xz \\
\dot z &= -z + xy
\end{aligned} \tag{9}$$

(This should be compared with the well-known Lorenz system:

$$\begin{aligned}
\dot x &= -10x + 10y \\
\dot y &= 28x - y - xz \\
\dot z &= -\tfrac{8}{3} z + xy
\end{aligned}$$

which differs in the signs of various terms and the values of coefficients.) From previous observations, it was shown that system [9] oscillates chaotically; its Lyapunov exponents are $+0.601$, $0.000$, and $-16.470$; it exhibits the chaotic attractor of Figure 1, with a three-dimensional feature very similar to that of the Lorenz attractor (in fact, it satisfies the condition $z < 0$, but in our context it does not matter).

Let us divide system [9] into two subsystems $v = x_1$ and $w = (y_1, z_1)$. By creating a copy

Figure 1 The chaotic attractor of system [9]: $x$–$y$ and $x$–$z$ plane projections.

$w' = (y_2, z_2)$ of the $w$-subsystem, we obtain the following five-dimensional dynamical system:

$$\begin{aligned}
\dot x_1 &= -9x_1 - 9y_1 \\
\dot y_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot z_1 &= -z_1 + x_1 y_1 \\
\dot y_2 &= -17x_1 - y_2 - x_1 z_2 \\
\dot z_2 &= -z_2 + x_1 y_2
\end{aligned} \tag{10}$$

In numerical experiments, it was observed that the motion quickly satisfies the two equalities $\lim_{t \to +\infty} |y_2 - y_1| = 0$ and $\lim_{t \to +\infty} |z_2 - z_1| = 0$, that is, $\lim_{t \to +\infty} \|w' - w\| = 0$. These equalities persist as the system evolves. Hence, the two subsystems $w$ and $w'$ are synchronized. Figure 2 illustrates this phenomenon.

It is also easy to verify that the synchronization persists even if a slight change in the parameters of the system is made. The CLEs of the linearization of the system around the synchronous state, the negativity of which determines the stability of the synchronized solution, are also easily computed.

Pecora and Carroll similarly built system [10] using the following steps. Starting with two copies of system [9], a signal $x(t)$ is transmitted from the first to the second: in the second system all $x$-components are replaced with the signal from the first system, that is, $x_2$ is replaced by $x_1$ in the second system. Finally, the $dx_2/dt$ equation is eliminated, since it is exactly the same as the $dx_1/dt$ equation and is superfluous. This results in system [10]. For this reason, Pecora and Carroll called this construction a complete replacement. Thus, it is natural to think of the $x_1$ variable as driving the second system, and also to label the first system the drive and the second system the response. In fact, this method is a particular case of the unidirectional coupling method explained below. Note also that this method could be modified by using a partial substitution approach, in which a response variable is replaced with the drive counterpart only in certain locations (Pecora et al. 1997).

Unidirectional IS

IS of this kind has also been called one-way diffusive coupling, drive–response coupling, master–slave coupling, or negative feedback control.

System [5], with $F = G$ and $N = 0$, becomes unidirectionally coupled, and reads

$$\begin{aligned}
\frac{dX}{dt} &= F(X) \\
\frac{dY}{dt} &= F(Y) + kM(X - Y)
\end{aligned} \tag{11}$$

$M$ is then a matrix that determines the linear combination of $X$ components that will be used in the difference, and $k$ determines the strength of the coupling (for an interesting review on this subject, see Pecora et al. (1997)). In unidirectional synchronization, the evolution of the first system (the drive) is unaltered by the coupling; the second system (the response) is then constrained to copy the dynamics of the first. Let us consider an example with two copies of system [9], and for

$$M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{12}$$

that is, by adding a damping term to the first equation of the response system, we get the following unidirectionally coupled system, coupled through a linear term $k > 0$ according to variables $x_{1,2}$:

$$\begin{aligned}
\dot x_1 &= -9x_1 - 9y_1 \\
\dot y_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot z_1 &= -z_1 + x_1 y_1 \\
\dot x_2 &= -9x_2 - 9y_2 - k(x_2 - x_1) \\
\dot y_2 &= -17x_2 - y_2 - x_2 z_2 \\
\dot z_2 &= -z_2 + x_2 y_2
\end{aligned} \tag{13}$$
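The two constructions above, complete replacement (system [10]) and unidirectional coupling (system [13]), are easy to explore numerically. The following Python sketch is only an illustration under stated assumptions: system [9] is taken as $\dot x = -9x - 9y$, $\dot y = -17x - y - xz$, $\dot z = -z + xy$; the integrator is a hand-written fixed-step fourth-order Runge–Kutta scheme; and the function names, step size, and initial conditions are hypothetical choices, not taken from the article.

```python
import math


def rk4_step(f, state, dt):
    """Advance `state` by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def complete_replacement(state):
    """Right-hand side of system [10]: x1 drives the copy (y2, z2)."""
    x1, y1, z1, y2, z2 = state
    return [-9 * x1 - 9 * y1,
            -17 * x1 - y1 - x1 * z1,
            -z1 + x1 * y1,
            -17 * x1 - y2 - x1 * z2,
            -z2 + x1 * y2]


def make_unidirectional(k):
    """Return the right-hand side of system [13] for coupling strength k."""
    def rhs(state):
        x1, y1, z1, x2, y2, z2 = state
        return [-9 * x1 - 9 * y1,
                -17 * x1 - y1 - x1 * z1,
                -z1 + x1 * y1,
                -9 * x2 - 9 * y2 - k * (x2 - x1),
                -17 * x2 - y2 - x2 * z2,
                -z2 + x2 * y2]
    return rhs


def integrate(f, state, t_end, dt=0.005):
    """Integrate from t = 0 to t_end with a fixed step (assumed value)."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(f, state, dt)
    return state


def replacement_error(state):
    """Distance ||w' - w|| between response and drive in system [10]."""
    _, y1, z1, y2, z2 = state
    return math.hypot(y2 - y1, z2 - z1)


def coupling_error(state):
    """Distance ||Y - X|| between response and drive in system [13]."""
    x1, y1, z1, x2, y2, z2 = state
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)


if __name__ == "__main__":
    start10 = [1.0, 1.0, -1.0, 5.0, -8.0]   # arbitrary initial condition
    end10 = integrate(complete_replacement, start10, 30.0)
    print("system [10]: error at t = 0 :", replacement_error(start10))
    print("system [10]: error at t = 30:", replacement_error(end10))

    start13 = [1.0, 1.0, -1.0, -2.0, 3.0, -4.0]
    for k in (0.0, 5.0, 20.0):
        end13 = integrate(make_unidirectional(k), start13, 50.0)
        print("system [13], k =", k, ": error at t = 50:", coupling_error(end13))
```

Running the script prints the synchronization error of system [10] before and after the transient, and the error of system [13] for a few coupling strengths, so the effect of $k$ can be inspected directly. Near the synchronization threshold the transverse motion decays slowly, so longer integration times may be needed there.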

Figure 2 Complete replacement synchronization. Time series for (a) $y_i(t)$ and (b) $z_i(t)$, $i = 1, 2$, in system [10]. The difference between the variable of the transmitter and the variable of the receiver tends to zero as time progresses, that is, synchronization occurs after transients die down. (c) The plot of amplitudes $y_1$ against $y_2$, after transients die down, shows a diagonal line, which also indicates that the receiver and the transmitter are maintaining synchronization. The plot of $z_1$ against $z_2$ shows a similar behavior.
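The convergence shown in Figure 2 can also be checked by hand. The short derivation below is a sketch that assumes system [10] has the form $\dot y_i = -17x_1 - y_i - x_1 z_i$, $\dot z_i = -z_i + x_1 y_i$ for $i = 1, 2$; the error variables $e_y$ and $e_z$ are notation introduced here for illustration.

$$\begin{aligned}
e_y &= y_2 - y_1, \qquad e_z = z_2 - z_1, \\
\dot e_y &= -e_y - x_1 e_z, \qquad \dot e_z = -e_z + x_1 e_y, \\
\frac{d}{dt}\left(e_y^2 + e_z^2\right) &= 2e_y(-e_y - x_1 e_z) + 2e_z(-e_z + x_1 e_y) = -2\left(e_y^2 + e_z^2\right).
\end{aligned}$$

Hence $\|w'(t) - w(t)\| = \|w'(0) - w(0)\|\, e^{-t}$ whatever the driving signal $x_1(t)$ is, and under this assumption both CLEs of the response subsystem are equal to $-1$.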

For $k = 0$, the two subsystems are uncoupled; for $k > 0$ both subsystems are unidirectionally coupled; and for $k \to +\infty$, we recover the complete replacement coupling scheme explained above. Our numerical computations yield the threshold value $\tilde k$ for synchronization; we found that for $k \ge \tilde k = 4.999$, both subsystems of [13] synchronize. That is, starting from random initial conditions, and after some transient time, system [13] generates the same attractor as system [9] (see Figure 1). Consequently, all the variables of the coupled chaotic subsystems converge: $x_2$ converges to $x_1$, $y_2$ to $y_1$, and $z_2$ to $z_1$ (see Figure 3). Thus, the second system (the response) is locked to the first one (the drive).

Alternatively, observation of diagonal lines in correlation diagrams, which plot the amplitudes $x_1$ against $x_2$, $y_1$ against $y_2$, and $z_1$ against $z_2$, can also indicate the occurrence of synchronization.

IS was the first type of synchronization for which examples of unidirectionally coupled chaotic systems were presented. It is important for potential applications of chaos synchronization in communication systems, or for time-series analysis, where the information flow is also unidirectional.

Bidirectional IS

A second brief example uses a bidirectional (also called mutual or two-way) coupling. In this situation, in contrast to the unidirectional coupling, both drive and response systems are connected in such a way that they influence each other’s behavior. Many

Figure 3 Time series for $x_i(t)$, $y_i(t)$, and $z_i(t)$ ($i = 1, 2$) in system [13] for the coupling constant $k = 5.0$, that is, beyond the threshold necessary for synchronization. After transients die down, the two subsystems synchronize perfectly.

biological or physical systems consist of bidirectionally interacting elements or components; examples range from cardiac and respiratory systems to coupled lasers with feedback. Let us then take two copies of the same system [9] as given above, but two-way coupled through a linear constant term $k > 0$ according to variables $x_{1,2}$:

$$\begin{aligned}
\dot x_1 &= -9x_1 - 9y_1 - k(x_1 - x_2) \\
\dot y_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot z_1 &= -z_1 + x_1 y_1 \\
\dot x_2 &= -9x_2 - 9y_2 - k(x_2 - x_1) \\
\dot y_2 &= -17x_2 - y_2 - x_2 z_2 \\
\dot z_2 &= -z_2 + x_2 y_2
\end{aligned} \tag{14}$$

We can get an idea of the onset of synchronization by plotting, for example, $x_1$ against $x_2$ for various values of the coupling-strength parameter $k$. Our numerical computations yield the threshold value $\tilde k$ for synchronization: $\tilde k \simeq 2.50$ (Figure 4). Beyond this value, both $(x_i, y_i, z_i)$ subsystems synchronize and system [14] also generates the attractor of Figure 1.

Synchronization manifold and stability  Geometrically, the fact that systems [13] and [14], once synchronized, generate the same attractor as system [9] implies that the attractors of these combined drive–response six-dimensional systems are confined to a three-dimensional hyperplane (the synchronization manifold) defined by $Y = X$. After synchronization is reached, this manifold is a stable submanifold in the full phase space $\mathbb{R}^6$.

Figure 5 gives an idea of what the geometry of the synchronous attractor of system [13] or [14] looks like, by exhibiting the projection of the phase space $\mathbb{R}^6$ onto the $(x_1, y_1, y_2)$ subspace. One can similarly plot any combination of the variables $x_i$, $y_i$, and $z_i$ ($i = 1, 2$) and get the same result, since the motion, in case of synchronization, is confined to the hyperplane defined in $\mathbb{R}^6$ by the equalities $x_1 = x_2$, $y_1 = y_2$, and $z_1 = z_2$.

This hyperplane is stable since small perturbations which take the trajectory off the synchronization manifold decay in time. Indeed, as stated earlier, the CLEs of the linearization of the system around the synchronous state determine the stability of the synchronized solution. This leads to requiring that the origin of the transverse system, $X_\perp$, is asymptotically stable. To see this, for both systems [13] and [14], we switch to the new set of coordinates, $X_\perp = Y - X$, that is, $x_\perp = x_2 - x_1$, $y_\perp = y_2 - y_1$, and $z_\perp = z_2 - z_1$. The origin $(0, 0, 0)$ is obviously a fixed point for this transverse system,

Figure 4 Illustration of the onset of synchronization of system [14]. (a)–(c) Plots of amplitudes $x_1$ against $x_2$ for values of the coupling parameter $k = 0.5, 1.5, 2.8$, respectively. The system synchronizes for $k \ge 2.5$. (d) Plot, for $k = 2.8$, of the norm $N(X) = \|x_1 - x_2\| + \|y_1 - y_2\| + \|z_1 - z_2\|$ versus $t$, which shows that the system synchronizes very quickly.

Figure 5 The motion of synchronized system [13] or [14] takes place on a chaotic attractor which is embedded in the synchronization manifold, that is, the hyperplane defined by $x_1 = x_2$, $y_1 = y_2$, and $z_1 = z_2$. (Axes: $x_1$, $y_1$, $y_2$; the plot marks the hyperplane $y_1 = y_2$ and the transverse direction.)

within the synchronization manifold. Therefore, for small deviations from the synchronization manifold, this system reduces to a typical variational equation:

$$\frac{dX_\perp}{dt} = DF(X)\, X_\perp \tag{15}$$

where $DF(X)$ is the Jacobian of the vector field evaluated along the driving trajectory $X$, that is,

$$\begin{pmatrix} \dfrac{dx_\perp}{dt} \\[2mm] \dfrac{dy_\perp}{dt} \\[2mm] \dfrac{dz_\perp}{dt} \end{pmatrix} = V \begin{pmatrix} x_\perp \\ y_\perp \\ z_\perp \end{pmatrix} \tag{16}$$

For systems [13] and [14], we obtain

$$V = V_i = \begin{pmatrix} -9 - k_i & -9 & 0 \\ -17 - z & -1 & -x \\ y & x & -1 \end{pmatrix} \tag{17}$$

with $k_i = k$ for system [13] and $k_i = 2k$ for system [14]. Let us remark that the only difference between both matrices $V_i$ is the coupling $k$, which has a factor

2 in the bidirectional case. Figure 6 shows the dependence of $L^{\perp}_{\max}$ on $k$ for both examples, the unidirectionally and the bidirectionally coupled systems. $L^{\perp}_{\max}$ becomes negative as $k$ increases, which ensures the stability of the synchronized state for systems [13] and [14].

Figure 6 The largest transverse Lyapunov exponents $L^{\perp}_{\max}$ as a function of coupling strength $k$, in the unidirectional system [13] (solid) and the bidirectional system [14] (dotted).

Let us note that this can also be proved analytically, as done by Derivière and Aziz-Alaoui (2003), by using a suitable Lyapunov function and an extended version of the LaSalle invariance principle.

Desynchronization motion  Synchronization depends not only on the coupling strength, but also on the vector field and the coupling function. For some choices of these quantities, synchronization may occur only within a finite range $[k_1, k_2]$ of coupling strength; in such a case a desynchronization phenomenon occurs. Thus, increasing $k$ beyond the critical value $k_2$ yields loss of the synchronized motion ($L^{\perp}_{\max}$ becomes positive).

Generalized Synchronization

Identical chaotic systems synchronize by following the same chaotic trajectory. However, real systems are in general not identical. For instance, when the parameters of two coupled identical systems do not match, or when these coupled systems belong to different classes, complete IS may not be expected, because there does not exist an invariant manifold $Y = X$, as for IS. For non-identical systems, the possibility of some type of synchronization has been investigated (Afraimovich et al. 1986). It was shown that when two different systems are coupled with sufficiently strong coupling strength, a general synchronous relation between their states could exist and it could be expressed by a smooth invertible function, $Y(t) = \psi(X(t))$. This phenomenon, called GS, is thus a relaxed and extended form of IS in non-identical systems.

However, it may also occur for pairs of identical systems, for example, for systems having reflection symmetry, $F(-X) = -F(X)$. Besides these examples of GS, others also exist that exploit symmetries of the underlying systems (Parlitz and Kocarev 1999).

GS was introduced for unidirectionally coupled systems by Rulkov et al. (1995). For simplicity, we also focus on unidirectionally coupled continuous time systems:

$$\begin{aligned}
\frac{dX}{dt} &= F(X) \\
\frac{dY}{dt} &= G(Y, u(t))
\end{aligned} \tag{18}$$

where $X \in \mathbb{R}^n$, $Y \in \mathbb{R}^m$, $F: \mathbb{R}^n \to \mathbb{R}^n$, $G: \mathbb{R}^m \times \mathbb{R}^k \to \mathbb{R}^m$, and $u(t) = (u_1(t), \ldots, u_k(t))$ with $u_i(t) = h_i(X(t, X_0))$. Two (non-identical) dynamical systems are said to be synchronized in a generalized sense if there is a continuous function $\psi$ from the phase space of the first to the phase space of the second, taking orbits of the first system to orbits of the second.

The main problem is to know when and under what conditions system [18] undergoes GS. Many authors have addressed this question, and it has been shown that asymptotic stability is equally significant for this more universal concept (for some theoretical results, see Rulkov et al. (1995) and Parlitz and Kocarev (1999)). For unidirectionally coupled continuous time systems, the following results hold:

Theorem  A necessary and sufficient condition for system [18] to be synchronized in the generalized sense is that for each $u(t) = u(X(t, X_0))$ the system is asymptotically stable.

When it is not possible to find a Lyapunov function in order to use this theorem, one can numerically compute the CLEs of the response system, and use the following result:

Theorem  The drive and response subsystems of system [18] synchronize in the generalized sense iff all of the CLEs of the response subsystem are negative.

The definition of $\psi$ has the advantage that it allows the discussion of synchronization of non-identical systems and, at the same time, the consideration of synchronization in terms of the properties of the synchronization manifold. Therefore, it is important to study the existence of the transformation $\psi$ and its nature

(continuity, smoothness, ...). Unfortunately, except in special cases (Afraimovich et al. 1986), rarely will one be able to produce formulas exhibiting the mapping $\psi$.

An example of two unidirectionally coupled chaotic systems which synchronize in the generalized sense is given below. Consider the following Rössler system driven by system [9]:

$$\begin{aligned}
\dot x_1 &= -9x_1 - 9y_1 \\
\dot y_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot z_1 &= -z_1 + x_1 y_1 \\
\dot x_2 &= -y_2 - z_2 - k\left(x_2 - (x_1^2 + y_1^2)\right) \\
\dot y_2 &= x_2 + 0.2\, y_2 - k\left(y_2 - (y_1^2 + z_1^2)\right) \\
\dot z_2 &= 0.2 + z_2 (x_2 - 9.0) - k\left(z_2 - (x_1^2 + z_1^2)\right)
\end{aligned} \tag{19}$$

As shown in Figure 7, it appears impossible to tell what the relation is between the transmitter subsystem $(x_1, y_1, z_1)$ in eqn [19] and the two Rössler response subsystems $(x_2, y_2, z_2)$ at $k = 1$ and $k = 100$. However, GS occurs for large values of the coupling-strength parameter $k$. Therefore, for such values we expect that orbits of [19] will lie in the vicinity of a certain synchronization manifold. Indeed, let us define the set

$$S = \left\{ (x_1, y_1, z_1, x_2, y_2, z_2) \in \mathbb{R}^6 : x_2 = x_1^2 + y_1^2,\ y_2 = y_1^2 + z_1^2,\ z_2 = x_1^2 + z_1^2 \right\}$$

Since the projections of $S$ onto the coordinates $(x_1, y_1, x_2)$, $(y_1, z_1, y_2)$, and $(x_1, z_1, z_2)$ are paraboloids, we can see how the synchronization manifold is approached. This is illustrated in Figure 8, where the $(x_1, y_1, x_2)$ projections of typical trajectories are shown at four different coupling values. (See Josic (2000) for other examples and further developments; see also Pecora et al. (1997), where the authors summarize a method for getting an idea of the functional relation occurring, in the case of GS, between two coupled systems.)

Phase Synchronization

For coupled non-identical chaotic systems, other types of synchronization exist. Recently, a rather weak degree of synchronization, the PS, of chaotic systems has been described (Pikovsky et al. 2001). The Greek meaning of the word synchronization, mentioned in the introduction, is closely related to this type of process. The synchronous motion is actually not visible. Indeed, in PS the phases of the chaotic systems are locked, that is, there exists a certain relation between them, whereas the amplitudes vary chaotically and are practically uncorrelated. Thus, it is closest to the synchronization of periodic oscillators.

Definition  PS of two coupled chaotic oscillators occurs if, for arbitrary integers $n$ and $m$, the phase locking condition between the corresponding phases, $|n\phi_1(t) - m\phi_2(t)| \le \text{constant}$, holds and the amplitudes of both systems remain uncorrelated.

Let us note that such a phenomenon occurs when a zero Lyapunov exponent of the response system becomes negative, while, as explained above, identical chaotic systems synchronize by following the same chaotic trajectory, when the largest transverse Lyapunov exponent of the synchronized manifold decreases from positive to negative values.

Moreover, following the definition above, this phenomenon is best observed when a well-defined phase variable can be identified in both coupled systems. This can be done for strange attractors that spiral around a ‘‘hole,’’ or a particular (fixed) point in a two-dimensional projection of the attractor. The typical example is given by the Rössler system, which, for some range of parameters, exhibits a Möbius-

Figure 7 Projections onto the (x–y) plane of typical trajectories of system [19]. (a) $(x_1, y_1)$ projection, that is, a typical trajectory of system [9]; (b) and (c) $(x_2, y_2)$ projections at, respectively, $k = 1$ and $k = 100$.

Figure 8 Generalized synchronization. $(x_1, y_1, x_2)$ projections of typical trajectories of system [19] after transients die out, with (a) $k = 1$, (b) $k = 20$, (c) $k = 100$, and (d) $k = 200$. For the last value, the attractor lies in the set $S$, three-dimensional projections of which are paraboloids.
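
The approach to the set $S$ shown in Figure 8 can be quantified by measuring how far a simulated state is from $S$. The following is a minimal sketch, not part of the original article: the function name `manifold_residual` and the sample states are hypothetical, and only the three defining relations of $S$ are taken from the text.

```python
def manifold_residual(state):
    """Largest absolute violation of the three relations defining S.

    state = (x1, y1, z1, x2, y2, z2); the result is 0 exactly on S.
    """
    x1, y1, z1, x2, y2, z2 = state
    return max(
        abs(x2 - (x1 ** 2 + y1 ** 2)),
        abs(y2 - (y1 ** 2 + z1 ** 2)),
        abs(z2 - (x1 ** 2 + z1 ** 2)),
    )


# Hypothetical sample states: the first satisfies all three relations,
# the second keeps the same drive state but has a zero response state.
on_manifold = (1.0, 2.0, 3.0, 5.0, 13.0, 10.0)
off_manifold = (1.0, 2.0, 3.0, 0.0, 0.0, 0.0)
```

Evaluating this residual along an orbit of [19] after transients would be one way to check numerically whether the orbit settles near $S$ as $k$ grows.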

strip-like chaotic attractor with a central hole. In such a case, a phase angle $\phi(t)$ can be defined that decreases or increases monotonically. For an illustration, we take the following two coupled Rössler oscillators:

\[
\begin{aligned}
\dot{x}_1 &= -\omega_1 y_1 - z_1 + k (x_2 - x_1) \\
\dot{y}_1 &= \omega_1 x_1 + 0.17\, y_1 \\
\dot{z}_1 &= 0.2 + z_1 (x_1 - 9.0) \\
\dot{x}_2 &= -\omega_2 y_2 - z_2 + k (x_1 - x_2) \\
\dot{y}_2 &= \omega_2 x_2 + 0.17\, y_2 \\
\dot{z}_2 &= 0.2 + z_2 (x_2 - 9.0)
\end{aligned}
\tag{20}
\]

with a small parameter mismatch $\omega_{1,2} = 0.95 \pm 0.04$; $k$ governs the strength of coupling.

If we can define a Poincaré section surface for the system, then, for each piece of a trajectory between two cross sections with this surface, we define the phase, as done in Pikovsky et al. (2001), as a piecewise linear function of time, so that the phase increment is $2\pi$ at each rotation:

\[ \phi(t) = 2\pi \frac{t - t_n}{t_{n+1} - t_n} + 2\pi n, \qquad t_n \le t \le t_{n+1} \]

where $t_n$ is the time of the $n$th crossing of the secant surface.

In our example, this surface has been chosen as the negative x-axis and is represented by the wide segment in Figure 9a. This definition of phases is clearly ambiguous since it depends on the choice of the Poincaré section; nevertheless, defined in this way, the phase has a physically important property: it corresponds to the direction with the zero Lyapunov exponent in the phase space, so its perturbations neither grow nor decay in time. Figure 9c shows that there is a transition from the nonsynchronous phase regime, where the phase difference increases almost linearly with time ($k = 0.01$ and $k = 0.05$), to a synchronous state, where the relation $|\phi_1(t) - \phi_2(t)| < \text{constant}$ holds ($k = 0.1$), that is, the phase difference does not grow with time. However, the amplitudes are obviously uncorrelated, as seen in Figure 9b. This example shows that PS can take place at a weaker degree of synchronization in chaotic systems. Readers can find a more rigorous mathematical discussion of this subject, and of the definition of phases of chaotic oscillators, in Pikovsky et al. (2001); see also Boccaletti et al. (2002) and references therein.

Other Treatments and Types of Synchronization

Lag Synchronization

PS occurs when non-identical chaotic oscillators are weakly coupled: the phases are locked, while the amplitudes remain uncorrelated. When the coupling strength becomes larger, some relationships between amplitudes may be established. Indeed, it has been shown (Rosenblum et al. 1997), in symmetrically coupled non-identical oscillators and in time-delayed systems, that there exists

Figure 9 (a) Rössler chaotic attractor projection onto the x–y plane. (b) Amplitudes $A_1$ versus $A_2$ for the phase synchronized case at $k = 0.1$. (c) Time series of the phase difference $\phi_1 - \phi_2$ for different coupling strengths $k$ ($k = 0.01$, $k = 0.05$, and $k = 0.1$); for $k = 0.01$ PS is not achieved, while for $k = 0.1$ PS takes place. Although the phases are locked for $k = 0.1$, the amplitudes remain chaotic and uncorrelated.
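
The piecewise-linear phase built from Poincaré-section crossing times, whose difference is plotted in Figure 9c, can be sketched in a few lines. This is an illustrative sketch only: the function names `phase` and `is_phase_locked`, the crossing-time lists, and the bound are assumptions made here, and crossings are indexed from 0 rather than from the article's $n$.

```python
import math
from bisect import bisect_right


def phase(t, crossings):
    """Piecewise-linear phase from sorted crossing times (at least two).

    The phase grows by 2*pi between consecutive crossings, with the
    crossing index counted from 0.
    """
    if not crossings[0] <= t <= crossings[-1]:
        raise ValueError("t lies outside the recorded crossings")
    # Index of the last crossing at or before t, clamped so that the
    # final crossing time is assigned to the last interval.
    n = min(bisect_right(crossings, t) - 1, len(crossings) - 2)
    t_n, t_next = crossings[n], crossings[n + 1]
    return 2 * math.pi * (t - t_n) / (t_next - t_n) + 2 * math.pi * n


def is_phase_locked(times, crossings_1, crossings_2, bound):
    """Check |phi_1(t) - phi_2(t)| < bound at every sample time."""
    return all(
        abs(phase(t, crossings_1) - phase(t, crossings_2)) < bound
        for t in times
    )
```

In practice the crossing times would come from a numerical integration of [20], and the locking check would be applied only after transients have died out.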

a regime of lag synchronization (LS). This process appears as a coincidence of time-shifted states of two systems:

\[ \lim_{t \to +\infty} \| Y(t) - X(t - \tau) \| = 0 \]

where $\tau$ is a positive delay.

Projective Synchronization

In coupled partially linear systems, it was reported by Mainieri and Rehacek (1999) that two identical systems could be synchronized up to a scaling factor. This type of chaotic synchronization is referred to as projective synchronization. Consider, for example, a three-dimensional chaotic system $\dot{X} = F(X)$, where $X = (x, y, z)$. Decompose $X$ into a vector $v = (x, y)$ and a scalar $z$; the system can then be rewritten as

\[ \frac{dv}{dt} = g(v, z), \qquad \frac{dz}{dt} = h(v, z) \]

In projective synchronization, two identical systems $X_1 = (x_1, y_1, z_1)$ (drive) and $X_2 = (x_2, y_2, z_2)$ (response) are coupled through the scalar variable $z$. It occurs if the state vectors $v_1$ and $v_2$ synchronize up to a constant ratio, that is, $\lim_{t \to +\infty} \| \alpha v_1(t) - v_2(t) \| = 0$, where $\alpha$ is called a scaling factor. For partially linear systems, it may automatically occur provided that the systems satisfy some stability conditions.

However, this process could not be classified as GS, even if there exists a linear relation between the coupled systems, because the response system of projective synchronization is not asymptotically stable. For more information about this subject, the reader is referred to Mainieri and Rehacek (1999).

Anticipating Synchronization

It is interesting to mention that a new form of synchronization has recently appeared, the so-called anticipating synchronization (Boccaletti et al. 2002). It shows that some coupled chaotic systems might synchronize such that their response anticipates the drivers by synchronizing with their future states.

It is also interesting to mention the nonlinear $H_\infty$ synchronization method for nonautonomous schemes introduced by Suykens et al. (1997).

Spatio-Temporal Synchronization

Low-dimensional systems have rather limited usefulness in modeling real-world applications. This is why the synchronization of chaos has been carried

out in high dimensions (see Kocarev et al. (1997) for a review). See also Chen and Dong (2001) for a discussion of special high-dimensional systems, namely large arrays of coupled chaotic systems.

Application to Transmission Systems and Secure Communication

Synchronization principles are useful in practical applications. Use of chaotic signals to transmit information has been a very active research topic in the last decade. Thus, it has been established that chaotic circuits may be used to transmit information by synchronization. As a result, several proposals for secure-communication schemes have been advanced (see, e.g., Cuomo et al. (1993), Hasler (1998), and Parlitz et al. (1999)). The first laboratory demonstration of a secure-communication system, which uses a chaotic signal for masking purposes, and which exploits the chaotic synchronization techniques to recover the signal, was reported by Kocarev et al. (1992).

It is difficult, within the scope of this article, to give a complete or detailed discussion, and it should be noted that there exist many competing and tested methods that are well established.

The main idea of the communication schemes is to encode a message by means of a chaotic dynamical system (the transmitter), and to decode it using a second dynamical system (the receiver) that synchronizes with the first. In general, secure-communication applications assume additionally that the coupled systems used are identical.

Different methods can be used to hide the useful information, for example, chaotic masking, chaotic switching, or direct chaotic modulation (Hasler 1998). For instance, in the chaotic masking method, an analog information-carrying signal $s(t)$ is added to the output $y(t)$ of the chaotic system in the transmitter. The receiver tries to synchronize with the component $y(t)$ of the transmitted signal $s(t) + y(t)$. If synchronization takes place, the information signal can be retrieved by subtraction (Figure 10).

Figure 10 A typical communication setup: the information signal $s(t)$ enters the chaotic transmitter, the chaotic transmitted signal $y(t) + s(t)$ reaches the receiver, and the receiver outputs the retrieved information signal $\hat{s}(t)$.

It is interesting to note that, in all proposed schemes for secure communications using the idea of synchronization (experimental realization or computer simulation), there is an inevitable noise degrading the fidelity of the original message. Robustness to parameter mismatch was addressed by many authors (Illing et al. 2002). Lozi et al. (1993) showed that, by connecting two identical receivers in cascade, a significant amount of the noise can be reduced, thereby allowing the recovery of a much higher quality signal.

Furthermore, different implementations of chaotic secure communication have been proposed during the last decades, as well as methods for cracking this encoding. The methods used to crack such a chaotic encoding make use of the low dimensionality of the chaotic attractors. Indeed, since the properties of low-dimensional chaotic systems with one positive Lyapunov exponent can be reconstructed by analyzing the signal, such as through the delay-time reconstruction methods, it seems unlikely that these systems might provide a secure encryption method. The hidden message can often be retrieved easily by an eavesdropper without using the receiver. However, chaotic masking and encoding are difficult to break with state-of-the-art analysis tools if sufficiently high-dimensional chaos generators with multiple positive Lyapunov exponents (i.e., hyperchaotic systems) are used (see Pecora et al. (1997), and references therein).

Conclusion

In spite of the essential progress in theoretical and experimental studies, synchronization of chaotic systems continues to be a topic of active investigation and will certainly continue to have a broad impact in the future. The theory of synchronization remains a challenging problem of nonlinear science.

See also: Bifurcations of Periodic Orbits; Chaos and Attractors; Fractal Dimensions in Dynamics; Generic Properties of Dynamical Systems; Isochronous Systems; Lyapunov Exponents and Strange Attractors; Singularity and Bifurcation Theory; Stability Theory and KAM; Weakly Coupled Oscillators.

Further Reading

Afraimovich V, Chazottes JR, and Cordonet A (2001) Synchronization in directionally coupled systems: some rigorous results. Discrete and Continuous Dynamical Systems B 1(4): 421–442.
Afraimovich V, Verichev N, and Rabinovich MI (1986) Stochastic synchronization of oscillations in dissipative systems. Radiophysics and Quantum Electronics 29: 795–803.
Boccaletti S, Kurths J, Osipov G, Valladares D, and Zhou C (2002) The synchronization of chaotic systems. Physics Reports 366: 1–101.
Chen G and Dong X (1998) From Chaos to Order. Singapore: World Scientific.

Cuomo K, Oppenheim A, and Strogatz S (1993) Synchronization of Lorenz-based chaotic circuits with applications to communications. IEEE Transactions on Circuits and Systems – II: Analog and Digital Signal Processing 40(10): 626–633.
Derivière S and Aziz-Alaoui MA (2003) Estimation of attractors and synchronization of generalized Lorenz systems. Dynamics of Continuous, Discrete and Impulsive Systems Series B: Applications and Algorithms 10(6): 833–852.
Hasler M (1998) Synchronization of chaotic systems and transmission of information. International Journal of Bifurcation and Chaos 8(4): 647–659.
Huygens Ch (Hugenii) (1673) Horologium Oscillatorium (English translation: 1986 The Pendulum Clock. Ames: Iowa State University Press). Parisiis, France: Apud F. Muguet.
Illing L (2002) Chaos Synchronization and Communications in Semiconductor Lasers. Ph.D. dissertation. San Diego: University of California.
Illing L, Brocker J, Kocarev L, Parlitz U, and Abarbanel H (2002) When are synchronization errors small? Physical Review E 66: 036229.
Josic K (2000) Synchronization of chaotic systems and invariant manifolds. Nonlinearity 13: 1321–1336.
Kocarev Lj, Halle K, Eckert K, and Chua LO (1992) Experimental demonstration of secure communication via chaotic synchronization. International Journal of Bifurcation and Chaos 2(3): 709–713.
Kocarev Lj, Tasev Z, Stojanovski T, and Parlitz U (1997) Synchronizing spatiotemporal chaos. Chaos 7(4): 635–643.
Lozi R and Chua LO (1993) Secure communications via chaotic synchronization II: noise reduction by cascading two identical receivers. International Journal of Bifurcation and Chaos 3(5): 1319–1325.
Mainieri R and Rehacek J (1999) Projective synchronization in three-dimensional chaotic systems. Physical Review Letters 82: 3042–3045.
Ott E, Grebogi C, and Yorke JA (1990) Controlling chaos. Physical Review Letters 64: 1196–1199.
Parlitz U and Kocarev L (1999) Synchronization of chaotic systems. In: Schuster HG (ed.) Handbook of Chaos Control, pp. 271–303. Germany: Wiley-VCH.
Pecora L and Carroll T (1990) Synchronization in chaotic systems. Physical Review Letters 64: 821–824.
Pecora L and Carroll T (1991) Driving systems with chaotic signals. Physical Review A 44: 2374–2383.
Pecora L, Carroll T, Johnson G, and Mar D (1997) Fundamentals of synchronization in chaotic systems, concepts and applications. Chaos 7(4): 520–543.
Pikovsky A, Rosenblum M, and Kurths J (2001) Synchronization, A Universal Concept in Nonlinear Science. Cambridge: Cambridge University Press.
Rosenblum M, Pikovsky A, and Kurths J (1997) From phase to lag synchronization in coupled chaotic oscillators. Physical Review Letters 78: 4193–4196.
Rulkov N, Sushchik M, Tsimring L, and Abarbanel H (1995) Generalized synchronization of chaos in directionally coupled chaotic systems. Physical Review E 51(2): 980–994.
Suykens J, Vandewalle J, and Chua LO (1997) Nonlinear H∞ synchronization of chaotic Lur'e systems. International Journal of Bifurcation and Chaos 7(6): 1323–1335.
Wu CW (2002) Synchronization in Coupled Chaotic Circuits and Systems. Series on Nonlinear Science, Series A, vol. 41. Singapore: World Scientific.
Yamada T and Fujisaka H (1983) Stability theory of synchronized motion in coupled-oscillator systems. Progress of Theoretical Physics 70: 1240–1248.
T
't Hooft–Polyakov Monopoles see Solitons and Other Extended Field Configurations

Thermal Quantum Field Theory


C D Jäkel, Ludwig-Maximilians-Universität München, München, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Quantum field theory was initially invented in order to describe high-energy elementary particles, thereby unifying quantum mechanics and special relativity. In other words, quantum field theory was addressed to the so-called vacuum sector, that is, roughly speaking, physics at zero temperature and zero particle density.

The same applies to the various mathematically rigorous versions of quantum field theory that have been developed since the mid-1950s. Indeed, in Wightman's axiomatic setting, quantum field theory is described in terms of a set of the so-called vacuum expectation values. The “algebraic approach” to quantum field theory developed by Araki, Haag, Kastler, and their collaborators is more flexible in nature. In fact, right from the beginning, the new algebraic tools were successfully applied to lattice models and other nonrelativistic systems with infinitely many degrees of freedom (see Operator Algebras and Quantum Statistical Mechanics by O Bratteli and D W Robinson). But the need to treat large systems of relativistic particles was apparently not felt. Even in Haag's recent monograph, Local Quantum Physics, the subjects of algebraic quantum field theory and algebraic quantum statistical mechanics are treated separately.

It is remarkable that constructive field theory was ahead of its time in this respect. The famous $P(\phi)_2$ model (first constructed by Glimm and Jaffe) was adapted to thermal states by Høegh-Krohn as early as 1974 (see Høegh-Krohn (1974)). His paper was properly named “Relativistic quantum statistical mechanics in two-dimensional space-time,” but only recently has it received proper attention.

At the same time, around 1974, cosmology and heavy-ion collisions drew the interest of physicists towards the quantum statistical mechanics of hot relativistic quantum systems. Well-known papers from this early stage include those by Weinberg, Bernard, and Dolan and Jackiw. While most of the papers used Euclidean path integrals, Umezawa and his school developed a real-time framework called “thermo-field dynamics,” which involved a doubling of the degrees of freedom. The excellent review by Landsman and van Weert (1987) covers these early attempts; it also explains the basic connection to the algebraic approach.

In the following years, it became evident that statistical mechanics (in its standard formulation) is barely sufficient to derive the properties of bulk matter from the underlying microscopic description provided by quantum field theory. Thus, various people began to establish mathematically rigorous foundations for the description of thermal field theory. The most successful approach was launched by D Buchholz (with various collaborators), who, from about 1985 onwards, started applying the KMS condition (which describes a thermal equilibrium state in the operator-algebraic framework of local quantum physics) to relativistic quantum field theory. In 1994, Buchholz and Bros managed to integrate the holomorphic structure of Wightman field theory into Haag's operator-algebraic framework, which led them to the notion of a relativistic KMS condition.

The advanced mathematical concepts involved in the formulation of entropy densities for thermal quantum fields (see Narnhofer (1994)) do not allow us to present this topic. The reader is referred to the excellent book Quantum Entropy and Its Use by M Ohya and D Petz for an introduction to the subject. A discussion of the so-called thermalization

effects that occur as a result of a curved spacetime is provided in Quantum Field Theory in Curved Spacetime. Another subject, which is missing almost completely, is perturbation theory. This subject has been covered extensively in three well-known textbooks by Kapusta, Le Bellac, and Umezawa.

Observables and States

Following Heisenberg, we start from the basic assumption that quantum theory can be formulated in terms of observables which form an algebra $\mathcal{A}$, that is, a vector space with a (noncommutative) multiplication law. Although our emphasis on the abstract algebraic structure may look strange, there is a profound reason for starting out with an abstract algebra of observables: as soon as one considers systems with infinitely many degrees of freedom, one encounters a possibility to realize the abstract elements of the algebra $\mathcal{A}$ as operators on a Hilbert space in various inequivalent ways. The famous equivalence between the Heisenberg and the Schrödinger picture simply breaks down. States which are macroscopically different (e.g., thermal equilibrium states for different temperatures) give rise, in a natural way which will be discussed in the sequel, to unitarily inequivalent representations of the abstract algebra of observables $\mathcal{A}$, while states which only differ microscopically can be accommodated by density matrices within the same Hilbert space. In other words, a physical state is described macroscopically by specifying a representation, and microscopically by a density matrix in this representation.

In a Lagrangian approach, the algebra of observables $\mathcal{A}$ may be thought of as being generated by the underlying fields, currents, etc. This leads to the so-called polynomial algebras. It is mathematically convenient to assume that $\mathcal{A}$ is an algebra of bounded operators, generated by the bounded functions of the underlying quantum fields. If $\phi(x)$ is any such field and if $f \in \mathcal{S}(\mathbb{R}^{d+1})$ is any real test function with support in a bounded region of spacetime, then the corresponding operator

\[ W(f) = \exp\left( i \int dx\, f(x)\, \phi(x) \right) \]

would be a typical element of $\mathcal{A}$. The set of operators $\{ W(f) \mid \operatorname{supp} f \subset \mathcal{O} \}$ will generate a subalgebra $\mathcal{A}(\mathcal{O})$ of $\mathcal{A}$. The underlying fields can be recovered by taking (functional) derivatives, once a representation of $\mathcal{A}$ on a Hilbert space is specified.

The spacetime symmetry of Minkowski space manifests itself in the existence of a representation

\[ \alpha : (\Lambda, x) \mapsto \alpha_{\Lambda, x} \in \operatorname{Aut}(\mathcal{A}), \qquad (\Lambda, x) \in \mathcal{P}_+^\uparrow \]

of the (orthochronous) Poincaré group $\mathcal{P}_+^\uparrow$. Here $\alpha_{\Lambda, x}$ is an automorphism of $\mathcal{A}$, that is, a mapping from $\mathcal{A}$ to $\mathcal{A}$ which preserves the algebraic structure. Once a Lorentz frame is fixed by choosing a timelike vector $e \in V_+$, the time evolution $t \mapsto \alpha_{1, te}$ will be denoted by $t \mapsto \tau_t$.

For the free field, the group of automorphisms $(\Lambda, x) \mapsto \alpha_{\Lambda, x}$ is defined by

\[ \alpha_{\Lambda, x}(W(f)) := W\left( f(\Lambda^{-1}(\,\cdot\, - x)) \right) \]

As before, $f \in \mathcal{S}(\mathbb{R}^{d+1})$ is a Schwartz function over the Minkowski space $\mathbb{R}^{d+1}$.

While the invariance of the equations of motion is reflected in the existence of a representation of the Poincaré group in terms of automorphisms in the Heisenberg picture, at least the invariance with respect to Lorentz boosts is spontaneously broken in the Schrödinger picture for a thermal equilibrium state.

The usual notions of vector states and density matrices associated with a given Hilbert space (usually Fock space) are a priori not general enough to cover all cases of interest in thermal field theory. The following algebraic definition of a state substantially generalizes the notion of a state: A state $\omega$ is a positive, linear, and normalized functional, that is, a linear map $\omega : \mathcal{A} \to \mathbb{C}$ such that

\[ \omega(a^* a) \ge 0 \quad \text{and} \quad \omega(1) = 1 \]

Once a state $\omega$ is distinguished on physical grounds, the GNS reconstruction theorem provides a Hilbert space $\mathcal{H}_\omega$ and a representation $\pi_\omega$ of $\mathcal{A}$, that is, a map from $\mathcal{A}$ to the set of bounded operators $\mathcal{B}(\mathcal{H}_\omega)$, which preserves the algebraic relations.

It is instructive to consider the GNS representation of the Pauli matrices $\{ \sigma_0 = 1, \sigma_1, \sigma_2, \sigma_3 \}$. Given a state (a diagonal $2 \times 2$ matrix $\rho$ with positive entries and $\operatorname{tr} \rho = 1$), the left regular representation (a construction well known from group theory)

\[ \pi(\sigma_i)\, | \sqrt{\rho} \rangle = | \sigma_i \sqrt{\rho} \rangle, \qquad i = 0, 1, 2, 3 \]

defines a reducible representation on $\mathbb{C}^4$, unless one of the entries in the diagonal of $\rho$ is zero (which corresponds to a pure state). In the latter case, the GNS Hilbert space is $\mathbb{C}^2$. By construction, $\langle \sqrt{\rho} \,|\, \pi(\sigma_i) \,|\, \sqrt{\rho} \rangle = \operatorname{tr} \rho \sigma_i$, $i = 1, 2, 3$.

Thermal Equilibrium

The variety of nonequilibrium states ranges from mild perturbations of equilibrium states through steady states, whose properties are governed by external heat baths, or hydrodynamic flows up to totally chaotic states which no longer

admit a description in terms of thermodynamic notions. Buchholz et al. (2002) have initiated an investigation of nonequilibrium states that are locally (but not globally) close to thermal equilibrium. Unfortunately, we will not be able to cover this topic. Instead, we will concentrate on states which deviate from a true equilibrium state only microscopically.

Characterization of Thermal Equilibrium States

When the time evolution $t \mapsto \tau_t \in \operatorname{Aut}(\mathcal{A})$ is changed by a local perturbation, which is slowly switched on and slowly switched off again, then an equilibrium state $\omega$ returns to its original form at the end of this procedure. This heuristic condition of adiabatic invariance can be expressed by the stability requirement

\[ \lim_{T \to \infty} \int_{-T}^{T} dt\; \omega([a, \tau_t(b)]) = 0 \qquad \forall a, b \in \mathcal{A} \tag{1} \]

In a pioneering work, Haag, Kastler, and Trych-Pohlmeyer showed that the characterization [1] of an equilibrium state leads to a sharp mathematical criterion, first encountered by Haag, Hugenholtz, and Winnink and more implicitly by Kubo, Martin, and Schwinger:

Definition 1 A state $\omega_\beta$ over $\mathcal{A}$ is called a KMS state for some $\beta > 0$, if for all $a, b \in \mathcal{A}$, there exists a function $F_{a,b}$ which is continuous in the strip $0 \le \Im z \le \beta$ and analytic and bounded in the open strip $0 < \Im z < \beta$, with boundary values given by

\[ F_{a,b}(t) = \omega_\beta(a\, \tau_t(b)) \quad \text{and} \quad F_{a,b}(t + i\beta) = \omega_\beta(\tau_t(b)\, a) \qquad \forall t \in \mathbb{R} \tag{2} \]

Before we start analyzing the properties of KMS states, we should mention an alternative characterization of thermal equilibrium states: passivity. The amount of work a cycle can perform when applied to a moving thermodynamic equilibrium state is bounded by the amount of work an ideal windmill or turbine could perform; this property is called semipassivity (Kuckert 2002): a state $\omega$ is called semipassive (passive) if there is an “efficiency bound” $E \ge 0$ ($E = 0$) such that

\[ -(W \Omega_\omega, H_\omega W \Omega_\omega) \le E \cdot (W \Omega_\omega, |P_\omega| W \Omega_\omega) \qquad \forall W \in \pi_\omega(\mathcal{A})'' \]

with $W^{-1} = W^*$, $[H_\omega, W] \in \pi_\omega(\mathcal{A})''$, and $[P_\omega, W] \in \pi_\omega(\mathcal{A})''$. Here $(H_\omega, P_\omega)$ denote the generators implementing the spacetime translations in the GNS representation $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$. Generalizing the notion of complete passivity, the state $\omega$ is called completely semipassive if all its finite tensorial powers are semipassive with respect to one fixed efficiency bound $E$. It has been shown by Kuckert (2002) that a state is completely semipassive in all inertial frames if and only if it is completely passive in some inertial frame. The latter implies that $\omega$ is a KMS state or a ground state (a result due to Pusz and Woronowicz).

Let us now turn to properties of thermal equilibrium states which are specific for relativistic models. It was first recognized by Bros and Buchholz (1994) that KMS states of a relativistic theory have stronger analyticity properties in configuration space than those imposed by the traditional KMS condition:

Definition 2 A KMS state $\omega_\beta$ satisfies the relativistic KMS condition (Bros and Buchholz 1994) if there exists a unit vector $e$ in the forward light cone $V_+$ such that for every pair of local elements $a, b$ of $\mathcal{A}$ the function $F_{a,b}$

\[ F_{a,b}(x_1, x_2) = \omega_\beta(\alpha_{x_1}(a)\, \alpha_{x_2}(b)) \]

extends to an analytic function in the tube domain $-\mathcal{T}_{\beta e/2} \times \mathcal{T}_{\beta e/2}$, where $\mathcal{T}_{\beta e/2} = \{ z \in \mathbb{C}^{d+1} \mid \Im z \in V_+ \cap (\beta e/2 - V_+) \}$.

The relativistic KMS condition can be understood as a remnant of the relativistic spectrum condition in the vacuum sector. It has been rigorously established (Bros and Buchholz 1994) for the KMS states constructed by Buchholz and Junglas (1989) and by C Gérard and the author for the $P(\phi)_2$ model. In the thermal Wightman framework (Bros and Buchholz 1996) it has been shown that the relativistic KMS condition implies existence of model-independent analyticity properties of thermal n-point functions. These properties also appear in perturbative computations of the thermal Wightman functions (Steinmann 1995).

We now turn to the properties of the set of KMS states. For given $\beta$, the convex set $S_\beta$ of all KMS states is known to form a simplex; the extreme points in the set $S_\beta$ are called extremal KMS states. As a consequence, the extremal states in $S_\beta$ can be distinguished with the help of “classical” (central) observables, that is, by observables which commute with all other observables.

If $\omega_\beta$ is an extremal KMS state and $\gamma$ is an automorphism which commutes with the time evolution $t \mapsto \tau_t$, then the state $\omega'_\beta$ defined by

\[ \omega'_\beta(a) := \omega_\beta(\gamma(a)), \qquad a \in \mathcal{A} \]

is again an extremal KMS state to the same parameter values. If $\omega'_\beta \ne \omega_\beta$, one says that the symmetry $\gamma$ is spontaneously broken.
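
The KMS boundary condition [2] of Definition 1 can be checked by hand in the simplest possible setting, a two-level system in a Gibbs state. The sketch below is an illustration under assumptions that are not in the article: the Hamiltonian is a diagonal $2 \times 2$ matrix with hypothetical energies, the state is $\omega_\beta(a) = \operatorname{tr}(e^{-\beta H} a)/\operatorname{tr}(e^{-\beta H})$, the time evolution is $\tau_z(b) = e^{izH} b\, e^{-izH}$, and all function names are invented for the example.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def evolve(b, energies, z):
    """tau_z(b) = exp(izH) b exp(-izH) for diagonal H; z may be complex."""
    return [[cmath.exp(1j * z * (energies[i] - energies[j])) * b[i][j]
             for j in range(2)] for i in range(2)]


def gibbs_expectation(a, energies, beta):
    """omega_beta(a) = tr(exp(-beta H) a) / tr(exp(-beta H))."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(w * a[i][i] for i, w in enumerate(weights)) / sum(weights)


def kms_defect(a, b, energies, beta, t):
    """|F(t + i beta) - omega(tau_t(b) a)| with F(z) = omega(a tau_z(b))."""
    lhs = gibbs_expectation(
        matmul(a, evolve(b, energies, t + 1j * beta)), energies, beta)
    rhs = gibbs_expectation(
        matmul(evolve(b, energies, t), a), energies, beta)
    return abs(lhs - rhs)


# Hypothetical observables (two Pauli matrices) and energy levels.
sigma_1 = [[0, 1], [1, 0]]
sigma_2 = [[0, -1j], [1j, 0]]
defect = kms_defect(sigma_1, sigma_2, [0.0, 1.0], beta=2.0, t=0.7)
```

For a finite system the defect vanishes up to rounding, because the factor $e^{-\beta H}$ from the imaginary time shift can be cycled through the trace; the content of the KMS condition is that this analytic structure survives in the thermodynamic limit, where the trace formula itself no longer makes sense.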

Lorentz invariance with respect to boosts is always broken by a KMS state, since the KMS condition distinguishes a rest frame. A KMS state might also break spatial translation or rotation invariance. However, by averaging over the different configurations one can usually construct a translation- and rotation-invariant state. The situation is drastically different with respect to supersymmetry. Buchholz and Ojima (1997) have shown that supersymmetry is broken in any thermal state and it is impossible to proceed from it by “symmetrization” to states on which an action of supercharges can be defined.

Existence of Thermal Equilibrium States

Buchholz and Junglas (1989) demonstrated that the existence of KMS states can be guaranteed for a large class of quantum field-theoretic models. The basic assumption to be met concerns the phase-space properties of the model. A generalized trace norm (the so-called “nuclear norm”) is used to estimate the “number” of degrees of freedom in phase space.

The first step is to construct a subspace $\mathcal{H}(\Lambda)$ of the vacuum Hilbert space $\mathcal{H}_{\text{vac.}}$, which represents excitations of the vacuum strictly localized inside of a bounded spacetime region $\hat{\mathcal{O}}$. Due to the strong correlations present in the vacuum state of any relativistic model, as a consequence of the Reeh–Schlieder property (see the section “Analyticity of n-point functions”), this is a delicate procedure, which involves the so-called “split property.” This property ensures that there exists a product vector $\eta$ in the vacuum Hilbert space $\mathcal{H}_{\text{vac.}}$ such that

\[ (\eta, \pi_{\text{vac.}}(ab)\, \eta) = \omega_{\text{vac.}}(a)\, \omega_{\text{vac.}}(b) \qquad \forall a \in \mathcal{A}(\mathcal{O}),\; b \in \mathcal{A}(\hat{\mathcal{O}})^c \tag{3} \]

Here $\mathcal{O} \subset \hat{\mathcal{O}}$ denotes a slightly smaller open spacetime region (such that the closure $\bar{\mathcal{O}}$ is inside the interior of $\hat{\mathcal{O}}$) and $\mathcal{A}(\hat{\mathcal{O}})^c := \{ A \in \mathcal{A} \mid [A, B] = 0 \;\forall B \in \mathcal{A}(\hat{\mathcal{O}}) \}$. The existence of a product vector can be ensured if the nuclear norm satisfies some mild bounds which are expected to hold in all models of physical interest. Given a product vector $\eta$ which satisfies [3], the sought after subspace is

\[ \mathcal{H}(\Lambda) := \pi_{\text{vac.}}(\mathcal{A}(\mathcal{O}))''\, \eta \]

The crucial step in the proof of existence of KMS states is to show that

\[ \operatorname{tr} E(\Lambda)\, e^{-\beta H} E(\Lambda) < \infty \quad \text{for } \beta > 0 \]

if the nuclearity condition holds. Here $E(\Lambda)$ denotes the projection onto the subspace $\mathcal{H}(\Lambda)$ representing localized excitations and $H$ denotes the Hamiltonian in the vacuum representation $\pi_{\text{vac.}}$. Next it is shown that the function

\[ t \mapsto \omega_{\Lambda, \beta}(a\, \tau_t(b)) = \frac{1}{Z} \operatorname{tr} E(\Lambda)\, e^{-\beta H} E(\Lambda)\, \pi_{\text{vac.}}(a\, \tau_t(b)) \]

allows an analytic extension to a strip of width $\beta$ which satisfies the KMS boundary condition [2] for $|t| < \delta$ if $a, b \in \mathcal{A}(\mathcal{O}_\circ)$ and $\mathcal{O}_\circ + te \subset \mathcal{O}$ for $|t| < \delta$. In the final step, Buchholz and Junglas were able to demonstrate that bounds on the nuclear norm are even sufficient to control the thermodynamic limit.

Given a thermal field theory, a slight variation of the method used by Buchholz and Junglas allows one to construct a KMS state for a new temperature (Jäkel 2004), that is, to change the temperature of a thermal state.

Thermal Representations

Given a KMS state $\omega_\beta$, the GNS construction gives rise to a Hilbert space $\mathcal{H}_\beta$ and a representation $\pi_\beta$, called a thermal representation, of $\mathcal{A}$. The algebra $\mathcal{R}_\beta := \pi_\beta(\mathcal{A})''$ possesses a cyclic (due to the GNS construction) and separating (due to the KMS condition) vector $\Omega_\beta$ such that

\[ \omega_\beta(a) = (\Omega_\beta, \pi_\beta(a)\, \Omega_\beta) \qquad \forall a \in \mathcal{A} \]

The KMS condition implies that $\omega_\beta$ is invariant under time translations, that is, $\omega_\beta \circ \tau_t = \omega_\beta$ for all $t \in \mathbb{R}$. Thus,

\[ U(t)\, \pi_\beta(a)\, \Omega_\beta = \pi_\beta(\tau_t(a))\, \Omega_\beta, \qquad a \in \mathcal{A} \]

defines a strongly continuous unitary group $\{ U(t) \}_{t \in \mathbb{R}}$ implementing the time evolution in the representation $\pi_\beta$. By Stone's theorem there exists a self-adjoint generator $L_\beta$ such that

\[ U(t) = e^{i L_\beta t}, \qquad t \in \mathbb{R} \tag{4} \]

For $0 \le \beta < \infty$, the Liouville operator $L_\beta$ is not bounded from below; its spectrum is symmetric and consists typically of the whole real line. However, the negative part of $L_\beta$ is “suppressed” with respect to the algebra of observables $\mathcal{R}_\beta := \pi_\beta(\mathcal{A})''$ in the following sense (Haag 1992): let $1_{]-\infty, -\lambda]}$ be the spectral projection of $L_\beta$ for the interval $]-\infty, -\lambda] \cap \operatorname{Sp}(L_\beta)$, then

\[ \| 1_{]-\infty, -\lambda]}\, A\, \Omega_\beta \| \le e^{-\beta \lambda / 2}\, \| A \| \qquad \forall A \in \mathcal{R}_\beta \]

We now turn to structural aspects which are characteristic for a relativistic model, namely the existence of strong spatial correlations and the connection between the decay of these correlations and the spectral properties of the Liouville operator.
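
The symmetry of the spectrum of the Liouville operator can be made concrete in finite dimensions. The sketch below rests on an assumption that goes beyond the article: for a Gibbs state of a finite system with energy levels $E_i$, the GNS space can be realized as the space of matrices with the Hilbert–Schmidt inner product, where the generator of time translations acts as the commutator with $H$, so its eigenvalues are the differences $E_i - E_j$. The function name and the energy levels are hypothetical.

```python
def liouvillean_spectrum(energies):
    """Eigenvalues E_i - E_j of the commutator with a diagonal Hamiltonian."""
    return sorted(e_i - e_j for e_i in energies for e_j in energies)


# Hypothetical energy levels of a three-level system.
levels = [0.0, 1.0, 2.5]
spectrum = liouvillean_spectrum(levels)
mirrored = sorted(-value for value in spectrum)
zero_count = sum(1 for value in spectrum if value == 0)
```

The list is invariant under a change of sign and contains the eigenvalue zero once per energy level, so the generator is unbounded below as soon as the Hamiltonian is unbounded above. This is the finite-dimensional shadow of the statement that $L_\beta$ has symmetric spectrum, while the thermal weights suppress the contribution of its negative part on vectors $A \Omega_\beta$.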

Let $\omega_\beta$ be a state which satisfies the relativistic KMS condition. It follows (using a theorem of Glaser) that for $a \in \mathcal{A}$ the function $\Phi_a : \mathbb{R}^4 \to \mathcal{H}_\beta$,

\[ x \mapsto \pi_\beta(\alpha_x(a))\, \Omega_\beta \]

can be analytically continued from the real axis into the domain $\mathcal{T}_{\beta e/2}$ such that it is weakly continuous for $\Im z \searrow 0$. If the usual additivity assumption $\cup_i \mathcal{O}_i = \mathcal{O} \Rightarrow \vee_i \mathcal{R}_\beta(\mathcal{O}_i) = \mathcal{R}_\beta(\mathcal{O})$ for the local von Neumann algebras holds, then

\[ \mathcal{H}_\beta = \overline{\pi_\beta(\mathcal{A}(\mathcal{O}))\, \Omega_\beta} \tag{5} \]

for any open spacetime region $\mathcal{O} \subset \mathbb{R}^{d+1}$. Junglas has shown that the thermal Reeh–Schlieder property [5] follows as well from the standard KMS condition, if $\omega_\beta$ is locally normal with respect to the vacuum representation.

The decay of spatial correlations depends on infrared properties of the model, and the essential ingredients for the following cluster theorem are the continuity properties of the spectrum of $L_\beta$ near zero.

Theorem 3 Let $\Omega_\beta$ denote the unique (up to a phase) normalized eigenvector with eigenvalue $\{0\}$ of the Liouvillean $L_\beta$ and let $P^+$ denote the projection onto the strictly positive part of the spectrum of $L_\beta$. Assume that there exist positive constants $m > 0$ and $C_1(\mathcal{O}) > 0$ such that

\[ \| e^{-\lambda L_\beta} P^+ \pi_\beta(a)\, \Omega_\beta \| \le C_1(\mathcal{O})\, \lambda^{-m}\, \| a \| \qquad \forall a \in \mathcal{A}(\mathcal{O}) \]

Here $\mathcal{O} \subset \mathbb{R}^{d+1}$ is an open and bounded spacetime region. Now consider two spacelike separated spacetime regions $\mathcal{O}_1$, $\mathcal{O}_2$, which can be embedded

conjugation) and a self-adjoint operator $\Delta^{1/2}$. The connection to physics was established independently by Takesaki and Winnink, showing that the pair $(\mathcal{R}_\beta, \Omega_\beta)$ satisfies the KMS condition for $\beta = -1$, if one sets $\tau_t(A) = \Delta^{it} A\, \Delta^{-it}$ for $A \in \mathcal{R}_\beta$.

Taking advantage of the Reeh–Schlieder property [5], one can associate modular objects to certain spacetime regions $\mathcal{O}$. In general, a physical interpretation of these modular objects is missing. But for two-dimensional thermal models, which factorize in light-cone coordinates, the modular group corresponding to the algebra of a spacelike wedge admits a simple description: at large distances (compared to $\beta$) from the boundary, the flow pattern is essentially the same as time translations. These are results due to Borchers and Yngvason (1999).

Analyticity Properties of n-Point Functions

The correlation functions describe the full physical content of the theory: all observable quantities can in principle be derived from them. This is so because according to the Wightman reconstruction theorem (which is closely related to the GNS construction) knowledge of the correlation functions allows the reconstruction of the full representation of the field algebra. The Wightman distributions $\{ W_\beta^{(n)} \}_{n \in \mathbb{N}}$,

\[ W_\beta^{(n)}(t_2 - t_1, x_2 - x_1; \ldots; t_n - t_{n-1}, x_n - x_{n-1}) := (\Omega_\beta, \phi_\beta(t_1, x_1) \cdots \phi_\beta(t_n, x_n)\, \Omega_\beta) \tag{6} \]

where $\pi_\beta(W(f)) =: \exp\left( i \int dt\, dx\, f(t, x)\, \phi_\beta(t, x) \right)$, satisfy a number of key properties: locality, positivity, Poincaré covariance, and temperedness. These prop-
into O by translation and such that O1 þ e  erties have been formulated for thermal field by Bros
O02 , >> . then, for a 2 A(O1 ) and b 2 A(O2 ), and Buchholz (1996), and this section is entirely
based on their work.
j! ðbaÞ  ! ðbÞ! ðaÞj  C2 2m kak kbk The relativistic KMS condition implies that the
Wightman distributions {W (n)  }n2N of a translation-
The constant C2 (, O) 2 Rþ may depend on the
invariant equilibrium state admit in the correspond-
temperature 1 and the size of the region O but is
ing set of spacetime variables (t2  t1 , x2  x1 ), . . . ,
independent of , a, and b.
(tn  tn1 , xn  xn1 ) an analytic continuation into
From explicit calculations one expects that the union of domains
m = 1=2 for free massless bosons in 3 þ 1 spacetime
ð1 T e Þ   ðn1 T e Þ
dimensions. Consequently, the exponent given on
Pn1
the right-hand side is optimal since it is well known for i > 0, i = 1, . . . , n  1 and i = 1 i = 1. The
that in this case the correlations decay only like 1 . tube domains T e were specified in Definition 2.
A description of thermal representations would be For  ! 1, the tube T e tends to the vacuum tube
inadequate without pointing out one of the deepest T vac. = Rdþ1 þ iVþ ; thus, one recovers the spectrum
connections between pure mathematics and physics condition for the vacuum expectation values.
that emerged in the last century: consider a von Let us now turn to the Fourier transformed
Neumann algebra R which possesses a cyclic and Wightman correlation functions. Translation invar-
separating vector . Then polar decomposition of iance implies
the closeable operator S : A 7! A , A 2 R, pro-
vides an antiunitary operator J (the modular eðnÞ ð1 ; p1 ; . . . ; n ; pn Þ ð1 þ þ n Þ ðp1 þ þ pn Þ
W

The Wightman distribution W e(n) satisfies on the the general form of the thermal two-point functions

linear manifold (1 , p1 ) þ þ (n , pn ) = 0 the KMS that allow one to apply the techniques of the Jost–
relation in the energy variables: for any pair of Lehmann–Dyson representation. As has been shown
multi-indices (I, J) the identity by Bros and Buchholz (1996), the interacting two-
point function W  can be represented in the form
eðnÞ ðJ; IÞ ¼ eI W
W eðnÞ ðI; JÞ
  Z 1
ð2Þ
holds, where W e(n) (J, I) is an abbreviation for W  ðt; xÞ ¼ dm D ðx; mÞW  ðt; x; mÞ
 P 0
e ({pi }i2I , {pj }j2J ) and I =
(n)
W  i2I i .
We now specialize to the two-point function W (2)  .
Here D (x, m) is a distribution in x, m which is
The corresponding commutation function C(x) is symmetric in x, and
given by Z
ð2Þ ð2Þ
ðnÞ ð2Þ
W  ðt; x; mÞ ¼ ð2Þ1 ddp eiðtpxÞ W~ ð; pÞ
Cðx1  x2 Þ ¼ W  ðx1 ; x2 Þ  W  ðx2 ; x1 Þ
Locality implies that supp C  Vþ [ V . The is the two-point correlation function of the free
retarded and the advanced propagator r and a, thermal field of mass m. In contrast to the vacuum
formally given by case, the damping factors D (x, m) depend in a
nontrivial way on the spatial variables x. The
rðxÞ ¼ iðx
ÞCðxÞ; aðxÞ ¼ iðx
ÞCðxÞ damping factors describe the dissipative effects of
satisfy the relation the thermal system on the propagation of sharply
localized excitations. Bros and Buchholz suggested
r  a ¼ iC that the damping factor D (x, m) can be decom-
which corresponds to a partition of the support of posed into a discrete and an absolute continuous
C in its convex components: supp r  Vþ and part
supp a  V . For the free scalar field of mass m D ðx; mÞ ¼ ðm  m0 ÞD;d ðxÞ þ D;c ðx; mÞ
the commutator function is
Z and that the -contribution in the damping factors is
1 due to stable constituent particles of mass m0 out of
ðmÞ
C ðxÞ ¼ dp eipx C~ðmÞ ðpÞ
ð2Þ2 R4 which the thermal states are formed, whereas the
with collective quasiparticle-like excitations only contri-
bute to the continuous part of the damping factors
1 (Bros and Buchholz 1996).
C~ðmÞ ðpÞ ¼ sgnðÞ ð 2  p2  m2 Þ
2 In the case of spontaneously broken internal
and subsequently the retarded and advanced propa- symmetries Bros and Buchholz (1998) have shown
gators r(m) and a(m) are structural functions of the that the damping factors D  (x, m) which appear in
field algebra, which are determined by the c-number the representation of current-field correlations
commutation relations of the fields. Thus, they are functions
independent of the temperature, in contrast to the ð ; j0 ðt; xÞ ð0; 0Þ Þ
two-point function: Z 1 
ð2Þ
¼ dm Dþ  ðx; mÞ@t W  ðt; x; mÞ
ð2Þ C~ðmÞ ðpÞ 0
W~ ðpÞ ¼ ½7 
1  e þ D
ð2Þ
 ðx; mÞW  ðt; x; mÞ
Let now ˜ (p) be the Fourier transform of the time-
ordered function (x). The relation indeed contain a discrete (in the sense of measures)
zero-mass contribution and are slowly decreasing in
aðpÞe
i~rðpÞ þ i~ jxj for small values of m. Thus, these damping
~ðpÞ ¼
1  e factors coincide locally with the Källén–Lehmann
shows that (p)
˜ and i~r(p) only ‘‘coincide up to an weights appearing in the case of spontaneous
exponential tail’’ at very high energies (Bros and symmetry breaking in the vacuum sector (Bros and
Buchholz 1996). Buchholz 1998). It is easily seen in examples that
there is no sharp energy–momentum dispersion law
for the Goldstone particles. Thus, the Källén–
Particle Aspects
Lehmann representation is better suited than Fourier
The condition of locality (together with the relati- transformation to uncover the particle aspects of
vistic KMS condition) leads to strong constraints on thermal equilibrium states.

Models of Thermal Field Theory The Thermal P()2 Model

In the simplest case, the classical Lagrangian density In 1 þ 1 spacetime dimensions Wick ordering is
of the so-called P()2 models is given by sufficient to eliminate the UV divergences of poly-
nomial interactions. As it turns out, the leading

L ¼ ð@ Þð@  Þ  m2 2  4 ½8 order in the UV divergences is independent of the
4 temperature (in agreement with the results found in
Here (t, x) denotes a real scalar field over space- Kopper et al. (2001)). Thus, it is a matter of
time. The construction of the corresponding quan- convenience whether one uses the thermal covar-
tized thermal field presented in this section (Gérard iance function C ,
and Jäkel 2005) is based on the original ideas of  
Høegh-Krohn (1974). ð1 þ e Þ
C ðh1 ; h2 Þ :¼ h1 ; h2
2ð1  e Þ L2 ðRÞ

Free Fields h1 ; h2 2 SðRÞ

Let h m denote the L2 -closure of C1


0 (R) withp
respect to
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi or the vacuum covariance function Cvac. to define
the norm kf k = (f , (1=2)f ), where (k) = k2 þ m2 the Wick ordering:
denotes the one-particle energy for a single neutral  m
½n=2
X
scalar boson and the scalar product is the usual n n! n2m 1
: ðf Þ :C ¼ ðf Þ  Cðf ; f Þ
L2 -scalar product. The subspaces associated to a m!ðn  2mÞ! 2
m¼0
double cone O are given by
Now let P( ) be a real-valued polynomial, which is
h m ðOÞ :¼ fh 2 h m jsupp <h; supp  1 =h  Og bounded from below. Then Euclidean techniques
where O denotes the basis of the double cone O. can be used to define the operator sum
The corresponding free quantum field is described by Z l
the Weyl algebra W(h m ) := {W(f ) j f 2 h m }, together Hl :¼ LAW þ : Pð ðxÞÞ :C dx
with the time evolution {t
}t2R , l

t
ðWðf ÞÞ ¼ Wðeit f Þ; f 2 hm in the Araki–Woods representation and to show that
Hl is essentially self-adjoint (Gérard and Jäkel 2005).
If m > 0, the KMS condition allows just one unique Thus, (the closure of) Hl can be used to define a
(quasifree) (
, )-KMS state: perturbed time evolution t 7! tl on A and the vector
!
 ðWðf ÞÞ :¼ eð1=4Þðf ;ð1þ2Þf Þm ;  :¼ ðe  1Þ1 eð=2ÞHl AW
l :¼
The GNS representation associated to the pair keð=2ÞHl AW k
(W(h m ), !
 ) is the well-known Araki–Woods repre-
induces a KMS state !l for the dynamical system
sentation, given by
(AW (A)00 ,  l ).
 
HAW :¼  h m h m ; AW :¼ F ; A finite propagation speed argument (using
  Trotter’s product formula) shows that
AW ðWðhÞÞ :¼ WF ð1 þ Þ1=2 h 1=2 h ; h 2 h m
tl ðAÞ :¼ eitHl AeitHl ; t2R ½9
Here h m is the Hilbert space conjugate to h m , WF (.)
is independent of l for A 2 RAW (O), t 2 R fixed and
denotes the usual Weyl operator on the Fock space
l sufficiently large. Thus, there exists a limiting
(h m h m ) and F 2 (h m h m ) is the Fock
dynamics  such that
vacuum. The Liouvillean LAW (see [4]) can be
identified with d( ).
lim ktl ðAÞ  t ðAÞk ¼ 0 ½10
The local von Neumann algebra generated by l!1
{AW (W(h)) j h 2 h m (O)} is denoted by RAW (O). The
for all A 2 RAW (O), O bounded. This norm conver-
algebra of observables for the free quantum field
gence extends to the norm closure A of the local von
(and, as we will see, the P()2 model) is the norm
Neumann algebras.
closure
The existence of weak limit points (which are
[ C states) of the (generalized) sequence {!l }l > 0 is a
A :¼ RAW ðOÞ
consequence of the Banach–Alaoglu theorem. The
OR 2
fact that all limit states satisfy the KMS condition
of the local von Neumann algebras. with respect to the pair (A, ) follows from [10]. To

prove that the sequence {!l }l>0 has only one finite in all orders of the perturbation expansion, once
accumulation point, the theory has been renormalized at zero temperature
by usual renormalization prescriptions.
! ¼ lim !l ½11
l!1
Asymptotic Dynamics of Thermal Fields
is more delicate. Following Høegh-Krohn, Nelson
symmetry is used in Gérard and Jäkel (2005) to Timelike asymptotic properties of thermal correlation
relate the interacting thermal theory on the real line functions cannot be interpreted in terms of free fields
to the P()2 model on the circle S1 of length at due to persistent dissipative effects of a thermal
temperature 0. The existence of the limit [11] then system. This well-known fact manifests itself in a
follows from the uniqueness of the vacuum state on softened pole structure of the Green’s functions in
the circle. The relativistic KMS condition can be momentum space and is at the root of the failure of
derived by Nelson symmetry as well, using the fact the conventional approach to thermal perturbation
that the discrete spectrum of the model on the circle theory (Bros and Buchholz 2002). In fact, assuming
satisfies the spectrum condition. Since the limit [11] a sharp dispersion law, one would be forced to
exists on the norm closure A of the weakly closed conclude that the scattering matrix is trivial (a
local algebras, it follows from a result of Takesaki famous no-go theorem by Narnhofer et al. (1983)).
and Winnink that ! is locally normal with respect However, there seems to be a possibility to find an
to the Araki–Woods representation (which itself is effective theory, which is much simpler and still
locally normal with respect to the Fock representa- reproduces the correct asymptotic behavior of the full
tion). Consequently, theory. Disregarding low-energy excitations, Bros and
Buchholz (2002) have shown that the -contributions
R ðOÞ :¼  ðAðOÞÞ00 ffi RAW ðOÞ; O bounded
in the damping factors give rise to asymptotically
that is, R (O) is (isomorphic to) the unique leading terms which have a rather simple form: they are
hyperfinite factor of type III1 . Moreover, the local products of the thermal correlation function of a free
Fock property implies that the split property holds. field and a damping factor describing the dissipative
effects of the model-dependent thermal background.
Perturbation Theory This result is based on the assumption that the
truncated n-point functions satisfy
Steinmann (1995) has shown that perturbative expan-
sions for the Wightman distributions of the :4 :4 model ðnÞ
lim T3ðn1Þ=2 W  ðt1 ; x1 ; . . . ; tn ; xn Þtrunc: ¼ 0
can be derived directly in the thermodynamic limit, T!1
using as only inputs the equations of motion and the >0
(thermal) Wightman axioms. The result can be
while the -contribution in the damping factors
represented as a sum over generalized Feynman graphs.
exhibit, for large timelike separations T, a T 3=2
The method consists in solving the differential
type behavior (in 3 þ 1 spacetime dimensions).
equations for the correlation functions which follow
Bros and Buchholz (2002) have shown that the
from the field equation, by a power series expansion
asymptotically dominating parts of the correlation
in the coupling constant, using the axiomatic
functions can be interpreted in terms of quasifree
properties of the Wightman functions as subsidiary
states acting on the algebra generated by a Hermi-
conditions. The Wightman axioms are expected to
hold separately in each order of perturbation theory, tian field 0 satisfying the commutation relations
with the exception of the cluster property. ½0 ðt1 ; x1 Þ; 0 ðt2 ; x2 Þ
As expected, the UV renormalization can be
¼ m0 ðt1  t2 ; x1 ; x2 ÞZðx1  x2 Þ
chosen to be temperature independent, that is, one
can use the same counterterms as in the vacuum Here m0 is the usual commutator function of a free
case. But the infrared divergencies are more severe, scalar field of mass m0 and Z is an operator-valued
they cannot be removed by minor adjustments of the distribution commuting with 0 such that !ˆ  (Z(x1 
renormalization procedure. Various elaborate x2 )) = D, d (x1  x2 ). (Here !ˆ  denotes a KMS state
resummation techniques have been proposed to (at for the algebra generated by 0 .) Intuitively speak-
least partially) remove the infrared singularities. ing, the field 0 carries an additional stochastic
Another approach has been pursued by Kopper et al. degree of freedom, which manifests itself in a central
(2001). They have investigated the perturbation expan- element that appears in the commutation relations
sion of the :4 :4 model in the imaginary-time formal- and couples to the thermal background.
ism, using Wilson’s flow equations. The result is once As 0 describes the interacting field asymptoti-
again that all correlation functions become ultraviolet- cally, one may expect that 0 satisfies the field

equation of the interacting field in an asymptotic sense. Bros and Buchholz (2002) have demonstrated that this assumption allows one to derive an explicit expression for the discrete part of the damping factors $D_{\beta,d}(x)$ in simple models.

See also: Axiomatic Quantum Field Theory; Quantum Field Theory in Curved Spacetime; Scattering in Relativistic Quantum Field Theory: The Analytic Program; Tomita–Takesaki Modular Theory.

Further Reading

Borchers H-J and Yngvason J (1999) Modular groups of quantum fields in thermal states. Journal of Mathematical Physics 40: 601–624.
Birke L and Fröhlich J (2002) KMS, etc. Reviews in Mathematical Physics 14: 829–871.
Bros J and Buchholz D (1994) Towards a relativistic KMS condition. Nuclear Physics B 429: 291–318.
Bros J and Buchholz D (1996) Axiomatic analyticity properties and representations of particles in thermal quantum field theory. Annales de l'Institut Henri Poincaré 64: 495–521.
Bros J and Buchholz D (1998) The unmasking of thermal Goldstone bosons. Physical Review D 58: 125012-1.
Bros J and Buchholz D (2002) Asymptotic dynamics of thermal quantum fields. Nuclear Physics B 627: 289–310.
Buchholz D and Junglas P (1989) On the existence of equilibrium states in local quantum field theory. Communications in Mathematical Physics 121: 255–270.
Buchholz D and Ojima I (1997) Spontaneous collapse of supersymmetry. Nuclear Physics B 498: 228–242.
Buchholz D, Ojima I, and Roos H (2002) Thermodynamic properties of non-equilibrium states in quantum field theory. Annals of Physics 297: 219–242.
Cuniberti G, De Micheli E, and Viano GA (2001) Reconstructing the thermal Green functions at real times from those at imaginary times. Communications in Mathematical Physics 216: 59–83.
Derezinski J, Jaksic V, and Pillet CA (2003) Perturbations of W*-dynamics, Liouvilleans and KMS states. Reviews in Mathematical Physics 15: 447–489.
Fröhlich J (1975) The reconstruction of quantum fields from Euclidean Green's functions at arbitrary temperatures. Helvetica Physica Acta 48: 355–363.
Gérard C and Jäkel C (2005) Thermal quantum fields with spatially cutoff interactions in 1 + 1 space-time dimensions. Journal of Functional Analysis 220: 157–213.
Gérard C and Jäkel C (2005) Thermal quantum fields without cutoffs in 1 + 1 space-time dimensions. Reviews in Mathematical Physics 17: 113–173.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Høegh-Krohn R (1974) Relativistic quantum statistical mechanics in two-dimensional space-time. Communications in Mathematical Physics 38: 195–224.
Jäkel CD (2004) The relation between KMS states for different temperatures. Annales de l'Institut Henri Poincaré 5: 1–30.
Kopper C, Müller VF, and Reisz T (2001) Temperature independent renormalization of finite temperature field theory. Annales de l'Institut Henri Poincaré 2: 387–402.
Kuckert B (2002) Covariant thermodynamics of quantum systems: passivity, semipassivity, and the Unruh effect. Annals of Physics 295: 216–229.
Landsman NP and van Weert Ch G (1987) Real- and imaginary-time field theory at finite temperature and density. Physics Reports 145: 141–249.
Narnhofer H (1994) Entropy density for relativistic quantum field theory. Reviews in Mathematical Physics 6: 1127–1145.
Narnhofer H, Requardt M, and Thirring W (1983) Quasiparticles at finite temperature. Communications in Mathematical Physics 92: 247–268.
Steinmann O (1995) Perturbative quantum field theory at positive temperatures: an axiomatic approach. Communications in Mathematical Physics 170: 405–415.

Thermohydraulics see Newtonian Fluids and Thermohydraulics

Toda Lattices
Y B Suris, Technische Universität München, München, Germany
© 2006 Elsevier Ltd. All rights reserved.

Lattices, or differential–difference equations, are a special class of ordinary differential equations, with the independent variable $t$ playing the role of time and an infinite number of dependent variables $q_n = q_n(t)$ numbered by integer indices $n$, characterized by a translational invariance with respect to the shift $n \to n + 1$. Due to this property, such equations are well suited for description of processes in translationally symmetric systems like crystals. On his search for lattice models admitting interesting explicit solutions, M Toda discovered in 1967 the lattice which nowadays carries his name:

$$\ddot q_n = e^{q_{n+1}-q_n} - e^{q_n-q_{n-1}} \qquad [1]$$

The Toda lattice is one of the most celebrated systems of mathematical physics, and a large amount of literature is devoted to it and to its various generalizations. Its most prominent property is "integrability," so that it is amenable to a rather complete exact treatment; moreover, it can be regarded as one of the basic models, illustrating all the relevant

paradigms, notions, methods, and results of the theory of integrable systems (sometimes called the theory of solitons). One has a rare possibility to read the first-hand presentation of a large body of relevant results, including the authentic story of the original discovery, in Toda (1989).

The Infinite Toda Lattice

Model

The classical infinite Toda lattice [1] describes a one-dimensional chain of unit mass particles, each one interacting with the nearest neighbors only, $q_n$ being the displacement of the $n$th particle from equilibrium. It can be treated within the Hamiltonian formalism of the classical mechanics (with some care, because of the infinite number of degrees of freedom). In this framework, the second-order Newtonian equations of motion [1] are replaced by the first-order Hamiltonian ones, for the coordinates $q_n$ and canonically conjugate momenta $p_n$:

$$\dot q_n = p_n, \qquad \dot p_n = e^{q_{n+1}-q_n} - e^{q_n-q_{n-1}} \qquad [2]$$

The corresponding Hamilton function is

$$H_2(p,q) = \frac{1}{2}\sum_{n\in\mathbb{Z}} p_n^2 + \sum_{n\in\mathbb{Z}} \left(e^{q_{n+1}-q_n} - 1\right) \qquad [3]$$

One can understand infinite sums here formally, or, alternatively, one can impose suitable boundary conditions, like $q_{n+1} - q_n \to 0$, $p_n \to 0$ as $|n| \to \infty$ (usually one requires decay faster than any degree of $1/|n|$).

Multisoliton Solutions

M Toda found in 1967 a number of exact traveling wave solutions of this system, including the 1-soliton solution:

$$q_n(t) = \log\frac{1 + e^{2(\kappa_1 n - \omega_1 t + \delta_1)}}{1 + e^{2(\kappa_1 (n-1) - \omega_1 t + \delta_1)}} \qquad [4]$$

or, equivalently,

$$e^{q_{n+1}(t)-q_n(t)} = 1 + \frac{\omega_1^2}{\cosh^2(\kappa_1 n - \omega_1 t + \delta_1)} \qquad [5]$$

where $\kappa_1 > 0$, $\omega_1 = \pm\sinh\kappa_1$, and $\delta_1$ is an arbitrary phase. Such a soliton moves with the velocity $v_1 = \omega_1/\kappa_1$ (to the right, if $v_1 > 0$, and to the left, if $v_1 < 0$). Note that the faster the soliton is, the larger its amplitude. Multisoliton solutions were constructed in 1973 by R Hirota with the help of his ingenious "direct" (or bilinear) method. They can be written as

$$e^{q_{n+1}(t)-q_n(t)} = \frac{\tau_{n+1}(t)\,\tau_{n-1}(t)}{\tau_n^2(t)} \qquad [6]$$

where, for an $M$-soliton solution, $\tau_n(t)$ can be represented through the $M \times M$ determinant depending on $2M$ parameters $z_j \in (-1, 1)$ and $c_j \in \mathbb{R}$:

$$\tau_n(t) = \det\left(\delta_{ij} + \frac{c_i(t)\,c_j(t)\,(z_i z_j)^{n+1}}{1 - z_i z_j}\right)_{1\le i,j\le M} \qquad [7]$$

where $c_j(t) = c_j e^{\omega_j t}$, $\omega_j = (1/2)(z_j^{-1} - z_j)$. If one sets $z_j = \pm e^{-\kappa_j}$ with $\kappa_j > 0$, then $\omega_j = \pm\sinh\kappa_j$, and one can show that asymptotically both for $t \to -\infty$ and for $t \to +\infty$ the solution [6] looks like the sum of well-separated solitons [4] with the velocities $v_j = \omega_j/\kappa_j$ and the respective phases $\kappa_j n - \omega_j t + \delta_j^{(\pm)}$. This is usually interpreted as a particle-like behavior of solitons. One can show that the scattering of solitons is factorized:

$$\delta_j^{(+)} - \delta_j^{(-)} = \sum_{v_k < v_j} \log\left|\frac{1 - z_j z_k}{z_j - z_k}\right| - \sum_{v_k > v_j} \log\left|\frac{1 - z_j z_k}{z_j - z_k}\right| \qquad [8]$$

which means that the phase shifts of individual solitons can be interpreted as coming from the pairwise interactions only.

Integrability

The infinite Toda lattice is completely integrable in the sense of the classical Hamiltonian mechanics: it admits an infinite number of functionally independent integrals of motion in involution. This was demonstrated in 1974 by M Hénon. An instance of these higher integrals of motion is given by

$$H_3(p,q) = \frac{1}{3}\sum_{n\in\mathbb{Z}} p_n^3 + \sum_{n\in\mathbb{Z}} (p_n + p_{n+1})\,e^{q_{n+1}-q_n} \qquad [9]$$

Hamiltonian flows corresponding to the higher integrals of motion (usually referred to as higher Toda flows) form the "Toda lattice hierarchy." A beautiful approach to this hierarchy is based on the Lax representation of the Toda lattice, discovered in 1974 independently by H Flaschka and S Manakov. In the variables $a_n$, $b_n$, related to $q_n$, $p_n$ by

$$a_n = e^{q_{n+1}-q_n}, \qquad b_n = p_n \qquad [10]$$

equations of motion of the Toda lattice [2] are rewritten as

$$\dot a_n = a_n(b_{n+1} - b_n), \qquad \dot b_n = a_n - a_{n-1} \qquad [11]$$
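As a numerical sanity check of the 1-soliton formulas [4] and [5], the following Python sketch evaluates $q_n(t)$ from [4], compares a finite-difference approximation of $\ddot q_n$ with the right-hand side of [1], and compares $e^{q_{n+1}-q_n}$ with the closed form [5]. The function names, the choice $\omega_1 = +\sinh\kappa_1$, and the finite-difference step are illustrative assumptions and are not taken from the article.

```python
import numpy as np


def one_soliton_q(n, t, kappa, delta=0.0):
    """q_n(t) from formula [4], taking omega = +sinh(kappa) (assumed sign)."""
    omega = np.sinh(kappa)
    phase = kappa * n - omega * t + delta
    # log(1 + e^{2x}) is evaluated stably as logaddexp(0, 2x)
    return np.logaddexp(0.0, 2.0 * phase) - np.logaddexp(0.0, 2.0 * (phase - kappa))


def toda_residual(n, t, kappa, dt=1e-3):
    """|q_n'' - (e^{q_{n+1}-q_n} - e^{q_n-q_{n-1}})|, q_n'' by central differences."""
    q = lambda m, s: one_soliton_q(m, s, kappa)
    qdd = (q(n, t + dt) - 2.0 * q(n, t) + q(n, t - dt)) / dt**2
    rhs = np.exp(q(n + 1, t) - q(n, t)) - np.exp(q(n, t) - q(n - 1, t))
    return abs(qdd - rhs)


def bond_identity_gap(n, t, kappa):
    """Difference between e^{q_{n+1}-q_n} computed from [4] and the closed form [5]."""
    omega = np.sinh(kappa)
    lhs = np.exp(one_soliton_q(n + 1, t, kappa) - one_soliton_q(n, t, kappa))
    rhs = 1.0 + omega**2 / np.cosh(kappa * n - omega * t) ** 2
    return abs(lhs - rhs)
```

Both quantities should be small for any site $n$ and time $t$: the first is limited by the finite-difference error, the second only by rounding.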

It turns out that eqns [11] are equivalent to the operator equation

$$\dot L = [L, A_+] = [A_-, L] \qquad [12]$$

where $L$ and $A_\pm$ are linear difference operators with coefficients depending on $a_n$, $b_n$:

$$L = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} a_n E_{n,n+1} + \sum_{n\in\mathbb{Z}} E_{n+1,n} \qquad [13]$$

$$A_+ = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} E_{n+1,n}, \qquad A_- = \sum_{n\in\mathbb{Z}} a_n E_{n,n+1} \qquad [14]$$

Here difference operators are represented as infinite matrices, $E_{m,n}$ being the matrix with the only nonvanishing element equal to 1 in the position $(m, n)$. A diagonal similarity (gauge) transformation of the matrix $L$ leads to an equivalent Lax representation of the Toda lattice:

$$\dot L_0 = [L_0, A_0] \qquad [15]$$

with

$$L_0 = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} a_n^{1/2}\left(E_{n+1,n} + E_{n,n+1}\right) \qquad [16]$$

$$A_0 = \frac{1}{2}\sum_{n\in\mathbb{Z}} a_n^{1/2}\left(E_{n+1,n} - E_{n,n+1}\right) \qquad [17]$$

Being equivalent for the Toda lattice, these two Lax representations admit nonequivalent generalizations (see below). Note that the matrices $A_\pm$ in [14] may be interpreted as $A_\pm = \pi_\pm(L)$, where $\pi_\pm$ stands for the lower-triangular, resp., strictly upper-triangular part. The commuting higher members of the Toda lattice hierarchy (enumerated by $s \in \mathbb{N}$) are characterized by the Lax equations of the form [12] with the same Lax matrix $L$ as in [13] and with $A_\pm = \pi_\pm(L^s)$. In the Lax representation [15], the higher Toda flows are obtained by choosing $A_0 = \mathrm{skew}(L_0^s)$, where "skew" denotes the skew-symmetric part (strictly lower-triangular part minus strictly upper-triangular part) of the symmetric matrix. The Hamilton functions of the higher flows are obtained as $H_s \sim \mathrm{tr}(L^s) = \mathrm{tr}(L_0^s)$.

Inverse Scattering

H Flaschka and S Manakov made the Lax representation the basis for applying the inverse-scattering, or inverse-spectral, transformation method (IST) to the infinite Toda lattice. It was the first application of IST in the lattice context. The matrix $L_0$ in [16] is symmetric tridiagonal, so that the operator $L_0$ is second order and self-adjoint. The direct and inverse-spectral problem for $L_0\psi = \lambda\psi$ with such operators $L_0$ is well studied and parallel, to a large extent, to the corresponding theory for second-order differential operators. In the rapidly decaying case, the set of spectral data of the operator $L_0$, allowing for a solution of the inverse problem, consists of:

1. eigenvalues $\lambda_j = z_j + z_j^{-1}$ of the discrete spectrum, with $z_j \in (-1, 1)$;
2. normalizing coefficients $\gamma_j$ of the corresponding eigenfunctions; and
3. reflection coefficient $r(z)$ for $|z| = 1$, characterizing the continuous spectrum $\lambda = z + z^{-1} \in [-2, 2]$.

The solution of the inverse-spectral problem is given in terms of the Riemann–Hilbert problem or its variants, like the Gelfand–Levitan equation. Equation [12] means that the evolution of the operator $L$, induced by the evolution of $q_n(t)$, $p_n(t)$ in virtue of the Toda lattice equations [2], is "isospectral." More precisely, the discrete eigenvalues are integrals of motion, while the evolution of other spectral data is governed by simple linear equations:

$$z_j = \mathrm{const.}, \qquad \gamma_j(t) = \gamma_j(0)\,e^{(z_j^{-1} - z_j)t}, \qquad r(z,t) = r(z,0)\,e^{(z^{-1} - z)t} \qquad [18]$$

In particular, the multisoliton solutions correspond to the reflectionless case $r(z, t) \equiv 0$. The IST solution of the initial-value problem for the infinite Toda lattice can be schematically depicted as in Figure 1.

Figure 1 General scheme of the IST: $q_n(0), p_n(0)$ → (direct-spectral problem) → $z_j, \gamma_j(0), r(z, 0)$ → (linear evolution) → $z_j, \gamma_j(t), r(z, t)$ → (inverse-spectral problem) → $q_n(t), p_n(t)$.

Bi-Hamiltonian Structure

The canonical Poisson bracket for the variables $q_n$, $p_n$ turns in the Flaschka–Manakov variables [10] into

$$\{b_n, a_n\}_1 = -a_n, \qquad \{a_n, b_{n+1}\}_1 = -a_n \qquad [19]$$

(all other brackets of the coordinate functions vanish), and the system [11] is Hamiltonian with respect to this bracket, with the Hamilton function

$$H_2 = \frac{1}{2}\sum b_n^2 + \sum a_n$$

However, one can define also a different Poisson bracket for the variables $a_n$, $b_n$:

$$\{b_n, a_n\}_2 = -b_n a_n, \qquad \{a_n, b_{n+1}\}_2 = -a_n b_{n+1},$$
$$\{b_n, b_{n+1}\}_2 = -a_n, \qquad \{a_n, a_{n+1}\}_2 = -a_n a_{n+1} \qquad [20]$$

with the following properties: it is compatible with the first one (i.e., their linear combinations are again Poisson brackets), and the system [11] is Hamiltonian with respect to this bracket, with the Hamilton function $H_1 = \sum b_n$. So, the Toda lattice in the form [11] is a bi-Hamiltonian system. This result is due to M Adler (1979). The bi-Hamiltonian property, introduced by F Magri in 1978 on the example of the Korteweg–de Vries equation, has been established since then as an alternative (and highly effective and informative) definition of integrability. Actually, the Toda lattice [11] is even tri-Hamiltonian, since there exists one more local Poisson bracket for the variables $a_n$, $b_n$ with similar properties, discovered by B Kupershmidt in 1985.

Darboux–Bäcklund Transformations and Discretization

A further indispensable attribute of integrable systems is the existence of the so-called Darboux–Bäcklund transformations. For the Toda lattice they were first found by M Toda and M Wadati in 1975. A Bäcklund transformation $(q_n, p_n) \mapsto (\tilde q_n, \tilde p_n)$ with the parameter $h$ can be written as

$$1 + h p_n = e^{\tilde q_n - q_n} + h^2 e^{q_n - \tilde q_{n-1}}, \qquad 1 + h \tilde p_n = e^{\tilde q_n - q_n} + h^2 e^{q_{n+1} - \tilde q_n} \qquad [21]$$

This is a canonical transformation, possessing a classical generating function. These formulas can be given a fundamentally important interpretation in terms of the matrices

$$U_+ = \sum_{n\in\mathbb{Z}} e^{\tilde q_n - q_n} E_{n,n} + h \sum_{n\in\mathbb{Z}} E_{n+1,n} \qquad [22]$$

$$U_- = I + h \sum_{n\in\mathbb{Z}} e^{q_{n+1} - \tilde q_n} E_{n,n+1} \qquad [23]$$

The first formula in [21] is equivalent to the factorization $I + hL = U_+ U_-$, while the second one is equivalent to the factorization $I + h\tilde L = U_- U_+$ with the flipped factors. The Bäcklund transformation [21] serves also as an integrable discretization of the Toda flow [2] with the time step $h$.

Finite Open-End Toda Lattice

Model

The infinite Toda lattice [1] can be reduced to finite-dimensional systems by imposing suitable boundary conditions, different from the rapidly decaying ones. Particularly important are "open-end boundary conditions," which correspond to placing the particles 0 and $N + 1$ at $q_0 = +\infty$ and $q_{N+1} = -\infty$, respectively. In terms of the Flaschka–Manakov variables, this means that $a_0 = a_N = 0$ and $b_0 = b_{N+1} = 0$. The Hamilton function of the resulting system with $N$ degrees of freedom is

$$H_2(p,q) = \frac{1}{2}\sum_{n=1}^{N} p_n^2 + \sum_{n=1}^{N-1} e^{q_{n+1}-q_n} \qquad [24]$$

This system consists of $N$ particles subject to repulsive forces between nearest neighbors, and exhibits a scattering behavior both as $t \to -\infty$ and $t \to +\infty$. It admits a Lax representation of the same form [12] or [15] as in the infinite case, but with all the matrices being now of finite size $N \times N$, so that [13]–[14] and [16]–[17] are replaced by

$$L = \sum_{n=1}^{N} b_n E_{n,n} + \sum_{n=1}^{N-1} a_n E_{n,n+1} + \sum_{n=1}^{N-1} E_{n+1,n} \qquad [25]$$

$$A_+ = \sum_{n=1}^{N} b_n E_{n,n} + \sum_{n=1}^{N-1} E_{n+1,n}, \qquad A_- = \sum_{n=1}^{N-1} a_n E_{n,n+1} \qquad [26]$$

and

$$L_0 = \sum_{n=1}^{N} b_n E_{n,n} + \sum_{n=1}^{N-1} a_n^{1/2}\left(E_{n+1,n} + E_{n,n+1}\right) \qquad [27]$$

$$A_0 = \frac{1}{2}\sum_{n=1}^{N-1} a_n^{1/2}\left(E_{n+1,n} - E_{n,n+1}\right) \qquad [28]$$

The qualitative behavior of the solutions is easily understood: as a consequence of repulsive interactions, the pairwise distances between particles grow infinitely, $a_n(t) = e^{q_{n+1}(t)-q_n(t)} \to 0$ as $t \to \pm\infty$, so that the matrix $L_0$ becomes asymptotically diagonal, with the limit velocities $b_n(\pm\infty) = \dot q_n(\pm\infty)$ as the diagonal entries. Due to the isospectral evolution of $L_0$, these limit velocities have to coincide with the eigenvalues $\lambda_j$ of $L_0$, which are integrals of motion.

As $t \to -\infty$, they appear on the diagonal in the increasing order (the rightmost particle $q_1$ being the slowest, and the leftmost $q_N$ being the fastest), while as $t \to +\infty$, their order on the diagonal changes to the decreasing one (the particle $q_1$ becoming the fastest and $q_N$ becoming the slowest).

Moser’s Solution

Integration of this system has been first performed by J Moser in 1975. His solution can be interpreted within the general scheme of the IST (see Figure 1). The spectral data in this case consist, for example, of the eigenvalues $\lambda_j$ ($j = 1, \ldots, N$) of the matrix $L'$ and the first components $r_j$ of the corresponding orthonormal eigenvectors. The evolution of these data induced by the Toda flow [2] turns out to be simple:

\[ \lambda_j = \text{const.}, \qquad r_j^2(t) = \frac{r_j^2(0)\, e^{\lambda_j t}}{\sum_{i=1}^{N} r_i^2(0)\, e^{\lambda_i t}} \tag{29} \]

The IST is expressed by the identity

\[ \sum_{j=1}^{N} \frac{r_j^2}{\lambda - \lambda_j} = \cfrac{1}{\lambda - b_1 - \cfrac{a_1}{\lambda - b_2 - \;\ddots\; - \cfrac{a_{N-1}}{\lambda - b_N}}} \tag{30} \]

both parts of which represent the entry (1, 1) of the matrix $(\lambda I - L')^{-1}$. It implies that all variables $a_n(t)$, $b_n(t)$ are rational functions of $\lambda_j$ and $e^{\lambda_j t}$; in particular, one finds:

\[ a_n(t) = e^{q_{n+1}(t) - q_n(t)} = \frac{\tau_{n-1}(t)\, \tau_{n+1}(t)}{\tau_n^2(t)} \tag{31} \]

where $\tau_n(t)$ can be represented as an $n \times n$ Hankel determinant

\[ \tau_n(t) = \det\left( c_{j+k}(t) \right)_{0 \le j,k \le n-1}, \qquad c_j(t) = \sum_{i=1}^{N} \lambda_i^{\,j}\, r_i^2(t) \tag{32} \]

Factorization Solution

The Lax representation [12] is a particular instance of a general construction, known under the name of Adler–Kostant–Symes (AKS) method and found around 1980. The ingredients of this construction are:

• a Lie algebra $\mathfrak{g}$, equipped with a nondegenerate scalar product which is used to identify $\mathfrak{g}$ with its dual space $\mathfrak{g}^*$;

• a splitting of $\mathfrak{g}$ into a direct sum of its two subspaces $\mathfrak{g}_\pm$ which are also Lie subalgebras, with $\pi_\pm : \mathfrak{g} \to \mathfrak{g}_\pm$ being the corresponding projections;

• the Lie group $G$ of the Lie algebra $\mathfrak{g}$, and its Lie subgroups $G_\pm$ with the Lie algebras $\mathfrak{g}_\pm$; and

• a function $\phi : \mathfrak{g} \to \mathfrak{g}$ covariant with respect to the adjoint action of $G$ (in the case of matrix Lie algebras and groups, one can take, e.g., $\phi(L) = L^s$).

The AKS method provides a formula for the solution of the initial-value problem for Lax equations of the form [12] with the Lax matrix $L \in \mathfrak{g}$ and $A_\pm = \pi_\pm(\phi(L))$. The solution is given by

\[ L(t) = U_+^{-1}(t)\, L(0)\, U_+(t) = U_-(t)\, L(0)\, U_-^{-1}(t) \tag{33} \]

where the elements $U_\pm(t) \in G_\pm$ solve the factorization problem

\[ \exp\left( t\, \phi(L(0)) \right) = U_+(t)\, U_-(t) \tag{34} \]

For the open-end Toda lattice $\mathfrak{g} = gl(N)$, the Lie algebra of all $N \times N$ matrices, $\mathfrak{g}_\pm$ consist of all lower-triangular, resp., strictly upper-triangular, matrices. Accordingly, $G = GL(N)$, the Lie group of all nondegenerate $N \times N$ matrices, and $G_\pm$ consist of all nondegenerate lower-triangular matrices, resp., of upper-triangular matrices with units on the diagonal. The corresponding factorization problem in $G$ is well known in the linear algebra under the name of LR factorization, and is related to the Gaussian elimination. From [33] and the well-known expression of the diagonal elements of the lower-triangular factor in the LR factorization through the minors of the factorized matrix, we find:

\[ a_n(t) = a_n(0)\, \frac{\tau_{n+1}(t)\, \tau_{n-1}(t)}{\tau_n^2(t)} \tag{35} \]

where $\tau_n(t)$ is the upper-left $n \times n$ minor of the matrix $\exp(tL(0))$. If $L(t)$ is the Lax matrix along the solution of the Toda flow ($\phi(L) = L$), then the sampling of the matrix $\exp(L(t))$ at the integer times $t \in \mathbb{Z}$ coincides with the result of application of the Rutishauser’s LR algorithm to the matrix $\exp(L(0))$. The LR algorithm applied to the matrix $I + hL(0)$ is nothing other but the Bäcklund transformation [21] in the open-end situation.

Finite Periodic Toda Lattice

Model

A different reduction of the infinite Toda lattice to a finite-dimensional system appears by imposing periodic boundary conditions, $q_{n+N}(t) \equiv q_n(t)$ for all $n \in \mathbb{Z}$ (of course, such relations hold also for the Flaschka–Manakov variables $a_n$, $b_n$). The Hamilton

function of the resulting system with N degrees of freedom is

\[ H_2(p, q) = \frac{1}{2} \sum_{n \in \mathbb{Z}/N\mathbb{Z}} p_n^2 + \sum_{n \in \mathbb{Z}/N\mathbb{Z}} e^{q_{n+1} - q_n} \tag{36} \]

This system consists of N particles $q_n$ ($n = 1, \ldots, N$), and it is always assumed that $q_{N+1} \equiv q_1$ and $q_0 \equiv q_N$. Thus, the potential energy in [36] differs from the potential energy in [24] by one additional term $e^{q_1 - q_N}$. However, this modest difference leads to much more complicated dynamics of the system (quasiperiodic instead of scattering). It is convenient to replace infinite matrices in the Lax representation [12] by finite ones, of size $N \times N$, but depending on an additional parameter $\lambda$ (called the spectral parameter):

\[ L(\lambda) = \sum_{n \in \mathbb{Z}/N\mathbb{Z}} b_n E_{n,n} + \lambda^{-1} \sum_{n \in \mathbb{Z}/N\mathbb{Z}} a_n E_{n,n+1} + \lambda \sum_{n \in \mathbb{Z}/N\mathbb{Z}} E_{n+1,n} \tag{37} \]

\[ A_+(\lambda) = \sum_{n \in \mathbb{Z}/N\mathbb{Z}} b_n E_{n,n} + \lambda \sum_{n \in \mathbb{Z}/N\mathbb{Z}} E_{n+1,n} \tag{38} \]

\[ A_-(\lambda) = \lambda^{-1} \sum_{n \in \mathbb{Z}/N\mathbb{Z}} a_n E_{n,n+1} \tag{39} \]

The Lax representation [12] holds identically in $\lambda$, so that the spectral parameter drops out of the equations of motion. Note that, unlike the open-end case, $L$ is no more a tridiagonal matrix, because of the nonvanishing entries in the positions $(N, 1)$ and $(1, N)$.

Inverse-Spectral Transformation

Solution of the periodic lattice in terms of multidimensional theta functions has been given independently by E Date and S Tanaka, and by I Krichever in 1976. In this case, the set of the spectral data is more complicated; it includes:

• a hyperelliptic Riemann surface $R$ of genus $N - 1$ determined by the eigenvalues of the periodic boundary-value problem for the operator $L$, or, in other words, by the equation $R(\lambda, \mu) = \det(L(\lambda) - \mu I) = 0$; and

• $N - 1$ points $P_k$ on $R$, which correspond to the eigenvalues of $L$ with vanishing boundary conditions.

Due to [12], the Riemann surface $R$ itself is an integral of motion, and the evolution of points $P_k$ is such that the image of the divisor $P_1 + \cdots + P_{N-1}$ under the Abel map moves along a straight line in the Jacobi variety of $R$. Solution of the inverse-spectral problem is given in terms of multidimensional theta-functions by formula [35] with $\tau_n(t) = \vartheta(nU - tV + D)$, where $U$, $V$, $D$ are certain vectors on the Jacobian of $R$ (the first two of them depending on the spectrum $R$ only).

Loop Algebras

The periodic Toda lattice can be included into the general AKS scheme, if one interprets the Lax matrix $L$ as an element of the loop algebra $\boldsymbol{g}$ which consists of Laurent polynomials (in $\lambda$) with coefficients from $gl(N)$, singled out by the additional condition

\[ \boldsymbol{g} = \left\{ L(\lambda) \in gl(N)[\lambda, \lambda^{-1}] : \Omega L(\lambda) \Omega^{-1} = L(\omega\lambda) \right\} \]

where $\Omega = \mathrm{diag}(1, \omega, \ldots, \omega^{N-1})$, $\omega = \exp(2\pi i/N)$. Subalgebras $\boldsymbol{g}_\pm$ consist of Laurent polynomials with respect to non-negative, resp., strictly negative powers of $\lambda$. The Lie group $\boldsymbol{G}$ corresponding to the Lie algebra $\boldsymbol{g}$ consists of $GL(N)$-valued functions $U(\lambda)$ of the complex parameter $\lambda$, regular in $\mathbb{CP}^1 \setminus \{0, \infty\}$ and satisfying $\Omega U(\lambda) \Omega^{-1} = U(\omega\lambda)$. Its subgroups $\boldsymbol{G}_\pm$ corresponding to the Lie algebras $\boldsymbol{g}_\pm$ are singled out by the following conditions: elements of $\boldsymbol{G}_+$ are regular in the neighborhood of $\lambda = 0$, while elements of $\boldsymbol{G}_-$ are regular in the neighborhood of $\lambda = \infty$ and take at $\lambda = \infty$ the value $I$. The corresponding factorization is called the generalized LR factorization. As opposed to the open-end case, finding such a factorization is a problem of the Riemann–Hilbert type which is solved in terms of algebraic geometry and theta-functions rather than in terms of linear algebra and exponential functions. This approach to the periodic Toda lattice is due to Reyman and Semenov-Tian-Shansky (1979) and, independently, to M Adler and P van Moerbeke (1980).

Generalizations: Lie-Algebraic Systems

The AKS interpretation of the finite Toda lattices leads directly to their generalizations by replacing the algebra $gl(N)$, resp., the loop algebra over $gl(N)$, by simple Lie algebras, resp. affine Lie algebras. These generalized Toda systems were introduced in 1976 by O Bogoyavlensky and solved in 1979 independently by M Olshanetsky, A Perelomov, and by B Kostant.

Simple Lie Algebras

Let $\mathfrak{g}$ be a simple Lie algebra (complex or real split), and $\mathfrak{h}$ its Cartan subalgebra. Let further $\Delta = \Delta_+ \cup \Delta_-$ be the root system of $\mathfrak{g}$, decomposed into the set of positive roots $\Delta_+$ and the set of negative roots $\Delta_-$. One has a direct vector space sum $\mathfrak{g} = \mathfrak{g}_+ \oplus \mathfrak{g}_-$, where $\mathfrak{g}_+$ is spanned by the root spaces for positive roots and by $\mathfrak{h}$,

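while $\mathfrak{g}_-$ is spanned by the root spaces for negative roots (Borel decomposition). For $\alpha \in \Delta$ let $E_\alpha$ be a corresponding root vector. So, $[H, E_\alpha] = \alpha(H) E_\alpha$ for all $H \in \mathfrak{h}$. The root $\alpha \in \mathfrak{h}^*$ may be identified with $H_\alpha \in \mathfrak{h}$ defined by $\langle H_\alpha, H \rangle = \alpha(H)$ for all $H \in \mathfrak{h}$. It is easy to deduce that $[E_\alpha, E_{-\alpha}] = c_\alpha H_\alpha$, where $c_\alpha = \langle E_\alpha, E_{-\alpha} \rangle$. The system of simple roots will be denoted by $\Pi \subset \Delta_+$.

The generalized Toda lattice for the Lie algebra $\mathfrak{g}$ is the following system of differential equations on $\mathfrak{h} \times \mathfrak{h}$:

\[ \dot Q = P, \qquad \dot P = -\sum_{\alpha \in \Pi} e^{\alpha(Q)} [E_\alpha, E_{-\alpha}] = -\sum_{\alpha \in \Pi} c_\alpha e^{\alpha(Q)} H_\alpha \tag{40} \]

This system can be given a Hamiltonian formulation, with the Hamilton function

\[ H_{\mathfrak{g}} = \frac{1}{2} \langle P, P \rangle + \sum_{\alpha \in \Pi} c_\alpha e^{\alpha(Q)} \tag{41} \]

It is completely integrable, and has a Lax representation [12] with

\[ L = P + \sum_{\alpha \in \Pi} E_\alpha + \sum_{\alpha \in \Pi} e^{\alpha(Q)} E_{-\alpha} \tag{42} \]

\[ A_+ = P + \sum_{\alpha \in \Pi} E_\alpha, \qquad A_- = \sum_{\alpha \in \Pi} e^{\alpha(Q)} E_{-\alpha} \tag{43} \]

The usual open-end Toda lattice corresponds to the algebra $sl(N)$ (series $A_{N-1}$), so that the Hamilton function [24] can be denoted by $H_{A_{N-1}}$. The Hamilton functions of the generalized lattices corresponding to other classical algebras $so(2N+1)$ (series $B_N$), $sp(N)$ (series $C_N$), and $so(2N)$ (series $D_N$) can be written in the canonically conjugate variables $q_n$, $p_n$ ($n = 1, \ldots, N$) as

\[ H_{\mathfrak{g}}(p, q) = H_{A_{N-1}}(p, q) + \begin{cases} e^{-q_N}, & \mathfrak{g} = B_N \\ e^{-2q_N}, & \mathfrak{g} = C_N \\ e^{-q_N - q_{N-1}}, & \mathfrak{g} = D_N \end{cases} \tag{44} \]

Affine Lie Algebras

Turning to the generalizations of the periodic Toda lattice, let $\sigma$ be a Coxeter automorphism of a simple complex algebra $\mathfrak{g}$, the order of $\sigma$ being $m$. Introduce the loop algebra $\boldsymbol{g}$ as the Lie algebra of Laurent polynomials

\[ \boldsymbol{g} = \left\{ L(\lambda) \in \mathfrak{g}[\lambda, \lambda^{-1}] : \sigma(L(\lambda)) = L(\omega\lambda) \right\} \]

where $\omega = \exp(2\pi i/m)$. Denote by $\mathfrak{g}_j$ the eigenspaces of $\sigma$ corresponding to the eigenvalues $\omega^j$ ($j \in \mathbb{Z}/m\mathbb{Z}$). Set $\mathfrak{a} = \mathfrak{g}_0$, and let $s$ denote the dimension of $\mathfrak{a}$. By the definition of the Coxeter automorphism, $\mathfrak{a}$ is an abelian subalgebra of $\mathfrak{g}$. Denote by $\widehat\Pi$ the set of $\alpha \in \mathfrak{a}^*$ for which there exist nonzero elements $E_\alpha \in \mathfrak{g}_1$ with $[H, E_\alpha] = \alpha(H) E_\alpha$ for all $H \in \mathfrak{a}$. The elements $E_{-\alpha} \in \mathfrak{g}_{-1}$ are defined similarly. It can be shown that $\widehat\Pi$ contains $s + 1$ elements, so that between them there exists exactly one linear relation. The elements of $\widehat\Pi$ are called simple weights of the loop algebra $\boldsymbol{g}$. The Lie algebra $\boldsymbol{g}$ is a direct sum of its two subspaces $\boldsymbol{g}_\pm$ consisting of Laurent polynomials with non-negative, resp., with strictly negative powers of $\lambda$; these subspaces are also Lie subalgebras.

Now the generalized Toda lattice related to the loop algebra $\boldsymbol{g}$ can be introduced as the system of differential equations on $\mathfrak{a} \times \mathfrak{a}$, which looks formally exactly as [40], and has the Hamilton function which looks exactly as [41], but with the set of simple roots $\Pi$ of $\mathfrak{g}$ being replaced by the set of simple weights $\widehat\Pi$ of $\boldsymbol{g}$. The matrices participating in the Lax representation [12] belong now to the loop algebra $\boldsymbol{g}$:

\[ L(\lambda) = P + \lambda \sum_{\alpha \in \widehat\Pi} E_\alpha + \lambda^{-1} \sum_{\alpha \in \widehat\Pi} e^{\alpha(Q)} E_{-\alpha} \tag{45} \]

\[ A_+(\lambda) = P + \lambda \sum_{\alpha \in \widehat\Pi} E_\alpha, \qquad A_-(\lambda) = \lambda^{-1} \sum_{\alpha \in \widehat\Pi} e^{\alpha(Q)} E_{-\alpha} \tag{46} \]

For the classical series of loop algebras, the Hamilton functions $H_{\boldsymbol{g}}$ in the canonically conjugate variables $q_n$, $p_n$ ($n = 1, \ldots, N$) can be presented as

\[ H_{\boldsymbol{g}}(p, q) = H_{A_{N-1}}(p, q) + \begin{cases} e^{-q_N} + e^{q_1 + q_2}, & \boldsymbol{g} = B_N^{(1)} \\ e^{-2q_N} + e^{2q_1}, & \boldsymbol{g} = C_N^{(1)} \\ e^{-q_N - q_{N-1}} + e^{q_1 + q_2}, & \boldsymbol{g} = D_N^{(1)} \\ e^{-2q_N} + e^{q_1 + q_2}, & \boldsymbol{g} = A_{2N-1}^{(2)} \\ e^{-q_N} + e^{2q_1}, & \boldsymbol{g} = A_{2N}^{(2)} \\ e^{-q_N} + e^{q_1}, & \boldsymbol{g} = D_{N+1}^{(2)} \end{cases} \tag{47} \]

Actually, one can find even more general integrable systems of the Toda type: one can add to $H_{A_{N-1}}(p, q)$ any of the two potentials $e^{-q_N - q_{N-1}}$ or $\alpha e^{-q_N} + \beta e^{-2q_N}$ on one end combined with any of the two potentials $e^{q_1 + q_2}$ or $\gamma e^{q_1} + \delta e^{2q_1}$ on the other end, where $\alpha$, $\beta$, $\gamma$, $\delta$ are arbitrary constants. This result is due to E Sklyanin (1987).

Generalizations: Lattices with Nearest-Neighbor Interactions

There exist further integrable lattice systems with the nearest-neighbor interaction apart from the classical exponential Toda lattice [1]. Those of the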


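type $\ddot q_n = r(\dot q_n)\left( g(q_{n+1} - q_n) - g(q_n - q_{n-1}) \right)$ have been classified by R Yamilov in 1982, and the list contains, apart from the usual Toda lattice [1], the following ones:

\[ \ddot q_n = \dot q_n \left( e^{q_{n+1} - q_n} - e^{q_n - q_{n-1}} \right) \tag{48} \]

\[ \ddot q_n = \dot q_n \left( q_{n+1} - 2q_n + q_{n-1} \right) \tag{49} \]

\[ \ddot q_n = -\left( \dot q_n^2 - \nu^2 \right) \left( \frac{1}{q_{n+1} - q_n} - \frac{1}{q_n - q_{n-1}} \right) \tag{50} \]

\[ \ddot q_n = -\left( \dot q_n^2 - \nu^2 \right) \left( \coth(q_{n+1} - q_n) - \coth(q_n - q_{n-1}) \right) \tag{51} \]

Equations [48] are known as the ‘‘modified Toda lattice.’’ Equations [49] describe the ‘‘dual Toda lattice’’ which was instrumental in the original discovery by Toda (see Toda (1989)). All systems [49]–[51] can be obtained from [11] via suitable parametrizations of the variables $a_n$, $b_n$ by canonically conjugate ones $q_n$, $p_n$, similar to [10] for [1], see Suris (2003).

A remarkable discovery of the integrable relativistic Toda lattice is due to S Ruijsenaars (1990). This lattice with the equations of motion

\[ \ddot q_n = (1 + \alpha \dot q_n) \left( (1 + \alpha \dot q_{n+1}) \frac{e^{q_{n+1} - q_n}}{1 + \alpha^2 e^{q_{n+1} - q_n}} - (1 + \alpha \dot q_{n-1}) \frac{e^{q_n - q_{n-1}}}{1 + \alpha^2 e^{q_n - q_{n-1}}} \right) \tag{52} \]

can be considered as the perturbation of the usual Toda lattice with the small parameter $\alpha$ (the inverse speed of light).

A class of integrable lattice systems of the relativistic Toda type $\ddot q_n = r(\dot q_n)\left( \dot q_{n+1} f(q_{n+1} - q_n) - \dot q_{n-1} f(q_n - q_{n-1}) + g(q_{n+1} - q_n) - g(q_n - q_{n-1}) \right)$ is richer than that of the Toda type, and has been isolated by Yu B Suris and by V Adler and A Shabat in 1997. The list contains, apart from the relativistic Toda lattice [52], two more $\alpha$-perturbations of the usual Toda lattice [1]:

\[ \ddot q_n = (1 + \alpha \dot q_{n+1}) e^{q_{n+1} - q_n} - (1 + \alpha \dot q_{n-1}) e^{q_n - q_{n-1}} - \alpha^2 \left( e^{2(q_{n+1} - q_n)} - e^{2(q_n - q_{n-1})} \right) \tag{53} \]

\[ \ddot q_n = (1 - \alpha \dot q_n)^2 \left( (1 - \alpha \dot q_{n+1}) e^{q_{n+1} - q_n} - (1 - \alpha \dot q_{n-1}) e^{q_n - q_{n-1}} \right) \tag{54} \]

two $\alpha$-perturbations of the modified Toda lattice [48]:

\[ \ddot q_n = \dot q_n \left( e^{q_{n+1} - q_n} - e^{q_n - q_{n-1}} + \frac{\alpha \dot q_{n+1} e^{q_{n+1} - q_n}}{1 + \alpha e^{q_{n+1} - q_n}} - \frac{\alpha \dot q_{n-1} e^{q_n - q_{n-1}}}{1 + \alpha e^{q_n - q_{n-1}}} \right) \tag{55} \]

\[ \ddot q_n = \dot q_n (1 - \alpha \dot q_n) \left( (1 - \alpha \dot q_{n+1}) \frac{e^{q_{n+1} - q_n}}{1 + \alpha e^{q_{n+1} - q_n}} - (1 - \alpha \dot q_{n-1}) \frac{e^{q_n - q_{n-1}}}{1 + \alpha e^{q_n - q_{n-1}}} \right) \tag{56} \]

two $\alpha$-perturbations of the dual Toda lattice [49]:

\[ \ddot q_n = \dot q_n \left( q_{n+1} - 2q_n + q_{n-1} \right) + \frac{\alpha \dot q_{n+1} \dot q_n}{1 + \alpha (q_{n+1} - q_n)} - \frac{\alpha \dot q_n \dot q_{n-1}}{1 + \alpha (q_n - q_{n-1})} \tag{57} \]

\[ \ddot q_n = \dot q_n (1 + \alpha^2 \dot q_n) \left( \frac{q_{n+1} - q_n - \alpha \dot q_{n+1}}{1 + \alpha (q_{n+1} - q_n)} - \frac{q_n - q_{n-1} - \alpha \dot q_{n-1}}{1 + \alpha (q_n - q_{n-1})} \right) \tag{58} \]

and one $\alpha$-perturbation of each of the systems [50] and [51]:

\[ \ddot q_n = -\left( \dot q_n^2 - \nu^2 \right) \left( \frac{q_{n+1} - q_n - \alpha \dot q_{n+1}}{(q_{n+1} - q_n)^2 - (\alpha\nu)^2} - \frac{q_n - q_{n-1} - \alpha \dot q_{n-1}}{(q_n - q_{n-1})^2 - (\alpha\nu)^2} \right) \tag{59} \]

\[ \ddot q_n = -\frac{1}{2} \left( \dot q_n^2 - \nu^2 \right) \left( \frac{\sinh 2(q_{n+1} - q_n) - \nu^{-1} \sinh(2\alpha\nu)\, \dot q_{n+1}}{\sinh^2(q_{n+1} - q_n) - \sinh^2(\alpha\nu)} - \frac{\sinh 2(q_n - q_{n-1}) - \nu^{-1} \sinh(2\alpha\nu)\, \dot q_{n-1}}{\sinh^2(q_n - q_{n-1}) - \sinh^2(\alpha\nu)} \right) \tag{60} \]

A detailed study of all these systems, their interrelations, and time discretizations can be found in Suris (2003).

There exist also lattices with more complicated nearest-neighbor interactions, involving elliptic functions. They were discovered by A Shabat and R Yamilov (1990), and by I Krichever (2000). For example, the nonrelativistic elliptic Toda lattice is governed by the equations

\[ \ddot q_n = \left( \dot q_n^2 - 1 \right) \left( V(q_n, q_{n+1}) + V(q_n, q_{n-1}) \right) \tag{61} \]

where $V(q, q') = \zeta(q + q') + \zeta(q - q') - \zeta(2q)$ is an elliptic function in both arguments $q$, $q'$ (here $\zeta(q)$ is the Weierstrass $\zeta$-function).

Further Developments and Generalizations

Sato’s Theory

Formulas [6], [31], and [35] have the same structure, with the case-dependent functions $\tau_n(t)$ given by the determinants [7] for the multisoliton solution in the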

infinite case, by the Hankel determinants [32] or by the minors of the matrix $\exp(L(0))$ in the open case, and by the multidimensional theta functions in the periodic case. All these seemingly different objects are actually particular cases of a beautiful construction due to M Sato (1981), developed by E Date, M Jimbo, M Kashiwara, T Miwa (1981–83), and by G Segal and G Wilson (1985), which provides one of the major unifying schemes for the theory of integrable systems. In this construction, integrable systems are interpreted as simple dynamical systems on an infinite-dimensional Grassmannian. The $\tau$-function (first invented by R Hirota in 1971) receives in this theory a representation-theoretical interpretation in terms of the determinant bundle over the Grassmannian.

Band Matrices

The Lax matrices [13] and [16] in the Manakov–Flaschka variables can be easily generalized: in the symmetric matrix $L'$ one can admit nonvanishing elements in the band of the width $2s + 1 > 3$ around the main diagonal, in the Heisenberg matrix $L$ one can admit more nonvanishing diagonals in the upper-triangle part. A systematic presentation of a large body of relevant results is given in Kupershmidt (1985). In the setting of finite lattices, the integrability of such systems becomes a nontrivial problem (as opposed to the tridiagonal situation), because the number of independent conjugation-invariant functions $\mathrm{tr}(L^s)$ becomes less than the number of degrees of freedom. An effective approach to this problem based on the semi-invariant functions has been found by P Deift, L-Ch Li, T Nanda, and C Tomei in 1986.

Two-Dimensional Toda Lattices

Up to now, we considered integrable lattices with one continuous and one discrete independent variable. This allows for a further generalization. Integrable systems with two continuous and one discrete independent variables are well known and widely used as models of the field theory. For instance, the Toda field theory deals with the system

\[ (q_n)_{xy} = e^{q_{n+1} - q_n} - e^{q_n - q_{n-1}} \tag{62} \]

introduced in the soliton theory by A Mikhailov in 1979. This two-dimensional system admits all possible kinds of reductions and generalizations mentioned above for the usual Toda lattice. In particular, the periodic two-dimensional Toda lattice is referred to as the affine Toda field theory (with the prominent example of the sine-Gordon field which corresponds to the period 2). Later, it was realized that the equivalent equation $(\log v_n)_{xy} = v_{n+1} - 2v_n + v_{n-1}$, which is obtained from [62] by setting $v_n = \exp(q_{n+1} - q_n)$, already appeared in studies by G Darboux in the 1880s, as the equation satisfied by the Laplace invariants of the chain of Laplace transformations of a given conjugate net. This relation to the classical differential geometry was extensively studied by G Darboux, G Tzitzéica, and others long before the advent of the theory of integrable systems. Another link to the differential geometry is a more recent observation, and relates the two-dimensional Toda lattice, with the d’Alembert operator $(\cdot)_{xy}$ on the left-hand side of [62] replaced by the Laplace operator $(\cdot)_{z\bar z}$, to harmonic maps. For instance, the sinh-Gordon equation $u_{z\bar z} = \sinh u$ governs harmonic maps from $\mathbb{C}$ into the unit sphere $S^2$, which can be interpreted also as Gauss maps of the constant mean curvature surfaces in $\mathbb{R}^3$. A review of this topic can be found in Guest (1997).

Discretization of Toda lattices, nonabelian Toda lattices, quantization of Toda lattices, dispersionless limit of Toda lattices, etc., are only some of the further relevant topics, which cannot be discussed in any detail in the restricted frame of this article, and the same holds, unfortunately, for such fascinating applications of the Toda lattice as the Frobenius manifolds, Laplacian growth problem, quantum cohomology, random matrix theory, two-dimensional gravity, etc.

See also: Bäcklund Transformations; Bi-Hamiltonian Methods in Soliton Theory; Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups; Current Algebra; Dynamical Systems and Thermodynamics; Functional Equations and Integrable Systems; Integrable Discrete Systems; Integrable Systems and Discrete Geometry; Integrable Systems and the Inverse Scattering Method; Integrable Systems: Overview; Lie Groups: General Theory; Multi-Hamiltonian Systems; Quantum Calogero–Moser Systems; Separation of Variables for Differential Equations; Solitons and Kac–Moody Lie Algebras; WDVV Equations and Frobenius Manifolds.

Further Reading

Adler M (1979) On a trace functional for formal pseudo-differential operators and the symplectic structure for Korteweg–de Vries type equations. Invent. Mathematics 50: 219–248.
Adler VE and Shabat AB (1997) On a class of Toda chains. Teor. Mat. Phys. 111: 323–334 (in Russian; English translation: Theor. Math. Phys. 111: 647–657); Generalized Legendre transformations. Teor. Mat. Phys. 112: 179–194 (in Russian; English translation: Theor. Math. Phys. 112: 935–948).
Adler M and van Moerbeke P (1980) Completely integrable systems, Kac–Moody algebras and curves. Advances in Mathematics 38: 267–317; Linearization of Hamiltonian systems, Jacobi varieties and representation theory. Advances in Mathematics 38: 318–379.

Bogoyavlensky OI (1976) On perturbations of the periodic Toda lattice. Communications in Mathematical Physics 51: 201–209.
Date E, Jimbo M, and Miwa T (1982–83) Method for generating discrete soliton equations. I–V. Journal of the Physical Society of Japan 51: 4116–4124, 4125–4131; 52: 388–393, 761–765, 766–771.
Date E and Tanaka S (1976) Analogue of inverse scattering theory for the discrete Hill’s equation and exact solutions for the periodic Toda lattice. Progress in Theoretical Physics 55: 457–465.
Deift P, Li L-C, Nanda T, and Tomei C (1986) The Toda flow on a generic orbit is integrable. Communications in Pure and Applied Mathematics 39: 183–232.
Flaschka H (1974) On the Toda lattice II. Inverse scattering solution. Progress in Theoretical Physics 51: 703–716.
Guest MA (1997) Harmonic Maps, Loop Groups, and Integrable Systems. Cambridge: Cambridge University Press.
Hénon M (1974) Integrals of the Toda lattice. Physical Review B 9: 1921–1923.
Krichever IM (1978) Algebraic curves and nonlinear difference equations. Uspekhi Mat. Nauk 33: 215–216 (in Russian).
Krichever I (2000) Elliptic analog of the Toda Lattice. Int. Math. Res. Notes 8: 383–412.
Kupershmidt BA (1985) Discrete Lax equations and differential-difference calculus. Asterisque 123: 212.
Magri F (1978) A simple model of the integrable Hamiltonian equation. Journal of Mathematical Physics 19: 1156–1162.
Manakov SV (1974) On the complete integrability and stochastization in discrete dynamical systems. Zh. Exp. Theoretical Physics 67: 543–555 (in Russian; English translation: Soviet Physics JETP 40: 269–274).
Mikhailov AV (1979) Integrability of a two-dimensional generalization of the Toda chain. Pis’ma Zh. Eksp. Teor. Fiz. 30: 443–448 (in Russian; English translation: JETP Letters 30: 414–418).
Moser J (1975) Finitely many mass points on the line under the influence of an exponential potential – an integrable system. Lecture Notes Physics 38: 467–497.
Olshanetsky MA and Perelomov AM (1979) Explicit solutions of classical generalized Toda models. Invent. Math. 54: 261–269.
Reyman AG and Semenov-Tian-Shansky MA (1979–81) Reduction of Hamiltonian systems, affine Lie algebras and Lax equations. I, II. Invent. Math. 54: 81–100, 63: 423–432.
Reyman AG and Semenov-Tian-Shansky MA (1994) Group theoretical methods in the theory of finite dimensional integrable systems. In: Arnold VI and Novikov SP (eds.) Encyclopaedia of Mathematical Science, vol. 16. Dynamical Systems VII, pp. 116–225. Berlin: Springer.
Ruijsenaars SNM (1990) Relativistic Toda systems. Communications in Mathematical Physics 133: 217–247.
Sato M and Sato Y (1983) Soliton equations as dynamical systems on infinite-dimensional Grassmann manifold. In: Nonlinear Partial Differential Equations in Applied Science (Tokyo, 1982), pp. 259–271. North-Holland: Amsterdam.
Segal G and Wilson G (1985) Loop groups and equations of KdV type. Inst. Hautes Études Sci. Publ. Math. 61: 5–65.
Shabat AB and Yamilov RI (1990) Symmetries on nonlinear chains. Algebra i Analiz 2: 183–208 (in Russian; English translation: Leningrad Mathematical Journal 2: 377–400).
Sklyanin EK (1987) Boundary conditions for integrable equations. Funkts. Anal. Prilozh. 21: 86–87 (in Russian; English translation: Funct. Anal. Appl. 21: 164–166).
Suris YB (1997) New integrable systems related to the relativistic Toda lattice. Journal of Physics A: Math. and Gen. 30: 1745–1761.
Suris YuB (2003) The Problem of Integrable Discretization: Hamiltonian Approach. Basel: Birkhäuser.
Toda M (1967) Vibration of a chain with nonlinear interaction. Journal of Physical Society Japan 22: 431–436.
Toda M (1989) Theory of Nonlinear Lattices. Berlin: Springer.
Toda M and Wadati M (1975) A canonical transformation for the exponential lattice. Journal of the Physical Society of Japan 39: 1204–1211.
Yamilov RI (1982) On the classification of discrete equations. In: Integrable Systems, Ufa, 95–114 (in Russian).

Toeplitz Determinants and Statistical Mechanics


E L Basor, California Polytechnic State University, San Luis Obispo, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A finite Toeplitz matrix is an $n \times n$ matrix with the following structure:

\[
\begin{pmatrix}
a_0 & a_{-1} & a_{-2} & \cdots & a_{-n+1} \\
a_1 & a_0 & a_{-1} & \cdots & a_{-n+2} \\
a_2 & a_1 & a_0 & \cdots & a_{-n+3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n-1} & a_{n-2} & a_{n-3} & \cdots & a_0
\end{pmatrix}
\tag{1}
\]

The entries depend on the difference $i - j$ and hence they are constant down all the diagonals. There are two cases when the determinant is easy to compute. One is when the matrix is upper- or lower-triangular and the determinant is $a_0^n$. The other case is when the matrix is of the form

\[
\begin{pmatrix}
a_0 & a_{n-1} & a_{n-2} & \cdots & a_1 \\
a_1 & a_0 & a_{n-1} & \cdots & a_2 \\
a_2 & a_1 & a_0 & \cdots & a_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n-1} & a_{n-2} & a_{n-3} & \cdots & a_0
\end{pmatrix}
\tag{2}
\]

In this latter case, the matrix is called a circulant matrix and the eigenvalues are given by the formula

\[ \varphi_n(e^{i 2\pi k/n}), \qquad 0 \le k \le n - 1 \]

where

\[ \varphi_n(e^{i\theta}) = \sum_{j=0}^{n-1} a_j e^{ij\theta} \]
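The circulant eigenvalue formula can be checked by direct computation. The sketch below is an illustration under stated conventions, not part of the original text: the matrix entry in row $r$, column $c$ is taken to be $a_{(r-c) \bmod n}$ as in [2], the eigenvalue is $\sum_j a_j e^{ij\theta}$ at $\theta = 2\pi k/n$, and the test vector has entries $e^{-2\pi i k m/n}$ (with this row/column convention the exponent of the vector carries a minus sign). The helper names (`circulant`, `symbol`, `fourier_vector`, `residual`, `det`) and the sample coefficients are hypothetical.

```python
import cmath
import math


def circulant(a):
    """Circulant matrix [2]: entry (r, c) equals a[(r - c) mod n]."""
    n = len(a)
    return [[a[(r - c) % n] for c in range(n)] for r in range(n)]


def symbol(a, theta):
    """phi_n(e^{i theta}) = sum_j a_j e^{i j theta}."""
    return sum(a_j * cmath.exp(1j * j * theta) for j, a_j in enumerate(a))


def fourier_vector(n, k):
    """Vector with entries e^{-2 pi i k m / n}, m = 0, ..., n-1 (assumed sign)."""
    return [cmath.exp(-2j * math.pi * k * m / n) for m in range(n)]


def residual(a, k):
    """Largest entry of |C v - lambda v| for the k-th eigenpair candidate."""
    n = len(a)
    c_mat = circulant(a)
    v = fourier_vector(n, k)
    lam = symbol(a, 2 * math.pi * k / n)
    cv = [sum(c_mat[r][c] * v[c] for c in range(n)) for r in range(n)]
    return max(abs(cv[r] - lam * v[r]) for r in range(n))


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


a = [2.0, 1.0, 0.0, 3.0]          # sample coefficients a_0, ..., a_3
n = len(a)
residuals = [residual(a, k) for k in range(n)]

eigenvalue_product = 1.0
for k in range(n):
    eigenvalue_product *= symbol(a, 2 * math.pi * k / n)

determinant = det(circulant(a))
print("residuals:", residuals)
print("product of eigenvalues:", eigenvalue_product)
print("determinant:", determinant)
```

For these sample coefficients the four eigenvalues are $6$, $2 - 2i$, $-2$, $2 + 2i$, so both the product and the determinant should come out as $-96$ up to rounding.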



The corresponding eigenvector for eigenvalue $\varphi_n(e^{i 2\pi k/n})$ is

\[ [1, e^{-i 2\pi k/n}, \ldots, e^{-i 2\pi k(n-1)/n}] \]

This can be verified by direct computation. The role of circulant matrices will not be emphasized in this article, although they are used in the computation of the generating function for certain dimer configurations and also in applications using the discrete Fourier transform.

The most common way to generate a finite Toeplitz matrix is with the Fourier coefficients of an integrable function. Let $\varphi : \mathbb{T} \to \mathbb{C}$ be a function defined on the unit circle with Fourier coefficients

\[ \varphi_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(e^{i\theta}) e^{-ik\theta}\, d\theta \tag{3} \]

We define $T_n(\varphi)$ to be the Toeplitz matrix:

\[ T_n(\varphi) = \left( \varphi_{i-j} \right)_{i,j=0}^{n-1} \]

A basic problem that in large part has been motivated by statistical mechanics is to determine the asymptotic behavior of the determinant of $T_n(\varphi)$ as $n \to \infty$. The determinant will be referred to as $D_n(\varphi)$, where $\varphi$ is called the generating function of the determinant. If the generating function has the property that its Fourier coefficients vanish for negative index (positive index) then the corresponding matrix is lower-triangular (upper-triangular) and hence the determinant is $\varphi_0^n$. For other cases, the determinant is not easy to determine and requires additional mathematical machinery.

Some of the primary motivation to study the determinant of these matrices comes from the two-dimensional Ising model. We consider the Onsager lattice in the absence of a magnetic field with sites labeled by

\[ (i, j), \qquad 0 \le i \le M, \quad 0 \le j \le N \]

and with a value $\sigma_{i,j} = \pm 1$ assigned to each site. In the Ising model, $\sigma_{i,j}$ signifies the state of the spin at the site $(i, j)$. To each possible configuration of spins, we define an energy

\[ E(\sigma) = -E_1 \sum_{i,j} \sigma_{i,j} \sigma_{i+1,j} - E_2 \sum_{i,j} \sigma_{i,j} \sigma_{i,j+1} \]

Let

\[ Z = \sum_{\sigma} e^{-\beta E(\sigma)} \]

be the partition function. Then the probability of a given configuration is

\[ \frac{1}{Z} e^{-\beta E(\sigma)} \]

Here $E_1$, $E_2$, and $\beta = 1/kT$ are, without loss of generality, assumed to be positive constants, $T$ is the temperature, and $k$ is the Boltzmann constant. If $X$ is a random variable defined on the space of configurations, the expectation is given by

\[ E(X) = \frac{1}{Z} \sum_{\sigma = \pm 1} X(\sigma) e^{-\beta E(\sigma)} \]

Let $n$ be fixed for the moment and assume toroidal boundary conditions for the lattice and then let $N, M \to \infty$. It is known that the random variable

\[ X(\sigma) = \sigma_{0,0} \sigma_{0,n} \]

has expectation $\langle \sigma_{0,0} \sigma_{0,n} \rangle$ given by $D_n(\varphi)$, where

\[ \varphi(e^{i\theta}) = \left[ \frac{(1 - \alpha_1 e^{i\theta})(1 - \alpha_2 e^{-i\theta})}{(1 - \alpha_1 e^{-i\theta})(1 - \alpha_2 e^{i\theta})} \right]^{1/2} \]

\[ \alpha_1 = z_1 \left( \frac{1 - z_2}{1 + z_2} \right), \qquad \alpha_2 = z_1^{-1} \left( \frac{1 - z_2}{1 + z_2} \right) \]

and

\[ z_1 = \tanh \beta E_1, \qquad z_2 = \tanh \beta E_2 \]

The square root is taken so that $\varphi(e^{i\pi}) = 1$. This formula was first stated by Onsager and later verified in a difficult computation by Montroll, Potts, and Ward.

The spontaneous magnetization $M$ for the Ising model is defined by

\[ M^2 = \lim_{n \to \infty} \langle \sigma_{0,0} \sigma_{0,n} \rangle = \lim_{n \to \infty} D_n(\varphi) \]

Note that it is the square root of the correlation between two distant sites. Hence, the asymptotics of the Toeplitz determinants will determine whether the magnetization is positive or tends to zero as $n \to \infty$.

Strong Szegö Limit Theorem

To determine the behavior of the determinants, we need to analyze the generating function $\varphi$. Let us first consider the case where $\alpha_2 < 1$. (It is always the case that $0 < \alpha_1 < 1$.) This generating function is differentiable, nonzero and has winding number zero, and it is for functions of this type that a second-order expansion of the Toeplitz determinants can be described. The expansion first formulated by Szegö, in response to the question concerning the

spontaneous magnetization, is called the ‘‘strong Szegö limit theorem.’’

Before proving the Szegö theorem, it should be remarked that we can view the finite Toeplitz matrix as a truncation of an infinite array,

\[
\begin{pmatrix}
\varphi_0 & \varphi_{-1} & \varphi_{-2} & \cdots \\
\varphi_1 & \varphi_0 & \varphi_{-1} & \ddots \\
\varphi_2 & \varphi_1 & \varphi_0 & \ddots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix}
\tag{4}
\]

The above infinite array is the matrix representation for the Toeplitz operator

\[ T(\varphi) : H^2 \to H^2 \]

defined by

\[ T(\varphi) f = P(\varphi f) \]

where $H^2$ is the Hardy space

\[ \{ f \in L^2(\mathbb{T}) \mid f_k = 0,\ k < 0 \} \]

the function $\varphi \in L^\infty(\mathbb{T})$, and $P$ is the orthogonal projection of $L^2(\mathbb{T})$ onto $H^2$. The matrix representation given in [4] is with respect to the Hilbert space basis of $H^2$,

\[ \{ e^{ik\theta} \mid 0 \le k < \infty \} \]

and $\varphi$ is called the symbol of the operator. Now define $P_n : H^2 \to H^2$ by

\[ P_n(f_0, f_1, f_2, \ldots) = (f_0, f_1, f_2, \ldots, f_{n-1}, 0, 0, \ldots) \]

The finite Toeplitz matrix can be thought of as the upper-left corner of the array given in [4] or as $P_n T(\varphi) P_n$.

To prove the Strong Szegö limit theorem, we introduce the Banach algebra $B$ of bounded functions $f$ satisfying $\sum_{k=-\infty}^{\infty} |k|\, |f_k|^2 < \infty$.

Theorem 1 (Strong Szegö limit theorem). Assume $\varphi = \varphi_- \varphi_+$, where $\varphi_\pm$ have logarithms in $B$. Suppose $\log \overline{\varphi_-}, \log \varphi_+ \in H^2$. Then

\[ \lim_{n \to \infty} D_n(\varphi)/G(\varphi)^n = E(\varphi) = \exp\left( \sum_{k=1}^{\infty} k\, s_k s_{-k} \right) \]

where $G(\varphi) = \exp((\log \varphi)_0)$ and $s_k = (\log \varphi)_k$.

Since $B$ is a Banach algebra, it follows that if $\log \varphi_\pm$ belong to $B$ so do

\[ \varphi_-,\ \varphi_+,\ \varphi_+^{-1},\ \varphi_-^{-1},\ \varphi,\ \varphi^{-1} \]

and hence they are bounded. Since $\varphi_+$ is in $H^2$ as well, its Fourier coefficients vanish for negative index and the Toeplitz operator has a corresponding infinite array that is lower-triangular. The Fourier coefficients vanish for positive index for $\varphi_-$ and hence the infinite array is upper-triangular. From this, it follows that

\[ T(\varphi_+) T(\varphi_+^{-1}) = T(\varphi_-) T(\varphi_-^{-1}) = I \tag{5} \]

\[ T(\varphi_-) T(\varphi_+) = T(\varphi) \tag{6} \]

and

\[ P_n T(\varphi_+) = P_n T(\varphi_+) P_n, \qquad P_n T(\varphi_-) P_n = T(\varphi_-) P_n \tag{7} \]

This yields

\[ D_n(\varphi) = \det T_n(\varphi) = \det P_n T(\varphi) P_n \tag{8} \]

\[ = \det P_n T(\varphi_+) T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) T(\varphi_-) P_n \tag{9} \]

\[ = \det P_n T(\varphi_+) P_n\, T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1})\, P_n T(\varphi_-) P_n \tag{10} \]

\[ = \det P_n T(\varphi_+) P_n \cdot \det\left( P_n T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) P_n \right) \cdot \det P_n T(\varphi_-) P_n \tag{11} \]

The determinants of the right-hand and left-hand factors in the above expression are $((\varphi_-)_0)^n$ and $((\varphi_+)_0)^n$, respectively. Now given the Banach algebra conditions imposed on the symbol $\varphi$, it follows that the operator

\[ T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) \]

is of the form $I + K$, where $K$ is trace class. Hence, the eigenvalues $\lambda_i$ of $K$ satisfy

\[ \sum_i |\lambda_i| < \infty \]

and the infinite (Fredholm) determinant of $I + K$ is defined. To verify the claim that the operator

\[ T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) = T(\varphi_+^{-1}) T(\varphi_-) T(\varphi_+) T(\varphi_-^{-1}) \]

is $I$ plus a trace class operator, we use the identity

\[ T(fg) - T(f) T(g) = H(f) H(\tilde g) \tag{12} \]

where $H(f)$ has matrix form $(f_{i+j+1})_{i,j=0}^{\infty}$, and $\tilde g(e^{i\theta}) = g(e^{-i\theta})$. Our Banach algebra conditions show that if $f$ is in $B$ then the operator $H(f)$ satisfies $\sum_{i,j} |a_{ij}|^2 < \infty$, where the $a_{ij}$ are the matrix entries of the operator. Any operator satisfying this is called a Hilbert–Schmidt operator, and it is known that the product of two Hilbert–Schmidt operators is trace class. Applying the identity to

\[ T(\varphi_+^{-1}) T(\varphi_-) \]

shows that this operator is $T(\varphi_+^{-1} \varphi_-)$ plus trace class. The operator

\[ T(\varphi_+) T(\varphi_-^{-1}) \]

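is thus $T(\varphi_+ \varphi_-^{-1})$ plus trace class and one more application of the identity combined with the fact that trace class operators form an ideal yields the desired result.

From the theory of infinite determinants, as $n \to \infty$,

\[ \det\left( P_n T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) P_n \right) \tag{13} \]

converges to

\[ \det\left( T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) \right) \tag{14} \]

At this point, we have proved that

\[ \lim_{n \to \infty} D_n(\varphi) / \left( ((\varphi_-)_0)^n ((\varphi_+)_0)^n \right) = \lim_{n \to \infty} \det\left( P_n T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) P_n \right) = \det\left( T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) \right) \tag{15} \]

It only remains to identify the constants. To see that

\[ G(\varphi)^n = ((\varphi_-)_0)^n ((\varphi_+)_0)^n \]

we note that

\[
\begin{aligned}
G(\varphi) = \exp((\log \varphi)_0) &= \exp\left( \frac{1}{2\pi} \int_0^{2\pi} \log \varphi(e^{i\theta})\, d\theta \right) \\
&= \exp\left( \frac{1}{2\pi} \int_0^{2\pi} \left( \log \varphi_-(e^{i\theta}) + \log \varphi_+(e^{i\theta}) \right) d\theta \right) \\
&= \exp((\log \varphi_-)_0) \exp((\log \varphi_+)_0) = (\varphi_-)_0 (\varphi_+)_0
\end{aligned}
\]

To compute the determinant of

\[ T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) \]

we write

\[
\begin{aligned}
\det\left( T(\varphi_+^{-1}) T(\varphi) T(\varphi_-^{-1}) \right) &= \det\left( T(\varphi_+^{-1}) T(\varphi_- \varphi_+) T(\varphi_-^{-1}) \right) \\
&= \det\left( T(\varphi_+^{-1}) T(\varphi_-) T(\varphi_+) T(\varphi_-^{-1}) \right)
\end{aligned}
\]

This last expression is of the form

\[ e^{A} e^{B} e^{-A} e^{-B} \]

where

\[ A = -T(\log \varphi_+) \quad \text{and} \quad B = T(\log \varphi_-) \]

If $AB - BA$ is trace class then

\[ \det e^{A} e^{B} e^{-A} e^{-B} = e^{\mathrm{tr}(AB - BA)} \]

The operator $AB - BA$ is

\[ -T(\log \varphi_+) T(\log \varphi_-) + T(\log \varphi_-) T(\log \varphi_+) \]

which equals

\[ -T(\log \varphi_+) T(\log \varphi_-) + T((\log \varphi_-)(\log \varphi_+)) \]

and, by the identity from eqn [12], becomes

\[ H(\log \varphi_+) H(\widetilde{\log \varphi_-}) \]

It can be directly computed that

\[ \mathrm{tr}\left( H(\log \varphi_+) H(\widetilde{\log \varphi_-}) \right) \]

equals

\[ \sum_{k=1}^{\infty} k\, s_k s_{-k} \]

and the theorem is proved.

Returning to the Ising model, one needs to compute the asymptotics of the determinants for the generating function

\[ \varphi(e^{i\theta}) = \left[ \frac{(1 - \alpha_1 e^{i\theta})(1 - \alpha_2 e^{-i\theta})}{(1 - \alpha_1 e^{-i\theta})(1 - \alpha_2 e^{i\theta})} \right]^{1/2} \]

The term $G(\varphi) = 1$ and for $k > 0$

\[ k\, s_k s_{-k} = -\frac{1}{4} \left( \frac{\alpha_1^{2k}}{k} + \frac{\alpha_2^{2k}}{k} - 2\, \frac{\alpha_1^k \alpha_2^k}{k} \right) \]

from which it follows that

\[ \lim_{n \to \infty} D_n(\varphi) = \left[ \frac{(1 - \alpha_1^2)(1 - \alpha_2^2)}{(1 - \alpha_1 \alpha_2)^2} \right]^{1/4} \]

Recalling the definition of $\alpha_1$ and $\alpha_2$ yields

\[ \lim_{n \to \infty} \langle \sigma_{0,0} \sigma_{0,n} \rangle = \left[ 1 - \frac{1}{(\sinh 2\beta E_1 \sinh 2\beta E_2)^2} \right]^{1/4} \]

or the spontaneous magnetization $M$ as

\[ M = \left( 1 - \frac{1}{(\sinh 2\beta E_1 \sinh 2\beta E_2)^2} \right)^{1/8} \]

In order for this computation to be valid, it was necessary for $0 < \alpha_2 < 1$, and by elementary computations one can show that this is equivalent to the inequality

\[ \sinh 2\beta E_1 \sinh 2\beta E_2 > 1 \]

Nonsmooth Symbols or $T = T_c$

A problem occurs in the analysis just outlined when the inequality $0 < \alpha_2 < 1$ does not hold. There are two separate possibilities, $\alpha_2 > 1$ or $\alpha_2 = 1$. First, we consider the latter case. For fixed $E_1$ and $E_2$, this happens for exactly one fixed value of the constant $\beta_c = 1/kT_c$ and the corresponding temperature $T_c$ is called the critical temperature. The ‘‘strong Szegö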

limit theorem’’ does not apply since our generating function is of the form

\[ \phi(e^{i\theta}) = \left[ \frac{(1-\alpha_1 e^{i\theta})(1-e^{-i\theta})}{(1-\alpha_1 e^{-i\theta})(1-e^{i\theta})} \right]^{1/2} \qquad [16] \]

In 1968, Fisher and Hartwig raised a conjecture about $D_n(\phi)$ for nonsmooth $\phi$ which included the above example. They considered generating functions of the form

\[ \phi(e^{i\theta}) = b(e^{i\theta}) \prod_{j=1}^{R} \phi_{\alpha_j,\beta_j}(e^{i(\theta-\theta_j)}) \qquad [17] \]

where

\[ \phi_{\alpha,\beta}(e^{i\theta}) = (2-2\cos\theta)^{\alpha}\, e^{i\beta(\theta-\pi)}, \qquad 0<\theta<2\pi, \]

$\operatorname{Re}\alpha > -1/2$, and $\beta$ is not an integer. The function $b$ is assumed to be a smooth function. Using the Fisher–Hartwig notation, the symbol of interest in the Ising model from eqn [16] can be written as

\[ b(e^{i\theta})\,\phi_{0,-1/2}(e^{i\theta}) \]

where

\[ b(e^{i\theta}) = \left( \frac{1-\alpha_1 e^{i\theta}}{1-\alpha_1 e^{-i\theta}} \right)^{1/2} \]

The conjecture of Fisher and Hartwig for general symbols of this type stated that

\[ D_n(\phi) \sim G(b)^n\, n^{p}\, E \]

where

\[ p = \sum_{r=1}^{R} (\alpha_r^2 - \beta_r^2) \]

and $E$ is a constant whose value they did not identify. The constant was later computed to be

\[ E = E(b) \prod_{j=1}^{R} b_+(e^{i\theta_j})^{-\alpha_j+\beta_j}\, b_-(e^{i\theta_j})^{-\alpha_j-\beta_j} \times \prod_{1\le s\ne r\le R} \bigl(1-e^{i(\theta_s-\theta_r)}\bigr)^{-(\alpha_s+\beta_s)(\alpha_r-\beta_r)} \times \prod_{j=1}^{R} \frac{G(1+\alpha_j+\beta_j)\,G(1+\alpha_j-\beta_j)}{G(1+2\alpha_j)} \]

where $G(z)$ is the Barnes G-function satisfying

\[ G(1+z) = \Gamma(z)\,G(z) \]

and is defined by

\[ G(1+z) = (2\pi)^{z/2}\, e^{-(z+1)z/2 - \gamma_E z^2/2} \prod_{k=1}^{\infty} \left(1+\frac{z}{k}\right)^{k} e^{-z+z^2/(2k)} \]

For the above factors, we normalize so that the geometric mean is 1. Then we may assume that the factors $b_+$, $b_-$ ($b_- b_+ = b$) are 1 at zero and infinity, respectively, and this defines the logarithms for the first product. The $E(b)$ term is the constant in Szegö’s theorem, and the argument of a term of the form $(1-e^{i(\theta_s-\theta_r)})$ is taken between $-\pi/2$ and $\pi/2$.

In the case where $R = 1$, the conjecture is known to hold if $\operatorname{Re}\alpha > -1/2$ and the function $b$ satisfies the conditions of Szegö’s theorem and is infinitely differentiable. The theorem also has an extension to the case where $\operatorname{Re}\alpha < -1/2$, with $2\alpha$ not an integer, as long as the Fourier coefficients are defined as the coefficients of a distribution.

If we apply the theorem to the generating function from [16]

\[ b(e^{i\theta})\,\phi_{0,-1/2}(e^{i\theta}) = \left( \frac{1-\alpha_1 e^{i\theta}}{1-\alpha_1 e^{-i\theta}} \right)^{1/2} \phi_{0,-1/2}(e^{i\theta}) \]

we see that the asymptotic expansion is given by

\[ \left( \frac{1+\alpha_1}{1-\alpha_1} \right)^{1/4} n^{-1/4}\, G(1/2)\,G(3/2) \]

This last formula shows that, at the critical temperature,

\[ \lim_{n\to\infty} \langle \sigma_{0,0}\,\sigma_{0,n} \rangle = \lim_{n\to\infty} D_n(\phi) = 0 \]

thus, $M = 0$, and hence there is no correlation between distant lattice points.

It should be remarked here that the diagonal correlation at the critical temperature is also given by a singular Toeplitz determinant,

\[ \langle \sigma_{0,0}\,\sigma_{n,n} \rangle = D_n(\phi_{0,-1/2}) \sim n^{-1/4}\, G(1/2)\,G(3/2) \]

and thus this limit is also zero.

The proof of the Fisher–Hartwig conjecture is much more complicated than the proof of the ‘‘strong Szegö limit theorem.’’ For an indication of how it is proved, note that if we consider the generating function $\phi_{0,\beta}$, the Fourier coefficients are $(\sin\pi\beta)/[\pi(\beta-n)]$ and hence the matrix is Cauchy and the determinant can be computed exactly. From this the asymptotics can be derived and they yield a special case of the Fisher–Hartwig conjecture. The main idea in extending the result to a symbol of the form

\[ b(e^{i\theta})\,\phi_{0,\beta}(e^{i\theta}) \]

is to prove that the limit of

\[ \frac{D_n(b\,\phi_{0,\beta})}{D_n(b)\,D_n(\phi_{0,\beta})} \]

exists. The proof uses much of the same trace-class In the example above, 1 = 1=2, 2 = 1=2,
approach used in proving the ‘‘strong Szegö limit 1 = 0, 2 = , n1 = 1, and n2 = 1. The result for
theorem,’’ although the results are more compli- the counterexample, combined with what is known
cated. These ideas are then extended for R > 1 and for the case of integer values of  and , leads to the
also more general  and . following generalized conjecture. Suppose
It should be noted that in this article the Fisher–
Y
R
Hartwig conjecture does not always hold. If we ðei Þ ¼ k k ;k ;j
consider the function j¼1
j j

 PR
1;  <  < 0 for some set of indices k. Define Q(k) = (kj )2 
ðei Þ ¼ j=1
1; 0 <  <  (jk )2 . Let Q = maxk <(Q(k)) and

then K ¼ fk j <ðQðkÞÞ ¼ Qg
 The generalized asymptotic formula is conjectured
0, if k is even
n ¼ to be
2i/ðkÞ, if k is odd X
Dn ð Þ ¼ Gð k ÞnQðkÞ E
k þ oðjGðÞjn nQ Þ
The matrix Tn () is antisymmetric and, if n is odd, k2K
Dn () = 0. If n is even, using elementary row and
column operations, the determinant can be put in It may turn out that there is only one element in K
block form with each block of Cauchy type. The and for these symbols there is a unique representa-
determinant can then be evaluated to find tion that yields the highest power in the exponent of
the asymptotic expansion. These are the symbols for
Dn ðÞ ðiÞn n1=2 K which the original Fisher–Hartwig conjecture should
be true and it is now confirmed in these cases. For
where K is a certain constant.
example, the conjecture is known to hold for R > 1
It is instructive to note that
when j<r j < 1=2 and j<r j < 1=2.
ðei Þ ¼ 0;1=2 ðei Þ0;1=2 ðeiðÞ Þ
¼ 0;1=2 ðei Þ0;1=2 ðeiðÞ Þ

and thus that this particular symbol has two Symbols with Nonzero Index or T > Tc
representations of the type given in [17] and each
would give a different asymptotic expansion of the The last possibility in computing the correlation
determinant if the conjecture were true for this set of asymptotics is the case where 2 > 1. Note that, for
parameters. Hence, it is clear that the conjecture fixed E1 and E2 , there is exactly one value of
must fail to hold in this case.  = 1=kT where
 
However, this example indicates that there might 1 1  z2
be a generalization of the original conjecture of 2 ¼ z1 ¼1
1 þ z2
Fisher and Hartwig. If
For values of T > Tc , we have that the symbol
; ðeið Þ Þ ¼ ;;  1=2
ð1  1 ei Þð1  2 ei Þ
then ð1  1 ei Þð1  2 ei Þ
Y
R is the same as
¼ j ;j ;j
 1=2
j¼1 ð1  1 ei Þð1  ð1/2 Þei Þ
ei
it is also the case that ð1  1 ei Þð1  ð1/2 Þei Þ

Y
R with the argument chosen so that the symbol is

j ;j þnj ;j positive at . Except for the extra factor of ei , this
j¼1 is the same type of smooth symbol that was
considered earlier (see the section ‘‘Strong Szegö
where limit theorem’’). However, a factor of ei can change
X
R Y
R the asymptotics considerably as can be seen by
nj ¼ 0 and

¼ ðeij Þnj considering the simple example of the  1.


j¼1 j¼1 Fortunately, a variation of the Szegö theorem, first

considered by Fisher and Hartwig, holds for this Further Remarks


case of smooth, nonvanishing index.
The interaction between statistical mechanics and
Theorem 2 Suppose that  =  þ satisfies the the theory of Toeplitz determinants has a long
condition of the ‘‘strong Szegö limit theorem’’ and history, and much of the motivation to describe the
in addition is at least once continuously differenti- asymptotics of the determinants was spurred by the
able. Then, if b =  1 1
þ and c =  þ , question of spontaneous magnetization in the two-
dimensional Ising model. The previous three sections
Dn ðeim Þ attempt to show how the very different physical
ð1Þðn1þmÞm GðÞn EðÞGðcÞm ½18 situations – T < Tc , T = Tc , and T > Tc – all
correspond to very different behavior in the symbols
of the generating functions. Critical systems predict
0 0 1 1 qualitatively different Szegö type theorems. For
bn  bnmþ1
B B .. .. .. C C example, the phase transition at Tc predicts that
B B
@det@ . . .
C þ Oðn3 ÞC
A A the asymptotics for singular symbols cannot be
bn1þm  bn predicted by the smooth symbols, that is, one cannot
1
use continuous functions to approximate the results
 ð1 þ Oðn ÞÞ ½19 for singular symbols.
Onsager (1971) was the first to understand that
Applying this to the symbol
the correlation function could be expressed as a
 1=2 Toeplitz determinant. This was made explicit by
ð1  1 ei Þð1  ð1/2 Þei Þ
ei ðei Þ ¼ ei Montroll et al. (1963). For more information about
ð1  1 ei Þð1  ð1/2 Þei Þ the Ising model, the reader is referred to McCoy
and Wu (1973), where a clear and complete
we have that m = 1, G() = 1; G(c) = 1, and
description of the Ising model (and most of the
"    #1=4 notation used here in reference to this model) can
  1 1 2 be found.
EðÞ ¼ 1 21 1 2 1 ½20
2 2 Szegö (1915, 1952) had originally proved a weak
form of the ‘‘limit’’ theorem and he understood that
The determinant in the above formula is the it was desirable to extend to a second-order term.
constant Szegö first proved the ‘‘strong Szegö limit theorem’’
Z 2    for positive generating functions and this was later
1 1 extended to the nonpositive case.
bn ¼ ð1  1 ei Þ 1  ei
2 0 2 The first to understand that a different asymptotic
 1=2
1 behavior was expected at the critical temperature
 ð1  1 ei Þ 1  ei ein d was Fisher and this resulted in the conjecture for the
2
class of determinants generated by what is now
The last integral can be deformed to a segment of known as Fisher–Hartwig symbols (Fisher and
the real line and evaluated asymptotically to find Hartwig 1968). Progress on the conjecture was
that the leading term is made by many authors. Böttcher and Silbermann
   1=2 (1998) have provided general results concerning
1 1 1 Toeplitz operators and determinants. Additional
 pffiffiffi n 1 ð1  1 2 Þ 1  2
 2 2 2 information about the conjectures of Fisher and
ðn þ 1=2Þ Hartwig can be found in Böttcher and Silbermann
 (1990, 1998), Ehrhardt (2001), and Ehrhardt and
ðn þ 1Þ
Silbermann (1997).
Putting this together with the above constants, we Toeplitz determinants are also important in many
have, for T > Tc , other applications. One more recent area of interest
is the connection between random-matrix theory
h0;0 0;n i and Toeplitz determinants. Many statistical quanti-
"   #
1  
2 1=4 1 1=4 1=2
ties for the circular unitary ensemble can be
pffiffiffiffiffiffi n 1  1 1 2 ð1  1 2 Þ described as a Toeplitz determinant. For example,
n2 2
the probability of finding no eigenvalues in an
This implies that the correlation tends to zero very interval can be expressed as a Toeplitz determinant.
rapidly as n ! 1. It is also the case that many of the most interesting
Tomita–Takesaki Modular Theory 251

statistics correspond to singular symbols. For basic random-matrix theory information see Mehta (1991), and for connections between the circular unitary ensemble and Toeplitz determinants, see Hughes et al. (2001), Tracy and Widom (1993), and Widom (1994).

See also: Integrable Systems in Random Matrix Theory; Two-Dimensional Ising Model.

Further Reading

Böttcher A and Silbermann B (1990) Analysis of Toeplitz Operators. Berlin: Akademie-Verlag.
Böttcher A and Silbermann B (1998) Introduction to Large Truncated Toeplitz Matrices. Berlin: Springer.
Ehrhardt T (2001) A status report on the asymptotic behavior of Toeplitz determinants with Fisher–Hartwig singularities. Operator Theory: Advances and Applications 124: 217–241.
Ehrhardt T and Silbermann B (1997) Toeplitz determinants with one Fisher–Hartwig singularity. Journal of Functional Analysis 148: 229–256.
Fisher ME and Hartwig RE (1968) Toeplitz determinants: some applications, theorems, and conjectures. Advances in Chemical Physics 15: 333–353.
Hughes C, Keating JP, and O’Connell N (2001) On the characteristic polynomial of a random unitary matrix. Communications in Mathematical Physics 220: 429–451.
McCoy BM and Wu TT (1973) The Two-Dimensional Ising Model. Cambridge, MA: Harvard University Press.
Mehta ML (1991) Random Matrices. San Diego: Academic Press.
Montroll EW, Potts RB, and Ward JC (1963) Correlations and spontaneous magnetization of the two-dimensional Ising model. Journal of Mathematical Physics 4(2): 308–322.
Onsager L (1971) The Ising model in two dimensions. In: Mills RE, Ascher E, and Jaffee RI (eds.) Critical Phenomena in Alloys, Magnets, and Superconductors, pp. 3–12. New York: McGraw-Hill.
Szegö G (1915) Ein Grenzwertsatz über die Toeplitzschen Determinanten einer reellen positiven Funktion. Mathematische Annalen 76: 490–503.
Szegö G (1952) On Certain Hermitian Forms Associated with the Fourier Series of a Positive Function, pp. 222–238. Lund: Festschrift Marcel Riesz.
Tracy CA and Widom H (1993) Introduction to random matrices. In: Helminck GF (ed.) Geometric and Quantum Aspects of Integrable Systems (Scheveningen, 1992), Lecture Notes in Physics, vol. 424, pp. 103–130. Berlin: Springer.
Widom H (1994) Random Hermitian matrices and (nonrandom) Toeplitz matrices. In: Basor E and Gohberg I (eds.) Toeplitz Operators and Related Topics (Santa Cruz, CA, 1992), Oper. Theory Adv. Appl., vol. 71, pp. 9–15. Basel: Birkhäuser.

Tomita–Takesaki Modular Theory


S J Summers, University of Florida, Gainesville, FL, USA
© 2006 Elsevier Ltd. All rights reserved.

Basic Structure

The origins of Tomita–Takesaki modular theory lie in two unpublished papers of M Tomita in 1967 and a slim volume by Takesaki (1970). It has developed into one of the most important tools in the theory of operator algebras and has found many applications in mathematical physics.

Although the modular theory has been formulated in a more general setting, it will be presented in the form in which it most often finds application in mathematical physics (for generalizations, details, and further references concerning the material covered in this article, the reader is referred to the Further Reading section). Let $\mathcal{M}$ be a von Neumann algebra on a Hilbert space $\mathcal{H}$ containing a vector $\Omega$ which is cyclic and separating for $\mathcal{M}$. Define the operator $S_0$ on $\mathcal{H}$ as follows:

\[ S_0 A\Omega = A^*\Omega, \quad \text{for all } A \in \mathcal{M} \]

This operator extends to a closed antilinear operator $S$ defined on a dense subset of $\mathcal{H}$. Let $\Delta$ be the unique positive, self-adjoint operator and $J$ the unique antiunitary operator occurring in the polar decomposition

\[ S = J\Delta^{1/2} = \Delta^{-1/2}J \]

$\Delta$ is called the modular operator and $J$ the modular conjugation (or modular involution) associated with the pair $(\mathcal{M}, \Omega)$. Note that $J^2$ is the identity operator and $J = J^*$. Moreover, the spectral calculus may be applied to $\Delta$ so that $\Delta^{it}$ is a unitary operator for each $t \in \mathbb{R}$ and $\{\Delta^{it} \mid t \in \mathbb{R}\}$ forms a strongly continuous unitary group. Let $\mathcal{M}'$ denote the set of all bounded linear operators on $\mathcal{H}$ which commute with all elements of $\mathcal{M}$. The modular theory begins with the following remarkable theorem.

Theorem 1 Let $\mathcal{M}$ be a von Neumann algebra with a cyclic and separating vector $\Omega$. Then $J\Omega = \Omega = \Delta\Omega$, and the following equalities hold:

\[ J\mathcal{M}J = \mathcal{M}' \]

and

\[ \Delta^{it}\mathcal{M}\Delta^{-it} = \mathcal{M}, \quad \text{for all } t \in \mathbb{R} \]

Note that if one defines $F_0 A'\Omega = A'^*\Omega$, for all $A' \in \mathcal{M}'$, and takes its closure $F$, then one has the relations

\[ \Delta = FS, \qquad \Delta^{-1} = SF, \qquad F = J\Delta^{-1/2} \]

Modular Automorphism Group

By Theorem 1, the unitaries $\Delta^{it}$, $t \in \mathbb{R}$, induce a one-parameter automorphism group $\{\sigma_t\}$ of $\mathcal{M}$ by

\[ \sigma_t(A) = \Delta^{it} A \Delta^{-it}, \quad A \in \mathcal{M},\ t \in \mathbb{R} \]

This group is called the modular automorphism group of $\mathcal{M}$ (relative to $\Omega$). Let $\omega$ denote the faithful normal state on $\mathcal{M}$ induced by $\Omega$:

\[ \omega(A) = \frac{1}{\|\Omega\|^2} \langle \Omega, A\Omega \rangle, \quad A \in \mathcal{M} \]

From Theorem 1 it follows that $\omega$ is invariant under $\{\sigma_t\}$, that is, $\omega(\sigma_t(A)) = \omega(A)$ for all $A \in \mathcal{M}$ and $t \in \mathbb{R}$.

The modular automorphism group contains information about both $\mathcal{M}$ and $\omega$. For example, the modular automorphism group is an inner automorphism on $\mathcal{M}$ if and only if $\mathcal{M}$ is semifinite. It is trivial if and only if $\omega$ is a tracial state on $\mathcal{M}$. Indeed, for any $B \in \mathcal{M}$, one has $\sigma_t(B) = B$ for all $t \in \mathbb{R}$ if and only if $\omega(AB) = \omega(BA)$ for all $A \in \mathcal{M}$. Let $\mathcal{M}^{\omega}$ denote the set of all such $B$ in $\mathcal{M}$.

The KMS Condition

The modular automorphism group satisfies a condition which had already been used in mathematical physics to characterize equilibrium temperature states of quantum systems in statistical mechanics and field theory: the Kubo–Martin–Schwinger (KMS) condition. If $\mathcal{M}$ is a von Neumann algebra and $\{\alpha_t \mid t \in \mathbb{R}\}$ is a $\sigma$-weakly continuous one-parameter group of automorphisms of $\mathcal{M}$, then the state $\phi$ on $\mathcal{M}$ satisfies the KMS condition at (inverse temperature) $\beta$ ($0 < \beta < \infty$) with respect to $\{\alpha_t\}$ if for any $A, B \in \mathcal{M}$ there exists a complex function $F_{A,B}(z)$ which is analytic on the strip $\{z \in \mathbb{C} \mid 0 < \operatorname{Im} z < \beta\}$ and continuous on the closure of this strip such that

\[ F_{A,B}(t) = \phi(\alpha_t(A)B) \]

\[ F_{A,B}(t + i\beta) = \phi(B\alpha_t(A)) \]

for all $t \in \mathbb{R}$. In this case, $\phi(\alpha_{i\beta}(A)B) = \phi(BA)$, for all $A$, $B$ in a $\sigma$-weakly dense, $\alpha$-invariant $*$-subalgebra of $\mathcal{M}$. Such KMS states are $\alpha$-invariant, that is, $\phi(\alpha_t(A)) = \phi(A)$, for all $A \in \mathcal{M}$, $t \in \mathbb{R}$, and are stable and passive (cf. Bratteli and Robinson (1981) and Haag (1992)).

Every faithful normal state satisfies the KMS condition at $\beta = 1$ (henceforth called the modular condition) with respect to the corresponding modular automorphism group.

Theorem 2 Let $\mathcal{M}$ be a von Neumann algebra with a cyclic and separating vector $\Omega$. Then the induced state $\omega$ on $\mathcal{M}$ satisfies the modular condition with respect to the modular automorphism group $\{\sigma_t \mid t \in \mathbb{R}\}$ associated to the pair $(\mathcal{M}, \Omega)$.

The modular automorphism group is, therefore, endowed with the analyticity associated with the KMS condition, and this is a powerful tool in many applications of the modular theory to mathematical physics. In addition, the physical properties and interpretations of KMS states are often invoked when applying modular theory to quantum physics.

Note that while the nontriviality of the modular automorphism group gives a measure of the nontracial nature of the state, the KMS condition for the modular automorphism group provides the missing link between the values $\omega(AB)$ and $\omega(BA)$, for all $A, B \in \mathcal{M}$ (hence the use of the term ‘‘modular,’’ as in the theory of integration on locally compact groups).

The modular condition is quite restrictive. Only the modular group can satisfy the modular condition for $(\mathcal{M}, \Omega)$, and the modular group for one state can satisfy the modular condition only in states differing from the original state by the action of an element in the center of $\mathcal{M}$.

Theorem 3 Let $\mathcal{M}$ be a von Neumann algebra with a cyclic and separating vector $\Omega$, and let $\{\sigma_t\}$ be the corresponding modular automorphism group. If the induced state $\omega$ satisfies the modular condition with respect to a group $\{\alpha_t\}$ of automorphisms of $\mathcal{M}$, then $\{\alpha_t\}$ must coincide with $\{\sigma_t\}$. Moreover, a normal state $\phi$ on $\mathcal{M}$ satisfies the modular condition with respect to $\{\sigma_t\}$ if and only if $\phi(\cdot) = \omega(h\,\cdot) = \omega(h^{1/2} \cdot h^{1/2})$ for some unique positive injective operator $h$ affiliated with the center of $\mathcal{M}$.

Hence, if $\mathcal{M}$ is a factor, two distinct states cannot share the same modular automorphism group. The relation between the modular automorphism groups for two different states will be described in more detail.

One Algebra and Two States

Consider a von Neumann algebra $\mathcal{M}$ with two cyclic and separating vectors $\Omega$ and $\Phi$, and denote by $\omega$ and $\phi$, respectively, the induced states on $\mathcal{M}$. Let $\{\sigma^{\omega}_t\}$ and $\{\sigma^{\phi}_t\}$ denote the corresponding modular groups. There is a general relation between the modular automorphism groups of these states.
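For two such states, the modular condition and the uniqueness statement of Theorem 3 can be illustrated numerically. The sketch below rests on assumptions that are not in the article: $\mathcal{M}$ is taken to be the 2×2 complex matrices (a factor), each state is given by a hypothetical diagonal density $d$ through $A \mapsto \sum_i d_i A_{ii}$, and its modular group is taken to be $\sigma_z(A) = d^{iz} A d^{-iz}$, which extends to complex $z$. The function `modular_gap` is an illustrative name; it measures how far a state is from satisfying the modular condition (the KMS condition at $\beta = 1$) with respect to a given group.

```python
# Assumed toy model (not from the article): faithful states on 2x2 matrices
# with diagonal densities P (for omega) and Q (for phi).
P = (0.7, 0.3)
Q = (0.4, 0.6)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def state(d, a):
    # Value of the state with diagonal density d on the matrix a: tr(d a).
    return sum(d[i] * a[i][i] for i in range(2))

def sigma(d, a, z):
    # Modular group of the state with density d, continued to complex z:
    # sigma_z(a) = d^(iz) a d^(-iz), entrywise for diagonal d.
    return [[(d[i] / d[j]) ** (1j * z) * a[i][j] for j in range(2)]
            for i in range(2)]

def modular_gap(d_state, d_group, a, b, t):
    # |state(sigma_{t+i}(a) b) - state(b sigma_t(a))|; this vanishes when the
    # state satisfies the modular condition with respect to the group.
    lhs = state(d_state, matmul(sigma(d_group, a, t + 1j), b))
    rhs = state(d_state, matmul(b, sigma(d_group, a, t)))
    return abs(lhs - rhs)

A = [[1 + 2j, 0.5j], [3.0, -1 + 1j]]
B = [[0.5, 1.0], [0.0, -2.0]]
t = 0.37

gap_omega_own = modular_gap(P, P, A, B, t)   # omega with its own group
gap_phi_own = modular_gap(Q, Q, A, B, t)     # phi with its own group
gap_phi_cross = modular_gap(Q, P, A, B, t)   # phi with omega's group

print(gap_omega_own, gap_phi_own, gap_phi_cross)
```

In this model each state passes the check against its own group, while $\phi$ fails it against the group of $\omega$ because the two densities are not proportional, which is the finite-dimensional shadow of the statement that two distinct states on a factor cannot share a modular automorphism group.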

Theorem 4 There exists a $\sigma$-strongly continuous map $\mathbb{R} \ni t \mapsto U_t \in \mathcal{M}$ such that
(i) $U_t$ is unitary for all $t \in \mathbb{R}$;
(ii) $U_{t+s} = U_t\, \sigma^{\omega}_t(U_s)$ for all $s, t \in \mathbb{R}$; and
(iii) $\sigma^{\phi}_t(A) = U_t\, \sigma^{\omega}_t(A)\, U_t^*$ for all $A \in \mathcal{M}$ and $t \in \mathbb{R}$.

The 1-cocycle $\{U_t\}$ is commonly called the cocycle derivative of $\phi$ with respect to $\omega$ and one writes $U_t = (D\phi : D\omega)_t$. There is a chain rule for this derivative, as well: If $\phi$, $\psi$, and $\rho$ are faithful normal states on $\mathcal{M}$, then $(D\psi : D\phi)_t = (D\psi : D\rho)_t\,(D\rho : D\phi)_t$, for all $t \in \mathbb{R}$. More can be said about the cocycle derivative if the states satisfy any of the conditions in the following theorem.

Theorem 5 The following conditions are equivalent:
(i) $\phi$ is $\{\sigma^{\omega}_t\}$-invariant;
(ii) $\omega$ is $\{\sigma^{\phi}_t\}$-invariant;
(iii) there exists a unique positive injective operator $h$ affiliated with $\mathcal{M}^{\omega} \cap \mathcal{M}^{\phi}$ such that $\omega(\cdot) = \phi(h\,\cdot) = \phi(h^{1/2} \cdot h^{1/2})$;
(iv) there exists a unique positive injective operator $h'$ affiliated with $\mathcal{M}^{\omega} \cap \mathcal{M}^{\phi}$ such that $\phi(\cdot) = \omega(h'\,\cdot) = \omega(h'^{1/2} \cdot h'^{1/2})$;
(v) the norms of the linear functionals $\omega + i\phi$ and $\omega - i\phi$ are equal; and
(vi) $\sigma^{\omega}_t \sigma^{\phi}_s = \sigma^{\phi}_s \sigma^{\omega}_t$, for all $s, t \in \mathbb{R}$.

The conditions in Theorem 5 turn out to be equivalent to the cocycle derivative being a representation.

Theorem 6 The cocycle $\{U_t\}$ intertwining $\{\sigma^{\omega}_t\}$ with $\{\sigma^{\phi}_t\}$ is a group representation of the additive group of reals if and only if $\phi$ and $\omega$ satisfy the conditions in Theorem 5. In that case, $U(t) = h^{-it}$.

The operator $h' = h^{-1}$ in Theorem 5 is called the Radon–Nikodym derivative of $\phi$ with respect to $\omega$ (often denoted by $d\phi/d\omega$), due to the following result, which, if the algebra $\mathcal{M}$ is abelian, is the well-known Radon–Nikodym theorem from measure theory.

Theorem 7 If $\phi$ and $\omega$ are normal positive linear functionals on $\mathcal{M}$ such that $\phi(A) \le \omega(A)$, for all positive elements $A \in \mathcal{M}$, then there exists a unique element $h^{1/2} \in \mathcal{M}$ such that $\phi(\cdot) = \omega(h^{1/2} \cdot h^{1/2})$ and $0 \le h^{1/2} \le 1$.

The analogies with measure theory are not accidental, although these are not discussed in detail here. Indeed, any normal trace on a (finite) von Neumann algebra $\mathcal{M}$ gives rise to a noncommutative integration theory in a natural manner. Modular theory affords an extension of this theory to the setting of faithful normal functionals $\phi$ on von Neumann algebras $\mathcal{M}$ of any type, enabling the definition of noncommutative $L^p$ spaces, $L^p(\mathcal{M}, \phi)$.

Modular Invariants and the Classification of von Neumann Algebras

As already mentioned, the modular structure carries information about the algebra. This is best evidenced in the structure of type III factors. As this theory is rather involved, only a sketch of some of the results can be given.

If $\mathcal{M}$ is a type III algebra, then its crossed product $\mathcal{N} = \mathcal{M} \rtimes_{\sigma^{\omega}} \mathbb{R}$ relative to the modular automorphism group of any faithful normal state $\omega$ on $\mathcal{M}$ is a type II$_\infty$ algebra with a faithful semifinite normal trace $\tau$ such that $\tau \circ \theta_t = e^{-t}\tau$, $t \in \mathbb{R}$, where $\theta$ is the dual of $\sigma^{\omega}$ on $\mathcal{N}$. Moreover, the algebra $\mathcal{M}$ is isomorphic to the crossed product $\mathcal{N} \rtimes_{\theta} \mathbb{R}$, and this decomposition is unique in a very strong sense. This structure theorem entails the existence of important algebraic invariants for $\mathcal{M}$, which has many consequences, one of which is made explicit here.

If $\omega$ is a faithful normal state of a von Neumann algebra $\mathcal{M}$ induced by $\Omega$, let $\Delta_{\omega}$ denote the modular operator associated to $(\mathcal{M}, \Omega)$ and $\operatorname{sp}\Delta_{\omega}$ denote the spectrum of $\Delta_{\omega}$. The intersection

\[ S'(\mathcal{M}) = \bigcap \operatorname{sp}\Delta_{\omega} \]

over all faithful normal states $\omega$ of $\mathcal{M}$ is an algebraic invariant of $\mathcal{M}$.

Theorem 8 Let $\mathcal{M}$ be a factor acting on a separable Hilbert space. If $\mathcal{M}$ is of type III, then $0 \in S'(\mathcal{M})$; otherwise, $S'(\mathcal{M}) = \{0, 1\}$ if $\mathcal{M}$ is of type I$_\infty$ or II$_\infty$ and $S'(\mathcal{M}) = \{1\}$ if not. Let $\mathcal{M}$ now be a factor of type III.
(i) $\mathcal{M}$ is of type III$_\lambda$, $0 < \lambda < 1$, if and only if $S'(\mathcal{M}) = \{0\} \cup \{\lambda^n \mid n \in \mathbb{Z}\}$.
(ii) $\mathcal{M}$ is of type III$_0$ if and only if $S'(\mathcal{M}) = \{0, 1\}$.
(iii) $\mathcal{M}$ is of type III$_1$ if and only if $S'(\mathcal{M}) = [0, \infty)$.

In certain physically relevant situations, the spectra of the modular operators of all faithful normal states coincide, so that Theorem 8 entails that it suffices to compute the spectrum of any conveniently chosen modular operator in order to determine the type of $\mathcal{M}$. In other such situations, there are distinguished states $\omega$ such that $S'(\mathcal{M}) = \operatorname{sp}\Delta_{\omega}$. One such example is provided by asymptotically abelian systems. A von Neumann algebra $\mathcal{M}$ is said to be ‘‘asymptotically abelian’’ if there exists a sequence $\{\alpha_n\}_{n \in \mathbb{N}}$ of automorphisms of $\mathcal{M}$ such that the limit of $\{A\alpha_n(B) - \alpha_n(B)A\}_{n \in \mathbb{N}}$ in

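the strong operator topology is zero, for all $A, B \in \mathcal{M}$. If the state $\omega$ is $\alpha_n$-invariant, for all $n \in \mathbb{N}$, then $\operatorname{sp}\Delta_{\omega}$ is contained in $\operatorname{sp}\Delta_{\phi}$, for all faithful normal states $\phi$ on $\mathcal{M}$, so that $S'(\mathcal{M}) = \operatorname{sp}\Delta_{\omega}$. If, moreover, $\operatorname{sp}\Delta_{\omega} = [0, \infty)$, then $\operatorname{sp}\Delta_{\omega} = \operatorname{sp}\Delta_{\phi}$, for all $\phi$ as described.

Self-Dual Cones

Let $j : \mathcal{M} \to \mathcal{M}'$ denote the antilinear $*$-isomorphism defined by $j(A) = JAJ$, $A \in \mathcal{M}$. The natural positive cone $\mathcal{P}^{\natural}$ associated with the pair $(\mathcal{M}, \Omega)$ is defined as the closure, in $\mathcal{H}$, of the set of vectors

\[ \{ A\,j(A)\,\Omega \mid A \in \mathcal{M} \} \]

Let $\mathcal{M}^+$ denote the set of all positive elements of $\mathcal{M}$. The following theorem collects the main attributes of the natural cone.

Theorem 9
(i) $\mathcal{P}^{\natural}$ coincides with the closure in $\mathcal{H}$ of the set $\{\Delta^{1/4} A\Omega \mid A \in \mathcal{M}^+\}$.
(ii) $\Delta^{it}\mathcal{P}^{\natural} = \mathcal{P}^{\natural}$ for all $t \in \mathbb{R}$.
(iii) $J\Phi = \Phi$ for all $\Phi \in \mathcal{P}^{\natural}$.
(iv) $A\,j(A)\,\mathcal{P}^{\natural} \subset \mathcal{P}^{\natural}$ for all $A \in \mathcal{M}$.
(v) $\mathcal{P}^{\natural}$ is a pointed, self-dual cone whose linear span coincides with $\mathcal{H}$.
(vi) If $\Phi \in \mathcal{P}^{\natural}$, then $\Phi$ is cyclic for $\mathcal{M}$ if and only if $\Phi$ is separating for $\mathcal{M}$.
(vii) If $\Phi \in \mathcal{P}^{\natural}$ is cyclic, and hence separating, for $\mathcal{M}$, then the modular conjugation and the natural cone associated with the pair $(\mathcal{M}, \Phi)$ coincide with $J$ and $\mathcal{P}^{\natural}$, respectively.
(viii) For every normal positive linear functional $\phi$ on $\mathcal{M}$, there exists a unique vector $\Phi_{\phi} \in \mathcal{P}^{\natural}$ such that $\phi(A) = \langle \Phi_{\phi}, A\Phi_{\phi} \rangle$ for all $A \in \mathcal{M}$.

In fact, the algebras $\mathcal{M}$ and $\mathcal{M}'$ are uniquely characterized by the natural cone $\mathcal{P}^{\natural}$ [4]. In light of (viii), if $\alpha$ is an automorphism of $\mathcal{M}$, then

\[ V(\alpha)\,\Phi_{\phi} = \Phi_{\phi \circ \alpha^{-1}} \]

defines an isometric operator on $\mathcal{P}^{\natural}$, which by (v) extends to a unitary operator on $\mathcal{H}$. The map $\alpha \mapsto V(\alpha)$ defines a unitary representation of the group of automorphisms $\operatorname{Aut}(\mathcal{M})$ on $\mathcal{M}$ in such a manner that $V(\alpha) A V(\alpha)^{-1} = \alpha(A)$ for all $A \in \mathcal{M}$ and $\alpha \in \operatorname{Aut}(\mathcal{M})$. Indeed, one has the following:

Theorem 10 Let $\mathcal{M}$ be a von Neumann algebra with a cyclic and separating vector $\Omega$. The group $\mathcal{V}$ of all unitaries $V$ satisfying

\[ V\mathcal{M}V^* = \mathcal{M}, \qquad VJV^* = J, \qquad V\mathcal{P}^{\natural} = \mathcal{P}^{\natural} \]

is isomorphic to $\operatorname{Aut}(\mathcal{M})$ under the above map $\alpha \mapsto V(\alpha)$, which is called the ‘‘standard implementation’’ of $\operatorname{Aut}(\mathcal{M})$.

Often of particular physical interest are (anti-)automorphisms of $\mathcal{M}$ leaving $\omega$ invariant. They can only be implemented by (anti)unitaries which leave the pair $(\mathcal{M}, \Omega)$ invariant. In fact, if $U$ is a unitary or antiunitary operator satisfying $U\Omega = \Omega$ and $U\mathcal{M}U^* = \mathcal{M}$, then $U$ commutes with both $J$ and $\Delta$.

Two Algebras and One State

Motivated by applications to quantum field theory, the study of the modular structures associated with one state and more than one von Neumann algebra has begun (see Borchers (2000) for references and details). Let $\mathcal{N} \subset \mathcal{M}$ be von Neumann algebras with a common cyclic and separating vector $\Omega$, and $\Delta_{\mathcal{N}}$, $J_{\mathcal{N}}$ and $\Delta_{\mathcal{M}}$, $J_{\mathcal{M}}$ denote the corresponding modular objects. The structure $(\mathcal{M}, \mathcal{N}, \Omega)$ is called a $\pm$-half-sided modular inclusion if $\Delta^{it}_{\mathcal{M}}\, \mathcal{N}\, \Delta^{-it}_{\mathcal{M}} \subset \mathcal{N}$, for all $\pm t \ge 0$.

Theorem 11 Let $\mathcal{M}$ be a von Neumann algebra with cyclic and separating vector $\Omega$. The following are equivalent:
(i) There exists a proper subalgebra $\mathcal{N} \subset \mathcal{M}$ such that $(\mathcal{M}, \mathcal{N}, \Omega)$ is a $\mp$-half-sided modular inclusion.
(ii) There exists a unitary group $\{U(t)\}$ with positive generator such that

\[ U(t)\mathcal{M}U(t)^{-1} \subset \mathcal{M}, \ \text{for all } \pm t \ge 0; \qquad U(t)\Omega = \Omega, \ \text{for all } t \in \mathbb{R} \]

Moreover, if these conditions are satisfied, then the following relations must hold:

\[ \Delta^{it}_{\mathcal{M}} U(s) \Delta^{-it}_{\mathcal{M}} = \Delta^{it}_{\mathcal{N}} U(s) \Delta^{-it}_{\mathcal{N}} = U(e^{\mp 2\pi t} s) \]

and

\[ J_{\mathcal{M}} U(s) J_{\mathcal{M}} = J_{\mathcal{N}} U(s) J_{\mathcal{N}} = U(-s) \]

for all $s, t \in \mathbb{R}$. In addition, $\mathcal{N} = U(\pm 1)\mathcal{M}U(\pm 1)^{-1}$, and if $\mathcal{M}$ is a factor, it must be type III$_1$.

The richness of this structure is further suggested by the next theorem.

Theorem 12
(i) Let $(\mathcal{M}, \mathcal{N}_1, \Omega)$ and $(\mathcal{M}, \mathcal{N}_2, \Omega)$ be $-$-half-sided, resp. $+$-half-sided, modular inclusions satisfying the condition $J_{\mathcal{N}_1} J_{\mathcal{N}_2} = J_{\mathcal{M}} J_{\mathcal{N}_2} J_{\mathcal{N}_1} J_{\mathcal{M}}$. Then the modular unitaries $\Delta^{it}_{\mathcal{M}}$, $\Delta^{is}_{\mathcal{N}_1}$, $\Delta^{iu}_{\mathcal{N}_2}$, $s, t, u \in \mathbb{R}$, generate a faithful continuous unitary representation of the identity component of the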

group of isometries of two-dimensional Min- equilibrium state at inverse temperature , with all
kowski space. the consequences which both of these facts have.
(ii) Let M, N , N \ M be von Neumann algebras But it has become increasingly clear that the
with a common cyclic and separating vector . If modular objects it , J, of certain algebras of
(M, M \ N , ) and (N , M \ N , ) are -half- observables and states encode additional physical
sided, resp. þ-half-sided, modular inclusions such information. In 1975, it was discovered that if one
that JN MJN = M, then the modular unitaries considers the algebras of observables associated with
itM , isN , iu
N \M , s, t, u 2 R, generate a faithful a finite-component quantum field theory satisfying
continuous unitary representation of SL the Wightman axioms, then the modular objects
(2, R)=Z2 . associated with the vacuum state and algebras of
observables localized in certain wedge-shaped
This has led to a further useful notion. If N  M
regions in Minkowski space have geometric content.
and  is cyclic for N \ M, then (M, N , ) is said to
In fact, the unitary group {it } implements the group
be a ‘‘-modular intersection’’ if both (M, M \ N , )
of Lorentz boosts leaving the wedge region invariant
and (N , M \ N , ) are -half-sided modular inclu-
(this property is now called modular covariance),
sions and
and the modular involution J implements the space-
 
time reflection about the edge of the wedge, along
JN lim itN it
M JN ¼ lim M N
it it
with a charge conjugation. This discovery caused
t!
1 t!
1
some intense research activity (see Baumgartel and
where the existence of the strong operator limits is Wollenberg 1992, Borchers 2000, Haag 1992).
assured by the preceding assumptions. An example
of the utility of this structure is the following Positive Energy
theorem.
In quantum physics the time development of the
Theorem 13 Let $\mathcal{N}$, $\mathcal{M}$, $\mathcal{L}$ be von Neumann algebras with a common cyclic and separating vector $\Omega$. If $(\mathcal{M}, \mathcal{N}, \Omega)$ and $(\mathcal{N}', \mathcal{L}, \Omega)$ are $-$-modular intersections and $(\mathcal{M}, \mathcal{L}, \Omega)$ is a $+$-modular intersection, then the unitaries $\Delta_{\mathcal{M}}^{it}$, $\Delta_{\mathcal{N}}^{is}$, $\Delta_{\mathcal{L}}^{iu}$, $s, t, u \in \mathbb{R}$, generate a faithful continuous unitary representation of $SO^{\uparrow}(1, 2)$.
These results and their extensions to larger numbers of algebras were developed for application in algebraic quantum field theory, but one may anticipate that half-sided modular inclusions will find wider use. Modular theory has also been applied fruitfully in the theory of inclusions $\mathcal{N} \subset \mathcal{M}$ of properly infinite algebras with finite or infinite index.

Applications in Quantum Theory
The Tomita–Takesaki theory has found many applications in quantum field theory and quantum statistical mechanics. As mentioned earlier, the modular automorphism group satisfies the KMS condition, a property of physical significance in the quantum theory of many-particle systems, which includes quantum statistical mechanics and quantum field theory. In such settings, for a suitable algebra of observables $\mathcal{M}$ and state $\omega$, an automorphism group $\{\sigma_t\}$ representing the time evolution of the system satisfies the modular condition. Hence, on the one hand, $\{\sigma_t\}$ is the modular automorphism group of the pair $(\mathcal{M}, \Omega)$, and, on the other, $\omega$ is an

system is often represented by a strongly continuous group $\{U(t) = e^{itH} \mid t \in \mathbb{R}\}$ of unitary operators, and the generator $H$ is interpreted as the total energy of the system. There is a link between modular structure and positive energy, which has found many applications in quantum field theory. This result was crucial in the development of Theorem 11 and was motivated by the 1975 discovery mentioned above, now commonly called the Bisognano–Wichmann theorem.
Theorem 14 Let $\mathcal{M}$ be a von Neumann algebra with a cyclic and separating vector $\Omega$, and let $\{U(t)\}$ be a continuous unitary group satisfying $U(t)\mathcal{M}U(-t) \subset \mathcal{M}$, for all $t \geq 0$. Then any two of the following conditions imply the third:
(i) $U(t) = e^{itH}$, with $H \geq 0$;
(ii) $U(t)\Omega = \Omega$, for all $t \in \mathbb{R}$; and
(iii) $\Delta^{it} U(s) \Delta^{-it} = U(e^{-2\pi t} s)$ and $JU(s)J = U(-s)$, for all $s, t \in \mathbb{R}$.

Modular Nuclearity and Phase Space Properties
Modular theory can be used to express physically meaningful properties of quantum "phase spaces" by a condition of compactness or nuclearity of certain maps. In its initial form, the condition was formulated in terms of the Hamiltonian, the global energy operator of theories in Minkowski space. The above indications that the modular operators carry information about the energy of the system were reinforced when it was shown that a

formulation in terms of modular operators was essentially equivalent.
Let $\mathcal{O}_1 \subset \mathcal{O}_2$ be nonempty bounded open subregions of Minkowski space with corresponding algebras of observables $\mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)$ in a vacuum representation with vacuum vector $\Omega$, and let $\Delta$ be the modular operator associated with $(\mathcal{A}(\mathcal{O}_2), \Omega)$ (by the Reeh–Schlieder theorem, $\Omega$ is cyclic and separating for $\mathcal{A}(\mathcal{O}_2)$). For each $\lambda \in (0, 1/2)$ define the mapping $\Xi_\lambda : \mathcal{A}(\mathcal{O}_1) \to \mathcal{H}$ by $\Xi_\lambda(A) = \Delta^{\lambda} A \Omega$. The compactness of any one of these mappings implies the compactness of all of the others. Moreover, the $l^p$ (nuclear) norms of these mappings are interrelated and provide a measure of the number of local degrees of freedom of the system. Suitable conditions on the maps in terms of these norms entail the strong statistical independence condition called the split property. Conversely, the split property implies the compactness of all of these maps. Moreover, the existence of equilibrium temperature states on the global algebra of observables can be derived from suitable conditions on these norms in the vacuum sector.
The conceptual advantage of the modular compactness and nuclearity conditions compared to their original Hamiltonian form lies in the fact that they are meaningful also for quantum systems in curved spacetimes, where global energy operators (i.e., generators corresponding to global timelike Killing vector fields) need not exist.

Modular Position and Quantum Field Theory
The characterization of the relative "geometric" position of algebras based on the notions of modular inclusion and modular intersection was directly motivated by the Bisognano–Wichmann theorem. Observable algebras associated with suitably chosen wedge regions in Minkowski space provided examples whose essential structure could be abstracted for more general application, resulting in the notions presented in the preceding sections.
Theorem 12(ii) has been used to construct, from two algebras and the indicated half-sided modular inclusions, a conformal quantum field theory on the circle (compactified light ray) with positive energy. Since the chiral part of a conformal quantum field model in two spacetime dimensions naturally yields such half-sided modular inclusions, studying the inclusions in Theorem 12(ii) is equivalent to studying such field theories. Theorems 12(i) and 13 and their generalizations to inclusions involving up to six algebras have been employed to construct Poincaré-covariant nets of observable algebras (the algebraic form of quantum field theories) satisfying the spectrum condition on $(d + 1)$-dimensional Minkowski space for $d = 1, 2, 3$. Conversely, such quantum field theories naturally yield such systems of algebras.
This intimate relation would seem to open up the possibility of constructing interacting quantum field theories from a limited number of modular inclusions/intersections.

Geometric Modular Action
The fact that the modular objects in quantum field theory associated with wedge-shaped regions and the vacuum state in Minkowski space have geometric significance ("geometric modular action") was originally discovered in the framework of the Wightman axioms. As an algebraic quantum field theory (AQFT) does not rely on the concept of Wightman fields, it was natural to ask (i) when does geometric modular action hold in AQFT and (ii) which physically relevant consequences follow from this feature?
There are two approaches to the study of geometric modular action. In the first, attention is focused on modular covariance, expressed in terms of the modular groups associated with wedge algebras and the vacuum state in Minkowski space. Modular covariance has been proven to obtain in conformally invariant AQFT, in any massive theory satisfying asymptotic completeness, and also in the presence of other, physically natural assumptions. To mention only three of its consequences, both the spin–statistics theorem and the PCT theorem, as well as the existence of a continuous unitary representation of the Poincaré group acting covariantly upon the observable algebras and satisfying the spectrum condition follow from modular covariance.
In a second approach to geometric modular action, the modular involutions are the primary focus. Here, no a priori connection between the modular objects and isometries of the spacetime is assumed. The central assumption, given the state vector $\Omega$ and the von Neumann algebras of localized observables $\{\mathcal{A}(\mathcal{O})\}$ on the spacetime, is that there exists a family $\mathcal{W}$ of subsets of the spacetime such that $J_{W_1} \mathcal{R}(W_2) J_{W_1} \in \{\mathcal{R}(W) \mid W \in \mathcal{W}\}$, for every $W_1, W_2 \in \mathcal{W}$. This condition makes no explicit appeal to isometries or other special attributes and is thus applicable, in principle, to quantum field theories on general curved spacetimes.
It has been shown for certain spacetimes, including Minkowski space, that under certain additional technical assumptions, the modular involutions encode enough information to determine the dynamics of the theory, the isometry group of the spacetime, and a continuous unitary representation of the isometry group which acts covariantly upon the observables and leaves the state invariant. In certain

cases including Minkowski space, it is even possible to derive the spacetime itself from the group $\mathcal{J}$ generated by the modular involutions $\{J_W \mid W \in \mathcal{W}\}$.
The modular unitaries $\Delta_W^{it}$ enter in this approach through a condition which is designed to assure the stability of the theory, namely that $\Delta_W^{it} \in \mathcal{J}$, for all $t \in \mathbb{R}$ and $W \in \mathcal{W}$. In Minkowski space, this additional condition entails that the derived representation of the Poincaré group satisfies the spectrum condition.

Further Applications
As previously observed, through the close connection to the KMS condition, modular theory enters naturally into the equilibrium thermodynamics of many-body systems. But in recent work on the theory of nonequilibrium thermodynamics it also plays a role in making mathematical sense of the notion of quantum systems in local thermodynamic equilibrium. Modular theory has also proved to be of utility in recent developments in the theory of superselection rules and their attendant sectors, charges and charge-carrying fields.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Quantum Central-Limit Theorems; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Thermal Quantum Field Theory; Positive Maps on C*-Algebras; Two-Dimensional Models; von Neumann Algebras: Introduction, Modular Theory, and Classification Theory.

Further Reading
Baumgärtel H and Wollenberg M (1992) Causal Nets of Operator Algebras. Berlin: Akademie-Verlag.
Borchers HJ (2000) On revolutionizing quantum field theory with Tomita's modular theory. Journal of Mathematical Physics 41: 3604–3673.
Bratteli O and Robinson DW (1981) Operator Algebras and Quantum Statistical Mechanics II. Berlin: Springer.
Connes A (1974) Caractérisation des algèbres de von Neumann comme espaces vectoriels ordonnés. Annales de l'Institut Fourier 24: 121–155.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Kadison RV and Ringrose JR (1986) Fundamentals of the Theory of Operator Algebras, vol. II. Orlando: Academic Press.
Pedersen GK (1979) C*-Algebras and Their Automorphism Groups. New York: Academic Press.
Stratila S (1981) Modular Theory in Operator Algebras. Tunbridge Wells: Abacus Press.
Takesaki M (1970) Tomita's Theory of Modular Hilbert Algebras and Its Applications, Lecture Notes in Mathematics, vol. 128. Berlin: Springer.
Takesaki M (2003) Theory of Operator Algebras II. Berlin: Springer.

Topological Defects and Their Homotopy Classification


T W B Kibble, Imperial College, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Symmetry-breaking phase transitions occur in a wide variety of systems, from condensed matter to the early universe. One of the common features of such transitions is the appearance, in the broken-symmetry phase, of topological defects, trapped regions in which the symmetry is restored, or at least changed. Examples are vortices in superfluids, domain walls in ferromagnets, and disclination lines in liquid crystals. Often these defects are stable for topological reasons, and play an important role in the dynamics of the system. An astonishingly rich variety of defects can be found in various systems. They can usefully be classified using the tools of homotopy theory.

Spontaneous Symmetry Breaking
Let us consider a quantum-mechanical system with a symmetry group $G$. This means that each $g \in G$ is represented on the Hilbert space of quantum states by a unitary operator $\hat{U}(g)$, which commutes with the Hamiltonian. Spontaneous symmetry breaking occurs if this symmetry is not shared by the ground state or vacuum state $|0\rangle$ of the system. In other words, for some $g \in G$, $\hat{U}(g)|0\rangle \neq |0\rangle$. Then the ground state is necessarily degenerate: $\hat{U}(g)|0\rangle$ must have the same energy as $|0\rangle$.
Spontaneous symmetry breaking is usually describable in terms of an order-parameter field, which vanishes above the transition and is nonzero below it. We can find a scalar field $\hat{\phi}(r)$, or multiplet of fields $\hat{\phi} = (\hat{\phi}_i, i = 1, \ldots, n)$ transforming according to some representation $D$ of $G$ (assumed not to contain the trivial representation), whose expectation value in the ground state is nonzero:
$$\langle 0|\hat{\phi}(r)|0\rangle = \phi_0 \neq 0 \qquad [1]$$
This is the order parameter. Since
$$\langle 0|\hat{U}^{\dagger}(g)\,\hat{\phi}(r)\,\hat{U}(g)|0\rangle = D(g)\phi_0 \qquad [2]$$
it follows that the only elements of $G$ that can be symmetries of the ground state are those in the

stability subgroup $H$ of $\phi_0$ (the group of unbroken symmetries in this ground state):
$$H = \{g \in G : D(g)\phi_0 = \phi_0\} \qquad [3]$$
In terms of this subgroup, we can find a useful characterization of the manifold $\mathcal{M}$ of degenerate ground states. As noted above, for each $g \in G$, $\hat{U}(g)|0\rangle$ is also a ground state. However, these are not all distinct, because clearly $\hat{U}(gh)|0\rangle = \hat{U}(g)|0\rangle$ for all $h \in H$. Hence, the distinct ground states are in one-to-one correspondence with the left cosets $gH$ of $H$ in $G$, and $\mathcal{M}$ may be identified with the quotient space $G/H$, the space of left cosets.
For example, suppose $G$ is the rotation group $SO(3)$, and $\hat{\boldsymbol{\phi}}$ belongs to the three-dimensional vector representation. If $\boldsymbol{\phi} \neq 0$ in the ground state, we may choose $\boldsymbol{\phi}_0 = (0, 0, v)$. Then, clearly, $H = SO(2)$, the group of rotations about the $z$-axis, and $\mathcal{M} = SO(3)/SO(2) = S^2$, the 2-sphere. It is useful to think of $\mathcal{M}$ as the subset of the order-parameter space comprising the possible expectation values $\boldsymbol{\phi} = \langle\hat{\boldsymbol{\phi}}\rangle$ for the various degenerate ground states. For example, in this case, $\mathcal{M} = \{\boldsymbol{\phi} : \boldsymbol{\phi}^2 = v^2\}$.

Defect Formation
It is often possible to characterize the dynamics at finite temperature in terms of a function of the order parameter, the effective potential $V(\phi)$, which is necessarily invariant under $G$, and whose minima define the equilibrium states. At low temperatures, it has a form like $V = (\boldsymbol{\phi}^2 - v^2)^2$, whose minima occur at nonzero values of $\boldsymbol{\phi}$. But above the critical temperature $T_c$, the only minimum is at $\boldsymbol{\phi} = 0$, so the equilibrium state is symmetric under $G$. In the high-temperature phase, there may be large fluctuations in $\hat{\boldsymbol{\phi}}$, but its mean value will be zero.
Now, when the system is cooled through the phase transition, $\hat{\phi}$ will acquire a nonzero expectation value, gradually approaching one of the degenerate ground states characterized by a point of $\mathcal{M}$. But the choice of which one is unpredictable; the symmetry breaking is spontaneous. Moreover, in a large system, there is no reason why the same choice should be made everywhere. For example, a ferromagnet cooling through its Curie point may acquire a spontaneous magnetization in different directions in different parts of the sample.
Of course, there is an energetic penalty to having a spatially varying order parameter, so it will tend to become more uniform as the temperature is lowered. But the question arises whether there may be any topological obstruction to this process. It can happen that if we choose points on $\mathcal{M}$ in a continuous manner everywhere around the periphery of some region, it is topologically impossible to complete the process throughout its interior. Continuity may require that there are points where $\phi$ leaves the surface $\mathcal{M}$. For example, if our ferromagnet has two opposite possible directions of easy magnetization, described by $\boldsymbol{\phi}_0$ and $-\boldsymbol{\phi}_0$, then $\mathcal{M}$ consists essentially of these two points. Regions where $\boldsymbol{\phi} \approx \boldsymbol{\phi}_0$ and where $\boldsymbol{\phi} \approx -\boldsymbol{\phi}_0$ must be separated by domain walls across which $\boldsymbol{\phi}$ varies smoothly from one to the other.

Homotopy Groups
To classify the various possible types of defect, we need to consider the homotopy groups of the manifold $\mathcal{M}$ of degenerate ground states. In this section, we briefly review the necessary definitions.
A path in $\mathcal{M}$ is a map $\phi : I \to \mathcal{M}$ from the unit interval $I = [0, 1] \subset \mathbb{R}$. We choose a base point $m_0 \in \mathcal{M}$ (which may be identified with $\phi_0$), and consider loops in $\mathcal{M}$, paths such that $\phi(0) = \phi(1) = m_0$. We say that two loops are homotopic, and write $\phi \sim \psi$, if one can be continuously deformed into the other within $\mathcal{M}$, that is, if there exists a map $\chi : I^2 \to \mathcal{M}$ such that
$$\chi(0, t) = \phi(t) \quad\text{and}\quad \chi(1, t) = \psi(t) \qquad [4]$$
for all $t$, and
$$\chi(s, 0) = \chi(s, 1) = m_0 \qquad [5]$$
for all $s$. This is an equivalence relation. The set $\pi_1(\mathcal{M})$ is the set of equivalence classes $[\phi]$ of loops under this relation.
On the set of loops, we may define a product $\phi\cdot\psi$, comprising the loop $\phi$ followed by $\psi$ (see Figure 1). Explicitly,
$$(\phi\cdot\psi)(t) = \begin{cases} \phi(2t), & 0 \leq t \leq \tfrac{1}{2} \\ \psi(2t - 1), & \tfrac{1}{2} < t \leq 1 \end{cases} \qquad [6]$$

Figure 1 The product of loops.

It is easy to show that if $\phi \sim \phi'$ and $\psi \sim \psi'$, then $\phi\cdot\psi \sim \phi'\cdot\psi'$. Hence, this defines a product on $\pi_1(\mathcal{M})$, by $[\phi][\psi] = [\phi\cdot\psi]$. So equipped, $\pi_1(\mathcal{M})$ becomes the

fundamental group or first homotopy group of $\mathcal{M}$. Note that the identity is the equivalence class $[\phi_0]$ of the trivial loop with $\phi_0(t) \equiv m_0$, while the inverse is $[\phi]^{-1} = [\tilde{\phi}]$, where the map $\tilde{\phi}$ is the reverse of $\phi$: $\tilde{\phi}(t) = \phi(1 - t)$.
Strictly speaking, we should write $\pi_1(\mathcal{M}, m_0)$ in place of $\pi_1(\mathcal{M})$. However, for any path-connected space, the groups $\pi_1(\mathcal{M}, m_0)$ and $\pi_1(\mathcal{M}, m_0')$ are always isomorphic, and, more importantly, the same is true for any coset space $\mathcal{M} = G/H$, where $G$ is a Lie group and $H$ a closed subgroup. For a general manifold $\mathcal{M}$, $\pi_1(\mathcal{M})$ is not necessarily abelian, but it is so if $\mathcal{M}$ is a Lie group, or more generally a Riemannian symmetric space. The space $\mathcal{M}$ is said to be simply connected if $\pi_1(\mathcal{M}) = 0$, the group comprising only the identity element, $0 = \{[\phi_0]\}$. (Although $\pi_1(\mathcal{M})$ is not always abelian, it is conventional for homotopy groups to use an additive notation and represent the trivial group by 0 rather than 1.)
The $n$th homotopy group $\pi_n(\mathcal{M})$ may be defined similarly, as a set of equivalence classes of maps $\phi : I^n \to \mathcal{M}$ such that $\phi$ maps the entire boundary $\partial I^n$ to the base point $m_0$. Two such maps are homotopic ($\phi \sim \psi$) if there exists a map $\chi : I^{n+1} \to \mathcal{M}$ such that
$$\chi(0, t) = \phi(t) \quad\text{and}\quad \chi(1, t) = \psi(t) \qquad [7]$$
for all $t = (t_1, \ldots, t_n)$, and, for each $s \in I$, $\chi(s, t) = m_0$ for all $t \in \partial I^n$. The product $\phi\cdot\psi$ is defined by
$$(\phi\cdot\psi)(t_1, \ldots, t_n) = \begin{cases} \phi(2t_1, t_2, \ldots, t_n), & 0 \leq t_1 \leq \tfrac{1}{2} \\ \psi(2t_1 - 1, t_2, \ldots, t_n), & \tfrac{1}{2} < t_1 \leq 1 \end{cases} \qquad [8]$$
The choice of $t_1$ rather than any other $t_j$ is arbitrary; all choices yield homotopic product maps. The product again defines a product on $\pi_n(\mathcal{M})$, which thereby becomes a group, the $n$th homotopy group. One new feature is that, for all $n > 1$, $\pi_n(\mathcal{M})$ is always abelian.
Note that since the entire boundary of $I^n$ is mapped to a single point, it is possible to collapse it, and talk instead about maps from the $n$-sphere $S^n$ to $\mathcal{M}$, taking one designated point to $m_0$. The fact that $\pi_n(\mathcal{M})$ is nontrivial indicates the existence in $\mathcal{M}$ of closed $n$-surfaces that cannot be smoothly shrunk to a point. In particular, it is worth noting that, for any $n$, $\pi_n(S^n) = \mathbb{Z}$, the additive group of integers, while $\pi_m(S^n) = 0$ for all $m < n$.
A special case is $n = 0$. Here, $S^0$ comprises two points only, and since one of them is always mapped to $m_0$, we really have to consider maps from a single point to $\mathcal{M}$, that is, points in $\mathcal{M}$. Two points are homotopic if they can be joined by a path in $\mathcal{M}$. Thus, $\pi_0(\mathcal{M})$ may be identified with the set of path-connected components of $\mathcal{M}$. Note, however, that in general no product can be defined on $\pi_0(\mathcal{M})$, so $\pi_0(\mathcal{M})$ should be called the zeroth homotopy set (not group). There is an important exception, however: if $G$ is a Lie group, and $G_0$ its connected subgroup (the subset of elements joined by paths to the identity $e$), then $\pi_0(G)$ may be identified with the quotient group $G/G_0$. Note, however, that this group $\pi_0(G) = G/G_0$ is not necessarily abelian.

Classification of Defects
We now turn to the classification of defects by means of homotopy groups. It will be useful to start with simple specific examples in three-dimensional space, $\mathbb{R}^3$.
First, suppose again that $\boldsymbol{\phi}$ belongs to the vector representation of $G = SO(3)$. Then $\mathcal{M} = SO(3)/SO(2) = S^2$ may be identified with the sphere $\mathcal{M} = \{\boldsymbol{\phi} : \boldsymbol{\phi}^2 = v^2\}$ in $\phi$ space. Consider a closed surface $S$, an embedding of a 2-sphere $S^2$ in $\mathbb{R}^3$. Assume that everywhere on $S$ the field $\boldsymbol{\phi}(r)$ has one of the ground-state values. In other words, we have a map $\boldsymbol{\phi} : S \to \mathcal{M}$, from one 2-sphere to another. The map $\boldsymbol{\phi}$ can be extended to a map from the interior of $S$ to $\mathcal{M}$ only if it belongs to the trivial homotopy class $[\phi_0] \in \pi_2(\mathcal{M})$, where $\phi_0 : I^2 \to \mathcal{M} : (t_1, t_2) \mapsto m_0 = eH$. In all other cases, there must be at least one point where $\boldsymbol{\phi}(r) = 0$; this is a point defect. The second homotopy group in this case is $\pi_2(S^2) = \mathbb{Z}$, so the possible point defects, or monopoles, are labeled by an integer $n \in \mathbb{Z}$, the winding number. (An example of a map with winding number $n$ is (in spherical polars) $(r, \theta, \varphi) \mapsto (v, \theta, n\varphi)$.)
More generally, point defects in $\mathbb{R}^d$ are classified by $\pi_{d-1}(\mathcal{M})$. A map $\phi$ from a closed $(d - 1)$-dimensional surface $S \subset \mathbb{R}^d$ to $\mathcal{M}$ can be extended to the interior of $S$ if and only if it belongs to the trivial homotopy class $[\phi_0] \in \pi_{d-1}(\mathcal{M})$. If this is not the case, there must be at least one point around which $\phi(r)$ leaves the surface $\mathcal{M}$, although in general it is not required to vanish anywhere.
Second, take the case where $\phi$ is a single complex field, and $G$ is the phase symmetry group $U(1)$. In this case, $H$ is the subgroup $1 = \{1\} \subset G$. Thus, $\mathcal{M} = U(1)/1 = S^1$; this manifold may be identified with the circle $\{\phi : |\phi| = v\}$ in the order-parameter space. Now consider a closed loop $C$ in space, an embedding of $S^1$ in $\mathbb{R}^3$ (see Figure 2). Suppose that on $C$, $\phi(r)$ takes one of the ground-state values, say $\phi(r) = v \exp[i\alpha(r)]$. If $S$ is some surface with boundary $C$, then the map $\phi : C \to \mathcal{M}$ can be extended to a map $\phi : S \to \mathcal{M}$ if and only if it belongs to the trivial homotopy class $[\phi_0] \in \pi_1(\mathcal{M})$. If it does not, then there must be at least one point on $S$ within $C$ where $\phi = 0$. Moreover, this

must be true of every surface $S$ spanning $C$, so there must be a curve passing through $C$ along which $\phi = 0$. This is a linear defect, a string or vortex line. In this case, the first homotopy group is $\pi_1(S^1) = \mathbb{Z}$, so we see that the possible linear defects are classified by an integer, the winding number $n$. An example of a map with winding number $n$ is $\varphi \mapsto v e^{ni\varphi}$.

Figure 2 A linear defect.

Again, this result can easily be generalized. Linear defects in $\mathbb{R}^d$ are classified by $\pi_{d-2}(\mathcal{M})$. If, on a $(d - 2)$-dimensional surface $C$, $\phi(r)$ takes values in $\mathcal{M}$, and if it does not belong to the trivial homotopy class, there must be a linear defect threading through $C$, around which $\phi$ leaves the surface $\mathcal{M}$, although again it need not necessarily vanish.
More generally yet, in the $d$-dimensional space $\mathbb{R}^d$, defects of dimension $p$ are classified by the homotopy group $\pi_{d-p-1}(\mathcal{M})$. For example, in three dimensions, planar defects (domain walls) are classified by $\pi_0(\mathcal{M})$.

The Exact Sequence
There are mathematical theorems that greatly facilitate the computation of the homotopy groups of homogeneous spaces, of the form $\mathcal{M} = G/H$.
We begin with the maps relating these spaces to each other. There is a canonical injective homomorphism $i : H \to G : h \mapsto h$, and a canonical projection associating each element of $G$ with its coset: $p : G \to \mathcal{M} : g \mapsto gH$. Moreover, it is clear that the image of $i$, namely the subgroup $H$, is also the kernel of $p$, the inverse image $p^{-1}m_0$ of the distinguished element $m_0 = eH$ of $\mathcal{M}$. These statements can be summarized by saying that
$$1 \to H \xrightarrow{\;i\;} G \xrightarrow{\;p\;} \mathcal{M} \to 1$$
is an exact sequence: the image of each map is the kernel of the following one (see Figure 3).

Figure 3 An exact sequence.

Next, we note that since any closed loops (or $n$-surfaces) in $H$ belonging to the same homotopy class are also homotopic as loops (or $n$-surfaces) in $G$, there is an induced homomorphism $i_* : \pi_n(H) \to \pi_n(G)$. Similarly, homotopic loops or $n$-surfaces in $G$ project to homotopic loops or $n$-surfaces in $\mathcal{M}$, so there is an induced homomorphism $p_* : \pi_n(G) \to \pi_n(\mathcal{M})$. Moreover, it is easy to see that although $i_*$ is not necessarily injective and $p_*$ not necessarily surjective, it is true that the image of $i_*$ is the kernel of $p_*$. For example, any loop in $G$ will be mapped to a homotopically trivial loop in $\mathcal{M}$ if and only if it is homotopic to the image of a loop in $H$.

Figure 4 Lift of a loop.

In addition, there is a boundary map that relates homotopy groups of different dimension: $\partial_* : \pi_{n+1}(\mathcal{M}) \to \pi_n(H)$. To see this, it is useful to think of $G$ as a fiber bundle with base space $\mathcal{M}$ and fiber $H$. Now consider a map $\phi : (I^{n+1}, \partial I^{n+1}) \to (\mathcal{M}, m_0)$. Since $p$ is a projection, $\phi$ can always be lifted to a map $\hat{\phi} : (I^{n+1}, \partial I^{n+1}) \to (G, H)$, that is, we can find a (nonunique) map $\hat{\phi}$ such that $\phi = p \circ \hat{\phi}$ (see Figure 4). However, $\hat{\phi}$ does not necessarily map the boundary to a single point; what is true is that $\hat{\phi}$ must map the boundary to a subset of $H$, and since topologically $\partial I^{n+1} \simeq S^n$, this defines a map $\tilde{\phi} : S^n \to H$. If we allow $\phi$ to vary over some homotopy class of maps, and $\hat{\phi}$ to vary continuously, then $\tilde{\phi}$ will

also remain in one homotopy class. Thus, we have defined a map $\partial_* : \pi_{n+1}(\mathcal{M}) \to \pi_n(H) : [\phi] \mapsto [\tilde{\phi}]$.
It is also easy to see that the image of $\partial_* : \pi_{n+1}(\mathcal{M}) \to \pi_n(H)$ is the kernel of $i_* : \pi_n(H) \to \pi_n(G)$, because the $n$-surface in $H$ defined by $\tilde{\phi}$ is necessarily homotopically trivial in $G$. Similarly, one can see that the image of $p_* : \pi_{n+1}(G) \to \pi_{n+1}(\mathcal{M})$ is the kernel of $\partial_* : \pi_{n+1}(\mathcal{M}) \to \pi_n(H)$.
Putting all these results together, we see that there is a (semi-infinite) exact sequence connecting all the homotopy groups:
$$\begin{aligned}
\cdots &\xrightarrow{p_*} \pi_{n+1}(\mathcal{M}) \xrightarrow{\partial_*} \pi_n(H) \xrightarrow{i_*} \pi_n(G) \xrightarrow{p_*} \pi_n(\mathcal{M}) \\
&\xrightarrow{\partial_*} \pi_{n-1}(H) \to \cdots \xrightarrow{i_*} \pi_1(G) \xrightarrow{p_*} \pi_1(\mathcal{M}) \xrightarrow{\partial_*} \pi_0(H) \\
&\xrightarrow{i_*} \pi_0(G) \xrightarrow{p_*} \pi_0(\mathcal{M})
\end{aligned} \qquad [9]$$
This sequence makes it easy to compute most of the low-dimensional homotopy groups of $\mathcal{M}$. Let us begin with $\pi_0(\mathcal{M})$, which merely labels its disconnected components. As noted earlier, for the Lie group $G$, $\pi_0(G)$ is the quotient group $\pi_0(G) = G/G_0$, where $G_0$ is the connected subgroup of $G$. Now the image of $\pi_0(H)$ under $i_*$ is clearly the set of connected components of $G$ that contain elements of $H$, so if $G$ has $m$ connected components, and $n$ of them contain elements of $H$, then $\pi_0(\mathcal{M})$ has $m/n$ elements (see Figure 5).
Next, we note that, for all the higher homotopy groups, disconnected pieces are irrelevant. Since a loop, for example, starting at $m_0$ must remain within its connected component $\mathcal{M}_0 \subset \mathcal{M}$, it follows that $\pi_1(\mathcal{M}) = \pi_1(\mathcal{M}_0)$, and similarly $\pi_n(\mathcal{M}) = \pi_n(\mathcal{M}_0)$ for all $n > 1$. So one can ignore any disconnected parts of the symmetry group $G$, and assume from now on that $\pi_0(G) = 0$. Moreover, it is always possible to replace $G$ by its simply connected covering group, replacing $SO(3)$, for example, by $SU(2)$. Thus, we may also assume that $\pi_1(G) = 0$. Then the section of the exact sequence in the second line of [9] becomes
$$0 \xrightarrow{p_*} \pi_1(\mathcal{M}) \xrightarrow{\partial_*} \pi_0(H) \xrightarrow{i_*} 0$$
which implies that the two groups in the center are isomorphic:
$$\pi_1(\mathcal{M}) = \pi_0(H) \qquad [10]$$
For example, if the symmetry group $G = SO(3)$ is completely broken, so that $H = 1$, then replacing $G$ by $\tilde{G} = SU(2)$ requires replacing $H$ by $\tilde{H} = \{+1, -1\} \simeq \mathbb{Z}_2$, hence also $\pi_1(\mathcal{M}) = \pi_0(\tilde{H}) = \mathbb{Z}_2$; there is only one nontrivial class of linear defects in this model.
To find $\pi_2(\mathcal{M})$, we need a standard theorem about Lie groups, namely that the second homotopy group of any Lie group is trivial: for any $G$, $\pi_2(G) = 0$. (No details of the proof are given here. It derives from the fact that a generic element $g \in G$ belongs to a unique one-parameter subgroup $\{\exp(tX), t \in \mathbb{R}\} \subset G$, where $X$ is an element of the Lie algebra of $G$. Thus, all the points on a surface in $G$ may be joined by these paths to the identity, and the surface may then be shrunk along the resulting cone. There are exceptional elements for which this is not true, but it can be shown that in a $d$-dimensional group they lie on $(d - 3)$-dimensional surfaces, so any 2-surface can be smoothly deformed to avoid them.)
It follows from this theorem that another section of the exact sequence is
$$0 \xrightarrow{p_*} \pi_2(\mathcal{M}) \xrightarrow{\partial_*} \pi_1(H) \xrightarrow{i_*} 0$$
which again implies an isomorphism:
$$\pi_2(\mathcal{M}) = \pi_1(H) \qquad [11]$$
For example, if $G = SO(3)$ and $H = SO(2)$, or equivalently $\tilde{G} = SU(2)$ and $\tilde{H} = U(1)$ (a double cover of the $SO(2)$), then $\pi_2(\mathcal{M}) = \pi_1(\tilde{H}) = \mathbb{Z}$, so point defects in this theory are labeled by an integer winding number.
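As a worked check of this last example, the relevant stretch of the exact sequence can be written out with the groups filled in. This is an illustrative sketch, not part of the original derivation. It assumes the covering-group description $\tilde{G} = SU(2)$, $\tilde{H} = U(1)$, the standard facts $\pi_2(SU(2)) = \pi_1(SU(2)) = 0$ (since $SU(2)$ is topologically $S^3$), $\pi_1(U(1)) = \mathbb{Z}$ and $\pi_0(U(1)) = 0$, and the labels $p_*$, $\partial_*$, $i_*$ for the induced maps and the boundary map.

```latex
% Segment of the long exact sequence for G~ = SU(2), H~ = U(1), M = S^2
\[
\underbrace{\pi_2(SU(2))}_{0}
 \xrightarrow{\;p_*\;} \pi_2(\mathcal{M})
 \xrightarrow{\;\partial_*\;} \underbrace{\pi_1(U(1))}_{\mathbb{Z}}
 \xrightarrow{\;i_*\;} \underbrace{\pi_1(SU(2))}_{0}
 \xrightarrow{\;p_*\;} \pi_1(\mathcal{M})
 \xrightarrow{\;\partial_*\;} \underbrace{\pi_0(U(1))}_{0}
\]
% Exactness at pi_2(M): ker(boundary) = im(p_*) = 0, so the boundary map is injective.
% Exactness at pi_1(U(1)): im(boundary) = ker(i_*) = Z, so the boundary map is surjective.
% Hence pi_2(M) is isomorphic to Z.
% Exactness at pi_1(M): ker(boundary) = pi_1(M) = im(p_*) = 0, so pi_1(M) = 0.
```

Both conclusions agree with $\mathcal{M} = S^2$: the 2-sphere is simply connected and has $\pi_2(S^2) = \mathbb{Z}$.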
Figure 5 The disconnected components of $G$ are shaded, those of $H$ are cross-hatched. Here $\pi_0(\mathcal{M})$ has two elements.

Examples
The simplest continuous symmetry is the $U(1)$ phase symmetry $\hat{\phi} \mapsto \hat{\phi} e^{i\alpha}$ of a complex field. In a weakly interacting Bose gas, below the Bose–Einstein condensation temperature, or in superfluid helium-4, a macroscopic fraction of the atoms occupies a single quantum state, and $\hat{\phi}$ acquires a nonzero expectation value, $\langle\hat{\phi}\rangle = \phi$, whose phase is arbitrary, so the symmetry is completely broken to $H = 1$. Thus, $\mathcal{M} = S^1$; we have a circle of equivalent degenerate ground states. (This corresponds to spontaneous breaking of the particle-number symmetry. It is possible to describe the system in a $U(1)$-invariant way, by projecting out a state of definite particle number, a uniform superposition of all the states in $\mathcal{M}$, but it is generally less convenient to do so.) In this case, the only nontrivial homotopy group is $\pi_1(\mathcal{M}) = \mathbb{Z}$, so the only defects are linear defects classified by a winding number $n \in \mathbb{Z}$. The defects with $n = \pm 1$ are stable vortices. Those with $|n| > 1$ are in general unstable and tend to break up into $|n|$ single-quantum vortices.

Figure 6 Orientation of molecules around a disclination line.
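The integer winding number that labels these vortices can be estimated numerically from samples of the complex order parameter taken in order around a closed loop. The following is a minimal sketch, assuming the field is nonzero on the loop and sampled finely enough that the phase changes by less than $\pi$ between neighbours. The function names `winding_number` and `vortex_loop` are hypothetical helpers introduced for illustration only.

```python
import cmath
import math


def winding_number(samples):
    """Estimate the winding number of a complex field sampled around a closed loop.

    `samples` lists the field values in order around the loop; the last
    sample is implicitly joined back to the first.
    """
    if not samples or any(z == 0 for z in samples):
        raise ValueError("field must be sampled and nonzero everywhere on the loop")
    total = 0.0
    count = len(samples)
    for k in range(count):
        here, there = samples[k], samples[(k + 1) % count]
        # Phase increment between neighbouring samples, wrapped into [-pi, pi].
        total += cmath.phase(there / here)
    # The accumulated phase is 2*pi times an integer for a closed loop.
    return round(total / (2 * math.pi))


def vortex_loop(n, v=1.0, count=400):
    """Sample the model field v * exp(i * n * angle) on a circle (assumed test field)."""
    return [v * cmath.exp(1j * n * 2 * math.pi * k / count) for k in range(count)]


if __name__ == "__main__":
    print(winding_number(vortex_loop(2)))  # prints 2
```

Because only phase differences enter, smoothly rescaling the amplitude along the loop (without passing through zero) leaves the result unchanged, which reflects the statement that the winding number depends only on the homotopy class of the map from the loop to $\mathcal{M} = S^1$.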
Low-temperature superconductors also have a $U(1)$ symmetry, although there are important differences. This is not a global symmetry but a local, gauge symmetry, with coupling to the electromagnetic field. Moreover, it is not single atoms that condense but Cooper pairs, pairs of electrons of equal and opposite momentum and spin. These systems too exhibit linear defects, magnetic flux tubes carrying a magnetic flux $4\pi n\hbar/e$.
A less trivial example is a nematic liquid crystal. These materials are composed of rod-shaped molecules that tend, at low temperatures, to line up parallel to one another. The nematic state is characterized by a preferred orientation, described by a unit vector $\mathbf{n}$, the director. (Note that $\mathbf{n}$ and $-\mathbf{n}$ are physically equivalent.) There is long-range orientational order, with molecules preferentially lining up parallel to $\mathbf{n}$, but unlike a solid crystal there is no long-range translational order: the molecules move freely past each other as in a normal liquid.
A convenient order parameter here is the mean mass quadrupole tensor $\Phi$ of a molecule. In the nematic state, $\Phi$ is proportional to $(3\mathbf{n}\mathbf{n} - \mathbf{1})$; for example, if $\mathbf{n} = (0, 0, 1)$, then $\Phi$ is diagonal with diagonal elements proportional to $(-1, -1, 2)$. In this case, the symmetry group is $SO(3)$ (or, more precisely, $O(3)$; but the inversion symmetry is not broken, so we can restrict our attention to the connected part of the group). The subgroup $H$ that leaves this $\Phi$ invariant is a semidirect product, $H = SO(2) \rtimes \mathbb{Z}_2$ (isomorphic to $O(2)$), composed of rotations about the $z$-axis and rotations through $\pi$ about axes in the $x$–$y$ plane. (If we enlarge $G$ to its simply connected covering group $\tilde{G} = SU(2)$, then $H$ becomes $\tilde{H} = [U(1) \rtimes \mathbb{Z}_4]/\mathbb{Z}_2$, where $U(1)$ is generated as before by $J_z$. The essential difference is that the square of any of the elements in the disconnected piece of $\tilde{H}$ is not now the identity but the element $e^{2\pi i J_z} = -1 \in U(1)$.) The manifold $\mathcal{M}$ of degenerate ground states in this case is the projective space $RP^2$ (obtained by identifying opposite points of $S^2$).
Since $\tilde{H}$ has disconnected pieces, we have $\pi_1(\mathcal{M}) = \pi_0(\tilde{H}) = \mathbb{Z}_2$. Thus, there can be topologically stable linear defects, here called disclination lines, around which the director $\mathbf{n}$ rotates by $\pi$ (see Figure 6). The fact that these defects are classified by $\mathbb{Z}_2$ rather than $\mathbb{Z}$ means that a line around which $\mathbf{n}$ rotates by $2\pi$ is topologically trivial; indeed, $\mathbf{n}$ can be smoothly rotated near the line to run parallel to it, leaving a configuration with no defect.
There are also point defects; since $\pi_2(\mathcal{M}) = \pi_1(\tilde{H}) = \mathbb{Z}$, they are labeled by an integer winding number $n$. In a defect with $n = 1$, the vector $\mathbf{n}$ points radially outwards all round the defect position.

Helium-3
Finally, let us turn to helium-3, which becomes a superfluid at a temperature of a few millikelvin and is one of the most fascinating and complex examples of spontaneous symmetry breaking. Unlike helium-4, this is, of course, a Fermi liquid, so it is not the atoms that condense, but bound pairs of atoms, analogous to Cooper pairs. In this case, however, the most attractive channel is not the $^1S$, but the $^3P$, so the pairs have both orbital and spin angular momentum, $L = S = 1$. Therefore, the order parameter is not a single complex scalar field but a $3 \times 3$ complex matrix $\Phi_{jk}$, where the two indices label the orbital and spin angular momentum states.
To a good approximation, the system is invariant under separate rotations of $\mathbf{L}$ and $\mathbf{S}$ (the effects of the small spin–orbit coupling will be discussed later), so the symmetry group is
$$G = U(1)_Y \times SO(3)_{\mathbf{L}} \times SO(3)_{\mathbf{S}} \qquad [12]$$
where the subscripts denote the generators and $U(1)_Y$ represents multiplication by an overall phase factor, $e^{i\alpha Y} : \Phi_{jk} \mapsto \Phi_{jk} e^{i\alpha}$. This complicated symmetry allows much scope for a large variety of defects. There are, in fact, two distinct superfluid phases, A and B, with different symmetries (and indeed in the presence of a magnetic field there is a third, A1).
In the $^3$He-A phase, the order parameter has the form $\Phi_{jk} \propto (m_j + i n_j) d_k$, where $\mathbf{m}$, $\mathbf{n}$, $\mathbf{d}$ are unit vectors, with $\mathbf{m} \perp \mathbf{n}$; if we set $\mathbf{l} = \mathbf{m} \wedge \mathbf{n}$, then

$l$ defines the orbital angular momentum state by $l \cdot L = 1$, while $d$ defines the spin quantization axis, such that $d \cdot S = 0$. The manifold $M_A$ for this phase is
\[ M_A = [SO(3) \times S^2]/Z_2 \tag{13} \]
where the $Z_2$ is present because $(m, n, d)$ and $(-m, -n, -d)$ represent the same state. If, for example, we take $l$ and $d$ in the $z$-direction, the unbroken symmetry subgroup is
\[ H_A = SO(2)_{L_z + Y} \times [SO(2)_{S_z} \ltimes Z_2] \tag{14} \]
where the nontrivial element of $Z_2$ may be taken to be $e^{i\pi(S_x + L_z)}$. The covering group of $G$ is, of course,
\[ \tilde G = R_Y \times SU(2)_L \times SU(2)_S \tag{15} \]
Correspondingly,
\[ \tilde H_A = R_{L_z + Y} \times [U(1)_{S_z} \ltimes Z_4] \tag{16} \]
It follows that the homotopy groups are
\[ \pi_0(M_A) = 0, \quad \pi_1(M_A) = Z_4, \quad \pi_2(M_A) = Z \tag{17} \]
There are linear defects labeled by a mod-4 quantum number and point defects labeled by an integer.
For the $^3$He-B phase, by contrast, the order parameter is of the form
\[ \Phi_{jk} \propto R_{jk} e^{i\theta} \tag{18} \]
where $R$ is a rotation matrix, $R \in SO(3)$. Here then,
\[ M_B = SO(3) \times S^1 \tag{19} \]
with homotopy groups
\[ \pi_0(M_B) = 0, \quad \pi_1(M_B) = Z_2 \times Z, \quad \pi_2(M_B) = 0 \tag{20} \]
In this phase, there are two distinct types of linear defect, the mass vortices with an integer label, and the spin vortices with a mod-2 label. (One can also have a "spin–mass vortex" carrying both quantum numbers.)

Composite Defects
There are several cases, including in particular helium-3, that exhibit symmetry breaking with multiple length or energy scales. For example, there may be two order parameters, say $\phi$, $\psi$, with $|\phi| \gg |\psi|$. If $|\psi|$ is negligible, the symmetry $G$ is broken by $\phi$ to $H$, and the manifold of degenerate ground states is $M = G/H$. However, these states are not all exactly degenerate: $\psi$ breaks the symmetry further to $K \subset H$, so the precisely degenerate ground states form a submanifold $M' = G/K$.
The case of helium-3 is slightly different. Here it is the small spin–orbit coupling, arising from long-range dipole–dipole interactions, that introduces the second scale. Its effect is only significant over large distances.
In the $^3$He-A phase, at short range the $l$ and $d$ vectors are uncorrelated but, over large distances, they tend to be aligned parallel or antiparallel. We can use the $Z_2$ symmetry mentioned earlier to choose $l = d$. Hence, the manifold $M'_A$ of true ground states is only a submanifold of $M_A$, namely $M'_A = SO(3)$, whose homotopy groups are
\[ \pi_0(M'_A) = 0, \quad \pi_1(M'_A) = Z_2, \quad \pi_2(M'_A) = 0 \tag{21} \]
Because of different behavior on different scales, "composite" defects can arise. For example, because $\pi_2(M_A) = Z$, there are short-range monopole configurations. For the $n = 1$ monopole, we have a configuration with uniform $l$, and with $d$ pointing outwards from the center. But eventually the misalignment of $d$ with $l$ is energetically disfavored, and at large distances $d$ tends to rotate to align with $l$ except around one particular direction where it is oppositely aligned (see Figure 7). We have a composite defect: a small monopole coupled to a relatively fat string.
To see how the small- and large-scale structures fit together, one has to look also at the relative homotopy groups $\pi_n(M, M')$, whose elements are homotopy classes of maps from $I^n$ to $M$ such that one face of the boundary is mapped into $M'$, and the remainder to the chosen base point $m_0$. For example, $\pi_1(M, M')$ classifies paths that terminate at $m_0$ while beginning at any point of $M'$. There is, in fact, a long exact sequence, similar to [9], relating these homotopy groups, of which a typical segment is
\[ \cdots \xrightarrow{\;\partial\;} \pi_n(M') \xrightarrow{\;i\;} \pi_n(M) \xrightarrow{\;p\;} \pi_n(M, M') \xrightarrow{\;\partial\;} \pi_{n-1}(M') \xrightarrow{\;i\;} \cdots \tag{22} \]

Figure 7 Cross-section of a short-range monopole attached to a fat string.

The relevant groups in the present case are
\[ \pi_1(M_A, M'_A) = Z_2, \quad \pi_2(M_A, M'_A) = Z \tag{23} \]
Because $\pi_1(M_A) = Z_4$, there are three distinct classes of linear defects at small scales, but only those with quantum number $n = 2$ (mod 4) survive unchanged to large scales; they correspond to the nontrivial element of $\pi_1(M'_A) = Z_2$. On the other hand, the homotopy classes $n = \pm 1$ (mod 4) are mapped to nontrivial elements of $\pi_1(M_A, M'_A) = Z_2$, which indicates that the corresponding linear defects are coupled at long range to fat domain walls, across which $d$ rotates through $\pi$ with a compensating rotation through $\pi$ about $l$. Similarly, the nontrivial elements of $\pi_2(M_A) = Z$ are mapped to nontrivial elements of $\pi_2(M_A, M'_A)$, confirming that these short-range monopoles are coupled to fat strings, as in Figure 7.
For $^3$He-B, the effect of the spin–orbit coupling is to make the most energetically favorable configurations those in which the rotation matrix $R$ in [18] represents a rotation about an arbitrary axis $n$ through the Leggett angle $\theta_L = \arccos(-1/4) \approx 104^\circ$: $R = \exp(i\theta_L\, n \cdot J)$. Consequently,
\[ M'_B = S^2 \times S^1 \tag{24} \]
and so
\[ \pi_0(M'_B) = 0, \quad \pi_1(M'_B) = Z, \quad \pi_2(M'_B) = Z \tag{25} \]
The relative homotopy groups are
\[ \pi_1(M_B, M'_B) = Z_2, \quad \pi_2(M_B, M'_B) = 0 \tag{26} \]
Here the mass vortex persists at long range, but the configuration around the spin vortex deforms so that they become attached to fat domain walls. The "monopole" configurations corresponding to nontrivial elements of $\pi_2(M'_B)$ have no short-range singularity at all.

See also: Abelian Higgs Vortices; Leray–Schauder Theory and Mapping Degree; Liquid Crystals; Phase Transition Dynamics; Quantum Field Theory: A Brief Introduction; Quantum Fields with Topological Defects; Solitons and Other Extended Field Configurations; String Topology: Homotopy and Geometric Perspectives; Symmetries and Conservation Laws; Symmetry Breaking in Field Theory; Variational Techniques for Ginzburg–Landau Energies.

Further Reading
Helgason S (2001) Differential Geometry, Lie Groups and Symmetric Spaces. Providence, RI: American Mathematical Society.
Hu S-T (1959) Homotopy Theory. New York: Academic Press.
Kibble TWB (1976) Topology of cosmic domains and strings. Journal of Physics A: Mathematical and General 9: 1387–1398.
Kibble TWB (2000) Classification of topological defects and their relevance to cosmology and elsewhere. In: Bunkov YM and Godfrin H (eds.) Topological Defects and the Non-Equilibrium Dynamics of Symmetry Breaking Phase Transitions, NATO Science Series C: Mathematical and Physical Sciences. vol. 249, pp. 7–31. Dordrecht: Kluwer Academic Publishers.
Shellard EPS and Vilenkin A (1994) Cosmic Strings and Other Topological Defects. Cambridge: Cambridge University Press.
Tilley DR and Tilley J (1990) Superfluidity and Superconductivity, 3rd edn. Bristol: IoP Publishing.
Toulouse G and Kléman M (1976) Principles of a classification of defects in ordered media. Journal de Physique Lettres 37: 149.
Vollhardt D and Wölfle P (1990) The Superfluid Phases of Helium 3. London: Taylor and Francis.
Volovik GE (1992) Exotic Properties of Superfluid 3He. Singapore: World Scientific.
Volovik GE and Mineev VP (1977) Investigation of singularities in superfluid 3He and in liquid crystals by the homotopic topology methods. Zhurnal Eksperimentalnoi i Teoreticheskoi Fiziki 72: 2256–2274 (Soviet Physics–JETP 24: 1186–1196).

Topological Gravity, Two-Dimensional


T Eguchi, University of Tokyo, Tokyo, Japan
© 2006 Elsevier Ltd. All rights reserved.

Introduction
It is well known that large-$N$ Hermitian matrix models generate Feynman diagrams which represent the triangulation of Riemann surfaces. For instance, if we consider the integral of an $N \times N$ Hermitian matrix $H$
\[ Z = \int dH \, \exp\left[ -N \left( \tfrac{1}{2} \operatorname{tr} H^2 + \tfrac{\lambda}{4} \operatorname{tr} H^4 \right) \right] \tag{1} \]
we find that the free energy $F = \log Z$ has the $1/N$ expansion
\[ F = \sum_{g=0}^{\infty} N^{2-2g} F_g(\lambda) \tag{2} \]
Inspection of the Feynman diagrams shows that $F_g$ reproduces the sum over the triangulations of genus $g$ Riemann surfaces. The theory [1] is obviously well defined for $\lambda \ge 0$. In the large-$N$ expansion, the theory continues to exist also at negative values of $\lambda$ down to the critical point $\lambda_c = -1/12$.
The double scaling limit of large-$N$ matrix models (Brézin and Kazakov 1990, Douglas and

Shenker 1990, Gross and Migdal 1990) is given by adjusting the coupling $\lambda$ to $\lambda_c$ and at the same time taking the limit $N \to \infty$. In this limit, contributions of all genera survive, and the theory describes the dynamics of fluctuating surfaces of arbitrary topologies. Results obtained in this way do not, in fact, depend on the detailed choice of the potential ($\phi^4$ type in [1]) and have a high degree of universality. The double scaling limit thus provides an interesting model of two-dimensional (2D) quantum gravity.
Soon after the discovery of the double scaling limit of matrix models, Witten observed that the correlation functions of the 2D gravity theory may be given a geometrical interpretation as topological invariants of the moduli space of Riemann surfaces $\mathcal{M}$, and that the 2D gravity theory may be reformulated as a topological field theory (Witten 1990). This reformulation of the results of the 2D gravity theory is called "2D topological gravity."
In fact, 2D gravity theories come in a family parametrized by a pair of integers $(p, q)$. The double scaling limit of [1] gives the simplest example ($p = 2$, $q = 1$). Models with a chain of $p - 1$ Hermitian matrices give the $(p, q)$ 2D gravity theories. The label $q$ stands for the order of criticality of the model, and higher values of $q$ are achieved by fine-tuning the parameters of the potential. At $q = 1$, 2D gravity theories possess a topological interpretation. The most basic case ($p = 2$, $q = 1$) is called pure topological gravity, and in theories at higher values of $p$, topological gravity is coupled to a matter system, that is, topological minimal models. Topological minimal models are obtained by twisting $N = 2$ superconformal field theories.
Let us first consider the case of pure gravity ($p = 2$, $q = 1$). Let $O_n$ denote the observables in the theory and $t_n$ the coupling constants to these operators. The correlation functions of topological gravity are given by
\[ \langle O_{n_1} O_{n_2} \cdots O_{n_s} \rangle_g, \quad n_i = 1, 2, \ldots \tag{3} \]
where $\langle \cdots \rangle_g$ denotes the expectation value on a surface with $g$ handles. The precise significance of eqn [3] as the intersection number on the moduli space is discussed below. The string partition function $\tau(t)$ is defined as the generating function of all possible correlation functions
\[ \tau(t) = \exp \sum_{g=0}^{\infty} \Big\langle \exp\Big( \sum_n t_n O_n \Big) \Big\rangle_g \tag{4} \]
The most striking aspect of topological gravity is the connection of the intersection theory on $\mathcal{M}$ to the theory of completely integrable systems, that is, the Korteweg–de Vries (KdV) and KP hierarchies. Witten conjectured that the generating function $\tau(t)$ of intersection numbers on moduli space is the $\tau$-function of the KdV hierarchy. The KdV hierarchy is obtained by generalizing the well-known KdV equation
\[ \frac{\partial u}{\partial t} = \frac{3}{2} u \frac{\partial u}{\partial x} + \frac{1}{4} \frac{\partial^3 u}{\partial x^3} \tag{5} \]
Identification of the KdV equation with topological gravity is given by $u = 2\langle O_1 O_1 \rangle$, $x = t_1$, $t = t_3$. Witten's conjecture was verified by Kontsevich (1991) by an explicit construction of a new type of matrix model which generates the triangulation of the moduli space of Riemann surfaces.
In the general case of $(p, 1)$ topological gravity, the partition function of the theory obeys the equations of the $p$th generalized KdV hierarchy ($p$ reduction of the KP hierarchy).

Intersection Theory
We now present some basic features of intersection theory on the moduli space of Riemann surfaces. It is known that 2D oriented surfaces $\Sigma$ with $g$ handles and $s$ marked points $x_i$ ($i = 1, \ldots, s$) possess a finite-dimensional family of inequivalent complex structures (complex structures are identified when they differ only by diffeomorphism). The space of inequivalent complex structures is called the moduli space $\mathcal{M}_{g,s}$ of the Riemann surface $\Sigma$. Its dimension is given by
\[ \dim \mathcal{M}_{g,s} = 3g - 3 + s \tag{6} \]
For a mathematically rigorous treatment, we have to consider a compactification $\bar{\mathcal{M}}_{g,s}$ of the moduli space $\mathcal{M}_{g,s}$ by adding suitable boundary components which arise due to various types of degenerations of Riemann surfaces. In the Deligne–Mumford or stable compactification, one considers the following three classes of singular Riemann surfaces $\Sigma$:
1. Two points, $x_i$ and $x_j$, on $\Sigma$ come close together. In this case, an extra 2-sphere is pinched off from the surface by forming a thin neck. The sphere contains points $x_i$ and $x_j$ and also the point $x_l$ at the end of the neck (see Figure 1a). Since the original surface now has one point less and the 2-sphere with three points has no moduli, the degenerate surface has $3g - 4 + s$ parameters and forms a boundary divisor of the moduli space $\bar{\mathcal{M}}_{g,s}$.
2. If a cycle of nontrivial homology class shrinks to a point, we have a surface with one less genus and two extra marked points. The singular surface has $3(g - 1) - 3 + s + 2$ moduli and this is again a complex codimension-1 component (see Figure 1b).

3. Similarly, if a dividing cycle pinches, one obtains two disconnected surfaces of genus $g_i$ with $s_i + 1$ marked points ($i = 1, 2$; $g_1 + g_2 = g$, $s_1 + s_2 = s$). This type of degeneration also has the same number of parameters $\sum_i (3g_i - 3) + \sum_i (s_i + 1) = 3g - 4 + s$ (see Figure 1c).

Figure 1 Degenerate Riemann surface obtained when (a) the points $x_i$ and $x_j$ coincide; (b) a nontrivial cycle collapses, two new points $x_i$ and $x_j$ are created; (c) a pinching cycle collapses, two new points $x_i$ and $x_j$ are created.

It is known that $\bar{\mathcal{M}}_{g,s}$ is a compact and smooth orbifold space, and observables of topological gravity are given by the cohomology classes on $\bar{\mathcal{M}}_{g,s}$. There exist special cohomology classes introduced by Mumford and Morita, which are defined as follows: There are natural line bundles $L_1, \ldots, L_s$ on the moduli space $\bar{\mathcal{M}}_{g,s}$. The fiber of the bundle $L_i$ at a point $\Sigma \in \bar{\mathcal{M}}_{g,s}$ is the cotangent space $T^*_{x_i}\Sigma$ to the point $x_i$ on the surface $\Sigma$. These line bundles have the first Chern classes $c_1(L_i)$ and by taking their exterior power we can define $2n$-dimensional classes
\[ \sigma_n(i) = c_1(L_i)^n \in H^{2n}(\bar{\mathcal{M}}_{g,s}) \tag{7} \]
Correlation functions are defined by integrating these classes over the moduli space:
\[ \langle \sigma_{n_1} \cdots \sigma_{n_s} \rangle_g = \int_{\bar{\mathcal{M}}_{g,s}} c_1(L_1)^{n_1} \wedge \cdots \wedge c_1(L_s)^{n_s} \tag{8} \]
These integrals are topological invariants of $\bar{\mathcal{M}}_{g,s}$ and are nonzero only when the degree of the cohomology classes adds up to the dimension of the moduli space
\[ \sum_{i=1}^{s} n_i = 3g - 3 + s \tag{9} \]
$\sigma_n(i)$ is known as the $n$th descendant of the puncture operator $\sigma_0(i)$, since it is associated with the marked point $x_i$.
The above correlation functions are evaluated using various recursion relations. First, one has the puncture equation
\[ \langle \sigma_0 \sigma_{n_1} \cdots \sigma_{n_s} \rangle_g = \sum_{i=1;\, n_i \neq 0}^{s} \langle \sigma_{n_1} \cdots \sigma_{n_i - 1} \cdots \sigma_{n_s} \rangle_g \tag{10} \]
which can be derived by considering a map $\pi: \bar{\mathcal{M}}_{g,s+1} \to \bar{\mathcal{M}}_{g,s}$ where one forgets the position of an extra point. Contributions arise when the forgotten point coincides with the other points. This relation can be used to eliminate $\sigma_0$'s from correlation functions when they are well defined. At $g = 0$, correlation functions with fewer than three insertions are ill-defined and one has
\[ \langle \sigma_0 \sigma_0 \sigma_0 \rangle_0 = 1 \tag{11} \]
Another basic relation is the dilaton equation for the operator $\sigma_1$:
\[ \langle \sigma_1 \sigma_{n_1} \cdots \sigma_{n_s} \rangle_g = (2g - 2 + s) \langle \sigma_{n_1} \cdots \sigma_{n_s} \rangle_g \tag{12} \]
The dilaton equation follows from the fact that since $\sigma_1$ is the first Chern class $c_1(L)$, it calculates the degree of the canonical line bundle of a genus $g$ surface with $s$ punctures. At $g = 1$, one insertion is required and one has
\[ \langle \sigma_1 \rangle_1 = \frac{1}{24} \tag{13} \]
By combining these recursion relations, one can evaluate the correlation functions. For instance, at $g = 0$ one finds
\[ \langle \sigma_{n_1} \cdots \sigma_{n_s} \rangle_0 = \frac{(n_1 + \cdots + n_s)!}{n_1! \cdots n_s!} \tag{14} \]
A powerful way of computing correlation functions is given by the KdV hierarchies and Virasoro conditions as discussed below. In the context of integrable systems, it is convenient to redefine the observables as
\[ O_{2n+1} = (2n + 1)!! \, \sigma_n, \quad n \ge 0 \tag{15} \]
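As a cross-check of the genus-zero formula [14], the puncture equation [10] together with the normalization [11] can be iterated directly. The following Python sketch does this under the selection rule [9] at $g = 0$. The function names `genus0` and `closed_form` are illustrative choices, not notation from the article.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def _genus0(ns):
    """Genus-zero correlator for a sorted tuple of descendant indices."""
    s = len(ns)
    # Selection rule [9] at g = 0: the indices must sum to s - 3.
    if s < 3 or sum(ns) != s - 3:
        return 0
    if ns == (0, 0, 0):
        return 1  # normalization [11]
    # The selection rule forces at least three zeros, so ns[0] == 0.
    # Remove one puncture operator and apply the puncture equation [10].
    rest = ns[1:]
    total = 0
    for i, n in enumerate(rest):
        if n > 0:
            lowered = rest[:i] + (n - 1,) + rest[i + 1:]
            total += _genus0(tuple(sorted(lowered)))
    return total


def genus0(indices):
    """Correlator of the given descendant indices at genus zero."""
    return _genus0(tuple(sorted(indices)))


def closed_form(indices):
    """Right-hand side of [14]; meaningful only when rule [9] holds."""
    denom = 1
    for n in indices:
        denom *= factorial(n)
    return factorial(sum(indices)) // denom


if __name__ == "__main__":
    for ns in [(0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 1, 1), (0, 0, 0, 0, 1, 2)]:
        print(ns, genus0(ns), closed_form(ns))
```

For index sets that satisfy the selection rule, the recursion and the multinomial expression agree in the cases printed above. The sketch does not cover higher genus, where the KdV or Virasoro machinery is needed.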

Topological Minimal Models
Standard intersection theory applies to the case of pure topological gravity, $p = 2$. At higher values of $p$, the theory is generalized as follows: one introduces the coupling of topological gravity to the topological matter sector which is obtained by twisting the $N = 2$ superconformal theories.
We recall that $N = 2$ superconformal symmetry is generated by the stress tensor $T(z)$, the $U(1)$ current $J(z)$, and two types of supersymmetry generators $G^{\pm}(z)$. (In the holomorphic sector of the theory these operators depend on the holomorphic coordinate $z$ of the Riemann surface. In the antiholomorphic sector they depend on the antiholomorphic variable $\bar z$.) The mode expansion of the stress tensor and $U(1)$ current is given by
\[ T(z) = \sum_n L_n z^{-n-2}, \quad J(z) = \sum_n J_n z^{-n-1} \tag{16} \]
$L_n$ generates the Virasoro algebra
\[ [L_m, L_n] = (m - n) L_{m+n} + \frac{c}{12} m (m^2 - 1) \delta_{m+n,0} \tag{17} \]
where $c$ denotes the central charge of the theory. Commutators of $J_n$ and $L_n$ are given by
\[ [J_m, J_n] = \frac{c}{3} m \, \delta_{m+n,0}, \quad [L_m, J_n] = -n J_{m+n} \tag{18} \]
It is known that there is a continuum of unitary $N = 2$ conformal theories in the range $c \ge 3$; however, only discrete values of the central charge $c = 3k/(k + 2)$, $k = 1, 2, \ldots$ are allowed in the region $1 \le c < 3$. These are the $N = 2$ minimal models labeled by the level $k$. Only a finite number of primary fields exist in these theories.
In $N = 2$ theory, primary fields $\phi$ are characterized by their conformal dimension and $U(1)$ charge:
\[ L_0 |\phi\rangle = h |\phi\rangle, \quad J_0 |\phi\rangle = q |\phi\rangle \tag{19} \]
There exists a special set of primary operators, chiral primary fields $\phi_\ell$ ($\ell = 0, \ldots, k$), which are annihilated by the supercharge operator $G^+$:
\[ \oint dz \, G^+(z) |\phi_\ell\rangle = 0 \tag{20} \]
$\phi_\ell$ has the dimension and $U(1)$ charge
\[ q(\phi_\ell) = \frac{\ell}{k + 2}, \quad h(\phi_\ell) = \tfrac{1}{2} q(\phi_\ell), \quad \ell = 0, 1, \ldots, k \tag{21} \]
By considering primary fields annihilated by $G^-$, we can also define antichiral fields. Antichiral fields have $U(1)$ charge opposite to those of chiral fields.
If one defines the twisted stress tensor by
\[ T'(z) = T(z) + \tfrac{1}{2} \partial J(z) \tag{22} \]
then $T'(z)$ has a vanishing central charge. Furthermore, the conformal dimensions of the supersymmetry operators $G^{\pm}$ become shifted from 3/2 to $h(G^+) = 1$ and $h(G^-) = 2$. It is then possible to integrate $G^+$ on the Riemann surface and define a fermionic scalar operator $G^+_0 = \oint dz \, G^+(z)$. From the $N = 2$ algebra, one has
\[ (G^+_0)^2 = 0, \quad \{G^+_0, G^-(z)\} = 2 T'(z) \tag{23} \]
If we identify $G^+_0$ as the Becchi–Rouet–Stora–Tyutin (BRST) operator of the theory, then the twisted stress tensor becomes BRST trivial, which is the characteristic feature of topological field theory. Thus, we obtain a topological field theory by twisting $N = 2$ conformal theory (Eguchi and Yang 1990). These are topological minimal models. BRST-invariant observables are given by the chiral primary fields [20]. (To be precise, when we take account of the antiholomorphic sector, we may define either $Q = G^+_0 + \bar G^+_0$ or $Q = G^+_0 + \bar G^-_0$ as the BRST operator. Thus, in general, we obtain two different topological field theories. This is the origin of the mirror symmetry. In the context of topological gravity, one takes the convention $Q = G^+_0 + \bar G^+_0$.)
Now, we consider the coupling of topological gravity to topological minimal models. We identify $k = p - 2$. Making use of chiral fields $\phi_\ell$ ($\ell = 0, \ldots, p - 2$), observables are constructed:
\[ \sigma_{n,\ell} = \sigma_n \cdot \phi_\ell \tag{24} \]
The $N = 2$ $U(1)$ charge is identified as the degree of a differential form on the moduli space. Thus, the degree of $\sigma_{n,\ell}$ is $n + \ell/p$. Correlation functions $\langle \prod_{i=1}^{s} \sigma_{n_i,\ell_i} \rangle_g$ are nonzero if the selection rule
\[ \sum_{i=1}^{s} \left( n_i + \frac{\ell_i}{p} \right) = \left( 3 - \frac{p - 2}{p} \right)(g - 1) + s \tag{25} \]
is obeyed.
We may assemble $\sigma_{n,\ell}$ into operators with one index $O_m$ as
\[ O_{np+\ell+1} = \prod_{r=0}^{n} (rp + \ell + 1) \cdot \sigma_{n,\ell} \tag{26} \]
where one introduces a convenient normalization factor. Note that the operators $O_m$ do not exist when $m \equiv 0 \bmod p$ and the corresponding parameters $t_m$ are absent. This is a characteristic feature of the $p$-reduced KP hierarchy.
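The charge assignments [21] and the selection rule [25] are simple enough to check with exact rational arithmetic. The sketch below is illustrative only; the helper names `central_charge`, `chiral_charge`, and `selection_rule_holds` are assumptions introduced here. It evaluates $c = 3k/(k+2)$, the chiral charges with $k = p - 2$, and tests whether a list of labels $(n_i, \ell_i)$ satisfies [25].

```python
from fractions import Fraction


def central_charge(k):
    """Central charge c = 3k/(k+2) of the level-k N = 2 minimal model."""
    return Fraction(3 * k, k + 2)


def chiral_charge(l, k):
    """U(1) charge q = l/(k+2) of the chiral primary field phi_l, eqn [21]."""
    return Fraction(l, k + 2)


def selection_rule_holds(p, g, insertions):
    """Check eqn [25] for insertions given as (n, l) pairs, 0 <= l <= p - 2."""
    s = len(insertions)
    lhs = sum(Fraction(n) + Fraction(l, p) for n, l in insertions)
    rhs = (3 - Fraction(p - 2, p)) * (g - 1) + s
    return lhs == rhs


if __name__ == "__main__":
    # The central charge stays below 3 for every finite level k.
    print([central_charge(k) for k in (1, 2, 3, 10)])
    # For p = 2 the rule reduces to eqn [9]: sum of n_i = 3g - 3 + s.
    print(selection_rule_holds(2, 0, [(0, 0), (0, 0), (0, 0)]))
    # A genus-zero three-point function in the p = 5 theory.
    print(selection_rule_holds(5, 0, [(0, 0), (0, 1), (0, 2)]))
```

With $k = p - 2$ the charge `chiral_charge(l, p - 2)` equals $\ell/p$, which is the fractional part of the degree used in [25].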

The puncture and dilaton equations for $(p, 1)$ theories read
\[ \langle \sigma_{0,0} \sigma_{n_1,k_1} \cdots \sigma_{n_s,k_s} \rangle_g = \sum_{i=1;\, n_i \neq 0}^{s} \langle \sigma_{n_1,k_1} \cdots \sigma_{n_i - 1,k_i} \cdots \sigma_{n_s,k_s} \rangle_g \tag{27} \]
\[ \langle \sigma_{1,0} \sigma_{n_1,k_1} \cdots \sigma_{n_s,k_s} \rangle_g = (2g - 2 + s) \langle \sigma_{n_1,k_1} \cdots \sigma_{n_s,k_s} \rangle_g \tag{28} \]
The special terms at $g = 0$ and $g = 1$ are given by
\[ \langle \sigma_{0,0} \sigma_{0,i} \sigma_{0,p-i-2} \rangle_0 = 1, \quad \langle \sigma_{1,0} \rangle_1 = \frac{p - 1}{24} \tag{29} \]

Integrable Hierarchy
We now summarize some basic facts about the integrable hierarchy (see for instance eqn [5]). We introduce a $p$th order differential operator:
\[ L = D^p + \sum_{i=0}^{p-2} u_i(x) D^i, \quad D \equiv \frac{\partial}{\partial x} \tag{30} \]
where the coefficient functions $u_i$ are arbitrary functions of $x$. This Lax operator describes the $p$th generalized KdV hierarchy. We consider the time evolution of the operator $L$ by an infinite set of commuting Hamiltonians:
\[ \frac{\partial L}{\partial t_n} = [H_n, L], \quad n = 1, 2, \ldots \tag{31} \]
where $H_n$ is given by
\[ H_n = \left( L^{n/p} \right)_+ \tag{32} \]
Here "+" denotes the non-negative part of a pseudodifferential operator and is defined as
\[ A = \sum_{i=-\infty}^{n} f_i(x) D^i, \quad A_+ = \sum_{i=0}^{n} f_i(x) D^i \tag{33} \]
We also use the notation
\[ \operatorname{res} A = f_{-1}(x), \quad A_- = \sum_{i=-\infty}^{-1} f_i(x) D^i \tag{34} \]
Note that $x$ is identified as the first time variable $t_1$, that is, $x = t_1$.
It is a basic result of the calculus of pseudodifferential operators that the above Hamiltonians satisfy the zero-curvature condition
\[ \frac{\partial H_m}{\partial t_n} - \frac{\partial H_n}{\partial t_m} + [H_m, H_n] = 0 \tag{35} \]
Note that when $m$ is a multiple of $p$, $H_m$ becomes a power of $L$ and trivially commutes with $L$. Thus, the time variables $t_m$ are absent for $m \equiv 0 \bmod p$. In the simple case of $p = 2$, one has
\[ L = D^2 + u(x) \tag{36} \]
and $H_3 = D^3 + (3/2) u D + (3/4) u'$. One finds
\[ \frac{\partial L}{\partial t_3} = \frac{\partial u}{\partial t_3} = [H_3, L] = \frac{3}{2} u \frac{\partial u}{\partial x} + \frac{1}{4} \frac{\partial^3 u}{\partial x^3} \tag{37} \]
which is the standard KdV equation.
In the case of the KP hierarchy, one starts with a pseudodifferential operator
\[ Q = D + \sum_{i=1}^{\infty} a_i D^{-i} \tag{38} \]
and considers the time evolution equations
\[ \frac{\partial Q}{\partial t_n} = [H_n, Q], \quad H_n = (Q^n)_+ \tag{39} \]
The $p$-reduced KP hierarchy is obtained if one has
\[ (Q^p)_- = 0 \tag{40} \]
By introducing a pseudodifferential operator $K$, one may bring $Q$ to the simple derivative operator $D$ as
\[ Q = K D K^{-1} \tag{41} \]
$K$ has an expansion of the form
\[ K = 1 + \sum_{i=1}^{\infty} a_i D^{-i} \tag{42} \]
After time evolution, the coefficient functions $u_i(x)$ of the Lax operator depend also on the variables $t_2, t_3, \ldots$ and become functions of $t \equiv \{t_1, t_2, \ldots\}$. These functions are expressed by the $\tau$-function $\tau(t)$ of the hierarchy in the following manner:
\[ \operatorname{res} K = -\frac{\partial}{\partial x} \log \tau(t) \tag{43} \]
\[ \operatorname{res} L^{i/p} = \frac{\partial^2}{\partial x \, \partial t_i} \log \tau(t) \tag{44} \]
These residues are expressed in terms of $\{u_i\}$ and their derivatives in $x$, and one can determine them in terms of the $\tau$-function.
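The commutator in [37] can be verified without a computer algebra system by letting the operators act on polynomials. The sketch below is a minimal check under that assumption: $u$ and the test function $f$ are polynomials in $x$ with exact rational coefficients, and all helper names are hypothetical. It confirms that $[H_3, L]$ acts as multiplication by $\tfrac{3}{2} u u' + \tfrac{1}{4} u'''$.

```python
from fractions import Fraction as F

# A polynomial in x is a list of coefficients: p[i] multiplies x**i.


def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def scale(c, p):
    return trim([F(c) * a for a in p])


def mul(p, q):
    if not p or not q:
        return []
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def d(p, times=1):
    """Differentiate with respect to x the given number of times."""
    for _ in range(times):
        p = [i * p[i] for i in range(1, len(p))]
    return trim(p)


def lax_L(u, f):
    """L f = f'' + u f, with L = D^2 + u as in eqn [36]."""
    return add(d(f, 2), mul(u, f))


def H3(u, f):
    """H_3 f = f''' + (3/2) u f' + (3/4) u' f."""
    return add(d(f, 3),
               add(scale(F(3, 2), mul(u, d(f))),
                   scale(F(3, 4), mul(d(u), f))))


def commutator_H3_L(u, f):
    """[H_3, L] applied to f."""
    return add(H3(u, lax_L(u, f)), scale(-1, lax_L(u, H3(u, f))))


def kdv_rhs(u):
    """(3/2) u u' + (1/4) u''', the right-hand side of eqn [37]."""
    return add(scale(F(3, 2), mul(u, d(u))), scale(F(1, 4), d(u, 3)))


if __name__ == "__main__":
    u = [1, 2, 0, 1]      # u(x) = 1 + 2x + x^3
    f = [0, 1, 3, 0, 2]   # f(x) = x + 3x^2 + 2x^4
    print(commutator_H3_L(u, f) == mul(kdv_rhs(u), f))
```

Polynomial inputs keep the arithmetic exact. The check is evidence for the operator identity on these test functions rather than a proof for arbitrary smooth $u$.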

In the case $p = 2$, one has
\[ [H_k, L] = 2 D \operatorname{res}(L^{k/2}) = D R_k, \quad k = \text{odd} \tag{45} \]
Here $\{R_k\}$ are the Gelfand–Dikii potentials
\[ R_1 = u, \quad R_3 = \tfrac{1}{4}(3u^2 + u''), \quad R_5 = \tfrac{1}{16}(10u^3 + 5u'^2 + 10uu'' + u''''), \quad \ldots \tag{46} \]
and obey the recursion relation
\[ D R_{k+2} = \tfrac{1}{4} \left( D^3 + 2(Du + uD) \right) R_k \tag{47} \]
If one uses the relation [44], the Gelfand–Dikii potentials are identified as
\[ R_k = 2 \langle O_1 O_k \rangle \tag{48} \]
By setting $k = 1$, we note $u = 2\langle O_1 O_1 \rangle$ and find that the evolution equations [31] are all satisfied as
\[ \frac{\partial L}{\partial t_k} = \frac{\partial u}{\partial t_k} = 2 \frac{\partial}{\partial t_k} \langle O_1 O_1 \rangle = 2 D \langle O_1 O_k \rangle = D R_k = [H_k, L] \tag{49} \]
Now it is possible to identify the initial condition for the Lax operator in the case of topological $(p, 1)$ gravity. By using the definition
\[ \log \tau(t) = \sum_{g=0}^{\infty} \Big\langle \exp\Big( \sum_n t_n O_n \Big) \Big\rangle_g \tag{50} \]
one has
\[ \operatorname{res} L^{i/p}(0) = \langle O_1 O_i \rangle, \quad i = 1, \ldots, p - 1 \tag{51} \]
From [29] one finds
\[ \operatorname{res} L^{i/p}(0) = i x \cdot \delta_{i,p-1} \tag{52} \]
This gives the initial value of the Lax operator:
\[ L(0) = D^p + p x \tag{53} \]
Thus, only the lowest term $u_0(x) = px$ is nonzero and higher coefficients all vanish at $t = 0$. This is the special simplification which takes place in the topological gravity theory.
We note a relation
\[ \left[ \frac{1}{p} D, L(0) \right] = 1 \tag{54} \]
This is the so-called "string equation" (at $t = 0$). At nonzero values of $t$, the string equation takes the form
\[ [P, L] = 1, \quad P = \frac{1}{p} \left( (L^{1/p})_+ - \sum_{k=p+1}^{\infty} k t_k (L^{(k-p)/p})_+ \right) \tag{55} \]
From [55], we see that the $(p, 1)$ theory corresponds to the background value of the coupling $t_{p+1} = 1/(p + 1)$. In the case of the $(p, q)$ theory, the background value is given by $t_{pq+1} = 1/(pq + 1)$.

Virasoro Conditions
A powerful algebraic machinery controlling the structure of 2D gravity is the so-called "Virasoro conditions." One introduces differential operators
\[ L_{-1} = -\frac{\partial}{\partial t_1} + \sum_{k=p+1}^{\infty} k t_k \frac{\partial}{\partial t_{k-p}} + \frac{1}{2} \sum_{i+j=p} i j \, t_i t_j \tag{56} \]
\[ L_0 = -\frac{\partial}{\partial t_{p+1}} + \sum_{k=1}^{\infty} k t_k \frac{\partial}{\partial t_k} + \frac{p^2 - 1}{24} \tag{57} \]
By using the fact that a derivative in $t_n$ brings down the operator $O_n$ when acting on the $\tau$-function, it is easy to show that the conditions
\[ L_{-1} \cdot \tau = 0 \tag{58} \]
\[ L_0 \cdot \tau = 0 \tag{59} \]
reproduce the puncture [27] and dilaton [28] equations, respectively. It is possible to show that the $L_{-1}$-condition, $L_{-1} \cdot \tau = 0$, is equivalent to the string equation [55].
Together with the operators ($n \ge 1$)
\[ L_n = -\frac{\partial}{\partial t_{1+(n+1)p}} + \sum_{k=1}^{\infty} k t_k \frac{\partial}{\partial t_{k+np}} + \frac{1}{2} \sum_{i+j=np} \frac{\partial^2}{\partial t_i \, \partial t_j} \]
they generate the Virasoro algebra ($L'_n \equiv (1/p) L_n$)
\[ [L'_m, L'_n] = (m - n) L'_{m+n}, \quad n, m \ge -1 \tag{60} \]
It is possible to show that the $(p, 1)$ model obeys the Virasoro conditions (Dijkgraaf et al. 1991, Fukuma et al. 1991)
\[ L_n \cdot \tau = 0, \quad n \ge -1 \tag{61} \]
It is known that $(p, 1)$ models with $p > 2$ also obey constraints of the W-algebra.
The relationship of the Virasoro conditions to the KdV hierarchy is summarized as
\[ \text{string equation} + \text{KdV hierarchy} \iff \text{Virasoro and W-algebra constraints} \]

Topological $\sigma$-Model
It is known that when the target space of a supersymmetric nonlinear $\sigma$-model is a Kähler manifold $K$, the theory acquires an enhanced $N = 2$

supersymmetry. Then we can twist the theory and convert it into a topological field theory. This is the topological $\sigma$-model (Witten 1988). The partition function of the theory consists of a sum over world-sheet instantons, that is, holomorphic maps from the Riemann surface to the target space $K$. Due to supersymmetry, functional determinants around instantons cancel and the theory simply counts the number of holomorphic curves inside the Kähler manifold $K$. Thus, the topological $\sigma$-model has a close relationship with enumerative problems in algebraic geometry, that is, Gromov–Witten invariants and quantum cohomology theory.
When the topological $\sigma$-model is coupled to topological gravity, the BRST-invariant observables are given by $\sigma_n(\gamma_i) \equiv \sigma_n \cdot \gamma_i$, where $\gamma_i$ are cohomology classes of $K$. Correlation functions are defined as
\[ \Big\langle \prod_{i=1}^{s} \sigma_{n_i}(\gamma_i) \Big\rangle_{g,d} = \int_{\bar{\mathcal{M}}_{g,s}(K;d)} \prod_{i=1}^{s} c_1(L_i)^{n_i} \wedge e_i^*(\gamma_i) \tag{62} \]
Here $\bar{\mathcal{M}}_{g,s}(K; d)$ denotes the (stable compactification of the) moduli space of degree $d$ holomorphic maps to $K$ from genus $g$ Riemann surfaces $\Sigma$. $e_i^*$ is the pullback by the evaluation map $e_i: (f; x_1, \ldots, x_s) \in \bar{\mathcal{M}}_{g,s}(K; d) \mapsto f(x_i) \in K$, where $f$ is a holomorphic map. Correlation functions [62] give topological (symplectic) invariants of the manifold $K$. In the cases $n_i = 0$ ($i = 1, \ldots, s$), they are known as Gromov–Witten invariants.
Equation [62] is nonvanishing if the selection rule
\[ \sum_{i=1}^{s} (n_i + q_i) = \dim \bar{\mathcal{M}}_{g,s}(K; d) = c_1(K) d + (3 - \dim K)(g - 1) + s \tag{63} \]
is obeyed, where $q_i$ is the degree of the cohomology class $\gamma_i$ and $c_1(K)$ is the first Chern class of the tangent bundle of $K$.
We see that there is a close parallel between the topological $\sigma$-model and $(p, 1)$ topological gravity. If we formally set $q_i = \ell_i/p$, $c_1(K) = 0$, and $\dim K = (p - 2)/p$, eqn [63] agrees with eqn [25]. Based on this analogy, Eguchi, Hori, and Xiong proposed the Virasoro conjecture (Eguchi et al. 1997), that is, generating functions of the number of holomorphic maps to arbitrary Kähler manifolds are annihilated by the Virasoro operators which are constructed by taking an analogy with those of $(p, 1)$ gravity. The Virasoro conjecture is a natural generalization of Witten's conjecture, and has recently been rigorously proved in the case of curves and projective spaces.
Excellent reviews on the theory of 2D topological gravity are given in Witten (1991) and Dijkgraaf (1991).

See also: Axiomatic Approach to Topological Quantum Field Theory; Large-N and Topological Strings; Mirror Symmetry: A Geometric Survey; Moduli Spaces: An Introduction; Riemann Surfaces; Topological Sigma Models; WDVV Equations and Frobenius Manifolds.

Further Reading
Brézin E and Kazakov V (1990) Exactly soluble field theory of closed strings. Physics Letters B 236: 144.
Dijkgraaf R (1991) Lectures Presented at Cargèse Summer School on New Symmetry Principles in Quantum Field Theory, hep-th/9201003.
Dijkgraaf R, Verlinde E, and Verlinde H (1991) Loop equations and Virasoro constraints in non-perturbative 2d quantum gravity. Nuclear Physics B 348: 435.
Douglas M and Shenker S (1990) Strings in less than one dimension. Nuclear Physics B 335: 635.
Eguchi T and Yang S-K (1990) N = 2 superconformal models as topological field theories. Modern Physics Letters A 5: 1693.
Eguchi T, Hori K, and Xiong C-S (1997) Quantum cohomology and Virasoro algebra. Physics Letters B 402: 71.
Fukuma M, Kawai H, and Nakayama R (1991) Continuum Schwinger–Dyson equations and universal structures in two-dimensional quantum gravity. International Journal of Modern Physics A 6: 1385.
Gelfand IM and Dikii LA (1975) Asymptotic behavior of the resolvent of Sturm–Liouville equations and the algebra of the Korteweg–de Vries equations. Russian Mathematical Surveys 30(5): 77–113.
Gross D and Migdal A (1990) Non-perturbative two dimensional quantum gravity. Physical Review Letters 64: 17.
Kontsevich M (1991) Intersection theory on the moduli space of curves and the matrix Airy function. Communications in Mathematical Physics.
Witten E (1988) Topological sigma models. Communications in Mathematical Physics 118: 411.
Witten E (1990) On the topological phase of two dimensional gravity. Nuclear Physics B 340: 281.
Witten E (1991) Two-dimensional gravity and intersection theory of moduli space. Surveys in Differential Geometry 1: 243–310.

Topological Knot Theory and Macroscopic Physics


L Boi, EHESS and LUTH, Paris, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction to the Physical and Mathematical Contexts and Issues
One of the most exciting developments of mathematical physics in the last three decades has been the discovery of numerous intimate relationships between the topology and the geometry of knot theory and the dynamics of many domains of "classical" and "new" macroscopic physics. Indeed, complex systems of knotted and entangled filamentary structures are ubiquitous in nature and arise in such disparate contexts as electrodynamics, magnetohydrodynamics, fluid dynamics (vortex structures), superfluidity, dynamical systems, plasma physics, cosmic string theory, chaos of magnetic flows and nonlinear phenomena, turbulence, polymer physics, and molecular biology. In recent years, mathematical tools have been developed to identify and analyze the geometrical and topological complex structures and behaviors of such systems and relate this information to energy levels and stable states.
The influence of geometry and topology on macroscopic physics has been especially fruitful in the study and comprehension of the following topics.
1. Knots and braids in dynamical systems. It is now clear that the chaotic behavior of the Hénon–Heiles system and other nonlinear systems is driven and controlled by topological properties. For example, it has been found that trajectories in the phase space form hyperbolic knots. The finding of knots in the Lorenz equations is another important, closely related theme. By varying the Rayleigh number r, a parameter in the Lorenz equations, both chaotic and periodic behavior is observed. In recent years, the knots (notably several torus knots) corresponding to the different periodic solutions of the system have been found and classified. Finding hyperbolic knots, and in particular the hyperbolic figure-8 knot, as solutions to the Lorenz equations would strengthen the suspicion that there exists a new route to chaos.
2. Topological structures of electromagnetic fields. Progress in the fields of space physics, astronomy, and astrophysics over the last decade increasingly reveals the significance of topological magnetic fields in these areas. In particular, the interaction of plasma and magnetic field can create an astonishing variety of structures, which often exhibit linked and knotted forms of magnetic flux. In these complex structures of the fields, huge amounts of magnetic energy can be stored. It is, however, a typical property of astrophysical plasmas that the dynamics of magnetic fields alternates between an ideal motion, where all forms of knottedness and linkage of the field are conserved (topology conservation), and a kind of disruption of the magnetic structure, the so-called magnetic reconnection. In the latter, the magnetic structure breaks up and reconnects, a process often accompanied by explosive eruptions, where enormous amounts of energy are set free. Magnetic reconnection is in close analogy to splitting of knots, which makes us confident that the global dynamics of magnetic and electromagnetic fields can be characterized with the help of such topological quantities as well.
3. Knotting and unknotting of phase singularities. It has long been known that dislocation lines can be closed, and recently it was shown that they can be knotted and linked. Moreover, Berry and Dennis (2001) constructed exact solutions of the Helmholtz equation representing torus knots and links; in fact, a straightforward application of this idea led to knotted and linked dislocation lines in stationary states of electrons in hydrogen. As an external parameter is varied, the topology of dislocation lines can change, leading to the creation of knots and links from initially simple dislocation loops, and the reverse process of unknotting and unlinking. The main purpose here is to elucidate the mechanism of these changes of topology. All waves are solutions of monochromatic wave equations, that is, stationary waves, and the parameter is one that could be manipulated experimentally. However, the parameter could instead represent time, and then the analogous solutions of time-dependent wave equations would describe knotting and linking events in the history of waves. The methods of Berry and Dennis are based on exact stationary solutions of wave equations, and lead to knots and links threaded by multistranded helices.

The Origins of Topological Vortex Dynamics Ideas
The intimate relationship between three-dimensional vortex dynamics and topology was recognized as early as 1869 by W Thomson (Lord Kelvin) who tried to elaborate a theory of matter in which atoms were thought to be tiny vortex filaments embedded in an elastic-like fluid medium, called ether. Accordingly, the infinite variety of possible chemical compounds was given by the endless family of topological
combinations of linked and knotted vortices. Kelvin was inspired by the work of Gauss, who, in an attempt to describe topologically the behavior of two inseparably linked closed circuits carrying electric current, found a relationship between the magnetic action induced by the currents and a pure number that depends only on the type of link, and not on the geometry: this number is the first topological invariant, now known as the linking number.

In modern mathematical terms, Gauss introduced an invariant of a link consisting of two simple closed curves $\gamma_1$, $\gamma_2$ in $\mathbb{R}^3$, namely the signed number of turns of one of the curves around the other, the linking coefficient $\{\gamma_1, \gamma_2\}$ of the link. His formula for this is

\[ N = \{\gamma_1, \gamma_2\} = \frac{1}{4\pi} \int_{\gamma_1} \int_{\gamma_2} \frac{\bigl([d\gamma_1(t), d\gamma_2(t)],\ \gamma_1 - \gamma_2\bigr)}{|\gamma_1(t) - \gamma_2(t)|^3} \tag{1} \]

where [ , ] denotes the vector (or cross) product of vectors in $\mathbb{R}^3$ and ( , ) the Euclidean scalar product. This integral always has an integer value N. If we take one of the curves to be the z-axis in $\mathbb{R}^3$ and the other to lie in the (x, y)-plane, then formula [1] gives the net number of turns of the plane curve around the z-axis. It is interesting to note that the linking coefficient [1] may be zero even though the curves are nontrivially linked. Thus, a nonzero value is only a sufficient condition for nontrivial linkage of the loops. This last consideration leads naturally to the mathematical concepts of knots and links, whose most striking properties have been investigated in our introductory article (see Mathematical Knot Theory).

The other source of inspiration for Kelvin's theory of matter was Helmholtz's laws of vortex motion, which state that in an ideal fluid (where there is no viscosity) vortices live forever: two closed vortex rings, once linked, will always be linked. The classical results obtained by Helmholtz are basic to understanding the dynamics of Euler motions. The vorticity of a velocity field is its curl and is denoted $\omega_t(z) := \mathrm{curl}(X(z, t))$. In two dimensions, the vorticity is a real-valued function and $\omega_t = \Delta\psi$, where $\psi$ is the stream function of X(z, t). Recall that the push-forward of a scalar field (0-form) s under a diffeomorphism f is $f_* s = s \circ f^{-1}$. These results, in modern terms, can be stated as follows:

Theorem (Helmholtz–Kelvin). An incompressible fluid motion $(M_t, \phi_t)$ with velocity field X and vorticity $\omega_t$ is Euler if and only if its vorticity is passively transported,

\[ \phi_{t*}\,\omega_0 = \omega_t \]

and the circulation around all smooth simple closed curves C is preserved under the flow,

\[ \frac{d}{dt} \int_{\phi_t(C)} X \cdot dr = 0 \]

In three dimensions, the Helmholtz–Kelvin theorem says that the vorticity (now a vector field) is transported. Thus, with generic initial vorticity, a 3D time-periodic Euler fluid motion preserves a nontrivial vector field. One very interesting question that remains to be elucidated is the following: are there any chaotic, time-periodic Euler flows with stationary boundaries?

The Connection between Topological and Numerical Invariants of Knots and the Physical Helicity of Vector Fields

The writhing number of a curve in Euclidean three-dimensional space is the standard measure of the extent to which the curve wraps and coils around itself; it has proved its importance for electrodynamics and fluid mechanics in the study of the knotted structures of magnetic vortices and dynamical flows, and for molecular biologists in the study of knotted duplex DNA and the enzymes which affect it. The helicity of a divergenceless vector field defined on a domain in Euclidean 3-space, introduced by Woltjer in 1958 in an astrophysical context and named by Moffatt in 1969 in the study of its topological meaning, is the standard measure of the extent to which the field lines wrap and coil around one another; it plays important roles in fluid mechanics, magnetohydrodynamics, and plasma physics. The "Biot–Savart operator" associates with each current distribution on a given domain the restriction of its magnetic field to the domain. When the domain is simply connected, the divergence-free fields which are tangent to the boundary and which minimize energy for given helicity provide models for stable force-free magnetic fields in space and laboratory plasmas; these fields appear mathematically as the extreme eigenfields for an appropriate modification of the Biot–Savart operator. Information about these fields can be converted into bounds on the writhing number of a given piece of DNA.

Recent research (Cantarella et al. 2001) obtained rough upper bounds for the writhing number of a knot or link in terms of its length and thickness, and rough upper bounds for the helicity of a vector field in terms of its energy and the geometry of its domain. It was also shown that in the case of classical electrodynamics in vacuum, the
natural helicity invariant, called the electromagnetic helicity, has an important particle meaning: the difference between the numbers of right- and left-handed photons. Recently, a topological model of classical electrodynamics has been proposed in which the helicity is topologically quantized, in a relation that connects the wave and particle aspects of the fields (Trueba and Rañada 2000).

Consider two disjoint closed space curves, C and C', and Gauss' integral formula for their linking number

\[ \mathrm{Lk}(C, C') = \frac{1}{4\pi} \int_{C \times C'} \left( \frac{dx}{ds} \times \frac{dy}{dt} \right) \cdot \frac{x - y}{|x - y|^3} \, ds \, dt \tag{2} \]

The curves C and C' are assumed to be smooth and to be parametrized by arclength. Now the question is what happens to this integral when the two space curves C and C' come together and coalesce as one curve C. At first glance, the integrand looks like it might blow up along the diagonal of C × C, but a careful calculation shows that in fact the integrand approaches zero on the diagonal, and so the integral converges. Its value is the writhing number Wr(C) of C defined above:

\[ \mathrm{Wr}(C) = \frac{1}{4\pi} \int_{C \times C} \left( \frac{dx}{ds} \times \frac{dy}{dt} \right) \cdot \frac{x - y}{|x - y|^3} \, ds \, dt \tag{3} \]

Here is a very useful result, due to Fuller (1978). The writhing number of a knot K is the average linking number of K with its slight perturbations in every possible direction:

\[ \mathrm{Wr}(K) = \frac{1}{4\pi} \int_{W \in S^2} \mathrm{Lk}(K, K + \varepsilon W) \, d(\mathrm{area}) \tag{4} \]

This is helpful for getting a quick approximation to the writhing number of a knot which almost lies in a plane; in the example of a trefoil knot, $\mathrm{Wr}(K) \approx 3$.

Here, a very important result must be recalled, a "bridge theorem," proved by Berger and Field (1984), see also Ricca and Moffatt (1992), which connects helicity of vector fields to writhing of knots and links, and which can be used to convert upper bounds on helicity into upper bounds on writhing.

Proposition (Berger and Field). Let K be a smooth knot or link in 3-space and $\Omega = N(K, R)$ a tubular neighborhood of radius R about K. Let V be a vector field defined in $\Omega$, orthogonal to the cross-sectional disks, with length depending only on distance from K. This makes V divergence-free and tangent to the boundary of $\Omega$. Then the writhing number Wr(K) of K and the helicity H(V) of the vector field V are related by the formula

\[ H(V) = \mathrm{Flux}(V)^2 \, \mathrm{Wr}(K) \]

In the formula, Flux(V) denotes the flux of V through any of the cross-sectional disks D,

\[ \mathrm{Flux}(V) = \int_D V \cdot n \, d(\mathrm{area}) \]

where n is a unit normal vector field to D.

A key feature of this formula is that the helicity of V depends on the writhing number of K, but not any further on its geometry; in particular, such quantities as the curvature and torsion of K do not enter into the formula. Berger and Field actually showed that the helicity H(V) is a sum of two terms: a "kink helicity," which is given by the right-hand side of the above formula, and a "twist helicity," which is easily shown in our case to be zero. Their proof assumes K is a knot, but it is straightforward to extend it to cover links.

Let $\Omega$ be a compact domain in 3-space with smooth boundary $\partial\Omega$; we allow both $\Omega$ and $\partial\Omega$ to be disconnected. Let V be a smooth vector field (where "smooth" means of class $C^\infty$), defined on the domain $\Omega$. The helicity H(V) of the vector field V is defined by the formula

\[ H(V) = \frac{1}{4\pi} \int_{\Omega \times \Omega} V(x) \times V(y) \cdot \frac{x - y}{|x - y|^3} \, d(\mathrm{vol})_x \, d(\mathrm{vol})_y \tag{5} \]

Clearly, helicity for vector fields is the analog of writhing number for knots. Both formulas are variants of Gauss' integral formula for the linking number of two disjoint closed space curves.

In order to understand this formula for helicity, think of V as a distribution of electric current, and use the Biot–Savart law of electrodynamics to compute its magnetic field:

\[ \mathrm{BS}(V)(y) = \frac{1}{4\pi} \int_{\Omega} V(x) \times \frac{y - x}{|y - x|^3} \, d(\mathrm{vol})_x \tag{6} \]

Then the helicity of V can be expressed as an integrated dot product of V with its magnetic field BS(V):

\[
\begin{aligned}
H(V) &= \frac{1}{4\pi} \int_{\Omega \times \Omega} V(x) \times V(y) \cdot \frac{x - y}{|x - y|^3} \, d(\mathrm{vol})_x \, d(\mathrm{vol})_y \\
&= \int_{\Omega} V(y) \cdot \left[ \frac{1}{4\pi} \int_{\Omega} V(x) \times \frac{y - x}{|y - x|^3} \, d(\mathrm{vol})_x \right] d(\mathrm{vol})_y \\
&= \int_{\Omega} V(y) \cdot \mathrm{BS}(V)(y) \, d(\mathrm{vol})_y \\
&= \int_{\Omega} V \cdot \mathrm{BS}(V) \, d(\mathrm{vol})
\end{aligned}
\]
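
As a numerical illustration of Gauss' integral formula [2], the following minimal sketch approximates the linking number of two closed curves by a midpoint-rule discretization of the double integral. The function name `gauss_linking_number`, the number of sample points, and the test geometry (two unit circles forming a Hopf link, and the same pair moved apart) are illustrative assumptions and are not taken from the article.

```python
import numpy as np


def gauss_linking_number(curve_a, curve_b):
    """Midpoint-rule approximation of the Gauss linking integral [2].

    curve_a, curve_b: arrays of shape (n, 3) and (m, 3) listing the vertices
    of two disjoint closed polygons (the last vertex joins the first).
    """
    a = np.asarray(curve_a, dtype=float)
    b = np.asarray(curve_b, dtype=float)
    da = np.roll(a, -1, axis=0) - a      # segment vectors, playing the role of dx
    db = np.roll(b, -1, axis=0) - b      # segment vectors, playing the role of dy
    mid_a = a + 0.5 * da                 # segment midpoints on the first curve
    mid_b = b + 0.5 * db                 # segment midpoints on the second curve
    r = mid_a[:, None, :] - mid_b[None, :, :]             # x - y for every pair
    cross = np.cross(da[:, None, :], db[None, :, :])      # dx x dy for every pair
    integrand = np.einsum("ijk,ijk->ij", cross, r) / np.linalg.norm(r, axis=2) ** 3
    return integrand.sum() / (4.0 * np.pi)


# Hypothetical test geometry: a unit circle in the (x, y)-plane and a unit
# circle in the (x, z)-plane passing through its centre (a Hopf link).
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle_xy = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
circle_xz = np.stack([1.0 + np.cos(t), np.zeros_like(t), np.sin(t)], axis=1)
circle_far = circle_xz + np.array([5.0, 0.0, 0.0])   # same circle, moved away

lk_hopf = gauss_linking_number(circle_xy, circle_xz)  # magnitude close to 1
lk_far = gauss_linking_number(circle_xy, circle_far)  # close to 0
print(lk_hopf, lk_far)
```

The sign of the Hopf-link value depends on the chosen orientations of the two circles; only its magnitude is orientation independent. The same double sum, applied to a single curve with the diagonal pairs skipped, would give a rough estimate of the writhing integral [3].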

Cantarella et al. (2001) found two very interesting results.

Theorem 1 Let K be a smooth knot or link in 3-space, with length L and with an embedded tubular neighborhood of radius R. Then the writhing number Wr(K) of K is bounded by

\[ |\mathrm{Wr}(K)| < \tfrac{1}{4} (L/R)^{4/3} \]

For the proof, see Cantarella et al. (2001).

Theorem 2 The helicity of a unit vector field V defined on the compact domain $\Omega$ is bounded by

\[ |H(V)| < \tfrac{1}{2} \, \mathrm{vol}(\Omega)^{4/3} \]

Let us now give a brief overview of the methods used to find sharp upper bounds for the helicity of vector fields defined on a given domain $\Omega$ in 3-space. As usual, $\Omega$ will denote a compact domain with smooth boundary in 3-space. Let $K(\Omega)$ denote the set of all smooth divergence-free vector fields defined on $\Omega$ and tangent to its boundary. These vector fields, sometimes called "fluid knots," are prominent for several reasons: (1) They are natural vector fields to study in a "fluid dynamics approach" to geometric knot theory. (2) They correspond to incompressible fluid flows inside a fixed container. (3) They are the vector fields most often studied in plasma physics. (4) Those that maximize helicity for given energy (equivalently, minimize energy for given helicity) provide models for stable force-free magnetic fields in gaseous nebulae and laboratory plasmas. (5) The search for these helicity-maximizing fields can be converted to the task of solving a system of partial differential equations. (6) The fluid knots can reveal some fundamental and still unknown mechanisms which characterize the phenomenon of phase transition, and in particular the transition from chaotic (unstable) phases and behaviors of matter to ordered (stable) ones.

Knots and Fluid Mechanics (Vortex Lines, Magnetic Helicity, and Turbulence)

Kelvin's theory explaining atoms as knotted vortices in a fluid ether was seminal in the development of topological fluid mechanics. The recent revival (starting in the 1970s) is mainly due to the work of Moffatt, on the topological interpretation of helicity, and Arnol'd, on the asymptotic linking number of space-filling curves. Modern developments have been influenced by recent progress in the theory of knots and links.

Influence of Geometry and Topology on Fluid Flows

Ideal topological fluid mechanics deals essentially with the study of fluid structures that are continuously deformed from one configuration to another by ambient isotopies. Since the fluid flow map $\varphi$ is both continuous and invertible, $\varphi_{t_1}(K)$ and $\varphi_{t_2}(K)$ generate isotopies of a fluid structure K (e.g., a vortex filament) for any $\{t_1, t_2\} \in I$. Isotopic flows generate equivalence classes of (linked and knotted) fluid structures. In the case of (vortex or magnetic) fluid flux tubes, fluid actions induce continuous deformations in D. One of the simplest deformations is local stretching of the tube. From a mathematical viewpoint, this deformation corresponds to a time-dependent, continuous reparametrization of the tube centerline. This reparametrization (via homotopy classes) generates ambient isotopies of the flux tube, with a continuous deformation of the integral curves. Moreover, in the context of the Euler equations, the Reidemeister moves (or isotopic plane deformations), which conserve the knot topology, are performed quite naturally by the action of local flows on flux tube strands. If the fluid in $(D - K)$ is irrotational, then these fluid flows (with velocity u) must satisfy the Dirichlet problem for the Laplacian of the stream function $\varphi$, that is,

\[ u = \nabla\varphi \ \text{ in } (D - K), \qquad \nabla^2 \varphi = 0 \tag{7} \]

with the normal component $u_\perp$ of the velocity to the tube boundary given. Equations [7] admit a unique solution in terms of local flows, and these flows are interpretable in terms of Reidemeister's moves performed on the tube strands. Note that the boundary conditions prescribe only $u_\perp$, whereas no condition is imposed on the tangential component of the velocity. This is consistent with the fact that tangential effects do not alter the topology of the physical knot (or link). The three types of Reidemeister's moves are therefore performed by local fluid flows, which are solutions to [7], up to arbitrary tangential actions.

Knotted and Linked Tubes of Magnetic Flux

Let T be the standard solid torus in $\mathbb{R}^3$ given by

\[ \bigl( (2 + \varepsilon\cos\theta)\cos\varphi,\ (2 + \varepsilon\cos\theta)\sin\varphi,\ \varepsilon\sin\theta \bigr) \tag{8} \]

where $0 \le \theta, \varphi < 2\pi$, and $0 \le \varepsilon < 1$. For relatively prime integers p and q, let $F_{p,q}$ denote the foliation
of T by the curves, indexed by $\varepsilon$ and $\theta$ (where $0 \le \varepsilon \le 1$ and $0 \le \theta < 2\pi$), given by

\[ s \mapsto \bigl( (2 + \varepsilon\cos(\theta + qs))\cos(ps),\ (2 + \varepsilon\cos(\theta + qs))\sin(ps),\ \varepsilon\sin(\theta + qs) \bigr) \tag{9} \]

where $0 \le s < 2\pi$.

Definition A magnetic tubular link (or magnetic link) is a smooth immersion into $\mathbb{R}^3$ of finitely many disjoint standard solid tori $\bigcup_{i=1}^{n} T_i$

\[ L : \bigcup_{i=1}^{n} T_i \to \mathbb{R}^3 \]

and a smooth magnetic field B on $\mathbb{R}^3$ such that

(i) L is an embedding when restricted to the interior of $\bigcup_{i=1}^{n} T_i$,
(ii) the bounding surface of $\bigcup_i L(T_i)$, that is, $\bigcup_i L(\partial T_i)$, is a magnetic surface, and
(iii) for each component $LT_i$, there exist relatively prime nonzero integers $p_i$ and $q_i$ such that L maps the foliation $F_{p_i, q_i}$ of $T_i$ onto the integral curves of B in $LT_i$.

Remark Thus, for every fixed i and j, the linking number between an arbitrary field line in $LT_i$ and an arbitrary field line in $LT_j$ is the same regardless of which integral curves are chosen from $LT_i$ and $LT_j$, respectively. This is true even when i = j.

It follows that a magnetic link $\bigcup_i LT_i$ remains a magnetic link under the action of the fluid flow, that is, $\bigcup_i g_t LT_i$ is a magnetic link for $t \ge 0$.

Keeping in mind that the magnetic field B is frozen in the fluid, we can now find and study those properties of magnetic links that are invariant under the action of fluid flow. One obvious invariant is the volume $V_i$ of each flux tube $g_t LT_i$, that is,

\[ V_i = \mathrm{Vol}(LT_i) = \mathrm{Vol}(g_t LT_i) = \iiint_{g_t LT_i} d(\mathrm{vol}) \tag{10} \]

which remains unchanged because of incompressibility.

Another invariant of fluid flow is defined as follows:

Definition Let L be a magnetic link. For each solid torus $T_i$, choose a meridional disk $D_i$. The magnetic flux $\Phi_i = \Phi(LT_i)$ in the ith component is the surface integral defined as

\[ \Phi_i = \Phi(LT_i) := \iint_{LD_i} B \cdot U \, d(\mathrm{area}) \]

where U denotes the normal to the surface $LD_i$ pointing in the positive direction induced by the B field.

It can be shown that $\Phi_i$ is independent of the chosen meridional disk. It can also be shown that each $\Phi_i$ is a fluid flow invariant, that is,

\[ \Phi_i(g_t LT_i) = \iint_{g_t LD_i} B \cdot U \, d(\mathrm{area}) \tag{11} \]

is independent of t.

One more fluid invariant that will play a central role in the energy minimization of magnetic links is given by the following definition.

Definition The helicity of a magnetic link L is defined as

\[ H(L) = \iiint_{\bigcup_i LT_i} A \cdot B \, d(\mathrm{vol}) \]

The term helicity was first introduced in a fluid context by Moffatt, and it was previously used in particle physics for the scalar product of the momentum and spin of a particle. In another connection, note that the helicity H(L) is the same as the Chern–Simons action:

\[ H(L) = \int A \wedge dA = \int \mathrm{tr}\bigl( A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A \bigr) \tag{12} \]

where A now denotes the magnetic vector potential as a 1-form.

It can be shown that H(L) is gauge invariant, and hence well defined.

Theorem (Moffatt). The helicity is invariant under fluid flow, that is,

\[ \frac{d}{dt} H(g_t L) = 0 \]

Arnol'd (1998) defines the helicity in a more abstract setting and shows that it is invariant under the group S(Diff) of volume-preserving diffeomorphisms.

The following theorem summarizes the many results due to Moffatt, Ricca, Berger, Lomonaco, Hornig, Kauffman, and others, relating the helicity of magnetic links to linking and to magnetic flux.

Theorem Let L be a magnetic link. Then

\[ H(L) = \sum_{1 \le i \le n} \Phi_i^2 \, SL_{F_i} + 2 \sum_{1 \le i < j \le n} \Phi_i \Phi_j \, LK_{ij} \]

where $SL_{F_i}$ denotes the self-linking number of the axis curve of the tube $LT_i$ with respect to the framing $F_i$ induced by the integral curves of the magnetic field B within $LT_i$, and $LK_{ij}$ denotes the linking number between any integral curve of
the magnetic field B in $LT_i$ with any integral curve of the magnetic field B in $LT_j$.

Remark In fact, $SL_{F_i}$ is the same as the linking number between any two integral curves of the magnetic field B within the tube $LT_i$.

Thus, as many authors have shown, the helicity does reflect the topology and the geometry of the magnetic lines of force within a magnetic link. If, for example, L has only one component, that is, L is a magnetic knot, then

\[ H(L) = \Phi^2 \, SL_F(C) \tag{13} \]

where $SL_F(C)$ is the self-linking number of the axis curve C of the knotted tube with respect to the framing F induced by the integral curves of the magnetic field B within the magnetic knot. If, for example, the tube is knotted in the form of a trefoil and if the magnetic lines of force appear to be parallel to the axis curve when the trefoil is placed on a flat surface, then $SL_F = 3$ and

\[ H = 3\Phi^2 \tag{14} \]

On the other hand, if, for example, the magnetic lines of force induce the trivial framing in each component, then

\[ H(L) = 2 \sum_{1 \le i < j \le n} \Phi_i \Phi_j \, LK_{ij} \tag{15} \]

Thus, if L is a magnetic two-component Hopf link with no twisting of the integral curves of the magnetic field within the components of L, then

\[ H(L) = 2 \Phi_1 \Phi_2 \tag{16} \]

because the self-linking number based on the B-field framing is zero for each component, and the linking number between the two components is 1.

Energy of Magnetic Knots and Links

Let us conclude this section with the definition of the energy of a magnetic link.

Definition The magnetic energy $E_M(L)$ of a magnetic link L is defined by the classical formula

\[ E_M(L) = \frac{1}{8\pi} \iiint_{\bigcup_i LT_i} |B|^2 \, d(\mathrm{vol}) \qquad \text{(Gaussian units)} \]

Although the energy $E_M$ is not flow invariant, it will play a central role in magnetic relaxation of knots and minimum energy magnetic links.

Consider a magnetic link L in a perfectly conducting, incompressible, viscous fluid. As a result of dissipative frictional fluid forces, the magnetic energy $E_M(g_t L)$ of $g_t L$ will decrease with time t. In losing energy, the magnetic lines of force will contract. On the other hand, since this is a volume-preserving process, the cross sections of the flux tubes of $g_t L$ will at the same time expand. These changes of geometry occur while the flux $\Phi$, volume V, and helicity of $g_t L$ remain the same. In other words, knotted magnetic flux tubes left free to evolve in such a fluid will do so by conserving their magnetic flux $\Phi$ and volume V, but converting their magnetic energy into kinetic energy, which in turn dissipates by internal friction. Magnetic links and knots evolve from high to low magnetic energy levels, conserving topology; and because of the induced shortening of field lines under conservation of volume, they become fatter and fatter, with an increase of the average tube cross section.

This process cannot continue indefinitely. Eventually, the magnetic flux tubes of $g_t L$ must make contact with each other. In other words, the topology of the magnetic link $g_t L$, as expressed in knotting and linking, creates a barrier to the full dissipation of the magnetic link's energy, that is, $E_M(g_t L)$ has a positive lower bound that results from the topology of $g_t L$. Relaxation is thus obstructed by the knottedness and entanglement of the field lines, and a minimum magnetic energy is reached. The magnetic link will reach a nontrivial stable and invariant energy state, much as Kelvin conjectured his atomic vortices would.

Various estimates of magnetomechanical energy in terms of topological quantities have been put forward in recent years (see Freedman and He (1994)). These relations give lower bounds for the energy levels attainable by knot or link types by taking into account the effects that linking numbers and number of crossings have on the energy of the relaxed state. These bounds are expressed by relationships of the kind

\[ E_{\min} \ge f(C_{\min}, \Phi, V, N) \tag{17} \]

where $E_{\min}$ is the equilibrium energy and $f$ gives the relationship between physical quantities (such as total flux $\Phi$, number of tubes N, magnetic volume V) and topology, given here by the minimal possible number of crossings $C_{\min}$. These relations offer numerous advantages due to the explicit dependence on qualitative properties of the flow field. A simple example is provided by the analysis of three braids, which shows that magnetic energy grows quadratically in time due to random braiding. This means that the least possible amount of magnetic energy that can be attained by the physical knot or link is determined purely by its topology. If topological information sets the levels of minimum energy accessible to the knot or link, geometric properties
may also influence the relaxation process. Considerations of helicity and linking numbers, for example, demonstrate that internal rearrangement of magnetic field geometry leads to a spectrum of different asymptotic endstates with the same topology. Moreover, magnetic knots have a natural tendency to get rid of excessive torsion of field lines and S-shaped tube geometries, and this may influence the relaxation process.

Since the helicity $H(g_t L)$ is both an invariant of fluid flow and an expression of the topology of the magnetic link $g_t L$, the following theorem, first stated by Moffatt, is a mathematical expression of this topological bound.

Theorem Let L be a magnetic link. Then

\[ E_M(L) \ge q_0 \, |H(L)| \]

where $q_0$ is a nonzero constant that is independent of the magnetic link.

Freedman and He (1991) obtain more subtle and tighter topological bounds on the minimum energy of magnetic links. For example, for a magnetic knot K, they prove that

\[ E_M(K) \ge \frac{1}{4\pi^{5/4}} \, \frac{\Phi(K)^{3/2} \, ac(K)^{3/4}}{V(K)^{1/3}} \ge \frac{1}{4\pi^{5/4}} \, \frac{\Phi(K)^{3/2} \, (2g(K) - 1)}{V(K)^{1/3}} \qquad \text{(Gaussian units)} \tag{18} \]

where V(K) denotes the volume of the magnetic knot K, $\Phi(K)$ denotes the flux in K, ac(K) is the asymptotic crossing number, and g(K) is the genus of the knot K. Freedman and He conjecture that ac(K) = c(K), where c(K) is the crossing number, that is, the minimum number of crossings among all plane diagrams representing the knot K. In addition, Moffatt (1990) suggests that the minimum energy spectrum of a magnetic knot can be used to construct new knot invariants.

Topological Changes, Dissipation, and Reconnection in Fluid Patterns

As we saw above, topological changes do occur when dissipative effects become predominant over the coherency of structures. When this happens, there is a dramatic change of fluid patterns, often on small timescales compared to evolution. The change occurs through the formation and disappearance of physical reconnections in the fluid pattern. In real fluids, for example, vortex and magnetic tubes do interact and reconnect freely. From a dynamical system viewpoint, reconnections take place when the vector field lines (streamlines, vortex lines, or magnetic lines) cross each other. If two field lines meet, the point of crossing is a true nodal point, like a bifurcation in a path. Dissipative effects allow the reconnection to proceed through such points.

In dissipative fluids, mathematical and physical properties are no longer conserved, and during the process we lose part of the original information. However, some of the invariants are rather robust and may only degrade slowly. One of them is magnetic helicity, the magnetic analog of the kinetic helicity. Its dissipation during reconnection can be modest; in particular, if the reconnection timescale is small compared to classical dissipation times, then helicity loss will be negligible. The robustness of magnetic helicity plays a central role in fusion plasma physics and in many astrophysical contexts. On the other hand, large changes in kinetic helicity are intimately related to qualitative changes in the topology of vortex flows.

Under Euler's equations, the helicity of a vortex tube of vorticity $\omega$ and velocity u is defined by $H = \int u \cdot \omega \, dV$. The integral is taken over the tube volume V occupied by $\omega$. Now, for N knotted and linked vortex tubes, each of (constant) strength (total vorticity) $\Phi_i$ ($1 \le i \le N$), the helicity of the whole system can be expressed in terms of linking numbers $Lk_{ij}$ as

\[ H = \sum_{ij} Lk_{ij} \, \Phi_i \Phi_j \]

where $Lk_{ij}$ is equal to $Lk_{ji}$; this is a topological invariant whose value does not change under continuous deformation of the fluid structure. Since helicity and flux-tube strength are measurable conserved quantities, the above equation provides useful information about the topology of the flow field and flow structures. In addition, by direct measurements of helicity and application of conservation of topology, one can estimate average geometric quantities, such as the mean twist of field lines, and their contribution to the total energy.

Brief Conclusion

In this article, we have made an attempt to indicate how "classical" field theories, which have been successfully used to describe the physics of fundamental structures and forces of nature, can also be used to study the geometry and topology of low-dimensional manifolds. These developments not only provide new insights into old problems of topology of these manifolds but also have been responsible for profoundly interesting new mathematics (fluid mechanics, dynamical flows, and polymer biophysics
are maybe the most significant examples in the last years). In particular, fluid dynamics, a topological macroscopic field theory, provides a powerful framework for the modern theory of knots and links in 3-manifolds. Moreover, as we saw here, it provides a physical interpretation of the linking, self-linking, and writhing numbers of knots and links. The present article was essentially aimed at illustrating such a relationship. Thus, the most fundamental result we reported here is the relation (formula) connecting the helicity of vector (magnetic) fields to the writhing number of knots: $H(V) = \mathrm{Flux}(V)^2 \, \mathrm{Wr}(K)$. So, writhing number for knots is the analog of helicity for vector fields. Both expressions of these invariants are variants of the (Gaussian) integral formula for the linking number of two disjoint closed space curves. Further investigations of these invariants and their mathematical properties might throw new light on the interfaces between many different areas of macroscopic and quantum physics.

See also: The Jones Polynomial; Knot Theory and Physics; Magnetohydrodynamics; Mathematical Knot Theory; Stability of Flows; Superfluids; Topological Quantum Field Theory: Overview; Vortex Dynamics; Yang–Baxter Equations.

Further Reading

Arnol'd V and Khesin B (1998) Topological Methods in Hydrodynamics. Heidelberg: Springer.
Berger MA and Field GB (1984) The topological properties of magnetic helicity. Journal of Fluid Mechanics 147: 133–148.
Berry MV and Dennis MR (2001) Knotted and linked phase singularities in monochromatic waves. Proceedings of the Royal Society A 457: 2251–2263.
Boi L (2005) Topological knots' models in physics and biology. In: Boi L (ed.) Geometries of Nature, Living Systems and Human Cognition. New Interactions of Mathematics with Natural Sciences and Humanities, pp. 211–294. Singapore: World Scientific.
Boyland P (2001) Fluid mechanics and mathematical structures. In: Ricca RL (ed.) An Introduction to the Geometry and Topology of Fluid Flows, pp. 105–134. NATO-ASI Series: Mathematics. Dordrecht: Kluwer.
Cantarella J, DeTurck D, and Gluck H (2001) The Biot–Savart operator for application to knot theory, fluid dynamics, and plasma physics. Journal of Mathematical Physics 42: 876–905.
Freedman MH and Zheng-Xu He (1991) Divergence free fields: energy and asymptotic crossing number. Annals of Mathematics 134: 189–229.
Fuller FB (1978) Decomposition of the linking number of a closed ribbon: a problem from molecular biology. Proceedings of the National Academy of Sciences, USA 75: 3557–3561.
Ghrist RW, Holmes PhJ, and Sullivan MC (1997) Knots and Links in Three-Dimensional Flows. Heidelberg: Springer.
Hornig G (2002) Topological Methods in Fluid Dynamics. Preprint. Ruhr-Universität-Bochum.
Kauffman LH (1995) Knots and Applications. Series on Knots and Everything, vol. 6. Singapore: World Scientific.
Lomonaco SJ (1995) The modern legacies of Thomson's atomic vortex theory in classical electrodynamics. In: Kauffman LH (ed.) The Interface of Knots and Physics, Proc. Symp. Appl. Math., vol. 51, pp. 145–166. American Mathematical Society.
Moffatt HK (1969) The degree of knottedness of tangled vortex lines. Journal of Fluid Mechanics 35: 117–129.
Moffatt HK (1990) The energy spectrum of knots and links. Nature 347: 367–369.
Moffatt HK, Zaslavsky GM, Comte P, and Tabor M (1992) Topological Aspects of the Dynamics of Fluids and Plasmas, NATO ASI Series, Series E: Applied Sciences, vol. 218. Dordrecht: Kluwer Academic.
Ricca RL (1998) New developments in topological fluid mechanics: from Kelvin's vortex knots to magnetic knots. In: Stasiak A, Katritch V, and Kauffman LH (eds.) Ideal Knots. Singapore: World Scientific.
Ricca RL and Moffatt HK (1992) The helicity of a knotted vortex filament. In: Moffatt HK (ed.) Topological Aspects of Dynamics of Fluids and Plasmas, pp. 225–236. Dordrecht: Kluwer.
Tait PG (1900) On Knots I, II, III. In: Scientific Papers. Cambridge: Cambridge University Press.
Thomson JJ (1883) A Treatise on the Motion of Vortex Rings. London: Macmillan.
Trueba JL and Rañada AF (2000) Helicity in classical electrodynamics and its topological quantization. Apeiron 7: 83–88.
Woltjer L (1958) A theorem on force-free magnetic fields. Proceedings of the National Academy of Sciences, USA 44: 489–491.

Topological Quantum Field Theory: Overview


J M F Labastida, CSIC, Madrid, Spain
C Lozano, INTA, Torrejón de Ardoz, Spain
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Topological quantum field theory (TQFT) constitutes one of the most successful fields of mathematical physics since it originated in the 1980s. It possesses an inherent property which makes it unique: TQFT provides predictions in mathematics which open new fields of research. A well-known example is the prediction of Seiberg–Witten invariants as building blocks of Donaldson invariants. However, there are others such as the recent proposal for the coefficients of the HOMFLY polynomial invariants for knots as quantities related to enumerative geometry. These developments have drawn the attention of mathematicians and physicists into TQFT since the 1980s, a very fruitful period in which both communities have benefited from each other.

Topology has always been present in mathematical physics, in particular when dealing with aspects of
quantum physics. Global effects play an important role in quantum-mechanical models and topology becomes an essential ingredient in their description. TQFT itself appeared in the winter of 1987 after Witten's work (Witten 1988a) on Donaldson theory (Donaldson 1990), but a series of papers during the 1980s which dealt with topological aspects of field and string theory anticipated its existence. Two of these correspond to Witten's works on supersymmetric quantum mechanics and supersymmetric sigma models (Witten 1982) that led to a generalization of Morse theory. This generalization was considered by Floer (1987) in a new context that constituted the key element in Witten's construction of TQFT. These developments were certainly influenced by Atiyah (1988). TQFT was born as a result of the interplay between physics and mathematics. This has been a constant feature throughout its development.

Soon after the formulation of the TQFT addressing Donaldson theory, now known as Donaldson–Witten theory, Witten formulated a new TQFT which focuses on knot invariants such as the Jones polynomial and its generalizations (Jones 1985). Witten (1989) constructed Chern–Simons gauge theory and proved its relation to the theory of knot and link invariants. This theory possesses different features from Donaldson–Witten theory, and in fact it turns out that these two theories fall into two different general types of TQFTs, as will be explained in the following section. Despite their formal differences, both Donaldson–Witten and Chern–Simons gauge theory emerged as a novel way to express topological invariants in terms of quantum field theory quantities as well as to generalize their previous formulation. But there was much more to them than it seemed in their beginnings. Once these topological invariants were formulated in field theory language, one had a huge machinery to study them from different points of view. Theoretical physicists have developed many useful tools to study quantum field theory. The use of these tools led to new frameworks for these topological invariants.

In this overview we are going to provide the basics of TQFT and briefly describe two examples, Donaldson–Witten theory and Chern–Simons gauge theory, to explain how the general features are implemented. Some excellent reviews on the subject (Birmingham et al. 1991, Cordes et al. 1996, Labastida and Mariño 2004) are available. The organization of this work is as follows. In the following section we present a general introduction to TQFT from a functional integral point of view. Next, we touch upon the twisting of extended supersymmetry as a general constructive approach to TQFT. This is followed by a section on Donaldson–Witten theory where we discuss the computation of its observables from a perturbative approach, showing their relation to the Donaldson invariants. Next, we introduce Chern–Simons gauge theory as a theory of knot and link invariants. The penultimate section deals with advanced developments in TQFT. Finally, we close with some concluding remarks.

Topological Quantum Field Theory

We will start our overview by presenting the most general structure of a TQFT from a functional integral point of view which, though not rigorously defined, is the approach that has led to the most important developments. As in conventional quantum field theory, axiomatic approaches to TQFT do exist, but we will not follow that route here.

Let us consider an n-dimensional Riemannian manifold $X$ endowed with a metric $g_{\mu\nu}$ and a quantum field theory on it. We will say that this theory is "topological" if there exist operators in the theory such that their correlation functions do not depend on the metric. If we denote these operators by $\mathcal{O}_i$ (where $i$ is a generic label), then

\[ \frac{\delta}{\delta g^{\mu\nu}} \langle \mathcal{O}_{i_1} \cdots \mathcal{O}_{i_n} \rangle = 0 \qquad [1] \]

where $\langle \cdots \rangle$ denotes a vacuum expectation value. The operators that satisfy this equation are called "topological observables."

The simplest way to achieve metric independence is to consider a theory whose action and operators do not depend on the metric. In this situation, if no anomalous metric dependence is generated upon quantization, the correlation functions of these operators satisfy [1] and lead to topological invariants on $X$. Theories of this sort are collectively referred to as Schwarz-type TQFTs, and well-known examples are Chern–Simons gauge theory and BF theories. However, Schwarz-type theories are too restrictive. One would like to have a theory satisfying property [1] with a weaker condition on the action. This can be achieved with the help of a symmetry. The resulting TQFTs are said to be of Witten or cohomological type, the main examples being Donaldson–Witten theory and topological sigma models (Witten 1988b).

For TQFTs of Witten type, the action may depend on the metric. However, the theory has an underlying scalar symmetry $\delta$ acting on the fields $\phi_i$. Since $\delta$ is a symmetry, the action of the theory satisfies $\delta S(\phi_i) = 0$. In these theories, metric independence of the correlation functions is achieved as follows. Let $T_{\mu\nu} = (\delta/\delta g^{\mu\nu}) S(\phi_i)$ be the energy–momentum tensor of
the theory. It turns out that the energy–momentum tensor is $\delta$-exact:

\[ T_{\mu\nu} = -i\,\delta G_{\mu\nu} \qquad [2] \]

$G_{\mu\nu}$ being some tensor. Indeed, if [2] is satisfied, it follows that for any set of operators $\mathcal{O}_i$ which are $\delta$-invariant,

\[
\begin{aligned}
\frac{\delta}{\delta g^{\mu\nu}} \langle \mathcal{O}_{i_1} \mathcal{O}_{i_2} \cdots \mathcal{O}_{i_n} \rangle
 &= -\langle \mathcal{O}_{i_1} \mathcal{O}_{i_2} \cdots \mathcal{O}_{i_n} T_{\mu\nu} \rangle \\
 &= i \langle \mathcal{O}_{i_1} \mathcal{O}_{i_2} \cdots \mathcal{O}_{i_n}\, \delta G_{\mu\nu} \rangle \\
 &= i \langle \delta ( \mathcal{O}_{i_1} \mathcal{O}_{i_2} \cdots \mathcal{O}_{i_n} G_{\mu\nu} ) \rangle \\
 &= 0
\end{aligned} \qquad [3]
\]

In this computation we have assumed that the symmetry $\delta$ is not anomalous and that there are no contributions coming from boundary terms, since we have integrated by parts in field space. This is not always the case, and in fact the situations in which one of these two properties fails lead to rich phenomena. In those cases, for example, in Donaldson–Witten theory on manifolds with $b_2^+ = 1$, the correlation functions fail to be topological invariants in a controlled manner which unveils many interesting properties.

We will now describe Witten-type theories in a general context. The general structure of Schwarz-type theories is much simpler and will be illustrated in the example presented below. In Witten-type theories the observables are the $\delta$-invariant operators. It is simple to prove that $\delta$-exact operators decouple from the theory. Indeed, if $\mathcal{O}_a$ is $\delta$-exact, $\mathcal{O}_a = \delta \hat{\mathcal{O}}_a$, then

\[ \langle \mathcal{O}_a \mathcal{O}_{i_1} \mathcal{O}_{i_2} \cdots \mathcal{O}_{i_n} \rangle = \langle \delta\hat{\mathcal{O}}_a\, \mathcal{O}_{i_1} \mathcal{O}_{i_2} \cdots \mathcal{O}_{i_n} \rangle = \langle \delta ( \hat{\mathcal{O}}_a \mathcal{O}_{i_1} \mathcal{O}_{i_2} \cdots \mathcal{O}_{i_n} ) \rangle = 0 \qquad [4] \]

Thus, one can restrict the set of observables to the cohomology of $\delta$:

\[ \mathcal{O} \in \frac{\operatorname{Ker}\,\delta}{\operatorname{Im}\,\delta} \qquad [5] \]

There is no reason a priori why the $\delta$-symmetry should be a scalar Grassmannian symmetry, but in all known models of Witten-type TQFTs this turns out to be the case. Thus, these theories violate the spin-statistics theorem. In all these models the algebra of the $\delta$ symmetry has the form

\[ \delta^2 = Z \qquad [6] \]

where $Z$ is a symmetry transformation (typically a gauge symmetry of some sort). This property forces one to consider $Z$-invariant observables and to work in the context of "equivariant cohomology."

The observables of Witten-type theories fit into a general pattern that we describe now. The key ingredient is a map between the homology of $X$ and the equivariant cohomology of $\delta$. Given an operator $\phi^{(0)}$ in the equivariant cohomology of $\delta$, let us consider the following set of equations:

\[ d\phi^{(n)} = \delta\phi^{(n+1)}, \qquad n \geq 0 \qquad [7] \]

where the operators $\phi^{(n)}$ ($n = 1, \ldots, \dim X$) are differential forms of degree $n$ on $X$ and $d$ is the de Rham differential. These differential equations are called "descent equations" and their solutions $\phi^{(n)}$ ($n \geq 0$) "topological descendants" of $\phi^{(0)}$. We will show how to construct a solution to these equations on general grounds.

The topological descendants lead to the construction of a set of elements of the equivariant cohomology of $\delta$. Let $\gamma_n$ be an $n$-cycle on $X$, $\gamma_n \in H_n(X)$, and let us consider the following operator:

\[ W^{(\gamma_n)}_{\phi^{(0)}} = \int_{\gamma_n} \phi^{(n)} \qquad [8] \]

This operator is $\delta$-invariant,

\[ \delta W^{(\gamma_n)}_{\phi^{(0)}} = \int_{\gamma_n} \delta\phi^{(n)} = \int_{\gamma_n} d\phi^{(n-1)} = \int_{\partial\gamma_n} \phi^{(n-1)} = 0 \qquad [9] \]

since $\partial\gamma_n = 0$. On the other hand, if $\gamma_n$ were trivial in homology, that is, if $\gamma_n = \partial\Gamma_{n+1}$, we would have that $W^{(\gamma_n)}_{\phi^{(0)}}$ is $\delta$-exact:

\[ W^{(\gamma_n)}_{\phi^{(0)}} = \int_{\partial\Gamma_{n+1}} \phi^{(n)} = \int_{\Gamma_{n+1}} d\phi^{(n)} = \delta \int_{\Gamma_{n+1}} \phi^{(n+1)} \qquad [10] \]

Thus, given the operator $\phi^{(0)}$, we have constructed a map between the homology of $X$ and the equivariant cohomology of $\delta$. There are as many maps as basic operators $\phi^{(0)}$ one finds in the theory.

To actually construct these maps, we need to find a solution of the descent equations [7]. As announced before, there is a general solution to those equations in Witten-type theories. Since in this type of theories [2] holds, there exists an operator

\[ G_\mu \equiv G_{0\mu} \qquad [11] \]

that satisfies

\[ P_\mu = T_{0\mu} = -i\,\delta G_\mu \qquad [12] \]

Notice that $G_\mu$ is an anticommuting operator and a 1-form in spacetime. With the aid of this operator, one constructs the following solution to the descent equations [7]:

\[ \phi^{(n)} = \frac{1}{n!}\, \phi^{(n)}_{\mu_1 \mu_2 \ldots \mu_n}\, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_n} \qquad [13] \]

where

\[ \phi^{(n)}_{\mu_1 \mu_2 \ldots \mu_n}(x) = G_{\mu_1} G_{\mu_2} \cdots G_{\mu_n}\, \phi^{(0)}(x), \qquad n = 1, \ldots, \dim X \qquad [14] \]
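The two lowest descent equations can be verified by hand from [12] to [14]. The following is a sketch under one stated assumption that is not spelled out in the text: that $P_\mu$ acts on local operators as $-i\partial_\mu$, so that [12] can be read as the operator statement $\delta G_\mu + G_\mu \delta = \partial_\mu$. It also uses that $G_\mu$ commutes with $\partial_\nu$ and that $\delta\phi^{(0)} = 0$.

```latex
% Assumption (not stated explicitly in the text): P_\mu = -i\partial_\mu on
% local operators, so that [12] reads \delta G_\mu + G_\mu\delta = \partial_\mu.
\begin{align*}
\delta\phi^{(1)}_{\mu}
  &= \delta\bigl(G_\mu \phi^{(0)}\bigr)
   = (\delta G_\mu + G_\mu\delta)\,\phi^{(0)} - G_\mu\,\delta\phi^{(0)}
   = \partial_\mu \phi^{(0)}, \\
\delta\phi^{(2)}_{\mu\nu}
  &= \delta\bigl(G_\mu G_\nu \phi^{(0)}\bigr)
   = \partial_\mu\bigl(G_\nu\phi^{(0)}\bigr)
     - G_\mu\,\delta\bigl(G_\nu\phi^{(0)}\bigr)
   = \partial_\mu\phi^{(1)}_{\nu} - \partial_\nu\phi^{(1)}_{\mu}.
\end{align*}
% Contracting with the form basis as in [13]:
%   \delta\phi^{(1)} = d\phi^{(0)}, \qquad \delta\phi^{(2)} = d\phi^{(1)},
% which are the n = 0 and n = 1 cases of the descent equations [7].
```

The same manipulation, moving $\delta$ through one $G_\mu$ at a time, extends to higher $n$.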
One can easily check using [12] and the $\delta$-invariance of $\phi^{(0)}$ that the operators [13] do satisfy the descent equations [7].

We have seen that Witten-type TQFTs are characterized by property [2]. It would be desirable to have at hand a systematic procedure to build theories satisfying that property. It has been found that extended supersymmetry provides a very helpful starting point to build those theories. Although supersymmetry guarantees from first principles only the weaker condition [12] instead of [2], all TQFTs that have been constructed from extended supersymmetry actually satisfy [2]. To build a TQFT from a theory with extended supersymmetry, one needs to go through the twisting procedure that we now describe.

Twisting of Extended Supersymmetry

All known Witten-type theories are related to an underlying extended supersymmetric quantum field theory. The topological theory is a modified version of the supersymmetric theory in which the Lorentz transformation properties (spins) of some of the fields have been modified. This modification of spin assignments is known as twisting, and it can be carried out on any theory with extended supersymmetry in any spacetime dimension. We will not consider the procedure in such a general setting but instead we will illustrate it by considering the case of N = 2 supersymmetry in four dimensions. We will begin with a general description and then we will apply it to a specific example: Donaldson–Witten theory.

Let us consider the Euclidean version of the N = 2 supersymmetry algebra with no central charges. Central charges can be included without difficulty, but we will not consider them for simplicity. The total symmetry group of the theory is $H = SU(2)_+ \otimes SU(2)_- \otimes SU(2)_R \otimes U(1)_R$, $K = SU(2)_+ \otimes SU(2)_-$ being the rotation group, and $SU(2)_R \otimes U(1)_R$ the internal symmetry group of the N = 2 supersymmetry algebra. The generator algebra takes the following form:

\[
\begin{aligned}
\{Q_{\alpha v}, \bar{Q}_{\dot\beta w}\} &= 2\,\epsilon_{vw}\,\sigma^{\mu}_{\alpha\dot\beta} P_\mu, &
\{Q_{\alpha v}, Q_{\beta w}\} &= 0, \\
[P_\mu, Q_{\alpha v}] &= 0, &
[P_\mu, \bar{Q}_{\dot\alpha v}] &= 0, \\
[M_{\alpha\beta}, Q_{\gamma v}] &= \epsilon_{\gamma(\alpha} Q_{\beta) v}, &
[M_{\alpha\beta}, \bar{Q}_{\dot\gamma v}] &= 0, \\
[M_{\dot\alpha\dot\beta}, Q_{\gamma v}] &= 0, &
[M_{\dot\alpha\dot\beta}, \bar{Q}_{\dot\gamma v}] &= \epsilon_{\dot\gamma(\dot\alpha} \bar{Q}_{\dot\beta) v}, \\
[B_{vw}, Q_{\alpha u}] &= \epsilon_{u(v} Q_{\alpha w)}, &
[B_{vw}, \bar{Q}_{\dot\alpha}{}^{u}] &= -\delta^{u}{}_{(v} \bar{Q}_{\dot\alpha w)}, \\
[Q_{\alpha v}, R] &= Q_{\alpha v}, &
[\bar{Q}_{\dot\alpha v}, R] &= -\bar{Q}_{\dot\alpha v}
\end{aligned} \qquad [15]
\]

In these relations $v, w \in \{1, 2\}$ are $SU(2)_R$ indices and $\alpha$ and $\dot\alpha$ denote spinorial indices of $SU(2)_-$ and $SU(2)_+$, respectively. The supersymmetry generators $Q_{\alpha v}$ and $\bar{Q}_{\dot\alpha v}$ transform under $H$ as $(0, 2, 2)^{1}$ and $(2, 0, 2)^{-1}$, respectively. $M_{\dot\alpha\dot\beta}$ and $M_{\alpha\beta}$ are the generators of $SU(2)_+$ and $SU(2)_-$, respectively, while $B_{vw}$ and $R$ generate $SU(2)_R$ and $U(1)_R$, respectively.

The twisting of a supersymmetric theory involves a modification of the couplings of the theory to a background metric on the space where the theory is defined. This modification is carried out by redefining the Lorentz transformation properties of the different fields, making use of the internal symmetry $SU(2)_R$. In particular, we will redefine the couplings of the fields to the $SU(2)_+$ spin connection according to the way they transform under $SU(2)_R$. This is easily done by identifying the $SU(2)_R$ indices $v$ with the $SU(2)_+$ indices $\dot\alpha$. The procedure involves a redefinition of the rotation group into $K' = SU'(2)_+ \otimes SU(2)_-$, where $SU'(2)_+$ is generated by

\[ M'_{\dot\alpha\dot\beta} = M_{\dot\alpha\dot\beta} - B_{\dot\alpha\dot\beta} \qquad [16] \]

The supersymmetry generators $Q_{\alpha v}$ and $\bar{Q}_{\dot\alpha v}$ get transformed in the following way:

\[ \bar{Q}_{\dot\alpha v} \to \bar{Q}_{\dot\alpha\dot\beta}, \qquad Q_{\alpha v} \to Q_{\alpha\dot\beta} \qquad [17] \]

which allows us to define the "topological supercharge":

\[ Q \equiv \epsilon^{\dot\alpha\dot\beta}\, \bar{Q}_{\dot\alpha\dot\beta} \qquad [18] \]

It is simple to prove using [15] and [16] that this quantity is a scalar under the new rotation group $K'$: $[M_{\alpha\beta}, Q] = 0$ and $[M'_{\dot\alpha\dot\beta}, Q] = 0$. In addition, from [15], it follows that $Q$ is nilpotent (in the absence of central charges):

\[ Q^2 = 0 \qquad [19] \]

The scalar generator $Q$ leads to the topological symmetry $\delta$ of the previous section. The twisting procedure also provides the operator $G_\mu$ in [12]. Defining

\[ G_\mu = \frac{i}{4}\, (\bar\sigma_\mu)^{\dot\alpha\gamma}\, Q_{\gamma\dot\alpha} \qquad [20] \]

one easily finds, after using [15] and [18],

\[ \{Q, G_\mu\} = \partial_\mu \qquad [21] \]

which is indeed equivalent to [12]. On general grounds we cannot prove that twisted supersymmetric theories lead to theories which satisfy [2]. However, relation [12], which is weaker, is guaranteed. It turns out that in all the models originating from extended supersymmetry which have been studied, [2] is satisfied, and thus the resulting theories are TQFTs of Witten type.
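The effect of identifying the $SU(2)_R$ index with the $SU(2)_+$ index can be made concrete with a little representation bookkeeping. The sketch below is illustrative only: the function names `tensor_product` and `twist` are hypothetical, representations are labelled by their dimensions in the order $(SU(2)_+, SU(2)_-, SU(2)_R)$, and a singlet is written as 1 (the article writes it as 0). It assumes that $SU'(2)_+$ acts as the diagonal combination of $SU(2)_+$ and $SU(2)_R$, which is how [16] is read here.

```python
from collections import Counter


def tensor_product(dim_a, dim_b):
    """Dimensions of the SU(2) irreps contained in dim_a (x) dim_b.

    Clebsch-Gordan rule in terms of dimensions:
    |dim_a - dim_b| + 1, ..., dim_a + dim_b - 1 in steps of 2.
    """
    lowest = abs(dim_a - dim_b) + 1
    highest = dim_a + dim_b - 1
    return list(range(lowest, highest + 1, 2))


def twist(rep):
    """Decompose (d_plus, d_minus, d_R) under K' = SU'(2)_+ x SU(2)_-.

    SU'(2)_+ is taken to be the diagonal of SU(2)_+ and SU(2)_R, so the
    d_plus and d_R factors are tensored together; d_minus is untouched.
    """
    d_plus, d_minus, d_r = rep
    return Counter((d, d_minus) for d in tensor_product(d_plus, d_r))


if __name__ == "__main__":
    # Qbar carries one SU(2)_+ index and one SU(2)_R index.
    print(twist((2, 1, 2)))
    # Q carries one SU(2)_- index and one SU(2)_R index.
    print(twist((1, 2, 2)))
```

The first decomposition contains a $(1, 1)$ component, the scalar that becomes the topological supercharge $Q$ of [18], together with a $(3, 1)$ component. The second gives a single $(2, 2)$, that is, a vector, in line with the 1-form operator $G_\mu$ of [20].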
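Donaldson–Witten Theory

One of the greatest successes of TQFT has been the discovery of Seiberg–Witten invariants as building blocks of Donaldson invariants. This was achieved in two main steps. First, Donaldson theory was reformulated in field-theoretical terms, using perturbative methods. Second, the resulting TQFT was solved using nonperturbative methods. In this section we are going to describe the first step in some detail. The second one will be briefly addressed later and is the main object of a separate article in the encyclopedia (see Seiberg–Witten Theory).

Let us consider N = 2 supersymmetric Yang–Mills theory in four dimensions. The field content of the theory is the following: a gauge field $A_\mu$, two spinors $\lambda_{\alpha v}$, and a complex scalar $\phi$, all of them in the adjoint representation of a gauge group $G$. In addition, the theory possesses the auxiliary fields $D_{vw}$ in the 3 of the internal $SU(2)_R$. The theory has the following action:

\[
\begin{aligned}
S = \int d^4x\, \operatorname{tr} \Big\{ & \nabla_\mu \phi^\dagger \nabla^\mu \phi
 - i \lambda^{\alpha v} \sigma^{\mu}_{\alpha\dot\alpha} \nabla_\mu \bar\lambda_{v}{}^{\dot\alpha}
 - \frac{1}{4} F_{\mu\nu} F^{\mu\nu}
 + \frac{1}{4} D_{vw} D^{vw}
 - \frac{1}{2} [\phi, \phi^\dagger]^2 \\
 & - \frac{i}{\sqrt{2}}\, \epsilon^{vw} \lambda_{v}{}^{\alpha} [\phi^\dagger, \lambda_{\alpha w}]
 - \frac{i}{\sqrt{2}}\, \epsilon_{vw} \bar\lambda^{v}{}_{\dot\alpha} [\bar\lambda^{w\dot\alpha}, \phi] \Big\}
\end{aligned} \qquad [22]
\]

This action is invariant under the following N = 2 supersymmetric transformations:

\[
\begin{aligned}
\delta\phi &= \sqrt{2}\, \epsilon^{vw} \eta_{v}{}^{\alpha} \lambda_{\alpha w} \\
\delta A_\mu &= i \eta_{v}{}^{\alpha} \sigma_{\mu\alpha\dot\alpha} \bar\lambda^{v\dot\alpha}
 - i \lambda_{v}{}^{\alpha} \sigma_{\mu\alpha\dot\alpha} \bar\eta^{v\dot\alpha} \\
\delta\lambda_{\alpha v} &= D_{vw} \eta_{\alpha}{}^{w} - i \eta_{\alpha v} [\phi, \phi^\dagger]
 - i \sigma^{\mu\nu}{}_{\alpha}{}^{\beta} \eta_{\beta v} F_{\mu\nu}
 + i \sqrt{2}\, \epsilon_{vw} \sigma^{\mu}_{\alpha\dot\alpha} \bar\eta^{w\dot\alpha} \nabla_\mu \phi \\
\delta D_{vw} &= 2i\, \bar\eta_{(v\dot\alpha} \bar\sigma^{\mu\dot\alpha\alpha} \nabla_\mu \lambda_{w)\alpha}
 + 2i\, \nabla_\mu \bar\lambda_{(v\dot\alpha} \bar\sigma^{\mu\dot\alpha\alpha} \eta_{w)\alpha} \\
 &\quad + 2i\sqrt{2}\, \eta_{(v}{}^{\alpha} [\lambda_{w)\alpha}, \phi^\dagger]
 + 2i\sqrt{2}\, [\bar\lambda_{(v\dot\alpha}, \phi]\, \bar\eta_{w)}{}^{\dot\alpha}
\end{aligned} \qquad [23]
\]

$\eta_{\alpha v}$ being spinorial N = 2 supersymmetric parameters. We can now twist the above theory following the procedure explained in the previous section. Upon twisting, the fields of the theory change their spin content as follows:

\[
\begin{array}{lcl}
A_\mu \; (2, 2, 0)^{0} & \to & A_\mu \; (2, 2)^{0} \\
\lambda_{\alpha v} \; (2, 0, 2)^{1} & \to & \psi_{\alpha\dot\beta} \; (2, 2)^{1} \\
\bar\lambda_{\dot\alpha v} \; (0, 2, 2)^{-1} & \to & \eta \; (0, 0)^{-1}, \quad \chi_{\dot\alpha\dot\beta} \; (3, 0)^{-1} \\
\phi \; (0, 0, 0)^{2} & \to & \phi \; (0, 0)^{2} \\
\phi^\dagger \; (0, 0, 0)^{-2} & \to & \phi^\dagger \; (0, 0)^{-2} \\
D_{vw} \; (0, 0, 3)^{0} & \to & D_{\dot\alpha\dot\beta} \; (3, 0)^{0}
\end{array} \qquad [24]
\]

In this table the representations of the respective rotation groups carried by the fields have been indicated. The superindices refer to the $U(1)_R$ charge, which is also called "ghost number" in the context of TQFT. The fields $\eta$ and $\chi_{\dot\alpha\dot\beta}$ are given by the antisymmetric and symmetric pieces of $\bar\lambda_{\dot\alpha\dot\beta}$: $\chi_{\dot\alpha\dot\beta} = \bar\lambda_{(\dot\alpha\dot\beta)}$ and $\eta = (1/2)\, \epsilon^{\dot\alpha\dot\beta} \bar\lambda_{\dot\alpha\dot\beta}$.

Notice that the twisted fields in [24] are differential forms on $X$; therefore, the twisted theory makes sense globally on any Riemannian 4-manifold. This is not the case with the original N = 2 supersymmetric Yang–Mills theory, which contains fermionic fields. Making global sense of those on arbitrary Riemannian 4-manifolds requires the manifold to be Spin.

The dynamics of the twisted theory is governed by an action which can be obtained by twisting the action [22]. On an arbitrary Riemannian 4-manifold endowed with a metric $g_{\mu\nu}$, the twisted action becomes

\[
\begin{aligned}
S = \int d^4x\, \sqrt{g}\, \operatorname{tr} \Big\{ & \nabla_\mu \phi \nabla^\mu \phi^\dagger
 - i \psi^{\dot\beta}{}_{\alpha} \nabla^{\dot\alpha\alpha} \chi_{\dot\alpha\dot\beta}
 - i \psi_{\dot\alpha\alpha} \nabla^{\dot\alpha\alpha} \eta
 - \frac{1}{4} F_{\mu\nu} F^{\mu\nu}
 + \frac{1}{4} D_{\dot\alpha\dot\beta} D^{\dot\alpha\dot\beta} \\
 & - \frac{1}{2} [\phi, \phi^\dagger]^2
 - \frac{i}{\sqrt{2}}\, \chi^{\dot\alpha\dot\beta} [\phi, \chi_{\dot\alpha\dot\beta}]
 + i \sqrt{2}\, \eta [\phi, \eta]
 - \frac{i}{\sqrt{2}}\, \psi_{\dot\alpha\alpha} [\psi^{\dot\alpha\alpha}, \phi^\dagger] \Big\}
\end{aligned} \qquad [25]
\]

where $\sqrt{g} = (\det(g_{\mu\nu}))^{1/2}$.

To obtain the transformations of the fields under the topological symmetry, we need to compute the $Q$-transformations. These are easily obtained using [18] and [23]. They turn out to be

\[
\begin{aligned}
[Q, \phi] &= 0 \\
[Q, A_\mu] &= \psi_\mu \\
\{Q, \eta\} &= [\phi, \phi^\dagger] \\
\{Q, \psi_\mu\} &= 2\sqrt{2}\, \nabla_\mu \phi \\
[Q, \phi^\dagger] &= 2\sqrt{2}\, i\, \eta \\
\{Q, \chi_{\dot\alpha\dot\beta}\} &= i ( F^{+}_{\dot\alpha\dot\beta} - D_{\dot\alpha\dot\beta} ) \\
[Q, D] &= (2 \nabla \psi)^{+} + 2\sqrt{2}\, [\phi, \chi]
\end{aligned} \qquad [26]
\]

where $\psi_\mu = \sigma_{\mu\alpha\dot\alpha} \psi^{\dot\alpha\alpha}$ and $F^{+}_{\dot\alpha\dot\beta} = \bar\sigma^{\mu\nu}_{\dot\alpha\dot\beta} F_{\mu\nu}$ is the self-dual part of $F_{\mu\nu}$. Using these transformations, one easily finds that $Q^2$ is a gauge transformation. This is not unexpected since the N = 2 supersymmetric transformations [23] are in the Wess–Zumino gauge and they close only up to gauge transformations. This property implies that one must consider the equivariant cohomology of $Q$ defined on the set of gauge-invariant operators.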
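The statement that $Q^2$ is a gauge transformation can be illustrated on the fields $A_\mu$, $\phi$ and $\psi_\mu$, using only the first, second and fourth relations in [26]. The sketch below rests on conventions that are assumptions rather than statements from the text: the covariant derivative is taken as $\nabla_\mu \phi = \partial_\mu \phi + [A_\mu, \phi]$, and an infinitesimal gauge transformation with parameter $\varepsilon$ is taken to act as $\delta_\varepsilon A_\mu = \nabla_\mu \varepsilon$ on the connection and $\delta_\varepsilon X = [X, \varepsilon]$ on adjoint fields.

```latex
% Relations used from [26]:
%   [Q,\phi] = 0, \quad [Q,A_\mu] = \psi_\mu, \quad
%   \{Q,\psi_\mu\} = 2\sqrt{2}\,\nabla_\mu\phi .
% Assumed conventions: \nabla_\mu\phi = \partial_\mu\phi + [A_\mu,\phi],
%   \delta_\varepsilon A_\mu = \nabla_\mu\varepsilon, \quad
%   \delta_\varepsilon X = [X,\varepsilon] \text{ for adjoint } X.
\begin{align*}
[Q^2, A_\mu] &= \{Q, [Q, A_\mu]\} = \{Q, \psi_\mu\}
              = 2\sqrt{2}\,\nabla_\mu\phi, \\
[Q^2, \phi]  &= [Q, [Q, \phi]] = 0, \\
[Q^2, \psi_\mu] &= [Q, \{Q, \psi_\mu\}]
              = 2\sqrt{2}\,\bigl(\partial_\mu [Q,\phi]
                + [[Q, A_\mu], \phi] + [A_\mu, [Q,\phi]]\bigr)
              = 2\sqrt{2}\,[\psi_\mu, \phi].
\end{align*}
% With \varepsilon = 2\sqrt{2}\,\phi these are
%   \delta_\varepsilon A_\mu, \quad \delta_\varepsilon\phi = 0, \quad
%   \delta_\varepsilon\psi_\mu,
% i.e. a gauge transformation whose parameter is proportional to \phi.
```

On these three fields, then, $Q^2$ acts as a gauge transformation with parameter proportional to $\phi$, which is why $Q$ is nilpotent only on gauge-invariant operators.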
The action [25] is $Q$-exact up to a topological term:

\[ S = \{Q, V\} - \frac{1}{2} \int F \wedge F \qquad [27] \]

where

\[ V = \int d^4x\, \sqrt{g}\, \operatorname{tr} \Big\{ \frac{i}{4}\, \chi_{\dot\alpha\dot\beta} ( F^{\dot\alpha\dot\beta} + D^{\dot\alpha\dot\beta} ) - \frac{1}{2}\, \eta [\phi, \phi^\dagger] + \frac{1}{2\sqrt{2}}\, \psi_{\alpha\dot\alpha} \nabla^{\dot\alpha\alpha} \phi^\dagger \Big\} \qquad [28] \]

In fact, in all the theories obtained after twisting extended supersymmetry, the resulting actions are $Q$-exact up to topological terms. In the case of N = 2 theories, topological (theta) terms $\int F \wedge F$ are generically not observable (due to a chiral anomaly), so it is customary to pick

\[ S_{DW} = \{Q, V\} \qquad [29] \]

as the action of the theory, which immediately implies [2] and therefore the topological character of the theory. Notice, however, that [29] is stronger than [2].

As we described in the previous section, the observables of the theory can be constructed using the operator $G_\mu$ in [20]. Its action on the twisted fields is easily obtained using [23]:

\[
\begin{aligned}
[G_\mu, \phi] &= \frac{1}{2\sqrt{2}}\, \psi_\mu \\
[G_\nu, A_\mu] &= \frac{i}{2}\, g_{\mu\nu}\, \eta - i \chi_{\mu\nu} \\
[G, \eta] &= -\frac{i\sqrt{2}}{4}\, \nabla \phi^\dagger \\
\{G_\mu, \psi_\nu\} &= -( F^{-}_{\mu\nu} + D^{+}_{\mu\nu} ) \\
[G, \phi^\dagger] &= 0 \\
[G, F^{+}] &= i \nabla \chi + \frac{3i}{2} * \nabla \eta \\
\{G, \chi\} &= -\frac{3i\sqrt{2}}{8} * \nabla \phi^\dagger \\
[G, D] &= -\frac{3i}{4} * \nabla \eta + \frac{3i}{2}\, \nabla \chi
\end{aligned} \qquad [30]
\]

We now need to fix the basic operator $\phi^{(0)}$ in [14]. The starting point must be a set of gauge-invariant, $Q$-closed operators which are not $Q$-trivial. Since $[Q, \phi] = 0$, these operators are the gauge-invariant polynomials in the field $\phi$. For a simple gauge group of rank $r$ the algebra of these polynomials is generated by $r$ elements, and we shall denote this basis by $\mathcal{O}_n$, $n = 1, \ldots, r$. A simple choice for $SU(N)$ consists of the following Casimirs:

\[ \mathcal{O}_n = \operatorname{tr}(\phi^{n+1}), \qquad n = 1, \ldots, N - 1 \qquad [31] \]

Using $G_\mu$ we can now construct the map between the homology of $X$ and the equivariant cohomology of $Q$. Let us consider the simple case $SU(2)$. There exists only one independent Casimir and, correspondingly, only one basic operator:

\[ \mathcal{O} = \operatorname{tr}(\phi^2) \qquad [32] \]

for which one finds the following set of descendants:

\[
\begin{aligned}
\mathcal{O}^{(1)} &= \operatorname{tr}\Big( \frac{1}{\sqrt{2}}\, \phi\, \psi_\mu \Big)\, dx^\mu \\
\mathcal{O}^{(2)} &= -\frac{1}{2} \operatorname{tr}\Big( \frac{1}{\sqrt{2}}\, \phi\, ( F^{-}_{\mu\nu} + D^{+}_{\mu\nu} ) - \frac{1}{4}\, \psi_\mu \psi_\nu \Big)\, dx^\mu \wedge dx^\nu \\
&\;\;\vdots
\end{aligned} \qquad [33]
\]

The map from the homology of $X$ to the equivariant cohomology of $Q$ can now be constructed very easily. Let $\gamma_i$ be an element of the homology group $H_i(X)$. We associate to it the following observable:

\[ \gamma_i \to I_i(\gamma_i) = \int_{\gamma_i} \mathcal{O}^{(i)} \qquad [34] \]

where $\mathcal{O}^{(i)}$ is given in [33]. The construction assures that $I_i(\gamma_i)$ is invariant under $Q$ and gauge transformations. Furthermore, it is also assured that $I_i(\gamma_i)$ is not $Q$-exact.

Let us consider the computation of correlation functions. The discussion will be presented for a generic gauge group. We will consider the topological theory defined by the Donaldson–Witten action

\[ S_{DW} = \{Q, V\} \qquad [35] \]

where $V$ is defined in [28]. The property [35] has a very important consequence. The action $S_{DW}$ shows up in the correlation functions as $\exp(-S_{DW}/e^2)$, where $e$ is a free parameter which corresponds to the coupling constant of the N = 2 theory. Since the term involving the coupling constant is $Q$-exact, the correlation functions of $Q$-invariant operators are independent of $e$. Let us explain this in some detail. The (unnormalized) correlation functions of the theory are defined by

\[ \langle \phi_1 \cdots \phi_n \rangle = \int \mathcal{D}\phi\; \phi_1 \cdots \phi_n\, e^{-(1/e^2) S_{DW}} \qquad [36] \]

where $\phi_1, \ldots, \phi_n$ are invariant under $Q$ transformations. Using the fact that $S_{DW}$ is $Q$-exact, one obtains

\[ \frac{\partial}{\partial e} \langle \phi_1 \cdots \phi_n \rangle = \frac{2}{e^3} \langle \phi_1 \cdots \phi_n S_{DW} \rangle = \frac{2}{e^3} \langle \{Q, \phi_1 \cdots \phi_n V\} \rangle = 0 \qquad [37] \]
where we have used the fact that $Q$ is a symmetry of the theory, and therefore as in [3] the last functional integral gives zero. This result implies that one can compute these correlation functions in different limits of $e$. In the weak-coupling limit (semiclassical or saddle point approximation), one establishes the connection with Donaldson theory. In the strong-coupling limit, Seiberg–Witten invariants appear and one finds the connection between these two types of invariants. We will briefly explore the weak-coupling limit $e \to 0$. The functional integral [36] can be evaluated exactly in two steps: first one analyzes the zero modes or classical configurations that minimize the action, then one expands around them considering only quadratic fluctuations. The integration over these quadratic fluctuations involves ratios of determinants of kinetic operators that, because of the $Q$-symmetry of the theory (which in fact is a Bose–Fermi symmetry), are $\pm 1$. One is then left with an integral over the bosonic zero modes, which leads to a finite-dimensional integral over the space of bosonic collective coordinates, and a finite Grassmannian integral over the zero modes of the fermionic fields. A careful analysis of the zero modes, first carried out by Witten, reveals that the infinite-dimensional functional integral is replaced by a finite-dimensional integral over the moduli space of anti-self-dual (ASD) connections $\mathcal{M}_{ASD}$, that is, the space of connections satisfying $F^+ = 0$. Therefore, the correlation functions [36] have the form

\[ \langle \phi_1 \cdots \phi_n \rangle = \int_{\mathcal{M}_{ASD}} \hat\phi_1 \wedge \cdots \wedge \hat\phi_n \qquad [38] \]

where the fields in $\phi_1 \cdots \phi_n$ are mapped to differential forms $\hat\phi_1 \cdots \hat\phi_n$ on $\mathcal{M}_{ASD}$, the degree of each form being given by the ghost number of its partner. Notice that the integral on the right-hand side vanishes unless the form has top degree. From the field-theoretical point of view, this is the requirement that the overall ghost number of the correlation function must be equal to $\dim \mathcal{M}_{ASD}$. The quantities on the right-hand side of [38] are, for gauge group $SU(2)$, precisely the Donaldson invariants. Thus, Witten's work provided a new point of view on these invariants by reformulating them in a quantum field theory language. This is a very important contribution since quantum field theory is a very rich framework and a wide variety of methods can be used to analyze the correlation functions. This opened an entirely new strategy to investigate the Donaldson invariants. The emergence of Seiberg–Witten invariants is perhaps the greatest achievement of the implementation of this strategy.

We finish this section by pointing out that many features of the evaluation of the functional integral of the Donaldson–Witten theory developed here are common to most topological field theories of the Witten type. These features can be studied in the context of the Mathai–Quillen formalism, which is the object of a separate article in the encyclopedia (see Mathai–Quillen Formalism).

Chern–Simons Gauge Theory for Knots and Links

Chern–Simons gauge theory is the most important example of Schwarz-type TQFTs. Let us begin by introducing its basic elements. Chern–Simons gauge theory is a quantum field theory whose action is based on the Chern–Simons form associated to a nonabelian gauge group. The theory is defined by the following data: a smooth 3-manifold $M$ which will be taken to be compact, a gauge group $G$ which will be taken semisimple and compact, and an integer parameter $k$. The action of the theory is

\[ S_{CS}(A) = \frac{k}{4\pi} \int_M \operatorname{tr}\Big( A \wedge dA + \frac{2}{3}\, A \wedge A \wedge A \Big) \qquad [39] \]

where $A$ is a gauge connection and the trace is taken in the fundamental representation. The exponential of $i$ times this action is invariant under gauge transformations,

\[ A \to g^{-1} A g + g^{-1} dg \qquad [40] \]

where $g$ is a map $g : M \to G$.

Notice that the action [39] is independent of the metric on the 3-manifold $M$. In this theory, appropriate observables lead to correlation functions which correspond to topological invariants. Candidates to be observables of this type must be metric independent and gauge invariant. Wilson loops satisfy these properties. They correspond to the holonomy of the gauge connection $A$ along a loop. Given a representation $R$ of the gauge group $G$ and a 1-cycle $\gamma$ on $M$, the Wilson loop is defined as

\[ W^{R}_{\gamma}(A) = \operatorname{tr}_R \operatorname{Hol}_\gamma(A) = \operatorname{tr}_R\, P \exp \oint_\gamma A \qquad [41] \]

Products of these operators are the natural candidates to obtain topological invariants after computing their correlation functions. These correlation functions are formally written as

\[ \langle W^{R_1}_{\gamma_1} W^{R_2}_{\gamma_2} \cdots W^{R_n}_{\gamma_n} \rangle = \int [DA]\; W^{R_1}_{\gamma_1}(A)\, W^{R_2}_{\gamma_2}(A) \cdots W^{R_n}_{\gamma_n}(A)\, e^{i S_{CS}(A)} \qquad [42] \]
where $\gamma_1, \gamma_2, \ldots, \gamma_n$ are 1-cycles on $M$ and $R_1, R_2, \ldots, R_n$ are representations of $G$. In [42], the quantity $[DA]$ denotes the functional integral measure and it is assumed that an integration over connections modulo gauge transformations is carried out. As usual in quantum field theory, this integration is not well defined. Field theorists have developed methods to assign a meaning to the right-hand side of [42]. These methods mainly fall into two categories, perturbative and nonperturbative, and their degree of success mostly depends on the quantum field theory under consideration. For gauge theories, it is also possible to take an alternative approach, the large-N expansion, which in general provides further insights into the theory. In Chern–Simons gauge theory all three of these methods have proved of great value.

Witten (1989) showed, using nonperturbative methods, that when one considers nonintersecting cycles $\gamma_1, \gamma_2, \ldots, \gamma_n$ without self-intersections, the correlation functions [42] lead to the polynomial invariants of knot theory discovered a few years earlier, starting with the work of Jones (1985).

Knot theory studies embeddings $\gamma : S^1 \to M$. Any two such embeddings are considered equivalent if the image of one of them can be deformed into the image of the other by a homeomorphism on $M$. The main goal of knot theory is to classify the resulting equivalence classes. Each of these classes is a knot. Most of the work on knot theory has been carried out for the simple case $M = S^3$. Chern–Simons gauge theory, however, being an intrinsically three-dimensional formulation, provides a framework to study the case of more general 3-manifolds $M$.

A powerful approach to classifying knots is based on the construction of knot invariants. These are quantities which can be computed for a representative of a class and are invariant within the class, that is, under continuous deformations of the chosen representative. At present, it is not known if there exist enough knot invariants to classify knots. Vassiliev invariants (Vassiliev 1990) are the most promising candidates, but it is already known that if they do provide such a classification, infinitely many of them are needed.

The problem of the classification of knots in $S^3$ can be reformulated in a two-dimensional framework using regular knot projections. Given a representative of a knot in $S^3$, deform it continuously in such a way that the projection on a plane has simple crossings. Draw the projection on the plane, and at each crossing use the convention that the line that goes under the crossing is erased in a neighborhood of the crossing. The resulting diagram is a set of segments on the plane, containing the relevant information at the crossings. The problem of classifying knots is equivalent to the problem of classifying knot projections modulo a series of relations among them. These relations are known as Reidemeister moves. Invariance of a quantity under the three Reidemeister moves is called invariance under ambient isotopy. If a quantity is invariant under all but the first move, it is said to possess invariance under regular isotopy.

The formalism described for knots generalizes to the case of links. For a link of $n$ components, one considers $n$ embeddings, $\gamma_i : S^1 \to M$ ($i = 1, \ldots, n$), with no intersections among them. Again, the main problem that link theory faces is the problem of their classification modulo homeomorphisms on $M$. In this case one can also define regular projections and reformulate the problem in terms of their classification modulo the Reidemeister moves.

The study of knot and link invariants experienced important progress in the 1980s. Jones (1985) discovered a new invariant which carries his name. The Jones polynomial can be defined very simply in terms of skein relations. These are a set of rules that can be applied to the diagram of a regular knot projection to construct the polynomial invariant. They establish a relation between the invariants associated to three links which differ only in a region as shown in Figure 1, where arrows have been introduced to take into account that the Jones polynomial is defined for oriented links.

Figure 1 Skein relations. (Panels: $L_+$, $L_-$, $L_0$.)

If one denotes by $V_L(t)$ the Jones polynomial corresponding to a link $L$, $t$ being the argument of the polynomial, it must satisfy the skein relation:

\[ \frac{1}{t}\, V_{L_+} - t\, V_{L_-} = \Big( \sqrt{t} - \frac{1}{\sqrt{t}} \Big) V_{L_0} \qquad [43] \]

where $L_+$, $L_-$, and $L_0$ are the links shown in Figure 1. This relation plus a choice of normalization for the unknot ($U$) are enough to compute the Jones polynomial for any link. The standard choice for the unknot is

\[ V_U = 1 \qquad [44] \]

though it is not the most natural one from the point of view of Chern–Simons gauge theory. After Jones

work in 1984, many other polynomial invariants were discovered, as the HOMFLY and the Kauffman polynomial invariants.

The pioneering work of Witten in 1988 showed that the correlation functions of products of Wilson loops [42] correspond to the Jones polynomial when one considers SU(2) as gauge group and all the Wilson loops entering in the correlation function are taken in the fundamental representation F. For example, if one considers a knot K, Witten showed that

$$V_K(t) = \left\langle W_K^F \right\rangle \qquad [45]$$

provided that one performs the identification

$$t = \exp\left(\frac{2\pi i}{k+h}\right) \qquad [46]$$

where h = 2 is the dual Coxeter number of the gauge group SU(2). Witten also showed that if instead of SU(2) one considers SU(N) and the Wilson loop carries the fundamental representation, the resulting invariant is the HOMFLY polynomial. The second variable of this polynomial originates in this context from the N dependence. However, these cases are just a sample of the general framework intrinsic to Chern–Simons gauge theory. Taking other groups and other representations, one possesses an enormous set of knot and link invariants. These invariants can also be obtained in the context of quantum groups.

Many nonperturbative studies of Chern–Simons gauge theory have been carried out. The quantization of the theory has been studied from the point of view of the operator formalism as well as other more geometrical methods. Also, its connection to two-dimensional conformal field theory has been further elucidated, and a powerful method for the general computation of knot and link invariants has been developed by Kaul and collaborators.

Chern–Simons theory is also amenable to perturbative analysis, which has provided important representations of the Vassiliev invariants. These invariants, proposed by Vassiliev in 1990, turned out to be the coefficients of the perturbative series expansion of the correlators of Chern–Simons gauge theory. Perturbative studies can be carried out in different gauges, originating a variety of new representations of Vassiliev invariants. Among the most relevant results related to these topics are the integral expressions for Vassiliev invariants by Kontsevich and by Bott and Taubes, as well as the recent combinatorial ones. These developments are not described here but the interested reader is referred to the recent review (Labastida 1999).

Advanced Developments

Topological sigma models are another important type of (Witten-type) TQFTs. These theories are obtained after twisting 2D N = 2 supersymmetric sigma models. The twisting can be done in two different ways leading to two types of models, A and B. Their existence is related to mirror symmetry. Only type-A models will be described in what follows. These models can be defined on an arbitrary almost-complex manifold, though typically they are considered on Kähler manifolds. The theory involves maps from two-dimensional Riemann surfaces $\Sigma$ to target spaces X, together with fermionic degrees of freedom on $\Sigma$ which are mapped to tangent vectors on X. The functional integral of the resulting theory is localized on holomorphic maps, defining the corresponding moduli space. The corresponding Q-cohomology provides the set of physical observables, which can be mapped to cohomology classes on the moduli space and integrated to produce topological invariants.

Topological sigma models keep fixed the complex structure of the Riemann surface $\Sigma$. Motivated by string theory, one also considers the situation in which one integrates over complex structures. In this case, one ends up working with holomorphic maps in the entire moduli space of curves. The resulting theories are called topological strings.

We will review now a particular example of topological string theory which, besides being very interesting from the point of view of physics and mathematics, will be very useful in establishing a relation with Chern–Simons gauge theory. Let us consider topological strings with target manifold X a Calabi–Yau 3-fold. In this case, the virtual dimension of the moduli space of holomorphic maps turns out to be zero. Two situations can occur: either the space is given by a number of points (the real dimension is zero) or the moduli space is finite dimensional and possesses a bundle of the same dimension as the tangent bundle. In the first case, topological strings count the number of points weighted by the exponential of the area of the holomorphic map (the pullback of the Kähler form integrated over the surface) times $x^{2g-2}$, where x is the string-coupling constant and g is the genus of $\Sigma$. In the second case, one computes the top Chern class of the appropriate bundles (properly defined), again weighted by the same factor. In both cases one can classify the contributions according to the cohomology class $\beta$ on X in which the image of the holomorphic map is contained. The sum of the numbers obtained for each $\beta$ and fixed g are known as Gromov–Witten

invariants, $N_g^\beta$. The topological string contribution takes the form

$$\sum_{g\ge 0} x^{2g-2} \left( \sum_{\beta\in H_2(X,\mathbb{Z})} N_g^\beta\, e^{\int_\beta \omega} \right) \qquad [47]$$

where $\omega$ is the Kähler class of the Calabi–Yau manifold. In general, the quantities $N_g^\beta$ are rational numbers.

The precedent discussion has shown how Gromov–Witten invariants can be interpreted in terms of string theory. One could think that this is just a fancy observation and that no further insight on these invariants can be gained from this formulation. The situation turns out to be quite the opposite. Once a string formulation has been obtained, the whole machinery of string theory is at our disposal. One should look to new ways to compute the quantity [47], where Gromov–Witten invariants are packed. The hope is that, if this is possible, the new emerging picture will provide new insights on these invariants. This is indeed what occurred recently. It turns out that the quantity [47] can be obtained from an alternative point of view in which the embedded Riemann surfaces are regarded as D-branes. The outcome of this approach is that the Gromov–Witten invariants can be written in terms of other invariants which are integers and that possess a geometrical interpretation. To be more specific, the quantity [47] takes the form

$$\sum_{g\ge 0} \sum_{\substack{d>0 \\ \beta\in H_2(X,\mathbb{Z})}} n_g^\beta\, \frac{1}{d} \left( 2\sin\frac{dx}{2} \right)^{2g-2} e^{d\int_\beta \omega} \qquad [48]$$

where $n_g^\beta$ are the new "integer" invariants. This prediction has been verified in all the cases in which it has been tested. A similar structure will be found in the next section in the context of knot theory in the large-N limit.

Let us now consider also Donaldson–Witten theory from a new perspective. To be more specific, let us consider the case in which the gauge group is SU(2), and the 4-manifold X is simply connected and has $b_2^+ > 1$ (the case $b_2^+ = 1$ is anomalous). In this situation there are $1 + b_2$ physical observables [34], $\mathcal{O} = I_1$ and $I(\Sigma_a) = I_2(\Sigma_a)$ ($a = 1, \ldots, b_2$), where $\Sigma_a$ is a basis of $H_2(X)$. These can be packed in a generating functional:

$$\left\langle \exp\left( \sum_a \alpha_a I(\Sigma_a) + \lambda\,\mathcal{O} \right) \right\rangle \qquad [49]$$

where $\lambda$ and $\alpha_a$ ($a = 1, \ldots, b_2$) are parameters. In computing this quantity one can argue that the contribution is localized on the moduli space of instantons configurations and one ends up, after taking into account the selection rule dictated by the dimensionality of the moduli space, with integrations over the moduli space of the selected forms. The resulting quantities are Donaldson invariants.

As in the case of topological sigma models one could be tempted to argue that the observation leading to a field-theoretical interpretation of Donaldson invariants does not provide any new insight. Quite on the contrary, once a field theory formulation is available, one has at his disposal a huge machinery which could lead, on the one hand, to further generalizations of the theory and, on the other hand, to new ways to compute quantities such as [49], obtaining new insights on these invariants. This is indeed what happened in the 1990s, leading to an important breakthrough in 1994 when Seiberg and Witten calculated [49] in a different way and pointed out the relation of Donaldson invariants to new integer invariants that nowadays bear their names.

The localization argument that led to the interpretation of [49] as Donaldson invariants is valid because the theory under consideration is exact in the weak-coupling limit. Actually, the topological theory under consideration is independent of the coupling constant and thus calculations in the strong-coupling limit are also exact. These types of calculations were out of reach before 1994. The situation changed dramatically after the work of Seiberg and Witten in which N = 2 super Yang–Mills theory was solved in the strong-coupling limit. Its application to the corresponding twisted version was immediate and it turned out that Donaldson invariants can be written in terms of new integer invariants now known as Seiberg–Witten invariants (Witten 1994). The development has a strong resemblance with the one described above for topological strings: certain noninteger invariants can be expressed in terms of new integer invariants.

The Seiberg–Witten invariants are actually simpler to compute than Donaldson invariants. They correspond to partition functions of topological Yang–Mills theories where the gauge group is abelian. These contributions can be grouped into classes labeled by $x = -2c_1(L)$, where $c_1(L)$ is the first Chern class of the corresponding line bundle. The sum of contributions, each being $\pm 1$, for a given class x is the integer Seiberg–Witten invariant $n_x$. The strong-coupling analysis of topological Yang–Mills theory leads to the following expression for [49]:

$$2^{1+\frac{1}{4}(7\chi+11\sigma)} \left[ e^{\left(\frac{v^2}{2}+2\lambda\right)} \sum_x n_x\, e^{v\cdot x} + i^{\frac{\chi+\sigma}{4}}\, e^{\left(-\frac{v^2}{2}-2\lambda\right)} \sum_x n_x\, e^{-i v\cdot x} \right] \qquad [50]$$

where $v = \sum_a \alpha_a \Sigma_a$, and $\chi$ and $\sigma$ are the Euler number and the signature of the manifold X. This result matches the known structure of [49] (structure theorem of Kronheimer and Mrowka) and provides

a meaning to its unknown quantities in terms of the new Seiberg–Witten invariants. Equation [50] is a rather remarkable prediction that has been tested in many cases, and for which a general proof has been recently proposed. For a review of the subject, see Labastida and Lozano (1998).

The situation for manifolds with $b_2^+ = 1$ involves a metric dependence and has been worked out in detail (Moore and Witten 1998). The formulation of Donaldson invariants in field-theoretical terms has also provided a generalization of these invariants. This generalization has been carried out in several directions: (1) the consideration of higher-rank groups, (2) the coupling to matter fields after twisting N = 2 hypermultiplets, and (3) the twist of theories involving N = 4 supersymmetry.

We will now look at Chern–Simons gauge theory from the perspective that emerges from its treatment in the context of the large-N expansion. We will restrict the discussion to the case of knots on $S^3$ with gauge group SU(N). Gauge theories with gauge group SU(N) admit, besides the perturbative expansion, a large-N expansion. In this expansion correlators are expanded in powers of 1/N while keeping the ’t Hooft coupling t = Nx fixed, x being the coupling constant of the gauge theory. For example, for the free energy of the theory, one has the general form

$$F = \sum_{\substack{g\ge 0 \\ h\ge 1}}^{\infty} C_{g,h}\, N^{2-2g}\, t^{2g-2+h} \qquad [51]$$

In the case of Chern–Simons gauge theory, the coupling constant is $x = 2\pi i/(k+N)$ after taking into account the shift in k. The large-N expansion [51] resembles a string-theory expansion and indeed the quantities $C_{g,h}$ can be identified with the partition function of a topological open string with g handles and h boundaries, with N D-branes on $S^3$ in an ambient six-dimensional target space $T^*S^3$. This was pointed out by Witten in 1992. The result makes a connection between a topological three-dimensional field theory and the topological strings described in the previous section.

In 1998 an important breakthrough took place which provided a new approach to compute quantities such as [51]. Using arguments inspired by the AdS/CFT correspondence, Gopakumar and Vafa (1999) provided a closed-string-theory interpretation of the partition function [51]. They conjectured that the free energy F can be expressed as

$$F = \sum_{g\ge 0}^{\infty} N^{2-2g} F_g(t) \qquad [52]$$

where $F_g(t)$ corresponds to the partition function of a topological closed-string theory on the noncompact Calabi–Yau manifold X called the resolved conifold, $\mathcal{O}(-1)\oplus\mathcal{O}(-1)\to\mathbb{P}^1$, t being the flux of the B-field through $\mathbb{P}^1$. The quantities $F_g(t)$ have been computed using both physical and mathematical arguments, thus proving the conjecture.

Once a new picture for the partition function of Chern–Simons gauge theory is available, one should ask about the form that the expectation values of Wilson loops could take in the new context. The question was faced by Ooguri and Vafa and they provided the answer, later refined by Labastida, Mariño, and Vafa. The outcome is an entirely new point of view in the theory of knot and link invariants. The new picture provides a geometrical interpretation of the integer coefficients of the quantum group invariants, an issue that has been investigated during many years. To present an account of these developments, one needs to review first some basic facts of large-N expansions.

To consider the presence of Wilson loops, it is convenient to introduce a particular generating functional. First, one performs a change of basis from representations R to conjugacy classes $C(\vec{k})$ of the symmetric group, labeled by vectors $\vec{k} = (k_1, k_2, \ldots)$ with $k_i \ge 0$, and $|\vec{k}| = \sum_j k_j > 0$. The change of basis is $W_{\vec{k}} = \sum_R \chi_R(C(\vec{k}))\, W_R$, where $\chi_R$ are characters of the permutation group $S_\ell$ of $\ell = \sum_j j k_j$ elements ($\ell$ is also the number of boxes of the Young tableau associated to R). Second, one introduces the generating functional:

$$F(V) = \log Z(V) = \sum_{\vec{k}} \frac{|C(\vec{k})|}{\ell!}\, W_{\vec{k}}^{(c)}\, \Upsilon_{\vec{k}}(V) \qquad [53]$$

where

$$Z(V) = \sum_{\vec{k}} \frac{|C(\vec{k})|}{\ell!}\, W_{\vec{k}}\, \Upsilon_{\vec{k}}(V), \qquad \Upsilon_{\vec{k}}(V) = \prod_j \left(\mathrm{tr}\, V^j\right)^{k_j}$$

In these expressions $|C(\vec{k})|$ denotes the number of elements of the class $C(\vec{k})$ in $S_\ell$. The reason behind the introduction of this generating functional is that the large-N structure of the connected Wilson loops, $W_{\vec{k}}^{(c)}$, turns out to be very simple:

$$\frac{|C(\vec{k})|}{\ell!}\, W_{\vec{k}}^{(c)} = \sum_{g=0}^{\infty} x^{2g-2+|\vec{k}|}\, F_{g,\vec{k}}(\lambda) \qquad [54]$$

where $\lambda = e^t$ and t = Nx is the ’t Hooft coupling. Writing x = t/N, it corresponds to a power series expansion in 1/N. As before, the expansion looks like a perturbative series in string theory where g is the genus and $|\vec{k}|$ is the number of holes. Ooguri and Vafa conjectured in 1999 the appropriate string-theory description of [54]. It corresponds to an open topological string theory (notice that the ones

described in the previous section were closed), whose target space is the resolved conifold X. The contribution from this theory will lead to open-string analogs of Gromov–Witten invariants.

In order to describe in more detail the fact that one is dealing with open strings, some new data need to be introduced. Here is where the knot description intrinsic to the Wilson loop enters. Given a knot K on $S^3$, let us associate to it a Lagrangian submanifold $C_K$ with $b_1 = 1$ in the resolved conifold X and consider a topological open string on it. The contributions in this open topological string are localized on holomorphic maps $f: \Sigma_{g,h} \to X$ with $h = |\vec{k}|$ which satisfy $f_*[\Sigma_{g,h}] = Q$, and $f_*[C] = j[\gamma]$ for $k_j$ oriented circles C. In these expressions $\gamma \in H_1(C_K, \mathbb{Z})$, and $Q \in H_2(X, C_K, \mathbb{Z})$, that is, the map is such that $k_j$ boundaries of $\Sigma_{g,h}$ wrap the knot j times, and $\Sigma_{g,h}$ itself gets mapped to a relative two-homology class characterized by the Lagrangian submanifold $C_K$. The number of such maps (in the sense described in the previous section) is the open-string analog of Gromov–Witten invariants. They will be denoted by $N^Q_{g,\vec{k}}$. Comparing to the situation that led to [47] in the closed-string case, one concludes that in this case the quantities $F_{g,\vec{k}}(\lambda)$ in [54] must take the form

$$F_{g,\vec{k}}(\lambda) = \sum_Q N^Q_{g,\vec{k}}\, e^{\int_Q \omega}, \qquad t = \int_{\mathbb{P}^1} \omega \qquad [55]$$

where $\omega$ is the Kähler class of the Calabi–Yau manifold X and $\lambda = e^t$. For any Q, one can always write $\int_Q \omega = Qt$, where Q is in general a half-integer number. Therefore, $F_{g,\vec{k}}(\lambda)$ is a polynomial in $\lambda^{\pm 1/2}$ with rational coefficients.

The result [55] is very impressive but still does not provide a representation where one can assign a geometrical interpretation to the integer coefficients of the quantum-group invariants. Notice that to match a polynomial invariant to [55], after obtaining its connected part, one must expand it in x after setting $q = e^x$ keeping $\lambda$ fixed. One would like to have a refined version of [55], in the spirit of what was described in the previous section leading from the Gromov–Witten invariants $N_g^\beta$ of [47] to the new integer invariants $n_g^\beta$ of [48]. It turns out that, indeed, F(V) can be expressed in terms of integer invariants in complete analogy with the description presented in the previous section for topological strings. A good review on the subject can be found in Mariño (2005).

Concluding Remarks

In this overview we have introduced key features of TQFTs and we have described some of the most relevant results that emerged from them. We have described how the many faces of TQFT provide a variety of important insights in a selected set of problems in topology. Among these stand out the reformulation of Donaldson theory and the discovery of the Seiberg–Witten invariants, and the string-theory description of the large-N expansion of Chern–Simons gauge theory, which provides an entirely new point of view in the study of knot and link invariants and points to an underlying fascinating interplay between string theory, knot theory, and enumerative geometry which opens new fields of study.

In addition to their intrinsic mathematical interest, TQFTs have been found relevant to important questions in physics as well. This is so because, in a sense, TQFTs are easier to solve than conventional quantum field theories. For example, topological sigma models are relevant to the computation of certain couplings in string theory. Also, Witten-type gauge TQFTs such as Donaldson–Witten theories and their generalizations play a role in string theory as effective world-volume theories of extended string states (branes) wrapping curved spaces, and TQFTs arising from N = 4 gauge theories in four dimensions have shed light on field- (and string-) theory dualities.

Most of these developments, and others that we have not touched upon or only mentioned in passing, have their own entries in the encyclopedia, to which we refer the interested reader for further details.

See also: Axiomatic Approach to Topological Quantum Field Theory; BF Theories; Chern–Simons Models: Rigorous Results; Donaldson–Witten Theory; Gauge Theoretic Invariants of 4-Manifolds; Gauge Theory: Mathematical Applications; Hamiltonian Fluid Dynamics; The Jones Polynomial; Knot Theory and Physics; Mathai–Quillen Formalism; Mathematical Knot Theory; Schwarz-Type Topological Quantum Field Theory; Seiberg–Witten Theory; Stationary Phase Approximation; Topological Sigma Models.

Further Reading

Atiyah MF (1988) New invariants of three and four dimensional manifolds. In: The Mathematical Heritage of Hermann Weyl, Proc. Symp. Pure Math., vol. 48. American Math. Soc. pp. 285–299.
Birmingham D, Blau M, Rakowski M, and Thompson G (1991) Topological field theory. Physics Reports 209: 129–340.
Cordes S, Moore G, and Ramgoolam S (1996) Lectures on 2D Yang–Mills theory, equivariant cohomology and topological field theories. In: David F, Ginsparg P, and Zinn-Justin J (eds.) Fluctuating Geometries in Statistical Mechanics and Field Theory, Les Houches Session LXII, p. 505 (hep-th/9411210). Elsevier.
Donaldson SK (1990) Polynomial invariants for smooth four-manifolds. Topology 29: 257–315.
Floer A (1987) Morse theory for fixed points of symplectic diffeomorphisms. Bulletin of the American Mathematical Society 16: 279.

Freyd P, Yetter D, Hoste J, Lickorish WBR, Millett K, and Ocneanu A (1985) A new polynomial invariant of knots and links. Bulletin of the American Mathematical Society 12(2): 239–246.
Gopakumar R and Vafa C (1999) On the gauge theory/geometry correspondence. Advances in Theoretical and Mathematical Physics 3: 1415 (hep-th/9811131).
Jones VFR (1985) A polynomial invariant for knots via von Neumann algebras. Bulletin of the American Mathematical Society 12: 103–112.
Jones VFR (1987) Hecke algebra representations of braid groups and link polynomials. Annals of Mathematics 126(2): 335–388.
Kauffman LH (1990) An invariant of regular isotopy. Transactions of the American Mathematical Society 318(2): 417–471.
Labastida JMF (1999) Chern–Simons gauge theory: ten years after. In: Falomir H, Gamboa R, and Schaposnik F (eds.) Trends in Theoretical Physics II, ch. 484, 1. New York: AIP (hep-th/9905057).
Labastida JMF and Lozano C (1998) Lectures in topological quantum field theory. In: Falomir H, Gamboa R, and Schaposnik F (eds.) Trends in Theoretical Physics, ch. 419, 54. New York: AIP (hep-th/9709192).
Labastida JMF and Mariño M (2005) Topological Quantum Field Theory and Four Manifolds. Dordrecht: Elsevier; Norwell, MA: Springer.
Mariño M (2005) Chern–Simons theory and topological strings. Reviews of Modern Physics 77: 675–720.
Moore G and Witten E (1998) Integrating over the u-plane in Donaldson theory. Advances in Theoretical and Mathematical Physics 1: 298–387.
Vassiliev VA (1990) Cohomology of knot spaces. In: Theory of Singularities and Its Applications, Advances in Soviet Mathematics, vol. 1, pp. 23–69. American Mathematical Society.
Witten E (1982) Supersymmetry and Morse theory. Journal of Differential Geometry 17: 661–692.
Witten E (1988a) Topological quantum field theory. Communications in Mathematical Physics 117: 353.
Witten E (1988b) Topological sigma models. Communications in Mathematical Physics 118: 411.
Witten E (1989) Quantum field theory and the Jones polynomial. Communications in Mathematical Physics 121: 351.
Witten E (1994) Monopoles and four-manifolds. Mathematical Research Letters 1: 769–796.

Topological Sigma Models


D Birmingham, University of the Pacific, Stockton, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Topological sigma models govern the quantum mechanics of maps from a Riemann surface $\Sigma$ to a target space M. In contrast to the standard supersymmetric sigma model, the topological version has a special local shift symmetry. This symmetry takes the form $\delta u^i = \Lambda^i$, where $\Lambda^i$ is an arbitrary local function of the coordinates on the base manifold $\Sigma$. In essence, this topological shift symmetry ensures that all local degrees of freedom of the model can be gauged away. As a result, the dynamics of such a model resides in a finite number of global topological degrees of freedom. This feature is generic to all topological field theories of Witten type, also known as cohomological field theories (see Topological Quantum Field Theory: Overview). The topological shift symmetry is responsible for the special topological nature of the model, which is seen most readily by BRST quantizing the local shift symmetry. This gives rise to a nilpotent BRST operator Q. The properties of this BRST operator are crucial for establishing the topological nature of the model. The key point in the construction of any cohomological field theory is the fact that the full quantum action $S_q$ can be written as a BRST commutator $S_q = \{Q, V\}$, where V is a function of the fields needed to define the path integral. In particular, one can show that the partition function and all correlation functions are independent of the metric on both the base manifold $\Sigma$ and the target space M. For example, let us define the path integral by

$$Z = \int d\Phi\; e^{-\{Q,V\}} \qquad [1]$$

where $\Phi$ denotes the full set of fields required at the quantum level. In general, the function V depends on geometric data of both $\Sigma$ and M. Nevertheless, one can easily establish that the partition function is independent of this data by noting the following. Variation of Z with respect to the metric of the target space g (for example) gives

$$\delta_g Z = -\int d\Phi\; e^{-\{Q,V\}}\, \{Q, \delta_g V\} \qquad [2]$$

The right-hand side of this equation is nothing but the vacuum expectation value of a BRST commutator, and this vanishes by BRST invariance of the vacuum. It is important to note here that the BRST operator Q can be constructed to be independent of g. Apart from the necessity of introducing the metric tensor, these models also require additional geometric data for their construction. The complex structure of $\Sigma$, and at least an almost-complex structure on M, is required. By a similar argument, one can show that the partition function and correlation functions are independent of this extra

geometric data. As mentioned above, these models manifold M. The coordinates on  are denoted by
possess no local degrees of freedom. One can then  ( = 1, 2), while those on the target manifold M
show that the path-integral expression for the are denoted by ui (i = 1, . . . , dim M). The metric and
correlation functions can be localized to a finite- complex structure of  are denoted by h and   ,
dimensional moduli space of instanton configura- respectively; they obey the relations     =  
tions which minimize the classical action. and  = h   . The metric tensor gij and almost-
We will first show how the full quantum action of complex structure Ji j of M obey analogous relations
the theory can be obtained as a BRST quantization of a to the above. In the general model, the target space
classical action with a local gauge symmetry. How- need only be an almost-complex manifold. This
ever, we shall then highlight the fact that the gauge requires the existence of a globally defined tensor
algebra for this topological shift symmetry only closes field Ji j such that Ji j Jj k = i k .
on-shell. In order to proceed with a BRST quantization The action [3] is invariant under the topological
of the model, and obtain the complete quantum shift symmetry
action, one must take recourse to the Batalin–
Vilkovisky quantization scheme. This machinery is ui ¼ i ½7
ideally tailored for such a problem, with the end result
where i is an arbitrary local function of the
that quartic ghost terms are present in the action.
coordinates on the base manifold . Already, at
However, the presence of such terms does not affect
this level, we see the distinction with the standard
the arguments presented above, since the quantum is
sigma model. The presence of this shift symmetry
still obtained as a BRST commutator. Following this,
means that all local degrees of freedom can be
we construct all observables of the theory and
gauged away, leaving only a finite number of global
demonstrate their connection to the de Rham coho-
topological degrees of freedom. It requires some
mology of the target space. The special topological
work to determine the corresponding transformation
properties of the observables are then discussed, and it
for Gi , the key point being the preservation of the
is shown how their computation is localized to the
self-duality constraint. We find
moduli space M of holomorphic maps from  to M.
  j 1  l   k
As a particular example, we show how the computa- Gi ¼ Pi j
þ j D  þ 2    Dl J k @ u
tion of a certain class of observables determines the  
intersection numbers of the moduli space M. We þ 12   k Dk Ji j Gj  ilk k Gl ½8
present a brief discussion of the connection between
topological sigma models with Calabi–Yau target where the covariant derivative is defined by
space M, and the mirror symmetry of M. D i = @ i þ ijk (@ uj )k .
Having determined the classical symmetries of the
model, we can now proceed with the BRST quantized
Construction of the Model form of the quantum action. As a topological field
theory of Witten type, one can show that the quantum
We begin with the following classical action: action can be written as a BRST commutator, that is,
Z pffiffiffi Sq = {Q, V}, where the gauge fermion V is defined by
Sc ¼ d2  h h gij Ki Kj ½3 Z pffiffiffi  
 i @  ui   Bi

V ¼ d2  h C ½9
where 4
 
Ki ¼ Gi  12 @  ui þ   Ji j @  uj ½4 where  is an arbitrary gauge-fixing parameter. The
i i
BRST operator Q is nilpotent Q2 = 0, off-shell. It is
The fields G and K both satisfy the self-duality defined by  = {Q, }, and takes the form
constraint
ui ¼  Ci
Gi ¼ Pi
þ j G
j
½5 Ci ¼ 0
Ki ¼ Pi
þ j K
j
 
 1  
j  k k j
Ci ¼  Bi þ  Dk Ji Cj C þ ij Ck C
where the self-dual and anti-self-dual projection 2
operators are defined as    t ½10
  Bi ¼ Ck Cl Riklt þ Rklrs Jri Js t C
4  
Pi 1  i  i
 j ¼ 2    j    J j ½6 
   Dk Ji j Ck Bj
2  
The above action describes a theory of maps ui () 
þ Ck Dk Ji s Cl Dl Jt s C  t þ i Cj Bk
jk
from a Riemann surface  to an almost complex 4
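
The self-duality constraint [5] relies on the operators [6] being genuine projectors. The following is a minimal numerical sketch of that fact, not part of the original article. It assumes the standard conventions $\epsilon_\alpha{}^\beta \epsilon_\beta{}^\gamma = -\delta_\alpha{}^\gamma$ and $J^i{}_j J^j{}_k = -\delta^i{}_k$, takes a flat two-dimensional base and a two-dimensional target purely for illustration, and represents $P_\pm$ as the matrices $\tfrac{1}{2}(1 \pm \epsilon \otimes J)$ acting on the combined index $(\alpha, i)$. All names (`EPS`, `J`, `P_PLUS`, `P_MINUS`, and the helper functions) are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product; rows are indexed by (alpha, i), columns by (beta, j)."""
    return [[a[al][be] * b[i][j]
             for be in range(len(a[0])) for j in range(len(b[0]))]
            for al in range(len(a)) for i in range(len(b))]


def identity(n):
    """n x n identity matrix."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def combine(x, a, y, b):
    """Entrywise linear combination x*a + y*b."""
    return [[x * p + y * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


# Illustrative choices (assumptions): flat complex structures in two dimensions.
EPS = [[0, 1], [-1, 0]]   # plays the role of epsilon_alpha^beta on the base
J = [[0, 1], [-1, 0]]     # plays the role of J^i_j on the target

MINUS_ONE = combine(-1, identity(2), 0, identity(2))
EPS_SQUARES_TO_MINUS_ONE = matmul(EPS, EPS) == MINUS_ONE
J_SQUARES_TO_MINUS_ONE = matmul(J, J) == MINUS_ONE

# Because both factors square to -1, their tensor product squares to +1,
# so (1 +/- EPS (x) J) / 2 are complementary projectors.
EJ = kron(EPS, J)
ONE = identity(4)
HALF = Fraction(1, 2)
P_PLUS = combine(HALF, ONE, HALF, EJ)
P_MINUS = combine(HALF, ONE, -HALF, EJ)
ZERO = [[0] * 4 for _ in range(4)]

IDEMPOTENT = (matmul(P_PLUS, P_PLUS) == P_PLUS
              and matmul(P_MINUS, P_MINUS) == P_MINUS)
ORTHOGONAL = matmul(P_PLUS, P_MINUS) == ZERO
COMPLETE = combine(1, P_PLUS, 1, P_MINUS) == ONE

# The trace of a projector is its rank: the number of independent
# components that survive the self-duality constraint.
RANK_PLUS = sum(P_PLUS[r][r] for r in range(4))

print(EPS_SQUARES_TO_MINUS_ONE, J_SQUARES_TO_MINUS_ONE)
print(IDEMPOTENT, ORTHOGONAL, COMPLETE, RANK_PLUS)
```

In this toy setting the trace of `P_PLUS` is half the number of components of a field carrying one base index and one target index, which illustrates how a constraint of the form [5] halves the independent components of such a field. Exact rational arithmetic is used so that the matrix identities can be compared with `==` rather than with a tolerance.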

In the above, the ghost field is denoted by Ci , while Construction of Observables


 i and the multiplier field Bi
the anti-ghost field C
Having defined the quantum action, it is now of
obey the self-duality constraint [5]. The key point to
interest to consider the correlation functions of the
note in the above transformations is the fact that the
model. In the functional integral, we integrate over
ghost field Ci is BRST invariant. Again, this is a
all maps  ! M in a fixed homotopy class. Let us
feature which is generic to all cohomological field
consider a correlation function
theories. The existence of such a field allows the
Z
construction of an entire set of topological correla-  i dCi etSq O
hOi ¼ dui dC ½13
tion functions, as we shall see in the following
section.
where t > 0 is a parameter, and the observable O is
While the gauge-fixing parameter  is arbitrary, a
BRST invariant {Q, O} = 0. From the BRST invar-
conventional choice is to take  = 1, and then
iance of the vacuum, it follows immediately that the
integrate out the multiplier field B. This yields the
vacuum expectation value of a BRST commutator is
action in the form
zero, h{Q, O}i = 0. An operator which is a BRST
Z pffiffiffi 1 1
commutator is said to be Q-exact. Hence, our
Sq ¼ d2  h h gij @ ui @ uj þ  Jij @ ui @ uj interest is in the Q-cohomology classes of operators,
2 2 that is, BRST invariant operators modulo BRST
 
 
 i D Ci þ 1   Dj Ji @  uk Cj
þC
exact operators. It is for this reason that such a
k
2 model is called a cohomological field theory.
1  m  k j r
One can now show that the variation of [13] with
þ C  C Rmkjr C C respect to t is a BRST commutator, namely
8

1 Z
k
þ Ci C ðDj Jli ÞðDr Jlk ÞCj Cr ½11 t hOi ¼ t dui dC  i dCi etSq fQ; VOg ¼ 0 ½14
16
As a result, one can evaluate the correlation function
It should be stressed that the classical gauge algebra
in the large-t (weak-coupling) limit. In this limit, the
[7] and [8] only closes on-shell. Quantization of the
path integral is dominated by fluctuations around
model is therefore more subtle, and requires use of
the classical minima. For the sigma model under
the Batalin–Vilkovisky formalism. The on-shell
study, the classical action is minimized by the
closure problem automatically results in the pre-
instanton configurations
sence of quartic ghost coupling terms in the action
and consequently cubic terms in the BRST transfor- @ ui þ   Ji j @ ui ¼ 0 ½15
mations. Despite this, we have established that the
full quantum action can be written as a BRST Indeed, this localization of the path integral to the
commutator. moduli space of instantons can also be seen by
The form of the action simplifies when the choosing the  = 0 gauge in [9]. Integration over the
complex structure of the target manifold is multiplier field then imposes a delta function
covariantly constant, Dk Ji j = 0. In this case, the constraint to the instanton configurations. The key
target manifold M is Kähler and we denote the point in the above derivation is the fact that the
complex coordinates as uI , with their complex quantum action is a BRST commutator, Sq = {Q, V}.

conjugates denoted by uI . The nonzero compo- By a similar argument, one can show that variations
nents of the metric tensor are then gIJ . Similarly, of hOi with respect to the metric and complex
the coordinates of  are denoted   , with nonzero structure of  and M are also zero.
metric components hþ . The nonzero components Our aim now is to construct the Q-cohomology
of the ghost and anti-ghost are then given by classes of operators in the theory. Let us first associate
 
CI , CI , C  an operator O(0) i1
A to each p-form A = Ai1 ip du ^    ^
þI , CI . The action can be written in the ip
form du on the target space M, given by
ð0Þ
Z pffiffiffi OA ¼ Ai1 ip Ci1    Cip ½16
2  1 I J þ
Sq ¼ d  h hþ gIJ @þ uI @ uJ þ C þ ðD C Þh gIJ
2 where Ci is the ghost field. Under a BRST
1  I J þ
transformation, we see that
þ C  ðDþ C Þh gIJ
2
ð0Þ
fQ; OA g ¼ @i0 Ai1 ip Ci0    Cip
1  þI C

  I R  CJ CJ
þ hþ C IIJJ ½12 ð0Þ
4 ¼ OdA ½17

since the ghost fields are BRST invariant by [10]. n o Z n o Z


ð1Þ ð1Þ ð0Þ
Hence, O(0) Q; WA ðÞ ¼ Q; OA ¼ dOA ¼ 0 ½22
A is BRST invariant if and only if A is a  
closed p-form. Similarly, if A is an exact p-form,
then the corresponding operator is Q-exact. Hence, Moreover, if  happens to be the boundary of a two-
the BRST cohomology classes of these operators are dimensional surface ( = @), so that  is trivial in
in one to one correspondence with the de Rham homology, then this new operator is likewise trivial
cohomology classes on M. The reason for assigning in Q cohomology:
the peculiar superscript to the operator O(0) will Z Z Z
ð1Þ ð1Þ ð1Þ ð2Þ
become clear at the end of this construction. Notice WA ðÞ ¼ OA ¼ dOA ¼ Q; OA ½23
also that operators of the form O(0)
A can be used as
  

building blocks for constructing new observables. If where


we consider a set of closed forms A1 , . . . , Ak , then
pðp  1Þ
the product of the associated operators O(0) (0)
A1    OAk
ð2Þ
OA ¼  Ai1 ip dui1 ^ dui2 Ci3    Cip
is clearly Q-invariant as well. 2
When considering the vacuum expectation values As before, let us now associate to each homology
of operators which are polynomials in the fields, 2-cycle (@ = 0), another BRST invariant operator
there is an implicit dependence on the points where WA(2) defined by
the operators are located. In the case at hand Z
however, the operator O(0)A () at the point  has a
ð2Þ
WA ðÞ ¼ OA
ð2Þ
½24
vacuum expectation value which is a topological 
invariant, and thus cannot depend on the chosen The BRST invariance follows trivially as in [23].
point. To see this explicitly, we consider all fields In summary, we have produced three operators
defined over , and differentiate the operator with O(0) (1) (2)
A , OA , and OA from any given closed form A,
respect to some local coordinates  : which satisfy the relations:
n o n o
@ i1 ip @ui0 i1 ð0Þ
0 ¼ Q; OA ;
ð0Þ
dOA ¼ Q; OA
ð1Þ
A i 1 i p
C    C ¼ ð@ i0
Ai 1 i p
Þ C    Cip
@ @ n o ½25
ð1Þ ð2Þ ð2Þ
@ui0 dOA ¼ Q; OA ; dOA ¼ 0
þ pAi1 ip ð@i0 Ci1 Þ  Ci2    Cip ½18
@
The BRST observables are then given by arbitrary
In terms of exterior derivatives, this takes the form, products
R (i) of the integrated operators WA(i) () =
 OA , where  is any i-cycle in homology.
ð0Þ
dOA ¼ @i0 Ai1 ip dui0 Ci1    Cip þ pAi1 ip dCi1 Ci2    Cip
ð1Þ
¼ fQ; OA g ½19
Observables and Intersection Theory
where O(1) i1 i2 ip
A = pAi1 ip du C    C , and we have
Let us consider the computation of the correlation
used the fact that A is a closed p-form. If we let 
function hOi in the background field method. We
represent any path between two arbitrary points P
first pick a background instanton configuration [15],
and P0 , then this expression has the integral form,
and then integrate over the quantum fluctuations
Z around that instanton. The relevant part of the
ð0Þ ð0Þ ð1Þ
OA ðPÞ  OA ðP0 Þ ¼ Q; OA ½20 quantum action is quadratic in the quantum fields,

and localization of the model then ensures that such
and we see that the vacuum expectation value of a computation is exact. The quantum fields are
O(0) expanded into eigenfunctions of the operators that
A is point independent by the BRST invariance of
the vacuum. The same remark applies to any appear in the quadratic part of the action, and the
product of operators of the form we are considering. functional integral is replaced by an integral over the
To continue our construction, consider a one- eigenmodes. However, if there are fermionic zero
dimensional homology cycle (@ = 0), and define modes, then those modes do not enterR in the action.
As a result, the fermionic integrals ( d = 0) over
Z
ð1Þ ð1Þ those modes will cause hOi to vanish unless it has
WA ðÞ ¼ OA ½21 the correct fermion content; the zero modes must be

absorbed. In our case, a glance at the quantum
This new operator WA(1) () is BRST invariant by action indicates that we should concern ourselves
inspection, with the zero modes of the ghost Ci and anti-ghost

 i . A Ci zero mode is clearly in the kernel of the


C for all [ ] 2 H nk (M). By on the right-hand side of
operator this equation, we mean the pullback i under the
  inclusion i : N ! M. Conversely, to each closed
 i ¼ D i j þ  Ji jD þ  Dj Ji @  uk
D ½26
j k k-form on M, we can associate an (n  k)-cycle
 i zero mode is a zero eigenfunction of its N (it is in general a chain of subspaces), unique up
and a C
  . In the BRST quantization of the model, to homology, such that the previous relation is
adjoint D
satisfied. Furthermore, one can show that the
the ghost fields Ci are assigned ghost number þ1,
 i have ghost number Poincaré dual to N can be chosen in such a way
while the anti-ghost fields C
that its support is localized within any given open
1. It is therefore apparent that the vacuum
neighborhood of N in M (essentially delta function
expectation value of any observable will vanish
support on N).
unless that observable has a ghost number equal to
 zero modes, a, minus the number Let us now define the notion of transversal
the number of D
  intersection. For simplicity, we will first consider
of D zero modes, b. This difference, ! = a  b, is
 the intersection of two submanifolds M1 and M2
called the index of the operator D.
contained in M. We will say that these two
There is a direct link between this index and the
submanifolds have transversal intersection if the
dimension of the moduli space of instantons. Recall
tangent spaces satisfy
that we are considering the space of maps  ! M in a
specified homotopy class, which satisfy equation [15]. Tx ðM1 Þ þ Tx ðM2 Þ ¼ Tx ðMÞ ½29
It is then of interest to determine the dimension of the
space of such solutions. To this aim, we examine the for all x 2 M1 \ M2 . It is a theorem that a submanifold
constraint that arises by considering an instanton ui , of codimension k can be locally ‘‘cut-out’’ by k smooth
and another neighboring solution ui þ u ^i , where u^i is functions, that is, the submanifold is locally specified by
an infinitesimal deformation. To first order in u i
^ , we the zeros of this set of functions. It is a worthwhile
see that u ^i must be a zero mode of the operator D.  exercise to convince oneself that the definition of
This is no coincidence, and we can thus interpret the transversal intersection is equivalent to the statement
ghost fields Ci as cotangent vectors to instanton that the functions which cut-out M1 are independent
moduli space M. In particular, if M is a smooth from those which cut-out M2 . Thus, we can write
manifold, then dim M = a. The index of the operator codimðM1 \ M2 Þ ¼ codimðM1 Þ þ codimðM2 Þ ½30
 is called the virtual dimension of the moduli space.
D
In generic situations, the virtual dimension is equal to More generally, we say that the intersection M1 \    \
the actual dimension dim M. Ms of s submanifolds is transversal if the intersection of
It is possible to interpret some of the observables every pair of them is transversal. It then follows
that we have described in terms of intersection trivially by the previous argument that the codimen-
theory applied to the moduli space of instantons. In sions must satisfy
particular, one can show that all correlation func- X
s
tions of the form codimðM1 \    \ Ms Þ ¼ codimðMi Þ ½31
D E i¼1
ð0Þ ð0Þ
OA1    OAs ½27
The special case which will be important for us
are intersection numbers of certain submanifolds of occurs when the intersection of submanifolds is a
moduli space. In order to see this in a simple collection of points, that is, when the codimension
example, we first recall the notion of Poincaré of the intersection is equal to the dimension of M.
duality and the relationship between cohomology Since these points are isolated, the compactness of
and homology. M guarantees that they are finite in number.
Poincaré duality can be formulated as a relation- We are now in a position to describeDin what sense E
ship between de Rham cohomology (defined in correlation functions of the form O(0) A1    O(0)
As
terms of closed differential forms) and homology determine intersection numbers in the moduli space
(defined in terms of subspaces of M). For our M of instantons. By definition, this moduli space is
purposes here, it is sufficient to state that we can the set of maps from  to M which satisfy [15]. Let
associate to each boundaryless submanifold N of us consider the generic situation, where the virtual
codimension k, a cohomology class [ ] 2 H k (M), dimension of M (i.e., the index of D)  is equal to
such that dim M. For convenience, let us begin by choosing the
Z Z forms Ai which represent de Rham cohomology
^ ¼ ½28 classes on M, together with their Poincaré duals Mi ,
M N such that the forms have essentially delta function
support on their respective submanifolds. Since each of the operators in the correlation function depends on some fixed point $\sigma_i$, it is meaningful to define the submanifolds $L_i \equiv \{u \in \mathcal{M} \mid u(\sigma_i) \in M_i\} \subset \mathcal{M}$. Now, the correlation function represents a functional integral over the space of maps $\mathrm{Map}(\Sigma, M)$, and we have argued that this integral only receives contributions from the instanton configurations. Since the operators $A_i(u(\sigma_i))$ vanish unless $u \in L_i$ by our choice of the Poincaré duals, we see that the only contribution to the functional integral can be from those maps which lie in the intersection $L_1 \cap \cdots \cap L_s$. By ghost number considerations, this correlation function must vanish unless the codimension of the intersection equals the virtual dimension of $\mathcal{M}$. In the generic case where the virtual dimension is equal to $\dim \mathcal{M}$, this means that the intersection is simply a finite number of points. Intersection numbers $\pm 1$ can then be assigned to each point in the intersection $L_1 \cap \cdots \cap L_s$, by considering the relative orientation of the submanifolds $L_i$ at the intersection points. From the functional integral point of view, the computation reduces to an evaluation of the ratio of the bosonic determinant (integration over $u^i$) to the fermionic determinant (integration over $C^i$ and $\bar{C}^i$). In the Kähler case, for example, the intersection number assigned to each point in the intersection is always $+1$. This is due to the fact that the $C^I$, $\bar{C}_-^{\bar I}$ determinant is the complex conjugate of the $C^{\bar I}$, $\bar{C}_+^{I}$ determinant.

A and B Models and Mirror Symmetry

The topological sigma model for a Kähler target space [12] is also known as the topological A model. In this case, the action can be recovered by twisting the standard N = 2 supersymmetric sigma model. This twisting procedure amounts to a reassignment of the spins of the fields in the theory. However, there is an alternative twisting which can be done, and this leads to another model known as the topological B model. The usefulness of this observation lies in the fact that the topological A model on a Calabi–Yau target space M is related to the topological B model on the mirror of M. This relationship and the computation of correlation functions in the A and B models thus shed light on the nature of mirror symmetry.

See also: Batalin–Vilkovisky Quantization; BRST Quantization; Functional Integration in Quantum Physics; Graded Poisson Algebras; Mathai–Quillen Formalism; Mirror Symmetry: A Geometric Survey; Several Complex Variables: Compact Manifolds; Singularities of the Ricci Flow; Topological Gravity, Two-Dimensional; Topological Quantum Field Theory: Overview; WDVV Equations and Frobenius Manifolds.

Further Reading

Baulieu L and Singer I (1989) The topological sigma model. Communications in Mathematical Physics 125: 227.
Birmingham D, Blau M, Rakowski M, and Thompson G (1991) Topological field theory. Physics Reports 209: 129.
Birmingham D, Rakowski M, and Thompson G (1989) BRST quantization of topological field theory. Nuclear Physics B 315: 577.
Eguchi T and Yang SK (1990) N = 2 superconformal models as topological field theories. Modern Physics Letters A 5: 1693.
Floer A (1987) Morse theory for fixed points of symplectic diffeomorphisms. Bulletin of the American Mathematical Society 16: 279.
Floer A (1988) An instanton invariant for three manifolds. Communications in Mathematical Physics 118: 215.
Gromov M (1985) Pseudo holomorphic curves in symplectic manifolds. Inventiones Mathematicae 82: 307.
Gromov M (1986) In: Proceedings of the International Congress of Mathematicians, p. 81. Berkeley.
Vafa C (1991) Topological Landau–Ginzburg models. Modern Physics Letters A 6: 337.
Witten E (1988) Topological sigma models. Communications in Mathematical Physics 118: 411.
Witten E (1991) Mirror manifolds and topological field theory. In: Yau ST (ed.) Mirror Symmetry I, p. 121. American Mathematical Society.

Turbulence Theories
R M S Rosa, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Turbulence was initially defined as an irregular motion in fluids. The cloud formations in the atmosphere and the motion of water in rivers make this point clear. These are but a few readily available examples of a multitude of flows which display turbulent regimes: from the blood that flows in our veins and arteries to the motion of air within our lungs and around us; from the flow of water in creeks to the atmospheric and oceanic currents; from the flows past submarines, ships, automobiles, and aircraft to the combustion processes propelling them; and in the flow of gas, oil, and water, from the prospecting end to the entrails of the cities. The great majority of flows in nature and in engineering applications are somehow turbulent.
Figure 1 Illustration of the irregular motion of a turbulent flow over a flat plate (thin lines), and of the well-defined velocity profile of the mean flow (thick lines).

But turbulent flows are much more than simply irregular. More refined definitions were desirable and were later coined. A definitive and precise one, however, may only come when the phenomenon is fully understood. Nevertheless, several characteristic properties of a turbulent flow can be listed:

Irregularity and unpredictability: A turbulent flow is irregular both in space and time, displaying unpredictable, random patterns.
Statistical order: From the irregularity of a turbulent motion there emerges a certain statistical order. Mean quantities and correlations are regular and predictable (Figure 1).
Wide range of active scales: A wide range of scales of motion are active and display an irregular motion, yielding a large number of degrees of freedom.
Mixing and enhanced diffusivity: The fluid particles undergo complicated and convoluted paths, causing a large mixing of different parts of fluid. This mixing significantly enhances diffusion, increasing the transport of momentum, energy, heat, and other advected quantities.
Vortex stretching: When a moving portion of fluid also rotates transversally to its motion, an increase in speed causes it to rotate faster, a phenomenon called vortex stretching. This causes that portion of fluid to become thinner and elongated, and to fold and intertwine with other such portions. This is an intrinsically three-dimensional mechanism which plays a fundamental role in turbulence and is associated with large fluctuations in the vorticity field.

Turbulent Regimes

Turbulence is studied from many perspectives. The subject of ‘‘transition to turbulence’’ attempts to describe the initial mechanisms responsible for the generation of turbulence starting from a laminar motion in particular geometries. This transition can be followed with respect to position in space (e.g., the flow becomes more complicated as we look further downstream on a flow past an obstacle or over a flat plate) or to parameters (e.g., as we increase the angle of attack of a wing or the pressure gradient in a pipe). This subject is divided into two cases: wall-bounded and free-shear flows. In the former, the viscosity, which causes the fluid to adhere to the surface of the wall, is the primary cause of the instability in the transition process. In the latter, inviscid mechanisms such as mixing layers and jets are the main factors. The tools for studying the transition to turbulence include linearization of the equations of motion around the laminar solution, nonlinear amplitude equations, and bifurcation theory.

‘‘Fully developed turbulence,’’ on the other hand, concerns turbulence which evolves without imposed constraints, such as boundaries and external forces. This can be thought of as turbulence in its ‘‘pure’’ form, and it is somewhat a theoretical framework for research due to its idealized nature. Hypotheses of homogeneity (when the mean quantities associated with the statistical order characterizing a turbulent flow are independent of position in space), stationarity (idem in time), and isotropy (idem with respect to rotations in space) concern fully developed turbulent flows. The Kolmogorov theory was developed in this context and it is the most fundamental theory of turbulence. Current research is dedicated in great part to unveiling the mechanisms behind a phenomenon called intermittency and how it affects the laws obtained from the conventional theory. Research is also dedicated to deriving such laws as much from first principles as possible, minimizing the use of phenomenological and dimensional analysis.

Figure 2 Illustration of a flow past an object, with a laminar boundary layer (light gray), a turbulent boundary layer (medium gray), and a turbulent wake (dark gray).

Real turbulent flows involve various regimes at once. A typical flow past a blunt object, for instance, displays laminar motion at its upstream edge, a turbulent boundary layer further downstream, and the formation of a turbulent wake (Figure 2). The subject of turbulent boundary layers is a world in itself, with current research aiming to determine mean properties of flows over rough surfaces and varied topography. Convective turbulence involves coupling with active scalars such as
large heat gradients, occurring in the atmosphere, and large salinity gradients, in the ocean. Geophysical turbulence also involves stratification and the anisotropy generated by Earth’s rotation. Anisotropic turbulence is also crucial in astrophysics and plasma theory. Multiphase and multicomponent turbulence appear in flows with suspended particles or bubbles and in mixtures such as gas, water, and oil. Transonic and supersonic flows are also of great importance and fall into the category of compressible turbulence, much less explored than the incompressible case.

In all those real situations one would like, from the engineering point of view, to compute mean properties of the flow, such as drag and lift for more efficient designs of aircraft, ships, and other vehicles. Knowledge of the drag coefficient is also of fundamental importance in the design of pipes and pumps, from pipelines to artificial human organs. Mean turbulent diffusion coefficients of heat and other passive scalars – quantities advected by the flow without interfering with it, such as chemical products, nutrients, moisture, and pollutants – are also of major importance in industry, ecology, meteorology, and climatology, for instance. And in most of those cases a large amount of research is dedicated to the ‘‘control of turbulence,’’ either to increase mixing or reduce drag, for instance. From a theoretical point of view, one would like to fully understand and characterize the mechanisms involved in turbulent flows, clarifying this fascinating phenomenon. This could also improve practical applications and lead to a better control of turbulence.

The concept of ‘‘two-dimensional turbulence’’ is controversial. A two-dimensional flow may be irregular and display mixing, statistical order, and a wide range of active scales, but it definitely does not involve vortex stretching since the velocity field is always perpendicular to the vorticity field. For this reason many researchers discard two-dimensional turbulence altogether. It is also argued that real two-dimensional flows are unstable at complicated regimes and soon develop into a three-dimensional flow. Nevertheless, many believe that two-dimensional turbulence, even lacking vortex stretching, is of fundamental theoretical importance. It may shed some light on the three-dimensional theory and modeling, and it can serve as an approximation to some situations such as the motion of the atmosphere and oceans in the large and meso scales and some magnetohydrodynamic flows. The relative shallowness of the atmosphere and oceans or the imposition of a strong uniform magnetic field may force the flow into two-dimensionality, at least for a certain range of scales.

‘‘Chaos’’ serves as a paradigm for turbulence, in the sense that it is now accepted that turbulence is a dynamic process in a sensitive deterministic system. But not all chaotic motions in fluids are termed turbulent, for they may not display mixing and vortex stretching or involve a wide range of scales. An important such example appears in the dispersive, nonlinear interactions of waves.

The Equations of Motion

It is usually stressed that turbulence is a continuum phenomenon, in the sense that the active scales are much larger than the collision mean free path between molecules. For this reason, turbulence is believed to be fully accounted for by the Navier–Stokes equations.

In the case of incompressible homogeneous flows, the Navier–Stokes equations in the Eulerian form and in vector notation read

\[ \frac{\partial u}{\partial t} - \nu \Delta u + (u \cdot \nabla) u + \nabla p = f \tag{1a} \]

\[ \nabla \cdot u = 0. \tag{1b} \]

Here, $u = u(x, t) = (u_1, u_2, u_3)$ denotes the velocity vector of an idealized fluid particle located at position $x = (x_1, x_2, x_3)$, at time $t$. The mass density in a homogeneous flow is constant, denoted $\rho_0$. The constant $\nu$ denotes the kinematic viscosity of the fluid, which is the molecular viscosity $\mu$ divided by $\rho_0$. The variable $p = p(x, t)$ is the kinematic pressure, and $f = f(x, t) = (f_1, f_2, f_3)$ denotes the mass density of volume forces.

Equation [1a] expresses the conservation of linear momentum. The term $\nu \Delta u$ accounts for the dissipation of energy due to molecular viscosity, and the nonlinear term $(u \cdot \nabla) u$, also called the inertial term, accounts for the redistribution of energy among different structures and scales of motion. Equation [1b] represents the incompressibility condition. In Einstein’s summation convention, these equations can be written as

\[ \frac{\partial u_i}{\partial t} - \nu \frac{\partial^2 u_i}{\partial x_j^2} + u_j \frac{\partial u_i}{\partial x_j} + \frac{\partial p}{\partial x_i} = f_i, \qquad \frac{\partial u_j}{\partial x_j} = 0 \]

The Reynolds Number

The transition to turbulence was carefully studied by Reynolds in the late nineteenth century in a series of experiments in which water at rest in a tank was allowed to flow through a glass pipe. Starting with dimensional analysis, Reynolds argued that a critical
value of a certain nondimensional quantity was likely to exist beyond which a laminar flow gives rise to a ‘‘sinuous’’ motion. This was followed by observations of the flow for tubes with different diameters $L$, different mean velocities $U$ across the tube section, and with the kinematic viscosity $\nu = \mu/\rho_0$ being altered through changes in temperature. The experiments confirmed the existence of such a critical value for what is now called the Reynolds number:

\[ \mathrm{Re} = \frac{LU}{\nu} \]

The dimensional analysis argument can be reproduced in the following form: the physical dimension for the inertial term in [1a] is $U^2/L$, while that for the viscous term is $\nu U/L^2$. The ratio between them is precisely $\mathrm{Re} = LU/\nu$. For small values of Re viscosity dominates and the flow is laminar, whereas for large values of Re the inertial term dominates, and the flow becomes more complicated and eventually turbulent. In applications, different types of Reynolds number can be used depending on the choice of the characteristic velocity and length, but in any case, the larger the Reynolds number, the more complicated the flow.

The Reynolds Equations

Another advance put forward by Reynolds in a subsequent article was to decompose the flow into a mean component and the remaining fluctuations. In terms of the velocity and pressure fields this can be written as

\[ u = \bar{u} + u', \qquad p = \bar{p} + p' \tag{2} \]

with $\bar{u}$ and $\bar{p}$ representing the mean components and $u'$ and $p'$, the fluctuations. By substituting [2] into [1], one finds the Reynolds-averaged Navier–Stokes (RANS) equations for the mean flow:

\[ \frac{\partial \bar{u}}{\partial t} - \nu \Delta \bar{u} + (\bar{u} \cdot \nabla) \bar{u} + \nabla \bar{p} = f + \nabla \cdot \tau, \qquad \nabla \cdot \bar{u} = 0 \]

It differs from [1] only by the addition of the Reynolds stress tensor:

\[ \tau = -\overline{u' \otimes u'} = -\left( \overline{u'_i u'_j} \right)_{i,j=1}^{3} \]

In a laminar flow, the fluctuations are negligible; otherwise this decomposition shows how they influence the mean flow through these additional turbulent stresses.

The Closure Problem and Turbulence Models

The RANS equations cannot be solved directly for the mean flow since the Reynolds stresses are unknown. Equations for these stress terms can be derived but they involve further unknown moments. This continues with equations for moments of a given order depending on new moments up to a higher order, leading to an infinite system of equations known as the Friedman–Keller system. For practical applications, approximations closing the system at some finite order are needed, in what is called the closure problem. Several ad hoc approximations exist, the most famous being the Boussinesq eddy-viscosity approximation, in which the turbulent fluctuations are regarded as increasing the viscosity of the flow. Prandtl’s mixing-length hypothesis yields a prescription for the computation of this eddy viscosity, and together they form the basis of the algebraic models of turbulence. Other models involve additional equations, such as the $k$-$\epsilon$ and $k$-$\omega$ models. Most of the practical computations of industrial flows are based on such lower-order models, and a large amount of research is done to determine appropriate values for the various ad hoc parameters which appear in these models and which are highly dependent on the geometry of the flow. This dependency can be explained by the fact that the RANS is supposed to model the mean flow even at the large scales of motion, which are highly affected by the geometry.

Computational fluid dynamics (CFD) is indeed a fundamental tool in turbulence, both for research and engineering applications. From the theoretical side, direct numerical simulations (DNS), which attempt to resolve all the active scales of the flow, reveal some fundamental mechanisms involved in the transition to turbulence and in vortex stretching. As for applications, DNS applies to flows up to low-Reynolds turbulence, with the current computational power not allowing for a full resolution of all the scales involved in high-Reynolds flows. And the current rate of evolution of computational power predicts that this will remain so for several decades.

An intermediate CFD method between RANS and DNS is the large-eddy simulation (LES), which attempts to fully resolve the large scales while modeling the turbulent motion at the smaller scales. Several models have been proposed which have their own advantages and limitations as compared to RANS and DNS. It is currently a subject of intense research, particularly for the development of suitable models for the structure functions near the boundary. Theoretical results on fully developed turbulence play a fundamental role in the modeling process.
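As an illustration of the Reynolds decomposition [2] and of the second moments of the fluctuations that enter the Reynolds stress tensor, the following minimal Python sketch estimates the mean velocity and the matrix of averaged products $\overline{u'_i u'_j}$ from a small ensemble of velocity samples at one point. It assumes the average is a plain arithmetic mean over realizations; the function names `ensemble_mean` and `fluctuation_moments` and the sample values are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def ensemble_mean(samples: List[Vector]) -> List[float]:
    """Arithmetic mean of the velocity samples, component by component."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / n for i in range(dim)]


def fluctuation_moments(samples: List[Vector]) -> List[List[float]]:
    """Matrix of averaged products of fluctuations, mean(u'_i * u'_j)."""
    mean = ensemble_mean(samples)
    n = len(samples)
    dim = len(mean)
    # Fluctuation of each sample about the ensemble mean: u' = u - mean(u).
    flucts = [[s[i] - mean[i] for i in range(dim)] for s in samples]
    return [
        [sum(f[i] * f[j] for f in flucts) / n for j in range(dim)]
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical velocity samples from four realizations at one point.
    samples = [(1.0, 0.2, 0.0), (1.4, -0.2, 0.1), (0.8, 0.1, -0.1), (0.8, -0.1, 0.0)]
    print("mean velocity:", ensemble_mean(samples))
    print("second moments of fluctuations:", fluctuation_moments(samples))
```

The resulting matrix is symmetric by construction. In a RANS computation these moments are not available from the mean flow alone, which is exactly why a closure model is needed.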
LESs are a promising tool and they have been successfully applied to a number of situations. The choice of the best method for a given application, however, depends very much on the Reynolds number of the flow and the prior knowledge of similar situations for adjusting the parameters.

Elements of the Statistical Theory

Several types of averages can be used. The ensemble average is taken with respect to a number of experiments at nearly identical conditions. Despite the irregular motion of, say, the velocity vector $u^{(n)}(x, t)$ of each experiment $n = 1, \ldots, N$, the average value

\[ \bar{u}(x, t) = \frac{1}{N} \sum_{n=1}^{N} u^{(n)}(x, t) \]

is expected to behave in a more regular way. This type of averaging is usually denoted with the symbol $\langle \cdot \rangle$. This notion can be cast into the context of a probability space $(M, \Sigma, P)$, where $M$ is a set, $\Sigma$ is a $\sigma$-algebra of subsets of $M$, and $P$ is a probability measure on $\Sigma$. The velocity field is a random variable in the sense that it is a density function $\omega \mapsto u(x, t, \omega)$ from $M$ into the space of time-dependent divergence-free velocity fields. The mean velocity field in this context is regarded as

\[ \langle u(x, t) \rangle = \int_M u(x, t, \omega) \, dP(\omega) \]

Other flow quantities such as energy and correlations in space and time can be expressed by means of a function $\varphi = \varphi(u(\cdot, \cdot))$ of the velocity field, with their mean value given by

\[ \langle \varphi(u(\cdot, \cdot)) \rangle = \int_M \varphi(u(\cdot, \cdot, \omega)) \, dP(\omega) \]

In general, the statistics of the flow are allowed to change with time. A particular situation is when statistical equilibrium is reached, so that $\langle u(x, t) \rangle$ and, more generally, $\langle \varphi(u(\cdot, \cdot + t)) \rangle$ are independent of $t$. In this case, an ergodic assumption is usually invoked, which means that for ‘‘most’’ individual flows $u(\cdot, \cdot, \omega_0)$ (i.e., for almost all $\omega_0$ with respect to the probability measure $P$), the time averages along this flow converge, as the period of the average increases, to the mean value obtained by the ensemble average:

\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T \varphi(u(\cdot, \cdot + s, \omega_0)) \, ds = \int_M \varphi(u(\cdot, \cdot, \omega)) \, dP(\omega) \]

Based on this assumption, the averages may in practice be calculated as time averages over a sufficiently large period $T$. There is a related argument, based on the mechanics of turbulence, for substituting space averages by time averages, which is called the ‘‘Taylor hypothesis.’’

Another fundamental concept in the statistical theory is that of homogeneity, which is the spatial analog of the statistical equilibrium in time. In homogeneous turbulence, the statistical quantities of a flow are independent of translations in space, that is,

\[ \langle \varphi(u(\cdot + \ell, \cdot)) \rangle = \langle \varphi(u(\cdot, \cdot)) \rangle \]

for all $\ell \in \mathbb{R}^3$. The concept of isotropic turbulence assumes further independence with respect to rotations and reflections in the frame of reference, that is,

\[ \langle \varphi(Q^t u(Q \cdot, \cdot)) \rangle = \langle \varphi(u(\cdot, \cdot)) \rangle \]

for all orthogonal transformations $Q$ in $\mathbb{R}^3$, with adjoint $Q^t$.

Under the homogeneity assumption, mean quantities can be defined independently of position in space, such as the mean kinetic energy per unit mass

\[ e = \frac{1}{2} \langle |u(x)|^2 \rangle = \frac{1}{2} \sum_{i=1}^{3} \langle |u_i(x)|^2 \rangle \]

and the mean rate of viscous energy dissipation per unit mass and unit time

\[ \epsilon = \nu \sum_{i=1}^{3} \langle |\nabla u_i(x)|^2 \rangle = \nu \sum_{i,j=1}^{3} \left\langle \left| \frac{\partial u_i(x)}{\partial x_j} \right|^2 \right\rangle \]

The mean kinetic energy can be written as $e = \operatorname{tr} R(0)/2$, where

\[ \operatorname{tr} R(\ell) = R_{11}(\ell) + R_{22}(\ell) + R_{33}(\ell), \qquad \ell \in \mathbb{R}^3, \]

is the trace of the correlation tensor

\[ R(\ell) = \langle u(x) \otimes u(x + \ell) \rangle = (R_{ij}(\ell))_{i,j=1}^{3} = (\langle u_i(x) u_j(x + \ell) \rangle)_{i,j=1}^{3} \]

which measures the correlation between the velocity components at different positions in space. From the homogeneity assumption, this tensor is a function only of the relative position $\ell$. Then, assuming that the Fourier transform of $\operatorname{tr} R(\ell)$ exists, and denoting it by $Q(\boldsymbol{\kappa})$, for $\boldsymbol{\kappa} \in \mathbb{R}^3$, we have

\[ \operatorname{tr} R(\ell) = \frac{1}{(2\pi)^{3/2}} \int_{\mathbb{R}^3} Q(\boldsymbol{\kappa}) e^{i \ell \cdot \boldsymbol{\kappa}} \, d\boldsymbol{\kappa} = 2 \int_0^\infty S(\kappa) e^{i \ell \cdot \boldsymbol{\kappa}} \, d\kappa \]
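where $S(\kappa)$ is the energy spectrum defined by

\[ S(\kappa) = \frac{1}{2 (2\pi)^{3/2}} \int_{|\boldsymbol{\kappa}| = \kappa} Q(\boldsymbol{\kappa}) \, d\Sigma(\boldsymbol{\kappa}), \qquad \forall \kappa > 0, \]

with $d\Sigma(\boldsymbol{\kappa})$ denoting the area element of the 2-sphere of radius $|\boldsymbol{\kappa}|$. Then we can write

\[ e = \frac{1}{2} \langle |u(x)|^2 \rangle = \frac{1}{2} \operatorname{tr} R(0) = \int_0^\infty S(\kappa) \, d\kappa \]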
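The identity expressing the mean kinetic energy as the integral of the energy spectrum says that the energy is the sum of the contributions of all wave numbers. A minimal Python sketch of a discrete, one-dimensional analog follows. It assumes a periodic scalar velocity signal sampled at N points, a simplification not made in the text, which works in three-dimensional space. The energy of each mode is taken as half the squared modulus of the normalized discrete Fourier coefficient, and by Parseval's identity the sum over modes equals the mean of $u^2/2$. The function names and the test signal are hypothetical.

```python
import cmath
import math
from typing import List


def dft(u: List[float]) -> List[complex]:
    """Normalized discrete Fourier coefficients: (1/N) * sum_j u_j * exp(-2*pi*i*k*j/N)."""
    n = len(u)
    return [
        sum(u[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)) / n
        for k in range(n)
    ]


def energy_per_mode(u: List[float]) -> List[float]:
    """Kinetic energy carried by each discrete wave number, |u_hat_k|^2 / 2."""
    return [abs(c) ** 2 / 2 for c in dft(u)]


def mean_kinetic_energy(u: List[float]) -> float:
    """Mean of u^2 / 2 over the sample points."""
    return sum(x * x for x in u) / (2 * len(u))


if __name__ == "__main__":
    n = 16
    # Hypothetical periodic signal made of two Fourier modes (wave numbers 1 and 3).
    u = [
        math.sin(2 * math.pi * j / n) + 0.5 * math.cos(2 * math.pi * 3 * j / n)
        for j in range(n)
    ]
    spectrum = energy_per_mode(u)
    print("sum over modes:", sum(spectrum))
    print("mean of u^2/2: ", mean_kinetic_energy(u))
```

Only the modes present in the signal carry energy, so the list returned by `energy_per_mode` plays the role of the spectrum: it shows how the total energy is distributed among wave numbers.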

By expanding the velocity coordinates into Fourier modes $\exp(i\,\ell\cdot\boldsymbol{\kappa})$, with $\kappa \le |\boldsymbol{\kappa}| \le \kappa + d\kappa$, and interpreting them as "eddies" with characteristic wave number $|\boldsymbol{\kappa}|$, the quantity $S(\kappa)\,d\kappa$ can be interpreted as the energy of the component of the flow formed by the "eddies" with characteristic wave number between $\kappa$ and $\kappa + d\kappa$.

Similarly,

$$\epsilon = 2\nu \int_0^\infty \kappa^2 S(\kappa)\, d\kappa$$

and we obtain the dissipation spectrum $2\nu\kappa^2 S(\kappa)$, which can be interpreted as the density of energy dissipation occurring at wave number $\kappa$.

In the previous arguments it is assumed that the flow extends to all the space $\mathbb{R}^3$. This avoids the presence of boundaries, addressing the idealized case of fully developed turbulence. It is sometimes customary to assume as well that the flow is periodic in space to avoid problems with unbounded domains such as infinite kinetic energy.

The random nature of turbulent flows was greatly explored by Taylor in the early twentieth century, who introduced most of the concepts described above. Another important concept he introduced was the Taylor microlength $\ell_T$, which is a characteristic length for the small scales based on the correlation tensor. A microscale Reynolds number based on the Taylor microlength is very often used in applications.

Kolmogorov Theory

An inspiring concept in the theory of turbulence is Richardson's "energy cascade" process. For large Reynolds numbers the nonlinear term dominates the viscosity according to the dimensional analysis, but this is valid only for the large-scale structures. The small scales have their own characteristic length and velocity. In the cascade process, the inertial term is responsible for the transfer of energy to smaller and smaller scales until small enough scales are reached for which viscosity becomes important (Figure 3). At those smallest scales kinetic energy is finally dissipated into heat. It should be emphasized that turbulence is a dissipative process; no matter how large the Reynolds number is, viscosity plays a role in the smallest scales.

Figure 3 Illustration of the eddy breakdown process in which energy is transferred to smaller eddies and so on until the smallest scales are reached and the energy is dissipated by viscosity.

The Kolmogorov theory of locally isotropic turbulence allows for inhomogeneity and anisotropy in the large scales, which contain most of the energy, assuming that with the cascade transfer of energy to smaller scales, the orienting effects generated in the large scales become weaker and weaker so that for sufficiently small eddies the motion becomes statistically homogeneous, isotropic, and independent of the particular energy-productive mechanisms. Kolmogorov proposed that the statistical regime of the small-scale eddies is then universal and depends only on $\nu$ and $\epsilon$. The equilibrium range is defined as the range of scales in which this universality holds.

Simple dimensional analysis shows that the only algebraic combination of $\nu$ and $\epsilon$ with dimension of length is $\ell_\epsilon = (\nu^3/\epsilon)^{1/4}$, which is then interpreted as the scale near which the viscous effect becomes important and hence most of the energy dissipation takes place. The scale $\ell_\epsilon$ is known as the Kolmogorov dissipation length.

Kolmogorov theory gives particular attention to moments involving differences of velocities, such as the $p$th-order structure function

$$S_p(\ell) \overset{\mathrm{def}}{=} \langle (u(x + \ell e)\cdot e - u(x)\cdot e)^p \rangle$$

where $e$ may be taken as an arbitrary unit vector, thanks to the isotropy assumption. By restricting the search for universal laws for the structure functions to small values of $\ell$, anisotropy and inhomogeneity are allowed in the large scales.
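The dimensional-analysis step behind the Kolmogorov dissipation length can be sketched in a few lines. The example below is only an illustration: it assumes the usual dimensions $L^2 T^{-1}$ for the kinematic viscosity and $L^2 T^{-3}$ for the dissipation rate per unit mass, and the function names `exponents_for` and `kolmogorov_length` are hypothetical helpers introduced here, not notation from the article.

```python
from fractions import Fraction

# Dimensions written as (length exponent, time exponent).
NU_DIM = (2, -1)   # kinematic viscosity nu: L^2 T^-1 (assumed)
EPS_DIM = (2, -3)  # dissipation rate eps per unit mass: L^2 T^-3 (assumed)


def exponents_for(target):
    """Solve a*NU_DIM + b*EPS_DIM = target for (a, b) by Cramer's rule."""
    det = NU_DIM[0] * EPS_DIM[1] - EPS_DIM[0] * NU_DIM[1]
    a = Fraction(target[0] * EPS_DIM[1] - EPS_DIM[0] * target[1], det)
    b = Fraction(NU_DIM[0] * target[1] - target[0] * NU_DIM[1], det)
    return a, b


def kolmogorov_length(nu, eps):
    """Evaluate nu**a * eps**b with the exponents that give a pure length."""
    a, b = exponents_for((1, 0))
    return nu ** float(a) * eps ** float(b)


# A pure length has dimension (1, 0); the solution is unique because the
# 2x2 system has nonzero determinant.
length_exponents = exponents_for((1, 0))

# Illustrative (made-up) values in SI units, not data from the article.
example_length = kolmogorov_length(1e-6, 1e-4)
print(length_exponents, example_length)
```

Because the linear system for the exponents has a nonzero determinant, the combination is unique, which is the content of the statement that $(\nu^3/\epsilon)^{1/4}$ is the only such length.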

The theory assumes a wide separation between the energy-containing scales, of order say $\ell_0$, and the energy-dissipative scales, of order $\ell_\epsilon$, so that the cascade process occurs within a wide range of scales $\ell$ such that $\ell_0 \gg \ell \gg \ell_\epsilon$. In this range, termed the inertial range, the viscous effects are still negligible and the statistical regime should depend only on $\epsilon$. Then, the Kolmogorov "two-thirds law" asserts that within the inertial range the second-order correlations must be proportional to $(\epsilon\ell)^{2/3}$, that is,

$$S_2(\ell) = C_K (\epsilon\ell)^{2/3}$$

for some constant $C_K$ known as the Kolmogorov constant in physical space (there is a related constant in spectral space). The argument extends to higher-order structure functions, yielding

$$S_p(\ell) = C_p (\epsilon\ell)^{p/3}$$

Kolmogorov's derivation of these results was not by dimensional analysis; it was in fact a more convincing self-similarity argument based on the universality assumed for the equilibrium range. A different argument without resorting to universality assumptions, however, was applied to the third-order structure function, yielding the more precise "four-fifths law":

$$S_3(\ell) = -\frac{4}{5}\,\epsilon\ell$$

The "Kolmogorov five-thirds law" concerns the energy spectrum $S(\kappa)$ and is the spectral version of the two-thirds law, given by Obukhoff:

$$S(\kappa) = C'_K\, \epsilon^{2/3} \kappa^{-5/3}$$

The constant $C'_K$ is the Kolmogorov constant in spectral space. The spectral version of the dissipation length is the Kolmogorov wave number $\kappa_\epsilon = (\epsilon/\nu^3)^{1/4}$.

A typical distribution of energy in a turbulent flow is depicted in Figure 4. The energy is concentrated on the large scales, while the dissipation is concentrated near the Kolmogorov scale $\ell_\epsilon$. The five-thirds law becomes visible as a straight line in the logarithmic scale.

A more precise mechanism for the energy cascade assumes that in the inertial range, eddies with length scale $\ell$ transfer kinetic energy to smaller eddies during their characteristic timescale, also known as circulation time. If $u_\ell$ is their characteristic velocity, then $\tau_\ell = \ell/u_\ell$ is their circulation time, so that the kinetic energy transferred from these eddies per unit time is

$$\epsilon_\ell = \frac{u_\ell^2}{\tau_\ell} = \frac{u_\ell^3}{\ell}$$

In statistical equilibrium, the energy lost to the smaller scales equals the energy gained from the larger scales, and that should also equal the total kinetic energy dissipated by viscous effects. Hence, $\epsilon_\ell \sim \epsilon$, and we find

$$\epsilon \sim \frac{u_\ell^3}{\ell}$$

It also follows that $\tau_\ell = \ell/u_\ell \sim \ell(\epsilon\ell)^{-1/3} = \epsilon^{-1/3}\ell^{2/3}$, so that the circulation time decreases with the length scale and becomes of the order of the viscous dissipation time $(\nu/\epsilon)^{1/2}$ precisely when $\ell \sim \ell_\epsilon$.

A similar relation between $\epsilon$ and the large scales can also be obtained with heuristic arguments: let $e$ be the mean kinetic energy and $\ell_0$ a characteristic length for the large scales. Then $u_0$ given by $e = u_0^2/2$ is a characteristic velocity for the large scales, and $\tau_0 = \ell_0/u_0$ is the large-scale circulation time. In statistical equilibrium, the rate $\epsilon$ of kinetic energy dissipated per unit time and unit mass is expected to be of the order of $e/\tau_0$, hence

$$\epsilon \sim \frac{u_0^3}{\ell_0}$$

which is called the "energy dissipation law."

Figure 4 A typical distribution for the energy spectrum $S(\kappa)$ and the dissipation spectrum $2\nu\kappa^2 S(\kappa)$ in spectral space, in nonlogarithmic and logarithmic scales (horizontal axes $\kappa$ and $\ln\kappa$, with $\kappa_0$ and $\kappa_\epsilon$ marked and the inertial and equilibrium ranges indicated). The energy is mostly concentrated on the large scales while the dissipation is concentrated near the dissipation scale. In the logarithmic scale, the five-thirds law for the energy spectrum stands out as a straight line with slope $-5/3$ over the inertial range.
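The inertial-range scalings stated above lend themselves to a quick numerical check. The sketch below sets every order-one constant to 1 (an assumption; the text only gives the relations up to such constants), uses made-up input values, and introduces hypothetical helper names. It checks that the circulation time at the Kolmogorov length coincides with the viscous time $(\nu/\epsilon)^{1/2}$ and that a $\kappa^{-5/3}$ spectrum has slope $-5/3$ on logarithmic axes.

```python
import math


def kolmogorov_length(nu, eps):
    """Dissipation length (nu^3 / eps)^(1/4)."""
    return (nu ** 3 / eps) ** 0.25


def circulation_time(eps, ell):
    """tau_ell = ell / u_ell with u_ell = (eps * ell)^(1/3), constants set to 1."""
    u_ell = (eps * ell) ** (1.0 / 3.0)
    return ell / u_ell


def energy_spectrum(kappa, eps, c_k=1.0):
    """Five-thirds law; c_k = 1.0 is a placeholder, not a measured constant."""
    return c_k * eps ** (2.0 / 3.0) * kappa ** (-5.0 / 3.0)


def loglog_slope(f, k1, k2):
    """Slope of ln f against ln kappa between two wave numbers."""
    return (math.log(f(k2)) - math.log(f(k1))) / (math.log(k2) - math.log(k1))


# Illustrative large-scale velocity, length and viscosity (not from the article).
u0, ell0, nu = 1.0, 1.0, 1e-4
eps = u0 ** 3 / ell0                      # energy dissipation law, constant = 1
ell_eps = kolmogorov_length(nu, eps)

tau_at_ell_eps = circulation_time(eps, ell_eps)
tau_viscous = math.sqrt(nu / eps)

slope = loglog_slope(lambda k: energy_spectrum(k, eps), 10.0, 100.0)
dissipation_slope = loglog_slope(
    lambda k: 2.0 * nu * k ** 2 * energy_spectrum(k, eps), 10.0, 100.0
)
print(tau_at_ell_eps, tau_viscous, slope, dissipation_slope)
```

Within the inertial range the dissipation spectrum $2\nu\kappa^2 S(\kappa)$ then grows like $\kappa^{1/3}$, which is consistent with dissipation being concentrated toward the small scales.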
From the energy dissipation law, several relations between characteristic quantities of turbulent flows can be obtained, such as $\ell_0/\ell_\epsilon \sim \mathrm{Re}^{3/4}$, for $\mathrm{Re} = \ell_0 u_0/\nu$. Now, assuming the active scales in a turbulent flow exist down to the Kolmogorov scale $\ell_\epsilon$, one needs a three-dimensional grid with mesh spacing $\ell_\epsilon$ to resolve all the scales, which means that the number $N$ of degrees of freedom of the system is of the order of $N \sim (\ell_0/\ell_\epsilon)^3$ (see Figure 5). This number can be estimated in terms of the Reynolds number by $N \sim \mathrm{Re}^{9/4}$. This relation is important in predicting the computational power needed to simulate all the active scales in turbulent flows.

Figure 5 A schematic representation of a flow structure displaying a range of active scales and a three-dimensional grid with linear dimension $\ell_0$ and mesh length $\ell_\epsilon$, sufficient to represent all the active scales in a turbulent flow. The number of degrees of freedom is the number of blocks: $(\ell_0/\ell_\epsilon)^3$.

Several such universal laws can be deduced and extended to other situations such as turbulent boundary layers, with the famous logarithmic law of the wall. They play a fundamental role in turbulence modeling and closure, for the calculation of the mean flow and other quantities.

Intermittency

The universality hypothesis based on a constant mean energy dissipation rate throughout the flow received some criticisms and was later modified by Kolmogorov in an attempt to account for observed large deviations from the mean rate of energy dissipation. This phenomenon of intermittency is related to the vortex stretching and thinning mechanism, which leads to the formation of coherent structures of vortex filaments of high vorticity and low dissipation (Figure 6). These filaments have diameter as small as the Kolmogorov scale and longitudinal length extending from the Taylor scale up to the large scales, with a lifetime of the order of the large-scale circulation time.

Figure 6 A portion of rotating fluid gets stretched and thinned as the flow speeds up, generating one of many coherent structures of high vorticity and low dissipation.

It has been argued based on experimental evidence that intermittency leads to modified power laws $S_p(\ell) \propto \ell^{\zeta(p)}$, $\zeta(p) < p/3$, for high-order ($p > 3$) structure functions. The issues of intermittency and coherent structures and whether and how they could affect the deductions of the universality theory such as the power laws for the structure functions are far from settled and are currently one of the major and most fascinating issues being addressed in turbulence theory. Several phenomenological theories attempt to adjust the universality theory to the existence of such coherent structures. Multifractal models, for instance, suppose that the eddies generated in the cascade process do not fill up the space and form multifractal structures. Field-theoretic renormalization group develops techniques based on quantum field renormalization theory. Intermediate asymptotics also exploits self-similar analysis and renormalization theory but with a somewhat different flavor. Detailed mathematical analysis of the vorticity equations is also playing a major role in the understanding of the dynamics of the vorticity field.

Mathematical Aspects of Turbulence Theory

From a mathematical perspective, it is fundamental to develop a rigorous background upon which to study the physical quantities of a turbulent flow. The first problem in the mathematical theory is related to the deterministic nature of chaotic systems assumed in dynamical system theory and believed to hold in turbulence. This has actually not been proved for the Navier–Stokes equations. It is in fact one of the most outstanding open problems in mathematics to determine whether, given an initial condition for the velocity field, there exists, in some sense, a unique solution of the Navier–Stokes equations starting with this initial condition and valid for all later times. It has been proved that a global solution (i.e., valid for all later

times) exists but which may not be unique, and it has been proved that unique solutions exist which may not be global (i.e., they are guaranteed to exist as unique solutions only for a finite time).

The difficulty here is the possible existence of singularities in the vorticity field (vorticity becoming infinite at some points in space and time). Depending on how large the singularity set is, uniqueness may fail in strictly mathematical terms. The existence of singularities may not be a purely mathematical curiosity; it may in fact be related to the intermittency phenomenon. Rigorous studies of the vorticity equation may continue to reveal more fundamental aspects of vortex dynamics and coherent structures.

The statistical theory has also been put on a firm foundation with the notion of statistical solution of the Navier–Stokes equations. It addresses the existence and regularity of the probability distribution assumed for turbulent flows and of the fundamental elements of the statistical theory such as correlation functions and spectra. Based on that, a number of relations between physical quantities of turbulent flows may be derived in a mathematically sound and definitive way. This does not replace other theories; it is mostly a mathematical framework upon which other techniques can be applied to yield rigorous results.

Despite the difficulties in the mathematical theory of the Navier–Stokes equations (NSE), some successes have been collected, such as estimates for the number of degrees of freedom in terms of fractal dimensions of suitable sets associated with the solutions of the Navier–Stokes equations, and partial estimates of a number of relations derived in the statistical theory of fully developed turbulence.

See also: Bifurcations in Fluid Dynamics; Geophysical Dynamics; Incompressible Euler Equations: Mathematical Theory; Intermittency in Turbulence; Inviscid Flows; Lagrangian Dispersion (Passive Scalar); Stochastic Hydrodynamics; Variational Methods in Turbulence; Viscous Incompressible Fluids: Mathematical Theory; Vortex Dynamics; Wavelets: Application to Turbulence.

Further Reading

Adzhemyan LT, Antonov NV, and Vasiliev AN (1999) The Field Theoretic Renormalization Group in Fully Developed Turbulence. Amsterdam: Gordon and Breach.
Barenblatt GI and Chorin AJ (1998) New perspectives in turbulence: scaling laws, asymptotics, and intermittency. SIAM Review 40(2): 265–291.
Batchelor GK (1953) The Theory of Homogeneous Turbulence. Cambridge Monographs on Mechanics and Applied Mathematics. New York: Cambridge University Press.
Chorin AJ (1994) Vorticity and Turbulence. Applied Mathematical Sciences, vol. 103. New York: Springer.
Constantin P (1994) Geometric statistics in turbulence. SIAM Review 36: 73–98.
Foias C, Manley OP, Rosa R, and Temam R (2001) Navier–Stokes Equations and Turbulence. Encyclopedia of Mathematics and its Applications, vol. 83. Cambridge: Cambridge University Press.
Friedlander S and Topper L (1961) Turbulence. Classic Papers on Statistical Theory. New York: Interscience Publisher.
Frisch U (1995) Turbulence. The Legacy of A. N. Kolmogorov. Cambridge: Cambridge University Press.
Hinze JO (1975) Turbulence. McGraw-Hill Series in Mechanical Engineering. New York: McGraw-Hill.
Holmes P, Lumley JL, and Berkooz G (1996) Turbulence, Coherent Structures, Dynamical Systems, and Symmetry. Cambridge: Cambridge University Press.
Lesieur M (1997) Turbulence in Fluids. Fluid Mechanics and its Applications, 3rd edn., vol. 40. Dordrecht: Kluwer Academic.
Monin AS and Yaglom AM (1975) Statistical Fluid Mechanics: Mechanics of Turbulence 2. Cambridge: MIT Press.
Sagaut P (2001) Large Eddy Simulation for Incompressible Flows. Berlin: Springer.
Schlichting H and Gersten K (2000) Boundary Layer Theory, 8th edn. Berlin: Springer.
Tennekes H and Lumley JL (1972) A First Course in Turbulence. Cambridge, MA: MIT Press.
Vishik MI and Fursikov AV (1988) Mathematical Problems of Statistical Hydrodynamics. Dordrecht: Kluwer.
Wilcox DC (2000) Turbulence Modeling for CFD, 2nd edn. Anaheim, CA: DCW Industries Inc.

Twistor Theory: Some Applications


L Mason, University of Oxford, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Roger Penrose introduced twistor theory as a geometrical framework for basic physics in order to unify quantum theory and gravity. This program has had many successes along the way, but the long-term goals of reformulating and superseding the established theories of basic physics are still a long way from being fulfilled. Nevertheless, the successes have had many important applications across mathematics and mathematical physics. This article will concentrate on three areas of application: integrable systems, geometry, and perturbative gauge theory (via twistor-string theory). It is intended to be self-contained as far as possible, but the reader may well find it easier to first read the article Twistors.

Twistor Theory

A basic motivation of twistor theory is to bring out the complex (holomorphic) geometry that underlies real spacetime. In general relativity, a spacetime is a 4-manifold with metric $g$ of signature (1, 3), and when it is flat, that is, $g = dt^2 - dx^2 - dy^2 - dz^2$, where $(t, x, y, z)$ are coordinates on $\mathbb{R}^4$, it is called Minkowski space. The first appearance of a complex structure arises from the fact that, at a given event, the celestial sphere of light rays (directions of zero length with respect to $g$) naturally has the structure of the Riemann sphere, $\mathbb{CP}^1$, in such a way that Lorentz transformations (linear transformations of the tangent space preserving the metric) act on this sphere by Möbius transformations. These are the maximal group of complex analytic transformations of $\mathbb{CP}^1$.

Twistor space extends this idea to the whole of Minkowski space. Denoted PT, the twistor space for Minkowski space is complex projective 3-space, $\mathbb{CP}^3$, the space of one-dimensional subspaces of $\mathbb{C}^4$; it is a three-dimensional complex manifold obtained by adding a "plane at infinity" to $\mathbb{C}^3$. Explicitly, we can introduce homogeneous coordinates $Z^\alpha \in \mathbb{C}^4 \setminus \{0\}$ with $\alpha = 0, 1, 2, 3$ but where $Z^\alpha \sim \lambda Z^\alpha$ for $\lambda \in \mathbb{C} \setminus \{0\}$. Affine coordinates on a $\mathbb{C}^3$ chart $Z^3 \neq 0$ can be obtained by setting $(z_1, z_2, \lambda) = (Z^0/Z^3, Z^1/Z^3, Z^2/Z^3)$. Physically, points of twistor space correspond to spinning massless particles in Minkowski space. Mathematically, the correspondence can be understood as the Klein correspondence.

The Klein Correspondence

The correspondence between PT and Minkowski space can be extended first to complexified Minkowski space so that the coordinates are allowed to take on values in $\mathbb{C}$, and then to its conformal compactification by including the "light cone at infinity." It then coincides with the classical complex Klein correspondence. The Klein correspondence is the one-to-one correspondence between lines in $\mathbb{CP}^3$ and points of a four complex-dimensional quadric, CM, in $\mathbb{CP}^5$. The 4-quadric CM can be understood as conformally compactified complexified Minkowski space. Introducing affine coordinates $(z_1, z_2, \lambda)$ on PT and $(t, x, y, z)$ on CM, we find that a point $(t, x, y, z)$ in CM corresponds to a line in PT according to

$$\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} t - z & x + iy \\ x - iy & t + z \end{pmatrix} \begin{pmatrix} 1 \\ \lambda \end{pmatrix}$$

Alternatively, fixing $(\lambda, z_1, z_2)$ in these equations gives a 2-plane in complex Minkowski space corresponding to all the lines in PT through $(\lambda, z_1, z_2)$. Such 2-planes are called "$\alpha$-planes." They are totally null (i.e., the tangent vectors not only have zero length but are also mutually orthogonal) and also self-dual (under the differential geometer's notion of Hodge duality).

This complex correspondence can also be restricted to give correspondences for $\mathbb{R}^4$ with metrics of positive-definite, Euclidean, signature or ultrahyperbolic, (2, 2), signature. A particular simplification in Euclidean signature is that the complex $\alpha$-planes intersect the real slice in a point. The conformal compactification of Euclidean $\mathbb{R}^4$ is the 4-sphere $S^4$ given by adding a single point at infinity, and so we have a projection $p : \mathrm{PT} \to S^4$ whose fibers are holomorphically embedded $\mathbb{CP}^1$s. These fibers can be characterized as the lines in PT that are invariant under a quaternionic complex conjugation, which is an antiholomorphic map $Z \mapsto \hat{Z}$, $\mathrm{PT} \to \mathrm{PT}$, with no fixed points. (Here quaternionic means that on the nonprojective twistor space, $T = \mathbb{C}^4$, the conjugation has the property $\hat{\hat{Z}} = -Z$ so that it defines a second complex structure anticommuting with the standard one; this is sufficient to express $T = Q^2$, where $Q$ denotes the quaternions. The complex structures $i$, $j$, and $k$ of the quaternions are given by identifying $i$ with $\sqrt{-1}$ on $\mathbb{C}^4$, $j$ with the conjugation $\hat{\ }$, and $k = ij$.)

The Penrose Transform

A basic task of twistor theory is to transform solutions to the field equations of mathematical physics into objects on twistor space. This works well for linear massless fields such as the Weyl neutrino equation, Maxwell's equations for electromagnetism, and linearized gravity. In its general form, this transform has become known as the Penrose transform. Such fields correspond to freely prescribable holomorphic functions $f(\lambda, z_1, z_2)$ (or, more precisely, analytic cohomology classes) on regions of twistor space. The field can be obtained from this function by means of a contour integral. The simplest of these integral formulas is

$$\phi(x^a) = \oint f(\lambda,\; t - z + \lambda(x + iy),\; x - iy + \lambda(t + z))\, d\lambda$$

and differentiation under the integral sign leads easily to the fact that $\phi$ satisfies the wave equation

$$\frac{\partial^2 \phi}{\partial t^2} - \frac{\partial^2 \phi}{\partial x^2} - \frac{\partial^2 \phi}{\partial y^2} - \frac{\partial^2 \phi}{\partial z^2} = 0$$

This formula was originally discovered by Bateman. Note that $f$ must have singularities on twistor space to yield a nontrivial $\phi$ and even then, there are many choices of $f$ that yield zero. For a solution $\phi$ defined

over a region $U$ in spacetime, the function $f$ is correctly understood as a representative of a Čech cohomology class defined on the region $U'$ in twistor space swept out by the lines corresponding to points of $U$. Furthermore, the function $f$ should be taken globally to be a function of homogeneity $-2$, $f(\lambda Z^\alpha) = \lambda^{-2} f(Z^\alpha)$. This formula has generalizations to massless fields of all helicities in which a field of helicity $s$ corresponds to a function (Čech cocycle) of homogeneity degree $2s - 2$.

The Penrose transform has found important applications in representation theory and integral geometry. For a review, the reader is referred to Baston and Eastwood (1989), the relevant survey articles in Bailey and Baston (1990), or Mason and Hughston (1990, chapter 1).

Twistor Theory and Nonlinear Equations

The Penrose transforms for the Maxwell equations and linearized gravity turn out to be linearizations of correspondences for the nonlinear analogs of these equations: the Einstein vacuum equations and the Yang–Mills equations. However, the constructions only work when these fields are anti-self-dual. This is the condition that the curvature 2-forms satisfy $F^* = iF$, where $*$ denotes the Hodge dual (which, up to certain factors of $i$, has the effect of interchanging electric and magnetic fields); it is a nonlinear generalization of the right-handed circular polarization condition. Explicitly, in terms of spacetime indices $a, b, \ldots = 0, 1, 2, 3$, $F^*_{ab} = (1/2)\varepsilon_{abcd} F^{cd}$, where $\varepsilon_{0123} = 1$ and $\varepsilon_{abcd} = \varepsilon_{[abcd]}$. In Minkowski signature, the $i$ factor in the anti-self-duality condition implies that real fields cannot be anti-self-dual. Thus, these extensions are not sufficient to fulfill the ambitions of twistor theory to incorporate real classical nonlinear physics in Minkowski space. However, the factor of $i$ is not present in Euclidean and ultrahyperbolic signature, so the anti-self-duality condition is consistent with real fields in these signatures and this is where the main applications of these constructions have been.

The Nonlinear Graviton Construction and Its Generalizations

The first nonlinear twistor construction was due to Penrose (1976), and was inspired by Newman's (1976) construction of "heavens" from the infinities of asymptotically flat spacetimes in general relativity.

The nonlinear graviton construction proceeds from the definition of twistors in flat spacetime as $\alpha$-planes in complexified Minkowski space. It is natural to ask which complexified metrics admit a full family of $\alpha$-surfaces, that is, 2-surfaces that are totally null and self-dual. The answer is that a full family of $\alpha$-surfaces exists if and only if the conformally invariant part of the curvature tensor, the Weyl tensor, is anti-self-dual. If this is the case, twistor space can be defined to be the (necessarily three-dimensional) space of such $\alpha$-surfaces.

The remarkable fact is that the twistor space, together with its complex structure, is sufficient to determine the original spacetime. Twistor space is again a three-dimensional complex manifold, and contains holomorphically embedded rational curves, $\mathbb{CP}^1$s, at least one for each point of the spacetime. However, holomorphic rigidity implies that the family of rational curves is precisely four-dimensional over the complex numbers. Furthermore, incidence of a pair of curves can be taken to imply that the corresponding points in spacetime lie on a null geodesic and this yields a conformal structure on spacetime. Further structures on twistor space can be imposed to give the complex spacetime a metric that is vacuum, perhaps with a cosmological constant. The correspondence is stable under small deformations and so the data defining the twistor space is effectively freely prescribable; see Penrose (1976).

In Euclidean signature, again the complex $\alpha$-planes intersect the real spacetime in a point, so the twistor space again fibers over spacetime. The twistor fibration can be constructed as the projectivized bundle of self-dual spinors or more commonly as the unit sphere bundle in the space of self-dual 2-forms (Atiyah et al. 1978). In the latter formulation, the complex structure on the twistor space arises from the direct sum of the naturally defined complex structures on the horizontal and vertical tangent spaces to the bundle; that on the vertical subspace is the standard one on the sphere, and that on the horizontal subspace is a multiple of the self-dual 2-form at the given point of the fiber.

There are now large families of extensions, generalizations, and reductions of this construction. They are all based on the idea of realizing a space with a given complexified geometric structure as the parameter space of a family of holomorphically embedded submanifolds inside a twistor space. In general, the most useful of these constructions are those in which the "spacetime" is obtained as the space of rational curves in a twistor space. This is because the equations that are solved on the corresponding spacetime can be thought of as a completely integrable system in which the integrability condition for the generalized $\alpha$-surfaces is interpreted as the consistency condition of a Lax

pair or more general linear system. For a more detailed discussion from this point of view, see Mason and Woodhouse (1996, chapter 13).

The Anti-Self-Dual Yang–Mills Equation and Its Twistor Correspondence

The anti-self-dual Yang–Mills equations extend Maxwell's equations for electromagnetism in the right-circularly polarized case. They are a family of equations that depend on a choice of Lie group $G$, usually taken to be a group of complex matrices; Maxwell's equations arise when $G = U(1)$.

Introduce coordinates $x^a$, $a = 0, 1, 2, 3$, on $\mathbb{R}^4$ with metric $ds^2 = dx^0\,dx^3 - dx^1\,dx^2$ (this is a metric of ultrahyperbolic signature; Euclidean signature can be obtained by choosing the coordinates to be complex, but with $(x^3, -x^2)$ the complex conjugates of $(x^0, x^1)$). The dependent variables are the components $A_a$ of a connection $D_a = \partial_a - A_a$, where $\partial_a = \partial/\partial x^a$ and $A_a = A_a(x^b) \in \operatorname{Lie} G$, the Lie algebra of $G$. This connection defines a method of differentiating vector-valued functions $s$ in some representation of $G$. The freedom in changing bases for the vector bundle induces the gauge transformations $A_a \to g^{-1} A_a g - g^{-1}\partial_a g$, $g(x) \in G$, on $A_a$; two connections that are related by a gauge transformation are deemed to be the same. The anti-self-dual Yang–Mills equations are the condition

$$[D_0, D_2] = [D_1, D_3] = [D_0, D_3] + [D_1, D_2] = 0$$

They are the compatibility conditions

$$[D_0 + \lambda D_1,\; D_2 + \lambda D_3] = 0$$

for the linear system of equations

$$(D_0 + \lambda D_1)s = (D_2 + \lambda D_3)s = 0 \qquad [1]$$

where $\lambda \in \mathbb{C}$ and $s$ is an $n$-component column vector; the three conditions above are the coefficients of $\lambda^0$, $\lambda^2$, and $\lambda^1$ in the commutator. These latter equations form a "Lax pair" for the system.

The Ward (1977) construction provides a one-to-one correspondence between gauge equivalence classes of solutions of the anti-self-dual Yang–Mills equations and holomorphic vector bundles on regions in twistor space. The key point here is that eqn [1] defines parallel propagation along $\alpha$-planes. To each point $Z$ in twistor space, we can associate the vector space $E_Z$ of solutions to eqn [1] along the corresponding $\alpha$-plane. These vector spaces vary holomorphically with $Z$ and that is what one means by a holomorphic vector bundle $E \to \mathrm{PT}$. The remarkable fact is that the anti-self-dual Yang–Mills field can be reconstructed up to gauge from $E$, and, in effect, for local analytic solutions, $E$ can be represented by freely prescribable "patching" data consisting of local holomorphic matrix-valued functions on twistor space. To construct the solution on spacetime, one must first find a Birkhoff factorization of the patching data on each Riemann sphere in twistor space corresponding to points of the appropriate region in spacetime. On each Riemann sphere, the Birkhoff factorization starts with the given patching function with values in $GL(n, \mathbb{C})$ on the real axis in the complex plane, and expresses it as a product of functions with values in $GL(n, \mathbb{C})$, one of which extends over the upper-half plane, and the other over the lower-half complex plane. The anti-self-dual connection can be obtained by differentiating the resulting matrices. See Penrose (1984, 1986), Ward and Wells (1990), or Mason and Woodhouse (1996) for a full discussion, and Atiyah (1979) for the formulation appropriate to Euclidean signature.

Completely Integrable Systems

In effect, the twistor constructions amount to providing a geometric general local solution to the anti-self-duality equations; the twistor data is, for a local solution, freely prescribable. In this sense, they demonstrate complete integrability of the anti-self-duality equations. The reconstruction of a solution on spacetime from twistor data is not a quadrature: it involves, in the anti-self-dual Yang–Mills case, a Birkhoff factorization (also sometimes referred to as the solution to a Riemann–Hilbert problem), and in the case of the anti-self-dual Einstein equations, the construction of a family of rational curves inside a complex manifold. Nevertheless, such constructions are a familiar part of the apparatus of the theory of integrable systems.

In Ward (1985), this connection with integrable systems was developed further, and the anti-self-dual Yang–Mills equations were shown to yield many important integrable systems under symmetry reduction. Ward's list has been extended and now includes many of the most famous examples of integrable systems such as the Painlevé equations, the Korteweg–de Vries (KdV) equation, the nonlinear Schrödinger equation, the n-wave equations, and so on; see Mason and Woodhouse (1996) for a review. There are some notable omissions from the list such as the Kadomtsev–Petviashvili (KP) and Davey–Stewartson equations (at least if one restricts oneself to finite-dimensional gauge groups; reductions using infinite-dimensional gauge groups have been obtained).

The list of integrable systems obtainable by symmetry reduction nevertheless remains impressive and provides a route to the classification of at least those integrable systems that can be obtained in this

way. Such systems can be classified by the choice of with the Euclidean signature versions of the original
ingredients required in the symmetry reduction: the Ward construction for anti-self-dual Yang–Mills
gauge group, the group of spacetime symmetries to fields and Penrose’s nonlinear graviton construction
be reduced by, the choice of Euclidean or ultra- for Ricci-flat anti-self-dual metrics but, as we will
hyperbolic signature, and the choice of certain discuss, these constructions have a number of
constants of integration that arise in the reduction. extensions and generalizations.
Another implication is that if an integrable system The first dramatic application of these construc-
can be obtained from one of the self-duality tions was the ADHM construction of Yang–Mills
equations by symmetry reduction, then it inherits a instantons. These areR absolute minima of the Yang–
reduced twistor correspondence because the twistor Mills action, S[A] = tr(F ^ F ) on the 4-sphere, S4 ,
correspondences share the symmetry groups of the with its round metric. A simple argument shows that
spacetime field equations. These twistor correspon- the action is bounded below by the second Chern
dences can be seen to underlie much of the theory of class of the bundle and that this bound is achieved
these equations; for example, Backlund transforma- only for anti-self-dual fields. Thus, the problem was
tions of solutions correspond to elementary alge- to characterize all the anti-self-dual Yang–Mills
braic operations on the twistor data, similarly the fields on S4 . In this Euclidean context, twistor
Kac–Moody Lie algebras of hidden symmetries act space, CP3 , fibers over S4 and the corresponding
locally on the twistor data by matrix multiplication Ward vector bundle is a bundle over all of CP3 . It
of the appropriate loop algebras. Similarly, the turns out that all such bundles satisfying a certain
inverse-scattering transform for the KdV and non- stability condition had been constructed reasonably
linear Schrodinger equations can be seen to arise as explicitly by algebraic geometers. Since the stability
particular presentations of the twistor construction. condition was implied by the context, this could be
By and large, although twistor methods have turned into an algebraic construction of the general
yielded new insight into the geometry and structure instanton explicit enough to give some insight into
of systems in dimensions 1 and 2, they have not both the local and global structure of the solution
necessarily superseded pre-existing techniques for space. See Atiyah (1979) for a review.
constructing solutions and analyzing the solution Hitchin used the Euclidean version of the non-
space. The systems for which twistor methods have linear graviton to develop the theory of gravitational
been particularly effective for constructing solutions instantons that are asymptotically locally Euclidean
and characterizing their properties are in 2 + 1 or (i.e., asymptotically $\mathbb{R}^4/\Gamma$, where $\Gamma$ is a finite
higher dimension. Key examples here are of course subgroup of the rotation group). These were finally
the anti-self-dual Yang–Mills and Einstein equations constructed by Kronheimer who again used twistor
themselves, and their single translation reductions. theory to identify the appropriate parameter space,
In the anti-self-dual Yang–Mills case, these reduc- see his article in Mason et al. (2001) and Dancer’s
tions lead either to Ward’s or Manakov and review of hyper-Kähler manifolds in LeBrun and
Zakharov’s chiral model in Lorentzian signature, Wang (1999).
2 + 1, or the Bogomolny equations for monopoles, Even in four dimensions, there are a number of
the reduction from Euclidean signature. In both variants of the nonlinear graviton construction. The
cases, the twistor construction has played a major basic twistor correspondence produces a twistor
role in constructing and studying the solitonic space that is a complex 3-manifold PT for
solutions. 4-manifolds with conformal structures whose Weyl
See Ward and Wells (1990), Mason and Wood- tensor is anti-self-dual. There are four natural
house (1996), Ward’s article in Huggett et al. (1998) specializations that have attracted study: (1) the
and the first few chapters of Mason et al. (1995), Ricci-flat case, (2) the Einstein case (with nonzero
and Mason et al. (2001) for more examples of cosmological constant), (3) the scalar-flat Kähler
aspects of the theory of integrable systems arising case, and (4) the hypercomplex case.
from twistor correspondences. The twistor space in the Ricci-flat case admits the
additional structure of a fibration over CP1 together
with a holomorphic Poisson structure on the fibers
Applications to Geometry
with values in the pullback of the 1-forms on CP1
These applications are, to a large extent, higher- (alternatively, the bundle of holomorphic 3-forms
dimensional analogs of those discussed above; most should be the pullback of the square of the bundle of
of the problems in geometry to which twistor theory holomorphic 1-forms on CP1 ). The Einstein case
has been applied are those for which the underlying with nonzero cosmological constant is a variant of
differential equations are integrable. These start this in which the twistor space admits a
nondegenerate holomorphic contact structure, that is, a distribution of 2-plane elements, which are only integrable when the cosmological constant vanishes. It also admits a Kähler form when the scalar curvature is positive (in the negative case the corresponding Kähler form is indefinite). For the case of Kähler metrics with vanishing scalar curvature, the twistor space admits a holomorphic volume form with a double pole. The Ricci-flat case is equivalent to the case of hyper-Kähler metrics, those that are Kähler with respect to three different complex structures $I$, $J$, and $K$ satisfying the standard quaternionic relations $IJ = K$, etc. A hypercomplex structure is obtained when one only has the three integrable complex structures satisfying the quaternion relations. Such manifolds admit an underlying conformal structure that is anti-self-dual, and the corresponding twistor space admits a fibration to $\mathbb{CP}^1$.

These constructions have all played a significant role in the general analysis of these geometric structures, and the construction of examples. A striking example of an application of the nonlinear graviton construction to general properties is due to Donaldson and Friedman, who show that if two 4-manifolds admit anti-self-dual conformal structures, then their connected sum does also.

In higher dimensions, most generalizations rely on quaternionic geometry and its reductions. The Euclidean signature formulation of the nonlinear graviton construction has natural extensions to quaternionic manifolds in $4k$ dimensions. These are manifolds with metric whose holonomies are contained in $Sp(k) \cdot Sp(1)$. The latter $Sp(1) = SU(2)$ factor leads to an associated $S^2$ bundle whose total space is the twistor space $\mathbb{PT}$, and it naturally has the structure of a $(2k + 1)$-dimensional complex manifold.

For a series of review articles, the reader is referred to Bailey and Baston (1990, chapters 3 and 4) and also LeBrun and Wang (1999, chapters 2, 5, 6, 10, and 14) which, despite being a book on the distinct subject of Einstein manifolds, is strongly influenced by twistor theory. Other applications along these lines are summarized in Mason et al. (2001, chapter 1).

There are a number of applications that go beyond complete integrability. A striking application is the twistor framework of Merkulov for studying arbitrary geometric structures. This has led to a classification of all possible irreducible holonomies of torsion-free affine connections; see Merkulov’s article in Huggett et al. (1998). Another important area is in the field of conformal invariants, in which the local twistor connection plays a prominent role. This is a connection that is naturally defined on any conformal manifold, being the spinor representation of the Cartan conformal connection. An impressive application here is the construction of conformally invariant differential operators and other conformal invariants. See the article by Baston and Eastwood in Bailey and Baston (1990).

Beyond Classical Integrability: Twistor-String Theory

Until Witten (2004), there was little indication that twistor theory would have much useful to say about Yang–Mills or gravitational fields that are not anti-self-dual. Furthermore, it was problematic to incorporate quantum field theory into twistor ideas. However, twistor-string theory has transformed the situation and has furthermore had impressive applications to the field of perturbative gauge theory.

The story starts with a formulation by Nair of the remarkable Parke–Taylor formulas for the so-called maximal helicity violating (MHV) amplitudes in gauge theory. These are scattering amplitudes at tree level in which helicity conservation is maximally violated; using crossing symmetry to take all the particles to be outgoing, these are amplitudes in which $n - 2$ of the particles have helicity $-1$ and two have helicity $+1$. These amplitudes can be expressed simply as follows. Let the $n$ particles have color $t_i$ in the Lie algebra of the gauge group and null momenta $p_i$ with spinor decompositions $p_i^a = \tilde\pi_i^A \pi_i^{A'}$, $i = 1, \ldots, n$, where the $\pi_i^{A'}$ are self-dual spinors and the $\tilde\pi_i^A$ are anti-self-dual spinors, using the index notation of Spinors and Spin Coefficients, and Twistors. Let $i = r$ and $i = s$ be the two gluons of helicity $+1$. Then the coefficient of the colour term $\mathrm{tr}(t_1 t_2 \cdots t_n)$ is

\[ \delta^4\!\left( \sum_{i=1}^{n} p_i^a \right) \frac{(\pi_r \cdot \pi_s)^4}{\prod_{i=1}^{n} \pi_i \cdot \pi_{i+1}} \]

where $\pi_i \cdot \pi_j = \pi_i^{A'} \pi_{jA'}$ denotes the standard skew-symmetric inner product on chiral spinors and $\pi_{n+1} = \pi_1$. A striking feature is that, except for the delta function, it is holomorphic in the $\pi_i$ except at the simple poles $\pi_i \cdot \pi_{i+1} = 0$. Nair interprets these poles as those associated to fermion correlators in a current algebra on a $\mathbb{CP}^1$ parametrized by $\pi$. Using a supersymmetric formulation adapted to $N = 4$ super Yang–Mills, he formulated the amplitude as arising from an integral over lines in supertwistor space $\mathbb{CP}^{3|4}$.

Witten extends these ideas to give, at least conjecturally, a complete theory. He proposes that full perturbative $N = 4$ super Yang–Mills theory on spacetime is equivalent to a string theory, a topological
B model, on a supersymmetric version of twistor space, $\mathbb{PT}_s = \mathbb{CP}^{3|4}$. This is the space obtained by taking $\mathbb{C}^{4|4}$ with bosonic coordinates $Z^\alpha$, $\alpha = 0, \ldots, 3$, and fermionic coordinates $\eta_i$, $i = 1, \ldots, 4$, modulo the equivalence relation $(Z^\alpha, \eta_i) \sim (\lambda Z^\alpha, \lambda\eta_i)$ where $\lambda \in \mathbb{C}$, $\lambda \neq 0$.

The number 4 here plays two crucial but different roles. It is the maximum number of supersymmetries that Yang–Mills can have; it has the effect of incorporating both the positive and negative helicity parts of the gauge field in the same supermultiplet. It is also the only value of $N$ for which $\mathbb{CP}^{3|N}$ is a Calabi–Yau manifold, and this is a necessary condition for the topological twisted B model to be anomaly-free. The Calabi–Yau condition is the condition that the manifold admit a global holomorphic volume form, which here is

\[ \Omega_s = \varepsilon_{\alpha\beta\gamma\delta}\, Z^\alpha\, dZ^\beta \wedge dZ^\gamma \wedge dZ^\delta \wedge d\eta_1 \wedge d\eta_2 \wedge d\eta_3 \wedge d\eta_4 \]

This is invariant under $(Z^\alpha, \eta_i) \to (\lambda Z^\alpha, \lambda\eta_i)$ because $d(\lambda\eta_i) = \lambda^{-1}\, d\eta_i$, $\lambda \in \mathbb{C}$, which follows from the Berezinian rule of integration $\int \eta\, d\eta = 1$ for anticommuting variables.

Open-string topological twisted B models are known to correspond to holomorphic Chern–Simons theories on their target space. A holomorphic Chern–Simons theory is a theory whose basic variable is a d-bar operator $\bar\partial_A = \bar\partial + A$ on a complex vector bundle $E \to \mathbb{PT}^{3|4}$, where $A$ is a Lie algebra valued (0, 1)-form on the target space, and whose action is

\[ S[A] = \int \left( \tfrac{1}{2} A\, \bar\partial A + \tfrac{1}{3} A^3 \right) \wedge \Omega_s \]

The field equations are $\bar\partial_A^2 = 0$. The classical solutions therefore consist of holomorphic vector bundles on the target space, here $\mathbb{CP}^{3|4}$. The twistor-space representation of the fields is obtained by expanding $A$ in the anticommuting variables $\eta_i$ to obtain

\[ A = a + \eta_i b^i + \eta_i \eta_j c^{ij} + \eta_i \eta_j \eta_k d^{ijk} + \eta_1 \eta_2 \eta_3 \eta_4\, g \]

and $a$ has homogeneity zero, but because the homogeneity of $\eta_i$ is of degree 1, $b^i$ has homogeneity degree $-1$, and so on down to homogeneity degree $-4$ for $g$. Via the Ward construction, the $a$ component corresponds to an anti-self-dual Yang–Mills field on spacetime. The other components of $A$ can be seen to correspond to spacetime fields with helicities $-1/2$ to $+1$ that are background coupled to the anti-self-dual Yang–Mills field.

As it stands, although this holomorphic Chern–Simons theory gives the correct field content of $N = 4$ super Yang–Mills, the couplings are only those of an anti-self-dual sector and more couplings are needed to obtain full $N = 4$ super Yang–Mills. The remarkable fact is that these can be naturally introduced by coupling in certain D1 instantons. The D1 instantons are algebraic curves $C$ in twistor space and the coupling is via a pair of spinor fields $\alpha$ and $\beta$ on $C$ with values in $E$ and $E^*$, respectively, with action

\[ S[\alpha, \beta, A] = \int_C \beta\, \bar\partial_A \alpha \]

This leads to explicit expressions for Yang–Mills scattering amplitudes in terms of integrals of fermion correlators over the moduli spaces of such algebraic curves in supertwistor space. In principle, the integral is over all algebraic curves. However, algebraic curves have two topological invariants, their degree, denoted $d$, and genus $g$. An argument based on a classical scaling symmetry gives that integration over just those curves of degree $d$ gives the subset of processes for which

\[ d = q - 1 + l \]

where $q$ is the number of outgoing particles of helicity $+1$ in the process and $l$ is the number of loops. It is also the case that $g \leq l$.

An elegant formula for the amplitudes is that for the on-shell generating functional for tree-level scattering amplitudes $\mathcal{A}[A]$, where $A$ is the on-shell twistor field, being the above-mentioned (0, 1)-form. The generating functional for processes with $q = d + 1$ external fields of helicity $+1$ is then

\[ \mathcal{A}^d[A] = \int_{C \in \mathcal{M}^d} \det(\bar\partial + A)|_C \; d\mu^d \]

where $d\mu^d$ is a natural measure on the moduli space $\mathcal{M}^d$ of connected rational (genus 0) curves in $\mathbb{CP}^{3|4}$ of degree $d$. This approach has been successfully exploited to obtain implicit algebraic formulas for all tree-level scattering amplitudes.

In an alternative version, the curves of degree $d$ can be taken to be maximally disconnected, being the union of $d$ lines. However, in this approach, we need to also incorporate Chern–Simons propagators which, for tree diagrams, join the lines into a tree. This gives a very flexible calculus for perturbative gauge theory in which scattering processes are obtained by gluing together MHV diagrams. It has been argued that the two formulations are equivalent. On the one hand, the Chern–Simons propagator has a simple pole when the lines meet and the
contour integral over the moduli space can be performed using residues in such a way as to eliminate the Chern–Simons propagators, leaving an integral over $d$ intersecting lines. On the other hand, the measure on the space of connected curves has a simple pole where the curve acquires double points, and again the contour integral can be performed in such a way as to yield the same integral over $d$ intersecting lines.

It should be mentioned that Berkovits has given an alternative version of twistor-string theory which is a heterotic open-string theory with target supertwistor space in which the strings are taken to have boundary on the real slice $\mathbb{RP}^3$ in $\mathbb{CP}^3$ (this is appropriate to a spacetime with split signature) and the D1-instanton expansions are replaced by expansions in the fundamental modes of the string (this is not a topological theory). This gives rise to the same formulas for scattering amplitudes as Witten’s original model.

There have been many applications now of these ideas, perhaps the most striking being the recursion relations of Britto, Cachazo, Feng, and Witten (BCFW), which give, at tree level, on-shell recurrence relations for Yang–Mills scattering amplitudes that suggest a hitherto unsuspected underlying structure for Yang–Mills theory.

Despite all these successes, twistor-string theory is not thought by string theorists to be a good vehicle for basic physics. The most serious problem is that the closed-string sector gives rise to conformal supergravity, which is an unphysical theory. This is particularly pernicious for the analysis of loop diagrams since, from the point of view of string theory, loop diagrams will carry supergravity modes. From this point of view, twistor-string theory is another duality, like AdS-CFT etc., that gives insight into some standard physics but is fundamentally limited.

From the point of view of a twistor theorist, however, twistor-string theory has overcome major obstacles to the twistor programme. Hodges has used the BCFW recursion relations to provide all twistor diagrams for gauge theory. In Mason (2005) it is shown how to derive the main generating function formulas from Yang–Mills and conformal gravity spacetime action principles via twistor space actions for these theories. These twistor actions can in the first instance be expressed purely bosonically and distinctly, and the twistor-string generating function formulas are obtained by expanding and re-summing the classical limit of the path integral in a parameter that expands about the anti-self-dual sector. This allows one to decouple the Yang–Mills and conformal gravity modes, and indeed to work purely bosonically; one is not tied to super Yang–Mills. Although there is much work to be done to extend these ideas to provide a consistent approach to the main equations of basic physics, obstacles that seemed insurmountable a few years ago have been overcome.

See also: Chern–Simons Models: Rigorous Results; Einstein Equations: Exact Solutions; General Relativity: Overview; Instantons: Topological Aspects; Integrable Systems and the Inverse Scattering Method; Riemann–Hilbert Methods in Integrable Systems; Spinors and Spin Coefficients; Twistors; Classical Groups and Homogeneous Spaces; Quantum Mechanics: Foundations; Several Complex Variables: Compact Manifolds; Several Complex Variables: Basic Geometric Theory.

Further Reading
Atiyah MF (1979) Geometry of Yang–Mills Fields: Lezioni Fermiane. Pisa: Accademia Nazionale dei Lincei Scuola Normale Superiore.
Atiyah MF, Hitchin NJ, and Singer IM (1978) Self-duality in four-dimensional Riemannian geometry. Proceedings of the Royal Society A 362: 425.
Bailey TN and Baston R (eds.) (1990) Twistors in Mathematics and Physics, LMS Lecture Notes Series, vol. 156. Cambridge: Cambridge University Press.
Baston RJ and Eastwood MG (1989) The Penrose Transform: Its Interaction with Representation Theory. Oxford: Oxford University Press.
Cachazo F and Svrcek P (2005) Lectures on twistor strings and perturbative Yang–Mills theory, arXiv:hep-th/0504194.
Hitchin N (1987) Monopoles, Minimal Surfaces and Algebraic Curves, Séminaire de Mathématiques Supérieures, vol. 105. NATO Advanced Study Institute. Les Presses de l’Université de Montréal.
Huggett S, Mason LJ, Tod KP, Tsou TS, and Woodhouse NMJ (eds.) (1998) The Geometric Universe. Oxford: Oxford University Press.
LeBrun C and Wang M (eds.) (1999) Essays on Einstein Manifolds, Surveys in Differential Geometry, vol. VI. Boston, MA: International Press.
Mason LJ (2005) Twistor actions for non-self-dual fields, a derivation of twistor-string theory, arXiv:hep-th/0507269.
Mason LJ and Hughston LP (eds.) (1990) Further Advances in Twistor Theory, Volume I: The Penrose Transform and Its Applications. Pitman Research Notes in Maths, vol. 231. Harlow: Longman.
Mason LJ, Hughston LP, and Kobak PZ (1995) Further Advances in Twistor Theory, Volume II: Integrable Systems, Conformal Geometry and Gravitation. Pitman Research Notes in Maths, vol. 232. Harlow: Longman.
Mason LJ, Hughston LP, Kobak PZ, and Pulverer K (eds.) (2001) Further Advances in Twistor Theory, Volume III: Curved Twistor Spaces. Pitman Research Notes in Maths, vol. 424. Boca Raton, FL: Chapman and Hall/CRC Press.
Mason LJ and Woodhouse NMJ (1996) Integrability, Self-Duality and Twistor Theory. Oxford: Oxford University Press.
Newman ET (1976) Heaven and its properties. General Relativity and Gravitation 7(1): 107–111.
Penrose R (1976) Nonlinear gravitons and curved twistor theory. General Relativity and Gravitation 7: 31–52.
Penrose R and Rindler W (1984, 1986) Spinors and Space-Time, vols. I and II. Cambridge: Cambridge University Press.
Ward RS (1977) On self-dual gauge fields. Physics Letters A 61: 81–82.
Ward RS (1985) Integrable and solvable systems and relations among them. Philosophical Transactions of the Royal Society A 315: 451–457.
Ward RS and Wells RO (1990) Twistor Geometry and Field Theory. Cambridge: Cambridge University Press.
Witten E (2004) Perturbative gauge theory as a string theory in twistor space. Communications in Mathematical Physics 252: 189 (arXiv:hep-th/0312171).

Twistors
K P Tod, University of Oxford, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Twistor theory initially arose from two principal motivations: a desire for a conformally invariant calculus for spacetime geometry and fields on spacetime, and a desire to unify and account for the various occurrences of complex numbers and holomorphic functions in mathematical physics, especially in general relativity (Penrose and MacCallum 1973). The theory leads to a nonlocal relation between spacetime and twistor space, whereby a point in one is an extended object in the other. Part of the present-day motivation of the subject is that this nonlocal relation will be a fruitful way to approach the quantization of spacetime. A comparison is often invoked with Hamiltonian mechanics, which is a formal rephrasing of classical mechanics that nonetheless provides a bridge from that theory to quantum mechanics. The hope is that twistor theory has the right character to provide a bridge from general relativity to quantum theory, specifically to quantum gravity.

The principal successes of twistor theory in mathematical physics can be characterized as the linear Penrose transform, which provides a solution of the zero-rest-mass free-field equations in Minkowski space in terms of sheaf cohomology in twistor space, and the nonlinear Penrose transform, which provides solutions of certain nonlinear field equations in terms of holomorphic geometry. These are treated below, together with other applications of twistor theory, following a brief introduction to twistor geometry.

Very recently, there has been a resurgence of interest in twistor theory following Witten’s introduction of twistor string theory (Witten 2003) as a string theory in twistor space. This is not treated here, but this article does provide the necessary background.

Twistor Geometry

General references for this section are the books by Penrose and Rindler (1986) and Huggett and Tod (1994). It will be convenient to use Penrose’s abstract index convention (Penrose and Rindler 1984, 1986), which is also used in Spinors and Spin Coefficients. This can be used wherever vector or tensor indices occur. Suppose that $V$ is a (real or complex) finite-dimensional vector space with dual $V'$. Elements of $V$ are written $v^a, u^b, w^c, \ldots$, where an index $a, b, c, \ldots$ is regarded not as an integer in the range 1 to $\dim V$ but simply as an abstract label indicating that the object to which it is attached is a vector. Elements of $V'$ are similarly written $u_a, v_b, w_c, \ldots$ and elements of the tensor algebra as $t^{ab}{}_{cd}$ according to valence, and so on. The usual operations of tensor algebra are written in the way that component calculations would suggest, but without necessitating a choice of basis. The jump to tensor fields on a manifold $M$ is immediate. A metric is a particular field $g_{ab}$ and determines a Levi-Civita connection $\nabla_a$, which defines maps $\nabla_a : v^b \to \nabla_a v^b$ and similarly for other valences. The virtue of the formalism is that, while remaining invariant, it can harness the strength and flexibility of calculations in components.

With this understanding, twistors may first be defined as the fundamental representation of $SU(2, 2)$, so that they are elements $Z^\alpha$ of a four-dimensional complex vector space $\mathbb{T}$. $\mathbb{T}$ carries a Hermitian form $\Sigma$ of signature $(+ + - -)$ which is made explicit below and which provides an isomorphism from the complex conjugate of $\mathbb{T}$ to its dual. This isomorphism is used to eliminate all appearances of complex-conjugate twistors from the formalism and is therefore regarded as an antilinear map to the dual. $SU(2, 2)$ is the double cover of $O(2, 4)$, the rotation group of $E^{2,4}$, the six-dimensional space with flat metric $\eta_{2,4}$ of signature $(+ - - - + -)$, which in turn is the double cover of $C(1, 3)$, the conformal group of Minkowski space $M$. This last group homomorphism may be made explicit as follows (suspending the abstract-index convention for the duration of this
aside): introduce pseudo-Cartesian coordinates $x^a = (x^0, x^1, x^2, x^3)$ on $M$ and $y^\alpha = (y^0, y^1, y^2, y^3, y^4, y^5)$ on $E^{2,4}$. The corresponding metrics are

\[ \eta_{1,3} = \eta_{ab}\, dx^a dx^b = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2 \qquad [1] \]

\[ \eta_{2,4} = \eta_{\alpha\beta}\, dy^\alpha dy^\beta = (dy^0)^2 - (dy^1)^2 - (dy^2)^2 - (dy^3)^2 + (dy^4)^2 - (dy^5)^2 \qquad [2] \]

We map $M$ into $E^{2,4}$ by

\[ \iota(x^a) = \left( x^0, x^1, x^2, x^3, (1 - \sigma)/2, (1 + \sigma)/2 \right) \qquad [3] \]

where $\sigma = \eta_{ab} x^a x^b$ with $\eta_{ab}$ as in [1], and it can be checked that $\iota(M)$ is the intersection of the null cone $N$ of the origin in $E^{2,4}$ with the plane $P$ defined by $y^4 + y^5 = 1$. $P$ is in fact a null hyperplane in $E^{2,4}$ and any point of $N$ not on the null hyperplane defined by

\[ y^4 + y^5 = 0 \qquad [4] \]

can be mapped along the generators of $N$ to a unique point of $P$ (recall that any point on a cone lies on a line through the vertex: these lines are the generators). Thus, the image of $M$ under $\iota$ gives a point on every generator of $N$ except those satisfying [4]. It can also be seen from [2] that the intrinsic metric in $E^{2,4}$ on the intersection of $N$ and $P$ is just $\eta_{1,3}$.

Now let $PN$ be the projective null cone, or, equivalently, the space of generators of $N$. This is a compact manifold with topology $S^1 \times S^3$, as one can see by intersecting $N$ with the sphere

\[ (y^0)^2 + (y^1)^2 + (y^2)^2 + (y^3)^2 + (y^4)^2 + (y^5)^2 = 2 \]

Each generator meets this sphere twice at, say, $y^\alpha$ and $-y^\alpha$, and $PN$ is the quotient by this identification of the two surfaces

\[ (y^0)^2 + (y^4)^2 = 1 = (y^1)^2 + (y^2)^2 + (y^3)^2 + (y^5)^2 \]

which define the intersection. The metric $\eta_{2,4}$ defines a degenerate metric on $N$, which, however, is nondegenerate on any smooth cross section of $N$ which meets each generator once. Furthermore, the map along the generators between any two such cross sections is conformal. Thus, there is a conformal metric on $PN$ and it is conformal to $\eta_{1,3}$. We call $PN$ compactified Minkowski space $M^c$ as it is compact and has the same conformal metric as Minkowski space. It can be thought of as $M$ compactified by the addition of some points, namely the points of $PN$ corresponding to the generators satisfying [4]. To interpret these, we consider the points satisfying the similar equation $y^4 - y^5 = 0$. By inspection of $\iota$, [3], we see that these points correspond to the light cone of the origin in $M$. Thus, $M^c$ is obtained from $M$ by adding a single light cone, the light cone at infinity known as $\mathscr{I}$ and read as “scri,” short for “script-I.”

Now the rotation group $O(2, 4)$ of $E^{2,4}$ maps $N$ to itself preserving the metric and consequently maps $PN$ to itself, preserving the conformal metric. Thus, $O(2, 4)$ defines conformal transformations of $M^c$ and a count of dimension shows that it is locally isomorphic to the conformal group $C(1, 3)$. The map is two-to-one with $-I$ in $O(2, 4)$ mapping to $I$ in $C(1, 3)$. The fact that $SU(2, 2)$ is four-to-one homomorphic to $C(1, 3)$ follows from calculations below. It is because of this homomorphism of $SU(2, 2)$ and $C(1, 3)$ that the geometry and analysis of twistors (i.e., twistor theory) provides a formalism adapted to conformally invariant or conformally covariant notions in $M$ or $M^c$.

A twistor may be expressed in terms of two-component spinors of $SL(2, \mathbb{C})$, the double cover of the Lorentz group, as follows:

\[ Z^\alpha = (\omega^A, \pi_{A'}) \qquad [5] \]

where again indices are abstract, so that

\[ \mathbb{T} = S \oplus \bar{S}' \]

in terms of the spin space $S$ and complex-conjugate dual spin space $\bar{S}'$ of $M$. Now we can write the action of infinitesimal elements of $C(1, 3)$ explicitly as

\[ \hat\omega^A = \phi^A{}_B\, \omega^B - i T^{AA'} \pi_{A'} + \Lambda\, \omega^A, \qquad \hat\pi_{A'} = \bar\phi_{A'}{}^{B'} \pi_{B'} + i B_{AA'}\, \omega^A - \Lambda\, \pi_{A'} \qquad [6] \]

where $T^{AA'}$ (a real vector) defines an infinitesimal translation, $B_{AA'}$ (another real vector) defines an infinitesimal special conformal transformation, $\Lambda$ (a real constant) defines a dilatation and the (real) bivector $M_{ab} = \phi_{AB}\, \varepsilon_{A'B'} + \bar\phi_{A'B'}\, \varepsilon_{AB}$ defines an infinitesimal rotation. This gives a total of 15 parameters for the transformation, which is the correct dimension for $C(1, 3)$.

The Hermitian form $\Sigma(\,,\,)$ can be written as

\[ \Sigma(Z, Z) = Z^\alpha \bar{Z}_\alpha = \omega^A \bar\pi_A + \bar\omega^{A'} \pi_{A'} \qquad [7] \]

when it can be checked that the transformations [6] leave it invariant (and that its signature is $(+ + - -)$; this establishes that $SU(2, 2)$ is locally isomorphic to $C(1, 3)$). Equation [7] will be referred to as the norm of a twistor.
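Two of the claims in this section lend themselves to a quick numerical check: that the map [3] lands on the null cone of the metric [2] inside the plane $y^4 + y^5 = 1$ and reproduces Minkowski intervals, and that the norm [7] has signature $(+ + - -)$. The following is a minimal sketch in plain Python. The function names are illustrative only, and the component form of [7] (summing $\omega^A$ against the complex conjugate of $\pi_{A'}$) is an assumption about how the abstract indices are realized.

```python
import random

# Signature (+ - - - + -) of the flat metric on E^{2,4}, as in [2].
ETA_24 = (1, -1, -1, -1, 1, -1)


def sigma(x):
    """Minkowski quadratic form of a 4-vector with signature (+ - - -), as in [1]."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


def embed(x):
    """The map [3] from Minkowski space into E^{2,4}."""
    s = sigma(x)
    return (x[0], x[1], x[2], x[3], (1 - s) / 2, (1 + s) / 2)


def eta24(y, z):
    """Inner product of two vectors of E^{2,4} in the metric [2]."""
    return sum(g * a * b for g, a, b in zip(ETA_24, y, z))


def twistor_norm(omega, pi):
    """The norm [7] in components: omega . conj(pi) plus its complex conjugate."""
    return 2 * sum((w * p.conjugate()).real for w, p in zip(omega, pi))


def norm_as_squares(omega, pi):
    """The same quantity as two positive squares minus two negative squares."""
    plus = sum(abs(w + p) ** 2 for w, p in zip(omega, pi))
    minus = sum(abs(w - p) ** 2 for w, p in zip(omega, pi))
    return (plus - minus) / 2


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-2, 2) for _ in range(4)]
    xp = [rng.uniform(-2, 2) for _ in range(4)]
    y, yp = embed(x), embed(xp)
    chord = [a - b for a, b in zip(y, yp)]
    print("eta24(y, y)          :", eta24(y, y))
    print("y4 + y5              :", y[4] + y[5])
    print("chord vs. interval   :", eta24(chord, chord),
          sigma([a - b for a, b in zip(x, xp)]))
```

Rewriting the norm as $\tfrac{1}{2}\sum_A \left( |\omega^A + \pi_A|^2 - |\omega^A - \pi_A|^2 \right)$ exhibits two positive and two negative directions, which is the signature statement. The chord identity shows that two points of $M$ are null separated exactly when their images in $E^{2,4}$ are orthogonal.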
0
From [6], a twistor Z = (!A, A0 ) gives rise, under tangent vector (proportional to) A A . Twistors with
0
translation by a variable xAA , to a spinor field A norm zero are called null and the (five-dimensional,
given by real) submanifold of them in PT is PN. This is a
0
compactification of the space of (unscaled) null
A ¼ !A  ixAA A0 ½8 geodesics in M by the inclusion of the 2-sphere of
null geodesics in Mc which lie on the light cone at I .
Differentiating [8] and symmetrizing, we see that A
For use in the next section, we note the definition of
satisfies the differential equation
PTþ and PT as the projective twistors with positive
rA0 ðA BÞ ¼ 0 ½9 and negative norm, respectively.
To summarize, we have found M and Mc :
which is known as the twistor equation. In fact, the (complex projective) lines in PT define points of
general solution of [9] takes the form of [8] for CMc ; lines in PN define points of Mc with one such,
constant spinors !A and A0 . Furthermore, the call it I, picked out as the vertex of the null cone I ;
conformal group can be shown directly to act on lines in PN which meet I correspond to points of I ;
solutions of [9], so that twistor theory can begin lines in PN which do not meet I correspond to
with the study of [9] and its solutions. In this points in M. As for CMc , the conformal structure of
approach, a twistor is precisely a solution of [9]. M and Mc is determined by incidence in PN. We
Given a spinor field A of the form of [8], we may may now note the nonlocal correspondence men-
seek the points of M where it vanishes. In general, there tioned in the introduction: points in CMc are lines in
are none, but if we consider complexified Minkowski PT and points in PT are -planes in CMc .
space CM, then A vanishes on a two-dimensional It will be convenient to refer to the line in PT
complex plane with the property that every tangent associated with a point x in CMc as Lx . With this
0
vector is of the form A A for varying A and fixed notation, it is possible to characterize the forward or
0
A . The 2-plane is flat and totally null, in that the future tube in terms of twistor space: a point x of
(analytically extended) Minkowski metric vanishes CM is in the forward tube iff its imaginary part is
identically on it, and it has a self-dual (SD) tangent timelike and past-pointing, and this is equivalent to
0
bivector determined by A . Such a 2-plane is known as Lx lying in PT.
an -plane (reserving the term -plane for a totally null The starting point for Riemannian twistor theory is
2-plane with anti-self-dual (ASD) tangent bivector). At the fact that CP3 is a fibration with fiber CP1 over
a given point p in CM, there is an -plane for each S4, where the fiber above a point p can be interpreted
choice of A0 up to scale (in other words, for each as the almost-complex structures at p (since this is the
element of the projective (primed) spin space at p) which is a copy of the complex projective line, $\mathbb{CP}^1$.

The $\alpha$-plane is determined by the twistor up to scale (in that a constant complex multiple of the field determines the same $\alpha$-plane). Thus, we consider the projective twistor space PT which, since T is $\mathbb{C}^4$, is a copy of complex projective 3-space, $\mathbb{CP}^3$. This is now the space of $\alpha$-planes, but is also compact. We define complexified, compactified Minkowski space $\mathrm{CM}_c$ as the space of all (complex projective) lines in PT; then it is easy to see that this includes CM as an open dense subset. PT is the space of $\alpha$-planes in $\mathrm{CM}_c$ and two lines meet in PT iff the corresponding points in $\mathrm{CM}_c$ lie on an $\alpha$-plane, or, equivalently, iff they are null separated. Thus, the conformal structure in $\mathrm{CM}_c$ is determined by incidence of lines in PT.

To find M and $\mathrm{M}_c$ in this picture, we seek $\alpha$-planes containing real points. If the field from [8] vanishes at a real $x^{AA'}$, then the contraction $\omega^A \bar\pi_A$ must be purely imaginary, so that, by [7], the norm of the twistor is zero. Conversely, one calculates that the field can indeed vanish at real points if the norm is zero, and that it will then in fact vanish along a null geodesic with

same as the projective primed spin space at p). In the picture developed above, this means that there is an $S^4$'s worth of lines filling out $\mathbb{CP}^3$, no two of which intersect (so that there are no null vectors and the metric is definite). The complexification of $S^4$ with its conformal structure is again $\mathrm{CM}_c$.

If a twistor has nonzero norm, say $Z^\alpha \bar Z_\alpha = s \neq 0$, then it can be interpreted as a massless particle with spin $s$: the momentum is $p_a = \bar\pi_A \pi_{A'}$ and the angular momentum bivector is

\[ M^{ab} = i\,\omega^{(A}\bar\pi^{B)}\epsilon^{A'B'} - i\,\bar\omega^{(A'}\pi^{B')}\epsilon^{AB}. \]

The angular momentum transforms appropriately under translation by virtue of [6] and the (Pauli–Lubanski) spin vector is $s\,p_a$, as it should be for a massless spinning particle.

The Linear Penrose Transform: Zero-Rest-Mass Free Fields

A zero-rest-mass free field of spin $s$ is a symmetric spinor field $\phi_{AB\ldots C}$ with $2s$ indices which satisfies the field equation

\[ \nabla^{A'A}\phi_{AB\ldots C} = 0 \qquad [10] \]
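The reality condition and the massless-particle interpretation described above can be checked numerically. The following is only a minimal sketch under stated assumptions, not part of the original article: a twistor is taken as a pair (omega, pi) of two-component complex spinors, the norm is taken to be omega^A pibar_A plus its complex conjugate, points of Minkowski space are represented by 2x2 matrices that are Hermitian exactly when the point is real, and all function names are hypothetical.

```python
# Assumed conventions (illustrative only): Z = (omega, pi) with omega, pi in C^2,
# norm(Z) = omega^A pibar_A + complex conjugate, and x^{AA'} a 2x2 complex matrix
# that is Hermitian when the point x is real.

def incidence(x, pi):
    """Return omega^A = i x^{AA'} pi_{A'}, i.e. the twistor whose alpha-plane passes through x."""
    return [1j * (x[a][0] * pi[0] + x[a][1] * pi[1]) for a in range(2)]

def twistor_norm(omega, pi):
    """Return omega^A pibar_A + c.c., which equals twice the real part of the contraction."""
    contraction = sum(w * p.conjugate() for w, p in zip(omega, pi))
    return 2 * contraction.real

def hermitian_point(t, x1, x2, x3):
    """Build a Hermitian matrix for a real point (overall normalisation dropped)."""
    return [[t + x3, x1 - 1j * x2], [x1 + 1j * x2, t - x3]]

def momentum_matrix(pi):
    """Return the rank-one matrix pibar^A pi^{A'} representing the momentum."""
    return [[pi[a].conjugate() * pi[b] for b in range(2)] for a in range(2)]

def det2(m):
    """Determinant of a 2x2 matrix; it plays the role of the Lorentzian length squared."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

pi = [0.3 + 1.1j, -0.7 + 0.2j]

# A twistor through a real point has zero norm.
x_real = hermitian_point(1.5, -0.4, 2.0, 0.25)
norm_at_real_point = twistor_norm(incidence(x_real, pi), pi)

# A twistor through a generic complex point does not.
x_complex = [[1.0 + 0.5j, 0.2], [0.1j, -0.3]]
norm_at_complex_point = twistor_norm(incidence(x_complex, pi), pi)

# The momentum built from pi is null: its matrix has vanishing determinant.
null_momentum_det = det2(momentum_matrix(pi))
```

With these conventions the contraction omega^A pibar_A equals i times a Hermitian quadratic form in pi, which is why it is purely imaginary at a real point and the norm vanishes there.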
The Weyl neutrino equation, source-free Maxwell equation, and linearized Einstein vacuum equation are examples of zero-rest-mass free-field equations, with spins 1/2, 1, and 2, respectively, so that these are equations of physical interest. Conventionally, one takes the $s = 0$ case to be the wave equation, and the complex-conjugate fields $\bar\phi_{A'B'\ldots C'}$ to have the same spin but opposite helicity.

The conformal group acts on solutions of [10], so that the equations are conformally invariant. The equations can be solved by contour integral expressions involving homogeneous functions of a twistor variable. To be explicit, we define an operation $\rho_x$ of restriction to the line $L_x$ for a function of a twistor variable by the following:

\[ \rho_x f(Z^\alpha) = f(i x^{AA'}\pi_{A'}, \pi_{A'}) \qquad [11] \]

Now suppose that $f(Z^\alpha)$ is holomorphic and homogeneous of degree $-2s-2$ in the twistor variable for positive integer $2s$, but otherwise arbitrary, and consider the integral

\[ \phi_{A'B'\ldots C'}(x) = \oint \pi_{A'}\pi_{B'}\cdots\pi_{C'}\,\rho_x f(Z^\alpha)\,\epsilon^{E'F'}\pi_{E'}\,d\pi_{F'} \qquad [12] \]

where there are $2s$ indices on $\phi$ and the integration is around a contour in the line $L_x$ in PT. The choice of homogeneity ensures that the integral is well defined but, to obtain a nonzero answer, $\rho_x f$ must have some singularities as a function of $\pi_{A'}$ on $L_x$. The answer then automatically gives a helicity-$(-s)$ solution of [10], as may be checked by differentiation under the integral sign.

For a helicity-$s$ solution, we take an arbitrary function $f(Z^\alpha)$, holomorphic and of homogeneity $(2s-2)$, and consider the integral

\[ \phi_{AB\ldots C}(x) = \oint \rho_x\!\left(\frac{\partial}{\partial\omega^A}\frac{\partial}{\partial\omega^B}\cdots\frac{\partial}{\partial\omega^C} f(Z^\alpha)\right)\epsilon^{E'F'}\pi_{E'}\,d\pi_{F'} \qquad [13] \]

where there are $2s$ indices on $\phi$ and the integration is again around a contour in the line $L_x$. As before, one needs singularities to make the contour integral nonzero, but again the result satisfies [10].

The correct framework in which to understand these integrals is sheaf cohomology theory. For [12], the functions with singularities are actually elements of $H^1(\hat U, O(-2s-2))$, the first cohomology group of a region $\hat U$ in PT with coefficients in the sheaf of germs of holomorphic functions of homogeneity $-2s-2$, while the fields are elements of $H^0(U, Z_s)$, the zeroth cohomology group of the corresponding region $U$ of M with coefficients in helicity-$s$ zero-rest-mass fields (thus, $\hat U$ must contain the neighborhood of lines $L_x$ for points $x$ in $U$). Similarly, [13] is interpreted cohomologically in terms of potentials modulo a gauge. With appropriate conditions on $\hat U$ and $U$ (for brevity, $U$ is said to be elementary), these groups can be shown to be isomorphic and this isomorphism is known as the Penrose transform (Ward and Wells 1991). A particular instance of an elementary $U$ is the forward tube, when $\hat U$ is PT. Since the definition of positive frequency is holomorphicity on the forward tube, this observation geometrizes the notion of positive frequency in terms of twistor space.

For free fields with mass, there are generalizations of [12] and [13] to solve the Dirac equation for different spins. However, the integrands now involve functions of more than one twistor variable, subject to an equation. This equation is a counterpart of the Klein–Gordon equation and breaks the conformal invariance (as it must, since mass does). It can be imposed by a projection which can in turn be written as a contour integral over arbitrary holomorphic functions. It has been argued that the appropriate description of leptons and hadrons in twistor theory is with functions of two and three twistor variables, respectively. Such a function has two or three integer quantum numbers determined by the homogeneities in different variables, and this leads to a twistor particle classification scheme (see, e.g., Hughston and Sheppard (1980) and Sparling (1981)), similar in many respects to, but not identical with, the standard classifications.

Given that free fields, massive or massless, are determined from arbitrary twistor functions through contour integrals, one may translate the Feynman diagrams of a quantum field theory into contour integrals over twistor functions. In the massless case, the contours are compact, so that the integrals are finite without need for renormalization. The massive case is more complicated but essentially parallel. This is twistor diagram theory and there is a substantial literature on it (see, e.g., the article by Hodges in the volume edited by Huggett et al. (1998)). There is currently no new physical theory, distinct from a known quantum field theory, to generate the relevant diagrams.

The Nonlinear Penrose Transform: Curved Twistor Spaces

The electromagnetic field, in Minkowski space say, can be regarded as a spinor field subject to field equations, in which case these equations can be
solved via the Penrose transform by contour integrals. Alternatively, it can be seen as the curvature of a connection on a U(1) bundle over M, which is a more active role for the field in curving a bundle. For SD or ASD electromagnetic fields, there are analogous active twistor constructions. From an ASD electromagnetic field, one may define a connection on the primed spin space of CM which is flat on $\alpha$-planes: if the tangents to the $\alpha$-plane are of the form $\lambda^A \pi^{A'}$ for varying $\lambda^A$ and with $\pi^{A'}$ fixed up to scale, then consider the propagation of $\pi_{A'}$ around the $\alpha$-plane given by

\[ \pi^{A'}\left(\nabla_{A'A} - i A_{A'A}\right)\pi_{B'} = 0 \qquad [14] \]

where $A_{A'A}$ is a potential for the electromagnetic field. This connection is flat provided

\[ \pi^{A'}\pi^{B'}\nabla_{AA'}A^{A}{}_{B'} = 0 \qquad [15] \]

and if this is to hold for all $\pi^{A'}$ then $\nabla_{A(A'}A^{A}{}_{B')}$ vanishes and the electromagnetic field, defined as usual as the exterior derivative of the potential, is necessarily ASD. Now the space of $\alpha$-planes in CM is projective twistor space PT, so we define a holomorphic $\mathbb{C}$ bundle T over PT by taking the fiber above an $\alpha$-plane to be choices of $\pi_{A'}$ scaled as in [14]. If we restrict attention to the $\alpha$-planes through a given point p of CM, then by comparing the scalings at p we can trivialize the bundle; thus, T is trivial on lines in PT. There is a converse to this construction and we have: there is a one-to-one correspondence between holomorphic $\mathbb{C}$ bundles on a region $\hat U$ in PT which are trivial on lines and ASD electromagnetic fields on the corresponding region $U$ of CM (for elementary $U$).

This construction can be extended to solve the ASD Yang–Mills equations with holomorphic vector bundles replacing holomorphic line bundles: with $\hat U$ and elementary $U$ as above, there is a natural one-to-one correspondence between ASD GL(n, $\mathbb{C}$) gauge fields on $U$ and holomorphic rank-n vector bundles E over $\hat U$ which are trivial on $L_x$ for every $x$ in $U$. ASD Yang–Mills fields cannot be real on M, but using Riemannian twistor theory, one can impose appropriate reality and globality conditions to ensure that these ASD Yang–Mills fields are both real and globally defined on $S^4$. These are then instantons. The Atiyah–Drinfeld–Hitchin–Manin (ADHM) construction of instantons (Atiyah et al. 1978) proceeds via construction of the corresponding holomorphic vector bundles over twistor space.

The construction of ASD Yang–Mills fields is also the starting point for the twistor theory of integrable systems (Mason and Woodhouse 1996), following the observation that many of the known completely integrable partial differential equations (PDEs) (including the sine-Gordon, Korteweg–de Vries (KdV) and nonlinear Schrödinger equations) are reductions of the ASD Yang–Mills equations. Solutions of these other integrable systems can be given in terms of a geometrical construction, usually of some structure in holomorphic geometry.

The other major active twistor construction, which historically preceded the Yang–Mills one, is Penrose's nonlinear graviton (Penrose 1976), which solves the ASD Einstein vacuum equations. For this, one starts from a complex, four-dimensional manifold $\mathcal{M}$ with holomorphic metric, vanishing Ricci curvature and ASD Weyl tensor. These conditions on the curvature are necessary and sufficient to allow the existence of $\alpha$-surfaces, which generalize $\alpha$-planes. They are two-dimensional totally null (complex) surfaces with SD tangent bivector, one for each choice of (null) SD bivector, or, equivalently, for each choice of primed spinor, at each point.

The space of $\alpha$-surfaces is a three-dimensional complex manifold, the curved twistor space $\mathcal{PT}$. This is curved inasmuch as it is not now (part of) $\mathbb{CP}^3$, but it still contains complex projective lines: given a point p in $\mathcal{M}$ there is an $\alpha$-surface through p for every primed spinor at p up to scale; these $\alpha$-surfaces make up a projective line $L_p$ in $\mathcal{PT}$. The conditions on the curvature are equivalent to the statement that the Levi-Civita connection is flat on primed spinors, so that there exist constant primed spinors in $\mathcal{M}$, and the tangent bivector to an $\alpha$-surface can be taken to be constant, without loss of generality. The map associating a constant primed spinor with each $\alpha$-surface defines a projection from $\mathcal{PT}$ to $\mathbb{CP}^1$, so that $\mathcal{PT}$ is a fibration over $\mathbb{CP}^1$. The lines $L_p$ define a four-parameter family of sections of this fibration.

To define the metric of $\mathcal{M}$ from $\mathcal{PT}$, one needs the notion of normal bundle: the normal bundle of a submanifold Y in a manifold X is $N = TX|_Y / TY$ in terms of the tangent bundles TX and TY. The normal bundle $N_p$ of a particular section $L_p$ is the same in $\mathcal{PT}$ as it was in PT, namely $H \oplus H$, where H is the hyperplane-section line bundle over $\mathbb{CP}^1$ (Ward and Wells 1991). A section $S_V$ of $N_p$ corresponds to a vector V in $T_p\mathcal{M}$ (think of it as an infinitesimally neighboring point in $\mathcal{M}$) and V is defined to be null iff $S_V$ has a zero. Because of the nature of $N_p$, this defines a quadratic conformal metric, which, furthermore, agrees with the conformal metric on $\mathcal{M}$ and generalizes the definition of conformal metric for $\mathrm{CM}_c$ in terms of incidence in PT. To define the actual metric, as opposed to just the conformal metric, one has a covariant-constant
choice of $\epsilon^{A'B'}$ in $\mathcal{M}$ which defines a form on the base of the fibration, and a Poisson structure on the fibers of the projection. The definition of the latter is more intricate, but the two structures enable the metric of $\mathcal{M}$ to be recovered from $\mathcal{PT}$. Penrose (1976) and Huggett and Tod (1994) provide more details.

Now the metric and curvature properties of $\mathcal{M}$ are coded into holomorphic properties of $\mathcal{PT}$ together with these two structures. These properties characterize $\mathcal{M}$: subject to topological conditions on $\mathcal{M}$, there is a one-to-one correspondence between holomorphic solutions $\mathcal{M}$ of the Einstein vacuum equations with ASD Weyl tensor and three-dimensional complex manifolds $\mathcal{PT}$ fibered over $\mathbb{CP}^1$, with a four-parameter family of sections, each with normal bundle $H \oplus H$, and the two forms as above.

In fact, one only needs to assume the existence of one section with the correct normal bundle and the full four-parameter family will automatically exist, at least near to the initial one. Penrose (1976) showed how curved twistor spaces with the necessary structures could be obtained by deforming the neighborhood of a line in the "flat" twistor space PT. The Kodaira–Spencer theory of complex deformations ensures that the necessary lines continue to exist under this deformation.

The original nonlinear graviton construction has been extended in various ways including the following: to allow the possibility of a cosmological constant (Ward and Wells 1991); to produce real, Riemannian solutions (Hitchin 1995); to solve other but related field equations (e.g., those for hypercomplex metrics, scalar-flat Kähler metrics or Einstein–Weyl structures).

The search for a twistor construction of the SD Einstein equations (distinct from a construction in terms of dual twistors, which is, of course, provided by deforming dual twistor space) is an active area of research. This and other applications of twistor theory, including a quasilocal definition of mass in general relativity, the classification of affine holonomies and the construction of four-dimensional conformal field theories, may be found in the literature cited in the "Further Reading" section.

See also: Classical Groups and Homogeneous Spaces; Clifford Algebras and Their Representations; Integrable Systems: Overview; Quantum Field Theory: A Brief Introduction; Quantum Mechanics: Foundations; Relativistic Wave Equations Including Higher Spin Fields; Riemann–Hilbert Problem; Spinors and Spin Coefficients; Twistor Theory: Some Applications.

Further Reading

Atiyah MF, Hitchin NJ, Drinfeld VG, and Manin YuI (1978) Construction of instantons. Physics Letters A 65: 185–187.
Bailey TN and Baston RJ (eds.) (1990) Twistors in Mathematics and Physics, LMS Lecture Note Series, vol. 156. Cambridge: Cambridge University Press.
Frauendiener J and Penrose R (2001) Twistors and general relativity. In: Engquist B and Schmid W (eds.) Mathematics Unlimited – 2001 and Beyond. Berlin: Springer.
Hitchin NJ (1995) Twistor spaces, Einstein metrics and isomonodromic deformations. Journal of Differential Geometry 42: 30–112.
Hitchin NJ, Segal GB, and Ward RS (1999) Integrable Systems. Twistors, Loop Groups, and Riemann Surfaces, Oxford Graduate Texts in Mathematics, vol. 4. Oxford: Oxford University Press.
Huggett SA and Tod KP (1994) An Introduction to Twistor Theory, LMS Student Text, vol. 4. Cambridge: Cambridge University Press.
Huggett SA, Mason LJ, Tod KP, Tsou TS, and Woodhouse NMJ (eds.) (1998) The Geometric Universe – Science, Geometry and the Work of Roger Penrose. Oxford: Oxford University Press.
Hughston LP and Sheppard M (1980) On the magnetic moments of hadrons. Reports on Mathematical Physics 18: 53–66.
Mason LJ, Hughston LP, Kobak PZ, and Pulverer K (eds.) (1995, 1998, 2001) Further Advances in Twistor Theory, vols. 1–3. Boca Raton, FL: Pitman Advanced Publishing Programme and Chapman and Hall, CRC.
Mason LJ and Woodhouse NMJ (1996) Integrability, Self-Duality, and Twistor Theory, LMS Monographs Series. Oxford: Clarendon.
Penrose R (1976) Nonlinear gravitons and curved twistor theory. General Relativity and Gravitation 7: 31–52.
Penrose R (1999) The central programme of twistor theory. Chaos Solitons Fractals 10: 581–611.
Penrose R and MacCallum MAH (1973) Twistor theory: an approach to the quantisation of fields and space–time. Physics Reports 6C: 241–316.
Penrose R and Rindler W (1984, 1986) Spinors and Space–Time, vols. 1 and 2. Cambridge: Cambridge University Press.
Sparling GAJ (1981) Theory of massive particles. I. Algebraic structure. Philosophical Transactions of the Royal Society of London A 301: 27–74.
Ward RS and Wells RO (1991) Twistor Geometry and Field Theory. Cambridge: Cambridge University Press.
Witten E (2003) Perturbative gauge theory as a string theory in twistor space, hep-th/0312171.
Two-Dimensional Conformal Field Theory and Vertex Operator Algebras

M R Gaberdiel, ETH Zürich, Zürich, Switzerland
© 2006 Elsevier Ltd. All rights reserved.

Introduction

For the last twenty years or so, two-dimensional (2D) conformal field theories have played an important role in different areas of modern theoretical physics. One of the main applications of conformal field theory has been in string theory (see Compactification of Superstring Theory), where the excitations of the string are described, from the point of view of the world sheet, by a 2D conformal field theory. Conformal field theories have also been studied in the context of statistical physics, since the critical points of second-order phase transitions are typically described by a conformal field theory. Finally, conformal field theories are interesting solvable toy models of genuinely interacting quantum field theories.

From an abstract point of view, conformal field theories are (Euclidean) quantum field theories that are characterized by the property that their symmetry group contains, in addition to the Euclidean symmetries, local conformal transformations, that is, transformations that preserve angles but not necessarily lengths. The local conformal symmetry is of special importance in two dimensions since the corresponding symmetry algebra is infinite dimensional in this case. As a consequence, 2D conformal field theories have an infinite number of conserved quantities, and are essentially solvable by symmetry considerations alone. The mathematical formulation of these symmetries has led to the concept of a vertex operator algebra, which has become a new branch of mathematics in its own right. In particular, it has played a major role in the explanation of "monstrous moonshine" for which Richard Borcherds received the Fields medal in 1998.

In the following, we want to explain the main features of conformal field theory using an algebraic approach that will naturally lead to the concept of a vertex operator algebra. There are other approaches to the subject, most notably the formulation, pioneered by Segal, of conformal field theory as a functor from the category of Riemann surfaces to the category of vector spaces. Due to limitations of space, however, we will not be able to discuss any of these other approaches here.

The Conformal Symmetry Group

The conformal symmetry group of the n-dimensional Euclidean space $\mathbb{R}^n$ consists of the (locally defined) transformations that preserve angles but not necessarily lengths. The transformations that preserve angles as well as lengths are the well-known translations and rotations. The conformal group contains (in any dimension) in addition the dilatations or scale transformations

\[ x^\mu \mapsto \tilde{x}^\mu = \lambda x^\mu \qquad [1] \]

where $\lambda \in \mathbb{R}$ and $x \in \mathbb{R}^n$, as well as the so-called special conformal transformations,

\[ x^\mu \mapsto \tilde{x}^\mu = \frac{x^\mu + x^2 a^\mu}{1 + 2(x \cdot a) + x^2 a^2} \qquad [2] \]

where $a \in \mathbb{R}^n$ and $x^2 = x^\mu x_\mu$. (Note that this last transformation is only defined for $x^\mu \neq -a^\mu / a^2$.)

If the dimension n of the space $\mathbb{R}^n$ is larger than 2, one can show that the full conformal group is generated by these transformations. For n = 2, however, the group of (locally defined) conformal transformations is much larger. To see this, it is convenient to introduce complex coordinates for $(x, y) \in \mathbb{R}^2$ by defining $z = x + iy$ and $\bar z = x - iy$. Then any (locally) analytic function $f(z)$ defines a conformal transformation by $z \mapsto f(z)$, since analytic maps preserve angles. (Incidentally, the same also applies to $z \mapsto f(\bar z)$, but this would reverse the orientation.) Clearly, the group of such transformations is infinite dimensional; this is a special feature of two dimensions.

In this complex notation, the transformations that are generated by translations, rotations, dilatations, and special conformal transformations simply generate the Möbius group of automorphisms of the Riemann sphere

\[ z \mapsto f(z) = \frac{az + b}{cz + d} \qquad [3] \]

where a, b, c, d are complex constants with $ad - bc \neq 0$; since rescaling a, b, c, d by a common complex number does not modify [3], the Möbius group is isomorphic to $SL(2, \mathbb{C})/\mathbb{Z}_2$. In addition to these transformations (that are globally defined on the Riemann sphere), we have an infinite set of infinitesimal transformations generated by $L_n : z \mapsto z + \epsilon z^{n+1}$ for all $n \in \mathbb{Z}$. The generators $L_{\pm 1}$ and $L_0$ generate the
subgroup of Möbius transformations, and their commutation relations are simply

\[ [L_m, L_n] = (m - n) L_{m+n} \qquad [4] \]

In fact, [4] describes also the commutation relations of all generators $L_n$ with $n \in \mathbb{Z}$: this is the Lie algebra of (locally defined) 2D conformal transformations; it is called the Witt algebra.

The General Structure of Conformal Field Theory

A 2D conformal field theory is determined (like any other field theory) by its space of states and the collection of its correlation functions (vacuum expectation values). The space of states is a vector space H (which, in many interesting examples, is a Hilbert space), and the correlation functions are defined for collections of vectors in some dense subspace of H. These correlation functions are defined on a 2D (Euclidean) space. We shall mainly be interested in the case where the underlying 2D space is a closed compact surface; the other important case concerning surfaces with boundaries (whose analysis was pioneered by Cardy) will be reviewed elsewhere (see the article Boundary Conformal Field Theory). The closed surfaces are classified (topologically) by their genus g, which counts the number of handles; the simplest such surface, which we shall mainly consider, is the sphere with g = 0, the surface with g = 1 is the torus, etc.

One of the special features of conformal field theory is the fact that the theory is naturally defined on a Riemann surface (or complex curve), that is, on a surface that possesses suitable complex coordinates. In the case of the sphere, the complex coordinates can be taken to be those of the complex plane that cover the sphere except for the point at infinity; complex coordinates around infinity are defined by means of the coordinate function $\gamma(z) = 1/z$ that maps a neighborhood of infinity to a neighborhood of zero. With this choice of complex coordinates, the sphere is usually referred to as the Riemann sphere, and this choice of complex coordinates is, up to Möbius transformations, unique. The correlation functions of a conformal field theory that is defined on the sphere are thus of the form

\[ \langle 0 | V(\psi_1; z_1, \bar z_1) \cdots V(\psi_n; z_n, \bar z_n) | 0 \rangle \qquad [5] \]

where $V(\psi; z, \bar z)$ is the field that is associated to the state $\psi$, and $z_i$ and $\bar z_i$ are complex conjugates of one another. Here $|0\rangle$ denotes the $SL(2, \mathbb{C})/\mathbb{Z}_2$-invariant vacuum. The usual locality assumption of a 2D (bosonic) Euclidean quantum field theory implies that these correlation functions are independent of the order in which the fields appear in [5].

It is conventional to think of $z = 0$ as describing "past infinity," and $z = \infty$ as "future infinity"; this defines a time direction in the Euclidean field theory and thus a quantization scheme (radial quantization). Furthermore, we identify the space of states with the space of "incoming" states; thus, the state $\psi$ is simply

\[ \psi = V(\psi; 0, 0) | 0 \rangle \qquad [6] \]

We can think of $z_i$ and $\bar z_i$ in [5] as independent variables, that is, we may relax the constraint that $\bar z_i$ is the complex conjugate of $z_i$. Then we have two commuting actions of the conformal group on these correlation functions: the infinitesimal action on the $z_i$ variables is described (as before) by the $L_n$ generators, while the generators for the action on the $\bar z_i$ variables are $\bar L_n$. In a conformal field theory, the space of states H thus carries two commuting actions of the Witt algebra. The generator $L_0 + \bar L_0$ can be identified with the time-translation operator, and thus describes the energy operator. The space of states of the physical theory should have a bounded energy spectrum, and it is thus natural to assume that the spectrum of both $L_0$ and $\bar L_0$ is bounded from below; representations with this property are usually called positive-energy representations. It is relatively easy to see that the Witt algebra does not have any unitary positive-energy representations except for the trivial representation. However, as is common in many instances in quantum theory, it possesses many interesting projective representations. These projective representations are conventional representations of the central extension of the Witt algebra

\[ [L_m, L_n] = (m - n) L_{m+n} + \frac{c}{12}\, m (m^2 - 1)\, \delta_{m,-n} \qquad [7] \]

which is the famous Virasoro algebra. Here c is a central element that commutes with all $L_m$; it is called the central charge (or conformal anomaly).

Given the actions of the two Virasoro algebras (that are generated by $L_n$ and $\bar L_n$), one can decompose the space of states H into irreducible representations as

\[ H = \bigoplus_{ij} M_{ij}\, H_i \otimes \bar H_j \qquad [8] \]

where $H_i$ ($\bar H_j$) denotes the irreducible representations of the algebra of $L_n$ ($\bar L_n$), and $M_{ij} \in \mathbb{N}_0$ describe the multiplicities with which these combinations of representations occur. (We are assuming here that the space of states is completely reducible with
respect to the action of the two Virasoro algebras; examples where this is not the case are the so-called logarithmic conformal field theories.) The positive-energy representations of the Virasoro algebra are characterized by the value of the central charge, as well as the lowest eigenvalue of $L_0$; the state $\psi$ whose $L_0$ eigenvalue is smallest is called the highest-weight state, and its eigenvalue $L_0 \psi = h \psi$ is the conformal weight. The conformal weight determines the conformal transformation properties of $\psi$: under the conformal transformation $z \mapsto f(z)$, $\bar z \mapsto \bar f(\bar z)$, we have

\[ V(\psi; z, \bar z) \mapsto (f'(z))^{h}\, (\bar f'(\bar z))^{\bar h}\, V(\psi; f(z), \bar f(\bar z)) \qquad [9] \]

where $L_0 \psi = h \psi$ and $\bar L_0 \psi = \bar h \psi$. The corresponding field $V(\psi; z, \bar z)$ is then called a primary field; if [9] only holds for the Möbius transformations [3], the field is called quasiprimary.

Since $L_m$ with $m > 0$ lowers the conformal weight of a state (see [7]), the highest-weight state $\psi$ is necessarily annihilated by all $L_m$ (and $\bar L_m$) with $m > 0$. However, in general the $L_m$ (and $\bar L_m$) with $m < 0$ do not annihilate $\psi$; they generate the descendants of $\psi$ that lie in the same representation. Their conformal transformation property is more complicated, but can be deduced from that of the primary state [9], as well as the commutation relations of the Virasoro algebra.

The Möbius symmetry (whose generators annihilate the vacuum) determines the 1-, 2- and 3-point functions of quasiprimary fields up to numerical constants: the 1-point function vanishes, unless $h = \bar h = 0$, in which case $\langle 0 | V(\psi; z, \bar z) | 0 \rangle = C$, independent of $z$ and $\bar z$. The 2-point function of $\psi_1$ and $\psi_2$ vanishes unless $h_1 = h_2$ and $\bar h_1 = \bar h_2$; if the conformal weights agree, it takes the form

\[ \langle 0 | V(\psi_1; z_1, \bar z_1) V(\psi_2; z_2, \bar z_2) | 0 \rangle = C\, (z_1 - z_2)^{-2h} (\bar z_1 - \bar z_2)^{-2\bar h} \qquad [10] \]

Finally, the structure of the 3-point function of three quasiprimary fields $\psi_1$, $\psi_2$, and $\psi_3$ is

\[ \langle 0 | V(\psi_1; z_1, \bar z_1) V(\psi_2; z_2, \bar z_2) V(\psi_3; z_3, \bar z_3) | 0 \rangle = C \prod_{i<j} (z_i - z_j)^{(h_k - h_i - h_j)} (\bar z_i - \bar z_j)^{(\bar h_k - \bar h_i - \bar h_j)} \qquad [11] \]

where for each pair $i < j$, $k$ labels the third field, that is, $k \neq i$ and $k \neq j$. The Möbius symmetry also restricts the higher correlation functions of quasiprimary fields: the 4-point function is determined up to an (undetermined) function of the Möbius invariant cross-ratio, and similar statements also hold for n-point functions with $n \geq 5$. The full Virasoro symmetry must then be used to restrict these functions further; however, since the generators $L_n$ with $n \leq -2$ do not annihilate the vacuum $|0\rangle$, the Virasoro symmetry leads to Ward identities that cannot be easily evaluated in general. (In typical examples, the Ward identities give rise to differential equations that must be obeyed by the correlation functions.)

Chiral Fields and Vertex Operator Algebras

The decomposition [8] usually contains a special class of states that transform as the vacuum state with respect to $\bar L_m$; these states are the so-called chiral states. (Similarly, the states that transform as the vacuum state with respect to $L_m$ are the antichiral states.) Given the transformation properties described above, it is not difficult to see that the corresponding chiral fields $V(\psi; z, \bar z)$ only depend on $z$ in any correlation function, that is $V(\psi; z, \bar z) \equiv V(\psi, z)$. (Similarly, the antichiral fields only depend on $\bar z$.) The chiral fields always contain the field corresponding to the state $L_{-2}|0\rangle$, that describes a specific component of the stress–energy tensor.

In conformal field theory, the product of two fields can be expressed again in terms of the fields of the theory. The conformal symmetry restricts the structure of this operator product expansion:

\[ V(\psi_1; z_1, \bar z_1) V(\psi_2; z_2, \bar z_2) = \sum_i (z_1 - z_2)^{\Delta_i} (\bar z_1 - \bar z_2)^{\bar\Delta_i} \sum_{r,s \geq 0} V(\phi^i_{r,s}; z_2, \bar z_2)\, (z_1 - z_2)^r (\bar z_1 - \bar z_2)^s \qquad [12] \]

where $\Delta_i$ and $\bar\Delta_i$ are real numbers, and $r, s \in \mathbb{N}_0$. (Here $i$ labels the conformal representations that appear in the operator-product expansion, while $r$ and $s$ label the different descendants.) The actual form of this expansion (in particular, the representations that appear) can be read off from the correlation functions of the theory since the identity [12] has to hold in all correlation functions.

Given that the chiral fields only depend on $z$ in all correlation functions, it is then clear that the operator-product expansion of two chiral fields again only contains chiral fields. Thus, the subspace of chiral fields closes under the operator-product expansion, and therefore defines a consistent (sub)theory by itself. This subtheory is sometimes referred to as a meromorphic conformal field theory (Goddard 1989). (Obviously, the same also applies to the subtheory of antichiral fields.) The operator-product
expansion defines a product on the space of meromorphic fields. This product involves the complex parameters $z_i$ in a nontrivial way, and therefore does not directly define an algebra structure; it is, however, very similar to an algebra, and is therefore usually called a vertex operator algebra in the mathematical literature. The formal definition involves formal power series calculus and is quite complicated; details can be found in Frenkel, Lepowsky, and Meurman (1988).

By virtue of its definition as an identity that holds in arbitrary correlation functions, the operator-product expansion is associative, that is,

\[ \big( V(\psi_1; z_1, \bar z_1) V(\psi_2; z_2, \bar z_2) \big) V(\psi_3; z_3, \bar z_3) = V(\psi_1; z_1, \bar z_1) \big( V(\psi_2; z_2, \bar z_2) V(\psi_3; z_3, \bar z_3) \big) \qquad [13] \]

where the brackets indicate which operator-product expansion is evaluated first. If we consider the case where both $\psi_1$ and $\psi_2$ are meromorphic fields, then the associativity of the operator-product expansion implies that the states in H form a representation of the vertex operator algebra. The same also holds for the vertex operator algebra associated to the antichiral fields. Thus the meromorphic fields encode in a sense the symmetries of the underlying theory: this symmetry always contains the conformal symmetry (since $L_{-2}|0\rangle$ is always a chiral field, and $\bar L_{-2}|0\rangle$ always an antichiral field). In general, however, the symmetry may be larger. In order to take full advantage of this symmetry, it is then useful to decompose the full space of states H not just with respect to the two Virasoro algebras, but rather with respect to the two vertex operator algebras; the structure is again the same as in [8], where, however, each $H_i$ and $\bar H_j$ is now an irreducible representation of the chiral and antichiral vertex operator algebra, respectively.

Rational Theories and Zhu's Algebra

Of particular interest are the rational conformal field theories that are characterized by the property that the corresponding vertex operator algebras only possess finitely many irreducible representations. (The name "rational" stems from the fact that the conformal weights and the central charge of these theories are rational numbers.) The simplest examples of such rational theories are the so-called minimal models, for which the vertex operator algebra describes just the conformal symmetry: these models exist for a certain discrete set of central charges $c < 1$ and were first studied by Belavin, Polyakov, and Zamolodchikov in 1984. (Their paper is contained in the reprint volume of Goddard and Olive (1988).) It was this seminal paper that started many of the modern developments in conformal field theory. Another important class of examples are the Wess–Zumino–Witten (WZW) models that describe the world-sheet theory of strings moving on a compact Lie group. The relevant vertex operator algebra is then generated by the loop group symmetries. There is some evidence that all rational conformal field theories can be obtained from the WZW models by means of two standard constructions, namely by considering cosets and taking orbifolds; thus rational conformal field theory seems to have something of the flavor of (reductive) Lie theory.

Rational theories may be characterized in terms of Zhu's algebra that can be defined as follows. The chiral fields $V(\psi, z)$ that only depend on $z$ must by themselves define local operators; they can therefore be expanded in a Laurent expansion as

\[ V(\psi, z) = \sum_{n \in \mathbb{Z}} V_n(\psi)\, z^{-n-h} \qquad [14] \]

where $h$ is the conformal weight of the state $\psi$. For example, for the case of the holomorphic component of the stress–energy tensor one finds

\[ T(z) = \sum_{n \in \mathbb{Z}} L_n\, z^{-n-2} \qquad [15] \]

where the $L_n$ are the Virasoro generators. By the state/field correspondence [6], it then follows that

\[ V_n(\psi) |0\rangle = 0 \quad \text{for } n > -h \qquad [16] \]

and that

\[ V_{-h}(\psi) |0\rangle = \psi \qquad [17] \]

(For the example of the above component of the stress–energy tensor, [16] implies that $L_{-1}|0\rangle = L_0|0\rangle = L_n|0\rangle = 0$ for $n \geq 0$; thus the vacuum is in particular $SL(2, \mathbb{C})/\mathbb{Z}_2$ invariant. Furthermore, [17] shows that $L_{-2}|0\rangle$ is the state corresponding to this component of the stress–energy tensor.) We denote by $H_0$ the space of states that can be generated by the action of the modes $V_n(\psi)$ from the vacuum $|0\rangle$. On $H_0$ we consider the subspace $O(H_0)$ that is spanned by the states of the form

\[ V^{(N)}(\psi)\, \chi, \qquad N > 0 \qquad [18] \]

where $V^{(N)}(\psi)$ is defined by

\[ V^{(N)}(\psi) = \sum_{n=0}^{h} \binom{h}{n} V_{-n-N}(\psi) \qquad [19] \]

and $h$ is the conformal weight of $\psi$. Zhu's algebra is then the quotient space

\[ A = H_0 / O(H_0) \qquad [20] \]

It actually forms an associative algebra, where the algebra structure is defined by

    $\psi \star \chi = V^{(0)}(\psi)\,\chi$   [21]

This algebra structure can be identified with the action of the “zero-mode algebra” on an arbitrary highest-weight state.

Zhu’s algebra captures much of the structure of the (chiral) conformal field theory: in particular, it was shown by Zhu in 1996 that the irreducible representations of $A$ are in one-to-one correspondence with the representations of the full vertex operator algebra. A conformal field theory is thus rational (in the above, physicists’, sense) if Zhu’s algebra is finite dimensional. (In the mathematics literature, a vertex operator algebra is usually called rational if in addition every positive-energy representation is completely reducible. It has been conjectured that this is equivalent to the condition that Zhu’s algebra is semisimple.)

In practice, the determination of Zhu’s algebra is quite complicated, and it is therefore useful to obtain more easily testable conditions for rationality. One of these is the so-called $C_2$ condition of Zhu: a vertex operator algebra is $C_2$-cofinite if the quotient space $\mathcal{H}_0/O_2(\mathcal{H}_0)$ is finite dimensional, where $O_2(\mathcal{H}_0)$ is spanned by the vectors of the form

    $V_{-n-h}(\psi)\,\chi, \quad n \ge 1$   [22]

It is easy to show that the $C_2$-cofiniteness condition implies that Zhu’s algebra is finite dimensional. Gaberdiel and Neitzke have shown that every $C_2$-cofinite vertex operator algebra has a simple spanning set; this observation can, for example, be used to prove that all the fusion rules (see below) of such a theory are finite.

Fusion Rules and Verlinde’s Formula

As explained above, the correlation function of three primary fields is determined up to an overall constant. One important question is whether or not this constant actually vanishes since this determines the possible “couplings” of the theory. This information is encoded in the so-called fusion rules of the theory. More precisely, the fusion rules $N_{ij}^{k} \in \mathbb{N}_0$ determine the multiplicity with which the representation of the vertex operator algebra labeled by $k$ appears in the operator-product expansion of the two representations labeled by $i$ and $j$.

In 1988, Verlinde found a remarkable relation between the fusion rules of a vertex operator algebra and the modular transformation properties of its characters. To each irreducible representation $\mathcal{H}_i$ of a vertex operator algebra, one can define the character

    $\chi_i(\tau) = \mathrm{tr}_{\mathcal{H}_i}\left(q^{L_0 - c/24}\right), \quad q = e^{2\pi i \tau}$   [23]

For rational vertex operator algebras (in the mathematical sense) these characters transform under the modular transformation $\tau \mapsto -1/\tau$ as

    $\chi_i(-1/\tau) = \sum_j S_{ij}\,\chi_j(\tau)$   [24]

where $S_{ij}$ are constant matrices. Verlinde’s formula then states that, at least for unitary theories,

    $N_{ij}^{k} = \sum_l \frac{S_{il}\,S_{jl}\,S^{*}_{kl}}{S_{0l}}$   [25]

where the “0” label denotes the vacuum representation. A general argument for this formula has been given by Moore and Seiberg in 1989; very recently, this has been made more precise by Huang.

Modular Invariance and the Conformal Bootstrap

Up to now, we have only considered conformal field theories on the sphere. In order for the theory to be well defined also on higher-genus surfaces, it is believed that the only additional requirement comes from the consistency of the torus amplitudes. In particular, the vacuum torus amplitude must only depend on the equivalence class of tori that is described by the modular parameter $\tau \in \mathbb{H}$, up to the discrete identifications that are generated by the usual action of the modular group SL(2, Z) on the upper half-plane $\mathbb{H}$. For the theory with decomposition [8] this requires that the function

    $Z(\tau, \bar\tau) = \sum_{ij} M_{ij}\,\chi_i(\tau)\,\bar\chi_j(\bar\tau)$   [26]

is invariant under the action of SL(2, Z). This is a very powerful constraint on the multiplicity matrices $M_{ij}$ that has been analyzed for various vertex operator algebras. For example, Cappelli, Itzykson, and Zuber have shown that the modular invariant WZW models corresponding to the group SU(2) have an A–D–E classification. The case of SU(3) was solved by Gannon, using the Galois symmetries of these rational conformal field theories.

The condition of modular invariance is relatively easily testable, but it does not, by itself, guarantee that a given space of states $\mathcal{H}$ comes from a consistent conformal field theory. In order to construct a consistent conformal field theory, one needs to solve the conformal bootstrap, that is, one has to determine

all the normalization constants of the correlators so that the resulting set of correlators is local and factorizes appropriately into 3-point correlators (crossing symmetry). This is typically a difficult problem which has only been solved explicitly for rather few theories, for example, the minimal models. Recently, it has been noticed that the conformal bootstrap can be more easily solved for the corresponding boundary conformal field theory. Furthermore, Fuchs, Runkel, and Schweigert have shown that any solution of the boundary problem induces an associated solution for conformal field theory on surfaces without boundary. This construction relies heavily on the relation between 2D conformal field theory and 3D topological field theory (Turaev 1994).

See also: Boundary Conformal Field Theory; Compactification of Superstring Theory; Current Algebra; Knot Theory and Physics; String Field Theory; Superstring Theories; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions.

Further Reading

Di Francesco P, Mathieu P, and Sénéchal D (1997) Conformal Field Theory. New York: Springer.
Frenkel I, Lepowsky J, and Meurman A (1988) Vertex Operator Algebras and the Monster. Boston, MA: Academic Press.
Gaberdiel MR (2000) An introduction to conformal field theory. Reports on Progress in Physics 63: 607 (arXiv:hep-th/9910156).
Gannon T (1999) Monstrous moonshine and the classification of CFT, arXiv:math.QA/9906167.
Gannon T (2006) Moonshine beyond the Monster: The Bridge Connecting Algebra, Modular Forms and Physics (to appear). Cambridge: Cambridge University Press.
Gawedzki K (1999) Lectures on conformal field theory. In: Quantum Fields and Strings: A Course for Mathematicians, vol. 2. Providence, RI: American Mathematical Society.
Ginsparg P (1988) Applied Conformal Field Theory. Lectures Given at the Les Houches Summer School in Theoretical Physics. Elsevier.
Goddard P (1989) Meromorphic conformal field theory. Infinite Dimensional Lie Algebras and Lie Groups: Proceedings of the CIRM Luminy Conference, 1988, 556. Singapore: World Scientific.
Goddard P and Olive DI (1988) Kac–Moody and Virasoro Algebras, A Reprint Volume for Physicists. Singapore: World Scientific.
Kac VG (1998) Vertex Algebras for Beginners. Providence, RI: American Mathematical Society.
Pressley A and Segal GB (1986) Loop Groups. Oxford: Clarendon.
Schweigert C, Fuchs J, and Walcher J (2000) Conformal field theory, boundary conditions and applications to string theory, arXiv:hep-th/0011109.
Turaev VG (1994) Quantum Invariants of Knots and 3-Manifolds. Berlin: de Gruyter.

Two-Dimensional Ising Model


B M McCoy, State University of New York at Stony Brook, Stony Brook, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The Ising model is a model of a classical ferromagnet on a lattice first introduced in 1925 in the one-dimensional case by E Ising. At each lattice site $\alpha$ there is a “spin” variable $\sigma_\alpha$, which takes on the values $+1$ (spin up) and $-1$ (spin down). The mutual interaction energy of the pair of spins $\sigma_\alpha$ and $\sigma_{\alpha'}$, where $\alpha$ and $\alpha'$ are nearest neighbors, is $-E(\alpha, \alpha')$ if $\sigma_\alpha = \sigma_{\alpha'}$ and is $E(\alpha, \alpha')$ if $\sigma_\alpha = -\sigma_{\alpha'}$. In addition, the spins can interact with an external magnetic field as $-H\sigma_\alpha$. On a square lattice, where $j$ specifies the row and $k$ specifies the column, the interaction energy for the homogeneous case where $E_v(\alpha, \alpha')$ and $E_h(\alpha, \alpha')$ are independent of the position $\alpha, \alpha'$ may be explicitly written as

    $E(H) = -\sum_{j,k}\left[E_h\,\sigma_{j,k}\sigma_{j,k+1} + E_v\,\sigma_{j,k}\sigma_{j+1,k} + H\,\sigma_{j,k}\right]$   [1]

This very simple model [1] has the remarkable property that in two dimensions at $H = 0$ many properties of physical interest can be computed exactly. Furthermore, the model has a ferromagnetic phase transition at a critical temperature $T_c$, at which both the specific heat and the magnetic susceptibility diverge and below which there is a nonzero spontaneous magnetization. In addition, the microscopic correlations between spins can also be exactly computed. These exact calculations are the basis of the modern theory of second-order phase transitions used to analyze real ferromagnets and real fluids near their critical points in both two and three dimensions. The model may also be interpreted as a lattice gauge theory.

Solvability

The solvability of the Ising model at $H = 0$ was discovered by Onsager in 1944 in one of the most profound and inventive papers ever written in mathematical physics. Onsager discovered that the model possesses an infinite-dimensional symmetry, which allowed him to exactly compute the free

energy per site. This symmetry is generated by the relations

    $[A_l, A_m] = 4G_{l-m}$
    $[G_l, A_m] = 2A_{m+l} - 2A_{m-l}$   [2]
    $[G_l, G_m] = 0$

This algebra of Onsager is a subalgebra of what is now called the loop algebra of the Lie algebra $sl_2$ and it is the first infinite-dimensional algebra to be used in physics.

In the 60 years since Onsager first computed the free energy, several other methods of exact solution have been found. In 1949, Kaufman reduced the computation of the free energy to a problem of free fermions. A closely related combinatorial method was invented by Kac and Ward, Hurst and Green, and by Kasteleyn. Baxter (1982) has computed the free energy by means of star triangle equations and functional equations in his book.

The fermionic and the combinatorial methods are powerful enough to compute the correlation functions but are not generalizable to other models. The functional equation methods of Baxter generalize to many other important models but they do not give correlation functions. There are still aspects of Onsager’s method that remain unexplored.

The free energy per site in the thermodynamic limit is defined as

    $F = -k_B T \lim_{\mathcal{N}\to\infty} \mathcal{N}^{-1} \ln Z_I(H)$   [3]

where $\mathcal{N}$ is the total number of sites of the lattice and the partition function $Z_I(H)$ is defined as

    $Z_I(H) = \sum_{\text{all } \sigma = \pm 1} e^{-E(H)/k_B T}$   [4]

with the sum being over all values $\sigma_{j,k} = \pm 1$, and $k_B$ is Boltzmann’s constant. The result of Onsager is that, at $H = 0$,

    $-F/k_B T = \ln 2 + \frac{1}{8\pi^2}\int_0^{2\pi} d\theta_1 \int_0^{2\pi} d\theta_2\, \ln\left[\cosh(2E_h/k_B T)\cosh(2E_v/k_B T) - \sinh(2E_h/k_B T)\cos\theta_1 - \sinh(2E_v/k_B T)\cos\theta_2\right]$   [5]

This free energy has a singularity at a temperature $T_c$ defined from

    $\sinh(2E_v/k_B T_c)\,\sinh(2E_h/k_B T_c) = 1$   [6]

and near $T_c$ the specific heat diverges as

    $c \sim -\frac{2}{\pi k_B T^2}\left[E_h^2 \sinh^2(2E_v/k_B T_c) + 2E_v E_h + E_v^2 \sinh^2(2E_h/k_B T_c)\right] \ln|1 - T/T_c|$   [7]

The next property to be computed was the spontaneous magnetization, which is usually defined as

    $M_- = \lim_{H\to 0^+} M(H)$   [8]

However, because the solution is only available at $H = 0$, this definition cannot be used and instead $M_-$ is computed from an alternative definition in terms of the spin–spin correlation function

    $\langle\sigma_{0,0}\sigma_{M,N}\rangle = \frac{1}{Z_I(0)}\sum_{\sigma = \pm 1} \sigma_{0,0}\,\sigma_{M,N}\, e^{-E(0)/k_B T}$   [9]

as

    $M_-^2 = \lim_{M^2+N^2\to\infty} \langle\sigma_{0,0}\sigma_{M,N}\rangle$   [10]

The result for $M_-$, first announced by Onsager in 1949, is

    $M_- = \begin{cases} (1 - k^2)^{1/8} & \text{for } T \le T_c \\ 0 & \text{for } T > T_c \end{cases}$   [11]

where

    $k = \left(\sinh(2E_v/k_B T)\,\sinh(2E_h/k_B T)\right)^{-1}$   [12]

A key point in the computation of the magnetization [11] from [9] is that the spin–spin correlation function can be written as a determinant. In fact, there are many such different, but equal, determinantal representations and the size of the smallest one in general is $2(|M| + |N|)$. The simplest case is the diagonal correlation

    $\langle\sigma_{0,0}\sigma_{N,N}\rangle = \begin{vmatrix} a_0 & a_{-1} & a_{-2} & \cdots & a_{1-N} \\ a_1 & a_0 & a_{-1} & \cdots & a_{2-N} \\ a_2 & a_1 & a_0 & \cdots & a_{3-N} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{N-1} & a_{N-2} & a_{N-3} & \cdots & a_0 \end{vmatrix}$   [13]

where

    $a_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} d\theta\, e^{-in\theta} \left[\frac{1 - k e^{i\theta}}{1 - k e^{-i\theta}}\right]^{1/2}$   [14]

Determinants of the form [13], where the elements on each diagonal are equal, are called Toeplitz.

The study of the spin–spin correlations of the Ising model provides a microscopic picture of the behavior of the ferromagnet near the phase transition temperature $T_c$, and an entire branch of mathematics has developed from the study of the behavior of Toeplitz determinants when the size is large. The first such mathematical advance was the discovery by Szegö of a general formula for the limit as $N \to \infty$, from which the magnetization [11] is computed.
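As a numerical illustration of [12] to [14] and of the large-N limit, the following Python sketch evaluates the Toeplitz elements $a_n$ with a trapezoidal rule, builds the N x N determinant [13], and compares it with $M_-^2 = (1 - k^2)^{1/4}$ from [11]. It is only a sketch under stated assumptions: it is restricted to $T < T_c$ ($0 \le k < 1$), and the function names (`modulus_k`, `toeplitz_element`, `determinant`, `diagonal_correlation`), the quadrature size, and the plain Gaussian elimination are choices made here for illustration, not part of the original derivation.

```python
import cmath
import math


def modulus_k(e_v: float, e_h: float, kb_t: float) -> float:
    """k of [12]: inverse of the product of the two hyperbolic sines."""
    return 1.0 / (math.sinh(2.0 * e_v / kb_t) * math.sinh(2.0 * e_h / kb_t))


def toeplitz_element(n: int, k: float, points: int = 512) -> float:
    """a_n of [14] by the periodic trapezoidal rule; assumes 0 <= k < 1."""
    total = 0j
    for m in range(points):
        theta = -math.pi + 2.0 * math.pi * m / points
        z = cmath.exp(1j * theta)
        # For k < 1 the ratio has unit modulus and stays away from the
        # negative real axis, so the principal square root is the right branch.
        symbol = cmath.sqrt((1.0 - k * z) / (1.0 - k / z))
        total += cmath.exp(-1j * n * theta) * symbol
    # The imaginary parts cancel by the symmetry theta -> -theta.
    return (total / points).real


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def diagonal_correlation(size: int, k: float) -> float:
    """<sigma_{0,0} sigma_{N,N}> from the N x N Toeplitz determinant [13]."""
    a = {n: toeplitz_element(n, k) for n in range(1 - size, size)}
    rows = [[a[i - j] for j in range(size)] for i in range(size)]
    return determinant(rows)


if __name__ == "__main__":
    k = 0.5  # illustrative value below T_c
    m_squared = (1.0 - k * k) ** 0.25
    for size in (1, 2, 4, 8):
        print(size, diagonal_correlation(size, k), m_squared)
```

For fixed $k < 1$ the printed determinants should settle quickly onto $M_-^2$ as N grows. The case $k = 1$ is deliberately left out, since the integrand in [14] is then discontinuous and a plain trapezoidal rule is no longer a good choice.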

The simplest result for the approach to this $N \to \infty$ limit is the behavior of the diagonal correlation at $T = T_c$ ($k = 1$), where [13] exactly reduces to

    $\langle\sigma_{0,0}\sigma_{N,N}\rangle = \left(\frac{2}{\pi}\right)^{N} \prod_{l=1}^{N-1}\left(1 - \frac{1}{4l^2}\right)^{l-N}$   [15]

which behaves as $N \to \infty$ as

    $\langle\sigma_{0,0}\sigma_{N,N}\rangle \sim A\,N^{-1/4}$   [16]

where $A \approx 0.6450\ldots$ is a transcendental constant. Further results for large $N$ and $T$ fixed are for $T < T_c$ ($k < 1$),

    $\langle\sigma_{0,0}\sigma_{N,N}\rangle \sim M_-^2\left\{1 + \frac{2k^{2(N+1)}}{\pi N^2 (k^{-2} - 1)^2} + \cdots\right\}$   [17]

and for $T > T_c$ ($k > 1$),

    $\langle\sigma_{0,0}\sigma_{N,N}\rangle \sim \frac{k^{-N}}{(\pi N)^{1/2}(1 - k^{-2})^{1/4}} + \cdots$   [18]

By comparing [16] with [17] and [18], we see that at $T = T_c$ the correlations decay algebraically but for $T \ne T_c$ the decay is exponential. It is useful to write the exponential in [17] for $T < T_c$ as

    $k^{N} = e^{-N/\xi_-}$ with $\xi_-^{-1} = -\ln k$   [19]

and in [18] for $T > T_c$ as

    $k^{-N} = e^{-N/\xi_+}$ with $\xi_+^{-1} = \ln k$   [20]

The quantity $\xi_\pm$ is called the correlation length and as $T \to T_c$ the correlation length diverges as

    $\xi_\pm \sim |1 - k|^{-1} = \text{const.}\,|T - T_c|^{-1}$   [21]

A more profound property of the correlations is that they satisfy differential and difference equations. It was found by Jimbo and Miwa (1980) that the diagonal correlation function satisfies the nonlinear differential equation related to the sixth Painlevé function

    $\left(t(t-1)\frac{d^2\sigma}{dt^2}\right)^2 = N^2\left((t-1)\frac{d\sigma}{dt} - \sigma\right)^2 - 4\frac{d\sigma}{dt}\left((t-1)\frac{d\sigma}{dt} - \sigma - \frac{1}{4}\right)\left(t\frac{d\sigma}{dt} - \sigma\right)$   [22]

where for $T < T_c$ we set $t = k^{-2}$ and

    $\sigma_N(t) = t(t-1)\frac{d}{dt}\ln\langle\sigma_{0,0}\sigma_{N,N}\rangle - \frac{1}{4}$   [23]

and for $T > T_c$ we set $t = k^{2}$ and

    $\sigma_N(t) = t(t-1)\frac{d}{dt}\ln\langle\sigma_{0,0}\sigma_{N,N}\rangle - \frac{t}{4}$   [24]

Furthermore it was found by McCoy et al. (1981) that for a given temperature the general two spin correlation function and all multipoint correlations satisfy quadratic nonlinear partial difference equations in the locations of the spins.

Scaling Theory

It is evident that the results [17] and [18] do not reduce to [16] when $k \to 1$. Therefore, in order to uniformly characterize the behavior of the correlation function in the critical region near $T_c$, it is necessary to introduce what is called the scaling function. This uniform expansion is obtained by introducing a scaled length defined as

    $r = N/\xi_\pm$   [25]

and considering the joint (scaling) limit where

    $N \to \infty$ and $T \to T_c$ with $r$ fixed   [26]

We define the scaled correlation function as

    $G_\pm(r) = \lim_{\text{scaling}} M_\pm^{-2}\,\langle\sigma_{0,0}\sigma_{N,N}\rangle$   [27]

where the subscript $\pm$ means that the limit is taken from $T > T_c$ or $T < T_c$, respectively, $M_-$ is the spontaneous magnetization [11] and

    $M_+ = (k^2 - 1)^{1/8}$   [28]

This concept of the scaling limit and scaling function is very general and can be defined for any system with a critical point that has an order parameter like $M_-$ that vanishes at $T_c$ and a correlation length that diverges at $T_c$. However, the Ising model has the further remarkable property discovered by Wu et al. (1976) that the scaled correlation function may be explicitly expressed in terms of a function which satisfies an ordinary nonlinear differential equation. Specifically,

    $G_\pm(r) = \frac{1}{2}\left[1 \mp \eta(r/2)\right]\eta(r/2)^{-1/2}\exp\int_{r/2}^{\infty} dr'\,\frac{r'}{4}\,\eta^{-2}\left[(1 - \eta^2)^2 - (\eta')^2\right]$   [29]

where the function $\eta(r)$ satisfies the Painlevé III equation

    $\eta'' = \frac{1}{\eta}(\eta')^2 - \frac{\eta'}{r} + \eta^3 - \eta^{-1}$   [30]

with the boundary condition that

    $\eta(r) \sim 1 - 2\lambda K_0(2r)$ as $r \to \infty$   [31]

where $K_0(r)$ is the modified Bessel function of the third kind and

    $\lambda = 1/\pi$   [32]

The leading behavior of $G_\pm(r)$ for $r \to \infty$ is

    $G_+(r) \sim \lambda K_0(r)$   [33]

    $G_-(r) \sim 1 + \lambda^2\left\{r^2\left[K_1^2(r) - K_0^2(r)\right] - r K_0(r) K_1(r) + \tfrac{1}{2}K_0^2(r)\right\}$   [34]

where $K_n(z)$ is the modified Bessel function of the third kind. When $\lambda$ is given by [32] these $r \to \infty$ limits of $G_\pm(r)$ agree with the behavior of $\langle\sigma_{0,0}\sigma_{N,N}\rangle$ for $N \gg 1$ and $|T - T_c|$ small with $N|T - T_c| \gg 1$ which is obtained from [18] and [17]. The behavior of $G_\pm(r)$ for $r \to 0$ with the value of $\lambda$ given by [32] is

    $G_\pm(r) = \text{const.}\; r^{-1/4}$   [35]

where the constant agrees with that computed from the result [16] for $\langle\sigma_{0,0}\sigma_{N,N}\rangle$ at $T = T_c$ for $N \gg 1$. For other values of the boundary condition constant $\lambda$, the scaling function $G_\pm(r)$ diverges with a power which differs from 1/4. The computation of the constant in [35] requires the evaluation of a nontrivial integral involving the Painlevé III function.

The agreement of the limits $r \to \infty$ and $r \to 0$ of the function $G_\pm(r)$ with the lattice results near $T_c$ means that this scaling function uniformly interpolates between $T \ne T_c$ and $T = T_c$ and that the lattice size (defined here as unity) and the self-generated correlation length $\xi$ are the only two length scales in the theory. This feature that the system generates only one new length scale near $T_c$ is referred to as one length scale scaling.

Susceptibility

The final quantity of macroscopic thermodynamic interest is the magnetic susceptibility

    $\chi(T) = \left.\frac{\partial M(H)}{\partial H}\right|_{H=0}$   [36]

which is expressed in terms of the spin–spin correlation function as

    $\chi(T) = \frac{1}{k_B T}\sum_{M,N}\left\{\langle\sigma_{0,0}\sigma_{M,N}\rangle - M_-^2\right\}$   [37]

The susceptibility may be studied by using the determinantal expression for the correlation function. The simplest result is obtained (for the isotropic case, $E_v = E_h$) by using the scaling form [27] to find for $T \sim T_c$ that

    $k_B T\,\chi_\pm(T) \sim M_\pm^2\,\xi_\pm^2\; 2\pi\int_0^{\infty} dr\, r\left\{G_\pm(r) - \delta_\pm\right\}$   [38]

where $\delta_+ = 0$ and $\delta_- = 1$, and thus $\chi_\pm(T)$ diverges as $T \to T_c$ as

    $\chi_\pm(T) \sim C_\pm\,|T - T_c|^{-7/4}$   [39]

where $C_\pm$ are transcendental constants given as integrals over the scaling function $G_\pm(r)$, which were first evaluated by Barouch et al. in 1973 as

    $C_- = 0.0255369719\ldots, \quad C_+ = 0.9625817322\ldots$   [40]

Critical-Exponent Phenomenology

From the behavior for the Ising model of the specific heat, magnetization, susceptibility, correlation length, and the correlation at $T_c$ given above we abstract for general systems the phenomenological critical-exponent parametrization for $T \to T_c^{\pm}$ of

    $c \sim A_C^{\pm}\,|T - T_c|^{-\alpha_\pm}$   [41]

    $M_- \sim A_M\,|T_c - T|^{\beta}$   [42]

    $\chi \sim A_\chi^{\pm}\,|T - T_c|^{-\gamma_\pm}$   [43]

    $\xi \sim A_\xi^{\pm}\,|T - T_c|^{-\nu_\pm}$   [44]

and at $T = T_c$ for $R \to \infty$

    $\langle\sigma_0\sigma_R\rangle \sim A_\sigma / R^{d-2+\eta}$ where $d$ is the dimension   [45]

The exponents $\alpha_\pm$, $\gamma_\pm$, $\nu_\pm$ above and below $T_c$ are usually found to be equal, and the exponent $\eta$ is usually called the anomalous dimension. If it is assumed that the scaling function [27] exists and that one length scale scaling holds then the exponents are related by what are called scaling laws, such as

    $2\beta = \nu(d - 2 + \eta)$   [46]

    $\alpha + 2\beta + \gamma = 2$   [47]

    $d\nu = 2 - \alpha$   [48]

Thus, from the properties of the Ising model near $T_c$, we have obtained a phenomenology for use on all systems near the critical point.
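The scaling laws [46] to [48] can be checked directly against the exact two-dimensional Ising values quoted earlier. The short Python sketch below does this with exact rational arithmetic. The exponent values are read off from the text ($\beta = 1/8$ from [11], $\gamma = 7/4$ from [39], $\nu = 1$ from [21], $\eta = 1/4$ from [16] with $d = 2$); taking $\alpha = 0$ for the logarithmic divergence [7] is the usual convention and is an assumption of this sketch, as are the names `scaling_law_residuals` and `ISING_2D`.

```python
from fractions import Fraction


def scaling_law_residuals(d, alpha, beta, gamma, nu, eta):
    """Left side minus right side of the scaling laws [46], [47], [48]."""
    return {
        "[46]": 2 * beta - nu * (d - 2 + eta),
        "[47]": alpha + 2 * beta + gamma - 2,
        "[48]": d * nu - (2 - alpha),
    }


# Exponents of the two-dimensional Ising model as quoted in the text;
# alpha = 0 stands for the logarithmic specific heat (an assumed convention).
ISING_2D = dict(
    d=2,
    alpha=Fraction(0),
    beta=Fraction(1, 8),
    gamma=Fraction(7, 4),
    nu=Fraction(1),
    eta=Fraction(1, 4),
)

if __name__ == "__main__":
    for label, residual in scaling_law_residuals(**ISING_2D).items():
        print(label, residual)
```

A residual of zero for each label means the corresponding scaling law holds exactly for these exponents; the same function can be fed measured exponents of another system to see how well the one length scale assumption is satisfied.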

Fuchsian Equations and Natural Boundaries for Susceptibility

This critical phenomenology, however, has not taken into account the fact that the susceptibility is a much more complicated function than either the spontaneous magnetization [11] or the free energy [5], which have only isolated singularities at $k^2 = 1$, and that there is more structure to the susceptibility than the singularity of [39].

For arbitrary $T$, the susceptibility was shown by Wu, McCoy, Tracy, and Barouch to be expressible in the form

    $\chi_\pm(T) = M_\pm^2 \sum_j \tilde\chi^{(j)}(T)$   [49]

where in the sum $j$ is odd (even) for $T$ above (below) $T_c$. The quantities $\tilde\chi^{(j)}(T)$ are explicitly given as $j$-fold integrals of algebraic functions and thus will satisfy linear differential equations with polynomial coefficients. Such functions can have only isolated singularities. The function $\tilde\chi^{(1)}(T)$ is elementary and has a double pole at $T_c$ and $\tilde\chi^{(2)}(T)$ is given in terms of complete elliptic integrals. Quite recently, remarkable Fuchsian linear differential equations for $\tilde\chi^{(3)}(T)$ and $\tilde\chi^{(4)}(T)$ of seventh and tenth orders, respectively, have been obtained by Zenine, Boukraa, Hassani, and Maillard for the isotropic lattice. Furthermore, it was shown by Orrick et al. (2001) that $\tilde\chi^{(j)}$ has singularities in the complex $T$ plane at

    $\cosh(2E_v/kT)\cosh(2E_h/kT) - \sinh(2E_v/kT)\cos(2\pi m/j) - \sinh(2E_h/kT)\cos(2\pi m'/j) = 0$   [50]

with $m, m' = 1, 2, \ldots, j$. For $T > T_c$ the singularity in $\tilde\chi^{(j)}(T)$ has the form

    $\epsilon^{(j^2 - 3)/2} \ln \epsilon$   [51]

and, for $T < T_c$, it has the form

    $\epsilon^{(j^2 - 3)/2}$   [52]

where $\epsilon$ measures the deviation from the singular point [50]. These singularities become dense as $j \to \infty$ and, therefore, the singularity at $T = T_c$ is not isolated and instead the critical point is embedded in a natural boundary. Such a function cannot satisfy a linear differential equation of finite order with polynomial coefficients.

The existence of the natural boundary in the susceptibility is a new phenomenon which is not seen in either the free energy or magnetization and leads to the speculation that in the presence of a magnetic field the one length scale scaling property of the model at $H = 0$ may fail. If this proves to be correct, there will be physical effects which are not incorporated in the phenomenological scaling theory of critical phenomena.

Impure Ising Models

The Ising model may also be studied when the interaction energies at sites $j, k$ are not chosen to be independent of position but are allowed to vary from site to site. When these interactions are chosen randomly out of some probability distribution, this is a model of a ferromagnet with frozen (quenched) impurities. All real systems will be impure to some extent, so the study of such dirty systems is of great practical importance.

The special case where the interactions are translationally invariant in the horizontal direction but are allowed to vary in a layered fashion from row to row was introduced by McCoy and Wu in 1968 and found to be dramatically different from the pure Ising model described above. In particular, what is a critical temperature $T_c$ in the pure case is now spread out into a region bounded by the temperatures at which the pure model would be critical if all the bonds took on the minimum or maximum value allowed by the probability distribution. In this new region, the correlations (in the direction of translational invariance) are found to decay as a power law which depends on the temperature; the specific heat is never infinite but the susceptibility is infinite in an entire temperature region that includes the temperature at which the spontaneous magnetization first appears as $T$ is lowered. The existence of this new region for Ising models with a general randomness in two and three dimensions has been demonstrated by Griffiths. More recently, this effect has been reinterpreted in terms of impurities in quantum spin chains.

Quantum Field Theory

The Ising model of [1] may be reinterpreted as a two-dimensional lattice gauge theory of the gauge field

    $s_{j+1/2,k} = \pm 1$ on the vertical link between $(j, k)$ and $(j+1, k)$
    $s_{j,k+1/2} = \pm 1$ on the horizontal link between $(j, k)$ and $(j, k+1)$   [53]

and a “Higgs” field

    $\sigma_{j,k} = \pm 1$ on the site $(j, k)$   [54]

with the action

    $S_g = -E_g \sum_{j,k} s_{j+1/2,k}\, s_{j+1,k+1/2}\, s_{j+1/2,k+1}\, s_{j,k+1/2} - E_h \sum_{j,k}\left(\sigma_{j,k}\, s_{j+1/2,k}\, \sigma_{j+1,k} + \sigma_{j,k}\, s_{j,k+1/2}\, \sigma_{j,k+1}\right)$   [55]

If we define

    $z_{g,h} = \tanh(E_{g,h}/k_B T)$   [56]

the partition function of the gauge theory is expressed in terms of the Ising model partition function as

    $Z_g = \left[8\cosh(E_g/k_B T)\cosh^2(E_h/k_B T)\, z_g^{1/2} z_h\right]^{\mathcal{N}} Z_I(H)$   [57]

where we make the identification

    $H/k_B T = -\tfrac{1}{2}\ln z_g$ and $E/k_B T = -\tfrac{1}{2}\ln z_h$   [58]

This identification may be extended to correlation functions. Of particular interest for the gauge theory is the plaquette–plaquette correlation $\langle P_{0,0} P_{j,k}\rangle$, where

    $P_{j,k} = s_{j+1/2,k}\, s_{j+1,k+1/2}\, s_{j+1/2,k+1}\, s_{j,k+1/2}$   [59]

which is expressed in terms of the Ising correlations at $H \ne 0$ as

    $\langle P_{0,0} P_{j,k}\rangle - \langle P_{0,0}\rangle^2 = \sinh^2(2H/k_B T)\left(\langle\sigma_{0,0}\sigma_{j,k}\rangle - \langle\sigma_{0,0}\rangle^2\right)$   [60]

To study this correlation further, we need to study the correlations of the Ising model in nonzero magnetic field. This has been done by McCoy and Wu in the scaling limit $H \to 0$, $T \to T_c$ with

    $h = \frac{H}{|T - T_c|^{15/8}}$ fixed   [61]

for $T < T_c$, where it is found that the scaling function $G(r, h)$ for small $h$ and large $r$ is

    $G(r, h) \sim a h \sum_l K_0\left[(2 + h^{2/3}\lambda_l)\,r\right] \sim \pi^{1/2}\, r^{-1/2}\, e^{-2r}\, h a \sum_l e^{-r h^{2/3}\lambda_l}$   [62]

where $\lambda_l$ are the solutions of

    $J_{1/3}\left(\tfrac{1}{3}\lambda^{3/2}\right) + J_{-1/3}\left(\tfrac{1}{3}\lambda^{3/2}\right) = 0$   [63]

with $J_n(z)$ the Bessel function of order $n$ and $K_0(z)$ the modified Bessel function of the third kind.

A field theory is said to possess a particle spectrum if the Fourier transform of the two-point function

    $G(k, h) = \int d^2r\, e^{i k \cdot r}\, G(r, h)$   [64]

has poles of the form $A_l/(k^2 + m_l^2)$, where $m_l$ is the mass of the $l$th particle. If we note that the Fourier transform of $K_0(r)$ is

    $\int d^2r\, e^{i k \cdot r}\, K_0(r) = \frac{2\pi}{k^2 + 1}$   [65]

we see that the Fourier transform of [62] is the sum of an infinite number of poles. This is to be compared with the Fourier transform of the scaled correlation function $G_-(r)$ at $H = 0$ and $T < T_c$ [34], which does not contain any poles at all and may instead be interpreted as having a two-particle cut. This phenomenon of a cut at $h = 0$ breaking up into an infinite number of poles for $h > 0$ is a signal that at $h = 0$ the theory has free unconfined two-particle states which become weakly confined by a linear confining potential for $h > 0$. This confinement is thought to be a characteristic of most gauge theories.

See also: Eight Vertex and Hard Hexagon Models; Holonomic Quantum Fields; Painlevé Equations; Percolation Theory; Phase Transitions in Continuous Systems; Statistical Mechanics and Combinatorial Problems; Toeplitz Determinants and Statistical Mechanics; Yang–Baxter Equations.

Further Reading

Barouch E, McCoy BM, and Wu TT (1973) Zero-field susceptibility of the two dimensional Ising model near Tc. Physical Review Letters 31: 1409–1411.
Baxter RJ (1982) Exactly Solved Models in Statistical Mechanics. London: Academic Press.
Griffiths RB (1969) Nonanalytic behavior above the critical point in a random Ising ferromagnet. Physical Review Letters 23: 17–19.
Jimbo M and Miwa T (1980) Studies on holonomic quantum fields. XVII. Proceedings of the Japanese Academy 56A: 405; (1981) 57A: 347.
Kasteleyn PW (1963) Dimer statistics and phase transitions. Journal of Mathematical Physics 4: 287–293.
McCoy BM (1969) Incompleteness of the critical exponent description for ferromagnetic systems containing random impurities. Physical Review Letters 23: 383–388.
McCoy BM (1995) The connection between statistical mechanics and quantum field theory. In: Bazhanov VV and Burden CJ (eds.) Statistical Mechanics and Field Theory, pp. 26–128. Singapore: World Scientific.
McCoy BM and Wu TT (1968) Theory of a two dimensional Ising model with random impurities. Physical Review 176: 631–643.
McCoy BM and Wu TT (1973) The Two Dimensional Ising Model. Cambridge: Harvard University Press.
McCoy BM and Wu TT (1978) Two dimensional Ising field theory in a magnetic field: Breakup of the cut in the 2-point function. Physical Review D 18: 1259–1267.
McCoy BM, Perk JHH, and Wu TT (1981) Ising field theory: quadratic difference equations for the n-point Green’s functions on the lattice. Physical Review Letters 46: 757.
Montroll EW, Potts RB, and Ward JC (1963) Correlations and spontaneous magnetization of the two dimensional Ising model. Journal of Mathematical Physics 4: 308–322.

Onsager L (1944) Crystal statistics I. A two dimensional model with an order disorder transition. Physical Review 65: 117–149.
Orrick WP, Nickel BG, Guttmann AJ, and Perk JHH (2001) Critical behavior of the two dimensional Ising susceptibility. Physical Review Letters 86: 4120–4123.
Wu TT, McCoy BM, Tracy CA, and Barouch E (1976) Spin–spin correlation functions for the two dimensional Ising model, exact theory in the scaling region. Physical Review B 13: 315–374.
Zenine N, Boukraa S, Hassani S, and Maillard JM (2004) The Fuchsian differential equation of the square lattice Ising model $\chi^{(3)}$ susceptibility. Journal of Physics A 37: 9651–9668.
Zenine N, Boukraa S, Hassani S, and Maillard JM (2005) Ising model susceptibility: Fuchsian differential equation for $\chi^{(4)}$ and its factorization properties. Journal of Physics A 38: 4149–4173.

Two-Dimensional Models
B Schroer, Freie Universität Berlin, Berlin, Germany
© 2006 Elsevier Ltd. All rights reserved.

History and Motivation

Local quantum physics of systems with infinitely many interacting degrees of freedom leads to situations whose understanding often requires new physical intuition and mathematical concepts beyond that acquired in quantum mechanics and perturbative constructions in quantum field theory. In this situation, two-dimensional soluble models turned out to play an important role. On the one hand, they illustrate new concepts and sometimes remove misconceptions in an area where new physical intuition is still in the process of being formed. On the other hand, rigorously soluble models confirm that the underlying physical postulates are mathematically consistent, a task which for interacting systems with infinite degrees of freedom is mostly beyond the capability of pedestrian methods or brute force application of hard analysis on models whose natural invariances have been mutilated by a cutoff.

In order to underline these points and motivate the interest in two-dimensional QFT, let us briefly look at the history, in particular at the physical significance of the three oldest two-dimensional models of relevance for statistical mechanics and relativistic particle physics, in chronological order: the Lenz–Ising (L–I) model, Jordan’s model of bosonization/fermionization, and the Schwinger model (QED$_2$). (A more detailed account of the changeful history concerning their correct physical interpretation and generalizations to higher dimensions of these models and the increasing conceptual role of low-dimensional models in QFT can be found in Schroer (2005).)

The L–I model was proposed in 1920 by Wilhelm Lenz (see Lenz (1925)) as the simplest discrete statistical mechanics model with a chance to go beyond the P Weiss phenomenological ansatz involving long-range forces and instead explain ferromagnetism in terms of nonmagnetic short-range interactions. Its one-dimensional version was solved four years later by his student Ernst Ising. Its changeful history reached a temporary conceptual climax when Onsager succeeded in rigorously establishing a second-order phase transition in two dimensions.

Another conceptually rich model which lay dormant for almost two decades as a result of a misleading speculative higher-dimensional generalization by its protagonist is the bosonization/fermionization model first proposed by Jordan (1937). This model establishes a certain equivalence between massless two-dimensional fermions and bosons and is related to Thirring’s massless 4-fermion coupling model and also to Luttinger’s one-dimensional model of an electron gas (Schroer). One reason why even nowadays hardly anybody knows Jordan’s contribution is certainly the ambitious but unfortunate title “the neutrino theory of light” under which he published a series of papers. Both discoveries demonstrate the usefulness of having controllable low-dimensional models; at the same time, their complicated history also illustrates the danger of rushing to premature “intuitive” conclusions about extensions to higher dimensions.

A review of the early historical benchmarks of conceptual progress through the study of solvable two-dimensional models would be incomplete without mentioning Schwinger’s (1962) proposed solution of two-dimensional quantum electrodynamics, afterwards referred to as the Schwinger model. He used this model in order to argue that gauge theories are not necessarily tied to zero-mass vector particles. Some work was necessary (Schroer) to unravel its physical content with the result that the would-be charge of that QED$_2$ model was “screened” and its apparent chiral symmetry broken; in other words, the model exists only in the so-called Schwinger–Higgs phase with massive free scalar particles accounting for its physical content. Another closely related aspect of

this model which also arose in the Lagrangian setting of four-dimensional gauge theories was that of the $\theta$-angle parametrizing an ambiguity in the quantization.

A coherent and systematic attempt at a mathematical control of two-dimensional models came in the wake of Wightman’s first rigorous programmatic formulation of QFT (Schroer 2005). This formulation stayed close to the physical ideas underlying the impressive success of renormalized QED perturbation theory, although it avoided the direct use of Lagrangian quantization. The early attempts towards a “constructive QFT” found their successful realization in two-dimensional QFT (the $P(\varphi)_2$ models (Glimm and Jaffe 1987)); the restriction to low dimensions is related to the mild short-distance singularity behavior (super-renormalizability) which these methods require. We will focus our main attention on alternative constructive methods which, even though not suffering from such short-distance restrictions, also suffer from a lack of mathematical control in higher spacetime dimensions; the illustration of the constructive power of these new methods comes presently from massless $d = 1+1$ conformal and chiral QFT as well as from massive factorizing models.

There are several books and review articles (Furlan et al. 1989, Ginsparg 1990, Di Francesco et al. 1996) on $d = 1+1$ conformal as well as on massive factorizing models (Abdalla et al. 1991). To the extent that concepts and mathematical structures are used which permit no extension to higher dimensions (Kac–Moody algebras, loop groups, integrability, presence of an infinite number of conservation laws), this line of approach will not be followed in this article since our primary interest will be the use of two-dimensional models of QFT as “theoretical laboratories” of general QFT. Our aim is twofold: on the one hand, we intend to illustrate known principles of general QFT in a mathematically controllable context and, on the other hand, we want to identify new concepts whose adaptation to QFT in $d = 1+1$ leads to their solvability (Schroer).

General Concepts and Their Two-Dimensional Manifestation

The general framework of QFT, to which the world of controllable two-dimensional models contributes as an important testing ground, exists in two quite different but nevertheless closely related formulations: the 1956 approach in terms of pointlike covariant fields due to Wightman (see Streater and Wightman (1964)) (see Axiomatic Quantum Field Theory), and the more algebraic setting which can be traced back to ideas which Haag (1992) developed shortly after and which are based on spacetime-indexed operator algebras and related concepts which developed over a long period of time, with contributions of many other authors to what is now referred to as algebraic QFT (AQFT) or simply local quantum physics (LQP). Whereas the Wightman approach aims directly at the (not necessarily observable) quantum fields, the operator-algebraic setting (see Algebraic Approach to Quantum Field Theory) is more ambitious. It starts from physically well-motivated assumptions about the algebraic structure of local observables and aims at the reconstruction of the full field theory (including the operators carrying the superselected charges) in the spirit of a local representation theory of (the assumed structure of the) local observables. This has the advantage that the somewhat mysterious concept of an inner symmetry (as opposed to outer (spacetime) symmetry) can be traced back to its physical root, which is the representation-theoretical structure of the local observable algebra (see Symmetries in Quantum Field Theory of Lower Spacetime Dimensions). In the standard Lagrangian quantization approach, the inner symmetry is part of the input (multiplicity indices of field components on which subgroups of $U(n)$ or $O(n)$ act linearly) and hence it is not possible to problematize this fundamental question. When in low-dimensional spacetime dimensions the sharp separation (the Coleman–Mandula theorems) of inner versus outer symmetry becomes blurred as a result of the appearance of braid group statistics, the standard Lagrangian quantization setting of most of the textbooks is inappropriate and even the Wightman framework has to be extended. In that case, the algebraic approach is the most appropriate.

The important physical principles which are shared between the Wightman approach (see Streater and Wightman (1964)) and the operator algebra (AQFT) setting (Haag 1992) are the spacelike locality or Einstein causality (in terms of pointlike fields or algebras localized in causally disjoint regions) and the existence of positive-energy representations of the Poincaré group implementing covariance and the stability of matter. In the algebraic approach, the observable content of the theory is encoded into a rich family of (weakly closed) operator algebras $\{\mathcal{A}(\mathcal{O})\}_{\mathcal{O}\in\mathcal{K}}$ indexed by a family of convex causally closed spacetime regions $\mathcal{O}$ (with $\mathcal{O}'$ denoting the spacelike complement and $\mathcal{A}'$ the von Neumann commutant) which act in one common Hilbert space. Covariant local fields lose their distinguished role

which they have in the classical setting and which (via Lagrangian quantization) was at least partially inherited by the Wightman approach and, apart from their role as local generators of symmetries (conserved currents), become mere “field coordinatizations” of local algebras. (There is a denumerable set of such pointlike field generators which form a local equivalence (Borchers) class of fields and which, in the absence of interactions, permits a neat description in terms of Wick-ordered free-field polynomials (Haag 1992).) Certain properties cannot be naturally formulated in the pointlike field setting (e.g., Haag duality for convex regions, $\mathcal{A}(\mathcal{O}') = \mathcal{A}(\mathcal{O})'$), but apart from those properties the two formulations are quite close; in particular for two-dimensional theories there are convincing arguments that one can pass between the two without imposing additional technical requirements. (Haag duality holds for observable algebras in the vacuum sector in the sense that any violation can be explained in terms of a spontaneously broken symmetry; in local theories, it can always be enforced by dualization and the resulting Haag dual algebra has a charge superselection structure associated with the unbroken subgroup.)

Haag duality is the statement that the commutant of observables not only contains the algebra of the causal complement, that is, $\mathcal{A}(\mathcal{O}') \subset \mathcal{A}(\mathcal{O})'$ (Einstein causality), but is even exhausted by it; it is deeply connected to the measurement process, and its violation in the vacuum sector for convex causally complete regions signals spontaneous symmetry breaking in the associated charge-carrying field algebra (Haag 1992). It can always be enforced (assuming that the wedge-localized algebras fulfill [1] below) by a symmetry-reducing extension called Haag dualization. Its violation for multilocal regions reveals the charge content of the model via charge–anticharge splitting in the neutral observable algebra (Schroer).

Another physically important property which has a natural algebraic formulation is the split property: for regions $\mathcal{O}_i$ separated by a finite spacelike distance, one finds $\mathcal{A}(\mathcal{O}_1 \cup \mathcal{O}_2) \simeq \mathcal{A}(\mathcal{O}_1) \otimes \mathcal{A}(\mathcal{O}_2)$, which can be derived from the Buchholz–Wichmann “nuclearity property” (Haag 1992) (an appropriate adaptation of the “finiteness of phase-space cell” property of QM to QFT). Related to Haag duality is the local version of the “time slice property” (the QFT counterpart of the classical causal dependency property), sometimes referred to as “strong Einstein causality”: $\mathcal{A}(\mathcal{O}'') = \mathcal{A}(\mathcal{O})''$.

One of the most astonishing achievements of the algebraic approach (which justifies its emphasis on properties of “local observables”) is the DHR theory of superselection sectors (Doplicher et al. 1971), that is, the realization that the structure of charged (nonvacuum) representations (with the superposition principle being valid only within one representation) and the spacetime properties of the generating fields which are the carriers of these generalized charges (including their spacelike commutation relations which lead to the particle statistics and also to their internal symmetry properties) are already encoded in the structure of the Einstein causal observable algebra (Symmetries in Quantum Field Theory: Algebraic Aspects). The intuitive basis of this remarkable result (whose prerequisite is locality) is that one can generate charged sectors by spatially separating charges in the vacuum (neutral) sector and disposing of the unwanted charges at spatial infinity (Haag 1992).

An important concept which especially in $d = 1+1$ has considerable constructive clout is “modular localization.” It is a consequence of the above algebraic setting if either the net of algebras has pointlike field generators, or if the one-particle masses are separated by spectral gaps so that the formalism of time-dependent scattering can be applied (Schroer 2005); in conformal theories, this property holds automatically in all spacetime dimensions. It rests on the basic observation (Tomita–Takesaki Modular Theory) that a standard pair $(\mathcal{A}, \Omega)$ of a von Neumann operator algebra and a standard vector (standardness means that the vector $\Omega$ is cyclic and separating for the operator algebra $\mathcal{A}$) gives rise to a Tomita operator $S$ through its star-operation, whose polar decomposition yields two modular objects: a one-parametric subgroup $\Delta^{it}$ of the unitary group of operators in Hilbert space, whose Ad-action defines the modular automorphism of $(\mathcal{A}, \Omega)$, and the angular part $J$, the modular conjugation, which maps $\mathcal{A}$ into its commutant $\mathcal{A}'$:

\[
\begin{aligned}
& S A \Omega = A^{*} \Omega, \qquad S = J \Delta^{1/2} \\
& J_W = U(j_W) = S_{\mathrm{scat}} J_0, \qquad \Delta_W^{it} = U(\Lambda_W(2\pi t)) \\
& \sigma_W(t) := \mathrm{Ad}\, \Delta_W^{it}
\end{aligned}
\qquad [1]
\]

The standardness assumption is always satisfied for any field-theoretic pair $(\mathcal{A}(\mathcal{O}), \Omega)$ of an $\mathcal{O}$-localized algebra and the vacuum state (as long as $\mathcal{O}$ has a nontrivial causal disjoint $\mathcal{O}'$), but it is only for the wedge region $W$ that the modular objects have a physical interpretation in terms of the global symmetry group of the vacuum as specified in the second line of [1]; the modular unitary $\Delta_W^{it}$ represents the $W$-associated boost $\Lambda_W$ and the modular conjugation $J_W$ implements the TCP-like reflection along the edge of the wedge (Bisognano

and Wichmann 1975). The third line is the definition of the modular group. The importance of this theory for local quantum physics results from the fact that it leads to the concept of modular localization, an intrinsic new scenario for field-theoretic constructions which is different from the Lagrangian quantization schemes (Schroer 2005).

A special feature of $d = 1+1$ Minkowski spacetime is the disconnectedness of the right/left spacelike region, leading to a right–left ordering structure. So in addition to the Lorentz-invariant timelike ordering $x \prec y$ ($x$ earlier than $y$, which is independent of spacetime dimensions), there is an invariant spacelike ordering $x < y$ ($x$ to the left of $y$) in $d = 1+1$ which opens the possibility of more general Lorentz-invariant spacelike commutation relations than those implemented by Bose/Fermi fields (Rehren and Schroer 1987), namely those of fields with a spacelike braid group commutation structure. The appearance of such exotic statistics fields is not compatible with their Fourier transforms being creation/annihilation operators for Wigner particles; rather, the state vectors which they generate from the vacuum contain, in addition to the one-particle contribution, a vacuum polarization cloud (Schroer 2005). This close connection between new kinematic possibilities and interactions is one of the reasons why (in contrast to higher dimensions, where interactions are prescribed by the recipe of local couplings of free fields) low-dimensional QFT offers a more intrinsic access to the central issue of interactions.

Boson/Fermion Equivalence and Superselection Theory in a Special Model

The simplest and oldest but conceptually still rich model is obtained, as first proposed by Jordan (1937), by using a two-dimensional massless Dirac current and showing that it may be expressed in terms of scalar canonical Bose creation/annihilation operators

\[
j_\mu = \,:\!\bar\psi \gamma_\mu \psi\!:\, = \partial_\mu \Phi, \qquad
\Phi := \int_{-\infty}^{+\infty} \left\{ e^{ipx} a^{*}(p) + \mathrm{h.c.} \right\} \frac{dp}{2|p|}
\qquad [2]
\]

Although the potential $\Phi(x)$ of the current is, as a result of its infrared divergence, not a field in the standard sense of an operator-valued distribution in the Fock space of the $a(p)^{\#}$ (it becomes an operator after smearing with test functions whose Fourier transform vanishes at $p = 0$), the formal exponential defined as the zero-mass limit of a well-defined exponential free massive field

\[
:e^{i\alpha\Phi(x)}: \;=\; \lim_{m \to 0} m^{\alpha^{2}/2} :e^{i\alpha\Phi_m(x)}:
\qquad [3]
\]

turns out to be a bona fide quantum field in a larger Hilbert space (which extends the Fock space generated from applying currents to the vacuum). The power in front is determined by the requirement that all Wightman functions (computed with the help of free-field Wick combinatorics) stay finite in this massless limit; the necessary and sufficient condition for this is the charge conservation rule

\[
\Big\langle \prod_i :e^{i\alpha_i \Phi(x_i)}: \Big\rangle =
\begin{cases}
\displaystyle \prod_{i<j} \left( \frac{1}{(\xi_{+ij})_{\varepsilon}\,(\xi_{-ij})_{\varepsilon}} \right)^{-\frac{1}{2}\alpha_i \alpha_j}, & \sum_i \alpha_i = 0 \\[2ex]
0, & \text{otherwise}
\end{cases}
\qquad [4]
\]

where the resulting correlation function has been factored in terms of light-ray coordinates $\xi_{\pm ij} = x_{\pm i} - x_{\pm j}$, $x_\pm = t \pm x$, and the $\varepsilon$-prescription stands for taking the standard Wightman boundary value $t \to t + i\varepsilon$, $\lim_{\varepsilon \to 0}$, which ensures the positive-energy condition. The finiteness of the limit ensures that the resulting zero-mass limiting theory is a bona fide quantum field theory, that is, its system of Wightman functions permits the construction of an operator theory in a Hilbert space with a distinguished vacuum vector.

The factorization into light-ray components [4] shows that the exponential charge-carrying operators inherit this factorization into two independent chiral components, $:\exp i\alpha\Phi(x): \;=\; :\exp i\alpha\Phi_+(x_+):\, :\exp i\alpha\Phi_-(x_-):$, each one being covariant under scaling $\xi \to \lambda\xi$ if one assigns the scaling dimension $d = \alpha^{2}/2$ to the chiral exponential field and $d = 1$ to the current. As any Wightman field, this is a singular object which only after smearing with Schwartz test functions yields an (unbounded) operator. But the above form of the correlation function belongs to a class of distributions which admits a much larger test-function space consisting of smooth functions which, instead of decreasing rapidly, only need to be bounded so that they stay finite on the compactified light-ray line $\dot{\mathbb{R}} = S^{1}$. To make this visible, one uses the Cayley transform (now $x$ denotes either $x_+$ or $x_-$)

\[
z = \frac{1 + ix}{1 - ix} \in S^{1}
\qquad [5]
\]

This transforms the Schwartz test functions into a space of test functions on $S^{1}$ which have an infinite order zero at $z = -1$ (corresponding to $x = \pm\infty$), but the rotational transformed fields $j(z)$, $:\exp i\alpha\Phi(z):$ permit the smearing with all smooth functions on $S^{1}$, a characteristic feature of all conformal invariant

 
theories as the present one turns out to be. There is an additional advantage in the use of this compactification. Fourier transforming the circular current actually allows for a quantum-mechanical zero mode whose possible nonzero eigenvalues indicate the presence of additional charge sectors beyond the charge-zero vacuum sector. For the exponential field, this leads to a quantum-mechanical pre-exponential factor which automatically ensures the charge selection rules so that unrestricted (by charge conservation) Wick contraction rules can be applied. In this approach, the original chiral Dirac fermion $\psi(x)$ (from which the current was formed as the $:\!\bar\psi\psi\!:$ composite) reappears as a charge-carrying exponential field for $\alpha = 1$ and thus illustrates the meaning of bosonization/fermionization. (It is interesting to note that Jordan’s (1937) original treatment of fermionization had such a pre-exponential quantum-mechanical factor.) Naturally, this terminology has to be taken with a grain of salt in view of the fact that the bosonic current algebra only generates a superselected subspace into which the charge-carrying exponential field does not fit. Only in the case of massive two-dimensional QFT can fermions be incorporated into a Fock space of bosons (see last section). At this point, it should however be clear to the reader that the physical content of Jordan’s paper had nothing to do with its misleading title “neutrino theory of light” but rather was an early illustration of charge superselection rules in two-dimensional QFT.

A systematic and rigorous approach consists in solving the problem of positive-energy representation theory for the Weyl algebra on the circle (which is the rigorous operator-algebraic formulation of the abelian current algebra). (The Weyl algebra originated in quantum mechanics around 1927; its use in QFT only appeared after the cited Jordan paper. By representation we mean here a regular representation in which the exponentials can be differentiated in order to obtain (unbounded) smeared current operators.) It is the operator algebra generated by the exponential of a smeared chiral current (always with real test functions) with the following relation between the generators

\[
W(f) = e^{ij(f)}, \qquad j(f) = \oint \frac{dz}{2\pi i}\, j(z) f(z), \qquad [\,j(z), j(z')\,] = -\delta'(z - z')
\qquad [6a]
\]

\[
W(f) W(g) = e^{-\frac{1}{2} s(f,g)}\, W(f + g), \qquad W^{*}(f) = W(-f)
\qquad [6b]
\]

\[
\mathcal{A}(S^{1}) = \mathrm{alg}\{ W(f),\; f \in C^{\infty}(S^{1}) \}, \qquad
\mathcal{A}(I) = \mathrm{alg}\{ W(f),\; \mathrm{supp}\, f \subset I \}
\qquad [6c]
\]

where

\[
s(f, g) = \oint \frac{dz}{2\pi i}\, f'(z) g(z)
\]

is the symplectic form which characterizes the Weyl algebra structure and [6c] denotes the unique $C^{*}$-algebra generated by the unitary objects $W(f)$. A particular representation of this algebra is given by assigning the vacuum state to the generators: $\langle W(f) \rangle_0 = e^{-\frac{1}{2}\|f\|_0^{2}}$, $\|f\|_0^{2} = \sum_{n \ge 1} n |f_n|^{2}$. Starting with the vacuum Hilbert space representation $\mathcal{A}(S^{1})_0 = \pi_0(\mathcal{A}(S^{1}))$, one easily checks that the formula

\[
\langle W(f) \rangle_\alpha := e^{i\alpha f_0} \langle W(f) \rangle_0
\qquad [7a]
\]

\[
\pi_\alpha(W(f)) = e^{i\alpha f_0}\, \pi_0(W(f))
\qquad [7b]
\]

defines a state with positive energy, that is, one whose GNS representation for $\alpha \neq 0$ is unitarily inequivalent to the vacuum representation. Its incorporation into the vacuum Hilbert space [7b] is part of the DHR formalism. It is convenient to view this change as the result of an application of an automorphism $\gamma_\alpha$ on the $C^{*}$-Weyl algebra $\mathcal{A}(S^{1})$ which is implemented by a unitary charge-generating operator $\Gamma_\alpha$ in a larger (nonseparable) Hilbert space which contains all charge sectors $H_\alpha = \Gamma_\alpha H_0$, $H_0 \equiv H_{\mathrm{vac}} = \overline{\mathcal{A}(S^{1})\Omega}$:

\[
\langle W(f) \rangle_\alpha = \langle \gamma_\alpha(W(f)) \rangle_0, \qquad
\gamma_\alpha(W(f)) = \Gamma_\alpha W(f) \Gamma_\alpha^{*}
\qquad [8]
\]

$\Gamma_\alpha \Omega = \Omega_\alpha$ describes a state with a rotational homogeneous charge distribution; arbitrary charge distributions $\rho_\alpha$ of total charge $\alpha$, that is, $\oint (dz/2\pi i)\, \rho_\alpha = \alpha$, are obtained in the form

\[
\psi^{\zeta}_{\rho_\alpha} = \eta(\rho_\alpha)\, W(\hat\rho_\alpha)\, \Gamma_\alpha
\qquad [9]
\]

where $\eta(\rho_\alpha)$ is a numerical phase factor and the net effect of the Weyl operator is to change the rotational homogeneous charge distribution into $\rho_\alpha$. The necessary charge-neutral compensating function $\hat\rho_\alpha$ in the Weyl cocycle $W(\hat\rho_\alpha)$ is uniquely determined in terms of $\rho_\alpha$ up to the choice of one point $\zeta \in S^{1}$ (the determining equation involves the $\ln z$ function, which needs the specification of a branch cut (Schroer 2005)). From this formula, one derives the commutation relations $\psi^{\zeta}_{\rho_\alpha} \psi^{\zeta}_{\rho_\beta} = e^{\pm i\pi\alpha\beta}\, \psi^{\zeta}_{\rho_\beta} \psi^{\zeta}_{\rho_\alpha}$ for spacelike separations of the $\rho$ supports; hence, these fields are relatively local (bosonic) for $\alpha\beta \in 2\mathbb{Z}$. In particular, if only one

type of charge is present, the generating charge is $\alpha_{\mathrm{gen}} = \sqrt{2N}$ and the composite charges are multiples, that is, $\alpha_{\mathrm{gen}}\mathbb{Z}$. This locality condition providing bosonic commutation relations does not yet ensure the $\zeta$-independence. Since the equation which controls the $\zeta$-change turns out to be

\[
\psi^{\zeta_1}_{\rho_\alpha} \big( \psi^{\zeta_2}_{\rho_\alpha} \big)^{*} = e^{i\pi\alpha^{2}}\, e^{2\pi i \alpha Q}
\qquad [10]
\]

one achieves $\zeta$-independence by restricting the Hilbert space charges to be “dual” to that of the operators, that is,

\[
Q = \frac{1}{\sqrt{2N}}\, \mathbb{Z}
\]

The localized operators $\psi^{\zeta}_{\rho}$ of the form [9], acting on the restricted separable Hilbert space $H_{\mathrm{res}}$, generate a $\zeta$-independent extended observable algebra $\mathcal{A}_N(S^{1})$ (Schroer), and it is not difficult to see that its representation in $H_{\mathrm{res}}$ is reducible and that it decomposes into $2N$ charge sectors

\[
\frac{1}{\sqrt{2N}}\, n, \qquad n = 0, 1, \ldots, 2N - 1
\]

Hence, the process of extension has led to a charge quantization with a finite (“rational”) number of charges relative to the new observable algebra, which is neutral in the new charge counting:

\[
\frac{1}{\alpha_{\mathrm{gen}}}\mathbb{Z} \Big/ \alpha_{\mathrm{gen}}\mathbb{Z} = \mathbb{Z} / \alpha_{\mathrm{gen}}^{2}\mathbb{Z} = \mathbb{Z}_{2N}
\]

The charge-carrying fields in the new setting are also of the above form [9], but now the generating field carries the charge

\[
\oint \frac{dz}{2\pi i}\, \rho_{\mathrm{gen}} = Q_{\mathrm{gen}}
\]

which is a $(1/2N)$ fraction of the old $\alpha_{\mathrm{gen}}$. Their commutation relations for disjoint charge supports are “braidal” (or better “plektonic,” which is more on par with being bosonic/fermionic). (In the abelian case like the present, the terminology “anyonic” enjoys widespread popularity, but in the present context the “any” does not go well with charge quantization.) These objects considered as operators localized on $S^{1}$ do depend on the cut $\zeta$, but using an appropriate finite covering of $S^{1}$ this dependence is removed (Schroer 2005). So the field algebra $\mathcal{F}_{\mathbb{Z}_{2N}}$ generated by the charge-carrying fields (as opposed to the bosonic observable algebra $\mathcal{A}_N$) has its unique localization structure on a finite covering of $S^{1}$. An equivalent description which gets rid of $\zeta$ consists in dealing with operator-valued sections on $S^{1}$. The extension $\mathcal{A} \to \mathcal{A}_N$, which renders the Hilbert space separable and quantizes the charges, seems to be characteristic for abelian current algebra; in all other models which have been constructed up to now the number of sectors is at least denumerable and in the more interesting ones even finite (rational models).

An extension is called maximal if there exists no further extension which maintains the bosonic commutation relation. For the case at hand, this would require the presence of another generating field of the same kind as above, which belongs to an integer $N'$ and is relatively local to the first one. This is only possible if $N$ is divisible by a square.

In passing, it is interesting to mention a somewhat unexpected relation between the Schwinger model, whose charges are screened, and the Jordan model. Since the Lagrangian formulation of the Schwinger model is a gauge theory, the analog of the four-dimensional “asymptotic freedom” wisdom would suggest the possibility of “charge liberation” in the short-distance limit of this model. This seems to contradict the statement that the intrinsic content of the Schwinger model (QED$_2$ with massless fermions), after removing a classical degree of freedom, is the QFT of a free massive Bose field, and such a simple free field is at first sight not expected to contain subtle information about asymptotic charge liberation. (In its original gauge-theoretical form, the Schwinger model has an infinite vacuum degeneracy. The removal of this degeneracy (restoration of the cluster property) with the help of the “$\theta$-angle formalism” leaves a massive free Bose field (the Schwinger–Higgs mechanism). As expected in $d = 1+1$, the model only possesses this phase.) However, as we have seen above, the massless limit really does have liberated charges, and the short-distance limit of the massive free field is the massless model (Schroer).

As a result of the peculiar bosonization/fermionization aspect of the zero-mass limit of the derivative of the massive free field, Jordan’s model is also closely related to the massless Thirring model (and the related Luttinger model for an interacting one-dimensional electron gas) whose massive version is in the class of factorizing models (see later section). (Another structural consequence of this aspect leads to Coleman’s theorem (Schroer 2005), which connects the Mermin–Wagner no-go theorem for two-dimensional spontaneous continuous symmetry breaking with these zero-mass peculiarities.) The Thirring model is a special case in a vast class of “generalized” multicoupling multicomponent Thirring models, that is, models with 4-fermion interactions. Under this name they were studied in the early 1970s (Schroer) with the aim to identify massless subtheories for which the currents form chiral current algebras.
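The charge quantization obtained for the extended abelian current algebra can be checked with a few lines of exact arithmetic. The following Python sketch is only an illustration under stated assumptions: it takes the exchange phase of two charge carriers to be $e^{i\pi\alpha\beta}$ and assumes that the cut-change factor in [10] has the form $e^{i\pi\alpha^{2}} e^{2\pi i\alpha Q}$; a charge $n/\sqrt{2N}$ is labeled by the integer $n$, and the helper names (`sector_of`, `charge_sectors`, `exchange_exponent`, `cut_exponent`) are hypothetical and not taken from the literature.

```python
from fractions import Fraction


def sector_of(n: int, N: int) -> int:
    """Sector of the charge Q = n/sqrt(2N): its class modulo the
    observable charges sqrt(2N)*Z, i.e. n modulo 2N."""
    return n % (2 * N)


def charge_sectors(N: int) -> list[int]:
    """Representatives n = 0, ..., 2N-1 of (1/alpha_gen)Z / alpha_gen Z."""
    return list(range(2 * N))


def exchange_exponent(n1: int, n2: int, N: int) -> Fraction:
    """x (mod 2) in the assumed exchange phase exp(i*pi*x), where
    x = alpha*beta = n1*n2/(2N) for charges n1/sqrt(2N) and n2/sqrt(2N)."""
    return Fraction(n1 * n2, 2 * N) % 2


def cut_exponent(n: int, N: int) -> Fraction:
    """x (mod 2) in exp(i*pi*x) = exp(i*pi*alpha^2) * exp(2*pi*i*alpha*Q)
    for alpha = sqrt(2N) and Q = n/sqrt(2N), so that x = 2N + 2n."""
    return Fraction(2 * N + 2 * n) % 2


if __name__ == "__main__":
    N = 3
    print("sectors:", charge_sectors(N))
    # Observable charges are multiples of alpha_gen = sqrt(2N), i.e. n = 2N*m.
    print("observable/observable exponent:", exchange_exponent(2 * N, 4 * N, N))
    # Self-exchange exponent of each sector (a fraction signals plektonic behavior).
    for n in charge_sectors(N):
        print(n, exchange_exponent(n, n, N))
```

Working with the exponent modulo 2 as a `Fraction` avoids floating-point square roots: a vanishing exponent means a trivial phase, so the observable charges come out bosonic among themselves, the cut-change factor is trivial on every dual charge, and the self-exchange exponent depends only on the class of $n$ in $\mathbb{Z}_{2N}$.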

The counterpart of the potential of the conserved Dirac current in the massive Thirring model is the sine-Gordon field, that is, a composite field which in the attractive regime of the Thirring coupling again obeys the so-called sine-Gordon equation of motion. Coleman gave a supportive argument (Schroer 2005), but some fine points about the range of its validity in terms of the coupling strength remained open. (It was noticed that the current potential of the free massive Dirac fermion ($g = 0$) does not obey the sine-Gordon equation (Schroer 2005).) A rigorous confirmation of these facts was recently given in the bootstrap form-factor setting (Schroer 2005). Massive models which have a continuous or discrete internal symmetry have “disorder” fields which implement a “half-space” symmetry on the charge-carrying field (acting as the identity in the other half-axis) and together with the basic pointlike field form composites which have exotic commutation relations (see the last section).

The Conformal Setting, Structural Results

Chiral theories play a special role within the setting of conformal quantum fields. General conformal theories have observable algebras which live on compactified Minkowski space ($S^{1}$ in the case of chiral models) and fulfill the Huygens principle, which in an even number of spacetime dimensions means that the commutator is only nonvanishing for lightlike separation of the fields. The fact that this classically expected behavior breaks down for nonobservable conformal fields (e.g., the massless Thirring field) was noticed at the beginning of the 1970s and considered paradoxical at that time (“reverberation” in the timelike (Huygens) region). Its resolution around 1974–75 confirmed that such fields are genuine conformal covariant objects but that some fine points about their causality needed to be addressed. The upshot was the proposal of two different but basically equivalent concepts about globally causal fields. They are connected by the following global decomposition formula:

\[
A(x_{\mathrm{cov}}) = \sum_{\alpha,\beta} A_{\alpha,\beta}(x), \qquad
A_{\alpha,\beta}(x) = P_\alpha A(x) P_\beta, \qquad
Z = \sum_\alpha e^{i d_\alpha} P_\alpha
\qquad [11]
\]

On the left-hand side, the spacetime point of the field is a point on the universal covering of the conformal compactified Minkowski space. These are fields (Lüscher and Mack 1975) (Schroer 2005) which “live” in the sense of quantum (modular) localization on the universal covering spacetime (or on a finite covering, depending on the “rationality” of the model) and fulfill the global causality condition previously discovered by I Segal (Schroer 2005). They are generally highly reducible with respect to the center of the covering group. The family of fields on the right-hand side, on the other hand, are fields which were introduced (Schroer and Swieca 1974; Schroer et al. 1975) with the aim to have objects which live on the projection $x(x_{\mathrm{cov}})$, that is, on the spacetime of the physics laboratory instead of the “hells and heavens” of the covering (Schroer 2005). They are operator-distributional valued sections in the compactification of ordinary Minkowski spacetime. The connection is given by the above decomposition formula into irreducible conformal blocks with respect to the center $\mathbf{Z}$ of the noncompact covering group $\widetilde{SO(2,n)}$, where $\alpha, \beta$ are labels for the eigenspaces of the generating unitary $Z$ of the abelian center $\mathbf{Z}$. The decomposition [11] is minimal in the sense that in general there will be a refinement due to the presence of additional charge superselection rules (and internal group symmetries). The component fields are not Wightman fields since they annihilate the vacuum if the right-hand projection differs from $P_0 = P_{\mathrm{vac}}$.

Note that the Huygens (timelike) region in Minkowski spacetime has a timelike ordering structure $x \prec y$ or $x \succ y$ (earlier or later). In $d = 1+1$, the topology allows in addition a spacelike left–right ordering $x \lessgtr y$. In fact, it is precisely the presence of these two orderings in conjunction with the factorization of the vacuum symmetry group $\widetilde{SO(2,2)} \simeq \widetilde{PSL(2,\mathbb{R})}_l \times \widetilde{PSL(2,\mathbb{R})}_r$, in particular $\mathbf{Z} = \mathbf{Z}_l \times \mathbf{Z}_r$, which is at the root of a significant simplification. This situation suggested a tensor factorization into chiral components and led to an extremely rich and successful construction program of two-dimensional conformal QFT as a two-step process: the classification of chiral observable algebras on the light ray and the amalgamation of left–right chiral theories to two-dimensional local conformal QFT. The action on the circular coordinates $z$ is through fractional $SU(1,1)$ transformations

\[
g(z) = \frac{\alpha z + \beta}{\bar\beta z + \bar\alpha}
\]

whereas the covering group acts on the Mack–Lüscher covering coordinates.

The presence of an ordering structure permits the appearance of more general commutation relations for the above $A_{\alpha,\beta}$ component fields, namely

\[
A_{\alpha,\beta}(x)\, B_{\beta,\gamma}(y) = \sum_{\beta'} R^{(\alpha,\gamma)}_{\beta,\beta'}\, B_{\alpha,\beta'}(y)\, A_{\beta',\gamma}(x), \qquad x > y
\qquad [12]
\]

with numerical $R$-coefficients which, as a result of associativity and relative commutativity with respect to observable fields, have to obey certain structure relations; in this way, Artin braid relations emerge as a new manifestation of the Einstein causality principle for observables in low-dimensional QFT (Rehren and Schroer 1989) (see Schroer 2005). Indeed, the DHR method to interpret charged fields as charge superselection carriers (tied by local representation theory to the bosonic local structure of observable algebras) leads precisely to such a plektonic statistics structure (Fredenhagen et al. 1992, Gabbiani and Froehlich 1993) for systems in low spacetime dimension (see Symmetries in Quantum Field Theory of Lower Spacetime Dimensions). With an appropriately formulated adjustment to observables fulfilling the Huygens commutativity, this plektonic structure (but now disconnected from particle/field statistics) is also a possible manifestation of causality for the higher-dimensional timelike structure (Schroer 2005).

The only examples known up to the appearance of the seminal BPZ work (Belavin et al. 1984) were the abelian current models of the previous section, which furnish a rather poor man’s illustration of the richness of the decomposition theory. The floodgates of conformal QFT were only opened after the BPZ discovery of “minimal models,” which was preceded by the observation (Friedan et al. 1984) that the algebra of the stress–energy tensor came with a new representation structure which was not compatible with an underlying internal group symmetry (see Symmetries in Quantum Field Theory: Algebraic Aspects).

An important step in the structural study of chiral models was the recognition that the energy–momentum tensor has the commutation structure of a Lie field (Schroer 2005); in the next section, its algebraic structure and its representation theory will be presented.

$(\alpha, \alpha) \in 2\mathbb{Z}$, that is, an even-integer lattice $L$ in $V$, whereas the restricted Hilbert subspace $H_{L^{*}}$ which ensures $\zeta$-independence is associated with the dual lattice $L^{*}$: $(\lambda_i, \alpha_k) = \delta_{ik}$, which contains $L$. The resulting superselection structure (i.e., the $Q$-spectrum) corresponds to the finite factor group $L^{*}/L$. For self-dual lattices $L^{*} = L$ (which can only occur if $\dim V$ is a multiple of 8), the resulting observable algebra has only the vacuum sector; the most famous case is the Leech lattice $\Lambda_{24}$ in $\dim V = 24$, also called the “moonshine” model. The observation that the root lattices of the Lie algebras of types A, D, or E (e.g., $su(n)$ corresponding to $A_{n-1}$) also appear among the even-integral lattices suggests that the nonabelian current algebras associated to those Lie algebras can also be implemented. This turns out to be indeed true as far as the level-1 representations are concerned, which brings us to the second family: the nonabelian current algebras of level $k$ associated to those Lie algebras; they are characterized by the commutation relation

\[
[\,J_\alpha(z), J_\beta(z')\,] = i f_{\alpha\beta}^{\;\;\gamma}\, J_\gamma(z)\, \delta(z - z') - \tfrac{1}{2} k\, g_{\alpha\beta}\, \delta'(z - z')
\qquad [13]
\]

where $f_{\alpha\beta}^{\;\;\gamma}$ are the structure constants of the underlying Lie algebra, $g_{\alpha\beta}$ their Cartan–Killing form, and $k$, the level of the algebra, must be an integer in order that the current algebra can be globalized to a loop group algebra. The Fourier decomposition of the current leads to the so-called affine Lie algebras, a special family of Kac–Moody algebras. For $k = 1$, these currents can be constructed as bilinears in terms of the multicomponent chiral Dirac field; there also exists the mentioned possibility to obtain them by constructing their maximal Cartan currents within the above abelian setting and representing the remaining nondiagonal currents as certain charge-carrying (“vertex” algebra) operators. Level-$k$ algebras can be constructed from reducing tensor
products of k level-1 currents or directly via the
representation theory of infinite-dimensional affine
Chiral Fields and Two-Dimensional
Lie algebras. (The global exponentiated algebras
Conformal Models
(the analogs to the Weyl algebra) are called loop
Let us start with a family which generalizes the group algebras.) Either way one finds that, for
abelian model of the previous section. Instead of a example, the SU(2) current algebra of level k has
one-component abelian current we now take n (together with the vacuum sector) k þ 1 sectors
independent copies. The resulting multicomponent (inequivalent representations). The different sectors
Weyl algebra has the previous form except that the are already distinguished by the structure of their
current is n-component and the real function space ground states of the conformal Hamiltonian L0 .
underlying the Weyl algebra consists of functions Although the computation of higher point correla-
with values in an n-component real vector space tion functions for k > 1, there is no problem in
f 2 LV with the standard Euclidean inner product securing the existence of the algebraic nets which
denoted by ( , ). The local extension now leads to define these chiral models as well as their k þ 1
representation sectors and to identify their generating charge-carrying fields (primary fields) including their R-matrices appearing in their plektonic commutation relations. It is customary to use the notation $SU(2)_k$ for the abstract operator algebras associated with the current generators [13] and we will denote their k + 1 equivalence classes of representations by $A_{SU(2)_k, n}$, n = 0, ..., k, whereas representations of current algebras for higher rank groups require a more complicated labeling (in terms of Weyl chambers).

The third family of models are the so-called minimal models which are associated with the Lie-field commutation structure of the chiral stress–energy tensor which results from the chiral decomposition of a conformally covariant two-dimensional stress–energy tensor

\[ [T(z), T(z')] = i\,(T(z) + T(z'))\,\delta'(z - z') + \frac{ic}{24\pi}\,\delta'''(z - z') \qquad [14] \]

whose Fourier decomposition yields the Witt–Virasoro algebra, that is, a central extension of the Lie algebra of $\mathrm{Diff}(S^1)$. (The presence of the central term in the context of QFT (the analog of the Schwinger term) was noticed later; however, the terminology Witt–Virasoro algebra in the physics literature came to mean the Lie algebra of diffeomorphisms of the circle including the central extension.) The first two coefficients are determined by the physical role of T(z) as the generating field density for the Lie algebra of the Poincaré group and for the generation of the Moebius transformations, whereas the undetermined parameter c > 0 (the central extension parameter; positivity of the two-point function) is easily identified with the strength of the two-point function. Although the structure of the T-correlation functions resembles that of free fields (in the sense that there is an algebraically computable unique set of correlation functions once one has specified the two-point function), the realization that c is subject to a discrete quantization if c < 1 came as a surprise. As already mentioned, the observation that the superselection sectors (the positive-energy representation structure) of this algebra did not at all follow the logic of a representation theory of an inner symmetry group generated a lot of attention and stimulated a flurry of publications on symmetry concepts beyond groups (quantum groups). A concept of fundamental importance is the DHR theory of localized endomorphisms of operator algebras and the concept of operator-algebraic inclusions (in particular, inclusions with conditional expectations: V Jones inclusions).

The $SU(2)_k$ current coset construction (Goddard et al. 1985) revealed that the proof of existence and the actual construction of the minimal models is related to that of the $SU(2)_k$ current algebras. Constructing a chiral model does not necessarily mean the explicit determination of the n-point Wightman functions of their generating fields (which for most chiral models remains a prohibitively complicated task) but rather a proof of their existence by demonstrating that these models are obtained from free fields by a series of computationally complicated but mathematically controlled operator-algebraic steps such as reduction of tensor products, formation of orbifolds under group actions, coset constructions, and a special kind of extensions. The generating fields of the models are nontrivial in the sense of not obeying free-field equations (i.e., not being "on-shell"). The cases where one can write down explicit n-point functions of generating fields are very rare; in the case of the minimal family this is limited to the field theory of the Ising model (Schroer 2005).

To show the power of inclusion theory for the determination of the charge content of a theory, let us look at a simple illustration in the context of the above multicomponent abelian current algebra. The vacuum representation of the corresponding Weyl algebra is generated from smooth V-valued functions on the circle modulo constant functions (i.e., functions with vanishing total integral), $f \in LV_0$. These functions equipped with the aforementioned complex structure and scalar product yield a Hilbert space. The I-localized subalgebra is generated by the Weyl image of I-supported functions (class functions whose representing functions are constant in the complement $I'$)

\[ \begin{aligned} A(I) &:= \mathrm{alg}\{ W(f) \mid f \in K(I) \} \\ K(I) &= \{ f \in LV_0 \mid f = \text{const. in } I' \} \end{aligned} \qquad [15] \]

The one-interval Haag duality $A(I)' = A(I')$ (the commutant algebra equals the algebra localized in the complement) is simply a consequence of the fact that the symplectic complement $K(I)'$ in terms of $\mathrm{Im}(f, g)$ consists of real functions in that space which are localized in the complement, that is, $K(I)' = K(I')$. The answer to the same question for a double interval $I = I_1 \cup I_3$ (think of the first and third quadrant on the circle) does not lead to duality but rather to a genuine inclusion

\[ \begin{aligned} K((I_1 \cup I_3)') = K(I_2 \cup I_4) &\subset K(I_1 \cup I_3)' \\ K(I_1 \cup I_3) &\subset K((I_1 \cup I_3)')' \end{aligned} \qquad [16] \]
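The two lines of [16] are not independent. As a short sketch, under the assumption (not spelled out in the text) that each $K(\cdot)$ is a closed real subspace, so that the double symplectic complement returns it, $K'' = K$, and using only that taking complements reverses inclusions, the second line follows from the first:

```latex
% Assumptions (labelled as such): K'' = K for the closed real subspaces K(.),
% and M \subset N implies N' \subset M' for symplectic complements.
\begin{align*}
K\big((I_1\cup I_3)'\big) \subset K(I_1\cup I_3)'
  \;&\Longrightarrow\;
  K(I_1\cup I_3)'' \subset K\big((I_1\cup I_3)'\big)' \\
  \;&\Longrightarrow\;
  K(I_1\cup I_3) \subset K\big((I_1\cup I_3)'\big)' .
\end{align*}
```

For a single interval the first relation is an equality, $K(I') = K(I)'$, which is the Haag duality quoted above; for the double interval the smaller space stands on the left-hand side in both lines of [16].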
The meaning of the left-hand side is clear; these are functions which are constant in $I_1 \cup I_3$ with the same constant in the two intervals whereas the functions on the right-hand side are less restrictive in that the constants can be different. The conversion of real subspaces into von Neumann algebras by the Weyl functor leads to the algebraic inclusion $A(I_1 \cup I_3) \subset A((I_1 \cup I_3)')'$. In physical terms, the enlargement results from the fact that within the charge neutral vacuum algebra a charge split with one charge in $I_1$ and the compensating charge in $I_2$ for all values of the (unquantized) charge occurs. A more realistic picture is obtained if one allows a charge split to be subjected to a charge quantization implemented by a lattice condition $f(I_2) - f(I_4) \in 2\pi L$ which relates the two multicomponent constant functions (where $f(I)$ denotes the constant value f takes in I). As in the previous one-component case, the choice of even lattices corresponds to the local (bosonic) extensions. Although imposing such a lattice structure destroys the linearity of the K, the functions still define Weyl operators which generated operator algebras $A_L(I_1 \cup I_2)$. (The linearity structure is recovered on the level of the operator algebra.) But now the inclusion involves the dual lattice $L^*$ (which of course contains the original lattice),

\[ \begin{aligned} A_L(I_1 \cup I_2) &\subset A_{L^*}(I_1 \cup I_2) \\ \mathrm{ind}\{ A_L(I_1 \cup I_2) \subset A_L((I_1 \cup I_2)')' \} &= |G| \\ A_L(I_1 \cup I_2) &= \mathrm{inv}_G\, A_{L^*}(I_1 \cup I_2) \end{aligned} \]

This time the possible charge splits correspond to the factor group $G = L^*/L$, that is, the number of possibilities is $|G|$ which measures the relative size of the bigger algebra in terms of the smaller. This is a special case of the general concept of the so-called Jones index of an inclusion which is a numerical measure of its depth. A prerequisite is that the inclusion permits a conditional expectation which is a generalization of the averaging under the "gauge group" G on $A_{L^*}(I_1 \cup I_2)$ in the third equation above, which identifies the invariant smaller algebra with the fix-point algebra (the invariant part) under the action of G. In fact, using the conceptual framework of Jones, one can show that the two-interval inclusion is independent of the position of the disjoint intervals characterized by the group G.

There exists another form of this inclusion which is more suitable for generalizations. One starts from the charge quantized extended local algebra $A^{\mathrm{ext}}_L \supset A$ described earlier in terms of an even-integer lattice L (which lives in the separable Hilbert space $H_L$) as our observable algebra. Again the Haag duality is violated and converted into an inclusion $A^{\mathrm{ext}}_L(I_1 \cup I_2) \subset A^{\mathrm{ext}}_L((I_1 \cup I_2)')'$ which turns out to have the same $G = L^*/L$ charge structure (it is in fact isomorphic to the previous inclusion). In the general setting (current algebras, minimal model algebras, ...), this double interval inclusion is particularly interesting if the associated Jones index is finite. One finds Kawahigashi et al. (2001) (Schroer 2005).

Theorem 1 A chiral theory with finite Jones index $\mu = \mathrm{ind}\{ A((I_1 \cup I_2)')' : A(I_1 \cup I_2) \}$ for the double interval inclusion (always assuming that $A(S^1)$ is strongly additive and split) is a rational theory and the statistical dimensions $d_\rho$ of its charge sectors are related to $\mu$ through the formula

\[ \mu = \sum_{\rho} d_\rho^{2} \qquad [17] \]

Instead of presenting more constructed chiral models, it may be more informative to mention some of the algebraic methods by which they are constructed and explored. The already mentioned DHR theory provides the conceptual basis for converting the notion of positive-energy representation sectors of the chiral model observable algebras A (equivalence classes of unitary representations) into localized endomorphisms $\rho$ of this algebra. This is an important step because contrary to group representations which have a natural tensor product composition structure, representations of operator algebras generally do not come with a natural composition structure. The DHR endomorphisms theory of A leads to fusion laws and an intrinsic notion of generalized statistics (for chiral theories: plektonic in addition to bosonic/fermionic). The chiral statistics parameters are complex numbers (Haag 1992) whose phase is related to a generalized concept of spin via a spin-statistics theorem and whose absolute value (the statistics dimension) generalized the notion of multiplicities of fields known from the description of inner symmetries in higher-dimensional standard QFTs. The different sectors may be united into one bigger algebra called the exchange algebra $F_{\mathrm{red}}$ in the chiral context (the "reduced field bundle" of DHR) in which every sector occurs by definition with multiplicity 1 and the statistics data are encoded into exchange (commutation) relations of charge-carrying operators or generating fields ("exchange algebra fields") (Schroer 2005). Even though this algebra is useful in that all properties concerning fusion and statistics are nicely encoded, it lacks some cherished properties of standard field theory
namely there is no unique state–field relation, that is, no Reeh–Schlieder property (a field $A_\alpha$ whose source projection P does not coalesce with the vacuum projection annihilates the vacuum); in operator-algebraic terms, the local algebras are not factors. This poses the question of how to manufacture from the set of all sectors natural (not necessarily local) extensions with these desired properties. It was found that this problem can be characterized in operator-algebraic terms by the existence of the so-called DHR triples (Schroer). In case of rational theories, the number of such extensions is finite and in the aforementioned "classical" current algebra and minimal models they all have been constructed by this method (thus confirming existing results and completing the minimal family by adding some missing models). The same method adapted to the chiral tensor product structure of d = 1 + 1 conformal observables classifies and constructs all two-dimensional local (bosonic/fermionic) conformal QFT $B_2$ which can be associated with the observable chiral input. It turns out that this approach leads to another of those pivotal numerical matrices which encode structural properties of QFT: the coupling matrix Z,

\[ \begin{aligned} A \otimes \bar{A} &\subset B_2 \\ \sum_{\rho,\sigma} Z_{\rho,\sigma}\, \rho(A) \otimes \sigma(\bar{A}) &\subset A \otimes \bar{A} \end{aligned} \qquad [18] \]

where the second line is an inclusion solely expressed in terms of observable algebras from which the desired (isomorphic) inclusion in the first line follows by a canonical construction, the so-called Jones basic construction. The numerical matrix Z is an invariant closely related to the so-called "statistics character matrix" (Schroer 2005) and in case of rational models it is even a modular invariant with respect to the modular SL(2, Z) group transformations (which are closely related to the matrix S in the final section).

Integrability, the Bootstrap Form-Factor Program

Integrability in QFT and the closely associated bootstrap form-factor construction of a very rich class of massive two-dimensional QFTs can be traced back to two observations made during the 1960s and 1970s. On the one hand, there was the time-honored idea to bypass the "off-shell" field-theoretic approach to particle physics in favor of a pure on-shell S-matrix setting which (in particular recommended for strong interactions), as a result of the elimination of short distances via the mass-shell restriction, would be free of ultraviolet divergencies. This idea was enriched in the 1960s by the crossing property which in turn led to the bootstrap idea, a highly nonlinear seemingly self-consistent proposal for the determination of the S-matrix. However, the protagonists of this S-matrix bootstrap program placed themselves into a totally antagonistic fruitless position with respect to QFT so that the strong return of QFT in the form of gauge theory undermined their credibility. On the other hand, there were rather convincing quasiclassical calculations in certain two-dimensional massive QFTs as, for example, the sine-Gordon model which indicated that the obtained quasiclassical mass spectrum is exact and hence suggested that the associated QFTs are integrable (Dashen et al. 1975) and have no real particle creation. These provocative observations asked for a structural explanation beyond quasiclassical approximations, and it soon became clear that the natural setting for obtaining such mass formulas was that of the "fusion" of boundstate poles of unitary crossing-symmetric purely elastic S-matrices; first in the special context of the sine-Gordon model (Schroer et al. 1976) and later as a classification program from which factorizing S-matrices can be determined by solving well-defined equations for the elastic two-particle S-matrix (Karowski et al. 1977). (It was incorrectly believed that the "nontrivial elastic scattering implies particle creation" statement of Aks (Aks, 1963) is also valid for low-dimensional QFTs.) Some equations in this bootstrap approach resembled mathematical structures which appeared in C N Yang's work on nonrelativistic $\delta$-function particle interactions as well as relations for Boltzmann weights in Baxter's work on solvable lattice models; hence, they were referred to as Yang–Baxter relations. These results suggested that the old bootstrap idea, once liberated from its ideological dead freight (in particular from the claim that the bootstrap leads to a unique "theory of everything" (minus gravity)), generates a useful setting for the classification and construction of factorizing two-dimensional relativistic S-matrices. Adapting certain known relations between two-particle form factors of field operators and the S-matrix to the case at hand (Karowski and Weisz 1978), and extending this with hindsight to generalized (multiparticle) form factors, one arrived at the axiomatized recipes of the bootstrap form-factor program of d = 1 + 1 factorizable models (Smirnov 1992). Although this approach can be formulated within the
setting of the LSZ scattering formalism, the use of a certain algebraic structure (Zamolodchikov and Zamolodchikov 1979) which in the simplest version reads

\[ \begin{aligned} Z(\theta) Z^*(\theta') &= S^{(2)}(\theta - \theta')\, Z^*(\theta') Z(\theta) + \delta(\theta - \theta') \\ Z(\theta) Z(\theta') &= S^{(2)}(\theta' - \theta)\, Z(\theta') Z(\theta) \end{aligned} \qquad [19] \]

(the $\delta$-term is due to Faddeev) brought significant simplifications. In the general case, the Z's are vector valued and the $S^{(2)}$-structure function is matrix valued. (The identification of the Z–F structure coefficients with the elastic two-particle S-matrix $S^{(2)}$ (which is preempted by our notation) can be shown to follow from the physical interpretation of the Z–F structure in terms of localization.) In that case the associativity of the Z–F algebra is equivalent to the Yang–Baxter equations. Recently, it became clear that this algebraic relation has a deep physical interpretation; it is the simplest algebraic structure which can be associated with generators of nontrivial wedge-localized operator algebras (see the next section).

Conceptually as well as computationally it is much simpler to identify the intrinsic meaning of integrability in QFT with the factorization of its S-matrix or a certain property of wedge-localized algebras (see next section) than to establish integrability (see Integrability and Quantum Field Theory).

The first step of the bootstrap form-factor program, namely the classification and construction of model S-matrices, follows a combination of two patterns: prescribing particle multiplets transforming according to group symmetries and/or specifying structural properties of the particle spectrum. The simplest illustration for the latter strategy is supplied by the $Z_N$ model. In terms of particle content, $Z_N$ demands the identification of the Nth bound state with the antiparticle. The fusion condition for the bound mass, $m_b^2 = (p_1 + p_2)^2 = m_1^2 + m_2^2 + 2 m_1 m_2 \operatorname{ch}(\theta_1 - \theta_2)$, is only possible for a pure imaginary rapidity difference $\theta_{12} = \theta_1 - \theta_2 = i\alpha$ ("binding angle"). Hence, the binding of two "elementary" particles of mass m gives

\[ m_2 = m\, \frac{\sin 2\alpha}{\sin \alpha} \]

and more generally of k particles with

\[ m_k = m\, \frac{\sin k\alpha}{\sin \alpha} \]

so that the antiparticle mass condition $m_N = \bar{m} = m$ fixes the binding angle to $\alpha = 2\pi/N$. (The quotation mark is meant to indicate that in contrast to the Schrödinger QM there is "nuclear democracy" on the level of particles. The inexorable presence of interaction-caused vacuum polarization limits a fundamental/fused hierarchy to the fusion of charges.) The minimal (no additional physical poles) two-particle S-matrix in terms of which the n-particle S-matrix factorizes is therefore

\[ S^{(2)}_{\min} = \frac{\sin \tfrac{1}{2}\left(\theta + \tfrac{2\pi i}{N}\right)}{\sin \tfrac{1}{2}\left(\theta - \tfrac{2\pi i}{N}\right)} \qquad [20] \]

(minimal = without so-called CDD poles). The SU(N) model as compared with the U(N) model requires a similar identification of bound states of N − 1 particles with an antiparticle. This S-matrix enters in the equation for the vacuum to n-particle meromorphic form factor of local operators; together with the crossing and the so-called "kinematical pole equation," one obtains a recursive infinite system linking a certain residue with a form factor involving a lower number of particles. The solutions of this infinite system form a linear space from which the form factors of specific tensor fields can be selected by a process which is analogous to but more involved than the specification of a Wick basis of composite free fields. Although the statistics property of two-dimensional massive fields is not intrinsic but a matter of choice, it would be natural to realize, for example, the $Z_N$ fields as $Z_N$-anyons.

Another rich class of factorizing models are the Toda theories of which the sine-Gordon and sinh–Gordon are the simplest cases. For their descriptions, the quasiclassical use of Lagrangians (supported by integrability) turns out to be of some help in setting up their more involved bootstrap form-factor construction.

The unexpected appearance of objects with new fundamental (solitonic) charges (e.g., the Thirring field as the carrier of a solitonic sine-Gordon charge) and the unexpected confinement of charges (e.g., the CP(1) model as a confined SU(2) model) turn out to be opposite sides of the same coin and both cases have realizations in the setting of factorizing models (Schroer 2005).

Recent Developments

There are two ongoing developments which place the two-dimensional bootstrap form-factor program into a more general setting which permits one to understand its position in the general context of local quantum physics.

One of these starts from the observation that the smallest spacetime localization region in which it is possible to find vacuum-polarization-free generators (PFG) in the presence of interactions is the wedge
region. If one demands in addition that these generators (necessarily unbounded operators) have the standard domain properties of QFT (which include stability of the domain under translations), then one finds that this leads precisely to the two-dimensional Z–F algebraic structure, which in this way acquires a spacetime interpretation for the first time. In these investigations (Schroer 2005), modular localization theory plays a prominent role and there are strong indications that with these methods one can show the nontriviality of intersections of wedge algebras which is the algebraic criterion for the existence of a model within local quantum physics.

There is a second constructive idea based on light-front holography which uses the radical reorganization of spacetime properties of the algebraic structure while maintaining the physical content including the Hilbert space. Since spacetime localization aspects (apart from the remark about wedge algebras and their PFG generators made before) are traditionally related to the concept of fields, holographic methods tend to de-emphasize the particle structure in favor of "field properties." Indeed, the transversely extended chiral theories which arise as the holographic image lead to simplification of many interesting properties with very similar aims to the old "light-cone quantization" except that light-front holography is another way of looking at the original local ambient theory without subjecting it to another quantization. (The price for this simplification is that as a result of the nonuniqueness of the holographic inversion certain problems cannot be formulated.)

Actually, as a result of the absence of a transverse direction in the two-dimensional setting, the family of factorizing models provides an excellent theoretical laboratory to study their rigorous "chiral encoding" which is conceptually very different from Zamolodchikov's perturbative relation (which is based on identifying a factorizing model in terms of a perturbation on a chiral theory).

It turns out that the issue of statistics of particles loses its physical relevance for two-dimensional massive models since they can be changed without affecting the physical content. Instead such notions as order/disorder fields and soliton take their place (Schroer 2005).

In accordance with its historical origin, the theory of two-dimensional factorizing models may also be viewed as an outgrowth of the quantization of classical integrable systems (Integrability and Quantum Field Theory). But in comparison with the rather involved structure of integrability (verifying the existence of sufficiently many commuting conservation laws), the conceptual setting of factorizing models within the scattering framework (factorization follows from existence of wedge-localized tempered PFGs) is rather simple and intrinsic (Schroer 2005).

Among the additional ongoing investigations in which the conceptual relation with higher-dimensional QFT is achieved via modular localization theory, we will select three which have caught our active attention. One is motivated by the recent discovery of the adaptation of Einstein's classical principle of local covariance to QFT in curved spacetime. The central question raised by this work (see Algebraic Approach to Quantum Field Theory) is if all models of Minkowski spacetime QFTs permit a local covariant extension to curved spacetime and if not which models do? In the realm of chiral QFT, this would amount to asking if all Moebius-invariant models are also $\mathrm{Diff}(S^1)$-covariant. It has been known for some time that a QFT with all its rich physical content can be uniquely defined in terms of a carefully chosen relative position of a finite number of copies of one unique von Neumann operator algebra within one common Hilbert space. This is a perfect quantum field-theoretical illustration for Leibniz's philosophical proposal that reality results from the relative position of "monades" (as opposed to the more common (Newtonian) view that the material reality originates from a material content being placed into a spacetime vessel) if one takes the step of identifying the hyperfinite type $III_1$ Murray–von Neumann factor algebra with an abstract monade from which the different copies result from different ways of positioning in a shared Hilbert space (Schroer 2005). In particular, Moebius-covariant chiral QFTs arise from two monades with a joint intersection defining a third monade in such a way that the relative positions are specified in terms of natural modular concepts (without reference to geometry). This raises the question whether one can extend these modular-based algebraic ideas to pass from the global vacuum preserving Moebius invariance to local $\mathrm{Diff}(S^1)$ covariance, $\mathrm{Moeb} \to \mathrm{Diff}(S^1)$. This would be precisely the two-dimensional adaptation of the crucial problem raised by the recent successful generalization of the local covariance principle underlying Einstein's classical theory of gravity to QFT in curved spacetime: does every Poincaré covariant Minkowski spacetime QFT allow a unique correspondence with one curved spacetime (having the same abstract algebraic substrate but with a totally different spacetime encoding)? In the chiral context, one is led to the notion of "partially geometric modular groups" which only act geometrically if restricted to specific subalgebras (Schroer
2005). It is hard to imagine how one can combine quantum theory and gravity without understanding first the still mysterious links between spacetime geometry, thermal properties, and relative positioning of monades in a joint Hilbert space.

A second important umbilical cord with higher-dimensional theories is the issue of "Euclideanization," in particular the chiral counterpart of Osterwalder–Schrader localization and the closely related Nelson–Symanzik duality. In concrete chiral models (e.g., the models in the section "Chiral fields and two-dimensional conformal models"), it has been noted as a result of explicit calculations that the analytic continuation in the angular parametrization for thermal correlation functions leads to a duality relation in

\[ \langle A(\varphi_1, \dots, \varphi_n) \rangle_{\alpha,\, 2\pi\beta_t} = \left( \frac{i}{\beta_t} \right)^{a} \sum_{\gamma} S_{\alpha\gamma} \left\langle A\!\left( \frac{i}{\beta_t}\varphi_1, \dots, \frac{i}{\beta_t}\varphi_n \right) \right\rangle_{\gamma,\, 2\pi/\beta_t} \qquad [21] \]

where the thermal correlation function is defined as

\[ \begin{aligned} \langle A(\varphi_1, \dots, \varphi_n) \rangle_{\alpha,\, 2\pi\beta_t} &:= \mathrm{tr}_{H_\alpha}\, e^{-2\pi\beta_t (L_0 - c/24)}\, \pi_\alpha\big( A(\varphi_1, \dots, \varphi_n) \big) \\ A(\varphi_1, \dots, \varphi_n) &= \prod_{i=1}^{n} A_i(\varphi_i) \end{aligned} \qquad [22] \]

Compared with the thermally extended Nelson–Symanzik relation for two-dimensional QFT one notices that in addition to the expected behavior of real coordinates becoming imaginary and the $2\pi$-periodicity changing role with the (suitably normalized) KMS inverse temperature, there is a rotation in the space of superselected charges in terms of a unitary matrix S whose origin lies in the braid group statistics (the statistics character matrix). The deeper structural explanation which shows that this relation is not just a property of special models, but rather a generic property of chiral QFT, comes from a very deep angular Euclideanization which is based on modular theory (Schroer). Specializing A = identity, one obtains a relation for the partition function, the famous Verlinde identity which is part of the transformation law of the thermal angular correlation functions under the SL(2, Z) modular group.

There are many additional important observations on factorizing models whose relation to the physical principles of QFT, unlike the bootstrap form-factor program, is not yet settled. The meaning of the c-parameter outside the chiral setting and ideas on its renormalization group flow as well as the various formulations of the thermodynamic Bethe ansatz belong to a series of interesting observations whose final relation to the principles of QFT still needs clarification.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Bosons and Fermions in External Fields; Euclidean Field Theory; Integrability and Quantum Field Theory; Operator Product Expansion in Quantum Field Theory; Sine-Gordon Equation; Symmetries in Quantum Field Theory: Algebraic Aspects; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Tomita–Takesaki Modular Theory.

Further Reading

Abdalla E, Abdalla MCB, and Rothe K (1991) Non-Perturbative Methods in 2-Dimensional Quantum Field Theory. Singapore: World Scientific.
Belavin AA, Polyakov AM, and Zamolodchikov AB (1984) Infinite conformal symmetry in two-dimensional quantum field theory. Nuclear Physics B 241: 333.
Bisognano JJ and Wichmann EH (1975) Journal of Mathematical Physics 16: 985.
Dashen F, Hasslacher B, and Neveu A (1975) Physics Reviews D 11: 3424.
Di Francesco P, Mathieu P, and Sénéchal D (1996) Conformal Field Theory. Berlin: Springer.
Doplicher S, Haag R, and Roberts JE (1971/1974) Communications in Mathematical Physics 23: 199; 35: 49.
Fredenhagen K, Rehren KH, and Schroer B (1992) Superselection sector with braid group statistics and exchange algebras II: Geometric aspects and conformal invariance. Reviews of Mathematical Physics 1 (special issue): 113.
Furlan P, Sotkov G, and Todorov I (1989) Two-dimensional conformal quantum field theory. Rivista del Nuovo Cimento 12: 1–203.
Gabbiani F and Froehlich J (1993) Operator algebras and conformal field theory. Communications in Mathematical Physics 155: 569.
Glimm J and Jaffe A (1987) Quantum Physics. A Functional Integral Point of View. Berlin: Springer.
Ginsparg P (1990) Applied conformal field theory. In: Brezin E and Zinn-Justin J (eds.) Fields, Strings and Critical Phenomena, Les Houches 1988. Amsterdam: North-Holland.
Goddard P, Kent A, and Olive D (1985) Virasoro algebras and coset space models. Physics Letters B 152: 88.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Ising E (1925) Zeitschrift für Physik 31: 253.
Jordan P (1937) Beiträge zur Neutrinotheorie des Lichts. Zeitschrift für Physik 114: 229 and earlier papers quoted therein.
Karowski M, Thun H-J, Truong TT, and Weisz P (1977) Physics Letters B 67: 321.
Karowski M and Weisz P (1978) Physics Reviews B 139: 445.
Kawahigashi Y, Longo R, and Mueger M (2001) Multi-interval subfactors and modularity in representations of conformal field theory. Communications in Mathematical Physics 219: 631.
Lenz W (1920) Physikalische Zeitschrift 21: 613.
Lüescher M and Mack G (1975) Global conformal invariance in quantum field theory. Communications in Mathematical Physics 41: 203.
Rehren K-H and Schroer B (1987) Exchange algebra and Ising n-point functions. Physics Letters B 198: 84.
Rehren KH and Schroer B (1989) Einstein causality and Artin braids. Nuclear Physics B 312: 715.
Schroer B (2005) Two-dimensional models, a testing ground for principles and concepts of QFT, Annals of Physics (in print) (hep-th/0504206).
Schroer B and Swieca JA (1974) Conformal transformations for quantized fields. Physical Review D 10: 480.
Schroer B, Swieca JA, and Voelkel AH (1975) Global operator expansions in conformally invariant relativistic quantum field theory. Physical Review D 11: 11.
Schroer B, Truong TT, and Weisz P (1976) Towards an explicit construction of the Sine-Gordon field theory. Annals of Physics (New York) 102: 156.
Schwinger J (1962) Physical Review 128: 2425.
Schwinger J (1963) Gauge theory of vector particles. In: Theoretical Physics Trieste Lectures 1962. Wien: IAEA.
Smirnov FA (1992) Advanced Series in Mathematical Physics 14. Singapore: World Scientific.
Streater RF and Wightman AS (1964) PCT, Spin and Statistics and All That. New York: Benjamin.
Zamolodchikov AB and Zamolodchikov AB (1979) Annals of Physics (New York) 120: 253.
U
Universality and Renormalization
M Lyubich, University of Toronto, Toronto, ON, Canada and Stony Brook University, Stony Brook, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Discovery of the universality phenomenon and the underlying renormalization mechanism by Feigenbaum and independently by Coullet and Tresser in the late 1970s was one of the most influential events in dynamical systems theory in the last quarter of the twentieth century. It was numerically observed that the cascades of doubling bifurcations leading to chaotic regimes in one-parameter families of interval maps, as well as the dynamical attractors that appear in the limits, exhibit universal small-scale geometry. To explain this surprising observation, a “Renormalization Conjecture” was formulated which asserted that a natural renormalization operator acting in the space of dynamical systems has a unique hyperbolic fixed point.

It took about two decades to prove this conjecture rigorously (and without the help of computers). The proof revealed rich mathematical structures behind the universality phenomenon that linked it tightly to holomorphic dynamics and conformal and hyperbolic geometry.

Besides the universality per se, the renormalization theory led to many other important results. These include the proof of the regular or stochastic dichotomy that gives us a complete understanding of the real quadratic family (and more general families of one-dimensional maps) from the measure-theoretic point of view, as well as deep advances in several key problems of holomorphic dynamics.

Since the original discovery, many other manifestations of the universality have been observed, experimentally, numerically, and theoretically, in various classes of dynamical systems. However, in this article we will focus on mathematical aspects of the original phenomenon.

General Terminology and Notations

We will use general notations and terminology from Holomorphic Dynamics.

Unimodal Maps

Definitions and Conventions

Let us consider a smooth interval map $f : I \to I$. It is called unimodal if it has a single critical point $c$ and this point is an extremum. We assume that the critical point is nondegenerate, unless explicitly stated otherwise. A unimodal map is called S-unimodal if it has a negative Schwarzian derivative:

\[
Sf = \frac{f'''}{f'} - \frac{3}{2} \left( \frac{f''}{f'} \right)^{2} < 0
\]

For simplicity, we also assume that the map $f$ is even, and normalize it so that $c = 0$ and one of the endpoints of $I$ is a fixed point.

Topological Dynamics

Let $J \ni 0$ be a 0-symmetric periodic interval, that is, $f^p(J) \subset J$ for some $p \in \mathbb{N}$, such that the intervals $J_k = f^k(J)$, $k = 0, 1, \ldots, p - 1$, have disjoint interiors. Then we refer to $\bigcup J_k$ as a cycle of intervals of period $p$.

According to their topological dynamics, S-unimodal maps can be divided into three possible types (Sharkovskii, Singer, Guckenheimer, Misiurewicz, van Strien, Blokh, etc.):

• Regular maps. Such a map has an attracting or parabolic cycle $a$. In this case, almost all trajectories of $f$ converge to $a$. In case $a$ is attracting, the map $f$ is also called hyperbolic (see Holomorphic Dynamics).
• Topologically chaotic maps. For such a map, there is a cycle of intervals $\bigcup J_k$ such that the restriction $f | \bigcup J_k$ is topologically transitive (i.e., it has a dense orbit). Moreover, for almost all $z \in I$, orb $z$ eventually lands in this cycle.
• Infinitely renormalizable maps. For such a map, there is a nested sequence of periodic intervals $J^1 \supset J^2 \supset \cdots \ni 0$ of periods $p_n \to \infty$. Then the
intersection of the corresponding cycles of intervals,

\[
A = A_f = \bigcap_{n=0}^{\infty} \bigcup_{k=0}^{p_n - 1} f^k(J^n) \tag{1}
\]

is a Cantor set endowed with a natural group structure (inverse limit of cyclic groups $\mathbb{Z}/p_n\mathbb{Z}$) such that $f | A$ becomes a group translation. Moreover, $f^n z \to A$ for a.e. $z \in I$. This Cantor set is also called the Feigenbaum attractor of $f$.

Kneading Theory

Kneading theory (Milnor and Thurston, mid-1970s) gives a complete topological classification of S-unimodal maps (and more general one-dimensional maps). Let $I_+$ and $I_-$ stand for the components of $I \setminus \{0\}$, where $I_+ \ni f(0)$. To any point $x \in I$, let us associate its itinerary $(\varepsilon_n)_{n=0}^{N}$, where $\varepsilon_n \in \{+, -, 0\}$, $N \in \mathbb{Z}_+ \cup \{\infty\}$, in the following way. If $x$ is precritical then $N \in \mathbb{Z}_+$ is the smallest number such that $f^N x = 0$, and we let $\varepsilon_N = 0$. Otherwise, $N = \infty$. For $n < N$, $\varepsilon_n = +$ if $f^n x \in I_+$, and $\varepsilon_n = -$ if $f^n x \in I_-$.

The kneading sequence of $f$ is the itinerary of the critical value $f(0)$. It essentially classifies S-unimodal maps: two nonregular S-unimodal maps are topologically conjugate if and only if they have the same kneading sequence. (In the regular case, one should state if the map is hyperbolic or parabolic and specify the sign of the multiplier of the corresponding cycle.)

The kneading theory completely describes admissible kneading sequences (realizable by some unimodal maps), and orders them linearly in such a way that a bigger sequence corresponds to a more “complicated” map. The minimal admissible kneading sequence, $+ + + \cdots$, is realized by the parabolic map $x \mapsto x^2 + 1/4$, while the maximal one, $+ - - - \cdots$, is realized by the Chebyshev map $x \mapsto x^2 - 2$.

A central result of the kneading theory is the Intermediate Value Theorem asserting that a smooth one-parameter family of S-unimodal maps $f_t$ containing two kneading sequences also contains all intermediate kneading sequences. In particular, a family that contains the above maximal and minimal kneading sequences contains all admissible kneading sequences. Such a family is called full. We see that the real quadratic family $P_c$, $c \in [-2, 1/4]$, is full: any S-unimodal map is topologically equivalent to some quadratic polynomial. This indicates the dynamical significance of the quadratic family.

We say that a one-parameter family of unimodal maps $f_t$ is almost full if it contains all admissible kneading sequences except possibly the minimal one.

Universality Phenomenon

Universal Geometry of Doubling Bifurcations and the Feigenbaum Attractor

Let us consider the real quadratic family $P_c : x \mapsto x^2 + c$, $c \in [-2, 1/4]$. As the parameter $c$ moves down from 1/4, we observe a sequence of doubling bifurcations $c_n$ where the attracting cycle of period $2^n$ gives birth to an attracting cycle of period $2^{n+1}$, $n = 0, 1, \ldots$ (see Holomorphic Dynamics and Figure 1). This sequence converges to the Feigenbaum parameter $c_\infty$ at exponential rate: $c_n - c_\infty \sim \delta^{-n}$, where $\delta \approx 4.67$. It turns out that if we consider a similar one-parameter family of unimodal maps, say $x \mapsto a \sin x$, we observe a similar sequence of doubling bifurcations converging to the limit exponentially at the same rate $\delta^{-n}$, independently of the family under consideration.

Figure 1 Real quadratic family $P_c : x \mapsto x^2 + c$. This picture presents how the limit set of the orbit $\{P_c^n(0)\}_{n=0}^{\infty}$ bifurcates as the parameter $c$ changes from 1/4 on the right to $-2$ on the left. Three topological types of regimes are intertwined in an intricate way. The gaps correspond to the regular regimes. The black regions correspond to the chaotic regimes (though, of course, there are many narrow invisible gaps therein). In the beginning (on the right) one can see the cascade of doubling bifurcations. This picture became symbolic for one-dimensional dynamics. (Parameter values marked in the figure: $c = -2$, $c = -1.77$, $c = -1.38$, $c_1 = -3/4$, $c = 1/4$.)

In the dynamical space, let us consider the Feigenbaum attractor $A_f$ [1] of an infinitely renormalizable S-unimodal map $f$ that appears in the limit of doubling bifurcations (so that the periods of periodic intervals $J^n$ are equal to $2^n$). Let us consider the scaling factors $\lambda_n = |J^n| / |J^{n-1}|$. Then $\lambda_n \to \lambda_\infty$, where the limiting scaling factor $\lambda_\infty \approx 0.4$ is
independent of the particular map $f$ under consideration. Thus, the small-scale geometry of $A_f$ is universal.

This was historically the first observed manifestation of the quantitative universality of dynamical and parameter structures.

Feigenbaum–Coullet–Tresser Renormalization Conjecture

To explain the above universality phenomenon, Feigenbaum and independently Coullet and Tresser formulated the following Renormalization Conjecture. Let us consider the space $\mathcal{U}$ of S-unimodal maps $f : [-1, 1] \to [-1, 1]$. A map $f \in \mathcal{U}$ is called (doubling) renormalizable if it has a cycle of intervals $J \to J_1 \to J$ of period 2. Then, for any $n \in \mathbb{Z}_+ \cup \{\infty\}$, we can naturally define $n$-times renormalizable maps, where $n = 0$ corresponds to the non-renormalizable case, while $n = \infty$ corresponds to the infinitely renormalizable case.

Let $\mathcal{U}_0 \subset \mathcal{U}$ be the space of doubling renormalizable maps. If $f \in \mathcal{U}_0$ then $f^2 : J \to J$ is an S-unimodal map as well, and we define the (doubling) renormalization operator $R : \mathcal{U}_0 \to \mathcal{U}$ as the rescaling of this map:

\[
Rf(x) = \lambda^{-1} f^2(\lambda x)
\]

where $\lambda = |J|/2$.

The Renormalization Conjecture asserted that:

• the renormalization operator $R$ has a unique fixed point $f_*$, and this point is hyperbolic;
• the stable manifold $W^s(f_*)$ consists of infinitely renormalizable unimodal maps;
• the unstable manifold $W^u(f_*)$ is one dimensional and represents an almost full family of unimodal maps (see the section “Kneading theory”); and
• the quadratic family $\{P_c\}$ transversally intersects $W^s(f_*)$ (see Figure 2).

Figure 2 Renormalization fixed point. (Labels in the figure: the unstable manifold $W^u$, the stable manifold $W^s$, the fixed point $f_*$, and the quadratic family crossing $W^s$ at $z^2 + c_*$.)

Assuming this conjecture, one can see that for any curve $t \mapsto g_t$ in $\mathcal{U}$ that transversally intersects the stable manifold $W^s(f_*)$ at some moment $t_*$, the doubling bifurcation parameters $t_n$ converge to $t_*$ at exponential rate $\delta^{-n}$, where $\delta$ is the unstable eigenvalue of the differential $DR(f_*)$. This explains the universal geometry of doubling bifurcations.

One can also show that the Feigenbaum attractor $A_f$ of any map $f \in W^s(f_*)$ is smoothly equivalent to $A_{f_*}$, which explains the universal small-scale geometry of these attractors.

Full Renormalization Horseshoe

Along with period doublings, one can consider period triplings, quadruplings, etc. A unimodal map $f \in \mathcal{U}$ is said to be renormalizable with period $p$ if it has a cycle of intervals $J \to J_1 \to \cdots \to J_{p-1} \to J$ of period $p$. The corresponding renormalization operator is defined as $Rf(x) = \lambda^{-1} f^p(\lambda x)$, where $\lambda = |J|/2$.

The combinatorics or type $\tau$ of the renormalization operator is the order of the intervals $J_k$, $k = 0, 1, \ldots, p - 1$, on the real line (up to reversal). (For instance, there are three admissible combinatorics $\tau$ of period 5.) If we want to specify combinatorics of the renormalization operator under consideration, we use notation $R_\tau$. This operator is defined on the “renormalization strip” $\mathcal{U}_\tau$ of unimodal maps $f \in \mathcal{U}$ that are renormalizable with combinatorics $\tau$.

The Renormalization Conjecture admits a straightforward generalization to any renormalization operator $R_\tau$. More interestingly, one can formulate a stronger version of it by putting all the admissible renormalization types together. Let $\mathcal{T}$ stand for the set of all minimal renormalization types, that is, the types that cannot be factored through other types. Then the renormalization strips $\mathcal{U}_\tau$, $\tau \in \mathcal{T}$, are pairwise disjoint, and we can define the full renormalization operator

\[
R : \bigcup_{\tau \in \mathcal{T}} \mathcal{U}_\tau \to \mathcal{U} \tag{2}
\]

by letting $R | \mathcal{U}_\tau = R_\tau$. Then the strong version of the renormalization conjecture asserted that:

• there is an $R$-invariant hyperbolic subset $\mathcal{A} \subset \mathcal{U}$ called the full renormalization horseshoe such that the restriction $R | \mathcal{A}$ is topologically conjugate to the full shift $\sigma$ on the space $\Sigma$ of bi-infinite sequences $(\ldots, \tau_{-1}, \tau_0, \tau_1, \ldots)$ of symbols $\tau_n \in \mathcal{T}$;
• for any $f_* \in \mathcal{A}$, the stable manifold $W^s(f_*)$ consists of infinitely renormalizable maps $f \in \mathcal{U}$ with the same combinatorics as $f_*$;
• for any $f_* \in \mathcal{A}$, the unstable manifold $W^u(f_*)$ is one-dimensional and represents an almost full family of unimodal maps; and
• the real quadratic family $\{P_c\}$ transversally intersects all stable manifolds $W^s(f_*)$.
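As an illustration of the itineraries defined in the section “Kneading Theory”, the following minimal Python sketch computes a finite prefix of the kneading sequence of $P_c : x \mapsto x^2 + c$. The function names, the truncation length `max_len`, and the floating-point tolerance `tol` used to detect a return to the critical point are assumptions made for this example only; an infinite itinerary can of course only be sampled, not computed in full.

```python
def itinerary(f, x, crit=0.0, max_len=20, tol=1e-12):
    """Prefix of the itinerary of x under f.

    '+' marks the side of the critical point that contains f(crit),
    '-' the other side, and '0' the critical point itself (the
    itinerary stops there). max_len and tol are illustrative choices.
    """
    # Orientation of I_+ : the component of I \ {crit} containing f(crit).
    plus_side = 1.0 if f(crit) > crit else -1.0
    symbols = []
    for _ in range(max_len):
        if abs(x - crit) < tol:
            symbols.append("0")
            break
        symbols.append("+" if (x - crit) * plus_side > 0 else "-")
        x = f(x)
    return "".join(symbols)


def kneading_sequence(c, max_len=20):
    """Prefix of the kneading sequence of P_c, i.e. the itinerary of P_c(0)."""
    def p_c(x):
        return x * x + c
    return itinerary(p_c, p_c(0.0), max_len=max_len)


if __name__ == "__main__":
    for c in (0.25, -1.0, -2.0):
        print(c, kneading_sequence(c, 8))
```

For the parabolic map ($c = 1/4$) the prefix consists of `+` only, and for the Chebyshev map ($c = -2$) it is `+` followed by `-` symbols, in line with the minimal and maximal sequences quoted in that section. For $c = -1$ the critical point is periodic, so the itinerary terminates with `0`.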
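The exponential convergence of the doubling cascade described in the section “Universal Geometry of Doubling Bifurcations and the Feigenbaum Attractor” can be explored numerically. The sketch below is one common numerical approach and not the method of the works cited in this article. Instead of the bifurcation parameters $c_n$ it uses the superstable parameters $s_n$, at which the critical point 0 of $P_c$ is periodic with period $2^n$, under the assumption (standard, but not stated in the article) that their successive differences shrink at the same universal rate. All names, the Newton iteration, the starting ratio 4.0, and the choice of 10 levels are illustrative.

```python
def critical_return(c, period):
    """Return P_c^period(0) and its derivative in c, for P_c(x) = x*x + c."""
    x, dx_dc = 0.0, 0.0
    for _ in range(period):
        # Simultaneous update: the derivative uses the previous x (chain rule).
        x, dx_dc = x * x + c, 2.0 * x * dx_dc + 1.0
    return x, dx_dc


def superstable_parameter(period, guess, max_steps=60, tol=1e-15):
    """Newton iteration for a root c of P_c^period(0) = 0 near `guess`."""
    c = guess
    for _ in range(max_steps):
        value, slope = critical_return(c, period)
        step = value / slope
        c -= step
        if abs(step) < tol:
            break
    return c


def doubling_ratios(levels=10):
    """Superstable parameters s_0..s_levels and successive difference ratios."""
    s = [0.0, -1.0]  # period 1 (0 -> 0) and period 2 (0 -> -1 -> 0)
    ratio = 4.0      # rough value, used only to place the first Newton guess
    ratios = []
    for n in range(2, levels + 1):
        # Extrapolate the next parameter from the last gap and the last ratio.
        guess = s[-1] + (s[-1] - s[-2]) / ratio
        s.append(superstable_parameter(2 ** n, guess))
        ratio = (s[-2] - s[-3]) / (s[-1] - s[-2])
        ratios.append(ratio)
    return s, ratios


if __name__ == "__main__":
    params, ratios = doubling_ratios()
    for n, (c, r) in enumerate(zip(params[2:], ratios), start=2):
        print(f"n={n:2d}  s_n={c:.10f}  ratio={r:.5f}")
```

The printed ratios should settle near the universal rate $\delta \approx 4.67$ as $n$ grows, while the parameters $s_n$ accumulate at the Feigenbaum parameter. In double precision the estimate degrades once the gaps between consecutive parameters approach rounding error, so the number of levels is best kept moderate.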
Complex Renormalization

Polynomial-Like Maps

A polynomial-like map is a holomorphic branched covering of finite degree $f : U \to U'$, where $U \Subset U' \subset \mathbb{C}$ are topological disks (in other words, the map $f$ is proper, that is, full preimages $f^{-1}(K)$ of compact sets $K \subset U'$ are compact). For instance, if $f$ is a polynomial of degree $d$ then for a sufficiently large radius $R > 0$, the map $f : f^{-1}(D_R) \to D_R$ is a polynomial-like map of the same degree $d$. We refer to such polynomial-like maps as “polynomials.”

The filled Julia set of $f$ is the set of nonescaping points:

\[
K(f) = \{ z : f^n z \in U,\ n = 0, 1, \ldots \}
\]

The Julia set of $f$ is the boundary of its filled Julia set: $J(f) = \partial K(f)$.

A polynomial-like map of degree $d$ has $d - 1$ critical points counted with multiplicities. The Julia set (and the filled Julia set) is connected if and only if all the critical points $c_i$ are nonescaping, that is, $c_i \in K(f)$.

A polynomial-like map of degree 2 is called quadratic-like. The Julia set of a quadratic-like map is either connected or a Cantor set, depending on whether its critical point is nonescaping or otherwise.

The domain of a polynomial-like map is allowed to be slightly adjusted by taking $V'$ to be a topological disk such that $U \subset V' \subset U'$ and letting $V = f^{-1}(V')$. We say that two polynomial-like maps represent the same germ if one can be obtained from the other by a sequence of such adjustments.

We will be mostly interested in the quadratic case; so let $\mathcal{Q}$ be the space of quadratic-like germs considered up to affine conjugacy, and let $\mathcal{C}$ be the connectedness locus in $\mathcal{Q}$, that is, the subset of $f \in \mathcal{Q}$ with connected Julia set. The space $\mathcal{Q}$ has a natural complex analytic structure such that holomorphic curves in $\mathcal{Q}$ are represented by holomorphic families $f_\lambda(z)$ of quadratic-like maps.

Two polynomial-like maps are called hybrid equivalent if they are conjugate by a quasiconformal map $h$ such that $\bar\partial h = 0$ a.e. on $K(f)$ (in particular, $h$ is conformal on int $K(f)$). By the Straightening Theorem, any polynomial-like map is hybrid equivalent (after an adjustment of its domain) to a polynomial of the same degree (called the “straightening” of $f$). The straightening depends only on the germ of $f$.

For a quadratic-like map $f$ with connected Julia set, the straightening $P_c : z \mapsto z^2 + c$ is unique, $c = \chi(f)$. Thus, we obtain the straightening map $\chi : \mathcal{C} \to M$, where $M$ is the Mandelbrot set (see Holomorphic Dynamics). We let $\mathcal{H}_c = \chi^{-1}(c)$ be the hybrid class passing through a point $c \in M$. One can show that $\mathcal{H}_c$ is a codimension-one submanifold in $\mathcal{Q}$.

Any quadratic-like map has two fixed points counted with multiplicity. In the case of connected Julia set, these fixed points have a different dynamical meaning: one of them, called $\alpha$, is either attracting, or neutral, or repelling separating, that is, $J(f) \setminus \{\alpha\}$ is disconnected. Another one, called $\beta$, is either parabolic with multiplier 1 (and then it coincides with $\alpha$) or repelling nonseparating.

In what follows, we normalize quadratic-like maps so that 0 is their critical point.

Complex Renormalization and Little Mandelbrot Sets

A quadratic-like map $f : U \to U'$ with connected Julia set is called renormalizable if there is a topological disk $V \ni 0$ and a natural number $p \geq 2$ called the renormalization period such that:

• letting $g = f^p | V$ and $V' = g(V)$, the map $g : V \to V'$ is quadratic-like;
• the little Julia set $K(g)$ is connected; and
• the sets $f^n(K(g))$, $n = 1, \ldots, p - 1$, can intersect $K(g)$ only at the $\beta$-fixed point of $g$.

Under these circumstances, the quadratic-like germ $g$ considered up to affine conjugacy is called the renormalization of the quadratic-like germ $f$; $g = Rf$. Moreover, one says that $f$ is primitively renormalizable if the little Julia sets $f^n(K(g))$, $n = 0, 1, \ldots, p - 1$, are pairwise disjoint. Otherwise, $f$ is satellite renormalizable.

As in the unimodal case, one can define combinatorics or type $\tau$ of the complex renormalization. Roughly speaking, renormalizable maps with the same combinatorics have the same renormalization period and the “same position” of the little Julia sets $f^k(K(g))$ in $\hat{\mathbb{C}}$ (the rigorous definition is based on the notion of Thurston’s equivalence from Holomorphic Dynamics).

Theorem 1 (Douady and Hubbard 1986). The set of parameters $c$ for which a quadratic map $P_c : z \mapsto z^2 + c$ is renormalizable with a given combinatorics $\tau$ forms a homeomorphic copy $M_\tau$ of the Mandelbrot set $M$.

This theorem explains the presence of many little Mandelbrot sets that are observable on the computer pictures of $M$ (see Figures 3 and 4). Moreover, the copies corresponding to the primitive renormalization originate at primitive hyperbolic components (see Holomorphic Dynamics), while the copies obtained by a satellite renormalization originate at satellite hyperbolic components attached to some
“mother” hyperbolic component. (Satellite copies attached to the main cardioid are particularly prominent on the pictures of $M$.)

Figure 3 A primitive copy of the Mandelbrot set.

Figure 4 The satellite copy of the Mandelbrot set attached to the main cardioid at the point of doubling bifurcation.

Given a combinatorial type $\tau$, the set $\mathcal{Q}_\tau$ of quadratic-like germs $f \in \mathcal{Q}$ that are renormalizable with combinatorics $\tau$ (the complex renormalization strip) is the union of hybrid classes passing through the little copy $M_\tau$. As in the real case, let us consider the set $\mathcal{T}_{\mathbb{C}}$ of all minimal combinatorial types. Then the corresponding renormalization strips $\mathcal{Q}_\tau$ are pairwise disjoint, and we can define the full complex renormalization operator $R : \bigcup_{\tau \in \mathcal{T}_{\mathbb{C}}} \mathcal{Q}_\tau \to \mathcal{Q}$.

Renormalization Theorem

The first proof of the Renormalization Conjecture in the period-doubling case was based on rigorous computer estimates (Lanford 1982). It was followed, in the 1980s, by works of Epstein, Eckmann, Khanin, Sinai, among others, which gave a better conceptual understanding and provided proofs of many ingredients of the picture (without computer assistance).

The turning point in this development occurred when methods of holomorphic dynamics and conformal geometry were introduced into the subject (Douady and Hubbard 1985, Sullivan 1986). This led to the proof of the renormalization conjecture in the space of quadratic-like germs:

Theorem 2 (Sullivan–McMullen–Lyubich, the 1990s). For any real combinatorics $\tau \in \mathcal{T}$, the operator $R_\tau$ has a unique fixed point $f_\tau$ in the space $\mathcal{Q}$. Moreover, $f_\tau$ is hyperbolic, its stable manifold $W^s(f_\tau)$ coincides with the hybrid class $\mathcal{H}_c$, $c = \chi(f_\tau)$, while the real slice of the unstable manifold represents an almost full family of unimodal maps.

This result was further extended to the smooth category by de Faria, de Melo, and Pinto.

MLC, Density of Hyperbolicity, and Geometry of Feigenbaum Julia Sets

The “Mandelbrot set is locally connected” (MLC) conjecture (see Holomorphic Dynamics) is intimately related to the renormalization phenomenon. This connection was first revealed by the following result:

Theorem 3 (Yoccoz 1990, unpublished). Let us consider a nonrenormalizable quadratic polynomial $P_c : z \mapsto z^2 + c$ with connected Julia set and both fixed points repelling. Then the Julia set $J(P_c)$ is locally connected and the Mandelbrot set is locally connected at $c$.

This result was recently extended to higher-degree unicritical polynomials $z \mapsto z^d + c$ (Kahn–Lyubich, preprint 2005).

The MLC Conjecture is still open for general infinitely renormalizable parameters. However, the similar problem for the real quadratic family has been resolved. It implies the real version of the Fatou conjecture in the quadratic case (see Holomorphic Dynamics):

Theorem 4 (Lyubich 1997). Hyperbolic maps are dense in the real quadratic family.

This result was recently extended to higher-degree polynomials by Kozlovskii, Shen, and van Strien (preprint 2003).

Infinitely renormalizable quadratic maps of bounded combinatorial type (i.e., with bounded relative periods $p_{n+1}/p_n$) supply us with a rich class of fractals with very interesting geometry. These
Julia sets are “hairy” at the origin, that is, their blow-ups fill in densely the whole plane (this phenomenon is related to the universal geometry of the Feigenbaum attractors; McMullen (1996)). However, some of them have zero Lebesgue measure (Yarrington, thesis 1995) and Hausdorff dimension smaller than 2 (Avila–Lyubich, preprint 2004). It is unknown whether this happens for all of them or not (in particular, the answer is unknown for the Feigenbaum map born in the cascade of doubling bifurcations).

Regular or Stochastic Dichotomy

Stochastic Maps

An S-unimodal map $f$ is called stochastic if it has an absolutely continuous invariant measure $\mu$. In this case, $f$ is topologically chaotic (see the section “Topological dynamics”) and $\mu$ is supported on the transitive cycle of intervals $\bigcup J_k$. Moreover, $\mu$ has a positive characteristic exponent,

\[
\chi_\mu = \int \log |Df| \, d\mu > 0
\]

and Lebesgue almost all orbits are equidistributed with respect to $\mu$, that is, for Lebesgue a.e. $x \in I$,

\[
\frac{1}{n} \sum_{k=0}^{n-1} \phi(f^k x) \to \int \phi \, d\mu
\]

for any continuous function $\phi$. The map $f^p | J$ is mixing with respect to $\mu$, and in fact, is weakly Bernoulli.

Here are two important criteria for stochasticity:

• Collet–Eckmann condition (see Holomorphic Dynamics). These maps have extra strong stochastic properties, notably, the exponential decay of correlations.
• Martens–Nowicki condition. To state it, we need to define the principal nest of intervals, $I^0 \supset I^1 \supset \cdots \ni 0$. Here $I^0 = [\alpha, -\alpha]$, where $\alpha$ is the fixed point with negative multiplier, and $I^{n+1}$ is inductively defined as the component of $f^{-l_n}(I^n)$ containing 0, where $l_n$ is the moment of first return of the orbit of 0 to $I^n$. Let us consider the scaling factors $\lambda_n = |I^n| / |I^{n-1}|$. If $\sum \sqrt{\lambda_n} < \infty$ then $f$ is stochastic.

Let $\mathcal{N} \subset [-2, 1/4]$ be the set of parameters $c$ for which the quadratic map $P_c$ is topologically chaotic. Not every such map is stochastic. However, the set of stochastic parameters has positive Lebesgue measure (Jakobson 1981), and in fact,

Theorem 5 (Lyubich 2000). For a.e. $c \in \mathcal{N}$, the map $P_c$ satisfies the Martens–Nowicki condition, and thus, is stochastic.

Avila and Moreira (2005) went on to prove that for a.e. $c \in \mathcal{N}$, the map $P_c$ is Collet–Eckmann.

Renormalization Horseshoe

Let us consider the complexification of the renormalization operator [2],

\[
R : \bigcup_{\tau \in \mathcal{T}} \mathcal{Q}_\tau \to \mathcal{Q} \tag{3}
\]

acting in the space of quadratic-like maps.

Theorem 6 (Lyubich 2002). The “Strong Renormalization Conjecture” is valid for the operator [3].

Let $\mathcal{I} \subset [-2, 1/4]$ be the set of parameters for which the quadratic map $P_c$ is infinitely renormalizable. The above theorem implies that this set has zero Lebesgue measure. (Avila and Moreira went on to prove that $\mathrm{HD}(\mathcal{I}) < 1$.)

Regular or Stochastic Dichotomy

Putting together Theorems 5 and 6, we obtain:

Theorem 7 For a.e. $c \in [-2, 1/4]$, the quadratic map $P_c$ is either regular or stochastic.

This result gives a complete probabilistic picture of dynamics in the real quadratic family. It has been later transferred to any nondegenerate real analytic family of S-unimodal maps (Avila–Lyubich–de Melo), and further to a generic smooth family of S-unimodal maps (Avila–Moreira).

Palis has formulated a strong general conjecture (in all dimensions) asserting that a typical (from the probabilistic point of view) smooth dynamical system $f$ has finitely many attractors supporting SRB measures (see Lyapunov Exponents and Strange Attractors) that govern the behavior of Lebesgue a.e. trajectories of $f$. The above results confirm the Palis Conjecture in the setting of S-unimodal maps.

Other Universality Classes

From a more general point of view, renormalization is an appropriately rescaled return map to a relevant piece of the phase space, viewed as an operator in some class of dynamical systems. From this point of view, most dynamical systems are “renormalizable,” and the renormalization approach often provides a deep insight into the nature of the systems in question.

Here is a partial list of classes of nonlinear systems that exhibit universality with an underlying renormalization mechanism (we provide a few
relevant names, but there are many more people who contributed to the corresponding theories):

• holomorphic germs near indifferent equilibria (Yoccoz, Shishikura, McMullen);
• critical circle maps (Kadanoff, Feigenbaum, Rand, Lanford, Swiatek, de Faria, Yampolsky);
• non-renormalizable quadratic-like maps of Fibonacci type (Lyubich–Milnor);
• conservative two-dimensional diffeomorphisms near the point of breaking of KAM tori (MacKay, Koch); and
• dissipative Hénon-like maps (Collet–Eckmann–Koch, de Carvalho–Lyubich–Martens).

See also: Fractal Dimensions in Dynamics; Holomorphic Dynamics; Lyapunov Exponents and Strange Attractors; Multiscale Approaches.

Further Reading

Collet P and Eckmann J-P (1980) Iterated Maps of the Interval as Dynamical Systems. Boston: Birkhäuser.
Cvitanović P (1984) Universality in Chaos. Bristol: Adam Hilger.
Douady A and Hubbard JH (1985) On the dynamics of polynomial-like maps. Annales Scientifiques de l’École Normale Supérieure 18: 287–343.
Lyubich M (1999) Feigenbaum–Coullet–Tresser universality and Milnor’s hairiness conjecture. Annals of Mathematics 149: 319–420.
Lyubich M (2000) Quadratic family as a qualitatively solvable model of dynamics. Notices of the American Mathematical Society 47(9): 1042–1052.
McMullen C (1996) Renormalization and 3-Manifolds Which Fiber Over the Circle, Annals of Math. Studies, vol. 135. Princeton: Princeton University Press.
de Melo W and van Strien S (1993) One-Dimensional Dynamics. Berlin: Springer.
Sullivan D (1993) Linking the universalities of Milnor–Thurston, Feigenbaum and Ahlfors–Bers. In: Topological Methods in Modern Mathematics, The Proceedings of Symposium held in honor of John Milnor’s 60th Birthday, SUNY at Stony Brook, 1991, pp. 543–564. Houston, TX: Publish or Perish.
Vul EB, Sinai YaG, and Khanin KM (1984) Feigenbaum universality and the thermodynamical formalism. Russian Mathematical Surveys 39: 1–40.
V
Variational Methods in Turbulence
F H Busse, Universität Bayreuth, Bayreuth, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The problem of fluid turbulence is commonly regarded as one of the most challenging problems of theoretical physics and mathematics. There is general agreement that the Navier–Stokes equations (NSEs) provide a satisfactory basis for the description of turbulent motions of homogeneous Newtonian fluids such as gases and most liquids. But the difficulty of generating solutions of these equations for high-Reynolds-number flows has prevented accurate answers to simple questions such as the question of the discharge of turbulent pipe flow as a function of the pressure head or the question of the heat transport by turbulent convection in a fluid layer heated from below. In view of this difficulty, it has become an attractive idea to obtain rigorous bounds on turbulent transports. Variational methods have played an important role in the derivation of such bounds.

There is another motivation for the use of variational methods for the understanding of turbulent fluid systems. Experimenters have sometimes noted the tendency of turbulent flows to maximize transports under given external conditions. In his pioneering paper, Howard (1963) mentions that the Malkus hypothesis of a maximum heat transport by thermal convection had motivated him to derive upper bounds through the use of variational methods. The techniques developed by Howard have later been applied to other kinds of turbulent transports by Busse. While relatively simple ordinary differential equations are obtained when the equation of continuity is not imposed as a constraint, the Euler–Lagrange equations for a stationary value of the variational functional lead to nonlinear partial differential equations when solenoidal extremalizing vector fields are required. Nevertheless, using boundary layer methods one can derive approximate analytical solutions even in the limit of asymptotically large Rayleigh and Reynolds numbers (Busse 1969, 1978).

In the following, we shall first discuss the energy method which provides necessary conditions for the existence of turbulent solutions of the underlying equations and then turn to the problem of upper bounds for the turbulent momentum transport in the plane Couette flow configuration as a particular example. The properties and physical relevance of the extremalizing vector fields will be discussed in a final section.

Energy Method

For simplicity, we consider the NSEs for a homogeneous incompressible fluid with a constant kinematic viscosity $\nu$ in an arbitrary fixed domain $D$. Using the diameter $d$ of the domain as length scale and $d^2/\nu$ as timescale, we can write the NSEs of motion in dimensionless form,

\[
\frac{\partial}{\partial t} \mathbf{v} + \mathbf{v} \cdot \nabla \mathbf{v} = -\nabla p + \mathbf{f} + \nabla^2 \mathbf{v} \tag{1a}
\]

\[
\nabla \cdot \mathbf{v} = 0 \tag{1b}
\]

where $\mathbf{f}$ denotes some given steady distribution of a force density. On the boundary $\partial D$ of the domain $D$, steady velocities parallel to the boundary may be specified. We assume that the basic steady solution of the problem is given by $\mathbf{v}_s = Re\, \hat{\mathbf{v}}$ where the average of $(\hat{\mathbf{v}})^2/2$ over the domain $D$ (indicated by angular brackets) is unity, $\langle |\hat{\mathbf{v}}|^2 \rangle = 2$. Any velocity field $\mathbf{v}_t$ different from $\mathbf{v}_s$, that is, with $\mathbf{u} \equiv \mathbf{v}_t - \mathbf{v}_s \not\equiv 0$, must obey the equations

\[
\frac{\partial}{\partial t} \mathbf{u} + \mathbf{v}_s \cdot \nabla \mathbf{u} + \mathbf{u} \cdot \nabla \mathbf{v}_s + \mathbf{u} \cdot \nabla \mathbf{u} = -\nabla \tilde{p} + \nabla^2 \mathbf{u} \tag{2a}
\]

\[
\nabla \cdot \mathbf{u} = 0 \tag{2b}
\]

together with the homogeneous boundary conditions for $\mathbf{u}$ on $\partial D$. By multiplying eqn [2a] by $\mathbf{u}$ and averaging the result over the domain $D$ we obtain the relationship

\[
\frac{1}{2} \frac{d}{dt} \langle \mathbf{u} \cdot \mathbf{u} \rangle = -\langle |\nabla \mathbf{u}|^2 \rangle - Re\, \langle \mathbf{u} \cdot (\mathbf{u} \cdot \nabla) \hat{\mathbf{v}} \rangle \tag{3}
\]

where the vanishing of $\mathbf{u}$ on $\partial D$ and equations such as

$$\langle\mathbf{u}\cdot(\mathbf{v}_s\cdot\nabla)\mathbf{u}\rangle = \tfrac{1}{2}\langle\mathbf{v}_s\cdot\nabla(\mathbf{u}\cdot\mathbf{u})\rangle = \tfrac{1}{2}\langle\nabla\cdot(\mathbf{v}_s\,\mathbf{u}\cdot\mathbf{u})\rangle = 0$$

have been used to prove that the terms $\mathbf{v}_s\cdot\nabla\mathbf{u}$, $\mathbf{u}\cdot\nabla\mathbf{u}$ and $\nabla\tilde{p}$ do not enter the balance [3]. This balance is called the Reynolds–Orr energy equation and is the basis for the application of the energy method. The lowest value $Re$ for which the right-hand side of [3] is non-negative is called the energy Reynolds number $Re_E$. For $Re < Re_E$ the steady solution $\mathbf{v}_s$ is absolutely stable and the energy of any disturbance $\mathbf{u}$ must decay exponentially in time. $Re > Re_E$ is a necessary condition for the existence of a persistent turbulent state of fluid flow. $Re_E$ is determined as the solution of the variational problem:

For a given flow $\hat{\mathbf{v}}$ in $D$ find the minimum $Re_E$ of the functional

$$R_E \equiv \frac{\langle|\nabla\check{\mathbf{u}}|^2\rangle}{-\langle\check{\mathbf{u}}\cdot(\check{\mathbf{u}}\cdot\nabla)\hat{\mathbf{v}}\rangle} \qquad [4]$$

among all vector fields $\check{\mathbf{u}}$ which satisfy the conditions $\nabla\cdot\check{\mathbf{u}} = 0$ in $D$, $\check{\mathbf{u}} = 0$ on $\partial D$, and $\langle\check{\mathbf{u}}\cdot(\check{\mathbf{u}}\cdot\nabla)\hat{\mathbf{v}}\rangle < 0$.

For $Re \ge Re_E$ there will exist at least one vector field $\mathbf{u}$, namely the minimizing solution $\check{\mathbf{u}}$ of the variational problem [4], the energy of which does not decay, at least not initially. In the derivation of the Euler–Lagrange equations as necessary conditions for stationary values of the variational functional [4],

$$\tfrac{1}{2}G\,(\check{u}_j\,\partial_i\hat{v}_j + \check{u}_j\,\partial_j\hat{v}_i) = -\partial_i\pi + \partial_j\partial_j\check{u}_i \qquad [5a]$$

$$\partial_j\check{u}_j = 0 \qquad [5b]$$

the constraint $\nabla\cdot\check{\mathbf{u}} = 0$ has been taken into account through the Lagrange multiplying function $\pi$. $G$ is a stationary value of the functional [4] and in general there exist many of those, which are determined as eigenvalues of the linear boundary value problem [5] together with its boundary condition $\check{u}_i = 0$ on $\partial D$. Only the infimum of all $G$ provides the energy Reynolds number $Re_E$. Many details on the energy method can be found in Joseph's book (1976). Here we just wish to remark that the Reynolds–Orr balance [3] remains valid when the problem is considered in a system rotating with a constant angular velocity $\Omega$, since the Coriolis force does not contribute to the energy balance [3]. The values of $Re_E$ are usually much smaller than the critical values $Re_c$ for the onset of infinitesimal disturbances, as can be seen from Table 1. Here the experimentally determined values $Re_G$ for the instability of the basic flow state have also been listed. A unique situation occurs in the small gap limit of the Taylor–Couette system, where $Re_E$ and $Re_c$ coincide for a special value of the dimensionless mean rotation rate $\Omega$ (Busse 2002).

Table 1 Reynolds numbers for shear flows

Flow                                      Re_E      Re_G (from exp.)   Re_c
Plane Couette flow                        82.6      ≈1300              ∞
Poiseuille flow (channel flow)            99.2 (a)  ≈2000 (a)          5772 (a)
Hagen–Poiseuille flow (pipe flow)         81.5 (a)  ≈2100 (a)          ∞
Circular Couette flow with Ω = Re_E/2     82.6      ≈82.6              82.6

(a) The maximum velocity and the channel width d (radius d in the case of pipe flow) have been used in the definition of Re.

Variational Problem for Turbulent Momentum Transport

In order to introduce the variational method for bounds on turbulent transports we consider the simplest configuration for which a nontrivial solution of the NSEs of motion exists: the configuration of plane Couette flow (Figure 1). The Reynolds number is defined in this case in terms of the constant relative motion $U_0\mathbf{i}$ between the plates, $Re = U_0 d/\nu$, where $\mathbf{i}$ is the unit vector parallel to the plates and $\nu$ is the kinematic viscosity of the fluid. Using the distance $d$ between the plates as length scale and $d^2/\nu$ as timescale, the basic equations can be written in the form

$$\frac{\partial}{\partial t}\mathbf{v} + \mathbf{v}\cdot\nabla\mathbf{v} = -\nabla p + \nabla^2\mathbf{v} \qquad [6]$$

$$\nabla\cdot\mathbf{v} = 0 \qquad [7]$$

Figure 1 Geometrical configuration of the plane Couette flow problem. (The plates at z = ±1/2, a distance d apart, move with velocities ∓(1/2) Re in the direction of i; x is measured along i and z along k.)

We use a Cartesian system of coordinates with the x, z-coordinates in the directions of $\mathbf{i}$ and $\mathbf{k}$,

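respectively, where $\mathbf{k}$ is the unit vector normal to the plates such that the boundary conditions are given by

$$\mathbf{v} = \mp\tfrac{1}{2}Re\,\mathbf{i} \quad\text{at}\quad z = \pm\tfrac{1}{2} \qquad [8]$$

After separating the velocity field $\mathbf{v}$ into its mean and fluctuating parts, $\mathbf{v} = \mathbf{U} + \check{\mathbf{v}}$ with $\overline{\mathbf{v}} = \mathbf{U}$, $\overline{\check{\mathbf{v}}} = 0$, where the bar denotes the average over planes $z$ = const., we obtain by multiplying eqn [6] by $\check{\mathbf{v}}$ and averaging it over the entire fluid layer (indicated by angular brackets)

$$\frac{1}{2}\frac{d}{dt}\langle|\check{\mathbf{v}}|^2\rangle = -\left\langle\check{\mathbf{u}}\check{w}\cdot\frac{\partial}{\partial z}\mathbf{U}\right\rangle - \langle|\nabla\check{\mathbf{v}}|^2\rangle \qquad [9]$$

Here $\check{\mathbf{u}}$ denotes the component of $\check{\mathbf{v}}$ perpendicular to $\mathbf{k}$ and $\check{w}$ is its z-component. We define fluid turbulence under stationary conditions by the property that quantities averaged over planes $z$ = const. are time independent. Accordingly, the equation for the mean flow $\mathbf{U}$ can be integrated to yield

$$\frac{d}{dz}\mathbf{U} = \overline{\check{w}\check{\mathbf{u}}} - \langle\check{w}\check{\mathbf{u}}\rangle - Re\,\mathbf{i} \qquad [10]$$

where the boundary condition [8] has been employed. With this relationship, $\mathbf{U}$ can be eliminated from the problem and the energy balance

$$\langle|\nabla\check{\mathbf{v}}|^2\rangle + \langle|\overline{\check{\mathbf{u}}\check{w}} - \langle\check{\mathbf{u}}\check{w}\rangle|^2\rangle = Re\,\langle\check{u}_x\check{w}\rangle \qquad [11]$$

is obtained, where the identity $\langle|\overline{\check{\mathbf{u}}\check{w}}|^2\rangle - |\langle\check{\mathbf{u}}\check{w}\rangle|^2 = \langle|\overline{\check{\mathbf{u}}\check{w}} - \langle\check{\mathbf{u}}\check{w}\rangle|^2\rangle$ has been used.

Since the momentum transport in the x-direction between the moving rigid plates is described by $M = -dU_x/dz\,|_{z=\pm1/2} = \langle\check{u}_x\check{w}\rangle + Re$, we can conclude immediately that the momentum transport by turbulent flow always exceeds the corresponding laminar value, because $\langle\check{u}_x\check{w}\rangle$ is positive according to the relationship [11]. Since a lower bound on $M$ thus exists, an upper bound $\mu$ on $\langle\check{u}_x\check{w}\rangle$ as a function of $Re$ is of primary interest. Following Howard (1963), it can be shown that $\mu(Re)$ is a monotonic function, and it is therefore equivalent to ask for a lower bound $R$ of $Re$ at a given value $\mu$ of $\langle\check{u}_x\check{w}\rangle$. We are thus led to the following formulation of the variational problem:

Find the minimum $R(\mu)$ of the functional

$$\mathcal{R}(\tilde{\mathbf{v}},\mu) \equiv \frac{\langle|\nabla\tilde{\mathbf{v}}|^2\rangle}{\langle\tilde{u}_x\tilde{w}\rangle} + \mu\,\frac{\langle|\overline{\tilde{\mathbf{u}}\tilde{w}} - \langle\tilde{\mathbf{u}}\tilde{w}\rangle|^2\rangle}{\langle\tilde{u}_x\tilde{w}\rangle^2} \qquad [12]$$

among all solenoidal vector fields $\tilde{\mathbf{v}} \equiv \tilde{\mathbf{u}} + \mathbf{k}\tilde{w}$ (with $\tilde{\mathbf{u}}\cdot\mathbf{k} = 0$) that satisfy the boundary condition $\tilde{\mathbf{v}} = 0$ at $z = \pm\tfrac{1}{2}$ and the condition $\langle\tilde{u}_x\tilde{w}\rangle > 0$.

The Euler–Lagrange equations as necessary conditions for an extremal value of the functional are given by

$$\tilde{w}\,\frac{d}{dz}\mathbf{U}^{*} + \mathbf{k}\,\tilde{\mathbf{u}}\cdot\frac{d}{dz}\mathbf{U}^{*} = -\nabla\pi + \nabla^2\tilde{\mathbf{v}} \qquad [13]$$

$$\nabla\cdot\tilde{\mathbf{v}} = 0 \qquad [14]$$

where $d\mathbf{U}^{*}/dz$ is defined by

$$\frac{d}{dz}\mathbf{U}^{*} \equiv \overline{\tilde{\mathbf{u}}\tilde{w}} - \langle\tilde{\mathbf{u}}\tilde{w}\rangle - \mathbf{i}\left(R - \frac{\langle|\nabla\tilde{\mathbf{v}}|^2\rangle}{2\langle\tilde{u}_x\tilde{w}\rangle}\right) \qquad [15]$$

and where $\mu = \langle\tilde{u}_x\tilde{w}\rangle$ has been set. When eqns [13]–[15] are compared with the equations for $\check{\mathbf{v}}$ and for $\mathbf{U}$, a strong similarity can be noticed. The variational problem does not exhibit any time dependence, but the Euler–Lagrange equations may still be regarded as the symmetric analogue of the NSEs for steady flow.

Upper Bounds on the Turbulent Momentum Transport

A simple analytical solution of the variational problem can be obtained when the constraint $\nabla\cdot\tilde{\mathbf{v}} = 0$ is dropped. In that case it is evident that the minimum of the functional [12] is reached when $\tilde{\mathbf{v}}$ is independent of x, y, and when $\tilde{u}_x = \tilde{w} = f(z)$ holds. The Euler–Lagrange equations then assume the form of an ordinary differential equation,

$$f'' = \left[\mu\left(\frac{f^2}{\langle f^2\rangle} - 1\right) - R + \frac{\langle f'^2\rangle}{\langle f^2\rangle}\right] f \qquad [16]$$

Since the variational functional [12] is homogeneous in $\tilde{\mathbf{v}}$, we are free to use a normalization condition, for which we choose max$[f(z)] = 1$. Multiplication of eqn [16] by $f'$ and integration yield

$$f'^2 = \frac{\mu}{2k^2\langle f^2\rangle}\,(1 - k^2 f^2)(1 - f^2) \quad\text{with}\quad k^2 \equiv \frac{\mu}{2(R+\mu)\langle f^2\rangle - 2\langle f'^2\rangle - \mu} \qquad [17]$$

This equation can be solved in terms of elliptic integrals. The minimum $R(\mu)$ is determined by the relationships

$$R = \frac{8}{3}\left[K^2(1 + k^2) + \frac{K^3}{D} - 3k^2 K D\right], \qquad \mu = 8k^2 K D \qquad [18]$$

where $D(k)$ and $K(k)$ are the complete elliptic integrals usually labeled by these letters. For details, see the analysis by Howard (1963) of an analogous problem. In the asymptotic case of large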

Reynolds numbers, relationships [18] yield the upper bound

$$\mu(Re) = \frac{9}{128}\,Re^2 \qquad [19]$$
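The passage from [18] to [19] can be checked numerically. The sketch below is only an illustration and not part of the original analysis: it evaluates $K$ and $E$ with the arithmetic-geometric mean, assumes the common definition $D = (K - E)/k^2$ for the second integral in [18], and uses hypothetical helper names (`complete_elliptic`, `howard_bound`). The modulus is passed through the complementary modulus $k' = \sqrt{1 - k^2}$ so that the limit $k \to 1$ can be approached without rounding problems.

```python
import math


def complete_elliptic(kp):
    """Return (K, E) for the complementary modulus kp = sqrt(1 - k^2), 0 < kp <= 1.

    Uses the arithmetic-geometric mean: K = pi / (2 * agm(1, kp)) and
    E = K * (1 - sum_n 2^(n-1) c_n^2), with c_0 = k and c_n = (a_{n-1} - b_{n-1}) / 2.
    """
    a, b = 1.0, kp
    series = 0.5 * (1.0 - kp * kp)    # n = 0 term: 2^(-1) * c_0^2 with c_0^2 = k^2
    weight = 1.0                      # 2^(n-1) for n = 1
    for _ in range(64):
        c = 0.5 * (a - b)
        a, b = 0.5 * (a + b), math.sqrt(a * b)
        series += weight * c * c
        weight *= 2.0
        if abs(c) < 1e-9 * a:         # quadratic convergence: later terms are negligible
            break
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - series)


def howard_bound(kp):
    """Evaluate relationships [18]; return (R, mu) for complementary modulus 0 < kp < 1."""
    K, E = complete_elliptic(kp)
    k2 = 1.0 - kp * kp
    D = (K - E) / k2                  # assumed definition of D(k)
    R = (8.0 / 3.0) * (K * K * (1.0 + k2) + K ** 3 / D - 3.0 * k2 * K * D)
    mu = 8.0 * k2 * K * D
    return R, mu


if __name__ == "__main__":
    for kp in (1e-5, 1e-20, 1e-80, 1e-300):
        R, mu = howard_bound(kp)
        print(f"k' = {kp:.0e}: R = {R:.1f}, mu/R^2 = {mu / R**2:.5f}, 9/128 = {9 / 128:.5f}")
```

As $k' \to 0$ the ratio $\mu/R^2$ computed this way approaches $9/128 \approx 0.0703$, but only slowly, since the leading correction is of relative order $1/K$.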
In solving the full eqns [13]–[15], it is convenient to eliminate eqn [14] through the general representation of the solenoidal vector field $\tilde{\mathbf{v}}$,

$$\tilde{\mathbf{v}} = \nabla\times(\nabla\times\mathbf{k}\varphi) + \nabla\times\mathbf{k}\psi \qquad [20]$$

We assume that the minimizing vector field $\tilde{\mathbf{v}}$ does not depend on x, although a rigorous proof for this property can be given only for small values of $\mu$. Introducing the notations $\theta \equiv \partial\psi/\partial y$ and $w \equiv -\partial^2\varphi/\partial y^2$ we are thus led to the general ansatz

$$w = w^{(N)} \equiv \sum_{n=1}^{N}\alpha_n^2\,w_n(z)\,\phi_n(y) \qquad [21a]$$

$$\theta = \theta^{(N)} \equiv \sum_{n=1}^{N}\theta_n(z)\,\phi_n(y) \qquad [21b]$$

where $N$ may tend to infinity and the functions $\phi_n(y)$ satisfy the equation

$$\frac{\partial^2}{\partial y^2}\phi_n = -\alpha_n^2\,\phi_n \qquad [22]$$

In the following, it will be assumed that the positive wavenumbers $\alpha_n$ are ordered according to their size, $\alpha_{n-1} < \alpha_n < \alpha_{n+1}$. The solutions of the form [21] of the Euler–Lagrange equations exhibit a boundary layer structure for large $\mu$ as sketched in Figure 2. Accordingly, the N–α solutions are characterized by a hierarchy of $N$ boundary layers at each plate and provide the upper bound sequentially with increasing $\mu$, starting with $N = 1$. The extremalizing vector fields thus exhibit a bifurcation structure similar to that found in many cases of the transition to turbulence. The thicknesses of the boundary layers decrease with increasing $\mu$, and their ratio from one layer to the next approaches the factor 4 as indicated in Figure 3. The typical scale of motion increases linearly with distance from the wall, as assumed in Prandtl's mixing-length theory. But the discreteness of the scales reflects the fact that effective transports require preferred scales. Asymptotically, the upper bound for the momentum transport approaches

$$\mu(Re) = 0.010\,Re^2 \qquad [23]$$

which represents a significant improvement over the relationship [19]. Nevertheless, the upper bound still exceeds the measured values of the momentum transport by more than a factor 10.

Figure 2 Qualitative sketch of the boundary layer structure of the extremalizing N–α solution.

Figure 3 Qualitative sketch of the nested boundary layers that characterize the vector field of maximum transport. The profile of the mean shear is shown on the right side.

Discussion

Bounds like those for the momentum transport have been obtained for many other kinds of turbulent transports. For details we refer to the review articles listed below. Usually, the formulation of the upper bound problem requires that the external conditions are homogeneous in two spatial dimensions such that a separation of the turbulent velocity, temperature, or magnetic fields into mean and fluctuating parts is possible. In this respect, the variational methods for upper bounds are more restricted than those used for determination of the energy Reynolds number $Re_E$. The latter problem, incidentally, corresponds to the limit $\mu \to 0$ of variational problems of the type [12], as can be seen from a comparison with expression [4].

In recent years, the background field method has been introduced by Doering and Constantin (1994) as an alternative way for obtaining bounds on properties of turbulent flows. When optimized, it becomes equivalent to the variational method discussed in this article, as has been demonstrated by Kerswell (1998). The fact that not optimized bounds can be obtained

relatively easily emphasizes the point that the extremalizing vector fields are the most interesting aspect of the variational problems. They often exhibit similarities with the observed turbulent velocity fields, in particular as far as the mean flows are concerned. In the case of convection in a layer heated from below, the transition of the bound from the 1–α solution to the 2–α solution corresponds closely to the experimentally observed transition from convection rolls to bimodal convection (Busse 1969).

The close similarities between variational functionals for rather different physical systems suggest corresponding similarities between the respective turbulent fields. For example, the analogy between the fluctuating component of the temperature in turbulent convection and the streamwise component of the fluctuating velocity field in shear flow turbulence has been demonstrated and employed in a theory of the atmospheric boundary layer (Busse 1978). Better bounds and more physically realistic properties of the extremalizing vector fields can be expected when additional constraints are imposed. For example, the energy balances for poloidal and toroidal components of the velocity field can be applied separately. But these developments are still in their initial stages.

See also: Bifurcations in Fluid Dynamics; Fluid Mechanics: Numerical Methods; Turbulence Theories.

Further Reading

Busse FH (1969) On Howard's upper bound for heat transport by turbulent convection. Journal of Fluid Mechanics 37: 457–477.
Busse FH (1978) The optimum theory of turbulence. Advances in Applied Mechanics 18: 77–121.
Busse FH (2002) The problem of turbulence and the manifold of asymptotic solutions of the Navier–Stokes equations. In: Oberlack M and Busse FH (eds.) Theories of Turbulence, pp. 77–121. Wien: Springer.
Doering CR and Constantin P (1994) Variational bounds on energy dissipation in incompressible flows: shear flow. Physical Review E 49: 4087–4099.
Howard LN (1963) Heat transport by turbulent convection. Journal of Fluid Mechanics 17: 405–432.
Howard LN (1972) Bounds on flow quantities. Annual Review of Fluid Mechanics 4: 473–494.
Joseph DD (1976) Stability of fluid motions. vol. 1. Berlin: Springer.
Kerswell RR (1998) Unification of variational principles for turbulent shear flows: the background method of Doering–Constantin and Howard–Busse's mean-fluctuation formulation. Physica D 121: 175–192.

Variational Techniques for Ginzburg–Landau Energies


S Serfaty, New York University, New York, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Ginzburg–Landau-type problems are variational problems which consider a Dirichlet-type energy posed on complex-valued functions, penalized by a potential term which has a well on the unit circle of the complex plane. The denomination comes from the physical model of superconductivity of Ginzburg and Landau. They are phase-transition-type models in the sense that they describe the state of the material according to different "phases" which can coexist in a sample and be separated by various types of interfaces. We start by presenting the physical model (readers familiar with it may wish to skip the next two sections and go straight to the section "The simplified model").

Introduction to the Ginzburg–Landau Model

The Ginzburg–Landau model was introduced by Ginzburg and Landau in the 1950s as a phenomenological model to describe superconductivity, and was later justified as a limit of the quantum BCS theory of Bardeen–Cooper–Schrieffer. It is a model of great importance and recognition in physics (with several Nobel prizes awarded for it: Landau, Ginzburg, Abrikosov). In addition to its importance in the modeling of superconductivity, the Ginzburg–Landau model turns out to be mathematically extremely close to the Gross–Pitaevskii model for superfluidity, and models for rotating Bose–Einstein condensates, which all have in common the appearance of topological defects called "vortices."

Superconductivity, which was discovered in 1911 by Kamerlingh Onnes, consists in the complete loss of resistivity of certain metals and alloys at very low temperatures. Its two most striking consequences are the possibility of permanent superconducting currents and the particular behavior that an external magnetic field applied to the sample gets expelled from the material and can generate vortices, through which it penetrates the sample.

The Energy Functional

After a series of dimension reductions, the Ginzburg–Landau model describes the state of the superconducting sample occupying a region $\Omega$ and submitted to the external magnetic field $h_{ex}$,

below the critical temperature, through its Gibbs energy:

$$G_\varepsilon(\psi, A) = \frac{1}{2}\int_\Omega |\nabla_A\psi|^2 + \frac{(1 - |\psi|^2)^2}{2\varepsilon^2} + \frac{1}{2}\int_{\mathbb{R}^3} |\mathrm{curl}\,A - h_{ex}|^2 \qquad [1]$$

In this expression, the first unknown $\psi$ is the "order parameter" in physics. It is a complex-valued condensed wave function, indicating the local state of the material, or the phase (in the Landau theory approach of phase transitions): $|\psi|^2$ is the density of the "Cooper pairs" of superconducting electrons explaining superconductivity in the BCS approach. With our normalization $|\psi| \le 1$, and where $|\psi| \simeq 1$ the material is in the superconducting phase, while where $|\psi| \simeq 0$, it is in the normal phase (i.e., behaves like a normal conductor), the two phases being able to coexist in the sample.

The second unknown $A$ is the electromagnetic vector potential of the magnetic field, a function from $\Omega$ to $\mathbb{R}^3$. The induced magnetic field in the sample is deduced by $h = \mathrm{curl}\,A$. The notation $\nabla_A$ denotes the covariant derivative $\nabla - iA$. The superconducting current is the vector $j$ of components

$$j_k = \langle i\psi, (\nabla_A\psi)_k\rangle \qquad [2]$$

where $\langle\cdot,\cdot\rangle$ denotes the scalar product in $\mathbb{C}$ identified with $\mathbb{R}^2$.

Finally, the parameter $\varepsilon$ is the inverse of the "Ginzburg–Landau parameter" $\kappa$, a dimensionless parameter (ratio of the penetration depth and the coherence length) depending on the material only. Most variational studies of Ginzburg–Landau focus on the regime of large $\kappa$ or small $\varepsilon$, corresponding to "extreme type-II" superconductors, also called the London limit. In this limit, the potential term acts as a singular perturbation, and the characteristic size of the vortices is $\varepsilon \to 0$; vortices become line-like topological singularities, which makes it easier to extract and describe them.

This model is a U(1)-gauge theory, that is, it is invariant under the gauge transformations:

$$\psi \mapsto \psi e^{i\Phi}, \qquad A \mapsto A + \nabla\Phi \qquad [3]$$

where $\Phi$ is a smooth real-valued function. The physically relevant quantities are those that are gauge invariant, such as the energy $G_\varepsilon$, $|\psi|$, $h$, and the superconducting current $j$.

For more on the model, we refer to the physics literature (e.g., DeGennes (1966) and Tinkham (1996)).

Reductions of the Model

The goal of variational studies of the Ginzburg–Landau model is to relate the energy to the vortices and the applied field. In three dimensions (3D), vortices are filaments, or lines of zeros of the order parameter $\psi$, around which $\psi$ has a nonzero winding number. These are quite delicate to describe in 3D (we will mention some results below), so a simplification that is commonly made consists in reducing to a two-dimensional model.

When reducing to 2D, one assumes that everything is independent of the vertical direction, and that the applied magnetic field is also vertical. The domain $\Omega$ is then a two-dimensional, bounded and (for simplicity) simply connected open set, which is the horizontal section of an infinite vertical cylinder. One can also imagine it represents a thin film.

In 2D, the energy is written the same way:

$$G_\varepsilon(\psi, A) = \frac{1}{2}\int_\Omega |\nabla_A\psi|^2 + \frac{(1 - |\psi|^2)^2}{2\varepsilon^2} + |\mathrm{curl}\,A - h_{ex}|^2 \qquad [4]$$

where this time $A$ is $\mathbb{R}^2$-valued, and the induced magnetic field $h = \mathrm{curl}\,A = \partial_1 A_2 - \partial_2 A_1$ is now a real-valued function, which can be taken to be equal to $h_{ex}$ (now a real positive number) in $\mathbb{R}^2\setminus\Omega$.

The stationary states of the system are the critical points of $G_\varepsilon$, or the solutions of the Ginzburg–Landau equations:

$$\begin{cases} -(\nabla_A)^2\psi = \dfrac{1}{\varepsilon^2}\,\psi\,(1 - |\psi|^2) & \text{in } \Omega \\ -\nabla^\perp h = \langle i\psi, \nabla_A\psi\rangle & \text{in } \Omega \\ h = h_{ex} & \text{on } \partial\Omega \\ \nabla_A\psi\cdot\nu = 0 & \text{on } \partial\Omega \end{cases} \qquad [5]$$

where $\nabla^\perp$ denotes $(-\partial_{x_2}, \partial_{x_1})$.

A common simplification consists in suppressing the magnetic field, and thus in studying the simplified energy

$$E_\varepsilon(u) = \frac{1}{2}\int_\Omega |\nabla u|^2 + \frac{(1 - |u|^2)^2}{2\varepsilon^2} \qquad [6]$$

where the order parameter is commonly denoted by $u$, and is still complex valued. This energy, which can be seen as a complex analog of the real-valued Allen–Cahn model of phase transitions, has been extensively studied, especially since the work of Bethuel–Brezis–Hélein, where the domain $\Omega$ is assumed to be two dimensional and simply connected. The higher-dimensional case has also been considered.
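As a rough numerical illustration of [6], the sketch below evaluates the simplified energy for one radially symmetric trial map. Everything specific in it is an assumption made for illustration and does not come from the article: the domain is a disc of radius `R`, the trial map is $u = \tanh(r/\varepsilon)\,e^{i\theta}$ in polar coordinates, the prefactor 1/2 in [6] is taken to multiply both terms of the integrand, and the function name and quadrature parameters are hypothetical. For such a map $|\nabla u|^2 = f'(r)^2 + f(r)^2/r^2$ with $f = |u|$, so the energy reduces to a one-dimensional integral in $s = r/\varepsilon$.

```python
import math


def trial_vortex_energy(eps, R=1.0, core=20.0, n=4000):
    """E_eps of the trial map u = tanh(r/eps) * exp(i*theta) on the disc of radius R.

    In the scaled variable s = r / eps the energy is
        pi * int_0^{R/eps} [ f'(s)^2 + f(s)^2 / s^2 + (1 - f(s)^2)^2 / 2 ] s ds,
    with f = tanh (the factor 1/2 of [6] is assumed to multiply both terms).
    """
    L = R / eps
    if L <= core:
        raise ValueError("eps must be smaller than R / core")

    def density(s):
        f = math.tanh(s)
        one_minus_f2 = 1.0 - f * f            # equals f'(s) for f = tanh
        return one_minus_f2 ** 2 + (f / s) ** 2 + 0.5 * one_minus_f2 ** 2

    # Core region 0 < s < core: midpoint rule in s.
    h = core / n
    inner = sum(density((j + 0.5) * h) * (j + 0.5) * h for j in range(n)) * h

    # Outer region core < s < L: midpoint rule in t = log s, where s ds = s^2 dt.
    t0 = math.log(core)
    dt = (math.log(L) - t0) / n
    outer = 0.0
    for j in range(n):
        s = math.exp(t0 + (j + 0.5) * dt)
        outer += density(s) * s * s * dt

    return math.pi * (inner + outer)


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        print(f"eps = {eps:.0e}: E = {trial_vortex_energy(eps):.4f}, "
              f"pi*log(1/eps) = {math.pi * math.log(1.0 / eps):.4f}")
```

For this trial map, dividing $\varepsilon$ by 10 raises the energy by very nearly $\pi\log 10$, so the energy grows like $\pi\log(1/\varepsilon)$ plus a constant. This is consistent with the logarithmic cost of a vortex discussed under "The Cost of Each Vortex".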

Vortices and Critical Fields

We now need to explain more precisely what a vortex is. In two dimensions, a vortex is an object centered at an isolated zero of $u$ (or $\psi$), around which the phase of $u$ has a nonzero winding number called the "degree of the vortex." It is the simplest example of a topological defect. If the zero is located at $x_0$, the winding number or degree is the integer that can be computed by

$$\frac{1}{2\pi}\int_{\partial B(x_0, r)} \frac{\partial\varphi}{\partial\tau} = d \in \mathbb{Z} \qquad [7]$$

where $r$ is small enough, and $\varphi$ is the phase of $u$, that is, $u$ can be written $u = |u|e^{i\varphi}$. For example, the phase $\varphi = d\theta$, where $\theta$ is the polar angle centered at $x_0$, yields a vortex of degree $d$. Observe that the phase $\varphi$ is not a well-defined function: it is multivalued (and defined up to $2\pi$). However, we have the important relation

$$\mathrm{curl}\,\nabla\varphi = 2\pi\sum_i d_i\,\delta_{a_i} \qquad [8]$$

where the $a_i$'s are the zeros of $u$, the $d_i$'s the associated degrees, and $\delta_x$ denotes the Dirac mass at $x$.

When $\varepsilon$ is small, it is clear from [4] or [6] that $|u|$ prefers to be close to 1, and a scaling argument hints that $|u|$ is different from 1 in regions of characteristic size $\varepsilon$. Of course this is an intuitive picture, and several mathematical notions are used to describe the vortices.

Vortices appear due to the applied field $h_{ex}$. For type-II superconductors there are essentially three critical fields, $H_{c1}$, $H_{c2}$, $H_{c3}$, critical values of $h_{ex}$ for which phase transitions occur. For $h_{ex} \le H_{c1} = O(|\log\varepsilon|)$, there are no vortices and the superconductor is in the superconducting phase $|\psi| \simeq 1$ everywhere. At $H_{c1}$ the first vortices appear, and their number increases as $h_{ex}$ is raised. When they become numerous they tend to arrange in triangular lattices called Abrikosov lattices, as observed in experiments and predicted by Abrikosov from the Ginzburg–Landau model, in a very influential work. At the second critical field $H_{c2} = O(1/\varepsilon^2)$ bulk superconductivity is destroyed, and surface superconductivity remains until $H_{c3} = O(1/\varepsilon^2)$, the third critical field, above which $\psi \equiv 0$ and the material is normal.

Issues and Methods

The variational approach to Ginzburg–Landau consists in expressing the energy in terms of reduced quantities or objects, in particular in terms of the vortices. This requires developing mathematical tools to describe and characterize the vortices (in particular, giving suitable definitions of a "vortex structure" for a given $u$ or $\psi$), and estimating precisely the energetic cost of each vortex and of their interaction. This allows us to obtain results of variational convergence of the energy $G_\varepsilon$, $E_\varepsilon$ (or their variants), that is, to derive $\Gamma$-limits, or "reduced problems" posed in terms of the vortices, which are easier to minimize than the original ones. These limits depend on the regime of applied field, and in turn allow one to characterize the critical fields and the optimal distribution and number of the vortices, if any.

Variational methods also serve to solve some inverse problems, that is, to prove the existence of solutions of the equation which have some given properties, such as a given distribution of vortices, through local minimization procedures, or the use of topological methods based on investigating the topology of the energy levels.

Nonvariational approaches to Ginzburg–Landau are also very useful, in particular to identify the profiles of the solutions, to describe vortices of nonminimizing critical points, or to perform a bifurcation analysis around the normal solution at $H_{c3}$.

The Simplified Model

We first present the variational study of $E_\varepsilon$ [6] in dimension 2, together with the mathematical tools used for both [6] and [4]. We will restrict to the asymptotics $\varepsilon \to 0$, since this is the situation where the most results are known.

Let us present informally the essential ingredients of the analysis.

Tracing the Vortices

The easiest way to trace the vortices is to use the current $\langle iu, \nabla u\rangle$ (or the "superconducting current" $j = \langle i\psi, \nabla_A\psi\rangle$ for the case with magnetic field). Here we recall that $\langle\cdot,\cdot\rangle$ denotes the scalar product in $\mathbb{C}$ as identified with $\mathbb{R}^2$, that is, $\langle iu, \nabla u\rangle = (u\times\partial_1 u,\ u\times\partial_2 u)$ with $\times$ the vector product in $\mathbb{R}^2$.

The curl of the current is the vorticity of the map $u$, exactly like in fluid mechanics. Writing $u = \rho e^{i\varphi}$ we have (at least formally) $\langle iu, \nabla u\rangle = \rho^2\nabla\varphi$, and since $\rho = |u|$ is close to 1 (other than in the small vortex regions), we have the approximation

$$\mathrm{curl}\,\langle iu, \nabla u\rangle = \mathrm{curl}\,(\rho^2\nabla\varphi) \simeq \mathrm{curl}\,\nabla\varphi = 2\pi\sum_i d_i\,\delta_{a_i} \qquad [9]$$

where the $a_i$'s are the zeros of $u$ (or its vortices) and the $d_i$'s their degrees, or

$$\mathrm{curl}\,\langle i\psi, \nabla_A\psi\rangle + \mathrm{curl}\,A \simeq \mathrm{curl}\,\nabla\varphi$$

in the case with magnetic field. This can be made rigorous (see Jerrard and Soner (2002) and Sandier and Serfaty (to appear)), that is, one can express that

$$\mathrm{curl}\,\langle iu, \nabla u\rangle - 2\pi\sum_i d_i\,\delta_{a_i} \to 0 \quad\text{as } \varepsilon \to 0 \qquad [10]$$

(or respectively $\mathrm{curl}\,\langle i\psi, \nabla_A\psi\rangle + \mathrm{curl}\,A - 2\pi\sum_i d_i\,\delta_{a_i} \to 0$) in some weak functional norm, thus giving a rigorous use of [8]. The quantity

$$\mu(u) = \mathrm{curl}\,\langle iu, \nabla u\rangle \qquad [11]$$

or

$$\mu(\psi, A) = \mathrm{curl}\,\langle i\psi, \nabla_A\psi\rangle + \mathrm{curl}\,A = \mathrm{curl}\,j + h \qquad [12]$$

in the case with magnetic field, will thus be called the vorticity and be used to trace the vortices, in this limit $\varepsilon \to 0$. The relation

$$\mu - 2\pi\sum_i d_i\,\delta_{a_i} \to 0 \quad\text{as } \varepsilon \to 0 \qquad [13]$$

states that it is close to being a measure.

This is also called the Jacobian determinant if written (with differential forms) $Ju = d\langle iu, du\rangle = \langle i\,du, du\rangle = 2(u_{x_1}\times u_{x_2})\,dx_1\wedge dx_2$, and under this form it can be used in higher dimensions.

The Cost of Each Vortex

Here we investigate informally the cost of a vortex of degree $d$. We know already that the characteristic length scale of variation of $u$ is $\varepsilon$, and that $(1 - |u|^2)^2$ is strongly penalized. Thus, we may expect that $|u|$ is close to 1 at a distance $\varepsilon$ of the zeros. Assuming that $x_0$ is a zero of $u$, and taking formally $|u| = 1$ for $|x - x_0| \ge \varepsilon$, we may write $u = e^{i\varphi}$ and $|\nabla u| = |\nabla\varphi|$ for $|x - x_0| \ge \varepsilon$.

Then, we have

$$\frac{1}{2}\int_{R \ge |x - x_0| \ge \varepsilon} |\nabla u|^2 \ \ge\ \frac{1}{2}\int_\varepsilon^R\left(\int_{\partial B(x_0, r)}\left|\frac{\partial\varphi}{\partial\tau}\right|^2\right)dr \ \ge\ \frac{1}{2}\int_\varepsilon^R\left(\int_{\partial B(x_0, r)}\frac{\partial\varphi}{\partial\tau}\right)^2\frac{1}{2\pi r}\,dr \qquad [14]$$

$$= \frac{1}{2}\int_\varepsilon^R\frac{4\pi^2 d^2}{2\pi}\,\frac{dr}{r} = \pi d^2\log\frac{R}{\varepsilon} \qquad [15]$$

where we have used the Cauchy–Schwarz inequality for [14], and the characterization of the degree [7]. We may also observe that this lower bound is sharp if $\partial\varphi/\partial\tau$ is constant, that is, if the phase is $d\theta$ (and the vortex radial). The cost associated to $|u|$ in the energy imposes the length scale $\varepsilon$ and is generally of order 1 ($|\nabla u| \le C/\varepsilon$), thus negligible compared to the cost associated to the phase, which blows up as $\log 1/\varepsilon$ as $\varepsilon \to 0$.

The above estimate is only valid as long as $B(x_0, R)$ does not contain any other zero of $u$. If vortices get close to each other or become numerous, one needs refined techniques to estimate their cost. This can be done through a "ball-construction method" introduced independently by Jerrard and Sandier.

Evaluating the Total Interaction Cost of Vortices

In a first approach, one studies configurations which satisfy the upper bound $E_\varepsilon(u) \le C|\log\varepsilon|$. Then, lower bounds of the type [15] show that the total sum of the degrees (hence the total number of vortices of nonzero degree) remains bounded as $\varepsilon \to 0$. Up to extraction, we may assume these zeros $a_i$ converge as $\varepsilon \to 0$ to a finite set of points $p_i$, with a total degree still denoted $d_i$. This can also be expressed as $\mu(u_\varepsilon) \to 2\pi\sum_i d_i\,\delta_{p_i}$ as $\varepsilon \to 0$.

This is not the only case of interest, since unbounded numbers of vortices do arise, especially in the physical situation of the energy with magnetic field, as we will see in the next section. However, this hypothesis, which was made in the work of Bethuel–Brezis–Hélein, makes the analysis easier and already allows us to exhibit the main phenomena.

Vortices in superconductors are generated by the presence of the external magnetic field $h_{ex}$. For the energy without magnetic field, this has to be replaced by some boundary condition which forces some degree. Bethuel–Brezis–Hélein considered the fixed Dirichlet boundary condition $u_\varepsilon = g$ on $\partial\Omega$, where $g$ is a fixed unit-valued map on $\partial\Omega$, of degree $d > 0$. This forces $u$ to have a total degree $d$ in $\Omega$. However, the Neumann boundary condition, for instance, can also be considered (the minimizers of $E_\varepsilon$ are then simply constants, they are trivial, but one can still look for other critical points).

Let us return to lower bounds in order to look for the next order term in the energy (still with formal arguments). Cutting out holes $\cup_i B(p_i, \rho)$ of fixed size $\rho$ around the limiting vortices $p_i$, we may assume that $u = e^{i\varphi}$ in $\Omega\setminus\cup_i B(p_i, \rho) = \Omega_\rho$, with $\varphi$ a real-valued function, defined modulo $2\pi$. Minimizing the energy outside of the holes amounts to solving

$$\min_{\substack{u:\ \Omega_\rho \to S^1 \\ u = g \text{ on } \partial\Omega \\ \deg(u,\,\partial B(p_i, \rho)) = d_i}}\ \frac{1}{2}\int_{\Omega_\rho} |\nabla u|^2$$

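This is a harmonic map problem, whose solution is given in terms of $\varphi$ by

$$\begin{cases} \Delta\varphi = 0 & \text{in } \Omega_\rho \\ \dfrac{\partial\varphi}{\partial\tau} = \left\langle ig, \dfrac{\partial g}{\partial\tau}\right\rangle & \text{on } \partial\Omega \\ \displaystyle\int_{\partial B(p_i, \rho)}\frac{\partial\varphi}{\partial\tau} = 2\pi d_i \end{cases}$$

and in terms of the harmonic conjugate $\Phi$, which is the function (up to a constant) such that $\nabla\varphi = \nabla^\perp\Phi$,

$$\begin{cases} \Delta\Phi = 0 & \text{in } \Omega_\rho \\ \dfrac{\partial\Phi}{\partial\nu} = \left\langle ig, \dfrac{\partial g}{\partial\tau}\right\rangle & \text{on } \partial\Omega \\ \displaystyle\int_{\partial B(p_i, \rho)}\frac{\partial\Phi}{\partial\nu} = 2\pi d_i \end{cases} \qquad [16]$$

As $\rho \to 0$, $\Phi$ behaves like the solution of

$$\begin{cases} \Delta\Phi_0 = 2\pi\sum_i d_i\,\delta_{p_i} & \text{in } \Omega \\ \dfrac{\partial\Phi_0}{\partial\nu} = \left\langle ig, \dfrac{\partial g}{\partial\tau}\right\rangle & \text{on } \partial\Omega \end{cases} \qquad [17]$$

Hence, we have

$$\frac{1}{2}\int_{\Omega_\rho} |\nabla\varphi|^2 = \frac{1}{2}\int_{\Omega_\rho} |\nabla\Phi|^2 \simeq \frac{1}{2}\int_{\Omega_\rho} |\nabla\Phi_0|^2 = \pi\sum_i d_i^2\log\frac{1}{\rho} + W_d(p_1, \ldots, p_n) + o(1) \quad\text{as } \rho \to 0 \qquad [18]$$

where

$$W_d(p_1, \ldots, p_n) = -\pi\sum_{i \ne j} d_i d_j\log|p_i - p_j| - \pi\sum_i d_i R(p_i) + \frac{1}{2}\int_{\partial\Omega}\Phi_0\left\langle ig, \frac{\partial g}{\partial\tau}\right\rangle \qquad [19]$$

and $R(x) = \Phi_0(x) - \sum_i d_i\log|x - p_i|$. The function $W$ was introduced by Bethuel–Brezis–Hélein and called the renormalized energy, since it consists in the part of the energy that is left after subtracting the "infinite part" in $|\log\varepsilon|$ from $E_\varepsilon$. It contains the (logarithmic) interaction energy between the vortices: we see that vortices with degrees of same sign repel one another while vortices with degrees of opposite signs attract one another. The $\pi d_i^2\log 1/\rho$ term corresponds to the self-interaction, or cost of the vortex of core of size $\rho$; it is what replaces the infinite term in the formal calculation.

Now [18] is a good estimate for the optimal energy outside of the holes, while the energy in holes of size $\rho$ can be bounded below by [15]. Given the degree $d_i$ on the boundary $\partial B(p_i, \rho)$ of the small hole, $B(p_i, \rho)$ contains one or several zeros of $u$ of degrees $m_k$ with total degree $\sum_k m_k = d_i$. In view of [15], since the cost of a vortex of degree $d$ grows like $d^2|\log\varepsilon|$, and since the infimum of $\sum_k m_k^2$ under the constraint $\sum_k m_k = d_i$ is attained for $m_k = \mathrm{sign}(d_i)$, the least costly way to achieve this is to have $|d_i|$ vortices of degree $\mathrm{sign}(d_i)$. The smallest lower bound possible is thus

$$\frac{1}{2}\int_{B(p_i, \rho)} |\nabla u|^2 + \frac{(1 - |u|^2)^2}{2\varepsilon^2} \ \ge\ \pi|d_i|\log\frac{\rho}{\varepsilon} + C \qquad [20]$$

where the constant $C$ can be described explicitly. Adding up the results of [20] and [18], we find

$$\begin{aligned} E_\varepsilon(u) &\ge \pi\sum_i d_i^2\log\frac{1}{\rho} + \pi\sum_i |d_i|\log\frac{\rho}{\varepsilon} + W_d(p_1, \ldots, p_n) + nC + o_\rho(1) + o_\varepsilon(1) \\ &\ge \pi\sum_i |d_i|\log\frac{1}{\varepsilon} + W_d(p_1, \ldots, p_n) + nC + o_\varepsilon(1) \end{aligned} \qquad [21]$$

with equality only if $u$ has $|d_i|$ zeros of degree $\mathrm{sign}(d_i)$ in each $B(p_i, \rho)$.

This provides a lower bound of the energy in terms of the vortices. Moreover, this bound is sharp: one can construct test configurations which have the given limiting vortices $(p_i, d_i)$, and an energy equal to the right-hand side of [21].

One can thus deduce the behavior of global minimizers of the energy. Given the total degree $d = \deg(g) > 0$ on $\partial\Omega$, we need $\sum_i d_i = d$, and the lowest value achievable under this constraint in the right-hand side of [21] is to have $d_i = 1$ for every $i$, and thus to have exactly $d$ vortices of degree 1. Moreover, the limiting points $p_i$ should minimize $W$. We thus are led to the first main result.

Theorem 1 (Bethuel–Brezis–Hélein). Minimizers of $E_\varepsilon$ under the boundary condition $u = g$, $\deg(g) = d > 0$, have $d$ zeros of degree 1, which converge as $\varepsilon \to 0$ to a minimizer of $W$.

This result can be rephrased as a result of $\Gamma$-convergence of $E_\varepsilon - \pi d|\log\varepsilon|$. It reduces the minimization of $E_\varepsilon$ to one of $W$, which is a finite-dimensional problem (interaction of point charges).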
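The sign of the logarithmic interaction in [19] can be illustrated with a few lines of code. The sketch below keeps only the pair term $-\pi\sum_{i\ne j} d_i d_j\log|p_i - p_j|$ and deliberately drops the two terms of [19] that depend on the domain and on the boundary datum $g$, so it is not the full renormalized energy. The function name and the sample configurations are hypothetical.

```python
import math


def pair_interaction(points, degrees):
    """Pair term of [19]: -pi * sum over i != j of d_i * d_j * log|p_i - p_j|.

    points  : list of (x, y) positions of distinct point vortices
    degrees : list of integer degrees, one per point
    The boundary-dependent terms of the renormalized energy are omitted.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j:
                distance = math.hypot(xi - xj, yi - yj)
                total += degrees[i] * degrees[j] * math.log(distance)
    return -math.pi * total


if __name__ == "__main__":
    for separation in (0.1, 0.2, 0.5):
        pts = [(0.0, 0.0), (separation, 0.0)]
        print(f"separation {separation}: "
              f"same sign {pair_interaction(pts, [1, 1]):+.3f}, "
              f"opposite sign {pair_interaction(pts, [1, -1]):+.3f}")
```

For two vortices of equal degree the pair term decreases as they move apart, and for opposite degrees it decreases as they approach each other. This is the repulsion and attraction described above.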

Thus, we see again the interest of studying this asymptotic limit $\varepsilon \to 0$: the vortices become pointlike and the problem reduces to a finite-dimensional one, that of minimizing the vortex interaction.

Further Results

A nonvariational approach also allowed Bethuel–Brezis–Hélein to prove a further correspondence between $E_\varepsilon$ and $W$: they obtained that critical points of $E_\varepsilon$, under the upper bound $E_\varepsilon \le C|\log\varepsilon|$, have vortices which converge to a critical point of $W$. Other important results are the study of the blow-up profiles or solutions in the whole plane, by Brezis–Merle–Rivière and Mironescu.

In two dimensions, the variational approach is also used to solve inverse problems (construct solutions) and study variants of the energy with pinning (or weighted) terms.

The variational approach is also fruitful in higher dimensions. In dimension 3, for example, vortices are not points but vortex lines, and the Jacobian $Ju = d(iu, du)$ can be seen as a current carried by the vortex line, with the total mass $\|Ju\|$ of the current equal to $\pi$ times the length of the line. It was established by Jerrard and Soner that $Ju_\varepsilon$ is compact in some weak sense, and converges, up to extraction, to $\pi$ times some integer-multiplicity rectifiable current $J$, with

\[ \liminf_{\varepsilon \to 0} \frac{E_\varepsilon(u_\varepsilon)}{|\log\varepsilon|} \ge \|J\| \]

In fact, a complete $\Gamma$-convergence result for $E_\varepsilon/|\log\varepsilon|$ can be proved, see the work of Alberti–Baldo–Orlandi, and thus minimizing $E_\varepsilon$ reduces in the limit to minimizing the length of the line, leading to straight lines, or in higher dimensions, to codimension-2 minimal currents. This is a nontrivial problem, in contrast to dimension 2, where the $\Gamma$-limit of $E_\varepsilon/|\log\varepsilon|$ is trivial and one has to go to the lower-order term to find the nontrivial renormalized energy limit $W$.

The Functional with Magnetic Field

The aim here is to achieve the same objective: express or bound from below the energy by terms which depend only on the vortices and their degrees. The method consists in transposing the type of analysis above, taking into account the magnetic field contribution, to see how the external field triggers the sudden appearance of vortices, and for what values they appear (thus retrieving the critical fields, etc.). One of the main difficulties is that the number of vortices becomes divergent, which requires more delicate estimates. Also, it is then no longer possible to study the convergence of the individual zeros of $\psi$, so one studies instead the limit of rescalings of the vorticity measures $\mu(\psi, A)$.

Splitting of the Energy and Main Results

Let us recall that in the case with magnetic field, the vorticity is given by [12]. In addition, we may assume that the second set of equations in [5]

\[ -\nabla^\perp h = j \ \text{in } \Omega, \qquad h = h_{ex} \ \text{on } \partial\Omega \qquad [22] \]

is satisfied (if not, keeping $\psi$ fixed and choosing $A$ which satisfies this equation always decreases the energy). Taking the curl of this equation, we find exactly

\[ -\Delta h + h = \mu(\psi, A) \ \text{in } \Omega, \qquad h = h_{ex} \ \text{on } \partial\Omega \qquad [23] \]

Thus, the vorticity and the induced magnetic field are in one-to-one correspondence with each other. Combining this with the relation [13], we are led to the approximate relation

\[ -\Delta h + h \simeq 2\pi \sum_i d_i \delta_{a_i} \ \text{in } \Omega, \qquad h = h_{ex} \ \text{on } \partial\Omega \qquad [24] \]

where again the $a_i$ are the vortex centers and the $d_i$ their degrees. This relation is well known in physics as the "London equation." It shows how the magnetic field is induced by the vortices which act like "charges," and how the magnetic field "penetrates the sample" around the positive vortex locations.

Of course this equation is only an approximation, because the singularities at the $a_i$, where $h$ would become infinite, are really smoothed out in $\mu(\psi, A)$; however, the approximation is good far from the vortex cores, just as [17] is an approximation for [16].

It is then natural to introduce the field corresponding to the vortex-free situation, which is $h_{ex} h_0$ where $h_0$ solves

\[ -\Delta h_0 + h_0 = 0 \ \text{in } \Omega, \qquad h_0 = 1 \ \text{on } \partial\Omega \qquad [25] \]

$h_0$ is thus a fixed smooth function, depending only on $\Omega$, and when there are no vortices, we expect $h$ to be approximately $h_{ex} h_0$. Moreover, $h' := h - h_{ex} h_0$ then solves

\[ -\Delta h' + h' = \mu(\psi, A) \simeq 2\pi \sum_i d_i \delta_{a_i} \ \text{in } \Omega, \qquad h' = 0 \ \text{on } \partial\Omega \qquad [26] \]
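To get a feeling for the vortex-free profile $h_0$ of [25], one can look at a one-dimensional analogue. The sketch below assumes that $\Omega$ is replaced by the interval $(-R, R)$, so that [25] becomes $-h'' + h = 0$ with $h(\pm R) = 1$, whose solution is $\cosh(x)/\cosh(R)$. The reduction to one dimension, the grid size, and the name `solve_h0_1d` are assumptions made for illustration only.

```python
import math

def solve_h0_1d(R=2.0, n=400):
    """Finite-difference solution of -h'' + h = 0 on (-R, R) with h(-R) = h(R) = 1."""
    dx = 2.0 * R / n
    m = n - 1                        # number of interior grid nodes
    diag = [2.0 / dx**2 + 1.0] * m   # main diagonal of the tridiagonal system
    off = -1.0 / dx**2               # constant sub- and super-diagonal entry
    rhs = [0.0] * m
    rhs[0] -= off * 1.0              # boundary value h(-R) = 1 moved to the right-hand side
    rhs[-1] -= off * 1.0             # boundary value h(R) = 1 moved to the right-hand side
    # Thomas algorithm: forward elimination ...
    for k in range(1, m):
        w = off / diag[k - 1]
        diag[k] -= w * off
        rhs[k] -= w * rhs[k - 1]
    # ... and back substitution.
    h = [0.0] * m
    h[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):
        h[k] = (rhs[k] - off * h[k + 1]) / diag[k]
    xs = [-R + (k + 1) * dx for k in range(m)]
    return xs, h

R = 2.0
xs, h0 = solve_h0_1d(R)
exact = [math.cosh(x) / math.cosh(R) for x in xs]
max_err = max(abs(u - v) for u, v in zip(h0, exact))
print(max_err)    # discretization error against the closed form
print(min(h0))    # smallest value of h0, attained at the center of the interval
```

In this analogue $h_0$ stays below its boundary value 1 and is smallest in the middle of the sample, which is the qualitative picture of a field that is screened away from the boundary.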

Defining the Green kernel $G(\cdot, y)$ by

\[ -\Delta G + G = \delta_y \ \text{in } \Omega, \qquad G = 0 \ \text{on } \partial\Omega \qquad [27] \]

and $S$ by $S(x, y) = 2\pi G(x, y) + \log|x - y|$, for $x$ far enough from the $a_i$, we may approximate $h'$ by

\[ h'(x) \simeq 2\pi \sum_i d_i G(x, a_i) \qquad [28] \]

Using the second Ginzburg–Landau equation [22] and the fact that $|\psi| \le 1$, we have $|\nabla_A \psi| \ge |j| = |\nabla h|$, thus $G_\varepsilon(\psi, A) \ge \frac{1}{2} \int_\Omega |\nabla h|^2 + |h - h_{ex}|^2$. Plugging in the decomposition $h = h_{ex} h_0 + h'$ and using an integration by parts and [26], one finds

\[ \begin{aligned} G_\varepsilon(\psi, A) &\ge \frac{h_{ex}^2}{2} \int_\Omega |\nabla h_0|^2 + |h_0 - 1|^2 + h_{ex} \int_\Omega \nabla h_0 \cdot \nabla h' + (h_0 - 1) h' + \frac{1}{2} \int_\Omega |\nabla h'|^2 + |h'|^2 \\ &= h_{ex}^2 J_0 + h_{ex} \int_\Omega (h_0 - 1)\, \mu(\psi, A) + \frac{1}{2} \int_\Omega |\nabla h'|^2 + |h'|^2 \end{aligned} \qquad [29] \]

where $J_0$ is the constant $\frac{1}{2} \int_\Omega |\nabla h_0|^2 + |h_0 - 1|^2$.

The right-hand side of eqn [29] can be expressed in terms of the vortices. First, using [26], we have $\int_\Omega (h_0 - 1)\, \mu(\psi, A) \simeq 2\pi \sum_i d_i (h_0 - 1)(a_i)$. Second, the expression $\int_\Omega |\nabla h'|^2 + |h'|^2$ can be treated exactly like $E_\varepsilon(u)$ in the previous section. Using lower bounds for the cost of vortices provided by the Jerrard–Sandier method, we are led to the (approximate) relation

\[ \frac{1}{2} \int_\Omega |\nabla h'|^2 + |h'|^2 \gtrsim \pi \sum_i |d_i| \log\frac{1}{\varepsilon} - \pi \sum_{i \neq j} d_i d_j \log|a_i - a_j| + \pi \sum_{i,j} d_i d_j S(a_i, a_j) \qquad [30] \]

Combining this with [29] we find the decomposition

\[ G_\varepsilon(\psi, A) \gtrsim h_{ex}^2 J_0 + \pi \sum_i |d_i| |\log\varepsilon| + 2\pi h_{ex} \sum_i d_i (h_0 - 1)(a_i) - \pi \sum_{i \neq j} d_i d_j \log|a_i - a_j| + \pi \sum_{i,j} d_i d_j S(a_i, a_j) \qquad [31] \]

On the other hand, this inequality is sharp: as before, given vortices $a_i$, one can construct a configuration $(\psi, A)$ for which this is an equality, at leading order.

In that relation, $h_{ex}^2 J_0$ is a fixed energy, the energy of the vortex-free configuration. To it are added the intrinsic cost of each vortex $\pi |d_i| |\log\varepsilon|$, the interaction cost between vortices, and the interaction between the vortices and the external field $2\pi h_{ex} \sum_i d_i (h_0 - 1)(a_i)$.

It is then simple, by minimizing the right-hand side with respect to the vortices for a given $h_{ex}$, and observing that $h_0 - 1 \le 0$, to deduce a few basic facts about vortices: vortices of positive degree (and of degree $+1$) are preferred, each vortex costs $\pi |\log\varepsilon|$, and allows one to gain at best an energy $2\pi h_{ex} \max|h_0 - 1|$ when placed at the minimum of $h_0 - 1$. Therefore, vortices become favorable when their cost becomes smaller than the gain, that is, when $h_{ex}$ becomes larger than the "first critical field"

\[ H_{c1} \simeq \frac{|\log\varepsilon|}{2 |\min(h_0 - 1)|} \qquad [32] \]

We have the first main result.

Theorem 2 (Sandier–Serfaty). When $\varepsilon$ is small enough and $h_{ex} \le H_{c1}$, then minimizers of $G_\varepsilon$ have no vortices.

On the other hand, if $h_{ex} \ge H_{c1}$, the vortices cannot all be located at the same minimum point of $h_0 - 1$, because their repulsion $-\pi \sum_{i \neq j} \log|a_i - a_j|$ would be infinite. There is thus a trade-off between their repulsion and the cost for being far from the minimum of $h_0 - 1$. Only if $n$, the number of vortices, is small compared to $h_{ex}$ do the vortices tend to concentrate near the minimum of $h_0 - 1$. If so, then, assuming for simplicity that the minimum of $h_0 - 1$ is achieved at a unique point $p$, and denoting by $Q$ the Hessian of $h_0 - 1$ at $p$, in the relation above $(h_0 - 1)(a_i)$ can be approximated by $\min(h_0 - 1) + \frac{1}{2} Q(a_i - p)$ and thus $G_\varepsilon(\psi, A)$ by

\[ G_\varepsilon(\psi, A) \simeq h_{ex}^2 J_0 + \pi n |\log\varepsilon| + 2\pi n h_{ex} \min(h_0 - 1) + \pi h_{ex} \sum_i Q(a_i - p) - \pi \sum_{i \neq j} d_i d_j \log|a_i - a_j| + \pi n^2 S(p, p) \qquad [33] \]

From this relation, optimizing over $\ell$, the characteristic distance to $p$ and characteristic distance between the vortices, we find that $\ell = \sqrt{n/h_{ex}}$ is optimal.

Moreover, optimizing with respect to $n$, we find that $n$ should remain bounded (as $\varepsilon \to 0$) when $h_{ex} \le H_{c1} + O(\log|\log\varepsilon|)$. In that regime, rescaling by setting $x_i = (a_i - p)/\ell$, we have the following result:

Theorem 3 (Sandier–Serfaty). There exist fields $H_n \sim H_{c1} + C(n - 1) \log|\log\varepsilon|$ such that when $H_n \le h_{ex} < H_{n+1}$, minimizers of $G_\varepsilon$ have $n$ vortices of degree 1, and the rescaled vortices $x_i$ tend to minimize:

\[ w_n(x_1, \dots, x_n) = -\pi \sum_{i \neq j} \log|x_i - x_j| + \pi n \sum_{i=1}^n Q(x_i) \qquad [34] \]

If $h_{ex} - H_{c1} \gg \log|\log\varepsilon|$, then the optimal number of vortices $n$ becomes unbounded as $\varepsilon \to 0$. The analysis above still holds, but in order to get a convergence of the vortices, one needs to rescale the vorticity measure by $n$. There is an intermediate regime, for $\log|\log\varepsilon| \ll h_{ex} - H_{c1} \ll |\log\varepsilon|$, for which $n$ should be $\gg 1$ but still $n \ll h_{ex}$, so $\ell \ll 1$: vortices are numerous, but still concentrate around $p$. Rescaling by the scale $\ell$ as above, we prove that the density of vortices (after dividing it by $n$) converges to a probability measure, minimizer of the energy

\[ I(\mu) = -\pi \int_{\mathbb{R}^2 \times \mathbb{R}^2} \log|x - y| \, d\mu(x) \, d\mu(y) + \pi \int_{\mathbb{R}^2} Q(x) \, d\mu(x) \qquad [35] \]

This is an averaged/continuous form of [34].

If $h_{ex} - H_{c1}$ is of order $|\log\varepsilon|$, then the optimal number $n$ becomes of order $h_{ex}$ and the vortices no longer concentrate around a single point.

The simplest approach is then to consider the vorticity measure $\mu(\psi, A)$ and to rescale it by the order $n$, hence by $h_{ex}$. Then $(1/h_{ex})\, \mu(\psi, A)$ converges, after extraction, to some measure $\mu$. A continuous version of [31] can thus be written, using [12], as

\[ G_\varepsilon(\psi, A) \gtrsim \frac{1}{2} h_{ex} |\log\varepsilon| \int_\Omega |\mu| + \frac{1}{2} h_{ex}^2 \int_\Omega |\nabla h_\mu|^2 + |h_\mu - 1|^2 \qquad [36] \]

where $h_\mu$ solves

\[ -\Delta h_\mu + h_\mu = \mu \ \text{in } \Omega, \qquad h_\mu = 1 \ \text{on } \partial\Omega \]

Again, this inequality can be proved to be sharp (by a construction) and allows one to show that minimizers of $G_\varepsilon$ have a vorticity $\mu(\psi, A)$ such that $\mu(\psi, A)/h_{ex}$ converges to a minimizer of

\[ G(\mu) = \frac{1}{2} \lim_{\varepsilon \to 0} \frac{|\log\varepsilon|}{h_{ex}} \int_\Omega |\mu| + \frac{1}{2} \int_\Omega |\nabla h_\mu|^2 + |h_\mu - 1|^2 \]

In fact the stronger result holds, in that sense:

Theorem 4 (Sandier–Serfaty). $G_\varepsilon / h_{ex}^2$ $\Gamma$-converges to $G$.

The limit problem of minimizing $G$ turns out to have a simple solution in terms of an obstacle problem: the optimal $\mu$ is a uniform density of vortices on a subdomain of $\Omega$ determined through a free boundary problem (and depending on $h_{ex}$), which is nonzero.

In all these regimes, we have thus been able to identify the optimal number and repartition of vortices through a $\Gamma$-convergence-type approach, that is, by reducing the minimization of the energy to the minimization of a limiting problem: $w_n$ or $I$ or $G$, according to the regime.

Further Results

Concerning vortices, in the same spirit as what was done for $E_\varepsilon$, we can obtain necessary conditions characterizing limiting vorticities obtained from sequences of (nonminimizing) critical points of the energy $G_\varepsilon$. They consist in passing to the limit in the conservative form of the Ginzburg–Landau equations [5].

Most of the results concerning the phase transitions at the next critical fields $H_{c2}$ and $H_{c3}$ are also obtained by nonvariational methods, and often by linear analysis.

The study of the Ginzburg–Landau energy in non-simply-connected domains is also very interesting because it leads to nontrivial topological effects, since in such domains there exist unit-valued maps with nonzero degree (corresponding to permanent currents).

See also: Abelian Higgs Vortices; Aharonov–Bohm Effect; Bose–Einstein Condensates; Gamma-Convergence and Homogenization; Gauge Theory: Mathematical Applications; Ginzburg–Landau Equation; High Tc Superconductor Theory; Image Processing: Mathematics; Superfluids; Topological Defects and Their Homotopy Classification; Variational Techniques for Microstructures.

Further Reading

Bethuel F, Brezis H, and Hélein F (1994) Ginzburg–Landau Vortices. Boston: Birkhäuser.
DeGennes PG (1966) Superconductivity of Metals and Alloys. New York: Benjamin.
Jerrard RL and Soner HM (2002) The Jacobian and the Ginzburg–Landau energy. Calculus of Variations and Partial Differential Equations 14(2): 151–191.
Sandier E and Serfaty S Vortices in the Magnetic Ginzburg–Landau Model. Birkhäuser (monograph to appear).
Tinkham M (1996) Introduction to Superconductivity, 2nd edn. McGraw-Hill.
Variational Techniques for Microstructures 363

Variational Techniques for Microstructures


G Dolzmann, University of Maryland, College Park, MD, USA
© 2006 Elsevier Ltd. All rights reserved.

Austenite–Martensite Transformations and the Shape Memory Effect

Microstructures in materials that typically form in response to phase transformations in the solid state, and their impact on the elastic properties of these materials, have been known for centuries. The discovery of the complex phase diagram of iron revolutionized the production of steels at the end of the nineteenth century. Starting in the 1980s, the mathematical description of microstructures in the framework of nonlinear elasticity has led to deep analytical questions and surprising developments in the calculus of variations and in nonlinear partial differential equations.

The mathematical approach outlined here is based on the following fundamental assumptions:

1. The observed configurations correspond to minimizers of, or elements of minimizing sequences for, an energy functional.
2. The qualitative properties of low energy states are determined from the set of minima of the free energy density.

Under these assumptions one aims to explain experimental observations and to predict material properties based on minimizing an energy functional of the form

\[ I(u) = \int_\Omega W(Du) \, dx \]

Here $\Omega$ is an ideal, unstressed reference configuration in $\mathbb{R}^n$, $u : \Omega \to \mathbb{R}^m$ is an elastic deformation, and $W : M^{m \times n} \to \mathbb{R}$ is the stored energy density. In the case of physical interest, $m = n = 2$ or $m = n = 3$. For applications in elasticity we assume that $m = n$, but this assumption is not needed in the general theory. The energy density $W$ and its structure depend critically on the temperature. However, since we are interested in the analysis of the material at a given temperature, we do not include this dependence explicitly.

The key ingredient of this model is the stored energy density $W$ which has to reflect the properties of the specific material one wants to model. Frequently these are alloys, in particular shape memory alloys that undergo an austenite–martensite transformation. For most materials a closed analytic expression for $W$ is not available. In the spirit of the fundamental assumption (2) one therefore focuses on the structure of the set of minima of $W$ which is determined from general invariance and symmetry principles. We may assume that $W \ge 0$ and that $K = \{X : W(X) = 0\} \neq \emptyset$. The principle of material frame indifference then asserts that

\[ W(RF) = W(F) \quad \text{for all } R \in SO(n) \]

Here $SO(n)$ is the group of proper rotations, that is, the set of all matrices $R \in M^{n \times n}$ with $R^T R = \mathrm{Id}$ and $\det R = 1$.

The symmetry of the austenitic (high-temperature) phase implies that the energy density in the martensitic (low-temperature) phase is invariant under all changes of basis that leave the underlying lattice in the austenitic phase invariant. Therefore,

\[ W(R^T F R) = W(F) \quad \text{for all } R \in P_a \]

where $P_a$ is the point group of the austenite. In the case of a cubic to tetragonal phase transformation, this leads to $K = SO(3)$ in the austenitic phase and to

\[ K = SO(3) U_1 \cup SO(3) U_2 \cup SO(3) U_3 \qquad [1] \]

with

\[ U_i = \eta^2 \, e_i \otimes e_i + \frac{1}{\eta} \, (I - e_i \otimes e_i) \qquad [2] \]

in the martensitic phase (see Figure 1). A set of the form $SO(n) U_i$ is often referred to as an energy well.

Figure 1 Two-dimensional cartoon of a cubic to tetragonal phase transformation in a single crystal: (a) a cubic lattice, (b) and (c) tetragonal variants which are stretched in directions $e_1$ and $e_2$, respectively. (Sketch not to scale.)

The origin of the shape memory effect is the availability of a rich class of geometric patterns in which the martensitic phases can be arranged, thus leading to a great flexibility of the material to accommodate macroscopic deformations. Upon heating of the material above the transformation temperature, the martensitic phases lose their stability and the material returns to its unique shape in the

austenitic phase. The two solutions of Hadamard's compatibility condition

\[ Q U_2 - U_1 = a \otimes b, \qquad Q \in SO(3) \]

are given by

\[ Q_1 = \frac{1}{\eta^6 + 1} \begin{pmatrix} 2\eta^3 & -(\eta^6 - 1) & 0 \\ \eta^6 - 1 & 2\eta^3 & 0 \\ 0 & 0 & \eta^6 + 1 \end{pmatrix} \qquad [3] \]

and $Q_2 = Q_1^T$ (see Figure 2). The normals (in the reference configuration) are given by $(1, \pm 1, 0)/\sqrt{2}$. It is one of the successes of the theory that it provides an analytical derivation of the normals to the twinning planes.

Figure 2 Formation of phase boundaries in a single crystal. (a) The upper right half of the lattice deforms into phase I with the constant deformation gradient $U_1$, the lower left half of the lattice deforms into phase II with constant deformation gradient $U_2$. (b) An additional rotation is needed to accomplish a continuous deformation, see formula [3]. (c) A different configuration with a different orientation of the interface. (Sketch not to scale.)

The Direct Method in the Calculus of Variations

The mathematical interest in the variational problems described in the previous section lies in the fact that existence of minimizers cannot in general be obtained by a straightforward application of the direct methods in the calculus of variations. This approach is based on the idea to (1) choose a minimizing sequence for the functional $I$, (2) show that this sequence is bounded and precompact, and (3) prove that the functional is lower semicontinuous with respect to the notion of convergence,

\[ I(u) \le \liminf_{j \to \infty} I(u_j) \quad \text{if } u_j \to u \]

The typical choice is to seek $u_j$ in a suitable Sobolev space $W^{1,p}(\Omega; \mathbb{R}^m)$ with $1 < p \le \infty$ which is related to growth and coercivity conditions for the energy density $W$,

\[ c_1 |F|^p - c_2 \le W(F) \le c_3 (|F|^p + 1) \quad \text{for all } F \in M^{m \times n} \qquad [4] \]

This leads to weak compactness in $W^{1,p}(\Omega; \mathbb{R}^m)$ (weak-$*$ compactness in $W^{1,\infty}(\Omega; \mathbb{R}^m)$) and to the requirement of sequential weak lower semicontinuity of the functional,

\[ I(u) \le \liminf_{j \to \infty} I(u_j) \quad \text{if } u_j \rightharpoonup u \text{ in } W^{1,p}(\Omega; \mathbb{R}^m) \]

(sequential weak-$*$ lower semicontinuity for $p = \infty$). Morrey's fundamental work establishes a link between convexity conditions for the energy density and lower semicontinuity of the variational integral: under suitable growth and coercivity conditions, sequential weak-$*$ lower semicontinuity is equivalent to quasiconvexity of the integrand.

Definition 1 A function $W : M^{m \times n} \to \mathbb{R}$ is said to be quasiconvex at $F$ if

\[ \int_\Omega W(F) \, dx \le \int_\Omega W(F + D\varphi) \, dx \quad \text{for all } \varphi \in W_0^{1,\infty}(\Omega; \mathbb{R}^m) \]

and for all open and bounded domains $\Omega \subset \mathbb{R}^n$ with $L^n(\partial\Omega) = 0$. It is said to be quasiconvex if it is quasiconvex at all $F$.

In the language of nonlinear elasticity, $W$ is quasiconvex if affine functions are minimizers of the energy functional subject to their own boundary conditions. The direct method implies the following classical existence theorem.

Theorem 1 Suppose that $W : M^{m \times n} \to \mathbb{R}$ is quasiconvex and satisfies the growth and coercivity condition [4]. Let $u_0 \in W^{1,p}(\Omega; \mathbb{R}^m)$. Then the variational problem: minimize $I(u)$ in

\[ \mathcal{A} = \{ u \in W^{1,p}(\Omega; \mathbb{R}^m) : u - u_0 \in W_0^{1,p}(\Omega; \mathbb{R}^m) \} \]

has a minimizer.

The remarkable fact is that the structure of the zero set of a typical energy $W$ modeling a phase-transforming material in its low-temperature phase prevents $W$ from being quasiconvex. In order to see this, let $Q \subset \mathbb{R}^3$ be a cube with two of its sides perpendicular to $b = (1, 1, 0)/\sqrt{2}$ and let $h$ be the 1-periodic function with $h' = 0$ on $(0, \lambda)$ and $h' = 1$ on $(\lambda, 1)$ with $\lambda \in (0, 1)$. Define $v_j(x) = U_1 x + a \, h(j \, x \cdot b)/j$ and

\[ u_j(x) = \min\{ v_j(x), \mathrm{dist}(x, \partial Q) \} = \min\{ U_1 x + a \, h(j \, x \cdot b)/j, \ \mathrm{dist}(x, \partial Q) \} \]

where $\mathrm{dist}(x, \partial Q) = \inf\{ \|x - y\|_\infty, \ y \in \partial Q \}$. Then $u_j \to u$, $u(x) = Cx$, strongly in $L^\infty(Q; \mathbb{R}^3)$ and weakly-$*$ in $W^{1,\infty}(Q; \mathbb{R}^3)$ with $C = \lambda U_1 + (1 - \lambda) Q_1 U_2 \notin K$ where $K$ is the zero set of $W$, see the previous section.
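The compatibility of the two gradients used in this construction can be checked numerically. The sketch below assumes the wells $U_i = \eta^2 e_i \otimes e_i + (1/\eta)(I - e_i \otimes e_i)$ of [2] and writes the twinning rotation as a rotation about $e_3$ with $\cos\theta = 2\eta^3/(\eta^6 + 1)$ and $\sin\theta = (\eta^6 - 1)/(\eta^6 + 1)$; this closed form is derived by hand for the sketch, and the rank-one check is what validates it. The values $\eta = 1.1$ and $\lambda = 0.3$ and all function names are illustrative assumptions.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def well(eta, i):
    """U_i = eta^2 e_i (x) e_i + (1/eta)(I - e_i (x) e_i), with i = 0, 1, 2."""
    return [[(eta**2 if r == i else 1.0 / eta) if r == c else 0.0 for c in range(3)]
            for r in range(3)]

def twin_rotation(eta):
    """Rotation about e_3 (hand-derived assumption of this sketch)."""
    cos_t = 2 * eta**3 / (eta**6 + 1)
    sin_t = (eta**6 - 1) / (eta**6 + 1)
    return [[cos_t, -sin_t, 0.0], [sin_t, cos_t, 0.0], [0.0, 0.0, 1.0]]

eta, lam = 1.1, 0.3
U1, U2 = well(eta, 0), well(eta, 1)
QU2 = matmul(twin_rotation(eta), U2)
M = [[QU2[r][c] - U1[r][c] for c in range(3)] for r in range(3)]

# Hadamard compatibility: M should equal a (x) b with b = (1, 1, 0)/sqrt(2).
b = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]
a = [sum(M[r][c] * b[c] for c in range(3)) for r in range(3)]   # a = M b because |b| = 1
rank_one_residual = max(abs(M[r][c] - a[r] * b[c]) for r in range(3) for c in range(3))

# Average gradient of the laminate. F lies in SO(3) U_i only if F^T F = U_i^2,
# so a positive gap to all three U_i^2 shows that C is outside the wells.
C = [[lam * U1[r][c] + (1 - lam) * QU2[r][c] for c in range(3)] for r in range(3)]
CtC = matmul(transpose(C), C)
gap_to_wells = min(
    max(abs(CtC[r][c] - matmul(well(eta, i), well(eta, i))[r][c])
        for r in range(3) for c in range(3))
    for i in range(3)
)
print(rank_one_residual, gap_to_wells)
```

A residual at rounding level together with a gap of order one illustrates the two facts the argument relies on: the gradients $U_1$ and $Q U_2$ differ by a rank-1 matrix, while their average is not a zero of $W$.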

Moreover, $Du_j \in \{U_1, Q_1 U_2\}$ except in a small transition layer of volume $O(1/j)$ close to $\partial Q$ and

\[ I(u) = \int_Q W(C) \, dx > \liminf_{j \to \infty} I(u_j) = 0 \]

This inequality shows that the functional is not weakly-$*$ lower semicontinuous and therefore $W$ fails to be quasiconvex. The oscillations of $u_j$ on a scale $1/j$ are part of the mathematical model for the microstructures frequently observed in shape memory alloys. More generally, whenever $u$ is a Sobolev function on a domain $\Omega$ such that $Du$ takes only two values, say $Du \in \{A, B\}$, on open sets which are not empty and whose union is $\Omega$ (up to a set of measure zero), then the tangential continuity of the derivatives implies that the difference $A - B$ is a matrix of rank 1, $A - B = a \otimes b$, and that the interfaces between the regions with $Du = A$ and $Du = B$ are hyperplanes with normal parallel to $b$. This statement is usually referred to as "Hadamard's compatibility condition." Moreover, the pattern in Figure 3 is known as a "simple laminate" and the matrices $A$ and $B$ are said to be rank-1 connected.

Figure 3 Construction of a minimizing sequence $u_j$ with $Du_j \to \{A, B\}$ in measure and affine boundary conditions $u(x) = (\lambda A + (1 - \lambda) B) x$. The layers have widths $\lambda/j$ and $(1 - \lambda)/j$. Hadamard's compatibility condition requires that $A - B = a \otimes b$ is a rank-1 matrix and that the planar interfaces are perpendicular to $b$.

Relaxation

The discussion in the previous section shows that the variational problems related to models in materials science typically fail to be weakly lower semicontinuous. One approach which allows us to recover the macroscopic energy of the system and the macroscopic stress–strain relation is to pass to the relaxed variational problem which involves the quasiconvex envelope of the energy density $W$.

Definition 2 Let $W : M^{m \times n} \to \mathbb{R}$ be given. The function

\[ W^{qc} = \sup\{ f : f \le W, \ f \text{ quasiconvex} \} \]

is called the quasiconvex envelope of $W$. Equivalently,

\[ W^{qc}(F) = \inf_{\varphi \in W_0^{1,\infty}(\Omega; \mathbb{R}^m)} \frac{1}{|\Omega|} \int_\Omega W(F + D\varphi) \, dx \]

This formula implies that $W^{qc}$ is the macroscopic energy of the system in the sense that it characterizes the smallest energy per unit volume that is required to subject a volume element to a deformation with affine boundary conditions. Here the system is allowed to minimize its energy with microstructures at any scale, a mechanism which was already explored in the previous section. The arguments given there prove that $W^{qc}(C) = 0$ and this shows that the zero set of $W^{qc}$ can be strictly larger than the zero set $K$ of $W$, see Definition 4. The relaxed functional is given by

\[ I^{qc}(u) = \int_\Omega W^{qc}(Du) \, dx \]

Since $W^{qc}$ satisfies the growth and coercivity conditions [4] if they are satisfied by $W$, the functional $I^{qc}$ attains its minimum subject to given boundary conditions. The functional $I^{qc}$ is the weakly lower semicontinuous envelope of $I$ in the sense that minimizing sequences for $I$ contain subsequences that converge to minimizers of $I^{qc}$ and for all $u$ there exists a sequence $u_j$ which converges weakly in $W^{1,p}(\Omega; \mathbb{R}^m)$ to $u$ such that the energies converge, $I(u_j) \to I^{qc}(u)$. However, a lot of information, in particular about oscillation patterns, might be lost in the passage from $I$ to $I^{qc}$ since the knowledge of a minimizer $u$ for $I^{qc}$ does not provide any immediate information about the behavior of any minimizing sequence for $I$ that converges to $u$. Moreover, the minimization problem required in the definition of the relaxed energy has been solved explicitly only for very special energy densities.

In this context, one often relies on two related notions of convexity, one sufficient and the other necessary for quasiconvexity. For $F \in M^{m \times n}$ let $M(F) \in \mathbb{R}^{d(m,n)}$ be the vector of all minors (subdeterminants) of $F$. In the special case $m = n = 2$ we have $M(F) = (F, \det F) \in \mathbb{R}^5$ and for $m = n = 3$ we find $M(F) = (F, \mathrm{cof}\, F, \det F) \in \mathbb{R}^{19}$ where $\mathrm{cof}\, F$ is the $3 \times 3$ matrix of all $2 \times 2$ subdeterminants of $F$.

Definition 3 Let $W : M^{m \times n} \to \mathbb{R}$ be given. The function $W$ is said to be polyconvex if there exists a convex function $g : \mathbb{R}^{d(m,n)} \to \mathbb{R}$ such that $W(F) = g(M(F))$. The function $W$ is rank-1 convex if it is convex along all rank-1 lines in $M^{m \times n}$, that is, the function $t \mapsto W(F + tR)$ is convex for all $F \in M^{m \times n}$ and all $R \in M^{m \times n}$ with $\mathrm{rank}(R) = 1$.
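A standard example that separates these notions is the determinant on $2 \times 2$ matrices: it is polyconvex because it is itself a minor (take $g$ linear), hence rank-1 convex, but it is not convex. The sketch below tests this with second differences along a rank-1 direction and along a full-rank direction; the matrices and the helper names are hypothetical choices for illustration.

```python
def det2(F):
    return F[0][0] * F[1][1] - F[0][1] * F[1][0]

def along(F, R, t):
    """The matrix F + t R."""
    return [[F[i][j] + t * R[i][j] for j in range(2)] for i in range(2)]

def second_difference(W, F, R, t=0.0, h=1.0):
    """W(F+(t+h)R) - 2 W(F+tR) + W(F+(t-h)R); it is >= 0 whenever t -> W(F+tR) is convex."""
    return W(along(F, R, t + h)) - 2 * W(along(F, R, t)) + W(along(F, R, t - h))

F = [[1.0, 2.0], [0.5, -1.0]]
rank_one = [[2.0, 4.0], [1.0, 2.0]]     # a (x) b with a = (2, 1) and b = (1, 2)
full_rank = [[1.0, 0.0], [0.0, -1.0]]   # determinant -1, hence rank 2

d_rank_one = second_difference(det2, F, rank_one)
d_full_rank = second_difference(det2, F, full_rank)

# det(F + tR) = det F + t * (linear term) + t^2 det R, so the second
# difference equals 2 h^2 det R: zero on rank-1 lines, negative here otherwise.
print(d_rank_one, d_full_rank)
```

Along the rank-1 direction the determinant is affine, so it is rank-1 convex (and even rank-1 affine), while the negative second difference along the full-rank direction rules out convexity.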

All notions of convexity reduce to classical convexity if $m = 1$ or $n = 1$. In the vector-valued case $m, n > 1$ the following implications are true:

\[ f \text{ convex} \Rightarrow f \text{ polyconvex} \Rightarrow f \text{ quasiconvex} \Rightarrow f \text{ rank-1 convex} \]

The reverse statements for the first two implications are not true. Rank-1 convexity does not imply quasiconvexity for $m \ge 3$ and it is a fundamental open problem with deep connections to harmonic analysis to decide whether rank-1 convexity and quasiconvexity are equivalent for $m = n = 2$.

The polyconvex and the rank-1 convex envelope of an energy density $W$ are defined analogously to Definition 2. In view of the implications between the different notions of convexity, one has $W^{pc} \le W^{qc} \le W^{rc}$ and essentially all explicitly known relaxation formulas are based on the approach to construct a candidate $\widetilde{W}$ for $W^{rc}$ and to verify that $\widetilde{W}$ is polyconvex. Then the inequalities become equalities and one obtains a characterization for the relaxed energy. This approach does not work for extended-valued functions which are used in models for incompressible materials since quasiconvexity does not imply rank-1 convexity in this case. However, for a model system of particular interest, nematic elastomers, a complete characterization of the relaxed energy, the macroscopic stress–strain relation, and the macroscopic phase diagram have been obtained.

Classical and Generalized Minimizers

The discussion of observed configurations as elements of minimizing sequences $\{u_j\}$ in the section "The direct method in the calculus of variations" leaves the question of the existence of minimizers open. The answer cannot be obtained via the direct methods since minimizing sequences do not need to converge strongly to minimizers. One approach to obtain the existence of solutions $u$ with $I(u) = 0$ is to solve the differential relation $Du \in K$, $u(x) = Fx$ on $\partial\Omega$, by constructing special minimizing sequences that converge strongly so that one can pass to the limit in the energy integral. This idea has led to surprising solutions $u$ with affine boundary conditions for the two-well problem where $K = SO(2)\,\mathrm{diag}(\eta, 1/\eta) \cup SO(2)\,\mathrm{diag}(1/\eta, \eta)$. However, the structure of the solutions is intrinsically complicated in the sense that the phase boundary has infinite length unless the boundary conditions are given by $u(x) = Fx$ with $F \in K$.

More generally, the right tool to pass to the limit in nonlinear functions of $z_j = Du_j$ like the energy is the "Young measure" generated by a subsequence. It is given by a family of probability measures $\nu_x$ that provide statistical information about the distribution of the values of $z_j$ close to a given point $x$. The existence and the fundamental properties of Young measures are described in the following theorem. For simplicity we assume that the sequence $z_j$ is uniformly bounded.

Theorem 2 (Fundamental theorem on Young measures). Let $E \subset \mathbb{R}^n$ be measurable, $L^n(E) < \infty$, and let $z_j : E \to \mathbb{R}^d$ be a measurable and bounded sequence. Then there exists a subsequence $z_k$ and a weakly-$*$ measurable map $\nu : E \to \mathcal{M}(\mathbb{R}^d)$ such that the following assertions are true:

(i) The measures $\nu_x$ are non-negative probability measures.
(ii) If there exists a compact set $K$ such that $z_k \to K$ in measure, then $\mathrm{supp}\, \nu_x \subset K$ for a.e. $x \in E$.
(iii) If $f \in C(\mathbb{R}^d)$ and if $f(z_k)$ is relatively weakly compact in $L^1(E)$, then $f(z_k) \rightharpoonup \bar{f}$ in $L^1(E)$ where $\bar{f}(x) = \langle \nu_x, f \rangle$.

Here $\langle \nu_x, f \rangle$ denotes the integration of the function $f$ with respect to the measure $\nu_x$. For example, the sequence $Du_j$ constructed in the section "The direct method in the calculus of variations" generates the Young measure $\nu_x = \lambda \delta_A + (1 - \lambda) \delta_B$ with $A = U_1$ and $B = Q_1 U_2$ (see Figure 3) and

\[ I(u_j) = \int_\Omega W(Du_j) \, dx \to \int_\Omega \int_{M^{m \times n}} W(Y) \, d\nu_x(Y) \, dx = 0 \]

A Young measure generated by a sequence of gradients is called a gradient Young measure (GYM). It is said to be homogeneous if $\nu_x = \nu$ is independent of $x$. We restrict our attention in the following to homogeneous GYMs generated by sequences that are bounded in $L^\infty$. The importance of quasiconvexity is also reflected in the following characterization of homogeneous GYMs.

Theorem 3 A non-negative probability measure $\nu$ is a GYM if and only if there exists a compact set $K \subset M^{m \times n}$ with $\mathrm{supp}\, \nu \subset K$ and Jensen's inequality $\langle \nu, f \rangle \ge f(\langle \nu, \mathrm{id} \rangle)$ holds for all quasiconvex functions $f : M^{m \times n} \to \mathbb{R}$.

This motivates characterizing the generalized limits of minimizing sequences as

\[ \mathcal{M}^{qc}(K) = \{ \nu \in \mathcal{M}(K) : f(\langle \nu, \mathrm{id} \rangle) \le \langle \nu, f \rangle \ \text{for all } f : M^{m \times n} \to \mathbb{R} \text{ quasiconvex} \} \]

where $\mathcal{M}(K)$ is the set of all probability measures supported on $K$. If $\nu$ is generated by a sequence of

functions with affine boundary conditions $u_j(x) = Fx$, then $\langle \nu, \mathrm{id} \rangle = F$. The set of all affine deformations of the material that can be recovered by heating (shape memory effect) is therefore given as the set of all centers of mass of homogeneous GYMs supported on $K$, the so-called "quasiconvex hull" $K^{qc}$ of $K$.

Definition 4 Suppose that $K \subset M^{m \times n}$ is compact. We define the quasiconvex hull of $K$ by

\[ K^{qc} = \{ F = \langle \nu, \mathrm{id} \rangle : \nu \in \mathcal{M}^{qc}(K) \} \]

There are several equivalent definitions of $K^{qc}$. The foregoing definition corresponds to the definition of the convex hull of a set as the set of all centers of mass of probability measures supported on $K$ (which satisfy Jensen's inequality for all convex $f$). The set $K^{qc}$ can also be defined as the set of all points that cannot be separated by quasiconvex functions from $K$ or as the zero set of the quasiconvex envelope of the distance function to $K$. The "polyconvex hull" $K^{pc}$ and the "rank-1 convex hull" $K^{rc}$ are defined analogously by replacing quasiconvexity with polyconvexity and rank-1 convexity in the foregoing definitions. It follows that $K^{rc} \subset K^{qc} \subset K^{pc}$ and all of these inclusions can be strict.

A particularly useful set of conditions are the minors conditions

\[ \langle \nu, M \rangle = M(\langle \nu, \mathrm{id} \rangle) \]

for all minors $M$ which follow from the weak continuity of the minors. For example, if $K = \{A, B\} \subset M^{2 \times 2}$, then any probability measure supported on $K$ is given by $\nu = \lambda \delta_A + (1 - \lambda) \delta_B$. The minors condition with $M = \det$ implies that

\[ \det(\lambda A + (1 - \lambda) B) = \det \langle \nu, \mathrm{id} \rangle = \langle \nu, \det \rangle = \lambda \det A + (1 - \lambda) \det B \]

This identity is equivalent to

\[ \lambda (1 - \lambda) \det(A - B) = 0 \]

and therefore the quasiconvex hull is equal to $K$ if and only if $\det(A - B) \neq 0$. A very instructive example is the set $K = \{(-1, -3), (1, 3), (-3, 1), (3, -1)\}$ viewed as a subset of the space of all diagonal matrices in $M^{2 \times 2}$. It is frequently referred to as a $T_4$ configuration. The rank-1 convex hull is equal to the quasiconvex hull and given by the four points, the line segments, and the square in the center, the polyconvex hull is bounded by four hyperbolic arcs, and the convex hull is the square with the points as corners, see Figure 4. It is remarkable that the rank-1 convex hull is strictly larger than the set $K$ itself despite the fact that the set $K$ does not contain any rank-1 connections.

Figure 4 The four-point subset $K$ in the space of all diagonal matrices and its convex hulls: $K^{rc} = K^{qc}$ are given by $K$, the line segments and the shaded square, $K^{pc}$ is bounded by the dashed hyperbolic arcs, and the convex hull is the outer square.

There are only a few examples in which explicit characterizations of the convex hulls for sets invariant under $SO(n)$ have been obtained. For $K = SO(3) U_1 \cup SO(3) U_2$ (see [2]), one finds

\[ K^{qc} = \left\{ F \in M^{3 \times 3} : F^T F = \begin{pmatrix} a & c & 0 \\ c & b & 0 \\ 0 & 0 & 1/\eta^2 \end{pmatrix}, \ ab - c^2 = \eta^2, \ a + b + 2|c| \le \eta^4 + \frac{1}{\eta^2} \right\} \]

The quasiconvex hull of the three-well problem [1] is not known. In two dimensions one finds for

\[ K = SO(2) U_1 \cup \dots \cup SO(2) U_n, \qquad \det U_i = 1, \ i = 1, \dots, n \]

that

\[ K^{qc} = \left\{ F \in M^{2 \times 2} : \det F = 1, \ |Fe|^2 \le \max_{i = 1, \dots, n} |U_i e|^2 \right\} \]

All examples in which envelopes of functions or hulls of sets have been obtained explicitly are based on the exceptional property that the polyconvex envelope coincides with the rank-1 convex envelope. The $T_4$ configuration in Figure 4 is one of the few cases where the quasiconvex hull is known to be different from the polyconvex hull. The construction of quasiconvex functions and the understanding of their properties is one of the challenges left for the future.

Bibliographical Remarks

This article can only review some of the highlights of mathematical developments related to models in

nonlinear elasticity for solid–solid phase transformations based on a huge body of work in the original literature. The precise references can be found in the extensive bibliographies of the books and review articles that are cited in the subsequent section, in particular in Ball (2004), Bhattacharya (2003), Dolzmann (2003), James and Hane (2000), and Müller (1999). This article focuses on models for single crystals; the behavior of polycrystals (which strongly depends on the amount of symmetry breaking in the transformation) was studied by Bhattacharya and Kohn.

The formulation of solid–solid phase transformations via nonlinear continuum theory goes back to Ericksen and the analysis via tools in the calculus of variations was initiated by Ball and James, Chipot and Kinderlehrer, and Fonseca. The Russian school developed the theory in linear elasticity in the 1960s, see Khachaturyan (1983) for a review. A detailed discussion of the crystallographic and group-theoretical aspects is contained in Pitteri and Zanzotto (2002).

Quasiconvexity was introduced by Morrey (1966) and his results were extended to Carathéodory integrands by Acerbi and Fusco and Marcellini. A modern treatment including Dacorogna’s relaxation theorem and a summary of the various notions of convexity and their properties can be found in Dacorogna (1989). Šverák proved that rank-1 convexity does not imply quasiconvexity for $m \ge 3$ and Milton modified his example to show that the rank-1 convex hull of a set can be strictly smaller than its quasiconvex hull. The explicit characterizations for nematic elastomers were obtained by DeSimone and Dolzmann.

Lipschitz solutions to differential inclusions were constructed by Müller and Šverák based on Gromov’s concept of convex integration, by Dacorogna and Marcellini using Baire’s category argument, and by Kirchheim in the framework of Banach–Mazur games. The structure of solutions of the two-well problem with finite surface energy was analyzed by Dolzmann and Müller. Young measures (also called parametrized measures or chattering controls) were originally introduced as generalized solutions for optimal control problems which do not admit classical solutions (Young 1969). Tartar (1979) introduced Young measures as a fundamental tool for the analysis of oscillation effects in partial differential equations and for the passage from microscopic to macroscopic models. Gradient Young measures were characterized by Kinderlehrer and Pedregal. The four-point configuration was discovered independently in various contexts by several authors including Scheffer, Aumann and Hart, Casadio Tarabusi, Tartar, and Milton and Nesi. The characterization of the quasiconvex hull uses a quasiconvex function constructed by Šverák. The quasiconvex hull of the two-well problem in 3D was found by Ball and James, and the generalization to n wells in 2D by Bhattacharya and Dolzmann.

Acknowledgments

The work of G Dolzmann was supported by the NSF through grants DMS0405853 and DMS0104118.

See also: Gamma-Convergence and Homogenization; Variational Techniques for Ginzburg–Landau Energies.

Further Reading

Ball JM (2004) Mathematical models of martensitic microstructure. Materials Science and Engineering A 378(1–2): 61–69.
Bhattacharya K (2003) Microstructure of Martensite. Oxford: Oxford University Press.
Dacorogna B (1989) Direct Methods in the Calculus of Variations. Berlin: Springer.
Dolzmann G (2003) Variational Methods for Crystalline Microstructure: Analysis and Computation, Lecture Notes in Mathematics, vol. 1803. Berlin: Springer.
James RD and Hane KF (2000) Martensitic transformations and shape-memory materials. Acta Materialia 48: 197–222.
Khachaturyan A (1983) Theory of Structural Transformations in Solids. New York: Wiley.
Morrey CB (1966) Multiple Integrals in the Calculus of Variations. Berlin: Springer.
Müller S (1999) Variational methods for microstructure and phase transitions. Proc. C.I.M.E. Summer School ‘‘Calculus of Variations and Geometric Evolution Problems,’’ Cetraro, 1996, Lecture Notes in Mathematics, vol. 1713. Berlin–Heidelberg: Springer.
Pitteri M and Zanzotto G (2002) Continuum Models for Phase Transitions and Twinning in Crystals. London: Chapman and Hall.
Tartar L (1979) Compensated compactness and partial differential equations. In: Knops R (ed.) Nonlinear Analysis and Mechanics: Heriot–Watt Symposium, vol. IV. London: Pitman.
Young LC (1969) Lectures on the Calculus of Variations and Optimal Control Theory. Philadelphia–London–Toronto: Saunders.

Vertex Operator Algebras see Two-Dimensional Conformal Field Theory and Vertex Operator Algebras

Viscous Incompressible Fluids: Mathematical Theory


J G Heywood, University of British Columbia, Vancouver, BC, Canada
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The Navier–Stokes equations

\[ \rho\,(u_t + u\cdot\nabla u) = -\nabla p + \mu\,\Delta u + f \qquad [1] \]

\[ \nabla\cdot u = 0 \qquad [2] \]

provide the simplest model for the motion of a viscous incompressible fluid that is consistent with the principles of mass and momentum conservation, and with Stokes’ hypothesis that the internal forces due to viscosity must be invariant with respect to any superimposed rigid motion of the reference frame. Despite their simplicity, they seem to govern the motion of air, water, and many other fluids very accurately over a wide range of conditions. Thus, their mathematical theory is central to the rigorous analysis of many experimental observations, from the asymptotics of steady wakes and jets, to the dynamics of convection cells, vortex shedding, and turbulence. During the last 80 years, a great deal of progress has been made on both the basic mathematical theory of the equations and on its application to the understanding of such phenomena. But one of the most important matters, that of estimating the regularity of solutions over long periods of time, remains a vexing and fascinating challenge. Such an estimate will almost certainly be needed to prove the ‘‘global’’ existence of smooth solutions. By that we mean the existence of smooth solutions of the initial-value problem over indefinitely long periods of time without any restriction on the ‘‘size’’ of the data. To date we can prove the ‘‘local’’ existence of smooth solutions, but there remains a concern that if the data are large, solutions may develop singularities within a finite period of time. In fact, there is a great deal more at issue than this question of existence. A regularity estimate is required to prove the reliability of the equations as a predictive model. That is because any estimate for the continuous dependence of solutions on the prescribed data for a problem depends upon a regularity estimate, as do error estimates for numerical approximations. A global estimate for the regularity of solutions is also required for a mathematically rigorous theory of turbulence. In fact, it may be hoped that the insight which ultimately yields a global regularity estimate will also be pivotal to our understanding of turbulence, perhaps justifying Kolmogorov theory; see Heywood (2003). In this article we aim to present a relatively simple approach to the local existence, uniqueness, and regularity theory for the initial boundary value problem for the Navier–Stokes equations, and to discuss some observations that bear on the question of global regularity. A wider-ranging review of open problems is given in Heywood (1990), and further observations concerning the problem of global regularity are given in Heywood (1994).

Setting the Problem

To focus on core issues, we shall make some simplifying assumptions. The fluid under consideration will be assumed to completely fill (without free boundaries or vacuums) a bounded, connected, time-independent domain $\Omega \subset R^n$, n = 2 or 3, with smooth boundary $\partial\Omega$. We are mainly interested in the three-dimensional case, but comparisons with the two-dimensional case are illuminating. The $R^n$-valued velocity $u(x,t) = (u_1(x,t), \dots, u_n(x,t))$ and R-valued pressure $p(x,t)$ are functions of the position $x = (x_1, \dots, x_n) \in \Omega$ and time $t \ge 0$. Equation [1] is an expression of Newton’s second law of motion, equating mass density times acceleration on the left with several force densities on the right, due to pressure and viscosity, and sometimes a prescribed external force f. Written in full, using the summation convention over repeated indices, its ith component is

\[ \rho\left(\frac{\partial u_i}{\partial t} + u_j\frac{\partial u_i}{\partial x_j}\right) = -\frac{\partial p}{\partial x_i} + \mu\,\frac{\partial^2 u_i}{\partial x_j^2} + f_i \]

We will assume the density $\rho$ and the coefficient of viscosity $\mu$ are positive constants.

In this article, we consider the initial boundary value problem consisting of the equations [1], [2] together with the initial and boundary conditions

\[ u|_{t=0} = u_0, \qquad u|_{\partial\Omega} = 0 \qquad [3] \]

The initial velocity $u_0(x)$ is prescribed. It will be assumed to possess whatever smoothness is convenient, and to satisfy $\nabla\cdot u_0 = 0$ and $u_0|_{\partial\Omega} = 0$. The boundary condition is a reasonable one, since fluids adhere to rigid surfaces.

Notice that a further condition would be needed to uniquely determine the pressure, since only its derivatives appear in the problem as posed. We prefer to do without auxiliary conditions for the pressure, and to refer to u by itself as a solution of

the problem provided there exists a scalar function p which together with u satisfies [1]–[3]. The problem is said to be uniquely solvable if there is a unique solution u, in which case the gradient of the pressure is also uniquely determined, along with the pressure up to a constant. Notice also that under our assumptions a potential force like gravity has no effect on u. If u solves the problem in the absence of such a force, then the inclusion of the force affects only the pressure, from which the potential must be subtracted. It turns out that the inclusion of a prescribed nonpotential force, while complicating many of the estimates below, does not affect in any essential way those parts of the theory to be presented here. Thus, for simplicity, we shall henceforth assume that $f \equiv 0$.

Reynolds Number

We can make a slight further simplification of eqn [1] by rescaling, with the objective of setting $\rho = 1$, or even $\rho = 1$ and $\mu = 1$. This scaling is not required for the existence theory we are presenting, but provides an important insight for the study of stability, bifurcation, and turbulence. The Reynolds number

\[ \frac{\max|u| \cdot |\Omega| \cdot \rho}{\mu} \]

plays an important role in rescaling. It expresses the ratio of the inertial to viscous effects. The notation $|\Omega|$ represents a characteristic length, such as the minimum diameter of a bounded domain. Generally speaking, a high Reynolds number corresponds to what is meant by ‘‘large’’ data, and the higher the Reynolds number the more inclined a flow is to instability and turbulence, and perhaps to the development of singularities. However, the size of the Reynolds number has precise implications only in comparing ‘‘dynamically similar’’ flows. We say that two vector fields v(x, t) and u(x, t) are dynamically similar if and only if $v(x,t) = \sigma u(x/\lambda, t/\tau)$ for some $\sigma, \lambda, \tau > 0$. In such a case, if u is defined in $\Omega \times [0, T)$, then v will be defined in $\lambda\Omega \times [0, \tau T)$, where $\lambda\Omega = \{\lambda x : x \in \Omega\}$. Furthermore, if u satisfies the Navier–Stokes equations, then v will satisfy

\[ \rho\left(\tau\sigma^{-1} v_t + \lambda\sigma^{-2}\, v\cdot\nabla v\right) = -\lambda\nabla p(x/\lambda, t/\tau) + \mu\,\sigma^{-1}\lambda^{2}\,\Delta v \qquad [4] \]

which has the form of the Navier–Stokes equations if and only if the coefficients of the two inertial terms on the left-hand side are equal. That is, if and only if

\[ \lambda = \sigma\tau \qquad [5] \]

in which case

\[ v_t + v\cdot\nabla v = -\nabla q + \nu\,\Delta v \qquad [6] \]

with

\[ \nu = \sigma\lambda\mu/\rho \qquad [7] \]

and $q(x,t) = \sigma^{2}\rho^{-1} p(x/\lambda, t/\tau)$. We refer to such u and v as dynamically similar flows. The relation [7], that follows from [5], is equivalent to the equality of the Reynolds numbers for the two flows,

\[ R(u) = \frac{\max|u| \cdot |\Omega| \cdot \rho}{\mu} = \frac{\max|v| \cdot |\lambda\Omega| \cdot 1}{\nu} = R(v) \]

The condition [5] can be satisfied simultaneously with the condition $\nu = 1$. For example, one may choose $\lambda = 1$, $\sigma = \rho/\mu$, and $\tau = \mu/\rho$. This achieves a rescaling of the equation to

\[ v_t + v\cdot\nabla v = -\nabla q + \Delta v \qquad [8] \]

without changing the domain. Different Reynolds numbers result from varying the magnitude of the velocity. In what follows, we will work with the Navier–Stokes equation in this simplest possible form.

Continuous Dependence on the Data

We begin our investigation of the initial boundary value problem

\[ u_t + u\cdot\nabla u = -\nabla p + \Delta u, \quad \nabla\cdot u = 0 \quad \text{for } (x,t) \in \Omega \times (0,\infty), \qquad u|_{t=0} = u_0, \quad u|_{\partial\Omega} = 0 \qquad [9] \]

by considering two smooth solutions, say u and v, taking possibly different initial values $u_0$ and $v_0$. Let their difference be $w = v - u$, with initial value $w_0$, and let q be the difference of the corresponding pressures. Then, subtracting one equation from the other, one obtains

\[ w_t + w\cdot\nabla w + u\cdot\nabla w + w\cdot\nabla u = -\nabla q + \Delta w \qquad [10] \]

Multiplying this by w, integrating over $\Omega$, and integrating by parts, one then obtains

\[ \frac{1}{2}\frac{d}{dt}\|w\|^2 + \|\nabla w\|^2 = -(w\cdot\nabla u, w) \qquad [11] \]

where

\[ \|w\|^2 = \int_\Omega w^2\,dx, \qquad \|\nabla w\|^2 = \int_\Omega \frac{\partial w_i}{\partial x_j}\frac{\partial w_i}{\partial x_j}\,dx, \qquad (w\cdot\nabla u, w) = \int_\Omega w_j\frac{\partial u_i}{\partial x_j}\,w_i\,dx \]

since (and this should further explain our notation)

\[ (w_t, w) = \int_\Omega w_t\cdot w\,dx = \frac{1}{2}\frac{d}{dt}\int_\Omega w^2\,dx = \frac{1}{2}\frac{d}{dt}\|w\|^2 \]

\[ (\Delta w, w) = \int_\Omega \frac{\partial^2 w_i}{\partial x_j^2}\,w_i\,dx = -\int_\Omega \frac{\partial w_i}{\partial x_j}\frac{\partial w_i}{\partial x_j}\,dx = -\|\nabla w\|^2 \]

\[ (\nabla q, w) = \int_\Omega \frac{\partial q}{\partial x_i}\,w_i\,dx = -\int_\Omega q\,\frac{\partial w_i}{\partial x_i}\,dx = 0 \]

\[ (u\cdot\nabla w, w) = \int_\Omega u_j\frac{\partial w_i}{\partial x_j}\,w_i\,dx = -\frac{1}{2}\int_\Omega \frac{\partial u_j}{\partial x_j}\,w_i w_i\,dx = 0 \]

and similarly $(w\cdot\nabla w, w) = 0$. In deriving these we have used the fact that the vector fields are divergence free and vanish on the boundary. In the following, we will use such identities without further mention.

We can estimate the nonlinear term on the right-hand side of [11] by using the ‘‘Sobolev inequalities’’

\[ \|\phi\|_4^2 \le \|\phi\|\,\|\nabla\phi\| \ \text{ if } n = 2, \qquad \|\phi\|_4^2 \le \|\phi\|^{1/2}\|\nabla\phi\|^{3/2} \ \text{ if } n = 3 \qquad [12] \]

proved by Ladyzhenskaya (1969), though with larger constants. These are valid for any smooth function $\phi$ which vanishes on the boundary of $\Omega$. It may be either scalar or vector valued. The norms on the left are $L^4$-norms; we use the notation $\|\phi\|_p = (\int_\Omega |\phi|^p\,dx)^{1/p}$ for any p > 1, but usually drop the subscript when p = 2. Using first Hölder’s inequality and then [12], one obtains

\[ |(w\cdot\nabla u, w)| \le \|w\|_4^2\,\|\nabla u\| \le \begin{cases} \|w\|\,\|\nabla w\|\,\|\nabla u\| & \text{if } n = 2 \\ \|w\|^{1/2}\|\nabla w\|^{3/2}\|\nabla u\| & \text{if } n = 3 \end{cases} \]

Young’s inequality

\[ ab \le \frac{1}{p}a^p + \frac{1}{q}b^q \]

holds if a, b > 0, p, q > 1 and 1/p + 1/q = 1. Taking $a = \sqrt{2}\,\|\nabla w\|$, along with p = q = 2 in the two-dimensional case, and $a = (4/3)^{3/4}\|\nabla w\|^{3/2}$, along with p = 4/3, q = 4 in the three-dimensional case, one obtains

\[ |(w\cdot\nabla u, w)| \le \begin{cases} \|\nabla w\|^2 + \frac{1}{4}\|\nabla u\|^2\|w\|^2, & \text{if } n = 2 \\ \|\nabla w\|^2 + \frac{27}{256}\|\nabla u\|^4\|w\|^2, & \text{if } n = 3 \end{cases} \qquad [13] \]

Using these estimates for the right-hand side of [11], we obtain linear differential inequalities for $\|w\|^2$ that are easily integrated to give

\[ \|w(t)\|^2 \le \begin{cases} \|w_0\|^2 \exp\int_0^t \frac{1}{2}\|\nabla u\|^2\,d\tau, & \text{if } n = 2 \\ \|w_0\|^2 \exp\int_0^t \frac{27}{128}\|\nabla u\|^4\,d\tau, & \text{if } n = 3 \end{cases} \qquad [14] \]

It follows that if we can estimate the integrals on the right, which concern only the solution u, and if v is a second solution, perhaps differing only slightly from u when t = 0, then we can estimate the difference $\|v(t) - u(t)\|$ at later times. Moreover, at any particular time this difference will be bounded proportionally to $\|v(0) - u(0)\|$. The integral on the right-hand side of the two-dimensional version of [14] is easily estimated using the energy estimate [16] below. The estimation of the corresponding integral in the three-dimensional case, without a restriction on the size of the data, remains an open problem. It can be regarded as the most important open problem in the Navier–Stokes theory. It would never be enough to somehow prove that solutions are smooth without estimating this integral, or something equivalent to it. Of course, if solutions were known to be smooth one could infer their uniqueness from [14], since smoothness would imply that the integrals are finite, which is enough to conclude that $\|w(t)\|$ is zero if $\|w_0\|$ is zero.

Energy Estimate

If one multiplies the Navier–Stokes equation for u by u, and proceeds as in deriving [11], one obtains

\[ \frac{1}{2}\frac{d}{dt}\|u\|^2 + \|\nabla u\|^2 = 0 \qquad [15] \]

and hence

\[ \frac{1}{2}\|u(t)\|^2 + \int_0^t \|\nabla u\|^2\,d\tau = \frac{1}{2}\|u_0\|^2 \qquad [16] \]
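
The constants 1/4 and 27/256 in [13] come purely from Young's inequality applied to the Hölder and Sobolev bound that precedes it. As a minimal sketch of that arithmetic (the function names below are hypothetical and chosen only for this illustration), the following Python code treats the three norms $\|w\|$, $\|\nabla w\|$, $\|\nabla u\|$ as arbitrary positive numbers and checks that the pre-Young bound never exceeds the right-hand side of [13]. It is a numerical sanity check, not a proof.

```python
import math
import random


def holder_sobolev_bound(w, grad_w, grad_u, n):
    """Bound on |(w . grad u, w)| from Hoelder's inequality and [12].

    The arguments stand for the numbers ||w||, ||grad w||, ||grad u||.
    """
    if n == 2:
        return w * grad_w * grad_u
    if n == 3:
        return math.sqrt(w) * grad_w ** 1.5 * grad_u
    raise ValueError("n must be 2 or 3")


def young_bound(w, grad_w, grad_u, n):
    """Right-hand side of [13], obtained after Young's inequality."""
    if n == 2:
        return grad_w ** 2 + 0.25 * grad_u ** 2 * w ** 2
    if n == 3:
        return grad_w ** 2 + (27.0 / 256.0) * grad_u ** 4 * w ** 2
    raise ValueError("n must be 2 or 3")


def check_young(trials=10000, seed=0):
    """Return True if [13] dominates the pre-Young bound on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        w = rng.uniform(1e-3, 10.0)
        grad_w = rng.uniform(1e-3, 10.0)
        grad_u = rng.uniform(1e-3, 10.0)
        for n in (2, 3):
            lhs = holder_sobolev_bound(w, grad_w, grad_u, n)
            rhs = young_bound(w, grad_w, grad_u, n)
            # Small relative tolerance guards against rounding error only.
            if lhs > rhs * (1.0 + 1e-12):
                return False
    return True


if __name__ == "__main__":
    print(check_young())
```

Equality in Young's inequality occurs when $a^p = b^q$. With $\|w\| = \|\nabla u\| = 1$ this happens at $\|\nabla w\| = 1/2$ for n = 2 and at $\|\nabla w\| = 9/16$ for n = 3, which shows that the constants in [13] cannot be lowered without changing the coefficient of $\|\nabla w\|^2$.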

This settles the matter of continuous dependence in the two-dimensional case. Together with [16], the two-dimensional version of [14] implies

\[ \|w(t)\|^2 \le \|w_0\|^2 \exp\left(\tfrac{1}{4}\|u_0\|^2\right), \quad \text{if } n = 2 \qquad [17] \]

We remark that the local rate of energy dissipation is $2|Du|^2$ rather than $|\nabla u|^2$, where Du is the deformation tensor $Du = (1/2)(\nabla u + (\nabla u)^T)$. However, integrating over the domain, and integrating by parts using the boundary condition $u|_{\partial\Omega} = 0$, one may verify that the rate of total energy dissipation $2\|Du\|^2$ equals $\|\nabla u\|^2$. For the purpose of this article, it is convenient to write the energy identity as [15].

Estimates for $\|\nabla u(t)\|$ Pointwise in Time

Of course, an estimate for $\|\nabla u(t)\|$ pointwise in time would imply an estimate for the integral of $\|\nabla u(t)\|^4$ on the right-hand side of [14]. We can prove such an estimate for at least a finite interval of time by an argument due to Prodi (1962). It requires, in preparation, some deep results concerning the regularity of solutions of the steady Stokes equations. These cannot be proved here, but we can briefly summarize what will be needed. Let

$L^2(\Omega)$ = space of vector fields $\phi$, with finite $L^2$-norms $\|\phi\|$,
$C_0^\infty(\Omega)$ = space of smooth vector fields with compact support in $\Omega$,
$D(\Omega) = \{\phi \in C_0^\infty(\Omega) : \nabla\cdot\phi = 0\}$,
$J(\Omega)$ = completion of $D(\Omega)$ in the $L^2$-norm $\|\phi\|$,
$J_1(\Omega)$ = completion of $D(\Omega)$ in the norm $\|\nabla\phi\|$,
$G(\Omega) = \{\nabla p : p \in L^2(\Omega) \text{ with } \nabla p \in L^2(\Omega)\}$, and
$P : L^2(\Omega) \to J(\Omega)$ be the $L^2$-projection of $L^2(\Omega)$ onto $J(\Omega)$,

and define the Sobolev $W_2^2(\Omega)$ norm by

\[ \|u\|^2_{W_2^2(\Omega)} = \|u\|^2 + \|\nabla u\|^2 + \int_\Omega \left|\partial^2 u_i/\partial x_j\partial x_k\right|^2 dx \]

Furthermore, observe that $(\nabla p, \phi) = 0$ for $\nabla p \in G(\Omega)$ and $\phi \in J(\Omega)$, since it holds if p is smooth and $\phi \in D(\Omega)$. Therefore, $P\nabla p = 0$, since $(P\nabla p, \phi) = (\nabla p, \phi) = 0$, for all $\phi \in J(\Omega)$. Later, when we need it, we will also argue that $L^2(\Omega) = J(\Omega) \oplus G(\Omega)$.

With these preparations, it is evident that every smooth vector field u satisfying $\nabla\cdot u = 0$ and $u|_{\partial\Omega} = 0$ can be regarded as a solution of the steady Stokes problem

\[ -\Delta u + \nabla p = f \ \text{ and } \ \nabla\cdot u = 0 \ \text{ in } \Omega, \qquad u|_{\partial\Omega} = 0 \qquad [18] \]

with $f = -P\Delta u$. For such solutions, and hence for all such u, we have the estimates

\[ \|u\|_{W_2^2(\Omega)} \le c\,\|P\Delta u\| \qquad [19] \]

and

\[ \sup_\Omega |u|^2 \le \begin{cases} c\,\|u\|\,\|P\Delta u\|, & \text{if } n = 2 \\ c\,\|\nabla u\|\,\|P\Delta u\|, & \text{if } n = 3 \end{cases} \qquad [20] \]

with constants independent of u. It can also be shown that every such vector field u belongs to $J_1(\Omega)$ and hence to $J(\Omega)$; see Heywood (1973).

Some history and remarks are in order. The inequality [19] was proved independently by Solonnikov (1964, 1966), and by Prodi’s student Cattabriga (1961). In fact, they gave $L^p$ versions of it for all orders of the derivatives. Several proofs specific to the $L^2$ case needed here have been given by Solonnikov and Ščadilov (1973) and by Beirão da Veiga (1997). The inequalities [20] can be proved by combining [19] with appropriate Sobolev inequalities, or better, by combining [19] with recent inequalities of Xie (1991) which are of precisely the form [20], but with $\Delta u$ instead of $P\Delta u$ on the right-hand side, and without the requirement that $\nabla\cdot u = 0$. The constant c in [19] depends upon the regularity of the boundary, and tends to infinity along with a bound for the boundary curvature. Through the work of Xie (1992, 1997), there is reason to believe that the inequalities [20] are probably valid for arbitrary domains, with the constant $c = (2\pi)^{-1}$ if n = 2, and $c = (3\pi)^{-1}$ if n = 3. Xie’s efforts to prove this have been continued by the author (Heywood 2001). If the inequalities [20] can be proved for arbitrary domains (i.e., arbitrary open sets), with these fixed constants, then the approach to Navier–Stokes theory presented in this article will extend immediately to arbitrary domains, as explained in Heywood and Xie (1997), with estimates independent of the domain.

We go on now with an estimation of $\|\nabla u(t)\|$ based on [20]. Multiplying the Navier–Stokes equation for u by $-P\Delta u$, and integrating over $\Omega$, one obtains

\[ \frac{1}{2}\frac{d}{dt}\|\nabla u\|^2 + \|P\Delta u\|^2 = (u\cdot\nabla u, P\Delta u) \le \sup_\Omega |u|\,\|\nabla u\|\,\|P\Delta u\| \qquad [21] \]

since $-(u_t, P\Delta u) = -(Pu_t, \Delta u) = -(u_t, \Delta u) = (\nabla u_t, \nabla u)$ and $(\nabla p, P\Delta u) = 0$.
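
The two properties of the projection P used above, namely that P annihilates gradients and that its range consists of divergence-free fields, can be made concrete in a simpler setting. The sketch below assumes a periodic box instead of the bounded domain $\Omega$ with $u|_{\partial\Omega} = 0$ considered in this article. That is an assumption made only for illustration, since on a bounded domain P also encodes a boundary condition and is not given by this formula. In the periodic case a vector field is a sum of Fourier modes $a\,e^{ik\cdot x}$, and the projection acts on each amplitude a by removing its component along the wave vector k. All function names are hypothetical.

```python
def project_mode(k, a):
    """Divergence-free part of the Fourier amplitude a for wave vector k.

    Computes a - k (k . a) / |k|^2; the mean mode k = 0 is left unchanged.
    """
    k2 = sum(kj * kj for kj in k)
    if k2 == 0:
        return tuple(a)
    k_dot_a = sum(kj * aj for kj, aj in zip(k, a))
    return tuple(aj - kj * k_dot_a / k2 for kj, aj in zip(k, a))


def gradient_mode(k, p_hat):
    """Amplitude of grad p for the scalar mode p_hat * exp(i k.x)."""
    return tuple(1j * kj * p_hat for kj in k)


def divergence_mode(k, a):
    """Amplitude of div(a exp(i k.x)), that is i k . a."""
    return sum(1j * kj * aj for kj, aj in zip(k, a))


def inner_product(a, b):
    """Single-mode L2 inner product (up to the volume factor): sum of a_j * conj(b_j)."""
    return sum(aj * bj.conjugate() for aj, bj in zip(a, b))


if __name__ == "__main__":
    k = (1, 2, 3)
    a = (0.5 + 1.0j, -2.0 + 0.0j, 1.5 - 0.5j)
    grad_p = gradient_mode(k, 0.7 - 0.2j)
    pa = project_mode(k, a)
    print(abs(divergence_mode(k, pa)))                # analogue of div(Pa) = 0
    print(max(abs(c) for c in project_mode(k, grad_p)))  # analogue of P grad p = 0
    print(abs(inner_product(grad_p, pa)))             # analogue of (grad p, phi) = 0
```

Each printed quantity should vanish up to rounding error. The third one is the single-mode analogue of the orthogonality $(\nabla p, \phi) = 0$ that underlies the decomposition $L^2(\Omega) = J(\Omega) \oplus G(\Omega)$.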

The right-hand side of [21] can be estimated using [20] and Young’s inequality:

\[ \sup_\Omega |u|\,\|\nabla u\|\,\|P\Delta u\| \le \begin{cases} c\,\|u\|^{1/2}\|\nabla u\|\,\|P\Delta u\|^{3/2}, & \text{if } n = 2 \\ c\,\|\nabla u\|^{3/2}\|P\Delta u\|^{3/2}, & \text{if } n = 3 \end{cases} \le \begin{cases} \frac{1}{2}\|P\Delta u\|^2 + c\,\|u\|^2\|\nabla u\|^4, & \text{if } n = 2 \\ \frac{1}{2}\|P\Delta u\|^2 + c\,\|\nabla u\|^6, & \text{if } n = 3 \end{cases} \]

Thus,

\[ \frac{d}{dt}\|\nabla u\|^2 + \|P\Delta u\|^2 \le \begin{cases} c\,\|u\|^2\|\nabla u\|^4, & \text{if } n = 2 \\ c\,\|\nabla u\|^6, & \text{if } n = 3 \end{cases} \qquad [22] \]

These differential inequalities are at the core of the present theory. Consider first the two-dimensional case. It can be viewed as a linear differential inequality

\[ \frac{d}{dt}\|\nabla u\|^2 \le \left(c\,\|u\|^2\|\nabla u\|^2\right)\|\nabla u\|^2 \qquad [23] \]

with a coefficient $c\,\|u\|^2\|\nabla u\|^2$ that is integrable, in view of the energy estimate [16]. Integrating it yields a ‘‘global’’ estimate; for all $t \ge 0$,

\[ \|\nabla u(t)\|^2 \le \|\nabla u_0\|^2 \exp\left(c\int_0^t \|u\|^2\|\nabla u\|^2\,d\tau\right) \le \|\nabla u_0\|^2 \exp\left(\tfrac{c}{2}\|u_0\|^4\right) \qquad [24] \]

However, if the three-dimensional version of [22] is viewed as a linear differential inequality, the coefficient to be integrated is $\|\nabla u(t)\|^4$. Thus, the same integral which is crucial to proving continuous dependence on the data is also crucial to proving regularity. What we can do in the three-dimensional case is view [22] as a nonlinear differential inequality of the form

\[ \varphi' \le c\,\varphi^3 \quad \text{or} \quad \varphi' \le c\,\|\nabla u\|^2\varphi^2 \qquad [25] \]

for $\varphi(t) = \|\nabla u(t)\|^2$. Integrating the first of these, one obtains a local estimate

\[ \|\nabla u(t)\|^2 \le \frac{\|\nabla u_0\|^2}{\sqrt{1 - 2c\,\|\nabla u_0\|^4\,t}} \qquad [26] \]

for

\[ 0 \le t < \frac{1}{2c\,\|\nabla u_0\|^4} \]

without any restriction on the size of the data. Integrating the second, one obtains a global estimate

\[ \|\nabla u(t)\|^2 \le \frac{\|\nabla u_0\|^2}{1 - c\,\|\nabla u_0\|^2\int_0^t \|\nabla u\|^2\,d\tau} \le \frac{\|\nabla u_0\|^2}{1 - (c/2)\|u_0\|^2\|\nabla u_0\|^2} \qquad [27] \]

valid for all $t \ge 0$, provided

\[ \|u_0\|^2\|\nabla u_0\|^2 < \frac{2}{c} \qquad [28] \]

This is a good interpretation of what we mean by ‘‘small data.’’ If Xie’s conjecture is correct, that the constant in the three-dimensional version of [20] is $c = (3\pi)^{-1}$, then we obtain [25]–[28] with the constant $c = 3/(128\pi^2)$. Thus, $2/c \approx 842$.

Further Regularity, Smoothing Estimates

Once one has an estimate of the form

\[ \|\nabla u(t)\| \le M(t), \quad \text{for } 0 \le t < T \qquad [29] \]

as provided by [24], [26], or [27], one can estimate the solution’s derivatives of all orders over the open time interval (0, T). The initial time t = 0 must be excluded from the interval, because the ‘‘imperfection’’ of prescribed data generally causes an impulsive acceleration along the boundary at time zero, resulting in a thin boundary layer in which the derivatives are so large that $\|\nabla u_t(t)\|$ and $\|u(t)\|_{W_2^3(\Omega)}$ tend to infinity as $t \to 0^+$. But the solution quickly smooths and remains smooth as long as [29] remains in force. Thus, our working assumption up to this point, that solutions are $C^\infty$ smooth in $\Omega \times [0, \infty)$, is not valid at t = 0. However, we will see that they are smooth in $\Omega \times (0, T)$ and continuous in $\Omega \times [0, T)$. They are also continuous on [0, T) in the $W_2^2(\Omega)$ norm. This is sufficient regularity to justify everything that we have done to this point.

In this section, we give estimates for the derivatives of all orders with respect to time, of u and its first- and second-order derivatives with respect to space. In the next section, we will prove an existence theorem by Galerkin approximation. It will be easily seen that all of the estimates proved in this and previous sections, for solutions that are assumed to be smooth, also hold for the approximations, without any unproven assumptions. Therefore, they will be inherited by the solution that is obtained upon passing to the limit of the approximations. At first, this solution will be something of a generalized solution, not fully classical, but one which is $C^\infty$ with respect to time over the interval 0 < t < T, in the $W_2^2(\Omega)$ norm. In a final step,

viewing u at any fixed time as a solution of the steady Stokes equations, we can apply regularity estimates for the Stokes equations to infer that it is $C^\infty$ in all variables throughout $\Omega \times (0, T)$, with specific estimates for each derivative.

The estimates of this section are obtained by integrating an infinite sequence of differential inequalities, for $\|u\|$, $\|\nabla u\|$, $\|u_t\|$, $\|\nabla u_t\|$, $\|u_{tt}\|$, $\|\nabla u_{tt}\|$, . . . . The first two are [15] and [21], which have already been dealt with. It turns out that after these first two, each succeeding differential inequality is linearized by the estimates obtained from its predecessor, which explains why the time intervals for these additional estimates do not become successively shorter. In fact, in the two-dimensional case, the energy estimate resulting from [15], which is valid for all time, already gives the linearization [23] of [21], which then provides an estimate valid for all time. Except for noting such differences between the two- and three-dimensional cases, we will henceforth deal with only the three-dimensional case.

The differential inequalities just mentioned are obtained by estimating the right-hand sides of two sequences of differential identities, and ordering them by an iteration between the two sequences. The first sequence begins with and is patterned after the energy identity,

\[ \begin{aligned} \frac{1}{2}\frac{d}{dt}\|u\|^2 + \|\nabla u\|^2 &= 0 \\ \frac{1}{2}\frac{d}{dt}\|u_t\|^2 + \|\nabla u_t\|^2 &= -(u_t\cdot\nabla u, u_t) \\ \frac{1}{2}\frac{d}{dt}\|u_{tt}\|^2 + \|\nabla u_{tt}\|^2 &= -(u_{tt}\cdot\nabla u, u_{tt}) - 2(u_t\cdot\nabla u_t, u_{tt}) \\ &\ \ \text{etc.} \end{aligned} \qquad [30] \]

while the second begins with and is patterned after Prodi’s identity,

\[ \begin{aligned} \frac{1}{2}\frac{d}{dt}\|\nabla u\|^2 + \|P\Delta u\|^2 &= (u\cdot\nabla u, P\Delta u) \\ \frac{1}{2}\frac{d}{dt}\|\nabla u_t\|^2 + \|P\Delta u_t\|^2 &= (u_t\cdot\nabla u, P\Delta u_t) + (u\cdot\nabla u_t, P\Delta u_t) \\ \frac{1}{2}\frac{d}{dt}\|\nabla u_{tt}\|^2 + \|P\Delta u_{tt}\|^2 &= (u_{tt}\cdot\nabla u, P\Delta u_{tt}) + \cdots \\ &\ \ \text{etc.} \end{aligned} \qquad [31] \]

Before going on, notice that we can return to [22] and use [29] to infer a more complete estimate of the form

\[ \|\nabla u(t)\|^2 + \int_0^t \|P\Delta u\|^2\,d\tau \le B(M, t), \quad \text{for } 0 \le t < T \qquad [32] \]

containing an integral of $\|P\Delta u\|^2$ on the left-hand side. We will use the notation B(M, t) generically, for any bound that depends only on the function M(t) and t. We remark that a term $\|u_t\|^2$ can also be included under the integral sign on the left-hand side of [32], because $\|u_t\|$ and $\|P\Delta u\|$ are of essentially the same order, being the leading terms in the projection $u_t + P(u\cdot\nabla u) = P\Delta u$ of the Navier–Stokes equation. Finally, one can also include $\|u\|_{W_2^2(\Omega)}$ under the integral sign, in view of [19].

Going on, we obtain a third differential inequality from the second identity of the sequence [30]. Its right-hand side admits the estimate

\[ -(u_t\cdot\nabla u, u_t) \le \|u_t\|_4^2\,\|\nabla u\| \le c\,\|u_t\|^{1/2}\|\nabla u_t\|^{3/2}\|\nabla u\| \le \frac{1}{2}\|\nabla u_t\|^2 + c\,\|\nabla u\|^4\|u_t\|^2 \qquad [33] \]

which, in view of [29] or [32], produces a linear differential inequality with integrable coefficients. Its integration yields an estimate of the form

\[ \|u_t(t)\|^2 + \int_0^t \|\nabla u_t\|^2\,d\tau \le B(M, t, \|u_t(0)\|), \quad \text{for } 0 \le t < T \qquad [34] \]

provided $\|u_t(0)\|$ is bounded. Since $u_t = P(\Delta u - u\cdot\nabla u)$, we have the estimate

\[ \|u_t(0)\| = \|P(\Delta u_0 - u_0\cdot\nabla u_0)\| \le \|\Delta u_0 - u_0\cdot\nabla u_0\| \le B\left(\|u_0\|_{W_2^2(\Omega)}\right) \qquad [35] \]

provided that u is smooth in $\Omega \times [0, T)$. This is a delicate point, having been forewarned of a regularity breakdown at t = 0. But, we will be able to replicate the estimate [35] for the Galerkin approximations, ultimately validating [34] for the approximations and the solution.

The integration of the next differential inequality, which arises from the second of the identities [31], requires that $\|\nabla u_t(0)\| < \infty$. Similarly to [35], we have

\[ \|\nabla u_t(0)\| = \|\nabla P(\Delta u_0 - u_0\cdot\nabla u_0)\| \le B\left(\|u_0\|_{W_2^3(\Omega)}\right) \qquad [36] \]

provided that u is smooth in $\Omega \times [0, T)$. However, there is a big difference between [35] and [36]. In the next section, we will not be able to obtain an analog of [36] for the Galerkin approximations. Consequently, the solution that is obtained will not be fully regular at time t = 0. It will satisfy $u \in C(\Omega \times [0, T)) \cap C^\infty(\Omega \times (0, T))$, but not $u \in C^\infty(\Omega \times [0, T))$. It will satisfy

$\|u(t) - u_0\|_{W_2^2(\Omega)} \to 0$ but not $\|u(t) - u_0\|_{W_2^3(\Omega)} \to 0$, as $t \to 0^+$.

One may wonder whether this is a fault or deficiency in the Galerkin method. It is not, remembering what was said at the beginning of this section. For most prescribed values of $u_0$, no matter how smooth, there is a breakdown in the regularity of the solution as $t \to 0^+$. In fact, it was proved in Heywood and Rannacher (1982) that if $\|\nabla u_t(t)\|$ or any one of several other quantities, including $\|u(t)\|_{W_2^3(\Omega)}$, remains bounded as $t \to 0^+$, then there exists a solution $p_0$ of the overdetermined Neumann problem

\[ \Delta p_0 = -\nabla\cdot(u_0\cdot\nabla u_0) \ \text{ in } \Omega, \qquad \nabla p_0|_{\partial\Omega} = \Delta u_0|_{\partial\Omega} \qquad [37] \]

Generically speaking, this problem is not solvable, and therefore

\[ \limsup_{t\to 0^+} \|\nabla u_t(t)\| = \infty \]

We mention that under our assumption that $u_0$ is smooth, the correctly posed Neumann problem, with boundary condition $\partial p_0/\partial n|_{\partial\Omega} = \Delta u_0\cdot n|_{\partial\Omega}$, is uniquely solvable for a solution $p_0 \in W_2^1(\Omega)/R$, and $\|\nabla p(t) - \nabla p_0\| \to 0$, as $t \to 0^+$; see Heywood and Rannacher (1982).

Since solutions are smooth for 0 < t < T, the pressure in the Navier–Stokes equations satisfies the overdetermined Neumann problem for all $t \in (0, T)$. So it may seem appropriate to require that the prescribed initial value $u_0$ be a function for which problem [37] is solvable. We do not agree with that. It is too difficult, if not impossible, to find such functions, except by solving the Navier–Stokes equations. For example, one might think that the condition that [37] should be solvable might be satisfied if $u_0 \in D(\Omega)$, since such functions are zero in a neighborhood of the boundary. In fact, K Masuda has shown that if $\Omega$ is a three-dimensional sphere, then the overdetermined Neumann problem [37] is never solvable for nonzero $u_0 \in D(\Omega)$. Hence, the gradient of the initial pressure will have a nonzero tangential component, causing an impulsive tangential acceleration along the boundary.

If we are to use the Navier–Stokes equations to make predictions of the future, we must solve the initial boundary value problem for ‘‘man-made’’ initial values, and accept the fact that there is a momentary breakdown in regularity along the boundary, immediately following the initial time. Thereafter, the solution smooths as ‘‘nature’’ takes over. To prove the reliability of our predictions, we need continuous dependence estimates and error estimates for numerical methods that take into account this initial breakdown in the regularity. The continuous dependence estimate [14] meets this requirement. So also do the error estimates given in a series of four papers by Rannacher and the author, beginning with Heywood and Rannacher (1982). They were based on the ‘‘smoothing’’ regularity estimates for solutions that are being presented here. We go on with these now, as models for similar estimates for the Galerkin approximations.

Estimating the right-hand side of the second of the identities [31] using [20] and Young’s inequality, and then multiplying through by t, we get the linear differential inequality

\[ \frac{d}{dt}\left(t\,\|\nabla u_t\|^2\right) + t\,\|P\Delta u_t\|^2 \le \|\nabla u_t\|^2 + c\left(\|\nabla u\|^4 + \|\nabla u\|^2 + \|P\Delta u\|^2\right) t\,\|\nabla u_t\|^2 \qquad [38] \]

for $t\,\|\nabla u_t\|^2$, with coefficients that are integrable in view of the previous estimates [32], [34], and [35]. Therefore, its integration yields an estimate analogous to [32] of the form

\[ t\,\|\nabla u_t(t)\|^2 + \int_0^t \tau\,\|P\Delta u_t\|^2\,d\tau \le B\left(M, t, \|u_0\|_{W_2^2(\Omega)}\right), \quad \text{for } 0 < t < T \qquad [39] \]

provided its ‘‘initial value’’ is finite. It is, due to the time weight, in the sense that

\[ \limsup_{t\to 0^+} \left(t\,\|\nabla u_t(t)\|^2\right) = 0 \qquad [40] \]

This is proved by noting that if the lim sup were positive, then the integral on the left-hand side of [34] would be infinite. Finally, a term $t\,\|u_{tt}\|^2$ can be included under the integral sign on the left-hand side of [39], because $\|u_{tt}\|$ and $\|P\Delta u_t\|$ are of essentially the same order, being the leading terms in the projection $u_{tt} + P(u_t\cdot\nabla u + u\cdot\nabla u_t) = P\Delta u_t$ of the time differentiated Navier–Stokes equation.

We continue inductively. Estimating the right-hand side of the third of the identities [30] using [12], [20], and Young’s inequality, and then multiplying through by $t^2$, we get the linear differential inequality

\[ \frac{d}{dt}\left(t^2\|u_{tt}\|^2\right) + t^2\|\nabla u_{tt}\|^2 \le 2t\,\|u_{tt}\|^2 + t^2\|\nabla u_t\|^2 + t^2\|P\Delta u_t\|^2 + c\left(\|\nabla u\|^4 + \|\nabla u_t\|^4\right) t^2\|u_{tt}\|^2 \qquad [41] \]

with coefficients that are integrable in view of preceding estimates. In particular, the integrability

of the first term on the right-hand side follows from the boundedness of the integral

$$\int_0^t \tau\,\|u_{tt}\|^2\,d\tau \qquad [42]$$

which, we have pointed out, can be included on the left-hand side of [39]. Finally, notice that the boundedness of the integral [42] implies

$$\limsup_{t\to 0^+} t^2\|u_{tt}(t)\|^2 = 0 \qquad [43]$$

Therefore, we can integrate [41] to get the estimate

$$t^2\|u_{tt}(t)\|^2 + \int_0^t \tau^2\|\nabla u_{tt}\|^2\,d\tau \le B\big(M, t, \|u_0\|_{W_2^2(\Omega)}\big), \quad \text{for } 0 \le t < T \qquad [44]$$

analogous to [34].

At this point, we have introduced every device needed to proceed by induction to an infinite sequence of time-weighted estimates, similar to [39] and [44], but with successively higher orders of time derivatives and weights. The dependence of these estimates on $\|u_0\|_{W_2^2(\Omega)}$ was introduced through [34] and [35]. It can be eliminated by beginning the introduction of powers of $t$ as weight functions one step earlier, with the added advantage that the initial velocity $u_0$ needs only belong to $J_1(\Omega)$. In the two-dimensional case, the weight functions can be introduced even another step earlier, with the advantage that the initial velocity $u_0$ needs only belong to $J(\Omega)$. Each of these cases leads to an existence theorem for solutions $u \in C^\infty(\bar\Omega \times (0,T))$, with the initial values assumed in the norms of $J_1(\Omega)$ and $J(\Omega)$, respectively.

Existence by Galerkin Approximation

Let $\{a^1, a^2, \ldots\}$ and $\{\lambda_1, \lambda_2, \ldots\}$ denote the eigenfunctions and eigenvalues of the Stokes equations,

$$-\Delta a^k + \nabla p = \lambda_k a^k, \quad \nabla\cdot a^k = 0 \ \text{in } \Omega, \quad a^k\big|_{\partial\Omega} = 0 \qquad [45]$$

chosen to be orthonormal in $L^2(\Omega)$. Clearly, $-P\Delta a^k = \lambda_k a^k$, so they are also the eigenfunctions and eigenvalues of the Stokes operator, $-P\Delta$. Using regularity estimates for the Stokes equations, each eigenfunction is known to be $C^\infty$ smooth in $\bar\Omega$.

The $n$th Galerkin approximation for problem [9] is the solution

$$u^n(x,t) = \sum_{k=1}^{n} c_{kn}(t)\,a^k(x)$$

of the system of ordinary differential equations

$$(u^n_t, a^l) + (u^n\cdot\nabla u^n, a^l) = (\Delta u^n, a^l), \quad \text{for } l = 1, 2, \ldots, n \qquad [46]$$

satisfying the initial conditions $(u^n(0), a^l) = (u_0, a^l)$, for $l = 1, 2, \ldots, n$. Of course, since $(u^n_t, a^l) = \partial c_{ln}/\partial t$ and $(\Delta u^n, a^l) = (P\Delta u^n, a^l) = -\lambda_l c_{ln}$, the differential equations can be written as

$$\frac{d}{dt}c_{ln} = -\sum_{i,j=1}^{n} c_{in}c_{jn}\,(a^i\cdot\nabla a^j, a^l) - \lambda_l c_{ln}$$

and the initial conditions as $c_{ln}(0) = (u^n(0), a^l)$, for $l = 1, 2, \ldots, n$.

The system [46] is at least locally solvable, on some interval $[0, T_n)$, with each coefficient satisfying $c_{ln} \in C^\infty[0, T_n)$. Therefore, since the eigenfunctions are also smooth, $u^n$ is $C^\infty$ smooth in $\bar\Omega \times [0, T_n)$. It also satisfies all of the identities [30] and [31] on the interval $[0, T_n)$. Indeed, multiplying [46] by $c_{ln}$ and summing over $l$ from 1 to $n$ has the effect of converting $a^l$ into $u^n$. The resulting identity for $u^n$ leads immediately to the energy identity

$$\frac{1}{2}\frac{d}{dt}\|u^n\|^2 + \|\nabla u^n\|^2 = 0 \qquad [47]$$

The remaining identities in the sequence [30] are obtained similarly. For example, the second is obtained by taking the time derivative of [46], multiplying through by $dc_{ln}/dt$ and summing over $l$. Prodi's identity is obtained by multiplying [46] by $\lambda_l c_{ln}$ and summing, which has the effect of converting $a^l$ into $-P\Delta u^n$. To obtain the second of the identities [31] for $u^n$, one differentiates [46], multiplies by $\lambda_l\,dc_{ln}/dt$ and sums. The remaining identities in the sequence [31] are obtained similarly.

The initial conditions easily imply that $\|u^n(0)\| \le \|u_0\|$, because $u_0 \in J(\Omega)$ and the eigenfunctions are orthogonal and complete in $J(\Omega)$. Therefore, integration of [47] yields the energy estimate

$$\frac{1}{2}\|u^n(t)\|^2 + \int_0^t \|\nabla u^n\|^2\,d\tau \le \frac{1}{2}\|u_0\|^2 \qquad [48]$$

which is uniform in $n$. Since $\|u^n(t)\|$ remains bounded, the solution $u^n(t)$ can be continued for all time. Thus, $T_n = \infty$, for all $n$. Hence, our early working assumption about solutions, that they are smooth in $\bar\Omega \times [0, \infty)$, is actually valid for the Galerkin approximations. The issue becomes one of obtaining estimates for their derivatives that are uniform in $n$. All of the estimates we have proved for solutions are proved in exactly the same way for the approximations. The only possible source of nonuniformity would arise from the initial values of $\|\nabla u^n\|$ and $\|u^n_t\|$.
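
As a worked check of the step from [46] to the energy identity [47], the computation can be sketched as follows. It uses only what is stated above: each eigenfunction $a^l$ is divergence free and vanishes on $\partial\Omega$, so the same holds for $u^n$. The individual integration-by-parts steps are spelled out here for illustration and are not quoted from the original text.

```latex
% Multiply [46] by c_{ln}(t) and sum over l = 1,...,n.
% Since u^n = \sum_l c_{ln} a^l, each a^l in the inner products becomes u^n:
\[
  (u^n_t, u^n) + (u^n\cdot\nabla u^n, u^n) = (\Delta u^n, u^n).
\]
% Time-derivative term:
\[
  (u^n_t, u^n) = \tfrac{1}{2}\,\frac{d}{dt}\,\|u^n\|^2 .
\]
% Nonlinear term: integrate by parts, using \nabla\cdot u^n = 0 in \Omega
% and u^n = 0 on \partial\Omega:
\[
  (u^n\cdot\nabla u^n, u^n)
  = \tfrac{1}{2}\int_\Omega u^n\cdot\nabla |u^n|^2 \,dx
  = -\tfrac{1}{2}\int_\Omega (\nabla\cdot u^n)\,|u^n|^2 \,dx
  = 0 .
\]
% Viscous term: integrate by parts once, again using u^n = 0 on \partial\Omega:
\[
  (\Delta u^n, u^n) = -\|\nabla u^n\|^2 .
\]
% Combining the three lines gives [47]:
\[
  \tfrac{1}{2}\,\frac{d}{dt}\,\|u^n\|^2 + \|\nabla u^n\|^2 = 0 .
\]
% Integrating from 0 to t and using \|u^n(0)\| \le \|u_0\| gives [48].
```

The vanishing of the nonlinear term is what makes the bound [48] independent of $n$: the quadratic coupling between the coefficients $c_{ln}$ redistributes energy among the modes but does not create it.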

   
The estimates [24], [26], and [27] are uniform in $n$, since $u_0 \in J_1(\Omega)$ and hence $\|\nabla u^n(0)\| \le \|\nabla u_0\|$, due to the orthogonality of the eigenfunctions in the inner product $(\nabla u, \nabla v)$, and their completeness with respect to functions in $J_1(\Omega)$. We also obtain a uniform bound for $\|u^n_t(0)\|$ of the form [35], by multiplying [46] by $\partial c_{ln}/\partial t$ and summing over $l$. In the last step, we also need the inequality $\|u^n(0)\|_{W_2^2(\Omega)} \le \|u_0\|_{W_2^2(\Omega)}$, which follows from the orthogonality of the eigenfunctions in the inner product $(P\Delta u, P\Delta v)$, and their completeness with respect to functions in $J_1(\Omega) \cap W_2^2(\Omega)$; see Ladyzhenskaya (1969, p. 46). Any attempt to find a bound for $\|\nabla u^n_t(0)\|$ analogous to [36] is certain to fail, as it would lead to a contradiction with aforementioned results from Heywood and Rannacher (1982).

Passage to the Limit

We now have $L^2$-bounds for $u^n$, $\nabla u^n$, $u^n_t$, $\partial^2 u^n/\partial x_i\partial x_j$, and $\nabla u^n_t$ over any space-time region $\Omega \times (0, T')$, with $0 < T' < T$. We also have $L^2$-bounds for all orders of the time derivatives of these quantities over any subregion $\Omega \times (\varepsilon, T')$, with $0 < \varepsilon < T' < T$. From these $L^2$-bounds, we may infer the existence of a subsequence of the Galerkin approximations, again denoted by $\{u^n\}$, which converges, along with those of its derivatives for which we have bounds, to a limit $u$ and its derivatives. The convergence $u^n \to u$ and $\nabla u^n \to \nabla u$ is strong in $L^2(\Omega \times (0, T'))$; the convergence of $u^n_t$ is strong in $L^2(\Omega \times (\varepsilon, T'))$ and weak in $L^2(\Omega \times (0, T'))$; the convergence of $P\Delta u^n$ is weak in $L^2(\Omega \times (0, T'))$; all time derivatives of $u^n$, $\nabla u^n$ converge strongly in $L^2(\Omega \times (\varepsilon, T'))$.

Because of estimates for the time derivatives, trace arguments give the strong convergence $u^n \to u$, $\nabla u^n \to \nabla u$, $u^n_t \to u_t$, and the weak convergence $P\Delta u^n \to P\Delta u$, in $L^2(\Omega)$, for every $t > 0$.

For any fixed time, $u \in W_2^2(\Omega)$, and therefore $u$ is continuous in $\bar\Omega$ by a well-known Sobolev inequality. Since $u \in J_1(\Omega)$, it must equal zero along the boundary. The estimates for the time derivatives of $u^n$, $\nabla u^n$, $\partial^2 u^n/\partial x_i\partial x_j$ imply that $u$ and its time derivatives are time continuous in $W_2^2(\Omega)$. Therefore, $u, u_t, u_{tt}, \ldots$ are classically continuous in $\bar\Omega \times (0, T)$.

Introduction of the Pressure

Because of the strong convergence $u^n \to u$, $\nabla u^n \to \nabla u$, $u^n_t \to u_t$ and the weak convergence $P\Delta u^n \to P\Delta u$, in $L^2(\Omega)$, for any $t > 0$, it is an easy matter to let $n \to \infty$ in [46], obtaining, for all $t > 0$,

$$(u_t, a^l) + (u\cdot\nabla u, a^l) = (\Delta u, a^l), \quad \text{for } l = 1, 2, \ldots \qquad [49]$$

Since the eigenfunctions are complete in $J(\Omega)$, and $D(\Omega) \subset J(\Omega)$, this implies

$$(u_t + u\cdot\nabla u - \Delta u, \phi) = 0, \quad \text{for all } \phi \in D(\Omega) \qquad [50]$$

Therefore, there exists a vector field $\nabla p \in G(\Omega)$ such that

$$u_t + u\cdot\nabla u - \Delta u = -\nabla p \qquad [51]$$

Indeed, the usual test to determine whether a smooth vector field $w$ is conservative in some domain $\Omega$, and therefore representable as a gradient, is to check whether the curve integrals

$$\oint_C w\cdot\tau\,ds \qquad [52]$$

vanish for every smooth closed curve $C \subset \Omega$. Here, $\tau$ is the unit tangent to the curve and $ds$ is its arc length. With a little reflection, one will realize that these curve integrals can be approximated by volume integrals of the form $(w, \phi)$ with $\phi \in D(\Omega)$. For this, one should choose $\phi$ to have its support in a small tubular neighborhood of the curve, and its streamlines parallel to the curve, with unit net flux through any section of the tube. If $w$ is not smooth, but only known to belong to $L^2(\Omega)$, one can approximate it with its smooth mollifications. This argument can be made rigorous. We previously showed that $J(\Omega)$ and $G(\Omega)$ are orthogonal subspaces of $L^2(\Omega)$. Now we have argued that $L^2(\Omega) = J(\Omega) \oplus G(\Omega)$.

Classical $C^\infty$ Regularity

At any fixed time, we may regard $u$ as a solution of the steady Stokes problem [18] with $f = -u_t - u\cdot\nabla u$. Included in Cattabriga (1961) and Solonnikov (1964, 1966) are regularity estimates for all orders of derivatives of the form

$$\|u\|_{W_2^{k+2}(\Omega)} \le c\,\|f\|_{W_2^{k}(\Omega)}$$

From our estimates above, we easily conclude that $f \equiv -u_t - u\cdot\nabla u \in W_2^1(\Omega)$. Hence, $u \in W_2^3(\Omega)$. In fact, in view of the regularity we have proven with respect to time, $f \in C^\infty(0, T; W_2^1(\Omega))$ and $u \in C^\infty(0, T; W_2^3(\Omega))$. Thus begins a bootstrapping argument. In the next step, we observe that $f \in C^\infty(0, T; W_2^2(\Omega))$ and conclude that $u \in C^\infty(0, T; W_2^4(\Omega))$. By induction, one obtains $u \in C^\infty(0, T; W_2^k(\Omega))$ for every positive integer $k$. Then well-known Sobolev inequalities imply that $u \in C^\infty(\bar\Omega \times (0, T))$.
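
The general induction step of the bootstrapping argument can be sketched as follows. This is an illustrative outline under stated assumptions, not a quotation of the original proof: it assumes a bounded three-dimensional domain with smooth boundary, and it assumes the standard Sobolev product estimate $\|gh\|_{W_2^{s_2}} \le c\,\|g\|_{W_2^{s_1}}\|h\|_{W_2^{s_2}}$ for $s_1 > 3/2$ and $s_1 \ge s_2 \ge 0$, which is not stated in the text above.

```latex
% Induction hypothesis, for some integer k >= 0:
\[
  u \in C^\infty\big(0,T; W_2^{k+2}(\Omega)\big),
  \qquad\text{hence also}\qquad
  u_t \in C^\infty\big(0,T; W_2^{k+2}(\Omega)\big).
\]
% Assumed product estimate with s_1 = k+2 > 3/2 and s_2 = k+1:
\[
  \|u\cdot\nabla u\|_{W_2^{k+1}(\Omega)}
  \le c\,\|u\|_{W_2^{k+2}(\Omega)}\,\|\nabla u\|_{W_2^{k+1}(\Omega)}
  \le c\,\|u\|_{W_2^{k+2}(\Omega)}^{2} .
\]
% Consequently the right-hand side of the steady Stokes problem gains one derivative:
\[
  f = -u_t - u\cdot\nabla u \in C^\infty\big(0,T; W_2^{k+1}(\Omega)\big).
\]
% The Stokes regularity estimate, applied with k replaced by k+1, then gives
\[
  \|u\|_{W_2^{k+3}(\Omega)} \le c\,\|f\|_{W_2^{k+1}(\Omega)},
  \qquad\text{so}\qquad
  u \in C^\infty\big(0,T; W_2^{k+3}(\Omega)\big).
\]
% The case k = 0 reproduces the first step in the text:
% u in W_2^2  =>  f in W_2^1  =>  u in W_2^3.
```

Each pass through the loop trades the known spatial regularity of $u$ for one more derivative, and the smoothness in time established earlier is what allows $u_t$ to be treated as a known right-hand side at every stage.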

Assumption of the Initial Values

We begin by showing that $u(t) \to u_0$, weakly in $L^2(\Omega)$, as $t \to 0^+$. Of course, $\|u(t)\|$ remains bounded as $t \to 0^+$, in virtue of [48], and the eigenfunctions $\{a^l\}$ are complete in $J(\Omega)$. Writing

$$(u(t) - u_0, a^l) = (u(t) - u^n(t), a^l) + (u^n(t) - u^n(0), a^l) + (u^n(0) - u_0, a^l)$$

note that the first and third terms on the right-hand side can be made small by choosing $n$ large. The second can be written as

$$(u^n(t) - u^n(0), a^l) = \int_0^t (u^n_t, a^l)\,d\tau$$

which will be small if $t$ is small, in view of [34]. Thus, $(u(t) - u_0, a^l) \to 0$, as $t \to 0^+$, which implies the desired weak convergence.

The strong convergence $u(t) \to u_0$ in $L^2(\Omega)$ follows from the weak convergence if $\limsup_{t\to 0^+}\|u(t)\| \le \|u_0\|$. The energy estimate [48] for the approximations implies this also.

To conclude that $u(t) \to u_0$ strongly in $J_1(\Omega)$, it only remains to be shown that $\limsup_{t\to 0^+}\|\nabla u(t)\| \le \|\nabla u_0\|$. This readily follows from [29], provided the bounding function $M(t)$ satisfies $M(t) \to \|\nabla u_0\|$, as $t \to 0^+$. The bounding functions provided by our basic estimates [24], [26], and [27] all have this property.

We may conclude that $u(t) \to u_0$ weakly in $W_2^2(\Omega)$, provided $\|u(t)\|_{W_2^2(\Omega)}$ remains bounded as $t \to 0^+$. To see this, remember that $\|P\Delta u\|$ and $\|u_t\|$ are of essentially the same order. Thus the term $\|u_t(t)\|^2$ on the left-hand side of [34] can be accompanied by a term $\|u(t)\|^2_{W_2^2(\Omega)}$.

Finally, to prove that $u(t) \to u_0$ strongly in $W_2^2(\Omega)$, we need only show that $\limsup_{t\to 0^+}\|P\Delta u(t)\| \le \|P\Delta u_0\|$, since $\|P\Delta\,\cdot\,\|$ and $\|\cdot\|_{W_2^2(\Omega)}$ are equivalent norms on $J_1(\Omega) \cap W_2^2(\Omega)$. To this end, multiply [46] by $\lambda_l\,dc_{ln}/dt$ and sum to get

$$\frac{1}{2}\frac{d}{dt}\|P\Delta u^n\|^2 + \|\nabla u^n_t\|^2 = (u^n\cdot\nabla u^n, P\Delta u^n_t) = \frac{d}{dt}(u^n\cdot\nabla u^n, P\Delta u^n) - (u^n_t\cdot\nabla u^n + u^n\cdot\nabla u^n_t, P\Delta u^n)$$

Integrating this gives

$$\|P\Delta u^n(t)\|^2 - \|P\Delta u^n(0)\|^2 = \int_0^t \frac{d}{dt}\|P\Delta u^n\|^2\,ds = 2(u^n\cdot\nabla u^n, P\Delta u^n)\big|_t - 2(u^n\cdot\nabla u^n, P\Delta u^n)\big|_0 - 2\int_0^t \|\nabla u^n_t\|^2\,ds - 2\int_0^t (u^n_t\cdot\nabla u^n + u^n\cdot\nabla u^n_t, P\Delta u^n)\,ds \qquad [53]$$

For the terms under the last integral we have

$$\big|(u^n_t\cdot\nabla u^n, P\Delta u^n) + (u^n\cdot\nabla u^n_t, P\Delta u^n)\big| \le \|\nabla u^n_t\|^2 + c\,\|\nabla u^n\|^{1/2}\|P\Delta u^n\|^{3/2}$$

Therefore, [53] implies

$$\|P\Delta u^n(t)\|^2 \le \|P\Delta u^n(0)\|^2 + 2(u^n\cdot\nabla u^n, P\Delta u^n)\big|_t - 2(u^n\cdot\nabla u^n, P\Delta u^n)\big|_0 + Kt$$

uniformly in $n$, as $t \to 0^+$, where $K$ is a constant depending on the estimates [32] and [34]. Letting $n \to \infty$ gives

$$\|P\Delta u(t)\|^2 \le \|P\Delta u(0)\|^2 + 2(u\cdot\nabla u, P\Delta u)\big|_t - 2(u\cdot\nabla u, P\Delta u)\big|_0 + Kt$$

Since $u\cdot\nabla u \to u_0\cdot\nabla u_0$ strongly in $L^2(\Omega)$, and $P\Delta u \to P\Delta u_0$ weakly in $L^2(\Omega)$, we get the desired result. The continuous assumption of the initial values in $W_2^2(\Omega)$ also implies their continuous assumption in the classical sense, and hence that $u \in C(\bar\Omega \times [0, T))$.

Conclusion

Years ago, mathematical questions concerning the Navier–Stokes equations were usually considered in the context of generalized or weak solutions, which was a technical barrier to many in the scientific community. Nowadays, realizing that solutions are at least locally classical, fundamental questions such as that of global regularity can be studied within the classical context. If the estimate [29] is proved for classical solutions, with $T = \infty$, and without a restriction on the size of the data, this particular matter will be settled.

This work has been supported by the Natural Sciences and Engineering Research Council of Canada.

See also: Compressible Flows: Mathematical Theory; Elliptic Differential Equations: Linear Theory; Incompressible Euler Equations: Mathematical Theory; Interfaces and Multicomponent Fluids; Leray–Schauder Theory and Mapping Degree; Non-Newtonian Fluids; Partial Differential Equations: Some Examples; Stochastic Hydrodynamics; Turbulence Theories; Wavelets: Application to Turbulence.

Further Reading

Beirão da Veiga H (1997) A new approach to the $L^2$-regularity theorem for linear stationary nonhomogeneous Stokes systems. Portugaliae Mathematica 54(Fasc. 3): 271–286.
Cattabriga L (1961) Su un problema al contorno relativo al sistema di equazioni di Stokes. Rendiconti del Seminario Matematico della Università di Padova 31: 308–340.

Heywood JG (1976) On uniqueness questions in the theory of viscous flow. Acta Mathematica 136: 61–102.
Heywood JG (1980) The Navier–Stokes equations: on the existence, regularity and decay of solutions. Indiana University Mathematics Journal 29: 639–681.
Heywood JG (1990) Open problems in the theory of the Navier–Stokes equations for viscous incompressible flow. In: Heywood JG, Masuda K, Rautmann R, and Solonnikov VA (eds.) The Navier–Stokes Equations: Theory and Numerical Methods, Lecture Notes in Mathematics, vol. 1431, pp. 1–22. Berlin–Heidelberg: Springer Verlag.
Heywood JG (1994) Remarks concerning the possible global regularity of solutions of the three-dimensional incompressible Navier–Stokes equations. In: Galdi GP, Malek J, and Necas J (eds.) Progress in Theoretical and Computational Fluid Dynamics, Pitman Research Notes in Mathematics Series, vol. 308, pp. 1–32. Essex: Longman Scientific and Technical.
Heywood JG (2001) On a conjecture concerning the Stokes problem in nonsmooth domains. In: Neustupa J and Penel P (eds.) Mathematical Fluid Mechanics: Recent Results and Open Problems, Advances in Mathematical Fluid Mechanics vol. 2, pp. 195–206. Basel: Birkhäuser Verlag.
Heywood JG (2003) A curious phenomenon in a model problem, suggestive of the hydrodynamic inertial range and the smallest scale of motion. Journal of Mathematical Fluid Mechanics 5: 403–423.
Heywood JG and Rannacher R (1982) Finite element approximation of the nonstationary Navier–Stokes problem, Part I: Regularity of solutions and second-order error estimates for the spatial discretization. SIAM Journal on Numerical Analysis 19: 275–311.
Heywood JG and Xie W (1997) Smooth solutions of the vector Burgers equation in nonsmooth domains. Differential and Integral Equations 10: 961–974.
Ladyzhenskaya OA (1969) The Mathematical Theory of Viscous Incompressible Flow, 2nd edn. New York: Gordon and Breach.
Prodi G (1962) Teoremi di tipo locale per il sistema di Navier–Stokes e stabilità delle soluzioni stazionarie. Rendiconti del Seminario Matematico della Università di Padova 32: 374–397.
Solonnikov VA (1964) On general boundary-value problems for elliptic systems in the sense of Douglis–Nirenberg. I. Izvestiya Akademii Nauk SSSR, Seriya Matematicheskaya 28: 665–706.
Solonnikov VA (1966) On general boundary-value problems for elliptic systems in the sense of Douglis–Nirenberg. II. Trudy Matematiceskogo Instituta Imeni V.A. Steklova 92: 233–297.
Solonnikov VA and Sčadilov VE (1973) On a boundary value problem for a stationary system of Navier–Stokes equations. Proceedings. Steklov Institute of Mathematics 125: 186–199.
Xie W (1991) A sharp pointwise bound for functions with $L^2$ Laplacians and zero boundary values on arbitrary three-dimensional domains. Indiana University Mathematics Journal 40: 1185–1192.
Xie W (1992) On a three-norm inequality for the Stokes operator in nonsmooth domains. In: Heywood JG, Masuda K, Rautmann R, and Solonnikov VA (eds.) The Navier–Stokes Equations II: Theory and Numerical Methods, Lecture Notes in Mathematics, vol. 1530, pp. 310–315. Berlin–Heidelberg: Springer Verlag.
Xie W (1997) Sharp Sobolev interpolation inequalities for the Stokes operator. Differential and Integral Equations 10: 393–399.

von Neumann Algebras: Introduction, Modular Theory, and Classification Theory

V S Sunder, The Institute of Mathematical Sciences, Chennai, India
© 2006 Elsevier Ltd. All rights reserved.

Introduction

von Neumann algebras, as they are called now, first made their appearance under the name "rings of operators" in a series of seminal papers – see Murray and von Neumann (1936, 1937, 1943) and von Neumann (1936) – by F J Murray and J von Neumann starting in 1936. Murray and von Neumann (1936) specifically cite "attempts to generalize the theory of unitary group representations" and "demands by various aspects of the quantum-mechanical formalism" among the reasons for the elucidation of this subject.

In fact, the simplest definition of a von Neumann algebra is via unitary group representations: a collection $M$ of continuous linear operators on a Hilbert space $\mathcal{H}$ (in order to avoid some potential technical problems, we shall restrict ourselves to separable Hilbert spaces throughout this article) is a von Neumann algebra precisely when there is a representation $\pi$ of a group $G$ as unitary operators on $\mathcal{H}$ such that

$$M = \{x \in \mathcal{L}(\mathcal{H}) : x\pi(t) = \pi(t)x \ \ \forall t \in G\}$$

As above, we shall write $\mathcal{L}(\mathcal{H})$ for the collection of all continuous linear operators on the Hilbert space $\mathcal{H}$; recall that a linear mapping $x : \mathcal{H} \to \mathcal{H}$ is continuous precisely when there exists a positive constant $K$ such that $\|x\xi\| \le K\|\xi\|$ $\forall \xi \in \mathcal{H}$. If the norm $\|x\|$ of the operator $x$ is defined as the smallest constant $K$ with the above property, then the set $\mathcal{L}(\mathcal{H})$ acquires the structure of a Banach space. In fact $\mathcal{L}(\mathcal{H})$ is a Banach $*$-algebra with respect to the composition product, and involution $x \mapsto x^*$ given by

$$\langle x\xi, \eta\rangle = \langle \xi, x^*\eta\rangle \quad \forall \xi, \eta \in \mathcal{H}$$

The first major result in the subject is the remarkable "double commutant theorem," which establishes the equivalence of a purely algebraic requirement to purely topological ones. We need

two bits of terminology to be able to state the theorem.

First, define the commutant $S'$ of a subset $S \subset \mathcal{L}(\mathcal{H})$ by

$$S' = \{x' \in \mathcal{L}(\mathcal{H}) : x'x = xx' \ \ \forall x \in S\}$$

Second, the strong (resp., weak) operator topology is the topology on $\mathcal{L}(\mathcal{H})$ of "pointwise strong (resp., weak) convergence": that is, $x_n \to x$ precisely when $\|x_n\xi - x\xi\| \to 0$ $\forall \xi \in \mathcal{H}$ (resp., $\langle x_n\xi - x\xi, \eta\rangle \to 0$ $\forall \xi, \eta \in \mathcal{H}$).

Theorem 1 The following conditions on a unital $*$-subalgebra $M$ of $\mathcal{L}(\mathcal{H})$ are equivalent:
(i) $M = M''$ $(= (M')')$.
(ii) $M$ is closed in the strong operator topology.
(iii) $M$ is closed in the weak operator topology.

The conventional definition of a von Neumann algebra is that it is a unital $*$-subalgebra of $\mathcal{L}(\mathcal{H})$ which satisfies the equivalent conditions above. The equivalence with our earlier "simplest definition" is a consequence of the double commutant theorem and the fact that any element of a von Neumann algebra is a linear combination of four unitary elements of the algebra: simply take $G$ to be the group of unitary operators in $M'$.

Another consequence of the double commutant theorem is that von Neumann algebras are closed under any "canonical construction." For instance, the uniqueness of the spectral measure $E \mapsto P_x(E)$ associated to a normal operator $x$ shows that if $u$ is unitary, then $P_{uxu^*}(E) = uP_x(E)u^*$ for all Borel sets $E$. In particular, if $x \in M$ and $u' \in \mathcal{U}(M')$, then $u'P_x(E)u'^* = P_{u'xu'^*}(E) = P_x(E)$, and hence, we may conclude that $P_x(E) \in \mathcal{U}(M')' = (M')' = M$ (we will write $\mathcal{U}(N)$ (resp., $\mathcal{P}(N)$) to denote the collection of unitary (resp., projection) operators in any von Neumann algebra $N$); that is, if a von Neumann algebra contains a normal operator, it also contains all the associated spectral projections. This fact, together with the spectral theorem, has the consequence that any von Neumann algebra $M$ is the closed linear span of $\mathcal{P}(M)$.

The analogy with unitary group representations is fruitful. Suppose then that $M = \pi(G)'$, for a unitary representation $\pi$ of $G$. Then the last sentence of the previous paragraph implies that $\pi(G)' = \mathbb{C}$ precisely when there exist no nontrivial $\pi$-stable subspaces (here and in the sequel, we identify $\mathbb{C}$ with its image under the unique unital homomorphism of $\mathbb{C}$ into $\mathcal{L}(\mathcal{H})$; and we reserve the symbol $Z(M)$ to denote the center of $M$), that is, when $\pi$ is irreducible. In general, the $\pi$-stable subspaces are precisely the ranges of projection operators in $M$. The notion of unitary equivalence of subrepresentations of $\pi$ is seen to translate to the equivalence defined on the set $\mathcal{P}(M)$ of projections in $M$, whereby $p \sim q$ if and only if there exists an operator $u \in M$ such that $u^*u = p$ and $uu^* = q$. (Such a $u$ is called a partial isometry, with "initial space" = range $p$, and "final space" = range $q$.) This is the definition of what is known as the "Murray–von Neumann equivalence rel $M$" and is denoted by $\sim_M$. The following accompanying definition is natural: if $p, q \in \mathcal{P}(M)$, say $p \preceq_M q$ if there exists $p_0 \in \mathcal{P}(M)$ such that $p \sim_M p_0 \le q$ – where of course $e \le f \Leftrightarrow \mathrm{range}(e) \subseteq \mathrm{range}(f)$.

The Murray–von Neumann Classification of Factors

We start with a fact (whose proof is quite easy) and a consequent fundamental definition.

Proposition 2 The following conditions on a von Neumann algebra $M$ are equivalent:
(i) for any $p, q \in \mathcal{P}(M)$, it is true that either $p \preceq_M q$ or $q \preceq_M p$.
(ii) $Z(M) = M \cap M' = \mathbb{C}$.

The von Neumann algebra $M$ is called a "factor" if it satisfies the equivalent conditions above.

The alert reader would have noticed that if $G$ is a finite group, then $\pi(G)'$ is a factor precisely when the representation $\pi$ is "isotypical." Thus, the "representation-theoretic fact," that any unitary representation is expressible as a direct sum of isotypical subrepresentations, translates into the "von Neumann algebraic fact" that any $*$-subalgebra of $\mathcal{L}(\mathcal{H})$ is isomorphic, when $\mathcal{H}$ is finite dimensional, to a direct sum of factors. In complete generality, von Neumann (1949) showed that any von Neumann algebra is expressible as a "direct integral of factors." We shall interpret this fact from "reduction theory" as the statement that all the magic/mystery of von Neumann algebras is contained in factors and hence restrict ourselves, for a while, to the consideration of factors.

Murray and von Neumann initiated the study of a general factor $M$ via a qualitative as well as a quantitative analysis of the relation $\preceq_M$ on $\mathcal{P}(M)$. First, call a $p \in \mathcal{P}(M)$ infinite if there exists a $p_0 \le p$ such that $p \sim_M p_0$ and $p_0 \ne p$; otherwise, say $p$ is finite. They obtained an analog, called the "dimension function," of the Haar measure, as follows.

Theorem 3
(i) With $M$ as above, there exists a function $D_M : \mathcal{P}(M) \to [0, \infty]$ which satisfies the following

properties, and is determined up to a multiplicative constant, by them:
• $p \preceq_M q \Leftrightarrow D_M(p) \le D_M(q)$
• $p$ is finite if and only if $D_M(p) < \infty$
• If $\{p_n : n = 1, 2, \ldots\}$ is any sequence of pairwise orthogonal projections in $\mathcal{P}(M)$ and $p = \sum_n p_n$, then $D_M(p) = \sum_n D_M(p_n)$
(ii) $M$ falls into exactly one of five possible cases, depending on which of the following sets is the range of some scaling of $D_M$:
• $(\mathrm{I}_n)$ $\{0, 1, 2, \ldots, n\}$
• $(\mathrm{I}_\infty)$ $\{0, 1, 2, \ldots, \infty\}$
• $(\mathrm{II}_1)$ $[0, 1]$
• $(\mathrm{II}_\infty)$ $[0, \infty]$
• $(\mathrm{III})$ $\{0, \infty\}$

In words, we may say that a factor $M$ is of:
1. type I (i.e., of type $\mathrm{I}_n$ for some $1 \le n \le \infty$) precisely when $M$ contains a minimal projection,
2. type II (i.e., of type $\mathrm{II}_1$ or $\mathrm{II}_\infty$) precisely when $M$ contains nonzero finite projections but no minimal projections, and
3. type III precisely when $M$ contains no nonzero finite projections.

Examples $L^\infty(\Omega, \mu)$ may be regarded as a von Neumann algebra acting on $L^2(\Omega, \mu)$ as multiplication operators; thus, if we set $m_f(\xi) = f\xi$, then $m : f \mapsto m_f$ defines an isomorphism of $L^\infty(\Omega, \mu)$ onto a commutative von Neumann subalgebra of $\mathcal{L}(L^2(\Omega, \mu))$. In fact, "up to multiplicity," this is how any commutative von Neumann algebra looks.

It is a simple exercise to prove that $M \subset \mathcal{L}(\mathcal{H})$ is a factor of type $\mathrm{I}_n$, $1 \le n \le \infty$, if and only if there exist Hilbert spaces $\mathcal{H}_n$ and $\mathcal{K}$ and identifications $\mathcal{H} = \mathcal{H}_n \otimes \mathcal{K}$, $M = \{x \otimes \mathrm{id}_{\mathcal{K}} : x \in \mathcal{L}(\mathcal{H}_n)\}$ where $\dim \mathcal{H}_n = n$; and so $M \cong \mathcal{L}(\mathcal{H}_n)$.

To discuss examples of the other types, it will be convenient to use "crossed products" of von Neumann algebras by ergodically acting groups of automorphisms. We shall now digress with a discussion of this generalization of the notion of a semidirect product of groups.

If $\alpha : G \to \mathrm{Aut}(M)$ is an action of a countable group $G$ on $M$, where $M \subset \mathcal{L}(\mathcal{H})$ is a von Neumann algebra, and $\tilde{\mathcal{H}} = \ell^2(G, \mathcal{H})$, there are representations $\pi : M \to \mathcal{L}(\tilde{\mathcal{H}})$ and $\lambda : G \to \mathcal{U}(\mathcal{L}(\tilde{\mathcal{H}}))$ defined by

$$(\pi(x)\xi)(s) = \alpha_{s^{-1}}(x)\,\xi(s), \qquad (\lambda(t)\xi)(s) = \xi(t^{-1}s)$$

These representations satisfy the commutation relation $\lambda(t)\pi(x)\lambda(t^{-1}) = \pi(\alpha_t(x))$, and the crossed product $M \rtimes_\alpha G$ is the von Neumann subalgebra of $\mathcal{L}(\tilde{\mathcal{H}})$ defined by $\tilde{M} = (\pi(M) \cup \lambda(G))''$.

Let us restrict ourselves to the case of $M = L^\infty(\Omega, \mu)$ acting on $L^2(\Omega, \mu)$. In this case, it is true that any automorphism of $M$ is of the form $f \mapsto f \circ T^{-1}$, where $T$ is a "nonsingular transformation of the measure space $(\Omega, \mu)$" (= a bijection which preserves the class of sets of $\mu$-measure 0). So, an action of $G$ on $M$ is of the form $\alpha_t(f) = f \circ T_t^{-1}$, for some homomorphism $t \mapsto T_t$ from $G$ to the group of nonsingular transformations of $(\Omega, \mu)$. We have the following elegantly complete result from Murray and von Neumann (1936).

Theorem 4 Let $M$, $G$, $\alpha$ be as in the last section, and let $\tilde{M} = M \rtimes_\alpha G$. Assume the $G$-action is "free," meaning that if $t \ne 1 \in G$, then $\mu(\{\omega \in \Omega : T_t(\omega) = \omega\}) = 0$. Then:
(i) $\tilde{M}$ is a factor if and only if $G$ acts ergodically on $(\Omega, \mu)$ – meaning that the only $G$-fixed functions in $M$ are the constants.
(ii) Assume that $G$ acts ergodically. Then the type of the factor $\tilde{M}$ is determined as follows:
• $\tilde{M}$ is of type I or II if and only if there exists a $G$-invariant measure $\nu$ which is mutually absolutely continuous with respect to $\mu$, meaning $\nu(E) = 0 \Leftrightarrow \mu(E) = 0$ (the ergodicity assumption implies that such a $\nu$ is necessarily unique up to scaling by a positive constant);
• $\tilde{M}$ is of type $\mathrm{I}_n$ precisely when the $\nu$ as above is totally atomic, and $\Omega$ is the disjoint union of $n$ atoms for $\nu$;
• $\tilde{M}$ is of type II precisely when the $\nu$ as above is nonatomic;
• $\tilde{M}$ is of finite type – meaning that 1 is a finite projection in $\tilde{M}$ – precisely when the $\nu$ as above is a finite measure;
• $\tilde{M}$ is of type III if and only if there exists no $\nu$ as above.

Thus, we get all the types of factors by this construction; for instance, we may take:
$(\mathrm{I}_n)$ $G = \mathbb{Z}_n$ acting on $\Omega = \mathbb{Z}_n$ by translation, and $\mu = \nu =$ counting measure
$(\mathrm{I}_\infty)$ $G = \mathbb{Z}$ acting on $\Omega = \mathbb{Z}$ by translation, and $\mu = \nu =$ counting measure
$(\mathrm{II}_1)$ $G = \mathbb{Z}$ acting on $\Omega = \mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ by powers of an aperiodic rotation, and $\mu = \nu =$ arclength measure
$(\mathrm{II}_\infty)$ $G = \mathbb{Q}$ acting on $\Omega = \mathbb{R}$ by translations, and $\mu = \nu =$ Lebesgue measure
$(\mathrm{III})$ $G = ax + b$ group acting in the obvious manner on $\Omega = \mathbb{R}$, $\mu =$ Lebesgue measure.

Such crossed products of a commutative von Neumann algebra by an ergodically acting countable

group were intensively studied by Krieger (1970, 1976). We shall simply refer to such factors as "Krieger factors." The term "Krieger factor" is actually used for factors obtained from a slightly more general construction, with ergodic group actions replaced by more general ergodic equivalence relations. Since there is no difference in the two notions at least in good (amenable) cases, we will say no more about this.

Abstract von Neumann Algebras

So far, we have described matters as they were in von Neumann's time. To come to the modern era, it is desirable to "free a von Neumann algebra from the ambient Hilbert space" and to regard it as an abstract object in its own right which can act on different Hilbert spaces – for example, $L^\infty(\Omega, \mu)$ is an object worthy of study in its own right, without reference to $L^2(\Omega, \mu)$.

The abstract viewpoint is furnished by a theorem of Sakai (1983); let us define an abstract von Neumann algebra to be an abstract $C^*$-algebra (this is a Banach $*$-algebra with an involution related to the norm by the so-called $C^*$-identity $\|x\|^2 = \|x^*x\|$) $M$ which admits a pre-dual $M_*$ – i.e., $M$ is isometrically isomorphic to the Banach dual space $(M_*)^*$. It turns out that a predual of such an abstract von Neumann algebra is unique up to isometric isomorphism. Consequently, an abstract von Neumann algebra comes equipped with a canonical "weak$^*$-topology," usually called the "$\sigma$-weak topology" on $M$. The natural morphisms in the category of abstract von Neumann algebras are $*$-homomorphisms which are continuous with respect to $\sigma$-weak topologies on domain and range. It is customary to call a linear map between abstract von Neumann algebras "normal" if it is continuous with respect to $\sigma$-weak topologies on domain and range.

The equivalence of the "abstract" definition of this section, with the "concrete" one given earlier (which depends on an ambient Hilbert space), relies on the following four facts:
1. $\mathcal{L}(\mathcal{H})$ is an abstract von Neumann algebra, with the predual $\mathcal{L}(\mathcal{H})_*$ being the so-called "trace class" of operators, equipped with the "trace norm."
2. A self-adjoint subalgebra of $\mathcal{L}(\mathcal{H})$ is closed in the strong operator topology, and is hence a "concrete von Neumann algebra" precisely when it is closed in the $\sigma$-weak topology on $\mathcal{L}(\mathcal{H})$.
3. If $M$ is an abstract von Neumann algebra, and $N$ is a $*$-subalgebra of $M$ which is closed in the $\sigma$-weak topology of $M$, then $N$ is also an abstract von Neumann algebra, with one candidate for $N_*$ being $M_*/N_\perp$ (where $N_\perp = \{\rho \in M_* : n(\rho) = 0 \ \forall n \in N\}$).
4. Any abstract von Neumann algebra (with separable predual) is isomorphic (in the category of abstract von Neumann algebras) to a (concrete) von Neumann subalgebra of $\mathcal{L}(\mathcal{H})$ (for a separable $\mathcal{H}$).

With the abstract viewpoint available, we shall look for modules over a von Neumann algebra $M$, meaning pairs $(\mathcal{H}, \pi)$ where $\pi : M \to \mathcal{L}(\mathcal{H})$ is a normal $*$-homomorphism.

A brief digression into the proof of fact (4) above – which asserts the existence of faithful $M$-modules – will be instructive and useful. Suppose $M$ is an abstract von Neumann algebra. A linear functional $\varphi$ on $M$ is called a normal state if:
• (positivity) $\varphi(x^*x) \ge 0$ $\forall x \in M$;
• (normality) $\varphi : M \to \mathbb{C}$ is normal; and
• (normalization) $\varphi(1) = 1$.

(Normal states on $L^\infty(\Omega, \mu)$ correspond to non-negative probability measures on $\Omega$ which are absolutely continuous with respect to $\mu$.) It is true that there exist plenty of normal states on $M$. In fact, they linearly span $M_*$. This implies that if $M_*$ is separable, then there exist normal states $\varphi$ on $M$ which are even "faithful" – meaning $\varphi(x^*x) = 0 \Leftrightarrow x = 0$.

Fix a faithful normal state $\varphi$ on $M$. (Consistent with our convention about separable $\mathcal{H}$'s, we shall only consider $M$'s with separable preduals.) The well-known "Gelfand–Naimark–Segal" construction then yields a faithful $M$-module which is usually denoted by $L^2(M, \varphi)$ – motivated by the fact that if $M = L^\infty(\Omega, \mu)$, and $\varphi(f) = \int f\,d\nu$, with $\nu$ a probability measure mutually absolutely continuous with respect to $\mu$, then $L^2(M, \varphi) = L^2(\Omega, \nu)$ with $L^\infty(\Omega, \mu)$ acting as multiplication operators. The construction mimics this case: the assumptions on $\varphi$ ensure that the equation

$$\langle x, y\rangle = \varphi(y^*x)$$

defines a positive-definite inner product on $M$; let $L^2(M, \varphi)$ be the Hilbert space completion of $M$. It turns out that the operator of left-multiplication by an element of $M$ extends as a bounded operator to $L^2(M, \varphi)$, and it then follows easily that $L^2(M, \varphi)$ is indeed a faithful $M$-module, thereby establishing fact (4) above.

Since we wish to distinguish between elements of the dense subspace $M$ of $L^2(M, \varphi)$ and the operators of left-multiplication by members of $M$, let us write $\hat{x}$ for an element of $M$ when thought of as an

element of $L^2(M, \varphi)$, and $x$ for the operator of left-multiplication by $x$; thus, for instance, $\hat{x} = x\hat{1}$, $x\hat{y} = \widehat{xy}$, $\langle x\hat{1}, \hat{1}\rangle = \varphi(x)$, etc.

Modular Theory

While type III factors were more or less an enigma at the time of von Neumann, all that changed with the advent of Connes. The first major result of this "type III era" is the celebrated "Tomita–Takesaki theorem" (cf. Takesaki (1970)), which views the adjoint mapping on $M$ as an appropriate operator on $L^2(M, \varphi)$, and analyzes its polar decomposition. Specifically, we have:

Theorem 5 If $\varphi$ is any faithful normal state on $M$, consider the densely defined conjugate-linear operator given, with domain $\{\hat{x} : x \in M\}$, by $S_\varphi^{(0)}(\hat{x}) = \widehat{x^*}$. Then,
(i) there is a unique conjugate-linear operator $S_\varphi$ (the "closure of $S_\varphi^{(0)}$") whose graph is the closure of the graph of $S_\varphi^{(0)}$; if we write $S_\varphi = J_\varphi \Delta_\varphi^{1/2}$ for the polar decomposition of the conjugate-linear closed operator $S_\varphi$, then
(ii) $J_\varphi$ is an antiunitary involution on $L^2(M, \varphi)$ (i.e., it is a conjugate-linear norm-preserving bijection of $L^2(M, \varphi)$ onto itself which is its own inverse);
(iii) $\Delta_\varphi$ is an injective positive self-adjoint operator on $L^2(M, \varphi)$ such that $J_\varphi f(\Delta_\varphi) J_\varphi = f(\Delta_\varphi^{-1})$ for all Borel functions $f : \mathbb{R} \to \mathbb{R}$, and most crucially
(iv)
$$J_\varphi M J_\varphi = M' \quad \text{and} \quad \Delta_\varphi^{it} M \Delta_\varphi^{-it} = M \quad \forall t \in \mathbb{R}.$$
(Here and elsewhere, we shall identify $x \in M$ with the operator of "left-multiplication by $x$" on $L^2(M, \varphi)$.)

Thus, each faithful normal state $\varphi$ on $M$ yields a one-parameter group $\{\sigma_t^\varphi : t \in \mathbb{R}\}$ of automorphisms of $M$ – referred to as the group of "modular automorphisms" – given by
$$\sigma_t^\varphi(x) = \Delta_\varphi^{it}\, x\, \Delta_\varphi^{-it}.$$
The extent of dependence of the modular group on the state is captured precisely by Connes' Radon–Nikodym theorem (Connes 1973), which shows that the modular groups associated to two different faithful normal states are related by a "unitary cocycle in $M$." This has the consequence that if $\pi : \mathrm{Aut}(M) \to \mathrm{Out}(M) = \mathrm{Aut}(M)/\mathrm{Int}(M)$ is the quotient mapping – where $\mathrm{Int}(M)$ denotes the normal subgroup of inner automorphisms given by unitary elements of $M$ – then the one-parameter subgroup $\{\pi(\sigma_t^\varphi) : t \in \mathbb{R}\}$ of $\mathrm{Out}(M)$ is independent of $\varphi$.

Connes' Classification and Injective Factors

Given a factor $M$, Connes defined
$$S(M) = \bigcap \{\mathrm{spec}(\Delta_\varphi) : \varphi \text{ a faithful normal state on } M\},$$
which is obviously an isomorphism invariant. He then classified (Connes 1973) type III factors into a continuum of factors:

Theorem 6 Let $M$ be a factor. Then,
(i) $0 \in S(M) \Leftrightarrow M$ is of type III; and
(ii) if $M$ is a type III factor, there are three mutually exclusive and exhaustive possibilities:
• (III$_0$) $S(M) = \{0, 1\}$
• (III$_\lambda$) $S(M) = \{0\} \cup \lambda^{\mathbb{Z}}$, for some $0 < \lambda < 1$
• (III$_1$) $S(M) = [0, \infty)$

Example 7 Consider the compact group $\Omega = \prod_{n=1}^{\infty} G_n$ where $G_n$ is a finite cyclic group of order $\nu_n$ for each $n$. Let $\mu = \prod_{n=1}^{\infty} \mu_n$, where $\mu_n$ is a probability measure defined on the subsets of $G_n$ which assigns positive mass to each singleton. Let $G = \bigoplus_{n=1}^{\infty} G_n$ be the dense subgroup of $\Omega$ consisting of finitely nonzero sequences. It is not hard to see that each translation $T_g$, $g \in G$ (given by $T_g(\omega) = g + \omega$) is a nonsingular transformation of the measure space $(\Omega, \mu)$. The density of $G$ in $\Omega$ shows that this action of $G$ on $L^\infty(\Omega, \mu)$ is fixed-point-free and ergodic, with the result that the crossed product $L^\infty(\Omega, \mu) \rtimes G$ is a factor.

Krieger showed that in the case of a Krieger factor $M = L^\infty(\Omega, \mu) \rtimes G$, the invariant $S(M)$ agrees with the so-called "asymptotic ratio set" of the group $G$ of nonsingular transformations, which is computable purely in terms of the Radon–Nikodym derivatives $d(\mu \circ T_t)/d\mu$. Using this ratio set description, it is not hard to see that the Krieger factor $M$ given by the infinite product
• is a factor of type III$_\lambda$ if $\nu_n = 2$ and $\mu_n\{0\} = \lambda/(1 + \lambda)$ for all $n$;
• is a factor of type III$_1$ if $\nu_n = 2$ and $\mu_{2n}\{0\} = \lambda/(1 + \lambda)$, $\mu_{2n+1}\{0\} = \rho/(1 + \rho)$, for all $n$, provided that $\{\lambda, \rho\}$ generates a dense multiplicative subgroup of $\mathbb{R}^{\times}_{+}$;
• can be of type III$_0$.

Among all factors, Connes identified one tractable class – the so-called injective factors – which are ubiquitous and amenable to classification. To start

with, he established the equivalence of several (seemingly quite disparate) requirements on a von Neumann algebra $M \subset L(H)$ – ranging from injectivity (meaning the existence of a projection of norm 1 from $L(H)$ onto $M$) to "approximate finite dimensionality" (meaning $M = (\bigcup_n A_n)''$ for some increasing sequence $A_1 \subset A_2 \subset \cdots \subset A_n \subset \cdots$ of finite-dimensional $*$-subalgebras). In the same paper, Connes (1976) essentially finished the complete classification of injective factors. Only the injective III$_1$ factor withstood his onslaught; but eventually even it had to surrender to the technical virtuosity of Haagerup (1987) a few years later!

In the language we have developed thus far, the classification of injective factors may be summarized as follows:
• Every injective factor is isomorphic to a Krieger factor.
• Up to isomorphism, there is a unique injective factor of each type with the solitary exception of III$_0$.
• Injective factors of type III$_0$ are classified (up to isomorphism) by an invariant of an ergodic-theoretic nature called the "flow of weights"; unfortunately, coming up with a crisp description of this invariant, which is simultaneously accessible to the nonexpert and is consistent with the stipulated size of this survey, is beyond the scope of this author.

The interested reader is invited to browse through one of the books (Connes 1994, Sunder 1986, Dixmier 1981) for further details; the third book is the oldest (a classic but the language has changed a bit since it was written), the second is more recent (but quite sketchy in many places), and the first is clearly the best choice (if one has the time to read it carefully and digest it). Alternatively, the interested reader might want to browse through the encyclopedic treatments (Kadison and Ringrose) or (Takesaki).

See also: Algebraic Approach to Quantum Field Theory; Bicrossproduct Hopf Algebras and Noncommutative Spacetime; Braided and Modular Tensor Categories; C*-Algebras and Their Classification; Ergodic Theory; Finite-Type Invariants; Hopf Algebra Structure of Renormalizable Quantum Field Theory; Hopf Algebras and q-Deformation Quantum Groups; The Jones Polynomial; Knot Theory and Physics; Noncommutative Geometry and the Standard Model; Noncommutative Tori, Yang–Mills and String Theory; Positive Maps on C*-Algebras; Quantum 3-Manifold Invariants; Quantum Entropy; Tomita–Takesaki Modular Theory; von Neumann Algebras: Subfactor Theory.

Further Reading

Connes A (1973) Une classification des facteurs de type III. Annales Scientifiques de l'Ecole Normale Superieure 6: 133–252.
Connes A (1976) Classification of injective factors. Annals of Mathematics 104: 73–115.
Connes A (1994) Non-commutative Geometry. San Diego: Academic Press.
Dixmier J (1981) von Neumann Algebras. Amsterdam: North Holland.
Haagerup U (1987) Connes' bicentralizer problem and uniqueness of the injective factor of type III$_1$. Acta Mathematica 158: 95–148.
Kadison RV and Ringrose JR (1983/1986) Fundamentals of the Theory of Operator Algebras, vol. I–IV. New York: Academic Press.
Krieger W (1970) On the Araki–Woods asymptotic ratio set and non-singular transformations of a measure space. In: Contributions to Ergodic Theory and Probability, pp. 158–177. Lecture Notes in Mathematics, vol. 160, Springer-Verlag.
Krieger W (1976) On ergodic flows and the isomorphism of factors. Mathematische Annalen 223: 19–70.
Murray FJ and von Neumann J (1936) Rings of operators. Annals of Mathematics 37: 116–229.
Murray FJ and von Neumann J (1937) On rings of operators, II. Transactions of the American Mathematical Society 41: 208–248.
Murray FJ and von Neumann J (1943) On rings of operators, IV. Annals of Mathematics 44: 716–808.
Sakai S (1983) C*-algebras and W*-algebras. Berlin–New York: Springer-Verlag.
Sunder VS (1986) An Invitation to von Neumann Algebras. New York: Springer-Verlag.
Takesaki M (1970) Tomita's Theory of Modular Hilbert Algebras and its Applications, Lecture Notes in Mathematics 128, Berlin–Heidelberg–New York: Springer-Verlag.
Takesaki M (1979/2003) Theory of Operator Algebras, vols. I–III. Heidelberg: Springer Verlag.
von Neumann J (1936) On rings of operators, III. Annals of Mathematics 37: 111–115.
von Neumann J (1949) On rings of operators. Reduction theory. Annals of Mathematics 50: 401–485.

von Neumann Algebras: Subfactor Theory


Y Kawahigashi, University of Tokyo, Tokyo, Japan
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Subfactor theory was initiated by Jones (1983) and has experienced rapid progress beyond the framework of operator algebras. Here we start with a basic introduction in this section.

A factor is a von Neumann algebra with a trivial center. A von Neumann algebra $M$ is an algebra of bounded linear operators on a Hilbert space $H$, which contains the identity operator and is closed under the $*$-operation and in the weak operator topology, and its center is the intersection of $M$ and its commutant
$$M' = \{x \in B(H) \mid xy = yx \text{ for all } y \in M\},$$
where $B(H)$ denotes the set of all the bounded linear operators on $H$. (We are mostly interested in separable, infinite-dimensional Hilbert spaces. A von Neumann algebra is automatically closed also in the norm topology and thus it is also a C*-algebra.)

By definition, a factor $M$ acts on a certain Hilbert space $H$, but we also consider its action on another Hilbert space $K$, that is, a $\sigma$-weakly continuous homomorphism preserving the $*$-operation from $M$ into $B(K)$. A subfactor is a factor $N$ which is contained in another factor $M$ and has the same identity. A factor is classified into types I$_n$ ($n = 1, 2, 3, \ldots$), I$_\infty$, II$_1$, II$_\infty$, and III. In most of the interesting studies of subfactors, the two factors are both of type II$_1$ or both of type III. A factor $M$ is said to be of type II$_1$ if it is infinite dimensional and has a finite trace $\mathrm{tr} : M \to \mathbb{C}$. By definition, a finite trace $\mathrm{tr}$ is a linear functional on $M$ satisfying $\mathrm{tr}(1) = 1$, $\mathrm{tr}(xy) = \mathrm{tr}(yx)$ for all $x, y \in M$, and $\mathrm{tr}(x^* x) \ge 0$ for all $x \in M$. When a factor $M$, not isomorphic to $\mathbb{C}$, acts on a separable Hilbert space, it is of type III if and only if for any two nonzero projections $p, q \in M$, we have an operator $v \in M$ with $vv^* = p$ and $v^* v = q$. One obviously cannot have a trace on such a factor. (See Takesaki (2002, 2003) for a general theory on factors.)

Let $M$ be a type II$_1$ factor acting on a Hilbert space $H$. We then have the coupling constant of Murray and von Neumann, which is denoted by $\dim_M H$ and belongs to $(0, \infty]$. This measures the relative dimension of $H$ with respect to $M$. Note that the factor $M$ acts on $M$ itself by the left multiplication. We introduce an inner product on $M$ by $(x, y) = \mathrm{tr}(y^* x)$ and denote the completion by $L^2(M)$. Then $M$ acts on this Hilbert space and we have $\dim_M L^2(M) = 1$.

Let $N \subset M$ be a subfactor and suppose that both $N$ and $M$ are of type II$_1$. (We then simply say that $N \subset M$ is a type II$_1$ subfactor.) Suppose that $M$ acts on a Hilbert space $H$ with $\dim_M H < \infty$. Then we define the Jones index of $N$ in $M$ by
$$[M : N] = \frac{\dim_N H}{\dim_M H} \in [1, \infty].$$
This number is independent of the choice of $H$, as long as $\dim_M H < \infty$, so we can take $H = L^2(M)$, and then we have $[M : N] = \dim_N L^2(M)$. The equality $[M : N] = 1$ means $M = N$. The first major discovery of Jones (1983) is that the value of the Jones index is in the set
$$\{4\cos^2(\pi/m) \mid m = 3, 4, 5, \ldots\} \cup [4, \infty] \qquad [1]$$
and all the values in this set are indeed realized.

Suppose we have a II$_1$ factor $M$ and an action of an at most countable, discrete group $G$ on $M$, that is, a homomorphism $\alpha : G \to \mathrm{Aut}(M)$, where $\mathrm{Aut}(M)$ is the automorphism group of $M$. Then we have a construction $M \rtimes_\alpha G$, called the crossed product. If $\alpha_g$ is not an inner automorphism of $M$ for any $g \in G$ other than the identity element of $G$, then $M \rtimes_\alpha G$ is also a type II$_1$ factor. (An automorphism $\alpha$ of $M$ is said to be inner if it is of the form $\alpha(x) = uxu^*$ for some unitary operator $u \in M$.) The index of a subfactor $M \subset M \rtimes_\alpha G$ is the order of $G$, which can be infinite. If we have a subgroup $H$ of $G$, then we obtain a subfactor $M \rtimes_\alpha H \subset M \rtimes_\alpha G$ and its index is given by the index $[G : H]$ of the subgroup $H$. This analogy to the index of a subgroup is the origin of the terminology of the Jones index for a subfactor. The Jones index is also analogous to the degree of an extension of a field. From the viewpoint of this analogy, subfactor theory can be regarded as a certain generalized analogue (or the "quantum" version) of the classical Galois theory for field extensions. (The direct analog of the classical Galois correspondence for subfactors was studied by Nakamura–Takeda in the early days, and Izumi–Longo–Popa gave the most general form.)

The tools Jones (1983) has introduced to study subfactors are as follows. Let $N \subset M$ be a subfactor of type II$_1$ with finite Jones index. We consider the actions of $N$, $M$ on $L^2(M)$. The completion of $N$ with respect to the inner product given by the trace gives $L^2(N)$, which is naturally regarded as a closed subspace of $L^2(M)$. Let $e_N$ be the projection on $L^2(M)$ onto $L^2(N)$, which is called the Jones

projection. We define $M_1$ to be the von Neumann algebra generated by $M$ and $e_N$ on $L^2(M)$. This is again a type II$_1$ factor and denoted by $M_1$. This construction is called the basic construction. We obtain $[M_1 : M] = [M : N]$. Repeat the same procedure for $M \subset M_1$ acting on $L^2(M_1)$ this time. In this way, we have an increasing sequence of type II$_1$ factors,
$$N \subset M \subset M_1 \subset M_2 \subset M_3 \subset \cdots$$
which is called the Jones tower. We label the corresponding Jones projections as $e_1 = e_N$, $e_2 = e_M$, $e_3 = e_{M_1}, \ldots$. We then have the following celebrated Jones relations:
$$e_j e_k = e_k e_j \ \text{ if } |j - k| > 1, \qquad e_j e_{j \pm 1} e_j = \frac{1}{[M : N]}\, e_j. \qquad [2]$$
Jones proved the above-mentioned restriction on the possible values of the Jones index using these relations. The realization of the index values below 4 in the set [1] by Jones also relies on these relations of the Jones projection. The basic construction is also possible for the other direction. That is, we can construct a subfactor $N_1 \subset N$ so that $N \subset M$ is the basic construction of $N_1 \subset N$. This is called the downward basic construction. This $N_1$ is not unique, but is unique up to an inner automorphism of $N$.

A subfactor $N \subset M$ is said to be irreducible if the relative commutant $N' \cap M$ is equal to $\mathbb{C}$. If a subfactor has Jones index less than 4, then it is automatically irreducible. The original realization of the Jones index values above 4 by Jones was through reducible subfactors. Popa proved that all the values above 4 are realized with irreducible subfactors. A factor is said to be hyperfinite if it has a dense subalgebra given as the union of an increasing sequence of finite-dimensional $*$-algebras. If $M$ is a hyperfinite type II$_1$ factor, then its subfactor is automatically also hyperfinite by a deep theorem of Connes. For hyperfinite, irreducible type II$_1$ subfactors, it is still an open problem to determine all the possible values of the Jones index.

For type II$_1$ factors $N \subset M \subset P$, the Jones index $[P : N]$ is equal to the product $[P : M][M : N]$. Thus for the Jones tower, we have $[M_k : N] = [M : N]^{k+1}$. In general, if a subfactor $N \subset M$ has a finite Jones index, then the relative commutant $N' \cap M$ is automatically finite dimensional. So, if we start with a type II$_1$ subfactor $N \subset M$ with finite Jones index, we have an increasing sequence of finite-dimensional algebras as follows:
$$N' \cap M \subset N' \cap M_1 \subset N' \cap M_2 \subset N' \cap M_3 \subset \cdots \qquad [3]$$
These finite-dimensional algebras are called higher relative commutants of $N \subset M$. We draw the Bratteli diagram for the higher relative commutants as follows. Consider $N' \cap M_k$ (with convention $M_{-1} = N$, $M_0 = M$), then it is a finite-dimensional $*$-algebra; thus, it is of the form $\bigoplus_j M_{n_j}(\mathbb{C})$, where we have only finitely many direct summands. We draw a dot for each summand. We similarly draw a dot for each summand in $\bigoplus_l M_{m_l}(\mathbb{C})$ for $N' \cap M_{k+1}$. Let $\iota$ be the inclusion map from $N' \cap M_k = \bigoplus_j M_{n_j}(\mathbb{C})$ to $N' \cap M_{k+1} = \bigoplus_l M_{m_l}(\mathbb{C})$ and $p_l$ the identity of $M_{m_l}(\mathbb{C})$, which is a projection in $N' \cap M_{k+1}$. We denote by $\iota_{jl}$ the multiplicity of the embedding map $x \mapsto \iota(x) p_l$ from $M_{n_j}(\mathbb{C})$ to $M_{m_l}(\mathbb{C})$. Then we draw $\iota_{jl}$ edges from the $j$th dot for $M_{n_j}(\mathbb{C})$ to the $l$th dot for $M_{m_l}(\mathbb{C})$. We repeat this procedure for all $k$, and get a picture as in Figure 1, which is called the Bratteli diagram of the higher relative commutants of $N \subset M$.

It turns out that the edges connecting the $k$th and $(k + 1)$th steps of the Bratteli diagram consist of the reflection of those connecting the $(k - 1)$th and $k$th steps, and a (possibly empty) new part. The "new" parts taken altogether in the above Bratteli diagram constitute the principal graph of a subfactor $N \subset M$. In the example of Figure 1, the principal graph is the Dynkin diagram $A_5$. In general, a principal graph can be finite or infinite. If it is finite, we say that a subfactor is of finite depth. If a subfactor has the Jones index less than 4, it is automatically of finite depth and the principal graph must be one of the A–D–E Dynkin diagrams.

Pimsner and Popa (1986) obtained the characterization of the Jones index value in terms of the Pimsner–Popa inequality for a conditional expectation. This can be used as a definition of the index for a subfactor of arbitrary type (and even for C*-subalgebras). Kosaki obtained a definition of the index for type III subfactors based on works of Connes and Haagerup.

Figure 1 The Bratteli diagram of the higher relative commutants. (Rows, top to bottom: $N' \cap N$, $N' \cap M$, $N' \cap M_1$, $N' \cap M_2$, $N' \cap M_3$, $N' \cap M_4$.)
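The restriction on the Jones index in the set [1] and the multiplicativity of the index along the Jones tower can be explored numerically. The following is only a small illustrative sketch, not part of Jones's argument: the function names and the floating-point tolerance `tol` are assumptions introduced here for illustration.

```python
import math

def jones_values_below_four(max_m):
    """Return the values 4*cos^2(pi/m) for m = 3, ..., max_m."""
    return [4 * math.cos(math.pi / m) ** 2 for m in range(3, max_m + 1)]

def is_admissible_index(value, tol=1e-9):
    """Test (up to tol) whether value lies in {4cos^2(pi/m) : m >= 3} or in [4, infinity]."""
    if value >= 4 - tol:
        return True
    if value < 1 - tol:
        return False
    # Solve 4*cos^2(pi/m) = value for m, then compare with the nearest integer m.
    m = math.pi / math.acos(math.sqrt(value) / 2)
    nearest = round(m)
    return nearest >= 3 and abs(4 * math.cos(math.pi / nearest) ** 2 - value) < tol

def tower_index(index, k):
    """Index [M_k : N] = [M : N]^(k+1) in the Jones tower (M_0 = M)."""
    return index ** (k + 1)

if __name__ == "__main__":
    for m, v in zip(range(3, 9), jones_values_below_four(8)):
        print(m, round(v, 6))
    print(is_admissible_index(2.5))  # 2.5 is below 4 and not of the form 4cos^2(pi/m)
```

The discrete values start at 1 (for $m = 3$), pass through 2, $(3 + \sqrt{5})/2$ and 3, and accumulate at 4, which is why a tolerance is needed when testing a floating-point candidate.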

Analytic Classification Theory

If $M$ is a hyperfinite type II$_1$ factor, then it is unique up to isomorphism. So any subfactor of such $M$ is isomorphic to $M$ itself. We next consider the classification problem of hyperfinite type II$_1$ subfactors. We say that a subfactor $N \subset M$ is isomorphic to $P \subset Q$ if we have an isomorphism of $M$ onto $Q$ which maps $N$ onto $P$. The following tower of finite-dimensional algebras is a natural invariant for a type II$_1$ subfactor $N \subset M$ with finite Jones index and it is called the standard invariant for $N \subset M$:
$$\begin{array}{ccccccc}
M' \cap M & \subset & M' \cap M_1 & \subset & M' \cap M_2 & \subset & \cdots \\
\cap & & \cap & & \cap & & \\
N' \cap M & \subset & N' \cap M_1 & \subset & N' \cap M_2 & \subset & \cdots
\end{array} \qquad [4]$$
Each square
$$\begin{array}{ccc}
M' \cap M_k & \subset & M' \cap M_{k+1} \\
\cap & & \cap \\
N' \cap M_k & \subset & N' \cap M_{k+1}
\end{array}$$
is a special combination of inclusions called a commuting square. Under a fairly general condition (called extremality of a subfactor, which automatically holds for an irreducible subfactor), the above sequence [4] is anti-isomorphic to the following sequence of finite-dimensional algebras, including the trace values:
$$\begin{array}{ccccccc}
M' \cap M & \subset & N' \cap M & \subset & N_1' \cap M & \subset & \cdots \\
\cap & & \cap & & \cap & & \\
M' \cap M_1 & \subset & N' \cap M_1 & \subset & N_1' \cap M_1 & \subset & \cdots
\end{array} \qquad [5]$$
where $\cdots \subset N_3 \subset N_2 \subset N_1 \subset N \subset M$ is given by repeated downward basic constructions. So, if the closure of $\bigcup_j (N_j' \cap M_1)$ in the weak operator topology is equal to $M_1$ for an appropriate choice of $N_j$'s, then the closure of $\bigcup_j (N_j' \cap M)$ is also $M$, and the isomorphism class of the subfactor $N \subset M$ is recovered from the standard invariant. In such a case, we say that a subfactor has a generating property, and then we have a complete classification of subfactors in terms of the standard invariant. Popa (1994) introduced a notion called strong amenability and proved that a subfactor of type II$_1$ is strongly amenable if and only if it has the generating property. This is the fundamental result in the classification of subfactors. A hyperfinite type II$_1$ subfactor with finite Jones index and finite depth is automatically strongly amenable, so such a subfactor is covered by this classification theorem of Popa. Popa also has a similar result for subfactors of type III.

Constructions and Combinatorial Classification

As mentioned in the above section, Jones constructed hyperfinite type II$_1$ subfactors for all possible index values below 4. They have the Dynkin diagrams $A_n$ as the principal graphs. It has been an important problem to construct new subfactors since then. Using the Hecke algebras, Wenzl constructed a series of subfactors with index values $\sin^2(N\pi/k)/\sin^2(\pi/k)$ with $N = 2, 3, 4, \ldots$, where the series for $N = 2$ coincides with the ones constructed by Jones. Wenzl's dimension estimate in this work for the relative commutant has been an important tool to study subfactors. It was soon noticed that the subfactors of Jones and Wenzl are related to the quantum groups $U_q(sl_N)$ of Drinfel'd–Jimbo, at the value of the deformation parameter $q$ at $\exp(\pi i/k)$. Constructions of subfactors from other quantum groups have been given by Wenzl.

Ocneanu (1988) has introduced a notion of a paragroup and characterized the higher relative commutants arising from a type II$_1$ subfactor with finite Jones index and finite depth as a paragroup. If we start with a subfactor $N \subset N \rtimes_\alpha G$ for a finite group $G$, the corresponding paragroup contains complete information on the group $G$ and its representations. In this sense, a paragroup is a generalization of a (finite) group. The basic idea is to regard the bimodule ${}_N L^2(M)_M$ as an analog of the fundamental representation of a compact Lie group and make finite relative tensor products
$$\cdots \otimes_N L^2(M) \otimes_M L^2(M) \otimes_N L^2(M) \otimes_M \cdots$$
Then one makes an irreducible decomposition and studies various intertwiners arising from these irreducible bimodules. In this way, we obtain a certain combinatorial object and it is called a paragroup. The vertices of the principal graphs correspond to irreducible bimodules and the edges correspond to basis vectors in the intertwiner spaces. Note that by Popa's theorem explained in the previous section, a classification of subfactors of a hyperfinite type II$_1$ factor with finite Jones index and finite depth is reduced to one of paragroups.

Using this theory of paragroups, Ocneanu has found that the Dynkin diagrams $A_n$, $D_{2n}$, $E_6$, and $E_8$ are realized as principal graphs of subfactors, but $D_{2n+1}$ and $E_7$ are not. Furthermore, each of the graphs $A_n$ and $D_{2n}$ has a unique realization and each of $E_6$, $E_8$ has two realizations. At the index value 4, the principal graph must be one of the extended Dynkin diagrams, $A^{(1)}_{2n-1}$, $D^{(1)}_n$, $E^{(1)}_6$, $E^{(1)}_7$, $E^{(1)}_8$, $A_\infty$, $A_{\infty,\infty}$, and $D_\infty$, and all are realized. (The last three correspond to subfactors of infinite depth.)
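The statement that the $N = 2$ case of the Hecke algebra series reproduces the Jones index values follows from $\sin(2\pi/k) = 2\sin(\pi/k)\cos(\pi/k)$, which gives $\sin^2(2\pi/k)/\sin^2(\pi/k) = 4\cos^2(\pi/k)$. A minimal numerical check is sketched below; the function names are illustrative, and the range of $k$ is an assumption chosen only for the demonstration.

```python
import math

def wenzl_index(N, k):
    """Index value sin^2(N*pi/k) / sin^2(pi/k) of the Hecke algebra series."""
    return math.sin(N * math.pi / k) ** 2 / math.sin(math.pi / k) ** 2

def jones_index(m):
    """Jones index value 4*cos^2(pi/m)."""
    return 4 * math.cos(math.pi / m) ** 2

if __name__ == "__main__":
    # Compare the N = 2 series with the Jones values for a few k.
    for k in range(3, 9):
        print(k, round(wenzl_index(2, k), 6), round(jones_index(k), 6))
```

For $N \ge 3$ the same function gives the further index values of the series, for example `wenzl_index(3, 7)`.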

See Evans and Kawahigashi (1998) and Goodman et al. (1989) for these constructions and classifications. Evans–Kawahigashi and Xu studied the orbifold construction of subfactors applied to the Hecke algebra subfactors of Wenzl.

In a theory of integrable lattice models, we have squares with labeled edges, and we assign complex numbers to them. A paragroup has much formal similarity to such a lattice model, and the paragroups of subfactors of Jones and Wenzl correspond to the lattice models of Andrews–Baxter–Forrester.

Goodman–de la Harpe–Jones have another construction of subfactors from the Dynkin diagrams, and for $E_6$ this gives a hyperfinite type II$_1$ subfactor with Jones index $3 + \sqrt{3}$ and finite depth. Haagerup has made a combinatorial study on type II$_1$ subfactors with Jones index values between 4 and $3 + \sqrt{3}$ and obtained a list of candidates of possible higher relative commutants. Haagerup himself showed that one in the list with Jones index $(5 + \sqrt{13})/2$ is indeed realized. Asaeda–Haagerup showed that another in the list having the Jones index $(5 + \sqrt{17})/2$ is also realized. These two examples are still among the most mysterious examples of subfactors today and do not seem to arise from other constructions using quantum groups or conformal field theory. Izumi has another construction of a subfactor with the Jones index $(7 + \sqrt{29})/2$ using an endomorphism of the Cuntz algebra.

Popa has obtained a complete characterization of higher relative commutants including the case of infinite depth, and axiomatized the higher relative commutant as the standard $\lambda$-lattices. Xu has constructed standard $\lambda$-lattices, hence subfactors, from quantum groups. This realization of Popa of a given standard $\lambda$-lattice produces a nonhyperfinite type II$_1$ subfactor. Popa–Shlyakhtenko later showed that any standard $\lambda$-lattice is realized for a subfactor of a single type II$_1$ factor, a group II$_1$ factor arising from the free group $F_\infty$ having countably many generators, which is not hyperfinite.

Jones (1999) has introduced a combinatorial characterization of standard $\lambda$-lattices as planar algebras. This approach uses planar operads based on tangles and provides a new viewpoint on the structure of higher relative commutants. More studies on planar algebras have been done by Bisch–Jones.

Topological Invariants in Three Dimensions and Tensor Categories

Through the relations of the Jones projections, Jones (1985) discovered the Jones polynomial as an invariant for links. This was the beginning of a series of entirely new theories in three-dimensional topology. The Jones polynomial was quickly generalized to the two-variable HOMFLY polynomial by Hoste, Ocneanu, Millett, Freyd, Lickorish, and Yetter.

A three-dimensional topological quantum field theory (TQFT$_3$) assigns a complex number to each closed oriented 3-manifold and a finite-dimensional vector space to each closed oriented surface. Furthermore, to each compact oriented 3-manifold with boundary, it assigns a vector in the vector space corresponding to its boundary. Turaev–Viro have constructed TQFT$_3$ from combinatorial data called quantum 6j-symbols arising from quantum groups. Ocneanu has found that a subfactor of finite index and finite depth also produces quantum 6j-symbols, which give rise to a TQFT$_3$ generalizing the Turaev–Viro construction. See Evans and Kawahigashi (1998) for this construction. Reshetikhin–Turaev have another construction of TQFT$_3$ from a modular tensor category, which is a braided tensor category with nondegenerate braiding. Ocneanu has found a subfactor version of the quantum double construction which produces a modular tensor category from a type II$_1$ subfactor of finite index and finite depth. From a type II$_1$ subfactor of finite index and finite depth, we can apply Ocneanu's generalization of the Turaev–Viro construction on one hand, and also the Reshetikhin–Turaev construction to the modular tensor category arising from the quantum double construction of Ocneanu. The resulting two TQFT$_3$s are shown to be equal by Kawahigashi–Sato–Wakui. Concrete computations of these topological invariants have been made by Sato–Wakui based on Izumi's work. Turaev and Wenzl have other constructions of TQFT$_3$ and modular tensor categories.

Algebraic Quantum Field Theory

An operator algebraic approach to quantum field theory is called algebraic quantum field theory and the standard reference is Haag (1996). In this approach, instead of quantum fields which are operator-valued distributions, we consider a family $\{\mathcal{A}(O)\}$ of von Neumann algebras parametrized by spacetime regions $O$ in a Minkowski space. Each $\mathcal{A}(O)$ is meant to be generated by self-adjoint operators which are observables in $O$. We axiomatize such a family of von Neumann algebras and call one a local net of von Neumann algebras. It is enough to take $O$ of a special form, called a double cone. The name "local" comes from the locality axiom which is a mathematical expression of the Einstein causality on a Minkowski space. The

Poincaré group is used as the spacetime symmetry of the Minkowski space. Doplicher et al. (1971, 1974) have introduced a representation theory of a local net $\mathcal{A}$ of von Neumann algebras and found that a "physically nice" representation is realized as an endomorphism of a single von Neumann algebra $\mathcal{A}(O)$ for some fixed $O$. They have a notion of a statistical dimension for such a representation and it is an integer (or infinite) if the spacetime dimension is larger than 2. Longo (1989, 1990) has shown that this statistical dimension of a representation is equal to the square root of the index $[\mathcal{A}(O) : \rho(\mathcal{A}(O))]$, where $\rho$ is the endomorphism of $\mathcal{A}(O)$ corresponding to the representation. The relation between algebraic quantum field theory and subfactor theory has been found in this way. Longo (1989, 1990) has also started a theory of canonical endomorphisms for a subfactor and Izumi has further studied it. Longo has later obtained a characterization of when an endomorphism of a factor becomes a canonical endomorphism by introducing a Q-system.

Recently, conformal field theory has attracted much attention. An approach based on algebraic quantum field theory describes a conformal field theory with a local net of von Neumann algebras on a two-dimensional Minkowski space with diffeomorphism group as the spacetime symmetry. We can restrict such a theory into a tensor product of two theories on the circle, the compactified one-dimensional Euclidean space. Each theory on the circle is called a chiral conformal field theory and described by a local conformal net of von Neumann algebras, which is a family of von Neumann algebras parametrized by intervals on the circle. The name "conformal" comes from the fact that we use the orientation preserving diffeomorphism group on the circle as the symmetry group of the space. For a local conformal net $\mathcal{A}$ of von Neumann algebras on the circle with natural irreducibility assumption, each von Neumann algebra $\mathcal{A}(I)$ is automatically a type III factor. The Doplicher–Haag–Roberts theory works in this setting after an appropriate adaptation as in Fredenhagen et al. (1989) and each representation of a local conformal net of von Neumann algebras is realized by an endomorphism of $\mathcal{A}(I)$, where $I$ is an arbitrarily fixed interval on the circle. (Here we do not need an assumption that a representation is "physically nice" since it now automatically holds.) Now the representations give a braided tensor category.

Buchholz–Mack–Todorov constructed examples of local conformal nets of von Neumann algebras on the circle using the U(1)-current algebra. Wassermann (1998) has constructed more examples using positive energy representations of the loop groups $LSU(N)$ and computed their representation theory, and his construction has been extended to other Lie groups by Toledano Laredo and others. For the local conformal net $\mathcal{A}$ of von Neumann algebras on the circle arising from $LSU(N)$, we take an endomorphism $\rho$ of $\mathcal{A}(I)$ arising from a representation of the local conformal net, then we have a subfactor $\rho(\mathcal{A}(I)) \subset \mathcal{A}(I)$. This is isomorphic to the type II$_1$ subfactor constructed by Jones and Wenzl tensored with a common type III factor.

Longo–Rehren (1995) started the study of a local net of subfactors, $\mathcal{A}(I) \subset \mathcal{B}(I)$. They have defined a certain induction procedure which gives a representation of the larger local conformal net $\mathcal{B}$ from that of $\mathcal{A}$. This procedure is today called $\alpha$-induction. Xu has studied this procedure and found several basic properties. In the cases of local conformal nets of subfactors arising from conformal embeddings, he has found a simple construction of subfactors with principal graphs $E_6$ and $E_8$ using $\alpha$-induction.

In the context of subfactor theory, $\alpha$-induction has been further studied by Böckenhauer–Evans–Kawahigashi, together with graphical methods of Ocneanu on the Dynkin diagrams. More detailed studies on local conformal nets of factors on the circle have been pursued partly using various techniques of subfactor theory, including classification of local conformal nets of von Neumann algebras on the circle with central charge less than 1 by Kawahigashi–Longo.

See also: Algebraic Approach to Quantum Field Theory; Braided and Modular Tensor Categories; C*-Algebras and Their Classification; Hopf Algebras and q-Deformation Quantum Groups; The Jones Polynomial; Quantum 3-Manifold Invariants; Quantum Entropy; von Neumann Algebras: Introduction, Modular Theory, and Classification Theory; Yang–Baxter Equations.

Further Reading

Doplicher S, Haag R, and Roberts JE (1971) Local observables and particle statistics, I. Communications in Mathematical Physics 23: 199–230.
Doplicher S, Haag R, and Roberts JE (1974) Local observables and particle statistics, II. Communications in Mathematical Physics 35: 49–85.
Evans DE and Kawahigashi Y (1998) Quantum Symmetries on Operator Algebras. Oxford: Oxford University Press.
Fredenhagen K, Rehren K-H, and Schroer B (1989) Superselection sectors with braid group statistics and exchange algebras. Communications in Mathematical Physics 125: 201–226.
Goodman F, de la Harpe P, and Jones VFR (1989) Coxeter Graphs and Towers of Algebras, vol. 14. Berlin: MSRI Publications, Springer.
Haag R (1996) Local Quantum Physics. Berlin: Springer.
Jones VFR (1983) Index for subfactors. Inventiones Mathematicae 72: 1–25.

Jones VFR (1985) A polynomial invariant for knots via von Neumann algebras. Bulletin of the American Mathematical Society 12: 103–112.
Jones VFR (1999) Planar algebras, I, math.QA/9909027.
Longo R (1989) Index of subfactors and statistics of quantum fields I. Communications in Mathematical Physics 126: 217–247.
Longo R (1990) Index of subfactors and statistics of quantum fields II. Communications in Mathematical Physics 130: 285–309.
Longo R and Rehren K-H (1995) Nets of subfactors. Reviews in Mathematical Physics 7: 567–597.
Ocneanu A (1988) Quantized group, string algebras and Galois theory for algebras. In: Evans DE and Takesaki M (eds.) Operator Algebras and Applications, vol. 2 (Warwick 1987), London Mathematical Society Lecture Note Series, vol. 36, pp. 119–172. Cambridge: Cambridge University Press.
Pimsner M and Popa S (1986) Entropy and index for subfactors. Annales Scientifiques de l'Ecole Normale 19: 57–106.
Popa S (1994) Classification of amenable subfactors of type II. Acta Mathematica 172: 163–255.
Popa S (1995) An axiomatization of the lattice of higher relative commutants of a subfactor. Inventiones Mathematicae 120: 427–445.
Takesaki M (2002, 2003) Theory of Operator Algebras I, II, III. Berlin: Springer.
Wassermann A (1998) Operator algebras and conformal field theory III: Fusion of positive energy representations of SU(N) using bounded operators. Inventiones Mathematicae 133: 467–538.

Vortex Dynamics
M Nitsche, University of New Mexico, Albuquerque, NM, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction
A vortex is commonly associated with the rotating motion of fluid around a common centerline. It is defined by the vorticity in the fluid, which measures the rate of local fluid rotation. Typically, the fluid circulates around the vortex, the speed increases as the vortex is approached and the pressure decreases. Vortices arise in nature and technology applications in a large range of sizes, as illustrated by the examples given in Table 1. The next section presents some of the mathematical background necessary to understand vortex formation and evolution. Next, some sample flows are described, including important instabilities and reconnection processes. Finally, some of the numerical methods used to simulate these flows are presented.

Table 1 Sample vortices and typical sizes
Vortex                           Diameter
Superfluid vortices              10^-8 cm (= 1 Å)
Trailing vortex of Boeing 727    1–2 m
Dust devils                      1–10 m
Tornadoes                        10–500 m
Hurricanes                       100–2000 km
Jupiter’s Red Spot               25 000 km
Spiral galaxies                  Thousands of light years

Background
Let D be a region in three-dimensional (3D) space containing a fluid, and let $\mathbf{x} = (x, y, z)^T$ be a point in D. The fluid motion is described by its velocity $\mathbf{u}(\mathbf{x}, t) = u(\mathbf{x}, t)\mathbf{i} + v(\mathbf{x}, t)\mathbf{j} + w(\mathbf{x}, t)\mathbf{k}$, and depends on the fluid density $\rho(\mathbf{x}, t)$, temperature $T(\mathbf{x}, t)$, gravitational field $\mathbf{g}$, and other external forces possibly acting on it. The fluid vorticity is defined by $\mathbf{w} = \nabla \times \mathbf{u}$. The vorticity measures the local fluid rotation about an axis, as can be seen by expanding the velocity near $\mathbf{x} = \mathbf{x}_0$,

$$\mathbf{u}(\mathbf{x}) = \mathbf{u}(\mathbf{x}_0) + D(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + \tfrac{1}{2}\,\mathbf{w}(\mathbf{x}_0) \times (\mathbf{x} - \mathbf{x}_0) + O(|\mathbf{x} - \mathbf{x}_0|^2) \qquad [1]$$

where

$$D(\mathbf{x}_0) = \tfrac{1}{2}\left(\nabla\mathbf{u} + \nabla\mathbf{u}^T\right), \qquad \nabla\mathbf{u} = \begin{bmatrix} u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{bmatrix} \qquad [2]$$

The first term $\mathbf{u}(\mathbf{x}_0)$ corresponds to translation: all fluid particles move with constant velocity $\mathbf{u}(\mathbf{x}_0)$. The second term $D(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)$ corresponds to a strain field in the three directions of the eigenvectors of the symmetric matrix D. If the eigenvalue corresponding to a given eigenvector is positive, the fluid is stretched in that direction, if it is negative, the fluid is compressed. Note that, in incompressible flow, $\nabla \cdot \mathbf{u} = 0$, so the sum of the eigenvalues of D equals zero. Thus, at least one eigenvalue is positive and one negative. If the third eigenvalue is positive, fluid particles move towards sheets (Figure 1a). If the third eigenvalue is negative, fluid particles move towards tubes (Figure 1b). The last term in eqn [1], $\tfrac{1}{2}\mathbf{w}(\mathbf{x}_0) \times (\mathbf{x} - \mathbf{x}_0)$, corresponds to a rotation: near a point with $\mathbf{w}(\mathbf{x}_0) \neq 0$, the fluid rotates with angular velocity $|\mathbf{w}|/2$ in a plane normal to the vorticity vector $\mathbf{w}$. Fluid for which $\mathbf{w} = 0$ is said to be irrotational.
A vortex line is an integral curve of the vorticity. For incompressible flow, $\nabla \cdot \mathbf{w} = \nabla \cdot (\nabla \times \mathbf{u}) = 0$, which implies that vortex lines cannot end in the interior of the flow, but must either form a closed loop
or start and end at a bounding surface. In 2D flow, $\mathbf{u} = u\mathbf{i} + v\mathbf{j}$ and the vorticity is $\mathbf{w} = \omega\mathbf{k}$, where $\omega$ is the scalar vorticity. Thus, in 2D, the vorticity points in the z-direction and the vortex lines are straight lines normal to the x–y plane. A vortex tube is a bundle of vortex lines. The strength of a vortex tube is defined as the circulation $\oint_C \mathbf{u} \cdot d\mathbf{s}$ about a curve C enclosing the tube. By Stokes’ theorem,

$$\oint_C \mathbf{u} \cdot d\mathbf{s} = \iint_A \mathbf{w} \cdot \mathbf{n}\, dS \qquad [3]$$

and thus the circulation can also be interpreted as the flux of vorticity through a cross section of the tube. In inviscid incompressible flow of constant density, Helmholtz’ theorem states that the tube strength is independent of the curve C, and is therefore a well-defined quantity, and Kelvin’s theorem states that a tube’s strength remains constant in time. A vortex filament is an idealization in which a tube is represented by a single vortex line of nonzero strength.

Figure 1 Strain field: (a) two positive eigenvalues, sheet formation and (b) one positive eigenvalue, tube formation.

The evolution equation for the fluid vorticity, as derived from the Navier–Stokes equations, is

$$\frac{d\mathbf{w}}{dt} = \mathbf{w} \cdot \nabla\mathbf{u} + \nu\,\Delta\mathbf{w} \qquad [4]$$

where $d/dt = \partial/\partial t + \mathbf{u} \cdot \nabla$ is the total time derivative. Equation [4] states that the vorticity is transported by the fluid velocity (the convective part of $d/dt$), stretched by the fluid velocity gradient (the term $\mathbf{w} \cdot \nabla\mathbf{u}$), and diffused by viscosity $\nu$ (last term). These equations are usually nondimensionalized and written in terms of the Reynolds number, a dimensionless quantity inversely proportional to viscosity. To understand high Reynolds number flow it is of interest to study the inviscid Euler equations. The corresponding vorticity evolution equation in 2D is

$$\frac{d\omega}{dt} = 0 \qquad [5]$$

which states that 2D vortex filaments in inviscid flow move with the fluid velocity. Furthermore, in incompressible flow, the fluid velocity is determined by the vorticity, up to an irrotational far-field component $\mathbf{u}_\infty$, through the Biot–Savart law,

$$\mathbf{u}(\mathbf{x}) = -\frac{1}{4\pi} \int \frac{(\mathbf{x} - \mathbf{x}') \times \mathbf{w}(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|^3}\, d\mathbf{x}' + \mathbf{u}_\infty \qquad [6]$$

In planar 2D flow, eqn [6] reduces to

$$\mathbf{u}(\mathbf{x}) = \mathbf{K}_{2D} * \omega, \qquad \mathbf{K}_{2D}(\mathbf{x}) = \frac{1}{2\pi}\, \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{|\mathbf{x}|^2} \qquad [7]$$

where $\omega(\mathbf{x})$ is the scalar vorticity and $*$ denotes convolution. Equations [4], [5] and [6], [7] are the basis of the numerical methods discussed later in this article.
A vortex is typically defined by a region in the fluid of concentrated vorticity. A simple model is a point vortex in 2D flow, which corresponds to a straight vortex filament of unit circulation. The associated scalar vorticity is a $\delta$-function in the plane, and the induced velocity is obtained from the Biot–Savart law. For a point vortex at the origin, this reduces to the radially symmetric velocity field $\mathbf{u}(\mathbf{x}) = \mathbf{K}_{2D} * \delta = \mathbf{K}_{2D}(\mathbf{x})$, in which the fluid circulates around the origin. Corresponding particle trajectories are shown in Figure 2a. The particle speed $|\mathbf{u}| = 1/(2\pi r)$ increases unboundedly as the vortex center is approached, and vanishes as $r \to \infty$ (Figure 2b). In general, the far-field velocity of a concentrated vortex behaves similarly to that of a point vortex, with speeds decaying as $1/r$. Near the vortex center, the velocity typically increases in magnitude and, as a result, the fluid pressure decreases (Bernoulli’s theorem). A vortex of arbitrary shape can be approximated by a sum of point vortices (in 2D) or vortex filaments (in 3D), as is often done for simulation purposes.

Figure 2 Flow induced by a point vortex: (a) streamlines and (b) speed $|\mathbf{u}|$ vs. distance r.

Vorticity can be generated by a variety of mechanisms. For example, vorticity can be generated by density gradients, which in turn are induced by spatial temperature variations. This mechanism explains the formation of warm-air vortices when a layer of hot air is trapped underneath cooler air. Vorticity is also generated near solid walls in the form of boundary layers caused by viscosity. To illustrate, imagine
horizontal flow with speed $U_o$ moving past a solid wall at rest (Figure 3a). Since in viscous flow the fluid sticks to the wall (the no-slip boundary condition), the fluid velocity at the wall is zero. As a result, there is a thin layer near the wall in which the horizontal velocity varies greatly while the gradients of the vertical velocity are small, yielding large negative vorticity values $\omega = v_x - u_y$ (Figure 3b). Similarity solutions to the approximating Prandtl boundary-layer equations show that the boundary-layer thickness d grows proportional to $\sqrt{t}$, where t measures the time from the beginning of the motion. Boundary layers can separate from the wall at corners or regions of high curvature and move into the fluid interior, as illustrated in several of the following examples.

Figure 3 Velocity and vorticity in boundary layer near a flat wall.

Sample Vortex Flows

Shear Layers
A shear layer is a thin region of concentrated vorticity across which the tangential velocity component varies greatly. An example is the constant-vorticity layer given by parallel 2D flow $u(x, y) = U(y)$, $v(x, y) = 0$, where U is as shown in Figure 4a. In this case, the velocity is constant outside the layer and linear inside. The vorticity $\omega = -U'(y)$, which follows from $\omega = v_x - u_y$ with $v = 0$, is zero outside the layer and constant inside. Shear layers occur naturally in the ocean or atmosphere when regions of distinct temperature or density meet. To illustrate this scenario, consider a tank containing two horizontal layers of fluids of different densities, one on top of the other. If the tank is tilted, the heavier bottom fluid moves downstream, and the lighter one moves upstream, creating a shear layer.
Flat shear layers are unstable to perturbations: they do not remain flat but roll up into a sequence of vortices. This is the Kelvin–Helmholtz instability, which can be deduced analytically using linear stability analysis. One shows that in a periodically perturbed flat shear layer, the amplitude of a perturbation with wave number k will initially grow exponentially in time as $e^{wt}$, where $w = w(k)$ is the dispersion relation, leading to instability. The wave number of largest growth depends on the layer thickness. This is illustrated in Figure 4b, which plots $w(k)$ for a constant-vorticity layer of thickness 2d. The wave number of maximal growth is proportional to $1/d$.

Figure 4 Shear layer: (a) velocity profile and (b) dispersion relation.

A vortex sheet is a model for a shear layer. The layer is approximated by a surface of zero thickness across which the tangential velocity is discontinuous, as illustrated in Figure 5a. In this case, the dispersion relation reduces to $w(k) = \pm k$. That is, for each wave number k there is a growing and a decaying mode, and the growing mode grows faster the higher the wave number is, as shown in Figure 5b. The vortex sheet arises from a constant vorticity shear layer as the thickness $d \to 0$ and the vorticity $\omega \to \infty$ in such a way that the product $\omega d$ remains constant. Figure 6 shows the roll-up of a periodically perturbed vortex sheet due to the Kelvin–Helmholtz instability, computed using one of the methods described in the next section.

Figure 5 Vortex sheet: (a) velocity profile and (b) dispersion relation.

Figure 6 Computation of vortex sheet roll-up.
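
The dependence of the growth rate on layer thickness described above can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: it takes the piecewise-linear profile with $U = \pm U_0$ outside a layer of thickness 2d and evaluates Rayleigh's classical growth-rate formula, $w(k) = (U_0/2d)\sqrt{e^{-4kd} - (1 - 2kd)^2}$, which is not given in this article. The names `kh_growth_rate`, `U0`, and the sampling grid are hypothetical choices.

```python
import math

def kh_growth_rate(kd, U0=1.0, d=1.0):
    """Growth rate w(k) for a piecewise-linear shear layer of thickness 2d.

    Assumption: Rayleigh's classical formula for the profile U = U0*y/d
    inside |y| < d and U = +-U0 outside; it is not stated in the article.
    Returns 0.0 for wave numbers at which the layer is linearly stable.
    """
    disc = math.exp(-4.0 * kd) - (1.0 - 2.0 * kd) ** 2
    return U0 / (2.0 * d) * math.sqrt(disc) if disc > 0.0 else 0.0

# Sample the dimensionless wave number kd on (0, 1].
n = 100000
grid = [i / n for i in range(1, n + 1)]
rates = [kh_growth_rate(kd) for kd in grid]

w_max = max(rates)                      # largest growth rate (units of U0/d)
kd_max = grid[rates.index(w_max)]       # wave number of maximal growth
kd_cut = max(kd for kd, w in zip(grid, rates) if w > 0.0)  # last unstable kd

print("most unstable kd:", kd_max)
print("cutoff kd:", kd_cut)
print("long-wave ratio w/(U0 k):", kh_growth_rate(1e-3) / 1e-3)
```

Because the formula depends on k only through the product kd, the most unstable wave number scales like $1/d$, in line with the statement above. For long waves the ratio $w/(U_0 k)$ tends to one, which is the linear growth law of the vortex sheet limit.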

Aircraft Trailing Vortices
One can often observe trailing vortices that shed from the wings of a flying aircraft (also called contrails). These vortices are formed because the wing develops lift. The pressure on the top of the wing is lower than on the bottom, causing air to move around the edge of the wing from the bottom surface to the top. The boundary layer on the wing separates as a shear layer that rolls up into a vortex attached to the tip of the wing (Figure 7). Since the velocity inside the vortex is high, the pressure is correspondingly low and causes water vapor in the air to condense, forming water droplets that visualize the vortices. The vortex strength increases with increasing lift, and is particularly strong in high-lift conditions such as take-off and landing. Since lift is proportional to weight, it also increases with the size of the airplane. Vortices of large planes are strong enough to flip a small one if it gets too close. Trailing vortices are the principal reason for the time delay imposed between successive take-offs and landings and are still a serious issue for crowded urban airports.

Figure 7 Sketch. Shear layer separation and roll-up into trailing vortices behind an airfoil.

The trailing vortices can be modeled by a pair of counter-rotating vortex lines (Figure 8a). Two parallel vortex lines of opposite strength induce a downward motion on each other, similar to two point vortices, the zero-core limit. Two point vortices of strength $\pm\Gamma$ at a distance 2d from each other translate with self-induced velocity (Figure 9):

$$U = \frac{\Gamma}{4\pi d} \qquad [8]$$

As a result, trailing vortices near takeoff hit the ground as a strong downwash air current.

Figure 9 Self-induced downward motion of a vortex pair.

Vortex decay results generally from the development of instabilities. Two parallel vortex tubes are subject to the long-wavelength Crow instability. Triggered by turbulence in the surrounding air, or by local variations in air temperature or density, the vortices develop symmetric sinusoidal perturbations with long wavelength, of the same order as the vortex separation (Figure 8b). As the perturbations grow to finite amplitude, the tubes reconnect and produce a sequence of vortex rings. Note that the two-dimensional schematic in Figure 8c does not convey the three-dimensional structure of the rings. The reconnection process destroys the initial wake structure more rapidly than viscous decay of the individual filaments.

Figure 8 Sketch. Onset of Crow instability in a pair of vortex lines and ensuing reconnection.

Of much interest is the study of how to accelerate the vortex decay. High-aspect-ratio vortices are subject to a shorter-wavelength elliptic instability, which leads to earlier destruction. However, such vortices are not realistic in current aircraft wakes. Wing designs have been proposed in which more than two trailing vortices form which interact strongly and lead to faster decay. Other interesting aspects are the effect of ambient turbulence and vortex breakdown. Breakdown refers to a disturbance in the vortex core in which it quickly, within an axial distance of a few core diameters, develops a region of reversed flow and loses its laminar behavior.
Unlike the counter-rotating vortices discussed so far, two equally signed vortices rotate under their self-induced velocity about a common axis. If the separation distance between them is too small, two equally signed patches merge into one. Vortex merging occurs in two- or three-dimensional flows, as opposed to vortex reconnection, which is a strictly three-dimensional phenomenon.
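
The translation speed in eqn [8] follows directly from the point-vortex kernel of eqn [7]: each vortex moves with the velocity induced by the other. The short check below is a sketch under assumptions made here, not part of the article: circulation is taken positive counterclockwise, and the vortex of strength $-\Gamma$ is placed on the left so that the induced motion points downward. The names `k2d` and `pair_velocities` are hypothetical.

```python
import math

def k2d(x, y):
    """Kernel of eqn [7]: velocity at (x, y) due to a unit point vortex at the origin."""
    r2 = x * x + y * y
    return (-y / (2.0 * math.pi * r2), x / (2.0 * math.pi * r2))

def pair_velocities(gamma, d):
    """Velocities of two point vortices a distance 2d apart.

    Assumed layout: strength -gamma at (-d, 0) and +gamma at (+d, 0),
    with positive circulation meaning counterclockwise rotation.
    """
    left, right = (-d, 0.0), (d, 0.0)
    g_left, g_right = -gamma, gamma
    # The left vortex is advected by the right one, and vice versa.
    kx, ky = k2d(left[0] - right[0], left[1] - right[1])
    v_left = (g_right * kx, g_right * ky)
    kx, ky = k2d(right[0] - left[0], right[1] - left[1])
    v_right = (g_left * kx, g_left * ky)
    return v_left, v_right

gamma, d = 2.0, 0.5
v_left, v_right = pair_velocities(gamma, d)
U = gamma / (4.0 * math.pi * d)   # eqn [8]

print("left vortex velocity: ", v_left)
print("right vortex velocity:", v_right)
print("speed from eqn [8]:   ", U)
```

Both vortices receive the same vertical velocity of magnitude U and no horizontal velocity, so the pair translates rigidly without changing its separation.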
Vortex Rings
A vortex tube that forms a closed loop is called a vortex ring. Vortex rings can be formed by ejecting fluid from a circular opening, such as when a smoke ring is formed. The boundary layer wall vorticity separates at the opening as a cylindrical shear layer that rolls up at its edge into a ring (Figure 10). The vorticity is concentrated in a core, which may be thin or thick relative to the ring diameter. The limiting cases are an infinitely thin circular filament of nonzero circulation and the Hill’s vortex, in which the vorticity occupies all the interior of a sphere.

Figure 10 Vortex ring, formed by ejecting fluid from a circular tube.

Just like a counter-rotating vortex pair, a ring translates under its self-induced velocity U in the direction normal to the plane of the ring (Figure 11). However, unlike the vortex pair, the ring velocity depends significantly on its core thickness. For a ring with radius, circulation and core size, respectively, $R$, $\Gamma$, $a$, the self-induced velocity is

$$U \approx \frac{\Gamma}{4\pi R} \left( \log\frac{8R}{a} - \frac{1}{4} \right) \qquad [9]$$

asymptotically as $a \to 0$. Thus, the translation velocity becomes unbounded for rings with decreasing core size. In reality, at some point viscosity takes over and spreads the core vorticity, slowing the ring down.

Figure 11 Self-induced motion of a vortex ring.

Vortex rings of small cross section are subject to an azimuthal instability. Theory, experiment, and simulations show that if a ring is perturbed in the azimuthal direction, there exists a dominant wave number which is unstable and grows (Figure 12). The unstable wave number increases as the core size decreases, while its spatial amplification rate is almost independent of the core size.

Figure 12 Sketch. Onset of azimuthal vortex ring instability.

Interesting dynamics are obtained when two or more rings interact. Two coaxial vortex rings of equally signed circulation move in the same direction: the rear ring causes the front ring to grow in radius and the front ring causes the rear one to decrease. From eqn [9] it can be seen that, apart from the slowly varying logarithmic factor, the ring velocity is inversely proportional to its radius. Consequently, the front ring slows down and the rear ring speeds up, until the rear ring travels through the front ring. This process repeats itself and is known as leap-frogging. On the other hand, two coaxial vortex rings of oppositely signed circulation approach each other and grow in radius. Their cores contract in order to preserve volume, and their vorticity increases in order to preserve circulation. Under certain experimental conditions, the azimuthal instability develops, the resulting waves on opposite rings reconnect and a sequence of smaller rings forms.

Vortices, Mixing, and Chaos
Mixing is important in many natural processes and technological applications. For example, mixing in shear flows and wakes is relevant to aeronautics and combustion, mixing and diffusion determine chemical reaction rates, and mixing of contaminants pollutes oceans and atmosphere. It is therefore important to understand and control mixing processes.
Efficient mixing of two fluids is obtained by efficient stretching and folding of material lines.
Stretching and folding in turn are the fingerprint of chaos; thus, mixing and chaos are intimately related. Mixing and associated chaotic fluid motion can be obtained by simple vortical motion. For example, two counter-rotating vortices subject to a periodic strain field oscillate in a regular fashion but induce chaos in a region of fluid moving with them. Similarly, two corotating vortices of equal strength that are turned on and off periodically so that one is on when the other is off, known as the blinking vortices, rotate around a common axis in a stepwise manner but induce chaos in nearby regions. On the other hand, if there are four or more vortices present, the vortex motion itself is generally chaotic. It should be noted that there are also nonchaotic equilibrium solutions of four or more vortices forming what is called a vortex crystal.
Information about chaotic particle motion is obtained by studying Poincaré sections, examining the associated stable and unstable manifolds, and investigating the existence of chaotic maps such as the horseshoe map.

Atmospheric Vortices
Atmospheric vortices are driven by temperature gradients, Earth’s rotation (Coriolis force), spatial landscape variations, and instabilities. For example, temperature differences between the equator and the poles and Earth’s rotation lead to large-scale vortices such as the trade winds (Hadley cell), the jet streams, and the polar vortex (Figure 13). Semiannual temperature oscillations are responsible for the Indian monsoons. Daily oscillations cause land- and sea-breezes. Landscape variations can cause urban–rural wind flows and mountain–valley circulations.

Figure 13 Vortices in the atmosphere (labels: polar vortex, polar front, polar jet stream, Ferrel cell, subtropical jet stream, Hadley cell).

Instabilities are often responsible for large cyclonic vortices. Barotropic instability results from large horizontal velocity gradients, and has been deemed responsible for disturbances over the Sahara region that occasionally intensify into tropical cyclones. Baroclinic instability, which occurs when temperature advection is superposed on a velocity field, can lead to cyclonic vortices at the front between air of polar origin and that of tropical origin. The inertial or centrifugal instability occurs when air flows around high-pressure systems and the pressure gradient force is not large enough to balance the centripetal acceleration and the Coriolis effect.
Vortices also form on other planets with an atmosphere. On Mars, dust devils are quite common. They are 10–50 times larger than the ones on Earth and can carry high-voltage electric fields caused by the rubbing of dust grains against each other. Jupiter’s characteristic spots are extremely large storm vortices. The Great Red Spot is a vortex spanning twice the diameter of the Earth. Unlike the low-pressure terrestrial storms and hurricanes, the Great Red Spot is a high-pressure system that has been stable for more than 300 years. Other vortices on Jupiter decay and vanish, such as the White Ovals, three large anticyclones which merged into one within two years. Recent computer simulations predict that many of Jupiter’s vortices will merge and disappear in the next decade. As a result, mixing of heat across zones will decay and the planet’s temperature is predicted to increase.
Numerical simulations of the atmosphere are expensive due to the large number of parameters and the relatively small scales that need to be resolved. For climate models and medium-range forecast models, the governing 3D compressible Euler equations are simplified using the hydrostatic approximation (in which only the pressure gradient and the gravitational forces are retained in the vertical-momentum equation) and the anelastic approximation (in which $d\rho/dt$ is neglected), to obtain the primitive equations. Additional vertical averaging yields the shallow-water equations. One big hurdle is to accurately incorporate the effect of clouds, which is significant and is usually treated using subgrid models.

Vortices in Superfluids and Superconductors
At temperatures below 2.2 K, liquid helium is a superfluid, meaning that it acts essentially like a fluid with zero viscosity governed by the Euler equations. The fluid is irrotational, except for extremely thin vortex filaments, which are formed by quantum-mechanical processes. Since the vortices cannot end in the interior of the flow, they can be generated only at the surface or they nucleate as
vortex rings inside the fluid. As an example, if a cylindrical container with helium is rotated sufficiently fast, vortex lines attached to both ends of the container appear. These quantum vortices have discrete values of circulation ($\Gamma = nh/m$, where h = Planck’s constant, m = mass of helium atom, n = integer), core sizes of about 1 Å (roughly the diameter of a single hydrogen atom) and move without viscosity.
Similarly, certain types of materials lose their electric resistance at low temperatures and become superconductors. One distinguishes type-I superconductors (most pure metals) from type-II superconductors (alloys). Using the Ginzburg–Landau theory it has been predicted that in type-II superconductors a lattice of vortex filaments forms, each carrying a quantized amount of magnetic flux. This was subsequently confirmed by experimental observation. More precisely, for temperatures T below a critical value $T_c$, there are three regions corresponding to increasing values of the magnetic field (Figure 14). At low magnetic fields ($H < H_{c1}$), no vortices exist (superconducting phase). At intermediate values ($H_{c1} < H < H_{c2}$), the magnetic field penetrates the superconductor in the form of quantized vortices, also called flux lines (mixed phase). The values $H_{c1}$ and $H_{c2}$ are determined by the London penetration depth $\lambda$, which measures the electromagnetic response of the superconductor. With increasing magnetic field, the density of flux lines increases until the vortex cores overlap when the upper critical field $H_{c2}$ is reached, beyond which one recovers the normal metallic state (normal conductor).

Figure 14 Superconductor phase dependence on magnetic field H and temperature T (regions: superconductor with no vortices below $H_{c1}$, mixed phase between $H_{c1}$ and $H_{c2}$, normal conductor above $H_{c2}$).

When an external current density $\mathbf{j}$ is applied to the vortex system, the flux lines start to move under the action of the Lorentz force. As a result, a dissipating electric field $\mathbf{E}$ appears that is parallel to $\mathbf{j}$, and the superconducting property of dissipation-free current flow is lost. In order to recover the desired property of dissipation-free flow, flux lines have to be pinned, for example, by introducing inhomogeneities and structural defects. For a given pinning force, flux lines remain pinned as long as the current density stays below a critical value. A major research objective is to optimize the pinning force in order to preserve superconductivity at larger current densities.

Numerical Vortex Methods
Many numerical methods used to compute fluid flow are Eulerian schemes based on a fixed mesh, such as finite difference, finite element, and spectral methods, commonly used for example in atmosphere and ocean modeling. This section briefly describes alternative vorticity-tracking methods used to simulate incompressible inviscid vortex flows, and concludes with some extensions to viscous flows. The premise of these methods is that since the fluid velocity is determined by the vorticity through the Biot–Savart law (eqn [6]), it suffices to track only that portion of the fluid carrying nonzero vorticity. This region is often much smaller than the total fluid volume, and computational efficiency is gained. Numerical vortex methods are typically Lagrangian, that is, the computational elements move with the fluid velocity.

Point-Vortex Approximation in 2D
To compute the evolution of a vorticity distribution $\omega(\mathbf{x}, t)$ in 2D, the simplest approach is to approximate the vorticity by a set of point vortices at $\mathbf{x}_j(t)$ with circulation $\Gamma_j$ and evolve them under their self-induced motion. The values $\Gamma_j$ are an estimate of the initial circulation around $\mathbf{x}_j(0)$. The vortex positions $\mathbf{x}_j(t)$ evolve in the induced velocity field

$$\frac{d\mathbf{x}_j}{dt} = \sum_{\substack{k=1 \\ k \neq j}}^{N} \Gamma_k\, \mathbf{K}_{2D}(\mathbf{x}_j - \mathbf{x}_k) \qquad [10]$$

where the exclusion $k \neq j$ accounts for the fact that a point vortex induces zero velocity on itself. The solution to the system of ordinary differential equations [10] can be obtained using any standard time-stepping method, such as Runge–Kutta or Adams–Bashforth.
The point-vortex approximation can be written in Hamiltonian form as

$$\frac{dx_j}{dt} = \frac{1}{\Gamma_j} \frac{\partial H}{\partial y_j}, \qquad \frac{dy_j}{dt} = -\frac{1}{\Gamma_j} \frac{\partial H}{\partial x_j} \qquad [11]$$
where the Hamiltonian

$$H(\mathbf{x}, \mathbf{y}) = -\frac{1}{4\pi} \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k > j}}^{N} \Gamma_j \Gamma_k \log\left[ (x_j - x_k)^2 + (y_j - y_k)^2 \right] \qquad [12]$$

is conserved along the motion of the vortices, $dH/dt = 0$. The method also conserves the fluid circulation and the linear and angular momenta.
Ideally, the solution to [10] should converge as $N \to \infty$ to the solution of the Euler equations. This is true for smooth vorticity distributions, but for singular distributions such as a vortex sheet, the situation is more complicated. The vortex sheet, a curve in the plane, develops a singularity in finite time at which the curvature becomes unbounded at a point. The point-vortex approximation converges before the singularity formation time, provided the growth of spurious roundoff error due to Kelvin–Helmholtz instability is suppressed using a filter. However, past the singularity formation time, the point-vortex approximation no longer converges.
The general remedy is to replace the singular kernel $\mathbf{K}_{2D}$ by a regularization $\mathbf{K}_{2D}^{\delta}$, such as

$$\mathbf{K}_{2D}^{\delta} = \frac{1}{2\pi}\, \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{|\mathbf{x}|^2 + \delta^2} \qquad [13a]$$

$$\mathbf{K}_{2D}^{\delta} = \frac{1}{2\pi}\, \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{|\mathbf{x}|^2} \left( 1 - e^{-|\mathbf{x}|^2/\delta^2} \right) \qquad [13b]$$

where $\delta$ is a numerical parameter. The regularization amounts to replacing the $\delta$-function vorticity of a point vortex by an approximate $\delta$-function. In order to recover the solution to the Euler equations, it is necessary to study the limit $N \to \infty$, $\delta \to 0$. For smooth vorticity distributions, this process converges. For vortex sheet initial data, there is evidence of convergence, but details of the limiting behavior remain under investigation. Regularized solutions with fixed value $\delta$ and vortex sheet initial data are shown in Figures 6 and 15. Figure 6 shows the onset of the Kelvin–Helmholtz instability in a periodically perturbed flat vortex sheet. Figure 15 shows the rollup of an elliptically loaded flat vortex sheet that models the evolution of an aircraft wake (see Figure 7). The correspondence between the two-dimensional simulation and the three-dimensional wake is made by replacing the spatial coordinate in the aircraft’s line of flight by a time coordinate.

Figure 15 Computed evolution of an elliptically loaded flat vortex sheet (panels at t = 0, t = 20, and t = 60).

Contour Dynamics in 2D
Consider a planar patch of constant vorticity $\omega_o$ bounded by a curve $\mathbf{x}(s, t)$, $0 \le s \le L$, moving in inviscid, incompressible flow. In view of Kelvin’s theorem and eqn [5], the vorticity in the patch remains constant and equal to $\omega_o$ for all time, and the patch area remains constant. Only the patch boundary moves. The velocity at a point $\mathbf{x}$ on the boundary can be written as a line integral over the boundary C, traversed counterclockwise:

$$\frac{d\mathbf{x}}{dt} = -\frac{\omega_o}{2\pi} \oint_C \log|\mathbf{x} - \mathbf{x}(s, t)|\, \frac{\partial \mathbf{x}}{\partial s}\, ds \qquad [14]$$

The contour dynamics method consists of approximating a given vorticity distribution by a superposition of vortex patches, and moving their boundaries according to eqn [14]. This method has been applied to compute the evolution of single-vortex patches and shear layers, and to geophysical flows. Typically, filamentation occurs: the patch develops thin filaments which increase the boundary length significantly and thereby the computational expense. The approach generally taken is to remove the thin filaments at several times throughout the computation, which is referred to as contour surgery. The contour dynamics approach as well as the point-vortex approximation have also been generalized to treat quasigeostrophic flows.
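
Returning to the point-vortex approximation, the evolution law of eqn [10] with the regularized kernel of eqn [13a] can be sketched in a few lines. The code below is a minimal illustration and not the implementation behind Figures 6 or 15: the function names `velocity`, `rk4_step`, and `simulate`, the classical fourth-order Runge–Kutta scheme, the step size, and the two-vortex test case are all assumptions made here. Setting `delta = 0` recovers the singular kernel of eqn [7].

```python
import math

def velocity(pos, gam, delta=0.0):
    """Right-hand side of eqn [10] using the kernel of eqn [13a]."""
    vel = []
    for j in range(len(pos)):
        ux = uy = 0.0
        for k in range(len(pos)):
            if k == j:
                continue  # a point vortex induces no velocity on itself
            dx = pos[j][0] - pos[k][0]
            dy = pos[j][1] - pos[k][1]
            denom = 2.0 * math.pi * (dx * dx + dy * dy + delta * delta)
            ux += -gam[k] * dy / denom
            uy += gam[k] * dx / denom
        vel.append((ux, uy))
    return vel

def rk4_step(pos, gam, dt, delta=0.0):
    """Advance all vortex positions by one classical Runge-Kutta step."""
    def shifted(p, v, h):
        return [(p[i][0] + h * v[i][0], p[i][1] + h * v[i][1]) for i in range(len(p))]
    k1 = velocity(pos, gam, delta)
    k2 = velocity(shifted(pos, k1, 0.5 * dt), gam, delta)
    k3 = velocity(shifted(pos, k2, 0.5 * dt), gam, delta)
    k4 = velocity(shifted(pos, k3, dt), gam, delta)
    return [(pos[i][0] + dt / 6.0 * (k1[i][0] + 2 * k2[i][0] + 2 * k3[i][0] + k4[i][0]),
             pos[i][1] + dt / 6.0 * (k1[i][1] + 2 * k2[i][1] + 2 * k3[i][1] + k4[i][1]))
            for i in range(len(pos))]

def simulate(pos, gam, dt, steps, delta=0.0):
    """Integrate the point-vortex system for a fixed number of steps."""
    for _ in range(steps):
        pos = rk4_step(pos, gam, dt, delta)
    return pos

# Test case: two equally signed vortices of unit circulation, a distance 1 apart.
gam = [1.0, 1.0]
start = [(-0.5, 0.0), (0.5, 0.0)]
end = simulate(start, gam, dt=0.01, steps=1000)
separation = math.hypot(end[0][0] - end[1][0], end[0][1] - end[1][1])

print("final positions:", end)
print("final separation:", separation)
```

In this test case the two vortices rotate about their common center while their separation stays fixed, which gives a quick consistency check on the time stepping. For vortex sheet data one would pass a positive `delta` and a much larger set of vortices; the double loop costs $O(N^2)$ operations per velocity evaluation, so this direct summation is only practical for moderate N.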

Vortex Filament Methods in 3D

Vortex simulations in 3D differ from those in 2D in that the stretching term in eqn [4] needs to be incorporated. The vortex filament method approximates the fluid vorticity by a finite number of filaments whose circulation remains constant in time. Each filament is marked by computational mesh points which move with the regularized induced velocity. The regularization is necessary to prevent the infinite self-induced velocities of curved vortex filaments. As in 2D, this method automatically conserves circulation. Vorticity stretching is accounted for by the stretching between computational mesh points. As the filament length increases, more mesh points are typically introduced to keep it resolved. Also, the number of filaments can be increased throughout the simulation to maintain resolution.

Viscous Vortex Methods

While inviscid models are expected to approximate fluids of small viscosity well far from boundaries, it is important to incorporate the effects of viscosity near boundaries, where vortex shedding is an inherently viscous mechanism. The first methods to do so used operator splitting, in which the inviscid and viscous terms of the Navier–Stokes equations were solved sequentially. In each time step, the computational elements would first be convected, and then they would be diffused by a random-walk scheme. The particle strength exchange method, introduced more recently, does not rely on operator splitting and has better accuracy. The particle position and vorticity evolve simultaneously, and viscous diffusion is accounted for in a consistent manner.

Vortex dynamics continues to be a source of interesting problems of theoretical and practical importance. In particular, much remains to be learned to better understand turbulence and the transition to turbulence, a process dominated by deterministic vortex dynamics.

Further Remarks

Finally, some remarks on relevant literature on this subject are in order. Lugt (1983) and Tritton (1988) are recommended as elementary introductions to vortex flows. van Dyke (1982) presents beautiful and instructive flow visualizations. Comprehensive treatments of incompressible fluid dynamics are given in Batchelor (1967), Chorin and Marsden (1992), Lamb (1932), and Saffman (1992), and compressible flow is treated in Anderson (1990). Cottet and Koumoutsakos (2000) give an overview of numerical vortex methods.

Special topics have also been addressed: the atmosphere (Andrews et al. 1987), point vortex motion and chaos (Aref 1983, Newton 2001, Ottino 1989), superfluids and superconductors (Blatter et al. 1994, Donnelly 1991), turbulence theory using statistical mechanics (Chorin 1994), vortex reconnection (Kida and Takaoka 1994), theory for Euler and Navier–Stokes equations (Majda and Bertozzi 2002), contour dynamics (Pullin 1992), vortex rings (Shariff and Leonard 1992), and aircraft trailing vortices (Spalart 1998). Green (1995) includes survey articles on various topics.

Nomenclature

$a$    vortex ring core size
$g$    gravitational field
$H$    Hamiltonian
$K_{2D}$    singular velocity kernel
$K_{2D,\delta}$    regularized velocity kernel
$\rho(x, t)$    fluid density
$R$    vortex ring radius
$T(x, t)$    temperature
$U$    translation velocity
$u(x, t) = u(x, t)\,i + v(x, t)\,j + w(x, t)\,k$    fluid velocity
$w(k)$    dispersion relation
$w = \nabla \times u$    vorticity
$\omega = v_x - u_y$    scalar vorticity
$\Gamma$    ring circulation

See also: Abelian Higgs Vortices; Incompressible Euler Equations: Mathematical Theory; Integrable Systems: Overview; Interfaces and Multicomponent Fluids; Intermittency in Turbulence; Newtonian Fluids and Thermohydraulics; Point-Vortex Dynamics; Stochastic Hydrodynamics; Superfluids; Topological Knot Theory and Macroscopic Physics; Turbulence Theories.

Further Reading

Anderson JD (1990) Modern Compressible Flow with Historical Perspective, 2nd edn. New York: McGraw-Hill.
Andrews DG, Holton JR, and Leovy CB (1987) Middle Atmosphere Dynamics. Orlando: Academic Press.
Aref H (1983) Integrable, chaotic, and turbulent vortex motion in two-dimensional flows. Annual Review of Fluid Mechanics 15: 345–389.
Batchelor GK (1967) An Introduction to Fluid Dynamics. Cambridge: Cambridge University Press.
Blatter G, Feigel'man MV, Geshkenbein VB, Larkin AI, and Vinokur VM (1994) Vortices in high-temperature superconductors. Reviews of Modern Physics 66(4): 1125–1388.
Chorin AJ (1994) Vorticity and Turbulence. New York: Springer.
Chorin AJ and Marsden JE (1992) A Mathematical Introduction to Fluid Mechanics, 3rd edn. New York: Springer.
Cottet G-H and Koumoutsakos PD (2000) Vortex Methods: Theory and Practice. Cambridge: Cambridge University Press.
Donnelly RJ (1991) Quantized Vortices in Helium II. Cambridge: Cambridge University Press.
Green SI (ed.) (1995) Fluid Vortices. Dordrecht: Kluwer Academic.
Kida S and Takaoka M (1994) Vortex reconnection. Annual Review of Fluid Mechanics 26: 169–189.
Lamb H (1932) Hydrodynamics, 6th edn. New York: Dover.
Lugt HJ (1983) Vortex Flow in Nature and Technology. New York: Wiley.
Majda AJ and Bertozzi AL (2002) Vorticity and Incompressible Flow. Cambridge: Cambridge University Press.
Newton PK (2001) The N-Vortex Problem: Analytical Techniques. New York: Springer.
Ottino JM (1989) The Kinematics of Mixing: Stretching, Chaos, and Transport. Cambridge: Cambridge University Press.
Pullin DI (1992) Contour dynamics methods. Annual Review of Fluid Mechanics 24: 89–115.
Saffman PG (1992) Vortex Dynamics. Cambridge: Cambridge University Press.
Shariff K and Leonard A (1992) Vortex rings. Annual Review of Fluid Mechanics 24: 235–279.
Spalart PR (1998) Airplane trailing vortices. Annual Review of Fluid Mechanics 30: 107–138.
Tritton DJ (1988) Physical Fluid Dynamics, 2nd edn. Oxford: Clarendon Press.
van Dyke M (1982) Album of Fluid Motion. Stanford: The Parabolic Press.

Vortices see Abelian Higgs Vortices; Point-Vortex Dynamics


W
Wave Equations and Diffraction
M E Taylor, University of North Carolina, Chapel Hill, NC, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The most basic wave equation is

$$\frac{\partial^2 u}{\partial t^2} - \Delta u = 0 \qquad [1]$$

for $u = u(t, x)$, where $\Delta$ is the Laplace operator, given by $\Delta u = \partial^2 u/\partial x_1^2 + \cdots + \partial^2 u/\partial x_n^2$ on n-dimensional Euclidean space $\mathbb{R}^n$. More generally, u might be defined on $\mathbb{R} \times M$, where $\mathbb{R}$ is the t-axis and M is a Riemannian manifold, with a metric tensor given in local coordinates by $(g_{jk})$. Then the Laplace–Beltrami operator is given, in local coordinates, by

$$\Delta u = g^{-1/2} \sum_{j,k} \frac{\partial}{\partial x_j}\left( g^{1/2} g^{jk} \frac{\partial u}{\partial x_k} \right) \qquad [2]$$

where $(g^{jk})$ is the matrix inverse to $(g_{jk})$ and $g = \det(g_{jk})$. Even if one concentrates on wave propagation in Euclidean space, one frequently wants to use curvilinear coordinates, and [2] is useful. Equation [1] is supplemented by initial conditions of the form

$$u(0, x) = f(x), \qquad \partial_t u(0, x) = g(x) \qquad [3]$$

called Cauchy data. If the spatial domain M has a boundary $\partial M$ (e.g., if M is a bounded region in $\mathbb{R}^n$), then boundary conditions are imposed. The most common are the Dirichlet boundary condition

$$u(t, x) = 0 \quad \text{for } x \in \partial M \qquad [4]$$

and the Neumann boundary condition

$$\partial_\nu u(t, x) = 0 \quad \text{for } x \in \partial M \qquad [5]$$

where $\partial_\nu u$ denotes the normal derivative of u at the boundary. More generally, one might have a driving force, and replace 0 on the right-hand side of [1] by a function $F(t, x)$. Similarly, one can consider nonzero boundary data in [4] and [5].

The wave equation [1] models a number of physical phenomena, at least in the linear approximation. The vibration of a drum head is modeled by [1], with M a planar domain, and with the Dirichlet boundary condition [4]. The motion of sound waves in a room with hard walls is modeled by [1], with M a region in $\mathbb{R}^3$, and with the Neumann boundary condition [5]. The propagation of electromagnetic waves is given by Maxwell's equations:

$$\begin{aligned} \frac{\partial E}{\partial t} - \operatorname{curl} B &= -J \\ \frac{\partial B}{\partial t} + \operatorname{curl} E &= 0 \\ \operatorname{div} E &= \rho \\ \operatorname{div} B &= 0 \end{aligned} \qquad [6]$$

where $\rho$ is the electric charge density and J the current. These equations yield [1] (with the right-hand side replaced by some function $F(t, x)$ if J and $\rho$ are not zero) for the components of the electric field E and the magnetic field B. If the propagation is in a region M in $\mathbb{R}^3$ bounded by a perfect conductor, then the boundary conditions are that E is normal to $\partial M$ and B is tangential to $\partial M$. If $\partial M$ is flat, these equations can be decomposed into Dirichlet problems for some components and Neumann problems for the rest, but if $\partial M$ is curved such a decomposition is not possible.

Other models of vibrating objects produce variants of [1]. Examples include vibrating elastic solids, yielding an equation like [1] with $\Delta u$ replaced by $\mu \Delta u + (\lambda + \mu)\,\operatorname{grad} \operatorname{div} u$, for linear elasticity. Here $\Delta$ acts componentwise on u, and $\lambda$ and $\mu$ are constants, called Lamé constants. Other examples model vibrations of crystals and propagation of electromagnetic waves in crystals. Further interesting phenomena arise in these various cases, such as Rayleigh waves in linear elasticity and conical refraction in crystal optics.

Here we discuss the propagation of waves and their reflection and diffraction at boundaries. In the interest of providing reasonable coverage in a brief space, we restrict attention to the wave equation [1].
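
The introduction notes that formula [2] is useful for curvilinear coordinates in Euclidean space. As a small numerical illustration, the sketch below evaluates [2] for plane polar coordinates, where the metric is diag(1, r^2), so that g^{1/2} = r and the inverse metric is diag(1, 1/r^2). The function names, the finite-difference step `h`, and the two test functions are assumptions made for this example only; they do not come from the article.

```python
import math


def laplace_beltrami_polar(u, r, theta, h=1e-3):
    """Approximate formula [2] for the polar metric diag(1, r^2).

    With g^{1/2} = r and inverse metric diag(1, 1/r^2), [2] becomes
        (1/r) d/dr (r du/dr) + (1/r^2) d^2u/dtheta^2,
    which is discretized here with central differences of step h.
    """
    # Radial part: the flux g^{1/2} g^{rr} du/dr = r du/dr at r +/- h/2.
    flux_plus = (r + h / 2) * (u(r + h, theta) - u(r, theta)) / h
    flux_minus = (r - h / 2) * (u(r, theta) - u(r - h, theta)) / h
    radial = (flux_plus - flux_minus) / (h * r)
    # Angular part: g^{1/2} g^{theta theta} = 1/r is independent of theta.
    second_diff = u(r, theta + h) - 2 * u(r, theta) + u(r, theta - h)
    angular = second_diff / (h * h * r * r)
    return radial + angular


def harmonic(r, theta):
    # x^2 - y^2 written in polar coordinates; its Euclidean Laplacian is 0.
    return r * r * math.cos(2 * theta)


def radial_square(r, theta):
    # x^2 + y^2 written in polar coordinates; its Euclidean Laplacian is 4.
    return r * r


if __name__ == "__main__":
    print(laplace_beltrami_polar(harmonic, 1.3, 0.7))
    print(laplace_beltrami_polar(radial_square, 1.3, 0.7))
```

The two printed values should be close to 0 and 4, the Cartesian Laplacians of x^2 - y^2 and x^2 + y^2, which is the sense in which [2] reproduces the Euclidean Laplace operator in other coordinates.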

Basic Propagation Phenomena

The simplest examples of waves propagating according to [1] are plane waves, of the form

$$u(t, x) = \varphi(x \cdot \omega - t) \qquad [7]$$

with $(t, x) \in \mathbb{R} \times \mathbb{R}^n$, $\omega$ a unit vector in $\mathbb{R}^n$, and $\varphi$ a function on $\mathbb{R}$. If $\varphi$ has two continuous derivatives, [7] defines a classical solution of [1]. More generally, one can allow $\varphi$ to be less regular. For example, it could be piecewise smooth with a jump discontinuity at some point $a \in \mathbb{R}$. In such a case, u will be piecewise smooth with a jump across the n-dimensional surface $x \cdot \omega - t = a$ in $\mathbb{R} \times \mathbb{R}^n$, and will solve [1] in a weak, or distributional, sense. For fixed t, $u(t, \cdot)$ has a jump across the (n − 1)-dimensional surface $\Sigma_t = \{x \in \mathbb{R}^n : x \cdot \omega = t + a\}$. As t varies, $\Sigma_t$ moves in the direction $\omega$ with unit speed.

There are also spherical wave solutions to [1] on $\mathbb{R} \times \mathbb{R}^n$, such as

$$u(t, x) = \frac{\operatorname{sgn} t}{2\pi}\,(t^2 - |x|^2)_+^{-1/2} \qquad [8]$$

for n = 2, and

$$u(t, x) = \frac{1}{4\pi t}\,\delta(|x| - |t|) \qquad [9]$$

for n = 3. Here $s_+ = s$ for $s > 0$, $s_+ = 0$ for $s \le 0$, and $\delta(s)$ is the Dirac delta function. In fact, [8] and [9] are "fundamental solutions" (more on which in the section on harmonic analysis) to the wave equation on $\mathbb{R} \times \mathbb{R}^n$, for n = 2 and 3, respectively. In such cases, the singularity in $u(t, \cdot)$ for each fixed t lies in $\Sigma_t = \{x \in \mathbb{R}^n : |x| = |t|\}$, a family of surfaces in $\mathbb{R}^n$ that moves, in the direction of the normal to $\Sigma_t$, at unit speed.

The examples mentioned above illustrate two general phenomena about the behavior of solutions to [1]. The first is finite propagation speed. Its general formulation is that, given a closed set $K \subset M$,

$$\operatorname{supp} f, g \subset K \;\Longrightarrow\; \operatorname{supp} u(t, \cdot) \subset \{x \in M : \operatorname{dist}(x, K) \le |t|\} \qquad [10]$$

In fact, given that [8]–[9] are fundamental solutions, [10] is a consequence of these formulas when $M = \mathbb{R}^2$ or $\mathbb{R}^3$. The result [10] is true in great generality, with well-known demonstrations involving energy estimates.

The second phenomenon involves propagation of singularities. Typically, if the Cauchy data f and g in [3] are smooth on the complement of an (n − 1)-dimensional surface $\Sigma_0$, perhaps with a jump across $\Sigma_0$, or such a singularity as in [8] or [9], the solution $u(t, x)$ will be a sum of two terms, with singularities of a similar nature on the surfaces $\Sigma_t^\pm$, moving at unit speed in the direction of their normals, $\Sigma_t^+$ flowing from $\Sigma_0$ in one direction and $\Sigma_t^-$ in the other. This also holds for the manifold case [2]. That happens at least until such surfaces develop singularities, when matters become more elaborate.

An alternative way to describe how the set of singularities evolves is the following. Let $S^1 M$ denote the space of unit vectors tangent to M; this is a submanifold of the tangent bundle of M, TM. There is a natural projection $\pi : S^1 M \to M$. Associated to a smooth surface $\Sigma_0$ of dimension n − 1 in M (of dimension n) are two preimages $\tilde\Sigma_0^+$ and $\tilde\Sigma_0^-$ in $S^1 M$, consisting of unit vectors lying over points of $\Sigma_0$ and normal to $\Sigma_0$. The geodesic flow is a flow on $S^1 M$, and it takes $\tilde\Sigma_0^\pm$ to smooth (n − 1)-dimensional surfaces $\tilde\Sigma_t^\pm$ in $S^1 M$. The sets $\Sigma_t^\pm$ are the images of $\tilde\Sigma_t^\pm$ under $\pi$. The geodesics starting out at points in $\tilde\Sigma_0^\pm$ and sweeping out $\tilde\Sigma_t^\pm$ are the rays along which the singularities of the solution u propagate.

This latter description works for all t if M has no boundary and is complete, that is, all geodesics are defined for all t, although singularities develop in the images $\pi(\tilde\Sigma_t^\pm) = \Sigma_t^\pm$, at points $p \in \Sigma_t^\pm$ where $\tilde\Sigma_t^\pm$ meets $T_p M$ nontransversally. The behavior of u near such singular points of $\Sigma_t^\pm$, known as caustics, is more complicated than that near regular points, but it can be captured in terms of integrals. Methods of establishing this propagation of singularities are discussed in the section on geometrical optics.

Such a description needs further elaboration if M has a boundary. One of the principal problems of diffraction theory is to explain how singularities of solutions to [1], with a boundary condition such as [4] or [5], propagate and reflect off the boundary. Considering the case where M is a half-space in $\mathbb{R}^n$,

$$M = \mathbb{R}^n_+ = \{x \in \mathbb{R}^n : x_n \ge 0\} \qquad [11]$$

provides a guide to the simplest reflection phenomena. In such a case, one can solve the Dirichlet or Neumann boundary problem for the wave equation [1] by the method of images. One extends f and g from $\mathbb{R}^n_+$ to $\mathbb{R}^n$. For the Dirichlet problem [4], one takes odd extensions, $f(x', -x_n) = -f(x', x_n)$, and similarly for g. For the Neumann problem [5], one takes even extensions, $f(x', -x_n) = f(x', x_n)$, etc. One then solves the wave equation [1] on $\mathbb{R} \times \mathbb{R}^n$ with the extended Cauchy data, and the restriction to $\mathbb{R} \times \mathbb{R}^n_+$ solves the respective boundary problem. Suppose $\Sigma_0$ is a smooth (n − 1)-dimensional surface that does not meet $\partial\mathbb{R}^n_+$, and that f and g have singularities on $\Sigma_0$, as above. (Suppose for simplicity that f and g vanish near $\partial\mathbb{R}^n_+$.) Those rays issuing
from normals to $\Sigma_0$ have mirror images, which are rays in $\mathbb{R}^n$. If such a ray hits $\partial\mathbb{R}^n_+$, its mirror image does so also, and continues into $\mathbb{R}^n_+$, as the reflected ray. The singularities of u propagate along such reflected rays.

Such a description extends to a general complete Riemannian manifold with boundary M, in the case of rays that hit the boundary transversally. Such a ray is reflected by retaining the tangential component of its velocity vector at the point of intersection with $\partial M$ and reversing the sign of the normal component. One says that the ray is reflected according to the laws of geometrical optics. Singularities of u carried by such rays that hit $\partial M$ are correspondingly reflected. Methods to establish such transversal reflection of singularities are natural extensions of those developed to treat the propagation away from $\partial M$, mentioned above.

Matters become more delicate when there are rays that are tangent to $\partial M$. A model example is given by

$$M = \mathbb{R}^n \setminus B, \qquad B = \{x \in \mathbb{R}^n : |x| < 1\} \qquad [12]$$

which one takes when studying the scattering of waves in $\mathbb{R}^n$ by the obstacle B. Consider a solution to [1] with boundary condition given by [4] or [5] that has a simple singularity on $\Sigma_t = \{x \in \mathbb{R}^n : x_n = t\}$ for $t < -1$. The associated rays are of the form $\gamma_{x'}(t) = (x', t)$, for $t < -1$, with $x' \in \mathbb{R}^{n-1}$. If $|x'| > 1$, these rays continue on in $\mathbb{R}^n \setminus B$, for all $t \ge -1$. If $|x'| < 1$, these rays hit $\partial M = \partial B$ transversally, and their reflection is as described above. If $|x'| = 1$, these rays hit $\partial B$ tangentially, at t = 0; they are sometimes called grazing rays. One also continues them past t = 0. One defines in this fashion $\Sigma_t$ for $t \ge -1$. The region

$$S = \{x = (x', x_n) \in \mathbb{R}^n \setminus B : |x'| < 1,\ x_n > 0\} \qquad [13]$$

is called the "shadow region." It is disjoint from $\Sigma_t$ for all t. The solution u is smooth in S for all t, although it is not identically zero. The set

$$S_b = \{x = (x', x_n) \in \mathbb{R}^n \setminus B : |x'| = 1,\ x_n \ge 0\} \qquad [14]$$

is the "shadow boundary."

One can replace B in [12] by a more general smooth, convex obstacle K, with positive Gauss curvature everywhere, and the same considerations of transversal and grazing rays and shadow regions apply. These notions also extend to a more general class of Riemannian manifolds with boundary, called manifolds with diffractive boundary. In the case K = B, one can use separation of variables to reduce the problem of analyzing solutions to [1] and showing that singularities propagate along such rays to a problem in harmonic analysis on the sphere $S^{n-1}$. For more general convex obstacles K or manifolds with diffractive boundary, other techniques are required to show that waves reflect off the boundary in a fashion similar to the case [12].

Another situation arises if instead of [12] one takes M = B, or more generally M = K, a convex region as described above. A ray starting off from a point in $\partial M$, almost tangent to $\partial M$ but with a small component in the direction of the normal pointing into M, will undergo many reflections in a short time. Upon shrinking the normal component of the initial velocity to zero, one obtains in the limit a geodesic in $\partial M$, known as a gliding ray. In such a case, singularities of solutions to [1], with such a boundary condition as [4] or [5], propagate along both transversally reflected and gliding rays.

For the generic smooth obstacle K in $\mathbb{R}^n$, the second fundamental form can have a variety of signatures at various boundary points. Various types of "generalized rays" occur; generally speaking, these are limits of sequences of transversally reflected rays. This situation also holds for general complete Riemannian manifolds with smooth boundary. The main result about propagation of singularities in such a case is that it is always along such generalized rays. This was established by Melrose and Sjöstrand (1978).

Further diffraction effects arise when $\partial M$ has singularities, such as edges and corners. The simplest example is

$$M = \{x \in \mathbb{R}^2 : a \le \theta \le b,\ r \ge 0\} \qquad [15]$$

where $(r, \theta)$ are the polar coordinates of $x \in \mathbb{R}^2$, and we assume $0 \le a < b \le 2\pi$. Here one is studying the diffraction of waves by a wedge. In the limiting case a = 0, b = 2π, the wedge becomes a half-line, that is,

$$M = \mathbb{R}^2 \setminus \{(x_1, 0) : x_1 > 0\} \qquad [16]$$

Singularities of solutions to [1] on $\mathbb{R} \times M$ with such a boundary condition as [4] or [5] propagate in the interior of M and reflect off the regular points of $\partial M$ as before. If a family of continuous, piecewise smooth curves $\Sigma_t$ carrying the singularity of u hits the corner x = 0 at t = a, this reflection creates a tear in $\Sigma_t$ for t > a. In addition, a diffracted wave spreads out from the corner at unit speed. This diffracted wave carries a singularity that is weaker than that of the incident wave. For example, if one has a solution like [8], but shifted to have support in a disk of radius |t| about a point $p \ne 0$ in $\mathbb{R}^2$, for small |t|, then the diffracted wave will have a jump discontinuity.

The space M in [15] is a special case of a cone. More generally, if N is a complete Riemannian
manifold (possibly with boundary), then the cone C(N) with base N is the set

$$C(N) = [0, \infty) \times N \qquad [17]$$

with all points (0, x), $x \in N$, identified, with the metric tensor

$$ds^2 = dr^2 + r^2 g \qquad [18]$$

where g is the metric tensor on N, and points on C(N) are denoted (r, x), $r \in [0, \infty)$, $x \in N$. The space in [15] has the form M = C(N) with N = [a, b], an interval. A cone in Euclidean space $\mathbb{R}^n$ is of the form C(N) with N a domain in the unit sphere $S^{n-1}$.

The propagation of singularities for solutions to [1] on C(N), when N has smooth boundary, has a description similar to that above for the case [15]. Again, there is a diffracted wave set off from the conic point {r = 0} when a singularity of a wave hits it. The diffracted wave is typically (n − 1)/2 units smoother than the singular wave producing it, where n = dim C(N). For example, the fundamental solution to the wave equation on C(N) produces a diffracted wave which is the sum of a jump discontinuity and (in general) a logarithmic singularity.

In fact, precise understanding of the behavior of the fundamental solution to the wave equation on C(N) is encoded in terms of the behavior of the solution operator to the wave equation on the base N. This is discussed in further detail in the section on harmonic analysis. In the case where C(N) is given by [15], we are dealing with the wave equation on an interval [a, b], whose behavior is elementary.

One can use analysis of [15] together with finite propagation speed to get a good qualitative picture of diffraction of waves in $\mathbb{R}^2$ by a polygonal obstacle. A variation of this argument allows one to understand the behavior of the wave equation on a "polygonal" domain N in $S^2$, that is, one whose boundary consists of a finite number of geodesic segments in $S^2$. Going from there to C(N), one can then analyze diffraction of waves in $\mathbb{R}^3$ by a polyhedron.

It is worth remarking how the "shadow region" for such an obstacle as a wedge in $\mathbb{R}^2$ differs from that in [12]–[14]. For example, if one considers M given by [16] and $u(t, x) = \delta(x_2 - t)$, for t < 0, then the region

$$S = \{x = (x_1, x_2) : x_1, x_2 > 0\} \qquad [19]$$

is the "shadow region," in the sense that rays either missing or reflecting off the obstacle $\{(x_1, 0) : x_1 > 0\}$ do not enter the region [19]. However, unlike the case [13], the solution u(t, x) is not smooth in the region [19] for t > 0. There is a singularity there, although it is weaker than the singularity of the main wave.

Taking Cartesian products of spaces of the form [15] with $\mathbb{R}^k$ yields spaces with k-dimensional edges. There are also spaces with curved edges. Rather than continuing with further general description, we next discuss one more particular case, which is of historical significance. Namely, we consider the reflection of waves in $\mathbb{R}^3$ off a disk, that is, take

$$M = \mathbb{R}^3 \setminus D, \qquad D = \{(x_1, x_2, 0) : x_1^2 + x_2^2 \le 1\} \qquad [20]$$

Consider a wave given for t < 0 by $u(t, x) = \delta(x_3 - t)$. This wave hits $D = \partial M$ at t = 0, giving off a diffracted wave, traveling away from the circle $\{(x_1, x_2, 0) : x_1^2 + x_2^2 = 1\}$ at speed 1 for t > 0. This diffracted wave carries a singularity that blows up like the −1/2 power of the distance to the torus of points of distance t from this circle, for $t \in (0, 1)$. For t > 1, there is a focusing effect along the $x_3$-axis, producing a stronger singularity for u(t, x) there.

This sort of phenomenon was understood, at least from a heuristic point of view, in the nineteenth century, and it played a role in an important argument of Poisson. At the time, there was a debate about whether the propagation of light was a wave phenomenon. Poisson did not think it was, and he noted that if it were, the light waves propagated past such an obstacle should produce a bright spot along the axis normal to the disk and through its center. The experiment was performed and the bright spot was observed. This is now called the Poisson spot, and its occurrence convinced many physicists, including Poisson, that the propagation of light is a wave phenomenon.

Harmonic Analysis and the Wave Equation

The wave equation [1] with Cauchy data [3] can be regarded as an operator differential equation, with solution

$$u(t, x) = \cos\left(t\sqrt{-\Delta}\right) f(x) + \frac{\sin\left(t\sqrt{-\Delta}\right)}{\sqrt{-\Delta}}\, g(x) \qquad [21]$$

This brings one to investigate functions of the self-adjoint operator $\Delta$. If $M = \mathbb{R}^n$, one can do this using the Fourier transform, which is given by

$$\mathcal{F} f(\xi) = \hat f(\xi) = (2\pi)^{-n/2} \int f(x)\, e^{-ix \cdot \xi}\, dx \qquad [22]$$

One defines $\mathcal{F}^*$ by changing $e^{-ix \cdot \xi}$ to $e^{ix \cdot \xi}$ in [22], and the Fourier inversion formula says $\mathcal{F}$ and $\mathcal{F}^*$ are
inverses of each other on various function spaces, including $L^2(\mathbb{R}^n)$. Then one has

$$\varphi\left(\sqrt{-\Delta}\right) f(x) = (2\pi)^{-n/2} \int \varphi(|\xi|)\, \hat f(\xi)\, e^{ix \cdot \xi}\, d\xi \qquad [23]$$

Note that [23] is equal to

$$\int \Phi(x - y) f(y)\, dy = \Phi * f(x) \qquad [24]$$

where

$$\Phi(x) = (2\pi)^{-n} \int \varphi(|\xi|)\, e^{ix \cdot \xi}\, d\xi \qquad [25]$$

In particular, [21] becomes

$$u(t, x) = \frac{\partial}{\partial t} R_t * f(x) + R_t * g(x) \qquad [26]$$

where

$$R_t(x) = (2\pi)^{-n} \int \frac{\sin t|\xi|}{|\xi|}\, e^{ix \cdot \xi}\, d\xi \qquad [27]$$

is the fundamental solution to the wave equation. The integral [27] is not an easy integral when n > 1, but the answer can be derived by analytic continuation from the Poisson kernel, that is,

$$e^{-y\sqrt{-\Delta}} f(x) = P_y * f(x), \qquad P_y(x) = C_n\, y\, (|x|^2 + y^2)^{-(n+1)/2} \qquad [28]$$

where $C_n = \pi^{-(n+1)/2}\, \Gamma((n + 1)/2)$. One gets

$$R_t(x) = \lim_{\varepsilon \searrow 0} \frac{C_n}{n - 1}\, \operatorname{Im}\left(|x|^2 - (t - i\varepsilon)^2\right)^{-(n-1)/2} \qquad [29]$$

Taking this limit for n = 2, 3 yields the formulas [8]–[9]. There are several ways to derive [28]. One, which is flexible and useful for other situations, derives it from the formula for the heat kernel,

$$e^{t\Delta} f(x) = H_t * f(x), \qquad H_t(x) = (4\pi t)^{-n/2}\, e^{-|x|^2/4t} \qquad [30]$$

via the subordination identity:

$$e^{-yA} = \frac{y}{2\pi^{1/2}} \int_0^\infty e^{-y^2/4t}\, e^{-tA^2}\, t^{-3/2}\, dt, \qquad A > 0,\ y > 0 \qquad [31]$$

with $A = \sqrt{-\Delta}$. The heat kernel can be computed via [23], which becomes a well-known Gaussian integral. The identity [31] can be proved using the fact that the Fourier integral formula for $P_y(x)$ is elementary to compute when n = 1.

To understand functions of the Laplace operator on a cone C(N), one uses

$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{n - 1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \Delta_N \qquad [32]$$

where $\Delta_N$ is the Laplace operator on N, which follows from [2] and [18]. Here n = dim C(N). This is a modified Bessel operator. We define the operator

$$\nu = (-\Delta_N + \alpha^2)^{1/2}, \qquad \alpha = -\frac{n - 2}{2} \qquad [33]$$

For each $\nu_j$ in the spectrum of $\nu$, we consider the Hankel transform

$$H_{\nu_j} g(\lambda) = \int_0^\infty g(r)\, J_{\nu_j}(\lambda r)\, r\, dr \qquad [34]$$

where $J_{\nu_j}$ is the Bessel function of order $\nu_j$. The Hankel inversion formula says $H_{\nu_j}$ is unitary on $L^2(\mathbb{R}_+, r\,dr)$, and is its own inverse. Consequently, we can write the action of $\varphi(\sqrt{-\Delta})$ on $L^2(C(N))$ as

$$\varphi\left(\sqrt{-\Delta}\right) g(r, x) = \int_0^\infty K_\varphi(r, s, \nu)\, g(s, x)\, s^{n-1}\, ds \qquad [35]$$

where $K_\varphi(r, s, \nu)$ is a family of operators on $L^2(N)$, given by

$$K_\varphi(r, s, \nu) = (rs)^{\alpha} \int_0^\infty \varphi(\lambda)\, J_\nu(\lambda r)\, J_\nu(\lambda s)\, \lambda\, d\lambda \qquad [36]$$

To obtain the wave kernel on C(N), one can analytically continue formulas for the Poisson kernel, for $e^{-y\sqrt{-\Delta}}$. Such formulas arise from the Lipschitz–Hankel identity:

$$\int_0^\infty e^{-y\lambda}\, J_\nu(\lambda r)\, J_\nu(\lambda s)\, d\lambda = \frac{1}{\pi}\, (rs)^{-1/2}\, Q_{\nu - 1/2}\left(\frac{r^2 + s^2 + y^2}{2rs}\right) \qquad [37]$$

Here $Q_{\nu - 1/2}(\tau)$ is a Legendre function. The identity [37] is one of the more difficult identities in the theory of Bessel functions. It is useful to know that it can be derived by applying a slight variant of the subordination identity [31] to the more elementary identity

$$\int_0^\infty e^{-t\lambda^2}\, J_\nu(\lambda r)\, J_\nu(\lambda s)\, \lambda\, d\lambda = \frac{1}{2t}\, e^{-(r^2 + s^2)/4t}\, I_\nu\left(\frac{rs}{2t}\right) \qquad [38]$$

(where $I_\nu(y) = e^{-\pi i \nu/2} J_\nu(iy)$ for y > 0), which describes the behavior of the heat kernel on C(N).

Carrying out the analytic continuation of [37] to imaginary y yields results stated in the section on basic propagation phenomena, once one
understands the behavior of families of functions of the operator $\nu$ so produced. An approach taken by Cheeger and Taylor (1982) to this was to synthesize these operators from $e^{is\nu}$, $s \in \mathbb{R}$, and deduce their behavior from the behavior of the solution operator to the wave equation on the base N.

One can apply similar considerations to $M = \mathbb{R}^n \setminus B$, which is the truncated cone $[1, \infty) \times S^{n-1}$, with metric tensor [18], where g is the metric tensor on $S^{n-1}$, and Laplace operator given by [32], with $\Delta_N$ the Laplace operator on $S^{n-1}$. The problem of diffraction of waves by the ball B can be recast as solving

$$\frac{\partial^2 u}{\partial t^2} - \Delta u = 0 \ \text{ on } \mathbb{R} \times M, \qquad u|_{\mathbb{R} \times \partial M} = f, \qquad u(t, x) = 0 \ \text{ for } t \ll 0 \qquad [39]$$

with f compactly supported on $\mathbb{R} \times \partial M$. Taking the partial Fourier transform with respect to t yields the reduced wave equation

$$(\Delta + \lambda^2) v = 0 \ \text{ for } |x| > 1, \qquad v|_{S^{n-1}} = g(x, \lambda) \qquad [40]$$

and the condition u(t, x) = 0 for $t \ll 0$ yields for v the outgoing radiation condition

$$r^{(n-1)/2} \left( \frac{\partial v}{\partial r} - i\lambda v \right) \to 0 \quad \text{as } r \to \infty \qquad [41]$$

The solution is

$$v(x, \lambda) = r^{-(n-2)/2}\, \frac{H^{(1)}_\nu(\lambda r)}{H^{(1)}_\nu(\lambda)}\, g(x, \lambda) \qquad [42]$$

with $\nu$ as in [33] and $H^{(1)}_\nu$ the Hankel function. The behavior of $H^{(1)}_{\nu_j}(\lambda r)/H^{(1)}_{\nu_j}(\lambda)$ as $\nu_j, \lambda \to \infty$ with ratio in a small neighborhood of 1 can be shown to control the behavior of the solution u to [39] near grazing rays. There is an asymptotic formula for this, which is one of the most delicate analytical results in the theory of Bessel functions. The result is that, uniformly for z near 1, as $\nu \to \infty$,

$$H^{(1)}_\nu(\nu z) \sim 2 e^{-\pi i/3} \left( \frac{4\zeta}{1 - z^2} \right)^{1/4} \left\{ A_+(\nu^{2/3}\zeta) \sum_{k \ge 0} a_k(\zeta)\, \nu^{-1/3 - 2k} + A_+'(\nu^{2/3}\zeta) \sum_{k \ge 0} b_k(\zeta)\, \nu^{-5/3 - 2k} \right\} \qquad [43]$$

Here

$$A_+(s) = \operatorname{Ai}(e^{2\pi i/3} s) \qquad [44]$$

where Ai is the Airy function. The coefficients $a_k(\zeta)$ and $b_k(\zeta)$ are smooth functions of their argument, $\zeta = \zeta(z)$, which is defined by

$$\frac{2}{3}\, \zeta^{3/2} = \int_z^1 \frac{\sqrt{1 - t^2}}{t}\, dt \qquad [45]$$

Making use of [43] in [42], one can obtain a parametrix for u (i.e., a solution modulo a $C^\infty$ error) whose form is a special case of the formula [50], which we will present in the next section.

Geometrical Optics and Extensions

By results of the last section, the solution to [1] when $M = \mathbb{R}^n$ has the form

$$u(t, x) = \sum_\pm \int e^{\pm it|\xi| + ix \cdot \xi}\, \hat h_\pm(\xi)\, d\xi \qquad [46]$$

where the functions $h_\pm$ are produced from the initial data via simpler transformations. For a general metric tensor, one can produce a parametrix (i.e., an approximation to u(t, x) with a $C^\infty$ error) in the following form:

$$u(t, x) = \sum_\pm \int a_\pm(t, x, \xi)\, e^{i\varphi_\pm(t, x, \xi)}\, \hat h_\pm(\xi)\, d\xi \qquad [47]$$

Here the phase functions $\varphi_\pm(t, x, \xi)$ are smooth for $\xi \ne 0$ and homogeneous of degree 1 in $\xi$. The amplitudes $a_\pm(t, x, \xi)$ are smooth and have asymptotic expansions as $|\xi| \to \infty$:

$$a_\pm(t, x, \xi) \sim \sum_{k \ge 0} a_k^\pm(t, x, \xi) \qquad [48]$$

with $a_k^\pm(t, x, \xi)$ homogeneous of degree −k in $\xi$. One applies $\partial_t^2 - \Delta$ to both sides of [47], and obtains an operator of a similar form, with new amplitudes $b_\pm(t, x, \xi) \sim \sum_k b_k^\pm(t, x, \xi)$. Setting the terms in this asymptotic expansion equal to zero yields, first for $\varphi_\pm(t, x, \xi)$, a partial differential equation known as the eikonal equation:

$$\frac{\partial \varphi_\pm}{\partial t} = \pm |\nabla_x \varphi_\pm| \qquad [49]$$

where |v| is the norm of a vector $v \in T_x M$, determined by the metric tensor. Setting $b_k^\pm(t, x, \xi) = 0$ for $k \ge 1$ yields linear differential equations for the amplitude terms in [48], known as transport equations.

Operators of the form [47] are special cases of Fourier integral operators. Seminal works of Keller (1953) and Lax (1957) gave an important stimulus to work on these operators, and work of Hörmander (1971) turned this into a systematic and powerful theory. A particular advance regards
producing a parametrix valid for all t. Generally, one can solve [49] and the associated transport equations for t in some interval, past which the eikonal equation might break down. Hörmander's theory treats products of Fourier integral operators, yielding global constructions. This facilitates the treatment of caustics mentioned earlier. Stationary-phase methods can be brought to bear to relate the singularities of Th to those of h, when T is a Fourier integral operator.

To construct parametrices for waves reflecting off a boundary, one can again reduce the problem to one of the form [39]. Waves that reflect transversally are given by parametrices of the form [47], although with the role of the variables changed, so that t in [47]–[49] is replaced by a coordinate that vanishes on $\mathbb{R} \times \partial M$.

A parametrix that treats grazing rays can be written in the form of a Fourier–Airy integral operator:

$$u(y) = \int_{\mathbb{R}^n} \left[ a\, A_+(\zeta) + i|\xi|^{-1/3}\, b\, A_+'(\zeta) \right] A_+(\zeta_0)^{-1}\, e^{i\theta}\, \hat F(\xi)\, d\xi \qquad [50]$$

Here $y = (y_1, \ldots, y_{n+1})$ denotes a coordinate system on a neighborhood of a boundary point of $\mathbb{R} \times M$, with $y_{n+1} = 0$ on $\mathbb{R} \times \partial M$. We have a pair of phase functions $\theta(y, \xi)$ and $\zeta(y, \xi)$, homogeneous in $\xi$ of degree 1 and 2/3, respectively, and a pair of amplitudes $a(y, \xi)$ and $b(y, \xi)$, each having asymptotic expansions of the form [48]. The function $A_+$ is the Airy function [44]. The phase functions satisfy a coupled pair of eikonal equations:

$$\begin{aligned} \langle \nabla_y \theta, \nabla_y \theta \rangle + \zeta \langle \nabla_y \zeta, \nabla_y \zeta \rangle &= 0 \\ \langle \nabla_y \theta, \nabla_y \zeta \rangle &= 0 \end{aligned} \qquad [51]$$

where $\langle\ ,\ \rangle$ denotes the Lorentz inner product on $T_y(\mathbb{R} \times M)$ given by $dt^2 - g$. More precisely, [51] is to hold in the region where $\zeta \le 0$, and also to infinite order at $y_{n+1} = 0$, for $\zeta \ge 0$. One requires $\partial\theta/\partial\xi_j$ to have linearly independent y-gradients, for j = 1, ..., n, and

$$\zeta(y, \xi) = \zeta_0(\xi) = \xi_1^{-1/3}\, \xi_n \quad \text{for } y_{n+1} = 0 \qquad [52]$$

The terms in the asymptotic expansions of $a(y, \xi)$ and $b(y, \xi)$ satisfy coupled systems of transport equations. One can arrange that $b(y, \xi) = 0$ for $y_{n+1} = 0$. Then $u|_{\mathbb{R} \times \partial M} = TF$, where T is a Fourier integral operator, which can be inverted, modulo a smooth error, by Hörmander's theory, producing a parametrix for [39].

The construction of solutions to [51] satisfying [52] is due to Melrose. This followed earlier works of Ludwig (1967), Melrose (1975), and Taylor (1976), which produced solutions satisfying [52] to infinite order at $\xi_n = 0$. This earlier construction is adequate to produce a grazing ray parametrix, but the sharper result [52] is extremely valuable for constructing a gliding ray parametrix. This has the form

$$u(y) = \int_{\mathbb{R}^n} \left[ a\, \operatorname{Ai}(\zeta) + i|\xi|^{-1/3}\, b\, \operatorname{Ai}'(\zeta) \right] \operatorname{Ai}(\zeta_0)^{-1}\, e^{i\theta}\, \hat F(\xi)\, d\xi \qquad [53]$$

It differs from [50] in the use of Ai rather than $A_+$. Since Ai has real zeros, it is also convenient to pick T > 0 and evaluate $\theta$, $\zeta$, a, and b at $(\xi_1, \ldots, \xi_{n-1}, \xi_n + iT)$, and take $\zeta_0 = \xi_1^{-1/3}(\xi_n + iT)$. The treatment of the eikonal and transport equations is as above, though the Fourier–Airy integral operator [50] has a different behavior from [53], reflecting the difference between how singularities in solutions to the wave equation are carried by grazing and by gliding rays.

Further Reading

Bowman J, Senior T, and Uslenghi P (1969) Electromagnetic and Acoustic Scattering by Simple Shapes. Amsterdam: North-Holland.
Cheeger J and Taylor ME (1982) On the diffraction of waves by conical singularities, I, II. Communications in Pure and Applied Mathematics 25: 275–331; 487–529.
Garnir HG (ed.) (1981) Singularities in Boundary Value Problems, NATO Advanced Study Institute Series, vol. 65. Boston: D. Reidel.
Hörmander L (1971) Fourier integral operators, I. Acta Mathematica 127: 79–183.
Hörmander L (1985) The Analysis of Partial Differential Operators, vols. 3–4. New York: Springer.
Keller JB (1953) The geometrical theory of diffraction. Proc. Symp. Microwave Optics, Eaton Electronics Lab, McGill University.
Lax P (1957) Asymptotic solutions of oscillatory initial value problems. Duke Mathematical Journal 24: 627–646.
Ludwig D (1967) Uniform asymptotic expansion of the field scattered by a convex object at high frequencies. Communications in Pure and Applied Mathematics 19: 103–138.
Melrose RB (1975) Microlocal parametrices for diffractive boundary problems. Duke Mathematical Journal 42: 605–635.
Melrose RB and Sjöstrand J (1978) Singularities of boundary value problems. Communications in Pure and Applied Mathematics 31: 593–617.
Melrose RB and Taylor ME (1985) Near peak scattering and the corrected Kirchhoff approximation for a convex obstacle. Advances in Mathematics 55: 242–315.
Melrose RB and Wunsch J (2004) Propagation of singularities for the wave equation on conic manifolds. Inventiones Mathematicae 156: 235–299.
Taylor ME (1976) Grazing rays and reflection of singularities of solutions to wave equations. Communications in Pure and Applied Mathematics 29: 1–38.
Taylor ME (1981) Pseudodifferential Operators. Princeton, NJ: Princeton University Press.
Taylor ME (1996) Partial Differential Equations, vols. 1–3. New York: Springer.
408 Wavelets: Application to Turbulence

Wavelets: Application to Turbulence


M Farge, Ecole Normale Supérieure, Paris, France
K Schneider, Université de Provence, Marseille, France
© 2006 Published by Elsevier Ltd.

Introduction about Turbulence and Wavelets

What is Turbulence?

Turbulence is a highly nonlinear regime encountered in fluid flows. Such flows are described by continuous fields, for example, velocity or pressure, assuming that the characteristic scale of the fluid motions is much larger than the mean free path of the molecular motions. The prediction of the spacetime evolution of fluid flows from first principles is given by the solutions of the Navier–Stokes equations. The turbulent regime develops when the nonlinear term of the Navier–Stokes equations strongly dominates the linear term; the ratio of the norms of both terms is the Reynolds number Re, which characterizes the level of turbulence. In this regime nonlinear instabilities dominate, which leads to the flow's sensitivity to initial conditions and to its unpredictability.

The corresponding turbulent fields are highly fluctuating and their detailed motions cannot be predicted. However, if one assumes some statistical stability of the turbulence regime, averaged quantities, such as mean and variance, or other related quantities, for example, diffusion coefficients, lift or drag, may still be predicted.

When turbulent flows are statistically stationary (in time) or homogeneous (in space), as is classically supposed, one studies their energy spectrum, given by the modulus of the Fourier transform of the velocity autocorrelation.

Unfortunately, since the Fourier representation spreads the information in physical space among the phases of all Fourier coefficients, the energy spectrum loses all structural information in time or space. This is a major limitation of the classical way of analyzing turbulent flows. This is why we have proposed to use the wavelet representation instead and define new analysis tools that are able to preserve time and space locality.

The same is true for computing turbulent flows. Indeed, the Fourier representation is well suited to study linear motions, for which the superposition principle holds and whose generic behavior is either to persist at a given scale or to spread to larger ones. In contrast, the superposition principle does not hold for nonlinear motions, their archetype being the turbulent regime, which therefore cannot be decomposed into a sum of independent motions that can be separately studied. Generically, their evolution involves a wide range of scales, exciting smaller and smaller ones, even leading to finite-time singularities, e.g., shocks. The “art” of predicting the evolution of such nonlinear phenomena consists of disentangling the active from the passive elements: the former should be deterministically computed, while the latter could either be discarded or their effect statistically modeled. The wavelet representation allows one to analyze the dynamics in both space and scale, retaining only those degrees of freedom which are essential to predict the flow evolution. Our goal is to perform a kind of “distillation” and retain only the elements which are essential to compute the nonlinear dynamics.

How Does One Study Turbulence?

When studying turbulence one is uneasy about the fact that there are two different descriptions, depending on which side of the Fourier transform one looks from.

• On the one hand, looking from the Fourier space representation, one has a theory which assumes the existence of a nonlinear cascade in an intermediate range of wavenumbers, called the “inertial range”, where energy is conserved and transferred towards high wavenumbers, but only on average (i.e., considering either ensemble or time or space averages). This implies that a turbulent flow is excited at wavenumbers lower than those of the inertial range and dissipated at higher wavenumbers. Under these hypotheses, the theory predicts that the energy spectrum in the inertial range scales as $k^{-5/3}$ in dimension 3 and as $k^{-3}$ in dimension 2, k being the wavenumber, i.e., the modulus of the wave vector.

• On the other hand, if one studies turbulence from the physical space representation, there is not yet any universal theory. One relies instead on empirical observations, from both laboratory and numerical experiments, which exhibit the formation and persistence of coherent vortices, even at very high Reynolds numbers. They correspond to the condensation of the vorticity field into some organized structures that contain most of the energy ($L^2$-norm of velocity) and enstrophy ($L^2$-norm of vorticity).
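The loss of structural information in the energy spectrum can be illustrated with a small numerical sketch. It is not taken from the article: the function names (`synthesize`, `energy_spectrum`, `fit_slope`), the signal length, and the use of a plain sum of cosines are all assumptions made for illustration. Two signals are built with the same $k^{-5/3}$ energy distribution but different random phases; they differ in physical space, yet a log-log fit recovers the same spectral slope for both.

```python
import cmath
import math
import random


def synthesize(n, kmax, exponent, seed):
    """Sum of cosines whose energy at wavenumber k is proportional to k**exponent."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(kmax)]
    signal = []
    for x in range(n):
        value = 0.0
        for k in range(1, kmax + 1):
            amplitude = k ** (exponent / 2.0)  # energy ~ amplitude**2
            value += amplitude * math.cos(2.0 * math.pi * k * x / n + phases[k - 1])
        signal.append(value)
    return signal


def energy_spectrum(signal, kmax):
    """Squared modulus of the discrete Fourier coefficients for k = 1..kmax."""
    n = len(signal)
    spectrum = []
    for k in range(1, kmax + 1):
        coeff = sum(
            signal[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n)
        ) / n
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def fit_slope(spectrum):
    """Least-squares slope of log E(k) against log k."""
    xs = [math.log(k) for k in range(1, len(spectrum) + 1)]
    ys = [math.log(e) for e in spectrum]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Same energy distribution, different phases (hypothetical parameters).
signal_a = synthesize(n=128, kmax=32, exponent=-5.0 / 3.0, seed=1)
signal_b = synthesize(n=128, kmax=32, exponent=-5.0 / 3.0, seed=2)
slope_a = fit_slope(energy_spectrum(signal_a, 32))
slope_b = fit_slope(energy_spectrum(signal_b, 32))
print(slope_a, slope_b)
```

Since the phases never enter the squared modulus, the spectrum cannot distinguish the two signals, which is the limitation the wavelet representation is meant to address.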

Moreover, the classical method for modeling turbulent flows consists in neglecting high-wavenumber motions and replacing them by their average, supposing their dynamics to be either linear or slaved to the low-wavenumber motions. Such a method would work if there existed a clear separation between low and high wavenumbers, that is, a spectral gap.

Actually, there is now strong evidence, from both laboratory and direct numerical simulation (DNS) experiments, that this is not the case. Conversely, one observes that turbulent flows are nonlinearly active all along the inertial range and that coherent vortices seem to play an essential dynamical role there, especially for transport and mixing. One may then ask the following questions: Are coherent vortices the elementary building blocks of turbulent flows? How can we extract them? Do their mutual interactions have a universal character? Can we compress turbulent flows and compute their evolution with a reduced number of degrees of freedom corresponding to the coherent vortices?

The DNS of turbulent flows, based on the integration of the Navier–Stokes equations using either grid points in physical space or Fourier modes in spectral space, requires a number of degrees of freedom per time step that varies as $Re^{9/4}$ in dimension 3 (and as $Re$ in dimension 2). Due to the inherent limitation of computer performance, one can presently only perform DNS of turbulent flows up to Reynolds numbers $Re = 10^6$. To compute higher Reynolds number flows, one should then design ad hoc turbulence models, whose parameters are empirically adjusted to each type of flow, in particular to its geometry and boundary conditions, using data from either laboratory or numerical experiments.

What are Wavelets?

The wavelet transform unfolds signals (or fields) into both time (or space) and scale, and possibly directions in dimensions higher than 1. The starting point is a function $\psi \in L^2(\mathbb{R})$, called the “mother wavelet”, which is well localized in physical space $x \in \mathbb{R}$, is oscillating ($\psi$ has at least a vanishing integral, or better, its first m moments vanish), and is smooth (its Fourier transform $\hat{\psi}(k)$ exhibits fast decay for wavenumbers $|k|$ tending to infinity). The mother wavelet then generates a family of dilated and translated wavelets

$$\psi_{a,b}(x) = a^{-1/2} \, \psi\!\left( \frac{x - b}{a} \right)$$

with $a \in \mathbb{R}^+$ the scale parameter and $b \in \mathbb{R}$ the position parameter, all wavelets being normalized in $L^2$-norm.

The wavelet transform of a function $f \in L^2(\mathbb{R})$ is the inner product of f with the analyzing wavelets $\psi_{a,b}$, which gives the wavelet coefficients: $\tilde{f}(a, b) = \langle f, \psi_{a,b} \rangle = \int f(x) \, \psi_{a,b}^{*}(x) \, dx$. They measure the fluctuations of f around the scale a and the position b. f can then be reconstructed without any loss as the inner product of its wavelet coefficients $\tilde{f}$ with the analyzing wavelets $\psi_{a,b}$:

$$f(x) = C_\psi^{-1} \iint \tilde{f}(a, b) \, \psi_{a,b}(x) \, a^{-2} \, da \, db$$

$C_\psi = \int |\hat{\psi}|^2 |k|^{-1} \, dk$ being a constant which depends on the wavelet $\psi$.

Like the Fourier transform, the wavelet transform realizes a change of basis from physical space to wavelet space which is an isometry. It thus conserves the inner product (Plancherel theorem), and in particular energy (Parseval's identity). Let us mention that, due to the localization of wavelets in physical space, the behavior of the signal at infinity does not play any role. Therefore, the wavelet analysis and synthesis can be performed locally, in contrast to the Fourier transform, where the nonlocal nature of the trigonometric functions does not allow a local analysis.

Moreover, wavelets constitute building blocks of various function spaces, some of which can be used to construct orthogonal bases. The main difference between the continuous and the orthogonal wavelet transforms is that the latter is non-redundant, but preserves the invariance by translation and dilation only for a discrete subset of wavelet space, which corresponds to the dyadic grid $\lambda = (j, i)$, for which scale is sampled by octaves j and space by positions $2^{-j} i$. The advantage is that all orthogonal wavelet coefficients are decorrelated, which is not the case for the continuous wavelet transform, whose coefficients are redundant and correlated in space and scale. Such a correlation can be visualized by plotting the continuous wavelet coefficients of a white noise: the patterns one thus observes are due to the reproducing kernel of the continuous wavelet transform, which corresponds to the correlation between the analyzing wavelets themselves.

In practice, to analyze turbulent signals or fields, one should use the continuous wavelet transform with complex-valued wavelets, since the modulus of the wavelet coefficients allows one to read the evolution of the energy density in both space (or time) and scales. If one uses real-valued wavelets instead, the modulus of the wavelet coefficients will present the same oscillations as the analyzing wavelets and it will then become difficult to sort out features

belonging to the signal or to the wavelets. In the case of complex-valued wavelets, the quadrature between the real and the imaginary parts of the wavelet coefficients eliminates these spurious oscillations; this is why we recommend using complex-valued wavelets, such as the Morlet wavelet. To compress turbulent flows, and a fortiori to compute their evolution at a reduced cost compared to standard methods (finite difference, finite volume, or spectral methods), one should use orthogonal wavelets. This avoids redundancy, since one has the same number of grid points as wavelet coefficients. Moreover, there exists a fast algorithm to compute the orthogonal wavelet coefficients which is even faster than the fast Fourier transform, having $O(N)$ operations instead of $O(N \log_2 N)$.

The first paper about the continuous wavelet transform was published by Grossmann and Morlet (1984). Then, discrete wavelets were constructed, leading to frames (Daubechies et al. 1986) and orthogonal bases (Lemarié and Meyer, 1986). From there the formalism of multiresolution analysis (MRA) was constructed, which led to the fast wavelet algorithm (Mallat 1989). The first application of wavelets to analyze turbulent flows was published by Farge and Rabreau (1988). Since then a long-term research program has been developed for analyzing, computing and modeling turbulent flows using either continuous wavelets, orthogonal wavelets, or wavelet packets.

Wavelet Analysis

Wavelet Spectra

Wavelet space. To study turbulent signals one uses the continuous wavelet transform for analysis, and the orthogonal wavelet transform for compression and computation. To perform a continuous wavelet transform, one can choose:

• either a real-valued wavelet, such as the Marr wavelet, also called “Mexican hat,” which is the second derivative of a Gaussian,

$$\psi(x) = (1 - x^2) \exp\!\left( -\frac{x^2}{2} \right) \qquad [1]$$

• or a complex-valued wavelet, such as the Morlet wavelet,

$$\hat{\psi}(k) = \begin{cases} \dfrac{1}{2\pi} \exp\!\left( -\dfrac{(k - k_\psi)^2}{2} \right) & \text{for } k > 0 \\[2mm] 0 & \text{for } k \le 0 \end{cases} \qquad [2]$$

with the wavenumber $k_\psi$ denoting the barycenter of the wavelet support in Fourier space computed as

$$k_\psi = \frac{\int_0^{\infty} k \, |\hat{\psi}(k)| \, dk}{\int_0^{\infty} |\hat{\psi}(k)| \, dk} \qquad [3]$$

For the orthogonal wavelet transform, there is a large collection of possible wavelets and the choice depends on which properties are preferred, for instance: compact support, symmetry, smoothness, number of cancelations, computational efficiency.

From our own experience, we tend to prefer the Coifman wavelet 12, which is compactly supported, has four vanishing moments, is quasi-symmetric, and is defined with a filter of length 12, which leads to a computational cost for the fast wavelet transform of 24N operations, since two filters are used.

As stated above, we recommend the complex-valued continuous wavelet transform for analysis. In this case, one plots the modulus and the phase of the wavelet coefficients in wavelet space, with a linear horizontal axis for the position b, and a logarithmic vertical axis for the scale a, with the largest scale at the bottom and the smallest scale at the top.

In Figure 1a we show the wavelet analysis of a turbulent signal, corresponding to the time evolution of the velocity fluctuations of two successive vortex breakdowns, measured by hot-wire anemometry at $N = 32768 = 2^{15}$ instants (Cuypers et al. 2003). The modulus of the wavelet coefficients (Figure 1b) shows that during the vortex breakdown, which is due to strong nonlinear flow instability, energy is spread over a wide range of scales. The phase of the wavelet coefficients (Figure 1c) is plotted only where the modulus is non-negligible, otherwise the phase information would be meaningless. In Figure 1c, one observes that the lines of constant phase point towards the instants where the signal is less regular, that is, during vortex breakdowns.

Local wavelet spectrum. Since the wavelet transform conserves energy and preserves locality in physical space, one can extend the concept of energy spectrum and define a local energy spectrum, such that

$$\tilde{E}(k, x) = \frac{1}{C_\psi k_\psi} \left| \tilde{f}\!\left( \frac{k_\psi}{k}, x \right) \right|^2 \quad \text{for } k \ge 0 \qquad [4]$$

where $k_\psi$ is the centroid wavenumber of the analyzing wavelet $\psi$ and $C_\psi$ is defined in the

admissibility condition (respectively, eqns [10] and [1] in the article Wavelets: Mathematical Theory).

By measuring $\tilde{E}(k, x)$ at different instants or positions, one estimates which elements in the signal contribute most to the global Fourier energy spectrum, in order to suggest a way to decompose the signal into different components. For example, if one considers turbulent flows, one can compare the energy spectrum of the coherent structures (such as isolated vortices in incompressible flows or shocks in compressible flows) and the energy spectrum of the incoherent background flow, since both elements exhibit different correlations and therefore different spectral slopes.

Figure 1 Example of a one-dimensional continuous wavelet analysis. (a) the signal to be analyzed, (b) the modulus of its wavelet coefficients, (c) the phase of its wavelet coefficients.

Global wavelet spectrum. Although the wavelet transform analyzes the flow using localized functions rather than complex exponentials, one can show that the global wavelet energy spectrum converges towards the Fourier energy spectrum, provided the analyzing wavelet has enough vanishing moments. More precisely, the global wavelet spectrum, defined by integrating [4] over all positions,

$$\tilde{E}(k) = \int_{-\infty}^{\infty} \tilde{E}(k, x) \, dx \qquad [5]$$

gives the correct exponent for a power-law Fourier energy spectrum $E(k) \propto k^{-\beta}$ if the analyzing wavelet has at least $M > (\beta - 1)/2$ vanishing moments. Thus, the steeper the energy spectrum one studies, the more vanishing moments the analyzing wavelet should have.

The inertial range, which corresponds to the scales at which turbulent flows are dominated by nonlinear interactions, exhibits a power-law behavior as predicted by the statistical theory of homogeneous and isotropic turbulence.

The ability to correctly evaluate the slope of the energy spectrum is an important property of the wavelet transform, which is related to its ability to detect and characterize singularities. We will not discuss here how wavelet coefficients could be used to study singularities and fractal measures, since this is presented in detail elsewhere (see Wavelets: Applications).

Relation to Classical Analysis

Relation to Fourier spectrum. The global wavelet energy spectrum $\tilde{E}(k)$ is actually a smoothed version of the Fourier energy spectrum $E(k)$. This can be

seen from the following relation between the two spectra:

$$\tilde{E}(k) = \frac{1}{C_\psi k} \int_0^{\infty} E(k') \left| \hat{\psi}\!\left( \frac{k_\psi k'}{k} \right) \right|^2 dk' \qquad [6]$$

which shows that the global wavelet spectrum is an average of the Fourier spectrum weighted by the square of the Fourier transform of the analyzing wavelet at wavenumber k. Note that the larger k, the larger the averaging interval, because wavelets are bandpass filters with $\Delta k / k$ constant. This property of the global wavelet energy spectrum is particularly useful to study turbulent flows. Indeed, the Fourier energy spectrum of a single realization of a turbulent flow oscillates too much for a slope to be clearly detected, while this is no longer the case for the global wavelet energy spectrum, which is a better estimator of the spectral slope.

The real-valued Marr wavelet [1] has only two vanishing moments and thus can correctly measure the energy spectrum exponents up to $\beta < 5$. In the case of the complex-valued Morlet wavelet [2], only the zeroth-order moment is null, but the higher mth-order moments are very small ($\propto k_\psi^m e^{-k_\psi^2 / 2}$), provided that $k_\psi$ is larger than 5. For instance, the Morlet wavelet transform with $k_\psi = 6$ gives accurate estimates of the power-law exponent of the energy spectrum up to $\beta < 7$.

There is also a family of wavelets with an infinite number of cancelations

$$\hat{\psi}_n(k) = \alpha_n \exp\!\left( -\frac{1}{2} \left( k^2 + \frac{1}{k^{2n}} \right) \right), \quad n \ge 1 \qquad [7]$$

where $\alpha_n$ is chosen for normalization.

These wavelets can therefore correctly measure any power-law energy spectrum, and thus detect the difference between a power-law energy spectrum and a Gaussian energy spectrum ($E(k) \propto e^{-(k/k_0)^2}$). For instance, it is important in turbulence to determine the wavenumber after which the energy spectrum decays exponentially, since this wavenumber defines the end of the inertial range, dominated by nonlinear interactions, and the beginning of the dissipative range, dominated by linear dissipation.

Relation to structure functions. In this subsection we will point out the limitations of classical measures of intermittency and present a set of wavelet-based alternatives.

The classical measures based on structure functions can be thought of as a special case of wavelet filtering using a nonsmooth wavelet defined as the difference of two Diracs (DOD). It is this lack of regularity of the underlying wavelet that limits the adequacy of classical measures to analyze smooth signals. Wavelet-based diagnostics can overcome these limitations and produce accurate results, whatever the signal to be analyzed.

We will link the scale-dependent moments of the wavelet coefficients and the structure functions, which are classically used to study turbulence. In the case of second-order statistics, the global wavelet spectrum corresponds to the second-order structure function. Furthermore, a rigorous bound for the maximum exponent detected by the structure functions can be computed, but there is a way to overcome this limitation by using wavelets.

The increments of a signal, also called the modulus of continuity, can be seen as its wavelet coefficients using the DOD wavelet

$$\psi^{\delta}(x) = \delta(x + 1) - \delta(x) \qquad [8]$$

We thus obtain

$$f(x + a) - f(x) = \tilde{f}_{x,a} = \langle f, \psi^{\delta}_{x,a} \rangle \qquad [9]$$

with $\psi^{\delta}_{x,a}(y) = \frac{1}{a} \left[ \delta\!\left( \frac{y - x}{a} + 1 \right) - \delta\!\left( \frac{y - x}{a} \right) \right]$. Note that the wavelet is normalized with respect to the $L^1$-norm. The pth-order structure function $S_p(a)$ therefore corresponds to the pth-order moment of the wavelet coefficients at scale a

$$S_p(a) = \int (\tilde{f}_{x,a})^p \, dx \qquad [10]$$

As the DOD wavelet has only one vanishing moment (its mean), the exponent of the pth-order structure function in the case of a self-similar behavior is limited by p, that is, if $S_p(a) \propto a^{\zeta(p)}$, then $\zeta(p) < p$. To be able to detect larger exponents, one has to use increments with a larger stencil, or wavelets with more vanishing moments.

We now concentrate on the case p = 2, that is, the energy norm. Equation [6] gives the relation between the global wavelet spectrum $\tilde{E}(k)$ and the Fourier spectrum $E(k)$ for an arbitrary wavelet $\psi$. For the DOD wavelet we find, since $\hat{\psi}^{\delta}(k) = e^{ik} - 1 = e^{ik/2} (e^{ik/2} - e^{-ik/2})$ and hence $|\hat{\psi}^{\delta}(k)|^2 = 2(1 - \cos k)$, that

$$\tilde{E}(k) = \frac{1}{C_\psi k} \int_0^{\infty} E(k') \left( 2 - 2 \cos\!\left( \frac{k_\psi k'}{k} \right) \right) dk' \qquad [11]$$

Setting $a = k_\psi / k$, we see that the wavelet spectrum corresponds to the second-order structure function, such that

$$\tilde{E}(k) = \frac{1}{C_\psi k} S_2(a) \qquad [12]$$

The above results show that, if the Fourier spectrum behaves like $k^{-\beta}$ for $k \to \infty$, $\tilde{E}(k) \propto k^{-\beta}$ if $\beta < 2M + 1$, where M denotes the number of vanishing moments of the wavelets. Consequently, we find for $S_2(a)$ that $S_2(a) \propto a^{\zeta(2)} = (k_\psi / k)^{\zeta(2)}$ for $a \to 0$ if $\zeta(2) \le 2M$. For the DOD wavelet, we have M = 1; therefore, the second-order structure function can only detect slopes smaller than 2, corresponding to an energy spectrum whose slope is shallower than −3. Thus, the usual structure functions give spurious results for sufficiently smooth signals. The relation between structure functions and wavelet coefficients can be generalized in the context of Besov spaces, which are classically used for nonlinear approximation theory (see Wavelets: Mathematical Theory).

Intermittency Measures

Intermittency is defined as localized bursts of high-frequency activity. This means that intermittent phenomena are localized in both physical and spectral spaces, and thus a suitable basis for representing intermittency should reflect this dual localization. The Fourier basis is well localized in spectral space, but delocalized in physical space. Therefore, when a turbulence signal is filtered using a high-pass Fourier transform and then reconstructed in physical space, for example, to calculate the flatness, some spatial information is lost. This leads to smoothing of strong gradients and spurious oscillations in the background, which come from the fact that the modulus and phase of the discarded high-wavenumber Fourier modes have been lost. The spatial errors introduced by such a Fourier filtering lead to errors in estimating the flatness, and hence the signal's intermittency.

When a quantity (e.g., velocity derivative) is intermittent, it contains rare but strong events (i.e., bursts of intense activity), which correspond to large deviations reflected in the “heavy tails” of the PDF. Second-order statistics (e.g., energy spectrum, second-order structure function) are relatively insensitive to such rare events, whose time or space supports are very small and thus do not dominate the integral. However, these events become increasingly important for higher-order statistics, where they finally dominate. High-order statistics therefore characterize intermittency. Of course, intermittency is not essential for all problems: second-order statistics are sufficient to measure dispersion (dominated by energy-containing scales), but not to calculate drag or mixing (dominated by vorticity production in thin boundary or shear layers).

To measure intermittency, one uses the space–scale information contained in the wavelet coefficients to define scale-dependent moments and moment ratios. Useful diagnostics to quantify the intermittency of a field f are the moments of its wavelet coefficients at different scales j

$$M_{p,j}(f) = 2^{-j} \sum_{i=0}^{2^j - 1} |\tilde{f}_{j,i}|^p \qquad [13]$$

Note that the distribution of energy scale by scale, that is, the scalogram, can be computed from the second-order moment of the orthogonal wavelet coefficients: $E_j = 2^{j-1} M_{2,j}$. Due to orthogonality of the decomposition, the total energy is just the sum: $E = \sum_{j \ge 0} E_j$.

The sparsity of the wavelet coefficients at each scale is a measure of intermittency, and it can be quantified using ratios of moments at different scales

$$Q_{p,q,j}(f) = \frac{M_{p,j}(f)}{(M_{q,j}(f))^{p/q}} \qquad [14]$$

which may be interpreted as quotient norms computed in two different functional spaces, $L^p$- and $L^q$-spaces. Classically, one chooses q = 2 to define typical statistical quantities as a function of scale. Recall that for p = 4 we obtain the scale-dependent flatness $F_j = Q_{4,2,j}$. It is equal to 3 for a Gaussian white noise at all scales j, which proves that this signal is not intermittent. The scale-dependent skewness, hyperskewness, and hyperflatness are obtained for p = 3, 5, and 6, respectively. For intermittent signals $Q_{p,q,j}$ increases with j, whatever p and q.

Wavelet Compression

Principle

To study turbulent signals, we now propose to separate the rare and extreme events from the dense events, and then calculate their statistics independently. A major difficulty in turbulence research is that there is no clear scale separation between these two kinds of events. This lack of “spectral gap” excludes Fourier filtering for disentangling these two behaviors. Since the rare events are well

localized in physical space, one might try to use an on–off filter defined in physical space to extract them. However, this approach changes the spectral properties by introducing spurious discontinuities, adding an artificial scaling (e.g., $k^{-2}$ in one dimension) to the energy spectrum. To avoid these problems, we use the wavelet representation, which combines both physical and spectral space localizations (bounded from below by Heisenberg's uncertainty principle). In turbulence, the relevant rare events are the coherent vortices and the dense events correspond to the residual background flow. We have proposed a nonlinear wavelet filtering of the wavelet coefficients of vorticity to extract the coherent vortices out of turbulent flows. We now detail the different steps of this procedure.

Extraction of Coherent Structures

Principle. We propose a new method to extract coherent structures from turbulent flows, as encountered in fluids (e.g., vortices, shocklets) or plasmas (e.g., bursts), in order to study their role in transport and mixing.

We first replace the Fourier representation by the wavelet representation, which keeps track of both time and scale, instead of frequency only. The second improvement consists in changing our viewpoint about coherent structures. Since there is not yet a universal definition of coherent structures, we prefer starting from a minimal but more consensual statement about them, that everyone hopefully could agree with: “coherent structures are not noise.” Using this apophatic method, we propose the following definition: “coherent structures are what remain after denoising.”

For the noise we use the mathematical definition stating that a noise cannot be compressed in any functional basis. Another way to say this is to observe that the shortest description of a noise is the noise itself. Notice that often one calls “noise” what is actually “experimental noise,” but not noise in the mathematical sense.

Considering our definition of coherent structures, turbulent signals can be split into two contributions: coherent bursts, corresponding to that part of the signal which can be compressed in a wavelet basis, and incoherent noise, corresponding to that part of the signal which cannot be compressed, neither in wavelets nor in any other basis. We will then check a posteriori that the incoherent contribution is spread, and therefore does not compress, in both Fourier and grid-point bases. Since we use the orthogonal wavelet representation, both coherent and incoherent components are orthogonal and therefore the $L^2$-norm, for example, energy or enstrophy, is a superposition of coherent and incoherent contributions (Mallat 1998).

Assuming that coherent structures are what remain after denoising, we need a model, not for the structures themselves, but for the noise. As a first guess, we choose the simplest model and suppose the noise to be additive, Gaussian and white, that is, uncorrelated. Having this model in mind, we use Donoho and Johnstone's theorem to compute the value to threshold the wavelet coefficients. Since the threshold value depends on the variance of the noise, which in the case of turbulence is not a priori known, we propose a recursive method to estimate it from the variance of the weakest wavelet coefficients, that is, those whose modulus is below the threshold value.

Wavelet decomposition. We describe the wavelet algorithm to extract coherent vortices out of turbulent flows and apply it as an example to a 3D turbulent flow. We consider the vorticity field $\boldsymbol{\omega} = \nabla \times \mathbf{v}$, computed at resolution $N = 2^{3J}$, N being the number of grid points and J the number of octaves in each spatial direction. Each vorticity component is developed into an orthogonal wavelet series from the largest scale $l_{\max} = 2^0$ to the smallest scale $l_{\min} = 2^{J-1}$ using a three-dimensional (3D) MRA:

$$\omega(\mathbf{x}) = \bar{\omega}_{0,0,0} \, \phi_{0,0,0}(\mathbf{x}) + \sum_{j=0}^{J-1} \sum_{i_x=0}^{2^j - 1} \sum_{i_y=0}^{2^j - 1} \sum_{i_z=0}^{2^j - 1} \sum_{d=1}^{7} \tilde{\omega}^{d}_{j,i_x,i_y,i_z} \, \psi^{d}_{j,i_x,i_y,i_z}(\mathbf{x}) \qquad [15]$$

with $\phi_{j,i_x,i_y,i_z}(\mathbf{x}) = \phi_{j,i_x}(x) \, \phi_{j,i_y}(y) \, \phi_{j,i_z}(z)$, and

$$\psi^{d}_{j,i_x,i_y,i_z}(\mathbf{x}) = \begin{cases} \psi_{j,i_x}(x) \, \phi_{j,i_y}(y) \, \phi_{j,i_z}(z) & d = 1 \\ \phi_{j,i_x}(x) \, \psi_{j,i_y}(y) \, \phi_{j,i_z}(z) & d = 2 \\ \phi_{j,i_x}(x) \, \phi_{j,i_y}(y) \, \psi_{j,i_z}(z) & d = 3 \\ \psi_{j,i_x}(x) \, \phi_{j,i_y}(y) \, \psi_{j,i_z}(z) & d = 4 \\ \psi_{j,i_x}(x) \, \psi_{j,i_y}(y) \, \phi_{j,i_z}(z) & d = 5 \\ \phi_{j,i_x}(x) \, \psi_{j,i_y}(y) \, \psi_{j,i_z}(z) & d = 6 \\ \psi_{j,i_x}(x) \, \psi_{j,i_y}(y) \, \psi_{j,i_z}(z) & d = 7 \end{cases} \qquad [16]$$

where $\phi_{j,i}$ and $\psi_{j,i}$ are the one-dimensional scaling function and the corresponding wavelet, respectively. Due to orthogonality, the scaling coefficients are given by $\bar{\omega}_{0,0,0} = \langle \omega, \phi_{0,0,0} \rangle$ and the wavelet coefficients are given by $\tilde{\omega}^{d}_{j,i_x,i_y,i_z} = \langle \omega, \psi^{d}_{j,i_x,i_y,i_z} \rangle$, where $\langle \cdot, \cdot \rangle$ denotes the $L^2$-inner product.
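The structure of such an orthogonal wavelet series can be sketched in one dimension. The example below is only an illustration under simplifying assumptions: it uses the Haar wavelet rather than the Coifman wavelet 12 preferred by the authors, a 1D signal rather than a 3D vorticity field, and hypothetical function names (`haar_forward`, `haar_inverse`). It shows the features the text relies on: one scaling coefficient plus $2^j$ wavelet coefficients at each octave j, exact reconstruction, and conservation of the $L^2$-norm.

```python
import math


def haar_forward(signal):
    """Orthonormal Haar transform of a signal whose length is a power of two.

    Returns the single scaling coefficient and a list `details` where
    details[j] holds the 2**j wavelet coefficients of octave j.
    """
    n = len(signal)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    root2 = math.sqrt(2.0)
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = [(approx[2 * i], approx[2 * i + 1]) for i in range(len(approx) // 2)]
        # Finest octave is computed first, so prepend to end with octave 0 first.
        details.insert(0, [(a - b) / root2 for a, b in pairs])
        approx = [(a + b) / root2 for a, b in pairs]
    return approx[0], details


def haar_inverse(scaling, details):
    """Rebuild the signal from the scaling coefficient and the wavelet coefficients."""
    root2 = math.sqrt(2.0)
    approx = [scaling]
    for level in details:
        rebuilt = []
        for a, d in zip(approx, level):
            rebuilt.append((a + d) / root2)
            rebuilt.append((a - d) / root2)
        approx = rebuilt
    return approx


signal = [4.0, 2.0, 5.0, 5.0, 1.0, 0.0, 3.0, 7.0]  # N = 2**3 samples, J = 3 octaves
scaling, details = haar_forward(signal)
rebuilt = haar_inverse(scaling, details)

energy_physical = sum(v * v for v in signal)
energy_wavelet = scaling ** 2 + sum(c * c for level in details for c in level)
print(energy_physical, energy_wavelet)
```

The total number of coefficients (one scaling coefficient plus 1 + 2 + 4 wavelet coefficients) equals the number of samples, which is the non-redundancy that makes orthogonal wavelets suitable for compression.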

Nonlinear thresholding. The vorticity field is then split into $\boldsymbol{\omega}_C$ and $\boldsymbol{\omega}_I$ by applying a nonlinear thresholding to the wavelet coefficients. The threshold is defined as $\epsilon = \left( \tfrac{4}{3} Z \ln N \right)^{1/2}$. It only depends on the total enstrophy $Z = \tfrac{1}{2} \int |\boldsymbol{\omega}|^2 \, d\mathbf{x}$ and on the number of grid points N, without any adjustable parameter. The choice of this threshold is based on theorems by Donoho and Johnstone proving optimality of the wavelet representation to denoise signals in the presence of Gaussian white noise, since this wavelet-based estimator minimizes the maximal $L^2$-error for functions with inhomogeneous regularity (Mallat 1998).
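The thresholding step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the coefficients are made-up values, each mode is assumed to carry one orthonormal wavelet coefficient per vorticity component, the modulus is assumed to be the Euclidean norm of that triple, and Z is taken as the mean enstrophy per grid point (a discrete stand-in for the integral above).

```python
import math

ZERO = (0.0, 0.0, 0.0)


def enstrophy(coeffs):
    """Half the mean squared modulus of the coefficient triples (assumed normalization)."""
    n = len(coeffs)
    return 0.5 * sum(cx * cx + cy * cy + cz * cz for cx, cy, cz in coeffs) / n


def split_coherent_incoherent(coeffs):
    """Apply the threshold eps = (4/3 Z ln N)**0.5 to the coefficient moduli."""
    n = len(coeffs)
    eps = math.sqrt(4.0 / 3.0 * enstrophy(coeffs) * math.log(n))
    coherent, incoherent = [], []
    for c in coeffs:
        modulus = math.sqrt(c[0] ** 2 + c[1] ** 2 + c[2] ** 2)
        if modulus > eps:
            coherent.append(c)
            incoherent.append(ZERO)
        else:
            coherent.append(ZERO)
            incoherent.append(c)
    return eps, coherent, incoherent


# Hypothetical data: two strong modes on top of a weak, dense background.
coeffs = [(10.0, 0.0, 0.0), (0.0, 8.0, 0.0)] + [(0.1, 0.1, 0.1)] * 62
eps, coherent, incoherent = split_coherent_incoherent(coeffs)
n_coherent = sum(1 for c in coherent if c != ZERO)

z_total = enstrophy(coeffs)
z_coherent = enstrophy(coherent)
z_incoherent = enstrophy(incoherent)
print(eps, n_coherent, z_total, z_coherent + z_incoherent)
```

Because every mode is assigned to exactly one of the two sets, the enstrophies of the two parts add up to the total, which mirrors the orthogonality of the coherent and incoherent fields.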

Figure 2 Isosurfaces of total vorticity field, for $|\boldsymbol{\omega}| = 3\sigma, 4\sigma, 5\sigma$ with opacity 1, 0.5, 0.1, respectively, and $\sigma^2$ the total enstrophy. Simulation with resolution $N = 256^3$ for $R_\lambda = 168$. Zoom on a subcube $64^3$. Reprinted with permission from Farge et al. Coherent vortex extraction in three-dimensional homogeneous turbulence: Comparison between CVS-wavelet and POD-Fourier decompositions. Physics of Fluids 15(10): 2886–2896. Copyright 2003, American Institute of Physics.

Wavelet reconstruction. The coherent vorticity field $\boldsymbol{\omega}_C$ is reconstructed from the wavelet coefficients whose modulus is larger than $\epsilon$ and the incoherent vorticity field $\boldsymbol{\omega}_I$ from the wavelet coefficients whose modulus is smaller than or equal to $\epsilon$. The two fields thus obtained, $\boldsymbol{\omega}_C$ and $\boldsymbol{\omega}_I$, are orthogonal, which ensures a separation of the total enstrophy into $Z = Z_C + Z_I$ because the interaction term $\langle \boldsymbol{\omega}_C, \boldsymbol{\omega}_I \rangle$ vanishes. We then use Biot–Savart's relation $\mathbf{v} = \nabla \times (\nabla^{-2} \boldsymbol{\omega})$ to reconstruct the coherent velocity $\mathbf{v}_C$ and the incoherent velocity $\mathbf{v}_I$ from the coherent and incoherent
vorticities, respectively. coherent vorticities, but they have been reduced by a
factor 2 for the incoherent vorticity whose fluctuations
Application to 3D Turbulence are much smaller. In the coherent vorticity (Figure 3)
We consider a 3D homogeneous isotropic turbulent we recognize the same vortex tubes as those present in
flow, computed by DNS at resolution N = 2563 , the total vorticity (Figure 2). In contrast, the remaining
which corresponds to a Reynolds number based vorticity (Figure 4) is much more homogeneous and
on the Taylor microscale R = 168 (Farge et al.
2003). The computation uses a pseudospectral
code, with a Gaussian random vorticity field as initial
condition, and the flow evolution is integrated until a
statistically stationary state is reached. Figure 2 shows
the modulus of the vorticity fluctuations of the total
flow, zooming on a 643 subcube to enhance structural
details. The flow exhibits elongated, distorted, and
folded vortex tubes, as observed in laboratory and
numerical experiments.
We apply to the total flow the wavelet compres-
sion algorithm described above. We find that only
2.9% wavelet modes correspond to the coherent
flow, which retains 79% of the energy (L2-norm of
velocity) and 75% of the enstrophy (L2-norm of
vorticity), while the remaining 97.1% incoherent
modes contain only 1% of the energy and 21% of
the enstrophy. We display the modulus of the ω>
coherent (Figure 3) and incoherent (Figure 4) vorti- Figure 3 Isosurfaces of coherent vorticity field, for
city fluctuations resulting from the wavelet jwj = 3
, 4
, 5
with opacity 1, 0.5, 0.1, respectively. Simulation
with resolution N = 2563 . Zoom on a subcube 643 : Reprinted with
decomposition.
permission from Farge et al. Coherent vortex extraction in three-
Note that the values of the three isosurfaces chosen dimensional homogeneous turbulence: Comparison between CVS-
for visualization (j!j = 6Z1=2 , 8Z1=2 and 10Z1=2 , with wavelet and POD-Fourier decompositions. Physics of Fluids
Z the total enstrophy) are the same for the total and 15(10): 2886–2896. Copyright 2003, American Institute of Physics.

does not exhibit coherent structures. Hence, the wavelet compression retains all the vortex tubes and preserves their structure at all scales. Consequently, the coherent flow is as intermittent as the total flow, while the incoherent flow is structureless and nonintermittent. Modeling the effect of the incoherent flow onto the coherent flow should then be much simpler than with methods based on Fourier filtering.

Figure 4 Isosurfaces of incoherent vorticity field, for $|\omega| = \tfrac{3}{2}\sigma, 2\sigma, \tfrac{5}{2}\sigma$ with opacity 1, 0.5, 0.1, respectively. Simulation with resolution $N = 256^3$. Zoom on a subcube $64^3$. Reprinted with permission from Farge et al. Coherent vortex extraction in three-dimensional homogeneous turbulence: Comparison between CVS-wavelet and POD-Fourier decompositions. Physics of Fluids 15(10): 2886–2896. Copyright 2003, American Institute of Physics.

Figure 5 shows the velocity PDF in semilogarithmic coordinates. We observe that the coherent velocity has the same Gaussian distribution as the total velocity, while the incoherent velocity remains Gaussian, but its variance is much smaller. The corresponding energy spectra are plotted in Figure 6. We observe that the spectrum of the coherent energy is identical to the spectrum of the total energy all along the inertial range. This implies that the vortex tubes are responsible for the $k^{-5/3}$ energy scaling, which corresponds to a long-range correlation, characteristic of 3D turbulence as predicted by Kolmogorov's theory. In contrast, the incoherent energy has a scaling close to $k^2$, which corresponds to an energy equipartition between all wave vectors $\mathbf{k}$, since the isotropic spectrum is obtained by integrating energy in 3D k-space over 2D shells $k = |\mathbf{k}|$. The incoherent velocity field is therefore spatially uncorrelated, which is consistent with the observation that incoherent vorticity is structureless and homogeneous.

Figure 5 Velocity PDF, resolution $N = 256^3$ with a zoom at $64^3$. Reprinted with permission from Farge et al. Coherent vortex extraction in three-dimensional homogeneous turbulence: Comparison between CVS-wavelet and POD-Fourier decompositions. Physics of Fluids 15(10): 2886–2896. Copyright 2003, American Institute of Physics.

Figure 6 Energy spectrum, resolution $N = 256^3$ with a zoom at $64^3$. Reprinted with permission from Farge et al. Coherent vortex extraction in three-dimensional homogeneous turbulence: Comparison between CVS-wavelet and POD-Fourier decompositions. Physics of Fluids 15(10): 2886–2896. Copyright 2003, American Institute of Physics.

From these observations, we propose the following scenario to interpret the turbulent cascade: the coherent energy injected at large scales is transferred towards small scales by nonlinear interactions between vortex tubes. In the meantime, these nonlinear interactions also produce incoherent energy at all scales, which is dissipated at the smallest scales by molecular kinematic viscosity. Thus, the coherent flow causes direct transfer of the coherent energy into incoherent energy. Conversely, the incoherent flow does not trigger any energy transfer to the coherent flow, as it is structureless and uncorrelated. We conjecture that the coherent flow is dynamically active, while the incoherent flow is slaved to it, being only passively advected and mixed by the coherent vortex tubes. This is a different view from the classical interpretation since it does not suppose any scale separation. Both

coherent and incoherent flows are active all along the inertial range, but they are characterized by different probability distribution functions and correlations: non-Gaussian and long-range correlated for the former, while Gaussian and uncorrelated for the latter.

Wavelet Computation

Principle

The mathematical properties of wavelets (see Wavelets: Mathematical Theory) motivate their use for solving partial differential equations (PDEs). The localization of wavelets, both in scale and space, leads to effective sparse representations of functions and pseudodifferential operators (and their inverse) by performing nonlinear thresholding of the wavelet coefficients of the function and of the matrices representing the operators. Wavelet coefficients make it possible to estimate the local regularity of solutions of PDEs and thus can define autoadaptive discretizations with local mesh refinements. The characterization of function spaces in terms of wavelet coefficients and the corresponding norm equivalences lead to diagonal preconditioning of operators in wavelet space. Moreover, the existence of the fast wavelet transform yields algorithms with optimal linear complexity. The currently existing algorithms can be classified in different ways. We can distinguish between Galerkin, collocation, and hybrid schemes. Hybrid schemes combine classical discretizations, for example, finite differences or finite volumes, and wavelets, which are only used to speed up the linear algebra and to define adaptive grids. On the other hand, Galerkin and collocation schemes employ wavelets directly for the discretization of the solution and the operators. Wavelet methods have been developed to solve Burgers', Stokes, Kuramoto–Sivashinsky, nonlinear Schrödinger, Euler, and Navier–Stokes equations. As an example, we present an adaptive wavelet algorithm, of Galerkin type, to solve the 2D Navier–Stokes equations.

Adaptive Wavelet Scheme

We consider the 2D Navier–Stokes equations written in terms of vorticity $\omega$ and stream function $\Psi$, which are both scalars in two dimensions,

\[ \partial_t \omega + v \cdot \nabla \omega - \nu \nabla^2 \omega = \nabla \times F \tag{17} \]

\[ \nabla^2 \Psi = \omega \quad \text{and} \quad v = \nabla^{\perp} \Psi \tag{18} \]

for $x \in [0, 1]^2$, $t > 0$. The velocity is denoted by $v$, $F$ is an external force, $\nu > 0$ is the molecular kinematic viscosity, and $\nabla^{\perp} = (-\partial_y, \partial_x)$. The above equations are completed with boundary conditions and a suitable initial condition.

Time discretization Introducing a classical semi-implicit time discretization with a time step $\Delta t$ and setting $\omega^n(x) \approx \omega(x, n\Delta t)$, we obtain

\[ (1 - \nu \Delta t \nabla^2)\, \omega^{n+1} = \omega^n + \Delta t \, (\nabla \times F^n - v^n \cdot \nabla \omega^n) \tag{19} \]

\[ \nabla^2 \Psi^{n+1} = \omega^{n+1} \quad \text{and} \quad v^{n+1} = \nabla^{\perp} \Psi^{n+1} \tag{20} \]

Hence, in each time step two elliptic problems have to be solved and a differential operator has to be applied.

Formally the above equations can be written in the abstract form $Lu = f$, where $L$ is an elliptic operator with constant coefficients. This corresponds to a Helmholtz type equation for $\omega$ with $L = (1 - \nu \Delta t \nabla^2)$ and a Poisson equation for $\Psi$ with $L = \nabla^2$.

Spatial discretization For the spatial discretization, we use the method of weighted residuals, that is, a Petrov–Galerkin scheme. The trial functions are orthogonal wavelets $\psi$ and the test functions are operator adapted wavelets, called "vaguelettes," $\theta$. To solve the elliptic equation $Lu = f$ at time step $t^{n+1}$, we develop $u^{n+1}$ into an orthogonal wavelet series, that is, $u^{n+1} = \sum_{\lambda} \tilde{u}^{n+1}_{\lambda} \psi_{\lambda}$, where $\lambda = (j, i_x, i_y, d)$ denotes the multi-index for scale $j$, space $i$, and direction $d$. Requiring that the residual vanishes with respect to all test functions $\theta_{\lambda'}$, we obtain a linear system for the unknown wavelet coefficients $\tilde{u}^{n+1}_{\lambda}$ of the solution $u$:

\[ \sum_{\lambda} \tilde{u}^{n+1}_{\lambda} \langle L \psi_{\lambda}, \theta_{\lambda'} \rangle = \langle f, \theta_{\lambda'} \rangle \tag{21} \]

The test functions $\theta$ are defined such that the stiffness matrix turns out to be the identity. Therefore, the solution of $Lu = f$ reduces to a change of basis, that is, $u^{n+1} = \sum_{\lambda} \langle f, \theta_{\lambda} \rangle \psi_{\lambda}$. The right-hand side (RHS) $f$ can then be developed into a biorthogonal operator adapted wavelet basis $f = \sum_{\lambda} \langle f, \theta_{\lambda} \rangle \mu_{\lambda}$, with $\theta_{\lambda} = L^{\star -1} \psi_{\lambda}$ and $\mu_{\lambda} = L \psi_{\lambda}$, $\star$ denoting the adjoint operator. By construction, $\theta$ and $\mu$ are biorthogonal, that is, such that $\langle \theta_{\lambda}, \mu_{\lambda'} \rangle = \delta_{\lambda, \lambda'}$. It can be shown that both have similar localization properties in physical and Fourier space as $\psi$, and that they form a Riesz basis.

Adaptive discretization To get an adaptive space discretization for the linear problem $Lu = f$, we

consider only the significant wavelet coefficients of the solution. Hence, we only retain coefficients $\tilde{u}^{n}_{\lambda}$ whose modulus is larger than a given threshold $\varepsilon$, that is, $|\tilde{u}^{n}_{\lambda}| > \varepsilon$. The corresponding coefficients are shown in Figure 7 (white area under the solid line curve).

Figure 7 Illustration of the dynamic adaption strategy in wavelet coefficient space.

Adaption strategy To be able to integrate the equation in time we have to account for the evolution of the solution in wavelet coefficient space (indicated by the arrow in Figure 7). Therefore, we add at time step $t^n$ the neighbors to the retained coefficients, which constitute a security zone (gray area in Figure 7). The equation is then solved in this enlarged coefficient set (white and gray areas below the curves in Figure 7) to obtain $\tilde{u}^{n+1}_{\lambda}$. Subsequently, we threshold the coefficients and retain only those whose modulus $|\tilde{u}^{n+1}_{\lambda}| > \varepsilon$ (coefficients under the dashed curve in Figure 7). This strategy is applied in each time step and hence automatically tracks the evolution of the solution in both scale and space.

Evaluation of the nonlinear term For the evaluation of the nonlinear term $f(u^n)$, where the wavelet coefficients $\tilde{u}^n$ are given, there are two possibilities:

• Evaluation in wavelet coefficient space. As illustration, we consider a quadratic nonlinear term, $f(u) = u^2$. The wavelet coefficients of $f$ can be calculated using the connection coefficients, that is, one has to calculate the bilinear expression $\sum_{\lambda} \sum_{\lambda'} \tilde{u}_{\lambda} I_{\lambda \lambda' \lambda''} \tilde{u}_{\lambda'}$ with the interaction tensor $I_{\lambda \lambda' \lambda''} = \langle \psi_{\lambda} \psi_{\lambda'}, \theta_{\lambda''} \rangle$. Although many coefficients of $I$ are zero or very small, the size of $I$ leads to a computation which is quite intractable in practice.
• Evaluation in physical space. This approach is similar to the pseudospectral evaluation of the nonlinear terms used in spectral methods; therefore it is called the pseudowavelet technique. The advantage of this scheme is that general nonlinear terms, for example, $f(u) = (1 - u)\, e^{-C/u}$, can be treated more easily. The method can be summarized as follows: starting from the significant wavelet coefficients, $|\tilde{u}_{\lambda}| > \varepsilon$, one reconstructs $u$ on a locally refined grid and gets $u(x_{\lambda})$. Then one can evaluate $f(u(x_{\lambda}))$ pointwise and the wavelet coefficients $\tilde{f}_{\lambda}$ are calculated using the adaptive decomposition.

Finally, one computes the scalar products of the RHS of [21] with the test functions $\theta$ to advance the solution in time. We compute $\tilde{u}_{\lambda} = \langle f, \theta_{\lambda} \rangle$ belonging to the enlarged coefficient set (white and gray regions in Figure 7).

The algorithm is of $O(N)$ complexity, where $N$ denotes the number of wavelet coefficients retained in the computation.

Application to 2D Turbulence

To illustrate the above algorithm we present an adaptive wavelet computation of a vortex dipole in a square domain, impinging on a no-slip wall at Reynolds number $Re = 1000$. To take into account the solid wall, we use a volume penalization method, for which both the fluid flow and the solid container are modeled as a porous medium whose porosity tends towards zero in the solid region and towards infinity in the fluid.

The 2D Navier–Stokes equations are thus modified by adding the forcing term $F = -(1/\eta)\chi v$ in eqn [17], where $\eta$ is the penalization parameter and $\chi$ is the characteristic function whose value is 1 in the solid region and 0 elsewhere. The equations are solved using the adaptive wavelet method in a periodic square domain of size 1.1, in which the square container of size 1 is embedded, taking $\eta = 10^{-3}$. The maximal resolution corresponds to a fine grid of $1024^2$ points. Figure 8a shows snapshots of the vorticity field at times $t = 0.2$, 0.4, 0.6, and 0.8 (in arbitrary units). We observe that the vortex dipole is moving towards the wall and that strong vorticity gradients are produced when the dipole hits the wall. The computational grid is dynamically adapted during the flow evolution, since the nonlinear wavelet filter automatically refines the grid in regions where strong gradients develop. Figure 8b shows the centers of the retained wavelet coefficients at corresponding times.

Note that during the computation only 5% out of $1024^2$ wavelet coefficients are used. The time evolution of the total kinetic energy and the total enstrophy are plotted in Figure 9 to

show the production of enstrophy and the concomitant dissipation of energy when the vortex dipole hits the wall.

Figure 8 Dipole wall interaction at $Re = 1000$. (a) Vorticity field, (b) corresponding centers of the active wavelets, at $t = 0.2$, 0.4, 0.6, and 0.8 (from top to bottom).
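
The penalization forcing used in this computation, $F = -(1/\eta)\chi v$, can be sketched on a uniform grid as below. The sketch assumes that the container of size 1 is centred in the periodic domain of size 1.1 and that the fields are sampled at cell centres; neither detail is stated in the text, and the function names are hypothetical.

```python
import numpy as np


def solid_mask(n, domain_size=1.1, container_size=1.0):
    """Characteristic function chi on an n x n grid: 1 in the solid, 0 in the fluid.

    Assumes the square container is centred in the periodic domain.
    """
    x = (np.arange(n) + 0.5) * domain_size / n  # cell-centre coordinates
    margin = 0.5 * (domain_size - container_size)
    inside_1d = (x > margin) & (x < domain_size - margin)
    fluid = np.outer(inside_1d, inside_1d)
    return (~fluid).astype(float)


def penalization_force(vx, vy, chi, eta=1e-3):
    """Forcing F = -(1/eta) * chi * v, which damps the velocity inside the solid only."""
    return -(chi / eta) * vx, -(chi / eta) * vy
```

With a small $\eta$ the forcing drives the velocity towards zero wherever $\chi = 1$, which is how the no-slip wall is approximated without a body-fitted grid.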
This computation illustrates the fact that the adaptive wavelet method allows an automatic grid refinement, both in the boundary layers at the wall and also in shear layers which develop during the flow evolution far from the wall. Therewith, the number of grid points necessary for the computation is significantly reduced, and we conjecture that the resulting compression rate will increase with the Reynolds number.

Figure 9 Time evolution of energy $E(t)$ (solid line) and enstrophy $Z(t)$ (dashed line).
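
The automatic refinement rests on the adaption strategy described earlier: threshold the coefficients, then add a security zone of neighbours before the next time step. A minimal one-dimensional sketch follows. It assumes coefficients indexed by (scale $j$, position $i$) with $2^j$ positions per scale, and it takes "neighbours" to mean the adjacent positions at the same scale, the two children at the next finer scale, and the parent at the next coarser scale; this neighbour definition and the function names are assumptions made for illustration.

```python
def significant(coeffs, eps):
    """Indices of coefficients whose modulus exceeds the threshold eps.

    coeffs maps (j, i) index pairs to coefficient values.
    """
    return {index for index, value in coeffs.items() if abs(value) > eps}


def add_security_zone(active, max_level):
    """Enlarge the active set with neighbours in position and in scale."""
    zone = set(active)
    for j, i in active:
        # Neighbours in position at the same scale.
        for neighbour in (i - 1, i + 1):
            if 0 <= neighbour < 2**j:
                zone.add((j, neighbour))
        # Children at the next finer scale, so that new small scales can appear.
        if j + 1 <= max_level:
            zone.add((j + 1, 2 * i))
            zone.add((j + 1, 2 * i + 1))
        # Parent at the next coarser scale.
        if j > 0:
            zone.add((j - 1, i // 2))
    return zone
```

In a time loop one would solve on `add_security_zone(significant(coeffs, eps), max_level)` and then threshold again, so that the active set follows the solution in both scale and space.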

Acknowledgments

Marie Farge thankfully acknowledges Trinity College, Cambridge, UK, and the Centre International de Rencontres Mathématiques (CIRM), Marseille, France, for hospitality while writing this paper.

See also: Turbulence Theories; Viscous Incompressible Fluids: Mathematical Theory; Wavelets: Applications; Wavelets: Mathematical Theory.

Further Reading

Cohen A (2000) Wavelet methods in numerical analysis. In: Ciarlet PG and Lions JL (eds.) Handbook of Numerical Analysis, vol. 7, Amsterdam: Elsevier.
Cuypers Y, Maurel A, and Petitjeans P (2003) Physical Review Letters 91: 194502.
Dahmen W (1997) Wavelets and multiscale methods for operator equations. Acta Numerica 6: 55–228.
Daubechies I (1992) Ten Lectures on Wavelets. SIAM.
Daubechies I, Grossmann A, and Meyer Y (1986) Journal of Mathematical Physics 27: 1271.
Farge M (1992) Wavelet transforms and their applications to turbulence. Annual Review of Fluid Mechanics 24: 395–457.
Farge M and Rabreau G (1988) Comptes Rendus Hebdomadaires des Seances de l'Academie des Sciences, Paris 2: 307.
Farge M, Kevlahan N, Perrier V, and Goirand E (1996) Wavelets and turbulence. Proceedings of the IEEE 84(4): 639–669.
Farge M, Kevlahan N, Perrier V, and Schneider K (1999) Turbulence analysis, modelling and computing using wavelets. In: van den Berg JC (ed.) Wavelets in Physics, pp. 117–200. Cambridge: Cambridge University Press.
Farge M and Schneider K (2002) Analysing and computing turbulent flows using wavelets. In: Lesieur M, Yaglom A, and David F (eds.) New trends in turbulence, Les Houches 2000, vol. 74, pp. 449–503. Springer.
Farge M, Schneider K, Pellegrino G, Wray AA, and Rogallo RS (2003) Physics of Fluids 15(10): 2886.
Grossmann A and Morlet J (1984) SIAM Journal on Mathematical Analysis 15: 723.
Lemarié P-G and Meyer Y (1986) Revista Matematica Iberoamericana 2: 1.
Mallat S (1989) Transactions of the American Mathematical Society 315: 69.
Mallat S (1998) A Wavelet Tour of Signal Processing. Academic Press.
Schneider K, Farge M, and Kevlahan N (2004) Spatial intermittency in two-dimensional turbulence. In: Tongring N and Penner RC (eds.) Woods Hole Mathematics, Perspectives in Mathematics and Physics, pp. 302–328. World Scientific.
http://wavelets.ens.fr – other papers about wavelets and turbulence can be downloaded from this site.

Wavelets: Applications
M Yamada, Kyoto University, Kyoto, Japan
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Wavelet analysis was first developed in the early 1980s in the field of seismic signal analysis in the form of an integral transform with a localized kernel function with continuous parameters of dilation and translation. When a seismic wave or its derivative has a singular point, the integral transform has a scaling property with respect to the dilation parameter; thus, this scaling behavior can be used to locate the singular point. In the mid-1980s, the orthonormal smooth wavelet was first constructed, and later the construction method was generalized and reformulated as multiresolution analysis (MRA). Since then, several kinds of wavelets have been proposed for various purposes, and the concept of wavelet has been extended to new types of basis functions. In this sense, the most important effect of wavelets may be that they have awakened deep interest in bases employed in data analysis and data processing. Wavelets are now widely used in various fields of research; some of their applications are discussed in this article.

From the perspective of time–frequency analysis, the wavelet analysis may be regarded as a windowed Fourier analysis with a variable window width, narrower for higher frequency. The wavelets can therefore give information on the local frequency structure of an event; they have been applied to various kinds of one-dimensional (1D) or multidimensional signals, for example, to identify an event or to denoise or to sharpen the signal.

1D wavelets $\psi^{(a,b)}(x)$ are defined as

\[ \psi^{(a,b)}(x) = \frac{1}{\sqrt{|a|}} \, \psi\!\left( \frac{x - b}{a} \right) \]

where $a\,(\neq 0)$, $b$ are real parameters and $\psi(x)$ is a spatially localized function called "analyzing wavelet" or "mother wavelet." Wavelet analysis gives a decomposition of a function into a linear combination of those wavelets, where a perfect reconstruction requires the analyzing wavelet to satisfy some mathematical conditions.

For the continuous wavelet transform (CWT), where the parameters $(a, b)$ are continuous, the analyzing wavelet $\psi(x) \in L^2(\mathbb{R})$ has to satisfy the admissibility condition

\[ C_\psi \equiv \int_{-\infty}^{\infty} \frac{|\hat{\psi}(\omega)|^2}{|\omega|} \, d\omega < \infty \]

where $\hat{\psi}(\omega)$ is the Fourier transform of $\psi(x)$:

\[ \hat{\psi}(\omega) = \int_{-\infty}^{\infty} e^{-i\omega x} \psi(x) \, dx \]

The admissibility condition is known to be equivalent to the condition that $\psi(x)$ has no zero-frequency component, that is, $\hat{\psi}(0) = 0$, under some mild condition for the decay rate at infinity. Then the CWT and its inverse transform of a data function $f(x) \in L^2(\mathbb{R})$ are defined as

\[ T(a, b) = \frac{1}{\sqrt{C_\psi}} \int_{-\infty}^{\infty} \overline{\psi^{(a,b)}(x)} \, f(x) \, dx \]

\[ f(x) = \frac{1}{\sqrt{C_\psi}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} T(a, b) \, \psi^{(a,b)}(x) \, \frac{da \, db}{a^2} \]

In the case of the discrete wavelet transform (DWT), the parameters $(a, b)$ are taken discrete; a typical choice is $a = 1/2^j$, $b = k/2^j$, where $j$ and $k$ are integers:

\[ \psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k) \]

In order that the wavelets $\{ \psi_{j,k}(x) \mid j, k \in \mathbb{Z} \}$ may constitute a complete orthonormal system in $L^2(\mathbb{R})$, the analyzing wavelet should satisfy more stringent conditions than the admissibility condition for the CWT, and is now constructed in the framework of MRA. A data function is then decomposed by the DWT as

\[ f(x) = \sum_{j,k=-\infty}^{\infty} \alpha_{j,k} \, \psi_{j,k}(x), \qquad \alpha_{j,k} = \int_{-\infty}^{\infty} \overline{\psi_{j,k}(x)} \, f(x) \, dx \]

Even when the discrete wavelets do not constitute a complete orthonormal system, they often form a wavelet frame if linear combinations of the wavelets are dense in $L^2(\mathbb{R})$ and if there are two positive constants $A$, $B$ such that the inequality

\[ A \| f \|^2 \le \sum_{j,k} | \langle \psi_{j,k}, f \rangle |^2 \le B \| f \|^2 \]

holds for an arbitrary $f(x) \in L^2(\mathbb{R})$. For the wavelet frame $\{ \psi_{j,k} \}$, there is a corresponding dual frame, $\{ \tilde{\psi}_{j,k} \}$, which permits the following expansion of $f(x)$:

\[ f(x) = \sum_{j,k} \langle \psi_{j,k}, f \rangle \, \tilde{\psi}_{j,k}(x) = \sum_{j,k} \langle \tilde{\psi}_{j,k}, f \rangle \, \psi_{j,k}(x) \]

The wavelet frame is also employed in several applications.

From the viewpoint of applications, the CWTs are better adapted for the analysis of data functions, including the detection of singularities and patterns, while the DWTs are adapted to the data processing, including signal compression or denoising.

Singularity Detection and Multifractal Analysis of Functions

Since its birth, the wavelet analysis has been applied for the detection of singularity of a data function. The Hölder exponent $h(x_0)$ of a function $f(x)$ at $x_0$ is defined here as the largest value of the exponent $h$ such that there exists a polynomial $P_n(x)$ of degree $n$ that satisfies for $x$ in the neighborhood of $x_0$:

\[ | f(x) - P_n(x - x_0) | = O( |x - x_0|^h ) \]

The data function is not differentiable if $h(x_0) < 1$, but if $h(x_0) > 1$ then it is differentiable and a singularity may arise in its higher derivatives. The wavelet transform is applied to find the Hölder exponent $h(x_0)$, because $T(a, b)$ has an asymptotic behavior $T(a, b) = O(a^{h(x_0) + 1/2})$ $(a \to 0)$ if the analyzing wavelet has $N\,(> h(x_0))$ vanishing moments, that is,

\[ \int_{-\infty}^{\infty} x^m \psi(x) \, dx = 0, \qquad m \in \mathbb{Z}, \quad 0 \le m < N \]

A commonly used analyzing wavelet for this purpose may be the $N$th derivative of the Gaussian function, $\psi(x) = d^N (e^{-x^2/2}) / dx^N$. This method works well to examine a single or some finite number of singular points of the data function.

When the data function is a multifractal function with an infinite number of singular points of various strengths, the multifractal property of the data function is often characterized by the singularity spectrum $D(h)$ which denotes the Hausdorff dimension of the set of points where $h(x) = h$. The singularity spectrum is, however, difficult to obtain directly from the CWT, and the Legendre transformation is introduced to bypass the difficulty.

Fully developed 3D fluid turbulence may be a typical example of wavelet application to the singularity detection. The Kolmogorov similarity law of fluid turbulence for the longitudinal velocity increment $\delta u(r) \equiv e \cdot (u(x + re) - u(x))$, where $u(x)$ is the velocity field and $e$ is a constant unit vector,

predicts a scaling property of the structure function; for $r$ in the inertial subrange,

\[ \langle (\delta u(r))^p \rangle \sim r^{\zeta_p}, \qquad \zeta_p = p/3 \]

where $\langle \cdot \rangle$ denotes the statistical mean. In reality, however, the scaling exponent $\zeta_p$ measured in experiments shows a systematic deviation from $p/3$, which is considered to be a reflection of intermittency, namely the spatial nonuniformity or multifractal property of active vortical motions in turbulence. For simplicity, let us consider the velocity field on a linear section of the turbulence field. According to the multifractal formalism, the turbulence velocity field has singularities of various strengths described by the singularity spectrum $D(h)$, which is related to the scaling exponent $\zeta_p$ through the Legendre transform, $D(h) = \inf_p (ph - \zeta_p + 1)$. This relation is often used to determine $D(h)$ from the knowledge of $\zeta_p$ (structure function method). However, this method does not necessarily work well because, for example, it does not capture the singular points of the Hölder exponent larger than 1 and it is unstable for $h < 0$.

These difficulties are not restricted to the turbulence research, but arise commonly when the structure function is employed to determine the singularity spectrum. In these problems, the CWT $T(a, b)$ provides an alternative method. An ingenious technique is to take only the modulus maxima of $T(a, b)$ (for each fixed $a$) to construct a partition function

\[ Z(a, q) = \sum_{l \in L_{\max}} \left[ \sup_{(a, b') \in l} | T(a, b') | \right]^q \]

where $q \in \mathbb{R}$, and $L_{\max}$ denotes the set of all maxima lines, each of which is a continuous curve for small value of $a$, and there exists at least one maxima line toward a singular point of the Hölder exponent $h(x_0) < N$. In the limit of $a \to 0$, defining the exponent $\tau(q)$ as $Z(a, q) \sim a^{\tau(q)}$, one can obtain the singularity spectrum through the Legendre transform:

\[ D(h) = \inf_q \left[ q \left( h + \tfrac{1}{2} \right) - \tau(q) \right] \]

This method (wavelet-transform modulus-maxima (WTMM) method) is advantageous in that it works also for singularities of $h > 1$ and $h < 0$. Several simple examples of multifractal functions have been successfully analyzed by this method. For fluid turbulence, this method gives a singularity spectrum $D(h)$ which has a peak value of 1 at $h \approx 1/3$, consistently with Kolmogorov similarity law, but has a convex shape around $h = 1/3$ suggesting a multifractal property. For a fractal signal, we note that the WTMM method enlightens the hierarchical organization of the singularities, in the branching structure of the WT skeleton defined by the maxima lines arrangement in the $(a, b)$ half-plane.

Though the above discussion also applies to the DWT, the detection of the Hölder exponent $h$ in experimental situations is usually performed by the CWT, which has no restriction on possible values of $a$, while the DWT is often employed for theoretical discussions of singularity and multifractal structure of a function.

Multiscale Analysis

Wavelet transform expands a data function in the time–frequency or the position–wavenumber space, which has twice the dimension of the original signal, and makes it easier to perform a multiscale analysis and to identify events involved in the signal. In the wavelet transform, as stated above, the time resolution is higher at higher frequency, in contrast with the windowed Fourier transform where the time and the frequency resolutions are independent of frequency. Another advantage of wavelets is the wide variety of analyzing wavelets, which enables us to optimize the wavelet according to the purpose of data analysis. Both the CWT and the DWT are available for these time–frequency or position–wavenumber analyses. However, the CWT has properties quite different from those of familiar orthonormal bases of discrete wavelets.

Multidimensional CWT

The CWT can be formulated in an abstract way. We can regard $G = \{ (a, b) \mid a\,(\neq 0),\, b \in \mathbb{R} \}$ as an affine group on $\mathbb{R}$ with the group operation of $(a, b)(a', b') = (aa', ab' + b)$ associated with the invariant measure $d\mu = da \, db / a^2$. The group $G$ has its unitary representation in the Hilbert space $H = L^2(\mathbb{R})$:

\[ (U(a, b) f)(x) = \frac{1}{\sqrt{|a|}} \, f\!\left( \frac{x - b}{a} \right) \]

and then the CWT can be constructed as a linear map $W$ from $L^2(\mathbb{R})$ to $L^2(G; da \, db / a^2)$:

\[ W : f(x) \mapsto T(a, b) = \frac{1}{\sqrt{C_\psi}} \langle U(a, b) \psi, f \rangle \]

where $\langle \, , \rangle$ is the inner product of $L^2(\mathbb{R})$ with the complex conjugate taken at the first element, and

$\psi(x)$ is a unit vector (analyzing wavelet) satisfying the abstract admissibility condition

\[ C_\psi \equiv \int_G | \langle U(a, b) \psi, \psi \rangle |^2 \, d\mu < \infty \]

This formulation is applicable also to a locally compact group $G$ and its unitary and square integrable representation in a Hilbert space $H$. Note that even the canonical coherent states are included in this framework by taking the Weyl–Heisenberg group and $L^2(\mathbb{R})$ for $G$ and $H$, respectively. This abstract formulation allows us to extend the CWT to higher-dimensional Euclidean spaces and other manifolds: for example, 2D sphere $S^2$ for geophysical application and 4D manifold of spacetime taking the Poincaré group into consideration.

In $\mathbb{R}^n$, the CWT of $f(x) \in L^2(\mathbb{R}^n)$ and its inverse transform are given by

\[ T(a, r, b) = \frac{1}{\sqrt{C_\psi}} \int_{\mathbb{R}^n} \overline{\psi^{(a,r,b)}(x)} \, f(x) \, dx \]

\[ f(x) = \frac{1}{\sqrt{C_\psi}} \int_G T(a, r, b) \, \psi^{(a,r,b)}(x) \, \frac{da \, dr \, db}{a^{n+1}} \]

where $r \in SO(n)$, $b \in \mathbb{R}^n$, $dr$ is the normalized invariant measure of $G = SO(n)$, and the wavelets are defined as $\psi^{(a,r,b)}(x) = (1/a^{n/2}) \, \psi(r^{-1}(x - b)/a)$, with the analyzing wavelet satisfying the admissibility condition

\[ C_\psi \equiv \int_{\mathbb{R}^n} \frac{|\hat{\psi}(w)|^2}{|w|^n} \, dw < \infty \]

Note that these wavelets are constructed not only by dilation and translation but also by rotation, which therefore gives the possibility for directional pattern detection in a data function. In the case of 2D sphere $S^2$, on the other hand, the dilation operation should be reinterpreted in such a way that at the North Pole, for example, it is the normal dilation in the tangent plane followed by lifting it to $S^2$ by the stereographic projection from the South Pole.

Generally, the abstract map $W$ thus defined is injective and therefore invertible on its range, but not surjective in contrast with the Fourier case. Actually in the case of 1D CWT, $T(a, b)$ is subject to an integral condition:

\[ T(a, b) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} K(a, b; a', b') \, T(a', b') \, \frac{da' \, db'}{a'^2} \]

\[ K(a, b; a', b') = \frac{1}{C_\psi} \int_{-\infty}^{\infty} \overline{\psi^{(a,b)}(x)} \, \psi^{(a',b')}(x) \, dx \]

which defines the range of the CWT, a subspace of $L^2(\mathbb{R})$. Therefore, if one wants to modify $T(a, b)$ by, for example, assigning its value as zero in some parameter region just as in a filter process, care should be taken for the resultant $T(a, b)$ to be in the image of the CWT. The reason may be understood intuitively by noticing that the wavelets $\psi^{(a,b)}(x)$ are linearly dependent on each other. The expression of a data function by a linear combination of the wavelets is therefore not unique, and thus is redundant. The CWT gives only $T(a, b)$ of the least norm in $L^2(\mathbb{R}^2; da \, db / a^2)$. In physical interpretations of the CWT, however, this nonuniqueness is often ignored.

Pattern Detection

Edge detection The edges of an object are often the most important components for pattern detection. The edge may be considered to consist of points of sharp transition of image intensity. At the edge, the modulus of the gradient of the image $f(x, y)$ is expected to take a local maximum in the 1D direction perpendicular to the edge. Therefore, the local maxima of $|\nabla f(x, y)|$ may be the indicator of the edge. However, the image textures can also give similar sharp transitions of $f(x, y)$, and one should take into account the scale dependence which distinguishes between edges and textures. One of the practically possible ways for this purpose is to use dyadic wavelets $\psi^m_j(x, y) = 2^j \psi^m(2^j x, 2^j y)$ which are generated from the two wavelets $(\psi^1, \psi^2) = (\partial \theta / \partial x, \partial \theta / \partial y)$, where $\theta$ is a localized function (multiscale edge detection method). The dyadic wavelet transform of the image $f(x, y)$

\[ T^m_j(b_1, b_2) = \langle f(x, y), \psi^m_j(x - b_1, y - b_2) \rangle, \qquad m = 1, 2 \]

defines the multiscale edges as a set of points $b = (b_1, b_2)$ where the modulus of the wavelet transform, $|(T^1_j, T^2_j)|$, takes a locally maximum value (WTMM) in a 1D neighborhood of $b$ in the direction of $(T^1_j(b), T^2_j(b))$. Scale dependence of the magnitude of the modulus maxima is related to the Hölder exponent of $f(x, y)$ similarly to the 1D case, and thus gives information to distinguish between the edges and the textures.

Inversely, the information of WTMM $b_{j,p} = \{ (b_{1,j,p}, b_{2,j,p}) \}$ of multiscale edges can be used for an approximate reconstruction of the original image, although the perfect reconstruction cannot be expected because of the noncompleteness of the modulus maxima wavelets. Assuming that $\{ \psi^1_{j,p}, \psi^2_{j,p} \} = \{ \psi^1_j(x - b_{j,p}), \psi^2_j(x - b_{j,p}) \}$ constitutes a frame of the linear closed space generated by
424 Wavelets: Applications

$\{\psi^1_{j,p}, \psi^2_{j,p}\}$, an approximate image $\hat{f}$ is obtained by inverting the relation

$$L\hat{f} \equiv \sum_m \sum_{j,p} \langle \hat{f}, \psi^m_{j,p} \rangle\, \psi^m_{j,p} = \sum_m \sum_{j,p} T_j^m(b_{j,p})\, \psi^m_{j,p}$$

using, for example, a conjugate gradient algorithm, where a fast calculation is possible with a filter bank algorithm for the dyadic wavelet ("algorithm à trous"). This algorithm gives only the solution of minimum norm among all possible solutions, but it is often satisfactory for practical purposes and thus is applicable also to data compression.

Directional detection. For oriented features such as segments or edges in images to be detected, a directionally selective wavelet for the CWT is desired. A useful wavelet for this purpose is one that has the effective support of its Fourier transform in a convex cone with apex at the origin in wave number space. A typical example of the directional wavelet may be the 2D Morlet wavelet:

$$\psi(x) = \exp(i k_0 \cdot x)\, \exp(-|Ax|^2)$$

where $k_0$ is the center of the support in Fourier space, and $A$ is a $2 \times 2$ matrix $\mathrm{diag}[\epsilon^{-1/2}, 1]$ $(\epsilon \ge 1)$, where the admissibility condition for the CWT is approximately satisfied for $|k_0| \ge 5$. Another example is the Cauchy wavelet, which has its support strictly in a convex cone in wave number space.

These wavelets have directional selectivity, with preference for a slender object in a specific direction. One of their applications is the analysis of the velocity field of fluid motion from experimental data, where many tiny plastic balls distributed in the fluid give a lot of line segments in a picture taken with a short exposure. The directional wavelet analysis of the picture classifies the line segments according to their directions, indicating the directions of fluid velocity. Another example may be a wave-field analysis where many waves in different directions are superimposed; the directional wavelets allow one to decompose the wave field into the component waves. Directional wavelets have also been applied successfully to detect symmetry of objects such as crystals or quasicrystals.

Denoising and separation of signals. The wavelet frame as well as the CWT give a redundant representation of a data function. If, instead of the original data, the redundant expression is transmitted, the redundancy is used to reduce the noise included in the received data because the redundancy requires the data to belong to a subspace, and the projection of the received data to the subspace reduces the noise component orthogonal to it. More specifically, the wavelet frame gives a representation of a data function as $f(t) = \sum_{j,k} \alpha_{j,k}\, \psi_{j,k}$, where the expansion coefficients $\alpha_{j,k} = \langle \psi_{j,k}, f(x) \rangle$ satisfy the defining equation of the subspace

$$\alpha_{j',k'} = \sum_{j,k} \alpha_{j,k}\, \langle \psi_{j',k'}, \psi_{j,k} \rangle$$

If the frame coefficients are transmitted, the projection operator $P$, which is defined on the right-hand side of the above equation, reduces the noise in the received coefficients $\alpha_{j,k}$ contaminated during the transmission.

However, this method is not applicable if the transmitted signal is not redundant. Then some a priori criterion is necessary to discriminate between signal and noise. Various criteria have been proposed in different fields. If the signal and the noise, or plural signals, have different power-law forms of spectra, then their discrimination may be possible by the DWT in the higher-frequency region where the difference in the magnitude of the coefficients is significant. In this approach, wavelets of Meyer type, that is, orthogonal wavelets with a compact support in Fourier space, may be preferable because the wavelets of different scales are separated, at least to some extent, in Fourier space.

In fluid dynamics, the vorticity field of 2D turbulence is found to be decomposed into coherent and incoherent vorticity fields, according to whether the CWT is larger than a threshold value or not, respectively. These two fields give different Fourier spectra of the velocity field ($k^{-5}$ for the coherent part and $k^{-3}$ for the incoherent part), showing that the coherent structures are responsible for the deviation from the $k^{-3}$ predicted by the classical enstrophy cascade theory. In an astronomical application, on the other hand, the data processing is performed by a more sophisticated method taking into account the interscale relation in the wavelet transform, because an astronomical image contains various kinds of objects, including stars, double-stars, galaxies, nebulas, and clusters. In a medical image, however, contrast analysis is indispensable for diagnostic imaging to get a clear detailed picture of organic structure. A scale-dependent local contrast is defined as the ratio of the CWT to that given by an analyzing wavelet with a larger support. A multiplicative scheme to improve the contrast is constructed by using the local contrast.

Signal Compression

Signal compression is quite an important technology in digital communication. Speech, audio, image, and digital video are all important fields of signal

compression, and plenty of compression methods have been put to practical use, but we mention here only a few.

The MRA for orthogonal wavelets gives a successive procedure to decompose a subspace of $L^2(\mathbb{R})$ into a direct sum of two subspaces corresponding to higher- and lower-frequency parts, only the latter of which is decomposed again into its higher- and lower-frequency parts. Algebraically, this procedure was already known before the discovery of MRA in filter theory in electrical engineering, where a discretely sampled signal is convolved with a filter series to give, for example, a high-pass-filtered or low-pass-filtered series. An appropriately designed pair of high-pass and low-pass filters followed by downsampling yields two new series corresponding to the higher- and lower-frequency parts, respectively, which are then reversible by another two reconstruction filters with upsampling. These four filters, which are often employed in the widely used technique of "sub-band coding," then constitute a perfect reconstruction filter bank. Under some conditions, successive applications of this decomposition process to the series of lower-frequency parts, which is equivalent to the nesting structure of MRA, have been used for data compression (quadrature mirror filter). A famous example is the FBI data compression system for fingerprints, consisting of wavelet coding with scalar quantization.

In MRA, however, it is only the lower-frequency parts that are successively decomposed. If both the lower- and the higher-frequency parts are repeatedly decomposed by the decomposition filters, then the successive convolution processes correspond to a decomposition of the data function by a set of wavelet-like functions, called a "wavelet packet," where there are choices whether to decompose the higher- and/or the lower-frequency parts. The best wavelet packet, in the sense of the entropy, for example, within a specified number of decompositions, often provides a powerful tool for data compression in several areas, including speech analysis and image analysis. We also note that from the viewpoint of the best basis which minimizes the statistical mean square error of the thresholded coefficients, an orthonormal wavelet basis gives a good concentration of the energy if the original signal is a piecewise smooth function with superimposed white noise, which is thus efficiently removed by thresholding the coefficients. The efficiency of a wavelet expansion of a signal is sometimes evaluated with the entropy of the "probability" defined as $|\alpha_{j,k}|^2 / \|f\|^2$. A better wavelet can be selected by reducing the entropy, practically from among some set of wavelets, and its restricted expansion coefficients give a compressed signal. One of the systematic methods to generate such a suitable basis is also to employ the wavelet packets.

Numerical Calculation

Application of the wavelet transform, especially of the DWT, to numerical solvers for differential equations (DEs) has long been studied. At first sight, wavelets appear to give a good DE solver because the wavelet expansion is generally quite efficient compared to Fourier series due to its spatial localization. But its implementation in an efficient computer code is not so straightforward; research is still continuing for concrete problems. Application of the CWT to spectral methods for partial differential equations (PDEs) has been studied extensively. There is no wavelet which diagonalizes the differential operator $\partial/\partial x$; therefore, an efficient numerical method is necessary for derivatives of wavelets. Products of wavelets also yield another numerical problem. MRA brings about mesh points which are adaptive to some extent, but the finite element method still gives more flexible mesh points.

For some scaling-invariant differential or integral operators, including $\partial^2/\partial x^2$, Abel transformations, and the Riesz potential, adaptive biorthogonal wavelets can be provided with block-diagonal Galerkin representations, which have been applied to data processing. Generally, simultaneous localization of wavelets, both in space and in scale, leads to a sparse Galerkin representation for many pseudodifferential operators and their inverses. A thresholding technique with the DWT has been introduced to coherent vortex simulation of the 2D Navier–Stokes equations, to reduce the relevant wavelet coefficients. Another promising application of wavelets occurs as a preprocessor for an iterative Poisson solver, where a wavelet-based preconditioning leads to a matrix with a bounded condition number.

Other Wavelets and Generalizations

Several new types of wavelets have been proposed: the "coiflet," whose scaling function has vanishing moments giving expansion coefficients approximately equal to values of the data functions, and the "symlet," which is an orthonormal wavelet with a nearly symmetric profile. Multiwavelets are wavelets which give a complete orthonormal system in $L^2$ space. In 2D or multidimensional applications of the DWT, separable orthonormal wavelets consisting of tensor products of 1D orthonormal wavelets are frequently used, while nonseparable orthonormal wavelets are also available. Another generalization
426 Wavelets: Mathematical Theory

of wavelets is the Malvar basis, which is also a generalization of the local Fourier basis and gives a perfect reconstruction. A new direction is the second-generation wavelets, which are constructed by the lifting scheme and are free from the regular dyadic procedure, and thus applicable to compact regions such as $S^2$ and a finite interval.

See also: Fractal Dimensions in Dynamics; Image Processing: Mathematics; Intermittency in Turbulence; Wavelets: Application to Turbulence; Wavelets: Mathematical Theory.

Further Reading

Benedetto JJ and Frazier W (eds.) (1994) Wavelets: Mathematics and Applications. Boca Raton, FL: CRC Press.
van den Berg JC (ed.) (1999) Wavelets in Physics. Cambridge: Cambridge University Press.
Daubechies I (1992) Ten Lectures on Wavelets, SIAM, CBMS61, Philadelphia.
Mallat S (1998) A Wavelet Tour of Signal Processing. San Diego: Academic Press.
Strang G and Nguyen T (1997) Wavelets and Filter Banks. Wellesley: Wellesley-Cambridge Press.

Wavelets: Mathematical Theory


K Schneider, Université de Provence, Marseille, France
M Farge, Ecole Normale Supérieure, Paris, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The wavelet transform unfolds functions into time (or space) and scale, and possibly directions. The continuous wavelet transform was discovered by Alex Grossmann and Jean Morlet, who published the first paper on wavelets in 1984. This mathematical technique, based on group theory and square-integrable representations, allows us to decompose a signal, or a field, into both space and scale, and possibly directions. The orthogonal wavelet transform was discovered by Lemarié and Meyer (1986). Then, Daubechies (1988) found orthogonal bases made of compactly supported wavelets, and Mallat (1989) designed the fast wavelet transform (FWT) algorithm. Further developments were made in 1991 by Raffy Coifman, Yves Meyer, and Victor Wickerhauser, who introduced wavelet packets and applied them to data compression. The development of wavelets has been interdisciplinary, with contributions coming from very different fields such as engineering (sub-band coding, quadrature mirror filters, time–frequency analysis), theoretical physics (coherent states of affine groups in quantum mechanics), and mathematics (Calderon–Zygmund operators, characterization of function spaces, harmonic analysis). Many reference textbooks are available; some that we recommend are listed in the "Further reading" section. Meanwhile, a large spectrum of applications has grown and is still developing, ranging from signal analysis and image processing via numerical analysis and turbulence modeling to data compression.

In this article, we will first define the continuous wavelet transform and then the orthogonal wavelet transform based on a multiresolution analysis. Properties of both transforms will be discussed and illustrated by examples. For a general introduction to wavelets, see Wavelets: Applications.

Continuous Wavelet Transform

Let us consider the Hilbert space of square-integrable functions $L^2(\mathbb{R}) = \{f : \|f\|_2 < \infty\}$, equipped with the scalar product $\langle f, g \rangle = \int_{\mathbb{R}} f(x)\, g^\star(x)\, dx$ ($\star$ denotes the complex conjugate in the case of complex-valued functions) and where the norm is defined by $\|f\|_2 = \langle f, f \rangle^{1/2}$.

Analyzing Wavelet

The starting point for the wavelet transform is to choose a real- or complex-valued function $\psi \in L^2(\mathbb{R})$, called the "mother wavelet," which fulfills the admissibility condition,

$$C_\psi = \int_0^\infty \left|\hat{\psi}(k)\right|^2 \frac{dk}{|k|} < \infty \qquad [1]$$

where

$$\hat{\psi}(k) = \int_{-\infty}^{\infty} \psi(x)\, e^{-\iota 2\pi k x}\, dx \qquad [2]$$

denotes the Fourier transform, with $\iota = \sqrt{-1}$ and $k$ the wave number. If $\psi$ is integrable, that is, $\psi \in L^1(\mathbb{R})$, this implies that $\psi$ has zero mean,

$$\int_{-\infty}^{\infty} \psi(x)\, dx = 0 \quad \text{or} \quad \hat{\psi}(0) = 0 \qquad [3]$$

In practice, however, one also requires the wavelet to be well localized in both physical and Fourier

$$\int_{-\infty}^{\infty} x^m\, \psi(x)\, dx = 0 \quad \text{for } m = 0, \ldots, M - 1 \qquad [4]$$

that is, monomials up to degree $M - 1$ are exactly reproduced. In Fourier space, this property is equivalent to

$$\left.\frac{d^m}{dk^m}\hat{\psi}(k)\right|_{k=0} = 0 \quad \text{for } m = 0, \ldots, M - 1 \qquad [5]$$

therefore, the Fourier transform of $\psi$ decays smoothly at $k = 0$.

Analysis

From the mother wavelet $\psi$, we generate a family of continuously translated and dilated wavelets,

$$\psi_{a,b}(x) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{x - b}{a}\right) \quad \text{for } a > 0 \text{ and } b \in \mathbb{R} \qquad [6]$$

where $a$ denotes the dilation parameter, corresponding to the width of the wavelet support, and $b$ the translation parameter, corresponding to the position of the wavelet. The wavelets are normalized in energy norm, that is, $\|\psi_{a,b}\|_2 = 1$.

In Fourier space, eqn [6] reads

$$\hat{\psi}_{a,b}(k) = \sqrt{a}\, \hat{\psi}(ak)\, e^{-\iota 2\pi k b} \qquad [7]$$

where the contraction with $1/a$ in [6] is reflected in a dilation by $a$ in [7] and the translation by $b$ implies a rotation in the complex plane.

The continuous wavelet transform of a function $f$ is then defined as the convolution of $f$ with the wavelet family $\psi_{a,b}$:

$$\tilde{f}(a, b) = \int_{-\infty}^{\infty} f(x)\, \psi^\star_{a,b}(x)\, dx \qquad [8]$$

where $\psi^\star_{a,b}$ denotes, in the case of complex-valued wavelets, the complex conjugate.

Using Parseval's identity, we get

$$\tilde{f}(a, b) = \int_{-\infty}^{\infty} \hat{f}(k)\, \hat{\psi}^\star_{a,b}(k)\, dk \qquad [9]$$

and the wavelet transform could be interpreted as a frequency decomposition using bandpass filters $\hat{\psi}_{a,b}$ centered at frequencies $k = k_\psi / a$. The wave number $k_\psi$ denotes the barycenter of the wavelet support in Fourier space

$$k_\psi = \frac{\int_0^\infty k\, |\hat{\psi}(k)|\, dk}{\int_0^\infty |\hat{\psi}(k)|\, dk} \qquad [10]$$

Note that these filters have a variable width $\Delta k / k$; therefore, when the wave number increases, the bandwidth becomes wider.

Synthesis

The admissibility condition [1] implies the existence of a finite energy reproducing kernel, which is a necessary condition for being able to reconstruct the function $f$ from its wavelet coefficients $\tilde{f}$. One then recovers

$$f(x) = \frac{1}{C_\psi} \int_0^\infty \int_{-\infty}^{\infty} \tilde{f}(a, b)\, \psi_{a,b}(x)\, \frac{da\, db}{a^2} \qquad [11]$$

which is the inverse wavelet transform.

The wavelet transform is an isometry and one has Parseval's identity. Therefore, the wavelet transform conserves the inner product and we obtain

$$\langle f, g \rangle = \int_{-\infty}^{\infty} f(x)\, g^\star(x)\, dx = \frac{1}{C_\psi} \int_0^\infty \int_{-\infty}^{\infty} \tilde{f}(a, b)\, \tilde{g}^\star(a, b)\, \frac{da\, db}{a^2} \qquad [12]$$

As a consequence, the total energy $E$ of a signal can be calculated either in physical space or in wavelet space, such as

$$E = \int_{-\infty}^{\infty} |f(x)|^2\, dx = \frac{1}{C_\psi} \int_0^\infty \int_{-\infty}^{\infty} |\tilde{f}(a, b)|^2\, \frac{da\, db}{a^2} \qquad [13]$$

This formula is also the starting point for the definition of wavelet spectra and scalogram (see Wavelets: Application to Turbulence).

Examples

In the following, we apply the continuous wavelet transform to different academic signals using the Morlet wavelet. The Morlet wavelet is complex valued, and consists of a modulated Gaussian with width $k_0/\pi$:

$$\psi(x) = \left(e^{\iota 2\pi x} - e^{-k_0^2/2}\right) e^{-2\pi^2 x^2 / k_0^2} \qquad [14]$$

The envelope factor $k_0$ controls the number of oscillations in the wave packet; typically, $k_0 = 5$ is used. The correction factor $e^{-k_0^2/2}$, to ensure its vanishing mean, is very small and often neglected. The Fourier transform is

$$\hat{\psi}(k) = \frac{k_0}{2\sqrt{\pi}}\, e^{-(k_0^2/2)(1 + k^2)} \left(e^{k_0^2 k} - 1\right) \qquad [15]$$

Figure 1 shows wavelet analyses of a cosine, two sines, a Dirac, and a characteristic function. Below

 
the four signals we plot the modulus and the phase of the corresponding wavelet coefficients.

Higher Dimensions

The continuous wavelet transform can be extended to higher dimensions in $L^2(\mathbb{R}^n)$ in different ways. Either we define spherically symmetric wavelets by setting $\psi(x) = \psi_{1d}(|x|)$ for $x \in \mathbb{R}^n$ or we introduce, in addition to dilations $a \in \mathbb{R}^+$ and translations $b \in \mathbb{R}^n$, also rotations to define wavelets with a directional sensitivity. In the two-dimensional case, we obtain for example,

$$\psi_{a,b,\theta}(x) = \frac{1}{a}\, \psi\!\left(R_\theta^{-1}\, \frac{x - b}{a}\right) \qquad [16]$$

where $a \in \mathbb{R}^+$, $b \in \mathbb{R}^2$, and where $R_\theta$ is the rotation matrix

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad [17]$$

The analysis formula [8] then becomes

$$\tilde{f}(a, b, \theta) = \int_{\mathbb{R}^2} f(x)\, \psi^\star_{a,b,\theta}(x)\, dx \qquad [18]$$

and for the corresponding inverse wavelet transform [11] we obtain

$$f(x) = \frac{1}{C_\psi} \int_0^\infty \int_{\mathbb{R}^2} \int_0^{2\pi} \tilde{f}(a, b, \theta)\, \psi_{a,b,\theta}(x)\, \frac{da\, db\, d\theta}{a^3} \qquad [19]$$

Similar constructions can be made in dimensions larger than 2 using $n - 1$ angles of rotation.
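As a short worked check (added here for illustration, using only the definition in eqn [16]), the prefactor $1/a$ is what keeps the two-dimensional wavelets normalized in energy norm, in the same way as the factor $1/\sqrt{a}$ does in one dimension. The substitution variable $y$ is introduced only for this derivation.

```latex
\begin{aligned}
\|\psi_{a,b,\theta}\|_2^2
  &= \int_{\mathbb{R}^2} \frac{1}{a^2}
     \left|\psi\!\left(R_\theta^{-1}\,\frac{x-b}{a}\right)\right|^2 dx \\
  &= \int_{\mathbb{R}^2} \frac{1}{a^2}\,|\psi(y)|^2\, a^2\,|\det R_\theta|\, dy
     \qquad \bigl(x = b + a R_\theta\, y\bigr) \\
  &= \|\psi\|_2^2 , \qquad \text{since } \det R_\theta = 1 .
\end{aligned}
```

By the same change of variables, the corresponding prefactor in $n$ dimensions is $a^{-n/2}$.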

[Figure 1 panels: cosine, two sines, Dirac, and characteristic function; below each signal, the modulus (left) and the phase (right) of its wavelet coefficients.]

Figure 1 Examples of a one-dimensional continuous wavelet analysis using the complex-valued Morlet wavelet. Each subfigure
shows on the top the function to be analyzed and below (left) the modulus of its wavelet coefficients and below (right) the phase of its
wavelet coefficients.
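The analysis in Fourier space, eqns [7] and [9], can be sketched numerically as follows. This is a minimal sketch, assuming a uniformly sampled signal that is treated as periodic and NumPy's FFT conventions; the function names `morlet_hat` and `cwt_morlet`, the cosine test signal, and the grid of scales are illustrative choices, not part of the article. The Morlet transform of eqn [15] is written in an algebraically equivalent form that avoids overflow of $e^{k_0^2 k}$.

```python
import numpy as np

def morlet_hat(k, k0=5.0):
    """Fourier transform of the Morlet wavelet (eqn [15]), rewritten as a
    difference of two Gaussians so that exp(k0^2 k) is never evaluated."""
    c = k0 / (2.0 * np.sqrt(np.pi))          # constant prefactor as in eqn [15]
    return c * (np.exp(-0.5 * k0**2 * (k - 1.0)**2)
                - np.exp(-0.5 * k0**2 * (k**2 + 1.0)))

def cwt_morlet(f, dx, scales, k0=5.0):
    """Approximate the coefficients of eqn [9] at positions b = n*dx.
    Assumption: f is uniformly sampled and treated as periodic."""
    k = np.fft.fftfreq(len(f), d=dx)         # wave numbers, cycles per unit length
    f_hat = np.fft.fft(f)
    rows = [np.fft.ifft(f_hat * np.sqrt(a) * np.conj(morlet_hat(a * k, k0)))
            for a in scales]                 # one bandpass filter per scale a
    return np.array(rows)                    # shape (len(scales), len(f))

# Illustrative usage: a cosine with wave number 8 on the unit interval.
x = np.arange(1024) / 1024.0
f = np.cos(2.0 * np.pi * 8.0 * x)
scales = 1.0 / np.linspace(2.0, 32.0, 61)    # scales a = 1/k for k = 2, 2.5, ..., 32
W = cwt_morlet(f, dx=1.0 / 1024.0, scales=scales)
modulus, phase = np.abs(W), np.angle(W)      # the two quantities shown in Figure 1
```

For this test signal the modulus is largest near the scale $a = 1/8$, in line with the interpretation of the transform as a bank of bandpass filters centered at $k = k_\psi / a$.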

Discrete Wavelets

Frames

It is possible to obtain a discrete set of quasiorthogonal wavelets by sampling the scale and position axes $a$, $b$. For the scale $a$ we use a logarithmic discretization: $a$ is replaced by $a_j = a_0^{-j}$, where $a_0$ is the sampling rate of the $\log a$ axis ($a_0 = \Delta(\log a)$) and where $j \in \mathbb{Z}$ is the scale index. The position $b$ is discretized linearly: $b$ is replaced by $x_{ji} = i\, b_0\, a_0^{-j}$, where $b_0$ is the sampling rate of the position axis at the largest scale and where $i \in \mathbb{Z}$ is the position index. Note that the sampling rate of the position varies with scale, that is, for finer scales (increasing $j$ and hence decreasing $a_j$), the sampling rate increases. Accordingly, we obtain the discrete wavelets (cf. Figure 2)

$$\psi_{ji}(x') = a_j^{-1/2}\, \psi\!\left(\frac{x' - x_{ji}}{a_j}\right) \qquad [20]$$

and the corresponding discrete decomposition formula is

$$\tilde{f}_{ji} = \langle \psi_{ji}, f \rangle = \int_{-\infty}^{\infty} f(x')\, \psi^\star_{ji}(x')\, dx' \qquad [21]$$

Furthermore, the wavelet coefficients satisfy the following estimate:

$$A \|f\|_2^2 \le \sum_{j,i} |\tilde{f}_{ji}|^2 \le B \|f\|_2^2 \qquad [22]$$

with frame bounds $B \ge A > 0$. In the case $A = B$ we have a tight frame.

The discrete reconstruction formula is

$$f(x) = C \sum_{j=-\infty}^{\infty} \sum_{i=-\infty}^{\infty} \tilde{f}_{ji}\, \psi_{ji}(x) + R(x) \qquad [23]$$

where $C$ is a constant and $R(x)$ is a residual, both depending on the choice of the wavelet and the sampling of the scale and position axes. For the particular choice $a_0 = 2$ (which corresponds to a scale sampling by octaves) and $b_0 = 1$, we have the dyadic sampling, for which there exist special wavelets $\psi_{ji}$ that form an orthonormal basis of $L^2(\mathbb{R})$, that is, such that

$$\langle \psi_{ji}, \psi_{j'i'} \rangle = \delta_{jj'}\, \delta_{ii'} \qquad [24]$$

where $\delta$ denotes the Kronecker symbol. This means that the wavelets $\psi_{ji}$ are orthogonal with respect to their translates by discrete steps $2^{-j} i$ and their dilates by discrete steps $2^{-j}$ corresponding to octaves. In this case, the reconstruction formula is exact with $C = 1$ and $R = 0$. Note that the discrete wavelet transform has lost the invariance by translation and dilation of the continuous one.

Orthogonal Wavelets and Multiresolution Analysis

The construction of orthogonal wavelet bases and the associated fast numerical algorithm is based on the mathematical concept of multiresolution analysis (MRA). The underlying idea is to consider approximations $f_j$ of the function $f$ at different scales $j$. The amount of information needed to go from a coarse approximation $f_j$ to a finer resolution approximation $f_{j+1}$ is then described using orthogonal wavelets. The orthogonal wavelet analysis can thus be interpreted as decomposing the function into approximations of the function at coarser and coarser scales (i.e., for decreasing $j$), where the differences between the approximations are encoded using wavelets.

The definition of the MRA was introduced by Stéphane Mallat in 1988 (Mallat 1989). This technique constitutes a mathematical framework of orthogonal wavelets and the related FWT.

A one-dimensional orthogonal MRA of $L^2(\mathbb{R})$ is defined as a sequence of successive approximation spaces $V_j$, $j \in \mathbb{Z}$, which are closed imbedded subspaces of $L^2(\mathbb{R})$. They verify the following conditions:

$$V_j \subset V_{j+1} \quad \forall j \in \mathbb{Z} \qquad [25]$$

$$\overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R}) \qquad [26]$$

$$\bigcap_{j \in \mathbb{Z}} V_j = \{0\} \qquad [27]$$

$$f(x) \in V_j \Leftrightarrow f(2x) \in V_{j+1} \qquad [28]$$

Figure 2 Orthogonal quintic spline wavelets $\psi_{j,i}(x) = 2^{j/2}\, \psi(2^j x - i)$ at different scales and positions: (a) $\psi_{5,6}(x)$, $\psi_{6,32}(x)$, $\psi_{7,108}(x)$, and (b) corresponding wavelet coefficients.
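As a concrete illustration of conditions [25] to [28] (a standard example added here, not taken from the text), one may take $V_j$ to be the piecewise constant functions on dyadic intervals of length $2^{-j}$, which gives the Haar multiresolution analysis. The indicator notation $\mathbf{1}_{[0,1)}$ is introduced only for this sketch.

```latex
\begin{aligned}
V_j &= \bigl\{ f \in L^2(\mathbb{R}) :
        f \text{ is constant on } [\,i\,2^{-j}, (i+1)\,2^{-j}) \ \ \forall i \in \mathbb{Z} \bigr\} \\[4pt]
[25]:&\quad \text{a function constant on intervals of length } 2^{-j}
        \text{ is also constant on those of length } 2^{-(j+1)},
        \text{ so } V_j \subset V_{j+1}. \\
[26]:&\quad \text{step functions on dyadic intervals are dense in } L^2(\mathbb{R}). \\
[27]:&\quad \text{a function in every } V_j \text{ is constant on } [0,\infty)
        \text{ and on } (-\infty,0); \text{ square integrability forces it to vanish.} \\
[28]:&\quad f \text{ is constant on } [\,i\,2^{-j}, (i+1)\,2^{-j})
        \iff f(2\,\cdot) \text{ is constant on } [\,i\,2^{-(j+1)}, (i+1)\,2^{-(j+1)}).
\end{aligned}
```

In this example the space $V_0$ is generated by the integer translates of $\mathbf{1}_{[0,1)}$.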

A scaling function $\phi(x)$ is required to exist. Its translates generate a basis in each $V_j$, that is,

$$V_j = \overline{\mathrm{span}}\{\phi_{ji}\}_{i \in \mathbb{Z}} \qquad [29]$$

where

$$\phi_{ji}(x) = 2^{j/2}\, \phi(2^j x - i), \quad j, i \in \mathbb{Z} \qquad [30]$$

At a given scale $j$, this basis is orthonormal with respect to its translates by steps $i/2^j$ but not to its dilates,

$$\langle \phi_{ji}, \phi_{jk} \rangle = \delta_{ik} \qquad [31]$$

The nestedness of the approximation spaces [28] generated by the scaling function $\phi$ implies that it satisfies a refinement equation:

$$\phi_{j-1,i}(x) = \sum_{n=-\infty}^{\infty} h_{n-2i}\, \phi_{jn}(x) \qquad [32]$$

with the filter coefficients $h_n = \langle \phi_{jn}, \phi_{j-1,0} \rangle$, which determine the scaling function completely. In general, only the filter coefficients $h_n$ are known and no analytical expression of $\phi$ is given. Equation [32] implies that the approximation of a function at coarser scale can be described by linear combinations of the same function at finer scales.

The orthogonal projection of a function $f \in L^2(\mathbb{R})$ on $V_J$ is defined as

$$P_{V_J} : f \to P_{V_J} f = f_J \qquad [33]$$

with

$$f_J(x) = \sum_{k \in \mathbb{Z}} \langle f, \phi_{Jk} \rangle\, \phi_{Jk}(x) \qquad [34]$$

This coarse graining at a given scale $J$ is done by filtering the function with the scaling function $\phi$. As a filter, the scaling function $\phi$ does not have vanishing mean but is normalized so that $\int_{-\infty}^{\infty} \phi(x)\, dx = 1$.

As $V_{J-1}$ is included in $V_J$, we can define its orthogonal complement space in $V_J$:

$$V_J = V_{J-1} \oplus W_{J-1} \qquad [35]$$

Correspondingly, the approximation of the function $f$ at scale $2^{-J}$, belonging to $V_J$, can be decomposed as a sum of orthogonal projections on $V_{J-1}$ and $W_{J-1}$, such that

$$P_{V_J} f = P_{V_{J-1}} f + P_{W_{J-1}} f \qquad [36]$$

Based on the scaling function $\phi$, one can construct a function $\psi$, the so-called mother wavelet, given by the relation

$$\psi_{j-1,i}(x) = \sum_{n \in \mathbb{Z}} g_{n-2i}\, \phi_{jn}(x) \qquad [37]$$

with $g_n = \langle \phi_{jn}, \psi_{j-1,0} \rangle$, and where $\psi_{ji}(x) = 2^{j/2}\, \psi(2^j x - i)$, $j, i \in \mathbb{Z}$ (cf. Figure 2). The filter coefficients $g_n$ can be computed from the filter coefficients $h_n$ using the relation

$$g_n = (-1)^{1-n}\, h_{1-n} \qquad [38]$$

The translates and dilates of the wavelet constitute orthonormal bases of the spaces $W_j$,

$$W_j = \overline{\mathrm{span}}\{\psi_{ji}\}_{i \in \mathbb{Z}} \qquad [39]$$

As in the continuous case, the wavelets have vanishing mean, and also possibly vanishing higher-order moments; therefore,

$$\int_{-\infty}^{\infty} x^m\, \psi(x)\, dx = 0 \quad \text{for } m = 0, \ldots, M - 1 \qquad [40]$$

Let us now consider approximations of a function $f \in L^2(\mathbb{R})$ at two different scales $j$:

at scale $j$

$$f_j(x) = \sum_{i=-\infty}^{\infty} \overline{f}_{ji}\, \phi_{ji}(x) \qquad [41]$$

at scale $j - 1$

$$f_{j-1}(x) = \sum_{i=-\infty}^{\infty} \overline{f}_{j-1,i}\, \phi_{j-1,i}(x) \qquad [42]$$

with the scaling coefficients

$$\overline{f}_{ji} = \langle f, \phi_{ji} \rangle \qquad [43]$$

which correspond to local averages of the function $f$ at position $i\, 2^{-j}$ and at scale $2^{-j}$.

The difference between the two approximations is encoded by the wavelets

$$f_j(x) - f_{j-1}(x) = \sum_{i=-\infty}^{\infty} \tilde{f}_{j-1,i}\, \psi_{j-1,i}(x) \qquad [44]$$

with the wavelet coefficients

$$\tilde{f}_{ji} = \langle f, \psi_{ji} \rangle \qquad [45]$$

which correspond to local differences of the function at position $(2i + 1)\, 2^{-(j+1)}$ between approximations at scales $2^{-j}$ and $2^{-(j+1)}$.

Iterating the two-scale decomposition [44], any function $f \in L^2(\mathbb{R})$ can be expressed as a sum of a coarse-scale approximation at a reference scale $j_0$ that we set to 0 here, and their successive

differences. These details are needed to go from one scale $j$ to the next finer scale $j + 1$ for $j = 0, \ldots, J - 1$,

$$f(x) = \sum_{i=-\infty}^{\infty} \overline{f}_{0,i}\, \phi_{0,i}(x) + \sum_{j=0}^{\infty} \sum_{i=-\infty}^{\infty} \tilde{f}_{ji}\, \psi_{ji}(x) \qquad [46]$$

For numerical applications, the sums in eqn [46] have to be truncated in both scale $j$ and position $i$. The truncation in scale corresponds to a limitation of $f$ to a given finest scale $J$, which is in practice imposed by the available sampling rate. Due to the finite length of the available data, the sum over $i$ also becomes finite. The decomposition [46] is orthogonal, as, by construction,

$$\langle \psi_{ji}, \psi_{j'i'} \rangle = \delta_{jj'}\, \delta_{ii'} \qquad [47]$$

$$\langle \psi_{ji}, \phi_{j'i'} \rangle = 0 \quad \text{for } j \ge j' \qquad [48]$$

in addition to [31].

Fast Wavelet Transform

Starting with a function $f \in L^2(\mathbb{R})$ given at the finest resolution $2^{-J}$ (i.e., we know $f_J \in V_J$ and hence the coefficients $\overline{f}_{Ji}$ for $i \in \mathbb{Z}$), the FWT computes its wavelet coefficients $\tilde{f}_{ji}$ by decomposing successively each approximation $f_J$ into a coarser scale approximation $f_{J-1}$, plus the corresponding details which are encoded by the wavelet coefficients. The algorithm uses a cascade of discrete convolutions with the low-pass filter $h_n$ and the bandpass filter $g_n$, followed by downsampling, in which only one coefficient out of two is retained. The direct wavelet transform algorithm is

initialization
given $f \in L^2(\mathbb{R})$ and $\overline{f}_{Ji} = f(i / 2^J)$ for $i \in \mathbb{Z}$

decomposition
for $j = J$ to 1, step $-1$, do

$$\overline{f}_{j-1,i} = \sum_{n \in \mathbb{Z}} h_{n-2i}\, \overline{f}_{jn} \qquad [49]$$

$$\tilde{f}_{j-1,i} = \sum_{n \in \mathbb{Z}} g_{n-2i}\, \overline{f}_{jn} \qquad [50]$$

The inverse wavelet transform is based on successive reconstructions of fine-scale approximations $f_j$ from coarser scale approximations $f_{j-1}$, plus the differences between approximations at scale $j - 1$ and the finer scale $j$ which are encoded by $\tilde{f}_{j-1,i}$. The algorithm uses a cascade of discrete convolutions with the filters $h_n$ and $g_n$, preceded by upsampling which adds zeros in between two successive coefficients.

reconstruction
for $j = 1$ to $J$, step 1, do

$$\overline{f}_{ji} = \sum_{n=-\infty}^{\infty} h_{i-2n}\, \overline{f}_{j-1,n} + \sum_{n=-\infty}^{\infty} g_{i-2n}\, \tilde{f}_{j-1,n} \qquad [51]$$

The FWT was introduced by Stéphane Mallat in 1989. If the scaling functions (and wavelets) are compactly supported, the filters $h_n$ and $g_n$ have only a finite number of nonvanishing coefficients. In this case, the numerical complexity of the FWT is $O(N)$, where $N$ denotes the number of samples.

Choice of Wavelets

Orthogonal wavelets are typically defined by their filter coefficients $h_n$, since in general no analytic expression for $\psi$ is available. In the following, we give the filter coefficients $h_n$ for some typical orthogonal wavelets. The filter coefficients $g_n$ can be obtained using the quadrature relation between the two filters [38].

Haar D1 (one vanishing moment):
$h_0 = 1/\sqrt{2}$
$h_1 = 1/\sqrt{2}$

Daubechies D2 (two vanishing moments):
$h_0 = 0.482\,962\,913\,145$
$h_1 = 0.836\,516\,303\,736$
$h_2 = 0.224\,143\,868\,042$
$h_3 = -0.129\,409\,522\,551$

Daubechies D3 (three vanishing moments):
$h_0 = 0.332\,670\,552\,950$
$h_1 = 0.806\,891\,509\,311$
$h_2 = 0.459\,877\,502\,118$
$h_3 = -0.135\,011\,020\,010$
$h_4 = -0.085\,441\,273\,882$
$h_5 = 0.035\,226\,291\,882$

Coiflets C12 (four vanishing moments): the wavelets and the corresponding scaling function are shown in Figure 3.

Remarks. The construction of orthogonal wavelets in $L^2(\mathbb{R})$ can be modified to obtain wavelets on the interval, that is, in $L^2([0, 1])$. Therewith, boundary wavelets are introduced, while in the interior of the interval the wavelets are not modified.
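The decomposition and reconstruction steps of the FWT (eqns [49] to [51]), together with the quadrature relation [38] and the Haar and Daubechies D2 filters listed above, can be sketched as follows. This is a minimal sketch rather than a reference implementation: it assumes a finite signal whose length is divisible by $2^{\text{levels}}$ and, as a simplifying assumption not made in the text, wraps indices periodically at the boundaries. All function and variable names are illustrative.

```python
import numpy as np

# Low-pass filters h_n, n = 0..L-1, as tabulated in the text.
HAAR = np.array([1.0, 1.0]) / np.sqrt(2.0)
DAUB2 = np.array([0.482962913145, 0.836516303736,
                  0.224143868042, -0.129409522551])

def highpass(h):
    """Indices n and values g_n = (-1)^(1-n) h_(1-n), eqn [38]."""
    n = np.arange(2 - len(h), 2)             # so that 1-n runs over 0..L-1
    return n, (-1.0) ** (1 - n) * h[1 - n]

def fwt_step(c, h):
    """One decomposition step, eqns [49]-[50], with periodic wrap-around."""
    N = len(c)
    i = np.arange(N // 2)
    ng, g = highpass(h)
    coarse = sum(h[k] * c[(k + 2 * i) % N] for k in range(len(h)))
    detail = sum(gk * c[(n + 2 * i) % N] for n, gk in zip(ng, g))
    return coarse, detail

def ifwt_step(coarse, detail, h):
    """One reconstruction step, eqn [51], with periodic wrap-around."""
    N = 2 * len(coarse)
    n = np.arange(len(coarse))
    ng, g = highpass(h)
    c = np.zeros(N)
    for k in range(len(h)):                  # sum over h_(i-2n), i = 2n + k
        np.add.at(c, (2 * n + k) % N, h[k] * coarse)
    for k, gk in zip(ng, g):                 # sum over g_(i-2n), i = 2n + k
        np.add.at(c, (2 * n + k) % N, gk * detail)
    return c

def fwt(c, h, levels):
    """Cascade of decomposition steps; details are ordered fine to coarse."""
    details = []
    for _ in range(levels):
        c, d = fwt_step(c, h)
        details.append(d)
    return c, details

def ifwt(c, details, h):
    """Cascade of reconstruction steps, undoing fwt."""
    for d in reversed(details):
        c = ifwt_step(c, d, h)
    return c
```

Each step touches every coefficient a fixed number of times (the filter length) and halves the signal, which is the origin of the $O(N)$ complexity quoted above. With these orthonormal filters the cascade reproduces the input up to rounding, and a constant signal yields vanishing Haar detail coefficients.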

Figure 3 Orthogonal wavelets Coiflet C12. (a) Scaling function $\phi(x)$ (left) and $|\hat{\phi}(\omega)|$ (right). (b) Wavelet $\psi(x)$ (left) and $|\hat{\psi}(\omega)|$ (right).

A periodic MRA of $L^2(\mathbb{T})$, where $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ denotes the torus, can also be constructed by periodizing the wavelets in $L^2(\mathbb{R})$, using

$$\psi^{\mathrm{per}}(x) = \sum_{k \in \mathbb{Z}} \psi(x + k)$$

Relaxing the condition of orthogonality allows greater flexibility in the choice of the basis functions. For example, biorthogonal wavelets can be designed using different basis functions for analysis ($\psi^a$) and synthesis ($\psi^s$) which are related but no longer orthogonal. A couple of refinable scaling functions ($\phi^a$, $\phi^s$) with related wavelets ($\psi^a$, $\psi^s$), which are by construction biorthogonal, generate a biorthogonal MRA $V_j^a$, $V_j^s$. From an algorithmic point of view, only two different filter couples, ($g^a$, $h^a$) for the forward and ($g^s$, $h^s$) for the backward FWT, are used, without changing the algorithm.

The multiresolution approach can be further generalized, for samplings on nonequidistant grids, leading to the so-called second-generation wavelets.

Higher Dimensions

The previously presented one-dimensional construction can be extended to higher dimensions. For simplicity, we will consider only the two-dimensional case, since higher dimensions can be treated analogously.

Tensor product construction. Having developed a one-dimensional orthonormal basis $\psi_{ji}$ of $L^2(\mathbb{R})$, one could use these functions as building blocks in higher dimensions. One way of doing so is to take the tensor product of two one-dimensional bases and to define

$$\psi_{j_x, j_y, i_x, i_y}(x, y) = \psi_{j_x, i_x}(x)\, \psi_{j_y, i_y}(y) \qquad [52]$$

The resulting functions constitute an orthonormal wavelet basis for $L^2(\mathbb{R}^2)$. Each function $f \in L^2(\mathbb{R}^2)$ can then be developed into

$$f(x, y) = \sum_{j_x, i_x} \sum_{j_y, i_y} \tilde{f}_{j_x, j_y, i_x, i_y}\, \psi_{j_x, j_y, i_x, i_y}(x, y) \qquad [53]$$

with $\tilde{f}_{j_x, j_y, i_x, i_y} = \langle f, \psi_{j_x, j_y, i_x, i_y} \rangle$. However, in this basis the two variables $x$ and $y$ are dilated separately
Figure 4 Schematic representation of the 2D wavelet transforms: (a) tensor product construction and (b) 2D MRA.

and therefore no longer form an MRA. This means that the functions $\psi_{j_x,j_y}$ involve two scales, $2^{j_x}$ and $2^{j_y}$, and each of the functions is essentially supported on a rectangle with these side-lengths. Hence, the decomposition is often called rectangular wavelet decomposition (cf. Figure 4a). From the algorithmic viewpoint, this is equivalent to applying the one-dimensional wavelet transform to the rows and the columns of a matrix or a function. For some types of applications, such a basis is advantageous, for others not. Often the notion of a scale has a certain meaning. For an application, one would like to have a unique scale assigned to each basis function.

Multiresolution construction Another much more interesting construction is that of a truly two-dimensional MRA of $L^2(\mathbb{R}^2)$. It can be obtained through the tensor product of two one-dimensional MRAs of $L^2(\mathbb{R})$. More precisely, one defines the spaces $\mathbf{V}_j$, $j \in \mathbb{Z}$, by

\[ \mathbf{V}_j = V_j \otimes V_j \qquad [54] \]

and $\mathbf{V}_j = \mathrm{span}\{\phi_{j,i_x,i_y}(x,y) = \phi_{j,i_x}(x)\,\phi_{j,i_y}(y),\ i_x, i_y \in \mathbb{Z}\}$, fulfilling analogous properties as in the one-dimensional case.

Likewise, we define the complement space $\mathbf{W}_j$ to be the orthogonal complement of $\mathbf{V}_j$ in $\mathbf{V}_{j+1}$, that is,

\[
\begin{aligned}
\mathbf{V}_{j+1} &= V_{j+1} \otimes V_{j+1} \\
&= (V_j \oplus W_j) \otimes (V_j \oplus W_j) && [55] \\
&= V_j \otimes V_j \oplus \bigl( (W_j \otimes V_j) \oplus (V_j \otimes W_j) \oplus (W_j \otimes W_j) \bigr) && [56] \\
&= \mathbf{V}_j \oplus \mathbf{W}_j && [57]
\end{aligned}
\]

It follows that the orthogonal complement $\mathbf{W}_j = \mathbf{V}_{j+1} \ominus \mathbf{V}_j$ consists of three different types of functions and is generated by three different wavelets

\[
\psi^{\varepsilon}_{j,i_x,i_y}(x,y) =
\begin{cases}
\psi_{j,i_x}(x)\,\phi_{j,i_y}(y), & \varepsilon = 1 \\
\phi_{j,i_x}(x)\,\psi_{j,i_y}(y), & \varepsilon = 2 \\
\psi_{j,i_x}(x)\,\psi_{j,i_y}(y), & \varepsilon = 3
\end{cases}
\qquad [58]
\]

Observe that here the scale parameter $j$ simultaneously controls the dilation in $x$ and $y$. We recall that in $d$ dimensions this construction yields $2^d - 1$ types of wavelets spanning $\mathbf{W}_j$.

Using [58], each function $f \in L^2(\mathbb{R}^2)$ can be developed into a multiresolution basis as

\[ f(x,y) = \sum_{j} \sum_{i_x,i_y} \sum_{\varepsilon = 1,2,3} \tilde f^{\varepsilon}_{j,i_x,i_y}\, \psi^{\varepsilon}_{j,i_x,i_y}(x,y) \qquad [59] \]

with $\tilde f^{\varepsilon}_{j,i_x,i_y} = \langle f, \psi^{\varepsilon}_{j,i_x,i_y} \rangle$. A schematic representation of the wavelet coefficients is shown in Figure 4b. The algorithmic structure of the one-dimensional transforms carries over to the two-dimensional case by simple tensorization, that is, applying the filters at each decomposition step to rows and columns.

Remark The described two-dimensional wavelets and scaling functions are separable. Their advantage is the ease of generation starting from one-dimensional MRAs. However, the main drawback of this construction is that three wavelets are needed to span the orthogonal complement space $\mathbf{W}_j$. Another property should be mentioned. By construction, the wavelets are anisotropic, that is, horizontal, diagonal, and vertical directions are preferred.

Approximation Properties

Reproduction of Polynomials

A fundamental property of the MRA is the exact reproduction of polynomials. The vanishing moments of the wavelet $\psi$, that is, $\int_{\mathbb{R}} x^m \psi(x)\,dx = 0$
for $m = 0, \ldots, M-1$, is equivalent to the fact that polynomials up to degree $M-1$ can be expressed exactly as a linear combination of scaling functions, $p_m(x) = \sum_{n \in \mathbb{Z}} n^m \phi(x - n)$ for $m = 0, \ldots, M-1$. This so-called Strang–Fix condition states that $\psi$ has $M$ vanishing moments if and only if any polynomial of degree $M-1$ can be written as a linear combination of scaling functions $\phi$. Note that, as $p_m \notin L^2(\mathbb{R})$, the coefficients $n^m$ are not in $l^2(\mathbb{Z})$.
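The vanishing-moment condition can be checked numerically at the level of the filters. The sketch below is only an illustration: it assumes the Daubechies filter with four coefficients (not discussed in this section) and the usual alternating-flip relation between the low-pass filter `h` and the wavelet filter `g`. It relies on the standard fact that the continuous moments of the wavelet vanish up to order $M-1$ exactly when the discrete moments of `g` do. The helper name `discrete_moment` is hypothetical.

```python
import math

# Daubechies low-pass filter with four taps (an assumed example filter).
s3 = math.sqrt(3.0)
h = [c / (4.0 * math.sqrt(2.0)) for c in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]

# Wavelet (high-pass) filter via the alternating flip g_k = (-1)^k h_{3-k}.
g = [(-1) ** k * h[3 - k] for k in range(4)]


def discrete_moment(filt, m):
    """Return sum_k k^m * filt[k], the m-th discrete moment of a filter."""
    return sum((k ** m) * c for k, c in enumerate(filt))


# Moments of order 0 and 1 vanish (M = 2); the moment of order 2 does not.
moments = [discrete_moment(g, m) for m in range(3)]
low_pass_sum = sum(h)  # equals sqrt(2) for an orthonormal scaling filter
```

With two vanishing moments, constants and linear functions are reproduced by the scaling functions, so their wavelet coefficients are zero away from boundaries.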
Regularity and Local Decay of Wavelet Coefficients

The local or global regularity of a function is closely related to the decay of its wavelet coefficients. If a function is locally in $C^s(\mathbb{R})$ (the space of $s$-times continuously differentiable functions), it can be well approximated locally by a Taylor series of degree $s$. Consequently, its wavelet coefficients are small at fine scales, as long as the wavelet has enough vanishing moments. The decay of the coefficients hence determines directly the error being made when truncating a wavelet sum at some scale.

Depending on the type of norm used and whether global or local characterization is concerned, various relations of this kind have been developed. Let us take as an example the case of an $\alpha$-Lipschitz function. Suppose $f \in L^2(\mathbb{R})$; then for $[a,b] \subset \mathbb{R}$ the function $f$ is $\alpha$-Lipschitz with $0 < \alpha < 1$ for any $x_0 \in [a,b]$, that is, $|f(x_0 + h) - f(x_0)| \le C|h|^{\alpha}$, if and only if there exists a constant $A$ such that $|\tilde f_{j,i}| \le A\,2^{-j(\alpha + 1/2)}$ for any $(j,i)$ with $i/2^j \in [a,b]$.

This shows the relation between the local regularity of a function and the decay of its wavelet coefficients in scale.

Example To illustrate the local decay of the wavelet coefficients, we consider in Figure 5 the function $f(x) = \sin(2\pi x)$ for $x \le 1/4$ and $x \ge 3/4$ and $f(x) = -\sin(2\pi x)$ for $1/4 < x < 3/4$. The corresponding wavelet coefficients for quintic spline wavelets are plotted in logarithmic scale. The wavelet coefficients show that the fine-scale coefficients are significant only in a local region around the singularities.

Figure 5 Orthogonal wavelet decomposition using quintic spline wavelets: (a) function $f(x) = \sin(2\pi x)$ for $x \le 1/4$ and $x \ge 3/4$ and $f(x) = -\sin(2\pi x)$ for $1/4 < x < 3/4$, sampled on a grid $x_i = i/2^J$, $i = 0, \ldots, 2^J - 1$, with $J = 9$, and (b) corresponding wavelet coefficients $\log_{10}|\tilde f_{j,i}|$ for $i = 0, \ldots, 2^j - 1$ and $j = 0, \ldots, J-1$.

Linear Approximation

The exact reproduction of polynomials can be used to derive error estimates for the approximation of a function $f$ at a given scale, which corresponds to linear approximation. We consider $f$ belonging to the Sobolev space $W^{s,p}(\mathbb{R}^d)$, that is, the weak derivatives of $f$ up to order $s$ belong to $L^p(\mathbb{R}^d)$. The linear approximation of $f$ at scale $J$, corresponding to the projection of $f$ onto $V_J$, is then given by

\[ f_J(x) = \sum_{j=0}^{J-1} \sum_{i \in \mathbb{Z}} \tilde f_{j,i}\, \psi_{j,i}(x) \qquad [60] \]

The approximation error can be estimated by

\[ \| f - f_J \|_{L^p} < C\, 2^{-J \min(s,m)/d} \qquad [61] \]

where $s$ denotes the smoothness of the function in $L^p$, $d$ the space dimension, and $m$ the number of vanishing moments of the wavelet $\psi$. In the case of poor global regularity of $f$, that is, for small $s$, a large number of scales $J$ is needed to get a good approximation of $f$.

In Figure 6, we plot the linear approximation of the function $f$ shown in Figure 5. The function $f_6$ is reconstructed using wavelet coefficients up to scale $J - 1 = 5$, so that in total only 64 out of 512 coefficients are retained. We observe an oscillating behavior of $f_J$ near the discontinuities of $f$ which dominates the approximation error.

Nonlinear Approximation

Retaining the $N$ largest wavelet coefficients in the wavelet expansion of $f$ in [46], without imposing any a priori cutoff scale, yields the best $N$-term approximation $f^N$. In contrast to the linear approximation [60], it is called nonlinear approximation, since the choice of the retained coefficients depends

on the function $f$. The mathematical theory has been formalized by Cohen, Dahmen, and De Vore.

The nonlinear approximation of the function $f$ can then be written as

\[ f^N(x) = \sum_{(j,i) \in \Lambda_N} \tilde f_{j,i}\, \psi_{j,i}(x) \qquad [62] \]

where $\Lambda_N$ denotes the ensemble of all multi-indices $\lambda = (j,i)$ indexing the $N$ largest coefficients (measured in the $l^p$ norm),

\[ \Lambda_N = \bigl\{ \lambda_k,\ k = 1, \ldots, N \;\big|\; \|\tilde f_{\lambda_k}\|_{l^p} > \|\tilde f_{\mu}\|_{l^p}\ \ \forall \mu \in \Lambda \setminus \Lambda_N \bigr\} \qquad [63] \]

with $\Lambda = \{ \lambda = (j,i),\ j \ge 0,\ i \in \mathbb{Z} \}$. The nonlinear approximation leads to the following error estimate:

\[ \| f - f^N \|_{L^p} < C\, N^{-s/d} \qquad [64] \]

where $s$ denotes the smoothness of $f$ in the larger space $L^q(\mathbb{R}^d)$ with

\[ \frac{1}{q} = \frac{1}{p} + \frac{s}{d} \]

which corresponds to the Sobolev embedding line (Figure 7). This estimate shows that the nonlinear approximation converges faster than the linear one if $f$ has a larger regularity in $L^q$, that is, $f \in W^{s,q}(\mathbb{R}^d)$, which is for example the case for functions with isolated singularities and for small $q$.

Figure 6 (a) Linear approximation $f_J$ of the function $f$ in Figure 5 for $J = 6$, reconstructed from 64 wavelet coefficients using quintic spline wavelets and (b) corresponding wavelet coefficients $\log_{10}|\tilde f_{j,i}|$ for $i = 0, \ldots, 2^j - 1$ and $j = 0, \ldots, J-1$. Note that the coefficients for $j > 5$ have been set to zero.

Figure 7 Schematic representation of linear and nonlinear approximation (smoothness $s$ plotted against $1/p$, with the embedding line $1/q = 1/p + t/d$).

In Figure 8, we plot the nonlinear approximation of the function $f$ shown in Figure 5. The function $f^N$ is reconstructed using the strongest 64 wavelet coefficients out of 512 coefficients. Compared to the linear approximation (cf. Figure 6), the oscillations around the discontinuities disappear and the approximation error is reduced while using the same number of coefficients.

Figure 8 (a) Nonlinear approximation $f^N$ of the function $f$ in Figure 5 reconstructed from the 64 largest wavelet coefficients using quintic spline wavelets, (b) retained wavelet coefficients $\log_{10}|\tilde f_{j,i}|$ for $i = 0, \ldots, 2^j - 1$ and $j = 0, \ldots, J-1$.

Compression and Preconditioning of Operators

The nonlinear approximation of functions can be extended to certain operators leading to an efficient

representation in wavelet space, that is, to sparse matrices. For integral operators, for example, Calderon–Zygmund operators $T$ on $\mathbb{R}$ defined by

\[ Tf(x) = \int_{\mathbb{R}} K(x,y)\, f(y)\, dy \qquad [65] \]

where the kernel $K$ satisfies

\[ |K(x,y)| \le \frac{C}{|x - y|} \]

and

\[ \left| \frac{\partial}{\partial x} K(x,y) \right| + \left| \frac{\partial}{\partial y} K(x,y) \right| \le \frac{C}{|x - y|^2} \]

their wavelet representation $\langle T \psi_{j,i}, \psi_{j',i'} \rangle$ is sparse and a large number of weak coefficients can be suppressed by simple thresholding of the matrix entries while controlling the precision. The resulting numerical scheme is called the BCR algorithm and is due to Beylkin et al. (1991).

The characterization of function spaces by the decay of the wavelet coefficients and the corresponding norm equivalences can be used for diagonal preconditioning of integral or differential operators, which leads to matrices with uniformly bounded condition numbers. For elliptic differential operators, for example, the Laplace operator $\nabla^2$, the norm equivalence $\|\nabla^2 f\| \simeq \|2^{2j} \tilde f_{j,i}\|$ can be used for preconditioning the matrix $\langle \nabla^2 \psi_{j,i}, \psi_{j',i'} \rangle$ by a simple diagonal scaling with $2^{-2j}$ to obtain a uniformly bounded condition number. For further details, we refer to the book of Cohen (2000).

Wavelet Denoising

We consider a function $f$ which is corrupted by a Gaussian white noise $n \in \mathcal{N}(0, \sigma^2)$. The noise is spread over all wavelet coefficients $\tilde s$, while, typically, the original function $f$ is determined by only a few significant wavelet coefficients. The aim is then to reconstruct the function $f$ from the observed noisy signal $s = f + n$.

The principle of wavelet denoising can be summarized in the following procedure:

Decomposition. Compute the wavelet coefficients $\tilde s$ using the FWT.
Thresholding. Apply the thresholding function $\rho_\varepsilon$ to the wavelet coefficients $\tilde s$, thus reducing the relative importance of the coefficients with small absolute value.
Reconstruction. Reconstruct a denoised version $s_C$ from the thresholded wavelet coefficients using the fast inverse wavelet transform.

The thresholding parameter $\varepsilon$ depends on the variance of the noise and on the sample size $N$. The thresholding function we consider corresponds to hard thresholding:

\[
\rho_\varepsilon(a) =
\begin{cases}
a & \text{if } |a| > \varepsilon \\
0 & \text{if } |a| \le \varepsilon
\end{cases}
\qquad [66]
\]

Donoho and Johnstone (1994) have shown that there exists an optimal $\varepsilon$ for which the relative quadratic error between the signal $s$ and its estimator $s_C$ is close to the minimax error for all signals $s \in H$, where $H$ belongs to a wide class of function spaces, including Hölder and Besov spaces. They showed that using the threshold

\[ \varepsilon_D = \sigma_n \sqrt{2 \ln N} \qquad [67] \]

yields an error which is close to the minimum error. The threshold $\varepsilon_D$ depends only on the sampling $N$ and on the variance of the noise $\sigma_n$; hence, it is called universal threshold. However, in many applications, $\sigma_n$ is unknown and has to be estimated from the available noisy data $s$. For this, the present authors have developed an iterative algorithm (see Azzolini et al. (2005)), which is sketched in the following:

1. Initialization
(a) given $s_k$, $k = 0, \ldots, N-1$. Set $i = 0$ and compute the FWT of $s$ to obtain $\tilde s$;
(b) compute the variance $\sigma_0^2$ of $s$ as a rough estimate of the variance of $n$ and compute the corresponding threshold $\varepsilon_0 = (2 \ln N\, \sigma_0^2)^{1/2}$;
(c) set the number of coefficients considered as noise $N_{\mathrm{noise}} = N$.
2. Main loop
repeat
(a) set $N'_{\mathrm{noise}} = N_{\mathrm{noise}}$ and count the wavelet coefficients $N_{\mathrm{noise}}$ with modulus smaller than $\varepsilon_i$;
(b) compute the new variance $\sigma_{i+1}^2$ from the wavelet coefficients whose modulus is smaller than $\varepsilon_i$ and the new threshold $\varepsilon_{i+1} = (2 (\ln N)\, \sigma_{i+1}^2)^{1/2}$;
(c) set $i = i + 1$
until ($N'_{\mathrm{noise}} = N_{\mathrm{noise}}$).
3. Final step
(a) compute $s_C$ from the coefficients with modulus larger than $\varepsilon_i$ using the inverse FWT.

Example To illustrate the properties of the denoising algorithm, we apply it to a one-dimensional test signal. We construct a noisy signal $s$ by superposing a Gaussian white noise, with zero mean and variance $\sigma_W^2 = 1$, to a function $f$, normalized such that $\bigl((1/N) \sum_k |f_k|^2\bigr)^{1/2} = 10$. The number of samples is

Figure 9 Construction (top) of a 1D noisy signal $s = f + n$ (middle), and results obtained by the recursive denoising algorithm (bottom). Panels: $f$ and $n$ (top), $s$ (middle), $s_C$ and $s - s_C$ (bottom).

$N = 8192$. Figure 9a shows the function $f$ together with the noise $n$; Figure 9b shows the constructed noisy signal $s$ and Figure 9c shows the wavelet denoised signal $s_C$ together with the extracted noise.
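The iterative threshold estimation sketched in the Wavelet Denoising section can be illustrated with a few lines of code. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses the orthonormal Haar wavelet instead of the wavelets used in the article, a piecewise-constant test function with root-mean-square value 10, a smaller sample size ($N = 1024$), and the mean square of the coefficients as the variance estimate (zero mean assumed). The function names are hypothetical.

```python
import math
import random


def haar_fwt(signal):
    """Orthonormal Haar transform; len(signal) must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    r = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * k] + coeffs[2 * k + 1]) / r for k in range(half)]
        detail = [(coeffs[2 * k] - coeffs[2 * k + 1]) / r for k in range(half)]
        coeffs[:n] = approx + detail  # coarse part first, details after it
        n = half
    return coeffs


def haar_ifwt(coeffs):
    """Inverse of haar_fwt."""
    out = list(coeffs)
    n = 1
    r = math.sqrt(2.0)
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / r, (a - d) / r]
        out[:2 * n] = merged
        n *= 2
    return out


def recursive_threshold(coeffs):
    """Iterate eps = sqrt(2 ln N * variance) until the noise count is stable."""
    n_samples = len(coeffs)
    variance = sum(c * c for c in coeffs) / n_samples  # step 1(b)
    eps = math.sqrt(2.0 * math.log(n_samples) * variance)
    n_noise = n_samples                                # step 1(c)
    while True:
        n_noise_old = n_noise                          # step 2(a)
        noise = [c for c in coeffs if abs(c) < eps]
        n_noise = len(noise)
        if n_noise == 0 or n_noise == n_noise_old:
            return eps
        variance = sum(c * c for c in noise) / n_noise  # step 2(b)
        eps = math.sqrt(2.0 * math.log(n_samples) * variance)


random.seed(0)
N = 1024
f = [10.0 if k < N // 2 else -10.0 for k in range(N)]  # rms value 10
s = [fk + random.gauss(0.0, 1.0) for fk in f]          # unit-variance noise

s_tilde = haar_fwt(s)
eps = recursive_threshold(s_tilde)
# Step 3: hard thresholding followed by the inverse transform.
s_c = haar_ifwt([c if abs(c) > eps else 0.0 for c in s_tilde])

mse_noisy = sum((a - b) ** 2 for a, b in zip(s, f)) / N
mse_denoised = sum((a - b) ** 2 for a, b in zip(s_c, f)) / N
```

Because the count of coefficients below the threshold can only decrease from one iteration to the next, the loop terminates after finitely many steps.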
Acknowledgments

Marie Farge thankfully acknowledges Trinity College, Cambridge, UK, and CIRM, Marseille, France, for support while writing this paper. The authors also thank Barbara Burke for kindly revising their English.

See also: Coherent States; Fractal Dimensions in Dynamics; Homeomorphisms and Diffeomorphisms of the Circle; Image Processing: Mathematics; Wavelets: Application to Turbulence; Wavelets: Applications.

Further Reading

Azzolini A, Farge M, and Schneider K (2005) Nonlinear wavelet thresholding: A recursive method to determine the optimal denoising threshold. Applied and Computational Harmonic Analysis 18(2): 177.
Beylkin G, Coifman R, and Rokhlin V (1991) Fast wavelet transforms and numerical algorithms. Communications in Pure and Applied Mathematics 44: 141.
Cohen A (2000) Wavelet methods in numerical analysis. In: Ciarlet PG and Lions JL (eds.) Handbook of Numerical Analysis, vol. 7. Amsterdam: Elsevier.
Dahmen W (1997) Wavelets and multiscale methods for operator equations. Acta Numerica 6: 55–228.
Daubechies I (1988) Orthonormal bases of compactly supported wavelets. Communications in Pure and Applied Mathematics 41: 909.
Daubechies I (1992) Ten Lectures on Wavelets. Philadelphia, PA: SIAM.
Donoho D and Johnstone I (1994) Ideal spatial adaptation via wavelet shrinkage. Biometrika 81: 425.
Grossmann A and Morlet J (1984) Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM Journal of Mathematical Analysis 15(4): 723–736.
Lemarié P-G and Meyer Y (1986) Ondelettes et bases hilbertiennes. Revista Matematica Iberoamericana 2: 1.
Mallat S (1989) Multiresolution approximations and wavelet orthonormal bases of $L^2(\mathbb{R})$. Transactions of the American Mathematical Society 315: 69.
Mallat S (1998) A Wavelet Tour of Signal Processing. San Diego, CA: Academic Press.

WDVV Equations and Frobenius Manifolds


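B Dubrovin, SISSA-ISAS, Trieste, Italy
© 2006 Elsevier Ltd. All rights reserved.

Main Definition

WDVV equations of associativity (after E Witten, R Dijkgraaf, E Verlinde, and H Verlinde) is tantamount to the following problem: find a function $F(v)$ of $n$ variables $v = (v^1, v^2, \ldots, v^n)$ satisfying the conditions [1], [3], and [4] given below. First,

\[ \eta_{\alpha\beta} := \frac{\partial^3 F(v)}{\partial v^1\, \partial v^\alpha\, \partial v^\beta} \qquad [1] \]

must be a constant symmetric nondegenerate matrix. Denote $(\eta^{\alpha\beta}) = (\eta_{\alpha\beta})^{-1}$ the inverse matrix and introduce the functions

\[ c^{\gamma}_{\alpha\beta}(v) = \eta^{\gamma\epsilon}\, \frac{\partial^3 F(v)}{\partial v^\epsilon\, \partial v^\alpha\, \partial v^\beta}, \qquad \alpha, \beta, \gamma = 1, \ldots, n \qquad [2] \]

The main condition says that, for arbitrary $v^1, \ldots, v^n$, these functions must be structure constants of an associative algebra, that is, introducing a $v$-dependent multiplication law in the $n$-dimensional space by

\[ a \cdot b := \bigl( c^{1}_{\alpha\beta}(v)\, a^\alpha b^\beta, \ldots, c^{n}_{\alpha\beta}(v)\, a^\alpha b^\beta \bigr) \]

one obtains an $n$-parameter family of $n$-dimensional associative algebras (these algebras will automatically be also commutative). Spelling out this condition one obtains an overdetermined system of nonlinear PDEs for the function $F(v)$, often also called WDVV associativity equations

\[ \frac{\partial^3 F(v)}{\partial v^\alpha\, \partial v^\beta\, \partial v^\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F(v)}{\partial v^\mu\, \partial v^\gamma\, \partial v^\delta} = \frac{\partial^3 F(v)}{\partial v^\delta\, \partial v^\beta\, \partial v^\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F(v)}{\partial v^\mu\, \partial v^\gamma\, \partial v^\alpha} \qquad [3] \]

for arbitrary $1 \le \alpha, \beta, \gamma, \delta \le n$. (Summation over repeated indices will always be assumed.) The last one is the so-called quasihomogeneity condition

\[ EF = (3 - d) F + \tfrac12 A_{\alpha\beta} v^\alpha v^\beta + B_\alpha v^\alpha + C \qquad [4] \]

where

\[ E = \bigl( a^{\alpha}_{\beta} v^\beta + b^\alpha \bigr) \frac{\partial}{\partial v^\alpha} \]

for some constants $a^{\alpha}_{\beta}$, $b^\alpha$ satisfying

\[ a^{\alpha}_{1} = \delta^{\alpha}_{1}, \qquad b^1 = 0 \]

$A_{\alpha\beta}$, $B_\alpha$, $C$, $d$ are some constants. $E$ is called Euler vector field and $d$ is the charge of the Frobenius manifold.

For $n = 1$ one has $F(v) = (1/6) v^3$. For $n = 2$ one can choose

\[ F(u, v) = \tfrac12 u v^2 + f(u) \]

only the quasihomogeneity [4] makes a constraint for $f(u)$. The first nontrivial case is for $n = 3$. The solution to WDVV is expressed in terms of a function $f = f(x, y)$ in one of the two forms (in the examples all indices are written as lower):

\[
\begin{aligned}
d \ne 0:&\quad F = \tfrac12 v_1^2 v_3 + \tfrac12 v_1 v_2^2 + f(v_2, v_3), \qquad f_{xxy}^2 = f_{yyy} + f_{xxx} f_{xyy} \\
d = 0:&\quad F = \tfrac16 v_1^3 + v_1 v_2 v_3 + f(v_2, v_3), \qquad f_{xxx} f_{yyy} - f_{xxy} f_{xyy} = 1
\end{aligned}
\qquad [5]
\]

The function $f(x, y)$ satisfies an additional constraint imposed by [4]. Because of this the above PDEs [5] can be reduced (Dubrovin 1992, 1996) to a particular case of the Painlevé-VI equation (see Painlevé Equations).

The problem [1], [3], [4] is invariant with respect to linear changes of coordinates preserving the direction of the vector $\partial / \partial v^1$:

\[ v^\alpha \mapsto \tilde v^\alpha = P^{\alpha}_{\beta} v^\beta + Q^\alpha, \qquad \det(P^{\alpha}_{\beta}) \ne 0, \qquad P^{\alpha}_{1} = \delta^{\alpha}_{1} \]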
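The link between the PDE in [5] and associativity can be checked on a concrete example. The sketch below is an illustration under assumptions: the polynomial $f(x,y) = -x^2 y^2/16 + y^5/960$ is chosen here only because it satisfies $f_{xxy}^2 = f_{yyy} + f_{xxx} f_{xyy}$ by direct differentiation, and quasihomogeneity [4] is not tested. Polynomials are stored as dictionaries from exponent triples to exact rational coefficients, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N_VARS = 3


def diff(poly, var):
    """Partial derivative of {exponents: coeff} with respect to variable var."""
    out = {}
    for exps, coeff in poly.items():
        if exps[var] == 0:
            continue
        new = list(exps)
        new[var] -= 1
        out[tuple(new)] = out.get(tuple(new), 0) + coeff * exps[var]
    return out


def evaluate(poly, point):
    """Evaluate the polynomial at a point (v1, v2, v3)."""
    return sum(c * point[0] ** e[0] * point[1] ** e[1] * point[2] ** e[2]
               for e, c in poly.items())


def structure_constants(F, point):
    """Return eta_{ab} from [1] and c[(g, a, b)] = c^g_{ab} from [2]."""
    idx = range(N_VARS)
    d3 = {(a, b, c): evaluate(diff(diff(diff(F, a), b), c), point)
          for a, b, c in product(idx, repeat=3)}
    eta = [[d3[0, a, b] for b in idx] for a in idx]
    # For this normal form eta is the antidiagonal matrix, its own inverse.
    eta_inv = eta
    c = {(g, a, b): sum(eta_inv[g][e] * d3[e, a, b] for e in idx)
         for g, a, b in product(idx, repeat=3)}
    return eta, c


def is_associative(F, point):
    """Check (e_a . e_b) . e_g == e_a . (e_b . e_g) for all basis vectors."""
    _, c = structure_constants(F, point)
    idx = range(N_VARS)
    for a, b, g, d in product(idx, repeat=4):
        lhs = sum(c[m, a, b] * c[d, m, g] for m in idx)
        rhs = sum(c[m, b, g] * c[d, a, m] for m in idx)
        if lhs != rhs:
            return False
    return True


# F = 1/2 v1^2 v3 + 1/2 v1 v2^2 + f(v2, v3), the d != 0 form of [5].
F = {(2, 0, 1): Fraction(1, 2), (1, 2, 0): Fraction(1, 2),
     (0, 2, 2): Fraction(-1, 16), (0, 0, 5): Fraction(1, 960)}
F_perturbed = dict(F)
F_perturbed[(0, 0, 5)] = Fraction(1, 900)  # breaks the PDE for f

point = (Fraction(2), Fraction(3), Fraction(5))
eta_at_point, _ = structure_constants(F, point)
ok = is_associative(F, point)
ok_perturbed = is_associative(F_perturbed, point)
```

Exact rational arithmetic is used so that the associativity test is an equality check rather than a tolerance comparison.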
It is also allowed to add to $F(v)$ a polynomial of the degree at most 2. To consider more general nonlinear changes of coordinates one has to give a coordinate-free form of the above equations [1], [3], [4]. This gives rise to the notion of Frobenius manifold introduced in Dubrovin (1992).

Recall that a Frobenius algebra is a pair $(A, \langle\ ,\ \rangle)$, where $A$ is a commutative associative algebra with a unity $e$ over a field $k$ (we will consider only the cases $k = \mathbb{R}, \mathbb{C}$) and $\langle\ ,\ \rangle$ is a $k$-bilinear symmetric nondegenerate invariant form on $A$, that is,

\[ \langle x \cdot y, z \rangle = \langle x, y \cdot z \rangle \]

for arbitrary vectors $x$, $y$, $z$ in $A$.

Definition Frobenius structure $(\cdot, e, \langle\ ,\ \rangle, E, d)$ on the manifold $M$ is a structure of a Frobenius algebra on the tangent spaces $T_v M = (A_v, \langle\ ,\ \rangle_v)$ depending (smoothly, analytically, etc.) on the point $v \in M$. It must satisfy the following axioms.

FM1. The curvature of the metric $\langle\ ,\ \rangle_v$ on $M$ (not necessarily positive definite) vanishes. Denote $\nabla$ the Levi-Civita connection for the metric. The unity vector field $e$ must be flat, $\nabla e = 0$.

FM2. Let $c$ be the 3-tensor $c(x, y, z) := \langle x \cdot y, z \rangle$, $x, y, z \in T_v M$. The 4-tensor $(\nabla_w c)(x, y, z)$ must be symmetric in $x, y, z, w \in T_v M$.

FM3. A linear vector field $E \in \mathrm{Vect}(M)$ (called Euler vector field) must be fixed on $M$, that is, $\nabla \nabla E = 0$, such that

\[ \mathrm{Lie}_E (x \cdot y) - \mathrm{Lie}_E x \cdot y - x \cdot \mathrm{Lie}_E y = x \cdot y \]

\[ \mathrm{Lie}_E \langle\ ,\ \rangle = (2 - d) \langle\ ,\ \rangle \]

for some number $d \in k$ called "charge."

The last condition (also called quasihomogeneity) means that the derivations $Q_{\mathrm{Func}(M)} := E$, $Q_{\mathrm{Vect}(M)} := \mathrm{id} + \mathrm{ad}_E$ define on the space $\mathrm{Vect}(M)$ of vector fields on $M$ a structure of graded Frobenius algebra over the graded ring of functions $\mathrm{Func}(M)$.

Flatness of the metric $\langle\ ,\ \rangle$ implies local existence of a system of flat coordinates $v^1, \ldots, v^n$ on $M$. Usually, they are chosen in such a way that

\[ e = \frac{\partial}{\partial v^1} \]

is the unity vector field. In such coordinates, the problem of local classification of Frobenius manifolds reduces to the WDVV associativity equations [1], [3], [4]. Namely, $\eta_{\alpha\beta}$ is the constant Gram matrix of the metric in these coordinates

\[ \eta_{\alpha\beta} := \left\langle \frac{\partial}{\partial v^\alpha}, \frac{\partial}{\partial v^\beta} \right\rangle \]

The structure constants of the Frobenius algebra $A_v = T_v M$

\[ \frac{\partial}{\partial v^\alpha} \cdot \frac{\partial}{\partial v^\beta} = c^{\gamma}_{\alpha\beta}(v)\, \frac{\partial}{\partial v^\gamma} \qquad [6] \]

can be locally represented by third derivatives [2] of a function $F(v)$ satisfying [1], [3], [4]. The function $F(v)$ is called "potential" of the Frobenius manifold. It is defined up to adding of an at most quadratic polynomial in $v^1, \ldots, v^n$.

A generalization of the above definition to the case of Frobenius supermanifolds can be found in Manin (1999). For the more general class of the so-called F-manifolds, the requirement of the existence of a flat invariant metric has been relaxed.

Deformed Flat Connection

One of the main geometrical structures of the theory of Frobenius manifolds is the deformed flat connection. This is a symmetric affine connection on $M \times \mathbb{C}^*$ defined by the following formulas:

\[
\begin{aligned}
\tilde\nabla_x y &= \nabla_x y + z\, x \cdot y, \qquad x, y \in TM,\ z \in \mathbb{C}^* \\
\tilde\nabla_{d/dz}\, y &= \partial_z y + E \cdot y - \frac{1}{z} V y \\
\tilde\nabla_x \frac{d}{dz} &= \tilde\nabla_{d/dz} \frac{d}{dz} = 0
\end{aligned}
\qquad [7]
\]

where, as above, $\nabla$ is the Levi-Civita connection for the metric $\langle\ ,\ \rangle$ and

\[ V := \frac{2 - d}{2} - \nabla E \qquad [8] \]

is an operator on the tangent bundle $TM$ antisymmetric with respect to $\langle\ ,\ \rangle$,

\[ \langle V x, y \rangle = - \langle x, V y \rangle \]

Observe that the unity vector field $e$ is an eigenvector of this operator with the eigenvalue

\[ V e = -\frac{d}{2}\, e \]

The connection $\tilde\nabla = \tilde\nabla(z)$ is not metric but it satisfies

\[ \nabla \langle x, y \rangle = \langle \tilde\nabla(-z) x, y \rangle + \langle x, \tilde\nabla(z) y \rangle, \qquad x, y \in TM \]

for any $z \in \mathbb{C}^*$. As it was discovered in Dubrovin (1992), vanishing of the curvature of the connection $\tilde\nabla$ is essentially equivalent to the axioms of Frobenius manifold.
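The eigenvalue of $V$ on the unity vector field can be recovered from the axioms in a few lines. The derivation below is a sketch that assumes the convention that $\nabla E$ in [8] acts on a vector field $x$ as $x \mapsto \nabla_x E$; this is our reading of the formula rather than something stated explicitly in the text.

```latex
% FM3 with x = y = e, using e \cdot e = e and e \cdot w = w:
\mathrm{Lie}_E e - 2\,\mathrm{Lie}_E e = e
\quad\Longrightarrow\quad [E, e] = -e .
% The Levi-Civita connection is torsion free and \nabla e = 0 (FM1):
\nabla_e E = \nabla_E e + [e, E] = e .
% Hence, with V = \tfrac{2-d}{2} - \nabla E from [8]:
V e = \tfrac{2-d}{2}\, e - \nabla_e E = -\tfrac{d}{2}\, e .
```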
Definition A "deformed flat function" $f(v; z)$ on a domain in $M \times \mathbb{C}^*$ is defined by the requirement of horizontality of the differential $df$

\[ \tilde\nabla\, df = 0 \qquad [9] \]

Due to vanishing of the curvature of $\tilde\nabla$ locally there exist $n$ independent deformed flat functions $f_1(v; z), \ldots, f_n(v; z)$ such that their differentials, together with the flat 1-form $dz$, span the cotangent plane $T^*_{(v;z)}(M \times \mathbb{C}^*)$. They will be called "deformed flat coordinates." The global analytic properties of deformed flat coordinates can be derived, for the case of semisimple Frobenius manifolds, from the results of the section "Moduli of semisimple Frobenius manifolds" discussed later.

One can relax the definition of Frobenius manifold dropping the last axiom FM3. The potential $F(v)$ in this case satisfies [1] and [3] but not [4]. In this case, the deformed flat connection $\tilde\nabla$ is just a family of affine flat connections on $M$ depending on the parameter $z \in \mathbb{C}$ given by the first line in [7]. The curvature and torsion of this family of connections vanishes identically in $z$. The deformed flat functions of $\tilde\nabla$ defined as in [9] can be chosen in the form of power series in $z$. The flatness equations written in the flat coordinates on $M$ yield a recursion equation for the coefficients of these power series

\[
\begin{aligned}
&\tilde\nabla\, df = 0, \qquad f = \sum_{p \ge 0} \theta_p(v)\, z^p \\
&\partial_\alpha \partial_\beta f = z\, c^{\gamma}_{\alpha\beta}(v)\, \partial_\gamma f \\
&\partial_\alpha \partial_\beta \theta_0(v) = 0 \\
&\partial_\alpha \partial_\beta \theta_{p+1}(v) = c^{\gamma}_{\alpha\beta}(v)\, \partial_\gamma \theta_p(v), \qquad p \ge 0
\end{aligned}
\qquad [10]
\]

Thus, $f(v; 0)$ is just an affine linear function of the flat coordinates $v^1, \ldots, v^n$; the dependence on $z$ can be considered as a deformation of the affine structure. This motivates the name "deformed flat coordinates." The coefficients of the expansions of the deformed flat coordinates are the leading terms of the $\epsilon$-expansion of the Hamiltonian densities of the integrable hierarchies associated with the Frobenius manifolds (see below).

Intersection Form of a Frobenius Manifold

Another important geometric structure on $M$ is the intersection form of the Frobenius manifold. It is a symmetric bilinear form on the cotangent bundle $T^* M$ defined by the formula

\[ (\omega_1, \omega_2) = i_E\, \omega_1 \cdot \omega_2, \qquad \omega_1, \omega_2 \in T^* M \qquad [11] \]

Here the multiplication law on the cotangent planes is defined by means of the isomorphism

\[ \langle\ ,\ \rangle : TM \to T^* M \]

The discriminant $\Sigma \subset M$ is a proper analytic (for an analytic $M$) subset where the intersection form degenerates. One can introduce a new metric on the open subset $M \setminus \Sigma$ taking the inverse of the intersection form. A remarkable result of the theory of Frobenius manifolds is vanishing of the curvature of this new metric. Moreover, the new flat metric together with the following new multiplication:

\[ x * y := x \cdot y \cdot E^{-1} \]

defines on $M \setminus \Sigma$ a structure of an almost-dual Frobenius manifold (Dubrovin 2004). In the original flat coordinates $v^1, \ldots, v^n$ the coordinate expressions for the new metric and for the associated Levi-Civita connection $\nabla^*$, called the Gauss–Manin connection, read

\[
\begin{aligned}
g^{\alpha\beta}(v) &:= (dv^\alpha, dv^\beta) = E^{\epsilon}(v)\, c^{\alpha\beta}_{\epsilon}(v) \\
\nabla^{*\alpha}\, dv^\beta &= \Gamma^{\alpha\beta}_{\gamma}(v)\, dv^\gamma \\
\Gamma^{\alpha\beta}_{\gamma}(v) &:= -g^{\alpha\epsilon}(v)\, \Gamma^{\beta}_{\epsilon\gamma}(v) = c^{\alpha\epsilon}_{\gamma}(v) \left( \tfrac12 - V \right)^{\beta}_{\epsilon}
\end{aligned}
\qquad [12]
\]

The pair $(\ ,\ )$ and $\langle\ ,\ \rangle$ of bilinear forms on $T^* M$ possesses the following property crucial for understanding the relationships between Frobenius manifolds and integrable systems: they form a flat pencil. That means that on the complement to the subset

\[ \Sigma_\lambda := \{ v \in M \mid \det( g^{\alpha\beta}(v) - \lambda\, \eta^{\alpha\beta} ) = 0 \} \]

the inverse to the bilinear form

\[ (\ ,\ )_\lambda := (\ ,\ ) - \lambda \langle\ ,\ \rangle \qquad [13] \]

defines a metric with vanishing curvature. Flat functions $p = p(v; \lambda)$ for the flat metric are determined from the system

\[ (\nabla^* - \lambda \nabla)\, dp = 0 \qquad [14] \]

They are called "periods" of the Frobenius manifold. The periods $p(v; \lambda)$ are related to the deformed flat functions $f(v; z)$ by the suitably regularized Laplace-type integral transform

\[ p(v; \lambda) = \int_0^\infty e^{-\lambda z} f(v; z)\, \frac{dz}{\sqrt{z}} \qquad [15] \]

Choosing a system of $n$ independent periods, one obtains a system of flat coordinates $p^1(v; \lambda), \ldots, p^n(v; \lambda)$ for the metric $(\ ,\ )_\lambda$ on $M \setminus \Sigma_\lambda$,

\[ \bigl( dp^i(v; \lambda), dp^j(v; \lambda) \bigr)_\lambda = G^{ij} \qquad [16] \]

for some constant nondegenerate matrix $G^{ij}$.
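As a small consistency check of the recursion [10], one can work it out in the simplest case $n = 1$ with $F(v) = v^3/6$, for which the single structure constant equals 1. The computation below is a sketch that assumes the normalization $\theta_0 = v$ and sets all integration constants to zero; other choices differ by admissible affine terms.

```latex
% n = 1, F = v^3/6, so \eta_{11} = 1 and c^1_{11} = 1.
% Recursion [10]: \theta_0'' = 0, \quad \theta_{p+1}'' = \theta_p'.
\theta_0 = v, \qquad
\theta_1 = \frac{v^2}{2}, \qquad
\theta_p = \frac{v^{p+1}}{(p+1)!} .
% Summing the series:
f(v; z) = \sum_{p \ge 0} \frac{v^{p+1} z^p}{(p+1)!} = \frac{e^{zv} - 1}{z} .
% Check of the flatness equation \partial_v^2 f = z\, \partial_v f:
\partial_v f = e^{zv}, \qquad \partial_v^2 f = z\, e^{zv} .
```

At $z = 0$ the function reduces to $f(v; 0) = v$, the flat coordinate itself, in line with the interpretation of $z$ as a deformation parameter.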
The structure of a flat pencil on the Frobenius manifold $M$ gives rise to a natural Poisson pencil (= bi-Hamiltonian structure) on the infinite-dimensional "manifold" $L(M)$ consisting of smooth maps of a circle to $M$ (the so-called loop space). In the flat coordinates $v^1, \ldots, v^n$ for the metric $\langle\ ,\ \rangle$ the Poisson pencil has the form

\[
\begin{aligned}
\{ v^\alpha(x), v^\beta(y) \}_1 &= \eta^{\alpha\beta}\, \delta'(x - y) \\
\{ v^\alpha(x), v^\beta(y) \}_2 &= g^{\alpha\beta}(v(x))\, \delta'(x - y) + \Gamma^{\alpha\beta}_{\gamma}(v(x))\, v^\gamma_x\, \delta(x - y)
\end{aligned}
\qquad [17]
\]

By definition of the Poisson pencil, the linear combination $a_1 \{\ ,\ \}_1 + a_2 \{\ ,\ \}_2$ of the Poisson brackets is again a Poisson bracket for arbitrary constants $a_1$, $a_2$. Choosing a system of $n$ independent periods $p^i(v; \lambda)$, $i = 1, \ldots, n$, as a new system of dependent variables, one obtains a reduction of the Poisson bracket $\{\ ,\ \}_\lambda := \{\ ,\ \}_2 - \lambda \{\ ,\ \}_1$ for a given $\lambda$ to the canonical form

\[ \{ p^i(v(x); \lambda), p^j(v(y); \lambda) \}_\lambda = G^{ij}\, \delta'(x - y) \qquad [18] \]

Under an additional assumption of existence of tau function (Dubrovin 1996, Dubrovin and Zhang), one can prove that any Poisson pencil on $L(M)$ of the form [17] with a nondegenerate matrix $(\eta^{\alpha\beta})$ comes from a Frobenius structure on $M$.

Canonical Coordinates on Semisimple Frobenius Manifolds

Definition The Frobenius manifold $M$ is called semisimple if the algebras $T_v M$ are semisimple for $v$ belonging to an open dense subset in $M$.

Any $n$-dimensional semisimple Frobenius algebra over $\mathbb{C}$ is isomorphic to the orthogonal direct sum of $n$ copies of one-dimensional algebras. In this section, all the manifolds will be assumed to be complex analytic.

Near a semisimple point, the roots $u_i = u_i(v)$, $i = 1, \ldots, n$, of the characteristic equation

\[ \det\bigl( g^{\alpha\beta}(v) - \lambda\, \eta^{\alpha\beta} \bigr) = 0 \qquad [19] \]

can be used as local coordinates. The vectors $\partial / \partial u_i$, $i = 1, \ldots, n$, are basic idempotents of the algebras $T_v M$

\[ \frac{\partial}{\partial u_i} \cdot \frac{\partial}{\partial u_j} = \delta_{ij}\, \frac{\partial}{\partial u_i} \]

We call $u_1, \ldots, u_n$ "canonical coordinates." Observe that we violate the indices convention labeling the canonical coordinates by subscripts. We will never use summation over repeated indices when working in the canonical coordinates. Actually, existence of canonical coordinates can be proved without using [4] (see details in Dubrovin (1992)).

Choosing locally branches of the square roots

\[ \psi_{i1}(u) := \sqrt{ \langle \partial / \partial u_i, \partial / \partial u_i \rangle }, \qquad i = 1, \ldots, n \qquad [20] \]

we obtain a transition matrix $\Psi = (\psi_{i\alpha}(u))$,

\[ \frac{\partial}{\partial v^\alpha} = \sum_{i=1}^{n} \frac{\psi_{i\alpha}(u)}{\psi_{i1}(u)}\, \frac{\partial}{\partial u_i} \qquad [21] \]

from the basis $\partial / \partial v^\alpha$ to the orthonormal basis

\[ \langle f_i, f_j \rangle = \delta_{ij} \]

\[ f_1 = \psi_{11}^{-1}(u)\, \frac{\partial}{\partial u_1}, \quad f_2 = \psi_{21}^{-1}(u)\, \frac{\partial}{\partial u_2}, \quad \ldots, \quad f_n = \psi_{n1}^{-1}(u)\, \frac{\partial}{\partial u_n} \qquad [22] \]

The matrix $\Psi(u)$ satisfies the orthogonality condition

\[ \Psi^*(u)\, \Psi(u) \equiv \eta, \qquad \eta = (\eta_{\alpha\beta}), \qquad \eta_{\alpha\beta} := \left\langle \frac{\partial}{\partial v^\alpha}, \frac{\partial}{\partial v^\beta} \right\rangle \]

In this formula $\Psi^*$ stands for the transposed matrix. The lengths [20] coincide with the first column of this matrix.

Denote $V(u) = (V_{ij}(u))$ the matrix of the antisymmetric operator $V$ [8] with respect to the orthonormal frame

\[ V(u) := \Psi(u)\, V\, \Psi^{-1}(u) \qquad [23] \]

The antisymmetric matrix $V(u) = (V_{ij}(u))$ satisfies the following system of commuting time-dependent Hamiltonian flows on the Lie algebra $so(n)$ equipped with the standard Lie–Poisson brackets $\{ V_{ij}, V_{kl} \} = V_{il} \delta_{jk} - V_{jl} \delta_{ik} + V_{jk} \delta_{il} - V_{ik} \delta_{jl}$:

\[ \frac{\partial V}{\partial u_i} = \{ V, H_i(V; u) \}, \qquad i = 1, \ldots, n \qquad [24] \]

with quadratic Hamiltonians

\[ H_i(V; u) = \frac12 \sum_{j \ne i} \frac{V_{ij}^2}{u_i - u_j} \qquad [25] \]

The matrix $\Psi(u)$ satisfies

\[ \frac{\partial \Psi}{\partial u_i} = V_i(u)\, \Psi, \qquad V_i(u) := \mathrm{ad}_{E_i}\, \mathrm{ad}_U^{-1}(V(u)), \qquad i = 1, \ldots, n \qquad [26] \]

Here the matrix unity $E_i$ has the entries $(E_i)_{ab} = \delta_{ai} \delta_{ib}$, $U = \mathrm{diag}(u_1, \ldots, u_n)$. Conversely, given a solution to [24] and [26], one can reconstruct the
Frobenius manifold structure by quadratures (Dubrovin 1998). The reconstruction depends on a choice of an eigenvector of the constant matrix $V = \Psi^{-1}(u)\, V(u)\, \Psi(u)$.

The system [24] coincides with the equations of isomonodromic deformations (see Isomonodromic Deformations) of the following linear differential operator with rational coefficients:

\[ \frac{dY}{dz} = \left( U + \frac{V}{z} \right) Y \qquad [27] \]

The latter is nothing but the last component of the deformed flat connection [7] written in the orthonormal frame [22]. Other components of the horizontality equations yield

\[ \partial_i Y = ( z E_i + V_i(u) )\, Y, \qquad i = 1, \ldots, n \qquad [28] \]

The compatibility conditions of the system [27] and [28] coincide with [24].

The integration of [24], [26] and, more generally, the reconstruction of the Frobenius structure can be reduced to a solution of a certain Riemann–Hilbert problem (see Riemann–Hilbert Problem).

The isomonodromic tau function of the semisimple Frobenius manifold is defined by

\[ d \log \tau_I(u) = \sum_{i=1}^{n} H_i(V(u); u)\, du_i \qquad [29] \]

It is an analytic function on a suitable unramified covering of the semisimple part of $M$.

Alternatively, eqns [24] can be represented as the isomonodromy deformations of the dual Fuchsian system

\[ [U - \lambda]\, \frac{d\phi}{d\lambda} = \left( \frac12 + V \right) \phi \qquad [30] \]

The latter comes from the Gauss–Manin system for the periods $p = p(v; \lambda)$ of the Frobenius manifold written in the canonical coordinates [22].

Moduli of Semisimple Frobenius Manifolds

All $n$-dimensional semisimple Frobenius manifolds form a finite-dimensional space. They depend on $n(n-1)/2$ essential parameters. To parametrize the Frobenius manifolds one can choose, for example, the initial data for the isomonodromy deformation equations [24]. Alternatively, they can be parametrized by monodromy data of the deformed flat connection according to the following construction.

The first part of the monodromy data is the spectrum $(V, \langle\ ,\ \rangle, \hat\mu, R)$ of the Frobenius manifold associated with the Poisson pencil. Here $V$ is an $n$-dimensional linear space equipped with a symmetric nondegenerate bilinear form $\langle\ ,\ \rangle$. Two linear operators on $V$, a semisimple operator $\hat\mu : V \to V$ and a nilpotent operator $R : V \to V$, must satisfy the following properties. First, the operator $\hat\mu$ is antisymmetric:

\[ \hat\mu^* = -\hat\mu \qquad [31] \]

and the operator $R$ satisfies

\[ R^* = -e^{\pi i \hat\mu}\, R\, e^{-\pi i \hat\mu} \qquad [32] \]

Here the adjoint operators are defined with respect to the bilinear form $\langle\ ,\ \rangle$. The last condition to be imposed onto the operator $R$ can be formulated in a simple way by choosing a basis $e_1, \ldots, e_n$ of eigenvectors of the semisimple operator $\hat\mu$,

\[ \hat\mu\, e_\alpha = \mu_\alpha e_\alpha, \qquad \alpha = 1, \ldots, n \]

We require the existence of a decomposition

\[ R = R_0 + R_1 + R_2 + \cdots \qquad [33] \]

where for any integer $k \ge 0$ the linear operator $R_k$ satisfies

\[ R_k e_\alpha \in \mathrm{span}\{ e_\beta \mid \mu_\beta = \mu_\alpha + k \} \qquad \forall \alpha = 1, \ldots, n \qquad [34] \]

In the nonresonant case, such that none of the differences of the eigenvalues of $\hat\mu$ being equal to a positive integer, all the matrices $R_1, R_2, \ldots$ are equal to zero. Observe a useful identity

\[ z^{\hat\mu}\, R\, z^{-\hat\mu} = R_0 + z R_1 + z^2 R_2 + \cdots \qquad [35] \]

More generally, for any operator $A : V \to V$ commuting with $e^{2\pi i \hat\mu}$ a decomposition is defined as

\[ A = \bigoplus_{k \in \mathbb{Z}} [A]_k, \qquad z^{\hat\mu}\, A\, z^{-\hat\mu} = \sum_{k \in \mathbb{Z}} z^k [A]_k \qquad [36] \]

In particular, $[R]_k = R_k$, $k \ge 0$, $[R]_k = 0$, $k < 0$.

One has to also choose an eigenvector $e$ of the operator $\hat\mu$ such that $R_0 e = 0$; denote $-d/2$ the corresponding eigenvalue

\[ e \in V, \qquad \hat\mu\, e = -\frac{d}{2}\, e, \qquad R_0 e = 0 \qquad [37] \]

The second part of the monodromy data is a pair of linear operators

\[ C : V \to \mathbb{C}^n, \qquad S : \mathbb{C}^n \to \mathbb{C}^n \]

The space $\mathbb{C}^n$ is assumed to be equipped with the standard complex Euclidean structure given by the sum of squares. The properties of the operators $S$, $C$ depend on the choice of an unordered set

u0 = (u01 , . . . , u0n ) of n pairwise distinct complex manifolds in terms of the monodromy data (, ˆ R,
numbers and on a choice of a ray ‘þ on an auxiliary S, C).
complex z-plane starting at the origin such that Conversely, to reconstruct the Frobenius manifold
  near a semisimple point with the canonical coordi-
Re z u0i  u0j 6¼ 0; i 6¼ j; z 2 ‘þ ½38 nates u01 , . . . , u0n , one is to solve the following
boundary-value problem. Let
Let us order the complex numbers in such a way that
‘ ¼ ð‘ Þ [ ‘þ
zðu0i u0j Þ
e ! 0; i < j; jzj ! 1; z 2 ‘þ ½39 be the oriented line on the complex z-plane chosen
as in [38]. Here the ray ‘ is the opposite to ‘þ .
The operator S must be upper triangular Denote R =L the right/left half-planes with respect
to ‘. To reconstruct the Frobenius manifold, one is
S ¼ ðSij Þ; Sij ¼ 0; i > j
½40 to find three matrix-valued functions 0 (z; u),
Sii ¼ 1; i ¼ 1; . . . ; n R (z; u), and L (z; u):
The operator C must satisfy 0 ðz; uÞ : V ! Cn
R=L ðz; uÞ : Cn ! Cn
C S C ¼ ei^ eiR ½41
for u close to u0 such that 0 (z; u) is analytic and
Here the adjoint operator C is understood as invertible for z 2 C, R (z; u)=L (z; u) are analytic
follows: and invertible for z 2 R =L resp., and continuous
<;>1
up to the boundary ‘n0 and

C : Cn ! Cn ! V  ! V
R=L ðz; uÞ 1 þ Oð1=zÞ; jzj ! 1; z 2 R =L
The group of diagonal n  n matrices
The boundary values of the functions
D ¼ diagð 1; . . . ; 1Þ 0 (z; u),R (z; u), and L (z; u) must satisfy the
following boundary-value problem (as above
acts on the pairs (S, C) by
U = diag(u1 , . . . , un )):
S 7! DSD; C 7! DC R ðz; uÞ ¼ L ðz; uÞez U Sez U ; z 2 ‘þ ½42
One is to factor out the action of this diagonal R ðz; uÞ ¼ L ðz; uÞez U S ez U ; z 2 ‘ ½43
group. Besides, the operator C is defined up to a left ^ R zU
action of certain group of linear operators depend- 0 ðz; uÞz z ¼ R ðz; uÞe C; z 2 R
^ R zU
½44
ing on the spectrum. 0 ðz; uÞz z ¼ L ðz; uÞe SC; z 2 L
For the generic (i.e., nonresonant) case where
e2 iˆ has simple spectrum, the operator C is defined Here zˆ := eˆ log z , zR := eR log z are considered as
up to left multiplication by any matrix commuting Aut(V)-valued functions on the universal covering
with e2 iˆ . In this situation, the monodromy data of Cn0; the branch cut in the definition of log z is
(,
ˆ R, S, C) are locally uniquely determined by the chosen to be along ‘ .
n(n  1)=2 entries of the matrix S. Therefore, near a The solution of the above boundary-value pro-
generic point, the variety of the monodromy data is blem [42]–[44], if exists, is unique. It can be reduced
a smooth manifold of the dimension n(n  1)=2. At to a certain Riemann–Hilbert problem, that is, to a
nongeneric points, the variety can get additional problem of factorization of an analytic n  n
strata. nondegenerate matrix-valued function on the
The monodromy data S, C are determined at an annulus
arbitrary semisimple point of a Frobenius manifold Gðz; uÞ; r < jzj < R; det Gðz; uÞ 6¼ 0
in terms of the analytic properties of horizontal
sections of the deformed flat connection r̃ [7] in the depending on the parameter u = (u1 , . . . , un ) in a
complex z-plane (the so-called ‘‘Stokes matrix’’ and product
the ‘‘central connection matrix’’ of the operator
Gðz; uÞ ¼ G0 ðz; uÞ1 G1 ðz; uÞ ½45
[27]). Locally, they do not depend on the point of
the semisimple Frobenius manifold (the isomono-
dromicity property). of two matrix-valued functions G0 (z; u) and
We will now describe the reconstruction procedure G1 (z; u) analytic for jzj < R and r < jzj  1 resp.,
giving a parametrization of semisimple Frobenius with nowhere-vanishing determinant.

Existence of a solution to the Riemann–Hilbert The multiplication is introduced by identifying the


problem for a given u = (u1 , . . . , un ), ui 6¼ uj for i 6¼ j, tangent space Ts M with the quotient algebra
means triviality of certain n-dimensional vector
Ts MAn ¼ C½x=ðfs0 ðxÞÞ
bundle over the Riemann sphere with the transition
functions given by G(z; u). Existence of the solution The metric has the form
for u = u0 implies solvability of the Riemann–
@fs ðxÞ=@si @fs ðxÞ=@sj
Hilbert problem for u sufficiently close to u0 . From < @si ; @sj > ¼ ðn þ 1Þ resx¼1 dx
these arguments, it can be deduced that the matrices fs0 ðxÞ
0 (z; u), R=L (z; u) are analytic in (z; u) for u The flat coordinates v = v (s) can be found from
sufficiently close to u0 . Moreover, they can be the expansion of the solution to the equation
analytically continued in u to the universal covering fs (x) = knþ1 ,
of the space of configurations of n distinct points on  
the complex plane: 1 vn vn1 v1  1
x¼k þ 2 þ    þ n þ O nþ2
 n nþ1 k k k k
C n[i6¼j fui ¼ uj g =Sn ½46
The potentials of the Frobenius manifolds MAn for
The resulting functions are meromorphic on the n = 1, 2, 3 read
universal covering, according to the results of
B Malgrange and T Miwa. The structure of the FA1 ¼ 16 v31
global analytic continuation is given (Dubrovin FA2 ¼ 12 v21 v2 þ 72
1 4
v2 ½47
1999) in terms of a certain action of the braid group FA3 ¼ 1 2 1 2 1 2 2 1 5
2 v1 v2 þ 2 v1 v3 þ 16 v2 v3 þ 960 v3

Bn ¼ 1 Cn n[i6¼j fui ¼ uj g =Sn
The space of polynomials MAn can be identified with
on the monodromy data. the orbit space of C=W(An ) of the Weyl group of the
type An . More generally (Dubrovin 1996), the orbit
space MW := Cn =W of an arbitrary irreducible finite
Coxeter group W
O(n) carries a natural structure
Examples of Frobenius Manifolds
of a polynomial semisimple Frobenius manifold.
Example 0 Trivial Frobenius manifold, M = A0 a Conversely, all irreducible polynomial semisimple
graded Frobenius algebra, F(v) = (1=6) < e, v  v  v > Frobenius manifolds with positive degrees of the flat
is a cubic polynomial. coordinates can be obtained by this construction
(Hertling 2002). Generalizations for the orbit spaces
First nontrivial examples appeared in the setting
of certain infinite groups were obtained in Dubrovin
of 2D topological field theories (Dijkgraaf et al.
and Zhang (1998b) and Bertola (2000).
1991, Witten 1991) (see Topological Quantum Field
Theory: Overview). Mathematical formalization of Example 2 Gromov–Witten (GW) invariants (see
these ideas gives rise to the following two classes of Topological Sigma Models). Let X be a smooth
examples. projective variety. We will assume for simplicity that
H odd (X) = 0. To every such variety, one can associ-
Example 1 Frobenius structure on the base of an
ate a bunch of rational numbers. They are expressed
isolated hypersurface singularity. The construction
in terms of intersection theory of certain cycles on
(Hertling 2002, Sabbah 2002) uses the K Saito
the moduli spaces Xg, m,  of stable genus g and
theory of periods of primitive forms. For the
degree  curves on X with m marked points (see
example of An singularity f (x) = xnþ1 the Frobenius
details in Kontsevich and Manin (1994)):
structure on the base of universal unfolding



Xg;m; :¼ f : Cg ; x1 ; . . . ; xm ! X;
MAn ¼ fs ðxÞ ¼ xnþ1 þ s1 xn1 þ   þ sn j s1 ; ...; sn 2 C ½48
f ½Cg  ¼  2 H2 ðX; ZÞ
is constructed as follows (Dijkgraaf et al. 1991):
Denote n := dim H  (X; C). Choosing a basis 1 = 1,
@ 2 , . . . , n we define the numbers

@sn
< p1 ð 1 Þ . . . pm ð m Þ >g;
1 X @ Z
E¼ ðk þ 1Þsk p
nþ1 @sk :¼ ev1 ð 1 Þ ^ c11 ðL1 Þ
n1 ½Xg;m; virt

nþ1 ^    ^ evm ð m Þ ^ cp1m ðLm Þ ½49

for arbitrary non-negative integers p1 , . . . , pm . Here (formal) Frobenius manifold on H  (X) with the
the evaluation maps evi , i = 1, . . . , m, are given by bilinear form  given by the Poincaré pairing
Z
evi : Xg;m; ! X; f 7! f ðxi Þ
 ¼  ^ 
X
The so-called tautological line bundles Li over Xg, m, 
by definition have the fiber Txi Cg , i = 1, . . . , m (see the unity
the article Moduli Spaces: An Introduction regarding
@
the construction of the so-called virtual fundamental e¼
class [Xg, m,  ]virt ). The numbers [49] can be defined @v1
for an arbitrary compact symplectic manifold X and the Euler vector field
where one is to deal with the intersection theory on X
n
@
the moduli spaces of pseudoholomorphic curves E¼ ½ð1  q Þv þ r 
fixing a suitable almost-complex structure on X. ¼1
@v
They depend only on the symplectic structure on X. Here the numbers q , r are defined by the
In particular, the numbers conditions
< 0 ð 1 Þ . . . 0 ð m Þ >g; ½50 X
 2 H 2q ðXÞ; c1 ðXÞ ¼ r 
are called the genus g and degree  GW invariants of 
X. In certain cases, they admit an interpretation in
The resulting Frobenius manifold will be denoted
terms of enumerative geometry of the variety X
MX . The corresponding n-parameter family of
(Kontsevich and Manin 1994). The numbers [49]
n-dimensional algebras on the tangent spaces Tv MX
with some of pi > 0 are called ‘‘gravitational
is also called ‘‘quantum cohomology’’ QH (X). At
descendents.’’
the point vcl 2 MX of classical limit, the algebra
One can form a generating functions of the Tvcl MX coincides with the cohomology ring H  (X).
numbers [49] In all known examples, the series [53] actually
converges in a neighborhood of the point vcl .
X X 1 1 ;p1
FX t . . . tm ; pm Therefore, one obtains a genuine Frobenius structure
g ¼
m 2H2 ðX;ZÞ
m! on a domain MX
H  (X; C)=2iH2 (X; Z). How-
ever, a general proof of convergence is still missing.
< p1 ð 1 Þ . . . pm ð m Þ >g; ½51
In particular, for d = 1, the quantum cohomology
(summation over repeated indices 1  1 , . . . , m  of complex projective line P 1 is a two-dimensional
n will always be assumed). Here t, p are indetermi- Frobenius manifold with the potential, unity, and
nates labeled by pairs (, p) with  = 1, . . . , n, the Euler vector field
p = 0, 1, 2, . . . . (Usually one is to insert in the Fðu; vÞ ¼ 12 uv2 þ eu ;
definition of F X 
g elements q of the Novikov ring
C[H2 (X; Z)]. However, due to the divisor axiom @
e¼ ;
(Kontsevich and Manin 1994) and these insertions @v
can be compensated by a suitable shift in the space @ @
E¼v þ2
of couplings t = (t, p ).) We finally introduce the full @v @u
generating function called total GW potential (it is For d = 2 one has a three-dimensional Frobenius
also called the free energy of the topological sigma manifold QH (P 2 ) with
model with the target space X)
X Fðv1 ; v2 ; v3 Þ¼ 12 v21 v3 þ 12 v1 v22
F X ðt; Þ ¼ 2g2 F X
g ½52 X v3k1 kv2
g 0 þ Nk 3 e
k 1
ð3k  1Þ!
Restricting the genus-zero generating function @ ½54
onto the so-called small phase space e¼
@v1
@ @ @
FX ðvÞ :¼ F X
0 ðt
;0
¼ v ; t;p>0 ¼ 0Þ E ¼ v1 þ3  v3
½53 @v1 @v2 @v3
v ¼ ðv1 ; . . . ; vn Þ
where Nk = number of rational curves on P2 passing
one obtains a solution to the WDVV associativity through 3k  1 generic points. WDVV [5] yields
equations. This solution defines a structure of (Kontsevich and Manin 1994) recursion relations for

the numbers Nk starting from N1 = 1. The closed the needed integrable hierarchy is a new one. It can
analytic formula for the function [54] is still unknown. be associated (Dubrovin and Zhang) with an arbi-
Only for certain very exceptional X the Frobenius trary n-dimensional semisimple Frobenius manifold
manifold MX is semisimple (e.g., for X = Pd ). The M. The equations of the hierarchy have the form
general geometrical reasons of the semisimplicity of h
MX are still to have been understood. wit ¼ Aij ðwÞwjx þ 2 Bij ðwÞwixxx þ Cijk ðwÞwjx wkxx
For the case X = Calabi–Yau manifold, the Fro- i
benius manifold QH  (X) is never semisimple. This þ Dijkl ðwÞwjx wkx wlx þ Oð4 Þ; i ¼ 1;.. .;n ½57
Frobenius structure can be computed in terms of the
The coefficients of 2g are graded homogeneous
mirror symmetry construction (see Mirror Symme-
polynomials in ux , uxx , etc., of the degree 2g þ 1,
try: A Geometric Survey).
deg dm u=dxm ¼ m
The construction of the hierarchy is done in two
Frobenius Manifold and Integrable
steps. First, we construct the leading approximation
Systems
(Dubrovin 1992). The equation of the hierarchy
The identities in the cohomology ring generated by specifying the dependence on t = t, p at  = 0 reads
the cocycles evi (  ) and j := c1 (Lj ) can be recast 
@v
into the form of differential equations for the ;p
¼ @x r
; pþ1 ðvÞ
generating function [52]. The variable x := t1, 0
@t ½58
corresponding to 1 = 1 plays a distinguished role  ¼ 1; . . . ; n; p 0
in these differential equations. According to the idea The functions
, p (v), v 2 M, are the coefficients of
of Witten (1991), the differential equations for the expansion [10] of the deformed flat functions
generating functions can be written as a hierarchy of normalized by
, 0 = v . The solution v = v(x, t) of
systems of n evolutionary PDEs (n = dim H (X)) for interest is determined from the implicit function
the unknown functions equations
X
@ 2 F X ðt; Þ v ¼ xe þ t;p r
;p ðvÞ ½59
w ¼ hh 0 ð  Þ 0 ð 1 Þii ¼ 2 ½55
@t1;0 @t;0 ;p
1, 0
The variable x = t is the spatial variable of the Next, one has to find solution
equations of the hierarchy. The remaining para- X
meters (coupling constants) t, p of the generating F ¼ 2g2 F g ðv; vx ; . . . ; vð3g2Þ Þ ½60
function play the role of the time variables. Witten g 1
suggested to use the two-point correlators
of the following universal loop equation (closely
@ 2 F X ðt; Þ
2 related with the Virasoro conjecture of Eguchi and
h; p ¼ hh pþ1 ð  Þ 0 ð 1 Þii ¼  ½56
@t1;0 @t;p Xiong (1998)):
as the densities of the Hamiltonians of the flows of  
the hierarchy. X @F 1
@xr
Existence of such a hierarchy can be proved for r 0
@v;r EðvÞ  
the case of GW invariants (and their descendents) !
of complex projective spaces Pd (the results of X @F X
r r
þ @xk1 @e p G @xrkþ1 @  p
Givental (2001) along with Dubrovin and Zhang r 1
@v;r k
k¼1
(2005) can be used). For d = 0 one obtains,
according to the celebrated result by Kontsevich 1 1 h i2
¼ trðU  Þ2 þ tr ðU  Þ1 V
conjectured by Witten (see Topological Gravity, 16 4
 
Two-Dimensional), the tau function of the solution 2 X @ 2 F @F @F
to the KdV hierarchy (see Korteweg–de Vries Equation þ þ
2 @v;k @v;l @v;k @v;l
and Other Modulation Equations) specified by the
initial condition,  @xkþ1 @  p G @xlþ1 @  p

uðxÞ j t¼0 ¼ x 2 X @F kþ1


þ @
2 @v;k x
For d = 1 the hierarchy in question is the extended 
@p ðv; Þ @p ðv; Þ
Toda lattice (see details in Dubrovin and Zhang  r r  vx G ½61
(2004); see also Toda Lattices). For all other d 2, @ @

Here $U = U(v)$ is the operator of multiplication by $E(v)$, and $p_\alpha = p_\alpha(v; \lambda)$, $\alpha = 1, \dots, n$, is a system of flat coordinates [16] of the bilinear form [13]. The substitution

\[ v_\alpha \mapsto w_\alpha = v_\alpha + \epsilon^2\, \partial_x \partial_{t^{\alpha,0}} \Delta F(v; v_x, v_{xx}, \dots; \epsilon^2), \qquad \alpha = 1, \dots, n \tag*{[62]} \]

transforms [58] to [57]. The terms of the expansion [60] are not polynomial in the derivatives. For example (Dubrovin and Zhang 1998a),

\[ F_1 = \frac{1}{24} \sum_{i=1}^{n} \log u_i' + \log \frac{\tau_I(u)}{J^{1/24}(u)}, \qquad J(u) = \det\left( \frac{\partial v^\alpha}{\partial u_i} \right) = \pm \prod_{i=1}^{n} \psi_{i1}(u) \tag*{[63]} \]

(the canonical coordinates have been used) where $\tau_I(u)$ is the isomonodromic tau function [29]. The transformation [62] applied to the solution [59] expresses higher-genus GW invariants of a variety $X$ with semisimple quantum cohomology $QH^*(X)$ via the genus-zero invariants. For the particular case of $X = \mathbf{P}^2$, the formula [63] yields (Dubrovin and Zhang 1998a)

\[ \frac{\phi''' - 27}{8\,(27 + 2\phi' - 3\phi'')} = -\frac{1}{8} + \sum_{k \ge 1} k\, N_k^{(1)} \frac{e^{kz}}{(3k)!} \]

Here

\[ \phi(z) = \sum_{k \ge 0} N_k \frac{e^{kz}}{(3k - 1)!} \]

is the generating function of the genus-zero GW invariants of $\mathbf{P}^2$ (see [54]) and $N_k^{(1)}$ is the number of elliptic plane curves of degree $k$ passing through $3k$ generic points.

See also: Bi-Hamiltonian Methods in Soliton Theory; Functional Equations and Integrable Systems; Integrable Systems: Overview; Isomonodromic Deformations; Korteweg–de Vries Equation and Other Modulation Equations; Mirror Symmetry: A Geometric Survey; Moduli Spaces: An Introduction; Painlevé Equations; Riemann–Hilbert Problem; Toda Lattices; Topological Gravity, Two-Dimensional; Topological Quantum Field Theory: Overview; Topological Sigma Models.

Further Reading

Bertola M (2000) Frobenius manifold structure on orbit space of Jacobi groups. I, II. Differential Geometry and Applications 13: 19–41, 213–233.
Dijkgraaf R, Verlinde H, and Verlinde E (1991) Topological strings in d < 1. Nuclear Physics B 352: 59–86.
Dubrovin B (1992) Integrable systems in topological field theory. Nuclear Physics B 379: 627–689.
Dubrovin B (1996) Geometry of 2D topological field theories. In: Francaviglia M and Greco S (eds.) Integrable Systems and Quantum Groups, Montecatini Terme, 1993, Springer Lecture Notes in Math., vol. 1620, pp. 120–348.
Dubrovin B (1998) Geometry and analytic theory of Frobenius manifolds. Proceedings of the International Congress of Mathematicians, vol. II (Berlin, 1998). Doc. Math. 1998, Extra Vol. II, 315–326.
Dubrovin B (1999) Painlevé transcendents in two-dimensional topological field theory. In: Conte R (ed.) The Painlevé Property: 100 Years Later, CRM Ser. Math. Phys., pp. 287–412. New York: Springer.
Dubrovin B (2004) On almost duality for Frobenius manifolds. In: Buchstaber VM and Krichever IM (eds.) Geometry, Topology, and Mathematical Physics, Amer. Math. Soc. Transl. Ser. 2, vol. 212, pp. 75–132. Providence, RI: American Mathematical Society.
Dubrovin B and Zhang Y (1998a) Bi-Hamiltonian hierarchies in 2D topological field theory at one-loop approximation. Communications in Mathematical Physics 198: 311–361.
Dubrovin B and Zhang Y (1998b) Extended affine Weyl groups and Frobenius manifolds. Compositio Math. 111: 167–219.
Dubrovin B and Zhang Y (2001) Normal forms of hierarchies of integrable PDEs, Frobenius manifolds and Gromov–Witten invariants, math/0108160.
Dubrovin B and Zhang Y (2004) Virasoro symmetries of the extended Toda hierarchy. Communications in Mathematical Physics 250: 161–193.
Dubrovin B and Zhang Y (2005) Integrable hierarchies of the topological type (to appear).
Eguchi T and Xiong CS (1998) Quantum cohomology at higher genus: topological recursion relations and Virasoro conditions. Advances in Theoretical and Mathematical Physics 2: 219–229.
Givental A (2001) Gromov–Witten invariants and quantization of quadratic Hamiltonians. Moscow Mathematical Journal 1: 551–568, 645.
Hertling C (2002) Frobenius Manifolds and Moduli Spaces for Singularities, Cambridge Tracts in Mathematics, vol. 151. Cambridge: Cambridge University Press.
Hitchin N (1997) Frobenius manifolds. With Notes by David Calderbank. In: Gauge Theory and Symplectic Geometry (Montreal, PQ, 1995), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., vol. 488, pp. 69–112. Dordrecht: Kluwer Academic Publishers.
Kontsevich M and Manin Yu (1994) Gromov–Witten classes, quantum cohomology and enumerative geometry. Communications in Mathematical Physics 164: 525–562.
Manin Yu (1999) Frobenius Manifolds, Quantum Cohomology, and Moduli Spaces, American Mathematical Society Colloquium Publications, vol. 47. Providence, RI: American Mathematical Society.
Sabbah C (2002) Déformations isomonodromiques et variétés de Frobenius. Savoirs Actuels (Les Ulis). Paris: CNRS Éditions.
Witten E (1991) Two-dimensional gravity and intersection theory on moduli space. Surv. of Differential Geometry 1: 243–310.
448 Weakly Coupled Oscillators

Weakly Coupled Oscillators


E M Izhikevich, The Neurosciences Institute, San Diego, CA, USA
Y Kuramoto, Hokkaido University, Sapporo, Japan

© 2006 Elsevier Ltd. All rights reserved.

Introduction

Practically any physical, chemical, or biological system can exhibit rhythmic oscillatory activity, at least when the conditions are right. Winfree (2001) reviews the ubiquity of oscillations in nature, ranging from autocatalytic chemical reactions to pacemaker cells in the heart, to animal gaits, and to circadian rhythms. When coupled, even weakly, oscillators interact via adjustment of their phases, that is, their timing, often leading to synchronization. In this article, we review the most important concepts needed to study and understand the dynamics of coupled oscillators.

From a mathematical point of view, an oscillator is a dynamical system,

\[ \dot{x} = f(x), \qquad x \in \mathbb{R}^m \tag*{[1]} \]

having a limit-cycle attractor, that is, a periodic orbit $\gamma \subset \mathbb{R}^m$. Its period is the minimal $T > 0$ such that

\[ \gamma(t) = \gamma(t + T) \quad \text{for any } t \]

and its frequency is $\Omega = 2\pi/T$. Let $x(0) = x_0 \in \gamma$ be an arbitrary point on the attractor; then the state of the system, $x(t)$, is uniquely defined by its phase $\vartheta \in S^1$ relative to $x_0$, where $S^1$ is the unit circle. Throughout this article, we assume that the periodic orbit $\gamma$ is exponentially stable, which implies normal hyperbolicity. In this case, there is a continuous transformation $\Theta : U \to S^1$ defined in a neighborhood $U \supset \gamma$ such that $\vartheta(t) = \Theta(x(t))$ for any trajectory in $U$, that is, $\Theta$ maps solutions of [1] to solutions of

\[ \dot{\vartheta} = \Omega \tag*{[2]} \]

Such a transformation removes the amplitude but preserves the phase of oscillation.

Accordingly, there is a continuous transformation that maps solutions of the weakly coupled network of $n$ oscillators,

\[ \dot{x}_i = f_i(x_i) + \varepsilon g_i(x_1, \dots, x_n, \varepsilon), \qquad \varepsilon \ll 1 \tag*{[3]} \]

onto solutions of the phase system

\[ \dot{\vartheta}_i = \Omega_i + \varepsilon h_i(\vartheta_1, \dots, \vartheta_n, \varepsilon), \qquad \vartheta_i \in S^1 \tag*{[4]} \]

which is easier for studying the collective properties of [3].

[Figure 1 image: panels (a), (b), and (c) with axes θ1 and θ2; opposite sides of the square are identified.]
Figure 1 A 2-torus and its representation on the square. (Modified from Hoppensteadt and Izhikevich 1997.)

[Figure 2 image: diagram relating frequency locking, entrainment (1:1 frequency locking), phase locking, synchronization, in-phase, and antiphase.]
Figure 2 Various degrees of locking of oscillators. (Modified from Izhikevich 2006.)

The oscillators are said to be frequency locked when [4] has a stable periodic orbit $\vartheta(t) = (\vartheta_1(t), \dots, \vartheta_n(t))$ on the $n$-torus $T^n$, as in Figure 1a. The "rotation vector" or "winding ratio" of the orbit is the set of integers $q_1 : q_2 : \cdots : q_n$ such that $\vartheta_1$ makes $q_1$ rotations while $\vartheta_2$ makes $q_2$ rotations, etc., as in the 2 : 3 frequency locking in Figure 1a. The oscillators are entrained when they are $1 : 1 : \cdots : 1$ frequency locked. The oscillators are phase locked when there is an $(n - 1) \times n$ integer matrix $K$ having linearly independent rows such that $K\vartheta(t) = \text{const}$. For example, the two oscillators in Figure 1b are phase locked with K = (2, 3), while those in Figure 1c are not. The oscillators are synchronized when they are entrained and phase locked. Synchronization is in-phase when $\vartheta_1(t) = \cdots = \vartheta_n(t)$ and out-of-phase otherwise. Two oscillators are said to be synchronized antiphase when $\vartheta_1(t) - \vartheta_2(t) = \pi$. Frequency locking without phase locking, as in Figure 1c, is called phase trapping. The relationship between all these definitions is depicted in Figure 2.

Phase Resetting

An exponentially stable periodic orbit is a normally hyperbolic invariant manifold, hence its sufficiently small neighborhood, $U$, is invariantly foliated by

stable submanifolds (Guckenheimer 1975) illustrated in Figure 3. The manifolds represent points having equal phases and, for this reason, they are called isochrons (from Greek "iso" meaning equal and "chronos" meaning time).

[Figure 3 image: two phase-plane panels. "Andronov–Hopf oscillator": axes Re z and Im z, isochrons labeled θ = 0, π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4, and the point x0. "van der Pol oscillator": axes x and y, the limit cycle γ, the neighborhood U, and a pulse.]
Figure 3 Isochrons of Andronov–Hopf oscillator ($\dot{z} = (1 + i)z - z|z|^2$, $z \in \mathbb{C}$) and van der Pol oscillator ($\dot{x} = x - x^3 - y$, $\dot{y} = x$).

The geometry of isochrons determines how the oscillators react to perturbations. For example, the pulse in Figure 3, right, moves the trajectory from one isochron to another, thereby changing its phase. The magnitude of the phase shift depends on the amplitude and the exact timing of the stimulus relative to the phase of oscillation $\vartheta$. Stimulating the oscillator at different phases, one can measure the phase transition curve (Winfree 2001)

\[ \vartheta_{\text{new}} = \mathrm{PTC}(\vartheta_{\text{old}}) \]

and the phase resetting curve

\[ \mathrm{PRC}(\vartheta) = \mathrm{PTC}(\vartheta) - \vartheta \qquad (\text{shift} = \text{new phase} - \text{old phase}) \]

Positive (negative) values of the PRC correspond to phase advances (delays). PRCs are convenient when the phase shifts are small, so that they can be magnified and clearly seen, as in Figure 4. PTCs are convenient when the phase shifts are large and comparable with the period of oscillation.

[Figure 4 image: panels "Andronov–Hopf oscillator" and "van der Pol oscillator"; curves PRC1 and PRC2 (vertical range −0.2 to 0.2) plotted against stimulus phase θ from 0 to 2π, with Re z(t) and x(t) shown as dotted curves.]
Figure 4 Examples of phase response curves (PRCs) of the oscillators in Figure 3. $\mathrm{PRC}_1(\vartheta)$ and $\mathrm{PRC}_2(\vartheta)$ correspond to horizontal (along the first variable) and vertical (along the second variable) pulses with amplitudes 0.2. An example of oscillation is plotted as a dotted curve in each subplot (not to scale).

In Figure 5 we depict phase portraits of the Andronov–Hopf oscillator receiving pulses of magnitude 0.5 (left) and 1.5 (right). Notice the drastic difference between the corresponding PRCs or PTCs. Winfree (2001) distinguishes two cases:

1. type 1 (weak) resetting results in continuous PRCs and PTCs with mean slope 1, and
2. type 0 (strong) resetting results in discontinuous PRCs and PTCs with mean slope 0.

[Figure 5 image: columns "Type 1 (weak) resetting" and "Type 0 (strong) resetting". Top row: phase portraits with isochrons θ = 0 (2π), π/2, π, 3π/2. Middle row: phase resetting PRC(θ) against stimulus phase θ from 0 to 2π (vertical range −1 to 1 on the left, −π to π on the right). Bottom row: phase transition PTC(θ) = {θ + PRC(θ)} mod 2π against stimulus phase θ.]
Figure 5 Types of phase resetting of the Andronov–Hopf oscillator in Figure 3.
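The two resetting types can be reproduced with a few lines of Python. The sketch below rests on assumptions that are not stated in the article: the pulse is taken to be an instantaneous horizontal displacement of a state sitting on the unit limit cycle of the Andronov–Hopf oscillator of Figure 3, and the phase is read off as $\arg z$, which is legitimate for $\dot{z} = (1 + i)z - z|z|^2$ because the angular velocity does not depend on $|z|$, so the isochrons are rays. The function names are illustrative only.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi

def ptc(theta: float, amplitude: float) -> float:
    """Phase transition curve: new phase after a horizontal pulse.

    Assumption: the state is exp(i*theta) on the unit limit cycle and the
    pulse adds `amplitude` to Re z instantaneously. Isochrons are rays,
    so the new phase is the argument of the displaced point.
    """
    z_new = cmath.exp(1j * theta) + amplitude
    return cmath.phase(z_new) % TWO_PI

def prc(theta: float, amplitude: float) -> float:
    """Phase resetting curve PTC(theta) - theta, wrapped to [-pi, pi)."""
    shift = ptc(theta, amplitude) - theta
    return (shift + math.pi) % TWO_PI - math.pi

def mean_slope(amplitude: float, samples: int = 2000) -> int:
    """Net number of turns PTC(theta) makes while theta makes one turn."""
    total = 0.0
    prev = ptc(0.0, amplitude)
    for k in range(1, samples + 1):
        cur = ptc(TWO_PI * k / samples, amplitude)
        # accumulate the increment wrapped to [-pi, pi) to unwrap the curve
        total += (cur - prev + math.pi) % TWO_PI - math.pi
        prev = cur
    return round(total / TWO_PI)

if __name__ == "__main__":
    for a in (0.5, 1.5):
        print(f"pulse {a}: mean slope of PTC = {mean_slope(a)}")
```

Under these assumptions the pulse of magnitude 0.5 keeps the displaced circle around the origin (mean slope 1), while the pulse of magnitude 1.5 pushes it past the origin (mean slope 0), which is the geometric picture behind the type 1 and type 0 cases.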

The discontinuity of type 0 PRC in Figure 5 is a topological property that cannot be removed by reallocating the initial point $x_0$ that corresponds to zero phase. The discontinuity stems from the fact that the shifted image of the limit cycle (dashed circle) goes beyond the central equilibrium at which the phase is not defined.

The stroboscopic mapping of $S^1$ to itself, called Poincaré phase map,

\[ \vartheta_{k+1} = \mathrm{PTC}(\vartheta_k) \tag*{[5]} \]

describes the response of an oscillator to a $T$-periodic pulse train. Here, $\vartheta_k$ denotes the phase of oscillation when the $k$th input pulse arrives. Its fixed points correspond to synchronized solutions, and its periodic orbits correspond to phase-locked states.

Weak Coupling

Now consider dynamical systems of the form

\[ \dot{x} = f(x) + \varepsilon s(t) \tag*{[6]} \]

describing periodic oscillators, $\dot{x} = f(x)$, forced by a weak time-dependent input $\varepsilon s(t)$, for example, from other oscillators in a network. Let $\Theta(x)$ denote the phase of oscillation at point $x \in U$, so that the map $\Theta : U \to S^1$ is constant along each isochron. This mapping transforms [6] into the phase model

\[ \dot{\vartheta} = \Omega + \varepsilon Q(\vartheta) \cdot s(t) \]

with function $Q(\vartheta)$, illustrated in Figure 6, satisfying three equivalent conditions:

1. Winfree: $Q(\vartheta)$ is normalized PRC to infinitesimal pulsed perturbations;
2. Kuramoto: $Q(\vartheta) = \operatorname{grad} \Theta(x)$; and
3. Malkin: $Q$ is the solution to the adjoint problem

\[ \dot{Q} = -\{Df(\gamma(t))\}^{\top} Q \tag*{[7]} \]

with the normalization $Q(t) \cdot f(\gamma(t)) = \Omega$ for any $t$.

[Figure 6 image: panels "Andronov–Hopf oscillator" and "van der Pol oscillator". Top row: the curve Q(θ) in the (Q1, Q2) plane. Bottom row: Q1(θ) and Q2(θ) against phase θ from 0 to 2π.]
Figure 6 Solutions $Q = (Q_1, Q_2)$ to the adjoint problem [7] for oscillators in Figure 3.

The function $Q(\vartheta)$ can be found analytically in a few simple cases:

1. a nonlinear phase oscillator $\dot{x} = f(x)$ with $x \in S^1$ and $f > 0$ has $Q(\vartheta) = \Omega / f(\gamma(\vartheta))$;
2. a system near saddle-node on invariant circle bifurcation has $Q(\vartheta)$ proportional to $1 - \cos \vartheta$; and
3. a system near supercritical Andronov–Hopf bifurcation has $Q(\vartheta)$ proportional to $\sin(\vartheta - \psi)$, where $\psi \in S^1$ is a constant phase shift.

Other interesting cases, including homoclinic, relaxation, and bursting oscillators, are considered by Izhikevich (2006).

Treating $s(t)$ in [6] as the input from the network, we can transform weakly coupled oscillators

\[ \dot{x}_i = f_i(x_i) + \varepsilon \underbrace{\sum_{j=1}^{n} g_{ij}(x_i, x_j)}_{s_i(t)}, \qquad x_i \in \mathbb{R}^m \tag*{[8]} \]

to the phase model

\[ \dot{\vartheta}_i = \Omega_i + \varepsilon\, Q_i(\vartheta_i) \cdot \underbrace{\sum_{j=1}^{n} g_{ij}(x_i(\vartheta_i), x_j(\vartheta_j))}_{s_i(t)} \tag*{[9]} \]

having the form [4] with $h_i = Q_i \cdot \sum_j g_{ij}$, or the form

\[ \dot{\vartheta}_i = \Omega_i + \varepsilon \sum_{j=1}^{n} h_{ij}(\vartheta_i, \vartheta_j) \]

where $h_{ij} = Q_i \cdot g_{ij}$. Introducing phase deviation variables $\vartheta_i = \Omega_i t + \varphi_i$, we transform this system into the form

\[ \dot{\varphi}_i = \varepsilon \sum_{j=1}^{n} h_{ij}(\Omega_i t + \varphi_i, \Omega_j t + \varphi_j) \]

which can be averaged to

\[ \dot{\varphi}_i = \varepsilon \sum_{j=1}^{n} H_{ij}(\varphi_i - \varphi_j) \tag*{[10]} \]

with the functions

\[ H_{ij}(\chi) = \lim_{T \to \infty} \frac{1}{T} \int_0^T h_{ij}(\Omega_i t, \Omega_j t - \chi)\, dt \tag*{[11]} \]
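The averaging in [11] can be checked numerically for one simple case. The sketch below is an illustration under assumptions, not the authors' computation: it takes two identical Andronov–Hopf oscillators from Figure 3 (unit limit cycle, frequency 1), assumes $Q(\vartheta) = (-\sin\vartheta, \cos\vartheta)$, which is the gradient of $\arg z$ on the unit circle and satisfies the normalization $Q \cdot f = \Omega$, and uses the gap-junction input $g(x_i, x_j) = (x_{j1} - x_{i1}, 0)$ quoted in the caption of Figure 7. For equal frequencies the integrand is periodic, so an average over one period replaces the limit $T \to \infty$. All function names are hypothetical.

```python
import math

OMEGA = 1.0  # frequency of dz/dt = (1+i)z - z|z|^2, whose limit cycle is |z| = 1

def gamma(theta):
    """Point of the unit limit cycle at phase theta."""
    return (math.cos(theta), math.sin(theta))

def Q(theta):
    """Assumed phase sensitivity: gradient of arg z on the unit circle."""
    return (-math.sin(theta), math.cos(theta))

def g(x_i, x_j):
    """Gap-junction input acting on the first variable only."""
    return (x_j[0] - x_i[0], 0.0)

def h(theta_i, theta_j):
    """h_ij = Q_i . g_ij evaluated on the limit cycles."""
    q = Q(theta_i)
    inp = g(gamma(theta_i), gamma(theta_j))
    return q[0] * inp[0] + q[1] * inp[1]

def H(chi, samples=720):
    """Average of h(Omega*t, Omega*t - chi) over one period, as in [11]."""
    period = 2.0 * math.pi / OMEGA
    dt = period / samples
    total = sum(h(OMEGA * k * dt, OMEGA * k * dt - chi) for k in range(samples))
    return total / samples

if __name__ == "__main__":
    for chi in (0.0, math.pi / 2, math.pi):
        print(f"H({chi:.3f}) = {H(chi):+.6f}")
```

With these assumptions the integral can also be done by hand, $H_{ij}(\chi) = -\tfrac{1}{2}\sin\chi$, and the numerical average agrees with it to rounding error.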

The functions $H_{ij}$ in [11] describe the interaction between oscillators (Ermentrout and Kopell 1984). To summarize, we transformed weakly coupled system [8] into the phase model [10] with $H$ given by [11] and each $Q$ being the solution to the adjoint problem [7]. This constitutes the Malkin theorem for weakly coupled oscillators (Hoppensteadt and Izhikevich 1997, theorem 9.2).

Existence of one equilibrium of the phase model [10] implies the existence of the entire circular family of equilibria, since translation of all $\varphi_i$ by a constant phase shift does not change the phase differences $\varphi_i - \varphi_j$ and hence the form of [10]. This family corresponds to a limit cycle of [8], on which all oscillators have equal frequencies and constant phase shifts, that is, they are synchronized, possibly out of phase.

We say that two oscillators, $i$ and $j$, have resonant (or commensurable) frequencies when the ratio $\Omega_i / \Omega_j$ is a rational number, for example, it is $p/q$ for some integers $p$ and $q$. They are nonresonant when the ratio is an irrational number. In this case, the function $H_{ij}$ defined above is constant regardless of the details of the oscillatory dynamics or the details of the coupling, that is, the dynamics of two coupled nonresonant oscillators is described by an uncoupled phase model. Apparently, such oscillators do not interact; that is, the phase of one of them cannot change the phase of the other one even on the long timescale of order $1/\varepsilon$.

[Figure 7 image: panels "Andronov–Hopf oscillator" and "van der Pol oscillator"; curves H(χ) and Hij(χ) against phase difference χ from 0 to 2π, with vertical ticks at −0.5, 0, and 0.5 in the left panel.]
Figure 7 Solid curves: functions $H_{ij}(\chi)$ defined by [11] corresponding to the gap-junction input $g(x_i, x_j) = (x_{j1} - x_{i1}, 0)$. Dashed curves: functions $H(\chi) = H_{ji}(\chi) - H_{ij}(-\chi)$. Parameters are as in Figure 3.

Synchronization

Consider [8] with $n = 2$, describing two mutually coupled oscillators. Let us introduce "slow" time $\tau = \varepsilon t$ and rewrite the corresponding phase model [10] in the form

\[ \varphi_1' = \omega_1 + H_{12}(\varphi_1 - \varphi_2) \]
\[ \varphi_2' = \omega_2 + H_{21}(\varphi_2 - \varphi_1) \]

where $' = d/d\tau$ and $\omega_i = H_{ii}(0)$ is the frequency deviation from the natural oscillation, $i = 1, 2$. Let $\chi = \varphi_2 - \varphi_1$ denote the phase difference between the oscillators; then

\[ \chi' = \omega + H(\chi) \tag*{[12]} \]

where

\[ \omega = \omega_2 - \omega_1 \quad \text{and} \quad H(\chi) = H_{21}(\chi) - H_{12}(-\chi) \]

All equilibria of [12] are solutions to $H(\chi) = -\omega$, and they are intersections of the horizontal line $-\omega$ with the graph of $H$. They are stable if the slope of the graph is negative at the intersection. If oscillators are identical, then $H(\chi)$ is an odd function (i.e., $H(-\chi) = -H(\chi)$), and $\chi = 0$ and $\chi = \pi$ are always equilibria, possibly unstable, corresponding to the in-phase and antiphase synchronized solutions. The in-phase synchronization of gap-junction coupled oscillators in Figure 7 is stable because the slope of $H$ (dashed curves) is negative at $\chi = 0$. The max and min values of the function $H$ determine the tolerance of the network to the frequency mismatch $\omega$, since there are no equilibria outside this range.

Now consider a network of $n > 2$ weakly coupled oscillators [8]. To determine the existence and stability of synchronized states in the network, we need to study equilibria of the corresponding phase model [10]. The vector $\varphi = (\varphi_1, \dots, \varphi_n)$ is an equilibrium of [10] when

\[ 0 = \omega_i + \sum_{j \neq i}^{n} H_{ij}(\varphi_i - \varphi_j) \qquad (\text{for all } i) \tag*{[13]} \]

It is stable when all eigenvalues of the linearization matrix (Jacobian) at $\varphi$ have negative real parts, except one zero eigenvalue corresponding to the eigenvector along the circular family of equilibria ($\varphi$ plus a phase shift is a solution of [13] too since the phase shifts $\varphi_j - \varphi_i$ are not affected).

In general, determining the stability of equilibria is a difficult problem. Ermentrout (1992) found a simple sufficient condition. If

1. $a_{ij} = H_{ij}'(\varphi_i - \varphi_j) \le 0$, and
2. the directed graph defined by the matrix $a = (a_{ij})$ is connected, (i.e., each oscillator is influenced,
is the frequency mismatch and the antisymmetric
possibly indirectly, by every other oscillator),
part of the coupling, respectively, illustrated in
Figure 7, dashed curves. A stable equilibrium of then the equilibrium  is neutrally stable, and the
[12] corresponds to a stable limit cycle of the phase corresponding limit cycle x(t þ ) of [8] is asympto-
model. tically stable.
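Returning to the two-oscillator equation [12], the graphical criterion (equilibria where $H(\chi) = -\omega$, stable where the slope of $H$ is negative) can be sketched numerically. The snippet below is only an illustration under stated assumptions: the function name `phase_locked_states` and the example coupling $H(\chi) = -2\sin\chi$ are hypothetical choices, not taken from the article (this $H$ is what one obtains from $H(\chi) = H_{21}(\chi) - H_{12}(-\chi)$ if both connection functions are assumed to be $-\sin\chi$).

```python
import numpy as np

def phase_locked_states(H, omega, n_grid=2000, tol=1e-12):
    """Find equilibria of chi' = omega + H(chi) on [0, 2*pi) and classify them.

    Returns a list of (chi, is_stable) pairs; an equilibrium is called stable
    when the slope of the right-hand side is negative there.
    """
    f = lambda chi: omega + H(chi)
    grid = np.linspace(0.0, 2.0 * np.pi, n_grid + 1)
    vals = f(grid)
    states = []
    for a, b, fa, fb in zip(grid[:-1], grid[1:], vals[:-1], vals[1:]):
        if fa == 0.0:
            root = a                      # equilibrium exactly on a grid point
        elif fa * fb < 0.0:
            lo, hi, flo = a, b, fa        # bracketed sign change: bisect
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                fm = f(mid)
                if flo * fm <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fm
            root = 0.5 * (lo + hi)
        else:
            continue
        slope = (f(root + 1e-6) - f(root - 1e-6)) / 2e-6   # central difference
        states.append((float(root), bool(slope < 0.0)))
    return states

# Assumed example coupling (hypothetical): H(chi) = -2 sin(chi).
H_example = lambda chi: -2.0 * np.sin(chi)
for chi, stable in phase_locked_states(H_example, omega=1.0):
    print(f"chi = {chi:.4f}  stable = {stable}")
```

With this assumed $H$, a mismatch $\omega = 1$ lies inside the range of $-H$, so two phase-locked states exist and one of them is stable; a mismatch larger than $\max |H| = 2$ returns an empty list, which is the loss of synchronization described above.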
Another sufficient condition was found by Hoppensteadt and Izhikevich (1997). If system [10] satisfies

1. $\omega_1 = \cdots = \omega_n = \omega$ (identical frequencies)
2. $H_{ij}(-\chi) = -H_{ji}(\chi)$ (pairwise odd coupling)

for all $i$ and $j$, then the network dynamics converge to a limit cycle. On the cycle, all oscillators have equal frequencies $1 + \varepsilon\omega$ and constant phase deviations.

The proof follows from the observation that [10] is a gradient system in the rotating coordinates $\varphi = \omega\tau + \phi$ with the energy function

\[ E(\phi) = -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} R_{ij}(\phi_i - \phi_j) \]

where

\[ R_{ij}(\chi) = \int_0^{\chi} H_{ij}(s)\, ds \]

One can check that $dE(\phi)/d\tau = -\sum_i (\phi_i')^2 \le 0$ along the trajectories of [10], with equality only at equilibria.

Mean-Field Approximations

Let us represent the phase model [10] in the form

\[ \varphi_i' = \omega_i + \sum_{j \neq i}^{n} H_{ij}(\varphi_i - \varphi_j) \]

where $' = d/d\tau$, $\tau = \varepsilon t$ is the slow time, and $\omega_i = H_{ii}(0)$ are random frequency deviations. Collective dynamics of this system can be analyzed in the limit $n \to \infty$. We illustrate the theory using the special case $H(\chi) = -\sin\chi$, known as the Kuramoto (1984) model:

\[ \varphi_i' = \omega_i + \frac{K}{n} \sum_{j=1}^{n} \sin(\varphi_j - \varphi_i), \qquad \varphi_i \in [0, 2\pi] \qquad [14] \]

where $K > 0$ is the coupling strength and the factor $1/n$ ensures that the model behaves well as $n \to \infty$. The complex-valued sum of all phases,

\[ r e^{i\psi} = \frac{1}{n} \sum_{j=1}^{n} e^{i\varphi_j} \quad (\text{Kuramoto synchronization index}) \qquad [15] \]

describes the degree of synchronization in the network. Apparently, the in-phase synchronized state $\varphi_1 = \cdots = \varphi_n$ corresponds to $r = 1$ with $\psi$ being the population phase. In contrast, the incoherent state with all $\varphi_i$ having different values randomly distributed on the unit circle corresponds to $r \approx 0$. Intermediate values of $r$ correspond to a partially synchronized or coherent state, depicted in Figure 8. Some phases are synchronized forming a cluster, while others roam around the circle.

Figure 8 Kuramoto synchronization index [15] describes the degree of coherence in the network [14]. (Diagram: phases $\varphi_i$, $\varphi_j$ on the unit circle and the vector $r e^{i\psi}$.)

Multiplying both sides of [15] by $e^{-i\varphi_i}$ and considering only the imaginary parts, we can rewrite [14] in the equivalent form

\[ \varphi_i' = \omega_i + K r \sin(\psi - \varphi_i) \]

that emphasizes the mean-field character of interactions between the oscillators: they are all pulled into the synchronized cluster ($\varphi_i \to \psi$) with the effective strength proportional to the cluster size $r$. This pull is offset by the random frequency deviations $\omega_i$ that pull away from the cluster.

Let us assume that the $\omega_i$ are distributed randomly around 0 with a symmetrical probability density function $g(\omega)$, for example, Gaussian. Kuramoto has shown that in the limit $n \to \infty$, the cluster size $r$ obeys the self-consistency equation

\[ r = rK \int_{-\pi/2}^{+\pi/2} g(rK \sin\varphi) \cos^2\varphi \, d\varphi \qquad [16] \]

Notice that $r = 0$, corresponding to the incoherent state, is always a solution of this equation. When the coupling strength $K$ is greater than a certain critical value,

\[ K_c = \frac{2}{\pi g(0)} \]

an additional, nontrivial solution $r > 0$ appears, which corresponds to a partially synchronized state. Expanding $g$ in a Taylor series, one gets the scaling $r = \sqrt{16 (K - K_c) / (-g''(0)\, \pi K_c^4)}$. Thus, the stronger the coupling $K$ relative to the random distribution of frequencies, the more oscillators synchronize into a coherent cluster. The issue of stability of incoherent and partially synchronized states is discussed by Strogatz (2000). Other generalizations of the Kuramoto model are reviewed by Acebron et al. (2005). An extended version of this article with the
emphasis on computational neuroscience can be found in the recent book by Izhikevich (2006).

See also: Bifurcations of Periodic Orbits; Dynamical Systems and Thermodynamics; Hamiltonian Systems: Stability and Instability Theory; Singularity and Bifurcation Theory; Stability Theory and KAM; Synchronization of Chaos.

Further Reading

Acebron JA, Bonilla LL, Vincente CJP, Ritort F, and Spigler R (2005) The Kuramoto model: a simple paradigm for synchronization phenomena. Reviews of Modern Physics 77: 137–186.
Ermentrout GB (1992) Stable periodic solutions to discrete and continuum arrays of weakly coupled nonlinear oscillators. SIAM Journal on Applied Mathematics 52: 1665–1687.
Ermentrout GB and Kopell N (1984) Frequency plateaus in a chain of weakly coupled oscillators, I. SIAM Journal on Applied Mathematics 15: 215–237.
Glass L and Mackey MC (1988) From Clocks to Chaos. Princeton: Princeton University Press.
Guckenheimer J (1975) Isochrons and phaseless sets. Journal of Mathematical Biology 1: 259–273.
Hoppensteadt FC and Izhikevich EM (1997) Weakly Connected Neural Networks. New York: Springer.
Izhikevich EM (1999) Weakly connected quasiperiodic oscillators, FM interactions, and multiplexing in the brain. SIAM Journal on Applied Mathematics 59: 2193–2223.
Izhikevich EM (2006) Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting. Cambridge, MA: The MIT Press.
Kuramoto Y (1984) Chemical Oscillations, Waves, and Turbulence. New York: Springer.
Pikovsky A, Rosenblum M, and Kurths J (2001) Synchronization: A Universal Concept in Nonlinear Science. Cambridge: Cambridge University Press.
Strogatz SH (2000) From Kuramoto to Crawford: exploring the onset of synchronization in populations of coupled oscillators. Physica D 143: 1–20.
Winfree A (2001) The Geometry of Biological Time, 2nd edn. New York: Springer.
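As a numerical companion to the mean-field discussion of the Kuramoto model [14], the sketch below integrates the equivalent mean-field form $\varphi_i' = \omega_i + Kr\sin(\psi - \varphi_i)$ and evaluates the synchronization index [15]. It is a minimal illustration, not the article's own procedure: the function names, the forward-Euler scheme, the step size, and the choice of a unit-variance Gaussian $g(\omega)$ are all assumptions made here.

```python
import numpy as np

def order_parameter(phases):
    """Return (r, psi) with r*exp(i*psi) = mean of exp(i*phase), as in [15]."""
    z = np.mean(np.exp(1j * np.asarray(phases, dtype=float)))
    return float(np.abs(z)), float(np.angle(z))

def simulate_kuramoto(omega, K, phases0, dt=0.01, steps=5000):
    """Forward-Euler integration (assumed scheme) of the mean-field form of [14]."""
    omega = np.asarray(omega, dtype=float)
    phases = np.array(phases0, dtype=float)
    for _ in range(steps):
        r, psi = order_parameter(phases)
        phases += dt * (omega + K * r * np.sin(psi - phases))
    return phases

def critical_coupling(g0):
    """Critical coupling K_c = 2 / (pi * g(0)) quoted in the text."""
    return 2.0 / (np.pi * g0)

rng = np.random.default_rng(0)
n = 500
omega = rng.normal(0.0, 1.0, n)            # assumed Gaussian g with unit variance
phases0 = rng.uniform(0.0, 2.0 * np.pi, n)
Kc = critical_coupling(1.0 / np.sqrt(2.0 * np.pi))
for K in (0.5 * Kc, 2.0 * Kc):
    r, _ = order_parameter(simulate_kuramoto(omega, K, phases0))
    print(f"K = {K:.3f}  (K_c = {Kc:.3f})  r = {r:.3f}")
```

For a finite network the index $r$ fluctuates, so the printed values should be read qualitatively: well below $K_c$ one expects $r$ of order $1/\sqrt{n}$, and well above $K_c$ a clearly nonzero $r$.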

Wheeler–De Witt Theory


J Maharana, Institute of Physics, Bhubaneswar, India
© 2006 Elsevier Ltd. All rights reserved.

Introduction

It is recognized that one of the outstanding problems in modern physics is to formulate the quantum theory of gravity, synthesizing the principles of quantum mechanics and the general theory of relativity. The fundamental units for measuring time, length, and energy, known as Planck time, Planck length, and Planck energy, respectively, are defined to be $t_{\rm Pl} = (\hbar G/c^5)^{1/2} = 5.39 \times 10^{-44}$ s, $l_{\rm Pl} = (\hbar G/c^3)^{1/2} = 1.61 \times 10^{-33}$ cm, and $m_{\rm Pl} = (\hbar c/G)^{1/2} = 2.17 \times 10^{-5}$ g, in terms of Newton's constant, $G$, the velocity of light, $c$, and $\hbar = h/2\pi$, $h$ being Planck's constant. We may conclude, on dimensional arguments, that quantum gravity effects will play an important role when we consider physical phenomena in the vicinity of these scales. Therefore, when we probe very short distances, consider collisions at Planckian energies, and envisage evolution of the universe in the Planck era, quantum gravity will come into play in a predominant manner. The purpose of this article is to present an overview of an approach to quantize Einstein's theory of gravity, pioneered by Wheeler and De Witt almost four decades ago. We proceed to recapitulate various prescriptions for quantizing gravitation and then discuss a simple derivation of the Wheeler–De Witt (WDW) equation in general relativity and some of its applications in the study of quantum cosmology. There are, broadly speaking, three different approaches to quantize gravity.

The general theory of relativity has been tested to a great degree of accuracy in the classical regime. The geometrical description of spacetime plays a cardinal role in Einstein's theory. Therefore, the general relativists emphasize the geometrical attributes of the theory and the central role played by the spacetime structure in their formulation of quantum theory. It is natural to adopt a background-independent approach. In contrast, in the path followed by quantum field theorists, where the prescription is valid in the weak-field approximation, the theory is quantized in a given background, usually the Minkowskian space. It is argued by the proponents of the geometric approach that the background metric should emerge from the theory in a self-consistent manner rather than being introduced by hand when we quantize the theory. One of the earliest attempts to quantize gravity was to follow the route of the canonical method. The canonical quantization approach has many advantages. One of the important features is that it is quite similar to the prescriptions adopted in quantum field theory, where one uses the notion of operators, commutation relations, etc. Moreover, the subtleties encountered in quantizing gravity are transparent. Therefore, the canonical procedure is preferred over the path-integral formulation, although the latter has its own advantages too. Another positive aspect of the canonical approach is that the requirement of background-independent
formulation could be maintained to some extent. Thus, there is room for exploring some of the nonperturbative attributes of the theory. The relativists favor the canonical formulation, since some of the geometrical features of the general theory of relativity could be incorporated here and be explored to see how far the quantum theory captures such properties of the classical theory. As we shall discuss in the sequel, some of the interesting issues of quantum cosmology are addressed in this approach. However, there are limitations and shortcomings in this formulation and we refer the reader to the textbooks and review articles for further reading and critical assessments of the canonical approach to quantize gravity.

The second approach is primarily the endeavor of physicists who have devoted their research to quantum field theory. Feynman's seminal work on quantization of gravity from this perspective has profoundly influenced the subsequent developments. The quantization of gravity is carried out in the weak-field approximation such that the graviton is identified as the fluctuation over the Minkowski background metric. It is a massless spin-2 field, as one concludes from the properties of the low-energy gravitational interaction in the classical limit. Furthermore, the gauge invariance associated with a spin-2 massless field gets intimately related with the invariance of Einstein's theory under general coordinate transformations. In this setup, the field-theoretic techniques could be employed to quantize the theory and to consider perturbative expansions for the scattering amplitudes. It is realized that low-energy amplitudes computed from the massless spin-2 theory match with those derived from the Einstein–Hilbert action in the weak-field approximation. Furthermore, the theory is not perturbatively renormalizable since the coupling constant carries dimension. One of the most important outcomes of the investigations from this perspective is the discovery, due to Feynman, that the introduction of ghost fields is necessary in order to maintain unitarity of the S-matrix when one goes beyond the tree level. As is well known, this work has profoundly influenced frontiers of research in physics, leading to the quantization of Yang–Mills theory which, in turn, paved the way for electroweak theory and QCD. It is worthwhile to mention in passing that the quantum phenomena associated with gravity in the nonperturbative regime cannot be addressed in this framework.

In recent years, superstring theory has been at the center stage in order to provide a unified theory of fundamental interactions. It is postulated that all elementary constituents of matter and the carriers of the interactions such as gauge bosons and the graviton are excitations of one-dimensional extended objects: the strings. The superstring theories are perturbatively consistent in the critical ten dimensions. The closed-superstring spectrum contains a spin-2 massless state which is identified to be the graviton. It is well known that perturbative computations of processes involving the graviton turn out to be finite. Moreover, the Einstein–Hilbert term appears naturally when one derives the string effective action. Therefore, it is expected that string theory will be able to provide answers to questions related to quantum gravity. Indeed, the theory has met with success in resolving some important issues. We note that cosmological scenarios have been discussed in the string theory framework and the WDW equation has played an important role in the study of quantum string cosmology. We shall comment on this aspect towards the end of this article.

The Canonical Structure of Einstein Gravity

The Einstein–Hilbert action is

\[ S = \frac{1}{16\pi G} \int_M \sqrt{-g}\, d^4x\, (R - 2\Lambda) \qquad [1] \]

where $R$ is the Ricci scalar derived from the metric, $g_{\mu\nu}$, and $\Lambda$ is the cosmological constant. The field equations are derived from the action by the standard variational technique. Note that $R$ involves second derivatives of the metric. If we have compact manifolds with boundary $\partial M$ such that variations of the metric vanish on the boundary and the normal derivatives do not, it is necessary to add a surface term to this action. The exact form of this term will be discussed later. Einstein's theory of gravitation is manifestly covariant. The associated action [1] is invariant under general coordinate transformations: under $x^{\mu} \to x'^{\mu}(x)$,

\[ g'^{\mu\nu}(x') = \frac{\partial x'^{\mu}}{\partial x^{\rho}} \frac{\partial x'^{\nu}}{\partial x^{\lambda}}\, g^{\rho\lambda}(x) \qquad [2] \]

Therefore, we expect that the theory will be endowed with constraints expressed in terms of the canonical variables. One can implement general coordinate transformations so that there are only two pairs of canonical phase-space variables on a spacelike hypersurface. In other words, from physical considerations, the graviton has only two polarizations whereas the metric has ten components. Therefore, the two physical degrees of freedom can be obtained using the freedom of choosing the "gauge" transformations in this context. It is desirable to identify the constraints and analyze their structure, most appropriately in Dirac's
formalism, and to quantize the theory canonically as the next step. This is the path we intend to follow in order to arrive at the WDW equation.

The Classical Constraints

The Hamiltonian approach is most appropriate to employ the constraint formalism due to Dirac. We recall that the Lagrangian formulation is manifestly covariant, as is reflected in the field equations, whereas the spacetime covariance is lost in the passage to the Hamiltonian approach. Furthermore, the spatial components of the metric are the dynamical degrees of freedom. We adopt the formalism introduced by Arnowitt, Deser, and Misner (ADM) for the so-called 3 + 1 split of the hyperbolic Riemannian spacetime metric, $g_{\mu\nu}$. One introduces the lapse function, $N^{\perp}$, and the shift function, $N^i$. We suppress the factors of $1/16\pi G$, etc., for the time being for the general discussions and shall reintroduce them later. The family of spacelike hypersurfaces, $\Sigma_t$, is constructed, with metric $h_{ij}$ induced on it. Here $t$ is a timelike parameter that parametrizes $\Sigma_t$. The distance between points on two neighboring hypersurfaces, $\Sigma_t$ and $\Sigma_{t+dt}$, with coordinates $(t, x^i)$ and $(t + dt, x^i + dx^i)$, respectively, is given by

\[ ds^2 = -(N^{\perp})^2 dt^2 + h_{ij}(N^i dt + dx^i)(N^j dt + dx^j) = g_{\mu\nu}\, dx^{\mu} dx^{\nu} \qquad [3] \]

The indices of tensors defined on $\Sigma_t$ are raised and lowered by $h_{ij}$ and its inverse $h^{ij}$. The relations between the components of $g_{\mu\nu}$ and $N^{\perp}$, $N^i$, $h_{ij}$ can be obtained easily,

\[ g_{00} = h_{ij} N^i N^j - (N^{\perp})^2, \qquad g_{0i} = h_{ij} N^j \qquad [4] \]

The above relations can be inverted to give

\[ N^i = h^{ij} g_{0j}, \qquad N^{\perp} = \frac{1}{\sqrt{-g^{00}}} \qquad [5] \]

The relations between the spatial components, $g^{ij}$, of $g^{\mu\nu}$ and $h^{ij}$ and some other useful relations are listed below for later convenience:

\[ g^{ij} = h^{ij} - \frac{N^i N^j}{(N^{\perp})^2}, \qquad \sqrt{-g} = N^{\perp} \sqrt{h}, \qquad g^{0i} = \frac{N^i}{(N^{\perp})^2} \qquad [6] \]

Note that $(N^{\perp}, N^i)$ are introduced to specify the deformation of the hypersurface and therefore, the evolution equations through the Hamiltonian will not determine them; they are arbitrary functions. Consequently, [4] implies that $g_{00}$ and $g_{0i}$ will enter the Hamiltonian as arbitrary functions. As alluded to above, $h_{ij}$ and their conjugate momenta $\pi^{ij}$ are the dynamical degrees of freedom. We may choose $(N^{\perp}, N^i) = N^{\mu}$ and $h_{ij}$ as independent variables rather than $(g_{00}, g_{0i}) = g_{0\mu}$ and $h_{ij}$ for convenience and go back to the other set of variables through [4] and [5] if we desire. Let $\pi_{\mu}$ be the momenta canonically conjugate to $N^{\mu}$; then a Lagrange multiplier, $\lambda^{\mu}$, is necessary, so that a term $\lambda^{\mu}\pi_{\mu}$ has to be added to the Hamiltonian due to the arbitrariness of $N^{\mu}$. We remind the reader that in electrodynamics an analogous situation arises while analyzing its canonical structure; local gauge symmetry plays a crucial role there. The generic form of the Hamiltonian is (we shall introduce $1/16\pi G$, etc., later)

\[ H = \int d^3x \left( N^{\perp} \mathcal{H}_{\perp}[h_{ij}, \pi^{ij}] + N^i \mathcal{H}_i[h_{ij}, \pi^{ij}] + \lambda^{\mu} \pi_{\mu} \right) \qquad [7] \]

From the perspective of constraint analysis, it is natural that $\pi_{\mu} \approx 0$ appears as a first-class constraint, as the $\pi_{\mu}$ are multiplied by arbitrary functions. Moreover, this constraint must hold good under the deformation of the surface, which implies that $\{\pi_{\mu}, H\}_{\rm PB}$ must vanish weakly, leading to $\mathcal{H}_{\mu} \approx 0$. As a consistency requirement, these must be first-class constraints if the $N^{\mu}$ are to be arbitrary functions. We identify $\pi_{\mu} \approx 0$ and $\mathcal{H}_{\mu} \approx 0$ as the primary and secondary constraints, respectively. Thus far, we have discussed the case for pure gravity; the presence of matter fields in the full action modifies the treatment appropriately.

Let us analyze the structure of the constraints for the Einstein–Hilbert action [1]. For a compact manifold with boundary $\partial M$, we have to add the surface term which takes the form:

\[ \frac{1}{8\pi G} \int_{\partial M} d^3x\, \sqrt{h}\, K \]

Here $K$ stands for the trace of the extrinsic curvature of the boundary 3-surface and $h = \det h_{ij}$; note that $h_{ij}$ is the induced metric on the 3-surface. If we include matter fields, the corresponding action is to be taken into account. Once we make the 3 + 1 split of the metric, the action assumes the following form:

\[ S = \frac{1}{16\pi G} \int d^3x\, dt\, N^{\perp} \sqrt{h} \left[ K_{ij} K^{ij} - K^2 + {}^{3}R - 2\Lambda \right] \qquad [8] \]

where

\[ K_{ij} = \frac{1}{N^{\perp}} \left( -\frac{\partial h_{ij}}{\partial t} + D_i N_j + D_j N_i \right) \qquad [9] \]
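As a numerical aside, the algebraic relations [4]–[6] between the spacetime metric and the lapse, shift, and 3-metric can be checked directly. The sketch below assembles $g_{\mu\nu}$ from [3] and [4] for arbitrary sample values and recovers the lapse, the shift, $\sqrt{-g}$, and the inverse-metric components; the function names and sample numbers are hypothetical, and a single spacetime point (constant matrices) is assumed.

```python
import numpy as np

def adm_metric(lapse, shift, h):
    """Build the 4x4 metric g_{mu nu} from lapse N, shift N^i and 3-metric h_ij, per [3]-[4]."""
    shift = np.asarray(shift, dtype=float)
    h = np.asarray(h, dtype=float)
    shift_low = h @ shift                      # N_i = h_ij N^j
    g = np.empty((4, 4))
    g[0, 0] = shift @ shift_low - lapse**2     # g_00 = h_ij N^i N^j - N^2
    g[0, 1:] = shift_low                       # g_0i = h_ij N^j
    g[1:, 0] = shift_low
    g[1:, 1:] = h                              # g_ij = h_ij
    return g

def recover_adm(g):
    """Invert the split: lapse and shift from [5], plus the quantities listed in [6]."""
    g_inv = np.linalg.inv(g)
    h = g[1:, 1:]
    return {
        "lapse": 1.0 / np.sqrt(-g_inv[0, 0]),              # N = 1 / sqrt(-g^00)
        "shift": np.linalg.inv(h) @ g[0, 1:],               # N^i = h^ij g_0j
        "sqrt_minus_g": np.sqrt(-np.linalg.det(g)),
        "g_inv_0i": g_inv[0, 1:],
        "g_inv_ij": g_inv[1:, 1:],
    }

# Arbitrary sample data (assumed for illustration only).
N = 1.3
Ni = np.array([0.2, -0.1, 0.4])
h = np.array([[2.0, 0.3, 0.0],
              [0.3, 1.0, 0.1],
              [0.0, 0.1, 1.5]])
out = recover_adm(adm_metric(N, Ni, h))
print(np.isclose(out["lapse"], N), np.allclose(out["shift"], Ni))
print(np.isclose(out["sqrt_minus_g"], N * np.sqrt(np.linalg.det(h))))
print(np.allclose(out["g_inv_ij"], np.linalg.inv(h) - np.outer(Ni, Ni) / N**2))
```

Each printed comparison corresponds to one of the relations in [5] and [6]; the check is purely algebraic and says nothing about the dynamics.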
Here $D_i N_j$ represents the covariant derivative of $N_j$ with the connections computed from $h_{ij}$, and ${}^{3}R$ is the curvature of the 3-surface. The canonical momenta are

\[ \pi^{ij} = -\frac{\sqrt{h}}{16\pi G} \left( K^{ij} - h^{ij} K^{l}{}_{l} \right) \qquad [10] \]

and we can invert this relation to get

\[ K^{ij} = -\frac{16\pi G}{\sqrt{h}} \left( \pi^{ij} - \frac{1}{2} h^{ij} \pi^{l}{}_{l} \right) \]

The Hamiltonian form of the action is given by

\[ S_H = \int d^3x\, dt \left( \dot{h}_{ij} \pi^{ij} - N^{\perp} \mathcal{H}_{\perp} - N^i \mathcal{H}_i \right) \qquad [11] \]

Notice that since [8] does not involve time derivatives of $N^{\perp}$ and $N^i$, their corresponding canonical momenta vanish,

\[ \pi_{\perp} \approx 0, \qquad \pi_i \approx 0 \qquad [12] \]

as expected from our earlier discussions about the role of $N^{\mu}$. A straightforward constraint analysis leads to the pair of constraints

\[ \mathcal{H}_i = -2 D_j \pi^{j}{}_{i} \approx 0 \qquad [13] \]

\[ \mathcal{H}_{\perp} = \frac{16\pi G}{\sqrt{h}} \left( h_{ij} h_{kl} - \frac{1}{2} h_{ik} h_{jl} \right) \pi^{ik} \pi^{jl} - \frac{\sqrt{h}}{16\pi G}\, {}^{3}R \approx 0 \qquad [14] \]

We mention in passing that the above constraint equations get modified in the presence of matter fields in the theory. This is relevant: the WDW equation plays an important role in quantum cosmology to describe the evolution of the universe in early epochs, and the equation is studied in the presence of a generic matter content, that is, a scalar field with potential. The constraint equations [13] and [14] modify to

\[ \mathcal{H}^{T}_{i} = \mathcal{H}_i + \mathcal{H}^{\rm matter}_{i} \approx 0 \qquad [15] \]

\[ \mathcal{H}^{T}_{\perp} = \mathcal{H}_{\perp} + \mathcal{H}^{\rm matter} \approx 0 \qquad [16] \]

The Algebra of Constraints

In order to compute the classical Poisson bracket algebra of the constraints [13] and [14], we use the canonical Poisson bracket relations for the phase-space variables on $\Sigma_t$:

\[ \{h_{ij}(x), h_{kl}(x')\} = 0 \qquad [17] \]

\[ \{\pi^{ij}(x), \pi^{kl}(x')\} = 0 \qquad [18] \]

\[ \{h_{ij}(x), \pi^{kl}(x')\} = \delta^{k}_{(i} \delta^{l}_{j)}\, \delta(x, x') \qquad [19] \]

Thus, the Poisson brackets among the constraints [13] and [14] are

\[ \{\mathcal{H}_i(x), \mathcal{H}_j(x')\} = \mathcal{H}_j(x)\, \partial^{x'}_{i} \delta(x, x') + \mathcal{H}_i\, \partial^{x}_{j} \delta(x, x') \qquad [20] \]

\[ \{\mathcal{H}_i(x), \mathcal{H}_{\perp}(x')\} = \mathcal{H}_{\perp}(x)\, \partial^{x}_{i} \delta(x, x') \qquad [21] \]

\[ \{\mathcal{H}_{\perp}(x), \mathcal{H}_{\perp}(x')\} = h^{ij}(x) \mathcal{H}_i(x)\, \partial^{x'}_{j} \delta(x, x') - h^{ij}(x') \mathcal{H}_i(x')\, \partial^{x}_{j} \delta(x, x') \qquad [22] \]

When we resort to canonical quantization, the starting point is the Hamiltonian action in the first-order formalism, where the canonical variables are subjected to the constraints [13] and [14] in terms of $\mathcal{H}_{\perp}$ and $\mathcal{H}_i$ satisfying the algebra given by [20]–[22]. One encounters a number of important issues while proceeding to canonically quantize the theory. We shall mention only a few of them in what follows. It is important to address issues related to the role of the constraints in the quantized theory and how to deal with the Lagrange multipliers $N^{\perp}$ and $N^i$. A simple proposal is to solve the constraints at the classical level, identify the physical degrees of freedom, and quantize the theory subsequently. There are four (first-class) constraints, $\mathcal{H}_{\perp}$ and $\mathcal{H}_i$; therefore, out of the 12 phase-space variables, $(h_{ij}, \pi^{ij})$, only eight are independent. We need to supply four gauge conditions in order to render the theory (classically) solvable. Thus, we are left with four physical degrees of freedom in the Hamiltonian phase space and we can quantize them. The implementation of this idea is easier said than done. One obstacle is that the constraints cannot be solved in a closed form in this formalism. If we fix a gauge and quantize the theory, we obviously break the gauge invariance. It is essential to show, subsequently, that all physically observable quantities are independent of the gauge choice. Another criticism of this formalism is that we already get rid of some of the components of the metric. Therefore, the spirit of the general theory of relativity, which is based on the geometrical structure of spacetime, is somewhat diluted. There are other suggestions where $h_{ij}$ and their conjugate momenta are elevated to quantum status before supplying the gauge conditions. The issues of gauge fixing and dealing with the constraints are addressed at the quantum level. We replace the canonical Poisson bracket
algebra by the canonical commutators and proceed further. The momentum operator assumes the form

\[ \hat{\pi}^{ij} = -i\hbar \frac{\delta}{\delta h_{ij}} \]

and the wave functional depends on $h_{ij}$, that is, $\Psi[h]$. There are many technical problems related to the properties of the states and we shall not deal with them due to limitations of space. It is essential to discuss the role of the constraints in the quantum theory. We demand that the quantum constraints annihilate the physical states (recall the Gauss law constraint in gauge theories). However, the issue of operator ordering is to be dealt with, which in turn is connected with the Hermiticity properties of the quantum constraints. The Hamiltonian constraint $\mathcal{H}_{\perp} \approx 0$ (henceforth denoted as $\mathcal{H}$ and defined as the Hamiltonian) is a product of the metric $\hat{h}_{ij}$ and $\hat{\pi}^{ij}$. There is a certain ambiguity in defining the constraint. Therefore, one has to choose a convention.

The condition that the Hamiltonian, $\hat{\mathcal{H}}^{T}$, consisting of gravitational and matter components, annihilates the state is expressed as

\[ \hat{\mathcal{H}}^{T} \Psi = 0 \qquad [23] \]

When we adopt the coordinate representation for $\hat{\pi}^{ij}$, the above equation takes the form

\[ \left[ -16\pi G\, G_{ijkl} \frac{\delta^2}{\delta h_{ij}\, \delta h_{kl}} - \frac{\sqrt{h}}{16\pi G} \left( {}^{3}R - 2\Lambda \right) + \mathcal{H}^{\rm matter} \right] \Psi[h, \phi] = 0 \qquad [24] \]

This is the celebrated WDW equation. Here we have considered a simple case where the matter Hamiltonian density generically contains a single scalar field, $\phi$, and therefore $\Psi$ is a functional of the 3-metric on $\Sigma_t$ and $\phi$. $G_{ijkl}$ is the De Witt metric in the superspace:

\[ G_{ijkl} = \frac{1}{\sqrt{h}} \left( h_{ik} h_{jl} + h_{il} h_{jk} - h_{ij} h_{kl} \right) \qquad [25] \]

Remarks The space of all 3-metrics and the scalar field $(h_{ij}, \phi)$, on $\Sigma_t$, for the description of classical evolutions is called the superspace (no connection with the superspace of supersymmetry). Thus, $\Psi[h_{ij}, \phi]$ is a functional on superspace. Furthermore, $\Psi$ carries no explicit dependence on $t$. This is a consequence of the fact that "time" plays the role of a parameter in the general theory of relativity; thus the dynamical variables $h_{ij}$ and $\phi$ already provide the evolutionary processes although $t$ does not make its appearance. As mentioned earlier, we always discuss the case when $\Sigma_t$ is compact. Another point to note is that the quantum momentum constraint, $\hat{\mathcal{H}}_i$, as an operator annihilates the wave function, which is a statement of the quantum-mechanical invariance of the theory under three-dimensional diffeomorphisms. However, the WDW equation conveys invariance of the theory under reparametrization, although careful analysis is necessary to prove this point. Now we proceed to discuss the solutions of the WDW equation.

WDW Equation and the Solutions

It is recognized that the WDW equation [24] is a second-order hyperbolic functional differential equation and naturally it has an enormous number of solutions. Therefore, if we want the WDW equation to have any predictive power, it is necessary to introduce boundary conditions. One possible choice is to specify the wave function on the boundary of the superspace. Indeed, the central issue of quantum cosmology is the choice of boundary conditions, which has been an important topic of debate. This point will be briefly discussed later. Notice that the boundary condition has to be introduced keeping in mind how the universe is expected to behave as it evolves. There is a proposition that the boundary condition for the quantum evolution of the universe be given the status of a physical law. Therefore, the role of the wave functional, $\Psi[h_{ij}(x), \phi(x), B]$, its evolution, and interpretation are central to the development of quantum cosmology. Thus, $\Psi$ represents the amplitude for the universe to have $h_{ij}(x)$ on the 3-surface, $B$, and matter field $\phi(x)$. It is argued that the path-integral formalism should be adopted as an alternative to the canonical prescription to solve for the wave function, or rather the transition amplitude, satisfying the WDW equation. Here the first step is to define the Euclidean version of the gravitational action keeping in mind the subtleties. As is well known, we deal with the propagator (or transition amplitude) in the path-integral approach, where the functional integral is carried out over a set of 4-metrics and matter fields with the Euclidean action inside the integral acting as the weight factor. We recall that while formulating quantum mechanics in the path-integral approach, we sum over all possible paths in the functional integral. However, in the semiclassical approximation, the amplitude is dominated by the action corresponding to the classical path and we approximate the wave function as $\Psi \approx e^{(i/\hbar) S_{\rm cl}}$, and it gets modified appropriately in the Euclidean formulation. In this background, we briefly discuss how the wave function of the universe is obtained in the path-integral formalism.
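The hyperbolic character of the WDW equation [24] noted above can be traced to the indefinite signature of the De Witt metric [25]. The sketch below, offered only as an illustration, evaluates $G_{ijkl}$ at a single point for a sample positive-definite 3-metric and counts the signs of its eigenvalues on the six-dimensional space of symmetric tensors; the helper names and the sample $h_{ij}$ are assumptions made here, and the normalization follows [25] as written in this article.

```python
import numpy as np

def dewitt_metric(h):
    """G_ijkl = (h_ik h_jl + h_il h_jk - h_ij h_kl) / sqrt(det h), following [25]."""
    h = np.asarray(h, dtype=float)
    return (np.einsum("ik,jl->ijkl", h, h)
            + np.einsum("il,jk->ijkl", h, h)
            - np.einsum("ij,kl->ijkl", h, h)) / np.sqrt(np.linalg.det(h))

def symmetric_basis():
    """A basis of the six independent symmetric 3x3 tensors."""
    basis = []
    for i in range(3):
        for j in range(i, 3):
            e = np.zeros((3, 3))
            e[i, j] = e[j, i] = 1.0
            basis.append(e)
    return basis

def dewitt_signature(h):
    """Return (number of negative, number of positive) eigenvalues of G on symmetric tensors."""
    G = dewitt_metric(h)
    basis = symmetric_basis()
    # 6x6 Gram matrix of the quadratic form; by Sylvester's law of inertia
    # the sign counts do not depend on the basis chosen.
    M = np.array([[np.einsum("ij,ijkl,kl->", a, G, b) for b in basis] for a in basis])
    eig = np.linalg.eigvalsh(M)
    return int(np.sum(eig < 0)), int(np.sum(eig > 0))

h_sample = np.array([[2.0, 0.3, 0.0],
                     [0.3, 1.0, 0.1],
                     [0.0, 0.1, 1.5]])   # assumed sample 3-metric
print(dewitt_signature(np.eye(3)), dewitt_signature(h_sample))
```

One direction, corresponding to a pure rescaling (trace) of the 3-metric, is negative and the remaining five are positive, so at each point of the 3-surface the second-derivative term in [24] has a wave-operator-like signature.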
458 Wheeler–De Witt Theory

According to the proposal of Hartle and Hawking, boundary B. Thus, in order to determine the wave
one adopts path-integral formalism for the Eucli- function of the universe, we are required to specify
dean action where the functional integral is not only the initial configurations of hij and at
= 0. We
carried out over the 4-metric, g , and the scalar shall not enter into important issues related with the
field , but also one takes sum over the class of properties of the Euclidean action, the problems
manifolds, M. Note that B is a part of the boundary associated with the choice of contours of the path
ij and  are the induced
of this set of manifold. If h integrals, and related topics. The reader will find
metric and the configuration of the scalar field, , detailed discussions in the lectures and monographs
on the boundary, B, then the propagator (henceforth referred in the ‘‘Further reading’’ section.
we just call it the wave function) [h ij , ,
 B] can be It is important to re-emphasize that boundary
given a functional-integral representation. Indeed, conditions are to be introduced while solving the
obtaining the most general form of the path integral, WDW equation. It was argued by De Witt that the
summing over the 4-manifolds, is quite a formidable wave function will be determined uniquely from the
task. On the other hand, if one chooses a class of mathematical consistency of the theory and that
4-manifolds which can be decomposed as a product hope has not been realized. Whether one attempts to
(foliation) R  B, the wave function is expressed as solve the functional differential WDW equation or
obtain the wave function in the path-integral
 ;
½h;  B formalism, the issue of boundary condition is
Z Z
unavoidable. There are mainly three different kinds
¼ DN  Dhij D f ðN  ÞFP eSE ½g ;  ½26 of boundary conditions in quantum cosmology:
Hartle–Hawking (HH) no-boundary proposal,
We have introduced the gauge-fixing condition Vilenkin’s tunneling mechanism, and Linde’s bound-
as f (N ), which is usually taken to be N _  = l and ary condition. We shall briefly discuss the first two
then the corresponding Faddeev–Popov determinant, proposals. Instead of stating the boundary condi-
FP , has to be inserted into the path-integral tions in full generality, we shall envisage quantum
measure. We recall from our earlier discussions cosmology in a minisuperspace and provide illus-
that N  has to be unrestricted on the boundary, B, trative examples to compare the main features of
since they have no dynamical role when we express HH and Vilenkin solutions to the WDW equation.
the action in terms of the variables defined on the It is realized that the discussion and solutions of
3-surface. As noted in the previous discussion, quantum cosmology in the superspace is rather
explicit time dependence does not appear after the difficult, since we deal with functional differential
3 þ 1 split and (hij (x), (x)) have no dependence on equations and the configuration space is infinite
t. Therefore, we introduce a parameter to designate dimensional. Therefore, it is worthwhile to consider
the paths over which the functional integral is to be a system, as a simple model, which has finite degrees
taken. Recall that in the quantum-mechanical case, of freedom. Thus, we assume that the metric and
the paths are parametrized as qi (t) for the coordi- matter fields depend only on cosmic time to begin
nates. However, when we resort to a parametriza- with. There is a physical motivation behind this
tion of the variables for the case at hand, certain assumption, since the present classical state of the
conditions must be fulfilled. We are permitted to universe is described by the Friedmann–Robertson–
integrate over $h_{ij}$ and $\phi$ over only those paths, while Walker (FRW) metric corresponding to an isotropic
parametrizing them as $(h_{ij}(x,\tau), \phi(x,\tau))$, so that they and homogeneous universe. Notice that the classical
match the arguments of the wave function on the evolution equation resembles that of the motion of a
boundary B. Therefore, we may define the metric particle. The quantum evolution equations are now
and the scalar field configuration so that at $\tau = 1$ given by differential equations of quantum
they assume their functional values on the boundary: mechanics rather than functional differential equations.
in other words, $\bar h_{ij}(x) = h_{ij}(x, \tau = 1)$ and Similarly, the path-integral formulation
$\bar\phi(x) = \phi(x, \tau = 1)$. It is worthwhile to go back to becomes analogous to the quantum-mechanical
the quantum-mechanical analogy once more. When framework. Of course, adopting such a simplified
we compute amplitudes/propagators in quantum approach prevents us from describing some of the
mechanics, the functional integral is defined for the important aspects of quantum gravity. However,
amplitude of going from a configuration qi to qf within this framework, several essential features can
while summing over all possible paths originating be exhibited and deep insight might be gained into
from one endpoint qi and ending at the final point the physics of the very early universe. The first step
qf . On this occasion, we have imposed the con- in getting the minisuperspace metric is to assume
straint on the final endpoint belonging to the that the lapse is homogeneous, that is, N ? = N ? (t)

and the shift is set to zero, N i = 0. Thus, the metric derivative term; the total derivative term can be
takes the form removed by adding a boundary term and k is
positive since we take the spatial part to be closed.
$ds^2 = -(N^\perp(t))^2\,dt^2 + h_{ij}(x,t)\,dx^i dx^j$ [27]
We have redefined the scale factor, the scalar
field, the potential term, and k such that the
The relevant choice of 3-metric for FRW isotropic Einstein–Hilbert action with matter field assumes
and homogeneous universe is the form of [29] and this action facilitates the
definition of conjugate momenta without cumbersome numerical factors, and the Hamiltonian takes
a simple form. The conjugate momenta and resulting Hamiltonian are
$h_{ij}(x,t)\,dx^i dx^j = a(t)^2\,d\Omega_3^2$ [28]
Note that $d\Omega_3^2$ is the metric on a 3-sphere. It is
straightforward to derive the Friedmann equations
for such a geometry.
$\pi_a = -\frac{a\dot a}{N^\perp}, \qquad \pi_\phi = \frac{a^3\dot\phi}{N^\perp}$ [30]
The HH no-boundary condition can be interpreted as a topological proposition about the set of
paths over which we have to sum. The 3-surface B is
to be taken as the only surface of a compact
4-manifold M which is endowed with the metric
$g_{\mu\nu}$, and $h_{ij}$ and $\phi$ are the induced metric and the
$H_c = \frac{N^\perp}{2}\left[-\frac{\pi_a^2}{a} + \frac{\pi_\phi^2}{a^3} + a^3 V(\phi) - a\right] = N^\perp H$ [31]
scalar field on the surface. The wave function is and the constraint is H = 0. In the quantum
cosmology context, we solve the WDW equation:
$H\Psi = 0$. Since the exact solution is not possible, one
resorts to some approximation with simple assumptions. The differential equation is
$\left[\frac{\partial^2}{\partial a^2} - \frac{1}{a^2}\frac{\partial^2}{\partial\phi^2} + a^4 V(\phi) - a^2\right]\Psi = 0$ [32]
obtained by using the matching condition supplemented
with initial condition. For the minisuperspace
case, initial conditions impose constraints on
the scale factor $a(\tau = 0)$ and $(da/d\tau)(\tau = 0)$, and $N^\perp$
is to be gauge fixed. These conditions are to be
implemented in the context of determining the wave
function of the universe. In the case of the tunneling
boundary condition of Vilenkin, the qualitative
scenario is as follows. If we look at the solution Let us consider the case when $V(\phi)$ does not grow
to the WDW equation (in the path-integral very fast, that is, $V'(\phi)/V(\phi) \ll 1$, and consider the
approach, Vilenkin considers Lorentzian action), solution to the WDW equation where $\Psi$ has weak
the solution, crudely speaking, has both ingoing dependence on $\phi$. Consequently, we may ignore the $\phi$-
and outgoing modes at the boundary. In his derivative term in [32]. The purpose of these
proposal, the outgoing mode at the boundary is to assumptions is to reduce the problem to a one-
be accepted. The exact prescription is a lot more dimensional quantum mechanics problem and then
subtle than the above statement, since one has to employ the WKB method. It is hoped that at least some
define the meaning of outgoing mode carefully in of the nonperturbative aspects can still be captured.
the absence of a timelike Killing vector when we When the effective potential appearing in [32] is
write the WDW equation on the superspace. The negative (this is a classically inaccessible region), the
qualitative picture for Vilenkin’s boundary condition,
in the minisuperspace, is like tunneling solutions
in quantum mechanics when a particle
penetrates through a potential barrier.
Let us consider a minisuperspace model with the
scalar field $\phi$ and potential $V(\phi)$. The action is
$S = \frac{1}{2}\int dt\, a^3\left[-\frac{1}{N^\perp}\left(\frac{\dot a}{a}\right)^2 + \frac{\dot\phi^2}{N^\perp} - N^\perp V(\phi) + \frac{N^\perp}{a^2}\right]$ [29]
A few comments are in order here. For the FRW
metric, we have $\sqrt{-g}\,R = 6(-a\dot a^2 + ka)$ + a total
wave function is
$\Psi(a,\phi) \sim e^{\pm(1/3V(\phi))(1 - a^2 V(\phi))^{3/2}}$ [33]
We expect the wave function to have oscillatory
behavior in the classically allowed domain and it
does have that property,
$\Psi(a,\phi) \sim e^{\pm(i/3V(\phi))(a^2 V(\phi) - 1)^{3/2}}$ [34]
The choice of signs is decided from the boundary
conditions imposed and the usual matching of
the wave functions of the two regions is done as is
the case with the WKB approximation. Note that we
are considering the metric and the scalar field on

the boundary which were denoted by $\bar h_{ij}$ and $\bar\phi$;
strictly speaking, we should denote the solutions
as $\bar a$ and $\bar\phi$. But from now on, we drop this bar on
$a$ and $\phi$.
Let us momentarily assume that $V$ is $\phi$-independent
and therefore, we have an effective cosmological
constant. The problem is identical to the motion
of a particle in a potential well. There are two
Vilenkin boundary conditions yield the following
wave functions:
$\Psi_V(a,\phi) \sim e^{-(1/3V(\phi))(1 - [1 - a^2 V(\phi)]^{3/2})}$ [35]
$\Psi_V(a,\phi) \sim e^{-1/3V(\phi)}\, e^{-(i/3V(\phi))[a^2 V(\phi) - 1]^{3/2}}$ [36]
turning points. In one region, the particle starts from Note that [35] is the wave function under the barrier,
a = 0, reaches one turning point r1 and returns back. that is, $a^2 V(\phi) < 1$ in this region, whereas [36] is in
In another case, it starts from $a = \infty$, travels up to the classically accessible domain ($a^2 V(\phi) > 1$) which
a = r2 and reflects back. In the quantum-mechanical is reflected by the oscillatory character. The slowly
case, the particle can tunnel through the barrier. The varying function $F(\phi) \sim e^{-1/3V(\phi)}$ appears as the
wave function has both decaying and growing common factor for the wave functions in the two
modes under the barrier, and boundary conditions domains.
tell us which mode to choose. One possibility is that The HH no-boundary proposal to derive the wave
the particle starts from a = 0, tunnels through and function of the universe was formulated in the
proceeds towards $a = \infty$, that is, it has outgoing Euclidean path-integral formalism. A considerable
mode. The other possibility is that the wave function amount of attention has been focused in this area.
has both outgoing and ingoing modes. In this simple We shall present the HH wave function providing
scenario, the former corresponds to Vilenkin’s only a sketchy argument. In the Euclidean description,
tunneling boundary condition, where the universe the 4-metric is $ds^2 = (N^\perp)^2 d\tau^2 + a^2(\tau)\,d\Omega_3^2$. The
is created at a = 0 and it keeps growing. The latter is 4-geometry should close in a regular way. If we
HH no-boundary proposal where the wave function make the bounding 3-space smaller and smaller, it
has both modes and the universe contracts and can be closed with flat space. We can infer the
expands. behavior of the scale factor in the limit $\tau \to 0$ from
Now we discuss the two boundary conditions in this consideration. Furthermore, in the semiclassical
the presence of the potential, with the approximations approximation $\Psi(a,\phi) \sim e^{-S_E}$; we have replaced
mentioned above. The proposition of Vilenkin $\Psi(\bar a,\bar\phi)$ by $\Psi(a,\phi)$ as remarked earlier. Thus, our aim
amounts to the following conditions on the wave is to evaluate SE at the saddle point. This is achieved
function: the region of the boundary which is by writing down the (Euclidean version) field
nonsingular is $\phi$ finite and $a = 0$. Other than this equations for $a$ and $\phi$ and the Hamiltonian
domain, either $a$ or $\phi$ diverge on any other region of constraint, and then solve for $a(\tau)$, $\phi(\tau)$, and $N^\perp(\tau)$.
the boundary; both can diverge in this singular Eventually, we want to eliminate $N^\perp$ and then
boundary. Notice from the expression for [33] and obtain $S_E$. After all, the path integral is dominated
[34] that the tunneling region corresponds to by the classical trajectory, $a(\tau)$, and one does not fix
$a^2 V(\phi) < 1$, whereas the oscillatory domain is the gauge for $N^\perp$ while solving for $a$. In fact, the
$a^2 V(\phi) > 1$. If we use the saddle-point approximation, lapse gets eliminated by utilizing the Hamiltonian
$\Psi \sim e^{\pm iS_{cl}}$. Vilenkin’s boundary condition corresponds constraint, which involves $\tau$-derivatives of both $a$ and
to $\Psi \sim e^{-iS_{cl}}$, with $\phi$. We mention, without going into details, that the
$S_{cl} = \frac{\left(\sqrt{a^2 V(\phi) - 1}\right)^3}{3V(\phi)}$
classical action is not unique. One of the ways to
visualize it is to note that the solutions obtained for
the lapse from the Hamiltonian constraint have sign
ambiguities.
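As a consistency check on the saddle-point phase, the following sketch (added for illustration, not part of the original article) verifies numerically that $S_{cl} = (a^2 V - 1)^{3/2}/3V$ obeys the Hamilton–Jacobi relation $(dS_{cl}/da)^2 = a^2(a^2 V - 1)$, which is what [32] reduces to at leading WKB order once the $\phi$-derivative term is dropped. The function names, the constant value V = 0.5 and the finite-difference step are assumptions made only for this example.

```python
import math


def s_cl(a: float, v: float) -> float:
    """Classical action S_cl = (a^2 V - 1)^(3/2) / (3 V), defined for a^2 V >= 1."""
    return (a * a * v - 1.0) ** 1.5 / (3.0 * v)


def ds_cl_da(a: float, v: float, h: float = 1e-6) -> float:
    """Central finite-difference estimate of dS_cl/da (h is an illustrative step)."""
    return (s_cl(a + h, v) - s_cl(a - h, v)) / (2.0 * h)


def hamilton_jacobi_residual(a: float, v: float) -> float:
    """(dS/da)^2 - a^2 (a^2 V - 1); close to zero if S_cl solves the WKB equation."""
    return ds_cl_da(a, v) ** 2 - a * a * (a * a * v - 1.0)


V = 0.5                              # constant potential, chosen for illustration
turning_point = 1.0 / math.sqrt(V)   # a^2 V = 1 separates the two regions

for a in (1.5, 2.0, 3.0, 5.0):       # all in the oscillatory region a^2 V > 1
    print(f"a = {a}: residual = {hamilton_jacobi_residual(a, V):.3e}")
```

The residual should be at the level of finite-difference noise for every sampled scale factor beyond the turning point $a = 1/\sqrt{V}$.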
So far, we considered the situation where the differential The classical action is
$S_E = -\frac{1}{3V(\phi)}\left(1 \pm [1 - a^2 V(\phi)]^{3/2}\right)$ [37]
operator for $\phi$ is dropped in [32]. In order to
account for weak $\phi$-dependence, we could introduce
it by multiplying a slowly varying function, say $F(\phi)$,
and write $\Psi(a,\phi) \sim F(\phi)e^{-iS_{cl}}$. Similarly, the wave
function can be obtained under the barrier and Note that the two solutions correspond to 3-sphere
required to satisfy WKB matching conditions. boundary being closed off by sections of 4-sphere.
Furthermore, the regularity condition on the wave Moreover, the Euclidean action is negative. Hartle
function in small scale factor limit and behavior of and Hawking argue that the negative sign in [37]
its derivative with respect to $\phi$ in that limit gives the correct answer since the wave function
determine the form of $F(\phi)$. In summary, the peaks for that choice. However, there is no unanimity

for HH argument and some authors have put always accompanied by the dilaton in any string-
forward a point of view that additional inputs are theoretic approach to study the universe. The duality
necessary to arrive at the HH conclusion about symmetries are recognized to provide deep under-
choosing the negative sign for SE in [37]. We refer the standing of the string dynamics. Therefore, the
reader to the reviews of Hartle and Halliwell for investigations of quantum gravity phenomena from
detailed discussions on the choice of contours for the string-theory viewpoint are necessarily influenced
path integrals, subtleties involved in getting various by above mentioned facts. Indeed, classical cosmolo-
solutions for the lapse and their interpretations. We gical solutions, derived from string effective action,
give below the wave function under the barrier (with have several interesting characteristics. We mention in
choice of negative sign in [37]): passing that the WDW equation has played an
$\Psi_{HH}(a,\phi) \sim e^{(1/3V(\phi))(1 - [1 - a^2 V(\phi)]^{3/2})}$ [38]
$\Psi_{HH}(a,\phi) \sim e^{1/3V(\phi)} \cos\left(\frac{1}{3V(\phi)}[a^2 V(\phi) - 1]^{3/2} - \frac{\pi}{4}\right)$ [39]
important role to study quantum evolution equations
in string cosmology. The choice of operator-ordering
prescription in defining the WDW Laplace–Beltrami
operator can be resolved by appealing to the duality
symmetries. Furthermore, the boundary conditions
imposed on the wave function are dictated by string
symmetries and therefore, the resulting wave function
has very interesting properties. The string theory has
addressed some of the most important problems in
Remarks The wave function in [38] is obtained in
quantum gravity and it has provided resolutions to
the classically inaccessible region under the condition
several key issues. It is expected that string theory
$a^2 V(\phi) < 1$, and wave function [39] corresponds to
will provide answers to challenging questions in
the case $a^2 V(\phi) > 1$, where the particle motion is
quantum cosmology. In summary, we have conveyed
permissible classically. Note the factor $e^{1/3V(\phi)}$ in the
some of the salient aspects of the WDW equation.
wave functions in both the regions and compare that
The canonical quantization technique is adopted to
with the Vilenkin’s wave function which has the
study quantum gravity in this approach. We have
opposite sign. We may conclude where the wave
illustrated the crucial role of the constraint formalism
function will peak for each of the two boundary
due to Dirac and argued that some of the nonperturbative
conditions. Whereas Vilenkin’s proposal implies that
aspects of quantum gravity could be retained.
$\Psi_V(a,\phi)$ peaks when $V(\phi)$ takes large values, the HH no-boundary
In a short article of this nature, it is not possible to
condition tells us that it peaks when
provide detailed discussion about the general derivation
$V(\phi) \to 0$. Furthermore, we note that $\Psi_V$ is complex
of the WDW equation and discuss the role of
and $\Psi_{HH}$ is real in the oscillatory region. Although
boundary conditions more exhaustively. Instead, we
the debates on the merits and demerits of each of the
presented some of the key steps in the derivation of
boundary proposals are going on for more than two
the WDW equation adopting the canonical formalism
decades, the issue is far from being settled. In the
and provided simple examples. The subject is still an
absence of any experimental tests, there is no way to
active area of research. The interested reader may
favor one boundary proposal over another. Then,
benefit from the bibliography.
boundary conditions do have predictions about the
evolution of the universe after the quantum era and See also: Canonical General Relativity; Loop Quantum
have predictions in that (classical) regime. Therefore, Gravity; Quantum Cosmology; Quantum Dynamics in
determination of the wave function with specific Loop Quantum Gravity; Quantum Geometry and its
boundary conditions does have some connections Applications; Superstring Theories.
with the laws that govern the evolution of our
universe in the present epoch.
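The contrast drawn in the Remarks can be made concrete with a small numerical sketch (an illustration added here, not taken from the article). It evaluates only the leading WKB forms in the oscillatory region, $\Psi_V \sim e^{-1/3V} e^{-(i/3V)(a^2V-1)^{3/2}}$ and $\Psi_{HH} \sim e^{1/3V}\cos[(a^2V-1)^{3/2}/3V - \pi/4]$, and treats $V$ as a constant. All function names and sample values are assumptions.

```python
import cmath
import math


def wkb_phase(a: float, v: float) -> float:
    """Phase (a^2 V - 1)^(3/2) / (3 V) in the classically allowed region."""
    return (a * a * v - 1.0) ** 1.5 / (3.0 * v)


def hh_prefactor(v: float) -> float:
    """Common factor e^{+1/3V} of the Hartle-Hawking wave function."""
    return math.exp(1.0 / (3.0 * v))


def psi_vilenkin(a: float, v: float) -> complex:
    """Leading-order tunneling wave function for a^2 V > 1 (complex, outgoing)."""
    return math.exp(-1.0 / (3.0 * v)) * cmath.exp(-1j * wkb_phase(a, v))


def psi_hh(a: float, v: float) -> float:
    """Leading-order no-boundary wave function for a^2 V > 1 (real)."""
    return hh_prefactor(v) * math.cos(wkb_phase(a, v) - math.pi / 4.0)


for v in (0.1, 0.5, 2.0):
    a = 2.0 / math.sqrt(v)  # keeps a^2 V = 4, well inside the allowed region
    print(f"V = {v}: |Psi_V| = {abs(psi_vilenkin(a, v)):.4g}, "
          f"HH prefactor = {hh_prefactor(v):.4g}")
```

In this sketch the modulus of the tunneling wave function grows with $V$, while the no-boundary prefactor grows as $V \to 0$, which is the behaviour described in the text.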
Further Reading
It is worthwhile to dwell on the WDW equation
from the perspectives of string theories. Indeed, there De Witt BS (1967) Quantum theory of gravity. Physical Review
160: 1113.
have been important developments to understand the Feynman RP, Morinigo FB, and Wagner WG (1995) Feynman
dynamics of the universe in the string-theoretic Lectures on Gravity. New York: Addison-Wesley.
framework. It is important to note the key role Halliwell JJ (1990) Quantum cosmology. In: Randjbar-Daemi S,
played by dilaton in string theory: (1) it is one of the Sezgin E, and Shafi Q (eds.) Summer School in High Energy
massless states of the theory, and (2) the vacuum Physics and Cosmology, p. 513. Singapore: World Scientific.
Hartle JB (1989) Introductory lectures on quantum cosmology. In:
expectation value (VEV) of this field determines the Coleman S, Hartle JB, Piran T, and Weinberg S (eds.) Quantum
coupling constants we hope to use in describing Cosmology and Baby Universes, Proceedings of 7th Winter
fundamental interactions. Therefore, the graviton is School in Theoretical Physics, p. 159. Jerusalem: World Scientific.

Hartle JB and Hawking SW (1983) Wave function of the Vilenkin A (1984) Quantum creation of universe. Physical Review
universe. Physical Review D 28: 2960. D 30: 509.
Hawking SW (1983) Quantum cosmology. Les Houches Lectures Wheeler JA (1963) Geometrodynamics. In: De Witt C and De
on Quantum Cosmology, p. 333. (Les Houches Publication) in Witt BS (eds.) Relativity, Groups and Topology. New York:
Einstein Centenary Volume. Gordon and Breach.

Wightman Axioms see Axiomatic Quantum Field Theory

Wulff Droplets
S Shlosman, Université de Marseille, Marseille,
France
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Historically, the first question where the Wulff shapes
have appeared is the one of the formation of a droplet
or a crystal of one substance inside another. The
natural problem here is: what shape would such a formation
take? The statement that such a shape should
be defined by the minimum of the overall surface
energy subject to the volume constraint is physically
very natural. In the isotropic case, when the surface
tension does not depend on the orientation of the
surface, and so is just a positive number, the shape in
question should be of course spherical (provided we
neglect the gravitational effects). In a more general
situation the shape in question is less symmetric. The
corresponding variational problem is called the Wulff
problem. Wulff (1901) formulated it in his paper,
where he also presented a geometric solution to it,
called the ‘‘Wulff construction.’’

minimizer does exist and is unique up to translation.
It is called the Wulff shape.
The following is the geometric construction of
$W_\tau$. Consider the set
$K_\tau = \{x \in \mathbb{R}^d : \forall n \in S^{d-1}\ (x,n) \le \tau(n)\}$
If we define the half-spaces
$L_{\tau,n} = \{x \in \mathbb{R}^d : (x,n) \le \tau(n)\}$
then
$K_\tau = \bigcap_n L_{\tau,n}$ [1]
In particular, $K_\tau$ is convex. It turns out that
$W_\tau = \lambda_\tau\,\partial(K_\tau)$
where the dilatation factor $\lambda_\tau$ is defined by the
normalization: $\mathrm{vol}(\lambda_\tau K_\tau) = 1$. The relation [1] is
called the Wulff construction. For the future use,
we introduce the notation $w_\tau$ for the value of the
surface energy of the Wulff shape:
$w_\tau = \mathcal{W}_\tau(W_\tau)$
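The Wulff construction [1] can be sketched numerically in d = 2. The following example is an illustration added here under stated assumptions, not the article's own procedure: it samples the unit normals on a uniform angular grid and uses the fact that a point $r u$ on the ray with unit direction $u$ lies in every half-plane $(x,n) \le \tau(n)$ exactly when $r \le \tau(n)/(u,n)$ for all $n$ with $(u,n) > 0$. The function names, the grid size and the two sample surface tensions are hypothetical choices.

```python
import math


def wulff_radius(theta, tau, n_dirs=360):
    """Distance from the origin to the boundary of K_tau along direction theta."""
    ux, uy = math.cos(theta), math.sin(theta)
    best = math.inf
    for k in range(n_dirs):
        ang = 2.0 * math.pi * k / n_dirs
        nx, ny = math.cos(ang), math.sin(ang)
        dot = ux * nx + uy * ny
        if dot > 1e-12:  # only these half-planes can cut the ray
            best = min(best, tau(nx, ny) / dot)
    return best


def wulff_polygon(tau, n_dirs=360):
    """Polygonal approximation of the boundary of K_tau (intersection of half-planes)."""
    pts = []
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        r = wulff_radius(theta, tau, n_dirs)
        pts.append((r * math.cos(theta), r * math.sin(theta)))
    return pts


def polygon_area(pts):
    """Shoelace formula for the area enclosed by a closed polygon."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        s += x0 * y1 - x1 * y0
    return 0.5 * abs(s)


def normalized_wulff_shape(tau, n_dirs=360):
    """Rescale K_tau by lambda so that the enclosed area equals 1."""
    pts = wulff_polygon(tau, n_dirs)
    lam = 1.0 / math.sqrt(polygon_area(pts))
    return [(lam * x, lam * y) for x, y in pts]


def isotropic(nx, ny):
    return 1.0  # constant surface tension: K_tau is the unit disk


def square_like(nx, ny):
    return abs(nx) + abs(ny)  # support function of the square [-1, 1]^2


print(polygon_area(wulff_polygon(isotropic)))
print(polygon_area(wulff_polygon(square_like)))
```

The isotropic case should recover (a polygonal approximation of) the disk, in line with the remark that the shape is spherical when the surface tension does not depend on orientation; the second choice gives a square, since $|n_x| + |n_y|$ is the support function of $[-1,1]^2$.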
The Wulff variational problem is formulated as
follows. Let $\tau(n)$, $n \in S^{d-1}$, be some continuous
function on the unit sphere $S^{d-1} \subset \mathbb{R}^d$. We suppose
that $\tau > 0$, and that $\tau$ is even: $\tau(n) = \tau(-n)$. The value
$\tau(n)$ plays the role of the surface tension between two
phases separated by the hyperplane orthogonal to the
vector n. For every closed compact (hyper)surface
$M^{d-1} \subset \mathbb{R}^d$, we define its surface energy as
$\mathcal{W}_\tau(M) = \int_M \tau(n_s)\,ds$
where $n_s$ is the normal vector to M at $s \in M$. The
functional $\mathcal{W}_\tau(M)$ has the meaning of the surface
energy of the M-shaped droplet made from one of
these two phases. It is called the Wulff functional.
Let $W_\tau$ be the surface which minimizes $\mathcal{W}_\tau(\cdot)$ over
all the surfaces enclosing the unit volume. Such a

The Wulff construction was considered by the
rigorous statistical mechanics as just a phenomenological
statement, though the notion of the surface
tension was among its central notions. The situation
changed after the appearance of the book by
Dobrushin et al. (1992). There it was shown that
in the setting of the canonical ensemble formalism,
in the regime of the first-order phase transition, the
(random) shape occupied by one of the phases has
asymptotically (in the thermodynamic limit) a
nonrandom shape, given precisely by the Wulff
construction! In other words, a typical macroscopic
random droplet looks very close to the Wulff shape.
In what follows we will explain the above result.
Another important application of the concepts
introduced above – the role played by the Wulff

shapes in the theory of metastability – is also where


described (see Metastable States). m2 ðÞ  jj
¼ ; K ¼ KðÞ
2m2 ðÞ
Crystals in the Ising Model There is a point x = x() – the ‘‘center’’ of () –
d
Ising spins x take values 1, with x 2 Z . We will such that the shift of () by x() brings the
wrap Zd into a torus TN d
by taking a factor lattice: contour () very close to the scaled Wulff curve,
TNd
= Zd =NZd . Ising-model grand canonical Gibbs defined by the Ising-model surface tension :
d
state in TN is the probability measure N : sffiffiffiffiffiffi !
2
N ðÞ ¼ Z1
N; expðHN ðÞÞ rH ðÞ  xðÞ; NW  KN 2=3 ðln N ÞK ½2
P w
Here HN () =  x, y n.n., x, y2T d x y ,  > 0 is the
N
inverse temperature, and ZN,  is the normalization (Here rH is the Hausdorff distance: for every two
d
factor. Ising-model canonical Gibbs state in TN sets A, C 2 Rd , rH (A, C) is defined as max{inf[r :
, 
is the probability measure N , obtained from N by A  C þ Br ], inf[r : C  A þ Br ]}, where Br is the
taking its conditional distribution: ball of radius r.)
!
; 
X The proof of the above result is the content of
N ðÞ ¼ N  j x ¼ N d ; jj < 1
d
the book by Dobrushin et al. (1992). In the
x2TN
two-dimensional case, it remains true for all
(Here we make a slight abuse of notation. More temperatures 1 below the critical one (Ioffe and
precisely, since P x = 1, one has to consider Schonmann 1998). The value 2/3 of the exponent is
the conditioning x = N N d , where N !  as an improvement of the original 3/4 result
N ! 1, while the numbers (1  N )N d are even (Alexander 1992). Probably, it can be further
integers; otherwise the condition is empty.) We will improved down to 1/2. Though Dobrushin et al.
, 
characterize the canonical state N by describing the (1992) treat only the Ising model, their results are
properties of contours, {i ()}, of configuration . valid for a wide range of other models.
Contours i of configuration  are hypersurfaces The restriction  > gd in the theorem is needed
made of elementary (d  1)-dimensional unit cubes of because without it the droplet may prefer to assume
the dual lattice, which separate the nearest-neighbor the shape of a strip between two meridians rather
d
(n.n.) points x, y 2 TN where x 6¼ y . than to take the Wulff shape.
Suppose that the temperature 1 is low enough,
while the density parameter  satisfies the constraints: Three-Dimensional Case
md ð Þ >  > gd
In the case d = 3 or d
3, the statement that a
Here md () is the spontaneous magnetization of the typical configuration  has only one big contour
d-dimensional Ising model, while gd is some geo- () is still true. But the analog of [2] is not known.
metric factor, the role of which will be explained It is natural to conjecture that it holds at low
later. The above constraint forces some amount of temperatures, even in a stronger version, with only a
the ()-phase into the (þ)-phase. It turns out that logarithmic term K( ln N)K in the RHS. It probably
this amount gathers into one big droplet, which has fails at higher subcritical temperatures.
approximately the Wulff shape. What is known to hold is a weaker version of this
We first formulate the known rigorous results for theorem, where the distance between random
the case d = 2, and then indicate some extensions. droplet and the Wulff shape is measured not in
Hausdorff distance, but in $L^1$ sense. To state the
Two-Dimensional Case corresponding theorem, we will associate with every
d
configuration  on a lattice torus TN a real-valued
,
The following holds with N -probability approach- function M (t) on the unit torus Td = Rd =Zd ,
ing 1 as N ! 1: and we then compare this function with the
The set {i ()} of contours of  has precisely one indicator function IsK , where sK  Td is the Wulff
‘‘big’’ contour, (); the diameters of other body, properly scaled.
contours do not exceed K ln N, K = K(). The function M (t) is defined as follows. We
The area jInt ()j inside () satisfies denote by iN the natural embedding of the discrete
  torus TN d
into Td , the image of iN being the grid with
 jInt  ðÞj  N 2   KN 6=5 ðln N ÞK spacing 1=N. For t 2 Td we define bN (t)  Td to be

pffiffiffiffiffiffiffiffiffiffi
the ball centered at t with radius d 1=N , and let droplet () present in the configuration , then the
BN (t)  (N) be its preimage under iN . Then value M (t) should be expected to be md (),
X depending on whether t is outside or inside the
1
M ðrÞ ¼ ðxÞ droplet, which explains the factor 1=md ().
jBN ðtÞj x2B ðtÞ For a proof, see Bodineau (1999) and Cerf and
N

Pisztora (2000).
We have to expect to see a droplet sK with
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi See also: Cluster Expansion; Large Deviations in
d d md ðÞ   Equilibrium Statistical Mechanics; Metastable States;

w 2md ðÞ Percolation Theory; Statistical Mechanics of Interfaces.

Let us introduce for every subset A  Td the


indicator
 Further Reading
1; t 2 A
IA ðtÞ ¼
1; t 2 Ac Alexander K (1992) Stability of the Wulff minimum and fluctua-
1 d tions in shape for large clusters in two-dimensional percolation.
For every function v in L (T ) we denote by U(v,
) Probability Theory and Related Fields 91: 507–532.
its
-neighborhood in L1 (Td ). Bodineau T (1999) The Wulff construction in three and more
The result can now be formulated. Suppose the dimensions. Communications in Mathematical Physics 207(1):
temperature  1 is below the critical one. Then the 197–229.
function $M_\sigma(t)$ is close to the characteristic function Cerf R and Pisztora A (2000) On the Wulff crystal in the Ising
model. Annals of Probability 28(3): 947–1017.
of the Wulff shape: For every
> 0
Dobrushin RL, Kotecky R, and Shlosman SB (1992) Wulff
8 9
< 1 = Construction: A Global Shape from Local Interaction. AMS
[ Translations Series. Providence, RI: American Mathematical
lim ;
N :  M  ðÞ 2 U ð I sK þt ;
Þ ¼1
N!1 md ðÞ d

; Society.
t2T Ioffe D and Schonmann R (1998) Dobrushin–Kotecky–Shlosman
theory up to the critical temperature. Communications in
The shifts by all t–s of the Wulff shape sK appear
Mathematical Physics 199: 117–167.
in the statement since the location of the droplet can Wulff G (1901) Zur Frage der Geschwindigkeit des Wachsthums
be arbitrary. Note that if a point t is such that the und der Auflösung der Krystallflächen. Z. Kristallogr. 34:
ball BN (t) stays away from the boundary of the 449–530.
Y
Yang–Baxter Equations
J H H Perk and H Au-Yang, Oklahoma State Spin Models
University, Stillwater, OK, USA
When Onsager wrote his monumental paper on the
ª 2006 Elsevier Ltd. All rights reserved. Ising model published in 1944, he made a brief
remark on an obvious star–triangle transformation
relating the model on the honeycomb lattice with
Introduction the one on the triangular lattice. His details on this
were first presented in Wannier’s review article of
The term Yang–Baxter equations (YBEs) was coined 1945. However, the star–triangle transformation
by Faddeev in the late 1970s to denote a principle of played a much more crucial role in Onsager’s
integrability, that is, exact solvability, in a wide reasoning, as it is also intimately connected with
variety of fields in physics and mathematics. Since his elliptic function uniformizing parametrization.
then it has become a common name for several Furthermore, it implies the commutation of
classes of local equivalence transformations in transfer matrices and spin-chain Hamiltonians.
statistical mechanics, quantum field theory, differ- Only in his Battelle lecture of 1970 did Onsager
ential equations, knot theory, quantum groups, and explain how he used this remarkable observation in
other disciplines. We shall cover the various versions his derivation of the formula for the spontaneous
and their relationships, paying attention also to the magnetization which he had announced as a
early historical development. conference remark in 1948 and of which the first
complete derivation had been published by Yang in
Electric Networks
1952 using a completely different method.
Many other applications and generalizations have
The first such transformation came up as early as since appeared. Most generally, we can consider a
1899 when the Brooklyn engineer Kennelly pub- system whose state variables – also called spins – take
lished a short paper, entitled ‘‘The equivalence of values from some suitable discrete or continuous sets.
triangles and three-pointed stars in conducting networks.’’ The interactions between spins a and b are given in
This work gave the definite answer to such terms of weight factors $W_{ab}$ and $\bar W_{ab}$, which are
questions as whether it is better to have the three complex numbers in general, see Figure 2. One
coils in a dynamo – or three resistors in a network – quantity of special interest is the partition function –
arranged as a star or as a triangle, see Figure 1. sum of the product of all weight factors over all
Using Kirchhoff’s laws, the two situations in Figure 1 allowed spin values. The integrability of the model is
can be shown to be equivalent provided
$Z_1 \bar Z_1 = Z_2 \bar Z_2 = Z_3 \bar Z_3$
$\quad = Z_1 Z_2 + Z_2 Z_3 + Z_3 Z_1$ [1]
$\quad = \bar Z_1 \bar Z_2 \bar Z_3/(\bar Z_1 + \bar Z_2 + \bar Z_3)$ [2]
expressed by the existence of spectral variables –
rapidities p, q, r, . . . – that live on oriented lines, two
of which cross between a and b as indicated by the
dotted lines in Figure 2. Arrows from a to b are added
to keep track of the ordering of a and b in case the
weights are chiral (not symmetric).
In Onsager’s special choice of the Ising model the
spins take values $a, b, c, \ldots = \pm 1$ and the weight
factors are the usual real positive Boltzmann weights
depending on the product $ab = \pm 1$, uniformizing
variable $p - q$, and elliptic modulus k. In the integrable
chiral Potts model the weights depend on $a - b$
mod N, with a, b = 1, . . . , N, whereas the rapidities p
and q are living in general on a higher-genus curve.

Here one has to take either [1] or [2] as second line
of the equation, depending on which direction the
transformation is to go. The star–triangle transformation
thus defined is also known under other
names within the electric network theory literature
as wye–delta (Y–Δ), upsilon–delta (Υ–Δ), or
tau–pi (T–Π) transformation.
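The relations [1] and [2] for the Kennelly star–triangle transformation can be turned into a short worked example. This is a minimal sketch added for illustration: it assumes that $Z_i$ are the star (wye) branch impedances attached to terminal i and that $\bar Z_i$ is the triangle (delta) branch opposite terminal i, a labelling consistent with [1] and [2]. The function names are hypothetical.

```python
def star_to_triangle(z1, z2, z3):
    """Use [1]: Z_i * Zbar_i = Z1 Z2 + Z2 Z3 + Z3 Z1 for each i."""
    s = z1 * z2 + z2 * z3 + z3 * z1
    return s / z1, s / z2, s / z3


def triangle_to_star(zb1, zb2, zb3):
    """Use [2]: Z_i * Zbar_i = Zbar1 Zbar2 Zbar3 / (Zbar1 + Zbar2 + Zbar3)."""
    p = zb1 * zb2 * zb3 / (zb1 + zb2 + zb3)
    return p / zb1, p / zb2, p / zb3


def parallel(x, y):
    """Impedance of two branches connected in parallel."""
    return x * y / (x + y)


star = (1.0, 2.0, 3.0)                    # sample star impedances (assumption)
triangle = star_to_triangle(*star)        # (11.0, 5.5, 11/3)
back = triangle_to_star(*triangle)        # should reproduce the star values

# Impedance seen between terminals 1 and 2 with terminal 3 left open:
z12_star = star[0] + star[1]
z12_triangle = parallel(triangle[2], triangle[0] + triangle[1])
print(triangle, back, z12_star, z12_triangle)
```

Both directions use the same invariant $Z_i \bar Z_i$, which is why only the second line of the equation changes with the direction of the transformation. The same functions accept complex impedances.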

Figure 1 Star–triangle equation for impedances.

Figure 2 Spin model weights $W_{ab}(p,q)$ and $\bar W_{ab}(p,q)$.

When the weights are asymmetric in the spins, there
are two sets of star–triangle equations which can be
expressed both pictorially (Figure 3) and algebraically:
$\sum_d \bar W_{cd}(p,q)\,\bar W_{db}(q,r)\,W_{da}(p,r) = R(p,q,r)\,W_{ba}(p,q)\,W_{ca}(q,r)\,\bar W_{cb}(p,r)$ [3]
$\bar R(p,q,r)\,W_{ab}(p,q)\,W_{ac}(q,r)\,\bar W_{bc}(p,r) = \sum_d \bar W_{dc}(p,q)\,\bar W_{bd}(q,r)\,W_{ad}(p,r)$ [4]
Note that eqns [3] and [4] differ from each other by the
transposition of both spin variables in all six weight
factors. In general, there may also appear scalar factors
$R(p,q,r)$ and $\bar R(p,q,r)$, which can often be eliminated
by a suitable renormalization of the weights. If a, b,
and c take values in the same set, we can sum over
a = b = c, showing that $R = \bar R$ in that case.
The Kennelly star–triangle equation [1], [2] can be
recovered as a special limit of a spin model where
the states are continuous variables.

Knot Theory and Braid Group

A seemingly totally different situation occurs in the
theory of knots, links, tangles, and braids. In 1926,
Reidemeister showed that only three types of moves
suffice to show the equivalence between two
different configurations, see Figure 4. Moves of
type I – removing simple loops – do not apply to
braids. Moves of type II, for which one strand crosses
twice over another strand, can be reformulated for
braids, namely that an overcrossing is the inverse of
an undercrossing. The Reidemeister move of type III
is a precursor of the more general Yang–Baxter
moves and can be represented also by the defining
relations of Artin’s braid group. Let $R_{i,i+1}$ be the
operator representing the situation in which the
strand in position i crosses over the one in position
i + 1. Then a braid can be represented by a product
of $R_{j,j+1}$’s and their inverses, provided
$R_{i,i+1} R_{i+1,i+2} R_{i,i+1} = R_{i+1,i+2} R_{i,i+1} R_{i+1,i+2}$ [5]
and
$[R_{i,i+1}, R_{j,j+1}] = 0, \quad \text{if } |i - j| \ge 2$ [6]
and similar relations in which $R_{i,i+1}$ and/or $R_{i+1,i+2}$
are replaced by their inverses.
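The braid relations [5] and [6] can be checked on a concrete matrix representation. The sketch below is an illustration added here; the choice of the unreduced Burau representation, in which the crossing of strands i and i + 1 acts by the 2×2 block ((1 − t, t), (1, 0)) and by the identity elsewhere, is an assumption made for the example and is not discussed in the article. Function names and the parameter values are hypothetical.

```python
def identity(n):
    """n x n identity matrix as nested lists of integers."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def burau_generator(i, n, t):
    """Unreduced Burau matrix for the crossing of strands i and i+1 (1-based)."""
    m = identity(n)
    r = i - 1
    m[r][r], m[r][r + 1] = 1 - t, t
    m[r + 1][r], m[r + 1][r + 1] = 1, 0
    return m


def check_braid_relations(n, t):
    """Return True if the generators satisfy relations [5] and [6] on n strands."""
    g = {i: burau_generator(i, n, t) for i in range(1, n)}
    for i in range(1, n - 1):  # relation [5]
        lhs = matmul(matmul(g[i], g[i + 1]), g[i])
        rhs = matmul(matmul(g[i + 1], g[i]), g[i + 1])
        if lhs != rhs:
            return False
    for i in range(1, n):      # relation [6]
        for j in range(i + 2, n):
            if matmul(g[i], g[j]) != matmul(g[j], g[i]):
                return False
    return True


print(check_braid_relations(4, 2))
```

Integer values of t keep the arithmetic exact, so the comparison of matrices needs no tolerance. Neighbouring generators do not commute, which is why [5] is a genuine constraint rather than a consequence of [6].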

Figure 3 Star–triangle equation.

Figure 4 Reidemeister moves of types I, II, and III.

Factorizable S-Matrices and Bethe Ansatz

In the early 1960s, Lieb and Liniger solved the one-dimensional
Bose gas with delta-function interaction
using the Bethe ansatz. Yang and McGuire then tried
to generalize this result to systems with internal
degrees of freedom and to fermions. This led to the

discovery of the condition for factorizable S-matrices by McGuire in 1964, represented pictorially by Figure 5, where the world lines of the particles are given. Upon collisions the particles can only exchange their rapidities $p$, $q$, $r$, so that there is no dispersion. Also indicated are the internal degrees of freedom in Greek letters. In other words, the three-body S-matrix can be factorized in terms of two-body contributions and the order of the collisions does not affect the final outcome. McGuire also realized that this condition is all one needs for the consistency of factoring the $n$-body S-matrix in terms of two-body S-matrices. The consistency condition is obviously related to the Reidemeister move of type III in Figure 4.

Figure 5 Vertex model YBE.

Yang succeeded in solving the spin-1/2 fermionic model using a nested Bethe ansatz, utilizing a generalization of Artin's braid relations [5] and [6],

$$\check R_{i,i+1}(p-q)\,\check R_{i+1,i+2}(p-r)\,\check R_{i,i+1}(q-r) = \check R_{i+1,i+2}(q-r)\,\check R_{i,i+1}(p-r)\,\check R_{i+1,i+2}(p-q) \qquad [7]$$

He submitted his findings in two short papers in 1967. The $\check R$ operators in eqn [7] (a notation introduced later by the Leningrad school) depend on differences of two momenta or two relativistic rapidities. Sutherland solved the general spin case using repeated nested Bethe ansätze, while Lieb and Wu used Yang's work to solve the one-dimensional Hubbard model.

Vertex Models

Since Lieb's solution of the ice model by a Bethe ansatz, there have been many developments on vertex models, in which the state variables live on line segments and weight factors $\omega^{\lambda\beta}_{\alpha\mu}$ are assigned to a vertex where four line segments with the four states $\alpha$, $\mu$, $\beta$, $\lambda$ on them meet, see Figure 6.

Figure 6 Vertex model weight $\omega^{\lambda\beta}_{\alpha\mu}(p,q)$, mixed model weight $W^{\lambda\beta}_{\alpha\mu}\big|^{dc}_{ab}(p,q)$ and IRF model weight $w^{dc}_{ab}(p,q)$.

Baxter solved the eight-vertex model in 1971, using a method based on commuting transfer matrices, starting from a solution of what he then called the generalized star–triangle equation, but what is now commonly called the Yang–Baxter equation (YBE):

$$\sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''} \omega^{\beta''\alpha''}_{\alpha\beta}(p,q)\,\omega^{\gamma'\beta'}_{\beta''\gamma''}(q,r)\,\omega^{\gamma''\alpha'}_{\alpha''\gamma}(p,r) = \sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''} \omega^{\beta'\alpha'}_{\alpha''\beta''}(p,q)\,\omega^{\gamma''\beta''}_{\beta\gamma}(q,r)\,\omega^{\gamma'\alpha''}_{\alpha\gamma''}(p,r) \qquad [8]$$

This equation is represented graphically in Figure 5. From it one can also derive a sufficient condition for the commutation of transfer matrices and spin-chain Hamiltonians, generalizing the work of McCoy and Wu, who had earlier initiated the search by showing that the general six-vertex model transfer matrix commutes with a Heisenberg spin-chain Hamiltonian. To be more precise, Baxter found that if $\omega^{\lambda\beta}_{\alpha\mu} = \delta^{\lambda}_{\alpha}\,\delta^{\beta}_{\mu}$ for some choice of $p$ and $q$, some spin-chain Hamiltonians could be derived as logarithmic derivatives of the transfer matrix.

Interaction-Round-a-Face Model

Baxter introduced another language, that of the IRF or "interaction-round-a-face" model, in connection with his solution of the hard-hexagon model. This formulation is convenient when studying one-point functions using the corner-transfer-matrix method. Now the integrability condition can be represented graphically as in Figure 7 or algebraically as

$$\sum_{d} w^{a'd}_{cb'}(p,q)\,w^{a'b}_{dc'}(q,r)\,w^{dc'}_{b'a}(p,r) = \sum_{d'} w^{bc'}_{d'a}(p,q)\,w^{cd'}_{b'a}(q,r)\,w^{a'b}_{cd'}(p,r) \qquad [9]$$

The spins live on faces enclosed by rapidity lines and the weights $w^{dc}_{ab}(p,q)$ are assigned as in Figure 6.

Figure 7 IRF model YBE.
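The braid-type relation [7] can be checked directly in a small case. The sketch below does so with exact rational arithmetic for three strands carrying two-state spaces, using the illustrative choice $\check R(u) = 1 + uP$, where $P$ interchanges the two tensor factors. This choice of operator, the helper names (`matmul`, `kron`, `r_check`, `braid_sides`) and the sample rapidities are assumptions made for illustration; they are not taken from the article.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


# Identity on one two-state space and the permutation P on two of them.
I2 = [[1, 0], [0, 1]]
P = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]


def r_check(u):
    """Illustrative operator 1 + u*P on two neighbouring strands (assumed form)."""
    return [[(1 if i == j else 0) + u * P[i][j] for j in range(4)]
            for i in range(4)]


def braid_sides(p, q, r):
    """Return the left- and right-hand sides of relation [7] for i = 1."""
    def on_12(m):
        return kron(m, I2)   # acts on strands 1 and 2

    def on_23(m):
        return kron(I2, m)   # acts on strands 2 and 3

    lhs = matmul(matmul(on_12(r_check(p - q)), on_23(r_check(p - r))),
                 on_12(r_check(q - r)))
    rhs = matmul(matmul(on_23(r_check(q - r)), on_12(r_check(p - r))),
                 on_23(r_check(p - q)))
    return lhs, rhs


# Hypothetical sample rapidities; exact fractions avoid rounding issues.
p, q, r = Fraction(7, 3), Fraction(1, 2), Fraction(-5, 4)
lhs, rhs = braid_sides(p, q, r)
print(lhs == rhs)
```

Setting the spectral parameters aside, the same helpers give the Artin relation [5] for the bare permutations `kron(P, I2)` and `kron(I2, P)`.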

Baxter discovered a new principle based on eqns [8] and [9], which he called Z-invariance, as it expresses an invariance of the partition function $Z$ under moves of rapidity lines. This also implies that typical one-point functions are independent of the values of the rapidities, while two-point functions can only depend on the values of the rapidities of rapidity lines crossing between the two spins considered. Many recent results on correlation functions in integrable models depend on this observation of Baxter.

IRF-Vertex Model

In Figure 6, we have also defined mixed IRF-vertex model weights $W^{\lambda\beta}_{\alpha\mu}\big|^{dc}_{ab}(p,q)$. (We could put further state variables on the vertices, but then the natural thing to do is to introduce new effective weights summing over the states at each vertex.) With the choice made, a more general YBE can be represented as in Figure 8, or by

$$\sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''}\sum_{d} W^{\beta''\alpha''}_{\alpha\beta}\big|^{a'd}_{cb'}(p,q)\; W^{\gamma'\beta'}_{\beta''\gamma''}\big|^{a'b}_{dc'}(q,r)\; W^{\gamma''\alpha'}_{\alpha''\gamma}\big|^{dc'}_{b'a}(p,r) = \sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''}\sum_{d'} W^{\beta'\alpha'}_{\alpha''\beta''}\big|^{bc'}_{d'a}(p,q)\; W^{\gamma''\beta''}_{\beta\gamma}\big|^{cd'}_{b'a}(q,r)\; W^{\gamma'\alpha''}_{\alpha\gamma''}\big|^{a'b}_{cd'}(p,r) \qquad [10]$$

Figure 8 General YBE.

Quantum Inverse-Scattering Method

The Leningrad school of Faddeev incorporated the methods of Baxter and Yang in their so-called quantum inverse-scattering method (QISM), coining the term quantum YBEs (QYBEs) for eqn [8]. If special limiting values of $p$ and $q$ can be found, say as $\hbar \to 0$, such that $\omega^{\lambda\beta}_{\alpha\mu} = \delta^{\beta}_{\alpha}\,\delta^{\lambda}_{\mu} + O(\hbar)$, one can reduce [8] to the classical Yang–Baxter equations (CYBEs) by expanding up to the first nontrivial order in the expansion variable $\hbar$. These determine the integrability of certain models of classical mechanics by the inverse-scattering method and the existence of Lax pairs.

Checkerboard Generalizations

Star–triangle equations [3] and [4] imply that there are further generalizations of the YBEs, namely those for which the faces enclosed by the rapidity lines are alternately colored black and white in a checkerboard pattern. We can then introduce either vertex model weights $\omega^{\lambda\beta}_{\alpha\mu}(p,q)$ and $\bar\omega^{\lambda\beta}_{\alpha\mu}(p,q)$, or IRF-vertex model weights $W^{\lambda\beta}_{\alpha\mu}\big|^{dc}_{ab}(p,q)$ and $\bar W^{\lambda\beta}_{\alpha\mu}\big|^{dc}_{ab}(p,q)$, or IRF model weights $w^{dc}_{ab}(p,q)$ and $\bar w^{dc}_{ab}(p,q)$, see Figure 9.

Figure 9 Checkerboard versions of the weights.
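Returning to the classical limit described under the quantum inverse-scattering method, the classical Yang–Baxter equation is usually written as the vanishing of a sum of three commutators, $[X_{12},X_{13}] + [X_{12},X_{23}] + [X_{13},X_{23}] = 0$. The sketch below checks this identity exactly for the rational choice $X_{ij}(p_i,p_j) = P_{ij}/(p_i - p_j)$ on three two-state spaces. That choice of classical r-matrix, the helper names and the sample rapidities are assumptions made for illustration and are not taken from the article.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def scale(c, m):
    """Multiply every matrix entry by the scalar c."""
    return [[c * x for x in row] for row in m]


def add(a, b):
    """Entrywise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    """Return a*b - b*a."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


I2 = [[1, 0], [0, 1]]
P = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

# Permutations of pairs of the three spaces; P13 is P12 conjugated by P23.
P12 = kron(P, I2)
P23 = kron(I2, P)
P13 = matmul(matmul(P23, P12), P23)


def cybe_residual(p1, p2, p3):
    """Sum of the three commutators for X_ij = P_ij / (p_i - p_j) (assumed form)."""
    x12 = scale(1 / (p1 - p2), P12)
    x13 = scale(1 / (p1 - p3), P13)
    x23 = scale(1 / (p2 - p3), P23)
    return add(add(commutator(x12, x13), commutator(x12, x23)),
               commutator(x13, x23))


# Hypothetical sample rapidities, kept as exact fractions.
residual = cybe_residual(Fraction(3), Fraction(1, 2), Fraction(-2))
print(all(entry == 0 for row in residual for entry in row))
```

Because the arithmetic is exact, the residual vanishes identically rather than only to rounding accuracy.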

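The black faces are those where the spins of the spin model with weights defined in Figure 2 live; the white faces are to be considered empty in Figures 2 and 3 (or, equivalently, they can be assumed to host trivial spins that take on only a single value). Clearly, the IRF-vertex model description contains all the other versions.

Checkerboard Vertex Model

First we consider the checkerboard vertex model with weights $\omega^{\lambda\beta}_{\alpha\mu}(p,q)$ and $\bar\omega^{\lambda\beta}_{\alpha\mu}(p,q)$ as assigned in Figure 9. The YBE [8] then generalizes to two sets of equations:

$$\sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''} \omega^{\beta''\alpha''}_{\alpha\beta}(p,q)\,\omega^{\gamma'\beta'}_{\beta''\gamma''}(q,r)\,\bar\omega^{\gamma''\alpha'}_{\alpha''\gamma}(p,r) = R(p,q,r) \sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''} \bar\omega^{\beta'\alpha'}_{\alpha''\beta''}(p,q)\,\bar\omega^{\gamma''\beta''}_{\beta\gamma}(q,r)\,\omega^{\gamma'\alpha''}_{\alpha\gamma''}(p,r) \qquad [11]$$

$$\bar R(p,q,r) \sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''} \bar\omega^{\beta''\alpha''}_{\alpha\beta}(p,q)\,\bar\omega^{\gamma'\beta'}_{\beta''\gamma''}(q,r)\,\omega^{\gamma''\alpha'}_{\alpha''\gamma}(p,r) = \sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''} \omega^{\beta'\alpha'}_{\alpha''\beta''}(p,q)\,\omega^{\gamma''\beta''}_{\beta\gamma}(q,r)\,\bar\omega^{\gamma'\alpha''}_{\alpha\gamma''}(p,r) \qquad [12]$$

where scalar factors $R$ and $\bar R$ have been added as in [3] and [4]. These equations are represented graphically by Figure 10.

Figure 10 Checkerboard vertex model YBE.

Checkerboard IRF Model

The checkerboard IRF version of the YBE [8] becomes

$$\sum_{d} w^{a'd}_{cb'}(p,q)\,w^{a'b}_{dc'}(q,r)\,\bar w^{dc'}_{b'a}(p,r) = R(p,q,r) \sum_{d'} \bar w^{bc'}_{d'a}(p,q)\,\bar w^{cd'}_{b'a}(q,r)\,w^{a'b}_{cd'}(p,r) \qquad [13]$$

$$\bar R(p,q,r) \sum_{d} \bar w^{a'd}_{cb'}(p,q)\,\bar w^{a'b}_{dc'}(q,r)\,w^{dc'}_{b'a}(p,r) = \sum_{d'} w^{bc'}_{d'a}(p,q)\,w^{cd'}_{b'a}(q,r)\,\bar w^{a'b}_{cd'}(p,r) \qquad [14]$$

again with scalar factors $R$ and $\bar R$ added as in [3] and [4]. These equations can now be represented graphically as in Figure 11. Note that these equations reduce to eqns [3] and [4] if the spins on the white faces are allowed to take only one value, which means that they can be ignored.

Checkerboard IRF-Vertex Model

Finally, the most general case is represented by the checkerboard IRF-vertex model, with weights defined in Figure 9. For this case the YBEs are given by

$$\sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''}\sum_{d} W^{\beta''\alpha''}_{\alpha\beta}\big|^{a'd}_{cb'}(p,q)\; W^{\gamma'\beta'}_{\beta''\gamma''}\big|^{a'b}_{dc'}(q,r)\; \bar W^{\gamma''\alpha'}_{\alpha''\gamma}\big|^{dc'}_{b'a}(p,r) = R(p,q,r) \sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''}\sum_{d'} \bar W^{\beta'\alpha'}_{\alpha''\beta''}\big|^{bc'}_{d'a}(p,q)\; \bar W^{\gamma''\beta''}_{\beta\gamma}\big|^{cd'}_{b'a}(q,r)\; W^{\gamma'\alpha''}_{\alpha\gamma''}\big|^{a'b}_{cd'}(p,r) \qquad [15]$$

$$\bar R(p,q,r) \sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''}\sum_{d} \bar W^{\beta''\alpha''}_{\alpha\beta}\big|^{a'd}_{cb'}(p,q)\; \bar W^{\gamma'\beta'}_{\beta''\gamma''}\big|^{a'b}_{dc'}(q,r)\; W^{\gamma''\alpha'}_{\alpha''\gamma}\big|^{dc'}_{b'a}(p,r) = \sum_{\alpha''}\sum_{\beta''}\sum_{\gamma''}\sum_{d'} W^{\beta'\alpha'}_{\alpha''\beta''}\big|^{bc'}_{d'a}(p,q)\; W^{\gamma''\beta''}_{\beta\gamma}\big|^{cd'}_{b'a}(q,r)\; \bar W^{\gamma'\alpha''}_{\alpha\gamma''}\big|^{a'b}_{cd'}(p,r) \qquad [16]$$

with its graphical representation in Figure 12.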

Figure 11 Checkerboard IRF model YBE.

Figure 12 Checkerboard YBE.

Formal Equivalence of Languages

The Square Weight

Combining four weights of a checkerboard model in a square, as is done with four spin model weights in Figure 13, we find a regular vertex model weight with rapidities that are now pairs of the original ones. This gives

$$W_{\alpha\mu}(p_1,q_1)\,\bar W_{\mu\beta}(p_1,q_2)\,\bar W_{\alpha\lambda}(p_2,q_1)\,W_{\lambda\beta}(p_2,q_2) = \omega^{\lambda\beta}_{\alpha\mu}(p_1,p_2;q_1,q_2) \qquad [17]$$

Figure 13 Square weight as vertex weight.

From any solution of [3] and [4] we can thus construct a solution of YBE [8]. This has been used by Bazhanov and Stroganov to relate the integrable chiral Potts model with a cyclic representation of the six-vertex model.

Map to Checkerboard Vertex Model

The checkerboard IRF-vertex model formulation contains all other versions mentioned above as special cases. However, collecting the state variables in triples, we can immediately translate it to a vertex model version, writing

$$\omega^{\hat\lambda\hat\beta}_{\hat\alpha\hat\mu}(p,q) = W^{\lambda\beta}_{\alpha\mu}\big|^{dc}_{ab}(p,q), \qquad \bar\omega^{\hat\lambda\hat\beta}_{\hat\alpha\hat\mu}(p,q) = \bar W^{\lambda\beta}_{\alpha\mu}\big|^{dc}_{ab}(p,q),$$
$$\text{if}\quad \hat\lambda = (d,\lambda,c),\ \ \hat\beta = (b,\beta,c),\ \ \hat\alpha = (a,\alpha,d),\ \ \hat\mu = (a,\mu,b) \qquad [18]$$

$$\omega^{\hat\lambda\hat\beta}_{\hat\alpha\hat\mu}(p,q) = \bar\omega^{\hat\lambda\hat\beta}_{\hat\alpha\hat\mu}(p,q) = 0, \quad \text{otherwise} \qquad [19]$$

In eqn [19], we have set all vertex model weights zero that are inconsistent with IRF-vertex configurations. Clearly, the translation of IRF models and spin models to vertex models can be done similarly.

Map to Spin Model

We can, furthermore, translate each vertex model with weights assigned as in Figures 6 or 9 into a spin model with weights as in Figure 2 by defining suitable spins in the black faces, after checkerboard coloring. Each spin is then defined to be the ordered set of states on the line segments of the vertex model, $a = (\alpha_1, \alpha_2, \ldots)$, ordering the line segments counterclockwise starting at, say, 12 o'clock. We can then identify $\omega^{\lambda\beta}_{\alpha\mu}(p,q) = W_{a,b}(p,q)$, $\bar\omega^{\lambda\beta}_{\alpha\mu}(p,q) = \bar W_{a,b}(p,q)$. This is surely not very economical, as many of the weights will be equal, but it helps show that all different versions of the checkerboard YBE are formally equivalent.

Hence, we shall only use the vertex model language in the following. It is fairly straightforward to convert to the other formulations.
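As a concrete instance of a vertex-model solution, the sketch below checks the Yang–Baxter equation numerically for the symmetric six-vertex weights $a = \sinh(u+\eta)$, $b = \sinh u$, $c = \sinh\eta$, written in the widely used matrix form $R_{23}R_{13}R_{12} = R_{12}R_{13}R_{23}$ with difference-form rapidities. The parametrization, the matrix convention, the helper names (`six_vertex_r`, `ybe_defect`) and the sample values are assumptions made for illustration; they are not taken from the article.

```python
import math


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


I2 = [[1.0, 0.0], [0.0, 1.0]]
# Permutation of two two-state spaces, used to move an operator to spaces 1 and 3.
P = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]


def six_vertex_r(u, eta):
    """4x4 R-matrix with weights a = sinh(u + eta), b = sinh(u), c = sinh(eta)."""
    a = math.sinh(u + eta)
    b = math.sinh(u)
    c = math.sinh(eta)
    return [[a, 0.0, 0.0, 0.0],
            [0.0, b, c, 0.0],
            [0.0, c, b, 0.0],
            [0.0, 0.0, 0.0, a]]


def ybe_defect(p1, p2, p3, eta):
    """Largest entry of R23 R13 R12 - R12 R13 R23 for rapidities p1, p2, p3."""
    p23 = kron(I2, P)
    r12 = kron(six_vertex_r(p1 - p2, eta), I2)
    r23 = kron(I2, six_vertex_r(p2 - p3, eta))
    # R13 is obtained from an operator on spaces 1 and 2 by conjugation with P23.
    r13 = matmul(matmul(p23, kron(six_vertex_r(p1 - p3, eta), I2)), p23)
    lhs = matmul(matmul(r23, r13), r12)
    rhs = matmul(matmul(r12, r13), r23)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(8) for j in range(8))


# Hypothetical sample rapidities and crossing parameter.
defect = ybe_defect(0.3, -0.5, 1.1, 0.7)
print(defect < 1e-9)
```

The defect is compared against a small tolerance rather than zero because the hyperbolic functions are evaluated in floating point.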

An sl(m|n) Example

One fundamental example is a $Q$-state model for which the rapidities have $2Q+1$ components, $p = (p_{-Q}, \ldots, p_{Q})$, $q = (q_{-Q}, \ldots, q_{Q})$, etc., and the states on the line segments are arranged in strings of continuing conserved color. The vertex weights, for $\alpha, \beta, \lambda, \mu = 1, \ldots, Q$, are given by

$$\omega^{\lambda\beta}_{\alpha\mu}(p,q) = \frac{p_{+\beta}\,q_{-\mu}}{q_{+\lambda}\,p_{-\alpha}}\;\omega_0{}^{\lambda\beta}_{\alpha\mu}(p_0,q_0) \qquad [20]$$

with ($\lambda \ne \mu$)

$$\omega_0{}^{\lambda\lambda}_{\lambda\lambda}(p_0,q_0) = N \sinh[\eta + \varepsilon_\lambda (p_0 - q_0)]$$
$$\omega_0{}^{\mu\lambda}_{\lambda\mu}(p_0,q_0) = N\,G_{\lambda\mu} \sinh(p_0 - q_0)$$
$$\omega_0{}^{\lambda\mu}_{\lambda\mu}(p_0,q_0) = N\,e^{(p_0 - q_0)\,\mathrm{sign}(\lambda-\mu)} \sinh\eta \qquad [21]$$
$$\omega_0{}^{\lambda\beta}_{\alpha\mu}(p_0,q_0) = 0, \quad \text{otherwise}$$

where $N$ is an arbitrary overall normalization factor and $\eta$ is a constant. Furthermore, $\varepsilon_\lambda = \pm 1$ for $\lambda = 1, \ldots, Q$, where $m$ of them equal $+1$ and $n$ of them equal $-1$. The $G_{\lambda\mu}$'s are constants satisfying $G_{\lambda\mu} = 1/G_{\mu\lambda}$, which freedom is allowed because the number of $\lambda$-$\mu$ crossings minus the number of $\mu$-$\lambda$ crossings is fixed by the states on the boundary only, that is, the choice of $\alpha$, $\alpha'$, $\beta$, $\beta'$, $\gamma$, $\gamma'$ in YBE [8] and Figure 5.

The solution [20], [21] has many applications. The case $m = 0$, $n = 2$ leads to the general six-vertex model; the case $m = 0$ with general $n$ produces the fundamental intertwiner of the affine quantum group $U_q\,\widehat{sl}(n)$, whereas the case $m = 2$, $n = 1$ corresponds to the supersymmetric one-dimensional t–J model.

Operator Formulations

The R-Matrix

For a problem with $N$ rapidity lines, carrying rapidities $p_1, \ldots, p_N$, we can introduce a set of matrices $R_{ij}(p_i,p_j)$, for $1 \le i < j \le N$, with elements

$$R_{ij}(p_i,p_j)^{\beta_1 \ldots \beta_N}_{\alpha_1 \ldots \alpha_N} = \omega^{\beta_j\beta_i}_{\alpha_i\alpha_j}(p_i,p_j) \prod_{k \ne i,j} \delta^{\beta_k}_{\alpha_k} \qquad [22]$$

In terms of these, the YBE [8] can be rewritten in matrix form as

$$R_{jk}(p_j,p_k)\,R_{ik}(p_i,p_k)\,R_{ij}(p_i,p_j) = R_{ij}(p_i,p_j)\,R_{ik}(p_i,p_k)\,R_{jk}(p_j,p_k) \qquad [23]$$

where $1 \le i < j < k \le N$.

The Ř-Matrix

If we transpose the $\beta$ indices $i$ and $j$ in eqn [22], we can define a set of matrices $\check R_{i,i+1}(p,q)$ with elements

$$\check R_{i,i+1}(p,q)^{\beta_1 \ldots \beta_N}_{\alpha_1 \ldots \alpha_N} = \omega^{\beta_i\beta_{i+1}}_{\alpha_i\alpha_{i+1}}(p,q) \prod_{k \ne i,i+1} \delta^{\beta_k}_{\alpha_k} \qquad [24]$$

Using these, the YBE [8] can be rewritten in matrix form as

$$\check R_{i,i+1}(q,r)\,\check R_{i+1,i+2}(p,r)\,\check R_{i,i+1}(p,q) = \check R_{i+1,i+2}(p,q)\,\check R_{i,i+1}(p,r)\,\check R_{i+1,i+2}(q,r) \qquad [25]$$

and

$$[\check R_{i,i+1}(p,q),\,\check R_{j,j+1}(r,s)] = 0, \quad \text{if } |i-j| \ge 2 \qquad [26]$$

In this formulation, it is clear that many solutions can be found by "Baxterizing" Temperley–Lieb and Iwahori–Hecke algebras.

Classical YBEs

If we expand

$$R_{ij}(p_i,p_j) = 1 + \hbar\,X_{ij}(p_i,p_j) + O(\hbar^2) \qquad [27]$$

in [23], we get in second order in $\hbar$ the classical YBE (CYBE) as the vanishing of a sum of three commutators, that is,

$$[X_{ij}(p_i,p_j),\,X_{ik}(p_i,p_k)] + [X_{ij}(p_i,p_j),\,X_{jk}(p_j,p_k)] + [X_{ik}(p_i,p_k),\,X_{jk}(p_j,p_k)] = 0 \qquad [28]$$

introduced by Belavin and Drinfel'd, where $X_{ij}$ is called the classical r-matrix.

Reflection YBEs

Figure 14 Reflection YBE.

Cherednik and Sklyanin found a condition determining the solvability of systems with boundaries, the reflection YBEs (RYBEs), see Figure 14. Upon

collisions with a left or right wall the rapidity variable changes from $p$ to $\bar p$ and back. In most examples, in which the rapidities are difference variables such that $R(p,q) = R(p-q)$, one also has $\bar p = \lambda - p$, with $\lambda$ some constant. The corresponding left boundary weights are $K^{\beta}_{\alpha}(p,\bar p)$ satisfying

$$\check K_1(q,\bar q)\,\check R_{12}(\bar p,q)\,\check K_1(p,\bar p)\,\check R_{12}(q,p) = \check R_{12}(\bar p,\bar q)\,\check K_1(p,\bar p)\,\check R_{12}(\bar q,p)\,\check K_1(q,\bar q) \qquad [29]$$

with $\check K_1(p,\bar p)$ defined by a direct product as in [24], appending unit matrices for positions $i \ge 2$, and a similar equation must hold for the right boundary. Most work has been done for vertex models, while Pearce and co-workers wrote several papers on the IRF-model version.

Higher-Dimensional Generalizations

In 1980 Zamolodchikov introduced a three-dimensional generalization of the YBE, the so-called tetrahedron equations (TEs), and he found a special solution. Baxter then succeeded in proving that this solution satisfies all TEs. Baxter and Bazhanov showed in 1992 that this solution can be seen as a special case of the $sl(\infty)$ chiral Potts model. Several authors found further generalizations more recently.

Inversion Relations

When $\omega^{\lambda\beta}_{\alpha\mu}(p,p) \propto \delta^{\lambda}_{\alpha}\,\delta^{\beta}_{\mu}$, that is, the weight decouples when the two rapidities are equal, one can derive the local inverse relation depicted in Figure 15, which is a generalization of the Reidemeister move of type II in Figure 4. It is easily shown that $C(q,p) = C(p,q)$. This local relation implies also a global inversion relation which can be found in many ways. The following heuristic way is the easiest: consider the situation in Figure 16, with $N$ closed $p$-rapidity lines and $M$ closed $q$-rapidity lines. For $M$ and $N$ large, we may expect the partition function of Figure 16 to factor asymptotically in top- and bottom-half contributions. If each line segment carries a state variable that can assume $Q$ values, then the total partition function factors by repeated application of the relation in Figure 15 into the contribution of $M + N$ circles. Therefore,

$$Z = Q^{M+N}\,C(p,q)^{MN} \approx Z_{M,N}(p,q)\,Z_{N,M}(q,p) \qquad [30]$$

Taking the thermodynamic limit,

$$z(p,q) \equiv \lim_{M,N \to \infty} Z_{M,N}(p,q)^{1/MN} \qquad [31]$$

one finds

$$z(p,q)\,z(q,p) = C(q,p) \qquad [32]$$

In many models, eqn [32], supplemented with some suitable symmetry and analyticity conditions, can be used to calculate the free energy per site.

Figure 15 Local inversion relation.

Figure 16 Heuristic derivation of inversion relation.

See also: Affine Quantum Groups; Bethe Ansatz; Classical r-matrices, Lie Bialgebras, and Poisson Lie Groups; Eight Vertex and Hard Hexagon Models; Hopf Algebras and q-Deformation Quantum Groups; Integrability and Quantum Field Theory; Integrable Discrete Systems; Integrable Systems: Overview; The Jones Polynomial; Knot Invariants and Quantum Gravity; Knot Theory and Physics; Sine-Gordon Equation; Topological Knot Theory and Macroscopic Physics; Two-Dimensional Ising Model; von Neumann Algebras: Subfactor Theory.

Further Reading

Au-Yang H and Perk JHH (1989) Onsager's star-triangle equation: master key to integrability. Advanced Studies in Pure Mathematics 19: 57–94.
Baxter RJ (1982) Exactly Solved Models in Statistical Mechanics. London: Academic Press.

Behrend RE, Pearce PA, and O'Brien DL (1996) Interaction-round-a-face models with fixed boundary conditions: the ABF fusion hierarchy. Journal of Statistical Physics 84: 1–48.
Gaudin M (1983) La Fonction d'Onde de Bethe. Paris: Masson.
Jimbo M (ed.) (1987) Yang–Baxter Equation in Integrable Systems. Singapore: World Scientific.
Kennelly AE (1899) The equivalence of triangles and three-pointed stars in conducting networks. Electrical World and Engineer 34: 413–414.
Korepin VE, Bogoliubov NM, and Izergin AG (1993) Quantum Inverse Scattering Method and Correlation Functions. Cambridge: Cambridge University Press.
Kulish PP and Sklyanin EK (1981) Quantum spectral transform method. Recent developments. In: Hietarinta J and Montonen C (eds.) Integrable Quantum Field Theories, Lecture Notes in Physics, vol. 151, pp. 61–119. Berlin: Springer.
Lieb EH and Wu FY (1972) Two-dimensional ferroelectric models. In: Domb C and Green MS (eds.) Phase Transitions and Critical Phenomena, vol. 1, pp. 331–490. London: Academic Press.
Onsager L (1971) The Ising model in two dimensions. In: Mills RE, Ascher E, and Jaffee RI (eds.) Critical Phenomena in Alloys, Magnets and Superconductors, pp. xix–xxiv, 3–12. New York: McGraw-Hill.
Perk JHH (1989) Star-triangle equations, quantum Lax pairs, and higher genus curves. Proceedings of Symposia in Pure Mathematics 49(1): 341–354.
Perk JHH and Schultz CL (1981) New families of commuting transfer matrices in q-state vertex models. Physics Letters A 84: 407–410.
Perk JHH and Wu FY (1986) Graphical approach to the nonintersecting string model: star-triangle equation, inversion relation, and exact solution. Physica A 138: 100–124.
Reidemeister K (1926a) Knoten und Gruppen. Abhandlungen aus dem Mathematischen Seminar der Hamburgischen Universität 5: 7–23.
Reidemeister K (1926b) Elementare Begründung der Knotentheorie. Abhandlungen aus dem Mathematischen Seminar der Hamburgischen Universität 5: 24–32.
Yang CN (1967) Some exact results for the many-body problem in one dimension with repulsive delta-function interaction. Physical Review Letters 19: 1312–1314.
Yang CN (1968) S-matrix for the one-dimensional N-body problem with repulsive or attractive δ-function interaction. Physical Review 167: 1920–1923.
