You are on page 1of 169

Graduate Statistical Mechanics

Vijay S. Pande Department of Chemistry Stanford University

Chemistry 275: Graduate Statistical Mechanics

Instructor: Ofce: Ofce Hours: Lectures: Teaching Assistants:

Vijay Pande (pande@stanford.edu) 315 Keck TTh 1:30pm - 2:30pm TTh 11:00am-12:15pm Vishal Vaidyanathan (vvishal@stanford.edu)

Subject material: The properties of the macroscopic world are due to the nature of its constituent molecules and the manner in which they interact. Statistical mechanics is a fundamental bridge from this microscopic world to many macroscopic properties of interest. Textbooks: The primary texts will be McQuarries Statistical Mechanics and my notes (which can be found at http://www.stanford.edu/ pande/chem275). Other texts which are also useful include D. Chandlers Introduction to Modern Statistical Mechanics and K. Huangs Statistical Mechanics. Assignments: Readings from the texts, handouts, and problems. The problem sets are to be handed in for grading at the beginning of the lecture on the due date. Examinations and grading: There will be weekly homework sets, a take-home midterm and a take-home nal. The course grade will be 20% homework, 30% takehome midterm, and 50% nal. Late homework (or exams) will not be accepted without prior approval of the instructor.

ii

Course Outline
1. A brief review of thermodynamics Laws of thermodynamics, energy, entropy, thermal equilibrium & temperature, Legendre transforms, Maxwell relations (McQuarrie 1-4, Chandler 1, Huang 1) 2. Probability General denitions, probability distribution functions (PDFs), cumulant expansions of PDFs, important PDFs, sums of random variables, rules for large numbers (McQuarrie 1-5, Chandler 3.1) 3. Statistical thermodynamics I: Microcanonical ensemble Microcanonical ensemble, Two level system, ideal gas, mixing entropy and Gibbs paradox, (Chandler 3, Huang 6) 4. Statistical thermodynamics II: Canonical and generalized ensembles canonical ensemble, uctuations involving uncorrelated particles, two level system and ideal gases revisited, generalized ensembles, quantum gases (McQuarrie 2 - 4, Chandler 4, Huang 7) 5. Interacting systems I: The onset of interaction Real gases and cluster expansions, second virial equation and the van der Waals equation, mean eld theory of condensation, corresponding states (McQuarrie 12, lecture notes, Huang 10, Chandler 5) 6. Interacting systems II: Phase transitions Phase transitions, critical behavior, polymers, Ising model, mean eld theory (Chandler 5, Huang 14 15) 7. Interacting systems III: An introduction to modern theoretical techniques Monte carlo calculations, Landau theory, real space renormalization, from particles to elds (lecture notes, Chandler 6, Huang 17 18) 8. Statistical Mechanics of Non-equilibrium Systems Systems close to equilibrium, Onsagers regression hypothesis, chemical kinetics, diffusion, uctuation dissipation theorem, transition state theory (McQuarrie 21 and 22, Chandler 8)

iii

Contents
0 Preface 0.1 Why study thermodynamics and statistical mechanics 0.2 Thermodynamics vs Statistical Mechanics . . . . . . 0.3 Ideal gases: A simple system to play with . . . . . . 0.4 A brief glimpse of phase transitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv iv iv v v 1 1 1 1 2 4 4 6 6 6 6 7 9 9 10 10 11 11 13 13 14 16 16 17 18 18 19 20 20

1 A brief review of thermodynamics 1.1 An analogy to more familiar systems . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.1 Energy and the rst law . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.2 Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.3 State functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.4 Reversibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.5 Work required for reversible and irreversible isothermal transformations of a gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.1 Why we need thermodynamics: a.k.a. energy isnt everything . . . . . . . . 1.2.2 The second law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.3 A new state function: entropy . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.4 Entropy of reversible and irreversible isothermal transformations of a gas . . 1.2.5 Application: thermal equilibrium and temperature . . . . . . . . . . . . . . 1.3 Thermodynamic potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.1 Helmholtz free energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.2 Gibbs free energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.3 Grand potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.4 Another way to examine spontaenous processes . . . . . . . . . . . . . . . . 1.4 Useful mathematical results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4.1 Mathematical properties of state functions . . . . . . . . . . . . . . . . . . . 1.4.2 Gibbs-Duhem relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4.3 Maxwells relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.5 The third law of thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.6 Summary of the laws of thermodynamics . . . . . . . . . . . . . . . . . . . . . . . 1.7 Appendix A: Isothermal transformations of a gas, revisited . . . . . . . . . . . . . . 1.7.1 Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.7.2 Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.7.3 Heat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.8 Appendix B: Carnot engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

iv 1.8.1 Heat engines and refrigerators . . . . . . . . . . . . . . . 1.8.2 Carnot engines and the thermodynamic temperature scale . 1.8.3 Entropy is a state function . . . . . . . . . . . . . . . . . Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 21 22 25 27 27 27 27 28 30 33 34 35 36 36 37 37 39 43 43 44 44 44 45 45 46 46 47 47 50 52 56 59 59 59

1.9

2 Probability 2.1 Denitions and preliminary remarks . . . . . . . . . . . . . 2.2 One random variable . . . . . . . . . . . . . . . . . . . . . 2.2.1 Fundamental relationships . . . . . . . . . . . . . . 2.2.2 Ways to characterize PDFs: moments and cumulants 2.3 Four important PDFs . . . . . . . . . . . . . . . . . . . . . 2.4 Many random variables . . . . . . . . . . . . . . . . . . . . 2.5 Sums of independent random variables . . . . . . . . . . . . 2.6 Rules for large numbers . . . . . . . . . . . . . . . . . . . . 2.6.1 Summation of exponential quantities . . . . . . . . . 2.6.2 Saddle point integration . . . . . . . . . . . . . . . 2.6.3 Application: derivation of Stirlings approximation . 2.7 Appendix: Fourier transforms . . . . . . . . . . . . . . . . 2.8 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Statistical mechanics I: Microcanonical ensemble 3.1 A simple example from probability . . . . . . . . . . 3.2 Life at constant energy: the microcanonical ensemble 3.2.1 Microstate probability . . . . . . . . . . . . 3.2.2 Geometric interpretation of entropy . . . . . 3.2.3 Extensivity of entropy and its implications for 3.2.4 Relationship between , , and . . . . . . 3.2.5 Entropy is maximized at equilibrium . . . . . 3.2.6 Clausius law from a probabilistic perspective 3.3 Applications of the microcanonical approach . . . . 3.3.1 A non-interacting, two-level system . . . . . 3.3.2 The ideal gas . . . . . . . . . . . . . . . . . 3.3.3 Mixing entropy and Gibbs paradox . . . . . 3.4 Problems . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . .

. . . . . . . . . . . .

. . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

4 Statistical mechanics II: Canonical ensemble 4.1 Canonical ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.1 Derivation of the canonical ensemble from the microcanonical . . . . . . . .

v 4.1.2 Physical interpretation of the partition function . . . . . . . . . . 4.1.3 Fluctuations in the canonical ensemble . . . . . . . . . . . . . . 4.1.4 Interpretation of in terms of Laplace/Legendre transformations Applications of the canonical ensemble approach . . . . . . . . . . . . . 4.2.1 Two level system . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2.2 Ideal gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2.3 Equipartition theorem . . . . . . . . . . . . . . . . . . . . . . . Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 61 63 64 64 65 66 68 71 71 71 71 72 72 73 73 74 75 76 78 79 79 79 79 80 81 81 81 82 84 86 87 89 91

4.2

4.3

5 Statistical mechanics III: Grand canonical ensemble 5.1 Derivation of the grand canonical ensemble . . . . . . . . . . . . . . . . . . . . . . 5.1.1 Sketch of derivation of the grand canonical ensemble from the microcanonical 5.1.2 Probability of nding particles . . . . . . . . . . . . . . . . . . . . . . . 5.1.3 Fluctuations in the grand canonical ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1.4 Physical interpretation of 5.2 Applications of the grand canonical ensemble approach . . . . . . . . . . . . . . . . 5.2.1 Gibbs Entropy formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.2 Ideal gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.3 A simple chemical reaction . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.4 Generalization to all chemical reactions . . . . . . . . . . . . . . . . . . . . 5.3 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Adding quantum effects 6.1 Quantum statistical mechanics in the microcanonical ensemble . . 6.1.1 Approach common to Bose and Fermi statistics . . . . . . 6.1.2 Fermions . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1.3 Bosons . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1.4 Comparing Bose, Fermi, and Boltzmann gases . . . . . . 6.2 Applications of Bose and Fermi statistics . . . . . . . . . . . . . 6.2.1 Bose-Einstein distribution . . . . . . . . . . . . . . . . . 6.2.2 Fermi-Dirac distribution . . . . . . . . . . . . . . . . . . 6.3 The need for a quantum mechanical treatment in polyatomic gases 6.4 A diatomic molecule in more detail . . . . . . . . . . . . . . . . . 6.4.1 Vibrational modes . . . . . . . . . . . . . . . . . . . . . 6.4.2 Rotational modes . . . . . . . . . . . . . . . . . . . . . . 6.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

vi 7 Interacting particles 7.1 A simple example of interaction: van der Waals uid revisited . . 7.1.1 Free energy of a van der Waals uid . . . . . . . . . . . . 7.1.2 Going beyond the vdw approximation: a virial expansion . 7.1.3 Simple example of a gas-liquid transition . . . . . . . . . 7.2 Deriving virial coefcients from rst principles: cluster expansions 7.2.1 Derivation of the second virial coefcient . . . . . . . . . 7.2.2 Higher order virial coefcients . . . . . . . . . . . . . . . 7.3 The second virial coefcient and van der Waals equation . . . . . 7.3.1 Calculation . . . . . . . . . . . . . . . . . . . . . . . . . 7.3.2 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . 7.3.3 vdW equation of state . . . . . . . . . . . . . . . . . . . 7.4 Understanding the vdW equation of state . . . . . . . . . . . . . . 7.4.1 The critical point and the universal equation of state . . . 7.4.2 Breakdown of the van der Waals equation . . . . . . . . . 7.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Mean eld theory of phase transitions 8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 8.1.1 What is a phase transition? . . . . . . . . . . . . 8.1.2 What is mean eld theory? . . . . . . . . . . . . 8.2 Polymers . . . . . . . . . . . . . . . . . . . . . . . . . 8.2.1 Universal aspects of polymers: persistence length 8.2.2 Random walk . . . . . . . . . . . . . . . . . . . 8.2.3 Florys theory for a self-avoiding walk . . . . . . 8.2.4 Coil to globule transition . . . . . . . . . . . . . 8.3 Ising model . . . . . . . . . . . . . . . . . . . . . . . . 8.3.1 Non-interacting spins . . . . . . . . . . . . . . . 8.3.2 Mean eld Theory for interacting spins . . . . . 8.3.3 Mean eld theory near the critial point . . . . . . 8.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . 92 92 92 93 93 94 94 96 97 97 97 98 98 98 99 100 102 102 102 102 103 103 103 105 106 107 107 109 110 111 113 113 113 114 118 119

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

9 Field theoretic methods 9.1 From particles to elds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.1.1 What are elds? An example of the interaction of charged particles . . . . 9.1.2 Math of elds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.1.3 Scattering experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.1.4 Transforming from particles to elds: Hubbard-Stratonovich transformation

. . . . .

vii 9.2 Landau theory of phase transitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.2.1 Motivation from the Hubbard-Stratonovich transformation . . . . . . . . . . 9.2.2 Motivation from symmetries . . . . . . . . . . . . . . . . . . . . . . . . . . 9.2.3 Ehrenfest classication of phase transitions . . . . . . . . . . . . . . . . . . 9.2.4 First order phase transitions . . . . . . . . . . . . . . . . . . . . . . . . . . 9.2.5 Calculating scattering functions from Landau theory (for inhomogenous systems) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Breakdown of mean eld theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 122 123 124 125 126 128 131 134 134 134 136 136 138 138 140 141 141 145 145 146 148 150 150 151 152 153 154 154 155

9.3 9.4

10 Beyond mean eld theory 10.1 Critical phenomena . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.2 Exact solution of the one dimensional case: transfer matrix approach . . . . . . . . . 10.3 Real space renormlization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.3.1 Self-similarity: the heart of renormalization . . . . . . . . . . . . . . . . . . 10.3.2 Similarity to critical phenomenon . . . . . . . . . . . . . . . . . . . . . . . 10.3.3 Recursion relationship: renormalizing . . . . . . . . . . . . . . . . . . 10.3.4 Recursion relationship for a 2D random walk on a cubic lattice . . . . . . . . 10.4 Computer simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.4.1 Simulation techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.4.2 Two central issues in calculating thermodynamic properties . . . . . . . . . 10.4.3 Calculating thermodynamic properties from simulations that have reached equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.4.4 Calculating thermodynamic properties from simulations that dont easily reach equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Disordered systems 11.1 Spin glasses: the Ising model of disordered systems . 11.2 Which thermodynamic properties should we calculate? 11.3 Annealed-average partition function . . . . . . . . . . 11.4 Quenched partition function . . . . . . . . . . . . . . 11.4.1 High temperature (cumulant) expansion . . . . 11.4.2 Random Energy Model . . . . . . . . . . . . . 11.4.3 Problems . . . . . . . . . . . . . . . . . . . .

  

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

P REFACE

viii

0 Preface
0.1 Why study thermodynamics and statistical mechanics
Many (most?) objects we care about are fundamentally macroscopic (i.e. consist of many particles). However, the properties of these macroscopic objects derive from the microscopic nature of its constituent molecules. Statistical mechanics is a means to connect the microscopic and the macroscopic properties. In doing so, many degrees of freedom are intentionally thrown away, since there are too many to deal with (indeed, if we have moles of particles, we have many moles of degrees of freedom). Statistical mechanics provides a systematic means to do this. In comparison, thermodynamics is, in a sense, a set of relationships between various macroscopic properties which must hold, independent of the underlying microscopic nature. This is the elegance of (and a strength of) thermodynamics, and makes it useful for situations in which the microscopic picture is unknown, complex (or both). Historically, thermodynamics was developed rst, and then statistical mechanics was used to explain these principles starting from a molecular understanding. In this course, we will in many ways take a similar approach. The strength of studying this area in this manner is that it allows one to see what is independent of the molecular underpinnings and what derives directly from them. Actually, it is interesting to consider the historcal context of thermodynamics and statistical mechanics. Consider the late 1800s: the Industrial Revolution was just begining and steam engines were the high tech miracles of their day. While it is obvious that burning lots of coal leads to heat, the truly relvant question is what fraction of that heat can be used to do useful work? While this seemingly simple question started in many ways from the very practical need for efciency, it launched an entire eld of physical chemistry in its answering. Why is thermodynamics and statistical mechanics of relevance today? We are still wrestling with the same questions: the relationship bewteen work and heat and understanding what processes will happen spontaneously. Of course, weve moved past steam engines, but our modern high tech miracles (eg biotech and materials) are as mysterious to us now as steam engines were then. It is natural to look to see what fundamental laws of physical chemistry can come to our aid in these new challenges as well.

0.2 Thermodynamics vs Statistical Mechanics


What is the difference between Thermodynamics and Statistical Mechanics and why are both important? Thermodynamics is more fundamental and does not rely on the molecular nature of matter. For that reason, it can be extremely useful, especially in areas where the molecular detail is not needed. Statistical mechanics starts with the underlying molecular nature and builds from there, achieving the same results as Thermodynamics, but from a more microscopic nature. In a sense, the goal of Statistical mechanics is the connection between the micro and the macro worlds: to convert a problem with has over a mole of degrees of freedom (eg the positions and momenta of a mole of particles in a gas) to an understanding using just a few degrees of freedom (eg pressure, temperature, volume). This is extremely powerful and allows us to wrap our minds around problems which would be difcult to consider from a strict Newtonian (or even worse, Quantum mechanical) perspective.

P REFACE

ix

0.3 Ideal gases: A simple system to play with


Ideal gases are a natural place to start learning about thermodynamics. We have some intuition about how gases work and an ideal gas is a simple model to examine this understanding and compare to experiment. Like many types of matter, we characterize an ideal gas by certain properties: volume ( ), pressure ( ), temperature ( ), and how many moles of atoms are in the gas ( ). 1 There is a simple expression relating these quantities:

This equation is called the equation of state for this system, since it relates the state variables ( , , , and ) in this case. Actually, many gases behave like ideal gases in certain conditions (yea, you guessed it, in so-called ideal conditions). Whats missing from an ideal gas model? Ideal gases are useful, but boring they dont do anything. Real gases do interesting things: they can have phase transitions into liquid and solid phases. However, ideal gases cannot have phase transitions. Thus, far from the phase transition, they are good models, but do not work well near the transition. Much of statistical mechanics centers around phase transitions, since this is a good example of how macroscopic properties result directly from microscopic interactions. Thus, our next step is to put some of these additional properties into our model.

0.4 A brief glimpse of phase transitions


When gases cool, they condense to liquids. When liquids cool, they freeze into solids. These are two examples of phase transitions. What makes a phase transition? Interaction. Gas particles start to stick together at lower temperatures and form a liquid. How can we model this interaction? We can modify the ideal gas equation to include interactions. Well do so in two steps: 1. Whats the probability that a gas particle will bump into another one? The density is a lot like a probability that well nd a given particle at a given spot. If the density is high, then there is a high probability that the particle is there. The probability of nding two particles at the same spot goes like the . This is similar to the probability of ipping two coins and having them both come up heads: its the probability of one coming up heads squared. When two particles get close, they attract each other and lead to a force which brings them together. We can include this force as a modied pressure, i.e.

i.e. the true pressure is the original pressure (i.e. that in the ideal gas law) decreased by the strength of interaction between particles. The constant has to do with how strong is the attraction between atoms.
1 There are two types of properties here: Extensive properties are properties which are related to how much stuff there is. For example, and are extensive properties. If the system is duplicated, these variables get doubled. Intensive properties are independent of the size of the system. For example, , , are intensive properties. So is the density .

 ) 10('&

& P H 8DB 86  2 QIAGFECA@9754

#  %$"!  2 3& S R S b`XV a Y W  

(0.1)

P REFACE

2 1.75 1.5 1.25 1 0.75 0.5 0.25 -0.4 -0.2 0 0.2 0.4 0.6 0.8 log HV VcL 1

P Pc

Figure 0.1: Van der Waals theory for a uid: Figure 0.2: Temperature-volume relationship for equation of state. a NIPA gel (Tanaka lab, MIT).

2. Next, we have to also include the fact that two atoms cant be in exactly the same place. What this means is that if a gas of moles of particles are in a box of volume , each particle cannot be anywhere in that volume , as part of the volume is lled with the gas. If we say that the volume of an individual gas molecule is , then the total volume of the gas particles themselves is . Thus, the amount of space available is now only . Van der Waals hypothesis was that one could start with the ideal gas equation, but with the effective volume and kinetic pressure: Substituting the terms above leads to the vdW equation of state: (0.2)

The equation above is the Van der Waals equation and is the rst step to describe a non-ideal gas (and actually goes a long way). Whats different about a van der Waals (vdW) gas vs an ideal gas? vdW gases can change their phase and go from a gaseous to a liquid phase. This is called a phase transition. By this we mean that there is some major rearrangement of the particles which comprise the gas: in this case from the gas phase to the liquid phase. Lets explore this equation of state. If we look at pressure vs volume (or temperature vs volume), we see that there is a discontinuity in the volume as it jumps from one size to another. This discontinuity reects the fact that liquids take up much less volume than gases. However, as we raise the temperature, we nd that the transition becomes less and less discontinuous, and at a particular point (called the critical point), the transition is no longer discontinuous. Finally, above the critical point, there is no transition at all: there is no simple distinction between liquid and gas. To see this more clearly, we can do some algebra2 (also see Huang 2.3) and rewrite the vdW , , and equation in terms of the values of , , and at the critical point (

dU W U

U U

parameters , , and

2 To see this more clearly, we can do some algebra. We can nd out where the critical point is in terms of the molecular by the following means. For , there are three roots to the vdW equation. At

 )  1   ) 

c  H  $g54gf7 ed

#  %$"ug4 2 1) 2 yxv c  H    P w 

#  %utsEqAGpECA@ih rB  8DB 8 6

c $

, these

P REFACE ) instead of the microscopic parameters , , and . We nd a very simple relationship:

xi

What does this mean? It means that the microscopic parameters ( , ) are important in determining the critical point (i.e. , , and ) but otherwise are not important. In other words, we would expect that different gases would all follow the same equation (above), if we correct for differences in the values of , , and . The idea that the vdW equation above (or any equation of state) can universally describe many different types of gases is called the law of corresponding states and is a powerful property common to many systems in statistical mechanics. 3 In fact, the vdW equation does a pretty good job of describing other types of phase transitions (eg polymer chain or gel collapse transitions). For example, below Ive plotted the phase behavior of a N-isopropylacrylamide (NIPA) gel. In gels, there is a gas-like phase were the gel is swollen and a liquid-like phase in which it is collapsed. Of course, the polymeric aspect of the gel will lead to deviations from the vdW equation, but the general physical ideas are very much the same. Also, Id like to stress that phase transitions are not limited to gas, liquid, and solid. There is a huge multitude of different arrangements (phases) of matter, resulting from the numerous ways in which atoms can interact. For example, in my own work, I examine phase transitions of biological polymers (proteins and nucleic acids) to learn how these phase transitions are relevant for biological function.
three roots must fuse together. Thus, in the neighborhood of the critical point, the equation of state must take the form

We can expand the cubic to get

and compare this to the vdW equation at the critical point (

), i.e.

which we can multiply through to get

which we can rewrite as

Comparing terms, we get

3 I cant resist editorializing a bit here. While many systems in physical chemistry are universal, not all systems are. Most noticeably, biological systems are often very dependent on chemical details and one must be careful in the examination of which (if any) aspects are universal. Universality must be demonstrated, not assumed!

R R W T R } W S U I W R S R T R T a y W y S W w S} s a y S W } a S a n  W } j T T T t { W y uS  S 3r3R R t y S R R U

w S T a I3R W T a y v'S  T a 3 o~t y S U R t w R S R s s

T a S

If we multiply by

, we get

and

. Thus,

c P

U R t R T t I3R W S a y vuS a 3R v'S T

I3R W w S w S a 0T U R t R s s T W T dU W U S t S } S } { |W y uS oS ~t y S

h n oh j i k2  q m4 p l H h x w { yw S t |W zx vuS s

c P

g  r $ r u

) $fbex

(0.3)

. From here, we get

P REFACE

xii

Weve seen here a glimpse of the rest of the course: interacting systems, phase transitions, critical points, universalities. We have also gone far with only some simple intuition, but extending this requires a more general understanding, which is the body of this course. Before we get to this good stuff, we will have to lay a good foundation for our understanding, starting with a review of thermodynamics.

BRIEF REVIEW OF THERMODYNAMICS

1 A brief review of thermodynamics


1.1 An analogy to more familiar systems
1.1.1 Energy and the rst law The laws of thermodynamics start with a physical rule which has been beaten into most students by now: energy is conserved. However, this doesnt mean that the energy of something of interest will remain constant, but rather than energy cannot be created or destroyed and instead is just shufed around. In particular, it can be shufed into two main types of energy dealt with in thermodynamics: heat and work. This relationship is simply stated as

and this is called the rst law of thermodynamics. In statistical mechanics and thermodynamics, we often deal with small, innitesimal changes in given quantities. We can rewrite the above in terms of small changes (1.2) Note that weve put a slash through the differentials in and (these types of differentials are called inexact differentials). This denotes the fact that while the total energy can be calculated , there is no sense of total heat and it does not make sense to write . as Furthermore, this means that is meaningless as well, and thus we write instead, with the integral , where and denote initial and nal states respectively. Work is a pretty straightforward concept. It typically takes the form of the product of some forcelike term with an extension-like term. For example, work to extend a distance against a force takes the form . Analogously, it requires work to compress a gas by volume at pressure , thus in this case . The minus sign arises since it takes force to compress the system. Heat is typically a more elusive concept. In many ways, thermodynamics is all about heat and how it gets shufed around (hence the name thermo dynamics). One doesnt need to have a molecular understanding of heat to use thermodynamics, although ones intuition for what heat is builds as one works through thermodynamics. Finally, one important property of energy is that it is extensive, i.e. if two parts of a system have energies and respectively, the total system has energy . Non-extensive properties (consider the density or temperature) are called intensive. 1.1.2 Equilibrium Thermodynamics and statistical mechanics deals with equilibrium phenomena. By equilibrium, we mean the state at which the system will rest essentially indenitely. A second way to think of equilibrium is that it is the conguration of the system such that all the forces are balanced. There are several, equivalent ways of understanding equilibrium. 1. point where the forces are balanced 2. state at which the system has minimized its potential energy

2 gue w

(w

 H %  ) 0"& 1 2 0 

(1.1)

BRIEF REVIEW OF THERMODYNAMICS

3. small changes require the maximum amount of work 4. state at which the system will remain if we wait for a long time To make thermodynamic concepts more clear, it is often useful at rst to make an analogy to mechanical systems. For example, consider a pendulum. To understand how a pendulum works, it is natural to think about how the energy of the pendulum varies vs the angle it makes with the vertical. When the pendulum is 90 degrees from vertical, it has energy (where is the mass of the pendulum, is the gravitational constant, and is the height). At an arbitrary angle , the work is . When we let the pendulum go, this work is completely lost to kinetic energy at the bottom. However, we know that the total energy must be conserved, and thus

is converted into heat. If there was no friction, this process of potential energy turning into kinetic and back would continue forever. However, in the real world, there is almost always some form of friction (or some equivalent) which dissipates energy into the surroundings. In the pendulum example, this makes the pendulum lose kinetic energy during each swing. With less kinetic energy, it cannot go up as high, and eventually simply rests at the bottom. One way to predict this behavior is to look for the minima in energy. This is where the system will remain indenitely. Also, at this point, the force (which is equal to ) is zero (i.e. the tension force balances that of gravity). Note that these are precisely the ways weve dened equilibrium in general, and thus this is the equilibrium state of the pendulum. The concept of examining energy minima and energy proles will be carried over into more our statistical mechanical concepts, as we will introduce thermodynamic generalizations of the energy (enthalpy, free energy, etc). Finally, one important aspect of equilibrium is the fact that if two systems A and B are separately in equilibrium with a third system C, then they are also in equilibrium with each other. This is often called the zeroth law of thermodynamics. Despite its simplicity, this concept is important for the concept of temperature, as we will see later. 1.1.3 State functions One aspect which is crucial to thermodynamics is the concept of a state function. State functions do not depend on the path of the transformation, but only on the end points. It is important to notice that depends only on the initial and nal states and not the path taken and thus energy is a state function. However, different paths may lead to different values of and (and thus heat and work are not state functions). We can illustrate this by considering two ways we can lower a block on a ramp. 1. Connect the block to a body of equal mass and let the block down slowly. In this case, the system is in equilibrium, since the masses of the two bodies are equal (and the gravitational and tension forces are therefore balanced). If we give the block a small push downward, the block will slowly move down the ramp. Since the block moves down innitesimally slowly, it will

H s%`

% 0

"

l  1FiH 0de  #

BRIEF REVIEW OF THERMODYNAMICS

(a) mg

(Vi,Ti)

T
(b)
HEAT

(Vf,Tf)

V
Figure 1.2: Different paths in space. In both cases, the energy is the same, i.e. independent of path. Path independence is a hallmark property of state functions.

Figure 1.1: Different paths for lowering a block.

generate no heat. Thus, The block will be able to raise the body (do work) without generating any heat, and So, by letting the block down in this way, weve converted all of the original potential energy in the block into new potential energy of the second body. All of the energy is in the form of work.1 2. Just let it go. If we didnt attach any additional mass and just let the original block go down the ramp, all of the potential energy would be converted into heat (via friction). Thus,

In both cases, we have lower the block (and thus used its potential energy), but for different uses. The key point is that the change in energy is the same (its equal to the original potential energy in the block). All paths you can imagine would lead to the same , although they could have different combinations of heat and work done. Finally, we see that the generation of heat accompanies irreversible transitions, where as reversible transitions do the maximal amount of work ( and thus generate no heat ).
1 This is also a good example of a reversible transformation, which we will discuss further below. Reversible systems are (1) typically done slowly, consisting of many small steps; (2) the system is at equilibrium during each step of the transformation (we see this since the forces are balanced at each step); and (3) the maximal amount of work is done, i.e. no heat is generated.

  %q

H e

H e

BRIEF REVIEW OF THERMODYNAMICS

1.1.4 Reversibility Reversibility is another important concept in thermodynamics. A transformation is said to be reversible if it can spontaneous revert to the previous state, without the addition of any extra work. Most processes in life are irreversible, i.e. once changed, the system does not return back to the initial condition. For example When a drop of dye is placed in a bucket of water, it spreads throughout the bucket: it rarely spontaneously comes together to form the initial droplet

and on a One can imagine an reversible process as follows. Lets say we have two masses pulley. If the masses are equal , then the system is in equilibrium. If one mass is larger by some signicant amount, eg , then the system would quickly change (mass 1 would fall). As the mass falls a distance , this work will be (note that while mass 1 is going down, mass 2 is being pulled up!). , i.e. , where , However, we could make only a small change in then only an innitesimal quantity of work would be required to restore the system. Also, at every point during the reversible process, which is virtually indistinguishable from the equilibrium (since is so small). Thus, to summarize, reversible systems occur in situations when the system is essentially in equilibrium during the transition and at each step, only an innitesimal amount of work would be necessary to truly restore equilibrium. 1.1.5 Work required for reversible and irreversible isothermal transformations of a gas What would the work be for an irreversible reaction? If we just let the system go, then we would have just one step, and the work will be (for an isothermal, i.e. constant temperature, transition)

We make one big, fast change. For a reversible transformation, we need to make small steps, reaching equilibrium at each small step, as if we never left equilibrium. We can imagine the series of these small steps as the sum

and in the limit of an innite number of small steps, we get

In other words, doing an integral is like doing a larger series of summations, each innitesimally small. Each summation here is the work of making a small change in volume. Thus, the integral is

) Q

  il H #  %$" gH    # %$  g

w 2 XQ

i H   9z bFipvmEB '

 H  EB

w 2 y

 QH 

w 2 5 2

 

 "

 H %zpp3x BD

2 5

If you start out with a deck of cards which is ordered (say by suite and card value) and then shufe the deck, it never (on time scales we live in) comes back to the original order

BRIEF REVIEW OF THERMODYNAMICS

the mathematical manifestation of a reversible process. Just as reversible processes combine a large number of small steps, an integral is the sum of an innite number of innitesimal steps. How do reversible vs 1-step transformations differ for ideal gas? The reversible reaction would do

If you stick in numbers for the ratio , youll see that the reversible work case always does more work (i.e. requires less work to be done on it). Finally, in the limit when the ratio of the old and new volumes is close to 1 (i.e. the change is small), then we expect that the two equations should be equal. Physically, this occurs since a small change in volume should be reversible (as its only a minor change in the system from equilibrium). If is close to 1, then we can write the Taylor series approximation for to rst order. Thus, for a small change, we get

Just as we suspected, for a small change in volume (which is a small perturbation and thus a reversible transformation), the two equations are the same. However, for larger changes in volume, the reversible transformation will always yield more work. Finally, imagine performing a cycle of compression, then expansion of a gas. How much work is required? In the case of a reversible transformation, expansion of the gas followed by a compression does

On the other hand, if we do a cycle comprised of irreversible transformations, we get a different result. For example, if we do 1-step expansion followed by 1-step compression, we get

which is always greater than zero (unless , which is the limiting case of a reversible-like transformation, i.e. innitesimal volume change). Thus, we see that the reason why a reversible transformation is reversible is that the work done on the system in compression is balanced by the work done by the system on expansion. This may have seemed like an aside, but it will be important for our reasoning in understanding the next arguments.

n i  i  " #  Fu%u%H #  p'%$e  g  g

  n  j  l il   BD BD #  %ue H #  %$w H #  %$" vpp3x w zF3x zF3x BD qEB xvG    2  QH   g

H  `Ep

 FgH 

 1)  n  j i  #  p'%ut #  p% uH   g    H H pv~%#uH |%%#$H # %u  1)  zB 8 F   il H #  %ue%zB  g

 j F'%#$w i  p'%$e EB #  w EB %EB qEB xzG '  g

  1) m

EB

'

BRIEF REVIEW OF THERMODYNAMICS

1.2 Entropy
1.2.1 Why we need thermodynamics: a.k.a. energy isnt everything Our analogy of a block sliding down a ramp is a good one in many ways, but does seem to break down in certain situations. It is an example of an exothermic reaction, i.e. heat is given off. This occurs in the block case since the potential energy of the block is turned into heat by friction. This heat raises the temperature of the surroundings. However, there are endothermic reactions too. In these cases, heat is absorbed (and the temperature goes down) and the total energy goes up. In the block example, this would be like the block spontaneously going up ramp! Clearly there is something shy here. Consider another shy example: a large box containing a much smaller sealed container of argon gas. What happens when we open the gas container? The gas will leave the container and eventually ll the box. Why? What is the force which, in some sense, pushes the gas out of its original container and makes it ll the larger box? As we will see in later sections, this force is due to entropy. Thus, entropy is the concept which is truly unique to thermodynamics and the missing part from mechanics. 1.2.2 The second law What can we learn from this model system? One immediate question we can address deals with perpetual motion machines. The rst law (energy conservation) rules out machines which can produce work without consuming any energy. However, there is a second type of perpetual motion machine: those which create order without any additional work or energy (for example: a refrigerator is a machine which can pull the heat out of water energy is still conserved, but yet this doesnt happen spontaneously). These types of machines are ruled out by the second law of thermodynamics. The principal observation here is that the natural direction for heat ow is from hotter to colder bodies. There are many ways to state and reformulate this idea, including these two famous statements: 1. No process is possible whose sole result is the complete conversion of heat into work (Kevin) 2. No process is possible whose sole result is the transfer of heat from a colder to a hotter body (Clausius) Thus, (1) rules out a perfect (100% efcient) engine and (2) rules out a perfect refrigerator. These two statements are equivalent. One way to show this is to demonstrate that if one is violated than so is the other (this is done in Huang 1.3). 1.2.3 A new state function: entropy Clausius later used the second law, as formulated above, and the algebra of Carnot engines to dene the following state function

zB

  H   zIvb5

(1.3)

BRIEF REVIEW OF THERMODYNAMICS

When the path is reversible, then the inequality becomes an equality. 1.2.4 Entropy of reversible and irreversible isothermal transformations of a gas Lets think about what this means in terms of entropy. in terms of our example of the isothermal transformations of an ideal gas. We know that from the rst law and that for isothermal transformations. Thus, , i.e. the heat gained by the surroundings is equal to work done by the system. If we continuously adjust the pressure during this transformations, we can imagine a reversible transformation. As weve discussed, this does the maximum work, and we get For an ideal gas, we have and thus

This is the change in entropy of the gas. We see that expanding a gas ( ) increases its entropy and compressing a gas ( ) decreases its entropy. This explains why a gas will always ow from a smaller container into a larger one. From the above discussion, we showed that spontaneous transformations will occur if the entropy increases. If a gas moves into a larger volume, then its entropy will increase. This also explains why the gas doesnt spontaneously compress itself into a smaller volume: that would decrease the entropy and so that cannot occur spontaneously. We can make it happen, but we have to do work to decrease this entropy. Next, lets consider an arbitrary (probably irreversible) transformation, which generates work and heat (and well assume were talking about an ideal gas at constant temperature , such that ). Since (this becomes an equality when is a reversible work), we must have . In hindsight, this is not a surprise, since we know that from Clausius theorem

The idea of doing work to decrease entropy is common throughout all of thermodynamics. In fact, if we were happy with what a state of maximum entropy gives us, then we wouldnt have to do any work and would let thermodynamics just do its thing. However, we like order in our life and thus we do work to ght the disorder that entropy typically leads to. Well talk more about the relationship between entropy and disorder in the coming sections. If we talk about entropy being related to disorder and work that cannot be used, then why is the entropy change the same for irreversible and reversible reactions (recall that is a state function)? Well, we have to look at not just the entropy for the gas but the entropy of everything. Lets take these two cases one by one:

l  ) 

EB

 1) 

|zB

) EB

EB

Since

, we get

"

 #  F$"  EB #   pu%#$"  %ue%zB H %EB   0  ) #  1%u"4  %EB H %zB ' 0

and then to prove the following relationship.

0 (

(1.4)

BRIEF REVIEW OF THERMODYNAMICS

1. For the reversible case, if we expand a gas, then its entropy increases:

However, when we expand a gas reversibly, we are doing some work. In fact, the amount of work we do on the system is equal to the amount of heat that we transfer to the surroundings (since for isothermal transformations of an ideal gas). In this case, since there is a change in reversible heat of the heat bath, the heat baths entropy changes. How much? We know that the work done on the surroundings is the negative of the work done by the gas, so we get . This leads to and thus the total entropy is unchanged

This balances out because during compression (expansion), work done on (by) the gas is exactly balanced by the work done by (on) the system. As for ideal gases, , and thus there is a balance of heat as well. Thus balance of heat, leads to a balance of entropy, thus leading to the total change in entropy to be zero. Thus, in reversible transformations, entropy is exactly exchanged between the system and its surroundings.

However, what about the entropy change in the bath? When we expand the gas irreversibly in one step, we dont do any work on the system we just let the gas go. Thus, . Since, the system absorbs no heat from the bath, there is no compensating entropy change in the bath, or and thus the total entropy change is

Thus, in the irreversible expansion, the total entropy increased. This is also what we expected: we said that spontaneous transformations occur if the entropy increases. Here we see that it does. Once the gas has expanded as far as it can, the entropy is at its maximal value and after that the gas is at equilibrium and entropy rests at its maximal value. Thus, in order to keep track of entropy changes, we must consider both the system and its surroundings. This will make life tricky, but it makes sense: the entropy of a substance can decrease (we know that gases can be compressed), but this reduction in entropy must be compensated by an increase in entropy somewhere else (eg in the heat bath). In fact, if we compress the gas irreversibly, then the added entropy of the bath will be greater than the entropy lost in the gas.

sp D

 p$"y #  w FC D w FFo DD 

2. For irreversible reactions, we are not at equilibrium. As the gas is the same as in the reversible case

is a state function, the entropy of

ep D

"

 FutI #   D FC   "FC D #  FutI w H I

po DD

H r

H p

QF D

BRIEF REVIEW OF THERMODYNAMICS

1.2.5 Application: thermal equilibrium and temperature We can apply the ideas discussed in the previous section to a simple, but instructive example: how do two connected systems reach thermal equilibrium. Imagine two boxes (labeled 1 and 2) connected such that energy (heat) can ow between them in a reversible fashion, but no work is done. Also, these boxes are initially at two different temperatures ( and ). If heat is transfered from and for box 2 is box 1 to 2, then we would say that the entropy change for box 1 is and thus we have to total entropy change

Using our understanding between entropy and equilibrium above, we can immediately state the conditions for equilibrium: which means that .2 Another way to think about this is to note that since no work is done ( ), then we have and thus and

This is another general and useful result.3 Since heat ow and temperature are directly related, it is natural to quantify this relationship by introducing the concept of a heat capacity. For a quasi-static process, we can dene the heat capacity

, we need to know whether any work is done or not (eg is or constant). Since Thus, one typically denes heat capacities at constant pressure or volume (in the case of gases). Finally, since is extensive, so is . This makes sense since the amount of heat should be proportional to the number of particles in the system.

1.3 Thermodynamic potentials


We have now seen that in mechanical systems, equilibrium is found by minimizing energy, and in adiabatic systems, equilibrium is found by maximizing entropy. What happens when energy is not constant, but rather we x temperature and allow heat to ow from the system to a xed temperature bath? What function describes equilibrium in this case? Similarly, what happens if we keep the pressure constant and allow the volume to change? To address these questions, it is natural to introduce new thermodynamic potentials. While it may simply seem like one is introducing more and more functions to deal with special cases, there is a
is a solution to this equation too. This physically means that we have broken the connection between the two boxes and that heat can no longer ow between them (and thus nothing changes and the system is at equilibrium). This is of course another way to reach equilibrium, but not particularly interesting one. 3 As I mentioned, it holds for . When is no work done? For example, the work associated with an ideal gas . Thus, for gases at constant volume ( ) no work is done. is

{ |W S

{ `W

{ `W

2 Technically,

~EB

7f)fEB 2 0 EB  n 2 h j H l EB l

e e  %) 

l )~ tEB ) 2 eI

EB

2 H

zB

0 7

w 7 EB

EB

H s

2 fEB ) '

w qfEB

S ht W T 

H 2

(1.5)

BRIEF REVIEW OF THERMODYNAMICS

10

unifying picture here and an elegance to this approach. First, I will introduce a heuristic approach to explain why we need these multitude of potentials and then motivate how these potentials are actually all related in a simple manner. The principal motivation for introducing potentials is to keep our intuition relating energy minimization and equilibrium in mechanical systems. We know that the energy minima is the equilibrium state in those systems. We now want to generalize this picture. In these generalizations (i.e. new potentials), the intuition that the minimization of the potential is the equilibrium state remains. 1.3.1 Helmholtz free energy For example, consider a case in which our system of interest is in contact with a heat bath of xed temperature . Heat can ow to and from the bath, but the system has a xed temperature. Now, consider a isothermal transformation from A to B. From the second law, we have

The terms come up so often, we give them a new name is the change in the Helmholtz free energy. We can integrate this change up to get (at constant ) . Note that since is constant and and are state functions, then must be a state function as well. Note that we could have done this derivation above for a reversible transformation. In that case, we would have had equalities instead of inequalities and said that and thus (note that since we are talking about specically the reversible case here, we have to denote work and heat explicitly for this path, and ). Now, consider case in which no work is done on the system: . For gases (or systems in which the only work is work, then this is the case where , i.e. volume is constant). In this case, we have (1) in general and (2) at equilibrium. The rst statement talks about spontaneous transformations. Just as we expect that energy will spontaneously decrease in mechanical examples (block sliding down the ramp), we expect that free energy should spontaneously decrease in cases of xed and . Secondly, we see that at equilibrium, the free energy does not change. Thus, equilibrium in this case can be thought as reaching a free energy minima, just as we thought of equilibrium in the mechanical case as reaching an energy minima. 1.3.2 Gibbs free energy

What if some work is done on the system (eg we have xed pressure instead of volume, so ) but the temperature is held xed? In this case, we would write

H %Q

H s

where is the heat absorbed during the transformation. Using the rst law ( can write

H k

e%EB

~0

EB

'

H !

H 4

H !

zB

0

 H  0

H 4

H !

H |k

e

Since

is a constant, we can use the fact that

0 4 zbIbe   H  

and write

) we

EB

'  H

BRIEF REVIEW OF THERMODYNAMICS

11

In this case, we dene a new potential, the Gibbs free energy and nd . As in the case above, spontaneous changes occur only if and equilibrium means that . 1.3.3 Grand potential Finally, lets consider another type of work: the work required to add or remove particles from our system. In many cases, the number of moles of particles is xed. However, we can also imagine studying chemical reactions, such as , in which the numbers of molecules in the reactants and and product may change, and thus the number of particles is not constant. In this case, we say that for each chemical species , there are moles of particles, and the work required to add a mole of particle is . Thus, in this case, we would say that the total work takes the form (assuming constant ) (1.6)

From the grand potential, we can immediately see what would be the equilibrium conditions for chemical reactions (or phase transitions any reaction in which products are turned into reactants). Consider a very simple reaction where the total number of particles is xed, but the number of s and s can change within the constraint . In this case, we nd

(where we have used the fact that , i.e. the total number of particles is not changing). Thus, equilibrium (i.e. ) occurs when the chemical potentials are equal . The equality of the chemical potentials is called phase equilibrium, which should not be confused with the more general concept of equilibrium. Of course, to go any further, we must be able to compute the chemical potentials for a material of interest. 1.3.4 Another way to examine spontaenous processes Instead of looking to the second law to gauge whether processes will happen spontaneously, lets look at the work. Something happens spontaneously if we dont need to do any work to make it happen. Thus, lets look to the work to gauge spontonaeity. Consider a gas which can be compressed or expanded. When the gas expands, it does work on the gas, but it can also do some additional work , such as lifting a small weight. In this case, the total work is

 H

 gv  H 

w

 w   w tu 

 

w x

H H 4e

 H %EB

 

'

and thus takes the form

" 

H 4

and thus, our new potential with variable

obeys the relationship at equilibrium (1.7) (1.8)

  w H ~yu%!5

w H '%44 

"

5  w  

H m

t H Q m H  H     H 4   

BRIEF REVIEW OF THERMODYNAMICS

12

and we will say that something will happen spontaneously if we dont need to do any additional work, i.e. if . From the rst law, we get

3. Constant and : In this case, . Thus, we get 4. Constant and : In this case, . Thus, we get

(the system is isolated so no heat can ow) and

(the system is isolated so no heat can ow) and

Thus, we can use this to determine what statefunction to examine to tell whether something will happen spontaneously. With the above, we can make a table of constraints and the appropriate potentials

We see that there are natural partners (eg and ; and ) of which one is held constant while the other can vary. These pairs are often called conjugate variables, and usually take the form of an intensive, force-like state function (eg pressure) paired with an extensive, extension-like state function (eg volume). The elegance of thermodynamic potentials is that they all use the same physical intuition: equilibrium means nding the minima, and the choice of which potential to use is simply related to the relevant (i.e. knowing which aspects are held xed [and therefore not allowed to do work]).

potential Energy Entropy Helmholtz free energy Gibbs free energy Grand potential

symbol held constant , , , , , , , , , ,

2. Constant

and

: In this case,

 y  w qEB qEB ! u 4  w H 5 t%EB H 4 x%zB

and

"

1. Constant

and

: In this case,

and

Now, lets look at some specic cases:

 w EB

H 4





Solving for

, we get

. Thus, we get

. Thus, we get



w x

 H bEB

%zB

w EB

e

e



BRIEF REVIEW OF THERMODYNAMICS

13

1.4 Useful mathematical results


1.4.1 Mathematical properties of state functions State functions have important mathematical properties which are related to their path independent nature:

then depends only on the beginning and end points, and . For this reason, is said to be an exact differential. Since and are not state functions, their differentials are called inexact differentials and are written with a bar in the d: and . 2. We can write an exact differential in terms of some other coordinates. For example, lets say that the energy of some system depends on its volume and temperature . Then we can write,

Here, we have used partial derivatives ( ) instead of total derivatives ( ). is the rate of change of the energy with respect to changes in volume, holding temperature constant.

The equation above may give you a better idea of what we mean by different paths: since U is a function of and , you could go from the initial to nal states by different ways of varying and along the way. However, no matter which path you take, we know that , , and are state functions, and thus

Also, we can write the equation for

above as

Thus, we nd that the order of differentiation of a state function does not matter, i.e.

The proof of this property is called Eulers crieria for exactness. 3. Another test that is an exact differential is that

We would derive the above by saying that

 '   ) &  "

"

 i  

"

 vbzyw vbE" H  H H

"

 q  

' 2i    ' i    " " 1 & g  0 1 & g  0

' (i   " & g

"

vbzv"   " & i '  (i   g

"

"

g

"

 )  "& i wg  " & g

"

" # %


1. If we integrate a state function

" )

"

"
 

"

"

" #

BRIEF REVIEW OF THERMODYNAMICS

14

we get the desired result:

1.4.2 Gibbs-Duhem relation

To integrate eq. (1.9), consider that eq. (1.9) is written in terms of derivatives of extensive quantities ( , , , and are all extensive). In general, we expect that since the energy should be extensive, the integral of eq. (1.9) should have the form

i.e. if we multiply the extensive aspects by , this should be equal to an energy scaled by . This is an example of a rst order homogeneous equation (eg see Chandler 1.7). In general, to integrate such equations, i.e. , where , we rst take the derivatives of both sides with respect to :

Mathematically, we can write total derivatives in terms of partial derivatives as

G 1DqB@BAuzI4 C @ @  (i 4   C 8 @ @ @ 8 1DBBAu9I g 4  C @ @ @ 4 1DqBBFuz oEXDBBA9  C 8@ @ @ 8 4E 8 4 4 7 64 64 54

H2i 8  C @ @ @  D8BBAu8 8  g C

    uCI u o I    

w 

 H Q

Consider a system at constant ,

, and . At equilibrium, we have (1.9)

  hzI

3

How do we tell if a function is a state function? Any function any of the properties above.

Using Eulers criterion for exactness, which says that

i    vbz

i '  (i   g g      "2 "2

g

   "2

and thus

   ~bz"

" &


Similarly, we can nd that

' (
is a state function if it obeys

 i    vbzv

g

and taking another derivative, now respect to

, holding

constant

   "2

BRIEF REVIEW OF THERMODYNAMICS

15

and thus we can also say that

Thus, in the case for derivatives of the energy, we have

which is often called the fundamental equation of thermodynamics. If we take complete derivatives of this equation and combine it with eq. (1.9), we get

which is known as the Gibbs-Duhem relation. This relation is useful since it expresses the relationship of derivatives of intensive quantities at equilibrium, just as eq. (1.9) relates derivatives of extensive quantities at equilibrium. For example, we can use this relation to calculate the change in chemical potential for an ideal gas vs or along isotherms for a xed amount of gas . Since we are looking at isotherms, we have and thus we have , and

where we have used the ideal gas equation of state and (these constants arise from the constant of integration). pressure

is the chemical potential at

F%#yw

 w 

   ! 5 x l  l # %$

w k

 ) #  %u

" Qx  w

 w   H u x~Qx

 H Q5

 H Qy

 H y

Physically, this derives from the fact that ,

, and

are extensive variables. to obtain (1.10)

Of course, each of the partial derivatives above can be identied as we nd

W ST Q Q QQQ Y 

I G2i  DqBBAuE  C @ @ @  g C

 w   H u x~Qx

w x

U S XWQQ

QQQ V 

U T QQ S

w ku

QQQ R 

 t!

for all . For simplicity, take

and we get

IPH i 8  1DqBBAuzI  C @ @ @   g C

Combining these equations, we get

, and


respectively. Thus, (1.11)

IPH i 8  2i 4  PIH i 8  2i 4  G G  g C 8  g  g C g


  

5x 

BRIEF REVIEW OF THERMODYNAMICS

16

1.4.3 Maxwells relations As I mentioned earlier, thermodynamics is in many ways a set of relationships between thermodynamic quantities. Many of these relationships are called Maxwell Relations due to the work of Maxwell. These relationships combine some initial thermodynamic relationships we have already talked about with the mathematical connection formed by derivation. At the heart of all of these relationships is the mathematical property of commutation of derivatives:

For example, from eq. (1.9), we have

Thus, we can take second derivatives to get

There are a variety of mnemonics to memorize these relationships, but I dont think this is very useful. One can pretty easily derive them from the basic relationships of thermodynamics (eg eq. (1.9)).

1.5 The third law of thermodynamics


The second law allows us to calculate changes in entropy in transformations. For example, we could calculate the entropy change for a transition in temperature:

where is the relevant heat capacity. However, in order to calculate an absolute entropy, we would need to know the entropy at some particular temperature. The third law of thermodynamics supplies us with this information and states that the entropy of a system at absolute zero is a universal constant, which may be taken to zero. There are a few important consequences of this statement: 1. The heat capacity vanishes as zero temperature. Since

55xzb  

 E

pF

'

'

(note that we can write the absolute entropy according to the third law

rather than

 %z

'

' 5 ' 5

'

'

d  2 E Fp '

| )

Moreover, since

 H     l  )  3 )   H       2   2   S W QQ S T QQ C QQQ   c a x C QQQ   H b   3 3 3    3hz  2  


, we also have

 hE 2

 3

(1.12)

since

). Moreover,

BRIEF REVIEW OF THERMODYNAMICS

17

otherwise the integral would diverge. I have just written , but note that this would work for , , etc. 2. One can never reach absolute zero in a nite number of steps. Consider a scheme to adiabatically cool a gas by lowering its pressure (for example, naively, we could say that in an ideal gas so as , one would expect ).

and thus as we approach absolute zero

since to zero as

just as

is a constant ( is the thermal compressibility, which also goes does) as (see Huang 1.7).

Thus, decreasing near has little effect on the change on ! In a sense, reaching absolute zero is not dissimilar to Zenos paradox of trying to reach 0 by going half of the remaining distance in each step. One can get gradually cooler and cooler, but never reach absolute zero since the steps one can take get smaller and smaller as we approach .

1.6 Summary of the laws of thermodynamics


The laws of thermodynamics were once explained to me in the following concise (albeit pessimistic) summary: 1. There is no free lunch: you can at best break even. 2. You can only break even at absolute zero. 3. You can never reach absolute zero.

ee

e f

s d ` d e r p i )p i  ) h Iqm h q)%  X   dg h bt H i )  1 %  X   0 e i  H  i   g g

we get

ei  g

' hi 

g

b ca

e f

ei  g

However, since

d (%z

d `

p F

'

d g

#  %$"!~ 

and since

and

are always positive, we must have (1.13)

e f

(1.14)

(1.15)

BRIEF REVIEW OF THERMODYNAMICS

18

1.7 Appendix A: Isothermal transformations of a gas, revisited


To clarify aspects of how we dene work (eg work on vs by the system), I have included the following detailed discussion of the isothermal compression and expansion of an ideal gas.

1.7.1 Compression
1.2 1 0.8 0.6 0.4 0.2 0 0.75 1.2 1 0.8 0.6 0.4 0.2 1 1.25 1.5 1.75 2 2.25 2.5 work = 1. (1 step)

final

initial

We can compress a gas using just one step or using multiple steps. In both paths, the initial and nal states (eg as determined by volume and pressure) are identical. Whats different about these paths? As weve seen in previous examples, the amount of heat and work differ. Lets rst tackle the amount of work done. In a one-step process, the work done is given by

initial

Pressure

0 0.75 1.2 1 0.8 0.6 0.4 0.2 0 0.75 1.2 1 0.8 0.6 0.4 0.2 0 0.75

1.25 1.5 1.75

2.25 2.5

work = 0.745635 (5 steps)

initial

work = 0.695653 (100 steps)

initial

1.25 1.5 1.75

Volume

2.25 2.5

For the specic case on the left, we know that , so . As we add steps, the work we need to do to compress the gas becomes less and less. This work is minimized for the case of a reversible path.

final

h l | F zF3% BD (%u(   #  n  j #  H F'%$ ~s zpp3%   H BD

1.25 1.5 1.75

2.25 2.5

where are denotes the middle step. What is the least amount of work we will have to do? In the limit of steps, we say the step size is very small: and our sum of steps becomes an integral. The work in this case is

final

h h 11p l l l l 1 H  H  H I1 %  %H   9 H  QI r  %H   H  H  

w y

t  t v w w yx

d 

final

Note that here is negative (as we are compressing). If we compress in a multi-step process, then we can minimize the amount of work it will take to compress the gas. We use less work because we apply less pressure. In the two-step case, we rst apply a pressure of 0.667 and then apply a pressure of 1.0. Thus, the work we do in this case is less

v t  ul

work = 0.833333 (2 steps)

 QH 

l l H    H %

 

 %H

BD zpp3x%

wBB@ @@

BD zF3 2

% % %

BRIEF REVIEW OF THERMODYNAMICS

19

It is important to keep track of minus signs. For the reversible case, we say that units of work are done on the gas, but that units of work are done by the gas. Thus, in this sense, the gas does the most amount of work in the reversible reaction since

1.7.2 Expansion
1.2 1 0.8 0.6 0.4 0.2 0 0.75 1.2 1 0.8 0.6 0.4 0.2 1

work = 0.5 (1 step)

initial

Now we will expand the gas. Again, we can do so in one-step or with many steps. In the case of a one-step expansion, the work done on the gas is

1.25 1.5 1.75

2.25 2.5

work = 0.583333 (2 steps)

This work is negative, since the gas is expanding and thus releasing potential energy. This means that the gas can do 0.5 units of work on some other system. In a fashion similar to the previous section, we calculate the reversible work

initial

final

Pressure

1.2 1 0.8 0.6 0.4 0.2 0 0.75 1.2 1 0.8 0.6 0.4 0.2 0 0.75

work = 0.645635 (5 steps)

in terms

initial

final

1.25 1.5 1.75

2.25 2.5

work = 0.690653 (100 steps)

initial

Thus, we see that the reversible path can do the most work. Moreover, we see that the reversible reaction releases the same amount of work that we have put in. This is why we consider it to be reversible: if we put 0.693 units of work into the system to compress the gas, a reversible expansion will give exactly the same 0.693 units of work. Thus, a reversible reaction wastes no energy and can go on compressing and expanding (reversibly!) forever. Of course, real systems are never perfectly reversible. For the one-step case, we see that we have to put in twice as much work as we get out (1.0 vs 0.5 units). Thus, the work provided in expansion will never be sufcient to fully recompress the gas.

final

1.25 1.5 1.75

Volume

2.25 2.5

0 0.75

1.25 1.5 1.75

2.25 2.5

which in this case is in terms of work done on the gas or of work the gas can do.

h h H  F'vpp3% H BD n  j F'%$ ~s zpp3% #  H   H BD

l H %  H  QH  %    H

v t

%H  %H

BD zpp3x%

final

 l

H %zpp3x%   BD

9%

h h 11p

H %vpBpD3 2 %

%  h %zpBpD3%9% H %H
h

BRIEF REVIEW OF THERMODYNAMICS

20

1.7.3 Heat I have not mentioned heat yet to avoid complication and confusion. The tricky part about thinking about heat in this case comes from (1) keeping track of minus signs (like we had to do in the case of work) and (2) remembering that our gas is in thermal contact with a heat bath. The energy of ideal gases are only related to their temperature. Thus, along an isotherm, the energy is also constant and thus . Since , along an isotherm we get . Heres where the signs become important, especially to understand whether heat is entering or leaving (and entering or leaving what: the gas or the heat bath). For example, consider the case of the one-step compression. In this case, we do work on the gas and so . This means that one unit of heat is transfered to the heat bath. In the case of reversible compression, we require less work and so we put less heat into the heat bath (0.693 units of heat are transfered to the heat bath). Now lets consider expansion. In one-step expansion, the gas does work on the system (aka negative work is done on the gas) and we found that . Thus, . This means that 0.5 units of heat were absorbed by the system (i.e. taken from the heat bath). For the reversible case, we found and . This means that units of heat were absorbed. We see that the reversible expansion absorbed more heat or in other words, the one-step (irreversible) reaction left more heat in the heat bath. What pattern do we see here? Lets make a table of the work and heat:

We see that and (note that this is the work done on the gas and the heat absorbed by the gas respectively). If we talk about how much work can be done by the gas (i.e. the negative of how much work is done on the gas) and the amount of heat transfered to the heat bath (the negative of the amount of heat absorbed):

We nd that the reversible reaction can always do the maximal work and transfers the least amount of heat. Getting the minus signs straight here is extremely important!

1.8 Appendix B: Carnot engines


Carnot engines are useful not just due to their simplicity, but because they all share a common property: they all have the same efciency (for a given temperature of heat baths and sinks ). Indeed, Carnots theorem states no engine operating between two given temperatures is more efcient than a Carnot engine. To prove this statement, consider a reversed Carnot engine a Carnot fridge (which from and deposits onto the bath) driven by a non-Carnot engine (which takes takes

h h H BD zF3%

g h

A$

l H BD zpp3x

BD zF3

h h H %

l H BD vpp3x%

direction compression expansion

h h H BD zF3%

A$

l H BD zpp3x

BD zF3

h h H %

EB EB 8 p l %H BD vpp3x%

$ f

direction compression expansion

h v vF

$ (% H f

szpp3x BD

v 8 FzFBD3edD A@%

v p4zFBD3edD A@8$ f%H zpBpD3x%

% $ # w "

v p'H

F'H%vBpp3% D H %zpp3x BD

zF3 D A@b BD 8

ed %

B$

" #

EB 8 EB

%zF3x% BD

BRIEF REVIEW OF THERMODYNAMICS

21

from the bath and can perform work on the Carnot fridge, but must deposit onto the bath). This combined device takes from the bath and deposits at the cold bath. According to Clausius statement, the quantity of hear transfered cannot be negative and thus . Since the same quantity of work is involved in the process, we conclude that since is the same for each engine

There is a simple but important Corollary from this: all Carnot engines operating between two given and ). temperatures have the same efciency (which is based solely upon 1.8.1 Heat engines and refrigerators

Thermodynamics was developed in the 19th century to address a very practical problem: how do heat engines work. Its interesting to consider that these very applied questions lead to such a revolution in our basic understanding of physical chemistry. An idealized heat engine works by taking some heat from a source (eg a coal re in a train), converting some of that heat to work , and then dumping the remaining heat into a heat sink (eg the atmosphere). The efciency of this engine (how much work we get compared to how much coal we have to use) can be calculated as

(i.e. elecWe can think of a refrigerator as a heat engine running backward, i.e. using work tricity) to extract heat from a cold system (the fridge), and dumping at a higher temperature (your house). We can similarly dene the performance of this system as

These two concepts are often both considered in terms of a Carnot engine. A Carnot engine is a simple model of real engine, allowing us to explore thermodynamics concepts. Moreover, in general, all thermodynamic processes can be reduced to some set of Carnot engines, and thus what we learn about Carnot engines will be true of thermodynamics in general. We dene a Carnot engine as any engine that is reversible and runs in a cycle with all of its heat exchanges taking place at a source temperature and a sink temperature . 1.8.2 Carnot engines and the thermodynamic temperature scale The concept of temperature has both an intuitive aspect (we all have a feeling for what higher temperature means) and a more formal meaning. Most importantly, the concept of temperature can be introduced using Carnot engines. To obtain a uniform temperature scale, consider two Carnot engines arranged in series. Engine A takes heat from the reservoir and deposits in the cold bath. Similarly, engine B takes heat from and deposits it at .

q i

&

gx g D C@ l x@C@ k i

g g

2 2

gu i

rC@ D

l k k

gi

'

g &

gi & gj i

(1.16)

(1.17)

(1.18)

BRIEF REVIEW OF THERMODYNAMICS

22

Using the corollary above, we know that the efciency should only depend on the hot and cold bath temperatures. Thus, we have and . We can also consider the efciency of the whole system as one big Carnot engine as . Since in general, then we can write

and

With some algebra, we get

This equation sets the temperature scale up to a multiplicative constant (which is then chosen from some experimental calculation, such as the triple point of water, i.e. 273.16K). The beauty of a thermodynamic scale dened in this form is that it is independent of the nature of the substances involved in the Carnot engines and is a pure result of Thermodynamics. 1.8.3 Entropy is a state function With our understanding of Carnot engines, we can now build the foundation for a thermodynamic understanding of entropy. Recall, that since this is a purely thermodynamic approach, it cannot make any appeal to the microscopic underpinnings of our system, and thus a thermodynamic understanding often is seen as very underwhelming by student of thermodynamics and statistical mechanics. However, we will see that this connection with Carnot engines allows us some fundamental results about entropy, independent of the microscopic nature of our system. Well start with Clausius theorem, which states that for any cyclic transformation throughout which temperature is dened, the following inequality holds:

g &

or in other words,

 2 E 1q9IE )

This property implies that can be written as a ratio of the form is set to by convention (and for simplicity). Thus, we nd

ihq 2 E k H  7 ysqhq 2 z k H l q ys97q 2 E k H l q ts 2 i&E k H l q otsq7q&E k H l q o o

we get

 2 qpz k o

qb E 

g & k

and with

and which

(1.19)

(1.20)

ihq z  2 q E

q tsi7i 2 E k H }2 tsi7i&E k H q o o z|{vu tsihqpz k H l q o 9o o xw o H zxvu ys 2 qpz k H l q o Xo 2 o 2 w o H q xw xvu tsq7i 2 E k H l r2 2 H 2


l l l

o p k o &

 ihq 2 z

2 q7 )

) ih

H 

BRIEF REVIEW OF THERMODYNAMICS

23

where refers to the heat increment supplied to the system at temperature and the integral extends over one cycle of the transformation (hence the circle in the integral sign). To prove this theorem, imagine a set of Carnot engines each connected to the same hot reservoir (with temperature ), but each with a different cold reservoir (with temperature and ). Each engine absorbs from the hot reservoir and deposits to the cold bath. By eq. (1.19), we have

Keep in mind that this system in total extracts heat and uses it to do external work. By Kelvins law (i.e. no process is possible whose sole result is the complete conversion of heat into work), we know that , i.e. we can convert work into heat ( ), but not heat into work ( ). Another way to think about this is that we will need to do some non-negative work on the system to get it to go through a cycle. In the previous section, we showed that in reversible transformations, no extra work needs to be done. However, in an irreversible transformation, one must do work on the system to get it to go through its cycle. Since some non-negative work must be done on the system, we know that (by Kelvins statement of the second law), heat must be put on the hot reservoir, and . thus Thus, we can write

(we can do the last step since and we will assume that is large enough that the sum can be replaced by an integral). This statement seems important (and it is). Lets consider some consequences of Clausius theorem: 1. A cycle made of reversible transformations (a) For a reversible cycle, we must have . We see this since we make a reversible cycle by running the system forward, then backward. Thus, we must have and . This can only be satised by . (b) Moreover, this implies that the integral of between any two points and must be independent of path. We see this since we can dene two paths ( and ) which each connect to . We can also imagine the reverse of path (called ). Since paths + (i.e. from A to B on and from B to A on ) form a cyclic transformation, we have for this reversible cycle But since is the reverse path of , we must have

i P

5xzB )

$ ~

"

EB

zB

EB )

eAyw

 $ ~

H EB F4 zB

) zB

i P

EB

  &e  $

"X%)EB h H

 n 

We can sum both sides over all

engines to get

C AAA 2

 $ &e  $

 &

 $ n 
l

$  $

) e4EB

i P

(1.21)

BRIEF REVIEW OF THERMODYNAMICS

24

(c) This allows us to dene a state function. State functions are special functions in thermodynamics whose integrals are independent of the path taken. In this case, we can write in general where is a state function, which we call the entropy.

2. Irreversible transformations (a) For an irreversible change from A to B, the equality above will not hold. We can turn a irreversible change from A to B into a cycle, by adding a reversible path back from B to A: (1.23) Thus, the integral of the irreversible change will be less that that of the reversible change. This leads to the important result

This inequality approaches an equality as the path on the left becomes more reversible. Thus, irreversible transformations which increase the entropy will always occur spontaneously. This is why the dye drop will spontaneously spread throughout the bucket, but will never come back to its original conguration. Entropy wants to be maximized. (b) There is an important result of the statement above. Lets consider the equation above in differential form: . For an adiabatic transformation (i.e. no heat transfer) and we have . We can imagine the universe as such an adiabatic system (i.e. no heat can leave the universe). This leads us to a fundamental aspect of our universe, independent of all of its underlying molecular underpinnings: the total entropy of the universe is always increasing. While we might be able to think locally and decrease the entropy in some subsystem of the universe, the total entropy will always increase. 3. Recall that equilibrium behavior is much like that of a reversible transformation (or better put a reversible transformation is always at equilibrium). Thus, at equilibrium, we would expect that (1.25) Thus, we now have a way of thinking about how a system adiabatically relaxes to equilibrium: it maximizes its entropy. At equilibrium, the entropy no longer changes (as it is at its maximum).

  H   vbQIvb4

t yw EB

  H   vbQIvbe

EB
(1.22) (1.24)

EB

Thus,

EB

BRIEF REVIEW OF THERMODYNAMICS

25

1.9 Problems
1. Adiabatic expansion/compression of an ideal gas Derive the equation for adiabatic transformations of an ideal gas. (An adiabatic process is one in which the system is isolated and thus unable to absorb heat from or transfer heat to a heat bath (i.e. ). Your answer will take the form of a relationship between , , , and , where the subscripts and refer to the initial and nal states, respectively. Assume that and are constant during the transformation. 2. Fast and slow mixing of gases Imagine a box with a partition separating two different gases. Assume that both gases are ideal gases. There is a different amount of each gas (the gas volumes are and respectively and the box volume is ). The box is in contact with a heat bath such that any transformation would occur at constant temperature. (a) First, imagine a process that allows the gases to gradually mix (i.e. such that during the mixing processes, the system is essentially always at equilibrium). Note that this process must require the surroundings to do some work in order for this gradual mixing to occur. What is the change in energy and the change in entropy of each of the gases (between the unmixed and completely mixed states)? (b) Now, imagine that the partition was instead quickly removed such that the gases mixed rapidly instead. What is the change in energy and the change in entropy of the gases in this case? (c) Calculate the sum of the entropy changes of the gases in (a) and (b) and compare the two values. Give a reason for the result of your comparison. (d) Calculate the total change in entropy (i.e. including the gas and the surroundings) in (a). (e) Calculate the total change in entropy (i.e. including the gas and the surroundings) in (b). (f) Compare your results to parts (d) and (e) and explain why there is a difference. 3. Fun with thermodynamic relationships (a) Derive the thermodynamic equation of state (McQuarrie 1-29)

w x

(b) Derive the equation (McQuarrie 1-30)

) 1)   2 

7

 H

 H

g

g 0

' 2i  

g

2 !  w 

"

e f

e f

BRIEF REVIEW OF THERMODYNAMICS

26

4. Helmholtz free energy A substance has the following properties: At a constant temperature , the work done by it on expansion from to is

5. Equilibration entropy

(b) How does this equation show that the change is spontaneous?

Two blocks of the same metal and same size are at different temperatures blocks are brought together and allowed to come to the same temperature. (a) Show that the entropy change is given by

(a) Calculate the Helmholtz free energy (b) Find the equation of state (c) Find the work done at an arbitrary constant temperature

2 7 F 2 w 9 2 Q7E

e f

where

is the volume,

the temperature, and

n  j  ut #  

The entropy is given by

, and

are xed constants.

and

 F %ue #  
. These

P ROBABILITY

27

2 Probability
As the name implies, statistical mechanics inherently deals with a probabilistic view of states in equilibrium. Thus, a familiarity with the manipulations of probabilities is an important prerequisite to our study of statistical mechanics. The purpose of this section is to review some important results in the theory of probability and to introduce notation that will be used in the rest of the course.

2.1 Denitions and preliminary remarks


Much of the basics aspects of probability is probably familiar to you. Heres a brief review for completeness and notation. 1. We call all the possible outcomes of an event the microstates of the system. For example, for a coin ip, the microstates are heads and tails. For a die, the 6 microstates are numbers 1-6. 2. Often it is useful to group microstates together into macrostates based upon some common properties of the microstates. For example, we could imagine in the dice case, two macrostates being odd and even outcomes of the die roll, each macrostate consisting of 3 microstates. 3. We can assign probabilities to events with particular microstates as outcomes. For example, for a fair die, each microstate is equally likely and thus we say that they have equal probability. The probability is related to the experimental outcome of events. For example, if we ip a coin times, we expect it to be heads times, where is the probability of the heads microstate. Clearly, the following properties must hold (and can be either reasoned, or taken axiomatically) (a) Positivity: The probability of event must be non-negative: . (b) Normalization: Since the number of events corresponding to microstate is and since , we must have . . Thus, (c) Additivity: if and are disconnected events, then since each microstate are disconnected outcomes, the probability of a macrostate is the sum of the probabilities of its constituent microstates.

2.2 One random variable


In the previous section, I gave examples using discrete random variables (eg the 6 discrete outcomes of rolling a die). Now, lets consider continuous random variables. Consider a continuous set of microstates, each with a different value of and . 2.2.1 Fundamental relationships We can dene some fundamental relationships. 1. We can consider the probability distribution function ( PDF) to be the probability of an outcome microstate with value . is the continuous equivalent to ; for example,

  7 w z7   

vv7  

  E7

wv  z7

B C vC



  zh

B xC vC



P ROBABILITY

28

the discrete normalization has a natural integral equivalent in the continuous case. An example of a continuous PDF is the probability of an event happening if in time the rate is : and . Note that is not unitless, but rather has the same units as . This is clear due to the normalization criteria . For this reason, we should technically call a probability density, whereas is a true (unitless) probability, corresponding to the fraction of microstates of value within a range . This distinction is not needed in discrete distributions, where is a true probability.

Using the time probability example above, we can calculate the probability that the event has . As one would expect, happened by time as . Unlike the probability distribution function, the cumulative probability function is always unitless.

With our rate example, we can calculate the average time to be (i.e. the inverse of the rate, as we would expect). 2.2.2 Ways to characterize PDFs: moments and cumulants

While we often think of PDFs in simple models as Gaussians or delta functions, PDFs can be complex. Thus, we need some systematic means to characterize them. A natural means to characterize a PDF is by calculating moments: (2.1) While moments can describe a PDF in great detail, simple moments of this form are not very useful. , i.e. the mean of the distribution, is a useful quantity. Often For example, while the rst moment we care more are the width of the distribution (the variance) . It would be nice to have some means to have moments that describe PDFs like the variance, as well as higher order generalizations of this idea. To improve upon this idea, we will introduce the concept of a cumulant. To derive a cumulant expansion of a PDF, we will rst need to calculate how to more generally write a given PDF in terms of an expansion in regular moments. To do so, we calculate the characteristic function, which is the generator of the moments of the distribution. It is simply the Fourier transform of the PDF:

) l tsi9H7f q P 

 H 7

 6z7  

H 2 p $

  z iz7j   "Pz

  C hzhj

 z

3. The expectation value of any function, eg

can be calculated as

 i

  E

2. We can dene the cumulative probability function ( CPF), given outcome that is less than :

as the probability that a

  E7

~E7  

 9H77H

i 7   l

  z7

 i
i 

 9H76r i c

zs  

4 3 97  H

p C ) C

E7   l z7   i!|H76i7   l i s  

7  

  z7

d  d  x 

(2.2)

P ROBABILITY

29

As one would expect, the PDF can be recovered from the characteristic function via the inverse Fourier transformation (2.3)

where we have used the Taylor series point can also be generated by expanding

Finally, using a similar idea, we can calculate the cumulant generating function by taking the logarithm of the characteristic function. In analogy to the characteristic function (whose expansion generates moments), the cumulant generating functions expansion generates cumulants:

where we denote cumulants by the subscript . What are cumulants? Actually, as we will see, they are probably familiar to you. Relationships between the moments and cumulants can be obtained by expanding the logarithm of and using the series

The rst four cumulants are called the mean, variance, skewness, and curtosis of the distribution, and can be obtained from the moments as

While lower order cumulants are identical to moments around the mean (i.e. , this is not true for ). Cumulants are often the most compact means to describe a PDF. Moreover, often just a few cumulants can give very useful information about the nature of the distribution (indeed, many theories simply use the mean and the variance to describe a system). Finally, there is a natural way to express moments in terms of cumulants via a diagrammatic expansion. A moment of order can be expressed in terms of a sum of cumulants by summing all of the 1 point irreducible clusters (connected or disconnected) of points (where the contribution of

C Cp Hz C h p H 2 p A2 l v w 2 A2 v Hw p o p hp

{ x C C H  C  Hz C H C  sz C H PEs q t9H7 H    {) C 3 C 9376         C {9H C C x9H C 3t9H7 7 C  t C  t
 C x C C C t9H  C BC   l H 9

. Similarly, moments around a particular

  7

Finally, moments of the distribution can be obtained by expanding expansion)

    tw76

C CVw

h F  

}v

o A2 h p

zh    F

  7

in powers of

(via a Taylor (2.4)

D o A2

D o A p

7  z7   

(2.5)

(2.6)

P ROBABILITY

30

the cluster is the product of the connected cumulants that it represents). Using this method, one can easily calculate

2.3 Four important PDFs

1. While a Delta Distribution is a pretty trivial example of a PDF, it is a natural rst example. A delta function distribution is sharply peaked at a given value of : . Thus, while its mean may be nonzero: , all higher order cumulants are zero. This can be shown since for all , for a delta function distribution (and thus for example). 2. By now, most of you have seen a Gaussian Distribution in several different contexts:

While delta distributions are characterized only by one parameter (its mean), Gaussian distributions by two parameters, a mean and a width (standard deviation). To calculate the cumulants of a Gaussian, we calculate its corresponding characteristic function (2.8)

which is also a Gaussian (recall that the Fourier transform of a Gaussian with width is also a Gaussian, but with a width of ). Cumulants of a Gaussian distribution can be identied from the logarithm of the characteristic function: (2.9) (2.10)

i.e.

In hindsight, this makes sense: we knew that and were associated with the mean and the variance. Gaussian PDFs are useful since they are a natural extension from delta distributions, yet still quite simple.1
1 Moreover, Gaussian distributions serve as the starting point for most perturbative calculations in eld theory. The vanishing of higher order cumulants implies that all graphical computations involve only products of one-point (related to the mean) and two-point clusters (called propagators).

 k|zd H 

2 h 4 53BB@ D @ @ o 2 ` A2 7 p 4 v H 4 H c7 p   2 2 $d T) l v H 4tH 6` tgH  4 2 TH v H 6 2 n}l v 4h   12 2 0 1 2 z 0 4 2 Tv H 6 2 n}l v z7    4 1 2 oHz 0 p H 2 p 2 p H 2 2 Cp C p

p o p w 2 p A2 V F o op p A p
E7  

" 2  H  e` x E

h w 2 2 w w w 2 h 2

w w 2

o p

(2.7)

P ROBABILITY

31

3. Binomial distribution. Consider ipping a coin (whose outcome is either heads or tails), except that this coin is weighted, such that and . The probability that in trials that heads happens exactly times (eg out of ips, heads comes out times) is given by the binomial distribution (2.11)

and gives the number of is just the coefcient obtained in the binomial expansion of possible ways to choose things out of (or in other words, the different ways to order heads and tails events). We can also think of the bionomial expansion in reverse as a series we can sum:

To explore this PDF further, we can calculate its cumulants. The characteristic function is given by

One can check this formula via the binomial theorem, or just by simply expanding the result. The cumulant generating function has a particularly interesting form

(2.15)

Lets address these terms physically. The mean makes sense: out of ips, we expect to get heads on average times. The standard deviation is just the square root of the variance and tells us the error or relative uncertainty of the average. As , the fractional uncertainty vanishes, i.e.

(2.16)

, i.e. times a function of distribution, the higher order cumulants (i.e. above the variance) do not vanish!.

2 However, note that while all cumulants have the form

v 

Thus, lets say we were ipping a fair coin ( chance that the number of heads is not ips, the mean will be close to .

. If we make ips, theres a good . However, if we made a thousand


and , unlike a Gaussian

i.e. the CGF for trials is times the CGF for one trial ( ). Thus, the cumulants for trials are times the respective cumulants for one trial. With some algebra, we nd 2

|g`vF  H |d76# $ H C c x C

(2.13)

(2.14)

V d

   $ Feo

 $ w C 

s$ w C rV~ q C $

s$ w    s$ w     }ht9H7 q p" C }7t9H7 q FecC F

$ $ H  D" 2 9$ " A2

R w 5s R   l '1v) e v l  v) l d  l  $  pF C c

|gz  H

|d 7  H

x

where the prefactor

(2.12)

RH

l y AAF 8D

C c $
n

 |vc  C

`CvCB
j |v    n

 m "

|v  

C c p C C c


s w  q C }$7t9H7 P t

|d 97  H

 "

$ D

H Q

 

2 2

cC

P ROBABILITY

32

4. Exponential PDF. In many systems, the kinetics is comprised of crossing a single free energy (or energy, etc) barrier. If we measured the time required to cross the barrier, we would nd that these times would follow an exponential distribution:

To calculate cumulants, we rst calculate the characteristic function

and calculate a series expansion in small

to nd the rst two cumulants

As we would expect, the mean time is also .

P 2 F2 P v  H   o  m sw 2 m 2 H m gI hFt m 7 p m l m w i  l  m gH i)H7 d 5 m 7


of the logarithm

. Interestingly, the width of the distribution is

 % )

 7

where

is the characteristic time to cross the barrier.

 i 97 ) H

l  i7

(2.17)

(2.18)

(2.19) (2.20)

P ROBABILITY

33

2.4 Many random variables


In the previous section, we talked about distributions of one (continuous) random variable . To generalize this to multiple random variables, we can simply think about as a dimensional vector . For example, describing the position and velocity of a gas particle requires a 6 dimensional space. The joint PDF is the probability density of an outcome in a volume element around the point . The joint PDF is normalized such that (2.21)

If (and only if) the random variables are independent (i.e. uncorrelated, as we will see below), the joint PDF is the product of the individual PDFs:

(2.22)

We can simply vectorize our concept of the expectation value to include many random variables: (2.23)

Moreover, we can use this to write a generalized version of the characteristic function (called the joint characteristic function):

(2.24)

can be used to generate joint moments and the genFinally, this characteristic function erates joint cumulants. A compact way to express this is in terms of derivatives of the moment and cumulant generating functions:

(2.25) (2.26)

In the previous section, we introduced a graphical relationship between moments and cumulants. This relationship also holds for joint moments and cumulants. For example, We can rearrange this to get an equation for the connected correlation function: (2.27) Correlation functions play a fundamental role in much of physical chemistry, as they are natural quantities to both measure experimentally and calculate theoretically. The connected correlation function plays a particularly important role since it vanishes when and are independent, i.e. and thus

and thus vanishes (whereas the means is zero).

vanishes only when the system is uncorrelated and one of

   h F

2 u

 t9 H @ @  2 t9H   1"  7 p C 1  BB@  t9  H 0 @ @ C 1  2 t 9H 0  5  h C 1   0 BB@ C 1   0

2 h 2 u 2 x 2 z 2 2 uxquz u 4 2 u 2 quE 2 $ 2 u 4 2 u  2 z7 quz7  2 quzh  2 u

@  $ 9H76 

2 u

2 u H 2 u 2 u P w 2 u 2 u

  q 7j   5  G 7  

 z

$   hj

 H Cdt9 iBB@ 2  C @ @ C C C1  0  H Cdt9 @@  C iBB@ 2C C C1  0   7   y 7

97 H

@@@ B qBBA

2 q `   7

P ROBABILITY

34

2.5 Sums of independent random variables


 E

and the corresponding characteristic function is given by

We get

We rearranged terms above so that we can now recognize this substitution, we get

So, in other words, in this case, we nd that the joint characteristic function (in this case) is simply the product of the characteristic functions for each random variable, but all evaluated at the same . To calculate cumulants, we caculate the log of this function (2.34)

Thus, since we also know that , we see that the cumulants for the probability of the sum are equal to the sum of the cumulants for the individual random numbers : (2.35)

 9 C " 4 H   4 H  5h  H   u9 E 65d4  z C C  v   H o45d f 7r) h   t9~6 4 H  e f }v 4  z C i U   4H 6 H96 t4v76 r1v   4 ! E C 1 5h g r}v  Hszd5  6G G e 4

Using the integral representation for the delta function

C U C    C s{ ) C 9H q C 97 p { C x C    C U C H 1 C C H 0 U 5 F U 97 F  

H7 

i H

i H

 z

 z

since each variable is independent. Consider the sum . The PDF for

is (2.29)

Consider random, independent variables , each with its own probability distribution can think of the joint PDF for all variables as

. We

 E

@@  BBF@uzs C

4 h 

P 9H~ 

7  

(2.28)

(2.30)

(2.31)

(2.32) . With

(2.33)

P ROBABILITY

35

This fundamental statement has a couple important consequences: The mean of is simply found to be the sum of the means of the . This makes sense intuitively. More importantly, the variance of is equal to the sum of the variances of ; thus, for the case where all the variances are the same , then and thus the standard deviation scales like . Since the mean scales like , we nd that the ratio of the standard deviation over the mean scales like , i.e. the fractional standard deviation vanishes in the limit. This is a very important result. Consider that non-systematic errors in measurements or calculations in computer simulations are like uncorrelated random variables. Thus, when one calculates . Thus, this averages over several measurements, the fractional error typically decreases like is a simple, statistical explanation for why taking more measurements improves the accuracy of the result. One make arguments like the above for the other, higher order cumulants. In the limit, only the mean and the variance survive. Thus, the probability distribution for is a Gaussian with mean and variance . The central limit theorem provides a more general form of this result. In fact, one can show that it is not necessary for the random variables to be completely independent, but rather that the correlation scales like .

2.6 Rules for large numbers


Experiments testing statistical mechanics are typically concerned with a huge number of particles typically moles of particles. Formally, in statistical mechanics, one makes the assumption that the number of particles (known as the thermodynamic limit since this leads to a number of simplications, which we will enumerate below.3 In the thermodynamic limit, there are three major types of dependence 4 which appear in statistical mechanical theories:

, are also common. One simple example is to consider 3. Exponential dependences, i.e. that each particle in the system will have microstates (eg each coin has 2 ways to land) and thus out of particles, we will have different microstates. In statistical mechanics, we shall frequently encounter sums or integrals of exponential variables. The thermodynamic limit allows us to considerably simplify these calculations, especially due to the following results.
3 However, it is intruiging that convergence can be very fast and, in fact, often even only hundreds of particles are sufcient to be within the regime described by the thermodynamic limit. 4 In principle, there can be other types of scaling. For example, since charge and volume is extensive ( and , then the Coulomb energy does not fall into these three categories since . However, interestingly, even this case is not really relevant since bare Coulomb interactions are typically screened by counter ions and the resulting energy is again extensive.

y uY Y y 9qxYa Y s a

2. Extensive quantities, such as energy, volume, or entropy, are proportional to

, i.e.

  '  v`

1. Intensive quantities, such as temperature, pressure, or density, are independent of

, i.e.

2 

2 U 6 2 2

d 2 5 2

   bp47

U 

U 

 2

S XXXS U  @@  e 9BB@ f

q 2 d

 7 

2 5 ) 2

y k S

P ROBABILITY 2.6.1 Summation of exponential quantities

36

where the are all real numbers and thus . Our claim is that this sum can be approximated by its largest term . The smallest sum would consist of other terms which are mostly zero and thus the minimum would be . Oppostively, the maximum sum would be the case which most terms are comparable to and thus could not be greater than : (2.37) . We nd We can convert the above to inensive quantities, by taking the log and dividing by

(2.38)

and thus

i.e. in ther thermodynamic limit. Thus, one can avoid doing the sum and needs to only nd the largest term. 2.6.2 Saddle point integration Simiarly, we can write an integral analogous to eq. (2.36):

where is assumed to be intensive and plays the role of in eq. (2.36). Thus, in the thermodynamic limit, to a good approximation, we can replace this integral by the maxmimum value of the integrand, which occurs at the maximum value of . We can also think of this process in terms of pulling out a constant from the integral:

For very large , all the terms in the integral for will go to zero and were just left with the sum of terms where . Typically, the maximum is found by solving the equation and (i.e. we want to avoid minima). Of course, this nds local maximum. If there are multiple solutions, then one must choose the global maximum. By our arguments above, this global maxima will be much larger than the other maxima in the thermodynamic limit. Finally, this method also assumes that our is bounded at (i.e. the function does not simply continue to rise to at the function boundaries).

2 2 )

) h   EI y 2U hU

3 xU G

xU

ho zI  

  'h 7

3

  zr

Since we nd

, then the ratio

will vanish in the thermodynamic limit, and thus

(2.39)

(2.40)

U FF 3 h ` p 3 h h3 F    q)ubFv q)uFz w 3 h s h 3 p F U yF U yF U 

U F

C U

Consider the sum of exponential quantities

(2.36)

U y

U U

P ROBABILITY 2.6.3 Application: derivation of Stirlings approximation

37

Stirlings approximation, which gives a simple functional form for factorials of large numbers, is an important mathematical tool, since factorials often appear in statistical mechanics (due to the combinatorial nature of counting microstates). at large . One can use saddle point approximation to derive Stirlings approximation for To use the saddle point approximation, we rst need to have some integral representation for . We can get one from starting with the simple integral (2.41)

for all . We can bring this integral into the form of (2.40) by saying that

Higher order terms (and thus a better approximation) can be found by looking at the uctuations around this maximum value. However, even for , Stirlings approximation works remarkably well, and only gets better as increases. For large , one often simply takes the highest order term and says that .

2.7 Appendix: Fourier transforms


Fourier transforms are an important mathematical tool. We dene the Fourier transform of a function to be (2.44) In a Fourier transforms, they allow one to transform from conjugate spaces, eg from time space to frequency space. As an example, lets consider a function which is a simple cos wave in time: (2.45)

  H i  ii 7

 D4

QH

v  w  )ysi m H7i m v7 q i m f1Fvi 

bp45h5h76rF"bF  kh

"i  sq 

 vPuawFwqcD`o4r w wv  

hzr  

5'bF4v7 

leads to approximation

and

")

H Xh)

$ )

and

. Thus we get the well known Stirlings

and then choose . While the above result only applies to integer analytical continuation, the Gamma function

U l G 5de w  U U p U Ul  G  9H

l H 9

) H sFezI  

If we differentiate each side

times by , we get

(2.42) , it is possible to dene, by (2.43)

l Q

  i I

P ROBABILITY

38

Note that we have broken down the cos function into exponentials, which will be useful in the next step. The Fourier transform of this function takes the form

Using the integral representation of the delta function, i.e. that

we get

where is a normalization constant. What does this mean? We see that picks out the relevant frequencies of the cosine function ( ). This alone might not seem so spectacular, but consider a much more complex function, say a sum of many (hundreds of cosine functions):

This function would look very complex and it wouldnt be easy to a priori gure out the relevant frequencies. However, Fourier transforms will do this for us in a very simple manner. Since Fourier transforms are derived from an integral, and the integral of a sum is the sum of the integrals (i.e. integration is a linear function), then we get

Using the derivation of the Fourier transform of a cosine function above, we get

In other words, the Fourier transform gives the spectrum of frequences , i.e. there is a peak corresponding to each with peak heights . There are a couple of Fourier transforms which are good to remember off the top of your head. The Fourier transform of a delta function is a constant and the Fourier transform of a Gaussian is a Gaussian. We can also talk about inverse Fourier transforms. As the name suggests, an inverse Fourier transform, transforms back to the original. Thus, the inverse Fourier transform of is . Therefore, the inverse Fourier transform of a constant is a delta function and the inverse Fourier transform of a Gaussian is a Gaussian.

 i  v  4 H  m f1Fict976

vstiscw m q H7icH m q z7 q ) 4  w s4  st4Vw m H m q   w  4 


 H  g d5 9h  H   i f i  

 4 5 syow      4 m dywo4H m d q o4 6

m Fv

i m 1F   v

 o

i 97  H

o r  

ct4 6

  o r

Grouping terms, we get

v )tsi  w   m 9H7i m z76 q ict49H76

  H i  qi 97

ct4

  o r

 o   i

(2.46)

(2.47)

(2.48)

(2.49)

(2.50)

(2.51)

P ROBABILITY

39

2.8 Problems
1. The Poisson PDF A classic example of a Poisson process is radioactive decay. Observing a piece of radioactive material over a time shows that The probability of one and only one event (i.e. a decay in this case) in the time interval between and (inclusive) is proportional to (i.e. , where is a constant . of proportionality) in the limit The probability of events at different intervals are independent of each other.

In the binomial distribution one calculates the probability of nding events with probability out of trials. Consider that the time comprises events, each of time , i.e. and that is sufciently long such that . Moreover, we have argued that above and thus . For sufciently small , only one event will occur and thus we have reduced the problem to that of the binomial distribution with nding events out of trials. (a) Using the characteristic function for the binomial distribution, use the arguments above to calculate the characteristic function for the Poisson distribution in terms of , , and (which is the Fourier transform analog to ). (b) Calculate the Poisson PDF from the inverse Fourier transform:

(Hint: To calculate the inverse Fourier transform, you should have an integral like

Use a series expansion in small

to write

and use the integral form of the delta function

to write as a function of only , , and (without summation over )). (c) Using calculate the cumulants to all orders (Hint: look at the series expansion for in small and examine the coefcients of the terms).

  |v7

of observing exactly decays in the time interval The probability Poisson distribution. In this problem, you will derive the Poisson distribution the binomial distribution.

is given by the as a limit of

g| p


  {) C H

  |d zh

 E" H 

 q e 3 s e 3 6 3 s)t q 6B q es 3 f

t 57 }v  

E

p }v G G e

H p

  |v7

d 2

  |z

C D

h w $

  |z7

) e

  h   |v7

  7

) b

!

P ROBABILITY 2. Moments and cumulants of symmetric functions

40

Often we can calculate many moments or cumulants based on symmetry arguments, avoiding a lot of unnecessary math. For example, consider a symmetric probability distribution (also called an even function) such that . (a) Prove that all odd moments vanish, i.e. where is odd. (b) Prove that all odd cumulants vanish, i.e. where is odd. Hint method 1: Think about the log of the characteristic function and how the fact that is symmetric affects the nature of the Fourier transform. Hint method 2: Use a property of the relation of moments to cumulants from the diagramatic expansion done in class and the fact that all odd moments vanish. First show that the low order odd cumulants vanish and then see if you can work up from there. Extra credit: If you use this Fourier transform arguments, you should be able to also show that even moments and cumulants would vanish for anti-symmetric probability distributions (i.e. ). However, unlike quantum wave functions (which can be symmetric or anti-symmetric), there is only one anti-symmetric probability distribution (since probability distributions are always positive ) and we already know that all of its moments and cumulants vanish (indeed: is both symmetric and anti-symmetric since ). 3. Uniform continuous distribution .

(a) Explain in words an argument for the values of and . (b) Calculate all cumulants. Hint: Use a distribution we already know in a particular limit to model the uniform distribution. 4. Double Delta Function Distribution Weve spent a lot of time talking about monomodal (single peaked) distributions. Often, the distribution of interest is multimodal. To learn more about these types of distributions, consider the double delta function, which is the simplest bimodal distribution. Consider the distribution (a) (b) (c) (d) Calculate the normalization . Calculate all of the moments of . Calculate the cumulant generating function for . From the above, calculate the rst 4 cumulants of .

$ 2 p s V65|H q

  z7   E7   z7 ts Qvzdy szd q z7 w  w H   

 E7

E7  

Consider a uniform disribution, i.e. distributions, we must have

for all in

. Note that like all probability

  E

z7  

" "  H 7 zh     E0z7   E7  

C C 

 H H b97 b97 E7  H  

 H H b97 qz7  

  z7

P ROBABILITY 5. High temperature expansion, moments, and cumulants

41

In the coming weeks, we will nd that the free energy can be related to the partition function , where is the Hamiltonian of the system, is a microstate, and . We relate and by the relationship . We will derive these results later in the course, but for now, one can simply take these equations as given. While the above is reasonably well dened (we almost always know the Hamiltonian), calculating the partition function (and therefore the free energy) is not easy. In fact, in most real life situations, its difcult (or impossible) to calculate a partition function or the free energy exactly. One way around this is to calculate or in the limit where the temperature is high (i.e., is small). Thus, one can Taylor expand or for small . For a generic partition function, we can write

i.e. we can write as the Laplace transform (which is very much like the Fourier transform) of . Indeed, the math here resembles that used in the denition of the characteristic function.

Hint: Appeal to arguments made in the notes about the relationships between moments, cumulants, and the characteristic function. 6. Stellar probabilities In this problem, you will model the probability distribution of stars in the galaxy, based upon a Poisson process model. (a) Assuming that starts are randomly distributed (which is experimentally wrong, by the way), with a density , what is the probability density of nding stars in a volume ? (Hint: Since thr probability of nding a star in a small volume is and they are assumed to be independnet, the number of stars in a volume is described by a Poisson to the appropriate variables discussed in the process. Make an analogy with and Poisson process derivation in the lecture notes.)

where and is the total number of microstates (and is constant). (b) Show that one can write the free energy in terms of a series expansion in and the cumulants of the energy.

 " &

    ~

   hvv"

H p'b

" C

DD pX

&

DD )   pX qvv"~s  

(a) Show that one can write energy dened as

in terms of a series expansion in

 H v7 97

  q~"

iv 

where denotes the microstates. We can rewrite microstates with energy (the density of states

in terms of a sum over the number of :

and the moments of the

F'3 H

  i|v

 n9H7

  |z

% i|v 97  )  H

q|z  

l )  nH76 3q)|zH76    

""

&

  v

P ROBABILITY

42

(b) What is the probability that the nearest star is at a distance ? (Hint: The probability of nding one star at a distance is the probability of nding no stars in a volume around the star and the one star in the volume . You should be able to take and write your answer as a probability as a function and .) 7. Diffusing particle Imagine a diffusing particle in 2D which at each step makes a random step in both the and directions, i.e. , where the sign for each dimension is chosen randomly at each step. (a) Assuming that the particle starts at the origin, what is the probability distribution after steps. (b) What is the probability that the particle returns to the origin after steps? (c) For a diffusing particle in dimensions, what is the probability that the particle returns to the origin after steps. Hint: The steps in each dimension are uncorrelated random variables. Note that this is also a good model for a polymer of length persistence lengths (a persistence length is the length of a polymer at which it can freely bend).

  hz7

d #

h #  o hy )  & #

# v7  #

l l Tg

 # v # 2 #

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

43

3 Statistical mechanics I: Microcanonical ensemble


Statistical mechanics is a probabilistic approach to equilibrium macroscopic properties of large numbers of degrees of freedom. While in many systems, adding degrees of freedom often makes the problem more difcult, statistical mechanics is in fact simplied by the limit, in ways similar to those discussed in the last section.

3.1 A simple example from probability


Before getting to examples with energies and microstates, lets think about the case where were ipping coins. The microstates of our system are the outcomes (eg one microstate is the rst coin landing heads, and the other two tails). The rst question we can ask is what is the chance of ipping coins and getting a particular microstate (i.e. a particular pattern of ips). In this case, we have 2 microstates per coin and thus only microstates total: hhh, hht, hth, htt, thh, tht, tth, ttt. For simplicity, we can label these microstates as microstates 1 through 8. We microstates total, the can ask, whats the probability of nding microstate ? Since we have probability of nding microstate (with no other constraints) is simply . However, lets say we place the constraint that the number of heads ( ) is . Thus, there are only microstates like this (hht, hth, thh). Thus, the probability of nding microstate with the external constaint that there always must be heads is if has 2 heads and otherwise. Now, lets generalize this to the case where there are coins. This probability of nding microstate (with no constraints) is simply , where is the total number of microstates:

Now, lets consider the case where we want to know the probability of nding microstate considering that we want of the coins to be heads. If is the number of coins which did come up heads in microstate , then this probability would be

where

is the discrete version of the delta function and

is the number of microstates with heads, i.e. how many ways of ipping coins and nding of them heads. Thus, if microstate has heads, then , i.e. 1 over the number of congurations which have heads, and zero otherwise. Thus, calculating the probability simply reduces to calculating the number of states within the appropriate constraints.

h l )

v )  

v 7

  ~)

p l ) !

d
p l

|z  

v o 7 p

|vp  S 6

U k v

`u r`u S  ~ 6

  |z

|v7  

|v  

v  

"|v  

~  

(3.1)

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

44

3.2 Life at constant energy: the microcanonical ensemble


In many ways, the microcanonical ensemble is the most natural way to think about statistical mechanics (due to its direct link with probability), but often the most difcult to use in mathematical manipulations. Lets consider the case where our system is both thermally and mechanically isolated. Since there is no heat or work input to the system, the energy is always a constant , i.e. . Also, we will say that is also constant, i.e. . The heart of this approach is to simply directly count how many states have a given energy and thus calculate the density of states , i.e. the number of states with energy . Once we have calculated this probability, we can use some lessons from thermodynamics and our knowledge of probability to understand thermodynamic properties from a microscopic standpoint. There are a number of important properties, which I will enumerate below. 3.2.1 Microstate probability It is natural to carry over these ideas from coins in the previous section, to the microstates of some thermodynamic system. In our analogy above, where we have constrained the number of outcomes which are heads, we will now constrain the energy of the system. We can thus write

Note that we have used the Dirac delta function since energy is often considered to be continuous (we would go back to the Kroneker delta otherwise). The above assignment is known as Boltzmanns assumption of a priori equilibrium probabilities. 3.2.2 Geometric interpretation of entropy The entropy of this uniform probability distribution is given by (3.3)

where is Boltzmanns constant1 . For simplicity, I will simply assert this relationship without derivation. However, lets think about a few examples to see why this might make sense. We calculated the entropy change from to for an ideal gas to be

(note the change , where is the number of moles and is the number of particles). Now, lets consider how we would calculate not an entropy change , but an absolute entropy . Whats the smallest volume that we could compress an ideal gas to? The ideal gas particles dont
(or ) is even needed. Its main use is to convert temperature from Kelvin to an . For example, 300K means kcal/mol. If temperature were measured in units of energy (eg how hot energy is it? 0.6 kcal/mol), then we wouldnt need this extra (and sometimes annoying) complication.

  W IU  {  W

1 It is unfortunate that

  ~

P1v`

 H    ~QI|v

"

5vv  

  ~bp vb  

 F e 

PFw

e

  ~

  ~     vgH |v

# "5u

|v  

(3.2)

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

45

interact with each other, so we could compress the gas down to the volume of a single particle (but we cant make a particle of volume smaller than ). Thus, the total (not change in) entropy of an ideal gas of volume is given by

Interestingly, this makes sense from the combinatorial point of view: how many ways are there to rearrange a particle of volume in a box of volume ? Imagine it like the number of ways to rearrange a box on a checkboard: it will be equal to the number of squares.

(3.4)

since these particles arent interacting and thus the number of states is simply the product of the individual for each particle. This fact also reects the fact that entropy is extensive, since

Lets consider a system which is thermally isolated, but we can look at two subsystems and (that are in thermal contact with each other, since they are simply subsystems). The total number of states with energy is given by

(3.6)

Since the entropies are extensive, we expect that and and we will assume that both systems are thermodynamic ( ). Thus, the above integral can be calculated by the saddle point approximation. This means that we have to look for the maximum with respect to . We nd

Using the fact that

i|v 2 qv4  H w  

, we nd

The delta function is there to maintain the fact that the total system energy must be a constant . One can trivially integrate over to get

2 2  )  H w    E iQ4v 2 khi7 7

ts

 vQH 2 dr 2  2 7i 2 w   

2  2   

5 2 2

q 6u

2  H   7  X P 1v 2 h" u w w   2  w

7 

l 2 0 qQ4 2 hqv7 v H     2

3.2.4 Relationship between , , and

(3.5)

bF "

bFX

7 

  ~

Similarly, if there are is simply

uninteracting molecules, then the number of microstates of

3.2.3 Extensivity of entropy and its implications for

bp

Lets rst think about the entropy of a single particle

. Since

, we have

particles

)XuF b  F 5~b    

F kbFX

2 y w DD Fpo

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

46

This probabilistic approach simply states that there are an exponentially large number of states which satisfy the above relation (and hence it is the solution for saddle point approximation). While this approach cannot make any comments about how the system reaches these exponentially many states, it does say that it is overwhelmingly likely these states will be found. Thus, we associate this condition with thermal equilibrium, and thus the equation above holds for the case when the two systems 1 and 2 are in thermal equilibrium. Thus, it is natural to appeal either to (a) our understanding from thermodynamics that two bodies in equilibrium will be at the same temperature or (b) the relationship that

to dene a thermodynamic temperature scale to this problem. Moreover, this means that equilibrium will occur between the two subsystems above when they are at the same temperature. 3.2.5 Entropy is maximized at equilibrium

thus, an irreversible approach to equilibrium for an isolated system is accompained by an increase of entropy. Another way to think about this is that at constant energy, the state with the most entropy is going to be the state with the most number of congurations, and thus will be the state with the highest probability. In principle, anything is possible, just certain states are very likely and certain states are very, very, very rare. For example, there is a microstate in which all of the gas particles in a room instaneously condense to a liquid (while the conditions are still typical for the gas phase). However, while this is possible, this is very rare. How rare? These states are exponentially unlikely, exponentially scaling with . Thus, as we examine more and more particles, this probability becomes more and more unlikely. 3.2.6 Clausius law from a probabilistic perspective Finally, when two bodies at different temperatures at rst brought into contact, eq. (3.7) does not hold. Instead, there is a change in entropy (and exchange of energy) such that

i.e. heat ows from the hotter body to the colder one.

2 !

We demonstrated, via saddle point integration, that the equilibrium point (with energies has a larger number of accessible states:

n 2 7 j H l

2  2  2 iq  v 2 i    

 

 

i 2

H 7

 g

We next set

" $ )

to nd the maximum and thus we have (3.7)

and

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

47

Note that in this case, this argument has come purely from a probabilistic approach. We are watching how the system moves from some set of states which are less likely to a set which is more likely. This is much like a gas expanding: the probability that the gas lls just a portion of the box is small (since there are so many ways of instead lling the box), thus it will expand.

3.3 Applications of the microcanonical approach


3.3.1 A non-interacting, two-level system A non-interacting two-level system is simply a system comprised of particles in which each of the particles can be in only one of two energy levels. There are many examples of such a system, including simple models of ferromagnets, absorbtion of ligands onto surfaces, impurity atoms trapped in a solid matrix, or our simple example of ipping coins (assuming that the two states have different energy). For simplicity, Ill talk in the language of impurities in a solid matrix. As I said, there are two energy levels, with energies and respectively. Note that the microstates are discrete in this simple example (which is different than the more general continuous picture we have been discussing); we will later tackle a continuous microstate problem (gases). We can label the microstates of this system in terms of the occupation numbers , i.e. whether the impurity is in its ground or excited state. Thus, the total energy of the microstate 2 is

The delta keeps the condition which must hold and the normalization is the number of ways of choosing excited levels among the available impurities. This is simply the way of choosing out of and is given by the binomial coefcient

The corresponding entropy

& 0R

)('qR% &

2 The notation

$ p w p H H 0 0QH 0 0 ssbp'q)0gH l uF q uq)0gH l ebFQ'9)0uF q q)QH b# F q w    H s w   ysqgpi0QvtH bFd0QH bp4 q usvbFX H  H   

refers to the value of

for each and thus completely describes the microstate of the system.

p!" bp

can be simplied using Stirlings formula in the large

and

limit, i.e.

 f

q0v pusvFusvb H     1 0 qQH F0 n usv  

tS$U   " ' l

0 j

 f h  

U 00

where energy

is the number of excited impurities. The macrostate is specied by the total and the number of impurities . The probability of nding microstate is thus (3.9)

0C i 

U  ft  

5 

'Csb  

pD

 f

(3.8)

(3.10)

(3.11) , to get

(3.12)

S TATISTICAL

MECHANICS

I: M ICROCANONICAL , we get

ENSEMBLE

48

and thus we now have a (relatively) simple expression just in terms of and . Now, imagine that these impurities are in thermal equilibrium with each other. This means that heat can be exchanged between them. The equilibrium temperature can now be calculated as

We can rearrange the above to calculate energy vs temperature:

or alternatively, we can think of

, i.e. the average energy of an impurity site (3.16)

Lets think about what these equations mean: (1) At high temperature (dened by the only energy scale in the problem , i.e. ) we have and thus the impurities are in a random state: half are in the excited state and half are in the ground state. Entropically, this is the most favorable state since is maximized for and minimized for or (in which there is only one way to choose). (2) At low temperature ( ), we nd that , thus we nd that

Thus, we have , i.e. all the impurities are in their ground state (which has energy 0). This is clearly the energy minima for this system. However, it is also the lowest entropy state (there is only one way to have all impurities in their ground state). We can use these formulae to calculate many other thermodynamic quantities. For example, we can calculate the heat capacity

We nd that at both low and high temperatures. This means that energy changes little in these regimes. This effect is caused at low temperatures due to an energy gap: since the system must jump up full discrete steps of in energy, at low temperatures this is not likely and the system stays in the ground state. If small, innitessimal jumps were possible, then the energy would change and would not vanish. At high temperatures, the system is saturated: energy cannot increase any more (without decreasing entropy) and thus again . Finally, unlike thermodynamics, statistical mechanics can tell us about microscopic properties in addition to macroscopic properties such as energy and heat capacity. In particular, since we have

~ )

 |

$ n Y

 z

sl

w  )  % 7  )  % 7

l j

9 6 q 9 6

l %C57 ) l   2%)976

F l

gH j F

n Y l

w )  3576 v )

H 2

w  )  % 7

l j n

9 6

j "

H   Y p Y 'Csb # ) Tv"
H w %)57 l q ) v  "

If we substitute the fact that

(3.13)

U QQ

QQQ 

(3.14)

(3.15)

 )  3 7

vue )  5 6

(3.17)

(3.18)

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

49

i.e. it is the number of ways the other particles can energy normalized by the total number of ways that all particles can have energy . Another way to thtink about this is that once we specify that particle 1 has occupation and thus energy , we have to count the ways in which the other particles can on total have energy . Using eq. (3.10), we nd

Using eq. (3.15) for the energy vs temperature, the occupation probabilities become

where we have the normalization . Moreover, we can write these results in the compact form . We will see later that this is indeed a general result. Thus, we can calculate the probability of nding a given particle in a given microstate as a function of temperature. As many of you have already seen, this form will be very common in the canonical formulation of statistical mechanics, as we will see next week. Aside: Negative temperature Lets look back at eq. (3.15)

We said that as and that as . Thus, as , half of the sites are excited and half are not. What would happen if we excited ? Certainly, we can imagine a situation where all of the sites more than half, i.e. we had are excited. In this case, we have . Using eq. (3.15), we nd

and thus, in this case, we must have and therefore . Similarly, for any energy in the range , we nd that the only solution involves negative temperature.

% ) H97 

 6 

 ) H % 97 l l  % 97 ) H w w  )  % 7 l % 97 ) H l l % ) H97 w w  )  % 7 l  )  % 7

m3 ) l

w  )  % 7 l

9 6

d` v d 9 6

) 0C l w  )  % 7

w 9 9 l 9

e( ) v m%%97  ) l w ) 3576 " "

q% )    H  )   liq7v976qrz7 4 l w  ) %57 l  hv7  l 5 H l 1"hv7  

) 905"hv7 H  

w  )  % 7 l

v )Y

57vh  

7 d d

hv7 

and

. Using

we nd

l H 0

Y l  hv7 ca  b Y H ) v"0 q0QH l v6 H q0v0 6 l H  H

k  H

 fh    l

the probability of nding a given microstate for the whole system probability of nding a single particle in its excited state by

, we can ask what is the

  usv H  H   hgv

  iIv7

yr

 fy7  l

PIg4  H

I

53 64C C 2
H

S XXXPS
 l

(3.19)

'  H  

ihv7   ehv7  

(3.20)

(3.21)

(3.22)

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

50

1 0.8 0.6 E C 0.4 0.2 0 -3 -2 -1 0 T 1 2 3

0.4 0.3 0.2 0.1 0

-6

-4

-2

0 T

Since it is, in principle possible (and often also possible experimentally, such as laser excited states and magnetic spins), to set up such a state, this is a real phenomena. However, these states are clearly out of equilibrium. If we set all sites to be excited, it will fall out of that state both to lower energy and to increase entropy. Therefore, we will never nd equilibrium states with negative temperature, and any system with some instantaneous negative temperature will immediately fall to equilibrium with some positive temperature. 3.3.2 The ideal gas dimensional As we discussed earlier, microstates of a gas of particles correspond to a point in a phase space of the particles positions and momenta . If we ignore the potential energy of interactions, the particles are subject to a Hamiltonian

(3.23)

where describes the potential imposed by a box of volume . The microcanonical ensemble is specied by the energy, volume, and number of particles. The joint PDF for a microstate is

(3.24)

The rst delta function constrains particles to the box and the second one keeps the total energy (here, just the kinetic energy) xed to . This latter constraint also has a geometric interpretation: we say that the momenta are constrained to move on the surface of a hypersphere 3 dened by the equation .

E A& & F DCB

dimensional sphere of radius and a sphere is dened by

I G H H E

3 A hypersphere is a

dened by , etc.

. For example, a circle is dened by

 ~QH

   us ) 2   |

  1 "

w 2 

v U 0

9 7  Av @8 d

|v7  

Figure 3.1: Energy level system.

vs temperature

in a two Figure 3.2: Heat capacity a two level system.

vs temperature

in

  `

"

v 7 2

G H E

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

51

We need to calculate the number of microstates . Since there is no interaction between the coordinates and the momenta (which would arise from an interaction between particles, for example, but this does not exist for an ideal gas), then we can write

In terms of the coordinate degrees of freedom, the only constraint is that they must be in the box (which has volume ). Since each particle (which well say has volume ) can sample all locations in the box, and thus we get a factor of arrangements in terms of purely the coordinate degrees of freedom. In terms of the momenta, we can make a similar argument. Just as the coordinates are forced to of a dimensional live in a box of volume , the momenta are restricted to the surface area sphere of radius . Thus, the total contribution from the momenta is . Note that it is tempting to have some lower bound for momenta, just like we have a lower bound for volume . This lower bound does indeed exist it is Planks constant, which tells us the smallest momenta that can exist. For now, we will simply say that there is some characteristic momenta scale . Now, we need to calculate the area of a dimensional sphere . We can write in general , where the generalized solid angle is (3.26) or less (since we will take (3.27)

Finally, we can use Stirlings approximation for the factorial and we nd the entropy

(3.28)

Properties of the ideal gas can now be recovered from our denition of temperature

(3.29)

and thus we nd that the energy has a simple relationship

(3.30)

and is thus only a function of . Similarly, we nd that the heat capacity

(3.31)

1 2 yo

2 `

h $ R j Sv  0 Fe h

h  ) 1 0  ) 2  2 ~`  2 l   ') `

U o R

Q QQQ h v( Q

vh

n v 6 v U o C

vh

3 h

vh

S hU QQ

HI 2 )  R ~`}5vhF

QQQ 

QQQ

QQ

'   

vh

F4

!#

and thus we have ), we nd

and

. Ignoring terms of

 )

oP Q

FD9zBxih ' @   

o #P

D@B    F9zxusv

U d XIC iziG'sv  )   

   usv

 l %)  p H v 2 }v

t 

G   C 9z'C5us    ) R v`v  ` 

(3.25)

v
#

  usv

# p

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

52

is a constant. Finally, we can calculate the equation of state from

which is our familiar friend. Just as we did in the previous section, we can also calculate the probability of a given microstate (or set of microstates). For example, we can calculate the probability of nding particle with momentum and position . Similar to the calculation in the previous two level example, we can write this in terms of the number of states with particles and energy :

Using eq. (3.27), we nd

From Stirlings formula, the ratio of large limit, we get

to

is approximately

and in the (3.36)

This is a properly normalized Maxwell-Boltzmann distribution. It is more typically shown in a more familiar form using the substitution

This distribution describes the probability of particles at a given temperature. Note that this is a Gaussian distribution in dimensions (reecting the three dimensional nature of space) with zero mean and variance . 3.3.3 Mixing entropy and Gibbs paradox The expression eq. (3.28) for the entropy of the ideal gas has a signicant problem: it is not extensive! In particular, it has a term like , where is a non-extensive constant, and thus the , not . This non-extensive nature comes from the contribution entropy scales like and thus , since is extensive.

9z G

It is often more instructive to calculate the probability of nding particle 1 with momentum where in the box, i.e.

) i ` v v  i 2 v`h 9$ 7 h H 2 2 yo g g h  v  v ) 5}i) l H v h q 10 h  sv 2 yo 1)0  srq) l H v h q ) v y2 o  2 R ~`}v U i ` v H l h l 2  v 2 U ) U U  61)0  s v  o g 2 o  2 R v` 5v 2 )X t U 5}i) l H  h q U U  ) 1)0 h  v o ) 2  o  2 H 2 R ~` v 2  o d X

) 2

v H 4

5!~   i 2  l

   us  5qu $ 7 u q$ 7      )  ! 2 H 

   usv 9u $ h    ) 2 H!  l H

bF4 d Xp 5 9vXF % 9z$  )  G G bp4

) 6 2 yo  2 R l }5v

v) bY h

 )   uv hF4

Y 

S hU QQ QQQ   9

i$ 7  

(3.32)

i$ 7  

(3.33) any-

(3.34)

(3.35)

U d X  )

(3.37)

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

53

Mixing entropy This difculty is intimately related to the mixing entropy of two gases: Consider two distinct gases, initially occupying volumes and at the same temperature . One can imagine this in terms of a box of volume with a partition to separate out the two gases. Now, if we remove the partition, the two gases will mix and they will each expand to occupy the total volume . This mixing process is clearly irreversible and thus must be accompanied by an increase in entropy, which we calculate below. According to eq. (3.28), the initial entropy is

The nal entropy of the mixed gas is

The temperature of the gas is unchanged by mixing since

. Also, since only depends on , there is no change on the momenta contributions Thus, to the entropy at xed temperature. Thus, the mixing entropy

is solely from the contribution of the coordinates.4 The above expression can be generalized for the mixing of components, yielding

4 This equation can also be rewritten using

and

to get

This form is very similar to the entropy found in the two level system, which in that case derived from the number of ways of rearranging particles in the levels. For this reason, that entropy is also often referred to as a mixing entropy in that one is mixing a certain number of excited energy particles among the sites.

v) b3 h

)

is the momentum contribution to the entropy general for any ideal, monatomic gas, we have

  2  F 2 w u p

8 8 8 FD CA@o h

@ DzBxusbp n h j pe

P 2

S S

Y w Y reds V n j t D Y c Y Y t W a `FW b Y t j SS b Y YXV Y t |W Y vS W # S S t Y   p u C Y%b H 8

 %

w  w  h5tpX 2 ytbp 0" w  @

 }vuF vh

yo

2  $ 8 8 H @ F 5D CA@C 4 8 p 2 yw 

%z  

where

. Since

2

  w 2 pvu 2

w  w   yq$ h$Fzu 0" 2 y75rD CA@ w 8 8

2 

h

2 h w 2 h w

2 xhm  w 

 2 `$

C 3 h @

T Uv

T U

D CA@ot 8 8 @

T U

T U

(3.38)

(3.39) in

(3.40)

(3.41)

(3.42)

(3.43)

(3.44)

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

54

Gibbs paradox Gibbs Paradox is related to what happens when two gases, initially on opposite sides of a partition, are indentical and have the same density . Imagine removing the partition in this case and then reinserting it. There should be no change: the density, , , etc will be the same since the gas particles are all from the same, indistinguishable gas. On the other hand, eq. (3.43) predicts a change. Lets think about what is truly happening here. When we replace the partition, the system does not return to its original state in the sense that the original particles on one side of the partition have mixed with those on the other. If we could label these particles in some way, then we would see that they would have mixed. But these particles are not labeled they are considered to be identical and therefore, indistinguishable. For example, if we exchange particles (A,B) to (B,A) we see that there is a change. However, if we exchange the pair (A,A) to (A,A) we see that in this case of identical particles, the exchange does not lead to a different microstate but simply yields the original one. Thus, for the case of identical (indistinguishable) particles, we should not count the rearrangements of the labels of these particles. Thus, in the mixing entropy eq. (3.43), we have over-counted . If we correct for this the microstates by including the number of permutations of the labels addition, i.e. normalize by the number of permutations , then we get

and the corresponding entropy

is the density. This is now extensive since and are both intensive. where We can now use the above to calculate for the cases of distinguishable and indistinguishable particles. For distinguishable particles, the nal densities are since the particles will mix to ll the full volume . Thus, we get

which is precisely what we got before in eq. (3.43). This occurs since the entropy change here comes from the additional places whihch the particles can go (due to increased volume) rather than an entropy of the labels. For the mixing of gases of identical particles at the same density , we get

as we would expect since we have now removed the permutation entropy which is unphysical for indentical particles.

2 " 2 ) 2 5$1i0"& &   )

s& H Fg`w l q e $ w

e F 2 pX 0  F 2 ypX 0 & w & H & w & 2 2 w y0 2 2 w 2  p w $ FX 0 H 2 ku$ F  yu0v  w

  2  p 2 w $ F 0 2 2 2  FX w $ F

2 ) 2 5$1i5v&   )

1) &

)  F

U 10 h   v ) 2 o v`v 2 U o U 
# # # $


(3.45)

n n 2  2 j $ 0 j %H 2 pX 2 w F  0  2 H p 2 yw F 0 FD CA@C 8 8 8 H @ 

w eYs
u 8

q s

vs|   

(3.46)

w H  F!bF4xF4

FD qA@QC 8 8 8 H @ 8 & @ & 5C " 2 yuuq 2 y0v  w  ) w

T f

T U

T U

q kbFX
 ) 105v& 8

(3.47)

(3.48)

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

55

Note that this permutation entropy is very relevant for distinguishable particles since we can tell the two cases apart. For example, in the two level case in the previous section, there is no need for this factor since we can distinguish particles by their locations (which are xed and are not microstates, and thus there is no permutation entropy from the permutations of these (nonexistent) position microstates). We have introduced this extra factor of by hand using physical arguments. However, it more naturally comes from quantum mechanics, which specically deals with the fact that particles are indentical. Moreover, a quantum mechanical description of indentical particles requires proper factor that we have put in by hand symmetrization of the wave function and naturally yields the here.

) l

) l

) l

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

56

3.4 Problems
1. Polymer stat mech from a two-state-like model A polymer is a long chain molecule. Its arrangement in space can look much like a random walk. Here, you will calculate properties of these random walk congurations. (a) It is natural to think of a polymer as a random walk of steps with . For the one dimensional ( ) case, we can think of a polymer with an end-to-end distance as a sum of steps with and , where is the size of a single step (which is a constant). Thus, we must have . Negative are possible for example a polymer which always steps to the left ( for all ) has . Calculate the number of ways congurations with a given and for . (b) Calculate the mean end to end distance for the above. Hint: Can you answer this question by symmetry arguments? (c) Calculate the variance of the end to end distance for the above. Hint: Make an approximation for by Taylor expanding in the small quantity for . Keep only enough terms as youll need to calculate the variance. (d) If the steps in this random walk are uncorrelated (both in the sense of the step which preceeds a given step, as well as any correlation between , , , etc directions), what is the relationship between the number of congurations of a polymer in dimensions in terms of that of one-dimensional polymers , where is the number of monomers (i.e. number of steps in the random walk) and is the end to end distance? Hint: Think about the relationship between and when there is no correlation between the s between dimensions. (e) What is the mean and variance of for the dimensional case?

2. A simple model of disordered systems, such as proteins Many disordered systems freeze into a particular conguration, i.e. as one lowers the temperature, the system is found in its ground state. Typically, one says that the number of states with a given energy follows a Gaussian distribution:

where is a constant (related to the number of congurations per particle irrespective of the energy). (a) Argue (using physical arguments) why some system comprised of components with some random disorder (i.e. random variation from component to component) might have a of this form. Hint: Energy is extensive. (b) Calculate , , and (c) Find the freezing temperature , i.e. the temperature at which the entropy vanishes .

AFA l
# #

 #  vbp l

#v%   # | #  #  |

2

v T) 2

# 2

I# 4#

H 97

   6 U 5'

 #  v

# #| 
  %E #

7 g l

# w xI #

  %E 'b  

"h zb  

 # C

  '

h 6

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

57

(d) Explain why the entropy vanishes at a low, but non-zero, temperature. What is happening in the system? This freezing behavior is analogous to the folding of proteins and this simple Gaussian model for the energy levels actually goes a long way to describing the thermodynamics of protein folding (in which the microstates would be different protein conformations and freezing into low energy states is equivalent to proteins folding into particular native folds). 3. An ideal gas of complex molecules in

Consider a gas of molecules in dimensions. One can have in experimental situations by restricting these molecules (eg to live on a surface, and thus ), but we can solve equations mathematically for any (even ). With these results, youll see which thermodynamic parameters of an ideal gas depend on the dimension and how. Following arguments similar to that in the notes, calculate (or give clear physical reasoning for the value of) the following quantities in terms of at constant , , and : (a) the entropy (b) the total energy of the system as a function of and (c) the heat capacity (d) the equation of state 4. Fermi-Dirac Statistics in the micro-canonical ensemble In this problem, you will calculate Fermi-Dirac (FD) statistics in yet another way. Recall that for FD statistics, a single state can be occupied only by either zero or one particle. Consider FD statistics for a series of states all at the same energy . We will say that each state can be occupied only by or particles. If the state is occupied, then it contributes an energy to the state. (a) If of the states are occupied, what is the energy of the system. (b) Calculate the number of microstates with sites occupied as well as the density of states . Also calculate the respective entropies . (c) Using a thermodynamic relationship between , , and , calculate the average number of occupied sites vs temperature .

For our calculation of the density of states for an ideal gas, we used a relationship for the area of a hypersphere in dimensions. Here, we will derive it. (a) Consider the integral

Show that

st w @ 2 q q @ @ @ 2q Q@BB@w 2 gw 2 EvH U BB&u

5. The area of a hypersphere in

dimensions

  uv

dimensions

   usv

nq

  v

S TATISTICAL

MECHANICS

I: M ICROCANONICAL

ENSEMBLE

58 to -dimensional hyperspherical

and show that this reduces correctly for and . (e) Use the above to calculate the volume of a hypersphere of radius

 v ) i 1u q q rq 2 nq }v rq gp t

(i v 7 i v si 8p q nq }v

1  ) 2

 z

Show that , where (d) Equate these two values for to get

is the Gamma function.

in

dimensions:

}v
2

where is the factor that arises upon integration over the angles. Show that and . (c) can be determined for any by writing in hyperspherical coordinates:

p d q @@ qgp rq U BB@ i U BB@ 2 7 q @@

(b) We can transform the volume element coordinates to get

q r 8p rq '

@ 2 u zABxC

p i

v v) qp p i)1si g

o gp qp
p

q gp

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

59

4 Statistical mechanics II: Canonical ensemble


Often it is easier (or more desirable) to perform calculations for systems at constant instead of or constant instead of . These two cases correspond to the canonical and grand canonical ensembles, respectively. Typically, we will use these ensembles for the bulk of calculations we perform.

4.1 Canonical ensemble


While studying adiabatically isolated systems is both instructive (in terms of how statistical mechanics connects to probability) and often a useful form way to perform calculations, it is often more natural to consider a system which is in thermal contact with a very large heat bath, such that the system can be considered to be at constant temperature. The macrostates of this setup are considered to be part of the canonical ensemble and are thus specied by a value of and (or what ever generalized volume is relevant). Thus, while the microcanonical ensemble addresses the probability of microstate at xed energy , the canonical ensemble determines the analogous probabilities at xed temperature . 4.1.1 Derivation of the canonical ensemble from the microcanonical We can think of the system and the heat bath (reservoir) together as a single microcanonical ensemble. Thus, we write probabilities of the microstates of entire reservoir + system state as

denotes the entire system + bath, denotes the microstate for , and where is the total energy for the case. Since, in the canonical ensemble, we do not care about the heat bath, we want to integrate out the heat bath degrees of freedom: (4.2)

The sum over reservoir microstates is contributes a factor of states

, i.e. the reservoir density of (4.3)

The above equation is of the same form as the previous microcanonical examples, in which we calculated the probability of a microstate by looking at the density of states with that microstate removed. This equation, in turn, is related to the entropies of and

(4.4)

DD p

W W W W W D  FW Dr DD  Fr  q v H7FDIr D  Ets v hDp q Iu z  9uIr H D H        $v W W W 5 2 W  DD xpv   Est z 7p q Iu zI H D D H     z7 W # # B W  W  W W  v    v $ zyvxw'D 

$ Dp  D

 DD  9pvI DD H  xpQI$ v w 

# k

H  Ii

W W

v

H D  7DpvI l 

 'Cs

  |v

 z7

 |z D '

 h bE

 vh

(4.1)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

60

We are assuming that the heat bath is much, much larger than the system . Physically, this is how temperature is kept constant: signicant changes for are insignicant for . In terms of energies and entropies, this means that

Therefore we can Taylor expand

for small

. This yields (4.6)

Using the thermodynamic relationship

, we get

. Note that we have where the exponential term is called the Boltzmann weight of the microstate lumped all of the constants into a single normalization, which we calculate from the normalization condition of the microstates to be

which is known as the partition function. This is no small result it will be the foundation of much of the calculations to follow. It was very much foreshadowed in the simpler examples we did previously, but here we can show that this is a generic result. 4.1.2 Physical interpretation of the partition function It is natural to interpret the partition function as the normalization of the Boltzmann probability of a microstate. However, as we will see, there is also a natural physical interpretation for as well. While is xed in the microcanonical ensemble, it is a random variable in the canonical ensemble, i.e. it is allowed to uctuate. Thus, we can calculate the probability that the system has energy , , by

We have gone from a sum over microstates to an integral over by a change of variables. This will , i.e. how many microstates have energy , ie. result in a Jacobian which is naturally

Since

, we can bring

into the exponential to get

@@@  BB&vv"

 ) H 3 v97

# t

     ) H vgH |z di% ~%7

  ~

s  q)vv

W W W   ) $3q|z 76 l  H     9H76 l  v7  &) l |  ) W  W  W W   2 sw  W  I9Dpv !q z hDFI H D   H D   q W v W H7DFI W W W D   W  


 H    ~QI|v d

 

~gH |v93v97 l vv5     ) H   6 l 5 2 vQI|zdi|z ' 5 2  H    

  H q 6 %)ugH ~bF q 7 ~7  s    bp"s )  5 2 BBV @@@  BB@ 5 2 3BB@ 5 2 @@ @@

 ) u q|z

 &

W  H76 5 2 W e

b ca

 H    vgI|z d~

l 

  vv

(4.5)

(4.7)

 vh

(4.8)

vv7  

  vv7

(4.9)

(4.10)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

61

where is a free energy (as a function of energy). Moreover, we see that the probability arises from two terms: either how entropically likely a given energy level is or how low is that energy. We can make a similar agument with to see how it may be connected to thermodynamics:

(4.12) Since is extensive, we can use the saddle point approximation to calculate this integral in the termodynamic limit (4.13) where is the most likely energy (i.e. the energy which minimizes ). Moreover, this yields the relationship, that in equilibrium (where one is most likely to nd the most probable states)

We can also derive this result by calculating the average energy

In thermodynamics, we encountered a similar expression for the energy

Comparing these two equations:

4.1.3 Fluctuations in the canonical ensemble

How much does the energy uctuate? In the previous two approaches for deriving , we looked at the most likely energy and the average energy . These two values will be identical if there are negligable uctuations of energy. We can get an idea of these uctuations by computing the . variance

H Fu

H p'

`eFuH

it is also natural to connect the free energy to the partition function by

or

 9  " Fz  H   n j H  2  H %Qw       H      n9H7  q|znH76|v

  i|v

where averages:

(4.15) and we have used a common mathematical trick of taking derivatives of sums to get (4.16)

 ) % ivv

(4.14)

(4.17)

(4.18) .

bp

H i|v  

H 97

"|z  

 n9H7  5 2

5~93 v97    ) H

v  ) 3 q

Rearranging terms (and using the equality forced by the delta function

) we get

 H    vI|v d~



% q|v 97  )  H

H 7

H   l q|z

  7%q)~

H F

"

 nH76

vQI|zd H  ) H %v976 5e 1 0 6 5 2 t

H 7

% q|v 97  )  H

  l |v

5 2

|v7 |v    

  H ~b%" ` 2

l ) ' 5 2

 !

(4.11)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

62

where is the total number of states and is just a constant. We are performing a Laplace transformation, instead of a Fourier transformation. In terms of generating moments from the characteristic function, this difference is irrelevant. Thus, using this analogy, we can think of as the generating function of the cumulants of energy, with a mean

and a variance

(4.21)

Thus, we nd the variance to be

where we have used . The equation above is important for several reasons: 1. We see that the variance is extensive (since standard deviation

is extensive). This means that the relative

is essentially like a delta function: one energy and thus, in the thermodynamic limit, dominates. Indeed, we can write this distribution as

since the higher order moments vanish quickly in the thermodynamic limit. This also tells us that for smaller systems (eg nanoclusters, proteins), uctuations may be important.

1 92i

2 H H !

QQ QQQ    2 7 C   l 9H F C C

More generally, we can write the th cumulant of

as

(4.22)

(4.23)

(4.24)

(4.25)

i|vnH7|v 5 2 Ii|vn9H7 2 |z 5 2      l H     

 H ~7 7

  6vv7 
H

H   l i|v

2 l

DD p"v7 97  H

h}v

  I vbp

  ~7 2 2

   nH7|v 5 2

 qvv

  ~  

A natural means to do this is to think of . Indeed, when we write

as being like a characteristic function of the distribution (4.19)

 9H7

 % v )  H 2

2 H   F 2  H A2

l 2

4r v   vh  

DD pX

F2

  vh

(4.20)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

63

2. This connection between energy variance and heat capacity is also a convenient trick for calculating heat capacities in computer simulations. 3. The heat capacity is not always so well behaved. In phase transitions, the heat capacity can diverge. This signals huge variations in the states of the system. 4. Since the energy of the system is well dened, we expect that the canonical and microcanonical approaches describe exactly the same physics both at the microscopic and macroscopic scales.

We have used the thermodynamic relation

in many places to introduce temperature in a physical way in the previous discussions. However, one can introduce the concept of temperature without thermodynamics, albeit from a more mathematical approach. First, recall the idea that in work-like terms, conjugate terms appear (eg ): the product of an extensive quantity describing how much there is (eg volume , number of particles , extension ) and a intensive, force-like term describing the energy gain/cost for changing the value of the corresponding extensive quantity (eg the pressure , chemical potential , or force ). Energy is an extensive quantity. What is its corresponding conjugate variable? Mathematically, Laplace transforms allow one to write mathematical expressions to describe the system with one conjugate variable held constant in terms of an expression holding the other constant. For example, lets say that we calculated the density of states at constant energy and volume and wanted instead . One could calculate this as

work term into the Thus, the Laplace transform is a mathematical expression of including the relevant free energy (energy is constant here, so we can take it to zero). In the previous section, we saw that the Laplace transform of in was , with as the conjugate of energy. What this means physically is that is in some sense the force which keeps energy constrained (mathematically, this can also be thought of as a Lagrange multiplier). To get a physical idea for what is , lets consider two extremes: 1. At , there is no price for changing energy and so we expect many different energy levels to be found. Moreover, in this case, what clearly dominates is the entropic terms (since ). 2. For , then (a) there is a huge price for increasing energy and (b) entropy is vanishingly irrelevant compared to . Thus, in this case, energy simply wants to decrease as much as possible. Physically, it is natural to connect these two extremes with the high and low temperature regimes, , where can be considered to be a respectively. Thus, it is natural to make the association constant of proportionality, and of course turns out to be Boltzmanns constant.

57

  I v

s H  QI~b q 6

 ~

  ~

3 )

l '

"~%7    H

 

~~ 5~        ~

4.1.4 Interpretation of

in terms of Laplace/Legendre transformations

"

   ~sv

(4.26)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

64

Finally, one can think of Laplace transforms as another integral representation of a delta function:

Thus, one can write the density of states

as inverse Laplace transform of the partition function, again connecting and as a conjugate pair, just as the partition function can be considered to be the Laplace transform of the density of states:

4.2 Applications of the canonical ensemble approach


We will see below that the systems studied with the microcanonical approach in the previous section can be effortlessly rederived using the canonical approach, displaying why it is the approach more typically employed. 4.2.1 Two level system which equals 0 or 1, if Recall, that in the two level system, we have occupation probabilities the particle is in its ground or excited states, respectively. Thus, the Hamiltonian takes the form . We can directly calculate the probabilities of microstates

To go further, we need to calculate the partition function

is the single particle partition function. The particle partition function is just the product of the single particle partition functions since the particles dont interact. We see this explicitly mathematically in the formulae above, but its perhaps most physical to think about this in terms of the fact that the corresponding density of states can also be broken down into the product of single particle density of states.

 h9H7w

 w    h9H7ChH7ix9H76 C

where

ets|vnH q 5 2 d~797     U
4"  

U jik qh9H7 4C fh t BB@ t jikqhqhHr6 C g  3 @ @  fh

ts EH q u H 
 H v7 7

4o vQH |v X H    

ts

  vv"

U hH 6

    r9viv797u q 6 5 2 uQI|z q 5 2 s H 

(4.27)

H   E  f 7     r v

U hH 5 C 2

U y

vv   e

(4.28)

(4.29)

(4.30)

(4.31)

(4.32)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

65

This leads to a partition function of (4.33)

Note that this is a familiar form: we expand the product, we get a binomial series

Thus, we see how our microcanonical density of states is hidden inside our calculation. However, while this is in there, we got to the free energy in a considerably more direct route and need not at all. From this, we can easily calculate the free energy calculate

The entropy is simply found by

and the energy by

or by

4.2.2 Ideal gas

In the canonical ensemble, we are calculating macrostates with constant , , and . A microstate species the positions and momenta of all particles: and thus the probability of a microstate is

Including the modications to phase space of indentical particles (i.e. the factor of

where we have found this by calculating the Gaussian integral that we have dened a new length scale

As we will demonstrate later, this length scale controls the onset of quantum mechanical effects in an ideal gas.

}v `| 2u})H7 v 2   U U i o %z74 l U i 2 2 v H 6 Co U   2 o rv g g 1  0 o o ) l v U H 6 l  i t7  2 1  0 q

% 7 ) H w  ) H 3 97

 H v7 7

)  ~"0| U
 2 j

%)57w   ) H 397w Y5  H Y  ) H %97 bF  l %97w ) %Qw Y l n Yydts3)9H7w l q pbY"  w  

ys%C)H76w l q F'eFu|   H H

U tshH7w l q 

rv

q0C 

%z7  

 h9H7 4

e n 0 j

(4.34)

t    ubzv H   ~v

(4.35)

(4.36)

(4.37)

(4.38)

(4.39)

), we get (4.40) . Note

(4.41)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

66

The free energy is given by

Thus, we have rederived the microcanonical results with considerably less effort. However, as in the two level system, the microcanonical result is lurking within these formulae as well. In particular, in the microcanonical case, we had the delta function to x the energy. If we use an integral representation for this delta function, we will get terms just like the above, where would play the role of the Lagrange multiplier. 4.2.3 Equipartition theorem Imagine a system with , independent harmonic degrees of freedom. Its Hamiltonian is

is the mass of particle and is its spring constant. Since these particles do not interact where which is mathematically seen by the fact that we can separate the Hamiltonian into single particle Hamiltonians:

the partition function of the total system is simply the product of the single particle partition functions

Since these are just Gaussian integrals, they are easy to do. Using we get

$ 3 v H  $ v H )  ) 2 7 %) 2 976 mt 3h}v # # }v v   2 T) 2 H76

$ H7 

% )

Y5!~  v) bY h

where we have used state is obtained from

to get the internal energy

. Similarly, the equation of (4.44)

Various thermodynamic properties can be obtained from Duhem equation). For example, from the entropy

i 2 3
(4.42)

}v g

w `' % H H n n j j i 2 H p' % p h w F H   g H w H  H h y(bFu g4p' %tp' H F H

4 o Y rv v Y 1 1
2 l

h QH

v Y

w 2 

2 l

v 0

# C

% )

U S QQ  ' QQQ   ! H Y o U T QQ %w S n 4 H QQQ o j F  H

w 2 

Fw

v 0 C

 9H76

v Y

0 C


4e

% ) 

 9H7

(aka the Gibbs-

(4.43)

(4.45)

(4.46)

(4.47) ,

(4.48)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

67

where is the frequency of degree of freedom (and we have introduced lower cutoffs for volume and momenta in terms of the angular momentum ). Thus, the total partition function has the form (4.49) We can calculate the total energy as

to the energy. In general, momentum and Thus, each of the degrees of freedom contributes position degrees of freedom each contribute to the total energy. This is called the equipartition theorem. It is a simple way to calculate the energy vs the number of independent degrees of freedom. Mathematically, one can see this since each integral over a momenta ( ) or position ( ) requires another Gaussian integral, and thus yields another factor of to the partition function. The other quantities (eg the frequencies) drop out when we take . When these degrees of freedom interact, this answer becomes a bit more complicated, of course. We can use this result to easily calculate the energy and heat capacity of a crystal, which can simply be modeled as a set of non-interacting springs. Thus, we have the energy , which is refered to as Dulong and Petits law, and the heat capacity . As we will see later in the course, these values hold in the classical limit, which corresponds to high temperatures . At lower temperatures, the quantum nature of the discrete energy levels is important, whereas at high temperature, the energy level spacing is much smaller than the temperature and it looks like a continuum of energy levels.

Y h h

 ~ ) 

 p )

l m U C C l  "

m3 t C p

v) b3

"

m t

) 

r v

(4.50)

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

68

4.3 Problems
1. Volume reservoir Consider the case of a system in contact with a large reservoir, in a way such heat can be transfered from the system to the reservoir and the volume of the system can change (although the volume of the system + reservoir is xed). Assume that the volume of the reservoir is very, very large, compared to that of the system (and thus accordingly ).

where the in the exponential should be replaced with the appropriate function of thermodynamic state functions and is a normalization. (d) Using the normalization conditions for calculate as a function of . (e) Using the saddle point approximation, demonstrate a link between and at equilibrium. (f) Explain why the above makes sense in terms of thermodynamic arguments. 2. Derivation of the Grand Canonical Ensemble In class, I said that, in the Grand Canonical Ensemble, we can write the probability of a microstate of system in contact with a reservoir can be written as

Using arguments parallel to our derivation of the canonical ensemble from the microcanonical ensemble, derive this result from the microcanonical case of a small system in thermal contact with (and that can exchange particles with) the reservoir .

and integrate out the reservoir microstates

Hint 2: The equation for a 2D Taylor expansion for small

and

is

 DD H   pQI$ vI

   w

S QQQQ 3 6G Q 3 

z

S QQ 6G QQQ   3 E  3 g0E w w 3 w w  3  nW W W  WD DD  DpCpv  e|DpQIu vI  z d u v yvxwv c  D H   w 

Hint 1: In the microcanonical case (considering must be xed. Thus, one can write the probability

), the energy and the number of particles

v

 v

where and

is a normalization (analogous to the partition function in the canonical ensemble), and are the energy and number of particles in microstate , respectively.

s 3q)

#  v

 v7

(a) Write the probability of nding a microstate of the entire (b) Integrate out the reservoir microstates to get (c) Using the fact that simplify your answer for have the form

case,

above. Your answer should

nW  u W  z7 

s q Bq)

 76 W

 v7  l

 vh

 v H

p  W

 v7 l

W oS '

 y
z

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

69

The energy levels of a harmonic oscillator are evenly spaced. In other words where is Plancks constant, is the vibration frequency, and is the energy level number. is an integer which goes from to . Each energy level can only be occupied by a single state. Thus, the microstates in this problem are the energy levels directly corresponding to the values of . The ground state is and has an energy of . (a) Starting from the denition of a partition function, write the partition function for this system in terms of a sum over . (b) Using the mathematical identity

4. Einsteins model for the heat capacity In class, I mentioned that the heat capacity was not as simple in real life as one would get from classical arguments. Einstein came up with a model for the heat capacity using quantum mechanical arguments. Specically, he said that a solid is like a network of springs. Thus, each atom vibrates much like the quantized vibrations we talked about above. Assume that each oscillator is vibrating at a constant frequency (called the Einstein frequency). Also, assume that there are such oscillators and that each one vibrates independently (i.e. they do not interact). (a) Starting from your result from Problem 2b, calculate the free energy of oscillators. Assume that each oscillator operates independently (i.e. do not interact). (b) From (a), derive the average energy for a given oscillator. (c) From (b), derive the heat capacity . . For temperatures much greater than (d) We dene the Einstein temperature as , write a simple approximation for . How does your result compare with the classical value (which is valid for high temperatures)? (Note that oscillators have 2 degrees of freedom per oscillator due to position and momentum degrees of freedom). Hint: Write your answer to (c) in terms of . For , then and you can Taylor expand your result from (c) for small . look at low temperatures? Simplify your expression from (c) as much as (e) How does you can and comment on the value of as . (f) Its interesting to consider how different substances/models lead to different heat capacities at low temperature. The model weve worked out here is valid for phonons quantized vibrational modes. Compare your result for the previous part with what weve found in class for classical modes and for an electron gas.

&

) 7

&

q 7 )

m U

) 7

m t

) ~

Evaluate the sum in part (a) to write

in terms of , , , and .

 l  1 ) z w 

m t

3. Statistical mechanics of quantized vibrations (a harmonic oscillator)

v )

m t

) 

H l

v l

w v

m G @ @ @ w C BB`w o Qw 2 Q0gw

m t

5|

&

S TATISTICAL

MECHANICS

II: C ANONICAL

ENSEMBLE

70

In the Einstein model above, we assumed that all springs have the same frequency . The results of this model are pretty good at high temperature, but dont agree well with the experimental results at low temperature. To create a more accurate model (and get better experimental agreement) Debye later suggested a slightly more complicated model. Instead of saying that the degeneracy of spring frequencies is xed to a single frequency ( ), Debye suggested a more realistic function. (a) A simple way to approximate in Debye theory is to assume that the degeneracy of frequencies (i.e. how many different ways one can is related to the surface area of a sphere of radius in dimensions. For , what would we get for (up to a constant). (b) Use this result to calculate the energy and heat capacity for a Debye model of a crystal. You may need to leave part of your answer in the form of an integral. (c) What does the low temperature heat capacity look like? (Taylor expand your result above for low ).

6. Diatomic molecules: canonical approach Consider a classical system of noninteracting diatomic molecules enclosed in a box of volume at temperature . The Hamiltonian for a single molecule is taken to be

2 2 I7 H

(a) Calculate the Helmholtz free energy of the system (b) Calculate the heat capacity at constant volume (c) Calculate the mean square molecule diameter

where

are the momenta and

are the positions or particle 1 and 2, respectively.

5. Debyes theory for crystals

 $ 

2 H 2 Iu l

h v rq2 g v 2 2  p w 2 w 2 l  u $

 $ 

 u 

 7

m m
H

 du

 $  

S TATISTICAL

MECHANICS

III: G RAND

CANONICAL ENSEMBLE

71

5 Statistical mechanics III: Grand canonical ensemble


5.1 Derivation of the grand canonical ensemble
We have solved several problems in both the microcanonical and canonical approach, yielding the same answer. This demonstrates the mathematical result we previously found, that in the thermodynamic limit, these two ensembles will yield identical results. However, it is considerably easier to do many calculations in the canonical ensemble vs the microcanonical, since the restriction of xed energy can make calculations difcult. Similarly, often the restriction of a xed number particles can make calculations harder than they need to be and it is even more convenient to assume that instead of held constant, we work at constant chemical potential . The corresponding microstates contain an indenite number of particles, i.e. different microstates will have a varying number of particles. Thus, just as we use the Hamiltonian to tell us the energy of a microstate , we use a new function to tell us how many particles are in a given microstate . 5.1.1 Sketch of derivation of the grand canonical ensemble from the microcanonical Using arguments similar to those used to derive the canonical ensemble from the a microcanonical view of the system + reservoir, we can calculate the probability of nding microstate to be

where as before and is a normalization (analogous to the partition function in the canonical ensemble) and is called the Grand partition function. Since , we nd

We can reorganize the summation above by grouping all microstates with a given number of particles (5.3)

Thus, we see that the grand partition function can be written in terms of a sum of -particle partitions. Moreover, one can interpret the above as a Laplace transform of the partition function, transforming from xed and to xed and .

Also, just as we used the partition function to calculate in the canonical ensemble, we can use the grand partition function to calculate in the grand canonical ensemble, by

 H  'Q7|v

diq 

 zm

w    Qq)|v9H76 l 5 2

  vv7

  u7

uQr|v  H 

'bEvq%0u97   )  US di|z ' 5 2  

5.1.2 Probability of nding

particles

U U W U s ubz~q%)07 ots znH q 5 ut2 u97   W

ys W oS '

z

ts

v

w 

 v

v

w 

 v

nH q 6

 v

nH q 6

W oS '

z

 v

u 

(5.1)

(5.2)

oS

'

(5.4)

S TATISTICAL Since

MECHANICS

III: G RAND

CANONICAL ENSEMBLE

72

where we have substituted in the grand potential directly parallels our argument used to show that This parallel can be summarized as follows equilibrium condition maximize minimize minimize

One could use similar arguments to handle constant pressure cases, etc, either by microcanonical arguments as used in the previous section, or via Laplace transform/Legendre transform arguments. 5.1.3 Fluctuations in the grand canonical ensemble We can use derived above to calculate the uctuations in in the grand canonical ensemble. We can calculate cumulants of from a series expansion of analogous to the way we calculated cumulants of from a series expansion of , since one can interpret as being like a characteristic function of just as was like a characteristic function of . For example, the average number of particles in the system is

and the variance is given by

Since is clearly extensive and clearly intensive, the variance above is linear in . Thus, the relative uctuation in (the square root of the variance divided by the mean) scales like and thus vanishes in the thermodynamic limit. Thus, in the thermodynamic limit, the grand canonical ensemble is equivalent to the canonical ensemble, since these uctuations are negligable (just as the simiarly negligable energy uctuations in the canonical ensemble made it equivalent to the microcanonical ensemble).

  bzvi% )

    )   u977ubzvi%076

It is natural to calculate

using the saddle point approximation, which yields (5.8)

5.1.4 Physical interpretation of

 h9 Y  

  vv   h `

US ) ) q%ivv 97`v ' H  usH HQ%H s


normalization held constant , , , , , ,

2 h  bF  

 ) % q'v

H 7

bp

 h9 p  

3 ) 7 H  3 ) 7 H  )  7

 6t `

 h9 bp   g

 ) H  % u u  h   F

2 h9  Hp  

  %zv

q 9H76 

 ) % i'v   uv 'v  l

, we get

oS

Y Y H A2 A2

H 97

'   `ubz~
  u7

(5.5) . This argument

(5.6)

(5.7)

S TATISTICAL

MECHANICS

III: G RAND

CANONICAL ENSEMBLE

73

where is the value of corresponding to the maximum term in the summation above. Because of the sharpness of the distribution of (due to the vanishing width in the thermodynamic limit), we can associate with (since is essentially a delta function in the thermodynamic limit), and thus make the connection

and since we normally just think of

as

, we can write

i.e. the grand partition function is connected to the grand potential, just as the partition function is connected to the free energy. We can obtain the values of other thermodynamic quantities using the analog of the Gibbs-Duhem equation for the grand potential

and thus

5.2 Applications of the grand canonical ensemble approach


To demonstrate both the means and ease of calculation in the grand canonical ensemble, I will solve (some, now very familiar) examples below. 5.2.1 Gibbs Entropy formula Gibbs derived an equation which relates the entropy to the probability of a given microstate:

Note that this is also very similar to the equation we found for the mixing entropy in 3.5.1. Using our understanding of in the Grand canonical ensemble, we can examine this equation more closely. In particular, we can use this expression to prove to above relation. For now, lets call this value and see how it relates to the entropy.

w  w  q  bpys|v h|zH |v7   |z7 w s w     bp3t|v h|vnH q |z7 s H w    H     qbpgr|v |znH q |v7 |z7hF|z7

 C bzvi3    )

o S QQ ' QQQ  

H |z7{p|v7    )qi|v `w|vn97|v7  H    | H

~ bz %4" bp'H H H  7  bzvi%)  u97   

H !

    |v7 p|v7

 H Q`

S QQ ' QQQ  

H Qy

  u7

S ro QQ

QQQ 

(5.9)

(5.10)

(5.11)

(5.12)

(5.13)

S TATISTICAL

MECHANICS

III: G RAND

CANONICAL ENSEMBLE

74

Finally, since then we get

since we equate the thermal average energy equation for the entropy works. 5.2.2 Ideal gas

To complete our study of the ideal gas, we now calculate the properties of an ideal gas using the grand canonical ensemble. We rst calculate the grand partition function:

where we have used the series expansion and is called the fugacity. From this, we can calculate the grand potential

But, since

and the number of particles is found to be

Comparing the last two equations yields , i.e. the equation of state . Finally, one can calculate the chemical potential using the formulae above simply by solving for . One nds

Often, one writes the chemical potential at arbitrary pressure in terms of the chemical potential at atm: (5.21) gure

g "s |zm P|v q l u%4 q l   H   w s H H H |vrx|v7 Xp  H H %" ys|zmI|z q |v7 w   H     H


and thermodynamic averages take the form with and with and we dene , the gas pressure can be directly obtained by

4 4 $ o 4  `  # U v H o o

as before

"~ 

2 1 

 ) 105%5T   S )QQ  Q o 4  o 4 ' QQ  H   1  S ro QQQ o  4  o  4 7 ' QQ    ! H H    H %u%45 H H o 4 %3|bFu  H H    h767v }v )  3 76 { ) C 3 U C U n $  )  U 4 %h76 l U 4 j %07 l  )   U  U # U  )  %076 'Cbzvi30976    )  o

i 4 o  g

  vpu w ) 

V `

o4

p' '

bp l

Since

, we nd

(5.14) ,

(5.15)

, we see that the Gibbs

(5.16)

(5.17)

(5.18)

(5.19)

(5.20)

S TATISTICAL

MECHANICS

III: G RAND

CANONICAL ENSEMBLE

75

5.2.3 A simple chemical reaction

This is the simplest type of chemical equilibrium. One example of this is the transformation between two isomers, such as n-butane and isobutane. If molecules of A are converted into molecules of B at constant and , we have where and . We can dene an extent of reaction which is 0 when the reaction position is entirely to the left (all at A) and 1 when everything is at B. More explicitly, we can write

where is a constant (there are isomerations going on, but no particles are leaving or coming); for simplicity, we set and . This allows us to write

Since

this occurs at the balance of the chemical potentials

We can further simplify this since we know that , i.e. the free energy difference at one mole is the free energy difference between the two chemical potentials. Thus

Thus, at equilibrium we get

we can write

S ' (i   p' H " e  g S  Fw e '(i   g S H  Fw H H  e 2i  '  g

at constant

and

. The reaction will proceed until equilibrium, i.e. when

qT) hpu3w  a   S ' 2i  H  e g S ' (i  5 e g  H 9

and

as we discussed previously. Since

Rw ` !    E DD p) DD F H DD Q7F H  "   D D p

Consider the equilibrium

  DD F) 5 

gw "Ip DD

S TATISTICAL

MECHANICS

III: G RAND

CANONICAL ENSEMBLE

76

or equivalently

Finally, for an ideal gas it is easy to rewrite the denition of the equilibrium constant in terms of concentrations instead of pressures . Since at constant we can write , we get

5.2.4 Generalization to all chemical reactions We can generalize these results to more complicated reactions. For the reaction

We can still dene , although it is a bit more complicated. It is easier to write its derivative

Since

, we obtain

With this, we can write the equilibrium constants

where represents the stoichiometric coefcients , , etc. The get the signs straight, the products are dened as positive and for the reactants are dened as negative. Thus

 a  e F'3w qsCTT) vp u m w S  a v  sC}) I dv ' (i  ) x w   a 0 p'Vw CT) I  Fw e   g 1  a  sCT) vFVw

w 

c P c

  ) 

P 

P H 

b ca

H 

we would have

From this we see that the direction of the reaction will depend on the free energy. If more products will be formed and .

 ) zv3  

7

y c w

@@@ BB`w 2

6`#7

2 yic c w

i H 6` e g H e Fu|

f P w

 

@@@ BB`w 2

 )

2 h9P P w

 4

We call the value of

  1) "

at equilibrium the equilibrium constant of the reaction, ie..

, then

for the

S TATISTICAL

MECHANICS

III: G RAND

CANONICAL ENSEMBLE

77

then we can try to express equation of state to substitute

in terms of

. To do so, we rst write :

s w

Again, we can write

in terms of concentrations to calculate

H e Fu|
w
n

i  i  w w g w x g e  7 1) %3  e

and thus at equilibrium

: If we dene

and use the ideal gas

S TATISTICAL

MECHANICS

III: G RAND

CANONICAL ENSEMBLE

78

5.3 Problems
1. Thermodynamics of gel absorption Recently, the Tanaka group at MIT created a gel which absorbs Pyranine. They found that the concentration of Pyranine molecules absorbed followed a power law relationship with the concentration of absorber molecules, i.e. where and are constants. In this problem, we will construct a simple model for this process. where First, lets say that absorption is governed by the chemical formula stands for Pyranine, for absorbers, and for the bound state (i.e. absorbed Pyranine). Note that it takes absorbers to absorb a Pyranine molecule. (a) Assuming constant and , write down in terms of the the chemical potentials and derivative of the numbers of each of the three species above ( , , and ). (b) We can think about the extent of the reaction in terms of the amount of each species and the stoichiometric coefcient . Write , , each in terms of (and where relevant). (c) What is at equilibrium? in terms of the chemical potentials (d) Use your answers to (a) and (b) above to derive and . (e) Using your answers from the previous parts, the concentration dependence of the chemical potentials ( , where is the concentration of component ), and the definition of the standard state Gibbs free energy in terms of and the stoichiometric coefcients, write in terms of and , , , and . and , , , and the (f) Using your answer to (c), derive the relationship between concentrations at equilibrium. (g) Using your previous result, write the concentration of the bound state ( ) in terms of that of the absorber ( ), the concentration of Pyranine ( ), , and . Use your result to nd out the values of and in the power law equation in the top of this problem.

PsFw 9 v 9 a q w
w "

sc aHb 9 v 9 a q wzy w e
) 

 

e 

)   # pu%w `

 

A DDING

QUANTUM EFFECTS

79

6 Adding quantum effects


6.1 Quantum statistical mechanics in the microcanonical ensemble
Particles in the universe are either Fermions or Bosons. Fermions follow Fermi-Dirac statistics, which states that two fermions cannot be in the same quantum state. Bosons, on the other hand, can be in the same quantum state. As we will see later in the course, it is in many ways simpler to treat Fermi-Dirac and BoseEinstein distributions in the grand canonical ensemble. However, it is not so hard to do these calculations in the microcanonical ensemble, and it makes a nice example. Also, with our work on the two level system, weve already calculated the Fermi-Dirac distribution, without knowing it. 6.1.1 Approach common to Bose and Fermi statistics Lets say that we have energy levels and that is the occupation of energy level . We group neighboring energy levels into cells in order to facilitate our calculations. Each cell has energy levels out of which are occupied. Since these energy levels are all close in energy, we can make the approximation that the energy of all the levels in a given cell are identical. Our approach for calculating the statistical mechanics of Bosons and Fermions will be to calculate the 1. energy 2. density of states 3. entropy 4. free energy

6.1.2 Fermions Since each quantum state can be populated by at most one particle, we must consider that out of quantum states, of them will be occupied. Thus, for a given number of energy levels , we would say that the number of ways of occupying of these levels is (6.1) where . Thus for the whole system

(6.2)

and

(6.3)

  gH zF  

 H e

 gH zF   

 H

 

 j

 

5. chemical potential of cell

~  

H %45 ~bp vb   vv  

7
0

 "

A DDING

QUANTUM EFFECTS

80

Using Stirlings approximation, we nd the entropy

and the free energy

Finally, we calculate the chemical potential of the th cell to be

In equilibrium, all the chemical potentials for all the different cells must be equal. In this we set and interpret the occupation in the corresponding formulae as the mean occupation values in equilibrium, and thus we write

This function is known as the Fermi-Dirac distribution, which we derived earlier in the context of the probability of nding a microstate with energy in two level system:

6.1.3 Bosons

In the example above, there are subcells, separated by partitions. Out of the total occupied levels, we need to know how many ways are there of rearranging these occupied sites among the different subcells. Another way of thinking of this is to consider the number of permutations of the particles plus the partitions. This is given by

(6.9)

and

 v

and thus

(6.10) (6.11)

  vv

Bosons can occupy the same quantum state. Thus, we need to calculate a new cell to have subcells with partitions, as follows

H v

 6 l H E  6 yw l H E   l H E  n 6 Qw l H E 

l w%C57  ) l  hvc  l l w  ) u%hHy) 57

w  % )

H  k) 576

@ @@@@ Trr@

Gathering terms up, we get

tsq H Ep H zeH F gH p E~H  q        sl


w  h gH zpw  l H p'H H  w l l H

ys H Ep H zeH F gH p q vbp7         q


H j 

(6.4)

7
v      l H

%4e H

(6.5)

(6.6) (6.7)

(6.8)

. Picture the th

A DDING

QUANTUM EFFECTS

81

Using Stirlings approximation, we nd the entropy

and the free energy

Finally, we calculate the chemical potential of the th cell to be

Gathering terms up, we get (assuming

6.1.4 Comparing Bose, Fermi, and Boltzmann gases

From the calculations above, we see that the distribution functions for Bose and Fermi gases can be simply written In the high temperature, classical regime, quantum mechanics becomes irrelevant since there are many more energy levels than occupied sites and thus the difference between Fermions and Bosons becomes irrelevant (the different is only relevant at low temperatures, when the energy levels try to ll up to the lowest energy possible). . This leads to the equation In this case, is very negative and thus

which we derived earlier in the case of an ideal gas (with the addition of the chemical potential).

6.2 Applications of Bose and Fermi statistics


6.2.1 Bose-Einstein distribution Derivation of distribution function Consider a system of non-interacting bosons. A natural way to calculate the thermodynamic properties of this system would be to calculate the canonical partition function

where and are the occupation number and energy of energy level , respectively. However, in the canonical case, we have to do our calculation in the regime where the total number of particles is xed, thus leading to a delta function above.

ys F I l   H

s F gI l   H

H z$F 

 ) H  %h) 57 %hwQ) H7 e )   )  (%h9H76

1 q

H Ep  l

P eH vz w  w {

i 5 

 H  XFgI w

H zeI Qw  H 

l l

H zeI w  H  l g l l

H I3) H) 57   l q  `

H zp H 

XXXyS S H | C C

H z$F w  

 ) H  &3h3) 97

H zp w   l l

H zi~ H  

H E

 q
t

q vvF  
  4e H

(6.12)

(6.13) (6.14)

(6.15)

(6.16)

(6.17)

(6.18)

A DDING

QUANTUM EFFECTS

82

1 0.8 0.6 0.4 0.2 0 -1 -0.5 0 0.5 1

5 4 3 2 1 0 -1 -0.5 0 0.5 1

Figure 6.1: Fermi distribution vs energy Figure 6.2: Negative derivative of the fermi dis. This is a fairly low temperature example tribution vs energy . Temperature as ( ). in the gure on the left.

To avoid complications related to calculations involving this delta function, we will instead work in the grand canonical ensemble, and calculate the grand partition function instead:

Since the particles are non-interacting, we can factor the above to get

where we have used the geometric series number is therefore given by

for

Bose-Einstein condensation At low temperatures, something interesting occurs: in this case, the system wants to minimize its energy, and all of the particles go into the ground state. Thus, (i.e. the free energy per particle is just the energy of the ground state energy level), and the excited energy levels are unoccupied and the number of particles in the ground state diverges. This phenomenon is called Bose-Einstein condensation. Superuidity (where the viscosity goes to zero at low temperatures, due to quantum effects) is also a related phenomena. 6.2.2 Fermi-Dirac distribution Derivation of distribution function Consider a gas of noninteracting fermions. Unlike Bosons, only a single Fermion can be in a given quantum state. This fundamentally changes the nature of

hH 5 

. The average occupation

l H s yH 5 q   hH l   p  l l l   H ) C C st 9l q 6H l yshH 5 hH q I C H    1 0 H }6 5 C 2 Ets|v H |v q H7    

q)C5

 

) H b

(6.19)

(6.20)

(6.21)

x@9

A DDING

QUANTUM EFFECTS

83

the distribution functions and the statistical mechanics of this gas. Since the gas is non-interacting, we can calculate the grand partition function in terms of single particle grand partition functions . For a given energy level , there are only two possibilities ( ), and thus

which thus leads to a grand partition function of the form

The average occupation number is therefore given by

which is called the Fermi distribution.

Electrons in metals Electrons are fermions. Thus, we can use the Fermi distribution to learn about the statistical mechanics of electrons (in the approximation in which the electrons are non-interacting). Lets rst take a look at the Fermi distribution (see Fig. 6.1). It is a step function, whose step and the width of the step is on the scale . Since the fermi function is reaches half way when a step function, its derivative will be sharply peaked (see Fig. 6.2). To calculate the energy of our system, we nd

where is the number of orbitals at an energy level of and the factor of 2 comes from two spin states (since electrons are spin 1/2). We could calculate the number of orbitals for a given system, but we can make some important observations without including those details (and thus which have a more general formulation). In particular, we can calculate the integral above via integration by parts to get

where we dene

Since is sharply peaked, we can approximate it by a Gaussian. To do this, we Taylor expand around its mean :

i.e.

is well approximated by a Gaussian of width

yshH 9H q w   l E 
and mean

4 v  %   H   h5sw 2 2 h5  H  l F5 i  H

w  sthH 5 q 6   hH9     5I

  i P i 9o& v i " !5~ t H  ~ C5 c  1 0

ssthH 5H q w l q  ys 9 q 6 C
 9o&  v {9 qC5oq& D  l l h H  h H  bF

Ftv p

) u H

  o&

(6.22)

(6.23)

(6.24)

(6.25)

(6.26)

(6.27)

(6.28) .

A DDING

QUANTUM EFFECTS

84 , giving us a lowest order term that is linear to quadratic order and as a Gaussian) and

(6.29) where the s and s are material-specic constants. The terms linear (and actually all odd terms) in vanish in the Gaussian integral, leaving us only the quadratic and higher order even terms. From the above, we can calculate the heat capacity (6.30)

i.e., we nd that the heat capacity scales linearly with the temperature at low temperature. This is radically different from the classical value that the heat capacity is independent of temperature.

6.3 The need for a quantum mechanical treatment in polyatomic gases


There are limitations to the applicability of classical statistical mechanics. These limitations become especially dramatic at low temperatures, where one must include quantum mechanical effects. The simplest example of this break down is provided by looking at the heat capacity of a polyatomic gas. The Hamiltonian for a single molecule consisting of atoms is

where the potential energy constants all the information about the molecular bonds. For simplicity, we have assumed that all of the atoms have the same mass (although one can use this formalism for atoms with different masses simply by rescaling the coordinates by where is the mass of the th atom). Ignoring the interactions between molecules, the partition function of a dilute gas is

Covalent bonds (which hold the molecule together) are of course quite strong. Thus, the molecule typically has a well dened shape and simply goes through small uctuations (deformations) from its equilibrium conguration. The contribution of these deformations to the single particle partition function can be calculated as follows:

 H h9 c P  E`w 2 c P     q  c     ~ i w i 5s 2 %vq)hH5~H 6 2 hH9 iwhH5oPI vt5 c 


(6.33)

2. The energy cost of small deformations about equilibrium is obtained by setting and making an expansion of for small :

8 w U

S S Sy S S 8 8   lv w  2 

1. First, one calculates the equilibrium positions

by minimizing the potential

U 1D BBAu IH  C @ @ @   1

v   @ @ @   CD BBAu kw 2 C

C @ @ @  U BBA 

i 2

c C5

v U H o l o o C

 H  g

 5 ~


We know that we can also Taylor expand in in energy. We can now do this integral (approximating we nd

(6.31)

 5

u   '~  

(6.32)

A DDING

QUANTUM EFFECTS

85

where the roman indices denotes the atom label and the greek indices denote the dimensional component (i.e. x,y, or z). The rst derivatives are absent in the equation above since the expansion is around a stable equilibrium conguration (otherwise rst derivatives would indicate some net force driving the system toward some particular point other than our equilibrium point). Also, the matrix of second derivatives is positive denite (only has non-negative eigenvalues).

It is more natural to work in the Fourier transform space and change variables from the original deformations to the corresponding Fourier amplitudes of the eigenmodes. One can show that this transformation is unitary (i.e. preserving the length of vectors) and the quadratic part of the resulting Hamiltonian is

(6.34)

The average energy can be therefore calculated from the expectation value of this Hamiltonian. From the equipartition theorem, we know that each degree of freedom classically contributes a factor of to the energy where is the number of such modes with non-zero . This makes sense since only modes with a nite (non-zero) stiffness can store potential energy. 4. Finally, we can appeal to symmetry arguments to help simplify the problem. In particular, we nd examine , no energy is stored in (a) Translational symmetry: Since the center of mass coordinate. Alternatively, we know that there is no bond which binds the center of mass to a particular location. Thus, the eigenvalues corresponding to x,y, and z are zero. (b) Rotational symmetry: There is also no potential energy associated with rotations of the molecule. The number of rotational modes must satisfy , where is the dimensionality of the system (well say for the rest of this discussion). Also, depends on the shape of the molecule. For example, a rod-shaped molecule has , as a rotation parallel to its axis does not lead to a new location. eigenvectors of the matrix have non-zero stifness and correspond The remaining to the vibrational normal modes. Thus, the energy per particle is

(6.35)

(6.36)

The corresponding classical heat capacities and are temperature independent (as we found for the ideal gas or the set of non-interacting springs). is easily measured in an adiabatic process. Values of , based on the above The ratio arguments, are listed below for a number of different molecules.

v) 

3. The normal modes of the molecule are obtained by diagonalizing the matrix The stiffness of each mode is the corresponding eigenvalue.

S S )   2 

v l p v g
H l

H   w

  w

@@    @@@  DC BBF@u 5DC BBAh 

e f

w y8

w2 8 w v w v w w  1 # w2 0 Co

v) 

H  `

w 

H  

hqu

v h  qu w

 4u

 C

@@@ BBA l
l

H `

8 w #

v)
m
)

e f y

A DDING

QUANTUM EFFECTS

86

Measurements of the heat capacity of dilute gases do not agree with the above predictions. For example, the prediction for a diatomic gas, such as oxygen, is only observed at temperature higher than a few thousand degrees Kelvin. At room temperatures, a lower value of is observed, while at even lower temperatures of around 10K, is further reduced to . This low temperature behavior is similar to a monatomix gas and suggests that no energy is stored in the rotational and vibrational degrees of freedom. These observations can be explained by including quantum mechanical effects, as we will discuss next.

6.4 A diatomic molecule in more detail


Lets consider a diatomic molecule in more detail. We could write the partition function as

but its likely more familiar to write in a more simple (albeit less general) fashion. We start with the Hamiltonian:

We can rewrite this Hamiltonian using a change of variables. Instead of position and momenta for each particle (2 particles times 3 degrees of freedom for each momenta and position leads to degrees of freedom), lets dene 12 new degrees of freedom 1. center of mass momenta 2. center of mass position

where

is the moment of inertia.

i 2 Fi w 2 2

lv

This allows us to rewrite the Hamiltonian as

2

w 2 l #

w z2G

6. the angular momenta

and

) #

t|

 &

5. momenta of the distance between the particles

4. the angles and

of rotation

3. distance between the particles

v)

h ~

t w ~t v v h

v ) h

U 1D BBAu srH  C @ @ @   1

1 2 3 3 4

0 2 2 3 3

v 2 2 Iu ~ l H

i 2 

v U H o l o o C

w 2 w  2  2

4# q)I2H h mu zo v h  G w v w  q) 2 $ vG

class example Monatomic He Diatomic O or CO Linear triatomic O-C-O Planar triatomic H O H Non-planar NH

) 1 l ) h l l ) 1  ) h ) 1 e f |

v)  v h ) l v) h v ) )

v) 1 l v)  v) e ) f

v) 1g)

'~  

vl

A DDING

QUANTUM EFFECTS

87

We see that several of these degrees of freedom decouple and we can rewrite this Hamiltian in terms of a sum of non-interacting Hamiltonians:

We can now tackle each independently, i.e.:

The translational Partition function is just the ideal gas, which we did before. Now, well tackle the vibration and rotation terms. 6.4.1 Vibrational modes

(6.39)

where we have used the series

l l l  H )  C C s H q 6H l q C s v q rms Rv ) m R H q 6 C s m R H 6 r) m R H s sv v w p m R H q C }) m R H q `oys1) l z m R H q 6 C

@@ BBF@d6v l d5

where this

is a non-negative integer. We can calculate the quantum partition function from

(6.40)

for

v) b

can be thought of as arising from two degrees of freedom (position and momentum) and the energy gain from each (due to the equipartition of energy). In quantum mechanics, the allowed values of energy are quantized, such that

 1 )

v l

  m R 9uF 

w  0v

m R

Ai 8

8 H Ai

Ai 8

v })

where

. The corresponding energy stored in this mode

mR m } v m

2 #2

2 l w 2

i 2 m h i }v 2 u}v g v 5 v g 5  g H 0 # 

A diatomic molecule has one vibrational mode, with stiffness oscillations. The classical partition function for this mode is

, where

i 2 pq w 2 2 

is the frequency of

lv

where

xAFF @ 8DD

@ 8DD xAp

@ 8DD xAp

ca b t

w w Xx@Ap8D Ai8Xx@Ap8D FC@ s` D


@ 8D 8 xAp AiI

2

w 2 xAp # l @ 8D

@ 8D @ xAF pC D 5e

v Ai 8
Ai 8

xAF pC @ 8D @ D 2G z

(6.37)

(6.38)

A DDING

QUANTUM EFFECTS

88

2.5 2 1.5 E C 1 0.5 0 0 0.5 1 1.5 2 2.5

1 0.8 0.6 0.4 0.2 0 0 0.5 1 1.5 2 2.5

Figure 6.3: Quantum (solid line) and classical Figure 6.4: Quantum (solid line) and classical (dashed line) vibrational energy vs (both (dashed line) vibrational heat capacity vs in units of ). We see that when , the (in units of ). We see that when , two results agree. the two results agree.

The high temperature limit

coincides with the classical result. We calculate the quantum vibrational energy to be

The rst term is the energy of the ground state of the system. The second term describes the additional energy due to thermal uctuations. The resulting heat capcity

achieves the classical value only at high temperatures (dened by , where is a characteristic temperature based upon the energy scale for quantum-level energies). For low temperature ( ), goes to zero as . Thus, one can think of the heat capacity vs function as switching between classical ( ) to quantum ( ) values when temperature is lowered below . Thus, it is important to know the value of . It turns out that is actually quite high ( K) and thus the classical value is seen only at extremely high temperatures and in fact the quantum value is seen at room temperature.

8 Aqo

m R

8 ) Aq

) m R

8 Aq

 m R 9H 7 H m R 9H7

Aq 8 Ai8  ) 8 H  fAqob97 A8iu  2 sy  l m R 9H7 H q i   m R 9H7 2 m R g Ai8

Aio 8

m R

v oys 9H7H l q F  w v  m R m R  m R

Aq 4 8

mR m lR

Ai$ 8

m R Aq8
Aq bp F 8

Ai 8

Ai 8

Aio 8

8 H Ai

Aio 8

m R

(6.41)

(6.42)

(6.43)

A DDING

QUANTUM EFFECTS

89

6.4.2 Rotational modes From our discussion above, we saw that at around room temperatures, the vibrational modes do not contribute to the heat capacity. This explains the deviation of the of oxygen from its classical value of to . However, at very low temperatures, the heat capacity is further decreased to , i.e. what one would expect for a monatomic gas. Thus, it seems that at these low temperatures, the anomalous heat capacity can be described in terms of a quantum mechanical treatment of the rotational modes. Classically, the Hamiltonian for rotations is

and the average energy is

as we would expect for two degrees of freedom. In quantum mechanics, the allowed degrees of angular momentum are quantized to with integer , and each state has a degeneracy of (along a selected direction, i.e. ). We write the quantum partition function as

where is a characteristic temperature associated with quanta of rotational energy. Unfortunately, the sum above cannot be exactly calculated. We are left with three possible ways to tackle this sum: , many high energy levels are excited. In these high 1. High temperature limit: For energies, the fact that the levels are quantized is irrelevant. Mathematically, this means that we can replace the sum above with an integral

(6.48) (note that we have made the transformation which results in the differential ). We see that at high temperature, where we can safely ignore the fact that the levels are quantized, we get the classical result for the partition (and therefore for all , etc). thermodynamic quantities, ,

w  2 R 2

D 4

R v

 v 5) 2 R rD v $ l l l   w l p v  D H #  w v 1  w  2 R p H 0 6 w v D w@ @ @ H BBA l @@@ BBAd6v l 

f  )D

j i

$ l H 337 3 

}v g 1

l 

i 2 pq

w  'zr

and correspond to the momenta for the and where inertia, and is the angular momentum. We nd the classical partition function has the form

angles or rotation,

i 2 Fi

w 2

wEr  H D

w 2

lv

p 0

H  D F  6 2

r D

 l

w  0

5v

v) v) 1 1
w  D

Fp '

v ) h
 l

(6.44) is the moment of

(6.45)

(6.46)

(6.47)

A DDING

QUANTUM EFFECTS

90

2.5 2 E
C

1 0.8 0.6 0.4 0.2

1.5 1 0.5 0 0 0.5 1 1.5 2 2.5

0.5

1.5

2.5

which leads to the energy

The resulting heat capacity

vanishes at low temperatures.

3. Numerical calculation: Finally, we can get a good picture of the energy and the heat capacity by simply summing up many terms in the series. As we add more terms, we can describe higher and higher temperatures mathematically (just as looking at the rst term above allowed us to look at the lowest temperatures). In the gures above, I plot the energy and the heat capacity calculated from summing the rst 20 terms. In the temperature range shown here, adding additional terms does not change the results. We see that, like in the vibrational case, the heat capacity switches off at low temperatures . Since typical values for are between 1K and 10K, this explains the drop in the heat capacity of oxygen at low temperatures.

 )D H %x 97

 )D q%x

v  v  D ots%)D f|Hh h w l q F 

@@@ w  v  BB`%)D 9Hr

  w  v  9H7`fx)D f|H76 h

n j D 2

h v

2. Low temperature limit: For

, the rst few terms dominate the sum, and we get (6.49)

(6.50)

(6.51)

) D

Figure 6.5: Quantum (solid line) and classical (dashed line) rotational energy vs (both in units of ). This result was calcualated by summing up the rst 20 terms in the series. We see that when , the two results agree.

Figure 6.6: Quantum (solid line) and classical (dashed line) rotational heat capacity vs (in units of ). This result was calcualated by summing up the rst 20 terms in the series. We see that when , the two results agree.

D F p

'

D H

A DDING

QUANTUM EFFECTS

91

6.5 Problems
1. Ortho- vs para-hydrogen Ortho- and para-hydrogen differ in their rotational energies:

Also assume that (1) Boltzmann statistics is valid; (2) the two forms differ only in their rotational energy and the nature of the rotational state does not affect the other properties (vibration, translation, etc). (a) Calculate the equilibrium ratio of ortho to para hydrogen at high temperatures (i.e. in the classical limit). (b) Calculate the equilibrium ratio of ortho to para hydrogen at very low temperatures.

@BBAd6 h l @@ @@ BB@A 6v"

  l l

w w

 pRv  pR

CpD b

I NTERACTING

PARTICLES

92

7 Interacting particles
Non-interacting systems are often studied in statistical mechanics classes since they are easy to solve (since their partition functions can be written in terms of products of single particle partition functions). However, the real world is full of systems which interact and much of the interesting phenomena around us explicitly results from the interaction of molecules. To begin our understanding of interaction, it makes sense to start with a system in which the interaction can be physically gradually added in. Gases are a good example of this: as we increase the density, we should get more interaction. At very low densities, gases should be ideal. We can ask the physical question: as we increase density, how do interactions change the thermodynamic of our system of interest? We will explore this question in the next sections.

7.1 A simple example of interaction: van der Waals uid revisited


How should we include interactions into the equation of state? We did so before with the van der Waals uid. Lets take another approach to the derivation, which should allow us a more general generalization of these ideas. 7.1.1 Free energy of a van der Waals uid What do we need to do to calculate the free energy of a vdW uid? We need to add some sort of interactions between particles and decrease the entropy to account for the fact that two particles cannot be at the same place at the same time. For the interaction between particles, we would say that the energy of interactions per unit volume is simply proportional to the density of particles in that volume squared (for two body interactions, the likelihood goes as the probability squared)

This is along the lines of the argument I made in the rst lecture: the interaction energy per unit volume goes like the density ( ) squared since the density is like the probability of nding a single particle. Thus, the total energy is this energy per unit volume times the volume. Here is the strength of the interaction: for attraction between particles, . Next, we need to modify the entropy. Before, we said that each particle can be in places. Now, we will say that each particle can only be in places. Thus, the entropy is

and thus the free energy is

We can again use this to calculate the pressure:

X )

 j #  FuH P H %%45 H H Q4 2

2  Q4 H w P H 2

& P H u

n j F " H Q4 qd|g4 ) H 

& P H 2   

H !

&

(7.1)

(7.2)

(7.3)

(7.4)

I NTERACTING

PARTICLES

93

and thus we get (7.5)

i.e. the equation of state for a van der Waals uid, as we expected.

7.1.2 Going beyond the vdw approximation: a virial expansion Lets move around some terms to rewrite the pressure in terms of a series expansion in density. Well keep terms just to order for now. We nd:

The vdW equation considers only two-body interaction, which of course is an approximation. Just as we said that two body interaction can be mathematically expressed as a term of the form , we can consider the more general form of body interaction by adding in terms which correspond to higher order interaction as higher order powers of :

This series excpansion of for small is called a virial expansion and the coefcients are called the virial coefcients and depend on the nature of the system and may depend on temperature and other aspects of the systems environment. One can think of the expansion above also as a Taylor series of the pressure for small density. Equivalently, this relationship can also be derived from taking a Talor expansion of the free energy in density and then calculating . Either way, the idea here is that we are taking an approach where we consider that the density is small, which should be appropriate for gases. As the density gets larger, then we have to consider higher order terms in . This especially becomes important when we want to describe how a gas condenses into a liquid, which drives the volume up considerably. 7.1.3 Simple example of a gas-liquid transition A simple way to model a gas to liquid transition is to examine the case in which we consider the rst two virial coefcients. We will say that the two body term has the form

and the three body term is purely repulsive

2  ) 9v0 X& 2

l e 3gH   & 

ys

s P H  & 9&7~z5w l q e

@BB@`w o & @

BBw o & @@@

q q 2 yx p & P w
l h n

& P H  w & & P 2 d7 v"5 2 37 H

w 2 C l &

l yw 2 y%& 2 yw & w

o v

&

o X

&

j c 5 2

w & 2 ytp'H w 

3 "dQ4  H 

&

& e

l  & dH  

 ) H ! 5

q2


2 &

 1) 2 3 P w !7 

(7.6)

(7.7)

(7.8)

(7.9)

(7.10)

I NTERACTING

PARTICLES

94

This leads to free energy of the form

We associate the equlibrium state with free energy minima. The equilibrium volume is given by

i.e. at equilibrium, the total pressure (including the balancing pressure from the container or outside) should sum to zero. For , this occurs when , which is the gas phase. For , the two body attraction term is negative (since in this case). This means that there will be some free energetic bias towards forming a liquid. We see that in this case, the equilibrium density is given by

In this simple model, as we lower temperature, the density continues to increase, up to the value of .

7.2 Deriving virial coefcients from rst principles: cluster expansions


7.2.1 Derivation of the second virial coefcient In the previous example, we have shown in principal how a virial expansion is used. However, where do we get the values for and ? To calculate these, we must have some means to connect virial coefcients to the nature of interactions. This is of course, specied by the Hamiltonian. Lets consider the most general Hamiltonian which involves direct two body interaction:

denotes the pairwise interaction between particles (eg van der Waals forces, Coulomb where interaction, hydrogen bonding, etc) and denotes the external potential describing a box of volume . We want to construct some sort of series expansion what small value should we choose? A typical value for would be Lennard Jones (LJ) interactions, which take the form

where is the well depth and the typical interaction length. Since we expect to go to zero at large , but diverge at small , we can choose to expand in small values of , where

Using this form, we can write the partition function in the form

 

U u w v  H  U   " w 2 H o o o U l ys H q

o &


H 

QH 2 &

& w 2 &

j 1

gn

H  H 

Y
n

$ l j H c

$ j c H %~e%  H

"v&

  1 "

n j 0  

j c 5

(7.11)

 

&

"

v U 0

(7.12)

c 4t  

(7.13)

 H 

c )5bh1&

(7.14)

(7.15)

(7.16)

(7.17)

I NTERACTING

PARTICLES

95

Integrating over momenta is still easy (just a Gaussian integral) and we get

Now, we want to write this in terms of

To calculate the pressure, we nd

Since all of the density dependent terms of

Now, lets consider

If we expand we get

Finally, since

where we have introduced . Higher order terms have higher order densities as coefcients. Putting these equations together, we get (7.26)

By rearranging terms, we can write this as a virial expansion:

We recognize the coefcient of as the second virial coefcient have now expressed directly in terms of the nature of interactions between particles .

l  v    o ) ~H

@BB@w  o @

2  l H 2

$ @BB@w v @   o l #

@BB@`w I o @ 

& H

  w 2 l  "gw p 

 

0 

Using the series expansion

 @BB@`  lv 2  w l F   w     @ w  1 o | 0  F!  F   l H s y  H q   @@@ w BB  l H v  l  w l U 4  1 o   0 HU BB`w l @@@ @@ 2 w o BB@ o u o " @@@ BB`w u w l  kw l  u U l  yw  @@ 2 o BB@ o u o 5  F      H ! p   n l u U 5 PF 4 j  w yw  o U o l l t q u n 4 j ts H  H o U U o l l "
: are in , we get more carefully. We can write it in the form depends only on the difference for small , we get

(7.18)

(7.19)

(7.20)

(7.21)

(7.22)

(7.23)

(7.24)

, we can rewrite this integral as (7.25)

&

3 

(7.27)

(7.28) , which we

I NTERACTING

PARTICLES

96

7.2.2 Higher order virial coefcients The method weve used above is a simple way to get to the second virial coefcient, but is not the best way to derive the rest of the virial coefcients. A more general derivation is given in Huang, and is considerably more involved, employing diagrammatic expansions, and I will simply summairze the results of the derivation. The derivation follows lines similar to that in the previous section, although it is typically done in the grand canonical ensemble. To facilitate calculations, the terms in can be expressed diagrammatically. Each point reprsents a integral and each line represents a . Thus, for example, we can write

Furthermore, we can express products of these integrals as unconnected graphs:

Diagrammatically, we can express as the sum of all graphs. Since there are different ways of labeling a given -point graph, one must include factors to account for this permutation entropy. The way in which this is calculated is discussed in more detail in Huang. Next, one derives (or simply applies) a very fundamental result in statistical mechanics the linked cluster theorem which states that the log of the sum over all graphs (i.e. ) equals the sum over all connected graphs. This theorem is related to the connection between cumulants and moments: recall that the characteristic function (analogous to ) is written in terms of a moment expansion, whereas the log of the characteristic function (analogous to ) is written in terms of a cumulant expansion. Moreover, when we used diagrammatic methods to describe the relationship between moments and cumulants, we found that the cumulants were the connected diagrams. This fundamental result is also very deeply related to Wicks theorem in quantum eld theory, as that theorem also relates a series expansion to its log. This yields the free energy in terms of a diagrammatic expansion of connected diagrams. To calculate the equation of state, one nds that only the one-particle irreducible graphs 1 survive. Thus the virial expansion can be written diagrammatically in terms of 1-particle irreducible graphs, and the th virial coefcient can be expressed as

For example, one nds the third virial coefcient to be

1 These are graphs which do not break up into separate pieces if any one of the particles is removed.

 o9

2 2 H 2  Iq9o  q  9o o o

  9 PC DH a w b w w w a

HH HH H H 2 H H 2

o 2 o o 2 o
l 

2 Ho 2   2  i9o Iq  o o o o

 o t o o o2 2 o o o o
 j

(7.29)

 H l H 

o
F

(7.30)

H X

(7.31)

(7.32)

I NTERACTING

PARTICLES

97

7.3 The second virial coefcient and van der Waals equation
7.3.1 Calculation Lets study the second virial coefcient for a typical gas. We have to choose some sort of interaction. While it is natural to choose LJ interactions, it is harder to calculate with that mathematical form. Instead, we choose a form (called the Sutherland potential) for which is very similar and has the same essential qualities

This function has the hard core repulsion at and a weak, short range attraction for long . We can then calculate the second virial coefcient to be

where we have used the fact that the integration is sphereically symmetric to transform . The second integrand can be approximated by in the high temperature (small ) limit. Thus, we get

We can dene the excluded volume , which is 8 times the atomic volume minimum approach distance is twice the atomic radius), to get

7.3.2 Interpretation

The second virial coefcient has dimensions of volume (and is proportional to the atomic volume). Without the second virial coefcient, we would simply recover ideal gas behavior. When does the second virial coefcient become important to consider? This can be estimated by comparing the ratio of the rst two terms of the virial expansion:

This ratio is about for air at room temperature and pressure. Thus, the corrections provided by the second virial coefcient are small at these low densities. By dimensional arguments, we would expect that this effect would be similar for higher order virial coefcients, whose corresponding ratio

n 3 l j v 2 H i h QQ ) o i h v r QQQ i h H hn w h H lv H l  hH  l o n o 0 v o g j1 2 r hw 2 r H 0 l H 2 1 l  ) xRw    ) h97   d o v H F l I  ) h97f 2 r w l H   q  ks l Ix|H7 2 q r { lv H s I  H q  9H7 o l H  uu  ) H  

g v a 1db!uF v

 

!FqPDxFy a w a w v 1vfx v

(7.33)

)  2

&) 2

& 2 & 2

(7.34)

(7.35)

(since the

(7.36)

(7.37)

I NTERACTING

PARTICLES

98

would scale like . Moreover, this means that the virial expansion would breakdown at high densities, once this ratio becomes non-negligable. Similarly, the virial expansion would also breakdown at low temperatures. This can be dramatically seen in our formulae for the second virial coefcients above, which diverge in the low temperature limit. This reects the physical fact that the gas wants to be come a liquid at low temperatures, and thus this makes higher order interactions also important. 7.3.3 vdW equation of state The second virial coefcient is the simplest way to introduce pairwise interactions. How does it compare with the vdw equation? The truncated (to second order) virial expansion

Moreover, for small , we can write

which is the van der Waals equation

with the phenomenological parameters now determined directly from the nature of our Hamlitonian and . It may seem strange that the phenomenological volume parameter turns out to be one half the exlcluded volume. This occurs since is truly a joint excluded volume, i.e. the volume avalable to each particle when a pair of particles are interacting.

7.4 Understanding the vdW equation of state


The vdW equation has some important and interesting behavior. Before going forward, lets take a look at it in more detail. 7.4.1 The critical point and the universal equation of state We can dene the critical point from the equation of state as the point where the rst and second derivatives of with respect to vanish:

v 4 n  j v x " H w 1 i 1 2 i q 0 0 v v & l i) QH5 ) i gH v 2 w )& i l q & l l  1PQH ) `1yw P l &t v 2 w )& i qVk

l  ~v H & " H P ~ H l H & 2  z 5 w & P % H 3 l

i v & w i

o v

& P 2 QH

can be rearranged as

and thus we get (7.40)

2 &

l j v H i

cQH& )

v v

w & e

2 &  2

&

&

&   ) &

(7.38) (7.39)

(7.41)

v ) i m

v ) i qP

(7.42)

I NTERACTING

PARTICLES

99

The solution to these equations is

Since the values of and depend on the microscopic nature of the interactions between particles, so do , , and . However, one can rewrite the van der Waals equation using , , to get (7.44)

This equation is independent of microscopic detail. Thus, while we know that the microscopic interactions are crucial for dening the particular values for the critial parameters ( , , and ), the actual nature of the equation of state can simply be written in a form which is independent of these details. This is the a good example of universal behavior: where the chemical details dene certain pressure, volume, temperature, length, energy, etc scales, but after rescaling these system-dependent parameters, one gets a system-independent equation of state. However, for quantitative treatments, the vdW equation is typically not suitable higher order virial coefcients must be considered. 7.4.2 Breakdown of the van der Waals equation One physical aspect of gases is that (at xed volume) if one decreases the volume, the pressure should increase. Mathematically, we must have

(7.45)

Lets consider isotherms of a vdW uid. Above the critial temperature ( ), always holds. At the critical point, weve said that . When we think about this physically (i.e. changing the volume doesnt affect the pressure), it might seem a little strange, and it is just one , we see of the unusual aspects of critical points, which we will talk about later. However, for that vdW isotherms can have regions with . This is clearly unphysical, and thus we need to better interpret these regions. Do do this, lets look at the free energy vs density. For , there is a single free energy minima at , which corresponds to the gas phase. For , we see two free energy minima: one at and another at some , which corresponds to the liquid phase. The relative probability of nding liquid vs gas depends on the relative free energy: (7.46)

Another way to do this is to follow the Maxwell construction. Maxwell suggested that this region is a region of phase coexistence: a mixture of gas and liquid. To understand this phase coexistence, we can calculate the chemical potential of each of the phases. To do this, one calculates the area under the isotherm. This area of course is which is the free energy of that phase. Thus, the Maxwell construction is directly related to the free energy method described above.

 ) r

  )

 

 

 3 )

5  )

2  h H

p P h )4 2 P

 

h l

H 97

v i

6 XC89A 8

  )

 

  h p

& )

7

(7.43)

l  &   s P ) $fbex 1X4  )  $  r & (u&

I NTERACTING

PARTICLES

100

7.5 Problems
1. Universality of short range interactions in While different molecules may interact in many different and complicated ways, these differences often dont matter, or lead to trivial differences. To demonstrate this, consider three different pairwise potentials: (a) Attractive square well

(b) Exponential (c) Yukawa

Each has no excluded volume and interacts over the characteristic distance . For each potential, calculate the second virial coefcient in dimensions. Compare your results: how do they differ? What does this tell us about the role of the differences in the potential function used? Hint: the math will likely involve integrals which you can transform into the integral denition of the gamma function. 2. Virial expansion for a polymer globule: Coil to globule transition A polymer is a long chain molecule consisting of many monomers strong along a chain. Consider a virial expansion model for a polymer, where

The second virial coefcient is given by , where is the excluded volume of a monomer in the polymer, is the energy of attraction between monomers ( is positive), and is the temperature. Assume that the third virial coefcient is a constant and is independent of temperature. (a) Find the equilibrium density of the polymer. (b) Plot this equilibrium density vs temperature. (c) How does the density at high temperature compare with that at lower temperature? What happens and why? 3. A different model for the coil to globule transition Now consider a different type of model for the coil to globule transition, with the virial expansion carried out another term

o j

l  ) % H f9 dv  2 )  2

Now assume that form

and

are positive constants and that varies with temperature with the . The constants and are as above.

l  ) % H 1 dv  )

o &

H    ) 9r H  ) r6D  H H u uH  

2 C&& 2 f" o w

yw 2 y%& 2 f & w

dimensions

"

I NTERACTING

PARTICLES

101

(a) Sketch the free energy at high and low temperatures. (b) From the free energy (either from the function above or your sketch) plot the equilibrium density vs temperature for this model. (c) How does the density at high temperature compare with that at lower temperature? What happens and why? How is this different from the problem above?

M EAN

FIELD THEORY OF PHASE TRANSITIONS

102

8 Mean eld theory of phase transitions


8.1 Introduction
8.1.1 What is a phase transition? Recall, that a phase is a particular arrangement of matter. The constituent atoms and molecules are often the same in phases, and only their arrangement differs. However, this difference can have huge implications. For example, gas-phase water, liquid water, and ice all are made of the same water molecules, but their radically different arrangement leads to radically different macroscopic properties. One major thrust of statistical mechanics is the connection between the microscopic interactions of these particles and the corresponding macroscopic properties. A phase transition occurs when the molecules rearrange themselves into a new phase for example water freezing into ice. Typically, these rearrangements are induced by a change in external conditions, such as temperature, pressure, pH, etc. In order to monitor a phase transition, one typically denes/chooses some quantity, called an order parameter, which can discriminate phases. For example, the density serves as a good order parameter for the gas-liquid transition, since it changes signicantly in between gas to liquid. However, since the density changes only slightly in the liquidsolid transition, it often does not serve as a good order parameter in that case. Phase transitions can be broken down into two categories. In some cases, called rst order phase transitions, the order parameter changes discretely between phases, i.e. makes a jump. We saw this in the vdW equation for : there was a jump in density from gas to liquid. First order phase transitions are usually associated with a system which has two free energy minima and the system jumps from one to the other once their free energies cross (when the external conditions are changed). A continuous phase transition (sometimes called second order phase transitions) occurs with no jump, but rather a smooth change in the order parameter (such as the case in the vdW equation). In this case, there is typically only one free energy minima which moves when the external conditions are changed. One should keep in mind that there are many different types of phases, not just the gas, liquid, solid that we are most familiar with. Protein folding is a rst order phase transition, for example, where the protein goes from a unstructured phase to a phase in which it is folded into a particular arrangement. 8.1.2 What is mean eld theory? In order to desribe phase transitions, it is natural to associate phases with free energy minima. If we take the phase to be described by the particular value of the order parameter which minimizes the free energy, then we are making an approximation, since clearly close values of the order paramter will also have (almost as) low free energy. This approximation is tantamount to saying that we can ignore these other uctuations in the order parameter and say that the probability distribution of the order parameter is a delta function centered at the bottom of the free energy minima. This is called a mean eld theory, since we are assuming that the typical value of the order parameter is the only relevant quantity. Later, we will investigate more sophisticated theories which include uctuations and correlations.

M EAN

FIELD THEORY OF PHASE TRANSITIONS

103

8.2 Polymers
Polymers have become an important material in chemistry, physics, chemical engineering, and biology, due to their broad applicability in materials, industrial applications, as well as many important biological biopolymers (DNA, RNA, and proteins). While there are vast differences between different polymers on a chemical level, there are many properties which are independent of these chemical details, and thus are universal in the sense previously described. Here, we examine some of these properties. 8.2.1 Universal aspects of polymers: persistence length One of the more important universal properties of polymers is the persistence length. Different polymer chemistries vary in their stiffness: for example, DNA is very stiff whereas proteins are fairly exible. To quantify this, one can dene a persistence length as the typical length scale at which the polymer can bend. . If the chain is very To quantify this, lets consider more directly the correlation function exible, then this correlation function vanishes. However, for a stiff chain, this correlation function will not vanish and instead will decay with a characteristic length :

This chacteristic length is called the persistence length, since it describes over what length scale the bond vectors decorrelate. One can imagine taking any polymer, whose normal monomer-monomer distance is and recasting it as a polymer made of effective quasi-monomers separated by a distance . This new polymer will be completely exible. In general, when we talk about the statistical mechanics of polymers, we will assume that this quasimonomer renormalization has been preformed and that we are working with an effective model in which the chain is exible. Chemical differences between polymers only changes the value of (for example for proteins nm and for DNA m), but afterwards is not important. 8.2.2 Random walk The simplest model of a polymer is a random walk. We assume that the polymer does not interact with anything, and can be modeled as the path a diffusing polymer takes. Derivation from the central limit theorem First, it is natural to calculate the probability distribution for the end to end distance of this polymer. To do this, we can assume that each link of the polymer is oriented in a random manner. To simply our thinking, lets consider a polymer constrained to a cubic lattice. Then, each link can be oriented in each dimension, where is the length of a link. In each dimension, we can add up these random variables to get the total end to end vector. We know from the central limit theorem that the probability distribution for a sum of random variables is (i.e. times the variance of a single step). Since we Gaussian, with zero mean and a variance

8 @ 8 l

 ) u 097 H H l P

` 8 @ 8
2 P d

(8.1)

M EAN

FIELD THEORY OF PHASE TRANSITIONS

104

have random sums in each dimension, the end to end distance distribution takes the form (where the number of dimensions)

(8.2)

The probability of the end to end distance

takes the form

(8.3)

where

is the area of a hypersphere of dimension . For

, this becomes

(8.4)

We can calculate properties of the polymer from this distribution. For example, we can calculate the mean squared end to end distance . We can also calculate an entropy

(8.5)

Since we expect that the chain links are uncorrelated, then the correlation function and we get

where

is the distance between links.

since we expect that the are random and have no direction preference and thus can calculate the variance

P 2 ue 2 8 U A2 # @ 8 BB@ @ 8 U u i 8 @ 8 v w 2 8 U 2 8 U g 2 # H A2 # F2 # 8 8

"

8 U 8 U #

where we have taken For example,

as we expect

is very large. From this, we can calculate averages. (8.7) . Next, we

Alternate derivation Another way to think about this is to dene the bond vector where is the position of monomer . We can then write the end to end vector as

8 U " #

DD pbFX FFo DD

where

is a constant and represents the total possible entropy.

(8.6)

(8.8) vanishes, (8.9)

is

2 Pu #

q h}v p v#  2 ) 2 97 P v # H # P P    2 2 2 5 s 2 # q # 5# q P p  2 ) 2 # 97 2 P v H  2 uh}v q P p @ @ @     2 Puv) 2  # H76 2 5 2 h}v 1BB& v#9 #si G v#

F

H u w 

 2 u ) 2 %7 P # H

2 P l H DD po5o spp F bpX b  # DD  #  # 2 #

v

# o P 6 2 h 2 y q 2 uh}v p

ts

2 P u"

2 #

s  #

# s  

M EAN

FIELD THEORY OF PHASE TRANSITIONS

105

8.2.3 Florys theory for a self-avoiding walk In the previous sections, we have been talking about a random walk, which does not interact with anything. For example, it can bump into itself. Of course, this is not physical and real polymers are self-avoiding walks, i.e. their segments have some excluded volume. A simple, elegant, and remarkably quantitative theory for self-avoiding walks was posed by Flory. Florys idea for self avoiding walks is much like van der Waals approach to uids: start with the ideal case (for polymers, a random walk), and then add in interactions. To write the free energy of a self-avoiding walk, we combined the entropy from the random walk

where we have absorbed constants (such as ) into the denition of . Also note that we have set the volume to scale like ; this allows us to solve the problem for arbitrary dimensionality. Since the excluded volume makes the chain repel itself, then we know that has to be positive. From this, Flory calculated the free energy

and found the equilibrium state by nding the free energy minimum

where is an example of a critial exponent for the self-avoiding polymer walk. It describes how the typical size of the polymer scales with the number of monomers . To see the accuracy of Flory theory, lets compare the Flory result for to the exact results. The agreement is striking the Flory result is correct in and within 3% of the result for . While these results are striking, this was in some sense a bit of luck, since one can show that this accuracy is caused by a convienent cancellation of errors. Nevertheless, Flory theories of this form (i.e. start with the random walk entropy and add in interactions via the energy term) are a very common way to describe polymers and can be very accurate.

p 4

3/3=1 3/4 3/5 3/6=1/2

1 3/4 1/2

DG zqEB

spatial 1 2 3 4

dv

d A

m

which leads to

term as it is most relevant at small (where we have ignored the taken from the second virial term

DD po%H

2 P " #

# 5 2

2 u P 2 #

2 u P # l | w "  # 2 # 2

o 2P v # v Vw 2

(8.10) ) with a term for the energy (8.11)

u"~uev &  #

H DD dpo4v  #

H %

# bp #

(8.12)

(8.13) (8.14)

 1 w

 )

M EAN

FIELD THEORY OF PHASE TRANSITIONS

106

Before going on, lets think about what we would expect to be like. Consider a dimensional box. The number of things which can t in that box scales like its volume, i.e. and thus and thus in this case. Thus, it is natural to interpret as the fractal dimensionality of the system. For example, random walks scale like and thus have and thus have a fractal dimensionality of . What does this mean? For random walks in , we expect that it will completely ll the 2D plane. However, random walks in 3D will not. Finally, this means that since the trajectories of random walks have essentially a two dimensional nature, then in , we expect that they will never intersect (or very rarely, much fewer than order times and thus in the thermodynamic limit, their interaction is irrelevant). Thus, in , we would expect that even self-avoiding walks would behave like random walks. To test this, lets look at the Flory result. We nd that for and thus even in four dimensions, self-avoiding walks scale like random walks. However, for other dimensions, we see a strong deviation from the random walk behavior. Certainly in , we expect that a self-avoiding chain would have to be completely stretched out, i.e and thus , which is what Flory theory derives in this case. In and 3, the result is not so glaring, but we still nd that the repulsion due to excluded volume makes the chain swell, and thus have a higher than that of a random walk. 8.2.4 Coil to globule transition When the chain is a random walk, it is much like an ideal gas, as it doesnt interact with itself and simply tries to maximize entropy. When we put in excluded volume, the chain swells. However, if the chain attracts itself, then one would expect that the chain would condense into something equivalent of a liquid, called a polymer globule, where the chain is compact and the size scales like the volume . To predict the coil to globule transition, one can simply write a free energy in the form of a virial expansion where and are the second and third virial coefcients, respectively. Under the solvent conditions where the polymer is self-avoiding (called a good solvent), we expect , and thus the free energy minima occurs at , i.e. the coil state. When the polymer attracts itself (in a so-called poor solvent), then . In this case, we have a free energy minima at

(8.15)

(8.16)

As the density gets greater, the approximation to keep terms only up to the third virial coefcient becomes worse and worse. However, from this simple expression, one can see that there is a gradual transition from coil to globule, i.e. as goes negative, the density gradually decreases. This is different from the gas-liquid transition of the van der Waals equation, which predicts a jump in the density. These two styles of phase transitions (gradual vs a jump) are called rst order phase transitions and second order phase transitions, respectively. Well talk in more detail about what these terms mean later in the course.

v ) l

0 )

v l

H &

2 5

2 &

w & ~"

"&

w 5"

v V

v 7

& )

&

M EAN

FIELD THEORY OF PHASE TRANSITIONS

107

8.3 Ising model


The Ising model is perhaps the hydrogen atom of phase transitions a simple model from which we have learned a great deal. Imagine that we have spins, which can point either up or down, in a magnetic eld of strength . We can write the Hamiltonian in the form

denote the spin value of particle , is the magnetic eld strength (technically the where magnetic eld strength times the magnetic moment of the spin, and therefore has units of energy), and is the strength of interaction between spins and . One can imagine that the spins are sitting on a lattice for simplicity and then

8.3.1 Non-interacting spins

Next, we need to go from this Hamiltonian to a free energy, etc. This is difcult to do for the case of interacting spins ( ), so lets rst consider the non-interacting case ( ). In this case, we have a simple non-interacting, Hamiltonian

and we can write the partition function as the product of single particle partition functions:

The free energy has the simple form

2 

ys 9 q

"

and heat capacity

H 

9 a

Qy 9 1(v q H s v

F4"

entropy

9 a

as well as the energy

4s 9 1Fvh 9H76u 76`fl 9H7 w v  w   

There are actually other generalizations of this Hamiltonian for more complicated glases or protein folding, but thats beyond the scope of this section.

p w v w w a w a a v 9 c PAcb 7u

ys  1vhv q

U lv w l U

l l

H p' %eF H

(8.17)

b ca U

l 7

5e

(8.18) to describe spin

(8.19)

(8.20)

(8.21)

(8.22) (8.23) (8.24)

M EAN

FIELD THEORY OF PHASE TRANSITIONS

108

0 -0.2 <s> 0 2 4 6 8 10 -0.4 E -0.6 -0.8 -1

1 0.8 0.6 0.4 0.2 0 0 2 4 6 8 10

0.4 0.3 0.2 0.1 0 0

1 0.8

S ln 2

0.6 0.4 0.2

10

Finally, we can calculate the average number of spins pointing in the direction of the eld:

This is related to the magnetization of the system:

which is the total number of spins pointing up. Since each of the spins has a magnetic moment, can be thought of as the total magnetic moment the magentic strength of this Ising system. Now, lets use these thermodynamic functions to see whats going on. At high temperature , we expect that the spins would not be aligned with the eld and they would be in some random (entropy maximizing) arrangement. We see that this is indeed the case: (a) no spins are aligned with the eld, shown by and ; and (b) the entropy is at its maximum .

v ) p5

v p45

9 a

 a7"}l

 

"

)

Figure 8.3: Heat capacity per particle verse temperature .

vs in- Figure 8.4: Entropy per particle verse temperature .

) 
10

rl

rl

"

) v

Figure 8.1: Energy per particle temperature .

vs inverse Figure 8.2: Magnetization per particle vs inverse temperature .

"

vs in-

(8.25)

(8.26)

M EAN

FIELD THEORY OF PHASE TRANSITIONS

109

As we lower temperature (increase ), we see that the spins start to align, evidenced by an increase in , a decrease in , and a decrease in . At very low temperature, all the spins are aligned ( ), energy is minimized ( ) and there is no entropy ( ) since there is only one way to align all of the spins. 8.3.2 Mean eld Theory for interacting spins Now, lets add interaction into our system. A simple way to do this is to say that magentic eld generated by the spins, interacts with the spins. In other words, one could say that the effective magentic eld is the external eld plus the magnetization times the interaction of the spins:

Alternatively, one could say that in our original Hamiltonian, the interaction term can take the form

Thus making the new effective Hamiltonian

Thus, we have turned the interacting Hamiltonian into a non-interacting form. The price we have paid lies in the approximation , i.e. we are not considering the actual value of (i.e. including it as a microstate), but simply taking the mean value. These mean values create the effective eld and thus this is an example of a mean eld theory. Mean eld theories work in many cases very well for describing phase transitions, especially away from the critial point. As we will see later, at the critial point, there are large uctuations, and the mean values are not a good approximation. Now, lets solve this new system with the effective eld . The thermodynamics is the same as before, except that we must insert for the old , and then solve the equations self consistently. For example, consider the magnetization (8.30)

To solve this equation, its perhaps simplest to solve this by graphing these two equations. Lets take the case where there is no external eld ( ) for simplicity. If we plot the line vs the curve for different values of , we see that for high temperature (small ), there is only solution (only one place where these curves intersect) at . This is the high temperature, random alignment, zero magnetization phase. As we lower temperature, we nd that we can get two more solutions: one with and another with . The value of will increase as temperature is decreased. These solutions correspond to spins aligning with each other, either up or down. Since there is no external magnetic eld, there is no difference in aligning up or down and thus both solutions must exist.

li

40

 4

l

ts4 

"

lu  )

q a7" i 9 a7e

l ` l

l l

l l

l l

 4

 a7 |3

(8.27)

(8.28)

(8.29)

M EAN

FIELD THEORY OF PHASE TRANSITIONS

110

8.3.3 Mean eld theory near the critial point In the previous section, we found that we could write the magnetization self consistently. At zero eld, we nd the magentization per particle to be

Near the critical point, we know that the magnetization will be small. Thus, we can Taylor expand the right hand side above for small , yielding

(which corresponds to the paramagnetic phase free energy This equation has the trivial solution minima at ) and two other solutions

where we have noted that this is the mean eld (MF) prediction. Mean eld theory is a good approximation for much of the qualitative behavior, but not so good quantitatively. It gets better when the system is inherently less correlated (since the MF approximation puts in for and thus removes some potentially important correlations). In fact, MF theory is exact in large dimensions ( ) and gets progressively worse at lower dimensions. In fact, we will see in the next section that MF theory is qualitatively wrong in one dimension. Also, in quantifying theories near the critical point, one often calculates critical exponents. For example, one can express the magnetization near the critical point in the mean eld solution to the Ising model as

where is one such critical exponent. There are many other critical exponents one can dene. MF theory gives these critical exponents correctly in . However, the lower dimensional results do differ slightly. Finally, often systems which have the same critical exponents have some intrinsic similarity (typically due to shared symmetries of the Hamiltonian). Such systems are said to be in the same universality class since these similarities in symmetries transcend differences in chemical details, leading to these similarities in critial exponents. In fact, there are such similarities between self avoiding walks (simple models of polymers) and the Ising model, due to inherent symmetries, which at rst sight, are hard to nd. A more detailed examination of the Ising model would bring these symmetries out to the surface, but is beyond the scope of this section.

j H 7$

n j H r

2 

corresponding to the ferromagnetic phase for these three minima fuse, which occurs at

o o   l

. We calculate

2 

 | l

) " h H (

9 as

(8.31)

9 h 

H $ )

s q  h

2 

"

9

(8.32)

(8.33) to be the temperature at which (8.34)

v ) l '

rl

(8.35)

M EAN

FIELD THEORY OF PHASE TRANSITIONS

111

8.4 Problems
1. Stretched polymer Write a simple Flory theory for a polymer in three dimensions which is stretched by an external force of constant magnitude . What is the equilibrium end to end distance and how does it depend on temperature? Do not include terms for excluded volume. 2. A polymer melt A melt is a dense polymer mixture such that there is (essentially) no solvent. Consider a melt of consisting of the same type of polymer. In this problem, you will derive, using a Flory theory, where is the end-to-end distance of a given polymer the statistics of a polymer melt. i.e. in a melt. These chains are real chains, i.e. they have excluded volume interactions with the neighboring chains. We want to know how the fact that the chain is in a melt changes the nature of its statistics (vs a random walk), eg as manifested by any changes in its end to end distance . This problem has an interesting and unexpected result, so before writing any math (esp for (b)), think about the nature of the system were examining here. This result was rst derived by Flory himself and is called the Flory theorem. (a) What is the entropy vs the end to end distance of a single, random walk polymer (which would be relevant for a Flory theory derivation)? ? (b) Now consider a particular chain in the melt. What is its energy To help solve this part. Consider the following details/reminders/hints: i. Remember, this is not a single chain problem, in the sense that the particular chain of interest is in a melt. However, to help visualize how to calculate imagine a single chain in a melt and consider how the energy of that chain would change (i.e. how the nature of its interactions with other chains would change) as you change its end to end distance . ii. For simplicity, you can consider that the polymers in the melt are restricted to live on a lattice site in a cubic lattice. iii. The melt is very dense and has no solvent, so each lattice site is occupied with some monomer (from some polymer). iv. The interaction energy between monomers is identical, independent on whether its an intra- or inter-polymer interaction. Hint: This part should not involve a lot of calculation, but might require some thought before you see the answer. (c) From your previous answers, calculate the i. free energy ii. probability distribution iii. typical end to end distance This problem is a good example of how a seemingly simplistic model can be useful in thinking of very real substances.

 # 

 # 

 # vb

2 2 #  # v

5 6

 # s

 # 

M EAN

FIELD THEORY OF PHASE TRANSITIONS

112

3. A polymer under theta conditions Consider a polymer under theta conditions, i.e. conditions where the second virial coefcient is zero. Note that this is the analogous condition for the Boyle point of a gas (i.e. conditions where the gas acts like an ideal gas). (a) Write the free energy. You can ignore the fourth and higher virial coefcients and you should use the polymer entropy we used in class. You should write your answer in terms of the end to end distance . Assume that we are describing a three dimensional system. Dont forget that we are working under theta conditions. (b) Use your free energy above to calculate the equilibrium value of the end to end distance. (c) What is the value of the Flory exponent , where ? How does this compare with the value of for a random-walk polymer and that of an excluded volume polymer (in three dimensions)? (d) Give a possible explanation for the result.

F IELD

THEORETIC METHODS

113

9 Field theoretic methods


9.1 From particles to elds
9.1.1 What are elds? An example of the interaction of charged particles Fields are functions of continous variables. They are often mathematically easier to manipulate than the original discrete quantities. For example, consider the Hamiltonian of charged particles interacting : with each other and an external electric potential

We can dene a charge density eld to be

With this denition, one can rewrite the Hamiltonian to be

We could nally rewrite this in a more general form:

where for Coulomb interaction between particles. Of course, we could choose different values for for other types of interactions between particles. Often, weve taken a mean eld approximation at some point, i.e. to say that the density is some position independent constant . This has some limitations most importantly it doesn not allow for uctuations or for one to analyze how something (eg density) varies in space. Next, we would like to calculate the partition function for the system. In terms of our original Hamiltonian, this would be (9.5) i.e. we want to sum over all of our microstates, which are all the different positions of our particles. Considering that our Hamiltonian involves terms like , an integral over all the different is difcult. Instead, we could write our partition function in terms of elds. In this case, the microstates are all of the different possible values of the eld at each position. Intuitively, one could write this as (9.6)

where we are now integrating over all values of over all possible points in space, and using the Hamiltonian written in terms of . Before performing this calculation, lets look a little more into what this would mean mathematically.

  & # o9# 

   # o& #  #

   @@@   n9H7qt o& BB& 2 o& qu o&

l H o) U nH97  @@ 2 BB@

 

w y 2 # #   2 # o9i# o& 2 # # "  & 

 H 

2 # I# 2 H # # w  2 # o9q# o&  & 

  w

&

$ H $

$ o&  

(9.1)

& # o&  

"

(9.2)

(9.3)

l 2 # X# o) H

"

&

(9.4)

0 2 # # 

F IELD

THEORETIC METHODS

114

9.1.2 Math of elds Functionals as a limit of a product of discrete integrals First, lets dene a functional: a funcis a function and the Hamiltonian is a tional is a function of a function. For example, since function of , then the Hamiltonian is a functional. Just as an integral of over is a integral of over all the values of , a functional integral as the integral of a functional is an integral of over all the values of . If this gets confusing, its simpler to think of a eld in terms of an extension from a discrete set of lattice points. Consider a system which is forced to live on a cubic lattice. We can still think of a density at each lattice point . In this case, the partition function would take the form of

Indeed, we can think of elds in terms of a generalization from discrete to continuous space:

Same point in space? Integral of the eld at a point Integral of the eld over all space

Thus, we can write a functional integral as a product of regular integrals:

Since our Hamiltonian is quadratic in our elds, we will have to do integrals of the form

Thus, the next step is to look to see how one can do Gaussian functional integrals. Gaussian functional integration The only functional integral one can do is of the Gaussian type. Consider a one dimensional Gaussian integral

 1v) 2 2  v  k 76u}v  k w 2 }) 2 9H76

  & n9H7u oi

$    & d3 o9&$ o& 3 H #  oi

Field at a point

  n9H7q o& G

Sum over locations in space

G   r   I  H  3 s d   I

Location in space

tszo& q  

 z

 9H76

  zo&

&

4t

  zo&

  Eo&

&

(9.7)

(9.8)

(9.9)

F IELD

THEORETIC METHODS

115

The simplest form of a Gaussian functional integral is very similar to a one dimensional Gaussian integral:

We can do each individually to get

However, typically one is interested in more complex Gaussian functional integrals, say of the form

These integrals are more similar to multidimensional Gaussian integrals:

where is the determinant of the matrix (i.e. the product of its eigenvalues) and inverse. To go from multidimensional Guassian integrals to Gaussian functional integrals, we make the and . The meaning of this transformation is that we will transformation now label our places in space using continuous variables instead of discrete sites . Otherwise, everything else is the same. If we do this, then we nd the result for the generalized Gaussian functional integration to be

 3 hz   d I d b  b H fh  H fh ik  C lv w crF lv 2 BC 7}5v "ik C w C lv o C

n  d  d h  $   l d h  w   n j   r i 7  I u I H   w )    

  I  

wsv    q k & }i) 3 h  d 3 r$ r|H 3

3 3 sv k p } 3

We see that each of these integrals are independent, i.e.

is the product of

In order to see how this works, lets go back to continuous space:

. We see that

independent integrals (9.13)

v  ) 2 y$  2 k j u}v n    v   k Iw 2 }q)  2 H j i
k w 2 T) 2 p EH v d(  I n
i

q v k w 2 }) 2 H p

i v) 2 2 k U q }v q v g ) 2 2 k p rv

v b lv j 3 3 rF H q 3 P 3 )i

(9.10) (9.11) becomes (9.12)

(9.14) (9.15)

(9.16)

(9.17) is its

(9.18) (9.19)

F IELD

THEORETIC METHODS

116

where is the determinant of the matrix and is its inverse. While it may seem strange to take the determinant or inverse of a function, in the end, one can think of simply as a matrix (in an innite dimensional space). Indeed, we can write an eigenvalue equation (9.20)

are the eigenvectors. Accordiningly, where the set of possible values for are the eigenvalues and one can dene the determinant as the product of these eigenvalues. The inverse is dened also in analogy to matrices: (9.21) Later in this chapter, we will show how Fourier transformations act as rotation matrices for the matrices in order to diagonalize them and thus make the calculate of eigenvalues, determinants, and inverses trivial. Functional differentiation The next step would be to dene how to take derivates of functionals with respect to functions. Mathematically, we can dene a functional derivative as follows. Let be a functional of , i.e. a function of a function. Then the functional derivative of with respect to the function is dened to be (9.22)

This leads to the functional derivative analogs of more familiar derivatives, such as

(9.23)

Operationally, one takes functional derivatives much as one would take the analogous regular derivative. As a warm up to calculations of functional derivates, lets consider derivatives of discrete quantities and their continuous analogs. For example, one can calculate

(9.25) (9.26)

Thus, we see that functional derivatives can be seen in terms of their discrete counter parts.

 H  3 s d

r  

(9.24)

(9.27) (9.28) (9.29)

 d h 

  1 I  C    H  IP3 $  C

 szd5 EI i H  i 

  zIT

  d h 

  3 I C  Ip    r  C  I    I

3
H

 H   s d5    o 3  d 3 h  3   4   4  d 3  3 r  3
EI

 zIT  bp! i F "    i HzdCVw$zIT i

b T ca

 d 7 

3
l

 i EI i

n  C C u

 EI    EI

EI

C C

 7 

3 cb

  EIT

 d h 

F IELD

THEORETIC METHODS

117

Fourier modes An often convenient change of variables is to examine elds in Fourier space instead of real space. This means we make the change of variables

In general, the Fourier component of a real eld is complex. However, for real elds , one can show that the complex conjugate satises the relationship . With these relations, one can show (9.32) (9.33) (9.34)

Finally, Id like to demonstrate how Fourier transformation can diagonalize matrices for systems . If the system is translationally invariant, with translational independence. Consider a matrix that means that all that matters is the distance between particles, not their absolute position. Thus, all that matters is . For completeness, we will also dene which is like the center of mass. We want to calculate the Fourier transform of , which we nd to be (9.35)

Using

Since the system is translationally invariant, integral over trivially

where we have used the identity for the delta function identify as the Fourier transform of

We see that this matrix is diagonalized (due to the delta function) with an eigenvalue , just as we would write a diagonal matrix in a discrete space as . Now that the matrix is diagonal, we read the eigenvalues as the diagonal elements, the determinant as the product of these elements, and the trace as the sum of these elements. Finally, the inverse of a diagonal matrix is just the . Thus, diagonal matrix consisting of inverse of its eigeinvalues .

 $ u$ Fys $ r$  ) l 2 w  s H 4  4 i )   5   $ $  4  s 2 H $ h $  2  2 $ hf$ dis 2 $ I$  w   H  $ f $   #     @  2 w q  #   ` # 9H7 # 2  $ w $ d5s $ f $ @ | # H7| #  2 $ $ di#  s 2 $ I$ q # 97 #  2 $ $  w    H H | # | #  #  s 2 $ $ q |# 9h6Es 2 $ If$ q `# 97j|# #  w @ H  H @ H  #  E# |# q @ 2 $ 9h# s|# q hf$ 97j|# #  2 $ $  s H H s w @ H H w # | # v 3 # | #  d3 h  d3 @ 2 $ 97 h$ 9763 4 2 $ $   H  @ H  #  H d 3  d 3 h  v w i)d 3  # v H q)d 3  #  d3 h      @   $ 9HIP $ I $ 2 fq $ I$ j $ v7x $ 5 2  I    @    $ H  $ I $ q $ r | $ v7x $   5  IPu    H $ IP1$ 9r 2 $ 4 2 qq $ I$ |$ v7x$ v!A e 2 fq I5u    @  $      $ Ir $ r H   I   r    @  q $ I $ v76h$ r   @   P j $ 9H7 1 $ I
and , we nd does not depend on

(9.30) (9.31)

   q I

(9.36) (9.37)

and thus we can take the (9.38) . We

and thus write

(9.39)

F IELD

THEORETIC METHODS

118

9.1.3 Scattering experiments Imagine that we irradiate our sample with a monoenergetic beam of X-rays, neutrons, or light, etc. We will assume that the scattering does not alter the system, but simply the light that we send at the sample gets scattered off at some vector . The intensity of scattered light will take the form

where is the scattering amplitude for scattering from particle . Note that this quantity is a thermal average, hence the angle brackets above. Assuming that all of the particles are identical, we can write the scattering amplitude in terms of a form factor (i.e. the scattering amplitude for a single particle in isolation) and a phase correction: We see the above relation by calculating the relationship between an arbitrary scatter and one at the origin: (9.42)

Making the transformation

where we dene a baseline scattering intensity to be manipulations in order to introduce the density

Thus, we see that the scattering intensity is simply related to the Fourier transform of the densitydensity correlation function . In other words, one can reverse Fourier transform the intensity to measure the density-density correlation function. Thus, scattering experiments will be useful for telling us the degree of correlation in a system, eg. how large are the domains in a system, what are the characteristic length scales, etc.

   # o9&q # o& 2   2 H @ q 2 # o9&q # o& ys # I # j $ H 6 # # H  H  H @  I # d U  r # U qys 2 # I# j$ H q 6 2 # # p H  @ H   ts 2 # I # j $ H q  I # U  H # U 2 # # p  H    d $ o& p 2 f  $ oAPQ  @ q p ys H Q j $ H 6 U U QQ QQQ QQ @ @ Q @ Q s | $ w q 6Bs j $ H q U U 2 s | $ H q U  $ oP p

 

1  

 1

Thus

@ q  @ q P q w  @ q P s j $ H  $ oP5s 3 j $ H d 3  3 s @j $ H `ts 3 j $ H d 3   3 H 0 3 @  P | $ 9H7  5 $  P

, we get

. Finally, we can make some (9.47) (9.48) (9.49)

@  H P j $ Hh 

and the Fourier transforming each side

s j $ H q 61 $ "1 $  P @ P

QQ QQQ 2  $  P


H P  s  5$  P

4 

QQQ U QQ 1 $ 

(9.40)

  P

$
p

(9.41)

(9.43)

 1  P

(9.44) (9.45) (9.46)

F IELD

THEORETIC METHODS

119

9.1.4 Transforming from particles to elds: Hubbard-Stratonovich transformation Calculating the partition function. Another way to tackle the Ising model is to introduce elds. A eld is a continuum version of some set of discrete objects. Just as we dened a charge density in an earlier example, we can dene a spin density, which is also essentially the local magnetization. We can also think of the magnetization as a eld conjugate to the spins, much like a density of how many spins are pointing up. For the Ising Hamiltonin

We get the partition function we wish to calculate:

(which equals if and are neighbors and Note that the factor of has been incorporated into zero otherwise). This calculation is made difcult because of the product of in the exponential. It would be much simpler if we could convert this case to something more familiar, such as spins that dont interact with each other, but do interact with an external magnetic eld (as weve seen in the previous sections). One natural way to make this transformation is to introduce a conjugate eld analogous to the density in the example above. We do so by the Hubbard-Stratonovich transformation, which is actually simply doing a Gaussian integral backwards:

(9.52) This method is useful because it turns interactions between spins (in the original Hamiltonian as ) into interactions between elds ( ) but independent spins ( ). Thus, using this form to calculate the partition function for a set of spins on no magnetic eld, we get (9.53)

where we dene

Now, we can proceed with the partition function by simply summing over the spin possibilities, since the spins are now decoupled. In other words, our system now mathematically looks like the spins are only interacting with the eld. The physics is the same of course, since we have made no approximations at this point. The physics of interaction between the spins is now mathematically handled by the interaction between the elds. This leads to

n 2 5 h}  v crp H j os q b lv i C l w   9 lv iHAs q i 5w 2 " l C  C l l C n  b ik l w lv H fh 2 5 h}v C rF lv H j ` ik l l C lv fh C  C

l l

v |   9 |9H6Bs q 5w 2 l i

ik l l U lv fh 5 w 2 v l H l l U

(9.50)

(9.51)

(9.54)

F IELD

THEORETIC METHODS

120

This appears here just in the way it appeared in the non-interacting case in a magnetic eld. Moreover, here serves as that external magnetic eld mathematically. Physically, can be thought of as some magnetization, and the term as the interaction between the spin and this magnetization. However, unlike the mean eld case discussed previously, we have made no approximations yet. However, now is the time for an approximation. We need one since we cannot perform this integral with the . Instead, lets assume that the magnetization eld is small and Taylor expand the term . If we keep terms to order , we get

where . Of course, another reason for this approximation is that it turns our integral into a Gaussian integral, which we can easily calculate. 1 At this point, it is natural to switch from discrete sites to continuous ones, i.e. instead of denoting spins by their lattice site , we will denote them by their location. Thus, we have transformations of the form and we have the conjugate eld . In this case, one would call a functional integral, since we are integrating over all values of the function . This yields

has a natural form, i.e. it is a delta function for nearest neighbors (i.e. with a lattice spacing ), and zero otherwise. This delta function representation, allows us to do one nal transformation: going to a Fourier basis. If we Fourier transform we get

1 Please note that has an important physical meaning. If one calculates the correlation function between the . Thus, has the physical meaning of the correlation function between conjugate elds, one les that magnetization at locations and . We will see this in more detail below.

Note that

depends only one one due to translational invariance (i.e. depends only on ). Fourier transformation of the other term in , the delta function, is 1. Also note that by

 r  q) 2 $ 2 gw l m   $   P 

where The benet of thinking in contiuum space is that

is the continuum version of

. (9.57)

s i D q

  I

  1Frp w    v s q 1Fv    s l q 6 w   
l

ysyw 2 H  w  P 2 H  q 2  P Iu dyhgH Iu d ` u  ts 2 H u w  ys 2 u  q H q   2 u   H v q i $2 2 2 rqu r   u  u l # 6Bs S""

v 6Bs q 5 2 i

  r

l  2 2 PH  P  2 h 

&

s tw  9~H q   H  w  9 l H BD q e v s i  vl) H v %v) 2 g 5 rF 5 vfF

$  $  $ w w $ 7 s D1vP $ 1vP G D1Fv q  $ 

v q l qH6Bs i l v q i |qH6Bs S v 6Bs q i

l H

(9.55)

1 & W & u

  1  $ 1

&

2 u H

v 1

(9.56)

(9.58)

F IELD

THEORETIC METHODS

121

Fourier transformation, what originally was a matrix quantity ( ) is now a function of a single quantity . This results from the fact that since is translationally invariant (does not depend on or , Fourier transformation is essentially a transformation which diagonalizes the matrix. In its diagonal form, calculating matrix inverses are easy: we just invert each term of the matrix. Thus we have (keeping terms to order )

where . Note that has units of length (whereas has units of inverse length). We can now Fourier transform back to get the correlation function in position space:

This form is called the Ornstein-Zernike correlation function. We recognize as the correlation length of our system, i.e. how far apart do we expect to nd spins which are still correlated. In other , we expect domains to form in words, at temperatures below the transition temperature which the spins all point in the same direction. The correlation length tells us the typical size of these domains. We see that this length scale is imaginary above the transition temperature, reecting the fact that . We see that at the transition temperature , we start to get no domains should form for correlation. Indeed, one can rewrite the correlation length as

The meaning of . How does relate to correlation functions which one can measure from scattering? The partition function takes the form

Lets introduce an additional eld

Physically, is like a position dependent magnetic eld. Mathematically, we get the original partition function back for . However, since this is only a linear term in the exponential, we can still calculate this partition function by Gaussian integration:

is a critical exponent of this system. Thus, this correlation length is small ( where high and divergent ( ) for .

(  k d  v $   k I  2 IPiu r  2 u  2 u l As q i   w     #   k v q i $ 2 2   u  u l # 6Bs S""

2 Qw 2 2 P )

$ 2 2 2 q k k   u  u # Ats u  b

 2  qu  

 2 u 

 2 r qu r   

2 Pi )

2 P )

$  H l 5w 2 $

j P H $k

 ) H b97

2 

H I

E 

j P H

 $
$

  l )q 2 2 yw  P l

9$

d `

"

l  H

2 u H t

)P  H l 5

$
H  

v ) l

1  

H } m

(9.59)

(9.60)

(9.61) ) for

(9.62)

(9.63)

(9.64)

F IELD

THEORETIC METHODS

122

From this, we can calculate the desired scattering function using a derivative trick:

(9.66)

In this form, we see that describes how fast the correlation function decorrelates. The typical length scale for this decorrelation is and thus is the length at which the system is correlated. In terms of an Ising system, this correlation refers to a patch of spins which are all aligned in the same manner. At high temperatures, is small and the patches are of a small size. When approaches , we nd patches at such large scales. Thus, there are large scale uctuations. This may be potentially bad, since in mean eld theory weve been ignoring these uctulations and indeed large which would indicate the possible limitations of mean eld theory, as we will see later in this chaper.

9.2 Landau theory of phase transitions


Landau theory is a simple and physical way to write models of phase transitions. The goal of a Landau theory is to simply write down the relevant free energy for a system, based upon the relevant symmetries. In many cases, one can simply think about some properties of the system, write the appropriate Landau free energy, and quickly build a good model for the phase behavior of the system of interest. 9.2.1 Motivation from the Hubbard-Stratonovich transformation One way to motivate Landau theory is to consider our exact equation for the partition function in the previous example (9.67) where

(9.68)

where acts like a free energy and is called the Landau free energy of the system. We obtained the approximate result by expanding . Note that we have gone from the lattice representation (where elds are labeled by and to the position representation). For the moment, lets assume that the eld does not depend on position, i.e. the elds are uniform and we can write . This leaves us with a simple Landau free energy

(9.69)

 1

$

Thus, we associate the two point correlation function with . What can we learn from scattering? To look more closely, lets Fourier transform back get

@@@   BB`w zr v l

2     7  93r EQQQQ I 2 b Q  2      k  Cu  k  93rEI p2 u   w 2 zr    

(9.65)

to

w 2 5y c   zr   zr l @BB@w @ H  ) wv ) ) 2 "z 1Fvfp l tH l lv yszr q 1vrFw 2 EI   9 lv     

  E

    I zI I zr      ) H 3b97

P 3 P 93 P
t
H

s BD q i

H y

F IELD

THEORETIC METHODS

123

where we dene and . Note the similarity between this form and a virial expansion. Indeed, one can think of a virial expanded free energy as a Landau free energy where the eld is the density. Now, lets take a closer look at how this free energy behaves. First, we notice that there is a and . For , there is a free energy at fundamental difference in the free energy for . When , then we see that there are two free energy minima at

where is the transition temperature for this phase transition. Thus, we see that for we have and for we have . Note that the formulae here parallel what we did in the mean eld calculation of the Ising system, where is interpreted as the magnetization. Indeed the equation above for the equilibrium magnetization follows exactly what we found by expanding the for small . This parallels our calculation here, where we expanded for small . 9.2.2 Motivation from symmetries

The Hubbard-Stratonovich transformation is a pretty heavy handed, mathematical way to arrive at our results. Ideally, one could think about the physical nature of the system and arrive at the Landau free energy. In fact, this is just the method suggested by Landau. First, we can write down the most general possible Landau free energy

However, this form is not very useful. We must now insert some physical understanding about Ising systems into the above in order to simplify it an distinguish it from other systems. To do this, we examine the symmetries of the Ising system. Lets be more specic by what we mean by symmetries. A symmetry of a system can be thought of as an aspect of the system which can be changed without affecting the free energy. For an Ising system that is not in a magnetic eld, there is an up-down symmetry, which means that the spins dont care which way they are aligning in an absolute sense, just that they are all aligning in the same direction. This symmetry is manifested in the free energy by the fact that we must have

i.e. the Landau free energy is invariant to a global spin ip. Mathematically, this means that the coefcients of all odd terms in the Landau free energy must be zero (since for odd) and only the even powered terms survive. This allows us to write the Landau free energy of the form

@@@ BBw yw w w o Fw 2 5y c w P

$ l l o   H ) bH

@@@ BBw kw

 H  

a

The transition between these two cases occurs at rewrite as

. Lets take a closer look at . We can (9.71)

v c )H

| Q

vl)l
"

w 2 "k c

c v o w 

v 9eH l q c  l

 $ 1

v q)


5 vrF

l ) b H  c u

 C |H C |H

c $

"

(9.70)

(9.72)

(9.73)

(9.74)

F IELD

THEORETIC METHODS

124

Next, we say that the intrinsic interactions are pairwise. This means that could be either positive or negative (since corresponds to pairwise interactions, just as the term in the virial expansion correponded to pairwise interactions). Finally, we have to decide where to truncate the series. For pairwise interactions, typically four body interactions will be dominated by the entropy, which wants to avoid attraction (since this decreases the entropy). Thus, we can take (i.e. repulsive four body interactions). With these two terms, we have a pretty complete picture. Or, at least, we can now examine what the system looks like with just these two terms. If the qualitative behaviour is lacking in some way, we might need to consider including additional terms. This leads us to the Landau free energy

just as we found before. But what about the form of and ? Since is caused by entropy, we expect that it should simply be a constant. However, since the pairwise interactions are the net result between the entropy of being different and the energy of aligning, one would think that we would have a coefcient of the form (9.76) for attractive interactions (where is dened to be positive for attraction and is some entropy per particle for the system). This simple form simply describes the fact that at high temperature ( ), the coeffcient should be repulsive and at low temperature, it should be attractive. At the transition temperature , the behavior switches. Thus, we can rewrite as (9.77)

This form brings us directly to our result from the previous section. Instead of going through all the mathematical formalism, we were able to make some simple physical arguments and move directly to the result. This is the real strength of Landau theory. However, one must have some understanding of the system to make these arguments! 9.2.3 Ehrenfest classication of phase transitions The Ehrenfest classication of phase transitions distinguishes discontinuous phase transitions from continuous ones. Ehrenfest proposed that phase transitions could be classied as th order if any th derivative of the free energy with respect to any of its arguments yields a discontinuity at the phase transition. For example, in the van der Waals equation for , we saw that there was a discontinuous jump in the volume. This jump typically means that there is a kink in the equilibrium free energy vs temperature. When we take the rst derivative of the free energy (with respect to temperature, for example), we nd that the rst derivative is discontinuous due to this kink. Thus, we call these transitions rst order phase transitions. On the other hand, there is no such discontinuity for the case in a vdW uid. In this case, the rst derivative is continuous. Such transitions are typically called second order phase transitions, but this name is typically includes all transitions which are continuous in the rst derivative (and thus includes third order, fourth order, etc). Since the name second order is a bit misleading,

l F

h"e

2 d&

H k

H l H l  7z3QCnzc

 H  H f ~c

l n 5

w 2 5y c

l) F

(9.75)

F IELD

THEORETIC METHODS

125

the prefered term is a continuous phase transition, but old habits (and nomenclature) are hard to break. Finally, we can ask, mathematically, where does this discontinuity come from? If we look at the free energy or the partition function, they are just sums of exponentials, and thus look like they should converge and not lead to singularities, eveb in their derivatives. Where does this singular behavior come from? Interestingly, we should keep in mind that there is one singular nature of functions like the partition function or the free energy, namely the thermodynamic limit . Mathematically, it is the thermodynamic limit which leads to singularities of the partition function or free energy. Since these singularities are associated with phase transitions, we come to the result that phase transitions only truly occur in the thermodynamic limit! With that formal statement said, clearly nothing in the universe is formally in the thermodynamic mol is sufciently close to observe phase limit and thus, in a pratical sense, we say that transitions. However, this is not as clear once one looks at much smaller systems, such as polymers, nanocrystals, proteins, etc (which could have as few as 100s of atoms). 9.2.4 First order phase transitions We have seen in the rst section a Landau theory for the Ising system, which yields a second order transition. In that case, there was a two body attraction (i.e. the term could have a negative coefcient) due to spin-spin interaction, the term vanished due to symmetry, and the term was always positive due to entropy. Now consider a different system: the trimerization of compound X. Lets say that X trimerizes only when all three monomers of X are interaction (dimers of X do not form, due to the dominance of exlucded volume over dimerization attraction). In this case, lets write down the Landau free energy. First, we need to pick a relevant order parameter. Lets choose the density of monomers . In this case, unlike Ising systems, there is no symmetry arguement to exclude odd power terms in . Thus, we can rst write down (9.78) Now, lets think about the values of these coefcients. Since we said that there was no attraction to . However, we said that there could be an attraction to form form dimers, we would say that trimers, so there could be a case with . At high temperature, is likely to be positive (since trimers wont form due to entropy) and thus, we can write

is a positive constant and is the reduced temperature. It is natural to set since there is not likely to be any attraction between quartinary units. The term is related to the chemical potential per unit density. If we are working at constant number of particles, we can likely ignore this. Thus, we get where with the constants , , and all positive.

&

&

& 3P

2 u

i H

v & (w o & (w 2 &c (3e l l l w & P

h v & (w o & i mH 2 &c y l l l

j H $k

q9$QtE b ) H H l H %n7b

(9.79)

(9.80)

F IELD

THEORETIC METHODS

126

Lets look for the free energy minima. We nd

We get three solutions: and . Lets look at these minima more carefully. For the solutions, we see that there is an interesting temperature given by the point where the square root is about to go negative (and thus the value of this minima is about to go imaginary): . For , we nd that there is only one solution . As we lower (and thus increase beyond , we nd that a free energy minima appears for . However, this free energy minima is a relative minima, not an absolute. Thus, this will not be the equilibrium state of the system in the thermodynamic limit. However, in real life, one will still nd some part of the system there in some meta stable form. This temperature at which the meta stable free energy minima appears is called the spinodal temperature. As one further lowers , one nds that eventually the free energy minima lowers in free energy. The point where both minima have the same value of free energy is the transition temperature . Below this temperature, the minima with is the global free energy minima. Thus, the density must jump from the to the minima, as one would expect from a rst order transition. 9.2.5 Calculating scattering functions from Landau theory (for inhomogenous systems) In the previous sections, we have been studying homogenous systems, i.e. we said that the order parameter was uniform in space . This, of course, is not typically the case. For example, in the previous section, the presence of two free energy minima can lead to domains of each phase in space. Thus, one domain will have one value for and the other a different one. Thus, will vary with . Considering this variation is especially important when we make connection with scattering experiments. Scattering directly examines the nature of these imhomogenities. To address this mathematically, we must now look at the Landau free energy for an inhomogenous system. For example, consider the following Landau free energy for an Ising system

The last three terms can be derived by symmetry arguments (as we did in earlier sections) and the coefcient is a position dependent external magnetic eld. The rst term is the analog to the term that we derived from the Hubbard-Stratonvich transformation. Formally, we can see this equivalence by noting that the Fourier transform of is . Physically, the gradient looks for differences in neighboring values of the eld , just as dealt with the interaction between neigboring lattice sites. Of course, this term is an approximation (formally the rst term in the series expansion), but it will yield the relevant qualitative behavior (much like the approximations to truncate the expansion). To calculate the equlibrium behavior, we must take derivatives of the Landau free energy with

  2 r3

9$ 6 $

  EI

& "&

   EI E

 5)q c H 2 2 v  i p i 5 &  5 H c & o & w 2 & i ~&"   H   zr &

(9.81)

w 2 zr w 2 EI   c  

&

ys q

|zr d 

&

v i ) c  
"&

5 &

&

 2   F zr

2 2i ts # q
 E

(9.82)

F IELD

THEORETIC METHODS

127

. This is an example of a functional derivative. 2 Operationally, one respect to the function takes functional derivatives just as one would take the analogous regular derivative. Similarly, one can dene a functional integral in an analogous way. Using this language, the partition function itself is a functional integral of the functional :

This mathematical formalism allows us to in a sense treat the prole like a macrostate, which makes sense when you consider that determines the conguration of the system which we input into the Landau free energy. Now, lets use this to calculate correlation functions which one can measure from scattering. For the case , we can ignore the term since it does not qualitatively change the behavior (the minima in this case is still at ). Thus, we have the partition function

where ; we know that should only depend on the difference since the system is translationally invariant there is nothing special about any particular point in space. The partition function is just a Gaussian functional integral, which can be solved just like we would a multi-dimensional Gaussian integral. In this analogy, we think of as a matrix (which happens to be diagonal since it only depends on the difference ) and and as vectors. This leads to (9.87) From this, we can calculate the desired scattering function using a derivative trick:

We associate the two point correlation function function . Finally, we have to calculate the inverse of our

with our function. To do this, we can write (9.89)

of a function. Then the functional derivative of

with respect to the function

This leads to the functional derivative analogs of more familiar derivatives, such as

be a functional of is dened to be

Es

w E w E b w E t E W w E s W w E E A s s s s w E ( w E rh( w E t E H w E r Duc s % t % s s 's E (w w s E s  %

2 Mathematically, we can dene a functional derivative as follows. Let

, i.e. a function

  zrT

 H 3E" sz yEyw 2 q  3 H s 3 H  c szd5sz E  3 H   3 H  3 H  H  3 H  3     3z `z 9rEI  3 9 z   3 H sz  3 H      z Fo2   P3IEI $  3  3 H  sz E 3 6Atz b q t  # s 3 H 3 H 

 z

   zrz

  zI

  zr

 zr E $ r zr E   w      H w 2 zI w 2 EI   c  

 3 H suE  3 H  c H 2 suzdsz  3 H 93 9HEI i 3    ys q 3   H j EI i

s zrTH q EI i    

  zr

zr  

"zI  

  zr

(9.85)

(9.86)

(9.88)

(9.83)

(9.84)

F IELD

THEORETIC METHODS

128

It is easier to solve the above in the Fourier representation. If we Fourier transform both sides, we get

Moreover, it is typically this Fourier transformed version of the correlation function which is calculated in scattering measurements. What can we learn from scattering? We see that has units of inverse length squared. In fact, one typically associates with the correlation length , i.e. . This makes sense especially when we Fourier transform back to get

correlation function decorrelates. The In this form, we see that describes how fast the typical length scale for this decorrelation is and thus is the length at which the system is correlated. In terms of an Ising system, this correlation refers to a patch of spins which are all aligned in the same manner. Recall that . Thus, at high temperature and thus , i.e. there is little correlation. As temperature decreases, decreases and increases. This continues until we hit the critical point. At this temperature, and thus , i.e. the correlation length diverges (i.e. the correlation length becomes the system size). With patches at such large scales, there are large scale uctuations, which would indicate the possible limitations of mean eld theory, as we will see in the next section.

9.3 Breakdown of mean eld theory


Lets go back to the rst way we introduced mean eld theory. We found that we could approximate the Hamiltonian as (9.92) Now, lets look at this approximation more closely. We can rewrite the original Hamiltonian to include the term missing when we made the approximation . Thus, including this missing term, we can write the Hamiltonian again (without any approximations) as

(9.93)

with the addition of the term showing the deviation from the average value. As we mentioned earlier, the approximation leads to

(9.94)

and thus the connected correlator vanishes in this approximation

(9.95)

The error implicit in this approximation can be estimated by

(9.96)

2 c

R) l d

ysC l H ul5h l q  w

HI9$r  $   
c 2 l c

l l

P3IPzI   3 H  H 3E   H  ) h3s 76 P3IPzI  $ 9

l l l l r l l H l l l l e l l H l l l l d l l  l l 2 l l l d l

2 mc

c yw 2 w w

i l d l

c c

  c

9$

PFw i l

l l

H ux5

l H l l l

1  

$ syw 2 $ q c
c

(9.90)

(9.91)

F IELD

THEORETIC METHODS

129

Noting that the correlation function is related to the Greens function

we can now try to make an estimate of the error . In particular, we can try to say how big can this error get before mean eld theory breaks down. For mean eld theory to hold, we would expect that we must have (9.98) Such an argument was rst made by Ginzburg, and the constraint above is called the Ginzburg criterion. Now, lets think about how we should perform this average. We want to see how correlations affect Landau theory. Thus, it makes sense to average these quantities over a range which we would expect to nd correlations. If we look at a larger range, then we would be throwing away the information we are interested in. We know that the system is correlated up to the correlation length . Thus, we can express the Ginzburg criterion as

where the volume of correlation is . Lets now evaluate these quantities in particular, how looks as we approach the critical point. We can estimate the denominator of using the mean eld solution we found from Landau theory: (9.100)

The condition that mean eld theory works is

or

Recall that is a small parameter in the regime we are discussing (i.e. near the critical point). Thus, we see that for , mean eld theory is always correct, even at the critical point (i.e. , and does not diverge and thus is small ( to the positive power is an even for smaller number). However, for , we see that as one approaches the critical point, mean eld

2 2 2 l

 2

2 l ' 2 &   #

Putting all of this together, we get

H   l

9

 # 

d d 2 r $ q)9Qz H

H  

 # 

9

and nally since

, we have

2 

) H  i9$eEb

where the critial point like

is the reduced temperature. We know that the correlation length diverges at (9.101)

that we calculated before (9.97)

  l 2 & 2 & # v#  v# l l 


2 5 H p v)c  # v  #  t0 2 &   # # d ) l  v#  

(9.99)

(9.102)

(9.103)

(9.104)

F IELD

THEORETIC METHODS

130

theory breaks down as the quantity is very large near (and diverges at) the critical point (although it is accurate away from the critical point). Because of this behavior, the dimension is called the upper critical dimension since for mean eld theory is exact. The value of changes from system to system. For polymers and the Ising model, . For some systems, can be even higher (for example, for hyper branched polymers [eg dendrimers], ). While we will not have time to discuss this further here, there is also a lower critical dimension for systems . For , uctuations destroy any possibility of long range order and thus mean eld theory is wrong. For Ising systems and polymers, . This effect was seen in the lack of a phase transition for the Ising model. Just as can vary from system to system, so can . Interestingly, for hyper branched polymers is and thus mean eld theory is not even qualitatively correct in for these systems.

H H H

}) l

H vH

F IELD

THEORETIC METHODS

131

1. Relationship between

For the system which has the partition function

Here, you will show that

If you like, you can think of this as an abstract system where the microstates are the values of and thus the functional integral represents the summation over all of the microstates. In . this case, one would have the Hamiltonian (a) To calculate averages of , it is more natural to work with the new Hamiltonian . Taking gives us our original Hamiltonian. This additional term is often called a source term. Calculate for this new Hamiltonian. to calculate the average for the case . (b) Use Hint: Derivatives of with respect to the new eld can be used to calculate averages. (c) Use to calculate the connected correlation function for the case . 2. Fluctuation corrections We have been solving mean eld theory solutions for the Ising model. This means that we have been writing down free energies and then solving them by the saddle point approximation, which means looking for free energy minima. The nature of this approximation is that we assume that the partition function is dominated by that value. Ideally, one would like to simply solve integrals like

exactly, but thats not possible. However, heres a way which is an advance over the saddle point approximation. This method allows us to include uctuations in near the free energy minima, i.e. we now longer assume that only one value dominates the integral (as in the saddle point approximation) and include the contribution of nearby values. (a) First, do the easier operation and solve the integral above using the saddle point approximation. Call the mean eld value of , i.e. the value which maximizes the integral . (minimizes the free energy), . For this entire problem, you can assume that (b) Now, rewrite the integral in terms of a new eld . Taking only terms up to quadratic order in in the exponent, solve the resulting Gaussian integral in to calculate the partition function and resulting free energy. How does this result differ from the mean eld theory calculation? What is the physical meaning of the additional term in the free energy?

P   e  k v  k IPi7 r bF k " P    k v  k I  k v  d k    I$  xH n k i n  2 2 2 2 H IPiu r   u  u n s q i 2    2 u  P rq r $ v s q 2 2   u  u l # BD i e

 2 r iu I   

H i

w 2 c

p H 6

  $

9.4 Problems

and correlation functions

F IELD

THEORETIC METHODS

132

3. Screening in Coulomb interactions You may have already learned that Coulomb interactions between two charges can be screened, i.e. the effective interaction is made short range. In fact, instead of the normal Coulomb interaction between particles 1 and 2 one nds that the effective interaction energy is

where is the screening length. Screening is caused by the rearrngement of other charged particles in solution. In this problem you will show that this is true and calculate in terms of measurable quantities. This effect is important since screening makes Coulomb interactions effectively short range, instead of the bare Coulomb interactions, which is considered to be long range (i.e. equivalent to the screened case when ). (a) The natural place to begin is with the Hamiltonian between charge particles.

where is the magnitude electron charge, is the charge of the th ion in terms of electron charge (eg electrons have , protons have ), and is the position of the th ion. . Call this Re-write this Hamiltonian in terms of the charge density Hamiltonian . (b) With , write the partition function

We see that this has interaction just like the Ising system we studied in class, except that the interaction is Coulombic in nature. Use the Hubbard-Stratonovich transformation to . introduce elds Hint: Treat the terms as we treated in class. Also treat the terms as we treated . (c) Now, make the approximation that , where is the density of the system and is assumed to be constant. Sum over the possible values of , i.e. whether the ions at a given location are (d) Taylor expand the result such that your exponent only has terms up to order .

l 2 # I# ) H

    H # d l # l

&

 ) 

 5w 2 i n9H7

l H l

 2 $ f` $ H76
l

& l #  

l 2 f

$` $ l

$ `

H (

2 B 8DGB r pzzEB

i#    # r

i i

F IELD

THEORETIC METHODS

133

4. Multi-dimensional Gaussian integrals In class, we did single dimensional Gaussian integrals

In general, one often has to calculate -dimensional Gaussian integrals, which take the form

where is the determinant of the matrix (i.e. the product of its eigenvalues) and is its inverse. In this problem you will derive this result. (a) First, calculate the integral above for the simpler case of a diagonal matrix . Hint: The eigenvalues of a diagonal matrix are its diagonal elements, and the inverse of a diagonal matrix is a diagonal matrix with inverted eigenvalues. (b) Now consider the more general case where is not necessarily diagonal (although one can assume that it is non-singular, i.e. that it has no non-zero eigenvalues). Derive the result of this multidimensional integral. Hint: Use a transformation of variables ( where is a unitary matrix which diagonalizes , i.e. ) to reduce this problem to form of the previous one. 5. High temperature expansion

Another way to handle interaction is with a high temperature expansion. One can show (see a previous problem) that one can write the free energy in terms of a sum of cumulants:

We will use the generic Hamiltonian

where is the energy of interaction between particles at a distance . (a) Write the terms to order . To do this, I would suggest writing

where we have used the denition of the density and will ignore the terms in the sum above. (b) Now, assume that the density is constant, i.e. . Show the relationship between this approximation (i.e. high temperature expansion, keeping terms to order ) and the calculation of the second virial coefcient we did in class.

ik 

4 ob  v b H   H ocrF lv fh 2 BC 7}v"ik C w C lv ofh 6 C


& # o&    #  # o&   H 2   2 H  2 # o&i # o& # r # I # # s s  H 2 # di I# di 2 # r#  2 # #  H  U  H  H 2 & )  H  { C H C C I9H  2 &

c l

w c p l H

6}v

c ) 7   # 4

v }c v

4  #

w 2 Cc l H

 I 

B EYOND

MEAN FIELD THEORY

134

10 Beyond mean eld theory


What can we do if mean eld theory doesnt work (eg near a critical point or perhaps in dimensions below the lower critical dimension)? The following two methods are the main alternatives.

10.1 Critical phenomena


In section 6, we studied the phase transitions of an Ising model and found that there was unsual behavior as or , where . In second order phase transitions, the transition temperature is associated with a critical point. We have seen that there is some unusual behavior as systems reach a critical point. In particular, the coefcient to pairwise interactions vanishes and the correlation length diverges (in the mean eld approximation) . More generally, one writes

The exponents here are called critical exponents. Weve seen that in mean eld theory, we get . It turns out that this is incorrect for . Much as Flory theory predicts deviations from random walk theory, more sophisticated treatments of the Ising model nd deviations from mean eld theory in the values of the critical exponents. One can similarly calculate values for and in mean eld theory and these values will have similar deviations. Indeed, one can write generic relationships between these exponents. For example, , where is the dimensionality of space. This scaling law holds, Josephson has shown that independent of the system one is studying. There are several other such laws which connect these and other critical exponents. Why does mean eld theory fail? Near the critical point, uctuations become large scale and long range. Weve already seen that even mean eld theory predicts this (both in terms of the divergence of and in terms of the Ginzburg criterion). Thus, clearly we need a more sophisticated theory to understand statistical mechanics near the critical point. In the following sections, we discuss means to do so. Finally, while we have spent most of our time discussing polymers and the Ising model, the behavior seen here is very general, and common to all second order phase transitions. First order phase transitions do not have this problem since they never get near a critical point.

10.2 Exact solution of the one dimensional case: transfer matrix approach
Ideally, instead of making approximations, such as those involved in mean eld theory, we would just solve the partition function exactly. Rarely is this mathematically tractable. One notable exception is one dimensional systems. Below, we calculate the properties of an Ising model in one dimension. While this may seem not directly relavant to three dimensional systems (indeed, the one dimensional case is radically different, as we will see below), there are true one dimensional of interest, such as polymers (see the homework problem).

{c

1qTP v a w w v 1qTACc1fa v a aa a zDfTA

) H  $i%v$zm

p Hv #7

d 2

d `
$

(10.1) (10.2) (10.3)

v ) l

B EYOND

MEAN FIELD THEORY

135

Consider a one dimensional Ising system, where all spins are conned to a ring (we put the spins on a ring, rather than a line since the periodic boundary conditions of the ring will simplify our problem a little). In this case, we have the partition function

We can express this equation in a simpler and more compact form by dening the transfer matrix

where we are labeling the different possible spins states by the greek letters: refers to and in the simple Ising model. Thus, in this case is a matrix:

Here, the spin representation is much like spinors in quantum mechamics. For example, to calculate the Boltzmann weight of an upspin interacting with a downspin, we do the matrix multiplication

Thus, the transfer matrix takes the form of the matrices in our partition function above. Since we , we nd can write the identiy matrix as

where the denotes the trace of a matrix (i.e. the sum of its diagonal elements, which also equals the sum of its eigenvalues) and denotes the matrix power of the transfer matrix. To take matrix powers, one must take the power of the eigenvalues of the matrix. Thus, we get (10.9)

where are the eigenvectors of the transfer matrix. Finally, since we are raising the eigenvalues to the power, and we will then take the thermodynamic limit , we can use the saddle point approximation to say (10.10) where is the largest eigenvalue. In the Ising case, we of course only have two eigenvalues, so we could keep both terms quite easily. However, even with two terms, the dominance of the largest eigenvalue in the thermodynamic limit is still of course quite large, so we keep only this term , and thus we have the free energy

$  7  H7    97 H   76  # l v tv H X2 l l l l ys l lnH q 7 y  U U U sl l H q t s l l H q t BB@ t sAl 2 l H q 6 t s 2 l l H q u3w BB@ w w @@ o @@ s l l H q U 3w @ @ w w u3w @ @ w w BB@ 1 l l H 0 BB@
 h4 U U U 4 h 4 t FF d 4 U e 4 U w h U w U w U U U xP l f l lr @ @ @ o 2 2 3w @ @ w w l l7f l BB&Al7t l l72 f l BB@ 2 l l w p l l 97` $ n H7 7 q $ $   9 779H    779H l q  # s   # # s

ys  1v(v q eF H H

(10.4)

9 1v(7h4 v

(10.5)

(10.6)

(10.7)

(10.8)

(10.11)

B EYOND

MEAN FIELD THEORY

136

It is possible to generalize our denition of a transfer matrix to systems whose energy depends on some external eld as well as an internal interaction:

which has the largest eignvalue

), there is no magnetization at any temperature. Thus, We nd that for no external eld ( the system remains in the paramagnetic phase and there is no phase transition! This was the result which Ising derived, and one can only imagine his disappointment (and in fact, he never published another paper). Why is there no transition in ? Phase transitions occur due to interactions. Lets say that the 1D Ising ring was aligning to all up spins. If one spin misaligned due to thermal uctuations, then all of the other spins after it would not know about the aligning to up spins. Thus, in 1D, thermal uctuations can destroy phase transitions. In higher dimensions, this is not possible since the interactions are transmitted in more than one dimension. The transfer matrix approach is broadly applicable to any one dimensional problem in which there are local interactions. Onsager devised a tranfer matrix method for the 2D Ising model, yielding exact results for that case. This result is considered to be one of the great tour de force calculations in statistical mechanics, and is covered in detail in Huang. Since mean eld theory is exact for , then the Ising model has been exactly solved for all dimensions except . The exact solution of the Ising model in remains an outstanding question in mathematical statistical mechanics.

10.3 Real space renormlization


10.3.1 Self-similarity: the heart of renormalization Mean eld theory breaks down near critical points. Lets think about what happens in this regime. First, we know that the correlation function diverges. This means that there are clusters at very large scales. However, the picture is more complicated than that. If we look at a sample near the critical

We can calculate the average spin by the derivative

ys$  H7H l q 

ts H97H l q r 9 2 1v  H 9 Fq

H r

9 Fv

 2 1

w 

 v 1 #

and thus a free energy per particle

$ ys  lq H w  Hr6H 7 9 2 1Fv   1v #  76`bh4 $  s H q 7   9H7   H76 Es w q 76 # 




0   

ul

Thus, in the case of spins in a magnetic eld, i.e. we the added potential transfer matrix

  w  q  v l  H ys ly l 1) tH  l ln36` y

(10.12) , we get the (10.13)

 ') 

l  pr ) eH 

(10.14)

H bh ) H h

rl

(10.15)

(10.16)

B EYOND

MEAN FIELD THEORY

137

point, we see that the sample gets clouded, or milky, i.e. looks white. This means that it is scattering light at all wavelengths. If something scatters light at all wavelengths, then we know that there is structure at all wavelengths. Indeed, near the critical point, we see structure at all length scales. The mathematical manifestation of this is the switching of correlation functions from exponential to power laws. An exponential function has a characteristic scale. For example, the correlation function

decays with a characteristic scale . What happens at the critical point? In this case, we know that and we get a correlation function of the form

. Unlike an exponential, a power law function has no characteristic scale, i.e. a power law and is thus called scale invariant. Thus, we see that there is such scale invariance taking place at the critical point, reecting the fact that there is structure at all length scales. Lets think about what scale invariance means. Another term associated with scale invariance is the existence of self-similarity, which means that subparts of a given structure have the same form as the structure itself (a property common to fractals, for example). For example, if you break off a piece of brocolli and compare it to the whole, both parts look very similar. If systems near the critical point exhibit this form, then this makes sense why mean eld theory breaks down. The structure near the critical point is fairly complicated and cannot be approximated by some mean eld. Instead, one must directly incorporate the self-similiar nature of this system. In the following sections, we will demonstrate how to use renormalization ideas to look at 2D self-avoiding walks, in order to calculate the critical exponent , as in

A mean eld solution of random walks gives , which we have mentioned is correct in dimensions. For two dimensions, we expect a signicantly different value. To calculate for a self-avoiding walk, we can think about how the self-avoiding walk could be considered to be self-similar: we may draw a coarse-grained version of the walk (i.e. some rough trajectory of the path taken), and then zoom in on a given sub-part of the trajectory and say that this subpart is self-similar to the whole. Then, we can related properties in the coarse grained representation to those in the detailed representation. This procedure is called the renormalization of the terms which describe the system of interest. The problem now becomes nding some mathematical way to use this idea to perform a calculation. The central idea here is that we wish to calculate the partition function by enumerating all of the microstates. In the self-avoiding polymer case, this means all of the self-avoiding walks. Of course, there is no obvious way to do this exactly. Instead, what the renormalization technique does is that it starts with something which is coarse grained not very detailed which we can write all of lattice. Next, we will relate the microstates down, such as all of the self avoid walks on a the individual links of each of these walks themselves to a self-avoiding random walk. This relation

H h

v t v

H h s  H H ) h 7

H 7 s  H H 3A)7 s 097

3     3 P3IPzI
w

(10.17)

 3 3 pF P3IPzI   

v ) l

 H E

3 

(10.18)

(10.19)

B EYOND

MEAN FIELD THEORY

138

puts detail back into our model and is a controlled way to in a sense sum over all of the microstates. The key is in building a relationship between the coarse grained and detailed models. With this relationship, we can recursively put further detail into our detailed model, and then detail into our detailed, detailed, model and so on. 10.3.2 Similarity to critical phenomenon It is natural to think about random walks in terms of the number of walks which have a given end to end distance for a polymer of xed length . From this we can calculate the total number of walks to be . However, to help ease the calculation, it is simpler to consider the grand-canonical version of this term: we will not consider a polymer of xed length , but with xed fugacity : (10.20) From this, one can calculate the average length of the chain using the derivative trick (10.21)

Now lets consider how varies with . For small (eg ), we see that this sum will be dominated by short chains (since exponentially decreases as increases for ). For large (eg ), we get the reverse: the sum is dominated by large since exponentially diverges. Thus, the result for small is equivalent to the result for and the result for large is equivalent to the case for . For this reason, one calls and xed points for the behavior of , since they dene the behavior of the system. However, in between these two xed points lies a transition from the regime to the regime and we associate this value as being like the critical point in this transition, i.e. we can write (10.22) Thus, we want to know how the typical size of the system

Now that weve made a connection with critical phenomena, we have to think about how to introduce self-similarity, i.e. the system at smaller scales resembles itself at larger scales. To do this, consider our idea of adding detail into a coarse grained picture of a self-avoiding walk. Thus, we want to relate somehow the correlation length at both scales.

  

10.3.3 Recursion relationship: renormalizing

 j  H Q7

scales when

. We expect that we should get something of the form (10.24)



U

   d U      U P  U   v#  U l F  U  U U U U U
 # 

 # 

U  U# U  2

 j  H gr

f

 2  

U   h97`~ U  U
 #  

 # v

r d 1  

#

(10.23)

B EYOND

MEAN FIELD THEORY

139

coarse grained

z2

z3

Figure 10.2: Different SAWs which contribute to a renormalized step. We match the fugacity in Figure 10.1: Renormalization of a self-avoiding the detailed representation to that in the coarse walk. grained representation .

To do so, we introduce a recursion relation, relating the coarse grained picture with something with more detail. In particular, since we expect our system to be self-similar, we can relate the to that in the detailed one by the recursion correlation function in the coarse grained picture relation (10.25) where is a scale factor to compensate for the fact that the detailed picture has a different scale (eg different lattice spacing). If is the only relevant parameter near the critical point, then is the same function as Similarly, the chemical potential of the coarse grained system should be related to the grand potential (i.e. the free energy) of the detailed system

One can think of the system at rst purely in terms of the coarse grained system. In that system, we need to know what the free energy per link is (what is its chemical potential). However, we know that a single link in the coarse grained system is related to several different microstates in the detailed system. In this case, we need to nd the total free energy of these detailed states. This will be the free energy for the single link in the coarse grained picture.

U U U

or in terms of fugacities

  1

  

i

  C5v  c  

i i

H p'

 v 

i i

om zo ck blo ne eo on in
z3 z4

detail for one block

 76   i i i

  v

(10.26) (10.27)

B EYOND

MEAN FIELD THEORY

140

In other words, one can think of this as saying that the coarse grained system is comprised of several different detailed microstates and thus the statistical weight should be equal to the sum of the statistical weights of these microstates, just as we would say that the free energy of a macrostate is the sum of the Boltzmann weights of the microstates. 10.3.4 Recursion relationship for a 2D random walk on a cubic lattice Now, lets apply these two equations on a small 2D lattice. Counting all of the walks, we nd that

Now, with this relationship, we can calculate , which we nd to be . For , we see that the recursion relation above goes towards to solution. For , we ow towards the xed point. For this reason, the two solutions ( ) are called xed points of this equation and we say that this renormalization transformation ows towards these two xed points and away from the unstable critical point . Since

Since

is an analytical function of , we have

and thus we get

Solving for , we get

The values of and which we get from this argument, should be compared with the best numerical estimates of and . Better results are achieved with larger cells. Also, analogous calculations can be done in three dimensions, yielding the appropriate for that case. From this calculation, we see that the inclusion of self-similarity allows us to go beyond the mean eld approximation. Unlike Flory theory, which also gives correct results, renormalization theory does so with a solid foundation. Flory theory works due to a convenient cancellation of errors and cannot simply be made more accrucate by the inclusion of higher order terms. On the other hand, renormalization theories can typically be made arbitrarily accurate as one adds more detail into the calculation.

ftH f T l ~f QQQ m yQQQ 5 m 5 l 4 9v p Q Q c p m QQQ QQQ m w QH  w Q 9   Q 5 w Q c w QA c  H  H i  QQ QQ i QQQ  QQQ  9QH 9  H   i i

 H w i Q7

We nd

 

 H w i Q7

i

6m`
g

c  v C"  C"  c  c

i i

 2 yw 

 v  o Vw " i 

(10.28)

i i

f

 n  j  H r

w
h

f n

 j  H Qr

  

` i

(10.29)

(10.30)

(10.31)

(10.32)

(10.33)

B EYOND

MEAN FIELD THEORY

141

10.4 Computer simulation


10.4.1 Simulation techniques We have spent much time talking about how to mathematically calculate thermodynamic properties of a system. In principle, this can be done for any system. In practice, it is often difcult to calculate partition functions or any other thermodynamic quantity. Instead, we often use computer simulations. Simulations serve as a means to directly test whether our explanation of reality (in terms of a model) is correct. In particular, one can often measure anything one would want in a simulation and thus one can both compare simulation result to experiment, as well as use simulations to further interpret experimental data. There are several ways in which simulations are used in thermodynamic calculations. Here I will present three common approaches. Exact enumeration In many systems, one can simply calculate the partition function exactly on the computer (even though it may be hard to do this mathematically). For example, consider a spin system with spins. We could have the computer run through all of the spin possibilities and directly calculate the partition function

This method is in particular useful when we say that there are interactions between the spins. Interaction is difcult to calculate mathematically, but much more straightforward computationally. Another example is in polymer systems. For a polymer chain of monomers, one can try to enumerate all of the possibilities. Often, due to particular aspects of the chemistry of the polymers under study, only particular bond angles and dihedral angles (isomerations) are possible. This allows one to think of the space of polymer arrangements as being discrete (due to the discrete number of isomers) and thus enumerate all of the congurations. Enumeration is a great means to do calculations since we have directly included all of the microstates into our calculation of the partition function. However, since the number of microstates exponentially increases with the system size 1 , enumeration is often only practical for small systems. Monte Carlo It would be great if would only need to sample some subset of all of the microstates and still get good thermodynamic results. This would be ne if we could sample microstates based upon their Boltzmann weights: states with high probability would be sampled a lot (since they are a major contribution to the thermodynamic calculations) and states with low proabability are sampled infrequently (which should be ok, since these do not matter that much anyway). This is much like trying to poll voters. TV news predicts elections based upon small, but representative subsets of
1 We know that this must be the case since entropy is extensive. In this case, we can write , where is the , we get , i.e. we get an exponential relationship entropy per particle. Since between the number of microstates and the system size .

Y dQW V

Uv

w Y a s

d 4W w a V s Y 4jW

 ) H % C 7

6 w U

"

cb W V 

B EYOND

MEAN FIELD THEORY

142

the total electorate. If we pick such as representative subset, then we can avoid doing a complete enumeration. But how are we going to do this? Well, lets try to use our knowledge of thermodynamics to our advantage. In particular, we know from thermodynamics that in order to calculate state functions, the path is irrelevant. For this reason, it makes sense to choose a path which is easy to calculate (why make life tougher than it has to be?). Thus, a Monte Carlo calculation is broken up into two parts: (1) choosing moves which dene the path; and (2) calculating whether the move is accepted or rejected. Let me illustrate this with an example. Consider trying to study a gas. We can say that in our simple model of a gas, the gas particles stick to a lattice. In this case, our moves are naturally chosen to be moving one lattice space in each possible direction. In terms of evaluation of moves, one typically uses the Metropolis criterion. This criteria goes as follows: calculate the original energy of the system . Next, make a move (eg choose one of the directions at random) and then calculate the energy of the new conguration . If , then we accept the move. If , we pick a random number with . If then we also accept the move, otherwise we reject it. What were doing here is choosing moves with a Boltzmann weight. This is important since it is this property which ensures that we are making moves which are thermodynamically reasonable. Metropolis dynamics (as outlined above) obeys two important laws: 1. Detailed balance: forward ows are related to reverse ows, i.e. (10.34) as

i.e. the product of the probability of being in state , times the probability of generating the trial move times the probability of accepting that move. In most Monte Carlo schemes, the probability of generating a move is a constant and independent of the conformation of the system, i.e. (10.36) (10.37) This leaves us with and thus

The Metropolis criteria obeys detailed balance since

i.e. we get

l  v p d   ) %v 97 H  ' Eup d  PF  )  wF u v fv H7l u Esp  d  s  q  s3)qivug'~H q 6` si)i)'u%HH q uus H v % v s  d    Eu1Fa t vus

 '

 ~

v3Fa t d 

 d   d     d  Esa t ' guXp t vs"u EsQ

p  d p  d  ~ X7' EuXp

   's5' 

One denes a ow from the original conformation

to the new one

(10.35)

(10.38)

(10.39)

(10.40)

C r

C |

 v

 Q u 

d  uQ

v  '

d3a  d  Eu3a

s H C  q)Ckvr~H q 6 p

B EYOND

MEAN FIELD THEORY

143

2. Boltzmann sampling: microstates are found with a Boltzmann probability (10.41)

One could use non-Metropolis criteria, but it is important for the above relationships not to be broken. What we get out from these simulations is thermodynamic data, such as free energies, heat capacities, energies, entropies, etc. Since thermodynamics does not depend on the path (for state functions), simple moves (and often simple models) can give good, quantitative results. This is a good example of how thermodynamics concepts really helps in research, even if we arent using thermodynamics directly. Molecular dynamics Instead of appealing to thermodynamic arguments, we can also just think about the natural dynamics of our problem and try to simulate how our system actually works. One way to do this is to directly simulate Newtonian dynamics. By this I mean, we can write Newtons equations of motion for each particle :

is the mass of , is the position of , and is the time. where is the potential energy, One can rewrite this differential equation in many ways in order to numerically solve it. I wont go into too much more detail on how this is done, but I will mention that thermodynamics results do come naturally from these calculations as well, since Newtons equations are certainly consistent with the laws of thermodynamics. Constant temperature MD simulations However, since we are not explicitly modeling a heat bath, MD simulations as formulated above are constant energy simulations. How can one simulate the interaction with a heat bath? A typical means to do this is by periodically picking velocities from a Maxwell-Boltzmann distribution, i.e. (10.43)

This method, called the Andersen thermostat after its inventor, Prof Andersen in our department, ensures that the kinetic energy is equilibrated at temperature . One can imagine this random reassignment of velocities as the result of the atoms making contact with atoms in the heat bath, and the resulting collision changing the velocities. Langevin dynamics Langevin dynamics is much like molecular dynamics, except that we make an approximation about the nature of the solvent. We do not model the solvent explcitly, i.e. in terms of particles to represent the solvent, but rather by a mathematical simplication. This is useful since explicitly modeling the solvent can be a great computational burden, since typically at least 90% of the atoms in a system are in the solvent.

P `t

 % ) 2 97 H

2 2

"

  dvs

where

is the normalization (partition function).

s   q)|vH q 6

|vs  

(10.42)

"

B EYOND

MEAN FIELD THEORY

144

We say that the solvent acts to lead to thermal uctuations and a viscous drag force. Thus, Newtons equations become (10.44) where is the external forces, i.e. the gradient of the potential energy, is a damping term to model the drag effects, is a measure of the viscosity, and is a random force. The random force is typicall taken from a Gaussian distribution whose width is proportional to temperature. Thus, it is from the random force that the thermal uctuations (and thus temperature) enters into our model. If we assume that the system is fairly viscous, then the acceleration is negligable compared to the other terms. Thus, we can solve to nd

Computational difculties Of course, we must ask what values of are reasonable for this calculation? Using too small a value of will be computationally inefcient. What about too large a value? If the value is too large, then position increments will be too coarse, and there is the strong possibility that particles will get closer to each other than one would physically expect. This is particularly disasterous, since when particles get close, they are repelled extremely strongly by hard core to be too large leads to numerical repulsion and then get shot out even faster. Thus, choosing instabilities and the whole system crashes. What values of are reasonable? One can make a good guess by looking at the the period of atomic vibrations. For any molecular system consisting of atoms which are bonded, one would expect that the bond periods would be of the femtosecond ( seconds) timescale. This means that must be around 1-2 femtoseconds. Thus, in order to simulate a microsecond ( seconds), it would take a billion iterations. Is that a lot? Considering that in each iteration, one must calculate the force on each particle, its actually quite a feat that modern computers can calculate a million iterations per day for small systems. Thus, while one can simulate nanseconds per day on typically fast processors today, microseconds would take 1000 days or 3 years! Hybrid techniques One can combine MC and MD in interesting ways. One of the simplest is using MD to generate MC moves. Constructively, this is done by doing some MD, calculating the change in energy from the original conformation, and then accepting it with the Metropolis criterion. This obeys detailed balance since (MD will always generate conguration from ) for this system and we accept moves with the Metropolis criteria. Summary In many ways, the real challenge of these simulations is to start with particles and Netwons equations and to try to extract out some deeper understanding, much like when we found that it was entropy that pushed an ideal gas to expand to ll the entire volume possible. Thus, reproducing nature is only (at best) half of the battle: understanding it is the real challenge!

s

# w D y$9EB

Thus, given a time increment the change in position :

that we wish to numerically integrate our system, we can calculate (10.46)

P `"4o 79zB # w H D # # w D $iEB

0t

D iEB

(10.45)

B EYOND

MEAN FIELD THEORY

145

10.4.2 Two central issues in calculating thermodynamic properties There are two central issues in calculating thermodynamic quantities from simulation. One is how accurate is ones model. Typically, this means how good is ones approximation for the interaction potential between atoms. For example, to model the van der Waals interaction between particles, one typically uses a Lenard-Jones potential

where is the distance between the particles, is the well depth (attraction energy), and is the dtsiance at which the two particles interact. Designing good potentials is a difcult job, especially since ideally one would have a multitude of experiments to help constrain the possible values for the potential set. The other issue is whether one has sampled the system well enough, i.e. has ones simulation truly reached thermal equilibrium. Thermodynamics assumes that we are talking about the nature of the system in the long time limit, i.e. the system is at equilibrium. Some properties reach equilibrium quickly and some do not. In exact enumeration studies, this is not an issue since we have sampled every state, but in Monte Carlo of Molecular dynamics, one does not come anywhere close to sampling every state and thus addressing the quality of sampling is an issue. One solution to this problem relies on the idea that many systems are ergotic, i.e. that time averages will be the same as ensemble averages. If we can sample enough that this is true, instead of running one very long computer simulation in time, one can run (eg using a parallel supercomputer or cluster) hundreds or thousands of runs in parallel from different initial conditions and calculate statistical mechanical averages over the ensemble, rather than a average over a very, very long run on one computer. 10.4.3 Calculating thermodynamic properties from simulations that have reached equilibrium First and foremost, in both monte carlo and molecular dynamics, the probabilty to reach a given microstate is given by its Boltzmann weight. Thus, if a system has microstates , then the probability of nding a given microstate is

Thus, the number of times the simulation reaches a given state is proportional to its Boltzmann weight . Thus, if we can run a long simulation (such that it has reached equilibrium), then we can simply calculate the number of times the system reached microstate to calculate the free energy (10.50)

 q

 n9H7

P1Fvw1 p'HeFws|   H

 1

i v  

 n9H7

This also means that macrostates

are found with analogous weights

  q z

 nH76

j 1

n j  0 7  ys

(10.47)

)  v 

nH q 

d


) E 

 s

ts

 



 vs

H q 6

(10.48)

(10.49)

 s

B EYOND

MEAN FIELD THEORY

146

(where is the number of times the system reached the macrostate ) i.e. the free energy is directly related to the log of the probability. Also, since these probabilities are Boltzmann probabilities, we can use them to calculate thermal averages (10.51) Thus, one can easily calculate many thermodynamic quantities simply by calculating the approximate density of states and then nding

where is the total number of states counted (but typically far from the total number of states in the whole system). 10.4.4 Calculating thermodynamic properties from simulations that dont easily reach equilibrium Unfortunately, many of the systems of interest equilibrate far too slowly. For example, the fastest proteins fold in milliseconds, but we can only simulate nanoseconds in detailed models. Therefore, simply letting a simulation run will never reach thermal equilibrium. To avoid this unfortunate possibility, there are many tricks which one can employ. Here, I describe a few examples. Umbrella sampling If our system of interest is not easily reaching equilibrium, one trick we can employ is to change the system in a simple way in order to get it to reach equilibrium, and then correct for the change that we made. One such means is umbrella sampling. Consider a system operating under the Hamiltonian . This system does not reach equilibrium easily. To make this concrete, consider that we have some energy and that we can start the system around , but in a normal simulation, it will never sample large values of . In umbrella sampling, we dene a new Hamiltonian . For example, one could say , where is a positive constant; if is large enough, dynamics (either MC or MD) with this new Hamiltonian will be able to reach large values of , allowing us to sample to system. With the new Hamiltonian, we can calculate thermodynamic averages. We now need to correct for the fact that were not using the original Hamiltonian. To correct for this, we need to change the way we calculate averages. If we calculate averages for the new Hamiltonian, we get (10.53)

where

. However, were really interested in

(10.54)

where

. However, since

, we can write

(10.55)

i P  7 i P 7 ys 

) i X ts|v nH q 6 i l s q 6|vr      i i P 7 i H  I|z H q l |vr   y|znH q l |zr s    p i Hi ` ts|vnH q 7e   )s    Et|vnH q |zr p  ys|z i nH q i  q   i )ys|z i nH |zr i p

  z

) ivv

  zs

w E  

  i z

G p

 ~

 vv

 1 

(10.52)

B EYOND

MEAN FIELD THEORY

147

where we have used

Thus, in general to calculate the true average (i.e. in terms of the original Hamiltonian), we can calculate averages with the new Hamiltonian, weighted by the differences in the Hamiltonian. This allows one to help push the system along its way and to improve sampling. However, this method can also make the result worse: if one pushes the system somewhere unphysical, such that it doesnt sample where it should, then this undersampling will lead to incorrect results.

   q q   q i 76 ts|v i nH As i l uts|vnH i l i

(10.56)

B EYOND

MEAN FIELD THEORY

148

10.5 Problems
1. Transfer matrix models for the helix-coil transition The helix-coil transition is a simple model of how helices coil up (eg DNA, alpha helixs in proteins, or some non-biological helices). In this model, we dene two states: helical and coiled. Imagine a long polymer of units, with parts which are helical and parts which are coiled. In the transition between coil to helix, we say that the energy to go from a coil to a helix is (the initiation energy) and to continue a helix has an energy . The coil to coil or helix to coil energy is zero. For simplicity, assume that in the units of .

2. Thermodynamic alchemy

In computer simulations, its easy to calculate energies and thermal averages of energies (i.e. averages weighted by Boltzmann weights), but its hard to calculate free energies and entropies. Thermodynamic alchemy is a technique used to calculate free energies using thermodynamic averages of energies. This technique is now used in drug design calculations. In general, imagine a Hamiltonian , where is a parameter used to switch the system between two states (dened as and ). In this problem, you will derive the fact that the free energy difference between these two states can be expressed as

is the Boltzmann probability.

Since we are in a sense transforming system A into B, this method is sometimes called thermodynamic alchemy. While this could never happen in real life of course, it is perfectly well behaved under the laws of thermodynamics. Thus, one should keep in mind that in simulations, we are not restrained by the many of the laws of chemistry and physics and can use methods impossible experimentally to yield experimentally relevant results. (a) Assume that we have some Hamiltonian for our system , which describes the transformation from system 0 to system 1. For now, do not worry about the specic form for . Calculate .

@@ BB@

where

is the denition of the thermal average of

and

4    4 4154

ysiqH 9 q E 

 

54

 )  H 3 q|z 7

H I

  o

|vh  

@@@  @@ BB&|v7 BB@

4  ) 

(a) (b) (c) (d) (e)

From these energies, write the Hamiltonian for this system in terms of a transfer matrix . What are the elements of in terms of and ? Calculate the partition function in the thermodynamic limit. Calculate the average number of helical units per monomer . Sketch vs for a few different values of . How does the transition depend vary with ?

  o

B EYOND

MEAN FIELD THEORY

149

(b) Using your result to part (a), derive the result for Hint:

While you should not worry about the specic form for while you calculate the results for this problem, for your information, in a thermodynamic alchemy problem, one typically makes the specic choice that the energy is given by

where is the Hamiltonian for system 0 and is the Hamiltonian for system 1. We choose to vary from 0 to 1 to do alchemy and thus turn system 0 into system 1 in our simulation. For example, lets say we wanted to calculate the binding free energy difference between a known ligand and some new molecule. In particular, in this method, we want to calculate the free energy difference between two different systems (called 0 and 1). For example, these two systems could be a transformation from a methyl to an ethyl group. Or similarly, system 0 could be some known drug bound to HIV protease and system 1 could be a new drug to test computationally.

 

54

4 4 cw yoH

written above.

 )  r l

 3

 o  

 H  o  z$ I z$ 0

3 

3 3

D ISORDERED

SYSTEMS

150

11 Disordered systems
11.1 Spin glasses: the Ising model of disordered systems
Spin glasses were originally formulated to study the behavior of spins in particular types of disordered materials. However, much like the Ising model, spin glasses have been useful paradigms for other systems (in particular nerual networks) due to the simplicity of the formulation of the model and the rich behavior which results. I will concentrate less on real spin glasses and only talk about a model system, much as one would talk about the Ising model rather than real ferromagnets. Indeed, one could say that spin glasses are a natural generalization of an Ising model. We can start from the Ising Hamiltonian:

where and is the strength of interaction. In the typical (nearest neighbor) Ising model, we must perform this sum only over nearest neighbors. We can rewrite this as

In this case, takes on a regular (albeit potentially complicated) form, depending on dimensionality, etc. Now, lets generalize this system and say that there is the possibility of spins not just from nearest neighbors. Also, well say that the spins are interacting in a disordered medium, which affects the strength of interaction between spins. This leaves us with a spin glass Hamiltonian, which is exactly as above except there is a new interpretation for the nature of . Finally, I cant resist making a brief aside into early models of protein folding. One can interpret the Hamlitonian above in yet another manner: we can view the as a polymer sequence and the as the contact map of a particular conformation (i.e. which monomers and are interacting). This natural interpretation lead many to believe that spin glasses would be relevant for protein folding. Time permitting, well see how this turns out not to be the case. In order to really go forward, there are a couple of important ideas from the physics of disordered systems which we must rst talk about. 1. disorder: These systems are disordered since they have some non-regular aspect to their nature. Just what is disordered will be claried later as we talk about the specic models. In typical spin glass models, plays the role of disorder. Since we cannot solve the system for a particular

l l

where

PF Fw v qwv 9 ucXPnc w a b a u

l l

l l

l 7

D ISORDERED

SYSTEMS

151

, we must calculate averages over the typical realizations of disorder. This requires some probability distribution for (which is typically taken to be Gaussian):

2. annealed quantities: Annealed quantities are those which are free to change in order to minimize the free energy of the system. In general, anything which is considered to be a microstate of the system is an annealed quantitity. In spin glasses typically the spins are annealed. In this case of protein folding, for example, the conformation (and therefore the contact map is annealed). 3. quenched quantities: Qunched quantities are those which are not free to change in order to minimize the free energy of the system. In general, anything which is considered to be a xed is a quenched quantitity. Moreover, most interesting aspects of the physics of disordered systems rests in the study of the properties of quenched disorder. For example, consider a spin glass in which is disordered from sample to sample, but in a given sample the are xed. Analogously, in the protein folding case, one typically considers the protein sequence to be quenched (since it does not change during folding). We can imagine a beaker full of different protein sequences this is quenched disorder. We can calculate averages over these different sequences.

11.2 Which thermodynamic properties should we calculate?


How should we go about calculating thermodynamic properties of a system with quenched disorder. It is natural to calculate some average over the disorder (denoted as ). But what to calculate? Average free energies or partition functions ? Lets rst think about what the average partition function means physically:

Thus, in this case, the disorder acts just like another microstate (and its probability like another Boltzmann weight). What does this mean? Since microstates are annealed quantities, it means that if one studies , one is calculating properties of a system with annealed disorder, i.e. the disorder can change. This might be equivalent to some polymer case where the polymer sequence can change in while folding, for example. Also, since the disorder appears like a microstate, (as we will see later) it is also much easier to calculate annealed averages.

@@@ w BB&qFwcPbv 9b   IBB@ @@

where denotes the microstates. We can write the average as thus we get

B 8 p s C FAC

 )  H fq|v 97

CB FAC8 fq|v976CcPF ) H  w b w v b 6 qFwcPbvp9bs CB F AC8 w 

@@ BB@

 )  H %q|z 7

where

and

are the mean and standard deviation of the distribution, respectively.

1 92
and

lv H 6  s 
2

H 

D ISORDERED

SYSTEMS

152

But what if we want to study a system with quenched disorder? In this case we dont want to study . Instead, we want to calculate disorder-averages of extensive quantities, such as the free energy, entropy, or energy. Why? There is an explicit (although often hidden) assumption that disordered systems are self averaging. Imagine that we could calculate the free energy for each different realization of disorder (probably impossible analytically, but possible in computer simulation). Self-averaging says that if we looked at the probability distribution of free energies over disorder we would see a sharply peaked distribution while there would be some cases with unusually low or high free energy, the vast majority of realizations of disorder would yield a free energy close to the mean . Thus, if we look at the mean values of free energy , we would learn about the vast majority of realizations of disorder and get a useful description of systems with quenched disorder. Analytically, self-averaging is typically assumed. However, this makes computer simulation of disordered systems very useful, as they can demonstrate the success or failure of the self-averaging principle. Why should systems be self-averaging? Well, recall that we said that extensive quantities are self averaging. These quantities are the sum of individual contributions, typically somewhat randomly distributed. The central limit theorem says that the mean of the distribution of this sum should converge rapidly as gets large. Thus, it is not surprising that self-averaging should hold in large systems. But how large is large? Are proteins large enough? This is an interesting question and has been addressed in simulation.

11.3 Annealed-average partition function


Lets start with the annealed-average partition funciton1 , which is easier to calculate. Lets take the Ising Hamiltonian and the Gaussian distributed . We get

1 I want to stress the difference between the annealed-average partition function

and the partition function for a system with annealed disorder

where . The key difference is that in the annealed-partition function, the different disorder states contribute to the entropy (as they themselves are physically different microstates) whereas these states do not contribute to the annealed-average systems entropy (as we average over these states).

s s ) l l H q 6 2 lv H 6 5 w 2 s 1 29 H  0 W) l l H 2 lv H 5 w 2 1 2 H  0 w  wcbvF9bs qB F Aq8

U w aw s

Ya w uc s U a w E e Y es t s w uc4 s u W e t s w s4 s T u E e Y W

 )  H 3 q|z h

 C

FwcPbvp9b w

 qq

FwcPbvp9b w
W w

uc s

 s

D ISORDERED

SYSTEMS

153

where is the normalization of the Gaussian distribution. This is now just a product of Gaussian integrals, thus yielding

Since in general, the second term above simplies. Also, we will make a mean eld approximation and say that . This leads to

We can calculate the the entropy per particle from this free energy

Lets consider some limits. We actually have two energy scales in this problem: and . If we take , this is the physical case where there is no disorder and all spins interact with the strength . Not surprisingly, this recovers the innite dimensional (mean eld) solution to the Ising model. Were not interested in this limit in this lecture. If we consider the case with a zero mean ( ), we get a simple result:

From this, we see that the normal entropy per particle of is decreased by the nature of disorder. Note that since , if we take , we get no interactions and we would expect only this entropy. Its interesting that in the absence of any mean attraction or repulsion, we still get a reduction of entropy. Moreover, as we lower temperature, the entropy goes to zero at the freezing temperature . This is our rst encounter with some interesting behavior. While the Ising models entropy goes to zero only at zero temperature (not including the up-down Ising symmetry, say were in a small eld), the spin glasses freeze at some much higher temperature. A lot of the later discussion will concentrate on this nature of freezing.

11.4 Quenched partition function


That was pretty easy. The quenched system will not be so simple. Why? What helped us calculate the annealed system was the fact that disorder could be treated like a microstate. We could do sums over disorder and microstates in parallel. The quenched free energy is not so simple:

We see that we have a sum, then a log, then another sum. This is difcult. Below, Ill present different ways to handle this.

 )  H % q|z 7

v v

H p C$ ) @@

 Fq

FwcPbvp9bs CB F AC8 |ep | w  H H

v F

E d b 2 H v H ) ca v p'fC@@

H F'3 %C @@

) @@ C

vqi @B

v ) pv h 3

H q$ ) @@

`"

where

is the total magnetization. This leads to an annealed free energy

2 %5v   %)  %k tI%  1rFbpw a )  H  ) v w v 2 v H ) 2 H FuQI%k  1Fvrp'YQH 3 v t 76 U %k  1Fv U v v   )

4s 2 %) 2 l 2 l 2 g) l l EH6 5 w 2   v w
 2 3 ) 2  

e l

l 2

D ISORDERED

SYSTEMS

154

11.4.1 High temperature (cumulant) expansion

This means that -order terms will involve -microstate correlators. For example, we can write

Thus, we see in this simple demonstration that the correlation between the energies of microstates, averaged over disorder. Typically, one does not have an a priori understanding of the nature of these correlations. Moreover, it is often this very nature which is the interesting aspect of these systems, as we will see below. 11.4.2 Random Energy Model This immediately brings up an interesting question: how are the energies of these states correlated? Derrida suggested a simpler model of disordered systems in which one makes explicit assumptions about the nature of energy correlations. In particular, REM assumes that the energies of states are uncorrelated: for states to , we have the joint probability

Since they are not correlated, we can write the joint probability as the product of individual probabilities. In terms of writing an disorder-averged energy correlator, this means that for states and

and so on for high order correlators as well. Lets explore what this means. We wish to calculate . If the energies are uncorrelated over the disorder distribution (and thus correlators vanish), then averages over disorder become much simpler, i.e. and in general and thus in the REM, we have the convenient result

i.e. in the REM-approximation, quenched-averaged free energies are equal to annealed-averaged free energies. This is why it was useful (and potentially meaningful) for us to calculate the annealedaveraged partition function. Therefore, we have already solved the REM-model version of the freezing transition!

C C

H 2 2 2 2 H @BBw 2 % lv H 3 %9H7 H 3 @@  )  2 6 F l


  1 v

C @@ t BB@ t

"

p' eF H H

t H  

 2 v9is"     

  4z 

A simple solution is to Taylor expand

C @@@ r BBA 2

for small

. This yields a cumulant expansion

 s

D ISORDERED

SYSTEMS

155

11.4.3 Problems

where if and are neighbors and 0 otherwise. This system is called the random eld Ising model since we will assume that the external eld is random at each site, with a Gaussian distribution

For the rest of this problem, assume that the spins are non-interacting, i.e.

), but do interact (a) First, consider the case where spins do not interact with each other ( with the eld, i.e. . Calculate the annealed partition function, i.e. . and the annealed entropy. (b) Next, calculate the annealed free energy (c) Find the freezing temperature. (d) Now, calculate the quenched free energy, i.e. , for the noninteracting case ( ) using a high temperature expansion. Keep only the rst relevant term. (e) Using your result for quenched free energy above, calculate the quenched entropy and the freezing temperature. 2. Interacting Random eld Ising model (a) Starting from the interacting Hamiltonian make a mean eld approximation to simplify the spin-spin interaction (but do not alter the spin-eld interaction). Write the new Hamiltonian in the mean eld approximation. Make sure not to confuse thermal averages with averages over the disorder (different values of ). To keep the notation straight, Ill write thermal averages with an overbar, i.e. the average spin is written as , and averages over disorder (values of ) as . Write your answer in terms of the magnetization . (b) Calculate the annealed partition function for this system. (c) Calculate the magnetization which arises from the annealed partition function. You may need to Taylor expand your result for small . (d) Using the above, if , what happens as one lowers from high to low ? (e) Using the above, if , what happens as one lowers from high to low ? (f) Now consider the case where both and are non-zero. Roughly sketch the phase diagram for this system, i.e. the relevant phases as a function of and . Hint: Youll likely nd paramagnetic (random spin arrangement), ferromagnetic (spins aligned), and glassy phases.

`w l l

}l

Now, we will deal with the case where

e 5   

We will denote averages over the eld with angle brackets, i.e.

Consider a system of Ising spins in an external eld which varies with position tonian is

F |H CBCGz@CB9 | p'C@@ H

1. Non-interacting Random eld Ising model

. The Hamil-

2
.

v T) 2

H 97

l l

  s

"

e 5u

"

D ISORDERED

SYSTEMS

156

3. Superuid with and without impurities The order parameter for a superuid He is the magnitude (in more sophisticated analysis, one can use the complex eld as the order parameter, but for now, lets just assume that our order paramter is ). Since is the magnitude of , is a positive, real number. Also, thoughout this problem, we will assume that the eld is homogenous, i.e. . In this problem, you will show that impurities can alter the nature of the phase transition. In the absence of impurities, we can write the Landau free energy as

vs temperature using the saddle point approxima(a) Calculate the equilibrium value of tion. Do we see a phase transition? If so, what type? (b) Now, lets include some He impurities into our model. Let denote the impurity concentration (and we set in the homogenous system approximation). We will say that the new Landau free energy is the old one with the addition of a interaction term between the two order parameters:

Finally, we will say that the probability of nding an impurity is Gaussian distributed with a zero mean and variance

Note the in this expression. We are interested in the properties of the average partition function

where averages over impurities (disorder) is done just as we did the average over disorder in problem 1. To get at this quantity, integrate out the degrees of freedom to put the average partition function in the form (11.1)

 h 

(d) Sketch the phase boundary in the .

plane and indicate how its two segments join at

What is ? and the saddle point approximation, calculate how the equilibrium value of (c) Using in this case varies with temperature. i. show that the nature of the transition changes for ii. identify the nature of the transition for iii. identify the value of

 z

 EB h 4e ir H  k   i 9Hh6  k s k ""

 g

k H o Fw 2 yP` i c w

2

v  T) 2 k 9H7

 8

i k

  s

z 

) H 7iq7yz

r zB

r zB i i

where

and and are positive constants and

is some temperature.

  2   zr CE0

2 z0 zr    

c w o FVw 2 yP`y

  EI   z0

  zr

  zI

You might also like