Quantum Field Theory
Mark Srednicki University of California, Santa Barbara
mark@physics.ucsb.edu
© 2006 by M. Srednicki. All rights reserved.
Please DO NOT DISTRIBUTE this document. Instead, link to
http://www.physics.ucsb.edu/~mark/qft.html

To my parents Casimir and Helen Srednicki
with gratitude

Contents

Preface for Students
Preface for Instructors
Acknowledgments

I Spin Zero
1 Attempts at relativistic quantum mechanics
2 Lorentz Invariance (prerequisite: 1)
3 Canonical Quantization of Scalar Fields (2)
4 The Spin-Statistics Theorem (3)
5 The LSZ Reduction Formula (3)
6 Path Integrals in Quantum Mechanics
7 The Path Integral for the Harmonic Oscillator (6)
8 The Path Integral for Free Field Theory (3, 7)
9 The Path Integral for Interacting Field Theory (8)
10 Scattering Amplitudes and the Feynman Rules (5, 9)
11 Cross Sections and Decay Rates (10)
12 Dimensional Analysis with ħ = c = 1 (3)
13 The Lehmann-Källén Form of the Exact Propagator (9)
14 Loop Corrections to the Propagator (10, 12, 13)
15 The One-Loop Correction in Lehmann-Källén Form (14)
16 Loop Corrections to the Vertex (14)
17 Other 1PI Vertices (16)
18 Higher-Order Corrections and Renormalizability (17)
19 Perturbation Theory to All Orders (18)
20 Two-Particle Elastic Scattering at One Loop (19)
21 The Quantum Action (19)
22 Continuous Symmetries and Conserved Currents (8)
23 Discrete Symmetries: P, T, C, and Z (22)
24 Nonabelian Symmetries (22)
25 Unstable Particles and Resonances (14)
26 Infrared Divergences (20)
27 Other Renormalization Schemes (26)
28 The Renormalization Group (27)
29 Effective Field Theory (28)
30 Spontaneous Symmetry Breaking (21)
31 Broken Symmetry and Loop Corrections (30)
32 Spontaneous Breaking of Continuous Symmetries (22, 30)

II Spin One Half
33 Representations of the Lorentz Group (2)
34 Left- and Right-Handed Spinor Fields (3, 33)
35 Manipulating Spinor Indices (34)
36 Lagrangians for Spinor Fields (22, 35)
37 Canonical Quantization of Spinor Fields I (36)
38 Spinor Technology (37)
39 Canonical Quantization of Spinor Fields II (38)
40 Parity, Time Reversal, and Charge Conjugation (23, 39)
41 LSZ Reduction for Spin-One-Half Particles (5, 39)
42 The Free Fermion Propagator (39)
43 The Path Integral for Fermion Fields (9, 42)
44 Formal Development of Fermionic Path Integrals (43)
45 The Feynman Rules for Dirac Fields (10, 12, 41, 43)
46 Spin Sums (45)
47 Gamma Matrix Technology (36)
48 Spin-Averaged Cross Sections (46, 47)
49 The Feynman Rules for Majorana Fields (45)
50 Massless Particles and Spinor Helicity (48)
51 Loop Corrections in Yukawa Theory (19, 40, 48)
52 Beta Functions in Yukawa Theory (28, 51)
53 Functional Determinants (44, 45)

III Spin One
54 Maxwell’s Equations (3)
55 Electrodynamics in Coulomb Gauge (54)
56 LSZ Reduction for Photons (5, 55)
57 The Path Integral for Photons (8, 56)
58 Spinor Electrodynamics (45, 57)
59 Scattering in Spinor Electrodynamics (48, 58)
60 Spinor Helicity for Spinor Electrodynamics (50, 59)
61 Scalar Electrodynamics (58)
62 Loop Corrections in Spinor Electrodynamics (51, 59)
63 The Vertex Function in Spinor Electrodynamics (62)
64 The Magnetic Moment of the Electron (63)
65 Loop Corrections in Scalar Electrodynamics (61, 62)
66 Beta Functions in Quantum Electrodynamics (52, 62)
67 Ward Identities in Quantum Electrodynamics I (22, 59)
68 Ward Identities in Quantum Electrodynamics II (63, 67)
69 Nonabelian Gauge Theory (24, 58)
70 Group Representations (69)
71 The Path Integral for Nonabelian Gauge Theory (53, 69)
72 The Feynman Rules for Nonabelian Gauge Theory (71)
73 The Beta Function in Nonabelian Gauge Theory (70, 72)
74 BRST Symmetry (70, 71)
75 Chiral Gauge Theories and Anomalies (70, 72)
76 Anomalies in Global Symmetries (75)
77 Anomalies and the Path Integral for Fermions (76)
78 Background Field Gauge (73)
79 Gervais–Neveu Gauge (78)
80 The Feynman Rules for N × N Matrix Fields (10)
81 Scattering in Quantum Chromodynamics (60, 79, 80)
82 Wilson Loops, Lattice Theory, and Confinement (29, 73)
83 Chiral Symmetry Breaking (76, 82)
84 Spontaneous Breaking of Gauge Symmetries (32, 70)
85 Spontaneously Broken Abelian Gauge Theory (61, 84)
86 Spontaneously Broken Nonabelian Gauge Theory (85)
87 The Standard Model: Gauge and Higgs Sector (84)
88 The Standard Model: Lepton Sector (75, 87)
89 The Standard Model: Quark Sector (88)
90 Electroweak Interactions of Hadrons (83, 89)
91 Neutrino Masses (89)
92 Solitons and Monopoles (84)
93 Instantons and Theta Vacua (92)
94 Quarks and Theta Vacua (77, 83, 93)
95 Supersymmetry (69)
96 The Minimal Supersymmetric Standard Model (89, 95)
97 Grand Unification (89)

Bibliography

Preface for Students

Quantum ﬁeld theory is the basic mathematical language that is used to describe and analyze the physics of elementary particles. The goal of this book is to provide a concise, step-by-step introduction to this subject, one that covers all the key concepts that are needed to understand the Standard Model of elementary particles, and some of its proposed extensions.
In order to be prepared to undertake the study of quantum ﬁeld theory, you should recognize and understand the following equations:

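$$ \frac{d\sigma}{d\Omega} = |f(\theta,\phi)|^2 $$

$$ a^\dagger |n\rangle = \sqrt{n+1}\,|n{+}1\rangle $$

$$ J_\pm |j,m\rangle = \sqrt{j(j+1) - m(m\pm1)}\,|j,m{\pm}1\rangle $$

$$ A(t) = e^{+iHt/\hbar} A e^{-iHt/\hbar} $$

$$ H = p\dot q - L $$

$$ ct' = \gamma(ct - \beta x) $$

$$ E = (\mathbf{p}^2c^2 + m^2c^4)^{1/2} $$

$$ \mathbf{E} = -\dot{\mathbf{A}}/c - \nabla\varphi $$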

This list is not, of course, complete; but if you are familiar with these equations, you probably know enough about quantum mechanics, classical mechanics, special relativity, and electromagnetism to tackle the material in this book.
Quantum ﬁeld theory has a reputation as a subject that is hard to learn. The problem, I think, is not so much that its basic ingredients are unusually diﬃcult to master (indeed, the conceptual shift needed to go from quantum mechanics to quantum ﬁeld theory is not nearly as severe as the one needed to go from classical mechanics to quantum mechanics), but rather that there are a lot of these ingredients. Some are fundamental, but many are just technical aspects of an unfamiliar form of perturbation theory.
In this book, I have tried to make the subject as accessible to beginners as possible. There are three main aspects to my approach.
Logical development of the basic concepts. This is, of course, very different from the historical development of quantum field theory, which, like the historical development of most worthwhile subjects, was filled with inspired guesses and brilliant extrapolations of sometimes fuzzy ideas, as well as its fair share of mistakes, misconceptions, and dead ends. None of that is in this book. From this book, you will (I hope) get the impression that the whole subject is effortlessly clear and obvious, with one step following the next like sunshine after a refreshing rain.
Illustration of the basic concepts with the simplest examples. In most ﬁelds of human endeavor, newcomers are not expected to do the most demanding tasks right away. It takes time, dedication, and lots of practice to work up to what the accomplished masters are doing. There is no reason to expect quantum ﬁeld theory to be any diﬀerent in this regard. Therefore, we will start oﬀ analyzing quantum ﬁeld theories that are not immediately applicable to the real world of electrons, photons, protons, etc., but that will allow us to gain familiarity with the tools we will need, and to practice using them. Then, when we do work up to “real physics”, we will be fully ready for the task. To this end, the book is divided into three parts: Spin Zero, Spin One Half, and Spin One. The technical complexities associated with a particular type of particle increase with its spin. We will therefore ﬁrst learn all we can about spinless particles before moving on to the more diﬃcult (and more interesting) nonzero spins. Once we get to them, we will do a good variety of calculations in (and beyond) the Standard Model of elementary particles.
User friendliness. Each of the three parts is divided into numerous sections. Each section is intended to treat one idea or concept or calculation, and each is written to be as self-contained as possible. For example, when an equation from an earlier section is needed, I usually just repeat it, rather than ask you to leaf back and ﬁnd it (a reader’s task that I’ve always found annoying). Furthermore, each section is labeled with its immediate prerequisites, so you can tell exactly what you need to have learned in order to proceed. This allows you to construct chains to whatever material may interest you, and to get there as quickly as possible.
That said, I expect that most readers of this book will encounter it as the textbook in a course on quantum ﬁeld theory. In that case, of course, your reading will be guided by your professor, who I hope will ﬁnd the above features useful. If, however, you are reading this book on your own, I have two pieces of advice.
The ﬁrst (and most important) is this: ﬁnd someone else to read it with you. I promise that it will be far more fun and rewarding that way; talking about a subject to another human being will inevitably improve the depth of your understanding. And you will have someone to work with you on the problems. (As with all physics texts, the problems are a key ingredient. I will not belabor this point, because if you have gotten this far in physics, you already know it well.)
The second piece of advice echoes the novelist and Nobel laureate William Faulkner. An interviewer asked, “Mr. Faulkner, some of your readers claim they still cannot understand your work after reading it two or three times. What approach would you advise them to adopt?” Faulkner replied, “Read it a fourth time.”
That’s my advice here as well. After the fourth attempt, though, you should consider trying something else. This is, after all, not the only book that has ever been written on the subject. You may ﬁnd that a diﬀerent approach (or even the same approach explained in diﬀerent words) breaks the logjam in your thinking. There are a number of excellent books that you could consult, some of which are listed in the Bibliography. I have also listed particular books that I think could be helpful on speciﬁc topics in Reference Notes at the end of some of the sections.
This textbook (like all finite textbooks) has a number of deficiencies. One of these is a rather low level of mathematical rigor. This is partly endemic to the subject; rigorous proofs in quantum field theory are relatively rare, and do not appear in the overwhelming majority of research papers. Even some of the most basic notions lack proof; for example, currently you can get a million dollars from the Clay Mathematics Institute simply for proving that nonabelian gauge theory actually exists and has a unique ground state. Given this general situation, and since this is an introductory book, the proofs that we do have are only outlined.
Another deﬁciency of this book is that there is no discussion of the application of quantum ﬁeld theory to condensed matter physics, where it also plays an important role. This connection has been important in the historical development of the subject, and is especially useful if you already know a lot of advanced statistical mechanics. I do not want this to be a prerequisite, however, and so I have chosen to keep the focus on applications within elementary particle physics.
Yet another deficiency is that there are no references to the original literature. In this regard, I am following a standard trend: as the foundations of a branch of science retreat into history, textbooks become more and more synthetic and reductionist. For example, it is now rare to see a new textbook on quantum mechanics that refers to the original papers by the famous founders of the subject. For guides to the original literature on quantum field theory, there are a number of other books with extensive references that you can consult; these include Peskin & Schroeder, Weinberg, and Siegel. (Italicized names refer to works listed in the Bibliography.) Unless otherwise noted, experimental numbers are taken from the Review of Particle Properties, available online at http://pdg.lbl.gov. Experimental numbers quoted in this book have an uncertainty of roughly ±1 in the last significant digit. The Review should be consulted for the most recent experimental results, and for more precise statements of their uncertainty.
To conclude, let me say that you are about to embark on a tour of one of humanity’s greatest intellectual endeavors, and certainly the one that has produced the most precise and accurate description of the natural world as we find it. I hope you enjoy the ride.

Preface for Instructors
On learning that a new text on quantum field theory has appeared, one is surely tempted to respond with Isidor Rabi’s famous comment about the muon: “Who ordered that?” After all, many excellent textbooks on quantum field theory are already available. I, for example, would not want to be without my well-worn copies of Quantum Field Theory by Lowell S. Brown (Cambridge 1994), Aspects of Symmetry by Sidney Coleman (Cambridge 1985), Introduction to Quantum Field Theory by Michael E. Peskin and Daniel V. Schroeder (Westview 1995), Field Theory: A Modern Primer by Pierre Ramond (Addison-Wesley 1990), Fields by Warren Siegel (arXiv.org 2005), The Quantum Theory of Fields, Volumes I, II, and III, by Steven Weinberg (Cambridge 1995), and Quantum Field Theory in a Nutshell by my colleague Tony Zee (Princeton 2003), to name just a few of the more recent texts. Nevertheless, despite the excellence of these and other books, I have never followed any of them very closely in my twenty years of on-and-off teaching of a year-long course in relativistic quantum field theory.
As discussed in the Preface for Students, this book is based on the notion that quantum ﬁeld theory is most readily learned by starting with the simplest examples and working through their details in a logical fashion. To this end, I have tried to set things up at the very beginning to anticipate the eventual need for renormalization, and not be cavalier about how the ﬁelds are normalized and the parameters deﬁned. I believe that these precautions take a lot of the “hocus pocus” (to quote Feynman) out of the “dippy process” of renormalization. Indeed, with this approach, even the anharmonic oscillator is in need of renormalization; see problem 14.7.
A field theory with many pedagogical virtues is $\varphi^3$ theory in six dimensions, where its coupling constant is dimensionless. Perhaps because six dimensions used to seem too outré (though today’s prospective string theorists don’t even blink), the only introductory textbook I know of that treats this model is Quantum Field Theory by George Sterman (Cambridge 1993), though it is also discussed in some more advanced books, such as Renormalization by John Collins (Cambridge 1984) and Foundations of Quantum Chromodynamics by T. Muta (World Scientific 1998). (There is also a series of lectures by Ed Witten on quantum field theory for mathematicians, available online, that treat $\varphi^3$ theory.) The reason $\varphi^3$ theory in six dimensions is a nice example is that its Feynman diagrams have a simple structure, but still exhibit the generic phenomena of renormalizable quantum field theory at the one-loop level. (The same cannot be said for $\varphi^4$ theory in four dimensions, where momentum-dependent corrections to the propagator do not appear until the two-loop level.) Thus, in Part I of this text, $\varphi^3$ theory in six dimensions is the primary example. I use it to give introductory treatments of most aspects of relativistic quantum field theory for spin-zero particles, with a minimum of the technical complications that arise in more realistic theories (like QED) with higher-spin particles.
Although I eventually discuss the Wilson approach to renormalization and effective field theory (in section 29), and use effective field theory extensively for the physics of hadrons in Part III, I do not feel it is pedagogically useful to bring it in at the very beginning, as is sometimes advocated. The problem is that the key notion of the decoupling of physical processes at different length scales is an unfamiliar one for most students; there is nothing in typical courses on quantum mechanics or electromagnetism or classical mechanics to prepare students for this idea (which was deemed worthy of a Nobel Prize for Ken Wilson in 1982). It also does not provide for a simple calculational framework, since one must deal with the infinite number of terms in the effective lagrangian, and then explain why most of them don’t matter after all. It’s noteworthy that Wilson himself did not spend a lot of time computing properly normalized perturbative S-matrix elements, a skill that we certainly want our students to have; we want them to have it because a great deal of current research still depends on it. Indeed, the vaunted success of quantum field theory as a description of the real world is based almost entirely on our ability to carry out these perturbative calculations. Studying renormalization early on has other pedagogical advantages. With the Nobel Prizes to Gerard ’t Hooft and Tini Veltman in 1999 and to David Gross, David Politzer, and Frank Wilczek in 2004, today’s students are well aware of beta functions and running couplings, and would like to understand them. I find that they are generally much more excited about this (even in the context of toy models) than they are about learning to reproduce the nearly century-old tree-level calculations of QED. And $\varphi^3$ theory in six dimensions is asymptotically free, which ultimately provides for a nice segue to the “real physics” of QCD.
In general I have tried to present topics so that the more interesting aspects (from a present-day point of view) come first. An example is anomalies; the traditional approach is to start with the $\pi^0 \to \gamma\gamma$ decay rate, but such a low-energy process seems like a dusty relic to most of today’s students. I therefore begin by demonstrating that anomalies destroy the self-consistency of the great majority of chiral gauge theories, a fact that strikes me (and, in my experience, most students) as much more interesting and dramatic than an incorrect calculation of the $\pi^0$ decay rate. Then, when we do eventually get to this process (in section 90), it appears as a straightforward consequence of what we already learned about anomalies in sections 75–77.
Nevertheless, I want this book to be useful to those who disagree with my pedagogical choices, and so I have tried to structure it to allow for maximum flexibility. Each section treats a particular idea or concept or calculation, and is as self-contained as possible. Each section also lists its immediate prerequisites, so that it is easy to see how to rearrange the material to suit your personal preferences.
In some cases, alternative approaches are developed in the problems. For example, I have chosen to introduce path integrals relatively early (though not before canonical quantization and operator methods are applied to free-ﬁeld theory), and use them to derive Dyson’s expansion. For those who would prefer to delay the introduction of path integrals (but since you will have to cover them eventually, why not get it over with?), problem 9.5 outlines the operator-based derivation in the interaction picture.
Another point worth noting is that a textbook and lectures are ideally complementary. Many sections of this book contain rather tedious mathematical detail that I would not and do not write on the blackboard during a lecture. (Indeed, the earliest origins of this book are supplementary notes that I typed up and handed out.) For example, much of the development of Weyl spinors in sections 34–37 can be left to outside reading. I do encourage you not to eliminate this material entirely, however; pedagogically, the problem with skipping directly to four-component notation is explaining that (in four dimensions) the hermitian conjugate of a left-handed ﬁeld is right handed, a deeply important fact that is the key to solving problems such as 36.5 and 83.1, which are in turn vital to understanding the structure of the Standard Model and its extensions. A related topic is computing scattering amplitudes for Majorana ﬁelds; this is essential for modern research on massive neutrinos and supersymmetric particles, though it could be left out of a time-limited course.
While I have sometimes included more mathematical detail than is ideal for a lecture, I have also tended to omit explanations based on “physical intuition.” For example, in section 90, we compute the $\pi^- \to \ell^- \bar\nu_\ell$ decay amplitude (where $\ell$ is a charged lepton) and find that it is proportional to the lepton mass. There is a well-known heuristic explanation of this fact that goes something like this: “The pion has spin zero, and so the lepton and the antineutrino must emerge with opposite spin, and therefore the same helicity. An antineutrino is always right-handed, and so the lepton must be as well. But only the left-handed lepton couples to the $W^-$, so the decay amplitude vanishes if the left- and right-handed leptons are not coupled by a mass term.”
This is essentially correct, but the reasoning is a bit more subtle than it first appears. A student may ask, “Why can’t there be orbital angular momentum? Then the lepton and the antineutrino could have the same spin.” The answer is that orbital angular momentum must be perpendicular to the linear momentum, whereas helicity is (by definition) parallel to the linear momentum; so adding orbital angular momentum cannot change the helicity assignments. (This is explored in a simplified model in problem 48.4.) The larger point is that intuitive explanations can almost always be probed more deeply. This is fine in a classroom, where you are available to answer questions, but a textbook author has a hard time knowing where to stop. Too little detail renders the explanations opaque, and too much can be overwhelming; furthermore the happy medium tends to differ from student to student. The calculation, on the other hand, is definitive (at least within the framework being explored, and modulo the possibility of mathematical error). As Roger Penrose once said, “The great thing about physical intuition is that it can be adjusted to fit the facts.” So, in this book, I have tended to emphasize calculational detail at the expense of heuristic reasoning. Lectures should ideally invert this to some extent.
I should also mention that a section of the book is not intended to coincide exactly with a lecture. The material in some sections could easily be covered in less than an hour, and some would clearly take more. My approach in lecturing is to try to keep to a pace that allows the students to follow the analysis, and then try to come to a more-or-less natural stopping point when class time is up. This sometimes means ending in the middle of a long calculation, but I feel that this is better than trying to artiﬁcially speed things along to reach a predetermined destination.
It would take at least three semesters of lectures to cover this entire book, and so a year-long course must omit some. A sequence I might follow is 1–23, 26–28, 33–43, 45–48, 51, 52, 54–59, 62–64, 66–68, 24, 69, 70, 44, 53, 71–73, 75–77, 30, 32, 84, 87–89, 29, 82, 83, 90, and, if any time was left, a selection of whatever seemed of most interest to me and the students of the remaining material.
To conclude, I hope you ﬁnd this book to be a useful tool in working towards our mutual goal of bringing humanity’s understanding of the physics of elementary particles to a new audience.

Acknowledgments
Every book is a collaborative eﬀort, even if there is only one author on the title page. Any skills I may have as a teacher were ﬁrst gleaned as a student in the classes of those who taught me. My ﬁrst and most important teachers were my parents, Casimir and Helen Srednicki, to whom this book is dedicated. In our small town in Ohio, my excellent public-school teachers included Esta Kefauver, Marie Casher, Carol Baird, Jim Chase, Joe Gerin, Hugh Laughlin, and Tom Murphy. In college at Cornell, Don Hartill, Bruce Kusse, Bob Siemann, John Kogut, and Saul Teukolsky taught particularly memorable courses. In graduate school at Stanford, Roberto Peccei gave me my ﬁrst exposure to quantum ﬁeld theory, in a superb course that required bicycling in by 8:30 AM (which seemed like a major sacriﬁce at the time). Everyone in that class very much hoped that Roberto would one day turn his extensive hand-written lecture notes (which he put on reserve in the library) into a book. He never did, but I’d like to think that perhaps a bit of his consummate skill has found its way into this text. I have also used a couple of his jokes.
My thesis advisor at Stanford, Lenny Susskind, taught me how to think about physics without getting bogged down in the details. This book includes a lot of detail that Lenny would no doubt have left out, but while writing it I have tried to keep his exemplary clarity of thought in mind as something to strive for.
During my time in graduate school, and subsequently in postdoctoral positions at Princeton and CERN, and ﬁnally as a faculty member at UC Santa Barbara, I was extremely fortunate to be able to interact with many excellent physicists, from whom I learned an enormous amount. These include Stuart Freedman, Eduardo Fradkin, Steve Shenker, Sidney Coleman, Savas Dimopoulos, Stuart Raby, Michael Dine, Willy Fischler, Curt Callan, David Gross, Malcolm Perry, Sam Trieman, Arthur Wightman, Ed Witten, Hans-Peter Nilles, Daniel Wyler, Dmitri Nanopoulos, John Ellis, Keith Olive, Jose Fulco, Ray Sawyer, John Cardy, Frank Wilczek, Jim Hartle, Gary Horowitz, Andy Strominger, and Tony Zee. I am especially grateful to my Santa Barbara colleagues David Berenstein, Steve Giddings, Don Marolf, Joe Polchinski, and Bob Sugar, who used various drafts of this book while teaching quantum ﬁeld theory, and made various suggestions for improvement.
I am also grateful to physicists at other institutions who read parts of the manuscript and also made suggestions, including Oliver de Wolfe, Marcelo Gleiser, Steve Gottlieb, Arkady Tsetlyn, and Arkady Vainshtein. I must single out for special thanks Professor Heidi Fearn of Cal State Fullerton, whose careful reading of Parts I and II allowed me to correct many unclear passages and outright errors that would otherwise have slipped by.
Students over the years have suffered through my varied attempts to arrive at a pedagogically acceptable scheme for teaching quantum field theory. I thank all of them for their indulgence. I am especially grateful to Sam Pinansky, Tae Min Hong, and Sho Yaida for their diligence in finding and reporting errors, and to Brian Wignal for help with formatting the manuscript. Also, a number of students from around the world (as well as Santa Barbara) kindly reported errors in versions of this book that were posted online; these include Omri Bahat-Treidel, Hee-Joong Chung, Yevgeny Kats, Sue Ann Koay, Peter Lee, Nikhil Jayant Joshi, Kevin Weil, Dusan Simic, and Miles Stoudenmire. I thank them for their help, and apologize to anyone that I may have missed.
Throughout this project, the assistance and support of my wife Eloïse and daughter Julia were invaluable. Eloïse read through the manuscript and made suggestions that often clarified the language. Julia offered advice on the cover design (a highly stylized Feynman diagram). And they both kindly indulged the amount of time I spent working on this book that you now hold in your hands.

Part I
Spin Zero

1: Attempts at relativistic quantum mechanics

1 Attempts at relativistic quantum mechanics
Prerequisite: none

In order to combine quantum mechanics and relativity, we must ﬁrst understand what we mean by “quantum mechanics” and “relativity”. Let us begin with quantum mechanics.
Somewhere in most textbooks on the subject, one can ﬁnd a list of the “axioms of quantum mechanics”. These include statements along the lines of

The state of the system is represented by a vector in Hilbert space.
Observables are represented by hermitian operators.
The measurement of an observable yields one of its eigenvalues as the result.

And so on. We do not need to review these closely here. The axiom we need to focus on is the one that says that the time evolution of the state of the system is governed by the Schrödinger equation,

$$ i\hbar \frac{\partial}{\partial t} |\psi,t\rangle = H |\psi,t\rangle , \tag{1.1} $$

where H is the hamiltonian operator, representing the total energy.
Let us consider a very simple system: a spinless, nonrelativistic particle with no forces acting on it. In this case, the hamiltonian is

$$ H = \frac{1}{2m}\mathbf{P}^2 , \tag{1.2} $$

where m is the particle’s mass, and P is the momentum operator. In the position basis, eq. (1.1) becomes

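$$ i\hbar \frac{\partial}{\partial t}\psi(\mathbf{x},t) = -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{x},t) , \tag{1.3} $$

where $\psi(\mathbf{x},t) = \langle \mathbf{x}|\psi,t\rangle$ is the position-space wave function. We would like to generalize this to relativistic motion.
The obvious way to proceed is to take

$$ H = +\sqrt{\mathbf{P}^2c^2 + m^2c^4} , \tag{1.4} $$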

which yields the correct relativistic energy-momentum relation. If we formally expand this hamiltonian in inverse powers of the speed of light c, we get

$$ H = mc^2 + \frac{1}{2m}\mathbf{P}^2 + \ldots . \tag{1.5} $$

This is simply a constant (the rest energy), plus the usual nonrelativistic hamiltonian, eq. (1.2), plus higher-order corrections. With the hamiltonian given by eq. (1.4), the Schrödinger equation becomes

$$ i\hbar \frac{\partial}{\partial t}\psi(\mathbf{x},t) = +\sqrt{-\hbar^2c^2\nabla^2 + m^2c^4}\,\psi(\mathbf{x},t) . \tag{1.6} $$

Unfortunately, this equation presents us with a number of difficulties. One is that it apparently treats space and time on a different footing: the time derivative appears only on the left, outside the square root, and the space derivatives appear only on the right, under the square root. This asymmetry between space and time is not what we would expect of a relativistic theory. Furthermore, if we expand the square root in powers of $\nabla^2$, we get an infinite number of spatial derivatives acting on $\psi(\mathbf{x},t)$; this implies that eq. (1.6) is not local in space.
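The expansion in eq. (1.5) can be checked with ordinary numbers in place of operators. The following is only an illustrative sketch, not part of the original text: the helper names `exact_energy` and `expanded_energy` are hypothetical, and the units m = c = 1 are an assumption made to keep the numbers simple. The third term, $-p^4/(8m^3c^2)$, is the first of the higher-order corrections and follows from the standard series $\sqrt{1+u} = 1 + u/2 - u^2/8 + \ldots$ with $u = p^2/(m^2c^2)$.

```python
import math

def exact_energy(p, m=1.0, c=1.0):
    """Relativistic energy for momentum magnitude p, as in eq. (1.4)."""
    return math.sqrt(p**2 * c**2 + m**2 * c**4)

def expanded_energy(p, m=1.0, c=1.0, terms=2):
    """Truncated expansion in inverse powers of c, as in eq. (1.5)."""
    series = [
        m * c**2,                   # rest energy
        p**2 / (2 * m),             # nonrelativistic kinetic energy
        -p**4 / (8 * m**3 * c**2),  # first higher-order correction
    ]
    return sum(series[:terms])

p = 0.1  # small compared with m*c (an assumed value), so few terms are needed
error_two_terms = abs(exact_energy(p) - expanded_energy(p, terms=2))
error_three_terms = abs(exact_energy(p) - expanded_energy(p, terms=3))
print(error_two_terms, error_three_terms)
```

Keeping one more term should shrink the error, which is what "higher-order corrections" means in practice. The operator statement is subtler, since each extra term brings more spatial derivatives, which is the nonlocality noted above.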

We can alleviate these problems by squaring the differential operators on each side of eq. (1.6) before applying them to the wave function. Then we get

$$ -\hbar^2 \frac{\partial^2}{\partial t^2}\psi(\mathbf{x},t) = \left(-\hbar^2c^2\nabla^2 + m^2c^4\right)\psi(\mathbf{x},t) . \tag{1.7} $$

This is the Klein-Gordon equation, and it looks a lot nicer than eq. (1.6). It is second-order in both space and time derivatives, and they appear in a symmetric fashion.
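As a quick sanity check on eq. (1.7), a plane wave $e^{i(kx - \omega t)}$ satisfies it exactly when $(\hbar\omega)^2 = \hbar^2c^2k^2 + m^2c^4$, which is the relativistic energy-momentum relation. The sketch below is an illustration under stated assumptions rather than anything from the original text: it uses one spatial dimension, units with ħ = c = 1, made-up values for the mass and wave number, and finite differences in place of exact derivatives. All function names are hypothetical.

```python
import cmath
import math

HBAR = C = 1.0  # natural units, chosen only to keep the numbers simple
M = 1.0         # particle mass (illustrative value)
K = 2.0         # wave number along x (illustrative value)

def plane_wave(x, t, omega):
    """psi(x, t) = exp[i(k x - omega t)] in one spatial dimension."""
    return cmath.exp(1j * (K * x - omega * t))

def second_derivative(f, s, h=1e-3):
    """Central finite-difference estimate of f''(s)."""
    return (f(s + h) - 2 * f(s) + f(s - h)) / h**2

def klein_gordon_residual(x, t, omega):
    """|LHS - RHS| of eq. (1.7), evaluated on the plane wave."""
    d2t = second_derivative(lambda s: plane_wave(x, s, omega), t)
    d2x = second_derivative(lambda s: plane_wave(s, t, omega), x)
    lhs = -HBAR**2 * d2t
    rhs = -HBAR**2 * C**2 * d2x + M**2 * C**4 * plane_wave(x, t, omega)
    return abs(lhs - rhs)

omega_on_shell = math.sqrt(HBAR**2 * C**2 * K**2 + M**2 * C**4) / HBAR
residual_on_shell = klein_gordon_residual(0.3, 0.7, omega_on_shell)
residual_off_shell = klein_gordon_residual(0.3, 0.7, 1.5 * omega_on_shell)
print(residual_on_shell, residual_off_shell)
```

The on-shell residual should be small (limited by the finite-difference step), while a frequency that violates the energy-momentum relation leaves a residual of order one. Because only $\omega^2$ enters, the frequency $-\omega$ passes the same check.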

To better understand the Klein-Gordon equation, let us consider in more detail what we mean by “relativity”. Special relativity tells us that physics looks the same in all inertial frames. To explain what this means, we first suppose that a certain spacetime coordinate system $(ct, \mathbf{x})$ represents (by fiat) an inertial frame. Let us define $x^0 = ct$, and write $x^\mu$, where $\mu = 0, 1, 2, 3$, in place of $(ct, \mathbf{x})$. It is also convenient (for reasons not at all obvious at this point) to define $x_0 = -x^0$ and $x_i = x^i$, where $i = 1, 2, 3$. This can be expressed more elegantly if we first introduce the Minkowski metric,

$$ g_{\mu\nu} = \begin{pmatrix} -1 & & & \\ & +1 & & \\ & & +1 & \\ & & & +1 \end{pmatrix} , \tag{1.8} $$

where blank entries are zero. We then have $x_\mu = g_{\mu\nu}x^\nu$, where a repeated index is summed.


To invert this formula, we introduce the inverse of $g$, which is confusingly also called $g$, except with both indices up:

$$ g^{\mu\nu} = \begin{pmatrix} -1 & & & \\ & +1 & & \\ & & +1 & \\ & & & +1 \end{pmatrix} . \tag{1.9} $$

We then have $g^{\mu\nu}g_{\nu\rho} = \delta^\mu{}_\rho$, where $\delta^\mu{}_\rho$ is the Kronecker delta (equal to one if its two indices take on the same value, zero otherwise). Now we can also write $x^\mu = g^{\mu\nu}x_\nu$.

It is a general rule that any pair of repeated (and therefore summed) indices must consist of one superscript and one subscript; these indices are said to be contracted. Also, any unrepeated (and therefore unsummed) indices must match (in both name and height) on the left- and right-hand sides of any valid equation.

Now we are ready to specify what we mean by an inertial frame. If the coordinates $x^\mu$ represent an inertial frame (which they do, by assumption), then so do any other coordinates $\bar{x}^\mu$ that are related by

$$ \bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu , \tag{1.10} $$

where $\Lambda^\mu{}_\nu$ is a Lorentz transformation matrix and $a^\mu$ is a translation vector. Both $\Lambda^\mu{}_\nu$ and $a^\mu$ are constant (that is, independent of $x^\mu$). Furthermore, $\Lambda^\mu{}_\nu$ must obey

$$ g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = g_{\rho\sigma} . \tag{1.11} $$

Eq. (1.11) ensures that the interval between two different spacetime points that are labeled by $x^\mu$ and $x'^\mu$ in one inertial frame, and by $\bar{x}^\mu$ and $\bar{x}'^\mu$ in another, is the same. This interval is defined to be

$$ (x - x')^2 \equiv g_{\mu\nu}(x - x')^\mu(x - x')^\nu = (\mathbf{x} - \mathbf{x}')^2 - c^2(t - t')^2 . \tag{1.12} $$

In the other frame, we have

$$ \begin{aligned} (\bar{x} - \bar{x}')^2 &= g_{\mu\nu}(\bar{x} - \bar{x}')^\mu(\bar{x} - \bar{x}')^\nu \\ &= g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma(x - x')^\rho(x - x')^\sigma \\ &= g_{\rho\sigma}(x - x')^\rho(x - x')^\sigma \\ &= (x - x')^2 , \end{aligned} \tag{1.13} $$

as desired.

When we say that physics looks the same, we mean that two observers (Alice and Bob, say) using two different sets of coordinates (representing two different inertial frames) should agree on the predicted results of all possible experiments. In the case of quantum mechanics, this requires Alice and Bob to agree on the value of the wave function at a particular spacetime point, a point that is called $x$ by Alice and $\bar{x}$ by Bob. Thus if Alice’s predicted wave function is $\psi(x)$, and Bob’s is $\bar\psi(\bar{x})$, then we should have $\psi(x) = \bar\psi(\bar{x})$. Furthermore, in order to maintain $\psi(x) = \bar\psi(\bar{x})$ throughout spacetime, $\psi(x)$ and $\bar\psi(\bar{x})$ should obey identical equations of motion. Thus a candidate wave equation should take the same form in any inertial frame.
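The invariance of the interval in eqs. (1.12) and (1.13) can be checked numerically. The following is a minimal sketch, assuming units in which $x^0 = ct$ is stored directly and using an explicit boost along $x$ and rotation about $z$ as sample Lorentz transformations; all helper names are hypothetical.

```python
import math

# Minkowski metric g_{mu nu} = diag(-1, +1, +1, +1), as in eq. (1.8).
G = [[-1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def boost_x(eta):
    """Boost along x with rapidity eta; rows and columns ordered (ct, x, y, z)."""
    ch, sh = math.cosh(eta), math.sinh(eta)
    return [[ch, sh, 0.0, 0.0], [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    """Rotation by angle theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def obeys_eq_1_11(lam, tol=1e-12):
    """Check g_{mu nu} Lam^mu_rho Lam^nu_sigma = g_{rho sigma}, i.e. Lam^T g Lam = g."""
    lhs = matmul(transpose(lam), matmul(G, lam))
    return all(abs(lhs[r][s] - G[r][s]) < tol for r in range(4) for s in range(4))

def change_frame(lam, a, x):
    """Eq. (1.10): xbar^mu = Lam^mu_nu x^nu + a^mu."""
    return [sum(lam[mu][nu] * x[nu] for nu in range(4)) + a[mu] for mu in range(4)]

def interval(x, y):
    """Eq. (1.12): g_{mu nu} (x - y)^mu (x - y)^nu."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return sum(G[mu][nu] * d[mu] * d[nu] for mu in range(4) for nu in range(4))
```

For example, with `lam = matmul(boost_x(0.7), rotation_z(0.3))` and any translation `a`, `interval(x, y)` and `interval(change_frame(lam, a, x), change_frame(lam, a, y))` agree to rounding error, while a matrix that rescales time alone fails `obeys_eq_1_11`.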
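Let us see if this is true of the Klein-Gordon equation. We first introduce some useful notation for spacetime derivatives:

$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left( +\frac{1}{c}\frac{\partial}{\partial t} ,\, \nabla \right) , \tag{1.14} $$

$$ \partial^\mu \equiv \frac{\partial}{\partial x_\mu} = \left( -\frac{1}{c}\frac{\partial}{\partial t} ,\, \nabla \right) . \tag{1.15} $$

Note that

$$ \partial^\mu x^\nu = g^{\mu\nu} , \tag{1.16} $$

so that our matching-index-height rule is satisfied. If $\bar{x}$ and $x$ are related by eq. (1.10), then $\bar\partial$ and $\partial$ are related by

$$ \bar\partial^\mu = \Lambda^\mu{}_\nu\partial^\nu . \tag{1.17} $$

To check this, we note that

$$ \bar\partial^\rho\bar{x}^\sigma = (\Lambda^\rho{}_\mu\partial^\mu)(\Lambda^\sigma{}_\nu x^\nu + a^\sigma) = \Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu(\partial^\mu x^\nu) = \Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu g^{\mu\nu} = g^{\rho\sigma} , \tag{1.18} $$

as expected. The last equality in eq. (1.18) is another form of eq. (1.11); see section 2.
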
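We can now write eq. (1.7) as

$$ -\hbar^2c^2\partial_0^2\psi(x) = (-\hbar^2c^2\nabla^2 + m^2c^4)\psi(x) . \tag{1.19} $$

After rearranging and identifying $\partial^2 \equiv \partial^\mu\partial_\mu = -\partial_0^2 + \nabla^2$, we have

$$ (-\partial^2 + m^2c^2/\hbar^2)\psi(x) = 0 . \tag{1.20} $$

This is Alice’s form of the equation. Bob would write

$$ (-\bar\partial^2 + m^2c^2/\hbar^2)\bar\psi(\bar{x}) = 0 . \tag{1.21} $$

Is Bob’s equation equivalent to Alice’s equation? To see that it is, we set $\bar\psi(\bar{x}) = \psi(x)$, and note that

$$ \bar\partial^2 = g_{\mu\nu}\bar\partial^\mu\bar\partial^\nu = g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma\partial^\rho\partial^\sigma = g_{\rho\sigma}\partial^\rho\partial^\sigma = \partial^2 . \tag{1.22} $$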


Thus, eq. (1.21) is indeed equivalent to eq. (1.20). The Klein-Gordon equation is therefore manifestly consistent with relativity: it takes the same form in every inertial frame.
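To make the frame independence concrete, here is a small finite-difference sketch in one space dimension with $\hbar = c = 1$ (both assumptions of the example). It checks that a plane wave solves the Klein-Gordon equation when $\omega^2 = k^2 + m^2$, and that the Lorentz-boosted wave vector that Bob would use satisfies the same condition. The function names are hypothetical.

```python
import cmath
import math

def plane_wave(t, x, omega, k):
    """psi(t, x) = exp(i (k x - omega t))."""
    return cmath.exp(1j * (k * x - omega * t))

def kg_residual(t, x, omega, k, m, h=1e-3):
    """Central-difference estimate of (d^2/dt^2 - d^2/dx^2 + m^2) psi at (t, x).

    In 1+1 dimensions with c = 1, the operator -d^2 of eq. (1.20) is
    d^2/dt^2 - d^2/dx^2, so this residual vanishes for a solution.
    """
    psi = lambda tt, xx: plane_wave(tt, xx, omega, k)
    d2t = (psi(t + h, x) - 2 * psi(t, x) + psi(t - h, x)) / h**2
    d2x = (psi(t, x + h) - 2 * psi(t, x) + psi(t, x - h)) / h**2
    return d2t - d2x + m**2 * psi(t, x)

def on_shell_frequency(k, m):
    """Positive root of omega^2 = k^2 + m^2."""
    return math.sqrt(k**2 + m**2)

def boost_wave_vector(omega, k, eta):
    """Components (omega, k) of the wave vector after a boost with rapidity eta."""
    ch, sh = math.cosh(eta), math.sinh(eta)
    return omega * ch + k * sh, k * ch + omega * sh
```

Evaluating `kg_residual` with `omega = on_shell_frequency(k, m)` gives a value consistent with zero up to discretization error, both for Alice's wave vector and for the boosted one, whereas shifting `omega` away from that value leaves a residual of order one.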
This is the good news. The bad news is that the Klein-Gordon equation violates one of the axioms of quantum mechanics: eq. (1.1), the Schrödinger equation in its abstract form. The abstract Schrödinger equation has the fundamental property of being first order in the time derivative, whereas the Klein-Gordon equation is second order. This may not seem too important, but in fact it has drastic consequences. One of these is that the norm of a state,

$$ \langle\psi,t|\psi,t\rangle = \int d^3x\, \langle\psi,t|\mathbf{x}\rangle\langle\mathbf{x}|\psi,t\rangle = \int d^3x\, \psi^*(\mathbf{x})\psi(\mathbf{x}) , \tag{1.23} $$

is not in general time independent. Thus probability is not conserved. The Klein-Gordon equation obeys relativity, but not quantum mechanics.

Dirac attempted to solve this problem (for spin-one-half particles) by introducing an extra discrete label on the wave function, to account for spin: $\psi_a(x)$, $a = 1, 2$. He then tried a Schrödinger equation of the form

$$ i\hbar\,\frac{\partial}{\partial t}\psi_a(x) = \left( -i\hbar c(\alpha^j)_{ab}\partial_j + mc^2(\beta)_{ab} \right)\psi_b(x) , \tag{1.24} $$

where all repeated indices are summed, and $\alpha^j$ and $\beta$ are matrices in spin-space. This equation, the Dirac equation, is consistent with the abstract Schrödinger equation. The state $|\psi,a,t\rangle$ carries a spin label $a$, and the hamiltonian is

$$ H_{ab} = cP_j(\alpha^j)_{ab} + mc^2(\beta)_{ab} , \tag{1.25} $$

where $P_j$ is a component of the momentum operator.

Since the Dirac equation is linear in both time and space derivatives, it has a chance to be consistent with relativity. Note that squaring the hamiltonian yields

$$ (H^2)_{ab} = c^2P_jP_k(\alpha^j\alpha^k)_{ab} + mc^3P_j(\alpha^j\beta + \beta\alpha^j)_{ab} + (mc^2)^2(\beta^2)_{ab} . \tag{1.26} $$

Since $P_jP_k$ is symmetric on exchange of $j$ and $k$, we can replace $\alpha^j\alpha^k$ by its symmetric part, $\frac{1}{2}\{\alpha^j, \alpha^k\}$, where $\{A, B\} = AB + BA$ is the anticommutator. Then, if we choose matrices such that

$$ \{\alpha^j, \alpha^k\}_{ab} = 2\delta^{jk}\delta_{ab} , \quad \{\alpha^j, \beta\}_{ab} = 0 , \quad (\beta^2)_{ab} = \delta_{ab} , \tag{1.27} $$

we will get

$$ (H^2)_{ab} = (\mathbf{P}^2c^2 + m^2c^4)\delta_{ab} . \tag{1.28} $$

Thus, the eigenstates of $H^2$ are momentum eigenstates, with $H^2$ eigenvalue $\mathbf{p}^2c^2 + m^2c^4$. This is, of course, the correct relativistic energy-momentum relation. While it is outside the scope of this section to demonstrate it, it turns out that the Dirac equation is fully consistent with relativity provided the Dirac matrices obey eq. (1.27). So we have apparently succeeded in constructing a quantum mechanical, relativistic theory!
There are, however, some problems. We would like the Dirac matrices to be $2 \times 2$, in order to account for electron spin. However, they must in fact be larger. To see this, note that the $2 \times 2$ Pauli matrices obey $\{\sigma^i, \sigma^j\} = 2\delta^{ij}$, and are thus candidates for the Dirac $\alpha^i$ matrices. However, there is no fourth matrix that anticommutes with these three (easily proven by writing down the most general $2 \times 2$ matrix and working out the three anticommutators explicitly). Also, we can show that the Dirac matrices must be even dimensional; see problem 1.1. Thus their minimum size is $4 \times 4$, and it remains for us to interpret the two extra possible “spin” states.
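The claim that no nonzero $2 \times 2$ matrix anticommutes with all three Pauli matrices can be checked mechanically. The sketch below, whose helper names are all hypothetical, writes the most general $2 \times 2$ matrix as a combination of $\{I, \sigma^1, \sigma^2, \sigma^3\}$, assembles the twelve linear equations that the three anticommutators impose on the four coefficients, and computes the rank of that system. A rank of four means that only the zero matrix qualifies.

```python
I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def anticommutator(a, b):
    ab, ba = matmul2(a, b), matmul2(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(2)] for i in range(2)]

def constraint_matrix():
    """12x4 matrix: column k holds the flattened anticommutators {sigma^i, B_k}."""
    basis = [I2] + SIGMA
    columns = []
    for b in basis:
        col = []
        for s in SIGMA:
            ac = anticommutator(s, b)
            col.extend(ac[0] + ac[1])
        columns.append(col)
    # Rows index the 12 scalar equations, columns the 4 unknown coefficients.
    return [[columns[k][r] for k in range(4)] for r in range(12)]

def matrix_rank(rows, tol=1e-12):
    """Rank by Gauss-Jordan elimination with partial pivoting (complex entries)."""
    work = [[complex(x) for x in row] for row in rows]
    n_rows, n_cols = len(work), len(work[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(work[r][col]))
        if abs(work[pivot][col]) < tol:
            continue
        work[rank], work[pivot] = work[pivot], work[rank]
        for r in range(n_rows):
            if r != rank:
                factor = work[r][col] / work[rank][col]
                work[r] = [x - factor * y for x, y in zip(work[r], work[rank])]
        rank += 1
    return rank
```

Here `matrix_rank(constraint_matrix())` equals the number of unknown coefficients, so the homogeneous system has only the trivial solution.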
However, these extra states cause a more severe problem than a mere overcounting. Acting on a momentum eigenstate, $H$ becomes the matrix $c\,\boldsymbol{\alpha}\cdot\mathbf{p} + mc^2\beta$. In problem 1.1, we find that the trace of this matrix is zero. Thus the four eigenvalues must be $+E(\mathbf{p})$, $+E(\mathbf{p})$, $-E(\mathbf{p})$, $-E(\mathbf{p})$, where $E(\mathbf{p}) = +(\mathbf{p}^2c^2 + m^2c^4)^{1/2}$. The negative eigenvalues are the problem: they indicate that there is no ground state. In a more elaborate theory that included interactions with photons, there seems to be no reason why a positive energy electron could not emit a photon and drop down into a negative energy state. This downward cascade could continue forever. (The same problem also arises in attempts to interpret the Klein-Gordon equation as a modified form of quantum mechanics.)
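For a concrete check of eqs. (1.27) and (1.28) and of the vanishing trace, the sketch below uses one standard $4 \times 4$ choice, $\alpha^j = \begin{pmatrix} 0 & \sigma^j \\ \sigma^j & 0 \end{pmatrix}$ and $\beta = \mathrm{diag}(1, 1, -1, -1)$. This representation is an assumption of the example; the text does not specify one, and any matrices obeying eq. (1.27) would serve. Helper names are hypothetical.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(z, a):
    return [[z * x for x in row] for row in a]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

IDENTITY4 = block(I2, Z2, Z2, I2)
ZERO4 = block(Z2, Z2, Z2, Z2)
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]   # assumed representation
BETA = block(I2, Z2, Z2, scale(-1, I2))        # assumed representation

def dirac_matrix(p, m, c=1.0):
    """H = c alpha.p + m c^2 beta acting on a momentum eigenstate with momentum p."""
    h = scale(m * c**2, BETA)
    for j in range(3):
        h = add(h, scale(c * p[j], ALPHA[j]))
    return h
```

Since `matmul(h, h)` comes out proportional to the identity with coefficient $\mathbf{p}^2c^2 + m^2c^4$, every eigenvalue of `h` is $\pm E(\mathbf{p})$, and `trace(h) == 0` then forces two of each sign.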
Dirac made a wildly brilliant attempt to ﬁx this problem of negative energy states. His solution is based on an empirical fact about electrons: they obey the Pauli exclusion principle. It is impossible to put more than one of them in the same quantum state. What if, Dirac speculated, all the negative energy states were already occupied? In this case, a positive energy electron could not drop into one of these states, by Pauli exclusion.
Many questions immediately arise. Why don’t we see the negative electric charge of this Dirac sea of electrons? Dirac’s answer: because we’re used to it. (More precisely, the physical eﬀects of a uniform charge density depend on the boundary conditions at inﬁnity that we impose on Maxwell’s equations, and there is a choice that renders such a uniform charge density invisible.) However, Dirac noted, if one of these negative energy electrons were excited into a positive energy state (by, say, a suﬃciently energetic photon), it would leave behind a hole in the sea of negative energy electrons. This hole would appear to have positive charge, and positive energy. Dirac therefore predicted (in 1927) the existence of the positron, a particle with the same mass as the electron, but opposite charge. The positron was found experimentally ﬁve years later.


However, we have now jumped from an attempt at a quantum description of a single relativistic particle to a theory that apparently requires an inﬁnite number of particles. Even if we accept this, we still have not solved the problem of how to describe particles like photons or pions or alpha nuclei that do not obey Pauli exclusion.
At this point, it is worthwhile to stop and reﬂect on why it has proven to be so hard to ﬁnd an acceptable relativistic wave equation for a single quantum particle. Perhaps there is something wrong with our basic approach.
And there is. Recall the axiom of quantum mechanics that says that “Observables are represented by hermitian operators.” This is not entirely true. There is one observable in quantum mechanics that is not represented by a hermitian operator: time. Time enters into quantum mechanics only when we announce that the “state of the system” depends on an extra parameter $t$. This parameter is not the eigenvalue of any operator. This is in sharp contrast to the particle’s position $\mathbf{x}$, which is the eigenvalue of an operator. Thus, space and time are treated very differently, a fact that is obscured by writing the Schrödinger equation in terms of the position-space wave function $\psi(\mathbf{x}, t)$. Since space and time are treated asymmetrically, it is not surprising that we are having trouble incorporating a symmetry that mixes them up.
So, what are we to do? In principle, the problem could be an intractable one: it might be impossible to combine quantum mechanics and relativity. In this case, there would have to be some meta-theory, one that reduces in the nonrelativistic limit to quantum mechanics, and in the classical limit to relativistic particle dynamics, but is actually neither. This, however, turns out not to be the case. We can solve our problem, but we must put space and time on an equal footing at the outset. There are two ways to do this. One is to demote position from its status as an operator, and render it as an extra label, like time. The other is to promote time to an operator.

Let us discuss the second option first. If time becomes an operator, what do we use as the time parameter in the Schrödinger equation? Happily, in relativistic theories, there is more than one notion of time. We can use the proper time $\tau$ of the particle (the time measured by a clock that moves with it) as the time parameter. The coordinate time $T$ (the time measured by a stationary clock in an inertial frame) is then promoted to an operator. In the Heisenberg picture (where the state of the system is fixed, but the operators are functions of time that obey the classical equations of motion), we would have operators $X^\mu(\tau)$, where $X^0 = T$. Relativistic quantum mechanics can indeed be developed along these lines, but it is surprisingly complicated to do so. (The many times are the problem; any monotonic function of $\tau$ is just as good a candidate as $\tau$ itself for the proper time, and this infinite redundancy of descriptions must be understood and accounted for.)
One of the advantages of considering different formalisms is that they may suggest different directions for generalizations. For example, once we have $X^\mu(\tau)$, why not consider adding some more parameters? Then we would have, for example, $X^\mu(\sigma, \tau)$. Classically, this would give us a continuous family of worldlines, what we might call a worldsheet, and so $X^\mu(\sigma, \tau)$ would describe a propagating string. This is indeed the starting point for string theory.
Thus, promoting time to an operator is a viable option, but is complicated in practice. Let us then turn to the other option, demoting position to a label. The ﬁrst question is, label on what? The answer is, on operators. Thus, consider assigning an operator to each point x in space; call these operators ϕ(x). A set of operators like this is called a quantum ﬁeld. In the Heisenberg picture, the operators are also time dependent:

$$ \varphi(\mathbf{x}, t) = e^{iHt/\hbar}\varphi(\mathbf{x}, 0)e^{-iHt/\hbar} . \tag{1.29} $$

Thus, both position and (in the Heisenberg picture) time are now labels on operators; neither is itself the eigenvalue of an operator.
So, now we have two diﬀerent approaches to relativistic quantum theory, approaches that might, in principle, yield diﬀerent results. This, however, is not the case: it turns out that any relativistic quantum physics that can be treated in one formalism can also be treated in the other. Which we use is a matter of convenience and taste. And, quantum ﬁeld theory, the formalism in which position and time are both labels on operators, is much more convenient and eﬃcient for most problems.
There is another useful equivalence: ordinary nonrelativistic quantum mechanics, for a ﬁxed number of particles, can be rewritten as a quantum ﬁeld theory. This is an informative exercise, since the corresponding physics is already familiar. Let us carry it out.
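Begin with the position-basis Schrödinger equation for $n$ particles, all with the same mass $m$, moving in an external potential $U(\mathbf{x})$, and interacting with each other via an interparticle potential $V(\mathbf{x}_1 - \mathbf{x}_2)$:

$$ i\hbar\,\frac{\partial}{\partial t}\psi = \left[ \sum_{j=1}^{n}\left( -\frac{\hbar^2}{2m}\nabla_j^2 + U(\mathbf{x}_j) \right) + \sum_{j=1}^{n}\sum_{k=1}^{j-1} V(\mathbf{x}_j - \mathbf{x}_k) \right]\psi , \tag{1.30} $$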

where $\psi = \psi(\mathbf{x}_1, \ldots, \mathbf{x}_n; t)$ is the position-space wave function. The quantum mechanics of this system can be rewritten in the abstract form of eq. (1.1) by first introducing (in, for now, the Schrödinger picture) a quantum field $a(\mathbf{x})$ and its hermitian conjugate $a^\dagger(\mathbf{x})$. We take these operators to have the commutation relations

$$ [a(\mathbf{x}), a(\mathbf{x}')] = 0 , \quad [a^\dagger(\mathbf{x}), a^\dagger(\mathbf{x}')] = 0 , \quad [a(\mathbf{x}), a^\dagger(\mathbf{x}')] = \delta^3(\mathbf{x} - \mathbf{x}') , \tag{1.31} $$

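where $\delta^3(\mathbf{x})$ is the three-dimensional Dirac delta function. Thus, $a^\dagger(\mathbf{x})$ and $a(\mathbf{x})$ behave like harmonic-oscillator creation and annihilation operators that are labeled by a continuous index. In terms of them, we introduce the hamiltonian operator of our quantum field theory,

$$ H = \int d^3x\; a^\dagger(\mathbf{x})\left( -\frac{\hbar^2}{2m}\nabla^2 + U(\mathbf{x}) \right)a(\mathbf{x}) + \frac{1}{2}\int d^3x\, d^3y\; V(\mathbf{x} - \mathbf{y})\,a^\dagger(\mathbf{x})a^\dagger(\mathbf{y})a(\mathbf{y})a(\mathbf{x}) . \tag{1.32} $$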

Now consider a time-dependent state of the form

$$ |\psi, t\rangle = \int d^3x_1 \ldots d^3x_n\; \psi(\mathbf{x}_1, \ldots, \mathbf{x}_n; t)\,a^\dagger(\mathbf{x}_1) \ldots a^\dagger(\mathbf{x}_n)|0\rangle , \tag{1.33} $$

where $\psi(\mathbf{x}_1, \ldots, \mathbf{x}_n; t)$ is some function of the $n$ particle positions and time, and $|0\rangle$ is the vacuum state, the state that is annihilated by all the $a$’s,

$$ a(\mathbf{x})|0\rangle = 0 . \tag{1.34} $$

It is now straightforward (though tedious) to verify that eq. (1.1), the abstract Schrödinger equation, is obeyed if and only if the function $\psi$ satisfies eq. (1.30).

Thus we can interpret the state $|0\rangle$ as a state of “no particles”, the state $a^\dagger(\mathbf{x}_1)|0\rangle$ as a state with one particle at position $\mathbf{x}_1$, the state $a^\dagger(\mathbf{x}_1)a^\dagger(\mathbf{x}_2)|0\rangle$ as a state with one particle at position $\mathbf{x}_1$ and another at position $\mathbf{x}_2$, and so on. The operator

$$ N = \int d^3x\; a^\dagger(\mathbf{x})a(\mathbf{x}) \tag{1.35} $$

counts the total number of particles. It commutes with the hamiltonian, as is easily checked; thus, if we start with a state of $n$ particles, we remain with a state of $n$ particles at all times.

However, we can imagine generalizations of this version of the theory (generalizations that would not be possible without the field formalism) in which the number of particles is not conserved. For example, we could try adding to $H$ a term like

$$ \Delta H \propto \int d^3x \left[ a^\dagger(\mathbf{x})a^2(\mathbf{x}) + \mathrm{h.c.} \right] . \tag{1.36} $$

This term does not commute with $N$, and so the number of particles would not be conserved with this addition to $H$.
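These two statements can be illustrated with a toy model. The sketch below is a deliberate simplification, not the field theory itself: it replaces the continuum of operators $a(\mathbf{x})$ by a single bosonic mode, truncated to a finite number of levels, so that $N = a^\dagger a$, the analog of eq. (1.32) is $\omega\,a^\dagger a + \frac{1}{2}g\,a^\dagger a^\dagger a a$, and the analog of eq. (1.36) is $a^\dagger a^2 + \mathrm{h.c.}$. The parameters `omega`, `g`, and `dim` are hypothetical.

```python
import math

def annihilation(dim):
    """Truncated single-mode annihilation operator: a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(z, a):
    return [[z * x for x in row] for row in a]

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

def max_abs(a):
    return max(abs(x) for row in a for x in row)

def build_operators(dim=6, omega=1.0, g=0.3):
    """Return (N, H0, X, Delta H) for the single-mode toy model."""
    a = annihilation(dim)
    adag = transpose(a)  # entries are real, so the transpose is the hermitian conjugate
    number = matmul(adag, a)
    quartic = matmul(matmul(adag, adag), matmul(a, a))
    h0 = add(scale(omega, number), scale(0.5 * g, quartic))
    x = matmul(adag, matmul(a, a))      # a^dagger a^2 lowers the particle number by one
    delta_h = add(x, transpose(x))      # a^dagger a^2 + h.c.
    return number, h0, x, delta_h
```

In this toy model `commutator(number, h0)` vanishes, while `commutator(number, delta_h)` equals $X^\dagger - X$ with $X = a^\dagger a^2$, reflecting the fact that $X$ removes one particle and $X^\dagger$ adds one.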
Theories in which the number of particles can change as time evolves are a good thing: they are needed for correct phenomenology. We are already familiar with the notion that atoms can emit and absorb photons, and so we had better have a formalism that can incorporate this phenomenon. We are less familiar with emission and absorption (that is to say, creation and annihilation) of electrons, but this process also occurs in nature; it is less common because it must be accompanied by the emission or absorption of a positron, the antiparticle of the electron. There are not a lot of positrons around to facilitate electron annihilation, while $e^+e^-$ pair creation requires at least $2mc^2$ of energy to be available for the rest-mass energy of these two particles. The photon, on the other hand, is its own antiparticle, and has zero rest mass; thus photons are easily and copiously produced and destroyed.
There is another important aspect of the quantum theory specified by eqs. (1.32) and (1.33). Because the creation operators commute with each other, only the completely symmetric part of $\psi$ survives the integration in eq. (1.33). Therefore, without loss of generality, we can restrict our attention to $\psi$’s of this type:

$$ \psi(\ldots \mathbf{x}_i \ldots \mathbf{x}_j \ldots ; t) = +\psi(\ldots \mathbf{x}_j \ldots \mathbf{x}_i \ldots ; t) . \tag{1.37} $$

This means that we have a theory of bosons, particles that (like photons or pions or alpha nuclei) obey Bose-Einstein statistics. If we want Fermi-Dirac statistics instead, we must replace eq. (1.31) with

$$ \{a(\mathbf{x}), a(\mathbf{x}')\} = 0 , \quad \{a^\dagger(\mathbf{x}), a^\dagger(\mathbf{x}')\} = 0 , \quad \{a(\mathbf{x}), a^\dagger(\mathbf{x}')\} = \delta^3(\mathbf{x} - \mathbf{x}') , \tag{1.38} $$

where again $\{A, B\} = AB + BA$ is the anticommutator. Now only the fully antisymmetric part of $\psi$ survives the integration in eq. (1.33), and so we can restrict our attention to

$$ \psi(\ldots \mathbf{x}_i \ldots \mathbf{x}_j \ldots ; t) = -\psi(\ldots \mathbf{x}_j \ldots \mathbf{x}_i \ldots ; t) . \tag{1.39} $$

Thus we have a theory of fermions. It is straightforward to check that the abstract Schrödinger equation, eq. (1.1), still implies that $\psi$ obeys the differential equation (1.30).[1] Interestingly, there is no simple way to write down a quantum field theory with particles that obey Boltzmann statistics, corresponding to a wave function with no particular symmetry. This is a hint of the spin-statistics theorem, which applies to relativistic quantum field theory. It says that interacting particles with integer spin must be bosons, and interacting particles with half-integer spin must be fermions. In our nonrelativistic example, the interacting particles clearly have spin zero (because their creation operators carry no labels that could be interpreted as corresponding to different spin states), but can be either bosons or fermions, as we have seen.

[1] Now, however, the ordering of the $a$ and $a^\dagger$ operators in the last term of eq. (1.32) becomes significant, and must be as written.

Now that we have seen how to rewrite the nonrelativistic quantum mechanics of multiple bosons or fermions as a quantum ﬁeld theory, it is time to try to construct a relativistic version.
Reference Notes
The history of the physics of elementary particles is recounted in Pais. A brief overview can be found in Weinberg I. More details on quantum ﬁeld theory for nonrelativistic particles can be found in Brown.
Problems
1.1) Show that the Dirac matrices must be even dimensional. Hint: show that the eigenvalues of $\beta$ are all $\pm 1$, and that $\mathrm{Tr}\,\beta = 0$. To show that $\mathrm{Tr}\,\beta = 0$, consider, e.g., $\mathrm{Tr}\,\alpha_1^2\beta$. Similarly, show that $\mathrm{Tr}\,\alpha_i = 0$.
1.2) With the hamiltonian of eq. (1.32), show that the state defined in eq. (1.33) obeys the abstract Schrödinger equation, eq. (1.1), if and only if the wave function obeys eq. (1.30). Your demonstration should apply both to the case of bosons, where the particle creation and annihilation operators obey the commutation relations of eq. (1.31), and to fermions, where the particle creation and annihilation operators obey the anticommutation relations of eq. (1.38).
1.3) Show explicitly that [N, H] = 0, where H is given by eq. (1.32) and N by eq. (1.35).

2 Lorentz Invariance
Prerequisite: 1

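A Lorentz transformation is a linear, homogeneous change of coordinates from $x^\mu$ to $\bar{x}^\mu$,

$$ \bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu , \tag{2.1} $$

that preserves the interval $x^2$ between $x^\mu$ and the origin, where

$$ x^2 \equiv x^\mu x_\mu = g_{\mu\nu}x^\mu x^\nu = \mathbf{x}^2 - c^2t^2 . \tag{2.2} $$

This means that the matrix $\Lambda^\mu{}_\nu$ must obey

$$ g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = g_{\rho\sigma} , \tag{2.3} $$

where

$$ g_{\mu\nu} = \begin{pmatrix} -1 & & & \\ & +1 & & \\ & & +1 & \\ & & & +1 \end{pmatrix} \tag{2.4} $$

is the Minkowski metric.
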
Note that this set of transformations includes ordinary spatial rotations: take $\Lambda^0{}_0 = 1$, $\Lambda^0{}_i = \Lambda^i{}_0 = 0$, and $\Lambda^i{}_j = R_{ij}$, where $R$ is an orthogonal rotation matrix.

The set of all Lorentz transformations forms a group: the product of any two Lorentz transformations is another Lorentz transformation; the product is associative; there is an identity transformation, $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu$; and every Lorentz transformation has an inverse. It is easy to demonstrate these statements explicitly. For example, to find the inverse transformation $(\Lambda^{-1})^\mu{}_\nu$, note that the left-hand side of eq. (2.3) can be written as $\Lambda_{\nu\rho}\Lambda^\nu{}_\sigma$, and that we can raise the $\rho$ index on both sides to get $\Lambda_\nu{}^\rho\Lambda^\nu{}_\sigma = \delta^\rho{}_\sigma$. On the other hand, by definition, $(\Lambda^{-1})^\rho{}_\nu\Lambda^\nu{}_\sigma = \delta^\rho{}_\sigma$. Therefore

$$ (\Lambda^{-1})^\rho{}_\nu = \Lambda_\nu{}^\rho . \tag{2.5} $$

Another useful version of eq. (2.3) is

$$ g^{\mu\nu}\Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu = g^{\rho\sigma} . \tag{2.6} $$

To get eq. (2.6), start with eq. (2.3), but with the inverse transformations $(\Lambda^{-1})^\mu{}_\rho$ and $(\Lambda^{-1})^\nu{}_\sigma$. Then use eq. (2.5), raise all down indices, and lower all up indices. The result is eq. (2.6).
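In matrix language, eq. (2.5) says that the inverse is obtained by transposing and then lowering and raising indices with the metric, $\Lambda^{-1} = g^{-1}\Lambda^{T}g$. A minimal numerical sketch of this statement and of eq. (2.6) follows; the sample boost and rotation and all function names are assumptions of the example, and the code uses the fact that $g^{\mu\nu}$ and $g_{\mu\nu}$ have the same numerical entries.

```python
import math

# Numerical entries shared by g_{mu nu} and its inverse g^{mu nu}.
G = [[-1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def boost_x(eta):
    ch, sh = math.cosh(eta), math.sinh(eta)
    return [[ch, sh, 0.0, 0.0], [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def lorentz_inverse(lam):
    """Eq. (2.5) in matrix form: Lambda^{-1} = g^{-1} Lambda^T g."""
    return matmul(G, matmul(transpose(lam), G))

def obeys_eq_2_6(lam):
    """Eq. (2.6) in matrix form: Lambda g^{-1} Lambda^T = g^{-1}."""
    return close(matmul(lam, matmul(G, transpose(lam))), G)
```

For a pure boost, `lorentz_inverse(boost_x(eta))` reproduces `boost_x(-eta)`, as one would expect physically.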
For an infinitesimal Lorentz transformation, we can write

$$ \Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \delta\omega^\mu{}_\nu . \tag{2.7} $$

Eq. (2.3) can be used to show that $\delta\omega$ with both indices down (or up) is antisymmetric:

$$ \delta\omega_{\rho\sigma} = -\delta\omega_{\sigma\rho} . \tag{2.8} $$

Thus there are six independent infinitesimal Lorentz transformations (in four spacetime dimensions). These can be divided into three rotations ($\delta\omega_{ij} = -\varepsilon_{ijk}\hat{n}_k\,\delta\theta$ for a rotation by angle $\delta\theta$ about the unit vector $\hat{\mathbf{n}}$) and three boosts ($\delta\omega_{i0} = \hat{n}_i\,\delta\eta$ for a boost in the direction $\hat{\mathbf{n}}$ by rapidity $\delta\eta$).

Not all Lorentz transformations can be reached by compounding infinitesimal ones. If we take the determinant of eq. (2.5), we get $(\det\Lambda)^{-1} = \det\Lambda$, which implies $\det\Lambda = \pm 1$. Transformations with $\det\Lambda = +1$ are proper, and transformations with $\det\Lambda = -1$ are improper. Note that the product of any two proper Lorentz transformations is proper, and that infinitesimal transformations of the form $\Lambda = 1 + \delta\omega$ are proper. Therefore, any transformation that can be reached by compounding infinitesimal ones is proper. The proper transformations form a subgroup of the Lorentz group.

Another subgroup is that of the orthochronous Lorentz transformations: those for which $\Lambda^0{}_0 \ge +1$. Note that eq. (2.3) implies $(\Lambda^0{}_0)^2 - \Lambda^i{}_0\Lambda^i{}_0 = 1$; thus, either $\Lambda^0{}_0 \ge +1$ or $\Lambda^0{}_0 \le -1$. An infinitesimal transformation is clearly orthochronous, and it is straightforward to show that the product of two orthochronous transformations is also orthochronous.

Thus, the Lorentz transformations that can be reached by compounding infinitesimal ones are both proper and orthochronous, and they form a subgroup. We can introduce two discrete transformations that take us out of this subgroup: parity and time reversal. The parity transformation is

$$ \mathcal{P}^\mu{}_\nu = (\mathcal{P}^{-1})^\mu{}_\nu = \begin{pmatrix} +1 & & & \\ & -1 & & \\ & & -1 & \\ & & & -1 \end{pmatrix} . \tag{2.9} $$

It is orthochronous, but improper. The time-reversal transformation is

$$ \mathcal{T}^\mu{}_\nu = (\mathcal{T}^{-1})^\mu{}_\nu = \begin{pmatrix} -1 & & & \\ & +1 & & \\ & & +1 & \\ & & & +1 \end{pmatrix} . \tag{2.10} $$

It is nonorthochronous and improper.

Generally, when a theory is said to be Lorentz invariant, this means under the proper orthochronous subgroup only. Parity and time reversal are treated separately. It is possible for a quantum field theory to be invariant under the proper orthochronous subgroup, but not under parity and/or time-reversal.
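The classification into proper or improper and orthochronous or nonorthochronous depends only on the sign of $\det\Lambda$ and the sign of $\Lambda^0{}_0$. The sketch below applies this to a sample boost, to parity, to time reversal, and to their product; the helper names and the recursive determinant are choices made for the example.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def boost_x(eta):
    ch, sh = math.cosh(eta), math.sinh(eta)
    return [[ch, sh, 0.0, 0.0], [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

# Parity and time reversal, eqs. (2.9) and (2.10).
PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
TIME_REVERSAL = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def classify(lam):
    """Return (is_proper, is_orthochronous) for a Lorentz transformation.

    Assumes lam already obeys eq. (2.3), so det = +-1 and |Lambda^0_0| >= 1.
    """
    is_proper = det(lam) > 0
    is_orthochronous = lam[0][0] > 0
    return is_proper, is_orthochronous

for name, lam in [("boost", boost_x(0.7)), ("P", PARITY),
                  ("T", TIME_REVERSAL), ("PT", matmul(PARITY, TIME_REVERSAL))]:
    print(name, classify(lam))
```

Note that the product of parity and time reversal is proper but nonorthochronous, so it also lies outside the subgroup reachable from the identity.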


From here on, in this section, we will treat the proper orthochronous subgroup only. Parity and time reversal will be treated in section 23.
In quantum theory, symmetries are represented by unitary (or antiunitary) operators. This means that we associate a unitary operator $U(\Lambda)$ to each proper, orthochronous Lorentz transformation $\Lambda$. These operators must obey the composition rule

$$ U(\Lambda'\Lambda) = U(\Lambda')U(\Lambda) . \tag{2.11} $$

For an infinitesimal transformation, we can write

$$ U(1 + \delta\omega) = I + \frac{i}{2\hbar}\,\delta\omega_{\mu\nu}M^{\mu\nu} , \tag{2.12} $$

where $M^{\mu\nu} = -M^{\nu\mu}$ is a set of hermitian operators called the generators of the Lorentz group. If we start with $U(\Lambda)^{-1}U(\Lambda')U(\Lambda) = U(\Lambda^{-1}\Lambda'\Lambda)$, let $\Lambda' = 1 + \delta\omega'$, and expand both sides to linear order in $\delta\omega$, we get

$$ \delta\omega_{\mu\nu}\,U(\Lambda)^{-1}M^{\mu\nu}U(\Lambda) = \delta\omega_{\mu\nu}\,\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma M^{\rho\sigma} . \tag{2.13} $$

Then, since $\delta\omega_{\mu\nu}$ is arbitrary (except for being antisymmetric), the antisymmetric part of its coefficient on each side must be the same. In this case, because $M^{\mu\nu}$ is already antisymmetric (by definition), we have

$$ U(\Lambda)^{-1}M^{\mu\nu}U(\Lambda) = \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma M^{\rho\sigma} . \tag{2.14} $$

We see that each vector index on $M^{\mu\nu}$ undergoes its own Lorentz transformation. This is a general result: any operator carrying one or more vector indices should behave similarly. For example, consider the energy-momentum four-vector $P^\mu$, where $P^0$ is the hamiltonian $H$ and $P^i$ are the components of the total three-momentum operator. We expect

$$ U(\Lambda)^{-1}P^\mu U(\Lambda) = \Lambda^\mu{}_\nu P^\nu . \tag{2.15} $$

If we now let $\Lambda = 1 + \delta\omega$ in eq. (2.14), expand to linear order in $\delta\omega$, and equate the antisymmetric part of the coefficients of $\delta\omega_{\mu\nu}$, we get the commutation relations

$$ [M^{\mu\nu}, M^{\rho\sigma}] = i\hbar\left( g^{\mu\rho}M^{\nu\sigma} - (\mu\leftrightarrow\nu) \right) - (\rho\leftrightarrow\sigma) . \tag{2.16} $$

These commutation relations specify the Lie algebra of the Lorentz group. We can identify the components of the angular momentum operator $\mathbf{J}$ as $J_i \equiv \frac{1}{2}\varepsilon_{ijk}M^{jk}$, and the components of the boost operator $\mathbf{K}$ as $K_i \equiv M^{i0}$. We then find from eq. (2.16) that

$$ [J_i, J_j] = i\hbar\varepsilon_{ijk}J_k , \quad [J_i, K_j] = i\hbar\varepsilon_{ijk}K_k , \quad [K_i, K_j] = -i\hbar\varepsilon_{ijk}J_k . \tag{2.17} $$

The first of these is the usual set of commutators for angular momentum, and the second says that $\mathbf{K}$ transforms as a three-vector under rotations. The third implies that a series of boosts can be equivalent to a rotation.
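One way to gain confidence in eqs. (2.16) and (2.17) is to test them on explicit matrices. The sketch below uses the $4 \times 4$ vector-representation matrices $(S_V^{\mu\nu})^\rho{}_\tau = \frac{\hbar}{i}(g^{\mu\rho}\delta^\nu{}_\tau - g^{\nu\rho}\delta^\mu{}_\tau)$, which appear in problem 2.9 as eq. (2.33), as stand-ins for the operators $M^{\mu\nu}$. Treating them this way relies on the result of that problem, and the helper names are hypothetical.

```python
from itertools import product

# Inverse metric g^{mu nu} = diag(-1, +1, +1, +1).
G_UP = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def generator(mu, nu, hbar=1.0):
    """Matrix (S^{mu nu})^rho_tau of eq. (2.33); rho labels rows, tau columns."""
    return [[(hbar / 1j) * (G_UP[mu][rho] * (nu == tau) - G_UP[nu][rho] * (mu == tau))
             for tau in range(4)] for rho in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def combine(terms):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    total = [[0j] * 4 for _ in range(4)]
    for coeff, mat in terms:
        for r in range(4):
            for c in range(4):
                total[r][c] += coeff * mat[r][c]
    return total

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def obeys_eq_2_16(hbar=1.0):
    for mu, nu, rho, sigma in product(range(4), repeat=4):
        rhs = combine([(1j * hbar * G_UP[mu][rho], generator(nu, sigma, hbar)),
                       (-1j * hbar * G_UP[nu][rho], generator(mu, sigma, hbar)),
                       (-1j * hbar * G_UP[mu][sigma], generator(nu, rho, hbar)),
                       (1j * hbar * G_UP[nu][sigma], generator(mu, rho, hbar))])
        lhs = commutator(generator(mu, nu, hbar), generator(rho, sigma, hbar))
        if not close(lhs, rhs):
            return False
    return True

def obeys_eq_2_17(hbar=1.0):
    J = [generator(2, 3, hbar), generator(3, 1, hbar), generator(1, 2, hbar)]
    K = [generator(i, 0, hbar) for i in (1, 2, 3)]
    for i, j in product(range(3), repeat=2):
        eps = [levi_civita(i, j, k) for k in range(3)]
        jj = combine([(1j * hbar * eps[k], J[k]) for k in range(3)])
        jk = combine([(1j * hbar * eps[k], K[k]) for k in range(3)])
        kk = combine([(-1j * hbar * eps[k], J[k]) for k in range(3)])
        if not (close(commutator(J[i], J[j]), jj)
                and close(commutator(J[i], K[j]), jk)
                and close(commutator(K[i], K[j]), kk)):
            return False
    return True
```

The minus sign in $[K_i, K_j]$ is visible directly here: the commutator of the matrices for boosts along $x$ and $y$ is $-i\hbar$ times the matrix for a rotation about $z$.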
Similarly, we can let $\Lambda = 1 + \delta\omega$ in eq. (2.15) to get

$$ [P^\mu, M^{\rho\sigma}] = i\hbar\left( g^{\mu\sigma}P^\rho - (\rho\leftrightarrow\sigma) \right) , \tag{2.18} $$

which becomes

$$ [J_i, H] = 0 , \quad [J_i, P_j] = i\hbar\varepsilon_{ijk}P_k , \quad [K_i, H] = i\hbar P_i , \quad [K_i, P_j] = i\hbar\delta_{ij}H . \tag{2.19} $$

Also, the components of $P^\mu$ should commute with each other:

$$ [P_i, P_j] = 0 , \quad [P_i, H] = 0 . \tag{2.20} $$

Together, eqs. (2.17), (2.19), and (2.20) form the Lie algebra of the Poincaré group.

Let us now consider what should happen to a quantum scalar field $\varphi(x)$ under a Lorentz transformation. We begin by recalling how time evolution works in the Heisenberg picture:

$$ e^{+iHt/\hbar}\varphi(\mathbf{x}, 0)e^{-iHt/\hbar} = \varphi(\mathbf{x}, t) . \tag{2.21} $$

Obviously, this should have a relativistic generalization,

$$ e^{-iPx/\hbar}\varphi(0)e^{+iPx/\hbar} = \varphi(x) , \tag{2.22} $$

where $Px = P^\mu x_\mu = \mathbf{P}\cdot\mathbf{x} - Hct$. We can make this a little fancier by defining the unitary spacetime translation operator

$$ T(a) \equiv \exp(-iP^\mu a_\mu/\hbar) . \tag{2.23} $$

Then we have

$$ T(a)^{-1}\varphi(x)T(a) = \varphi(x - a) . \tag{2.24} $$

For an infinitesimal translation,

$$ T(\delta a) = I - \frac{i}{\hbar}\,\delta a_\mu P^\mu . \tag{2.25} $$

Comparing eqs. (2.12) and (2.25), we see that eq. (2.24) leads us to expect

$$ U(\Lambda)^{-1}\varphi(x)U(\Lambda) = \varphi(\Lambda^{-1}x) . \tag{2.26} $$


Derivatives of $\varphi$ then carry vector indices that transform in the appropriate way, e.g.,

$$ U(\Lambda)^{-1}\partial^\mu\varphi(x)U(\Lambda) = \Lambda^\mu{}_\rho\bar\partial^\rho\varphi(\Lambda^{-1}x) , \tag{2.27} $$

where the bar on a derivative means that it is with respect to the argument $\bar{x} = \Lambda^{-1}x$. Eq. (2.27) also implies

$$ U(\Lambda)^{-1}\partial^2\varphi(x)U(\Lambda) = \bar\partial^2\varphi(\Lambda^{-1}x) , \tag{2.28} $$

so that the Klein-Gordon equation, $(-\partial^2 + m^2c^2/\hbar^2)\varphi = 0$, is Lorentz invariant, as we saw in section 1.

Reference Notes

A detailed discussion of quantum Lorentz transformations can be found in Weinberg I.

Problems

2.1) Verify that eq. (2.8) follows from eq. (2.3).

2.2) Verify that eq. (2.14) follows from $U(\Lambda)^{-1}U(\Lambda')U(\Lambda) = U(\Lambda^{-1}\Lambda'\Lambda)$.

2.3) Verify that eq. (2.16) follows from eq. (2.14).

2.4) Verify that eq. (2.17) follows from eq. (2.16).

2.5) Verify that eq. (2.18) follows from eq. (2.15).

2.6) Verify that eq. (2.19) follows from eq. (2.18).

2.7) What property should be attributed to the translation operator $T(a)$ that could be used to prove eq. (2.20)?

2.8) a) Let $\Lambda = 1 + \delta\omega$ in eq. (2.26), and show that

$$ [\varphi(x), M^{\mu\nu}] = \mathcal{L}^{\mu\nu}\varphi(x) , \tag{2.29} $$

where

$$ \mathcal{L}^{\mu\nu} \equiv \frac{\hbar}{i}(x^\mu\partial^\nu - x^\nu\partial^\mu) . \tag{2.30} $$

b) Show that $[[\varphi(x), M^{\mu\nu}], M^{\rho\sigma}] = \mathcal{L}^{\mu\nu}\mathcal{L}^{\rho\sigma}\varphi(x)$.

c) Prove the Jacobi identity, $[[A, B], C] + [[B, C], A] + [[C, A], B] = 0$. Hint: write out all the commutators.

d) Use your results from parts (b) and (c) to show that

$$ [\varphi(x), [M^{\mu\nu}, M^{\rho\sigma}]] = (\mathcal{L}^{\mu\nu}\mathcal{L}^{\rho\sigma} - \mathcal{L}^{\rho\sigma}\mathcal{L}^{\mu\nu})\varphi(x) . \tag{2.31} $$

e) Simplify the right-hand side of eq. (2.31) as much as possible.

f) Use your results from part (e) to verify eq. (2.16), up to the possibility of a term on the right-hand side that commutes with $\varphi(x)$ and its derivatives. (Such a term, called a central charge, in fact does not arise for the Lorentz algebra.)

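2.9) Let us write

$$ \Lambda^\rho{}_\tau = \delta^\rho{}_\tau + \frac{i}{2\hbar}\,\delta\omega_{\mu\nu}(S_V^{\mu\nu})^\rho{}_\tau , \tag{2.32} $$

where

$$ (S_V^{\mu\nu})^\rho{}_\tau \equiv \frac{\hbar}{i}(g^{\mu\rho}\delta^\nu{}_\tau - g^{\nu\rho}\delta^\mu{}_\tau) \tag{2.33} $$

are matrices which constitute the vector representation of the Lorentz generators.

a) Let $\Lambda = 1 + \delta\omega$ in eq. (2.27), and show that

$$ [\partial^\rho\varphi(x), M^{\mu\nu}] = \mathcal{L}^{\mu\nu}\partial^\rho\varphi(x) + (S_V^{\mu\nu})^\rho{}_\tau\partial^\tau\varphi(x) . \tag{2.34} $$

b) Show that the matrices $S_V^{\mu\nu}$ must have the same commutation relations as the operators $M^{\mu\nu}$. Hint: see the previous problem.

c) For a rotation by an angle $\theta$ about the $z$ axis, we have

$$ \Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{2.35} $$

Show that

$$ \Lambda = \exp(-i\theta S_V^{12}/\hbar) . \tag{2.36} $$

d) For a boost by rapidity $\eta$ in the $z$ direction, we have

$$ \Lambda^\mu{}_\nu = \begin{pmatrix} \cosh\eta & 0 & 0 & \sinh\eta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \sinh\eta & 0 & 0 & \cosh\eta \end{pmatrix} . \tag{2.37} $$

Show that

$$ \Lambda = \exp(+i\eta S_V^{30}/\hbar) . \tag{2.38} $$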


3 Canonical Quantization of Scalar Fields
Prerequisite: 2

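Let us go back and drastically simplify the hamiltonian we constructed in section 1, reducing it to the hamiltonian for free particles:

$$ \begin{aligned} H &= \int d^3x\; a^\dagger(\mathbf{x})\left( -\frac{1}{2m}\nabla^2 \right)a(\mathbf{x}) \\ &= \int d^3p\; \frac{1}{2m}\mathbf{p}^2\,\tilde{a}^\dagger(\mathbf{p})\tilde{a}(\mathbf{p}) , \end{aligned} \tag{3.1} $$

where

$$ \tilde{a}(\mathbf{p}) = \int \frac{d^3x}{(2\pi)^{3/2}}\; e^{-i\mathbf{p}\cdot\mathbf{x}}\,a(\mathbf{x}) . \tag{3.2} $$

Here we have simplified our notation by setting

$$ \hbar = 1 . \tag{3.3} $$

The appropriate factors of $\hbar$ can always be restored in any of our formulas via dimensional analysis. The commutation (or anticommutation) relations of the $\tilde{a}(\mathbf{p})$ and $\tilde{a}^\dagger(\mathbf{p})$ operators are

$$ [\tilde{a}(\mathbf{p}), \tilde{a}(\mathbf{p}')]_\mp = 0 , \quad [\tilde{a}^\dagger(\mathbf{p}), \tilde{a}^\dagger(\mathbf{p}')]_\mp = 0 , \quad [\tilde{a}(\mathbf{p}), \tilde{a}^\dagger(\mathbf{p}')]_\mp = \delta^3(\mathbf{p} - \mathbf{p}') , \tag{3.4} $$

where $[A, B]_\mp$ is either the commutator (if we want a theory of bosons) or the anticommutator (if we want a theory of fermions). Thus $\tilde{a}^\dagger(\mathbf{p})$ can be interpreted as creating a state of definite momentum $\mathbf{p}$, and eq. (3.1) describes a theory of free particles. The ground state is the vacuum $|0\rangle$; it is annihilated by $\tilde{a}(\mathbf{p})$,

$$ \tilde{a}(\mathbf{p})|0\rangle = 0 , \tag{3.5} $$

and so its energy eigenvalue is zero. The other eigenstates of $H$ are all of the form $\tilde{a}^\dagger(\mathbf{p}_1) \ldots \tilde{a}^\dagger(\mathbf{p}_n)|0\rangle$, and the corresponding energy eigenvalue is $E(\mathbf{p}_1) + \ldots + E(\mathbf{p}_n)$, where $E(\mathbf{p}) = \frac{1}{2m}\mathbf{p}^2$.

It is easy to see how to generalize this theory to a relativistic one; all we need to do is use the relativistic energy formula $E(\mathbf{p}) = +(\mathbf{p}^2c^2 + m^2c^4)^{1/2}$:

$$ H = \int d^3p\; (\mathbf{p}^2c^2 + m^2c^4)^{1/2}\,\tilde{a}^\dagger(\mathbf{p})\tilde{a}(\mathbf{p}) . \tag{3.6} $$

Now we have a theory of free relativistic spin-zero particles, and they can be either bosons or fermions.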


Is this theory really Lorentz invariant? We will answer this question (in the aﬃrmative) in a very roundabout way: by constructing it again, from a rather diﬀerent point of view, a point of view that emphasizes Lorentz invariance from the beginning.
We will start with the classical physics of a real scalar ﬁeld ϕ(x). Real means that ϕ(x) assigns a real number to every point in spacetime. Scalar means that Alice [who uses coordinates xµ and calls the ﬁeld ϕ(x)] and Bob [who uses coordinates x¯µ, related to Alice’s coordinates by x¯µ = Λµνxν +aν, and calls the ﬁeld ϕ¯(x¯)], agree on the numerical value of the ﬁeld: ϕ(x) = ϕ¯(x¯). This then implies that the equation of motion for ϕ(x) must be the same as that for ϕ¯(x¯). We have already met an equation of this type: the Klein-Gordon equation,

$$ (-\partial^2 + m^2)\varphi(x) = 0 . \tag{3.7} $$

Here we have simplified our notation by setting

$$ c = 1 \tag{3.8} $$

in addition to $\hbar = 1$. As with $\hbar$, factors of $c$ can be restored, if desired, by dimensional analysis.
We will adopt eq. (3.7) as the equation of motion that we would like $\varphi(x)$ to obey. It should be emphasized at this point that we are doing classical physics of a real scalar field. We are not to think of $\varphi(x)$ as a quantum wave function. Thus, there should not be any factors of $\hbar$ in this version of the Klein-Gordon equation. This means that the parameter $m$ must have dimensions of inverse length; $m$ is not (yet) to be thought of as a mass.
The equation of motion can be derived from variation of an action $S = \int dt\, L$, where $L$ is the lagrangian. Since the Klein-Gordon equation is local, we expect that the lagrangian can be written as the space integral of a lagrangian density $\mathcal{L}$: $L = \int d^3x\, \mathcal{L}$. Thus, $S = \int d^4x\, \mathcal{L}$. The integration measure $d^4x$ is Lorentz invariant: if we change to coordinates $\bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu$, we have $d^4\bar{x} = |\det\Lambda|\, d^4x = d^4x$. Thus, for the action to be Lorentz invariant, the lagrangian density must be a Lorentz scalar: $\mathcal{L}(x) = \bar{\mathcal{L}}(\bar{x})$. Then we have $\bar{S} = \int d^4\bar{x}\, \bar{\mathcal{L}}(\bar{x}) = \int d^4x\, \mathcal{L}(x) = S$. Any simple function of $\varphi$ is a Lorentz scalar, and so are products of derivatives with all indices contracted, such as $\partial^\mu\varphi\,\partial_\mu\varphi$. We will take for $\mathcal{L}$

$$ \mathcal{L} = -\tfrac{1}{2}\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}m^2\varphi^2 + \Omega_0 , \tag{3.9} $$

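where $\Omega_0$ is an arbitrary constant. We find the equation of motion (also known as the Euler-Lagrange equation) by making an infinitesimal variation $\delta\varphi(x)$ in $\varphi(x)$, and requiring the corresponding variation of the action to vanish:

$$ \begin{aligned} 0 = \delta S &= \int d^4x \left[ -\tfrac{1}{2}\partial^\mu\delta\varphi\,\partial_\mu\varphi - \tfrac{1}{2}\partial^\mu\varphi\,\partial_\mu\delta\varphi - m^2\varphi\,\delta\varphi \right] \\ &= \int d^4x \left[ +\partial^\mu\partial_\mu\varphi - m^2\varphi \right] \delta\varphi . \end{aligned} \tag{3.10} $$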

In the last line, we have integrated by parts in each of the first two terms, putting both derivatives on $\varphi$. We assume $\delta\varphi(x)$ vanishes at infinity in any direction (spatial or temporal), so that there is no surface term. Since $\delta\varphi$ has an arbitrary $x$ dependence, eq. (3.10) can be true if and only if $(-\partial^2 + m^2)\varphi = 0$.
One solution of the Klein-Gordon equation is a plane wave of the form $\exp(i\mathbf{k}\cdot\mathbf{x} \pm i\omega t)$, where $\mathbf{k}$ is an arbitrary real wave-vector, and

$$ \omega = +(\mathbf{k}^2 + m^2)^{1/2} . \tag{3.11} $$
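The plane-wave claim can be checked numerically. The following is a minimal sketch, assuming one space dimension and the real part $\cos(kx - \omega t)$ of the plane wave; the function name `kg_residual`, the step size `h`, and the sample values are illustrative choices, not part of the text. It estimates $(\partial_t^2 - \partial_x^2 + m^2)\varphi$ by central finite differences, which is $(-\partial^2 + m^2)\varphi$ in the sign convention used here.

```python
import math

def kg_residual(k, m, x, t, omega=None, h=1e-3):
    """Finite-difference estimate of (d^2/dt^2 - d^2/dx^2 + m^2) phi
    for the 1+1 dimensional plane wave phi = cos(k x - omega t).

    If omega is not supplied, the on-shell value of eq. (3.11) is used.
    """
    if omega is None:
        omega = math.sqrt(k * k + m * m)

    def phi(xx, tt):
        return math.cos(k * xx - omega * tt)

    # Central second differences in time and in space.
    d2t = (phi(x, t + h) - 2.0 * phi(x, t) + phi(x, t - h)) / h ** 2
    d2x = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / h ** 2
    return d2t - d2x + m * m * phi(x, t)

# On shell the residual vanishes up to discretization error;
# with an arbitrary frequency it does not.
on_shell = kg_residual(k=1.3, m=0.7, x=0.4, t=2.1)
off_shell = kg_residual(k=1.3, m=0.7, x=0.4, t=2.1, omega=2.0)
```

With these sample values `on_shell` is zero to within the finite-difference error, while `off_shell` is of order one, illustrating that the frequency must satisfy eq. (3.11).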

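The general solution (assuming boundary conditions that require $\varphi$ to remain finite at spatial infinity) is then

$$ \varphi(\mathbf{x}, t) = \int \frac{d^3k}{f(k)} \left[ a(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x} - i\omega t} + b(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x} + i\omega t} \right] , \tag{3.12} $$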

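where $a(\mathbf{k})$ and $b(\mathbf{k})$ are arbitrary functions of the wave vector $\mathbf{k}$, and $f(k)$ is a redundant function of the magnitude of $\mathbf{k}$ which we have inserted for later convenience. Note that, if we were attempting to interpret $\varphi(x)$ as a quantum wave function (which we most definitely are not), then the second term would constitute the "negative energy" contributions to the wave function. This is because a plane-wave solution of the nonrelativistic Schrödinger equation for a single particle looks like $\exp(i\mathbf{p}\cdot\mathbf{x} - iE(\mathbf{p})t)$, with $E(\mathbf{p}) = \frac{1}{2m}\mathbf{p}^2$; there is a minus sign in front of the positive energy. We are trying to interpret eq. (3.12) as a real classical field, but this formula does not generically result in $\varphi$ being real. We must impose $\varphi^*(x) = \varphi(x)$, where

$$ \begin{aligned} \varphi^*(\mathbf{x}, t) &= \int \frac{d^3k}{f(k)} \left[ a^*(\mathbf{k})\, e^{-i\mathbf{k}\cdot\mathbf{x} + i\omega t} + b^*(\mathbf{k})\, e^{-i\mathbf{k}\cdot\mathbf{x} - i\omega t} \right] \\ &= \int \frac{d^3k}{f(k)} \left[ a^*(\mathbf{k})\, e^{-i\mathbf{k}\cdot\mathbf{x} + i\omega t} + b^*(-\mathbf{k})\, e^{+i\mathbf{k}\cdot\mathbf{x} - i\omega t} \right] . \end{aligned} \tag{3.13} $$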

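In the second term on the second line, we have changed the dummy integration variable from $\mathbf{k}$ to $-\mathbf{k}$. Comparing eqs. (3.12) and (3.13), we see that $\varphi^*(x) = \varphi(x)$ requires $b^*(-\mathbf{k}) = a(\mathbf{k})$. Imposing this condition, we can rewrite $\varphi$ as

$$ \begin{aligned} \varphi(\mathbf{x}, t) &= \int \frac{d^3k}{f(k)} \left[ a(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x} - i\omega t} + a^*(-\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x} + i\omega t} \right] \\ &= \int \frac{d^3k}{f(k)} \left[ a(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x} - i\omega t} + a^*(\mathbf{k})\, e^{-i\mathbf{k}\cdot\mathbf{x} + i\omega t} \right] \\ &= \int \frac{d^3k}{f(k)} \left[ a(\mathbf{k})\, e^{ikx} + a^*(\mathbf{k})\, e^{-ikx} \right] , \end{aligned} \tag{3.14} $$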

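where $kx = \mathbf{k}\cdot\mathbf{x} - \omega t$ is the Lorentz-invariant product of the four-vectors $x^\mu = (t, \mathbf{x})$ and $k^\mu = (\omega, \mathbf{k})$: $kx = k^\mu x_\mu = g_{\mu\nu}k^\mu x^\nu$. Note that

$$ k^2 = k^\mu k_\mu = \mathbf{k}^2 - \omega^2 = -m^2 . \tag{3.15} $$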
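The reality condition $b^*(-\mathbf{k}) = a(\mathbf{k})$ can be illustrated with a small numerical sketch. It assumes a discrete analogue of eq. (3.12): one space dimension, a finite set of wave numbers symmetric under $k \to -k$, and the measure factor $1/f(k)$ dropped. The names `field`, `modes`, `b_real`, and `b_generic` are hypothetical.

```python
import cmath
import math
import random

def field(x, t, a, b, m):
    """Discrete 1D analogue of eq. (3.12): a sum over a finite set of modes."""
    total = 0j
    for k in a:
        omega = math.sqrt(k * k + m * m)
        total += a[k] * cmath.exp(1j * (k * x - omega * t))
        total += b[k] * cmath.exp(1j * (k * x + omega * t))
    return total

random.seed(0)
modes = [-2.0, -1.0, 0.0, 1.0, 2.0]  # symmetric under k -> -k
a = {k: complex(random.gauss(0, 1), random.gauss(0, 1)) for k in modes}

# Reality condition b*(-k) = a(k), i.e. b(k) = a*(-k).
b_real = {k: a[-k].conjugate() for k in modes}
# Unrelated coefficients, for comparison.
b_generic = {k: complex(random.gauss(0, 1), random.gauss(0, 1)) for k in modes}

phi_real = field(0.3, 1.7, a, b_real, m=1.0)
phi_generic = field(0.3, 1.7, a, b_generic, m=1.0)
```

With the condition imposed, the imaginary part of `phi_real` vanishes to rounding error, because each term is paired with its complex conjugate; for unrelated coefficients the imaginary part of `phi_generic` is generically nonzero.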

A four-momentum $k^\mu$ that obeys $k^2 = -m^2$ is said to be on the mass shell, or on shell for short. It is now convenient to choose $f(k)$ so that $d^3k/f(k)$ is Lorentz invariant.
An integration measure that is manifestly invariant under orthochronous Lorentz transformations is $d^4k\,\delta(k^2+m^2)\,\theta(k^0)$, where $\delta(x)$ is the Dirac delta function, $\theta(x)$ is the unit step function, and $k^0$ is treated as an independent integration variable. We then have

$$ \int_{-\infty}^{+\infty} dk^0\, \delta(k^2+m^2)\,\theta(k^0) = \frac{1}{2\omega} . \tag{3.16} $$

Here we have used the rule

$$ \int_{-\infty}^{+\infty} dx\, \delta(g(x)) = \sum_i \frac{1}{|g'(x_i)|} , \tag{3.17} $$

where $g(x)$ is any smooth function of $x$ with simple zeros at $x = x_i$; in our case, the only zero is at $k^0 = \omega$.
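Eq. (3.16) can be checked numerically by replacing the delta function with a narrow normalized Gaussian. This is only a sketch under that smearing assumption; the width `eps`, the integration range, the step count, and the function names are illustrative choices. The step function is implemented by integrating over $k^0 > 0$ only, which excludes the zero at $k^0 = -\omega$.

```python
import math

def smeared_delta(u, eps):
    """Narrow normalized Gaussian standing in for the Dirac delta function."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def mass_shell_integral(kvec_sq, m, eps=1e-2, steps=200_000):
    """Midpoint-rule estimate of the k0 integral in eq. (3.16).

    Uses k^2 + m^2 = -k0^2 + omega^2 with omega^2 = kvec_sq + m^2.
    Returns the numerical estimate and the expected value 1/(2 omega).
    """
    omega = math.sqrt(kvec_sq + m * m)
    upper = 2.0 * omega  # the smeared delta is negligible beyond this point
    dk0 = upper / steps
    total = 0.0
    for i in range(steps):
        k0 = (i + 0.5) * dk0
        total += smeared_delta(-k0 * k0 + omega * omega, eps) * dk0
    return total, 1.0 / (2.0 * omega)

estimate, expected = mass_shell_integral(kvec_sq=3.0, m=1.0)
```

For these values $\omega = 2$, so `expected` is 0.25, and `estimate` agrees with it up to corrections that shrink as `eps` is reduced.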
Thus we see that if we take $f(k) \propto \omega$, then $d^3k/f(k)$ will be Lorentz invariant. We will take $f(k) = (2\pi)^3 2\omega$. It is then convenient to give the corresponding Lorentz-invariant differential its own name:

$$ \widetilde{dk} \equiv \frac{d^3k}{(2\pi)^3 2\omega} . \tag{3.18} $$

Thus we finally have

$$ \varphi(x) = \int \widetilde{dk} \left[ a(\mathbf{k})\, e^{ikx} + a^*(\mathbf{k})\, e^{-ikx} \right] . \tag{3.19} $$


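We can also invert this formula to get $a(\mathbf{k})$ in terms of $\varphi(x)$. We have

$$ \begin{aligned} \int d^3x\, e^{-ikx}\varphi(x) &= \frac{1}{2\omega}\, a(\mathbf{k}) + \frac{1}{2\omega}\, e^{2i\omega t} a^*(-\mathbf{k}) , \\ \int d^3x\, e^{-ikx}\partial_0\varphi(x) &= -\frac{i}{2}\, a(\mathbf{k}) + \frac{i}{2}\, e^{2i\omega t} a^*(-\mathbf{k}) . \end{aligned} \tag{3.20} $$

We can combine these to get

$$ \begin{aligned} a(\mathbf{k}) &= \int d^3x\, e^{-ikx} \left[ i\partial_0\varphi(x) + \omega\varphi(x) \right] \\ &= i \int d^3x\, e^{-ikx} \overleftrightarrow{\partial_0}\, \varphi(x) , \end{aligned} \tag{3.21} $$

where $f\overleftrightarrow{\partial_\mu}g \equiv f(\partial_\mu g) - (\partial_\mu f)g$, and $\partial_0\varphi = \partial\varphi/\partial t = \dot\varphi$. Note that $a(\mathbf{k})$ is time independent.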
Now that we have the lagrangian, we can construct the hamiltonian by the usual rules. Recall that, given a lagrangian $L(q_i, \dot q_i)$ as a function of some coordinates $q_i$ and their time derivatives $\dot q_i$, the conjugate momenta are given by $p_i = \partial L/\partial\dot q_i$, and the hamiltonian by $H = \sum_i p_i\dot q_i - L$. In our case, the role of $q_i(t)$ is played by $\varphi(\mathbf{x}, t)$, with $\mathbf{x}$ playing the role of a (continuous) index. The appropriate generalizations are then

$$ \Pi(x) = \frac{\partial\mathcal{L}}{\partial\dot\varphi(x)} \tag{3.22} $$

and

$$ \mathcal{H} = \Pi\dot\varphi - \mathcal{L} , \tag{3.23} $$

where $\mathcal{H}$ is the hamiltonian density, and the hamiltonian itself is $H = \int d^3x\, \mathcal{H}$. In our case, we have

$$ \Pi(x) = \dot\varphi(x) \tag{3.24} $$

and

$$ \mathcal{H} = \tfrac{1}{2}\Pi^2 + \tfrac{1}{2}(\nabla\varphi)^2 + \tfrac{1}{2}m^2\varphi^2 - \Omega_0 . \tag{3.25} $$

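Using eq. (3.19), we can write $H$ in terms of the $a(\mathbf{k})$ and $a^*(\mathbf{k})$ coefficients:

$$ \begin{aligned} H &= -\Omega_0 V + \tfrac{1}{2} \int \widetilde{dk}\, \widetilde{dk}'\, d^3x\, \Big[ \left( -i\omega\, a(\mathbf{k}) e^{ikx} + i\omega\, a^*(\mathbf{k}) e^{-ikx} \right) \left( -i\omega'\, a(\mathbf{k}') e^{ik'x} + i\omega'\, a^*(\mathbf{k}') e^{-ik'x} \right) \\ &\qquad\qquad + \left( +i\mathbf{k}\, a(\mathbf{k}) e^{ikx} - i\mathbf{k}\, a^*(\mathbf{k}) e^{-ikx} \right) \cdot \left( +i\mathbf{k}'\, a(\mathbf{k}') e^{ik'x} - i\mathbf{k}'\, a^*(\mathbf{k}') e^{-ik'x} \right) \\ &\qquad\qquad + m^2 \left( a(\mathbf{k}) e^{ikx} + a^*(\mathbf{k}) e^{-ikx} \right) \left( a(\mathbf{k}') e^{ik'x} + a^*(\mathbf{k}') e^{-ik'x} \right) \Big] \\ &= -\Omega_0 V + \tfrac{1}{2} (2\pi)^3 \int \widetilde{dk}\, \widetilde{dk}'\, \Big[ \delta^3(\mathbf{k} - \mathbf{k}') (+\omega\omega' + \mathbf{k}\cdot\mathbf{k}' + m^2) \\ &\qquad\qquad\qquad \times \left( a^*(\mathbf{k}) a(\mathbf{k}') e^{-i(\omega-\omega')t} + a(\mathbf{k}) a^*(\mathbf{k}') e^{+i(\omega-\omega')t} \right) \\ &\qquad\qquad + \delta^3(\mathbf{k} + \mathbf{k}') (-\omega\omega' - \mathbf{k}\cdot\mathbf{k}' + m^2) \\ &\qquad\qquad\qquad \times \left( a(\mathbf{k}) a(\mathbf{k}') e^{-i(\omega+\omega')t} + a^*(\mathbf{k}) a^*(\mathbf{k}') e^{+i(\omega+\omega')t} \right) \Big] \\ &= -\Omega_0 V + \tfrac{1}{2} \int \widetilde{dk}\, \frac{1}{2\omega} \Big[ (+\omega^2 + \mathbf{k}^2 + m^2) \left( a^*(\mathbf{k}) a(\mathbf{k}) + a(\mathbf{k}) a^*(\mathbf{k}) \right) \\ &\qquad\qquad + (-\omega^2 + \mathbf{k}^2 + m^2) \left( a(\mathbf{k}) a(-\mathbf{k}) e^{-2i\omega t} + a^*(\mathbf{k}) a^*(-\mathbf{k}) e^{+2i\omega t} \right) \Big] \\ &= -\Omega_0 V + \tfrac{1}{2} \int \widetilde{dk}\, \omega \left( a^*(\mathbf{k}) a(\mathbf{k}) + a(\mathbf{k}) a^*(\mathbf{k}) \right) , \end{aligned} \tag{3.26} $$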

where $V$ is the volume of space. To get the second equality, we used

$$ \int d^3x\, e^{i\mathbf{q}\cdot\mathbf{x}} = (2\pi)^3 \delta^3(\mathbf{q}) . \tag{3.27} $$

To get the third equality, we integrated over $\mathbf{k}'$, using $\widetilde{dk}' = d^3k'/(2\pi)^3 2\omega'$. The last equality then follows from $\omega = (\mathbf{k}^2+m^2)^{1/2}$. Also, we were careful to keep the ordering of $a(\mathbf{k})$ and $a^*(\mathbf{k})$ unchanged throughout, in anticipation of passing to the quantum theory where these classical functions will become operators that may not commute.
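The last step of eq. (3.26) rests on two small pieces of algebra, which the sketch below checks for sample numbers: once $\omega^2 = \mathbf{k}^2 + m^2$, the coefficient of the number-conserving terms reduces to $\omega$ and the coefficient of the time-dependent terms vanishes. The function name and the sample values are illustrative assumptions.

```python
import math

def mode_coefficients(k_sq, m):
    """Coefficients appearing in the third line of eq. (3.26), per mode.

    diagonal     multiplies a*(k)a(k) + a(k)a*(k)
    off_diagonal multiplies the a(k)a(-k) and a*(k)a*(-k) terms
    Both include the factor 1/(2 omega) left over from the k' integration.
    """
    omega = math.sqrt(k_sq + m * m)
    diagonal = (omega ** 2 + k_sq + m * m) / (2.0 * omega)
    off_diagonal = (-omega ** 2 + k_sq + m * m) / (2.0 * omega)
    return omega, diagonal, off_diagonal

omega, diagonal, off_diagonal = mode_coefficients(k_sq=3.0, m=1.0)
```

Here `diagonal` equals `omega` and `off_diagonal` is zero, which is why the explicit time dependence drops out of $H$.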
Let us take up the quantum theory now. We can go from classical to quantum mechanics via canonical quantization. This means that we promote $q_i$ and $p_i$ to operators, with commutation relations $[q_i, q_j] = 0$, $[p_i, p_j] = 0$, and $[q_i, p_j] = i\hbar\delta_{ij}$. In the Heisenberg picture, these operators should be taken at equal times. In our case, where the "index" is continuous (and we have set $\hbar = 1$), we have

$$ [\varphi(\mathbf{x}, t), \varphi(\mathbf{x}', t)] = 0 , \quad [\Pi(\mathbf{x}, t), \Pi(\mathbf{x}', t)] = 0 , \quad [\varphi(\mathbf{x}, t), \Pi(\mathbf{x}', t)] = i\delta^3(\mathbf{x} - \mathbf{x}') . \tag{3.28} $$

From these canonical commutation relations, and from eqs. (3.21) and (3.24), we can deduce

$$ [a(\mathbf{k}), a(\mathbf{k}')] = 0 , \quad [a^\dagger(\mathbf{k}), a^\dagger(\mathbf{k}')] = 0 , \quad [a(\mathbf{k}), a^\dagger(\mathbf{k}')] = (2\pi)^3 2\omega\, \delta^3(\mathbf{k} - \mathbf{k}') . \tag{3.29} $$

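We are now denoting $a^*(\mathbf{k})$ as $a^\dagger(\mathbf{k})$, since $a^\dagger(\mathbf{k})$ is now the hermitian conjugate (rather than the complex conjugate) of the operator $a(\mathbf{k})$. We can now rewrite the hamiltonian as

$$ H = \int \widetilde{dk}\, \omega\, a^\dagger(\mathbf{k}) a(\mathbf{k}) + (\mathcal{E}_0 - \Omega_0) V , \tag{3.30} $$

where

$$ \mathcal{E}_0 = \tfrac{1}{2} (2\pi)^{-3} \int d^3k\, \omega \tag{3.31} $$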

is the total zero-point energy of all the oscillators per unit volume, and, using eq. (3.27), we have interpreted $(2\pi)^3\delta^3(\mathbf{0})$ as the volume of space $V$.
If we integrate in eq. (3.31) over the whole range of $\mathbf{k}$, the value of $\mathcal{E}_0$ is infinite. If we integrate only up to a maximum value of $\Lambda$, a number known as the ultraviolet cutoff, we find

$$ \mathcal{E}_0 = \frac{\Lambda^4}{16\pi^2} , \tag{3.32} $$

where we have assumed Λ ≫ m. This is physically justiﬁed if, in the real world, the formalism of quantum ﬁeld theory breaks down at some large energy scale. For now, we simply note that the value of Ω0 is arbitrary, and so we are free to choose Ω0 = E0. With this choice, the ground state has energy eigenvalue zero. Now, if we like, we can take the limit Λ → ∞, with no further consequences. (We will meet more of these ultraviolet divergences after we introduce interactions.)
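The cutoff estimate of eq. (3.32) can be compared with a direct numerical evaluation of eq. (3.31). The sketch below assumes a sharp cutoff $|\mathbf{k}| < \Lambda$ and a simple midpoint rule; the function name, the step count, and the values $\Lambda = 100$, $m = 1$ are illustrative. After the angular integration, the prefactor $\tfrac{1}{2}(2\pi)^{-3} \cdot 4\pi$ becomes $1/4\pi^2$.

```python
import math

def zero_point_energy_density(cutoff, m, steps=200_000):
    """Midpoint-rule estimate of eq. (3.31) restricted to |k| < cutoff.

    Evaluates (1 / (4 pi^2)) * integral_0^cutoff dk k^2 sqrt(k^2 + m^2).
    """
    dk = cutoff / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * dk
        total += k * k * math.sqrt(k * k + m * m) * dk
    return total / (4.0 * math.pi ** 2)

cutoff, m = 100.0, 1.0
numeric = zero_point_energy_density(cutoff, m)
leading = cutoff ** 4 / (16.0 * math.pi ** 2)  # eq. (3.32)
relative_gap = (numeric - leading) / leading
```

For these values `relative_gap` is close to $m^2/\Lambda^2 = 10^{-4}$, which indicates the size of the terms neglected in eq. (3.32) when $\Lambda \gg m$.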
The hamiltonian of eq. (3.30) is now the same as that of eq. (3.6), with $a(\mathbf{k}) = [(2\pi)^3 2\omega]^{1/2}\, \tilde{a}(\mathbf{k})$, where $\tilde{a}(\mathbf{k})$ denotes the operator called $a(\mathbf{p})$ in eqs. (3.4) and (3.6). The commutation relations (3.4) and (3.29) are also equivalent, if we choose commutators (rather than anticommutators) in eq. (3.4). Thus, we have re-derived the hamiltonian of free relativistic bosons by quantization of a scalar field whose equation of motion is the Klein-Gordon equation. The parameter $m$ in the lagrangian is now seen to be the mass of the particle in the quantum theory. (More precisely, since $m$ has dimensions of inverse length, the particle mass is $\hbar c m$.)
What if we want fermions? Then we should use anticommutators in eqs. (3.28) and (3.29). There is a problem, though; eq. (3.26) does not then become eq. (3.30). Instead, we get H = −Ω0V , a simple constant. Clearly there is something wrong with using anticommutators. This is another hint of the spin-statistics theorem, which we will take up in section 4.
Next, we would like to add Lorentz-invariant interactions to our theory. With the formalism we have developed, this is easy to do. Any local function of ϕ(x) is a Lorentz scalar, and so if we add a term like ϕ3 or ϕ4 to the lagrangian density L, the resulting action will still be Lorentz invariant. Now, however, we will have interactions among the particles. Our next task is to deduce the consequences of these interactions.


However, we already have enough tools at our disposal to prove the spin-statistics theorem for spin-zero particles, and that is what we turn to next.

Problems

3.1) Derive eq. (3.29) from eqs. (3.21), (3.24), and (3.28).

3.2) Use the commutation relations, eq. (3.29), to show explicitly that a state of the form

$$ |k_1 \ldots k_n\rangle \equiv a^\dagger(\mathbf{k}_1) \ldots a^\dagger(\mathbf{k}_n)|0\rangle \tag{3.33} $$

is an eigenstate of the hamiltonian, eq. (3.30), with eigenvalue $\omega_1 + \ldots + \omega_n$. The vacuum $|0\rangle$ is annihilated by $a(\mathbf{k})$, $a(\mathbf{k})|0\rangle = 0$, and we take $\Omega_0 = \mathcal{E}_0$ in eq. (3.30).
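3.3) Use $U(\Lambda)^{-1}\varphi(x)U(\Lambda) = \varphi(\Lambda^{-1}x)$ to show that

$$ U(\Lambda)^{-1} a(\mathbf{k}) U(\Lambda) = a(\Lambda^{-1}\mathbf{k}) , \quad U(\Lambda)^{-1} a^\dagger(\mathbf{k}) U(\Lambda) = a^\dagger(\Lambda^{-1}\mathbf{k}) , \tag{3.34} $$

and hence that

$$ U(\Lambda)|k_1 \ldots k_n\rangle = |\Lambda k_1 \ldots \Lambda k_n\rangle , \tag{3.35} $$

where $|k_1 \ldots k_n\rangle = a^\dagger(\mathbf{k}_1) \ldots a^\dagger(\mathbf{k}_n)|0\rangle$ is a state of $n$ particles with momenta $k_1, \ldots, k_n$.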

3.4) Recall that $T(a)^{-1}\varphi(x)T(a) = \varphi(x - a)$, where $T(a) \equiv \exp(-iP^\mu a_\mu)$ is the spacetime translation operator, and $P^0$ is identified as the hamiltonian $H$.
a) Let $a^\mu$ be infinitesimal, and derive an expression for $[\varphi(x), P^\mu]$.

b) Show that the time component of your result is equivalent to the Heisenberg equation of motion $i\dot\varphi = [\varphi, H]$.

c) For a free field, use the Heisenberg equation to derive the Klein-Gordon equation.

d) Define a spatial momentum operator

$$ \mathbf{P} \equiv -\int d^3x\, \Pi(x) \nabla\varphi(x) . \tag{3.36} $$

Use the canonical commutation relations to show that $\mathbf{P}$ obeys the relation you derived in part (a).
e) Express $\mathbf{P}$ in terms of $a(\mathbf{k})$ and $a^\dagger(\mathbf{k})$.


3.5) Consider a complex (that is, nonhermitian) scalar field $\varphi$ with lagrangian density

$$ \mathcal{L} = -\partial^\mu\varphi^\dagger \partial_\mu\varphi - m^2\varphi^\dagger\varphi + \Omega_0 . \tag{3.37} $$

a) Show that $\varphi$ obeys the Klein-Gordon equation.
b) Treat $\varphi$ and $\varphi^\dagger$ as independent fields, and find the conjugate momentum for each. Compute the hamiltonian density in terms of these conjugate momenta and the fields themselves (but not their time derivatives).
c) Write the mode expansion of $\varphi$ as

$$ \varphi(x) = \int \widetilde{dk} \left[ a(\mathbf{k})\, e^{ikx} + b^\dagger(\mathbf{k})\, e^{-ikx} \right] . \tag{3.38} $$

Express $a(\mathbf{k})$ and $b(\mathbf{k})$ in terms of $\varphi$ and $\varphi^\dagger$ and their time derivatives.
d) Assuming canonical commutation relations for the fields and their conjugate momenta, find the commutation relations obeyed by $a(\mathbf{k})$ and $b(\mathbf{k})$ and their hermitian conjugates.
e) Express the hamiltonian in terms of $a(\mathbf{k})$ and $b(\mathbf{k})$ and their hermitian conjugates. What value must $\Omega_0$ have in order for the ground state to have zero energy?

4: The Spin-Statistics Theorem

45

4 The Spin-Statistics Theorem
Prerequisite: 3

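Let us consider a theory of free, spin-zero particles specified by the hamiltonian

$$ H_0 = \int \widetilde{dk}\, \omega\, a^\dagger(\mathbf{k}) a(\mathbf{k}) , \tag{4.1} $$

where $\omega = (\mathbf{k}^2 + m^2)^{1/2}$, and either the commutation or anticommutation relations

$$ [a(\mathbf{k}), a(\mathbf{k}')]_\mp = 0 , \quad [a^\dagger(\mathbf{k}), a^\dagger(\mathbf{k}')]_\mp = 0 , \quad [a(\mathbf{k}), a^\dagger(\mathbf{k}')]_\mp = (2\pi)^3 2\omega\, \delta^3(\mathbf{k} - \mathbf{k}') . \tag{4.2} $$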

Of course, if we want a theory of bosons, we should use commutators, and if we want fermions, we should use anticommutators.
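Now let us consider adding terms to the hamiltonian that will result in local, Lorentz invariant interactions. In order to do this, it is convenient to define a nonhermitian field,

$$ \varphi^+(\mathbf{x}, 0) \equiv \int \widetilde{dk}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, a(\mathbf{k}) , \tag{4.3} $$

and its hermitian conjugate

$$ \varphi^-(\mathbf{x}, 0) \equiv \int \widetilde{dk}\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, a^\dagger(\mathbf{k}) . \tag{4.4} $$

These are then time-evolved with $H_0$:

$$ \begin{aligned} \varphi^+(\mathbf{x}, t) &= e^{iH_0t} \varphi^+(\mathbf{x}, 0) e^{-iH_0t} = \int \widetilde{dk}\, e^{ikx}\, a(\mathbf{k}) , \\ \varphi^-(\mathbf{x}, t) &= e^{iH_0t} \varphi^-(\mathbf{x}, 0) e^{-iH_0t} = \int \widetilde{dk}\, e^{-ikx}\, a^\dagger(\mathbf{k}) . \end{aligned} \tag{4.5} $$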

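Note that the usual hermitian free field $\varphi(x)$ is just the sum of these: $\varphi(x) = \varphi^+(x) + \varphi^-(x)$.
For a proper orthochronous Lorentz transformation $\Lambda$, we have

$$ U(\Lambda)^{-1} \varphi(x) U(\Lambda) = \varphi(\Lambda^{-1}x) . \tag{4.6} $$

This implies that the particle creation and annihilation operators transform as

$$ U(\Lambda)^{-1} a(\mathbf{k}) U(\Lambda) = a(\Lambda^{-1}\mathbf{k}) , \quad U(\Lambda)^{-1} a^\dagger(\mathbf{k}) U(\Lambda) = a^\dagger(\Lambda^{-1}\mathbf{k}) . \tag{4.7} $$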


This, in turn, implies that $\varphi^+(x)$ and $\varphi^-(x)$ are Lorentz scalars:

$$ U(\Lambda)^{-1} \varphi^\pm(x) U(\Lambda) = \varphi^\pm(\Lambda^{-1}x) . \tag{4.8} $$

We will then have local, Lorentz invariant interactions if we take the interaction lagrangian density $\mathcal{L}_1$ to be a hermitian function of $\varphi^+(x)$ and $\varphi^-(x)$.
To proceed we need to recall some facts about time-dependent perturbation theory in quantum mechanics. The transition amplitude $\mathcal{T}_{f\leftarrow i}$ to start with an initial state $|i\rangle$ at time $t = -\infty$ and end with a final state $|f\rangle$ at time $t = +\infty$ is

$$ \mathcal{T}_{f\leftarrow i} = \langle f|\, \mathrm{T} \exp\left[ -i \int_{-\infty}^{+\infty} dt\, H_I(t) \right] |i\rangle , \tag{4.9} $$

where $H_I(t)$ is the perturbing hamiltonian in the interaction picture,

$$ H_I(t) = \exp(+iH_0t)\, H_1 \exp(-iH_0t) , \tag{4.10} $$

$H_1$ is the perturbing hamiltonian in the Schrödinger picture, and T is the time ordering symbol: a product of operators to its right is to be ordered, not as written, but with operators at later times to the left of those at earlier times. We write $H_1 = \int d^3x\, \mathcal{H}_1(\mathbf{x}, 0)$, and specify $\mathcal{H}_1(\mathbf{x}, 0)$ as a hermitian function of $\varphi^+(\mathbf{x}, 0)$ and $\varphi^-(\mathbf{x}, 0)$. Then, using eqs. (4.5) and (4.10), we can see that, in the interaction picture, the perturbing hamiltonian density $\mathcal{H}_I(\mathbf{x}, t)$ is simply given by the same function of $\varphi^+(\mathbf{x}, t)$ and $\varphi^-(\mathbf{x}, t)$.
Now we come to the key point: for the transition amplitude $\mathcal{T}_{f\leftarrow i}$ to be Lorentz invariant, the time ordering must be frame independent. The time ordering of two spacetime points $x$ and $x'$ is frame independent if their separation is timelike; this means that the interval between them is negative, $(x-x')^2 < 0$. Two spacetime points whose separation is spacelike, $(x-x')^2 > 0$, can have different temporal ordering in different frames. In order to avoid $\mathcal{T}_{f\leftarrow i}$ being different in different frames, we must then require

$$ [\mathcal{H}_I(x), \mathcal{H}_I(x')] = 0 \quad \text{whenever} \quad (x - x')^2 > 0 . \tag{4.11} $$

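Obviously, $[\varphi^+(x), \varphi^+(x')]_\mp = [\varphi^-(x), \varphi^-(x')]_\mp = 0$. However,

$$ \begin{aligned} [\varphi^+(x), \varphi^-(x')]_\mp &= \int \widetilde{dk}\, \widetilde{dk}'\, e^{i(kx - k'x')} [a(\mathbf{k}), a^\dagger(\mathbf{k}')]_\mp \\ &= \int \widetilde{dk}\, e^{ik(x - x')} \\ &= \frac{m}{4\pi^2 r} K_1(mr) \\ &\equiv C(r) . \end{aligned} \tag{4.12} $$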


In the next-to-last line, we have taken $(x - x')^2 = r^2 > 0$, and $K_1(z)$ is a modified Bessel function. (This Lorentz-invariant integral is most easily evaluated in the frame where $t' = t$.) The function $C(r)$ is not zero for any $r > 0$. (Not even when $m = 0$; in this case, $C(r) = 1/4\pi^2r^2$.) On the other hand, $\mathcal{H}_I(x)$ must involve both $\varphi^+(x)$ and $\varphi^-(x)$, by hermiticity. Thus, generically, we will not be able to satisfy eq. (4.11).
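The two properties of $C(r)$ quoted above can be explored numerically. The sketch below evaluates $K_1(z)$ from the standard integral representation $K_1(z) = \int_0^\infty e^{-z\cosh t}\cosh t\, dt$ (valid for $z > 0$) with a midpoint rule; the function names, the truncation `t_max`, and the step count are illustrative choices rather than anything prescribed by the text.

```python
import math

def bessel_k1(z, t_max=25.0, steps=50_000):
    """Modified Bessel function K_1(z) for z > 0, via the representation
    K_1(z) = integral_0^infinity exp(-z cosh t) cosh t dt (midpoint rule)."""
    dt = t_max / steps
    total = 0.0
    for i in range(steps):
        c = math.cosh((i + 0.5) * dt)
        total += math.exp(-z * c) * c * dt
    return total

def C(r, m):
    """Eq. (4.12) for spacelike separation r > 0 and mass m > 0."""
    return m / (4.0 * math.pi ** 2 * r) * bessel_k1(m * r)

k1_at_one = bessel_k1(1.0)
# Small-mass check: 4 pi^2 r^2 C(r) should approach 1 as m -> 0.
small_mass_ratio = 4.0 * math.pi ** 2 * C(1.0, 1e-3)
# C(r) stays positive at larger separations, although it becomes small.
far_value = C(5.0, 1.0)
```

The small-mass ratio is close to one, consistent with $C(r) \to 1/4\pi^2r^2$, and `far_value` is small but positive, consistent with the statement that $C(r)$ does not vanish for any $r > 0$.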
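To resolve this problem, let us try using only particular linear combinations of $\varphi^+(x)$ and $\varphi^-(x)$. Define

$$ \varphi_\lambda(x) \equiv \varphi^+(x) + \lambda\varphi^-(x) , \quad \varphi_\lambda^\dagger(x) \equiv \varphi^-(x) + \lambda^*\varphi^+(x) , \tag{4.13} $$

where $\lambda$ is an arbitrary complex number. We then have

$$ \begin{aligned} [\varphi_\lambda(x), \varphi_\lambda^\dagger(x')]_\mp &= [\varphi^+(x), \varphi^-(x')]_\mp + |\lambda|^2 [\varphi^-(x), \varphi^+(x')]_\mp \\ &= (1 \mp |\lambda|^2)\, C(r) \end{aligned} \tag{4.14} $$

and

$$ \begin{aligned} [\varphi_\lambda(x), \varphi_\lambda(x')]_\mp &= \lambda [\varphi^+(x), \varphi^-(x')]_\mp + \lambda [\varphi^-(x), \varphi^+(x')]_\mp \\ &= \lambda (1 \mp 1)\, C(r) . \end{aligned} \tag{4.15} $$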

Thus, if we want $\varphi_\lambda(x)$ to either commute or anticommute with both $\varphi_\lambda(x')$ and $\varphi_\lambda^\dagger(x')$ at spacelike separations, we must choose $|\lambda| = 1$, and we must choose commutators. Then (and only then), we can build a suitable $\mathcal{H}_I(x)$ by making it a hermitian function of $\varphi_\lambda(x)$.
But this has simply returned us to the theory of a real scalar field, because, for $\lambda = e^{i\alpha}$, $e^{-i\alpha/2}\varphi_\lambda(x)$ is hermitian. In fact, if we make the replacements $a(\mathbf{k}) \to e^{+i\alpha/2}a(\mathbf{k})$ and $a^\dagger(\mathbf{k}) \to e^{-i\alpha/2}a^\dagger(\mathbf{k})$, then the commutation relations of eq. (4.2) are unchanged, and $e^{-i\alpha/2}\varphi_\lambda(x) = \varphi(x) = \varphi^+(x) + \varphi^-(x)$. Thus, our attempt to start with the creation and annihilation operators $a^\dagger(\mathbf{k})$ and $a(\mathbf{k})$ as the fundamental objects has simply led us back to the real, commuting, scalar field $\varphi(x)$ as the fundamental object.
Let us return to thinking of $\varphi(x)$ as fundamental, with a lagrangian density given by some function of the Lorentz scalars $\varphi(x)$ and $\partial^\mu\varphi(x)\partial_\mu\varphi(x)$. Then, quantization will result in $[\varphi(x), \varphi(x')]_\mp = 0$ for $t = t'$. If we choose anticommutators, then $[\varphi(x)]^2 = 0$ and $[\partial_\mu\varphi(x)]^2 = 0$, resulting in a trivial $\mathcal{L}$ that is at most linear in $\varphi$, and independent of $\dot\varphi$. This clearly does not lead to the correct physics.
This situation turns out to generalize to fields of higher spin, in any number of spacetime dimensions. One choice of quantization (commutators or anticommutators) always leads to a trivial $\mathcal{L}$, and so this choice is disallowed. Furthermore, the allowed choice is always commutators for fields of integer spin, and anticommutators for fields of half-integer spin. If we try treating the particle creation and annihilation operators as fundamental, rather than the fields, we find a situation similar to that of the spin-zero case, and are led to the reconstruction of a field that must obey the appropriate quantization scheme.
Reference Notes
This discussion of the spin-statistics theorem follows that of Weinberg I, which has more details.
Problems
4.1) Verify eq. (4.12). Verify its limit as m → 0.

5: The LSZ Reduction Formula

49

5 The LSZ Reduction Formula
Prerequisite: 3

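Let us now consider how to construct appropriate initial and final states for scattering experiments. In the free theory, we can create a state of one particle by acting on the vacuum state with a creation operator

$$ |k\rangle = a^\dagger(\mathbf{k})|0\rangle , \tag{5.1} $$

where

$$ a^\dagger(\mathbf{k}) = -i \int d^3x\, e^{ikx} \overleftrightarrow{\partial_0}\, \varphi(x) . \tag{5.2} $$

The vacuum state $|0\rangle$ is annihilated by every $a(\mathbf{k})$,

$$ a(\mathbf{k})|0\rangle = 0 , \tag{5.3} $$

and has unit norm,

$$ \langle 0|0\rangle = 1 . \tag{5.4} $$

The one-particle state $|k\rangle$ then has the Lorentz-invariant normalization

$$ \langle k|k'\rangle = (2\pi)^3 2\omega\, \delta^3(\mathbf{k} - \mathbf{k}') , \tag{5.5} $$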

where $\omega = (\mathbf{k}^2 + m^2)^{1/2}$.
Next, let us define a time-independent operator that (in the free theory) creates a particle localized in momentum space near $\mathbf{k}_1$, and localized in position space near the origin:

$$ a_1^\dagger \equiv \int d^3k\, f_1(\mathbf{k})\, a^\dagger(\mathbf{k}) , \tag{5.6} $$

where

$$ f_1(\mathbf{k}) \propto \exp[-(\mathbf{k} - \mathbf{k}_1)^2/4\sigma^2] \tag{5.7} $$

is an appropriate wave packet, and $\sigma$ is its width in momentum space.
Consider the state $a_1^\dagger|0\rangle$. If we time evolve this state in the Schrödinger picture, the wave packet will propagate (and spread out). The particle is thus localized far from the origin as $t \to \pm\infty$. If we consider instead a state of the form $a_1^\dagger a_2^\dagger|0\rangle$, where $\mathbf{k}_1 \ne \mathbf{k}_2$, then the two particles are widely separated in the far past.
Let us guess that this still works in the interacting theory. One complication is that $a^\dagger(\mathbf{k})$ will no longer be time independent, and so $a_1^\dagger$, eq. (5.6), becomes time dependent as well. Our guess for a suitable initial state of a scattering experiment is then

$$ |i\rangle = \lim_{t\to-\infty} a_1^\dagger(t)\, a_2^\dagger(t)|0\rangle . \tag{5.8} $$


By appropriately normalizing the wave packets, we can make $\langle i|i\rangle = 1$, and we will assume that this is the case. Similarly, we can consider a final state

$$ |f\rangle = \lim_{t\to+\infty} a_{1'}^\dagger(t)\, a_{2'}^\dagger(t)|0\rangle , \tag{5.9} $$

where $\mathbf{k}'_1 \ne \mathbf{k}'_2$, and $\langle f|f\rangle = 1$. This describes two widely separated particles in the far future. (We could also consider acting with more creation operators, if we are interested in the production of some extra particles in the collision of two.) Now the scattering amplitude is simply given by $\langle f|i\rangle$.
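We need to find a more useful expression for $\langle f|i\rangle$. To this end, let us note that

$$ \begin{aligned} a_1^\dagger(+\infty) - a_1^\dagger(-\infty) &= \int_{-\infty}^{+\infty} dt\, \partial_0 a_1^\dagger(t) \\ &= -i \int d^3k\, f_1(\mathbf{k}) \int d^4x\, \partial_0 \left( e^{ikx} \overleftrightarrow{\partial_0}\, \varphi(x) \right) \\ &= -i \int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx} (\partial_0^2 + \omega^2) \varphi(x) \\ &= -i \int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx} (\partial_0^2 + \mathbf{k}^2 + m^2) \varphi(x) \\ &= -i \int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx} (\partial_0^2 - \overleftarrow{\nabla}^2 + m^2) \varphi(x) \\ &= -i \int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx} (\partial_0^2 - \overrightarrow{\nabla}^2 + m^2) \varphi(x) \\ &= -i \int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx} (-\partial^2 + m^2) \varphi(x) . \end{aligned} \tag{5.10} $$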

The first equality is just the fundamental theorem of calculus. To get the second, we substituted the definition of $a_1^\dagger(t)$, and combined the $d^3x$ from this definition with the $dt$ to get $d^4x$. The third comes from straightforward evaluation of the time derivatives. The fourth uses $\omega^2 = \mathbf{k}^2 + m^2$. The fifth writes $\mathbf{k}^2$ as $-\nabla^2$ acting on $e^{i\mathbf{k}\cdot\mathbf{x}}$. The sixth uses integration by parts to move the $\nabla^2$ onto the field $\varphi(x)$; here the wave packet is needed to avoid a surface term. The seventh simply identifies $\partial_0^2 - \nabla^2$ as $-\partial^2$.
In free-field theory, the right-hand side of eq. (5.10) is zero, since $\varphi(x)$ obeys the Klein-Gordon equation. In an interacting theory, with (say) $\mathcal{L}_1 = \frac{1}{6}g\varphi^3$, we have instead $(-\partial^2 + m^2)\varphi = \frac{1}{2}g\varphi^2$. Thus the right-hand side of eq. (5.10) is not zero in an interacting theory.

Rearranging eq. (5.10), we have

$$ a_1^\dagger(-\infty) = a_1^\dagger(+\infty) + i \int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx} (-\partial^2 + m^2) \varphi(x) . \tag{5.11} $$

We will also need the hermitian conjugate of this formula, which (after a little more rearranging) reads

$$ a_1(+\infty) = a_1(-\infty) + i \int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{-ikx} (-\partial^2 + m^2) \varphi(x) . \tag{5.12} $$

Let us return to the scattering amplitude,

$$ \langle f|i\rangle = \langle 0|\, a_{1'}(+\infty)\, a_{2'}(+\infty)\, a_1^\dagger(-\infty)\, a_2^\dagger(-\infty)\, |0\rangle . \tag{5.13} $$

Note that the operators are in time order. Thus, if we feel like it, we can put in a time-ordering symbol without changing anything:

$$ \langle f|i\rangle = \langle 0|\, \mathrm{T}\, a_{1'}(+\infty)\, a_{2'}(+\infty)\, a_1^\dagger(-\infty)\, a_2^\dagger(-\infty)\, |0\rangle . \tag{5.14} $$

The symbol T means the product of operators to its right is to be ordered, not as written, but with operators at later times to the left of those at earlier times.
Now let us use eqs. (5.11) and (5.12) in eq. (5.14). The time-ordering symbol automatically moves all $a_{i'}(-\infty)$'s to the right, where they annihilate $|0\rangle$. Similarly, all $a_i^\dagger(+\infty)$'s move to the left, where they annihilate $\langle 0|$.
The wave packets no longer play a key role, and we can take the $\sigma \to 0$ limit in eq. (5.7), so that $f_1(\mathbf{k}) = \delta^3(\mathbf{k} - \mathbf{k}_1)$. The initial and final states now have a delta-function normalization, the multiparticle generalization of eq. (5.5). We are left with

$$ \begin{aligned} \langle f|i\rangle = i^{n+n'} \int & d^4x_1\, e^{ik_1x_1} (-\partial_1^2 + m^2) \ldots \\ & d^4x'_1\, e^{-ik'_1x'_1} (-\partial_{1'}^2 + m^2) \ldots \\ & \times \langle 0|\, \mathrm{T} \varphi(x_1) \ldots \varphi(x'_1) \ldots |0\rangle . \end{aligned} \tag{5.15} $$

This formula has been written to apply to the more general case of n incoming particles and n′ outgoing particles; the ellipses stand for similar factors for each of the other incoming and outgoing particles.
Eq. (5.15) is the Lehmann-Symanzik-Zimmermann reduction formula, or LSZ formula for short. It is one of the key equations of quantum ﬁeld theory.
However, our derivation of the LSZ formula relied on the supposition that the creation operators of free ﬁeld theory would work comparably in the interacting theory. This is a rather suspect assumption, and so we must review it.
Let us consider what we can deduce about the energy and momentum eigenstates of the interacting theory on physical grounds. First, we assume that there is a unique ground state $|0\rangle$, with zero energy and momentum. The first excited state is a state of a single particle with mass $m$. This state can have an arbitrary three-momentum $\mathbf{k}$; its energy is then $E = \omega = (\mathbf{k}^2 + m^2)^{1/2}$. The next excited state is that of two particles. These two particles could form a bound state with energy less than $2m$ (like the hydrogen atom in quantum electrodynamics), but, to keep things simple, let us assume that there are no such bound states. Then the lowest possible energy of a two-particle state is $2m$. However, a two-particle state with zero total three-momentum can have any energy above $2m$, because the two particles could have some relative momentum that contributes to their total energy. Thus we are led to a picture of the states of the theory as shown in fig. (5.1).

[Figure 5.1 appears here; vertical axis $E$ with marks at $0$, $m$, and $2m$, horizontal axis $P$.]

Figure 5.1: The exact energy eigenstates in the $(P, E)$ plane. The ground state is isolated at $(0, 0)$, the one-particle states form an isolated hyperbola that passes through $(0, m)$, and the multi-particle continuum lies at and above the hyperbola that passes through $(0, 2m)$.
Now let us consider what happens when we act on the ground state with the field operator $\varphi(x)$. To this end, it is helpful to write

$$ \varphi(x) = \exp(-iP^\mu x_\mu)\, \varphi(0) \exp(+iP^\mu x_\mu) , \tag{5.16} $$

where $P^\mu$ is the energy-momentum four-vector. (This equation, introduced in section 2, is just the relativistic generalization of the Heisenberg equation.) Now let us sandwich $\varphi(x)$ between the ground state (on the right), and other possible states (on the left). For example, let us put the ground state on the left as well. Then we have

$$ \begin{aligned} \langle 0|\varphi(x)|0\rangle &= \langle 0|e^{-iPx} \varphi(0) e^{+iPx}|0\rangle \\ &= \langle 0|\varphi(0)|0\rangle . \end{aligned} \tag{5.17} $$

To get the second line, we used $P^\mu|0\rangle = 0$. The final expression is just a Lorentz-invariant number. Since $|0\rangle$ is the exact ground state of the interacting theory, we have (in general) no idea what this number is.
We would like $\langle 0|\varphi(0)|0\rangle$ to be zero. This is because we would like $a_1^\dagger(\pm\infty)$, when acting on $|0\rangle$, to create a single particle state. We do not want $a_1^\dagger(\pm\infty)$ to create a linear combination of a single particle state and the ground state. But this is precisely what will happen if $\langle 0|\varphi(0)|0\rangle$ is not zero.
So, if $v \equiv \langle 0|\varphi(0)|0\rangle$ is not zero, we will shift the field $\varphi(x)$ by the constant $v$. This means that we go back to the lagrangian, and replace $\varphi(x)$ everywhere by $\varphi(x) + v$. This is just a change of the name of the operator of interest, and does not affect the physics. However, the shifted $\varphi(x)$ obeys, by construction, $\langle 0|\varphi(x)|0\rangle = 0$.
Let us now consider $\langle p|\varphi(x)|0\rangle$, where $|p\rangle$ is a one-particle state with four-momentum $p$, normalized according to eq. (5.5). Again using eq. (5.16), we have

$$ \begin{aligned} \langle p|\varphi(x)|0\rangle &= \langle p|e^{-iPx} \varphi(0) e^{+iPx}|0\rangle \\ &= e^{-ipx} \langle p|\varphi(0)|0\rangle , \end{aligned} \tag{5.18} $$

where $\langle p|\varphi(0)|0\rangle$ is a Lorentz-invariant number. It is a function of $p$, but the only Lorentz-invariant functions of $p$ are functions of $p^2$, and $p^2$ is just the constant $-m^2$. So $\langle p|\varphi(0)|0\rangle$ is just some number that depends on $m$ and (presumably) the other parameters in the lagrangian.
We would like $\langle p|\varphi(0)|0\rangle$ to be one. That is what it is in free-field theory, and we know that, in free-field theory, $a_1^\dagger(\pm\infty)$ creates a correctly normalized one-particle state. Thus, for $a_1^\dagger(\pm\infty)$ to create a correctly normalized one-particle state in the interacting theory, we must have $\langle p|\varphi(0)|0\rangle = 1$.
So, if $\langle p|\varphi(0)|0\rangle$ is not equal to one, we will rescale (or, one might say, renormalize) $\varphi(x)$ by a multiplicative constant. This is just a change of the name of the operator of interest, and does not affect the physics. However, the rescaled $\varphi(x)$ obeys, by construction, $\langle p|\varphi(0)|0\rangle = 1$.
Finally, consider $\langle p, n|\varphi(x)|0\rangle$, where $|p, n\rangle$ is a multiparticle state with total four-momentum $p$, and $n$ is short for all other labels (such as relative momenta) needed to specify this state. We have

$$ \begin{aligned} \langle p, n|\varphi(x)|0\rangle &= \langle p, n|e^{-iPx} \varphi(0) e^{+iPx}|0\rangle \\ &= e^{-ipx} \langle p, n|\varphi(0)|0\rangle \\ &= e^{-ipx} A_n(p) , \end{aligned} \tag{5.19} $$

where $A_n(p)$ is a function of Lorentz invariant products of the various (relative and total) four-momenta needed to specify the state. Note that, from fig. (5.1), $p^0 = (\mathbf{p}^2 + M^2)^{1/2}$ with $M \ge 2m$. The invariant mass $M$ is one of the parameters included in the set $n$.
We would like $\langle p, n|\varphi(x)|0\rangle$ to be zero, because we would like $a_1^\dagger(\pm\infty)$, when acting on $|0\rangle$, to create a single particle state. We do not want $a_1^\dagger(\pm\infty)$ to create any multiparticle states. But this is precisely what may happen if $\langle p, n|\varphi(x)|0\rangle$ is not zero.
Actually, we are being a little too strict. We really need $\langle p, n|a_1^\dagger(\pm\infty)|0\rangle$ to be zero, and perhaps it will be zero even if $\langle p, n|\varphi(x)|0\rangle$ is not. Also, we really should test $a_1^\dagger(\pm\infty)|0\rangle$ only against normalizable states. Mathematically, non-normalizable states cause all sorts of trouble; mathematicians don't consider them to be states at all. In physics, this usually doesn't bother us, but here we must be especially careful. So let us write

$$ |\psi\rangle = \sum_n \int d^3p\, \psi_n(\mathbf{p})|p, n\rangle , \tag{5.20} $$

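where the $\psi_n(\mathbf{p})$'s are wave packets for the total three-momentum $\mathbf{p}$. Note that eq. (5.20) is highly schematic; the sum over $n$ includes integrals over continuous parameters like relative momenta.
Now we want to examine

$$ \langle\psi|a_1^\dagger(t)|0\rangle = -i \sum_n \int d^3p\, \psi_n^*(\mathbf{p}) \int d^3k\, f_1(\mathbf{k}) \int d^3x\, e^{ikx} \overleftrightarrow{\partial_0}\, \langle p, n|\varphi(x)|0\rangle . \tag{5.21} $$

We will take the limit $t \to \pm\infty$ in a moment. Using eq. (5.19), eq. (5.21) becomes

$$ \begin{aligned} \langle\psi|a_1^\dagger(t)|0\rangle &= -i \sum_n \int d^3p\, \psi_n^*(\mathbf{p}) \int d^3k\, f_1(\mathbf{k}) \int d^3x\, \left( e^{ikx} \overleftrightarrow{\partial_0}\, e^{-ipx} \right) A_n(p) \\ &= \sum_n \int d^3p\, \psi_n^*(\mathbf{p}) \int d^3k\, f_1(\mathbf{k}) \int d^3x\, (p^0 + k^0)\, e^{i(k-p)x} A_n(p) . \end{aligned} \tag{5.22} $$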

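Next we use $\int d^3x\, e^{i(\mathbf{k}-\mathbf{p})\cdot\mathbf{x}} = (2\pi)^3\delta^3(\mathbf{k} - \mathbf{p})$ to get

$$ \langle\psi|a_1^\dagger(t)|0\rangle = \sum_n \int d^3p\, (2\pi)^3 (p^0 + k^0)\, \psi_n^*(\mathbf{p})\, f_1(\mathbf{p})\, A_n(p)\, e^{i(p^0 - k^0)t} , \tag{5.23} $$

where $p^0 = (\mathbf{p}^2 + M^2)^{1/2}$ and $k^0 = (\mathbf{p}^2 + m^2)^{1/2}$.
Now comes the key point. Note that $p^0$ is strictly greater than $k^0$, because $M \ge 2m > m$. Thus the integrand of eq. (5.23) contains a phase factor that oscillates more and more rapidly as $t \to \pm\infty$. Therefore, by the Riemann-Lebesgue lemma, the right-hand side of eq. (5.23) vanishes as $t \to \pm\infty$.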
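The Riemann-Lebesgue argument can be illustrated with a toy calculation. The sketch below is a one-dimensional stand-in for eq. (5.23), not the actual amplitude: the smooth factors $\psi_n^* f_1 A_n (p^0 + k^0)$ are replaced by a single Gaussian profile $e^{-p^2}$, and the values $m = 1$, $M = 2$ and the integration grid are arbitrary illustrative choices.

```python
import cmath
import math

def overlap(t, m=1.0, M=2.0, p_max=6.0, steps=12_000):
    """Magnitude of a 1D toy version of eq. (5.23): a Gaussian profile
    times the phase exp[i (p0 - k0) t], integrated over p (midpoint rule)."""
    dp = 2.0 * p_max / steps
    total = 0j
    for i in range(steps):
        p = -p_max + (i + 0.5) * dp
        p0 = math.sqrt(p * p + M * M)  # multiparticle energy, M >= 2m
        k0 = math.sqrt(p * p + m * m)  # one-particle energy
        total += math.exp(-p * p) * cmath.exp(1j * (p0 - k0) * t) * dp
    return abs(total)

times = (0.0, 10.0, 100.0, 1000.0)
magnitudes = [overlap(t) for t in times]
```

At $t = 0$ the result is $\sqrt{\pi}$, the plain Gaussian integral, and the magnitude decreases steadily as $t$ grows, in line with the claim that the multiparticle overlap vanishes as $t \to \pm\infty$. How quickly it falls off depends on the dimension and on the profile, so the numbers here should be read only qualitatively.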


Physically, this means that a one-particle wave packet spreads out differently than a multiparticle wave packet, and the overlap between them goes to zero as the elapsed time goes to infinity. Thus, even though our operator $a_1^\dagger(t)$ creates some multiparticle states that we don't want, we can "follow" the one-particle state that we do want by using an appropriate wave packet. By waiting long enough, we can make the multiparticle contribution to the scattering amplitude as small as we like.
Let us recap. The basic formula for a scattering amplitude in terms of the fields of an interacting quantum field theory is the LSZ formula, which is worth writing down again:

$$ \begin{aligned} \langle f|i\rangle = i^{n+n'} \int & d^4x_1\, e^{ik_1x_1} (-\partial_1^2 + m^2) \ldots \\ & d^4x'_1\, e^{-ik'_1x'_1} (-\partial_{1'}^2 + m^2) \ldots \\ & \times \langle 0|\, \mathrm{T} \varphi(x_1) \ldots \varphi(x'_1) \ldots |0\rangle . \end{aligned} \tag{5.24} $$

The LSZ formula is valid provided that the field obeys

$$ \langle 0|\varphi(x)|0\rangle = 0 \quad \text{and} \quad \langle k|\varphi(x)|0\rangle = e^{-ikx} . \tag{5.25} $$

These normalization conditions may conﬂict with our original choice of ﬁeld and parameter normalization in the lagrangian. Consider, for example, a lagrangian originally speciﬁed as

L

=

−

1 2

∂

µ

ϕ∂µϕ

−

1 2

m2ϕ2

+

1 6

gϕ3

.

(5.26)

After shifting and rescaling (and renaming some parameters), we will have

instead

L

=

−

1 2

Zϕ∂µϕ∂µ

ϕ

−

1 2

Zm

m2ϕ2

+

1 6

Zg

gϕ3

+

Y

ϕ

.

(5.27)

Here the three Z's and Y are as yet unknown constants. They must be chosen to ensure the validity of eqs. (5.24) and (5.25); this gives us two conditions in four unknowns. We fix the parameter m by requiring it to be equal to the actual mass of the particle (equivalently, the energy of the first excited state relative to the ground state), and we fix the parameter g by requiring some particular scattering cross section to depend on g in some particular way. (For example, in quantum electrodynamics, the parameter analogous to g is the electron charge e. The low-energy Coulomb scattering cross section is proportional to $e^4$, with a definite constant of proportionality and no higher-order corrections; this relationship defines e.) Thus we have four conditions in four unknowns, and it is possible to calculate Y and the three Z's order by order in powers of g.

Next, we must develop the tools needed to compute the correlation functions $\langle 0|\mathrm{T}\varphi(x_1)\ldots|0\rangle$ in an interacting quantum field theory.


Reference Notes
Useful discussions of the LSZ reduction formula can be found in Brown, Itzykson & Zuber, Peskin & Schroeder, and Weinberg I.
Problems
5.1) Work out the LSZ reduction formula for the complex scalar ﬁeld that was introduced in problem 3.5. Note that we must specify the type (a or b) of each incoming and outgoing particle.

6: Path Integrals in Quantum Mechanics

57

6 Path Integrals in Quantum Mechanics
Prerequisite: none

Consider the nonrelativistic quantum mechanics of one particle in one dimension; the hamiltonian is

$$H(P,Q) = \frac{1}{2m}P^2 + V(Q) , \tag{6.1}$$

where P and Q are operators obeying [Q, P] = i. (We set ħ = 1 for notational convenience.) We wish to evaluate the probability amplitude for the particle to start at position q′ at time t′, and end at position q′′ at time t′′. This amplitude is $\langle q''|e^{-iH(t''-t')}|q'\rangle$, where $|q'\rangle$ and $|q''\rangle$ are eigenstates of the position operator Q.
We can also formulate this question in the Heisenberg picture, where operators are time dependent and the state of the system is time independent, as opposed to the more familiar Schrödinger picture. In the Heisenberg picture, we write $Q(t) = e^{iHt}Qe^{-iHt}$. We can then define an instantaneous eigenstate of Q(t) via $Q(t)|q,t\rangle = q|q,t\rangle$. These instantaneous eigenstates can be expressed explicitly as $|q,t\rangle = e^{+iHt}|q\rangle$, where $Q|q\rangle = q|q\rangle$. Then our transition amplitude can be written as $\langle q'',t''|q',t'\rangle$ in the Heisenberg picture. To evaluate $\langle q'',t''|q',t'\rangle$, we begin by dividing the time interval T ≡ t′′ − t′ into N + 1 equal pieces of duration δt = T/(N + 1). Then introduce N complete sets of position eigenstates to get

$$\langle q'',t''|q',t'\rangle = \int\prod_{j=1}^{N}dq_j\; \langle q''|e^{-iH\delta t}|q_N\rangle\langle q_N|e^{-iH\delta t}|q_{N-1}\rangle\ldots\langle q_1|e^{-iH\delta t}|q'\rangle . \tag{6.2}$$

The integrals over the q’s all run from −∞ to +∞.

Now consider $\langle q_2|e^{-iH\delta t}|q_1\rangle$. We can use the Campbell-Baker-Hausdorff formula

$$\exp(A+B) = \exp(A)\exp(B)\exp(-\tfrac12[A,B]+\ldots) \tag{6.3}$$

to write

$$\exp(-iH\delta t) = \exp[-i(\delta t/2m)P^2]\,\exp[-i\delta t\,V(Q)]\,\exp[O(\delta t^2)] . \tag{6.4}$$

Then, in the limit of small δt, we should be able to ignore the ﬁnal exponential. Inserting a complete set of momentum states then gives

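$$\begin{aligned}
\langle q_2|e^{-iH\delta t}|q_1\rangle &= \int dp_1\, \langle q_2|e^{-i(\delta t/2m)P^2}|p_1\rangle\langle p_1|e^{-i\delta t V(Q)}|q_1\rangle \\
&= \int dp_1\, e^{-i(\delta t/2m)p_1^2}\, e^{-i\delta t V(q_1)}\, \langle q_2|p_1\rangle\langle p_1|q_1\rangle \\
&= \int \frac{dp_1}{2\pi}\, e^{-i(\delta t/2m)p_1^2}\, e^{-i\delta t V(q_1)}\, e^{ip_1(q_2-q_1)} \\
&= \int \frac{dp_1}{2\pi}\, e^{-iH(p_1,q_1)\delta t}\, e^{ip_1(q_2-q_1)} .
\end{aligned} \tag{6.5}$$

To get the third line, we used $\langle q|p\rangle = (2\pi)^{-1/2}\exp(ipq)$.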

If we happen to be interested in more general hamiltonians than eq. (6.1), then we must worry about the ordering of the P and Q operators in any term that contains both. If we adopt Weyl ordering, where the quantum hamiltonian H(P, Q) is given in terms of the classical hamiltonian H(p, q) by

$$H(P,Q) \equiv \int\frac{dx}{2\pi}\,\frac{dk}{2\pi}\; e^{ixP+ikQ}\int dp\,dq\; e^{-ixp-ikq}\,H(p,q) , \tag{6.6}$$

then eq. (6.5) is not quite correct; in the last line, $H(p_1,q_1)$ should be replaced with $H(p_1,\bar q_1)$, where $\bar q_1 = \tfrac12(q_1+q_2)$. For the hamiltonian of eq. (6.1), which is Weyl ordered, this replacement makes no difference in the limit δt → 0.

Adopting Weyl ordering for the general case, we now have

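$$\langle q'',t''|q',t'\rangle = \int\prod_{k=1}^{N}dq_k\,\prod_{j=0}^{N}\frac{dp_j}{2\pi}\; e^{ip_j(q_{j+1}-q_j)}\, e^{-iH(p_j,\bar q_j)\delta t} , \tag{6.7}$$

where $\bar q_j = \tfrac12(q_j+q_{j+1})$, $q_0 = q'$, and $q_{N+1} = q''$. If we now define $\dot q_j \equiv (q_{j+1}-q_j)/\delta t$, and take the formal limit of δt → 0, we get

$$\langle q'',t''|q',t'\rangle = \int\mathcal{D}q\,\mathcal{D}p\;\exp\Big[i\int_{t'}^{t''}dt\,\Big(p(t)\dot q(t) - H(p(t),q(t))\Big)\Big] . \tag{6.8}$$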

The integration is to be understood as over all paths in phase space that start at q(t′) = q′ (with an arbitrary value of the initial momentum) and end at q(t′′) = q′′ (with an arbitrary value of the ﬁnal momentum).
If H(p, q) is no more than quadratic in the momenta [as is the case for eq. (6.1)], then the integral over p is gaussian, and can be done in closed form. If the term that is quadratic in p is independent of q [as is the case for eq. (6.1)], then the prefactors generated by the gaussian integrals are all constants, and can be absorbed into the deﬁnition of Dq. The result of integrating out p is then

$$\langle q'',t''|q',t'\rangle = \int\mathcal{D}q\;\exp\Big[i\int_{t'}^{t''}dt\; L(\dot q(t),q(t))\Big] , \tag{6.9}$$

where $L(\dot q,q)$ is computed by first finding the stationary point of the p integral by solving

$$0 = \frac{\partial}{\partial p}\Big(p\dot q - H(p,q)\Big) = \dot q - \frac{\partial H(p,q)}{\partial p} \tag{6.10}$$

for p in terms of $\dot q$ and q, and then plugging this solution back into $p\dot q - H$ to get L. We recognize this procedure from classical mechanics: we are passing from the hamiltonian formulation to the lagrangian formulation.
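As a short worked illustration (not part of the original text), the stationary-point recipe of eq. (6.10) can be applied to the hamiltonian of eq. (6.1). The only input assumed here is $H(p,q) = p^2/2m + V(q)$:

```latex
\begin{aligned}
0 &= \dot q - \frac{\partial H}{\partial p} = \dot q - \frac{p}{m}
   \quad\Longrightarrow\quad p = m\dot q , \\
L(\dot q, q) &= p\dot q - H(p,q)\Big|_{p = m\dot q}
   = m\dot q^2 - \tfrac12 m\dot q^2 - V(q)
   = \tfrac12 m\dot q^2 - V(q) .
\end{aligned}
```

This reproduces the familiar kinetic-minus-potential lagrangian, which is what the phrase "passing from the hamiltonian formulation to the lagrangian formulation" amounts to in this simple case.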
Now that we have eqs. (6.8) and (6.9), what are we going to do with them? Let us begin by considering some generalizations; let us examine, for example, $\langle q'',t''|Q(t_1)|q',t'\rangle$, where t′ < t₁ < t′′. This is given by

$$\langle q'',t''|Q(t_1)|q',t'\rangle = \langle q''|e^{-iH(t''-t_1)}\,Q\,e^{-iH(t_1-t')}|q'\rangle . \tag{6.11}$$

In the path integral formula, the extra operator Q inserted at time t₁ will simply result in an extra factor of q(t₁). Thus

$$\langle q'',t''|Q(t_1)|q',t'\rangle = \int\mathcal{D}p\,\mathcal{D}q\; q(t_1)\,e^{iS} , \tag{6.12}$$

where $S = \int_{t'}^{t''}dt\,(p\dot q - H)$. Now let us go in the other direction; consider $\int\mathcal{D}p\,\mathcal{D}q\; q(t_1)q(t_2)\,e^{iS}$. This clearly requires the operators Q(t₁) and Q(t₂), but their order depends on whether t₁ < t₂ or t₂ < t₁. Thus we have

$$\int\mathcal{D}p\,\mathcal{D}q\; q(t_1)q(t_2)\,e^{iS} = \langle q'',t''|\mathrm{T}Q(t_1)Q(t_2)|q',t'\rangle , \tag{6.13}$$

where T is the time ordering symbol: a product of operators to its right is to be ordered, not as written, but with operators at later times to the left of those at earlier times. This is signiﬁcant, because time-ordered products enter into the LSZ formula for scattering amplitudes.
To further develop these methods, we need another trick: functional derivatives. We deﬁne the functional derivative δ/δf (t) via

$$\frac{\delta}{\delta f(t_1)}\,f(t_2) = \delta(t_1-t_2) , \tag{6.14}$$

where δ(t) is the Dirac delta function. Also, functional derivatives are defined to satisfy all the usual rules of derivatives (product rule, chain rule, etc). Eq. (6.14) can be thought of as the continuous generalization of $(\partial/\partial x_i)\,x_j = \delta_{ij}$.
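The continuous-generalization remark can be made concrete on a time grid. The following is a minimal sketch, not taken from the text: it assumes a uniform grid of spacing `dt`, on which δ(t_i − t_j) is replaced by 1/dt when i = j and 0 otherwise, so that δF/δf(t_j) is approximated by (1/dt) ∂F/∂f_j. The function names and the test function f(t) = t² are illustrative choices.

```python
# Sketch: the functional derivative of eq. (6.14) on a time grid of spacing dt.
# On the grid, delta(t_i - t_j) becomes (1/dt) * (1 if i == j else 0),
# so delta F / delta f(t_j) is approximated by (1/dt) * dF/df_j.

def functional_derivative(functional, samples, j, dt, bump=1e-6):
    """Central-difference estimate of delta F / delta f(t_j)."""
    up = list(samples)
    down = list(samples)
    up[j] += bump
    down[j] -= bump
    return (functional(up) - functional(down)) / (2.0 * bump * dt)

dt = 0.1
grid = [k * dt for k in range(50)]
f = [t * t for t in grid]  # hypothetical test function f(t) = t^2

def value_at_index_10(samples):
    # F[f] = f(t_10): its derivative is the grid version of delta(t - t_10)
    return samples[10]

def integral_of_square(samples):
    # F[f] = integral dt f(t)^2 as a Riemann sum; expect delta F/delta f(t) = 2 f(t)
    return sum(s * s for s in samples) * dt

spike = functional_derivative(value_at_index_10, f, 10, dt)      # about 1/dt
off_spike = functional_derivative(value_at_index_10, f, 11, dt)  # zero
slope = functional_derivative(integral_of_square, f, 20, dt)     # about 2 * f[20]
```

The first functional reproduces eq. (6.14) itself: a spike of height 1/dt at the matching grid point and zero elsewhere. The second shows the chain rule at work.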
Now, consider modifying the lagrangian of our theory by including external forces acting on the particle:

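$$H(p,q) \to H(p,q) - f(t)q(t) - h(t)p(t) , \tag{6.15}$$

where f(t) and h(t) are specified functions. In this case we will write

$$\langle q'',t''|q',t'\rangle_{f,h} = \int\mathcal{D}p\,\mathcal{D}q\;\exp\Big[i\int_{t'}^{t''}dt\,\big(p\dot q - H + fq + hp\big)\Big] , \tag{6.16}$$

where H is the original hamiltonian. Then we have

$$\begin{aligned}
\frac1i\frac{\delta}{\delta f(t_1)}\,\langle q'',t''|q',t'\rangle_{f,h} &= \int\mathcal{D}p\,\mathcal{D}q\; q(t_1)\, e^{\,i\int dt\,[p\dot q - H + fq + hp]} , \\
\frac1i\frac{\delta}{\delta f(t_1)}\,\frac1i\frac{\delta}{\delta f(t_2)}\,\langle q'',t''|q',t'\rangle_{f,h} &= \int\mathcal{D}p\,\mathcal{D}q\; q(t_1)q(t_2)\, e^{\,i\int dt\,[p\dot q - H + fq + hp]} , \\
\frac1i\frac{\delta}{\delta h(t_1)}\,\langle q'',t''|q',t'\rangle_{f,h} &= \int\mathcal{D}p\,\mathcal{D}q\; p(t_1)\, e^{\,i\int dt\,[p\dot q - H + fq + hp]} ,
\end{aligned} \tag{6.17}$$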

and so on. After we are done bringing down as many factors of q(ti) or p(ti) as we like, we can set f (t) = h(t) = 0, and return to the original hamiltonian. Thus,

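$$\langle q'',t''|\mathrm{T}Q(t_1)\ldots P(t_n)\ldots|q',t'\rangle = \frac1i\frac{\delta}{\delta f(t_1)}\ldots\frac1i\frac{\delta}{\delta h(t_n)}\ldots\,\langle q'',t''|q',t'\rangle_{f,h}\Big|_{f=h=0} . \tag{6.18}$$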

Suppose we are also interested in initial and ﬁnal states other than position eigenstates. Then we must multiply by the wave functions for these states, and integrate. We will be interested, in particular, in the ground state as both the initial and ﬁnal state. Also, we will take the limits t′ → −∞ and t′′ → +∞. The object of our attention is then

$$\langle 0|0\rangle_{f,h} = \lim_{\substack{t'\to-\infty \\ t''\to+\infty}}\int dq''\,dq'\;\psi_0^*(q'')\,\langle q'',t''|q',t'\rangle_{f,h}\,\psi_0(q') , \tag{6.19}$$

where $\psi_0(q) = \langle q|0\rangle$ is the ground-state wave function. Eq. (6.19) is a rather cumbersome formula, however. We will, therefore, employ a trick to simplify it.
Let $|n\rangle$ denote an eigenstate of H with eigenvalue $E_n$. We will suppose that $E_0 = 0$; if this is not the case, we will shift H by an appropriate constant. Next we write

$$\begin{aligned}
|q',t'\rangle &= e^{iHt'}|q'\rangle \\
&= \sum_{n=0}^{\infty} e^{iHt'}|n\rangle\langle n|q'\rangle \\
&= \sum_{n=0}^{\infty} \psi_n^*(q')\,e^{iE_nt'}|n\rangle ,
\end{aligned} \tag{6.20}$$

where $\psi_n(q) = \langle q|n\rangle$ is the wave function of the nth eigenstate. Now, replace H with (1−iε)H in eq. (6.20), where ε is a small positive infinitesimal. Then, take the limit t′ → −∞ of eq. (6.20) with ε held fixed. Every state except the ground state is then multiplied by a vanishing exponential factor, and so the limit is simply $\psi_0^*(q')|0\rangle$. Next, multiply by an arbitrary function χ(q′), and integrate over q′. The only requirement is that $\langle 0|\chi\rangle \neq 0$. We then have a constant times $|0\rangle$, and this constant can be absorbed into the normalization of the path integral. A similar analysis of $\langle q'',t''| = \langle q''|e^{-iHt''}$ shows that the replacement H → (1−iε)H also picks out the ground state as the final state in the t′′ → +∞ limit.
What all this means is that if we use (1−iǫ)H instead of H, we can be cavalier about the boundary conditions on the endpoints of the path. Any reasonable boundary conditions will result in the ground state as both the initial and ﬁnal state. Thus we have
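A small numerical sketch (an illustration, not from the text) shows how the replacement H → (1−iε)H suppresses excited states. It assumes evenly spaced levels E_n = nω with E_0 shifted to zero, and finite stand-in values for ε and the early time t′. Each term of eq. (6.20) then carries the magnitude |exp(i(1−iε)E_n t′)| = exp(εE_n t′), which vanishes as t′ → −∞ for every E_n > 0.

```python
import cmath

def damping_factors(energies, t_initial, eps):
    """|exp(i (1 - i eps) E_n t')| for each level; this equals exp(eps * E_n * t')."""
    return [abs(cmath.exp(1j * (1 - 1j * eps) * E * t_initial)) for E in energies]

omega = 1.0
levels = [n * omega for n in range(4)]  # assumed spectrum E_n = n * omega, with E_0 = 0
weights = damping_factors(levels, t_initial=-1000.0, eps=0.01)
```

With these illustrative numbers the ground state keeps weight one, while each excited state is suppressed by a further factor of exp(−10); taking t′ more negative at fixed ε makes the suppression as strong as desired.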

$$\langle 0|0\rangle_{f,h} = \int\mathcal{D}p\,\mathcal{D}q\;\exp\Big[i\int_{-\infty}^{+\infty}dt\,\big(p\dot q - (1-i\epsilon)H + fq + hp\big)\Big] . \tag{6.21}$$

Now let us suppose that H = H0 + H1, where we can solve for the eigenstates and eigenvalues of H0, and H1 can be treated as a perturbation. Suppressing the iǫ, eq. (6.21) can be written as

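$$\begin{aligned}
\langle 0|0\rangle_{f,h} &= \int\mathcal{D}p\,\mathcal{D}q\;\exp\Big[i\int_{-\infty}^{+\infty}dt\,\big(p\dot q - H_0(p,q) - H_1(p,q) + fq + hp\big)\Big] \\
&= \exp\Big[-i\int_{-\infty}^{+\infty}dt\; H_1\Big(\frac1i\frac{\delta}{\delta h(t)},\frac1i\frac{\delta}{\delta f(t)}\Big)\Big] \\
&\qquad\times\int\mathcal{D}p\,\mathcal{D}q\;\exp\Big[i\int_{-\infty}^{+\infty}dt\,\big(p\dot q - H_0(p,q) + fq + hp\big)\Big] .
\end{aligned} \tag{6.22}$$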

To understand the second line of this equation, take the exponential prefactor inside the path integral. Then the functional derivatives (that appear as the arguments of H1) just pull out appropriate factors of p(t) and q(t), generating the right-hand side of the ﬁrst line. We assume that we can compute the functional integral in the second line, since it involves only the solvable hamiltonian H0. The exponential prefactor can then be expanded in powers of H1 to generate a perturbation series.
If H1 depends only on q (and not on p), and if we are only interested in time-ordered products of Q’s (and not P ’s), and if H is no more than quadratic in P , and if the term quadratic in P does not involve Q, then eq. (6.22) can be simpliﬁed to

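$$\langle 0|0\rangle_f = \exp\Big[i\int_{-\infty}^{+\infty}dt\; L_1\Big(\frac1i\frac{\delta}{\delta f(t)}\Big)\Big]\times\int\mathcal{D}q\;\exp\Big[i\int_{-\infty}^{+\infty}dt\,\big(L_0(\dot q,q) + fq\big)\Big] , \tag{6.23}$$

where $L_1(q) = -H_1(q)$.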


Reference Notes

Brown and Ramond I have especially clear treatments of various aspects of path integrals. For a careful derivation of the midpoint rule of eq. (6.7), see Berry & Mount.

Problems

6.1) a) Find an explicit formula for Dq in eq. (6.9). Your formula should be of the form $\mathcal{D}q = C\prod_{j=1}^{N}dq_j$, where C is a constant that you should compute.

b) For the case of a free particle, V(Q) = 0, evaluate the path integral of eq. (6.9) explicitly. Hint: integrate over q₁, then q₂, etc, and look for a pattern. Express your final answer in terms of q′, t′, q′′, t′′, and m. Restore ħ by dimensional analysis.

c) Compute $\langle q'',t''|q',t'\rangle = \langle q''|e^{-iH(t''-t')}|q'\rangle$ by inserting a complete set of momentum eigenstates, and performing the integral over the momentum. Compare with your result in part (b).

7: The Path Integral for the Harmonic Oscillator

63

7 The Path Integral for the Harmonic Oscillator
Prerequisite: 6

Consider a harmonic oscillator with hamiltonian

$$H(P,Q) = \frac{1}{2m}P^2 + \tfrac12 m\omega^2Q^2 . \tag{7.1}$$

We begin with the formula from section 6 for the ground state to ground state transition amplitude in the presence of an external force, specialized to the case of a harmonic oscillator:

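$$\langle 0|0\rangle_f = \int\mathcal{D}p\,\mathcal{D}q\;\exp\Big[i\int_{-\infty}^{+\infty}dt\,\big(p\dot q - (1-i\epsilon)H + fq\big)\Big] . \tag{7.2}$$

Looking at eq. (7.1), we see that multiplying H by 1−iε is equivalent to the replacements $m^{-1} \to (1-i\epsilon)m^{-1}$ [or, equivalently, $m \to (1+i\epsilon)m$] and $m\omega^2 \to (1-i\epsilon)m\omega^2$. Passing to the lagrangian formulation then gives

$$\langle 0|0\rangle_f = \int\mathcal{D}q\;\exp\, i\int_{-\infty}^{+\infty}dt\,\Big[\tfrac12(1+i\epsilon)m\dot q^2 - \tfrac12(1-i\epsilon)m\omega^2q^2 + fq\Big] . \tag{7.3}$$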

From now on, we will simplify the notation by setting m = 1. Next, let us use Fourier-transformed variables,

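$$\tilde q(E) = \int_{-\infty}^{+\infty}dt\; e^{iEt}\,q(t) , \qquad q(t) = \int_{-\infty}^{+\infty}\frac{dE}{2\pi}\; e^{-iEt}\,\tilde q(E) . \tag{7.4}$$

The expression in square brackets in eq. (7.3) becomes

$$\Big[\cdots\Big] = \frac12\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\,\frac{dE'}{2\pi}\; e^{-i(E+E')t}\Big[\big(-(1+i\epsilon)EE' - (1-i\epsilon)\omega^2\big)\,\tilde q(E)\tilde q(E') + \tilde f(E)\tilde q(E') + \tilde f(E')\tilde q(E)\Big] . \tag{7.5}$$

Note that the only t dependence is now in the prefactor. Integrating over t then generates a factor of 2πδ(E + E′). Then we can easily integrate over E′ to get

$$\begin{aligned}
S &= \int_{-\infty}^{+\infty}dt\,\Big[\cdots\Big] \\
&= \frac12\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\Big[\big((1+i\epsilon)E^2 - (1-i\epsilon)\omega^2\big)\,\tilde q(E)\tilde q(-E) + \tilde f(E)\tilde q(-E) + \tilde f(-E)\tilde q(E)\Big] .
\end{aligned} \tag{7.6}$$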

The factor in large parentheses is equal to $E^2 - \omega^2 + i(E^2+\omega^2)\epsilon$, and we can absorb the positive coefficient into ε to get $E^2 - \omega^2 + i\epsilon$.
Now it is convenient to change integration variables to

$$\tilde x(E) = \tilde q(E) + \frac{\tilde f(E)}{E^2-\omega^2+i\epsilon} . \tag{7.7}$$

Then we get

$$S = \frac12\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\Big[\tilde x(E)\,(E^2-\omega^2+i\epsilon)\,\tilde x(-E) - \frac{\tilde f(E)\tilde f(-E)}{E^2-\omega^2+i\epsilon}\Big] . \tag{7.8}$$

Furthermore, because eq. (7.7) is just a shift by a constant, Dq = Dx. Now we have

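$$\begin{aligned}
\langle 0|0\rangle_f &= \exp\Big[\frac{i}{2}\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\,\frac{\tilde f(E)\tilde f(-E)}{-E^2+\omega^2-i\epsilon}\Big] \\
&\qquad\times\int\mathcal{D}x\;\exp\Big[\frac{i}{2}\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\;\tilde x(E)\,(E^2-\omega^2+i\epsilon)\,\tilde x(-E)\Big] .
\end{aligned} \tag{7.9}$$

Now comes the key point. The path integral on the second line of eq. (7.9) is what we get for $\langle 0|0\rangle_f$ in the case f = 0. On the other hand, if there is no external force, a system in its ground state will remain in its ground state, and so $\langle 0|0\rangle_{f=0} = 1$. Thus $\langle 0|0\rangle_f$ is given by the first line of eq. (7.9),

$$\langle 0|0\rangle_f = \exp\Big[\frac{i}{2}\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\,\frac{\tilde f(E)\tilde f(-E)}{-E^2+\omega^2-i\epsilon}\Big] . \tag{7.10}$$

We can also rewrite $\langle 0|0\rangle_f$ in terms of time-domain variables as

$$\langle 0|0\rangle_f = \exp\Big[\frac{i}{2}\int_{-\infty}^{+\infty}dt\,dt'\; f(t)\,G(t-t')\,f(t')\Big] , \tag{7.11}$$

where

$$G(t-t') = \int_{-\infty}^{+\infty}\frac{dE}{2\pi}\,\frac{e^{-iE(t-t')}}{-E^2+\omega^2-i\epsilon} . \tag{7.12}$$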

Note that G(t−t′) is a Green’s function for the oscillator equation of motion:

$$\Big(\frac{\partial^2}{\partial t^2} + \omega^2\Big)G(t-t') = \delta(t-t') . \tag{7.13}$$

This can be seen directly by plugging eq. (7.12) into eq. (7.13) and then taking the ε → 0 limit. We can also evaluate G(t − t′) explicitly by treating the integral over E on the right-hand side of eq. (7.12) as a contour integral in the complex E plane, and then evaluating it via the residue theorem. The result is

$$G(t-t') = \frac{i}{2\omega}\,\exp\big(-i\omega|t-t'|\big) . \tag{7.14}$$
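As a numerical cross-check (a sketch, not part of the text), one can confirm that the closed form of eq. (7.14) behaves as a Green's function for eq. (7.13): away from t = t′ it solves the homogeneous oscillator equation, and across t = t′ its first derivative jumps by one, which is what integrating eq. (7.13) over a small interval around the delta function requires. The function names, the value ω = 2, and the finite-difference step sizes are illustrative assumptions.

```python
import cmath

def green(t, omega):
    """G(t) = (i / 2 omega) exp(-i omega |t|), eq. (7.14) with m = 1."""
    return 1j / (2.0 * omega) * cmath.exp(-1j * omega * abs(t))

def oscillator_residual(t, omega, h=1e-3):
    """(d^2/dt^2 + omega^2) G at a point t != 0, via a central second difference."""
    second = (green(t + h, omega) - 2.0 * green(t, omega) + green(t - h, omega)) / h**2
    return second + omega**2 * green(t, omega)

def slope_jump(omega, h=1e-6):
    """G'(0+) - G'(0-), estimated with one-sided first differences."""
    right = (green(h, omega) - green(0.0, omega)) / h
    left = (green(0.0, omega) - green(-h, omega)) / h
    return right - left

omega = 2.0
residual = oscillator_residual(0.7, omega)  # should be close to zero
jump = slope_jump(omega)                    # should be close to one
```

The residual and the jump are only approximate because of the finite step sizes; both approach the exact values 0 and 1 as the steps shrink.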

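Consider now the formula from section 6 for the time-ordered product of operators. In the case of initial and final ground states, it becomes

$$\langle 0|\mathrm{T}Q(t_1)\ldots|0\rangle = \frac1i\frac{\delta}{\delta f(t_1)}\ldots\,\langle 0|0\rangle_f\Big|_{f=0} . \tag{7.15}$$

Using our explicit formula, eq. (7.11), we have

$$\begin{aligned}
\langle 0|\mathrm{T}Q(t_1)Q(t_2)|0\rangle &= \frac1i\frac{\delta}{\delta f(t_1)}\,\frac1i\frac{\delta}{\delta f(t_2)}\,\langle 0|0\rangle_f\Big|_{f=0} \\
&= \frac1i\frac{\delta}{\delta f(t_1)}\Big[\int_{-\infty}^{+\infty}dt'\;G(t_2-t')f(t')\Big]\langle 0|0\rangle_f\Big|_{f=0} \\
&= \Big[\frac1i\,G(t_2-t_1) + (\text{term with } f\text{'s})\Big]\langle 0|0\rangle_f\Big|_{f=0} \\
&= \frac1i\,G(t_2-t_1) .
\end{aligned} \tag{7.16}$$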

We can continue in this way to compute the ground-state expectation value of the time-ordered product of more Q(t)’s. If the number of Q(t)’s is odd, then there is always a left-over f (t) in the prefactor, and so the result is zero. If the number of Q(t)’s is even, then we must pair up the functional derivatives in an appropriate way to get a nonzero result. Thus, for example,

$$\langle 0|\mathrm{T}Q(t_1)Q(t_2)Q(t_3)Q(t_4)|0\rangle = \frac{1}{i^2}\Big[G(t_1-t_2)G(t_3-t_4) + G(t_1-t_3)G(t_2-t_4) + G(t_1-t_4)G(t_2-t_3)\Big] . \tag{7.17}$$

More generally,

$$\langle 0|\mathrm{T}Q(t_1)\ldots Q(t_{2n})|0\rangle = \frac{1}{i^n}\sum_{\text{pairings}} G(t_{i_1}-t_{i_2})\ldots G(t_{i_{2n-1}}-t_{i_{2n}}) . \tag{7.18}$$
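The sum over pairings in eq. (7.18) can be spelled out with a short enumeration. The following is a minimal sketch under the conventions of this section (m = ħ = 1, with G given by eq. (7.14)); the helper names `pairings` and `correlator` are hypothetical and not from the text.

```python
import cmath

def green(t, omega):
    """G(t) = (i / 2 omega) exp(-i omega |t|), eq. (7.14)."""
    return 1j / (2.0 * omega) * cmath.exp(-1j * omega * abs(t))

def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def correlator(times, omega):
    """Ground-state expectation value of T Q(t_1)...Q(t_2n) via eq. (7.18)."""
    if len(times) % 2:
        return 0.0  # an odd number of Q's always leaves an unpaired source
    n = len(times) // 2
    total = 0.0
    for pairing in pairings(range(len(times))):
        term = 1.0
        for a, b in pairing:
            term *= green(times[a] - times[b], omega)
        total += term
    return total / (1j ** n)

four_point_pairings = list(pairings([1, 2, 3, 4]))
equal_time_two_point = correlator([0.0, 0.0], 1.0)
equal_time_four_point = correlator([0.0, 0.0, 0.0, 0.0], 1.0)
```

For four operators the enumeration returns the three pairings that appear in eq. (7.17). At equal times and ω = 1 the results reduce to 1/(2ω) and 3/(4ω²), the ground-state values of Q² and Q⁴ for an oscillator with m = 1, which serves as an independent check of the overall factors.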

Problems

7.1) Starting with eq. (7.12), do the contour integral to verify eq. (7.14).

7.2) Starting with eq. (7.14), verify eq. (7.13).

7.3) a) Use the Heisenberg equation of motion, $\dot A = i[H,A]$, to find explicit expressions for $\dot Q$ and $\dot P$. Solve these to get the Heisenberg-picture operators Q(t) and P(t) in terms of the Schrödinger picture operators Q and P.

b) Write the Schrödinger picture operators Q and P in terms of the creation and annihilation operators a and a†, where $H = \hbar\omega(a^\dagger a + \tfrac12)$. Then, using your result from part (a), write the Heisenberg-picture operators Q(t) and P(t) in terms of a and a†.

c) Using your result from part (b), and $a|0\rangle = \langle 0|a^\dagger = 0$, verify eqs. (7.16) and (7.17).

7.4) Consider a harmonic oscillator in its ground state at t = −∞. It is then subjected to an external force f(t). Compute the probability $|\langle 0|0\rangle_f|^2$ that the oscillator is still in its ground state at t = +∞. Write your answer as a manifestly real expression, and in terms of the Fourier transform $\tilde f(E) = \int_{-\infty}^{+\infty}dt\; e^{iEt}f(t)$. Your answer should not involve any other unevaluated integrals.

8: The Path Integral for Free Field Theory

67

8 The Path Integral for Free Field Theory
Prerequisite: 3, 7

Our results for the harmonic oscillator can be straightforwardly generalized to a free ﬁeld theory with hamiltonian density

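$$\mathcal{H}_0 = \tfrac12\Pi^2 + \tfrac12(\nabla\varphi)^2 + \tfrac12 m^2\varphi^2 . \tag{8.1}$$

The dictionary we need is

$$\begin{aligned}
q(t) &\longrightarrow \varphi(\mathbf{x},t) && \text{(classical field)} \\
Q(t) &\longrightarrow \varphi(\mathbf{x},t) && \text{(operator field)} \\
f(t) &\longrightarrow J(\mathbf{x},t) && \text{(classical source)}
\end{aligned} \tag{8.2}$$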

The distinction between the classical ﬁeld ϕ(x) and the corresponding operator ﬁeld should be clear from context.
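To employ the ε trick, we multiply $\mathcal{H}_0$ by 1 − iε. The results are equivalent to replacing $m^2$ in $\mathcal{H}_0$ with $m^2 - i\epsilon$. From now on, for notational simplicity, we will write $m^2$ when we really mean $m^2 - i\epsilon$.
Let us write down the path integral (also called the functional integral) for our free field theory:

$$Z_0(J) \equiv \langle 0|0\rangle_J = \int\mathcal{D}\varphi\; e^{\,i\int d^4x\,[\mathcal{L}_0 + J\varphi]} , \tag{8.3}$$

where

$$\mathcal{L}_0 = -\tfrac12\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac12 m^2\varphi^2 \tag{8.4}$$

is the lagrangian density, and

$$\mathcal{D}\varphi \propto \prod_x d\varphi(x) \tag{8.5}$$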

is the functional measure. Note that when we say path integral, we now mean a path in the space of ﬁeld conﬁgurations.
We can evaluate Z0(J) by mimicking what we did for the harmonic oscillator in section 7. We introduce four-dimensional Fourier transforms,

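$$\tilde\varphi(k) = \int d^4x\; e^{-ikx}\,\varphi(x) , \qquad \varphi(x) = \int\frac{d^4k}{(2\pi)^4}\; e^{ikx}\,\tilde\varphi(k) , \tag{8.6}$$

where $kx = -k^0t + \mathbf{k}\cdot\mathbf{x}$, and $k^0$ is an integration variable. Then, starting with $S_0 = \int d^4x\,[\mathcal{L}_0 + J\varphi]$, we get

$$S_0 = \frac12\int\frac{d^4k}{(2\pi)^4}\Big[-\tilde\varphi(k)(k^2+m^2)\tilde\varphi(-k) + \tilde J(k)\tilde\varphi(-k) + \tilde J(-k)\tilde\varphi(k)\Big] , \tag{8.7}$$

where $k^2 = \mathbf{k}^2 - (k^0)^2$. We now change path integration variables to

$$\tilde\chi(k) = \tilde\varphi(k) - \frac{\tilde J(k)}{k^2+m^2} . \tag{8.8}$$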

Since this is merely a shift by a constant, we have Dϕ = Dχ. The action becomes

$$S_0 = \frac12\int\frac{d^4k}{(2\pi)^4}\Big[\frac{\tilde J(k)\tilde J(-k)}{k^2+m^2} - \tilde\chi(k)(k^2+m^2)\tilde\chi(-k)\Big] . \tag{8.9}$$

Just as for the harmonic oscillator, the integral over χ simply yields a factor of $Z_0(0) = \langle 0|0\rangle_{J=0} = 1$. Therefore

$$\begin{aligned}
Z_0(J) &= \exp\Big[\frac{i}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{\tilde J(k)\tilde J(-k)}{k^2+m^2-i\epsilon}\Big] \\
&= \exp\Big[\frac{i}{2}\int d^4x\,d^4x'\; J(x)\,\Delta(x-x')\,J(x')\Big] .
\end{aligned} \tag{8.10}$$

Here we have defined the Feynman propagator,

$$\Delta(x-x') = \int\frac{d^4k}{(2\pi)^4}\,\frac{e^{ik(x-x')}}{k^2+m^2-i\epsilon} . \tag{8.11}$$

The Feynman propagator is a Green's function for the Klein-Gordon equation,

$$(-\partial_x^2 + m^2)\,\Delta(x-x') = \delta^4(x-x') . \tag{8.12}$$

This can be seen directly by plugging eq. (8.11) into eq. (8.12) and then taking the ε → 0 limit. We can also evaluate ∆(x − x′) explicitly by treating the $k^0$ integral on the right-hand side of eq. (8.11) as a contour integral in the complex $k^0$ plane, and then evaluating it via the residue theorem. The result is

$$\begin{aligned}
\Delta(x-x') &= i\int\widetilde{dk}\; e^{\,i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}') - i\omega|t-t'|} \\
&= i\theta(t-t')\int\widetilde{dk}\; e^{ik(x-x')} + i\theta(t'-t)\int\widetilde{dk}\; e^{-ik(x-x')} ,
\end{aligned} \tag{8.13}$$

where θ(t) is the unit step function. The integral over $\widetilde{dk}$ can also be performed in terms of Bessel functions; see section 4.

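Now, by analogy with the formula for the ground-state expectation value of a time-ordered product of operators for the harmonic oscillator, we have

$$\langle 0|\mathrm{T}\varphi(x_1)\ldots|0\rangle = \frac1i\frac{\delta}{\delta J(x_1)}\ldots\,Z_0(J)\Big|_{J=0} . \tag{8.14}$$

Using our explicit formula, eq. (8.10), we have

$$\begin{aligned}
\langle 0|\mathrm{T}\varphi(x_1)\varphi(x_2)|0\rangle &= \frac1i\frac{\delta}{\delta J(x_1)}\,\frac1i\frac{\delta}{\delta J(x_2)}\,Z_0(J)\Big|_{J=0} \\
&= \frac1i\frac{\delta}{\delta J(x_1)}\Big[\int d^4x'\;\Delta(x_2-x')J(x')\Big]Z_0(J)\Big|_{J=0} \\
&= \Big[\frac1i\,\Delta(x_2-x_1) + (\text{term with } J\text{'s})\Big]Z_0(J)\Big|_{J=0} \\
&= \frac1i\,\Delta(x_2-x_1) .
\end{aligned} \tag{8.15}$$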

We can continue in this way to compute the ground-state expectation value of the time-ordered product of more ϕ’s. If the number of ϕ’s is odd, then there is always a left-over J in the prefactor, and so the result is zero. If the number of ϕ’s is even, then we must pair up the functional derivatives in an appropriate way to get a nonzero result. Thus, for example,

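$$\langle 0|\mathrm{T}\varphi(x_1)\varphi(x_2)\varphi(x_3)\varphi(x_4)|0\rangle = \frac{1}{i^2}\Big[\Delta(x_1-x_2)\Delta(x_3-x_4) + \Delta(x_1-x_3)\Delta(x_2-x_4) + \Delta(x_1-x_4)\Delta(x_2-x_3)\Big] . \tag{8.16}$$

More generally,

$$\langle 0|\mathrm{T}\varphi(x_1)\ldots\varphi(x_{2n})|0\rangle = \frac{1}{i^n}\sum_{\text{pairings}}\Delta(x_{i_1}-x_{i_2})\ldots\Delta(x_{i_{2n-1}}-x_{i_{2n}}) . \tag{8.17}$$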

This result is known as Wick’s theorem.

Problems

8.1) Starting with eq. (8.11), verify eq. (8.12).

8.2) Starting with eq. (8.11), verify eq. (8.13).

8.3) Starting with eq. (8.13), verify eq. (8.12). Note that the time derivatives in the Klein-Gordon wave operator can act on either the ﬁeld (which obeys the Klein-Gordon equation) or the time-ordering step functions.

8.4) Use eqs. (3.19), (3.29), and (5.3) (and its hermitian conjugate) to verify the last line of eq. (8.15).

8.5) The retarded and advanced Green's functions for the Klein-Gordon wave operator satisfy $\Delta_{\rm ret}(x-y) = 0$ for $x^0 \ge y^0$ and $\Delta_{\rm adv}(x-y) = 0$ for $x^0 \le y^0$. Find the pole prescriptions on the right-hand side of eq. (8.11) that yield these Green's functions.

8.6) Let $Z_0(J) = \exp iW_0(J)$, and evaluate the real and imaginary parts of $W_0(J)$.

8.7) Repeat the analysis of this section for the complex scalar field that was introduced in problem 3.5, and further studied in problem 5.1. Write your source term in the form $J^\dagger\varphi + J\varphi^\dagger$, and find an explicit formula, analogous to eq. (8.10), for $Z_0(J^\dagger,J)$. Write down the appropriate generalization of eq. (8.14), and use it to compute $\langle 0|\mathrm{T}\varphi(x_1)\varphi(x_2)|0\rangle$, $\langle 0|\mathrm{T}\varphi^\dagger(x_1)\varphi(x_2)|0\rangle$, and $\langle 0|\mathrm{T}\varphi^\dagger(x_1)\varphi^\dagger(x_2)|0\rangle$. Then verify your results by using the method of problem 8.4. Finally, give the appropriate generalization of eq. (8.17).

8.8) A harmonic oscillator (in units with m = ħ = 1) has a ground-state wave function $\langle q|0\rangle \propto e^{-\omega q^2/2}$. Now consider a real scalar field ϕ(x), and define a field eigenstate $|A\rangle$ that obeys

$$\varphi(\mathbf{x},0)|A\rangle = A(\mathbf{x})|A\rangle , \tag{8.18}$$

where the function A(x) is everywhere real. For a free-field theory specified by the hamiltonian of eq. (8.1), show that the ground-state wave functional is

$$\langle A|0\rangle \propto \exp\Big[-\frac12\int\frac{d^3k}{(2\pi)^3}\;\omega(\mathbf{k})\,\tilde A(\mathbf{k})\tilde A(-\mathbf{k})\Big] , \tag{8.19}$$

where $\tilde A(\mathbf{k}) \equiv \int d^3x\; e^{-i\mathbf{k}\cdot\mathbf{x}}A(\mathbf{x})$ and $\omega(\mathbf{k}) \equiv (\mathbf{k}^2+m^2)^{1/2}$.

9: The Path Integral for Interacting Field Theory

71

9 The Path Integral for Interacting Field Theory
Prerequisite: 8

Let us consider an interacting quantum ﬁeld theory speciﬁed by a lagrangian of the form

$$\mathcal{L} = -\tfrac12 Z_\varphi\,\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac12 Z_m m^2\varphi^2 + \tfrac16 Z_g g\varphi^3 + Y\varphi . \tag{9.1}$$

As we discussed at the end of section 5, we ﬁx the parameter m by requiring it to be equal to the actual mass of the particle (equivalently, the energy of the ﬁrst excited state relative to the ground state), and we ﬁx the parameter g by requiring some particular scattering cross section to depend on g in some particular way. (We will have more to say about this after we have learned to calculate cross sections.) We also assume that the ﬁeld is normalized by

$$\langle 0|\varphi(x)|0\rangle = 0 \quad\text{and}\quad \langle k|\varphi(x)|0\rangle = e^{-ikx} . \tag{9.2}$$

Here $|0\rangle$ is the ground state, normalized via $\langle 0|0\rangle = 1$, and $|k\rangle$ is a state of one particle with four-momentum $k^\mu$, where $k^2 = k^\mu k_\mu = -m^2$, normalized via

$$\langle k'|k\rangle = (2\pi)^3\,2k^0\,\delta^3(\mathbf{k}'-\mathbf{k}) . \tag{9.3}$$

Thus we have four conditions (the specified values of m, g, $\langle 0|\varphi|0\rangle$, and $\langle k|\varphi|0\rangle$), and we will use these four conditions to determine the values of the four remaining parameters (Y and the three Z's) that appear in $\mathcal{L}$.
Before going further, we should note that this theory (known as $\varphi^3$ theory, pronounced "phi-cubed") actually has a fatal flaw. The hamiltonian density is

$$\mathcal{H} = \tfrac12 Z_\varphi^{-1}\Pi^2 - Y\varphi + \tfrac12 Z_m m^2\varphi^2 - \tfrac16 Z_g g\varphi^3 . \tag{9.4}$$

Classically, we can make this arbitrarily negative by choosing an arbitrarily large value for ϕ. Quantum mechanically, this means that this hamiltonian has no ground state. If we start off near ϕ = 0, we can tunnel through the potential barrier to large ϕ, and then "roll down the hill". However, this process is invisible in perturbation theory in g. The situation is exactly analogous to the problem of a harmonic oscillator perturbed by a $q^3$ term. This system also has no ground state, but perturbation theory (both time dependent and time independent) does not "know" this. We will be interested in eq. (9.1) only as an example of how to do perturbation expansions in a simple context, and so we will overlook this problem.
We would like to evaluate the path integral for this theory,

$$Z(J) \equiv \langle 0|0\rangle_J = \int\mathcal{D}\varphi\; e^{\,i\int d^4x\,[\mathcal{L}_0 + \mathcal{L}_1 + J\varphi]} . \tag{9.5}$$


We can evaluate Z(J) by mimicking what we did for quantum mechanics at the end of section 6. Speciﬁcally, we can rewrite eq. (9.5) as

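$$\begin{aligned}
Z(J) &= e^{\,i\int d^4x\;\mathcal{L}_1\left(\frac1i\frac{\delta}{\delta J(x)}\right)}\int\mathcal{D}\varphi\; e^{\,i\int d^4x\,[\mathcal{L}_0 + J\varphi]} \\
&\propto e^{\,i\int d^4x\;\mathcal{L}_1\left(\frac1i\frac{\delta}{\delta J(x)}\right)}\, Z_0(J) ,
\end{aligned} \tag{9.6}$$

where $Z_0(J)$ is the result in free-field theory,

$$Z_0(J) = \exp\Big[\frac{i}{2}\int d^4x\,d^4x'\; J(x)\,\Delta(x-x')\,J(x')\Big] . \tag{9.7}$$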

We have written Z(J) as proportional to (rather than equal to) the right-hand side of eq. (9.6) because the ε trick does not give us the correct overall normalization; instead, we must require Z(0) = 1, and enforce this by hand.
Note that, in eq. (9.7), we have implicitly assumed that

$$\mathcal{L}_0 = -\tfrac12\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac12 m^2\varphi^2 , \tag{9.8}$$

since this is the L0 that gives us eq. (9.7). Therefore, the rest of L must be included in L1. We write

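$$\begin{aligned}
\mathcal{L}_1 &= \tfrac16 Z_g g\varphi^3 + \mathcal{L}_{\rm ct} , \\
\mathcal{L}_{\rm ct} &= -\tfrac12(Z_\varphi-1)\,\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac12(Z_m-1)\,m^2\varphi^2 + Y\varphi ,
\end{aligned} \tag{9.9}$$

where $\mathcal{L}_{\rm ct}$ is called the counterterm lagrangian. We expect that, as g → 0, Y → 0 and $Z_i \to 1$. In fact, as we will see, Y = O(g) and $Z_i = 1 + O(g^2)$.
In order to make use of eq. (9.7), we will have to compute lots and lots of functional derivatives of $Z_0(J)$. Let us begin by ignoring the counterterms. We define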

$$Z_1(J) \propto \exp\Big[\frac{i}{6}Z_g g\int d^4x\,\Big(\frac1i\frac{\delta}{\delta J(x)}\Big)^3\Big]\,Z_0(J) , \tag{9.10}$$

where the constant of proportionality is ﬁxed by Z1(0) = 1. We now make a dual Taylor expansion in powers of g and J to get

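$$Z_1(J) \propto \sum_{V=0}^{\infty}\frac{1}{V!}\Big[\frac{iZ_g g}{6}\int d^4x\,\Big(\frac1i\frac{\delta}{\delta J(x)}\Big)^3\Big]^V \times \sum_{P=0}^{\infty}\frac{1}{P!}\Big[\frac{i}{2}\int d^4y\,d^4z\; J(y)\,\Delta(y-z)\,J(z)\Big]^P . \tag{9.11}$$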

[Figure 9.1: All connected diagrams with E = 0 and V = 2. Symmetry factors shown: S = 2³, S = 2 × 3!.]

[Figure 9.2: All connected diagrams with E = 0 and V = 4. Symmetry factors shown: S = 2⁴, S = 2³, S = 2⁴, S = 2³ × 3!, S = 4!.]

If we focus on a term in eq. (9.11) with particular values of V and P, then the number of surviving sources (after we take all the functional derivatives) is E = 2P − 3V. (Here E stands for external, a terminology that should become clear by the end of the next section; V stands for vertex and P for propagator.) The overall phase factor of such a term is then $i^V(1/i)^{3V}i^P = i^{V+E-P}$, and the 3V functional derivatives can act on the 2P sources in (2P)!/(2P−3V)! different combinations. However, many of the resulting expressions are algebraically identical.

To organize them, we introduce Feynman diagrams. In these diagrams, a line segment (straight or curved) stands for a propagator $\frac1i\Delta(x-y)$, a filled circle at one end of a line segment for a source $i\int d^4x\,J(x)$, and a vertex joining three line segments for $iZ_g g\int d^4x$. Sets of diagrams with different values of E and V are shown in figs. (9.1–9.11).
To count the number of terms on the right-hand side of eq. (9.11) that result in a particular diagram, we first note that, in each diagram, the number of lines is P and the number of vertices is V. We can rearrange the three functional derivatives from a particular vertex without changing the resulting diagram; this yields a counting factor of 3! for each vertex. Also, we can rearrange the vertices themselves; this yields a counting factor of V!. Similarly, we can rearrange the two sources at the ends of a particular propagator without changing the resulting diagram; this yields a counting factor of 2! for each propagator. Also, we can rearrange the propagators themselves; this yields a counting factor of P!. All together, these counting factors neatly cancel the numbers from the dual Taylor expansions in eq. (9.11).
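The bookkeeping for a single term of eq. (9.11) can be collected in a few lines. This is a minimal sketch, not from the text; the function names are hypothetical, and the only inputs are the integers P and V labelling the term.

```python
from math import factorial

def external_sources(P, V):
    """Number of sources left after 3V derivatives act on 2P sources: E = 2P - 3V."""
    return 2 * P - 3 * V

def phase_power(P, V):
    """Power of i in i^V (1/i)^(3V) i^P, reduced modulo 4."""
    return (V - 3 * V + P) % 4

def derivative_assignments(P, V):
    """Ways for 3V derivatives to act on 2P sources: (2P)! / (2P - 3V)!."""
    E = external_sources(P, V)
    if E < 0:
        return 0  # more derivatives than sources: the term vanishes
    return factorial(2 * P) // factorial(E)

def taylor_denominator(P, V):
    """Denominator from the dual Taylor expansion: V! (3!)^V P! (2!)^P."""
    return factorial(V) * factorial(3) ** V * factorial(P) * factorial(2) ** P

P, V = 3, 2
E = external_sources(P, V)
```

For P = 3 and V = 2 this gives E = 0, with 720 assignments against a Taylor denominator of 3456. The counting factors 3!, V!, 2!, and P! described above are exactly the factors in `taylor_denominator`, which is why they cancel when no rearrangements coincide.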
However, this procedure generally results in an overcounting of the number of terms that give identical results. This happens when some rearrangement of derivatives gives the same match-up to sources as some rearrangement of sources. This possibility is always connected to some symmetry property of the diagram, and so the factor by which we have overcounted is called the symmetry factor. The ﬁgures show the symmetry factor S of each diagram.
Consider, for example, the second diagram of ﬁg. (9.1). The three propagators can be rearranged in 3! ways, and all these rearrangements can be duplicated by exchanging the derivatives at the vertices. Furthermore the endpoints of each propagator can be simultaneously swapped, and the eﬀect duplicated by swapping the two vertices. Thus, S = 2 × 3! = 12.
Let us consider two more examples. In the ﬁrst diagram of ﬁg. (9.6), the exchange of the two external propagators (along with their attached sources) can be duplicated by exchanging all the derivatives at one vertex for those at the other, and simultaneously swapping the endpoints of each semicircular propagator. Also, the eﬀect of swapping the top and bottom semicircular propagators can be duplicated by swapping the corresponding derivatives at each vertex. Thus, the symmetry factor is S = 2 × 2 = 4.
In the diagram of ﬁg. (9.10), we can exchange derivatives to match swaps of the top and bottom external propagators on the left, or the top and bottom external propagators on the right, or the set of external propagators on the left with the set of external propagators on the right. Thus, the symmetry factor is S = 2 × 2 × 2 = 8.
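The symmetry factors quoted for fig. (9.1) can be cross-checked against the raw counting in eq. (9.11). The sketch below rests on one assumption that is not spelled out in the text: every assignment of derivatives to sources contributes with equal weight, so the total weight of all terms with given P and V equals the sum of 1/S over the diagrams they produce. For E = 0 and V = 2 every diagram is connected, since a single three-point vertex cannot close all of its lines on itself, so the two diagrams of fig. (9.1) should exhaust the sum.

```python
from fractions import Fraction
from math import factorial

P, V = 3, 2  # three propagators, two vertices, hence E = 2P - 3V = 0

# Number of ways the 3V derivatives can act on the 2P sources.
assignments = factorial(2 * P) // factorial(2 * P - 3 * V)

# Denominator supplied by the dual Taylor expansion in eq. (9.11).
denominator = factorial(V) * factorial(3) ** V * factorial(P) * factorial(2) ** P

total_weight = Fraction(assignments, denominator)

# Symmetry factors listed for the two diagrams of fig. (9.1).
symmetry_factors = [2 ** 3, 2 * factorial(3)]
sum_of_inverse_factors = sum(Fraction(1, S) for S in symmetry_factors)
```

Both quantities come out to 5/24, that is 1/8 + 1/12, which is consistent with S = 8 and S = 12 for the two vacuum diagrams.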
The diagrams in ﬁgs. (9.1–9.11) are all connected: we can trace a path through the diagram between any two points on it. However, these are not the only contributions to Z(J). The most general diagram consists of a product of several connected diagrams. Let CI stand for a particular connected diagram, including its symmetry factor. A general diagram D can then be expressed as

$$D = \frac{1}{S_D}\prod_I (C_I)^{n_I} , \tag{9.12}$$

where $n_I$ is an integer that counts the number of $C_I$'s in D, and $S_D$ is the additional symmetry factor for D (that is, the part of the symmetry factor that is not already accounted for by the symmetry factors already included in each of the connected diagrams). We now need to determine $S_D$.

[Figure 9.3: All connected diagrams with E = 1 and V = 1. Symmetry factor shown: S = 2.]

[Figure 9.4: All connected diagrams with E = 1 and V = 3. Symmetry factors shown: S = 2², S = 2², S = 2³.]

[Figure 9.5: All connected diagrams with E = 2 and V = 0. Symmetry factor shown: S = 2.]

[Figure 9.6: All connected diagrams with E = 2 and V = 2. Symmetry factors shown: S = 2², S = 2².]

Since we have already accounted for propagator and vertex rearrangements within each $C_I$, we need to consider only exchanges of propagators and vertices among different connected diagrams. These can leave the total diagram D unchanged only if (1) the exchanges are made among different but identical connected diagrams, and only if (2) the exchanges involve all of the propagators and vertices in a given connected diagram. If there are $n_I$ factors of $C_I$ in D, there are $n_I!$ ways to make these rearrangements. Overall, then, we have

$$S_D = \prod_I n_I! . \tag{9.13}$$

Now Z1(J) is given (up to an overall normalization) by summing all diagrams D, and each D is labeled by the integers nI. Therefore

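$$\begin{aligned}
Z_1(J) &\propto \sum_{\{n_I\}} D \\
&\propto \sum_{\{n_I\}}\prod_I\frac{1}{n_I!}(C_I)^{n_I} \\
&\propto \prod_I\sum_{n_I=0}^{\infty}\frac{1}{n_I!}(C_I)^{n_I} \\
&\propto \prod_I\exp(C_I) \\
&\propto \exp\Big(\sum_I C_I\Big) .
\end{aligned} \tag{9.14}$$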
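The chain of steps in eq. (9.14) is the statement that a sum over all occupation numbers {n_I} factorizes into a product of exponentials. A quick numerical sketch (illustrative only) makes this concrete: the three numbers standing in for connected-diagram values C_I are arbitrary, and the sum over each n_I is truncated at a finite cutoff.

```python
import cmath
from itertools import product
from math import factorial

def sum_over_diagrams(connected, max_copies):
    """Sum over all {n_I} of prod_I C_I^{n_I} / n_I!, with each n_I <= max_copies."""
    total = 0.0
    for counts in product(range(max_copies + 1), repeat=len(connected)):
        term = 1.0
        for C, n in zip(connected, counts):
            term *= C ** n / factorial(n)
        total += term
    return total

connected = [0.3, -0.2 + 0.1j, 0.05]  # hypothetical values of three connected diagrams
lhs = sum_over_diagrams(connected, max_copies=12)
rhs = cmath.exp(sum(connected))
```

With the cutoff at 12 copies the truncation error is far below double precision for these small values, so `lhs` and `rhs` agree to rounding accuracy.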

Thus we have a remarkable result: Z1(J) is given by the exponential of the sum of connected diagrams. This makes it easy to impose the normalization
Z1(0) = 1: we simply omit the vacuum diagrams (those with no sources), like those of ﬁgs. (9.1) and (9.2). We then have

Z1(J) = exp[iW1(J)] ,

(9.15)

where we have deﬁned

$$ iW_1(J) \equiv \sum_{I \neq \{0\}} C_I , \tag{9.16} $$

and the notation $I \neq \{0\}$ means that the vacuum diagrams are omitted from the sum, so that $W_1(0) = 0$.¹
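As a small illustration of how the exponential reproduces the factors of $1/n_I!$ in eq. (9.13), consider a hypothetical situation with only two connected diagrams, labeled $C_1$ and $C_2$ (these labels are assumptions made for this sketch, not diagrams from the figures):

```latex
\exp(C_1 + C_2)
  = \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty}
    \frac{(C_1)^{n_1}}{n_1!} \, \frac{(C_2)^{n_2}}{n_2!}
  = 1 + C_1 + C_2 + \tfrac{1}{2} C_1^2 + C_1 C_2 + \tfrac{1}{2} C_2^2 + \ldots
```

The term $\tfrac{1}{2}C_1^2$ is the disconnected diagram with $n_1 = 2$, for which $S_D = 2! = 2$, while $C_1 C_2$ has $n_1 = n_2 = 1$ and $S_D = 1$.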
¹ We have included a factor of i on the left-hand side of eq. (9.16) because then $W_1(J)$ is real in free-field theory; see problem 8.6.

Figure 9.7: All connected diagrams with E = 2 and V = 4. (Symmetry factors, in the order extracted from the figure: $S = 2^3$, $2^3$, $2^4$, $2^2$, $2^3$, $2^2$, $2^3$, $2^2$, $2^2$.)

Figure 9.8: All connected diagrams with E = 3 and V = 1. (Symmetry factor: $S = 3!$.)

Figure 9.9: All connected diagrams with E = 3 and V = 3. (Symmetry factors: $S = 3!$, $2^2$, $2^2$.)

Figure 9.10: All connected diagrams with E = 4 and V = 2. (Symmetry factor: $S = 2^3$.)

Figure 9.11: All connected diagrams with E = 4 and V = 4. (Symmetry factors, in the order extracted from the figure: $S = 2^4$, $2^4$, $2^3$, $2^2$, $2^2$, $2^2$.)

Figure 9.12: All connected diagrams with E = 1, X ≥ 1 (where X is the number of one-point vertices from the linear counterterm), and V + X ≤ 3. (Symmetry factors: $S = 1$, $2$, $2$, $2$.)

Were it not for the counterterms in $L_1$, we would have $Z(J) = Z_1(J)$. Let us see what we would get if this was, in fact, the case. In particular, let us compute the vacuum expectation value of the field ϕ(x), which is given by

$$ \langle 0|\varphi(x)|0\rangle = \frac{1}{i} \frac{\delta}{\delta J(x)} Z_1(J) \bigg|_{J=0} = \frac{\delta}{\delta J(x)} W_1(J) \bigg|_{J=0} . \tag{9.17} $$

This expression is then the sum of all diagrams [such as those in ﬁgs. (9.3) and (9.4)] that have a single source, with the source removed:

$$ \langle 0|\varphi(x)|0\rangle = \tfrac{1}{2} ig \int d^4y \, \tfrac{1}{i}\Delta(x-y) \, \tfrac{1}{i}\Delta(y-y) + O(g^3) . \tag{9.18} $$

Here we have set $Z_g = 1$ in the first term, since $Z_g = 1 + O(g^2)$. We see that the vacuum expectation value of ϕ(x) is not zero, whereas the validity of the LSZ formula requires it to vanish. To fix this, we must introduce the counterterm Yϕ. Including this term in the interaction lagrangian $L_1$ introduces a new kind of vertex, one where a single line segment ends; the corresponding vertex factor is $iY \int d^4y$. The simplest diagrams including this new vertex are shown in fig. (9.12), with a cross symbolizing the vertex.
Assuming Y = O(g), only the ﬁrst diagram in ﬁg. (9.12) contributes at O(g), and we have

$$ \langle 0|\varphi(x)|0\rangle = \Bigl( iY + \tfrac{1}{2} (ig) \tfrac{1}{i}\Delta(0) \Bigr) \int d^4y \, \tfrac{1}{i}\Delta(x-y) + O(g^3) . \tag{9.19} $$

Thus, in order to have $\langle 0|\varphi(x)|0\rangle = 0$, we should choose

$$ Y = \tfrac{1}{2} ig\Delta(0) + O(g^3) . \tag{9.20} $$

The factor of i is disturbing, because Y must be a real number: it is the coeﬃcient of a hermitian operator in the hamiltonian, as seen in eq. (9.4). Therefore, ∆(0) must be purely imaginary, or we are in trouble. We have

$$ \Delta(0) = \int \frac{d^4k}{(2\pi)^4} \frac{1}{k^2 + m^2 - i\epsilon} . \tag{9.21} $$

From eq. (9.21), it is not immediately obvious whether or not ∆(0) is purely imaginary, but eq. (9.21) does reveal another problem: the integral diverges at large k. This is another example of an ultraviolet divergence, similar to the one we encountered in section 3 when we computed the zero-point energy of the ﬁeld.
To make some progress, we introduce an ultraviolet cutoff Λ, which we assume is much larger than m and any other energy of physical interest. Modifications to the propagator above some cutoff may be well justified physically; for example, quantum fluctuations in spacetime itself should become important above the Planck scale, which is given by the inverse square root of Newton's constant, and has the numerical value of $10^{19}$ GeV (compared to, say, the proton mass, which is 1 GeV).
In order to retain the Lorentz-transformation properties of the propagator, we implement the ultraviolet cutoff in a more subtle way than we did in section 3; specifically, we make the replacement

$$ \Delta(x-y) \to \int \frac{d^4k}{(2\pi)^4} \frac{e^{ik(x-y)}}{k^2 + m^2 - i\epsilon} \left( \frac{\Lambda^2}{k^2 + \Lambda^2 - i\epsilon} \right)^{2} . \tag{9.22} $$

The integral is now convergent, and we can evaluate the modiﬁed ∆(0) with the methods of section 14; for Λ ≫ m, the result is

$$ \Delta(0) = \frac{i}{16\pi^2} \Lambda^2 . \tag{9.23} $$

Thus Y is real, as required. If we like, we can now formally take the limit Λ → ∞. The parameter Y becomes infinite, but $\langle 0|\varphi(x)|0\rangle$ remains zero, at least to this order in g.
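The following is a hedged sketch of how eq. (9.23) can be obtained; the text defers the general method to section 14. It assumes the standard Wick rotation $k^0 = i k^0_E$ (so that $d^4k = i\,d^4k_E$ and $k^2 = k_E^2$), uses $\int d^4k_E = \pi^2 \int_0^\infty u\,du$ with $u \equiv k_E^2$, and neglects $m^2$ relative to $\Lambda^2$. The symbols $k_E$ and $u$ are introduced here for illustration only.

```latex
\Delta(0) \to i \int \frac{d^4k_E}{(2\pi)^4} \,
      \frac{1}{k_E^2 + m^2} \left( \frac{\Lambda^2}{k_E^2 + \Lambda^2} \right)^{2}
  \simeq \frac{i}{16\pi^2} \int_0^\infty du \, \frac{\Lambda^4}{(u + \Lambda^2)^2}
  = \frac{i}{16\pi^2} \, \Lambda^2 ,
\qquad
Y = \tfrac{1}{2} i g \Delta(0) = - \frac{g \Lambda^2}{32\pi^2} .
```

Under these assumptions the factor of i in eq. (9.20) is compensated by the i from the rotated contour, which is why Y comes out real.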
It may be disturbing to have a parameter in the lagrangian that is formally inﬁnite. However, such parameters are not directly measurable, and so need not obey our preconceptions about their magnitudes. Also, it is important to remember that Y includes a factor of g; this means that we can expand in powers of Y as part of our general expansion in powers of g. When we compute something measurable (like a scattering cross section), all the formally inﬁnite numbers will cancel in a well-deﬁned way, leaving behind ﬁnite coeﬃcients for the various powers of g. We will see how this works in detail in sections 14–20.
As we go to higher orders in g, things become more complicated, but in principle the procedure is the same. Thus, at $O(g^3)$, we sum up the diagrams of figs. (9.4) and (9.12), and then add to Y whatever $O(g^3)$ term is needed to maintain $\langle 0|\varphi(x)|0\rangle = 0$. In this way we can determine the value of Y order by order in powers of g.
Once this is done, there is a remarkable simplification. Our adjustment of Y to keep $\langle 0|\varphi(x)|0\rangle = 0$ means that the sum of all connected diagrams with a single source is zero. Consider now that same infinite set of diagrams, but replace the single source in each of them with some other subdiagram. Here is the point: no matter what this replacement subdiagram is, the sum of all these diagrams is still zero. Therefore, we need not bother to compute any of them! The rule is this: ignore any diagram that, when a single line is cut, falls into two parts, one of which has no sources. All of these diagrams (known as tadpoles) are canceled by the Y counterterm, no matter what subdiagram they are attached to. The diagrams that remain (and need to be computed!) are shown in fig. (9.13).

Figure 9.13: All connected diagrams without tadpoles with E ≤ 4 and V ≤ 4.

We turn next to the remaining two counterterms. For notational simplicity we define

$$ A = Z_\varphi - 1 , \qquad B = Z_m - 1 , \tag{9.24} $$

and recall that we expect each of these to be $O(g^2)$. We now have

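$$ Z(J) = \exp\left[ -\frac{i}{2} \int d^4x \left( \frac{1}{i} \frac{\delta}{\delta J(x)} \right) \left( -A\partial_x^2 + Bm^2 \right) \left( \frac{1}{i} \frac{\delta}{\delta J(x)} \right) \right] Z_1(J) . \tag{9.25} $$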

We have integrated by parts to put both $\partial_x$'s onto one δ/δJ(x). (Note that the time derivatives in this interaction should really be treated by including an extra source term for the conjugate momentum $\Pi = \dot\varphi$. However, the space derivatives are correctly treated, and then the time derivatives must work out comparably by Lorentz invariance.)

Eq. (9.25) results in a new vertex at which two lines meet. The corresponding vertex factor is $(-i) \int d^4x \, (-A\partial_x^2 + Bm^2)$; the $\partial_x^2$ acts on the x in one or the other (but not both) propagators. (Which one does not matter, and can be changed via integration by parts.) Diagrammatically, all we need do is sprinkle these new vertices onto the propagators in our existing diagrams. How many of these vertices we need to add depends on the order in g we are working to achieve.

This completes our calculation of Z(J) in ϕ3 theory. We express it as

Z(J) = exp[iW (J)] ,

(9.26)

where W (J) is given by the sum of all connected diagrams with no tadpoles and at least two sources, and including the counterterm vertices just discussed.
Now that we have Z(J), we must ﬁnd out what we can do with it.

Problems

9.1) Compute the symmetry factor for each diagram in ﬁg. (9.13). (You can then check your answers by consulting the earlier ﬁgures.)

9.2) Consider a real scalar ﬁeld with L = L0 + L1, where

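$$ \begin{aligned}
L_0 &= -\tfrac{1}{2} \partial^\mu\varphi \partial_\mu\varphi - \tfrac{1}{2} m^2\varphi^2 , \\
L_1 &= -\tfrac{1}{24} Z_\lambda \lambda \varphi^4 + L_{\rm ct} , \\
L_{\rm ct} &= -\tfrac{1}{2} (Z_\varphi - 1) \partial^\mu\varphi \partial_\mu\varphi - \tfrac{1}{2} (Z_m - 1) m^2 \varphi^2 .
\end{aligned} $$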

a) What kind of vertex appears in the diagrams for this theory (that is, how many line segments does it join?), and what is the associated vertex factor?
b) Ignoring the counterterms, draw all the connected diagrams with 1 ≤ E ≤ 4 and 0 ≤ V ≤ 2, and ﬁnd their symmetry factors.
c) Explain why we did not have to include a counterterm linear in ϕ to cancel tadpoles.

9.3) Consider a complex scalar ﬁeld (see problems 3.5, 5.1, and 8.7) with L = L0 + L1, where

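$$ \begin{aligned}
L_0 &= -\partial^\mu\varphi^\dagger \partial_\mu\varphi - m^2\varphi^\dagger\varphi , \\
L_1 &= -\tfrac{1}{4} Z_\lambda \lambda (\varphi^\dagger\varphi)^2 + L_{\rm ct} , \\
L_{\rm ct} &= -(Z_\varphi - 1) \partial^\mu\varphi^\dagger \partial_\mu\varphi - (Z_m - 1) m^2 \varphi^\dagger\varphi .
\end{aligned} $$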

This theory has two kinds of sources, J and J†, and so we need a way to tell which is which when we draw the diagrams. Rather than labeling the source blobs with a J or J†, we will indicate which is which by putting an arrow on the attached propagator that points towards the source if it is a J†, and away from the source if it is a J.
a) What kind of vertex appears in the diagrams for this theory, and what is the associated vertex factor? Hint: your answer should involve those arrows!
b) Ignoring the counterterms, draw all the connected diagrams with 1 ≤ E ≤ 4 and 0 ≤ V ≤ 2, and ﬁnd their symmetry factors. Hint: the arrows are important!

9.4) Consider the integral

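$$ \exp W(g,J) \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dx \, \exp\left[ -\tfrac{1}{2} x^2 + \tfrac{1}{6} g x^3 + Jx \right] . \tag{9.27} $$

This integral does not converge, but it can be used to generate a joint power series in g and J,

$$ W(g,J) = \sum_{V=0}^{\infty} \sum_{E=0}^{\infty} C_{V,E} \, g^V J^E . \tag{9.28} $$

a) Show that

$$ C_{V,E} = \sum_I \frac{1}{S_I} , \tag{9.29} $$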

where the sum is over all connected Feynman diagrams with E sources and V three-point vertices, and SI is the symmetry factor for each diagram.

b) Use eqs. (9.27) and (9.28) to compute CV,E for V ≤ 4 and E ≤ 5. (This is most easily done with a symbolic manipulation program like Mathematica.) Verify that the symmetry factors given in ﬁgs. (9.1– 9.11) satisfy the sum rule of eq. (9.29).

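c) Now consider W(g, J+Y), with Y fixed by the “no tadpole” condition

$$ \frac{\partial}{\partial J} W(g, J+Y) \bigg|_{J=0} = 0 . \tag{9.30} $$

Then write

$$ W(g, J+Y) = \sum_{V=0}^{\infty} \sum_{E=0}^{\infty} C_{V,E} \, g^V J^E . \tag{9.31} $$

Show that

$$ C_{V,E} = \sum_I \frac{1}{S_I} , \tag{9.32} $$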

where the sum is over all connected Feynman diagrams with E sources and V three-point vertices and no tadpoles, and SI is the symmetry factor for each diagram.

d) Let Y = a1g + a3g3 + . . . , and use eq. (9.30) to determine a1 and a3. Compute CV,E for V ≤ 4 and E ≤ 4. Verify that the symmetry factors for the diagrams in ﬁg. (9.13) satisfy the sum rule of eq. (9.32).
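Part (b) suggests a symbolic manipulation program; the same coefficients can also be obtained with exact rational arithmetic. The sketch below is one possible approach, not the text's: it expands the integrand of eq. (9.27) term by term, uses the Gaussian moments $\langle x^{2n}\rangle = (2n-1)!!$, and takes the logarithm of the resulting truncated double series. The function names are hypothetical, and the result includes the vacuum terms (E = 0).

```python
from fractions import Fraction
from math import factorial


def gaussian_moment(n):
    """Return <x^n> for a unit Gaussian: 0 for odd n, (n-1)!! for even n."""
    if n % 2:
        return Fraction(0)
    result = 1
    for k in range(n - 1, 0, -2):
        result *= k
    return Fraction(result)


def multiply(a, b, vmax, emax):
    """Multiply two truncated power series in (g, J), stored as {(V, E): coeff}."""
    out = {}
    for (v1, e1), c1 in a.items():
        for (v2, e2), c2 in b.items():
            v, e = v1 + v2, e1 + e2
            if v <= vmax and e <= emax:
                out[(v, e)] = out.get((v, e), Fraction(0)) + c1 * c2
    return out


def connected_coefficients(vmax, emax):
    """Coefficients C_{V,E} of W = log Z, truncated at V <= vmax, E <= emax."""
    # u = Z - 1: expand exp(g x^3/6 + J x) and integrate each monomial.
    u = {}
    for v in range(vmax + 1):
        for e in range(emax + 1):
            if (v, e) == (0, 0):
                continue  # the constant term of Z is 1, so u has none
            c = gaussian_moment(3 * v + e) / (6**v * factorial(v) * factorial(e))
            if c:
                u[(v, e)] = c
    # log(1 + u) = u - u^2/2 + u^3/3 - ...; u^n starts at total degree n,
    # so powers beyond vmax + emax cannot contribute to the truncated series.
    w, power = {}, {(0, 0): Fraction(1)}
    for n in range(1, vmax + emax + 1):
        power = multiply(power, u, vmax, emax)
        for key, c in power.items():
            w[key] = w.get(key, Fraction(0)) + Fraction((-1) ** (n + 1), n) * c
    return w


C = connected_coefficients(2, 4)
print(C[(0, 2)], C[(1, 1)], C[(1, 3)], C[(2, 2)])  # prints: 1/2 1/2 1/6 1/2
```

These four values agree with the sums of $1/S$ read off from figs. (9.5), (9.3), (9.8), and (9.6), respectively.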

9.5) The interaction picture. In this problem, we will derive a formula for $\langle 0|T\varphi(x_n) \ldots \varphi(x_1)|0\rangle$ without using path integrals. Suppose we have a hamiltonian density $\mathcal{H} = \mathcal{H}_0 + \mathcal{H}_1$, where $\mathcal{H}_0 = \tfrac{1}{2}\Pi^2 + \tfrac{1}{2}(\nabla\varphi)^2 + \tfrac{1}{2}m^2\varphi^2$, and $\mathcal{H}_1$ is a function of $\Pi(\mathbf{x},0)$ and $\varphi(\mathbf{x},0)$ and their spatial derivatives. (It should be chosen to preserve Lorentz invariance, but we will not be concerned with this issue.) We add a constant to H so that $H|0\rangle = 0$. Let $|\emptyset\rangle$ be the ground state of $H_0$, with a constant added to $H_0$ so that $H_0|\emptyset\rangle = 0$. ($H_1$ is then defined as $H - H_0$.) The Heisenberg-picture field is

$$ \varphi(\mathbf{x},t) \equiv e^{iHt} \varphi(\mathbf{x},0) e^{-iHt} . \tag{9.33} $$

We now define the interaction-picture field

$$ \varphi_I(\mathbf{x},t) \equiv e^{iH_0t} \varphi(\mathbf{x},0) e^{-iH_0t} . \tag{9.34} $$

a) Show that ϕI (x) obeys the Klein-Gordon equation, and hence is a free ﬁeld.

b) Show that ϕ(x) = U †(t)ϕI (x)U (t), where U (t) ≡ eiH0te−iHt is unitary.

c) Show that U(t) obeys the differential equation $i\frac{d}{dt}U(t) = H_I(t)U(t)$, where $H_I(t) = e^{iH_0t} H_1 e^{-iH_0t}$ is the interaction hamiltonian in the interaction picture, and the boundary condition U(0) = 1.

d) If $H_1$ is specified by a particular function of the Schrödinger-picture fields $\Pi(\mathbf{x},0)$ and $\varphi(\mathbf{x},0)$, show that $H_I(t)$ is given by the same function of the interaction-picture fields $\Pi_I(\mathbf{x},t)$ and $\varphi_I(\mathbf{x},t)$.
e) Show that, for t > 0,

$$ U(t) = T \exp\left[ -i \int_0^t dt' \, H_I(t') \right] \tag{9.35} $$

obeys the diﬀerential equation and boundary condition of part (c). What is the comparable expression for t < 0? Hint: you may need to deﬁne a new ordering symbol.
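f) Define $U(t_2,t_1) \equiv U(t_2)U^\dagger(t_1)$. Show that, for $t_2 > t_1$,

$$ U(t_2,t_1) = T \exp\left[ -i \int_{t_1}^{t_2} dt' \, H_I(t') \right] . \tag{9.36} $$

What is the comparable expression for $t_1 > t_2$?
g) For any time ordering, show that $U(t_3,t_1) = U(t_3,t_2)U(t_2,t_1)$ and that $U^\dagger(t_1,t_2) = U(t_2,t_1)$.
h) Show that

$$ \varphi(x_n) \ldots \varphi(x_1) = U^\dagger(t_n,0) \varphi_I(x_n) U(t_n,t_{n-1}) \varphi_I(x_{n-1}) \ldots U(t_2,t_1) \varphi_I(x_1) U(t_1,0) . \tag{9.37} $$

i) Show that $U^\dagger(t_n,0) = U^\dagger(\infty,0)U(\infty,t_n)$ and also that $U(t_1,0) = U(t_1,-\infty)U(-\infty,0)$.
j) Replace $H_0$ with $(1-i\epsilon)H_0$, and show that $\langle 0|U^\dagger(\infty,0) = \langle 0|\emptyset\rangle\langle\emptyset|$ and that $U(-\infty,0)|0\rangle = |\emptyset\rangle\langle\emptyset|0\rangle$.
k) Show that

$$ \langle 0|\varphi(x_n) \ldots \varphi(x_1)|0\rangle = \langle\emptyset|U(\infty,t_n) \varphi_I(x_n) U(t_n,t_{n-1}) \varphi_I(x_{n-1}) \ldots U(t_2,t_1) \varphi_I(x_1) U(t_1,-\infty)|\emptyset\rangle \times |\langle\emptyset|0\rangle|^2 . \tag{9.38} $$

l) Show that

$$ \langle 0|T\varphi(x_n) \ldots \varphi(x_1)|0\rangle = \langle\emptyset|T\varphi_I(x_n) \ldots \varphi_I(x_1) \, e^{-i\int d^4x \, \mathcal{H}_I(x)}|\emptyset\rangle \times |\langle\emptyset|0\rangle|^2 . \tag{9.39} $$

m) Show that

$$ |\langle\emptyset|0\rangle|^2 = 1 / \langle\emptyset|T e^{-i\int d^4x \, \mathcal{H}_I(x)}|\emptyset\rangle . \tag{9.40} $$

Thus we have

$$ \langle 0|T\varphi(x_n) \ldots \varphi(x_1)|0\rangle = \frac{\langle\emptyset|T\varphi_I(x_n) \ldots \varphi_I(x_1) \, e^{-i\int d^4x \, \mathcal{H}_I(x)}|\emptyset\rangle}{\langle\emptyset|T e^{-i\int d^4x \, \mathcal{H}_I(x)}|\emptyset\rangle} . \tag{9.41} $$

We can now Taylor expand the exponentials on the right-hand side of eq. (9.41), and use free-field theory to compute the resulting correlation functions.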

10 Scattering Amplitudes and the Feynman Rules
Prerequisite: 5, 9

Now that we have an expression for Z(J) = exp iW(J), we can take functional derivatives to compute vacuum expectation values of time-ordered products of fields. Consider the case of two fields; we define the exact propagator via

$$ \tfrac{1}{i} \mathbf{\Delta}(x_1 - x_2) \equiv \langle 0|T\varphi(x_1)\varphi(x_2)|0\rangle . \tag{10.1} $$

For notational simplicity let us define

$$ \delta_j \equiv \frac{1}{i} \frac{\delta}{\delta J(x_j)} . \tag{10.2} $$

Then we have

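$$ \begin{aligned}
\langle 0|T\varphi(x_1)\varphi(x_2)|0\rangle &= \delta_1\delta_2 Z(J) \big|_{J=0} \\
&= \delta_1\delta_2 iW(J) \big|_{J=0} + \delta_1 iW(J) \big|_{J=0} \, \delta_2 iW(J) \big|_{J=0} \\
&= \delta_1\delta_2 iW(J) \big|_{J=0} .
\end{aligned} \tag{10.3} $$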

To get the last line we used $\delta_j W(J)|_{J=0} = \langle 0|\varphi(x_j)|0\rangle = 0$. Diagrammatically, $\delta_1$ removes a source, and labels the propagator endpoint $x_1$. Thus $\tfrac{1}{i}\mathbf{\Delta}(x_1-x_2)$ is given by the sum of diagrams with two sources, with those sources removed and the endpoints labeled $x_1$ and $x_2$. (The labels must be applied in both ways. If the diagram was originally symmetric on exchange of the two sources, the associated symmetry factor of 2 is then canceled by the double labeling.) At lowest order, the only contribution is the “barbell” diagram of fig. (9.5) with the sources removed. Thus we recover the obvious fact that $\tfrac{1}{i}\mathbf{\Delta}(x_1-x_2) = \tfrac{1}{i}\Delta(x_1-x_2) + O(g^2)$. We will take up the subject of the $O(g^2)$ corrections in section 14.

For now, let us go on to compute

For now, let us go on to compute

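$$ \begin{aligned}
\langle 0|T\varphi(x_1)\varphi(x_2)\varphi(x_3)\varphi(x_4)|0\rangle &= \delta_1\delta_2\delta_3\delta_4 Z(J) \big|_{J=0} \\
&= \Bigl[ \delta_1\delta_2\delta_3\delta_4 iW + (\delta_1\delta_2 iW)(\delta_3\delta_4 iW) \\
&\qquad + (\delta_1\delta_3 iW)(\delta_2\delta_4 iW) + (\delta_1\delta_4 iW)(\delta_2\delta_3 iW) \Bigr]_{J=0} .
\end{aligned} \tag{10.4} $$

We have dropped terms that contain a factor of $\langle 0|\varphi(x)|0\rangle = 0$. According to eq. (10.3), the last three terms in eq. (10.4) simply give products of the exact propagators.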

Let us see what happens when these terms are inserted into the LSZ formula for two incoming and two outgoing particles,

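$$ \begin{aligned}
\langle f|i\rangle = i^4 \int & d^4x_1 \, d^4x_2 \, d^4x'_1 \, d^4x'_2 \; e^{i(k_1x_1 + k_2x_2 - k'_1x'_1 - k'_2x'_2)} \\
& \times (-\partial_1^2 + m^2)(-\partial_2^2 + m^2)(-\partial_{1'}^2 + m^2)(-\partial_{2'}^2 + m^2) \\
& \times \langle 0|T\varphi(x_1)\varphi(x_2)\varphi(x'_1)\varphi(x'_2)|0\rangle .
\end{aligned} \tag{10.5} $$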

If we consider, for example, $\tfrac{1}{i}\mathbf{\Delta}(x_1-x'_1) \, \tfrac{1}{i}\mathbf{\Delta}(x_2-x'_2)$ as one term in the correlation function in eq. (10.5), we get from this term

$$ \begin{aligned}
\int d^4x_1 \, d^4x_2 \, d^4x'_1 \, d^4x'_2 \; & e^{i(k_1x_1 + k_2x_2 - k'_1x'_1 - k'_2x'_2)} F(x_{11'}) F(x_{22'}) \\
& = (2\pi)^4\delta^4(k_1-k'_1) \, (2\pi)^4\delta^4(k_2-k'_2) \, \tilde F(\bar k_{11'}) \, \tilde F(\bar k_{22'}) ,
\end{aligned} \tag{10.6} $$

where $F(x_{ij}) \equiv (-\partial_i^2 + m^2)(-\partial_j^2 + m^2)\mathbf{\Delta}(x_{ij})$, $\tilde F(k)$ is its Fourier transform, $x_{ij'} \equiv x_i - x'_j$, and $\bar k_{ij'} \equiv (k_i + k'_j)/2$. The important point is the two delta functions: these tell us that the four-momenta of the two outgoing particles (1′ and 2′) are equal to the four-momenta of the two incoming particles (1 and 2). In other words, no scattering has occurred. This is not the event whose probability we wish to compute! The other two similar terms in eq. (10.4) either contribute to “no scattering” events, or vanish due to factors like $\delta^4(k_1+k_2)$ (which is zero because $k_1^0 + k_2^0 \geq 2m > 0$). In general, the diagrams that contribute to the scattering process of interest are only those that are fully connected: every endpoint can be reached from every other endpoint by tracing through the diagram. These are the diagrams that arise from all the δ's acting on a single factor of W. Therefore, from here on, we restrict our attention to those diagrams alone. We define the connected correlation functions via

$$ \langle 0|T\varphi(x_1) \ldots \varphi(x_E)|0\rangle_{\rm C} \equiv \delta_1 \ldots \delta_E \, iW(J) \big|_{J=0} , \tag{10.7} $$

and use these instead of $\langle 0|T\varphi(x_1) \ldots \varphi(x_E)|0\rangle$ in the LSZ formula. Returning to eq. (10.4), we have

$$ \langle 0|T\varphi(x_1)\varphi(x_2)\varphi(x'_1)\varphi(x'_2)|0\rangle_{\rm C} = \delta_1\delta_2\delta_{1'}\delta_{2'} \, iW \big|_{J=0} . \tag{10.8} $$

The lowest-order (in g) nonzero contribution to this comes from the diagram of ﬁg. (9.10), which has four sources and two vertices. The four δ’s remove the four sources; there are 4! ways of matching up the δ’s to the sources. These 24 diagrams can then be collected into 3 groups of 8 diagrams each; the 8 diagrams in each group are identical. The 3 distinct diagrams are shown in ﬁg. (10.1). Note that the factor of 8 neatly cancels the symmetry factor S = 8 of the diagram with sources.

Figure 10.1: The three tree-level Feynman diagrams that contribute to the connected correlation function $\langle 0|T\varphi(x_1)\varphi(x_2)\varphi(x'_1)\varphi(x'_2)|0\rangle_{\rm C}$.

This is a general result for tree diagrams (those with no closed loops): once the sources have been stripped oﬀ and the endpoints labeled, each diagram with a distinct endpoint labeling has an overall symmetry factor of one. The tree diagrams for a given process represent the lowest-order (in g) nonzero contribution to that process.
We now have

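$$ \begin{aligned}
\langle 0|T\varphi(x_1)\varphi(x_2)\varphi(x'_1)\varphi(x'_2)|0\rangle_{\rm C}
= (ig)^2 \left( \tfrac{1}{i} \right)^5 \int d^4y \, d^4z \; \Delta(y-z) \, \Bigl[ \, & \Delta(x_1-y)\Delta(x_2-y)\Delta(x'_1-z)\Delta(x'_2-z) \\
+ \, & \Delta(x_1-y)\Delta(x'_1-y)\Delta(x_2-z)\Delta(x'_2-z) \\
+ \, & \Delta(x_1-y)\Delta(x'_2-y)\Delta(x_2-z)\Delta(x'_1-z) \Bigr] + O(g^4) .
\end{aligned} \tag{10.9} $$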

Next, we use eq. (10.9) in the LSZ formula, eq. (10.5). Each Klein-Gordon wave operator acts on a propagator to give

$$ (-\partial_i^2 + m^2) \Delta(x_i - y) = \delta^4(x_i - y) . \tag{10.10} $$

The integrals over the external spacetime labels x1,2,1′,2′ are then trivial, and we get

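$$ \begin{aligned}
\langle f|i\rangle = (ig)^2 \, \tfrac{1}{i} \int d^4y \, d^4z \; \Delta(y-z) \, \Bigl[ \, & e^{i(k_1y + k_2y - k'_1z - k'_2z)} \\
+ \, & e^{i(k_1y + k_2z - k'_1y - k'_2z)} + e^{i(k_1y + k_2z - k'_1z - k'_2y)} \Bigr] + O(g^4) .
\end{aligned} \tag{10.11} $$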

This can be simpliﬁed by substituting

$$ \Delta(y-z) = \int \frac{d^4k}{(2\pi)^4} \frac{e^{ik(y-z)}}{k^2 + m^2 - i\epsilon} \tag{10.12} $$

into eq. (10.11). Then the spacetime arguments appear only in phase factors, and we can integrate them to get delta functions:

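$$ \begin{aligned}
\langle f|i\rangle = {} & ig^2 \int \frac{d^4k}{(2\pi)^4} \frac{1}{k^2 + m^2 - i\epsilon} \\
& \times \Bigl[ (2\pi)^4\delta^4(k_1+k_2+k) \, (2\pi)^4\delta^4(k'_1+k'_2+k) \\
& \quad + (2\pi)^4\delta^4(k_1-k'_1+k) \, (2\pi)^4\delta^4(k'_2-k_2+k) \\
& \quad + (2\pi)^4\delta^4(k_1-k'_2+k) \, (2\pi)^4\delta^4(k'_1-k_2+k) \Bigr] + O(g^4) \\
= {} & ig^2 \, (2\pi)^4\delta^4(k_1+k_2-k'_1-k'_2) \\
& \times \left[ \frac{1}{(k_1+k_2)^2 + m^2} + \frac{1}{(k_1-k'_1)^2 + m^2} + \frac{1}{(k_1-k'_2)^2 + m^2} \right] + O(g^4) .
\end{aligned} \tag{10.13} $$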

In eq. (10.13), we have left out the iε's for notational convenience only; $m^2$ is really $m^2 - i\epsilon$. The overall delta function in eq. (10.13) tells us that four-momentum is conserved in the scattering process, which we should, of course, expect. For a general scattering process, it is then convenient to define a scattering matrix element T via

$$ \langle f|i\rangle = (2\pi)^4\delta^4(k_{\rm in} - k_{\rm out}) \, iT , \tag{10.14} $$

where kin and kout are the total four-momenta of the incoming and outgoing particles, respectively.
Examining the calculation which led to eq. (10.13), we can take away some universal features that lead to a simple set of Feynman rules for computing contributions to iT for a given scattering process. The Feynman rules are:

1. Draw lines (called external lines) for each incoming and each outgoing particle.

2. Leave one end of each external line free, and attach the other to a vertex at which exactly three lines meet. Include extra internal lines in order to do this. In this way, draw all possible diagrams that are topologically inequivalent.

3. On each incoming line, draw an arrow pointing towards the vertex. On each outgoing line, draw an arrow pointing away from the vertex. On each internal line, draw an arrow with an arbitrary direction.

4. Assign each line its own four-momentum. The four-momentum of an external line should be the four-momentum of the corresponding particle.

5. Think of the four-momenta as flowing along the arrows, and conserve four-momentum at each vertex. For a tree diagram, this fixes the momenta on all the internal lines.

6. The value of a diagram consists of the following factors: for each external line, 1; for each internal line with momentum k, $-i/(k^2 + m^2 - i\epsilon)$; for each vertex, $iZ_g g$.

7. A diagram with L closed loops will have L internal momenta that are not fixed by rule #5. Integrate over each of these momenta $\ell_i$ with measure $d^4\ell_i/(2\pi)^4$.

8. A loop diagram may have some leftover symmetry factors if there are exchanges of internal propagators and vertices that leave the diagram unchanged; in this case, divide the value of the diagram by the symmetry factor associated with exchanges of internal propagators and vertices.

9. Include diagrams with the counterterm vertex that connects two propagators, each with the same four-momentum k. The value of this vertex is $-i(Ak^2 + Bm^2)$, where $A = Z_\varphi - 1$ and $B = Z_m - 1$, and each is $O(g^2)$.

10. The value of iT is given by a sum over the values of all these diagrams.

Figure 10.2: The tree-level s-, t-, and u-channel diagrams contributing to iT for two particle scattering.
For the two-particle scattering process, the tree diagrams resulting from these rules are shown in ﬁg. (10.2).
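As a rough numerical illustration of rules 4 to 6 and 10 at tree level, the sketch below sums the three diagrams for given external momenta. It assumes $Z_g = 1$, drops the iε, and stores four-momenta as tuples (E, px, py, pz) with the mostly-plus metric used in the text; the function names and the sample kinematics are hypothetical.

```python
def minkowski_square(p):
    """p = (E, px, py, pz); mostly-plus metric, so p^2 = -E^2 + |p|^2."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2


def combine(p, q, sign):
    """Return p + sign*q component by component."""
    return tuple(a + sign * b for a, b in zip(p, q))


def tree_level_iT(k1, k2, k1p, k2p, g, m):
    """Sum the s-, t-, and u-channel diagrams using rule 6 with Z_g = 1."""
    vertex = 1j * g                      # rule 6: each vertex contributes i g
    internal_momenta = [
        combine(k1, k2, +1),             # s channel: k1 + k2
        combine(k1, k1p, -1),            # t channel: k1 - k1'
        combine(k1, k2p, -1),            # u channel: k1 - k2'
    ]
    total = 0j
    for k in internal_momenta:
        propagator = -1j / (minkowski_square(k) + m**2)  # i*epsilon dropped
        total += vertex * propagator * vertex
    return total


# Hypothetical kinematics: equal masses m = 1, each particle with E = 2
# in the CM frame, and scattering through 90 degrees.
m, g, E = 1.0, 0.5, 2.0
k = (E**2 - m**2) ** 0.5
k1, k2 = (E, 0.0, 0.0, k), (E, 0.0, 0.0, -k)
k1p, k2p = (E, k, 0.0, 0.0), (E, -k, 0.0, 0.0)
iT = tree_level_iT(k1, k2, k1p, k2p, g, m)
T = (iT / 1j).real
```

For these momenta the sum is purely imaginary, so T is real, consistent with the bracket in eq. (10.13).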
Now that we have our procedure for computing the scattering amplitude T , we must see how to relate it to a measurable cross section.
Problems

10.1) Use eq. (9.41) of problem 9.5 to rederive eq. (10.9).

10.2) Write down the Feynman rules for the complex scalar ﬁeld of problem 9.3. Remember that there are two kinds of particles now (which we can think of as positively and negatively charged), and that your rules must have a way of distinguishing them. Hint: the most direct approach requires two kinds of arrows: momentum arrows (as discussed in this section) and what we might call “charge” arrows (as discussed in problem 9.3). Try to ﬁnd a more elegant approach that requires only one kind of arrow.

10.3) Consider a complex scalar ﬁeld ϕ that interacts with a real scalar ﬁeld χ via L1 = gχϕ†ϕ. Use a solid line for the ϕ propagator and a dashed line for the χ propagator. Draw the vertex (remember the
arrows!), and ﬁnd the associated vertex factor.

10.4) Consider a real scalar field with $L_1 = \tfrac{1}{2} g \varphi \, \partial^\mu\varphi \, \partial_\mu\varphi$. Find the associated vertex factor.

10.5) The scattering amplitudes should be unchanged if we make a ﬁeld redeﬁnition. Suppose, for example, we have

$$ L = -\tfrac{1}{2} \partial^\mu\varphi \partial_\mu\varphi - \tfrac{1}{2} m^2\varphi^2 , \tag{10.15} $$

and we make the ﬁeld redeﬁnition

ϕ → ϕ + λϕ2 .

(10.16)

Work out the lagrangian in terms of the redeﬁned ﬁeld, and the corresponding Feynman rules. Compute (at tree level) the ϕϕ → ϕϕ scattering amplitude. You should get zero, because this is a free-ﬁeld theory in disguise. (At the loop level, we also have to take into account the transformation of the functional measure Dϕ; see section 85.)

11 Cross Sections and Decay Rates
Prerequisite: 10

Now that we have a method for computing the scattering amplitude T , we must convert it into something that could be measured in an experiment.
In practice, we are almost always concerned with one of two generic cases: one incoming particle, for which we compute a decay rate, or two incoming particles, for which we compute a cross section. We begin with the latter.
Let us also specialize, for now, to the case of two outgoing particles as well as two incoming particles. In ϕ3 theory, we found in section 10 that in this case we have

$$ T = g^2 \left[ \frac{1}{(k_1+k_2)^2 + m^2} + \frac{1}{(k_1-k'_1)^2 + m^2} + \frac{1}{(k_1-k'_2)^2 + m^2} \right] + O(g^4) , \tag{11.1} $$

where $k_1$ and $k_2$ are the four-momenta of the two incoming particles, $k'_1$ and $k'_2$ are the four-momenta of the two outgoing particles, and $k_1 + k_2 = k'_1 + k'_2$. Also, these particles are all on shell: $k_i^2 = -m_i^2$. (Here, for later use, we allow for the possibility that the particles all have different masses.)

Let us think about the kinematics of this process. In the center-of-mass frame, or CM frame for short, we take $\mathbf{k}_1 + \mathbf{k}_2 = 0$, and choose $\mathbf{k}_1$ to be in the +z direction. Now the only variable left to specify about the initial state is the magnitude of $\mathbf{k}_1$. Equivalently, we could specify the total energy in the CM frame, $E_1 + E_2$. However, it is even more convenient to define a Lorentz scalar $s \equiv -(k_1 + k_2)^2$. In the CM frame, s reduces to $(E_1 + E_2)^2$; s is therefore called the center-of-mass energy squared. Then, since $E_1 = (\mathbf{k}_1^2 + m_1^2)^{1/2}$ and $E_2 = (\mathbf{k}_1^2 + m_2^2)^{1/2}$, we can solve for $|\mathbf{k}_1|$ in terms of s, with the result

$$ |\mathbf{k}_1| = \frac{1}{2\sqrt{s}} \sqrt{s^2 - 2(m_1^2 + m_2^2)s + (m_1^2 - m_2^2)^2} \quad \text{(CM frame)} . \tag{11.2} $$

Now consider the two outgoing particles. Since momentum is conserved, we must have $\mathbf{k}'_1 + \mathbf{k}'_2 = 0$, and since energy is conserved, we must also have $(E'_1 + E'_2)^2 = s$. Then we find

$$ |\mathbf{k}'_1| = \frac{1}{2\sqrt{s}} \sqrt{s^2 - 2(m_{1'}^2 + m_{2'}^2)s + (m_{1'}^2 - m_{2'}^2)^2} \quad \text{(CM frame)} . \tag{11.3} $$

Now the only variable left to specify about the final state is the angle θ between $\mathbf{k}_1$ and $\mathbf{k}'_1$. However, it is often more convenient to work with the Lorentz scalar $t \equiv -(k_1 - k'_1)^2$, which is related to θ by

$$ t = m_1^2 + m_{1'}^2 - 2E_1E'_1 + 2|\mathbf{k}_1||\mathbf{k}'_1| \cos\theta . \tag{11.4} $$

This formula is valid in any frame.

The Lorentz scalars s and t are two of the three Mandelstam variables, defined as

$$ \begin{aligned}
s &\equiv -(k_1+k_2)^2 = -(k'_1+k'_2)^2 , \\
t &\equiv -(k_1-k'_1)^2 = -(k_2-k'_2)^2 , \\
u &\equiv -(k_1-k'_2)^2 = -(k_2-k'_1)^2 .
\end{aligned} \tag{11.5} $$

The three Mandelstam variables are not independent; they satisfy the linear relation

$$ s + t + u = m_1^2 + m_2^2 + m_{1'}^2 + m_{2'}^2 . \tag{11.6} $$
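The kinematic relations above can be checked numerically. The following sketch builds CM-frame four-momenta from eqs. (11.2) and (11.3), evaluates s, t, and u from eq. (11.5), and leaves the comparison with eqs. (11.4) and (11.6) to the reader or a test. The masses, the value of s, and the angle are hypothetical sample values, and four-momenta are stored as tuples (E, px, py, pz) with the mostly-plus metric.

```python
import math


def minkowski_dot(p, q):
    """Mostly-plus metric: p.q = -p0*q0 + (spatial dot product)."""
    return -p[0] * q[0] + p[1] * q[1] + p[2] * q[2] + p[3] * q[3]


def cm_magnitude(s, ma, mb):
    """|k| in the CM frame for masses ma, mb, as in eqs. (11.2) and (11.3)."""
    radicand = s**2 - 2 * (ma**2 + mb**2) * s + (ma**2 - mb**2) ** 2
    return math.sqrt(radicand) / (2 * math.sqrt(s))


def energy(magnitude, mass):
    """On-shell energy for a three-momentum of the given magnitude."""
    return math.sqrt(magnitude**2 + mass**2)


def cm_momenta(s, m1, m2, m1p, m2p, theta):
    """Four-momenta with k1 along +z and k1' at angle theta in the x-z plane."""
    k, kp = cm_magnitude(s, m1, m2), cm_magnitude(s, m1p, m2p)
    k1 = (energy(k, m1), 0.0, 0.0, k)
    k2 = (energy(k, m2), 0.0, 0.0, -k)
    k1p = (energy(kp, m1p), kp * math.sin(theta), 0.0, kp * math.cos(theta))
    k2p = (energy(kp, m2p), -kp * math.sin(theta), 0.0, -kp * math.cos(theta))
    return k1, k2, k1p, k2p


def mandelstam(k1, k2, k1p, k2p):
    """Return (s, t, u) as defined in eq. (11.5)."""
    def neg_square(p, q, sign):
        d = tuple(a + sign * b for a, b in zip(p, q))
        return -minkowski_dot(d, d)
    return neg_square(k1, k2, +1), neg_square(k1, k1p, -1), neg_square(k1, k2p, -1)


masses = (1.0, 2.0, 1.5, 0.5)            # hypothetical m1, m2, m1', m2'
theta = 0.7                               # hypothetical scattering angle
k1, k2, k1p, k2p = cm_momenta(30.0, *masses, theta)
s, t, u = mandelstam(k1, k2, k1p, k2p)
```

With these inputs, s + t + u should equal the sum of the squared masses to within rounding error, as eq. (11.6) requires.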

In terms of s, t, and u, we can rewrite eq. (11.1) as

$$ T = g^2 \left[ \frac{1}{m^2 - s} + \frac{1}{m^2 - t} + \frac{1}{m^2 - u} \right] + O(g^4) , \tag{11.7} $$

which demonstrates the notational utility of the Mandelstam variables.

Now let us consider a different frame, the fixed target or FT frame (also sometimes called the lab frame), in which particle #2 is initially at rest: $\mathbf{k}_2 = 0$. In this case we have

$$ |\mathbf{k}_1| = \frac{1}{2m_2} \sqrt{s^2 - 2(m_1^2 + m_2^2)s + (m_1^2 - m_2^2)^2} \quad \text{(FT frame)} . \tag{11.8} $$

Note that, from eqs. (11.8) and (11.2),

$$ m_2 |\mathbf{k}_1|_{\rm FT} = \sqrt{s} \, |\mathbf{k}_1|_{\rm CM} . \tag{11.9} $$

This will be useful later.

We would now like to derive a formula for the differential scattering cross section. In order to do so, we assume that the whole experiment is taking place in a big box of volume V, and lasts for a large time T. We should really think about wave packets coming together, but we will use some simple shortcuts instead. Also, to get a more general answer, we will let the number of outgoing particles be arbitrary.

Recall from section 10 that the overlap between the initial and final states is given by

$$ \langle f|i\rangle = (2\pi)^4\delta^4(k_{\rm in} - k_{\rm out}) \, i\mathcal{T} . \tag{11.10} $$

To get a probability, we must square $\langle f|i\rangle$, and divide by the norms of the initial and final states:

$$ P = \frac{|\langle f|i\rangle|^2}{\langle f|f\rangle \langle i|i\rangle} . \tag{11.11} $$

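The numerator of this expression is

$$ |\langle f|i\rangle|^2 = [(2\pi)^4\delta^4(k_{\rm in} - k_{\rm out})]^2 \, |\mathcal{T}|^2 . \tag{11.12} $$

We write the square of the delta function as

$$ [(2\pi)^4\delta^4(k_{\rm in} - k_{\rm out})]^2 = (2\pi)^4\delta^4(k_{\rm in} - k_{\rm out}) \times (2\pi)^4\delta^4(0) , \tag{11.13} $$

and note that

$$ (2\pi)^4\delta^4(0) = \int d^4x \, e^{i0\cdot x} = VT . \tag{11.14} $$

Also, the norm of a single particle state is given by

$$ \langle k|k\rangle = (2\pi)^3 \, 2k^0 \, \delta^3(0) = 2k^0 V . \tag{11.15} $$

Thus we have

$$ \langle i|i\rangle = 4E_1E_2V^2 , \tag{11.16} $$

$$ \langle f|f\rangle = \prod_{j=1}^{n'} 2k'^{\,0}_j V , \tag{11.17} $$

where n′ is the number of outgoing particles. (In eqs. (11.10) and (11.12), the scattering matrix element is written as $\mathcal{T}$ only to keep it visually distinct from the elapsed time T in eq. (11.14).)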

If we now divide eq. (11.11) by the elapsed time T, we get a probability per unit time

$$ \dot P = \frac{(2\pi)^4\delta^4(k_{\rm in} - k_{\rm out}) \, V \, |\mathcal{T}|^2}{4E_1E_2V^2 \prod_{j=1}^{n'} 2k'^{\,0}_j V} . \tag{11.18} $$

This is the probability per unit time to scatter into a set of outgoing particles with precise momenta. To get something measurable, we should sum each outgoing three-momentum $\mathbf{k}'_j$ over some small range. Due to the box, all three-momenta are quantized: $\mathbf{k}'_j = (2\pi/L)\mathbf{n}'_j$, where $V = L^3$, and $\mathbf{n}'_j$ is a three-vector with integer entries. (Here we have assumed periodic boundary conditions, but this choice does not affect the final result.) In the limit of large L, we have

$$ \sum_{\mathbf{n}'_j} \to \frac{V}{(2\pi)^3} \int d^3k'_j . \tag{11.19} $$

Thus we should multiply $\dot P$ by a factor of $V d^3k'_j/(2\pi)^3$ for each outgoing particle. Then we get

$$ \dot P = \frac{(2\pi)^4\delta^4(k_{\rm in} - k_{\rm out})}{4E_1E_2V} \, |\mathcal{T}|^2 \prod_{j=1}^{n'} \widetilde{dk}'_j , \tag{11.20} $$

where we have identified the Lorentz-invariant phase-space differential

$$ \widetilde{dk} \equiv \frac{d^3k}{(2\pi)^3 \, 2k^0} \tag{11.21} $$

that we first introduced in section 3.

To convert $\dot P$ to a differential cross section dσ, we must divide by the incident flux. Let us see how this works in the FT frame, where particle #2 is at rest. The incident flux is the number of particles per unit volume that are striking the target particle (#2), times their speed. We have one incident particle (#1) in a volume V with speed $v = |\mathbf{k}_1|/E_1$, and so the incident flux is $|\mathbf{k}_1|/E_1V$. Dividing eq. (11.20) by this flux cancels the last factor of V, and replaces $E_1$ in the denominator with $|\mathbf{k}_1|$. We also set $E_2 = m_2$ and note that eq. (11.8) gives $|\mathbf{k}_1|m_2$ as a function of s; dσ will be Lorentz invariant if, in other frames, we simply use this function as the value of $|\mathbf{k}_1|m_2$. Adopting this convention, and using eq. (11.9), we have

$$ d\sigma = \frac{1}{4|\mathbf{k}_1|_{\rm CM}\sqrt{s}} \, |\mathcal{T}|^2 \, d{\rm LIPS}_{n'}(k_1+k_2) , \tag{11.22} $$

where $|\mathbf{k}_1|_{\rm CM}$ is given as a function of s by eq. (11.2), and we have defined the n′-body Lorentz-invariant phase-space measure

$$ d{\rm LIPS}_{n'}(k) \equiv (2\pi)^4\delta^4\Bigl( k - \sum_{j=1}^{n'} k'_j \Bigr) \prod_{j=1}^{n'} \widetilde{dk}'_j . \tag{11.23} $$

Eq. (11.22) is our final result for the differential cross section for the scattering of two incoming particles into n′ outgoing particles.

Let us now specialize to the case of two outgoing particles. We need to evaluate

$$ d{\rm LIPS}_2(k) = (2\pi)^4\delta^4(k - k'_1 - k'_2) \, \widetilde{dk}'_1 \, \widetilde{dk}'_2 , \tag{11.24} $$

where $k = k_1 + k_2$. Since $d{\rm LIPS}_2(k)$ is Lorentz invariant, we can compute it in any convenient frame. Let us work in the CM frame, where $\mathbf{k} = \mathbf{k}_1 + \mathbf{k}_2 = 0$ and $k^0 = E_1 + E_2 = \sqrt{s}$; then we have

$$ d{\rm LIPS}_2(k) = \frac{1}{4(2\pi)^2 E'_1 E'_2} \, \delta(E'_1 + E'_2 - \sqrt{s}) \, \delta^3(\mathbf{k}'_1 + \mathbf{k}'_2) \, d^3k'_1 \, d^3k'_2 . \tag{11.25} $$

We can use the spatial part of the delta function to integrate over $d^3k'_2$, with the result

$$ d{\rm LIPS}_2(k) = \frac{1}{4(2\pi)^2 E'_1 E'_2} \, \delta(E'_1 + E'_2 - \sqrt{s}) \, d^3k'_1 , \tag{11.26} $$

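where now

$$ E'_1 = \sqrt{\mathbf{k}'^{\,2}_1 + m_{1'}^2} \quad \text{and} \quad E'_2 = \sqrt{\mathbf{k}'^{\,2}_1 + m_{2'}^2} . \tag{11.27} $$

Next, let us write

$$ d^3k'_1 = |\mathbf{k}'_1|^2 \, d|\mathbf{k}'_1| \, d\Omega_{\rm CM} , \tag{11.28} $$

where $d\Omega_{\rm CM} = \sin\theta \, d\theta \, d\phi$ is the differential solid angle, and θ is the angle between $\mathbf{k}_1$ and $\mathbf{k}'_1$ in the CM frame. We can carry out the integral over the magnitude of $\mathbf{k}'_1$ in eq. (11.26) using $\int dx \, \delta(f(x)) = \sum_i |f'(x_i)|^{-1}$, where $x_i$ satisfies $f(x_i) = 0$. In our case, the argument of the delta function vanishes at just one value of $|\mathbf{k}'_1|$, the value given by eq. (11.3). Also, the derivative of that argument with respect to $|\mathbf{k}'_1|$ is

$$ \begin{aligned}
\frac{\partial}{\partial|\mathbf{k}'_1|} \left( E'_1 + E'_2 - \sqrt{s} \right) &= \frac{|\mathbf{k}'_1|}{E'_1} + \frac{|\mathbf{k}'_1|}{E'_2} \\
&= |\mathbf{k}'_1| \, \frac{E'_1 + E'_2}{E'_1 E'_2} \\
&= \frac{|\mathbf{k}'_1| \sqrt{s}}{E'_1 E'_2} .
\end{aligned} \tag{11.29} $$

Putting all of this together, we get

$$ d{\rm LIPS}_2(k) = \frac{|\mathbf{k}'_1|}{16\pi^2\sqrt{s}} \, d\Omega_{\rm CM} . \tag{11.30} $$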
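For readers who want the intermediate step spelled out, here is one way to assemble eqs. (11.26), (11.28), and (11.29); it is a sketch of the algebra rather than a quotation from the text, with the integral over $|\mathbf{k}'_1|$ carried out using the delta function:

```latex
\int \frac{\delta(E_1' + E_2' - \sqrt{s})}{4(2\pi)^2 E_1' E_2'} \,
     |\mathbf{k}_1'|^2 \, d|\mathbf{k}_1'| \, d\Omega_{\rm CM}
  = \frac{|\mathbf{k}_1'|^2}{4(2\pi)^2 E_1' E_2'} \cdot
    \frac{E_1' E_2'}{|\mathbf{k}_1'| \sqrt{s}} \, d\Omega_{\rm CM}
  = \frac{|\mathbf{k}_1'|}{16\pi^2 \sqrt{s}} \, d\Omega_{\rm CM} .
```

The second factor in the middle expression is the inverse of the derivative in eq. (11.29), and $4(2\pi)^2 = 16\pi^2$.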

Combining this with eq. (11.22), we have

$$ \frac{d\sigma}{d\Omega_{\rm CM}} = \frac{1}{64\pi^2 s} \, \frac{|\mathbf{k}'_1|}{|\mathbf{k}_1|} \, |\mathcal{T}|^2 , \tag{11.31} $$

where |k1| and |k′1| are the functions of s given by eqs. (11.2) and (11.3), and dΩCM is the diﬀerential solid angle in the CM frame.
The diﬀerential cross section can also be expressed in a frame-independent
manner by noting that, in the CM frame, we can take the diﬀerential of
eq. (11.4) at ﬁxed s to get

dt = 2 |k1| |k′1| d cos θ

=

2 |k1| |k′1|

dΩCM 2π

.

Now we can rewrite eq. (11.31) as

(11.32) (11.33)

dσ dt

=

1 64πs|k1|2

|T

|2

,

(11.34)

11: Cross Sections and Decay Rates

98

where |k1| is given as a function of s by eq. (11.2).
We can now transform dσ/dt into dσ/dΩ in any frame we might like (such as the FT frame) by taking the differential of eq. (11.4) in that frame. In general, though, |k′1| depends on θ as well as s, so the result is more complicated than it is in eq. (11.32) for the CM frame.
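The step from eq. (11.31) to eq. (11.34) can be spelled out with the chain rule. This is a sketch that assumes |T|² does not depend on the azimuthal angle φ, so that dΩCM may be traded for 2π d cos θ as in eq. (11.33):

```latex
\begin{align*}
\frac{d\sigma}{dt}
  &= \frac{d\sigma}{d\Omega_{\rm CM}}\,\frac{d\Omega_{\rm CM}}{dt}
   = \frac{1}{64\pi^2 s}\,\frac{|\mathbf{k}'_1|}{|\mathbf{k}_1|}\,|T|^2
     \times \frac{2\pi}{2\,|\mathbf{k}_1|\,|\mathbf{k}'_1|} \\
  &= \frac{1}{64\pi s\,|\mathbf{k}_1|^2}\,|T|^2 .
\end{align*}
```

The factors of |k′1| cancel, which is why eq. (11.34) involves only |k1|.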
Returning to the general case of n′ outgoing particles, we can define a Lorentz invariant total cross section by integrating completely over all the outgoing momenta, and dividing by an appropriate symmetry factor S. If there are $n'_i$ identical outgoing particles of type i, then

$$S = \prod_i n'_i!\,, \tag{11.35}$$

and

$$\sigma = \frac{1}{S}\int d\sigma\,, \tag{11.36}$$

where dσ is given by eq. (11.22). We need the symmetry factor because merely integrating over all the outgoing momenta in $\mathrm{dLIPS}_{n'}$ treats the final state as being labeled by an ordered list of these momenta. But if some outgoing particles are identical, this is not correct; the momenta of the identical particles should be specified by an unordered list (because, for example, the state $a_1^\dagger a_2^\dagger|0\rangle$ is identical to the state $a_2^\dagger a_1^\dagger|0\rangle$). The symmetry factor provides the appropriate correction.
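The counting in eq. (11.35) can be illustrated with a short sketch. The function name and the species labels below are hypothetical; the only input is a list naming the type of each outgoing particle.

```python
from collections import Counter
from math import factorial, prod

def symmetry_factor(outgoing_species):
    """S = product over species i of (n'_i)!, as in eq. (11.35)."""
    counts = Counter(outgoing_species)          # n'_i for each species i
    return prod(factorial(n) for n in counts.values())

print(symmetry_factor(["phi", "phi"]))                 # two identical particles
print(symmetry_factor(["A", "B"]))                     # two distinguishable particles
print(symmetry_factor(["A", "A", "A", "B", "B"]))      # 3! * 2!
```

The three calls print 2, 1, and 12 respectively.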
In the case of two outgoing particles, eq. (11.36) becomes

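\begin{align}
\sigma &= \frac{1}{S}\int d\Omega_{\rm CM}\,\frac{d\sigma}{d\Omega_{\rm CM}} \tag{11.37}\\
       &= \frac{2\pi}{S}\int_{-1}^{+1} d\cos\theta\,\frac{d\sigma}{d\Omega_{\rm CM}}\,, \tag{11.38}
\end{align}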

where S = 2 if the two outgoing particles are identical, and S = 1 if they are distinguishable. Equivalently, we can compute σ from eq. (11.34) via

$$\sigma = \frac{1}{S}\int_{t_{\rm min}}^{t_{\rm max}} dt\,\frac{d\sigma}{dt}\,, \tag{11.39}$$

where $t_{\rm min}$ and $t_{\rm max}$ are given by eq. (11.4) in the CM frame with cos θ = −1 and +1, respectively. To compute σ with eq. (11.38), we should first express t and u in terms of s and θ via eqs. (11.4) and (11.6), and then integrate over θ at fixed s. To compute σ with eq. (11.39), we should first express u in terms of s and t via eq. (11.6), and then integrate over t at fixed s.
Let us see how all this works for the scattering amplitude of $\varphi^3$ theory, eq. (11.7). In this case, all the masses are equal, and so, in the CM frame, $E = \tfrac12\sqrt{s}$ for all four particles, and $|\mathbf{k}'_1| = |\mathbf{k}_1| = \tfrac12(s - 4m^2)^{1/2}$. Then eq. (11.4) becomes

$$t = -\tfrac12(s - 4m^2)(1 - \cos\theta)\,. \tag{11.40}$$

From eq. (11.6), we also have

$$u = -\tfrac12(s - 4m^2)(1 + \cos\theta)\,. \tag{11.41}$$

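Thus $|T|^2$ is quite a complicated function of s and θ. In the nonrelativistic limit, $|\mathbf{k}_1| \ll m$ or equivalently $s - 4m^2 \ll m^2$, we have

$$T = \frac{5g^2}{3m^2}\left[\,1 - \frac{8}{15}\left(\frac{s - 4m^2}{m^2}\right) + \frac{5}{18}\left(1 + \frac{27}{25}\cos^2\theta\right)\left(\frac{s - 4m^2}{m^2}\right)^{\!2} + \ldots\right] + O(g^4)\,. \tag{11.42}$$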

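Thus the differential cross section is nearly isotropic. In the extreme relativistic limit, $|\mathbf{k}_1| \gg m$ or equivalently $s \gg m^2$, we have

$$T = \frac{g^2}{s\,\sin^2\theta}\left[\,3 + \cos^2\theta - \left(\frac{(3 + \cos^2\theta)^2}{\sin^2\theta} - 16\right)\frac{m^2}{s} + \ldots\right] + O(g^4)\,. \tag{11.43}$$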

Now the differential cross section is sharply peaked in the forward (θ = 0) and backward (θ = π) directions.
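The two limiting forms, eqs. (11.42) and (11.43), can be compared against the full amplitude numerically. The sketch below assumes that eq. (11.7), which is not reproduced in this passage, is the tree-level sum T = g²[1/(m² − s) + 1/(m² − t) + 1/(m² − u)]; the function names and the values m = g = 1 are illustrative only.

```python
import math

def mandelstam_t_u(s, cos_theta, m):
    """Eqs. (11.40) and (11.41): t and u for four equal masses in the CM frame."""
    t = -0.5 * (s - 4.0 * m**2) * (1.0 - cos_theta)
    u = -0.5 * (s - 4.0 * m**2) * (1.0 + cos_theta)
    return t, u

def T_exact(s, cos_theta, m, g):
    """ASSUMED tree-level form: g^2 [1/(m^2-s) + 1/(m^2-t) + 1/(m^2-u)]."""
    t, u = mandelstam_t_u(s, cos_theta, m)
    return g**2 * (1.0 / (m**2 - s) + 1.0 / (m**2 - t) + 1.0 / (m**2 - u))

def T_nonrel(s, cos_theta, m, g):
    """Eq. (11.42), truncated after the quadratic term in (s - 4m^2)/m^2."""
    eps = (s - 4.0 * m**2) / m**2
    return (5.0 * g**2 / (3.0 * m**2)) * (
        1.0
        - (8.0 / 15.0) * eps
        + (5.0 / 18.0) * (1.0 + (27.0 / 25.0) * cos_theta**2) * eps**2
    )

def T_rel(s, cos_theta, m, g):
    """Eq. (11.43), truncated after the m^2/s term (requires sin(theta) != 0)."""
    c2 = cos_theta**2
    sin2 = 1.0 - c2
    return (g**2 / (s * sin2)) * (
        3.0 + c2 - ((3.0 + c2)**2 / sin2 - 16.0) * m**2 / s
    )

m, g = 1.0, 1.0  # illustrative values, not from the text
for c in (0.0, 0.5, 0.9):
    print("near threshold:", c, T_exact(4.01, c, m, g), T_nonrel(4.01, c, m, g))
    print("high energy:   ", c, T_exact(1.0e4, c, m, g), T_rel(1.0e4, c, m, g))
```

Near threshold the printed values barely change with cos θ, while at high energy they grow rapidly as |cos θ| approaches 1, in line with the remarks on isotropy and forward-backward peaking.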
We can compute the total cross section σ from eq. (11.39). We have in this case $t_{\rm min} = -(s - 4m^2)$ and $t_{\rm max} = 0$. Since the two outgoing particles are identical, the symmetry factor is S = 2. Then setting $u = 4m^2 - s - t$, and performing the integral in eq. (11.39) over t at fixed s, we get

$$\sigma = \frac{g^4}{32\pi s(s - 4m^2)}\left[\frac{2}{m^2} + \frac{s - 4m^2}{(s - m^2)^2} - \frac{2}{s - 3m^2} + \frac{4m^2}{(s - m^2)(s - 2m^2)}\,\ln\!\left(\frac{s - 3m^2}{m^2}\right)\right] + O(g^6)\,. \tag{11.44}$$

In the nonrelativistic limit, this becomes

$$\sigma = \frac{25g^4}{1152\pi m^6}\left[\,1 - \frac{79}{60}\left(\frac{s - 4m^2}{m^2}\right) + \ldots\right] + O(g^6)\,. \tag{11.45}$$

In the extreme relativistic limit, we get

$$\sigma = \frac{g^4}{16\pi m^2 s^2}\left[\,1 + \frac{7}{2}\,\frac{m^2}{s} + \ldots\right] + O(g^6)\,. \tag{11.46}$$
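The closed form in eq. (11.44) and its two limits can be cross-checked by integrating eq. (11.34) numerically as prescribed by eq. (11.39). The sketch below again assumes the tree-level amplitude T = g²[1/(m² − s) + 1/(m² − t) + 1/(m² − u)] for eq. (11.7), and uses a hand-written composite Simpson rule; the function names, the step count, and the values m = g = 1 are illustrative choices.

```python
import math

def T_of_t(s, t, m, g):
    """ASSUMED tree-level amplitude, with u eliminated via u = 4m^2 - s - t."""
    u = 4.0 * m**2 - s - t
    return g**2 * (1.0 / (m**2 - s) + 1.0 / (m**2 - t) + 1.0 / (m**2 - u))

def dsigma_dt(s, t, m, g):
    """Eq. (11.34) with |k1|^2 = (s - 4m^2)/4 for equal masses."""
    k1_sq = 0.25 * (s - 4.0 * m**2)
    return T_of_t(s, t, m, g)**2 / (64.0 * math.pi * s * k1_sq)

def sigma_numeric(s, m, g, n=2000):
    """Eq. (11.39) with S = 2, via composite Simpson's rule (n must be even)."""
    t_min, t_max = -(s - 4.0 * m**2), 0.0
    h = (t_max - t_min) / n
    total = dsigma_dt(s, t_min, m, g) + dsigma_dt(s, t_max, m, g)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0      # 4 on odd nodes, 2 on even nodes
        total += weight * dsigma_dt(s, t_min + i * h, m, g)
    return 0.5 * total * h / 3.0            # leading 0.5 is 1/S

def sigma_closed(s, m, g):
    """Eq. (11.44), lowest order in g."""
    bracket = (
        2.0 / m**2
        + (s - 4.0 * m**2) / (s - m**2)**2
        - 2.0 / (s - 3.0 * m**2)
        + 4.0 * m**2 / ((s - m**2) * (s - 2.0 * m**2))
        * math.log((s - 3.0 * m**2) / m**2)
    )
    return g**4 / (32.0 * math.pi * s * (s - 4.0 * m**2)) * bracket

m, g = 1.0, 1.0  # illustrative values
print("s = 10:   ", sigma_numeric(10.0, m, g), sigma_closed(10.0, m, g))
print("s = 4.001:", sigma_closed(4.001, m, g),
      25.0 / (1152.0 * math.pi) * (1.0 - 79.0 / 60.0 * 0.001))      # eq. (11.45)
print("s = 1e4:  ", sigma_closed(1.0e4, m, g),
      (1.0 + 3.5e-4) / (16.0 * math.pi * 1.0e8))                    # eq. (11.46)
```

Each printed pair should agree up to the neglected higher-order terms in the respective expansion (or, for s = 10, up to the quadrature error).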


These results illustrate how even a very simple quantum field theory can yield specific predictions for cross sections that could be tested experimentally.
Let us now turn to the other basic problem mentioned at the beginning of this section: the case of a single incoming particle that decays to n′ other particles.
We have an immediate conceptual problem. According to our development of the LSZ formula in section 5, each incoming and outgoing particle should correspond to a single-particle state that is an exact eigenstate of the exact hamiltonian. This is clearly not the case for a particle that can decay. Referring to fig. (5.1), the hyperbola of such a particle must lie above the continuum threshold. Strictly speaking, then, the LSZ formula is not applicable.
A proper understanding of this issue requires a study of loop corrections that we will undertake in section 25. For now, we will simply assume that the LSZ formula continues to hold for a single incoming particle. Then we can retrace the steps from eq. (11.11) to eq. (11.20); the only change is that the norm of the initial state is now

$$\langle i|i\rangle = 2E_1 V \tag{11.47}$$

instead of eq. (11.16). Identifying the differential decay rate dΓ with $\dot{P}$ then gives

$$d\Gamma = \frac{1}{2E_1}\,|T|^2\,\mathrm{dLIPS}_{n'}(k_1)\,, \tag{11.48}$$

where now $s = -k_1^2 = m_1^2$. In the CM frame (which is now the rest frame of the initial particle), we have $E_1 = m_1$; in other frames, the relative factor of $E_1/m_1$ in dΓ accounts for relativistic time dilation of the decay rate.
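The time-dilation statement can be made concrete with a small sketch. Since |T|² and the phase-space measure in eq. (11.48) are Lorentz invariant, the only frame dependence is the 1/(2E1) prefactor, so the rate is reduced by m1/E1 relative to the rest frame (equivalently, the lifetime is lengthened by E1/m1). The function names and numbers below are hypothetical.

```python
def decay_rate_in_frame(rate_rest, m1, E1):
    """Rescale a rest-frame decay rate to a frame where the particle has
    energy E1, using the 1/(2 E1) prefactor of eq. (11.48):
    Gamma(E1) = Gamma_rest * m1 / E1."""
    if E1 < m1:
        raise ValueError("the energy cannot be smaller than the mass")
    return rate_rest * m1 / E1

def mean_lifetime(rate):
    """Mean lifetime tau = 1 / Gamma in units with hbar = 1."""
    return 1.0 / rate

# Made-up particle: unit mass, unit rest-frame rate, observed with E1 = 2 m1.
rate_moving = decay_rate_in_frame(1.0, 1.0, 2.0)
print(rate_moving, mean_lifetime(rate_moving))
```

With E1 = 2 m1 the rate is halved and the mean lifetime doubled, as expected for a Lorentz factor of 2.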
We can also define a total decay rate by integrating over all the outgoing momenta, and dividing by the symmetry factor of eq. (11.35):

$$\Gamma = \frac{1}{S}\int d\Gamma\,. \tag{11.49}$$

We will compute a decay rate in problem 11.1.
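As a sketch of how eq. (11.48) is used, consider the special case n′ = 2 in the rest frame of the decaying particle, where $E_1 = m_1$ and $\sqrt{s} = m_1$. Substituting eq. (11.30) gives the first line below. The second line additionally assumes that |T|² does not depend on the outgoing direction, as is the case when all particles are spinless; this assumption is introduced here for illustration and is not taken from the text.

```latex
\begin{align*}
d\Gamma &= \frac{1}{2m_1}\,|T|^2\,\frac{|\mathbf{k}'_1|}{16\pi^2 m_1}\,d\Omega_{\rm CM}
         = \frac{|\mathbf{k}'_1|}{32\pi^2 m_1^2}\,|T|^2\,d\Omega_{\rm CM}\,, \\
\Gamma  &= \frac{1}{S}\int d\Gamma
         = \frac{|\mathbf{k}'_1|}{8\pi S\,m_1^2}\,|T|^2 .
\end{align*}
```

Here |k′1| is given by eq. (11.3) evaluated at $s = m_1^2$.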

Reference Notes

For a derivation with wave packets, see Brown, Itzykson & Zuber, or Peskin & Schroeder.

Problems