Quantum Field Theory

Mark Srednicki
University of California, Santa Barbara
mark@physics.ucsb.edu

© 2006 by M. Srednicki
All rights reserved.

Please DO NOT DISTRIBUTE this document. Instead, link to
http://www.physics.ucsb.edu/~mark/qft.html

To my parents
Casimir and Helen Srednicki
with gratitude

Contents

Preface for Students
Preface for Instructors
Acknowledgments

I Spin Zero
1 Attempts at relativistic quantum mechanics
2 Lorentz Invariance (prerequisite: 1)
3 Canonical Quantization of Scalar Fields (2)
4 The Spin-Statistics Theorem (3)
5 The LSZ Reduction Formula (3)
6 Path Integrals in Quantum Mechanics
7 The Path Integral for the Harmonic Oscillator (6)
8 The Path Integral for Free Field Theory (3, 7)
9 The Path Integral for Interacting Field Theory (8)
10 Scattering Amplitudes and the Feynman Rules (5, 9)
11 Cross Sections and Decay Rates (10)
12 Dimensional Analysis with ħ = c = 1 (3)
13 The Lehmann-Källén Form of the Exact Propagator (9)
14 Loop Corrections to the Propagator (10, 12, 13)
15 The One-Loop Correction in Lehmann-Källén Form (14)
16 Loop Corrections to the Vertex (14)
17 Other 1PI Vertices (16)
18 Higher-Order Corrections and Renormalizability (17)
19 Perturbation Theory to All Orders (18)
20 Two-Particle Elastic Scattering at One Loop (19)
21 The Quantum Action (19)
22 Continuous Symmetries and Conserved Currents (8)
23 Discrete Symmetries: P, T, C, and Z (22)
24 Nonabelian Symmetries (22)
25 Unstable Particles and Resonances (14)
26 Infrared Divergences (20)
27 Other Renormalization Schemes (26)
28 The Renormalization Group (27)
29 Effective Field Theory (28)
30 Spontaneous Symmetry Breaking (21)
31 Broken Symmetry and Loop Corrections (30)
32 Spontaneous Breaking of Continuous Symmetries (22, 30)

II Spin One Half
33 Representations of the Lorentz Group (2)
34 Left- and Right-Handed Spinor Fields (3, 33)
35 Manipulating Spinor Indices (34)
36 Lagrangians for Spinor Fields (22, 35)
37 Canonical Quantization of Spinor Fields I (36)
38 Spinor Technology (37)
39 Canonical Quantization of Spinor Fields II (38)
40 Parity, Time Reversal, and Charge Conjugation (23, 39)
41 LSZ Reduction for Spin-One-Half Particles (5, 39)
42 The Free Fermion Propagator (39)
43 The Path Integral for Fermion Fields (9, 42)
44 Formal Development of Fermionic Path Integrals (43)
45 The Feynman Rules for Dirac Fields (10, 12, 41, 43)
46 Spin Sums (45)
47 Gamma Matrix Technology (36)
48 Spin-Averaged Cross Sections (46, 47)
49 The Feynman Rules for Majorana Fields (45)
50 Massless Particles and Spinor Helicity (48)
51 Loop Corrections in Yukawa Theory (19, 40, 48)
52 Beta Functions in Yukawa Theory (28, 51)
53 Functional Determinants (44, 45)

III Spin One
54 Maxwell’s Equations (3)
55 Electrodynamics in Coulomb Gauge (54)
56 LSZ Reduction for Photons (5, 55)
57 The Path Integral for Photons (8, 56)
58 Spinor Electrodynamics (45, 57)
59 Scattering in Spinor Electrodynamics (48, 58)
60 Spinor Helicity for Spinor Electrodynamics (50, 59)
61 Scalar Electrodynamics (58)
62 Loop Corrections in Spinor Electrodynamics (51, 59)
63 The Vertex Function in Spinor Electrodynamics (62)
64 The Magnetic Moment of the Electron (63)
65 Loop Corrections in Scalar Electrodynamics (61, 62)
66 Beta Functions in Quantum Electrodynamics (52, 62)
67 Ward Identities in Quantum Electrodynamics I (22, 59)
68 Ward Identities in Quantum Electrodynamics II (63, 67)
69 Nonabelian Gauge Theory (24, 58)
70 Group Representations (69)
71 The Path Integral for Nonabelian Gauge Theory (53, 69)
72 The Feynman Rules for Nonabelian Gauge Theory (71)
73 The Beta Function in Nonabelian Gauge Theory (70, 72)
74 BRST Symmetry (70, 71)
75 Chiral Gauge Theories and Anomalies (70, 72)
76 Anomalies in Global Symmetries (75)
77 Anomalies and the Path Integral for Fermions (76)
78 Background Field Gauge (73)
79 Gervais–Neveu Gauge (78)
80 The Feynman Rules for N × N Matrix Fields (10)
81 Scattering in Quantum Chromodynamics (60, 79, 80)
82 Wilson Loops, Lattice Theory, and Confinement (29, 73)
83 Chiral Symmetry Breaking (76, 82)
84 Spontaneous Breaking of Gauge Symmetries (32, 70)
85 Spontaneously Broken Abelian Gauge Theory (61, 84)
86 Spontaneously Broken Nonabelian Gauge Theory (85)
87 The Standard Model: Gauge and Higgs Sector (84)
88 The Standard Model: Lepton Sector (75, 87)
89 The Standard Model: Quark Sector (88)
90 Electroweak Interactions of Hadrons (83, 89)
91 Neutrino Masses (89)
92 Solitons and Monopoles (84)
93 Instantons and Theta Vacua (92)
94 Quarks and Theta Vacua (77, 83, 93)
95 Supersymmetry (69)
96 The Minimal Supersymmetric Standard Model (89, 95)
97 Grand Unification (89)

Bibliography

Preface for Students

Quantum field theory is the basic mathematical language that is used to describe and analyze the physics of elementary particles. The goal of this book is to provide a concise, step-by-step introduction to this subject, one that covers all the key concepts that are needed to understand the Standard Model of elementary particles, and some of its proposed extensions.

In order to be prepared to undertake the study of quantum field theory, you should recognize and understand the following equations:

$$ \frac{d\sigma}{d\Omega} = |f(\theta,\phi)|^2 $$
$$ a^\dagger |n\rangle = \sqrt{n+1}\, |n{+}1\rangle $$
$$ J_\pm |j,m\rangle = \sqrt{j(j+1) - m(m\pm 1)}\; |j, m{\pm}1\rangle $$
$$ A(t) = e^{+iHt/\hbar} A\, e^{-iHt/\hbar} $$
$$ H = p\dot{q} - L $$
$$ ct' = \gamma(ct - \beta x) $$
$$ E = (\mathbf{p}^2 c^2 + m^2 c^4)^{1/2} $$
$$ \mathbf{E} = -\dot{\mathbf{A}}/c - \nabla\varphi $$

This list is not, of course, complete; but if you are familiar with these equations, you probably know enough about quantum mechanics, classical mechanics, special relativity, and electromagnetism to tackle the material in this book.
Quantum field theory has a reputation as a subject that is hard to learn. The problem, I think, is not so much that its basic ingredients are unusually difficult to master (indeed, the conceptual shift needed to go from quantum mechanics to quantum field theory is not nearly as severe as the one needed to go from classical mechanics to quantum mechanics), but rather that there are a lot of these ingredients. Some are fundamental, but many are just technical aspects of an unfamiliar form of perturbation theory.

In this book, I have tried to make the subject as accessible to beginners as possible. There are three main aspects to my approach.

Logical development of the basic concepts. This is, of course, very different from the historical development of quantum field theory, which, like the historical development of most worthwhile subjects, was filled with inspired guesses and brilliant extrapolations of sometimes fuzzy ideas, as well as its fair share of mistakes, misconceptions, and dead ends. None of that is in this book. From this book, you will (I hope) get the impression that the whole subject is effortlessly clear and obvious, with one step following the next like sunshine after a refreshing rain.

Illustration of the basic concepts with the simplest examples. In most fields of human endeavor, newcomers are not expected to do the most demanding tasks right away. It takes time, dedication, and lots of practice to work up to what the accomplished masters are doing. There is no reason to expect quantum field theory to be any different in this regard. Therefore, we will start off analyzing quantum field theories that are not immediately applicable to the real world of electrons, photons, protons, etc., but that will allow us to gain familiarity with the tools we will need, and to practice using them. Then, when we do work up to “real physics”, we will be fully ready for the task.
To this end, the book is divided into three parts: Spin Zero, Spin One Half, and Spin One. The technical complexities associated with a particular type of particle increase with its spin. We will therefore first learn all we can about spinless particles before moving on to the more difficult (and more interesting) nonzero spins. Once we get to them, we will do a good variety of calculations in (and beyond) the Standard Model of elementary particles.

User friendliness. Each of the three parts is divided into numerous sections. Each section is intended to treat one idea or concept or calculation, and each is written to be as self-contained as possible. For example, when an equation from an earlier section is needed, I usually just repeat it, rather than ask you to leaf back and find it (a reader’s task that I’ve always found annoying). Furthermore, each section is labeled with its immediate prerequisites, so you can tell exactly what you need to have learned in order to proceed. This allows you to construct chains to whatever material may interest you, and to get there as quickly as possible.

That said, I expect that most readers of this book will encounter it as the textbook in a course on quantum field theory. In that case, of course, your reading will be guided by your professor, who I hope will find the above features useful. If, however, you are reading this book on your own, I have two pieces of advice.

The first (and most important) is this: find someone else to read it with you. I promise that it will be far more fun and rewarding that way; talking about a subject to another human being will inevitably improve the depth of your understanding. And you will have someone to work with you on the problems. (As with all physics texts, the problems are a key ingredient. I will not belabor this point, because if you have gotten this far in physics, you already know it well.)

The second piece of advice echoes the novelist and Nobel laureate William Faulkner.
An interviewer asked, “Mr. Faulkner, some of your readers claim they still cannot understand your work after reading it two or three times. What approach would you advise them to adopt?” Faulkner replied, “Read it a fourth time.” That’s my advice here as well. After the fourth attempt, though, you should consider trying something else. This is, after all, not the only book that has ever been written on the subject. You may find that a different approach (or even the same approach explained in different words) breaks the logjam in your thinking. There are a number of excellent books that you could consult, some of which are listed in the Bibliography. I have also listed particular books that I think could be helpful on specific topics in Reference Notes at the end of some of the sections.

This textbook (like all finite textbooks) has a number of deficiencies. One of these is a rather low level of mathematical rigor. This is partly endemic to the subject; rigorous proofs in quantum field theory are relatively rare, and do not appear in the overwhelming majority of research papers. Even some of the most basic notions lack proof; for example, currently you can get a million dollars from the Clay Mathematics Institute simply for proving that nonabelian gauge theory actually exists and has a unique ground state. Given this general situation, and since this is an introductory book, the proofs that we do have are only outlined.

Another deficiency of this book is that there is no discussion of the application of quantum field theory to condensed matter physics, where it also plays an important role. This connection has been important in the historical development of the subject, and is especially useful if you already know a lot of advanced statistical mechanics. I do not want this to be a prerequisite, however, and so I have chosen to keep the focus on applications within elementary particle physics.
Yet another deficiency is that there are no references to the original literature. In this regard, I am following a standard trend: as the foundations of a branch of science retreat into history, textbooks become more and more synthetic and reductionist. For example, it is now rare to see a new textbook on quantum mechanics that refers to the original papers by the famous founders of the subject. For guides to the original literature on quantum field theory, there are a number of other books with extensive references that you can consult; these include Peskin & Schroeder, Weinberg, and Siegel. (Italicized names refer to works listed in the Bibliography.)

Unless otherwise noted, experimental numbers are taken from the Review of Particle Properties, available online at http://pdg.lbl.gov. Experimental numbers quoted in this book have an uncertainty of roughly ±1 in the last significant digit. The Review should be consulted for the most recent experimental results, and for more precise statements of their uncertainty.

To conclude, let me say that you are about to embark on a tour of one of humanity’s greatest intellectual endeavors, and certainly the one that has produced the most precise and accurate description of the natural world as we find it. I hope you enjoy the ride.

Preface for Instructors

On learning that a new text on quantum field theory has appeared, one is surely tempted to respond with Isidor Rabi’s famous comment about the muon: “Who ordered that?” After all, many excellent textbooks on quantum field theory are already available. I, for example, would not want to be without my well-worn copies of Quantum Field Theory by Lowell S. Brown (Cambridge 1994), Aspects of Symmetry by Sidney Coleman (Cambridge 1985), Introduction to Quantum Field Theory by Michael E. Peskin and Daniel V.
Schroeder (Westview 1995), Field Theory: A Modern Primer by Pierre Ramond (Addison-Wesley 1990), Fields by Warren Siegel (arXiv.org 2005), The Quantum Theory of Fields, Volumes I, II, and III, by Steven Weinberg (Cambridge 1995), and Quantum Field Theory in a Nutshell by my colleague Tony Zee (Princeton 2003), to name just a few of the more recent texts. Nevertheless, despite the excellence of these and other books, I have never followed any of them very closely in my twenty years of on-and-off teaching of a year-long course in relativistic quantum field theory.

As discussed in the Preface for Students, this book is based on the notion that quantum field theory is most readily learned by starting with the simplest examples and working through their details in a logical fashion. To this end, I have tried to set things up at the very beginning to anticipate the eventual need for renormalization, and not be cavalier about how the fields are normalized and the parameters defined. I believe that these precautions take a lot of the “hocus pocus” (to quote Feynman) out of the “dippy process” of renormalization. Indeed, with this approach, even the anharmonic oscillator is in need of renormalization; see problem 14.7.

A field theory with many pedagogical virtues is $\varphi^3$ theory in six dimensions, where its coupling constant is dimensionless. Perhaps because six dimensions used to seem too outré (though today’s prospective string theorists don’t even blink), the only introductory textbook I know of that treats this model is Quantum Field Theory by George Sterman (Cambridge 1993), though it is also discussed in some more advanced books, such as Renormalization by John Collins (Cambridge 1984) and Foundations of Quantum Chromodynamics by T. Muta (World Scientific 1998). (There is also a series of lectures by Ed Witten on quantum field theory for mathematicians, available online, that treat $\varphi^3$ theory.)
The reason $\varphi^3$ theory in six dimensions is a nice example is that its Feynman diagrams have a simple structure, but still exhibit the generic phenomena of renormalizable quantum field theory at the one-loop level. (The same cannot be said for $\varphi^4$ theory in four dimensions, where momentum-dependent corrections to the propagator do not appear until the two-loop level.) Thus, in Part I of this text, $\varphi^3$ theory in six dimensions is the primary example. I use it to give introductory treatments of most aspects of relativistic quantum field theory for spin-zero particles, with a minimum of the technical complications that arise in more realistic theories (like QED) with higher-spin particles.

Although I eventually discuss the Wilson approach to renormalization and effective field theory (in section 29), and use effective field theory extensively for the physics of hadrons in Part III, I do not feel it is pedagogically useful to bring it in at the very beginning, as is sometimes advocated. The problem is that the key notion of the decoupling of physical processes at different length scales is an unfamiliar one for most students; there is nothing in typical courses on quantum mechanics or electromagnetism or classical mechanics to prepare students for this idea (which was deemed worthy of a Nobel Prize for Ken Wilson in 1982). It also does not provide for a simple calculational framework, since one must deal with the infinite number of terms in the effective lagrangian, and then explain why most of them don’t matter after all. It’s noteworthy that Wilson himself did not spend a lot of time computing properly normalized perturbative S-matrix elements, a skill that we certainly want our students to have; we want them to have it because a great deal of current research still depends on it. Indeed, the vaunted success of quantum field theory as a description of the real world is based almost entirely on our ability to carry out these perturbative calculations.
Studying renormalization early on has other pedagogical advantages. With the Nobel Prizes to Gerard ’t Hooft and Tini Veltman in 1999 and to David Gross, David Politzer, and Frank Wilczek in 2004, today’s students are well aware of beta functions and running couplings, and would like to understand them. I find that they are generally much more excited about this (even in the context of toy models) than they are about learning to reproduce the nearly century-old tree-level calculations of QED. And $\varphi^3$ theory in six dimensions is asymptotically free, which ultimately provides for a nice segue to the “real physics” of QCD.

In general I have tried to present topics so that the more interesting aspects (from a present-day point of view) come first. An example is anomalies; the traditional approach is to start with the $\pi^0 \to \gamma\gamma$ decay rate, but such a low-energy process seems like a dusty relic to most of today’s students. I therefore begin by demonstrating that anomalies destroy the self-consistency of the great majority of chiral gauge theories, a fact that strikes me (and, in my experience, most students) as much more interesting and dramatic than an incorrect calculation of the $\pi^0$ decay rate. Then, when we do eventually get to this process (in section 90), it appears as a straightforward consequence of what we already learned about anomalies in sections 75–77.

Nevertheless, I want this book to be useful to those who disagree with my pedagogical choices, and so I have tried to structure it to allow for maximum flexibility. Each section treats a particular idea or concept or calculation, and is as self-contained as possible. Each section also lists its immediate prerequisites, so that it is easy to see how to rearrange the material to suit your personal preferences. In some cases, alternative approaches are developed in the problems.
For example, I have chosen to introduce path integrals relatively early (though not before canonical quantization and operator methods are applied to free-field theory), and use them to derive Dyson’s expansion. For those who would prefer to delay the introduction of path integrals (but since you will have to cover them eventually, why not get it over with?), problem 9.5 outlines the operator-based derivation in the interaction picture.

Another point worth noting is that a textbook and lectures are ideally complementary. Many sections of this book contain rather tedious mathematical detail that I would not and do not write on the blackboard during a lecture. (Indeed, the earliest origins of this book are supplementary notes that I typed up and handed out.) For example, much of the development of Weyl spinors in sections 34–37 can be left to outside reading. I do encourage you not to eliminate this material entirely, however; pedagogically, the problem with skipping directly to four-component notation is explaining that (in four dimensions) the hermitian conjugate of a left-handed field is right handed, a deeply important fact that is the key to solving problems such as 36.5 and 83.1, which are in turn vital to understanding the structure of the Standard Model and its extensions. A related topic is computing scattering amplitudes for Majorana fields; this is essential for modern research on massive neutrinos and supersymmetric particles, though it could be left out of a time-limited course.

While I have sometimes included more mathematical detail than is ideal for a lecture, I have also tended to omit explanations based on “physical intuition.” For example, in section 90, we compute the $\pi^- \to \ell^- \bar{\nu}_\ell$ decay amplitude (where $\ell$ is a charged lepton) and find that it is proportional to the lepton mass.
There is a well-known heuristic explanation of this fact that goes something like this: “The pion has spin zero, and so the lepton and the antineutrino must emerge with opposite spin, and therefore the same helicity. An antineutrino is always right-handed, and so the lepton must be as well. But only the left-handed lepton couples to the $W^-$, so the decay amplitude vanishes if the left- and right-handed leptons are not coupled by a mass term.”

This is essentially correct, but the reasoning is a bit more subtle than it first appears. A student may ask, “Why can’t there be orbital angular momentum? Then the lepton and the antineutrino could have the same spin.” The answer is that orbital angular momentum must be perpendicular to the linear momentum, whereas helicity is (by definition) parallel to the linear momentum; so adding orbital angular momentum cannot change the helicity assignments. (This is explored in a simplified model in problem 48.4.) The larger point is that intuitive explanations can almost always be probed more deeply. This is fine in a classroom, where you are available to answer questions, but a textbook author has a hard time knowing where to stop. Too little detail renders the explanations opaque, and too much can be overwhelming; furthermore the happy medium tends to differ from student to student. The calculation, on the other hand, is definitive (at least within the framework being explored, and modulo the possibility of mathematical error). As Roger Penrose once said, “The great thing about physical intuition is that it can be adjusted to fit the facts.” So, in this book, I have tended to emphasize calculational detail at the expense of heuristic reasoning. Lectures should ideally invert this to some extent.

I should also mention that a section of the book is not intended to coincide exactly with a lecture. The material in some sections could easily be covered in less than an hour, and some would clearly take more.
My approach in lecturing is to try to keep to a pace that allows the students to follow the analysis, and then try to come to a more-or-less natural stopping point when class time is up. This sometimes means ending in the middle of a long calculation, but I feel that this is better than trying to artificially speed things along to reach a predetermined destination.

It would take at least three semesters of lectures to cover this entire book, and so a year-long course must omit some. A sequence I might follow is 1–23, 26–28, 33–43, 45–48, 51, 52, 54–59, 62–64, 66–68, 24, 69, 70, 44, 53, 71–73, 75–77, 30, 32, 84, 87–89, 29, 82, 83, 90, and, if any time was left, a selection of whatever seemed of most interest to me and the students of the remaining material.

To conclude, I hope you find this book to be a useful tool in working towards our mutual goal of bringing humanity’s understanding of the physics of elementary particles to a new audience.

Acknowledgments

Every book is a collaborative effort, even if there is only one author on the title page. Any skills I may have as a teacher were first gleaned as a student in the classes of those who taught me. My first and most important teachers were my parents, Casimir and Helen Srednicki, to whom this book is dedicated. In our small town in Ohio, my excellent public-school teachers included Esta Kefauver, Marie Casher, Carol Baird, Jim Chase, Joe Gerin, Hugh Laughlin, and Tom Murphy. In college at Cornell, Don Hartill, Bruce Kusse, Bob Siemann, John Kogut, and Saul Teukolsky taught particularly memorable courses. In graduate school at Stanford, Roberto Peccei gave me my first exposure to quantum field theory, in a superb course that required bicycling in by 8:30 AM (which seemed like a major sacrifice at the time). Everyone in that class very much hoped that Roberto would one day turn his extensive hand-written lecture notes (which he put on reserve in the library) into a book.
He never did, but I’d like to think that perhaps a bit of his consummate skill has found its way into this text. I have also used a couple of his jokes.

My thesis advisor at Stanford, Lenny Susskind, taught me how to think about physics without getting bogged down in the details. This book includes a lot of detail that Lenny would no doubt have left out, but while writing it I have tried to keep his exemplary clarity of thought in mind as something to strive for.

During my time in graduate school, and subsequently in postdoctoral positions at Princeton and CERN, and finally as a faculty member at UC Santa Barbara, I was extremely fortunate to be able to interact with many excellent physicists, from whom I learned an enormous amount. These include Stuart Freedman, Eduardo Fradkin, Steve Shenker, Sidney Coleman, Savas Dimopoulos, Stuart Raby, Michael Dine, Willy Fischler, Curt Callan, David Gross, Malcolm Perry, Sam Treiman, Arthur Wightman, Ed Witten, Hans-Peter Nilles, Daniel Wyler, Dmitri Nanopoulos, John Ellis, Keith Olive, Jose Fulco, Ray Sawyer, John Cardy, Frank Wilczek, Jim Hartle, Gary Horowitz, Andy Strominger, and Tony Zee. I am especially grateful to my Santa Barbara colleagues David Berenstein, Steve Giddings, Don Marolf, Joe Polchinski, and Bob Sugar, who used various drafts of this book while teaching quantum field theory, and made various suggestions for improvement. I am also grateful to physicists at other institutions who read parts of the manuscript and also made suggestions, including Oliver de Wolfe, Marcelo Gleiser, Steve Gottlieb, Arkady Tsetlyn, and Arkady Vainshtein. I must single out for special thanks Professor Heidi Fearn of Cal State Fullerton, whose careful reading of Parts I and II allowed me to correct many unclear passages and outright errors that would otherwise have slipped by.

Students over the years have suffered through my varied attempts to arrive at a pedagogically acceptable scheme for teaching quantum field theory.
I thank all of them for their indulgence. I am especially grateful to Sam Pinansky, Tae Min Hong, and Sho Yaida for their diligence in finding and reporting errors, and to Brian Wignal for help with formatting the manuscript. Also, a number of students from around the world (as well as Santa Barbara) kindly reported errors in versions of this book that were posted online; these include Omri Bahat-Treidel, Hee-Joong Chung, Yevgeny Kats, Sue Ann Koay, Peter Lee, Nikhil Jayant Joshi, Kevin Weil, Dusan Simic, and Miles Stoudenmire. I thank them for their help, and apologize to anyone that I may have missed.

Throughout this project, the assistance and support of my wife Eloïse and daughter Julia were invaluable. Eloïse read through the manuscript and made suggestions that often clarified the language. Julia offered advice on the cover design (a highly stylized Feynman diagram). And they both kindly indulged the amount of time I spent working on this book that you now hold in your hands.

Part I
Spin Zero

1 Attempts at relativistic quantum mechanics

Prerequisite: none

In order to combine quantum mechanics and relativity, we must first understand what we mean by “quantum mechanics” and “relativity”. Let us begin with quantum mechanics.

Somewhere in most textbooks on the subject, one can find a list of the “axioms of quantum mechanics”. These include statements along the lines of

    The state of the system is represented by a vector in Hilbert space.
    Observables are represented by hermitian operators.
    The measurement of an observable yields one of its eigenvalues as the result.

And so on. We do not need to review these closely here. The axiom we need to focus on is the one that says that the time evolution of the state of the system is governed by the Schrödinger equation,
$$ i\hbar \frac{\partial}{\partial t} |\psi, t\rangle = H |\psi, t\rangle , \tag{1.1} $$
where $H$ is the hamiltonian operator, representing the total energy.
Let us consider a very simple system: a spinless, nonrelativistic particle with no forces acting on it. In this case, the hamiltonian is
$$ H = \frac{1}{2m} \mathbf{P}^2 , \tag{1.2} $$
where $m$ is the particle’s mass, and $\mathbf{P}$ is the momentum operator. In the position basis, eq. (1.1) becomes
$$ i\hbar \frac{\partial}{\partial t} \psi(\mathbf{x}, t) = -\frac{\hbar^2}{2m} \nabla^2 \psi(\mathbf{x}, t) , \tag{1.3} $$
where $\psi(\mathbf{x}, t) = \langle \mathbf{x} | \psi, t \rangle$ is the position-space wave function. We would like to generalize this to relativistic motion.

The obvious way to proceed is to take
$$ H = +\sqrt{\mathbf{P}^2 c^2 + m^2 c^4} , \tag{1.4} $$
which yields the correct relativistic energy-momentum relation. If we formally expand this hamiltonian in inverse powers of the speed of light $c$, we get
$$ H = mc^2 + \frac{1}{2m} \mathbf{P}^2 + \ldots . \tag{1.5} $$
This is simply a constant (the rest energy), plus the usual nonrelativistic hamiltonian, eq. (1.2), plus higher-order corrections. With the hamiltonian given by eq. (1.4), the Schrödinger equation becomes
$$ i\hbar \frac{\partial}{\partial t} \psi(\mathbf{x}, t) = +\sqrt{-\hbar^2 c^2 \nabla^2 + m^2 c^4}\; \psi(\mathbf{x}, t) . \tag{1.6} $$
Unfortunately, this equation presents us with a number of difficulties. One is that it apparently treats space and time on a different footing: the time derivative appears only on the left, outside the square root, and the space derivatives appear only on the right, under the square root. This asymmetry between space and time is not what we would expect of a relativistic theory. Furthermore, if we expand the square root in powers of $\nabla^2$, we get an infinite number of spatial derivatives acting on $\psi(\mathbf{x}, t)$; this implies that eq. (1.6) is not local in space.

We can alleviate these problems by squaring the differential operators on each side of eq. (1.6) before applying them to the wave function. Then we get
$$ -\hbar^2 \frac{\partial^2}{\partial t^2} \psi(\mathbf{x}, t) = \left( -\hbar^2 c^2 \nabla^2 + m^2 c^4 \right) \psi(\mathbf{x}, t) . \tag{1.7} $$
This is the Klein-Gordon equation, and it looks a lot nicer than eq. (1.6). It is second-order in both space and time derivatives, and they appear in a symmetric fashion.
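As a quick numerical sanity check of the expansion in eq. (1.5), here is a minimal Python sketch that treats the momentum as an ordinary number rather than an operator. The function names and the choice of units ($m = c = 1$) are illustrative assumptions, not part of the text. The third term, $-p^4/(8m^3c^2)$, is the next term of the standard binomial expansion of the square root; it is the first of the “higher-order corrections” that eq. (1.5) leaves implicit.

```python
import math


def exact_energy(p, m, c):
    """Relativistic energy E = sqrt(p^2 c^2 + m^2 c^4), the c-number analog of eq. (1.4)."""
    return math.sqrt(p**2 * c**2 + m**2 * c**4)


def expanded_energy(p, m, c, order=1):
    """Truncated expansion of eq. (1.4) in inverse powers of c.

    order=0: rest energy m c^2 only
    order=1: adds p^2 / (2m), reproducing the terms shown in eq. (1.5)
    order=2: adds the next binomial term, -p^4 / (8 m^3 c^2)
    """
    terms = [
        m * c**2,
        p**2 / (2 * m),
        -p**4 / (8 * m**3 * c**2),
    ]
    return sum(terms[: order + 1])


if __name__ == "__main__":
    m, c = 1.0, 1.0  # illustrative units (an assumption for this sketch)
    for p in (0.01, 0.1, 0.3):
        exact = exact_energy(p, m, c)
        # Print the truncation error at first and second order.
        print(p, exact - expanded_energy(p, m, c, 1), exact - expanded_energy(p, m, c, 2))
```

For momenta small compared with $mc$, the truncation error should shrink as more terms are kept, which is the sense in which eq. (1.5) is an expansion in inverse powers of $c$.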
To better understand the Klein-Gordon equation, let us consider in more detail what we mean by “relativity”. Special relativity tells us that physics looks the same in all inertial frames. To explain what this means, we first suppose that a certain spacetime coordinate system $(ct, \mathbf{x})$ represents (by fiat) an inertial frame. Let us define $x^0 = ct$, and write $x^\mu$, where $\mu = 0, 1, 2, 3$, in place of $(ct, \mathbf{x})$. It is also convenient (for reasons not at all obvious at this point) to define $x_0 = -x^0$ and $x_i = x^i$, where $i = 1, 2, 3$. This can be expressed more elegantly if we first introduce the Minkowski metric,
$$ g_{\mu\nu} = \begin{pmatrix} -1 & & & \\ & +1 & & \\ & & +1 & \\ & & & +1 \end{pmatrix} , \tag{1.8} $$
where blank entries are zero. We then have $x_\mu = g_{\mu\nu} x^\nu$, where a repeated index is summed.

To invert this formula, we introduce the inverse of $g$, which is confusingly also called $g$, except with both indices up:
$$ g^{\mu\nu} = \begin{pmatrix} -1 & & & \\ & +1 & & \\ & & +1 & \\ & & & +1 \end{pmatrix} . \tag{1.9} $$
We then have $g^{\mu\nu} g_{\nu\rho} = \delta^\mu{}_\rho$, where $\delta^\mu{}_\rho$ is the Kronecker delta (equal to one if its two indices take on the same value, zero otherwise). Now we can also write $x^\mu = g^{\mu\nu} x_\nu$.

It is a general rule that any pair of repeated (and therefore summed) indices must consist of one superscript and one subscript; these indices are said to be contracted. Also, any unrepeated (and therefore unsummed) indices must match (in both name and height) on the left- and right-hand sides of any valid equation.

Now we are ready to specify what we mean by an inertial frame. If the coordinates $x^\mu$ represent an inertial frame (which they do, by assumption), then so do any other coordinates $\bar{x}^\mu$ that are related by
$$ \bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu , \tag{1.10} $$
where $\Lambda^\mu{}_\nu$ is a Lorentz transformation matrix and $a^\mu$ is a translation vector. Both $\Lambda^\mu{}_\nu$ and $a^\mu$ are constant (that is, independent of $x^\mu$). Furthermore, $\Lambda^\mu{}_\nu$ must obey
$$ g_{\mu\nu} \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma = g_{\rho\sigma} . \tag{1.11} $$
Eq.
(1.11) ensures that the interval between two different spacetime points that are labeled by $x^\mu$ and $x'^\mu$ in one inertial frame, and by $\bar{x}^\mu$ and $\bar{x}'^\mu$ in another, is the same. This interval is defined to be

$$(x - x')^2 \equiv g_{\mu\nu}(x - x')^\mu(x - x')^\nu = (\mathbf{x} - \mathbf{x}')^2 - c^2(t - t')^2 . \tag{1.12}$$

In the other frame, we have

$$(\bar{x} - \bar{x}')^2 = g_{\mu\nu}(\bar{x} - \bar{x}')^\mu(\bar{x} - \bar{x}')^\nu = g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma(x - x')^\rho(x - x')^\sigma = g_{\rho\sigma}(x - x')^\rho(x - x')^\sigma = (x - x')^2 , \tag{1.13}$$

as desired. (The translation $a^\mu$ cancels in the difference $\bar{x} - \bar{x}'$.)

When we say that physics looks the same, we mean that two observers (Alice and Bob, say) using two different sets of coordinates (representing two different inertial frames) should agree on the predicted results of all possible experiments. In the case of quantum mechanics, this requires Alice and Bob to agree on the value of the wave function at a particular spacetime point, a point that is called $x$ by Alice and $\bar{x}$ by Bob. Thus if Alice's predicted wave function is $\psi(x)$, and Bob's is $\bar\psi(\bar{x})$, then we should have $\psi(x) = \bar\psi(\bar{x})$. Furthermore, in order to maintain $\psi(x) = \bar\psi(\bar{x})$ throughout spacetime, $\psi(x)$ and $\bar\psi(\bar{x})$ should obey identical equations of motion. Thus a candidate wave equation should take the same form in any inertial frame.

Let us see if this is true of the Klein-Gordon equation. We first introduce some useful notation for spacetime derivatives:

$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left(+\frac{1}{c}\frac{\partial}{\partial t},\ \nabla\right) , \tag{1.14}$$

$$\partial^\mu \equiv \frac{\partial}{\partial x_\mu} = \left(-\frac{1}{c}\frac{\partial}{\partial t},\ \nabla\right) . \tag{1.15}$$

Note that

$$\partial^\mu x^\nu = g^{\mu\nu} , \tag{1.16}$$

so that our matching-index-height rule is satisfied. If $\bar{x}$ and $x$ are related by eq. (1.10), then $\bar\partial$ and $\partial$ are related by

$$\bar\partial^\mu = \Lambda^\mu{}_\nu\partial^\nu . \tag{1.17}$$

To check this, we note that

$$\bar\partial^\rho\bar{x}^\sigma = (\Lambda^\rho{}_\mu\partial^\mu)(\Lambda^\sigma{}_\nu x^\nu + a^\sigma) = \Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu(\partial^\mu x^\nu) = \Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu g^{\mu\nu} = g^{\rho\sigma} , \tag{1.18}$$

as expected. The last equality in eq. (1.18) is another form of eq. (1.11); see section 2.

We can now write eq. (1.7) as

$$-\hbar^2c^2\partial_0^2\psi(x) = (-\hbar^2c^2\nabla^2 + m^2c^4)\psi(x) . \tag{1.19}$$

After rearranging and identifying $\partial^2 \equiv \partial^\mu\partial_\mu = -\partial_0^2 + \nabla^2$, we have

$$(-\partial^2 + m^2c^2/\hbar^2)\psi(x) = 0 . \tag{1.20}$$

This is Alice's form of the equation.
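The defining condition (1.11) and the invariance of the interval, eqs. (1.12) and (1.13), can be spot-checked on a concrete example. The following is a minimal sketch rather than anything taken from the text: the boost along the $x$ axis, the helper names, and the sample points are all assumptions chosen for illustration, with $x^0 = ct$ stored as the first component.

```python
import math

# Minkowski metric g_{mu nu} = diag(-1, +1, +1, +1) of eq. (1.8).
G = [[-1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]


def boost_x(beta):
    """Lambda^mu_nu for a boost along x with speed beta = v/c, |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def obeys_eq_1_11(lam, tol=1e-12):
    """True if g_{mu nu} Lambda^mu_rho Lambda^nu_sigma = g_{rho sigma} within tol."""
    for rho in range(4):
        for sigma in range(4):
            lhs = sum(G[mu][nu] * lam[mu][rho] * lam[nu][sigma]
                      for mu in range(4) for nu in range(4))
            if abs(lhs - G[rho][sigma]) > tol:
                return False
    return True


def transform(lam, a, x):
    """Eq. (1.10): xbar^mu = Lambda^mu_nu x^nu + a^mu."""
    return [sum(lam[mu][nu] * x[nu] for nu in range(4)) + a[mu] for mu in range(4)]


def interval(x, y):
    """Eq. (1.12): g_{mu nu} (x - y)^mu (x - y)^nu."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return sum(G[mu][nu] * d[mu] * d[nu] for mu in range(4) for nu in range(4))


if __name__ == "__main__":
    lam = boost_x(0.6)
    a = [1.0, -2.0, 0.5, 3.0]          # arbitrary translation vector
    x = [0.0, 1.0, 2.0, 3.0]           # arbitrary sample points
    y = [4.0, -1.0, 0.0, 2.5]
    print(obeys_eq_1_11(lam))
    print(interval(x, y), interval(transform(lam, a, x), transform(lam, a, y)))
```

The translation drops out of the difference of the two transformed points, so only the condition on $\Lambda$ matters for the interval.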
Bob would write

$$(-\bar\partial^2 + m^2c^2/\hbar^2)\bar\psi(\bar{x}) = 0 . \tag{1.21}$$

Is Bob's equation equivalent to Alice's equation? To see that it is, we set $\bar\psi(\bar{x}) = \psi(x)$, and note that

$$\bar\partial^2 = g_{\mu\nu}\bar\partial^\mu\bar\partial^\nu = g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma\partial^\rho\partial^\sigma = g_{\rho\sigma}\partial^\rho\partial^\sigma = \partial^2 . \tag{1.22}$$

Thus, eq. (1.21) is indeed equivalent to eq. (1.20). The Klein-Gordon equation is therefore manifestly consistent with relativity: it takes the same form in every inertial frame.

This is the good news. The bad news is that the Klein-Gordon equation violates one of the axioms of quantum mechanics: eq. (1.1), the Schrödinger equation in its abstract form. The abstract Schrödinger equation has the fundamental property of being first order in the time derivative, whereas the Klein-Gordon equation is second order. This may not seem too important, but in fact it has drastic consequences. One of these is that the norm of a state,

$$\langle\psi,t|\psi,t\rangle = \int d^3x\,\langle\psi,t|\mathbf{x}\rangle\langle\mathbf{x}|\psi,t\rangle = \int d^3x\,\psi^*(x)\psi(x) , \tag{1.23}$$

is not in general time independent. Thus probability is not conserved. The Klein-Gordon equation obeys relativity, but not quantum mechanics.

Dirac attempted to solve this problem (for spin-one-half particles) by introducing an extra discrete label on the wave function, to account for spin: $\psi_a(x)$, $a = 1, 2$. He then tried a Schrödinger equation of the form

$$i\hbar\frac{\partial}{\partial t}\psi_a(x) = \left(-i\hbar c(\alpha^j)_{ab}\partial_j + mc^2(\beta)_{ab}\right)\psi_b(x) , \tag{1.24}$$

where all repeated indices are summed, and $\alpha^j$ and $\beta$ are matrices in spin-space. This equation, the Dirac equation, is consistent with the abstract Schrödinger equation. The state $|\psi,a,t\rangle$ carries a spin label $a$, and the hamiltonian is

$$H_{ab} = cP_j(\alpha^j)_{ab} + mc^2(\beta)_{ab} , \tag{1.25}$$

where $P_j$ is a component of the momentum operator.

Since the Dirac equation is linear in both time and space derivatives, it has a chance to be consistent with relativity. Note that squaring the hamiltonian yields

$$(H^2)_{ab} = c^2P_jP_k(\alpha^j\alpha^k)_{ab} + mc^3P_j(\alpha^j\beta + \beta\alpha^j)_{ab} + (mc^2)^2(\beta^2)_{ab} . \tag{1.26}$$

Since $P_jP_k$ is symmetric on exchange of $j$ and $k$, we can replace $\alpha^j\alpha^k$ by its symmetric part, $\frac{1}{2}\{\alpha^j, \alpha^k\}$, where $\{A, B\} = AB + BA$ is the anticommutator. Then, if we choose matrices such that

$$\{\alpha^j, \alpha^k\}_{ab} = 2\delta^{jk}\delta_{ab} , \qquad \{\alpha^j, \beta\}_{ab} = 0 , \qquad (\beta^2)_{ab} = \delta_{ab} , \tag{1.27}$$

we will get

$$(H^2)_{ab} = (\mathbf{P}^2c^2 + m^2c^4)\delta_{ab} . \tag{1.28}$$

Thus, the eigenstates of $H^2$ are momentum eigenstates, with $H^2$ eigenvalue $\mathbf{p}^2c^2 + m^2c^4$. This is, of course, the correct relativistic energy-momentum relation. While it is outside the scope of this section to demonstrate it, it turns out that the Dirac equation is fully consistent with relativity provided the Dirac matrices obey eq. (1.27). So we have apparently succeeded in constructing a quantum mechanical, relativistic theory!

There are, however, some problems. We would like the Dirac matrices to be $2\times2$, in order to account for electron spin. However, they must in fact be larger. To see this, note that the $2\times2$ Pauli matrices obey $\{\sigma^i, \sigma^j\} = 2\delta^{ij}$, and are thus candidates for the Dirac $\alpha^i$ matrices. However, there is no fourth matrix that anticommutes with these three (easily proven by writing down the most general $2\times2$ matrix and working out the three anticommutators explicitly). Also, we can show that the Dirac matrices must be even dimensional; see problem 1.1. Thus their minimum size is $4\times4$, and it remains for us to interpret the two extra possible "spin" states.

However, these extra states cause a more severe problem than a mere overcounting. Acting on a momentum eigenstate, $H$ becomes the matrix $c\,\boldsymbol{\alpha}\cdot\mathbf{p} + mc^2\beta$. In problem 1.1, we find that the trace of this matrix is zero. Thus the four eigenvalues must be $+E(\mathbf{p})$, $+E(\mathbf{p})$, $-E(\mathbf{p})$, $-E(\mathbf{p})$, where $E(\mathbf{p}) = +(\mathbf{p}^2c^2 + m^2c^4)^{1/2}$. The negative eigenvalues are the problem: they indicate that there is no ground state.
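The algebra of eq. (1.27) and its consequences can be verified on an explicit set of $4\times4$ matrices. The text does not commit to any particular matrices, so the sketch below assumes one conventional choice (often called the standard or Dirac representation), with $\beta = \mathrm{diag}(1,1,-1,-1)$ and each $\alpha^j$ built from a Pauli matrix in the off-diagonal blocks; all helper names are invented for this illustration, and $c = 1$ by default.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def off_diagonal_blocks(s):
    """4x4 matrix [[0, s], [s, 0]] built from a 2x2 block s."""
    return [[0, 0, s[0][0], s[0][1]],
            [0, 0, s[1][0], s[1][1]],
            [s[0][0], s[0][1], 0, 0],
            [s[1][0], s[1][1], 0, 0]]


# Assumed representation: one conventional choice satisfying eq. (1.27).
ALPHA = [off_diagonal_blocks(s) for s in PAULI]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def dirac_hamiltonian(p, m, c=1.0):
    """Matrix c alpha.p + m c^2 beta that H becomes on a momentum eigenstate."""
    h = scale(m * c * c, BETA)
    for pj, aj in zip(p, ALPHA):
        h = add(h, scale(c * pj, aj))
    return h


if __name__ == "__main__":
    p, m = (0.3, -0.4, 1.2), 0.5
    h = dirac_hamiltonian(p, m)
    e_squared = sum(pj * pj for pj in p) + m * m
    print(close(matmul(h, h), scale(e_squared, IDENTITY)), abs(trace(h)) < 1e-12)
```

Since $H^2 = E^2$ times the identity, every eigenvalue is $\pm E$, and the vanishing trace forces two of each sign, as stated above.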
In a more elaborate theory that included interactions with photons, there seems to be no reason why a positive energy electron could not emit a photon and drop down into a negative energy state. This downward cascade could continue forever. (The same problem also arises in attempts to interpret the Klein-Gordon equation as a modified form of quantum mechanics.)

Dirac made a wildly brilliant attempt to fix this problem of negative energy states. His solution is based on an empirical fact about electrons: they obey the Pauli exclusion principle. It is impossible to put more than one of them in the same quantum state. What if, Dirac speculated, all the negative energy states were already occupied? In this case, a positive energy electron could not drop into one of these states, by Pauli exclusion.

Many questions immediately arise. Why don't we see the negative electric charge of this Dirac sea of electrons? Dirac's answer: because we're used to it. (More precisely, the physical effects of a uniform charge density depend on the boundary conditions at infinity that we impose on Maxwell's equations, and there is a choice that renders such a uniform charge density invisible.) However, Dirac noted, if one of these negative energy electrons were excited into a positive energy state (by, say, a sufficiently energetic photon), it would leave behind a hole in the sea of negative energy electrons. This hole would appear to have positive charge, and positive energy. Dirac therefore predicted (in 1931) the existence of the positron, a particle with the same mass as the electron, but opposite charge. The positron was found experimentally the following year.

However, we have now jumped from an attempt at a quantum description of a single relativistic particle to a theory that apparently requires an infinite number of particles.
Even if we accept this, we still have not solved the problem of how to describe particles like photons or pions or alpha nuclei that do not obey Pauli exclusion.

At this point, it is worthwhile to stop and reflect on why it has proven to be so hard to find an acceptable relativistic wave equation for a single quantum particle. Perhaps there is something wrong with our basic approach.

And there is. Recall the axiom of quantum mechanics that says that "Observables are represented by hermitian operators." This is not entirely true. There is one observable in quantum mechanics that is not represented by a hermitian operator: time. Time enters into quantum mechanics only when we announce that the "state of the system" depends on an extra parameter $t$. This parameter is not the eigenvalue of any operator. This is in sharp contrast to the particle's position $\mathbf{x}$, which is the eigenvalue of an operator. Thus, space and time are treated very differently, a fact that is obscured by writing the Schrödinger equation in terms of the position-space wave function $\psi(\mathbf{x},t)$. Since space and time are treated asymmetrically, it is not surprising that we are having trouble incorporating a symmetry that mixes them up.

So, what are we to do? In principle, the problem could be an intractable one: it might be impossible to combine quantum mechanics and relativity. In this case, there would have to be some meta-theory, one that reduces in the nonrelativistic limit to quantum mechanics, and in the classical limit to relativistic particle dynamics, but is actually neither.

This, however, turns out not to be the case. We can solve our problem, but we must put space and time on an equal footing at the outset. There are two ways to do this. One is to demote position from its status as an operator, and render it as an extra label, like time. The other is to promote time to an operator.

Let us discuss the second option first.
If time becomes an operator, what do we use as the time parameter in the Schrödinger equation? Happily, in relativistic theories, there is more than one notion of time. We can use the proper time $\tau$ of the particle (the time measured by a clock that moves with it) as the time parameter. The coordinate time $T$ (the time measured by a stationary clock in an inertial frame) is then promoted to an operator. In the Heisenberg picture (where the state of the system is fixed, but the operators are functions of time that obey the classical equations of motion), we would have operators $X^\mu(\tau)$, where $X^0 = T$. Relativistic quantum mechanics can indeed be developed along these lines, but it is surprisingly complicated to do so. (The many times are the problem; any monotonic function of $\tau$ is just as good a candidate as $\tau$ itself for the proper time, and this infinite redundancy of descriptions must be understood and accounted for.)

One of the advantages of considering different formalisms is that they may suggest different directions for generalizations. For example, once we have $X^\mu(\tau)$, why not consider adding some more parameters? Then we would have, for example, $X^\mu(\sigma,\tau)$. Classically, this would give us a continuous family of worldlines, what we might call a worldsheet, and so $X^\mu(\sigma,\tau)$ would describe a propagating string. This is indeed the starting point for string theory.

Thus, promoting time to an operator is a viable option, but is complicated in practice. Let us then turn to the other option, demoting position to a label. The first question is, label on what? The answer is, on operators. Thus, consider assigning an operator to each point $\mathbf{x}$ in space; call these operators $\varphi(\mathbf{x})$. A set of operators like this is called a quantum field. In the Heisenberg picture, the operators are also time dependent:

$$\varphi(\mathbf{x},t) = e^{iHt/\hbar}\varphi(\mathbf{x},0)e^{-iHt/\hbar} . \tag{1.29}$$

Thus, both position and (in the Heisenberg picture) time are now labels on operators; neither is itself the eigenvalue of an operator.

So, now we have two different approaches to relativistic quantum theory, approaches that might, in principle, yield different results. This, however, is not the case: it turns out that any relativistic quantum physics that can be treated in one formalism can also be treated in the other. Which we use is a matter of convenience and taste. And quantum field theory, the formalism in which position and time are both labels on operators, is much more convenient and efficient for most problems.

There is another useful equivalence: ordinary nonrelativistic quantum mechanics, for a fixed number of particles, can be rewritten as a quantum field theory. This is an informative exercise, since the corresponding physics is already familiar. Let us carry it out.

Begin with the position-basis Schrödinger equation for $n$ particles, all with the same mass $m$, moving in an external potential $U(\mathbf{x})$, and interacting with each other via an interparticle potential $V(\mathbf{x}_1 - \mathbf{x}_2)$:

$$i\hbar\frac{\partial}{\partial t}\psi = \left[\sum_{j=1}^{n}\left(-\frac{\hbar^2}{2m}\nabla_j^2 + U(\mathbf{x}_j)\right) + \sum_{j=1}^{n}\sum_{k=1}^{j-1}V(\mathbf{x}_j - \mathbf{x}_k)\right]\psi , \tag{1.30}$$

where $\psi = \psi(\mathbf{x}_1, \ldots, \mathbf{x}_n; t)$ is the position-space wave function. The quantum mechanics of this system can be rewritten in the abstract form of eq. (1.1) by first introducing (in, for now, the Schrödinger picture) a quantum field $a(\mathbf{x})$ and its hermitian conjugate $a^\dagger(\mathbf{x})$. We take these operators to have the commutation relations

$$[a(\mathbf{x}), a(\mathbf{x}')] = 0 , \qquad [a^\dagger(\mathbf{x}), a^\dagger(\mathbf{x}')] = 0 , \qquad [a(\mathbf{x}), a^\dagger(\mathbf{x}')] = \delta^3(\mathbf{x} - \mathbf{x}') , \tag{1.31}$$

where $\delta^3(\mathbf{x})$ is the three-dimensional Dirac delta function. Thus, $a^\dagger(\mathbf{x})$ and $a(\mathbf{x})$ behave like harmonic-oscillator creation and annihilation operators that are labeled by a continuous index.
In terms of them, we introduce the hamiltonian operator of our quantum field theory,

$$H = \int d^3x\; a^\dagger(\mathbf{x})\left(-\frac{\hbar^2}{2m}\nabla^2 + U(\mathbf{x})\right)a(\mathbf{x}) + \frac{1}{2}\int d^3x\,d^3y\; V(\mathbf{x} - \mathbf{y})\,a^\dagger(\mathbf{x})a^\dagger(\mathbf{y})a(\mathbf{y})a(\mathbf{x}) . \tag{1.32}$$

Now consider a time-dependent state of the form

$$|\psi,t\rangle = \int d^3x_1 \ldots d^3x_n\; \psi(\mathbf{x}_1, \ldots, \mathbf{x}_n; t)\,a^\dagger(\mathbf{x}_1) \ldots a^\dagger(\mathbf{x}_n)|0\rangle , \tag{1.33}$$

where $\psi(\mathbf{x}_1, \ldots, \mathbf{x}_n; t)$ is some function of the $n$ particle positions and time, and $|0\rangle$ is the vacuum state, the state that is annihilated by all the $a$'s,

$$a(\mathbf{x})|0\rangle = 0 . \tag{1.34}$$

It is now straightforward (though tedious) to verify that eq. (1.1), the abstract Schrödinger equation, is obeyed if and only if the function $\psi$ satisfies eq. (1.30).

Thus we can interpret the state $|0\rangle$ as a state of "no particles", the state $a^\dagger(\mathbf{x}_1)|0\rangle$ as a state with one particle at position $\mathbf{x}_1$, the state $a^\dagger(\mathbf{x}_1)a^\dagger(\mathbf{x}_2)|0\rangle$ as a state with one particle at position $\mathbf{x}_1$ and another at position $\mathbf{x}_2$, and so on. The operator

$$N = \int d^3x\; a^\dagger(\mathbf{x})a(\mathbf{x}) \tag{1.35}$$

counts the total number of particles. It commutes with the hamiltonian, as is easily checked; thus, if we start with a state of $n$ particles, we remain with a state of $n$ particles at all times.

However, we can imagine generalizations of this version of the theory (generalizations that would not be possible without the field formalism) in which the number of particles is not conserved. For example, we could try adding to $H$ a term like

$$\Delta H \propto \int d^3x\,\left[a^\dagger(\mathbf{x})a^2(\mathbf{x}) + \mathrm{h.c.}\right] . \tag{1.36}$$

This term does not commute with $N$, and so the number of particles would not be conserved with this addition to $H$.

Theories in which the number of particles can change as time evolves are a good thing: they are needed for correct phenomenology. We are already familiar with the notion that atoms can emit and absorb photons, and so we had better have a formalism that can incorporate this phenomenon.
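The statements that $N$ commutes with the interaction term of eq. (1.32) but not with the term in eq. (1.36) can be illustrated in a drastically reduced toy model. The sketch below is an assumption-laden caricature, not the field theory itself: it keeps a single bosonic mode (no integral over $\mathbf{x}$), truncates the Fock space at a hypothetical cutoff `DIM`, and represents $a$ and $a^\dagger$ as real matrices in the number basis.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def commutator(a, b):
    return subtract(matmul(a, b), matmul(b, a))


def max_abs(a):
    return max(abs(x) for row in a for x in row)


def annihilation(dim):
    """Truncated single-mode annihilation operator: a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a


DIM = 8                      # Fock-space cutoff, an assumption of this sketch
A = annihilation(DIM)
AD = transpose(A)            # real matrix, so the hermitian conjugate is the transpose
NUMBER = matmul(AD, A)       # single-mode analogue of N in eq. (1.35)
QUARTIC = matmul(matmul(AD, AD), matmul(A, A))   # analogue of a†a†aa in eq. (1.32)
CUBIC = matmul(AD, matmul(A, A))                 # analogue of a† a^2 in eq. (1.36)
DELTA_H = add(CUBIC, transpose(CUBIC))           # a† a^2 + h.c.

if __name__ == "__main__":
    print("max |[N, a†a†aa]|       =", max_abs(commutator(NUMBER, QUARTIC)))
    print("max |[N, a†a^2 + h.c.]| =", max_abs(commutator(NUMBER, DELTA_H)))
```

The quartic term contains equal numbers of creation and annihilation operators and is diagonal in the number basis, so it commutes with the number operator; the cubic term lowers the particle number by one and therefore does not.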
We are less familiar with emission and absorption (that is to say, creation and annihilation) of electrons, but this process also occurs in nature; it is less common because it must be accompanied by the emission or absorption of a positron, antiparticle to the electron. There are not a lot of positrons around to facilitate electron annihilation, while $e^+e^-$ pair creation requires us to have on hand at least $2mc^2$ of energy available for the rest-mass energy of these two particles. The photon, on the other hand, is its own antiparticle, and has zero rest mass; thus photons are easily and copiously produced and destroyed.

There is another important aspect of the quantum theory specified by eqs. (1.32) and (1.33). Because the creation operators commute with each other, only the completely symmetric part of $\psi$ survives the integration in eq. (1.33). Therefore, without loss of generality, we can restrict our attention to $\psi$'s of this type:

$$\psi(\ldots\mathbf{x}_i\ldots\mathbf{x}_j\ldots; t) = +\psi(\ldots\mathbf{x}_j\ldots\mathbf{x}_i\ldots; t) . \tag{1.37}$$

This means that we have a theory of bosons, particles that (like photons or pions or alpha nuclei) obey Bose-Einstein statistics. If we want Fermi-Dirac statistics instead, we must replace eq. (1.31) with

$$\{a(\mathbf{x}), a(\mathbf{x}')\} = 0 , \qquad \{a^\dagger(\mathbf{x}), a^\dagger(\mathbf{x}')\} = 0 , \qquad \{a(\mathbf{x}), a^\dagger(\mathbf{x}')\} = \delta^3(\mathbf{x} - \mathbf{x}') , \tag{1.38}$$

where again $\{A, B\} = AB + BA$ is the anticommutator. Now only the fully antisymmetric part of $\psi$ survives the integration in eq. (1.33), and so we can restrict our attention to

$$\psi(\ldots\mathbf{x}_i\ldots\mathbf{x}_j\ldots; t) = -\psi(\ldots\mathbf{x}_j\ldots\mathbf{x}_i\ldots; t) . \tag{1.39}$$

Thus we have a theory of fermions. It is straightforward to check that the abstract Schrödinger equation, eq. (1.1), still implies that $\psi$ obeys the differential equation (1.30). (Now, however, the ordering of the $a$ and $a^\dagger$ operators in the last term of eq. (1.32) becomes significant, and must be as written.) Interestingly, there is no simple way to write down a quantum field theory with particles that obey Boltzmann statistics, corresponding to a wave function with no particular symmetry. This is a hint of the spin-statistics theorem, which applies to relativistic quantum field theory. It says that interacting particles with integer spin must be bosons, and interacting particles with half-integer spin must be fermions. In our nonrelativistic example, the interacting particles clearly have spin zero (because their creation operators carry no labels that could be interpreted as corresponding to different spin states), but can be either bosons or fermions, as we have seen.

Now that we have seen how to rewrite the nonrelativistic quantum mechanics of multiple bosons or fermions as a quantum field theory, it is time to try to construct a relativistic version.

Reference Notes

The history of the physics of elementary particles is recounted in Pais. A brief overview can be found in Weinberg I. More details on quantum field theory for nonrelativistic particles can be found in Brown.

Problems

1.1) Show that the Dirac matrices must be even dimensional. Hint: show that the eigenvalues of $\beta$ are all $\pm1$, and that $\mathrm{Tr}\,\beta = 0$. To show that $\mathrm{Tr}\,\beta = 0$, consider, e.g., $\mathrm{Tr}\,\alpha_1^2\beta$. Similarly, show that $\mathrm{Tr}\,\alpha_i = 0$.

1.2) With the hamiltonian of eq. (1.32), show that the state defined in eq. (1.33) obeys the abstract Schrödinger equation, eq. (1.1), if and only if the wave function obeys eq. (1.30). Your demonstration should apply both to the case of bosons, where the particle creation and annihilation operators obey the commutation relations of eq. (1.31), and to fermions, where the particle creation and annihilation operators obey the anticommutation relations of eq. (1.38).

1.3) Show explicitly that $[N, H] = 0$, where $H$ is given by eq. (1.32) and $N$ by eq. (1.35).
2 Lorentz Invariance

Prerequisite: 1

A Lorentz transformation is a linear, homogeneous change of coordinates from $x^\mu$ to $\bar{x}^\mu$,

$$\bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu , \tag{2.1}$$

that preserves the interval $x^2$ between $x^\mu$ and the origin, where

$$x^2 \equiv x^\mu x_\mu = g_{\mu\nu}x^\mu x^\nu = \mathbf{x}^2 - c^2t^2 . \tag{2.2}$$

This means that the matrix $\Lambda^\mu{}_\nu$ must obey

$$g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = g_{\rho\sigma} , \tag{2.3}$$

where

$$g_{\mu\nu} = \begin{pmatrix} -1 & & & \\ & +1 & & \\ & & +1 & \\ & & & +1 \end{pmatrix} \tag{2.4}$$

is the Minkowski metric.

Note that this set of transformations includes ordinary spatial rotations: take $\Lambda^0{}_0 = 1$, $\Lambda^0{}_i = \Lambda^i{}_0 = 0$, and $\Lambda^i{}_j = R_{ij}$, where $R$ is an orthogonal rotation matrix.

The set of all Lorentz transformations forms a group: the product of any two Lorentz transformations is another Lorentz transformation; the product is associative; there is an identity transformation, $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu$; and every Lorentz transformation has an inverse. It is easy to demonstrate these statements explicitly. For example, to find the inverse transformation $(\Lambda^{-1})^\mu{}_\nu$, note that the left-hand side of eq. (2.3) can be written as $\Lambda_{\nu\rho}\Lambda^\nu{}_\sigma$, and that we can raise the $\rho$ index on both sides to get $\Lambda_\nu{}^\rho\Lambda^\nu{}_\sigma = \delta^\rho{}_\sigma$. On the other hand, by definition, $(\Lambda^{-1})^\rho{}_\nu\Lambda^\nu{}_\sigma = \delta^\rho{}_\sigma$. Therefore

$$(\Lambda^{-1})^\rho{}_\nu = \Lambda_\nu{}^\rho . \tag{2.5}$$

Another useful version of eq. (2.3) is

$$g^{\mu\nu}\Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu = g^{\rho\sigma} . \tag{2.6}$$

To get eq. (2.6), start with eq. (2.3), but with the inverse transformations $(\Lambda^{-1})^\mu{}_\rho$ and $(\Lambda^{-1})^\nu{}_\sigma$. Then use eq. (2.5), raise all down indices, and lower all up indices. The result is eq. (2.6).

For an infinitesimal Lorentz transformation, we can write

$$\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \delta\omega^\mu{}_\nu . \tag{2.7}$$

Eq. (2.3) can be used to show that $\delta\omega$ with both indices down (or up) is antisymmetric:

$$\delta\omega_{\rho\sigma} = -\delta\omega_{\sigma\rho} . \tag{2.8}$$

Thus there are six independent infinitesimal Lorentz transformations (in four spacetime dimensions). These can be divided into three rotations ($\delta\omega_{ij} = -\varepsilon_{ijk}\hat{n}_k\delta\theta$ for a rotation by angle $\delta\theta$ about the unit vector $\hat{\mathbf{n}}$) and three boosts ($\delta\omega_{i0} = \hat{n}_i\delta\eta$ for a boost in the direction $\hat{\mathbf{n}}$ by rapidity $\delta\eta$).

Not all Lorentz transformations can be reached by compounding infinitesimal ones.
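Eq. (2.5) says that the inverse of a Lorentz transformation is obtained purely by raising and lowering indices. In matrix language, with $g$ denoting the numerical matrix $\mathrm{diag}(-1,1,1,1)$, this reads $\Lambda^{-1} = g\,\Lambda^{\rm T} g$. The sketch below checks this on a product of a rotation and a boost; the particular transformations, angles, and helper names are assumptions chosen for illustration.

```python
import math

# Numerical entries of both g_{mu nu} and g^{mu nu}.
G = [[-1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def boost_x(eta):
    """Boost along x by rapidity eta."""
    ch, sh = math.cosh(eta), math.sinh(eta)
    return [[ch, sh, 0.0, 0.0],
            [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_z(theta):
    """Rotation about the z axis by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def lorentz_inverse(lam):
    """Eq. (2.5) as a matrix statement: (Lambda^-1)^rho_nu = g_{nu mu} Lambda^mu_sigma g^{sigma rho}."""
    return matmul(G, matmul(transpose(lam), G))


def is_identity(a, tol=1e-12):
    return all(abs(a[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(4) for j in range(4))


if __name__ == "__main__":
    lam = matmul(rotation_z(0.3), boost_x(0.5))
    print(is_identity(matmul(lorentz_inverse(lam), lam)))
```

No numerical matrix inversion is needed: the defining condition (2.3) is exactly the statement that $g\,\Lambda^{\rm T} g\,\Lambda$ is the identity.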
If we take the determinant of eq. (2.5), we get $(\det\Lambda)^{-1} = \det\Lambda$, which implies $\det\Lambda = \pm1$. Transformations with $\det\Lambda = +1$ are proper, and transformations with $\det\Lambda = -1$ are improper. Note that the product of any two proper Lorentz transformations is proper, and that infinitesimal transformations of the form $\Lambda = 1 + \delta\omega$ are proper. Therefore, any transformation that can be reached by compounding infinitesimal ones is proper. The proper transformations form a subgroup of the Lorentz group.

Another subgroup is that of the orthochronous Lorentz transformations: those for which $\Lambda^0{}_0 \geq +1$. Note that eq. (2.3) implies $(\Lambda^0{}_0)^2 - \Lambda^i{}_0\Lambda^i{}_0 = 1$; thus, either $\Lambda^0{}_0 \geq +1$ or $\Lambda^0{}_0 \leq -1$. An infinitesimal transformation is clearly orthochronous, and it is straightforward to show that the product of two orthochronous transformations is also orthochronous.

Thus, the Lorentz transformations that can be reached by compounding infinitesimal ones are both proper and orthochronous, and they form a subgroup. We can introduce two discrete transformations that take us out of this subgroup: parity and time reversal. The parity transformation is

$$\mathcal{P}^\mu{}_\nu = (\mathcal{P}^{-1})^\mu{}_\nu = \begin{pmatrix} +1 & & & \\ & -1 & & \\ & & -1 & \\ & & & -1 \end{pmatrix} . \tag{2.9}$$

It is orthochronous, but improper. The time-reversal transformation is

$$\mathcal{T}^\mu{}_\nu = (\mathcal{T}^{-1})^\mu{}_\nu = \begin{pmatrix} -1 & & & \\ & +1 & & \\ & & +1 & \\ & & & +1 \end{pmatrix} . \tag{2.10}$$

It is nonorthochronous and improper.

Generally, when a theory is said to be Lorentz invariant, this means under the proper orthochronous subgroup only. Parity and time reversal are treated separately. It is possible for a quantum field theory to be invariant under the proper orthochronous subgroup, but not under parity and/or time-reversal.

From here on, in this section, we will treat the proper orthochronous subgroup only. Parity and time reversal will be treated in section 23.

In quantum theory, symmetries are represented by unitary (or antiunitary) operators.
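The classification into proper or improper and orthochronous or nonorthochronous can be made concrete with a few lines of code. This is a minimal sketch under the assumption that the input matrix already obeys eq. (2.3), so that $\det\Lambda = \pm1$ and $|\Lambda^0{}_0| \geq 1$; the function names and the sample boost are invented for illustration.

```python
import math


def determinant(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def classify(lam):
    """Return (proper, orthochronous) for a matrix assumed to obey eq. (2.3)."""
    proper = determinant(lam) > 0
    orthochronous = lam[0][0] > 0   # eq. (2.3) forces Lambda^0_0 >= +1 or <= -1
    return proper, orthochronous


PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]        # eq. (2.9)
TIME_REVERSAL = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # eq. (2.10)


def boost_z(eta):
    """Boost along z by rapidity eta (same form as eq. (2.37))."""
    ch, sh = math.cosh(eta), math.sinh(eta)
    return [[ch, 0.0, 0.0, sh],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [sh, 0.0, 0.0, ch]]


if __name__ == "__main__":
    for name, lam in [("P", PARITY), ("T", TIME_REVERSAL),
                      ("PT", matmul(PARITY, TIME_REVERSAL)), ("boost", boost_z(1.2))]:
        print(name, classify(lam))
```

The combined transformation $\mathcal{P}\mathcal{T}$ is proper but nonorthochronous, so neither property alone singles out the subgroup connected to the identity.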
This means that we associate a unitary operator $U(\Lambda)$ to each proper, orthochronous Lorentz transformation $\Lambda$. These operators must obey the composition rule

$$U(\Lambda'\Lambda) = U(\Lambda')U(\Lambda) . \tag{2.11}$$

For an infinitesimal transformation, we can write

$$U(1{+}\delta\omega) = I + \frac{i}{2\hbar}\,\delta\omega_{\mu\nu}M^{\mu\nu} , \tag{2.12}$$

where $M^{\mu\nu} = -M^{\nu\mu}$ is a set of hermitian operators called the generators of the Lorentz group. If we start with $U(\Lambda)^{-1}U(\Lambda')U(\Lambda) = U(\Lambda^{-1}\Lambda'\Lambda)$, let $\Lambda' = 1 + \delta\omega'$, and expand both sides to linear order in $\delta\omega'$, we get

$$\delta\omega'_{\mu\nu}\,U(\Lambda)^{-1}M^{\mu\nu}U(\Lambda) = \delta\omega'_{\mu\nu}\,\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma M^{\rho\sigma} . \tag{2.13}$$

Then, since $\delta\omega'_{\mu\nu}$ is arbitrary (except for being antisymmetric), the antisymmetric part of its coefficient on each side must be the same. In this case, because $M^{\mu\nu}$ is already antisymmetric (by definition), we have

$$U(\Lambda)^{-1}M^{\mu\nu}U(\Lambda) = \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma M^{\rho\sigma} . \tag{2.14}$$

We see that each vector index on $M^{\mu\nu}$ undergoes its own Lorentz transformation. This is a general result: any operator carrying one or more vector indices should behave similarly. For example, consider the energy-momentum four-vector $P^\mu$, where $P^0$ is the hamiltonian $H$ (divided by $c$, if factors of $c$ are kept) and $P^i$ are the components of the total three-momentum operator. We expect

$$U(\Lambda)^{-1}P^\mu U(\Lambda) = \Lambda^\mu{}_\nu P^\nu . \tag{2.15}$$

If we now let $\Lambda = 1 + \delta\omega$ in eq. (2.14), expand to linear order in $\delta\omega$, and equate the antisymmetric part of the coefficients of $\delta\omega_{\mu\nu}$, we get the commutation relations

$$[M^{\mu\nu}, M^{\rho\sigma}] = i\hbar\Bigl(g^{\mu\rho}M^{\nu\sigma} - (\mu{\leftrightarrow}\nu)\Bigr) - (\rho{\leftrightarrow}\sigma) . \tag{2.16}$$

Written out in full, the right-hand side is $i\hbar\,(g^{\mu\rho}M^{\nu\sigma} - g^{\nu\rho}M^{\mu\sigma} - g^{\mu\sigma}M^{\nu\rho} + g^{\nu\sigma}M^{\mu\rho})$. These commutation relations specify the Lie algebra of the Lorentz group.

We can identify the components of the angular momentum operator $\mathbf{J}$ as $J_i \equiv \frac{1}{2}\varepsilon_{ijk}M^{jk}$, and the components of the boost operator $\mathbf{K}$ as $K_i \equiv M^{i0}$. We then find from eq. (2.16) that

$$[J_i, J_j] = i\hbar\varepsilon_{ijk}J_k , \qquad [J_i, K_j] = i\hbar\varepsilon_{ijk}K_k , \qquad [K_i, K_j] = -i\hbar\varepsilon_{ijk}J_k . \tag{2.17}$$

The first of these is the usual set of commutators for angular momentum, and the second says that $\mathbf{K}$ transforms as a three-vector under rotations. The third implies that a series of boosts can be equivalent to a rotation.
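The commutators in eq. (2.17) can be tested on an explicit finite-dimensional realization of the generators. The sketch below uses the $4\times4$ matrices of the vector representation, which appear later in problem 2.9 as eq. (2.33), in place of the abstract operators $M^{\mu\nu}$; this substitution, the choice $\hbar = 1$, and all function names are assumptions made for illustration.

```python
# Numerical entries of g^{mu nu}.
G = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def generator(mu, nu):
    """(S^{mu nu})^rho_tau = -i (g^{mu rho} delta^nu_tau - g^{nu rho} delta^mu_tau), hbar = 1."""
    return [[-1j * (G[mu][rho] * (nu == tau) - G[nu][rho] * (mu == tau))
             for tau in range(4)] for rho in range(4)]


def levi_civita(i, j, k):
    """epsilon_{ijk} for i, j, k in {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


ZERO = [[0j] * 4 for _ in range(4)]


def J(i):
    """J_i = (1/2) epsilon_{ijk} M^{jk}."""
    total = ZERO
    for j in (1, 2, 3):
        for k in (1, 2, 3):
            total = add(total, scale(0.5 * levi_civita(i, j, k), generator(j, k)))
    return total


def K(i):
    """K_i = M^{i0}."""
    return generator(i, 0)


def epsilon_sum(i, j, op):
    """sum_k epsilon_{ijk} op(k)."""
    total = ZERO
    for k in (1, 2, 3):
        total = add(total, scale(levi_civita(i, j, k), op(k)))
    return total


if __name__ == "__main__":
    ok = all(close(commutator(J(i), J(j)), scale(1j, epsilon_sum(i, j, J)))
             and close(commutator(J(i), K(j)), scale(1j, epsilon_sum(i, j, K)))
             and close(commutator(K(i), K(j)), scale(-1j, epsilon_sum(i, j, J)))
             for i in (1, 2, 3) for j in (1, 2, 3))
    print(ok)
```

The minus sign in the last commutator is the algebraic origin of the statement that two boosts in different directions do not compose to a pure boost.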
Similarly, we can let $\Lambda = 1 + \delta\omega$ in eq. (2.15) to get

$$[P^\mu, M^{\rho\sigma}] = i\hbar\Bigl(g^{\mu\sigma}P^\rho - (\rho{\leftrightarrow}\sigma)\Bigr) , \tag{2.18}$$

which becomes

$$[J_i, H] = 0 , \qquad [J_i, P_j] = i\hbar\varepsilon_{ijk}P_k , \qquad [K_i, H] = i\hbar P_i , \qquad [K_i, P_j] = i\hbar\delta_{ij}H . \tag{2.19}$$

(These are written with $P^0 = H$, that is, for $c = 1$; with factors of $c$ kept, $P^0 = H/c$, and the last two become $[K_i, H] = i\hbar cP_i$ and $[K_i, P_j] = i\hbar\delta_{ij}H/c$.) Also, the components of $P^\mu$ should commute with each other:

$$[P_i, P_j] = 0 , \qquad [P_i, H] = 0 . \tag{2.20}$$

Together, eqs. (2.17), (2.19), and (2.20) form the Lie algebra of the Poincaré group.

Let us now consider what should happen to a quantum scalar field $\varphi(x)$ under a Lorentz transformation. We begin by recalling how time evolution works in the Heisenberg picture:

$$e^{+iHt/\hbar}\varphi(\mathbf{x},0)e^{-iHt/\hbar} = \varphi(\mathbf{x},t) . \tag{2.21}$$

Obviously, this should have a relativistic generalization,

$$e^{-iPx/\hbar}\varphi(0)e^{+iPx/\hbar} = \varphi(x) , \tag{2.22}$$

where $Px = P^\mu x_\mu = \mathbf{P}\cdot\mathbf{x} - Ht$ (taking $P^0 = H/c$ and $x^0 = ct$ when factors of $c$ are kept), so that eq. (2.22) reduces to eq. (2.21) at $\mathbf{x} = 0$. We can make this a little fancier by defining the unitary spacetime translation operator

$$T(a) \equiv \exp(-iP^\mu a_\mu/\hbar) . \tag{2.23}$$

Then we have

$$T(a)^{-1}\varphi(x)T(a) = \varphi(x - a) . \tag{2.24}$$

For an infinitesimal translation,

$$T(\delta a) = I - \frac{i}{\hbar}\,\delta a_\mu P^\mu . \tag{2.25}$$

Comparing eqs. (2.12) and (2.25), we see that eq. (2.24) leads us to expect

$$U(\Lambda)^{-1}\varphi(x)U(\Lambda) = \varphi(\Lambda^{-1}x) . \tag{2.26}$$

Derivatives of $\varphi$ then carry vector indices that transform in the appropriate way, e.g.,

$$U(\Lambda)^{-1}\partial^\mu\varphi(x)U(\Lambda) = \Lambda^\mu{}_\rho\bar\partial^\rho\varphi(\Lambda^{-1}x) , \tag{2.27}$$

where the bar on a derivative means that it is with respect to the argument $\bar{x} = \Lambda^{-1}x$. Eq. (2.27) also implies

$$U(\Lambda)^{-1}\partial^2\varphi(x)U(\Lambda) = \bar\partial^2\varphi(\Lambda^{-1}x) , \tag{2.28}$$

so that the Klein-Gordon equation, $(-\partial^2 + m^2c^2/\hbar^2)\varphi = 0$, is Lorentz invariant, as we saw in section 1.

Reference Notes

A detailed discussion of quantum Lorentz transformations can be found in Weinberg I.

Problems

2.1) Verify that eq. (2.8) follows from eq. (2.3).

2.2) Verify that eq. (2.14) follows from $U(\Lambda)^{-1}U(\Lambda')U(\Lambda) = U(\Lambda^{-1}\Lambda'\Lambda)$.

2.3) Verify that eq. (2.16) follows from eq. (2.14).

2.4) Verify that eq. (2.17) follows from eq. (2.16).

2.5) Verify that eq. (2.18) follows from eq. (2.15).

2.6) Verify that eq. (2.19) follows from eq. (2.18).
2.7) What property should be attributed to the translation operator $T(a)$ that could be used to prove eq. (2.20)?

2.8) a) Let $\Lambda = 1 + \delta\omega$ in eq. (2.26), and show that

$$[\varphi(x), M^{\mu\nu}] = \mathcal{L}^{\mu\nu}\varphi(x) , \tag{2.29}$$

where

$$\mathcal{L}^{\mu\nu} \equiv \frac{\hbar}{i}\,(x^\mu\partial^\nu - x^\nu\partial^\mu) . \tag{2.30}$$

b) Show that $[[\varphi(x), M^{\mu\nu}], M^{\rho\sigma}] = \mathcal{L}^{\mu\nu}\mathcal{L}^{\rho\sigma}\varphi(x)$.

c) Prove the Jacobi identity, $[[A, B], C] + [[B, C], A] + [[C, A], B] = 0$. Hint: write out all the commutators.

d) Use your results from parts (b) and (c) to show that

$$[\varphi(x), [M^{\mu\nu}, M^{\rho\sigma}]] = (\mathcal{L}^{\mu\nu}\mathcal{L}^{\rho\sigma} - \mathcal{L}^{\rho\sigma}\mathcal{L}^{\mu\nu})\varphi(x) . \tag{2.31}$$

e) Simplify the right-hand side of eq. (2.31) as much as possible.

f) Use your results from part (e) to verify eq. (2.16), up to the possibility of a term on the right-hand side that commutes with $\varphi(x)$ and its derivatives. (Such a term, called a central charge, in fact does not arise for the Lorentz algebra.)

2.9) Let us write

$$\Lambda^\rho{}_\tau = \delta^\rho{}_\tau + \frac{i}{2\hbar}\,\delta\omega_{\mu\nu}(S_V^{\mu\nu})^\rho{}_\tau , \tag{2.32}$$

where

$$(S_V^{\mu\nu})^\rho{}_\tau \equiv \frac{\hbar}{i}\,(g^{\mu\rho}\delta^\nu{}_\tau - g^{\nu\rho}\delta^\mu{}_\tau) \tag{2.33}$$

are matrices which constitute the vector representation of the Lorentz generators.

a) Let $\Lambda = 1 + \delta\omega$ in eq. (2.27), and show that

$$[\partial^\rho\varphi(x), M^{\mu\nu}] = \mathcal{L}^{\mu\nu}\partial^\rho\varphi(x) + (S_V^{\mu\nu})^\rho{}_\tau\partial^\tau\varphi(x) . \tag{2.34}$$

b) Show that the matrices $S_V^{\mu\nu}$ must have the same commutation relations as the operators $M^{\mu\nu}$. Hint: see the previous problem.

c) For a rotation by an angle $\theta$ about the $z$ axis, we have

$$\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{2.35}$$

Show that

$$\Lambda = \exp(-i\theta S_V^{12}/\hbar) . \tag{2.36}$$

d) For a boost by rapidity $\eta$ in the $z$ direction, we have

$$\Lambda^\mu{}_\nu = \begin{pmatrix} \cosh\eta & 0 & 0 & \sinh\eta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \sinh\eta & 0 & 0 & \cosh\eta \end{pmatrix} . \tag{2.37}$$

Show that

$$\Lambda = \exp(+i\eta S_V^{30}/\hbar) . \tag{2.38}$$

3 Canonical Quantization of Scalar Fields

Prerequisite: 2

Let us go back and drastically simplify the hamiltonian we constructed in section 1, reducing it to the hamiltonian for free particles:

$$H = \int d^3x\; a^\dagger(\mathbf{x})\left(-\frac{1}{2m}\nabla^2\right)a(\mathbf{x}) = \int d^3p\;\frac{1}{2m}\mathbf{p}^2\,\tilde{a}^\dagger(\mathbf{p})\tilde{a}(\mathbf{p}) , \tag{3.1}$$

where

$$\tilde{a}(\mathbf{p}) = \int\frac{d^3x}{(2\pi)^{3/2}}\;e^{-i\mathbf{p}\cdot\mathbf{x}}\,a(\mathbf{x}) . \tag{3.2}$$

Here we have simplified our notation by setting

$$\hbar = 1 . \tag{3.3}$$

The appropriate factors of $\hbar$ can always be restored in any of our formulas via dimensional analysis. The commutation (or anticommutation) relations of the $\tilde{a}(\mathbf{p})$ and $\tilde{a}^\dagger(\mathbf{p})$ operators are

$$[\tilde{a}(\mathbf{p}), \tilde{a}(\mathbf{p}')]_\mp = 0 , \qquad [\tilde{a}^\dagger(\mathbf{p}), \tilde{a}^\dagger(\mathbf{p}')]_\mp = 0 , \qquad [\tilde{a}(\mathbf{p}), \tilde{a}^\dagger(\mathbf{p}')]_\mp = \delta^3(\mathbf{p} - \mathbf{p}') , \tag{3.4}$$

where $[A, B]_\mp$ is either the commutator (if we want a theory of bosons) or the anticommutator (if we want a theory of fermions). Thus $\tilde{a}^\dagger(\mathbf{p})$ can be interpreted as creating a state of definite momentum $\mathbf{p}$, and eq. (3.1) describes a theory of free particles. The ground state is the vacuum $|0\rangle$; it is annihilated by $\tilde{a}(\mathbf{p})$,

$$\tilde{a}(\mathbf{p})|0\rangle = 0 , \tag{3.5}$$

and so its energy eigenvalue is zero. The other eigenstates of $H$ are all of the form $\tilde{a}^\dagger(\mathbf{p}_1)\ldots\tilde{a}^\dagger(\mathbf{p}_n)|0\rangle$, and the corresponding energy eigenvalue is $E(\mathbf{p}_1) + \ldots + E(\mathbf{p}_n)$, where $E(\mathbf{p}) = \frac{1}{2m}\mathbf{p}^2$.

It is easy to see how to generalize this theory to a relativistic one; all we need to do is use the relativistic energy formula $E(\mathbf{p}) = +(\mathbf{p}^2c^2 + m^2c^4)^{1/2}$:

$$H = \int d^3p\;(\mathbf{p}^2c^2 + m^2c^4)^{1/2}\,\tilde{a}^\dagger(\mathbf{p})\tilde{a}(\mathbf{p}) . \tag{3.6}$$

Now we have a theory of free relativistic spin-zero particles, and they can be either bosons or fermions.

Is this theory really Lorentz invariant? We will answer this question (in the affirmative) in a very roundabout way: by constructing it again, from a rather different point of view, a point of view that emphasizes Lorentz invariance from the beginning.

We will start with the classical physics of a real scalar field $\varphi(x)$. Real means that $\varphi(x)$ assigns a real number to every point in spacetime. Scalar means that Alice [who uses coordinates $x^\mu$ and calls the field $\varphi(x)$] and Bob [who uses coordinates $\bar{x}^\mu$, related to Alice's coordinates by $\bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu$, and calls the field $\bar\varphi(\bar{x})$], agree on the numerical value of the field: $\varphi(x) = \bar\varphi(\bar{x})$. This then implies that the equation of motion for $\varphi(x)$ must be the same as that for $\bar\varphi(\bar{x})$.
We have already met an equation of this type: the Klein-Gordon equation,

$$(-\partial^2 + m^2)\varphi(x) = 0 . \tag{3.7}$$

Here we have simplified our notation by setting

$$c = 1 \tag{3.8}$$

in addition to $\hbar = 1$. As with $\hbar$, factors of $c$ can be restored, if desired, by dimensional analysis.

We will adopt eq. (3.7) as the equation of motion that we would like $\varphi(x)$ to obey. It should be emphasized at this point that we are doing classical physics of a real scalar field. We are not to think of $\varphi(x)$ as a quantum wave function. Thus, there should not be any factors of $\hbar$ in this version of the Klein-Gordon equation. This means that the parameter $m$ must have dimensions of inverse length; $m$ is not (yet) to be thought of as a mass.

The equation of motion can be derived from variation of an action $S = \int dt\,L$, where $L$ is the lagrangian. Since the Klein-Gordon equation is local, we expect that the lagrangian can be written as the space integral of a lagrangian density $\mathcal{L}$: $L = \int d^3x\,\mathcal{L}$. Thus, $S = \int d^4x\,\mathcal{L}$. The integration measure $d^4x$ is Lorentz invariant: if we change to coordinates $\bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu$, we have $d^4\bar{x} = |\det\Lambda|\,d^4x = d^4x$. Thus, for the action to be Lorentz invariant, the lagrangian density must be a Lorentz scalar: $\mathcal{L}(x) = \bar{\mathcal{L}}(\bar{x})$. Then we have $\bar{S} = \int d^4\bar{x}\,\bar{\mathcal{L}}(\bar{x}) = \int d^4x\,\mathcal{L}(x) = S$. Any simple function of $\varphi$ is a Lorentz scalar, and so are products of derivatives with all indices contracted, such as $\partial^\mu\varphi\,\partial_\mu\varphi$. We will take for $\mathcal{L}$

$$\mathcal{L} = -\tfrac{1}{2}\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}m^2\varphi^2 + \Omega_0 , \tag{3.9}$$

where $\Omega_0$ is an arbitrary constant. We find the equation of motion (also known as the Euler-Lagrange equation) by making an infinitesimal variation $\delta\varphi(x)$ in $\varphi(x)$, and requiring the corresponding variation of the action to vanish:

$$0 = \delta S = \int d^4x\left[-\tfrac{1}{2}\partial^\mu\delta\varphi\,\partial_\mu\varphi - \tfrac{1}{2}\partial^\mu\varphi\,\partial_\mu\delta\varphi - m^2\varphi\,\delta\varphi\right] = \int d^4x\left[+\partial^\mu\partial_\mu\varphi - m^2\varphi\right]\delta\varphi . \tag{3.10}$$

In the last step, we have integrated by parts in each of the first two terms, putting both derivatives on $\varphi$.
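We assume $\delta\varphi(x)$ vanishes at infinity in any direction (spatial or temporal), so that there is no surface term. Since $\delta\varphi$ has an arbitrary $x$ dependence, eq. (3.10) can be true if and only if $(-\partial^2 + m^2)\varphi = 0$.

One solution of the Klein-Gordon equation is a plane wave of the form $\exp(i\mathbf{k}\cdot\mathbf{x} \pm i\omega t)$, where $\mathbf{k}$ is an arbitrary real wave vector, and $\omega = +(\mathbf{k}^2 + m^2)^{1/2}$ . (3.11) The general solution (assuming boundary conditions that require $\varphi$ to remain finite at spatial infinity) is then

$$\varphi(\mathbf{x}, t) = \int \frac{d^3k}{f(k)} \left[ a(\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{x} - i\omega t} + b(\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{x} + i\omega t} \right] ,$$ (3.12)

where $a(\mathbf{k})$ and $b(\mathbf{k})$ are arbitrary functions of the wave vector $\mathbf{k}$, and $f(k)$ is a redundant function of the magnitude of $\mathbf{k}$ which we have inserted for later convenience. Note that, if we were attempting to interpret $\varphi(x)$ as a quantum wave function (which we most definitely are not), then the second term would constitute the "negative energy" contributions to the wave function. This is because a plane-wave solution of the nonrelativistic Schrödinger equation for a single particle looks like $\exp(i\mathbf{p}\cdot\mathbf{x} - iE(\mathbf{p})t)$, with $E(\mathbf{p}) = \frac{1}{2m}\mathbf{p}^2$; there is a minus sign in front of the positive energy. We are trying to interpret eq. (3.12) as a real classical field, but this formula does not generically result in $\varphi$ being real. We must impose $\varphi^*(x) = \varphi(x)$, where

$$\varphi^*(\mathbf{x}, t) = \int \frac{d^3k}{f(k)} \left[ a^*(\mathbf{k})e^{-i\mathbf{k}\cdot\mathbf{x} + i\omega t} + b^*(\mathbf{k})e^{-i\mathbf{k}\cdot\mathbf{x} - i\omega t} \right] = \int \frac{d^3k}{f(k)} \left[ a^*(\mathbf{k})e^{-i\mathbf{k}\cdot\mathbf{x} + i\omega t} + b^*(-\mathbf{k})e^{+i\mathbf{k}\cdot\mathbf{x} - i\omega t} \right] .$$ (3.13)

In the second term on the second line, we have changed the dummy integration variable from $\mathbf{k}$ to $-\mathbf{k}$. Comparing eqs. (3.12) and (3.13), we see that $\varphi^*(x) = \varphi(x)$ requires $b^*(-\mathbf{k}) = a(\mathbf{k})$. Imposing this condition, we can rewrite $\varphi$ as

$$\varphi(\mathbf{x}, t) = \int \frac{d^3k}{f(k)} \left[ a(\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{x} - i\omega t} + a^*(-\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{x} + i\omega t} \right] = \int \frac{d^3k}{f(k)} \left[ a(\mathbf{k})e^{i\mathbf{k}\cdot\mathbf{x} - i\omega t} + a^*(\mathbf{k})e^{-i\mathbf{k}\cdot\mathbf{x} + i\omega t} \right] = \int \frac{d^3k}{f(k)} \left[ a(\mathbf{k})e^{ikx} + a^*(\mathbf{k})e^{-ikx} \right] ,$$ (3.14)

where $kx = \mathbf{k}\cdot\mathbf{x} - \omega t$ is the Lorentz-invariant product of the four-vectors $x^\mu = (t, \mathbf{x})$ and $k^\mu = (\omega, \mathbf{k})$: $kx = k^\mu x_\mu = g_{\mu\nu}k^\mu x^\nu$. Note that $k^2 = k^\mu k_\mu = \mathbf{k}^2 - \omega^2 = -m^2$ .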
(3.15) A four-momentum $k^\mu$ that obeys $k^2 = -m^2$ is said to be on the mass shell, or on shell for short.

It is now convenient to choose $f(k)$ so that $d^3k/f(k)$ is Lorentz invariant. An integration measure that is manifestly invariant under orthochronous Lorentz transformations is $d^4k\, \delta(k^2 + m^2)\, \theta(k^0)$, where $\delta(x)$ is the Dirac delta function, $\theta(x)$ is the unit step function, and $k^0$ is treated as an independent integration variable. We then have $\int_{-\infty}^{+\infty} dk^0\, \delta(k^2 + m^2)\, \theta(k^0) = \frac{1}{2\omega}$ . (3.16) Here we have used the rule $\int_{-\infty}^{+\infty} dx\, \delta(g(x)) = \sum_i \frac{1}{|g'(x_i)|}$ , (3.17) where $g(x)$ is any smooth function of $x$ with simple zeros at $x = x_i$; in our case, the only zero is at $k^0 = \omega$. Thus we see that if we take $f(k) \propto \omega$, then $d^3k/f(k)$ will be Lorentz invariant. We will take $f(k) = (2\pi)^3 2\omega$. It is then convenient to give the corresponding Lorentz-invariant differential its own name: $\widetilde{dk} \equiv \frac{d^3k}{(2\pi)^3 2\omega}$ . (3.18) Thus we finally have $\varphi(x) = \int \widetilde{dk} \left[ a(\mathbf{k})e^{ikx} + a^*(\mathbf{k})e^{-ikx} \right]$ . (3.19)

We can also invert this formula to get $a(\mathbf{k})$ in terms of $\varphi(x)$. We have

$$\int d^3x\, e^{-ikx}\varphi(x) = \frac{1}{2\omega}a(\mathbf{k}) + \frac{1}{2\omega}e^{2i\omega t}a^*(-\mathbf{k}) \ , \qquad \int d^3x\, e^{-ikx}\partial_0\varphi(x) = -\frac{i}{2}a(\mathbf{k}) + \frac{i}{2}e^{2i\omega t}a^*(-\mathbf{k}) \ .$$ (3.20)

We can combine these to get

$$a(\mathbf{k}) = \int d^3x\, e^{-ikx}\left[ i\partial_0\varphi(x) + \omega\varphi(x) \right] = i\int d^3x\, e^{-ikx}\overleftrightarrow{\partial_0}\varphi(x) \ ,$$ (3.21)

where $f\overleftrightarrow{\partial_\mu}g \equiv f(\partial_\mu g) - (\partial_\mu f)g$, and $\partial_0\varphi = \partial\varphi/\partial t = \dot\varphi$. Note that $a(\mathbf{k})$ is time independent.

Now that we have the lagrangian, we can construct the hamiltonian by the usual rules. Recall that, given a lagrangian $L(q_i, \dot q_i)$ as a function of some coordinates $q_i$ and their time derivatives $\dot q_i$, the conjugate momenta are given by $p_i = \partial L/\partial\dot q_i$, and the hamiltonian by $H = \sum_i p_i\dot q_i - L$. In our case, the role of $q_i(t)$ is played by $\varphi(\mathbf{x}, t)$, with $\mathbf{x}$ playing the role of a (continuous) index. The appropriate generalizations are then $\Pi(x) = \frac{\partial\mathcal{L}}{\partial\dot\varphi(x)}$ (3.22) and $\mathcal{H} = \Pi\dot\varphi - \mathcal{L}$ , (3.23) where $\mathcal{H}$ is the hamiltonian density, and the hamiltonian itself is $H = \int d^3x\, \mathcal{H}$.
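As a numerical aside on eqs. (3.16) and (3.17), the sketch below replaces the Dirac delta function by a narrow gaussian and integrates over $k^0 > 0$; the result should approach $1/2\omega$. The gaussian smearing, the width `eps`, the grid size, and the function names are all assumptions made for this illustration, not part of the text.

```python
import math

def smeared_delta(u, eps):
    """Gaussian of width eps; tends to the Dirac delta function as eps -> 0."""
    return math.exp(-u * u / (2.0 * eps * eps)) / (eps * math.sqrt(2.0 * math.pi))

def on_shell_integral(kvec_sq, m, eps=1e-3, n=200001):
    """Approximate the k^0 integral of delta(k^2 + m^2) theta(k^0).

    Here k^2 + m^2 = kvec_sq + m^2 - (k^0)^2 (mostly-plus metric), and the
    step function restricts the integral to k^0 > 0.
    """
    omega = math.sqrt(kvec_sq + m * m)
    lo, hi = 0.0, 2.0 * omega          # the only zero, k^0 = omega, lies inside
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):                 # trapezoid rule
        k0 = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * smeared_delta(kvec_sq + m * m - k0 * k0, eps)
    return total * h

if __name__ == "__main__":
    kvec_sq, m = 1.0, 1.0
    omega = math.sqrt(kvec_sq + m * m)
    print("numerical :", on_shell_integral(kvec_sq, m))
    print("1/(2 omega):", 1.0 / (2.0 * omega))
```

The factor $1/|g'(x_i)|$ in eq. (3.17) shows up here as the width of the smeared peak in $k^0$, which is `eps` divided by $|g'(\omega)| = 2\omega$.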
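In our case, we have $\Pi(x) = \dot\varphi(x)$ (3.24) and $\mathcal{H} = \frac{1}{2}\Pi^2 + \frac{1}{2}(\nabla\varphi)^2 + \frac{1}{2}m^2\varphi^2 - \Omega_0$ . (3.25) Using eq. (3.19), we can write $H$ in terms of the $a(\mathbf{k})$ and $a^*(\mathbf{k})$ coefficients:

$$\begin{aligned}
H = {} & -\Omega_0 V + \tfrac{1}{2}\int \widetilde{dk}\,\widetilde{dk}'\, d^3x \Big[ \left( -i\omega\, a(\mathbf{k})e^{ikx} + i\omega\, a^*(\mathbf{k})e^{-ikx} \right)\left( -i\omega'\, a(\mathbf{k}')e^{ik'x} + i\omega'\, a^*(\mathbf{k}')e^{-ik'x} \right) \\
& \qquad + \left( +i\mathbf{k}\, a(\mathbf{k})e^{ikx} - i\mathbf{k}\, a^*(\mathbf{k})e^{-ikx} \right)\cdot\left( +i\mathbf{k}'\, a(\mathbf{k}')e^{ik'x} - i\mathbf{k}'\, a^*(\mathbf{k}')e^{-ik'x} \right) \\
& \qquad + m^2\left( a(\mathbf{k})e^{ikx} + a^*(\mathbf{k})e^{-ikx} \right)\left( a(\mathbf{k}')e^{ik'x} + a^*(\mathbf{k}')e^{-ik'x} \right) \Big] \\
= {} & -\Omega_0 V + \tfrac{1}{2}(2\pi)^3\int \widetilde{dk}\,\widetilde{dk}' \Big[ \delta^3(\mathbf{k} - \mathbf{k}')(+\omega\omega' + \mathbf{k}\cdot\mathbf{k}' + m^2) \left( a^*(\mathbf{k})a(\mathbf{k}')e^{-i(\omega - \omega')t} + a(\mathbf{k})a^*(\mathbf{k}')e^{+i(\omega - \omega')t} \right) \\
& \qquad + \delta^3(\mathbf{k} + \mathbf{k}')(-\omega\omega' - \mathbf{k}\cdot\mathbf{k}' + m^2) \left( a(\mathbf{k})a(\mathbf{k}')e^{-i(\omega + \omega')t} + a^*(\mathbf{k})a^*(\mathbf{k}')e^{+i(\omega + \omega')t} \right) \Big] \\
= {} & -\Omega_0 V + \tfrac{1}{2}\int \widetilde{dk}\, \frac{1}{2\omega} \Big[ (+\omega^2 + \mathbf{k}^2 + m^2)\left( a^*(\mathbf{k})a(\mathbf{k}) + a(\mathbf{k})a^*(\mathbf{k}) \right) \\
& \qquad + (-\omega^2 + \mathbf{k}^2 + m^2)\left( a(\mathbf{k})a(-\mathbf{k})e^{-2i\omega t} + a^*(\mathbf{k})a^*(-\mathbf{k})e^{+2i\omega t} \right) \Big] \\
= {} & -\Omega_0 V + \tfrac{1}{2}\int \widetilde{dk}\, \omega \left( a^*(\mathbf{k})a(\mathbf{k}) + a(\mathbf{k})a^*(\mathbf{k}) \right) ,
\end{aligned}$$ (3.26)

where $V$ is the volume of space. To get the second equality, we used $\int d^3x\, e^{i\mathbf{q}\cdot\mathbf{x}} = (2\pi)^3\delta^3(\mathbf{q})$ . (3.27) To get the third equality, we integrated over $\mathbf{k}'$, using $\widetilde{dk}' = d^3k'/(2\pi)^3 2\omega'$. The last equality then follows from $\omega = (\mathbf{k}^2 + m^2)^{1/2}$. Also, we were careful to keep the ordering of $a(\mathbf{k})$ and $a^*(\mathbf{k})$ unchanged throughout, in anticipation of passing to the quantum theory where these classical functions will become operators that may not commute.

Let us take up the quantum theory now. We can go from classical to quantum mechanics via canonical quantization. This means that we promote $q_i$ and $p_i$ to operators, with commutation relations $[q_i, q_j] = 0$, $[p_i, p_j] = 0$, and $[q_i, p_j] = i\hbar\delta_{ij}$. In the Heisenberg picture, these operators should be taken at equal times. In our case, where the "index" is continuous (and we have set $\hbar = 1$), we have $[\varphi(\mathbf{x}, t), \varphi(\mathbf{x}', t)] = 0$ , $[\Pi(\mathbf{x}, t), \Pi(\mathbf{x}', t)] = 0$ , $[\varphi(\mathbf{x}, t), \Pi(\mathbf{x}', t)] = i\delta^3(\mathbf{x} - \mathbf{x}')$ . (3.28) From these canonical commutation relations, and from eqs. (3.21) and (3.24), we can deduce $[a(\mathbf{k}), a(\mathbf{k}')] = 0$ , $[a^\dagger(\mathbf{k}), a^\dagger(\mathbf{k}')] = 0$ , $[a(\mathbf{k}), a^\dagger(\mathbf{k}')] = (2\pi)^3 2\omega\, \delta^3(\mathbf{k} - \mathbf{k}')$ .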
(3.29) We are now denoting $a^*(\mathbf{k})$ as $a^\dagger(\mathbf{k})$, since $a^\dagger(\mathbf{k})$ is now the hermitian conjugate (rather than the complex conjugate) of the operator $a(\mathbf{k})$. We can now rewrite the hamiltonian as $H = \int \widetilde{dk}\, \omega\, a^\dagger(\mathbf{k})a(\mathbf{k}) + (\mathcal{E}_0 - \Omega_0)V$ , (3.30) where $\mathcal{E}_0 = \frac{1}{2}(2\pi)^{-3}\int d^3k\, \omega$ (3.31) is the total zero-point energy of all the oscillators per unit volume, and, using eq. (3.27), we have interpreted $(2\pi)^3\delta^3(\mathbf{0})$ as the volume of space $V$. If we integrate in eq. (3.31) over the whole range of $\mathbf{k}$, the value of $\mathcal{E}_0$ is infinite. If we integrate only up to a maximum value of $|\mathbf{k}|$ equal to $\Lambda$, a number known as the ultraviolet cutoff, we find $\mathcal{E}_0 = \frac{\Lambda^4}{16\pi^2}$ , (3.32) where we have assumed $\Lambda \gg m$. This is physically justified if, in the real world, the formalism of quantum field theory breaks down at some large energy scale. For now, we simply note that the value of $\Omega_0$ is arbitrary, and so we are free to choose $\Omega_0 = \mathcal{E}_0$. With this choice, the ground state has energy eigenvalue zero. Now, if we like, we can take the limit $\Lambda \to \infty$, with no further consequences. (We will meet more of these ultraviolet divergences after we introduce interactions.)

The hamiltonian of eq. (3.30) is now the same as that of eq. (3.6), with $a(\mathbf{k}) = [(2\pi)^3 2\omega]^{1/2}\, \tilde a(\mathbf{k})$, where $\tilde a(\mathbf{k})$ is the operator appearing in eq. (3.6). The commutation relations (3.4) and (3.29) are also equivalent, if we choose commutators (rather than anticommutators) in eq. (3.4). Thus, we have re-derived the hamiltonian of free relativistic bosons by quantization of a scalar field whose equation of motion is the Klein-Gordon equation. The parameter $m$ in the lagrangian is now seen to be the mass of the particle in the quantum theory. (More precisely, since $m$ has dimensions of inverse length, the particle mass is $\hbar m/c$.)

What if we want fermions? Then we should use anticommutators in eqs. (3.28) and (3.29). There is a problem, though; eq. (3.26) does not then become eq. (3.30). Instead, we get $H = -\Omega_0 V$, a simple constant. Clearly there is something wrong with using anticommutators.
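As a numerical aside on eqs. (3.31) and (3.32): doing the angular integral turns eq. (3.31) with a cutoff into $\mathcal{E}_0 = \frac{1}{4\pi^2}\int_0^\Lambda dk\, k^2(k^2 + m^2)^{1/2}$, which the sketch below evaluates with Simpson's rule and compares with $\Lambda^4/16\pi^2$. The function names and the sample values of $\Lambda$ and $m$ are choices made for this illustration only.

```python
import math

def zero_point_energy_density(cutoff, m, n=20000):
    """(1/2)(2 pi)^-3 * integral of omega over |k| < cutoff, via Simpson's rule.

    After the angular integration this is (1/(4 pi^2)) * int_0^cutoff k^2 omega dk.
    """
    if n % 2:
        n += 1                      # Simpson's rule needs an even number of intervals
    h = cutoff / n

    def integrand(k):
        return k * k * math.sqrt(k * k + m * m)

    total = integrand(0.0) + integrand(cutoff)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (total * h / 3.0) / (4.0 * math.pi ** 2)

def leading_estimate(cutoff):
    """The large-cutoff form quoted in eq. (3.32)."""
    return cutoff ** 4 / (16.0 * math.pi ** 2)

if __name__ == "__main__":
    m = 1.0
    for cutoff in (10.0, 100.0):
        ratio = zero_point_energy_density(cutoff, m) / leading_estimate(cutoff)
        print("Lambda/m =", cutoff, " ratio to Lambda^4/(16 pi^2) =", ratio)
```

The ratio approaches one as $\Lambda/m$ grows, which is the sense in which eq. (3.32) holds for $\Lambda \gg m$.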
This is another hint of the spin-statistics theorem, which we will take up in section 4.

Next, we would like to add Lorentz-invariant interactions to our theory. With the formalism we have developed, this is easy to do. Any local function of $\varphi(x)$ is a Lorentz scalar, and so if we add a term like $\varphi^3$ or $\varphi^4$ to the lagrangian density $\mathcal{L}$, the resulting action will still be Lorentz invariant. Now, however, we will have interactions among the particles. Our next task is to deduce the consequences of these interactions.

However, we already have enough tools at our disposal to prove the spin-statistics theorem for spin-zero particles, and that is what we turn to next.

Problems

3.1) Derive eq. (3.29) from eqs. (3.21), (3.24), and (3.28).

3.2) Use the commutation relations, eq. (3.29), to show explicitly that a state of the form $|k_1 \ldots k_n\rangle \equiv a^\dagger(\mathbf{k}_1) \ldots a^\dagger(\mathbf{k}_n)|0\rangle$ (3.33) is an eigenstate of the hamiltonian, eq. (3.30), with eigenvalue $\omega_1 + \ldots + \omega_n$. The vacuum $|0\rangle$ is annihilated by $a(\mathbf{k})$, $a(\mathbf{k})|0\rangle = 0$, and we take $\Omega_0 = \mathcal{E}_0$ in eq. (3.30).

3.3) Use $U(\Lambda)^{-1}\varphi(x)U(\Lambda) = \varphi(\Lambda^{-1}x)$ to show that $U(\Lambda)^{-1}a(\mathbf{k})U(\Lambda) = a(\Lambda^{-1}\mathbf{k})$ , $U(\Lambda)^{-1}a^\dagger(\mathbf{k})U(\Lambda) = a^\dagger(\Lambda^{-1}\mathbf{k})$ , (3.34) and hence that $U(\Lambda)|k_1 \ldots k_n\rangle = |\Lambda k_1 \ldots \Lambda k_n\rangle$ , (3.35) where $|k_1 \ldots k_n\rangle = a^\dagger(\mathbf{k}_1) \ldots a^\dagger(\mathbf{k}_n)|0\rangle$ is a state of $n$ particles with momenta $k_1, \ldots, k_n$.

3.4) Recall that $T(a)^{-1}\varphi(x)T(a) = \varphi(x - a)$, where $T(a) \equiv \exp(-iP^\mu a_\mu)$ is the spacetime translation operator, and $P^0$ is identified as the hamiltonian $H$.
a) Let $a^\mu$ be infinitesimal, and derive an expression for $[\varphi(x), P^\mu]$.
b) Show that the time component of your result is equivalent to the Heisenberg equation of motion $i\dot\varphi = [\varphi, H]$.
c) For a free field, use the Heisenberg equation to derive the Klein-Gordon equation.
d) Define a spatial momentum operator $\mathbf{P} \equiv -\int d^3x\, \Pi(x)\nabla\varphi(x)$ . (3.36) Use the canonical commutation relations to show that $\mathbf{P}$ obeys the relation you derived in part (a).
e) Express $\mathbf{P}$ in terms of $a(\mathbf{k})$ and $a^\dagger(\mathbf{k})$.
3.5) Consider a complex (that is, nonhermitian) scalar field $\varphi$ with lagrangian density $\mathcal{L} = -\partial^\mu\varphi^\dagger\partial_\mu\varphi - m^2\varphi^\dagger\varphi + \Omega_0$ . (3.37)
a) Show that $\varphi$ obeys the Klein-Gordon equation.
b) Treat $\varphi$ and $\varphi^\dagger$ as independent fields, and find the conjugate momentum for each. Compute the hamiltonian density in terms of these conjugate momenta and the fields themselves (but not their time derivatives).
c) Write the mode expansion of $\varphi$ as $\varphi(x) = \int \widetilde{dk} \left[ a(\mathbf{k})e^{ikx} + b^\dagger(\mathbf{k})e^{-ikx} \right]$ . (3.38) Express $a(\mathbf{k})$ and $b(\mathbf{k})$ in terms of $\varphi$ and $\varphi^\dagger$ and their time derivatives.
d) Assuming canonical commutation relations for the fields and their conjugate momenta, find the commutation relations obeyed by $a(\mathbf{k})$ and $b(\mathbf{k})$ and their hermitian conjugates.
e) Express the hamiltonian in terms of $a(\mathbf{k})$ and $b(\mathbf{k})$ and their hermitian conjugates. What value must $\Omega_0$ have in order for the ground state to have zero energy?

4 The Spin-Statistics Theorem

Prerequisite: 3

Let us consider a theory of free, spin-zero particles specified by the hamiltonian $H_0 = \int \widetilde{dk}\, \omega\, a^\dagger(\mathbf{k})a(\mathbf{k})$ , (4.1) where $\omega = (\mathbf{k}^2 + m^2)^{1/2}$, and either the commutation or anticommutation relations $[a(\mathbf{k}), a(\mathbf{k}')]_\mp = 0$ , $[a^\dagger(\mathbf{k}), a^\dagger(\mathbf{k}')]_\mp = 0$ , $[a(\mathbf{k}), a^\dagger(\mathbf{k}')]_\mp = (2\pi)^3 2\omega\, \delta^3(\mathbf{k} - \mathbf{k}')$ . (4.2) Of course, if we want a theory of bosons, we should use commutators, and if we want fermions, we should use anticommutators.

Now let us consider adding terms to the hamiltonian that will result in local, Lorentz invariant interactions. In order to do this, it is convenient to define a nonhermitian field, $\varphi^+(\mathbf{x}, 0) \equiv \int \widetilde{dk}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, a(\mathbf{k})$ , (4.3) and its hermitian conjugate $\varphi^-(\mathbf{x}, 0) \equiv \int \widetilde{dk}\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, a^\dagger(\mathbf{k})$ . (4.4) These are then time-evolved with $H_0$: $\varphi^+(\mathbf{x}, t) = e^{iH_0t}\varphi^+(\mathbf{x}, 0)e^{-iH_0t} = \int \widetilde{dk}\, e^{ikx}\, a(\mathbf{k})$ , $\varphi^-(\mathbf{x}, t) = e^{iH_0t}\varphi^-(\mathbf{x}, 0)e^{-iH_0t} = \int \widetilde{dk}\, e^{-ikx}\, a^\dagger(\mathbf{k})$ . (4.5) Note that the usual hermitian free field $\varphi(x)$ is just the sum of these: $\varphi(x) = \varphi^+(x) + \varphi^-(x)$.

For a proper orthochronous Lorentz transformation $\Lambda$, we have $U(\Lambda)^{-1}\varphi(x)U(\Lambda) = \varphi(\Lambda^{-1}x)$ .
(4.6) This implies that the particle creation and annihilation operators transform as $U(\Lambda)^{-1}a(\mathbf{k})U(\Lambda) = a(\Lambda^{-1}\mathbf{k})$ , $U(\Lambda)^{-1}a^\dagger(\mathbf{k})U(\Lambda) = a^\dagger(\Lambda^{-1}\mathbf{k})$ . (4.7) This, in turn, implies that $\varphi^+(x)$ and $\varphi^-(x)$ are Lorentz scalars: $U(\Lambda)^{-1}\varphi^\pm(x)U(\Lambda) = \varphi^\pm(\Lambda^{-1}x)$ . (4.8) We will then have local, Lorentz invariant interactions if we take the interaction lagrangian density $\mathcal{L}_1$ to be a hermitian function of $\varphi^+(x)$ and $\varphi^-(x)$.

To proceed we need to recall some facts about time-dependent perturbation theory in quantum mechanics. The transition amplitude $T_{f\leftarrow i}$ to start with an initial state $|i\rangle$ at time $t = -\infty$ and end with a final state $|f\rangle$ at time $t = +\infty$ is $T_{f\leftarrow i} = \langle f|\, \mathrm{T}\exp\left[ -i\int_{-\infty}^{+\infty} dt\, H_I(t) \right] |i\rangle$ , (4.9) where $H_I(t)$ is the perturbing hamiltonian in the interaction picture, $H_I(t) = \exp(+iH_0t)\, H_1 \exp(-iH_0t)$ , (4.10) $H_1$ is the perturbing hamiltonian in the Schrödinger picture, and T is the time ordering symbol: a product of operators to its right is to be ordered, not as written, but with operators at later times to the left of those at earlier times. We write $H_1 = \int d^3x\, \mathcal{H}_1(\mathbf{x}, 0)$, and specify $\mathcal{H}_1(\mathbf{x}, 0)$ as a hermitian function of $\varphi^+(\mathbf{x}, 0)$ and $\varphi^-(\mathbf{x}, 0)$. Then, using eqs. (4.5) and (4.10), we can see that, in the interaction picture, the perturbing hamiltonian density $\mathcal{H}_I(\mathbf{x}, t)$ is simply given by the same function of $\varphi^+(\mathbf{x}, t)$ and $\varphi^-(\mathbf{x}, t)$.

Now we come to the key point: for the transition amplitude $T_{f\leftarrow i}$ to be Lorentz invariant, the time ordering must be frame independent. The time ordering of two spacetime points $x$ and $x'$ is frame independent if their separation is timelike; this means that the interval between them is negative, $(x - x')^2 < 0$. Two spacetime points whose separation is spacelike, $(x - x')^2 > 0$, can have different temporal ordering in different frames. In order to avoid $T_{f\leftarrow i}$ being different in different frames, we must then require $[\mathcal{H}_I(x), \mathcal{H}_I(x')] = 0$ whenever $(x - x')^2 > 0$ . (4.11) Obviously, $[\varphi^+(x), \varphi^+(x')]_\mp = [\varphi^-(x), \varphi^-(x')]_\mp = 0$.
However,

$$[\varphi^+(x), \varphi^-(x')]_\mp = \int \widetilde{dk}\,\widetilde{dk}'\, e^{i(kx - k'x')}[a(\mathbf{k}), a^\dagger(\mathbf{k}')]_\mp = \int \widetilde{dk}\, e^{ik(x - x')} = \frac{m}{4\pi^2 r}K_1(mr) \equiv C(r) \ .$$ (4.12)

In the next-to-last line, we have taken $(x - x')^2 = r^2 > 0$, and $K_1(z)$ is a modified Bessel function. (This Lorentz-invariant integral is most easily evaluated in the frame where $t' = t$.) The function $C(r)$ is not zero for any $r > 0$. (Not even when $m = 0$; in this case, $C(r) = 1/4\pi^2r^2$.) On the other hand, $\mathcal{H}_I(x)$ must involve both $\varphi^+(x)$ and $\varphi^-(x)$, by hermiticity. Thus, generically, we will not be able to satisfy eq. (4.11).

To resolve this problem, let us try using only particular linear combinations of $\varphi^+(x)$ and $\varphi^-(x)$. Define $\varphi_\lambda(x) \equiv \varphi^+(x) + \lambda\varphi^-(x)$ , $\varphi_\lambda^\dagger(x) \equiv \varphi^-(x) + \lambda^*\varphi^+(x)$ , (4.13) where $\lambda$ is an arbitrary complex number. We then have $[\varphi_\lambda(x), \varphi_\lambda^\dagger(x')]_\mp = [\varphi^+(x), \varphi^-(x')]_\mp + |\lambda|^2[\varphi^-(x), \varphi^+(x')]_\mp = (1 \mp |\lambda|^2)\, C(r)$ (4.14) and $[\varphi_\lambda(x), \varphi_\lambda(x')]_\mp = \lambda[\varphi^+(x), \varphi^-(x')]_\mp + \lambda[\varphi^-(x), \varphi^+(x')]_\mp = \lambda(1 \mp 1)\, C(r)$ . (4.15) Thus, if we want $\varphi_\lambda(x)$ to either commute or anticommute with both $\varphi_\lambda(x')$ and $\varphi_\lambda^\dagger(x')$ at spacelike separations, we must choose $|\lambda| = 1$, and we must choose commutators. Then (and only then), we can build a suitable $\mathcal{H}_I(x)$ by making it a hermitian function of $\varphi_\lambda(x)$.

But this has simply returned us to the theory of a real scalar field, because, for $\lambda = e^{i\alpha}$, $e^{-i\alpha/2}\varphi_\lambda(x)$ is hermitian. In fact, if we make the replacements $a(\mathbf{k}) \to e^{+i\alpha/2}a(\mathbf{k})$ and $a^\dagger(\mathbf{k}) \to e^{-i\alpha/2}a^\dagger(\mathbf{k})$, then the commutation relations of eq. (4.2) are unchanged, and $e^{-i\alpha/2}\varphi_\lambda(x) = \varphi(x) = \varphi^+(x) + \varphi^-(x)$. Thus, our attempt to start with the creation and annihilation operators $a^\dagger(\mathbf{k})$ and $a(\mathbf{k})$ as the fundamental objects has simply led us back to the real, commuting, scalar field $\varphi(x)$ as the fundamental object.

Let us return to thinking of $\varphi(x)$ as fundamental, with a lagrangian density given by some function of the Lorentz scalars $\varphi(x)$ and $\partial^\mu\varphi(x)\partial_\mu\varphi(x)$. Then, quantization will result in $[\varphi(x), \varphi(x')]_\mp = 0$ for $t = t'$.
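As a numerical aside on eq. (4.12), the sketch below evaluates $C(r) = \frac{m}{4\pi^2 r}K_1(mr)$ and compares it with the quoted massless form $1/4\pi^2r^2$ for small $m$. It computes $K_1$ from the standard integral representation $K_1(z) = \int_0^\infty dt\, e^{-z\cosh t}\cosh t$ (valid for $z > 0$), truncated where the integrand is negligible; this representation, the truncation point, and the function names are choices made here and do not come from the text.

```python
import math

def bessel_k1(z, n=20000):
    """Modified Bessel function K_1(z) for z > 0.

    Uses K_1(z) = int_0^inf exp(-z cosh t) cosh t dt with Simpson's rule,
    cutting the integral off where z cosh t = 60 (integrand ~ e^-60).
    n must be even.
    """
    t_max = math.acosh(max(60.0 / z, 2.0))
    h = t_max / n

    def integrand(t):
        c = math.cosh(t)
        return math.exp(-z * c) * c

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def commutator_function(r, m):
    """C(r) of eq. (4.12) for spacelike separation r > 0 and mass m > 0."""
    return m / (4.0 * math.pi ** 2 * r) * bessel_k1(m * r)

def massless_limit(r):
    """The m -> 0 form quoted below eq. (4.12)."""
    return 1.0 / (4.0 * math.pi ** 2 * r * r)

if __name__ == "__main__":
    r = 2.0
    for m in (1.0, 0.1, 0.001):
        print("m =", m, " C(r) =", commutator_function(r, m),
              " massless form =", massless_limit(r))
```

The values are strictly positive for every $r$ tried, in line with the statement that $C(r)$ does not vanish at spacelike separation.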
If we choose anticommutators, then $[\varphi(x)]^2 = 0$ and $[\partial_\mu\varphi(x)]^2 = 0$, resulting in a trivial $\mathcal{L}$ that is at most linear in $\varphi$, and independent of $\dot\varphi$. This clearly does not lead to the correct physics.

This situation turns out to generalize to fields of higher spin, in any number of spacetime dimensions. One choice of quantization (commutators or anticommutators) always leads to a trivial $\mathcal{L}$, and so this choice is disallowed. Furthermore, the allowed choice is always commutators for fields of integer spin, and anticommutators for fields of half-integer spin. If we try treating the particle creation and annihilation operators as fundamental, rather than the fields, we find a situation similar to that of the spin-zero case, and are led to the reconstruction of a field that must obey the appropriate quantization scheme.

Reference Notes

This discussion of the spin-statistics theorem follows that of Weinberg I, which has more details.

Problems

4.1) Verify eq. (4.12). Verify its limit as $m \to 0$.

5 The LSZ Reduction Formula

Prerequisite: 3

Let us now consider how to construct appropriate initial and final states for scattering experiments. In the free theory, we can create a state of one particle by acting on the vacuum state with a creation operator $|k\rangle = a^\dagger(\mathbf{k})|0\rangle$ , (5.1) where $a^\dagger(\mathbf{k}) = -i\int d^3x\, e^{ikx}\overleftrightarrow{\partial_0}\varphi(x)$ . (5.2) The vacuum state $|0\rangle$ is annihilated by every $a(\mathbf{k})$, $a(\mathbf{k})|0\rangle = 0$ , (5.3) and has unit norm, $\langle 0|0\rangle = 1$ . (5.4) The one-particle state $|k\rangle$ then has the Lorentz-invariant normalization $\langle k|k'\rangle = (2\pi)^3\, 2\omega\, \delta^3(\mathbf{k} - \mathbf{k}')$ , (5.5) where $\omega = (\mathbf{k}^2 + m^2)^{1/2}$.

Next, let us define a time-independent operator that (in the free theory) creates a particle localized in momentum space near $\mathbf{k}_1$, and localized in position space near the origin: $a_1^\dagger \equiv \int d^3k\, f_1(\mathbf{k})a^\dagger(\mathbf{k})$ , (5.6) where $f_1(\mathbf{k}) \propto \exp[-(\mathbf{k} - \mathbf{k}_1)^2/4\sigma^2]$ (5.7) is an appropriate wave packet, and $\sigma$ is its width in momentum space.

Consider the state $a_1^\dagger|0\rangle$.
If we time evolve this state in the Schrödinger picture, the wave packet will propagate (and spread out). The particle is thus localized far from the origin as $t \to \pm\infty$. If we consider instead a state of the form $a_1^\dagger a_2^\dagger|0\rangle$, where $\mathbf{k}_1 \neq \mathbf{k}_2$, then the two particles are widely separated in the far past.

Let us guess that this still works in the interacting theory. One complication is that $a^\dagger(\mathbf{k})$ will no longer be time independent, and so $a_1^\dagger$, eq. (5.6), becomes time dependent as well. Our guess for a suitable initial state of a scattering experiment is then $|i\rangle = \lim_{t\to-\infty} a_1^\dagger(t)a_2^\dagger(t)|0\rangle$ . (5.8) By appropriately normalizing the wave packets, we can make $\langle i|i\rangle = 1$, and we will assume that this is the case. Similarly, we can consider a final state $|f\rangle = \lim_{t\to+\infty} a_{1'}^\dagger(t)a_{2'}^\dagger(t)|0\rangle$ , (5.9) where $\mathbf{k}_1' \neq \mathbf{k}_2'$, and $\langle f|f\rangle = 1$. This describes two widely separated particles in the far future. (We could also consider acting with more creation operators, if we are interested in the production of some extra particles in the collision of two.) Now the scattering amplitude is simply given by $\langle f|i\rangle$.

We need to find a more useful expression for $\langle f|i\rangle$. To this end, let us note that

$$\begin{aligned}
a_1^\dagger(+\infty) - a_1^\dagger(-\infty) &= \int_{-\infty}^{+\infty} dt\, \partial_0 a_1^\dagger(t) \\
&= -i\int d^3k\, f_1(\mathbf{k}) \int d^4x\, \partial_0\left( e^{ikx}\overleftrightarrow{\partial_0}\varphi(x) \right) \\
&= -i\int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx}(\partial_0^2 + \omega^2)\varphi(x) \\
&= -i\int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx}(\partial_0^2 + \mathbf{k}^2 + m^2)\varphi(x) \\
&= -i\int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx}(\partial_0^2 - \overleftarrow{\nabla}^2 + m^2)\varphi(x) \\
&= -i\int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx}(\partial_0^2 - \overrightarrow{\nabla}^2 + m^2)\varphi(x) \\
&= -i\int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx}(-\partial^2 + m^2)\varphi(x) \ .
\end{aligned}$$ (5.10)

The first equality is just the fundamental theorem of calculus. To get the second, we substituted the definition of $a_1^\dagger(t)$, and combined the $d^3x$ from this definition with the $dt$ to get $d^4x$. The third comes from straightforward evaluation of the time derivatives. The fourth uses $\omega^2 = \mathbf{k}^2 + m^2$. The fifth writes $\mathbf{k}^2$ as $-\nabla^2$ acting on $e^{i\mathbf{k}\cdot\mathbf{x}}$. The sixth uses integration by parts to move the $\nabla^2$ onto the field $\varphi(x)$; here the wave packet is needed to avoid a surface term.
The seventh simply identifies $\partial_0^2 - \nabla^2$ as $-\partial^2$.

In free-field theory, the right-hand side of eq. (5.10) is zero, since $\varphi(x)$ obeys the Klein-Gordon equation. In an interacting theory, with (say) $\mathcal{L}_1 = \frac{1}{6}g\varphi^3$, we have instead $(-\partial^2 + m^2)\varphi = \frac{1}{2}g\varphi^2$. Thus the right-hand side of eq. (5.10) is not zero in an interacting theory.

Rearranging eq. (5.10), we have $a_1^\dagger(-\infty) = a_1^\dagger(+\infty) + i\int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{ikx}(-\partial^2 + m^2)\varphi(x)$ . (5.11) We will also need the hermitian conjugate of this formula, which (after a little more rearranging) reads $a_1(+\infty) = a_1(-\infty) + i\int d^3k\, f_1(\mathbf{k}) \int d^4x\, e^{-ikx}(-\partial^2 + m^2)\varphi(x)$ . (5.12)

Let us return to the scattering amplitude, $\langle f|i\rangle = \langle 0|a_{1'}(+\infty)a_{2'}(+\infty)a_1^\dagger(-\infty)a_2^\dagger(-\infty)|0\rangle$ . (5.13) Note that the operators are in time order. Thus, if we feel like it, we can put in a time-ordering symbol without changing anything: $\langle f|i\rangle = \langle 0|\mathrm{T}a_{1'}(+\infty)a_{2'}(+\infty)a_1^\dagger(-\infty)a_2^\dagger(-\infty)|0\rangle$ . (5.14) The symbol T means the product of operators to its right is to be ordered, not as written, but with operators at later times to the left of those at earlier times.

Now let us use eqs. (5.11) and (5.12) in eq. (5.14). The time-ordering symbol automatically moves all $a_{i'}(-\infty)$'s to the right, where they annihilate $|0\rangle$. Similarly, all $a_i^\dagger(+\infty)$'s move to the left, where they annihilate $\langle 0|$.

The wave packets no longer play a key role, and we can take the $\sigma \to 0$ limit in eq. (5.7), so that $f_1(\mathbf{k}) = \delta^3(\mathbf{k} - \mathbf{k}_1)$. The initial and final states now have a delta-function normalization, the multiparticle generalization of eq. (5.5). We are left with

$$\langle f|i\rangle = i^{n+n'}\int d^4x_1\, e^{ik_1x_1}(-\partial_1^2 + m^2) \ldots\ d^4x_1'\, e^{-ik_1'x_1'}(-\partial_{1'}^2 + m^2) \ldots \times \langle 0|\mathrm{T}\varphi(x_1) \ldots \varphi(x_1') \ldots|0\rangle \ .$$ (5.15)

This formula has been written to apply to the more general case of $n$ incoming particles and $n'$ outgoing particles; the ellipses stand for similar factors for each of the other incoming and outgoing particles.

Eq. (5.15) is the Lehmann-Symanzik-Zimmermann reduction formula, or LSZ formula for short.
It is one of the key equations of quantum field theory.

However, our derivation of the LSZ formula relied on the supposition that the creation operators of free field theory would work comparably in the interacting theory. This is a rather suspect assumption, and so we must review it.

Let us consider what we can deduce about the energy and momentum eigenstates of the interacting theory on physical grounds. First, we assume that there is a unique ground state $|0\rangle$, with zero energy and momentum. The first excited state is a state of a single particle with mass $m$. This state can have an arbitrary three-momentum $\mathbf{k}$; its energy is then $E = \omega = (\mathbf{k}^2 + m^2)^{1/2}$. The next excited state is that of two particles. These two particles could form a bound state with energy less than $2m$ (like the hydrogen atom in quantum electrodynamics), but, to keep things simple, let us assume that there are no such bound states. Then the lowest possible energy of a two-particle state is $2m$. However, a two-particle state with zero total three-momentum can have any energy above $2m$, because the two particles could have some relative momentum that contributes to their total energy. Thus we are led to a picture of the states of the theory as shown in fig. (5.1).

Figure 5.1: The exact energy eigenstates in the $(\mathbf{P}, E)$ plane. The ground state is isolated at $(\mathbf{0}, 0)$, the one-particle states form an isolated hyperbola that passes through $(\mathbf{0}, m)$, and the multi-particle continuum lies at and above the hyperbola that passes through $(\mathbf{0}, 2m)$.

Now let us consider what happens when we act on the ground state with the field operator $\varphi(x)$. To this end, it is helpful to write $\varphi(x) = \exp(-iP^\mu x_\mu)\,\varphi(0)\exp(+iP^\mu x_\mu)$ , (5.16) where $P^\mu$ is the energy-momentum four-vector. (This equation, introduced in section 2, is just the relativistic generalization of the Heisenberg equation.) Now let us sandwich $\varphi(x)$ between the ground state (on the right), and other possible states (on the left).
For example, let us put the ground state on the left as well. Then we have $\langle 0|\varphi(x)|0\rangle = \langle 0|e^{-iPx}\varphi(0)e^{+iPx}|0\rangle = \langle 0|\varphi(0)|0\rangle$ . (5.17) To get the second line, we used $P^\mu|0\rangle = 0$. The final expression is just a Lorentz-invariant number. Since $|0\rangle$ is the exact ground state of the interacting theory, we have (in general) no idea what this number is.

We would like $\langle 0|\varphi(0)|0\rangle$ to be zero. This is because we would like $a_1^\dagger(\pm\infty)$, when acting on $|0\rangle$, to create a single particle state. We do not want $a_1^\dagger(\pm\infty)$ to create a linear combination of a single particle state and the ground state. But this is precisely what will happen if $\langle 0|\varphi(0)|0\rangle$ is not zero.

So, if $v \equiv \langle 0|\varphi(0)|0\rangle$ is not zero, we will shift the field $\varphi(x)$ by the constant $v$. This means that we go back to the lagrangian, and replace $\varphi(x)$ everywhere by $\varphi(x) + v$. This is just a change of the name of the operator of interest, and does not affect the physics. However, the shifted $\varphi(x)$ obeys, by construction, $\langle 0|\varphi(x)|0\rangle = 0$.

Let us now consider $\langle p|\varphi(x)|0\rangle$, where $|p\rangle$ is a one-particle state with four-momentum $p$, normalized according to eq. (5.5). Again using eq. (5.16), we have $\langle p|\varphi(x)|0\rangle = \langle p|e^{-iPx}\varphi(0)e^{+iPx}|0\rangle = e^{-ipx}\langle p|\varphi(0)|0\rangle$ , (5.18) where $\langle p|\varphi(0)|0\rangle$ is a Lorentz-invariant number. It is a function of $p$, but the only Lorentz-invariant functions of $p$ are functions of $p^2$, and $p^2$ is just the constant $-m^2$. So $\langle p|\varphi(0)|0\rangle$ is just some number that depends on $m$ and (presumably) the other parameters in the lagrangian.

We would like $\langle p|\varphi(0)|0\rangle$ to be one. That is what it is in free-field theory, and we know that, in free-field theory, $a_1^\dagger(\pm\infty)$ creates a correctly normalized one-particle state. Thus, for $a_1^\dagger(\pm\infty)$ to create a correctly normalized one-particle state in the interacting theory, we must have $\langle p|\varphi(0)|0\rangle = 1$.

So, if $\langle p|\varphi(0)|0\rangle$ is not equal to one, we will rescale (or, one might say, renormalize) $\varphi(x)$ by a multiplicative constant. This is just a change of the name of the operator of interest, and does not affect the physics.
However, the rescaled $\varphi(x)$ obeys, by construction, $\langle p|\varphi(0)|0\rangle = 1$.

Finally, consider $\langle p, n|\varphi(x)|0\rangle$, where $|p, n\rangle$ is a multiparticle state with total four-momentum $p$, and $n$ is short for all other labels (such as relative momenta) needed to specify this state. We have $\langle p, n|\varphi(x)|0\rangle = \langle p, n|e^{-iPx}\varphi(0)e^{+iPx}|0\rangle = e^{-ipx}\langle p, n|\varphi(0)|0\rangle = e^{-ipx}A_n(\mathbf{p})$ , (5.19) where $A_n(\mathbf{p})$ is a function of Lorentz invariant products of the various (relative and total) four-momenta needed to specify the state. Note that, from fig. (5.1), $p^0 = (\mathbf{p}^2 + M^2)^{1/2}$ with $M \geq 2m$. The invariant mass $M$ is one of the parameters included in the set $n$.

We would like $\langle p, n|\varphi(x)|0\rangle$ to be zero, because we would like $a_1^\dagger(\pm\infty)$, when acting on $|0\rangle$, to create a single particle state. We do not want $a_1^\dagger(\pm\infty)$ to create any multiparticle states. But this is precisely what may happen if $\langle p, n|\varphi(x)|0\rangle$ is not zero.

Actually, we are being a little too strict. We really need $\langle p, n|a_1^\dagger(\pm\infty)|0\rangle$ to be zero, and perhaps it will be zero even if $\langle p, n|\varphi(x)|0\rangle$ is not. Also, we really should test $a_1^\dagger(\pm\infty)|0\rangle$ only against normalizable states. Mathematically, non-normalizable states cause all sorts of trouble; mathematicians don't consider them to be states at all. In physics, this usually doesn't bother us, but here we must be especially careful. So let us write $|\psi\rangle = \sum_n \int d^3p\, \psi_n(\mathbf{p})|p, n\rangle$ , (5.20) where the $\psi_n(\mathbf{p})$'s are wave packets for the total three-momentum $\mathbf{p}$. Note that eq. (5.20) is highly schematic; the sum over $n$ includes integrals over continuous parameters like relative momenta.

Now we want to examine

$$\langle\psi|a_1^\dagger(t)|0\rangle = -i\sum_n \int d^3p\, \psi_n^*(\mathbf{p}) \int d^3k\, f_1(\mathbf{k}) \int d^3x\, e^{ikx}\overleftrightarrow{\partial_0}\langle p, n|\varphi(x)|0\rangle \ .$$ (5.21)

We will take the limit $t \to \pm\infty$ in a moment. Using eq. (5.19), eq. (5.21) becomes

$$\langle\psi|a_1^\dagger(t)|0\rangle = -i\sum_n \int d^3p\, \psi_n^*(\mathbf{p}) \int d^3k\, f_1(\mathbf{k}) \int d^3x \left( e^{ikx}\overleftrightarrow{\partial_0}e^{-ipx} \right) A_n(\mathbf{p}) = \sum_n \int d^3p\, \psi_n^*(\mathbf{p}) \int d^3k\, f_1(\mathbf{k}) \int d^3x\, (p^0 + k^0)\, e^{i(k-p)x}A_n(\mathbf{p}) \ .$$
(5.22) Next we use $\int d^3x\, e^{i(\mathbf{k} - \mathbf{p})\cdot\mathbf{x}} = (2\pi)^3\delta^3(\mathbf{k} - \mathbf{p})$ to get

$$\langle\psi|a_1^\dagger(t)|0\rangle = \sum_n \int d^3p\, (2\pi)^3(p^0 + k^0)\,\psi_n^*(\mathbf{p})f_1(\mathbf{p})A_n(\mathbf{p})\, e^{i(p^0 - k^0)t} \ ,$$ (5.23)

where $p^0 = (\mathbf{p}^2 + M^2)^{1/2}$ and $k^0 = (\mathbf{p}^2 + m^2)^{1/2}$.

Now comes the key point. Note that $p^0$ is strictly greater than $k^0$, because $M \geq 2m > m$. Thus the integrand of eq. (5.23) contains a phase factor that oscillates more and more rapidly as $t \to \pm\infty$. Therefore, by the Riemann-Lebesgue lemma, the right-hand side of eq. (5.23) vanishes as $t \to \pm\infty$.

Physically, this means that a one-particle wave packet spreads out differently than a multiparticle wave packet, and the overlap between them goes to zero as the elapsed time goes to infinity. Thus, even though our operator $a_1^\dagger(t)$ creates some multiparticle states that we don't want, we can "follow" the one-particle state that we do want by using an appropriate wave packet. By waiting long enough, we can make the multiparticle contribution to the scattering amplitude as small as we like.

Let us recap. The basic formula for a scattering amplitude in terms of the fields of an interacting quantum field theory is the LSZ formula, which is worth writing down again:

$$\langle f|i\rangle = i^{n+n'}\int d^4x_1\, e^{ik_1x_1}(-\partial_1^2 + m^2) \ldots\ d^4x_1'\, e^{-ik_1'x_1'}(-\partial_{1'}^2 + m^2) \ldots \times \langle 0|\mathrm{T}\varphi(x_1) \ldots \varphi(x_1') \ldots|0\rangle \ .$$ (5.24)

The LSZ formula is valid provided that the field obeys $\langle 0|\varphi(x)|0\rangle = 0$ and $\langle k|\varphi(x)|0\rangle = e^{-ikx}$ . (5.25) These normalization conditions may conflict with our original choice of field and parameter normalization in the lagrangian. Consider, for example, a lagrangian originally specified as $\mathcal{L} = -\frac{1}{2}\partial^\mu\varphi\,\partial_\mu\varphi - \frac{1}{2}m^2\varphi^2 + \frac{1}{6}g\varphi^3$ . (5.26) After shifting and rescaling (and renaming some parameters), we will have instead $\mathcal{L} = -\frac{1}{2}Z_\varphi\,\partial^\mu\varphi\,\partial_\mu\varphi - \frac{1}{2}Z_m m^2\varphi^2 + \frac{1}{6}Z_g g\varphi^3 + Y\varphi$ . (5.27) Here the three $Z$'s and $Y$ are as yet unknown constants. They must be chosen to ensure the validity of eq. (5.25); this gives us two conditions in four unknowns.
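As a numerical aside on the Riemann-Lebesgue argument applied to eq. (5.23), the sketch below evaluates a one-dimensional toy version of that integral, $\int dp\, g(p)\, e^{i(p^0 - k^0)t}$ with $p^0 = (p^2 + M^2)^{1/2}$, $k^0 = (p^2 + m^2)^{1/2}$, and a gaussian $g(p)$ standing in for the smooth product $\psi_n^* f_1 A_n$. The reduction to one dimension, the gaussian profile, and the parameter values $m = 1$, $M = 2$ are all assumptions made for illustration; only the qualitative decay with $t$ is the point.

```python
import cmath
import math

def overlap(t, m=1.0, M=2.0, p1=1.0, sigma=0.5, n=20001):
    """|integral dp g(p) exp(i (p0 - k0) t)| for a gaussian packet g centered at p1.

    p0 = sqrt(p^2 + M^2) is the multiparticle energy (invariant mass M >= 2m),
    k0 = sqrt(p^2 + m^2) is the one-particle energy. Trapezoid rule.
    """
    lo, hi = p1 - 10.0 * sigma, p1 + 10.0 * sigma   # packet is negligible outside
    h = (hi - lo) / (n - 1)
    total = 0j
    for i in range(n):
        p = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        p0 = math.sqrt(p * p + M * M)
        k0 = math.sqrt(p * p + m * m)
        packet = math.exp(-((p - p1) ** 2) / (4.0 * sigma ** 2))
        total += weight * packet * cmath.exp(1j * (p0 - k0) * t)
    return abs(total * h)

if __name__ == "__main__":
    for t in (0.0, 100.0, 1000.0):
        print("t =", t, " |overlap| =", overlap(t))
```

The magnitude shrinks as $t$ grows because the phase $(p^0 - k^0)t$ varies ever faster across the support of the packet, which is the mechanism invoked in the text.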
We fix the parameter $m$ by requiring it to be equal to the actual mass of the particle (equivalently, the energy of the first excited state relative to the ground state), and we fix the parameter $g$ by requiring some particular scattering cross section to depend on $g$ in some particular way. (For example, in quantum electrodynamics, the parameter analogous to $g$ is the electron charge $e$. The low-energy Coulomb scattering cross section is proportional to $e^4$, with a definite constant of proportionality and no higher-order corrections; this relationship defines $e$.) Thus we have four conditions in four unknowns, and it is possible to calculate $Y$ and the three $Z$'s order by order in powers of $g$.

Next, we must develop the tools needed to compute the correlation functions $\langle 0|\mathrm{T}\varphi(x_1) \ldots|0\rangle$ in an interacting quantum field theory.

Reference Notes

Useful discussions of the LSZ reduction formula can be found in Brown, Itzykson & Zuber, Peskin & Schroeder, and Weinberg I.

Problems

5.1) Work out the LSZ reduction formula for the complex scalar field that was introduced in problem 3.5. Note that we must specify the type ($a$ or $b$) of each incoming and outgoing particle.

6 Path Integrals in Quantum Mechanics

Prerequisite: none

Consider the nonrelativistic quantum mechanics of one particle in one dimension; the hamiltonian is $H(P, Q) = \frac{1}{2m}P^2 + V(Q)$ , (6.1) where $P$ and $Q$ are operators obeying $[Q, P] = i$. (We set $\hbar = 1$ for notational convenience.) We wish to evaluate the probability amplitude for the particle to start at position $q'$ at time $t'$, and end at position $q''$ at time $t''$. This amplitude is $\langle q''|e^{-iH(t'' - t')}|q'\rangle$, where $|q'\rangle$ and $|q''\rangle$ are eigenstates of the position operator $Q$.

We can also formulate this question in the Heisenberg picture, where operators are time dependent and the state of the system is time independent, as opposed to the more familiar Schrödinger picture.
In the Heisenberg picture, we write $Q(t) = e^{iHt}Qe^{-iHt}$. We can then define an instantaneous eigenstate of $Q(t)$ via $Q(t)|q, t\rangle = q|q, t\rangle$. These instantaneous eigenstates can be expressed explicitly as $|q, t\rangle = e^{+iHt}|q\rangle$, where $Q|q\rangle = q|q\rangle$. Then our transition amplitude can be written as $\langle q'', t''|q', t'\rangle$ in the Heisenberg picture.

To evaluate $\langle q'', t''|q', t'\rangle$, we begin by dividing the time interval $T \equiv t'' - t'$ into $N + 1$ equal pieces of duration $\delta t = T/(N + 1)$. Then introduce $N$ complete sets of position eigenstates to get

$$\langle q'', t''|q', t'\rangle = \int \prod_{j=1}^{N} dq_j\, \langle q''|e^{-iH\delta t}|q_N\rangle\langle q_N|e^{-iH\delta t}|q_{N-1}\rangle \ldots \langle q_1|e^{-iH\delta t}|q'\rangle \ .$$ (6.2)

The integrals over the $q$'s all run from $-\infty$ to $+\infty$.

Now consider $\langle q_2|e^{-iH\delta t}|q_1\rangle$. We can use the Campbell-Baker-Hausdorff formula $\exp(A + B) = \exp(A)\exp(B)\exp(-\frac{1}{2}[A, B] + \ldots)$ (6.3) to write $\exp(-iH\delta t) = \exp[-i(\delta t/2m)P^2]\exp[-i\delta t\, V(Q)]\exp[O(\delta t^2)]$ . (6.4) Then, in the limit of small $\delta t$, we should be able to ignore the final exponential. Inserting a complete set of momentum states then gives

$$\begin{aligned}
\langle q_2|e^{-iH\delta t}|q_1\rangle &= \int dp_1\, \langle q_2|e^{-i(\delta t/2m)P^2}|p_1\rangle\langle p_1|e^{-i\delta t V(Q)}|q_1\rangle \\
&= \int dp_1\, e^{-i(\delta t/2m)p_1^2}\, e^{-i\delta t V(q_1)}\, \langle q_2|p_1\rangle\langle p_1|q_1\rangle \\
&= \int \frac{dp_1}{2\pi}\, e^{-i(\delta t/2m)p_1^2}\, e^{-i\delta t V(q_1)}\, e^{ip_1(q_2 - q_1)} \\
&= \int \frac{dp_1}{2\pi}\, e^{-iH(p_1, q_1)\delta t}\, e^{ip_1(q_2 - q_1)} \ .
\end{aligned}$$ (6.5)

To get the third line, we used $\langle q|p\rangle = (2\pi)^{-1/2}\exp(ipq)$.

If we happen to be interested in more general hamiltonians than eq. (6.1), then we must worry about the ordering of the $P$ and $Q$ operators in any term that contains both. If we adopt Weyl ordering, where the quantum hamiltonian $H(P, Q)$ is given in terms of the classical hamiltonian $H(p, q)$ by

$$H(P, Q) \equiv \int \frac{dx}{2\pi}\frac{dk}{2\pi}\, e^{ixP + ikQ} \int dp\, dq\, e^{-ixp - ikq}\, H(p, q) \ ,$$ (6.6)

then eq. (6.5) is not quite correct; in the last line, $H(p_1, q_1)$ should be replaced with $H(p_1, \bar q_1)$, where $\bar q_1 = \frac{1}{2}(q_1 + q_2)$. For the hamiltonian of eq. (6.1), which is Weyl ordered, this replacement makes no difference in the limit $\delta t \to 0$.
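As a numerical aside on eqs. (6.3) and (6.4), the sketch below checks the size of the splitting error on a toy example with $2\times 2$ matrices. It takes $A = -i\sigma_x$ and $B = -i\sigma_z$ (Pauli matrices, chosen here only because they do not commute; they are not the operators $P^2$ and $V(Q)$ of the text), scales both by a small $\epsilon$ playing the role of $\delta t$, and compares $e^{\epsilon(A+B)}$ with $e^{\epsilon A}e^{\epsilon B}$ and with $e^{\epsilon A}e^{\epsilon B}e^{-\epsilon^2[A,B]/2}$. The helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def expm(a, terms=30):
    """Matrix exponential by Taylor series (adequate for small-norm matrices)."""
    result = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    term = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, a))
        result = matadd(result, term)
    return result

def frobenius(a):
    return sum(abs(a[i][j]) ** 2 for i in range(2) for j in range(2)) ** 0.5

def splitting_errors(eps):
    """Errors of exp(eps A) exp(eps B), without and with the -[A,B]/2 correction."""
    A = [[0j, -1j], [-1j, 0j]]        # -i sigma_x
    B = [[-1j, 0j], [0j, 1j]]         # -i sigma_z
    exact = expm(scale(eps, matadd(A, B)))
    naive = matmul(expm(scale(eps, A)), expm(scale(eps, B)))
    commutator = matadd(matmul(A, B), scale(-1.0, matmul(B, A)))
    corrected = matmul(naive, expm(scale(-0.5 * eps * eps, commutator)))
    return (frobenius(matadd(exact, scale(-1.0, naive))),
            frobenius(matadd(exact, scale(-1.0, corrected))))

if __name__ == "__main__":
    for eps in (0.002, 0.001):
        print("eps =", eps, " errors (naive, corrected) =", splitting_errors(eps))
```

Halving $\epsilon$ reduces the first error by about a factor of four and the second by about a factor of eight, consistent with the neglected exponent in eq. (6.4) being $O(\delta t^2)$ and the terms beyond $-\frac{1}{2}[A, B]$ in eq. (6.3) being of third order.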
Adopting Weyl ordering for the general case, we now have
$$\langle q'',t''|q',t'\rangle = \int \prod_{k=1}^{N} dq_k \prod_{j=0}^{N} \frac{dp_j}{2\pi}\, e^{ip_j(q_{j+1}-q_j)}\, e^{-iH(p_j,\bar q_j)\delta t}\,, \tag{6.7}$$
where $\bar q_j = \tfrac{1}{2}(q_j+q_{j+1})$, $q_0 = q'$, and $q_{N+1} = q''$. If we now define $\dot q_j \equiv (q_{j+1}-q_j)/\delta t$, and take the formal limit of $\delta t \to 0$, we get
$$\langle q'',t''|q',t'\rangle = \int \mathcal{D}q\,\mathcal{D}p\, \exp\!\left[i\int_{t'}^{t''} dt\,\Big(p(t)\dot q(t) - H\big(p(t),q(t)\big)\Big)\right]. \tag{6.8}$$
The integration is to be understood as over all paths in phase space that start at $q(t') = q'$ (with an arbitrary value of the initial momentum) and end at $q(t'') = q''$ (with an arbitrary value of the final momentum).

If $H(p,q)$ is no more than quadratic in the momenta [as is the case for eq. (6.1)], then the integral over $p$ is gaussian, and can be done in closed form. If the term that is quadratic in $p$ is independent of $q$ [as is the case for eq. (6.1)], then the prefactors generated by the gaussian integrals are all constants, and can be absorbed into the definition of $\mathcal{D}q$. The result of integrating out $p$ is then
$$\langle q'',t''|q',t'\rangle = \int \mathcal{D}q\, \exp\!\left[i\int_{t'}^{t''} dt\, L\big(\dot q(t),q(t)\big)\right], \tag{6.9}$$
where $L(\dot q,q)$ is computed by first finding the stationary point of the $p$ integral by solving
$$0 = \frac{\partial}{\partial p}\Big(p\dot q - H(p,q)\Big) = \dot q - \frac{\partial H(p,q)}{\partial p} \tag{6.10}$$
for $p$ in terms of $\dot q$ and $q$, and then plugging this solution back into $p\dot q - H$ to get $L$. We recognize this procedure from classical mechanics: we are passing from the hamiltonian formulation to the lagrangian formulation.

Now that we have eqs. (6.8) and (6.9), what are we going to do with them? Let us begin by considering some generalizations; let us examine, for example, $\langle q'',t''|Q(t_1)|q',t'\rangle$, where $t' < t_1 < t''$. This is given by
$$\langle q'',t''|Q(t_1)|q',t'\rangle = \langle q''|e^{-iH(t''-t_1)}\,Q\,e^{-iH(t_1-t')}|q'\rangle\,. \tag{6.11}$$
In the path integral formula, the extra operator $Q$ inserted at time $t_1$ will simply result in an extra factor of $q(t_1)$. Thus
$$\langle q'',t''|Q(t_1)|q',t'\rangle = \int \mathcal{D}p\,\mathcal{D}q\; q(t_1)\, e^{iS}\,, \tag{6.12}$$
where $S = \int_{t'}^{t''} dt\,(p\dot q - H)$. Now let us go in the other direction; consider $\int \mathcal{D}p\,\mathcal{D}q\; q(t_1)q(t_2)\,e^{iS}$.
This clearly requires the operators $Q(t_1)$ and $Q(t_2)$, but their order depends on whether $t_1 < t_2$ or $t_2 < t_1$. Thus we have
$$\int \mathcal{D}p\,\mathcal{D}q\; q(t_1)q(t_2)\, e^{iS} = \langle q'',t''|\mathrm{T}Q(t_1)Q(t_2)|q',t'\rangle\,, \tag{6.13}$$
where $\mathrm{T}$ is the time ordering symbol: a product of operators to its right is to be ordered, not as written, but with operators at later times to the left of those at earlier times. This is significant, because time-ordered products enter into the LSZ formula for scattering amplitudes.

To further develop these methods, we need another trick: functional derivatives. We define the functional derivative $\delta/\delta f(t)$ via
$$\frac{\delta}{\delta f(t_1)}\, f(t_2) = \delta(t_1-t_2)\,, \tag{6.14}$$
where $\delta(t)$ is the Dirac delta function. Also, functional derivatives are defined to satisfy all the usual rules of derivatives (product rule, chain rule, etc). Eq. (6.14) can be thought of as the continuous generalization of $(\partial/\partial x_i)x_j = \delta_{ij}$.

Now, consider modifying the lagrangian of our theory by including external forces acting on the particle:
$$H(p,q) \to H(p,q) - f(t)q(t) - h(t)p(t)\,, \tag{6.15}$$
where $f(t)$ and $h(t)$ are specified functions. In this case we will write
$$\langle q'',t''|q',t'\rangle_{f,h} = \int \mathcal{D}p\,\mathcal{D}q\, \exp\!\left[i\int_{t'}^{t''} dt\,\Big(p\dot q - H + fq + hp\Big)\right], \tag{6.16}$$
where $H$ is the original hamiltonian. Then we have
$$\begin{aligned}
\frac{1}{i}\frac{\delta}{\delta f(t_1)}\, \langle q'',t''|q',t'\rangle_{f,h} &= \int \mathcal{D}p\,\mathcal{D}q\; q(t_1)\, e^{i\int dt\,[p\dot q - H + fq + hp]}\,, \\
\frac{1}{i}\frac{\delta}{\delta f(t_1)}\,\frac{1}{i}\frac{\delta}{\delta f(t_2)}\, \langle q'',t''|q',t'\rangle_{f,h} &= \int \mathcal{D}p\,\mathcal{D}q\; q(t_1)q(t_2)\, e^{i\int dt\,[p\dot q - H + fq + hp]}\,, \\
\frac{1}{i}\frac{\delta}{\delta h(t_1)}\, \langle q'',t''|q',t'\rangle_{f,h} &= \int \mathcal{D}p\,\mathcal{D}q\; p(t_1)\, e^{i\int dt\,[p\dot q - H + fq + hp]}\,,
\end{aligned} \tag{6.17}$$
and so on. After we are done bringing down as many factors of $q(t_i)$ or $p(t_i)$ as we like, we can set $f(t) = h(t) = 0$, and return to the original hamiltonian. Thus,
$$\langle q'',t''|\mathrm{T}Q(t_1)\ldots P(t_n)\ldots|q',t'\rangle = \frac{1}{i}\frac{\delta}{\delta f(t_1)} \ldots \frac{1}{i}\frac{\delta}{\delta h(t_n)} \ldots\, \langle q'',t''|q',t'\rangle_{f,h}\Big|_{f=h=0}\,. \tag{6.18}$$

Suppose we are also interested in initial and final states other than position eigenstates. Then we must multiply by the wave functions for these states, and integrate.
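The way a functional derivative "brings down" a factor of $q(t_1)$ in eq. (6.17) can be made concrete on a time grid. The sketch below is an illustration under stated assumptions, not the text's construction: time is discretized with spacing `dt`, the path `q_path` is an arbitrary fixed function chosen for the example, and $\delta/\delta f(t_j)$ is represented as $(1/\delta t)\,\partial/\partial f_j$ so that eq. (6.14) becomes $\delta_{ij}/\delta t$. Only the source-dependent factor $\exp(i\int dt\, f q)$ is differentiated.

```python
import cmath
import math

dt = 0.01
times = [j * dt for j in range(200)]
q_path = [math.sin(3.0 * t) for t in times]  # an arbitrary fixed path q(t), for illustration

def source_factor(f):
    """exp(i * integral dt f(t) q(t)) evaluated on the time grid."""
    return cmath.exp(1j * dt * sum(fj * qj for fj, qj in zip(f, q_path)))

def functional_derivative(functional, f, j, h=1e-6):
    """Grid version of delta/delta f(t_j): a partial derivative divided by dt."""
    up = list(f)
    up[j] += h
    down = list(f)
    down[j] -= h
    return (functional(up) - functional(down)) / (2 * h * dt)

f_zero = [0.0] * len(times)
j = 57
# (1/i) delta/delta f(t_j) acting on the source factor, evaluated at f = 0
pulled_down = functional_derivative(source_factor, f_zero, j) / 1j
print(pulled_down, q_path[j])  # the two numbers agree up to rounding
```

Inside the path integral the same operation acts under the integral sign for every path at once, which is the content of eq. (6.17).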
We will be interested, in particular, in the ground state as both the initial and final state. Also, we will take the limits $t' \to -\infty$ and $t'' \to +\infty$. The object of our attention is then
$$\langle 0|0\rangle_{f,h} = \lim_{\substack{t'\to-\infty \\ t''\to+\infty}} \int dq''\, dq'\; \psi_0^*(q'')\, \langle q'',t''|q',t'\rangle_{f,h}\, \psi_0(q')\,, \tag{6.19}$$
where $\psi_0(q) = \langle q|0\rangle$ is the ground-state wave function. Eq. (6.19) is a rather cumbersome formula, however. We will, therefore, employ a trick to simplify it.

Let $|n\rangle$ denote an eigenstate of $H$ with eigenvalue $E_n$. We will suppose that $E_0 = 0$; if this is not the case, we will shift $H$ by an appropriate constant. Next we write
$$\begin{aligned}
|q',t'\rangle &= e^{iHt'}|q'\rangle \\
&= \sum_{n=0}^{\infty} e^{iHt'}|n\rangle\langle n|q'\rangle \\
&= \sum_{n=0}^{\infty} \psi_n^*(q')\, e^{iE_nt'}|n\rangle\,,
\end{aligned} \tag{6.20}$$
where $\psi_n(q) = \langle q|n\rangle$ is the wave function of the $n$th eigenstate. Now, replace $H$ with $(1-i\epsilon)H$ in eq. (6.20), where $\epsilon$ is a small positive infinitesimal. Then, take the limit $t' \to -\infty$ of eq. (6.20) with $\epsilon$ held fixed. Every state except the ground state is then multiplied by a vanishing exponential factor, and so the limit is simply $\psi_0^*(q')|0\rangle$. Next, multiply by an arbitrary function $\chi(q')$, and integrate over $q'$. The only requirement is that $\langle 0|\chi\rangle \neq 0$. We then have a constant times $|0\rangle$, and this constant can be absorbed into the normalization of the path integral. A similar analysis of $\langle q'',t''| = \langle q''|e^{-iHt''}$ shows that the replacement $H \to (1-i\epsilon)H$ also picks out the ground state as the final state in the $t'' \to +\infty$ limit.

What all this means is that if we use $(1-i\epsilon)H$ instead of $H$, we can be cavalier about the boundary conditions on the endpoints of the path. Any reasonable boundary conditions will result in the ground state as both the initial and final state. Thus we have
$$\langle 0|0\rangle_{f,h} = \int \mathcal{D}p\,\mathcal{D}q\, \exp\!\left[i\int_{-\infty}^{+\infty} dt\,\Big(p\dot q - (1-i\epsilon)H + fq + hp\Big)\right]. \tag{6.21}$$

Now let us suppose that $H = H_0 + H_1$, where we can solve for the eigenstates and eigenvalues of $H_0$, and $H_1$ can be treated as a perturbation. Suppressing the $i\epsilon$, eq.
(6.21) can be written as
$$\begin{aligned}
\langle 0|0\rangle_{f,h} &= \int \mathcal{D}p\,\mathcal{D}q\, \exp\!\left[i\int_{-\infty}^{+\infty} dt\,\Big(p\dot q - H_0(p,q) - H_1(p,q) + fq + hp\Big)\right] \\
&= \exp\!\left[-i\int_{-\infty}^{+\infty} dt\; H_1\!\left(\frac{1}{i}\frac{\delta}{\delta h(t)}, \frac{1}{i}\frac{\delta}{\delta f(t)}\right)\right] \\
&\qquad \times \int \mathcal{D}p\,\mathcal{D}q\, \exp\!\left[i\int_{-\infty}^{+\infty} dt\,\Big(p\dot q - H_0(p,q) + fq + hp\Big)\right].
\end{aligned} \tag{6.22}$$
To understand the second line of this equation, take the exponential prefactor inside the path integral. Then the functional derivatives (that appear as the arguments of $H_1$) just pull out appropriate factors of $p(t)$ and $q(t)$, generating the right-hand side of the first line. We assume that we can compute the functional integral in the second line, since it involves only the solvable hamiltonian $H_0$. The exponential prefactor can then be expanded in powers of $H_1$ to generate a perturbation series.

If $H_1$ depends only on $q$ (and not on $p$), and if we are only interested in time-ordered products of $Q$'s (and not $P$'s), and if $H$ is no more than quadratic in $P$, and if the term quadratic in $P$ does not involve $Q$, then eq. (6.22) can be simplified to
$$\langle 0|0\rangle_f = \exp\!\left[i\int_{-\infty}^{+\infty} dt\; L_1\!\left(\frac{1}{i}\frac{\delta}{\delta f(t)}\right)\right] \times \int \mathcal{D}q\, \exp\!\left[i\int_{-\infty}^{+\infty} dt\,\Big(L_0(\dot q,q) + fq\Big)\right], \tag{6.23}$$
where $L_1(q) = -H_1(q)$.

Reference Notes

Brown and Ramond I have especially clear treatments of various aspects of path integrals. For a careful derivation of the midpoint rule of eq. (6.7), see Berry & Mount.

Problems

6.1) a) Find an explicit formula for $\mathcal{D}q$ in eq. (6.9). Your formula should be of the form $\mathcal{D}q = C\prod_{j=1}^{N} dq_j$, where $C$ is a constant that you should compute.

b) For the case of a free particle, $V(Q) = 0$, evaluate the path integral of eq. (6.9) explicitly. Hint: integrate over $q_1$, then $q_2$, etc, and look for a pattern. Express your final answer in terms of $q'$, $t'$, $q''$, $t''$, and $m$. Restore $\hbar$ by dimensional analysis.

c) Compute $\langle q'',t''|q',t'\rangle = \langle q''|e^{-iH(t''-t')}|q'\rangle$ by inserting a complete set of momentum eigenstates, and performing the integral over the momentum. Compare with your result in part (b).
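The mechanism of eqs. (6.22) and (6.23), in which a perturbation is traded for derivatives with respect to the source, can be tried out on an ordinary one-variable integral. The sketch below is a toy under explicit assumptions and is not taken from the text: the oscillating path integral is replaced by a convergent real integral, the "solvable part" is a unit gaussian, and the "perturbation" is $-\lambda q^4$ in the exponent. The identity being checked to first order in $\lambda$ is $\int dq\, e^{-q^2/2 - \lambda q^4 + fq} = \exp(-\lambda\, d^4/df^4) \int dq\, e^{-q^2/2 + fq}$ at $f = 0$. The names `z_free`, `z_full`, and `lam` are hypothetical.

```python
import math

def integrate(func, lo=-10.0, hi=10.0, n=20000):
    """Composite trapezoid rule; ample for rapidly decaying integrands."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    total += sum(func(lo + k * h) for k in range(1, n))
    return total * h

def z_free(f):
    """Solvable part: integral of exp(-q^2/2 + f q) = sqrt(2 pi) exp(f^2/2)."""
    return math.sqrt(2 * math.pi) * math.exp(0.5 * f * f)

def fourth_derivative(func, x, h=0.05):
    """Five-point central difference for the fourth derivative."""
    return (func(x + 2 * h) - 4 * func(x + h) + 6 * func(x)
            - 4 * func(x - h) + func(x - 2 * h)) / h ** 4

lam = 1e-3
# Left side: the full integral with the quartic "perturbation" included.
z_full = integrate(lambda q: math.exp(-0.5 * q * q - lam * q ** 4))
# Right side to first order: (1 - lam d^4/df^4) acting on the solvable result at f = 0.
z_first_order = z_free(0.0) - lam * fourth_derivative(z_free, 0.0)
print(z_full, z_first_order)  # agree up to terms of order lam**2
```

Higher orders in $\lambda$ correspond to expanding the exponential prefactor further, which is the perturbation series referred to above.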
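7 The Path Integral for the Harmonic Oscillator

Prerequisite: 6

Consider a harmonic oscillator with hamiltonian
$$H(P,Q) = \frac{1}{2m}P^2 + \frac{1}{2}m\omega^2Q^2\,. \tag{7.1}$$
We begin with the formula from section 6 for the ground state to ground state transition amplitude in the presence of an external force, specialized to the case of a harmonic oscillator:
$$\langle 0|0\rangle_f = \int \mathcal{D}p\,\mathcal{D}q\, \exp\!\left[i\int_{-\infty}^{+\infty} dt\,\Big(p\dot q - (1-i\epsilon)H + fq\Big)\right]. \tag{7.2}$$
Looking at eq. (7.1), we see that multiplying $H$ by $1-i\epsilon$ is equivalent to the replacements $m^{-1} \to (1-i\epsilon)m^{-1}$ [or, equivalently, $m \to (1+i\epsilon)m$] and $m\omega^2 \to (1-i\epsilon)m\omega^2$. Passing to the lagrangian formulation then gives
$$\langle 0|0\rangle_f = \int \mathcal{D}q\, \exp\, i\int_{-\infty}^{+\infty} dt\,\Big[\tfrac{1}{2}(1+i\epsilon)m\dot q^2 - \tfrac{1}{2}(1-i\epsilon)m\omega^2q^2 + fq\Big]\,. \tag{7.3}$$
From now on, we will simplify the notation by setting $m = 1$.

Next, let us use Fourier-transformed variables,
$$\tilde q(E) = \int_{-\infty}^{+\infty} dt\, e^{iEt}\, q(t)\,, \qquad q(t) = \int_{-\infty}^{+\infty} \frac{dE}{2\pi}\, e^{-iEt}\, \tilde q(E)\,. \tag{7.4}$$
The expression in square brackets in eq. (7.3) becomes
$$\Big[\cdots\Big] = \frac{1}{2}\int_{-\infty}^{+\infty} \frac{dE}{2\pi}\frac{dE'}{2\pi}\, e^{-i(E+E')t}\,\Big[\Big(-(1+i\epsilon)EE' - (1-i\epsilon)\omega^2\Big)\tilde q(E)\tilde q(E') + \tilde f(E)\tilde q(E') + \tilde f(E')\tilde q(E)\Big]\,. \tag{7.5}$$
Note that the only $t$ dependence is now in the prefactor. Integrating over $t$ then generates a factor of $2\pi\delta(E+E')$. Then we can easily integrate over $E'$ to get
$$S = \int_{-\infty}^{+\infty} dt\,\Big[\cdots\Big] = \frac{1}{2}\int_{-\infty}^{+\infty} \frac{dE}{2\pi}\,\Big[\Big((1+i\epsilon)E^2 - (1-i\epsilon)\omega^2\Big)\tilde q(E)\tilde q(-E) + \tilde f(E)\tilde q(-E) + \tilde f(-E)\tilde q(E)\Big]\,. \tag{7.6}$$
The factor in large parentheses is equal to $E^2 - \omega^2 + i(E^2+\omega^2)\epsilon$, and we can absorb the positive coefficient into $\epsilon$ to get $E^2 - \omega^2 + i\epsilon$.

Now it is convenient to change integration variables to
$$\tilde x(E) = \tilde q(E) + \frac{\tilde f(E)}{E^2 - \omega^2 + i\epsilon}\,. \tag{7.7}$$
Then we get
$$S = \frac{1}{2}\int_{-\infty}^{+\infty} \frac{dE}{2\pi}\,\left[\tilde x(E)(E^2 - \omega^2 + i\epsilon)\tilde x(-E) - \frac{\tilde f(E)\tilde f(-E)}{E^2 - \omega^2 + i\epsilon}\right]. \tag{7.8}$$
Furthermore, because eq. (7.7) is just a shift by a constant, $\mathcal{D}q = \mathcal{D}x$. Now we have
$$\langle 0|0\rangle_f = \exp\!\left[\frac{i}{2}\int_{-\infty}^{+\infty} \frac{dE}{2\pi}\, \frac{\tilde f(E)\tilde f(-E)}{-E^2 + \omega^2 - i\epsilon}\right] \times \int \mathcal{D}x\, \exp\!\left[\frac{i}{2}\int_{-\infty}^{+\infty} \frac{dE}{2\pi}\, \tilde x(E)(E^2 - \omega^2 + i\epsilon)\tilde x(-E)\right]. \tag{7.9}$$
Now comes the key point. The path integral on the second line of eq.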
(7.9) is what we get for $\langle 0|0\rangle_f$ in the case $f = 0$. On the other hand, if there is no external force, a system in its ground state will remain in its ground state, and so $\langle 0|0\rangle_{f=0} = 1$. Thus $\langle 0|0\rangle_f$ is given by the first line of eq. (7.9),
$$\langle 0|0\rangle_f = \exp\!\left[\frac{i}{2}\int_{-\infty}^{+\infty} \frac{dE}{2\pi}\, \frac{\tilde f(E)\tilde f(-E)}{-E^2 + \omega^2 - i\epsilon}\right]. \tag{7.10}$$
We can also rewrite $\langle 0|0\rangle_f$ in terms of time-domain variables as
$$\langle 0|0\rangle_f = \exp\!\left[\frac{i}{2}\int_{-\infty}^{+\infty} dt\, dt'\, f(t)\,G(t-t')\,f(t')\right], \tag{7.11}$$
where
$$G(t-t') = \int_{-\infty}^{+\infty} \frac{dE}{2\pi}\, \frac{e^{-iE(t-t')}}{-E^2 + \omega^2 - i\epsilon}\,. \tag{7.12}$$
Note that $G(t-t')$ is a Green's function for the oscillator equation of motion:
$$\left(\frac{\partial^2}{\partial t^2} + \omega^2\right) G(t-t') = \delta(t-t')\,. \tag{7.13}$$
This can be seen directly by plugging eq. (7.12) into eq. (7.13) and then taking the $\epsilon \to 0$ limit. We can also evaluate $G(t-t')$ explicitly by treating the integral over $E$ on the right-hand side of eq. (7.12) as a contour integral in the complex $E$ plane, and then evaluating it via the residue theorem. The result is
$$G(t-t') = \frac{i}{2\omega}\exp\!\big(-i\omega|t-t'|\big)\,. \tag{7.14}$$

Consider now the formula from section 6 for the time-ordered product of operators. In the case of initial and final ground states, it becomes
$$\langle 0|\mathrm{T}Q(t_1)\ldots|0\rangle = \frac{1}{i}\frac{\delta}{\delta f(t_1)} \ldots\, \langle 0|0\rangle_f\Big|_{f=0}\,. \tag{7.15}$$
Using our explicit formula, eq. (7.11), we have
$$\begin{aligned}
\langle 0|\mathrm{T}Q(t_1)Q(t_2)|0\rangle &= \frac{1}{i}\frac{\delta}{\delta f(t_1)}\,\frac{1}{i}\frac{\delta}{\delta f(t_2)}\, \langle 0|0\rangle_f\Big|_{f=0} \\
&= \frac{1}{i}\frac{\delta}{\delta f(t_1)} \left[\int_{-\infty}^{+\infty} dt'\, G(t_2-t')f(t')\right] \langle 0|0\rangle_f\Big|_{f=0} \\
&= \left[\frac{1}{i}G(t_2-t_1) + (\text{term with } f\text{'s})\right] \langle 0|0\rangle_f\Big|_{f=0} \\
&= \frac{1}{i}G(t_2-t_1)\,.
\end{aligned} \tag{7.16}$$
We can continue in this way to compute the ground-state expectation value of the time-ordered product of more $Q(t)$'s. If the number of $Q(t)$'s is odd, then there is always a left-over $f(t)$ in the prefactor, and so the result is zero. If the number of $Q(t)$'s is even, then we must pair up the functional derivatives in an appropriate way to get a nonzero result. Thus, for example,
$$\langle 0|\mathrm{T}Q(t_1)Q(t_2)Q(t_3)Q(t_4)|0\rangle = \frac{1}{i^2}\Big[G(t_1-t_2)G(t_3-t_4) + G(t_1-t_3)G(t_2-t_4) + G(t_1-t_4)G(t_2-t_3)\Big]\,. \tag{7.17}$$
More generally,
$$\langle 0|\mathrm{T}Q(t_1)\ldots Q(t_{2n})|0\rangle = \frac{1}{i^n}\sum_{\text{pairings}} G(t_{i_1}-t_{i_2}) \ldots G(t_{i_{2n-1}}-t_{i_{2n}})\,. \tag{7.18}$$

Problems

7.1) Starting with eq. (7.12), do the contour integral to verify eq. (7.14).

7.2) Starting with eq. (7.14), verify eq. (7.13).

7.3) a) Use the Heisenberg equation of motion, $\dot A = i[H,A]$, to find explicit expressions for $\dot Q$ and $\dot P$. Solve these to get the Heisenberg-picture operators $Q(t)$ and $P(t)$ in terms of the Schrödinger picture operators $Q$ and $P$.

b) Write the Schrödinger picture operators $Q$ and $P$ in terms of the creation and annihilation operators $a$ and $a^\dagger$, where $H = \hbar\omega(a^\dagger a + \tfrac{1}{2})$. Then, using your result from part (a), write the Heisenberg-picture operators $Q(t)$ and $P(t)$ in terms of $a$ and $a^\dagger$.

c) Using your result from part (b), and $a|0\rangle = \langle 0|a^\dagger = 0$, verify eqs. (7.16) and (7.17).

7.4) Consider a harmonic oscillator in its ground state at $t = -\infty$. It is then subjected to an external force $f(t)$. Compute the probability $|\langle 0|0\rangle_f|^2$ that the oscillator is still in its ground state at $t = +\infty$. Write your answer as a manifestly real expression, and in terms of the Fourier transform $\tilde f(E) = \int_{-\infty}^{+\infty} dt\, e^{iEt}f(t)$. Your answer should not involve any other unevaluated integrals.

8 The Path Integral for Free Field Theory

Prerequisite: 3, 7

Our results for the harmonic oscillator can be straightforwardly generalized to a free field theory with hamiltonian density
$$\mathcal{H}_0 = \tfrac{1}{2}\Pi^2 + \tfrac{1}{2}(\nabla\varphi)^2 + \tfrac{1}{2}m^2\varphi^2\,. \tag{8.1}$$
The dictionary we need is
$$\begin{aligned}
q(t) &\longrightarrow \varphi(\mathbf{x},t) \quad \text{(classical field)} \\
Q(t) &\longrightarrow \varphi(\mathbf{x},t) \quad \text{(operator field)} \\
f(t) &\longrightarrow J(\mathbf{x},t) \quad \text{(classical source)}
\end{aligned} \tag{8.2}$$
The distinction between the classical field $\varphi(x)$ and the corresponding operator field should be clear from context.

To employ the $\epsilon$ trick, we multiply $\mathcal{H}_0$ by $1-i\epsilon$. The results are equivalent to replacing $m^2$ in $\mathcal{H}_0$ with $m^2 - i\epsilon$. From now on, for notational simplicity, we will write $m^2$ when we really mean $m^2 - i\epsilon$.
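The $\epsilon$ trick invoked here rests on the damping argument of section 6: after $H \to (1-i\epsilon)H$, the coefficient of an energy eigenstate in eq. (6.20) acquires the magnitude $e^{\epsilon E_n t'}$, which vanishes as $t' \to -\infty$ for every state with $E_n > 0$. The short sketch below tabulates this factor. It assumes, purely for illustration, equally spaced levels $E_n = n\omega$ with the ground-state energy shifted to zero; the function name `damping` is hypothetical.

```python
import cmath

def damping(n, t_initial, eps, omega=1.0):
    """Magnitude of the coefficient of |n> in eq. (6.20) after H -> (1 - i eps) H.

    Assumes oscillator-like levels E_n = n * omega with E_0 shifted to zero.
    """
    energy = n * omega
    return abs(cmath.exp(1j * (1 - 1j * eps) * energy * t_initial))

eps = 0.01
for t_initial in (-1e2, -1e3, -1e4):
    # Weights of the ground state and the first two excited states.
    print(t_initial, [damping(n, t_initial, eps) for n in range(3)])
```

The ground-state weight stays at one for any $t'$, while the excited-state weights fall off exponentially, which is why the limit is taken with $\epsilon$ held fixed.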
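Let us write down the path integral (also called the functional integral) for our free field theory:
$$Z_0(J) \equiv \langle 0|0\rangle_J = \int \mathcal{D}\varphi\; e^{i\int d^4x\,[\mathcal{L}_0 + J\varphi]}\,, \tag{8.3}$$
where
$$\mathcal{L}_0 = -\tfrac{1}{2}\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}m^2\varphi^2 \tag{8.4}$$
is the lagrangian density, and
$$\mathcal{D}\varphi \propto \prod_x d\varphi(x) \tag{8.5}$$
is the functional measure. Note that when we say path integral, we now mean a path in the space of field configurations.

We can evaluate $Z_0(J)$ by mimicking what we did for the harmonic oscillator in section 7. We introduce four-dimensional Fourier transforms,
$$\tilde\varphi(k) = \int d^4x\, e^{-ikx}\,\varphi(x)\,, \qquad \varphi(x) = \int \frac{d^4k}{(2\pi)^4}\, e^{ikx}\,\tilde\varphi(k)\,, \tag{8.6}$$
where $kx = -k^0t + \mathbf{k}\cdot\mathbf{x}$, and $k^0$ is an integration variable. Then, starting with $S_0 = \int d^4x\,[\mathcal{L}_0 + J\varphi]$, we get
$$S_0 = \frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\,\Big[-\tilde\varphi(k)(k^2+m^2)\tilde\varphi(-k) + \tilde J(k)\tilde\varphi(-k) + \tilde J(-k)\tilde\varphi(k)\Big]\,, \tag{8.7}$$
where $k^2 = \mathbf{k}^2 - (k^0)^2$. We now change path integration variables to
$$\tilde\chi(k) = \tilde\varphi(k) - \frac{\tilde J(k)}{k^2+m^2}\,. \tag{8.8}$$
Since this is merely a shift by a constant, we have $\mathcal{D}\varphi = \mathcal{D}\chi$. The action becomes
$$S_0 = \frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\,\left[\frac{\tilde J(k)\tilde J(-k)}{k^2+m^2} - \tilde\chi(k)(k^2+m^2)\tilde\chi(-k)\right]. \tag{8.9}$$
Just as for the harmonic oscillator, the integral over $\chi$ simply yields a factor of $Z_0(0) = \langle 0|0\rangle_{J=0} = 1$. Therefore
$$\begin{aligned}
Z_0(J) &= \exp\!\left[\frac{i}{2}\int \frac{d^4k}{(2\pi)^4}\, \frac{\tilde J(k)\tilde J(-k)}{k^2+m^2-i\epsilon}\right] \\
&= \exp\!\left[\frac{i}{2}\int d^4x\, d^4x'\, J(x)\,\Delta(x-x')\,J(x')\right].
\end{aligned} \tag{8.10}$$
Here we have defined the Feynman propagator,
$$\Delta(x-x') = \int \frac{d^4k}{(2\pi)^4}\, \frac{e^{ik(x-x')}}{k^2+m^2-i\epsilon}\,. \tag{8.11}$$
The Feynman propagator is a Green's function for the Klein-Gordon equation,
$$(-\partial_x^2 + m^2)\,\Delta(x-x') = \delta^4(x-x')\,. \tag{8.12}$$
This can be seen directly by plugging eq. (8.11) into eq. (8.12) and then taking the $\epsilon \to 0$ limit. We can also evaluate $\Delta(x-x')$ explicitly by treating the $k^0$ integral on the right-hand side of eq. (8.11) as a contour integral in the complex $k^0$ plane, and then evaluating it via the residue theorem. The result is
$$\begin{aligned}
\Delta(x-x') &= i\int \widetilde{dk}\; e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}') - i\omega|t-t'|} \\
&= i\theta(t-t')\int \widetilde{dk}\; e^{ik(x-x')} + i\theta(t'-t)\int \widetilde{dk}\; e^{-ik(x-x')}\,,
\end{aligned} \tag{8.13}$$
where $\theta(t)$ is the unit step function. The integral over $\widetilde{dk}$ can also be performed in terms of Bessel functions; see section 4.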
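For fixed $\mathbf{k}$, the $k^0$ integral in eq. (8.11) is the oscillator integral of eq. (7.12) with $\omega^2 = \mathbf{k}^2 + m^2$, so the residue-theorem result can be checked by direct numerical integration. The sketch below is only a sanity check under stated assumptions: $\epsilon$ is kept finite (`eps = 0.2`) so that the integrand is smooth on the real axis, the integration range is truncated at a hypothetical `cutoff`, and the comparison formula uses $\Omega = \sqrt{\omega^2 - i\epsilon}$ (root with positive real part), which reduces to eq. (7.14) as $\epsilon \to 0$.

```python
import cmath
import math

def k0_integral(t, omega, eps, cutoff=200.0, step=0.005):
    """Trapezoid estimate of int dk0/(2 pi) exp(-i k0 t) / (-k0^2 + omega^2 - i eps)."""
    n = int(round(2 * cutoff / step))
    total = 0j
    for j in range(n + 1):
        k0 = -cutoff + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(-1j * k0 * t) / (-k0 * k0 + omega * omega - 1j * eps)
    return total * step / (2 * math.pi)

def residue_result(t, omega, eps):
    """(i / 2 Omega) exp(-i Omega |t|) with Omega^2 = omega^2 - i eps, Re Omega > 0."""
    big_omega = cmath.sqrt(omega * omega - 1j * eps)
    return 1j / (2 * big_omega) * cmath.exp(-1j * big_omega * abs(t))

omega, eps = 1.0, 0.2
for t in (1.0, -1.0, 2.5):
    # Numerical integral next to the closed form; both depend only on |t|.
    print(t, k0_integral(t, omega, eps), residue_result(t, omega, eps))
```

The dependence on $|t-t'|$ alone, visible in the output for $t = \pm 1$, is what produces the two step-function terms in the second line of eq. (8.13).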
Now, by analogy with the formula for the ground-state expectation value of a time-ordered product of operators for the harmonic oscillator, we have
$$\langle 0|\mathrm{T}\varphi(x_1)\ldots|0\rangle = \frac{1}{i}\frac{\delta}{\delta J(x_1)} \ldots\, Z_0(J)\Big|_{J=0}\,. \tag{8.14}$$
Using our explicit formula, eq. (8.10), we have
$$\begin{aligned}
\langle 0|\mathrm{T}\varphi(x_1)\varphi(x_2)|0\rangle &= \frac{1}{i}\frac{\delta}{\delta J(x_1)}\,\frac{1}{i}\frac{\delta}{\delta J(x_2)}\, Z_0(J)\Big|_{J=0} \\
&= \frac{1}{i}\frac{\delta}{\delta J(x_1)} \left[\int d^4x'\, \Delta(x_2-x')J(x')\right] Z_0(J)\Big|_{J=0} \\
&= \left[\frac{1}{i}\Delta(x_2-x_1) + (\text{term with } J\text{'s})\right] Z_0(J)\Big|_{J=0} \\
&= \frac{1}{i}\Delta(x_2-x_1)\,.
\end{aligned} \tag{8.15}$$
We can continue in this way to compute the ground-state expectation value of the time-ordered product of more $\varphi$'s. If the number of $\varphi$'s is odd, then there is always a left-over $J$ in the prefactor, and so the result is zero. If the number of $\varphi$'s is even, then we must pair up the functional derivatives in an appropriate way to get a nonzero result. Thus, for example,
$$\langle 0|\mathrm{T}\varphi(x_1)\varphi(x_2)\varphi(x_3)\varphi(x_4)|0\rangle = \frac{1}{i^2}\Big[\Delta(x_1-x_2)\Delta(x_3-x_4) + \Delta(x_1-x_3)\Delta(x_2-x_4) + \Delta(x_1-x_4)\Delta(x_2-x_3)\Big]\,. \tag{8.16}$$
More generally,
$$\langle 0|\mathrm{T}\varphi(x_1)\ldots\varphi(x_{2n})|0\rangle = \frac{1}{i^n}\sum_{\text{pairings}} \Delta(x_{i_1}-x_{i_2}) \ldots \Delta(x_{i_{2n-1}}-x_{i_{2n}})\,. \tag{8.17}$$
This result is known as Wick's theorem.

Problems

8.1) Starting with eq. (8.11), verify eq. (8.12).

8.2) Starting with eq. (8.11), verify eq. (8.13).

8.3) Starting with eq. (8.13), verify eq. (8.12). Note that the time derivatives in the Klein-Gordon wave operator can act on either the field (which obeys the Klein-Gordon equation) or the time-ordering step functions.

8.4) Use eqs. (3.19), (3.29), and (5.3) (and its hermitian conjugate) to verify the last line of eq. (8.15).

8.5) The retarded and advanced Green's functions for the Klein-Gordon wave operator satisfy $\Delta_{\rm ret}(x-y) = 0$ for $x^0 < y^0$ and $\Delta_{\rm adv}(x-y) = 0$ for $x^0 > y^0$. Find the pole prescriptions on the right-hand side of eq. (8.11) that yield these Green's functions.

8.6) Let $Z_0(J) = \exp iW_0(J)$, and evaluate the real and imaginary parts of $W_0(J)$.
8.7) Repeat the analysis of this section for the complex scalar field that was introduced in problem 3.5, and further studied in problem 5.1. Write your source term in the form $J^\dagger\varphi + J\varphi^\dagger$, and find an explicit formula, analogous to eq. (8.10), for $Z_0(J^\dagger,J)$. Write down the appropriate generalization of eq. (8.14), and use it to compute $\langle 0|\mathrm{T}\varphi(x_1)\varphi(x_2)|0\rangle$, $\langle 0|\mathrm{T}\varphi^\dagger(x_1)\varphi(x_2)|0\rangle$, and $\langle 0|\mathrm{T}\varphi^\dagger(x_1)\varphi^\dagger(x_2)|0\rangle$. Then verify your results by using the method of problem 8.4. Finally, give the appropriate generalization of eq. (8.17).

8.8) A harmonic oscillator (in units with $m = \hbar = 1$) has a ground-state wave function $\langle q|0\rangle \propto e^{-\omega q^2/2}$. Now consider a real scalar field $\varphi(x)$, and define a field eigenstate $|A\rangle$ that obeys
$$\varphi(\mathbf{x},0)|A\rangle = A(\mathbf{x})|A\rangle\,, \tag{8.18}$$
where the function $A(\mathbf{x})$ is everywhere real. For a free-field theory specified by the hamiltonian of eq. (8.1), show that the ground-state wave functional is
$$\langle A|0\rangle \propto \exp\!\left[-\frac{1}{2}\int \frac{d^3k}{(2\pi)^3}\, \omega(\mathbf{k})\tilde A(\mathbf{k})\tilde A(-\mathbf{k})\right], \tag{8.19}$$
where $\tilde A(\mathbf{k}) \equiv \int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}}A(\mathbf{x})$ and $\omega(\mathbf{k}) \equiv (\mathbf{k}^2+m^2)^{1/2}$.

9 The Path Integral for Interacting Field Theory

Prerequisite: 8

Let us consider an interacting quantum field theory specified by a lagrangian of the form
$$\mathcal{L} = -\tfrac{1}{2}Z_\varphi\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}Z_m m^2\varphi^2 + \tfrac{1}{6}Z_g g\varphi^3 + Y\varphi\,. \tag{9.1}$$
As we discussed at the end of section 5, we fix the parameter $m$ by requiring it to be equal to the actual mass of the particle (equivalently, the energy of the first excited state relative to the ground state), and we fix the parameter $g$ by requiring some particular scattering cross section to depend on $g$ in some particular way. (We will have more to say about this after we have learned to calculate cross sections.) We also assume that the field is normalized by
$$\langle 0|\varphi(x)|0\rangle = 0 \quad\text{and}\quad \langle k|\varphi(x)|0\rangle = e^{-ikx}\,. \tag{9.2}$$
Here $|0\rangle$ is the ground state, normalized via $\langle 0|0\rangle = 1$, and $|k\rangle$ is a state of one particle with four-momentum $k^\mu$, where $k^2 = k^\mu k_\mu = -m^2$, normalized via
$$\langle k'|k\rangle = (2\pi)^3\, 2k^0\, \delta^3(\mathbf{k}'-\mathbf{k})\,. \tag{9.3}$$
Thus we have four conditions (the specified values of $m$, $g$, $\langle 0|\varphi|0\rangle$, and $\langle k|\varphi|0\rangle$), and we will use these four conditions to determine the values of the four remaining parameters ($Y$ and the three $Z$'s) that appear in $\mathcal{L}$.

Before going further, we should note that this theory (known as $\varphi^3$ theory, pronounced "phi-cubed") actually has a fatal flaw. The hamiltonian density is
$$\mathcal{H} = \tfrac{1}{2}Z_\varphi^{-1}\Pi^2 - Y\varphi + \tfrac{1}{2}Z_m m^2\varphi^2 - \tfrac{1}{6}Z_g g\varphi^3\,. \tag{9.4}$$
Classically, we can make this arbitrarily negative by choosing an arbitrarily large value for $\varphi$. Quantum mechanically, this means that this hamiltonian has no ground state. If we start off near $\varphi = 0$, we can tunnel through the potential barrier to large $\varphi$, and then "roll down the hill". However, this process is invisible in perturbation theory in $g$. The situation is exactly analogous to the problem of a harmonic oscillator perturbed by a $q^3$ term. This system also has no ground state, but perturbation theory (both time dependent and time independent) does not "know" this. We will be interested in eq. (9.1) only as an example of how to do perturbation expansions in a simple context, and so we will overlook this problem.

We would like to evaluate the path integral for this theory,
$$Z(J) \equiv \langle 0|0\rangle_J = \int \mathcal{D}\varphi\; e^{i\int d^4x\,[\mathcal{L}_0 + \mathcal{L}_1 + J\varphi]}\,. \tag{9.5}$$
We can evaluate $Z(J)$ by mimicking what we did for quantum mechanics at the end of section 6. Specifically, we can rewrite eq. (9.5) as
$$\begin{aligned}
Z(J) &= e^{i\int d^4x\, \mathcal{L}_1\left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)} \int \mathcal{D}\varphi\; e^{i\int d^4x\,[\mathcal{L}_0 + J\varphi]} \\
&\propto e^{i\int d^4x\, \mathcal{L}_1\left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)}\, Z_0(J)\,,
\end{aligned} \tag{9.6}$$
where $Z_0(J)$ is the result in free-field theory,
$$Z_0(J) = \exp\!\left[\frac{i}{2}\int d^4x\, d^4x'\, J(x)\,\Delta(x-x')\,J(x')\right]. \tag{9.7}$$
We have written $Z(J)$ as proportional to (rather than equal to) the right-hand side of eq. (9.6) because the $\epsilon$ trick does not give us the correct overall normalization; instead, we must require $Z(0) = 1$, and enforce this by hand.

Note that, in eq. (9.7), we have implicitly assumed that
$$\mathcal{L}_0 = -\tfrac{1}{2}\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}m^2\varphi^2\,, \tag{9.8}$$
since this is the $\mathcal{L}_0$ that gives us eq.
(9.7). Therefore, the rest of $\mathcal{L}$ must be included in $\mathcal{L}_1$. We write
$$\begin{aligned}
\mathcal{L}_1 &= \tfrac{1}{6}Z_g g\varphi^3 + \mathcal{L}_{\rm ct}\,, \\
\mathcal{L}_{\rm ct} &= -\tfrac{1}{2}(Z_\varphi-1)\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}(Z_m-1)m^2\varphi^2 + Y\varphi\,,
\end{aligned} \tag{9.9}$$
where $\mathcal{L}_{\rm ct}$ is called the counterterm lagrangian. We expect that, as $g \to 0$, $Y \to 0$ and $Z_i \to 1$. In fact, as we will see, $Y = O(g)$ and $Z_i = 1 + O(g^2)$.

In order to make use of eq. (9.7), we will have to compute lots and lots of functional derivatives of $Z_0(J)$. Let us begin by ignoring the counterterms. We define
$$Z_1(J) \propto \exp\!\left[\frac{i}{6}Z_g g\int d^4x\, \left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)^{\!3}\right] Z_0(J)\,, \tag{9.10}$$
where the constant of proportionality is fixed by $Z_1(0) = 1$. We now make a dual Taylor expansion in powers of $g$ and $J$ to get
$$Z_1(J) \propto \sum_{V=0}^{\infty} \frac{1}{V!}\left[\frac{iZ_g g}{6}\int d^4x\, \left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)^{\!3}\right]^V \times \sum_{P=0}^{\infty} \frac{1}{P!}\left[\frac{i}{2}\int d^4y\, d^4z\, J(y)\Delta(y-z)J(z)\right]^P. \tag{9.11}$$
If we focus on a term in eq. (9.11) with particular values of $V$ and $P$, then the number of surviving sources (after we take all the functional derivatives) is $E = 2P - 3V$. (Here $E$ stands for external, a terminology that should become clear by the end of the next section; $V$ stands for vertex and $P$ for propagator.) The overall phase factor of such a term is then $i^V(1/i)^{3V}i^P = i^{V+E-P}$, and the $3V$ functional derivatives can act on the $2P$ sources in $(2P)!/(2P-3V)!$ different combinations. However, many of the resulting expressions are algebraically identical.

[Figure 9.1: All connected diagrams with $E = 0$ and $V = 2$. Symmetry factors, in order: $S = 2^3$, $S = 2\times3!$.]

[Figure 9.2: All connected diagrams with $E = 0$ and $V = 4$. Symmetry factors, in order: $S = 2^4$, $S = 2^3$, $S = 2^4$, $S = 2^3\times3!$, $S = 4!$.]

To organize them, we introduce Feynman diagrams. In these diagrams, a line segment (straight or curved) stands for a propagator $\frac{1}{i}\Delta(x-y)$, a filled circle at one end of a line segment for a source $i\int d^4x\, J(x)$, and a vertex joining three line segments for $iZ_g g\int d^4x$. Sets of diagrams with different values of $E$ and $V$ are shown in figs. (9.1–9.11).

To count the number of terms on the right-hand side of eq.
(9.11) that result in a particular diagram, we first note that, in each diagram, the number of lines is $P$ and the number of vertices is $V$. We can rearrange the three functional derivatives from a particular vertex without changing the resulting diagram; this yields a counting factor of $3!$ for each vertex. Also, we can rearrange the vertices themselves; this yields a counting factor of $V!$. Similarly, we can rearrange the two sources at the ends of a particular propagator without changing the resulting diagram; this yields a counting factor of $2!$ for each propagator. Also, we can rearrange the propagators themselves; this yields a counting factor of $P!$. All together, these counting factors neatly cancel the numbers from the dual Taylor expansions in eq. (9.11).

However, this procedure generally results in an overcounting of the number of terms that give identical results. This happens when some rearrangement of derivatives gives the same match-up to sources as some rearrangement of sources. This possibility is always connected to some symmetry property of the diagram, and so the factor by which we have overcounted is called the symmetry factor. The figures show the symmetry factor $S$ of each diagram.

Consider, for example, the second diagram of fig. (9.1). The three propagators can be rearranged in $3!$ ways, and all these rearrangements can be duplicated by exchanging the derivatives at the vertices. Furthermore the endpoints of each propagator can be simultaneously swapped, and the effect duplicated by swapping the two vertices. Thus, $S = 2\times3! = 12$.

Let us consider two more examples. In the first diagram of fig. (9.6), the exchange of the two external propagators (along with their attached sources) can be duplicated by exchanging all the derivatives at one vertex for those at the other, and simultaneously swapping the endpoints of each semicircular propagator.
Also, the effect of swapping the top and bottom semicircular propagators can be duplicated by swapping the corresponding derivatives at each vertex. Thus, the symmetry factor is $S = 2\times2 = 4$. In the diagram of fig. (9.10), we can exchange derivatives to match swaps of the top and bottom external propagators on the left, or the top and bottom external propagators on the right, or the set of external propagators on the left with the set of external propagators on the right. Thus, the symmetry factor is $S = 2\times2\times2 = 8$.

The diagrams in figs. (9.1–9.11) are all connected: we can trace a path through the diagram between any two points on it. However, these are not the only contributions to $Z(J)$. The most general diagram consists of a product of several connected diagrams. Let $C_I$ stand for a particular connected diagram, including its symmetry factor. A general diagram $D$ can then be expressed as
$$D = \frac{1}{S_D}\prod_I (C_I)^{n_I}\,, \tag{9.12}$$
where $n_I$ is an integer that counts the number of $C_I$'s in $D$, and $S_D$ is the additional symmetry factor for $D$ (that is, the part of the symmetry factor that is not already accounted for by the symmetry factors already included in each of the connected diagrams). We now need to determine $S_D$.

[Figure 9.3: All connected diagrams with $E = 1$ and $V = 1$. Symmetry factor: $S = 2$.]

[Figure 9.4: All connected diagrams with $E = 1$ and $V = 3$. Symmetry factors, in order: $S = 2^2$, $S = 2^2$, $S = 2^3$.]

[Figure 9.5: All connected diagrams with $E = 2$ and $V = 0$. Symmetry factor: $S = 2$.]

[Figure 9.6: All connected diagrams with $E = 2$ and $V = 2$. Symmetry factors, in order: $S = 2^2$, $S = 2^2$.]

Since we have already accounted for propagator and vertex rearrangements within each $C_I$, we need to consider only exchanges of propagators and vertices among different connected diagrams.
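The counting behind the symmetry factors of individual connected diagrams can be checked by brute force for the simplest case, $V = 2$ and $P = 3$ (so $E = 0$, fig. 9.1). The sketch below is an illustration, not the text's method: it enumerates all $(2P)!/(2P-3V)! = 6!$ ways the six functional derivatives can act on the six sources, classifies each outcome by how many propagators join the two vertices, and divides the naive counting factor $(3!)^V\,V!\,(2!)^P\,P!$ by the number of terms found. The labels `theta` (three propagators joining the two vertices) and `dumbbell` (one joining propagator plus a loop at each vertex) are names introduced here for convenience.

```python
from itertools import permutations
from math import factorial

V, P = 2, 3   # two cubic vertices, three propagators, E = 2P - 3V = 0
# Derivative slot d belongs to vertex d // 3; source slot s belongs to propagator s // 2.
counts = {"theta": 0, "dumbbell": 0}
for assignment in permutations(range(2 * P)):    # assignment[d] = source hit by derivative d
    ends = [[] for _ in range(P)]
    for d, s in enumerate(assignment):
        ends[s // 2].append(d // 3)              # vertex attached to this propagator end
    bridges = sum(1 for a, b in ends if a != b)  # propagators joining the two vertices
    counts["theta" if bridges == 3 else "dumbbell"] += 1

naive = factorial(3) ** V * factorial(V) * 2 ** P * factorial(P)   # (3!)^V V! (2!)^P P!
symmetry = {name: naive // n for name, n in counts.items()}
print(counts, symmetry)
# {'theta': 288, 'dumbbell': 432} {'theta': 12, 'dumbbell': 8}
```

The two counts add up to $6! = 720$, and the resulting factors reproduce $S = 2\times3!$ and $S = 2^3$ from fig. (9.1).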
Such exchanges can leave the total diagram $D$ unchanged only if (1) the exchanges are made among different but identical connected diagrams, and only if (2) the exchanges involve all of the propagators and vertices in a given connected diagram. If there are $n_I$ factors of $C_I$ in $D$, there are $n_I!$ ways to make these rearrangements. Overall, then, we have
$$S_D = \prod_I n_I!\,. \tag{9.13}$$
Now $Z_1(J)$ is given (up to an overall normalization) by summing all diagrams $D$, and each $D$ is labeled by the integers $n_I$. Therefore
$$\begin{aligned}
Z_1(J) &\propto \sum_{\{n_I\}} D \\
&\propto \sum_{\{n_I\}} \prod_I \frac{1}{n_I!}(C_I)^{n_I} \\
&\propto \prod_I \sum_{n_I=0}^{\infty} \frac{1}{n_I!}(C_I)^{n_I} \\
&\propto \prod_I \exp(C_I) \\
&\propto \exp\Big(\sum_I C_I\Big)\,.
\end{aligned} \tag{9.14}$$
Thus we have a remarkable result: $Z_1(J)$ is given by the exponential of the sum of connected diagrams. This makes it easy to impose the normalization $Z_1(0) = 1$: we simply omit the vacuum diagrams (those with no sources), like those of figs. (9.1) and (9.2). We then have
$$Z_1(J) = \exp[iW_1(J)]\,, \tag{9.15}$$
where we have defined
$$iW_1(J) \equiv \sum_{I\neq\{0\}} C_I\,, \tag{9.16}$$
and the notation $I \neq \{0\}$ means that the vacuum diagrams are omitted from the sum, so that $W_1(0) = 0$.[1]

[1] We have included a factor of $i$ on the left-hand side of eq. (9.16) because then $W_1(J)$ is real in free-field theory; see problem 8.6.

[Figure 9.7: All connected diagrams with $E = 2$ and $V = 4$. Symmetry factors, in order: $S = 2^3$, $2^3$, $2^4$, $2^2$, $2^3$, $2^2$, $2^3$, $2^2$, $2^2$.]

[Figure 9.8: All connected diagrams with $E = 3$ and $V = 1$. Symmetry factor: $S = 3!$.]

[Figure 9.9: All connected diagrams with $E = 3$ and $V = 3$. Symmetry factors, in order: $S = 3!$, $2^2$, $2^2$.]

[Figure 9.10: All connected diagrams with $E = 4$ and $V = 2$. Symmetry factor: $S = 2^3$.]

[Figure 9.11: All connected diagrams with $E = 4$ and $V = 4$. Symmetry factors, in order: $S = 2^4$, $2^4$, $2^3$, $2^2$, $2^2$, $2^2$.]

[Figure 9.12: All connected diagrams with $E = 1$, $X \geq 1$ (where $X$ is the number of one-point vertices from the linear counterterm), and $V + X \leq 3$. Symmetry factors, in order: $S = 1$, $2$, $2$, $2$.]

Were it not for the counterterms in $\mathcal{L}_1$, we would have $Z(J) = Z_1(J)$. Let us see what we would get if this was, in fact, the case. In particular, let us compute the vacuum expectation value of the field $\varphi(x)$, which is given by
$$\langle 0|\varphi(x)|0\rangle = \frac{1}{i}\frac{\delta}{\delta J(x)}\, Z_1(J)\Big|_{J=0} = \frac{\delta}{\delta J(x)}\, W_1(J)\Big|_{J=0}\,. \tag{9.17}$$
This expression is then the sum of all diagrams [such as those in figs. (9.3) and (9.4)] that have a single source, with the source removed:
$$\langle 0|\varphi(x)|0\rangle = \tfrac{1}{2}ig\int d^4y\; \tfrac{1}{i}\Delta(x-y)\,\tfrac{1}{i}\Delta(y-y) + O(g^3)\,. \tag{9.18}$$
Here we have set $Z_g = 1$ in the first term, since $Z_g = 1 + O(g^2)$. We see that the vacuum expectation value of $\varphi(x)$ is not zero, whereas the validity of the LSZ formula requires it to vanish. To fix this, we must introduce the counterterm $Y\varphi$. Including this term in the interaction lagrangian $\mathcal{L}_1$ introduces a new kind of vertex, one where a single line segment ends; the corresponding vertex factor is $iY\int d^4y$. The simplest diagrams including this new vertex are shown in fig. (9.12), with a cross symbolizing the vertex.

Assuming $Y = O(g)$, only the first diagram in fig. (9.12) contributes at $O(g)$, and we have
$$\langle 0|\varphi(x)|0\rangle = \Big(iY + \tfrac{1}{2}(ig)\tfrac{1}{i}\Delta(0)\Big)\int d^4y\; \tfrac{1}{i}\Delta(x-y) + O(g^3)\,. \tag{9.19}$$
Thus, in order to have $\langle 0|\varphi(x)|0\rangle = 0$, we should choose
$$Y = \tfrac{1}{2}ig\Delta(0) + O(g^3)\,. \tag{9.20}$$
The factor of $i$ is disturbing, because $Y$ must be a real number: it is the coefficient of a hermitian operator in the hamiltonian, as seen in eq. (9.4). Therefore, $\Delta(0)$ must be purely imaginary, or we are in trouble. We have
$$\Delta(0) = \int \frac{d^4k}{(2\pi)^4}\, \frac{1}{k^2+m^2-i\epsilon}\,. \tag{9.21}$$
From eq. (9.21), it is not immediately obvious whether or not $\Delta(0)$ is purely imaginary, but eq. (9.21) does reveal another problem: the integral diverges at large $k$. This is another example of an ultraviolet divergence, similar to the one we encountered in section 3 when we computed the zero-point energy of the field.
To make some progress, we introduce an ultraviolet cutoff $\Lambda$, which we assume is much larger than $m$ and any other energy of physical interest. Modifications to the propagator above some cutoff may be well justified physically; for example, quantum fluctuations in spacetime itself should become important above the Planck scale, which is given by the inverse square root of Newton's constant, and has the numerical value of $10^{19}$ GeV (compared to, say, the proton mass, which is 1 GeV).

In order to retain the Lorentz-transformation properties of the propagator, we implement the ultraviolet cutoff in a more subtle way than we did in section 3; specifically, we make the replacement
$$\Delta(x-y) \to \int \frac{d^4k}{(2\pi)^4}\, \frac{e^{ik(x-y)}}{k^2+m^2-i\epsilon} \left(\frac{\Lambda^2}{k^2+\Lambda^2-i\epsilon}\right)^{\!2}. \tag{9.22}$$
The integral is now convergent, and we can evaluate the modified $\Delta(0)$ with the methods of section 14; for $\Lambda \gg m$, the result is
$$\Delta(0) = \frac{i}{16\pi^2}\Lambda^2\,. \tag{9.23}$$
Thus $Y$ is real, as required. If we like, we can now formally take the limit $\Lambda \to \infty$. The parameter $Y$ becomes infinite, but $\langle 0|\varphi(x)|0\rangle$ remains zero, at least to this order in $g$.

It may be disturbing to have a parameter in the lagrangian that is formally infinite. However, such parameters are not directly measurable, and so need not obey our preconceptions about their magnitudes. Also, it is important to remember that $Y$ includes a factor of $g$; this means that we can expand in powers of $Y$ as part of our general expansion in powers of $g$. When we compute something measurable (like a scattering cross section), all the formally infinite numbers will cancel in a well-defined way, leaving behind finite coefficients for the various powers of $g$. We will see how this works in detail in sections 14–20.

As we go to higher orders in $g$, things become more complicated, but in principle the procedure is the same. Thus, at $O(g^3)$, we sum up the diagrams of figs. (9.4) and (9.12), and then add to $Y$ whatever $O(g^3)$ term is needed to maintain $\langle 0|\varphi(x)|0\rangle = 0$.
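The cutoff result quoted in eq. (9.23) can be checked numerically without the full machinery of section 14. The sketch below rests on one assumption beyond this section: the standard Wick rotation $k^0 = ik^4$ to Euclidean momentum, under which $d^4k \to i\,d^4k_{\rm E}$ and $\int d^4k_{\rm E} = 2\pi^2\int_0^\infty k^3\,dk$, so that the regulated $\Delta(0)$ becomes $i/(8\pi^2)$ times a one-dimensional radial integral. The function names and the sample values of `m`, `cutoff`, and `g_coupling` are hypothetical.

```python
import math

def radial_integral(m, cutoff, n=200000):
    """Simpson estimate of int_0^inf dk k^3 / (k^2 + m^2) * (L^2 / (k^2 + L^2))^2.

    Uses u = k^2 and u = b*s/(1-s) with b = L^2 to map the range onto 0 <= s <= 1,
    which leaves the smooth integrand (b/2) * b*s / (b*s + a*(1-s)) with a = m^2.
    """
    a, b = m * m, cutoff * cutoff
    g = lambda s: 0.5 * b * (b * s) / (b * s + a * (1.0 - s))
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(j * h)
    return total * h / 3.0

def delta_zero(m, cutoff):
    """Regulated Delta(0) = i / (8 pi^2) times the radial integral (after Wick rotation)."""
    return 1j * radial_integral(m, cutoff) / (8 * math.pi ** 2)

m, g_coupling = 1.0, 0.1
for cutoff in (10.0, 30.0, 100.0):
    # Ratio to the large-cutoff form i L^2 / (16 pi^2) of eq. (9.23); tends to 1.
    ratio = delta_zero(m, cutoff) / (1j * cutoff ** 2 / (16 * math.pi ** 2))
    y = 0.5 * 1j * g_coupling * delta_zero(m, cutoff)   # lowest-order Y from eq. (9.20)
    print(cutoff, ratio.real, y)
```

The ratio approaches one as $\Lambda/m$ grows, and the lowest-order $Y$ comes out real (and negative for positive $g$), consistent with the requirement that $Y$ multiply a hermitian operator.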
In this way we can determine the value of $Y$ order by order in powers of $g$.

Once this is done, there is a remarkable simplification. Our adjustment of $Y$ to keep $\langle 0|\varphi(x)|0\rangle = 0$ means that the sum of all connected diagrams with a single source is zero. Consider now that same infinite set of diagrams, but replace the single source in each of them with some other subdiagram. Here is the point: no matter what this replacement subdiagram is, the sum of all these diagrams is still zero. Therefore, we need not bother to compute any of them! The rule is this: ignore any diagram that, when a single line is cut, falls into two parts, one of which has no sources. All of these diagrams (known as tadpoles) are canceled by the $Y$ counterterm, no matter what subdiagram they are attached to. The diagrams that remain (and need to be computed!) are shown in fig. (9.13).

[Figure 9.13: All connected diagrams without tadpoles with $E \le 4$ and $V \le 4$.]

We turn next to the remaining two counterterms. For notational simplicity we define
$$A = Z_\varphi - 1\,, \qquad B = Z_m - 1\,, \tag{9.24}$$
and recall that we expect each of these to be $O(g^2)$. We now have
$$Z(J) = \exp\left[-\frac{i}{2}\int d^4x\,\left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)\left(-A\partial_x^2 + Bm^2\right)\left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)\right]Z_1(J)\,. \tag{9.25}$$
We have integrated by parts to put both $\partial_x$'s onto one $\delta/\delta J(x)$. (Note that the time derivatives in this interaction should really be treated by including an extra source term for the conjugate momentum $\Pi = \dot\varphi$. However, the space derivatives are correctly treated, and then the time derivatives must work out comparably by Lorentz invariance.)

Eq. (9.25) results in a new vertex at which two lines meet. The corresponding vertex factor is $(-i)\int d^4x\,(-A\partial_x^2 + Bm^2)$; the $\partial_x^2$ acts on the $x$ in one or the other (but not both) propagators. (Which one does not matter, and can be changed via integration by parts.)
Diagrammatically, all we need do is sprinkle these new vertices onto the propagators in our existing diagrams. How many of these vertices we need to add depends on the order in $g$ we are working to achieve.

This completes our calculation of $Z(J)$ in $\varphi^3$ theory. We express it as
$$Z(J) = \exp[iW(J)]\,, \tag{9.26}$$
where $W(J)$ is given by the sum of all connected diagrams with no tadpoles and at least two sources, and including the counterterm vertices just discussed.

Now that we have $Z(J)$, we must find out what we can do with it.

Problems

9.1) Compute the symmetry factor for each diagram in fig. (9.13). (You can then check your answers by consulting the earlier figures.)

9.2) Consider a real scalar field with $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1$, where
$$\mathcal{L}_0 = -\tfrac{1}{2}\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}m^2\varphi^2\,,$$
$$\mathcal{L}_1 = -\tfrac{1}{24}Z_\lambda\lambda\varphi^4 + \mathcal{L}_{\rm ct}\,,$$
$$\mathcal{L}_{\rm ct} = -\tfrac{1}{2}(Z_\varphi - 1)\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}(Z_m - 1)m^2\varphi^2\,.$$
a) What kind of vertex appears in the diagrams for this theory (that is, how many line segments does it join?), and what is the associated vertex factor?
b) Ignoring the counterterms, draw all the connected diagrams with $1 \le E \le 4$ and $0 \le V \le 2$, and find their symmetry factors.
c) Explain why we did not have to include a counterterm linear in $\varphi$ to cancel tadpoles.

9.3) Consider a complex scalar field (see problems 3.5, 5.1, and 8.7) with $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1$, where
$$\mathcal{L}_0 = -\partial^\mu\varphi^\dagger\partial_\mu\varphi - m^2\varphi^\dagger\varphi\,,$$
$$\mathcal{L}_1 = -\tfrac{1}{4}Z_\lambda\lambda(\varphi^\dagger\varphi)^2 + \mathcal{L}_{\rm ct}\,,$$
$$\mathcal{L}_{\rm ct} = -(Z_\varphi - 1)\partial^\mu\varphi^\dagger\partial_\mu\varphi - (Z_m - 1)m^2\varphi^\dagger\varphi\,.$$
This theory has two kinds of sources, $J$ and $J^\dagger$, and so we need a way to tell which is which when we draw the diagrams. Rather than labeling the source blobs with a $J$ or $J^\dagger$, we will indicate which is which by putting an arrow on the attached propagator that points towards the source if it is a $J^\dagger$, and away from the source if it is a $J$.
a) What kind of vertex appears in the diagrams for this theory, and what is the associated vertex factor? Hint: your answer should involve those arrows!
b) Ignoring the counterterms, draw all the connected diagrams with $1 \le E \le 4$ and $0 \le V \le 2$, and find their symmetry factors. Hint: the arrows are important!

9.4) Consider the integral
$$\exp W(g,J) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}dx\,\exp\left[-\tfrac{1}{2}x^2 + \tfrac{1}{6}gx^3 + Jx\right]. \tag{9.27}$$
This integral does not converge, but it can be used to generate a joint power series in $g$ and $J$,
$$W(g,J) = \sum_{V=0}^{\infty}\sum_{E=0}^{\infty}C_{V,E}\,g^V J^E\,. \tag{9.28}$$
a) Show that
$$C_{V,E} = \sum_I\frac{1}{S_I}\,, \tag{9.29}$$
where the sum is over all connected Feynman diagrams with $E$ sources and $V$ three-point vertices, and $S_I$ is the symmetry factor for each diagram.
b) Use eqs. (9.27) and (9.28) to compute $C_{V,E}$ for $V \le 4$ and $E \le 5$. (This is most easily done with a symbolic manipulation program like Mathematica.) Verify that the symmetry factors given in figs. (9.1–9.11) satisfy the sum rule of eq. (9.29).
c) Now consider $W(g, J+Y)$, with $Y$ fixed by the "no tadpole" condition
$$\frac{\partial}{\partial J}W(g, J+Y)\Big|_{J=0} = 0\,. \tag{9.30}$$
Then write
$$W(g, J+Y) = \sum_{V=0}^{\infty}\sum_{E=0}^{\infty}\tilde{C}_{V,E}\,g^V J^E\,. \tag{9.31}$$
Show that
$$\tilde{C}_{V,E} = \sum_I\frac{1}{S_I}\,, \tag{9.32}$$
where the sum is over all connected Feynman diagrams with $E$ sources and $V$ three-point vertices and no tadpoles, and $S_I$ is the symmetry factor for each diagram.
d) Let $Y = a_1 g + a_3 g^3 + \ldots$, and use eq. (9.30) to determine $a_1$ and $a_3$. Compute $\tilde{C}_{V,E}$ for $V \le 4$ and $E \le 4$. Verify that the symmetry factors for the diagrams in fig. (9.13) satisfy the sum rule of eq. (9.32).

9.5) The interaction picture. In this problem, we will derive a formula for $\langle 0|{\rm T}\varphi(x_n)\ldots\varphi(x_1)|0\rangle$ without using path integrals. Suppose we have a hamiltonian density $\mathcal{H} = \mathcal{H}_0 + \mathcal{H}_1$, where $\mathcal{H}_0 = \tfrac{1}{2}\Pi^2 + \tfrac{1}{2}(\nabla\varphi)^2 + \tfrac{1}{2}m^2\varphi^2$, and $\mathcal{H}_1$ is a function of $\Pi(\mathbf{x},0)$ and $\varphi(\mathbf{x},0)$ and their spatial derivatives. (It should be chosen to preserve Lorentz invariance, but we will not be concerned with this issue.) We add a constant to $H$ so that $H|0\rangle = 0$. Let $|\emptyset\rangle$ be the ground state of $H_0$, with a constant added to $H_0$ so that $H_0|\emptyset\rangle = 0$. ($H_1$ is then defined as $H - H_0$.)
The Heisenberg-picture field is
$$\varphi(\mathbf{x},t) \equiv e^{iHt}\varphi(\mathbf{x},0)e^{-iHt}\,. \tag{9.33}$$
We now define the interaction-picture field
$$\varphi_I(\mathbf{x},t) \equiv e^{iH_0t}\varphi(\mathbf{x},0)e^{-iH_0t}\,. \tag{9.34}$$
a) Show that $\varphi_I(x)$ obeys the Klein-Gordon equation, and hence is a free field.
b) Show that $\varphi(x) = U^\dagger(t)\varphi_I(x)U(t)$, where $U(t) \equiv e^{iH_0t}e^{-iHt}$ is unitary.
c) Show that $U(t)$ obeys the differential equation $i\frac{d}{dt}U(t) = H_I(t)U(t)$, where $H_I(t) = e^{iH_0t}H_1e^{-iH_0t}$ is the interaction hamiltonian in the interaction picture, and the boundary condition $U(0) = 1$.
d) If $\mathcal{H}_1$ is specified by a particular function of the Schrödinger-picture fields $\Pi(\mathbf{x},0)$ and $\varphi(\mathbf{x},0)$, show that $\mathcal{H}_I(t)$ is given by the same function of the interaction-picture fields $\Pi_I(\mathbf{x},t)$ and $\varphi_I(\mathbf{x},t)$.
e) Show that, for $t > 0$,
$$U(t) = {\rm T}\exp\left[-i\int_0^t dt'\,H_I(t')\right] \tag{9.35}$$
obeys the differential equation and boundary condition of part (c). What is the comparable expression for $t < 0$? Hint: you may need to define a new ordering symbol.
f) Define $U(t_2,t_1) \equiv U(t_2)U^\dagger(t_1)$. Show that, for $t_2 > t_1$,
$$U(t_2,t_1) = {\rm T}\exp\left[-i\int_{t_1}^{t_2}dt'\,H_I(t')\right]. \tag{9.36}$$
What is the comparable expression for $t_1 > t_2$?
g) For any time ordering, show that $U(t_3,t_1) = U(t_3,t_2)U(t_2,t_1)$ and that $U^\dagger(t_1,t_2) = U(t_2,t_1)$.
h) Show that
$$\varphi(x_n)\ldots\varphi(x_1) = U^\dagger(t_n,0)\varphi_I(x_n)U(t_n,t_{n-1})\varphi_I(x_{n-1})\ldots U(t_2,t_1)\varphi_I(x_1)U(t_1,0)\,. \tag{9.37}$$
i) Show that $U^\dagger(t_n,0) = U^\dagger(\infty,0)U(\infty,t_n)$ and also that $U(t_1,0) = U(t_1,-\infty)U(-\infty,0)$.
j) Replace $H_0$ with $(1-i\epsilon)H_0$, and show that $\langle 0|U^\dagger(\infty,0) = \langle 0|\emptyset\rangle\langle\emptyset|$ and that $U(-\infty,0)|0\rangle = |\emptyset\rangle\langle\emptyset|0\rangle$.
k) Show that
$$\langle 0|\varphi(x_n)\ldots\varphi(x_1)|0\rangle = \langle\emptyset|U(\infty,t_n)\varphi_I(x_n)U(t_n,t_{n-1})\varphi_I(x_{n-1})\ldots U(t_2,t_1)\varphi_I(x_1)U(t_1,-\infty)|\emptyset\rangle \times |\langle\emptyset|0\rangle|^2\,. \tag{9.38}$$
l) Show that
$$\langle 0|{\rm T}\varphi(x_n)\ldots\varphi(x_1)|0\rangle = \langle\emptyset|{\rm T}\varphi_I(x_n)\ldots\varphi_I(x_1)\,e^{-i\int d^4x\,\mathcal{H}_I(x)}|\emptyset\rangle \times |\langle\emptyset|0\rangle|^2\,. \tag{9.39}$$
m) Show that
$$|\langle\emptyset|0\rangle|^2 = 1/\langle\emptyset|{\rm T}e^{-i\int d^4x\,\mathcal{H}_I(x)}|\emptyset\rangle\,. \tag{9.40}$$
Thus we have
$$\langle 0|{\rm T}\varphi(x_n)\ldots\varphi(x_1)|0\rangle = \frac{\langle\emptyset|{\rm T}\varphi_I(x_n)\ldots\varphi_I(x_1)\,e^{-i\int d^4x\,\mathcal{H}_I(x)}|\emptyset\rangle}{\langle\emptyset|{\rm T}e^{-i\int d^4x\,\mathcal{H}_I(x)}|\emptyset\rangle}\,. \tag{9.41}$$
We can now Taylor expand the exponentials on the right-hand side of eq. (9.41), and use free-field theory to compute the resulting correlation functions.

10 Scattering Amplitudes and the Feynman Rules

Prerequisite: 5, 9

Now that we have an expression for $Z(J) = \exp iW(J)$, we can take functional derivatives to compute vacuum expectation values of time-ordered products of fields. Consider the case of two fields; we define the exact propagator $\boldsymbol{\Delta}$ via
$$\tfrac{1}{i}\boldsymbol{\Delta}(x_1-x_2) \equiv \langle 0|{\rm T}\varphi(x_1)\varphi(x_2)|0\rangle\,. \tag{10.1}$$
For notational simplicity let us define
$$\delta_j \equiv \frac{1}{i}\frac{\delta}{\delta J(x_j)}\,. \tag{10.2}$$
Then we have
$$\begin{aligned}\langle 0|{\rm T}\varphi(x_1)\varphi(x_2)|0\rangle &= \delta_1\delta_2 Z(J)\Big|_{J=0} \\ &= \delta_1\delta_2\,iW(J)\Big|_{J=0} + \delta_1 iW(J)\Big|_{J=0}\,\delta_2 iW(J)\Big|_{J=0} \\ &= \delta_1\delta_2\,iW(J)\Big|_{J=0}\,.\end{aligned} \tag{10.3}$$
To get the last line we used $\delta_j W(J)|_{J=0} = \langle 0|\varphi(x_j)|0\rangle = 0$. Diagrammatically, $\delta_1$ removes a source, and labels the propagator endpoint $x_1$. Thus $\tfrac{1}{i}\boldsymbol{\Delta}(x_1-x_2)$ is given by the sum of diagrams with two sources, with those sources removed and the endpoints labeled $x_1$ and $x_2$. (The labels must be applied in both ways. If the diagram was originally symmetric on exchange of the two sources, the associated symmetry factor of 2 is then canceled by the double labeling.) At lowest order, the only contribution is the "barbell" diagram of fig. (9.5) with the sources removed. Thus we recover the obvious fact that $\tfrac{1}{i}\boldsymbol{\Delta}(x_1-x_2) = \tfrac{1}{i}\Delta(x_1-x_2) + O(g^2)$. We will take up the subject of the $O(g^2)$ corrections in section 14.

For now, let us go on to compute
$$\begin{aligned}\langle 0|{\rm T}\varphi(x_1)\varphi(x_2)\varphi(x_3)\varphi(x_4)|0\rangle &= \delta_1\delta_2\delta_3\delta_4 Z(J)\Big|_{J=0} \\ &= \Big[\delta_1\delta_2\delta_3\delta_4\,iW + (\delta_1\delta_2 iW)(\delta_3\delta_4 iW) \\ &\qquad + (\delta_1\delta_3 iW)(\delta_2\delta_4 iW) + (\delta_1\delta_4 iW)(\delta_2\delta_3 iW)\Big]_{J=0}\,.\end{aligned} \tag{10.4}$$
We have dropped terms that contain a factor of $\langle 0|\varphi(x)|0\rangle = 0$. According to eq. (10.3), the last three terms in eq. (10.4) simply give products of the exact propagators.
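The combinatorics behind eqs. (10.3) and (10.4) can be illustrated with a zero-dimensional toy model, in which the functional $W(J)$ is replaced by an ordinary power series in a single variable $J$ and the factors of $i$ are dropped. The sketch below is an analogy under those assumptions, not a field-theory calculation; the derivative values `w2`, `w3`, `w4` and the helper names are hypothetical. It exponentiates a series with no linear term (the analog of $\langle 0|\varphi(x)|0\rangle = 0$) and checks that the fourth derivative of $Z = \exp W$ at $J = 0$ equals $W'''' + 3(W'')^2$, the analog of one connected term plus three products of propagators.

```python
from fractions import Fraction
from math import factorial

def exp_series(w, order):
    """Taylor coefficients z[n] of Z = exp(W), given the Taylor
    coefficients w[n] of W, assuming w[0] = 0."""
    z = [Fraction(0)] * (order + 1)
    z[0] = Fraction(1)
    for n in range(1, order + 1):
        # Z' = W' Z implies  n z_n = sum_{k=1}^{n} k w_k z_{n-k}
        acc = Fraction(0)
        for k in range(1, n + 1):
            wk = w[k] if k < len(w) else Fraction(0)
            acc += k * wk * z[n - k]
        z[n] = acc / n
    return z

def derivatives_at_zero(coeffs):
    """Convert Taylor coefficients c_n into derivatives n! * c_n at J = 0."""
    return [c * factorial(n) for n, c in enumerate(coeffs)]

# Hypothetical derivatives of W at J = 0; the linear term is absent.
w2, w3, w4 = Fraction(2), Fraction(5), Fraction(7)
W_coeffs = [Fraction(0), Fraction(0), w2 / 2, w3 / 6, w4 / 24]

Z_derivs = derivatives_at_zero(exp_series(W_coeffs, 4))

if __name__ == "__main__":
    print("Z''  :", Z_derivs[2], " W'' :", w2)
    print("Z''' :", Z_derivs[3], " W''':", w3)
    print("Z'''':", Z_derivs[4], " W'''' + 3 W''^2:", w4 + 3 * w2**2)
```

In this toy setting the two-point and three-point derivatives of $Z$ coincide with those of $W$, while the four-point derivative picks up the three disconnected pairings, mirroring the structure of eq. (10.4).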
Let us see what happens when these terms are inserted into the LSZ formula for two incoming and two outgoing particles,
$$\begin{aligned}\langle f|i\rangle = i^4\int d^4x_1\,d^4x_2\,d^4x_1'\,d^4x_2'\;&e^{i(k_1x_1+k_2x_2-k_1'x_1'-k_2'x_2')} \\ &\times(-\partial_1^2+m^2)(-\partial_2^2+m^2)(-\partial_{1'}^2+m^2)(-\partial_{2'}^2+m^2) \\ &\times\langle 0|{\rm T}\varphi(x_1)\varphi(x_2)\varphi(x_1')\varphi(x_2')|0\rangle\,.\end{aligned} \tag{10.5}$$
If we consider, for example, the product of exact propagators $\tfrac{1}{i}\boldsymbol{\Delta}(x_1-x_1')\,\tfrac{1}{i}\boldsymbol{\Delta}(x_2-x_2')$ as one term in the correlation function in eq. (10.5), we get from this term
$$\begin{aligned}\int d^4x_1\,d^4x_2\,d^4x_1'\,d^4x_2'\;&e^{i(k_1x_1+k_2x_2-k_1'x_1'-k_2'x_2')}F(x_{11'})F(x_{22'}) \\ &= (2\pi)^4\delta^4(k_1-k_1')\,(2\pi)^4\delta^4(k_2-k_2')\,\tilde{F}(\bar{k}_{11'})\,\tilde{F}(\bar{k}_{22'})\,,\end{aligned} \tag{10.6}$$
where $F(x_{ij}) \equiv (-\partial_i^2+m^2)(-\partial_j^2+m^2)\boldsymbol{\Delta}(x_{ij})$, $\tilde{F}(k)$ is its Fourier transform, $x_{ij'} \equiv x_i - x_j'$, and $\bar{k}_{ij'} \equiv (k_i+k_j')/2$. The important point is the two delta functions: these tell us that the four-momenta of the two outgoing particles ($1'$ and $2'$) are equal to the four-momenta of the two incoming particles (1 and 2). In other words, no scattering has occurred. This is not the event whose probability we wish to compute! The other two similar terms in eq. (10.4) either contribute to "no scattering" events, or vanish due to factors like $\delta^4(k_1+k_2)$ (which is zero because $k_1^0+k_2^0 \ge 2m > 0$).

In general, the diagrams that contribute to the scattering process of interest are only those that are fully connected: every endpoint can be reached from every other endpoint by tracing through the diagram. These are the diagrams that arise from all the $\delta$'s acting on a single factor of $W$. Therefore, from here on, we restrict our attention to those diagrams alone. We define the connected correlation functions via
$$\langle 0|{\rm T}\varphi(x_1)\ldots\varphi(x_E)|0\rangle_{\rm C} \equiv \delta_1\ldots\delta_E\,iW(J)\Big|_{J=0}\,, \tag{10.7}$$
and use these instead of $\langle 0|{\rm T}\varphi(x_1)\ldots\varphi(x_E)|0\rangle$ in the LSZ formula.

Returning to eq. (10.4), we have
$$\langle 0|{\rm T}\varphi(x_1)\varphi(x_2)\varphi(x_1')\varphi(x_2')|0\rangle_{\rm C} = \delta_1\delta_2\delta_{1'}\delta_{2'}\,iW\Big|_{J=0}\,. \tag{10.8}$$
The lowest-order (in $g$) nonzero contribution to this comes from the diagram of fig. (9.10), which has four sources and two vertices. The four $\delta$'s remove the four sources; there are 4!
ways of matching up the $\delta$'s to the sources. These 24 diagrams can then be collected into 3 groups of 8 diagrams each; the 8 diagrams in each group are identical. The 3 distinct diagrams are shown in fig. (10.1). Note that the factor of 8 neatly cancels the symmetry factor $S = 8$ of the diagram with sources.

[Figure 10.1: The three tree-level Feynman diagrams that contribute to the connected correlation function $\langle 0|{\rm T}\varphi(x_1)\varphi(x_2)\varphi(x_1')\varphi(x_2')|0\rangle_{\rm C}$.]

This is a general result for tree diagrams (those with no closed loops): once the sources have been stripped off and the endpoints labeled, each diagram with a distinct endpoint labeling has an overall symmetry factor of one. The tree diagrams for a given process represent the lowest-order (in $g$) nonzero contribution to that process.

We now have
$$\begin{aligned}\langle 0|{\rm T}\varphi(x_1)\varphi(x_2)\varphi(x_1')\varphi(x_2')|0\rangle_{\rm C} = (ig)^2\left(\tfrac{1}{i}\right)^{5}\int d^4y\,d^4z\;\Delta(y-z)\,\Big[&\Delta(x_1-y)\Delta(x_2-y)\Delta(x_1'-z)\Delta(x_2'-z) \\ +\,&\Delta(x_1-y)\Delta(x_1'-y)\Delta(x_2-z)\Delta(x_2'-z) \\ +\,&\Delta(x_1-y)\Delta(x_2'-y)\Delta(x_2-z)\Delta(x_1'-z)\Big] + O(g^4)\,.\end{aligned} \tag{10.9}$$
Next, we use eq. (10.9) in the LSZ formula, eq. (10.5). Each Klein-Gordon wave operator acts on a propagator to give
$$(-\partial_i^2+m^2)\Delta(x_i-y) = \delta^4(x_i-y)\,. \tag{10.10}$$
The integrals over the external spacetime labels $x_{1,2,1',2'}$ are then trivial, and we get
$$\begin{aligned}\langle f|i\rangle = (ig)^2\,\tfrac{1}{i}\int d^4y\,d^4z\;\Delta(y-z)\,\Big[&e^{i(k_1y+k_2y-k_1'z-k_2'z)} + e^{i(k_1y+k_2z-k_1'y-k_2'z)} \\ +\,&e^{i(k_1y+k_2z-k_1'z-k_2'y)}\Big] + O(g^4)\,.\end{aligned} \tag{10.11}$$
This can be simplified by substituting
$$\Delta(y-z) = \int\frac{d^4k}{(2\pi)^4}\,\frac{e^{ik(y-z)}}{k^2+m^2-i\epsilon} \tag{10.12}$$
into eq. (10.11). Then the spacetime arguments appear only in phase factors, and we can integrate them to get delta functions:
$$\begin{aligned}\langle f|i\rangle &= ig^2\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2+m^2-i\epsilon}\,\Big[(2\pi)^4\delta^4(k_1+k_2+k)\,(2\pi)^4\delta^4(k_1'+k_2'+k) \\ &\qquad + (2\pi)^4\delta^4(k_1-k_1'+k)\,(2\pi)^4\delta^4(k_2'-k_2+k) \\ &\qquad + (2\pi)^4\delta^4(k_1-k_2'+k)\,(2\pi)^4\delta^4(k_1'-k_2+k)\Big] + O(g^4) \\ &= ig^2\,(2\pi)^4\delta^4(k_1+k_2-k_1'-k_2')\left[\frac{1}{(k_1+k_2)^2+m^2} + \frac{1}{(k_1-k_1')^2+m^2} + \frac{1}{(k_1-k_2')^2+m^2}\right] + O(g^4)\,.\end{aligned} \tag{10.13}$$
In eq.
(10.13), we have left out the $i\epsilon$'s for notational convenience only; $m^2$ is really $m^2 - i\epsilon$. The overall delta function in eq. (10.13) tells us that four-momentum is conserved in the scattering process, which we should, of course, expect. For a general scattering process, it is then convenient to define a scattering matrix element $\mathcal{T}$ via
$$\langle f|i\rangle = (2\pi)^4\delta^4(k_{\rm in}-k_{\rm out})\,i\mathcal{T}\,, \tag{10.14}$$
where $k_{\rm in}$ and $k_{\rm out}$ are the total four-momenta of the incoming and outgoing particles, respectively.

Examining the calculation which led to eq. (10.13), we can take away some universal features that lead to a simple set of Feynman rules for computing contributions to $i\mathcal{T}$ for a given scattering process. The Feynman rules are:

1. Draw lines (called external lines) for each incoming and each outgoing particle.
2. Leave one end of each external line free, and attach the other to a vertex at which exactly three lines meet. Include extra internal lines in order to do this. In this way, draw all possible diagrams that are topologically inequivalent.
3. On each incoming line, draw an arrow pointing towards the vertex. On each outgoing line, draw an arrow pointing away from the vertex. On each internal line, draw an arrow with an arbitrary direction.
4. Assign each line its own four-momentum. The four-momentum of an external line should be the four-momentum of the corresponding particle.
5. Think of the four-momenta as flowing along the arrows, and conserve four-momentum at each vertex. For a tree diagram, this fixes the momenta on all the internal lines.
6. The value of a diagram consists of the following factors: for each external line, 1; for each internal line with momentum $k$, $-i/(k^2+m^2-i\epsilon)$; for each vertex, $iZ_g g$.
7. A diagram with $L$ closed loops will have $L$ internal momenta that are not fixed by rule #5. Integrate over each of these momenta $\ell_i$ with measure $d^4\ell_i/(2\pi)^4$.
8. A loop diagram may have some leftover symmetry factors if there are exchanges of internal propagators and vertices that leave the diagram unchanged; in this case, divide the value of the diagram by the symmetry factor associated with exchanges of internal propagators and vertices.
9. Include diagrams with the counterterm vertex that connects two propagators, each with the same four-momentum $k$. The value of this vertex is $-i(Ak^2+Bm^2)$, where $A = Z_\varphi - 1$ and $B = Z_m - 1$, and each is $O(g^2)$.
10. The value of $i\mathcal{T}$ is given by a sum over the values of all these diagrams.

For the two-particle scattering process, the tree diagrams resulting from these rules are shown in fig. (10.2).

[Figure 10.2: The tree-level s-, t-, and u-channel diagrams contributing to $i\mathcal{T}$ for two-particle scattering.]

Now that we have our procedure for computing the scattering amplitude $\mathcal{T}$, we must see how to relate it to a measurable cross section.

Problems

10.1) Use eq. (9.41) of problem 9.5 to rederive eq. (10.9).

10.2) Write down the Feynman rules for the complex scalar field of problem 9.3. Remember that there are two kinds of particles now (which we can think of as positively and negatively charged), and that your rules must have a way of distinguishing them. Hint: the most direct approach requires two kinds of arrows: momentum arrows (as discussed in this section) and what we might call "charge" arrows (as discussed in problem 9.3). Try to find a more elegant approach that requires only one kind of arrow.

10.3) Consider a complex scalar field $\varphi$ that interacts with a real scalar field $\chi$ via $\mathcal{L}_1 = g\chi\varphi^\dagger\varphi$. Use a solid line for the $\varphi$ propagator and a dashed line for the $\chi$ propagator. Draw the vertex (remember the arrows!), and find the associated vertex factor.

10.4) Consider a real scalar field with $\mathcal{L}_1 = \tfrac{1}{2}g\varphi\,\partial^\mu\varphi\,\partial_\mu\varphi$. Find the associated vertex factor.
10.5) The scattering amplitudes should be unchanged if we make a field redefinition. Suppose, for example, we have
$$\mathcal{L} = -\tfrac{1}{2}\partial^\mu\varphi\,\partial_\mu\varphi - \tfrac{1}{2}m^2\varphi^2\,, \tag{10.15}$$
and we make the field redefinition
$$\varphi \to \varphi + \lambda\varphi^2\,. \tag{10.16}$$
Work out the lagrangian in terms of the redefined field, and the corresponding Feynman rules. Compute (at tree level) the $\varphi\varphi \to \varphi\varphi$ scattering amplitude. You should get zero, because this is a free-field theory in disguise. (At the loop level, we also have to take into account the transformation of the functional measure $\mathcal{D}\varphi$; see section 85.)

11 Cross Sections and Decay Rates

Prerequisite: 10

Now that we have a method for computing the scattering amplitude $\mathcal{T}$, we must convert it into something that could be measured in an experiment.

In practice, we are almost always concerned with one of two generic cases: one incoming particle, for which we compute a decay rate, or two incoming particles, for which we compute a cross section. We begin with the latter.

Let us also specialize, for now, to the case of two outgoing particles as well as two incoming particles. In $\varphi^3$ theory, we found in section 10 that in this case we have
$$\mathcal{T} = g^2\left[\frac{1}{(k_1+k_2)^2+m^2} + \frac{1}{(k_1-k_1')^2+m^2} + \frac{1}{(k_1-k_2')^2+m^2}\right] + O(g^4)\,, \tag{11.1}$$
where $k_1$ and $k_2$ are the four-momenta of the two incoming particles, $k_1'$ and $k_2'$ are the four-momenta of the two outgoing particles, and $k_1+k_2 = k_1'+k_2'$. Also, these particles are all on shell: $k_i^2 = -m_i^2$. (Here, for later use, we allow for the possibility that the particles all have different masses.)

Let us think about the kinematics of this process. In the center-of-mass frame, or CM frame for short, we take $\mathbf{k}_1 + \mathbf{k}_2 = 0$, and choose $\mathbf{k}_1$ to be in the $+z$ direction. Now the only variable left to specify about the initial state is the magnitude of $\mathbf{k}_1$. Equivalently, we could specify the total energy in the CM frame, $E_1 + E_2$. However, it is even more convenient to define a Lorentz scalar $s \equiv -(k_1+k_2)^2$.
In the CM frame, $s$ reduces to $(E_1+E_2)^2$; $s$ is therefore called the center-of-mass energy squared. Then, since $E_1 = (\mathbf{k}_1^2+m_1^2)^{1/2}$ and $E_2 = (\mathbf{k}_1^2+m_2^2)^{1/2}$, we can solve for $|\mathbf{k}_1|$ in terms of $s$, with the result
$$|\mathbf{k}_1| = \frac{1}{2\sqrt{s}}\sqrt{s^2 - 2(m_1^2+m_2^2)s + (m_1^2-m_2^2)^2} \qquad \text{(CM frame)}\,. \tag{11.2}$$
Now consider the two outgoing particles. Since momentum is conserved, we must have $\mathbf{k}_1' + \mathbf{k}_2' = 0$, and since energy is conserved, we must also have $(E_1'+E_2')^2 = s$. Then we find
$$|\mathbf{k}_1'| = \frac{1}{2\sqrt{s}}\sqrt{s^2 - 2(m_{1'}^2+m_{2'}^2)s + (m_{1'}^2-m_{2'}^2)^2} \qquad \text{(CM frame)}\,. \tag{11.3}$$
Now the only variable left to specify about the final state is the angle $\theta$ between $\mathbf{k}_1$ and $\mathbf{k}_1'$. However, it is often more convenient to work with the Lorentz scalar $t \equiv -(k_1-k_1')^2$, which is related to $\theta$ by
$$t = m_1^2 + m_{1'}^2 - 2E_1E_1' + 2|\mathbf{k}_1||\mathbf{k}_1'|\cos\theta\,. \tag{11.4}$$
This formula is valid in any frame.

The Lorentz scalars $s$ and $t$ are two of the three Mandelstam variables, defined as
$$\begin{aligned}s &\equiv -(k_1+k_2)^2 = -(k_1'+k_2')^2\,, \\ t &\equiv -(k_1-k_1')^2 = -(k_2-k_2')^2\,, \\ u &\equiv -(k_1-k_2')^2 = -(k_2-k_1')^2\,.\end{aligned} \tag{11.5}$$
The three Mandelstam variables are not independent; they satisfy the linear relation
$$s + t + u = m_1^2 + m_2^2 + m_{1'}^2 + m_{2'}^2\,. \tag{11.6}$$
In terms of $s$, $t$, and $u$, we can rewrite eq. (11.1) as
$$\mathcal{T} = g^2\left[\frac{1}{m^2-s} + \frac{1}{m^2-t} + \frac{1}{m^2-u}\right] + O(g^4)\,, \tag{11.7}$$
which demonstrates the notational utility of the Mandelstam variables.

Now let us consider a different frame, the fixed target or FT frame (also sometimes called the lab frame), in which particle #2 is initially at rest: $\mathbf{k}_2 = 0$. In this case we have
$$|\mathbf{k}_1| = \frac{1}{2m_2}\sqrt{s^2 - 2(m_1^2+m_2^2)s + (m_1^2-m_2^2)^2} \qquad \text{(FT frame)}\,. \tag{11.8}$$
Note that, from eqs. (11.8) and (11.2),
$$m_2|\mathbf{k}_1|_{\rm FT} = \sqrt{s}\,|\mathbf{k}_1|_{\rm CM}\,. \tag{11.9}$$
This will be useful later.

We would now like to derive a formula for the differential scattering cross section. In order to do so, we assume that the whole experiment is taking place in a big box of volume $V$, and lasts for a large time $T$.
We should really think about wave packets coming together, but we will use some simple shortcuts instead. Also, to get a more general answer, we will let the number of outgoing particles be arbitrary.

Recall from section 10 that the overlap between the initial and final states is given by
$$\langle f|i\rangle = (2\pi)^4\delta^4(k_{\rm in}-k_{\rm out})\,i\mathcal{T}\,. \tag{11.10}$$
To get a probability, we must square $\langle f|i\rangle$, and divide by the norms of the initial and final states:
$$P = \frac{|\langle f|i\rangle|^2}{\langle f|f\rangle\langle i|i\rangle}\,. \tag{11.11}$$
The numerator of this expression is
$$|\langle f|i\rangle|^2 = [(2\pi)^4\delta^4(k_{\rm in}-k_{\rm out})]^2\,|\mathcal{T}|^2\,. \tag{11.12}$$
We write the square of the delta function as
$$[(2\pi)^4\delta^4(k_{\rm in}-k_{\rm out})]^2 = (2\pi)^4\delta^4(k_{\rm in}-k_{\rm out}) \times (2\pi)^4\delta^4(0)\,, \tag{11.13}$$
and note that
$$(2\pi)^4\delta^4(0) = \int d^4x\,e^{i0\cdot x} = VT\,. \tag{11.14}$$
Also, the norm of a single particle state is given by
$$\langle k|k\rangle = (2\pi)^3\,2k^0\,\delta^3(\mathbf{0}) = 2k^0V\,. \tag{11.15}$$
Thus we have
$$\langle i|i\rangle = 4E_1E_2V^2\,, \tag{11.16}$$
$$\langle f|f\rangle = \prod_{j=1}^{n'}2k_j'^{\,0}V\,, \tag{11.17}$$
where $n'$ is the number of outgoing particles.

If we now divide eq. (11.11) by the elapsed time $T$, we get a probability per unit time
$$\dot{P} = \frac{(2\pi)^4\delta^4(k_{\rm in}-k_{\rm out})\,V\,|\mathcal{T}|^2}{4E_1E_2V^2\,\prod_{j=1}^{n'}2k_j'^{\,0}V}\,. \tag{11.18}$$
This is the probability per unit time to scatter into a set of outgoing particles with precise momenta. To get something measurable, we should sum each outgoing three-momentum $\mathbf{k}_j'$ over some small range. Due to the box, all three-momenta are quantized: $\mathbf{k}_j' = (2\pi/L)\mathbf{n}_j'$, where $V = L^3$, and $\mathbf{n}_j'$ is a three-vector with integer entries. (Here we have assumed periodic boundary conditions, but this choice does not affect the final result.) In the limit of large $L$, we have
$$\sum_{\mathbf{n}_j'} \to \frac{V}{(2\pi)^3}\int d^3k_j'\,. \tag{11.19}$$
Thus we should multiply $\dot{P}$ by a factor of $V\,d^3k_j'/(2\pi)^3$ for each outgoing particle. Then we get
$$\dot{P} = \frac{(2\pi)^4\delta^4(k_{\rm in}-k_{\rm out})}{4E_1E_2V}\,|\mathcal{T}|^2\prod_{j=1}^{n'}\widetilde{dk}_j'\,, \tag{11.20}$$
where we have identified the Lorentz-invariant phase-space differential
$$\widetilde{dk} \equiv \frac{d^3k}{(2\pi)^3\,2k^0} \tag{11.21}$$
that we first introduced in section 3.
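Since eqs. (11.8) and (11.9) enter the flux factor that converts $\dot{P}$ into a cross section, it can be useful to check the kinematic relations of eqs. (11.2), (11.4), (11.6), (11.8), and (11.9) numerically. The following is a minimal sketch; the function names and the sample masses are illustrative choices, not part of the text, and the expression used for $u$ assumes $\mathbf{k}_2' = -\mathbf{k}_1'$ in the CM frame.

```python
import math

def kallen_sqrt(s, ma, mb):
    """Square root appearing in eqs. (11.2), (11.3), and (11.8)."""
    return math.sqrt(s * s - 2.0 * (ma**2 + mb**2) * s + (ma**2 - mb**2) ** 2)

def k_cm(s, m1, m2):
    """Magnitude of the three-momentum in the CM frame, eq. (11.2)."""
    return kallen_sqrt(s, m1, m2) / (2.0 * math.sqrt(s))

def k_ft(s, m1, m2):
    """Magnitude of k_1 in the fixed-target frame, eq. (11.8)."""
    return kallen_sqrt(s, m1, m2) / (2.0 * m2)

def t_and_u_cm(s, cos_theta, m1, m2, m1p, m2p):
    """Mandelstam t from eq. (11.4), and u from the analogous expression
    with particle 2' in place of 1' (CM frame, so k_2' = -k_1')."""
    k, kp = k_cm(s, m1, m2), k_cm(s, m1p, m2p)
    e1 = math.sqrt(k * k + m1**2)
    e1p = math.sqrt(kp * kp + m1p**2)
    e2p = math.sqrt(kp * kp + m2p**2)
    t = m1**2 + m1p**2 - 2.0 * e1 * e1p + 2.0 * k * kp * cos_theta
    u = m1**2 + m2p**2 - 2.0 * e1 * e2p - 2.0 * k * kp * cos_theta
    return t, u

if __name__ == "__main__":
    # Illustrative masses and energy; sqrt(s) exceeds both thresholds.
    m1, m2, m1p, m2p, s = 1.0, 2.0, 1.5, 0.5, 16.0
    t, u = t_and_u_cm(s, 0.3, m1, m2, m1p, m2p)
    print("eq. (11.6):", s + t + u, "vs", m1**2 + m2**2 + m1p**2 + m2p**2)
    print("eq. (11.9):", m2 * k_ft(s, m1, m2), "vs", math.sqrt(s) * k_cm(s, m1, m2))
```

An independent check of eq. (11.8) uses $s = m_1^2 + m_2^2 + 2m_2E_1$ in the FT frame, which gives $E_1$ directly and hence $|\mathbf{k}_1|^2 = E_1^2 - m_1^2$.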
To convert $\dot{P}$ to a differential cross section $d\sigma$, we must divide by the incident flux. Let us see how this works in the FT frame, where particle #2 is at rest. The incident flux is the number of particles per unit volume that are striking the target particle (#2), times their speed. We have one incident particle (#1) in a volume $V$ with speed $v = |\mathbf{k}_1|/E_1$, and so the incident flux is $|\mathbf{k}_1|/(E_1V)$. Dividing eq. (11.20) by this flux cancels the last factor of $V$, and replaces $E_1$ in the denominator with $|\mathbf{k}_1|$. We also set $E_2 = m_2$ and note that eq. (11.8) gives $|\mathbf{k}_1|m_2$ as a function of $s$; $d\sigma$ will be Lorentz invariant if, in other frames, we simply use this function as the value of $|\mathbf{k}_1|m_2$. Adopting this convention, and using eq. (11.9), we have
$$d\sigma = \frac{1}{4|\mathbf{k}_1|_{\rm CM}\sqrt{s}}\,|\mathcal{T}|^2\,d{\rm LIPS}_{n'}(k_1+k_2)\,, \tag{11.22}$$
where $|\mathbf{k}_1|_{\rm CM}$ is given as a function of $s$ by eq. (11.2), and we have defined the $n'$-body Lorentz-invariant phase-space measure
$$d{\rm LIPS}_{n'}(k) \equiv (2\pi)^4\delta^4\Big(k - \sum_{j=1}^{n'}k_j'\Big)\prod_{j=1}^{n'}\widetilde{dk}_j'\,. \tag{11.23}$$
Eq. (11.22) is our final result for the differential cross section for the scattering of two incoming particles into $n'$ outgoing particles.

Let us now specialize to the case of two outgoing particles. We need to evaluate
$$d{\rm LIPS}_2(k) = (2\pi)^4\delta^4(k-k_1'-k_2')\,\widetilde{dk}_1'\,\widetilde{dk}_2'\,, \tag{11.24}$$
where $k = k_1 + k_2$. Since $d{\rm LIPS}_2(k)$ is Lorentz invariant, we can compute it in any convenient frame. Let us work in the CM frame, where $\mathbf{k} = \mathbf{k}_1 + \mathbf{k}_2 = 0$ and $k^0 = E_1 + E_2 = \sqrt{s}$; then we have
$$d{\rm LIPS}_2(k) = \frac{1}{4(2\pi)^2E_1'E_2'}\,\delta(E_1'+E_2'-\sqrt{s}\,)\,\delta^3(\mathbf{k}_1'+\mathbf{k}_2')\,d^3k_1'\,d^3k_2'\,. \tag{11.25}$$
We can use the spatial part of the delta function to integrate over $d^3k_2'$, with the result
$$d{\rm LIPS}_2(k) = \frac{1}{4(2\pi)^2E_1'E_2'}\,\delta(E_1'+E_2'-\sqrt{s}\,)\,d^3k_1'\,, \tag{11.26}$$
where now
$$E_1' = \sqrt{\mathbf{k}_1'^{\,2}+m_{1'}^2} \quad\text{and}\quad E_2' = \sqrt{\mathbf{k}_1'^{\,2}+m_{2'}^2}\,. \tag{11.27}$$
Next, let us write
$$d^3k_1' = |\mathbf{k}_1'|^2\,d|\mathbf{k}_1'|\,d\Omega_{\rm CM}\,, \tag{11.28}$$
where $d\Omega_{\rm CM} = \sin\theta\,d\theta\,d\phi$ is the differential solid angle, and $\theta$ is the angle between $\mathbf{k}_1$ and $\mathbf{k}_1'$ in the CM frame.
We can carry out the integral over the magnitude of $\mathbf{k}_1'$ in eq. (11.26) using $\int dx\,\delta(f(x)) = \sum_i|f'(x_i)|^{-1}$, where $x_i$ satisfies $f(x_i) = 0$. In our case, the argument of the delta function vanishes at just one value of $|\mathbf{k}_1'|$, the value given by eq. (11.3). Also, the derivative of that argument with respect to $|\mathbf{k}_1'|$ is
$$\begin{aligned}\frac{\partial}{\partial|\mathbf{k}_1'|}\Big(E_1'+E_2'-\sqrt{s}\Big) &= \frac{|\mathbf{k}_1'|}{E_1'} + \frac{|\mathbf{k}_1'|}{E_2'} \\ &= |\mathbf{k}_1'|\,\frac{E_1'+E_2'}{E_1'E_2'} \\ &= \frac{|\mathbf{k}_1'|\sqrt{s}}{E_1'E_2'}\,.\end{aligned} \tag{11.29}$$
Putting all of this together, we get
$$d{\rm LIPS}_2(k) = \frac{|\mathbf{k}_1'|}{16\pi^2\sqrt{s}}\,d\Omega_{\rm CM}\,. \tag{11.30}$$
Combining this with eq. (11.22), we have
$$\frac{d\sigma}{d\Omega_{\rm CM}} = \frac{1}{64\pi^2s}\,\frac{|\mathbf{k}_1'|}{|\mathbf{k}_1|}\,|\mathcal{T}|^2\,, \tag{11.31}$$
where $|\mathbf{k}_1|$ and $|\mathbf{k}_1'|$ are the functions of $s$ given by eqs. (11.2) and (11.3), and $d\Omega_{\rm CM}$ is the differential solid angle in the CM frame.

The differential cross section can also be expressed in a frame-independent manner by noting that, in the CM frame, we can take the differential of eq. (11.4) at fixed $s$ to get
$$dt = 2|\mathbf{k}_1||\mathbf{k}_1'|\,d\cos\theta \tag{11.32}$$
$$\phantom{dt} = 2|\mathbf{k}_1||\mathbf{k}_1'|\,\frac{d\Omega_{\rm CM}}{2\pi}\,. \tag{11.33}$$
Now we can rewrite eq. (11.31) as
$$\frac{d\sigma}{dt} = \frac{1}{64\pi s|\mathbf{k}_1|^2}\,|\mathcal{T}|^2\,, \tag{11.34}$$
where $|\mathbf{k}_1|$ is given as a function of $s$ by eq. (11.2). We can now transform $d\sigma/dt$ into $d\sigma/d\Omega$ in any frame we might like (such as the FT frame) by taking the differential of eq. (11.4) in that frame. In general, though, $|\mathbf{k}_1'|$ depends on $\theta$ as well as $s$, so the result is more complicated than it is in eq. (11.32) for the CM frame.

Returning to the general case of $n'$ outgoing particles, we can define a Lorentz invariant total cross section by integrating completely over all the outgoing momenta, and dividing by an appropriate symmetry factor $S$. If there are $n_i'$ identical outgoing particles of type $i$, then
$$S = \prod_i n_i'!\,, \tag{11.35}$$
and
$$\sigma = \frac{1}{S}\int d\sigma\,, \tag{11.36}$$
where $d\sigma$ is given by eq. (11.22). We need the symmetry factor because merely integrating over all the outgoing momenta in $d{\rm LIPS}_{n'}$ treats the final state as being labeled by an ordered list of these momenta.
But if some outgoing particles are identical, this is not correct; the momenta of the identical particles should be specified by an unordered list (because, for example, the state $a_1^\dagger a_2^\dagger|0\rangle$ is identical to the state $a_2^\dagger a_1^\dagger|0\rangle$). The symmetry factor provides the appropriate correction.

In the case of two outgoing particles, eq. (11.36) becomes
$$\sigma = \frac{1}{S}\int d\Omega_{\rm CM}\,\frac{d\sigma}{d\Omega_{\rm CM}} \tag{11.37}$$
$$\phantom{\sigma} = \frac{2\pi}{S}\int_{-1}^{+1}d\cos\theta\,\frac{d\sigma}{d\Omega_{\rm CM}}\,, \tag{11.38}$$
where $S = 2$ if the two outgoing particles are identical, and $S = 1$ if they are distinguishable. Equivalently, we can compute $\sigma$ from eq. (11.34) via
$$\sigma = \frac{1}{S}\int_{t_{\rm min}}^{t_{\rm max}}dt\,\frac{d\sigma}{dt}\,, \tag{11.39}$$
where $t_{\rm min}$ and $t_{\rm max}$ are given by eq. (11.4) in the CM frame with $\cos\theta = -1$ and $+1$, respectively. To compute $\sigma$ with eq. (11.38), we should first express $t$ and $u$ in terms of $s$ and $\theta$ via eqs. (11.4) and (11.6), and then integrate over $\theta$ at fixed $s$. To compute $\sigma$ with eq. (11.39), we should first express $u$ in terms of $s$ and $t$ via eq. (11.6), and then integrate over $t$ at fixed $s$.

Let us see how all this works for the scattering amplitude of $\varphi^3$ theory, eq. (11.7). In this case, all the masses are equal, and so, in the CM frame, $E = \tfrac{1}{2}\sqrt{s}$ for all four particles, and $|\mathbf{k}_1'| = |\mathbf{k}_1| = \tfrac{1}{2}(s-4m^2)^{1/2}$. Then eq. (11.4) becomes
$$t = -\tfrac{1}{2}(s-4m^2)(1-\cos\theta)\,. \tag{11.40}$$
From eq. (11.6), we also have
$$u = -\tfrac{1}{2}(s-4m^2)(1+\cos\theta)\,. \tag{11.41}$$
Thus $|\mathcal{T}|^2$ is quite a complicated function of $s$ and $\theta$. In the nonrelativistic limit, $|\mathbf{k}_1| \ll m$ or equivalently $s-4m^2 \ll m^2$, we have
$$\mathcal{T} = \frac{5g^2}{3m^2}\left[1 - \frac{8}{15}\left(\frac{s-4m^2}{m^2}\right) + \frac{5}{18}\left(1 + \frac{27}{25}\cos^2\theta\right)\left(\frac{s-4m^2}{m^2}\right)^{\!2} + \ldots\right] + O(g^4)\,. \tag{11.42}$$
Thus the differential cross section is nearly isotropic. In the extreme relativistic limit, $|\mathbf{k}_1| \gg m$ or equivalently $s \gg m^2$, we have
$$\mathcal{T} = \frac{g^2}{s\sin^2\theta}\left[3 + \cos^2\theta - \left(\frac{(3+\cos^2\theta)^2}{\sin^2\theta} - 16\right)\frac{m^2}{s} + \ldots\right] + O(g^4)\,. \tag{11.43}$$
Now the differential cross section is sharply peaked in the forward ($\theta = 0$) and backward ($\theta = \pi$) directions.

We can compute the total cross section $\sigma$ from eq. (11.39).
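Both the nonrelativistic expansion of eq. (11.42) and the integral of eq. (11.39) can be cross-checked numerically for the tree-level $\varphi^3$ amplitude of eq. (11.7). The sketch below is illustrative only: the function names, the Simpson-rule integrator, and the sample values are choices made here. The closed form in `sigma_closed` is the elementary result of doing the $t$ integral by partial fractions, which is the expression quoted in the text as eq. (11.44); the integration limits follow from eq. (11.40) at $\cos\theta = \mp 1$.

```python
import math

def amplitude(s, t, m, g):
    """Tree-level amplitude of eq. (11.7), with u fixed by eq. (11.6)."""
    u = 4.0 * m * m - s - t
    return g * g * (1.0 / (m * m - s) + 1.0 / (m * m - t) + 1.0 / (m * m - u))

def amplitude_nr_series(s, cos_theta, m, g):
    """Nonrelativistic expansion of eq. (11.42), through second order."""
    x = (s - 4.0 * m * m) / (m * m)
    return (5.0 * g * g / (3.0 * m * m)) * (
        1.0 - (8.0 / 15.0) * x
        + (5.0 / 18.0) * (1.0 + (27.0 / 25.0) * cos_theta**2) * x * x)

def sigma_numeric(s, m, g, n=20_000):
    """Eq. (11.39) with S = 2, integrating eq. (11.34) by Simpson's rule (n even)."""
    k_sq = 0.25 * (s - 4.0 * m * m)            # |k_1|^2 in the CM frame
    t_min, t_max = -(s - 4.0 * m * m), 0.0     # eq. (11.40) at cos(theta) = -1, +1
    h = (t_max - t_min) / n

    def dsigma_dt(t):
        return amplitude(s, t, m, g) ** 2 / (64.0 * math.pi * s * k_sq)

    total = dsigma_dt(t_min) + dsigma_dt(t_max)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * dsigma_dt(t_min + j * h)
    return 0.5 * total * h / 3.0

def sigma_closed(s, m, g):
    """Closed form of the same integral, as in eq. (11.44)."""
    m2 = m * m
    bracket = (2.0 / m2 + (s - 4.0 * m2) / (s - m2) ** 2 - 2.0 / (s - 3.0 * m2)
               + 4.0 * m2 / ((s - m2) * (s - 2.0 * m2)) * math.log((s - 3.0 * m2) / m2))
    return g**4 / (32.0 * math.pi * s * (s - 4.0 * m2)) * bracket

if __name__ == "__main__":
    m, g = 1.0, 1.0
    s, c = 4.01, 0.3
    t = -0.5 * (s - 4.0 * m * m) * (1.0 - c)   # eq. (11.40)
    print(amplitude(s, t, m, g), amplitude_nr_series(s, c, m, g))
    print(sigma_numeric(5.0, m, g), sigma_closed(5.0, m, g))
```

Working in units with $m = 1$ and $g = 1$ keeps the comparison dimensionless; the series and the exact amplitude should differ only at third order in $(s-4m^2)/m^2$.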
We have in this case $t_{\rm min} = -(s-4m^2)$ and $t_{\rm max} = 0$. Since the two outgoing particles are identical, the symmetry factor is $S = 2$. Then setting $u = 4m^2 - s - t$, and performing the integral in eq. (11.39) over $t$ at fixed $s$, we get
$$\sigma = \frac{g^4}{32\pi s(s-4m^2)}\left[\frac{2}{m^2} + \frac{s-4m^2}{(s-m^2)^2} - \frac{2}{s-3m^2} + \frac{4m^2}{(s-m^2)(s-2m^2)}\ln\!\left(\frac{s-3m^2}{m^2}\right)\right] + O(g^6)\,. \tag{11.44}$$
In the nonrelativistic limit, this becomes
$$\sigma = \frac{25g^4}{1152\pi m^6}\left[1 - \frac{79}{60}\left(\frac{s-4m^2}{m^2}\right) + \ldots\right] + O(g^6)\,. \tag{11.45}$$
In the extreme relativistic limit, we get
$$\sigma = \frac{g^4}{16\pi m^2s^2}\left[1 + \frac{7}{2}\,\frac{m^2}{s} + \ldots\right] + O(g^6)\,. \tag{11.46}$$
These results illustrate how even a very simple quantum field theory can yield specific predictions for cross sections that could be tested experimentally.

Let us now turn to the other basic problem mentioned at the beginning of this section: the case of a single incoming particle that decays to $n'$ other particles.

We have an immediate conceptual problem. According to our development of the LSZ formula in section 5, each incoming and outgoing particle should correspond to a single-particle state that is an exact eigenstate of the exact hamiltonian. This is clearly not the case for a particle that can decay. Referring to fig. (5.1), the hyperbola of such a particle must lie above the continuum threshold. Strictly speaking, then, the LSZ formula is not applicable.

A proper understanding of this issue requires a study of loop corrections that we will undertake in section 25. For now, we will simply assume that the LSZ formula continues to hold for a single incoming particle. Then we can retrace the steps from eq. (11.11) to eq. (11.20); the only change is that the norm of the initial state is now
$$\langle i|i\rangle = 2E_1V \tag{11.47}$$
instead of eq. (11.16). Identifying the differential decay rate $d\Gamma$ with $\dot{P}$ then gives
$$d\Gamma = \frac{1}{2E_1}\,|\mathcal{T}|^2\,d{\rm LIPS}_{n'}(k_1)\,, \tag{11.48}$$
where now $s = -k_1^2 = m_1^2$.
In the CM frame (which is now the rest frame of the initial particle), we have $E_1 = m_1$; in other frames, the relative factor of $E_1/m_1$ in $d\Gamma$ accounts for relativistic time dilation of the decay rate.

We can also define a total decay rate by integrating over all the outgoing momenta, and dividing by the symmetry factor of eq. (11.35):
$$\Gamma = \frac{1}{S}\int d\Gamma\,. \tag{11.49}$$
We will compute a decay rate in problem 11.1.

Reference Notes

For a derivation with wave packets, see Brown, Itzykson & Zuber, or Peskin & Schroeder.

Problems