Principles of Quantum Mechanics, Second Edition
R. Shankar
Yale University, New Haven, Connecticut
Springer

Library of Congress Cataloging-in-Publication Data
Shankar, Ramamurti.
Principles of quantum mechanics / R. Shankar. 2nd ed.
p. cm. Includes bibliographical references and index.
ISBN 0-306-44790-8
1. Quantum theory. I. Title.
QC174.12.S52 1994
530.1'2-dc20 94-26837 CIP

ISBN 978-1-4757-0578-2
ISBN 978-1-4757-0576-8 (eBook)
DOI: 10.1007/978-1-4757-0576-8

© 1994, 1980 Springer Science+Business Media, LLC. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed in the United States of America.
19 18 (corrected printing, 2008)
springer.com

To My Parents and to Uma, Umesh, Ajeet, Meera, and Maya

Preface to the Second Edition

Over the decade and a half since I wrote the first edition, nothing has altered my belief in the soundness of the overall approach taken here. This is based on the response of teachers, students, and my own occasional rereading of the book. I was generally quite happy with the book, although there were portions where I felt I could have done better and portions which bothered me by their absence. I welcome this opportunity to rectify all that. Apart from small improvements scattered over the text, there are three major changes.
First, I have rewritten a big chunk of the mathematical introduction in Chapter 1. Next, I have added a discussion of time-reversal invariance. I don't know how it got left out the first time; I wish I could go back and change it. The most important change concerns the inclusion of Chapter 21, "Path Integrals: Part II." The first edition already revealed my partiality for this subject by having a chapter devoted to it, which was quite unusual in those days. In this one, I have cast off all restraint and gone all out to discuss many kinds of path integrals and their uses. Whereas in Chapter 8 the path integral recipe was simply given, here I start by deriving it. I derive the configuration space integral (the usual Feynman integral), phase space integral, and (oscillator) coherent state integral. I discuss two applications: the derivation and application of the Berry phase and a study of the lowest Landau level with an eye on the quantum Hall effect. The relevance of these topics is unquestionable. This is followed by a section on imaginary time path integrals, their description of tunneling, instantons, and symmetry breaking, and their relation to classical and quantum statistical mechanics. An introduction is given to the transfer matrix. Then I discuss spin coherent state path integrals and path integrals for fermions. These were thought to be topics too advanced for a book like this, but I believe this is no longer true. These concepts are extensively used and it seemed a good idea to provide the students who had the wisdom to buy this book with a head start. How are instructors to deal with this extra chapter given the time constraints? I suggest omitting some material from the earlier chapters. (No one I know, myself included, covers the whole book while teaching any fixed group of students.) A realistic option is for the instructor to teach part of Chapter 21 and assign the rest as reading material, as topics for take-home exams, term papers, etc.
To ignore it, I think, would be to lose a wonderful opportunity to expose the student to ideas that are central to many current research topics and to deny them the attendant excitement. Since the aim of this chapter is to guide students toward more frontline topics, it is more concise than the rest of the book. Students are also expected to consult the references given at the end of the chapter.

Over the years, I have received some very useful feedback and I thank all those students and teachers who took the time to do so. I thank Howard Haber for a discussion of the Born approximation; Harsh Mathur and Ady Stern for discussions of the Berry phase; Alan Chodos, Steve Girvin, Ilya Gruzberg, Martin Gutzwiller, Ganpathy Murthy, Charlie Sommerfeld, and Senthil Todari for many useful comments on Chapter 21. I am most grateful to Captain Richard F. Malm, U.S.C.G. (Retired), Professor Dr. D. Schlüter of the University of Kiel, and Professor V. Yakovenko of the University of Maryland for detecting numerous errors in the first printing and taking the trouble to bring them to my attention. I thank Amelia McNamara of Plenum for urging me to write this edition and Plenum for its years of friendly and warm cooperation. I thank Ron Johnson, Editor at Springer, for his tireless efforts on behalf of this book, and Chris Bostock, Daniel Keren, and Jimmy Snyder for their generous help in correcting errors in the 14th printing. Finally, I thank my wife Uma for shielding me as usual from real life so I could work on this edition, and my battery of kids (revised and expanded since the previous edition) for continually charging me up.

R. Shankar
New Haven, Connecticut

Preface to the First Edition

Publish and perish - Giordano Bruno

Given the number of books that already exist on the subject of quantum mechanics, one would think that the public needs one more as much as it does, say, the latest version of the Table of Integrals. But this does not deter me (as it didn't my predecessors) from trying to circulate my own version of how it ought to be taught. The approach to be presented here (to be described in a moment) was first tried on a group of Harvard undergraduates in the summer of '76, once again in the summer of '77, and more recently at Yale on undergraduates ('77-'78) and graduates ('78-'79) taking a year-long course on the subject. In all cases the results were very satisfactory in the sense that the students seemed to have learned the subject well and to have enjoyed the presentation. It is, in fact, their enthusiastic response and encouragement that convinced me of the soundness of my approach and impelled me to write this book.

The basic idea is to develop the subject from its postulates, after addressing some indispensable preliminaries. Now, most people would agree that the best way to teach any subject that has reached the point of development where it can be reduced to a few postulates is to start with the latter, for it is this approach that gives students the fullest understanding of the foundations of the theory and how it is to be used. But they would also argue that whereas this is all right in the case of special relativity or mechanics, a typical student about to learn quantum mechanics seldom has any familiarity with the mathematical language in which the postulates are stated. I agree with these people that this problem is real, but I differ in my belief that it should and can be overcome. This book is an attempt at doing just this. It begins with a rather lengthy chapter in which the relevant mathematics of vector spaces is developed from simple ideas on vectors and matrices the student is assumed to know. The level of rigor is what I think is needed to make a practicing quantum mechanic out of the student.
This chapter, which typically takes six to eight lecture hours, is filled with examples from physics to keep students from getting too fidgety while they wait for the "real physics." Since the math introduced has to be taught sooner or later, I prefer sooner to later, for this way the students, when they get to it, can give quantum theory their fullest attention without having to battle with the mathematical theorems at the same time. Also, by segregating the mathematical theorems from the physical postulates, any possible confusion as to which is which is nipped in the bud. This chapter is followed by one on classical mechanics, where the Lagrangian and Hamiltonian formalisms are developed in some depth. It is for the instructor to decide how much of this to cover; the more students know of these matters, the better they will understand the connection between classical and quantum mechanics. Chapter 3 is devoted to a brief study of idealized experiments that betray the inadequacy of classical mechanics and give a glimpse of quantum mechanics. Having trained and motivated the students, I now give them the postulates of quantum mechanics of a single particle in one dimension. I use the word "postulate" here to mean "that which cannot be deduced from pure mathematical or logical reasoning, and given which one can formulate and solve quantum mechanical problems and interpret the results." This is not the sense in which the true axiomatist would use the word. For instance, where the true axiomatist would just postulate that the dynamical variables are given by Hilbert space operators, I would add the operator identifications, i.e., specify the operators that represent coordinate and momentum (from which others can be built).
Likewise, I would not stop with the statement that there is a Hamiltonian operator that governs the time evolution through the equation iℏ ∂|ψ⟩/∂t = H|ψ⟩; I would say that H is obtained from the classical Hamiltonian by substituting for x and p the corresponding operators. While the more general axioms have the virtue of surviving as we progress to systems of more degrees of freedom, with or without classical counterparts, students given just these will not know how to calculate anything such as the spectrum of the oscillator. Now one can, of course, try to "derive" these operator assignments, but to do so one would have to appeal to ideas of a postulatory nature themselves. (The same goes for "deriving" the Schrödinger equation.) As we go along, these postulates are generalized to more degrees of freedom and it is for pedagogical reasons that these generalizations are postponed. Perhaps when students are finished with this book, they can free themselves from the specific operator assignments and think of quantum mechanics as a general mathematical formalism obeying certain postulates (in the strict sense of the term). The postulates in Chapter 4 are followed by a lengthy discussion of the same, with many examples from fictitious Hilbert spaces of three dimensions. Nonetheless, students will find it hard. It is only as they go along and see these postulates used over and over again in the rest of the book, in the setting up of problems and the interpretation of the results, that they will catch on to how the game is played. It is hoped they will be able to do it on their own when they graduate. I think that any attempt to soften this initial blow will be counterproductive in the long run. Chapter 5 deals with standard problems in one dimension. It is worth mentioning that the scattering off a step potential is treated using a wave packet approach.
If the subject seems too hard at this stage, the instructor may decide to return to it after Chapter 7 (oscillator), when students have gained more experience. But I think that sooner or later students must get acquainted with this treatment of scattering. The classical limit is the subject of the next chapter. The harmonic oscillator is discussed in detail in the next. It is the first realistic problem and the instructor may be eager to get to it as soon as possible. If the instructor wants, he or she can discuss the classical limit after discussing the oscillator. We next discuss the path integral formulation due to Feynman. Given the intuitive understanding it provides, and its elegance (not to mention its ability to give the full propagator in just a few minutes for a class of problems), its omission from so many books is hard to understand. While it is admittedly hard to actually evaluate a path integral (one example is provided here), the notion of expressing the propagator as a sum over amplitudes from various paths is rather simple. The importance of this point of view is becoming clearer day by day to workers in statistical mechanics and field theory. I think every effort should be made to include at least the first three (and possibly five) sections of this chapter in the course. The content of the remaining chapters is standard, in the first approximation. The style is of course peculiar to this author, as are the specific topics. For instance, an entire chapter (11) is devoted to symmetries and their consequences. The chapter on the hydrogen atom also contains a section on how to make numerical estimates starting with a few mnemonics. Chapter 15, on addition of angular momenta, also contains a section on how to understand the "accidental" degeneracies in the spectra of hydrogen and the isotropic oscillator. The quantization of the radiation field is discussed in Chapter 18, on time-dependent perturbation theory.
Finally, the treatment of the Dirac equation in the last chapter (20) is intended to show that several things such as electron spin, its magnetic moment, the spin-orbit interaction, etc., which were introduced in an ad hoc fashion in earlier chapters, emerge as a coherent whole from the Dirac equation, and also to give students a glimpse of what lies ahead. This chapter also explains how Feynman resolves the problem of negative-energy solutions (in a way that applies to bosons and fermions).

For Whom Is this Book Intended?

In writing it, I addressed students who are trying to learn the subject by themselves; that is to say, I made it as self-contained as possible, included a lot of exercises and answers to most of them, and discussed several tricky points that trouble students when they learn the subject. But I am aware that in practice it is most likely to be used as a class text. There is enough material here for a full-year graduate course. It is, however, quite easy to adapt it to a year-long undergraduate course. Several sections that may be omitted without loss of continuity are indicated. The sequence of topics may also be changed, as stated earlier in this preface. I thought it best to let the instructor skim through the book and chart the course for his or her class, given their level of preparation and objectives. Of course the book will not be particularly useful if the instructor is not sympathetic to the broad philosophy espoused here, namely, that first comes the mathematical training and then the development of the subject from the postulates. To instructors who feel that this approach is all right in principle but will not work in practice, I reiterate that it has been found to work in practice, not just by me but also by teachers elsewhere. The book may be used by nonphysicists as well. (I have found that it goes well with chemistry majors in my classes.)
Although I wrote it for students with no familiarity with the subject, any previous exposure can only be advantageous. Finally, I invite instructors and students alike to communicate to me any suggestions for improvement, whether they be pedagogical or in reference to errors or misprints.

Acknowledgments

As I look back to see who all made this book possible, my thoughts first turn to my brother R. Rajaraman and friend Rajaram Nityananda, who, around the same time, introduced me to physics in general and quantum mechanics in particular. Next come my students, particularly Doug Stone, but for whose encouragement and enthusiastic response I would not have undertaken this project. I am grateful to Professor Julius Kovacs of Michigan State, whose kind words of encouragement assured me that the book would be as well received by my peers as it was by my students. More recently, I have profited from numerous conversations with my colleagues at Yale, in particular Alan Chodos and Peter Mohr. My special thanks go to Charles Sommerfield, who managed to make time to read the manuscript and made many useful comments and recommendations. The detailed proofreading was done by Tom Moore. I thank you, the reader, in advance, for drawing to my notice any errors that may have slipped past us. The bulk of the manuscript production costs were borne by the J. W. Gibbs fellowship from Yale, which also supported me during the time the book was being written. Ms. Laurie Liptak did a fantastic job of typing the first 18 chapters and Ms. Linda Ford did the same with Chapters 19 and 20. The figures are by Mr. J. Brosious. Mr. R. Badrinath kindly helped with the index.† On the domestic front, encouragement came from my parents, my in-laws, and most important of all from my wife, Uma, who cheerfully donated me to science for a year or so and stood by me throughout.
Little Umesh did his bit by tearing up all my books on the subject, both as a show of support and to create a need for this one.

R. Shankar
New Haven, Connecticut

†It is a pleasure to acknowledge the help of Mr. Richard Hatch, who drew my attention to a number of errors in the first printing.

Prelude

Our description of the physical world is dynamic in nature and undergoes frequent change. At any given time, we summarize our knowledge of natural phenomena by means of certain laws. These laws adequately describe the phenomena studied up to that time, to an accuracy then attainable. As time passes, we enlarge the domain of observation and improve the accuracy of measurement. As we do so, we constantly check to see if the laws continue to be valid. Those laws that do remain valid gain in stature, and those that do not must be abandoned in favor of new ones that do. In this changing picture, the laws of classical mechanics formulated by Galileo, Newton, and later by Euler, Lagrange, Hamilton, Jacobi, and others, remained unaltered for almost three centuries. The expanding domain of classical physics met its first obstacles around the beginning of this century. The obstruction came on two fronts: at large velocities and small (atomic) scales. The problem of large velocities was successfully solved by Einstein, who gave us his relativistic mechanics, while the founders of quantum mechanics (Bohr, Heisenberg, Schrödinger, Dirac, Born, and others) solved the problem of small-scale physics. The union of relativity and quantum mechanics, needed for the description of phenomena involving simultaneously large velocities and small scales, turns out to be very difficult. Although much progress has been made in this subject, called quantum field theory, there remain many open questions to this date. We shall concentrate here on just the small-scale problem, that is to say, on non-relativistic quantum mechanics.
The passage from classical to quantum mechanics has several features that are common to all such transitions in which an old theory gives way to a new one:

(1) There is a domain D_n of phenomena described by the new theory and a subdomain D_o wherein the old theory is reliable (to a given accuracy).
(2) Within the subdomain D_o either theory may be used to make quantitative predictions. It might often be more expedient to employ the old theory.
(3) In addition to numerical accuracy, the new theory often brings about radical conceptual changes. Being of a qualitative nature, these will have a bearing on all of D_n.

For example, in the case of relativity, D_o and D_n represent (macroscopic) phenomena involving small and arbitrary velocities, respectively, the latter, of course, being bounded by the velocity of light. In addition to giving better numerical predictions for high-velocity phenomena, relativity theory also outlaws several cherished notions of the Newtonian scheme, such as absolute time, absolute length, unlimited velocities for particles, etc. In a similar manner, quantum mechanics brings with it not only improved numerical predictions for the microscopic world, but also conceptual changes that rock the very foundations of classical thought. This book introduces you to this subject, starting from its postulates. Between you and the postulates there stand three chapters wherein you will find a summary of the mathematical ideas appearing in the statement of the postulates, a review of classical mechanics, and a brief description of the empirical basis for the quantum theory. In the rest of the book, the postulates are invoked to formulate and solve a variety of quantum mechanical problems. It is hoped that by the time you get to the end of the book, you will be able to do the same yourself.

Note to the Student

Do as many exercises as you can, especially the ones marked * or whose results carry equation numbers.
The answer to each exercise is given either with the exercise or at the end of the book. The first chapter is very important. Do not rush through it. Even if you know the math, read it to get acquainted with the notation. I am not saying it is an easy subject. But I hope this book makes it seem reasonable. Good luck.

Contents

1. Mathematical Introduction
1.1. Linear Vector Spaces: Basics
1.2. Inner Product Spaces
1.3. Dual Spaces and the Dirac Notation
1.4. Subspaces
1.5. Linear Operators
1.6. Matrix Elements of Linear Operators
1.7. Active and Passive Transformations
1.8. The Eigenvalue Problem
1.9. Functions of Operators and Related Concepts
1.10. Generalization to Infinite Dimensions

2. Review of Classical Mechanics
2.1. The Principle of Least Action and Lagrangian Mechanics
2.2. The Electromagnetic Lagrangian
2.3. The Two-Body Problem
2.4. How Smart Is a Particle?
2.5. The Hamiltonian Formalism
2.6. The Electromagnetic Force in the Hamiltonian Scheme
2.7. Cyclic Coordinates, Poisson Brackets, and Canonical Transformations
2.8. Symmetries and Their Consequences

3. All Is Not Well with Classical Mechanics
3.1. Particles and Waves in Classical Physics
3.2. An Experiment with Waves and Particles (Classical)
3.3. The Double-Slit Experiment with Light
3.4. Matter Waves (de Broglie Waves)
3.5. Conclusions

4. The Postulates - a General Discussion
4.1. The Postulates
4.2. Discussion of Postulates I-III
4.3. The Schrödinger Equation (Dotting Your i's and Crossing Your ℏ's)

5. Simple Problems in One Dimension
5.1. The Free Particle
5.2. The Particle in a Box
5.3. The Continuity Equation for Probability
5.4. The Single-Step Potential: A Problem in Scattering
5.5. The Double-Slit Experiment
5.6. Some Theorems

6. The Classical Limit

7. The Harmonic Oscillator
7.1. Why Study the Harmonic Oscillator?
7.2. Review of the Classical Oscillator
7.3. Quantization of the Oscillator (Coordinate Basis)
7.4. The Oscillator in the Energy Basis
7.5. Passage from the Energy Basis to the X Basis

8. The Path Integral Formulation of Quantum Theory
8.1. The Path Integral Recipe
8.2. Analysis of the Recipe
8.3. An Approximation to U(t) for the Free Particle
8.4. Path Integral Evaluation of the Free-Particle Propagator
8.5. Equivalence to the Schrödinger Equation
8.6. Potentials of the Form V = a + bx + cx² + dẋ + exẋ

9. The Heisenberg Uncertainty Relations
9.1. Introduction
9.2. Derivation of the Uncertainty Relations
9.3. The Minimum Uncertainty Packet
9.4. Applications of the Uncertainty Principle
9.5. The Energy-Time Uncertainty Relation

10. Systems with N Degrees of Freedom
10.1. N Particles in One Dimension
10.2. More Particles in More Dimensions
10.3. Identical Particles

11. Symmetries and Their Consequences
11.1. Overview
11.2. Translational Invariance in Quantum Theory
11.3. Time Translational Invariance
11.4. Parity Invariance
11.5. Time-Reversal Symmetry

12. Rotational Invariance and Angular Momentum
12.1. Translations in Two Dimensions
12.2. Rotations in Two Dimensions
12.3. The Eigenvalue Problem of L_z
12.4. Angular Momentum in Three Dimensions
12.5. The Eigenvalue Problem of L² and L_z
12.6. Solution of Rotationally Invariant Problems

13. The Hydrogen Atom
13.1. The Eigenvalue Problem
13.2. The Degeneracy of the Hydrogen Spectrum
13.3. Numerical Estimates and Comparison with Experiment
13.4. Multielectron Atoms and the Periodic Table

14. Spin
14.1. Introduction
14.2. What Is the Nature of Spin?
14.3. Kinematics of Spin
14.4. Spin Dynamics
14.5. Return of Orbital Degrees of Freedom

15. Addition of Angular Momenta
15.1. A Simple Example
15.2. The General Problem
15.3. Irreducible Tensor Operators
15.4. Explanation of Some "Accidental" Degeneracies

16. Variational and WKB Methods
16.1. The Variational Method
16.2. The Wentzel-Kramers-Brillouin Method

17. Time-Independent Perturbation Theory
17.1. The Formalism
17.2. Some Examples
17.3. Degenerate Perturbation Theory

18. Time-Dependent Perturbation Theory
18.1. The Problem
18.2. First-Order Perturbation Theory
18.3. Higher Orders in Perturbation Theory
18.4. A General Discussion of Electromagnetic Interactions
18.5. Interaction of Atoms with Electromagnetic Radiation

19. Scattering Theory
19.1. Introduction
19.2. Recapitulation of One-Dimensional Scattering and Overview
19.3. The Born Approximation (Time-Dependent Description)
19.4. Born Again (The Time-Independent Approximation)
19.5. The Partial Wave Expansion
19.6. Two-Particle Scattering

20. The Dirac Equation
20.1. The Free-Particle Dirac Equation
20.2. Electromagnetic Interaction of the Dirac Particle
20.3. More on Relativistic Quantum Mechanics

21. Path Integrals: Part II
21.1. Derivation of the Path Integral
21.2. Imaginary Time Formalism
21.3. Spin and Fermion Path Integrals
21.4. Summary

Appendix
A.1. Matrix Inversion
A.2. Gaussian Integrals
A.3. Complex Numbers
A.4. The iε Prescription

Answers to Selected Exercises
Table of Constants
Index
1. Mathematical Introduction

The aim of this book is to provide you with an introduction to quantum mechanics, starting from its axioms. It is the aim of this chapter to equip you with the necessary mathematical machinery. All the math you will need is developed here, starting from some basic ideas on vectors and matrices that you are assumed to know. Numerous examples and exercises related to classical mechanics are given, both to provide some relief from the math and to demonstrate the wide applicability of the ideas developed here. The effort you put into this chapter will be well worth your while: not only will it prepare you for this course, but it will also unify many ideas you may have learned piecemeal. To really learn this chapter, you must, as with any other chapter, work out the problems.

1.1. Linear Vector Spaces: Basics

In this section you will be introduced to linear vector spaces. You are surely familiar with the arrows from elementary physics encoding the magnitude and direction of velocity, force, displacement, torque, etc. You know how to add them and multiply them by scalars and the rules obeyed by these operations. For example, you know that scalar multiplication is distributive: the multiple of a sum of two vectors is the sum of the multiples. What we want to do is abstract from this simple case a set of basic features or axioms, and say that any set of objects obeying the same forms a linear vector space. The cleverness lies in deciding which of the properties to keep in the generalization. If you keep too many, there will be no other examples; if you keep too few, there will be no interesting results to develop from the axioms. The following is the list of properties the mathematicians have wisely chosen as requisite for a vector space. As you read them, please compare them to the world of arrows and make sure that these are indeed properties possessed by these familiar vectors.
But note also that conspicuously missing are the requirements that every vector have a magnitude and direction, which was the first and most salient feature drilled into our heads when we first heard about them. So you might think that in dropping this requirement, the baby has been thrown out with the bath water. However, you will have ample time to appreciate the wisdom behind this choice as you go along and see a great unification and synthesis of diverse ideas under the heading of vector spaces. You will see examples of vector spaces that involve entities that you cannot intuitively perceive as having either a magnitude or a direction. While you should be duly impressed with all this, remember that it does not hurt at all to think of these generalizations in terms of arrows and to use the intuition to prove theorems or at the very least anticipate them.

Definition 1. A linear vector space 𝕍 is a collection of objects |1⟩, |2⟩, ..., |V⟩, ..., |W⟩, ..., called vectors, for which there exists

1. A definite rule for forming the vector sum, denoted |V⟩ + |W⟩
2. A definite rule for multiplication by scalars a, b, ..., denoted a|V⟩

with the following features:

• The result of these operations is another element of the space, a feature called closure: |V⟩ + |W⟩ ∈ 𝕍.
• Scalar multiplication is distributive in the vectors: a(|V⟩ + |W⟩) = a|V⟩ + a|W⟩.
• Scalar multiplication is distributive in the scalars: (a + b)|V⟩ = a|V⟩ + b|V⟩.
• Scalar multiplication is associative: a(b|V⟩) = ab|V⟩.
• Addition is commutative: |V⟩ + |W⟩ = |W⟩ + |V⟩.
• Addition is associative: |V⟩ + (|W⟩ + |Z⟩) = (|V⟩ + |W⟩) + |Z⟩.
• There exists a null vector |0⟩ obeying |V⟩ + |0⟩ = |V⟩.
• For every vector |V⟩ there exists an inverse under addition, |−V⟩, such that |V⟩ + |−V⟩ = |0⟩.

There is a good way to remember all of these; do what comes naturally.

Definition 2. The numbers a, b, ... are called the field over which the vector space is defined.
If the field consists of all real numbers, we have a real vector space; if they are complex, we have a complex vector space. The vectors themselves are neither real nor complex; the adjective applies only to the scalars.

Let us note that the above axioms imply

• |0⟩ is unique, i.e., if |0′⟩ has all the properties of |0⟩, then |0⟩ = |0′⟩.
• 0|V⟩ = |0⟩.
• |−V⟩ = −|V⟩.
• |−V⟩ is the unique additive inverse of |V⟩.

The proofs are left to the following exercise. You don't have to know the proofs, but you do have to know the statements.

Exercise 1.1.1. Verify these claims. For the first, consider |0⟩ + |0′⟩ and use the advertised properties of the two null vectors in turn. For the second, start with |0⟩ = (0 + 1)|V⟩ + |−V⟩. For the third, begin with |V⟩ + (−|V⟩) = 0|V⟩ = |0⟩. For the last, let |W⟩ also satisfy |V⟩ + |W⟩ = |0⟩. Since |0⟩ is unique, this means |V⟩ + |W⟩ = |V⟩ + |−V⟩. Take it from here.

Figure 1.1. The rule for vector addition. Note that it obeys axioms (i)-(iii).

Exercise 1.1.2. Consider the set of all entities of the form (a, b, c) where the entries are real numbers. Addition and scalar multiplication are defined as follows:

(a, b, c) + (d, e, f) = (a + d, b + e, c + f)
α(a, b, c) = (αa, αb, αc).

Write down the null vector and inverse of (a, b, c). Show that vectors of the form (a, b, 1) do not form a vector space.

Observe that we are using a new symbol |V⟩ to denote a generic vector. This object is called ket V and this nomenclature is due to Dirac, whose notation will be discussed at some length later. We purposely do not use the symbol V to denote the vectors, as the first step in weaning you away from the limited concept of the vector as an arrow. You are however not discouraged from associating with |V⟩ the arrowlike object till you have seen enough vectors that are not arrows and are ready to drop the crutch. You were asked to verify that the set of arrows qualified as a vector space as you read the axioms.
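The axioms of Definition 1 can also be checked mechanically. The following Python sketch (my own illustration, not part of the text; the function names `add` and `scale` are invented for it) verifies them numerically for the real triples of Exercise 1.1.2:

```python
# Sketch: checking the vector-space axioms for triples (a, b, c)
# of real numbers, with componentwise addition and scaling.

def add(v, w):
    """Vector sum: componentwise addition."""
    return tuple(x + y for x, y in zip(v, w))

def scale(alpha, v):
    """Scalar multiplication: stretch every component by alpha."""
    return tuple(alpha * x for x in v)

null = (0.0, 0.0, 0.0)  # the null vector |0>
v, w, z = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)
a, b = 2.0, -3.0

assert add(v, w) == add(w, v)                                # commutativity
assert add(v, add(w, z)) == add(add(v, w), z)                # associativity
assert add(v, null) == v                                     # null vector
assert add(v, scale(-1.0, v)) == null                        # additive inverse
assert scale(a, add(v, w)) == add(scale(a, v), scale(a, w))  # distributive in vectors
assert scale(a + b, v) == add(scale(a, v), scale(b, v))      # distributive in scalars
assert scale(a, scale(b, v)) == scale(a * b, v)              # associative scaling
```

All assertions pass. Note that the triples (a, b, 1) of the exercise fail closure under this very `add`: the sum of two of them has third entry 2, not 1.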
Here are some of the key ideas you should have gone over. The vector space consists of arrows, typical ones being V and V′. The rule for addition is familiar: take the tail of the second arrow, put it on the tip of the first, and so on, as in Fig. 1.1. Scalar multiplication by a corresponds to stretching the vector by a factor a. This is a real vector space since stretching by a complex number makes no sense. (If a is negative, we interpret it as changing the direction of the arrow as well as rescaling it by |a|.) Since these operations acting on arrows give more arrows, we have closure. Addition and scalar multiplication clearly have all the desired associative and distributive features. The null vector is the arrow of zero length, while the inverse of a vector is the vector reversed in direction. So the set of all arrows qualifies as a vector space. But we cannot tamper with it. For example, the set of all arrows with positive z-components do not form a vector space: there is no inverse.

Note that so far, no reference has been made to magnitude or direction. The point is that while the arrows have these qualities, members of a vector space need not. This statement is pointless unless I can give you examples, so here are two. Consider the set of all 2 × 2 matrices. We know how to add them and multiply them by scalars (multiply all four matrix elements by that scalar). The corresponding rules obey closure, associativity, and distributive requirements. The null matrix has all zeros in it and the inverse under addition of a matrix is the matrix with all elements negated. You must agree that here we have a genuine vector space consisting of things which don't have an obvious length or direction associated with them. When we want to highlight the fact that the matrix M is an element of a vector space, we may want to refer to it as, say, ket number 4 or |4⟩.

As a second example, consider all functions f(x) defined in an interval 0 ≤ x ≤ L.
We define scalar multiplication by a simply as af(x) and addition as pointwise addition: the sum of two functions f and g has the value f(x) + g(x) at the point x. The null function is zero everywhere and the additive inverse of f is −f.

Exercise 1.1.3. Do functions that vanish at the end points x = 0 and x = L form a vector space? How about periodic functions obeying f(0) = f(L)? How about functions that obey f(0) = 4? If the functions do not qualify, list the things that go wrong.

The next concept is that of linear independence of a set of vectors |1⟩, |2⟩, …, |n⟩. First consider a linear relation of the form

∑ᵢ₌₁ⁿ aᵢ|i⟩ = |0⟩    (1.1.1)

We may assume without loss of generality that the left-hand side does not contain any multiple of |0⟩, for if it did, it could be shifted to the right, and combined with the |0⟩ there to give |0⟩ once more. (We are using the fact that any multiple of |0⟩ equals |0⟩.)

Definition 3. The set of vectors is said to be linearly independent if the only such linear relation as Eq. (1.1.1) is the trivial one with all aᵢ = 0. If the set of vectors is not linearly independent, we say they are linearly dependent.

Equation (1.1.1) tells us that it is not possible to write any member of the linearly independent set in terms of the others. On the other hand, if the set of vectors is linearly dependent, such a relation will exist, and it must contain at least two nonzero coefficients. Let us say a₃ ≠ 0. Then we could write

|3⟩ = −∑ᵢ≠₃ (aᵢ/a₃)|i⟩    (1.1.2)

Consider, for example, two nonparallel arrows |1⟩ and |2⟩ in a plane; they form a linearly independent set. Suppose we bring in a third vector |3⟩, also in the plane. If it is parallel to either of the first two, we already have a linearly dependent set. So let us suppose it is not. But even now the three of them are linearly dependent. This is because we can write one of them, say |3⟩, as a linear combination of the other two. To find the combination, draw a line from the tail of |3⟩ in the direction of |1⟩. Next draw a line antiparallel to |2⟩ from the tip of |3⟩.
These lines will intersect since |1⟩ and |2⟩ are not parallel by assumption. The intersection point P will determine how much of |1⟩ and |2⟩ we want: we go from the tail of |3⟩ to P using the appropriate multiple of |1⟩ and go from P to the tip of |3⟩ using the appropriate multiple of |2⟩.

Exercise 1.1.4. Consider three elements from the vector space of real 2 × 2 matrices:

|1⟩ = [0 1; 0 0]    |2⟩ = [1 1; 0 1]    |3⟩ = [−2 −1; 0 −2]

Are they linearly independent? Support your answer with details. (Notice we are calling these matrices vectors and using kets to represent them to emphasize their role as elements of a vector space.)

Exercise 1.1.5. Show that the following row vectors are linearly dependent: (1, 1, 0), (1, 0, 1), and (3, 2, 1). Show the opposite for (1, 1, 0), (1, 0, 1), and (0, 1, 1).

Definition 4. A vector space has dimension n if it can accommodate a maximum of n linearly independent vectors. It will be denoted by 𝕍ⁿ(R) if the field is real and by 𝕍ⁿ(C) if the field is complex.

In view of the earlier discussions, the plane is two-dimensional and the set of all arrows not limited to the plane defines a three-dimensional vector space. How about 2 × 2 matrices? They form a four-dimensional vector space. Here is a proof. The following vectors are linearly independent:

|1⟩ = [1 0; 0 0]    |2⟩ = [0 1; 0 0]    |3⟩ = [0 0; 1 0]    |4⟩ = [0 0; 0 1]

since it is impossible to form linear combinations of any three of them to give the fourth: any three of them will have a zero in the one place where the fourth does not. So the space is at least four-dimensional. Could it be bigger? No, since any arbitrary 2 × 2 matrix can be written in terms of them:

[a b; c d] = a|1⟩ + b|2⟩ + c|3⟩ + d|4⟩

If the scalars a, b, c, d are real, we have a real four-dimensional space; if they are complex, we have a complex four-dimensional space.

Theorem 1. Any vector |V⟩ in an n-dimensional space can be written as a linear combination of n linearly independent vectors |1⟩ … |n⟩.
The proof is as follows: if there were a vector |V⟩ for which this were not possible, it would join the given set of vectors and form a set of n + 1 linearly independent vectors, which is not possible in an n-dimensional space by definition.

Definition 5. A set of n linearly independent vectors in an n-dimensional space is called a basis.

Thus we can write, on the strength of the above,

|V⟩ = ∑ᵢ₌₁ⁿ vᵢ|i⟩    (1.1.3)

where the vectors |i⟩ form a basis.

Definition 6. The coefficients of expansion vᵢ of a vector in terms of a linearly independent basis (|i⟩) are called the components of the vector in that basis.

Theorem 2. The expansion in Eq. (1.1.3) is unique.

Suppose the expansion is not unique. We must then have a second expansion:

|V⟩ = ∑ᵢ₌₁ⁿ vᵢ′|i⟩    (1.1.4)

Subtracting Eq. (1.1.4) from Eq. (1.1.3) (i.e., multiplying the second by the scalar −1 and adding the two equations) we get

|0⟩ = ∑ᵢ (vᵢ − vᵢ′)|i⟩    (1.1.5)

which implies that

vᵢ = vᵢ′    (1.1.6)

since the basis vectors are linearly independent and only a trivial linear relation between them can exist.

Note that given a basis the components are unique, but if we change the basis, the components will change. We refer to |V⟩ as the vector in the abstract, having an existence of its own and satisfying various relations involving other vectors. When we choose a basis the vectors assume concrete forms in terms of their components and the relation between vectors is satisfied by the components. Imagine, for example, three arrows in the plane, A, B, C, satisfying A + B = C according to the laws for adding arrows. So far no basis has been chosen and we do not need a basis to make the statement that the vectors form a closed triangle. Now we choose a basis and write each vector in terms of the components. The components will satisfy Cᵢ = Aᵢ + Bᵢ, i = 1, 2.
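This can be checked numerically. In the sketch below (NumPy; the arrows and the second basis are hypothetical choices of ours), the components of A, B, C are found in two different bases by solving the expansion equations; the components differ from basis to basis, but Cᵢ = Aᵢ + Bᵢ holds in both:

```python
# Sketch: three arrows A + B = C in the plane; components in two bases.
import numpy as np

A = np.array([1.0, 0.0])
B = np.array([0.0, 2.0])
C = A + B

basis1 = np.eye(2)                            # standard basis (columns)
basis2 = np.array([[1.0, 1.0],
                   [1.0, -1.0]])              # another independent pair (columns)

def components(vec, basis):
    # Solve basis @ comps = vec for the expansion coefficients.
    return np.linalg.solve(basis, vec)

A1, B1, C1 = (components(x, basis1) for x in (A, B, C))
A2, B2, C2 = (components(x, basis2) for x in (A, B, C))

assert not np.allclose(A1, A2)     # components change with the basis...
assert np.allclose(C1, A1 + B1)    # ...but C_i = A_i + B_i holds in basis 1
assert np.allclose(C2, A2 + B2)    # ...and in basis 2
```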
If we choose a different basis, the components will change in numerical value, but the relation between them expressing the equality of C to the sum of the other two will still hold between the new set of components.

In the case of nonarrow vectors, adding them in terms of components proceeds as in the elementary case thanks to the axioms. If

|V⟩ = ∑ᵢ vᵢ|i⟩    (1.1.7)

and

|W⟩ = ∑ᵢ wᵢ|i⟩    (1.1.8)

then

|V⟩ + |W⟩ = ∑ᵢ (vᵢ + wᵢ)|i⟩    (1.1.9)

where we have used the axioms to carry out the regrouping of terms. Here is the conclusion: To add two vectors, add their components. There is no reference to taking the tail of one and putting it on the tip of the other, etc., since in general the vectors have no head or tail. Of course, if we are dealing with arrows, we can add them either using the tail and tip routine or by simply adding their components in a basis. In the same way, we have:

a|V⟩ = ∑ᵢ avᵢ|i⟩    (1.1.10)

In other words, To multiply a vector by a scalar, multiply all its components by the scalar.

1.2. Inner Product Spaces

The matrix and function examples must have convinced you that we can have a vector space with no preassigned definition of length or direction for the elements. However, we can make up quantities that have the same properties that the lengths and angles do in the case of arrows. The first step is to define a sensible analog of the dot product, for in the case of arrows, from the dot product

A·B = |A||B| cos θ    (1.2.1)

we can read off the length of, say, A as √(A·A) and the cosine of the angle between two vectors as A·B/|A||B|. Now you might rightfully object: how can you use the dot product to define the length and angles, if the dot product itself requires knowledge of the lengths and angles? The answer is this.

Figure 1.2. Geometrical proof that the dot product obeys axiom (3) for an inner product. The axiom requires that the projections obey Pⱼ + Pₖ = Pⱼₖ.

Recall that the dot product has a second
equivalent expression in terms of the components:

A·B = ∑ᵢ AᵢBᵢ    (1.2.2)

Our goal is to define a similar formula for the general case where we do have the notion of components in a basis. To this end we recall the main features of the above dot product:

1. A·B = B·A (symmetry)
2. A·A ≥ 0, and A·A = 0 iff A = 0 (positive semidefiniteness)
3. A·(bB + cC) = bA·B + cA·C (linearity)

The linearity of the dot product is illustrated in Fig. 1.2.

We want to invent a generalization called the inner product or scalar product between any two vectors |V⟩ and |W⟩. We denote it by the symbol ⟨V|W⟩. It is once again a number (generally complex) dependent on the two vectors. We demand that it obey the following axioms:

• ⟨V|W⟩ = ⟨W|V⟩* (skew-symmetry)
• ⟨V|V⟩ ≥ 0, and ⟨V|V⟩ = 0 iff |V⟩ = |0⟩ (positive semidefiniteness)
• ⟨V|(a|W⟩ + b|Z⟩) ≡ ⟨V|aW + bZ⟩ = a⟨V|W⟩ + b⟨V|Z⟩ (linearity in ket)

Definition 7. A vector space with an inner product is called an inner product space.

Notice that we have not yet given an explicit rule for actually evaluating the scalar product; we are merely demanding that any rule we come up with must have these properties. With a view to finding such a rule, let us familiarize ourselves with the axioms. The first differs from the corresponding one for the dot product and makes the inner product sensitive to the order of the two factors, with the two choices leading to complex conjugates. In a real vector space this axiom states the symmetry of the dot product under exchange of the two vectors. For the present, let us note that this axiom ensures that ⟨V|V⟩ is real. The second axiom says that ⟨V|V⟩ is not just real but also positive semidefinite, vanishing only if the vector itself does. If we are going to define the length of the vector as the square root of its inner product with itself (as in the dot product) this quantity had better be real and positive for all nonzero vectors.
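The axioms can be made concrete with a numerical sketch. Below (NumPy; the sample vectors and scalars are arbitrary choices of ours) the standard component rule ⟨V|W⟩ = ∑ᵢ vᵢ*wᵢ of Eq. (1.2.5) is checked against skew-symmetry, positive semidefiniteness, linearity in the ket, and the antilinearity in the bra that will be derived next:

```python
# Sketch: the inner-product axioms checked for <V|W> = sum_i v_i* w_i on C^2.
import numpy as np

def inner(v, w):
    # Conjugate the components of the FIRST vector (the bra side).
    return np.sum(np.conj(v) * w)

V = np.array([1 + 2j, 3 - 1j])
W = np.array([2 - 1j, 1j])
Z = np.array([0.5j, 4.0])
a, b = 2 - 3j, 1 + 1j

skew_symmetry = np.isclose(inner(V, W), np.conj(inner(W, V)))
positivity = inner(V, V).real > 0 and np.isclose(inner(V, V).imag, 0.0)
linear_ket = np.isclose(inner(V, a * W + b * Z),
                        a * inner(V, W) + b * inner(V, Z))
antilinear_bra = np.isclose(inner(a * W + b * Z, V),
                            np.conj(a) * inner(W, V) + np.conj(b) * inner(Z, V))
```

Dropping the conjugation in `inner` would make `positivity` fail for complex components, which is exactly the point of the first axiom.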
The last axiom expresses the linearity of the inner product when a linear superposition a|W⟩ + b|Z⟩ ≡ |aW + bZ⟩ appears as the second vector in the scalar product. We have discussed its validity for the arrows case (Fig. 1.2).

What if the first factor in the product is a linear superposition, i.e., what is ⟨aW + bZ|V⟩? This is determined by the first axiom:

⟨aW + bZ|V⟩ = ⟨V|aW + bZ⟩*
            = (a⟨V|W⟩ + b⟨V|Z⟩)*
            = a*⟨V|W⟩* + b*⟨V|Z⟩*
            = a*⟨W|V⟩ + b*⟨Z|V⟩    (1.2.3)

which expresses the antilinearity of the inner product with respect to the first factor in the inner product. In other words, the inner product of a linear superposition with another vector is the corresponding superposition of inner products if the superposition occurs in the second factor, while it is the superposition with all coefficients conjugated if the superposition occurs in the first factor. This asymmetry, unfamiliar in real vector spaces, is here to stay and you will get used to it as you go along.

Let us continue with inner products. Even though we are trying to shed the restricted notion of a vector as an arrow and seeking a corresponding generalization of the dot product, we still use some of the same terminology.

Definition 8. We say that two vectors are orthogonal or perpendicular if their inner product vanishes.

Definition 9. We will refer to √⟨V|V⟩ ≡ |V| as the norm or length of the vector. A normalized vector has unit norm.

Definition 10. A set of basis vectors all of unit norm, which are pairwise orthogonal, will be called an orthonormal basis.

We will also frequently refer to the inner or scalar product as the dot product. We are now ready to obtain a concrete formula for the inner product in terms of the components. Given |V⟩ and |W⟩

|V⟩ = ∑ᵢ vᵢ|i⟩
|W⟩ = ∑ⱼ wⱼ|j⟩

we follow the axioms obeyed by the inner product to obtain:

⟨V|W⟩ = ∑ᵢ ∑ⱼ vᵢ*wⱼ⟨i|j⟩    (1.2.4)

To go any further we have to know ⟨i|j⟩, the inner product between basis vectors.
That depends on the details of the basis vectors and all we know for sure is that they are linearly independent. This situation exists for arrows as well. Consider a two-dimensional problem where the basis vectors are two linearly independent but nonperpendicular vectors. If we write all vectors in terms of this basis, the dot product of any two of them will likewise be a double sum with four terms (determined by the four possible dot products between the basis vectors) as well as the vector components. However, if we use an orthonormal basis such as î, ĵ, only diagonal terms like ⟨i|i⟩ will survive and we will get the familiar result A·B = AₓBₓ + A_yB_y, depending only on the components. For the more general nonarrow case, we invoke Theorem 3.

Theorem 3 (Gram-Schmidt). Given a linearly independent basis we can form linear combinations of the basis vectors to obtain an orthonormal basis.

Postponing the proof for a moment, let us assume that the procedure has been implemented and that the current basis is orthonormal:

⟨i|j⟩ = 1 for i = j
⟨i|j⟩ = 0 for i ≠ j

or, compactly, ⟨i|j⟩ = δᵢⱼ, where δᵢⱼ is called the Kronecker delta symbol. Feeding this into Eq. (1.2.4) we find that the double sum collapses to a single one due to the Kronecker delta, to give

⟨V|W⟩ = ∑ᵢ vᵢ*wᵢ    (1.2.5)

This is the form of the inner product we will use from now on. You can now appreciate the first axiom: but for the complex conjugation of the components of the first vector, ⟨V|V⟩ would not even be real, not to mention positive. Within a basis, the bra ⟨V| is represented by the row vector [v₁*, v₂*, …, vₙ*]. There is, however, nothing wrong with the first viewpoint of associating a scalar product with a pair of columns or kets (making no reference to another dual space) and living with the asymmetry between the first and second vector in the inner product (which one to transpose conjugate?). If you found the above discussion heavy going, you can temporarily ignore it.
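The procedure promised by Theorem 3 can be sketched in a few lines of code: subtract from each new vector its projections along the unit vectors already built, then normalize. (The starting basis below is a hypothetical choice of ours; `np.vdot` conjugates its first argument, matching ⟨e|v⟩.)

```python
# Sketch of the Gram-Schmidt procedure of Theorem 3.
import numpy as np

def gram_schmidt(vectors):
    basis = []
    for v in vectors:
        for e in basis:
            v = v - np.vdot(e, v) * e        # remove the component along |e>
        basis.append(v / np.linalg.norm(v))  # normalize the remainder
    return basis

# A linearly independent but non-orthogonal starting basis:
raw = [np.array([1.0, 1.0, 0.0]),
       np.array([1.0, 0.0, 1.0]),
       np.array([0.0, 1.0, 1.0])]
ortho = gram_schmidt(raw)

# The resulting inner products are <i|j> = delta_ij:
gram = np.array([[np.vdot(e, f) for f in ortho] for e in ortho])
assert np.allclose(gram, np.eye(3))
```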
The only thing you must remember is that in the case of a general nonarrow vector space:

• Vectors can still be assigned components in some orthonormal basis, just as with arrows, but these may be complex.
• The inner product of any two vectors is given in terms of these components by Eq. (1.2.5). This product obeys all the axioms.

1.3.1. Expansion of Vectors in an Orthonormal Basis

Suppose we wish to expand a vector |V⟩ in an orthonormal basis. To find the components that go into the expansion we proceed as follows. We take the dot product of both sides of the assumed expansion |V⟩ = ∑ᵢ vᵢ|i⟩ with |j⟩ (or the bra ⟨j|):

⟨j|V⟩ = ∑ᵢ vᵢ⟨j|i⟩ = vⱼ

i.e., the jth component of |V⟩ is the inner product of |j⟩ with |V⟩. Thus

|V⟩ = ∑ᵢ |i⟩⟨i|V⟩    (1.3.5)

Let us make sure the basis vectors look as they should. If we set |V⟩ = |j⟩ in Eq. (1.3.5), we find the correct answer: the ith component of the jth basis vector is δᵢⱼ. Thus for example the column representing basis vector number 4 will have a 1 in the 4th row and zero everywhere else. The abstract relation

|V⟩ = ∑ᵢ vᵢ|i⟩    (1.3.6)

becomes in this basis

[v₁, v₂, …, vₙ]ᵀ = v₁[1, 0, …, 0]ᵀ + v₂[0, 1, …, 0]ᵀ + ⋯ + vₙ[0, 0, …, 1]ᵀ    (1.3.7)

1.3.2. Adjoint Operation

We have seen that we may pass from the column representing a ket to the row representing the corresponding bra by the adjoint operation, i.e., transpose conjugation. Let us now ask: if |V⟩ is represented by the column [v₁, v₂, …, vₙ]ᵀ, what represents a|V⟩? It is the column [av₁, av₂, …, avₙ]ᵀ, whose adjoint is the row

[a*v₁*, a*v₂*, …, a*vₙ*] ↔ ⟨V|a*    (1.3.8)

It is customary to write a|V⟩ as |aV⟩ and the corresponding bra as ⟨aV|. What we have found is that

⟨aV| = ⟨V|a*    (1.3.9)

Since the relation between bras and kets is linear we can say that if we have an equation among kets such as

a|V⟩ = b|W⟩ + c|Z⟩ + ⋯    (1.3.10)

this implies another one among the corresponding bras:

⟨V|a* = ⟨W|b* + ⟨Z|c* + ⋯    (1.3.11)

The two equations above are said to be adjoints of each other. Just as any equation involving complex numbers implies another obtained by taking the complex conjugates of both sides, an equation between kets implies another one between bras, and vice versa.
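In a basis, all of this is transpose conjugation of columns and rows. The following sketch (NumPy; sample column of ours) verifies Eq. (1.3.9), ⟨aV| = ⟨V|a*:

```python
# Sketch: a ket as a column, its bra as the conjugate-transposed row,
# and the adjoint of a|V>.
import numpy as np

V = np.array([[1 + 1j],
              [2 - 1j]])          # column representing |V>
bra_V = V.conj().T                # row representing <V|

a = 2 - 3j
ket_aV = a * V                    # column for |aV> = a|V>
bra_aV = ket_aV.conj().T          # row for <aV|

# <aV| = <V| a*, Eq. (1.3.9):
adjoint_rule = np.allclose(bra_aV, bra_V * np.conj(a))
```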
If you think in a basis, you will see that this follows simply from the fact that if two columns are equal, so are their transpose conjugates.

Here is the rule for taking the adjoint:

To take the adjoint of a linear equation relating kets (bras), replace every ket (bra) by its bra (ket) and complex conjugate all coefficients.

We can extend this rule as follows. Suppose we have an expansion for a vector:

|V⟩ = ∑ᵢ₌₁ⁿ vᵢ|i⟩    (1.3.12)

in terms of basis vectors. The adjoint is

⟨V| = ∑ᵢ₌₁ⁿ ⟨i|vᵢ*

Recalling that vᵢ = ⟨i|V⟩ and vᵢ* = ⟨V|i⟩, it follows that

⟨V| = ∑ᵢ₌₁ⁿ ⟨V|i⟩⟨i|

Let us now take up the proof of the Gram-Schmidt theorem (Theorem 3): given a linearly independent basis |I⟩, |II⟩, …, we construct an orthonormal basis |1⟩, |2⟩, …. The first vector of the orthonormal basis is obtained by normalizing |I⟩. Clearly ⟨1|1⟩ = 1. As for the second vector in the basis, consider

|2′⟩ = |II⟩ − |1⟩⟨1|II⟩

which is |II⟩ minus the part pointing along the first unit vector. (Think of the arrow example as you read on.) Not surprisingly it is orthogonal to the latter:

⟨1|2′⟩ = ⟨1|II⟩ − ⟨1|1⟩⟨1|II⟩ = 0

1.5. Linear Operators

An operator Ω is an instruction for transforming any given vector |V⟩ into another vector |V′⟩:

Ω|V⟩ = |V′⟩    (1.5.1)

One says that the operator Ω has transformed the ket |V⟩ into the ket |V′⟩. We will restrict our attention throughout to operators Ω that do not take us out of the vector space, i.e., if |V⟩ is an element of a space 𝕍, so is |V′⟩ = Ω|V⟩. Operators can also act on bras:

⟨V′|Ω = ⟨V″|    (1.5.2)

We will only be concerned with linear operators, i.e., ones that obey the following rules:

Ωα|Vᵢ⟩ = αΩ|Vᵢ⟩    (1.5.3a)
Ω{α|Vᵢ⟩ + β|Vⱼ⟩} = αΩ|Vᵢ⟩ + βΩ|Vⱼ⟩    (1.5.3b)
⟨Vᵢ|αΩ = ⟨Vᵢ|Ωα    (1.5.4a)
(⟨Vᵢ|α + ⟨Vⱼ|β)Ω = α⟨Vᵢ|Ω + β⟨Vⱼ|Ω    (1.5.4b)

The simplest operator is the identity operator, I, which carries the instruction: Leave the vector alone! Thus,

I|V⟩ = |V⟩ for all kets |V⟩    (1.5.5)

and

⟨V|I = ⟨V| for all bras ⟨V|    (1.5.6)

We next pass on to a more interesting operator on 𝕍³(R):

R(½πî) ↔ Rotate vector by ½π about the unit vector î

[More generally, R(θ) stands for a rotation by an angle θ = |θ| about the axis parallel to the unit vector θ̂ = θ/θ.] Let us consider the action of this operator on the three unit vectors î, ĵ, and k̂, which in our notation will be denoted by |1⟩, |2⟩, and |3⟩ (see Fig. 1.3). From the figure it is clear that

R(½πî)|1⟩ = |1⟩    (1.5.7a)
R(½πî)|2⟩ = |3⟩    (1.5.7b)
R(½πî)|3⟩ = −|2⟩    (1.5.7c)

Clearly R(½πî) is linear.
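Anticipating the matrix representation that will be derived in Section 1.6, the action of R(½πî) and its linearity can be checked numerically (NumPy sketch of ours; the |1⟩, |2⟩, |3⟩ basis is represented by the standard columns):

```python
# Sketch: R(pi/2 about x-hat) on the |1>,|2>,|3> basis, Eqs. (1.5.7a)-(1.5.7c).
import numpy as np

R = np.array([[1.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])
k1, k2, k3 = np.eye(3)            # columns for |1>, |2>, |3>

fixes_1 = np.allclose(R @ k1, k1)          # R|1> = |1>
maps_2_to_3 = np.allclose(R @ k2, k3)      # R|2> = |3>
maps_3_to_minus_2 = np.allclose(R @ k3, -k2)  # R|3> = -|2>
linear = np.allclose(R @ (k2 + k3), R @ k2 + R @ k3)  # R[|2>+|3>] = R|2>+R|3>
```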
For instance, it is clear from the same figure that R[|2⟩ + |3⟩] = R|2⟩ + R|3⟩. □

The nice feature of linear operators is that once their action on the basis vectors is known, their action on any vector in the space is determined. If

Ω|i⟩ = |i′⟩

for a basis |1⟩, |2⟩, …, |n⟩ in 𝕍ⁿ, then for any |V⟩ = ∑ᵢ vᵢ|i⟩

Ω|V⟩ = Ω ∑ᵢ vᵢ|i⟩ = ∑ᵢ vᵢΩ|i⟩ = ∑ᵢ vᵢ|i′⟩    (1.5.8)

This is the case in the example Ω = R(½πî). If |V⟩ = v₁|1⟩ + v₂|2⟩ + v₃|3⟩ is any vector, then

R|V⟩ = v₁R|1⟩ + v₂R|2⟩ + v₃R|3⟩ = v₁|1⟩ + v₂|3⟩ − v₃|2⟩

The product of two operators stands for the instruction that the instructions corresponding to the two operators be carried out in sequence:

ΛΩ|V⟩ = Λ(Ω|V⟩) = Λ|ΩV⟩    (1.5.9)

where |ΩV⟩ is the ket obtained by the action of Ω on |V⟩. The order of the operators in a product is very important: in general,

ΩΛ − ΛΩ ≡ [Ω, Λ]

called the commutator of Ω and Λ, isn't zero. For example, R(½πî) and R(½πĵ) do not commute, i.e., their commutator is nonzero.

Two useful identities involving commutators are

[Ω, ΛΘ] = Λ[Ω, Θ] + [Ω, Λ]Θ    (1.5.10)
[ΛΩ, Θ] = Λ[Ω, Θ] + [Λ, Θ]Ω    (1.5.11)

Notice that apart from the emphasis on ordering, these rules resemble the chain rule in calculus for the derivative of a product.

The inverse of Ω, denoted by Ω⁻¹, satisfies‡

ΩΩ⁻¹ = Ω⁻¹Ω = I    (1.5.12)

Not every operator has an inverse. The condition for the existence of the inverse is given in Appendix A.1. The operator R(½πî) has an inverse: it is R(−½πî). The inverse of a product of operators is the product of the inverses in reverse:

(ΩΛ)⁻¹ = Λ⁻¹Ω⁻¹    (1.5.13)

for only then do we have

(ΩΛ)(ΩΛ)⁻¹ = (ΩΛ)(Λ⁻¹Ω⁻¹) = ΩΛΛ⁻¹Ω⁻¹ = ΩΩ⁻¹ = I

‡ In 𝕍ⁿ(C) with n finite, ΩΩ⁻¹ = I ⇔ Ω⁻¹Ω = I. Prove this using the ideas introduced toward the end of Theorem A.1.1, Appendix A.1.

1.6. Matrix Elements of Linear Operators

We are now accustomed to the idea of an abstract vector being represented in a basis by an n-tuple of numbers, called its components, in terms of which all vector operations can be carried out.
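Before moving on, the commutator identities (1.5.10)-(1.5.11) and the inverse-of-a-product rule (1.5.13) can be checked numerically on arbitrary matrices (a NumPy sketch of ours; random matrices stand in for Ω, Λ, Θ):

```python
# Sketch: checking Eqs. (1.5.10), (1.5.11), and (1.5.13) on random matrices.
import numpy as np

rng = np.random.default_rng(0)
Om, La, Th = (rng.standard_normal((3, 3)) for _ in range(3))

def comm(a, b):
    return a @ b - b @ a          # the commutator [a, b]

# [Om, La Th] = La [Om, Th] + [Om, La] Th    (1.5.10)
id1 = np.allclose(comm(Om, La @ Th), La @ comm(Om, Th) + comm(Om, La) @ Th)
# [La Om, Th] = La [Om, Th] + [La, Th] Om    (1.5.11)
id2 = np.allclose(comm(La @ Om, Th), La @ comm(Om, Th) + comm(La, Th) @ Om)
# (Om La)^-1 = La^-1 Om^-1                   (1.5.13)
inv_ok = np.allclose(np.linalg.inv(Om @ La),
                     np.linalg.inv(La) @ np.linalg.inv(Om))
```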
We shall now see that in the same manner a linear operator can be represented in a basis by a set of n² numbers, written as an n × n matrix, and called its matrix elements in that basis. Although the matrix elements, just like the vector components, are basis dependent, they facilitate the computation of all basis-independent quantities, by rendering the abstract operator more tangible.

Our starting point is the observation made earlier, that the action of a linear operator is fully specified by its action on the basis vectors. If the basis vectors suffer a change

Ω|i⟩ = |i′⟩

(where |i′⟩ is known), then any vector in this space undergoes a change that is readily calculable:

Ω|V⟩ = Ω ∑ᵢ vᵢ|i⟩ = ∑ᵢ vᵢΩ|i⟩ = ∑ᵢ vᵢ|i′⟩

When we say |i′⟩ is known, we mean that its components in the original basis,

⟨j|i′⟩ = ⟨j|Ω|i⟩ ≡ Ωⱼᵢ    (1.6.1)

are known. The n² numbers Ωⱼᵢ are the matrix elements of Ω in this basis. If

Ω|V⟩ = |V′⟩

then the components of the transformed ket |V′⟩ are expressible in terms of the Ωᵢⱼ and the components of |V⟩:

vᵢ′ = ⟨i|V′⟩ = ⟨i|Ω|V⟩ = ⟨i|Ω(∑ⱼ vⱼ|j⟩) = ∑ⱼ vⱼ⟨i|Ω|j⟩ = ∑ⱼ Ωᵢⱼvⱼ    (1.6.2)

Equation (1.6.2) can be cast in matrix form:

[v₁′]   [⟨1|Ω|1⟩  ⟨1|Ω|2⟩  …  ⟨1|Ω|n⟩] [v₁]
[v₂′] = [⟨2|Ω|1⟩     …             …  ] [v₂]
[ ⋮ ]   [   ⋮                      ⋮  ] [ ⋮]
[vₙ′]   [⟨n|Ω|1⟩     …       ⟨n|Ω|n⟩] [vₙ]    (1.6.3)

A mnemonic: the elements of the first column are simply the components of the first transformed basis vector |1′⟩ = Ω|1⟩ in the given basis. Likewise, the elements of the jth column represent the image of the jth basis vector after Ω acts on it. Convince yourself that the same matrix Ωᵢⱼ acting to the left on the row vector corresponding to any ⟨V′| gives the row vector corresponding to ⟨V″| = ⟨V′|Ω.

Example 1.6.1.
Combining our mnemonic with the fact that the operator R(½πî) has the following effect on the basis vectors:

R(½πî)|1⟩ = |1⟩
R(½πî)|2⟩ = |3⟩
R(½πî)|3⟩ = −|2⟩

we can write down the matrix that represents it in the |1⟩, |2⟩, |3⟩ basis:

R(½πî) ↔ [1 0 0; 0 0 −1; 0 1 0]    (1.6.4)

For instance, the −1 in the third column tells us that R rotates |3⟩ into −|2⟩. One may also ignore the mnemonic altogether and simply use the definition Rᵢⱼ = ⟨i|R|j⟩ to compute the matrix. □

Exercise 1.6.1. An operator Ω is given by the matrix

[0 0 1; 1 0 0; 0 1 0]

What is its action?

Let us now consider certain specific operators and see how they appear in matrix form.

(1) The Identity Operator I.

Iᵢⱼ = ⟨i|I|j⟩ = ⟨i|j⟩ = δᵢⱼ    (1.6.5)

Thus I is represented by a diagonal matrix with 1's along the diagonal. You should verify that our mnemonic gives the same result.

(2) The Projection Operators. Let us first get acquainted with projection operators. Consider the expansion of an arbitrary ket |V⟩ in a basis:

|V⟩ = ∑ᵢ₌₁ⁿ |i⟩⟨i|V⟩

In terms of the objects |i⟩⟨i|, which are linear operators, and which, by definition, act on |V⟩ to give |i⟩⟨i|V⟩, we may write the above as

|V⟩ = (∑ᵢ₌₁ⁿ |i⟩⟨i|)|V⟩    (1.6.6)

Since Eq. (1.6.6) is true for all |V⟩, the object in the brackets must be identified with the identity (operator)

I = ∑ᵢ₌₁ⁿ |i⟩⟨i| = ∑ᵢ₌₁ⁿ ℙᵢ    (1.6.7)

The object ℙᵢ = |i⟩⟨i| is called the projection operator for the ket |i⟩. Equation (1.6.7), which is called the completeness relation, expresses the identity as a sum over projection operators and will be invaluable to us. (If you think that any time spent on the identity, which seems to do nothing, is a waste of time, just wait and see.)

Consider

ℙᵢ|V⟩ = |i⟩⟨i|V⟩ = vᵢ|i⟩    (1.6.8)

Clearly ℙᵢ is linear. Notice that whatever |V⟩ is, ℙᵢ|V⟩ is a multiple of |i⟩ with a coefficient vᵢ which is the component of |V⟩ along |i⟩. Since ℙᵢ projects out the component of any ket |V⟩ along the direction |i⟩, it is called a projection operator. The completeness relation, Eq. (1.6.
7), says that the sum of the projections of a vector along all the n directions equals the vector itself. Projection operators can also act on bras in the same way:

⟨V|ℙᵢ = ⟨V|i⟩⟨i| = vᵢ*⟨i|    (1.6.9)

Projection operators corresponding to the basis vectors obey

ℙᵢℙⱼ = |i⟩⟨i|j⟩⟨j| = δᵢⱼℙⱼ    (1.6.10)

This equation tells us that (1) once ℙᵢ projects out the part of |V⟩ along |i⟩, further applications of ℙᵢ make no difference; and (2) the subsequent application of ℙⱼ (j ≠ i) will result in zero, since a vector entirely along |i⟩ cannot have a projection along a perpendicular direction |j⟩.

Figure 1.4. P_y and Pₓ are polarizers placed in the way of a beam traveling along the z axis. The action of the polarizers on the electric field E obeys the law of combination of projection operators: PᵢPⱼ = δᵢⱼPⱼ.

The following example from optics may throw some light on the discussion. Consider a beam of light traveling along the z axis and polarized in the x-y plane at an angle θ with respect to the y axis (see Fig. 1.4). If a polarizer P_y, that only admits light polarized along the y axis, is placed in the way, the projection E cos θ along the y axis is transmitted. An additional polarizer P_y placed in the way has no further effect on the beam. We may equate the action of the polarizer to that of a projection operator ℙ_y that acts on the electric field vector E. If P_y is followed by a polarizer Pₓ the beam is completely blocked. Thus the polarizers obey the equation PᵢPⱼ = δᵢⱼPⱼ, expected of projection operators.

Let us next turn to the matrix elements of ℙᵢ. There are two approaches. The first one, somewhat indirect, gives us a feeling for what kind of an object |i⟩⟨i| is. The adjoint of a general equation involving kets, bras, operators, and scalars is obtained by reversing the order of all factors and making the substitutions Ω ↔ Ω†, |⟩ ↔ ⟨|, a ↔ a*. (Of course, there is no real need to reverse the location of the scalars a except in the interest of uniformity.)

Hermitian, Anti-Hermitian, and Unitary Operators

We now turn our attention to certain special classes of operators that will play a major role in quantum mechanics.

Definition 13.
An operator Ω is Hermitian if Ω† = Ω.

Definition 14. An operator Ω is anti-Hermitian if Ω† = −Ω.

The adjoint is to an operator what the complex conjugate is to numbers. Hermitian and anti-Hermitian operators are like pure real and pure imaginary numbers. Just as every number may be decomposed into a sum of pure real and pure imaginary parts,

a = (a + a*)/2 + (a − a*)/2

we can decompose every operator into its Hermitian and anti-Hermitian parts:

Ω = (Ω + Ω†)/2 + (Ω − Ω†)/2    (1.6.18)

Exercise 1.6.2.* Given Ω and Λ are Hermitian, what can you say about (1) ΩΛ; (2) ΩΛ + ΛΩ; (3) [Ω, Λ]; and (4) i[Ω, Λ]?

Definition 15. An operator U is unitary if

UU† = I    (1.6.19)

This equation tells us that U and U† are inverses of each other. Consequently, from Eq. (1.5.12),

U†U = I    (1.6.20)

Following the analogy between operators and numbers, unitary operators are like complex numbers of unit modulus, u = e^{iθ}. Just as u*u = 1, so is U†U = I.

Exercise 1.6.3.* Show that a product of unitary operators is unitary.

Theorem 7. Unitary operators preserve the inner product between the vectors they act on.

Proof. Let

|V₁′⟩ = U|V₁⟩ and |V₂′⟩ = U|V₂⟩

Then

⟨V₂′|V₁′⟩ = ⟨UV₂|UV₁⟩ = ⟨V₂|U†U|V₁⟩ = ⟨V₂|V₁⟩    (1.6.21)

(Q.E.D.)

Unitary operators are the generalizations of rotation operators from 𝕍³(R) to 𝕍ⁿ(C), for just like rotation operators in three dimensions, they preserve the lengths of vectors and their dot products. In fact, on a real vector space, the unitarity condition becomes U⁻¹ = Uᵀ (T means transpose), which defines an orthogonal or rotation matrix. [R(½πî) is an example.]

Theorem 8. If one treats the columns of an n × n unitary matrix as components of n vectors, these vectors are orthonormal. In the same way, the rows may be interpreted as components of n orthonormal vectors.

Proof 1. According to our mnemonic, the jth column of the matrix representing U is the image of the jth basis vector after U acts on it. Since U preserves inner products, the rotated set of vectors is also orthonormal. Consider next the rows.
We now use the fact that U† is also a rotation. (How else can it neutralize U to give U†U = I?) Since the rows of U are the columns of U† (but for an overall complex conjugation, which does not affect the question of orthonormality), the result we already have for the columns of a unitary matrix tells us the rows of U are orthonormal.

Proof 2. Since U†U = I,

δᵢⱼ = ⟨i|I|j⟩ = ⟨i|U†U|j⟩ = ∑ₖ ⟨i|U†|k⟩⟨k|U|j⟩ = ∑ₖ U†ᵢₖUₖⱼ = ∑ₖ U*ₖᵢUₖⱼ    (1.6.22)

which proves the theorem for the columns. A similar result for the rows follows if we start with the equation UU† = I. Q.E.D.

Note that U†U = I and UU† = I are not independent conditions.

Exercise 1.6.4.* It is assumed that you know (1) what a determinant is, (2) that det Ωᵀ = det Ω (T denotes transpose), (3) that the determinant of a product of matrices is the product of the determinants. [If you do not, verify these properties for a two-dimensional case with Ω = [α β; γ δ], for which det Ω = αδ − βγ.] Prove that the determinant of a unitary matrix is a complex number of unit modulus.

Exercise 1.6.5.* Verify that R(½πî) is unitary (orthogonal) by examining its matrix.

Exercise 1.6.6. Verify that the following matrices are unitary:

(1/2¹ᐟ²)[1 i; i 1]    (1/2)[1+i 1−i; 1−i 1+i]

Verify that the determinant is of the form e^{iθ} in each case. Are any of the above matrices Hermitian?

1.7. Active and Passive Transformations

Suppose we subject all the vectors |V⟩ in a space to a unitary transformation

|V⟩ → U|V⟩    (1.7.1)

Under this transformation, the matrix elements of any operator Ω are modified as follows:

⟨V′|Ω|V⟩ → ⟨UV′|Ω|UV⟩ = ⟨V′|U†ΩU|V⟩    (1.7.2)

It is clear that the same change would be effected if we left the vectors alone and subjected all operators to the change

Ω → U†ΩU    (1.7.3)

The first case is called an active transformation and the second a passive transformation. The present nomenclature is in reference to the vectors: they are affected in an active transformation and left alone in the passive case.
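The equivalence of Eqs. (1.7.2) and (1.7.3), together with the unitarity of the columns (Theorem 8) and the trace invariance of Exercise 1.7.1, can be checked numerically (a NumPy sketch of ours; a random unitary is built from a QR factorization):

```python
# Sketch: active vs. passive transformations give the same matrix elements.
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(M)                 # a unitary matrix (columns orthonormal)
columns_orthonormal = np.allclose(U.conj().T @ U, np.eye(3))

Om = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
V1 = rng.standard_normal(3) + 1j * rng.standard_normal(3)
V2 = rng.standard_normal(3) + 1j * rng.standard_normal(3)

active = np.vdot(U @ V2, Om @ (U @ V1))           # <UV2|Om|UV1>, Eq. (1.7.2)
passive = np.vdot(V2, U.conj().T @ Om @ U @ V1)   # <V2|U'Om U|V1>, Eq. (1.7.3)
same_matrix_element = np.isclose(active, passive)

# Tr(Om) is unchanged by the unitary change of basis (Exercise 1.7.1):
trace_invariant = np.isclose(np.trace(Om), np.trace(U.conj().T @ Om @ U))
cyclic = np.isclose(np.trace(Om @ M), np.trace(M @ Om))   # Tr(AB) = Tr(BA)
```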
The situation is exactly the opposite from the point of view of the operators. Later we will see that the physics in quantum theory lies in the matrix elements of operators, and that active and passive transformations provide us with two equivalent ways of describing the same physical transformation.

Exercise 1.7.1.* The trace of a matrix is defined to be the sum of its diagonal matrix elements:

Tr Ω = ∑ᵢ Ωᵢᵢ

Show that
(1) Tr(ΩΛ) = Tr(ΛΩ)
(2) Tr(ΩΛΘ) = Tr(ΛΘΩ) = Tr(ΘΩΛ) (the permutations are cyclic).
(3) The trace of an operator is unaffected by a unitary change of basis |i⟩ → U|i⟩. [Equivalently, show Tr Ω = Tr(U†ΩU).]

Exercise 1.7.2. Show that the determinant of a matrix is unaffected by a unitary change of basis. [Equivalently, show det Ω = det(U†ΩU).]

1.8. The Eigenvalue Problem

Consider some linear operator Ω acting on an arbitrary nonzero ket |V⟩:

Ω|V⟩ = |V′⟩    (1.8.1)

Unless the operator happens to be a trivial one, such as the identity or its multiple, the ket will suffer a nontrivial change, i.e., |V′⟩ will not be simply related to |V⟩. So much for an arbitrary ket. Each operator, however, has certain kets of its own, called its eigenkets, on which its action is simply that of rescaling:

Ω|V⟩ = ω|V⟩    (1.8.2)

Equation (1.8.2) is an eigenvalue equation: |V⟩ is an eigenket of Ω with eigenvalue ω. In this chapter we will see how, given an operator Ω, one can systematically determine all its eigenvalues and eigenvectors. How such an equation enters physics will be illustrated by a few examples from mechanics at the end of this section, and once we get to quantum mechanics proper, it will be eigen, eigen, eigen all the way.

Example 1.8.1. To illustrate how easy the eigenvalue problem really is, we will begin with a case that will be completely solved: the case Ω = I. Since

I|V⟩ = |V⟩

for all |V⟩, we conclude that (1) the only eigenvalue of I is 1; (2) all vectors are its eigenvectors with this eigenvalue. □

Example 1.8.2.
After this unqualified success, we are encouraged to take on a slightly more difficult case: $\Omega = P_V$, the projection operator associated with a normalized ket $|V\rangle$. Clearly
(1) any ket $\alpha|V\rangle$, parallel to $|V\rangle$, is an eigenket with eigenvalue 1:
$$P_V(\alpha|V\rangle) = \alpha|V\rangle\langle V|V\rangle = \alpha|V\rangle$$
(2) any ket $\alpha|V_\perp\rangle$, perpendicular to $|V\rangle$, is an eigenket with eigenvalue 0:
$$P_V(\alpha|V_\perp\rangle) = \alpha|V\rangle\langle V|V_\perp\rangle = 0 = 0\cdot\alpha|V_\perp\rangle$$
(3) kets that are neither, i.e., of the form $\alpha|V\rangle + \beta|V_\perp\rangle$, are not eigenkets:
$$P_V(\alpha|V\rangle + \beta|V_\perp\rangle) = \alpha|V\rangle \neq \gamma(\alpha|V\rangle + \beta|V_\perp\rangle)$$
Since every ket in the space falls into one of the above classes, we have found all the eigenvalues and eigenvectors. □

Example 1.8.3. Consider now the operator $R(\tfrac{1}{2}\pi \mathbf{i})$. We already know that it has one eigenket, the basis vector $|1\rangle$ along the $x$ axis:
$$R(\tfrac{1}{2}\pi \mathbf{i})|1\rangle = |1\rangle$$
Are there others? Of course, any vector $\alpha|1\rangle$ along the $x$ axis is also unaffected by the $x$ rotation. This is a general feature of the eigenvalue equation and reflects the linearity of the operator: if
$$\Omega|V\rangle = \omega|V\rangle$$
then
$$\Omega\alpha|V\rangle = \alpha\Omega|V\rangle = \alpha\omega|V\rangle = \omega\alpha|V\rangle$$
for any multiple $\alpha$. Since the eigenvalue equation fixes the eigenvector only up to an overall scale factor, we will not treat the multiples of an eigenvector as distinct eigenvectors. With this understanding in mind, let us ask if $R(\tfrac{1}{2}\pi \mathbf{i})$ has any eigenvectors besides $|1\rangle$. Our intuition says no, for any vector not along the $x$ axis necessarily gets rotated by $R(\tfrac{1}{2}\pi \mathbf{i})$ and cannot possibly transform into a multiple of itself. Since every vector is either parallel to $|1\rangle$ or isn't, we have fully solved the eigenvalue problem.

The trouble with this conclusion is that it is wrong! $R(\tfrac{1}{2}\pi \mathbf{i})$ has two other eigenvectors besides $|1\rangle$. But our intuition is not to be blamed, for these vectors are in $\mathbb{V}^3(C)$ and not $\mathbb{V}^3(R)$. It is clear from this example that we need a reliable and systematic method for solving the eigenvalue problem in $\mathbb{V}^n(C)$. We now turn our attention to this very question. □

The Characteristic Equation and the Solution to the Eigenvalue Problem

We begin by rewriting Eq.
(1.8.2) as
$$(\Omega - \omega I)|V\rangle = |0\rangle \qquad (1.8.3)$$
Operating on both sides with $(\Omega - \omega I)^{-1}$, assuming it exists, we get
$$|V\rangle = (\Omega - \omega I)^{-1}|0\rangle \qquad (1.8.4)$$
Now, any finite operator (an operator with finite matrix elements) acting on the null vector can only give us a null vector. It therefore seems that in asking for a nonzero eigenvector $|V\rangle$, we are trying to get something for nothing out of Eq. (1.8.4). This is impossible. It follows that our assumption that the operator $(\Omega - \omega I)^{-1}$ exists (as a finite operator) is false. So we ask when this situation will obtain. Basic matrix theory tells us (see Appendix A.1) that the inverse of any matrix $M$ is given by
$$M^{-1} = \frac{\text{cofactor } M^T}{\det M} \qquad (1.8.5)$$
Now the cofactor of $M$ is finite if $M$ is. Thus what we need is the vanishing of the determinant. The condition for nonzero eigenvectors is therefore
$$\det(\Omega - \omega I) = 0 \qquad (1.8.6)$$
This equation will determine the eigenvalues $\omega$. To find them, we project Eq. (1.8.3) onto a basis; dotting both sides with a basis bra and expanding the determinant condition turns Eq. (1.8.6) into an $n$th-order polynomial equation in $\omega$, the characteristic equation, whose $n$ roots (counted with multiplicity) are the eigenvalues. For the Hermitian operators of greatest interest to us, two results then hold. Theorem 9: the eigenvalues of a Hermitian operator are real. Theorem 10: to every Hermitian operator $\Omega$ there corresponds (at least) one orthonormal basis of eigenvectors, in which $\Omega$ is diagonal with its eigenvalues as the diagonal entries.

To prove Theorem 10, let $\omega_1$ be a root of the characteristic equation; there exists at least one nonzero eigenvector $|\omega_1\rangle$ with this eigenvalue. [If not, Theorem (A.1.1) would imply that $(\Omega - \omega_1 I)$ is invertible.] Consider the subspace $\mathbb{V}^{n-1}_{\perp 1}$ of all vectors orthogonal to $|\omega_1\rangle$. Let us choose as our basis the vector $|\omega_1\rangle$ (normalized to unity) and any $n-1$ orthonormal vectors $\{|V^1_\perp\rangle, |V^2_\perp\rangle, \ldots, |V^{n-1}_\perp\rangle\}$ in $\mathbb{V}^{n-1}_{\perp 1}$. In this basis $\Omega$ has the following form:
$$\Omega \leftrightarrow \begin{bmatrix} \omega_1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & \text{(boxed submatrix)} & \\ 0 & & & \end{bmatrix} \qquad (1.8.12)$$
The first column is just the image of $|\omega_1\rangle$ after $\Omega$ has acted on it. Given the first column, the first row follows from the Hermiticity of $\Omega$. The characteristic equation now takes the form
$$(\omega_1 - \omega)\cdot(\text{determinant of boxed submatrix}) = 0$$
$$(\omega_1 - \omega)\sum_{m=0}^{n-1} c_m\omega^m = (\omega_1 - \omega)P^{n-1}(\omega) = 0$$
Now the polynomial $P^{n-1}$ must also generate one root, $\omega_2$, and a normalized eigenvector $|\omega_2\rangle$. Define the subspace $\mathbb{V}^{n-2}_{\perp 1,2}$ of vectors in $\mathbb{V}^{n-1}_{\perp 1}$ orthogonal to $|\omega_2\rangle$ (and automatically to $|\omega_1\rangle$) and repeat the same procedure as before. Finally, the matrix $\Omega$ becomes, in the basis $|\omega_1\rangle, |\omega_2\rangle, \ldots, |\omega_n\rangle$,
$$\Omega \leftrightarrow \begin{bmatrix} \omega_1 & 0 & \cdots & 0 \\ 0 & \omega_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \omega_n \end{bmatrix}$$
Since every $|\omega_i\rangle$ was chosen from a space that was orthogonal to the previous ones, $|\omega_1\rangle, |\omega_2\rangle, \ldots, |\omega_{i-1}\rangle$, the basis of eigenvectors is orthonormal. (Notice that nowhere did we have to assume that the eigenvalues were all distinct.) Q.E.D.

[The analogy between real numbers and Hermitian operators is further strengthened by the fact that in a certain basis (of eigenvectors) the Hermitian operator can be represented by a matrix with all real elements.]

In stating Theorem 10, it was indicated that there might exist more than one basis of eigenvectors that diagonalizes $\Omega$. This happens if there is any degeneracy. Suppose $\omega_1 = \omega_2 = \omega$. Then we have two orthonormal vectors obeying
$$\Omega|\omega_1\rangle = \omega|\omega_1\rangle, \qquad \Omega|\omega_2\rangle = \omega|\omega_2\rangle$$
It follows that
$$\Omega(\alpha|\omega_1\rangle + \beta|\omega_2\rangle) = \omega(\alpha|\omega_1\rangle + \beta|\omega_2\rangle)$$
for any $\alpha$ and $\beta$. Since the vectors $|\omega_1\rangle$ and $|\omega_2\rangle$ are orthogonal (and hence linearly independent), we find that there is a whole two-dimensional subspace spanned by $|\omega_1\rangle$ and $|\omega_2\rangle$, the elements of which are eigenvectors of $\Omega$ with eigenvalue $\omega$. One refers to this space as an eigenspace of $\Omega$ with eigenvalue $\omega$. Besides the vectors $|\omega_1\rangle$ and $|\omega_2\rangle$, there exists an infinity of orthonormal pairs $|\omega_1'\rangle, |\omega_2'\rangle$, obtained by a rigid rotation of $|\omega_1\rangle, |\omega_2\rangle$, from which we may select any pair in forming the eigenbasis of $\Omega$.

In general, if an eigenvalue occurs $m_i$ times, that is, if the characteristic equation has $m_i$ of its roots equal to some $\omega_i$, there will be an eigenspace $\mathbb{V}^{m_i}_{\omega_i}$ from which we may choose any $m_i$ orthonormal vectors to form the basis referred to in Theorem 10.

In the absence of degeneracy, we can prove Theorems 9 and 10 very easily. Let us begin with two eigenvectors:
$$\Omega|\omega_i\rangle = \omega_i|\omega_i\rangle \qquad (1.8.13a)$$
$$\Omega|\omega_j\rangle = \omega_j|\omega_j\rangle \qquad (1.8.13b)$$
Dotting the first with $\langle\omega_j|$ and the second with $\langle\omega_i|$, we get
$$\langle\omega_j|\Omega|\omega_i\rangle = \omega_i\langle\omega_j|\omega_i\rangle \qquad (1.8.14a)$$
$$\langle\omega_i|\Omega|\omega_j\rangle = \omega_j\langle\omega_i|\omega_j\rangle \qquad (1.8.14b)$$
Taking the adjoint of the last equation and using the Hermitian nature of $\Omega$, we get
$$\langle\omega_j|\Omega|\omega_i\rangle = \omega_j^*\langle\omega_j|\omega_i\rangle$$
Subtracting this equation from Eq. (1.8.14a), we get
$$0 = (\omega_i - \omega_j^*)\langle\omega_j|\omega_i\rangle \qquad (1.8.15)$$
If $i = j$, we get, since $\langle\omega_i|\omega_i\rangle \neq 0$,
$$\omega_i = \omega_i^* \qquad (1.8.16)$$
If $i \neq j$, we get
$$\langle\omega_j|\omega_i\rangle = 0 \qquad (1.8.17)$$
since $\omega_i - \omega_j^* = \omega_i - \omega_j \neq 0$ by assumption.
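Theorems 9 and 10 can be illustrated numerically. The sketch below (not from the text; it assumes NumPy, and the Hermitian matrix is an arbitrary example) checks that the roots of $\det(\Omega - \omega I) = 0$ are real and coincide with the eigenvalues, and that the eigenvectors form an orthonormal basis in which $\Omega$ is diagonal.

```python
import numpy as np

# An arbitrary Hermitian matrix, chosen for illustration only.
Omega = np.array([[2.0, 1.0 + 1j, 0.0],
                  [1.0 - 1j, 3.0, 1.0],
                  [0.0, 1.0, 1.0]])
assert np.allclose(Omega, Omega.conj().T)          # Hermitian

# Roots of the characteristic equation det(Omega - w I) = 0 ...
char_roots = np.sort(np.roots(np.poly(Omega)).real)

# ... agree with the eigenvalues, which Theorem 9 says are real.
w, V = np.linalg.eigh(Omega)                       # columns of V are eigenvectors
assert np.allclose(np.sort(w), char_roots)

# Theorem 10: the eigenvectors form an orthonormal basis that diagonalizes Omega.
assert np.allclose(V.conj().T @ V, np.eye(3))
assert np.allclose(V.conj().T @ Omega @ V, np.diag(w))
print("Theorems 9 and 10 verified numerically")
```

Here `np.poly` builds the characteristic polynomial and `np.roots` solves it, mirroring the route through Eq. (1.8.6), while `eigh` plays the role of the constructive proof of Theorem 10.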
That the proof of orthogonality breaks down for $\omega_i = \omega_j$ is not surprising, for two vectors labeled by a degenerate eigenvalue could be any two members of the degenerate eigenspace, which need not be orthogonal. The modification of this proof in the case of degeneracy calls for arguments that are essentially the ones used in proving Theorem 10. The advantage in the way Theorem 10 was proved first is that it suffers no modification in the degenerate case.

Degeneracy

We now address the question of degeneracy as promised earlier. Our general analysis of Theorem 10 showed us that in the face of degeneracy, we have not one, but an infinity of orthonormal eigenbases. Let us see through an example how this variety manifests itself when we look for eigenvectors and how it is to be handled.

Example 1.8.5. Consider an operator $\Omega$ whose matrix elements in some basis are
$$\Omega \leftrightarrow \begin{bmatrix} 1 & 0 & 1 \\ 0 & 2 & 0 \\ 1 & 0 & 1 \end{bmatrix}$$
The characteristic equation is
$$-\omega(\omega - 2)^2 = 0$$
i.e.,
$$\omega = 0, 2, 2$$
The vector corresponding to $\omega = 0$ is found by the usual means to be
$$|\omega = 0\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$$
The case $\omega = 2$ leads to the following equations for the components of the eigenvector:
$$-x_1 + x_3 = 0$$
$$0 = 0$$
Now we have just one equation, instead of the two ($n - 1$) we have grown accustomed to! This is a reflection of the degeneracy. For every extra appearance (besides the first) a root makes, it takes away one equation. Thus degeneracy permits us extra degrees of freedom besides the usual one (of normalization). The conditions
$$x_1 = x_3, \qquad x_2 \text{ arbitrary}$$
define an ensemble of vectors that are perpendicular to the first, $|\omega = 0\rangle$, i.e., that lie in a plane perpendicular to $|\omega = 0\rangle$. This is in agreement with our expectation that a twofold degeneracy should lead to a two-dimensional eigenspace. The freedom in $x_2$ (or more precisely, in the ratio $x_2/x_3$) corresponds to the freedom of orientation in this plane.
Let us arbitrarily choose $x_2 = 1$ (and $x_1 = x_3 = 0$), to get a normalized eigenvector corresponding to $\omega = 2$:
$$|\omega = 2, \text{first}\rangle \leftrightarrow \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$
The third vector is now chosen to lie in this plane and to be orthogonal to the second (being in this plane automatically makes it perpendicular to the first, $|\omega = 0\rangle$):
$$|\omega = 2, \text{second}\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$
Clearly each distinct choice of the ratio $x_2/x_3$ gives us a distinct doublet of orthonormal eigenvectors with eigenvalue 2. □

Notice that in the face of degeneracy, $|\omega_i\rangle$ no longer refers to a single ket but to a generic element of the eigenspace $\mathbb{V}^{m_i}_{\omega_i}$. To refer to a particular element, we must use the symbol $|\omega_i, \alpha\rangle$, where $\alpha$ labels the ket within the eigenspace. A natural choice of the label $\alpha$ will be discussed shortly.

We now consider the analogs of Theorems 9 and 10 for unitary operators.

Theorem 11. The eigenvalues of a unitary operator are complex numbers of unit modulus.

Theorem 12. The eigenvectors of a unitary operator are mutually orthogonal. (We assume there is no degeneracy.)

Proof of Both Theorems (assuming no degeneracy). Let
$$U|u_i\rangle = u_i|u_i\rangle \qquad (1.8.18a)$$
and
$$U|u_j\rangle = u_j|u_j\rangle \qquad (1.8.18b)$$
If we take the adjoint of the second equation and dot each side with the corresponding side of the first equation, we get
$$\langle u_j|U^\dagger U|u_i\rangle = u_j^* u_i\langle u_j|u_i\rangle$$
so that
$$\langle u_j|u_i\rangle(1 - u_j^* u_i) = 0 \qquad (1.8.19)$$
If $i = j$, we get, since $\langle u_i|u_i\rangle \neq 0$,
$$u_i^* u_i = 1 \qquad (1.8.20a)$$
while if $i \neq j$,
$$\langle u_j|u_i\rangle = 0 \qquad (1.8.20b)$$
since $|u_i\rangle \neq |u_j\rangle \Rightarrow u_i \neq u_j \Rightarrow u_j^* u_i \neq u_j^* u_j \Rightarrow u_j^* u_i \neq 1$. (Q.E.D.)

If $U$ is degenerate, we can carry out an analysis parallel to that for the Hermitian operator $\Omega$, with just one difference. Whereas in Eq. (1.8.12) the zeros of the first row followed from the zeros of the first column and $\Omega^\dagger = \Omega$, here they follow from the requirement that the sum of the modulus squared of the elements in each row adds up to 1: since $|u_1| = 1$, all the other elements in the first row must vanish.

Diagonalization of Hermitian Matrices

Consider a Hermitian operator $\Omega$ on $\mathbb{V}^n(C)$ represented as a matrix in some orthonormal basis $|1\rangle, \ldots, |i\rangle, \ldots, |n\rangle$. If we trade this basis for the eigenbasis $|\omega_1\rangle, \ldots, |\omega_i\rangle, \ldots$
$\ldots, |\omega_n\rangle$, the matrix representing $\Omega$ will become diagonal. Now the operator $U$ inducing the change of basis
$$|\omega_i\rangle = U|i\rangle \qquad (1.8.21)$$
is clearly unitary, for it "rotates" one orthonormal basis into another. (If you wish, you may apply our mnemonic to $U$ and verify its unitary nature: its columns contain the components of the eigenvectors $|\omega_i\rangle$, which are orthonormal.) This result is often summarized by the statement:

Every Hermitian matrix on $\mathbb{V}^n(C)$ may be diagonalized by a unitary change of basis.

We may restate this result in terms of passive transformations as follows: if $\Omega$ is a Hermitian matrix, there exists a unitary matrix $U$ (built out of the eigenvectors of $\Omega$) such that $U^\dagger\Omega U$ is diagonal. Thus the problem of finding a basis that diagonalizes $\Omega$ is equivalent to solving its eigenvalue problem.

Exercise 1.8.1. (1) Find the eigenvalues and normalized eigenvectors of the matrix
$$\Omega = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 2 & 0 \\ 0 & 1 & 4 \end{bmatrix}$$
(2) Is the matrix Hermitian? Are the eigenvectors orthogonal?

Exercise 1.8.2.* Consider the matrix
$$\Omega = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$
(1) Is it Hermitian? (2) Find its eigenvalues and eigenvectors. (3) Verify that $U^\dagger\Omega U$ is diagonal, $U$ being the matrix of eigenvectors of $\Omega$.

Exercise 1.8.3.* Consider the Hermitian matrix
$$\Omega = \frac{1}{2}\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & -1 \\ 0 & -1 & 3 \end{bmatrix}$$
(1) Show that $\omega_1 = \omega_2 = 1$; $\omega_3 = 2$. (2) Show that $|\omega = 2\rangle$ is any vector of the form
$$\frac{1}{(2a^2)^{1/2}}\begin{bmatrix} 0 \\ a \\ -a \end{bmatrix}$$
(3) Show that the $\omega = 1$ eigenspace contains all vectors of the form
$$\frac{1}{(b^2 + 2c^2)^{1/2}}\begin{bmatrix} b \\ c \\ c \end{bmatrix}$$
either by feeding $\omega = 1$ into the equations or by requiring that the $\omega = 1$ eigenspace be orthogonal to $|\omega = 2\rangle$.

Exercise 1.8.4. An arbitrary $n \times n$ matrix need not have $n$ eigenvectors. Consider as an example
$$\Omega = \begin{bmatrix} 4 & 1 \\ -1 & 2 \end{bmatrix}$$
(1) Show that $\omega_1 = \omega_2 = 3$. (2) By feeding in this value show that we get only one eigenvector of the form
$$\frac{1}{(2a^2)^{1/2}}\begin{bmatrix} a \\ -a \end{bmatrix}$$
We cannot find another one that is linearly independent.

Exercise 1.8.5.* Consider the matrix
$$\Omega = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$
(1) Show that it is unitary. (2) Show that its eigenvalues are $e^{i\theta}$ and $e^{-i\theta}$. (3) Find the corresponding eigenvectors; show that they are orthogonal.
(4) Verify that $U^\dagger\Omega U = (\text{diagonal matrix})$, where $U$ is the matrix of eigenvectors of $\Omega$.

Exercise 1.8.6.* (1) We have seen that the determinant of a matrix is unchanged under a unitary change of basis. Argue now that
$$\det\Omega = \text{product of eigenvalues of }\Omega = \prod_{i=1}^{n}\omega_i$$
for a Hermitian or unitary $\Omega$. (2) Using the invariance of the trace under the same transformation, show that
$$\operatorname{Tr}\Omega = \sum_{i=1}^{n}\omega_i$$

Exercise 1.8.7. By using the results on the trace and determinant from the last problem, show that the eigenvalues of the matrix
$$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$$
are 3 and $-1$. Verify this by explicit computation. Note that the Hermitian nature of the matrix is an essential ingredient.

Exercise 1.8.8.* Consider Hermitian matrices $M^1, M^2, M^3, M^4$ that obey
$$M^i M^j + M^j M^i = 2\delta_{ij}I, \qquad i, j = 1, \ldots, 4$$
(1) Show that the eigenvalues of $M^i$ are $\pm 1$. (Hint: go to the eigenbasis of $M^i$, and use the equation for $i = j$.) (2) By considering the relation $M^i M^j = -M^j M^i$ for $i \neq j$, show that the $M^i$ are traceless. [Hint: $\operatorname{Tr}(ACB) = \operatorname{Tr}(CBA)$.] (3) Show that they cannot be odd-dimensional matrices.

Exercise 1.8.9. A collection of masses $m_\alpha$, located at $\mathbf{r}_\alpha$ and rotating with angular velocity $\boldsymbol{\omega}$ around a common axis, has an angular momentum
$$\mathbf{l} = \sum_\alpha m_\alpha\,\mathbf{r}_\alpha \times \mathbf{v}_\alpha$$
where $\mathbf{v}_\alpha = \boldsymbol{\omega} \times \mathbf{r}_\alpha$ is the velocity of $m_\alpha$. By using the identity
$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B})$$
show that each Cartesian component $l_i$ of $\mathbf{l}$ is given by
$$l_i = \sum_j M_{ij}\,\omega_j$$
where
$$M_{ij} = \sum_\alpha m_\alpha\left[r_\alpha^2\delta_{ij} - (r_\alpha)_i(r_\alpha)_j\right]$$
or in Dirac notation
$$|l\rangle = M|\omega\rangle$$
(1) Will the angular momentum and angular velocity always be parallel? (2) Show that the moment of inertia matrix $M_{ij}$ is Hermitian. (3) Argue now that there exist three directions for $\boldsymbol{\omega}$ such that $\mathbf{l}$ and $\boldsymbol{\omega}$ will be parallel. How are these directions to be found? (4) Consider the moment of inertia matrix of a sphere. Due to the complete symmetry of the sphere, it is clear that every direction is its eigendirection for rotation. What does this say about the three eigenvalues of the matrix $M$?

Simultaneous Diagonalization of Two Hermitian Operators

Let us consider next the question of simultaneously diagonalizing two Hermitian operators.
Theorem 13. If $\Omega$ and $\Lambda$ are two commuting Hermitian operators, there exists (at least) a basis of common eigenvectors that diagonalizes them both.

Proof. Consider first the case where at least one of the operators is nondegenerate, i.e., to a given eigenvalue there is just one eigenvector, up to a scale. Let us assume $\Omega$ is nondegenerate. Consider any one of its eigenvectors:
$$\Omega|\omega_i\rangle = \omega_i|\omega_i\rangle$$
Since $[\Lambda, \Omega] = 0$,
$$\Omega\Lambda|\omega_i\rangle = \Lambda\Omega|\omega_i\rangle = \omega_i\Lambda|\omega_i\rangle \qquad (1.8.22)$$
i.e., $\Lambda|\omega_i\rangle$ is an eigenvector of $\Omega$ with eigenvalue $\omega_i$. Since this vector is unique up to a scale,
$$\Lambda|\omega_i\rangle = \lambda_i|\omega_i\rangle \qquad (1.8.23)$$
Thus $|\omega_i\rangle$ is also an eigenvector of $\Lambda$ with eigenvalue $\lambda_i$. Since every eigenvector of $\Omega$ is an eigenvector of $\Lambda$, it is evident that the basis $\{|\omega_i\rangle\}$ will diagonalize both operators. Since $\Omega$ is nondegenerate, there is only one basis with this property.

What if both operators are degenerate? By ordering the basis vectors such that the elements of each eigenspace are adjacent, we can get one of them, say $\Omega$, into the form (Theorem 10)
$$\Omega \leftrightarrow \mathrm{diag}(\underbrace{\omega_1, \ldots, \omega_1}_{m_1}, \underbrace{\omega_2, \ldots, \omega_2}_{m_2}, \ldots)$$
Now this basis is not unique: in every eigenspace $\mathbb{V}^{m_i}_{\omega_i}$ corresponding to the eigenvalue $\omega_i$, there exists an infinity of bases. Let us arbitrarily pick in $\mathbb{V}^{m_i}_{\omega_i}$ a set $|\omega_i, \alpha\rangle$, where the additional label $\alpha$ runs from 1 to $m_i$. How does $\Lambda$ appear in this basis? Although we made no special efforts to get $\Lambda$ into a simple form, it already has a simple form by virtue of the fact that it commutes with $\Omega$. Let us start by mimicking the proof in the nondegenerate case:
$$\Omega\Lambda|\omega_i, \alpha\rangle = \Lambda\Omega|\omega_i, \alpha\rangle = \omega_i\Lambda|\omega_i, \alpha\rangle$$
However, due to the degeneracy of $\Omega$, we can only conclude that $\Lambda|\omega_i, \alpha\rangle$ lies in $\mathbb{V}^{m_i}_{\omega_i}$. Now, since vectors from different eigenspaces are orthogonal [Eq. (1.8.15)],
$$\langle\omega_j, \beta|\Lambda|\omega_i, \alpha\rangle = 0$$
if $|\omega_i, \alpha\rangle$ and $|\omega_j, \beta\rangle$ are basis vectors such that $\omega_i \neq \omega_j$. Consequently, in this basis,
$$\Lambda \leftrightarrow \begin{bmatrix} \Lambda_1 & & \\ & \Lambda_2 & \\ & & \ddots \end{bmatrix}$$
which is called a block diagonal matrix for obvious reasons. The block diagonal form of $\Lambda$ reflects the fact that when $\Lambda$ acts on some element $|\omega_i, \alpha\rangle$ of the eigenspace $\mathbb{V}^{m_i}_{\omega_i}$, it turns it into another element of $\mathbb{V}^{m_i}_{\omega_i}$.
Within each subspace $i$, $\Lambda$ is given by a matrix $\Lambda_i$, which appears as a block in the equation above. Consider the matrix $\Lambda_i$ in $\mathbb{V}^{m_i}_{\omega_i}$. It is Hermitian since $\Lambda$ is. It can obviously be diagonalized by trading the basis $|\omega_i, 1\rangle, |\omega_i, 2\rangle, \ldots, |\omega_i, m_i\rangle$ in $\mathbb{V}^{m_i}_{\omega_i}$ that we started with for the eigenbasis of $\Lambda_i$. Let us make such a change of basis in each eigenspace, thereby rendering $\Lambda$ diagonal. Meanwhile, what of $\Omega$? It remains diagonal, of course, since it is indifferent to the choice of orthonormal basis in each degenerate eigenspace. If the eigenvalues of $\Lambda_i$ are $\lambda_i^{(1)}, \lambda_i^{(2)}, \ldots, \lambda_i^{(m_i)}$, then we end up with
$$\Lambda \leftrightarrow \mathrm{diag}(\lambda_1^{(1)}, \ldots, \lambda_1^{(m_1)}, \lambda_2^{(1)}, \ldots, \lambda_2^{(m_2)}, \ldots)$$
Q.E.D.

If $\Lambda$ is not degenerate within any given subspace, i.e., $\lambda_i^{(k)} \neq \lambda_i^{(l)}$ for any $k \neq l$ and any $i$, the basis we end up with is unique: the freedom $\Omega$ gave us in each eigenspace is fully eliminated by $\Lambda$. The elements of this basis may be named uniquely by the pair of indices $\omega$ and $\lambda$ as $|\omega, \lambda\rangle$, with $\lambda$ playing the role of the extra label $\alpha$. If $\Lambda$ is degenerate within an eigenspace of $\Omega$, if, say, $\lambda_i^{(1)} = \lambda_i^{(2)}$, there is a two-dimensional eigenspace from which we can choose any two orthonormal vectors for the common basis. It is then necessary to bring in a third operator $\Gamma$ that commutes with both $\Omega$ and $\Lambda$ and that is nondegenerate in this subspace. In general, one can always find, for finite $n$, a set of operators $\{\Omega, \Lambda, \Gamma, \ldots\}$ that commute with each other and that nail down a unique, common eigenbasis, the elements of which may be labeled unambiguously as $|\omega, \lambda, \gamma, \ldots\rangle$. In our study of quantum mechanics it will be assumed that such a complete set of commuting operators exists if $n$ is infinite.

Exercise 1.8.10.* By considering the commutator, show that the following Hermitian matrices may be simultaneously diagonalized. Find the eigenvectors common to both and verify that under a unitary transformation to this basis, both matrices are diagonalized:
$$\Omega = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \qquad \Lambda = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 0 & -1 \\ 1 & -1 & 2 \end{bmatrix}$$
Since $\Omega$ is degenerate and $\Lambda$ is not, you must be prudent in deciding which matrix dictates the choice of basis.
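Theorem 13 can be previewed numerically. In the sketch below (NumPy assumed; the matrix pair is an assumed reconstruction of the garbled exercise matrices, chosen so that they commute, with $\Omega$ degenerate and $\Lambda$ not), the nondegenerate $\Lambda$ dictates the basis, and the same unitary is seen to diagonalize $\Omega$ as well.

```python
import numpy as np

# Assumed pair of commuting Hermitian matrices (Omega degenerate, Lam not).
Omega = np.array([[1.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0],
                  [1.0, 0.0, 1.0]])
Lam = np.array([[2.0, 1.0, 1.0],
                [1.0, 0.0, -1.0],
                [1.0, -1.0, 2.0]])

# They commute, so Theorem 13 guarantees a common eigenbasis.
assert np.allclose(Omega @ Lam, Lam @ Omega)

# Omega is degenerate (eigenvalues 0, 0, 2) while Lam is not, so it is Lam
# that dictates the basis: its eigenvectors are unique up to scale.
wL, V = np.linalg.eigh(Lam)

# The same unitary diagonalizes both operators.
assert np.allclose(V.T @ Lam @ V, np.diag(wL))
D = V.T @ Omega @ V
assert np.allclose(D, np.diag(np.diag(D)))        # off-diagonal parts vanish
assert np.allclose(np.sort(np.diag(D)), [0.0, 0.0, 2.0])
print("common eigenbasis found")
```

Had we diagonalized the degenerate $\Omega$ first, a generic eigensolver could have returned vectors inside the $\omega = 0$ eigenspace that fail to diagonalize $\Lambda$, which is exactly the "prudence" the exercise asks for.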
Example 1.8.6. We will now discuss, in some detail, the complete solution to a problem in mechanics. It is important that you understand this example thoroughly, for it not only illustrates the use of the mathematical techniques developed in this chapter but also contains the main features of the central problem in quantum mechanics.

The mechanical system in question is depicted in Fig. 1.5. The two masses $m$ are coupled to each other and to the walls by springs of force constant $k$. If $x_1$ and $x_2$ measure the displacements of the masses from their equilibrium points, these coordinates obey the following equations, derived through an elementary application of Newton's laws:
$$\ddot{x}_1 = -\frac{2k}{m}x_1 + \frac{k}{m}x_2 \qquad (1.8.24a)$$
$$\ddot{x}_2 = \frac{k}{m}x_1 - \frac{2k}{m}x_2 \qquad (1.8.24b)$$
Figure 1.5. The coupled mass problem. All masses are $m$, all spring constants are $k$, and the displacements of the masses from equilibrium are $x_1$ and $x_2$.

The problem is to find $x_1(t)$ and $x_2(t)$ given the initial-value data, which in this case consist of the initial positions and velocities. If we restrict ourselves to the case of zero initial velocities, our problem is to find $x_1(t)$ and $x_2(t)$, given $x_1(0)$ and $x_2(0)$. In what follows, we will formulate the problem in the language of linear vector spaces and solve it using the machinery developed in this chapter.

As a first step, we rewrite Eq. (1.8.24) in matrix form:
$$\begin{bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{bmatrix} = \begin{bmatrix} \Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \qquad (1.8.25a)$$
where the elements of the Hermitian matrix $\Omega_{ij}$ are
$$\Omega_{11} = \Omega_{22} = -2k/m, \qquad \Omega_{12} = \Omega_{21} = k/m \qquad (1.8.25b)$$
We now view $x_1$ and $x_2$ as components of an abstract vector $|x\rangle$, and $\Omega_{ij}$ as the matrix elements of a Hermitian operator $\Omega$. Since the vector $|x\rangle$ has two real components, it is an element of $\mathbb{V}^2(R)$, and $\Omega$ is a Hermitian operator on $\mathbb{V}^2(R)$. The abstract form of Eq. (1.8.25a) is
$$|\ddot{x}(t)\rangle = \Omega|x(t)\rangle \qquad (1.8.26)$$
Equation (1.8.25a) is obtained by projecting Eq.
(1.8.26) on the basis vectors $|1\rangle, |2\rangle$, which have the following physical significance:
$$|1\rangle \leftrightarrow \begin{bmatrix} 1 \\ 0 \end{bmatrix} \leftrightarrow \begin{bmatrix} \text{first mass displaced by unity} \\ \text{second mass undisplaced} \end{bmatrix} \qquad (1.8.27a)$$
$$|2\rangle \leftrightarrow \begin{bmatrix} 0 \\ 1 \end{bmatrix} \leftrightarrow \begin{bmatrix} \text{first mass undisplaced} \\ \text{second mass displaced by unity} \end{bmatrix} \qquad (1.8.27b)$$
An arbitrary state, in which the masses are displaced by $x_1$ and $x_2$, is given in this basis by
$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1\begin{bmatrix} 1 \\ 0 \end{bmatrix} + x_2\begin{bmatrix} 0 \\ 1 \end{bmatrix} \qquad (1.8.28)$$
The abstract counterpart of the above equation is
$$|x\rangle = x_1|1\rangle + x_2|2\rangle \qquad (1.8.29)$$
It is in this $|1\rangle, |2\rangle$ basis that $\Omega$ is represented by the matrix appearing in Eq. (1.8.25), with elements $-2k/m$, $k/m$, etc.

The basis $|1\rangle, |2\rangle$ is very desirable physically, for the components of $|x\rangle$ in this basis ($x_1$ and $x_2$) have a simple interpretation as displacements of the masses. However, from the standpoint of finding a mathematical solution to the initial-value problem, it is not so desirable, for the components $x_1$ and $x_2$ obey the coupled differential equations (1.8.24a) and (1.8.24b). The coupling is mediated by the off-diagonal matrix elements $\Omega_{12} = \Omega_{21} = k/m$. Having identified the problem with the $|1\rangle, |2\rangle$ basis, we can now see how to get around it: we must switch to a basis in which $\Omega$ is diagonal. The components of $|x\rangle$ in this basis will then obey uncoupled differential equations, which may be readily solved. Having found the solution, we can return to the physically preferable $|1\rangle, |2\rangle$ basis. This, then, is our broad strategy, and we now turn to the details.

From our study of Hermitian operators we know that the basis that diagonalizes $\Omega$ is the basis of its normalized eigenvectors. Let $|I\rangle$ and $|II\rangle$ be its eigenvectors, defined by
$$\Omega|I\rangle = -\omega_I^2|I\rangle \qquad (1.8.30a)$$
$$\Omega|II\rangle = -\omega_{II}^2|II\rangle \qquad (1.8.30b)$$
We are departing here from our usual notation: the eigenvalue of $\Omega$ is written as $-\omega^2$ rather than as $\omega$, in anticipation of the fact that $\Omega$ has eigenvalues of the form $-\omega^2$, with $\omega$ real. We are also using the symbols $|I\rangle$ and $|II\rangle$ to denote what should be called $|-\omega_I^2\rangle$ and $|-\omega_{II}^2\rangle$ in our convention.
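The eigenvalue problem posed in Eq. (1.8.30) is easy to check numerically. The sketch below (not from the text; NumPy assumed, with the illustrative choice $k = m = 1$) confirms that the eigenvalues of $\Omega$ are indeed negative, so the frequencies $\omega_I$, $\omega_{II}$ are real, and that the eigenvectors are the expected symmetric and antisymmetric combinations.

```python
import numpy as np

k = m = 1.0          # illustrative values; the frequencies scale as sqrt(k/m)
Omega = np.array([[-2*k/m,  k/m],
                  [ k/m, -2*k/m]])

# Eigenvalues are -omega^2; Eq. (1.8.30) anticipates omega_I, omega_II real.
lam, V = np.linalg.eigh(Omega)
assert np.all(lam < 0)
freqs = np.sqrt(-lam)                  # [omega_II, omega_I] in ascending-lam order

assert np.allclose(np.sort(freqs), [np.sqrt(k/m), np.sqrt(3*k/m)])

# The normal-mode vectors are (1,1)/sqrt(2) and (1,-1)/sqrt(2), up to sign.
for col in V.T:
    assert np.isclose(abs(col @ np.array([1, 1]) / np.sqrt(2)), 1.0) or \
           np.isclose(abs(col @ np.array([1, -1]) / np.sqrt(2)), 1.0)
print("normal-mode frequencies:", np.sort(freqs))
```

With $k = m = 1$ the two frequencies come out as $1$ and $\sqrt{3}$, matching $(k/m)^{1/2}$ and $(3k/m)^{1/2}$.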
It is a simple exercise (which you should perform) to solve the eigenvalue problem of $\Omega$ in the $|1\rangle, |2\rangle$ basis (in which the matrix elements of $\Omega$ are known) and to obtain
$$\omega_I = \left(\frac{k}{m}\right)^{1/2}, \qquad |I\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix} \qquad (1.8.31a)$$
$$\omega_{II} = \left(\frac{3k}{m}\right)^{1/2}, \qquad |II\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix} \qquad (1.8.31b)$$
If we now expand the vector $|x(t)\rangle$ in this new basis as
$$|x(t)\rangle = |I\rangle x_I(t) + |II\rangle x_{II}(t) \qquad (1.8.32)$$
[in analogy with Eq. (1.8.29)], the components $x_I$ and $x_{II}$ will evolve as follows:
$$\begin{bmatrix} \ddot{x}_I \\ \ddot{x}_{II} \end{bmatrix} = \begin{bmatrix} -\omega_I^2 & 0 \\ 0 & -\omega_{II}^2 \end{bmatrix}\begin{bmatrix} x_I \\ x_{II} \end{bmatrix} \qquad (1.8.33)$$
We obtain this equation by rewriting Eq. (1.8.26) in the $|I\rangle, |II\rangle$ basis, in which $\Omega$ has its eigenvalues as the diagonal entries and $|x\rangle$ has components $x_I$ and $x_{II}$. Alternately, we can apply the operator $(d^2/dt^2 - \Omega)$ to both sides of the expansion in Eq. (1.8.32) and get
$$|0\rangle = |I\rangle(\ddot{x}_I + \omega_I^2 x_I) + |II\rangle(\ddot{x}_{II} + \omega_{II}^2 x_{II}) \qquad (1.8.34)$$
Since $|I\rangle$ and $|II\rangle$ are orthogonal, each coefficient is zero. The solution to the decoupled equations
$$\ddot{x}_i = -\omega_i^2 x_i, \qquad i = I, II \qquad (1.8.35)$$
subject to the condition of vanishing initial velocities, is
$$x_i(t) = x_i(0)\cos\omega_i t, \qquad i = I, II \qquad (1.8.36)$$
As anticipated, the components of $|x\rangle$ in the $|I\rangle, |II\rangle$ basis obey decoupled equations that can be readily solved. Feeding Eq. (1.8.36) into Eq. (1.8.32) we get
$$|x(t)\rangle = |I\rangle x_I(0)\cos\omega_I t + |II\rangle x_{II}(0)\cos\omega_{II} t \qquad (1.8.37a)$$
$$= |I\rangle\langle I|x(0)\rangle\cos\omega_I t + |II\rangle\langle II|x(0)\rangle\cos\omega_{II} t \qquad (1.8.37b)$$
Equation (1.8.37) provides the explicit solution to the initial-value problem. It corresponds to the following algorithm for finding $|x(t)\rangle$ given $|x(0)\rangle$.

Step (1). Solve the eigenvalue problem of $\Omega$.

Step (2). Find the coefficients $x_I(0) = \langle I|x(0)\rangle$ and $x_{II}(0) = \langle II|x(0)\rangle$ in the expansion
$$|x(0)\rangle = |I\rangle x_I(0) + |II\rangle x_{II}(0)$$

Step (3). Append to each coefficient $x_i(0)$ ($i = I, II$) a time dependence $\cos\omega_i t$ to get the coefficients in the expansion of $|x(t)\rangle$.

Let me now illustrate this algorithm by solving the following (general) initial-value problem: find the future state of the system given that at $t = 0$ the masses are displaced by $x_1(0)$ and $x_2(0)$.

Step (1). We can ignore this step since the eigenvalue problem has been solved [Eq. (1.8.31)].

Step (2).
$$x_I(0) = \langle I|x(0)\rangle = \frac{1}{2^{1/2}}\,(1 \quad 1)\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \frac{x_1(0) + x_2(0)}{2^{1/2}}$$
$$x_{II}(0) = \langle II|x(0)\rangle = \frac{1}{2^{1/2}}\,(1 \quad -1)\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \frac{x_1(0) - x_2(0)}{2^{1/2}}$$

Step (3).
$$|x(t)\rangle = |I\rangle\,\frac{x_1(0) + x_2(0)}{2^{1/2}}\cos\omega_I t + |II\rangle\,\frac{x_1(0) - x_2(0)}{2^{1/2}}\cos\omega_{II} t$$
The explicit solution above can be made even more explicit by projecting $|x(t)\rangle$ onto the $|1\rangle, |2\rangle$ basis to find $x_1(t)$ and $x_2(t)$, the displacements of the masses. We get (feeding in the explicit formulas for $\omega_I$ and $\omega_{II}$)
$$x_1(t) = \tfrac{1}{2}[x_1(0) + x_2(0)]\cos[(k/m)^{1/2}t] + \tfrac{1}{2}[x_1(0) - x_2(0)]\cos[(3k/m)^{1/2}t] \qquad (1.8.38a)$$
using the fact that $\langle 1|I\rangle = \langle 1|II\rangle = 1/2^{1/2}$. It can likewise be shown that
$$x_2(t) = \tfrac{1}{2}[x_1(0) + x_2(0)]\cos[(k/m)^{1/2}t] - \tfrac{1}{2}[x_1(0) - x_2(0)]\cos[(3k/m)^{1/2}t] \qquad (1.8.38b)$$
We can rewrite Eq. (1.8.38) in matrix form as
$$\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \frac{1}{2}\begin{bmatrix} \cos[(k/m)^{1/2}t] + \cos[(3k/m)^{1/2}t] & \cos[(k/m)^{1/2}t] - \cos[(3k/m)^{1/2}t] \\ \cos[(k/m)^{1/2}t] - \cos[(3k/m)^{1/2}t] & \cos[(k/m)^{1/2}t] + \cos[(3k/m)^{1/2}t] \end{bmatrix}\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} \qquad (1.8.39)$$
This completes our determination of the future state of the system given the initial state.

The Propagator

There are two remarkable features in Eq. (1.8.39):
(1) The final-state vector is obtained from the initial-state vector upon multiplication by a matrix.
(2) This matrix is independent of the initial state.
We call this matrix the propagator. Finding the propagator is tantamount to finding the complete solution to the problem, for given any other initial state with displacements $\bar{x}_1(0)$ and $\bar{x}_2(0)$, we get $\bar{x}_1(t)$ and $\bar{x}_2(t)$ by applying the same matrix to the initial-state vector. We may view Eq. (1.8.39) as the image in the $|1\rangle, |2\rangle$ basis of the abstract relation
$$|x(t)\rangle = U(t)|x(0)\rangle \qquad (1.8.40)$$
By comparing this equation with Eq. (1.8.37b), we find the abstract representation of $U$:
$$U(t) = |I\rangle\langle I|\cos\omega_I t + |II\rangle\langle II|\cos\omega_{II} t \qquad (1.8.41a)$$
$$= \sum_{i=I}^{II} |i\rangle\langle i|\cos\omega_i t \qquad (1.8.41b)$$
You may easily convince yourself that if we take the matrix elements of this operator in the $|1\rangle, |2\rangle$ basis, we regain the matrix appearing in Eq. (1.8.39).
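As a numerical cross-check (a sketch not in the text, assuming NumPy; $k = m = 1$ is an arbitrary illustrative choice), the spectral form (1.8.41a) of the propagator can be compared against the explicit matrix of Eq. (1.8.39), and shown by finite differences to satisfy the equation of motion $|\ddot{x}\rangle = \Omega|x\rangle$.

```python
import numpy as np

k = m = 1.0                                   # illustrative choice
wI, wII = np.sqrt(k/m), np.sqrt(3*k/m)
I_ket  = np.array([1.0,  1.0]) / np.sqrt(2)   # |I>
II_ket = np.array([1.0, -1.0]) / np.sqrt(2)   # |II>

def U(t):
    # Eq. (1.8.41a): U(t) = |I><I| cos(wI t) + |II><II| cos(wII t)
    return (np.outer(I_ket, I_ket) * np.cos(wI*t)
            + np.outer(II_ket, II_ket) * np.cos(wII*t))

t = 0.7
# Compare against the explicit matrix of Eq. (1.8.39).
c1, c2 = np.cos(wI*t), np.cos(wII*t)
explicit = 0.5 * np.array([[c1 + c2, c1 - c2],
                           [c1 - c2, c1 + c2]])
assert np.allclose(U(t), explicit)

# Propagating an arbitrary initial displacement (zero initial velocity) must
# satisfy x'' = Omega x; check the second derivative by finite differences.
Omega = np.array([[-2*k/m, k/m], [k/m, -2*k/m]])
x0 = np.array([0.3, -1.2])
h = 1e-4
xdd = (U(t+h) - 2*U(t) + U(t-h)) @ x0 / h**2
assert np.allclose(xdd, Omega @ (U(t) @ x0), atol=1e-4)
print("propagator matches Eq. (1.8.39) and solves the equation of motion")
```

Note that the same `U(t)` works for every initial state, which is precisely the second "remarkable feature" noted above.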
For example,
$$U_{11} = \langle 1|U|1\rangle = \langle 1|I\rangle\langle I|1\rangle\cos\left[\left(\frac{k}{m}\right)^{1/2}t\right] + \langle 1|II\rangle\langle II|1\rangle\cos\left[\left(\frac{3k}{m}\right)^{1/2}t\right] = \frac{1}{2}\left\{\cos\left[\left(\frac{k}{m}\right)^{1/2}t\right] + \cos\left[\left(\frac{3k}{m}\right)^{1/2}t\right]\right\}$$
Notice that $U(t)$ [Eq. (1.8.41)] is determined completely by the eigenvectors and eigenvalues of $\Omega$. We may then restate our earlier algorithm as follows. To solve the equation
$$|\ddot{x}(t)\rangle = \Omega|x(t)\rangle$$
(1) Solve the eigenvalue problem of $\Omega$.
(2) Construct the propagator $U$ in terms of the eigenvalues and eigenvectors.
(3) $|x(t)\rangle = U(t)|x(0)\rangle$.

The Normal Modes

There are two initial states $|x(0)\rangle$ for which the time evolution is particularly simple. Not surprisingly, these are the eigenkets $|I\rangle$ and $|II\rangle$. Suppose we have $|x(0)\rangle = |I\rangle$. Then the state at time $t$ is
$$|I(t)\rangle = U(t)|I\rangle = (|I\rangle\langle I|\cos\omega_I t + |II\rangle\langle II|\cos\omega_{II} t)|I\rangle = |I\rangle\cos\omega_I t \qquad (1.8.42)$$
Thus the system starting off in $|I\rangle$ is only modified by an overall factor $\cos\omega_I t$. A similar remark holds with $I \to II$. These two modes of vibration, in which all (two) components of a vector oscillate in step, are called normal modes.

The physics of the normal modes is clear in the $|1\rangle, |2\rangle$ basis. In this basis
$$|I\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
and corresponds to a state in which both masses are displaced by equal amounts. The middle spring is then a mere spectator, and each mass oscillates with a frequency $\omega_I = (k/m)^{1/2}$ in response to the end spring nearest to it. Consequently
$$|I(t)\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} \cos[(k/m)^{1/2}t] \\ \cos[(k/m)^{1/2}t] \end{bmatrix}$$
On the other hand, if we start with
$$|II\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}$$
the masses are displaced by equal and opposite amounts. In this case the middle spring is distorted by twice the displacement of each mass. If the masses are displaced by $\Delta$ and $-\Delta$, respectively, each mass feels a restoring force of $3k\Delta$ ($2k\Delta$ from the middle spring and $k\Delta$ from the end spring nearest to it). Since the effective force constant is $k_{\text{eff}} = 3k\Delta/\Delta = 3k$, the vibrational frequency is $(3k/m)^{1/2}$ and
$$|II(t)\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} \cos[(3k/m)^{1/2}t] \\ -\cos[(3k/m)^{1/2}t] \end{bmatrix}$$
If the system starts off in a linear combination of $|I\rangle$ and $|II\rangle$, it evolves into the corresponding linear combination of the normal modes $|I(t)\rangle$ and $|II(t)\rangle$. This is the content of the propagator equation
$$|x(t)\rangle = U(t)|x(0)\rangle = |I\rangle\langle I|x(0)\rangle\cos\omega_I t + |II\rangle\langle II|x(0)\rangle\cos\omega_{II} t = |I(t)\rangle\langle I|x(0)\rangle + |II(t)\rangle\langle II|x(0)\rangle$$
Another way to see the simple evolution of the initial states $|I\rangle$ and $|II\rangle$ is to determine the matrix representing $U$ in the $|I\rangle, |II\rangle$ basis:
$$U \leftrightarrow \begin{bmatrix} \cos\omega_I t & 0 \\ 0 & \cos\omega_{II} t \end{bmatrix} \quad (|I\rangle, |II\rangle \text{ basis}) \qquad (1.8.43)$$
You should verify this result by taking the appropriate matrix elements of $U(t)$ in Eq. (1.8.41b). Since each column above is the image of the corresponding basis vector ($|I\rangle$ or $|II\rangle$) after the action of $U(t)$ (which is to say, after time evolution), we see that the initial states $|I\rangle$ and $|II\rangle$ evolve simply in time.

The central problem in quantum mechanics is very similar to the simple example that we have just discussed. The state of the system is described in quantum theory by a ket $|\psi\rangle$ which obeys the Schrödinger equation
$$i\hbar|\dot{\psi}\rangle = H|\psi\rangle$$
where $\hbar$ is a constant related to Planck's constant $h$ by $\hbar = h/2\pi$, and $H$ is a Hermitian operator called the Hamiltonian. The problem is to find $|\psi(t)\rangle$ given $|\psi(0)\rangle$. [Since the equation is first order in $t$, no assumptions need be made about $|\dot{\psi}(0)\rangle$, which is determined by the Schrödinger equation to be $(-i/\hbar)H|\psi(0)\rangle$.] In most cases, $H$ is a time-independent operator, and the algorithm one follows in solving this initial-value problem is completely analogous to the one we have just seen:

Step (1). Solve the eigenvalue problem of $H$.
Step (2). Find the propagator $U(t)$ in terms of the eigenvectors and eigenvalues of $H$.
Step (3). $|\psi(t)\rangle = U(t)|\psi(0)\rangle$.

You must of course wait till Chapter 4 to find out the physical interpretation of $|\psi\rangle$, the actual form of the operator $H$, and the precise relation between $U(t)$ and the eigenvalues and eigenvectors of $H$. □

Exercise 1.8.11.
Consider the coupled mass problem discussed above. (1) Given that the initial state is $|1\rangle$, in which the first mass is displaced by unity and the second is left alone, calculate $|x(t)\rangle$ by following the algorithm. (2) Compare your result with that following from Eq. (1.8.39).

Exercise 1.8.12. Consider once again the problem discussed in the previous example. (1) Assuming that
$$|\ddot{x}\rangle = \Omega|x\rangle$$
has a solution
$$|x(t)\rangle = U(t)|x(0)\rangle$$
find the differential equation satisfied by $U(t)$. Use the fact that $|x(0)\rangle$ is arbitrary. (2) Assuming (as is the case) that $\Omega$ and $U$ can be simultaneously diagonalized, solve for the elements of the matrix $U$ in this common basis and regain Eq. (1.8.43). Assume $|\dot{x}(0)\rangle = 0$.

1.9. Functions of Operators and Related Concepts

We have encountered two types of objects that act on vectors: scalars, which commute with each other and with all operators; and operators, which do not generally commute with each other. It is customary to refer to the former as c numbers and the latter as q numbers. Now, we are accustomed to functions of c numbers such as $\sin(x)$, $\log(x)$, etc. We wish to examine the question whether functions of q numbers can be given a sensible meaning. We will restrict ourselves to those functions that can be written as a power series. Consider a series
$$f(x) = \sum_{n=0}^{\infty} a_n x^n \qquad (1.9.1)$$
where $x$ is a c number. We define the same function of an operator or q number to be
$$f(\Omega) = \sum_{n=0}^{\infty} a_n\Omega^n \qquad (1.9.2)$$
This definition makes sense only if the sum converges to a definite limit. To see what this means, consider a common example:
$$e^\Omega = \sum_{n=0}^{\infty}\frac{\Omega^n}{n!} \qquad (1.9.3)$$
Let us restrict ourselves to Hermitian $\Omega$. By going to the eigenbasis of $\Omega$ we can readily perform the sum of Eq. (1.9.3). Since
$$\Omega = \begin{bmatrix} \omega_1 & & \\ & \ddots & \\ & & \omega_n \end{bmatrix} \qquad (1.9.4)$$
and
$$\Omega^m = \begin{bmatrix} \omega_1^m & & \\ & \ddots & \\ & & \omega_n^m \end{bmatrix} \qquad (1.9.5)$$
we have
$$e^\Omega = \begin{bmatrix} \sum_m\omega_1^m/m! & & \\ & \ddots & \\ & & \sum_m\omega_n^m/m! \end{bmatrix} = \begin{bmatrix} e^{\omega_1} & & \\ & \ddots & \\ & & e^{\omega_n} \end{bmatrix} \qquad (1.9.6)$$
Since each sum converges to the familiar limit $e^{\omega_i}$, the operator $e^\Omega$ is indeed well defined by the power series in this basis (and therefore in any other).

Exercise 1.9.1.
* We know that the series
$$f(x) = \sum_{n=0}^{\infty} x^n$$
may be equated to the function $f(x) = (1 - x)^{-1}$ if $|x| < 1$. By going to the eigenbasis, examine when the q number power series
$$f(\Omega) = \sum_{n=0}^{\infty}\Omega^n$$
of a Hermitian operator $\Omega$ may be identified with $(1 - \Omega)^{-1}$.

Exercise 1.9.2.* If $H$ is a Hermitian operator, show that $U = e^{iH}$ is unitary. (Notice the analogy with c numbers: if $\theta$ is real, $u = e^{i\theta}$ is a number of unit modulus.)

Exercise 1.9.3. For the case above, show that $\det U = e^{i\operatorname{Tr} H}$.

Derivatives of Operators with Respect to Parameters

Consider next an operator $\theta(\lambda)$ that depends on a parameter $\lambda$. Its derivative with respect to $\lambda$ is defined to be
$$\frac{d\theta(\lambda)}{d\lambda} = \lim_{\Delta\lambda\to 0}\left[\frac{\theta(\lambda + \Delta\lambda) - \theta(\lambda)}{\Delta\lambda}\right]$$
If $\theta(\lambda)$ is written as a matrix in some basis, then the matrix representing $d\theta(\lambda)/d\lambda$ is obtained by differentiating the matrix elements of $\theta(\lambda)$. A special case of $\theta(\lambda)$ we are interested in is
$$\theta(\lambda) = e^{\lambda\Omega}$$
where $\Omega$ is Hermitian. We can show, by going to the eigenbasis of $\Omega$, that
$$\frac{d\theta(\lambda)}{d\lambda} = \Omega e^{\lambda\Omega} = e^{\lambda\Omega}\Omega = \theta(\lambda)\Omega \qquad (1.9.7)$$
The same result may be obtained, even if $\Omega$ is not Hermitian, by working with the power series, provided it exists:
$$\frac{d}{d\lambda}\sum_{n=0}^{\infty}\frac{\lambda^n\Omega^n}{n!} = \sum_{n=1}^{\infty}\frac{n\lambda^{n-1}\Omega^n}{n!} = \Omega\sum_{n=1}^{\infty}\frac{\lambda^{n-1}\Omega^{n-1}}{(n-1)!} = \Omega\sum_{m=0}^{\infty}\frac{\lambda^m\Omega^m}{m!} = \Omega e^{\lambda\Omega}$$
Conversely, we can say that if we are confronted with the differential Eq. (1.9.7), its solution is given by
$$\theta(\lambda) = c\exp\left(\int_0^\lambda\Omega\,d\lambda'\right) = c\exp(\Omega\lambda)$$
(It is assumed here that the exponential exists.) In the above, $c$ is a constant (operator) of integration. The solution $\theta = e^{\lambda\Omega}$ corresponds to the choice $c = I$.

In all the above operations, we see that $\Omega$ behaves as if it were just a c number. Now, the real difference between c numbers and q numbers is that the latter do not generally commute. However, if only one q number (or powers of it) enters the picture, everything commutes and we can treat them as c numbers. If one remembers this mnemonic, one can save a lot of time.
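The eigenbasis construction of $e^\Omega$ in Eq. (1.9.6) and the derivative formula (1.9.7) lend themselves to a quick numerical check. The sketch below (not part of the text; NumPy assumed, with an arbitrary $2\times 2$ Hermitian $\Omega$) compares the eigenbasis exponential against the defining power series (1.9.3), and verifies Eq. (1.9.7) by finite differences.

```python
import numpy as np

# An arbitrary Hermitian Omega; exponentiate it in its eigenbasis, Eq. (1.9.6).
Omega = np.array([[1.0, 0.5],
                  [0.5, -1.0]])
w, V = np.linalg.eigh(Omega)
exp_eigen = V @ np.diag(np.exp(w)) @ V.T          # e^Omega from eigenvalues

# The defining power series, Eq. (1.9.3), summed until it converges.
exp_series = np.zeros_like(Omega)
term = np.eye(2)
for n in range(1, 60):
    exp_series += term                             # adds Omega^(n-1)/(n-1)!
    term = term @ Omega / n
assert np.allclose(exp_eigen, exp_series)

# Eq. (1.9.7): d/d(lambda) e^(lambda Omega) = Omega e^(lambda Omega).
def theta(lmb):
    return V @ np.diag(np.exp(lmb * w)) @ V.T
h = 1e-6
deriv = (theta(1.0 + h) - theta(1.0 - h)) / (2*h)
assert np.allclose(deriv, Omega @ theta(1.0), atol=1e-5)
print("power series, eigenbasis exponential, and Eq. (1.9.7) all agree")
```

Because only the single q number $\Omega$ appears here, everything commutes, which is why the eigenbasis shortcut and the power series give identical answers; the caution below about multiple q numbers is where this stops working.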
If, on the other hand, more than one q number is involved, the order of the factors is all important. For example, it is true that

$$e^{\alpha\Omega} e^{\beta\Omega} = e^{(\alpha+\beta)\Omega}$$

as may be verified by a power-series expansion, while it is not true that

$$e^{\Omega} e^{\Lambda} = e^{\Omega + \Lambda}$$

or that

$$e^{\Omega} e^{\Lambda} = e^{\Lambda} e^{\Omega}$$

unless $[\Omega, \Lambda] = 0$. Likewise, in differentiating a product, the chain rule is

$$\frac{d}{d\lambda}\left( e^{\lambda\Omega} e^{\lambda\Lambda} \right) = \Omega e^{\lambda\Omega} e^{\lambda\Lambda} + e^{\lambda\Omega} \Lambda e^{\lambda\Lambda} \tag{1.9.8}$$

We are free to move $\Omega$ through $e^{\lambda\Omega}$ and write the first term as $e^{\lambda\Omega}\Omega\, e^{\lambda\Lambda}$, but not as $e^{\lambda\Omega} e^{\lambda\Lambda}\Omega$, unless $[\Omega, \Lambda] = 0$.

1.10. Generalization to Infinite Dimensions

In all of the preceding discussions, the dimensionality $n$ of the space was unspecified but assumed to be some finite number. We now consider the generalization of the preceding concepts to infinite dimensions. Let us begin by getting acquainted with an infinite-dimensional vector. Consider a function defined in some interval, say, $a \le x \le b$. A concrete example is provided by the displacement $f(x, t)$ of a string clamped at $x = 0$ and $x = L$ (Fig. 1.6). Suppose we want to communicate to a person on the moon the string's displacement $f(x)$ at some time $t$. One simple way is to divide the interval $0$–$L$ into 20 equal parts, measure the displacement $f(x_i)$ at the 19 points $x = L/20, 2L/20, \ldots, 19L/20$, and transmit the 19 values on the wireless. Given these $f(x_i)$, our friend on the moon will be able to reconstruct the approximate picture of the string shown in Fig. 1.7. If we wish to be more accurate, we can specify the values of $f(x)$ at a larger number of points. Let us denote by $f_n(x)$ the discrete approximation to $f(x)$ that coincides with it at $n$ points and vanishes in between. Let us now interpret the ordered $n$-tuple $\{f_n(x_1), f_n(x_2), \ldots, f_n(x_n)\}$ as components of a ket $|f_n\rangle$ in a vector space $\mathbb{V}^n(R)$:

$$|f_n\rangle \longleftrightarrow \begin{bmatrix} f_n(x_1) \\ f_n(x_2) \\ \vdots \\ f_n(x_n) \end{bmatrix} \tag{1.10.1}$$

Figure 1.6. The string is clamped at $x = 0$ and $x = L$. It is free to oscillate in the plane of the paper.

Figure 1.7. The string as reconstructed by the person on the moon.
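The finite-dimensional surrogate just described can be played with numerically: sample a string shape at $n$ interior points and treat the samples as the components of a vector. A sketch (the weight $\Delta = L/(n+1)$ attached to the inner-product sum is the standard refinement that makes the sum tend to an integral as $n$ grows; the test shapes are arbitrary choices):

```python
import numpy as np

def inner(f, g, L=1.0, n=2000):
    """Discrete inner product of two sampled functions on [0, L]: sum the
    products of components at the n interior points, weighted by the
    spacing Delta, so the sum approximates the integral of f*g."""
    x = np.linspace(0, L, n + 2)[1:-1]      # the n interior sample points
    delta = L / (n + 1)
    return np.sum(np.conj(f(x)) * g(x)) * delta

L = 1.0
f = lambda x: np.sin(np.pi * x / L)         # clamped: vanishes at x = 0, L
g = lambda x: np.sin(2 * np.pi * x / L)

# These two shapes are orthogonal, and each has norm-squared L/2:
assert abs(inner(f, g)) < 1e-9
assert abs(inner(f, f) - L / 2) < 1e-9
assert abs(inner(g, g) - L / 2) < 1e-9
```

The same sine shapes reappear later as the normal modes of the clamped string, where their orthonormality is exactly the statement checked here.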
The basis vectors in this space are

$$|x_i\rangle \longleftrightarrow \begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix} \leftarrow i\text{th place} \tag{1.10.2}$$

corresponding to the discrete function which is unity at $x = x_i$ and zero elsewhere. The basis vectors satisfy

$$\langle x_i | x_j \rangle = \delta_{ij} \quad \text{(orthogonality)} \tag{1.10.3}$$

$$\sum_{i=1}^{n} |x_i\rangle\langle x_i| = I \quad \text{(completeness)} \tag{1.10.4}$$

Try to imagine a space containing $n$ mutually perpendicular axes, one for each point $x_i$. Along each axis is a unit vector $|x_i\rangle$. The function $f_n(x)$ is represented by a vector whose projection along the $i$th direction is $f_n(x_i)$:

$$|f_n\rangle = \sum_{i=1}^{n} f_n(x_i)\, |x_i\rangle \tag{1.10.5}$$

To every possible discrete approximation $g_n(x)$, $h_n(x)$, etc., there is a corresponding ket $|g_n\rangle$, $|h_n\rangle$, etc., and vice versa. You should convince yourself that if we define vector addition as the addition of the components, and scalar multiplication as the multiplication of each component by the scalar, then the set of all kets representing discrete functions that vanish at $x = 0, L$ and that are specified at $n$ points in between forms a vector space. We next define the inner product in this space:

$$\langle f_n | g_n \rangle = \sum_{i=1}^{n} f_n(x_i)\, g_n(x_i) \tag{1.10.6}$$

Two functions $f_n(x)$ and $g_n(x)$ will be said to be orthogonal if

$$\langle f_n | g_n \rangle = 0 \tag{1.10.7}$$

and the norm of $f_n$ is

$$|f_n| = \left[ \sum_{i=1}^{n} f_n^2(x_i) \right]^{1/2} \tag{1.10.8}$$

If we wish to go beyond the instance of the string and consider complex functions of $x$ as well, in some interval $a \le x \le b$, the only modification we need is in the inner product: the components of the bra are to be conjugated. In the continuum limit, inserting the resolution of identity in the $|x\rangle$ basis into $\langle x|I|f\rangle$ gives

$$\int_a^b \langle x | x' \rangle \langle x' | f \rangle \, dx' = \langle x | I | f \rangle = f(x) \tag{1.10.12}$$

Now, $\langle x | f \rangle$, the projection of $|f\rangle$ along the basis ket $|x\rangle$, is just $f(x)$. Likewise $\langle x' | f \rangle = f(x')$. Let the inner product $\langle x | x' \rangle$ be some unknown function $\delta(x, x')$. Since $\delta(x, x')$ vanishes if $x \ne x'$, we can restrict the integral to an infinitesimal region near $x' = x$ in Eq.
(1.10.12):

$$\int_{x-\varepsilon}^{x+\varepsilon} \delta(x, x')\, f(x') \, dx' = f(x) \tag{1.10.13}$$

In this infinitesimal region, $f(x')$ (for any reasonably smooth $f$) can be approximated by its value at $x' = x$, and pulled out of the integral:

$$f(x) \int_{x-\varepsilon}^{x+\varepsilon} \delta(x, x') \, dx' = f(x) \tag{1.10.14}$$

so that

$$\int_{x-\varepsilon}^{x+\varepsilon} \delta(x, x') \, dx' = 1 \tag{1.10.15}$$

Clearly $\delta(x, x')$ cannot be finite at $x' = x$, for then its integral over an infinitesimal region would also be infinitesimal. In fact $\delta(x, x')$ should be infinite in such a way that its integral is unity. Since $\delta(x, x')$ depends only on the difference $x - x'$, let us write it as $\delta(x - x')$. The "function" $\delta(x - x')$ has the properties

$$\delta(x - x') = 0, \quad x \ne x'; \qquad \int_a^b \delta(x - x') \, dx' = 1$$

Since the kets are in correspondence with the functions, an operator $\Omega$ takes the function $f(x)$ into another, $\tilde{f}(x)$. Now, one operator that does such a thing is the familiar differential operator, which, acting on $f(x)$, gives $\tilde{f}(x) = df(x)/dx$. In the function space we can describe the action of this operator as

$$D|f\rangle = |df/dx\rangle$$

where $|df/dx\rangle$ is the ket corresponding to the function $df/dx$. What are the matrix elements of $D$ in the $|x\rangle$ basis? To find out, we dot both sides of the above equation with $\langle x|$, and insert the resolution of identity at the right place:

$$\int \langle x | D | x' \rangle \langle x' | f \rangle \, dx' = \frac{df}{dx} \tag{1.10.27}$$

Comparing this to Eq. (1.10.21), we deduce that

$$D_{xx'} = \delta'(x - x') = \frac{d}{dx}\,\delta(x - x') \tag{1.10.28}$$

It is worth remembering that $D_{xx'} = \delta'(x - x')$ is to be integrated over the second index ($x'$) and pulls out the derivative of $f$ at the first index ($x$). Some people prefer to integrate $\delta'(x - x')$ over the first index, in which case it pulls out $-df/dx'$. Our convention is more natural if one views $D_{xx'}$ as a matrix acting to the right on the components $f_{x'} = f(x')$ of a vector $|f\rangle$. Thus the familiar differential operator is an infinite-dimensional matrix with the elements given above. Normally one doesn't think of $D$ as a matrix for the following reason. Usually when a matrix acts on a vector, there is a sum over a common index. In fact, Eq.
(1.10.27) contains such a sum over the index $x'$. If, however, we feed into this equation the value of $D_{xx'}$, the delta function renders the integration trivial:

$$\int \delta'(x - x')\, f(x') \, dx' = \frac{d}{dx} \int \delta(x - x')\, f(x') \, dx' = \frac{df}{dx}$$

Thus the action of $D$ is simply to apply $d/dx$ to $f(x)$, with no sum over a common index in sight. Although we too will drop the integral over the common index ultimately, we will continue to use it for a while to remind us that $D$, like all linear operators, is a matrix. Let us now ask if $D$ is Hermitian and examine its eigenvalue problem. If $D$ were Hermitian, we would have $D_{xx'} = D_{x'x}^*$. But this is not the case:

$$D_{xx'} = \delta'(x - x')$$

while

$$D_{x'x}^* = [\delta'(x' - x)]^* = \delta'(x' - x) = -\delta'(x - x')$$

But we can easily convert $D$ to a Hermitian matrix by multiplying it with a pure imaginary number. Consider

$$K = -iD$$

which satisfies

$$K_{x'x}^* = [-i\delta'(x' - x)]^* = +i\delta'(x' - x) = -i\delta'(x - x') = K_{xx'}$$

It turns out that despite the above, the operator $K$ is not guaranteed to be Hermitian, as the following analysis will indicate. Let $|f\rangle$ and $|g\rangle$ be two kets in the function space, whose images in the $X$ basis are two functions $f(x)$ and $g(x)$ in the interval $a$–$b$. If $K$ is Hermitian, it must satisfy $\langle g|Kf\rangle = \langle Kg|f\rangle$, i.e.,

$$\int_a^b g^*(x) \left( -i\,\frac{df}{dx} \right) dx = \left[ \int_a^b f^*(x) \left( -i\,\frac{dg}{dx} \right) dx \right]^*$$

Integrating the left-hand side by parts gives

$$-i\, g^*(x)\, f(x) \Big|_a^b + i \int_a^b \frac{dg^*(x)}{dx}\, f(x) \, dx$$

So $K$ is Hermitian only if the surface term vanishes:

$$-i\, g^*(x)\, f(x) \Big|_a^b = 0 \tag{1.10.29}$$

In contrast to the finite-dimensional case, $K_{xx'} = K_{x'x}^*$ is not a sufficient condition for $K$ to be Hermitian. One also needs to look at the behavior of the functions at the end points $a$ and $b$. Thus $K$ is Hermitian if the space consists of functions that obey Eq. (1.10.29). One set of functions that obey this condition are the possible configurations $f(x)$ of the string clamped at $x = 0, L$, since $f(x)$ vanishes at the end points. But condition (1.10.29) can also be fulfilled in another way.
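Before turning to that other way, the role of the surface term can be seen on a grid. The sketch below (central differences in NumPy; the test functions are arbitrary) checks that $\langle g|K|f\rangle = \langle f|K|g\rangle^*$ when both functions vanish at the end points, and that for functions that do not, the mismatch is exactly the surface term of Eq. (1.10.29):

```python
import numpy as np

L, n = 1.0, 20001
x = np.linspace(0, L, n)
dx = x[1] - x[0]

def matrix_element(a, b):
    """<a|K|b> = integral of a*(x) (-i db/dx) dx, via central differences."""
    return np.sum(np.conj(a) * (-1j) * np.gradient(b, dx)) * dx

# Clamped functions (vanish at x = 0 and x = L): surface term absent.
f = np.sin(np.pi * x / L) * np.exp(x)
g = np.sin(2 * np.pi * x / L) * (1 + x**2)
lhs = matrix_element(g, f)
rhs = np.conj(matrix_element(f, g))
assert np.isclose(lhs, rhs, atol=1e-5)          # K acts Hermitian here

# Functions that do not vanish at the ends: Hermiticity fails, and the
# discrepancy equals the surface term -i g*(x) f(x) evaluated at the ends.
f2 = np.exp(x)
g2 = 1 + x
lhs2 = matrix_element(g2, f2)
rhs2 = np.conj(matrix_element(f2, g2))
surface = -1j * (np.conj(g2[-1]) * f2[-1] - np.conj(g2[0]) * f2[0])
assert not np.isclose(lhs2, rhs2, atol=1e-3)
assert np.isclose(lhs2 - rhs2, surface, atol=1e-3)
```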
Consider functions, in our own three-dimensional space, parametrized by $r$, $\theta$, and $\phi$ ($\phi$ is the angle measured around the $z$ axis). Let us require that these functions be single valued. In particular, if we start at a certain point and go once around the $z$ axis, returning to the original point, the function must take on its original value, i.e.,

$$f(\phi) = f(\phi + 2\pi)$$

In the space of such periodic functions, $K = -i\, d/d\phi$ is a Hermitian operator. The surface term vanishes because the contribution from one extremity cancels that from the other:

$$-i\, g^*(\phi)\, f(\phi) \Big|_0^{2\pi} = -i\,[g^*(2\pi) f(2\pi) - g^*(0) f(0)] = 0$$

In the study of quantum mechanics, we will be interested in functions defined over the full interval $-\infty \le x \le +\infty$. They fall into two classes, those that vanish as $|x| \to \infty$, and those that do not, the latter behaving as $e^{ikx}$, $k$ being a real parameter that labels these functions. It is clear that $K = -i\, d/dx$ is Hermitian when sandwiched between two functions of the first class, or between a function from each class, since in either case the surface term vanishes. When sandwiched between two functions of the second class, the Hermiticity hinges on whether

$$e^{ikx} e^{-ik'x} \Big|_{-\infty}^{\infty} \to 0$$

If $k = k'$, the contribution from one end cancels that from the other. If $k \ne k'$, the answer is unclear since $e^{i(k-k')x}$ oscillates rather than approaching a limit. Let us tentatively treat $K$ as Hermitian on this space and examine its eigenvalue problem:

$$K|k\rangle = k|k\rangle \tag{1.10.30}$$

Following the standard procedure,

$$\langle x | K | k \rangle = k \langle x | k \rangle$$

$$\int \langle x | K | x' \rangle \langle x' | k \rangle \, dx' = k\, \psi_k(x)$$

$$-i\, \frac{d}{dx}\, \psi_k(x) = k\, \psi_k(x) \tag{1.10.31}$$

where by definition $\psi_k(x) = \langle x | k \rangle$. This equation could have been written directly had we made the immediate substitution $K = -i\, d/dx$ in the $X$ basis. From now on we shall resort to this shortcut unless there are good reasons for not doing so. The solution to the above equation is simply

$$\psi_k(x) = A\, e^{ikx} \tag{1.10.32}$$

where $A$, the overall scale, is a free parameter, unspecified by the eigenvalue problem. So the eigenvalue problem of $K$ is fully solved: any real number $k$ is an eigenvalue, and the corresponding eigenfunction is given by $A\, e^{ikx}$.
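The periodic case can be made finite-dimensional and inspected directly: on a grid that wraps around, the discretized $K$ is a Hermitian matrix whose spectrum approximates the allowed wave numbers, with $e^{ik\phi}$ as eigenfunctions. A sketch (the grid size and central-difference stencil are arbitrary choices):

```python
import numpy as np

n = 400
h = 2 * np.pi / n
phi = np.arange(n) * h

# Circulant central-difference matrix for d/dphi; the wrap-around entries
# encode the periodicity f(phi) = f(phi + 2*pi).
D = np.zeros((n, n))
for i in range(n):
    D[i, (i + 1) % n] = 1 / (2 * h)
    D[i, (i - 1) % n] = -1 / (2 * h)
K = -1j * D

assert np.allclose(K, K.conj().T)        # Hermitian once the ends are joined

# Real eigenvalues approximating k = 0, +-1, +-2, ..., and e^{i*phi} is
# (to discretization accuracy) the eigenfunction with k = 1:
eigs = np.linalg.eigvalsh(K)
assert np.min(np.abs(eigs - 1.0)) < 1e-3
psi = np.exp(1j * phi)
assert np.allclose(K @ psi, psi, atol=1e-3)
```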
As usual, the freedom in scale will be used to normalize the solution. We choose $A$ to be $(1/2\pi)^{1/2}$ so that

$$|k\rangle \longleftrightarrow \frac{1}{(2\pi)^{1/2}}\, e^{ikx}$$

and

$$\langle k | k' \rangle = \int_{-\infty}^{\infty} \langle k | x \rangle \langle x | k' \rangle \, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i(k - k')x} \, dx = \delta(k - k') \tag{1.10.33}$$

(Since $\langle k | k \rangle$ is infinite, no choice of $A$ can normalize $|k\rangle$ to unity. The delta function normalization is the natural one when the eigenvalue spectrum is continuous.) The attentive reader may have a question at this point. "Why was it assumed that the eigenvalue $k$ was real? It is clear that the function $A\, e^{ikx}$ with $k = k_1 + ik_2$ also satisfies Eq. (1.10.31)." The answer is, yes, there are eigenfunctions of $K$ with complex eigenvalues. If, however, our space includes such functions, $K$ must be classified a non-Hermitian operator. (The surface term no longer vanishes since $e^{ikx}$ blows up exponentially as $x$ tends to either $+\infty$ or $-\infty$, depending on the sign of the imaginary part $k_2$.) In restricting ourselves to real $k$ we have restricted ourselves to what we will call the physical Hilbert space, which is of interest in quantum mechanics. This space is defined as the space of functions that can be normalized either to unity or to the Dirac delta function, and it plays a central role in quantum mechanics. (We use the qualifier "physical" to distinguish it from the Hilbert space as defined by mathematicians, which contains only proper vectors, i.e., vectors normalizable to unity. The role of the improper vectors in quantum theory will be clear later.) We will assume that the theorem proved for finite dimensions, namely, that the eigenfunctions of a Hermitian operator form a complete basis, holds in the Hilbert space. (The trouble with infinite-dimensional spaces is that even if you have an infinite number of orthonormal eigenvectors, you can never be sure you have them all, since adding or subtracting a few still leaves you with an infinite number of them.)
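The orthonormality (1.10.33) says that passing to the $|k\rangle$ basis preserves inner products. In a discretized setting that passage is the fast Fourier transform, and norm preservation is Parseval's theorem. A sketch (the Gaussian test function and the grid are arbitrary choices):

```python
import numpy as np

n, L = 1024, 40.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
f = np.exp(-x**2)                       # a normalizable member of the space

# Components of |f> in the (discrete) K basis; norm="ortho" makes the
# transform unitary, mirroring the (2*pi)^(-1/2) normalization of <x|k>.
fk = np.fft.fft(f, norm="ortho")

assert np.allclose(np.fft.ifft(fk, norm="ortho"), f)            # round trip
assert np.isclose(np.sum(np.abs(f)**2), np.sum(np.abs(fk)**2))  # Parseval
```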
Since $K$ is a Hermitian operator, functions that were expanded in the $X$ basis with components $f(x) = \langle x | f \rangle$ must also have an expansion in the $K$ basis. To find the components, we start with a ket $|f\rangle$ and do the following:

$$f(k) = \langle k | f \rangle = \int_{-\infty}^{\infty} \langle k | x \rangle \langle x | f \rangle \, dx = \int_{-\infty}^{\infty} \frac{e^{-ikx}}{(2\pi)^{1/2}}\, f(x) \, dx \tag{1.10.34}$$

The passage back to the $X$ basis is done as follows:

$$f(x) = \langle x | f \rangle = \int_{-\infty}^{\infty} \langle x | k \rangle \langle k | f \rangle \, dk = \int_{-\infty}^{\infty} \frac{e^{ikx}}{(2\pi)^{1/2}}\, f(k) \, dk \tag{1.10.35}$$

Thus the familiar Fourier transform is just the passage from one complete basis, $|x\rangle$, to another, $|k\rangle$. Either basis may be used to expand functions that belong to the Hilbert space. The matrix elements of $K$ are trivial in the $K$ basis:

$$\langle k | K | k' \rangle = k' \langle k | k' \rangle = k' \delta(k - k') \tag{1.10.36}$$

Now, we know where the $K$ basis came from: it was generated by the Hermitian operator $K$. Which operator is responsible for the orthonormal $X$ basis? Let us call it the operator $X$. The kets $|x\rangle$ are its eigenvectors with eigenvalue $x$:

$$X|x\rangle = x|x\rangle \tag{1.10.37}$$

Its matrix elements in the $X$ basis are

$$\langle x' | X | x \rangle = x\, \delta(x' - x) \tag{1.10.38}$$

To find its action on functions, let us begin with $X|f\rangle = |\tilde{f}\rangle$ and follow the routine:

$$\langle x | X | f \rangle = \int \langle x | X | x' \rangle \langle x' | f \rangle \, dx' = x f(x) = \langle x | \tilde{f} \rangle = \tilde{f}(x)$$

$$\therefore\quad \tilde{f}(x) = x f(x)$$

Thus the effect of $X$ is to multiply $f(x)$ by $x$. As in the case of the $K$ operator, one generally suppresses the integral over the common index, since it is rendered trivial by the delta function. We can summarize the action of $X$ in Hilbert space‡ as

$$X|f(x)\rangle = |x f(x)\rangle \tag{1.10.39}$$

where as usual $|x f(x)\rangle$ is the ket corresponding to the function $x f(x)$. There is a nice reciprocity between the $X$ and $K$ operators, which manifests itself if we compute the matrix elements of $X$ in the $K$ basis:

$$\langle k | X | k' \rangle = \int_{-\infty}^{\infty} \langle k | x \rangle\, x\, \langle x | k' \rangle \, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} x\, e^{-i(k-k')x}\, dx = +i\, \frac{d}{dk}\, \delta(k - k')\,\text{\dag} \tag{1.10.40}$$

So $X$ acts as $+i\, d/dk$ in the $K$ basis, just as $K$ acts as $-i\, d/dx$ in the $X$ basis. Acting on kets, in the $X$ basis,

$$X|f\rangle \to x f(x), \qquad K|f\rangle \to -i\, \frac{df(x)}{dx}$$

so that

$$XK|f\rangle \to -ix\, \frac{df(x)}{dx}$$

$$KX|f\rangle \to -i\, \frac{d}{dx}\,[x f(x)]$$

$$[X, K]|f\rangle \to -ix\, \frac{df}{dx} + ix\, \frac{df}{dx} + if = if \to iI|f\rangle$$

‡ Hereafter we will omit the qualifier "physical."
† In the last step we have used the fact that $\delta(k' - k) = \delta(k - k')$.
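The commutator computation just carried out can be checked on a grid. In the sketch below (central differences in NumPy; on a finite grid the identity holds only approximately, and only where the test function is well resolved, so the comparison skips the grid edges), $(XK - KX)f = if$ for a smooth localized $f$:

```python
import numpy as np

n = 2000
x = np.linspace(-10, 10, n)
h = x[1] - x[0]
f = np.exp(-x**2)                       # localized, so the edges are harmless

Kf = -1j * np.gradient(f, h)            # K acting in the X basis: -i df/dx
KXf = -1j * np.gradient(x * f, h)       # K acting on the function x f(x)
commutator_f = x * Kf - KXf             # ([X, K] f)(x)

# Matches i f(x) away from the grid edges, as [X, K] = iI requires:
assert np.allclose(commutator_f[5:-5], 1j * f[5:-5], atol=1e-3)
```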
Since $|f\rangle$ is an arbitrary ket, we now have the desired result:

$$[X, K] = iI \tag{1.10.41}$$

This brings us to the end of our discussion on Hilbert space, except for a final example. Although there are many other operators one can study in this space, we restricted ourselves to $X$ and $K$, since almost all the operators we will need for quantum mechanics are functions of $X$ and $P = \hbar K$, where $\hbar$ is a constant to be defined later.

Example 1.10.1: A Normal Mode Problem in Hilbert Space. Consider a string of length $L$ clamped at its two ends $x = 0$ and $L$. The displacement $\psi(x, t)$ obeys the differential equation

$$\frac{\partial^2 \psi}{\partial t^2} = \frac{\partial^2 \psi}{\partial x^2} \tag{1.10.42}$$

Given that at $t = 0$ the displacement is $\psi(x, 0)$ and the velocity $\dot{\psi}(x, 0) = 0$, we wish to determine the time evolution of the string. But for the change in dimensionality, the problem is identical to that of the two coupled masses encountered at the end of Section 1.8 [see Eq. (1.8.26)]. It is recommended that you go over that example once to refresh your memory before proceeding further. We first identify $\psi(x, t)$ as components of a vector $|\psi(t)\rangle$ in a Hilbert space, the elements of which are in correspondence with possible displacements $\psi$, i.e., functions that are continuous in the interval $0 \le x \le L$ and vanish at the end points. You may verify that these functions do form a vector space. The analog of the operator $\Omega$ in Eq. (1.8.26) is the operator $\partial^2/\partial x^2$. We recognize this to be minus the square of the operator $K \to -i\,\partial/\partial x$. Since $K$ acts on a space in which $\psi(0) = \psi(L) = 0$, it is Hermitian, and so is $K^2$. Equation (1.10.42) has the abstract counterpart

$$|\ddot{\psi}(t)\rangle = -K^2 |\psi(t)\rangle \tag{1.10.43}$$

We solve the initial-value problem by following the algorithm developed in Example 1.8.6:

Step (1). Solve the eigenvalue problem of $-K^2$.
Step (2). Construct the propagator $U(t)$ in terms of the eigenvectors and eigenvalues.
Step (3).
$$|\psi(t)\rangle = U(t)|\psi(0)\rangle \tag{1.10.44}$$

The equation to solve is

$$K^2 |\psi_k\rangle = k^2 |\psi_k\rangle \tag{1.10.45}$$

In the $X$ basis, this becomes

$$-\frac{d^2 \psi_k(x)}{dx^2} = k^2\, \psi_k(x) \tag{1.10.46}$$

the general solution to which is

$$\psi_k(x) = A \cos kx + B \sin kx \tag{1.10.47}$$

where $A$ and $B$ are arbitrary. However, not all these solutions lie in the Hilbert space we are considering. We want only those that vanish at $x = 0$ and $x = L$. At $x = 0$ we find

$$\psi_k(0) = 0 = A \tag{1.10.48a}$$

while at $x = L$ we find

$$0 = B \sin kL \tag{1.10.48b}$$

If we do not want a trivial solution ($A = B = 0$) we must demand

$$\sin kL = 0, \qquad kL = m\pi, \quad m = 1, 2, 3, \ldots \tag{1.10.49}$$

We do not consider negative $m$ since it doesn't lead to any further linearly independent solutions [$\sin(-x) = -\sin x$]. The allowed eigenvectors thus form a discrete set labeled by an integer $m$:

$$\psi_m(x) = \left( \frac{2}{L} \right)^{1/2} \sin\left( \frac{m\pi x}{L} \right) \tag{1.10.50}$$

where we have chosen $B = (2/L)^{1/2}$ so that

$$\int_0^L \psi_m(x)\, \psi_{m'}(x) \, dx = \delta_{mm'} \tag{1.10.51}$$

Let us associate with each solution labeled by the integer $m$ an abstract ket $|m\rangle$:

$$|m\rangle \underset{X \text{ basis}}{\longrightarrow} \left( \frac{2}{L} \right)^{1/2} \sin\left( \frac{m\pi x}{L} \right) \tag{1.10.52}$$

If we project $|\psi(t)\rangle$ on the $|m\rangle$ basis, in which $K^2$ is diagonal with eigenvalues $(m\pi/L)^2$, the components $\langle m|\psi(t)\rangle$ will obey the decoupled equations

$$\langle m | \ddot{\psi}(t) \rangle = -\left( \frac{m\pi}{L} \right)^2 \langle m | \psi(t) \rangle, \qquad m = 1, 2, \ldots \tag{1.10.53}$$

in analogy with Eq. (1.8.33). These equations may be readily solved (subject to the condition of vanishing initial velocities) as

$$\langle m | \psi(t) \rangle = \langle m | \psi(0) \rangle \cos\left( \frac{m\pi t}{L} \right) \tag{1.10.54}$$

Consequently

$$|\psi(t)\rangle = \sum_{m=1}^{\infty} |m\rangle \langle m | \psi(t) \rangle = \sum_{m=1}^{\infty} |m\rangle \langle m | \psi(0) \rangle \cos \omega_m t, \qquad \omega_m = \frac{m\pi}{L} \tag{1.10.55}$$

or

$$U(t) = \sum_{m=1}^{\infty} |m\rangle \langle m| \cos \omega_m t, \qquad \omega_m = \frac{m\pi}{L} \tag{1.10.56}$$

The propagator equation $|\psi(t)\rangle = U(t)|\psi(0)\rangle$ becomes in the $|x\rangle$ basis

$$\langle x | \psi(t) \rangle = \psi(x, t) = \langle x | U(t) | \psi(0) \rangle = \int_0^L \langle x | U(t) | x' \rangle \langle x' | \psi(0) \rangle \, dx' \tag{1.10.57}$$

It follows from Eq. (1.10.56) that

$$\langle x | U(t) | x' \rangle = \sum_{m=1}^{\infty} \langle x | m \rangle \langle m | x' \rangle \cos \omega_m t = \sum_{m=1}^{\infty} \left( \frac{2}{L} \right) \sin\left( \frac{m\pi x}{L} \right) \sin\left( \frac{m\pi x'}{L} \right) \cos \omega_m t \tag{1.10.58}$$

Thus, given any $\psi(x', 0)$, we can get $\psi(x, t)$ by performing the integral in Eq. (1.10.57), using $\langle x | U(t) | x' \rangle$ from Eq. (1.10.58). If the propagator language seems too abstract, we can begin with Eq.
(1.10.55). Dotting both sides with $\langle x|$, we get

$$\psi(x, t) = \sum_{m=1}^{\infty} \langle x | m \rangle \langle m | \psi(0) \rangle \cos \omega_m t = \sum_{m=1}^{\infty} \left( \frac{2}{L} \right)^{1/2} \sin\left( \frac{m\pi x}{L} \right) \cos \omega_m t\, \langle m | \psi(0) \rangle \tag{1.10.59}$$

Given $|\psi(0)\rangle$, one must then compute

$$\langle m | \psi(0) \rangle = \int_0^L \left( \frac{2}{L} \right)^{1/2} \sin\left( \frac{m\pi x}{L} \right) \psi(x, 0) \, dx$$

Usually we will find that the coefficients $\langle m | \psi(0) \rangle$ fall rapidly with $m$, so that a few leading terms may suffice to get a good approximation. □

Exercise 1.10.4. A string is displaced as follows at $t = 0$:

$$\psi(x, 0) = \frac{2xh}{L}, \quad 0 \le x \le \frac{L}{2}; \qquad \psi(x, 0) = \frac{2h}{L}\,(L - x), \quad \frac{L}{2} \le x \le L$$

$\delta S^{(1)} = 0$ if we go to any nearby path $x_{cl}(t) + \eta(t)$. We use square brackets to enclose the argument of $S$ to remind us that the function $S$ depends on an entire path or function $x(t)$, and not just the value of $x$ at some time $t$. One calls $S$ a functional, to signify that it is a function of a function. (3) The classical path is one on which $S$ is a minimum. (Actually we will only require that it be an extremum. It is, however, customary to refer to this condition as the principle of least action.) We will now verify that this principle reproduces Newton's Second Law.

The first step is to realize that a functional $S[x(t)]$ is just a function of $n$ variables as $n \to \infty$. In other words, the function $x(t)$ simply specifies an infinite number of values $x(t_i), \ldots, x(t), \ldots, x(t_f)$, one for each instant in time $t$ in the interval $t_i \le t \le t_f$, and $S$ is a function of these variables. To find its minimum we simply generalize the procedure for the finite $n$ case. Let us recall that if $f = f(x_1, \ldots, x_n) = f(\mathbf{x})$, the minimum $\mathbf{x}^0$ is characterized by the fact that if we move away from it by a small amount $\eta$ in any direction, the first-order change $\delta f^{(1)}$ in $f$ vanishes. That is, if we make a Taylor expansion:

$$f(\mathbf{x}^0 + \boldsymbol{\eta}) = f(\mathbf{x}^0) + \sum_{i=1}^{n} \left. \frac{\partial f}{\partial x_i} \right|_{\mathbf{x}^0} \eta_i + \text{higher-order terms in } \boldsymbol{\eta} \tag{2.1.4}$$

then

$$\delta f^{(1)} = \sum_{i=1}^{n} \left. \frac{\partial f}{\partial x_i} \right|_{\mathbf{x}^0} \eta_i = 0 \tag{2.1.5}$$

From this condition we can deduce an equivalent and perhaps more familiar expression of the minimum condition: every first-order partial derivative $\partial f/\partial x_i$ vanishes at $\mathbf{x}^0$. To prove this for, say, $\partial f/\partial x_i$, we simply choose $\boldsymbol{\eta}$ to be along the $i$th direction. Thus
$$\left. \frac{\partial f}{\partial x_i} \right|_{\mathbf{x}^0} = 0, \qquad i = 1, \ldots, n \tag{2.1.6}$$

Let us now mimic this procedure for the action $S$. Let $x_{cl}(t)$ be the path of least action and $x_{cl}(t) + \eta(t)$ a "nearby" path (see Fig. 2.2). The requirement that all paths coincide at $t_i$ and $t_f$ means

$$\eta(t_i) = \eta(t_f) = 0 \tag{2.1.7}$$

Now,

$$S[x_{cl}(t) + \eta(t)] = \int_{t_i}^{t_f} \mathscr{L}\big(x_{cl}(t) + \eta(t),\, \dot{x}_{cl}(t) + \dot{\eta}(t)\big) \, dt$$
$$= \int_{t_i}^{t_f} \left[ \mathscr{L}\big(x_{cl}(t), \dot{x}_{cl}(t)\big) + \left. \frac{\partial \mathscr{L}}{\partial x(t)} \right|_{x_{cl}} \eta(t) + \left. \frac{\partial \mathscr{L}}{\partial \dot{x}(t)} \right|_{x_{cl}} \dot{\eta}(t) + \cdots \right] dt$$
$$= S[x_{cl}(t)] + \delta S^{(1)} + \text{higher-order terms}$$

We set $\delta S^{(1)} = 0$ in analogy with the finite variable case:

$$0 = \delta S^{(1)} = \int_{t_i}^{t_f} \left[ \left. \frac{\partial \mathscr{L}}{\partial x(t)} \right|_{x_{cl}} \eta(t) + \left. \frac{\partial \mathscr{L}}{\partial \dot{x}(t)} \right|_{x_{cl}} \dot{\eta}(t) \right] dt$$

If we integrate the second term by parts, it turns into

$$\left. \frac{\partial \mathscr{L}}{\partial \dot{x}(t)} \right|_{x_{cl}} \eta(t) \Bigg|_{t_i}^{t_f} - \int_{t_i}^{t_f} \frac{d}{dt} \left[ \left. \frac{\partial \mathscr{L}}{\partial \dot{x}(t)} \right|_{x_{cl}} \right] \eta(t) \, dt$$

The first of these terms vanishes due to Eq. (2.1.7), so that

$$0 = \delta S^{(1)} = \int_{t_i}^{t_f} \left[ \frac{\partial \mathscr{L}}{\partial x(t)} - \frac{d}{dt}\, \frac{\partial \mathscr{L}}{\partial \dot{x}(t)} \right]_{x_{cl}} \eta(t) \, dt \tag{2.1.8}$$

Note that the conservation of $p$ is obvious in Eq. (2.1.22). The conservation of $P$ follows from Eq. (2.1.19) only after some manipulations and is practically invisible in Eqs. (2.1.16) and (2.1.17). Both the conserved quantity and its conservation law arise naturally in the Lagrangian scheme. □
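The stationarity argument of Eqs. (2.1.4)-(2.1.8) can be illustrated on a computer by treating the path as a finite list of values, exactly as the text suggests. The sketch below (an assumed example: a harmonic oscillator with $m = \omega = 1$, so $\mathscr{L} = \dot{x}^2/2 - x^2/2$) deforms the classical path by $\varepsilon\eta(t)$, with $\eta$ vanishing at the end points, and checks that the action changes only at second order:

```python
import numpy as np

n = 2001
t = np.linspace(0.0, 1.0, n)
dt = t[1] - t[0]

def action(x):
    """Midpoint discretization of S = integral of (x_dot^2 - x^2)/2 dt."""
    xdot = (x[1:] - x[:-1]) / dt
    xmid = (x[1:] + x[:-1]) / 2
    return np.sum(0.5 * xdot**2 - 0.5 * xmid**2) * dt

x_cl = np.sin(t) / np.sin(1.0)          # classical path with x(0)=0, x(1)=1
eta = t * (1 - t)                       # obeys eta(t_i) = eta(t_f) = 0

# delta S^(1) vanishes on the classical path (central difference in eps):
eps = 1e-4
first_order = (action(x_cl + eps * eta) - action(x_cl - eps * eta)) / (2 * eps)
assert abs(first_order) < 1e-5

# A finite deformation does change S; here t_f - t_i < pi, so the classical
# path is in fact a minimum and the action increases:
assert action(x_cl + 0.5 * eta) > action(x_cl)
```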