Albert Einstein (1879-1955)

Oxford University Press, Great Clarendon Street, Oxford OX2 6DP

Oxford New York
Athens Auckland Bangkok Bogotá Bombay Buenos Aires Calcutta Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Paris São Paulo Singapore Taipei Tokyo Toronto Warsaw
and associated companies in Berlin Ibadan

Oxford is a trade mark of Oxford University Press

Published in the United States by Oxford University Press Inc., New York

© Ray d'Inverno, 1992
Reprinted 1993, 1995 (with corrections), 1996, 1998

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press. Within the UK, exceptions are allowed in respect of any fair dealing for the purpose of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms and in other countries should be sent to the Rights Department, Oxford University Press, at the address above.

This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form of binding or cover other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser.

A catalogue record for this book is available from the British Library

Library of Congress Cataloging in Publication Data
d'Inverno, R. A.
Introducing Einstein's relativity/R. A. d'Inverno.
Includes bibliographical references and index.
1. Relativity (Physics) 2. Black holes (Astronomy) 3. Gravitation. 4. Cosmology. 5. Calculus of tensors. I. Title.
QC173.55.I58 1992 530.1'1-dc20 91-24894

ISBN 0 19 859653 7 (Hbk)
ISBN 0 19 859686 3 (Pbk)

Printed in Malta by Interprint Limited

Contents

Overview
1. The organization of the book
  1.1 Notes for the student
  1.2 Acknowledgements
  1.3 A brief survey of relativity theory
  1.4 Notes for the teacher
  1.5 A final note for the less able student
  Exercises

Part A. Special Relativity
2. The k-calculus
  2.1 Model building
  2.2 Historical background
  2.3 Newtonian framework
  2.4 Galilean transformations
  2.5 The principle of special relativity
  2.6 The constancy of the velocity of light
  2.7 The k-factor
  2.8 Relative speed of two inertial observers
  2.9 Composition law for velocities
  2.10 The relativity of simultaneity
  2.11 The clock paradox
  2.12 The Lorentz transformations
  2.13 The four-dimensional world view
  Exercises
3. The key attributes of special relativity
  3.1 Standard derivation of the Lorentz transformations
  3.2 Mathematical properties of Lorentz transformations
  3.3 Length contraction
  3.4 Time dilation
  3.5 Transformation of velocities
  3.6 Relationship between space-time diagrams of inertial observers
  3.7 Acceleration in special relativity
  3.8 Uniform acceleration
  3.9 The twin paradox
  3.10 The Doppler effect
  Exercises
4. The elements of relativistic mechanics
  4.1 Newtonian theory
  4.2 Isolated systems of particles in Newtonian mechanics
  4.3 Relativistic mass
  4.4 Relativistic energy
  4.5 Photons
  Exercises

Part B. The Formalism of Tensors
5. Tensor algebra
  5.1 Introduction
  5.2 Manifolds and coordinates
  5.3 Curves and surfaces
  5.4 Transformation of coordinates
  5.5 Contravariant tensors
  5.6 Covariant and mixed tensors
  5.7 Tensor fields
  5.8 Elementary operations with tensors
  5.9 Index-free interpretation of contravariant vector fields
  Exercises
6. Tensor calculus
  6.1 Partial derivative of a tensor
  6.2 The Lie derivative
  6.3 The affine connection and covariant differentiation
  6.4 Affine geodesics
  6.5 The Riemann tensor
  6.6 Geodesic coordinates
  6.7 Affine flatness
  6.8 The metric
  6.9 Metric geodesics
  6.10 The metric connection
  6.11 Metric flatness
  6.12 The curvature tensor
  6.13 The Weyl tensor
  Exercises
7. Integration, variation, and symmetry
  7.1 Tensor densities
  7.2 The Levi-Civita alternating symbol
  7.3 The metric determinant
  7.4 Integrals and Stokes' theorem
  7.5 The Euler-Lagrange equations
  7.6 The variational method for geodesics
  7.7 Isometries
  Exercises

Part C. General Relativity
8. Special relativity revisited
  8.1 Minkowski space-time
  8.2 The null cone
  8.3 The Lorentz group
  8.4 Proper time
  8.5 An axiomatic formulation of special relativity
  8.6 A variational principle approach to classical mechanics
  8.7 A variational principle approach to relativistic mechanics
  8.8 Covariant formulation of relativistic mechanics
  Exercises
9. The principles of general relativity
  9.1 The role of physical principles
  9.2 Mach's principle
  9.3 Mass in Newtonian theory
  9.4 The principle of equivalence
  9.5 The principle of general covariance
  9.6 The principle of minimal gravitational coupling
  9.7 The correspondence principle
  Exercises
10. The field equations of general relativity
  10.1 Non-local lift experiments
  10.2 The Newtonian equation of deviation
  10.3 The equation of geodesic deviation
  10.4 The Newtonian correspondence
  10.5 The vacuum field equations of general relativity
  10.6 The story so far
  10.7 The full field equations of general relativity
  Exercises
11. General relativity from a variational principle
  11.1 The Palatini equation
  11.2 Differential constraints on the field equations
  11.3 A simple example
  11.4 The Einstein Lagrangian
  11.5 Indirect derivation of the field equations
  11.6 An equivalent Lagrangian
  11.7 The Palatini approach
  11.8 The full field equations
  Exercises
12. The energy-momentum tensor
  12.1 Preview
  12.2 Incoherent matter
  12.3 Perfect fluid
  12.4 Maxwell's equations
  12.5 Potential formulation of Maxwell's equations
  12.6 The Maxwell energy-momentum tensor
  12.7 Other energy-momentum tensors
  12.8 The dominant energy condition
  12.9 The Newtonian limit
  12.10 The coupling constant
  Exercises
13. The structure of the field equations
  13.1 Interpretation of the field equations
  13.2 Determinacy, non-linearity, and differentiability
  13.3 The cosmological term
  13.4 The conservation equations
  13.5 The Cauchy problem
  13.6 The hole problem
  13.7 The equivalence problem
  Exercises
14. The Schwarzschild solution
  14.1 Stationary solutions
  14.2 Hypersurface-orthogonal vector fields
  14.3 Static solutions
  14.4 Spherically symmetric solutions
  14.5 The Schwarzschild solution
  14.6 Properties of the Schwarzschild solution
  14.7 Isotropic coordinates
  Exercises
15. Experimental tests of general relativity
  15.1 Introduction
  15.2 Classical Kepler motion
  15.3 Advance of the perihelion of Mercury
  15.4 Bending of light
  15.5 Gravitational red shift
  15.6 Time delay of light
  15.7 The Eötvös experiment
  15.8 Solar oblateness
  15.9 A chronology of experimental and observational events
  15.10 Rubber-sheet geometry
  Exercises

Part D. Black Holes
16. Non-rotating black holes
  16.1 Characterization of coordinates
  16.2 Singularities
  16.3 Spatial and space-time diagrams
  16.4 Space-time diagram in Schwarzschild coordinates
  16.5 A radially infalling particle
  16.6 Eddington-Finkelstein coordinates
  16.7 Event horizons
  16.8 Black holes
  16.9 A classical argument
  16.10 Tidal forces in a black hole
  16.11 Observational evidence for black holes
  16.12 Theoretical status of black holes
  Exercises
17. Maximal extension and conformal compactification
  17.1 Maximal analytic extensions
  17.2 The Kruskal solution
  17.3 The Einstein-Rosen bridge
  17.4 Penrose diagram for Minkowski space-time
  17.5 Penrose diagram for the Kruskal solution
  Exercises
18. Charged black holes
  18.1 The field of a charged mass point
  18.2 Intrinsic and coordinate singularities
  18.3 Space-time diagram of the Reissner-Nordström solution
  18.4 Neutral particles in Reissner-Nordström space-time
  18.5 Penrose diagrams of the maximal analytic extensions
  Exercises
19. Rotating black holes
  19.1 Null tetrads
  19.2 The Kerr solution from a complex transformation
  19.3 The three main forms of the Kerr solution
  19.4 Basic properties of the Kerr solution
  19.5 Singularities and horizons
  19.6 The principal null congruences
  19.7 Eddington-Finkelstein coordinates
  19.8 The stationary limit
  19.9 Maximal extension for the case a² < m²
  19.10 Maximal extension for the case a² > m²
  19.11 Rotating black holes
  19.12 The singularity theorems
  19.13 The Hawking effect
  Exercises

Part E. Gravitational Waves
20. Plane gravitational waves
  20.1 The linearized field equations
  20.2 Gauge transformations
  20.3 Linearized plane gravitational waves
  20.4 Polarization states
  20.5 Exact plane gravitational waves
  20.6 Impulsive plane gravitational waves
  20.7 Colliding impulsive plane gravitational waves
  20.8 Colliding gravitational waves
  20.9 Detection of gravitational waves
  Exercises
21. Radiation from an isolated source
  21.1 Radiating isolated sources
  21.2 Characteristic hypersurfaces of Einstein's equations
  21.3 Radiation coordinates
  21.4 Bondi's radiating metric
  21.5 The characteristic initial value problem
  21.6 News and mass loss
  21.7 The Petrov classification
  21.8 The peeling-off theorem
  21.9 The optical scalars
  Exercises

Part F. Cosmology
22. Relativistic cosmology
  22.1 Preview
  22.2 Olbers' paradox
  22.3 Newtonian cosmology
  22.4 The cosmological principle
  22.5 Weyl's postulate
  22.6 Relativistic cosmology
  22.7 Spaces of constant curvature
  22.8 The geometry of 3-spaces of constant curvature
  22.9 Friedmann's equation
  22.10 Propagation of light
  22.11 A cosmological definition of distance
  22.12 Hubble's law in relativistic cosmology
  Exercises
23. Cosmological models
  23.1 The flat space models
  23.2 Models with vanishing cosmological constant
  23.3 Classification of Friedmann models
  23.4 The de Sitter model
  23.5 The first models
  23.6 The time-scale problem
  23.7 Later models
  23.8 The missing matter problem
  23.9 The standard models
  23.10 Early epochs of the universe
  23.11 Cosmological coincidences
  23.12 The steady-state theory
  23.13 The event horizon of the de Sitter universe
  23.14 Particle and event horizons
  23.15 Conformal structure of Robertson-Walker space-times
  23.16 Conformal structure of de Sitter space-time
  23.17 Inflation
  23.18 The anthropic principle
  23.19 Conclusion
  Exercises

Answers to exercises
Further reading
Selected bibliography
Index
1.1 Notes for the student

There is little doubt that relativity theory captures the imagination. Nor is it surprising: the anti-intuitive properties of special relativity, the bizarre characteristics of black holes, the exciting prospect of gravitational wave detection and with it the advent of gravitational wave astronomy, and the sheer scope and nature of cosmology and its posing of ultimate questions; these and other issues combine to excite the minds of the inquisitive. Yet, if we are to look at these issues meaningfully, then we really require both physical insight and a sound mathematical foundation. The aim of this book is to help provide these.

The book grew out of some notes I wrote in the mid-1970s to accompany a UK course on general relativity.
Originally, the course was a third-year undergraduate option aimed at mathematicians and physicists. It subsequently grew to include M.Sc. students and some first-year Ph.D. students. Consequently, the notes, and with them the book, are pitched principally at the undergraduate level, but they contain sufficient depth and coverage to interest many students at the first-year graduate level. To help fulfil this dual purpose, I have indicated the more advanced sections (level-two material) by a grey shaded bar alongside the appropriate section. Level-one material is essential to the understanding of the book, whereas level two is enrichment material included for the more advanced student. To help put a bit more light and shade into the book, the more important equations and results are given in tint panels.

In designing the course, I set myself two main objectives. First of all, I wanted the student to gain insight into, and confidence in handling, the basic equations of the theory. From the mathematical viewpoint, this requires good manipulative ability with tensors. Part B is devoted to developing the necessary expertise in tensors for the rest of the book. It is essentially written as a self-study unit. Students are urged to attempt all the exercises which accompany the various sections. Experience has shown that this is the only real way to be in a position to deal confidently with the ensuing material. From the physical viewpoint, I think the best route to understanding relativity theory is to follow the one taken by Einstein. Thus the second chapter of Part C is devoted to discussing the principles which guided Einstein in his search for a relativistic theory of gravitation. The field equations are approached first from a largely physical viewpoint using these principles and subsequently from a purely mathematical viewpoint using the variational principle approach.
After a chapter devoted to investigating the quantity which goes on the 'right-hand side' of the equations, the structure of the equations is discussed as a prelude to solving them in the simplest case. This part of the course ends by considering the experimental status of general relativity. The course originally assumed that the student had some reasonable knowledge of special relativity. In fact, over the years, a growing number of students have taken the course without this background, and so, for completeness, I eventually added Part A. This is designed to provide an introduction to special relativity sufficient for the needs of the rest of the book.

The second main objective of the course was to develop it in such a way that it would be possible to reach three major topics of current interest, namely, black holes, gravitational waves, and cosmology. These topics form the subject matter of Parts D, E, and F respectively.

Each of the chapters is supported by exercises, numbering some 300 in total. The bulk of these are straightforward calculations used to fill in parts omitted in the text. The numbers in parentheses indicate the sections to which the exercises refer. Although the exercises in general are important in aiding understanding, their status is different from those in Part B. I see the exercises in Part B as being absolutely essential for understanding the rest of the book and they should not be omitted. The remaining exercises are desirable.

The book is neither exhaustive nor complete, since there are topics in the theory which we do not cover or only meet briefly. However, it is hoped that it provides the student with a sound understanding of the basics of the theory.

A few words of advice if you find studying from a book hard going. Remember that understanding is not an all-or-nothing process. One understands things at deeper and deeper levels, as various connections are made or ideas are seen in different contexts or from a different perspective.
So do not simply attempt to study a section by going through it line by line and expect it all to make sense at the first go. It is better to begin by reading through a few sections quickly (skimming), thereby trying to get a general feel for the scope, level, and coverage of the subject matter. A second reading should be more thorough, but should not stop if ideas are met which are not clear straightaway. In a final pass, the sections should be studied in depth with the exercises attempted at the end of each section. However, if you get stuck, do not stop there; press on. You will often find that the penny will drop later, sometimes on its own, or that subsequent work will produce the necessary understanding. Many exercises (and exam questions) are hierarchical in nature. They require you to establish a result at one stage which is then used at a subsequent stage. If you cannot establish the result, then do not give up. Try and use it in the subsequent section. You will often find that this will give you the necessary insight to allow you to go back and establish the earlier result. For most students, frequent study sessions of not too long a duration are more productive than occasional long-drawn-out sessions. The best study environment varies greatly from one individual to another. Try experimenting with different environments to find out what is the most effective for you.

As far as initial conditions are concerned, that is, assumptions about your background, it is difficult to be precise, because you can probably get by with much less than the book might seem to indicate (see §1.5). Added to which, there is a big difference between understanding a topic fully and only having some vague acquaintance with it. On the mathematical side, you certainly need to know calculus, up to and including partial differentiation, and solution of simple ordinary differential equations.
Basic algebra is assumed and some matrix theory, although you can probably take eigenvalues and diagonalisation on trust. Familiarity with vectors and some exposure to vector fields is assumed. It would also be good to have met the ideas of a vector space and bases. We use Taylor's theorem a lot, but probably knowledge of Maclaurin's will be sufficient. On the physics side, you obviously need to know Newton's laws and Newtonian gravitation. It would be helpful also to know a little about the potential formulation of gravitation (though, again, just the basics will do). The book assumes familiarity with electromagnetism (Maxwell's equations, in particular) and fluid dynamics (the Navier-Stokes equation, in particular), but neither of these is absolutely essential. It would be very helpful to have met some ideas about waves (such as the fundamental relationship c = λν) and the wave equation in particular. In cosmology, it is assumed that you know something about basic astronomy.

Having listed all these topics, then, if you are still unsure about your background, my approach would be to say: try the book and see how you get on; if it gets beyond you (and it is not a level-two section), press on for a bit and, if things do not get any better, then cut out. Hopefully, you may still have learnt a lot, and you can always come back to it when your background is stronger. In fact, it should not require much background to get started, for Part A on special relativity assumes very little. After that you hit Part B, and this is where your motivation will be seriously tested. I hope you make it through because the pickings on the other side are very rich indeed. So, finally, good luck!

1.2 Acknowledgements

Very little of this book is wholly original. When I drew up the notes, I decided from the outset that I would collect together the best approaches to the material which were known to me.
Thus, to take an example right from the beginning of the book, I believe that the k-calculus provides the best introduction to special relativity, because it offers insight from the outset through the simple diagrams that can be drawn. Indeed one of the themes of this book is the provision of a large number of illustrative diagrams (over 200 in fact). The visual sense is the most immediate we possess and helps lead directly to a better comprehension. A good subtitle for the book would be 'An approach to relativity theory via space-time pictures'. The k-calculus is an approach developed by H. Bondi from the earlier ideas of A. Milne. My use of it is not surprising since I spent my years as a research student at King's College, London, in the era of Hermann Bondi and Felix Pirani, and many colleagues will detect their influences throughout the book. So the fact is that many of the approaches in the book have been borrowed from one author or another; there is little that I have written completely afresh. My intention has been to organize the material in such a way that it is the more readily accessible to the majority of students.

General relativity has the reputation of being intellectually very demanding. There is the apocryphal story, I think attributed to Sir Arthur Eddington, who, when asked whether he believed it true that only three people in the world understood general relativity, replied, 'Who is the third?' Indeed, the intellectual leap required by Einstein to move from the special theory to the general theory is, there can be little doubt, one of the greatest in the history of human thought. So it is not surprising that the theory has the reputation it does. However, general relativity has been with us for some three-quarters of a century and our understanding is such that we can now build it up in a series of simple logical steps.
This brings the theory within the grasp of most undergraduates equipped with the right background.

Quite clearly, I owe a huge debt to all the authors who have provided the source material for and inspiration of this book. However, I cannot make the proper detailed acknowledgements to all these authors, because some of them are not known even to me, and I would otherwise run the risk of leaving somebody out. Most of the sources can be found in the bibliography given at the end of the book, and some specific references can be found in the section on further reading. I sincerely hope I have not offended anyone (authors or publishers) in adopting this approach. I have written this book in the spirit that any explanation that aids understanding should ultimately reside in the pool of human knowledge and thence in the public domain. None the less, I would like to thank all those who, wittingly or unwittingly, have made this book possible. In particular, I would like to thank my old Oxford tutor, Alan Tayler, since it was largely his backing that led finally to the book being produced.

In the process of converting the notes to a book, I have made a number of changes, and have added sections, further exercises, and answers. Consequently this new material, unlike the earlier, has not been vetted by the student body and it seems more than likely that it may contain errors of one sort or another. If this is the case, I hope that it does not detract too much from the book and, of course, I would be delighted to receive corrections from readers. However, I have sought some help and, in this respect, I would particularly like to thank my colleague James Vickers for a critical reading of much of the book.

Having said I do not wish to cite my sources, I now wish to make one important exception.
I think it would generally be accepted in the relativity community that the most authoritative text in existence in the field is The large scale structure of space-time by Stephen Hawking and George Ellis (published by Cambridge University Press). Indeed, this has taken on something akin to the status of the Bible in the field. However, it is written at a level which is perhaps too sophisticated for most undergraduates (in parts too sophisticated for most specialists!). When I compiled the notes, I had in mind the aspiration that they might provide a small stepping stone to Hawking and Ellis. In particular, I hoped it might become the next port of call for anyone wishing to pursue their interest further. To that end, and because I cannot improve on it, I have in places included extracts from that source virtually verbatim. I felt that, if students were to consult this text, then the familiarity of some of the material might instil confidence and encourage them to delve deeper. I am hugely indebted to the authors for allowing me to borrow from their superb book.

1.3 A brief survey of relativity theory

It might be useful, before embarking on the course proper, to attempt to give some impression of the areas which come under the umbrella of relativity theory. I have attempted this schematically in Fig. 1.1. This is a rather partial and incomplete view, but should help to convey some idea of our planned route. Most of the topics mentioned are being actively researched today. Of course, they are interrelated in a much more complex way than the diagram suggests.

[Fig. 1.1 An individual survey of relativity. Schematic linking special relativity, general relativity, and cosmology to neighbouring fields and to the main research areas: experimental tests, exact solutions, formalisms, gravitational radiation, gravitational collapse, the initial value problem, alternative theories, unified field theory, and quantum gravity.]

Every few years since 1955 (in fact every three since 1959), the relativity community comes together in an international conference of general relativity and gravitation. The first such conference held in Berne in 1955 is now referred to as GR0, with the subsequent ones numbered accordingly. The list, to date, of the GR conferences is given in Table 1.1. At these conferences, there are specialist discussion groups which are held covering the whole area of interest. Prior to GR8, a list was published giving some detailed idea of what each discussion group would cover.
This is presented below and may be used as an alternative to Fig. 1.1 to give an idea of the topics which comprise the subject. Table 1.1 GRO 1955 Bern, Switzerland GRl 1957 Chapel Hill , North Carolina , USA GR2 1959 Royaumont, France GR3 1962 Jablonna, Poland GR4 1965 London, England GR5 1968 Tbilisi, USSR GR6 1971 Copenhagen, Denmark GR7 1974 Tel-Aviv, Israel GR8 1977 Waterloo, Canada GR9 1980 Jena, DDR GRlO 1983 Padua, Italy GRll 1986 Stockholm , Sweden GR12 1989 Boulder, Colorado, USA 8 I The organization of the book I. Relativity and astrophysics Relativistic stars and binaries; pulsars and quasars; gravitational waves and gravitational collapse; black holes; X-ray sources and accretion models. II. Relativity and classical physics Equations of motion; conservation laws; kinetic theory; asymptotic flatness and the positivity of energy; Hamiltonian theory, Lagrangians, and field theory; relativistic continuum mechanics, electrodynamics, and thermodynamics. III. Mathematical relativity Differential geometry and fibre bundles; the topology of manifolds; applications of complex manifolds; twistors; causal and conformal structures; partial differential equations and exact solutions; stability; geometric singularities and catastrophe theory; spin and torsion: Einstein-Cartan theory. IV. Relativity and quantum physics Quantum theory on curved backgrounds; quantum gravity; gravitation and elementary particles; black hole evaporation; quantum cosmology. V. Cosmology Galaxy formation; super-clustering; cosmological consequences of spontaneous symmetry breakdown: domain structures; current estimates of cosmological parameters; radio source counts; microwave background; the isotropy of the universe; singularities. VI. Observational and experimental relativity Theoretical frameworks and viable theories; tests of relativity; gravitational wave detection; solar oblateness. VII. 
Computers in relativity
Numerical methods; solution of field equations; symbolic manipulation systems in general relativity.

1.4 Notes for the teacher

In my twenty years as a university lecturer, I have undergone two major conversions which have profoundly affected the way I teach. These have, in their way, contributed to the existence of this book. The first conversion was to the efficacy of the printed word. I began teaching, probably like most of my colleagues, by giving lectures using the medium of chalk and talk. I soon discovered that this led to something of a conflict in that the main thing that students want from a course (apart from success in the exam) is a good set of lecture notes, whereas what I really wanted was that they should understand the course. The process of trying to give students a good set of lecture notes meant that there was, to me, a lot of time wasted in the process of note taking. I am sure colleagues know the caricature of the conventional lecture: notes are copied from the lecturer's notebook to the student's notebook without their going through the heads of either, a definition which is perhaps too close for comfort.

I was converted at an early stage to the desirability of providing students with printed notes. The main advantage is that it frees up the lecture period from the time-consuming process of note copying, and the time released can be used more effectively for developing and explaining the course at a rate which the students are able to cope with. I still find that there is something rather final and definitive about the printed word. This has the effect on me of making me think more carefully about what goes into a course and how best to organize it. Thus, printed notes have the added advantage of making me put more into the preparation of a course than I would have done otherwise. It must be admitted that there are some disadvantages with using printed notes, but this is not the place to elaborate on them.
My second conversion was to the efficacy of self-study. This is a rather elaborate title for the concept of students studying and learning on their own from books or prepared materials. It is still a surprise to me just how little of this actually goes on in certain disciplines. And yet you would think that one of the main objectives of a university education is to teach students how to use books. My experience is that, in mathematics particularly, students find this hard to do. This is not so surprising since it requires high-level skills which many do not come to university equipped with. So one needs a mechanism which encourages students to use books. My first experience was in designing a Keller-type (self-paced) self-study course, where the students study from specially prepared units and are required to pass periodic tests before they move on to new topics. This eventually led me in other courses to use a coursework component counting towards a final assessment as a mechanism for helping to get students to study on their own. I have been involved in a good deal of research into this approach and the most frequent remark students make about coursework is that 'it gets me to work'.

The coursework approach was particularly important in the design of the general relativity course for reasons which I believe are worth exploring. In the mid-1970s, there were very few undergraduate courses in general relativity in existence in the UK. Those that there were usually only got as far as the Schwarzschild solution and then stopped. This was because the bulk of the course was devoted to developing the necessary expertise in tensors and there did not seem to be any short cut. This meant, from the viewpoint of both the student and the teacher, that the course ended just as things were beginning to get really interesting. It was clear to me that what students really wanted to know about most were the topics of black holes, gravitational waves, and cosmology.
So, from the outset, the object was to design a course which made this possible. It was achieved by separating out what is Part B of this book as a self-study unit on tensors. The notes were distributed at the beginning of the course and the students were instructed to begin immediately working through the self-study part and attempting all the exercises. The fact that most students put in the bulk of their efforts in their other courses towards the end of these courses helped in this respect, since they were less heavily loaded at the outset. The students were offered some optional tutorials in case they got stuck (as some undertaking individual study for the first time invariably did). The inducement for doing the exercises was that they counted towards the final assessment (by some 35 per cent currently). The deadline for completing the exercises was set for about a third of the way through the course.

While the students were busy in their own time working on the tensors, the lecture course began by revising the key ideas in special relativity. The special theory was then formulated in a tensorial way, making use of the new language and so providing some initial motivation. This was followed by a detailed and deliberate development of the principles underlying general relativity. Tensors are then used in earnest for the first time in deriving the equation of geodesic deviation of Chapter 10. It is by about this time that the students have gained considerable expertise in manipulating tensors and the lectures help to provide further motivation and consolidation. This device means that the Schwarzschild solution can be reached by not much more than half-way through the course.

Another important advantage of printed lecture notes is that one has much more control over the speed at which the course is delivered, and one can to some extent tune the speed to fit the capabilities of the class.
The Southampton course is some thirty-six lectures in length. In the early years, when the students had a good background in special relativity, I was able to cover all three end topics. Indeed, in the first year of operation, I ended up in the final week by organizing five seminars given by outside speakers which all the students attended and which attempted to show how the work we had covered related to some topics of current research interest. In more recent years, the preparation of the students in special relativity has been more patchy, and so I have taken this more on board and have been somewhat less ambitious. This has usually meant leaving out a topic such as rotating black holes or gravitational radiation. Of course, since these are contained in the notes, the students are able to fill in these gaps if they so choose.

I have been encouraged to write up the notes in book form for a number of reasons. The course has been running for some fifteen years and several hundred students have been through it, so that I have a good deal of consumer experience to draw upon. Not only has the course proved popular, but it seems to have coped surprisingly successfully with students of a wide ability range. This may in part be because I have included many of the more detailed steps in the text itself (and where these have been left out they have often been relegated to straightforward exercises). In fact, the notes are sold to the students to cover the cost of production. It has been gratifying that each year a number of students who are not on the course, sometimes not even in a related discipline, but who have by chance come across the notes, purchase a copy for themselves. Finally, a number of my relativity colleagues both in the UK and abroad have asked for copies and used them in varying degrees in their own courses. So, despite the fact that there are a number of fine texts around in the area, I have agreed to present the notes in book form.
I hope you, the teacher, find them a valuable resource in your teaching.

1.5 A final note for the less able student

I was far from being a child prodigy, and yet I learnt relativity at the age of 15! Let me elaborate. As testimony to my intellectual ordinariness, I had left my junior school at the age of 11 having achieved the unremarkable feat of coming 22nd in the class in my final set of examinations. Yet I really did know some relativity four years on, and I don't just mean the special theory, but the general theory (up to and including the Schwarzschild solution and the classical tests). I remember detecting a hint of disbelief when I recounted this to the same Alan Tayler, who was later to become my tutor, in an Oxford entrance interview. He followed up by asking me to define a tensor, and when I rattled off a definition, he seemed somewhat surprised. Indeed, as it turned out, we did not cover very much more than I first knew in the Oxford third year specialist course on general relativity.

So how was this possible? I, too, had heard the story about how only a few people in the world really understood relativity, and it had aroused my curiosity. I went to the local library and, as luck would have it, I pulled out a book entitled Einstein's Theory of Relativity by Lillian Lieber (1949). This is a very bizarre book in appearance. The book is not set out in the usual way but rather as though it were concrete poetry. Moreover, it is interspersed by surrealist drawings by Hugh Lieber involving the symbols from the text (Fig. 1.2). I must confess that at first sight the book looks rather cranky; but it is not. I worked through it, filling in all the details missing from the calculations as I went. What was amazing was that the book did not make too many assumptions about what mathematics the reader needed to know.
For example, I had not then met partial differentiation in my school mathematics, and yet there was sufficient coverage in the book for me to cope. It felt almost as if the book had been written just for me. The combination of the intrinsic interest of the material and the success I had in doing the intervening calculations provided sufficient motivation for me to see the enterprise through to the end.

Perhaps, if you consider yourself a less able student, you are a bit daunted by the intellectual challenge that lies ahead. I will not deny that the book includes some very demanding ideas (indeed, I do not understand every facet of all of these ideas myself). But I hope the two facts, that the arguments are broken down into small steps and that the calculations are doable, will help you on your way. Even if you decide to cut out after Part C, you will have come a long way. Take heart from my little story: I am certain that if you persevere you will consider it worth the effort in the end.

Fig. 1.2 'The product of two tensors is equal to another' according to Hugh Lieber.

Exercises

1.1 (§1.3) Go to the library and see if you can locate current copies of the following journals:
(i) General Relativity and Gravitation;
(ii) Classical and Quantum Gravity;
(iii) Journal of Mathematical Physics;
(iv) Physical Review D.
See if you can relate any of the articles in them to any of the topics contained in Fig. 1.1.

1.2 Look back through copies of Scientific American for future reference, to see what articles there have been in recent years on relativity theory, especially black holes, gravitational waves, and cosmology.

1.3 Read a biography of Einstein (see Part A of the Selected Bibliography at the end of this book).

2.1 Model building

Before we start, we should be clear what we are about. The essential activity of mathematical physics, or theoretical physics, is that of modelling or model building.
The activity consists of constructing a mathematical model which we hope in some way captures the essentials of the phenomena we are investigating. I think we should never fail to be surprised that this turns out to be such a productive activity. After all, the first thing you notice about the world we inhabit is that it is an extremely complex place. The fact that so much of this rich structure can be captured by what are, in essence, a set of simple formulae is to me quite astonishing. Just think how simple Newton's universal law of gravitation is; and yet it encompasses a whole spectrum of phenomena from a falling apple to the shape of a globular cluster of stars. As Einstein said, 'The most incomprehensible thing about the world is that it is comprehensible.'

The very success of the activity of modelling has, throughout the history of science, turned out to be counterproductive. Time and again, the successful model has been confused with the ultimate reality, and this in turn has stultified progress. Newtonian theory provides an outstanding example of this. So successful had it been in explaining a wide range of phenomena, that, after more than two centuries of success, the laws had taken on an absolute character. Thus it was that, when at the end of the nineteenth century it was becoming increasingly clear that something was fundamentally wrong with the current theories, there was considerable reluctance to make any fundamental changes to them. Instead, a number of artificial assumptions were made in an attempt to explain the unexpected phenomena. It eventually required the genius of Einstein to overthrow the prejudices of centuries and demonstrate in a number of simple thought experiments that some of the most cherished assumptions of Newtonian theory were untenable. This he did in a number of brilliant papers written in 1905 proposing a theory which has become known today as the special theory of relativity.
We should perhaps be discouraged from using words like right or wrong when discussing a physical theory. Remembering that the essential activity is model building, a model should then rather be described as good or bad, depending on how well it describes the phenomena it encompasses. Thus, Newtonian theory is an excellent theory for describing a whole range of phenomena. For example, if one is concerned with describing the motion of a car, then the Newtonian framework is likely to be the appropriate one. However, it fails to be appropriate if we are interested in very high speeds (comparable with the speed of light) or very intense gravitational fields (such as in the nucleus of a galaxy). To put it another way: together with every theory, there should go its range of validity. Thus, to be more precise, we should say that Newtonian theory is an excellent theory within its range of validity. From this point of view, developing our models of the physical world does not involve us in constantly throwing theories out, since they are perceived to be wrong, or unlearning them, but rather it consists more of a process of refinement in order to increase their range of validity. So the moral of this section is that, for all their remarkable success, one must not confuse theoretical models with the ultimate reality they seek to describe.

2.2 Historical background

In 1865, James Clerk Maxwell put forward the theory of electromagnetism. One of the triumphs of the theory was the discovery that light waves are electromagnetic in character. Since all other known wave phenomena required a material medium in which the oscillations were carried, it was postulated that there existed an all-pervading medium, called the 'luminiferous ether', which carried the oscillations of electromagnetism. It was then anticipated that experiments with light would allow the absolute motion of a body through the ether to be detected.
Such hopes were upset by the null result of the famous Michelson-Morley experiment (1881) which attempted to measure the velocity of the earth relative to the ether and found it to be undetectably small. In order to explain this null result, two ad hoc hypotheses were put forward by Lorentz, Fitzgerald, and Poincaré (1895), namely, the contraction of rigid bodies and the slowing down of clocks when moving through the ether. These effects were contained in some simple formulae called the 'Lorentz transformations'. This would affect every apparatus designed to measure the motion relative to the ether so as to neutralize exactly all expected results. Although this theory was consistent with the observations, it had the philosophical defect that its fundamental assumptions were unverifiable.

In fact, the essence of the special theory of relativity is contained in the Lorentz transformations. However, Einstein was able to derive them from two postulates, the first being called the 'principle of special relativity' (a principle which Poincaré had also suggested independently in 1904) and the second concerning the constancy of the velocity of light. In so doing, he was forced to re-evaluate our ideas of space and time and he demonstrated through a number of simple thought experiments that the source of the limitations of the classical theory lay in the concept of simultaneity. Thus, although in a sense Einstein found nothing new in that he rederived the Lorentz transformations, his derivation was physically meaningful and in the process revealed the inadequacy of some of the fundamental assumptions of classical thought. Herein lies his chief contribution.

2.3 Newtonian framework

We start by outlining the Newtonian framework. An event intuitively means something happening in a fairly limited region of space and for a short duration in time. Mathematically, we idealize this concept to become a point

Fig.
2.1 Train travels in a straight line.

in space and an instant in time. Everything that happens in the universe is an event or collection of events. Consider a train travelling from one station P to another R, leaving at 10 a.m. and arriving at 11 a.m. We can illustrate this in the following way: for simplicity, let us assume that the motion takes place in a straight line (say along the x-axis, Fig. 2.1); then we can represent the motion by a space-time diagram (Fig. 2.2) in which we plot the position of some fixed point on the train, which we represent by a pointer, against time. The curve in the diagram is called the history or world-line of the pointer. Notice that at Q the train was stationary for a period.

We shall call individuals equipped with a clock and a measuring rod or ruler observers. Had we looked out of the train window on our journey at a clock in a passing station then we would have expected it to agree with our watch. One of the central assumptions of the Newtonian framework is that two observers will, once they have synchronized their clocks, always agree about the time of an event, irrespective of their relative motion. This implies that for all observers time is an absolute concept. In particular, all observers can agree on an origin of time. In order to fix an event in space, an observer may choose a convenient origin in space together with a set of three Cartesian coordinate axes. We shall refer to an observer's clock, ruler, and coordinate axes as a frame of reference (Fig. 2.3). Then an observer is able to coordinatize events, that is, determine the time t an event occurs and its relative position (x, y, z).

We have set the stage with space and time; they provide the backcloth, but what is the story about? The stuff which provides the events of the universe is matter. For the moment, we shall idealize lumps of matter into objects called bodies. If the body has no physical extent, we refer to it as a particle or point mass.
Thus, the role of observers in Newtonian theory is to chart the history of bodies.

Fig. 2.2 Space-time diagram of pointer.

Fig. 2.3 Observer's frame of reference.

2.4 Galilean transformations

Now, relativity theory is concerned with the way different observers see the same phenomena. One can ask: are the laws of physics the same for all observers or are there preferred states of motion, preferred reference systems, and so on? Newtonian theory postulates the existence of preferred frames of reference. This is contained essentially in the first law, which we shall call N1 and state in the following form:

N1: Every body continues in its state of rest or of uniform motion in a straight line unless it is compelled to change that state by forces acting on it.

Thus, there exists a privileged set of bodies, namely those not acted on by forces. The frame of reference of a co-moving observer is called an inertial frame (Fig. 2.4). It follows that, once we have found one inertial frame, then all others are at rest or travel with constant velocity relative to it (for otherwise Newton's first law would no longer be true). The transformation which connects one inertial frame with another is called a Galilean transformation.

Fig. 2.4 Two observed bodies and their inertial frames.

To fix ideas, let us consider two inertial frames called S and S' in standard configuration, that is, with axes parallel and S' moving along S's positive x-axis with constant velocity v (Fig. 2.5). We also assume that the observers synchronize their clocks so that the origins of time are set when the origins of the frames coincide.

Fig. 2.5 Two frames in standard configuration at time t.

It follows from Fig. 2.5 that the Galilean transformation connecting the two frames is given by

$$x' = x - vt, \qquad y' = y, \qquad z' = z, \qquad t' = t. \qquad (2.1)$$

The last equation provides a manifestation of the assumption of absolute time in Newtonian theory. Now, Newton's laws hold only in inertial frames.
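As a small numerical illustration of the Galilean transformation between frames in standard configuration, the following Python sketch maps events from S to S' and checks that a particle's acceleration is the same in both frames. The names `Event`, `galilean`, and `second_difference`, and the sample trajectory, are assumptions introduced here for illustration; they are not taken from the text.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """An event as coordinatized by an inertial observer: time and position."""
    t: float
    x: float
    y: float
    z: float


def galilean(event: Event, v: float) -> Event:
    """Coordinates of `event` in S', which moves along S's positive x-axis
    with constant speed v (standard configuration):
    x' = x - v t, y' = y, z' = z, t' = t."""
    return Event(event.t, event.x - v * event.t, event.y, event.z)


def second_difference(events: List[Event]) -> float:
    """Discrete x-acceleration from three events spaced one time unit apart."""
    e0, e1, e2 = events
    return e2.x - 2.0 * e1.x + e0.x


# Hypothetical uniformly accelerated particle in S: x = (1/2) a t^2 with a = 3.
path_s = [Event(t, 0.5 * 3.0 * t * t, 0.0, 0.0) for t in (0.0, 1.0, 2.0)]
# The same three events as seen from S', moving at v = 2 relative to S.
path_s_prime = [galilean(e, v=2.0) for e in path_s]

print(second_difference(path_s), second_difference(path_s_prime))
```

The two printed accelerations agree, because the term linear in t that the transformation subtracts from x drops out of any second difference; the time coordinate is untouched, which is the absolute-time assumption in code form.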
From a mathematical viewpoint, this means that Newton's laws must be invariant under a Galilean transformation.

2.5 The principle of special relativity

We begin by stating the relativity principle which underpins Newtonian theory.

Restricted principle of special relativity: All inertial observers are equivalent as far as dynamical experiments are concerned.

This means that, if one inertial observer carries out some dynamical experiments and discovers a physical law, then any other inertial observer performing the same experiments must discover the same law. Put another way, these laws must be invariant under a Galilean transformation. That is to say, if the law involves the coordinates x, y, z, t of an inertial observer S, then the law relative to another observer S' will be the same with x, y, z, t replaced by x', y', z', t', respectively. Many fundamental principles of physics are statements of impossibility, and the above statement of the relativity principle is equivalent to the statement of the impossibility of deciding, by performing dynamical experiments, whether a body is absolutely at rest or in uniform motion. In Newtonian theory, we cannot determine the absolute position in space of an event, but only its position relative to some other event. In exactly the same way, uniform velocity has only a relative significance; we can only talk about the velocity of a body relative to some other. Thus, both position and velocity are relative concepts.

Einstein realized that the principle as stated above is empty because there is no such thing as a purely dynamical experiment. Even on a very elementary level, any dynamical experiment we think of performing involves observation, i.e. looking, and looking is a part of optics, not dynamics. In fact, the more one analyses any one experiment, the more it becomes apparent that practically all the branches of physics are involved in the experiment. Thus, Einstein took the logical step of removing the restriction of dynamics in the principle and took the following as his first postulate.

Postulate I. Principle of special relativity: All inertial observers are equivalent.
Hence we see that this principle is in no way a contradiction of Newtonian thought, but rather constitutes its logical completion.

2.6 The constancy of the velocity of light

We previously defined an observer in Newtonian theory as someone equipped with a clock and ruler with which to map the events of the universe. However, the approach of the k-calculus is to dispense with the rigid ruler and use radar methods for measuring distances. (What is rigidity anyway? If a moving frame appears non-rigid in another frame, which, if either, is the rigid one?) Thus, an observer measures the distance of an object by sending out a light signal which is reflected off the object and received back by the observer. The distance is then simply defined as half the time difference between emission and reception. Note that by this method distances are measured in intervals of time, like the light year or the light second ($\sim 10^{10}$ cm).

Why use light? The reason is that we know that the velocity of light is independent of many things. Observations from double stars tell us that the velocity of light in vacuo is independent of the motion of the sources as well as independent of colour, intensity, etc. For, if we suppose that the velocity of light were dependent on the motion of the source relative to an observer (so that if the source was coming towards us the light would be travelling faster and vice versa) then we would no longer see double stars moving in Keplerian orbits (circles, ellipses) about each other: their orbits would appear distorted; yet no such distortion is observed. There are many experiments which confirm this assumption. However, these were not known to Einstein in 1905, who adopted the second postulate purely on heuristic grounds. We state the second postulate in the following form.

Postulate II. Constancy of velocity of light: The velocity of light is the same in all inertial systems.

Or stated another way: there is no overtaking of light by light in empty space.
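The radar definition of distance described above (half the time between emission and reception) can be sketched in a few lines of Python. The function names and the 2.56 s round-trip figure are illustrative assumptions, the latter being roughly what an Earth-Moon echo takes; only the defining rule itself comes from the text.

```python
C_METRES_PER_SECOND = 299_792_458.0  # exact SI value of the speed of light


def radar_distance(t_emit: float, t_receive: float) -> float:
    """Distance in light-seconds: half the round-trip time of the signal."""
    if t_receive < t_emit:
        raise ValueError("reception cannot precede emission")
    return 0.5 * (t_receive - t_emit)


def light_seconds_to_metres(distance: float) -> float:
    """Convert a distance expressed in light-seconds to metres."""
    return distance * C_METRES_PER_SECOND


# Illustrative (assumed) round trip of 2.56 s on the observer's clock.
d = radar_distance(0.0, 2.56)
print(d, "light-seconds =", light_seconds_to_metres(d), "metres")
```

Only the observer's own clock is consulted, so no rigid ruler is needed; converting to metres is a final bookkeeping step that depends on the chosen units.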
The speed of light is conventionally denoted by c and has the exact numerical value $2.997\,924\,58 \times 10^{8}\ \mathrm{m\,s^{-1}}$, but in this chapter we shall adopt relativistic units in which c is taken to be unity (i.e. c = 1). Note, in passing, that another reason for using radar methods is that other methods are totally impracticable for large distances. In fact, these days, distances from the Earth to the Moon and Venus can be measured very accurately by bouncing radar signals off them.

2.7 The k-factor

For simplicity, we shall begin by working in two dimensions, one spatial dimension and one time dimension. Thus, we consider a system of observers distributed along a straight line, each equipped with a clock and a flashlight. We plot the events they map in a two-dimensional space-time diagram. Let us assume we have two observers, A at rest and B moving away from A with uniform (constant) speed. Then, in a space-time diagram, the world-line of A will be represented by a vertical straight line and the world-line of B by a straight line at an angle to A's, as shown in Fig. 2.6. A light signal in the diagram will be denoted by a straight line making an angle $\tfrac{1}{4}\pi$ with the axes, because we are taking the speed of light to be 1.

Now, suppose A sends out a series of flashes of light to B, where the interval between the flashes is denoted by T according to A's clock. Then it is plausible to assume that the intervals of reception by B's clock are proportional to T, say kT. Moreover, the quantity k, which we call the k-factor, is clearly a characteristic of the motion of B relative to A.

Fig. 2.6 The world-lines of observers A and B.

Fig. 2.7 The reciprocal nature of the k-factor.

We now assume that if A and B are inertial observers, then k is a constant in time. (In fact, there is a hidden assumption here, since how do we know that B's world-line will be a straight line as indicated in the diagram?
Strictly speaking, we are assuming that there is a linear relationship between the space and time coordinates of A and B.) Then the principle of special relativity requires that the relationship between A and B must be reciprocal, so that, if B emits two signals with a time lapse of T according to B's clock, then A receives them after a time lapse of kT according to A's clock (Fig. 2.7). Note that, from B's point of view, A is moving away from B with the same relative speed.

Observer A assigns coordinates to an event P by bouncing a light signal off it. So that if a light signal is sent out at a time $t = t_1$, and received back at a time $t = t_2$ (Fig. 2.8), then, according to our radar definition of distances, the coordinates of P are given by

$$(t, x) = \left(\tfrac{1}{2}(t_1 + t_2),\ \tfrac{1}{2}(t_2 - t_1)\right), \qquad (2.2)$$

remembering that the velocity of light is 1. We now use the k-factor to develop the k-calculus.

Fig. 2.8 Coordinatizing events.

2.8 Relative speed of two inertial observers

Consider the configuration shown in Fig. 2.9 and assume that A and B synchronize their clocks to zero when they cross at event O. After a time T, A sends a signal to B, which is reflected back at event P. From B's point of view, a light signal is sent to A after a time lapse of kT by B's clock. It follows from the definition of the k-factor that A receives this signal after a time lapse of k(kT). Then, using (2.2) with $t_1 = T$ and $t_2 = k^2 T$, we find the coordinates of P according to A's clock are given by

$$(t, x) = \left(\tfrac{1}{2}(k^2 + 1)T,\ \tfrac{1}{2}(k^2 - 1)T\right). \qquad (2.3)$$

Thus, as T varies, this gives the coordinates of the events which constitute B's world-line. Hence, if v is the velocity of B relative to A, we find

$$v = \frac{x}{t} = \frac{k^2 - 1}{k^2 + 1}.$$

Solving for k in terms of v, and noting from the diagram that k must be greater than 1 if the observers are separating, we find

$$k = \left(\frac{1 + v}{1 - v}\right)^{1/2}. \qquad (2.4)$$

Fig. 2.9 Relating the k-factor to the relative speed of separation.
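The chain of reasoning in this section (radar coordinates of the reflection event, then the speed v = x/t, then inversion to k = sqrt((1 + v)/(1 - v))) can be checked numerically. The Python sketch below works in units with c = 1; the function names and the sample values k = 2, T = 1 are assumptions chosen for illustration.

```python
import math
from typing import Tuple


def radar_coordinates(t1: float, t2: float) -> Tuple[float, float]:
    """(t, x) of a reflection event, from emission time t1 and
    reception time t2 on the observer's own clock (c = 1)."""
    return 0.5 * (t1 + t2), 0.5 * (t2 - t1)


def velocity_from_k(k: float) -> float:
    """Relative speed v = (k^2 - 1) / (k^2 + 1) for a given k-factor."""
    return (k * k - 1.0) / (k * k + 1.0)


def k_from_velocity(v: float) -> float:
    """k-factor k = sqrt((1 + v) / (1 - v)); v > 0 means separation."""
    if not -1.0 < v < 1.0:
        raise ValueError("relative speed must satisfy |v| < 1 when c = 1")
    return math.sqrt((1.0 + v) / (1.0 - v))


# A emits at time T; the echo returns at k^2 T, as argued in the text.
k, T = 2.0, 1.0
t, x = radar_coordinates(T, k * k * T)
print("event P:", (t, x))
print("v from coordinates:", x / t, " v from k:", velocity_from_k(k))
print("k recovered from v:", k_from_velocity(x / t))
```

Running it with other values of T leaves x/t unchanged, which reflects the fact that B's world-line is a straight line through the origin; replacing v by -v in `k_from_velocity` returns 1/k.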
We shall see in the next chapter that this is the usual relativistic formula for the radial Doppler shift. If B is moving away from A then k > 1, which represents a 'red' shift, whereas if B is approaching A then k < 1, which represents a 'blue' shift. Note that the transformation $v \to -v$ corresponds to $k \to 1/k$. Moreover,

$$v = 0 \iff k = 1,$$

as we should expect for observers relatively at rest: once they have synchronized their clocks, the synchronization remains (Fig. 2.10).

Fig. 2.10 Observers relatively at rest (k = 1).

2.9 Composition law for velocities

Consider the situation in Fig. 2.11, where $k_{AB}$ denotes the k-factor between A and B, with $k_{BC}$ and $k_{AC}$ defined similarly. It follows immediately that

$$k_{AC} = k_{AB}\,k_{BC}. \qquad (2.5)$$

Using (2.4), we find the corresponding composition law for velocities:

$$v_{AC} = \frac{v_{AB} + v_{BC}}{1 + v_{AB}\,v_{BC}}. \qquad (2.6)$$

Fig. 2.11 Composition of k-factors.

This formula has been confirmed by Fizeau's experiment in which the speed of light in a moving fluid is measured and turns out not to be simply the sum of the speed of light and the moving fluid but rather obeys the more complicated law (2.6) to higher order. Note that, if $v_{AB}$ and $v_{BC}$ are small compared with the speed of light, i.e.

$$v_{AB} \ll 1, \qquad v_{BC} \ll 1,$$

then we obtain the classical Newtonian formula

$$v_{AC} = v_{AB} + v_{BC}$$

to lowest order. Although the composition law for velocities is not simple, the one for k-factors is, and in special relativity it is the k-factors which are the directly measurable quantities. Note also that, formally, if we substitute $v_{BC} = 1$, representing the speed of a light signal relative to B, in (2.6), then the resulting speed of the light signal relative to A is

$$v_{AC} = \frac{v_{AB} + 1}{1 + v_{AB}} = 1,$$

in agreement with the constancy of the velocity of light postulate. From the composition law, we can show that, if we add two speeds less than the speed of light, then we again obtain a speed less than the speed of light.
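The statements of this section lend themselves to a quick numerical check: that k-factors multiply, that the resulting velocity agrees with v_AC = (v_AB + v_BC)/(1 + v_AB v_BC), that a light signal (speed 1) stays a light signal, and that small speeds add almost classically. The following Python sketch assumes units with c = 1; the function names and sample speeds are illustrative choices, not part of the text.

```python
import math


def k_from_velocity(v: float) -> float:
    """k-factor for relative speed v, with |v| < 1 and c = 1."""
    return math.sqrt((1.0 + v) / (1.0 - v))


def velocity_from_k(k: float) -> float:
    """Relative speed corresponding to a k-factor."""
    return (k * k - 1.0) / (k * k + 1.0)


def compose_velocities(v_ab: float, v_bc: float) -> float:
    """Speed of C relative to A, given B relative to A and C relative to B."""
    return (v_ab + v_bc) / (1.0 + v_ab * v_bc)


v_ab, v_bc = 0.5, 0.5
direct = compose_velocities(v_ab, v_bc)
# Route via k-factors: multiply them, then convert back to a speed.
via_k = velocity_from_k(k_from_velocity(v_ab) * k_from_velocity(v_bc))
print("direct:", direct, " via k-factors:", via_k)

# A light signal relative to B is still a light signal relative to A.
print("light:", compose_velocities(0.3, 1.0))

# Speeds much smaller than 1 add almost classically.
print("slow:", compose_velocities(1e-4, 2e-4), "vs classical sum", 3e-4)

# Sample check: composing sub-light speeds never reaches 1.
speeds = [0.1, 0.5, 0.9, 0.99]
print(all(compose_velocities(a, b) < 1.0 for a in speeds for b in speeds))
```

For every pair of sample speeds below 1, the composed speed also stays below 1.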
This does not mean, as is sometimes stated, that nothing can move faster than the speed of light in special relativity, but rather that the speed of light is a border which cannot be crossed or even reached. More precisely, special relativity allows for the existence of three classes of particles.

1. Particles that move slower than the speed of light are called subluminal particles. They include material particles and elementary particles such as electrons and neutrons.

2. Particles that move with the speed of light are called luminal particles. They include the carrier of the electromagnetic field interaction, the photon, and theoretically the carrier of the gravitational field interaction, called the graviton. These are both particles with zero rest mass (see §4.5). It was thought that neutrinos also had zero rest mass, but more recent evidence suggests they may have a tiny mass.

3. Particles that move faster than the speed of light are called superluminal particles or tachyons. There was some excitement in the 1970s surrounding the possible existence of tachyons, but all attempts to detect them to date have failed. This suggests two likely possibilities: either tachyons do not exist or, if they do, they do not interact with ordinary matter. This would seem to be just as well, for otherwise they could be used to signal back into the past and so would appear to violate causality. For example, it would be possible theoretically to construct a device which sent out a tachyon at a given time and which would trigger a mechanism in the device to blow it up before the tachyon was sent out!

2.10 The relativity of simultaneity

Consider two events P and Q which take place at the same time, according to A, and also at points equal but opposite distances away. A could establish this by sending out and receiving the light rays as shown in Fig. 2.12 (continuous lines).
Suppose now that another inertial observer B meets A at the time these events occur according to A. B also sends out light rays RQU and SPV to illuminate the events, as shown (dashed lines). By symmetry RU = SV and so these events are equidistant according to B. However, the signal RQ was sent before the signal SP and so B concludes that the event Q took place well before P. Hence, events that A judges to be simultaneous, B judges not to be simultaneous. Similarly, A maintains that P, O, and Q occurred simultaneously, whereas B maintains that they occurred in the order Q, then O, and then P.

This relativity of simultaneity lies at the very heart of special relativity and resolves many of the paradoxes that the classical theory gives rise to, such as the Michelson-Morley experiment. Einstein realized the crucial role that simultaneity plays in the theory and gave the following simple thought experiment to illustrate its dependence on the observer. Imagine a train travelling along a straight track with velocity $v$ relative to an observer A on the bank of the track. In the train, B is an observer situated at the centre of one of the carriages. We assume that there are two electrical devices on the track which are the length of the carriage apart and equidistant from A. When the carriage containing B goes over these devices, they fire and activate two light sources situated at each end of the carriage (Fig. 2.13). From the configuration, it is clear that A will judge that the two events, when the light sources first switch on, occur simultaneously. However, B is travelling towards the light emanating from light source 2 and away from the light emanating from light source 1. Since the speed of light is a constant, B will see the light from source 2 before seeing the light from source 1, and so will conclude that one light source comes on before the other.

Fig. 2.12 Relativity of simultaneity.

Fig.
2.13 Light signals emanating from the two sources.

Fig. 2.14 Event relationships in special relativity.

We can now classify event relationships in space and time in the following manner. Consider any event O on A's world-line and the four regions, as shown in Fig. 2.14, given by the light rays ending and commencing at O. Then the event E is on the light ray leaving O and so occurs after O. Any other inertial observer agrees on this; that is, no observer sees E illuminated before A sends out the signal from O. The fact that E is illuminated (because A originally sends out a signal at O) subsequent to O is a manifestation of causality: the event O ultimately causes the event E. Similarly, the event F can be reached by an inertial observer travelling from O with finite speed. Again, all inertial observers agree that F occurs after O. Hence all the events in this region are called the absolute future of O. In the same way, any event occurring in the region vertically below takes place in O's absolute past. However, the temporal relationship to O of events in the other two regions, called elsewhere (or sometimes the relative past and relative future), will not be something all observers will agree upon. For example, one class of observers will say that G took place after O, another class before, and a third class will say they took place simultaneously. The light rays entering and leaving O constitute what is called the light cone or null cone at O (the fact that it is a cone will become clearer later when we take all the spatial dimensions into account). Note that the world-line of any inertial observer or material particle passing through O must lie within the light cone at O.

Fig. 2.15 The clock paradox.

Fig. 2.16 Spatial analogue of clock paradox.

2.11 The clock paradox

Consider three inertial observers as shown in Fig.
2.15, with the relative velocity $v_{AC} = -v_{AB}$. Assume that A and B synchronize their clocks at O and that C's clock is synchronized with B's at P. Let B and C meet after a time $T$ according to B, whereupon they emit a light signal to A. According to the k-calculus, A receives the signal at R after a time $kT$ since meeting B. Remembering that C is moving with the opposite velocity to B (so that $k \to k^{-1}$), then A will meet C at Q after a subsequent time lapse of $k^{-1}T$. The total time that A records between events O and Q is therefore $(k + k^{-1})T$. For $k \neq 1$, this is greater than the combined time intervals $2T$ recorded between events OP and PQ by B and C. But should not the time lapse between the two events agree? This is one form of the so-called clock paradox. However, it is not really a paradox, but rather what it shows is that in relativity time, like distance, is a route-dependent quantity. The point is that the $2T$ measurement is made by two inertial observers, not one. Some people have tried to reverse the argument by setting B and C to rest, but this is not possible since they are in relative motion to each other. Another argument says that, when B and C meet, C should take B's clock and use it. But, in this case, the clock would have to be accelerated when being transferred to C and so it is no longer inertial. Again, some opponents of special relativity (e.g. H. Dingle) have argued that the short period of acceleration should not make such a difference, but this is analogous to saying that a journey between two points which is straight nearly all the time is about the same length as one which is wholly straight (as shown), which is absurd (Fig. 2.16). The moral is that in special relativity time is a more difficult concept to work with than the absolute time of Newton.

A more subtle point revolves around the implicit assumption that the clocks of A and B are 'good' clocks, i.e.
that the seconds of A's clock are the same as those of B's clock. One suggestion is that A has two clocks and adjusts the tick rate until they are the same and then sends one of them to B at a very slow rate of acceleration. The assumption here is that the very slow rate of acceleration will not affect the tick rate of the clock. However, what is there to say that a clock may not be able to somehow add up the small bits of acceleration and so affect its performance? A more satisfactory approach would be for A and B to use identically constructed atomic clocks (which is after all what physicists use today to measure time). The objection then arises that their construction is based on ideas in quantum physics which is, a priori, outside the scope of special relativity. However, this is a manifestation of a point raised earlier, that virtually any real experiment which one can imagine carrying out involves more than one branch of physics. The whole structure is intertwined in a way which cannot easily be separated.

2.12 The Lorentz transformations

We have derived a number of important results in special relativity, which only involve one spatial dimension, by use of the k-calculus. Other results follow essentially from the transformations connecting inertial observers, the famous Lorentz transformations. We shall finally use the k-calculus to derive these transformations.

Let event P have coordinates $(t, x)$ relative to A and $(t', x')$ relative to B (Fig. 2.17). Observer A must send out a light ray at time $t - x$ to illuminate P at time $t$ and also receive the reflected ray back at $t + x$ (check this from (2.2)). The world-line of A is given by $x = 0$, and the origin of A's time coordinate $t$ is arbitrary. Similar remarks apply to B, where we use primed quantities for B's coordinates $(t', x')$. Assuming A and B synchronize their clocks when they meet, then the k-calculus immediately gives

$$t' - x' = k(t - x), \qquad t + x = k(t' + x'). \tag{2.7}$$

After some rearrangement, and using equation (2.4), we obtain the so-called special Lorentz transformation

$$t' = \frac{t - vx}{(1 - v^2)^{1/2}}, \qquad x' = \frac{x - vt}{(1 - v^2)^{1/2}}. \tag{2.8}$$

This is also referred to as a boost in the x-direction with speed $v$, since it takes one from A's coordinates to B's coordinates and B is moving away from A with speed $v$. Some simple algebra reveals the result (exercise)

$$t'^2 - x'^2 = t^2 - x^2,$$

showing that the quantity $t^2 - x^2$ is an invariant under a special Lorentz transformation or boost.

Fig. 2.17 Coordinatization of events by inertial observers.

To obtain the corresponding formulae in the case of three spatial dimensions we consider Fig. 2.5 with two inertial frames in standard configuration. Now, since by assumption the xz-plane ($y = 0$) of A must coincide with the x'z'-plane ($y' = 0$) of B, then the $y$ and $y'$ coordinates must be connected by a transformation of the form

$$y = ny', \tag{2.9}$$

because $y = 0 \iff y' = 0$. We now make the assumption that space is isotropic, that is, it is the same in any direction. We then reverse the direction of the x- and y-axes of A and B and consider the motion from B's point of view (see Figs. 2.18 and 2.19). Clearly, from B's point of view, the roles of A and B have interchanged. Hence, by symmetry, we must have

$$y' = ny. \tag{2.10}$$

Fig. 2.18 The x- and y-axes reversed in Fig. 2.5.

Fig. 2.19 Figure 2.18 from B's point of view.

Combining (2.9) and (2.10), we find

$$n^2 = 1 \implies n = \pm 1.$$

The negative sign can be dismissed since, as $v \to 0$, we must have $y' \to y$, in which case $n = 1$. Hence, we find $y' = y$, and a similar argument for $z$ produces $z' = z$.

2.13 The four-dimensional world view

We now compare the special Lorentz transformation of the last section in relativistic units with the Galilean transformation connecting inertial observers in standard configuration (see Table 2.1). In a Galilean transformation, the absolute time coordinate remains invariant. However, in a
However, in a Table 2.1 Galilean transformation t' = t x' = x- vt Y'=Y Z'=Z Lorentz transformation t- vx t'=--- (1 - v2); x- vt X'=--- (1 - v2 )t y' =y Z'= Z 2.13 The four-dimensional world view I 27 Lorentz transformation, the time and space coordinates get mixed up (note the symmetry in x and t). In the words of Minkowski, 'Henceforth space by itself, and time by itself are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.' In the old Newtonian picture, time is split off from three-dimensional Euclidean space. Moreover, since we have an absolute concept of simultaneity, we can consider two simultaneous events with coordinates (t, Xi, Yi, zi) and (t, x 2 , y2 , z2), and then the square of the Euclidean distance between them, (2.11) is invariant under a Galilean transformation. In the new special relativity picture, time and space merge together into a four-dimensional continuum called space-time. In this picture, the square of the interval between any two events (ti, Xi, Yi, zi) and (t2 , x 2 , y2 , z2 ) is defined by s2 = (ti - t2)2 - (xi - x2)2 - (Yi - Y2)2 - (zi - z2)2, (2.12) and it is this quantity which is invariant under a Lorentz transformation. Note that we always denote the square of the interval by s2, but the quantity s is only defined if the right-hand side of (2.12) is non-negative. If we consider two events separated infinitesimally, (t, x, y, z) and (t + dt, x + dx, y + dy, z + dz), then this equation becomes where all the infinitesimals are squared in (2.13). A four-dimensional spacetime continuum in which the above form is invariant is called Minkowski space-time and it provides the background geometry for special relativity. So far, we have only met a special Lorentz transformation which connects two inertial frames in standard configuration. A full Lorentz transformation connects two frames in general position (Fig. 2.20). 
It can be shown that a full Lorentz transformation can be decomposed into an ordinary spatial rotation, followed by a boost, followed by a further ordinary rotation. Physically, the first rotation lines up the x-axis of S with the velocity $v$ of S'. Then a boost in this direction with speed $v$ transforms S to a frame which is at rest relative to S'. A final rotation lines up the coordinate frame with that of S'. The spatial rotations introduce no new physics. The only new physical information arises from the boost and that is why we can, without loss of generality, restrict our attention to a special Lorentz transformation.

Fig. 2.20 Two frames in general position.

Exercises

2.1 (§2.4) Write down the Galilean transformation from observer S to observer S', where S' has velocity $v_1$ relative to S. Find the transformation from S' to S and state in simple terms how the transformations are related. Write down the Galilean transformation from S' to S'', where S'' has velocity $v_2$ relative to S'. Find the transformation from S to S''. Prove that the Galilean transformations form an Abelian (commutative) group.

2.2 (§2.7) Draw the four fundamental k-factor diagrams (see Fig. 2.7) for the cases of two inertial observers A and B approaching and receding with uniform velocity $v$: (i) as seen by A; (ii) as seen by B.

2.3 (§2.8) Show that $v \to -v$ corresponds to $k \to k^{-1}$. If $k > 1$ corresponds physically to a red shift of recession, what does $k < 1$ correspond to?

2.4 (§2.9) Show that (2.6) follows from (2.5). Use the composition law for velocities to prove that if $0 < v_{AB} < 1$ and $0 < v_{BC} < 1$, then $0 < v_{AC} < 1$.

2.5 (§2.9) Establish the fact that if $v_{AB}$ and $v_{BC}$ are small compared with the velocity of light, then the composition law for velocities reduces to the standard additive law of Newtonian theory.

2.6 (§2.10) In the event diagram of Fig.
2.14, find a geometrical construction for the world-line of an inertial observer passing through O who considers event G as occurring simultaneously with O. Hence describe the world-lines of inertial observers passing through O who consider G as occurring before or after O.

2.7 (§2.11) Draw Fig. 2.15 from B's point of view. Coordinatize the events O, R, and Q with respect to B and find the times between O and R, and R and Q, and compare them with A's timings.

2.8 (§2.12) Deduce (2.8) from (2.7). Use (2.7) to deduce directly that

$$t'^2 - x'^2 = t^2 - x^2.$$

Confirm the equality under the transformation formula (2.8).

2.9 (§2.12) In S, two events occur at the origin and a distance $X$ along the x-axis simultaneously at $t = 0$. The time interval between the events in S' is $T$. Show that the spatial distance between the events in S' is $(X^2 + T^2)^{1/2}$ and determine the relative velocity $v$ of the frames in terms of $X$ and $T$.

2.10 (§2.13) Show that the interval between two events $(t_1, x_1, y_1, z_1)$ and $(t_2, x_2, y_2, z_2)$ defined by

$$s^2 = (t_1 - t_2)^2 - (x_1 - x_2)^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2$$

is invariant under a special Lorentz transformation. Deduce the Minkowski line element (2.13) for infinitesimally separated events. What does $s^2$ become if $t_1 = t_2$, and how is it related to the Euclidean distance $\sigma$ between the two events?

3.1 Standard derivation of the Lorentz transformations

We start this chapter by deriving again the Lorentz transformations, but this time by using a more standard approach. We shall work in non-relativistic units in which the speed of light is denoted by $c$. We restrict attention to two inertial observers S and S' in standard configuration. As before, we shall show that the Lorentz transformations follow from the two postulates, namely, the principle of special relativity and the constancy of the velocity of light.

Now, by the first postulate, if the observer S sees a free particle, that is, a particle with no forces acting on it, travelling in a straight line with constant velocity, then so will S'.
Thus, using vector notation, it follows that under a transformation connecting the two frames

$$\mathbf{r} = \mathbf{r}_0 + \mathbf{u}t \iff \mathbf{r}' = \mathbf{r}'_0 + \mathbf{u}'t'.$$

Since straight lines get mapped into straight lines, it suggests that the transformation between the frames is linear and so we shall assume that the transformation from S to S' can be written in matrix form

$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = L \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}, \tag{3.1}$$

where $L$ is a $4 \times 4$ matrix of quantities which can only depend on the speed of separation $v$. Using exactly the same argument as we used at the end of §2.12, the assumption that space is isotropic leads to the transformations of $y$ and $z$ being

$$y' = y \quad \text{and} \quad z' = z. \tag{3.2}$$

We next use the second postulate. Let us assume that, when the origins of S and S' are coincident, they zero their clocks, i.e. $t = t' = 0$, and emit a flash of light. Then, according to S, the light flash moves out radially from the origin with speed $c$. The wave front of light will constitute a sphere. If we define the quantity $I$ by

$$I(t, x, y, z) = x^2 + y^2 + z^2 - c^2t^2,$$

then the events comprising this sphere must satisfy $I = 0$. By the second postulate, S' must also see the light move out in a spherical wave front with speed $c$ and satisfy

$$I' = x'^2 + y'^2 + z'^2 - c^2t'^2 = 0.$$

Thus it follows that, under a transformation connecting S and S',

$$I = 0 \iff I' = 0, \tag{3.3}$$

and since the transformation is linear by (3.1), we may conclude

$$I = nI', \tag{3.4}$$

where $n$ is a quantity which can only depend on $v$. Using the same argument as we did in §2.12, we can reverse the role of S and S' and so by the relativity principle we must also have

$$I' = nI. \tag{3.5}$$

Combining the last two equations we find

$$n^2 = 1 \implies n = \pm 1.$$

In the limit as $v \to 0$, the two frames coincide and $I' \to I$, from which we conclude that we must take $n = 1$.

Fig. 3.1 A rotation in $(x, T)$-space.
Substituting $n = 1$ in (3.4), this becomes

$$x^2 + y^2 + z^2 - c^2t^2 = x'^2 + y'^2 + z'^2 - c^2t'^2,$$

and, using (3.2), this reduces to

$$x^2 - c^2t^2 = x'^2 - c^2t'^2. \tag{3.6}$$

We next introduce imaginary time coordinates $T$ and $T'$ defined by

$$T = ict, \tag{3.7}$$

$$T' = ict', \tag{3.8}$$

in which case equation (3.6) becomes

$$x^2 + T^2 = x'^2 + T'^2.$$

In a two-dimensional $(x, T)$-space, the quantity $x^2 + T^2$ represents the square of the distance of a point P from the origin. This will only remain invariant under a rotation in $(x, T)$-space (Fig. 3.1). If we denote the angle of rotation by $\theta$, then a rotation is given by

$$x' = x\cos\theta + T\sin\theta, \tag{3.9}$$

$$T' = -x\sin\theta + T\cos\theta. \tag{3.10}$$

Now, the origin of S' ($x' = 0$), as seen by S, moves along the positive x-axis of S with speed $v$ and so must satisfy $x = vt$. Thus, we require

$$x' = 0 \iff x = vt \iff x = vT/ic,$$

using (3.7). Substituting this into (3.9) gives

$$\tan\theta = iv/c, \tag{3.11}$$

from which we see that the angle $\theta$ is imaginary as well. We can obtain an expression for $\cos\theta$, using

$$\cos\theta = \frac{1}{\sec\theta} = \frac{1}{(1 + \tan^2\theta)^{1/2}} = \frac{1}{(1 - v^2/c^2)^{1/2}}.$$

If we use the conventional symbol $\beta$ for this last expression, i.e.

$$\beta \equiv \frac{1}{(1 - v^2/c^2)^{1/2}},$$

where the symbol $\equiv$ here means 'is defined to be', then (3.9) gives

$$x' = \cos\theta\,(x + T\tan\theta) = \beta[x + ict(iv/c)] = \beta(x - vt).$$

Similarly, (3.10) gives

$$T' = ict' = \cos\theta\,(-x\tan\theta + T) = \beta[-x(iv/c) + ict],$$

from which we find $t' = \beta(t - vx/c^2)$. Thus, collecting the results together, we have rederived the special Lorentz transformation or boost (in non-relativistic units):

$$t' = \beta(t - vx/c^2), \quad x' = \beta(x - vt), \quad y' = y, \quad z' = z. \tag{3.12}$$

If we put $c = 1$, this takes the same form as we found in §2.13.

3.2 Mathematical properties of Lorentz transformations

From the results of the last section, we find the following properties of a special Lorentz transformation or boost.

1. Using the imaginary time coordinate $T$, a boost along the x-axis of speed $v$ is equivalent to an imaginary rotation in $(x, T)$-space through an angle $\theta$ given by $\tan\theta = iv/c$.

2.
If we consider $v$ to be very small compared with $c$, for which we use the notation $v \ll c$, and neglect terms of order $v^2/c^2$, then we regain a Galilean transformation

$$t' = t, \quad x' = x - vt, \quad y' = y, \quad z' = z.$$

We can obtain this result formally by taking the limit $c \to \infty$ in (3.12).

3. If we solve (3.12) for the unprimed coordinates, we get

$$t = \beta(t' + vx'/c^2), \quad x = \beta(x' + vt'), \quad y = y', \quad z = z'.$$

This can be obtained formally from (3.12) by interchanging primed and unprimed coordinates and replacing $v$ by $-v$. This we should expect from physical reasons, since, if S' moves along the positive x-axis of S with speed $v$, then S moves along the negative x'-axis of S' with speed $v$, or, equivalently, S moves along the positive x'-axis of S' with speed $-v$.

4. Special Lorentz transformations form a group:

(a) The identity element is given by $v = 0$.

(b) The inverse element is given by $-v$ (as in 3 above).

(c) The product of two boosts with velocities $v$ and $v'$ is another boost with velocity $v''$. Since $v$ and $v'$ correspond to rotations in $(x, T)$-space of $\theta$ and $\theta'$, where $\tan\theta = iv/c$ and $\tan\theta' = iv'/c$, then their resultant is a rotation of $\theta'' = \theta + \theta'$, where

$$iv''/c = \tan\theta'' = \tan(\theta + \theta') = \frac{\tan\theta + \tan\theta'}{1 - \tan\theta\tan\theta'},$$

from which we find

$$v'' = \frac{v + v'}{1 + vv'/c^2}.$$

Compare this with equation (2.6) in relativistic units.

(d) Associativity is left as an exercise.

5. The square of the infinitesimal interval between infinitesimally separated events (see (2.13)),

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2,$$

is invariant under a Lorentz transformation.

We now turn to the key physical attributes of Lorentz transformations. Throughout the remaining sections, we shall assume that S and S' are in standard configuration with non-zero relative velocity $v$.

3.3 Length contraction

Consider a rod fixed in S' with endpoints $x'_A$ and $x'_B$, as shown in Fig. 3.2.
In S, the ends have coordinates $x_A$ and $x_B$ (which, of course, vary in time) given by the Lorentz transformations

$$x'_A = \beta(x_A - vt_A), \qquad x'_B = \beta(x_B - vt_B). \tag{3.14}$$

In order to measure the length of the rod according to S, we have to find the x-coordinates of the end points at the same time according to S. If we denote the rest length, namely, the length in S', by

$$l_0 = x'_B - x'_A,$$

and the length in S at time $t = t_A = t_B$ by

$$l = x_B - x_A,$$

then, subtracting the formulae in (3.14), we find the result

$$l = \beta^{-1}l_0.$$

Fig. 3.2 A rod moving with velocity $v$ relative to S.

Since

$$|v| < c \iff \beta > 1 \iff l < l_0,$$

the result shows that the length of a body in the direction of its motion with uniform velocity $v$ is reduced by a factor $(1 - v^2/c^2)^{1/2}$. This phenomenon is called length contraction. Clearly, the body will have greatest length in its rest frame, in which case it is called the rest length or proper length. Note also that the length approaches zero as the velocity approaches the velocity of light.

In an attempt to explain the null result of the Michelson-Morley experiment, Fitzgerald had suggested the apparent shortening of a body in motion relative to the ether. This is rather different from the length contraction of special relativity, which is not to be regarded as illusory but is a very real effect. It is closely connected with the relativity of simultaneity and indeed can be deduced as a direct consequence of it. Unlike the Fitzgerald contraction, the effect is relative, i.e. a rod fixed in S appears contracted in S'. Note also that there are no contraction effects in directions transverse to the direction of motion.

3.4 Time dilation

Let a clock fixed at $x' = x'_A$ in S' record two successive events separated by an interval of time $T_0$ (Fig. 3.3). The successive events in S' are $(x'_A, t'_1)$ and $(x'_A, t'_1 + T_0)$, say. Using the Lorentz transformation, we have in S

$$t_1 = \beta(t'_1 + vx'_A/c^2), \qquad t_2 = \beta(t'_1 + T_0 + vx'_A/c^2).$$

On subtracting, we find the time interval in S defined by $T = t_2 - t_1$ is given by

$$T = \beta T_0.$$

Fig.
3.3 Successive events recorded by a clock fixed in S'.

Thus, moving clocks go slow: the interval between successive events is lengthened by the factor $(1 - v^2/c^2)^{-1/2}$. This phenomenon is called time dilation. The fastest rate of a clock is in its rest frame and is called its proper rate. Again, the effect has a reciprocal nature.

Let us now consider an accelerated clock. We define an ideal clock to be one unaffected by its acceleration; in other words, its instantaneous rate depends only on its instantaneous speed $v$, in accordance with the above phenomenon of time dilation. This is often referred to as the clock hypothesis. The time recorded by an ideal clock is called the proper time $\tau$ (Fig. 3.4). Thus, the proper time of an ideal clock between $t_0$ and $t_1$ is given by

$$\tau = \int_{t_0}^{t_1} \left(1 - \frac{v^2}{c^2}\right)^{1/2} dt.$$

Fig. 3.4 Proper time recorded by an accelerated clock.

The general question of what constitutes a clock or an ideal clock is a non-trivial one. However, an experiment has been performed where an atomic clock was flown round the world and then compared with an identical clock left back on the ground. The travelling clock was found on return to be running slow by precisely the amount predicted by time dilation. Another instance occurs in the study of cosmic rays. Certain mesons reaching us from the top of the Earth's atmosphere are so short-lived that, even had they been travelling at the speed of light, their travel time in the absence of time dilation would exceed their known proper lifetimes by factors of the order of 10. However, these particles are in fact detected at the Earth's surface because their very high velocities keep them young, as it were. Of course, whether or not time dilation affects the human clock, that is, biological ageing, is still an open question. But the fact that we are ultimately made up of atoms, which do appear to suffer time dilation, would suggest that there is no reason why we should be an exception.
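The proper-time integral for an ideal clock can be estimated numerically once the speed $v(t)$ is known in some inertial frame. The following Python sketch uses a simple midpoint rule; the function name `proper_time`, the step count, and the two sample speed profiles are assumptions made purely for illustration (the second profile, $v = at/(1 + a^2t^2)^{1/2}$ with $c = 1$ and $a = 1$, is chosen because its integral has the closed form $\sinh^{-1}(at)/a$).

```python
import math

def proper_time(speed, t0, t1, c=1.0, steps=10_000):
    """Midpoint-rule estimate of tau = integral of (1 - v(t)^2/c^2)^(1/2) dt."""
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * h            # midpoint of the i-th subinterval
        v = speed(t)                      # instantaneous speed of the clock
        total += math.sqrt(1.0 - (v / c) ** 2) * h
    return total

# Clock moving at constant speed 0.6c for 10 units of coordinate time:
# the dilation factor is 0.8, so the clock should record about 8 units.
print(proper_time(lambda t: 0.6, 0.0, 10.0))

# Accelerated clock with the assumed profile v(t) = t / sqrt(1 + t^2);
# the estimate can be compared with the closed form asinh(1).
print(proper_time(lambda t: t / math.sqrt(1.0 + t * t), 0.0, 1.0), math.asinh(1.0))
```

The midpoint rule is used only for simplicity; any standard quadrature would serve, and the clock hypothesis enters solely through the integrand depending on $v(t)$ and not on the acceleration.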
3.5 Transformation of velocities

Consider a particle in motion (Fig. 3.5) with its Cartesian components of velocity being

$$(u_1, u_2, u_3) = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right) \text{ in S}$$

and

$$(u'_1, u'_2, u'_3) = \left(\frac{dx'}{dt'}, \frac{dy'}{dt'}, \frac{dz'}{dt'}\right) \text{ in S'}.$$

Taking differentials of a Lorentz transformation

$$t' = \beta(t - vx/c^2), \quad x' = \beta(x - vt), \quad y' = y, \quad z' = z,$$

we get

$$dt' = \beta(dt - v\,dx/c^2), \quad dx' = \beta(dx - v\,dt), \quad dy' = dy, \quad dz' = dz,$$

and hence

$$u'_1 = \frac{dx'}{dt'} = \frac{\beta(dx - v\,dt)}{\beta(dt - v\,dx/c^2)} = \frac{\dfrac{dx}{dt} - v}{1 - \dfrac{v}{c^2}\dfrac{dx}{dt}} = \frac{u_1 - v}{1 - u_1v/c^2}, \tag{3.18}$$

$$u'_2 = \frac{dy'}{dt'} = \frac{dy}{\beta(dt - v\,dx/c^2)} = \frac{\dfrac{dy}{dt}}{\beta\left[1 - \dfrac{v}{c^2}\dfrac{dx}{dt}\right]} = \frac{u_2}{\beta(1 - u_1v/c^2)},$$

$$u'_3 = \frac{dz'}{dt'} = \frac{dz}{\beta(dt - v\,dx/c^2)} = \frac{\dfrac{dz}{dt}}{\beta\left[1 - \dfrac{v}{c^2}\dfrac{dx}{dt}\right]} = \frac{u_3}{\beta(1 - u_1v/c^2)}.$$

Fig. 3.5 Particle in motion relative to S and S'.

Notice that the velocity components $u_2$ and $u_3$ transverse to the direction of motion of the frame S' are affected by the transformation. This is due to the time difference in the two frames. To obtain the inverse transformations, simply interchange primes and unprimes and replace $v$ by $-v$.

3.6 Relationship between space-time diagrams of inertial observers

We now show how to relate the space-time diagrams of S and S' (see Fig. 3.6). We start by taking $ct$ and $x$ as the coordinate axes of S, so that a light ray has slope $\tfrac{1}{4}\pi$ (as in relativistic units). Then, to draw the $ct'$- and $x'$-axes of S', we note from the Lorentz transformation equations (3.12)

$$ct' = 0 \iff ct = (v/c)x,$$

that is, the $x'$-axis, $ct' = 0$, is the straight line $ct = (v/c)x$ with slope $v/c < 1$. Similarly,

$$x' = 0 \iff ct = (c/v)x,$$

that is, the $ct'$-axis, $x' = 0$, is the straight line $ct = (c/v)x$ with slope $c/v > 1$. The lines parallel to $O(ct')$ are the world-lines of fixed points in S'. The lines parallel to $Ox'$ are the lines connecting points at a fixed time according to S' and are called lines of simultaneity in S'. The coordinates of a general event P are $(ct, x) = (OR, OQ)$ relative to S and $(ct', x') = (OV, OU)$ relative to S'.
However, the diagram is somewhat misleading because the length scales along the axes are not the same. To relate them, we draw in the hyperbolae

$$x^2 - c^2t^2 = x'^2 - c^2t'^2 = \pm 1,$$

as shown in Fig. 3.7. Then, if we first consider the positive sign, setting $ct' = 0$, we get $x' = \pm 1$. It follows that OA is a unit distance on $Ox'$. Similarly, taking the negative sign and setting $x' = 0$ we get $ct' = \pm 1$ and so OB is the unit measure on $O(ct')$. Then the coordinates of P in the frame S' are given by

$$(ct', x') = \left(\frac{OV}{OB}, \frac{OU}{OA}\right).$$

Fig. 3.6 The world-lines in S of the fixed points and simultaneity lines of S'.

Fig. 3.7 Length scales in S and S'.

Note the following properties from Fig. 3.7.

1. A boost can be thought of as a rotation through an imaginary angle in the $(x, T)$-plane, where $T$ is imaginary time. We have seen that this is equivalent, in the real $(x, ct)$-plane, to a skewing of the coordinate axes inwards through the same angle. (This was not appreciated by some past opponents of special relativity, who gave some erroneous counter-arguments based on the mistaken idea that a boost could be represented by a real rotation in the $(x, ct)$-plane.)

2. The hyperbolae are the same for all frames and so we can draw in any number of frames in the same diagram and use the hyperbolae to calibrate them.

3. The length contraction and time dilation effects can be read off directly from the diagram. For example, the world-lines of the endpoints of a unit rod OA in S', namely $x' = 0$ and $x' = 1$, cut $Ox$ in less than unit distance. Similarly, world-lines $x = 0$ and $x = 1$ in S cut $Ox'$ inside OA, from which the reciprocal nature of length contraction is evident.

4. Event A has coordinates $(ct', x') = (0, 1)$ relative to S', and hence by a Lorentz transformation coordinates $(ct, x) = (\beta v/c, \beta)$ relative to S.
The quantity OA defined by

$$OA = (c^2t^2 + x^2)^{1/2} = \beta(1 + v^2/c^2)^{1/2}$$

is a measure of the calibration factor.

3.7 Acceleration in special relativity

We start with the inverse transformation of (3.18), namely,

$$u_1 = \frac{u'_1 + v}{1 + u'_1v/c^2},$$

from which we find the differential

$$du_1 = \frac{du'_1}{1 + u'_1v/c^2} - \frac{u'_1 + v}{(1 + u'_1v/c^2)^2}\,\frac{v}{c^2}\,du'_1 = \frac{1}{\beta^2}\,\frac{du'_1}{(1 + u'_1v/c^2)^2}.$$

Similarly, from the inverse Lorentz transformation

$$t = \beta(t' + x'v/c^2),$$

we find the differential

$$dt = \beta(dt' + dx'\,v/c^2) = \beta(1 + u'_1v/c^2)\,dt'.$$

Combining these results, we find that the x-component of the acceleration transforms according to

$$\frac{du_1}{dt} = \frac{1}{\beta^3(1 + u'_1v/c^2)^3}\,\frac{du'_1}{dt'}. \tag{3.21}$$

Similarly, we find

$$\frac{du_2}{dt} = \frac{1}{\beta^2(1 + u'_1v/c^2)^2}\,\frac{du'_2}{dt'} - \frac{vu'_2}{c^2\beta^2(1 + u'_1v/c^2)^3}\,\frac{du'_1}{dt'}, \tag{3.22}$$

$$\frac{du_3}{dt} = \frac{1}{\beta^2(1 + u'_1v/c^2)^2}\,\frac{du'_3}{dt'} - \frac{vu'_3}{c^2\beta^2(1 + u'_1v/c^2)^3}\,\frac{du'_1}{dt'}. \tag{3.23}$$

The inverse transformations can be found in the usual way.

It follows from the transformation formulae that acceleration is not an invariant in special relativity. However, it is clear from the formulae that acceleration is an absolute quantity, that is, all observers agree whether a body is accelerating or not. Put another way, if the acceleration is zero in one frame, then it is necessarily zero in any other frame. We shall see that this is no longer the case in general relativity. We summarize the situation in Table 3.1, which indicates why the subject matter of the book is 'relativity' theory.

Table 3.1

| Theory | Position | Velocity | Time | Acceleration |
|---|---|---|---|---|
| Newtonian | Relative | Relative | Absolute | Absolute |
| Special relativity | Relative | Relative | Relative | Absolute |
| General relativity | Relative | Relative | Relative | Relative |

3.8 Uniform acceleration

The Newtonian definition of a particle moving under uniform acceleration is

$$\frac{du}{dt} = \text{constant}.$$

This turns out to be inappropriate in special relativity since it would imply that $u \to \infty$ as $t \to \infty$, which we know is impossible. We therefore adopt a different definition.
Acceleration is said to be uniform in special relativity if it has the same value in any co-moving frame, that is, at each instant, the acceleration in an inertial frame travelling with the same velocity as the particle has the same value. This is analogous to the idea in Newtonian theory of motion under a constant force. For example, a spaceship whose motor is set at a constant emission rate would be uniformly accelerated in this sense. Taking the velocity of the particle to be $u = u(t)$ relative to an inertial frame S, then at any instant in a co-moving frame S', it follows that $v = u$, the velocity relative to S' is zero, i.e. $u' = 0$, and the acceleration is a constant, $a$ say, i.e. $du'/dt' = a$. Using (3.21), we find

$$\frac{du}{dt} = \frac{1}{\beta^3}\,a = \left(1 - \frac{u^2}{c^2}\right)^{3/2} a.$$

We can solve this differential equation by separating the variables

$$\frac{du}{(1 - u^2/c^2)^{3/2}} = a\,dt$$

and integrating both sides. Assuming that the particle starts from rest at $t = t_0$, we find

$$\frac{u}{(1 - u^2/c^2)^{1/2}} = a(t - t_0).$$

Solving for $u$, we get

$$u = \frac{dx}{dt} = \frac{a(t - t_0)}{[1 + a^2(t - t_0)^2/c^2]^{1/2}}.$$

Next, integrating with respect to $t$, and setting $x = x_0$ at $t = t_0$, produces

$$x - x_0 = \frac{c^2}{a}\left[1 + a^2(t - t_0)^2/c^2\right]^{1/2} - \frac{c^2}{a}.$$

This can be rewritten in the form

$$\frac{(x - x_0 + c^2/a)^2}{(c^2/a)^2} - \frac{(ct - ct_0)^2}{(c^2/a)^2} = 1, \tag{3.24}$$

which is a hyperbola in $(x, ct)$-space. If, in particular, we take $x_0 - c^2/a = t_0 = 0$, then we obtain a family of hyperbolae for different values of $a$ (Fig. 3.8). These world-lines are known as hyperbolic motions and, as we shall see in Chapter 23, they have significance in cosmology. It can be shown that the radar distance between the world-lines is a constant.

Fig. 3.8 Hyperbolic motions.

Moreover, consider the regions I and II bounded by the light rays passing through O, and a system of particles undergoing hyperbolic motions as shown in Fig. 3.8 (in some cosmological models, the particles would be galaxies).
Then, remembering that light rays emanating from any point in the diagram do so at 45°, no particle in region I can communicate with another particle in region II, and vice versa. The light rays are called event horizons and act as barriers beyond which no knowledge can ever be gained. We shall see that event horizons will play an important role later in this book.

3.9 The twin paradox

Fig. 3.9 The twin paradox (uniform acceleration away from the Earth, uniform velocity, uniform reversal of direction).

Fig. 3.10 Simultaneity lines of $\bar{A}$ on the outward and return journeys.

This is a form of the clock paradox which has caused the most controversy, a controversy which raged on and off for over 50 years. The paradox concerns two twins whom we shall call $A$ and $\bar{A}$. The twin $\bar{A}$ takes off in a spaceship for a return trip to some distant star. The assumption is that $\bar{A}$ is uniformly accelerated to some given velocity which is retained until the star is reached, whereupon the motion is uniformly reversed, as shown in Fig. 3.9. According to $A$, $\bar{A}$'s clock records slowly on the outward and return journeys and so, on return, $\bar{A}$ will be younger than $A$. If the periods of acceleration are negligible compared with the periods of uniform velocity, then could not $\bar{A}$ reverse the argument and conclude that it is $A$ who should appear to be the younger? This is the basis of the paradox.

The resolution rests on the fact that the accelerations, however brief, have immediate and finite effects on $\bar{A}$ but not on $A$, who remains inertial throughout. One striking way of seeing this effect is to draw in the simultaneity lines of $\bar{A}$ for the periods of uniform velocity, as in Fig. 3.10. Clearly, the period of uniform reversal has a marked effect on the simultaneity lines. Another way of looking at it is to see the effect that the periods of acceleration have on shortening the length of the journey as viewed by $\bar{A}$.
Let us be specific: we assume that the periods of acceleration are $T_1$, $T_2$, and $T_3$, and that, after the period $T_1$, $\bar{A}$ has attained a speed $v = \sqrt{3}c/2$. Then, from $\bar{A}$'s viewpoint, during the period $T_1$, $\bar{A}$ finds that more than half the outward journey has been accomplished, in that $\bar{A}$ has transferred to a frame in which the distance between the Earth and the star is more than halved by length contraction. Thus, $\bar{A}$ accomplishes the outward trip in about half the time which $A$ ascribes to it, and the same applies to the return trip. In fact, we could use the machinery of previous sections to calculate the time elapsed in both the periods of uniform acceleration and uniform velocity, and we would again reach the conclusion that on return $\bar{A}$ will be younger than $A$. As we have said before, this points out the fact that in special relativity time is a route-dependent quantity. The fact that in Fig. 3.9 $\bar{A}$'s world-line is longer than $A$'s, and yet takes less time to travel, is connected with the Minkowskian metric
$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2$$
and the negative signs which appear in it compared with the positive signs occurring in the usual three-dimensional Euclidean metric.

3.10 The Doppler effect

All kinds of waves appear lengthened when the source recedes from the observer: sounds are deepened, light is reddened. Exactly the opposite occurs when the source, instead, approaches the observer. We first of all calculate the classical Doppler effect. Consider a source of light emitting radiation whose wavelength in its rest frame is $\lambda_0$. Consider an observer $S$ relative to whose frame the source is in motion with radial velocity $u_r$. Then, if two successive pulses are emitted at times differing by $dt'$ as measured by $S'$, the distances these pulses have to travel will differ by an amount $u_r\,dt'$ (see Fig. 3.11). Since the pulses travel with speed $c$, it follows that they arrive at $S$ with a time difference
$$\Delta t = dt' + u_r\,dt'/c,$$
giving
$$\Delta t/dt' = 1 + u_r/c.$$
Now, using the fundamental relationship between wavelength and velocity, set $\lambda = c\,\Delta t$ and $\lambda_0 = c\,dt'$. We then obtain the classical Doppler formula
$$\frac{\lambda}{\lambda_0} = 1 + \frac{u_r}{c}. \tag{3.25}$$

Fig. 3.11 The Doppler effect: (a) first pulse; (b) second pulse.

Let us now consider the special relativistic formula. Because of time dilation (see Fig. 3.3), the time interval between successive pulses according to $S$ is $\beta\,dt'$ (Fig. 3.12). Hence, by the same argument, the pulses arrive at $S$ with a time difference
$$\Delta t = \beta\,dt' + u_r\beta\,dt'/c,$$
and so this time we find that the special relativistic Doppler formula is
$$\frac{\lambda}{\lambda_0} = \frac{1 + u_r/c}{(1 - v^2/c^2)^{1/2}}. \tag{3.26}$$

Fig. 3.12 The special relativistic Doppler shift.

If the velocity of the source is purely radial, then $u_r = v$ and (3.26) reduces to
$$\frac{\lambda}{\lambda_0} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2}. \tag{3.27}$$
This is the radial Doppler shift, and, if we set $c = 1$, we obtain (2.4), which is the formula for the k-factor. Combining Figs. 2.7 and 3.12, the radial Doppler shift is illustrated in Fig. 3.13, where $dt'$ is replaced by $T$.

Fig. 3.13 The radial Doppler shift k.

From equation (3.26), we see that there is also a change in wavelength, even when the radial velocity of the source is zero. For example, if the source is moving in a circle about the origin of $S$ with speed $v$ (as measured by an instantaneous comoving frame), then the transverse Doppler shift is given by
$$\frac{\lambda}{\lambda_0} = \frac{1}{(1 - v^2/c^2)^{1/2}}.$$
This is a purely relativistic effect due to the time dilation of the moving source. Experiments with revolving apparatus using the so-called 'Mössbauer effect' have directly confirmed the transverse Doppler shift in full agreement with the relativistic formula, thus providing another striking verification of the phenomenon of time dilation.

Exercises

3.1 (§3.1) $S$ and $S'$ are in standard configuration with $v = \alpha c$ ($0 < \alpha < 1$). If a rod at rest in $S'$ makes an angle of 45° with $Ox$ in $S$ and 30° with $O'x'$ in $S'$, then find $\alpha$.
3.2 (§3.1) Note from the previous question that perpendicular lines in one frame need not be perpendicular in another frame. This shows that there is no obvious meaning to the phrase 'two inertial frames are parallel', unless their relative velocity is along a common axis, because the axes of either frame need not appear rectangular in the other. Verify that the Lorentz transformation between frames in standard configuration with relative velocity $\mathbf{v} = (v, 0, 0)$ may be written in vector form
$$\mathbf{r}' = \mathbf{r} + \left(\frac{\mathbf{v}\cdot\mathbf{r}}{v^2}(\beta - 1) - \beta t\right)\mathbf{v}, \qquad t' = \beta\left(t - \frac{\mathbf{v}\cdot\mathbf{r}}{c^2}\right),$$
where $\mathbf{r} = (x, y, z)$. The formulae are said to comprise the 'Lorentz transformation without relative rotation'. Justify this name by showing that the formulae remain valid when the frames are not in standard configuration, but are parallel in the sense that the same rotation must be applied to each frame to bring the two into standard configuration (in which case $\mathbf{v}$ is the velocity of $S'$ relative to $S$, but $\mathbf{v} = (v, 0, 0)$ no longer applies).

3.3 (§3.1) Prove that the first two equations of the special Lorentz transformation can be written in the form
$$ct' = -x\sinh\phi + ct\cosh\phi, \qquad x' = x\cosh\phi - ct\sinh\phi,$$
where the rapidity $\phi$ is defined by $\phi = \tanh^{-1}(v/c)$. Establish also the following version of these equations:
$$ct' + x' = e^{-\phi}(ct + x), \qquad ct' - x' = e^{\phi}(ct - x), \qquad e^{2\phi} = (1 + v/c)/(1 - v/c).$$
What relation does $\phi$ have to $\theta$ in equation (3.11)?

3.4 (§3.1) Aberration refers to the fact that the direction of travel of a light ray depends on the motion of the observer. Hence, if a telescope observes a star at an inclination $\theta'$ to the horizontal, then show that classically the 'true' inclination $\theta$ of the star is related to $\theta'$ by
$$\tan\theta' = \frac{\sin\theta}{\cos\theta + v/c},$$
where $v$ is the velocity of the telescope relative to the star.
Show that the corresponding relativistic formula is
$$\tan\theta' = \frac{\sin\theta}{\beta(\cos\theta + v/c)}.$$

3.5 (§3.2) Show that special Lorentz transformations are associative, that is, if $O(v_1)$ represents the transformation from observer $S$ to $S'$, then show that
$$(O(v_1)O(v_2))O(v_3) = O(v_1)(O(v_2)O(v_3)).$$

3.6 (§3.3) An athlete carrying a horizontal 20-ft-long pole runs at a speed $v$ such that $(1 - v^2/c^2)^{-1/2} = 2$ into a 10-ft-long room and closes the door. Explain, in the athlete's frame, in which the room is only 5 ft long, how this is possible. [Hint: no effect travels faster than light.] Show that the minimum length of the room for the performance of this trick is $20/(\sqrt{3} + 2)$ ft. Draw a space-time diagram to indicate what is going on in the rest frame of the athlete.

3.7 (§3.5) A particle has velocity $\mathbf{u} = (u_1, u_2, u_3)$ in $S$ and $\mathbf{u}' = (u'_1, u'_2, u'_3)$ in $S'$. Prove from the velocity transformation formulae that
$$c^2 - u^2 = \frac{c^2(c^2 - u'^2)(c^2 - v^2)}{(c^2 + u'_1 v)^2}.$$
Deduce that, if the speed of a particle is less than $c$ in any one inertial frame, then it is less than $c$ in every inertial frame.

3.8 (§3.7) Check the transformation formulae for the components of acceleration (3.21)-(3.23). Deduce that acceleration is an absolute quantity in special relativity.

3.9 (§3.8) A particle moves from rest at the origin of a frame $S$ along the x-axis, with constant acceleration $\alpha$ (as measured in an instantaneous rest frame). Show that the equation of motion is
$$\alpha x^2 + 2c^2x - \alpha c^2t^2 = 0,$$
and prove that the light signals emitted after time $t = c/\alpha$ at the origin will never reach the receding particle. A standard clock carried along with the particle is set to read zero at the beginning of the motion and reads $\tau$ at time $t$ in $S$. Using the clock hypothesis, prove the following relationships:
$$\frac{u}{c} = \tanh\frac{\alpha\tau}{c}, \qquad \left(1 - \frac{u^2}{c^2}\right)^{-1/2} = \cosh\frac{\alpha\tau}{c}, \qquad \frac{\alpha t}{c} = \sinh\frac{\alpha\tau}{c}, \qquad x = \frac{c^2}{\alpha}\left(\cosh\frac{\alpha\tau}{c} - 1\right).$$
Show that, if $\tau^2$ …

… which satisfies Poisson's equation
$$\nabla^2\phi = 4\pi G\rho$$
at points inside the distribution (here $\phi$ denotes the gravitational potential, $\rho$ the mass density, and $G$ Newton's gravitational constant), where the Laplacian operator $\nabla^2$ is given in Cartesian coordinates by
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$
At points external to the distribution, this reduces to Laplace's equation
$$\nabla^2\phi = 0.$$
We assume that the reader is familiar with this background to Newtonian theory.

4.2 Isolated systems of particles in Newtonian mechanics

In this section, we shall, for completeness, derive the conservation of linear momentum in Newtonian mechanics for a system of $n$ particles. Let the $i$th particle have constant mass $m_i$ and position vector $\mathbf{r}_i$ relative to some arbitrary origin. Then the $i$th particle possesses linear momentum $\mathbf{p}_i$ defined by $\mathbf{p}_i = m_i\dot{\mathbf{r}}_i$, where the dot denotes differentiation with respect to time $t$. If $\mathbf{F}_i$ is the total force on $m_i$, then, by Newton's second law, we have
$$\mathbf{F}_i = \dot{\mathbf{p}}_i = m_i\ddot{\mathbf{r}}_i. \tag{4.7}$$
The total force $\mathbf{F}_i$ on the $i$th particle can be divided into an external force $\mathbf{F}_i^{\text{ext}}$ due to any external fields present and the resultant of the internal forces. We write
$$\mathbf{F}_i = \mathbf{F}_i^{\text{ext}} + \sum_{j=1}^{n}\mathbf{F}_{ij},$$
where $\mathbf{F}_{ij}$ is the force on the $i$th particle due to the $j$th particle and where, for convenience, we define $\mathbf{F}_{ii} = 0$. If we sum over $i$ in (4.7), we find
$$\frac{d}{dt}\sum_{i=1}^{n}\mathbf{p}_i = \sum_{i=1}^{n}\frac{d\mathbf{p}_i}{dt} = \sum_{i=1}^{n}\mathbf{F}_i^{\text{ext}} + \sum_{i,j=1}^{n}\mathbf{F}_{ij}.$$
Using Newton's third law, namely, $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$, then the last term is zero and we obtain $\dot{\mathbf{P}} = \mathbf{F}^{\text{ext}}$, where $\mathbf{P} = \sum_{i=1}^{n}\mathbf{p}_i$ is termed the total linear momentum of the system and $\mathbf{F}^{\text{ext}} = \sum_{i=1}^{n}\mathbf{F}_i^{\text{ext}}$ is the total external force on the system. If, in particular, the system of particles is isolated, then
$$\mathbf{F}^{\text{ext}} = 0 \;\Rightarrow\; \mathbf{P} = \mathbf{c},$$
where $\mathbf{c}$ is a constant vector. This leads to the law of the conservation of linear momentum of the system, namely, that the total linear momentum $\mathbf{P}$ of an isolated system does not change with time.

4.3 Relativistic mass

The transition from Newtonian to relativistic mechanics is not, in fact, completely straightforward, because it involves at some point or another the introduction of ad hoc assumptions about the behaviour of particles in relativistic situations.
We shall adopt the approach of trying to keep as close to the non-relativistic definition of energy and momentum as we can. This leads to results which in the end must be confronted with experiment. The ultimate justification of the formulae we shall derive resides in the fact that they have been repeatedly confirmed in numerous laboratory experiments in particle physics. We shall only derive them in a simple case and state that the arguments can be extended to a more general situation.

It would seem plausible that, since length and time measurements are dependent on the observer, then mass should also be an observer-dependent quantity. We thus assume that a particle which is moving with a velocity $u$ relative to an inertial observer has a mass, which we shall term its relativistic mass, which is some function of $u$, that is,
$$m = m(u), \tag{4.9}$$
where the problem is to find the explicit dependence of $m$ on $u$. We restrict attention to motion along a straight line and consider the special case of two equal particles colliding inelastically (in which case they stick together), and look at the collision from the point of view of two inertial observers $S$ and $S'$ (see Fig. 4.3). Let one of the particles be at rest in the frame $S$ and the other possess a velocity $u$ before they collide. We then assume that they coalesce and that the combined object moves with velocity $U$. The masses of the two particles are respectively $m(0)$ and $m(u)$ by (4.9). We denote $m(0)$ by $m_0$ and term it the rest mass of the particle. In addition, we denote the mass of the combined object by $M(U)$. If we take $S'$ to be the centre-of-mass frame, then it should be clear that, relative to $S'$, the two equal particles collide with equal and opposite speeds, leaving the combined object with mass $M_0$ at rest. It follows that $S'$ must have velocity $U$ relative to $S$.

Fig. 4.3 The inelastic collision in the frames $S$ and $S'$.
We shall assume both conservation of relativistic mass and conservation of linear momentum and see what this leads to. In the frame $S$, we obtain
$$m(u) + m_0 = M(U), \qquad m(u)u + 0 = M(U)U,$$
from which we get, eliminating $M(U)$,
$$m(u) = m_0\left(\frac{U}{u - U}\right). \tag{4.10}$$
The left-hand particle has a velocity $U$ relative to $S'$, which in turn has a velocity $U$ relative to $S$. Hence, using the composition of velocities law, we can compose these two velocities and the resultant velocity must be identical with the velocity $u$ of the left-hand particle in $S$. Thus, by (2.6) in non-relativistic units,
$$u = \frac{2U}{1 + U^2/c^2}.$$
Solving for $U$ in terms of $u$, we obtain the quadratic
$$U^2 - \left(\frac{2c^2}{u}\right)U + c^2 = 0,$$
which has roots
$$U = \frac{c^2}{u} \pm \left[\left(\frac{c^2}{u}\right)^2 - c^2\right]^{1/2} = \frac{c^2}{u}\left[1 \pm \left(1 - \frac{u^2}{c^2}\right)^{1/2}\right].$$
In the limit $u \to 0$, this must produce a finite result, so we must take the negative sign (check), and, substituting in (4.10), we find finally
$$m(u) = \gamma m_0, \tag{4.11}$$
where
$$\gamma = (1 - u^2/c^2)^{-1/2}. \tag{4.12}$$
This is the basic result which relates the relativistic mass of a moving particle to its rest mass. Note that this is the same in structure as the time dilation formula (3.16), i.e. $T = \beta T_0$, where $\beta = (1 - v^2/c^2)^{-1/2}$, except that time dilation involves the factor $\beta$, which depends on the velocity $v$ of the frame $S'$ relative to $S$, whereas $\gamma$ depends on the velocity $u$ of the particle relative to $S$. If we plot $m$ against $u$, we see that relativistic mass increases without bound as $u$ approaches $c$ (Fig. 4.4). It is possible to extend the above argument to establish (4.11) in more general situations. However, we emphasize that it is not possible to derive the result a priori, but only with the help of extra assumptions. However it is produced, the only real test of the validity of the result is in the experimental arena and here it has been extensively confirmed.

4.4 Relativistic energy

Fig. 4.4 Relativistic mass as a function of velocity.
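The collision argument of §4.3 and the growth of relativistic mass shown in Fig. 4.4 can be checked numerically. The sketch below is illustrative only: it assumes units with $c = 1$ and a rest mass of 1, and the function names are not from the text. It takes the negative-sign root for $U$, substitutes it into (4.10), and prints the result next to $\gamma m_0$.

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption of this sketch


def gamma(u):
    """The factor (1 - u^2/c^2)^(-1/2) for a particle of speed u."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def combined_velocity(u):
    """Negative-sign root of U^2 - (2c^2/u) U + c^2 = 0."""
    return (C**2 / u) * (1.0 - math.sqrt(1.0 - (u / C) ** 2))


def mass_from_collision(u, m0=1.0):
    """Relativistic mass from (4.10): m(u) = m0 * U / (u - U)."""
    big_u = combined_velocity(u)
    return m0 * big_u / (u - big_u)


# Tabulate m(u)/m0 from the collision argument against gamma(u).
for u in (0.1, 0.5, 0.9, 0.99):
    print(u, mass_from_collision(u), gamma(u))
```

The two columns should coincide up to rounding, and both grow without bound as $u$ approaches $c$.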
Let us expand the expression for the relativistic mass, namely,
$$m(u) = \gamma m_0 = m_0(1 - u^2/c^2)^{-1/2},$$
in the case when the velocity $u$ is small compared with the speed of light $c$. Then we get
$$m(u) = m_0 + \frac{1}{c^2}\left(\tfrac{1}{2}m_0u^2\right) + O\!\left(\frac{u^4}{c^4}\right), \tag{4.13}$$
where the final term stands for all terms of order $(u/c)^4$ and higher. If we multiply both sides by $c^2$, then, apart from the constant $m_0c^2$, the right-hand side is to first approximation the classical kinetic energy (k.e.), that is,
$$mc^2 = m_0c^2 + \tfrac{1}{2}m_0u^2 + \cdots \simeq \text{constant} + \text{k.e.} \tag{4.14}$$
We have seen that relativistic mass contains within it the expression for classical kinetic energy. In fact, it can be shown that the conservation of relativistic mass leads to the conservation of kinetic energy in the Newtonian approximation. As a simple example, consider the collision of two particles with rest mass $m_0$ and $\bar{m}_0$, initial velocities $v_1$ and $\bar{v}_1$, and final velocities $v_2$ and $\bar{v}_2$, respectively (Fig. 4.5). Conservation of relativistic mass gives
$$m_0(1 - v_1^2/c^2)^{-1/2} + \bar{m}_0(1 - \bar{v}_1^2/c^2)^{-1/2} = m_0(1 - v_2^2/c^2)^{-1/2} + \bar{m}_0(1 - \bar{v}_2^2/c^2)^{-1/2}. \tag{4.15}$$
If we now assume that $v_1$, $v_2$, $\bar{v}_1$, and $\bar{v}_2$ are all small compared with $c$, then we find (exercise) that the leading terms in the expansion of (4.15) give
$$\tfrac{1}{2}m_0v_1^2 + \tfrac{1}{2}\bar{m}_0\bar{v}_1^2 = \tfrac{1}{2}m_0v_2^2 + \tfrac{1}{2}\bar{m}_0\bar{v}_2^2, \tag{4.16}$$
which is the usual conservation of energy equation. Thus, in this sense, conservation of relativistic mass includes within it conservation of energy.

Fig. 4.5 Two colliding particles.

Now, since energy is only defined up to the addition of a constant, the result (4.14) suggests that we regard the energy $E$ of a particle as given by
$$E = mc^2. \tag{4.17}$$
This is one of the most famous equations in physics. However, it is not just a mathematical relationship between two different quantities, namely energy and mass, but rather states that energy and mass are equivalent concepts.
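The low-velocity expansion (4.13)-(4.14) can be illustrated with a few numbers. This is a rough sketch under stated assumptions (units with $c = 1$, unit rest mass, and helper names chosen here for illustration): it compares $mc^2 - m_0c^2$ with the classical kinetic energy $\tfrac{1}{2}m_0u^2$ and reports the relative gap, which by (4.13) should be of order $(u/c)^2$.

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption of this sketch


def relativistic_mass(m0, u):
    """m(u) = m0 (1 - u^2/c^2)^(-1/2)."""
    return m0 / math.sqrt(1.0 - (u / C) ** 2)


def kinetic_terms(m0, u):
    """Return (m c^2 - m0 c^2, classical kinetic energy m0 u^2 / 2)."""
    exact = (relativistic_mass(m0, u) - m0) * C**2
    classical = 0.5 * m0 * u**2
    return exact, classical


def relative_gap(u, m0=1.0):
    """Fraction of m c^2 - m0 c^2 not captured by the classical term."""
    exact, classical = kinetic_terms(m0, u)
    return (exact - classical) / exact


for u in (0.01, 0.1, 0.5):
    print(u, kinetic_terms(1.0, u), relative_gap(u))
```

For speeds of a few per cent of $c$ the classical term accounts for almost all of $mc^2 - m_0c^2$; at half the speed of light the higher-order terms are no longer negligible.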
Because of the arbitrariness in the actual value of $E$, a better way of stating the relationship is to say that a change in energy is equal to a change in relativistic mass, namely,
$$\Delta E = \Delta m\,c^2.$$
Using conventional units, $c^2$ is a large number and indicates that a small change in mass is equivalent to an enormous change in energy. As is well known, this relationship, and the deep implications it carries with it for peace and war, have been amply verified. For obvious reasons, the term $m_0c^2$ is termed the rest energy of the particle. Finally, we point out that conservation of linear momentum, using relativistic mass, leads to the usual conservation law in the Newtonian approximation. For example (exercise), the collision problem considered above leads to the usual conservation of linear momentum equation for slow-moving particles:
$$m_0v_1 + \bar{m}_0\bar{v}_1 = m_0v_2 + \bar{m}_0\bar{v}_2. \tag{4.18}$$
Extending these ideas to three spatial dimensions, then a particle moving with velocity $\mathbf{u}$ relative to an inertial frame $S$ has relativistic mass $m$, energy $E$, and linear momentum $\mathbf{p}$ given by
$$m = \gamma m_0, \qquad E = mc^2, \qquad \mathbf{p} = m\mathbf{u}. \tag{4.19}$$
Some straightforward algebra (exercise) reveals that
$$(E/c)^2 - p_x^2 - p_y^2 - p_z^2 = (m_0c)^2, \tag{4.20}$$
where $m_0c$ is an invariant, since it is the same for all inertial observers. If we compare this with the invariant (3.13), i.e.
$$(ct)^2 - x^2 - y^2 - z^2 = s^2,$$
then it suggests that the quantities $(E/c, p_x, p_y, p_z)$ transform under a Lorentz transformation in the same way as the quantities $(ct, x, y, z)$. We shall see in Part C that the language of tensors provides a better framework for discussing transformation laws. For the moment, we shall assume that energy and momentum transform in an identical manner and quote the results.
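Under that assumption, the invariance of (4.20) can be checked numerically. The sketch below is not from the text: it takes units with $c = 1$, restricts to motion along the x-axis, and assumes the boost obtained by replacing $(ct, x)$ with $(E/c, p_x)$ in the special Lorentz transformation (3.12); all function names and sample values are illustrative.

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption of this sketch


def beta(v):
    """The factor (1 - v^2/c^2)^(-1/2) for the frame velocity v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def energy_momentum(m0, u):
    """E = m c^2 and p_x = m u with m = m0 (1 - u^2/c^2)^(-1/2), as in (4.19)."""
    m = m0 / math.sqrt(1.0 - (u / C) ** 2)
    return m * C**2, m * u


def boost(energy, p_x, v):
    """Assumed transformation: (E/c, p_x) treated exactly like (ct, x)."""
    b = beta(v)
    return b * (energy - v * p_x), b * (p_x - v * energy / C**2)


def invariant(energy, p_x):
    """Left-hand side of (4.20) for motion along the x-axis."""
    return (energy / C) ** 2 - p_x**2


# Illustrative particle: rest mass 2, speed 0.6c.
E, p = energy_momentum(2.0, 0.6)
E_rest, p_rest = boost(E, p, 0.6)    # frame moving with the particle
E_other, p_other = boost(E, p, -0.3)  # some other inertial frame
print(invariant(E, p), invariant(E_rest, p_rest), invariant(E_other, p_other))
```

All three values should equal $(m_0c)^2$, and in the frame moving with the particle the momentum vanishes while the energy reduces to the rest energy $m_0c^2$.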
Thus, in a frame $S'$ moving in standard configuration with velocity $v$ relative to $S$, the transformation equations are (see (3.12))
$$E' = \beta(E - vp_x), \qquad p'_x = \beta(p_x - vE/c^2), \qquad p'_y = p_y, \qquad p'_z = p_z. \tag{4.21}$$
The inverse transformations are obtained in the usual way, namely, by interchanging primes and unprimes and replacing $v$ by $-v$, which gives
$$E = \beta(E' + vp'_x), \qquad p_x = \beta(p'_x + vE'/c^2), \qquad p_y = p'_y, \qquad p_z = p'_z. \tag{4.22}$$
If, in particular, we take $S'$ to be the instantaneous rest frame of the particle, then $\mathbf{p}' = 0$ and $E' = E_0 = m_0c^2$. Substituting in (4.22), we find
$$E = \beta E' = \frac{m_0c^2}{(1 - v^2/c^2)^{1/2}} = mc^2,$$
where $m = m_0(1 - v^2/c^2)^{-1/2}$, and
$$\mathbf{p} = (\beta vE'/c^2, 0, 0) = (mv, 0, 0) = m\mathbf{v},$$
which are precisely the values of the energy, mass, and momentum arrived at in (4.19) with $\mathbf{u}$ replaced by $\mathbf{v}$.

4.5 Photons

At the end of the nineteenth century, there was considerable conflict between theory and experiment in the investigation of radiation in enclosed volumes. In an attempt to resolve the difficulties, Max Planck proposed that light and other electromagnetic radiation consisted of individual 'packets' of energy, which he called quanta. He suggested that the energy $E$ of each quantum was to depend on its frequency $\nu$, and proposed the simple law, called Planck's hypothesis,
$$E = h\nu, \tag{4.23}$$
where $h$ is a universal constant known now as Planck's constant. The idea of the quantum was developed further by Einstein, especially in attempting to explain the photoelectric effect. The effect concerns the ejection of electrons from a metal surface by incident light (especially ultraviolet) and strongly supports Planck's quantum hypothesis. Nowadays, the quantum theory is well established and applications of it to explain properties of molecules, atoms, and fundamental particles are at the heart of modern physics. Theories of light now give it a dual wave-particle nature. Some properties, such as diffraction and interference, are wavelike in nature, while the photoelectric effect and other cases of the interaction of light and atoms are best described on a particle basis. The particle description of light consists in treating it as a stream of quanta called photons.
Using equation (4.19) and substituting in the speed of light, $u = c$, we find
$$m_0 = 0, \tag{4.24}$$
that is, the rest mass of a photon must be zero! This is not so bizarre as it first seems, since no inertial observer ever sees a photon at rest (its speed is always $c$), and so the rest mass of a photon is merely a notional quantity. If we let $\hat{\mathbf{n}}$ be a unit vector denoting the direction of travel of the photon, then
$$\mathbf{p} = (p_x, p_y, p_z) = p\hat{\mathbf{n}},$$
and equation (4.20) becomes
$$(E/c)^2 - p^2 = 0.$$
Taking square roots (and remembering $c$ and $p$ are positive), we find that the energy $E$ of a photon is related to the magnitude $p$ of its momentum by
$$E = pc. \tag{4.25}$$
Finally, using the energy-mass relationship $E = mc^2$, we find that the relativistic mass of a photon is non-zero and is given by
$$m = p/c. \tag{4.26}$$
Combining these results with Planck's hypothesis, we obtain the following formulae for the energy $E$, relativistic mass $m$, and linear momentum $\mathbf{p}$ of the photon:
$$E = h\nu, \qquad m = h\nu/c^2, \qquad \mathbf{p} = (h\nu/c)\hat{\mathbf{n}}. \tag{4.27}$$
It is gratifying to discover that special relativity, which was born to reconcile conflicts in the kinematical properties of light and matter, also includes their mechanical properties in a single all-inclusive system.

We finish this section with an argument which shows that Planck's hypothesis can be derived directly within the framework of special relativity. We have already seen in the last chapter that the radial Doppler effect for a moving source is given by (3.27), namely
$$\frac{\lambda}{\lambda_0} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2},$$
where $\lambda_0$ is the wavelength in the frame of the source and $\lambda$ is the wavelength in the frame of the observer. We write this result, instead, in terms of frequency, using the fundamental relationships $c = \lambda\nu$ and $c = \lambda_0\nu_0$, to obtain
$$\frac{\nu_0}{\nu} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2}. \tag{4.28}$$
Now, suppose that the source emits a light flash of total energy $E_0$. Let us use the equations (4.22) to find the energy received in the frame of the observer $S$. Since, recalling Fig.
3.11, the light flash is travelling along the negative x-direction of both frames, the relationship (4.25) leads to the result $p'_x = -E_0/c$, with the other primed components of momentum zero. Substituting in the first equation of (4.22), namely,
$$E = \beta(E' + vp'_x),$$
we get
$$\frac{E_0}{E} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2}. \tag{4.29}$$
Combining this with equation (4.28), we obtain
$$\frac{E_0}{\nu_0} = \frac{E}{\nu}.$$
Since this relationship holds for any pair of inertial observers, it follows that the ratio must be a universal constant, which we call $h$. Thus, we have derived Planck's hypothesis $E = h\nu$.

We leave our considerations of special relativity at this point and turn our attention to the formalism of tensors. This will enable us to reformulate special relativity in a way which will aid our transition to general relativity, that is, to a theory of gravitation consistent with special relativity.

Exercises

4.1 (§4.1) Discuss the possibility of using force rather than mass as the basic quantity, taking, for example, a standard weight at a given latitude as the unit of force. How should one then define and measure the mass of a body?

4.2 (§4.3) Show that, in the inelastic collision considered in §4.3, the rest mass of the combined object is greater than the sum of the original rest masses. Where does this increase derive from?

4.3 (§4.3) A particle of rest mass $m_0$ and speed $u$ strikes a stationary particle of rest mass $\bar{m}_0$. If the collision is perfectly inelastic, then find the rest mass of the composite particle.

4.4 (§4.4) (i) Establish the transition from equation (4.15) to (4.16). (ii) Establish the Newtonian approximation equation (4.18).

4.5 (§4.4) Show that (4.19) leads to (4.20). Deduce (4.21).

4.6 (§4.4) Newton's second law for a particle of relativistic mass $m$ is
$$\mathbf{F} = \frac{d}{dt}(m\mathbf{u}).$$
Define the work done $dE$ in moving the particle from $\mathbf{r}$ to $\mathbf{r} + d\mathbf{r}$. Show that the rate of doing work is given by
$$\frac{dE}{dt} = \frac{d(m\mathbf{u})}{dt}\cdot\mathbf{u}.$$
Use the definition of relativistic mass to obtain the result
$$\frac{dE}{dt} = \frac{m_0u}{(1 - u^2/c^2)^{3/2}}\,\frac{du}{dt}.$$
[Hint: $\mathbf{u}\cdot d\mathbf{u}/dt = u\,du/dt$.] Express this last result in terms of $dm/dt$ and integrate to obtain
$$E = mc^2 + \text{constant}.$$

4.7 (§4.4) Two particles whose rest masses are $m_1$ and $m_2$ move along a straight line with velocities $u_1$ and $u_2$, measured in the same direction. They collide inelastically to form a new particle. Show that the rest mass and velocity of the new particle are $m_3$ and $u_3$, respectively, where
$$m_3^2 = m_1^2 + m_2^2 + 2m_1m_2\gamma_1\gamma_2(1 - u_1u_2/c^2), \qquad u_3 = \frac{m_1\gamma_1u_1 + m_2\gamma_2u_2}{m_1\gamma_1 + m_2\gamma_2},$$
with
$$\gamma_1 = (1 - u_1^2/c^2)^{-1/2}, \qquad \gamma_2 = (1 - u_2^2/c^2)^{-1/2}.$$

4.8 (§4.4) A particle of rest mass $m_0$, energy $e_0$, and momentum $p_0$ suffers a head-on elastic collision (i.e. masses of particles unaltered) with a stationary mass $M$. In the collision, $M$ is knocked straight forward, with energy $E$ and momentum $P$, leaving the first particle with energy $e$ and momentum $p$. Prove that
$$P = \frac{2p_0M(e_0 + Mc^2)}{2Me_0 + M^2c^2 + m_0^2c^2}$$
and
$$p = \frac{p_0(m_0^2c^2 - M^2c^2)}{2Me_0 + M^2c^2 + m_0^2c^2}.$$
What do these formulae become in the classical limit?

4.9 (§4.4) Assume that the formulae (4.19) hold for a tachyon, which travels with speed $v > c$. Taking the energy to be a measurable quantity, deduce that the rest mass of a tachyon is imaginary and define the real quantity $\mu_0$ by $m_0 = i\mu_0$. If the tachyon moves along the x-axis and if we assume that the x-component of the momentum is a real positive quantity, then deduce
$$m = \frac{v}{|v|}\,\alpha\mu_0, \qquad E = mc^2,$$
where $\alpha = (v^2/c^2 - 1)^{-1/2}$. Plot $E/m_0c^2$ against $v/c$ for both tachyons and subluminal particles.

4.10 (§4.5) Two light rays in the $(x, y)$-plane of an inertial observer, making angles $\theta$ and $-\theta$, respectively, with the positive x-axis, collide at the origin. What is the velocity $v$ of the inertial observer (travelling in standard configuration) who sees the light rays collide head on?

4.11 (§4.5) An atom of rest mass $m_0$ is at rest in a laboratory and absorbs a photon of frequency $\nu$.
Find the velocity and mass of the recoiling particle.

4.12 (§4.5) An atom at rest in a laboratory emits a photon and recoils. If its initial mass is $m_0$ and it loses the rest energy $e$ in the emission, prove that the frequency of the emitted photon is given by
$$\nu = \frac{e}{h}\left(1 - \frac{e}{2m_0c^2}\right).$$

5.1 Introduction

To work effectively in Newtonian theory, one really needs the language of vectors. This language, first of all, is more succinct, since it summarizes a set of three equations in one. Moreover, the formalism of vectors helps to solve certain problems more readily, and, most important of all, the language reveals structure and thereby offers insight. In exactly the same way, in relativity theory, one needs the language of tensors. Again, the language helps to summarize sets of equations succinctly and to solve problems more readily, and it reveals structure in the equations. This part of the book is devoted to learning the formalism of tensors, which is a pre-condition for the rest of the book.

The approach we adopt is to concentrate on the technique of tensors without taking into account the deeper geometrical significance behind the theory. We shall be concerned more with what you do with tensors rather than what tensors actually are. There are two distinct approaches to the teaching of tensors: the abstract or index-free (coordinate-free) approach and the conventional approach based on indices. There has been a move in recent years in some quarters to introduce tensors from the start using the more modern abstract approach (although some have subsequently changed their mind and reverted to the conventional approach). The main advantage of this approach is that it offers deeper geometrical insight. However, it has two disadvantages. First of all, it requires much more of a mathematical background, which in turn takes time to develop.
The other disadvantage is that, for all its elegance, when one wants to do a real calculation with tensors, as one frequently needs to, then recourse has to be made to indices. We shall adopt the more conventional index approach, because it will prove faster and more practical. However, we advise those who wish to take their study of the subject further to look at the index-free approach at the first opportunity. We repeat that the exercises are seen as integral to this part of the book and should not be omitted.

5.2 Manifolds and coordinates

We shall start by working with tensors defined in $n$ dimensions since, and it is part of the power of the formalism, there is little extra effort involved. A tensor is an object defined on a geometric entity called a (differential) manifold. We shall not define a manifold precisely because it would involve us in too much of a digression. But, in simple terms, a manifold is something which 'locally' looks like a bit of n-dimensional Euclidean space $\mathbb{R}^n$. For example, compare a 2-sphere $S^2$ with the Euclidean plane $\mathbb{R}^2$. They are clearly different. But a small bit of $S^2$ looks very much like a small bit of $\mathbb{R}^2$ (if we neglect metrical properties). The fact that $S^2$ is 'compact', i.e. in some sense finite, whereas $\mathbb{R}^2$ 'goes off to infinity' is a global property rather than a local property. We shall not say anything precise about global properties (the topology of the manifold), although the issue will surface when we start to look carefully at solutions of Einstein's equations in general relativity.

Fig. 5.1 Plane polar coordinate curves.

We shall simply take an n-dimensional manifold $M$ to be a set of points such that each point possesses a set of $n$ coordinates $(x^1, x^2, \ldots, x^n)$, where each coordinate ranges over a subset of the reals, which may, in particular, range from $-\infty$ to $+\infty$. To start off with, we can think of these coordinates as corresponding to distances or angles in Euclidean space.
The reason why the coordinates are written as superscripts rather than subscripts will become clear later. Now the key thing about a manifold is that it may not be possible to cover the whole manifold by one non-degenerate coordinate system, namely, one which ascribes a unique set of $n$ coordinate numbers to each point. Sometimes it is simply convenient to use coordinate systems with degenerate points. For example, plane polar coordinates $(R, \phi)$ in the plane have a degeneracy at the origin because $\phi$ is indeterminate there (Fig. 5.1). However, here we could avoid the degeneracy at the origin by using Cartesian coordinates. But in other circumstances we have no choice in the matter. For example, it can be shown that there is no coordinate system which covers the whole of a 2-sphere $S^2$ without degeneracy. The smallest number needed is two, which is shown schematically in Fig. 5.2. We therefore work with coordinate systems which cover only a portion of the manifold and which are called coordinate patches. Figure 5.3 indicates this schematically. A set of coordinate patches which covers the whole manifold is called an atlas. The theory of manifolds tells us how to get from one coordinate patch to another by a coordinate transformation in the overlap region. The behaviour of geometric quantities under coordinate transformations lies at the heart of tensor calculus.

Fig. 5.2 Two non-degenerate coordinate systems covering an $S^2$ (the first covering the North Pole, the second covering the South Pole, with the coordinate systems overlapping at the equator).

Fig. 5.3 Overlapping coordinate patches in a manifold.

5.3 Curves and surfaces

Given a manifold, we shall be concerned with points in it and subsets of points which define curves and surfaces of different dimensions.
We shall frequently define these curves and surfaces parametrically. Thus (in exactly the same way as is done in Euclidean 2- and 3-space), since a curve has one degree of freedom it depends on one parameter and so we define a curve by the parametric equations
\[ x^a = x^a(u) \quad (a = 1, 2, \ldots, n), \tag{5.1} \]
where $u$ is the parameter and $x^1(u), x^2(u), \ldots, x^n(u)$ denote $n$ functions of $u$. Similarly, since a subspace or surface of $m$ dimensions ($m < n$) has $m$ degrees of freedom, it depends on $m$ parameters and it is given by the parametric equations
\[ x^a = x^a(u^1, u^2, \ldots, u^m) \quad (a = 1, 2, \ldots, n). \tag{5.2} \]
If, in particular, $m = n - 1$, the subspace is called a hypersurface. In this case,
\[ x^a = x^a(u^1, u^2, \ldots, u^{n-1}) \quad (a = 1, 2, \ldots, n) \]
and the $n - 1$ parameters can be eliminated from these $n$ equations to give one equation connecting the coordinates, i.e.
\[ f(x^1, x^2, \ldots, x^n) = 0. \tag{5.3} \]
From a different but equivalent point of view, a point in a general position in a manifold has $n$ degrees of freedom. If it is restricted to lie in a hypersurface, an $(n - 1)$-subspace, then its coordinates must satisfy one constraint, namely,
\[ f(x^1, x^2, \ldots, x^n) = 0, \]
which is the same as equation (5.3). Similarly, points in an $m$-dimensional subspace ($m < n$) must satisfy $n - m$ constraints
\[ \left. \begin{aligned} f^1(x^1, x^2, \ldots, x^n) &= 0, \\ f^2(x^1, x^2, \ldots, x^n) &= 0, \\ &\;\;\vdots \\ f^{n-m}(x^1, x^2, \ldots, x^n) &= 0, \end{aligned} \right\} \tag{5.4} \]
which is an alternative to the parametric representation (5.2).

5.4 Transformation of coordinates

As we have seen, a point in a manifold can be covered by many different coordinate patches. The essential point about tensor calculus is that when we make a statement about tensors we do not wish it simply to hold just for one coordinate system but rather for all coordinate systems. Consequently, we need to find out how quantities behave when we go from one coordinate system to another one. We therefore consider the change of coordinates $x^a \to x'^a$ given by the $n$ equations
\[ x'^a = f^a(x^1, x^2, \ldots, x^n) \quad (a = 1, 2, \ldots, n), \tag{5.5} \]
where the $f$'s are single-valued continuous differentiable functions, at least for certain ranges of their arguments. Hence, at this stage, we view a coordinate transformation passively as assigning to a point of the manifold whose old coordinates are $(x^1, x^2, \ldots, x^n)$ the new primed coordinates $(x'^1, x'^2, \ldots, x'^n)$. We can write (5.5) more succinctly as
\[ x'^a = f^a(x), \]
where, from now on, lower case Latin indices are assumed to run from 1 to $n$, the dimension of the manifold, and the $f^a$ are all functions of the old unprimed coordinates. Furthermore, we can write the equation more simply still as
\[ x'^a = x'^a(x), \tag{5.6} \]
where $x'^a(x)$ denote the $n$ functions $f^a(x)$. Notation plays an important role in tensor calculus, and equation (5.6) is clearly easier to write than equation (5.5).

We next contemplate differentiating (5.6) with respect to each of the coordinates $x^b$. This produces the $n \times n$ transformation matrix of coefficients:
\[ \left[ \frac{\partial x'^a}{\partial x^b} \right] = \begin{bmatrix} \dfrac{\partial x'^1}{\partial x^1} & \dfrac{\partial x'^1}{\partial x^2} & \cdots & \dfrac{\partial x'^1}{\partial x^n} \\ \dfrac{\partial x'^2}{\partial x^1} & \dfrac{\partial x'^2}{\partial x^2} & \cdots & \dfrac{\partial x'^2}{\partial x^n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial x'^n}{\partial x^1} & \dfrac{\partial x'^n}{\partial x^2} & \cdots & \dfrac{\partial x'^n}{\partial x^n} \end{bmatrix}. \tag{5.7} \]
The determinant $J'$ of this matrix is called the Jacobian of the transformation:
\[ J' = \left| \frac{\partial x'^a}{\partial x^b} \right|. \tag{5.8} \]
We shall assume that this is non-zero for some range of the coordinates $x^b$. Then it follows from the implicit function theorem that we can (in principle) solve equation (5.6) for the old coordinates $x^a$ and obtain the inverse transformation equations
\[ x^a = x^a(x'). \tag{5.9} \]
It follows from the product rule for determinants that, if we define the Jacobian of the inverse transformation by
\[ J = \left| \frac{\partial x^a}{\partial x'^b} \right|, \]
then $J = 1/J'$.

In three dimensions, if the equation of a surface is given by $z = f(x, y)$, then its total differential is defined to be
\[ dz = \frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy. \]
Then, in an exactly analogous manner, starting from (5.6), we define the total differential
\[ dx'^a = \frac{\partial x'^a}{\partial x^1}\, dx^1 + \frac{\partial x'^a}{\partial x^2}\, dx^2 + \cdots + \frac{\partial x'^a}{\partial x^n}\, dx^n \]
for each $a$ running from 1 to $n$. We can write this more economically by introducing an explicit summation sign:
\[ dx'^a = \sum_{b=1}^{n} \frac{\partial x'^a}{\partial x^b}\, dx^b. \tag{5.10} \]
This can be written more economically still by introducing the Einstein summation convention: whenever a literal index is repeated, it is understood to imply a summation over the index from 1 to $n$, the dimension of the manifold. Hence, we can write (5.10) simply as
\[ dx'^a = \frac{\partial x'^a}{\partial x^b}\, dx^b. \tag{5.11} \]
The index $a$ occurring on each side of this equation is said to be free and may take on separately any value from 1 to $n$. The index $b$ on the right-hand side is repeated and hence there is an implied summation from 1 to $n$. A repeated index is called bound or dummy because it can be replaced by any other index not already in use. For example,
\[ \frac{\partial x'^a}{\partial x^b}\, dx^b = \frac{\partial x'^a}{\partial x^c}\, dx^c, \]
because $c$ was not already in use in the expression.

We define the Kronecker delta $\delta^a_b$ to be a quantity which is either 0 or 1 according to
\[ \delta^a_b = \begin{cases} 1 & \text{if } a = b, \\ 0 & \text{if } a \neq b. \end{cases} \tag{5.12} \]
It therefore follows directly from the definition of partial differentiation (check) that
\[ \frac{\partial x'^a}{\partial x'^b} = \frac{\partial x^a}{\partial x^b} = \delta^a_b. \tag{5.13} \]

5.5 Contravariant tensors

The approach we are going to adopt is to define a geometrical quantity in terms of its transformation properties under a coordinate transformation (5.6). We shall start with a prototype and then give the general definition. Consider two neighbouring points in the manifold $P$ and $Q$ with coordinates $x^a$ and $x^a + dx^a$, respectively. The two points define an infinitesimal displacement or infinitesimal vector $\overrightarrow{PQ}$. The vector is not to be regarded as free, but as being attached to the point $P$ (Fig. 5.4). The components of this vector in the $x^a$-coordinate system are $dx^a$. The components in another coordinate system, say the $x'^a$-coordinate system, are $dx'^a$, which are connected to $dx^a$ by (5.11), namely,
\[ dx'^a = \frac{\partial x'^a}{\partial x^b}\, dx^b. \tag{5.14} \]

Fig. 5.4 Infinitesimal vector $\overrightarrow{PQ}$ attached to $P$.

The transformation matrix appearing in this equation is to be regarded as being evaluated at the point $P$, i.e. strictly speaking we should write
\[ dx'^a = \left[ \frac{\partial x'^a}{\partial x^b} \right]_P dx^b, \tag{5.15} \]
but with this understood we shall stick to the notation of (5.14).
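The following is a minimal numerical sketch of (5.7), (5.8) and (5.14), not part of the original text. It assumes, purely for illustration, that the old coordinates are Cartesian $(x, y)$ in the plane and the new ones are plane polar $(R, \phi)$; the function names and the sample point are hypothetical. It evaluates the transformation matrix at a point $P$, checks that the two Jacobians are reciprocal, and transforms a small displacement.

```python
import math

def polar_from_cartesian(x, y):
    """New (primed) coordinates x'^a = (R, phi) as functions of the old ones."""
    return math.hypot(x, y), math.atan2(y, x)

def jacobian_matrix(x, y):
    """Transformation matrix [dx'^a/dx^b] evaluated at P = (x, y), cf. (5.7)."""
    r = math.hypot(x, y)
    return [[x / r, y / r],
            [-y / r**2, x / r**2]]

def inverse_jacobian_matrix(x, y):
    """Matrix [dx^a/dx'^b] of the inverse map x = R cos(phi), y = R sin(phi)."""
    r, phi = polar_from_cartesian(x, y)
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi), r * math.cos(phi)]]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def apply(m, v):
    """Summation convention written out: v'^a = m[a][b] * v[b], summed over b."""
    return [sum(m[a][b] * v[b] for b in range(2)) for a in range(2)]

# Illustrative point P and small displacement dx^a (assumed values).
P = (3.0, 4.0)
dx = [1e-6, 2e-6]

J_prime = det2(jacobian_matrix(*P))        # Jacobian J' of the transformation
J = det2(inverse_jacobian_matrix(*P))      # Jacobian J of the inverse transformation
dx_prime = apply(jacobian_matrix(*P), dx)  # components dx'^a from (5.14)

print("J' * J =", J_prime * J)
print("dx'^a  =", dx_prime)
```

Comparing `dx_prime` with the actual change in $(R, \phi)$ between $P$ and the displaced point shows agreement to first order in the displacement, which is all that (5.14) claims.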
Thus, $[\partial x'^a/\partial x^b]_P$ consists of an $n \times n$ matrix of real numbers. The transformation is therefore a linear homogeneous transformation. This is our prototype.

A contravariant vector or contravariant tensor of rank (order) 1 is a set of quantities, written $X^a$ in the $x^a$-coordinate system, associated with a point $P$, which transforms under a change of coordinates according to
\[ X'^a = \frac{\partial x'^a}{\partial x^b}\, X^b, \tag{5.16} \]
where the transformation matrix is evaluated at $P$. The infinitesimal vector $dx^a$ is a special case of (5.16) where the components $X^a$ are infinitesimal. An example of a vector with finite components is provided by the tangent vector $dx^a/du$ to the curve $x^a = x^a(u)$. It is important to distinguish between the actual geometric object like the tangent vector in Fig. 5.5 (depicted by an arrow) and its representation in a particular coordinate system, like the $n$ numbers $[dx^a/du]_P$ in the $x^a$-coordinate system and the (in general) different numbers $[dx'^a/du]_P$ in the $x'^a$-coordinate system.

Fig. 5.5 The tangent vector at two points of a curve $x^a = x^a(u)$.

We now generalize the definition (5.16) to obtain contravariant tensors of higher rank or order. Thus, a contravariant tensor of rank 2 is a set of $n^2$ quantities associated with a point $P$, denoted by $X^{ab}$ in the $x^a$-coordinate system, which transform according to
\[ X'^{ab} = \frac{\partial x'^a}{\partial x^c} \frac{\partial x'^b}{\partial x^d}\, X^{cd}. \tag{5.17} \]
The quantities $X'^{ab}$ are the components in the $x'^a$-coordinate system, the transformation matrices are evaluated at $P$, and the law involves two dummy indices $c$ and $d$. An example of such a quantity is provided by the product $Y^a Z^b$, say, of two contravariant vectors $Y^a$ and $Z^a$. The definition of third- and higher-order contravariant tensors proceeds in an analogous manner. An important case is a tensor of zero rank, called a scalar or scalar invariant $\phi$, which transforms according to
\[ \phi' = \phi \]
at $P$.

5.6 Covariant and mixed tensors

As in the last section, we begin by considering the transformation of a prototype quantity. Let

) in $\mathbb{R}^2$ and obtain the transformation matrix $[\partial x'^a/\partial x^b]$ expressed as a function of the primed coordinates. Find the components of the tangent vector to the curve consisting of a circle of radius $a$ centred at the origin with the standard parametrization (see Exercise 5.1 (i)) and use (5.16) to find its components in the primed coordinate system.

5.7 (§5.6) Write down the definition of a tensor of type (2, 1).

5.8 (§5.6) Prove that $\delta^a_b$ has the tensor character indicated. Prove also that $\delta^a_b$ is a constant or numerical tensor, that is, it has the same components in all coordinate systems.

5.9 (§5.6) Show, by differentiating (5.20) with respect to $x'^c$, that $\partial^2 \phi/\partial x^a \partial x^b$ is not a tensor.

5.10 (§5.8) Show that if $Y^a{}_{bc}$ and $Z^a{}_{bc}$ are tensors of the type indicated then so are their sum and difference.

5.11 (§5.8) (i) Show that the fact that a covariant second rank tensor is symmetric in one coordinate system is a tensorial property. (ii) If $X^{ab}$ is anti-symmetric and $Y_{ab}$ is symmetric then prove that $X^{ab} Y_{ab} = 0$.

5.12 (§5.8) Prove that any covariant (or contravariant) tensor of rank 2 can be written as the sum of a symmetric and an anti-symmetric tensor. [Hint: consider the identity $X_{ab} = \tfrac{1}{2}(X_{ab} + X_{ba}) + \tfrac{1}{2}(X_{ab} - X_{ba})$.]

5.13 (§5.8) If $X^a{}_{bc}$ is a tensor of the type indicated, then prove that the contracted quantity $Y_c = X^a{}_{ac}$ is a covariant vector.

5.14 (§5.8) Evaluate $\delta^a_a$ and $\delta^a_b \delta^b_a$ in $n$ dimensions.

5.15 (§5.9) Check that the definition of the Lie bracket leads to the results (5.37), (5.38), and (5.39).

5.16 (§5.9) In $\mathbb{R}^2$, let $(x^a) = (x, y)$ denote Cartesian and $(x'^a) = (R, \phi)$ plane polar coordinates (see Exercise 5.6).
(i) If the vector field $X$ has components $X^a = (1, 0)$, then find $X'^a$.
(ii) The operator grad can be written in each coordinate system as
\[ \operatorname{grad} f = \frac{\partial f}{\partial x}\, \mathbf{i} + \frac{\partial f}{\partial y}\, \mathbf{j} = \frac{\partial f}{\partial R}\, \hat{\mathbf{R}} + \frac{1}{R} \frac{\partial f}{\partial \phi}\, \hat{\boldsymbol{\phi}}, \]
where $f$ is an arbitrary function and
\[ \hat{\mathbf{R}} = \cos\phi\, \mathbf{i} + \sin\phi\, \mathbf{j}, \qquad \hat{\boldsymbol{\phi}} = -\sin\phi\, \mathbf{i} + \cos\phi\, \mathbf{j}. \]
Take the scalar product of $\operatorname{grad} f$ with $\mathbf{i}$, $\mathbf{j}$, $\hat{\mathbf{R}}$, and $\hat{\boldsymbol{\phi}}$ in turn to find relationships between the operators $\partial/\partial x$, $\partial/\partial y$, $\partial/\partial R$, and $\partial/\partial \phi$.
(iii) Express the vector field $X$ as an operator in each coordinate system. Use part (ii) to show that these expressions are the same.
(iv) If $Y^a = (0, 1)$ and $Z^a = (-y, x)$, then find $Y'^a$, $Z'^a$, $Y$, and $Z$.
(v) Evaluate all the Lie brackets of $X$, $Y$, and $Z$.

6.1 Partial derivative of a tensor

In the last chapter, we met algebraic operations which are tensorial, that is, which convert tensors into tensors. The operations are addition, subtraction, multiplication, and contraction. The next question which arises is: what differential operations are there that are tensorial? The answer to this turns out to be very much more involved. The first thing we shall see is that partial differentiation of tensors is not tensorial. Different authors denote the partial derivative of a contravariant vector $X^a$ by
\[ \partial_b X^a \quad \text{or} \quad \frac{\partial X^a}{\partial x^b} \quad \text{or} \quad X^a{}_{,b} \quad \text{or} \quad X^a{}_{|b} \]
and similarly for higher-rank tensors. We shall use a mixture of the first three notations. (Note that in the literature, the partial derivative of a tensor is often referred to as the ordinary derivative of a tensor, to distinguish it from the tensorial differentiation we shall shortly meet.) Now differentiating (5.16) with respect to $x'^c$, we find
\[ \partial'_c X'^a = \frac{\partial}{\partial x'^c} \left( \frac{\partial x'^a}{\partial x^b}\, X^b \right) = \frac{\partial x'^a}{\partial x^b} \frac{\partial x^d}{\partial x'^c}\, \partial_d X^b + \frac{\partial^2 x'^a}{\partial x^b\, \partial x^d} \frac{\partial x^d}{\partial x'^c}\, X^b. \tag{6.1} \]
If the first term on the right-hand side alone were present, then this would be the usual tensor transformation law for a tensor of type (1, 1). However, the presence of the second term prevents $\partial_b X^a$ from behaving like a tensor.

There is a fundamental reason why this is the case. By definition, the process of differentiation involves comparing a quantity evaluated at two neighbouring points, $P$ and $Q$ say, dividing by some parameter representing the separation of $P$ and $Q$ and then taking the limit as this parameter goes to zero. In the case of a contravariant vector field $X^a$, this would involve computing
\[ \lim_{\delta u \to 0} \frac{[X^a]_P - [X^a]_Q}{\delta u} \]
for some appropriate parameter $\delta u$. However, from the transformation law in the form (5.25), we see that
\[ X'^a_P = \left[ \frac{\partial x'^a}{\partial x^b} \right]_P X^b_P \quad \text{and} \quad X'^a_Q = \left[ \frac{\partial x'^a}{\partial x^b} \right]_Q X^b_Q. \]
This involves the transformation matrix evaluated at different points, from which it should be clear that $X^a_P - X^a_Q$ is not a tensor. Similar remarks hold for differentiating tensors in general.

It turns out that if we wish to differentiate a tensor in a tensorial manner then we need to introduce some auxiliary field onto the manifold. We shall meet three different types of differentiation. First of all, in the next section, we shall introduce a contravariant vector field onto the manifold and use it to define the Lie derivative. Then we shall introduce a quantity called an affine connection and use it to define covariant differentiation. Finally, we shall introduce a tensor called a metric and from it build a special affine connection, called the metric connection, and again define covariant differentiation but relative to this specific connection.

6.2 The Lie derivative

The argument we present in this section is rather intricate. It rests on the idea of interpreting a coordinate transformation actively as a point transformation, rather than passively as we have done up to now. The important results occur at the end of the section and consist of the formula for the Lie derivative of a general tensor field and the basic properties of Lie differentiation.

We start by considering a congruence of curves defined such that only one curve goes through each point in the manifold. Then, given any one curve of the congruence, $x^a = x^a(u)$, we can use it to define the tangent vector field $dx^a/du$ along the curve. If we do this for every curve in the congruence, then we end up with a vector field $X^a$ (given by $dx^a/du$ at each point) defined over the whole manifold (Fig. 6.1).
Conversely, given a non-zero vector field $X^a(x)$ defined over the manifold, then this can be used to define a congruence of curves in the manifold called the orbits or trajectories of $X^a$. The procedure is exactly the same as the way in which a vector field gives rise to field lines or streamlines in vector analysis. These curves are obtained by solving the ordinary differential equations
\[ \frac{dx^a}{du} = X^a(x(u)). \tag{6.2} \]
The existence and uniqueness theorem for ordinary differential equations guarantees a solution, at least for some subset of the reals. In what follows, we are really only interested in what happens locally (Fig. 6.2).

Fig. 6.1 The tangent vector field resulting from a congruence of curves.

Fig. 6.2 The local congruence of curves resulting from a vector field.

We therefore assume that $X^a$ has been given and we have constructed the local congruence of curves. Suppose we have some tensor field $T^{a\cdots}{}_{b\cdots}(x)$ which we wish to differentiate using $X^a$. Then the essential idea is to use the congruence of curves to drag the tensor at some point $P$ (i.e. $T^{a\cdots}{}_{b\cdots}(P)$) along the curve passing through $P$ to some neighbouring point $Q$, and then compare this 'dragged-along tensor' with the tensor already there (i.e. $T^{a\cdots}{}_{b\cdots}(Q)$) (Fig. 6.3). Since the dragged-along tensor will be of the same type as the tensor already at $Q$, we can subtract the two tensors at $Q$ and so define a derivative by some limiting process as $Q$ tends to $P$. The technique for dragging involves viewing the coordinate transformation from $P$ to $Q$ actively, and applying it to the usual transformation law for tensors.

Fig. 6.3 Using the congruence to compare tensors at neighbouring points.

We shall consider the detailed calculation in the case of a contravariant tensor field of rank 2, $T^{ab}(x)$ say. Consider the transformation
\[ x'^a = x^a + \delta u\, X^a(x), \tag{6.3} \]
where $\delta u$ is small.
This is called a point transformation and is to be regarded actively as sending the point $P$, with coordinates $x^a$, to the point $Q$, with coordinates $x^a + \delta u\, X^a(x)$, where the coordinates of each point are given in the same $x^a$-coordinate system, i.e.
\[ P \to Q, \qquad x^a \to x^a + \delta u\, X^a(x). \]
The point $Q$ clearly lies on the curve of the congruence through $P$ which $X^a$ generates (Fig. 6.4). Differentiating (6.3), we get
\[ \frac{\partial x'^a}{\partial x^b} = \delta^a_b + \delta u\, \partial_b X^a. \tag{6.4} \]

Fig. 6.4 The point $P$ transformed to $Q$ in the same $x^a$-coordinate system.

Next, consider the tensor field $T^{ab}$ at the point $P$. Then its components at $P$ are $T^{ab}(x)$ and, under the point transformation (6.3), we have the mapping
\[ T^{ab}(x) \to T'^{ab}(x'), \]
i.e. the transformation 'drags' the tensor $T^{ab}$ along from $P$ to $Q$. The components of the dragged-along tensor are given by the usual transformation law for tensors (see (5.25)), and so, using (6.4),
\[ \begin{aligned} T'^{ab}(x') &= \frac{\partial x'^a}{\partial x^c} \frac{\partial x'^b}{\partial x^d}\, T^{cd}(x) \\ &= (\delta^a_c + \delta u\, \partial_c X^a)(\delta^b_d + \delta u\, \partial_d X^b)\, T^{cd}(x) \\ &= T^{ab}(x) + [\partial_c X^a\, T^{cb}(x) + \partial_d X^b\, T^{ad}(x)]\, \delta u + O(\delta u^2). \end{aligned} \tag{6.5} \]
Applying Taylor's theorem to first order, we get
\[ T^{ab}(x') = T^{ab}(x^c + \delta u\, X^c(x)) = T^{ab}(x) + \delta u\, X^c \partial_c T^{ab}(x). \tag{6.6} \]
We are now in a position to define the Lie derivative of $T^{ab}$ with respect to $X^a$, which is denoted by $L_X T^{ab}$, as
\[ L_X T^{ab} = \lim_{\delta u \to 0} \frac{T^{ab}(x') - T'^{ab}(x')}{\delta u}. \tag{6.7} \]
This involves comparing the tensor $T^{ab}(x')$ already at $Q$ with $T'^{ab}(x')$, the dragged-along tensor at $Q$. Using (6.5) and (6.6), we find
\[ L_X T^{ab} = X^c \partial_c T^{ab} - T^{ac} \partial_c X^b - T^{cb} \partial_c X^a. \tag{6.8} \]
It can be shown that it is always possible to introduce a coordinate system such that the curve passing through $P$ is given by $x^1$ varying, with $x^2, x^3, \ldots, x^n$ all constant along the curve, and such that
\[ X^a \overset{*}{=} \delta^a_1 = (1, 0, 0, \ldots, 0) \tag{6.9} \]
along this curve. The notation $\overset{*}{=}$ used in (6.9) means that the equation holds only in a particular coordinate system. Then it follows that
\[ X = X^a \partial_a \overset{*}{=} \partial_1, \]
and equation (6.8) reduces to
\[ L_X T^{ab} \overset{*}{=} \partial_1 T^{ab}. \tag{6.10} \]
Thus, in this special coordinate system, Lie differentiation reduces to ordinary differentiation. In fact, one can define Lie differentiation starting from this viewpoint.
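As a rough numerical illustration (not from the original text), the sketch below evaluates the Lie derivative of a rank-2 contravariant tensor field in the standard form $L_X T^{ab} = X^c \partial_c T^{ab} - T^{ac} \partial_c X^b - T^{cb} \partial_c X^a$, using central differences for the partial derivatives. The chosen fields are assumptions made for the example: in the Cartesian plane, $T^{ab}$ is taken to have components $\delta^{ab}$, and $X^a$ is either the rotation field $(-y, x)$ or the dilation field $(x, y)$. All function names are hypothetical.

```python
def partial(f, x, c, h=1e-5):
    """Central-difference estimate of the partial derivative of f w.r.t. x^c at x."""
    forward = list(x)
    backward = list(x)
    forward[c] += h
    backward[c] -= h
    return (f(forward) - f(backward)) / (2 * h)

def lie_derivative_rank2(X, T, x):
    """Components of L_X T^{ab} at the point x for a contravariant rank-2 field T."""
    n = len(x)
    T_at_x = T(x)
    X_at_x = X(x)
    result = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            total = 0.0
            for c in range(n):
                # X^c d_c T^{ab}
                total += X_at_x[c] * partial(lambda p: T(p)[a][b], x, c)
                # - T^{ac} d_c X^b
                total -= T_at_x[a][c] * partial(lambda p: X(p)[b], x, c)
                # - T^{cb} d_c X^a
                total -= T_at_x[c][b] * partial(lambda p: X(p)[a], x, c)
            result[a][b] = total
    return result

def rotation(p):
    """Assumed example field X^a = (-y, x), generating rotations about the origin."""
    return [-p[1], p[0]]

def dilation(p):
    """Assumed example field X^a = (x, y), generating uniform dilations."""
    return [p[0], p[1]]

def delta_upper(p):
    """Assumed example tensor field with Cartesian components delta^{ab}."""
    return [[1.0, 0.0], [0.0, 1.0]]

point = [0.7, -1.3]
lie_rot = lie_derivative_rank2(rotation, delta_upper, point)
lie_dil = lie_derivative_rank2(dilation, delta_upper, point)
print("rotation:", lie_rot)
print("dilation:", lie_dil)
```

Under these assumptions the rotation field leaves $\delta^{ab}$ unchanged, so the Lie derivative vanishes, whereas dragging along the dilation field rescales it and gives $-2\delta^{ab}$.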
We end the section by collecting together some important properties of Lie differentiation with respect to $X$ which follow from its definition.

1. It is linear; for example
\[ L_X(\lambda Y^a + \mu Z^a) = \lambda L_X Y^a + \mu L_X Z^a, \tag{6.11} \]
where $\lambda$ and $\mu$ are constants. Thus, in particular, the Lie derivative of the sum and difference of two tensors is the sum and difference, respectively, of the Lie derivatives of the two tensors.

2. It is Leibniz; that is, it satisfies the usual product rule for differentiation, for example
\[ L_X(Y^a Z_{bc}) = Y^a (L_X Z_{bc}) + (L_X Y^a) Z_{bc}. \tag{6.12} \]

3. It is type-preserving; that is, the Lie derivative of a tensor of type $(p, q)$ is again a tensor of type $(p, q)$.

4. It commutes with contraction; for example
\[ \delta^b_a\, L_X T^a{}_b = L_X T^a{}_a. \tag{6.13} \]

5. The Lie derivative of a scalar field $\phi$ is given by
\[ L_X \phi = X\phi = X^a \partial_a \phi. \tag{6.14} \]

6. The Lie derivative of a contravariant vector field $Y^a$ is given by the Lie bracket of $X$ and $Y$, that is,
\[ L_X Y^a = [X, Y]^a = X^b \partial_b Y^a - Y^b \partial_b X^a. \tag{6.15} \]

7. The Lie derivative of a covariant vector field $Y_a$ is given by
\[ L_X Y_a = X^b \partial_b Y_a + Y_b\, \partial_a X^b. \tag{6.16} \]

8. The Lie derivative of a general tensor field $T^{a\cdots}{}_{b\cdots}$ is obtained as follows: we first partially differentiate the tensor and contract it with $X$. We then get an additional term for each index of the form of the last terms in (6.15) and (6.16), where the corresponding sign is negative for a contravariant index and positive for a covariant index, that is,
\[ L_X T^{a\cdots}{}_{b\cdots} = X^c \partial_c T^{a\cdots}{}_{b\cdots} - T^{c\cdots}{}_{b\cdots}\, \partial_c X^a - \cdots + T^{a\cdots}{}_{c\cdots}\, \partial_b X^c + \cdots. \tag{6.17} \]

6.3 The affine connection and covariant differentiation

Consider a contravariant vector field $X^a(x)$ evaluated at a point $Q$, with coordinates $x^a + \delta x^a$, near to a point $P$, with coordinates $x^a$. Then, by Taylor's theorem,
\[ X^a(x + \delta x) = X^a(x) + \delta x^b\, \partial_b X^a \tag{6.18} \]
to first order. If we denote the second term by $\delta X^a(x)$, i.e.
\[ \delta X^a(x) = \delta x^b\, \partial_b X^a = X^a(x + \delta x) - X^a(x), \tag{6.19} \]
then it is not tensorial since it involves subtracting tensors evaluated at two different points. We are going to define a tensorial derivative by introducing a vector at $Q$ which in some general sense is 'parallel' to $X^a$ at $P$.

Fig. 6.5 The parallel vector $X^a + \bar{\delta} X^a$ at $Q$.
Since $x^a + \delta x^a$ is close to $x^a$, we can assume that the parallel vector only differs from $X^a(x)$ by a small amount, which we denote $\bar{\delta} X^a(x)$ (Fig. 6.5). By the same argument as in §6.1 above, $\bar{\delta} X^a(x)$ is not tensorial, but we shall construct it in such a way as to make the difference vector
\[ X^a(x) + \delta X^a(x) - [X^a(x) + \bar{\delta} X^a(x)] = \delta X^a(x) - \bar{\delta} X^a(x) \tag{6.20} \]
tensorial. It is natural to require that $\bar{\delta} X^a(x)$ should vanish whenever $X^a(x)$ or $\delta x^a$ does. Then the simplest definition is to assume that $\bar{\delta} X^a$ is linear in both $X^a$ and $\delta x^a$, which means that there exist multiplicative factors $\Gamma^a_{bc}$ where
\[ \bar{\delta} X^a(x) = -\Gamma^a_{bc}(x)\, X^b(x)\, \delta x^c \tag{6.21} \]
and the minus sign is introduced to agree with convention.

We have therefore introduced a set of $n^3$ functions $\Gamma^a_{bc}(x)$ on the manifold, whose transformation properties have yet to be determined. This we do by defining the covariant derivative of $X^a$, written in one of the notations (where we shall use a mixture of the first two)
\[ \nabla_c X^a \quad \text{or} \quad X^a{}_{;c} \quad \text{or} \quad X^a{}_{\|c}, \]
by the limiting process
\[ \nabla_c X^a = \lim_{\delta x^c \to 0} \frac{1}{\delta x^c} \left\{ X^a(x + \delta x) - [X^a(x) + \bar{\delta} X^a(x)] \right\}. \]
In other words, it is the difference between the vector $X^a(Q)$ and the vector at $Q$ parallel to $X^a(P)$, divided by the coordinate differences, in the limit as these differences tend to zero. Using (6.18) and (6.21), we find
\[ \nabla_c X^a = \partial_c X^a + \Gamma^a_{bc} X^b. \tag{6.22} \]
Note that in the formula the differentiation index $c$ comes second in the downstairs indices of $\Gamma$. If we now demand that $\nabla_c X^a$ is a tensor of type (1, 1), then a straightforward calculation (exercise) reveals that $\Gamma^a_{bc}$ must transform according to
\[ \Gamma'^a_{bc} = \frac{\partial x'^a}{\partial x^d} \frac{\partial x^e}{\partial x'^b} \frac{\partial x^f}{\partial x'^c}\, \Gamma^d_{ef} - \frac{\partial x^d}{\partial x'^b} \frac{\partial x^e}{\partial x'^c} \frac{\partial^2 x'^a}{\partial x^d\, \partial x^e} \tag{6.23} \]
or equivalently (exercise)
\[ \Gamma'^a_{bc} = \frac{\partial x'^a}{\partial x^d} \frac{\partial x^e}{\partial x'^b} \frac{\partial x^f}{\partial x'^c}\, \Gamma^d_{ef} + \frac{\partial x'^a}{\partial x^d} \frac{\partial^2 x^d}{\partial x'^b\, \partial x'^c}. \tag{6.24} \]
If the second term on the right-hand side were absent, then this would be the usual transformation law for a tensor of type (1, 2). However, the presence of the second term reveals that the transformation law is linear inhomogeneous, and so $\Gamma^a_{bc}$ is not a tensor. Any quantity $\Gamma^a_{bc}$ which transforms according to (6.23) or (6.24) is called an affine connection or sometimes simply a connection or affinity.
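As a hedged illustration of the inhomogeneous term (this worked example is not taken from the text), consider the Euclidean plane with Cartesian coordinates $(x, y)$ and suppose, as an assumption, that the connection components vanish there. Taking the transformation law in the form whose second term is $(\partial x'^a/\partial x^d)(\partial^2 x^d/\partial x'^b\, \partial x'^c)$, the components in plane polar coordinates $(R, \phi)$ then come entirely from that term:

```latex
% Assumed setup: x = R\cos\phi, y = R\sin\phi, with \Gamma^a_{bc} = 0 in (x, y).
\begin{align*}
\Gamma'^{R}_{\phi\phi}
  &= \frac{\partial R}{\partial x}\frac{\partial^2 x}{\partial\phi^2}
   + \frac{\partial R}{\partial y}\frac{\partial^2 y}{\partial\phi^2}
   = \cos\phi\,(-R\cos\phi) + \sin\phi\,(-R\sin\phi) = -R, \\
\Gamma'^{\phi}_{R\phi}
  &= \frac{\partial\phi}{\partial x}\frac{\partial^2 x}{\partial R\,\partial\phi}
   + \frac{\partial\phi}{\partial y}\frac{\partial^2 y}{\partial R\,\partial\phi}
   = \left(-\frac{\sin\phi}{R}\right)(-\sin\phi) + \frac{\cos\phi}{R}\,\cos\phi
   = \frac{1}{R}.
\end{align*}
```

By the symmetry of the second derivatives, $\Gamma'^{\phi}_{\phi R} = 1/R$ as well, and the remaining components vanish. The components are therefore non-zero in polar coordinates although they all vanish in Cartesian coordinates, which could not happen for a tensor.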
A manifold with a continuous connection prescribed on it is called an affine manifold. From another point of view, the existence of the inhomogeneous term in the transformation law is not surprising if we are to define a tensorial derivative, since its role is to compensate for the second term which occurs in (6.1).

We next define the covariant derivative of a scalar field $\phi$ to be the same as its ordinary derivative, i.e.
\[ \nabla_a \phi = \partial_a \phi. \tag{6.25} \]
If we now demand that covariant differentiation satisfies the Leibniz rule, then we find
\[ \nabla_c X_a = \partial_c X_a - \Gamma^b_{ac} X_b. \tag{6.26} \]
Notice again that the differentiation index comes last in the $\Gamma$-term and that this term enters with a minus sign. The name covariant derivative stems from the fact that the derivative of a tensor of type $(p, q)$ is of type $(p, q + 1)$, i.e. it has one extra covariant rank. The expression in the case of a general tensor is (compare and contrast with (6.17))
\[ \nabla_c T^{a\cdots}{}_{b\cdots} = \partial_c T^{a\cdots}{}_{b\cdots} + \Gamma^a_{dc}\, T^{d\cdots}{}_{b\cdots} + \cdots - \Gamma^d_{bc}\, T^{a\cdots}{}_{d\cdots} - \cdots. \tag{6.27} \]
It follows directly from the transformation laws that the sum of two connections is not a connection or a tensor. However, the difference of two connections is a tensor of valence (1, 2), because the inhomogeneous term cancels out in the transformation. For the same reason, the anti-symmetric part of a $\Gamma^a_{bc}$, namely,
\[ T^a{}_{bc} = \Gamma^a_{bc} - \Gamma^a_{cb}, \]
is a tensor called the torsion tensor. If the torsion tensor vanishes, then the connection is symmetric, i.e.
\[ \Gamma^a_{bc} = \Gamma^a_{cb}. \tag{6.28} \]
From now on, unless we state otherwise, we shall restrict ourselves to symmetric connections, in which case the torsion vanishes. The assumption that the connection is symmetric leads to the following useful result. In the expression for a Lie derivative of a tensor, all occurrences of the partial derivatives may be replaced by covariant derivatives. For example, in the case of a vector (exercise)
\[ L_X Y^a = X^b \partial_b Y^a - Y^b \partial_b X^a = X^b \nabla_b Y^a - Y^b \nabla_b X^a. \tag{6.29} \]

6.4 Affine geodesics

If $T^{a\cdots}{}_{b\cdots}$ is any tensor, then we introduce the notation
\[ \nabla_X T^{a\cdots}{}_{b\cdots} = X^c \nabla_c T^{a\cdots}{}_{b\cdots}, \tag{6.30} \]
that is, $\nabla_X$ of a tensor is its covariant derivative contracted with $X$.
Now in §6.2 we saw that a contravariant vector field $X$ determines a local congruence of curves, $x^a = x^a(u)$, where the tangent vector field to the congruence is
\[ \frac{dx^a}{du} = X^a. \]
We next define the absolute derivative of a tensor $T^{a\cdots}{}_{b\cdots}$ along a curve $C$ of the congruence, written $D T^{a\cdots}{}_{b\cdots}/Du$, by
\[ \frac{D}{Du} \left( T^{a\cdots}{}_{b\cdots} \right) = \nabla_X T^{a\cdots}{}_{b\cdots}. \tag{6.31} \]
The tensor $T^{a\cdots}{}_{b\cdots}$ is said to be parallely propagated or transported along the curve $C$ if
\[ \frac{D}{Du} \left( T^{a\cdots}{}_{b\cdots} \right) = 0. \tag{6.32} \]
This is a first-order ordinary differential equation for $T^{a\cdots}{}_{b\cdots}$, and so given an initial value for $T^{a\cdots}{}_{b\cdots}$, say $T^{a\cdots}{}_{b\cdots}(P)$, equation (6.32) determines a tensor along $C$ which is everywhere parallel to $T^{a\cdots}{}_{b\cdots}(P)$.

Using this notation, an affine geodesic is defined as a privileged curve along which the tangent vector is propagated parallel to itself. In other words, the parallely propagated vector at any point of the curve is parallel, that is, proportional, to the tangent vector at that point:
\[ \frac{D}{Du} \left( \frac{dx^a}{du} \right) = \lambda(u)\, \frac{dx^a}{du}. \]
Using (6.31), the equation for an affine geodesic can be written in the form
\[ \nabla_X X^a = \lambda X^a \tag{6.33} \]
or equivalently (exercise)
\[ \frac{d^2 x^a}{du^2} + \Gamma^a_{bc}\, \frac{dx^b}{du} \frac{dx^c}{du} = \lambda\, \frac{dx^a}{du}. \tag{6.34} \]
The last result is very important and so we shall establish it afresh from first principles using the notation of the last section. Let the neighbouring points $P$ and $Q$ on $C$ be given by $x^a(u)$ and
\[ x^a(u + \delta u) = x^a(u) + \frac{dx^a}{du}\, \delta u \]
to first order in $\delta u$, respectively. Then in the notation of the last section
\[ \delta x^a = \frac{dx^a}{du}\, \delta u. \tag{6.35} \]
The vector $X^a(x)$ at $P$ is now the tangent vector $(dx^a/du)(u)$. The vector at $Q$ parallel to $dx^a/du$ is, by (6.21) and (6.35),
\[ \frac{dx^a}{du} - \Gamma^a_{bc}\, \frac{dx^b}{du} \frac{dx^c}{du}\, \delta u. \]
The vector already at $Q$ is
\[ \frac{dx^a}{du}(u + \delta u) = \frac{dx^a}{du} + \frac{d^2 x^a}{du^2}\, \delta u \]
to first order in $\delta u$. These last two vectors must be parallel, so we require
\[ \frac{dx^a}{du} + \frac{d^2 x^a}{du^2}\, \delta u = [1 + \lambda(u)\, \delta u] \left( \frac{dx^a}{du} - \Gamma^a_{bc}\, \frac{dx^b}{du} \frac{dx^c}{du}\, \delta u \right), \]
where we have written the proportionality factor as $1 + \lambda(u)\, \delta u$ without loss of generality, since the equation must hold in the limit $\delta u \to 0$.
Subtracting $dx^a/du$ from each side, dividing by $\delta u$ and taking the limit as $\delta u$ tends to zero produces the result (6.34). Note that $\Gamma^a_{bc}$ appears in the equation multiplied by the symmetric quantity $(dx^b/du)(dx$ ...

... $0$ or $X^2 < 0$, respectively. Otherwise, the metric is called indefinite. The angle between two vectors $X^a$ and $Y^a$ with $X^2 \neq 0$ and $Y^2 \neq 0$ is given by
\[ \cos(X, Y) = \frac{g_{ab} X^a Y^b}{(|g_{cd} X^c X^d|)^{1/2}\, (|g_{ef} Y^e Y^f|)^{1/2}}. \tag{6.52} \]
In particular, the vectors $X^a$ and $Y^a$ are said to be orthogonal if
\[ g_{ab} X^a Y^b = 0. \tag{6.53} \]
If the metric is indefinite (as in relativity theory), then there exist vectors which are orthogonal to themselves called null vectors, i.e.
\[ g_{ab} X^a X^b = 0. \tag{6.54} \]
The determinant of the metric is denoted by
\[ g = \det(g_{ab}). \tag{6.55} \]
The metric is non-singular if $g \neq 0$, in which case the inverse of $g_{ab}$, $g^{ab}$, is given by
\[ g_{ab}\, g^{bc} = \delta^c_a. \tag{6.56} \]
It follows from this definition that $g^{ab}$ is a contravariant tensor of rank 2 and it is called the contravariant metric. We may now use $g_{ab}$ and $g^{ab}$ to lower and raise tensorial indices by defining
\[ T^{\cdots}{}_{\cdots a \cdots} = g_{ab}\, T^{\cdots b \cdots}{}_{\cdots} \tag{6.57} \]
and
\[ T^{\cdots a \cdots}{}_{\cdots} = g^{ab}\, T^{\cdots}{}_{\cdots b \cdots}, \tag{6.58} \]
where we use the same kernel letter for the tensor. Since from now on we shall be working with a manifold endowed with a metric, we shall regard such associated contravariant and covariant tensors as representations of the same geometric object. Thus, in particular, $g_{ab}$, $\delta^a_b$, and $g^{ab}$ may all be thought of as different representations of the same geometric object, the metric $g$. Since we can raise and lower indices freely with the metric, we must be careful about the order in which we write contravariant and covariant indices. For example, in general, $X_a{}^b$ will be different from $X^b{}_a$.

6.9 Metric geodesics

Consider the timelike curve $C$ with parametric equation $x^a = x^a(u)$. Dividing equation (6.50) by the square of $du$ we get
\[ \left( \frac{ds}{du} \right)^2 = g_{ab}\, \frac{dx^a}{du} \frac{dx^b}{du}. \tag{6.59} \]
Then the interval $s$ between two points $P_1$ and $P_2$ on $C$ is given by
\[ s = \int_{P_1}^{P_2} ds = \int_{P_1}^{P_2} \frac{ds}{du}\, du = \int_{P_1}^{P_2} \left( g_{ab}\, \frac{dx^a}{du} \frac{dx^b}{du} \right)^{1/2} du. \tag{6.60} \]
We define a timelike metric geodesic between any two points $P_1$ and $P_2$ as the privileged curve joining them whose interval is stationary under small variations that vanish at the end points. Hence, the interval may be a maximum, a minimum, or a saddle point. Deriving the geodesic equations involves the calculus of variations and we postpone this to the next chapter. In that chapter, we shall see that the Euler-Lagrange equations result in the second-order differential equations
\[ g_{ab}\, \frac{d^2 x^b}{du^2} + \{bc, a\}\, \frac{dx^b}{du} \frac{dx^c}{du} = \left( \frac{d^2 s}{du^2} \Big/ \frac{ds}{du} \right) g_{ab}\, \frac{dx^b}{du}, \tag{6.61} \]
where the quantities in curly brackets are called the Christoffel symbols of the first kind and are defined in terms of derivatives of the metric by
\[ \{ab, c\} = \tfrac{1}{2} (\partial_b g_{ac} + \partial_a g_{bc} - \partial_c g_{ab}). \tag{6.62} \]
Multiplying through by $g^{ad}$ and using (6.56), we get the equations
\[ \frac{d^2 x^a}{du^2} + \left\{ {}^{\,a}_{bc} \right\} \frac{dx^b}{du} \frac{dx^c}{du} = \left( \frac{d^2 s}{du^2} \Big/ \frac{ds}{du} \right) \frac{dx^a}{du}, \tag{6.63} \]
where $\left\{ {}^{\,a}_{bc} \right\}$ are the Christoffel symbols of the second kind defined by
\[ \left\{ {}^{\,a}_{bc} \right\} = g^{ad}\, \{bc, d\}. \tag{6.64} \]
In addition, the norm of the tangent vector $dx^a/du$ is given by (6.59). If, in particular, we choose a parameter $u$ which is linearly related to the interval $s$, that is,
\[ u = \alpha s + \beta, \tag{6.65} \]
where $\alpha$ and $\beta$ are constants, then the right-hand side of (6.63) vanishes. In the special case when $u = s$, the equations for a metric geodesic become
\[ \frac{d^2 x^a}{ds^2} + \left\{ {}^{\,a}_{bc} \right\} \frac{dx^b}{ds} \frac{dx^c}{ds} = 0 \tag{6.66} \]
and
\[ g_{ab}\, \frac{dx^a}{ds} \frac{dx^b}{ds} = 1, \tag{6.67} \]
where we assume $ds \neq 0$.

Apart from trivial sign changes, similar results apply for spacelike geodesics, except that we replace $s$ by $\sigma$, say, where
\[ d\sigma^2 = -g_{ab}\, dx^a\, dx^b. \]
However, in the case of an indefinite metric, there exist geodesics for which the distance between any two points is zero, called null geodesics. It can also be shown that these curves can be parametrized by a special parameter $u$, called an affine parameter, such that their equation does not possess a right-hand side, that is,
\[ \frac{d^2 x^a}{du^2} + \left\{ {}^{\,a}_{bc} \right\} \frac{dx^b}{du} \frac{dx^c}{du} = 0, \tag{6.68} \]
where
\[ g_{ab}\, \frac{dx^a}{du} \frac{dx^b}{du} = 0. \tag{6.69} \]
The last equation follows since the distance between any two points is zero, or equivalently the tangent vector is null.
Again, any other affine parameter is related to $u$ by the transformation
\[ u \to \alpha u + \beta, \]
where $\alpha$ and $\beta$ are constants.

6.10 The metric connection

In general, if we have a manifold endowed with both an affine connection and metric, then it possesses two classes of curves, affine geodesics and metric geodesics, which will be different (Fig. 6.11). However, comparing (6.37) with (6.66), the two classes will coincide if we take
\[ \Gamma^a_{bc} = \left\{ {}^{\,a}_{bc} \right\} \tag{6.70} \]
or, using (6.64) and (6.62), if
\[ \Gamma^a_{bc} = \tfrac{1}{2}\, g^{ad} (\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}). \tag{6.71} \]

Fig. 6.11 Affine and metric geodesics on a manifold.

It follows from the last equation that the connection is necessarily symmetric, i.e.
\[ \Gamma^a_{bc} = \Gamma^a_{cb}. \tag{6.72} \]
In fact, if one checks the transformation properties of $\left\{ {}^{\,a}_{bc} \right\}$ from first principles, it does indeed transform like a connection (exercise). This special connection built out of the metric and its derivatives is called the metric connection. From now on, we shall always work with the metric connection and we shall denote it by $\Gamma^a_{bc}$ rather than $\left\{ {}^{\,a}_{bc} \right\}$, where $\Gamma^a_{bc}$ is defined by (6.71). This definition leads immediately to the identity (exercise)
\[ \nabla_c g_{ab} = 0. \tag{6.73} \]
Conversely, if we require that (6.73) holds for an arbitrary symmetric connection, then it can be deduced (exercise) that the connection is necessarily the metric connection. Thus, we have the following important result: for a symmetric connection, the covariant derivative of the metric vanishes if and only if the connection is the metric connection.

In addition, we can show that
\[ \nabla_c\, \delta^a_b = 0 \tag{6.74} \]
and
\[ \nabla_c\, g^{ab} = 0. \tag{6.75} \]

6.11 Metric flatness

Now at any point $P$ of a manifold, $g_{ab}$ is a symmetric matrix of real numbers. Therefore, by standard matrix theory, there exists a transformation which reduces the matrix to diagonal form with every diagonal term either $+1$ or $-1$. The excess of plus signs over minus signs in this form is called the signature of the metric. Assuming that the metric is continuous over the manifold and non-singular, then it follows that the signature is an invariant. In general, it will not be possible to find a coordinate system in which the metric reduces to this diagonal form everywhere.
If, however, there does exist a coordinate system in which the metric reduces to diagonal form with $\pm 1$ diagonal elements everywhere, then the metric is called flat. How does metric flatness relate to affine flatness in the case we are interested in, that is, when the connection is the metric connection? The answer is contained in the following result: a necessary and sufficient condition for a metric to be flat is that its Riemann tensor vanishes.

Necessity follows from the fact that there exists a coordinate system in which the metric is diagonal with $\pm 1$ diagonal elements. Since the metric is constant everywhere, its partial derivatives vanish and therefore the metric connection $\Gamma^a_{bc}$ vanishes as a consequence of the definition (6.71). Since $\Gamma^a_{bc}$ vanishes everywhere then so must its derivatives. (One way to see this is to recall the definition of partial differentiation which involves subtracting quantities at neighbouring points. If the quantities are always zero, then their difference vanishes, and so does the resulting limit.) The Riemann tensor therefore vanishes by the definition (6.39).

Conversely, if the Riemann tensor vanishes, then by the theorem of §6.7, there exists a special coordinate system in which the connection vanishes everywhere. Since this is the metric connection, by (6.73),
\[ \nabla_c g_{ab} = \partial_c g_{ab} - \Gamma^d_{ac}\, g_{db} - \Gamma^d_{bc}\, g_{ad} = 0, \]
from which we get
\[ \partial_c g_{ab} = \Gamma^d_{ac}\, g_{db} + \Gamma^d_{bc}\, g_{ad}, \tag{6.76} \]
and it follows that $\partial_c g_{ab} = 0$. The metric is therefore constant everywhere and hence can be transformed into diagonal form with diagonal elements $\pm 1$. Note the result (6.76), which expresses the ordinary derivative of the metric in terms of the connection. This equation will prove useful later. Combining this theorem with the theorem of §6.7, we see that if we use the metric connection then metric flatness coincides with affine flatness.

6.12 The curvature tensor

The curvature tensor or Riemann-Christoffel tensor (Riemann tensor for short) is defined by (6.39), namely,
\[ R^a{}_{bcd} = \partial_c \Gamma^a_{bd} - \partial_d \Gamma^a_{bc} + \Gamma^e_{bd}\, \Gamma^a_{ec} - \Gamma^e_{bc}\, \Gamma^a_{ed}, \]
where $\Gamma^a_{bc}$ is the metric connection, which by (6.71) is given as
\[ \Gamma^a_{bc} = \tfrac{1}{2}\, g^{ad} (\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}). \]
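The formula for the metric connection can be checked numerically. The sketch below is an illustration only, under stated assumptions: it takes the unit 2-sphere with coordinates $(\theta, \phi)$ and metric $\operatorname{diag}(1, \sin^2\theta)$ as the example (this choice and all function names are not from the text), estimates $\partial_c g_{ab}$ by central differences, and assembles $\Gamma^a_{bc} = \tfrac{1}{2} g^{ad}(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc})$.

```python
import math

def sphere_metric(x):
    """Assumed example metric g_ab on the unit 2-sphere, coordinates (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2 x 2 matrix, giving the contravariant metric g^ab."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, c, h=1e-5):
    """Central-difference estimate of d_c g_ab for all a, b at the point x."""
    forward = list(x)
    backward = list(x)
    forward[c] += h
    backward[c] -= h
    g_plus, g_minus = metric(forward), metric(backward)
    n = len(x)
    return [[(g_plus[a][b] - g_minus[a][b]) / (2 * h) for b in range(n)]
            for a in range(n)]

def metric_connection(metric, x):
    """Gamma[a][b][c] = (1/2) g^{ad} (d_b g_dc + d_c g_db - d_d g_bc) at x."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, c) for c in range(n)]  # dg[c][a][b] = d_c g_ab
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                gamma[a][b][c] = 0.5 * sum(
                    g_inv[a][d] * (dg[b][d][c] + dg[c][d][b] - dg[d][b][c])
                    for d in range(n)
                )
    return gamma

theta0 = 1.0
gamma = metric_connection(sphere_metric, [theta0, 0.3])
print("Gamma^theta_{phi phi} =", gamma[0][1][1])
print("Gamma^phi_{theta phi} =", gamma[1][0][1])
```

For this assumed metric the non-zero components should reproduce the familiar values $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$, up to finite-difference error, and the result is symmetric in its lower indices.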
Thus, $R^a{}_{bcd}$ depends on the metric and its first and second derivatives. It follows immediately from the definition that it is anti-symmetric on its last pair of indices:
$$R^a{}_{bcd} = -R^a{}_{bdc}. \tag{6.77}$$
The fact that the connection is symmetric leads to the identity
$$R^a{}_{bcd} + R^a{}_{dbc} + R^a{}_{cdb} = 0. \tag{6.78}$$
Lowering the first index with the metric, it is easy to establish, for example by using geodesic coordinates, that the lowered tensor is symmetric under interchange of the first and last pair of indices, that is,
$$R_{abcd} = R_{cdab}. \tag{6.79}$$
Combining this with equation (6.77), we see that the lowered tensor is anti-symmetric on its first pair of indices as well:
$$R_{abcd} = -R_{bacd}. \tag{6.80}$$
Collecting these symmetries together, we see that the lowered curvature tensor satisfies
$$R_{abcd} = -R_{abdc} = -R_{bacd} = R_{cdab}, \qquad R_{abcd} + R_{adbc} + R_{acdb} = 0. \tag{6.81}$$
These symmetries considerably reduce the number of independent components; in fact, in $n$ dimensions, the number is reduced from $n^4$ to $\tfrac{1}{12} n^2(n^2 - 1)$. In addition to the algebraic identities, it can be shown, again most easily by using geodesic coordinates, that the curvature tensor satisfies a set of differential identities called the Bianchi identities:
$$\nabla_a R_{debc} + \nabla_c R_{deab} + \nabla_b R_{deca} = 0. \tag{6.82}$$
We can use the curvature tensor to define several other important tensors. The Ricci tensor is defined by the contraction
$$R_{ab} = R^c{}_{acb} = g^{cd} R_{dacb}, \tag{6.83}$$
which by (6.79) is symmetric. A final contraction defines the curvature scalar or Ricci scalar $R$ by
$$R = g^{ab} R_{ab}. \tag{6.84}$$
These two tensors can be used to define the Einstein tensor
$$G_{ab} = R_{ab} - \tfrac{1}{2} g_{ab} R, \tag{6.85}$$
which is also symmetric, and, by (6.82), the Einstein tensor can be shown to satisfy the contracted Bianchi identities
$$\nabla_b G_a{}^b = 0. \tag{6.86}$$
Note that some authors adopt a different sign convention, which leads to the Riemann tensor or the Ricci tensor having the opposite sign to ours.

6.13 The Weyl tensor

We shall mostly be concerned with tensors in four dimensions or less.
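The component count $\tfrac{1}{12} n^2(n^2 - 1)$ quoted in §6.12 is easy to tabulate for the low dimensions of interest. The sketch below is only illustrative: the function names are hypothetical, and the Ricci count $\tfrac{1}{2}n(n+1)$ is simply the number of independent components of a symmetric second-rank tensor. For $n \geq 3$ the difference between the two counts is what is left over once the Ricci tensor has been specified.

```python
def riemann_components(n):
    """Independent components of R_abcd in n dimensions: n^2 (n^2 - 1) / 12."""
    return n * n * (n * n - 1) // 12


def ricci_components(n):
    """Independent components of a symmetric tensor R_ab: n (n + 1) / 2."""
    return n * (n + 1) // 2


def remaining_components(n):
    """Components of R_abcd not fixed by R_ab (meaningful for n >= 3 only)."""
    if n < 3:
        raise ValueError("the split into Ricci part and remainder needs n >= 3")
    return riemann_components(n) - ricci_components(n)


for n in range(1, 5):
    print(f"n = {n}: n^4 = {n**4:3d}, independent = {riemann_components(n)}")
```

In four dimensions the remainder is ten components, and in three dimensions it is zero, so there the curvature tensor is fixed entirely by the Ricci tensor.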
The algebraic identities (6.81) lead to the following special cases for the curvature tensor:
(1) if $n = 1$, $R_{abcd} = 0$;
(2) if $n = 2$, $R_{abcd}$ has one independent component, essentially $R$;
(3) if $n = 3$, $R_{abcd}$ has six independent components, essentially $R_{ab}$;
(4) if $n = 4$, $R_{abcd}$ has twenty independent components, ten of which are given by $R_{ab}$ and the remaining ten by the Weyl tensor.

The Weyl tensor or conformal tensor $C_{abcd}$ is defined in $n$ dimensions ($n \geq 3$) by
$$C_{abcd} = R_{abcd} + \frac{1}{n-2}\left(g_{ad}R_{cb} + g_{bc}R_{da} - g_{ac}R_{db} - g_{bd}R_{ca}\right) + \frac{1}{(n-1)(n-2)}\left(g_{ac}g_{db} - g_{ad}g_{cb}\right)R. \tag{6.87}$$
Thus, in four dimensions, this becomes
$$C_{abcd} = R_{abcd} + \tfrac{1}{2}\left(g_{ad}R_{cb} + g_{bc}R_{da} - g_{ac}R_{db} - g_{bd}R_{ca}\right) + \tfrac{1}{6}\left(g_{ac}g_{db} - g_{ad}g_{cb}\right)R. \tag{6.88}$$
It is straightforward to show that the Weyl tensor possesses the same symmetries as the Riemann tensor, namely,
$$C_{abcd} = -C_{abdc} = -C_{bacd} = C_{cdab}, \qquad C_{abcd} + C_{adbc} + C_{acdb} = 0.$$
In addition, contracting the definition (6.87) on the first and third indices gives
$$g^{ac} C_{abcd} = 0. \tag{6.89}$$
Combining this result with the previous symmetries, it then follows that the Weyl tensor is trace-free; in other words, it vanishes for any pair of contracted indices. One can think of the Weyl tensor as that part of the curvature tensor for which all contractions vanish.

Two metrics $g_{ab}$ and $\bar{g}_{ab}$ are said to be conformally related or conformal to each other if
$$\bar{g}_{ab} = \Omega^2 g_{ab}, \tag{6.90}$$
where $\Omega(x)$ is a non-zero differentiable function. Given a manifold with two metrics defined on it which are conformal, it is straightforward from (6.51) and (6.52) to show that angles between vectors and ratios of magnitudes of vectors, but not lengths, are the same for each metric. Moreover, the null geodesics of one metric coincide with the null geodesics of the other (exercise). The metrics also possess the same Weyl tensor, i.e.
$$\bar{C}^a{}_{bcd} = C^a{}_{bcd}. \tag{6.91}$$
Any quantity which satisfies a relationship like (6.91) is called conformally invariant ($g_{ab}$, $\Gamma^a_{bc}$, and $R^a{}_{bcd}$ are examples of quantities which are not conformally invariant). A metric is said to be conformally flat if it can be reduced to the form
$$g_{ab} = \Omega^2 \eta_{ab}, \tag{6.92}$$
where $\eta_{ab}$ is a flat metric. We end this section by quoting two results concerning conformally flat metrics.

Theorem: A necessary and sufficient condition for a metric (in four or more dimensions) to be conformally flat is that its Weyl tensor vanishes everywhere.

Theorem: Any two-dimensional Riemannian manifold is conformally flat.
Exercises

6.1 (§6.2) Prove (6.13) by showing that $L_X \delta^a_b = 0$ in two ways: (i) using (6.17); (ii) from first principles (remembering Exercise 5.8).

6.2 (§6.2) Use (6.17) to find expressions for $L_X Z_{bc}$ and $L_X(Y^a Z_{bc})$. Use these expressions and (6.15) to check the Leibniz property in the form (6.12).

6.3 (§6.3) Establish (6.23) by assuming that the quantity defined by (6.22) has the tensor character indicated. Take the partial derivative of
$$\delta^a_c = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^d}{\partial x'^c}$$
with respect to $x'^b$ to establish the alternative form (6.24).

6.4 (§6.3) Show that covariant differentiation commutes with contraction by checking that $\nabla_c \delta^a_b = 0$.

6.5 (§6.3) Assuming (6.22) and (6.25), apply the Leibniz rule to the covariant derivative of $X_a X^a$, where $X^a$ is arbitrary, to verify (6.26).

6.6 (§6.3) Check (6.29).

6.7 (§6.4) If $X$, $Y$, and $Z$ are vector fields, $f$ and $g$ smooth functions, and $\lambda$ and $\mu$ constants, then show that
(i) $\nabla_X(\lambda Y + \mu Z) = \lambda \nabla_X Y + \mu \nabla_X Z$,
(ii) $\nabla_{fX + gY} Z = f \nabla_X Z + g \nabla_Y Z$,
(iii) $\nabla_X(fY) = (Xf)Y + f \nabla_X Y$.

6.8 (§6.4) Show that (6.33) leads to (6.34).

6.9 (§6.4) If $s$ is an affine parameter, then show that, under the transformation $s \to \bar{s} = \bar{s}(s)$, the parameter $\bar{s}$ will be affine only if $\bar{s} = \alpha s + \beta$, where $\alpha$ and $\beta$ are constants.

6.10 (§6.5) Show that

6.11 (§6.5) Show that
$$\nabla_X(\nabla_Y Z^a) - \nabla_Y(\nabla_X Z^a) - \nabla_{[X,Y]} Z^a = R^a{}_{bcd} Z^b X^c Y^d.$$

6.12 (§6.7) Prove that if a manifold is affine flat then the connection is necessarily integrable and symmetric.

6.13 (§6.8) Show that if $g_{ab}$ is diagonal, i.e. $g_{ab} = 0$ if $a \neq b$, then $g^{ab}$ is diagonal with corresponding reciprocal diagonal elements.

6.14 (§6.8) The line elements of $\mathbb{R}^3$ in Cartesian, cylindrical polar, and spherical polar coordinates are given respectively by
(i) $ds^2 = dx^2 + dy^2 + dz^2$,
(ii) $ds^2 = dR^2 + R^2\, d\phi^2 + dz^2$,
(iii) $ds^2 = dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2$.
Find $g_{ab}$, $g^{ab}$, and $g$ in each case.

6.15 (§6.8) Express $T_{ab}$ in terms of $T^{cd}$.

6.16 (§6.9) Write down the tensor transformation law of $g_{ab}$. Show directly that
$$\left\{ {a \atop bc} \right\} = \tfrac{1}{2} g^{ad}\left(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}\right)$$
transforms like a connection.
6.17 (§6.9) Find the geodesic equation for $\mathbb{R}^3$ in cylindrical polars. [Hint: use the results of Exercise 6.14(ii) to compute the metric connection and substitute in (6.68).]

6.18 (§6.9) Consider a 3-space with coordinates $(x^a) = (x, y, z)$ and line element
$$ds^2 = dx^2 + dy^2 - dz^2.$$
Prove that the null geodesics are given by
$$x = lu + l', \qquad y = mu + m', \qquad z = nu + n',$$
where $u$ is a parameter and $l$, $l'$, $m$, $m'$, $n$, $n'$ are arbitrary constants satisfying $l^2 + m^2 - n^2 = 0$.

6.19 (§6.10) Prove that $\nabla_c g_{ab} = 0$. Deduce that $\nabla_b X_a = g_{ac} \nabla_b X^c$.

6.20 (§6.10) Suppose we have an arbitrary symmetric connection $\Gamma^a_{bc}$ satisfying $\nabla_c g_{ab} = 0$. Deduce that $\Gamma^a_{bc}$ must be the metric connection. [Hint: use the equation to find expressions for $\partial_b g_{dc}$, $\partial_c g_{db}$ and $-\partial_d g_{bc}$, as in (6.76), add the equations together, and multiply by $\tfrac{1}{2} g^{ad}$.]

6.21 (§6.11) The Minkowski line element in Minkowski coordinates
$$(x^a) = (x^0, x^1, x^2, x^3) = (t, x, y, z)$$
is given by
$$ds^2 = dt^2 - dx^2 - dy^2 - dz^2.$$
(i) What is the signature?
(ii) Is the metric non-singular?
(iii) Is the metric flat?

6.22 (§6.11) The line element of $\mathbb{R}^3$ in a particular coordinate system is
$$ds^2 = (dx^1)^2 + (x^1)^2 (dx^2)^2 + (x^1 \sin x^2)^2 (dx^3)^2.$$
(i) Identify the coordinates.
(ii) Is the metric flat?

6.23 (§6.12) Establish the identities (6.78) and (6.79). [Hint: choose an arbitrary point $P$ and introduce geodesic coordinates at $P$.] Show that (6.78) is equivalent to $R^a{}_{[bcd]} = 0$.

6.24 (§6.12) Establish the identity (6.82). [Hint: use geodesic coordinates.] Show that (6.82) is equivalent to $R_{de[ab;c]} = 0$. Deduce (6.86).

6.25 (§6.12) Show that $G_{ab} = 0$ if and only if $R_{ab} = 0$.

6.26 (§6.13) Establish the identity (6.89). Deduce that the Weyl tensor is trace-free on all pairs of indices.

6.27 (§6.13) Show that angles between vectors and ratios of lengths of vectors, but not lengths, are the same for conformally related metrics.

6.28 (§6.13) Prove that the null geodesics of two conformally related metrics coincide.
[Hint: the two classes of geodesics need not both be affinely parametrized.]

6.29 (§6.13) Establish (6.91).

6.30 (§6.13) Establish the theorem that any two-dimensional Riemann manifold is conformally flat in the case of a metric of signature 0, i.e. at any point the metric can be reduced to the diagonal form $(+1, -1)$ say. [Hint: use null curves as coordinate curves, that is, change to new coordinates $\lambda = \lambda(x^0, x^1)$, $\nu = \nu(x^0, x^1)$ satisfying
$$g^{ab} \lambda_{,a} \lambda_{,b} = g^{ab} \nu_{,a} \nu_{,b} = 0,$$
show that the line element reduces to the form $ds^2 = e^{2\mu}\, d\lambda\, d\nu$, and finally introduce new coordinates $\tfrac{1}{2}(\lambda + \nu)$ and $\tfrac{1}{2}(\lambda - \nu)$.]

6.31 This final exercise consists of a long calculation which will be needed later in the book. If we take coordinates
$$(x^a) = (x^0, x^1, x^2, x^3) = (t, r, \theta, \phi),$$
then the four-dimensional spherically symmetric line element is
$$ds^2 = e^{\nu} dt^2 - e^{\lambda} dr^2 - r^2\, d\theta^2 - r^2 \sin^2\theta\, d\phi^2,$$
where $\nu = \nu(t, r)$ and $\lambda = \lambda(t, r)$ are arbitrary functions of $t$ and $r$.
(i) Find $g_{ab}$, $g$, and $g^{ab}$ (see Exercise 6.13).
(ii) Use the expressions in (i) to calculate $\Gamma^a_{bc}$. [Hint: remember $\Gamma^a_{bc} = \Gamma^a_{cb}$.]
(iii) Calculate $R_{abcd}$. [Hint: use the symmetry relations (6.81).]
(iv) Calculate $R_{ab}$, $R$, and $G_{ab}$.
(v) Calculate $G^a{}_b$ ($= g^{ac} G_{cb} = G_b{}^a$).

7.1 Tensor densities

A tensor density of weight $W$, denoted conventionally by a gothic letter, $\mathfrak{T}^{a\cdots}_{b\cdots}$, transforms like an ordinary tensor, except that in addition the $W$th power of the Jacobian
$$J = \left| \frac{\partial x^a}{\partial x'^b} \right|$$
appears as a factor, i.e.
$$\mathfrak{T}'^{a\cdots}_{b\cdots} = J^W \frac{\partial x'^a}{\partial x^c} \cdots \frac{\partial x^d}{\partial x'^b} \cdots \mathfrak{T}^{c\cdots}_{d\cdots}. \tag{7.1}$$
Then, with certain modifications, we can combine tensor densities in much the same way as we do tensors. One exception, which follows from (7.1), is that the product of two tensor densities of weight $W_1$ and $W_2$ is a tensor density of weight $W_1 + W_2$. There is some arbitrariness in defining the covariant derivative of a tensor density, but we shall adhere to the definition that if $\mathfrak{T}^{a\cdots}_{b\cdots}$ is a tensor density of weight $W$ then
$$\nabla_c \mathfrak{T}^{a\cdots}_{b\cdots} = (\text{the usual expression if } \mathfrak{T}^{a\cdots}_{b\cdots} \text{ were a tensor}) - W \Gamma^d_{dc}\, \mathfrak{T}^{a\cdots}_{b\cdots}. \tag{7.2}$$
For example, the covariant derivative of a vector density of weight $W$ is
$$\nabla_c \mathfrak{T}^a = \partial_c \mathfrak{T}^a + \Gamma^a_{bc} \mathfrak{T}^b - W \Gamma^b_{bc} \mathfrak{T}^a.$$
In the special case when $W = +1$ and $c = a$, we get the important result (check)
$$\nabla_a \mathfrak{T}^a = \partial_a \mathfrak{T}^a, \tag{7.3}$$
that is, the covariant divergence of a vector density of weight $+1$ is identical to its ordinary divergence. It can be shown that both these quantities are scalar densities of weight $+1$ (exercise).

7.2 The Levi-Civita alternating symbol

We introduce a quantity which is a generalization of the Kronecker delta $\delta^a_b$, but which turns out to be a tensor density. The Levi-Civita alternating symbol $\varepsilon^{abcd}$ is a completely anti-symmetric tensor density of weight $+1$ and contravariant rank 4, whose value in any coordinate system is $+1$ or $-1$ if $abcd$ is an even or odd permutation of $0123$, respectively, and zero otherwise. Thus, for example, in four dimensions, if we let the coordinates range from 0 to 3 (as we shall), i.e.
$$(x^a) = (x^0, x^1, x^2, x^3),$$
then some of its values are
$$\varepsilon^{0123} = \varepsilon^{2301} = -\varepsilon^{0132} = -\varepsilon^{0321} = 1,$$
and any component with a repeated index vanishes, for example $\varepsilon^{0012} = 0$. Similarly, we can define the covariant version $\varepsilon_{abcd}$, which has weight $-1$. It can be used, in particular, to form the determinant of a second-rank density. Assuming this is non-zero, we can then also use it to construct the inverse of a second-rank tensor. The covariant derivatives of both $\varepsilon^{abcd}$ and $\varepsilon_{abcd}$ vanish identically, which from one point of view motivates the definition (7.2).

We define the generalized Kronecker delta by
$$\delta^{ab}_{cd} = \begin{cases} +1 & \text{for } a \neq b,\ a = c,\ b = d, \\ -1 & \text{for } a \neq b,\ a = d,\ b = c, \\ \phantom{+}0 & \text{otherwise}, \end{cases}$$
and similarly for higher-order tensors. They are constant tensors of the type indicated, and can be defined in terms of the Kronecker delta by the determinant relationships
$$\delta^{ab}_{cd} = \begin{vmatrix} \delta^a_c & \delta^a_d \\ \delta^b_c & \delta^b_d \end{vmatrix}$$
and
$$\delta^{abc}_{def} = \begin{vmatrix} \delta^a_d & \delta^a_e & \delta^a_f \\ \delta^b_d & \delta^b_e & \delta^b_f \\ \delta^c_d & \delta^c_e & \delta^c_f \end{vmatrix},$$
and so forth. In four dimensions they are related to products of the alternating symbols according to
$$\varepsilon^{abcd}\varepsilon_{efgh} = \delta^{abcd}_{efgh}, \quad \varepsilon^{abcd}\varepsilon_{efgd} = \delta^{abc}_{efg}, \quad \varepsilon^{abcd}\varepsilon_{efcd} = 2\delta^{ab}_{ef}, \quad \varepsilon^{abcd}\varepsilon_{ebcd} = 3!\,\delta^a_e, \quad \varepsilon^{abcd}\varepsilon_{abcd} = 4!.$$
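The contraction identities relating the alternating symbols to the generalized Kronecker deltas can be confirmed by brute-force summation over the indices. The sketch below is only a numerical check in four dimensions, not a proof; the names `parity`, `eps`, `delta`, and `delta2` are illustrative, and the same numerical values are used for the contravariant and covariant symbols, as in their definitions.

```python
from itertools import permutations, product


def parity(p):
    """+1 for an even permutation, -1 for an odd one (count inversions)."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


# Non-zero components: one entry for each permutation of 0123.
EPS = {p: parity(p) for p in permutations(range(4))}


def eps(a, b, c, d):
    """Alternating symbol: +1, -1, or 0 if any index is repeated."""
    return EPS.get((a, b, c, d), 0)


def delta(a, b):
    return 1 if a == b else 0


def delta2(a, b, e, f):
    """Generalized Kronecker delta as a 2x2 determinant of ordinary deltas."""
    return delta(a, e) * delta(b, f) - delta(a, f) * delta(b, e)


R = range(4)

# Full contraction: should equal 4! = 24.
full = sum(eps(*idx) ** 2 for idx in product(R, repeat=4))

# Three contracted indices: should equal 3! * delta^a_e.
one_free = all(
    sum(eps(a, b, c, d) * eps(e, b, c, d) for b, c, d in product(R, repeat=3))
    == 6 * delta(a, e)
    for a, e in product(R, repeat=2)
)

# Two contracted indices: should equal 2 * delta^{ab}_{ef}.
two_free = all(
    sum(eps(a, b, c, d) * eps(e, f, c, d) for c, d in product(R, repeat=2))
    == 2 * delta2(a, b, e, f)
    for a, b, e, f in product(R, repeat=4)
)

print(full, one_free, two_free)
```

The same loop structure extends to the remaining identities by writing the higher-order deltas as $3 \times 3$ and $4 \times 4$ determinants.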
7.3 The metric determinant

If we have a Riemannian manifold with metric $g_{ab}$, then it transforms according to
$$g'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b}\, g_{cd}, \tag{7.4}$$
and so, taking determinants, we have
$$g' = J^2 g.$$
Hence the metric determinant $g$ is a scalar density of weight $+2$. In the later chapters, we shall be working with metrics of negative signature, in which case $g$ will be negative, and so we write the last equation in the equivalent form
$$(-g') = J^2 (-g).$$
Since all these terms are now positive, we can take square roots, to get
$$(-g')^{1/2} = J (-g)^{1/2},$$
and hence $(-g)^{1/2}$ is a scalar density of weight $+1$. The quantity $(-g)^{1/2}$ plays an important role in integration. Given any tensor $T^{a\cdots}_{b\cdots}$, we can form the product $(-g)^{1/2}\, T^{a\cdots}_{b\cdots}$, which is then a tensor density of weight $+1$. In particular, we can deduce an important result from equation (7.3), namely, for any vector $T^a$,
$$\nabla_a\left[(-g)^{1/2}\, T^a\right] = \partial_a\left[(-g)^{1/2}\, T^a\right]. \tag{7.5}$$
Now, at any point, the covariant and contravariant metrics are symmetric matrices which are inverse to each other by $g_{ab} g^{bc} = \delta_a^c$. Let us digress for a moment and consider the general case of finding the derivative of a determinant of a matrix whose elements are functions of the coordinates. Consider any square matrix $A = (a_{ij})$. Then its inverse, $(b^{ij})$ say, is defined by
$$b^{ij} = \frac{1}{a}\left(A^{ij}\right)' = \frac{1}{a} A^{ji}, \tag{7.6}$$
where $a$ is the determinant of $A$, $A^{ij}$ is the cofactor of $a_{ij}$, and the prime denotes the transpose. Let us fix $i$, and expand the determinant $a$ by the $i$th row. Then
$$a = \sum_{j=1}^{n} a_{ij} A^{ij},$$
where we have explicitly included the summation sign for clarity. If we partially differentiate both sides with respect to $a_{ij}$, then we get
$$\frac{\partial a}{\partial a_{ij}} = A^{ij}, \tag{7.7}$$
since $a_{ij}$ does not occur in any of the cofactors $A^{ij}$ ($i$ fixed, $j$ runs from 1 to $n$). Repeating the argument for every $i$, as $i$ runs from 1 to $n$, we see that the formula (7.7) is quite general. Let us suppose that the $a_{ij}$ are all functions of the coordinates $x^k$. Then the determinant is a functional of the $a_{ij}$, which in turn are functions of the $x^k$, that is, $a = a(a_{ij}(x^k))$.
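Formula (7.7) can be checked directly on a small example. The sketch below is only an illustration, with hypothetical helper names (`minor`, `det`, `cofactor`, `d_det`) and an arbitrarily chosen $3 \times 3$ matrix. Because the determinant is linear in any single entry when the others are held fixed, a difference quotient in exact rational arithmetic reproduces the partial derivative $\partial a / \partial a_{ij}$ exactly, with every entry treated as an independent variable.

```python
from fractions import Fraction


def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by expansion along the first row (empty matrix -> 1)."""
    if not m:
        return Fraction(1)
    return sum(m[0][j] * cofactor(m, 0, j) for j in range(len(m)))


def cofactor(m, i, j):
    """Cofactor A^{ij} of the entry a_ij."""
    return (-1) ** (i + j) * det(minor(m, i, j))


def d_det(m, i, j, h=Fraction(1, 7)):
    """Difference quotient of det with respect to the single entry a_ij."""
    bumped = [row[:] for row in m]
    bumped[i][j] += h
    return (det(bumped) - det(m)) / h


# Arbitrary sample matrix, chosen only for illustration.
A = [[Fraction(v) for v in row] for row in [[2, 1, 0], [1, 3, 4], [0, 5, 6]]]

checks = [d_det(A, i, j) == cofactor(A, i, j) for i in range(3) for j in range(3)]
print(all(checks))
```

The step size `h` is arbitrary here: any non-zero value gives the same quotient, which is another way of seeing that $a_{ij}$ does not occur in its own cofactor.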
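Differentiating $a = a(a_{ij}(x^k))$ partially with respect to $x^k$, using the function of a function rule and equation (7.7), we obtain
$$\frac{\partial a}{\partial x^k} = \frac{\partial a}{\partial a_{ij}}\frac{\partial a_{ij}}{\partial x^k} = A^{ij}\frac{\partial a_{ij}}{\partial x^k} = a\, b^{ji}\, \frac{\partial a_{ij}}{\partial x^k}$$
by equation (7.6). Applying this result to the metric determinant $g$ and remembering that $g^{ab}$ is symmetric, we get the useful equation
$$\partial_c g = g\, g^{ab}\, \partial_c g_{ab}. \tag{7.8}$$
We now combine this result with (6.76) (which comes directly from the vanishing of the covariant derivative of the metric) and find
$$\partial_c g = g\, g^{ab}\left(\Gamma^d_{ac} g_{db} + \Gamma^d_{bc} g_{ad}\right) = g\, \delta^a_d \Gamma^d_{ac} + g\, \delta^b_d \Gamma^d_{bc} = 2 g\, \Gamma^a_{ac}. \tag{7.9}$$
Let us compute the covariant derivative of $g$ using (7.2). Then, since $g$ is a scalar density of weight $+2$, we have
$$\nabla_c g = \partial_c g - 2 g\, \Gamma^a_{ac},$$
and so by equation (7.9) it follows that
$$\nabla_c g = 0. \tag{7.10}$$
This is again intimately connected with the choice of the definition (7.2). Similarly, we find from equation (7.9) that
$$\partial_c (-g)^{1/2} - (-g)^{1/2}\, \Gamma^a_{ac} = 0,$$
that is, by (7.2),
$$\nabla_c (-g)^{1/2} = 0. \tag{7.11}$$
In particular, for any tensor $T^{a\cdots}_{b\cdots}$, this leads to the identity
$$\nabla_c\left[(-g)^{1/2}\, T^{a\cdots}_{b\cdots}\right] = (-g)^{1/2}\left(\nabla_c T^{a\cdots}_{b\cdots}\right), \tag{7.12}$$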