MATHEMATICS OF
PHYSICS AND
ENGINEERING
EDWARD K. BLUM, SERGEY V. LOTOTSKY
University of Southern California, USA
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
About the cover: The pyramid design by Martin Herkenhoff represents the pyramid of knowledge built up in levels over millennia by the scientists named, and others referred to in the text.
MATHEMATICS OF PHYSICS AND ENGINEERING Copyright © 2006 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher. ISBN 981-256-621-X
Printed in Singapore by B & JO Enterprise
To Lori, Debbie, Beth, and Amy. To Kolya, Olya, and Lya.
Preface
What is mathematics of physics and engineering? An immediate answer would be "all mathematics that is used in physics and engineering", which is pretty much ... all the mathematics there is. While it is nearly impossible to present all mathematics in a single book, many books on the subject seem to try this.
On the other hand, a semester-long course in mathematics of physics and engineering is a more well-defined notion, and is present in most universities. Usually, this course is designed for advanced undergraduate students who are majoring in physics or engineering, and who are already familiar with multi-variable calculus and ordinary differential equations. The basic topics in such a course include introduction to Fourier analysis and partial differential equations, as well as a review of vector analysis and selected topics from complex analysis and ordinary differential equations. It is therefore useful to have a book that covers these topics — and nothing else. Besides the purely practical benefits, related to the reduction of the physical dimensions of the volume the students must carry around, the reduction of the number of topics covered has other advantages over the existing lengthy texts on engineering mathematics.
One major advantage is the opportunity to explore the connection between mathematical models and their physical applications. We explore this connection to the fullest and show how physics leads to mathematical models and conversely, how the mathematical models lead to the discovery of new physics. We believe that students will be stimulated by this interplay of physics and mathematics and will see mathematics come alive. For example, it is interesting to establish the connection between electromagnetism and Maxwell's equations on the one side and the integral theorems of vector calculus on the other side. Unfortunately, Maxwell's equations
are often left out of an applied mathematics course, and the study of these equations in a physics course often leaves the mathematical part somewhat of a mystery. In our exposition, we maintain the full rigor of mathematics while always presenting the motivation from physics. We do this for the classical mechanics, electromagnetism, and mechanics of continuous medium, and introduce the main topics from the modern physics of relativity, both special and general, and quantum mechanics, topics usually omitted in conventional books on "Engineering Mathematics."
Another advantage is the possibility of further exploration through problems, as opposed to standard end-of-section exercises. This book offers a whole chapter, about 30 pages, worth of problems, and many of those problems can be a basis of a serious undergraduate research project.
Yet another advantage is the space to look at the historical developments of the subject. Who invented the cross product? (Gibbs in the 1880s, see page 3.) Who introduced the notation $i$ for the imaginary unit $\sqrt{-1}$? (Euler in 1777, see page 79.) In the study of mathematics, the fact that there are actual people behind every formula is often forgotten, unless it is a course in the history of mathematics. We believe that historical background material makes the presentation more lively and should not be confined to specialized history books.
As for the accuracy of our historical passages, a disclaimer is in order. According to one story, the Russian mathematician ANDREI NIKOLAEVICH KOLMOGOROV (1903-1987) started out as a history major, but quickly switched to mathematics after being told that historians require at least five different proofs for each claim. While we tried to verify the historical claims in our presentation, we certainly do not have even two independent proofs for most of them. Our historical comments are only intended to satisfy, and to ignite, the curiosity of the reader.
An interesting piece of advice for reading this, and any other, textbook comes from the Russian physicist and Nobel Laureate LEV DAVIDOVICH LANDAU (1908-1968). Rephrasing what he used to say, if you do not understand a particular place in the book, read again; if you still do not understand after five attempts, change your major. Even though we do not intend to force a change of major on our readers, we realize that some places in the book are more difficult than others, and understanding those places might require a significant mental effort on the part of the reader.
While writing the book, we sometimes followed the advice of the German mathematician CARL GUSTAV JACOB JACOBI (1804-1851), who used to say: "One should always generalize." Even though we tried to keep abstract
constructions to a minimum, we could not avoid them altogether: some ideas, such as the separation of variables for the heat and wave equations, just ask to be generalized, and we hope the reader will appreciate the benefits of these generalizations. As a consolation to the reader who is not comfortable with abstract constructions, we mention that everything in this book, no matter how abstract it might look, is nowhere near the level of abstraction to which one can take it.
The inevitable consequence of unifying mathematics and physics, as we do here, is a possible confusion with notations. For example, it is customary in mathematics to denote a generic region in the plane or in space by G, from the German word Gebiet, meaning "territory." On the other hand, the same letter is used in physics for the universal gravitational constant; in our book, we use G to denote this constant (notice a slight difference between G and G). Since these two symbols never appear in the same formula, we hope the reader will not be confused.
We are not including the usual end-of-section exercises, and instead incorporate the exercises into the main presentation. These exercises act as speed-bumps, forcing the reader to have a pen and pencil nearby. They should also help the reader to follow the presentation better and, once solved, provide an added level of confidence. Each exercise is rated with a superscript A, B, or C; sometimes, different parts of the same exercise have different ratings. The rating is mostly the subjective view of the authors and can represent each of the following: (a) The level of difficulty, with C being the easiest; (b) The degree of importance for general understanding of the material, with C being the most important; (c) The aspiration of the student attempting the exercise. Our suggestion for the first reading is to understand the question and/or conclusion of every exercise and to attempt every C-rated exercise, especially those that ask to verify something. The problems are at the very end, in the chapter called "Further Developments," and are not rated. These problems provide a convenient means to give brief extensions of the subjects treated in the text (see, for example, the problems on special relativity).
A semester-long course using this book would most likely emphasize the chapters on complex numbers, Fourier analysis, and partial differential equations, with the chapters on vectors, mechanics, and electromagnetic theory covered only briefly while reviewing vectors and vector analysis. The chapters on complex numbers and Fourier analysis are short enough to be covered more or less completely, each in about ten 50-minute lectures. The chapter on partial differential equations is much longer, and, beyond
the standard material on one-dimensional heat and wave equations, the selection of topics can vary to reflect the preferences of the instructor and/or the students. We should also mention that a motivated student can master the complete book in one 15-week semester by reading, on average, just 5 pages per day.
We extend our gratitude to our colleagues at USC: Todd Brun, Tobias Ekholm, Florence Lin, Paul Newton, Robert Penner, and Mohammed Ziane, who read portions of the manuscript and gave valuable suggestions. We are very grateful to Igor Cialenco, who carefully read the entire manuscript and found numerous typos and inaccuracies. The work of SVL was partially supported by the Sloan Research Fellowship, the NSF CAREER award DMS-0237724, and the ARO Grant DAAD19-02-1-0374.
E. K. Blum and S. V. Lototsky
Contents

Preface

1. Euclidean Geometry and Vectors
  1.1 Euclidean Geometry
    1.1.1 The Postulates of Euclid
    1.1.2 Relative Position and Position Vectors
    1.1.3 Euclidean Space as a Linear Space
  1.2 Vector Operations
    1.2.1 Inner Product
    1.2.2 Cross Product
    1.2.3 Scalar Triple Product
  1.3 Curves in Space
    1.3.1 Vector-Valued Functions of a Scalar Variable
    1.3.2 The Tangent Vector and Arc Length
    1.3.3 Frenet's Formulas
    1.3.4 Velocity and Acceleration

2. Vector Analysis and Classical and Relativistic Mechanics
  2.1 Kinematics and Dynamics of a Point Mass
    2.1.1 Newton's Laws of Motion and Gravitation
    2.1.2 Parallel Translation of Frames
    2.1.3 Uniform Rotation of Frames
    2.1.4 General Accelerating Frames
  2.2 Systems of Point Masses
    2.2.1 Non-Rigid Systems of Points
    2.2.2 Rigid Systems of Points
    2.2.3 Rigid Bodies
  2.3 The Lagrange-Hamilton Method
    2.3.1 Lagrange's Equations
    2.3.2 An Example of Lagrange's Method
    2.3.3 Hamilton's Equations
  2.4 Elements of the Theory of Relativity
    2.4.1 Historical Background
    2.4.2 The Lorentz Transformation and Special Relativity
    2.4.3 Einstein's Field Equations and General Relativity

3. Vector Analysis and Classical Electromagnetic Theory
  3.1 Functions of Several Variables
    3.1.1 Functions, Sets, and the Gradient
    3.1.2 Integration and Differentiation
    3.1.3 Curvilinear Coordinate Systems
  3.2 The Three Integral Theorems of Vector Analysis
    3.2.1 Green's Theorem
    3.2.2 The Divergence Theorem of Gauss
    3.2.3 Stokes's Theorem
    3.2.4 Laplace's and Poisson's Equations
  3.3 Maxwell's Equations and Electromagnetic Theory
    3.3.1 Maxwell's Equations in Vacuum
    3.3.2 The Electric and Magnetic Dipoles
    3.3.3 Maxwell's Equations in Material Media

4. Elements of Complex Analysis
  4.1 The Algebra of Complex Numbers
    4.1.1 Basic Definitions
    4.1.2 The Complex Plane
    4.1.3 Applications to Analysis of AC Circuits
  4.2 Functions of a Complex Variable
    4.2.1 Continuity and Differentiability
    4.2.2 Cauchy-Riemann Equations
    4.2.3 The Integral Theorem and Formula of Cauchy
    4.2.4 Conformal Mappings
  4.3 Power Series and Analytic Functions
    4.3.1 Series of Complex Numbers
    4.3.2 Convergence of Power Series
    4.3.3 The Exponential Function
  4.4 Singularities of Complex Functions
    4.4.1 Laurent Series
    4.4.2 Residue Integration
    4.4.3 Power Series and Ordinary Differential Equations

5. Elements of Fourier Analysis
  5.1 Fourier Series
    5.1.1 Fourier Coefficients
    5.1.2 Point-wise and Uniform Convergence
    5.1.3 Computing the Fourier Series
  5.2 Fourier Transform
    5.2.1 From Sums to Integrals
    5.2.2 Properties of the Fourier Transform
    5.2.3 Computing the Fourier Transform
  5.3 Discrete Fourier Transform
    5.3.1 Discrete Functions
    5.3.2 Fast Fourier Transform (FFT)
  5.4 Laplace Transform
    5.4.1 Definition and Properties
    5.4.2 Applications to System Theory

6. Partial Differential Equations of Mathematical Physics
  6.1 Basic Equations and Solution Methods
    6.1.1 Transport Equation
    6.1.2 Heat Equation
    6.1.3 Wave Equation in One Dimension
  6.2 Elements of the General Theory of PDEs
    6.2.1 Classification of Equations and Characteristics
    6.2.2 Variation of Parameters
    6.2.3 Separation of Variables
  6.3 Some Classical Partial Differential Equations
    6.3.1 Telegraph Equation
    6.3.2 Helmholtz's Equation
    6.3.3 Wave Equation in Two and Three Dimensions
    6.3.4 Maxwell's Equations
    6.3.5 Equations of Fluid Mechanics
  6.4 Equations of Quantum Mechanics
    6.4.1 Schrödinger's Equation
    6.4.2 Dirac's Equation of Relativistic Quantum Mechanics
    6.4.3 Introduction to Quantum Computing
  6.5 Numerical Solution of Partial Differential Equations
    6.5.1 General Concepts in Numerical Methods
    6.5.2 One-Dimensional Heat Equation
    6.5.3 One-Dimensional Wave Equation
    6.5.4 The Poisson Equation in a Rectangle
    6.5.5 The Finite Element Method

7. Further Developments and Special Topics
  7.1 Geometry and Vectors
  7.2 Kinematics and Dynamics
  7.3 Special Relativity
  7.4 Vector Calculus
  7.5 Complex Analysis
  7.6 Fourier Analysis
  7.7 Partial Differential Equations

8. Appendix
  8.1 Linear Algebra and Matrices
  8.2 Ordinary Differential Equations
  8.3 Tensors
  8.4 Lumped Electric Circuits
  8.5 Physical Units and Constants

Bibliography
List of Notations
Index
Chapter 1
Euclidean Geometry and Vectors
1.1 Euclidean Geometry
1.1.1 The Postulates of Euclid
The two Greek roots in the word geometry, geo and metron, mean "earth" and "a measure," respectively, and until the early 19th century the development of this mathematical discipline relied exclusively on our visual, auditory, and tactile perception of the space in our immediate vicinity. In particular, we believe that our space is homogeneous (has the same properties at every point) and isotropic (has the same properties in every direction). The abstraction of our intuition about space is Euclidean geometry, named after the Greek mathematician and philosopher EUCLID, who developed this abstraction around 300 B.C.
The foundations of Euclidean geometry are five postulates concerning points and lines. A point is an abstraction of the notion of a position in space. A line is an abstraction of the path of a light beam connecting two nearby points. Thus, any two points determine a unique line passing through them. This is Euclid's first postulate. The second postulate states that a line segment can be extended without limit in either direction. This is rather less intuitive and requires an imaginative conception of space as being infinite in extent. The third postulate states that, given any straight line segment, a circle can be drawn having the segment as radius and one endpoint as center, thereby recognizing the special importance of the circle and the use of straight-edge and compass to construct planar figures. The fourth postulate states that all right angles are equal, thereby acknowledging our perception of perpendicularity and its uniformity. The fifth and final postulate states that if two lines are drawn in the plane to intersect a third line in such a way that the sum of the
inner angles on one side is less than two right angles, then the two lines inevitably must intersect each other on that side if extended far enough. This postulate is equivalent to what is known as the parallel postulate, stating that, given a line and a point not on the line, there exists one and only one straight line in the same plane that passes through the point and never intersects the first line, no matter how far the lines are extended. For more information about the parallel postulate, see the book Gödel, Escher, Bach: An Eternal Golden Braid by D. R. Hofstadter, 1999. The parallel postulate is somewhat contrary to our physical perception of distance perspective, where in fact two lines constructed to run parallel seem to converge in the far distance.
While any geometric construction that does not exclusively rely on the five postulates of Euclid can be called non-Euclidean, the two basic non-Euclidean geometries, hyperbolic and elliptic, accept the first four postulates of Euclid, but use their own versions of the fifth. Incidentally, Euclidean geometry is sometimes called parabolic. For more information about the non-Euclidean geometries, see the book Euclidean and Non-Euclidean Geometries: Development and History by M. J. Greenberg, 1994.
The parallel postulate of Euclid has many implications, for example, that the sum of the angles of a triangle is 180°. Not surprisingly, this and other implications do not hold in non-Euclidean geometries. Classical (Newtonian) mechanics assumes that the geometry of space is Euclidean. In particular, our physical space is often referred to as the three-dimensional Euclidean space $\mathbb{R}^3$, with $\mathbb{R}$ denoting the set of the real numbers; the reason for this notation will become clear later, see page 7.
The development of Euclidean geometry essentially relies on our intuition that every line segment joining two points has a length associated with it. Length is measured as a multiple of some chosen unit (e.g. meter). A famous theorem that can be derived in Euclidean geometry is the theorem of Pythagoras: the square of the length of the hypotenuse of a right triangle is equal to the sum of the squares of the lengths of the other two sides. Exercise 1.1.4 outlines one possible proof. This theorem leads to the distance function, or metric, in Euclidean space when a cartesian coordinate system is chosen. The metric gives the distance between any two points by the familiar formula in terms of their coordinates (Exercise 1.1.5).
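The familiar distance formula mentioned above can be illustrated with a short Python sketch. It assumes that the two points are given by their cartesian coordinates (x, y, z); the function name `distance` is chosen here for illustration and does not come from the text.

```python
import math

def distance(p1, p2):
    """Euclidean distance between two points given by cartesian coordinates."""
    # Sum the squared coordinate differences, then take the square root
    # (the theorem of Pythagoras, applied in three dimensions).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))

# A 3-4-5 right triangle lying in the plane z = 0.
print(distance((0, 0, 0), (3, 4, 0)))  # 5.0
```

The same value is returned when the two points are swapped, as expected of a metric.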
1.1.2 Relative Position and Position Vectors
Our intuitive conception and observation of position and motion suggest that the position of a point in space can only be specified relative to some other point, chosen as a reference. Likewise, the motion of a point can only be specified relative to some reference point.
The view that only relative motion exists and no meaning can be given to absolute position or absolute motion has been advocated by many prominent philosophers for many centuries. Among the famous proponents of this relativistic view were the Irish bishop and philosopher GEORGE BERKELEY (c. 1685-1753), and the Austrian physicist and philosopher ERNST MACH (1836-1916). An opposing view of absolute motion also had prominent supporters, such as Sir ISAAC NEWTON (1642-1727). In 1905, the German physicist ALBERT EINSTEIN (1879-1955) and his theory of special relativity seemed to resolve the dispute in favor of the relativists (see Section 2.4 below).
Let us apply the idea of the relative position to points in the Euclidean space $\mathbb{R}^3$. We choose an arbitrary point $O$ as a reference point and call it an origin. Relative to $O$, the position of every point $P$ in $\mathbb{R}^3$ is specified by the directed line segment $\mathbf{r} = \overrightarrow{OP}$ from $O$ to $P$. This line segment has length $\|\mathbf{r}\| = |OP|$, the distance from $O$ to $P$, and is called the position vector of $P$ relative to $O$ (the Latin word vector means "carrier"). Conversely, any directed line segment starting at $O$ determines a point $P$. This description does not require a coordinate system to locate $P$.
In what follows, we denote vectors by bold letters, either lower or upper case: $\mathbf{u}$, $\mathbf{R}$. Sometimes, when the starting point $O$ and the ending point $P$ of the vector must be emphasized, we write $\overrightarrow{OP}$ to denote the corresponding vector.
The position vectors, or simply vectors, can be added and multiplied by real numbers. With these operations of addition and multiplication, the set of all vectors becomes a vector space. Because of the special geometric structure of $\mathbb{R}^3$, two more operations on vectors can be defined, the dot product and the cross product, and this was first done in the 1880s by the American scientist JOSIAH WILLARD GIBBS (1839-1903). We will refer to the study of the four operations on vectors (addition, multiplication by real numbers, dot product, cross product) as vector algebra. By contrast, vector analysis (also known as vector calculus) is the calculus on $\mathbb{R}^3$, that is, differentiation and integration of vector-valued functions of one or
several variables. Vector algebra and vector analysis were developed in the 1880s, independently by Gibbs and by a self-taught British engineer OLIVER HEAVISIDE (1850-1925). In their developments, both Gibbs and Heaviside were motivated by applications to physics: many physical quantities, such as position, velocity, acceleration, and force, can be represented by vectors.
The constructions in vector algebra and analysis are not tied to any particular coordinate system in $\mathbb{R}^3$, and do not rely on the interpretation of vectors as position vectors. Nevertheless, it is convenient to depict a vector as a line segment with an arrowhead at one end to indicate direction, and think of the length of the segment as the magnitude of the vector.
Remark 1.1 Most of the time, we will identify all the vectors having the same direction and length, no matter the starting point. Each vector becomes a representative of an equivalence class of vectors and can be moved around by parallel translation. While this identification is convenient to study abstract properties of vectors, it is not always possible in certain physical problems (Figure 1.1.1).
Fig. 1.1.1 Starting Point of a Vector Can Be Important! (Two panels, Stretching and Compressing, each showing the forces F1 and F2.)
1.1.3 Euclidean Space as a Linear Space
Consider the Euclidean space $\mathbb{R}^3$ and choose a point $O$ to serve as the origin. In mechanics this is sometimes referred to as choosing a frame of reference, or frame for short. As was mentioned in Remark 1.1, we assume that all the vectors can be moved to the same starting point; this starting point defines the frame. Accordingly, in what follows, the word frame will have one of the three meanings:
• A fixed point;
• A fixed point with a fixed coordinate system (not necessarily Cartesian);
• A fixed point and a vector bundle, that is, the collection of all vectors that start at that point.
Let $\mathbf{r}$ be the position vector for a point $P$. Consider another frame with origin $O'$. Let $\mathbf{r}'$ be the position vector of $P$ relative to $O'$. Now, let $\mathbf{v}$ be the position vector of $O'$ relative to $O$. The three vectors form a triangle $OO'P$; see Figure 1.1.2. This suggests that we write $\mathbf{r} = \mathbf{v} + \mathbf{r}'$. To get from $O$ to $P$ we can first go from $O$ to $O'$ along $\mathbf{v}$ and then from $O'$ to $P$ along $\mathbf{r}'$. This can be depicted entirely with position vectors at $O$ if we move $\mathbf{r}'$ parallel to itself and place its initial point at $O$. Then $\mathbf{r}$ is a diagonal of the parallelogram having sides $\mathbf{v}$ and $\mathbf{r}'$, all emanating from $O$. This is called the parallelogram law for vector addition. It is a geometric definition of $\mathbf{v} + \mathbf{r}'$. Note that the same result is obtained by forming the triangle $OO'P$.
Fig. 1.1.2 Vector Addition: r = v + r'
Now, consider three position vectors, u,v,w. It is easy to see that the above definition of vector addition obeys the following algebraic laws:
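$$
\begin{aligned}
\mathbf{u} + \mathbf{v} &= \mathbf{v} + \mathbf{u} && \text{(commutativity)} \\
(\mathbf{u} + \mathbf{v}) + \mathbf{w} &= \mathbf{u} + (\mathbf{v} + \mathbf{w}) && \text{(associativity; see Figure 1.1.3)} \\
\mathbf{u} + \mathbf{0} &= \mathbf{0} + \mathbf{u} = \mathbf{u}
\end{aligned}
\tag{1.1.1}
$$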
The zero vector 0 is the only vector with zero length and no specific direction.
Next consider two real numbers, $\lambda$ and $\mu$. In vector algebra, real numbers are called scalars. The vector $\lambda\mathbf{u}$ is the vector obtained from $\mathbf{u}$ by multiplying its length by $|\lambda|$. If $\lambda > 0$, then the vectors $\mathbf{u}$ and $\lambda\mathbf{u}$ have the same direction; if $\lambda < 0$, then the vectors have opposite directions. For example, $2\mathbf{u}$ points in the same direction as $\mathbf{u}$ but has twice its length, whereas $-\mathbf{u}$ has the same length as $\mathbf{u}$ and points in the opposite direction (Figure 1.1.4).
Fig. 1.1.3 Associativity of Vector Addition (the sum u + v + w)
Fig. 1.1.4 Multiplication by a Scalar
Multiplication of a vector by a scalar is easily seen to obey the following algebraic rules:
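$$
\begin{aligned}
\lambda(\mathbf{u} + \mathbf{v}) &= \lambda\mathbf{u} + \lambda\mathbf{v} && \text{(distributivity over vector addition)} \\
(\lambda + \mu)\mathbf{u} &= \lambda\mathbf{u} + \mu\mathbf{u} && \text{(distributivity over real addition)} \\
(\lambda\mu)\mathbf{u} &= \lambda(\mu\mathbf{u}) && \text{(a mixed associativity of multiplications)} \\
1\,\mathbf{u} &= \mathbf{u}.
\end{aligned}
\tag{1.1.2}
$$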
In particular, two vectors are parallel if and only if one is a scalar multiple of the other.
Definition 1.1 A (real) vector space is any abstract set of objects, called vectors, with operations of vector addition and multiplication by (real) scalars obeying the seven algebraic rules (1.1.1) and (1.1.2).
Note that, while the set of position vectors is a vector space, the concepts of vector length and the angle between two vectors are not included in the general definition of a vector space. A vector space is said to be n-dimensional if the space has a set of $n$ vectors, $\mathbf{u}_1, \ldots, \mathbf{u}_n$, such that any vector $\mathbf{v}$ can be represented as a linear combination of the $\mathbf{u}_i$, that is, in the form
$$
\mathbf{v} = x_1\mathbf{u}_1 + \cdots + x_n\mathbf{u}_n,
\tag{1.1.3}
$$
and the scalar components $x_1, \ldots, x_n$ are uniquely determined by $\mathbf{v}$. An n-dimensional real vector space is denoted by $\mathbb{R}^n$; with $\mathbb{R}$ denoting the set of real numbers, this notation is quite natural.
We say that the vectors $\mathbf{u}_i$, $i = 1, \ldots, n$, form a basis in $\mathbb{R}^n$. Notice that nothing is said about the length of the basis vectors or the angles between them: in an abstract vector space, these notions do not exist. The uniqueness of representation (1.1.3) implies that the basis vectors are linearly independent, that is, the equality $x_1\mathbf{u}_1 + \cdots + x_n\mathbf{u}_n = \mathbf{0}$ holds if and only if all the numbers $x_1, \ldots, x_n$ are equal to zero. It is not difficult to show that a vector space is n-dimensional if and only if the space contains $n$ linearly independent vectors, and every collection of $n + 1$ vectors is linearly dependent; see Problem 1.7, page 411.
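For three vectors given by their components in some fixed basis of a three-dimensional space, linear independence can be tested numerically. The sketch below uses the determinant criterion, a standard fact from linear algebra that is not derived in this section; the function names `det3` and `linearly_independent` and the tolerance `tol` are illustrative assumptions.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix whose rows are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def linearly_independent(u, v, w, tol=1e-12):
    """True if x1*u + x2*v + x3*w = 0 forces x1 = x2 = x3 = 0."""
    # Assumed criterion: three vectors are independent exactly when
    # the determinant of their components is non-zero.
    return abs(det3(u, v, w)) > tol

print(linearly_independent((1, 0, 0), (0, 1, 0), (0, 0, 1)))  # True
print(linearly_independent((1, 2, 3), (2, 4, 6), (0, 0, 1)))  # False
```

In the second call the second vector is twice the first, so a non-trivial combination of the three vectors equals the zero vector.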
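In the space $\mathbb{R}^3$ of position vectors, we do have the notions of length and angle. The standard basis in $\mathbb{R}^3$ is the cartesian basis $(\hat{\boldsymbol{\imath}}, \hat{\boldsymbol{\jmath}}, \hat{\boldsymbol{\kappa}})$, consisting of the origin $O$ and three mutually perpendicular vectors $\hat{\boldsymbol{\imath}}, \hat{\boldsymbol{\jmath}}, \hat{\boldsymbol{\kappa}}$ of unit length with the common starting point $O$. In a cartesian basis, every position vector $\mathbf{r} = \overrightarrow{OP}$ of a point $P$ is written in the form
$$
\mathbf{r} = x\,\hat{\boldsymbol{\imath}} + y\,\hat{\boldsymbol{\jmath}} + z\,\hat{\boldsymbol{\kappa}};
\tag{1.1.4}
$$
the numbers $(x, y, z)$ are called the coordinates of the point $P$ with respect to the cartesian coordinate system formed by the lines along $\hat{\boldsymbol{\imath}}$, $\hat{\boldsymbol{\jmath}}$, and $\hat{\boldsymbol{\kappa}}$. In the plane of $\hat{\boldsymbol{\imath}}$ and $\hat{\boldsymbol{\jmath}}$, the vectors $x\,\hat{\boldsymbol{\imath}} + y\,\hat{\boldsymbol{\jmath}}$ form a two-dimensional vector space $\mathbb{R}^2$. With some abuse of notation, we sometimes write $\mathbf{r} = (x, y, z)$ when (1.1.4) holds and the coordinate system is fixed.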
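Once a cartesian coordinate system is fixed, vectors can be handled as triples of numbers. The following minimal Python sketch assumes that addition and multiplication by a scalar act component by component, which follows from (1.1.4) together with rules (1.1.1) and (1.1.2) although the text does not spell it out here. The helper names `add` and `scale` and the sample vectors are illustrative.

```python
def add(u, v):
    """Component-wise sum of two vectors given as (x, y, z) tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(lam, u):
    """Multiply the vector u by the scalar lam."""
    return tuple(lam * a for a in u)

u, v, w = (1, 2, 3), (-4, 0, 5), (2, -1, 1)
lam, mu = 3, -2
zero = (0, 0, 0)

# Rules (1.1.1): commutativity, associativity, zero vector.
assert add(u, v) == add(v, u)
assert add(add(u, v), w) == add(u, add(v, w))
assert add(u, zero) == u

# Rules (1.1.2): both distributive laws, mixed associativity, unit scalar.
assert scale(lam, add(u, v)) == add(scale(lam, u), scale(lam, v))
assert scale(lam + mu, u) == add(scale(lam, u), scale(mu, u))
assert scale(lam * mu, u) == scale(lam, scale(mu, u))
assert scale(1, u) == u

print(add(u, v))      # (-3, 2, 8)
print(scale(lam, u))  # (3, 6, 9)
```

Integer components are used so that the seven rules can be checked with exact equality; with floating-point components a small tolerance would be needed.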
The word "cartesian" describes everything connected with the French scientist RENÉ DESCARTES (1596-1650), who was also known by the Latin version of his last name, Cartesius. Besides the coordinate system, which he introduced in 1637, he is famous for the statement "I think, therefore I am."
Much of the power of the vector space approach lies in the freedom from any choice of basis or coordinates. Indeed, many geometrical concepts and results can be stated in vector terms without resorting to coordinate systems. Here are two examples:
(1) The line determined by two points in $\mathbb{R}^3$ can be represented by the position vector function
$$
\mathbf{r}(s) = \mathbf{u} + s(\mathbf{v} - \mathbf{u}) = s\mathbf{v} + (1 - s)\mathbf{u}, \quad -\infty < s < +\infty,
\tag{1.1.5}
$$
where $\mathbf{u}$ and $\mathbf{v}$ are the position vectors of the two points. More generally, a line passing through the point $P_0$ and having a direction vector $\mathbf{d}$ consists of the points with position vectors $\mathbf{r}(s) = \overrightarrow{OP_0} + s\mathbf{d}$.
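(2) The plane determined by the three points having position vectors $\mathbf{u}, \mathbf{v}, \mathbf{w}$ is represented by the position vector function
$$
\begin{aligned}
\mathbf{r}(s, t) &= \mathbf{u} + s(\mathbf{v} - \mathbf{u}) + t(\mathbf{w} - \mathbf{u}) \\
&= s\mathbf{v} + t\mathbf{w} + (1 - s - t)\mathbf{u}, \quad -\infty < s, t < +\infty.
\end{aligned}
\tag{1.1.6}
$$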
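The two parametrizations (1.1.5) and (1.1.6) are easy to evaluate numerically. The Python sketch below assumes that vectors are stored as coordinate triples in a fixed cartesian system; the function names `line_point` and `plane_point` are hypothetical and chosen only for this illustration.

```python
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def scale(s, u):
    return tuple(s * a for a in u)

def line_point(u, v, s):
    """Point r(s) = u + s(v - u) on the line through u and v, as in (1.1.5)."""
    return add(u, scale(s, sub(v, u)))

def plane_point(u, v, w, s, t):
    """Point r(s, t) = u + s(v - u) + t(w - u) on the plane, as in (1.1.6)."""
    return add(add(u, scale(s, sub(v, u))), scale(t, sub(w, u)))

u, v, w = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(line_point(u, v, 0))         # (1, 0, 0): s = 0 gives u
print(line_point(u, v, 1))         # (0, 1, 0): s = 1 gives v
print(line_point(u, v, 0.5))       # (0.5, 0.5, 0.0): the midpoint
print(plane_point(u, v, w, 0, 1))  # (0, 0, 1): s = 0, t = 1 gives w
```

Values of s between 0 and 1 give points on the segment between the two points; values outside that range give the rest of the line.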
EXERCISE 1.1.1.$^{B}$ Verify that equations (1.1.5) and (1.1.6) indeed define a line and a plane, respectively, in $\mathbb{R}^3$.
EXERCISE 1.1.2.$^{B}$ Let $L_1$ and $L_2$ be two parallel lines in $\mathbb{R}^3$. A line intersecting both $L_1$ and $L_2$ is called a transversal.
(a) Let $L$ be a transversal perpendicular to $L_1$. Prove that $L$ is perpendicular to $L_2$. Hint: If not, then there is a right triangle with $L$ as one side, the other side along $L_1$ and the hypotenuse lying along $L_2$. (b) Prove that the alternate angles made by a transversal are equal. Hint: Let $A$ and $B$ be the points of intersection of the transversal with $L_1$ and $L_2$ respectively. Draw the perpendiculars at $A$ and $B$. They form two congruent right triangles.
EXERCISE 1.1.3. Use the result of Exercise 1.1.2(b) to prove that the sum of the angles of a triangle equals a straight angle (180°). Hint: Let $A$, $B$, $C$ be the vertices of the triangle. Through $C$ draw a line parallel to side $AB$.
EXERCISE 1.1.4.$^{A}$ Let $a$, $b$ be the lengths of the sides of a right triangle with hypotenuse of length $c$. Prove that $a^2 + b^2 = c^2$ (Pythagorean Theorem). Hint: See Figure 1.1.5 and note that the acute angles $A$ and $B$ are complementary: $A + B = 90^\circ$.
EXERCISE 1.1.5.$^{C}$ Use the result of Exercise 1.1.4 to derive the Euclidean distance formula: $d(P_1, P_2) = [(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2]^{1/2}$.
Fig. 1.1.5 Pythagorean Theorem
EXERCISE 1.1.6.$^{A}$ Prove that the diagonals of a parallelogram intersect at their midpoints. Hint: let the vectors $\mathbf{u}$ and $\mathbf{v}$ form the parallelogram and let $\mathbf{r}$ be the position vector of the point of intersection of the diagonals. Argue that $\mathbf{r} = \mathbf{u} + s(\mathbf{v} - \mathbf{u}) = t(\mathbf{u} + \mathbf{v})$ and deduce that $s = t = 1/2$.
1.2 Vector Operations
1.2.1 Inner Product
Euclidean geometry and trigonometry deal with lengths of line segments and angles formed by intersecting lines. In abstract vector analysis, lengths of vectors and angles between vectors are defined using the axiomatically introduced notions of norm and inner product.
In $\mathbb{R}^3$, where the notions of angle and length already exist, we use these notions to define the inner product $\mathbf{u} \cdot \mathbf{v}$ of two vectors. We denote the length of vector $\mathbf{u}$ by $\|\mathbf{u}\|$. A unit vector is a vector with length equal to one. If $\mathbf{u}$ is a non-zero vector, then $\mathbf{u}/\|\mathbf{u}\|$ is the unit vector with the same direction as $\mathbf{u}$; this unit vector is often denoted by $\hat{\mathbf{u}}$. More generally, a hat $\hat{\ }$ on top of a vector means that the vector has unit length. With the dot $\cdot$ denoting the inner product of two vectors, we will sometimes write $a.b$ to denote the product of two real numbers $a$, $b$.
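Definition 1.2 Let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $\mathbb{R}^3$. The inner product of $\mathbf{u}$ and $\mathbf{v}$, denoted $\mathbf{u} \cdot \mathbf{v}$, is defined by
$$
\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\|.\|\mathbf{v}\| \cos\theta,
\tag{1.2.1}
$$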
10
Vector Operations
where 9 is the angle between u and v, 0 < 6 < TT (see Figure 1.2.1), and the notation ||it||.||i;|| means the usual product of two numbers. If u = 0 or v = 0, then u • v = 0.
Fig. 1.2.1 Angle Between Two Vectors
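Alternative names for the inner product are dot product and scalar product.
If $u$ and $v$ are non-zero vectors, then $u \cdot v = 0$ if and only if $\theta = \pi/2$. In this case, we say that the vectors $u$ and $v$ are orthogonal or perpendicular, and write $u \perp v$. Notice that
$$u \cdot u = \|u\|^2 \ge 0. \tag{1.2.2}$$
In $\mathbb{R}^3$, a set of three unit vectors that are mutually orthogonal is called an orthonormal set or orthonormal basis. For example, the unit vectors $\hat{\imath}, \hat{\jmath}, \hat{\kappa}$ of a cartesian coordinate system make an orthonormal basis. Indeed, $\hat{\imath} \perp \hat{\jmath}$, $\hat{\imath} \perp \hat{\kappa}$, and $\hat{\jmath} \perp \hat{\kappa}$, so that $\hat{\imath} \cdot \hat{\jmath} = \hat{\imath} \cdot \hat{\kappa} = \hat{\jmath} \cdot \hat{\kappa} = 0$, and $\hat{\imath} \cdot \hat{\imath} = \hat{\jmath} \cdot \hat{\jmath} = \hat{\kappa} \cdot \hat{\kappa} = 1$.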
The word "orthogonal" comes from the Greek orthogonios, or "right-angled"; the word "perpendicular" comes from the Latin perpendiculum, or "plumb line", which is a cord with a weight attached to one end, used to check a straight vertical position. The Latin word norma means "carpenter's square," another device to check for right angles.
The dot product simplifies the computation of the angle between two vectors. Indeed, if $u$ and $v$ are two unit vectors, then $u \cdot v = \cos\theta$. More generally, for two non-zero vectors $u$ and $v$ we have
$$\theta = \cos^{-1}\left(\frac{u \cdot v}{\|u\|.\|v\|}\right). \tag{1.2.3}$$
The notion of the dot product is closely connected with the ORTHOGONAL PROJECTION. If $u$ and $v$ are two non-zero vectors, then we can write $u = u_v + u_p$, where $u_v$ is parallel to $v$ and $u_p$ is perpendicular to $v$ (see Figure 1.2.2).
Fig. 1.2.2 Orthogonal Projection
It follows from the picture that $\|u_v\| = \|u\|.|\cos\theta|$ and $u_v$ has the same direction as $v$ if and only if $0 \le \theta < \pi/2$. Comparing this with (1.2.1) we conclude that
$$u_v = \frac{u \cdot v}{\|v\|}\,\frac{v}{\|v\|}. \tag{1.2.4}$$
The vector $u_v$ is called the orthogonal projection of $u$ on $v$, and is denoted by $u_\perp$; the number $u \cdot v/\|v\|$ is called the component of $u$ in the direction of $v$; note that $v/\|v\|$ is a unit vector. The verb "to project" comes from Latin "to throw forward." Let us emphasize that the orthogonal projection of a vector is also a vector.
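The projection formula (1.2.4) can be tried out numerically. The following is a minimal sketch in plain Python, not part of the original text; the helper names `dot`, `norm`, `project`, and `component` are illustrative choices, and vectors are represented as tuples of cartesian components.

```python
import math

def dot(x, y):
    # Inner product computed from cartesian components.
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # Length of a vector.
    return math.sqrt(dot(x, x))

def project(u, v):
    # Orthogonal projection of u on a non-zero vector v, as in (1.2.4).
    scale = dot(u, v) / dot(v, v)
    return tuple(scale * c for c in v)

def component(u, v):
    # Component of u in the direction of v: u . v / ||v||.
    return dot(u, v) / norm(v)

u = (3.0, 4.0, 0.0)
v = (2.0, 0.0, 0.0)
u_v = project(u, v)                          # part of u parallel to v
u_p = tuple(a - b for a, b in zip(u, u_v))   # part of u perpendicular to v
print(u_v, u_p, component(u, v), dot(u_p, v))
```

In this example the remainder `u_p` has zero inner product with `v`, which is the decomposition of `u` into parallel and perpendicular parts described above.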
Let us now use the idea of the orthogonal projection to establish the
PROPERTIES OF THE INNER PRODUCT.
Consider two non-zero vectors u and w and a unit vector v. Then (u + w) • v is the projection of u + w on v. From Figure 1.2.3, we conclude that (u + w) • v = u • v + w • v.
Fig. 1.2.3 Orthogonal Projection of Two Vectors
Furthermore, if $\lambda$ is any real scalar, then $(\lambda u) \cdot v = \lambda(u \cdot v)$. For example, $(2u) \cdot v = 2(u \cdot v)$. Also $(-u) \cdot v = -(u \cdot v)$, since the angle between $-u$ and $v$ is $\pi - \theta$ and $\cos(\pi - \theta) = -\cos\theta$. These observations are summarized by the formula
$$(\lambda u + \mu v) \cdot w = \lambda(u \cdot w) + \mu(v \cdot w), \tag{1.2.5}$$
where $\lambda$ and $\mu$ are any real scalars. Note that these properties of inner product are independent of any coordinate system.
Next, we will find an expression for the inner product in terms of the components of the vectors in cartesian coordinates. Let $\hat{\imath}, \hat{\jmath}, \hat{\kappa}$ be an orthonormal set forming a cartesian coordinate system. Any position vector $x = \overrightarrow{OP}$ can be expressed as $x = x_1 \hat{\imath} + x_2 \hat{\jmath} + x_3 \hat{\kappa}$, where $x_1 = x \cdot \hat{\imath}$, $x_2 = x \cdot \hat{\jmath}$ and $x_3 = x \cdot \hat{\kappa}$ are the cartesian coordinates of the point P. If $y$ is another vector, then $y = y_1 \hat{\imath} + y_2 \hat{\jmath} + y_3 \hat{\kappa}$ and by (1.2.5) and the orthonormal property of $\hat{\imath}, \hat{\jmath}, \hat{\kappa}$, we get
$$x \cdot y = x_1 y_1 + x_2 y_2 + x_3 y_3. \tag{1.2.6}$$
This formula expresses $x \cdot y$ in terms of the coordinates of $x$ and $y$. Together with (1.2.3), we can use the result for computing the angle between two vectors with known components in a given cartesian coordinate system.
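As an illustration of how (1.2.6) and (1.2.3) combine, here is a short sketch in plain Python. It is not from the original text: the function names `dot` and `angle` are illustrative, and the clamping step is an added safeguard against floating-point round-off.

```python
import math

def dot(x, y):
    # Formula (1.2.6): sum of products of cartesian components.
    return sum(a * b for a, b in zip(x, y))

def angle(u, v):
    # Formula (1.2.3) for non-zero vectors u and v; the result is in radians.
    c = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    c = max(-1.0, min(1.0, c))  # guard against round-off slightly outside [-1, 1]
    return math.acos(c)

x = (1.0, 0.0, 0.0)
y = (1.0, 1.0, 0.0)
print(dot(x, y))
print(math.degrees(angle(x, y)))
```

The vector `y` lies along the diagonal of the first quadrant of the plane of the first two coordinates, so the computed angle should be close to 45 degrees.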
In linear algebra and in some software packages, such as MATLAB, vectors are represented as column vectors, that is, as $3 \times 1$ matrices; for a summary of linear algebra, see page 451. If $x$ and $y$ are column vectors, then the transpose $x^T$ is a row vector ($1 \times 3$ matrix) and, by the rules of matrix multiplication, $x \cdot y = x^T y$.
EXERCISE 1.2.1.$^C$ Let $x$, $y$ be column vectors and $A$ a $3 \times 3$ matrix. Show that $Ax \cdot y = y^T A x = x^T A^T y = A^T y \cdot x$. Hint: $(AB)^T = B^T A^T$.
We can now summarize the main properties of the inner product:
(I1) $u \cdot u \ge 0$ and $u \cdot u = 0$ if and only if $u = 0$.
(I2) $(\lambda u + \mu v) \cdot w = \lambda(u \cdot w) + \mu(v \cdot w)$, where $\lambda$, $\mu$ are real numbers.
(I3) $u \cdot v = v \cdot u$.
(I4) $u \cdot v = 0$ if and only if $u \perp v$.
Property (I4) includes the possibility $u = 0$ or $v = 0$, because, by convention, the zero vector 0 does not have a specific direction and is therefore orthogonal to any vector. This is consistent with (I2): taking $\lambda = \mu = 1$ and $v = 0$, we also find $w \cdot 0 = 0$ for every $w$.
EXERCISE 1.2.2.$^C$ Prove the law of cosines: $a^2 = b^2 + c^2 - 2bc\cos\theta$, where $a$, $b$, $c$ are the sides of a triangle and $\theta$ is the angle between $b$ and $c$. Hint: Let $c = \|r_1\|$, $b = \|r_2\|$. Then $a^2 = \|r_2 - r_1\|^2 = (r_1 - r_2) \cdot (r_1 - r_2)$.
We now discuss some APPLICATIONS OF THE INNER PRODUCT. We start with the EQUATION OF A LINE IN $\mathbb{R}^2$. Choose an origin O and drop the perpendicular from O to the line L; see Figure 1.2.4.
Fig. 1.2.4 Line in The Plane
Let n be a unit vector lying on this perpendicular. For any point P on L, the position vector r satisfies
$$r \cdot n = d, \tag{1.2.7}$$
where $|d|$ is the distance from O to L; indeed, $|r \cdot n|$ is the length of the projection of $r$ on $n$. In a cartesian coordinate system $(x, y)$, $r = x\hat{\imath} + y\hat{\jmath}$, and equation (1.2.7) becomes $ax + by = d$, where $n = a\hat{\imath} + b\hat{\jmath}$. More generally, every equation of the form $a_1 x + a_2 y = a_3$, with real numbers $a_1, a_2, a_3$, defines a line in $\mathbb{R}^2$.
Similar arguments produce the EQUATION OF A PLANE IN $\mathbb{R}^3$. Let $n$ be a unit vector perpendicular to the plane. For any point P in the plane, the equation (1.2.7) holds again; Figure 1.2.4 represents the view in the plane spanned by the vectors $n$ and $r$ and containing points O, P. In a cartesian coordinate system $(x, y, z)$, $r = x\hat{\imath} + y\hat{\jmath} + z\hat{\kappa}$, and equation (1.2.7) becomes $ax + by + cz = d$, where $n = a\hat{\imath} + b\hat{\jmath} + c\hat{\kappa}$. More generally, every equation of the form $a_1 x + a_2 y + a_3 z = a_4$ defines a plane in $\mathbb{R}^3$ with a (not necessarily unit) normal vector $a_1 \hat{\imath} + a_2 \hat{\jmath} + a_3 \hat{\kappa}$. For alternative ways to represent a line and a plane see equations (1.1.5) and (1.1.6) on page 8.
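To connect the general form $a_1 x + a_2 y + a_3 z = a_4$ with the normalized form (1.2.7), one can divide the equation by the length of the normal vector. The sketch below does this in plain Python; the function name `normalize_plane` and the sample coefficients are hypothetical, and the coefficients $a_1, a_2, a_3$ are assumed not to vanish simultaneously.

```python
import math

def normalize_plane(a1, a2, a3, a4):
    # Rewrite a1*x + a2*y + a3*z = a4 as r . n = d with a unit normal n.
    length = math.sqrt(a1 * a1 + a2 * a2 + a3 * a3)
    if length == 0.0:
        raise ValueError("the normal vector must be non-zero")
    n = (a1 / length, a2 / length, a3 / length)
    d = a4 / length
    return n, d

# Sample plane x + 2y + 2z = 6 (illustrative numbers only).
n, d = normalize_plane(1.0, 2.0, 2.0, 6.0)
print(n, d, abs(d))  # abs(d) is the distance from the origin to the plane
```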
EXERCISE 1.2.3.$^C$ Using equation (1.2.7), write an equation of the plane that is 4 units from the origin and has the unit normal $n = (2, -1, 2)/3$. How many such planes are there?
EXERCISE 1.2.4.C Let 2x-y + 2z = 12 and x + y - z = 1 be the equations of two planes. Find the cosine of the angle between these planes.
Yet another application of the dot product is to computing the WORK DONE BY A FORCE. Let $F$ be a force vector acting on a mass $m$ and moving it through a displacement given by vector $r$. The work $W$ done by $F$ moving $m$ through this displacement is $W = F \cdot r$, since $\|F\|\cos\theta$ is the magnitude of the component of $F$ along $r$ and $\|r\|$ is the distance moved.
We will see later that, besides the position and force, many other mechanical quantities (acceleration, angular momentum, angular velocity, momentum, torque, velocity) can be represented as vectors.
To conclude our discussion of the dot product, we will do some ABSTRACT VECTOR ANALYSIS. The properties (I1)-(I3) of the inner product can be taken as axioms defining an inner product operation in any vector space. In other words, an inner product is a rule that assigns to any pair $u$, $v$ of vectors a real number $u \cdot v$ so that properties (I1)-(I3) hold. With this approach, the definition and properties of the inner product are independent of coordinate systems.
Consider the vector space $\mathbb{R}^n$ with a basis $\mathfrak{U} = (u_1, \ldots, u_n)$; see page 7. We can represent every element $x$ of $\mathbb{R}^n$ as an n-tuple $(x_1, \ldots, x_n)$ of the components of $x$ in the fixed basis. Clearly, for $y = (y_1, \ldots, y_n)$ and $\lambda \in \mathbb{R}$,
$$x + y = (x_1 + y_1, \ldots, x_n + y_n), \quad \lambda x = (\lambda x_1, \ldots, \lambda x_n).$$
We then define
$$x \cdot y = x_1 y_1 + \cdots + x_n y_n = \sum_{i=1}^{n} x_i y_i. \tag{1.2.8}$$
It is easy to verify that this definition satisfies (I1)-(I3). For $n = 3$ with a Cartesian basis, equation (1.2.6) is a special case of (1.2.8).
If an inner product is defined in a vector space, then in view of property (I1) we can define a norm or length of a vector by
$$\|u\| = (u \cdot u)^{1/2}. \tag{1.2.9}$$
While an inner product defines a norm, other norms in $\mathbb{R}^n$ exist that are not inner product-based; see Problem 1.8 on page 411.
An orthonormal basis in $\mathbb{R}^n$ is a basis consisting of pair-wise orthogonal vectors of unit length.
EXERCISE 1.2.5. Verify that, under definition (1.2.8), the corresponding basis $u_1, \ldots, u_n$ is necessarily orthonormal. Hint: argue that the basis vector $u_k$ is represented by an n-tuple with zeros everywhere except the position $k$.
EXERCISE 1.2.6.$^B$ Prove the parallelogram law:
$$\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2. \tag{1.2.10}$$
Show that in $\mathbb{R}^3$ this equality can be stated as follows: in a parallelogram, the sum of the squares of the diagonals is equal to the sum of the squares of the sides (hence the name "parallelogram law").
Theorem 1.2.1 The norm defined by (1.2.9) satisfies the triangle inequality
$$\|u + v\| \le \|u\| + \|v\| \tag{1.2.11}$$
and the Cauchy-Schwarz inequality
$$|u \cdot v| \le \|u\|.\|v\|. \tag{1.2.12}$$
Proof. We first show that (1.2.11) follows from (1.2.12). Indeed,
$$\|u + v\|^2 = (u + v) \cdot (u + v) = \|u\|^2 + 2(u \cdot v) + \|v\|^2 \le \|u\|^2 + 2|u \cdot v| + \|v\|^2 \le \|u\|^2 + 2\|u\|.\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2.$$
To prove (1.2.12), first suppose $u$ and $v$ are unit vectors. By properties (I1)-(I3) of the inner product, for any scalar $\lambda$,
$$0 \le (u + \lambda v) \cdot (u + \lambda v) = u \cdot u + 2\lambda\, u \cdot v + \lambda^2\, v \cdot v = 1 + 2\lambda\, u \cdot v + \lambda^2.$$
Now, take $\lambda = -(u \cdot v)$. Then $0 \le 1 - 2(u \cdot v)^2 + (u \cdot v)^2 = 1 - (u \cdot v)^2$. Hence, $|u \cdot v| \le 1$. On the other hand, for all non-zero vectors $u$ and $v$, $u = \|u\|\, u/\|u\|$ and $v = \|v\|\, v/\|v\|$. Since $u/\|u\|$ and $v/\|v\|$ are unit vectors, we have $|u \cdot v|/(\|u\|\,\|v\|) \le 1$, and so $|u \cdot v| \le \|u\|\,\|v\|$.
If either $u$ or $v$ is a zero vector, then (1.2.12) trivially holds. Theorem 1.2.1 is proved.
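A numerical experiment cannot replace the proof, but it can illustrate inequalities (1.2.11) and (1.2.12). The sketch below, in plain Python, tests both on randomly generated vectors; the function name `check_inequalities`, the tolerance, the sample size, and the random seed are all arbitrary choices made for this illustration.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # Norm induced by the inner product, as in (1.2.9).
    return math.sqrt(dot(x, x))

def check_inequalities(u, v, tol=1e-12):
    # Cauchy-Schwarz inequality (1.2.12) and triangle inequality (1.2.11),
    # with a small tolerance for floating-point round-off.
    cauchy_schwarz = abs(dot(u, v)) <= norm(u) * norm(v) + tol
    s = tuple(a + b for a, b in zip(u, v))
    triangle = norm(s) <= norm(u) + norm(v) + tol
    return cauchy_schwarz and triangle

random.seed(0)
results = []
for _ in range(1000):
    u = tuple(random.uniform(-1.0, 1.0) for _ in range(3))
    v = tuple(random.uniform(-1.0, 1.0) for _ in range(3))
    results.append(check_inequalities(u, v))
all_ok = all(results)
print(all_ok)
```

For a pair such as `(1, 2, 3)` and `(2, 4, 6)`, where one vector is a scalar multiple of the other, the two sides of (1.2.12) agree up to round-off.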
Remark 1.2 Analysis of the proof of Theorem 1.2.1 shows that equality in either (1.2.11) or (1.2.12) holds if and only if one of the vectors is a scalar multiple of the other: $u = \lambda v$ or $v = \lambda u$ for some real number $\lambda$ (with $\lambda \ge 0$ in the case of (1.2.11)); we have to write two conditions to allow either $u$ or $v$, or both, to be the zero vector.
EXERCISE 1.2.7.$^C$ Choose a Cartesian coordinate system $(x, y, z)$ with the corresponding unit basis vectors $(\hat{\imath}, \hat{\jmath}, \hat{\kappa})$. Let P, Q be points with coordinates $(1, -3, 2)$ and $(-2, 4, -1)$, respectively. Define $u = \overrightarrow{OP}$, $v = \overrightarrow{OQ}$. (a) Compute $\overrightarrow{QP} = u - v$, $\|u\|$, and $\|v\|$. Compute the angle between $u$ and $v$. Verify the Cauchy-Schwarz inequality and the triangle inequality. (b) Let $w = 2\hat{\imath} + 4\hat{\jmath} - 5\hat{\kappa}$. Check that the associative law holds for $u$, $v$, $w$. (c) Suppose $u$ is a force vector. Compute the component of $u$ in the $v$ direction. Suppose $v$ is the displacement of a unit mass acted on by the force $u$. Compute the work done.
Inequality (1.2.12) is also known as the Cauchy-Bunyakovsky-Schwarz inequality, and all three possible combinations of any two of these three names can also refer to the same or similar inequality. This inequality is extremely useful in many areas of mathematics, and all three, Cauchy, Bunyakovsky, and Schwarz, certainly deserve to be mentioned in connection with it. The Russian mathematician VIKTOR YAKOVLEVICH BUNYAKOVSKY (1804-1889) and the German mathematician HERMANN AMANDUS SCHWARZ (1843-1921) discovered a version of (1.2.12) for integrals:
$$\int |f(x) g(x)|\,dx \le \left(\int f^2(x)\,dx\right)^{1/2} \left(\int g^2(x)\,dx\right)^{1/2}; \tag{1.2.13}$$
Bunyakovsky published it in 1859, Schwarz, most probably unaware of Bunyakovsky's work, in 1884. The French mathematician AUGUSTIN LOUIS CAUCHY (1789-1857) has his name attached not just to (1.2.12) but to many other mathematical results. There are two main reasons for that: he was the first to introduce modern standards of rigor in mathematical proofs, and he published a lot of papers (789 to be exact, some exceeding 300 pages), covering most areas of mathematics. We will be mentioning Cauchy a lot during our discussion of complex analysis. Throughout the rest of our discussions, we will refer to (1.2.12) and all its modifications as the Cauchy-Schwarz inequality.
EXERCISE 1.2.8. (a) Use the same arguments as in the proof of (1.2.12) to establish (1.2.13). (b) Use the same arguments as in the proof of (1.2.12) to establish the following version of the Cauchy-Schwarz inequality:
$$\sum_{k=1}^{\infty} |a_k b_k| \le \left(\sum_{k=1}^{\infty} a_k^2\right)^{1/2} \left(\sum_{k=1}^{\infty} b_k^2\right)^{1/2}. \tag{1.2.14}$$
In both parts (a) and (b), assume all the necessary integrability and convergence.
We conclude this section with a brief discussion of transformations of a linear vector space. We will see later that a mathematical model of the motion of an object in space is a special transformation of $\mathbb{R}^3$.
Definition 1.3 A transformation $A$ of the space $\mathbb{R}^n$, $n \ge 2$, is a rule that assigns to every element $x$ of $\mathbb{R}^n$ a unique element $A(x)$ from $\mathbb{R}^n$. When there is no danger of confusion, we write $Ax$ instead of $A(x)$. A transformation $A$ is called an isometry if it preserves the distances between points: $\|Ax - Ay\| = \|x - y\|$ for all $x$, $y$ in $\mathbb{R}^n$. A transformation $A$ is called linear if $A(\lambda x + \mu y) = \lambda A(x) + \mu A(y)$ for all $x$, $y$ from $\mathbb{R}^n$ and all real numbers $\lambda$, $\mu$. A transformation is called orthogonal if it is both a linear transformation and an isometry.
The two Latin roots in the word "transformation," trans and forma, mean "beyond" and "shape," respectively. The two Greek roots in the word "isometry", isos and metron, mean "equal" and "measure." We know from linear algebra that, in $\mathbb{R}^n$ with a fixed basis, every linear transformation is represented by a square matrix; see Exercise 8.1.4, page 453, in Appendix.
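A rotation about a coordinate axis is a familiar example of an orthogonal transformation. The following sketch, in plain Python, represents such a rotation by a $3 \times 3$ matrix and checks numerically that it preserves distances; it also prints the inner products before and after the transformation for comparison. The names `rotation_z` and `matvec`, the rotation angle, and the sample vectors are illustrative assumptions, not taken from the text.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def matvec(A, x):
    # Product of a 3x3 matrix (tuple of rows) with a vector.
    return tuple(dot(row, x) for row in A)

def rotation_z(theta):
    # Matrix of the rotation by angle theta about the third coordinate axis.
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

A = rotation_z(0.7)
x = (1.0, 2.0, 3.0)
y = (-1.0, 0.0, 4.0)
Ax, Ay = matvec(A, x), matvec(A, y)

diff_before = norm(tuple(a - b for a, b in zip(x, y)))
diff_after = norm(tuple(a - b for a, b in zip(Ax, Ay)))
print(diff_before, diff_after)   # distances agree up to round-off
print(dot(x, y), dot(Ax, Ay))    # inner products agree up to round-off
```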
EXERCISE 1.2.9. (a) Show that if $A$ is a linear transformation, then $A(0) = 0$. Hint: use that $0 = \lambda 0$ for all real $\lambda$. (b) Show that the transformation $A$ is orthogonal if and only if it preserves the inner product: $(Ax) \cdot (Ay) = x \cdot y$ for all $x$, $y$ from $\mathbb{R}^n$. Hint: use the parallelogram law (1.2.10).
1.2.2 Cross Product
In the three-dimensional vector space $\mathbb{R}^3$, we used Euclidean geometry and trigonometry to define the inner product of two vectors. This definition easily extends to every $\mathbb{R}^n$, $n \ge 2$. In $\mathbb{R}^3$, and only in $\mathbb{R}^3$, there exists another product of two vectors, called the cross product, or vector product.
Definition 1.4 Let $u$ and $v$ be two vectors in $\mathbb{R}^3$. Let $\theta$ be the angle between $u$ and $v$ ($0 \le \theta \le \pi$, see Figure 1.2.1). The cross product, $u \times v$, is the vector having magnitude $\|u \times v\| = \|u\|.\|v\| \sin\theta$ and lying on the line perpendicular to $u$ and $v$ and pointing in the direction in which a right-handed screw would move when $u$ is rotated toward $v$ through angle $\theta$.
Sometimes, the symbol $\odot$ is used to represent a vector perpendicular to the plane and coming out of the plane toward the observer, while the symbol $\otimes$ represents a similar vector, but going away from the observer; see Figure 1.2.6.
The triple $(u, v, u \times v)$ forms a right-handed triad (Figure 1.2.5). More generally, we say that an ordered triplet of vectors $(u, v, w)$ with a common origin in $\mathbb{R}^3$ is a right-handed triad (or right-handed triple) if the vectors are not in the same plane and the shortest turn from $u$ to $v$, as seen from the tip of $w$, is counterclockwise.
Fig. 1.2.5 The Cross Product I
Fig. 1.2.6 The Cross Product II
An important application of the cross product in mechanics is the moment of a force about a point O. Suppose an object located at a point P is subjected to a force vector $F$, applied at P. Let $r$ be the position vector of P. The force $F$ tends to rotate the object around O and exerts a torque, or moment, $T$ around O. (The Latin verb torquere means "to twist.") The magnitude of the torque $T$ is $\|T\| = \|r\|.\|F\| \sin\theta$, where $\theta$ is the angle between $r$ and $F$; recall that $a.b$ denotes the usual product of two numbers $a$, $b$. The quantity $\|F\| \sin\theta$ is the magnitude of the component of $F$ perpendicular to $r$. (The component of $F$ along $r$ has no rotational effect.) The magnitude $\|r\|$ is called the moment arm. Our experience with levers convinces us that the torque magnitude is proportional to the moment arm and the magnitude of force applied perpendicular to the arm. Hence, we define the torque of $F$ around O to be the vector $T = r \times F$, where $r$ is the position at which $F$ is applied. The direction of $T$ is perpendicular to $r$ and $F$ and $(r, F, T)$ is a right-handed triad.
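The torque $T = r \times F$ is easy to evaluate once the components of $r$ and $F$ are known. The sketch below uses the component formula for the cross product that appears as (1.2.16) in Theorem 1.2.2; the function names and the sample values of `r` and `F` are illustrative assumptions.

```python
import math

def cross(u, v):
    # Components of u x v in a right-handed cartesian system, as in (1.2.16).
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(x):
    return math.sqrt(sum(c * c for c in x))

r = (1.0, 0.0, 0.0)   # position of the point where the force is applied
F = (0.0, 2.0, 0.0)   # force perpendicular to r
T = cross(r, F)       # torque of F around the origin
print(T, norm(T))
```

Here $F$ is perpendicular to $r$, so $\sin\theta = 1$ and the magnitude of the torque equals $\|r\|.\|F\| = 2$; the torque points along the third coordinate axis.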
PROPERTIES OF THE CROSS PRODUCT. From the definition it follows immediately that the vector $w = u \times v$ has the following three properties:
(C1) $\|w\| = \|u\|.\|v\| \sin\theta$.
(C2) $w \cdot u = w \cdot v = 0$.
(C3) $-w = v \times u$.
A fourth property captures the geometry of the right-handed screw in algebraic terms. Choose any right-handed cartesian coordinate system given by three orthonormal vectors $\hat{\imath}, \hat{\jmath}, \hat{\kappa}$. Suppose the components of the vectors $u$, $v$, $w = u \times v$ in the basis $(\hat{\imath}, \hat{\jmath}, \hat{\kappa})$ are, respectively, $(u_1, u_2, u_3)$, $(v_1, v_2, v_3)$, and $(w_1, w_2, w_3)$. Then
(C4) $\det \begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix} > 0$,
where det is the determinant of the matrix; a brief review of linear algebra, including the determinants, is in Appendix. To prove (C4), choose $\hat{\kappa}' = w/\|w\|$, $\hat{\jmath}' = v/\|v\|$, and select a unit vector $\hat{\imath}'$ orthogonal to both $\hat{\kappa}'$ and $\hat{\jmath}'$ to make $(\hat{\imath}', \hat{\jmath}', \hat{\kappa}')$ a right-handed triad. In this new coordinate system, property (C4) becomes
$$\det \begin{pmatrix} u_1' & u_2' & 0 \\ 0 & \|v\| & 0 \\ 0 & 0 & \|w\| \end{pmatrix} = u_1' \|v\|.\|w\| > 0. \tag{1.2.15}$$
Since $(u, v, w)$ is a right-handed triad, the choice of $\hat{\imath}'$ implies that $u_1' > 0$, and (1.2.15) holds. For the system $\hat{\imath}, \hat{\jmath}, \hat{\kappa}$ with the same origin as $(\hat{\imath}', \hat{\jmath}', \hat{\kappa}')$, consider an orthogonal transformation that moves the basis vectors $\hat{\imath}, \hat{\jmath}, \hat{\kappa}$ to the vectors $\hat{\imath}', \hat{\jmath}', \hat{\kappa}'$, respectively. If $B$ is the matrix representing this transformation in the basis $(\hat{\imath}, \hat{\jmath}, \hat{\kappa})$, then $\det B = 1$, and the two matrices, $A$ in (C4) and $A'$ in (1.2.15), are related by $A' = BAB^T$. Hence, $\det A = \det A' > 0$ and (C4) holds.
EXERCISE 1.2.10. Verify that $A' = BAB^T$. Hint: see Exercise 8.1.4 on page 453 in Appendix. Pay attention to the basis in which each matrix is written.
The following theorem shows that the properties (C1), (C2), and (C4) define a unique vector $w = u \times v$.
Theorem 1.2.2 For every two non-zero, non-parallel vectors $u$, $v$ in $\mathbb{R}^3$, there is a unique vector $w = u \times v$ satisfying (C1), (C2), (C4). If $(u_1, u_2, u_3)$ and $(v_1, v_2, v_3)$ are the components of $u$ and $v$ in a cartesian right-handed system $\hat{\imath}, \hat{\jmath}, \hat{\kappa}$, then the components $w_1, w_2, w_3$ of $u \times v$ are
$$w_1 = u_2 v_3 - u_3 v_2, \quad w_2 = u_3 v_1 - u_1 v_3, \quad w_3 = u_1 v_2 - u_2 v_1. \tag{1.2.16}$$
Conversely, the vector with components defined by (1.2.16) has Properties (C1), (C2), and (C4).
Proof. Let $w$ be a vector so that $w \cdot u = 0$ and $w \cdot v = 0$, that is, $w$ is orthogonal to both $u$ and $v$. By the geometry of $\mathbb{R}^3$, there is such a vector. Choose a $w$ with magnitude $\|w\| = \|u\|\,\|v\| \sin\theta$, satisfying (C1). By (C2),
$$u_1 w_1 + u_2 w_2 + u_3 w_3 = 0, \tag{1.2.17}$$
$$v_1 w_1 + v_2 w_2 + v_3 w_3 = 0. \tag{1.2.18}$$
Multiply (1.2.17) by $v_3$ and (1.2.18) by $u_3$ and subtract to get
$$\underbrace{(u_1 v_3 - u_3 v_1)}_{a}\, w_1 = \underbrace{(u_3 v_2 - u_2 v_3)}_{b}\, w_2. \tag{1.2.19}$$
Similarly, multiply (1.2.17) by $v_1$ and (1.2.18) by $u_1$ and subtract to get
$$\underbrace{(v_2 u_1 - v_1 u_2)}_{c}\, w_2 = \underbrace{(u_3 v_1 - u_1 v_3)}_{-a}\, w_3. \tag{1.2.20}$$
Abbreviating, let $a = u_1 v_3 - u_3 v_1$, $b = u_3 v_2 - u_2 v_3$ and $c = u_1 v_2 - v_1 u_2$. Then (1.2.19) and (1.2.20) yield
$$w_1 = (b/a) w_2; \quad w_3 = (-c/a) w_2. \tag{1.2.21}$$
Hence, $\|w\|^2 = (b/a)^2 w_2^2 + w_2^2 + (c/a)^2 w_2^2$ and
$$\|w\|^2 = (1 + (b^2 + c^2)/a^2)\, w_2^2 = (a^2 + b^2 + c^2)(w_2^2/a^2). \tag{1.2.22}$$
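Now, by simple algebra,
$$\begin{aligned} a^2 + b^2 + c^2 &= (u_1 v_3 - u_3 v_1)^2 + (u_3 v_2 - u_2 v_3)^2 + (u_1 v_2 - v_1 u_2)^2 \\ &= (u_1^2 + u_2^2 + u_3^2)(v_1^2 + v_2^2 + v_3^2) - (u_1 v_1 + u_2 v_2 + u_3 v_3)^2 \\ &= \|u\|^2 \|v\|^2 - (u \cdot v)^2 = \|u\|^2 \|v\|^2 (1 - \cos^2\theta) = \|u\|^2 \|v\|^2 \sin^2\theta. \end{aligned} \tag{1.2.23}$$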
Applying Property (C1), we get $a^2 + b^2 + c^2 = \|w\|^2$. Using (1.2.22), $w_2^2/a^2 = 1$. Hence, $w_2 = \pm a$ and by (1.2.21), $w_1 = \pm b$ and $w_3 = \mp c$. To determine the signs, consider the special case $u = \hat{\imath}$ and $v = \hat{\jmath}$. Then $u_1 = 1$, $u_2 = 0$ and $v_1 = 0$, $v_2 = 1$ and $c = 1 \cdot 1 - 0 \cdot 0 = 1$. On the other hand, the determinant in Property (C4) for this choice of $u$ and $v$ is
$$\det \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \pm b & \pm a & \mp c \end{pmatrix} = \mp c,$$
depending on whether $w_3 = -1$ or $w_3 = 1$. Since the determinant must be positive, we must take $w_2 = -a$, in order to make $w_3 = c = 1$ in this case. This implies $w_1 = -b$. Therefore, $w$ is uniquely determined and has components $w_1 = -b$, $w_2 = -a$, $w_3 = c$. In other words, there exists a unique vector with the properties (C1), (C2), and (C4), and its components are given by (1.2.16).
Conversely, let $w$ be a vector with components given by (1.2.16). Then direct computations show that $w$ has the properties (C2) and (C4). After that, we repeat the calculations in (1.2.23) to establish Property (C1). The details of this argument are the subject of Problem 1.3 on page 410.
Theorem 1.2.2 is proved.
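The converse part of Theorem 1.2.2 can be illustrated with exact integer arithmetic. The following sketch computes $w$ from (1.2.16) for one pair of vectors and evaluates the quantities appearing in (C1), (C2), and (C4); the helper names `cross`, `dot`, and `det3` and the sample vectors are choices made for this example. Property (C1) is checked in the squared form $\|w\|^2 = \|u\|^2\|v\|^2 - (u \cdot v)^2$ from (1.2.23) to avoid square roots.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(u, v):
    # Components of u x v according to (1.2.16).
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(a, b, c):
    # Determinant of the 3x3 matrix with rows a, b, c (expansion along the first row).
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

u = (2, -1, 1)
v = (1, 3, -2)
w = cross(u, v)

c1_holds = dot(w, w) == dot(u, u) * dot(v, v) - dot(u, v) ** 2   # squared form of (C1)
c2_holds = dot(w, u) == 0 and dot(w, v) == 0                      # (C2)
c4_holds = det3(u, v, w) > 0                                      # (C4)
print(w, c1_holds, c2_holds, c4_holds)
```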
Remark 1.3 Formula (1.2.16) can be represented symbolically by
$$u \times v = \det \begin{pmatrix} \hat{\imath} & \hat{\jmath} & \hat{\kappa} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{pmatrix} \tag{1.2.24}$$
and expanding the determinant by co-factors of the first row. Together with properties of the determinant, this representation implies Property (C3) of the cross product. Also, when combined with (C1), formula (1.2.24) can be used to compute the angle between two vectors with known components. Still, given the extra complexity of evaluating the determinant, the inner product formulas (1.2.3) and (1.2.6) are usually more convenient for angle computations.
Remark 1.4 From (1.2.16) it follows that $(\lambda u) \times v = \lambda(u \times v) = u \times (\lambda v)$ for any scalar $\lambda$. Another consequence of (1.2.16) is the distributive property of the cross product:
$$r \times (u + v) = r \times u + r \times v. \tag{1.2.25}$$
Still, the cross product is not associative; instead, the following identity holds:
$$u \times (v \times w) + v \times (w \times u) + w \times (u \times v) = 0. \tag{1.2.26}$$
EXERCISE 1.2.11.$^A$ Prove that
$$u \times (v \times w) = (u \cdot w) v - (u \cdot v) w. \tag{1.2.27}$$
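Then use the result to verify (1.2.26). Hint: A possible proof of (1.2.27) is as follows (fill in the details). Choose an orthonormal basis $\hat{\imath}, \hat{\jmath}, \hat{\kappa}$ so that $\hat{\imath}$ is parallel to $w$ and $\hat{\jmath}$ is in the plane of $w$ and $v$. Then $w = w_1 \hat{\imath}$ and $v = v_1 \hat{\imath} + v_2 \hat{\jmath}$ and
$$v \times w = \det \begin{pmatrix} \hat{\imath} & \hat{\jmath} & \hat{\kappa} \\ v_1 & v_2 & 0 \\ w_1 & 0 & 0 \end{pmatrix} = -v_2 w_1 \hat{\kappa};$$
$$u \times (v \times w) = \det \begin{pmatrix} \hat{\imath} & \hat{\jmath} & \hat{\kappa} \\ u_1 & u_2 & u_3 \\ 0 & 0 & -v_2 w_1 \end{pmatrix} = -u_2 v_2 w_1 \hat{\imath} + u_1 v_2 w_1 \hat{\jmath};$$
$$(u \cdot w) v - (u \cdot v) w = u_1 w_1 (v_1 \hat{\imath} + v_2 \hat{\jmath}) - (u_1 v_1 + u_2 v_2) w_1 \hat{\imath} = -u_2 v_2 w_1 \hat{\imath} + u_1 v_2 w_1 \hat{\jmath}.$$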
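Identities (1.2.26) and (1.2.27) can also be spot-checked on concrete vectors before attempting a proof. The sketch below does this with integer components, so the comparison is exact; the helper names and the three sample vectors are arbitrary choices for illustration, and a check on particular vectors is of course not a proof.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(u, v):
    # Components of u x v according to (1.2.16).
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def scale(s, x):
    return tuple(s * a for a in x)

u, v, w = (1, 0, 2), (0, 3, -1), (2, 1, 1)

# Identity (1.2.27): u x (v x w) = (u . w) v - (u . v) w.
lhs = cross(u, cross(v, w))
rhs = add(scale(dot(u, w), v), scale(-dot(u, v), w))

# Identity (1.2.26): the cyclic sum of double cross products is the zero vector.
cyclic_sum = add(add(cross(u, cross(v, w)), cross(v, cross(w, u))),
                 cross(w, cross(u, v)))
print(lhs == rhs, cyclic_sum)
```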
While the properties (C1)-(C4) of the cross product are independent of the coordinate system, the definition does not generalize to $\mathbb{R}^n$ for $n \ge 4$ because in dimension $n \ge 4$ there are too many vectors orthogonal to two given vectors.
Property (C1) implies that $\|u \times v\|$ is the area of the parallelogram generated by the vectors $u$ and $v$. Accordingly, we have $u \times v = 0$ if and only if one of the vectors is a scalar multiple of the other. If $P_1$, $P_2$, $P_3$ are three points in $\mathbb{R}^3$, these points are collinear (lie on the same line) if and only if
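$$\overrightarrow{P_1 P_2} \times \overrightarrow{P_1 P_3} = 0, \tag{1.2.28}$$
where $\overrightarrow{P_i P_j} = \overrightarrow{OP_j} - \overrightarrow{OP_i}$. If $(x_i, y_i, z_i)$ are the cartesian coordinates of the point $P_i$, then the criterion for collinearity (1.2.28) becomes
$$\det \begin{pmatrix} \hat{\imath} & \hat{\jmath} & \hat{\kappa} \\ x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \end{pmatrix} = 0. \tag{1.2.29}$$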
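The collinearity criterion (1.2.28) translates directly into a small test. The sketch below assumes points with integer coordinates, so that the comparison with the zero vector is exact; with floating-point coordinates one would compare against a tolerance instead. The function name `collinear` and the sample points are hypothetical.

```python
def cross(u, v):
    # Components of u x v according to (1.2.16).
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def collinear(p1, p2, p3):
    # Criterion (1.2.28): P1P2 x P1P3 is the zero vector.
    p1p2 = tuple(b - a for a, b in zip(p1, p2))
    p1p3 = tuple(b - a for a, b in zip(p1, p3))
    return cross(p1p2, p1p3) == (0, 0, 0)

print(collinear((0, 0, 0), (1, 2, 3), (2, 4, 6)))
print(collinear((0, 0, 0), (1, 2, 3), (2, 4, 7)))
```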
In the following three exercises, the reader will see how the mathematics of vector algebra can be used to solve problems in physics.
EXERCISE 1.2.12.$^C$ Suppose two forces $F_1$, $F_2$ are applied at P; $r = \overrightarrow{OP}$. Show that the total torque at P is $T = T_1 + T_2$, where $T_1 = r \times F_1$ and $T_2 = r \times F_2$.
EXERCISE 1.2.13.$^A$ Consider a rigid rod with one end fixed at the origin O but free to rotate in any direction around O (say by means of a ball joint). Denote by P the other end of the rod; $r = \overrightarrow{OP}$. Suppose a force $F$ is applied at the point P. The rod will tend to rotate around O. (a) Let $r = 2\hat{\imath} + 3\hat{\jmath} + \hat{\kappa}$ and $F = \hat{\imath} + \hat{\jmath} + \hat{\kappa}$. Compute the torque $T$. (b) Let $r = 2\hat{\imath} + 4\hat{\jmath}$ and $F = \hat{\imath} + \hat{\jmath}$, so that the rotation is in the $(\hat{\imath}, \hat{\jmath})$ plane. Compute $T$. In which direction will the rod start to rotate?
EXERCISE 1.2.14.$^A$ Suppose a rigid rod is placed in the $(\hat{\imath}, \hat{\jmath})$ plane so that the mid-point of the rod is at the origin O, and the two ends P and $P_1$ have position vectors $r = \hat{\imath} + 2\hat{\jmath}$ and $r_1 = -\hat{\imath} - 2\hat{\jmath}$. Suppose the rod is free to rotate around O in the $(\hat{\imath}, \hat{\jmath})$ plane. Let $F = \hat{\imath} + \hat{\jmath}$ and $F_1 = -\hat{\imath} - \hat{\jmath}$ be two forces applied at P and $P_1$, respectively. Compute the total torque around O. In which direction will the rod start to rotate?
1.2.3 Scalar Triple Product
The scalar triple product $(u, v, w)$ of three vectors is defined by
$$(u, v, w) = u \cdot (v \times w).$$
Using (1.2.24) it is easy to see that, in cartesian coordinates,
$$(u, v, w) = \det \begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix}.$$
From the properties of determinants it follows that $(u, v, w) = -(v, u, w) = (v, w, u) = (w, u, v)$. Thus,
$$u \cdot (v \times w) = w \cdot (u \times v) = (u \times v) \cdot w. \tag{1.2.30}$$
In other words, the scalar triple product does not change under cyclic permutation of the vectors or when • and x symbols are switched.
EXERCISE 1.2.15. Verify that the ordered triplet of non-zero vectors $u$, $v$, $w$ is a right-handed triad if and only if $(u, v, w) > 0$.
Recall that $\|v \times w\| = \|v\|.\|w\| \sin\theta$ is the area of the parallelogram formed by $v$ and $w$. Therefore, $|u \cdot (v \times w)|$ is the volume of the parallelepiped formed by $u$, $v$, and $w$. Accordingly, $(u, v, w) = 0$ if and only if the three vectors are linearly dependent, that is, one of them can be expressed as a linear combination of the other two. Similarly, four points $P_i$, $i = 1, \ldots, 4$, are co-planar (lie in the same plane) if and only if
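$$(\overrightarrow{P_1 P_2}, \overrightarrow{P_1 P_3}, \overrightarrow{P_1 P_4}) = 0, \tag{1.2.31}$$
where $\overrightarrow{P_i P_j} = \overrightarrow{OP_j} - \overrightarrow{OP_i}$. If $(x_i, y_i, z_i)$ are the cartesian coordinates of the point $P_i$, then (1.2.31) becomes
$$\det \begin{pmatrix} x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \\ x_4 - x_1 & y_4 - y_1 & z_4 - z_1 \end{pmatrix} = 0. \tag{1.2.32}$$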
Notice a certain analogy with (1.2.28) and (1.2.29).
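The scalar triple product gives both the volume of a parallelepiped and the co-planarity test (1.2.31). The following sketch computes it as $u \cdot (v \times w)$ for vectors with integer components; the function names `triple` and `coplanar` and the sample points are illustrative assumptions, and exact comparison with zero is only appropriate for integer or rational data.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(u, v):
    # Components of u x v according to (1.2.16).
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triple(u, v, w):
    # Scalar triple product (u, v, w) = u . (v x w).
    return dot(u, cross(v, w))

def coplanar(p1, p2, p3, p4):
    # Criterion (1.2.31) applied to the vectors P1P2, P1P3, P1P4.
    d = [tuple(b - a for a, b in zip(p1, p)) for p in (p2, p3, p4)]
    return triple(d[0], d[1], d[2]) == 0

# Unit cube edges: the parallelepiped has volume 1.
volume = abs(triple((1, 0, 0), (0, 1, 0), (0, 0, 1)))
print(volume)
print(coplanar((0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)))
print(coplanar((0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)))
```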
EXERCISE 1.2.16.$^C$ Let $u = (1, 2, 3)$, $v = (-2, 1, 2)$, $w = (-1, 2, 1)$. (a) Compute $u \times v$, $v \times w$, $(u \times v) \times (v \times w)$. (b) Compute the area of the parallelogram formed by $u$ and $v$. (c) Compute the volume of the parallelepiped formed by $u$, $v$, $w$ using the triple product $(u, v, w)$.
1.3 Curves in Space
1.3.1 Vector-Valued Functions of a Scalar Variable
To study the mathematical kinematics of moving bodies in $\mathbb{R}^3$, we need to define the velocity and acceleration vectors. The rigorous definition of these vectors relies on the concept of the derivative of a vector-valued function with respect to a scalar. We consider an idealized object, called a point mass, with all mass concentrated at a single point.
Choose an origin O and let $r(t)$ be the position vector of the point mass at time $t$. The collection of points $P(t)$ so that $\overrightarrow{OP(t)} = r(t)$ is the trajectory of the point mass. This trajectory is a curve in $\mathbb{R}^3$. More generally, a curve C is defined by specifying the position vector of a point P on C as a function of a scalar variable $t$.
Definition 1.5 A curve C in a frame O in $\mathbb{R}^3$ is the collection of points defined by a vector-valued function $r = r(t)$, for $t$ in some interval $I$ in $\mathbb{R}$, bounded or unbounded. A point P is on the curve C if and only if $\overrightarrow{OP} = r(t_0)$ for some $t_0 \in I$. A curve is called simple if it does not intersect or touch itself. A curve is called closed if it is defined for $t$ in a bounded closed interval $I = [a, b]$ and $r(a) = r(b)$. For a simple closed curve on $[a, b]$, we have $r(t_1) = r(t_2)$, $a \le t_1 < t_2 \le b$, if and only if $t_1 = a$ and $t_2 = b$.
By analogy with elementary calculus, we say that the vector function $r$ is continuous at $t_0$ if
$$\lim_{t \to t_0} \|r(t) - r(t_0)\| = 0. \tag{1.3.1}$$
Accordingly, we say that the curve C is continuous if the vector function that defines C is continuous.
Similarly, the derivative at $t_0$ of a vector-valued function $r(t)$ is, by definition,
$$\frac{dr}{dt}\Big|_{t = t_0} = r'(t_0) = \lim_{\Delta t \to 0} \frac{r(t_0 + \Delta t) - r(t_0)}{\Delta t}. \tag{1.3.2}$$
We say that $r$ is differentiable at $t_0$ if the derivative $r'(t)$ exists at $t_0$; we say that $r$ is differentiable on $(a, b)$ if $r'(t)$ exists for all $t \in (a, b)$. We say that the curve is smooth if the corresponding vector function is differentiable and the derivative is not a zero vector.
Yet another notation for the derivative $r'(t)$ is $\dot{r}(t)$, especially when the parameter $t$ is interpreted as time. For a scalar function of time $x = x(t)$, the same notations for the derivative are used:
$$\frac{dx}{dt} = x'(t) = \dot{x}(t).$$
Note that $r(t + \Delta t) - r(t) = \Delta r(t)$ is a vector in the same frame O. The limits in (1.3.1) and (1.3.2) are defined by using the distance, or metric, for vectors. Thus, $\lim_{t \to t_0} r(t) = r(t_0)$ means that $\|r(t) - r(t_0)\| \to 0$ as $t \to t_0$. The derivative $r'(t)$, being the limit of the difference quotient $\Delta r(t)/\Delta t$ as $\Delta t \to 0$, is also a vector.
Given a fixed frame O, the formulas of differential calculus for vector functions in this frame are easily obtained by following the corresponding derivations for scalar functions in ordinary calculus. As in ordinary calculus, there are several rules for computing derivatives of vector-valued functions. All these rules follow directly from the definition (1.3.2). The derivative of a sum:
$$\frac{d}{dt}(u(t) + v(t)) = u'(t) + v'(t). \tag{1.3.3}$$
Product rule for multiplication by a scalar: if $\lambda(t)$ is a scalar function, then
$$\frac{d}{dt}(\lambda(t) r(t)) = \lambda'(t) r(t) + \lambda(t) r'(t). \tag{1.3.4}$$
Product rules for scalar and cross products:
$$\frac{d}{dt}(u(t) \cdot v(t)) = \frac{du}{dt} \cdot v + u \cdot \frac{dv}{dt}, \tag{1.3.5}$$
and
$$\frac{d}{dt}(u(t) \times v(t)) = \frac{du}{dt} \times v + u \times \frac{dv}{dt}. \tag{1.3.6}$$
The chain rule: If $t = \phi(s)$ and $r_1(s) = r(\phi(s))$, then
$$\frac{dr_1}{ds} = \frac{dr}{dt}\,\frac{d\phi}{ds}. \tag{1.3.7}$$
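From the two rules (1.3.3) and (1.3.4), it follows that if $(\hat{\imath}, \hat{\jmath}, \hat{\kappa})$ are constant vectors in the frame O so that $r(t) = x(t)\hat{\imath} + y(t)\hat{\jmath} + z(t)\hat{\kappa}$, then $r'(t) = x'(t)\hat{\imath} + y'(t)\hat{\jmath} + z'(t)\hat{\kappa}$.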
Remark 1.5 The underlying assumption in the above rules for differentiation of vector functions is that all the functions are defined in the same frame. We will see later that these rules for computing derivatives can fail if the vectors are defined in different frames and the frames are moving relative to each other.
Lemma 1.1 If $r$ is differentiable on $(a, b)$ and $\|r(t)\|$ does not depend on $t$ for $t \in (a, b)$, then $r(t) \perp r'(t)$ for all $t \in (a, b)$. In other words, the derivative of a constant-length vector is perpendicular to the vector itself.
Proof. By assumption, $r(t) \cdot r(t)$ is constant for all $t$. By the product rule (1.3.5), $2 r'(t) \cdot r(t) = 0$ and the result follows.
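Lemma 1.1 can be illustrated numerically by approximating the derivative with a central difference quotient, in the spirit of definition (1.3.2). In the sketch below, the curve `r` lies on the unit sphere, so its length is constant; the particular curve, the step size `h`, and the sample values of `t` are assumptions made for this example only.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def r(t):
    # A sample curve on the unit sphere: ||r(t)|| = 1 for every t.
    return (math.sin(t) * math.cos(2 * t),
            math.sin(t) * math.sin(2 * t),
            math.cos(t))

def derivative(f, t, h=1e-6):
    # Central difference approximation of the derivative of a vector function.
    fp, fm = f(t + h), f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(fp, fm))

# r(t) . r'(t) should vanish up to discretization and round-off errors.
values = [dot(r(t), derivative(r, t)) for t in (0.3, 1.0, 2.5)]
print(values)
```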
EXERCISE 1.3.1.$^A$ (a) Show that if $r$ is differentiable at $t_0$, then $r$ is continuous at $t_0$, but the converse is not true. (b) Does continuity of $r$ imply continuity of $\|r\|$? Does continuity of $\|r\|$ imply continuity of $r$? (c) Does differentiability of $r$ imply differentiability of $\|r\|$? Does differentiability of $\|r\|$ imply differentiability of $r$?
The complete description of every curve consists of two parts: (a) the set of its points in R3, (b) the ordering of those points relative to the ordering of the parameter set. For some curves, this complete description is possible in purely vector terms, that is, without choosing a particular coordinate system in the frame O. For other curves, a purely vector description provides only the set of points, while the ordering of that set is impossible without the selection of the particular coordinate system. We illustrate this observation on two simple curves: a straight line and a circle.
A straight line is described by $r(t) = r_2 - \phi(t)(r_1 - r_2)$, $-\infty < t < \infty$, where $r_1$ and $r_2$ are the position vectors of two distinct points on the line and $\phi(t)$ is a scalar function whose range is all of $\mathbb{R}$. The function $\phi$ determines the ordering of the points on the line. For example, if $\phi(t) = t$, then the point $r(t_2)$ follows $r(t_1)$ in time if $t_2 > t_1$.
The circle as a set of points in $\mathbb{R}^3$ is defined by the two conditions, $\|r(t)\| = R$ and $r(t) \cdot n = 0$, where $n$ is the unit normal to the plane of the circle. Direct computations show that these conditions do not determine the function $r(t)$ uniquely, and so do not give an ordering of points on the circle. To specify the ordering, we can, for example, fix one point $r(t_0)$ on the circle at a reference time $t_0$ and define the angle between $r(t)$ and $r(t_0)$ as a function of $t$. But this is equivalent to choosing a polar coordinate system in the plane of the circle.
1.3.2 The Tangent Vector and Arc Length
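Let $\mathbf{r} = \mathbf{r}(t)$ define a curve in $\mathbb{R}^3$. If $\overrightarrow{OP} = \mathbf{r}(t_0)$ and $\mathbf{r}'(t_0) \neq \mathbf{0}$, then, by definition, the unit tangent vector $\mathbf{u}$ at $P$ is
$$\mathbf{u}(t_0) = \frac{\mathbf{r}'(t_0)}{\|\mathbf{r}'(t_0)\|}. \tag{1.3.8}$$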
Note that the vector $\Delta\mathbf{r} = \mathbf{r}(t_0 + \Delta t) - \mathbf{r}(t_0)$ defines a line through two points on the curve; similar to ordinary calculus, definition (1.3.2) suggests that the vector $\mathbf{r}'(t_0)$ should be parallel to the tangent line at $P$.
The equation of the tangent line at point $P$ is
$$\mathbf{R}(s) = \mathbf{r}(t_0) + s\,\mathbf{u}(t_0). \tag{1.3.9}$$
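Definitions (1.3.8) and (1.3.9) are easy to try out numerically. The following Python sketch is only an illustration under stated assumptions: the derivative is approximated by a central difference with a hypothetical step `h`, and the helper names (`derivative`, `unit_tangent`, `tangent_line`, `helix`) are ours, not notation from the text.

```python
import math

def derivative(r, t, h=1e-6):
    """Central-difference approximation of r'(t) for a vector function r."""
    fwd, back = r(t + h), r(t - h)
    return tuple((f - b) / (2.0 * h) for f, b in zip(fwd, back))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def unit_tangent(r, t):
    """u(t) = r'(t) / ||r'(t)||, as in (1.3.8); undefined when r'(t) = 0."""
    d = derivative(r, t)
    n = norm(d)
    if n == 0.0:
        raise ValueError("r'(t) = 0: the unit tangent vector is undefined")
    return tuple(c / n for c in d)

def tangent_line(r, t0, s):
    """Point R(s) = r(t0) + s u(t0) on the tangent line, as in (1.3.9)."""
    u = unit_tangent(r, t0)
    return tuple(p + s * c for p, c in zip(r(t0), u))

def helix(t):
    # Sample curve (an assumption for the demo): r(t) = cos t i + sin t j + t k
    return (math.cos(t), math.sin(t), t)

u0 = unit_tangent(helix, 0.0)
print(u0)                            # close to (0, 1/sqrt(2), 1/sqrt(2))
print(tangent_line(helix, 0.0, 2.0)) # close to (1, sqrt(2), sqrt(2))
```

For the sample helix, $\mathbf{r}'(0) = \mathbf{j} + \mathbf{k}$, so the unit tangent at $t = 0$ has no $\mathbf{i}$ component.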
EXERCISE 1.3.2.C Let $C$ be a planar curve defined by the vector function $\mathbf{r}(t) = \cos t\,\mathbf{i} + \sin t\,\mathbf{j}$, $-\pi < t < \pi$. Compute the tangent vector $\mathbf{r}'(t)$ and the unit tangent vector $\mathbf{u}(t)$ as functions of $t$. Compute $\mathbf{r}'(0)$ and $\mathbf{u}(0)$. Draw the curve $C$ and the vectors $\mathbf{r}'(0)$, $\mathbf{u}'(0)$. Verify your results using a computer algebra system, such as MAPLE, MATLAB, or MATHEMATICA.
EXERCISE 1.3.3.C Let $C$ be a spatial curve defined by the vector function $\mathbf{r}(t) = \cos t\,\mathbf{i} + \sin t\,\mathbf{j} + t\,\mathbf{k}$. Compute the tangent vector $\mathbf{r}'(t)$, the unit tangent vector $\mathbf{u}(t)$ and the vector $\mathbf{u}'(t)$. Compute $\mathbf{r}'(\pi/2)$. Draw the curve $C$ for $0 < t < \pi/2$ and draw $\mathbf{u}'(\pi/2)$ at the point $\mathbf{r}(\pi/2)$. Verify your results using your favorite computer algebra system.
Definition 1.6 A curve $C$, defined by a vector function $\mathbf{r}(t)$, $a < t < b$, is called smooth if the unit tangent vector $\mathbf{u} = \mathbf{u}(t)$ exists and is a continuous function for all $t \in (a, b)$. If the curve is closed, then, additionally, we must have $\mathbf{r}'(a) = \mathbf{r}'(b)$. The curve is called piece-wise smooth if it is continuous and consists of finitely many smooth pieces.
EXERCISE 1.3.4.A Give an example of a non-smooth curve $C$ defined by a vector function $\mathbf{r}(t)$, $-1 < t < 1$, so that the derivative vector $\mathbf{r}'(t)$ exists and is continuous for all $t \in (-1, 1)$.
EXERCISE 1.3.5.C Explain how the graph of a function $y = f(x)$ can be interpreted as a curve in $\mathbb{R}^3$. Show that this curve is smooth if and only if the function $f = f(x)$ has a continuous derivative, and show that, at the point $(x_0, f(x_0), 0)$, formula (1.3.9) defines the same line as $y = f(x_0) + f'(x_0)(x - x_0)$, $z = 0$.
Given a curve C and two points with position vectors r(c),r(d), a < c < d < b, on the curve, we define the distance between the two points along the curve using a limiting process. The construction is similar to the definition of the Riemann integral in ordinary calculus.
For each $n \geq 2$, choose points $c = t_0 < t_1 < \cdots < t_n = d$ and form the sums $L_n = \sum_{i=0}^{n-1} \|\Delta\mathbf{r}_i\|$, where $\Delta\mathbf{r}_i = \mathbf{r}(t_{i+1}) - \mathbf{r}(t_i)$. Assume that $\max_{0\le i\le n-1}(t_{i+1} - t_i) \to 0$ as $n \to \infty$. If the limit $\lim_{n\to\infty} L_n$ exists for all $a < c < d < b$, and does not depend on the particular choice of the points $t_k$, then the curve $C$ is called rectifiable. By definition, the distance $L_C(c,d)$ between the points $\mathbf{r}(c)$ and $\mathbf{r}(d)$ along a rectifiable curve $C$ is $L_C(c,d) = \lim_{n\to\infty} L_n$.
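Theorem 1.3.1 Assume that $\mathbf{r}'(t)$ exists for all $t \in (a, b)$ and the vector function $\mathbf{r}'(t)$ is continuous. Then the curve $C$ is rectifiable and
$$L_C(c,d) = \int_c^d \|\mathbf{r}'(t)\|\,dt. \tag{1.3.10}$$
Proof. It follows from the assumptions of the theorem and from relation (1.3.2) that $\Delta\mathbf{r}_i = \mathbf{r}'(t_i)\,\Delta t_i + \mathbf{v}_i$, where $\Delta t_i = t_{i+1} - t_i$ and the vectors $\mathbf{v}_i$ satisfy $\max_{0\le i\le n-1} \|\mathbf{v}_i\|/\Delta t_i \to 0$ as $\max_{0\le i\le n-1} \Delta t_i \to 0$. Therefore,
$$\|\Delta\mathbf{r}_i\| = \|\mathbf{r}'(t_i)\|\,\Delta t_i + \varepsilon_i\,\Delta t_i, \tag{1.3.11}$$
$$\sum_{i=0}^{n-1} \|\Delta\mathbf{r}_i\| = \sum_{i=0}^{n-1} \|\mathbf{r}'(t_i)\|\,\Delta t_i + \sum_{i=0}^{n-1} \varepsilon_i\,\Delta t_i,$$
where the numbers $\varepsilon_i$ satisfy $\max_{0\le i\le n-1} |\varepsilon_i| \to 0$, $n \to \infty$. Then (1.3.10) follows after passing to the limit.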
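The limiting process behind (1.3.10) can be observed directly. The sketch below, a minimal illustration rather than part of the proof, computes the polygonal sums $L_n$ on uniform partitions for a circular helix and compares them with the integral; the helix radius `a = 2` and the function names are assumptions made for the example.

```python
import math

def helix(t, a=2.0):
    # r(t) = a cos t i + a sin t j + t k, so ||r'(t)|| = sqrt(a^2 + 1)
    return (a * math.cos(t), a * math.sin(t), t)

def polygonal_length(r, c, d, n):
    """L_n: sum of the chord lengths ||r(t_{i+1}) - r(t_i)|| on a uniform partition."""
    total = 0.0
    prev = r(c)
    for i in range(1, n + 1):
        cur = r(c + (d - c) * i / n)
        total += math.dist(prev, cur)
        prev = cur
    return total

def exact_helix_length(c, d, a=2.0):
    """Right-hand side of (1.3.10) for the helix: the integrand is constant."""
    return math.sqrt(a * a + 1.0) * (d - c)

for n in (4, 16, 64, 256):
    print(n, polygonal_length(helix, 0.0, math.pi, n))
print("integral:", exact_helix_length(0.0, math.pi))
```

Each chord is shorter than the arc it spans, so for this curve the sums increase toward the value of the integral as the partition is refined.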
EXERCISE 1.3.6? (a) Verify (1.3.11). Hint: use the triangle inequality to estimate $\|\mathbf{r}'(t_i)\,\Delta t_i + \mathbf{v}_i\| - \|\mathbf{r}'(t_i)\|\,\Delta t_i$. (b) Show that a piece-wise smooth curve is rectifiable. Hint: apply the above theorem to each smooth piece separately, and then add the results.
EXERCISE 1.3.7.C Interpreting the graph of the function $y = f(x)$ as a curve in $\mathbb{R}^3$, and assuming that $f'(x)$ exists and is continuous, show that the length of this curve from $(c, f(c), 0)$ to $(d, f(d), 0)$, as given by (1.3.10), is $\int_c^d \sqrt{1 + |f'(x)|^2}\,dx$; the derivation of this result in ordinary calculus is similar to the derivation of (1.3.10).
Given a point $\mathbf{r}(c)$ on a rectifiable curve $C$, we define the arc length function $s = s(t)$, $t \geq c$, as
$$s(t) = L_C(c,t).$$
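It follows that $ds/dt = \|\mathbf{r}'(t)\| \geq 0$. We call $ds = \|\mathbf{r}'(t)\|\,dt$ the line element of the curve $C$. If $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$, where $(\mathbf{i}, \mathbf{j}, \mathbf{k})$ is a cartesian coordinate system at $O$, then
$$\left(\frac{ds}{dt}\right)^2 = \left\|\frac{d\mathbf{r}}{dt}\right\|^2 = \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2. \tag{1.3.12}$$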
If the curve is smooth, then $ds/dt > 0$ and $s$ is a monotone function of $t$ so that $t$ is a well-defined function of $s$. Hence, $\mathbf{r}(t(s))$ is a function of $s$, and is called the canonical parametrization of the smooth curve by the arc length. By the rules of differentiation,
$$\frac{d\mathbf{r}}{ds} = \frac{d\mathbf{r}}{dt}\,\frac{dt}{ds} = \frac{d\mathbf{r}}{dt}\,\frac{1}{ds/dt} = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} = \mathbf{u}(t).$$
EXERCISE 1.3.8.C Consider the right-handed circular helix
$$\mathbf{r}(t) = a\cos t\,\mathbf{i} + a\sin t\,\mathbf{j} + t\,\mathbf{k}, \quad a > 0. \tag{1.3.13}$$
Re-write the equation of this curve using the arc length $s$ as the parameter.
1.3.3 Frenet's Formulas
In certain frames, called inertial, the Second Law of Newton postulates the following relation between the force $\mathbf{F} = \mathbf{F}(t)$ acting on the point mass $m$ and the point's trajectory $C$, defined by a curve $\mathbf{r} = \mathbf{r}(t)$:
$$m\,\ddot{\mathbf{r}}(t) = \mathbf{F}(t). \tag{1.3.14}$$
A detailed discussion of inertial frames and Newton's Laws is below on page 43. When $\mathbf{F}(t)$ is given, the solution of the differential equation (1.3.14) is the trajectory $\mathbf{r}(t)$. However, to get a unique solution of (1.3.14), we must start at some time $t_0$ and provide two initial conditions $\mathbf{r}'(t_0)$ and $\mathbf{r}(t_0)$ to determine a specific path. In other words, $\mathbf{r}(t_0)$ and $\mathbf{r}'(t_0)$ are reference vectors for the motion. At every time $t > t_0$, the vectors $\mathbf{r}(t)$ and $\mathbf{r}'(t)$ have a well-defined geometric orientation relative to the initial vectors $\mathbf{r}(t_0)$, $\mathbf{r}'(t_0)$. The three Frenet formulas provide a complete description of this orientation. In what follows, we assume that the curve $C$ is smooth, that is, the unit tangent vector $\mathbf{u}$ exists at every point of the curve.
To write the formulas, we need several new notions: curvature, principal unit normal vector, unit binormal vector, and torsion. We will use the canonical parametrization of the curve by the arc length s measured from some reference point Po on the curve.
Let $\mathbf{u} = \mathbf{u}(s)$ be the unit tangent vector at $P$, where the parameter $s$ is the arc length from $P_0$ to $P$. By Lemma 1.1 on page 26, the derivative $\mathbf{u}'(s)$ of $\mathbf{u}(s)$ with respect to $s$ is orthogonal to $\mathbf{u}$. By definition, the curvature $\kappa(s)$ at $P$ is
$$\kappa(s) = \|\mathbf{u}'(s)\|;$$
the principal unit normal vector at $P$ is
$$\mathbf{p}(s) = \frac{1}{\kappa(s)}\,\mathbf{u}'(s); \tag{1.3.15}$$
the unit binormal vector at $P$ is
$$\mathbf{b}(s) = \mathbf{u}(s) \times \mathbf{p}(s).$$
EXERCISE 1.3.9.C Parameterizing the circle by the arc length, verify that the curvature of the circle of radius $R$ is $1/R$.
To define the torsion, we derive the relation between $\mathbf{b}'(s)$ and the vectors $\mathbf{u}, \mathbf{p}, \mathbf{b}$. Using Lemma 1.1 once again, we conclude that $\mathbf{b}'(s)$ is orthogonal to $\mathbf{b}(s)$. Next, we differentiate the relation $\mathbf{b}(s)\cdot\mathbf{u}(s) = 0$ with respect to $s$ and use the product rule (1.3.5) to find $\mathbf{b}'(s)\cdot\mathbf{u}(s) + \mathbf{b}(s)\cdot\mathbf{u}'(s) = 0$. By construction, the unit vectors $\mathbf{u}, \mathbf{p}, \mathbf{b}$ are mutually orthogonal, and then the definition (1.3.15) of the vector $\mathbf{p}$ implies that $\mathbf{b}(s)\cdot\mathbf{u}'(s) = 0$. As a result, $\mathbf{b}'(s)\cdot\mathbf{u}(s) = 0$. Being orthogonal to both $\mathbf{u}(s)$ and $\mathbf{b}(s)$, the vector $\mathbf{b}'(s)$ must then be parallel to $\mathbf{p}(s)$. We therefore define the torsion of the curve $C$ at point $P$ as the number $\tau = \tau(s)$ so that
$$\mathbf{b}'(s) = -\tau(s)\,\mathbf{p}(s); \tag{1.3.16}$$
the choice of the negative sign ensures that the torsion is positive for the right-handed circular helix (1.3.13).
Note that the above definitions use the canonical parametrization of the curve by the arc length s; the corresponding formulas can be written for an arbitrary parametrization as well; see Problem 1.11 on page 412.
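Relations (1.3.15) and (1.3.16) are two of the Frenet formulas. To derive the third formula, note that $\mathbf{p}(s) = \mathbf{b}(s) \times \mathbf{u}(s)$. Differentiation with respect to $s$ yields $\mathbf{p}' = \mathbf{b}\times\mathbf{u}' + \mathbf{b}'\times\mathbf{u} = \mathbf{b}\times\kappa\mathbf{p} - \tau\,\mathbf{p}\times\mathbf{u}$, and
$$\mathbf{p}'(s) = -\kappa\,\mathbf{u}(s) + \tau\,\mathbf{b}(s). \tag{1.3.17}$$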
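As a numerical illustration of curvature and torsion, the sketch below evaluates both for the helix (1.3.13). It does not use the arc length parametrization; instead it assumes the standard formulas for an arbitrary parameter, $\kappa = \|\mathbf{r}'\times\mathbf{r}''\|/\|\mathbf{r}'\|^3$ and $\tau = (\mathbf{r}'\times\mathbf{r}'')\cdot\mathbf{r}'''/\|\mathbf{r}'\times\mathbf{r}''\|^2$, which are not derived in this section. The function names and the value $a = 2$ are hypothetical choices for the example.

```python
import math

def cross(x, y):
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def curvature_torsion(d1, d2, d3):
    """Curvature and torsion from r', r'', r''' (arbitrary parametrization)."""
    c = cross(d1, d2)
    kappa = norm(c) / norm(d1) ** 3
    tau = dot(c, d3) / dot(c, c)
    return kappa, tau

def helix_derivatives(t, a):
    """Exact r', r'', r''' for r(t) = a cos t i + a sin t j + t k."""
    d1 = (-a * math.sin(t), a * math.cos(t), 1.0)
    d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    d3 = (a * math.sin(t), -a * math.cos(t), 0.0)
    return d1, d2, d3

a = 2.0
kappa, tau = curvature_torsion(*helix_derivatives(0.7, a))
print(kappa, a / (a * a + 1.0))    # both close to 0.4
print(tau, 1.0 / (a * a + 1.0))    # both close to 0.2
```

The computed torsion is positive, in agreement with the sign convention chosen in (1.3.16) for the right-handed helix, and neither value depends on the point $t$.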
Different sources refer to relations (1.3.15) - (1.3.17) as either the Frenet or the Frenet-Serret formulas. In 1847, the French mathematician JEAN FREDERIC FRENET (1816-1900) derived two of these formulas in his doctoral dissertation. Another French mathematician, JOSEPH ALFRED SERRET (1819-1885), gave an independent derivation of all three formulas, but we could not find the exact time of his work. Of course, neither Frenet nor Serret used the modern vector notations in their derivations.
At every point $P$ of the curve, the vector triple $(\mathbf{u}, \mathbf{p}, \mathbf{b})$ is a right-handed coordinate system with origin at $P$. We will call this coordinate system Frenet's trihedron at $P$. The choice of initial conditions $\mathbf{r}(t_0)$, $\mathbf{r}'(t_0)$ means setting up a coordinate system in the frame with origin at $P_0$, where $\overrightarrow{OP_0} = \mathbf{r}(t_0)$. The coordinate planes spanned by the vectors $(\mathbf{u}, \mathbf{p})$, $(\mathbf{p}, \mathbf{b})$, and $(\mathbf{b}, \mathbf{u})$ are called, respectively, the osculating, normal, and rectifying (binormal) planes. The word osculating comes from Latin osculum, literally, a little mouth, which was the colloquial way of saying "a kiss". Not surprisingly, of all the planes that pass through the point $P$, the osculating plane comes the closest to containing the curve $C$.
EXERCISE 1.3.10? A curve is called planar if all its points are in the same plane. Show that a planar curve other than a line has the same osculating plane at every point and lies entirely in this plane (for a line, the osculating plane is not well-defined).
The curvature and torsion uniquely determine the curve, up to its position in space. More precisely, if $\kappa(s)$ and $\tau(s)$ are given continuous functions of $s$, we can solve the corresponding equations (1.3.15)-(1.3.17) and obtain the vectors $\mathbf{u}(s), \mathbf{p}(s), \mathbf{b}(s)$ which determine the shape of a family of curves. To obtain a particular curve $C$ in this family, we must specify initial values $(\mathbf{u}(s_0), \mathbf{p}(s_0), \mathbf{b}(s_0))$ of the trihedron vectors and an initial value $\mathbf{r}(s_0)$ of a position vector at a point $P_0$ on the curve. These four vectors are all in some frame with origin $O$. To obtain $\mathbf{r}(s)$ at any point of $C$ we solve the differential equation $d\mathbf{r}/ds = \mathbf{u}(s)$, with initial condition $\mathbf{r}(s_0)$, together with (1.3.15)-(1.3.17). Note that the curvature is always non-negative, and the torsion can be either positive or negative.
EXERCISE 1.3.11.A For the right circular helix (1.3.13) compute the curvature, torsion, and the Frenet trihedron at every point. Show that the right circular helix is the only curve with constant curvature and constant positive torsion.
As the point $P$ moves along the curve, the trihedron executes three rotations. These rotations about the unit tangent, principal unit normal, and unit binormal vectors are called rolling, yawing, and pitching, respectively. Rolling and yawing change the direction of the unit binormal vector $\mathbf{b}$, rolling and pitching change the direction of the principal unit normal vector $\mathbf{p}$, yawing and pitching change the direction of the unit tangent vector $\mathbf{u}$. To visualize these rotations, consider the motion of an airplane. Intuitively, it is clear that the tangent vector $\mathbf{u}$ points along the fuselage from the tail to the nose, and the normal vector $\mathbf{p}$ points up perpendicular to the wings (draw a picture!). In this construction, the vector $\mathbf{b}$ points along the wings to make $\mathbf{u}, \mathbf{p}, \mathbf{b}$ a right-handed triple. The center of mass of the plane is the natural common origin of the three vectors. The rolling of the plane, the rotation around $\mathbf{u}$, lifts one side of the plane relative to the other and is controlled by the ailerons on the back edges of the wings. Yawing, the rotation around $\mathbf{p}$, moves the nose left and right and is controlled by the rudder on the vertical part of the tail. Pitching of the plane, the rotation around $\mathbf{b}$, moves the nose up and down and is controlled by the elevators on the horizontal part of the tail.
Note that rolling and pitching are the main causes of motion sickness.
1.3.4 Velocity and Acceleration
Let the curve $C$, defined by the vector function $\mathbf{r} = \mathbf{r}(t)$, be the trajectory of a point mass in some frame $O$. Between times $t$ and $t + \Delta t$ the point moves through the arc length $\Delta s = s(t + \Delta t) - s(t)$, and therefore $ds(t)/dt$ is the speed of the point along $C$. As we derived on page 30,
$$\frac{d\mathbf{r}}{dt} = \frac{d\mathbf{r}}{ds}\,\frac{ds}{dt} = \frac{ds}{dt}\,\mathbf{u}(t).$$
Therefore, we define the velocity $\mathbf{v}(t)$ as
$$\mathbf{v}(t) = d\mathbf{r}/dt. \tag{1.3.18}$$
In particular, $\|\mathbf{v}\| = |ds/dt| = ds/dt$, that is, the speed is the magnitude of the velocity; recall that the arc length $s = s(t)$ is a non-decreasing function of $t$. This mathematical definition of velocity agrees with our physical intuition of speed in the direction of the tangent line, while making the physical concept of velocity precise, as required in a quantitative science. The definition also works well in practical problems of motion. Indeed, precise physics is mathematical physics.
Similarly, the acceleration $\mathbf{a}(t)$ of the point mass is, by definition,
$$\mathbf{a}(t) = \mathbf{v}'(t) = \mathbf{r}''(t).$$
Since $d\mathbf{v}/dt = d\big((ds/dt)\,\mathbf{u}(t)\big)/dt$, the product rule (1.3.4) implies
$$\frac{d\mathbf{v}}{dt} = \frac{d^2s}{dt^2}\,\mathbf{u}(t) + \frac{ds}{dt}\,\frac{d\mathbf{u}}{ds}\,\frac{ds}{dt}$$
or
$$\mathbf{a}(t) = \frac{d^2s}{dt^2}\,\mathbf{u}(t) + \left(\frac{ds}{dt}\right)^2 \frac{d\mathbf{u}(s)}{ds}. \tag{1.3.19}$$
Equation (1.3.19) shows that the acceleration $\mathbf{a}(t)$ has two components: the tangential acceleration $(d^2s/dt^2)\,\mathbf{u}(t)$ and the normal acceleration $(ds/dt)^2\,(d\mathbf{u}(s)/ds)$. By Lemma 1.1, page 26, the derivative of a unit vector is always orthogonal to the vector itself, and so the tangential and normal accelerations are mutually orthogonal. The derivation also shows that the decomposition (1.3.19) of the acceleration into the tangential and normal components does not depend on the coordinate system.
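In computations, the two components in (1.3.19) can be obtained without finding the arc length. Since the normal part is orthogonal to $\mathbf{u} = \mathbf{v}/\|\mathbf{v}\|$, the tangential part is the projection $(\mathbf{a}\cdot\mathbf{v}/\|\mathbf{v}\|^2)\,\mathbf{v}$, and the normal part is the remainder. The sketch below applies this to a hypothetical example, the parabola $\mathbf{r}(t) = t\,\mathbf{i} + t^2\,\mathbf{j}$ at $t = 1$; the function names are ours.

```python
def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

def decompose(v, a):
    """Split a into tangential and normal parts relative to the velocity v."""
    speed2 = dot(v, v)
    if speed2 == 0.0:
        raise ValueError("zero velocity: the tangent direction is undefined")
    coef = dot(a, v) / speed2               # (a . v) / ||v||^2
    a_tan = tuple(coef * c for c in v)      # (d^2 s/dt^2) u
    a_norm = tuple(p - q for p, q in zip(a, a_tan))
    return a_tan, a_norm

# r(t) = (t, t^2, 0): v(t) = (1, 2t, 0), a(t) = (0, 2, 0); evaluate at t = 1
v = (1.0, 2.0, 0.0)
a = (0.0, 2.0, 0.0)
a_tan, a_norm = decompose(v, a)
print(a_tan)               # close to (0.8, 1.6, 0.0)
print(a_norm)              # close to (-0.8, 0.4, 0.0)
print(dot(a_tan, a_norm))  # close to 0: the two parts are orthogonal
```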
EXERCISE 1.3.12.C In (1.3.20) below, $\mathbf{r} = \mathbf{r}(t)$ represents the position of point mass $m$ at time $t$ in the Cartesian coordinate system:
$$\begin{aligned} \mathbf{r}(t) &= t^2\,\mathbf{i} + 2t^2\,\mathbf{j} + t^2\,\mathbf{k}; & \mathbf{r}(t) &= 2\cos t^2\,\mathbf{i} + 2\sin t^2\,\mathbf{j}; \\ \mathbf{r}(t) &= 2\cos \pi t\,\mathbf{i} + 2\sin \pi t\,\mathbf{j}; & \mathbf{r}(t) &= \cos t^2\,\mathbf{i} + 2\sin t^2\,\mathbf{j}. \end{aligned} \tag{1.3.20}$$
For each function $\mathbf{r} = \mathbf{r}(t)$,
• Sketch the corresponding trajectory;
• Compute the velocity and acceleration vectors as functions of $t$;
• Draw the trajectory for $0 < t < 1$ and draw the vectors $\mathbf{r}'(1)$, $\mathbf{r}''(1)$;
• Compute the normal and tangential components of the acceleration and draw the corresponding vectors when $t = 1$;
• Verify your results using a computer algebra system.
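We will now write the decomposition (1.3.19) for the CIRCULAR MOTION IN A PLANE. Let $C$ be a circle with radius $R$ and center at the point $O$. Assume a point mass moves along $C$. Choose the cartesian coordinates $\mathbf{i}, \mathbf{j}, \mathbf{k}$ with origin at $O$ and $\mathbf{i}, \mathbf{j}$ in the plane of the circle. Denote by $\theta(t)$ the angle between $\mathbf{i}$ and the position vector $\mathbf{r}(t)$ of the point mass. Suppose that the function $\theta = \theta(t)$ has two continuous derivatives in $t$, $|\theta'(t)| > 0$, $t > 0$, and $\theta(0) = 0$. Then
$$\mathbf{r}(t) = R\cos\theta(t)\,\mathbf{i} + R\sin\theta(t)\,\mathbf{j},$$
$$\mathbf{v} = \mathbf{r}'(t) = -\theta'(t)R\sin\theta(t)\,\mathbf{i} + \theta'(t)R\cos\theta(t)\,\mathbf{j},$$
$$\|\mathbf{v}\| = (\mathbf{r}'\cdot\mathbf{r}')^{1/2} = R\,|\theta'(t)|,$$
and $\mathbf{v}\cdot\mathbf{r} = 0$. So $\mathbf{v}$ is tangent to the circle. The acceleration $\mathbf{a}$ is
$$\mathbf{a}(t) = \mathbf{v}'(t) = -R\big(\theta''(t)\sin\theta(t) + (\theta'(t))^2\cos\theta(t)\big)\,\mathbf{i} + R\big(\theta''(t)\cos\theta(t) - (\theta'(t))^2\sin\theta(t)\big)\,\mathbf{j}$$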
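or
$$\mathbf{a} = -(\theta')^2\,\mathbf{r} + (\theta''/\theta')\,\mathbf{v}. \tag{1.3.21}$$
Thus, the tangential component of $\mathbf{a}$ is $(\theta''/\theta')\,\mathbf{v}$, and the normal component, also known as the centripetal acceleration, is $-(\theta')^2\,\mathbf{r}$. Also,
$$\|\mathbf{a}\| = R\sqrt{(\theta')^4 + (\theta'')^2}.$$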
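A quick numerical check of (1.3.21) and of the formula for $\|\mathbf{a}\|$ is sketched below. The motion $\theta(t) = t^2$ on a circle of radius $R = 2$, evaluated at $t = 1$, is a hypothetical example chosen for illustration, and the function name is ours.

```python
import math

def circular_motion(R, theta, theta_d, theta_dd):
    """Position, velocity, and acceleration on a circle of radius R."""
    r = (R * math.cos(theta), R * math.sin(theta))
    v = (-theta_d * R * math.sin(theta), theta_d * R * math.cos(theta))
    # (1.3.21): a = -(theta')^2 r + (theta''/theta') v, valid when theta' != 0
    a = tuple(-theta_d ** 2 * ri + (theta_dd / theta_d) * vi
              for ri, vi in zip(r, v))
    return r, v, a

# theta(t) = t^2, so theta' = 2t and theta'' = 2; take t = 1
R, t = 2.0, 1.0
r, v, a = circular_motion(R, t * t, 2.0 * t, 2.0)
norm_a = math.hypot(*a)
predicted = R * math.sqrt((2.0 * t) ** 4 + 2.0 ** 2)
print(norm_a, predicted)          # both close to sqrt(80)
print(r[0] * v[0] + r[1] * v[1])  # close to 0: v is tangent to the circle
```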
EXERCISE 1.3.13.B Verify that (1.3.21) coincides with (1.3.19). Hint: First verify that $ds/dt = R\,\theta'(t)$ and $d\mathbf{u}(t)/dt = -(\theta'(t)/R)\,\mathbf{r}(t)$.
If the rotation is uniform with constant angular speed $\omega$, then $\theta(t) = \omega t$ and we have the familiar expressions $\|\mathbf{a}\| = \omega^2 R = \|\mathbf{v}\|^2/R$.
Note that the centripetal acceleration is in the direction of — r, that is, in the direction toward the center. It is not a coincidence that the Latin verb petere means "to look for."
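Next, we write the decomposition (1.3.19) for the GENERAL PLANAR MOTION IN POLAR COORDINATES $(r, \theta)$. Consider a frame with origin $O$ and fixed cartesian coordinate system $(\mathbf{i}, \mathbf{j}, \mathbf{k})$ so that the motion is in the $(\mathbf{i}, \mathbf{j})$ plane. Recall that, for a point $P$ with position vector $\mathbf{r}$, $r = \|\mathbf{r}\|$, and $\theta$ is the angle from the vector $\mathbf{i}$ to $\mathbf{r}$. Let $\hat{\mathbf{r}} = \mathbf{r}/r$ be the unit radius vector and let $\hat{\boldsymbol{\theta}}$ be the unit vector orthogonal to $\hat{\mathbf{r}}$ so that $\hat{\mathbf{r}} \times \hat{\boldsymbol{\theta}} = \mathbf{i} \times \mathbf{j}$; draw a picture or see Figure 2.1.3 on page 48 below. Then
$$\hat{\mathbf{r}} = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}, \qquad \hat{\boldsymbol{\theta}} = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j}. \tag{1.3.22}$$
The vectors $\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}$ are functions of $\theta$. From (1.3.22) we get
$$d\hat{\mathbf{r}}/d\theta = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} = \hat{\boldsymbol{\theta}}, \qquad d\hat{\boldsymbol{\theta}}/d\theta = -\cos\theta\,\mathbf{i} - \sin\theta\,\mathbf{j} = -\hat{\mathbf{r}}. \tag{1.3.23}$$
Let $\mathbf{r}(t)$ be the position of the point mass $m$ at time $t$. In polar coordinates, $\mathbf{r}(t) = r(t)\,\hat{\mathbf{r}}(\theta(t))$. The velocity of $m$ in the frame $O$ is $\mathbf{v} = d\mathbf{r}/dt = d\big(r(t)\,\hat{\mathbf{r}}(\theta(t))\big)/dt$. Using the rule (1.3.4) and the chain rule, we get $\mathbf{v} = (dr/dt)\,\hat{\mathbf{r}} + r\,(d\hat{\mathbf{r}}/d\theta)(d\theta/dt)$, or, writing $\omega = \dot{\theta}$,
$$\mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}} = \dot{r}\,\hat{\mathbf{r}} + r\omega\,\hat{\boldsymbol{\theta}}. \tag{1.3.24}$$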
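The velocity $\mathbf{v}$ is a sum of the radial velocity component $\dot{r}\,\hat{\mathbf{r}}$ and the angular velocity component $r\omega\,\hat{\boldsymbol{\theta}}$. We call $\dot{r}$ and $r\dot{\theta}$ the radial and angular speeds, respectively.
The acceleration $\mathbf{a}$ in the frame $O$ is obtained by differentiating (1.3.24) with respect to $t$ according to the rules (1.3.3), (1.3.4):
$$\mathbf{a} = d\mathbf{v}/dt = \ddot{r}\,\hat{\mathbf{r}} + \dot{r}\,(d\hat{\mathbf{r}}/d\theta)\,\dot{\theta} + (\dot{r}\dot{\theta} + r\ddot{\theta})\,\hat{\boldsymbol{\theta}} + r\dot{\theta}\,(d\hat{\boldsymbol{\theta}}/d\theta)\,\dot{\theta},$$
or
$$\mathbf{a} = (\ddot{r} - r\dot{\theta}^2)\,\hat{\mathbf{r}} + (r\ddot{\theta} + 2\dot{r}\dot{\theta})\,\hat{\boldsymbol{\theta}}. \tag{1.3.25}$$
The acceleration $\mathbf{a}$ is a sum of the radial component $\mathbf{a}_r$ and the angular component $\mathbf{a}_\theta$, where
$$\mathbf{a}_r = (\ddot{r} - r\omega^2)\,\hat{\mathbf{r}} \quad\text{and}\quad \mathbf{a}_\theta = (r\dot{\omega} + 2\dot{r}\omega)\,\hat{\boldsymbol{\theta}}. \tag{1.3.26}$$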
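Formula (1.3.25) can be compared with a direct second difference of the cartesian position. The sketch below does this for a hypothetical motion $r(t) = 1 + t$, $\theta(t) = t^2$; the step `h` and the function names are assumptions made for the example.

```python
import math

def polar_acceleration(r, r_d, r_dd, th_d, th_dd):
    """Radial and angular components of the acceleration, as in (1.3.25)."""
    a_r = r_dd - r * th_d ** 2
    a_th = r * th_dd + 2.0 * r_d * th_d
    return a_r, a_th

def position(t):
    # Hypothetical motion: r(t) = 1 + t, theta(t) = t^2
    r, th = 1.0 + t, t * t
    return (r * math.cos(th), r * math.sin(th))

def cartesian_acceleration(t, h=1e-4):
    """Second central difference of the cartesian position."""
    p0, p1, p2 = position(t - h), position(t), position(t + h)
    return tuple((x0 - 2.0 * x1 + x2) / (h * h)
                 for x0, x1, x2 in zip(p0, p1, p2))

t = 0.5
r, th = 1.0 + t, t * t
a_r, a_th = polar_acceleration(r, 1.0, 0.0, 2.0 * t, 2.0)
# Convert back to cartesian components using (1.3.22)
ax = a_r * math.cos(th) - a_th * math.sin(th)
ay = a_r * math.sin(th) + a_th * math.cos(th)
print((a_r, a_th))                     # (-1.5, 5.0)
print((ax, ay), cartesian_acceleration(t))
```

The term $2\dot{r}\dot{\theta}$ contributes 2 of the 5 units of angular acceleration here, so it cannot be dropped when the radius changes with time.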
EXERCISE 1.3.14.B Verify that decomposition (1.3.26) of the acceleration is a particular case of (1.3.19).
Now assume that the trajectory of the point mass is a circle with center at $O$ and radius $R$. Then $r(t) = R$ for all $t$ and $\dot{r}(t) = \ddot{r}(t) = 0$. Let $\dot{\theta}(t) = \omega(t)$. By (1.3.26),
$$\mathbf{a}_r = -R\omega^2\,\hat{\mathbf{r}} \ \text{(centripetal acceleration)}, \qquad \mathbf{a}_\theta = R\dot{\omega}\,\hat{\boldsymbol{\theta}} \ \text{(angular acceleration)}. \tag{1.3.27}$$
Also, by (1.3.24),
$$\mathbf{v} = R\omega\,\hat{\boldsymbol{\theta}}. \tag{1.3.28}$$
EXERCISE 1.3.15.B Verify that formula (1.3.27) is a particular case of the decomposition (1.3.21) of the acceleration, as derived on page 34.
If we further assume that the angular speed is constant, that is, $\omega(t) = \omega_0$ for all $t$, then $\dot{\omega} = 0$, and, by (1.3.27),
$$\mathbf{a}_r = -R\omega_0^2\,\hat{\mathbf{r}}, \qquad \mathbf{a}_\theta = \mathbf{0}. \tag{1.3.29}$$
EXERCISE 1.3.16.B Verify that if the acceleration of a point mass in polar coordinates is given by (1.3.29), then the point moves around the circle of radius $R$ with constant angular speed $\omega_0$. Hint: Combine (1.3.29) and (1.3.26) to get differential equations for $r$ and $\theta$. Solve the equations with initial conditions $r(0) = R$, $\dot{r}(0) = 0$, $\theta(0) = 0$, $\dot{\theta}(0) = \omega_0$ to get $r(t) = R$, $\theta(t) = \omega_0 t$.
EXERCISE 1.3.17.A Let $(r(t), \theta(t))$ be the polar coordinates of a 2-D motion of a point mass $m$ in a fixed frame $O$. Let r(t) = St and $\theta(t) = 2t$. Sketch the trajectory of the point in the frame $O$ for $0 < t < 5$ and verify the result using a computer algebra system. Compute the velocity and acceleration vectors in the frame $O$ in terms of the unit vectors $\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}$.
Chapter 2
Vector Analysis and Classical and Relativistic Mechanics
2.1 Kinematics and Dynamics of a Point Mass
Kinematics is the study of motion without reference to forces; the Greek word kinema means "motion." Dynamics is the study of motion under the action of forces; the Greek word dynamis means "force." Also, the Greek word mechanikos means "machine."
A curve $C$, defined by a vector-valued function of time $\mathbf{r} = \mathbf{r}(t)$, provides the mathematical description of the trajectory in $\mathbb{R}^3$ of a particle (point mass) so that the location of the particle at time $t$ is at the end point of the vector $\mathbf{r}(t)$. The initial point $O$ of the vector is the origin of the corresponding frame in which the motion is studied. It is clear that the same motion can be studied in different frames and in different coordinates. The Frenet trihedron (page 32) is an example of a coordinate system in which the particle is at rest, but the coordinate system is moving. The objective of this section is to derive the rules for describing the motion of a point mass in different coordinate systems.
Unless explicitly mentioned otherwise, we assume that the curve C is smooth, that is, the unit tangent vector u exists at every point of the curve; see page 27.
2.1.1 Newton's Laws of Motion and Gravitation
The motion of a point mass $m$ is related to the net force $\mathbf{F}$ acting on the mass. In an inertial frame, this relation is made precise by Newton's three laws of motion, and conversely, every frame in which these laws hold is called inertial. These laws were first formulated by Newton around 1666, less than a year after he received his bachelor's degree from Cambridge University.
Newton's First Law: Unless acted upon by a force, a point mass is either not moving or moves in a straight line with constant speed. This law is also called the Law of Inertia or Galileo's Principle.
Newton's Second Law: The acceleration of the point mass is directly proportional to the net force exerted and inversely proportional to the mass.
Newton's Third Law: For every action, there is an equal and opposite reaction.
Mathematically, the Second Law is
$$\frac{d^2\mathbf{r}(t)}{dt^2} = \frac{\mathbf{F}}{m}, \tag{2.1.1}$$
where $\mathbf{r} = \mathbf{r}(t)$ is the position of the point mass at time $t$.
EXERCISE 2.1.1. c Show that the Second Law implies the First Law. In other words, show that if (2.1.1) holds, then the point mass m acted on by zero external force will move with constant velocity v. Hint: find the general solution of equation (2.1.1) when F = 0.
The notion of momentum provides an alternative formulation of Newton's Second Law. Consider a point mass $m$ moving along the path $\mathbf{r} = \mathbf{r}(t)$ with velocity $\dot{\mathbf{r}}(t)$ relative to a reference frame with origin at $O$. The (linear) momentum $\mathbf{p}$ is the vector
$$\mathbf{p} = m\,\dot{\mathbf{r}}. \tag{2.1.2}$$
If the reference frame is inertial, then (2.1.1) becomes
$$\dot{\mathbf{p}} = \mathbf{F}, \tag{2.1.3}$$
and the force is now interpreted as the rate of change of the linear momentum. Incidentally, the Latin word momentum means "motion" or "cause of motion." One advantage of (2.1.3) over (2.1.1) is the possibility of variable mass.
Similarly, the study of circular motion suggests the definition of the angular momentum about the point $O$ as the vector
$$\mathbf{L}_O = m\,\mathbf{r} \times \dot{\mathbf{r}}. \tag{2.1.4}$$
Note that both $\mathbf{p}$ and $\mathbf{L}_O$ depend on the reference point $O$, but do not depend on the coordinate system.
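Recall that a force $\mathbf{F}$ acting on the point mass $m$ has a torque, or moment, about $O$ equal to
$$\mathbf{T}_O = \mathbf{r} \times \mathbf{F}. \tag{2.1.5}$$
Applying the rule (1.3.6), page 26, to formula (2.1.4) we find
$$\frac{d\mathbf{L}_O}{dt} = m(\dot{\mathbf{r}} \times \dot{\mathbf{r}} + \mathbf{r} \times \ddot{\mathbf{r}}) = m\,\mathbf{r} \times \ddot{\mathbf{r}}.$$
If the frame $O$ is inertial, then $\ddot{\mathbf{r}} = \mathbf{F}/m$ and
$$\frac{d\mathbf{L}_O}{dt} = \mathbf{r} \times \mathbf{F} = \mathbf{T}_O. \tag{2.1.6}$$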
Relation (2.1.6) describes the rotational motion just as (2.1.3) describes the translational motion.
As an example, consider the SIMPLE RIGID PENDULUM, which is a massless thin rigid rod of length $\ell$ connected to a point mass $m$ at one end. The other end of the rod is connected to a frictionless pin-joint at a point $O$, which is a zero-diameter bearing that permits rotation in a fixed plane. We select a cartesian coordinate system $(\mathbf{i}, \mathbf{j}, \mathbf{k})$ with center at $O$ and $\mathbf{i}, \mathbf{j}$ fixed in the plane of the rotation (Figure 2.1.1), and assume that the corresponding frame is inertial.
Fig. 2.1.1 Simple Rigid Pendulum
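The motion of $m$ is a circular rotation in the $(\mathbf{i}, \mathbf{j})$ plane, and is best described using polar coordinates $(r, \theta)$; see page 35. The rigidity assumption implies $r(t) = \ell$ for all $t$. Denote by $\theta = \theta(t)$ the angle from $\mathbf{i}$ to the rod at time $t$. Let $\mathbf{r}(t)$ be the position vector of the point mass in the frame $O$. Then $\mathbf{r}(t) = \ell\,\hat{\mathbf{r}}(\theta(t))$, and by (1.3.24) on page 35, $\dot{\mathbf{r}} = \ell\dot{\theta}\,\hat{\boldsymbol{\theta}}$. Hence, the angular momentum of the point mass about $O$ is $\mathbf{L}_O = m\,\mathbf{r} \times \dot{\mathbf{r}} = m(\ell\,\hat{\mathbf{r}} \times \ell\dot{\theta}\,\hat{\boldsymbol{\theta}}) = m\ell^2\dot{\theta}\,\mathbf{k}$, and
$$\frac{d\mathbf{L}_O}{dt} = m\ell^2\ddot{\theta}\,\mathbf{k}. \tag{2.1.7}$$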
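The forces acting on the pendulum are the weight $\mathbf{W}$ of $m$, the air resistance $\mathbf{F}_a$ and the force $\mathbf{F}_p$ exerted by the pin at $O$. Clearly, $\mathbf{W} = mg\,\mathbf{i}$, where $g$ is the acceleration of gravity. Physical considerations suggest that the force $\mathbf{F}_a$ on $m$ may be assumed to act tangentially to the circular path and to be proportional to the tangential velocity: $\mathbf{F}_a = -c\ell\dot{\theta}\,\hat{\boldsymbol{\theta}}$, where $c$ is the damping constant. The total torque $\mathbf{T}$ about $O$ exerted by these forces is
$$\mathbf{T} = \mathbf{r} \times \mathbf{W} + \mathbf{r} \times \mathbf{F}_a + \mathbf{0} \times \mathbf{F}_p = \ell\,\hat{\mathbf{r}} \times mg\,\mathbf{i} - \ell\,\hat{\mathbf{r}} \times c\ell\dot{\theta}\,\hat{\boldsymbol{\theta}},$$
or
$$\mathbf{T} = -(mg\ell\sin\theta + \ell^2 c\,\dot{\theta})\,\mathbf{k}. \tag{2.1.8}$$
Since the frame $O$ is inertial, equation (2.1.6) applies, and by (2.1.7) and (2.1.8) above, we obtain $m\ell^2\ddot{\theta} = -mg\ell\sin\theta - \ell^2 c\,\dot{\theta}$, or
$$m\ell\,\ddot{\theta} + c\ell\,\dot{\theta} = -mg\sin\theta. \tag{2.1.9}$$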
Equation (2.1.9) is a nonlinear ordinary differential equation and cannot be integrated in quadratures, that is, its solution cannot be written using only elementary functions and their anti-derivatives. When c = 0, such a solution does exist and involves elliptic integrals; see Problem 2.3, page 417, if you are curious.
The more familiar harmonic oscillator
$$\ddot{\theta} = -(g/\ell)\,\theta \tag{2.1.10}$$
is obtained from (2.1.9) when $c = 0$ and $\theta$ is small so that $\sin\theta \approx \theta$; this equation should be familiar from the basic course in ordinary differential equations. If $\theta(0) = \theta_0$ and $\dot{\theta}(0) = 0$, then the solution of (2.1.10) is
$$\theta(t) = \theta_0\cos(\omega t), \quad\text{where } \omega = (g/\ell)^{1/2}.$$
The period of the small undamped oscillations is $2\pi(\ell/g)^{1/2}$, and the value of $\ell$ can be adjusted to provide a desired ticking rate for a clock mechanism. The idea to use a pendulum for time-keeping was studied by the Italian scientist GALILEO GALILEI (1564-1642) during the last years of his life, but it was only in 1656 that the Dutch scientist CHRISTIAAN HUYGENS (1629-1695) patented the first pendulum clock.
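Although (2.1.9) has no elementary closed-form solution, it is easy to integrate numerically. The sketch below uses the classical fourth-order Runge-Kutta method (our choice of method, not one prescribed by the text) to estimate the period for a pendulum released from rest and to compare it with the small-oscillation value $2\pi(\ell/g)^{1/2}$. The parameter values, the step `dt`, and the function names are all assumptions made for illustration.

```python
import math

def rhs(theta, omega, m, ell, c, g):
    """(2.1.9) as a first-order system: theta' = omega, omega' = theta''."""
    return omega, -(g / ell) * math.sin(theta) - (c / m) * omega

def rk4_step(theta, omega, dt, params):
    k1 = rhs(theta, omega, **params)
    k2 = rhs(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1], **params)
    k3 = rhs(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1], **params)
    k4 = rhs(theta + dt * k3[0], omega + dt * k3[1], **params)
    theta += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    omega += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return theta, omega

def measured_period(theta0, params, dt=1e-3, t_max=100.0):
    """Release from rest at theta0 > 0; return twice the time to the next turning point."""
    t, theta, omega = 0.0, theta0, 0.0
    while t < t_max:
        new_theta, new_omega = rk4_step(theta, omega, dt, params)
        if omega < 0.0 <= new_omega:
            # Linear interpolation for the instant when theta' returns to zero
            frac = omega / (omega - new_omega)
            return 2.0 * (t + frac * dt)
        t, theta, omega = t + dt, new_theta, new_omega
    raise RuntimeError("no turning point found before t_max")

params = dict(m=1.0, ell=1.0, c=0.0, g=9.81)   # assumed SI values, no damping
linear_period = 2.0 * math.pi * math.sqrt(params["ell"] / params["g"])
print(linear_period)
print(measured_period(0.01, params))   # small amplitude: close to linear_period
print(measured_period(1.0, params))    # large amplitude: noticeably longer
```

For an amplitude of one radian the period of the undamped pendulum is several percent longer than the small-oscillation value, which shows the limits of the approximation $\sin\theta \approx \theta$.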
EXERCISE 2.1.2r Let point mass $m$ move in a planar path $C$ given by $\mathbf{r}(t)$, where $\mathbf{r}$ is the position vector with origin $O$. (a) Use formulas (1.3.24), page 35, and (2.1.4) to express the angular momentum of the point mass about $O$ in the coordinate system $(\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}})$. (b) Suppose that the point mass moves in a circular path $C$ with radius $R$ and center $O$. Denote the angular speed by $\omega(t)$. (i) Compute the angular momentum $\mathbf{L}_O$ of the point mass about $O$ and the corresponding torque. (ii) Find the force $\mathbf{F}$ that is required to produce this motion, assuming the frame $O$ is inertial. (iii) Write $\mathbf{F}$ as a linear combination of $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\theta}}$. (iv) How will the expressions simplify if $\omega(t)$ does not depend on time?
As we saw in Exercise 2.1.1, the Second Law of Newton implies the First Law. For further discussion of the logic of Newton's Laws see the book Foundations of Physics by H. Margenau and R. Lindsay, 1957. Regarding the First Law, they quote A. S. Eddington's remark from his book Nature of the Physical World, first published in the 1920s, that the law, in effect, says that "every particle continues in its state of rest or uniform motion in a straight line, except insofar as it doesn't." This is a somewhat facetious commentary on the logical circularity of Newton's original formulation, which depends on the notion of zero force acting, which can only be observed in terms of the motion being at constant velocity. The same logical difficulty arises in the definition of an inertial frame as a frame in which the three laws of Newton hold. We do not concentrate on these questions here and simply assume that the primary inertial frame, that is, a frame attached to far-away, and approximately fixed, stars is a good approximation of an inertial frame for all motions in the vicinity of the Earth. The idea of this frame goes back to the Irish bishop and philosopher G. Berkeley. The deep question "What is a force?" is also beyond the scope of our presentation; for the discussion of this question, see the above-mentioned book Foundations of Physics by H. Margenau and R. Lindsay, or else take as given that there are four basic kinds of forces: gravitational, electromagnetic, strong nuclear, and weak nuclear. In inertial frames, all other forces result from these four.
Newton discovered the Law of Universal Gravitation by combining his laws of motion with Kepler's Laws of Planetary Motion. The history behind this discovery is a lot more complex than the familiar legend about the apple falling from the tree and hitting Newton on the head. Like many similar stories, this "apple incident" is questioned by modern historians. Below, we present some of the highlights of the actual development.
The basic ideas of modern astronomy go back to the Polish astronomer NICOLAUS COPERNICUS (1473-1543) and his heliocentric theory of the solar system. Copernicus was a canon (in modern terms, senior manager) of the cathedral at the town of Frauenburg (now Frombork) in northern Poland, and observed the stars and planets from his home. Around 1530, he came to the conclusion that planets in our solar system revolve around the Sun. He was hesitant to publish his ideas, both for fear of being charged with heresy and because of the numerous problems he could not resolve; his work, titled De revolutionibus orbium coelestium ("On the Revolutions of the Celestial Spheres") was finally published in 1543, apparently just a few weeks before he died.
It took some time to formalize the heliocentric ideas mathematically, and the key missing element was the empirical data. The main instrument for astronomical observations, the telescope, was yet to be invented: it was only in 1609 that Galileo Galilei built his first telescope. Without a telescope, collecting the data required a lot of time and patience, but the Danish scientist TYCHO BRAHE (1546-1601) had both. Brahe was the royal astronomer and mathematician to Rudolf II, the emperor of the Holy Roman Empire. At the observatory in Prague, the seat of the Holy Roman Empire at that time, Brahe compiled the world's first truly accurate and complete set of astronomical tables. His assistant, German scientist JOHANN KEPLER (1571-1630), had been a proponent of the heliocentric theory of Copernicus. After inheriting the position and all the astronomical data from Brahe in 1601, Kepler analyzed the data for the planet Mars and formulated his first two laws in 1609. Further investigations led him to the discovery of the third law in 1619.
Kepler's First Law: The planets have elliptical orbits with the Sun at one focus.
Kepler's Second Law: The radius vector from the Sun to a planet sweeps over equal areas in equal time intervals.
Kepler's Third Law: For every planet $p$, the square of its period $T_p$ of revolution around the Sun is proportional to the cube of the average distance $R_p$ from the planet to the Sun. In other words, $T_p^2 = K_S R_p^3$, where the number $K_S$ is the same for every planet.
A planetary orbit has a very small eccentricity and so is close to a circle of some mean radius $R$. Kepler speculated that a planet is held in its orbit by a force of attraction between the Sun and the planet, and Newton quantified Kepler's qualitative idea. In modern terms, we can recover Newton's argument by combining equation (1.3.29), page 36, with his second law (2.1.1). Take an inertial frame with origin at the Sun and assume that a planet of mass $m$ executes a circular motion around the origin with constant angular speed $\omega$. Then (2.1.1) and (1.3.29) result in
$$\mathbf{F} = m\mathbf{a} = -m\omega^2\,\mathbf{r}(t) = -m\omega^2 R\,\hat{\mathbf{r}}(t),$$
where $\hat{\mathbf{r}} = \mathbf{r}/\|\mathbf{r}\|$ is the unit radius vector pointing from the Sun to the planet. On the other hand, $\omega = 2\pi/T$, and therefore the magnitude of $\mathbf{F}$ is $\|\mathbf{F}\| = m(4\pi^2/T^2)R$. Applying Kepler's Third Law, we obtain $\|\mathbf{F}\| = m\big(4\pi^2/(K_S R^3)\big)R = Cm/R^2$, where $C = 4\pi^2/K_S$. In other words, the gravitational force exerted by the Sun on the planet of mass $m$ at a distance $R$ is proportional to $m$ and $R^{-2}$. By Newton's third law, there must be an equal and opposite force exerted by $m$ on the Sun. By the same argument, we conclude that the magnitude of the force must also be proportional to $M$ and $R^{-2}$, where $M$ is the mass of the Sun. Therefore,
$$\|\mathbf{F}\| = \frac{GMm}{R^2}, \tag{2.1.11}$$
where $G$ is a constant. Newton postulated that $G$ is a universal gravitational constant, that is, has the same value for any two masses, and therefore (2.1.11) is a Universal Law of Gravitation. In 1798, the English scientist HENRY CAVENDISH (1731-1810), in his quest to determine the mass and density of Earth, verified the relation (2.1.11) experimentally and determined a numerical value of $G$: $G \approx 6.67 \times 10^{-11}\ \mathrm{m^3/(kg\cdot sec^2)}$. Since then, the Universal Law of Gravitation has been tested and verified on many occasions. An extremely small discrepancy has been discovered in the orbit of Mercury that cannot be derived from (2.1.11), and is explained by Einstein's law of gravitation in the theory of general relativity; see Problem 2.2 on page 414.
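The argument can also be run in the opposite direction: for a circular orbit, equating $m\omega^2 R$ with (2.1.11) gives $T = 2\pi\sqrt{R^3/(GM)}$, so that $T^2/R^3 = 4\pi^2/(GM)$ is the same for every planet. The sketch below evaluates this for two planets. The value of $G$ is the one quoted above; the mass of the Sun and the mean orbital radii are approximate values assumed for the example and do not come from the text.

```python
import math

G = 6.67e-11          # m^3/(kg s^2), the value quoted in the text
M_SUN = 1.989e30      # kg, assumed approximate value
ORBIT_RADIUS = {      # m, assumed approximate mean distances from the Sun
    "Earth": 1.496e11,
    "Mars": 2.279e11,
}

def gravitational_force(M, m, R):
    """Magnitude of the force in (2.1.11)."""
    return G * M * m / R ** 2

def circular_period(M, R):
    """Period T from m omega^2 R = G M m / R^2 with omega = 2 pi / T."""
    omega = math.sqrt(G * M / R ** 3)
    return 2.0 * math.pi / omega

for name, R in ORBIT_RADIUS.items():
    T = circular_period(M_SUN, R)
    # Period in days, and the Kepler ratio T^2 / R^3 (the constant K_S)
    print(name, T / 86400.0, T ** 2 / R ** 3)
```

With these inputs the computed periods come out near 365 days for the Earth and near 687 days for Mars, and the two values of $T^2/R^3$ agree.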
In our derivation of (2.1.11), we implicitly used the equivalence principle that the two possible values of $m$, its inertial and gravitational masses, are equal. A priori, this is not at all obvious. Indeed, the mass $m$ in equation (2.1.1) of Newton's Second Law, the inertial mass, expresses an object's resistance to external force: the larger the mass, the smaller the acceleration. The mass $m$ in (2.1.11), the gravitational mass, expresses something completely different, namely, its gravitational attraction: the larger the mass, the stronger the gravitational attraction it produces. The equivalence principle is one of the foundations of Einstein's theory of general relativity and can be traced back to Galilei, who was among the first
to study the motion of bodies under Earth's gravity. Even though many modern historians question whether, around 1590, he was indeed dropping different objects from the leaning tower of Pisa, in 1604 Galilei did conduct related experiments using an inclined plane; in 1608, he formulated mathematically the basic laws of accelerated motion under the gravitational force. The conjecture of Galilei that the acceleration due to gravity is essentially the same for all kinds of matter has been verified experimentally. Between 1905 and 1908, the Hungarian physicist VÁSÁROSNAMÉNYI BÁRÓ EÖTVÖS LORÁND (1848-1919), also known as ROLAND EÖTVÖS, measured a variation of about $5 \times 10^{-9}$ in the Earth's pull on wood and platinum; somehow, the result was published only in 1922. In the 1950s, the American physicist ROBERT HENRY DICKE (1916-1997) measured a difference of $(1.3 \pm 1.0) \times 10^{-11}$ for the Sun's attraction of aluminum and gold objects.
Following the historical developments, we derived relation (2.1.11) from Newton's Second Law of Motion and Kepler's Third Law of Planetary Motion. Problem 2.1, page 413, presents a deeper insight into the problem. In particular, more detailed analysis shows that, in our derivation of (2.1.11), Kepler's Third Law can be replaced by his First Law, along with the assumption that the gravitational force is attracting and central, that is, acts along the line connecting the Sun and the planet. Moreover, the reader who completes Problem 2.1 will see that all three of Kepler's laws follow from (2.1.1) and (2.1.11). This illustrates the power of mathematical models in reasoning about physical laws.
2.1.2 Parallel Translation of Frames
Recall that Newton's laws of motion hold only in inertial frames; see page 4 for the definition of frame. Ignoring the possible logical issues, we therefore say that an inertial frame is a frame in which a point mass maintains constant velocity in the absence of external forces. The same frame can be (approximately) inertial in some situations and not inertial in others. For example, a frame fixed to the surface of the Earth is inertial if the objective is to study the motion of a billiard ball on a pool table. The same frame is no longer inertial if the objective is to study the trajectory of an intercontinental ballistic missile: the inertial frame for this problem should not rotate with the Earth; the primary inertial frame fixed to the stars is a possible choice, see page 43. In this and the following two sections, we investigate how relation (2.1.1) changes if the frame is not inertial. The
starting point is the analysis of the relative motion of frames. The easiest motion is parallel translation, where the corresponding basis vectors in the frames stay parallel. Consider two such frames with origins $O$ and $O_1$, respectively. Denote by $\mathbf{r}_{01}(t)$ the position vector of $O_1$ with respect to $O$. Let the position vectors of a point mass in frames $O$ and $O_1$ be $\mathbf{r}_0(t)$ and $\mathbf{r}_1(t)$, respectively (Figure 2.1.2).
Fig. 2.1.2 Translation of Frames
Clearly, $\mathbf{r}_0(t) = \mathbf{r}_{01}(t) + \mathbf{r}_1(t)$, and the absence of relative rotation allows us to consider this equality in the frame $O$ for each $t$. We can identify the parallel vectors that have the same direction and length. Since position vectors in $O$ and $O_1$ maintain their relative orientation when there is no relative rotation of the frames, the coordinate systems in the frames $O$ and $O_1$ are the same. Then we can apply the rule for differentiating a sum (1.3.3) to obtain simple relations between the velocities and accelerations in the frames $O$ and $O_1$:
$$\dot{\mathbf{r}}_0(t) = \dot{\mathbf{r}}_{01}(t) + \dot{\mathbf{r}}_1(t), \qquad \ddot{\mathbf{r}}_0(t) = \ddot{\mathbf{r}}_{01}(t) + \ddot{\mathbf{r}}_1(t). \tag{2.1.12}$$
If the frame $O$ is inertial, then, by Newton's Second Law, $m\ddot{\mathbf{r}}_0(t) = \mathbf{F}(t)$, where $m$ is the mass of the point, and $\mathbf{F}$ is the sum of all forces acting on the point. The second equality in (2.1.12) then implies
$$m\ddot{\mathbf{r}}_1(t) = \mathbf{F}(t) - m\ddot{\mathbf{r}}_{01}(t). \tag{2.1.13}$$
In effect, there are two forces acting on $m$ in the frame $O_1$. One is the force $\mathbf{F}$. The other, $-m\ddot{\mathbf{r}}_{01}$, is called a translational acceleration force. It is an example of an apparent, or inertial, force, that is, a force that appears because of the relative motion of frames and is not of any of the four types described on page 43. If $\dot{\mathbf{r}}_{01}(t)$ is constant, then $\ddot{\mathbf{r}}_{01}(t) = 0$ and the Second Law of Newton holds in $O_1$, that is, $O_1$ is also an inertial frame. Thus, all frames moving with constant velocity relative to an inertial frame are also inertial frames.
As an EXAMPLE, consider a golf ball in a moving elevator. We fix the frame $O$ on the ground, and $O_1$, on the elevator, and select the usual cartesian coordinate systems in both frames so that the corresponding coordinate vectors are parallel. Assume that the elevator is falling down with the gravitational acceleration $g$, so that $\ddot{\mathbf{r}}_{01}(t) = -g\,\hat{\boldsymbol{\kappa}}$, and the ball is falling down inside the elevator, also with the gravitational acceleration $g$, so that $\ddot{\mathbf{r}}_0(t) = -g\,\hat{\boldsymbol{\kappa}}$. Then (2.1.12) shows that $\ddot{\mathbf{r}}_1(t) = \ddot{\mathbf{r}}_0(t) - \ddot{\mathbf{r}}_{01}(t) = -g\,\hat{\boldsymbol{\kappa}} + g\,\hat{\boldsymbol{\kappa}} = 0$. Therefore, $\dot{\mathbf{r}}_1(t)$ is constant, and if $\dot{\mathbf{r}}_1(t_0) = 0$, then $\dot{\mathbf{r}}_1(t) = 0$ for all $t > t_0$ (or until the elevator hits the ground). An observer in the elevator would see the ball as fixed in the elevator frame $O_1$: the translational acceleration force $-m\ddot{\mathbf{r}}_{01}(t)$ compensates the gravitational force $m\ddot{\mathbf{r}}_0(t)$, and the ball behaves as weightless in the elevator frame.
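The elevator example amounts to one subtraction of accelerations, as in the second equality of (2.1.12). A minimal sketch follows; the numerical value of $g$, the helper name, and the second scenario (an elevator at rest) are assumptions added for illustration.

```python
def relative_acceleration(a0, a01):
    """Acceleration in the translating frame O1: second equality in (2.1.12)."""
    return tuple(p - q for p, q in zip(a0, a01))


g = 9.81  # m/s^2, assumed value of the gravitational acceleration

ball_in_ground_frame = (0.0, 0.0, -g)      # acceleration of the ball in frame O
elevator_in_ground_frame = (0.0, 0.0, -g)  # acceleration of the origin O1 (free fall)

# Freely falling elevator: the ball has zero acceleration relative to the elevator.
falling = relative_acceleration(ball_in_ground_frame, elevator_in_ground_frame)
print(falling)

# Hypothetical comparison: elevator at rest, the ball falls with acceleration g.
at_rest = relative_acceleration(ball_in_ground_frame, (0.0, 0.0, 0.0))
print(at_rest)
```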
2.1.3 Uniform Rotation of Frames
Note that it is the absence of rotation that allowed us to use the differentiation rule (1.3.3) in the derivation of relation (2.1.12). This and other rules of differentiation no longer apply if the frames are rotating relative to each other, and relation (2.1.12) must be modified.
We start with the analysis of uniform rotation, that is, rotation with constant angular speed around a fixed axis. As a motivational example, consider a car driving with constant angular speed $\omega_0$ in a circle with radius $R$ and center at $O$. Consider an object (a point mass $m$) moving inside the car with constant radial speed $v_0$ relative to $O$, and rotating together with the car with constant angular speed $\omega_0$ around $O$. Introduce a new (noninertial) rotating frame with origin $O_1$ inside the car and the coordinate basis vectors $\hat{\boldsymbol{\imath}}_1 = \hat{\mathbf{r}}$, $\hat{\boldsymbol{\jmath}}_1 = \hat{\boldsymbol{\theta}}$; see Figure 2.1.3. As before, let $\mathbf{r}_0$ and $\mathbf{r}_1$ be the position vectors of the point mass in the frames $O$ and $O_1$, respectively, and denote $\overrightarrow{OO_1}$ by $\mathbf{r}_{01}$. Assume that $\mathbf{r}_1(0) = 0$.
Fig. 2.1.3 Rotation of Frames
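By construction, $\mathbf{r}_{01} = R\,\hat{\mathbf{r}}$. For a passenger riding in the car, the path $\mathbf{r}_1(t) = v_0 t\,\hat{\mathbf{r}}$ of the point mass relative to the car is a straight line, since there is no angular displacement of $m$ relative to the car. Thus, $\mathbf{r}_0(t) = (R + v_0 t)\,\hat{\mathbf{r}}(t)$. Note that the polar coordinates of $m$ in frame $O$ are $(r(t), \theta(t))$, where $r(t) = R + v_0 t$. Hence $\dot{r}(t) = v_0$ and $\ddot{r}(t) = 0$. Also, $\dot{\theta}(t) = \omega_0$ and $\ddot{\theta}(t) = 0$. Formulas (1.3.26) on page 36 provide the acceleration $\mathbf{a} = \mathbf{a}_r + \mathbf{a}_\theta$ of the mass in the frame $O$, where, with $\ddot{r} = 0$ and $\ddot{\theta} = 0$,
$$\mathbf{a}_r = -(R + v_0 t)\,\omega_0^2\,\hat{\mathbf{r}}, \qquad \mathbf{a}_\theta = 2 v_0 \omega_0\,\hat{\boldsymbol{\theta}}. \tag{2.1.14}$$
By (1.3.27) on page 36, $\ddot{\mathbf{r}}_{01} = -R\omega_0^2\,\hat{\mathbf{r}}$. In frame $O_1$, $\ddot{\mathbf{r}}_1(t) = 0$. Then (2.1.14) shows that the acceleration $\ddot{\mathbf{r}}_0(t)$ is
$$\ddot{\mathbf{r}}_0(t) = \ddot{\mathbf{r}}_{01}(t) + \ddot{\mathbf{r}}_1(t) - v_0 t\,\omega_0^2\,\hat{\mathbf{r}} + 2 v_0 \omega_0\,\hat{\boldsymbol{\theta}}. \tag{2.1.15}$$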
We see that the simple relation (2.1.12) between the accelerations in translated frames does not correctly describe acceleration of the point mass in the frame $O$ in terms of the acceleration in the frame $O_1$.
If $O$ is an inertial frame and $\mathbf{F}$ is a force acting on the point mass in $O$ to produce the motion, then, by Newton's Second Law, $m\ddot{\mathbf{r}}_0 = \mathbf{F}$. According to (2.1.15),
$$m\ddot{\mathbf{r}}_1 = \mathbf{F} + (mR\omega_0^2 + m v_0 t\,\omega_0^2)\,\hat{\mathbf{r}} - 2 m v_0 \omega_0\,\hat{\boldsymbol{\theta}}. \tag{2.1.16}$$
Thus, frame $O_1$ is not inertial. Similar to (2.1.13), inertial forces appear as corrections to Newton's Second Law: the centrifugal force $\mathbf{F}_c = (mR\omega_0^2 + m v_0 t\,\omega_0^2)\,\hat{\mathbf{r}}$ and the Coriolis force $\mathbf{F}_{cor} = -2 m v_0 \omega_0\,\hat{\boldsymbol{\theta}}$. The centrifugal force prevents the mass from flying off at a tangent because of the rotational motion of the car and the mass. One component of this force, $mR\omega_0^2\,\hat{\mathbf{r}} = -m\ddot{\mathbf{r}}_{01}(t)$, is related to the motion of the car causing the rotation of the origin of the frame $O_1$; the other, $m v_0 t\,\omega_0^2\,\hat{\mathbf{r}}$, takes into account the outward radial motion of the point. Note that the direction of the centrifugal force is in the direction of $\hat{\mathbf{r}}$, and is therefore away from the center and opposite to the direction of the centripetal acceleration (cf. page 35). Incidentally, the Latin verb fugere means "to run away."
The Coriolis force $-2 m v_0 \omega_0\,\hat{\boldsymbol{\theta}}$ is somewhat less expected. This force is perpendicular to the linear path in $O_1$ and ensures that the trajectory of the point in the rotating frame is a straight radial line despite the rotation of the frame. This force was first described in 1835 by the French scientist GASPARD-GUSTAVE DE CORIOLIS (1792-1843). His motivation for the study came from the problems of the early 19th-century industry, such as the design of water-wheels. More familiar effects of the Coriolis force, such as rotation of the swing plane of the Foucault pendulum and the special directions of atmospheric winds, were discovered in the 1850s and will be discussed in the next section.
Recall that in our example $\ddot{\mathbf{r}}_1 = 0$. From (2.1.16) we conclude that $\mathbf{F} + \mathbf{F}_c + \mathbf{F}_{cor} = 0$. The real (as opposed to inertial) force $\mathbf{F}$ must balance the effects of the inertial forces to ensure the required motion of the object in the rotating frame. For a passenger sliding outward with constant velocity $v_0\,\hat{\mathbf{r}}$ in a turning car, this real force is the reaction of the seat in the form of friction and forward pressure of the back of the seat.
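The polar-coordinate formulas (2.1.14) for the car example can be checked numerically by differentiating $\mathbf{r}_0(t) = (R + v_0 t)\,\hat{\mathbf{r}}(t)$ twice with a finite difference. The sketch below does this in cartesian coordinates of the frame $O$; the sample values of $R$, $v_0$, $\omega_0$, the step size, and all function names are assumptions made for the illustration.

```python
import math

R, V0, W0 = 20.0, 0.5, 0.3  # assumed sample values: m, m/s, rad/s


def position(t):
    """r_0(t) = (R + v_0 t) r_hat(t) in cartesian coordinates of the frame O."""
    rho = R + V0 * t
    return (rho * math.cos(W0 * t), rho * math.sin(W0 * t))


def acceleration_numeric(t, h=1e-3):
    """Central second difference of the position."""
    before, here, after = position(t - h), position(t), position(t + h)
    return tuple((a - 2 * b + c) / h ** 2 for a, b, c in zip(before, here, after))


def acceleration_formula(t):
    """a_r + a_theta from (2.1.14), written in cartesian coordinates."""
    r_hat = (math.cos(W0 * t), math.sin(W0 * t))
    theta_hat = (-math.sin(W0 * t), math.cos(W0 * t))
    a_r = -(R + V0 * t) * W0 ** 2
    a_theta = 2 * V0 * W0
    return tuple(a_r * p + a_theta * q for p, q in zip(r_hat, theta_hat))


t_sample = 2.0
numeric = acceleration_numeric(t_sample)
formula = acceleration_formula(t_sample)
max_gap = max(abs(p - q) for p, q in zip(numeric, formula))
print(f"largest component mismatch: {max_gap:.1e}")
```

The mismatch should be at the level of the finite-difference error, which supports reading the $2 v_0 \omega_0\,\hat{\boldsymbol{\theta}}$ term as a genuine part of the acceleration in $O$ rather than an artifact of the coordinates.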
EXERCISE 2.1.3. Find the vector function describing the trajectory of $m$ in the $O$ frame. What is the shape of this trajectory? Verify your conclusion using a computer algebra system.
EXERCISE 2.1.4.$^A$ Suppose a point mass $m$ is fixed at a point $P$ in the $O$ frame, that is, $m$ remains at $P$ in the $O$ frame for all times. Find the vector function describing the trajectory of $m$ in the $O_1$ frame. What is the shape of this trajectory? Verify your conclusion using a computer algebra system. Hint. This is the path of $m$ relative to the car seen by a passenger riding in the car. Show that $O$ is fixed in $O_1$.
Coming back to Figure 2.1.3, note that the coordinate vectors in the frame $O_1$ spin around $O_1$ with constant angular speed $\omega_0$, while the origin $O_1$ rotates around $O$ with the same angular speed $\omega_0$. This observation leads to further generalization by allowing different speeds of spinning and rotation.
EXERCISE 2.1.5.$^A$ Suppose the origin $O_1$ rotates around the point $O$ with angular speed $\omega_0$, while the coordinate vectors $(\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}})$ spin around $O_1$ with constant angular speed $2\omega_0$. Suppose a point mass is fixed at a point $P$ in the $O$ frame. Find the vector function describing the trajectory of $m$ in the $O_1$ frame. What is the shape of this trajectory? Verify your conclusion using a computer algebra system.
Our motivational example with the car illustrated some of the main effects that arise in rotating frames. The example was two-dimensional in nature, and now we move on to uniform rotations in space. There are many different ways to describe rotations in $\mathbb{R}^3$. We present an approach using vectors and linear algebra.
We start with a simple problem. Consider a point P moving around a
circle of radius $R$ with uniform angular velocity $\omega$ (Figure 2.1.4). Denote by $\mathbf{r}(t)$ the position vector of the point at time $t$ and assume that the origin $O$ of the frame is chosen so that $\|\mathbf{r}\|$ does not change in time. How do we express $\dot{\mathbf{r}}(t)$ in terms of $\mathbf{r}(t)$ and $\omega$?
Fig. 2.1.4 Rotating Point
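To solve this problem, consider the plane that contains the circle of rotation and define the rotation vector $\boldsymbol{\omega}$ as follows (see Figure 2.1.4). The vector $\boldsymbol{\omega}$ is perpendicular to the plane of the rotation; the direction of the vector $\boldsymbol{\omega}$ is such that the rotation is counterclockwise as seen from the tip of the vector (alternatively, the rotation is clockwise as seen in the direction of the vector); the length of the vector $\boldsymbol{\omega}$ is $\omega$, the angular speed of the rotation. As seen from Figure 2.1.4, $\mathbf{r}(t) = \tilde{\mathbf{r}}(t) + \mathbf{r}^*$ and the vector $\mathbf{r}^*$ does not change in time, so that $\dot{\mathbf{r}}(t) = \dot{\tilde{\mathbf{r}}}(t)$. Consider a cartesian coordinate system $(\hat{\boldsymbol{\imath}}, \hat{\boldsymbol{\jmath}})$ with the origin at the center of the circle in the plane of rotation so that $\tilde{\mathbf{r}}(t) = R\cos\omega t\,\hat{\boldsymbol{\imath}} + R\sin\omega t\,\hat{\boldsymbol{\jmath}}$. Direct computations show that
$$\dot{\tilde{\mathbf{r}}}(t) = (\hat{\boldsymbol{\imath}} \times \hat{\boldsymbol{\jmath}}) \times \bigl(\omega\,\tilde{\mathbf{r}}(t)\bigr) = \boldsymbol{\omega} \times \tilde{\mathbf{r}}(t) = \boldsymbol{\omega} \times \mathbf{r}(t), \tag{2.1.17}$$
$$\dot{\mathbf{r}}(t) = \boldsymbol{\omega} \times \mathbf{r}(t). \tag{2.1.18}$$
EXERCISE 2.1.6.$^C$ Verify all equalities in (2.1.17).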
We will now use (2.1.18) to derive the relation between the velocities and accelerations of a point mass $m$ relative to two frames $O$ and $O_1$, when the frame $O_1$ is rotating with respect to $O$. We assume that the two frames have the same origins: $O = O_1$. We also choose cartesian coordinate systems $(\hat{\boldsymbol{\imath}}, \hat{\boldsymbol{\jmath}}, \hat{\boldsymbol{\kappa}})$ and $(\hat{\boldsymbol{\imath}}_1, \hat{\boldsymbol{\jmath}}_1, \hat{\boldsymbol{\kappa}}_1)$ in the frames; see Figure 2.1.9
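on page 62. Let the frame $O_1$ rotate relative to the frame $O$ so that the corresponding rotation vector $\boldsymbol{\omega}$ is fixed in the frame $O$. Because of this rotation, the basis vectors in $O_1$ depend on time when considered in the frame $O$: $\hat{\boldsymbol{\imath}}_1 = \hat{\boldsymbol{\imath}}_1(t)$, $\hat{\boldsymbol{\jmath}}_1 = \hat{\boldsymbol{\jmath}}_1(t)$, $\hat{\boldsymbol{\kappa}}_1 = \hat{\boldsymbol{\kappa}}_1(t)$. By (2.1.18),
$$d\hat{\boldsymbol{\imath}}_1/dt = \boldsymbol{\omega} \times \hat{\boldsymbol{\imath}}_1, \qquad d\hat{\boldsymbol{\jmath}}_1/dt = \boldsymbol{\omega} \times \hat{\boldsymbol{\jmath}}_1, \qquad d\hat{\boldsymbol{\kappa}}_1/dt = \boldsymbol{\omega} \times \hat{\boldsymbol{\kappa}}_1. \tag{2.1.19}$$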
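Denote by $\mathbf{r}_0(t)$ and $\mathbf{r}_1(t)$ the position vectors of the point mass in $O$ and $O_1$, respectively. If $P$ is the position of the point mass, then $\mathbf{r}_0(t) = \overrightarrow{OP}$, $\mathbf{r}_1(t) = \overrightarrow{O_1P}$, and, with $O = O_1$, we have $\mathbf{r}_0(t) = \mathbf{r}_1(t)$ for all $t$. Still, the time derivatives of the vectors are different: $\dot{\mathbf{r}}_0(t) \neq \dot{\mathbf{r}}_1(t)$ because of the rotation of the frames. Indeed,
$$\begin{aligned} \mathbf{r}_1(t) &= x_1(t)\,\hat{\boldsymbol{\imath}}_1 + y_1(t)\,\hat{\boldsymbol{\jmath}}_1 + z_1(t)\,\hat{\boldsymbol{\kappa}}_1, \\ \mathbf{r}_0(t) &= x_1(t)\,\hat{\boldsymbol{\imath}}_1(t) + y_1(t)\,\hat{\boldsymbol{\jmath}}_1(t) + z_1(t)\,\hat{\boldsymbol{\kappa}}_1(t); \end{aligned} \tag{2.1.20}$$
recall that the vectors $\hat{\boldsymbol{\imath}}_1, \hat{\boldsymbol{\jmath}}_1, \hat{\boldsymbol{\kappa}}_1$ are fixed relative to $O_1$, but are moving relative to $O$. Let us differentiate both equalities in (2.1.20) with respect to time $t$. For the computations of $\dot{\mathbf{r}}_1(t)$, the basis vectors are constants. For the computations of $\dot{\mathbf{r}}_0(t)$, we use the product rule (1.3.4) and the relations (2.1.19). The result is
$$\dot{\mathbf{r}}_0(t) = \dot{\mathbf{r}}_1(t) + \boldsymbol{\omega} \times \mathbf{r}_1(t). \tag{2.1.21}$$
EXERCISE 2.1.7. Verify (2.1.21).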
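There is nothing in the derivation of (2.1.21) that requires us to treat $\mathbf{r}_0$ as a position vector of a point. Accordingly, an alternative form of (2.1.21) can be stated as follows. Introduce the notations $D_0$ and $D_1$ for the time derivatives in the frames $O$ and $O_1$, respectively. Then, for every vector function $\mathbf{R} = \mathbf{R}(t)$, the derivation of (2.1.21) yields
$$D_0\mathbf{R}(t) = D_1\mathbf{R}(t) + \boldsymbol{\omega} \times \mathbf{R}(t). \tag{2.1.22}$$
Relation (2.1.21) is a particular case of (2.1.22), when $\mathbf{R}$ is the position vector of the point. We now use (2.1.22) with $\mathbf{R} = \dot{\mathbf{r}}_0$, the velocity of the point in the fixed frame, to get the relation between the accelerations. Then (i) $D_0\dot{\mathbf{r}}_0 = \ddot{\mathbf{r}}_0$; (ii) by (2.1.21), $D_1\dot{\mathbf{r}}_0 = \ddot{\mathbf{r}}_1 + \boldsymbol{\omega} \times \dot{\mathbf{r}}_1$; (iii) also by (2.1.21), $\boldsymbol{\omega} \times \dot{\mathbf{r}}_0 = \boldsymbol{\omega} \times (\dot{\mathbf{r}}_1 + \boldsymbol{\omega} \times \mathbf{r}_1)$. Collecting the terms in (2.1.22),
$$\ddot{\mathbf{r}}_0 = \ddot{\mathbf{r}}_1 + 2\boldsymbol{\omega} \times \dot{\mathbf{r}}_1 + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_1). \tag{2.1.23}$$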
Therefore, the acceleration in the fixed frame has three components: the acceleration $\ddot{\mathbf{r}}_1$ in the moving frame, the Coriolis acceleration $\mathbf{a}_{cor} = 2\boldsymbol{\omega} \times \dot{\mathbf{r}}_1$, and the centripetal acceleration $\mathbf{a}_c = \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_1)$. Note that $\mathbf{a}_c$ is orthogonal to both $\boldsymbol{\omega}$ and $\boldsymbol{\omega} \times \mathbf{r}_1$.
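Relation (2.1.23) can also be checked numerically. The sketch below assumes, for illustration only, that the frame $O_1$ rotates about the common $\hat{\boldsymbol{\kappa}}$ axis with angular speed 0.7 rad/s and that the point follows a hypothetical trajectory in $O_1$. It compares a finite-difference second derivative of the position seen in $O$ with the three-term sum in (2.1.23).

```python
import math

OMEGA = 0.7            # assumed angular speed of O1 about the common kappa axis, rad/s
W = (0.0, 0.0, OMEGA)  # rotation vector; same components in both bases


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def to_fixed(v, t):
    """Convert components in the rotating basis of O1 to the fixed basis of O."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


# Hypothetical trajectory in the rotating frame and its exact derivatives there.
def r1(t):
    return (1.0 + 0.2 * t, 0.5 * t * t, 0.3 * math.sin(t))


def r1_dot(t):
    return (0.2, t, 0.3 * math.cos(t))


def r1_ddot(t):
    return (0.0, 1.0, -0.3 * math.sin(t))


def accel_fixed_numeric(t, h=1e-3):
    """Central second difference of the position expressed in the fixed basis."""
    before, here, after = (to_fixed(r1(s), s) for s in (t - h, t, t + h))
    return tuple((a - 2 * b + c) / h ** 2 for a, b, c in zip(before, here, after))


def accel_fixed_formula(t):
    """Right-hand side of (2.1.23), converted to the fixed basis."""
    coriolis = tuple(2 * c for c in cross(W, r1_dot(t)))
    centripetal = cross(W, cross(W, r1(t)))
    total = tuple(a + b + c for a, b, c in zip(r1_ddot(t), coriolis, centripetal))
    return to_fixed(total, t)


t_sample = 1.5
gap_23 = max(abs(p - q) for p, q in
             zip(accel_fixed_numeric(t_sample), accel_fixed_formula(t_sample)))
print(f"largest component mismatch: {gap_23:.1e}")
```

Dropping either the Coriolis or the centripetal term from `accel_fixed_formula` makes the mismatch jump by many orders of magnitude, which is a quick way to see that both corrections are needed.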
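Assume that the fixed frame $O$ is inertial, and let $\mathbf{F}$ be the force acting on the point mass $m$ in $O$. By Newton's Second Law, we have $m\ddot{\mathbf{r}}_0 = \mathbf{F}$ in the inertial frame $O$ and, by (2.1.23),
$$m\ddot{\mathbf{r}}_1 = \mathbf{F} - 2m\boldsymbol{\omega} \times \dot{\mathbf{r}}_1 - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_1) \tag{2.1.24}$$
in the rotating frame $O_1$. Similar to (2.1.16), inertial forces appear as corrections to Newton's Second Law in the non-inertial frame $O_1$. There are two such forces in (2.1.24): the Coriolis force $\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \dot{\mathbf{r}}_1$ and the centrifugal force $\mathbf{F}_c = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_1)$.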
As an example illustrating the relation (2.1.23), consider a point mass $m$ moving on the surface of the Earth along a meridian (great circle through the poles) with constant angular speed $\gamma$; the axis of rotation goes through the North and South Poles. We place the origins $O$ and $O_1$ of the fixed and rotating frames at the center of the Earth and assume that the frame $O_1$ is rotating with the Earth (Figure 2.1.5).
Fig. 2.1.5 Motion Along a Meridian
Denote by $P$ the current position of the point, and consider the plane $(NOP)$. Relative to the Earth, that is, in the frame $O_1$, the plane $(NOP)$ is fixed, and the motion of $m$ is a simple circular rotation in this plane with constant angular speed $\gamma$ so that $\theta(t) = \gamma t$. Relative to the fixed frame $O$, the plane $(NOP)$ is rotating, and the rotation vector is $\boldsymbol{\omega}$. We will determine the three components of the acceleration of the point in the frame $O$ according to (2.1.23).
Introduce the polar coordinate vectors $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$ in the plane $(NOP)$. By (1.3.27), page 36, we find $\mathbf{a}_\theta = 0$ and so
$$\ddot{\mathbf{r}}_1 = \mathbf{a}_r = -\gamma^2 R\,\hat{\mathbf{r}}, \tag{2.1.25}$$
where $R$ is the radius of the Earth. Similarly, we use (1.3.28), page 36, to find the velocity $\dot{\mathbf{r}}_1$ of the point in the frame $O_1$: $\dot{\mathbf{r}}_1 = \gamma R\,\hat{\boldsymbol{\theta}}$, and then the Coriolis acceleration in $O$ is $\mathbf{a}_{cor} = 2\boldsymbol{\omega} \times \dot{\mathbf{r}}_1 = 2\gamma R\,\boldsymbol{\omega} \times \hat{\boldsymbol{\theta}}$.
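To evaluate the product $\boldsymbol{\omega} \times \hat{\boldsymbol{\theta}}$, we need some additional constructions. Note that both $\boldsymbol{\omega}$ and $\hat{\boldsymbol{\theta}}$ are in the plane $(NOP)$. Let $\hat{\mathbf{b}}$ be the unit vector that lies both in the $(NOP)$ and the equator planes as shown in Figure 2.1.5. This vector is rotating with the plane $(NOP)$ and therefore changes in time; by (2.1.18), $\hat{\mathbf{b}}'(t) = \boldsymbol{\omega} \times \hat{\mathbf{b}}(t)$. With this construction, the vector $\boldsymbol{\omega} \times \hat{\boldsymbol{\theta}}$ is in the plane of the equator and has the same direction as $-\hat{\mathbf{b}}'(t)$. By definition, $\hat{\boldsymbol{\theta}}$ is orthogonal to $\hat{\mathbf{r}}$. By construction, the vectors $\boldsymbol{\omega}$ and $\hat{\mathbf{b}}$ are also orthogonal, and so the angle $\theta$ between the vectors $\hat{\mathbf{b}}$ and $\hat{\mathbf{r}}$ is equal to the angle between the vectors $\boldsymbol{\omega}$ and $\hat{\boldsymbol{\theta}}$. Since $\|\hat{\mathbf{b}}'(t)\| = \omega = \|\boldsymbol{\omega}\|$ and $\|\hat{\boldsymbol{\theta}}\| = 1$, we find $\boldsymbol{\omega} \times \hat{\boldsymbol{\theta}} = -\sin\theta\,\hat{\mathbf{b}}'$, and therefore
$$\mathbf{a}_{cor}(t) = -2\gamma R\sin(\gamma t)\,\hat{\mathbf{b}}'(t). \tag{2.1.26}$$
Finally, we use $\theta$ and $\hat{\mathbf{b}}$ to write the centripetal acceleration $\mathbf{a}_c = \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_1)$ of the point. We have
$$\boldsymbol{\omega} \times \mathbf{r}_1 = R\sin(\pi/2 - \theta)\,\hat{\mathbf{b}}' = R\cos\theta\,\hat{\mathbf{b}}', \qquad \boldsymbol{\omega} \times \hat{\mathbf{b}}' = -\omega^2\,\hat{\mathbf{b}}. \tag{2.1.27}$$
Hence,
$$\mathbf{a}_c = -(\omega^2 R\cos\gamma t)\,\hat{\mathbf{b}}. \tag{2.1.28}$$
Combining (2.1.25), (2.1.26), and (2.1.28) in (2.1.23), we find the total acceleration $\ddot{\mathbf{r}}_0(t)$ of the point mass in the fixed frame $O$:
$$\ddot{\mathbf{r}}_0(t) = -\gamma^2 R\,\hat{\mathbf{r}}(t) - 2\gamma R\sin(\gamma t)\,\hat{\mathbf{b}}'(t) - \omega^2 R\cos(\gamma t)\,\hat{\mathbf{b}}(t). \tag{2.1.29}$$
EXERCISE 2.1.8.$^C$ Verify all the equalities in (2.1.27).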
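To get a feeling for the relative sizes of the three terms in (2.1.29), the sketch below evaluates their magnitudes for an object moving along a meridian. The radius of the Earth, the speed of 250 m/s, the latitude of 45°, and the function name are assumptions chosen for illustration; the Earth's angular speed is taken from a 24-hour period.

```python
import math

OMEGA = 2 * math.pi / (24 * 3600)  # Earth's angular speed, rad/s (24-hour period assumed)
R_EARTH = 6.371e6                  # mean radius of the Earth, m (assumed)


def acceleration_terms(speed, latitude_deg):
    """Magnitudes of the three terms in (2.1.29) for motion along a meridian."""
    theta = math.radians(latitude_deg)
    gamma = speed / R_EARTH                  # angular speed along the meridian
    radial = gamma ** 2 * R_EARTH            # || gamma^2 R r_hat ||
    # The vector b' has length omega, so the Coriolis term has this magnitude:
    coriolis = 2 * gamma * R_EARTH * math.sin(theta) * OMEGA
    centripetal = OMEGA ** 2 * R_EARTH * math.cos(theta)
    return radial, coriolis, centripetal


radial, coriolis, centripetal = acceleration_terms(speed=250.0, latitude_deg=45.0)
print(f"radial      {radial:.4f} m/s^2")
print(f"Coriolis    {coriolis:.4f} m/s^2")
print(f"centripetal {centripetal:.4f} m/s^2")
```

For these inputs all three terms are of the order of $10^{-2}$ m/s$^2$, far below the gravitational acceleration, yet the Coriolis term is of the same order as the other two.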
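We now use (2.1.24) to study the dynamics of the point mass $m$ moving on (or close to) the surface of the Earth. Let $\mathbf{r}_1 = \mathbf{r}_1(t)$ be the trajectory of the point in the frame $O_1$. Unlike the discussion leading to (2.1.29), we will no longer assume that the point moves along a meridian. Suppose that $O$ is an inertial frame. If $\mathbf{F}$ is the force acting on $m$ in the frame $O$, then Newton's Second Law implies $m\ddot{\mathbf{r}}_0 = \mathbf{F}$. By (2.1.24),
$$m\ddot{\mathbf{r}}_1 = \mathbf{F} - 2m\boldsymbol{\omega} \times \dot{\mathbf{r}}_1 - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_1). \tag{2.1.30}$$
The force $\mathbf{F}$ is the sum of the Earth's gravitational force $\mathbf{F}_G = -(Km/\|\mathbf{r}_1\|^2)\,\hat{\mathbf{r}}_1$ and the net "propulsive" force $\mathbf{F}_P(t)$ that ensures the motion of the point. If we prescribe the trajectory $\mathbf{r}_1(t)$ of $m$ in the frame $O_1$, then the force $\mathbf{F}_P$ necessary to produce this trajectory is given by
$$\mathbf{F}_P(t) = m\ddot{\mathbf{r}}_1 + 2m\boldsymbol{\omega} \times \dot{\mathbf{r}}_1 + m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_1) - \mathbf{F}_G. \tag{2.1.31}$$
Conversely, if the force $\mathbf{F}_P = \mathbf{F}_P(t)$ is specified, then the resulting trajectory is determined by solving (2.1.30) with the corresponding initial conditions $\mathbf{r}_1(0)$, $\dot{\mathbf{r}}_1(0)$ and with $\mathbf{F} = \mathbf{F}_G + \mathbf{F}_P$.
Note that (2.1.30) can be written as
$$m\ddot{\mathbf{r}}_1 = \mathbf{F} - m\,\mathbf{a}_{cor} - m\,\mathbf{a}_c = \mathbf{F}_G + \mathbf{F}_P + \mathbf{F}_{cor} + \mathbf{F}_c.$$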
Both forces $\mathbf{F}_G$ and $\mathbf{F}_c$ act in the meridian plane $(NOP)$. Indeed, by the Law of Universal Gravitation, the gravitational force $\mathbf{F}_G$ acts along the line $OP$, where $P$ is the current location of the point. The centrifugal force $\mathbf{F}_c = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_1)$ acts in the direction of the vector $\hat{\mathbf{b}}$, as follows from the properties of the cross product; see Figure 2.1.5.
To analyze the effects of the Coriolis force $\mathbf{F}_{cor} = -m\,\mathbf{a}_{cor} = -2m\boldsymbol{\omega} \times \dot{\mathbf{r}}_1$ on the motion of the point mass, we again assume that the point is in the Northern Hemisphere and moves north along a meridian with constant angular speed. According to (2.1.26), the force $\mathbf{F}_{cor}$ is perpendicular to the meridian plane and is acting in the eastward direction. The magnitude of the force is proportional to $\sin\theta$, with the angle $\theta$ measured from the equator; see Figure 2.1.5. In particular, the force is the strongest at the North Pole, and the force is zero on the equator. By (2.1.25) and (2.1.31), to maintain the motion along a meridian, the force $\mathbf{F}_P$ must have a westward component to balance $\mathbf{F}_{cor}$. Thus, to move due North, the mass must be subject to a propulsive force $\mathbf{F}_P$ having a westward component. The other component of $\mathbf{F}_P$ is in the meridian plane.
The following exercise analyzes the Coriolis force when the motion is parallel to the equator.
EXERCISE 2.1.9. Let $\hat{\boldsymbol{\kappa}}$ be the northward unit vector along the axis through the North and South poles. We assume that the Earth rotates around this axis, and denote by $\omega\,\hat{\boldsymbol{\kappa}}$ the corresponding rotation vector. Let $O$ be the fixed inertial frame and $O_1$, the frame rotating with the Earth; the origins of both frames are at the Earth center. Suppose that a point mass $m$ is in the Northern hemisphere and moves East along a parallel (a circle cut on the surface of the Earth by a plane perpendicular to the line through the poles). Denote by $\dot{\mathbf{r}}_1$ the velocity of the point relative to the frame $O_1$ so that $\dot{\mathbf{r}}_1$ is perpendicular to the meridian plane; see Figure 2.1.6. By (2.1.24), the Coriolis force acting on the point mass is $\mathbf{F}_{cor} = -2m\omega\,\hat{\boldsymbol{\kappa}} \times \dot{\mathbf{r}}_1$.
Fig. 2.1.6 Motion Along a Parallel
(a)$^C$ Assume that both $\|\dot{\mathbf{r}}_1\|$ and $\theta$ stay (approximately) constant during the travel and that there is zero propulsion force $\mathbf{F}_P$, as after a missile has been fired. Ignore air resistance. (i) Show that the point mass is deflected to the South, and the magnitude of the deflection is $\omega\|\dot{\mathbf{r}}_1\|\sin\theta\,t^2$, where $t$ is the time of travel. Hint: verify that $\mathbf{F}_{cor} \cdot \hat{\boldsymbol{\theta}} = -2m\omega\|\dot{\mathbf{r}}_1\|\sin\theta$, and so $2m\omega\|\dot{\mathbf{r}}_1\|\sin\theta$ is the magnitude of the force pushing the point mass to the South. (ii) Suppose that $\theta = 41°$, $\|\dot{\mathbf{r}}_1\| = 1000$ meters per second, and the point mass travels 1000 kilometers. Verify that the point mass will be deflected by about 50 kilometers to the South: if the target is due East, the missile will miss the target if aimed due East. Hint: $2\omega\sin\theta \approx 10^{-4}$.
(b)$^A$ Compute the deflection of the point mass taking into account the change of $\|\dot{\mathbf{r}}_1\|$ and $\theta$ caused by the Coriolis force.
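The numbers in part (a)(ii) can be reproduced with a few lines of arithmetic. The sketch below only evaluates the deflection formula $\omega\|\dot{\mathbf{r}}_1\|\sin\theta\,t^2$ stated in the exercise, under the same simplifying assumptions (constant speed and latitude, no propulsion, no air resistance); the function name and the 24-hour rotation period are choices made for this illustration.

```python
import math

OMEGA_EARTH = 2 * math.pi / (24 * 3600)  # rad/s, assuming a 24-hour rotation period


def southward_deflection(speed, latitude_deg, distance):
    """Deflection omega * speed * sin(theta) * t^2 for eastward motion, in meters."""
    travel_time = distance / speed
    return OMEGA_EARTH * speed * math.sin(math.radians(latitude_deg)) * travel_time ** 2


deflection = southward_deflection(speed=1000.0, latitude_deg=41.0, distance=1.0e6)
hint_value = 2 * OMEGA_EARTH * math.sin(math.radians(41.0))

print(f"2 omega sin(theta) = {hint_value:.2e} 1/s")
print(f"deflection = {deflection / 1000:.1f} km")
```

The result is a little under 50 kilometers, consistent with the rounded value $2\omega\sin\theta \approx 10^{-4}$ used in the hint.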
We now summarize the effects of the Coriolis force on the motion of a point mass near the Earth.
• The force is equal to zero on the equator and is the strongest on the poles.
• For motion in the Northern Hemisphere:

Direction of motion     Deflection of trajectory
North                   East
South                   West
East                    South
West                    North

• For motion in the Southern Hemisphere:

Direction of motion     Deflection of trajectory
North                   West
South                   East
East                    North
West                    South
EXERCISE 2.1.10.$^B$ Verify the above properties of the Coriolis force. Hint: The particular shape of the trajectory does not matter; all you need is the direction of $\dot{\mathbf{r}}_1$, similar to Exercise 2.1.9.
EXERCISE 2.1.11.$^A$ Because of the Coriolis force, an object dropped down from a high building does not fall along a straight vertical line and lands to the side. Disregarding the air resistance, compute the direction and magnitude of this deviation for an object dropped from the top of the Empire State Building (or from another tall structure of your choice).
One of the most famous illustrations of the Coriolis force is the FOUCAULT PENDULUM, named after its creator, the French scientist JEAN BERNARD LEON FOUCAULT (1819-1869). Foucault was looking for an easy demonstration of the Earth's rotation around its axis, and around 1850 came up with the idea of a pendulum. He started with a small weight on a 6-foot-long wire in his cellar, and gradually increased both the weight and the length of the wire. He also found a better location to conduct his experiment. The culmination was the year 1851, when he built a pendulum consisting of a 67-meter-long wire and a 28 kg weight swinging through a three-meter arc. The other end of the wire was attached to the dome of the Paris Pantheon, and the pendulum was kept swinging via a special mechanism that compensated for the air resistance and allowed the swing in any vertical plane. Because of the rotation of the Earth around its axis, the Coriolis force was turning the plane of the swing by about 270 degrees every 24 hours, in the clockwise direction as seen from above.
EXERCISE 2.1.12.$^C$ Draw a picture and convince yourself that, if the pendulum starts to swing in the Northern Hemisphere in the meridian plane, then the Coriolis force will tend to turn the plane of the swing clockwise as seen from above.
Let us perform a simplified analysis of the motion of the Foucault pendulum in the Northern Hemisphere, away from the North Pole and the equator. Assume that the pendulum starts to swing in a meridian plane. Figure 2.1.7 presents a (grossly out-of-scale) illustration, with the Earth's surface represented by the semi-circle. In reality, the length of the support and the height of the supporting point $O_1$ are much smaller than the radius $R$ of the Earth, so that $|OO_1| \approx |OP_0| \approx R$. Also, the amplitude of the swing is small compared to the length of the support, so that $|O_1P_0| \approx |O_1P_N| = |O_1P_S|$, and the linear distance $|P_NP_S|$ is approximately equal to the length of the corresponding circular swing arc.
Fig. 2.1.7 Foucault Pendulum
The pendulum is suspended at the point $O_1$, and the points $P_N$, $P_S$ are the two extreme positions of the weight. The angle $\theta$ is the latitude of the support point $O_1$. Denote the distance $|P_NP_S|$ by $2r$. As seen from the picture, the point $P_N$ is closer to the axis $ON$ of the rotation of the Earth than the point $O_1$, and the amount of this difference is $|Q_NP_0| = r\sin\theta$. Similarly, the point $P_S$ is farther from the axis than $O_1$ by the same amount. Since the Earth is rotating around the axis $ON$ with angular speed $\omega$, the points $P_N$, $O_1$, and $P_S$ will all move in the direction perpendicular to the meridian plane. The point $P_N$ will move slower than $O_1$, and the point $P_S$, faster, causing the plane of the swing to turn. The speed of $P_S$ relative to $O_1$ and of $O_1$ relative to $P_N$ is $r\omega\sin\theta$. If we assume that these relative speeds stay the same throughout the revolution of the swing plane, then the weight will be rotating around the point $P_0$ with the speed $r\omega\sin\theta$, and we can find the time $T$ of one complete turn. In time $T$, the weight will move the full circle of radius $r$, covering the distance $2\pi r$. The speed of this motion is $r\omega\sin\theta$, and so
$$T = \frac{2\pi r}{r\omega\sin\theta} = \frac{T_0}{\sin\theta}, \tag{2.1.32}$$
where $T_0 = 2\pi/\omega \approx 24$ hours is the period of Earth's revolution around its axis. The plane of the swing will rotate $2\pi\sin\theta$ radians every 24 hours. For the original Foucault pendulum in Paris, we have $\theta \approx 48.6°$, which results in $T \approx 32$ hours, or a 270° turn every 24 hours.
Note that the result (2.1.32) is true, at least formally, on the poles and on the equator. Still,
• On the poles, T = 24 hours as the Earth is turning under the pendulum, making a full turn every 24 hours.
• On the equator, where there is no Coriolis force, the points $O_1, P_N, P_S$ are at the same distance from the axis $ON$ (this is only approximately true if the swinging is not in the plane of the equator). As a result, the plane of the swing does not change: $T = +\infty$.
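Formula (2.1.32) and the two limiting cases above are easy to tabulate. The following sketch uses $T_0 = 24$ hours as in the text; the function names and the choice to return infinity at the equator are conventions adopted for this illustration.

```python
import math

T0_HOURS = 24.0  # period of the Earth's rotation, as used in (2.1.32)


def foucault_period_hours(latitude_deg):
    """T = T0 / sin(theta) from (2.1.32); infinite on the equator."""
    s = abs(math.sin(math.radians(latitude_deg)))
    return math.inf if s == 0.0 else T0_HOURS / s


def daily_turn_degrees(latitude_deg):
    """Turn of the swing plane in 24 hours: 2 pi sin(theta) radians, in degrees."""
    return 360.0 * abs(math.sin(math.radians(latitude_deg)))


for name, latitude in (("equator", 0.0), ("Paris", 48.6), ("pole", 90.0)):
    period = foucault_period_hours(latitude)
    turn = daily_turn_degrees(latitude)
    print(f"{name:8s} T = {period:6.2f} h, daily turn = {turn:6.1f} deg")
```

For the latitude 48.6° quoted for Paris this gives a period close to 32 hours and a daily turn close to 270°, in agreement with the values above.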
EXERCISE 2.1.13.$^B$ Find the period $T$ for the Foucault pendulum in your home town.
The Coriolis force due to the Earth's rotation has greater effects on the motion than might be deduced from an intuitive approach based on the relative velocities of the moving object and the Earth. In particular, these effects must be taken into account when computing trajectories of long-range missiles. With all that, we must keep in mind that the effects of the Coriolis force due to the Earth's rotation are noticeable only for large-scale motions. In particular, the Coriolis force contributes to the erosion of the river banks, but has nothing to do with the direction of water swirling in the toilet bowl.
The Coriolis force also influences the direction of the ATMOSPHERIC WINDS. This was first theorized in 1856 by the American meteorologist WILLIAM FERREL (1817-1891) and formalized in 1857 by the Dutch meteorologist CHRISTOPH HEINRICH DIEDRICH BUYS BALLOT (1817-1890).
Fig. 2.1.8 Atmospheric Winds (bands from north to south: polar easterlies; temperate westerlies; calms of Cancer near 30°; easterly trade winds; doldrums; easterly trade winds; calms of Capricorn; temperate westerlies; polar easterlies)
The general wind pattern on the Earth is as follows (see Figure 2.1.8). Warm air rises vertically from the surface and is deflected by the Coriolis force, resulting in easterly trade winds, temperate westerlies, and polar easterlies. The deflection is to the right in the Northern Hemisphere and to the left in the Southern Hemisphere, so that the patterns in the two hemispheres are mirror images of each other. Three regions of relative calm form: the doldrums around the equator, calms of Cancer around the 30° parallel in the Northern Hemisphere, and calms of Capricorn around the 30° parallel in the Southern Hemisphere.
Let us discuss the formation of the easterly trade winds in the Northern Hemisphere. The Sun heats the surface of the Earth near the equator. The air near the equator also gets warm, becomes lighter, and moves up, creating the area of low pressure near the equator and causing the cooler air from the north to flow south. The flow of the cooler air from the North creates the area of low pressure at high altitudes, deflecting the rising warm air from the equator to the north. The Coriolis force deflects this flow to the East. At higher altitudes, the air cools down. Cooler, denser air descends around the 30° parallel and flows South back to the lower pressure area around the equator. The Coriolis force deflects this southward flow to the West. In the stationary regime, this circulation produces a steady wind from the North-East, the easterly trade winds.
EXERCISE 2.1.14. Explain the formation of the temperate westerlies and the polar easterlies.
EXERCISE 2.1.15.$^A$ The flight time from Los Angeles to Boston is usually different from the flight time from Boston to Los Angeles. Which flight takes longer? Which of the following factors contributes the most to this difference, and how: (a) The Earth's rotation under the airplane; (b) The Coriolis force acting on the airplane; (c) The atmospheric winds? Hint: If in doubt, check the schedules of direct flights between the two cities.
The complete mathematical model of atmospheric physics is vastly more complicated and is outside the scope of this book; a possible reference on the subject is the book An Introduction to Dynamic Meteorology by J. R. Holton, 2004, and some partial differential equations appearing in the modelling of flows of gases and liquids are discussed below in Section 6.3.5.
In 1963, while studying the differential equations of fluid convection, the American mathematician and meteorologist EDWARD NORTON LORENZ (b. 1917) discovered a chaotic behavior of the solution and a strange attractor. These Lorenz differential equations are a prime example of a chaotic flow. They also illustrate the intrinsic difficulty of accurate weather prediction. His book The Nature And Theory of The General Circulation of The Atmosphere, 1967, is another standard reference in atmospheric physics.
2.1.4 General Accelerating Frames
The analysis in the previous section essentially relied on equation (2.1.18) on page 51, which was derived for uniformly rotating frames. In what follows, we will use linear algebra to show that, with a proper definition of the vector $\boldsymbol{\omega}$, relation (2.1.18) continues to hold for arbitrary rotating frames.
Consider two cartesian coordinate systems: $(\hat{\imath}, \hat{\jmath}, \hat{k})$ with origin $O$, and $(\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1)$ with origin $O_1$. We assume that $O = O_1$; see Figure 2.1.9.
Consider a point $P$ in $\mathbb{R}^3$. This point has coordinates $(x, y, z)$ in $(\hat{\imath}, \hat{\jmath}, \hat{k})$ and $(x_1, y_1, z_1)$ in $(\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1)$. Then
$$x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k} = x_1\,\hat{\imath}_1 + y_1\,\hat{\jmath}_1 + z_1\,\hat{k}_1. \tag{2.1.33}$$
We now take the dot product of both sides of (2.1.33) with $\hat{\imath}$ to get
$$x = x_1(\hat{\imath}_1\cdot\hat{\imath}) + y_1(\hat{\jmath}_1\cdot\hat{\imath}) + z_1(\hat{k}_1\cdot\hat{\imath}). \tag{2.1.34}$$
Fig. 2.1.9 3-D Rotation of Frames
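Similarly, we take the dot product of both sides of (2.1.33) with $\hat{\jmath}$:
$$y = x_1(\hat{\imath}_1\cdot\hat{\jmath}) + y_1(\hat{\jmath}_1\cdot\hat{\jmath}) + z_1(\hat{k}_1\cdot\hat{\jmath}), \tag{2.1.35}$$
and with $\hat{k}$:
$$z = x_1(\hat{\imath}_1\cdot\hat{k}) + y_1(\hat{\jmath}_1\cdot\hat{k}) + z_1(\hat{k}_1\cdot\hat{k}). \tag{2.1.36}$$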
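The three equations (2.1.34)-(2.1.36) can be written as a single matrix-vector equation,
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \hat{\imath}_1\cdot\hat{\imath} & \hat{\jmath}_1\cdot\hat{\imath} & \hat{k}_1\cdot\hat{\imath} \\ \hat{\imath}_1\cdot\hat{\jmath} & \hat{\jmath}_1\cdot\hat{\jmath} & \hat{k}_1\cdot\hat{\jmath} \\ \hat{\imath}_1\cdot\hat{k} & \hat{\jmath}_1\cdot\hat{k} & \hat{k}_1\cdot\hat{k} \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix}. \tag{2.1.37}$$
Consider the matrix
$$U = \begin{pmatrix} \hat{\imath}_1\cdot\hat{\imath} & \hat{\jmath}_1\cdot\hat{\imath} & \hat{k}_1\cdot\hat{\imath} \\ \hat{\imath}_1\cdot\hat{\jmath} & \hat{\jmath}_1\cdot\hat{\jmath} & \hat{k}_1\cdot\hat{\jmath} \\ \hat{\imath}_1\cdot\hat{k} & \hat{\jmath}_1\cdot\hat{k} & \hat{k}_1\cdot\hat{k} \end{pmatrix}. \tag{2.1.38}$$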
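The construction of $U$ in (2.1.38) can be checked numerically. The following is a minimal sketch, not part of the original text: the rotation angle `theta`, the sample coordinates, and all function names are illustrative assumptions. It builds $U$ from the dot products of the basis vectors and confirms that (2.1.37) reproduces the coordinates obtained by expanding $x_1\hat{\imath}_1 + y_1\hat{\jmath}_1 + z_1\hat{k}_1$ directly.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[dot(row, col) for col in bt] for row in a]

def matvec(m, v):
    return [dot(row, v) for row in m]

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def frame_matrix(fixed, rotated):
    """Entry (r, c) is (c-th rotated basis vector) . (r-th fixed basis vector), as in (2.1.38)."""
    return [[dot(rotated[c], fixed[r]) for c in range(3)] for r in range(3)]

# Assumed example: the second basis is the first one rotated by theta about k.
theta = 0.7
fixed = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rotated = [[math.cos(theta), math.sin(theta), 0.0],
           [-math.sin(theta), math.cos(theta), 0.0],
           [0.0, 0.0, 1.0]]

U = frame_matrix(fixed, rotated)
coords_rotated = [2.0, -1.0, 0.5]           # (x1, y1, z1)
coords_fixed = matvec(U, coords_rotated)    # (x, y, z) from (2.1.37)

# Independent check: expand x1*i1 + y1*j1 + z1*k1 component by component.
direct = [sum(coords_rotated[c] * rotated[c][r] for c in range(3)) for r in range(3)]
```

For this example the product of `U` with its transpose is the identity up to rounding and `det3(U)` is close to 1, which is what Exercise 2.1.16 asks to verify in general.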
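EXERCISE 2.1.16. (a)$^{C}$ Verify that the matrix $U$ is orthogonal, that is, $UU^T = U^TU = I$. Hint: $1 = \hat{\imath}_1\cdot\hat{\imath}_1 = (\hat{\imath}_1\cdot\hat{\imath})^2 + (\hat{\imath}_1\cdot\hat{\jmath})^2 + (\hat{\imath}_1\cdot\hat{k})^2$, $0 = \hat{\imath}_1\cdot\hat{\jmath}_1 = (\hat{\imath}_1\cdot\hat{\imath})(\hat{\jmath}_1\cdot\hat{\imath}) + (\hat{\imath}_1\cdot\hat{\jmath})(\hat{\jmath}_1\cdot\hat{\jmath}) + (\hat{\imath}_1\cdot\hat{k})(\hat{\jmath}_1\cdot\hat{k})$. (b)$^{A}$ Verify that the determinant of the matrix $U$ is equal to 1. (c)$^{A}$ Verify that the matrix $U$ is a representation, in the basis $(\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1)$, of an orthogonal transformation (see Exercise 8.1.4, page 453, in Appendix). This transformation rotates the frame $O_1$ so that $(\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1)$ moves into $(\hat{\imath}, \hat{\jmath}, \hat{k})$. (d)$^{A}$ Verify that the matrix $U^T$ is a representation, in the basis $(\hat{\imath}, \hat{\jmath}, \hat{k})$, of an orthogonal transformation that rotates the space so that $(\hat{\imath}, \hat{\jmath}, \hat{k})$ moves into $(\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1)$.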
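Now assume that the picture in Figure 2.1.9 is changing in time as follows:
• Frame $O$ and its coordinate system $(\hat{\imath}, \hat{\jmath}, \hat{k})$ are fixed (not moving).
• $O_1(t) = O$ for all $t$.
• The coordinate system $(\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1)$ in frame $O_1$ is moving (rotating) relative to $(\hat{\imath}, \hat{\jmath}, \hat{k})$.
• The point $P$ is fixed in frame $O_1$ relative to $(\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1)$.
Thus, $x_1, y_1, z_1$ are constants and $P$ is rotating in the $(\hat{\imath}, \hat{\jmath}, \hat{k})$ frame. Then (2.1.33) becomes
$$x(t)\,\hat{\imath} + y(t)\,\hat{\jmath} + z(t)\,\hat{k} = x_1\,\hat{\imath}_1(t) + y_1\,\hat{\jmath}_1(t) + z_1\,\hat{k}_1(t).$$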
Define the matrix $U = U(t)$ according to (2.1.38), and assume that the entries of the matrix $U$ are differentiable functions of time; it is a reasonable assumption if the rotation is without jerking. Since $U(t)U^T(t) = I$ for all $t$, it follows that $d(UU^T)/dt = 0$, the zero matrix. The product rule applies to matrix differentiation and therefore
$$\dot{U}U^T + U\dot{U}^T = \dot{U}U^T + (\dot{U}U^T)^T = 0,$$
which means that $\Omega(t) = \dot{U}(t)U^T(t)$ is antisymmetric, that is, has the form
$$\Omega(t) = \begin{pmatrix} 0 & -\omega_3(t) & \omega_2(t) \\ \omega_3(t) & 0 & -\omega_1(t) \\ -\omega_2(t) & \omega_1(t) & 0 \end{pmatrix}.$$
We use the entries of the matrix $\Omega(t)$ to define mathematically the instantaneous rotation vector in the fixed frame $O$:
$$\boldsymbol{\omega}(t) = \omega_1(t)\,\hat{\imath} + \omega_2(t)\,\hat{\jmath} + \omega_3(t)\,\hat{k}. \tag{2.1.39}$$
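The definition (2.1.39) can be tried out on a concrete time-dependent rotation. The sketch below is an illustration under stated assumptions: the matrix function `U_example` is a made-up smooth rotation, and the time derivative of $U$ is approximated by a central difference rather than computed exactly. It forms the product of the derivative of $U$ with $U^T$, reads the rotation vector off the antisymmetric result, and compares the matrix-vector product with the cross product for a sample vector.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def U_example(t):
    # Hypothetical non-uniform rotation, chosen only for illustration.
    return matmul(rot_z(0.3 * t * t), rot_x(math.sin(t)))

def omega_matrix(U_of_t, t, h=1e-5):
    """Approximate (dU/dt) U^T at time t using a central difference for dU/dt."""
    up, um = U_of_t(t + h), U_of_t(t - h)
    U_dot = [[(up[i][j] - um[i][j]) / (2 * h) for j in range(3)] for i in range(3)]
    return matmul(U_dot, transpose(U_of_t(t)))

def rotation_vector(Om):
    """Components (w1, w2, w3) read off the antisymmetric matrix, as in (2.1.39)."""
    return [Om[2][1], Om[0][2], Om[1][0]]

Om = omega_matrix(U_example, 1.2)
w = rotation_vector(Om)
R = [0.4, -1.0, 2.0]
Om_R = [sum(Om[i][k] * R[k] for k in range(3)) for i in range(3)]
w_cross_R = cross(w, R)
```

Up to the finite-difference error, `Om` is antisymmetric and `Om_R` agrees with `w_cross_R`. For the uniform rotation `rot_z(0.5 * t)` the same procedure returns a vector close to (0, 0, 0.5), as expected for rotation about $\hat{k}$ with angular speed 0.5.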
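EXERCISE 2.1.17.$^{C}$ (a) Verify that, for every vector $\mathbf{R} = R_1\,\hat{\imath} + R_2\,\hat{\jmath} + R_3\,\hat{k}$ and each $t$,
$$\Omega(t)\,\mathbf{R} = \boldsymbol{\omega}(t)\times\mathbf{R}. \tag{2.1.40}$$
Hint: direct computation. (b) Consider the vector $\overrightarrow{OP} = \mathbf{r}_0(t) = x(t)\,\hat{\imath} + y(t)\,\hat{\jmath} + z(t)\,\hat{k}$ rotating in the frame $O$. Verify that, with the above definition of $\boldsymbol{\omega}$, we have
$$\dot{\mathbf{r}}_0(t) = \boldsymbol{\omega}(t)\times\mathbf{r}_0(t). \tag{2.1.41}$$
Hint: write relation (2.1.37) as $\mathbf{r}_0(t) = U(t)\,\tilde{\mathbf{r}}_1$, where $\tilde{\mathbf{r}}_1 = x_1\,\hat{\imath} + y_1\,\hat{\jmath} + z_1\,\hat{k}$. Then $\tilde{\mathbf{r}}_1 = U^T(t)\,\mathbf{r}_0(t)$ and $\dot{\mathbf{r}}_0(t) = \dot{U}(t)\,\tilde{\mathbf{r}}_1$.
Note that (2.1.41) agrees with (2.1.18) on page 51 when $\boldsymbol{\omega}(t)$ is constant and justifies the above definition of $\boldsymbol{\omega}(t)$ as a rotation vector.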
Denote by $\mathbf{r}_0(t)$ the position vector of a point $P$ in the frame $O$, and by $\mathbf{r}_1(t)$, the position of the same point in the frame $O_1$. Since the frames have the same origin, we have $\mathbf{r}_0(t) = \mathbf{r}_1(t)$ for all $t$. On the other hand, because of the relative rotation of the frames, the values of $\dot{\mathbf{r}}_0(t)$ and $\dot{\mathbf{r}}_1(t)$ are different. As we did earlier on page 52, denote by $D_0$ and $D_1$ the derivatives with respect to time in the frames $O$ and $O_1$, respectively. If the point $P$ is fixed in the rotating frame $O_1$ and $\overrightarrow{O_1P} = \mathbf{r}_1 = x_1\,\hat{\imath}_1 + y_1\,\hat{\jmath}_1 + z_1\,\hat{k}_1$ is the position vector of $P$ in $O_1$, then $D_1\mathbf{r}_0(t) = \dot{\mathbf{r}}_1(t) = \mathbf{0}$, and (2.1.41) implies $D_0\mathbf{r}_0(t) = \dot{\mathbf{r}}_0(t) = \boldsymbol{\omega}\times\mathbf{r}_0(t)$.
Similar to the derivation of (2.1.21), we can show that if the point $P$ moves relative to the frame $O_1$, then
$$\dot{\mathbf{r}}_0(t) = \dot{\mathbf{r}}_1(t) + \boldsymbol{\omega}\times\mathbf{r}_0(t). \tag{2.1.42}$$
Therefore, for every vector $\mathbf{R} = \mathbf{R}(t)$, expressed as functions in frames $O$ and $O_1$, both denoted by $\mathbf{R}$,
$$D_0\mathbf{R}(t) = D_1\mathbf{R}(t) + \boldsymbol{\omega}(t)\times\mathbf{R}(t). \tag{2.1.43}$$
EXERCISE 2.1.18.$^{C}$ Verify (2.1.43). Hint: write $\mathbf{R}(t) = x(t)\,\hat{\imath}_1(t) + y(t)\,\hat{\jmath}_1(t) + z(t)\,\hat{k}_1(t)$ and differentiate this equality using the product rule. Since the vectors $\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1$ are fixed in the frame $O_1$, you can use equality (2.1.41) to compute the time derivatives of these vectors. Also, by definition, $D_1\mathbf{R}(t) = \dot{x}(t)\,\hat{\imath}_1(t) + \dot{y}(t)\,\hat{\jmath}_1(t) + \dot{z}(t)\,\hat{k}_1(t)$.
Remark 2.1 Let us stress that the vector $\mathbf{r}_0(t) = x(t)\,\hat{\imath} + y(t)\,\hat{\jmath} + z(t)\,\hat{k}$ is the same as the vector $\mathbf{r}_1(t) = x_1(t)\,\hat{\imath}_1(t) + y_1(t)\,\hat{\jmath}_1(t) + z_1(t)\,\hat{k}_1(t)$: both are equal to $\overrightarrow{OP}$ even if the point $P$ moves relative to the frame $O_1$. As a result, $\mathbf{r}_0(t) \neq U(t)\,\mathbf{r}_1(t)$. There is no contradiction with (2.1.37), because the components of the vectors $\mathbf{r}_0$, $\mathbf{r}_1$ are defined in different frames and cannot be related by a matrix-vector product. What does follow from (2.1.37) is the equality $\mathbf{r}_0(t) = U(t)\,\tilde{\mathbf{r}}_1(t)$, where $\tilde{\mathbf{r}}_1(t) = x_1(t)\,\hat{\imath} + y_1(t)\,\hat{\jmath} + z_1(t)\,\hat{k}$. Also keep in mind that, despite the equality of the vectors $\mathbf{r}_0(t) = \mathbf{r}_1(t)$, the curve $C$ defined by $\mathbf{r}_0$ in the frame $O$ is different from the curve $C_1$ defined by $\mathbf{r}_1$ in the frame $O_1$. For example, if $P$ is fixed in $O_1$, then $C_1$ is just a single point.
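EXERCISE 2.1.19.$^{C}$ Consider the special case of a uniform rotation of frame $O_1$ relative to frame $O$ so that the origins of the frames coincide, $\hat{k} = \hat{k}_1$, and the rotation vector is $\boldsymbol{\omega} = \omega_3\,\hat{k}$. Calculate $U(t)$ and $\dot{U}(t)$. Show that the matrix $\Omega = \dot{U}(t)U^T(t)$ has the form
$$\Omega = \begin{pmatrix} 0 & -\omega_3 & 0 \\ \omega_3 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
For the point $P$ fixed in the rotating frame and having the position vector in the fixed frame $\mathbf{r}_0(t) = x(t)\,\hat{\imath} + y(t)\,\hat{\jmath} + z(t)\,\hat{k}$, show that
$$\dot{\mathbf{r}}_0(t) = \Omega\,\mathbf{r}_0(t) = -\omega_3\,y(t)\,\hat{\imath} + \omega_3\,x(t)\,\hat{\jmath} = \boldsymbol{\omega}\times\mathbf{r}_0(t).$$
As a result, you recover relation (2.1.18), which we derived geometrically on page 51.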
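To continue our analysis of rotation, assume that the vector function $\boldsymbol{\omega} = \boldsymbol{\omega}(t)$ is differentiable in $t$. Then we can set $\mathbf{R} = \dot{\mathbf{r}}_0 = D_0\mathbf{r}_0$ in (2.1.43) and use (2.1.42) to derive the relation between the accelerations of the point in the two frames:
$$\ddot{\mathbf{r}}_0(t) = \ddot{\mathbf{r}}_1(t) + 2\,\boldsymbol{\omega}(t)\times\dot{\mathbf{r}}_1(t) + \boldsymbol{\omega}(t)\times\big(\boldsymbol{\omega}(t)\times\mathbf{r}_0(t)\big) + \dot{\boldsymbol{\omega}}(t)\times\mathbf{r}_0(t); \tag{2.1.44}$$
as before, $\mathbf{r}_0$ and $\mathbf{r}_1$ are the position vectors of the point in the frames $O$ and $O_1$, respectively.
EXERCISE 2.1.20.$^{C}$ (a) Verify (2.1.44). Hint: $\ddot{\mathbf{r}}_0(t) = D_0\dot{\mathbf{r}}_0(t) = D_0(\dot{\mathbf{r}}_1 + \boldsymbol{\omega}\times\mathbf{r}_0) = D_1\dot{\mathbf{r}}_1 + \boldsymbol{\omega}\times\dot{\mathbf{r}}_1 + \dot{\boldsymbol{\omega}}\times\mathbf{r}_0 + \boldsymbol{\omega}\times(\dot{\mathbf{r}}_1 + \boldsymbol{\omega}\times\mathbf{r}_0)$. Note that both $\boldsymbol{\omega}$ and $\mathbf{r}_0$ are defined in the same frame $O$, so the product rule (1.3.6), page 26, applies. (b) Verify that (2.1.44) can be written as
$$\ddot{\mathbf{r}}_0(t) = \ddot{\mathbf{r}}_1(t) + 2\,\boldsymbol{\omega}(t)\times\dot{\mathbf{r}}_1(t) + \boldsymbol{\omega}(t)\times\big(\boldsymbol{\omega}(t)\times\mathbf{r}_1(t)\big) + \dot{\boldsymbol{\omega}}(t)\times\mathbf{r}_1(t). \tag{2.1.45}$$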
Finally, assume that the point $O_1$ is moving relative to $O$ so that the function $\mathbf{r}_{01}(t) = \overrightarrow{OO_1}$ is twice continuously differentiable. Then we have $\mathbf{r}_0(t) = \mathbf{r}_{01}(t) + \mathbf{r}_1(t)$. Consider the parallel translation of the frame $O$ with the origin $O'$ at $O_1$, and define the rotation vector $\boldsymbol{\omega}$ to describe the rotation of the frame $O_1$ relative to this translated frame $O'$. We can now combine relation (2.1.44) for rotation with relation (2.1.12) on page 47 for parallel translation to get
$$\ddot{\mathbf{r}}_0(t) = \ddot{\mathbf{r}}_{01}(t) + \ddot{\mathbf{r}}_1(t) + 2\,\boldsymbol{\omega}(t)\times\dot{\mathbf{r}}_1(t) + \boldsymbol{\omega}(t)\times\big(\boldsymbol{\omega}(t)\times\mathbf{r}_1(t)\big) + \dot{\boldsymbol{\omega}}(t)\times\mathbf{r}_1(t). \tag{2.1.46}$$
EXERCISE 2.1.21.$^{C}$ Verify (2.1.46). Hint: Apply (2.1.45) to $\mathbf{r}_1$, replacing frame $O$ with $O'$.
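Suppose that the frame $O$ is inertial, and a force $\mathbf{F}$ is acting on the point mass $m$. Then, by Newton's Second Law, $m\,\ddot{\mathbf{r}}_0(t) = \mathbf{F}$; to simplify the notations we will no longer write the time dependence explicitly. By (2.1.46),
$$m\,\ddot{\mathbf{r}}_1 = \mathbf{F} - m\,\ddot{\mathbf{r}}_{01} - 2m\,\boldsymbol{\omega}\times\dot{\mathbf{r}}_1 - m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}_1) - m\,\dot{\boldsymbol{\omega}}\times\mathbf{r}_1. \tag{2.1.47}$$
As before in (2.1.13), page 47, and in (2.1.24), page 53, we have several corrections to Newton's Second Law in the non-inertial frame $O_1$. These corrections are the translational acceleration force $\mathbf{F}_{ta} = -m\,\ddot{\mathbf{r}}_{01}$, the Coriolis force $\mathbf{F}_{cor} = -2m\,\boldsymbol{\omega}\times\dot{\mathbf{r}}_1$, the centrifugal force $\mathbf{F}_c = -m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}_1)$, and the angular acceleration force $\mathbf{F}_{aa} = -m\,\dot{\boldsymbol{\omega}}\times\mathbf{r}_1$.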
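The four correction terms in (2.1.47) are straightforward to evaluate once the vectors are given by their components. The following sketch is illustrative only: the function name `fictitious_forces`, the dictionary keys, and the numerical values are assumptions, and all vectors are taken to be expressed in one common basis.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def scale(c, v):
    return [c * x for x in v]

def fictitious_forces(m, omega, omega_dot, r1, r1_dot, r01_ddot):
    """Correction terms of (2.1.47) for a point mass m in the non-inertial frame O1."""
    return {
        "translational": scale(-m, r01_ddot),
        "coriolis": scale(-2.0 * m, cross(omega, r1_dot)),
        "centrifugal": scale(-m, cross(omega, cross(omega, r1))),
        "angular_acceleration": scale(-m, cross(omega_dot, r1)),
    }

# Assumed example: uniform rotation about k with angular speed 2, no translation
# of the origin; the mass sits at distance 3 from the axis and moves along j.
forces = fictitious_forces(
    m=1.5,
    omega=[0.0, 0.0, 2.0],
    omega_dot=[0.0, 0.0, 0.0],
    r1=[3.0, 0.0, 0.0],
    r1_dot=[0.0, 1.0, 0.0],
    r01_ddot=[0.0, 0.0, 0.0],
)
```

In this example the centrifugal force points away from the rotation axis and has magnitude $m\omega^2 \|\mathbf{r}_1\| = 1.5\cdot 4\cdot 3 = 18$, while the Coriolis force is perpendicular to both $\boldsymbol{\omega}$ and the relative velocity.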
2.2 Systems of Point Masses
The motion of a system of point masses can be decomposed into the motion of one point, the center of mass, and the rotational motion of the system around the center of mass. In what follows, we study this decomposition, first for a finite collection of point masses, and then for certain infinite collections, namely, rigid bodies.
2.2.1 Non-Rigid Systems of Points
Let $S$ be a system of $n$ point masses, $m_1, \ldots, m_n$, and $M = \sum_{j=1}^{n} m_j$, the total mass of $S$. We assume that the system is non-rigid, that is, the distances between the points can change. Denote by $\mathbf{r}_j$ the position vector of $m_j$ in some frame $O$. By definition, the center of mass (CM) of $S$ is the point with position vector
$$\mathbf{r}_{CM} = \frac{1}{M}\sum_{j=1}^{n} m_j\,\mathbf{r}_j. \tag{2.2.1}$$
We will see that some information about the motion of $S$ can be obtained by considering a single point mass $M$ with position vector $\mathbf{r}_{CM}$.
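Formula (2.2.1) is a mass-weighted average and translates directly into code. This is a minimal sketch with made-up masses and positions; the function name `center_of_mass` and the shift vector are illustrative assumptions. It also shows numerically that shifting the origin shifts the computed position by the same vector, so the point in space does not change.

```python
def center_of_mass(masses, positions):
    """Mass-weighted average of the position vectors, as in (2.2.1)."""
    M = sum(masses)
    return [sum(m * r[i] for m, r in zip(masses, positions)) / M for i in range(3)]

masses = [1.0, 2.0, 3.0]
positions = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
r_cm = center_of_mass(masses, positions)

# Positions of the same masses measured from a different origin O':
# every position vector changes by the same vector O'O.
shift = [1.0, -2.0, 5.0]
shifted_positions = [[c + s for c, s in zip(r, shift)] for r in positions]
r_cm_shifted = center_of_mass(masses, shifted_positions)
```

Here `r_cm` is (1, 1, 0) and `r_cm_shifted` is (2, -1, 5), which differs from `r_cm` exactly by `shift`.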
EXERCISE 2.2.1.$^{C}$ Verify that a change of the reference point $O$ does not change either the location in space of the center of mass or formula (2.2.1) for determining the location: if $O'$ is any other frame and $\tilde{\mathbf{r}}_j$ is the position vector of $m_j$ in $O'$, then the position vector of CM in $O'$ is $\tilde{\mathbf{r}}_{CM} = (1/M)\sum_{j=1}^{n} m_j\,\tilde{\mathbf{r}}_j$. Hint: $\tilde{\mathbf{r}}_j = \mathbf{r}_j + \overrightarrow{O'O}$ and so $\tilde{\mathbf{r}}_{CM} = \mathbf{r}_{CM} + \overrightarrow{O'O}$, which is the same point in space.
EXERCISE 2.2.2. (a)$^{B}$ Show that the center of mass for three equal masses not on the same line is at the intersection of the medians of the corresponding triangle. (b)$^{A}$ Four equal masses are at the vertices of a regular tetrahedron. Locate the center of mass.
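The velocity and acceleration of the center of mass are $\dot{\mathbf{r}}_{CM}$ and $\ddot{\mathbf{r}}_{CM}$, respectively. Differentiating (2.2.1) with respect to $t$, we obtain the relations
$$\dot{\mathbf{r}}_{CM} = \frac{1}{M}\sum_{j=1}^{n} m_j\,\dot{\mathbf{r}}_j, \tag{2.2.2}$$
$$\ddot{\mathbf{r}}_{CM} = \frac{1}{M}\sum_{j=1}^{n} m_j\,\ddot{\mathbf{r}}_j. \tag{2.2.3}$$
To study the motion of the center of mass, suppose that the reference frame $O$ is inertial and denote by $\mathbf{F}_j$, $1 \le j \le n$, the force acting on the point mass $m_j$. Then $\mathbf{F}_j = m_j\,\ddot{\mathbf{r}}_j$, and, multiplying (2.2.3) by $M$, we get the relation
$$M\,\ddot{\mathbf{r}}_{CM} = \sum_{j=1}^{n} m_j\,\ddot{\mathbf{r}}_j = \sum_{j=1}^{n}\mathbf{F}_j = \mathbf{F}. \tag{2.2.4}$$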
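Equality (2.2.4) suggests that the total force $\mathbf{F}$ can be assumed to act on a point mass $M$ at the position $\mathbf{r}_{CM}$. As a next step, we will study the structure of the force $\mathbf{F}$.
Typically, each $\mathbf{F}_j$ is a sum of an external force $\mathbf{F}_j^{(E)}$ from outside of $S$ and an internal system force $\mathbf{F}_j^{(I)}$, exerted on $m_j$ by the other $n-1$ point masses. Thus, $\mathbf{F}_j^{(I)} = \sum_{k\neq j}\mathbf{F}_{jk}^{(I)}$, where $\mathbf{F}_{jk}^{(I)}$ is the force exerted by $m_k$ on $m_j$. Hence,
$$\mathbf{F} = \sum_{j=1}^{n}\mathbf{F}_j = \sum_{j=1}^{n}\mathbf{F}_j^{(E)} + \sum_{j=1}^{n}\mathbf{F}_j^{(I)} = \mathbf{F}^{(E)} + \mathbf{F}^{(I)}.$$
By Newton's Third Law, $\mathbf{F}_{jk}^{(I)} = -\mathbf{F}_{kj}^{(I)}$. It follows that $\mathbf{F}^{(I)} = \sum_{j=1}^{n}\mathbf{F}_j^{(I)} = \mathbf{0}$ and therefore $\mathbf{F} = \sum_{j=1}^{n}\mathbf{F}_j^{(E)} = \mathbf{F}^{(E)}$. By (2.2.4), the motion of the center of mass is then determined by
$$M\,\ddot{\mathbf{r}}_{CM} = \mathbf{F}^{(E)}. \tag{2.2.5}$$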
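The (linear) momentum $\mathbf{p}_{CM}$ of the center of mass is, by definition,
$$\mathbf{p}_{CM} = M\,\dot{\mathbf{r}}_{CM}.$$
With this definition, equation (2.2.5) becomes $\dot{\mathbf{p}}_{CM} = \mathbf{F}^{(E)}$, and if the net external force $\mathbf{F}^{(E)}$ is zero, then $\mathbf{p}_{CM}$ is constant. By (2.2.2),
$$\mathbf{p}_{CM} = \sum_{j=1}^{n} m_j\,\dot{\mathbf{r}}_j = \sum_{j=1}^{n}\mathbf{p}_j, \tag{2.2.6}$$
where $\mathbf{p}_j = m_j\,\dot{\mathbf{r}}_j$ is the momentum of $m_j$. Thus, if $\mathbf{F}^{(E)} = \mathbf{0}$, then the total linear momentum $\mathbf{p}_S = \sum_{j=1}^{n}\mathbf{p}_j$ of the system is conserved.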
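Next, we consider the rotational motion of the system. By definition (see (2.1.4) on page 40), the angular momentum $\mathbf{L}_{O,j}$ of the point mass $m_j$ about the reference point $O$ is given by $\mathbf{L}_{O,j} = \mathbf{r}_j\times m_j\,\dot{\mathbf{r}}_j = \mathbf{r}_j\times\mathbf{p}_j$. Accordingly, we define the angular momentum $\mathbf{L}_O$ of the system $S$ about $O$ as the sum of the $\mathbf{L}_{O,j}$:
$$\mathbf{L}_O = \sum_{j=1}^{n}\mathbf{L}_{O,j} = \sum_{j=1}^{n}\mathbf{r}_j\times m_j\,\dot{\mathbf{r}}_j = \sum_{j=1}^{n}\mathbf{r}_j\times\mathbf{p}_j. \tag{2.2.7}$$
For the purpose of the definition, it is not necessary to assume that the frame $O$ is inertial. Note that, unlike the relation (2.2.6) for the linear momentum, in general $\mathbf{L}_O \neq \mathbf{r}_{CM}\times M\,\dot{\mathbf{r}}_{CM}$.
From (2.2.7) it follows that
$$\frac{d\mathbf{L}_O}{dt} = \sum_{j=1}^{n}\dot{\mathbf{r}}_j\times m_j\,\dot{\mathbf{r}}_j + \sum_{j=1}^{n}\mathbf{r}_j\times m_j\,\ddot{\mathbf{r}}_j = \sum_{j=1}^{n}\mathbf{r}_j\times m_j\,\ddot{\mathbf{r}}_j,$$
since $\dot{\mathbf{r}}_j\times\dot{\mathbf{r}}_j = \mathbf{0}$. Now, we again assume that the frame $O$ is inertial. Then $\mathbf{F}_j = m_j\,\ddot{\mathbf{r}}_j$ and $d\mathbf{L}_O/dt = \sum_{j=1}^{n}\mathbf{r}_j\times\mathbf{F}_j = \sum_{j=1}^{n}\mathbf{T}_j$, where $\mathbf{T}_j = \mathbf{r}_j\times\mathbf{F}_j$ is the torque about $O$ of the force $\mathbf{F}_j$ acting on $m_j$. We define the total torque $\mathbf{T}_O = \sum_{j=1}^{n}\mathbf{T}_j$ and conclude that
$$\frac{d\mathbf{L}_O}{dt} = \mathbf{T}_O. \tag{2.2.8}$$
Equation (2.2.8) is an extension of (2.1.6), page 41, to finite systems of point masses. If $\mathbf{T}_O = \mathbf{0}$, then $\mathbf{L}_O$ is constant, that is, angular momentum is conserved.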
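In general, unlike the equation for the linear momentum (2.2.5), the torque $\mathbf{T}_O$ in (2.2.8) includes both the internal and external forces. If the internal forces are central, then only external forces appear in (2.2.8). Indeed, let us compute the total torque in the case of CENTRAL INTERNAL FORCES. The internal force $\mathbf{F}_j^{(I)}$ acting on particle $j$ is $\mathbf{F}_j^{(I)} = \sum_{k=1,\,k\neq j}^{n}\mathbf{F}_{jk}^{(I)}$. By Newton's Third Law, we have $\mathbf{F}_{jk}^{(I)} = -\mathbf{F}_{kj}^{(I)}$. Then
$$\frac{d\mathbf{L}_O}{dt} = \sum_{j=1}^{n}\mathbf{r}_j\times\mathbf{F}_j = \sum_{j=1}^{n}\mathbf{r}_j\times\mathbf{F}_j^{(E)} + \sum_{j=1}^{n}\mathbf{r}_j\times\sum_{k=1,\,k\neq j}^{n}\mathbf{F}_{jk}^{(I)}.$$
The terms in the product $\sum_{j=1}^{n}\mathbf{r}_j\times\sum_{k=1,\,k\neq j}^{n}\mathbf{F}_{jk}^{(I)}$ can be arranged as a sum of pairs $\mathbf{r}_j\times\mathbf{F}_{jk}^{(I)} + \mathbf{r}_k\times\mathbf{F}_{kj}^{(I)}$ for each $(j,k)$ with $j\neq k$. Also, $\mathbf{r}_j\times\mathbf{F}_{jk}^{(I)} + \mathbf{r}_k\times\mathbf{F}_{kj}^{(I)} = (\mathbf{r}_j - \mathbf{r}_k)\times\mathbf{F}_{jk}^{(I)}$. The vector $\mathbf{r}_k - \mathbf{r}_j$ is on the line joining $m_j$ and $m_k$. If the forces $\mathbf{F}_{jk}^{(I)}$ are central, as in the cases of gravitational and electrostatic forces, then the vector $\mathbf{F}_{jk}^{(I)}$ is parallel to the vector $(\mathbf{r}_j - \mathbf{r}_k)$, and $(\mathbf{r}_j - \mathbf{r}_k)\times\mathbf{F}_{jk}^{(I)} = \mathbf{0}$. In other words, the internal forces do not contribute to the torque, and (2.2.8) becomes
$$\frac{d\mathbf{L}_O}{dt} = \sum_{j=1}^{n}\mathbf{r}_j\times\mathbf{F}_j^{(E)} = \sum_{j=1}^{n}\mathbf{T}_j^{(E)} = \mathbf{T}_O^{(E)}, \tag{2.2.9}$$
where $\mathbf{T}_j^{(E)} = \mathbf{r}_j\times\mathbf{F}_j^{(E)}$ is the external torque of $\mathbf{F}_j^{(E)}$ and $\mathbf{T}_O^{(E)}$ is the total torque on the system $S$ by the external forces.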
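The cancellation of internal forces and torques can be observed numerically for a specific central force. The sketch below assumes Newtonian gravity between the point masses, with the gravitational constant set to 1 for convenience; the masses, positions, and function names are illustrative. It accumulates the net internal force and the net internal torque about the origin over all ordered pairs.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def internal_force(j, k, masses, positions, G=1.0):
    """Gravitational force exerted by mass k on mass j; it is parallel to r_k - r_j."""
    d = sub(positions[k], positions[j])
    dist = math.sqrt(sum(c * c for c in d))
    return [G * masses[j] * masses[k] * c / dist ** 3 for c in d]

masses = [1.0, 2.5, 0.7, 3.0]
positions = [[0.0, 0.0, 0.0], [1.0, 2.0, -1.0], [-2.0, 0.5, 3.0], [4.0, -1.0, 1.5]]

net_force = [0.0, 0.0, 0.0]
net_torque = [0.0, 0.0, 0.0]
for j in range(len(masses)):
    for k in range(len(masses)):
        if k != j:
            F_jk = internal_force(j, k, masses, positions)
            net_force = add(net_force, F_jk)
            net_torque = add(net_torque, cross(positions[j], F_jk))
```

Both `net_force` and `net_torque` come out as zero up to rounding, in agreement with the argument leading to (2.2.5) and (2.2.9).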
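Next, we look at the ANGULAR MOMENTUM OF A SYSTEM RELATIVE TO THE CENTER OF MASS. Again, let $O$ be an arbitrary frame of reference, let $\mathbf{r}_j(t)$ be the position of mass $m_j$, $1 \le j \le n$, and let $\mathbf{r}_{CM}(t)$ be the position of the center of mass. Define by $\mathfrak{r}_j$ the position of $m_j$ relative to the center of mass:
$$\mathfrak{r}_j = \mathbf{r}_j - \mathbf{r}_{CM}.$$
Then $\mathbf{r}_j = \mathbf{r}_{CM} + \mathfrak{r}_j$ and, according to (2.2.7),
$$\mathbf{L}_O = \sum_{j=1}^{n} m_j\,\mathbf{r}_j\times\dot{\mathbf{r}}_j = \sum_{j=1}^{n} m_j\,(\mathbf{r}_{CM} + \mathfrak{r}_j)\times(\dot{\mathbf{r}}_{CM} + \dot{\mathfrak{r}}_j) \tag{2.2.10}$$
$$= \sum_{j=1}^{n} m_j\,\mathbf{r}_{CM}\times\dot{\mathbf{r}}_{CM} + \sum_{j=1}^{n} m_j\,\mathfrak{r}_j\times\dot{\mathbf{r}}_{CM} + \sum_{j=1}^{n} m_j\,\mathbf{r}_{CM}\times\dot{\mathfrak{r}}_j + \sum_{j=1}^{n} m_j\,\mathfrak{r}_j\times\dot{\mathfrak{r}}_j. \tag{2.2.11}$$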
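EXERCISE 2.2.3.$^{C}$ (a) Verify that
$$\sum_{j=1}^{n} m_j\,\mathfrak{r}_j = \mathbf{0}, \tag{2.2.12}$$
where $\mathfrak{r}_j$ is the position vector of the point mass $m_j$ relative to the center of mass. Hint: $\sum_{j=1}^{n} m_j\,\mathfrak{r}_j = \sum_{j=1}^{n} m_j\,\mathbf{r}_j - \sum_{j=1}^{n} m_j\,\mathbf{r}_{CM} = \sum_{j=1}^{n} m_j\,\mathbf{r}_j - M\,\mathbf{r}_{CM} = \mathbf{0}$. (b) Use (2.2.11) and (2.2.12) to conclude that
$$\mathbf{L}_O = M\,\mathbf{r}_{CM}\times\dot{\mathbf{r}}_{CM} + \sum_{j=1}^{n} m_j\,\mathfrak{r}_j\times\dot{\mathfrak{r}}_j. \tag{2.2.13}$$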
If we select the origin $O$ of the frame at the center of mass of the system, then $\mathbf{r}_{CM} = \mathbf{0}$, and we get the expression for the angular momentum around the center of mass:
$$\mathbf{L}_{CM} = \sum_{j=1}^{n} m_j\,\mathfrak{r}_j\times\dot{\mathfrak{r}}_j, \tag{2.2.14}$$
where $\mathfrak{r}_j(t) = \mathbf{r}_j(t) - \mathbf{r}_{CM}(t)$. Hence, (2.2.13) becomes
$$\mathbf{L}_O = M\,\mathbf{r}_{CM}\times\dot{\mathbf{r}}_{CM} + \mathbf{L}_{CM}, \tag{2.2.15}$$
that is, the angular momentum of the system relative to a point $O$ is equal to the angular momentum of the center of mass relative to that point plus the angular momentum of the system relative to the center of mass. Once again, we see that the center of mass plays a very special role in the description of the motion of a system of points.
We emphasize that $\mathbf{L}_O \neq M\,\mathbf{r}_{CM}\times\dot{\mathbf{r}}_{CM}$ as long as $\mathbf{L}_{CM} \neq \mathbf{0}$. Note also that the vector functions $\mathfrak{r}_j(t)$ and $\dot{\mathfrak{r}}_j(t)$ depend on the choice of the reference frame.
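The decomposition (2.2.15) holds for any instantaneous configuration, so it can be checked with arbitrary numbers. The sketch below uses made-up masses, positions, and velocities; the function names are illustrative. It computes the angular momentum about $O$ directly from (2.2.7) and compares it with the sum of the angular momentum of the center of mass and the angular momentum relative to the center of mass from (2.2.14).

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def weighted_mean(masses, vectors):
    M = sum(masses)
    return [sum(m * v[i] for m, v in zip(masses, vectors)) / M for i in range(3)]

def angular_momentum(masses, positions, velocities):
    """Sum of m_j r_j x v_j over all point masses, as in (2.2.7)."""
    total = [0.0, 0.0, 0.0]
    for m, r, v in zip(masses, positions, velocities):
        total = [t + m * c for t, c in zip(total, cross(r, v))]
    return total

masses = [1.0, 2.0, 3.0]
positions = [[1.0, 0.0, 0.0], [0.0, 2.0, 1.0], [-1.0, 1.0, 2.0]]
velocities = [[0.0, 1.0, 0.0], [1.0, 0.0, -1.0], [0.5, 0.5, 0.0]]

M = sum(masses)
r_cm = weighted_mean(masses, positions)
v_cm = weighted_mean(masses, velocities)
rel_pos = [[a - b for a, b in zip(r, r_cm)] for r in positions]
rel_vel = [[a - b for a, b in zip(v, v_cm)] for v in velocities]

L_O = angular_momentum(masses, positions, velocities)
L_CM = angular_momentum(masses, rel_pos, rel_vel)
L_of_cm = [M * c for c in cross(r_cm, v_cm)]   # M r_CM x (d r_CM / dt)
```

For these numbers `L_O` equals `L_of_cm` plus `L_CM` component by component, and `L_CM` is not zero, so the angular momentum of the center of mass alone does not give `L_O`.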
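Let us now compute the time derivative of $\mathbf{L}_{CM}(t)$ using the differentiation rules of vector calculus (1.3.3), (1.3.4), (1.3.6) (see page 26). Since $\dot{\mathfrak{r}}_j\times\dot{\mathfrak{r}}_j = \mathbf{0}$, we have
$$\frac{d\mathbf{L}_{CM}}{dt} = \sum_{j=1}^{n} m_j\,\dot{\mathfrak{r}}_j\times\dot{\mathfrak{r}}_j + \sum_{j=1}^{n} m_j\,\mathfrak{r}_j\times\ddot{\mathfrak{r}}_j = \sum_{j=1}^{n} m_j\,\mathfrak{r}_j\times\ddot{\mathfrak{r}}_j. \tag{2.2.16}$$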
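To express $d\mathbf{L}_{CM}/dt$ in terms of the forces $\mathbf{F}_j$ acting on the $m_j$, we would like to use Newton's Second Law (2.1.1), and then we need the frame to be inertial. The frame at the center of mass is usually not inertial, because the center of mass can have a non-zero acceleration relative to an inertial frame. Accordingly, we choose a convenient inertial frame $O$ and apply (2.2.4), page 67, in that frame:
$$\ddot{\mathfrak{r}}_j = \ddot{\mathbf{r}}_j - \ddot{\mathbf{r}}_{CM} = \frac{\mathbf{F}_j}{m_j} - \frac{\mathbf{F}}{M}, \qquad m_j\,\mathfrak{r}_j\times\ddot{\mathfrak{r}}_j = \mathfrak{r}_j\times\mathbf{F}_j - \frac{m_j}{M}\,\mathfrak{r}_j\times\mathbf{F}.$$
By (2.2.16) above,
$$\frac{d\mathbf{L}_{CM}}{dt} = \sum_{j=1}^{n}\mathfrak{r}_j\times\mathbf{F}_j - \frac{1}{M}\Big(\sum_{j=1}^{n} m_j\,\mathfrak{r}_j\Big)\times\mathbf{F},$$
and then (2.2.12) implies
$$\frac{d\mathbf{L}_{CM}}{dt} = \sum_{j=1}^{n}\mathfrak{r}_j\times\mathbf{F}_j. \tag{2.2.17}$$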
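If the internal forces $\mathbf{F}_{jk}^{(I)}$ are central, then, by (2.2.9), these forces do not contribute to the total torque. Since $\mathfrak{r}_j - \mathfrak{r}_k = \mathbf{r}_j - \mathbf{r}_{CM} - (\mathbf{r}_k - \mathbf{r}_{CM}) = \mathbf{r}_j - \mathbf{r}_k$, we therefore find
$$\frac{d\mathbf{L}_{CM}}{dt} = \sum_{j=1}^{n}\mathfrak{r}_j\times\mathbf{F}_j^{(E)} = \sum_{j=1}^{n}\mathbf{T}_{CM,j}^{(E)} = \mathbf{T}_{CM}^{(E)}, \tag{2.2.18}$$
where $\mathbf{T}_{CM,j}^{(E)}$ is the external torque of $\mathbf{F}_j^{(E)}$ about the center of mass and $\mathbf{T}_{CM}^{(E)}$ is the total torque by the external forces. Using (2.2.16) above, we find
$$\frac{d\mathbf{L}_{CM}}{dt} = \sum_{j=1}^{n} m_j\,\mathfrak{r}_j\times\ddot{\mathfrak{r}}_j = \mathbf{T}_{CM}^{(E)}. \tag{2.2.19}$$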
Equations (2.2.5) and (2.2.19) provide a complete description of the motion of the system of point masses in an inertial frame.
As an EXAMPLE illustrating (2.2.5) and (2.2.19), let us consider BINARY STARS. A binary, or double, star is a system of two relatively close stars bound to each other by mutual gravitational attraction. Mathematically, a binary star is a system of $n = 2$ masses $m_1$ and $m_2$ that are close enough for the mutual gravitational attraction to be much stronger than the gravitational attraction from the other stars. In other words, we have $\mathbf{F}_{12}^{(I)} = -\mathbf{F}_{21}^{(I)}$ and $\mathbf{F}_1^{(E)} = \mathbf{F}_2^{(E)} = \mathbf{0}$. Using the equation for the linear momentum (2.2.5), page 68, we conclude that $\ddot{\mathbf{r}}_{CM} = \mathbf{0}$ and $\dot{\mathbf{r}}_{CM}$ is constant relative to every inertial frame $O$. We can therefore choose an inertial frame with origin at the center of mass of the two stars. Applying (2.2.19) in this frame, we find $d\mathbf{L}_{CM}/dt = \mathbf{0}$, and by (2.2.16),
$$m_1\,\mathfrak{r}_1\times\ddot{\mathfrak{r}}_1 + m_2\,\mathfrak{r}_2\times\ddot{\mathfrak{r}}_2 = \mathbf{0}. \tag{2.2.20}$$
EXERCISE 2.2.4.$^{B}$ Assume that $m_1 = m_2$. Show that the two stars move in a circular orbit around their center of mass. Hint: you can complete the following argument. By (2.2.1), $\mathbf{r}_{CM} = (1/2)(\mathbf{r}_1 + \mathbf{r}_2)$. So CM is the midpoint between $m_1$ and $m_2$. Hence, $\mathfrak{r}_1 = -\mathfrak{r}_2$, $\dot{\mathfrak{r}}_1 = -\dot{\mathfrak{r}}_2$, $\ddot{\mathfrak{r}}_1 = -\ddot{\mathfrak{r}}_2$. By (2.2.20) above, $2\,\mathfrak{r}_1\times\ddot{\mathfrak{r}}_1 = \mathbf{0}$. This implies that $\mathfrak{r}_1$ and $\ddot{\mathfrak{r}}_1$ are parallel (assuming $\ddot{\mathfrak{r}}_1 \neq \mathbf{0}$). Since $d\mathbf{L}_{CM}/dt = \mathbf{0}$, $\mathbf{L}_{CM}$ is constant. By (2.2.14) on page 70, $2m_1\,\mathfrak{r}_1\times\dot{\mathfrak{r}}_1$ is constant as well. Together with $\mathfrak{r}_1\times\ddot{\mathfrak{r}}_1 = \mathbf{0}$, this is consistent with equations (1.3.27), (1.3.28), and (1.3.29), page 36, for uniform circular motion.
Binary stars provide one of the primary settings in which astronomers can directly measure the mass. It is estimated that about half of the fifty stars nearest to the Sun are actually binary stars. The term "binary star" was suggested in 1802 by the British astronomer Sir WILLIAM HERSCHEL
(1738-1822), who also discovered the planet Uranus (1781) and infra-red radiation (around 1800).
2.2.2 Rigid Systems of Points
A system $S$ of point masses $m_j$, $1 \le j \le n$, is called rigid if the distance between every two points $m_i$, $m_j$ never changes. Let $O$ be a reference frame and let $\mathbf{r}_j(t)$ be the position in that frame of $m_j$ at time $t$. The rigidity condition can be stated as
$$\|\mathbf{r}_j(t) - \mathbf{r}_i(t)\| = d_{ij} \quad \text{for all } t,\ i, j = 1, \ldots, n, \tag{2.2.21}$$
where the $d_{ij}$ are constants.
EXERCISE 2.2.5.$^{C}$ Verify that condition (2.2.21) is independent of the choice of the frame.
EXERCISE 2.2.6.$^{C}$ Let $S$ be a rigid system in motion. Prove that the norm $\|\mathbf{r}_j - \mathbf{r}_{CM}\|$ and the dot product $\mathbf{r}_j\cdot\mathbf{r}_{CM}$ remain constant over time for all $j = 1, \ldots, n$, that is, the center of mass of a rigid system is fixed relative to all $m_j$. Hence, the augmented system $m_1, \ldots, m_n, M$, with $M$ located at the center of mass, is also a rigid system.
We will now derive the equations of motion for a rigid system. If we consider a motion as a linear transformation of space, then condition (2.2.21) implies that the motion of a rigid system is an isometry. The physical reality also suggests that this motion is orientation-preserving, that is, if three vectors in a rigid system form a right-handed triad at the beginning of the motion, they will be a right-handed triad throughout the motion.
EXERCISE 2.2.7. (a)$^{B}$ Show that an orientation-preserving orthogonal transformation is necessarily a rotation. (b)$^{C}$ Using the result of part (a) and the result of Problem 1.9 on page 412, conclude that the only possible motions of a rigid system are shifts (parallel translations) and rotations.
Let $O$ be a frame with a cartesian coordinate system $(\hat{\imath}, \hat{\jmath}, \hat{k})$. Let $S$ be a rigid system moving relative to $O$. Denote by $\mathbf{r}_{CM}(t)$ the position of the center of mass of $S$ in the frame $O$. We start by introducing two frames connected with the system. Let $O_{CM}$ be the parallel translation of the frame $O$ to the center of mass. Thus, $O_{CM}$ moves with the center of mass of the system $S$ but does not rotate relative to $O$. Let $O_1$ be the frame with $\overrightarrow{OO_1} = \mathbf{r}_{CM}(t)$ and with the cartesian basis $(\hat{\imath}_1, \hat{\jmath}_1, \hat{k}_1)$ rotating with the system. Define the corresponding rotation vector $\boldsymbol{\omega}$ according to (2.1.39) on page 63. Let us apply equation (2.1.46), page 66, to the motions of the point mass $m_j$ relative to frames $O$ and $O_1$. Denote by $\mathbf{r}_j$ the position vector of $m_j$ in the frame $O$ (draw a picture!). Then $\mathfrak{r}_j = \mathbf{r}_j - \mathbf{r}_{CM}$ is the position vector of $m_j$ in $O_{CM}$. Because of the rigidity condition, the position vector $\mathbf{r}_{1j}$ of $m_j$ in the frame $O_1$ does not change in time, and so $\dot{\mathbf{r}}_{1j} = \mathbf{0}$ and $\ddot{\mathbf{r}}_{1j} = \mathbf{0}$. By (2.1.42) on page 64, $\dot{\mathbf{r}}_j = \dot{\mathbf{r}}_{CM} + \boldsymbol{\omega}\times\mathfrak{r}_j$, and then, by (2.1.46), $\ddot{\mathbf{r}}_j = \ddot{\mathbf{r}}_{CM} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathfrak{r}_j) + \dot{\boldsymbol{\omega}}\times\mathfrak{r}_j$. Since the frame $O_{CM}$ is not rotating relative to $O$, we have $\dot{\mathfrak{r}}_j = \dot{\mathbf{r}}_j - \dot{\mathbf{r}}_{CM}$, and
$$\dot{\mathfrak{r}}_j = \boldsymbol{\omega}\times\mathfrak{r}_j, \tag{2.2.22}$$
$$\ddot{\mathfrak{r}}_j = \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathfrak{r}_j) + \dot{\boldsymbol{\omega}}\times\mathfrak{r}_j. \tag{2.2.23}$$
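Next, we use identity (1.2.27) on page 22 for the cross product:
$$\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathfrak{r}_j) = (\boldsymbol{\omega}\cdot\mathfrak{r}_j)\,\boldsymbol{\omega} - (\boldsymbol{\omega}\cdot\boldsymbol{\omega})\,\mathfrak{r}_j = (\boldsymbol{\omega}\cdot\mathfrak{r}_j)\,\boldsymbol{\omega} - \omega^2\,\mathfrak{r}_j, \tag{2.2.24}$$
where $\omega = \|\boldsymbol{\omega}\|$. To compute the rate of change of the angular momentum, we will need the cross product $\mathfrak{r}_j\times\ddot{\mathfrak{r}}_j$. Applying (1.2.27) one more time,
$$\mathfrak{r}_j\times(\dot{\boldsymbol{\omega}}\times\mathfrak{r}_j) = \|\mathfrak{r}_j\|^2\,\dot{\boldsymbol{\omega}} - (\mathfrak{r}_j\cdot\dot{\boldsymbol{\omega}})\,\mathfrak{r}_j. \tag{2.2.25}$$
Putting everything together, we get
$$\mathfrak{r}_j\times\ddot{\mathfrak{r}}_j = (\boldsymbol{\omega}\cdot\mathfrak{r}_j)\,\mathfrak{r}_j\times\boldsymbol{\omega} + \|\mathfrak{r}_j\|^2\,\dot{\boldsymbol{\omega}} - (\mathfrak{r}_j\cdot\dot{\boldsymbol{\omega}})\,\mathfrak{r}_j. \tag{2.2.26}$$
After summation over all $j$, the right-hand side of the last equality does not look very promising, and to proceed we need some new ideas. Let us look at (2.2.26) in the simplest yet non-trivial situation, when the rotation axis is fixed in space, and all the point masses $m_j$ are in the plane perpendicular to that axis. Then $\boldsymbol{\omega}\cdot\mathfrak{r}_j = 0$ for all $j$, and $\boldsymbol{\omega} = \omega\,\hat{\boldsymbol{\omega}}$, where $\hat{\boldsymbol{\omega}}$ is the unit vector in the direction of $\boldsymbol{\omega}$; note that since the rotation axis is fixed, the vectors $\boldsymbol{\omega}$ and $\dot{\boldsymbol{\omega}}$ are parallel. Accordingly, equality (2.2.24) becomes $\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathfrak{r}_j) = -\omega^2\,\mathfrak{r}_j$, and since $d(\omega\,\hat{\boldsymbol{\omega}})/dt = \dot{\omega}\,\hat{\boldsymbol{\omega}}$, equality (2.2.25) becomes $\mathfrak{r}_j\times(\dot{\boldsymbol{\omega}}\times\mathfrak{r}_j) = \|\mathfrak{r}_j\|^2\,\dot{\omega}\,\hat{\boldsymbol{\omega}}$. Summing over all $j$ in (2.2.26) and taking into account these simplifications, we find from (2.2.16) on page 71
$$\frac{d\mathbf{L}_{CM}}{dt} = \Big(\sum_{j=1}^{n} m_j\,\|\mathfrak{r}_j\|^2\Big)\,\dot{\omega}\,\hat{\boldsymbol{\omega}},$$
and it is therefore natural to introduce the quantity $I_{CM} = \sum_{j=1}^{n} m_j\,\|\mathfrak{r}_j\|^2$, which is called the moment of inertia of $S$ around the line that passes through the center of mass and is parallel to $\hat{\boldsymbol{\omega}}$. Then
$$\frac{d\mathbf{L}_{CM}}{dt} = I_{CM}\,\dot{\omega}\,\hat{\boldsymbol{\omega}}. \tag{2.2.27}$$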
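Our main goal is to extend equality (2.2.27) to a more general situation; the moment of inertia thus becomes the main object to investigate. To carry out this investigation, we backtrack a bit and look closely at the angular momentum of the rigid system around the center of mass, $\mathbf{L}_{CM} = \sum_{j=1}^{n} m_j\,\mathfrak{r}_j\times\dot{\mathfrak{r}}_j$. By (2.2.22), we have $\mathfrak{r}_j\times\dot{\mathfrak{r}}_j = \mathfrak{r}_j\times(\boldsymbol{\omega}\times\mathfrak{r}_j)$. Similar to (2.2.25) we find $\mathfrak{r}_j\times(\boldsymbol{\omega}\times\mathfrak{r}_j) = \|\mathfrak{r}_j\|^2\,\boldsymbol{\omega} - (\mathfrak{r}_j\cdot\boldsymbol{\omega})\,\mathfrak{r}_j$ and
$$\mathbf{L}_{CM} = \Big(\sum_{j=1}^{n} m_j\,\|\mathfrak{r}_j\|^2\Big)\,\boldsymbol{\omega} - \sum_{j=1}^{n} m_j\,(\mathfrak{r}_j\cdot\boldsymbol{\omega})\,\mathfrak{r}_j. \tag{2.2.28}$$
As written, equality (2.2.28) does not depend on the basis in the frame $O_{CM}$. To calculate $\mathbf{L}_{CM}$, we now choose a cartesian coordinate system $(\hat{\imath}, \hat{\jmath}, \hat{k})$ in the frame $O_{CM}$. Let $\mathfrak{r}_j(t) = x_j(t)\,\hat{\imath} + y_j(t)\,\hat{\jmath} + z_j(t)\,\hat{k}$ and $\boldsymbol{\omega}(t) = \omega_x(t)\,\hat{\imath} + \omega_y(t)\,\hat{\jmath} + \omega_z(t)\,\hat{k}$. From (2.2.28) above,
$$\mathbf{L}_{CM} = \Big(\sum_{j=1}^{n} m_j\,(x_j^2 + y_j^2 + z_j^2)\Big)\,\boldsymbol{\omega} - \Big(\sum_{j=1}^{n} m_j\,x_j\,\mathfrak{r}_j\Big)\,\omega_x - \Big(\sum_{j=1}^{n} m_j\,y_j\,\mathfrak{r}_j\Big)\,\omega_y - \Big(\sum_{j=1}^{n} m_j\,z_j\,\mathfrak{r}_j\Big)\,\omega_z, \tag{2.2.29}$$
where we omitted the time dependence notation to simplify the formula. Since $\mathbf{L}_{CM} = L_{CMx}\,\hat{\imath} + L_{CMy}\,\hat{\jmath} + L_{CMz}\,\hat{k}$, to compute the $x$-component $L_{CMx}$ of $\mathbf{L}_{CM}$, we replace the vector $\mathfrak{r}_j$ in (2.2.29) with $x_j$, and the vector $\boldsymbol{\omega}$, with $\omega_x$:
$$L_{CMx} = \omega_x\sum_{j=1}^{n} m_j\,(y_j^2 + z_j^2) - \omega_y\sum_{j=1}^{n} m_j\,x_j y_j - \omega_z\sum_{j=1}^{n} m_j\,x_j z_j;$$
similar representations hold for $L_{CMy}$ and $L_{CMz}$.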
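It is therefore natural to introduce the following notations:
$$I_{xx} = \sum_{j=1}^{n} m_j\,(y_j^2 + z_j^2), \quad I_{yy} = \sum_{j=1}^{n} m_j\,(x_j^2 + z_j^2), \quad I_{zz} = \sum_{j=1}^{n} m_j\,(x_j^2 + y_j^2),$$
$$I_{xy} = I_{yx} = \sum_{j=1}^{n} m_j\,x_j y_j, \quad I_{xz} = I_{zx} = \sum_{j=1}^{n} m_j\,x_j z_j, \quad I_{yz} = I_{zy} = \sum_{j=1}^{n} m_j\,y_j z_j. \tag{2.2.30}$$
With these notations, $L_{CMx} = \omega_x I_{xx} - \omega_y I_{xy} - \omega_z I_{xz}$.
EXERCISE 2.2.8.$^{C}$ Verify that
$$L_{CMy} = -\omega_x I_{yx} + \omega_y I_{yy} - \omega_z I_{yz}, \qquad L_{CMz} = -\omega_x I_{zx} - \omega_y I_{zy} + \omega_z I_{zz}.$$
EXERCISE 2.2.9.$^{C}$ Assume that all the point masses are in the $(\hat{\imath}, \hat{\jmath})$ plane. Show that $I_{xz} = I_{yz} = 0$ and $I_{xx} + I_{yy} = I_{zz}$.
We can easily rewrite (2.2.28) in the matrix-vector form:
$$\mathcal{L}_{CM}(t) = I_{CM}(t)\,\Omega(t), \tag{2.2.31}$$
where $\mathcal{L}_{CM}(t)$ is the column vector $(L_{CMx}(t), L_{CMy}(t), L_{CMz}(t))^T$, $\Omega(t)$ is the column vector $(\omega_x(t), \omega_y(t), \omega_z(t))^T$, and
$$I_{CM} = \begin{pmatrix} I_{xx} & -I_{xy} & -I_{xz} \\ -I_{yx} & I_{yy} & -I_{yz} \\ -I_{zx} & -I_{zy} & I_{zz} \end{pmatrix}. \tag{2.2.32}$$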
The matrix ICM is called the moment of i n e r t i a matrix, or t e n s o r of i n e r t i a , of the system 5 around the center of mass in the basis (i, j , K). The Latin word tensor means "the one that stretches," and, in mathematics, refers to abstract objects that change in a certain way from one coordinate system to another. All matrices are particular cases of tensors. For a summary of tensors, see page 457 in Appendix.
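The passage from the vector formula (2.2.28) to the matrix form (2.2.31) can be checked with a few lines of code. The sketch below is an illustration with made-up data; the function names `inertia_matrix` and `angular_momentum_direct` are assumptions. It builds the matrix (2.2.32) from the sums in (2.2.30), multiplies it by the components of the rotation vector, and compares the result with the right-hand side of (2.2.28).

```python
def inertia_matrix(masses, rel_positions):
    """Matrix (2.2.32) assembled from the sums (2.2.30); positions are relative to the CM."""
    pairs = list(zip(masses, rel_positions))
    Ixx = sum(m * (y * y + z * z) for m, (x, y, z) in pairs)
    Iyy = sum(m * (x * x + z * z) for m, (x, y, z) in pairs)
    Izz = sum(m * (x * x + y * y) for m, (x, y, z) in pairs)
    Ixy = sum(m * x * y for m, (x, y, z) in pairs)
    Ixz = sum(m * x * z for m, (x, y, z) in pairs)
    Iyz = sum(m * y * z for m, (x, y, z) in pairs)
    return [[Ixx, -Ixy, -Ixz], [-Ixy, Iyy, -Iyz], [-Ixz, -Iyz, Izz]]

def angular_momentum_direct(masses, rel_positions, omega):
    """Right-hand side of (2.2.28): sum of m (|r|^2 omega - (r . omega) r)."""
    total = [0.0, 0.0, 0.0]
    for m, r in zip(masses, rel_positions):
        r2 = sum(c * c for c in r)
        r_dot_w = sum(a * b for a, b in zip(r, omega))
        total = [t + m * (r2 * w - r_dot_w * c) for t, w, c in zip(total, omega, r)]
    return total

masses = [1.0, 2.0, 3.0, 4.0]
positions = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]]
M = sum(masses)
r_cm = [sum(m * r[i] for m, r in zip(masses, positions)) / M for i in range(3)]
rel_positions = [[a - b for a, b in zip(r, r_cm)] for r in positions]

omega = [0.3, -1.2, 0.8]
I_cm = inertia_matrix(masses, rel_positions)
L_matrix = [sum(I_cm[i][k] * omega[k] for k in range(3)) for i in range(3)]
L_direct = angular_momentum_direct(masses, rel_positions, omega)
```

The two results agree up to rounding, and `I_cm` is symmetric by construction, which is the property used later to diagonalize it.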
As much as we would like it, equality (2.2.31) is not the end of our investigation, and there are two main reasons for that:
(1) It is not at all clear how to compute the entries of the matrix $I_{CM}$.
(2) Because the system $S$ is rotating relative to the frame $O$, the entries of the matrix $I_{CM}$ depend on time.
Remembering that our goal is an equation of the type (2.2.27), we have to continue the investigation of the moment of inertia.
Let us forget for a moment that we are dealing with a rotating system, and instead concentrate on the matrix (2.2.32). We know from linear algebra that every change of the coordinate system changes the look of the matrix; see Exercise 8.1.4, page 453 for a brief summary. For many purposes, including ours, the matrix looks the best when diagonal, that is, has zeros everywhere except on the main diagonal. While not all matrices can have this look, every symmetric matrix is diagonal in the basis of its normalized eigenvectors; see Exercise 8.1.5 on page 454.
By (2.2.30) and (2.2.32), the matrix $I_{CM}$ is symmetric, and therefore there exists a cartesian coordinate system $(\hat{\imath}^*, \hat{\jmath}^*, \hat{k}^*)$ in which $I_{CM}$ has at most three non-zero entries $I^*_{xx}$, $I^*_{yy}$, $I^*_{zz}$, and all other entries zero. In other words, there exists an orthogonal matrix $U_*$ so that the matrix $I^*_{CM} = U_* I_{CM} U_*^T$ is diagonal. The matrix $U_*$ is the representation in the basis $(\hat{\imath}, \hat{\jmath}, \hat{k})$ of a linear transformation (rotation) that moves the vectors $\hat{\imath}, \hat{\jmath}, \hat{k}$ to $\hat{\imath}^*, \hat{\jmath}^*, \hat{k}^*$, respectively. The vectors $\hat{\imath}^*, \hat{\jmath}^*, \hat{k}^*$ are called the principal axes of the system $S$. We will refer to the frame with the origin at the center of mass and the basis vectors $\hat{\imath}^*, \hat{\jmath}^*, \hat{k}^*$ as the principal axes frame. Both the principal axes and the numbers $I^*_{xx}$, $I^*_{yy}$, $I^*_{zz}$ depend only on the configuration of the rigid system $S$, that is, the positions of the point masses $m_j$ relative to the center of mass. A matrix can have only one diagonal look, but in more than one basis: for example, the identity matrix looks the same in every basis. Accordingly, the principal axes might not be unique, but the numbers $I^*_{xx}$, $I^*_{yy}$, $I^*_{zz}$ are uniquely determined by the configuration of the system, and, in particular, do not depend on time. We know from linear algebra that the numbers $I^*_{xx}$, $I^*_{yy}$, $I^*_{zz}$ are the eigenvalues of the matrix $I_{CM}$, and the matrix $U_*$ consists of the corresponding eigenvectors.
EXERCISE 2.2.10.$^{A}$ Give an example of a symmetric $3\times 3$ matrix whose entries depend on time, but whose eigenvalues do not. Can you think of a general method for constructing such a matrix?
Formulas (2.2.30) for the entries of the matrix of inertia are true in every basis, and therefore can be used to compute the numbers $I^*_{xx}$, $I^*_{yy}$, $I^*_{zz}$. Since these numbers do not depend on time, the coordinates $x^*_j$, $y^*_j$, $z^*_j$ of $m_j$ in the principal axes frame should not depend on time either. In other words, the principal axes frame is not rotating relative to the system $S$, but is fixed in $S$ and rotates together with $S$.
EXERCISE 2.2.11. Consider four identical point masses $m$ at the vertices of a square with side $a$. Denote by $O$ the center of the square. Let $(\hat{\imath}, \hat{\jmath}, \hat{k})$ be a cartesian system at $O$, with the vectors $\hat{\imath}$ and $\hat{\jmath}$ along the diagonals of the square (draw a picture!). Verify the following statements: (a) The center of mass of the system is at $O$. (b) The vectors $\hat{\imath}, \hat{\jmath}, \hat{k}$ define the three principal axes of the system. (c) The diagonal elements of the matrix $I_{CM}$ in $(\hat{\imath}, \hat{\jmath}, \hat{k})$ are $ma^2$, $ma^2$, $2ma^2$. (d) If the system $(\hat{\imath}, \hat{\jmath}, \hat{k})$ is rotated by $\pi/4$ around $\hat{k}$ (it does not matter whether clockwise or counterclockwise), the result is again a principal axes frame for the system, and the matrix $I_{CM}$ does not change.
EXERCISE 2.2.12.$^{A}$ When are some of the diagonal entries of $I_{CM}$ equal to zero? Hint: not very often.
We now summarize our excursion into linear algebra: for every rigid system $S$ of point masses, there exists a special frame, called the principal axes frame, in which the matrix of inertia $I^*_{CM}$ of the system is diagonal and does not depend on time. This special frame is centered at the center of mass and is fixed (not rotating) relative to the system $S$.
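The summary above can be illustrated numerically: as a rigid system rotates, the entries of its inertia matrix in fixed axes change, while the eigenvalues (the diagonal entries in the principal axes frame) stay the same. The following sketch is an assumption-laden illustration: the body, the rotation `U_example`, and the cyclic Jacobi iteration used to compute the eigenvalues are choices made here, not methods taken from the text.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def inertia_matrix(masses, pts):
    """Matrix (2.2.32) assembled from the sums (2.2.30)."""
    pairs = list(zip(masses, pts))
    Ixx = sum(m * (y * y + z * z) for m, (x, y, z) in pairs)
    Iyy = sum(m * (x * x + z * z) for m, (x, y, z) in pairs)
    Izz = sum(m * (x * x + y * y) for m, (x, y, z) in pairs)
    Ixy = sum(m * x * y for m, (x, y, z) in pairs)
    Ixz = sum(m * x * z for m, (x, y, z) in pairs)
    Iyz = sum(m * y * z for m, (x, y, z) in pairs)
    return [[Ixx, -Ixy, -Ixz], [-Ixy, Iyy, -Iyz], [-Ixz, -Iyz, Izz]]

def principal_moments(A, sweeps=20):
    """Eigenvalues of a symmetric 3x3 matrix via cyclic Jacobi rotations."""
    A = [row[:] for row in A]
    for _ in range(sweeps):
        for p, q in ((0, 1), (0, 2), (1, 2)):
            if A[p][q] == 0.0:
                continue
            diff = A[q][q] - A[p][p]
            theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * A[p][q] / diff)
            c, s = math.cos(theta), math.sin(theta)
            J = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
            J[p][p], J[q][q], J[p][q], J[q][p] = c, c, s, -s
            A = matmul(transpose(J), matmul(A, J))   # zeroes the (p, q) entry
    return sorted(A[i][i] for i in range(3))

def U_example(t):
    # Hypothetical rotation of the body: about x by 0.5 t, then about z by 0.8 t.
    cz, sz = math.cos(0.8 * t), math.sin(0.8 * t)
    cx, sx = math.cos(0.5 * t), math.sin(0.5 * t)
    Rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    return matmul(Rz, Rx)

masses = [1.0, 2.0, 3.0, 4.0]
body = [[0.5, -0.8, -1.3], [-0.5, 1.2, -1.3], [-0.5, -0.8, 1.7], [0.5, 0.2, -0.3]]  # relative to CM

def inertia_at(t):
    U = U_example(t)
    pts = [[sum(U[i][k] * r[k] for k in range(3)) for i in range(3)] for r in body]
    return inertia_matrix(masses, pts)

I_start, I_later = inertia_at(0.0), inertia_at(1.0)
moments_start, moments_later = principal_moments(I_start), principal_moments(I_later)
```

The matrices `I_start` and `I_later` differ, while `moments_start` and `moments_later` coincide up to rounding. This is also one possible line of attack on Exercise 2.2.10.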
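Let us go back to the analysis of the motion of the system $S$. With
$$\mathbf{L}_{CM} = L^*_{CMx}\,\hat{\imath}^* + L^*_{CMy}\,\hat{\jmath}^* + L^*_{CMz}\,\hat{k}^*, \qquad \boldsymbol{\omega} = \omega^*_x\,\hat{\imath}^* + \omega^*_y\,\hat{\jmath}^* + \omega^*_z\,\hat{k}^*,$$
and after multiplying by $U_*$ on the left and using $U_*^T U_* = I$, equation (2.2.31) becomes
$$\mathcal{L}^*_{CM} = I^*_{CM}\,\Omega^*, \tag{2.2.33}$$
where $\mathcal{L}^*_{CM} = U_*\,\mathcal{L}_{CM}$ is the column vector $(L^*_{CMx}, L^*_{CMy}, L^*_{CMz})^T$ and $\Omega^* = U_*\,\Omega$ is the column vector $(\omega^*_x, \omega^*_y, \omega^*_z)^T$.
By construction, the principal axes frame rotates relative to the frame $O$, and the corresponding rotation vector is $\boldsymbol{\omega}$. Denoting the time derivative in the frame $O$ by $D_0$, and in the principal axes frame, by $D_*$, and using the relation (2.1.43) on page 64, we find
$$D_0\mathbf{L}_{CM}(t) = D_*\mathbf{L}_{CM}(t) + \boldsymbol{\omega}(t)\times\mathbf{L}_{CM}(t). \tag{2.2.34}$$
To proceed, let us assume that the underlying frame $O$ is inertial. Then relation (2.2.17) applies, and we find $D_0\mathbf{L}_{CM}(t) = \mathbf{T}_{CM}(t)$: the change of $\mathbf{L}_{CM}(t)$ in the inertial frame is equal to the torque of all forces about the center of mass. In the principal axes frame, we have $\mathbf{T}_{CM}(t) = T^*_{CMx}(t)\,\hat{\imath}^* + T^*_{CMy}(t)\,\hat{\jmath}^* + T^*_{CMz}(t)\,\hat{k}^*$. Also, since the basis vectors $\hat{\imath}^*, \hat{\jmath}^*, \hat{k}^*$ are fixed in the principal axes frame, we have $D_*\mathbf{L}_{CM}(t) = \dot{L}^*_{CMx}(t)\,\hat{\imath}^* + \dot{L}^*_{CMy}(t)\,\hat{\jmath}^* + \dot{L}^*_{CMz}(t)\,\hat{k}^*$. On the other hand, since the matrix $I^*_{CM}$ does not depend on time, we use (2.2.33) to conclude that the column vector $\dot{\mathcal{L}}^*_{CM}$ of the components of $D_*\mathbf{L}_{CM}(t)$ satisfies $\dot{\mathcal{L}}^*_{CM} = I^*_{CM}\,\dot{\Omega}^*$. Finally, we compute the cross product in (2.2.34) by writing the vectors in the principal axes frame and using the relation (2.2.33) for the components $\mathcal{L}^*_{CM}(t)$ of $\mathbf{L}_{CM}$ in that frame. The result is the three Euler equations describing the rotation of the rigid system about the center of mass:
$$\begin{cases} I^*_{xx}\,\dot{\omega}^*_x + \omega^*_y\,\omega^*_z\,(I^*_{zz} - I^*_{yy}) = T^*_{CMx}, \\ I^*_{yy}\,\dot{\omega}^*_y + \omega^*_x\,\omega^*_z\,(I^*_{xx} - I^*_{zz}) = T^*_{CMy}, \\ I^*_{zz}\,\dot{\omega}^*_z + \omega^*_x\,\omega^*_y\,(I^*_{yy} - I^*_{xx}) = T^*_{CMz}. \end{cases} \tag{2.2.35}$$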
These equations were first published in 1765 by the Swiss mathematician LEONHARD EULER (1707-1783). Leonhard (or Leonard) Euler was the most prolific mathematician ever: extensive publication of his works continued for 50 years after his death and filled 80+ volumes; he also had 13 children. He introduced many modern mathematical notations, such as $e$ for the base of natural logs (1727), $f(x)$ for a function (1734), $\Sigma$ for summation (1755), and $i$ for the square root of $-1$ (1777).
EXERCISE 2.2.13.$^{C}$ (a) Verify that (2.2.34) is indeed equivalent to (2.2.35). (b) Write (2.2.27) in an inertial frame and verify that the result is a particular case of (2.2.35).
With a suitable definition of the numbers $I^*_{xx}$, $I^*_{yy}$, $I^*_{zz}$ and the vector $\mathbf{T}^*_{CM}$, equations (2.2.35) also describe the motion of a rigid body. We study rigid bodies in the following section.
2.2.3 Rigid Bodies
Any collection of points, finite or infinite, can be a rigid system: if two points in the collection have trajectories $\mathbf{r}_1(t)$, $\mathbf{r}_2(t)$ in some frame, then the rigidity condition $\|\mathbf{r}_1(t) - \mathbf{r}_2(t)\| = \|\mathbf{r}_1(0) - \mathbf{r}_2(0)\|$ must hold for every two points in the collection.
Intuitively, a rigid body is a rigid system consisting of uncountably many points, each with infinitesimally small mass. Mathematically, a rigid body is described in a frame O by a mass density function $\rho = \rho(\mathbf{r})$, so that a small volume $\Delta V$ of the body near the point with the position vector $\mathbf{r}_0$ has, approximately, the mass $\Delta m = \rho(\mathbf{r}_0)\,\Delta V$. Even more precisely, if the body occupies the region $\mathcal{R}$ in $\mathbb{R}^3$, then the mass M of the body is given by the triple (or volume) integral
$$M = \iiint_{\mathcal{R}} \rho(\mathbf{r})\, dV.$$
Without going into the details, let us note that rigid bodies can also be two-dimensional, for example, a (hard) spherical shell, or one-dimensional, for example, a piece of hard-to-bend wire. In these cases, we use surface and line integrals rather than volume integrals. In what follows, we focus on solid three-dimensional objects.
All the formulas for the motion of a rigid body can be derived from the corresponding formulas for a finite number of points by replacing $m_j$ with the mass density function, and summation with integration. For example, the center of mass of a rigid body is the point with the position vector
$$\mathbf{r}_{CM} = \frac{1}{M} \iiint_{\mathcal{R}} \mathbf{r}\, \rho(\mathbf{r})\, dV. \tag{2.2.36}$$
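To make the passage from sums to integrals concrete, here is a minimal Python sketch that approximates the mass integral and the center of mass (2.2.36) by a midpoint Riemann sum over a rectangular box. The function name, the box, the grid size, and the sample density $\rho = 1 + x$ are hypothetical choices for illustration only.

```python
def mass_and_center(density, box, n=20):
    """Midpoint-rule approximation of the mass M and of r_CM in (2.2.36)
    for a body filling the rectangular box ((x0, x1), (y0, y1), (z0, z1))."""
    (x0, x1), (y0, y1), (z0, z1) = box
    hx, hy, hz = (x1 - x0) / n, (y1 - y0) / n, (z1 - z0) / n
    dV = hx * hy * hz
    mass = 0.0
    moment = [0.0, 0.0, 0.0]  # accumulates the integral of r * rho
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            for k in range(n):
                z = z0 + (k + 0.5) * hz
                dm = density(x, y, z) * dV  # mass of one small cell
                mass += dm
                moment[0] += x * dm
                moment[1] += y * dm
                moment[2] += z * dm
    return mass, tuple(c / mass for c in moment)

# Hypothetical example: unit cube whose density grows linearly in x.
unit_cube = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
mass, center = mass_and_center(lambda x, y, z: 1.0 + x, unit_cube)
```

For this density the exact values are $M = 3/2$ and $\mathbf{r}_{CM} = (5/9, 1/2, 1/2)$, so the center of mass is shifted toward the denser side of the cube.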
EXERCISE 2.2.14.C Show that both the mass and the location of the center of mass of a rigid body are independent of the frame O.
For a rigid body $\mathcal{R}$ moving in space relative to a frame O, we denote by $\mathcal{R}(t)$ the part of the space occupied by the body at time t relative to that frame O. If the frame O is inertial, then an equation similar to (2.2.5) connects the trajectory $\mathbf{r}_{CM} = \mathbf{r}_{CM}(t)$ of the center of mass in the frame with the external forces per unit mass $\mathbf{F}^{(E)} = \mathbf{F}^{(E)}(\mathbf{r})$ acting on the points of the body:
$$M\, \ddot{\mathbf{r}}_{CM}(t) = \iiint_{\mathcal{R}(t)} \mathbf{F}^{(E)}(\mathbf{r}(t))\, \rho(\mathbf{r}(t))\, dV. \tag{2.2.37}$$
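The angular momentum of $\mathcal{R}$ about O is, by definition,
$$\mathbf{L}_O(t) = \iiint_{\mathcal{R}(t)} \mathbf{r}(t) \times \dot{\mathbf{r}}(t)\, \rho(\mathbf{r}(t))\, dV. \tag{2.2.38}$$
Similar to (2.2.15), page 71, we have
$$\mathbf{L}_O(t) = M\, \mathbf{r}_{CM}(t) \times \dot{\mathbf{r}}_{CM}(t) + \iiint_{\mathcal{R}(t)} \mathfrak{r}(t) \times \dot{\mathfrak{r}}(t)\, \rho(\mathbf{r}(t))\, dV, \tag{2.2.39}$$
where $\mathfrak{r}(t) = \mathbf{r}(t) - \mathbf{r}_{CM}(t)$, and (2.2.14) is replaced by
$$\mathbf{L}_{CM}(t) = \iiint_{\mathcal{R}(t)} \mathfrak{r}(t) \times \dot{\mathfrak{r}}(t)\, \rho(\mathbf{r}(t))\, dV, \tag{2.2.40}$$
which is the angular momentum relative to the center of mass.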
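As with finite systems of points, consider the parallel translation of the frame O to the center of mass of the body. The rotation of the rigid body relative to this translated frame is described by the rotation vector $\boldsymbol{\omega}(t) = \omega_x(t)\,\mathbf{i} + \omega_y(t)\,\mathbf{j} + \omega_z(t)\,\mathbf{k}$, where $(\mathbf{i}, \mathbf{j}, \mathbf{k})$ is the cartesian basis in the translated frame. By analogy with (2.2.30), we write $\mathfrak{r}(t) = \mathfrak{x}(t)\,\mathbf{i} + \mathfrak{y}(t)\,\mathbf{j} + \mathfrak{z}(t)\,\mathbf{k}$, $\mathbf{L}_{CM}(t) = L_{CMx}(t)\,\mathbf{i} + L_{CMy}(t)\,\mathbf{j} + L_{CMz}(t)\,\mathbf{k}$, and define
$$\begin{aligned}
I_{xx} &= \iiint_{\mathcal{R}(t)} \big(\mathfrak{y}^2(t) + \mathfrak{z}^2(t)\big)\, \rho(\mathfrak{r}(t))\, dV, &
I_{xy} = I_{yx} &= \iiint_{\mathcal{R}(t)} \mathfrak{x}(t)\,\mathfrak{y}(t)\, \rho(\mathfrak{r}(t))\, dV,\\
I_{yy} &= \iiint_{\mathcal{R}(t)} \big(\mathfrak{x}^2(t) + \mathfrak{z}^2(t)\big)\, \rho(\mathfrak{r}(t))\, dV, &
I_{xz} = I_{zx} &= \iiint_{\mathcal{R}(t)} \mathfrak{x}(t)\,\mathfrak{z}(t)\, \rho(\mathfrak{r}(t))\, dV,\\
I_{zz} &= \iiint_{\mathcal{R}(t)} \big(\mathfrak{x}^2(t) + \mathfrak{y}^2(t)\big)\, \rho(\mathfrak{r}(t))\, dV, &
I_{yz} = I_{zy} &= \iiint_{\mathcal{R}(t)} \mathfrak{y}(t)\,\mathfrak{z}(t)\, \rho(\mathfrak{r}(t))\, dV.
\end{aligned} \tag{2.2.41}$$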
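Then we have relation (2.2.31), page 76, for rigid bodies:
$$\mathfrak{L}_{CM}(t) = I_{CM}(t)\, \boldsymbol{\Omega}(t), \tag{2.2.42}$$
where $\mathfrak{L}_{CM}(t)$ is the column vector $(L_{CMx}(t), L_{CMy}(t), L_{CMz}(t))^T$, $\boldsymbol{\Omega}$ is the column vector $(\omega_x(t), \omega_y(t), \omega_z(t))^T$, and
$$I_{CM} = \begin{pmatrix} I_{xx} & -I_{xy} & -I_{xz}\\ -I_{yx} & I_{yy} & -I_{yz}\\ -I_{zx} & -I_{zy} & I_{zz} \end{pmatrix}. \tag{2.2.43}$$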
The matrix $I_{CM}$ is called the moment of inertia matrix, or tensor of inertia, of the rigid body $\mathcal{R}$ around the center of mass in the basis $(\mathbf{i}, \mathbf{j}, \mathbf{k})$. As in the case of a finite rigid system of points, there exists a principal axes frame, in which the matrix $I_{CM}$ is diagonal, and the diagonal elements $I^*_{xx}, I^*_{yy}, I^*_{zz}$ are uniquely determined by the mass density function $\rho$. The principal axes frame, with center at the center of mass and basis vectors $\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*$, is attached to the body and rotates with it. If $\boldsymbol{\omega}(t) = \omega^*_x\,\mathbf{i}^* + \omega^*_y\,\mathbf{j}^* + \omega^*_z\,\mathbf{k}^*$, then the Euler equations (2.2.35) describe the rotation of the body about the center of mass. Equations (2.2.37) and (2.2.35) provide a complete description of the motion of a rigid body.
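The sign pattern of the matrix (2.2.43) can be checked on a finite set of point masses, where integrals reduce to sums. The Python sketch below compares the product of the inertia matrix with the rotation vector against the direct sum of $m\,\mathfrak{r} \times (\boldsymbol{\omega} \times \mathfrak{r})$, using the fact that a point of a rigid system rotating about the center of mass has relative velocity $\boldsymbol{\omega} \times \mathfrak{r}$. The masses, positions, rotation vector, and function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def relative_to_cm(points):
    """Shift the positions so that the center of mass is at the origin."""
    total = sum(m for m, _ in points)
    cm = tuple(sum(m * r[i] for m, r in points) / total for i in range(3))
    return [(m, tuple(r[i] - cm[i] for i in range(3))) for m, r in points]

def inertia_matrix(points):
    """Discrete analogue of (2.2.41) assembled as in (2.2.43)."""
    Ixx = sum(m * (y * y + z * z) for m, (x, y, z) in points)
    Iyy = sum(m * (x * x + z * z) for m, (x, y, z) in points)
    Izz = sum(m * (x * x + y * y) for m, (x, y, z) in points)
    Ixy = sum(m * x * y for m, (x, y, z) in points)
    Ixz = sum(m * x * z for m, (x, y, z) in points)
    Iyz = sum(m * y * z for m, (x, y, z) in points)
    return ((Ixx, -Ixy, -Ixz), (-Ixy, Iyy, -Iyz), (-Ixz, -Iyz, Izz))

def momentum_from_matrix(points, omega):
    """Angular momentum components computed as in (2.2.42)."""
    I = inertia_matrix(points)
    return tuple(sum(I[i][j] * omega[j] for j in range(3)) for i in range(3))

def momentum_direct(points, omega):
    """Angular momentum as the sum of m * r x (omega x r)."""
    total = [0.0, 0.0, 0.0]
    for m, r in points:
        contribution = cross(r, cross(omega, r))
        for i in range(3):
            total[i] += m * contribution[i]
    return tuple(total)

# Hypothetical point masses (mass, position) and rotation vector.
points = relative_to_cm([
    (1.0, (1.0, 0.0, 0.0)),
    (2.0, (0.0, 1.0, 0.5)),
    (3.0, (-1.0, 2.0, 1.0)),
    (0.5, (0.3, -0.7, 2.0)),
])
omega = (0.2, -1.0, 0.7)
L_matrix = momentum_from_matrix(points, omega)
L_direct = momentum_direct(points, omega)
```

The two results should agree up to rounding, which confirms that the off-diagonal entries of (2.2.43) carry minus signs.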
As an example, consider the DISTRIBUTED RIGID PENDULUM. Recall that in the simple rigid pendulum (page 41), a point mass is attached to the end of a weightless rod. In the distributed rigid pendulum, a uniform rod of mass M and length $\ell$ is suspended by one end with a pin joint (Figure 2.2.1).
Fig. 2.2.1 Distributed Pendulum (panel label: Cross-section of the rod)
We assume that the cross-section of the rod is a square with side a. Then the volume of the rod is $\ell a^2$ and the density is constant: $\rho(\mathbf{r}) = M/(\ell a^2)$.
EXERCISE 2.2.15.C Verify that the center of mass CM of the rod is the mid-point of the axis of the rod.
Consider the cartesian coordinates $(\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*)$ with the origin at the center of mass (Figure 2.2.1). As usual, $\mathbf{k}^* = \mathbf{i}^* \times \mathbf{j}^*$.
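Let us compute $I^*_{zz}$, as this is the only entry of the matrix $I^*_{CM}$ we will need:
$$\begin{aligned}
I^*_{zz} &= \iiint_{\mathcal{R}} (x^2 + y^2)\, \rho(\mathbf{r})\, dV\\
&= \frac{M}{\ell a^2} \left( \int_{-a/2}^{a/2} dy \int_{-a/2}^{a/2} dz \int_{-\ell/2}^{\ell/2} x^2\, dx + \int_{-\ell/2}^{\ell/2} dx \int_{-a/2}^{a/2} dz \int_{-a/2}^{a/2} y^2\, dy \right)\\
&= (\rho a^2 \ell^3/12) + (\rho \ell a^4/12) = (M/12)(\ell^2 + a^2).
\end{aligned}$$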
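The closed form $(M/12)(\ell^2 + a^2)$ can be checked numerically. The Python sketch below approximates $I^*_{zz}$ and the three products of inertia for the uniform square rod by a midpoint Riemann sum, assuming the rod axis lies along $\mathbf{i}^*$ as in the integration limits above. The function name, the grid sizes, and the sample values of M, $\ell$, and a are assumptions for illustration.

```python
def rod_inertia(M, length, a, nx=40, ns=10):
    """Midpoint-rule approximation of I_zz and of the products of inertia
    I_xy, I_xz, I_yz for a uniform rod occupying
    [-length/2, length/2] x [-a/2, a/2] x [-a/2, a/2]."""
    rho = M / (length * a * a)       # constant density M / (l a^2)
    hx, hs = length / nx, a / ns     # cell sizes along the axis and across it
    dm = rho * hx * hs * hs          # mass of one small cell
    Izz = Ixy = Ixz = Iyz = 0.0
    for i in range(nx):
        x = -length / 2 + (i + 0.5) * hx
        for j in range(ns):
            y = -a / 2 + (j + 0.5) * hs
            for k in range(ns):
                z = -a / 2 + (k + 0.5) * hs
                Izz += (x * x + y * y) * dm
                Ixy += x * y * dm
                Ixz += x * z * dm
                Iyz += y * z * dm
    return Izz, Ixy, Ixz, Iyz

# Hypothetical rod: mass 2, length 1, square cross-section with side 0.1.
Izz, Ixy, Ixz, Iyz = rod_inertia(2.0, 1.0, 0.1)
exact_Izz = 2.0 * (1.0 ** 2 + 0.1 ** 2) / 12
```

The value of `Izz` should be close to `exact_Izz`, and the products of inertia should vanish up to rounding because of the symmetry of the rod.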
EXERCISE 2.2.16.C Verify that the vectors $\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*$ define a principal axes frame. Hint: $I^*_{xy} = I^*_{xz} = I^*_{yz} = 0$, as seen from (2.2.41) and the symmetry of the rod.
The motion of the rod is a 2-D rotation around the pin at O, and the vector $\mathbf{k}^*$ is not rotating. There are no internal forces in the rod to affect the motion. As a result, the angular velocity vector is $\boldsymbol{\omega} = \omega^*_z\,\mathbf{k}^* = \dot{\theta}\,\mathbf{k}^*$, and the Euler equations (2.2.35) simplify to $I^*_{zz}\,\dot{\omega}^*_z = T^{(E)}_{CMz}$, or
$$\frac{M(\ell^2 + a^2)}{12}\, \ddot{\theta} = T^{(E)}_{CMz}, \tag{2.2.44}$$
where $T^{(E)}_{CMz}$ is the $\mathbf{k}^*$-component of the external torque around the center of mass.
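To simplify the analysis, we ignore air resistance. Then the external torque $\mathbf{T}^{(E)}_{CM}$ around the center of mass is produced by two forces: the force of gravity $\mathbf{W}$ and the force $\mathbf{F}_{pin}$ exerted by the pin at O. Since the rod is uniform, the torque due to gravity is zero around the CM. To compute $\mathbf{F}_{pin}$ we now assume that the frame fixed at O is inertial and Newton's Second Law (2.2.5) applies. The external force on mass M at the CM is the sum of the total gravity force, $\mathbf{W} = Mg\,\mathbf{i}$, and the reaction of the pin $\mathbf{F}_{pin}$. Hence, $\mathbf{F}_{pin} = M\,\ddot{\mathbf{r}}_{CM} - \mathbf{W}$. Since CM moves around O in a circle of radius $\ell/2$, we use (1.3.27) on page 36 to obtain the components $\mathbf{a}_r, \mathbf{a}_\theta$ of $\ddot{\mathbf{r}}_{CM}$ in polar coordinates with origin at O: $\mathbf{a}_r = -(\ell/2)\,\omega^2\,\hat{\mathbf{r}}$, $\mathbf{a}_\theta = (\ell/2)\,\dot{\omega}\,\hat{\boldsymbol{\theta}}$. The point of application of $\mathbf{F}_{pin}$ relative to CM has the position vector $-\mathbf{r}_{CM} = -(\ell/2)\,\hat{\mathbf{r}}$. Then
$$\mathbf{T}^{(E)}_{CM} = -\mathbf{r}_{CM} \times \mathbf{F}_{pin} + \mathbf{0} \times \mathbf{W} = -\frac{\ell}{2}\,\hat{\mathbf{r}} \times \mathbf{F}_{pin}.$$
Hence,
$$\mathbf{T}^{(E)}_{CM} = -\frac{\ell}{2}\,\hat{\mathbf{r}} \times \big(M\,\ddot{\mathbf{r}}_{CM} - \mathbf{W}\big) = -\frac{\ell M}{2} \left( \frac{\ell}{2}\,\dot{\omega}\; \hat{\mathbf{r}} \times \hat{\boldsymbol{\theta}} - g\, \hat{\mathbf{r}} \times \mathbf{i} \right) = -\frac{\ell M}{2} \left( \frac{\ell}{2}\,\dot{\omega} + g \sin\theta \right) \mathbf{k}^*.$$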
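The two cross products in the last step can be written out explicitly. The following lines are a sketch under assumptions suggested by the formula for the gravity force but not stated in this passage: the vector $\mathbf{i}$ points vertically downward, the angle $\theta$ is measured from $\mathbf{i}$ toward $\mathbf{j}$, and $\mathbf{k}^* = \mathbf{i} \times \mathbf{j}$.

```latex
\hat{\mathbf{r}} = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}, \qquad
\hat{\boldsymbol{\theta}} = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j},
\\
\hat{\mathbf{r}} \times \hat{\boldsymbol{\theta}}
  = (\cos^2\theta + \sin^2\theta)\,\mathbf{i} \times \mathbf{j} = \mathbf{k}^*,
\qquad
\hat{\mathbf{r}} \times \mathbf{i}
  = \sin\theta\,\mathbf{j} \times \mathbf{i} = -\sin\theta\,\mathbf{k}^*,
\\
\hat{\mathbf{r}} \times \mathbf{a}_r
  = -\tfrac{\ell}{2}\,\omega^2\,\hat{\mathbf{r}} \times \hat{\mathbf{r}} = \mathbf{0}.
```

The last identity shows why the radial component of the acceleration does not contribute to the torque.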
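Substituting in (2.2.44), we get $(M(\ell^2 + a^2)/12)\,\ddot{\theta} = -(\ell M/2)\big((\ell/2)\,\dot{\omega} + g \sin\theta\big)$, or $((\ell^2 + a^2)/12)\,\ddot{\theta} + (\ell^2 \dot{\omega}/4) + (g \ell \sin\theta/2) = 0$, or, with $\omega = \dot{\theta}$,
$$\big((4\ell^2 + a^2)/12\big)\,\ddot{\theta} = -(g\ell/2)\,\sin\theta. \tag{2.2.45}$$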
If $a \approx 0$ and $\theta$ is so small that $\sin\theta \approx \theta$, then equation (2.2.45) becomes
$$(2\ell/3)\,\ddot{\theta} + g\,\theta = 0. \tag{2.2.46}$$
Comparing this with a similar approximation for the simple rigid pendulum (2.1.10), page 42, we conclude that a thin uniform stick of length $\ell$ suspended at one end oscillates at about the same frequency as a simple rigid pendulum of length $(2/3)\ell$.
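A short simulation can illustrate this comparison. The Python sketch below integrates (2.2.45) with a Runge-Kutta scheme, starting from rest at a small angle, and estimates the period as four times the time of the first zero crossing of $\theta$. It then compares the result with the period $2\pi\sqrt{2\ell/(3g)}$ implied by (2.2.46). The value $g = 9.81$, the step size, the initial angle, and the function names are assumptions for illustration.

```python
import math

def angular_acceleration(theta, length, a, g=9.81):
    """Second derivative of theta obtained from equation (2.2.45)."""
    return -6.0 * g * length * math.sin(theta) / (4.0 * length ** 2 + a ** 2)

def rk4_step(theta, omega, dt, length, a, g):
    """One Runge-Kutta step for theta' = omega, omega' = f(theta)."""
    def f(th):
        return angular_acceleration(th, length, a, g)
    k1t, k1w = omega, f(theta)
    k2t, k2w = omega + 0.5 * dt * k1w, f(theta + 0.5 * dt * k1t)
    k3t, k3w = omega + 0.5 * dt * k2w, f(theta + 0.5 * dt * k2t)
    k4t, k4w = omega + dt * k3w, f(theta + dt * k3t)
    return (theta + dt / 6 * (k1t + 2 * k2t + 2 * k3t + k4t),
            omega + dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w))

def period(theta0, length, a, g=9.81, dt=1e-4):
    """Estimate the period for a release from rest at theta0 > 0:
    four times the time of the first zero crossing of theta."""
    theta, omega, t = theta0, 0.0, 0.0
    while True:
        new_theta, new_omega = rk4_step(theta, omega, dt, length, a, g)
        if new_theta <= 0.0:
            # Linear interpolation of the crossing time within the step.
            frac = theta / (theta - new_theta)
            return 4.0 * (t + frac * dt)
        theta, omega, t = new_theta, new_omega, t + dt

def small_angle_period(length, g=9.81):
    """Period of the linearized equation (2.2.46)."""
    return 2.0 * math.pi * math.sqrt(2.0 * length / (3.0 * g))

T_thin = period(0.01, 1.0, 0.0)    # hypothetical thin rod of length 1
T_thick = period(0.01, 1.0, 0.5)   # same length, side a = 0.5
```

For the thin rod the two periods should nearly coincide. A nonzero side a increases the coefficient of $\ddot{\theta}$ in (2.2.45) and therefore lengthens the period.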
The objective of the above example was to illustrate how the Euler equations work. Because of the simple nature of the problem, the system of three equations (2.2.35) degenerates to one equation (2.2.45). In fact, an alternative derivation of (2.2.45) is possible by avoiding (2.2.35) altogether; the details are in Problem 2.4, page 417.
Note that if the frame O is fixed on the Earth, then this frame is not inertial, and the Coriolis force will act on the pendulum, but a good pin joint can minimize the effects of this force.
For two more examples of rigid body motion, see Problems 2.7 and 2.8 starting on page 419.
2.3 The Lagrange-Hamilton Method
So far, we used the Newton-Euler method to analyze motion using forces and the three laws of Newton (it was L. Euler who, around 1737, gave a precise mathematical description of the method). An alternative method using energy and work was introduced in 1788 by the French mathematician JOSEPH-LOUIS LAGRANGE (1736-1813) and further developed in 1833 by the Irish mathematician Sir WILLIAM ROWAN HAMILTON (1805-1865). This Lagrange-Hamilton method is sometimes more efficient than the Newton-Euler method, especially for studying systems with constraints. An example of a constraint is the rigidity condition, ensuring that the distance between any two points is constant.
In what follows, we provide a brief description of the Lagrange-Hamilton method. The reader is assumed to be familiar with the basic tools of multivariable calculus, in particular, the chain rule and line integration.
2.3.1 Lagrange's Equations
We first illustrate Lagrange's method for a point mass moving in the plane. The modern methods of multi-variable calculus reduce the derivation of the main result (equation (2.3.17)) to a succession of simple applications of the chain rule.
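Consider an inertial frame in $\mathbb{R}^2$ with origin O and basis vectors $\mathbf{i}, \mathbf{j}$. Let $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j}$ be the position of a point mass m moving under the action of the force $\mathbf{F} = F_1\,\mathbf{i} + F_2\,\mathbf{j}$. By Newton's Second Law (2.1.1),
$$F_1 = m\ddot{x}, \quad F_2 = m\ddot{y}. \tag{2.3.1}$$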
The state of m at any time t is given by the four-dimensional vector $(x, y, \dot{x}, \dot{y})$, that is, by the position and velocity. Knowledge of the state at a given time allows us to determine the state at all future times by solving equations (2.3.1).
Now consider a different pair $(q_1, q_2)$ of coordinates in the plane, for example, polar coordinates, so that
$$x = x(q_1, q_2), \quad y = y(q_1, q_2); \qquad q_1 = q_1(x, y), \quad q_2 = q_2(x, y), \tag{2.3.2}$$
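and all the functions are sufficiently smooth. Differentiating (2.3.2),
$$\dot{x} = \frac{\partial x}{\partial q_1}\,\dot{q}_1 + \frac{\partial x}{\partial q_2}\,\dot{q}_2, \qquad \dot{y} = \frac{\partial y}{\partial q_1}\,\dot{q}_1 + \frac{\partial y}{\partial q_2}\,\dot{q}_2. \tag{2.3.3}$$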
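We call $q_1, q_2, \dot{q}_1, \dot{q}_2$ the generalized coordinates, since their values determine the state $(x, y, \dot{x}, \dot{y})$ by (2.3.2) and (2.3.3). Note that the partial derivatives $\partial x/\partial q_i$, $\partial y/\partial q_i$, $i = 1, 2$, depend only on $q_1$ and $q_2$. We then differentiate (2.3.3) to find
$$\frac{\partial \dot{x}}{\partial \dot{q}_1} = \frac{\partial x}{\partial q_1}, \quad \frac{\partial \dot{x}}{\partial \dot{q}_2} = \frac{\partial x}{\partial q_2}, \quad \frac{\partial \dot{y}}{\partial \dot{q}_1} = \frac{\partial y}{\partial q_1}, \quad \frac{\partial \dot{y}}{\partial \dot{q}_2} = \frac{\partial y}{\partial q_2}.$$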
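Next, we apply the chain rule to the function $\partial x(q_1(t), q_2(t))/\partial q_1$ to find
$$\frac{d}{dt}\left(\frac{\partial x}{\partial q_1}\right) = \frac{\partial^2 x}{\partial q_1^2}\,\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_2\,\partial q_1}\,\frac{dq_2}{dt} = \frac{\partial^2 x}{\partial q_1^2}\,\dot{q}_1 + \frac{\partial^2 x}{\partial q_1\,\partial q_2}\,\dot{q}_2.$$
From (2.3.3), differentiating with respect to the variable $q_1$,
$$\frac{\partial \dot{x}}{\partial q_1} = \frac{\partial^2 x}{\partial q_1^2}\,\dot{q}_1 + \frac{\partial^2 x}{\partial q_1\,\partial q_2}\,\dot{q}_2 = \frac{d}{dt}\left(\frac{\partial x}{\partial q_1}\right). \tag{2.3.6}$$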
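The two identities used above can be checked numerically in polar coordinates, where $x = q_1 \cos q_2$. The Python sketch below uses central finite differences along a made-up trajectory; the trajectory, the step size, and the function names are hypothetical and serve only as an illustration.

```python
import math

def x_of_q(q1, q2):
    """Cartesian x in polar coordinates: q1 is the radius, q2 the angle."""
    return q1 * math.cos(q2)

def partial(f, args, index, h=1e-4):
    """Central-difference partial derivative of f in argument number index."""
    up, dn = list(args), list(args)
    up[index] += h
    dn[index] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def x_dot(q1, q2, q1_dot, q2_dot):
    """Velocity component given by the chain rule (2.3.3)."""
    return (partial(x_of_q, (q1, q2), 0) * q1_dot
            + partial(x_of_q, (q1, q2), 1) * q2_dot)

# Hypothetical smooth trajectory q1(t), q2(t) and its exact derivatives.
def q1(t): return 1.0 + t * t
def q2(t): return math.sin(t)
def q1_dot(t): return 2.0 * t
def q2_dot(t): return math.cos(t)

def check(t, h=1e-4):
    """Return (dx/dq1, d x_dot/d q1_dot, d/dt(dx/dq1), d x_dot/d q1) at time t."""
    state = (q1(t), q2(t), q1_dot(t), q2_dot(t))
    dx_dq1 = partial(x_of_q, state[:2], 0)
    dxdot_dq1dot = partial(x_dot, state, 2)
    later = partial(x_of_q, (q1(t + h), q2(t + h)), 0)
    earlier = partial(x_of_q, (q1(t - h), q2(t - h)), 0)
    ddt_dx_dq1 = (later - earlier) / (2 * h)
    dxdot_dq1 = partial(x_dot, state, 0)
    return dx_dq1, dxdot_dq1dot, ddt_dx_dq1, dxdot_dq1

a, b, c, d = check(0.7)
```

The first two returned values should agree with each other, and so should the last two, which corresponds to (2.3.6).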