Functional Diﬀerential Geometry

Gerald Jay Sussman and Jack Wisdom with Will Farr
The MIT Press Cambridge, Massachusetts London, England

© 2013 Massachusetts Institute of Technology. This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit creativecommons.org.
Other than as provided by this license, no part of this book may be reproduced, transmitted, or displayed by any electronic or mechanical means without permission from the MIT Press or as permitted by law. MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, please email special_sales@mitpress.mit.edu or write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA 02142. This book was set in Computer Modern by the authors with the LaTeX typesetting system and was printed and bound in the United States of America.
Library of Congress Cataloging-in-Publication Data Sussman, Gerald Jay. Functional Diﬀerential Geometry / Gerald Jay Sussman and Jack Wisdom; with Will Farr.
p. cm. Includes bibliographical references and index. ISBN 978-0-262-01934-7 (hardcover : alk. paper) 1. Geometry, Diﬀerential. 2. Functional Diﬀerential Equations. 3. Mathematical Physics. I. Wisdom, Jack. II. Farr, Will. III. Title. QC20.7.D52S87 2013 516.3'6—dc23
2012042107
10 9 8 7 6 5 4 3 2 1

“The author has spared himself no pains in his endeavour to present the main ideas in the simplest and most intelligible form, and on the whole, in the sequence and connection in which they actually originated. In the interest of clearness, it appeared to me inevitable that I should repeat myself frequently, without paying the slightest attention to the elegance of the presentation. I adhered scrupulously to the precept of that brilliant theoretical physicist L. Boltzmann, according to whom matters of elegance ought be left to the tailor and to the cobbler.”
Albert Einstein, in Relativity, the Special and General Theory, (1961), p. v

Contents

Preface
Prologue
1 Introduction
2 Manifolds
    2.1 Coordinate Functions
    2.2 Manifold Functions
3 Vector Fields and One-Form Fields
    3.1 Vector Fields
    3.2 Coordinate-Basis Vector Fields
    3.3 Integral Curves
    3.4 One-Form Fields
    3.5 Coordinate-Basis One-Form Fields
4 Basis Fields
    4.1 Change of Basis
    4.2 Rotation Basis
    4.3 Commutators
5 Integration
    5.1 Higher Dimensions
    5.2 Exterior Derivative
    5.3 Stokes’s Theorem
    5.4 Vector Integral Theorems
6 Over a Map
    6.1 Vector Fields Over a Map
    6.2 One-Form Fields Over a Map
    6.3 Basis Fields Over a Map
    6.4 Pullbacks and Pushforwards
7 Directional Derivatives
    7.1 Lie Derivative
    7.2 Covariant Derivative
    7.3 Parallel Transport
    7.4 Geodesic Motion
8 Curvature
    8.1 Explicit Transport
    8.2 Torsion
    8.3 Geodesic Deviation
    8.4 Bianchi Identities
9 Metrics
    9.1 Metric Compatibility
    9.2 Metrics and Lagrange Equations
    9.3 General Relativity
10 Hodge Star and Electrodynamics
    10.1 The Wave Equation
    10.2 Electrodynamics
11 Special Relativity
    11.1 Lorentz Transformations
    11.2 Special Relativity Frames
    11.3 Twin Paradox
A Scheme
B Our Notation
C Tensors
References
Index

Preface
Learning physics is hard. Part of the problem is that physics is naturally expressed in mathematical language. When we teach we use the language of mathematics in the same way that we use our natural language. We depend upon a vast amount of shared knowledge and culture, and we only sketch an idea using mathematical idioms. We are insufficiently precise to convey an idea to a person who does not share our culture. Our problem is that since we share the culture we find it difficult to notice that what we say is too imprecise to be clearly understood by a student new to the subject. A student must simultaneously learn the mathematical language and the content that is expressed in that language. This is like trying to read Les Misérables while struggling with French grammar.
This book is an effort to ameliorate this problem for learning the differential geometry needed as a foundation for a deep understanding of general relativity or quantum field theory. Our approach differs from the traditional one in several ways. Our coverage is unusual. We do not prove the general Stokes’s Theorem (this is well covered in many other books); instead, we show how it works in two dimensions. Because our target is relativity, we put lots of emphasis on the development of the covariant derivative, and we erect a common context for understanding both the Lie derivative and the covariant derivative. Most treatments of differential geometry aimed at relativity assume that there is a metric (or pseudometric). By contrast, we develop as much material as possible independent of the assumption of a metric. This allows us to see what results depend on the metric when we introduce it. We also try to avoid the use of traditional index notation for tensors. Although one can become very adept at “index gymnastics,” that leads to much mindless (though useful) manipulation without much thought to meaning. Instead, we use a semantically richer language of vector fields and differential forms.
But the single biggest difference between our treatment and others is that we integrate computer programming into our explanations. By programming a computer to interpret our formulas we soon learn whether or not a formula is correct. If a formula is not clear, it will not be interpretable. If it is wrong, we will get a wrong answer. In either case we are led to improve our program and as a result improve our understanding. We have been teaching advanced classical mechanics at MIT for many years using this strategy. We use precise functional notation and we have students program in a functional language. The students enjoy this approach and we have learned a lot ourselves. It is the experience of writing software for expressing the mathematical content and the insights that we gain from doing it that we feel is revolutionary. We want others to have a similar experience.

Acknowledgments
We thank the people who helped us develop this material, and especially the students who have over the years worked through the material with us. In particular, Mark Tobenkin, William Throwe, Leo Stein, Peter Iannucci, and Micah Brodsky have suffered through bad explanations and have contributed better ones.
Edmund Bertschinger, Norman Margolus, Tom Knight, Rebecca Frankel, Alexey Radul, Edwin Taylor, Joel Moses, Kenneth Yip, and Hal Abelson helped us with many thoughtful discussions and advice about physics and its relation to mathematics.
We also thank Chris Hanson, Taylor Campbell, and the community of Scheme programmers for providing support and advice for the elegant language that we use. In particular, Gerald Jay Sussman wants to thank Guy Lewis Steele and Alexey Radul for many fun days of programming together—we learned much from each other’s style.
Matthew Halfant started us on the development of the Scmutils system. He encouraged us to get into scientiﬁc computation, using Scheme and functional style as an active way to explain the ideas, without the distractions of imperative languages such as C. In the 1980s he wrote some of the early Scheme procedures for numerical computation that we still use.
Dan Zuras helped us with the invention of the unique organization of the Scmutils system. It is because of his insight that the system is organized around a generic extension of the chain rule for taking derivatives. He also helped in the heavy lifting that was required to make a really good polynomial GCD algorithm, based on ideas we learned from Richard Zippel.
A special contribution that cannot be sufficiently acknowledged is from Seymour Papert and Marvin Minsky, who taught us that the practice of programming is a powerful way to develop a deeper understanding of any subject. Indeed, by the act of debugging we learn about our misconceptions, and by reflecting on our bugs and their resolutions we learn ways to learn more effectively. Indeed, Turtle Geometry [2], a beautiful book about discrete differential geometry at a more elementary level, was inspired by Papert’s work on education [13].
We acknowledge the generous support of the Computer Science and Artiﬁcial Intelligence Laboratory of the Massachusetts Institute of Technology. The laboratory provides a stimulating environment for eﬀorts to formalize knowledge with computational methods. We also acknowledge the Panasonic Corporation (formerly the Matsushita Electric Industrial Corporation) for support of Gerald Jay Sussman through an endowed chair.
Jack Wisdom thanks his wife, Cecile, for her love and support. Julie Sussman, PPA, provided careful reading and serious criticism that inspired us to reorganize and rewrite major parts of the text. She has also developed and maintained Gerald Jay Sussman over these many years.

Gerald Jay Sussman & Jack Wisdom
Cambridge, Massachusetts, USA
August 2012

Prologue
Programming and Understanding
One way to become aware of the precision required to unambiguously communicate a mathematical idea is to program it for a computer. Rather than using canned programs purely as an aid to visualization or numerical computation, we use computer programming in a functional style to encourage clear thinking. Programming forces us to be precise and unambiguous, without forcing us to be excessively rigorous. The computer does not tolerate vague descriptions or incomplete constructions. Thus the act of programming makes us keenly aware of our errors of reasoning or unsupported conclusions.1
Although this book is about diﬀerential geometry, we can show how thinking about programming can help in understanding in a more elementary context. The traditional use of Leibniz’s notation and Newton’s notation is convenient in simple situations, but in more complicated situations it can be a serious handicap to clear reasoning.
A mechanical system is described by a Lagrangian function of the system state (time, coordinates, and velocities). A motion of the system is described by a path that gives the coordinates for each moment of time. A path is allowed if and only if it satisﬁes the Lagrange equations. Traditionally, the Lagrange equations are written
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0.$$
What could this expression possibly mean?
Let’s try to write a program that implements Lagrange equations. What are Lagrange equations for? Our program must take a proposed path and give a result that allows us to decide if the path is allowed. This is already a problem; the equation shown above does not have a slot for a path to be tested.
1 The idea of using computer programming to develop skills of clear thinking was originally advocated by Seymour Papert. An extensive discussion of this idea, applied to the education of young children, can be found in Papert [13].

So we have to figure out how to insert the path to be tested. The partial derivatives do not depend on the path; they are derivatives of the Lagrangian function and thus they are functions with the same arguments as the Lagrangian. But the time derivative d/dt makes sense only for a function of time. Thus we must be intending to substitute the path (a function of time) and its derivative (also a function of time) into the coordinate and velocity arguments of the partial derivative functions.
So probably we meant something like the following (assume that w is a path through the coordinate configuration space, and so w(t) specifies the configuration coordinates at time t):

$$\frac{d}{dt}\left(\left.\frac{\partial L(t, q, \dot{q})}{\partial \dot{q}}\right|_{\substack{q = w(t) \\ \dot{q} = \frac{dw(t)}{dt}}}\right) - \left.\frac{\partial L(t, q, \dot{q})}{\partial q}\right|_{\substack{q = w(t) \\ \dot{q} = \frac{dw(t)}{dt}}} = 0.$$

In this equation we see that the partial derivatives of the Lagrangian function are taken, then the path and its derivative are substituted for the position and velocity arguments of the Lagrangian, resulting in an expression in terms of the time.
This equation is complete. It has meaning independent of the context and there is nothing left to the imagination. The earlier equations require the reader to ﬁll in lots of detail that is implicit in the context. They do not have a clear meaning independent of the context.
By thinking computationally we have reformulated the Lagrange equations into a form that is explicit enough to specify a computation. We could convert it into a program for any symbolic manipulation program because it tells us how to manipulate expressions to compute the residuals of Lagrange’s equations for a purported solution path.2

2 The residuals of equations are the expressions whose value must be zero if the equations are satisfied. For example, if we know that for an unknown x, $x^3 - x = 0$, then the residual is $x^3 - x$. We can try x = −1 and find a residual of 0, indicating that our purported solution satisfies the equation. A residual may provide information. For example, if we have the differential equation $df(x)/dx - a f(x) = 0$ and we plug in a test solution $f(x) = A e^{bx}$ we obtain the residual $(b - a) A e^{bx}$, which can be zero only if b = a.
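
The two residuals in this footnote can be checked numerically. The following is a minimal Python sketch, not part of the book’s software; the function names `residual` and `ode_residual` are hypothetical and chosen only for this illustration.

```python
import math

def residual(x):
    """Residual of the equation x^3 - x = 0 at a trial value x."""
    return x**3 - x

def ode_residual(A, a, b, x):
    """Residual of df/dx - a*f = 0 for the trial solution f(x) = A*exp(b*x)."""
    f = A * math.exp(b * x)
    df = b * A * math.exp(b * x)   # derivative of the trial solution
    return df - a * f

print(residual(-1))                        # x = -1 satisfies the equation
print(residual(2))                         # x = 2 does not
print(ode_residual(1.5, 2.0, 2.0, 0.3))    # b = a: the residual vanishes
print(ode_residual(1.5, 2.0, 3.0, 0.3))    # b != a: (b - a) * A * exp(b*x)
```

A nonzero residual says more than “wrong”: its form, here proportional to b − a, indicates how the trial solution must be adjusted.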

Functional Abstraction

But this corrected use of Leibniz notation is ugly. We had to introduce extraneous symbols (q and q˙) in order to indicate the argument position specifying the partial derivative. Nothing would change here if we replaced q and q˙ by a and b.3 We can simplify the notation by admitting that the partial derivatives of the Lagrangian are themselves new functions, and by specifying the particular partial derivative by the position of the argument that is varied

$$\frac{d}{dt}\left((\partial_2 L)\left(t, w(t), \frac{d}{dt}w(t)\right)\right) - (\partial_1 L)\left(t, w(t), \frac{d}{dt}w(t)\right) = 0,$$

where ∂iL is the function which is the partial derivative of the function L with respect to the ith argument.4
Two diﬀerent notions of derivative appear in this expression. The functions ∂2L and ∂1L, constructed from the Lagrangian L, have the same arguments as L. The derivative d/dt is an expression derivative. It applies to an expression that involves the variable t and it gives the rate of change of the value of the expression as the value of the variable t is varied.
These are both useful interpretations of the idea of a derivative. But functions give us more power. There are many equivalent ways to write expressions that compute the same value. For example 1/(1/r1 + 1/r2) = (r1r2)/(r1 + r2). These expressions compute the same function of the two variables r1 and r2. The ﬁrst expression fails if r1 = 0 but the second one gives the right value of the function. If we abstract the function, say as Π(r1, r2), we can ignore the details of how it is computed. The ideas become clearer because they do not depend on the detailed shape of the expressions.
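
The point about equivalent expressions can be seen by evaluating both forms. This is a small Python sketch under our own naming (the two function names are hypothetical); it only illustrates that the expressions agree where both are defined and differ in where they fail.

```python
def pi_reciprocal_form(r1, r2):
    """First expression: 1/(1/r1 + 1/r2)."""
    return 1 / (1 / r1 + 1 / r2)

def pi_product_form(r1, r2):
    """Second expression: (r1*r2)/(r1 + r2)."""
    return (r1 * r2) / (r1 + r2)

# Where both expressions are defined they compute the same value.
print(pi_reciprocal_form(2.0, 3.0), pi_product_form(2.0, 3.0))

# At r1 = 0 the first expression divides by zero; the second does not.
try:
    pi_reciprocal_form(0.0, 3.0)
except ZeroDivisionError:
    print("reciprocal form fails at r1 = 0")
print(pi_product_form(0.0, 3.0))
```

A caller who knows only the abstract function Π(r1, r2) need not care which expression sits inside it.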

3 That the symbols q and q˙ can be replaced by other arbitrarily chosen nonconflicting symbols without changing the meaning of the expression tells us that the partial derivative symbol is a logical quantifier, like forall and exists (∀ and ∃).
4 The argument positions of the Lagrangian are indicated by indices starting with zero for the time argument.

So let’s get rid of the expression derivative d/dt and replace it
with an appropriate functional derivative. If f is a function then we will write Df as the new function that is the derivative of f :5

$$(Df)(t) = \left.\frac{d}{dx} f(x)\right|_{x=t}.$$

To do this for the Lagrange equation we need to construct a function to take the derivative of.
Given a conﬁguration-space path w, there is a standard way to make the state-space path. We can abstract this method as a mathematical function Γ:

$$\Gamma[w](t) = \left(t, w(t), \frac{d}{dt}w(t)\right).$$

Using Γ we can write:

$$\frac{d}{dt}\left((\partial_2 L)(\Gamma[w](t))\right) - (\partial_1 L)(\Gamma[w](t)) = 0.$$
If we now deﬁne composition of functions (f ◦ g)(x) = f (g(x)), we can express the Lagrange equations entirely in terms of functions:

D((∂2L) ◦ (Γ[w])) − (∂1L) ◦ (Γ[w]) = 0.
The functions ∂1L and ∂2L are partial derivatives of the function L. Composition with Γ[w] evaluates these partials with coordinates and velocities appropriate for the path w, making functions of time. Applying D takes the time derivative. The Lagrange equation states that the difference of the resulting functions of time must be zero. This statement of the Lagrange equation is complete, unambiguous, and functional. It is not encumbered with the particular choices made in expressing the Lagrangian. For example, it doesn’t matter if the time is named t or τ, and it has an explicit place for the path to be tested.
This expression is equivalent to a computer program:6

5 An explanation of functional derivatives is in Appendix B, page 202.
6 The programs in this book are written in Scheme, a dialect of Lisp. The details of the language are not germane to the points being made. What is important is that it is mechanically interpretable, and thus unambiguous. In this book we require that the mathematical expressions be explicit enough

(define ((Lagrange-equations Lagrangian) w)
  (- (D (compose ((partial 2) Lagrangian) (Gamma w)))
     (compose ((partial 1) Lagrangian) (Gamma w))))

In the Lagrange equations procedure the parameter Lagrangian is a procedure that implements the Lagrangian. The derivatives of the Lagrangian, for example ((partial 2) Lagrangian), are also procedures. The state-space path procedure (Gamma w) is constructed from the conﬁguration-space path procedure w by the procedure Gamma:

(define ((Gamma w) t)
  (up t (w t) ((D w) t)))

where up is a constructor for a data structure that represents a state of the dynamical system (time, coordinates, velocities).
The result of applying the Lagrange-equations procedure to a procedure Lagrangian that implements a Lagrangian function is a procedure that takes a conﬁguration-space path procedure w and returns a procedure that gives the residual of the Lagrange equations for that path at a time.
For example, consider the harmonic oscillator, with Lagrangian

$$L(t, q, v) = \frac{1}{2} m v^2 - \frac{1}{2} k q^2,$$

for mass m and spring constant k. This Lagrangian is implemented by

(define ((L-harmonic m k) local)
  (let ((q (coordinate local))
        (v (velocity local)))
    (- (* 1/2 m (square v))
       (* 1/2 k (square q)))))

We know that the motion of a harmonic oscillator is a sinusoid with a given amplitude a, frequency ω, and phase ϕ:

x(t) = a cos(ωt + ϕ).

(Footnote 6, continued) that they can be expressed as computer programs. Scheme is chosen because it is easy to write programs that manipulate representations of mathematical functions. An informal description of Scheme can be found in Appendix A. The use of Scheme to represent mathematical objects can be found in Appendix B. A formal description of Scheme can be obtained in [10]. You can get the software from [21].

Suppose we have forgotten how the constants in the solution relate to the physical parameters of the oscillator. Let’s plug in the proposed solution and look at the residual:

(define (proposed-solution t)
  (* 'a (cos (+ (* 'omega t) 'phi))))

(show-expression
 (((Lagrange-equations (L-harmonic 'm 'k))
   proposed-solution)
  't))

$$\cos(\omega t + \varphi)\, a\, (k - m\omega^2)$$

The residual here shows that for nonzero amplitude, the only solutions allowed are ones where $(k - m\omega^2) = 0$ or $\omega = \sqrt{k/m}$.
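
The same residual can be checked numerically without a symbolic system. The following Python sketch is only an illustration under stated assumptions: it replaces the exact derivative operator D by central finite differences with step `H`, and the names (`lagrange_equations`, `Gamma`, `partial`, `L_harmonic`) merely mirror the Scheme procedures above. It is not the Scmutils implementation.

```python
import math

H = 1e-3  # finite-difference step (an assumption of this sketch)

def D(f):
    """Approximate derivative of a function of one real variable."""
    return lambda t: (f(t + H) - f(t - H)) / (2 * H)

def Gamma(w):
    """Lift a configuration path w to the state path t -> (t, w(t), Dw(t))."""
    Dw = D(w)
    return lambda t: (t, w(t), Dw(t))

def partial(i, L):
    """Approximate partial derivative of L with respect to slot i of the state."""
    def dL(state):
        plus = list(state)
        minus = list(state)
        plus[i] += H
        minus[i] -= H
        return (L(tuple(plus)) - L(tuple(minus))) / (2 * H)
    return dL

def compose(f, g):
    return lambda x: f(g(x))

def lagrange_equations(L):
    """Return a function mapping a path w to its Lagrange-equation residual."""
    def residual_for(w):
        momentum = compose(partial(2, L), Gamma(w))
        force = compose(partial(1, L), Gamma(w))
        d_momentum = D(momentum)
        return lambda t: d_momentum(t) - force(t)
    return residual_for

def L_harmonic(m, k):
    return lambda state: 0.5 * m * state[2] ** 2 - 0.5 * k * state[1] ** 2

def proposed_solution(a, omega, phi):
    return lambda t: a * math.cos(omega * t + phi)

m, k, a, phi, t = 2.0, 8.0, 1.0, 0.3, 0.7
for omega in (math.sqrt(k / m), 3.0):
    w = proposed_solution(a, omega, phi)
    numeric = lagrange_equations(L_harmonic(m, k))(w)(t)
    predicted = a * math.cos(omega * t + phi) * (k - m * omega ** 2)
    print(omega, numeric, predicted)
```

With ω = √(k/m) the numerical residual is zero up to discretization error; for any other ω it agrees with a cos(ωt + ϕ)(k − mω²) to the same accuracy.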
But, suppose we had no idea what the solution looks like. We could propose a literal function for the path:
(show-expression
 (((Lagrange-equations (L-harmonic 'm 'k))
   (literal-function 'x))
  't))

$$k\,x(t) + m\,D^2x(t)$$

If this residual is zero we have the Lagrange equation for the harmonic oscillator.
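
For readers who want to see where this residual comes from, here is the computation done by hand, using only the definitions of ∂1, ∂2, Γ, and D given earlier and the harmonic-oscillator Lagrangian:

```latex
\begin{aligned}
(\partial_1 L)(t, q, v) &= -k q, \qquad (\partial_2 L)(t, q, v) = m v, \\
((\partial_2 L) \circ \Gamma[x])(t) &= m\,Dx(t), \qquad
((\partial_1 L) \circ \Gamma[x])(t) = -k\,x(t), \\
D((\partial_2 L) \circ \Gamma[x])(t) - ((\partial_1 L) \circ \Gamma[x])(t)
  &= m\,D^2x(t) + k\,x(t).
\end{aligned}
```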
Note that we can ﬂexibly manipulate representations of mathematical functions. (See Appendices A and B.)
We started out thinking that the original statement of Lagrange’s equations accurately captured the idea. But we really don’t know until we try to teach it to a naive student. If the student is suﬃciently ignorant, but is willing to ask questions, we are led to clarify the equations in the way that we did. There is no dumber but more insistent student than a computer. A computer will absolutely refuse to accept a partial statement, with missing parameters or a type error. In fact, the original statement of Lagrange’s equations contained an obvious type error: the Lagrangian is a function of multiple variables, but the d/dt is applicable only to functions of one variable.

1 Introduction
Philosophy is written in that great book which ever lies before our eyes—I mean the Universe—but we cannot understand it if we do not learn the language and grasp the symbols in which it is written. This book is written in the mathematical language, and the symbols are triangles, circles, and other geometrical ﬁgures without whose help it is impossible to comprehend a single word of it, without which one wanders in vain through a dark labyrinth.
Galileo Galilei [8]
Diﬀerential geometry is a mathematical language that can be used to express physical concepts. In this introduction we show a typical use of this language. Do not panic! At this point we do not expect you to understand the details of what we are showing. All will be explained as needed in the text. The purpose is to get the ﬂavor of this material.
At the North Pole inscribe a line in the ice perpendicular to the Greenwich Meridian. Hold a stick parallel to that line and walk down the Greenwich Meridian keeping the stick parallel to itself as you walk. (The phrase “parallel to itself” is a way of saying that as you walk you keep its orientation unchanged. The stick will be aligned East-West, perpendicular to your direction of travel.) When you get to the Equator the stick will be parallel to the Equator. Turn East, and walk along the Equator, keeping the stick parallel to the Equator. Continue walking until you get to the 90°E meridian. When you reach the 90°E meridian turn North and walk back to the North Pole keeping the stick parallel to itself. Note that the stick is perpendicular to your direction of travel. When you get to the Pole note that the stick is perpendicular to the line you inscribed in the ice. But you started with that stick parallel to that line and you kept the stick pointing in the same direction on the Earth throughout your walk. How did it change orientation?

The answer is that you walked a closed loop on a curved surface. As seen in three dimensions the stick was actually turning as you walked along the Equator, because you always kept the stick parallel to the curving surface of the Earth. But as a denizen of a 2-dimensional surface, it seemed to you that you kept the stick parallel to itself as you walked, even when making a turn. Even if you had no idea that the surface of the Earth was embedded in a 3-dimensional space you could use this experiment to conclude that the Earth was not ﬂat. This is a small example of intrinsic geometry. It shows that the idea of parallel transport is not simple. For a general surface it is necessary to explicitly deﬁne what we mean by parallel.
If you walked a smaller loop, the angle between the starting orientation and the ending orientation of the stick would be smaller. For small loops it would be proportional to the area of the loop you walked. This constant of proportionality is a measure of the curvature. The result does not depend on how fast you walked, so this is not a dynamical phenomenon.
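
The walk can be imitated numerically. The Python sketch below is a hedged illustration rather than the method developed in this book: it assumes a unit sphere embedded in 3-dimensional space and approximates parallel transport by repeatedly projecting the stick onto the tangent plane at the next point of the path. The helper names `transport` and `arc` are hypothetical.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def transport(v, path):
    """Carry the tangent vector v along the points of path on the unit sphere.

    At each step the component of v along the surface normal (which is the
    position vector p on a unit sphere) is removed and v is rescaled.
    """
    for p in path:
        dot = sum(a * b for a, b in zip(v, p))
        v = normalize(tuple(a - dot * b for a, b in zip(v, p)))
    return v

def arc(point_at, n=1000):
    """Sample n points of a curve parameterized by s in (0, 1]."""
    return [point_at(i / n) for i in range(1, n + 1)]

q = math.pi / 2  # a quarter turn
down_greenwich = arc(lambda s: (math.sin(s * q), 0.0, math.cos(s * q)))
along_equator = arc(lambda s: (math.cos(s * q), math.sin(s * q), 0.0))
up_90E = arc(lambda s: (0.0, math.cos(s * q), math.sin(s * q)))

start = (0.0, 1.0, 0.0)  # stick at the Pole, perpendicular to the Greenwich Meridian
end = transport(start, down_greenwich + along_equator + up_90E)

cos_angle = sum(a * b for a, b in zip(start, end))
angle = math.acos(max(-1.0, min(1.0, cos_angle)))
octant_area = 4 * math.pi / 8   # the loop encloses one eighth of the unit sphere

print(angle, octant_area)
```

The stick returns rotated by a quarter turn, and the loop encloses one eighth of the unit sphere, whose area is also π/2. On a sphere of radius R the rotation angle equals the enclosed area divided by R², which is the proportionality described above.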
Denizens of the surface may play ball games. The balls are constrained to the surface; otherwise they are free particles. The paths of the balls are governed by dynamical laws. This motion is a solution of the Euler-Lagrange equations1 for the free-particle Lagrangian with coordinates that incorporate the constraint of living in the surface. There are coefficients of terms in the Euler-Lagrange equations that arise naturally in the description of the behavior of the stick when walking loops on the surface, connecting the static shape of the surface with the dynamical behavior of the balls. It turns out that the dynamical evolution of the balls may be viewed as parallel transport of the ball’s velocity vector in the direction of the velocity vector. This motion by parallel transport of the velocity is called geodesic motion.
So there are deep connections between the dynamics of particles and the geometry of the space that the particles move in. If we understand this connection we can learn about dynamics by studying geometry and we can learn about geometry by studying dynamics. We enter dynamics with a Lagrangian and the associated Lagrange equations. Although this formulation exposes many important features of the system, such as how symmetries relate to conserved quantities, the geometry is not apparent. But when we express the Lagrangian and the Lagrange equations in differential geometry language, geometric properties become apparent. In the case of systems with no potential energy the Euler-Lagrange equations are equivalent to the geodesic equations on the configuration manifold. In fact, the coefficients of terms in the Lagrange equations are Christoffel coefficients, which define parallel transport on the manifold. Let’s look into this a bit.

1 It is customary to shorten “Euler-Lagrange equations” to “Lagrange equations.” We hope Leonhard Euler is not disturbed.

Lagrange Equations
We write the Lagrange equations in functional notation2 as follows:
D(∂2L ◦ Γ[q]) − ∂1L ◦ Γ[q] = 0.
In SICM [19], Section 1.6.3, we showed that a Lagrangian describing the free motion of a particle subject to a coordinate-dependent constraint can be obtained by composing a free-particle Lagrangian with a function that describes how dynamical states transform given the coordinate transformation that describes the constraints.
A Lagrangian for a free particle of mass m and velocity v is just its kinetic energy, $mv^2/2$. The procedure Lfree implements the free Lagrangian:3

(define ((Lfree mass) state)
  (* 1/2 mass (square (velocity state))))

For us the dynamical state of a system of particles is a tuple of time, coordinates, and velocities. The free-particle Lagrangian depends only on the velocity part of the state.
For motion of a point constrained to move on the surface of a sphere the configuration space has two dimensions. We can describe the position of the point with the generalized coordinates colatitude and longitude. If the sphere is embedded in 3-dimensional space the position of the point in that space can be given by a coordinate transformation from colatitude and longitude to three rectangular coordinates.

2 A short introduction to our functional notation, and why we have chosen it, is given in the prologue: Programming and Understanding. More details can be found in Appendix B.
3 An informal description of the Scheme programming language can be found in Appendix A.

For a sphere of radius R the procedure sphere->R3 implements the transformation of coordinates from colatitude θ and longitude φ on the surface of the sphere to rectangular coordinates in the embedding space. (The $\hat{z}$ axis goes through the North Pole, and the Equator is in the plane z = 0.)

(define ((sphere->R3 R) state)
  (let ((q (coordinate state)))
    (let ((theta (ref q 0))
          (phi (ref q 1)))
      (up (* R (sin theta) (cos phi))    ; x
          (* R (sin theta) (sin phi))    ; y
          (* R (cos theta))))))          ; z

The coordinate transformation maps the generalized coordinates on the sphere to the 3-dimensional rectangular coordinates. Given this coordinate transformation we construct a corresponding transformation of velocities; these make up the state transformation. The procedure F->C implements the derivation of a transformation of states from a coordinate transformation:

(define ((F->C F) state)
  (up (time state)
      (F state)
      (+ (((partial 0) F) state)
         (* (((partial 1) F) state)
            (velocity state)))))

A Lagrangian governing free motion on a sphere of radius R is then the composition of the free Lagrangian with the transformation of states.

(define (Lsphere m R)
  (compose (Lfree m) (F->C (sphere->R3 R))))

So the value of the Lagrangian at an arbitrary dynamical state is:

((Lsphere 'm 'R)
 (up 't (up 'theta 'phi) (up 'thetadot 'phidot)))
(+ (* 1/2 m (expt R 2) (expt thetadot 2))
   (* 1/2 m (expt R 2) (expt (sin theta) 2) (expt phidot 2)))

or, in inﬁx notation:

$$\frac{1}{2} m R^2 \dot{\theta}^2 + \frac{1}{2} m R^2 (\sin(\theta))^2 \dot{\phi}^2. \tag{1.1}$$
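
Equation (1.1) can be checked numerically by carrying out the same composition with numbers instead of symbols. The following Python sketch makes several assumptions of its own: a state is a plain tuple (time, coordinates, velocities), the partial derivatives in the state transformation are approximated by central differences with step `h`, and the names `sphere_to_R3`, `F_to_C`, `L_free`, and `L_sphere` merely mirror the Scheme procedures.

```python
import math

def sphere_to_R3(R):
    """Colatitude and longitude on a sphere of radius R to rectangular x, y, z."""
    def F(state):
        _, (theta, phi), _ = state
        return (R * math.sin(theta) * math.cos(phi),
                R * math.sin(theta) * math.sin(phi),
                R * math.cos(theta))
    return F

def F_to_C(F, h=1e-5):
    """Build the state transformation from the coordinate transformation F."""
    def C(state):
        t, q, v = state
        # Partial derivative of F with respect to time (zero for the sphere).
        xdot = [(a - b) / (2 * h)
                for a, b in zip(F((t + h, q, v)), F((t - h, q, v)))]
        # Add (partial of F with respect to q_j) times v_j for each coordinate.
        for j in range(len(q)):
            qp = list(q)
            qm = list(q)
            qp[j] += h
            qm[j] -= h
            column = [(a - b) / (2 * h)
                      for a, b in zip(F((t, tuple(qp), v)), F((t, tuple(qm), v)))]
            xdot = [xd + c * v[j] for xd, c in zip(xdot, column)]
        return (t, F(state), tuple(xdot))
    return C

def L_free(mass):
    return lambda state: 0.5 * mass * sum(c * c for c in state[2])

def L_sphere(m, R):
    C = F_to_C(sphere_to_R3(R))
    free = L_free(m)
    return lambda state: free(C(state))

m, R = 2.0, 3.0
theta, phi, thetadot, phidot = 0.7, 1.1, 0.4, -0.9
numeric = L_sphere(m, R)((0.0, (theta, phi), (thetadot, phidot)))
formula = (0.5 * m * R**2 * thetadot**2
           + 0.5 * m * R**2 * math.sin(theta)**2 * phidot**2)
print(numeric, formula)
```

The two printed values agree up to the finite-difference error, as equation (1.1) predicts.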

The Metric

Let’s now take a step into the geometry. A surface has a metric which tells us how to measure sizes and angles at every point on the surface. (Metrics are introduced in Chapter 9.)
The metric is a symmetric function of two vector ﬁelds that gives a number for every point on the manifold. (Vector ﬁelds are introduced in Chapter 3). Metrics may be used to compute the length of a vector ﬁeld at each point, or alternatively to compute the inner product of two vector ﬁelds at each point. For example, the metric for the sphere of radius R is

$$g(u, v) = R^2\, d\theta(u)\, d\theta(v) + R^2 (\sin\theta)^2\, d\phi(u)\, d\phi(v), \tag{1.2}$$

where u and v are vector ﬁelds, and dθ and dφ are one-form ﬁelds that extract the named components of the vector-ﬁeld argument. (One-form ﬁelds are introduced in Chapter 3.) We can think of dθ(u) as a function of a point that gives the size of the vector ﬁeld u in the θ direction at the point. Notice that g(u, u) is a weighted sum of the squares of the components of u. In fact, if we identify
$$d\theta(v) = \dot{\theta}, \qquad d\phi(v) = \dot{\phi},$$

then the coeﬃcients in the metric are the same as the coeﬃcients in the value of the Lagrangian, equation (1.1), apart from a factor of m/2.
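
Written out, the comparison is a one-line substitution: evaluate the metric of equation (1.2) on the pair (v, v), use the identification of the components of v with the velocities, and multiply by m/2:

```latex
\frac{m}{2}\, g(v, v)
  = \frac{m}{2}\left( R^2 \bigl(d\theta(v)\bigr)^2
      + R^2 (\sin\theta)^2 \bigl(d\phi(v)\bigr)^2 \right)
  = \frac{1}{2} m R^2 \dot{\theta}^2
      + \frac{1}{2} m R^2 (\sin\theta)^2 \dot{\phi}^2 .
```

The right-hand side is exactly the value of the Lagrangian in equation (1.1).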
We can generalize this result and write a Lagrangian for free motion of a particle of mass m on a manifold with metric g:

$$L_2(x, v) = \sum_{ij} \frac{1}{2} m\, g_{ij}(x)\, v^i v^j. \tag{1.3}$$

This is written using indexed variables to indicate components of the geometric objects expressed with respect to an unspecified coordinate system. The metric coefficients $g_{ij}$ are, in general, a function of the position coordinates x, because the properties of the space may vary from place to place.
We can capture this geometric statement as a program:

(define ((L2 mass metric) place velocity)
  (* 1/2 mass ((metric velocity velocity) place)))

This program gives the Lagrangian in a coordinate-independent, geometric way. It is entirely in terms of geometric objects, such as a place on the conﬁguration manifold, the velocity at that place, and the metric that describes the local shape of the manifold. But to compute we need a coordinate system. We express the dynamical state in terms of coordinates and velocity components in the coordinate system. For each coordinate system there is a natural vector basis and the geometric velocity vectors can be constructed by contracting the basis with the components of the velocity. Thus, we can form a coordinate representation of the Lagrangian.

(define ((Lc mass metric coordsys) state)
  (let ((x (coordinates state))
        (v (velocities state))
        (e (coordinate-system->vector-basis coordsys)))
    ((L2 mass metric) ((point coordsys) x) (* e v))))

The manifold point m represented by the coordinates x is given by (define m ((point coordsys) x)). The coordinates of m in a diﬀerent coordinate system are given by ((chart coordsys2) m). The manifold point m is a geometric object that is the same point independent of how it is speciﬁed. Similarly, the velocity vector ev is a geometric object, even though it is speciﬁed using components v with respect to the basis e. Both v and e have as many components as the dimension of the space so their product is interpreted as a contraction.
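
The relationship between a manifold point and its coordinates in different systems can be pictured with a toy model. The Python sketch below is only an analogy under loud assumptions: a point of the plane is stored internally by its rectangular coordinates (any fixed internal representation would do), and the four functions are hypothetical stand-ins for (point R2-rect), (chart R2-rect), (point R2-polar), and (chart R2-polar). It is not how Scmutils represents manifolds.

```python
import math

# Assumption: a manifold point is a tagged tuple holding rectangular coordinates.
def rect_point(coords):
    x, y = coords
    return ("R2-point", x, y)

def rect_chart(m):
    _, x, y = m
    return (x, y)

def polar_point(coords):
    r, theta = coords
    return ("R2-point", r * math.cos(theta), r * math.sin(theta))

def polar_chart(m):
    _, x, y = m
    return (math.hypot(x, y), math.atan2(y, x))

m = rect_point((1.0, 1.0))          # the point, specified in rectangular coordinates
print(polar_chart(m))               # the same point, read in polar coordinates
print(rect_chart(polar_point(polar_chart(m))))   # round trip back to rectangular
```

The point `m` is the same object however it was specified; only its coordinate descriptions differ.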
Let’s make a general metric on a 2-dimensional real manifold:4

(define the-metric (literal-metric 'g R2-rect))

4 The procedure literal-metric provides a metric. It is a general symmetric function of two vector fields, with literal functions of the coordinates of the manifold points for its coefficients in the given coordinate system. The quoted symbol 'g is used to make the names of the literal coefficient functions. Literal functions are discussed in Appendix B.


The metric is expressed in rectangular coordinates, so the coordinate system is R2-rect.5 The component functions will be labeled
as subscripted gs.
We can now make the Lagrangian for the system:

(define L (Lc 'm the-metric R2-rect))

And we can apply our Lagrangian to an arbitrary state:

(L (up 't (up 'x 'y) (up 'vx 'vy)))
(+ (* 1/2 m (g_00 (up x y)) (expt vx 2))
   (* m (g_01 (up x y)) vx vy)
   (* 1/2 m (g_11 (up x y)) (expt vy 2)))
Compare this result with equation (1.3).
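The expression printed above is the contraction of the velocity components with a symmetric 2-by-2 array of metric coefficients, scaled by half the mass. A minimal numerical sketch of that contraction follows, written in plain Python rather than the Scheme system used in the text. The function name `lagrangian` and all sample numbers are illustrative assumptions, with the metric coefficients frozen at a single point.

```python
def lagrangian(mass, g, v):
    """Return 1/2 * mass * sum over i, j of g[i][j] * v[i] * v[j]."""
    n = len(v)
    return 0.5 * mass * sum(g[i][j] * v[i] * v[j]
                            for i in range(n) for j in range(n))

# Hypothetical values of the metric coefficients g_00, g_01, g_11 at one point.
m = 2.0
g00, g01, g11 = 1.5, 0.25, 3.0
g = [[g00, g01], [g01, g11]]   # symmetric: g_10 = g_01
vx, vy = 0.7, -1.2

contracted = lagrangian(m, g, [vx, vy])

# The same quantity written out term by term, as in the symbolic result.
expanded = (0.5 * m * g00 * vx**2
            + m * g01 * vx * vy
            + 0.5 * m * g11 * vy**2)

print(contracted, expanded)
```

The two off-diagonal terms of the contraction combine into the single cross term, which is why that term carries no factor of 1/2.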

Euler-Lagrange Residuals
The Euler-Lagrange equations are satisﬁed on realizable paths. Let γ be a path on the manifold of conﬁgurations. (A path is a map from the 1-dimensional real line to the conﬁguration manifold. We introduce maps between manifolds in Chapter 6.) Consider an arbitrary path:6
(define gamma (literal-manifold-map 'q R1-rect R2-rect))
The values of γ are points on the manifold, not a coordinate representation of the points. We may evaluate gamma only on points of the real-line manifold; gamma produces points on the R2 manifold. So to go from the literal real-number coordinate ’t to a point on the real line we use ((point R1-rect) ’t) and to go from a point m in R2 to its coordinate representation we use ((chart R2-rect) m). (The procedures point and chart are introduced in Chapter 2.) Thus

5R2-rect is the usual rectangular coordinate system on the 2-dimensional real manifold. (See Section 2.1, page 13.) We supply common coordinate systems for n-dimensional real manifolds. For example, R2-polar is a polar coordinate system on the same manifold.
6The procedure literal-manifold-map makes a map from the manifold implied by its second argument to the manifold implied by the third argument. These arguments must be coordinate systems. The quoted symbol that is the ﬁrst argument is used to name the literal coordinate functions that deﬁne the map.


((chart R2-rect) (gamma ((point R1-rect) 't)))
(up (q^0 t) (q^1 t))
So, to work with coordinates we write:
(define coordinate-path
  (compose (chart R2-rect) gamma (point R1-rect)))
(coordinate-path 't)
(up (q^0 t) (q^1 t))
Now we can compute the residuals of the Euler-Lagrange equations, but we get a large messy expression that we will not show.7 However, we will save it to compare with the residuals of the geodesic equations.
(define Lagrange-residuals
  (((Lagrange-equations L) coordinate-path) 't))

Geodesic Equations
Now we get deeper into the geometry. The traditional way to write the geodesic equations is

$$\nabla_v v = 0 \tag{1.4}$$

where ∇ is a covariant derivative operator. Roughly, $\nabla_v w$ is a directional derivative. It gives a measure of the variation of the vector field w as you walk along the manifold in the direction of v. (We will explain this in depth in Chapter 7.) The equation $\nabla_v v = 0$ is intended to convey that the velocity vector is parallel-transported by itself. When you walked East on the Equator you had to hold the stick so that it was parallel to the Equator. But the stick is constrained to the surface of the Earth, so moving it along the Equator required turning it in three dimensions. The ∇ thus must incorporate the 3-dimensional shape of the Earth to provide a notion of “parallel” appropriate for the denizens of the surface of the Earth. This information will appear as the “Christoffel coefficients” in the coordinate representation of the geodesic equations.
The trouble with the traditional way to write the geodesic equations (1.4) is that the arguments to the covariant derivative are

7For an explanation of equation residuals see page xvi.


vector ﬁelds and the velocity along the path is not a vector ﬁeld. A more precise way of stating this relation is:

$$\nabla^{\gamma}_{\partial/\partial t}\, d\gamma(\partial/\partial t) = 0. \tag{1.5}$$

(We know that this may be unfamiliar notation, but we will explain it in Chapter 7.)
In coordinates, the geodesic equations are expressed

$$D^2 q^i(t) + \sum_{jk} \Gamma^i_{jk}(\gamma(t))\, Dq^j(t)\, Dq^k(t) = 0, \tag{1.6}$$

where q(t) is the coordinate path corresponding to the manifold path γ, and $\Gamma^i_{jk}(m)$ are Christoffel coefficients. The $\Gamma^i_{jk}(m)$ describe the “shape” of the manifold close to the manifold point m. They can be derived from the metric g.
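For orientation, the usual coordinate-basis formula for that derivation is sketched below. It is the standard textbook expression, not one stated in this chapter; the symbols $g^{il}$ (components of the inverse metric) and $\partial_j$ (partial derivative with respect to the jth coordinate) are notation assumed here.

```latex
\Gamma^i_{jk} = \frac{1}{2} \sum_{l} g^{il}
  \left( \partial_j g_{lk} + \partial_k g_{lj} - \partial_l g_{jk} \right)
```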
We can get and save the geodesic equation residuals by:

(define geodesic-equation-residuals
  (((((covariant-derivative Cartan gamma) d/dt)
     ((differential gamma) d/dt))
    (chart R2-rect))
   ((point R1-rect) 't)))
where d/dt is a vector ﬁeld on the real line8 and Cartan is a way of encapsulating the geometry, as speciﬁed by the Christoﬀel coeﬃcients. The Christoﬀel coeﬃcients are computed from the metric:

(define Cartan
  (Christoffel->Cartan
   (metric->Christoffel-2 the-metric
                          (coordinate-system->basis R2-rect))))

The two messy residual results that we did not show are related by the metric. If we change the representation of the geodesic equations by “lowering” them using the mass and the metric, we see that the residuals are equal:

8We established t as a coordinate function on the rectangular coordinates of the real line by
(define-coordinates t R1-rect)
This had the eﬀect of also deﬁning d/dt as a coordinate vector ﬁeld and dt as a one-form ﬁeld on the real line.


(define metric-components
  (metric->components the-metric (coordinate-system->basis R2-rect)))

(- Lagrange-residuals
   (* (* 'm (metric-components (gamma ((point R1-rect) 't))))
      geodesic-equation-residuals))
(down 0 0)
This establishes that for a 2-dimensional space the Euler-Lagrange equations are equivalent to the geodesic equations. The Christoffel coeﬃcients that appear in the geodesic equation correspond to coeﬃcients of terms in the Euler-Lagrange equations. This analysis will work for any number of dimensions (but will take your computer longer in higher dimensions, because the complexity increases).

Exercise 1.1: Motion on a Sphere
The metric for a unit sphere, expressed in colatitude θ and longitude φ, is
$$g(u, v) = d\theta(u)\, d\theta(v) + (\sin\theta)^2\, d\phi(u)\, d\phi(v).$$
Compute the Lagrange equations for motion of a free particle on the sphere and convince yourself that they describe great circles. For example, consider motion on the equator (θ = π/2) and motion on a line of longitude (φ is constant).

2
Manifolds
A manifold is a generalization of our idea of a smooth surface embedded in Euclidean space. For an n-dimensional manifold, around every point there is a simply-connected open set, the coordinate patch, and a one-to-one continuous function, the coordinate function or chart, mapping every point in that open set to a tuple of n real numbers, the coordinates. In general, several charts are needed to label all points on a manifold. It is required that if a region is in more than one coordinate patch then the coordinates are consistent in that the function mapping one set of coordinates to another is continuous (and perhaps diﬀerentiable to some degree). A consistent system of coordinate patches and coordinate functions that covers the entire manifold is called an atlas.
An example of a 2-dimensional manifold is the surface of a sphere or of a coﬀee cup. The space of all conﬁgurations of a planar double pendulum is a more abstract example of a 2-dimensional manifold. A manifold that looks locally Euclidean may not look like Euclidean space globally: for example, it may not be simply connected. The surface of the coﬀee cup is not simply connected, because there is a hole in the handle for your ﬁngers.
An example of a coordinate function is the function that maps points in a simply-connected open neighborhood of the surface of a sphere to the tuple of latitude and longitude.1 If we want to talk about motion on the Earth, we can identify the space of configurations with a 2-sphere (the surface of a 3-dimensional ball). The map from the 2-sphere to the 3-dimensional coordinates of a point on the surface of the Earth captures the shape of the Earth.
Two angles specify the configuration of the planar double pendulum. The manifold of configurations is a torus, where each point on the torus corresponds to a configuration of the double pendulum. The constraints, such as the lengths of the pendulum rods, are built into the map between the generalized coordinates of points on the torus and the arrangements of masses in 3-dimensional space.

1The open set for a latitude-longitude coordinate system cannot include either pole (because longitude is not defined at the poles) or the 180° meridian (where the longitude is discontinuous). Other coordinate systems are needed to cover these places.

There are computational objects that we can use to model manifolds. For example, we can make an object that represents the plane2

(define R2 (make-manifold R^n 2))

and give it the name R2. One useful patch of the plane is the one that contains the origin and covers the entire plane.3

(define U (patch 'origin R2))

2.1 Coordinate Functions

A coordinate function χ maps points in a coordinate patch of a manifold to a coordinate tuple:4

x = χ(m),

(2.1)

where x may have a convenient tuple structure. Usually, the coordinates are arranged as an “up structure”; the coordinates are selected with superscripts:

$$x^i = \chi^i(m). \tag{2.2}$$

The number of independent components of x is the dimension of the manifold.
Assume we have two coordinate functions χ and χ′. The coordinate transformation from χ′ coordinates to χ coordinates is just the composition $\chi \circ \chi'^{-1}$, where $\chi'^{-1}$ is the functional inverse of χ′ (see figure 2.1). We assume that the coordinate transformation is continuous and differentiable to any degree we require.

2 The expression R^n gives only one kind of manifold. We also have spheres S^n and SO3.
3The word origin is an arbitrary symbol here. It labels a predeﬁned patch in R^n manifolds.
4In the text that follows we will use sans-serif names, such as f, v, m, to refer to objects deﬁned on the manifold. Objects that are deﬁned on coordinates (tuples of real numbers) will be named with symbols like f , v, x.


Figure 2.1 Here there are two overlapping coordinate patches that are the domains of the two coordinate functions χ and χ′. It is possible to represent manifold points in the overlap using either coordinate system. The coordinate transformation from χ′ coordinates to χ coordinates is just the composition $\chi \circ \chi'^{-1}$.
Given a coordinate system coordsys for a patch on a manifold the procedure that implements the function χ that gives coordinates for a point is (chart coordsys). The procedure that implements the inverse map that gives a point for coordinates is (point coordsys).
We can have both rectangular and polar coordinates on a patch of the plane identiﬁed by the origin:5,6
;; Some charts on the patch U
(define R2-rect (coordinate-system 'rectangular U))
(define R2-polar (coordinate-system 'polar/cylindrical U))
For each of the coordinate systems above we obtain the coordinate functions and their inverses:
5The rectangular coordinates are good for the entire plane, but the polar coordinates are singular at the origin because the angle is not defined. Also, the patch for polar coordinates must exclude one ray from the origin, because of the angle variable.
6We can avoid explicitly naming the patch:
(define R2-rect (coordinate-system-at 'rectangular 'origin R2))


(define R2-rect-chi (chart R2-rect))
(define R2-rect-chi-inverse (point R2-rect))
(define R2-polar-chi (chart R2-polar))
(define R2-polar-chi-inverse (point R2-polar))
The coordinate transformations are then just compositions. The polar coordinates of a rectangular point are:
((compose R2-polar-chi R2-rect-chi-inverse) (up 'x0 'y0))
(up (sqrt (+ (expt x0 2) (expt y0 2))) (atan y0 x0))
And the rectangular coordinates of a polar point are:
((compose R2-rect-chi R2-polar-chi-inverse) (up 'r0 'theta0))
(up (* r0 (cos theta0)) (* r0 (sin theta0)))
And we can obtain the Jacobian of the polar-to-rectangular transformation by taking its derivative:7
((D (compose R2-rect-chi R2-polar-chi-inverse)) (up 'r0 'theta0))
(down (up (cos theta0) (sin theta0)) (up (* -1 r0 (sin theta0)) (* r0 (cos theta0))))
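The down tuple above holds one up tuple per input coordinate: the partial derivatives of (x, y) with respect to r, then with respect to θ. As a rough cross-check, the sketch below estimates the same partial derivatives by central differences in plain Python and compares them with the symbolic entries. The helper names `polar_to_rect` and `jacobian_columns`, the step size, and the sample point are assumptions made for illustration.

```python
import math

def polar_to_rect(r, theta):
    """Rectangular coordinates of the point with polar coordinates (r, theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_columns(fn, r, theta, h=1e-6):
    """Central-difference partials of fn with respect to r and to theta."""
    d_r = [(a - b) / (2 * h)
           for a, b in zip(fn(r + h, theta), fn(r - h, theta))]
    d_theta = [(a - b) / (2 * h)
               for a, b in zip(fn(r, theta + h), fn(r, theta - h))]
    return d_r, d_theta

r0, theta0 = 2.0, 0.6   # arbitrary sample point away from the origin
numeric = jacobian_columns(polar_to_rect, r0, theta0)

# Entries read off from the symbolic result in the text.
analytic = ([math.cos(theta0), math.sin(theta0)],
            [-r0 * math.sin(theta0), r0 * math.cos(theta0)])

max_error = max(abs(n - a)
                for ncol, acol in zip(numeric, analytic)
                for n, a in zip(ncol, acol))
print(max_error)
```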

2.2 Manifold Functions

Let f be a real-valued function on a manifold M: this function maps points m on the manifold to real numbers.
This function has a coordinate representation $f_\chi$ with respect to the coordinate function χ (see figure 2.2):

$$f_\chi = f \circ \chi^{-1}. \tag{2.3}$$

Both the coordinate representation $f_\chi$ and the tuple x depend on the coordinate system, but the value $f_\chi(x)$ is independent of coordinates:

$$f_\chi(x) = (f \circ \chi^{-1})(\chi(m)) = f(m). \tag{2.4}$$

7See Appendix B for an introduction to tuple arithmetic and a discussion of derivatives of functions with structured input or output.


Figure 2.2 The coordinate function χ maps points on the manifold in the coordinate patch to a tuple of coordinates. A function f on the manifold M can be represented in coordinates by a function $f_\chi = f \circ \chi^{-1}$.

The subscript χ may be dropped when it is unambiguous. For example, in a 2-dimensional real manifold the coordinates of a manifold point m are a pair of real numbers,

$$(x, y) = \chi(m), \tag{2.5}$$

and the manifold function $\mathsf{f}$ (sans-serif, following the convention of footnote 4) is represented in coordinates by a function f that takes a pair of real numbers and produces a real number

$$f : \mathbb{R}^2 \to \mathbb{R}, \qquad f : (x, y) \mapsto f(x, y). \tag{2.6}$$

We define our manifold function

$$\mathsf{f} : M \to \mathbb{R}, \qquad \mathsf{f} : m \mapsto (f \circ \chi)(m). \tag{2.7}$$

Manifold Functions Are Coordinate Independent
We can illustrate the coordinate independence with a program. We will show that an arbitrary manifold function f, when defined by its coordinate representation in rectangular coordinates, has the same behavior when applied to a manifold point independent of whether the point is specified in rectangular or polar coordinates.


We deﬁne a manifold function by specifying its behavior in rectangular coordinates:8

(define f
  (compose (literal-function 'f-rect R2->R) R2-rect-chi))

where R2->R is a signature for functions that map an up structure of two reals to a real:

(define R2->R (-> (UP Real Real) Real))

We can specify a typical manifold point using its rectangular coordinates:

(define R2-rect-point (R2-rect-chi-inverse (up 'x0 'y0)))

We can describe the same point using its polar coordinates:

(define corresponding-polar-point
  (R2-polar-chi-inverse
   (up (sqrt (+ (square 'x0) (square 'y0)))
       (atan 'y0 'x0))))

(f R2-rect-point) and (f corresponding-polar-point) agree, even though the point has been speciﬁed in two diﬀerent coordinate systems:

(f R2-rect-point)
(f-rect (up x0 y0))

(f corresponding-polar-point)
(f-rect (up x0 y0))
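The same idea can be sketched numerically outside the Scheme system. In the plain-Python sketch below a manifold point is an opaque object, the charts and their inverses are ordinary functions, and the manifold function is built from its rectangular representation. The class `Point`, the chart function names, and the particular `f_rect` are hypothetical stand-ins chosen for illustration; storing rectangular coordinates inside `Point` is an implementation convenience, not a claim about how the book's system represents points.

```python
import math

class Point:
    """Opaque stand-in for a point of the plane (internal storage is private)."""
    def __init__(self, x, y):
        self._xy = (x, y)

def rect_chart(p):
    return p._xy

def rect_point(x, y):
    return Point(x, y)

def polar_chart(p):
    x, y = p._xy
    return (math.hypot(x, y), math.atan2(y, x))

def polar_point(r, theta):
    return Point(r * math.cos(theta), r * math.sin(theta))

def f_rect(x, y):
    """A hypothetical coordinate representation in rectangular coordinates."""
    return x * x * y + 3.0 * y

def f(p):
    """Manifold function: compose the representation with the rectangular chart."""
    return f_rect(*rect_chart(p))

x0, y0 = 1.25, -0.5
p_rect = rect_point(x0, y0)
p_polar = polar_point(math.hypot(x0, y0), math.atan2(y0, x0))

difference = abs(f(p_rect) - f(p_polar))
print(difference)
```

The two points are specified in different coordinate systems, yet `f` sees only the point, so the values agree up to floating-point rounding.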

Naming Coordinate Functions
To make things a bit easier, we can give names to the individual coordinate functions associated with a coordinate system. Here we name the coordinate functions for the R2-rect coordinate system x and y and for the R2-polar coordinate system r and theta.
(define-coordinates (up x y) R2-rect)
(define-coordinates (up r theta) R2-polar)

8Alternatively, we can define the same function in a shorthand
(define f (literal-manifold-function 'f-rect R2-rect))


This allows us to extract the coordinates from a point, independent of the coordinate system used to specify the point.

(x (R2-rect-chi-inverse (up 'x0 'y0)))
x0

(x (R2-polar-chi-inverse (up 'r0 'theta0)))
(* r0 (cos theta0))

(r (R2-polar-chi-inverse (up 'r0 'theta0)))
r0

(r (R2-rect-chi-inverse (up 'x0 'y0)))
(sqrt (+ (expt x0 2) (expt y0 2)))

(theta (R2-rect-chi-inverse (up 'x0 'y0)))
(atan y0 x0)

We can work with the coordinate functions in a natural manner, deﬁning new manifold functions in terms of them:9

(define h (+ (* x (square r)) (cube y)))

(h R2-rect-point)
(+ (expt x0 3) (* x0 (expt y0 2)) (expt y0 3))

We can also apply h to a point deﬁned in terms of its polar coordinates:

(h (R2-polar-chi-inverse (up 'r0 'theta0)))
(+ (* (expt r0 3) (expt (sin theta0) 3)) (* (expt r0 3) (cos theta0)))

Exercise 2.1: Curves
A curve may be specified in different coordinate systems. For example, a cardioid constructed by rolling a circle of radius a around another circle of the same radius is described in polar coordinates by the equation
$$r = 2a(1 + \cos\theta).$$

9This is actually a nasty, but traditional, abuse of notation. An expression like cos(r) can either mean the cosine of the angle r (if r is a number), or the composition cos ◦ r (if r is a function). In our system (cos r) behaves in this way—either computing the cosine of r or being treated as (compose cos r) depending on what r is.


We can convert this to rectangular coordinates by evaluating the residual in rectangular coordinates.
(define-coordinates (up r theta) R2-polar)
((- r (* 2 'a (+ 1 (cos theta)))) ((point R2-rect) (up 'x 'y)))

(/ (+ (* -2 a x) (* -2 a (sqrt (+ (expt x 2) (expt y 2)))) (expt x 2) (expt y 2))
(sqrt (+ (expt x 2) (expt y 2))))
The numerator of this expression is the equivalent residual in rectangular coordinates. If we rearrange terms and square it we get the traditional formula for the cardioid
$$(x^2 + y^2 - 2ax)^2 = 4a^2 (x^2 + y^2).$$

a. The rectangular coordinate equation for the Lemniscate of Bernoulli is
$$(x^2 + y^2)^2 = 2a^2 (x^2 - y^2).$$
Find the expression in polar coordinates.
b. Describe a helix space curve in both rectangular and cylindrical coordinates. Use the computer to show the correspondence. Note that we provide a cylindrical coordinate system on the manifold R3 for you to use. It is called R3-cyl; with coordinates (r, theta, z).
Exercise 2.2: Stereographic Projection
A stereographic projection is a correspondence between points on the unit sphere and points on the plane cutting the sphere at its equator. (See ﬁgure 2.3.)
The coordinate system for points on the sphere in terms of rectangular coordinates of corresponding points on the plane is S2-Riemann.10 The procedure (chart S2-Riemann) gives the rectangular coordinates on the plane for every point on the sphere, except for the North Pole. The procedure (point S2-Riemann) gives the point on the sphere given rectangular coordinates on the plane. The usual spherical coordinate system on the sphere is S2-spherical.
We can compute the colatitude and longitude of a point on the sphere corresponding to a point on the plane with the following incantation:

10The plane with the addition of a point at infinity is conformally equivalent to the sphere by this correspondence. This correspondence is called the Riemann sphere, in honor of the great mathematician Bernhard Riemann (1826–1866), who made major contributions to geometry.

Figure 2.3 For each point on the sphere (except for its north pole) a line is drawn from the north pole through the point and extending to the equatorial plane. The corresponding point on the plane is where the line intersects the plane. The rectangular coordinates of this point on the plane are the Riemann coordinates of the point on the sphere. The points on the plane can also be specified with polar coordinates (ρ, θ) and the points on the sphere are specified both by Riemann coordinates and the traditional colatitude and longitude (φ, λ).

((compose (chart S2-spherical) (point S2-Riemann) (chart R2-rect) (point R2-polar))
 (up 'rho 'theta))
(up (acos (/ (+ -1 (expt rho 2)) (+ +1 (expt rho 2))))
    theta)
Perform an analogous computation to get the polar coordinates of the point on the plane corresponding to a point on the sphere given by its colatitude and longitude.

3
Vector Fields and One-Form Fields
We want a way to think about how a function varies on a manifold. Suppose we have some complex linkage, such as a multiple pendulum. The potential energy is an important function on the multi-dimensional conﬁguration manifold of the linkage. To understand the dynamics of the linkage we need to know how the potential energy changes as the conﬁguration changes. The change in potential energy for a step of a certain size in a particular direction in the conﬁguration space is a real physical quantity; it does not depend on how we measure the direction or the step size. What exactly this means is to be determined: What is a step size? What is a direction? We cannot subtract two conﬁgurations to determine the distance between them. It is our job here to make sense of this idea.
So we would like something like a derivative, but there are problems. Since we cannot subtract two manifold points, we cannot take the derivative of a manifold function in the way described in elementary calculus. But we can take the derivative of a coordinate representation of a manifold function, because it takes real-number coordinates as its arguments. This is a start, but it is not independent of coordinate system. Let’s see what we can build out of this.

3.1 Vector Fields

In multiple dimensions the derivative of a function is the multiplier
for the best linear approximation of the function at each argument point:1

f (x + Δx) ≈ f (x) + (Df (x))Δx

(3.1)

1In multiple dimensions the derivative Df(x) is a down tuple structure of the partial derivatives and the increment Δx is an up tuple structure, so the indicated product is to be interpreted as a contraction. (See equation B.8.)

The derivative Df(x) is independent of Δx. Although the derivative depends on the coordinates, the product (Df(x))Δx is invariant under change of coordinates in the following sense. Let $\phi = \chi \circ \chi'^{-1}$ be a coordinate transformation, and x = φ(y). Then Δx = Dφ(y)Δy is the linear approximation to the change in x when y changes by Δy. If f and g are the representations of a manifold function in the two coordinate systems, g(y) = f(φ(y)) = f(x), then the linear approximations to the increments in f and g are equal:

Dg(y)Δy = Df (φ(y)) (Dφ(y)Δy) = Df (x)Δx.
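A quick numerical check of this equality, as a sketch in plain Python: φ is taken to be the polar-to-rectangular transformation, f is an arbitrary smooth function of rectangular coordinates, and both sides are estimated with central differences. The function bodies, the sample point `y0`, and the increment `dy` are assumptions chosen only for illustration.

```python
import math

def phi(y):
    """Coordinate transformation x = phi(y): polar (r, theta) to rectangular."""
    r, theta = y
    return (r * math.cos(theta), r * math.sin(theta))

def f(x):
    """Hypothetical representation of a manifold function in x coordinates."""
    return x[0] ** 2 * x[1] + math.sin(x[1])

def g(y):
    """Representation of the same manifold function in y coordinates."""
    return f(phi(y))

def directional(fn, at, delta, h=1e-6):
    """Central-difference estimate of the contraction (D fn(at)) delta."""
    plus = [a + h * d for a, d in zip(at, delta)]
    minus = [a - h * d for a, d in zip(at, delta)]
    return (fn(plus) - fn(minus)) / (2 * h)

y0 = (1.5, 0.4)
dy = (0.3, -0.2)

# Delta x = D phi(y0) Delta y, using the known Jacobian of phi.
r, theta = y0
dx = (math.cos(theta) * dy[0] - r * math.sin(theta) * dy[1],
      math.sin(theta) * dy[0] + r * math.cos(theta) * dy[1])

lhs = directional(g, y0, dy)        # Dg(y) Delta y
rhs = directional(f, phi(y0), dx)   # Df(x) Delta x
print(lhs, rhs)
```

The components Δy and Δx differ, and so do Dg and Df, but the contracted products agree to within the finite-difference error.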

The invariant product (Df (x))Δx is the directional derivative of f at x with respect to the vector speciﬁed by the tuple of components Δx in the coordinate system. We can generalize this idea to allow the vector at each point to depend on the point, making a vector ﬁeld. Let b be a function of coordinates. We then have a directional derivative of f at each point x, determined by b

$$D_b(f)(x) = (Df(x))\, b(x). \tag{3.2}$$

Now we bring this back to the manifold and develop a useful generalization of the idea of directional derivative for functions on a manifold, rather than functions on Rn. A vector ﬁeld on a manifold is an assignment of a vector to each point on the manifold. In elementary geometry, a vector is an arrow anchored at a point on the manifold with a magnitude and a direction. In diﬀerential geometry, a vector is an operator that takes directional derivatives of manifold functions at its anchor point. The direction and magnitude of the vector are the direction and scale factor of the directional derivative.
Let m be a point on a manifold, v be a vector ﬁeld on the manifold, and f be a real-valued function on the manifold. Then v(f) is the directional derivative of the function f and v(f)(m) is the directional derivative of the function f at the point m. The vector ﬁeld is an operator that takes a real-valued manifold function and a manifold point and produces a number. The order of arguments is chosen to make v(f) be a new manifold function that can be manipulated further. Directional derivative operators, unlike ordinary derivative operators, produce a result of the same type as their argument. Note that there is no mention here of any coordinate system. The vector ﬁeld speciﬁes a direction and magnitude at each manifold point that is independent of how it is described using any coordinate system.


A useful way to characterize a vector field in a particular coordinate system is by applying it to the coordinate functions. The resulting functions $b^i_{\chi,v}$ are called the coordinate component functions or coefficient functions of the vector field; they measure how quickly the coordinate functions change in the direction of the vector field, scaled by the magnitude of the vector field:

$$b^i_{\chi,v} = v(\chi^i) \circ \chi^{-1}. \tag{3.3}$$

Note that we have chosen the coordinate components to be functions of the coordinate tuple, not of a manifold point.
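A vector with coordinate components $b_{\chi,v}$ applies to a manifold function f via

$$\begin{aligned}
v(f)(m) &= ((D(f \circ \chi^{-1})\, b_{\chi,v}) \circ \chi)(m) && (3.4) \\
&= D(f \circ \chi^{-1})(\chi(m))\, b_{\chi,v}(\chi(m)) && (3.5) \\
&= \sum_i \partial_i(f \circ \chi^{-1})(\chi(m))\, b^i_{\chi,v}(\chi(m)). && (3.6)
\end{aligned}$$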

In equation (3.4), the quantity f ◦χ−1 is the coordinate representation of the manifold function f. We take its derivative, and weight the components of the derivative with the coordinate components bχ,v of the vector ﬁeld that specify its direction and magnitude. Since this product is a function of coordinates we use χ to extract the coordinates from the manifold point m. In equation (3.5), the composition of the product with the coordinate chart χ is replaced by function evaluation. In equation (3.6) the tuple multiplication is expressed explicitly as a sum of products of corresponding components. So the application of the vector is a linear combination of the partial derivatives of f in the coordinate directions weighted by the vector components. This computes the rate of change of f in the direction speciﬁed by the vector.
Equations (3.3) and (3.5) are consistent:

$$v(\chi)(\chi^{-1}(x)) = D(\chi \circ \chi^{-1})(x)\, b_{\chi,v}(x) = D(I)(x)\, b_{\chi,v}(x) = b_{\chi,v}(x). \tag{3.7}$$

The coefficient tuple $b_{\chi,v}(x)$ is an up structure compatible for addition to the coordinates. Note that for any vector field v the coefficients $b_{\chi,v}(x)$ are different for different coordinate functions χ.


In the text that follows we will usually drop the subscripts on b, understanding that it is dependent on the coordinate system and the vector ﬁeld.
We implement the deﬁnition of a vector ﬁeld (3.4) as:

(define (components->vector-field components coordsys)
  (define (v f)
    (compose (* (D (compose f (point coordsys)))
                components)
             (chart coordsys)))
  (procedure->vector-field v))

The vector field is an operator, like derivative.2

Given a coordinate system and coefficient functions that map coordinates to real values, we can make a vector field. For example, a general vector field can be defined by giving components relative to the coordinate system R2-rect by

(define v
  (components->vector-field
   (up (literal-function 'b^0 R2->R)
       (literal-function 'b^1 R2->R))
   R2-rect))

To make it convenient to define literal vector fields we provide a shorthand:

(define v (literal-vector-field 'b R2-rect))

This makes a vector field with component functions named b^0 and b^1 and names the result v. When this vector field is applied to an arbitrary manifold function it gives the directional derivative of that manifold function in the direction specified by the components b^0 and b^1:

((v (literal-manifold-function 'f-rect R2-rect)) R2-rect-point)
(+ (* (((partial 0) f-rect) (up x0 y0)) (b^0 (up x0 y0)))
   (* (((partial 1) f-rect) (up x0 y0)) (b^1 (up x0 y0))))
This result is what we expect from equation (3.6).
We can recover the coordinate components of the vector field by applying the vector field to the coordinate chart:

2An operator is just like a procedure except that multiplication is interpreted as composition. For example, the derivative procedure is made into an operator D so that we can say (expt D 2) and expect it to compute the second derivative. The procedure procedure->vector-field makes a vector-ﬁeld operator.


((v (chart R2-rect)) R2-rect-point)
(up (b^0 (up x0 y0)) (b^1 (up x0 y0)))
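The construction of a vector field from components, and the recovery of a component by applying the field to a coordinate function, can be imitated numerically. The sketch below is plain Python, not the book's implementation: derivatives are central differences, a point of the plane is modeled by its tuple of rectangular coordinates, and the names `components_to_vector_field`, `rotation_components`, `x_coordinate`, and `radius_squared` are hypothetical.

```python
def components_to_vector_field(components, chart, point, h=1e-6):
    """Return v with v(f)(m) = sum_i d_i(f o point)(chart(m)) * b^i(chart(m))."""
    def v(f):
        def v_of_f(m):
            x = list(chart(m))
            b = components(x)
            total = 0.0
            for i in range(len(x)):
                hi = list(x)
                lo = list(x)
                hi[i] += h
                lo[i] -= h
                partial_i = (f(point(hi)) - f(point(lo))) / (2 * h)
                total += partial_i * b[i]
            return total
        return v_of_f
    return v

# Assumption: a point of the plane is modeled by its rectangular coordinates.
def rect_chart(m):
    return m

def rect_point(x):
    return tuple(x)

def rotation_components(x):
    """Hypothetical component functions: b^0 = -y, b^1 = x."""
    return (-x[1], x[0])

v = components_to_vector_field(rotation_components, rect_chart, rect_point)

def x_coordinate(m):
    return rect_chart(m)[0]

def radius_squared(m):
    x, y = rect_chart(m)
    return x * x + y * y

m0 = rect_point((1.0, 2.0))
recovered_b0 = v(x_coordinate)(m0)   # applying v to a coordinate function
rate_of_r2 = v(radius_squared)(m0)   # rate of change of x^2 + y^2 along v
print(recovered_b0, rate_of_r2)
```

Applying the field to the x coordinate function returns the first component evaluated at the point, and the rotation field leaves the squared radius unchanged, both up to finite-difference error.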

Coordinate Representation
The vector field $\mathsf{v}$ has a coordinate representation v (sans-serif marks the objects defined on the manifold, following the convention of Chapter 2):

$$\mathsf{v}(\mathsf{f})(m) = D(\mathsf{f} \circ \chi^{-1})(\chi(m))\, b(\chi(m)) = Df(x)\, b(x) = v(f)(x), \tag{3.8}$$

with the definitions $f = \mathsf{f} \circ \chi^{-1}$ and $x = \chi(m)$. The function b is the coefficient function for the vector field $\mathsf{v}$. It provides a scale factor for the component in each coordinate direction. However, v is the coordinate representation of the vector field $\mathsf{v}$ in that it takes directional derivatives of coordinate representations of manifold functions.
Given a vector ﬁeld v and a coordinate system coordsys we can construct the coordinate representation of the vector ﬁeld.3

(define (coordinatize v coordsys)
  (define ((coordinatized-v f) x)
    (let ((b (compose (v (chart coordsys))
                      (point coordsys))))
      (* ((D f) x) (b x))))
  (make-operator coordinatized-v))

We can apply a coordinatized vector ﬁeld to a function of coordinates to get the same answer as before.

(((coordinatize v R2-rect) (literal-function 'f-rect R2->R))
 (up 'x0 'y0))
(+ (* (((partial 0) f-rect) (up x0 y0)) (b^0 (up x0 y0)))
   (* (((partial 1) f-rect) (up x0 y0)) (b^1 (up x0 y0))))

Vector Field Properties
The vector ﬁelds on a manifold form a vector space over the ﬁeld of real numbers and a module over the ring of real-valued manifold functions. A module is like a vector space except that there is no multiplicative inverse operation on the scalars of a module. Manifold functions that are not the zero function do not necessarily

3The make-operator procedure takes a procedure and returns an operator.


have multiplicative inverses, because they can have isolated zeros. So the manifold functions form a ring, not a ﬁeld, and vector ﬁelds must be a module over the ring of manifold functions rather than a vector space.
Vector ﬁelds have the following properties. Let u and v be vector ﬁelds and let α be a real-valued manifold function. Then

(u + v)(f) = u(f) + v(f)

(3.9)

(αu)(f) = α(u(f)).

(3.10)

Vector ﬁelds are linear operators. Assume f and g are functions on the manifold, a and b are real constants.4 The constants a and
b are not manifold functions, because vector ﬁelds take derivatives.
See equation (3.13).

v(af + bg)(m) = av(f)(m) + bv(g)(m)

(3.11)

v(af)(m) = av(f)(m)

(3.12)

Vector ﬁelds satisfy the product rule (Leibniz rule).

v(fg)(m) = v(f)(m) g(m) + f(m) v(g)(m)

(3.13)

Vector ﬁelds satisfy the chain rule. Let F be a function on the range of f.

v(F ◦ f)(m) = DF (f(m)) v(f)(m)

(3.14)
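The product rule (3.13) and the chain rule (3.14) can be spot-checked numerically. The following plain-Python sketch models a vector field on the plane by its rectangular components and uses central differences; the component functions, the test functions f and g, the choice F = exp, and the sample point are all assumptions made for illustration.

```python
import math

def vector_field(b, h=1e-6):
    """Vector field on the plane with rectangular components b(x, y)."""
    def v(f):
        def v_of_f(x, y):
            bx, by = b(x, y)
            dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
            dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
            return dfdx * bx + dfdy * by
        return v_of_f
    return v

def components(x, y):
    """Hypothetical component functions of the field."""
    return (y, -x * x)

def f(x, y):
    return x * y + math.sin(x)

def g(x, y):
    return x * x - y

def fg(x, y):
    return f(x, y) * g(x, y)

def F_of_f(x, y):
    """F composed with f, with F = exp, so DF = exp as well."""
    return math.exp(f(x, y))

v = vector_field(components)
x0, y0 = 0.8, -0.3

product_gap = abs(v(fg)(x0, y0)
                  - (v(f)(x0, y0) * g(x0, y0) + f(x0, y0) * v(g)(x0, y0)))
chain_gap = abs(v(F_of_f)(x0, y0)
                - math.exp(f(x0, y0)) * v(f)(x0, y0))
print(product_gap, chain_gap)
```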

3.2 Coordinate-Basis Vector Fields
For an n-dimensional manifold any set of n linearly independent vector ﬁelds5 form a basis in that any vector ﬁeld can be expressed as a linear combination of the basis ﬁelds with manifold-function
4If f has structured output then v(f) is the structure resulting from v being applied to each component of f.
5A set of vector fields, $\{v_i\}$, is linearly independent with respect to manifold functions if we cannot find nonzero manifold functions, $\{a_i\}$, such that
$$\sum_i a_i v_i(f) = 0(f),$$
where 0 is the vector field such that 0(f)(m) = 0 for all f and m.


coefficients. Given a coordinate system we can construct a basis as follows: we choose the component tuple $b_i(x)$ (see equation 3.5) to be the ith unit tuple $u_i(x)$, an up tuple with one in the ith position and zeros in all other positions, selecting the partial derivative in that direction. Here $u_i$ is a constant function. Like b, it formally takes coordinates of a point as an argument, but it ignores them. We then define the basis vector field $X_i$ by

$$X_i(f)(m) = D(f \circ \chi^{-1})(\chi(m))\, u_i(\chi(m)) = \partial_i(f \circ \chi^{-1})(\chi(m)). \tag{3.15}$$

In terms of $X_i$ the vector field of equation (3.6) is

$$v(f)(m) = \sum_i X_i(f)(m)\, b^i(\chi(m)). \tag{3.16}$$

We can also write

v(f)(m) = X(f)(m) b(χ(m)),

(3.17)

letting the tuple algebra do its job. The basis vector ﬁeld is often written

$$\frac{\partial}{\partial x^i} = X_i, \tag{3.18}$$

to call to mind that it is an operator that computes the directional derivative in the ith coordinate direction.
In addition to making the coordinate functions, the procedure define-coordinates also makes the traditional named basis vectors. Using these we can examine the application of a rectangular basis vector to a polar coordinate function:

(define-coordinates (up x y) R2-rect)
(define-coordinates (up r theta) R2-polar)

((d/dx (square r)) R2-rect-point)
(* 2 x0)
More general functions and vectors can be made as combinations of these simple pieces:

(((+ d/dx (* 2 d/dy)) (+ (square r) (* 3 x))) R2-rect-point)
(+ 3 (* 2 x0) (* 4 y0))
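The two results above can be checked by hand. Writing the polar coordinate function in rectangular coordinates, r² = x² + y², the following worked derivation (added here as a cross-check, not part of the original text) reproduces both outputs:

```latex
\begin{aligned}
\frac{\partial}{\partial x}\left(r^2\right)
  &= \frac{\partial}{\partial x}\left(x^2 + y^2\right) = 2x, \\
\left(\frac{\partial}{\partial x} + 2\frac{\partial}{\partial y}\right)\left(r^2 + 3x\right)
  &= (2x + 3) + 2\,(2y) = 3 + 2x + 4y.
\end{aligned}
```

Evaluated at the point with rectangular coordinates (x0, y0), these agree with (* 2 x0) and (+ 3 (* 2 x0) (* 4 y0)).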


Coordinate Transformations

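Consider a coordinate change from the chart $\chi$ to the chart $\chi'$.

\[
\begin{aligned}
X(f)(m) &= D(f \circ \chi^{-1})(\chi(m)) \\
        &= D(f \circ (\chi')^{-1} \circ \chi' \circ \chi^{-1})(\chi(m)) \\
        &= D(f \circ (\chi')^{-1})(\chi'(m))\, (D(\chi' \circ \chi^{-1}))(\chi(m)) \\
        &= X'(f)(m)\, (D(\chi' \circ \chi^{-1}))(\chi(m)).
\end{aligned} \tag{3.19}
\]

This is the rule for the transformation of basis vector fields. The second factor can be recognized as "$\partial x'/\partial x$," the Jacobian.6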
The vector field does not depend on coordinates. So, from equation (3.17), we have

\[
v(f)(m) = X(f)(m)\, b(\chi(m)) = X'(f)(m)\, b'(\chi'(m)). \tag{3.20}
\]

Using equation (3.19) with $x = \chi(m)$ and $x' = \chi'(m)$, we deduce

\[
D(\chi' \circ \chi^{-1})(x)\, b(x) = b'(x'). \tag{3.21}
\]

Because $\chi' \circ \chi^{-1}$ is the inverse function of $\chi \circ (\chi')^{-1}$, their derivatives are multiplicative inverses,

\[
D(\chi' \circ \chi^{-1})(x) = \left(D(\chi \circ (\chi')^{-1})(x')\right)^{-1}, \tag{3.22}
\]

and so

\[
b(x) = D(\chi \circ (\chi')^{-1})(x')\, b'(x'), \tag{3.23}
\]

as expected.7

6 This notation helps one remember the transformation rule:
\[
\frac{\partial f}{\partial x^i} = \sum_j \frac{\partial f}{\partial x'^j}\, \frac{\partial x'^j}{\partial x^i},
\]
which is the relation in the usual Leibniz notation. As Spivak pointed out in Calculus on Manifolds, p.45, f means something different on each side of the equation.

7 For coordinate paths $q$ and $q'$ related by $q(t) = (\chi \circ (\chi')^{-1})(q'(t))$ the velocities are related by $Dq(t) = D(\chi \circ (\chi')^{-1})(q'(t))\, Dq'(t)$. Abstracting off paths, we get $v = D(\chi \circ (\chi')^{-1})(x')\, v'$.

It is traditional to express this rule by saying that the basis elements transform covariantly and the coefficients of a vector in terms of a basis transform contravariantly; their product is invariant under the transformation.
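As a worked instance of rule (3.23), take χ to be rectangular coordinates (x, y) and χ′ to be polar coordinates (r, θ) on the plane. This particular choice, and the component tuple b′ = (0, 1), are illustrations rather than part of the derivation above. The map χ ∘ (χ′)⁻¹ sends (r, θ) to (r cos θ, r sin θ), so:

```latex
D(\chi \circ (\chi')^{-1})(r, \theta) =
\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix},
\qquad
b(x) =
\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}
\begin{pmatrix} 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} -r\sin\theta \\ r\cos\theta \end{pmatrix}
= \begin{pmatrix} -y \\ x \end{pmatrix}.
```

A vector field with polar components (0, 1) therefore has rectangular components (−y, x): it generates rotation about the origin.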

3.3 Integral Curves

A vector ﬁeld gives a direction and rate for every point on a manifold. We can start at any point and go in the direction speciﬁed by the vector ﬁeld, tracing out a parametric curve on the manifold. This curve is an integral curve of the vector ﬁeld.
More formally, let v be a vector field on the manifold M. An integral curve $\gamma^v_m : \mathbf{R} \to M$ of v is a parametric path on M satisfying

\[
D(f \circ \gamma^v_m)(t) = v(f)(\gamma^v_m(t)) = (v(f) \circ \gamma^v_m)(t) \tag{3.24}
\]
\[
\gamma^v_m(0) = m, \tag{3.25}
\]

for arbitrary functions f on the manifold, with real values or structured real values. The rate of change of a function along an integral curve is the vector field applied to the function evaluated at the appropriate place along the curve. Often we will simply write $\gamma$, rather than $\gamma^v_m$. Another useful variation is $\phi^v_t(m) = \gamma^v_m(t)$.
We can recover the diﬀerential equations satisﬁed by a coordinate representation of the integral curve by letting f = χ, the coordinate function, and letting σ = χ ◦ γ be the coordinate path corresponding to the curve γ. Then the derivative of the coordinate path σ is

Dσ(t) = D(χ ◦ γ)(t) = (v(χ) ◦ γ)(t) = (v(χ) ◦ χ−1 ◦ χ ◦ γ)(t) = (b ◦ σ)(t),

(3.26)

where b = v(χ) ◦ χ−1 is the coeﬃcient function for the vector ﬁeld v for coordinates χ (see equation 3.7). So the coordinate path σ satisﬁes the diﬀerential equations

Dσ = b ◦ σ.

(3.27)

Diﬀerential equations for the integral curve can be expressed only in a coordinate representation, because we cannot go from one point on the manifold to another by addition of an increment.


However, we can do this by adding the coordinates to an increment of coordinates and then ﬁnding the corresponding point on the manifold.
Iterating the process described by equation (3.24) we can compute higher-order derivatives of functions along the integral curve:

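\[
\begin{aligned}
D(f \circ \gamma) &= v(f) \circ \gamma \\
D^2(f \circ \gamma) &= D(v(f) \circ \gamma) = v(v(f)) \circ \gamma \\
&\;\;\vdots \\
D^n(f \circ \gamma) &= v^n(f) \circ \gamma
\end{aligned} \tag{3.28}
\]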

Thus, the evolution of f ◦ γ can be written formally as a Taylor series in the parameter:

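\[
\begin{aligned}
(f \circ \gamma)(t) &= (f \circ \gamma)(0) + t\, D(f \circ \gamma)(0) + \tfrac{1}{2} t^2 D^2(f \circ \gamma)(0) + \cdots \\
&= (e^{tD}(f \circ \gamma))(0) \\
&= (e^{tv} f)(\gamma(0)).
\end{aligned} \tag{3.29}
\]

Using $\phi$ rather than $\gamma$

\[
(f \circ \gamma^v_m)(t) = (f \circ \phi^v_t)(m), \tag{3.30}
\]

so, when the series converges,

\[
(e^{tv} f)(m) = (f \circ \phi^v_t)(m). \tag{3.31}
\]

In particular, let $f = \chi$, then

\[
\sigma(t) = (\chi \circ \gamma)(t) = (e^{tD}(\chi \circ \gamma))(0) = (e^{tv} \chi)(\gamma(0)), \tag{3.32}
\]

a Taylor series representation of the solution to the differential equation (3.27).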
For example, a vector field circular that generates a rotation about the origin is:8

(define circular (- (* x d/dy) (* y d/dx)))

8 In this expression d/dx and d/dy are vector fields that take directional derivatives of manifold functions and evaluate them at manifold points; x and y are manifold functions. define-coordinates was used to create these operators and functions, see page 27. Note that circular is an operator, a property inherited from d/dx and d/dy.

We can exponentiate the circular vector ﬁeld, to generate an evolution in a circle around the origin starting at (1, 0):

(series:for-each print-expression
  (((exp (* 't circular)) (chart R2-rect))
   ((point R2-rect) (up 1 0)))
  6)
(up 1 0)
(up 0 t)
(up (* -1/2 (expt t 2)) 0)
(up 0 (* -1/6 (expt t 3)))
(up (* 1/24 (expt t 4)) 0)
(up 0 (* 1/120 (expt t 5)))

These are the ﬁrst six terms of the series expansion of the coordinates of the position for parameter t.
We can define an evolution operator $E_{\Delta t, v}$ using equation (3.31)

\[
(E_{\Delta t, v} f)(m) = (e^{\Delta t\, v} f)(m) = (f \circ \phi^v_{\Delta t})(m). \tag{3.33}
\]

We can approximate the evolution operator by summing the series up to a given order:

(define ((((evolution order) delta-t v) f) m)
  (series:sum (((exp (* delta-t v)) f) m) order))

We can evolve circular from the initial point up to the parameter t, and accumulate the ﬁrst six terms as follows:

((((evolution 6) 'delta-t circular) (chart R2-rect))
 ((point R2-rect) (up 1 0)))
(up (+ (* -1/720 (expt delta-t 6))
       (* 1/24 (expt delta-t 4))
       (* -1/2 (expt delta-t 2))
       1)
    (+ (* 1/120 (expt delta-t 5))
       (* -1/6 (expt delta-t 3))
       delta-t))

Note that these are just the series for cos Δt and sin Δt, so the coordinate tuple of the evolved point is (cos Δt, sin Δt).
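A numerical spot check makes this concrete. The sketch below assumes the scmutils environment and repeats the definitions of circular and evolution so that it stands alone; the step 0.1 and the names evolved, error-in-x, and error-in-y are arbitrary choices for illustration. It compares the truncated series with cos and sin directly:

```scheme
;; Assumes scmutils; circular and evolution are repeated from the text.
(define-coordinates (up x y) R2-rect)
(define circular (- (* x d/dy) (* y d/dx)))
(define ((((evolution order) delta-t v) f) m)
  (series:sum (((exp (* delta-t v)) f) m) order))

;; Evolve the point (1, 0) by the numerical parameter 0.1.
(define evolved
  ((((evolution 6) 0.1 circular) (chart R2-rect))
   ((point R2-rect) (up 1 0))))

;; Differences from the exact values. The first neglected terms are of
;; order 0.1^8 in the cosine series and 0.1^7 in the sine series.
(define error-in-x (- (ref evolved 0) (cos 0.1)))
(define error-in-y (- (ref evolved 1) (sin 0.1)))
```

Both errors should be far smaller than the step itself; raising the order passed to evolution shrinks them further as long as the series converges.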


For functions whose series expansions have ﬁnite radius of convergence, evolution can progress beyond the point at which the Taylor series converges because evolution is well deﬁned whenever the integral curve is deﬁned.

Exercise 3.1: State Derivatives
Newton's equations for the motion of a particle in a plane, subject to a force that depends only on the position in the plane, are a system of second-order differential equations for the rectangular coordinates (X, Y) of the particle:
\[
D^2 X(t) = A_x(X(t), Y(t)) \quad\text{and}\quad D^2 Y(t) = A_y(X(t), Y(t)),
\]
where A is the acceleration of the particle.
These are equivalent to a system of first-order equations for the coordinate path $\sigma = \chi \circ \gamma$, where $\chi = (t, x, y, v_x, v_y)$ is a coordinate system on the manifold $\mathbf{R}^5$. Then our equations are:
\[
\begin{aligned}
D(t \circ \gamma) &= 1 \\
D(x \circ \gamma) &= v_x \circ \gamma \\
D(y \circ \gamma) &= v_y \circ \gamma \\
D(v_x \circ \gamma) &= A_x(x \circ \gamma, y \circ \gamma) \\
D(v_y \circ \gamma) &= A_y(x \circ \gamma, y \circ \gamma)
\end{aligned}
\]
Construct a vector field on $\mathbf{R}^5$ corresponding to this system of differential equations. Derive the first few terms in the series solution of this problem by exponentiation.

3.4 One-Form Fields
A vector field that gives a velocity for each point on a topographic map of the surface of the Earth can be applied to a function, such as one that gives the height for each point on the topographic map, or a map that gives the temperature for each point. The vector field then provides the rate of change of the height or temperature as one moves in the way described by the vector field. Alternatively, we can think of a topographic map, which gives the height at each point, as measuring a velocity field at each point. For example, we may be interested in the velocity of the wind or the trajectories of migrating birds. The topographic map gives the rate of change of height at each point for each velocity vector field. The rate of change of height can be thought of as the number of equally-spaced (in height) contours that are pierced by each velocity vector in the vector field.

Diﬀerential of a Function
For example, consider the differential9 df of a manifold function f, defined as follows. If df is applied to a vector field v we obtain

df(v) = v(f),

(3.34)

which is a function of a manifold point. The diﬀerential of the height function on the topographic map is
a function that gives the rate of change of height at each point for a velocity vector ﬁeld. This gives the same answer as the velocity vector ﬁeld applied to the height function.
The diﬀerential of a function is linear in the vector ﬁelds. The diﬀerential is also a linear operator on functions: if f1 and f2 are manifold functions, and if c is a real constant, then

d(f1 + f2) = df1 + df2 and

d(cf) = cdf.

Note that c is not a manifold function.
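The definition (3.34) and the additivity of d can be spot-checked symbolically. This is a minimal sketch assuming the scmutils environment; R2-rect-point is redefined here so the block stands alone, and the names f1, f2, v, check-definition, and check-additivity are illustrative, not from the text. Each difference is expected to simplify to zero:

```scheme
;; Assumes scmutils. f1, f2, and v are hypothetical literal objects.
(define-coordinates (up x y) R2-rect)
(define R2-rect-point ((point R2-rect) (up 'x0 'y0)))

(define f1 (literal-manifold-function 'f1-rect R2-rect))
(define f2 (literal-manifold-function 'f2-rect R2-rect))
(define v (literal-vector-field 'b R2-rect))

;; Equation (3.34): df(v) and v(f) are the same manifold function.
(define check-definition
  ((- ((d f1) v) (v f1)) R2-rect-point))

;; Additivity: d(f1 + f2) agrees with df1 + df2 when applied to v.
(define check-additivity
  ((- ((d (+ f1 f2)) v)
      (+ ((d f1) v) ((d f2) v)))
   R2-rect-point))
```

Because v is a literal vector field with arbitrary coefficient functions, a zero result here covers every vector field expressible in this chart, not just one example.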

One-Form Fields
A one-form ﬁeld is a generalization of this idea; it is something that measures a vector ﬁeld at each point.
One-form ﬁelds are linear functions of vector ﬁelds that produce real-valued functions on the manifold. A one-form ﬁeld is linear in vector ﬁelds: if ω is a one-form ﬁeld, v and w are vector ﬁelds, and c is a manifold function, then

ω(v + w) = ω(v) + ω(w)

(3.35)

and

ω(cv) = cω(v).

(3.36)

9The diﬀerential of a manifold function will turn out to be a special case of the exterior derivative, which will be introduced later.


Sums and scalar products of one-form ﬁelds on a manifold have the following properties. If ω and θ are one-form ﬁelds, and if f is a real-valued manifold function, then:

\[
(\omega + \theta)(v) = \omega(v) + \theta(v), \tag{3.37}
\]
\[
(f \omega)(v) = f\, \omega(v). \tag{3.38}
\]

3.5 Coordinate-Basis One-Form Fields

Given a coordinate function $\chi$, we define the coordinate-basis one-form fields $\tilde{X}^i$ by

\[
\tilde{X}^i(v)(m) = v(\chi^i)(m) \tag{3.39}
\]

or collectively

\[
\tilde{X}(v)(m) = v(\chi)(m). \tag{3.40}
\]

With this definition the coordinate-basis one-form fields are dual to the coordinate-basis vector fields in the following sense (see equation 3.15):10

\[
\tilde{X}^i(X_j)(m) = X_j(\chi^i)(m) = \partial_j(\chi^i \circ \chi^{-1})(\chi(m)) = \delta^i_j. \tag{3.41}
\]

The tuple of basis one-form fields $\tilde{X}(v)(m)$ is an up structure like that of $\chi$.
The general one-form field $\omega$ is a linear combination of coordinate-basis one-form fields:

\[
\omega(v)(m) = a(\chi(m))\, \tilde{X}(v)(m) = \sum_i a_i(\chi(m))\, \tilde{X}^i(v)(m), \tag{3.42}
\]

with coefficient-function tuple $a(x)$, for $x = \chi(m)$. We can write this more simply as

\[
\omega(v) = (a \circ \chi)\, \tilde{X}(v), \tag{3.43}
\]

because everything is evaluated at m.

10 The Kronecker delta $\delta^i_j$ is one if i = j and zero otherwise.

The coeﬃcient tuple can be recovered from the one-form ﬁeld:11

ai(x) = ω(Xi)(χ−1(x)).

(3.44)

This follows from the dual relationship (3.41). We can see this as a program:12

(define omega
  (components->1form-field
   (down (literal-function 'a_0 R2->R)
         (literal-function 'a_1 R2->R))
   R2-rect))

((omega (down d/dx d/dy)) R2-rect-point)
(down (a_0 (up x0 y0)) (a_1 (up x0 y0)))

We provide a shortcut for this construction:

(define omega (literal-1form-field 'a R2-rect))

A diﬀerential can be expanded in a coordinate basis:

\[
df(v) = \sum_i c_i\, \tilde{X}^i(v). \tag{3.45}
\]

The coeﬃcients ci = df(Xi) = Xi(f) = ∂i(f ◦χ−1)◦χ are the partial derivatives of the coordinate representation of f in the coordinate system of the basis:

(((d (literal-manifold-function 'f-rect R2-rect))
  (coordinate-system->vector-basis R2-rect))
 R2-rect-point)
(down (((partial 0) f-rect) (up x0 y0))
      (((partial 1) f-rect) (up x0 y0)))

However, if the coordinate system of the basis diﬀers from the coordinates of the representation of the function, the result is complicated by the chain rule:

(((d (literal-manifold-function 'f-polar R2-polar))
  (coordinate-system->vector-basis R2-rect))
 ((point R2-polar) (up 'r 'theta)))
(down (- (* (((partial 0) f-polar) (up r theta)) (cos theta))
         (/ (* (((partial 1) f-polar) (up r theta)) (sin theta))
            r))
      (+ (* (((partial 0) f-polar) (up r theta)) (sin theta))
         (/ (* (((partial 1) f-polar) (up r theta)) (cos theta))
            r)))

11 The analogous recovery of coefficient tuples from vector fields is equation (3.3): $b^i_{\chi,v} = v(\chi^i) \circ \chi^{-1}$.

12 The procedure components->1form-field is analogous to the procedure components->vector-field introduced earlier.

The coordinate-basis one-form fields can be used to find the coefficients of vector fields in the corresponding coordinate vector-field basis:

\[
\tilde{X}^i(v) = v(\chi^i) = b^i \circ \chi \tag{3.46}
\]

or collectively,

\[
\tilde{X}(v) = v(\chi) = b \circ \chi. \tag{3.47}
\]

A coordinate-basis one-form field is often written $dx^i$. This traditional notation for the coordinate-basis one-form fields is justified by the relation:

\[
dx^i = \tilde{X}^i = d(\chi^i). \tag{3.48}
\]

The define-coordinates procedure also makes the basis one-form fields with these traditional names inherited from the coordinates.
We can illustrate the duality of the coordinate-basis vector fields and the coordinate-basis one-form fields:

(define-coordinates (up x y) R2-rect)

((dx d/dy) R2-rect-point)
0

((dx d/dx) R2-rect-point)
1

We can use the coordinate-basis one-form ﬁelds to extract the coeﬃcients of circular on the rectangular vector basis:


((dx circular) R2-rect-point)
(* -1 y0)
((dy circular) R2-rect-point)
x0
But we can also ﬁnd the coeﬃcients on the polar vector basis:
((dr circular) R2-rect-point)
0
((dtheta circular) R2-rect-point)
1
So circular is the same as d/dtheta, as we can see by applying them both to the general function f:
(define f (literal-manifold-function 'f-rect R2-rect))
(((- circular d/dtheta) f) R2-rect-point)
0

Not All One-Form Fields Are Diﬀerentials
Although all one-form ﬁelds can be constructed as linear combinations of basis one-form ﬁelds, not all one-form ﬁelds are diﬀerentials of functions.
The coeﬃcients of a diﬀerential are (see equation 3.45):

\[
c_i = X_i(f) = df(X_i) \tag{3.49}
\]

and partial derivatives of functions commute

\[
X_i(X_j(f)) = X_j(X_i(f)). \tag{3.50}
\]

As a consequence, the coefficients of a differential are constrained

\[
X_i(c_j) = X_j(c_i), \tag{3.51}
\]

but a one-form field can be constructed with arbitrary coefficient functions. For example:

\[
x\, dx + x\, dy \tag{3.52}
\]

is not a diﬀerential of any function. This is why we started with the basis one-form ﬁelds and built the general one-form ﬁelds in terms of them.
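To see why (3.52) fails the test, compare its coefficients with constraint (3.51). This short check uses only the definitions above, with $c_x$ and $c_y$ (names introduced here) denoting the coefficients of dx and dy:

```latex
c_x = x, \quad c_y = x
\qquad\Longrightarrow\qquad
X_x(c_y) = \frac{\partial x}{\partial x} = 1
\;\neq\;
0 = \frac{\partial x}{\partial y} = X_y(c_x).
```

By contrast, x dx + y dy passes the test (both sides vanish) and is the differential of (x² + y²)/2.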

Coordinate Transformations

Consider a coordinate change from the chart $\chi$ to the chart $\chi'$.

\[
\begin{aligned}
\tilde{X}(v) &= v(\chi) \\
&= v(\chi \circ (\chi')^{-1} \circ \chi') \\
&= (D(\chi \circ (\chi')^{-1}) \circ \chi')\, v(\chi') \\
&= (D(\chi \circ (\chi')^{-1}) \circ \chi')\, \tilde{X}'(v),
\end{aligned} \tag{3.53}
\]

where the third line follows from the chain rule for vector fields. One-form fields are independent of coordinates. So,

\[
\omega(v) = (a \circ \chi)\, \tilde{X}(v) = (a' \circ \chi')\, \tilde{X}'(v). \tag{3.54}
\]

Eqs. (3.54) and (3.53) require that the coefficients transform under coordinate transformations as follows:

\[
a(\chi(m))\, D(\chi \circ (\chi')^{-1})(\chi'(m)) = a'(\chi'(m)), \tag{3.55}
\]

or

\[
a(\chi(m)) = a'(\chi'(m))\, \left(D(\chi \circ (\chi')^{-1})(\chi'(m))\right)^{-1}. \tag{3.56}
\]

The coeﬃcient tuple a(x) is a down structure compatible for contraction with b(x). Let v be the vector with coeﬃcient tuple b(x), and ω be the one-form with coeﬃcient tuple a(x). Then, by equation (3.43),

ω(v) = (a ◦ χ) (b ◦ χ).

(3.57)

As a program:

(define omega (literal-1form-field 'a R2-rect))

(define v (literal-vector-field 'b R2-rect))

((omega v) R2-rect-point)
(+ (* (b^0 (up x0 y0)) (a_0 (up x0 y0)))
   (* (b^1 (up x0 y0)) (a_1 (up x0 y0))))
Comparing equation (3.56) with equation (3.23) we see that one-form components and vector components transform oppositely, so that

\[
a(x)\, b(x) = a'(x')\, b'(x'), \tag{3.58}
\]

as expected because ω(v)(m) is independent of coordinates.


Exercise 3.2: Veriﬁcation
Verify that the coeﬃcients of a one-form ﬁeld transform as described in equation (3.56). You should use equation (3.44) in your derivation.

Exercise 3.3: Hill Climbing
The topography of a region on the Earth can be speciﬁed by a manifold function h that gives the altitude at each point on the manifold. Let v be a vector ﬁeld on the manifold, perhaps specifying a direction and rate of walking at every point on the manifold.
a. Form an expression that gives the power that must be expended to follow the vector ﬁeld at each point.
b. Write this as a computational expression.

4
Basis Fields

A vector ﬁeld may be written as a linear combination of basis vector ﬁelds. If n is the dimension, then any set of n linearly independent vector ﬁelds may be used as a basis. The coordinate basis X is an example of a basis.1 We will see later that not every basis is a coordinate basis: in order to be a coordinate basis, there must be a coordinate system such that each basis element is the directional derivative operator in a corresponding coordinate direction.
Let e be a tuple of basis vector ﬁelds, such as the coordinate basis X. The general vector ﬁeld v applied to an arbitrary manifold function f can be expressed as a linear combination

\[
v(f)(m) = e(f)(m)\, \mathbf{b}(m) = \sum_i e_i(f)(m)\, \mathbf{b}^i(m), \tag{4.1}
\]

where $\mathbf{b}$ is a tuple-valued coefficient function on the manifold. When expressed in a coordinate basis, the coefficients that specify the direction of the vector are naturally expressed as functions $b^i$ of the coordinates of the manifold point. Here, the coefficient function $\mathbf{b}$ is more naturally expressed as a tuple-valued function on the manifold. If $b$ is the coefficient function expressed as a function of coordinates, then $\mathbf{b} = b \circ \chi$ is the coefficient function as a function on the manifold.
The coordinate-basis forms have a simple definition in terms of the coordinate-basis vectors and the coordinates (equation 3.40). With this choice, the dual property, equation (3.41), holds without further fuss. More generally, we can define a basis of one-forms $\tilde{e}$ that is dual to $e$ in that the property

\[
\tilde{e}^i(e_j)(m) = \delta^i_j \tag{4.2}
\]

is satisfied, analogous to property (3.41). Figure 4.1 illustrates the duality of basis fields.

1We cannot say if the basis vectors are orthogonal or normalized until we introduce a metric.

Figure 4.1 Let arrows $e_0$ and $e_1$ depict the vectors of a basis vector field at a particular point. Then the foliations shown by the parallel lines depict the dual basis one-form fields at that point. The dotted lines represent the field $\tilde{e}^0$ and the dashed lines represent the field $\tilde{e}^1$. The spacings of the lines are 1/3 unit. That the vectors pierce three of the lines representing their duals and do not pierce any of the lines representing the other basis elements is one way to see the relationship $\tilde{e}^i(e_j)(m) = \delta^i_j$.

To solve for the dual basis $\tilde{e}$ given the basis $e$, we express the basis vectors $e$ in terms of a coordinate basis2

\[
e_j(f) = \sum_k X_k(f)\, c^k_j, \tag{4.3}
\]

and the dual one-forms $\tilde{e}$ in terms of the dual coordinate one-forms

\[
\tilde{e}^i(v) = \sum_l d^i_l\, \tilde{X}^l(v), \tag{4.4}
\]

then

\[
\begin{aligned}
\tilde{e}^i(e_j) &= \sum_l d^i_l\, \tilde{X}^l(e_j) \\
&= \sum_l d^i_l\, e_j(\chi^l) \\
&= \sum_l d^i_l \sum_k X_k(\chi^l)\, c^k_j \\
&= \sum_{kl} d^i_l\, \delta^l_k\, c^k_j \\
&= \sum_k d^i_k\, c^k_j.
\end{aligned} \tag{4.5}
\]

Applying this at m we get

\[
\tilde{e}^i(e_j)(m) = \delta^i_j = \sum_k d^i_k(m)\, c^k_j(m). \tag{4.6}
\]

2 We write the vector components on the right and the tuple of basis vectors on the left because if we think of the basis vectors as organized as a row and the components as organized as a column then the formula is just a matrix multiplication.

So the d coefficients can be determined from the c coefficients (essentially by matrix inversion).
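As a small worked example of this inversion (the basis chosen here is hypothetical, for illustration only), take the constant basis $e_0 = X_0$, $e_1 = X_0 + X_1$ on the plane. Reading the components off equation (4.3) and inverting the matrix gives:

```latex
c = \begin{pmatrix} c^0_0 & c^0_1 \\ c^1_0 & c^1_1 \end{pmatrix}
  = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},
\qquad
d = c^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix},
\qquad
\tilde{e}^0 = \tilde{X}^0 - \tilde{X}^1, \quad \tilde{e}^1 = \tilde{X}^1.
```

The duality can be confirmed directly: $\tilde{e}^0(e_0) = 1$ and $\tilde{e}^0(e_1) = 1 - 1 = 0$, while $\tilde{e}^1(e_0) = 0$ and $\tilde{e}^1(e_1) = 1$.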
A set of vector ﬁelds {ei} may be linearly independent in the sense that a weighted sum of them may not be identically zero over a region, yet it may not be a basis in that region. The problem is that there may be some places in the region where the vectors are not independent. For example, two of the vectors may be parallel at a point but not parallel elsewhere in the region. At such a point m the determinant of the matrix c(m) is zero. So at these points we cannot deﬁne the dual basis forms.3
The dual form fields can be used to determine the coefficients b of a vector field v relative to a basis $e$, by applying the dual basis form fields $\tilde{e}$ to the vector field. Let

\[
v(f) = \sum_i e_i(f)\, b^i. \tag{4.7}
\]

Then

\[
\tilde{e}^j(v) = b^j. \tag{4.8}
\]

3 This is why the set of vector fields and the set of one-form fields are modules rather than vector spaces.

Deﬁne two general vector ﬁelds:

(define e0
  (+ (* (literal-manifold-function 'e0x R2-rect) d/dx)
     (* (literal-manifold-function 'e0y R2-rect) d/dy)))

(define e1
  (+ (* (literal-manifold-function 'e1x R2-rect) d/dx)
     (* (literal-manifold-function 'e1y R2-rect) d/dy)))

We use these as a vector basis and compute the dual:

(define e-vector-basis (down e0 e1))
(define e-dual-basis
  (vector-basis->dual e-vector-basis R2-polar))

The procedure vector-basis->dual requires an auxiliary coordinate system (here R2-polar) to get the $c^k_j$ coefficient functions from which we compute the $d^i_k$ coefficient functions. However, the final result is independent of this coordinate system. Then we can verify that the bases $e$ and $\tilde{e}$ satisfy the dual relationship (equation 3.41) by applying the dual basis to the vector basis:

((e-dual-basis e-vector-basis) R2-rect-point)
(up (down 1 0) (down 0 1))

Note that the dual basis was computed relative to the polar coordinate system: the resulting objects are independent of the coordinates in which they were expressed!
Or we can make a general vector ﬁeld with this basis and then pick out the coeﬃcients by applying the dual basis:

(define v
  (* (up (literal-manifold-function 'b^0 R2-rect)
         (literal-manifold-function 'b^1 R2-rect))
     e-vector-basis))

((e-dual-basis v) R2-rect-point)
(up (b^0 (up x0 y0)) (b^1 (up x0 y0)))

4.1 Change of Basis
Suppose that we have a vector field v expressed in terms of one basis $e$ and we want to reexpress it in terms of another basis $e'$. We have

\[
v(f) = \sum_i e_i(f)\, b^i = \sum_j e'_j(f)\, b'^j. \tag{4.9}
\]

The coefficients $b'$ can be obtained from v by applying the dual basis

\[
b'^j = \tilde{e}'^j(v) = \sum_i \tilde{e}'^j(e_i)\, b^i. \tag{4.10}
\]

Let

\[
J^j_i = \tilde{e}'^j(e_i), \tag{4.11}
\]

then

\[
b'^j = \sum_i J^j_i\, b^i, \tag{4.12}
\]

and

\[
e_i(f) = \sum_j e'_j(f)\, J^j_i. \tag{4.13}
\]

The Jacobian J is a structure of manifold functions. Using tuple arithmetic, we can write

\[
b' = J b \tag{4.14}
\]

and

\[
e(f) = e'(f)\, J. \tag{4.15}
\]

We can write

(define (Jacobian to-basis from-basis)
  (s:map/r (basis->1form-basis to-basis)
           (basis->vector-basis from-basis)))

These are the rectangular components of a vector ﬁeld:

(define b-rect
  ((coordinate-system->1form-basis R2-rect)
   (literal-vector-field 'b R2-rect)))

The polar components are:

(define b-polar
  (* (Jacobian (coordinate-system->basis R2-polar)
               (coordinate-system->basis R2-rect))
     b-rect))

(b-polar ((point R2-rect) (up 'x0 'y0)))
(up
 (/ (+ (* x0 (b^0 (up x0 y0))) (* y0 (b^1 (up x0 y0))))
    (sqrt (+ (expt x0 2) (expt y0 2))))
 (/ (+ (* x0 (b^1 (up x0 y0))) (* -1 y0 (b^0 (up x0 y0))))
    (+ (expt x0 2) (expt y0 2))))

We can also get the polar components directly:

(((coordinate-system->1form-basis R2-polar)
  (literal-vector-field 'b R2-rect))
 ((point R2-rect) (up 'x0 'y0)))
(up
 (/ (+ (* x0 (b^0 (up x0 y0))) (* y0 (b^1 (up x0 y0))))
    (sqrt (+ (expt x0 2) (expt y0 2))))
 (/ (+ (* x0 (b^1 (up x0 y0))) (* -1 y0 (b^0 (up x0 y0))))
    (+ (expt x0 2) (expt y0 2))))

We see that they are the same.
If K is the Jacobian that relates the basis vectors in the other direction

\[
e'(f) = e(f)\, K \tag{4.16}
\]

then

\[
K J = I = J K \tag{4.17}
\]

where I is a manifold function that returns the multiplicative identity.
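For the rectangular and polar bases used in the program above, J and K can be written out explicitly. In this sketch $e$ is the rectangular coordinate basis, $e'$ is the polar one, and $r = \sqrt{x^2 + y^2}$; the matrices are computed by hand here rather than taken from the text, with rows indexed by the upper index:

```latex
J = \begin{pmatrix} x/r & y/r \\ -y/r^2 & x/r^2 \end{pmatrix},
\qquad
K = \begin{pmatrix} x/r & -y \\ y/r & x \end{pmatrix},
\qquad
JK = \begin{pmatrix}
  \dfrac{x^2 + y^2}{r^2} & \dfrac{-xy + xy}{r} \\[2ex]
  \dfrac{-xy + xy}{r^3} & \dfrac{y^2 + x^2}{r^2}
\end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
```

The rows of J applied to the rectangular components $(b^0, b^1)$ reproduce the two polar components printed by the program: $(x b^0 + y b^1)/r$ and $(x b^1 - y b^0)/r^2$.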
The dual basis transforms oppositely. Let

\[
\omega = \sum_i a_i\, \tilde{e}^i = \sum_i a'_i\, \tilde{e}'^i. \tag{4.18}
\]

The coefficients are4

\[
a_i = \omega(e_i) = \sum_j a'_j\, \tilde{e}'^j(e_i) = \sum_j a'_j\, J^j_i \tag{4.19}
\]

or, in tuple arithmetic,

\[
a = a' J. \tag{4.20}
\]

Because of equation (4.18) we can deduce

\[
\tilde{e} = K \tilde{e}'. \tag{4.21}
\]

4.2 Rotation Basis

One interesting basis for rotations in 3-dimensional space is not a coordinate basis.
Rotations are the actions of the special orthogonal group SO(3), which is a 3-dimensional manifold. The elements of this group may be represented by the set of 3 × 3 orthogonal matrices with determinant +1.
We can use a coordinate patch on this manifold with Euler angle coordinates: each element has three coordinates, $\theta$, $\phi$, $\psi$. A manifold point may be represented by a rotation matrix. The rotation matrix for Euler angles is a product of three simple rotations: $M(\theta, \phi, \psi) = R_z(\phi) R_x(\theta) R_z(\psi)$, where $R_x$ and $R_z$ are functions that take an angle and produce the matrices representing rotations about the x and z axes, respectively. We can visualize $\theta$ as the colatitude of the pole from the $\hat{z}$-axis, $\phi$ as the longitude, and $\psi$ as the rotation around the pole.
Given a rotation specified by Euler angles, how do we change the Euler angle to correspond to an incremental rotation of size $\epsilon$ about the $\hat{x}$-axis? The direction (a, b, c) is constrained by the equation

\[
R_x(\epsilon)\, M(\theta, \phi, \psi) = M(\theta + a\epsilon, \phi + b\epsilon, \psi + c\epsilon). \tag{4.22}
\]

4 We see from equations (4.15) and (4.16) that J and K are inverses. We can obtain their coefficients by: $J^j_i = \tilde{e}'^j(e_i)$ and $K^j_i = \tilde{e}^j(e'_i)$.

Linear equations for (a, b, c) can be found by taking the derivative of this equation with respect to $\epsilon$. We find

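\[
0 = c \cos\theta + b, \tag{4.23}
\]
\[
0 = a \sin\phi - c \cos\phi \sin\theta, \tag{4.24}
\]
\[
1 = c \sin\phi \sin\theta + a \cos\phi, \tag{4.25}
\]

with the solution

\[
a = \cos\phi, \tag{4.26}
\]
\[
b = -\frac{\sin\phi \cos\theta}{\sin\theta}, \tag{4.27}
\]
\[
c = \frac{\sin\phi}{\sin\theta}. \tag{4.28}
\]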
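The solution can be confirmed by substituting (4.26)–(4.28) back into (4.23)–(4.25). The following check is a routine verification added for the reader, valid wherever $\sin\theta \neq 0$:

```latex
\begin{aligned}
c\cos\theta + b
  &= \frac{\sin\phi\cos\theta}{\sin\theta} - \frac{\sin\phi\cos\theta}{\sin\theta} = 0, \\
a\sin\phi - c\cos\phi\sin\theta
  &= \cos\phi\sin\phi - \sin\phi\cos\phi = 0, \\
c\sin\phi\sin\theta + a\cos\phi
  &= \sin^2\phi + \cos^2\phi = 1.
\end{aligned}
```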

Therefore, we can write the basis vector ﬁeld that takes directional derivatives in the direction of incremental x rotations as

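\[
\begin{aligned}
e_x &= a \frac{\partial}{\partial\theta} + b \frac{\partial}{\partial\phi} + c \frac{\partial}{\partial\psi} \\
    &= \cos\phi\, \frac{\partial}{\partial\theta}
       - \frac{\sin\phi \cos\theta}{\sin\theta}\, \frac{\partial}{\partial\phi}
       + \frac{\sin\phi}{\sin\theta}\, \frac{\partial}{\partial\psi}.
\end{aligned} \tag{4.29}
\]

Similarly, vector fields for the incremental y and z rotations are

\[
e_y = \frac{\cos\phi \cos\theta}{\sin\theta}\, \frac{\partial}{\partial\phi}
      + \sin\phi\, \frac{\partial}{\partial\theta}
      - \frac{\cos\phi}{\sin\theta}\, \frac{\partial}{\partial\psi}, \tag{4.30}
\]
\[
e_z = \frac{\partial}{\partial\phi}. \tag{4.31}
\]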

4.3 Commutators

The commutator of two vector ﬁelds is deﬁned as

[v, w](f) = v(w(f)) − w(v(f)).

(4.32)

In the special case that the two vector ﬁelds are coordinate basis ﬁelds, the commutator is zero:

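\[
\begin{aligned}
[X_i, X_j](f) &= X_i(X_j(f)) - X_j(X_i(f)) \\
&= \partial_i \partial_j (f \circ \chi^{-1}) \circ \chi - \partial_j \partial_i (f \circ \chi^{-1}) \circ \chi \\
&= 0,
\end{aligned} \tag{4.33}
\]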

because the individual partial derivatives commute. The vanishing commutator is telling us that we get to the same manifold point by integrating from a point along ﬁrst one basis vector ﬁeld and then another as from integrating in the other order. If the commutator is zero we can use the integral curves of the basis vector ﬁelds to form a coordinate mesh.
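More generally, the commutator of two vector fields is a vector field. Let v be a vector field with coefficient function $\mathbf{c} = c \circ \chi$, and u be a vector field with coefficient function $\mathbf{b} = b \circ \chi$, both with respect to the coordinate basis X. Then

\[
\begin{aligned}
[u, v](f) &= u(v(f)) - v(u(f)) \\
&= u\Big(\sum_i X_i(f)\, \mathbf{c}^i\Big) - v\Big(\sum_j X_j(f)\, \mathbf{b}^j\Big) \\
&= \sum_j X_j\Big(\sum_i X_i(f)\, \mathbf{c}^i\Big) \mathbf{b}^j - \sum_i X_i\Big(\sum_j X_j(f)\, \mathbf{b}^j\Big) \mathbf{c}^i \\
&= \sum_{ij} [X_j, X_i](f)\, \mathbf{c}^i \mathbf{b}^j
   + \sum_i X_i(f) \sum_j \big(X_j(\mathbf{c}^i)\, \mathbf{b}^j - X_j(\mathbf{b}^i)\, \mathbf{c}^j\big) \\
&= \sum_i X_i(f)\, \mathbf{a}^i,
\end{aligned} \tag{4.34}
\]

where the coefficient function $\mathbf{a}$ of the commutator vector field is

\[
\begin{aligned}
\mathbf{a}^i &= \sum_j \big(X_j(\mathbf{c}^i)\, \mathbf{b}^j - X_j(\mathbf{b}^i)\, \mathbf{c}^j\big) \\
&= u(\mathbf{c}^i) - v(\mathbf{b}^i).
\end{aligned} \tag{4.35}
\]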

We used the fact, shown above, that the commutator of two coordinate basis ﬁelds is zero.
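Equation (4.35) is easy to exercise on a concrete pair of fields. In the sketch below (assuming the scmutils environment; the choice u = ∂/∂x, v = x ∂/∂y and the name residual are illustrative, not from the text) the coefficients of u are (1, 0) and those of v are (0, x), so (4.35) gives a = (u(0) − v(1), u(x) − v(0)) = (0, 1), that is, [u, v] = ∂/∂y. The difference computed here is therefore expected to simplify to zero:

```scheme
;; Assumes scmutils. u and v are hypothetical example fields.
(define-coordinates (up x y) R2-rect)
(define R2-rect-point ((point R2-rect) (up 'x0 'y0)))
(define f (literal-manifold-function 'f-rect R2-rect))

(define u d/dx)
(define v (* x d/dy))

;; ([u, v] - d/dy) applied to a general function, evaluated at a point.
(define residual
  (((- (commutator u v) d/dy) f) R2-rect-point))
```

Swapping the arguments of commutator flips the sign, so (commutator v u) should instead cancel against (* -1 d/dy).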


We can check this formula for the commutator for the general vector ﬁelds e0 and e1 in polar coordinates:

(let* ((polar-basis (coordinate-system->basis R2-polar))
       (polar-vector-basis (basis->vector-basis polar-basis))
       (polar-dual-basis (basis->1form-basis polar-basis))
       (f (literal-manifold-function 'f-rect R2-rect)))
  ((- ((commutator e0 e1) f)
      (* (- (e0 (polar-dual-basis e1))
            (e1 (polar-dual-basis e0)))
         (polar-vector-basis f)))
   R2-rect-point))
0

Let $e$ be a tuple of basis vector fields. The commutator of two basis fields can be expressed in terms of the basis vector fields:

\[
[e_i, e_j](f) = \sum_k d^k_{ij}\, e_k(f), \tag{4.36}
\]

where $d^k_{ij}$ are functions of m, called the structure constants for the basis vector fields. The coefficients are

\[
d^k_{ij} = \tilde{e}^k([e_i, e_j]). \tag{4.37}
\]

The commutator [u, v] with respect to a non-coordinate basis $e_i$ is

\[
[u, v](f) = \sum_k e_k(f) \Big( u(c^k) - v(b^k) + \sum_{ij} c^i b^j d^k_{ji} \Big). \tag{4.38}
\]

Deﬁne the vector ﬁelds Jx, Jy, and Jz that generate rotations about the three rectangular axes in three dimensions:5

(define Jz (- (* x d/dy) (* y d/dx)))
(define Jx (- (* y d/dz) (* z d/dy)))
(define Jy (- (* z d/dx) (* x d/dz)))

(((+ (commutator Jx Jy) Jz) g) R3-rect-point)
0
(((+ (commutator Jy Jz) Jx) g) R3-rect-point)
0
(((+ (commutator Jz Jx) Jy) g) R3-rect-point)
0

5 Using
(define R3-rect (coordinate-system-at 'rectangular 'origin R3))
(define-coordinates (up x y z) R3-rect)
(define R3-rect-point ((point R3-rect) (up 'x0 'y0 'z0)))
(define g (literal-manifold-function 'g-rect R3-rect))

We see that

\[
[J_x, J_y] = -J_z, \qquad [J_y, J_z] = -J_x, \qquad [J_z, J_x] = -J_y. \tag{4.39}
\]

We can also compute the commutators for the basis vector ﬁelds ex, ey, and ez in the SO(3) manifold (see equations 4.29–4.31) that correspond to rotations about the x, y, and z axes, respectively:6

(((+ (commutator e_x e_y) e_z) f) SO3-point)
0
(((+ (commutator e_y e_z) e_x) f) SO3-point)
0
(((+ (commutator e_z e_x) e_y) f) SO3-point)
0

You can tell if a set of basis vector ﬁelds is a coordinate basis by calculating the commutators. If they are nonzero, then the basis is not a coordinate basis. If they are zero then the basis vector ﬁelds can be integrated to give the coordinate system.
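A standard illustration, not drawn from the text, is the rescaled polar basis on the plane, $e_r = \partial/\partial r$ and $e_\theta = (1/r)\, \partial/\partial\theta$. Its commutator does not vanish, so no coordinate system has these fields as its coordinate basis:

```latex
[e_r, e_\theta](f)
  = \frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial f}{\partial\theta}\right)
    - \frac{1}{r}\frac{\partial}{\partial\theta}\left(\frac{\partial f}{\partial r}\right)
  = -\frac{1}{r^2}\frac{\partial f}{\partial\theta}
  = -\frac{1}{r}\, e_\theta(f).
```

In the notation of equation (4.36), the only nonzero structure constants of this basis are $d^\theta_{r\theta} = -1/r$ and $d^\theta_{\theta r} = 1/r$.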
Recall equation (3.31)

\[
(e^{tv} f)(m) = (f \circ \phi^v_t)(m). \tag{4.40}
\]

Iterating this equation, we find

\[
(e^{sw} e^{tv} f)(m) = (f \circ \phi^v_t \circ \phi^w_s)(m). \tag{4.41}
\]

6 Using
(define Euler-angles (coordinate-system-at 'Euler 'Euler-patch SO3))
(define Euler-angles-chi-inverse (point Euler-angles))
(define-coordinates (up theta phi psi) Euler-angles)
(define SO3-point ((point Euler-angles) (up 'theta 'phi 'psi)))
(define f (literal-manifold-function 'f-Euler Euler-angles))

Notice that the evolution under w occurs before the evolution under v.
To illustrate the meaning of the commutator, consider the evolution around a small loop with sides made from the integral curves of two vector ﬁelds v and w. We will ﬁrst follow v, then w, then −v, and then −w:

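\[
(e^{\epsilon v} e^{\epsilon w} e^{-\epsilon v} e^{-\epsilon w} f)(m). \tag{4.42}
\]

To second order in $\epsilon$ the result is7

\[
(e^{\epsilon^2 [v, w]} f)(m). \tag{4.43}
\]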

(4.43)

This result is illustrated in ﬁgure 4.2.
Take a point 0 in M as the origin. Then, presuming [ei, ej] = 0, the coordinates x of the point m in the coordinate system corresponding to the e basis satisfy8

\[
m = \phi^{x e}_1(0) = \chi^{-1}(x), \tag{4.44}
\]

where χ is the coordinate function being deﬁned. Because the elements of e commute, we can translate separately along the integral curves in any order and reach the same point; the terms in the exponential can be factored into separate exponentials if needed.

7 For non-commuting operators A and B,
\[
\begin{aligned}
e^A e^B e^{-A} e^{-B}
&= \left(1 + A + \frac{A^2}{2} + \cdots\right)\left(1 + B + \frac{B^2}{2} + \cdots\right) \\
&\quad\times \left(1 - A + \frac{A^2}{2} + \cdots\right)\left(1 - B + \frac{B^2}{2} + \cdots\right) \\
&= 1 + [A, B] + \cdots,
\end{aligned}
\]
to second order in A and B. All higher-order terms can be written in terms of higher-order commutators of A and B. An example of a higher-order commutator is [A, [A, B]].

8 Here x is an up-tuple structure of components, and e is a down-tuple structure of basis vectors. The product of the two contracts to make a scaled vector, along which we translate by one unit.

Figure 4.2 The commutator of two vector fields computes the residual of a small loop following their integral curves.

Exercise 4.1: Alternate Angles
Note that the Euler angles are singular at $\theta = 0$ (where $\phi$ and $\psi$ become degenerate), so the representations of $e_x$, $e_y$, and $e_z$ (defined in equations 4.29–4.31) have problems there. An alternate coordinate system avoids this problem, while introducing a similar problem elsewhere in the manifold.
Consider the "alternate angles" $(\theta_a, \phi_a, \psi_a)$ which define a rotation matrix via $M(\theta_a, \phi_a, \psi_a) = R_z(\phi_a)\, R_x(\theta_a)\, R_y(\psi_a)$.
a. Where does the singularity appear in these alternate coordinates? Do you think you could define a coordinate system for rotations that has no singularities?
b. What do the $e_x$, $e_y$, and $e_z$ basis vector fields look like in this coordinate system?

Exercise 4.2: General Commutators
Verify equation (4.38).

Exercise 4.3: SO(3) Basis and Angular Momentum Basis
How are $J_x$, $J_y$, and $J_z$ related to $e_x$, $e_y$, and $e_z$ in equations (4.29–4.31)?

5
Integration

We know how to integrate real-valued functions of a real variable. We want to extend this idea to manifolds, in such a way that the integral is independent of the coordinate system used to compute it.
The integral of a real-valued function of a real variable is the limit of a sum of products of the values of the function on subintervals and the lengths of the increments of the independent variable in those subintervals:

$$\int_a^b f = \int_a^b f(x)\,dx = \lim_{\Delta x_i \to 0} \sum_i f(x_i)\,\Delta x_i. \tag{5.1}$$

If we change variables (x = g(y)), then the form of the integral changes:

$$\int_a^b f = \int_a^b f(x)\,dx = \int_{g^{-1}(a)}^{g^{-1}(b)} f(g(y))\,Dg(y)\,dy = \int_{g^{-1}(a)}^{g^{-1}(b)} (f \circ g)\,Dg. \tag{5.2}$$
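As a numerical illustration of equation (5.2), here is a minimal Python sketch (an aside; the programs in this book are written in Scheme). The integrand f = cos, the substitution g(y) = y², the interval [1, 4], and the helper `midpoint_integral` are all assumptions chosen for the example, not part of the text.

```python
import math

def midpoint_integral(func, lo, hi, steps=20_000):
    """Approximate the integral of func over [lo, hi] by the midpoint rule."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width

# Illustrative choices (assumptions): integrand f and substitution x = g(y).
f = math.cos
g = lambda y: y * y        # x = g(y)
Dg = lambda y: 2.0 * y     # derivative of g
g_inverse = math.sqrt      # inverse of g on the positive interval used here

a, b = 1.0, 4.0

# Left side of (5.2): integrate f over [a, b].
lhs = midpoint_integral(f, a, b)

# Right side of (5.2): integrate (f o g) Dg over [g^-1(a), g^-1(b)].
rhs = midpoint_integral(lambda y: f(g(y)) * Dg(y), g_inverse(a), g_inverse(b))

print(lhs, rhs)
```

The two printed numbers should agree to within the discretization error of the midpoint rule; the factor Dg is what compensates for the change of interval.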

We can make a coordinate-independent notion of integration in the following way. An interval of the real line is a 1-dimensional manifold with boundary. We can assign a coordinate chart χ to this manifold. Let x = χ(m). The coordinate basis is associated with a coordinate-basis vector field, here ∂/∂x. Let ω be a one-form on this manifold. The application of ω to ∂/∂x is a real-valued function on the manifold. If we compose this with the inverse chart, we get a real-valued function of a real variable. We can then write the usual integral of this function

$$I = \int_a^b \omega(\partial/\partial x) \circ \chi^{-1}. \tag{5.3}$$

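It turns out that the value of this integral is independent of the coordinate chart used in its definition. Consider a different coordinate chart $x' = \chi'(m)$, with associated basis vector field $\partial/\partial x'$. Let $g = \chi' \circ \chi^{-1}$. We have

$$\begin{aligned}
\int_{a'}^{b'} \omega(\partial/\partial x') \circ \chi'^{-1}
&= \int_{a'}^{b'} \omega\left(\partial/\partial x\,\left(D(\chi \circ \chi'^{-1}) \circ \chi'\right)\right) \circ \chi'^{-1} \\
&= \int_{a'}^{b'} \left(\omega(\partial/\partial x)\,\left(D(\chi \circ \chi'^{-1}) \circ \chi'\right)\right) \circ \chi'^{-1} \\
&= \int_{a'}^{b'} \left(\omega(\partial/\partial x) \circ \chi'^{-1}\right) D(\chi \circ \chi'^{-1}) \\
&= \int_{a}^{b} \left(\left(\omega(\partial/\partial x) \circ \chi^{-1}\right)\left(D(\chi \circ \chi'^{-1}) \circ g\right)\right) Dg \\
&= \int_{a}^{b} \omega(\partial/\partial x) \circ \chi^{-1},
\end{aligned} \tag{5.4}$$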

where we have used the rule for coordinate transformations of basis vectors (equation 3.19), linearity of forms in the ﬁrst two lines, and the rule for change-of-variables under an integral in the last line.1
Because the integral is independent of the coordinate chart, we can write simply

$$I = \int_M \omega, \tag{5.5}$$

where M is the 1-dimensional manifold with boundary corresponding to the interval.
We are exploiting the fact that coordinate basis vectors in different coordinate systems are related by a Jacobian (see equation 3.19), which cancels the Jacobian that appears in the change-of-variables formula for integration (see equation 5.2).

1 Note $(D(\chi \circ \chi'^{-1}) \circ (\chi' \circ \chi^{-1}))\,D(\chi' \circ \chi^{-1}) = 1$. With $g = \chi' \circ \chi^{-1}$ this is $(D(g^{-1}) \circ g)\,(Dg) = 1$.

5.1 Higher Dimensions

We have seen that we can integrate one-forms on 1-dimensional manifolds. We need higher-rank forms that we can integrate on higher-dimensional manifolds in a coordinate-independent manner.
Consider the integral of a real-valued function, $f : R^n \to R$, over a region U in $R^n$. Under a coordinate transformation $g : R^n \to R^n$, we have2

$$\int_U f = \int_{g^{-1}(U)} (f \circ g)\,\det(Dg). \tag{5.6}$$

A rank n form field takes n vector field arguments and produces a real-valued manifold function: ω(v, w, . . . , u)(m). By analogy with the 1-dimensional case, higher-rank forms are linear in each argument. Higher-rank forms must also be antisymmetric under interchange of any two arguments in order to make a coordinate-free definition of integration analogous to equation (5.3).
Consider an integral in the coordinate system χ:

$$\int_{\chi(U)} \omega(X_0, X_1, \ldots) \circ \chi^{-1}. \tag{5.7}$$

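Under coordinate transformations $g = \chi \circ \chi'^{-1}$, the integral becomes

$$\int_{\chi'(U)} \omega(X_0, X_1, \ldots) \circ \chi'^{-1}\,\det(Dg). \tag{5.8}$$

Using the change-of-basis formula, equation (3.19):

$$X(f) = X'(f)\,(D(\chi' \circ \chi^{-1})) \circ \chi = X'(f)\,D(g^{-1}) \circ \chi.$$

If we let $M = (D(g^{-1})) \circ \chi$ then

$$\begin{aligned}
(\omega(X_0, X_1, \ldots) \circ \chi'^{-1})\,\det(Dg)
&= (\omega(X' M_0, X' M_1, \ldots) \circ \chi'^{-1})\,\det(Dg) && (5.9) \\
&= (\omega(X'_0, X'_1, \ldots) \circ \chi'^{-1})\,\alpha(M_0, M_1, \ldots)\,\det(Dg), && (5.10)
\end{aligned}$$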

2The determinant is the unique function of the rows of its argument that i) is linear in each row, ii) changes sign under any interchange of rows, and iii) is one when applied to the identity multiplier.

using the multilinearity of ω, where Mi is the ith column of M . The function α is multilinear in the columns of M . To make a coordinate-independent integration we want the expression (5.10) to be the same as the integrand in

$$I' = \int_{\chi'(U)} \omega(X'_0, X'_1, \ldots) \circ \chi'^{-1}. \tag{5.11}$$

For this to be the case, $\alpha(M_0, M_1, \ldots)$ must be $(\det(Dg))^{-1} = \det(M)$. So α is an antisymmetric function, and thus so is ω.
Thus higher-rank form ﬁelds must be antisymmetric multilinear functions from vector ﬁelds to manifold functions. So we have a coordinate-independent deﬁnition of integration of form ﬁelds on a manifold and we can write

$$I = I' = \int_U \omega. \tag{5.12}$$

Wedge Product
There are several ways we can construct antisymmetric higher-rank forms. Given two one-form fields ω and τ we can form a two-form field ω ∧ τ as follows:

(ω ∧ τ )(v, w) = ω(v)τ (w) − ω(w)τ (v).

(5.13)

More generally we can form the wedge of higher-rank forms. Let ω be a k-form ﬁeld and τ be an l-form ﬁeld. We can form a (k + l)-form ﬁeld ω ∧ τ as follows:

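$$\omega \wedge \tau = \frac{(k + l)!}{k!\,l!}\,\mathrm{Alt}(\omega \otimes \tau) \tag{5.14}$$

where, if η is a function on m vectors,

$$\mathrm{Alt}(\eta)(v_0, \ldots, v_{m-1}) = \frac{1}{m!} \sum_{\sigma \in \mathrm{Perm}(m)} \mathrm{Parity}(\sigma)\,\eta(v_{\sigma(0)}, \ldots, v_{\sigma(m-1)}), \tag{5.15}$$

and where

$$\omega \otimes \tau\,(v_0, \ldots, v_{k-1}, v_k, \ldots, v_{k+l-1}) = \omega(v_0, \ldots, v_{k-1})\,\tau(v_k, \ldots, v_{k+l-1}). \tag{5.16}$$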
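The definitions (5.14) through (5.16) can be tried out directly. The following is a minimal Python sketch, not the implementation used by the book's Scheme software: forms are represented as plain Python functions of component tuples at a single point, and the names `parity`, `alt`, `wedge`, and the sample vectors are assumptions made for the illustration.

```python
from itertools import permutations
from math import factorial

def parity(perm):
    """Return +1 for an even permutation of 0..m-1 and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:          # sort by swaps, flipping the sign each time
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def alt(eta, m):
    """Antisymmetrize a function eta of m vectors, as in equation (5.15)."""
    def result(*vectors):
        total = sum(parity(s) * eta(*(vectors[i] for i in s))
                    for s in permutations(range(m)))
        return total / factorial(m)
    return result

def wedge(omega, k, tau, l):
    """Wedge of a k-form and an l-form, as in equations (5.14) and (5.16)."""
    def tensor(*vectors):
        return omega(*vectors[:k]) * tau(*vectors[k:])
    scale = factorial(k + l) / (factorial(k) * factorial(l))
    antisymmetrized = alt(tensor, k + l)
    return lambda *vectors: scale * antisymmetrized(*vectors)

# Coordinate one-forms on R^3, acting on tuples of vector components.
dx = lambda v: v[0]
dy = lambda v: v[1]
dz = lambda v: v[2]

dx_dy = wedge(dx, 1, dy, 1)
volume = wedge(dx_dy, 2, dz, 1)

# Sample vectors (assumptions chosen for the example).
u, v, w = (1.0, 2.0, 3.0), (0.0, 1.0, 4.0), (5.0, 6.0, 0.0)

print(dx_dy(u, v))        # compare with u[0]*v[1] - u[1]*v[0], equation (5.13)
print(volume(u, v, w))    # compare with the determinant of the rows u, v, w
```

With this normalization the wedge of two one-forms reduces to equation (5.13), and the triple wedge applied to three vectors agrees with the determinant of their components.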

Figure 5.1 The area of the parallelogram in the (x, y) coordinate plane is given by A(u, v)(m).

The wedge product is associative, and thus we need not specify the order of a multiple application. The factorial coeﬃcients of these formulas are chosen so that

(dx ∧ dy ∧ . . .) (∂/∂x, ∂/∂y, . . .) = 1.

(5.17)

This is true independent of the coordinate system. Equation (5.17) gives us

$$\int_U \mathsf{d}x \wedge \mathsf{d}y \wedge \cdots = \mathrm{Volume}(U) \tag{5.18}$$

where Volume(U) is the ordinary volume of the region corresponding to U in the Euclidean space of Rn with the orthonormal coordinate system (x, y, . . .).3
An example two-form (see figure 5.1) is the oriented area of a parallelogram in the (x, y) coordinate plane at the point m spanned by two vectors $u = u^0\,\partial/\partial x + u^1\,\partial/\partial y$ and $v = v^0\,\partial/\partial x + v^1\,\partial/\partial y$, which is given by

$$A(u, v)(m) = u^0(m)\,v^1(m) - v^0(m)\,u^1(m). \tag{5.19}$$

3By using the word “orthonormal” here we are assuming that the range of the coordinate chart is an ordinary Euclidean space with the usual Euclidean metric. The coordinate basis in that chart is orthonormal. Under these conditions we can usefully use words like “length,” “area,” and “volume” in the coordinate space.

Note that this is the area of the parallelogram in the coordinate plane, which is the range of the coordinate function. It is not the area on the manifold. To deﬁne that, we need more structure—the metric. We will put a metric on the manifold in Chapter 9.

3-Dimensional Euclidean Space
Let’s specialize to 3-dimensional Euclidean space. Following equation (5.18) we can write the coordinate-area two-form in another way: A = dx ∧ dy. As code:
(define-coordinates (up x y z) R3-rect)
(define u (+ (* 'u^0 d/dx) (* 'u^1 d/dy)))
(define v (+ (* 'v^0 d/dx) (* 'v^1 d/dy)))
(((wedge dx dy) u v) R3-rect-point)
(+ (* u^0 v^1) (* -1 u^1 v^0))
If we use cylindrical coordinates and deﬁne cylindrical vector ﬁelds we get the analogous answer in cylindrical coordinates:
(define-coordinates (up r theta z) R3-cyl)
(define a (+ (* 'a^0 d/dr) (* 'a^1 d/dtheta)))
(define b (+ (* 'b^0 d/dr) (* 'b^1 d/dtheta)))
(((wedge dr dtheta) a b) ((point R3-cyl) (up 'r0 'theta0 'z0)))
(+ (* a^0 b^1) (* -1 a^1 b^0))
The moral of this story is that this is the area of the parallelogram in the coordinate plane. It is not the area on the manifold!
There is a similar story with volumes. The wedge product of the elements of the coordinate basis is a three-form that measures our usual idea of coordinate volumes in R3 with a Euclidean metric:
(define u (+ (* 'u^0 d/dx) (* 'u^1 d/dy) (* 'u^2 d/dz)))
(define v (+ (* 'v^0 d/dx) (* 'v^1 d/dy) (* 'v^2 d/dz)))
(define w (+ (* 'w^0 d/dx) (* 'w^1 d/dy) (* 'w^2 d/dz)))
(((wedge dx dy dz) u v w) R3-rect-point)
(+ (* u^0 v^1 w^2)
   (* -1 u^0 v^2 w^1)
   (* -1 u^1 v^0 w^2)
   (* u^1 v^2 w^0)
   (* u^2 v^0 w^1)
   (* -1 u^2 v^1 w^0))
This last expression is the determinant of a 3 × 3 matrix:
(- (((wedge dx dy dz) u v w) R3-rect-point)
   (determinant
    (matrix-by-rows (list 'u^0 'u^1 'u^2)
                    (list 'v^0 'v^1 'v^2)
                    (list 'w^0 'w^1 'w^2))))
0

If we did the same operations in cylindrical coordinates we would get the analogous formula, showing that what we are computing is volume in the coordinate space, not volume on the manifold.
Because of antisymmetry, if the rank of a form is greater than the dimension of the manifold then the form is identically zero. The k-forms on an n-dimensional manifold form a module of dimension $\binom{n}{k}$. We can write a coordinate-basis expression for a k-form as

$$\omega = \sum_{i_0, \ldots, i_{k-1} = 0}^{n-1} \omega_{i_0, \ldots, i_{k-1}}\,\mathsf{d}x^{i_0} \wedge \cdots \wedge \mathsf{d}x^{i_{k-1}}. \tag{5.20}$$

The antisymmetry of the wedge product implies that

$$\omega_{i_{\sigma(0)}, \ldots, i_{\sigma(k-1)}} = \mathrm{Parity}(\sigma)\,\omega_{i_0, \ldots, i_{k-1}}, \tag{5.21}$$

from which we see that there are only $\binom{n}{k}$ independent components of ω.

Exercise 5.1: Wedge Product
Pick a coordinate system and use the computer to verify that
a. the wedge product is associative for forms in your coordinate system;
b. formula (5.17) is true in your coordinate system.

5.2 Exterior Derivative

The intention of introducing the exterior derivative is to capture all of the classical theorems of “vector analysis” into one uniﬁed Stokes’s Theorem, which asserts that the integral of a form on the boundary of a manifold is the integral of the exterior derivative of the form on the interior of the manifold:4

$$\int_{\partial M} \omega = \int_M \mathsf{d}\omega. \tag{5.22}$$

As we have seen in equation (3.34), the diﬀerential of a function
on a manifold is a one-form ﬁeld. If a function on a manifold is considered to be a form ﬁeld of rank zero,5 then the diﬀerential
operator increases the rank of the form by one. We can generalize
this to k-form ﬁelds with the exterior derivative operation. Consider a one-form ω. We deﬁne6

dω(v1, v2) = v1(ω(v2)) − v2(ω(v1)) − ω([v1, v2]).

(5.23)

More generally, the exterior derivative of a k-form field is a (k + 1)-form field, given by:7

$$\mathsf{d}\omega(v_0, \ldots, v_k) = \sum_{i=0}^{k} \Big\{ (-1)^i\,v_i(\omega(v_0, \ldots, v_{i-1}, v_{i+1}, \ldots, v_k)) + \sum_{j=i+1}^{k} (-1)^{i+j}\,\omega([v_i, v_j], v_0, \ldots, v_{i-1}, v_{i+1}, \ldots, v_{j-1}, v_{j+1}, \ldots, v_k) \Big\}. \tag{5.24}$$

This formula is coordinate-system independent. This is the way we compute the exterior derivative in our software.
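To see the commutator term of equation (5.23) at work, here is a minimal numerical sketch in Python (an aside; it is not how the book's Scheme software computes exterior derivatives). Vector fields on the plane act on functions by central finite differences, and a one-form is a procedure that takes a vector field and returns a function. The step size `H`, the helper names, and the particular one-form and vector fields are assumptions chosen for the example.

```python
H = 1e-3  # finite-difference step (an illustrative choice)

def vector_field(cx, cy):
    """Vector field cx d/dx + cy d/dy on R^2, acting on functions of a point."""
    def apply(f):
        def derivative(p):
            x, y = p
            df_dx = (f((x + H, y)) - f((x - H, y))) / (2 * H)
            df_dy = (f((x, y + H)) - f((x, y - H))) / (2 * H)
            return cx(p) * df_dx + cy(p) * df_dy
        return derivative
    return apply

def commutator(v, w):
    """[v, w](f) = v(w(f)) - w(v(f))."""
    return lambda f: (lambda p: v(w(f))(p) - w(v(f))(p))

coord_x = lambda p: p[0]
coord_y = lambda p: p[1]

def one_form(a, b):
    """One-form a dx + b dy: applied to a vector field it gives a function."""
    return lambda v: (lambda p: a(p) * v(coord_x)(p) + b(p) * v(coord_y)(p))

def d_one_form(omega):
    """Exterior derivative of a one-form, following equation (5.23)."""
    def two_form(v1, v2):
        return lambda p: (v1(omega(v2))(p) - v2(omega(v1))(p)
                          - omega(commutator(v1, v2))(p))
    return two_form

# Illustrative data (assumptions): omega = -y dx + x dy, two non-commuting fields.
omega = one_form(lambda p: -p[1], lambda p: p[0])
v = vector_field(lambda p: p[1], lambda p: p[0])          # y d/dx + x d/dy
w = vector_field(lambda p: 1.0, lambda p: p[0] * p[1])    # d/dx + x y d/dy

point = (0.7, -0.4)
numeric = d_one_form(omega)(v, w)(point)

# For this omega the coordinate formula gives 2 dx ^ dy, so the value on (v, w)
# is 2 (v^x w^y - v^y w^x) evaluated at the point.
x, y = point
expected = 2.0 * (y * (x * y) - x * 1.0)

print(numeric, expected)
```

Because v and w do not commute here, dropping the ω([v1, v2]) term would spoil the agreement; the term is what makes the result depend only on the values of the vector fields at the point.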

4 This is a generalization of the Fundamental Theorem of Calculus.
5 A manifold function f induces a form field $\hat{f}$ of rank 0 as follows: $\hat{f}()(m) = f(m)$.
6 The definition is chosen to make Stokes’s Theorem pretty.
7 See Spivak, Differential Geometry, Volume 1, p. 289.

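If the form field ω is represented in a coordinate basis

$$\omega = \sum_{i_0 = 0, \ldots, i_{k-1} = 0}^{n-1} a_{i_0, \ldots, i_{k-1}}\,\mathsf{d}x^{i_0} \wedge \cdots \wedge \mathsf{d}x^{i_{k-1}} \tag{5.25}$$

then the exterior derivative can be expressed as

$$\mathsf{d}\omega = \sum_{i_0 = 0, \ldots, i_{k-1} = 0}^{n-1} \mathsf{d}a_{i_0, \ldots, i_{k-1}} \wedge \mathsf{d}x^{i_0} \wedge \cdots \wedge \mathsf{d}x^{i_{k-1}}. \tag{5.26}$$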

Though this formula is expressed in terms of a coordinate basis, the result is independent of the choice of coordinate system.

Computing Exterior Derivatives
We can test that the computation indicated by equation (5.24) is equivalent to the computation indicated by equation (5.26) in three dimensions with a general one-form field:
(define a (literal-manifold-function 'alpha R3-rect))
(define b (literal-manifold-function 'beta R3-rect))
(define c (literal-manifold-function 'gamma R3-rect))
(define theta (+ (* a dx) (* b dy) (* c dz)))
The test will require two arbitrary vector fields
(define X (literal-vector-field 'X-rect R3-rect))
(define Y (literal-vector-field 'Y-rect R3-rect))
(((- (d theta)
     (+ (wedge (d a) dx)
        (wedge (d b) dy)
        (wedge (d c) dz)))
  X Y)
 R3-rect-point)
0

We can also try a general two-form field in 3-dimensional space: Let

$$\omega = a\,\mathsf{d}y \wedge \mathsf{d}z + b\,\mathsf{d}z \wedge \mathsf{d}x + c\,\mathsf{d}x \wedge \mathsf{d}y, \tag{5.27}$$

where a = α ∘ χ, b = β ∘ χ, c = γ ∘ χ, and α, β, and γ are real-valued functions of three real arguments. As a program,
(define omega
  (+ (* a (wedge dy dz))
     (* b (wedge dz dx))
     (* c (wedge dx dy))))
Here we need another vector field because our result will be a three-form field.
(define Z (literal-vector-field 'Z-rect R3-rect))
(((- (d omega)
     (+ (wedge (d a) dy dz)
        (wedge (d b) dz dx)
        (wedge (d c) dx dy)))
  X Y Z)
 R3-rect-point)
0

Properties of Exterior Derivatives
The exterior derivative of the wedge of two form ﬁelds obeys the graded Leibniz rule. It can be written in terms of the exterior derivatives of the component form ﬁelds:

$$\mathsf{d}(\omega \wedge \tau) = \mathsf{d}\omega \wedge \tau + (-1)^k\,\omega \wedge \mathsf{d}\tau, \tag{5.28}$$

where k is the rank of ω.
A form field ω that is the exterior derivative of another form field, ω = dθ, is called exact. A form field whose exterior derivative is zero is called closed.
Every exact form field is a closed form field: applying the exterior derivative operator twice always yields zero:

$$\mathsf{d}^2\omega = 0. \tag{5.29}$$

This is equivalent to the statement that partial derivatives with respect to diﬀerent variables commute.8
It is easy to show equation (5.29) for manifold functions:

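$$\begin{aligned}
\mathsf{d}^2 f(u, v) &= \mathsf{d}(\mathsf{d}f)(u, v) \\
&= u(\mathsf{d}f(v)) - v(\mathsf{d}f(u)) - \mathsf{d}f([u, v]) \\
&= u(v(f)) - v(u(f)) - [u, v](f) \\
&= 0
\end{aligned} \tag{5.30}$$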

8See Spivak, Calculus on Manifolds, p.92

Consider the general one-form ﬁeld θ deﬁned on 3-dimensional rectangular space. Taking two exterior derivatives of θ yields a three-form ﬁeld. It is zero:

(((d (d theta)) X Y Z) R3-rect-point)
0

Not every closed form ﬁeld is an exact form ﬁeld. Whether a closed form ﬁeld is exact depends on the topology of a manifold.

5.3 Stokes’s Theorem

The proof of the general Stokes’s Theorem for n-dimensional orientable manifolds is quite complicated, but it is easy to see how it works for a 2-dimensional region M that can be covered with a single coordinate patch.9
Given a coordinate chart χ(m) = (x(m), y(m)) we can obtain a pair of coordinate-basis vectors ∂/∂x = X0 and ∂/∂y = X1.
The coordinate image of M can be divided into small rectangular areas in the (x, y) coordinate plane. The union of the rectangular areas gives the coordinate image of M. The clockwise integrals around the boundaries of the rectangles cancel on neighboring rectangles, because the boundary is traversed in opposite directions. But on the boundary of the coordinate image of M the boundary integrals do not cancel, yielding an integral on the boundary of M. Area integrals over the rectangular areas add to produce an integral over the entire coordinate image of M.
So, consider Stokes’s Theorem on a small patch P of the manifold for which the coordinates form a rectangular region (xmin < x < xmax and ymin < y < ymax). Stokes’s Theorem on P states

$$\int_{\partial P} \omega = \int_P \mathsf{d}\omega. \tag{5.31}$$

9 We do not develop the machinery for integration on chains that is usually needed for a full proof of Stokes’s Theorem. This is adequately done in other books. A beautiful treatment can be found in Spivak, Calculus on Manifolds [17].

The area integral on the right can be written as an ordinary multidimensional integral using the coordinate basis vectors (recall that the integral is independent of the choice of coordinates):

$$\int_{\chi(P)} \mathsf{d}\omega(\partial/\partial x, \partial/\partial y) \circ \chi^{-1} = \int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} \left(\partial/\partial x(\omega(\partial/\partial y)) - \partial/\partial y(\omega(\partial/\partial x))\right) \circ \chi^{-1}. \tag{5.32}$$

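We have used equation (5.23) to expand the exterior derivative.
Consider just the first term of the right-hand side of equation (5.32). Then using the definition of basis vector field ∂/∂x we obtain

$$\begin{aligned}
\int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} \partial/\partial x(\omega(\partial/\partial y)) \circ \chi^{-1}
&= \int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} X_0(\omega(\partial/\partial y)) \circ \chi^{-1} \\
&= \int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} \partial_0\left((\omega(\partial/\partial y)) \circ \chi^{-1}\right).
\end{aligned} \tag{5.33}$$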

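This integral can now be evaluated using the Fundamental Theorem of Calculus. Accumulating the results for both integrals

$$\begin{aligned}
\int_{\chi(P)} \mathsf{d}\omega(\partial/\partial x, \partial/\partial y) \circ \chi^{-1}
&= \int_{x_{\min}}^{x_{\max}} \left((\omega(\partial/\partial x)) \circ \chi^{-1}\right)(x, y_{\min})\,dx \\
&\quad + \int_{y_{\min}}^{y_{\max}} \left((\omega(\partial/\partial y)) \circ \chi^{-1}\right)(x_{\max}, y)\,dy \\
&\quad - \int_{x_{\min}}^{x_{\max}} \left((\omega(\partial/\partial x)) \circ \chi^{-1}\right)(x, y_{\max})\,dx \\
&\quad - \int_{y_{\min}}^{y_{\max}} \left((\omega(\partial/\partial y)) \circ \chi^{-1}\right)(x_{\min}, y)\,dy \\
&= \int_{\partial P} \omega,
\end{aligned} \tag{5.34}$$

as was to be shown.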
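The four boundary terms of equation (5.34) can be compared numerically with the area integral of equation (5.32). The following Python sketch is only an illustration under stated assumptions: the one-form coefficients `alpha` and `beta`, the rectangle, and the helper `midpoint_integral` are chosen for the example, and the coefficient of dx ∧ dy in dω is worked out by hand rather than computed.

```python
def midpoint_integral(func, lo, hi, steps=400):
    """Approximate the integral of func over [lo, hi] by the midpoint rule."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width

# Illustrative one-form omega = alpha dx + beta dy (coefficients are assumptions).
alpha = lambda x, y: -y ** 3
beta = lambda x, y: x ** 3

# Coefficient of dx ^ dy in d(omega): partial_0(beta) - partial_1(alpha),
# differentiated by hand for the alpha and beta above.
d_omega_coefficient = lambda x, y: 3 * x ** 2 + 3 * y ** 2

xmin, xmax, ymin, ymax = 0.0, 1.0, 0.0, 2.0

# The four edge integrals, with the signs of equation (5.34).
boundary = (midpoint_integral(lambda x: alpha(x, ymin), xmin, xmax)
            + midpoint_integral(lambda y: beta(xmax, y), ymin, ymax)
            - midpoint_integral(lambda x: alpha(x, ymax), xmin, xmax)
            - midpoint_integral(lambda y: beta(xmin, y), ymin, ymax))

# The area integral over the rectangle, as an iterated integral.
interior = midpoint_integral(
    lambda x: midpoint_integral(lambda y: d_omega_coefficient(x, y), ymin, ymax),
    xmin, xmax)

print(boundary, interior)
```

The two numbers agree up to the discretization error of the area integral. Note that the signs of the four edge terms correspond to traversing the bottom and right edges in the direction of increasing coordinate and the top and left edges in the opposite direction.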

5.4 Vector Integral Theorems

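Green’s Theorem states that for an arbitrary compact set $M \subset R^2$, a 2-dimensional Euclidean space:

$$\int_{\partial M} \left((\alpha \circ \chi)\,\mathsf{d}x + (\beta \circ \chi)\,\mathsf{d}y\right) = \int_M \left((\partial_0 \beta - \partial_1 \alpha) \circ \chi\right) \mathsf{d}x \wedge \mathsf{d}y. \tag{5.35}$$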

We can test this. By Stokes’s Theorem, the integrands are related by an exterior derivative. We need some vectors to test our forms:

(define v (literal-vector-field 'v-rect R2-rect))
(define w (literal-vector-field 'w-rect R2-rect))
We can now test our integrands:10
(define alpha (literal-function 'alpha R2->R))
(define beta (literal-function 'beta R2->R))
(let ((dx (ref (basis->1form-basis R2-rect-basis) 0))
      (dy (ref (basis->1form-basis R2-rect-basis) 1)))
  (((- (d (+ (* (compose alpha (chart R2-rect)) dx)
             (* (compose beta (chart R2-rect)) dy)))
       (* (compose (- ((partial 0) beta)
                      ((partial 1) alpha))
                   (chart R2-rect))
          (wedge dx dy)))
    v w)
   R2-rect-point))
0

We can also compute the integrands for the Divergence Theorem: For an arbitrary compact set $M \subset R^3$ and a vector field w

$$\int_M \mathrm{div}(w)\,dV = \int_{\partial M} w \cdot n\,dA \tag{5.36}$$

where n is the outward-pointing normal to the surface ∂M . Again, the integrands should be related by an exterior derivative, if this is an instance of Stokes’s Theorem.

10 Using (define R2-rect-basis (coordinate-system->basis R2-rect)). Here we extract dx and dy from R2-rect-basis to avoid globally installing coordinates.

Note that even the statement of this theorem cannot be made with the machinery we have developed at this point. The concepts “outward-pointing normal,” area A, and volume V on the manifold are not deﬁnable without using a metric (see Chapter 9). However, for orthonormal rectangular coordinates in R3 we can interpret the integrands in terms of forms.
Let the vector ﬁeld describing the ﬂow of stuﬀ be

$$w = a\,\frac{\partial}{\partial x} + b\,\frac{\partial}{\partial y} + c\,\frac{\partial}{\partial z}. \tag{5.37}$$

The rate of leakage of stuﬀ through each element of the boundary is w · n dA. We interpret this as the two-form

a dy ∧ dz + b dz ∧ dx + c dx ∧ dy,

(5.38)

because any part of the boundary will have y-z, z-x, and x-y components, and each such component will pick up contributions from the normal component of the ﬂux w. Formalizing this as code we have

(define a (literal-manifold-function 'a-rect R3-rect))
(define b (literal-manifold-function 'b-rect R3-rect))
(define c (literal-manifold-function 'c-rect R3-rect))
(define flux-through-boundary-element
  (+ (* a (wedge dy dz))
     (* b (wedge dz dx))
     (* c (wedge dx dy))))

The rate of production of stuﬀ in each element of volume is div(w) dV . We interpret this as the three-form

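$$\left(\frac{\partial}{\partial x} a + \frac{\partial}{\partial y} b + \frac{\partial}{\partial z} c\right) \mathsf{d}x \wedge \mathsf{d}y \wedge \mathsf{d}z, \tag{5.39}$$

or:
(define production-in-volume-element
  (* (+ (d/dx a) (d/dy b) (d/dz c))
     (wedge dx dy dz)))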

Assuming Stokes’s Theorem, the exterior derivative of the leakage of stuff per unit area through the boundary must be the rate of production of stuff per unit volume in the interior. We check this by applying the difference to arbitrary vector fields at an arbitrary point:

(define X (literal-vector-field 'X-rect R3-rect))
(define Y (literal-vector-field 'Y-rect R3-rect))
(define Z (literal-vector-field 'Z-rect R3-rect))
(((- production-in-volume-element
     (d flux-through-boundary-element))
  X Y Z)
 R3-rect-point)
0

as expected.

Exercise 5.2: Graded Formula
Derive equation (5.28).

Exercise 5.3: Iterated Exterior Derivative
We have shown that equation (5.29) is true for manifold functions. Show that it is true for any form field.

6
Over a Map
To deal with motion on manifolds we need to think about paths on manifolds and vectors along these paths. Tangent vectors along paths are not vector ﬁelds on the manifold because they are deﬁned only on the path. And the path may even cross itself, which would give more than one vector at a point. Here we introduce the concept of a vector ﬁeld over a map.1 A vector ﬁeld over a map assigns a vector to each image point of the map. In general the map may be a function from one manifold to another. If the domain of the map is the manifold of the real line, the range of the map is a 1-dimensional path on the target manifold. One possible way to deﬁne a vector ﬁeld over a map is to assign a tangent vector to each image point of a path, allowing us to work with tangent vectors to paths. A one-form ﬁeld over the map allows us to extract the components of a vector ﬁeld over the map.

6.1 Vector Fields Over a Map
Let μ be a map from points n in the manifold N to points m in the manifold M. A vector over the map μ takes directional derivatives of functions on M at points m = μ(n). The vector over the map applied to the function on M is a function on N.

Restricted Vector Fields
One way to make a vector ﬁeld over a map is to restrict a vector ﬁeld on M to the image of N over μ, as illustrated in ﬁgure 6.1.
Let v be a vector ﬁeld on M, and f a function on M. Then

vμ(f) = v(f) ◦ μ,

(6.1)

is a vector over the map μ. Note that vμ(f) is a function on N, not M:

vμ(f)(n) = v(f)(μ(n)).

(6.2)

1See Bishop and Goldberg, Tensor Analysis on Manifolds [3].

Figure 6.1 The vector field v on M is indicated by arrows. The solid arrows are vμ, the restricted vector field over the map μ. The vector field over the map is restricted to the image of N in M.
We can implement this definition as:
(define ((vector-field->vector-field-over-map mu:N->M) v-on-M)
  (procedure->vector-field
   (lambda (f-on-M)
     (compose (v-on-M f-on-M) mu:N->M))))

Diﬀerential of a Map
Another way to construct a vector ﬁeld over a map μ is to transport a vector ﬁeld from the source manifold N to the target manifold M with the diﬀerential of the map

dμ(v)(f)(n) = v(f ◦ μ)(n),

(6.3)

which takes its argument in the source manifold N. The diﬀerential of a map μ applied to a vector ﬁeld v on N is a vector ﬁeld over the map. A procedure to compute the diﬀerential is:

(define (((differential mu) v) f)
  (v (compose f mu)))
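The same construction can be imitated with ordinary closures. The following Python sketch mirrors the curried shape of the procedure above, with a numerical derivative standing in for the basis vector field on the real line; the path `mu`, the function `f`, the step `H`, and the helper names are assumptions chosen for the illustration, not part of the book's software.

```python
import math

H = 1e-5  # finite-difference step (an illustrative choice)

def d_dt(g):
    """Basis vector field on the real line: differentiate g numerically."""
    return lambda t: (g(t + H) - g(t - H)) / (2 * H)

def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))

def differential(mu):
    """Curried like the Scheme version: ((differential(mu))(v))(f) = v(f o mu)."""
    return lambda v: (lambda f: v(compose(f, mu)))

# Illustrative path on the plane and a function on the plane (assumptions).
mu = lambda t: (math.cos(t), math.sin(t))
f = lambda m: m[0] * m[1]

# The vector over the map: it takes a function on the plane and returns a
# function of time.
velocity = differential(mu)(d_dt)
rate = velocity(f)(0.3)

print(rate, math.cos(0.6))
```

Here f ∘ μ is cos(t) sin(t), whose derivative is cos(2t), so the two printed values should agree closely. The result is a function of time, a point of the source manifold, as equation (6.3) requires.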

The nomenclature of this subject is confused. The “differential of a map between manifolds,” $d\mu$, takes one more argument than the “differential of a real-valued function on a manifold,” $\mathsf{d}f$, but when the target manifold of μ is the reals and I is the identity function on the reals,

$$d\mu(v)(I)(n) = (v(I \circ \mu))(n) = (v(\mu))(n) = \mathsf{d}\mu(v)(n). \tag{6.4}$$

We avoid this problem in our notation by distinguishing $d$ (the differential of a map) and $\mathsf{d}$ (the differential of a function). In our programs we encode $d$ as differential and $\mathsf{d}$ as d.

Velocity at a Time
Let μ be the map from the time line to the manifold M, and ∂/∂t be a basis vector on the time line. Then dμ(∂/∂t) is the vector over the map μ that computes the rate of change of functions on M along the path that is the image of μ. This is the velocity vector. We can use the differential to assign a velocity vector to each moment, solving the problem of multiple vectors at a point if the path crosses itself.

6.2 One-Form Fields Over a Map

Given a one-form ω on the manifold M, the one-form over the map μ : N → M is constructed as follows:

ωμ(vμ)(n) = ω(u)(μ(n)), where u(f)(m) = vμ(f)(n).

(6.5)

The object u is not really a vector ﬁeld on M even though we have given it that shape so that the dual vector can apply to it; u(f) is evaluated only at images m = μ(n) of points n in N. If we were deﬁning u as a vector ﬁeld we would need the inverse of μ to ﬁnd the point n = μ−1(m), but this is not required to deﬁne the object u in a context where there is already an m associated with the n of interest. To extend this idea to k-forms, we carry each vector argument over the map.
The procedure that constructs a k-form over the map from a k-form is:

(define ((form-field->form-field-over-map mu:N->M) w-on-M)
  (define (make-fake-vector-field V-over-mu n)
    (define ((u f) m)
      ((V-over-mu f) n))
    (procedure->vector-field u))
  (procedure->nform-field
   (lambda vectors-over-map
     (lambda (n)
       ((apply w-on-M
               (map (lambda (V-over-mu)
                      (make-fake-vector-field V-over-mu n))
                    vectors-over-map))
        (mu:N->M n))))
   (get-rank w-on-M)))
The internal procedure make-fake-vector-field counterfeits a vector ﬁeld u on M from the vector ﬁeld over the map μ : N → M. This works here because the only value that is ever passed as m is (mu:N->M n).

6.3 Basis Fields Over a Map

Let e be a tuple of basis vector fields, and $\tilde{e}$ be the tuple of basis one-forms that is dual to e:

$$\tilde{e}^i(e_j)(m) = \delta^i_j. \tag{6.6}$$

The basis vectors over the map, $e^\mu$, are particular cases of vectors over a map:

$$e^\mu(f) = e(f) \circ \mu. \tag{6.7}$$

And the elements of the dual basis over the map, $\tilde{e}_\mu$, are particular cases of one-forms over the map. The basis and dual basis over the map satisfy

$$\tilde{e}^i_\mu(e^\mu_j)(n) = \delta^i_j. \tag{6.8}$$

Walking on a Sphere
For example, let μ map the time line to the unit sphere.2 We use colatitude θ and longitude φ as coordinates on the sphere:
(define S2 (make-manifold S^2 2 3))
(define S2-spherical
  (coordinate-system-at 'spherical 'north-pole S2))
(define-coordinates (up theta phi) S2-spherical)
(define S2-basis (coordinate-system->basis S2-spherical))
A general path on the sphere is:3
(define mu
  (compose (point S2-spherical)
           (up (literal-function 'theta)
               (literal-function 'phi))
           (chart R1-rect)))
The basis over the map is constructed from the basis on the sphere:
(define S2-basis-over-mu
  (basis->basis-over-map mu S2-basis))
(define h
  (literal-manifold-function 'h-spherical S2-spherical))
(((basis->vector-basis S2-basis-over-mu) h)
 ((point R1-rect) 't0))
(down
 (((partial 0) h-spherical) (up (theta t0) (phi t0)))
 (((partial 1) h-spherical) (up (theta t0) (phi t0))))

The basis vectors over the map compute derivatives of the function h evaluated on the path at the given time.

2 We execute (define-coordinates t R1-rect) to make t the coordinate function of the real line.
3 We provide a shortcut to make literal manifold maps:
(define mu (literal-manifold-map 'mu R1-rect S2-spherical))
But if we used this shortcut, the component functions would be named mu^0 and mu^1. Here we wanted to use more mnemonic names for the component functions.

We can check that the dual basis over the map does the correct thing:

(((basis->1form-basis S2-basis-over-mu)
  (basis->vector-basis S2-basis-over-mu))
 ((point R1-rect) 't0))
(up (down 1 0) (down 0 1))

Components of the Velocity
Let χ be a tuple of coordinates on M, with associated basis vectors $X_i$, and dual basis elements $\mathsf{d}x^i$. The vector basis and dual basis over the map μ are $X^\mu_i$ and $\mathsf{d}x^i_\mu$. The components of the velocity (rates of change of coordinates along the path μ) are obtained by applying the dual basis over the map to the velocity

$$v^i(t) = \mathsf{d}x^i_\mu(d\mu(\partial/\partial t))(\mathsf{t}), \tag{6.9}$$

where t is the coordinate for the point $\mathsf{t}$. For example, the coordinate velocities on a sphere are
(((basis->1form-basis S2-basis-over-mu)
  ((differential mu) d/dt))
 ((point R1-rect) 't0))
(up ((D theta) t0) ((D phi) t0))

as expected.

6.4 Pullbacks and Pushforwards
Maps from one manifold to another can also be used to relate the vector ﬁelds and one-form ﬁelds on one manifold to those on the other. We have introduced two such relations: restricted vector ﬁelds and the diﬀerential of a function. However, there are other ways to relate the vector ﬁelds and form ﬁelds on diﬀerent manifolds that are connected by a map.

Pullback and Pushforward of a Function

The pullback of a function f on M over the map μ is deﬁned as

$$\mu^* f = f \circ \mu. \tag{6.10}$$

This allows us to take a function defined on M and use it to define a new function on N.
For example, the integral curve of v evolved for time t as a function of the initial manifold point m generates a map $\phi^v_t$ of the manifold onto itself. This is a simple currying4 of the integral curve of v from m as a function of time: $\phi^v_t(m) = \gamma^v_m(t)$. The evolution of the function f along an integral curve, equation (3.33), can be written in terms of the pullback over $\phi^v_t$:

$$(E_{t,v} f)(m) = f(\phi^v_t(m)) = ((\phi^v_t)^* f)(m). \tag{6.11}$$

This is implemented as:

(define ((pullback-function mu:N->M) f-on-M)
  (compose f-on-M mu:N->M))

A vector field over the map that was constructed by restriction (equation 6.1) can be seen as the pullback of the function constructed by application of the vector field to a function:

$$v_\mu(f) = v(f) \circ \mu = \mu^*(v(f)). \tag{6.12}$$

A vector field over the map that was constructed by a differential (equation 6.3) can be seen as the vector field applied to the pullback of the function:

$$d\mu(v)(f)(n) = v(f \circ \mu)(n) = v(\mu^* f)(n). \tag{6.13}$$

If we have an inverse for the map μ we can also define a pushforward of the function g, defined on the source manifold of the map:5

$$\mu_* g = g \circ \mu^{-1}. \tag{6.14}$$

4 A function of two arguments may be seen as a function of one argument whose value is a function of the other argument. This can be done in two different ways, depending on which argument is supplied first. The general process of specifying a subset of the arguments to produce a new function of the others is called currying the function, in honor of the logician Haskell Curry (1900–1982) who, with Moses Schönfinkel (1889–1942), developed combinatory logic.
5 Notation note: superscript asterisk indicates pullback, subscript asterisk indicates pushforward. Pullbacks and pushforwards are tightly binding operators, so, for example, $\mu^* f(n) = (\mu^* f)(n)$.

Pushforward of a Vector Field

We can also define the pushforward of a vector field over the map μ. The pushforward takes a vector field v defined on N. The result takes directional derivatives of functions on M at a place determined by a point in M:

$$\mu_* v(f)(m) = v(\mu^* f)(\mu^{-1}(m)) = v(f \circ \mu)(\mu^{-1}(m)), \tag{6.15}$$

or

$$\mu_* v(f) = \mu_*(v(\mu^* f)). \tag{6.16}$$

Here we expressed the pushforward of the vector ﬁeld in terms of pullbacks and pushforwards of functions. Note that the pushforward requires the inverse of the map.
If the map is from time to some conﬁguration manifold and represents the time evolution of a process, we can think of the pushforward of a vector ﬁeld as a velocity measured at a point on the trajectory in the conﬁguration manifold. By contrast, the diﬀerential of the map applied to the vector ﬁeld gives us the velocity vector at each moment in time. Because a trajectory may cross itself, the pushforward is not deﬁned at any point where the crossing occurs, but the diﬀerential is always deﬁned.

Pushforward Along Integral Curves
We can push a vector field forward over the map generated by an integral curve of a vector field w, because the inverse is always available.6

$$((\phi^w_t)_* v)(f)(m) = v((\phi^w_t)^* f)(\phi^w_{-t}(m)) = v(f \circ \phi^w_t)(\phi^w_{-t}(m)). \tag{6.17}$$

This is implemented as:
(define ((pushforward-vector mu:N->M mu^-1:M->N) v-on-N)
  (procedure->vector-field
   (lambda (f)
     (compose (v-on-N (compose f mu:N->M)) mu^-1:M->N))))
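A small numerical analogue may make the roles of the map and its inverse concrete. The following Python sketch follows the shape of the pushforward-vector procedure; the choice of the exponential as the map (from the real line to the positive reals), the numerical derivative `d_dx`, the step `H`, and the test function are all assumptions made for the illustration.

```python
import math

H = 1e-5  # finite-difference step (an illustrative choice)

def d_dx(g):
    """Basis vector field on the real line: differentiate g numerically."""
    return lambda x: (g(x + H) - g(x - H)) / (2 * H)

def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))

def pushforward_vector(mu, mu_inverse):
    """Follow the Scheme version: (mu_* v)(f) = v(f o mu) o mu^-1."""
    return lambda v: (lambda f: compose(v(compose(f, mu)), mu_inverse))

# Illustrative invertible map (an assumption): exp from R to the positive reals.
mu, mu_inverse = math.exp, math.log

pushed = pushforward_vector(mu, mu_inverse)(d_dx)

# Apply the pushed-forward field to f(m) = m^2 at the point m = 3 of the target.
f = lambda m: m * m
value = pushed(f)(3.0)

print(value)
```

For this map the pushed-forward field acts on the target as m d/dm, so applied to m² at m = 3 the result should be close to 18. The argument 3.0 is a point of the target manifold; the inverse map is what carries it back to the source, where the original field can act.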

6 The map $\phi^w_t$ is always invertible: $(\phi^w_t)^{-1} = \phi^w_{-t}$ because of the uniqueness of the solutions of the initial-value problem for ordinary differential equations.

Pullback of a Vector Field

Given a vector field v on manifold M we can pull the vector field back through the map μ : N → M as follows:

$$\mu^* v(f)(n) = (v(f \circ \mu^{-1}))(\mu(n)) \tag{6.18}$$

or

$$\mu^* v(f) = \mu^*(v(\mu_* f)). \tag{6.19}$$

This may be useful when the map is invertible, as in the ﬂow generated by a vector ﬁeld.
This is implemented as:

(define (pullback-vector-field mu:N->M mu^-1:M->N)
  (pushforward-vector mu^-1:M->N mu:N->M))

Pullback of a Form Field
We can also pull back a one-form ﬁeld ω deﬁned on M, but an honest deﬁnition is rarely written. The pullback of a one-form ﬁeld applied to a vector ﬁeld is intended to be the same as the one-form ﬁeld applied to the pushforward of the vector ﬁeld.
The pullback of a one-form ﬁeld is often described by the relation

$$\mu^* \omega(v) = \omega(\mu_* v), \tag{6.20}$$

but this is wrong, because the two sides are not functions of points in the same manifold. The one-form field ω applies to a vector field on the manifold M, which takes a directional derivative of a function defined on M and is evaluated at a point on M, but the left-hand side is evaluated at a point on the manifold N.
A more precise description would be

$$\mu^* \omega(v)(n) = \omega(\mu_* v)(\mu(n)) \tag{6.21}$$

or

$$\mu^* \omega(v) = \mu^*(\omega(\mu_* v)). \tag{6.22}$$
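Equation (6.21) can be exercised on a one-dimensional example. The following Python sketch is an illustration under stated assumptions, not the book's implementation: the map is the exponential from the real line to the positive reals, the one-form on the target is dm/m, and the helper names, the numerical derivative, and the step `H` are all chosen for the example.

```python
import math

H = 1e-5  # finite-difference step (an illustrative choice)

def d_dx(g):
    """Basis vector field on the real line: differentiate g numerically."""
    return lambda x: (g(x + H) - g(x - H)) / (2 * H)

def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))

def pushforward_vector(mu, mu_inverse):
    """(mu_* v)(f) = v(f o mu) o mu^-1, as in equation (6.15)."""
    return lambda v: (lambda f: compose(v(compose(f, mu)), mu_inverse))

def pullback_one_form(mu, mu_inverse):
    """(mu^* omega)(v)(n) = omega(mu_* v)(mu(n)), as in equation (6.21)."""
    push = pushforward_vector(mu, mu_inverse)
    return lambda omega: (lambda v: (lambda n: omega(push(v))(mu(n))))

# Illustrative data (assumptions): mu = exp, and omega = dm / m on the target.
mu, mu_inverse = math.exp, math.log
identity = lambda m: m
omega = lambda u: (lambda m: u(identity)(m) / m)

pulled = pullback_one_form(mu, mu_inverse)(omega)

# The pulled-back form applied to d/dx is a function on the source manifold.
value = pulled(d_dx)(0.4)

print(value)
```

Since dm/m is the differential of log m, its pullback over the exponential map is dx, so the value on d/dx should be close to 1 at every point of the source. Both sides of the relation are evaluated at a point n of N, which is the correction that equation (6.21) makes to equation (6.20).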