INTRODUCTION TO
LINEAR ALGEBRA
Third Edition
GILBERT STRANG
Massachusetts Institute of Technology
WELLESLEY-CAMBRIDGE PRESS Box 812060 Wellesley MA 02482
Introduction to Linear Algebra, 3rd Edition Copyright ©2003 by Gilbert Strang
All rights reserved. No part of this work may be reproduced or stored or transmitted by any means, including photocopying, without written permission from Wellesley-Cambridge Press. Translation in any language is strictly prohibited - authorized translations are arranged.
Printed in the United States of America
9 8 7 6 5 4 3 2
ISBN 0-9614088-9-8
QA184.S78 2003 512'.5 93-14092
Other texts from Wellesley-Cambridge Press
Wavelets and Filter Banks, Gilbert Strang and Truong Nguyen, ISBN 0-9614088-7-1.
Linear Algebra, Geodesy, and GPS, Gilbert Strang and Kai Borre, ISBN 0-9614088-6-3.
Introduction to Applied Mathematics, Gilbert Strang, ISBN 0-9614088-0-4.
An Analysis of the Finite Element Method, Gilbert Strang and George Fix, ISBN 0-9614088-8-X.
Calculus, Gilbert Strang, ISBN 0-9614088-2-0.
Wellesley-Cambridge Press Box 812060 Wellesley MA 02482 USA www.wellesleycambridge.com
gs@math.mit.edu math.mit.edu/~gs phone/fax (781) 431-8488
MATLAB® is a registered trademark of The Mathworks, Inc.
LaTeX text preparation by Cordula Robinson and Brett Coonley, Massachusetts Institute of Technology
LaTeX assembly and book design by Amy Hendrickson, TeXnology Inc., www.texnology.com
A Solutions Manual is available to instructors by email from the publisher. Course material including syllabus and Teaching Codes and exams and videotaped lectures for this course is available on the linear algebra web site: web.mit.edu/18.06/www
Linear Algebra is included in the OpenCourseWare site ocw.mit.edu with videos of the full course.
TABLE OF CONTENTS
1 Introduction to Vectors  1
1.1 Vectors and Linear Combinations  1
1.2 Lengths and Dot Products  10
2 Solving Linear Equations  21
2.1 Vectors and Linear Equations  21
2.2 The Idea of Elimination  35
2.3 Elimination Using Matrices  46
2.4 Rules for Matrix Operations  56
2.5 Inverse Matrices  71
2.6 Elimination = Factorization: A = LU  83
2.7 Transposes and Permutations  96
3 Vector Spaces and Subspaces  111
3.1 Spaces of Vectors  111
3.2 The Nullspace of A: Solving Ax = 0  122
3.3 The Rank and the Row Reduced Form  134
3.4 The Complete Solution to Ax = b  144
3.5 Independence, Basis and Dimension  157
3.6 Dimensions of the Four Subspaces  173
4 Orthogonality  184
4.1 Orthogonality of the Four Subspaces  184
4.2 Projections  194
4.3 Least Squares Approximations  206
4.4 Orthogonal Bases and Gram-Schmidt  219
5 Determinants  233
5.1 The Properties of Determinants  233
5.2 Permutations and Cofactors  245
5.3 Cramer's Rule, Inverses, and Volumes  259
6 Eigenvalues and Eigenvectors  274
6.1 Introduction to Eigenvalues  274
6.2 Diagonalizing a Matrix  288
6.3 Applications to Differential Equations  304
6.4 Symmetric Matrices  318
6.5 Positive Definite Matrices  330
6.6 Similar Matrices  343
6.7 Singular Value Decomposition (SVD)  352
7 Linear Transformations  363
7.1 The Idea of a Linear Transformation  363
7.2 The Matrix of a Linear Transformation  371
7.3 Change of Basis  384
7.4 Diagonalization and the Pseudoinverse  391
8 Applications  401
8.1 Matrices in Engineering  401
8.2 Graphs and Networks  412
8.3 Markov Matrices and Economic Models  423
8.4 Linear Programming  431
8.5 Fourier Series: Linear Algebra for Functions  437
8.6 Computer Graphics  444
9 Numerical Linear Algebra  450
9.1 Gaussian Elimination in Practice  450
9.2 Norms and Condition Numbers  459
9.3 Iterative Methods for Linear Algebra  466
10 Complex Vectors and Matrices  477
10.1 Complex Numbers  477
10.2 Hermitian and Unitary Matrices  486
10.3 The Fast Fourier Transform  495
Solutions to Selected Exercises  502
A Final Exam  542
Matrix Factorizations  544
Conceptual Questions for Review  546
Glossary: A Dictionary for Linear Algebra  551
Index  559
Teaching Codes  567
PREFACE
This preface expresses some personal thoughts. It is my chance to write about how linear algebra can be taught and learned. If we teach pure abstraction, or settle for cookbook formulas, we miss the best part. This course has come a long way, in living up to what it can be.
It may be helpful to mention the web pages connected to this book. So many messages come back with suggestions and encouragement, and I hope that professors and students will make free use of everything. You can directly access web.mit.edu/18.06/www, which is continually updated for the MIT course that is taught every semester. Linear Algebra is also on the OpenCourseWare site ocw.mit.edu, where 18.06 became exceptional by including videos (which you definitely don't have to watch). I can briefly indicate part of what is available now:
1. Lecture schedule and current homeworks and exams with solutions
2. The goals of the course and conceptual questions
3. Interactive Java demos for eigenvalues and least squares and more
4. A table of eigenvalue/eigenvector information (see page 362)
5. Glossary: A Dictionary for Linear Algebra
6. Linear Algebra Teaching Codes and MATLAB problems
7. Videos of the full course (taught in a real classroom).
These web pages are a resource for professors and students worldwide. My goal is to make this book as useful as possible, with all the course material I can provide.
After this preface, the book will speak for itself. You will see the spirit right away. The goal is to show the beauty of linear algebra, and its value. The emphasis is on understanding - I try to explain rather than to deduce. This is a book about real mathematics, not endless drill. I am constantly working with examples (create a matrix, find its nullspace, add another column, see what changes, ask for help!). The textbook has to help too, in teaching what students need. The effort is absolutely rewarding, and fortunately this subject is not too hard.
The New Edition
A major addition to the book is the large number of Worked Examples, section by section. Their purpose is to connect the text directly to the homework problems. The complete solution to a vector equation Ax = b is x_particular + x_nullspace, and the steps are explained as clearly as I can. The Worked Example 3.4 A converts this explanation into action by taking every step in the solution (starting with the test for solvability). I hope these model examples will bring the content of each section into focus (see 5.1 A and 5.2 B on determinants). The "Pascal matrices" are a neat link from the amazing properties of Pascal's triangle to linear algebra.
The book contains new problems of all kinds - more basic practice, applications throughout science and engineering and management, and just fun with matrices. Northwest and southeast matrices wander into Problem 2.4.39. Google appears in Chapter 6. Please look at the last exercise in Section 1.1. I hope the problems are a strong point of this book - the newest one is about the six 3 by 3 permutation matrices: What are their determinants and pivots and traces and eigenvalues?
The Glossary is also new, in the book and on the web. I believe students will find it helpful. In addition to defining the important terms of linear algebra, there was also a chance to include many of the key facts for quick reference.
Fortunately, the need for linear algebra is widely recognized. This subject is absolutely as important as calculus. I don't concede anything, when I look at how mathematics is used. There is even a light-hearted essay called "Too Much Calculus" on the web page. The century of data has begun! So many applications are discrete rather than continuous, digital rather than analog. The truth is that vectors and matrices have become the language to know.
The Linear Algebra Course
The equation Ax = b uses that language right away. The matrix A times any vector x is a combination of the columns of A. The equation is asking for a combination that produces b. Our solution comes at three levels and they are all important:
1. Direct solution by forward elimination and back substitution.
2. Matrix solution x = A^{-1}b by inverting the matrix.
3. Vector space solution by looking at the column space and nullspace of A.
And there is another possibility: Ax = b may have no solution. Elimination may lead to 0 = 1. The matrix approach may fail to find A^{-1}. The vector space approach can look at all combinations Ax of the columns, but b might be outside that column space. Part of mathematics is understanding when Ax = b is solvable, and what to do when it is not (the least squares solution uses A^T A in Chapter 4).
Another part is learning to visualize vectors. A vector v with two components is not hard. Its components v1 and v2 tell how far to go across and up - we draw an arrow. A second vector w may be perpendicular to v (and Chapter 1 tells when). If those vectors have six components, we can't draw them but our imagination keeps trying. In six-dimensional space, we can test quickly for a right angle. It is easy to visualize 2v (twice as far) and -w (opposite to w). We can almost see a combination like 2v - w.
Most important is the effort to imagine all the combinations cv + dw. They fill a "two-dimensional plane" inside the six-dimensional space. As I write these words, I am not at all sure that I can see this subspace. But linear algebra works easily with vectors and matrices of any size. If we have currents on six edges, or prices for six products, or just position and velocity of an airplane, we are dealing with six dimensions. For image processing or web searches (or the human genome), six might change to a million. It is still linear algebra, and linear combinations still hold the key.
Structure of the Textbook
Already in this preface, you can see the style of the book and its goal. The style is informal but the goal is absolutely serious. Linear algebra is great mathematics, and I certainly hope that each professor who teaches this course will learn something new. The author always does. The student will notice how the applications reinforce the ideas. I hope you will see how this book moves forward, gradually and steadily.
I want to note six points about the organization of the book:
1. Chapter 1 provides a brief introduction to vectors and dot products. If the class has met them before, the course can begin with Chapter 2. That chapter solves n by n systems Ax = b, and prepares for the whole course.
2. I now use the reduced row echelon form more than before. The MATLAB command rref(A) produces bases for the row space and column space. Better than that, reducing the combined matrix [A I] produces total information about all four of the fundamental subspaces.
3. Those four subspaces are an excellent way to learn about linear independence and bases and dimension. They go to the heart of the matrix, and they are genuinely the key to applications. I hate just making up vector spaces when so many important ones come naturally. If the class sees plenty of examples, independence is virtually understood in advance: A has independent columns when x = 0 is the only solution to Ax = 0.
4. Section 6.1 introduces eigenvalues for 2 by 2 matrices. Many courses want to see eigenvalues early. It is absolutely possible to go directly from Chapter 3 to Section 6.1. The determinant is easy for a 2 by 2 matrix, and eigshow on the web captures graphically the moment when Ax = λx.
5. Every section in Chapters 1 to 7 ends with a highlighted Review of the Key Ideas. The reader can recapture the main points by going carefully through this review.
6. Chapter 8 (Applications) has a new section on Matrices in Engineering.
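Points 2 and 3 can be tried out in a few lines. What follows is a minimal sketch in Python rather than MATLAB; the names `rref` and `has_independent_columns` are illustrative assumptions (this is neither the Teaching Code nor MATLAB's own rref), and exact fractions are used so that no roundoff enters. The columns are independent exactly when every column holds a pivot, which is the same as saying that x = 0 is the only solution to Ax = 0.

```python
from fractions import Fraction


def rref(rows):
    """Return (R, pivot_columns): the reduced row echelon form and its pivot columns."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivots = []
    r = 0  # index of the next pivot row
    for col in range(n):
        if r == m:
            break
        # Look for a nonzero entry in this column, at or below row r.
        pivot_row = next((i for i in range(r, m) if A[i][col] != 0), None)
        if pivot_row is None:
            continue  # no pivot in this column (a free column)
        A[r], A[pivot_row] = A[pivot_row], A[r]
        p = A[r][col]
        A[r] = [x / p for x in A[r]]  # scale so the pivot equals 1
        for i in range(m):
            if i != r and A[i][col] != 0:
                f = A[i][col]
                # Clear the rest of the pivot column, above and below.
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(col)
        r += 1
    return A, pivots


def has_independent_columns(rows):
    """True when every column has a pivot, so Ax = 0 forces x = 0."""
    _, pivots = rref(rows)
    return len(pivots) == len(rows[0])


R, pivots = rref([[1, 2, 3], [2, 4, 7], [3, 6, 10]])
print(R, pivots)
print(has_independent_columns([[1, 2], [2, 4]]))
```

The 3 by 3 matrix in the usage lines is a made-up example whose second column is twice the first, so only columns 1 and 3 receive pivots.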
When software is available (and time to use it), I see two possible approaches. One is to carry out instantly the steps of testing linear independence, orthogonalizing by Gram-Schmidt, and solving Ax = b and Ax = λx. The Teaching Codes follow the steps described in class - MATLAB and Maple and Mathematica compute a little differently. All can be used (optionally) with this book. The other approach is to experiment on bigger problems - like finding the largest determinant of a ±1 matrix, or the average size of a pivot. The time to compute A^{-1}b is measured by tic; inv(A) * b; toc. Choose A = rand(1000) and compare with tic; A\b; toc by direct elimination.
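The "largest determinant of a ±1 matrix" experiment can be tried on a very small scale. This is a hedged sketch in Python, not one of the book's codes: the size 3 by 3 is chosen here only so that all 2^9 = 512 matrices can be checked by brute force, and the names `det3` and `largest_det_pm1_3x3` are hypothetical.

```python
from itertools import product


def det3(m):
    """Determinant of a 3 by 3 matrix by cofactors along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def largest_det_pm1_3x3():
    """Try every 3 by 3 matrix with entries +1 or -1 and keep the largest determinant."""
    best = None
    for entries in product((1, -1), repeat=9):
        m = (entries[0:3], entries[3:6], entries[6:9])
        d = det3(m)
        if best is None or d > best:
            best = d
    return best


print(largest_det_pm1_3x3())
```

Larger sizes grow too fast for exhaustive search (a 5 by 5 matrix already has 2^25 sign patterns), which is where random trials become the natural experiment.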
A one-semester course that moves steadily will reach eigenvalues. The key idea is to diagonalize A by its eigenvector matrix S. When that succeeds, the eigenvalues appear in S^{-1}AS. For symmetric matrices we can choose S^{-1} = S^T. When A is rectangular we need U^T A V (U comes from eigenvectors of AA^T and V from A^T A).
Chapters 1 to 6 are the heart of a basic course in linear algebra - theory plus applications. The beauty of this subject is in the way those come together.
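A small worked instance of the diagonalization S^{-1}AS may help. The 2 by 2 symmetric matrix below is a hypothetical example chosen for illustration, not one taken from this preface. Its eigenvectors (1, 1) and (1, -1) go into the columns of S, and the eigenvalues 3 and 1 appear on the diagonal:

```latex
A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
S = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad
S^{-1} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad
S^{-1} A S = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} 3 & 1 \\ 3 & -1 \end{bmatrix}
= \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}.
```

Dividing each column of S by √2 gives unit eigenvectors, and for that rescaled matrix the inverse equals the transpose, which is the symmetric case S^{-1} = S^T.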
May I end with this thought for professors. You might feel that the direction is right, and wonder if your students are ready. Just give them a chance! Literally thousands of students have written to me, frequently with suggestions and surprisingly often with thanks. They know when the course has a purpose, because the professor and the book are on their side. Linear algebra is a fantastic subject, enjoy it.
Acknowledgements
This book owes a big debt to readers everywhere. Thousands of students and colleagues have been involved in every step. I have not forgotten the warm welcome for the first sentence written 30 years ago: "I believe that the teaching of linear algebra has become too abstract." A less formal approach is now widely accepted as the right choice for the basic course. And this course has steadily improved - the homework problems, the lectures, the Worked Examples, even the Web. I really hope you see that linear algebra is not some optional elective, it is needed. The first step in all subjects is linear!
I owe a particular debt to friends who offered suggestions and corrections and ideas. David Arnold in California and Mike Kerckhove in Virginia teach this course well. Per-Olof Persson created MATLAB codes for the experiments, as Cleve Moler and Steven Lee did earlier for the Teaching Codes. And the Pascal matrix examples, in the book and on the Web, owe a lot to Alan Edelman (and a little to Pascal). It is just a pleasure to work with friends.
My deepest thanks of all go to Cordula Robinson and Brett Coonley. They created the LaTeX pages that you see. Day after day, new words and examples have gone back and forth across the hall. After 2000 problems (and 3000 attempted solutions) this expression of my gratitude to them is almost the last sentence, of work they have beautifully done.
Amy Hendrickson of texnology.com produced the book itself, and you will recognize the quality of her ideas. My favorites are the clear boxes that highlight key points. The quilt on the front cover was created by Chris Curtis (it appears in Great American Quilts: Book 5, by Oxmoor House). Those houses show nine linear transformations of the plane. (At least they are linear in Figure 7.1, possibly superlinear in the quilt.) Tracy Baldwin has succeeded again to combine art and color and mathematics, in her fourth neat cover for Wellesley-Cambridge Press.
May I dedicate this book to grandchildren who are very precious: Roger, Sophie, Kathryn, Alexander, Scott, Jack, William, Caroline, and Elizabeth. I hope you might take linear algebra one day. Especially I hope you like it. The author is proud of you.
1
INTRODUCTION TO VECTORS
The heart of linear algebra is in two operations - both with vectors. We add vectors to get v + w. We multiply by numbers c and d to get cv and dw. Combining those two operations (adding cv to dw) gives the linear combination cv + dw.
Linear combinations are all-important in this subject! Sometimes we want one particular combination, a specific choice of c and d that produces a desired cv + dw. Other times we want to visualize all the combinations (coming from all c and d). The vectors cv lie along a line. The combinations cv + dw normally fill a two-dimensional plane. (I have to say "two-dimensional" because linear algebra allows higher-dimensional planes.) From four vectors u, v, w, z in four-dimensional space, their combinations are likely to fill the whole space.
Chapter 1 explains these central ideas, on which everything builds. We start with two-dimensional vectors and three-dimensional vectors, which are reasonable to draw. Then we move into higher dimensions. The really impressive feature of linear algebra is how smoothly it takes that step into n-dimensional space. Your mental picture stays completely correct, even if drawing a ten-dimensional vector is impossible.
This is where the book is going (into n-dimensional space), and the first steps are the operations in Sections 1.1 and 1.2:
1.1 Vector addition v + w and linear combinations cv + dw.
1.2 The dot product v · w and the length ||v|| = √(v · v).
VECTORS AND LINEAR COMBINATIONS ■ 1.1
"You can't add apples and oranges." In a strange way, this is the reason for vectors! If we keep the number of apples separate from the number of oranges, we have a pair of numbers. That pair is a two-dimensional vector v, with "components" v1 and v2:
v1 = number of apples
v2 = number of oranges.
We write v as a column vector. The main point so far is to have a single letter v (in boldface italic) for this pair of numbers v1 and v2 (in lightface italic).
Even if we don't add v1 to v2, we do add vectors. The first components of v and w stay separate from the second components:
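VECTOR ADDITION
$$
v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
\quad\text{and}\quad
w = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}
\quad\text{add to}\quad
v + w = \begin{bmatrix} v_1 + w_1 \\ v_2 + w_2 \end{bmatrix}.
$$
You see the reason. We want to add apples to apples. Subtraction of vectors follows the same idea: The components of v - w are v1 - w1 and _ _ .
The other basic operation is scalar multiplication. Vectors can be multiplied by 2 or by -1 or by any number c. There are two ways to double a vector. One way is to add v + v. The other way (the usual way) is to multiply each component by 2:
SCALAR MULTIPLICATION
$$
2v = \begin{bmatrix} 2v_1 \\ 2v_2 \end{bmatrix}
\quad\text{and}\quad
-v = \begin{bmatrix} -v_1 \\ -v_2 \end{bmatrix}.
$$
The components of cv are cv1 and cv2. The number c is called a "scalar".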
Notice that the sum of -v and v is the zero vector. This is 0, which is not the same as the number zero! The vector 0 has components 0 and 0. Forgive me for hammering away at the difference between a vector and its components. Linear algebra is built on these operations v + w and cv - adding vectors and multiplying by scalars.
The order of addition makes no difference: v + w equals w + v. Check that by algebra: The first component is v1 + w1 which equals w1 + v1. Check also by an example:
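$$
v + w = \begin{bmatrix} 1 \\ 5 \end{bmatrix} + \begin{bmatrix} 3 \\ 3 \end{bmatrix}
= \begin{bmatrix} 4 \\ 8 \end{bmatrix}
= \begin{bmatrix} 3 \\ 3 \end{bmatrix} + \begin{bmatrix} 1 \\ 5 \end{bmatrix} = w + v.
$$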
Linear Combinations
By combining these operations, we now form "linear combinations" of v and w. Multiply v by c and multiply w by d; then add cv + dw.
DEFINITION The sum of cv and dw is a linear combination of v and w.
Four special linear combinations are: sum, difference, zero, and a scalar multiple cv:
1v + 1w = sum of vectors in Figure 1.1
1v - 1w = difference of vectors in Figure 1.1
0v + 0w = zero vector
cv + 0w = vector cv in the direction of v
The zero vector is always a possible combination (when the coefficients are zero). Every time we see a "space" of vectors, that zero vector will be included. It is this big view, taking all the combinations of v and w, that makes the subject work.
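The four special combinations can be computed directly. Here is a minimal sketch in Python, using the vectors v = (4, 2) and w = (-1, 2) drawn in Figure 1.1; the helper names `add`, `scale`, and `combine` are illustrative assumptions, and vectors are stored as plain tuples.

```python
def add(v, w):
    """Vector addition, one component at a time."""
    return tuple(vi + wi for vi, wi in zip(v, w))


def scale(c, v):
    """Scalar multiplication: every component is multiplied by c."""
    return tuple(c * vi for vi in v)


def combine(c, v, d, w):
    """The linear combination cv + dw."""
    return add(scale(c, v), scale(d, w))


v = (4, 2)
w = (-1, 2)

total = combine(1, v, 1, w)        # sum v + w
difference = combine(1, v, -1, w)  # difference v - w
zero = combine(0, v, 0, w)         # zero vector
multiple = combine(3, v, 0, w)     # 3v, in the direction of v

print(total, difference, zero, multiple)
```

The same three functions work unchanged for vectors with three or more components, because each one acts a component at a time.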
The figures show how you can visualize vectors. For algebra, we just need the components (like 4 and 2). In the plane, that vector v is represented by an arrow. The arrow goes v1 = 4 units to the right and v2 = 2 units up. It ends at the point whose x, y coordinates are 4, 2. This point is another representation of the vector - so we have three ways to describe v, by an arrow or a point or a pair of numbers.
Using arrows, you can see how to visualize the sum v + w:
Vector addition (head to tail) At the end of v, place the start of w.
We travel along v and then along w. Or we take the shortcut along v + w. We could also go along w and then v. In other words, w + v gives the same answer as v + w. These are different ways along the parallelogram (in this example it is a rectangle). The endpoint in Figure 1.1 is the diagonal v + w which is also w + v.
Figure 1.1 Vector addition v + w produces the diagonal of a parallelogram. The linear combination on the right is v - w.
The zero vector has v1 = 0 and v2 = 0. It is too short to draw a decent arrow, but you know that v + 0 = v. For 2v we double the length of the arrow. We reverse its direction for -v. This reversing gives the subtraction on the right side of Figure 1.1.
Figure 1.2 The arrow usually starts at the origin (0, 0); cv is always parallel to v.
Vectors in Three Dimensions
A vector with two components corresponds to a point in the xy plane. The components of v are the coordinates of the point: x = v1 and y = v2. The arrow ends at this point (v1, v2), when it starts from (0, 0). Now we allow vectors to have three components (v1, v2, v3). The xy plane is replaced by three-dimensional space.
Here are typical vectors (still column vectors but with three components):
v = [...] and w = [...] and v + w = [...]
The vector v corresponds to an arrow in 3-space. Usually the arrow starts at the origin, where the xyz axes meet and the coordinates are (0, 0, 0). The arrow ends at the point with coordinates v1, v2, v3. There is a perfect match between the column vector and the arrow from the origin and the point where the arrow ends.
$$
\text{From now on}\quad v = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
\quad\text{is also written as}\quad v = (1, 2, 2).
$$
The reason for the row form (in parentheses) is to save space. But v = (1, 2, 2) is not a row vector! It is in actuality a column vector, just temporarily lying down. The row vector [1 2 2] is absolutely different, even though it has the same three components. It is the "transpose" of the column v.
Figure 1.3 Vectors [x; y] and [x; y; z] (written as columns) correspond to points (x, y) and (x, y, z).
In three dimensions, v + w is still done a component at a time. The sum has components v1 + w1 and v2 + w2 and v3 + w3. You see how to add vectors in 4 or 5 or n dimensions. When w starts at the end of v, the third side is v + w. The other way around the parallelogram is w + v. Question: Do the four sides all lie in the same plane? Yes. And the sum v + w - v - w goes completely around to produce _ _ .
A typical linear combination of three vectors in three dimensions is u + 4v - 2w:
The Important Questions
For one vector u, the only linear combinations are the multiples cu. For two vectors, the combinations are cu + dv. For three vectors, the combinations are cu + dv + ew. Will you take the big step from one linear combination to all linear combinations? Every c and d and e are allowed. Suppose the vectors u, v, w are in three-dimensional space:
1 What is the picture of all combinations cu?
2 What is the picture of all combinations cu + dv?
3 What is the picture of all combinations cu + dv + ew?
The answers depend on the particular vectors u, v, and w. If they were all zero vectors (a very extreme case), then every combination would be zero. If they are typical nonzero vectors (components chosen at random), here are the three answers. This is the key to our subject:
1 The combinations cu fill a line.
2 The combinations cu + dv fill a plane.
3 The combinations cu + dv + ew fill three-dimensional space.
The line is infinitely long, in the direction of u (forward and backward, going through the zero vector). It is the plane of all cu + dv (combining two lines) that I especially ask you to think about.
Adding all cu on one line to all dv on the other line fills in the plane in Figure 1.4.
Figure 1.4 (a) The line through u. (b) The plane containing the lines through u and v.
When we include a third vector w, the multiples ew give a third line. Suppose that line is not in the plane of u and v. Then combining all ew with all cu + dv fills up the whole three-dimensional space.
This is the typical situation! Line, then plane, then space. But other possibilities exist. When w happens to be cu + dv, the third vector is in the plane of the first two. The combinations of u, v, w will not go outside that uv plane. We do not get the full three-dimensional space. Please think about the special cases in Problem 1.
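Whether three given vectors reach the whole space or stay inside a plane can be tested numerically. This sketch borrows a tool from later in the book, the 3 by 3 determinant (Chapter 5), which is an assumption brought in here and not a method of this section: the combinations cu + dv + ew fill three-dimensional space exactly when the determinant of the matrix with columns u, v, w is not zero. The function names and the sample vectors are hypothetical, and integer components are used so that the zero test is exact.

```python
def triple_det(u, v, w):
    """Determinant of the 3 by 3 matrix whose columns are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def fills_space(u, v, w):
    """True when the combinations cu + dv + ew fill all of three-dimensional space."""
    return triple_det(u, v, w) != 0


u = (1, 0, 0)
v = (0, 1, 0)

typical = fills_space(u, v, (0, 0, 1))   # third line leaves the uv plane
special = fills_space(u, v, (2, 3, 0))   # w = 2u + 3v stays in the uv plane

print(typical, special)
```

With floating-point components the comparison with zero should be replaced by a small tolerance.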
■ REVIEW OF THE KEY IDEAS ■
1. A vector v in two-dimensional space has two components v1 and v2.
2. v + w = (v1 + w1, v2 + w2) and cv = (cv1, cv2) are executed a component at a time.
3. A linear combination of u and v and w is cu + dv + ew.
4. Take all linear combinations of u, or u and v, or u and v and w. In three dimensions, those combinations typically fill a line, a plane, and the whole space.
■ WORKED EXAMPLES ■
1.1 A Describe all the linear combinations of v = (1, 1, 0) and w = (0, 1, 1). Find a vector that is not a combination of v and w.
Solution These are vectors in three-dimensional space R^3. Their combinations cv + dw fill a plane in R^3. The vectors in that plane allow any c and d:
$$
cv + dw = c \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + d \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} c \\ c + d \\ d \end{bmatrix}.
$$
Four particular vectors in that plane are (0, 0, 0) and (2, 3, 1) and (5, 7, 2) and (√2, 0, -√2). The second component is always the sum of the first and third components. The vector (1, 1, 1) is not in the plane.
Another description of this plane through (0, 0, 0) is to know a vector perpendicular to the plane. In this case n = (1, -1, 1) is perpendicular, as Section 1.2 will confirm by testing dot products: v · n = 0 and w · n = 0.
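The claims in this solution can be checked by machine. Below is a minimal sketch in Python using the vectors v, w, and n from the example; the helper names `dot`, `combination`, and `in_plane` and the tolerance `tol` are assumptions made for this illustration. A vector x lies in the plane exactly when x · n = 0, which is the same as saying its second component equals the sum of the first and third.

```python
import math

v = (1, 1, 0)
w = (0, 1, 1)
n = (1, -1, 1)  # perpendicular to the plane, as stated in the solution


def dot(a, b):
    """Multiply matching components, then add."""
    return sum(x * y for x, y in zip(a, b))


def combination(c, d):
    """The vector cv + dw, which works out to (c, c + d, d)."""
    return tuple(c * vi + d * wi for vi, wi in zip(v, w))


def in_plane(x, tol=1e-12):
    """True when x is perpendicular to n, allowing for roundoff."""
    return abs(dot(x, n)) <= tol


root2 = math.sqrt(2)
samples = [(0, 0, 0), (2, 3, 1), (5, 7, 2), (root2, 0, -root2)]

print([in_plane(x) for x in samples])
print(in_plane((1, 1, 1)))
print(combination(2, 1), combination(5, 2))
```

The tolerance matters only for the vector containing √2; the integer vectors are tested exactly.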
1.1 B For v = (1, 0) and w = (0, 1), describe all the points cv and all the combinations cv + dw with any d and (1) whole numbers c (2) nonnegative c ≥ 0.
Solution
(1) The vectors cv = (c, 0) with whole numbers c are equally spaced points along the x axis (the direction of v). They include (-2, 0), (-1, 0), (0, 0), (1, 0), (2, 0). Adding all vectors dw = (0, d) puts a full line in the y direction through those points. We have infinitely many parallel lines from cv + dw = (whole number, any number). These are vertical lines in the xy plane, through equally spaced points on the x axis.
(2) The vectors cv with c ≥ 0 fill a "half-line". It is the positive x axis, starting at (0, 0) where c = 0. It includes (π, 0) but not (-π, 0). Adding all vectors dw puts a full line in the y direction crossing every point on that half-line. Now we have a half-plane. It is the right half of the xy plane, where x ≥ 0.
Problem Set 1.1
Problems 1-9 are about addition of vectors and linear combinations.
1 Describe geometrically (as a line, plane, ...) all linear combinations of
2 Draw the vectors v = [...] and w = [...] and v + w and v - w in a single xy plane.
3 If v + w = [...] and v - w = [...], compute and draw v and w.
4 From v = [...] and w = [...], find the components of 3v + w and v - 3w and cv + dw.
5 Compute u + v and u + v + w and 2u + 2v + w when
6 Every combination of v = (1, -2, 1) and w = (0, 1, -1) has components that add to _ _ . Find c and d so that cv + dw = (4, 2, -6).
7 In the xy plane mark all nine of these linear combinations:
8 The parallelogram in Figure 1.1 has diagonal v + w. What is its other diagonal? What is the sum of the two diagonals? Draw that vector sum.
9 If three corners of a parallelogram are (1, 1), (4, 2), and (1, 3), what are all the possible fourth corners? Draw two of them.
Figure 1.5 Unit cube from i, j, k; twelve clock vectors.
Problems 10-14 are about special vectors on cubes and clocks.
10 Copy the cube and draw the vector sum of i = (1, 0, 0) and j = (0, 1, 0) and k = (0, 0, 1). The addition i + j yields the diagonal of _ _ .
11 Four corners of the cube are (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1). What are the other four corners? Find the coordinates of the center point of the cube. The center points of the six faces are _ _ .
12 How many corners does a cube have in 4 dimensions? How many faces? How many edges? A typical corner is (0, 0, 1, 0).
13 (a) What is the sum V of the twelve vectors that go from the center of a clock to the hours 1:00, 2:00, ..., 12:00?
(b) If the vector to 4:00 is removed, find the sum of the eleven remaining vectors.
(c) What is the unit vector to 1:00?
14 Suppose the twelve vectors start from 6:00 at the bottom instead of (0, 0) at the center. The vector to 12:00 is doubled to 2j = (0, 2). Add the new twelve vectors.
Problems 15-19 go further with linear combinations of v and w (Figure 1.6)
15 The figure shows ½v + ½w. Mark the points ¾v + ¾w and ¼v + ¼w and v + w.
16 Mark the point -v + 2w and any other combination cv + dw with c + d = 1. Draw the line of all combinations that have c + d = 1.
17 Locate ½v + ½w and ⅔v + ⅔w. The combinations cv + cw fill out what line? Restricted by c ≥ 0 those combinations with c = d fill out what half line?
18 Restricted by 0 ≤ c ≤ 1 and 0 ≤ d ≤ 1, shade in all combinations cv + dw.
19 Restricted only by c ≥ 0 and d ≥ 0 draw the "cone" of all combinations cv + dw.
Problems 20-27 deal with u, v, w in three-dimensional space (see Figure 1.6).
20 Locate ⅓u + ⅓v + ⅓w and ½u + ½w in the dashed triangle. Challenge problem: Under what restrictions on c, d, e, will the combinations cu + dv + ew fill in the dashed triangle?
21 The three sides of the dashed triangle are v - u and w - v and u - w. Their sum is _ _ . Draw the head-to-tail addition around a plane triangle of (3, 1) plus (-1, 1) plus (-2, -2).
22 Shade in the pyramid of combinations cu + dv + ew with c ≥ 0, d ≥ 0, e ≥ 0 and c + d + e ≤ 1. Mark the vector ½(u + v + w) as inside or outside this pyramid.
Figure 1.6 Problems 15-19 in a plane. Problems 20-27 in 3-dimensional space.
23 If you look at all combinations of those u, v, and w, is there any vector that can't be produced from cu + dv + ew?
24 Which vectors are combinations of u and v, and also combinations of v and w?
25 Draw vectors u, v, w so that their combinations cu + dv + ew fill only a line. Draw vectors u, v, w so that their combinations cu + dv + ew fill only a plane.
26 What combination of the vectors [...] and [...] produces [...]? Express this question as two equations for the coefficients c and d in the linear combination.
27 Review Question. In xyz space, where is the plane of all linear combinations of i = (1, 0, 0) and j = (0, 1, 0)?
28 If (a, b) is a multiple of (c, d) with abcd ≠ 0, show that (a, c) is a multiple of (b, d). This is surprisingly important: call it a challenge question. You could use numbers first to see how a, b, c, d are related. The question will lead to:
$$
\text{If } A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \text{ has dependent rows then it has dependent columns.}
$$
$$
\text{And eventually: If } AB = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \text{ then } BA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
$$
That looks so simple...
LENGTHS AND DOT PRODUCTS ■ 1.2
The first section mentioned multiplication of vectors, but it backed off. Now we go forward to define the "dot product" of v and w. This multiplication involves the separate products v1w1 and v2w2, but it doesn't stop there. Those two numbers are added to produce the single number v · w.
DEFINITION The dot product or inner product of v = (v1, v2) and w = (w1, w2) is the number
$$
v \cdot w = v_1 w_1 + v_2 w_2. \tag{1}
$$
Example 1 The vectors v = (4, 2) and w = (-1, 2) have a zero dot product:
$$
\begin{bmatrix} 4 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} -1 \\ 2 \end{bmatrix} = -4 + 4 = 0.
$$
In mathematics, zero is always a special number. For dot products, it means that these two vectors are perpendicular. The angle between them is 90°. When we drew them in Figure 1.1, we saw a rectangle (not just any parallelogram). The clearest example of perpendicular vectors is i = (1, 0) along the x axis and j = (0, 1) up the y axis. Again the dot product is i · j = 0 + 0 = 0. Those vectors i and j form a right angle.
The dot product of v = (1, 2) and w = (2, 1) is 4. Please check this. Soon that will reveal the angle between v and w (not 90°).
Example 2 Put a weight of 4 at the point x = -1 and a weight of 2 at the point x = 2. The x axis will balance on the center point x = 0 (like a see-saw). The weights balance because the dot product is (4)(-1) + (2)(2) = 0.
This example is typical of engineering and science. The vector of weights is $(w_1, w_2) = (4, 2)$. The vector of distances from the center is $(v_1, v_2) = (-1, 2)$. The weights times the distances, $w_1 v_1$ and $w_2 v_2$, give the "moments". The equation for the see-saw to balance is $w_1 v_1 + w_2 v_2 = 0$.
The dot product $w \cdot v$ equals $v \cdot w$. The order of v and w makes no difference.
Example 3 Dot products enter in economics and business. We have three products to buy and sell. Their prices are $(p_1, p_2, p_3)$ for each unit: this is the "price vector" p. The quantities we buy or sell are $(q_1, q_2, q_3)$, positive when we sell, negative when we buy. Selling $q_1$ units of the first product at the price $p_1$ brings in $q_1 p_1$. The total income is the dot product $q \cdot p$:
$$\text{Income} = (q_1, q_2, q_3) \cdot (p_1, p_2, p_3) = q_1 p_1 + q_2 p_2 + q_3 p_3.$$
A zero dot product means that "the books balance." Total sales equal total purchases if $q \cdot p = 0$. Then p is perpendicular to q (in three-dimensional space). With three products, the vectors are three-dimensional. A supermarket goes quickly into high dimensions.
Small note: Spreadsheets have become essential in management. They compute linear combinations and dot products. What you see on the screen is a matrix.
Main point To compute the dot product $v \cdot w$, multiply each $v_i$ times $w_i$. Then add.
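The main point can be tried out in a few lines of Python. This is only a sketch: the function name `dot` and the prices and quantities in the last example are made up for illustration, they are not taken from the text.

```python
def dot(v, w):
    """Dot product: multiply matching components, then add the products."""
    if len(v) != len(w):
        raise ValueError("v and w need the same number of components")
    return sum(vi * wi for vi, wi in zip(v, w))

# Example 1 and Example 2 use the same numbers: (4, 2) and (-1, 2).
balance = dot([4, 2], [-1, 2])

# The order of the two vectors makes no difference.
same_either_way = dot([1, 2], [2, 1]) == dot([2, 1], [1, 2])

# Example 3 with hypothetical prices and quantities:
# sell 10 units at price 3, buy 5 units at price 2, buy 4 units at price 5.
prices = [3, 2, 5]
quantities = [10, -5, -4]
income = dot(quantities, prices)

print(balance, same_either_way, income)  # 0 True 0
```

With these invented numbers the income is 30 - 10 - 20 = 0, so "the books balance" and the quantity vector is perpendicular to the price vector.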
Lengths and Unit Vectors
An important case is the dot product of a vector with itself. In this case v = w. When the vector is v = (1, 2, 3), the dot product with itself is $v \cdot v = 14$:
$$v \cdot v = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = 1 + 4 + 9 = 14.$$
The answer is not zero because v is not perpendicular to itself. Instead of a 90° angle between vectors we have 0°. The dot product v • v gives the length of v squared.
DEFINITION The length (or norm) of a vector v is the square root of $v \cdot v$:
$$\text{length} = \|v\| = \sqrt{v \cdot v}.$$
In two dimensions the length is $\sqrt{v_1^2 + v_2^2}$. In three dimensions it is $\sqrt{v_1^2 + v_2^2 + v_3^2}$. By the calculation above, the length of v = (1, 2, 3) is $\|v\| = \sqrt{14}$.
We can explain this definition. $\|v\|$ is just the ordinary length of the arrow that represents the vector. In two dimensions, the arrow is in a plane. If the components are 1 and 2, the arrow is the third side of a right triangle (Figure 1.7). The formula $a^2 + b^2 = c^2$, which connects the three sides, is $1^2 + 2^2 = \|v\|^2$.
For the length of v = (1, 2, 3), we used the right triangle formula twice. The vector (1, 2, 0) in the base has length $\sqrt{5}$. This base vector is perpendicular to (0, 0, 3) that goes straight up. So the diagonal of the box has length $\|v\| = \sqrt{5 + 9} = \sqrt{14}$.
The length of a four-dimensional vector would be $\sqrt{v_1^2 + v_2^2 + v_3^2 + v_4^2}$. Thus (1, 1, 1, 1) has length $\sqrt{1^2 + 1^2 + 1^2 + 1^2} = 2$. This is the diagonal through a unit cube in four-dimensional space. The diagonal in n dimensions has length $\sqrt{n}$.
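The length formula works the same way in any number of dimensions, which a short Python sketch can confirm. The helper name `length` is an assumption made here for illustration.

```python
import math

def length(v):
    """Length (norm) of v: the square root of v . v."""
    return math.sqrt(sum(x * x for x in v))

print(length([1, 2]))        # the square root of 5
print(length([1, 2, 3]))     # the square root of 14
print(length([1, 1, 1, 1]))  # 2.0, the diagonal of the unit cube in 4 dimensions
print(length([1] * 9))       # 3.0, the square root of n = 9
```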
Figure 1.7 The length $\sqrt{v \cdot v}$ of two-dimensional and three-dimensional vectors. The vector (1, 2) has length $\sqrt{5}$. The vector (1, 2, 3) has length $\sqrt{14}$ and its base (1, 2, 0) has length $\sqrt{5}$.
The word "unit" always indicates that some measurement equals "one." The unit price is the price for one item. A unit cube has sides of length one. A unit circle is a circle with radius one. Now we define the idea of a "unit vector."
DEFINITION A unit vector u is a vector whose length equals one. Then $u \cdot u = 1$.
An example in four dimensions is $u = (\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2})$. Then $u \cdot u$ is $\frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = 1$. We divided v = (1, 1, 1, 1) by its length $\|v\| = 2$ to get this unit vector.
Example 4 The standard unit vectors along the x and y axes are written i and j. In the xy plane, the unit vector that makes an angle "theta" with the x axis is $(\cos\theta, \sin\theta)$:
$$\text{Unit vectors} \quad i = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad j = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \quad\text{and}\quad u = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}.$$
When $\theta = 0$, the horizontal vector u is i. When $\theta = 90^\circ$ (or $\frac{\pi}{2}$ radians), the vertical vector is j. At any angle, the components $\cos\theta$ and $\sin\theta$ produce $u \cdot u = 1$ because $\cos^2\theta + \sin^2\theta = 1$. These vectors reach out to the unit circle in Figure 1.8. Thus $\cos\theta$ and $\sin\theta$ are simply the coordinates of that point at angle $\theta$ on the unit circle.
In three dimensions, the unit vectors along the axes are i, j, and k. Their components are (1, 0, 0) and (0, 1, 0) and (0, 0, 1). Notice how every three-dimensional vector is a linear combination of i, j, and k. The vector v = (2, 2, 1) is equal to 2i + 2j + k. Its length is $\sqrt{2^2 + 2^2 + 1^2}$. This is the square root of 9, so $\|v\| = 3$.
Since (2, 2, 1) has length 3, the vector $(\frac{2}{3}, \frac{2}{3}, \frac{1}{3})$ has length 1. Check that $u \cdot u = \frac{4}{9} + \frac{4}{9} + \frac{1}{9} = 1$. To create a unit vector, just divide v by its length $\|v\|$.
1A Unit vectors Divide any nonzero vector v by its length. Then $u = v/\|v\|$ is a unit vector in the same direction as v.
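Rule 1A translates directly into code. The sketch below assumes a helper called `unit_vector` (a name chosen here, not one from the book) and refuses the zero vector, which has no direction. The 30° angle is an arbitrary choice.

```python
import math

def unit_vector(v):
    """Return v / ||v||, a unit vector in the same direction as v."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("the zero vector has no direction to normalize")
    return [x / norm for x in v]

u = unit_vector([2, 2, 1])            # components 2/3, 2/3, 1/3
u_dot_u = sum(x * x for x in u)       # equals 1 up to rounding error

theta = math.radians(30)              # an arbitrary angle, for illustration
c = [math.cos(theta), math.sin(theta)]
c_dot_c = sum(x * x for x in c)       # cos^2 + sin^2, again 1 up to rounding

print(u, u_dot_u, c_dot_c)
```

Floating point arithmetic may give a value like 0.9999999999999999 instead of exactly 1, so a comparison with a small tolerance is safer than testing for equality.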
Figure 1.8 The coordinate vectors i and j. The unit vector u at angle 45° (left) and the unit vector $(\cos\theta, \sin\theta)$ at angle $\theta$.
The Angle Between Two Vectors
We stated that perpendicular vectors have $v \cdot w = 0$. The dot product is zero when the angle is 90°. To explain this, we have to connect angles to dot products. Then we show how $v \cdot w$ finds the angle between any two nonzero vectors v and w.
1B Right angles The dot product is $v \cdot w = 0$ when v is perpendicular to w.
Proof When v and w are perpendicular, they form two sides of a right triangle. The third side is v - w (the hypotenuse going across in Figure 1.7). The Pythagoras Law for the sides of a right triangle is $a^2 + b^2 = c^2$:
$$\text{Perpendicular vectors} \quad \|v\|^2 + \|w\|^2 = \|v - w\|^2 \tag{2}$$
Writing out the formulas for those lengths in two dimensions, this equation is
$$(v_1^2 + v_2^2) + (w_1^2 + w_2^2) = (v_1 - w_1)^2 + (v_2 - w_2)^2. \tag{3}$$
The right side begins with $v_1^2 - 2v_1 w_1 + w_1^2$. Then $v_1^2$ and $w_1^2$ are on both sides of the equation and they cancel, leaving $-2v_1 w_1$. Similarly $v_2^2$ and $w_2^2$ cancel, leaving $-2v_2 w_2$. (In three dimensions there would also be $-2v_3 w_3$.) The last step is to divide by -2:
$$0 = -2v_1 w_1 - 2v_2 w_2 \quad\text{which leads to}\quad v_1 w_1 + v_2 w_2 = 0. \tag{4}$$
Conclusion Right angles produce $v \cdot w = 0$. We have proved Theorem 1B. The dot product is zero when the angle is $\theta = 90^\circ$. Then $\cos\theta = 0$. The zero vector v = 0 is perpendicular to every vector w because $0 \cdot w$ is always zero.
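The cancellation in the proof says that $\|v\|^2 + \|w\|^2 - \|v - w\|^2$ always equals $2\, v \cdot w$, so it is zero exactly when the vectors are perpendicular. A small Python check of that identity, with the helper names `dot` and `pythagoras_gap` invented for this sketch:

```python
def dot(v, w):
    """Multiply matching components and add."""
    return sum(a * b for a, b in zip(v, w))

def pythagoras_gap(v, w):
    """||v||^2 + ||w||^2 - ||v - w||^2, which the proof shows equals 2 (v . w)."""
    difference = [a - b for a, b in zip(v, w)]
    return dot(v, v) + dot(w, w) - dot(difference, difference)

print(pythagoras_gap([4, 2], [-1, 2]))   # 0: these vectors are perpendicular
print(pythagoras_gap([3, 1], [1, 3]))    # 12, which is 2 times the dot product 6
```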
Figure 1.9 Perpendicular vectors have $v \cdot w = 0$. The angle is below 90° when $v \cdot w > 0$.
Now suppose $v \cdot w$ is not zero. It may be positive, it may be negative. The sign of $v \cdot w$ immediately tells whether we are below or above a right angle. The angle is less than 90° when $v \cdot w$ is positive. The angle is above 90° when $v \cdot w$ is negative. Figure 1.9 shows a typical vector v = (3, 1). The angle with w = (1, 3) is less than 90°.
The borderline is where vectors are perpendicular to v. On that dividing line between plus and minus, where we find w = (1, -3), the dot product is zero.
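The sign test needs no trigonometry at all. As a sketch (the function name `angle_side` and its return strings are choices made here for illustration), the three cases look like this:

```python
def angle_side(v, w):
    """Classify the angle between v and w from the sign of v . w."""
    d = sum(a * b for a, b in zip(v, w))
    if d > 0:
        return "below 90 degrees"
    if d < 0:
        return "above 90 degrees"
    return "perpendicular"

v = [3, 1]
print(angle_side(v, [1, 3]))    # below 90 degrees
print(angle_side(v, [1, -3]))   # perpendicular
print(angle_side(v, [-1, -3]))  # above 90 degrees
```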
The next page takes one more step, to find the exact angle $\theta$. This is not necessary for linear algebra; you could stop here! Once we have matrices and linear equations, we won't come back to $\theta$. But while we are on the subject of angles, this is the place for the formula.
Start with unit vectors u and U. The sign of $u \cdot U$ tells whether $\theta < 90^\circ$ or $\theta > 90^\circ$. Because the vectors have length 1, we learn more than that. The dot product $u \cdot U$ is the cosine of $\theta$. This is true in any number of dimensions.
1C If u and U are unit vectors then $u \cdot U = \cos\theta$. Certainly $|u \cdot U| \le 1$.
Remember that $\cos\theta$ is never greater than 1. It is never less than -1. The dot product of unit vectors is between -1 and 1.
Figure 1.10 shows this clearly when the vectors are $u = (\cos\theta, \sin\theta)$ and i = (1, 0). The dot product is $u \cdot i = \cos\theta$. That is the cosine of the angle between them.
After rotation through any angle $\alpha$, these are still unit vectors. Call the vectors $u = (\cos\beta, \sin\beta)$ and $U = (\cos\alpha, \sin\alpha)$. Their dot product is $\cos\alpha\cos\beta + \sin\alpha\sin\beta$. From trigonometry this is the same as $\cos(\beta - \alpha)$. Since $\beta - \alpha$ equals $\theta$ (no change in the angle between them) we have reached the formula $u \cdot U = \cos\theta$.
Problem 26 proves $|u \cdot U| \le 1$ directly, without mentioning angles. The inequality and the cosine formula $u \cdot U = \cos\theta$ are always true for unit vectors.
What if v and w are not unit vectors? Divide by their lengths to get $u = v/\|v\|$ and $U = w/\|w\|$. Then the dot product of those unit vectors u and U gives $\cos\theta$.
Figure 1.10 The dot product of unit vectors is the cosine of the angle $\theta$. Left: $u = (\cos\theta, \sin\theta)$ and i = (1, 0) with $u \cdot i = \cos\theta$. Right: $u = (\cos\beta, \sin\beta)$ and $U = (\cos\alpha, \sin\alpha)$ with $\theta = \beta - \alpha$.
Whatever the angle, this dot product of $v/\|v\|$ with $w/\|w\|$ never exceeds one. That is the "Schwarz inequality" for dot products, or more correctly the Cauchy-Schwarz-Buniakowsky inequality. It was found in France and Germany and Russia (and maybe elsewhere; it is the most important inequality in mathematics). With the division by $\|v\|\,\|w\|$ from rescaling to unit vectors, we have $\cos\theta$:
1D (a) COSINE FORMULA If v and w are nonzero vectors then $\dfrac{v \cdot w}{\|v\|\,\|w\|} = \cos\theta$.
(b) SCHWARZ INEQUALITY If v and w are any vectors then $|v \cdot w| \le \|v\|\,\|w\|$.
Example 5 Find $\cos\theta$ for $v = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$ and $w = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$ in Figure 1.9b.
Solution The dot product is $v \cdot w = 6$. Both v and w have length $\sqrt{10}$. The cosine is
$$\cos\theta = \frac{v \cdot w}{\|v\|\,\|w\|} = \frac{6}{\sqrt{10}\,\sqrt{10}} = \frac{3}{5}.$$
The angle is below 90° because $v \cdot w = 6$ is positive. By the Schwarz inequality, $\|v\|\,\|w\| = 10$ is larger than $v \cdot w = 6$.
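The cosine formula 1D(a) can be sketched in Python as follows. The function name `cosine` is chosen for this sketch, and the clamping step is an extra precaution added here against rounding error; it is not part of the formula in the text.

```python
import math

def cosine(v, w):
    """cos(theta) = v . w / (||v|| ||w||) for nonzero vectors v and w."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    if norm_v == 0 or norm_w == 0:
        raise ValueError("the angle is not defined for the zero vector")
    # Rounding can push the ratio slightly outside [-1, 1], so clamp it.
    return max(-1.0, min(1.0, dot / (norm_v * norm_w)))

c = cosine([3, 1], [1, 3])                    # close to 3/5
theta_degrees = math.degrees(math.acos(c))    # roughly 53 degrees, below 90
print(c, theta_degrees)
```

The Schwarz inequality is what guarantees that the ratio lies between -1 and 1 in exact arithmetic, so `math.acos` always has a valid input.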
Example 6 The dot product of v = (a, b) and w = (b, a) is 2ab. Both lengths are $\sqrt{a^2 + b^2}$. The Schwarz inequality says that $2ab \le a^2 + b^2$.
Reason The difference between $a^2 + b^2$ and $2ab$ can never be negative:
$$a^2 + b^2 - 2ab = (a - b)^2 \ge 0.$$
This is more famous if we write $x = a^2$ and $y = b^2$. Then the "geometric mean" $\sqrt{xy}$ is not larger than the "arithmetic mean," which is the average $\frac{1}{2}(x + y)$:
$$ab \le \frac{a^2 + b^2}{2} \quad\text{becomes}\quad \sqrt{xy} \le \frac{x + y}{2}.$$
Notes on Computing
Write the components of v as v(1), ..., v(N) and similarly for w. In FORTRAN, the sum v + w requires a loop to add components separately. The dot product also loops to add the separate v(i)w(i):
DO 10 I = 1,N                      DO 10 I = 1,N
10 VPLUSW(I) = V(I)+W(I)           10 VDOTW = VDOTW + V(I) * W(I)
MATLAB works directly with whole vectors, not their components. No loop is needed. When v and w have been defined, v + w is immediately understood. It is printed unless the line ends in a semicolon. Input v and w as rows; the prime ' at the end transposes them to columns. The combination 2v + 3w uses * for multiplication.
v = [2 3 4]' ; w = [1 1 1]' ; u = 2*v + 3*w
The dot product $v \cdot w$ is usually seen as a row times a column (with no dot):
Instead of $\begin{bmatrix} 1 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 4 \end{bmatrix}$ we more often see $\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 3 \\ 4 \end{bmatrix}$ or v' * w
The length of v is already known to MATLAB as norm(v). We could define it ourselves as sqrt(v' * v), using the square root function, also known. The cosine we have to define ourselves! Then the angle (in radians) comes from the arc cosine (acos) function:
cosine = v' * w / (norm(v) * norm(w)); angle = acos(cosine)
An M-file would create a new function cosine(v, w) for future use. (Quite a few M-files have been created especially for this book. They are listed at the end.)
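For readers without FORTRAN or MATLAB, the two component loops can be imitated in Python. This is a sketch under assumptions: the function names and the sample vectors are chosen here, and Python lists count from 0 where FORTRAN and MATLAB count from 1.

```python
def vector_sum(v, w):
    """Componentwise sum with an explicit loop, like the first DO loop."""
    result = [0] * len(v)
    for i in range(len(v)):
        result[i] = v[i] + w[i]
    return result

def dot_product(v, w):
    """Running total of v[i] * w[i], like the second DO loop."""
    total = 0
    for i in range(len(v)):
        total = total + v[i] * w[i]
    return total

v = [2, 3, 4]
w = [1, 1, 1]
u = vector_sum([2 * x for x in v], [3 * x for x in w])  # the combination 2v + 3w
print(u, dot_product(v, w))  # [7, 9, 11] 9
```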
■ REVIEW OF THE KEY IDEAS ■
1. The dot product $v \cdot w$ multiplies each component $v_i$ by $w_i$ and adds the $v_i w_i$.
2. The length $\|v\|$ is the square root of $v \cdot v$.
3. The vector $v/\|v\|$ is a unit vector. Its length is 1.
4. The dot product is $v \cdot w = 0$ when v and w are perpendicular.
5. The cosine of $\theta$ (the angle between any nonzero v and w) never exceeds 1:
$$\cos\theta = \frac{v \cdot w}{\|v\|\,\|w\|} \qquad \text{Schwarz inequality} \quad |v \cdot w| \le \|v\|\,\|w\|.$$
■ WORKED EXAMPLES ■
1.2 A For the vectors v = (3, 4) and w = (4, 3) test the Schwarz inequality on $v \cdot w$ and the triangle inequality on $\|v + w\|$. Find $\cos\theta$ for the angle between v and w. When will we have equality $|v \cdot w| = \|v\|\,\|w\|$ and $\|v + w\| = \|v\| + \|w\|$?
Solution The dot product is $v \cdot w = (3)(4) + (4)(3) = 24$. The length of v is $\|v\| = \sqrt{9 + 16} = 5$ and also $\|w\| = 5$. The sum v + w = (7, 7) has length $\|v + w\| = 7\sqrt{2} \approx 9.9$.
Schwarz inequality $|v \cdot w| \le \|v\|\,\|w\|$ is 24 < 25.
Triangle inequality $\|v + w\| \le \|v\| + \|w\|$ is $7\sqrt{2} < 10$.
Cosine of angle $\cos\theta = \frac{24}{25}$ (Thin angle!)
If one vector is a multiple of the other as in w = -2v, then the angle is 0° or 180° and $|\cos\theta| = 1$ and $|v \cdot w|$ equals $\|v\|\,\|w\|$. If the angle is 0°, as in w = 2v, then $\|v + w\| = \|v\| + \|w\|$. The triangle is flat.
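The numbers in Worked Example 1.2 A can be checked mechanically. The following Python sketch uses helper names (`dot`, `norm`, `add`) that are assumptions of this illustration:

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    return math.sqrt(dot(v, v))

def add(v, w):
    return [a + b for a, b in zip(v, w)]

v, w = [3, 4], [4, 3]
schwarz_holds = abs(dot(v, w)) <= norm(v) * norm(w)      # 24 <= 25
triangle_holds = norm(add(v, w)) <= norm(v) + norm(w)    # 7*sqrt(2) <= 10
cos_theta = dot(v, w) / (norm(v) * norm(w))              # 24/25

# Equality case: w2 = 2v points in the same direction as v.
w2 = [6, 8]
flat_triangle = math.isclose(norm(add(v, w2)), norm(v) + norm(w2))

print(schwarz_holds, triangle_holds, cos_theta, flat_triangle)
```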
1.2 B Find a unit vector u in the direction of v = (3, 4). Find a unit vector U perpendicular to u. How many possibilities for U?
Solution For a unit vector u, divide v by its length $\|v\| = 5$. For a perpendicular vector V we can choose (-4, 3) since the dot product $v \cdot V$ is (3)(-4) + (4)(3) = 0. For a unit vector U, divide V by its length $\|V\|$:
$$u = \frac{v}{\|v\|} = \frac{(3, 4)}{5} = \left(\frac{3}{5}, \frac{4}{5}\right) \qquad U = \frac{V}{\|V\|} = \frac{(-4, 3)}{5} = \left(-\frac{4}{5}, \frac{3}{5}\right)$$
The only other perpendicular unit vector would be $-U = (\frac{4}{5}, -\frac{3}{5})$.
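In the plane, swapping the components of (a, b) and changing one sign gives the perpendicular vector (-b, a). A Python sketch of Worked Example 1.2 B along those lines, with the function name `perpendicular_units` invented here:

```python
import math

def perpendicular_units(v):
    """Both unit vectors perpendicular to a nonzero 2D vector v = (a, b)."""
    a, b = v
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("v must be nonzero")
    U = (-b / norm, a / norm)      # turn v through 90 degrees, then rescale
    return U, (-U[0], -U[1])       # the only other choice is -U

U, minus_U = perpendicular_units([3, 4])
print(U, minus_U)
```

This shortcut is special to two dimensions. In three dimensions a whole circle of unit vectors is perpendicular to a given v.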
Problem Set 1.2
1 Calculate the dot products $u \cdot v$ and $u \cdot w$ and $v \cdot w$ and $w \cdot v$:
2 Compute the lengths $\|u\|$ and $\|v\|$ and $\|w\|$ of those vectors. Check the Schwarz inequalities $|u \cdot v| \le \|u\|\,\|v\|$ and $|v \cdot w| \le \|v\|\,\|w\|$.
3 Find unit vectors in the directions of v and w in Problem 1, and the cosine of the angle $\theta$. Choose vectors that make 0°, 90°, and 180° angles with w.
4 Find unit vectors $u_1$ and $u_2$ in the directions of v = (3, 1) and w = (2, 1, 2). Find unit vectors $U_1$ and $U_2$ that are perpendicular to $u_1$ and $u_2$.
5 For any unit vectors v and w, find the dot products (actual numbers) of
(a) v and -v (b) v + w and v - w (c) v - 2w and v + 2w
6 Find the angle $\theta$ (from its cosine) between
(a) V =[~] and w = [~] (b)
(c) v=[J3] and w=[~] (d)
HJ and w =
[=~l and w =
7 (a) Describe every vector $w = (w_1, w_2)$ that is perpendicular to v = (2, -1).
(b) The vectors that are perpendicular to V = (1, 1, 1) lie on a __.
(c) The vectors that are perpendicular to (1, 1, 1) and (1, 2, 3) lie on a __.
8 True or false (give a reason if true or a counterexample if false):
(a) If u is perpendicular (in three dimensions) to v and w, then v and w are parallel.
(b) If u is perpendicular to v and w, then u is perpendicular to v + 2w.
(c) If u and v are perpendicular unit vectors then $\|u - v\| = \sqrt{2}$.
9 The slopes of the arrows from (0, 0) to $(v_1, v_2)$ and $(w_1, w_2)$ are $v_2/v_1$ and $w_2/w_1$. If the product $v_2 w_2 / v_1 w_1$ of those slopes is -1, show that $v \cdot w = 0$ and the vectors are perpendicular.
10 Draw arrows from (0, 0) to the points v = (1, 2) and w = (-2, 1). Multiply their slopes. That answer is a signal that $v \cdot w = 0$ and the arrows are __.
11 If $v \cdot w$ is negative, what does this say about the angle between v and w? Draw a 2-dimensional vector v (an arrow), and show where to find all w's with $v \cdot w < 0$.
12 With v = (1, 1) and w = (1, 5) choose a number c so that w - cv is perpendicular to v. Then find the formula that gives this number c for any nonzero v and w.
13 Find two vectors v and w that are perpendicular to (1, 0, 1) and to each other.
14 Find three vectors u, v, w that are perpendicular to (1, 1, 1, 1) and to each other.
15 The geometric mean of x = 2 and y = 8 is $\sqrt{xy} = 4$. The arithmetic mean is larger: $\frac{1}{2}(x + y) =$ __. This came in Example 6 from the Schwarz inequality for $v = (\sqrt{2}, \sqrt{8})$ and $w = (\sqrt{8}, \sqrt{2})$. Find $\cos\theta$ for this v and w.
16 How long is the vector v = (1, 1, ..., 1) in 9 dimensions? Find a unit vector u in the same direction as v and a vector w that is perpendicular to v.
17 What are the cosines of the angles $\alpha, \beta, \theta$ between the vector (1, 0, -1) and the unit vectors i, j, k along the axes? Check the formula $\cos^2\alpha + \cos^2\beta + \cos^2\theta = 1$.
Problems 18-31 lead to the main facts about lengths and angles in triangles.
18 The parallelogram with sides v = (4, 2) and w = (-1, 2) is a rectangle. Check the Pythagoras formula $a^2 + b^2 = c^2$ which is for right triangles only:
$$(\text{length of } v)^2 + (\text{length of } w)^2 = (\text{length of } v + w)^2.$$
19 In this 90° case, $a^2 + b^2 = c^2$ also works for v - w:
$$(\text{length of } v)^2 + (\text{length of } w)^2 = (\text{length of } v - w)^2.$$
Give an example of v and w (not at right angles) for which this equation fails.
20 (Rules for dot products) These equations are simple but useful:
(1) $v \cdot w = w \cdot v$ (2) $u \cdot (v + w) = u \cdot v + u \cdot w$ (3) $(cv) \cdot w = c(v \cdot w)$
Use (1) and (2) with u = v + w to prove $\|v + w\|^2 = v \cdot v + 2 v \cdot w + w \cdot w$.
21 The triangle inequality says: (length of v + w) $\le$ (length of v) + (length of w). Problem 20 found $\|v + w\|^2 = \|v\|^2 + 2 v \cdot w + \|w\|^2$. Use the Schwarz inequality $v \cdot w \le \|v\|\,\|w\|$ to turn this into the triangle inequality:
$$\|v + w\|^2 \le (\|v\| + \|w\|)^2 \quad\text{or}\quad \|v + w\| \le \|v\| + \|w\|.$$
22 A right triangle in three dimensions still obeys $\|v\|^2 + \|w\|^2 = \|v + w\|^2$. Show how this leads in Problem 20 to $v_1 w_1 + v_2 w_2 + v_3 w_3 = 0$.
23 The figure shows that $\cos\alpha = v_1/\|v\|$ and $\sin\alpha = v_2/\|v\|$. Similarly $\cos\beta$ is __ and $\sin\beta$ is __. The angle $\theta$ is $\beta - \alpha$. Substitute into the formula $\cos\beta\cos\alpha + \sin\beta\sin\alpha$ for $\cos(\beta - \alpha)$ to find $\cos\theta = v \cdot w / \|v\|\,\|w\|$.
24 With v and w at angle $\theta$, the "Law of Cosines" comes from $(v - w) \cdot (v - w)$:
$$\|v - w\|^2 = \|v\|^2 - 2\|v\|\,\|w\|\cos\theta + \|w\|^2.$$
If $\theta < 90^\circ$ show that $\|v\|^2 + \|w\|^2$ is larger than $\|v - w\|^2$ (the third side).
25 The Schwarz inequality $|v \cdot w| \le \|v\|\,\|w\|$ by algebra instead of trigonometry:
(a) Multiply out both sides of $(v_1 w_1 + v_2 w_2)^2 \le (v_1^2 + v_2^2)(w_1^2 + w_2^2)$.
(b) Show that the difference between those sides equals $(v_1 w_2 - v_2 w_1)^2$. This cannot be negative since it is a square, so the inequality is true.
26 One-line proof of the Schwarz inequality $|u \cdot U| \le 1$ for unit vectors:
$$|u \cdot U| \le |u_1|\,|U_1| + |u_2|\,|U_2| \le \frac{u_1^2 + U_1^2}{2} + \frac{u_2^2 + U_2^2}{2} = \frac{1 + 1}{2} = 1.$$
Put $(u_1, u_2) = (.6, .8)$ and $(U_1, U_2) = (.8, .6)$ in that whole line and find $\cos\theta$.
27 Why is $|\cos\theta|$ never greater than 1 in the first place?
28 Pick any numbers that add to x + y + z = 0. Find the angle between your vector v = (x, y, z) and the vector w = (z, x, y). Challenge question: Explain why $v \cdot w / \|v\|\,\|w\|$ is always $-\frac{1}{2}$.
29 (Recommended) If $\|v\| = 5$ and $\|w\| = 3$, what are the smallest and largest values of $\|v - w\|$? What are the smallest and largest values of $v \cdot w$?
30 If v = (1, 2) draw all vectors w = (x, y) in the xy plane with $v \cdot w = 5$. Which is the shortest w?
31 Can three vectors in the xy plane have $u \cdot v < 0$ and $v \cdot w < 0$ and $u \cdot w < 0$? I don't know how many vectors in xyz space can have all negative dot products. (Four of those vectors in the plane would be impossible...).
2
SOLVING LINEAR EQUATIONS
VECTORS AND LINEAR EQUATIONS ■ 2.1
The central problem of linear algebra is to solve a system of equations. Those equations are linear, which means that the unknowns are only multiplied by numbers; we never see x times y. Our first example of a linear system is certainly not big. It has two equations in two unknowns. But you will see how far it leads:
$$\begin{aligned} x - 2y &= 1 \\ 3x + 2y &= 11 \end{aligned} \tag{1}$$
We begin a row at a time. The first equation x - 2y = 1 produces a straight line in the xy plane. The point x = 1, y = 0 is on the line because it solves that equation. The point x = 3, y = 1 is also on the line because 3 - 2 = 1. If we choose x = 101 we find y = 50. The slope of this particular line is $\frac{1}{2}$ (y increases by 50 when x changes by 100). But slopes are important in calculus and this is linear algebra!
Figure 2.1 Row picture: The point (3, 1) where the lines meet is the solution.
Figure 2.1 shows that line x - 2y = 1. The second line in this "row picture" comes from the second equation 3x + 2y = 11. You can't miss the intersection point where the two lines meet. The point x = 3, y = 1 lies on both lines. That point solves both equations at once. This is the solution to our system of linear equations.
R The row picture shows two lines meeting at a single point.
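The row picture amounts to testing a point against each equation in turn. A minimal Python sketch, where the function name `lies_on` and its argument order are choices made for this illustration:

```python
def lies_on(a, b, c, x, y):
    """True when the point (x, y) satisfies the line a*x + b*y = c."""
    return a * x + b * y == c

# First line x - 2y = 1, second line 3x + 2y = 11.
points = [(1, 0), (3, 1), (101, 50)]
for x, y in points:
    print((x, y), lies_on(1, -2, 1, x, y), lies_on(3, 2, 11, x, y))
```

All three points lie on the first line. Only (3, 1) also lies on the second line, so it is the intersection point.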
Turn now to the column picture. I want to recognize the linear system as a "vector equation". Instead of numbers we need to see vectors. If you separate the original system into its columns instead of its rows, you get
$$x \begin{bmatrix} 1 \\ 3 \end{bmatrix} + y \begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix} = b. \tag{2}$$
This has two column vectors on the left side. The problem is to find the combination of those vectors that equals the vector on the right. We are multiplying the first column by x and the second column by y, and adding. With the right choices x = 3 and y = 1, this produces 3(column 1) + 1(column 2) = b.
C The column picture combines the column vectors on the left side to produce the vector b on the right side.
Figure 2.2 Column picture: A combination of columns produces the right side (1, 11).
Figure 2.2 is the "column picture" of two equations in two unknowns. The first part shows the two separate columns, and that first column multiplied by 3. This multiplication by a scalar (a number) is one of the two basic operations in linear algebra:
$$\text{Scalar multiplication} \quad 3 \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 \\ 9 \end{bmatrix}.$$
If the components of a vector v are $v_1$ and $v_2$, then cv has components $cv_1$ and $cv_2$.
The other basic operation is vector addition. We add the first components and the second components separately. The vector sum is (1, 11) as desired:
$$\text{Vector addition} \quad \begin{bmatrix} 3 \\ 9 \end{bmatrix} + \begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}.$$
The graph in Figure 2.2 shows a parallelogram. The sum (1, 11) is along the diagonal:
The sides are $\begin{bmatrix} 3 \\ 9 \end{bmatrix}$ and $\begin{bmatrix} -2 \\ 2 \end{bmatrix}$. The diagonal sum is $\begin{bmatrix} 3 - 2 \\ 9 + 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}$.
We have multiplied the original columns by x = 3 and y = 1. That combination produces the vector b = (1, 11) on the right side of the linear equations.
To repeat: The left side of the vector equation is a linear combination of the columns. The problem is to find the right coefficients x = 3 and y = 1. We are combining scalar multiplication and vector addition into one step. That step is crucially important, because it contains both of the basic operations:
$$\text{Linear combination} \quad 3 \begin{bmatrix} 1 \\ 3 \end{bmatrix} + \begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}.$$
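The two basic operations are easy to write separately and then combine. In this Python sketch the helper names `scale` and `add` are assumptions, chosen to mirror "scalar multiplication" and "vector addition":

```python
def scale(c, v):
    """Scalar multiplication: cv has components c*v1, c*v2, ..."""
    return [c * vi for vi in v]

def add(v, w):
    """Vector addition: add first components, second components, ..."""
    return [vi + wi for vi, wi in zip(v, w)]

column1 = [1, 3]
column2 = [-2, 2]

# The linear combination 3(column 1) + 1(column 2) from the column picture.
b = add(scale(3, column1), scale(1, column2))
print(scale(3, column1), b)  # [3, 9] [1, 11]
```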
Of course the solution x = 3, y = 1 is the same as in the row picture. I don't know which picture you prefer! I suspect that the two intersecting lines are more familiar at first. You may like the row picture better, but only for one day. My own preference is to combine column vectors. It is a lot easier to see a combination of four vectors in four-dimensional space, than to visualize how four hyperplanes might possibly meet at a point. (Even one hyperplane is hard enough...)
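The coefficient matrix on the left side of the equations is the 2 by 2 matrix A:
$$\text{Coefficient matrix} \quad A = \begin{bmatrix} 1 & -2 \\ 3 & 2 \end{bmatrix}.$$
This is very typical of linear algebra, to look at a matrix by rows and by columns. Its rows give the row picture and its columns give the column picture. Same numbers, different pictures, same equations. We write those equations as a matrix problem Ax = b:
$$\text{Matrix equation} \quad \begin{bmatrix} 1 & -2 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}.$$
The row picture deals with the two rows of A. The column picture combines the columns. The numbers x = 3 and y = 1 go into the solution vector x. Then
$$Ax = b \quad\text{is}\quad \begin{bmatrix} 1 & -2 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}.$$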
Three Equations in Three Unknowns
The three unknowns are x, y, z. The linear equations Ax = b are
$$\begin{aligned} x + 2y + 3z &= 6 \\ 2x + 5y + 2z &= 4 \\ 6x - 3y + z &= 2 \end{aligned} \tag{3}$$
We look for numbers x, y, z that solve all three equations at once. Those desired numbers might or might not exist. For this system, they do exist. When the number of unknowns matches the number of equations, there is usually one solution. Before solving the problem, we visualize it both ways:
R The row picture shows three planes meeting at a single point.
C The column picture combines three columns to produce the vector (6, 4, 2).
In the row picture, each equation is a plane in three-dimensional space. The first plane comes from the first equation x + 2y + 3z = 6. That plane crosses the x and y and z axes at the points (6, 0, 0) and (0, 3, 0) and (0, 0, 2). Those three points solve the equation and they determine the whole plane.
The vector (x, y, z) = (0, 0, 0) does not solve x + 2y + 3z = 6. Therefore the plane in Figure 2.3 does not contain the origin.
Figure 2.3 Row picture of three equations: Three planes meet at a point. The line L is on both of the first two planes, and L meets the third plane at the solution.
The plane x + 2y + 3z = 0 does pass through the origin, and it is parallel to x + 2y + 3z = 6. When the right side increases to 6, the plane moves away from the origin.
The second plane is given by the second equation 2x + 5y + 2z = 4. It intersects the first plane in a line L. The usual result of two equations in three unknowns is a line L of solutions.
The third equation gives a third plane. It cuts the line L at a single point. That point lies on all three planes and it solves all three equations. It is harder to draw this triple intersection point than to imagine it. The three planes meet at the solution (which we haven't found yet). The column form shows immediately why z = 2!
The column picture starts with the vector form of the equations:
$$x \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix} + y \begin{bmatrix} 2 \\ 5 \\ -3 \end{bmatrix} + z \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \\ 2 \end{bmatrix}. \tag{4}$$
The unknown numbers x, y, z are the coefficients in this linear combination. We want to multiply the three column vectors by the correct numbers x, y, z to produce b = (6, 4, 2).
Figure 2.4 Column picture: (x, y, z) = (0, 0, 2) because 2(3, 2, 1) = (6, 4, 2) = b.
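Figure 2.4 shows this column picture. Linear combinations of those columns can produce any vector b! The combination that produces b = (6, 4, 2) is just 2 times the third column. The coefficients we need are x = 0, y = 0, and z = 2. This is also the intersection point of the three planes in the row picture. It solves the system:
$$0 \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix} + 0 \begin{bmatrix} 2 \\ 5 \\ -3 \end{bmatrix} + 2 \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \\ 2 \end{bmatrix}.$$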
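Both pictures can be verified for the candidate solution (0, 0, 2). In this Python sketch the matrix is stored as a list of rows, and the variable names are choices made here for illustration:

```python
A = [[1, 2, 3],
     [2, 5, 2],
     [6, -3, 1]]
b = [6, 4, 2]
solution = [0, 0, 2]

# Row picture: the point must lie on every plane (row . solution = right side).
on_each_plane = [sum(a * s for a, s in zip(row, solution)) == rhs
                 for row, rhs in zip(A, b)]

# Column picture: x (column 1) + y (column 2) + z (column 3).
combination = [0, 0, 0]
for j, coefficient in enumerate(solution):
    for i in range(3):
        combination[i] += coefficient * A[i][j]

print(on_each_plane, combination)  # [True, True, True] [6, 4, 2]
```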
The Matrix Form of the Equations
We have three rows in the row picture and three columns in the column picture (plus the right side). The three rows and three columns contain nine numbers. These nine numbers fill a 3 by 3 matrix. The "coefficient matrix" has the rows and columns that have so far been kept separate:
$$\text{The coefficient matrix is} \quad A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 2 \\ 6 & -3 & 1 \end{bmatrix}.$$
The capital letter A stands for all nine coefficients (in this square array). The letter b denotes the column vector with components 6, 4, 2. The unknown x is also a column vector, with components x, y, z. (We use boldface because it is a vector, x because it is unknown.) By rows the equations were (3), by columns they were (4), and now by matrices they are (5). The shorthand is Ax = b:
$$\text{Matrix equation} \quad \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 2 \\ 6 & -3 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \\ 2 \end{bmatrix}. \tag{5}$$
We multiply the matrix A times the unknown vector x to get the right side b.
Basic question: What does it mean to "multiply A times x"? We can multiply by rows or by columns. Either way, Ax = b must be a correct representation of the three equations. You do the same nine multiplications either way.
Multiplication by rows Ax comes from dot products, each row times the column x:
$$Ax = \begin{bmatrix} (\text{row 1}) \cdot x \\ (\text{row 2}) \cdot x \\ (\text{row 3}) \cdot x \end{bmatrix}. \tag{6}$$
Multiplication by columns Ax is a combination of column vectors:
$$Ax = x\,(\text{column 1}) + y\,(\text{column 2}) + z\,(\text{column 3}). \tag{7}$$
When we substitute the solution x = (0, 0, 2), the multiplication Ax produces b:
$$\begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 2 \\ 6 & -3 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix} = 2 \text{ times column 3} = \begin{bmatrix} 6 \\ 4 \\ 2 \end{bmatrix}.$$
The first dot product in row multiplication is $(1, 2, 3) \cdot (0, 0, 2) = 6$. The other dot products are 4 and 2. Multiplication by columns is simply 2 times column 3.
This book sees Ax as a combination of the columns of A.
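Equations (6) and (7) describe two routes to the same vector Ax. The Python sketch below implements both, with function names invented for this illustration and the matrix stored as a list of rows:

```python
def multiply_by_rows(A, x):
    """Equation (6): each component of Ax is (row i) . x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def multiply_by_columns(A, x):
    """Equation (7): Ax is x1 (column 1) + x2 (column 2) + ..."""
    result = [0] * len(A)
    for j, xj in enumerate(x):
        column = [row[j] for row in A]
        result = [r + xj * c for r, c in zip(result, column)]
    return result

A = [[1, 2, 3],
     [2, 5, 2],
     [6, -3, 1]]
x = [0, 0, 2]
print(multiply_by_rows(A, x), multiply_by_columns(A, x))  # [6, 4, 2] [6, 4, 2]
```

Either function performs the same nine multiplications; only the order of the additions differs.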
Example 1 Here are 3 by 3 matrices A and I, with three ones and six zeros:
$$Ax = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \\ 4 \end{bmatrix} \qquad Ix = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}$$
If you are a row person, the product of every row (1, 0, 0) with (4, 5, 6) is 4. If you are a column person, the linear combination is 4 times the first column (1, 1, 1). In that matrix A, the second and third columns are zero vectors.
The example with Ix deserves a careful look, because the matrix I is special. It has ones on the "main diagonal". Off that diagonal, all the entries are zeros. Whatever vector this matrix multiplies, that vector is not changed. This is like multiplication by 1, but for matrices and vectors. The exceptional matrix in this example is the 3 by 3 identity matrix:
$$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{always yields the multiplication}\quad Ix = x.$$
Matrix Notation
The first row of a 2 by 2 matrix contains $a_{11}$ and $a_{12}$. The second row contains $a_{21}$ and $a_{22}$. The first index gives the row number, so that $a_{ij}$ is an entry in row i. The second index j gives the column number. But those subscripts are not convenient on a keyboard! Instead of $a_{ij}$ it is easier to type A(i, j). The entry $a_{57}$ = A(5, 7) would be in row 5, column 7.
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} A(1,1) & A(1,2) \\ A(2,1) & A(2,2) \end{bmatrix}.$$
For an m by n matrix, the row index i goes from 1 to m. The column index j stops at n. There are mn entries in the matrix. A square matrix (order n) has $n^2$ entries.
Multiplication in MATLAB
I want to express A and x and their product Ax using MATLAB commands. This is a first step in learning that language. I begin by defining the matrix A and the vector x. This vector is a 3 by 1 matrix, with three rows and one column. Enter matrices a row at a time, and use a semicolon to signal the end of a row:
A = [1 2 3; 2 5 2; 6 -3 1]
x = [0; 0; 2]
Here are three ways to multiply Ax in MATLAB. In reality, A * x is the way to do it. MATLAB is a high level language, and it works with matrices:
Matrix multiplication b = A * x
We can also pick out the first row of A (as a smaller matrix!). The notation for that 1 by 3 submatrix is A(1, :). Here the colon symbol keeps all columns of row 1:
Row at a time b = [A(1, :) * x ; A(2, :) * x ; A(3, :) * x]
Those are dot products, row times column, 1 by 3 matrix times 3 by 1 matrix.
The other way to multiply uses the columns of A. The first column is the 3 by 1 submatrix A(:, 1). Now the colon symbol : is keeping all rows of column 1. This column multiplies x(1) and the other columns multiply x(2) and x(3):
Column at a time b = A(:, 1) * x(1) + A(:, 2) * x(2) + A(:, 3) * x(3)
I think that matrices are stored by columns. Then multiplying a column at a time will be a little faster. So A * x is actually executed by columns.
You can see the same choice in a FORTRAN-type structure, which operates on single entries of A and x. This lower level language needs an outer and inner "DO loop". When the outer loop uses the row number I, multiplication is a row at a time. The inner loop J = 1, 3 goes along each row I.
When the outer loop uses J, multiplication is a column at a time. I will do that in MATLAB, which needs two more lines "end" "end" to close "for I" and "for J":
FORTRAN by rows                      MATLAB by columns
DO 10 I = 1, 3                       for J = 1 : 3
DO 10 J = 1, 3                       for I = 1 : 3
10 B(I) = B(I) + A(I, J) * X(J)      b(I) = b(I) + A(I, J) * x(J)
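The same pair of loop orders can be written in Python, as a sketch rather than a translation of either program above. Python indices start at 0, not 1, and a Python list of lists holds the matrix row by row, so the remark about column storage does not carry over to this illustration.

```python
A = [[1, 2, 3],
     [2, 5, 2],
     [6, -3, 1]]
x = [0, 0, 2]
n = 3

# By rows: the outer loop picks row i, the inner loop runs along that row.
b_rows = [0] * n
for i in range(n):
    for j in range(n):
        b_rows[i] = b_rows[i] + A[i][j] * x[j]

# By columns: the outer loop picks column j, the inner loop runs down it.
b_columns = [0] * n
for j in range(n):
    for i in range(n):
        b_columns[i] = b_columns[i] + A[i][j] * x[j]

print(b_rows, b_columns)  # [6, 4, 2] [6, 4, 2]
```

Both results start from zero. The FORTRAN and MATLAB fragments assume the same thing about B and b before the loops begin.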
Notice that MATLAB is sensitive to upper case versus lower case (capital letters and small letters). If the matrix is A then its entries are A(I, J) not a(I, J).
I think you will prefer the higher level A * x. FORTRAN won't appear again in this book. Maple and Mathematica and graphing calculators also operate at the higher level. Multiplication is A.x in Mathematica. It is multiply(A, x); or evalm(A&*x); in Maple. Those languages allow symbolic entries a, b, x, ... and not only real numbers. Like MATLAB's Symbolic Toolbox, they give the symbolic answer.
■ REVIEW OF THE KEY IDEAS ■
1. The basic operations on vectors are multiplication cv and vector addition v + w.
2. Together those operations give linear combinations cv + dw.
3. Matrix-vector multiplication Ax can be executed by rows (dot products). But it should be understood as a combination of the columns of A!
4. Column picture: Ax = b asks for a combination of columns to produce b.
5. Row picture: Each equation in Ax = b gives a line (n = 2) or a plane (n = 3) or a "hyperplane" (n > 3). They intersect at the solution or solutions.
■ WORKED EXAMPLES ■
2.1 A Describe the column picture of these three equations. Solve by careful inspection of the columns (instead of elimination):
Solution The column picture asks for a linear combination that produces b from the three columns of A. In this example b is minus the second column. So the solution
is x = 0, y = -1, z = 0. To show that (0, -1, 0) is the only solution we have to
know that "A is invertible" and "the columns are independent" and "the determinant isn't zero". Those words are not yet defined but the test comes from elimination: We need (and we find!) a full set of three nonzero pivots.
If the right side changes to b = (4, 4, 8) = sum of the first two columns, then the right combination has x = 1, y = 1, z = 0. The solution becomes x = (1, 1, 0).
2.1 B This system has no solution, because the three planes in the row picture don't pass through a point. No combination of the three columns produces b:
x + 3y + 5z = 4
x + 2y - 3z = 5
2x + 5y + 2z = 8
(1) Multiply the equations by 1, 1, -1 and add to show that these planes don't meet at a point. Are any two of the planes parallel? What are the equations of planes parallel to x + 3y + 5z = 4?
(2) Take the dot product of each column (and also b) with y = (1, 1, -1). How do those dot products show that the system has no solution?
(3) Find three right side vectors b* and b** and b*** that do allow solutions.
Solution (1) Multiplying the equations by 1, 1, -1 and adding gives
x + 3y + 5z = 4
x + 2y - 3z = 5
-[2x + 5y + 2z = 8]
0x + 0y + 0z = 1      No Solution
The planes don't meet at any point, but no two planes are parallel. For a plane parallel to x + 3y + 5z = 4, just change the "4". The parallel plane x + 3y + 5z = 0 goes through the origin (0, 0, 0). And the equation multiplied by any nonzero
constant still gives the same plane, as in 2x + 6y + 10z = 8.
(2) The dot product of each column with y = (1, 1, -1) is zero. On the right side, y • b = (1, 1, -1) • (4, 5, 8) = 1 is not zero. So a solution is impossible. (If a combination of columns could produce b, take dot products with y. Then a combination of zeros would produce 1.)
(3) There is a solution when b is a combination of the columns. These three examples b*, b**, b*** have solutions x* = (1, 0, 0) and x** = (1, 1, 1) and x*** = (0, 0, 0):
b* = (1, 1, 2) = first column      b** = (9, 0, 9) = sum of columns      b*** = (0, 0, 0)
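The dot product test in part (2) is easy to check by machine. This is a small Python sketch, with the columns of A read off from the three equations of Example 2.1 B. The helper name dot is a hypothetical choice for this illustration:

```python
# Columns of A and the right side b, read off from the three equations.
columns = [(1, 1, 2), (3, 2, 5), (5, -3, 2)]
b = (4, 5, 8)
y = (1, 1, -1)          # the multipliers 1, 1, -1 used on the equations

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

column_dots = [dot(y, col) for col in columns]   # y is perpendicular to every column
b_dot = dot(y, b)                                # but y is not perpendicular to b

# A right side built from the columns passes the same test.
b_star = columns[0]
b_star_dot = dot(y, b_star)

print(column_dots, b_dot, b_star_dot)
```

Every combination of the columns has zero dot product with y. The right side b does not, so b cannot be a combination of the columns.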
Problem Set 2.1
Problems 1-9 are about the row and column pictures of Ax= b.
1 With A = I (the identity matrix) draw the planes in the row picture. Three sides of a box meet at the solution x = (x, y, z) = (2, 3, 4):
1x + 0y + 0z = 2
0x + 1y + 0z = 3      or      [1 0 0; 0 1 0; 0 0 1] [x; y; z] = [2; 3; 4]
0x + 0y + 1z = 4
2 Draw the vectors in the column picture of Problem 1. Two times column 1 plus three times column 2 plus four times column 3 equals the right side b.
3 If the equations in Problem 1 are multiplied by 2, 3, 4 they become Ax = b:
2x + 0y + 0z = 4
0x + 3y + 0z = 9      or      [2 0 0; 0 3 0; 0 0 4] [x; y; z] = [4; 9; 16]
0x + 0y + 4z = 16
x Why is the row picture the same? Is the solution the same as x ? What is
changed in the column picture- the columns or the right combination to give b?
4 If equation 1 is added to equation 2, which of these are changed: the planes in the row picture, the column picture, the coefficient matrix, the solution? The new equations in Problem 1 would be x = 2, x + y = 5, z = 4.
5 Find a point with z = 2 on the intersection line of the planes x + y + 3z = 6 and x - y + z = 4. Find the point with z = 0 and a third point halfway between.
6 The first of these equations plus the second equals the third:
x + y + z = 2
x + 2y + z = 3
2x + 3y + 2z = 5.
The first two planes meet along a line. The third plane contains that line, because if x, y, z satisfy the first two equations then they also __ . The equations have infinitely many solutions (the whole line L). Find three solutions on L.
7 Move the third plane in Problem 6 to a parallel plane 2x + 3y + 2z = 9. Now the three equations have no solution - why not? The first two planes meet along the line L, but the third plane doesn't __ that line.
8 In Problem 6 the columns are (1, 1, 2) and (1, 2, 3) and (1, 1, 2). This is a "singular case" because the third column is __ . Find two combinations of the columns that give b = (2, 3, 5). This is only possible for b = (4, 6, c) if c = __ .
9 Normally 4 "planes" in 4-dimensional space meet at a __ . Normally 4 column vectors in 4-dimensional space can combine to produce b. What combination of (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1) produces b = (3, 3, 3, 2)? What 4 equations for x, y, z, t are you solving?
Problems 10-15 are about multiplying matrices and vectors.
10 Compute each Ax by dot products of the rows with the column vector:
11 Compute each Ax in Problem 10 as a combination of the columns: How many separate multiplications for Ax, when the matrix is "3 by 3"?
12 Find the two components of Ax by rows or by columns:
13 Multiply A times x to find three components of Ax:
14 (a) A matrix with m rows and n columns multiplies a vector with __ components to produce a vector with __ components.
(b) The planes from the m equations Ax = b are in __-dimensional space. The combination of the columns of A is in __-dimensional space.
15 Write 2x + 3y + z + 5t = 8 as a matrix A (how many rows?) multiplying the column vector x = (x, y, z, t) to produce b. The solutions x fill a plane or "hyperplane" in 4-dimensional space. The plane is 3-dimensional with no 4D volume.
Problems 16-23 ask for matrices that act in special ways on vectors.
16 (a) What is the 2 by 2 identity matrix? I times [x; y] equals [x; y].
(b) What is the 2 by 2 exchange matrix? P times [x; y] equals [y; x].
17 (a) What 2 by 2 matrix R rotates every vector by 90°? R times [x; y] is [y; -x].
(b) What 2 by 2 matrix rotates every vector by 180°?
18 Find the matrix P that multiplies (x, y, z) to give (y, z, x). Find the matrix Q that multiplies (y, z, x) to bring back (x, y, z).
19 What 2 by 2 matrix E subtracts the first component from the second component? What 3 by 3 matrix does the same?
and
20 What 3 by 3 matrix E multiplies (x, y, z) to give (x, y, z + x)? What matrix E^-1 multiplies (x, y, z) to give (x, y, z - x)? If you multiply (3, 4, 5) by E and then multiply by E^-1, the two results are ( __ ) and ( __ ).
21 What 2 by 2 matrix P1 projects the vector (x, y) onto the x axis to produce (x, 0)? What matrix P2 projects onto the y axis to produce (0, y)? If you multiply (5, 7) by P1 and then multiply by P2, you get ( __ ) and ( __ ).
22 What 2 by 2 matrix R rotates every vector through 45°? The vector (1, 0) goes to (√2/2, √2/2). The vector (0, 1) goes to (-√2/2, √2/2). Those determine the matrix. Draw these particular vectors in the xy plane and find R.
23 Write the dot product of (1, 4, 5) and (x, y, z) as a matrix multiplication Ax. The matrix A has one row. The solutions to Ax = 0 lie on a __ perpendicular to the vector __ . The columns of A are only in __-dimensional space.
24 In MATLAB notation, write the commands that define this matrix A and the column vectors x and b. What command would test whether or not Ax = b?
25 The MATLAB commands A = eye(3) and v = [3 : 5]' produce the 3 by 3 identity matrix and the column vector (3, 4, 5). What are the outputs from A * v and v' * v? (Computer not needed!) If you ask for v * A, what happens?
26 If you multiply the 4 by 4 all-ones matrix A = ones(4,4) and the column v = ones(4,1), what is A * v? (Computer not needed.) If you multiply B = eye(4) + ones(4,4) times w = zeros(4,1) + 2 * ones(4,1), what is B * w?
Questions 27-29 are a review of the row and column pictures.
27 Draw the two pictures in two planes for the equations x - 2y = 0, x + y = 6.
28 For two linear equations in three unknowns x, y, z, the row picture will show
(2 or 3) (lines or planes) in (2 or 3)-dimensional space. The column picture is in (2 or 3)-dimensional space. The solutions normally lie on a __ .
29 For four linear equations in two unknowns x and y, the row picture shows four __ . The column picture is in __-dimensional space. The equations have no solution unless the vector on the right side is a combination of __ .
30 Start with the vector u0 = (1, 0). Multiply again and again by the same "Markov matrix" A below. The next three vectors are u1, u2, u3:
u1 = [.8 .3; .2 .7] [1; 0] = [.8; .2]
What property do you notice for all four vectors u0, u1, u2, u3?
31 With a computer, continue from u0 = (1, 0) to u7, and from v0 = (0, 1) to v7. What do you notice about u7 and v7? Here are two MATLAB codes, one with while and one with for. They plot u0 to u7 - you can use other languages:
u = [1 ; 0]; A = [.8 .3 ; .2 .7];
x = u; k = [0 : 7];
while size(x,2) <= 7
u = A * u; x = [x u];
end
plot(k, x)

u = [1 ; 0]; A = [.8 .3 ; .2 .7];
x = u; k = [0 : 7];
for j = 1 : 7
u = A * u; x = [x u];
end
plot(k, x)
32 The u's and v's in Problem 31 are approaching a steady state vector s. Guess that vector and check that As = s. If you start with s, you stay with s.
33 This MATLAB code allows you to input x0 with a mouse click, by ginput. With t = 1, A rotates vectors by theta. The plot will show Ax0, A^2 x0, ... going around a circle (t > 1 will spiral out and t < 1 will spiral in). You can change theta and the stop at j = 10. We plan to put this code on web.mit.edu/18.06/www:
theta = 15 * pi/180; t = 1.0;
A = t * [cos(theta) -sin(theta) ; sin(theta) cos(theta)];
disp('Click to select starting point')
[x1, x2] = ginput(1); x = [x1 ; x2];
for j = 1 : 10
x = [x A * x(:, end)];
end
plot(x(1,:), x(2,:), 'o')
hold off
34 Invent a 3 by 3 magic matrix M3 with entries 1, 2, ..., 9. All rows and columns and diagonals add to 15. The first row could be 8, 3, 4. What is M3 times (1, 1, 1)? What is M4 times (1, 1, 1, 1) if this magic matrix has entries 1, ..., 16?
THE IDEA OF ELIMINATION ■ 2.2
This chapter explains a systematic way to solve linear equations. The method is called "elimination", and you can see it immediately in our 2 by 2 example. Before elimination, x and y appear in both equations. After elimination, the first unknown x has disappeared from the second equation:
Before     x - 2y = 1
           3x + 2y = 11
After      x - 2y = 1      (multiply by 3 and subtract)
                8y = 8      (x has been eliminated)
The last equation 8y = 8 instantly gives y = 1. Substituting for y in the first equation leaves x - 2 = 1. Therefore x = 3 and the solution (x, y) = (3, 1) is complete.
Elimination produces an upper triangular system - this is the goal. The nonzero coefficients 1, -2, 8 form a triangle. The last equation 8y = 8 reveals y = 1, and we go up the triangle to x. This quick process is called back substitution. It is used for upper triangular systems of any size, after forward elimination is complete.
Important point: The original equations have the same solution x = 3 and y = 1. Figure 2.5 repeats this original system as a pair of lines, intersecting at the solution point (3, 1). After elimination, the lines still meet at the same point! One line is horizontal because its equation 8y = 8 does not contain x.
How did we get from the first pair of lines to the second pair? We subtracted 3 times the first equation from the second equation. The step that eliminates x from equation 2 is the fundamental operation in this chapter. We use it so often that we look at it closely:
To eliminate x: Subtract a multiple of equation 1 from equation 2.
Three times x - 2y = 1 gives 3x - 6y = 3. When this is subtracted from 3x + 2y = 11, the right side becomes 8. The main point is that 3x cancels 3x. What remains on the left side is 2y - (-6y) or 8y, and x is eliminated.
Figure 2.5 (panels: before elimination, after elimination) Two lines meet at the solution. So does the new line 8y = 8.
Ask yourself how that multiplier ℓ = 3 was found. The first equation contains x. The first pivot is 1 (the coefficient of x). The second equation contains 3x, so the first equation was multiplied by 3. Then subtraction 3x - 3x produced the zero.
You will see the multiplier rule if we change the first equation to 4x - 8y = 4. (Same straight line but the first pivot becomes 4.) The correct multiplier is now ℓ = 3/4. To find the multiplier, divide the coefficient "3" to be eliminated by the pivot "4":
4x - 8y = 4       Multiply equation 1 by 3/4       4x - 8y = 4
3x + 2y = 11      Subtract from equation 2               8y = 8.
The final system is triangular and the last equation still gives y = 1. Back substitution produces 4x - 8 = 4 and 4x = 12 and x = 3. We changed the numbers but not the lines or the solution. Divide by the pivot to find that multiplier ℓ = 3/4:
Pivot = first nonzero in the row that does the elimination
Multiplier = (entry to eliminate) divided by (pivot) = 3/4.
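The multiplier rule can be tried on both versions of the first equation. The following Python sketch uses exact fractions so that 3/4 stays 3/4. The function name eliminate_2x2 and the tuple layout (coefficient of x, coefficient of y, right side) are assumptions made for this illustration:

```python
from fractions import Fraction

def eliminate_2x2(eq1, eq2):
    """Subtract a multiple of equation 1 from equation 2.

    Each equation is (coefficient of x, coefficient of y, right side).
    Returns the multiplier and the new second equation.
    """
    pivot = Fraction(eq1[0])
    if pivot == 0:
        raise ZeroDivisionError("zero in the pivot position")
    multiplier = Fraction(eq2[0]) / pivot          # entry to eliminate / pivot
    new_eq2 = tuple(Fraction(b) - multiplier * Fraction(a)
                    for a, b in zip(eq1, eq2))
    return multiplier, new_eq2

# First pivot 1: x - 2y = 1 and 3x + 2y = 11
m1, row1 = eliminate_2x2((1, -2, 1), (3, 2, 11))
# First pivot 4: 4x - 8y = 4 and 3x + 2y = 11
m2, row2 = eliminate_2x2((4, -8, 4), (3, 2, 11))
print(m1, row1, m2, row2)
```

The multipliers differ (3 and 3/4) but the new second equation is 8y = 8 both times.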
The new second equation starts with the second pivot, which is 8. We would use it to eliminate y from the third equation if there were one. To solve n equations we want n pivots. The pivots are on the diagonal of the triangle after elimination.
You could have solved those equations for x and y without reading this book. It is an extremely humble problem, but we stay with it a little longer. Even for a 2 by 2 system, elimination might break down and we have to see how. By understanding the possible breakdown (when we can't find a full set of pivots), you will understand the whole process of elimination.
Breakdown of Elimination
Normally, elimination produces the pivots that take us to the solution. But failure is possible. At some point, the method might ask us to divide by zero. We can't do it. The process has to stop. There might be a way to adjust and continue - or failure may be unavoidable. Example 1 fails with no solution. Example 2 fails with too many solutions. Example 3 succeeds by exchanging the equations.
Example 1 Permanent failure with no solution. Elimination makes this clear:
x - 2y = 1       Subtract 3 times        x - 2y = 1
3x - 6y = 11     eqn. 1 from eqn. 2          0y = 8.
The last equation is 0y = 8. There is no solution. Normally we divide the right side 8 by the second pivot, but this system has no second pivot. (Zero is never allowed as a pivot!) The row and column pictures of this 2 by 2 system show that failure was unavoidable. If there is no solution, elimination must certainly have trouble.
The row picture in Figure 2.6 shows parallel lines- which never meet. A solution must lie on both lines. With no meeting point, the equations have no solution.
Figure 2.6 Row picture and column picture for Example 1: no solution.
The column picture shows the two columns (1, 3) and (-2, -6) in the same direction. All combinations of the columns lie along a line. But the column from the right side is in a different direction (1, 11). No combination of the columns can produce this right side - therefore no solution.
When we change the right side to (1, 3), failure shows as a whole line of solutions. Instead of no solution there are infinitely many:
Example 2 Permanent failure with infinitely many solutions:
x - 2y = 1      Subtract 3 times        x - 2y = 1
3x - 6y = 3     eqn. 1 from eqn. 2          0y = 0.
Every y satisfies 0y = 0. There is really only one equation x - 2y = 1. The unknown y is "free". After y is freely chosen, x is determined as x = 1 + 2y.
In the row picture, the parallel lines have become the same line. Every point on that line satisfies both equations. We have a whole line of solutions.
In the column picture, the right side (1, 3) is now the same as the first column. So we can choose x = 1 and y = 0. We can also choose x = 0 and y = -1/2; the second column times -1/2 equals the right side. There are infinitely many other solutions. Every (x, y) that solves the row problem also solves the column problem.
Elimination can go wrong in a third way - but this time it can be fixed. Suppose the first pivot position contains zero. We refuse to allow zero as a pivot. When the first equation has no term involving x, we can exchange it with an equation below:
Example 3 Temporary failure but a row exchange produces two pivots:
0x + 2y = 4      Exchange the         3x - 2y = 5
3x - 2y = 5      two equations             2y = 4.
Figure 2.7 Row and column pictures for Example 2: infinitely many solutions.
The new system is already triangular. This small example is ready for back substitution. The last equation gives y = 2, and then the first equation gives x = 3. The row picture is normal (two intersecting lines). The column picture is also normal (column vectors not in the same direction). The pivots 3 and 2 are normal - but an exchange was required to put the rows in a good order.
Examples 1 and 2 are singular - there is no second pivot. Example 3 is nonsingular - there is a full set of pivots and exactly one solution. Singular equations have no solution or infinitely many solutions. Pivots must be nonzero because we have to divide by them.
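The three kinds of breakdown can be collected in one small routine. This Python sketch is only an illustration for 2 by 2 systems ax + by = f, cx + dy = g. The name solve_2x2 and the returned strings are hypothetical choices, not part of the text:

```python
from fractions import Fraction

def solve_2x2(a, b, f, c, d, g):
    """Solve ax + by = f, cx + dy = g by elimination, reporting any breakdown."""
    a, b, f, c, d, g = map(Fraction, (a, b, f, c, d, g))
    if a == 0:
        if c == 0:
            return "singular: no pivot in column 1"
        # Temporary failure: exchange the two equations.
        a, b, f, c, d, g = c, d, g, a, b, f
    multiplier = c / a                     # entry to eliminate / pivot
    second_pivot = d - multiplier * b
    right_side = g - multiplier * f
    if second_pivot == 0:                  # permanent failure
        return "no solution" if right_side != 0 else "infinitely many solutions"
    y = right_side / second_pivot          # back substitution
    x = (f - b * y) / a
    return (x, y)

example1 = solve_2x2(1, -2, 1, 3, -6, 11)   # 0y = 8
example2 = solve_2x2(1, -2, 1, 3, -6, 3)    # 0y = 0
example3 = solve_2x2(0, 2, 4, 3, -2, 5)     # needs a row exchange
print(example1, example2, example3)
```

Only Example 3 returns numbers. The other two stop at the missing second pivot, and the right side decides between no solution and a whole line of solutions.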
Three Equations in Three Unknowns
To understand Gaussian elimination, you have to go beyond 2 by 2 systems. Three by three is enough to see the pattern. For now the matrices are square - an equal number of rows and columns. Here is a 3 by 3 system, specially constructed so that all steps lead to whole numbers and not fractions:
2x + 4y - 2z = 2
4x + 9y - 3z = 8      (1)
-2x - 3y + 7z = 10
What are the steps? The first pivot is the boldface 2 (upper left). Below that pivot we want to create zeros. The first multiplier is the ratio 4/2 = 2. Multiply the pivot equation by ℓ21 = 2 and subtract. Subtraction removes the 4x from the second equation:
Step 1    Subtract 2 times equation 1 from equation 2.
We also eliminate -2x from equation 3 - still using the first pivot. The quick way is to add equation 1 to equation 3. Then 2x cancels -2x. We do exactly that, but the rule in this book is to subtract rather than add. The systematic pattern has multiplier ℓ31 = -2/2 = -1. Subtracting -1 times an equation is the same as adding:
Step 2    Subtract -1 times equation 1 from equation 3.
The two new equations involve only y and z. The second pivot (boldface) is 1:
1y + 1z = 4
1y + 5z = 12
We have reached a 2 by 2 system. The final step eliminates y to make it 1 by 1:
Step 3    Subtract equation 2new from 3new. The multiplier is 1. Then 4z = 8.
The original system Ax = b has been converted into a triangular system Ux = c:
2x + 4y - 2z = 2                          2x + 4y - 2z = 2
4x + 9y - 3z = 8       has become              1y + 1z = 4      (2)
-2x - 3y + 7z = 10                                   4z = 8.
The goal is achieved - forward elimination is complete. Notice the pivots 2, 1, 4 along the diagonal. Those pivots 1 and 4 were hidden in the original system! Elimination brought them out. This triangle is ready for back substitution, which is quick:
(4z = 8 gives z = 2)    (y + z = 4 gives y = 2)    (equation 1 gives x = -1)
The solution is (x, y, z) = (-1, 2, 2). The row picture has three planes from three equations. All the planes go through this solution. The original planes are sloping, but the last plane 4z = 8 after elimination is horizontal.
The column picture shows a combination of column vectors producing the right side b. The coefficients in that combination Ax are -1, 2, 2 (the solution):
(-1) [2; 4; -2] + 2 [4; 9; -3] + 2 [-2; -3; 7] equals [2; 8; 10].      (3)
The numbers x, y, z multiply columns 1, 2, 3 in the original system Ax = b and also in the triangular system Ux = c.
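Steps 1 to 3 and the back substitution can be written as two short loops. This is a sketch in Python with exact fractions, assuming that no row exchanges are needed (every pivot is nonzero). The function names forward_elimination and back_substitution are hypothetical:

```python
from fractions import Fraction

def forward_elimination(A, b):
    """Reduce Ax = b to Ux = c, assuming no row exchanges are needed."""
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    c = [Fraction(v) for v in b]
    for j in range(n - 1):                       # column by column
        pivot = U[j][j]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot: a row exchange would be needed")
        for i in range(j + 1, n):                # rows below the pivot
            multiplier = U[i][j] / pivot
            U[i] = [uik - multiplier * ujk for uik, ujk in zip(U[i], U[j])]
            c[i] -= multiplier * c[j]
    return U, c

def back_substitution(U, c):
    """Solve the upper triangular system Ux = c, starting at the bottom."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (c[i] - known) / U[i][i]
    return x

A = [[2, 4, -2], [4, 9, -3], [-2, -3, 7]]
b = [2, 8, 10]
U, c = forward_elimination(A, b)
x = back_substitution(U, c)
print(U, c, x)
```

The diagonal of U holds the pivots 2, 1, 4. The same two functions work for n by n systems under the same assumption of nonzero pivots.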
For a 4 by 4 problem, or an n by n problem, elimination proceeds the same way. Here is the whole idea of forward elimination, column by column:
Column 1. Use the first equation to create zeros below the first pivot.
Column 2. Use the new equation 2 to create zeros below the second pivot.
Columns 3 to n. Keep going to find the other pivots and the triangular U.
After column 2 we have      We want
x x x x                     x x x x
0 x x x                     0 x x x
0 0 x x                     0 0 x x      (4)
0 0 x x                     0 0 0 x
The result of forward elimination is an upper triangular system. It is nonsingular if there is a full set of n pivots (never zero!). Question: Which x could be changed to boldface x because the pivot is known? Here is a final example to show the original Ax = b, the triangular system Ux = c, and the solution from back substitution:
x + y + z = 6             x + y + z = 6
x + 2y + 2z = 9               y + z = 3
x + 2y + 3z = 10                  z = 1
All multipliers are 1. All pivots are 1. All planes meet at the solution (3, 2, 1). The columns combine with coefficients 3, 2, 1 to give b = (6, 9, 10) and c = (6, 3, 1).
■ REVIEW OF THE KEY IDEAS ■
1. A linear system becomes upper triangular after elimination.
2. The upper triangular system is solved by back substitution (starting at the bottom).
3. Elimination subtracts ℓij times equation j from equation i, to make the (i, j) entry zero.
4. The multiplier is ℓij = (entry to eliminate in row i) / (pivot in row j). Pivots can not be zero!
5. A zero in the pivot position can be repaired if there is a nonzero below it.
6. When breakdown is permanent, the system has no solution or infinitely many.
■ WORKED EXAMPLES ■
2.2 A When elimination is applied to this matrix A, what are the first and second pivots? What is the multiplier ℓ21 in the first step (ℓ21 times row 1 is subtracted from row 2)? What entry in the 2, 2 position (instead of 9) would force an exchange of rows 2 and 3? Why is the multiplier ℓ31 = 0, subtracting 0 times row 1 from row 3?
Solution The first pivot is 3. The multiplier ℓ21 is 6/3 = 2. When 2 times row 1 is subtracted from row 2, the second pivot is revealed as 7. If we reduce the entry "9" to "2", that drop of 7 in the (2, 2) position would force a row exchange. (The second row would start with 6, 2 which is an exact multiple of 3, 1 in the first row. Zero will appear in the second pivot position.) The multiplier ℓ31 is zero because a31 = 0. A zero at the start of a row needs no elimination.
2.2 B Use elimination to reach upper triangular matrices U. Solve by back substitution or explain why this is impossible. What are the pivots (never zero)? Exchange equations when necessary. The only difference is the -x in equation (3).
x + y + z = 7         x + y + z = 7
x + y - z = 5         x + y - z = 5
x - y + z = 3        -x - y + z = 3
Solution For the first system, subtract equation 1 from equations 2 and 3 (the multipliers are ℓ21 = 1 and ℓ31 = 1). The 2, 2 entry becomes zero, so exchange equations:
x + y + z = 7                              x + y + z = 7
0y - 2z = -2       exchanges into        -2y + 0z = -4
-2y + 0z = -4                                  -2z = -2
Then back substitution gives z = 1 and y = 2 and x = 4. The pivots are 1, -2, -2.
For the second system, subtract equation 1 from equation 2 as before. Add equation 1 to equation 3. This leaves zero in the 2, 2 entry and below:
x + y + z = 7        There is no pivot in column 2.
0y - 2z = -2         A further elimination step gives 0z = 8
0y + 2z = 10         The three planes don't meet!
Plane 1 meets plane 2 in a line. Plane 1 meets plane 3 in a parallel line. No solution. If we change the "3" in the original third equation to "-5" then elimination would leave 2z = 2 instead of 2z = 10. Now z = 1 would be consistent - we have moved the third plane. Substituting z = 1 in the first equation leaves x + y = 6. There are infinitely many solutions! The three planes now meet along a whole line.
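The two systems of Example 2.2 B can be run through one routine that exchanges rows when a zero sits in the pivot position. This Python sketch works on the rows with the right side attached as a last entry. The name eliminate_with_exchanges is hypothetical, and stopping at the first column without a pivot is a design choice of this illustration:

```python
from fractions import Fraction

def eliminate_with_exchanges(A, b):
    """Forward elimination with row exchanges.

    Returns (rows, pivots). Each row carries its right side as the last entry.
    Fewer than n pivots means the system is singular.
    """
    n = len(A)
    rows = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    pivots = []
    for j in range(n):
        # Look for a nonzero entry in column j, at or below row j.
        swap = next((i for i in range(j, n) if rows[i][j] != 0), None)
        if swap is None:
            break                                  # no pivot in this column
        rows[j], rows[swap] = rows[swap], rows[j]  # exchange (no-op if swap == j)
        pivots.append(rows[j][j])
        for i in range(j + 1, n):
            multiplier = rows[i][j] / rows[j][j]
            rows[i] = [ri - multiplier * rj for ri, rj in zip(rows[i], rows[j])]
    return rows, pivots

A1 = [[1, 1, 1], [1, 1, -1], [1, -1, 1]]
A2 = [[1, 1, 1], [1, 1, -1], [-1, -1, 1]]
rows1, pivots1 = eliminate_with_exchanges(A1, [7, 5, 3])
rows2, pivots2 = eliminate_with_exchanges(A2, [7, 5, 3])
print(pivots1, pivots2)
```

The first system finds three pivots after one exchange. The second stops with a single pivot, leaving the rows 0y - 2z = -2 and 0y + 2z = 10 that contradict each other.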
Problem Set 2.2
Problems 1-10 are about elimination on 2 by 2 systems.
1 What multiple ℓ of equation 1 should be subtracted from equation 2?
2x + 3y = 1
10x + 9y = 11.
After this elimination step, write down the upper triangular system and circle the two pivots. The numbers 1 and 11 have no influence on those pivots.
2 Solve the triangular system of Problem 1 by back substitution, y before x. Verify that x times (2, 10) plus y times (3, 9) equals (1, 11). If the right side changes to (4, 44), what is the new solution?
3 What multiple of equation 1 should be subtracted from equation 2?
2x - 4y = 6
-x + 5y = 0.
After this elimination step, solve the triangular system. If the right side changes to (-6, 0), what is the new solution?
4 What multiple ℓ of equation 1 should be subtracted from equation 2?
ax + by = f
cx + dy = g.
The first pivot is a (assumed nonzero). Elimination produces what formula for the second pivot? What is y? The second pivot is missing when ad = bc.
5 Choose a right side which gives no solution and another right side which gives infinitely many solutions. What are two of those solutions?
3x + 2y = 10
6x + 4y = __ .
6 Choose a coefficient b that makes this system singular. Then choose a right side g that makes it solvable. Find two solutions in that singular case.
2x + by = 16
4x + 8y = g.
7 For which numbers a does elimination break down (1) permanently (2) temporarily?
ax + 3y = -3
4x + 6y = 6.
Solve for x and y after fixing the second breakdown by a row exchange.
8 For which three numbers k does elimination break down? Which is fixed by a row exchange? In each case, is the number of solutions 0 or 1 or ∞?
kx + 3y = 6
3x + ky = -6.
9 What test on b1 and b2 decides whether these two equations allow a solution? How many solutions will they have? Draw the column picture.
3x - 2y = b1
6x - 4y = b2.
10 In the xy plane, draw the lines x + y = 5 and x + 2y = 6 and the equation y = __ that comes from elimination. The line 5x - 4y = c will go through the solution of these equations if c = __ .
Problems 11-20 study elimination on 3 by 3 systems (and possible failure).
11 Reduce this system to upper triangular form by two row operations:
2x + 3y + z = 8
4x + 7y + 5z = 20
-2y + 2z = 0.
Circle the pivots. Solve by back substitution for z, y, x.
12 Apply elimination (circle the pivots) and back substitution to solve
2x - 3y = 3
4x - 5y + z = 7
2x - y - 3z = 5.
List the three row operations: Subtract __ times row __ from row __ .
13 Which number d forces a row exchange, and what is the triangular system (not singular) for that d? Which d makes this system singular (no third pivot)?
2x + 5y + z = 0
4x + dy + z = 2
y - z = 3.
14 Which number b leads later to a row exchange? Which b leads to a missing pivot? In that singular case find a nonzero solution x, y, z.
x + by = 0
x - 2y - z = 0
y + z = 0.
15 (a) Construct a 3 by 3 system that needs two row exchanges to reach a triangular form and a solution.
(b) Construct a 3 by 3 system that needs a row exchange to keep going, but breaks down later.
16 If rows 1 and 2 are the same, how far can you get with elimination (allowing row exchange)? If columns 1 and 2 are the same, which pivot is missing?
2x - y + z = 0        2x + 2y + z = 0
2x - y + z = 0        4x + 4y + z = 0
4x + y + z = 2        6x + 6y + z = 2.
17 Construct a 3 by 3 example that has 9 different coefficients on the left side, but rows 2 and 3 become zero in elimination. How many solutions to your system with b = (1, 10, 100) and how many with b = (0, 0, 0)?
18 Which number q makes this system singular and which right side t gives it infinitely many solutions? Find the solution that has z = 1.
x + 4y - 2z = 1
x + 7y - 6z = 6
3y + qz = t.
19 (Recommended) It is impossible for a system of linear equations to have exactly two solutions. Explain why.
(a) If (x, y, z) and (X, Y, Z) are two solutions, what is another one?
(b) If 25 planes meet at two points, where else do they meet?
20 Three planes can fail to have an intersection point, when no two planes are parallel. The system is singular if row 3 of A is a __ of the first two rows. Find a third equation that can't be solved if x + y + z = 0 and x - 2y - z = 1.
Problems 21-23 move up to 4 by 4 and n by n.
21 Find the pivots and the solution for these four equations:
2x + y = 0
x + 2y + z = 0
y + 2z + t = 0
z + 2t = 5.
22 This system has the same pivots and right side as Problem 21. How is the solution different (if it is)?
2x - y = 0
-x + 2y - z = 0
-y + 2z - t = 0
-z + 2t = 5.
23 If you extend Problems 21-22 following the 1, 2, 1 pattern or the -1, 2, -1 pattern, what is the fifth pivot? What is the nth pivot?
24 If elimination leads to these equations, find three possible original matrices A:
x + y + z = 0
y + z = 0
3z = 0.
25 For which two numbers a will elimination fail on A = [: ; ]?
26 For which three numbers a will elimination fail to give three pivots?
27 Look for a matrix that has row sums 4 and 8, and column sums 2 and s:
a + b = 4      a + c = 2
c + d = 8      b + d = s
The four equations are solvable only if s = __ . Then find two different matrices that have the correct row and column sums. Extra credit: Write down the 4 by 4 system Ax = b with x = (a, b, c, d) and make A triangular by elimination.
28 Elimination in the usual order gives what pivot matrix and what solution to this "lower triangular" system? We are really solving by forward substitution:
3x = 3
6x + 2y = 8
9x - 2y + z = 9.
29 Create a MATLAB command A(2, :) = ... for the new row 2, to subtract 3 times row 1 from the existing row 2 if the matrix A is already known.
30 Find experimentally the average first and second and third pivot sizes (use the absolute value) in MATLAB's A = rand(3, 3). The average of abs(A(1, 1)) should be 0.5 but I don't know the others.
ELIMINATION USING MATRICES ■ 2.3
We now combine two ideas - elimination and matrices. The goal is to express all the steps of elimination (and the final result) in the clearest possible way. In a 3 by 3 example, elimination could be described in words. For larger systems, a long list of steps would be hopeless. You will see how to subtract a multiple of one row from another row - using matrices.
The matrix form of a linear system is Ax = b. Here are b, x, and A:
1 The vector of right sides is b.
2 The vector of unknowns is x. (The unknowns change to x1, x2, x3, ... because we run out of letters before we run out of numbers.)
3 The coefficient matrix is A. In this chapter A is square.
The example in the previous section has the beautifully short form Ax = b:
2x1 + 4x2 - 2x3 = 2
4x1 + 9x2 - 3x3 = 8       is the same as      [2 4 -2; 4 9 -3; -2 -3 7] [x1; x2; x3] = [2; 8; 10].      (1)
-2x1 - 3x2 + 7x3 = 10
The nine numbers on the left go into the matrix A. That matrix not only sits beside x, it multiplies x. The rule for "A times x" is exactly chosen to yield the three equations.
Review of A times x. A matrix times a vector gives a vector. The matrix is square when the number of equations (three) matches the number of unknowns (three). Our matrix is 3 by 3. A general square matrix is n by n. Then the vector x is in n-dimensional space. This example is in 3-dimensional space:
The unknown is x = [x1; x2; x3] and the solution is x = [-1; 2; 2].
Key point: Ax = b represents the row form and also the column form of the equations.
We can multiply by taking a column of A at a time:
Ax = (-1) [2; 4; -2] + 2 [4; 9; -3] + 2 [-2; -3; 7] = [2; 8; 10].      (2)
This rule is used so often that we express it once more for emphasis.
2A The product Ax is a combination of the columns of A. Components of x multiply columns: Ax = x1 times (column 1) + ... + xn times (column n).
One point to repeat about matrix notation: The entry in row 1, column 1 (the top left corner) is called a11. The entry in row 1, column 3 is a13. The entry in row 3, column 1 is a31. (Row number comes before column number.) The word "entry" for a matrix corresponds to the word "component" for a vector. General rule: The entry in row i, column j of the matrix A is aij.
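Example 1 This matrix has $a_{ij} = 2i + j$. Then $a_{11} = 3$. Also $a_{12} = 4$ and $a_{21} = 5$. Here is Ax with numbers and letters:

$$\begin{bmatrix} 3 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \cdot 2 + 4 \cdot 1 \\ 5 \cdot 2 + 6 \cdot 1 \end{bmatrix} \qquad \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 \\ a_{21}x_1 + a_{22}x_2 \end{bmatrix}.$$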
The first component of Ax is 6 + 4 = 10. That is the product of the row [3 4] with the column (2, 1). A row times a column gives a dot product!

The ith component of Ax involves row i, which is $[\,a_{i1}\ a_{i2}\ \cdots\ a_{in}\,]$. The short formula for its dot product with x uses "sigma notation":

2B The ith component of Ax is $a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n$. This is $\sum_{j=1}^{n} a_{ij}x_j$.

The sigma symbol $\sum$ is an instruction to add. Start with j = 1 and stop with j = n. Start the sum with $a_{i1}x_1$ and stop with $a_{in}x_n$.¹
The Matrix Form of One Elimination Step
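Ax = b is a convenient form for the original equation. What about the elimination steps? The first step in this example subtracts 2 times the first equation from the second equation. On the right side, 2 times the first component of b is subtracted from the second component:

$$b = \begin{bmatrix} 2 \\ 8 \\ 10 \end{bmatrix} \quad\text{changes to}\quad b_{\text{new}} = \begin{bmatrix} 2 \\ 4 \\ 10 \end{bmatrix}.$$

We want to do that subtraction with a matrix! The same result $b_{\text{new}} = Eb$ is achieved when we multiply an "elimination matrix" E times b. It subtracts $2b_1$ from $b_2$:

$$\text{The elimination matrix is}\quad E = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Multiplication by E subtracts 2 times row 1 from row 2. Rows 1 and 3 stay the same:

$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 8 \\ 10 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 10 \end{bmatrix} \qquad \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 - 2b_1 \\ b_3 \end{bmatrix}.$$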
Notice how $b_1 = 2$ and $b_3 = 10$ stay the same. The first and third rows of E are the first and third rows of the identity matrix I. The new second component is the number 4 that appeared after the elimination step. This is $b_2 - 2b_1$.
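Here is one way that step might look in code. It is a minimal sketch: the helper names `elimination_matrix` and `times_vector` are invented for illustration, and rows and columns are numbered from 1 to match the text.

```python
def elimination_matrix(n, i, j, multiplier):
    # Start from the n by n identity and put -multiplier in row i, column j.
    E = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    E[i - 1][j - 1] = -multiplier
    return E

def times_vector(E, b):
    # Each component of Eb is a row of E times the column b.
    return [sum(e * bk for e, bk in zip(row, b)) for row in E]

E21 = elimination_matrix(3, 2, 1, 2)  # subtract 2 times row 1 from row 2
b = [2, 8, 10]

print(E21)                   # [[1, 0, 0], [-2, 1, 0], [0, 0, 1]]
print(times_vector(E21, b))  # [2, 4, 10]
```

Only the second component changes, because only the second row of E differs from the identity.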
¹ Einstein shortened this even more by omitting the $\sum$. The repeated j in $a_{ij}x_j$ automatically meant addition. He also wrote the sum as $a_i^j x_j$. Not being Einstein, we include the $\sum$.
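It is easy to describe the "elementary matrices" or "elimination matrices" like E. Start with the identity matrix I. Change one of its zeros to the multiplier $-\ell$:

2C The identity matrix has 1's on the diagonal and otherwise 0's. Then Ib = b. The elementary matrix or elimination matrix $E_{ij}$ that subtracts a multiple $\ell$ of row j from row i has the extra nonzero entry $-\ell$ in the i, j position.

Example 2

$$\text{Identity } I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \text{Elimination } E_{31} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -\ell & 0 & 1 \end{bmatrix}.$$

When you multiply I times b, you get b. But $E_{31}$ subtracts $\ell$ times the first component from the third component. With $\ell = 4$ we get 9 - 4 = 5:

$$Ib = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 9 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 9 \end{bmatrix} \quad\text{and}\quad Eb = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 9 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}.$$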
What about the left side of Ax = b? The multiplier $\ell = 4$ was chosen to produce a zero, by subtracting 4 times the pivot. $E_{31}$ creates a zero in the (3, 1) position.

The notation fits this purpose. Start with A. Apply E's to produce zeros below the pivots (the first E is $E_{21}$). End with a triangular U. We now look in detail at those steps.

First a small point. The vector x stays the same. The solution is not changed by elimination. (That may be more than a small point.) It is the coefficient matrix that is changed! When we start with Ax = b and multiply by E, the result is EAx = Eb. The new matrix EA is the result of multiplying E times A.
Matrix Multiplication
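The big question is: How do we multiply two matrices? When the first matrix is E (an elimination matrix), there is already an important clue. We know A, and we know what it becomes after the elimination step. To keep everything right, we hope and expect that EA is

$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 4 & -2 \\ 4 & 9 & -3 \\ -2 & -3 & 7 \end{bmatrix} = \begin{bmatrix} 2 & 4 & -2 \\ 0 & 1 & 1 \\ -2 & -3 & 7 \end{bmatrix} \quad\text{(with the zero)}.$$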
This step does not change rows 1 and 3 of A. Those rows are unchanged in EA; only row 2 is different. Twice the first row has been subtracted from the second row. Matrix multiplication agrees with elimination, and the new system of equations is EAx = Eb.

EAx = Eb is simple but it involves a subtle idea. Multiplying both sides of the original equation gives E(Ax) = Eb. With our proposed multiplication of matrices, this is also (EA)x = Eb. The first was E times Ax, the second is EA times x. They are the same! The parentheses are not needed. We just write EAx = Eb.

When multiplying ABC, you can do BC first or you can do AB first. This is the point of an "associative law" like 3 × (4 × 5) = (3 × 4) × 5. We multiply 3 times 20, or we multiply 12 times 5. Both answers are 60. That law seems so obvious that it is hard to imagine it could be false. But the "commutative law" 3 × 4 = 4 × 3 looks even more obvious. For matrices, EA is different from AE.
2D ASSOCIATIVE LAW A(BC) = (AB)C
NOT COMMUTATIVE LAW Often AB ≠ BA.
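Both laws can be checked numerically on the example. The sketch below assumes a plain triple-loop `matmul` helper (not from the text) and uses the E and A of this section, with x = (-1, 2, 2) written as a 3 by 1 matrix.

```python
def matmul(X, Y):
    # Entry (i, j) is row i of X times column j of Y.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

E = [[1, 0, 0], [-2, 1, 0], [0, 0, 1]]
A = [[2, 4, -2], [4, 9, -3], [-2, -3, 7]]
x = [[-1], [2], [2]]  # column vector stored as a 3 by 1 matrix

EA = matmul(E, A)
AE = matmul(A, E)
print(EA)  # [[2, 4, -2], [0, 1, 1], [-2, -3, 7]]
print(AE)  # [[-6, 4, -2], [-14, 9, -3], [4, -3, 7]]

# Associative law: E times (Ax) equals (EA) times x.
print(matmul(E, matmul(A, x)) == matmul(EA, x))  # True
```

EA changed a row of A, while AE changed a column. That is why the order matters.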
There is another requirement on matrix multiplication. Suppose B has only one column (this column is b). The matrix-matrix law for EB should be consistent with the old matrix-vector law for Eb. Even more, we should be able to multiply matrices a column at a time:

If B has several columns $b_1, b_2, b_3$, then EB has columns $Eb_1, Eb_2, Eb_3$.

This holds true for the matrix multiplication above (where the matrix is A instead of B). If you multiply column 1 of A by E, you get column 1 of EA:

$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \\ -2 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \\ -2 \end{bmatrix} \quad\text{and}\quad E(\text{column } j \text{ of } A) = \text{column } j \text{ of } EA.$$
This requirement deals with columns, while elimination deals with rows. The next section describes each individual entry of the product. The beauty of matrix multiplication is that all three approaches (rows, columns, whole matrices) come out right.
The Matrix Pij for a Row Exchange
To subtract row j from row i we use $E_{ij}$. To exchange or "permute" those rows we use another matrix $P_{ij}$. Row exchanges are needed when zero is in the pivot position. Lower down that pivot column may be a nonzero. By exchanging the two rows, we have a pivot (never zero!) and elimination goes forward.

What matrix $P_{23}$ exchanges row 2 with row 3? We can find it by exchanging rows of the identity matrix I:

$$\text{Permutation matrix}\qquad P_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$
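This is a row exchange matrix. Multiplying by $P_{23}$ exchanges components 2 and 3 of any column vector. Therefore it also exchanges rows 2 and 3 of any matrix:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ 3 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 4 & 1 \\ 0 & 0 & 3 \\ 0 & 6 & 5 \end{bmatrix} = \begin{bmatrix} 2 & 4 & 1 \\ 0 & 6 & 5 \\ 0 & 0 & 3 \end{bmatrix}.$$

On the right, $P_{23}$ is doing what it was created for. With zero in the second pivot position and "6" below it, the exchange puts 6 into the pivot.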
Matrices act. They don't just sit there. We will soon meet other permutation matrices, which can change the order of several rows. Rows 1, 2, 3 can be moved to 3, 1, 2. Our $P_{23}$ is one particular permutation matrix: it exchanges rows 2 and 3.

2E Row Exchange Matrix $P_{ij}$ is the identity matrix with rows i and j reversed. When $P_{ij}$ multiplies a matrix A, it exchanges rows i and j of A.

$$\text{To exchange equations 1 and 3 multiply by}\quad P_{13} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}.$$
Usually row exchanges are not required. The odds are good that elimination uses only the $E_{ij}$. But the $P_{ij}$ are ready if needed, to move a pivot up to the diagonal.
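A row exchange can be sketched the same way as an elimination step. The helper `permutation_matrix` below is a hypothetical name, and the 3 by 3 matrix A is an assumed example with a zero in the second pivot position and a 6 beneath it.

```python
def permutation_matrix(n, i, j):
    # Identity matrix with rows i and j exchanged (rows numbered from 1).
    P = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    P[i - 1], P[j - 1] = P[j - 1], P[i - 1]
    return P

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

P23 = permutation_matrix(3, 2, 3)
A = [[2, 4, 1],
     [0, 0, 3],   # zero in the second pivot position
     [0, 6, 5]]

print(P23)             # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(matmul(P23, A))  # [[2, 4, 1], [0, 6, 5], [0, 0, 3]]
```

Applying the same exchange twice returns the original matrix, so $P_{23}P_{23} = I$.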
The Augmented Matrix

This book eventually goes far beyond elimination. Matrices have all kinds of practical applications, in which they are multiplied. Our best starting point was a square E times a square A, because we met this in elimination, and we know what answer to expect for EA. The next step is to allow a rectangular matrix. It still comes from our original equations, but now it includes the right side b.

Key idea: Elimination does the same row operations to A and to b. We can include b as an extra column and follow it through elimination. The matrix A is enlarged or "augmented" by the extra column b:

$$\text{Augmented matrix}\qquad [\,A\ \ b\,] = \begin{bmatrix} 2 & 4 & -2 & 2 \\ 4 & 9 & -3 & 8 \\ -2 & -3 & 7 & 10 \end{bmatrix}.$$

Elimination acts on whole rows of this matrix. The left side and right side are both multiplied by E, to subtract 2 times equation 1 from equation 2. With [A b] those steps happen together:

$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 4 & -2 & 2 \\ 4 & 9 & -3 & 8 \\ -2 & -3 & 7 & 10 \end{bmatrix} = \begin{bmatrix} 2 & 4 & -2 & 2 \\ 0 & 1 & 1 & 4 \\ -2 & -3 & 7 & 10 \end{bmatrix}.$$

The new second row contains 0, 1, 1, 4. The new second equation is $x_2 + x_3 = 4$. Matrix multiplication works by rows and at the same time by columns:
R (by rows): Each row of E acts on [A b] to give a row of [EA Eb].

C (by columns): E acts on each column of [A b] to give a column of [EA Eb].

Notice again that word "acts." This is essential. Matrices do something! The matrix A acts on x to produce b. The matrix E operates on A to give EA. The whole process of elimination is a sequence of row operations, alias matrix multiplications. A goes to $E_{21}A$ which goes to $E_{31}E_{21}A$. Finally $E_{32}E_{31}E_{21}A$ is a triangular matrix.

The right side is included in the augmented matrix. The end result is a triangular system of equations. We stop for exercises on multiplication by E, before writing down the rules for all matrix multiplications (including block multiplication).
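The whole sequence $E_{32}E_{31}E_{21}[\,A\ b\,]$ can be sketched in a few lines. This is only an illustration under simplifying assumptions: every pivot is nonzero (no row exchanges), exact fractions are used instead of floating point, and the helper names are made up for this sketch.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def elimination_matrix(n, i, j, multiplier):
    # Identity with -multiplier in row i, column j (numbered from 1).
    E = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    E[i - 1][j - 1] = -multiplier
    return E

def eliminate(augmented):
    # Multiply [A b] by E21, E31, E32, ... assuming nonzero pivots.
    n = len(augmented)
    M = [[Fraction(v) for v in row] for row in augmented]
    steps = []
    for j in range(n):
        for i in range(j + 1, n):
            multiplier = M[i][j] / M[j][j]
            M = matmul(elimination_matrix(n, i + 1, j + 1, multiplier), M)
            steps.append((i + 1, j + 1, multiplier))
    return M, steps

def back_substitute(M):
    # Solve the triangular system [U c] from the last row upward.
    n = len(M)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

Ab = [[2, 4, -2, 2], [4, 9, -3, 8], [-2, -3, 7, 10]]
M, steps = eliminate(Ab)
x = back_substitute(M)
print([int(v) for v in x])  # [-1, 2, 2]
```

The recorded `steps` are the triples (i, j, multiplier) for $E_{21}$, $E_{31}$, $E_{32}$: the multipliers here are 2, -1, and 1.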
■ REVIEW OF THE KEY IDEAS ■

1. Ax = $x_1$ times column 1 + ··· + $x_n$ times column n. And $(Ax)_i = \sum_{j=1}^{n} a_{ij}x_j$.

2. Identity matrix = I, elimination matrix = $E_{ij}$, exchange matrix = $P_{ij}$.

3. Multiplying Ax = b by $E_{21}$ subtracts a multiple $\ell_{21}$ of equation 1 from equation 2. The number $-\ell_{21}$ is the (2, 1) entry of the elimination matrix $E_{21}$.

4. For the augmented matrix [A b], that elimination step gives $[\,E_{21}A\ \ E_{21}b\,]$.

5. When A multiplies any matrix B, it multiplies each column of B separately.

■ WORKED EXAMPLES ■
2.3 A What 3 by 3 matrix $E_{21}$ subtracts 4 times row 1 from row 2? What matrix $P_{32}$ exchanges row 2 and row 3? If you multiply A on the right instead of the left, describe the results $AE_{21}$ and $AP_{32}$.

Solution By doing those operations on the identity matrix I, we find

$$E_{21} = \begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad P_{32} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$

Multiplying by $E_{21}$ on the right side will subtract 4 times column 2 from column 1. Multiplying by $P_{32}$ on the right will exchange columns 2 and 3.
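A quick numerical check of left versus right multiplication, as a sketch. The test matrix A with entries 1 through 9 is an arbitrary choice, not from the text, and `matmul` is a plain helper written for this illustration.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

E21 = [[1, 0, 0], [-4, 1, 0], [0, 0, 1]]
P32 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # arbitrary test matrix

print(matmul(E21, A))  # rows:    [[1, 2, 3], [0, -3, -6], [7, 8, 9]]
print(matmul(A, E21))  # columns: [[-7, 2, 3], [-16, 5, 6], [-25, 8, 9]]
print(matmul(A, P32))  # columns: [[1, 3, 2], [4, 6, 5], [7, 9, 8]]
```

On the left, $E_{21}$ combines rows of A. On the right, the same matrix combines columns of A.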
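2.3 B Write down the augmented matrix [A b] with an extra column:

$$\begin{aligned} x + 2y + 2z &= 1 \\ 4x + 8y + 9z &= 3 \\ 3y + 2z &= 1 \end{aligned}$$

Apply $E_{21}$ and then $P_{32}$ to reach a triangular system. Solve by back substitution. What combined matrix $P_{32}E_{21}$ will do both steps at once?

Solution The augmented matrix and the result of using $E_{21}$ are

$$[\,A\ \ b\,] = \begin{bmatrix} 1 & 2 & 2 & 1 \\ 4 & 8 & 9 & 3 \\ 0 & 3 & 2 & 1 \end{bmatrix} \quad\text{and}\quad E_{21}[\,A\ \ b\,] = \begin{bmatrix} 1 & 2 & 2 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 3 & 2 & 1 \end{bmatrix}.$$

$P_{32}$ exchanges equations 2 and 3. Back substitution produces (x, y, z):

$$P_{32}E_{21}[\,A\ \ b\,] = \begin{bmatrix} 1 & 2 & 2 & 1 \\ 0 & 3 & 2 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}.$$

For the matrix $P_{32}E_{21}$ that does both steps at once, apply $P_{32}$ to $E_{21}$!

$$P_{32}E_{21} = \text{exchange the rows of } E_{21} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ -4 & 1 & 0 \end{bmatrix}.$$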
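2.3 C Multiply these matrices in two ways: first, rows of A times columns of B to find each entry of AB, and second, columns of A times rows of B to produce two matrices that add to AB. How many separate ordinary multiplications are needed?

$$AB = \begin{bmatrix} 3 & 4 \\ 1 & 5 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} = (3 \text{ by } 2)(2 \text{ by } 2)$$

Solution Rows of A times columns of B are dot products of vectors:

$$(\text{row 1}) \cdot (\text{column 1}) = \begin{bmatrix} 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = 10 \text{ is the } (1, 1) \text{ entry of } AB$$

$$(\text{row 2}) \cdot (\text{column 1}) = \begin{bmatrix} 1 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = 7 \text{ is the } (2, 1) \text{ entry of } AB$$

The columns of AB are (10, 7, 4) and (16, 9, 8). We need 6 dot products, 2 multiplications each, 12 in all (3 · 2 · 2). The same AB comes from columns of A times rows of B:

$$AB = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} \begin{bmatrix} 2 & 4 \end{bmatrix} + \begin{bmatrix} 4 \\ 5 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \end{bmatrix} = \begin{bmatrix} 6 & 12 \\ 2 & 4 \\ 4 & 8 \end{bmatrix} + \begin{bmatrix} 4 & 4 \\ 5 & 5 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 10 & 16 \\ 7 & 9 \\ 4 & 8 \end{bmatrix}.$$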
Problem Set 2.3
Problems 1-15 are about elimination matrices.
1 Write down the 3 by 3 matrices that produce these elimination steps:

(a) $E_{21}$ subtracts 5 times row 1 from row 2.

(b) $E_{32}$ subtracts -7 times row 2 from row 3.

(c) P exchanges rows 1 and 2, then rows 2 and 3.

2 In Problem 1, applying $E_{21}$ and then $E_{32}$ to the column b = (1, 0, 0) gives $E_{32}E_{21}b$ = __ . Applying $E_{32}$ before $E_{21}$ gives $E_{21}E_{32}b$ = __ . When $E_{32}$ comes first, row __ feels no effect from row __ .

3 Which three matrices $E_{21}$, $E_{31}$, $E_{32}$ put A into triangular form U?
! A= [ -2
Multiply those E's to get one matrix M that does elimination: MA = U .
4 Include b = (1, 0, 0) as a fourth column in Problem 3 to produce [A b]. Carry out the elimination steps on this augmented matrix to solve Ax = b.
5 Suppose a33 = 7 and the third pivot is 5. If you change a33 to 11, the third pivot is _ . If you change a33 to _ _ , there is no third pivot.
6 If every column of A is a multiple of (1, 1, 1), then Ax is always a multiple of (1, 1, 1). Do a 3 by 3 example. How many pivots are produced by elimination?
7 Suppose $E_{31}$ subtracts 7 times row 1 from row 3. To reverse that step you should __ 7 times row __ to row __ . This "inverse matrix" is $R_{31}$ = __ .

8 Suppose $E_{31}$ subtracts 7 times row 1 from row 3. What matrix $R_{31}$ is changed into I? Then $E_{31}R_{31} = I$ where Problem 7 has $R_{31}E_{31} = I$. Both are true!

9 (a) $E_{21}$ subtracts row 1 from row 2 and then $P_{23}$ exchanges rows 2 and 3. What matrix $M = P_{23}E_{21}$ does both steps at once?

(b) $P_{23}$ exchanges rows 2 and 3 and then $E_{31}$ subtracts row 1 from row 3. What matrix $M = E_{31}P_{23}$ does both steps at once? Explain why the M's are the same but the E's are different.

10 (a) What 3 by 3 matrix $E_{13}$ will add row 3 to row 1?

(b) What matrix adds row 1 to row 3 and at the same time row 3 to row 1?

(c) What matrix adds row 1 to row 3 and then adds row 3 to row 1?
11 Create a matrix that has $a_{11} = a_{22} = a_{33} = 1$ but elimination produces two negative pivots without row exchanges. (The first pivot is 1.)
12 Multiply these matrices:
[~ fl[~ i][~ 0
2
0
I 0
5 8
1 0
gJ
!l [-! ~J [: 0
2
I
3
-1 0
4
13 Explain these facts. If the third column of B is all zero, the third column of EB is all zero (for any E). If the third row of B is all zero, the third row of EB might not be zero.

14 This 4 by 4 matrix will need elimination matrices $E_{21}$ and $E_{32}$ and $E_{43}$. What are those matrices?

$$A = \begin{bmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix}.$$

15 Write down the 3 by 3 matrix that has $a_{ij} = 2i - 3j$. This matrix has $a_{32} = 0$, but elimination still needs $E_{32}$ to produce a zero in the 3, 2 position. Which previous step destroys the original zero and what is $E_{32}$?
Problems 16-23 are about creating and multiplying matrices.

16 Write these ancient problems in a 2 by 2 matrix form Ax = b and solve them:

(a) X is twice as old as Y and their ages add to 33.

(b) (x, y) = (2, 5) and (3, 7) lie on the line y = mx + c. Find m and c.

17 The parabola $y = a + bx + cx^2$ goes through the points (x, y) = (1, 4) and (2, 8) and (3, 14). Find and solve a matrix equation for the unknowns (a, b, c).

18 Multiply these matrices in the orders EF and FE and $E^2$:

$$E = \begin{bmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & 0 & 1 \end{bmatrix} \qquad F = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & c & 1 \end{bmatrix}.$$

Also compute $E^2 = EE$ and $F^3 = FFF$.

19 Multiply these row exchange matrices in the orders PQ and QP and $P^2$:

$$P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad Q = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}.$$

Find four matrices whose squares are $M^2 = I$.
20 (a) Suppose all columns of B are the same. Then all columns of EB are the same, because each one is E times __ .

(b) Suppose all rows of B are [1 2 4]. Show by example that all rows of EB are not [1 2 4]. It is true that those rows are __ .

21 If E adds row 1 to row 2 and F adds row 2 to row 1, does EF equal FE?

22 The entries of A and x are $a_{ij}$ and $x_j$. So the first component of Ax is $\sum a_{1j}x_j = a_{11}x_1 + \cdots + a_{1n}x_n$. If $E_{21}$ subtracts row 1 from row 2, write a formula for

(a) the third component of Ax
(b) the (2, 1) entry of $E_{21}A$
(c) the (2, 1) entry of $E_{21}(E_{21}A)$
(d) the first component of EAx.

23 The elimination matrix $E = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}$ subtracts 2 times row 1 of A from row 2 of A. The result is EA. What is the effect of E(EA)? In the opposite order AE, we are subtracting 2 times __ of A from __ . (Do examples.)
Problems 24-29 include the column b in the augmented matrix [ A b ].
24 Apply elimination to the 2 by 3 augmented matrix [A b]. What is the triangular system Ux = c? What is the solution x?

$$Ax = \begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 17 \end{bmatrix}.$$
25 Apply elimination to the 3 by 4 augmented matrix [ A b ]. How do you know this system has no solution? Change the last number 6 so there is a solution.
26 The equations Ax = b and Ax* = b* have the same matrix A. What double augmented matrix should you use in elimination to solve both equations at once? Solve both of these equations by working on a 2 by 4 matrix:

27 Choose the numbers a, b, c, d in this augmented matrix so that there is (a) no solution (b) infinitely many solutions.

$$[\,A\ \ b\,] = \begin{bmatrix} 1 & 2 & 3 & a \\ 0 & 4 & 5 & b \\ 0 & 0 & d & c \end{bmatrix}$$

Which of the numbers a, b, c, or d have no effect on the solvability?
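28 If AB = I and BC = I use the associative law to prove A = C.

29 Choose two matrices $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ with det M = ad - bc = 1 and with a, b, c, d positive integers. Prove that every such matrix M has

EITHER row 1 ≤ row 2 OR row 2 ≤ row 1.

Subtraction makes $\begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} M$ or $\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} M$ nonnegative but smaller than M. If you continue and reach I, write your M's as products of the inverses $\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$ and $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$.

30 Find the triangular matrix E that reduces "Pascal's matrix" to a smaller Pascal:

$$E \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 1 & 3 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 2 & 1 \end{bmatrix}.$$

Challenge question: Which M (from several E's) reduces Pascal all the way to I?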
RULES FOR MATRIX OPERATIONS ■ 2.4
I will start with basic facts. A matrix is a rectangular array of numbers or "entries." When A has m rows and n columns, it is an "m by n" matrix. Matrices can be added if their shapes are the same. They can be multiplied by any constant c. Here are examples of A + B and 2A, for 3 by 2 matrices:

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 2 & 2 \\ 4 & 4 \\ 9 & 9 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 7 & 8 \\ 9 & 9 \end{bmatrix} \quad\text{and}\quad 2\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 6 & 8 \\ 0 & 0 \end{bmatrix}.$$

Matrices are added exactly as vectors are, one entry at a time. We could even regard a column vector as a matrix with only one column (so n = 1). The matrix -A comes from multiplication by c = -1 (reversing all the signs). Adding A to -A leaves the zero matrix, with all entries zero.
The 3 by 2 zero matrix is different from the 2 by 3 zero matrix. Even zero has a shape (several shapes) for matrices. All this is only common sense.
The entry in row i and column j is called $a_{ij}$ or A(i, j). The n entries along the first row are $a_{11}, a_{12}, \ldots, a_{1n}$. The lower left entry in the matrix is $a_{m1}$ and the lower right is $a_{mn}$. The row number i goes from 1 to m. The column number j goes from 1 to n.

Matrix addition is easy. The serious question is matrix multiplication. When can we multiply A times B, and what is the product AB? We cannot multiply when A and B are 3 by 2. They don't pass the following test:

To multiply AB: If A has n columns, B must have n rows.

If A has two columns, B must have two rows. When A is 3 by 2, the matrix B can be 2 by 1 (a vector) or 2 by 2 (square) or 2 by 20. Every column of B is ready to be multiplied by A. Then AB is 3 by 1 (a vector) or 3 by 2 or 3 by 20.
Suppose A is m by n and B is n by p. We can multiply. The product AB is m by p.

$$\begin{bmatrix} m \text{ rows} \\ n \text{ columns} \end{bmatrix} \begin{bmatrix} n \text{ rows} \\ p \text{ columns} \end{bmatrix} = \begin{bmatrix} m \text{ rows} \\ p \text{ columns} \end{bmatrix}.$$

A row times a column is an extreme case. Then 1 by n multiplies n by 1. The result is 1 by 1. That single number is the "dot product."

In every case AB is filled with dot products. For the top corner, the (1, 1) entry of AB is (row 1 of A) · (column 1 of B). To multiply matrices, take all these dot products: (each row of A) · (each column of B).

2F The entry in row i and column j of AB is (row i of A) · (column j of B).

Figure 2.8 picks out the second row (i = 2) of a 4 by 5 matrix A. It picks out the third column (j = 3) of a 5 by 6 matrix B. Their dot product goes into row 2 and column 3 of AB. The matrix AB has as many rows as A (4 rows), and as many columns as B.
[Figure 2.8: row i of A, with entries $a_{i1}, \ldots, a_{i5}$, times column j of B, with entries $b_{1j}, \ldots, b_{5j}$, gives the entry $(AB)_{ij}$. A is 4 by 5, B is 5 by 6, AB is 4 by 6.]

Figure 2.8 Here i = 2 and j = 3. Then $(AB)_{23}$ is (row 2) · (column 3) = $\sum a_{2k}b_{k3}$.
Example 1 Square matrices can be multiplied if and only if they have the same size:

$$\begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 5 & 6 \\ 1 & 0 \end{bmatrix}.$$

The first dot product is 1 · 2 + 1 · 3 = 5. Three more dot products give 6, 1, and 0. Each dot product requires two multiplications, thus eight in all.

If A and B are n by n, so is AB. It contains $n^2$ dot products, row of A times column of B. Each dot product needs n multiplications, so the computation of AB uses $n^3$ separate multiplications. For n = 100 we multiply a million times. For n = 2 we have $n^3 = 8$.
Mathematicians thought until recently that AB absolutely needed $2^3 = 8$ multiplications. Then somebody found a way to do it with 7 (and extra additions). By breaking n by n matrices into 2 by 2 blocks, this idea also reduced the count for large matrices. Instead of $n^3$ it went below $n^{2.8}$, and the exponent keeps falling.¹ The best at this moment is $n^{2.376}$. But the algorithm is so awkward that scientific computing is done the regular way: $n^2$ dot products in AB, and n multiplications for each one.

¹ Maybe the exponent won't stop falling before 2. No number in between looks special.
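For the curious, here is a sketch of a seven-multiplication scheme for 2 by 2 matrices. The text does not name the method; the formulas below are the ones usually attributed to Strassen, and the function names are invented for this illustration. The check uses the 2 by 2 matrices with rows (1, 1), (2, -1) and (2, 2), (3, 4).

```python
def direct_2x2(A, B):
    # The regular way: four dot products, eight multiplications.
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    return [[a11 * b11 + a12 * b21, a11 * b12 + a12 * b22],
            [a21 * b11 + a22 * b21, a21 * b12 + a22 * b22]]

def seven_mult_2x2(A, B):
    # Seven multiplications m1..m7, paid for with extra additions.
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

A = [[1, 1], [2, -1]]
B = [[2, 2], [3, 4]]
print(direct_2x2(A, B))      # [[5, 6], [1, 0]]
print(seven_mult_2x2(A, B))  # [[5, 6], [1, 0]]
```

The saving only pays off when the entries are themselves large blocks and the idea is applied recursively. For small matrices the extra additions cost more than the one multiplication saved.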
Example 2 Suppose A is a row vector (1 by 3) and B is a column vector (3 by 1). Then AB is 1 by 1 (only one entry, the dot product). On the other hand B times A (a column times a row) is a full 3 by 3 matrix. This multiplication is allowed!

$$\text{Column times row:}\qquad \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 2 & 3 \\ 2 & 4 & 6 \end{bmatrix}.$$

A row times a column is an "inner" product; that is another name for dot product. A column times a row is an "outer" product. These are extreme cases of matrix multiplication, with very thin matrices. They follow the rule for shapes in multiplication: (n by 1) times (1 by n). The product of column times row is n by n. Example 3 will show how to multiply AB using columns times rows.
Rows and Columns of AB
In the big picture, A multiplies each column of B. The result is a column of AB. In that column, we are combining the columns of A. Each column of AB is a combination of the columns of A. That is the column picture of matrix multiplication:

Column of AB is (matrix A) times (column of B).

The row picture is reversed. Each row of A multiplies the whole matrix B. The result is a row of AB. It is a combination of the rows of B:

$$\begin{bmatrix} \text{row } i \text{ of } A \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = \begin{bmatrix} \text{row } i \text{ of } AB \end{bmatrix}.$$

We see row operations in elimination (E times A). We see columns in A times x. The "row-column picture" has the dot products of rows with columns. Believe it or not, there is also a "column-row picture." Not everybody knows that columns 1, ..., n of A multiply rows 1, ..., n of B and add up to the same answer AB.
The Laws for Matrix Operations

May I put on record six laws that matrices do obey, while emphasizing an equation they don't obey? The matrices can be square or rectangular, and the laws involving A + B are all simple and all obeyed. Here are three addition laws:

A + B = B + A (commutative law)
c(A + B) = cA + cB (distributive law)
A + (B + C) = (A + B) + C (associative law).

Three more laws hold for multiplication, but AB = BA is not one of them:

AB ≠ BA (the commutative "law" is usually broken)
C(A + B) = CA + CB (distributive law from the left)
(A + B)C = AC + BC (distributive law from the right)
A(BC) = (AB)C (associative law for ABC) (parentheses not needed).
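When A and B are not square, AB is a different size from BA. These matrices can't be equal, even if both multiplications are allowed. For square matrices, almost any example shows that AB is different from BA:

$$AB = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{but}\quad BA = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.$$

It is true that AI = IA. All square matrices commute with I and also with cI. Only these matrices cI commute with all other matrices.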
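A two-line experiment makes the point. The matrices below are a simple assumed example (one nonzero entry each), and `matmul` is a small helper written for this sketch.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[0, 0], [1, 0]]
B = [[0, 1], [0, 0]]
print(matmul(A, B))  # [[0, 0], [0, 1]]
print(matmul(B, A))  # [[1, 0], [0, 0]]

# A multiple of the identity, cI with c = 3, commutes with A.
cI = [[3, 0], [0, 3]]
print(matmul(cI, A) == matmul(A, cI))  # True
```

Multiplying by cI just scales every entry by c, so the side it multiplies from cannot matter.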
The law A(B + C) = AB + AC is proved a column at a time. Start with A(b + c) = Ab + Ac for the first column. That is the key to everything: linearity. Say no more.

The law A(BC) = (AB)C means that you can multiply BC first or AB first. The direct proof is sort of awkward (Problem 16) but this law is extremely useful. We highlighted it above; it is the key to the way we multiply matrices.
Look at the special case when A = B = C = square matrix. Then (A times $A^2$) = ($A^2$ times A). The product in either order is $A^3$. The matrix powers $A^p$ follow the same rules as numbers:

$$A^p = AAA \cdots A \ (p \text{ factors}) \qquad (A^p)(A^q) = A^{p+q} \qquad (A^p)^q = A^{pq}.$$

Those are the ordinary laws for exponents. $A^3$ times $A^4$ is $A^7$ (seven factors). $A^3$ to the fourth power is $A^{12}$ (twelve A's). When p and q are zero or negative these rules still hold, provided A has a "-1 power", which is the inverse matrix $A^{-1}$. Then $A^0 = I$ is the identity matrix (no factors).

For a number, $a^{-1}$ is 1/a. For a matrix, the inverse is written $A^{-1}$. (It is never 1/A, except this is allowed in MATLAB.) Every number has an inverse except a = 0. To decide when A has an inverse is a central problem in linear algebra. Section 2.5 will start on the answer. This section is a Bill of Rights for matrices, to say when A and B can be multiplied and how.
Block Matrices and Block Multiplication
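We have to say one more thing about matrices. They can be cut into blocks (which are smaller matrices). This often happens naturally. Here is a 4 by 6 matrix broken into blocks of size 2 by 2, and each block is just I:

$$A = \left[\begin{array}{cc|cc|cc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ \hline 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \end{array}\right] = \begin{bmatrix} I & I & I \\ I & I & I \end{bmatrix}.$$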
If B is also 4 by 6 and its block sizes match the block sizes in A, you can add A + B a block at a time.

We have seen block matrices before. The right side vector b was placed next to A in the "augmented matrix." Then [A b] has two blocks of different sizes. Multiplying by an elimination matrix gave [EA Eb]. No problem to multiply blocks times blocks, when their shapes permit:

2G Block multiplication If the cuts between columns of A match the cuts between rows of B, then block multiplication of AB is allowed:

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} B_{11} & \cdots \\ B_{21} & \cdots \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & \cdots \\ A_{21}B_{11} + A_{22}B_{21} & \cdots \end{bmatrix}. \tag{1}$$
This equation is the same as if the blocks were numbers (which are 1 by 1 blocks). We are careful to keep A's in front of B's, because BA can be different. The cuts between rows of A give cuts between rows of AB. Any column cuts in B are also column cuts in AB.

Main point When matrices split into blocks, it is often simpler to see how they act. The block matrix of I's above is much clearer than the original 4 by 6 matrix A.
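Block multiplication can be tested against ordinary multiplication. In this sketch the two 4 by 4 matrices are arbitrary test data, each is cut into four 2 by 2 blocks, and the helper names `block` and `add` are assumptions made for the illustration.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def block(M, rows, cols):
    # Extract the submatrix with the given row and column indices.
    return [[M[i][j] for j in cols] for i in rows]

A = [[1, 2, 0, 1], [3, 4, 1, 0], [0, 1, 2, 2], [1, 0, 1, 3]]
B = [[2, 0, 1, 1], [1, 1, 0, 2], [0, 3, 1, 0], [4, 1, 2, 1]]
top, bottom = [0, 1], [2, 3]

A11, A12 = block(A, top, top), block(A, top, bottom)
A21, A22 = block(A, bottom, top), block(A, bottom, bottom)
B11, B12 = block(B, top, top), block(B, top, bottom)
B21, B22 = block(B, bottom, top), block(B, bottom, bottom)

# Same formula as for 2 by 2 matrices of numbers, keeping A's in front of B's.
C11 = add(matmul(A11, B11), matmul(A12, B21))
C12 = add(matmul(A11, B12), matmul(A12, B22))
C21 = add(matmul(A21, B11), matmul(A22, B21))
C22 = add(matmul(A21, B12), matmul(A22, B22))

# Glue the four blocks back into one 4 by 4 matrix.
C = [l + r for l, r in zip(C11, C12)] + [l + r for l, r in zip(C21, C22)]
print(C == matmul(A, B))  # True
```

The column cut in A (after column 2) matches the row cut in B (after row 2). That match is what makes each block product possible.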
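Example 3 (Important special case) Let the blocks of A be its n columns. Let the blocks of B be its n rows. Then block multiplication AB adds up columns times rows:

$$\begin{bmatrix} | & & | \\ a_1 & \cdots & a_n \\ | & & | \end{bmatrix} \begin{bmatrix} - & b_1 & - \\ & \vdots & \\ - & b_n & - \end{bmatrix} = a_1b_1 + \cdots + a_nb_n. \tag{2}$$

This is another way to multiply matrices! Compare it with the usual rows times columns. Row 1 of A times column 1 of B gave the (1, 1) entry in AB. Now column 1 of A times row 1 of B gives a full matrix, not just a single number. Look at this example:

$$\begin{bmatrix} 1 & 4 \\ 1 & 5 \end{bmatrix} \begin{bmatrix} 3 & 2 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \begin{bmatrix} 3 & 2 \end{bmatrix} + \begin{bmatrix} 4 \\ 5 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 3 & 2 \\ 3 & 2 \end{bmatrix} + \begin{bmatrix} 4 & 0 \\ 5 & 0 \end{bmatrix}. \tag{3}$$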
We stop there so you can see columns multiplying rows. If a 2 by 1 matrix (a column) multiplies a 1 by 2 matrix (a row), the result is 2 by 2. That is what we found. Dot products are "inner products," these are "outer products."

When you add the two matrices at the end of equation (3), you get the correct answer AB. In the top left corner the answer is 3 + 4 = 7. This agrees with the row-column dot product of (1, 4) with (3, 1).

Summary The usual way, rows times columns, gives four dot products (8 multiplications). The new way, columns times rows, gives two full matrices (8 multiplications). The eight multiplications, and also the four additions, are all the same. You just execute them in a different order.
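The two orders can be compared directly in a short sketch. The 2 by 2 matrices below are assumed to be the ones in the example (rows (1, 4), (1, 5) times rows (3, 2), (1, 0)); the function names are only illustrative.

```python
def rows_times_columns(A, B):
    # Usual way: entry (i, j) is a dot product (inner product).
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def outer(column, row):
    # A column times a row gives a full matrix.
    return [[c * r for r in row] for c in column]

def columns_times_rows(A, B):
    # New way: add (column k of A) times (row k of B) for every k.
    total = [[0] * len(B[0]) for _ in A]
    for k in range(len(B)):
        piece = outer([row[k] for row in A], B[k])
        total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, piece)]
    return total

A = [[1, 4], [1, 5]]
B = [[3, 2], [1, 0]]
print(outer([1, 1], [3, 2]))     # [[3, 2], [3, 2]]
print(outer([4, 5], [1, 0]))     # [[4, 0], [5, 0]]
print(rows_times_columns(A, B))  # [[7, 2], [8, 2]]
print(columns_times_rows(A, B))  # [[7, 2], [8, 2]]
```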
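Example 4 (Elimination by blocks) Suppose the first column of A contains 1, 3, 4. To change 3 and 4 to 0 and 0, multiply the pivot row by 3 and 4 and subtract. Those row operations are really multiplications by elimination matrices $E_{21}$ and $E_{31}$:

$$E_{21} = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad E_{31} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix}.$$

The "block idea" is to do both eliminations with one matrix E. That matrix clears out the whole first column of A below the pivot a = 1:

$$E = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \quad\text{multiplies}\quad \begin{bmatrix} 1 & x & x \\ 3 & x & x \\ 4 & x & x \end{bmatrix} \quad\text{to give}\quad EA = \begin{bmatrix} 1 & x & x \\ 0 & x & x \\ 0 & x & x \end{bmatrix}.$$

Block multiplication gives a formula for EA. The matrix A has four blocks a, b, c, D: the pivot, the rest of row 1, the rest of column 1, and the rest of the matrix. Watch how E multiplies A by blocks:

$$EA = \begin{bmatrix} 1 & 0 \\ -c/a & I \end{bmatrix} \begin{bmatrix} a & b \\ c & D \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & D - cb/a \end{bmatrix}. \tag{4}$$

Elimination multiplies the first row [a b] by c/a. It subtracts from c to get zeros in the first column. It subtracts from D to get D - cb/a. This is ordinary elimination, a column at a time, written in blocks.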
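The block formula can be checked on numbers. The sketch below uses an assumed 3 by 3 matrix whose first column is 1, 3, 4 (the other entries are arbitrary), builds E from the blocks a and c, and compares the lower right block of EA with D - cb/a.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2, 3],
     [3, 8, 7],
     [4, 9, 20]]  # assumed example; only the first column comes from the text

a = A[0][0]                   # the pivot
b = A[0][1:]                  # rest of row 1
c = [row[0] for row in A[1:]] # rest of column 1
D = [row[1:] for row in A[1:]]# rest of the matrix

# E has first row [1 0 0]; below it, the column -c/a sits next to the identity.
E = [[1, 0, 0]] + [
    [-Fraction(ci, a)] + [1 if k == i else 0 for k in range(len(c))]
    for i, ci in enumerate(c)
]

EA = matmul(E, A)
lower_right = [row[1:] for row in EA[1:]]
block_formula = [[D[i][j] - Fraction(c[i] * b[j], a) for j in range(len(b))]
                 for i in range(len(c))]

print(EA == [[1, 2, 3], [0, 2, -2], [0, 1, 8]])  # True
print(lower_right == block_formula)              # True
```

The product cb is a column times a row, so it is a full 2 by 2 matrix, the same shape as D.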
■ REVIEW OF THE KEY IDEAS ■

1. The (i, j) entry of AB is (row i of A) · (column j of B).

2. An m by n matrix times an n by p matrix uses mnp separate multiplications.

3. A times BC equals AB times C (surprisingly important).

4. AB is also the sum of these matrices: (column j of A) times (row j of B).

5. Block multiplication is allowed when the block shapes match correctly.

■ WORKED EXAMPLES ■
2.4 A Put yourself in the position of the author! I want to show you matrix multiplications that are special, but mostly I am stuck with small matrices. There is one terrific family of Pascal matrices, and they come in all sizes, and above all they have real meaning. I think 4 by 4 is a good size to show some of their amazing patterns.

Here is the lower triangular Pascal matrix L. Its entries come from "Pascal's triangle". I will multiply L times the ones vector, and the powers vector:

$$\begin{bmatrix} 1 & & & \\ 1 & 1 & & \\ 1 & 2 & 1 & \\ 1 & 3 & 3 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 8 \end{bmatrix} \qquad \begin{bmatrix} 1 & & & \\ 1 & 1 & & \\ 1 & 2 & 1 & \\ 1 & 3 & 3 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ x \\ x^2 \\ x^3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 + x \\ (1 + x)^2 \\ (1 + x)^3 \end{bmatrix}.$$

Each row of L leads to the next row: Add an entry to the one on its left to get the entry below. In symbols $\ell_{ij} + \ell_{i\,j-1} = \ell_{i+1\,j}$. The numbers after 1, 3, 3, 1 would be 1, 4, 6, 4, 1. Pascal lived in the 1600's, long before matrices, but his triangle fits perfectly into L.

Multiplying by ones is the same as adding up each row, to get powers of 2. In fact powers = ones when x = 1. By writing out the last rows of L times powers, you see the entries of L as the "binomial coefficients" that are so essential to gamblers:

$$1 + 2x + 1x^2 = (1 + x)^2 \qquad 1 + 3x + 3x^2 + 1x^3 = (1 + x)^3$$

The number "3" counts the ways to get Heads once and Tails twice in three coin flips: HTT and THT and TTH. The other "3" counts the ways to get Heads twice: HHT and HTH and THH. Those are examples of "i choose j" = the number of ways to get j heads in i coin flips. That number is exactly $\ell_{ij}$, if we start counting rows and columns of L at i = 0 and j = 0 (and remember 0! = 1):

$$\ell_{ij} = \binom{i}{j} = i \text{ choose } j = \frac{i!}{j!\,(i - j)!} \qquad \binom{4}{2} = \frac{4!}{2!\,2!} = 6$$

There are six ways to choose two aces out of four aces. We will see Pascal's triangle and these matrices again. Here are the questions I want to ask now:
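1. What is $H = L^2$? This is the "hypercube matrix".

2. Multiply H times ones and powers.

3. The last row of H is 8, 12, 6, 1. A cube has 8 corners, 12 edges, 6 faces, 1 box. What would the next row of H tell about a hypercube in 4D?

Solution Multiply L times L to get the hypercube matrix $H = L^2$:

$$\begin{bmatrix} 1 & & & \\ 1 & 1 & & \\ 1 & 2 & 1 & \\ 1 & 3 & 3 & 1 \end{bmatrix} \begin{bmatrix} 1 & & & \\ 1 & 1 & & \\ 1 & 2 & 1 & \\ 1 & 3 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 1 & & & \\ 2 & 1 & & \\ 4 & 4 & 1 & \\ 8 & 12 & 6 & 1 \end{bmatrix} = H.$$

Now multiply H times the vectors of ones and powers:

$$\begin{bmatrix} 1 & & & \\ 2 & 1 & & \\ 4 & 4 & 1 & \\ 8 & 12 & 6 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 9 \\ 27 \end{bmatrix} \qquad \begin{bmatrix} 1 & & & \\ 2 & 1 & & \\ 4 & 4 & 1 & \\ 8 & 12 & 6 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ x \\ x^2 \\ x^3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 + x \\ (2 + x)^2 \\ (2 + x)^3 \end{bmatrix}.$$

If x = 1 we get the powers of 3. If x = 0 we get powers of 2 (where do 1, 2, 4, 8 appear in H?). Where L changed x to 1 + x, applying L again changes 1 + x to 2 + x.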
How do the rows of H count corners and edges and faces of a cube? A square in 2D has 4 corners, 4 edges, 1 face. Add one dimension at a time:

Connect two squares to get a 3D cube. Connect two cubes to get a 4D hypercube.

The cube has 8 corners and 12 edges: 4 edges in each square and 4 between the squares. The cube has 6 faces: 1 in each square and 4 faces between the squares. This row 8, 12, 6, 1 of H will lead to the next row (one more dimension) by $2h_{ij} + h_{i\,j-1} = h_{i+1\,j}$.

Can you see this in four dimensions? The hypercube has 16 corners, no problem. It has 12 edges from one cube, 12 from the other cube, 8 that connect corners between those cubes: total 2 × 12 + 8 = 32 edges. It has 6 faces from each separate cube and 12 more from connecting pairs of edges: total 2 × 6 + 12 = 24 faces. It has one box from each cube and 6 more from connecting pairs of faces: total 2 × 1 + 6 = 8 boxes. And sure enough, the next row of H is 16, 32, 24, 8, 1.
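These patterns are easy to confirm by machine. The sketch below builds Pascal matrices of any size from binomial coefficients using `math.comb` (Python 3.8 or later), squares them, and reads off the rows. The helper names `pascal` and `matmul` are invented for this illustration.

```python
from math import comb

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def pascal(n):
    # Lower triangular Pascal matrix: entry (i, j) is "i choose j",
    # counting rows and columns from 0. comb(i, j) is 0 when j > i.
    return [[comb(i, j) for j in range(n)] for i in range(n)]

L = pascal(4)
H = matmul(L, L)
ones = [[1], [1], [1], [1]]

print(L[3])             # [1, 3, 3, 1]
print(H)                # [[1, 0, 0, 0], [2, 1, 0, 0], [4, 4, 1, 0], [8, 12, 6, 1]]
print(matmul(L, ones))  # [[1], [2], [4], [8]]
print(matmul(H, ones))  # [[1], [3], [9], [27]]

# One size larger: the new last row counts the pieces of a 4D hypercube.
L5 = pascal(5)
print(matmul(L5, L5)[4])  # [16, 32, 24, 8, 1]
```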
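2.4 B For these matrices, when does AB = BA? When does BC = CB? When does A times BC equal AB times C? Give the conditions on their entries p, q, r, z:

$$A = \begin{bmatrix} p & 0 \\ q & r \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \qquad C = \begin{bmatrix} 0 & z \\ 0 & 0 \end{bmatrix}$$

If p, q, r, 1, z are 4 by 4 blocks instead of numbers, do the answers change?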
Solution First of all, A times BC always equals AB times C. We don't need parentheses in A(BC) = (AB)C = ABC. But we do need to keep the matrices in this order A, B, C. Compare AB with BA:
$$AB = \begin{bmatrix} p & p \\ q & q+r \end{bmatrix} \qquad BA = \begin{bmatrix} p+q & r \\ q & r \end{bmatrix}.$$
We only have AB = BA if q = 0 and p = r. Now compare BC with CB:
$$BC = \begin{bmatrix} 0 & z \\ 0 & 0 \end{bmatrix} = CB.$$
B and C happen to commute. One explanation is that the diagonal part of B is I, which commutes with all 2 by 2 matrices. The off-diagonal part of B looks exactly like C (except for a scalar factor z) and every matrix commutes with itself.
When p, q, r, z are 4 by 4 blocks and 1 changes to the 4 by 4 identity matrix, all these products remain correct. So the answers are the same. (If the I's in B were changed to blocks t, t, t, then BC would have the block tz and CB would have the block zt. Those would normally be different: the order is important in block multiplication.)
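A quick numerical check of these conclusions, as a sketch only. It assumes the three matrices have the form implied by the products in the solution (A lower triangular with entries p, q, r; B with 1's on and above the diagonal; C with the single entry z); `matmul` and `make` are hypothetical helper names.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def make(p, q, r, z):
    """Build A, B, C in the form assumed for Worked Example 2.4 B."""
    A = [[p, 0], [q, r]]
    B = [[1, 1], [0, 1]]
    C = [[0, z], [0, 0]]
    return A, B, C

# q = 0 and p = r: AB should equal BA.
A, B, C = make(2, 0, 2, 7)
commute_AB = matmul(A, B) == matmul(B, A)

# q != 0: AB and BA should differ.
A2, _, _ = make(2, 1, 3, 7)
differ_AB = matmul(A2, B) != matmul(B, A2)

# B and C commute for any z, and the associative law holds regardless.
commute_BC = matmul(B, C) == matmul(C, B)
associative = matmul(A2, matmul(B, C)) == matmul(matmul(A2, B), C)

print(commute_AB, differ_AB, commute_BC, associative)
```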
2.4 C A directed graph starts with n nodes. There are $n^2$ possible edges: each edge leaves one of the n nodes and enters one of the n nodes (possibly itself). The n by n adjacency matrix has $a_{ij} = 1$ when an edge leaves node i and enters node j; if no edge then $a_{ij} = 0$. Here are two directed graphs and their adjacency matrices:
[Figure: two directed graphs. The first has edges node 1 to node 1, node 1 to node 2, and node 2 to node 1, with adjacency matrix $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$.]
The i, j entry of $A^2$ is $a_{i1}a_{1j} + \cdots + a_{in}a_{nj}$. Why does that sum count the two-step paths from i to any node to j? The i, j entry of $A^k$ counts k-step paths:
$$A^2 = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \text{ counts the paths with two edges } \begin{bmatrix} 1 \text{ to } 2 \text{ to } 1, \; 1 \text{ to } 1 \text{ to } 1 & 1 \text{ to } 1 \text{ to } 2 \\ 2 \text{ to } 1 \text{ to } 1 & 2 \text{ to } 1 \text{ to } 2 \end{bmatrix}$$
List all of the 3-step paths between each pair of nodes and compare with $A^3$. When $A^k$ has no zeros, that number k is the diameter of the graph: the number of edges needed to connect the most distant pair of nodes. What is the diameter of the second graph?
Solution The number $a_{ik}a_{kj}$ will be "1" if there is an edge from node i to k and an edge from k to j. This is a 2-step path. The number $a_{ik}a_{kj}$ will be "0" if either of those edges (i to k, k to j) is missing. So the sum of $a_{ik}a_{kj}$ is the number of 2-step paths leaving i and entering j. Matrix multiplication is just right for this count.
The 3-step paths are counted by $A^3$; we look at paths to node 2:
$$A^3 = \begin{bmatrix} 3 & 2 \\ 2 & 1 \end{bmatrix} \text{ counts the paths with three steps } \begin{bmatrix} \cdots & 1 \text{ to } 1 \text{ to } 1 \text{ to } 2, \; 1 \text{ to } 2 \text{ to } 1 \text{ to } 2 \\ \cdots & 2 \text{ to } 1 \text{ to } 1 \text{ to } 2 \end{bmatrix}$$
These $A^k$ contain the Fibonacci numbers 0, 1, 1, 2, 3, 5, 8, 13, ... coming in Section 6.2. Fibonacci's rule $F_{k+2} = F_{k+1} + F_k$ (as in 13 = 8 + 5) shows up in $(A)(A^k) = A^{k+1}$:
$$\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} F_{k+1} & F_k \\ F_k & F_{k-1} \end{bmatrix} = \begin{bmatrix} F_{k+2} & F_{k+1} \\ F_{k+1} & F_k \end{bmatrix}.$$
There are 13 six-step paths from node 1 to node 1, but I can't find them all.
$A^k$ also counts words. A path like 1 to 1 to 2 to 1 corresponds to the number 1121 or the word aaba. The number 2 (the letter b) is not allowed to repeat because the graph has no edge from node 2 to node 2. The i, j entry of $A^k$ counts the allowed numbers (or words) of length k + 1 that start with the ith letter and end with the jth.
The second graph also has diameter 2; $A^2$ has no zeros.
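The path counts can be confirmed by brute force. This sketch assumes the first graph's adjacency matrix is [[1, 1], [1, 0]] (edges 1 to 1, 1 to 2, 2 to 1); `matpow` and `list_paths` are illustrative helper names, not anything from the book. It compares entries of the matrix powers with an explicit enumeration of paths.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(A, k):
    """Compute A^k by repeated multiplication, starting from the identity."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

def list_paths(A, k, start, end):
    """List every k-step path from start to end (0-based), as 1-based node lists."""
    if k == 0:
        return [[start + 1]] if start == end else []
    return [[start + 1] + rest
            for mid in range(len(A)) if A[start][mid]
            for rest in list_paths(A, k - 1, mid, end)]

A = [[1, 1],
     [1, 0]]

A2, A3, A6 = matpow(A, 2), matpow(A, 3), matpow(A, 6)
six_step = list_paths(A, 6, 0, 0)   # six-step paths from node 1 back to node 1

print(A2, A3, A6)
for path in six_step:
    print(" to ".join(str(node) for node in path))
```

Each printed path has seven nodes (six edges), and none contains two consecutive 2's, which matches the "words" description in the text.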
Problem Set 2.4
Problems 1-17 are about the laws of matrix multiplication.
1 A is 3 by 5, B is 5 by 3, C is 5 by 1, and D is 3 by 1. All entries are 1. Which of these matrix operations are allowed, and what are the results?
BA
AB
ABD
DBA
A(B + C ).
2 What rows or columns or matrices do you multiply to find
(a) the third column of AB?
(b) the first row of AB?
(c) the entry in row 3, column 4 of AB?
(d) the entry in row 1, column 1 of CDE?
3 Add AB to AC and compare with A(B + C):
l A = [~ ~] and B = [~ ~] and C = [~ ~
4 In Problem 3, multiply A times BC. Then multiply AB times C.
5 Compute A2 and A3. Make a prediction for A5 and An:
t] A = [~
and A = [~ ~].
6 Show that $(A + B)^2$ is different from $A^2 + 2AB + B^2$, when
J. ~ ~ ~ A = [ ~] and B = [
Write down the correct rule for $(A + B)(A + B) = A^2 + \_\_\_\_ + B^2$.
7 True or false. Give a specific example when false:
(a) If columns 1 and 3 of B are the same, so are columns 1 and 3 of AB.
(b) If rows 1 and 3 of B are the same, so are rows 1 and 3 of AB.
(c) If rows 1 and 3 of A are the same, so are rows 1 and 3 of ABC.
(d) $(AB)^2 = A^2 B^2$.
8 How is each row of DA and EA related to the rows of A, when
How is each column of AD and AE related to the columns of A?
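9 Row 1 of A is added to row 2. This gives EA below. Then column 1 of EA is added to column 2 to produce (EA)F:
$$EA = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ a+c & b+d \end{bmatrix} \quad \text{and} \quad (EA)F = (EA) \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & a+b \\ a+c & a+c+b+d \end{bmatrix}.$$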
(a) Do those steps in the opposite order. First add column I of A to column 2 by AF, then add row I of AF to row 2 by E(AF).
(b) Compare with (EA)F. What law is obeyed by matrix multiplication?
10 Row 1 of A is again added to row 2 to produce EA. Then F adds row 2 of EA to row 1. The result is F(EA):
$$F(EA) = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ a+c & b+d \end{bmatrix} = \begin{bmatrix} 2a+c & 2b+d \\ a+c & b+d \end{bmatrix}.$$
(a) Do those steps in the opposite order: first add row 2 to row 1 by FA, then add row 1 of FA to row 2.
(b) What law is or is not obeyed by matrix multiplication?
11 (3 by 3 matrices) Choose the only B so that for every matrix A
(a) BA = 4A
(b) BA = 4B
(c) BA has rows 1 and 3 of A reversed and row 2 unchanged
(d) All rows of BA are the same as row 1 of A.
12 Suppose AB = BA and AC= CA for these two particular matrices B and C:
= A _ [ac db] commutes with B [ 0I O0 J
Prove that a = d and b = c = 0. Then A is a multiple of I. The only matrices that commute with B and C and all other 2 by 2 matrices are A = multiple of I.
13 Which of the following matrices are guaranteed to equal $(A - B)^2$: $A^2 - B^2$, $(B - A)^2$, $A^2 - 2AB + B^2$, $A(A - B) - B(A - B)$, $A^2 - AB - BA + B^2$?
14 True or false:
(a) If A2 is defined then A is necessarily square.
(b) If AB and BA are defined then A and B are square.
(c) If AB and BA are defined then AB and BA are square.
(d) If AB = B then A = I.
15 If A is m by n, how many separate multiplications are involved when
(a) A multiplies a vector x with n components?
(b) A multiplies an n by p matrix B?
(c) A multiplies itself to produce $A^2$? Here m = n.
16 To prove that (AB)C = A(BC), use the column vectors $b_1, \ldots, b_n$ of B. First suppose that C has only one column c with entries $c_1, \ldots, c_n$:
AB has columns $Ab_1, \ldots, Ab_n$ and Bc has one column $c_1 b_1 + \cdots + c_n b_n$.
Then $(AB)c = c_1 Ab_1 + \cdots + c_n Ab_n$ equals $A(c_1 b_1 + \cdots + c_n b_n) = A(Bc)$. Linearity gives equality of those two sums, and (AB)c = A(Bc). The same is true for all other ____ of C. Therefore (AB)C = A(BC).
17 For A = [~ :}] and B = [}g!], compute these answers and nothing more:
(a) column 2 of AB
(b) row 2 of AB
(c) row 2 of $AA = A^2$
(d) row 2 of $AAA = A^3$.
Problems 18-20 use aij for the entry in row i, column j of A.
18 Write down the 3 by 3 matrix A whose entries are
(a) $a_{ij}$ = minimum of i and j
(b) $a_{ij} = (-1)^{i+j}$
(c) $a_{ij} = i/j$.
19 What words would you use to describe each of these classes of matrices? Give a 3 by 3 example in each class. Which matrix belongs to all four classes?
(a) $a_{ij} = 0$ if $i \neq j$
(b) $a_{ij} = 0$ if $i < j$
(c) $a_{ij} = a_{ji}$
(d) $a_{ij} = a_{1j}$.
20 The entries of A are $a_{ij}$. Assuming that zeros don't appear, what is
(a) the first pivot?
(b) the multiplier $\ell_{31}$ of row 1 to be subtracted from row 3?
(c) the new entry that replaces $a_{32}$ after that subtraction?
(d) the second pivot?
Problems 21-25 involve powers of A.
[i ~] A = ~ ~
and V = [ ~] •
000 0
I
22 Find all the powers A2, A3, . . . and AB, (AB)2, . . . for
23 By trial and error find real nonzero 2 by 2 matrices such that
$A^2 = -I$
BC = 0
DE = -ED (not allowing DE = 0).
24 (a) Find a nonzero matrix A for which $A^2 = 0$.
(b) Find a matrix that has $A^2 \neq 0$ but $A^3 = 0$.
25 By experiment with n = 2 and n = 3 predict $A^n$ for
Problems 26-34 use column-row multiplication and block multiplication.
26 Multiply AB using columns times rows:
27 The product of upper triangular matrices is always upper triangular:
:J [~ :J ~ AB= [~
~ = [o ].
OOx OOx
00
Row times column is dot product: (Row 2 of A) · (column 1 of B) = 0. Which other dot products give zeros?
Column times row is full matrix: Draw x's and 0's in (column 2 of A) times (row 2 of B) and in (column 3 of A) times (row 3 of B).
28 Draw the cuts in A (2 by 3) and B (3 by 4) and AB to show how each of the four multiplication rules is really a block multiplication:
(1) Matrix A times columns of B.
(2) Rows of A times matrix B.
(3) Rows of A times columns of B.
(4) Columns of A times rows of B.
29 Draw cuts in A and x to multiply Ax a column at a time: $x_1$ (column 1) + ···.
30 Which matrices $E_{21}$ and $E_{31}$ produce zeros in the (2, 1) and (3, 1) positions of $E_{21}A$ and $E_{31}A$?
I
0 5
Find the single matrix $E = E_{31}E_{21}$ that produces both zeros at once. Multiply EA.
31 Block multiplication says in the text that column 1 is eliminated by
$$EA = \begin{bmatrix} 1 & 0 \\ -c/a & I \end{bmatrix} \begin{bmatrix} a & b \\ c & D \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & D - cb/a \end{bmatrix}.$$
In Problem 30, what are c and D and what is D - cb/a?
32 With $i^2 = -1$, the product of (A + iB) and (x + iy) is Ax + iBx + iAy - By. Use blocks to separate the real part without i from the imaginary part that multiplies i:
$$\begin{bmatrix} A & -B \\ ? & ? \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} Ax - By \\ ? \end{bmatrix} \begin{matrix} \text{real part} \\ \text{imaginary part} \end{matrix}$$
33 Suppose you solve Ax = b for three special right sides b:
If the three solutions $x_1, x_2, x_3$ are the columns of a matrix X, what is A times X?
34 If the three solutions in Question 33 are $x_1 = (1, 1, 1)$ and $x_2 = (0, 1, 1)$ and $x_3 = (0, 0, 1)$, solve Ax = b when b = (3, 5, 8). Challenge problem: What is A?
35 Elimination for a 2 by 2 block matrix: When you multiply the first block row by $CA^{-1}$ and subtract from the second row, what is the "Schur complement" S that appears?
36 Find all matrices $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ that satisfy $A \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} A$.
37 Suppose a "circle graph" has 5 nodes connected (in both directions) by edges around a circle. What is its adjacency matrix from Worked Example 2.4 C? What are $A^2$ and $A^3$ and the diameter of this graph?
38 If 5 edges in Question 37 go in one direction only, from nodes 1, 2, 3, 4, 5 to 2, 3, 4, 5, 1, what are A and $A^2$ and the diameter of this one-way circle?
39 If you multiply a northwest matrix A and a southeast matrix B, what type of matrices are AB and BA? "Northwest" and "southeast" mean zeros below and above the antidiagonal going from (1, n) to (n, 1).
2.5 Inverse Matrices 71
INVERSE MATRICES ■ 2.5
Suppose A is a square matrix. We look for an "inverse matrix" $A^{-1}$ of the same size, such that $A^{-1}$ times A equals I. Whatever A does, $A^{-1}$ undoes. Their product is the identity matrix, which does nothing. But $A^{-1}$ might not exist.
What a matrix mostly does is to multiply a vector x. Multiplying Ax = b by $A^{-1}$ gives $A^{-1}Ax = A^{-1}b$. The left side is just x! The product $A^{-1}A$ is like multiplying by a number and then dividing by that number. An ordinary number has an inverse if it is not zero. Matrices are more complicated and more interesting. The matrix $A^{-1}$ is called "A inverse."
DEFINITION The matrix A is invertible if there exists a matrix $A^{-1}$ such that
$$A^{-1}A = I \quad \text{and} \quad AA^{-1} = I. \qquad (1)$$
Not all matrices have inverses. This is the first question we ask about a square matrix: Is A invertible? We don't mean that we immediately calculate $A^{-1}$. In most problems we never compute it! Here are six "notes" about $A^{-1}$.
Note 1 The inverse exists if and only if elimination produces n pivots (row exchanges allowed). Elimination solves Ax = b without explicitly using $A^{-1}$.
Note 2 The matrix A cannot have two different inverses. Suppose BA = I and also AC = I. Then B = C, according to this "proof by parentheses":
$$B(AC) = (BA)C \quad \text{gives} \quad BI = IC \quad \text{or} \quad B = C. \qquad (2)$$
This shows that a left-inverse B (multiplying from the left) and a right-inverse C (multiplying A from the right to give AC = I) must be the same matrix.
Note 3 If A is invertible, the one and only solution to Ax = b is $x = A^{-1}b$:
$$\text{Multiply } Ax = b \text{ by } A^{-1}. \text{ Then } x = A^{-1}Ax = A^{-1}b.$$
Note 4 (Important) Suppose there is a nonzero vector x such that Ax = 0. Then A cannot have an inverse. No matrix can bring 0 back to x.
If A is invertible, then Ax = 0 can only have the zero solution x = 0.
Note 5 A 2 by 2 matrix is invertible if and only if ad - bc is not zero:
$$\text{2 by 2 Inverse:} \quad \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \qquad (3)$$
This number ad - bc is the determinant of A. A matrix is invertible if its determinant is not zero (Chapter 5). The test for n pivots is usually decided before the determinant appears.
Note 6 A diagonal matrix has an inverse provided no diagonal entries are zero:
$$\text{If } A = \begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{bmatrix} \text{ then } A^{-1} = \begin{bmatrix} 1/d_1 & & \\ & \ddots & \\ & & 1/d_n \end{bmatrix}.$$
Example 1 The 2 by 2 matrix $A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}$ is not invertible. It fails the test in Note 5, because ad - bc equals 2 - 2 = 0. It fails the test in Note 3, because Ax = 0 when x = (2, -1). It fails to have two pivots as required by Note 1. Elimination turns the second row of A into a zero row.
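The 2 by 2 test in Note 5 is easy to try out. The following is a minimal sketch using exact fractions; the function name `inverse_2x2` is made up for this illustration, and the singular matrix used is assumed to be the one in Example 1 (rows 1, 2 and 1, 2).

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Return the inverse of [[a, b], [c, d]] from formula (3), or None if ad - bc = 0."""
    det = a * d - b * c
    if det == 0:
        return None  # singular: no inverse exists
    f = Fraction(1) / det
    return [[d * f, -b * f],
            [-c * f, a * f]]

singular = inverse_2x2(1, 2, 1, 2)   # ad - bc = 2 - 2 = 0
regular = inverse_2x2(2, 3, 4, 7)    # ad - bc = 14 - 12 = 2

print(singular)
print(regular)
```

The second call divides by the determinant 2, so its entries come out as halves and whole numbers.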
The Inverse of a Product AB
For two nonzero numbers a and b, the sum a + b might or might not be invertible. The numbers a = 3 and b = -3 have inverses $\frac{1}{3}$ and $-\frac{1}{3}$. Their sum a + b = 0 has no inverse. But the product ab = -9 does have an inverse, which is $\frac{1}{3}$ times $-\frac{1}{3}$.
For two matrices A and B, the situation is similar. It is hard to say much about the invertibility of A + B. But the product AB has an inverse, whenever the factors A and B are separately invertible (and the same size). The important point is that $A^{-1}$ and $B^{-1}$ come in reverse order:
2H If A and B are invertible then so is AB. The inverse of a product AB is
$$(AB)^{-1} = B^{-1}A^{-1}. \qquad (4)$$
To see why the order is reversed, multiply AB times $B^{-1}A^{-1}$. The inside step is $BB^{-1} = I$:
$$(AB)(B^{-1}A^{-1}) = AIA^{-1} = AA^{-1} = I.$$
We moved parentheses to multiply $BB^{-1}$ first. Similarly $B^{-1}A^{-1}$ times AB equals I. This illustrates a basic rule of mathematics: Inverses come in reverse order. It is also common sense: If you put on socks and then shoes, the first to be taken off are the ____. The same idea applies to three or more matrices:
$$\text{Reverse order} \qquad (ABC)^{-1} = C^{-1}B^{-1}A^{-1}. \qquad (5)$$
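The reverse-order rule can be checked on a small case. In this sketch the matrices A and B are arbitrary invertible 2 by 2 examples chosen for illustration, and `matmul` and `inv2` are hypothetical helpers (the latter uses the 2 by 2 formula from Note 5).

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    """Invert a 2 by 2 matrix with the ad - bc formula (assumes it is invertible)."""
    (a, b), (c, d) = M
    f = Fraction(1) / (a * d - b * c)
    return [[d * f, -b * f], [-c * f, a * f]]

A = [[2, 1], [1, 1]]
B = [[1, 2], [0, 1]]

inverse_of_product = inv2(matmul(A, B))
reverse_order = matmul(inv2(B), inv2(A))   # B^{-1} A^{-1}
same_order = matmul(inv2(A), inv2(B))      # A^{-1} B^{-1}, the wrong order

print(inverse_of_product, reverse_order, same_order)
```

For these two matrices the same-order product differs from the true inverse, which is the point of the socks-and-shoes remark.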
Example 2 Inverse of an Elimination Matrix. If E subtracts 5 times row 1 from row 2, then $E^{-1}$ adds 5 times row 1 to row 2:
$$E = \begin{bmatrix} 1 & 0 & 0 \\ -5 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad E^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 5 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
Multiply $EE^{-1}$ to get the identity matrix I. Also multiply $E^{-1}E$ to get I. We are adding and subtracting the same 5 times row 1. Whether we add and then subtract (this is $EE^{-1}$) or subtract and then add (this is $E^{-1}E$), we are back at the start.
For square matrices, an inverse on one side is automatically an inverse on the other side. If AB = I then automatically BA = I. In that case B is $A^{-1}$. This is very useful to know but we are not ready to prove it.
Example 3 Suppose F subtracts 4 times row 2 from row 3, and $F^{-1}$ adds it back:
$$F = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -4 & 1 \end{bmatrix} \quad \text{and} \quad F^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 4 & 1 \end{bmatrix}.$$
Now multiply F by the matrix E in Example 2 to find FE. Also multiply $E^{-1}$ times $F^{-1}$ to find $(FE)^{-1}$. Notice the orders FE and $E^{-1}F^{-1}$!
$$FE = \begin{bmatrix} 1 & 0 & 0 \\ -5 & 1 & 0 \\ 20 & -4 & 1 \end{bmatrix} \quad \text{is inverted by} \quad E^{-1}F^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 5 & 1 & 0 \\ 0 & 4 & 1 \end{bmatrix}. \qquad (6)$$
The result is strange but correct. The product FE contains "20" but its inverse doesn't. E subtracts 5 times row 1 from row 2. Then F subtracts 4 times the new row 2 (changed by row 1) from row 3. In this order FE, row 3 feels an effect from row 1.
In the order $E^{-1}F^{-1}$, that effect does not happen. First $F^{-1}$ adds 4 times row 2 to row 3. After that, $E^{-1}$ adds 5 times row 1 to row 2. There is no 20, because row 3 doesn't change again. In this order, row 3 feels no effect from row 1.
For elimination with normal order FE, the product of inverses $E^{-1}F^{-1}$ is quick. The multipliers fall into place below the diagonal of 1's.
This special property of $E^{-1}F^{-1}$ and $E^{-1}F^{-1}G^{-1}$ will be useful in the next section. We will explain it again, more completely. In this section our job is $A^{-1}$, and we expect some serious work to compute it. Here is a way to organize that computation.
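The appearance and disappearance of the entry 20 can be seen by multiplying out the four matrices. This is a small sketch in plain Python; the elimination matrices are written out from the descriptions in Examples 2 and 3 (E subtracts 5 times row 1 from row 2, F subtracts 4 times row 2 from row 3), and `matmul` is a helper defined only for this illustration.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

E     = [[1, 0, 0], [-5, 1, 0], [0, 0, 1]]   # subtract 5 * row 1 from row 2
F     = [[1, 0, 0], [0, 1, 0], [0, -4, 1]]   # subtract 4 * row 2 from row 3
E_inv = [[1, 0, 0], [5, 1, 0], [0, 0, 1]]    # add 5 * row 1 to row 2
F_inv = [[1, 0, 0], [0, 1, 0], [0, 4, 1]]    # add 4 * row 2 to row 3

FE = matmul(F, E)              # elimination order: E first, then F
FE_inv = matmul(E_inv, F_inv)  # inverses in reverse order
check = matmul(FE, FE_inv)     # should be the identity

print(FE)
print(FE_inv)
print(check)
```

The multipliers 5 and 4 sit unchanged below the diagonal of the inverse product, while FE picks up the extra entry 20.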
Calculating $A^{-1}$ by Gauss-Jordan Elimination
I hinted that $A^{-1}$ might not be explicitly needed. The equation Ax = b is solved by $x = A^{-1}b$. But it is not necessary or efficient to compute $A^{-1}$ and multiply it times b. Elimination goes directly to x. Elimination is also the way to calculate $A^{-1}$, as we now show. The Gauss-Jordan idea is to solve $AA^{-1} = I$, finding each column of $A^{-1}$.
A multiplies the first column of $A^{-1}$ (call that $x_1$) to give the first column of I (call that $e_1$). This is our equation $Ax_1 = e_1 = (1, 0, 0)$. Each of the columns $x_1, x_2, x_3$ of $A^{-1}$ is multiplied by A to produce a column of I:
$$AA^{-1} = A \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} = \begin{bmatrix} e_1 & e_2 & e_3 \end{bmatrix} = I. \qquad (7)$$
To invert a 3 by 3 matrix A, we have to solve three systems of equations: $Ax_1 = e_1$ and $Ax_2 = e_2 = (0, 1, 0)$ and $Ax_3 = e_3 = (0, 0, 1)$. This already shows why computing $A^{-1}$ is expensive. We must solve n equations for its n columns. To solve Ax = b without $A^{-1}$, we deal only with one column.
In defense of $A^{-1}$, we want to say that its cost is not n times the cost of one system Ax = b. Surprisingly, the cost for n columns is only multiplied by 3. This saving is because the n equations $Ax_i = e_i$ all involve the same matrix A. Working with the right sides is relatively cheap, because elimination only has to be done once on A. The complete $A^{-1}$ needs $n^3$ elimination steps, where a single x needs $n^3/3$. The next section calculates these costs.
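The Gauss-Jordan method computes $A^{-1}$ by solving all n equations together. Usually the "augmented matrix" has one extra column b, from the right side of the equations. Now we have three right sides $e_1, e_2, e_3$ (when A is 3 by 3). They are the columns of I, so the augmented matrix is really the block matrix [A I]. Here is a worked-out example when A has 2's on the main diagonal and -1's next to the 2's:
$$\begin{bmatrix} A & e_1 & e_2 & e_3 \end{bmatrix} = \left[\begin{array}{rrr|rrr} 2 & -1 & 0 & 1 & 0 & 0 \\ -1 & 2 & -1 & 0 & 1 & 0 \\ 0 & -1 & 2 & 0 & 0 & 1 \end{array}\right] \qquad \text{Start Gauss-Jordan}$$
$$\rightarrow \left[\begin{array}{rrr|rrr} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & \frac{3}{2} & -1 & \frac{1}{2} & 1 & 0 \\ 0 & -1 & 2 & 0 & 0 & 1 \end{array}\right] \qquad (\tfrac{1}{2} \text{ row 1} + \text{row 2})$$
$$\rightarrow \left[\begin{array}{rrr|rrr} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & \frac{3}{2} & -1 & \frac{1}{2} & 1 & 0 \\ 0 & 0 & \frac{4}{3} & \frac{1}{3} & \frac{2}{3} & 1 \end{array}\right] \qquad (\tfrac{2}{3} \text{ row 2} + \text{row 3})$$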
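We are now halfway. The matrix in the first three columns is U (upper triangular). The pivots 2, $\frac{3}{2}$, $\frac{4}{3}$ are on its diagonal. Gauss would finish by back substitution. The contribution of Jordan is to continue with elimination! He goes all the way to the "reduced echelon form". Rows are added to rows above them, to produce zeros above the pivots:
$$\rightarrow \left[\begin{array}{rrr|rrr} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & \frac{3}{2} & 0 & \frac{3}{4} & \frac{3}{2} & \frac{3}{4} \\ 0 & 0 & \frac{4}{3} & \frac{1}{3} & \frac{2}{3} & 1 \end{array}\right] \qquad (\tfrac{3}{4} \text{ row 3} + \text{row 2})$$
$$\rightarrow \left[\begin{array}{rrr|rrr} 2 & 0 & 0 & \frac{3}{2} & 1 & \frac{1}{2} \\ 0 & \frac{3}{2} & 0 & \frac{3}{4} & \frac{3}{2} & \frac{3}{4} \\ 0 & 0 & \frac{4}{3} & \frac{1}{3} & \frac{2}{3} & 1 \end{array}\right] \qquad (\tfrac{2}{3} \text{ row 2} + \text{row 1})$$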
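The last Gauss-Jordan step is to divide each row by its pivot. The new pivots are 1. We have reached I in the first half of the matrix, because A is invertible. The three columns of $A^{-1}$ are in the second half of $[I \;\; A^{-1}]$:
$$\begin{matrix} (\text{divide by } 2) \\ (\text{divide by } \frac{3}{2}) \\ (\text{divide by } \frac{4}{3}) \end{matrix} \qquad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & \frac{3}{4} & \frac{1}{2} & \frac{1}{4} \\ 0 & 1 & 0 & \frac{1}{2} & 1 & \frac{1}{2} \\ 0 & 0 & 1 & \frac{1}{4} & \frac{1}{2} & \frac{3}{4} \end{array}\right] = \begin{bmatrix} I & x_1 & x_2 & x_3 \end{bmatrix}.$$
Starting from the 3 by 6 matrix [A I], we ended with $[I \;\; A^{-1}]$. Here is the whole Gauss-Jordan process on one line:
Multiply [A I] by $A^{-1}$ to get $[I \;\; A^{-1}]$.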
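The whole process can be sketched in a few lines of Python with exact fractions. This is an illustration under stated assumptions, not the book's Teaching Code: the function name `gauss_jordan_inverse` is invented here, the example matrix is assumed to be the one with 2's on the diagonal and -1's beside them, and the sketch clears entries above and below each pivot in a single pass instead of going down first and then up. The final matrix is the same either way.

```python
from fractions import Fraction

def gauss_jordan_inverse(A):
    """Reduce [A I] to [I A^{-1}] with exact arithmetic; return None if A is singular."""
    n = len(A)
    # Augmented matrix [A I].
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a nonzero pivot in this column, at or below the diagonal.
        pivot_row = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot_row is None:
            return None  # fewer than n pivots: no inverse
        M[col], M[pivot_row] = M[pivot_row], M[col]  # row exchange if needed
        # Produce zeros below and above the pivot.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # Divide each row by its pivot so the left half becomes I.
    for i in range(n):
        pivot = M[i][i]
        M[i] = [x / pivot for x in M[i]]
    # The right half now holds the inverse.
    return [row[n:] for row in M]

K = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]

K_inv = gauss_jordan_inverse(K)
singular = gauss_jordan_inverse([[1, 2], [1, 2]])

for row in K_inv:
    print([str(x) for x in row])
print(singular)
```

Using `Fraction` keeps the pivots 2, 3/2, 4/3 exact, so the result can be compared entry by entry with the hand computation.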
The elimination steps gradually create the inverse matrix. For large matrices, we probably don't want $A^{-1}$ at all. But for small matrices, it can be very worthwhile to know the inverse. We add three observations about this particular $A^{-1}$ because it is an important example. We introduce the words symmetric, tridiagonal, and determinant (Chapter 5):
1. A is symmetric across its main diagonal. So is $A^{-1}$.
2. A is tridiagonal (only three nonzero diagonals). But $A^{-1}$ is a full matrix with no zeros. That is another reason we don't often compute $A^{-1}$.
3. The product of pivots is $2(\frac{3}{2})(\frac{4}{3}) = 4$. This number 4 is the determinant of A.
$$A^{-1} \text{ involves division by the determinant} \qquad A^{-1} = \frac{1}{4} \begin{bmatrix} 3 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 3 \end{bmatrix}. \qquad (8)$$
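Example 4 Find $A^{-1}$ by Gauss-Jordan elimination starting from $A = \begin{bmatrix} 2 & 3 \\ 4 & 7 \end{bmatrix}$. There are two row operations and then a division to put 1's in the pivots:
$$\begin{bmatrix} A & I \end{bmatrix} = \left[\begin{array}{rr|rr} 2 & 3 & 1 & 0 \\ 4 & 7 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 2 & 3 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 2 & 0 & 7 & -3 \\ 0 & 1 & -2 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 1 & 0 & \frac{7}{2} & -\frac{3}{2} \\ 0 & 1 & -2 & 1 \end{array}\right].$$
The reduced echelon form of [A I] is $[I \;\; A^{-1}]$. This $A^{-1}$ involves division by the determinant 2 · 7 - 3 · 4 = 2. The code for X = inverse(A) has three important lines!
    I = eye(n, n);            % Define the identity matrix
    R = rref([A I]);          % Eliminate on the augmented matrix
    X = R(:, n + 1 : n + n)   % Pick A^{-1} from the last n columns of R
A must be invertible, or elimination will not reduce it (in the left half of R) to I.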
Singular versus Invertible
We come back to the central question. Which matrices have inverses? The start of this section proposed the pivot test: $A^{-1}$ exists exactly when A has a full set of n pivots. (Row exchanges allowed.) Now we can prove that by Gauss-Jordan elimination:
1. With n pivots, elimination solves all the equations $Ax_i = e_i$. The columns $x_i$ go into $A^{-1}$. Then $AA^{-1} = I$ and $A^{-1}$ is at least a right-inverse.
2. Elimination is really a sequence of multiplications by E's and P's and $D^{-1}$:
$$(D^{-1} \cdots E \cdots P \cdots E)A = I. \qquad (9)$$
$D^{-1}$ divides by the pivots. The matrices E produce zeros below and above the pivots. P will exchange rows if needed (see Section 2.7). The product matrix in equation (9) is evidently a left-inverse. With n pivots we reach $A^{-1}A = I$.
The right-inverse equals the left-inverse. That was Note 2 in this section. So a square matrix with a full set of pivots will always have a two-sided inverse.
Reasoning in reverse will now show that A must have n pivots if AC = I. Then we deduce that C is also a left-inverse. Here is one route to those conclusions:
1. If A doesn't have n pivots, elimination will lead to a zero row.
2. Those elimination steps are taken by an invertible M. So a row of MA is zero.
3. If AC = I then MAC = M. The zero row of MA, times C, gives a zero row of M.
4. The invertible matrix M can't have a zero row! A must have n pivots if AC = I.
5. Then equation (9) displays the left inverse in BA = I, and Note 2 proves B = C.
That argument took five steps, but the outcome is short and important.
2I A complete test for invertibility of a square matrix A comes from elimination. $A^{-1}$ exists (and Gauss-Jordan finds it) exactly when A has n pivots. The full argument shows more:
If AC = I then CA = I and $C = A^{-1}$.
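Example 5 If L is lower triangular with 1's on the diagonal, so is $L^{-1}$.
Use the Gauss-Jordan method to construct $L^{-1}$. Start by subtracting multiples of pivot rows from rows below. Normally this gets us halfway to the inverse, but for L it gets us all the way. $L^{-1}$ appears on the right when I appears on the left:
$$\begin{bmatrix} L & I \end{bmatrix} = \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 3 & 1 & 0 & 0 & 1 & 0 \\ 4 & 5 & 1 & 0 & 0 & 1 \end{array}\right]$$
$$\rightarrow \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -3 & 1 & 0 \\ 0 & 5 & 1 & -4 & 0 & 1 \end{array}\right] \qquad \begin{matrix} (\text{3 times row 1 from row 2}) \\ (\text{4 times row 1 from row 3}) \end{matrix}$$
$$\rightarrow \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -3 & 1 & 0 \\ 0 & 0 & 1 & 11 & -5 & 1 \end{array}\right] = \begin{bmatrix} I & L^{-1} \end{bmatrix} \qquad (\text{then 5 times row 2 from row 3})$$
When L goes to I by elimination, I goes to $L^{-1}$. In other words, the product of elimination matrices $E_{32}E_{31}E_{21}$ is $L^{-1}$. All pivots are 1's (a full set). $L^{-1}$ is lower triangular. The strange entry "11" in $L^{-1}$ does not appear in $E_{21}^{-1}E_{31}^{-1}E_{32}^{-1} = L$.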
■ REVIEW OF THE KEY IDEAS ■
1. The inverse matrix gives $AA^{-1} = I$ and $A^{-1}A = I$.
2. A is invertible if and only if it has n pivots (row exchanges allowed).
3. If Ax = 0 for a nonzero vector x, then A has no inverse.
4. The inverse of AB is the reverse product $B^{-1}A^{-1}$.
5. The Gauss-Jordan method solves $AA^{-1} = I$ to find the n columns of $A^{-1}$. The augmented matrix [A I] is row-reduced to $[I \;\; A^{-1}]$.
■ WORKED EXAMPLES ■
2.5 A Three of these matrices are invertible, and three are singular. Find the inverse when it exists. Give reasons for noninvertibility (zero determinant, too few pivots, nonzero solution to Ax = 0) for the other three, in that order. The matrices A, B, C, D, E, F are
[: n[:n[: ~J [: n[t: [l: fJ
Solution
$$B^{-1} = \frac{1}{4} \begin{bmatrix} 7 & -3 \\ -8 & 4 \end{bmatrix} \qquad C^{-1} = \frac{1}{36} \begin{bmatrix} 0 & 6 \\ 6 & -6 \end{bmatrix} \qquad E^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}$$
A is not invertible because its determinant is 4 · 6 - 3 · 8 = 24 - 24 = 0. D is not invertible because there is only one pivot; the second row becomes zero when the first row is subtracted. F is not invertible because a combination of the columns (the second column minus the first column) is zero. In other words Fx = 0 has the solution x = (-1, 1, 0).
Of course all three reasons for noninvertibility would apply to each of A, D, F.
2.5 B Apply the Gauss-Jordan method to find the inverse of this triangular "Pascal matrix" A = abs(pascal(4,1)). You see Pascal's triangle: adding each entry to the entry on its left gives the entry below. The entries are "binomial coefficients":
$$\text{Triangular Pascal matrix} \qquad A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 1 & 3 & 3 & 1 \end{bmatrix}.$$
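Solution Gauss-Jordan starts with [A I] and produces zeros by subtracting row 1:
$$\begin{bmatrix} A & I \end{bmatrix} = \left[\begin{array}{rrrr|rrrr} 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 3 & 3 & 1 & 0 & 0 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rrrr|rrrr} 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 0 & -1 & 0 & 1 & 0 \\ 0 & 3 & 3 & 1 & -1 & 0 & 0 & 1 \end{array}\right].$$
The next stage creates zeros below the second pivot, using multipliers 2 and 3. Then the last stage subtracts 3 times the new row 3 from the new row 4:
$$\rightarrow \left[\begin{array}{rrrr|rrrr} 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 3 & 1 & 2 & -3 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rrrr|rrrr} 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 0 & 1 & -1 & 3 & -3 & 1 \end{array}\right] = \begin{bmatrix} I & A^{-1} \end{bmatrix}.$$
All the pivots were 1! So we didn't need to divide rows by pivots to get I. The inverse matrix $A^{-1}$ looks like A itself, except odd-numbered diagonals are multiplied by -1.
Please notice that 4 by 4 matrix $A^{-1}$; we will see Pascal matrices again. The same pattern continues to n by n Pascal matrices: the inverse has "alternating diagonals".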
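The "alternating diagonals" pattern can be tested for other sizes without running elimination. This sketch builds the lower triangular Pascal matrix from binomial coefficients and a candidate inverse whose i, j entry carries the sign (-1)^(i+j); both constructions and the names `pascal_lower`, `alternating`, and `is_identity` are assumptions made for this illustration. Multiplying the two should give the identity.

```python
from math import comb

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def pascal_lower(n):
    """Lower triangular Pascal matrix: entry (i, j) is the binomial coefficient C(i, j)."""
    return [[comb(i, j) for j in range(n)] for i in range(n)]

def alternating(A):
    """Multiply entry (i, j) by (-1)^(i+j), which flips the sign on every other diagonal."""
    return [[(-1) ** (i + j) * A[i][j] for j in range(len(A))]
            for i in range(len(A))]

def is_identity(M):
    return all(M[i][j] == int(i == j) for i in range(len(M)) for j in range(len(M)))

A4 = pascal_lower(4)
A4_inv = alternating(A4)

print(A4_inv)
print(is_identity(matmul(A4, A4_inv)))
print(all(is_identity(matmul(pascal_lower(n), alternating(pascal_lower(n))))
          for n in range(1, 9)))
```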
Problem Set 2.5
1 Find the inverses (directly or from the 2 by 2 formula) of A, B, C:
! ~] A =[~ ~] and B =[
and C = [; ~].
2 For these "permutation matrices" find $P^{-1}$ by trial and error (with 1's and 0's):
~ = P 0O OI 0l] [I O 0
and P = [~ ~] . l 00
3 Solve for the columns of A- 1 = (; ~);
J 4 Show that [ i] has no inverse by trying to solvc for the column (x, y):
5 Find an upper triangular U (not diagonal) with $U^2 = I$ and $U = U^{-1}$.
6 (a) If A is invertible and AB = AC, prove quickly that B = C.
(b) If $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, find two matrices $B \neq C$ such that AB = AC.
7 (Important) If A has row 1 + row 2 = row 3, show that A is not invertible:
(a) Explain why Ax = (1, 0, 0) cannot have a solution.
(b) Which right sides $(b_1, b_2, b_3)$ might allow a solution to Ax = b?
(c) What happens to row 3 in elimination?
8 If A has column 1 + column 2 = column 3, show that A is not invertible:
(a) Find a nonzero solution x to Ax = 0. The matrix is 3 by 3.
(b) Elimination keeps column 1 + column 2 = column 3. Explain why there is no third pivot.
9 Suppose A is invertible and you exchange its first two rows to reach B. Is the new matrix B invertible and how would you find $B^{-1}$ from $A^{-1}$?
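10 Find the inverses (in any legal way) of
$$A = \begin{bmatrix} 0 & 0 & 0 & 2 \\ 0 & 0 & 3 & 0 \\ 0 & 4 & 0 & 0 \\ 5 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 3 & 2 & 0 & 0 \\ 4 & 3 & 0 & 0 \\ 0 & 0 & 6 & 5 \\ 0 & 0 & 7 & 6 \end{bmatrix}.$$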
0 0 76
11 (a) Find invertible matrices A and B such that A + B is not invertible.
(b) Find singular matrices A and B such that A+ B is invertible.
12 If the product C = AB is invertible (A and B are square), then A itself is invertible. Find a formula for $A^{-1}$ that involves $C^{-1}$ and B.
13 If the product M = ABC of three square matrices is invertible, then B is invertible. (So are A and C.) Find a formula for $B^{-1}$ that involves $M^{-1}$ and A and C.
14 If you add row 1 of A to row 2 to get B, how do you find $B^{-1}$ from $A^{-1}$?
Notice the order. The inverse of $B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} A \end{bmatrix}$ is ____.
15 Prove that a matrix with a column of zeros cannot have an inverse.
16 Multiply $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ times $\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$. What is the inverse of each matrix if $ad \neq bc$?
17 (a) What matrix E has the same effect as these three steps? Subtract row 1 from row 2, subtract row 1 from row 3, then subtract row 2 from row 3.
(b) What single matrix L has the same effect as these three reverse steps? Add row 2 to row 3, add row 1 to row 3, then add row 1 to row 2.
18 If B is the inverse of $A^2$, show that AB is the inverse of A.
19 Find the numbers a and b that give the inverse of 5 * eye(4) - ones(4,4):
$$\begin{bmatrix} 4 & -1 & -1 & -1 \\ -1 & 4 & -1 & -1 \\ -1 & -1 & 4 & -1 \\ -1 & -1 & -1 & 4 \end{bmatrix}^{-1} = \begin{bmatrix} a & b & b & b \\ b & a & b & b \\ b & b & a & b \\ b & b & b & a \end{bmatrix}.$$
What are a and b in the inverse of 6 * eye(5) - ones(5,5)?
20 Show that A = 4 * eye(4) - ones(4,4) is not invertible: Multiply A * ones(4,1).
21 There are sixteen 2 by 2 matrices whose entries are 1's and 0's. How many of them are invertible?
Questions 22-28 are about the Gauss-Jordan method for calculating $A^{-1}$.
22 Change I into $A^{-1}$ as you reduce A to I (by row operations):
23 Follow the 3 by 3 text example but with plus signs in A. Eliminate above and below the pivots to reduce [A I] to $[I \;\; A^{-1}]$:
$$\begin{bmatrix} A & I \end{bmatrix} = \left[\begin{array}{rrr|rrr} 2 & 1 & 0 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 & 1 & 0 \\ 0 & 1 & 2 & 0 & 0 & 1 \end{array}\right].$$
24 Use Gauss-Jordan elimination on [A I] to solve $AA^{-1} = I$:
25 Find $A^{-1}$ and $B^{-1}$ (if they exist) by elimination on [A I] and [B I]:
= 2 -1 -l]
and B - 1 2 - 1 .
[-1 - 1 2
U 26 What three matrices E 21 and E 12 and D- 1 reduce A - i] to the identity
matrix? Multiply $D^{-1}E_{12}E_{21}$ to find $A^{-1}$.
27 Invert these matrices A by the Gauss-Jordan method starting with [ A I ):
A=[: I] A = 2I OI 3OJ [0 0 I
and
21 2 .
2 3
~
t7
28 Exchange rows and continue with Gauss-Jordan to find A- 1:
-l-J.
J
1·1
[A I] = [ O2 22 OI O1 J •
29 True or false (with a counterexample if false and a reason if true):
(a) A 4 by 4 matrix with a row of zeros is not invertible.
(b) A matrix with l's down the main diagonal is invertible. (c) If A is invertible then A- 1 is invertible. (d) If A is invertible then A2 is invertible.
30 For which three numbers c is this matrix not invertible, and why not?
2 C C] A= C C C .
[8 7 C
31 Prove that A is invertible if $a \neq 0$ and $a \neq b$ (find the pivots or $A^{-1}$):
32 This matrix has a remarkable inverse. Find $A^{-1}$ by elimination on [A I]. Extend to a 5 by 5 "alternating matrix" and guess its inverse; then multiply to confirm.
$$A = \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
33 Use the 4 by 4 inverse in Question 32 to solve Ax = (1, 1, 1, 1).
34 Suppose P and Q have the same rows as I but in any order. Show that P - Q is singular by solving (P - Q)x = 0.
35 Find and check the inverses (assuming they exist) of these block matrices:
36 If an invertible matrix A commutes with C (this means AC = CA) show that $A^{-1}$ commutes with C. If also B commutes with C, show that AB commutes with C. Translation: If AC = CA and BC = CB then (AB)C = C(AB).
37 Could a 4 by 4 matrix A be invertible if every row contains the numbers 0, 1, 2, 3 in some order? What if every row of B contains 0, 1, 2, -3 in some order?
38 In the worked example 2.5 B, the triangular Pascal matrix A has an inverse with "alternating diagonals". Check that this $A^{-1}$ is DAD, where the diagonal matrix D has alternating entries 1, -1, 1, -1. Then ADAD = I, so what is the inverse of AD = pascal(4, 1)?
39 The Hilbert matrices have $H_{ij} = 1/(i + j - 1)$. Ask MATLAB for the exact 6 by 6 inverse invhilb(6). Then ask for inv(hilb(6)). How can these be different, when the computer never makes mistakes?
40 Use inv(S) to invert MATLAB's 4 by 4 symmetric matrix S = pascal(4). Create Pascal's lower triangular A = abs(pascal(4, 1)) and test inv(S) = inv(A') * inv(A).
41 If A = ones(4,4) and b = rand(4,1), how does MATLAB tell you that Ax = b has no solution? If b = ones(4,1), which solution to Ax = b is found by A\b?
42 If AC = I and AC* = I (all square matrices) use 2I to prove that C = C*.
43 Direct multiplication gives $MM^{-1} = I$, and I would recommend doing #3. $M^{-1}$ shows the change in $A^{-1}$ (useful to know) when a matrix is subtracted from A:
1 $M = I - uv$ and $M^{-1} = I + uv/(1 - vu)$
2 $M = A - uv$ and $M^{-1} = A^{-1} + A^{-1}uvA^{-1}/(1 - vA^{-1}u)$
3 $M = I - UV$ and $M^{-1} = I_n + U(I_m - VU)^{-1}V$
4 $M = A - UW^{-1}V$ and $M^{-1} = A^{-1} + A^{-1}U(W - VA^{-1}U)^{-1}VA^{-1}$
The Woodbury-Morrison formula 4 is the "matrix inversion lemma" in engineering. The four identities come from the 1, 1 block when inverting these matrices (v is 1 by n, u is n by 1, V is m by n, U is n by m, $m \leq n$):
$$\begin{bmatrix} I_n & U \\ V & I_m \end{bmatrix}$$
= ELIMINATION FACTORIZATION: A = LU ■ 2.6
Students often say that mathematics courses are too theoretical. Well, not this section.
It is almost purely practical. The goal is to describe Gaussian elimination in the most
useful way. Many key ideas of linear algebra, when you look at them closely, are really
factorizations of a matrix. The original matrix A becomes the product of two or three
special matrices. The first factorization (also the most important in practice) comes
now from elimination. The factors are triangular matrices. The factorization that
comes from elimination is A = LU.
We already know U, the upper triangular matrix with the pivots on its diagonal.
The elimination steps take A to U. We will show how reversing those steps (taking
U back to A) is achieved by a lower triangular L. The entries of L are exactly the
multipliers $\ell_{ij}$, which multiplied row j when it was subtracted from row i.
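Start with a 2 by 2 example. The matrix A contains 2, 1, 6, 8. The number to
eliminate is 6. Subtract 3 times row 1 from row 2. That step is $E_{21}$ in the forward
direction. The return step from U to A is $L = E_{21}^{-1}$ (an addition using +3):
Forward from A to U: $E_{21}A = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} 2 & 1 \\ 6 & 8 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 0 & 5 \end{bmatrix} = U$
Back from U to A: $E_{21}^{-1}U = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 2 & 1 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 6 & 8 \end{bmatrix} = A.$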
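The two lines of the 2 by 2 example can be checked by direct multiplication. This is a minimal sketch in plain Python; the helper name `matmul` is an assumption made for illustration and does not come from the text.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[2, 1], [6, 8]]
E21 = [[1, 0], [-3, 1]]   # forward step: subtract 3 times row 1 from row 2
L = [[1, 0], [3, 1]]      # return step: add 3 times row 1 back to row 2

U = matmul(E21, A)        # forward from A to U
A_back = matmul(L, U)     # back from U to A

print(U)                  # [[2, 1], [0, 5]]
print(A_back)             # [[2, 1], [6, 8]]
```

The product `matmul(E21, L)` is the identity, which is what makes L the inverse of the forward step.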
The second line is our factorization. Instead of $E_{21}^{-1}U = A$ we write LU = A. Move
now to larger matrices with many E's. Then L will include all their inverses.
Each step from A to U multiplies by a matrix $E_{ij}$ to produce zero in the (i, j) position. To keep this clear, we stay with the most frequent case, when no row exchanges are involved. If A is 3 by 3, we multiply by $E_{21}$ and $E_{31}$ and $E_{32}$. The multipliers $\ell_{ij}$ produce zeros in the (2, 1) and (3, 1) and (3, 2) positions, all below the diagonal. Elimination ends with the upper triangular U.
Now move those E's onto the other side, where their inverses multiply U:
$$(E_{32}E_{31}E_{21})A = U \quad \text{becomes} \quad A = (E_{21}^{-1}E_{31}^{-1}E_{32}^{-1})U \quad \text{which is} \quad A = LU. \qquad (1)$$
The inverses go in opposite order, as they must. That product of three inverses is L.
We have reached A= LU. Now we stop to understand it.
Explanation and Examples
First point: Every inverse matrix $E_{ij}^{-1}$ is lower triangular. Its off-diagonal entry is
$\ell_{ij}$, to undo the subtraction with $-\ell_{ij}$. The main diagonals of E and $E^{-1}$ contain 1's.
Our example above had $\ell_{21} = 3$ and $E = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}$ and $E^{-1} = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}$.
Second point: Equation (1) shows a lower triangular matrix (the product of $E_{ij}$) multiplying A. It also shows a lower triangular matrix (the product of $E_{ij}^{-1}$) multiplying U to bring back A. This product of inverses is L.
One reason for working with the inverses is that we want to factor A, not U.
The "inverse form" gives A = LU. The second reason is that we get something extra,
almost more than we deserve. This is the third point, showing that L is exactly right.
Third point: Each multiplier $\ell_{ij}$ goes directly into its i, j position, unchanged, in the product of inverses which is L. Usually matrix multiplication will mix up all the numbers. Here that doesn't happen. The order is right for the inverse matrices, to keep the $\ell$'s unchanged. The reason is given below in equation (3).
Since each $E^{-1}$ has 1's down its diagonal, the final good point is that L does too.
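The third point can be seen on a small case. The sketch below uses three hypothetical multipliers (2, 3 and 4, chosen only for illustration) and compares the product of inverses, taken in the order that gives L, with the product of the forward matrices. The helper names `matmul` and `elementary` are assumptions, not names from the text.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def elementary(n, i, j, value):
    """Identity matrix with `value` placed in row i, column j (counting from 1)."""
    M = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    M[i - 1][j - 1] = value
    return M

l21, l31, l32 = 2, 3, 4   # hypothetical multipliers, for illustration only

# Forward steps subtract; their inverses add the same multiples back.
E21, E31, E32 = elementary(3, 2, 1, -l21), elementary(3, 3, 1, -l31), elementary(3, 3, 2, -l32)
E21inv, E31inv, E32inv = elementary(3, 2, 1, l21), elementary(3, 3, 1, l31), elementary(3, 3, 2, l32)

L = matmul(matmul(E21inv, E31inv), E32inv)   # inverses in opposite order
E = matmul(matmul(E32, E31), E21)            # forward product that multiplies A

print(L)   # [[1, 0, 0], [2, 1, 0], [3, 4, 1]]
print(E)   # [[1, 0, 0], [-2, 1, 0], [5, -4, 1]]
```

The multipliers 2, 3, 4 sit unchanged in L, while the forward product has a mixed-up entry 5 in its (3, 1) position.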
2J (A = LU) This is elimination without row exchanges. The upper triangular U
has the pivots on its diagonal. The lower triangular L has all 1's on its diagonal. The
multipliers $\ell_{ij}$ are below the diagonal of L.
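Example 1 The matrix A has 1, 2, 1 on its diagonals. Elimination subtracts $\frac{1}{2}$ times
row 1 from row 2. The last step subtracts $\frac{2}{3}$ times row 2 from row 3. The lower
triangular L has $\ell_{21} = \frac{1}{2}$ and $\ell_{32} = \frac{2}{3}$. Multiplying LU produces A:
$$A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ \frac{1}{2} & 1 & 0 \\ 0 & \frac{2}{3} & 1 \end{bmatrix}\begin{bmatrix} 2 & 1 & 0 \\ 0 & \frac{3}{2} & 1 \\ 0 & 0 & \frac{4}{3} \end{bmatrix} = LU.$$
The (3, 1) multiplier is zero because the (3, 1) entry in A is zero. No operation needed.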
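Example 1 can be reproduced by running elimination and recording each multiplier as it is used. This is a minimal sketch, assuming no row exchanges are needed; the function name `lu_no_exchange` is hypothetical, and exact fractions are used so the multipliers print as 1/2 and 2/3.

```python
from fractions import Fraction

def lu_no_exchange(A):
    """Factor a square matrix as A = LU, assuming every pivot is nonzero."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]   # working copy, becomes U
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if U[k][k] == 0:
            raise ZeroDivisionError("zero pivot: a row exchange is needed")
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]             # multiplier l_ik
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]        # row i minus l_ik times pivot row k
    return L, U

A = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
L, U = lu_no_exchange(A)
print([str(L[1][0]), str(L[2][0]), str(L[2][1])])   # ['1/2', '0', '2/3']
print([str(U[k][k]) for k in range(3)])             # ['2', '3/2', '4/3']
```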
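Example 2 Change the top left entry from 2 to 1. The pivots all become 1. The
multipliers are all 1. That pattern continues when A is 4 by 4:
$$A = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & & & \\ 1 & 1 & & \\ 0 & 1 & 1 & \\ 0 & 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 & 0 & 0 \\ & 1 & 1 & 0 \\ & & 1 & 1 \\ & & & 1 \end{bmatrix}.$$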
These LU examples are showing something extra, which is very important in practice. Assume no row exchanges. When can we predict zeros in L and U?
When a row of A starts with zeros, so does that row of L. When a column of A starts with zeros, so does that column of U.
If a row starts with zero, we don't need an elimination step. L has a zero, which saves computer time. Similarly, zeros at the start of a column survive into U. But please realize: Zeros in the middle of a matrix are likely to be filled in, while elimination
sweeps forward. We now explain why L has the multipliers $\ell_{ij}$ in position, with no
mix-up.
The key reason why A equals LU: Ask yourself about the pivot rows that are subtracted from lower rows. Are they the original rows of A? No, elimination probably changed them. Are they rows of U? Yes, the pivot rows never change again. When computing the third row of U, we subtract multiples of earlier rows of U (not rows of A!):
Row 3 of U = (Row 3 of A) - $\ell_{31}$(Row 1 of U) - $\ell_{32}$(Row 2 of U). (2)
Rewrite this equation to see that the row [ $\ell_{31}$ $\ell_{32}$ 1 ] is multiplying U:
(Row 3 of A) = $\ell_{31}$(Row 1 of U) + $\ell_{32}$(Row 2 of U) + 1(Row 3 of U). (3)
This is exactly row 3 of A = LU. All rows look like this, whatever the size of A.
With no row exchanges, we have A = LU.
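Equations (2) and (3) are written for row 3. A sketch of the same statement for a general row i, using the notation already introduced ($\ell_{ij}$ for the multipliers, with the sum running over the earlier pivot rows j < i), would read:

```latex
\text{Row } i \text{ of } U = (\text{Row } i \text{ of } A) - \sum_{j<i} \ell_{ij}\,(\text{Row } j \text{ of } U)
\quad\Longrightarrow\quad
\text{Row } i \text{ of } A = \sum_{j<i} \ell_{ij}\,(\text{Row } j \text{ of } U) + 1\cdot(\text{Row } i \text{ of } U).
```

The right side is row i of L, namely $[\,\ell_{i1}\ \cdots\ \ell_{i,i-1}\ \ 1\ \ 0\ \cdots\ 0\,]$, multiplying U.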
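Remark The LU factorization is "unsymmetric" because U has the pivots on its diagonal where L has 1's. This is easy to change. Divide U by a diagonal matrix D that contains the pivots. That leaves a new matrix with 1's on the diagonal:
Split U into
$$\begin{bmatrix} d_1 & & & \\ & d_2 & & \\ & & \ddots & \\ & & & d_n \end{bmatrix}\begin{bmatrix} 1 & u_{12}/d_1 & u_{13}/d_1 & \cdots \\ & 1 & u_{23}/d_2 & \cdots \\ & & \ddots & \vdots \\ & & & 1 \end{bmatrix}.$$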
It is convenient (but a little confusing) to keep the same letter U for this new upper triangular matrix. It has 1's on the diagonal (like L). Instead of the normal LU, the new form has D in the middle: Lower triangular L times diagonal D times upper triangular U.
The triangular factorization can be written A = LU or A = LDU.
Whenever you see LDU, it is understood that U has 1's on the diagonal. Each row
is divided by its first nonzero entry, the pivot. Then L and U are treated evenly in
LDU:
$$\begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 2 & 8 \\ 0 & 5 \end{bmatrix} \quad \text{splits further into} \quad \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 2 & \\ & 5 \end{bmatrix}\begin{bmatrix} 1 & 4 \\ 0 & 1 \end{bmatrix}. \qquad (4)$$
The pivots 2 and 5 went into D. Dividing the rows by 2 and 5 left the rows [ 1 4 ]
and [ 0 1 ] in the new U. The multiplier 3 is still in L.
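The split in equation (4) divides each row of U by its pivot. A minimal sketch, assuming every pivot is nonzero; the function name `split_pivots` is an illustrative assumption.

```python
from fractions import Fraction

def split_pivots(U):
    """Split upper triangular U into D (the pivots) and a new U with 1's on the diagonal."""
    n = len(U)
    D = [[Fraction(U[i][i]) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]
    U1 = [[Fraction(U[i][j]) / U[i][i] for j in range(n)] for i in range(n)]
    return D, U1

U = [[2, 8], [0, 5]]
D, U1 = split_pivots(U)
print([[int(x) for x in row] for row in D])    # [[2, 0], [0, 5]]
print([[int(x) for x in row] for row in U1])   # [[1, 4], [0, 1]]
```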
My own lectures sometimes stop at this point. The next paragraphs show how elimination codes are organized, and how long they take. If MATLAB (or any software) is available, I strongly recommend the last problems 32 to 35. You can measure the
computing time by just counting the seconds!
One Square System = Two Triangular Systems
The matrix L contains our memory of Gaussian elimination. It holds the numbers that multiplied the pivot rows, before subtracting them from lower rows. When do we need this record and how do we use it?
We need L as soon as there is a right side b. The factors L and U were completely decided by the left side (the matrix A). On the right side of Ax = b, we use
Solve:
1 Factor (into L and U, by forward elimination on A)
2 Solve (forward elimination on b using L, then back substitution using U).
Earlier, we worked on b while we were working on A. No problem with that: just augment A by an extra column b. But most computer codes keep the two sides separate. The memory of forward elimination is held in L and U, at no extra cost in storage. Then we process b whenever we want to. The User's Guide to LINPACK remarks that "This situation is so common and the savings are so important that no provision has been made for solving a single system with just one subroutine."
How does Solve work on b? First, apply forward elimination to the right side (the
multipliers are stored in L, use them now). This changes b to a new right side c. We
are really solving Lc = b. Then back substitution solves Ux = c as always. The original system Ax = b is factored into two triangular systems:
Solve Lc = b and then solve Ux = c. (5)
To see that x is correct, multiply Ux = c by L. Then LUx = Lc is just Ax = b.
To emphasize: There is nothing new about those steps. This is exactly what we
have done all along. We were really solving the triangular system Lc = b as elimination went forward. Then back substitution produced x. An example shows it all.
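Example 3 Forward elimination on Ax = b ends at Ux = c:
$$\begin{aligned} u + 2v &= 5 \\ 4u + 9v &= 21 \end{aligned} \quad \text{becomes} \quad \begin{aligned} u + 2v &= 5 \\ v &= 1. \end{aligned}$$
The multiplier was 4, which is saved in L. The right side used it to find c:
Lc = b The lower triangular system $\begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix}\begin{bmatrix} c \end{bmatrix} = \begin{bmatrix} 5 \\ 21 \end{bmatrix}$ gives $c = \begin{bmatrix} 5 \\ 1 \end{bmatrix}$.
Ux = c The upper triangular system $\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \end{bmatrix}$ gives $x = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$.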
It is satisfying that L and U can take the $n^2$ storage locations that originally held A. The $\ell$'s go below the diagonal. The whole discussion is only looking to see what elimination actually did.
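The two triangular solves of Example 3 can be sketched in a few lines. The function names `forward_substitution` and `back_substitution` are assumptions made for illustration; L is taken to have 1's on its diagonal and U to have nonzero pivots.

```python
def forward_substitution(L, b):
    """Solve Lc = b from top to bottom (L is lower triangular with 1's on the diagonal)."""
    c = []
    for k in range(len(b)):
        s = sum(L[k][j] * c[j] for j in range(k))
        c.append(b[k] - s)
    return c

def back_substitution(U, c):
    """Solve Ux = c from bottom to top (U is upper triangular with nonzero pivots)."""
    n = len(c)
    x = [0] * n
    for k in range(n - 1, -1, -1):
        t = sum(U[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (c[k] - t) / U[k][k]   # divide by the pivot
    return x

L = [[1, 0], [4, 1]]
U = [[1, 2], [0, 1]]
b = [5, 21]
c = forward_substitution(L, b)
x = back_substitution(U, c)
print(c)   # [5, 1]
print(x)   # [3.0, 1.0]
```

Once L and U are known, a new right side b costs only these two loops. The factorization is not repeated.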
The Cost of Elimination
A very practical question is cost, or computing time. Can we solve 1000 equations
on a PC? What if n = 10,000? Large systems come up all the time in scientific computing, where a three-dimensional problem can easily lead to a million unknowns. We can let the calculation run overnight, but we can't leave it for 100 years.
The first stage of elimination, on column 1, produces zeros below the first pivot. To find each new entry below the pivot row requires one multiplication and one subtraction. We will count this first stage as $n^2$ multiplications and $n^2$ subtractions. It is actually less, $n^2 - n$, because row 1 does not change.
The next stage clears out the second column below the second pivot. The working matrix is now of size n - 1. Estimate this stage by $(n - 1)^2$ multiplications and subtractions. The matrices are getting smaller as elimination goes forward. The rough
count to reach U is the sum of squares $n^2 + (n - 1)^2 + \cdots + 2^2 + 1^2$.
There is an exact formula $\frac{1}{3}n(n + \frac{1}{2})(n + 1)$ for this sum of squares. When n is
large, the $\frac{1}{2}$ and the 1 are not important. The number that matters is $\frac{1}{3}n^3$. The sum of
squares is like the integral of $x^2$! The integral from 0 to n is $\frac{1}{3}n^3$:
Elimination on A requires about $\frac{1}{3}n^3$ multiplications and $\frac{1}{3}n^3$ subtractions.
What about the right side b? Going forward, we subtract multiples of $b_1$ from the lower components $b_2, \ldots, b_n$. This is n - 1 steps. The second stage takes only n - 2 steps, because $b_1$ is not involved. The last stage of forward elimination takes one step.
Now start back substitution. Computing $x_n$ uses one step (divide by the last pivot). The next unknown uses two steps. When we reach $x_1$ it will require n steps (n - 1 substitutions of the other unknowns, then division by the first pivot). The total count
on the right side, from b to c to x (forward to the bottom and back to the top), is
exactly $n^2$:
$$[(n-1) + (n-2) + \cdots + 1] + [1 + 2 + \cdots + (n-1) + n] = n^2. \qquad (6)$$
To see that sum, pair off (n - 1) with 1 and (n - 2) with 2. The pairings leave n terms, each equal to n. That makes $n^2$. The right side costs a lot less than the left side!
Each right side needs $n^2$ multiplications and $n^2$ subtractions.
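These counts can be checked by walking through the loops of elimination and counting steps instead of doing arithmetic. The sketch below assumes one convention: each row below the pivot costs one division (to find its multiplier) plus one multiply-subtract for every entry to the right of the pivot column, which reproduces the $n^2 - n$ of the first stage. The function names are hypothetical.

```python
def count_left_side(n):
    """Count the steps of forward elimination on an n by n matrix A."""
    count = 0
    for k in range(n):                 # stage k clears column k below the pivot
        for i in range(k + 1, n):      # each row below the pivot row
            count += 1                 # one division to find the multiplier
            count += n - k - 1         # one multiply-subtract per entry right of column k
    return count

def count_right_side(n):
    """Count the steps on the right side: forward on b, then back substitution."""
    forward = sum(range(1, n))         # (n-1) + (n-2) + ... + 1
    back = sum(range(1, n + 1))        # 1 + 2 + ... + n
    return forward + back

for n in (2, 3, 10, 100):
    print(n, count_left_side(n), (n**3 - n) // 3, count_right_side(n), n**2)
```

For each n the second and third printed numbers agree, and so do the fourth and fifth: the left side costs $\frac{1}{3}(n^3 - n)$ steps and the right side costs $n^2$.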
Here are the MATLAB codes to factor A into LU and to solve Ax = b. The program slu stops right away if a number smaller than the tolerance "tol" appears in a pivot position. Later the program plu will look down the column for a pivot, to execute a row exchange and continue solving. These Teaching Codes are on web.mit.edu/18.06/www.
function [L, U] = slu(A)
% Square LU factorization with no row exchanges!
[n, n] = size(A); tol = 1.e-6;
for k = 1:n
    if abs(A(k, k)) < tol
        % Cannot proceed without a row exchange: stop
        error('Cannot proceed without a row exchange: stop')
    end
    L(k, k) = 1;
    for i = k+1:n % Multipliers for column k are put into L
        L(i, k) = A(i, k)/A(k, k);
        for j = k+1:n % Elimination beyond row k and column k
            A(i, j) = A(i, j) - L(i, k)*A(k, j); % Matrix still called A
        end
    end
    for j = k:n
        U(k, j) = A(k, j); % Row k is settled, now name it U
    end
end
function x = slv(A, b)
% Solve Ax = b using L and U from slu(A). No row exchanges!
[L, U] = slu(A);
n = length(b);
for k = 1:n
    s = 0; % Running sum for row k of L
    for j = 1:k-1
        s = s + L(k, j)*c(j);
    end
    c(k) = b(k) - s; % Forward elimination to solve Lc = b
end
for k = n:-1:1 % Going backwards from x(n) to x(1)
    t = 0; % Running sum for row k of U
    for j = k+1:n % Back substitution
        t = t + U(k, j)*x(j);
    end
    x(k) = (c(k) - t)/U(k, k); % Divide by pivot
end
x = x'; % Transpose to column vector
How long does it take to solve Ax = b? For a random matrix of order n = 1000,
we tried the MATLAB command tic; A\b; toc. The time on my PC was 3 seconds.
For n = 2000 the time was 20 seconds, which is approaching the $n^3$ rule. The time is
multiplied by about 8 when n is multiplied by 2.
According to this $n^3$ rule, matrices that are 10 times as large (order 10,000) will
take thousands of seconds. Matrices of order 100,000 will take millions of seconds.
This is too expensive without a supercomputer, but remember that these matrices are
full. Most matrices in practice are sparse (many zero entries). In that case A = LU
is much faster. For tridiagonal matrices of order 10,000, storing only the nonzeros,
solving Ax = b is a breeze.
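To see why the tridiagonal case is so cheap, here is a sketch that stores only the three diagonals and uses one multiplier per row. It is elimination without row exchanges specialized to three diagonals, not a code from the text. The function name `solve_tridiagonal`, the -1, 2, -1 test matrix and the right side are all assumptions chosen for illustration, and every pivot is assumed to be nonzero.

```python
def solve_tridiagonal(lower, diag, upper, b):
    """Solve Ax = b for tridiagonal A without row exchanges.

    lower[i] is A[i+1][i], diag[i] is A[i][i], upper[i] is A[i][i+1].
    Assumes every pivot is nonzero.
    """
    n = len(diag)
    pivots = [float(diag[0])]
    c = [float(b[0])]
    for k in range(1, n):                        # forward elimination: one multiplier per row
        m = lower[k - 1] / pivots[k - 1]
        pivots.append(diag[k] - m * upper[k - 1])
        c.append(b[k] - m * c[k - 1])
    x = [0.0] * n
    x[n - 1] = c[n - 1] / pivots[n - 1]
    for k in range(n - 2, -1, -1):               # back substitution: one term per row
        x[k] = (c[k] - upper[k] * x[k + 1]) / pivots[k]
    return x

n = 10000
lower = [-1.0] * (n - 1)
diag = [2.0] * n
upper = [-1.0] * (n - 1)
# Right side chosen so that the exact solution is all ones (an illustrative choice).
b = [1.0] + [0.0] * (n - 2) + [1.0]
x = solve_tridiagonal(lower, diag, upper, b)
print(max(abs(xi - 1.0) for xi in x))            # a small rounding error
```

The work grows in proportion to n instead of $n^3$, since each stage touches only one row below the pivot and one entry to its right.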
■ REVIEW OF THE KEY IDEAS ■
1. Gaussian elimination (with no row exchanges) factors A into L times U.
2. The lower triangular L contains the numbers that multiply pivot rows, going from
A to U. The product LU adds those rows back to recover A.
3. On the right side we solve Lc = b (forward) and Ux = c (backwards).
4. There are $\frac{1}{3}(n^3 - n)$ multiplications and subtractions on the left side.
5. There are $n^2$ multiplications and subtractions on the right side.
■ WORKED EXAMPLES ■
2.6 A The lower triangular Pascal matrix $P_L$ was in the worked example 2.5 B. (It contains the "Pascal triangle" and Gauss-Jordan found its inverse.) This problem connects $P_L$ to the symmetric Pascal matrix $P_S$ and the upper triangular $P_U$. The symmetric $P_S$ has Pascal's triangle tilted, so each entry is the sum of the entry above and the entry to the left. The n by n symmetric $P_S$ is pascal(n) in MATLAB.
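Problem: Establish the amazing lower-upper factorization $P_S = P_L P_U$:
$$\text{pascal}(4) = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 3 & 6 & 10 \\ 1 & 4 & 10 & 20 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 1 & 3 & 3 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{bmatrix} = P_L P_U.$$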
Then predict and check the next row and column for 5 by 5 Pascal matrices.
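Solution You could multiply $P_L P_U$ to get $P_S$. Better to start with the symmetric
$P_S$ and reach the upper triangular $P_U$ by elimination:
$$P_S = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 3 & 6 & 10 \\ 1 & 4 & 10 & 20 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 2 & 5 & 9 \\ 0 & 3 & 9 & 19 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 3 & 10 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{bmatrix} = P_U.$$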
The multipliers $\ell_{ij}$ that entered these steps go perfectly into $P_L$. Then $P_S = P_L P_U$ is a particularly neat example of A = LU. Notice that every pivot is 1! The pivots are
on the diagonal of $P_U$. The next section will show how symmetry produces a special relationship between the triangular L and U. You see $P_U$ as the "transpose" of $P_L$.
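The factorization can be checked numerically for the 4 by 4 and 5 by 5 cases. This sketch builds the Pascal matrices from binomial coefficients with `math.comb`; the function names and the zero-based indexing are assumptions made for illustration.

```python
from math import comb

def pascal_symmetric(n):
    """Symmetric Pascal matrix: entry (i, j) is C(i + j, i), counting from 0."""
    return [[comb(i + j, i) for j in range(n)] for i in range(n)]

def pascal_lower(n):
    """Lower triangular Pascal matrix: entry (i, j) is C(i, j), zero above the diagonal."""
    return [[comb(i, j) for j in range(n)] for i in range(n)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

for n in (4, 5):
    PS, PL = pascal_symmetric(n), pascal_lower(n)
    PU = transpose(PL)
    print(n, matmul(PL, PU) == PS)   # 4 True, then 5 True

print(pascal_symmetric(5)[4])        # [1, 5, 15, 35, 70]
print(pascal_lower(5)[4])            # [1, 4, 6, 4, 1]
```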
You might expect the MATLAB command lu(pascal(4)) to produce these factors $P_L$ and $P_U$. That doesn't happen because the lu subroutine chooses the largest available pivot in each column (it will exchange rows so the second pivot is 3). But a different command chol factors without row exchanges. Then R = chol(pascal(4)) produces the upper triangular Pascal matrix $P_U$, and its transpose R' is $P_L$. Try it. In the 5 by 5 case the new fifth rows do maintain $P_S = P_L P_U$:
Next Row: 1 5 15 35 70 for $P_S$, and 1 4 6 4 1 for $P_L$.
I will only check that this fifth row of $P_L$ times the (same) fifth column of $P_U$ gives
$1^2 + 4^2 + 6^2 + 4^2 + 1^2 = 70$ in the fifth row of $P_S$. The full proof of $P_S = P_L P_U$
is quite fascinating: this factorization can be reached in at least four different ways. I am going to put these proofs on the course web page web.mit.edu/18.06/www, which is also available through MIT's OpenCourseWare at ocw.mit.edu.
These Pascal matrices $P_S$, $P_L$, $P_U$ have so many remarkable properties that we will
see them again. You could locate them using the Index at the end of the book.
2.6 B The problem is: Solve $P_S x = b = (1, 0, 0, 0)$. This special right side means that x will be the first column of $P_S^{-1}$. That is Gauss-Jordan, matching the columns
of $P_S P_S^{-1} = I$. We already know the triangular $P_L$ and $P_U$ from 2.6 A, so we solve
$P_L c = b$ (forward substitution)
$P_U x = c$ (back substitution).
Use MATLAB to find the full inverse matrix $P_S^{-1}$.
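Solution The lower triangular system $P_L c = b$ is solved top to bottom:
$$\begin{aligned} c_1 &= 1 \\ c_1 + c_2 &= 0 \\ c_1 + 2c_2 + c_3 &= 0 \\ c_1 + 3c_2 + 3c_3 + c_4 &= 0 \end{aligned} \quad \text{gives} \quad \begin{aligned} c_1 &= +1 \\ c_2 &= -1 \\ c_3 &= +1 \\ c_4 &= -1 \end{aligned}$$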
Forward elimination is multiplication by $P_L^{-1}$. It produces the upper triangular system
$P_U x = c$. The solution x comes as always by back substitution, bottom to top:
$$\begin{aligned} x_1 + x_2 + x_3 + x_4 &= 1 \\ x_2 + 2x_3 + 3x_4 &= -1 \\ x_3 + 3x_4 &= 1 \\ x_4 &= -1 \end{aligned} \quad \text{gives} \quad \begin{aligned} x_1 &= +4 \\ x_2 &= -6 \\ x_3 &= +4 \\ x_4 &= -1 \end{aligned}$$
The complete inverse matrix $P_S^{-1}$ has that x in its first column:
$$\text{inv(pascal(4))} = \begin{bmatrix} 4 & -6 & 4 & -1 \\ -6 & 14 & -11 & 3 \\ 4 & -11 & 10 & -3 \\ -1 & 3 & -3 & 1 \end{bmatrix}.$$
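The whole inverse can be built the same way, one column at a time: each column of the identity is a right side b, solved with one forward and one back substitution. This is a sketch in exact arithmetic; the function names are illustrative assumptions, and the triangular Pascal factors are built from binomial coefficients.

```python
from fractions import Fraction
from math import comb

def forward_substitution(L, b):
    """Solve Lc = b from top to bottom (L has 1's on its diagonal)."""
    c = []
    for k in range(len(b)):
        c.append(b[k] - sum(L[k][j] * c[j] for j in range(k)))
    return c

def back_substitution(U, c):
    """Solve Ux = c from bottom to top (U has nonzero pivots)."""
    n = len(c)
    x = [0] * n
    for k in range(n - 1, -1, -1):
        t = sum(U[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (c[k] - t) / U[k][k]
    return x

n = 4
PL = [[comb(i, j) for j in range(n)] for i in range(n)]   # lower triangular Pascal
PU = [[comb(j, i) for j in range(n)] for i in range(n)]   # its transpose

columns = []
for k in range(n):
    b = [Fraction(int(i == k)) for i in range(n)]         # column k of the identity
    c = forward_substitution(PL, b)
    columns.append(back_substitution(PU, c))

inverse = [[int(columns[j][i]) for j in range(n)] for i in range(n)]
for row in inverse:
    print(row)
# [4, -6, 4, -1]
# [-6, 14, -11, 3]
# [4, -11, 10, -3]
# [-1, 3, -3, 1]
```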
Problem Set 2.6
Problems 1-14 compute the factorization A= LU (and also A= LDU).
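1 (Important) Forward elimination changes $\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}x = b$ to a triangular $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}x = c$:
$$\begin{aligned} x + y &= 5 \\ x + 2y &= 7 \end{aligned} \quad \rightarrow \quad \begin{aligned} x + y &= 5 \\ y &= 2 \end{aligned} \qquad \begin{bmatrix} 1 & 1 & 5 \\ 1 & 2 & 7 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 1 & 5 \\ 0 & 1 & 2 \end{bmatrix}$$
That step subtracted $\ell_{21}$ = __ times row 1 from row 2. The reverse step
adds $\ell_{21}$ times row 1 to row 2. The matrix for that reverse step is L = __.
Multiply this L times the triangular system $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}x = \begin{bmatrix} 5 \\ 2 \end{bmatrix}$ to get __ = __.
In letters, L multiplies Ux = c to give __.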
2 (Move to 3 by 3) Forward elimination changes Ax = b to a triangular Ux = c:
$$\begin{aligned} x + y + z &= 5 \\ x + 2y + 3z &= 7 \\ x + 3y + 6z &= 11 \end{aligned} \qquad \begin{aligned} x + y + z &= 5 \\ y + 2z &= 2 \\ 2y + 5z &= 6 \end{aligned} \qquad \begin{aligned} x + y + z &= 5 \\ y + 2z &= 2 \\ z &= 2 \end{aligned}$$
The equation z = 2 in Ux = c comes from the original x + 3y + 6z = 11 in
Ax = b by subtracting $\ell_{31}$ = __ times equation 1 and $\ell_{32}$ = __ times
the final equation 2. Reverse that to recover [ 1 3 6 11 ] in A and b from the
final [ 1 1 1 5 ] and [ 0 1 2 2 ] and [ 0 0 1 2 ] in U and c:
Row 3 of [ A b ] = ($\ell_{31}$ Row 1 + $\ell_{32}$ Row 2 + 1 Row 3) of [ U c ].
In matrix notation this is multiplication by L. So A = LU and b = Lc.
3 Write down the 2 by 2 triangular systems Lc = b and Ux = c from Problem 1. Check that c = (5, 2) solves the first one. Find x that solves the second one.
4 What are the 3 by 3 triangular systems Lc = b and Ux = c from Problem 2?
Check that c = (5, 2, 2) solves the first one. Which x solves the second one?
5 What matrix E puts A into triangular form EA = U? Multiply by $E^{-1} = L$ to
factor A into LU:
$$A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 4 & 2 \\ 6 & 3 & 5 \end{bmatrix}.$$
6 What two elimination matrices $E_{21}$ and $E_{32}$ put A into upper triangular form
$E_{32}E_{21}A = U$? Multiply by $E_{32}^{-1}$ and $E_{21}^{-1}$ to factor A into $LU = E_{21}^{-1}E_{32}^{-1}U$:
I I A= 2 4
[0 4