INTRODUCTION TO LINEAR ALGEBRA
Third Edition

GILBERT STRANG
Massachusetts Institute of Technology

WELLESLEY-CAMBRIDGE PRESS
Box 812060, Wellesley MA 02482

Introduction to Linear Algebra, 3rd Edition
Copyright ©2003 by Gilbert Strang
All rights reserved. No part of this work may be reproduced or stored or transmitted by any means, including photocopying, without written permission from Wellesley-Cambridge Press. Translation in any language is strictly prohibited - authorized translations are arranged.

Printed in the United States of America
9 8 7 6 5 4 3 2
ISBN 0-9614088-9-8
QA184.S78 2003 512'.5 93-14092

Other texts from Wellesley-Cambridge Press:
Wavelets and Filter Banks, Gilbert Strang and Truong Nguyen, ISBN 0-9614088-7-1.
Linear Algebra, Geodesy, and GPS, Gilbert Strang and Kai Borre, ISBN 0-9614088-6-3.
Introduction to Applied Mathematics, Gilbert Strang, ISBN 0-9614088-0-4.
An Analysis of the Finite Element Method, Gilbert Strang and George Fix, ISBN 0-9614088-8-X.
Calculus, Gilbert Strang, ISBN 0-9614088-2-0.

Wellesley-Cambridge Press, Box 812060, Wellesley MA 02482 USA
www.wellesleycambridge.com   gs@math.mit.edu   math.mit.edu/~gs   phone/fax (781) 431-8488

MATLAB® is a registered trademark of The MathWorks, Inc.
LaTeX text preparation by Cordula Robinson and Brett Coonley, Massachusetts Institute of Technology
LaTeX assembly and book design by Amy Hendrickson, TeXnology Inc., www.texnology.com

A Solutions Manual is available to instructors by email from the publisher. Course material including syllabus and Teaching Codes and exams and videotaped lectures for this course are available on the linear algebra web site: web.mit.edu/18.06/www. Linear Algebra is included in the OpenCourseWare site ocw.mit.edu with videos of the full course.

TABLE OF CONTENTS

1 Introduction to Vectors 1
1.1 Vectors and Linear Combinations 1
1.2 Lengths and Dot Products 10

2 Solving Linear Equations 21
2.1 Vectors and Linear Equations 21
2.2 The Idea of Elimination 35
2.3 Elimination Using Matrices 46
2.4 Rules for Matrix Operations 56
2.5 Inverse Matrices 71
2.6 Elimination = Factorization: A = LU 83
2.7 Transposes and Permutations 96

3 Vector Spaces and Subspaces 111
3.1 Spaces of Vectors 111
3.2 The Nullspace of A: Solving Ax = 0 122
3.3 The Rank and the Row Reduced Form 134
3.4 The Complete Solution to Ax = b 144
3.5 Independence, Basis and Dimension 157
3.6 Dimensions of the Four Subspaces 173

4 Orthogonality 184
4.1 Orthogonality of the Four Subspaces 184
4.2 Projections 194
4.3 Least Squares Approximations 206
4.4 Orthogonal Bases and Gram-Schmidt 219

5 Determinants 233
5.1 The Properties of Determinants 233
5.2 Permutations and Cofactors 245
5.3 Cramer's Rule, Inverses, and Volumes 259
6 Eigenvalues and Eigenvectors 274
6.1 Introduction to Eigenvalues 274
6.2 Diagonalizing a Matrix 288
6.3 Applications to Differential Equations 304
6.4 Symmetric Matrices 318
6.5 Positive Definite Matrices 330
6.6 Similar Matrices 343
6.7 Singular Value Decomposition (SVD) 352

7 Linear Transformations 363
7.1 The Idea of a Linear Transformation 363
7.2 The Matrix of a Linear Transformation 371
7.3 Change of Basis 384
7.4 Diagonalization and the Pseudoinverse 391

8 Applications 401
8.1 Matrices in Engineering 401
8.2 Graphs and Networks 412
8.3 Markov Matrices and Economic Models 423
8.4 Linear Programming 431
8.5 Fourier Series: Linear Algebra for Functions 437
8.6 Computer Graphics 444

9 Numerical Linear Algebra 450
9.1 Gaussian Elimination in Practice 450
9.2 Norms and Condition Numbers 459
9.3 Iterative Methods for Linear Algebra 466

10 Complex Vectors and Matrices 477
10.1 Complex Numbers 477
10.2 Hermitian and Unitary Matrices 486
10.3 The Fast Fourier Transform 495

Solutions to Selected Exercises 502
A Final Exam 542
Matrix Factorizations 544
Conceptual Questions for Review 546
Glossary: A Dictionary for Linear Algebra 551
Index 559
Teaching Codes 567

PREFACE

This preface expresses some personal thoughts. It is my chance to write about how linear algebra can be taught and learned. If we teach pure abstraction, or settle for cookbook formulas, we miss the best part. This course has come a long way, in living up to what it can be.

It may be helpful to mention the web pages connected to this book. So many messages come back with suggestions and encouragement, and I hope that professors and students will make free use of everything. You can directly access web.mit.edu/18.06/www, which is continually updated for the MIT course that is taught every semester. Linear Algebra is also on the OpenCourseWare site ocw.mit.edu, where 18.06 became exceptional by including videos (which you definitely don't have to watch). I can briefly indicate part of what is available now:

1. Lecture schedule and current homeworks and exams with solutions
2. The goals of the course and conceptual questions
3. Interactive Java demos for eigenvalues and least squares and more
4. A table of eigenvalue/eigenvector information (see page 362)
5. Glossary: A Dictionary for Linear Algebra
6. Linear Algebra Teaching Codes and MATLAB problems
7. Videos of the full course (taught in a real classroom).

These web pages are a resource for professors and students worldwide. My goal is to make this book as useful as possible, with all the course material I can provide.

After this preface, the book will speak for itself. You will see the spirit right away. The goal is to show the beauty of linear algebra, and its value. The emphasis is on understanding - I try to explain rather than to deduce. This is a book about real mathematics, not endless drill. I am constantly working with examples (create a matrix, find its nullspace, add another column, see what changes, ask for help!). The textbook has to help too, in teaching what students need. The effort is absolutely rewarding, and fortunately this subject is not too hard.

The New Edition

A major addition to the book is the large number of Worked Examples, section by section. Their purpose is to connect the text directly to the homework problems. The complete solution to a vector equation Ax = b is x = x_particular + x_nullspace, and the steps are explained as clearly as I can.
The Worked Example 3.4 A converts this explanation into action by taking every step in the solution (starting with the test for solvability). I hope these model examples will bring the content of each section into focus (see 5.1 A and 5.2 B on determinants). The "Pascal matrices" are a neat link from the amazing properties of Pascal's triangle to linear algebra.

The book contains new problems of all kinds - more basic practice, applications throughout science and engineering and management, and just fun with matrices. Northwest and southeast matrices wander into Problem 2.4.39. Google appears in Chapter 6. Please look at the last exercise in Section 1.1. I hope the problems are a strong point of this book - the newest one is about the six 3 by 3 permutation matrices: What are their determinants and pivots and traces and eigenvalues?

The Glossary is also new, in the book and on the web. I believe students will find it helpful. In addition to defining the important terms of linear algebra, there was also a chance to include many of the key facts for quick reference.

Fortunately, the need for linear algebra is widely recognized. This subject is absolutely as important as calculus. I don't concede anything, when I look at how mathematics is used. There is even a light-hearted essay called "Too Much Calculus" on the web page. The century of data has begun! So many applications are discrete rather than continuous, digital rather than analog. The truth is that vectors and matrices have become the language to know.

The Linear Algebra Course

The equation Ax = b uses that language right away. The matrix A times any vector x is a combination of the columns of A. The equation is asking for a combination that produces b. Our solution comes at three levels and they are all important:

1. Direct solution by forward elimination and back substitution.
2. Matrix solution x = A⁻¹b by inverting the matrix.
3. Vector space solution by looking at the column space and nullspace of A.

And there is another possibility: Ax = b may have no solution. Elimination may lead to 0 = 1. The matrix approach may fail to find A⁻¹. The vector space approach can look at all combinations Ax of the columns, but b might be outside that column space. Part of mathematics is understanding when Ax = b is solvable, and what to do when it is not (the least squares solution uses AᵀA in Chapter 4).

Another part is learning to visualize vectors. A vector v with two components is not hard. Its components v1 and v2 tell how far to go across and up - we draw an arrow. A second vector w may be perpendicular to v (and Chapter 1 tells when). If those vectors have six components, we can't draw them but our imagination keeps trying. In six-dimensional space, we can test quickly for a right angle. It is easy to visualize 2v (twice as far) and -w (opposite to w). We can almost see a combination like 2v - w.

Most important is the effort to imagine all the combinations cv + dw. They fill a "two-dimensional plane" inside the six-dimensional space. As I write these words, I am not at all sure that I can see this subspace. But linear algebra works easily with vectors and matrices of any size. If we have currents on six edges, or prices for six products, or just position and velocity of an airplane, we are dealing with six dimensions. For image processing or web searches (or the human genome), six might change to a million. It is still linear algebra, and linear combinations still hold the key.
Structure of the Textbook

Already in this preface, you can see the style of the book and its goal. The style is informal but the goal is absolutely serious. Linear algebra is great mathematics, and I certainly hope that each professor who teaches this course will learn something new. The author always does. The student will notice how the applications reinforce the ideas. I hope you will see how this book moves forward, gradually and steadily.

I want to note six points about the organization of the book:

1. Chapter 1 provides a brief introduction to vectors and dot products. If the class has met them before, the course can begin with Chapter 2. That chapter solves n by n systems Ax = b, and prepares for the whole course.

2. I now use the reduced row echelon form more than before. The MATLAB command rref(A) produces bases for the row space and column space. Better than that, reducing the combined matrix [A I] produces total information about all four of the fundamental subspaces.

3. Those four subspaces are an excellent way to learn about linear independence and bases and dimension. They go to the heart of the matrix, and they are genuinely the key to applications. I hate just making up vector spaces when so many important ones come naturally. If the class sees plenty of examples, independence is virtually understood in advance: A has independent columns when x = 0 is the only solution to Ax = 0.

4. Section 6.1 introduces eigenvalues for 2 by 2 matrices. Many courses want to see eigenvalues early. It is absolutely possible to go directly from Chapter 3 to Section 6.1. The determinant is easy for a 2 by 2 matrix, and eigshow on the web captures graphically the moment when Ax = λx.

5. Every section in Chapters 1 to 7 ends with a highlighted Review of the Key Ideas. The reader can recapture the main points by going carefully through this review.

6. Chapter 8 (Applications) has a new section on Matrices in Engineering.

When software is available (and time to use it), I see two possible approaches. One is to carry out instantly the steps of testing linear independence, orthogonalizing by Gram-Schmidt, and solving Ax = b and Ax = λx. The Teaching Codes follow the steps described in class - MATLAB and Maple and Mathematica compute a little differently. All can be used (optionally) with this book. The other approach is to experiment on bigger problems - like finding the largest determinant of a ±1 matrix, or the average size of a pivot. The time to compute A⁻¹b is measured by tic; inv(A)*b; toc. Choose A = rand(1000) and compare with tic; A\b; toc by direct elimination.

A one-semester course that moves steadily will reach eigenvalues. The key idea is to diagonalize A by its eigenvector matrix S. When that succeeds, the eigenvalues appear in S⁻¹AS. For symmetric matrices we can choose S⁻¹ = Sᵀ. When A is rectangular we need UᵀAV (U comes from eigenvectors of AAᵀ and V from AᵀA). Chapters 1 to 6 are the heart of a basic course in linear algebra - theory plus applications. The beauty of this subject is in the way those come together.

May I end with this thought for professors. You might feel that the direction is right, and wonder if your students are ready. Just give them a chance! Literally thousands of students have written to me, frequently with suggestions and surprisingly often with thanks. They know when the course has a purpose, because the professor and the book are on their side. Linear algebra is a fantastic subject, enjoy it.
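To make the timing experiment mentioned above concrete, here is one possible MATLAB sketch. The variable names and the use of a random right side are our choices for illustration, not fixed by the text:

    n = 1000;                     % the size suggested above; any large n shows the gap
    A = rand(n);                  % random n by n matrix, almost surely invertible
    b = rand(n,1);                % random right side
    tic; x1 = inv(A)*b; toc      % forms the inverse first, then multiplies: slower
    tic; x2 = A\b;      toc      % Gaussian elimination (backslash): faster
    norm(x1 - x2)                 % the two answers agree to roundoff

On any reasonable machine the backslash timing wins clearly - that contrast is the point of the experiment.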
Acknowledgements

This book owes a big debt to readers everywhere. Thousands of students and colleagues have been involved in every step. I have not forgotten the warm welcome for the first sentence written 30 years ago: "I believe that the teaching of linear algebra has become too abstract." A less formal approach is now widely accepted as the right choice for the basic course. And this course has steadily improved - the homework problems, the lectures, the Worked Examples, even the Web. I really hope you see that linear algebra is not some optional elective, it is needed. The first step in all subjects is linear!

I owe a particular debt to friends who offered suggestions and corrections and ideas. David Arnold in California and Mike Kerckhove in Virginia teach this course well. Per-Olof Persson created MATLAB codes for the experiments, as Cleve Moler and Steven Lee did earlier for the Teaching Codes. And the Pascal matrix examples, in the book and on the Web, owe a lot to Alan Edelman (and a little to Pascal). It is just a pleasure to work with friends.

My deepest thanks of all go to Cordula Robinson and Brett Coonley. They created the LaTeX pages that you see. Day after day, new words and examples have gone back and forth across the hall. After 2000 problems (and 3000 attempted solutions) this expression of my gratitude to them is almost the last sentence, of work they have beautifully done.

Amy Hendrickson of texnology.com produced the book itself, and you will recognize the quality of her ideas. My favorites are the clear boxes that highlight key points. The quilt on the front cover was created by Chris Curtis (it appears in Great American Quilts: Book 5, by Oxmoor House). Those houses show nine linear transformations of the plane. (At least they are linear in Figure 7.1, possibly superlinear in the quilt.) Tracy Baldwin has succeeded again to combine art and color and mathematics, in her fourth neat cover for Wellesley-Cambridge Press.

May I dedicate this book to grandchildren who are very precious: Roger, Sophie, Kathryn, Alexander, Scott, Jack, William, Caroline, and Elizabeth. I hope you might take linear algebra one day. Especially I hope you like it. The author is proud of you.

1 INTRODUCTION TO VECTORS

The heart of linear algebra is in two operations - both with vectors. We add vectors to get v + w. We multiply by numbers c and d to get cv and dw. Combining those two operations (adding cv to dw) gives the linear combination cv + dw.

Linear combinations are all-important in this subject! Sometimes we want one particular combination, a specific choice of c and d that produces a desired cv + dw. Other times we want to visualize all the combinations (coming from all c and d). The vectors cv lie along a line. The combinations cv + dw normally fill a two-dimensional plane. (I have to say "two-dimensional" because linear algebra allows higher-dimensional planes.) From four vectors u, v, w, z in four-dimensional space, their combinations are likely to fill the whole space.

Chapter 1 explains these central ideas, on which everything builds. We start with two-dimensional vectors and three-dimensional vectors, which are reasonable to draw. Then we move into higher dimensions. The really impressive feature of linear algebra is how smoothly it takes that step into n-dimensional space. Your mental picture stays completely correct, even if drawing a ten-dimensional vector is impossible.
This is where the book is going (into n-dimensional space), and the first steps are the operations in Sections 1.1 and 1.2:

1.1 Vector addition v + w and linear combinations cv + dw.
1.2 The dot product v · w and the length ‖v‖.

VECTORS AND LINEAR COMBINATIONS ■ 1.1

"You can't add apples and oranges." In a strange way, this is the reason for vectors! If we keep the number of apples separate from the number of oranges, we have a pair of numbers. That pair is a two-dimensional vector v, with "components" v1 and v2:

v1 = number of apples
v2 = number of oranges.

We write v as a column vector. The main point so far is to have a single letter v (in boldface italic) for this pair of numbers v1 and v2 (in lightface italic).

Even if we don't add v1 to v2, we do add vectors. The first components of v and w stay separate from the second components:

VECTOR ADDITION   v = [v1; v2] and w = [w1; w2] add to v + w = [v1 + w1; v2 + w2].

You see the reason. We want to add apples to apples. Subtraction of vectors follows the same idea: The components of v - w are v1 - w1 and v2 - w2.

The other basic operation is scalar multiplication. Vectors can be multiplied by 2 or by -1 or by any number c. There are two ways to double a vector. One way is to add v + v. The other way (the usual way) is to multiply each component by 2:

SCALAR MULTIPLICATION   2v = [2v1; 2v2] and cv = [cv1; cv2].

The components of cv are cv1 and cv2. The number c is called a "scalar". Notice that the sum of -v and v is the zero vector. This is 0, which is not the same as the number zero! The vector 0 has components 0 and 0. Forgive me for hammering away at the difference between a vector and its components. Linear algebra is built on these operations v + w and cv - adding vectors and multiplying by scalars.

The order of addition makes no difference: v + w equals w + v. Check that by algebra: The first component is v1 + w1 which equals w1 + v1. Check also by an example:

v + w = [4; 2] + [-1; 2] = [3; 4] = [-1; 2] + [4; 2] = w + v.

Linear Combinations

By combining these operations, we now form "linear combinations" of v and w. Multiply v by c and multiply w by d; then add cv + dw.

DEFINITION The sum of cv and dw is a linear combination of v and w.

Four special linear combinations are: sum, difference, zero, and a scalar multiple cv:

1v + 1w = sum of vectors in Figure 1.1
1v - 1w = difference of vectors in Figure 1.1
0v + 0w = zero vector
cv + 0w = vector cv in the direction of v

The zero vector is always a possible combination (when the coefficients are zero). Every time we see a "space" of vectors, that zero vector will be included. It is this big view, taking all the combinations of v and w, that makes the subject work.

The figures show how you can visualize vectors. For algebra, we just need the components (like 4 and 2). In the plane, that vector v is represented by an arrow. The arrow goes v1 = 4 units to the right and v2 = 2 units up. It ends at the point whose x, y coordinates are 4, 2. This point is another representation of the vector - so we have three ways to describe v: by an arrow or a point or a pair of numbers. Using arrows, you can see how to visualize the sum v + w:

Vector addition (head to tail)   At the end of v, place the start of w.

We travel along v and then along w. Or we take the shortcut along v + w. We could also go along w and then v. In other words, w + v gives the same answer as v + w. These are different ways along the parallelogram (in this example it is a rectangle).
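These two operations are easy to check numerically. A minimal sketch in the MATLAB notation the book uses later, with the specific vectors taken from Figure 1.1:

    v = [4; 2];  w = [-1; 2];    % the column vectors drawn in Figure 1.1
    v + w                         % vector addition: [3; 4]
    w + v                         % same answer: the order makes no difference
    2*v                           % scalar multiplication: [8; 4]
    c = 2;  d = -1;
    c*v + d*w                     % the linear combination cv + dw: [9; 2]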
The endpoint in Figure 1.1 is the diagonal v + w which is also w + v.

Figure 1.1 Vector addition v + w produces the diagonal of a parallelogram: w = (-1, 2), v = (4, 2), v + w = (3, 4). The linear combination on the right is v - w.

The zero vector has v1 = 0 and v2 = 0. It is too short to draw a decent arrow, but you know that v + 0 = v. For 2v we double the length of the arrow. We reverse its direction for -v. This reversing gives the subtraction on the right side of Figure 1.1.

Figure 1.2 The arrow usually starts at the origin (0, 0); cv is always parallel to v.

Vectors in Three Dimensions

A vector with two components corresponds to a point in the xy plane. The components of v are the coordinates of the point: x = v1 and y = v2. The arrow ends at this point (v1, v2), when it starts from (0, 0). Now we allow vectors to have three components (v1, v2, v3). The xy plane is replaced by three-dimensional space. Here is a typical vector (still a column vector but with three components): v = [1; 2; 2].

The vector v corresponds to an arrow in 3-space. Usually the arrow starts at the origin, where the xyz axes meet and the coordinates are (0, 0, 0). The arrow ends at the point with coordinates v1, v2, v3. There is a perfect match between the column vector and the arrow from the origin and the point where the arrow ends.

From now on v = [1; 2; 2] is also written v = (1, 2, 2). The reason for the row form (in parentheses) is to save space. But v = (1, 2, 2) is not a row vector! It is in actuality a column vector, just temporarily lying down. The row vector [1 2 2] is absolutely different, even though it has the same three components. It is the "transpose" of the column v.

Figure 1.3 Vectors [3; 2] and [1; -2; 2] correspond to points (x, y) and (x, y, z).

In three dimensions, v + w is still done a component at a time. The sum has components v1 + w1 and v2 + w2 and v3 + w3. You see how to add vectors in 4 or 5 or n dimensions. When w starts at the end of v, the third side is v + w. The other way around the parallelogram is w + v. Question: Do the four sides all lie in the same plane? Yes. And the sum v + w - v - w goes completely around to produce the zero vector. A typical linear combination of three vectors in three dimensions is u + 4v - 2w.

The Important Questions

For one vector u, the only linear combinations are the multiples cu. For two vectors, the combinations are cu + dv. For three vectors, the combinations are cu + dv + ew. Will you take the big step from one linear combination to all linear combinations? Every c and d and e are allowed. Suppose the vectors u, v, w are in three-dimensional space:

1 What is the picture of all combinations cu?
2 What is the picture of all combinations cu + dv?
3 What is the picture of all combinations cu + dv + ew?

The answers depend on the particular vectors u, v, and w. If they were all zero vectors (a very extreme case), then every combination would be zero. If they are typical nonzero vectors (components chosen at random), here are the three answers. This is the key to our subject:

1 The combinations cu fill a line.
2 The combinations cu + dv fill a plane.
3 The combinations cu + dv + ew fill three-dimensional space.

The line is infinitely long, in the direction of u (forward and backward, going through the zero vector).
It is the plane of all cu + dv (combining two lines) that I especially ask you to think about. Adding all cu on one line to all dv on the other line fills in the plane in Figure 1.4.

Figure 1.4 (a) The line through u. (b) The plane from all cu + dv, containing the lines through u and v.

When we include a third vector w, the multiples ew give a third line. Suppose that line is not in the plane of u and v. Then combining all ew with all cu + dv fills up the whole three-dimensional space.

This is the typical situation! Line, then plane, then space. But other possibilities exist. When w happens to be cu + dv, the third vector is in the plane of the first two. The combinations of u, v, w will not go outside that uv plane. We do not get the full three-dimensional space. Please think about the special cases in Problem 1.

■ REVIEW OF THE KEY IDEAS ■

1. A vector v in two-dimensional space has two components v1 and v2.
2. v + w = (v1 + w1, v2 + w2) and cv = (cv1, cv2) are executed a component at a time.
3. A linear combination of u and v and w is cu + dv + ew.
4. Take all linear combinations of u, or u and v, or u and v and w. In three dimensions, those combinations typically fill a line, a plane, and the whole space.

■ WORKED EXAMPLES ■

1.1 A Describe all the linear combinations of v = (1, 1, 0) and w = (0, 1, 1). Find a vector that is not a combination of v and w.

Solution These are vectors in three-dimensional space R³. Their combinations cv + dw fill a plane in R³. The vectors in that plane allow any c and d:

cv + dw = c(1, 1, 0) + d(0, 1, 1) = (c, c + d, d).

Four particular vectors in that plane are (0, 0, 0) and (2, 3, 1) and (5, 7, 2) and (√2, 0, -√2). The second component is always the sum of the first and third components. The vector (1, 1, 1) is not in the plane.

Another description of this plane through (0, 0, 0) is to know a vector perpendicular to the plane. In this case n = (1, -1, 1) is perpendicular, as Section 1.2 will confirm by testing dot products: v · n = 0 and w · n = 0.

1.1 B For v = (1, 0) and w = (0, 1), describe all the points cv and all the combinations cv + dw with any d and (1) whole numbers c (2) nonnegative c ≥ 0.

Solution (1) The vectors cv = (c, 0) with whole numbers c are equally spaced points along the x axis (the direction of v). They include (-2, 0), (-1, 0), (0, 0), (1, 0), (2, 0). Adding all vectors dw = (0, d) puts a full line in the y direction through those points. We have infinitely many parallel lines from cv + dw (whole number c, any number d). These are vertical lines in the xy plane, through equally spaced points on the x axis.

(2) The vectors cv with c ≥ 0 fill a "half-line". It is the positive x axis, starting at (0, 0) where c = 0. It includes (π, 0) but not (-π, 0). Adding all vectors dw puts a full line in the y direction crossing every point on that half-line. Now we have a half-plane. It is the right half of the xy plane, where x ≥ 0.

Problem Set 1.1

Problems 1-9 are about addition of vectors and linear combinations.

1 Describe geometrically (as a line, plane, . . .) all linear combinations of . . .

2 Draw the vectors v = . . . and w = . . . and v + w and v - w in a single xy plane.

3 If v + w = . . . and v - w = . . ., compute and draw v and w.

4 From v = . . . and w = . . ., find the components of 3v + w and v - 3w and cv + dw.
5 Compute u + v and u + v + w and 2u + 2v + w when u = . . ., v = . . ., w = . . .

6 Every combination of v = (1, -2, 1) and w = (0, 1, -1) has components that add to ___. Find c and d so that cv + dw = (4, 2, -6).

7 In the xy plane mark all nine of these linear combinations: cv + dw with c = 0, 1, 2 and d = 0, 1, 2.

8 The parallelogram in Figure 1.1 has diagonal v + w. What is its other diagonal? What is the sum of the two diagonals? Draw that vector sum.

9 If three corners of a parallelogram are (1, 1), (4, 2), and (1, 3), what are all the possible fourth corners? Draw two of them.

Figure 1.5 Unit cube from i, j, k; twelve clock vectors.

Problems 10-14 are about special vectors on cubes and clocks.

10 Copy the cube and draw the vector sum of i = (1, 0, 0) and j = (0, 1, 0) and k = (0, 0, 1). The addition i + j yields the diagonal of ___.

11 Four corners of the cube are (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1). What are the other four corners? Find the coordinates of the center point of the cube. The center points of the six faces are ___.

12 How many corners does a cube have in 4 dimensions? How many faces? How many edges? A typical corner is (0, 0, 1, 0).

13 (a) What is the sum V of the twelve vectors that go from the center of a clock to the hours 1:00, 2:00, ..., 12:00?
(b) If the vector to 4:00 is removed, find the sum of the eleven remaining vectors.
(c) What is the unit vector to 1:00?

14 Suppose the twelve vectors start from 6:00 at the bottom instead of (0, 0) at the center. The vector to 12:00 is doubled to 2j = (0, 2). Add the new twelve vectors.

Problems 15-19 go further with linear combinations of v and w (Figure 1.6).

15 The figure shows ½v + ½w. Mark the points ¾v + ¾w and ¼v + ¼w and v + w.

16 Mark the point -v + 2w and any other combination cv + dw with c + d = 1. Draw the line of all combinations that have c + d = 1.

17 Locate ⅓v + ⅓w and ⅔v + ⅔w. The combinations cv + cw fill out what line? Restricted by c ≥ 0, those combinations with c = d fill out what half-line?

18 Restricted by 0 ≤ c ≤ 1 and 0 ≤ d ≤ 1, shade in all combinations cv + dw.

19 Restricted only by c ≥ 0 and d ≥ 0 draw the "cone" of all combinations cv + dw.

Problems 20-27 deal with u, v, w in three-dimensional space (see Figure 1.6).

20 Locate ⅓u + ⅓v + ⅓w and ½u + ½w in the dashed triangle. Challenge problem: Under what restrictions on c, d, e, will the combinations cu + dv + ew fill in the dashed triangle?

21 The three sides of the dashed triangle are v - u and w - v and u - w. Their sum is ___. Draw the head-to-tail addition around a plane triangle of (3, 1) plus (-1, 1) plus (-2, -2).

22 Shade in the pyramid of combinations cu + dv + ew with c ≥ 0, d ≥ 0, e ≥ 0 and c + d + e ≤ 1. Mark the vector ½(u + v + w) as inside or outside this pyramid.

Figure 1.6 Problems 15-19 in a plane. Problems 20-27 in 3-dimensional space.

23 If you look at all combinations of those u, v, and w, is there any vector that can't be produced from cu + dv + ew?

24 Which vectors are combinations of u and v, and also combinations of v and w?

25 Draw vectors u, v, w so that their combinations cu + dv + ew fill only a line. Draw vectors u, v, w so that their combinations cu + dv + ew fill only a plane.

26 What combination of the vectors . . . and . . . produces . . .? Express this question as two equations for the coefficients c and d in the linear combination.

27 Review Question.
In xyz space, where is the plane of all linear combinations of i = (1, 0, 0) and j = (0, 1, 0)?

LENGTHS AND DOT PRODUCTS ■ 1.2

Now suppose v · w is not zero. It may be positive, it may be negative. The sign of v · w immediately tells whether we are below or above a right angle. The angle is less than 90° when v · w is positive. The angle is above 90° when v · w is negative. Figure 1.9 shows a typical vector v = (3, 1). The angle with w = (1, 3) is less than 90°. The borderline is where vectors are perpendicular to v. On that dividing line between plus and minus, where we find w = (1, -3), the dot product is zero.

The next page takes one more step, to find the exact angle θ. This is not necessary for linear algebra - you could stop here! Once we have matrices and linear equations, we won't come back to θ. But while we are on the subject of angles, this is the place for the formula.

Start with unit vectors u and U. The sign of u · U tells whether θ < 90° or θ > 90°. Because the vectors have length 1, we learn more than that. The dot product u · U is the cosine of θ. This is true in any number of dimensions.

1C If u and U are unit vectors then u · U = cos θ. Certainly |u · U| ≤ 1.

Remember that cos θ is never greater than 1. It is never less than -1. The dot product of unit vectors is between -1 and 1.

Figure 1.10 shows this clearly when the vectors are u = (cos θ, sin θ) and i = (1, 0). The dot product is u · i = cos θ. That is the cosine of the angle between them. After rotation through any angle α, these are still unit vectors. Call the vectors u = (cos β, sin β) and U = (cos α, sin α). Their dot product is cos α cos β + sin α sin β. From trigonometry this is the same as cos(β - α). Since β - α equals θ (no change in the angle between them) we have reached the formula u · U = cos θ. Problem 26 proves |u · U| ≤ 1 directly, without mentioning angles.

The inequality and the cosine formula u · U = cos θ are always true for unit vectors. What if v and w are not unit vectors? Divide by their lengths to get u = v/‖v‖ and U = w/‖w‖. Then the dot product of those unit vectors u and U gives cos θ.
= This is more famous if we write x = a2 and y b2• Then the ·•geometric mean" .Jxy ½ 23 If the components of a vector v are v , and v2, then cv has componcn1s cv1 and cvi. The other basic operation is vector addition. We add the first components and the second components separa1ely. The vector sum is ( 1, 11) as desired: Vector addition The graph in Figure 2.2 shows a parallelogram. The sum ( I. 11) is along the diagonal: ! ] !~; ] The sides are [ a,rd [ - ; ] . The diagonal sum is [ = [ 1: ] . = = We have multiplied the original columns by x 3 and y I. That combina1ion produces the vector b = (I. I I) on the right side of the linear equations. To repeat: The left side of the vector equation is a linear combinatio,i of the = = columns. The problem is to find the right coefficients ;a; 3 and y I. We arc combining scalar multiplication and vector addition into one step. That step is crucially important, because it contains both of the basic operations: Linear combination Of course the solution .\' = 3. y = I is the same as in the row picture. I don't 1 know which picture you prefer! I suspect that the two intersecting tines arc more fa- 1·1 miliar at first. You may like the row picture better, but only for one day. My own preference is to combine column vectors. It is a lot easier to sec a combination of four vectors in four-dimensional space, than to visualize how four hyperplanes might possibly meet at a point. (Even 011e hyperplane is hard enough. ..) The coefficient matrix on the left side of the equations is the 2 by 2 matrix A: Coefficient matrb.: A== [ 3I -22 ] • This is very typical of linear algebra. to look at a matrix by rows and by columns. Its rows give the row picture and its columns give the column picture. Same numbers. different pictures. same equations. We write those equations as a matrix problem Ax= b: Matrix equation [ 31 -22 ] [ x)' ] -- [ I I ] . The row picture deals with the two rows of A. The column picture combines the columns. = The numbers .r = 3 and y I go into the solution vector x. Then 24 Chapter 2 Solving Linear Equations Three Equations in Three Unknowns = The three unknowns are x, y , z. The linear equations Ax b are X + 2y + 3z - 6 2x + Sy + 2z 4 (3) 6x 3y + z 2 We look for numbers x . y. z that solve all three equations at once. Those desired num- bers might or might not exist. For this system, they do exist. When the number of unknowns matches the number of equations, there is usually one solution. Before solving the problem. we visualize it both ways: R The row pich1re shows three planes meeting at a single point. C The column picture combines tliree columns to produce the vecwr (6, 4. 2). In the row picture, each equation is a plane in three-dimensional space. The first plane = comes from the first equation x + 2y + 3:.:: 6. That plane crosses the x and y and z axes at the points (6, 0, 0) and (0, 3, 0) and (0, 0, 2). Those three points solve the equation and they detennine the whole plane. The vector (x , y, .::) = (0, 0 , 0) does not solve x + 2y + 3z = 6. Therefore the plane in Figure 2.3 does not comain the origin. 1·1 line L is on both planes line L meets third plane at solution m=m Figure 2.3 Row picture of three equations: Three planes meet at a point. 2.1 Vectors and Linear Equations 25 = The plane x + 2y + 3z 0 does pass through the origin, and it is parallel to = x + 2y + 3z 6. When the right side increases to 6, the plane moves away from the origin. = The second plane is given by the second equation 2x + 5y + 2z 4. 
Tr imersec:rs the first plane i11 a line L. The usual result of two equations in three unknowns is a line L of solutions. The third equation gives a third plane. It cuts the line L at a single point. That point lies on all three planes and it solves all three equations. It is harder to draw this triple intersection point than to imagine it. The three planes meet at the solution = (which we haven't found yet). The column form shows immediately why z 2! The column picture starts wiJ/1 tile vector form of tire equations: (4) The unknown numbers x, y, z are the coefficients in this linear combination. We want = to multiply the three column vectors by the correct numbers x, y, z to produce b (6, 4, 2). m= column I [ ;J = column 2 = = b 6~ ] -3 2 times column 3 [ = Figure 2.4 Column picture: (x, y , .;:) = (0, O. 2) because 2(3. 2, I) = (6. 4, 2) b. Figure 2.4 shows this column picture. Linear combinations of those columns can = produce any vector b! The combination tbat produces b (6, 4, 2) is just 2 times the = = = third column. The coefficienrs we need are x 0, y 0, and z 2. This is also the intersection point of the three planes in the row picture. It solves the system: 26 Chapter 2 Solving Linear Equations The Matrix Form of the Equations We have three rows in the row picture and three columns in the column picture (plus the right side). The three rows and three columns contain nine numbers. These nine rmmbers fill a 3 by 3 matrix. The "coefficient matrix" has the rows and columns that have so far been kept separate: I 2 3] Tire coefficient matrix is A = 2 5 2 . [ 6 -3 I The capital letter A stands for all nine coefficients (in this square array). The letter b denotes the column vector with components 6, 4, 2. The unknown x is also a column vector, with components x, y. z. (We use boldface because it is a vector, x because it is unknown.) By rows the equations were (3), by columns they were (4), [i j n[nu]- and now by matrices they are (5). The shor1hand is Ax = b: Mauixeq11a6on = (5) We multiply the matrix A times the unknowtt vector x to get the right side b. Basic ques1io11: What does it mean to "multiply A times x "? We can multiply by rows or by columns. Either way. Ax = b must be a correct representation of the three equations. You do the same nine multiplications either way. Multiplication by rows Ax comes from dot products, each row times the column x : ( row 1 ) • x ] Ax = ( row 2) • x . (6) [ ( row 3 ) • x Multiplicatio,r by co/um11s Ax is a combination of col1111m vectors: Ax = x (column 1) + y ((:o/z,mn 2) + z (column 3). (7) nu] ul = When we substitute the solution x (0. 0. 2). the multiplication Ax produces b: l 2 2 5 [ 6 -3 = 2ames column 3 = The first dot product in row multiplication is (I. 2. 3) • (0 , 0, 2) = 6. The other dot products are 4 and 2. Multiplication by columns is simply 2 times column 3. Tlzis book sees Ax as a combination of tlie columns of A. 2. 1 Vectors and Linear Equation!> 27 Example 1 Here arc 3 by 3 matrices A and I. with three ones and six zeros: If you are a row person, the product of every row (1, 0, 0) with (4, 5, 6) is 4. If you are a column person. the linear combination is 4 times the first column (1. I, 1). In that matrix A. the second and third columns are zero vectors. The example with / x deserves a careful look, because the matrix I is special. It has ones on the "main diagonal". Off that diagonal, all the entries are 1.eros. Whatever vector rllis mmrix mulliplies, thlll vector is 11ot cha11ged. 
This is like multiplication by I, but for matrices and vectors. The exceptional matrix in this example is the 3 by 3 identity matrix: ~ !g] I = [ always yields the mul1iplica1ion I x = x Matrix Notation The first row of a 2 by 2 matrix contains a11 and a 12, The second row contains a21 and a22, The first index gives the row number, so that a ij is an entry in row i . The second index j gives the column number. But those subscripts are not convenient on a keyboard! Instead of aij it is easier to type A(i, j). Tile = entry C157 A(5. 7) would be in row S, column 7. A= [ a,1 a12 ] - [ A(l, I) A(l, 2) ] a21 a22 - A(2, I) A(2, 2) • For an m by n matrix. the row index i goes from 1 to 111. The column index j stops at n. There are mn entries in the matrix. A square matrix (order 11) has 112 entries. Multiplication in MATLAB I want to express A and x and lhcir product Ax using MATLAB commands. This is a first step in learning that language. I begin by defining the matrix A and the vector x. This vector is a 3 by I matrix. with three rows and one column. Enter matrices a row at a time. and use a semicolon to signal the end of a row: A = [I 2 3: 2 5 2: 6 - 3 I] x = [0 ; 0 ; 2 ] 28 Chapter 2 Solving Linear Equations Here are three ways to multiply Ax in MATLAB. ln reality, A * X is the way to do it. MATLAB is a high level language, and it works with matrices: Matrix multiplication b = A * x We can also pick out the first row of A (as a smaller matrix!). The notation for that 1 by 3 submatrix is A(l, :). Here the colon symbol keeps all columns of row 1: = Row ar a time b {A(I. :) * X ; A(2, :) * X : A{3, :) *X ] Those are dot products, row times column, I by 3 matrix times 3 by I matrix. The other way to multiply uses the columns of A. The first column is the 3 by 1 submatrix A(:, I). Now the colon symbol : is keeping all rows of column I. This column multiplies x ( l) and the other columns multiply x (2) and x(3): * Column at a time b = A(:. 1) * x(l) +A(:. 2) x(2) +A(: , 3) *X(3} I think that matrices are stored by columns. Then multiplying a column at a time will * be a little faster. So A x is actually executed by columns. You can see the same choice in a FORTRAN-type structure, which operates on single entries of A and x. This lower level language needs an outer and inner "DO loop". When the outer loop uses the row number I. multiplication is a row at a time. = The inner loop J 1. 3 goes along each row /. When the outer loop uses J. multiplication is a column at a time. I will do that in MATLAB . which needs two more lines "end" "end" to close "for /'' and "for J": FORTRAN by rows DO IO / = l. 3 = DO 10 J I, 3 * 10 8(1) = B(I) + A(I. J) X(J) MATLAB by columns for J = I : 3 for I = l : 3 = b(l) b(I) + A(l. J ) * x(J) Notice that MATLAB is sensitive to upper case versus lower case (capital letters and small letters). If the matrix. is A then its entries are A(/. J) not a(l. J). I think you will prefer the higher level A * X. FORTRAN won't appear again in this book. Maple and Ma1l1ematica and graphing calculators also operate ar the higher level. Multiplication is A. x in Mathematic.a. It is multiply(A , x); or evalm(A&t:x); in Maple. Those languages allow symbolic entries a, b, x , .. . and not only real numbers. Like MATLAB's Symbolic Toolbox, they give the symbolic answer. ■ REVIEW OF THE KEY IDEAS ■ 1. The basic operations on vectors are multiplication ct.J and vector addition v + w. 2. Together those operations give linear combinations cv + dw. 2.1 Vectors and Linear Equations 29 3. 
■ REVIEW OF THE KEY IDEAS ■

1. The basic operations on vectors are multiplication cv and vector addition v + w.
2. Together those operations give linear combinations cv + dw.
3. Matrix-vector multiplication Ax can be executed by rows (dot products). But it should be understood as a combination of the columns of A!
4. Column picture: Ax = b asks for a combination of columns to produce b.
5. Row picture: Each equation in Ax = b gives a line (n = 2) or a plane (n = 3) or a "hyperplane" (n > 3). They intersect at the solution or solutions.

■ WORKED EXAMPLES ■

2.1 A Describe the column picture of these three equations. Solve by careful inspection of the columns (instead of elimination).

Solution The column picture asks for a linear combination that produces b from the three columns of A. In this example b is minus the second column. So the solution is x = 0, y = -1, z = 0. To show that (0, -1, 0) is the only solution we have to know that "A is invertible" and "the columns are independent" and "the determinant isn't zero". Those words are not yet defined but the test comes from elimination: We need (and we find!) a full set of three nonzero pivots.

If the right side changes to b = (4, 4, 8) = sum of the first two columns, then the right combination has x = 1, y = 1, z = 0. The solution becomes x = (1, 1, 0).

2.1 B This system has no solution, because the three planes in the row picture don't pass through a point. No combination of the three columns produces b:

x + 3y + 5z = 4
x + 2y - 3z = 5
2x + 5y + 2z = 8

(1) Multiply the equations by 1, 1, -1 and add to show that these planes don't meet at a point. Are any two of the planes parallel? What are the equations of planes parallel to x + 3y + 5z = 4?

(2) Take the dot product of each column (and also b) with y = (1, 1, -1). How do those dot products show that the system has no solution?

(3) Find three right side vectors b* and b** and b*** that do allow solutions.

Solution (1) Multiplying the equations by 1, 1, -1 and adding gives

     x + 3y + 5z = 4
     x + 2y - 3z = 5
  -[2x + 5y + 2z = 8]
    0x + 0y + 0z = 1    No Solution

The planes don't meet at any point, but no two planes are parallel. For a plane parallel to x + 3y + 5z = 4, just change the "4". The parallel plane x + 3y + 5z = 0 goes through the origin (0, 0, 0). And the equation multiplied by any nonzero constant still gives the same plane, as in 2x + 6y + 10z = 8.

(2) The dot product of each column with y = (1, 1, -1) is zero. On the right side, y · b = (1, 1, -1) · (4, 5, 8) = 1 is not zero. So a solution is impossible. (If a combination of columns could produce b, take dot products with y. Then a combination of zeros would produce 1.)

(3) There is a solution when b is a combination of the columns. These three examples b*, b**, b*** have solutions x* = (1, 0, 0) and x** = (1, 1, 1) and x*** = (0, 0, 0): b* = (1, 1, 2) = first column, b** = (9, 0, 9) = sum of the columns, b*** = (0, 0, 0).

Problem Set 2.1

Problems 1-9 are about the row and column pictures of Ax = b.

1 With A = I (the identity matrix) draw the planes in the row picture. Three sides of a box meet at the solution x = (x, y, z) = (2, 3, 4):

1x + 0y + 0z = 2
0x + 1y + 0z = 3
0x + 0y + 1z = 4

2 Draw the vectors in the column picture of Problem 1. Two times column 1 plus three times column 2 plus four times column 3 equals the right side b.

3 If the equations in Problem 1 are multiplied by 2, 3, 4 they become Ax = b:

2x + 0y + 0z = 4
0x + 3y + 0z = 9
0x + 0y + 4z = 16

Why is the row picture the same? Is the solution the same as x? What is changed in the column picture - the columns or the right combination to give b?
4 If equation 1 is added to equation 2, which of these are changed: the planes in the row picture, the column picture, the coefficient matrix, the solution? The new equations in Problem 1 would be x = 2, x + y = 5, z = 4.

5 Find a point with z = 2 on the intersection line of the planes x + y + 3z = 6 and x - y + z = 4. Find the point with z = 0 and a third point halfway between.

6 The first of these equations plus the second equals the third:

x + y + z = 2
x + 2y + z = 3
2x + 3y + 2z = 5.

The first two planes meet along a line. The third plane contains that line, because if x, y, z satisfy the first two equations then they also ___. The equations have infinitely many solutions (the whole line L). Find three solutions on L.

7 Move the third plane in Problem 6 to a parallel plane 2x + 3y + 2z = 9. Now the three equations have no solution - why not? The first two planes meet along the line L, but the third plane doesn't ___ that line.

8 In Problem 6 the columns are (1, 1, 2) and (1, 2, 3) and (1, 1, 2). This is a "singular case" because the third column is ___. Find two combinations of the columns that give b = (2, 3, 5). This is only possible for b = (4, 6, c) if c = ___.

9 Normally 4 "planes" in 4-dimensional space meet at a ___. Normally 4 column vectors in 4-dimensional space can combine to produce b. What combination of (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1) produces b = (3, 3, 3, 2)? What 4 equations for x, y, z, t are you solving?

Problems 10-15 are about multiplying matrices and vectors.

10 Compute each Ax by dot products of the rows with the column vector:

11 Compute each Ax in Problem 10 as a combination of the columns: How many separate multiplications for Ax, when the matrix is "3 by 3"?

12 Find the two components of Ax by rows or by columns:

13 Multiply A times x to find three components of Ax:

14 (a) A matrix with m rows and n columns multiplies a vector with ___ components to produce a vector with ___ components.
(b) The planes from the m equations Ax = b are in ___-dimensional space. The combination of the columns of A is in ___-dimensional space.

15 Write 2x + 3y + z + 5t = 8 as a matrix A (how many rows?) multiplying the column vector x = (x, y, z, t) to produce b. The solutions x fill a plane or "hyperplane" in 4-dimensional space. The plane is 3-dimensional with no 4D volume.

Problems 16-23 ask for matrices that act in special ways on vectors.

16 (a) What is the 2 by 2 identity matrix? I times [x; y] equals [x; y].
(b) What is the 2 by 2 exchange matrix? P times [x; y] equals [y; x].

17 (a) What 2 by 2 matrix R rotates every vector by 90°? R times [x; y] is [y; -x].
(b) What 2 by 2 matrix rotates every vector by 180°?

18 Find the matrix P that multiplies (x, y, z) to give (y, z, x). Find the matrix Q that multiplies (y, z, x) to bring back (x, y, z).

19 What 2 by 2 matrix E subtracts the first component from the second component? What 3 by 3 matrix does the same?

20 What 3 by 3 matrix E multiplies (x, y, z) to give (x, y, z + x)? What matrix E⁻¹ multiplies (x, y, z) to give (x, y, z - x)? If you multiply (3, 4, 5) by E and then multiply by E⁻¹, the two results are (___) and (___).

21 What 2 by 2 matrix P1 projects the vector (x, y) onto the x axis to produce (x, 0)? What matrix P2 projects onto the y axis to produce (0, y)? If you multiply (5, 7) by P1 and then multiply by P2, you get (___) and (___).
22 What 2 by 2 matrix R rotates every vector through 45°? The vector (1, 0) goes to (√2/2, √2/2). The vector (0, 1) goes to (-√2/2, √2/2). Those determine the matrix. Draw these particular vectors in the xy plane and find R.

23 Write the dot product of (1, 4, 5) and (x, y, z) as a matrix multiplication Ax. The matrix A has one row. The solutions to Ax = 0 lie on a ___ perpendicular to the vector ___. The columns of A are only in ___-dimensional space.

24 In MATLAB notation, write the commands that define this matrix A and the column vectors x and b. What command would test whether or not Ax = b?

25 The MATLAB commands A = eye(3) and v = [3 : 5]' produce the 3 by 3 identity matrix and the column vector (3, 4, 5). What are the outputs from A*v and v'*v? (Computer not needed!) If you ask for v*A, what happens?

26 If you multiply the 4 by 4 all-ones matrix A = ones(4,4) and the column v = ones(4,1), what is A*v? (Computer not needed.) If you multiply B = eye(4) + ones(4,4) times w = zeros(4,1) + 2*ones(4,1), what is B*w?

Questions 27-29 are a review of the row and column pictures.

27 Draw the two pictures in two planes for the equations x - 2y = 0, x + y = 6.

28 For two linear equations in three unknowns x, y, z, the row picture will show (2 or 3) (lines or planes) in (2 or 3)-dimensional space. The column picture is in (2 or 3)-dimensional space. The solutions normally lie on a ___.

29 For four linear equations in two unknowns x and y, the row picture shows four ___. The column picture is in ___-dimensional space. The equations have no solution unless the vector on the right side is a combination of ___.

30 Start with the vector u0 = (1, 0). Multiply again and again by the same "Markov matrix" A below. The next three vectors are u1, u2, u3:

u1 = [.8 .3; .2 .7] [1; 0] = [.8; .2]

What property do you notice for all four vectors u0, u1, u2, u3?

31 With a computer, continue from u0 = (1, 0) to u7, and from v0 = (0, 1) to v7. What do you notice about u7 and v7? Here are two MATLAB codes, one with while and one with for. They plot u0 to u7 - you can use other languages:

    u = [1 ; 0]; A = [.8 .3 ; .2 .7];
    x = u; k = [0 : 7];
    while size(x,2) <= 7
      u = A*u; x = [x u];
    end
    plot(k, x)

    u = [1 ; 0]; A = [.8 .3 ; .2 .7];
    x = u; k = [0 : 7];
    for j = 1 : 7
      u = A*u; x = [x u];
    end
    plot(k, x)

32 The u's and v's in Problem 31 are approaching a steady state vector s. Guess that vector and check that As = s. If you start with s, you stay with s.

33 This MATLAB code allows you to input x0 with a mouse click, by ginput. With t = 1, A rotates vectors by theta. The plot will show Ax0, A²x0, ... going around a circle (t > 1 will spiral out and t < 1 will spiral in). You can change theta and the stop at j = 10. We plan to put this code on web.mit.edu/18.06/www:

    theta = 15*pi/180; t = 1.0;
    A = t*[cos(theta) -sin(theta) ; sin(theta) cos(theta)];
    disp('Click to select starting point')
    [x1, x2] = ginput(1);
    x = [x1 ; x2];
    for j = 1 : 10
      x = [x A*x(:, end)];
    end
    plot(x(1,:), x(2,:), 'o')
    hold off

34 Invent a 3 by 3 magic matrix M3 with entries 1, 2, ..., 9. All rows and columns and diagonals add to 15. The first row could be 8, 3, 4. What is M3 times (1, 1, 1)? What is M4 times (1, 1, 1, 1) if this magic matrix has entries 1, ..., 16?

THE IDEA OF ELIMINATION ■ 2.2

This chapter explains a systematic way to solve linear equations.
The method is called "elimination", and you can see it immediately in our 2 by 2 example. Before elimination, x and y appear in both equations. After elimination, the first unknown x has disappeared from the second equation:

Before    x - 2y = 1       After    x - 2y = 1    (multiply by 3 and subtract)
         3x + 2y = 11               8y = 8        (x has been eliminated)

The last equation 8y = 8 instantly gives y = 1. Substituting for y in the first equation leaves x - 2 = 1. Therefore x = 3 and the solution (x, y) = (3, 1) is complete.

Elimination produces an upper triangular system - this is the goal. The nonzero coefficients 1, -2, 8 form a triangle. The last equation 8y = 8 reveals y = 1, and we go up the triangle to x. This quick process is called back substitution. It is used for upper triangular systems of any size, after forward elimination is complete.

Important point: The original equations have the same solution x = 3 and y = 1. Figure 2.5 repeats this original system as a pair of lines, intersecting at the solution point (3, 1). After elimination, the lines still meet at the same point! One line is horizontal because its equation 8y = 8 does not contain x.

How did we get from the first pair of lines to the second pair? We subtracted 3 times the first equation from the second equation. The step that eliminates x from equation 2 is the fundamental operation in this chapter. We use it so often that we look at it closely:

To eliminate x: Subtract a multiple of equation 1 from equation 2.

Three times x - 2y = 1 gives 3x - 6y = 3. When this is subtracted from 3x + 2y = 11, the right side becomes 8. The main point is that 3x cancels 3x. What remains on the left side is 2y - (-6y) or 8y, and x is eliminated.

Figure 2.5 Two lines meet at the solution. So does the new line 8y = 8.

Ask yourself how that multiplier ℓ = 3 was found. The first equation contains x. The first pivot is 1 (the coefficient of x). The second equation contains 3x, so the first equation was multiplied by 3. Then subtraction 3x - 3x produced the zero.

You will see the multiplier rule if we change the first equation to 4x - 8y = 4. (Same straight line but the first pivot becomes 4.) The correct multiplier is now ℓ = 3/4. To find the multiplier, divide the coefficient "3" to be eliminated by the pivot "4":

4x - 8y = 4      Multiply equation 1 by 3/4      4x - 8y = 4
3x + 2y = 11     Subtract from equation 2             8y = 8.

The final system is triangular and the last equation still gives y = 1. Back substitution produces 4x - 8 = 4 and 4x = 12 and x = 3. We changed the numbers but not the lines or the solution. Divide by the pivot to find that multiplier ℓ = 3/4:

Pivot = first nonzero in the row that does the elimination
Multiplier = (entry to eliminate) divided by (pivot) = 3/4.

The new second equation starts with the second pivot, which is 8. We would use it to eliminate y from the third equation if there were one. To solve n equations we want n pivots. The pivots are on the diagonal of the triangle after elimination.

You could have solved those equations for x and y without reading this book. It is an extremely humble problem, but we stay with it a little longer. Even for a 2 by 2 system, elimination might break down and we have to see how. By understanding the possible breakdown (when we can't find a full set of pivots), you will understand the whole process of elimination.
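As a preview of the matrix form coming later in this chapter, here is a minimal MATLAB sketch of the elimination just performed (the variable names, including ell for the multiplier, are our choices):

    A = [1 -2 ; 3 2];  b = [1 ; 11];   % x - 2y = 1 and 3x + 2y = 11
    ell = A(2,1) / A(1,1);             % multiplier: entry to eliminate / pivot = 3
    A(2,:) = A(2,:) - ell*A(1,:);      % row 2 becomes [0 8]: x is eliminated
    b(2) = b(2) - ell*b(1);            % right side becomes 8
    y = b(2) / A(2,2)                  % back substitution: y = 1
    x = (b(1) - A(1,2)*y) / A(1,1)     % then x = 3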
Breakdown of Elimination

Normally, elimination produces the pivots that take us to the solution. But failure is possible. At some point, the method might ask us to divide by zero. We can't do it. The process has to stop. There might be a way to adjust and continue, or failure may be unavoidable. Example 1 fails with no solution. Example 2 fails with too many solutions. Example 3 succeeds by exchanging the equations.

Example 1 Permanent failure with no solution. Elimination makes this clear:

x - 2y = 1        Subtract 3 times        x - 2y = 1
3x - 6y = 11      eqn. 1 from eqn. 2      0y = 8.

The last equation is 0y = 8. There is no solution. Normally we divide the right side 8 by the second pivot, but this system has no second pivot. (Zero is never allowed as a pivot!) The row and column pictures of this 2 by 2 system show that failure was unavoidable. If there is no solution, elimination must certainly have trouble.

The row picture in Figure 2.6 shows parallel lines, which never meet. A solution must lie on both lines. With no meeting point, the equations have no solution.

Figure 2.6 Row picture and column picture for Example 1: no solution. The lines x - 2y = 1 and 3x - 6y = 11 are parallel; the columns (1, 3) and (-2, -6) don't combine to give (1, 11).

The column picture shows the two columns (1, 3) and (-2, -6) in the same direction. All combinations of the columns lie along a line. But the column from the right side is in a different direction, (1, 11). No combination of the columns can produce this right side; therefore no solution.

When we change the right side to (1, 3), failure shows as a whole line of solutions. Instead of no solution there are infinitely many:

Example 2 Permanent failure with infinitely many solutions:

x - 2y = 1        Subtract 3 times        x - 2y = 1
3x - 6y = 3       eqn. 1 from eqn. 2      0y = 0.

Every y satisfies 0y = 0. There is really only one equation x - 2y = 1. The unknown y is "free". After y is freely chosen, x is determined as x = 1 + 2y.

In the row picture, the parallel lines have become the same line. Every point on that line satisfies both equations. We have a whole line of solutions. In the column picture, the right side (1, 3) is now the same as the first column. So we can choose x = 1 and y = 0. We can also choose x = 0 and y = -1/2; the second column times -1/2 equals the right side. There are infinitely many other solutions. Every (x, y) that solves the row problem also solves the column problem.

Elimination can go wrong in a third way, but this time it can be fixed. Suppose the first pivot position contains zero. We refuse to allow zero as a pivot. When the first equation has no term involving x, we can exchange it with an equation below:

Example 3 Temporary failure but a row exchange produces two pivots:

0x + 2y = 4       Exchange the        3x - 2y = 5
3x - 2y = 5       two equations       2y = 4.

Figure 2.7 Row and column pictures for Example 2: infinitely many solutions. The right side lies on the line of columns; both equations give the same line, and the solutions lie all along it.

The new system is already triangular. This small example is ready for back substitution. The last equation gives y = 2, and then the first equation gives x = 3. The row picture is normal (two intersecting lines). The column picture is also normal (column vectors not in the same direction). The pivots 3 and 2 are normal, but an exchange was required to put the rows in a good order.
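A code has to watch for these breakdowns. Here is a bare MATLAB sketch of the zero-pivot test and the row exchange of Example 3 (again our own code, with names of our own choosing):

    A = [0 2; 3 -2]; b = [4; 5];
    if A(1,1) == 0                     % zero in the first pivot position
        A = A([2 1],:); b = b([2 1]);  % exchange the two equations
    end
    ell = A(2,1)/A(1,1);               % here ell = 0: nothing to eliminate
    A(2,:) = A(2,:) - ell*A(1,:); b(2) = b(2) - ell*b(1);
    y = b(2)/A(2,2);
    x = (b(1) - A(1,2)*y)/A(1,1)       % back substitution gives y = 2, x = 3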
Examples 1 and 2 are singular: there is no second pivot. Example 3 is nonsingular: there is a full set of pivots and exactly one solution. Singular equations have no solution or infinitely many solutions. Pivots must be nonzero because we have to divide by them.

Three Equations in Three Unknowns

To understand Gaussian elimination, you have to go beyond 2 by 2 systems. Three by three is enough to see the pattern. For now the matrices are square, with an equal number of rows and columns. Here is a 3 by 3 system, specially constructed so that all steps lead to whole numbers and not fractions:

2x + 4y - 2z = 2
4x + 9y - 3z = 8        (1)
-2x - 3y + 7z = 10

What are the steps? The first pivot is the boldface 2 (upper left). Below that pivot we want to create zeros. The first multiplier is the ratio 4/2 = 2. Multiply the pivot equation by ℓ21 = 2 and subtract. Subtraction removes the 4x from the second equation:

Step 1 Subtract 2 times equation 1 from equation 2.

We also eliminate -2x from equation 3, still using the first pivot. The quick way is to add equation 1 to equation 3. Then 2x cancels -2x. We do exactly that, but the rule in this book is to subtract rather than add. The systematic pattern has multiplier ℓ31 = -2/2 = -1. Subtracting -1 times an equation is the same as adding:

Step 2 Subtract -1 times equation 1 from equation 3.

The two new equations involve only y and z. The second pivot (boldface) is 1:

1y + 1z = 4
1y + 5z = 12

We have reached a 2 by 2 system. The final step eliminates y to make it 1 by 1:

Step 3 Subtract equation 2new from 3new. The multiplier is 1. Then 4z = 8.

The original system Ax = b has been converted into a triangular system Ux = c:

2x + 4y - 2z = 2                     2x + 4y - 2z = 2
4x + 9y - 3z = 8      has become     1y + 1z = 4          (2)
-2x - 3y + 7z = 10                   4z = 8.

The goal is achieved: forward elimination is complete. Notice the pivots 2, 1, 4 along the diagonal. Those pivots 1 and 4 were hidden in the original system! Elimination brought them out. This triangle is ready for back substitution, which is quick:

(4z = 8 gives z = 2)    (y + z = 4 gives y = 2)    (equation 1 gives x = -1)

The solution is (x, y, z) = (-1, 2, 2).

The row picture has three planes from three equations. All the planes go through this solution. The original planes are sloping, but the last plane 4z = 8 after elimination is horizontal.

The column picture shows a combination of column vectors producing the right side b. The coefficients in that combination Ax are -1, 2, 2 (the solution):

(-1) [2; 4; -2] + 2 [4; 9; -3] + 2 [-2; -3; 7]  equals  [2; 8; 10].        (3)

The numbers x, y, z multiply columns 1, 2, 3 in the original system Ax = b and also in the triangular system Ux = c.

For a 4 by 4 problem, or an n by n problem, elimination proceeds the same way. Here is the whole idea of forward elimination, column by column:

Column 1. Use the first equation to create zeros below the first pivot.
Column 2. Use the new equation 2 to create zeros below the second pivot.
Columns 3 to n. Keep going to find the other pivots and the triangular U.

After column 2 we have  [x x x x; 0 x x x; 0 0 x x; 0 0 x x].  We want  [x x x x; 0 x x x; 0 0 x x; 0 0 0 x].        (4)

The result of forward elimination is an upper triangular system. It is nonsingular if there is a full set of n pivots (never zero!). Question: Which x could be changed to boldface x because the pivot is known?
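That column-by-column loop is short in MATLAB. A sketch of ours for the 3 by 3 system (1), with our own variable names; a full code would also test for zero pivots:

    A = [2 4 -2; 4 9 -3; -2 -3 7]; b = [2; 8; 10];
    for k = 1:2                           % columns 1 and 2
        for i = k+1:3                     % rows below the pivot
            ell = A(i,k)/A(k,k);          % multiplier for row i
            A(i,:) = A(i,:) - ell*A(k,:);
            b(i) = b(i) - ell*b(k);
        end
    end                                   % A is now U with pivots 2, 1, 4
    x = zeros(3,1);
    for i = 3:-1:1                        % back substitution, bottom to top
        x(i) = (b(i) - A(i,i+1:3)*x(i+1:3))/A(i,i);
    end
    x                                     % the solution (-1, 2, 2)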
Here is a final example to show the original Ax = b, the triangular system Ux = c, and the solution from back substitution:

x + y + z = 6           x + y + z = 6           x = 3
x + 2y + 2z = 9         y + z = 3               y = 2
x + 2y + 3z = 10        z = 1                   z = 1

All multipliers are 1. All pivots are 1. All planes meet at the solution (3, 2, 1). The columns combine with coefficients 3, 2, 1 to give b = (6, 9, 10) and c = (6, 3, 1).

■ REVIEW OF THE KEY IDEAS ■

1. A linear system becomes upper triangular after elimination.
2. The upper triangular system is solved by back substitution (starting at the bottom).
3. Elimination subtracts ℓij times equation j from equation i, to make the (i, j) entry zero.
4. The multiplier is ℓij = (entry to eliminate in row i) divided by (pivot in row j). Pivots can not be zero!
5. A zero in the pivot position can be repaired if there is a nonzero below it.
6. When breakdown is permanent, the system has no solution or infinitely many.

■ WORKED EXAMPLES ■

2.2 A When elimination is applied to this matrix A, what are the first and second pivots? What is the multiplier ℓ21 in the first step (ℓ21 times row 1 is subtracted from row 2)? What entry in the 2, 2 position (instead of 9) would force an exchange of rows 2 and 3? Why is the multiplier ℓ31 = 0, subtracting 0 times row 1 from row 3?

Solution The first pivot is 3. The multiplier ℓ21 is 2. When 2 times row 1 is subtracted from row 2, the second pivot is revealed as 7. If we reduce the entry "9" to "2", that drop of 7 in the (2, 2) position would force a row exchange. (The second row would start with 6, 2 which is an exact multiple of 3, 1 in the first row. Zero will appear in the second pivot position.) The multiplier ℓ31 is zero because a31 = 0. A zero at the start of a row needs no elimination.

2.2 B Use elimination to reach upper triangular matrices U. Solve by back substitution or explain why this is impossible. What are the pivots (never zero)? Exchange equations when necessary. The only difference is the -x in equation (3).

x + y + z = 7        x + y + z = 7
x + y - z = 5        x + y - z = 5
x - y + z = 3        -x - y + z = 3

Solution For the first system, subtract equation 1 from equations 2 and 3 (the multipliers are ℓ21 = 1 and ℓ31 = 1). The 2, 2 entry becomes zero, so exchange equations:

x + y + z = 7                          x + y + z = 7
0y - 2z = -2       exchanges into      -2y + 0z = -4
-2y + 0z = -4                          -2z = -2

Then back substitution gives z = 1 and y = 2 and x = 4. The pivots are 1, -2, -2.

For the second system, subtract equation 1 from equation 2 as before. Add equation 1 to equation 3. This leaves zero in the 2, 2 entry and below:

x + y + z = 7
0y - 2z = -2        There is no pivot in column 2.
0y + 2z = 10        A further elimination step gives 0z = 8.

The three planes don't meet! Plane 1 meets plane 2 in a line. Plane 1 meets plane 3 in a parallel line. No solution.

If we change the "3" in the original third equation to "-5" then elimination would leave 2z = 2 instead of 2z = 10. Now z = 1 would be consistent; we have moved the third plane. Substituting z = 1 in the first equation leaves x + y = 6. There are infinitely many solutions! The three planes now meet along a whole line.

Problem Set 2.2

Problems 1-10 are about elimination on 2 by 2 systems.

1 What multiple ℓ of equation 1 should be subtracted from equation 2?

2x + 3y = 1
10x + 9y = 11.

After this elimination step, write down the upper triangular system and circle the two pivots. The numbers 1 and 11 have no influence on those pivots.

2 Solve the triangular system of Problem 1 by back substitution, y before x. Verify that x times (2, 10) plus y times (3, 9) equals (1, 11). If the right side changes to (4, 44), what is the new solution?

3 What multiple of equation 1 should be subtracted from equation 2?

2x - 4y = 6
-x + 5y = 0.
After this elimination step, solve the triangular system. If the right side changes to (-6, 0), what is the new solution?

4 What multiple ℓ of equation 1 should be subtracted from equation 2?

ax + by = f
cx + dy = g.

The first pivot is a (assumed nonzero). Elimination produces what formula for the second pivot? What is y? The second pivot is missing when ad = bc.

5 Choose a right side which gives no solution and another right side which gives infinitely many solutions. What are two of those solutions?

3x + 2y = 10
6x + 4y = __.

6 Choose a coefficient b that makes this system singular. Then choose a right side g that makes it solvable. Find two solutions in that singular case.

2x + by = 16
4x + 8y = g.

7 For which numbers a does elimination break down (1) permanently (2) temporarily?

ax + 3y = -3
4x + 6y = 6.

Solve for x and y after fixing the second breakdown by a row exchange.

8 For which three numbers k does elimination break down? Which is fixed by a row exchange? In each case, is the number of solutions 0 or 1 or infinity?

kx + 3y = 6
3x + ky = -6.

9 What test on b1 and b2 decides whether these two equations allow a solution? How many solutions will they have? Draw the column picture.

3x - 2y = b1
6x - 4y = b2.

10 In the xy plane, draw the lines x + y = 5 and x + 2y = 6 and the equation y = __ that comes from elimination. The line 5x - 4y = c will go through the solution of these equations if c = __.

Problems 11-20 study elimination on 3 by 3 systems (and possible failure).

11 Reduce this system to upper triangular form by two row operations:

2x + 3y + z = 8
4x + 7y + 5z = 20
-2y + 2z = 0.

Circle the pivots. Solve by back substitution for z, y, x.

12 Apply elimination (circle the pivots) and back substitution to solve

2x - 3y = 3
4x - 5y + z = 7
2x - y - 3z = 5.

List the three row operations: Subtract __ times row __ from row __.

13 Which number d forces a row exchange, and what is the triangular system (not singular) for that d? Which d makes this system singular (no third pivot)?

2x + 5y + z = 0
4x + dy + z = 2
y - z = 3.

14 Which number b leads later to a row exchange? Which b leads to a missing pivot? In that singular case find a nonzero solution x, y, z.

x + by = 0
x - 2y - z = 0
y + z = 0.

15 (a) Construct a 3 by 3 system that needs two row exchanges to reach a triangular form and a solution.
(b) Construct a 3 by 3 system that needs a row exchange to keep going, but breaks down later.

16 If rows 1 and 2 are the same, how far can you get with elimination (allowing row exchange)? If columns 1 and 2 are the same, which pivot is missing?

2x - y + z = 0        2x + 2y + z = 0
2x - y + z = 0        4x + 4y + z = 0
4x + y + z = 2        6x + 6y + z = 2.

17 Construct a 3 by 3 example that has 9 different coefficients on the left side, but rows 2 and 3 become zero in elimination. How many solutions to your system with b = (1, 10, 100) and how many with b = (0, 0, 0)?

18 Which number q makes this system singular and which right side t gives it infinitely many solutions? Find the solution that has z = 1.

x + 4y - 2z = 1
x + 7y - 6z = 6
3y + qz = t.

19 (Recommended) It is impossible for a system of linear equations to have exactly two solutions. Explain why.
(a) If (x, y, z) and (X, Y, Z) are two solutions, what is another one?
(b) If 25 planes meet at two points, where else do they meet?

20 Three planes can fail to have an intersection point, when no two planes are parallel. The system is singular if row 3 of A is a __ of the first two rows.
Find a third equation that can't be solved if x + y + z = 0 and x - 2y - z = 1.

Problems 21-23 move up to 4 by 4 and n by n.

21 Find the pivots and the solution for these four equations:

2x + y = 0
x + 2y + z = 0
y + 2z + t = 0
z + 2t = 5.

22 This system has the same pivots and right side as Problem 21. How is the solution different (if it is)?

2x - y = 0
-x + 2y - z = 0
-y + 2z - t = 0
-z + 2t = 5.

23 If you extend Problems 21-22 following the 1, 2, 1 pattern or the -1, 2, -1 pattern, what is the fifth pivot? What is the nth pivot?

24 If elimination leads to these equations, find three possible original matrices A:

x + y + z = 0
y + z = 0
3z = 0.

25 For which two numbers a will elimination fail on A = [a 2; a a]?

26 For which three numbers a will elimination fail to give three pivots?

A = [a 2 3; a a 4; a a a].

27 Look for a matrix that has row sums 4 and 8, and column sums 2 and s:

a + b = 4    a + c = 2
c + d = 8    b + d = s.

The four equations are solvable only if s = __. Then find two different matrices that have the correct row and column sums. Extra credit: Write down the 4 by 4 system Ax = b with x = (a, b, c, d) and make A triangular by elimination.

28 Elimination in the usual order gives what pivot matrix and what solution to this "lower triangular" system? We are really solving by forward substitution:

3x = 3
6x + 2y = 8
9x - 2y + z = 9.

29 Create a MATLAB command A(2, :) = ... for the new row 2, to subtract 3 times row 1 from the existing row 2 if the matrix A is already known.

30 Find experimentally the average first and second and third pivot sizes (use the absolute value) in MATLAB's A = rand(3,3). The average of abs(A(1,1)) should be 0.5 but I don't know the others.

ELIMINATION USING MATRICES ■ 2.3

We now combine two ideas, elimination and matrices. The goal is to express all the steps of elimination (and the final result) in the clearest possible way. In a 3 by 3 example, elimination could be described in words. For larger systems, a long list of steps would be hopeless. You will see how to subtract a multiple of one row from another row, using matrices.

The matrix form of a linear system is Ax = b. Here are b, x, and A:

1 The vector of right sides is b.
2 The vector of unknowns is x. (The unknowns change to x1, x2, x3, ... because we run out of letters before we run out of numbers.)
3 The coefficient matrix is A. In this chapter A is square.

The example in the previous section has the beautifully short form Ax = b:

2x1 + 4x2 - 2x3 = 2
4x1 + 9x2 - 3x3 = 8      is the same as      [2 4 -2; 4 9 -3; -2 -3 7] [x1; x2; x3] = [2; 8; 10].        (1)
-2x1 - 3x2 + 7x3 = 10

The nine numbers on the left go into the matrix A. That matrix not only sits beside x, it multiplies x. The rule for "A times x" is exactly chosen to yield the three equations.

Review of A times x. A matrix times a vector gives a vector. The matrix is square when the number of equations (three) matches the number of unknowns (three). Our matrix is 3 by 3. A general square matrix is n by n. Then the vector x is in n-dimensional space. This example is in 3-dimensional space:

The unknown is x = [x1; x2; x3] and the solution is x = [-1; 2; 2].

Key point: Ax = b represents the row form and also the column form of the equations. We can multiply Ax by taking a column of A at a time:

Ax = x1 [2; 4; -2] + x2 [4; 9; -3] + x3 [-2; -3; 7].        (2)

This rule is used so often that we express it once more for emphasis.

2A The product Ax is a combination of the columns of A. Components of x multiply columns: Ax = x1 times (column 1) + ... + xn times (column n).
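Rule 2A is easy to see numerically. A one-line check of our own in MATLAB, using the matrix of equation (1) and its solution:

    A = [2 4 -2; 4 9 -3; -2 -3 7]; x = [-1; 2; 2];
    b = x(1)*A(:,1) + x(2)*A(:,2) + x(3)*A(:,3)   % combination of columns
    % b is (2, 8, 10), exactly A*x: components of x multiply columns of A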
One point to repeat about matrix notation: The entry in row 1, column 1 (the top left corner) is called a11. The entry in row 1, column 3 is a13. The entry in row 3, column 1 is a31. (Row number comes before column number.) The word "entry" for a matrix corresponds to the word "component" for a vector. General rule: The entry in row i, column j of the matrix A is aij.

Example 1 This matrix has aij = 2i + j. Then a11 = 3. Also a12 = 4 and a21 = 5. Here is Ax with numbers and letters:

[3 4; 5 6] [2; 1] = [3·2 + 4·1; 5·2 + 6·1]      and      [a11 a12; a21 a22] [x1; x2] = [a11x1 + a12x2; a21x1 + a22x2].

The first component of Ax is 6 + 4 = 10. That is the product of the row [3 4] with the column (2, 1). A row times a column gives a dot product!

The ith component of Ax involves row i, which is [ai1 ai2 ... ain]. The short formula for its dot product with x uses "sigma notation":

2B The ith component of Ax is ai1x1 + ai2x2 + ... + ainxn. This is the sum Σ aijxj from j = 1 to n.

The sigma symbol Σ is an instruction to add. Start with j = 1 and stop with j = n. Start the sum with ai1x1 and stop with ainxn.¹

¹ Einstein shortened this even more by omitting the Σ. The repeated j in aijxj automatically meant addition. Not being Einstein, we include the Σ.

The Matrix Form of One Elimination Step

Ax = b is a convenient form for the original equation. What about the elimination steps? The first step in this example subtracts 2 times the first equation from the second equation. On the right side, 2 times the first component of b is subtracted from the second component:

b = [2; 8; 10]  changes to  bnew = [2; 4; 10].

We want to do that subtraction with a matrix! The same result bnew = Eb is achieved when we multiply an "elimination matrix" E times b. It subtracts 2b1 from b2:

The elimination matrix is E = [1 0 0; -2 1 0; 0 0 1].

Multiplication by E subtracts 2 times row 1 from row 2. Rows 1 and 3 stay the same:

[1 0 0; -2 1 0; 0 0 1] [2; 8; 10] = [2; 4; 10]      and      [1 0 0; -2 1 0; 0 0 1] [b1; b2; b3] = [b1; b2 - 2b1; b3].

Notice how b1 = 2 and b3 = 10 stay the same. The first and third rows of E are the first and third rows of the identity matrix I. The new second component is the number 4 that appeared after the elimination step. This is b2 - 2b1.

It is easy to describe the "elementary matrices" or "elimination matrices" like E. Start with the identity matrix I. Change one of its zeros to the multiplier -ℓ:

2C The identity matrix has 1's on the diagonal and otherwise 0's. Then Ib = b. The elementary matrix or elimination matrix Eij that subtracts a multiple ℓ of row j from row i has the extra nonzero entry -ℓ in the i, j position.

Example 2

Identity I = [1 0 0; 0 1 0; 0 0 1]      Elimination E31 = [1 0 0; 0 1 0; -ℓ 0 1].

When you multiply I times b, you get b. But E31 subtracts ℓ times the first component from the third component. With ℓ = 4 we get 9 - 4 = 5:

b = [1; 3; 9]  gives  Eb = [1; 3; 5].

What about the left side of Ax = b? The multiplier ℓ = 4 was chosen to produce a zero, by subtracting 4 times the pivot. E31 creates a zero in the (3, 1) position.

The notation fits this purpose. Start with A. Apply E's to produce zeros below the pivots (the first E is E21). End with a triangular U. We now look in detail at those steps.

First a small point. The vector x stays the same. The solution is not changed by elimination. (That may be more than a small point.) It is the coefficient matrix that is changed! When we start with Ax = b and multiply by E, the result is EAx = Eb. The new matrix EA is the result of multiplying E times A.
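In MATLAB an elimination matrix is the identity with one entry changed. This little demonstration is ours, not one of the Teaching Codes:

    E = eye(3); E(2,1) = -2;        % the elimination matrix E = E21
    b = [2; 8; 10];
    E*b                             % gives (2, 4, 10): subtracts 2*b1 from b2
    A = [2 4 -2; 4 9 -3; -2 -3 7];
    E*A                             % row 2 of EA is 0 1 1: the 4x is gone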
Matrix Multiplication

The big question is: How do we multiply two matrices? When the first matrix is E (an elimination matrix), there is already an important clue. We know A, and we know what it becomes after the elimination step. To keep everything right, we hope and expect that EA is

[1 0 0; -2 1 0; 0 0 1] [2 4 -2; 4 9 -3; -2 -3 7] = [2 4 -2; 0 1 1; -2 -3 7]      (with the zero).

This step does not change rows 1 and 3 of A. Those rows are unchanged in EA; only row 2 is different. Twice the first row has been subtracted from the second row. Matrix multiplication agrees with elimination, and the new system of equations is EAx = Eb.

EAx is simple but it involves a subtle idea. Multiplying both sides of the original equation gives E(Ax) = Eb. With our proposed multiplication of matrices, this is also (EA)x = Eb. The first was E times Ax, the second is EA times x. They are the same! The parentheses are not needed. We just write EAx = Eb.

When multiplying ABC, you can do BC first or you can do AB first. This is the point of an "associative law" like 3 x (4 x 5) = (3 x 4) x 5. We multiply 3 times 20, or we multiply 12 times 5. Both answers are 60. That law seems so obvious that it is hard to imagine it could be false. But the "commutative law" 3 x 4 = 4 x 3 looks even more obvious. For matrices, EA is different from AE.

2D ASSOCIATIVE LAW: A(BC) = (AB)C.      NOT COMMUTATIVE LAW: Often AB ≠ BA.

There is another requirement on matrix multiplication. Suppose B has only one column (this column is b). The matrix-matrix law for EB should be consistent with the old matrix-vector law for Eb. Even more, we should be able to multiply matrices a column at a time:

If B has several columns b1, b2, b3, then EB has columns Eb1, Eb2, Eb3.

This holds true for the matrix multiplication above (where the matrix is A instead of B). If you multiply column 1 of A by E, you get column 1 of EA:

[1 0 0; -2 1 0; 0 0 1] [2; 4; -2] = [2; 0; -2]      and      E (column j of A) = column j of EA.

This requirement deals with columns, while elimination deals with rows. The next section describes each individual entry of the product. The beauty of matrix multiplication is that all three approaches (rows, columns, whole matrices) come out right.
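Both laws are easy to watch numerically; a small check of our own in MATLAB:

    A = [2 4 -2; 4 9 -3; -2 -3 7]; x = [-1; 2; 2];
    E = eye(3); E(2,1) = -2;
    E*(A*x) - (E*A)*x        % the zero vector: E(Ax) equals (EA)x
    isequal(E*A, A*E)        % 0 (false): EA and AE are different matrices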
The Matrix Pij for a Row Exchange

To subtract row j from row i we use Eij. To exchange or "permute" those rows we use another matrix Pij. Row exchanges are needed when zero is in the pivot position. Lower down that pivot column may be a nonzero. By exchanging the two rows, we have a pivot (never zero!) and elimination goes forward.

What matrix P23 exchanges row 2 with row 3? We can find it by exchanging rows of the identity matrix I:

Permutation matrix P23 = [1 0 0; 0 0 1; 0 1 0].

This is a row exchange matrix. Multiplying by P23 exchanges components 2 and 3 of any column vector. Therefore it also exchanges rows 2 and 3 of any matrix:

[1 0 0; 0 0 1; 0 1 0] [b1; b2; b3] = [b1; b3; b2].

On the right side of an elimination, P23 does what it was created for. With zero in the second pivot position and "6" below it, the exchange puts 6 into the pivot.

Matrices act. They don't just sit there. We will soon meet other permutation matrices, which can change the order of several rows. Rows 1, 2, 3 can be moved to 3, 1, 2. Our P23 is one particular permutation matrix; it exchanges rows 2 and 3.

2E Row Exchange Matrix: Pij is the identity matrix with rows i and j reversed. When Pij multiplies a matrix A, it exchanges rows i and j of A.

To exchange equations 1 and 3 multiply by P13 = [0 0 1; 0 1 0; 1 0 0].

Usually row exchanges are not required. The odds are good that elimination uses only the Eij. But the Pij are ready if needed, to move a pivot up to the diagonal.

The Augmented Matrix

This book eventually goes far beyond elimination. Matrices have all kinds of practical applications, in which they are multiplied. Our best starting point was a square E times a square A, because we met this in elimination, and we know what answer to expect for EA. The next step is to allow a rectangular matrix. It still comes from our original equations, but now it includes the right side b.

Key idea: Elimination does the same row operations to A and to b. We can include b as an extra column and follow it through elimination. The matrix A is enlarged or "augmented" by the extra column b:

Augmented matrix [A b] = [2 4 -2 2; 4 9 -3 8; -2 -3 7 10].

Elimination acts on whole rows of this matrix. The left side and right side are both multiplied by E, to subtract 2 times equation 1 from equation 2. With [A b] those steps happen together:

[1 0 0; -2 1 0; 0 0 1] [2 4 -2 2; 4 9 -3 8; -2 -3 7 10] = [2 4 -2 2; 0 1 1 4; -2 -3 7 10].

The new second row contains 0, 1, 1, 4. The new second equation is x2 + x3 = 4.
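In MATLAB those two operations become one multiplication (a demonstration of our own):

    A = [2 4 -2; 4 9 -3; -2 -3 7]; b = [2; 8; 10];
    E = eye(3); E(2,1) = -2;
    Ab = [A b];              % the augmented matrix [A b]
    E*Ab                     % new second row: 0 1 1 4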
Matrix multiplication works by rows and at the same time by columns:

R (by rows): Each row of E acts on [A b] to give a row of [EA Eb].
C (by columns): E acts on each column of [A b] to give a column of [EA Eb].

Notice again that word "acts." This is essential. Matrices do something! The matrix A acts on x to produce b. The matrix E operates on A to give EA. The whole process of elimination is a sequence of row operations, alias matrix multiplications. A goes to E21A which goes to E31E21A. Finally E32E31E21A is a triangular matrix.

The right side is included in the augmented matrix. The end result is a triangular system of equations. We stop for exercises on multiplication by E, before writing down the rules for all matrix multiplications (including block multiplication).

■ REVIEW OF THE KEY IDEAS ■

1. Ax = x1 times column 1 + ... + xn times column n. And (Ax)i = Σ aijxj.
2. Identity matrix = I, elimination matrix = Eij, exchange matrix = Pij.
3. Multiplying Ax = b by E21 subtracts a multiple ℓ21 of equation 1 from equation 2. The number -ℓ21 is the (2, 1) entry of the elimination matrix E21.
4. For the augmented matrix [A b], that elimination step gives [E21A E21b].
5. When A multiplies any matrix B, it multiplies each column of B separately.

■ WORKED EXAMPLES ■

2.3 A What 3 by 3 matrix E21 subtracts 4 times row 1 from row 2? What matrix P32 exchanges row 2 and row 3? If you multiply A on the right instead of the left, describe the results AE21 and AP32.

Solution By doing those operations on the identity matrix I, we find

E21 = [1 0 0; -4 1 0; 0 0 1]      and      P32 = [1 0 0; 0 0 1; 0 1 0].

Multiplying by E21 on the right side will subtract 4 times column 2 from column 1. Multiplying by P32 on the right will exchange columns 2 and 3.

2.3 B Write down the augmented matrix [A b] with an extra column:

x + 2y + 2z = 1
4x + 8y + 9z = 3
3y + 2z = 1

Apply E21 and then P32 to reach a triangular system. Solve by back substitution. What combined matrix P32E21 will do both steps at once?

Solution The augmented matrix and the result of using E21 are

[A b] = [1 2 2 1; 4 8 9 3; 0 3 2 1]      and      E21[A b] = [1 2 2 1; 0 0 1 -1; 0 3 2 1].

P32 exchanges equation 2 and 3. Back substitution produces (x, y, z):

P32E21[A b] = [1 2 2 1; 0 3 2 1; 0 0 1 -1]      and      (x, y, z) = (1, 1, -1).

For the matrix P32E21 that does both steps at once, apply P32 to E21:

P32E21 = exchange the rows of E21 = [1 0 0; 0 0 1; -4 1 0].

2.3 C Multiply these matrices in two ways: first, rows of A times columns of B to find each entry of AB, and second, columns of A times rows of B to produce two matrices that add to AB. How many separate ordinary multiplications are needed?

AB = [3 4; 1 5; 2 0] [2 4; 1 1]      (3 by 2)(2 by 2)

Solution Rows of A times columns of B are dot products of vectors:

(row 1)·(column 1) = [3 4] [2; 1] = 10 is the (1, 1) entry of AB
(row 2)·(column 1) = [1 5] [2; 1] = 7 is the (2, 1) entry of AB

The columns of AB are (10, 7, 4) and (16, 9, 8). We need 6 dot products, 2 multiplications each, 12 in all (3·2·2). The same AB comes from columns of A times rows of B:

AB = [3; 1; 2] [2 4] + [4; 5; 0] [1 1] = [6 12; 2 4; 4 8] + [4 4; 5 5; 0 0] = [10 16; 7 9; 4 8].

Problem Set 2.3

Problems 1-15 are about elimination matrices.

1 Write down the 3 by 3 matrices that produce these elimination steps:
(a) E21 subtracts 5 times row 1 from row 2.
(b) E32 subtracts -7 times row 2 from row 3.
(c) P exchanges rows 1 and 2, then rows 2 and 3.

2 In Problem 1, applying E21 and then E32 to the column b = (1, 0, 0) gives E32E21b = __. Applying E32 before E21 gives E21E32b = __. When E32 comes first, row __ feels no effect from row __.

3 Which three matrices E21, E31, E32 put A into triangular form U?

A = [1 1 0; 4 6 1; -2 2 0].

Multiply those E's to get one matrix M that does elimination: MA = U.

4 Include b = (1, 0, 0) as a fourth column in Problem 3 to produce [A b]. Carry out the elimination steps on this augmented matrix to solve Ax = b.

5 Suppose a33 = 7 and the third pivot is 5. If you change a33 to 11, the third pivot is __. If you change a33 to __, there is no third pivot.

6 If every column of A is a multiple of (1, 1, 1), then Ax is always a multiple of (1, 1, 1). Do a 3 by 3 example. How many pivots are produced by elimination?

7 Suppose E31 subtracts 7 times row 1 from row 3. To reverse that step you should __ 7 times row __ to row __. This "inverse matrix" is R31 = __.

8 Suppose E31 subtracts 7 times row 1 from row 3. What matrix R31 is changed into I? Then E31R31 = I where Problem 7 has R31E31 = I. Both are true!

9 (a) E21 subtracts row 1 from row 2 and then P23 exchanges rows 2 and 3. What matrix M = P23E21 does both steps at once?
(b) P23 exchanges rows 2 and 3 and then E31 subtracts row 1 from row 3. What matrix M = E31P23 does both steps at once?
Explain why the M's are the same but the E's are different.

10 (a) What 3 by 3 matrix E13 will add row 3 to row 1?
(b) What matrix adds row 1 to row 3 and at the same time row 3 to row 1?
(c) What matrix adds row 1 to row 3 and then adds row 3 to row 1?

11 Create a matrix that has a11 = a22 = a33 = 1 but elimination produces two negative pivots without row exchanges. (The first pivot is 1.)

12 Multiply these matrices:

[0 0 1; 0 1 0; 1 0 0] [1 2 3; 4 5 6; 7 8 9] [0 0 1; 0 1 0; 1 0 0].

13 Explain these facts. If the third column of B is all zero, the third column of EB is all zero (for any E). If the third row of B is all zero, the third row of EB might not be zero.

14 This 4 by 4 matrix will need elimination matrices E21 and E32 and E43. What are those matrices?
A = [2 -1 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 -1 2].

15 Write down the 3 by 3 matrix that has aij = 2i - 3j. This matrix has a32 = 0, but elimination still needs E32 to produce a zero in the 3, 2 position. Which previous step destroys the original zero and what is E32?

Problems 16-23 are about creating and multiplying matrices.

16 Write these ancient problems in a 2 by 2 matrix form Ax = b and solve them:
(a) X is twice as old as Y and their ages add to 33.
(b) (x, y) = (2, 5) and (3, 7) lie on the line y = mx + c. Find m and c.

17 The parabola y = a + bx + cx² goes through the points (x, y) = (1, 4) and (2, 8) and (3, 14). Find and solve a matrix equation for the unknowns (a, b, c).

18 Multiply these matrices in the orders EF and FE and E²:

E = [1 0 0; a 1 0; b 0 1]      F = [1 0 0; 0 1 0; 0 c 1].

Also compute E² = EE and F³ = FFF.

19 Multiply these row exchange matrices in the orders PQ and QP and P²:

P = [0 1 0; 1 0 0; 0 0 1]      and      Q = [1 0 0; 0 0 1; 0 1 0].

Find four matrices whose squares are M² = I.

20 (a) Suppose all columns of B are the same. Then all columns of EB are the same, because each one is E times __.
(b) Suppose all rows of B are [1 2 4]. Show by example that all rows of EB are not [1 2 4]. It is true that those rows are __.

21 If E adds row 1 to row 2 and F adds row 2 to row 1, does EF equal FE?

22 The entries of A and x are aij and xj. So the first component of Ax is Σ a1jxj = a11x1 + ... + a1nxn. If E21 subtracts row 1 from row 2, write a formula for
(a) the third component of Ax
(b) the (2, 1) entry of E21A
(c) the (2, 1) entry of E21(E21A)
(d) the first component of EAx.

23 The elimination matrix E = [1 0; -2 1] subtracts 2 times row 1 of A from row 2 of A. The result is EA. What is the effect of E(EA)? In the opposite order AE, we are subtracting 2 times __ of A from __. (Do examples.)

Problems 24-29 include the column b in the augmented matrix [A b].

24 Apply elimination to the 2 by 3 augmented matrix [A b]. What is the triangular system Ux = c? What is the solution x?

Ax = [2 3; 4 1] [x1; x2] = [1; 17].

25 Apply elimination to the 3 by 4 augmented matrix [A b]. How do you know this system has no solution? Change the last number 6 so there is a solution.

26 The equations Ax = b and Ax* = b* have the same matrix A. What double augmented matrix should you use in elimination to solve both equations at once? Solve both of these equations by working on a 2 by 4 matrix.

27 Choose the numbers a, b, c, d in this augmented matrix so that there is (a) no solution (b) infinitely many solutions.

[A b] = [1 2 3 a; 0 4 5 b; 0 0 d c].

Which of the numbers a, b, c, or d have no effect on the solvability?

28 If AB = I and BC = I use the associative law to prove A = C.

29 Choose two matrices M = [a b; c d] with det M = ad - bc = 1 and with a, b, c, d positive integers. Prove that every such matrix M has EITHER row 1 ≤ row 2 OR row 2 ≤ row 1. Subtraction makes [1 0; -1 1]M or [1 -1; 0 1]M nonnegative but smaller than M. If you continue and reach I, write your M's as products of the inverses [1 0; 1 1] and [1 1; 0 1].

30 Find the triangular matrix E that reduces "Pascal's matrix" to a smaller Pascal:

E [1 0 0 0; 1 1 0 0; 1 2 1 0; 1 3 3 1] = [1 0 0 0; 0 1 0 0; 0 1 1 0; 0 1 2 1].

Challenge question: Which M (from several E's) reduces Pascal all the way to I?

RULES FOR MATRIX OPERATIONS ■ 2.4

I will start with basic facts. A matrix is a rectangular array of numbers or "entries." When A has m rows and n columns, it is an "m by n" matrix.
Matrices can be added if their shapes are the same. They can be multiplied by any constant c. Here are examples of A + B and 2A, for 3 by 2 matrices:

[1 2; 3 4; 0 0] + [2 2; 4 4; 9 9] = [3 4; 7 8; 9 9]      and      2 [1 2; 3 4; 0 0] = [2 4; 6 8; 0 0].

Matrices are added exactly as vectors are, one entry at a time. We could even regard a column vector as a matrix with only one column (so n = 1). The matrix -A comes from multiplication by c = -1 (reversing all the signs). Adding A to -A leaves the zero matrix, with all entries zero. The 3 by 2 zero matrix is different from the 2 by 3 zero matrix. Even zero has a shape (several shapes) for matrices. All this is only common sense.

The entry in row i and column j is called aij or A(i, j). The n entries along the first row are a11, a12, ..., a1n. The lower left entry in the matrix is am1 and the lower right is amn. The row number i goes from 1 to m. The column number j goes from 1 to n.

Matrix addition is easy. The serious question is matrix multiplication. When can we multiply A times B, and what is the product AB? We cannot multiply when A and B are 3 by 2. They don't pass the following test:

To multiply AB: If A has n columns, B must have n rows.

If A has two columns, B must have two rows. When A is 3 by 2, the matrix B can be 2 by 1 (a vector) or 2 by 2 (square) or 2 by 20. Every column of B is ready to be multiplied by A. Then AB is 3 by 1 (a vector) or 3 by 2 or 3 by 20.

Suppose A is m by n and B is n by p. We can multiply. The product AB is m by p.

[m rows, n columns] [n rows, p columns] = [m rows, p columns].

A row times a column is an extreme case. Then 1 by n multiplies n by 1. The result is 1 by 1. That single number is the "dot product." In every case AB is filled with dot products. For the top corner, the (1, 1) entry of AB is (row 1 of A)·(column 1 of B). To multiply matrices, take all these dot products: (each row of A)·(each column of B).

2F The entry in row i and column j of AB is (row i of A)·(column j of B).

Figure 2.8 picks out the second row (i = 2) of a 4 by 5 matrix A. It picks out the third column (j = 3) of a 5 by 6 matrix B. Their dot product goes into row 2 and column 3 of AB. The matrix AB has as many rows as A (4 rows), and as many columns as B.

Figure 2.8 A is 4 by 5 and B is 5 by 6, so AB is 4 by 6. Here i = 2 and j = 3. Then (AB)23 is (row 2 of A)·(column 3 of B) = Σ a2k bk3.

Example 1 Square matrices can be multiplied if and only if they have the same size:

[1 1; 2 -1] [2 2; 3 4] = [5 6; 1 0].

The first dot product is 1·2 + 1·3 = 5. Three more dot products give 6, 1, and 0. Each dot product requires two multiplications, thus eight in all.

If A and B are n by n, so is AB. It contains n² dot products, row of A times column of B. Each dot product needs n multiplications, so the computation of AB uses n³ separate multiplications. For n = 100 we multiply a million times. For n = 2 we have n³ = 8.

Mathematicians thought until recently that AB absolutely needed 2³ = 8 multiplications. Then somebody found a way to do it with 7 (and extra additions). By breaking n by n matrices into 2 by 2 blocks, this idea also reduced the count for large matrices. Instead of n³ it went below n^2.8, and the exponent keeps falling.¹ The best at this moment is n^2.376. But the algorithm is so awkward that scientific computing is done the regular way: n² dot products in AB, and n multiplications for each one.

¹ Maybe the exponent won't stop falling before 2. No number in between looks special.
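The regular way is three nested loops, and the count of multiplications is visible in the innermost line. A bare sketch of our own, with sizes of our choosing:

    m = 4; n = 5; p = 6;
    A = rand(m,n); B = rand(n,p); C = zeros(m,p);
    for i = 1:m
        for j = 1:p
            for k = 1:n
                C(i,j) = C(i,j) + A(i,k)*B(k,j);  % one multiplication
            end                                   % m*n*p of them in all
        end
    end
    norm(C - A*B)    % zero up to roundoff: C agrees with MATLAB's A*B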
But the algorithm is so awkward that scientific computing is done the regular way~ n2 dot products in AB, and 11 multiplications for each one. Example 2 Suppose A is a row vector (I by 3) and B is a column vector (3 by 1). Then AB is I by l (only one entry. the dot product). On the other hand B times A (a column times a row) is a full 3 by 3 matrix. This multiplication is allowed! [!Ji [! !U Column ffmes row: I 2 3] = A row times a column is an "inner" product- that is another name for dot product. A column times a row is an "outer" product. These arc extreme cases of matrix mul- tiplication, with very thin matrices. T hey follow the rule for shapes in multiplication: (n by 1) times (1 by 11). The product of column times row is " by n . Example 3 will show llow to multiply AB using col11mns times rows. Rows and Columns of AB In the big picture, A multiplies each column of B . The result is a column of AB. In that column. we are combining the columns of A. Eacl, column of AB is a combi- 11ation of the colwm,s of A. That is lhc column picmre of matrix multiplication; Co/1111111 of AB is (matrix A ) times (column of B ). The row picture is reversed. Each row of A multiplies the whole matrix B. The result is a row of AB. It is a combination of the rows of B : [ row i of A ] 4] 52 63 ] = [ row i of AB ]. [7 8 9 We see row operations in elimination ( £ times A). We sec columns in A times x . The ..row-column picture" has the dot products of rows with columns. Believe it or not. there is also a "column-row picture." Not everybody knows that columns 1, . . . . n of A multipJy rows 1. .... 11 of B and add up to the same answer AB. The laws for Matrix Opera tions May I put on record six laws that matrices do obey, while emphasizing an equation they don't obey'? The matrices can be square or rectangular, and the laws invo)vjng A + B arc all simple and all obeyed. Here are three addition laws: A+ B = B +A (commutative law) c(A+ B) =cA +cB (distributive law) + A (B + C) = (A + B) + C (associative law). 2.4 Rules for Matrix Operations 59 = Three more laws hold for multiplication. but AB BA is not one of them: AB-:pBA C(A + B) = CA +CB (A + B)C = AC + BC = A(BC) (AB)C (the commutative "law" is usual/y broken) (distributive law from the left) (distributive law from the right) (associative law for ABC) (parentheses not needed). When A and B are not square, AB is a djfferent size from BA. These matrices can't be equal- even if both multiplications are allowed. For square matrices, almost any example shows that AB is different from BA: = It is true that AI I A. All square matrices commute with I and also with c I . Only these matrices cl commute with all other matrices. The law A(B + C) = AB + AC is proved a column at a time. Start with A(b + = c) Ab + Ac for the first column. That is the key to everything - linearity. Say no LJ..: ' more. 1, tt The law A(BC) = (AB)C means 11,a1 you can multiply BC first or AB first. +1 The direct proof is sort of awkward (Problem 16) but this law is extremely useful. We highlighted it above; it is the key to the way we multiply matrices. = = = Look at the special case when A = B C square matrix. Then (A times A2) (A2 rimes A). The product in either order is A3. The matrix powers AP follow the same rules as numbers: AP= AAA· .. A (p factors) Those are the ordinary laws for exponents. A3 times A4 is A7 (seven factors). A3 to the fourth power is A12 (twelve A's). 
Look at the special case when A = B = C = square matrix. Then (A times A²) = (A² times A). The product in either order is A³. The matrix powers A^p follow the same rules as numbers:

A^p = AAA···A (p factors)      (A^p)(A^q) = A^(p+q)      (A^p)^q = A^(pq).

Those are the ordinary laws for exponents. A³ times A⁴ is A⁷ (seven factors). A³ to the fourth power is A¹² (twelve A's). When p and q are zero or negative these rules still hold, provided A has a "-1 power", which is the inverse matrix A⁻¹. Then A⁰ = I is the identity matrix (no factors).

For a number, a⁻¹ is 1/a. For a matrix, the inverse is written A⁻¹. (It is never I/A, except this is allowed in MATLAB.) Every number has an inverse except a = 0. To decide when A has an inverse is a central problem in linear algebra. Section 2.5 will start on the answer. This section is a Bill of Rights for matrices, to say when A and B can be multiplied and how.

Block Matrices and Block Multiplication

We have to say one more thing about matrices. They can be cut into blocks (which are smaller matrices). This often happens naturally. Here is a 4 by 6 matrix broken into blocks of size 2 by 2, and each block is just I:

A = [1 0 1 0 1 0; 0 1 0 1 0 1; 1 0 1 0 1 0; 0 1 0 1 0 1] = [I I I; I I I].

If B is also 4 by 6 and its block sizes match the block sizes in A, you can add A + B a block at a time.

We have seen block matrices before. The right side vector b was placed next to A in the "augmented matrix." Then [A b] has two blocks of different sizes. Multiplying by an elimination matrix gave [EA Eb]. No problem to multiply blocks times blocks, when their shapes permit:

2G Block multiplication If the cuts between columns of A match the cuts between rows of B, then block multiplication of AB is allowed:

[A11 A12; A21 A22] [B11 B12; B21 B22] = [A11B11 + A12B21, A11B12 + A12B22; A21B11 + A22B21, A21B12 + A22B22].        (1)

This equation is the same as if the blocks were numbers (which are 1 by 1 blocks). We are careful to keep A's in front of B's, because BA can be different. The cuts between rows of A give cuts between rows of AB. Any column cuts in B are also column cuts in AB.

Main point When matrices split into blocks, it is often simpler to see how they act. The block matrix of I's above is much clearer than the original 4 by 6 matrix A.

Example 3 (Important special case) Let the blocks of A be its n columns. Let the blocks of B be its n rows. Then block multiplication AB adds up columns times rows:

AB = [a1 ··· an] [b1; ···; bn] = a1b1 + ··· + anbn.        (2)

This is another way to multiply matrices! Compare it with the usual rows times columns. Row 1 of A times column 1 of B gave the (1, 1) entry in AB. Now column 1 of A times row 1 of B gives a full matrix, not just a single number. Look at this example:

[1 4; 2 5] [3 2; 1 0] = [1; 2] [3 2] + [4; 5] [1 0] = [3 2; 6 4] + [4 0; 5 0].        (3)

We stop there so you can see columns multiplying rows. If a 2 by 1 matrix (a column) multiplies a 1 by 2 matrix (a row), the result is 2 by 2. That is what we found. Dot products are "inner products," these are "outer products."

When you add the two matrices at the end of equation (3), you get the correct answer AB. In the top left corner the answer is 3 + 4 = 7. This agrees with the row-column dot product of (1, 4) with (3, 1).

Summary The usual way, rows times columns, gives four dot products (8 multiplications). The new way, columns times rows, gives two full matrices (8 multiplications). The eight multiplications, and also the four additions, are all the same. You just execute them in a different order.
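Equation (2) is easy to test in MATLAB (a sketch of our own): add up the columns of A times the rows of B and compare with A*B:

    A = rand(3,2); B = rand(2,4);
    AB = zeros(3,4);
    for k = 1:2
        AB = AB + A(:,k)*B(k,:);   % column k of A times row k of B
    end
    norm(AB - A*B)                 % zero: the two orders give the same AB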
Example 4 (Elimination by blocks) Suppose the first column of A contains 1, 3, 4. To change 3 and 4 to 0 and 0, multiply the pivot row by 3 and 4 and subtract. Those row operations are really multiplications by elimination matrices E21 and E31:

E21 = [1 0 0; -3 1 0; 0 0 1]      and      E31 = [1 0 0; 0 1 0; -4 0 1].

The "block idea" is to do both eliminations with one matrix E. That matrix clears out the whole first column of A below the pivot a = 1:

E = [1 0 0; -3 1 0; -4 0 1]  multiplies  [1 x x; 3 x x; 4 x x]  to give  EA = [1 x x; 0 x x; 0 x x].

Block multiplication gives a formula for EA. The matrix A has four blocks a, b, c, D: the pivot, the rest of row 1, the rest of column 1, and the rest of the matrix. Watch how E multiplies A by blocks:

[1 0; -c/a I] [a b; c D] = [a b; 0 D - cb/a].        (4)

Elimination multiplies the first row [a b] by c/a. It subtracts from c to get zeros in the first column. It subtracts from D to get D - cb/a. This is ordinary elimination, a column at a time, written in blocks.

■ REVIEW OF THE KEY IDEAS ■

1. The (i, j) entry of AB is (row i of A)·(column j of B).
2. An m by n matrix times an n by p matrix uses mnp separate multiplications.
3. A times BC equals AB times C (surprisingly important).
4. AB is also the sum of these matrices: (column j of A) times (row j of B).
5. Block multiplication is allowed when the block shapes match correctly.

■ WORKED EXAMPLES ■

2.4 A Put yourself in the position of the author! I want to show you matrix multiplications that are special, but mostly I am stuck with small matrices. There is one terrific family of Pascal matrices, and they come in all sizes, and above all they have real meaning. I think 4 by 4 is a good size to show some of their amazing patterns. Here is the lower triangular Pascal matrix L. Its entries come from "Pascal's triangle". I will multiply L times the ones vector, and the powers vector:

L [1; 1; 1; 1] = [1 0 0 0; 1 1 0 0; 1 2 1 0; 1 3 3 1] [1; 1; 1; 1] = [1; 2; 4; 8]

L [1; x; x²; x³] = [1 0 0 0; 1 1 0 0; 1 2 1 0; 1 3 3 1] [1; x; x²; x³] = [1; 1 + x; (1 + x)²; (1 + x)³].

Each row of L leads to the next row: Add an entry to the one on its left to get the entry below. In symbols ℓij + ℓi,j-1 = ℓi+1,j. The numbers after 1, 3, 3, 1 would be 1, 4, 6, 4, 1. Pascal lived in the 1600's, long before matrices, but his triangle fits perfectly into L.

Multiplying by ones is the same as adding up each row, to get powers of 2. In fact powers = ones when x = 1. By writing out the last rows of L times powers, you see the entries of L as the "binomial coefficients" that are so essential to gamblers:

1 + 2x + 1x² = (1 + x)²
1 + 3x + 3x² + 1x³ = (1 + x)³

The number "3" counts the ways to get Heads once and Tails twice in three coin flips: HTT and THT and TTH. The other "3" counts the ways to get Heads twice: HHT and HTH and THH. Those are examples of "i choose j", the number of ways to get j heads in i coin flips. That number is exactly ℓij, if we start counting rows and columns of L at i = 0 and j = 0 (and remember 0! = 1):

ℓij = (i choose j) = i! / (j! (i - j)!)      (4 choose 2) = 4! / (2! 2!) = 6

There are six ways to choose two aces out of four aces. We will see Pascal's triangle and these matrices again. Here are the questions I want to ask now:

1. What is H = L²? This is the "hypercube matrix".
2. Multiply H times ones and powers.
3. The last row of H is 8, 12, 6, 1. A cube has 8 corners, 12 edges, 6 faces, 1 box. What would the next row of H tell about a hypercube in 4D?

Solution Multiply L times L to get the hypercube matrix H = L²:

H = L² = [1 0 0 0; 2 1 0 0; 4 4 1 0; 8 12 6 1].

Now multiply H times the vectors of ones and powers:

H [1; 1; 1; 1] = [1; 3; 9; 27]      and      H [1; x; x²; x³] = [1; 2 + x; (2 + x)²; (2 + x)³].

If x = 1 we get the powers of 3. If x = 0 we get powers of 2 (where do 1, 2, 4, 8 appear in H?). Where L changed x to 1 + x, applying L again changes 1 + x to 2 + x.

How do the rows of H count corners and edges and faces of a cube? A square in 2D has 4 corners, 4 edges, 1 face. Add one dimension at a time: Connect two squares to get a 3D cube.
Connect two cubes to get a 4D hypercube.

The cube has 8 corners and 12 edges: 4 edges in each square and 4 between the squares. The cube has 6 faces: 1 in each square and 4 faces between the squares. This row 8, 12, 6, 1 of H will lead to the next row (one more dimension) by 2hij + hi,j-1 = hi+1,j. Can you see this in four dimensions? The hypercube has 16 corners, no problem. It has 12 edges from one cube, 12 from the other cube, 8 that connect corners between those cubes: total 2 x 12 + 8 = 32 edges. It has 6 faces from each separate cube and 12 more from connecting pairs of edges: total 2 x 6 + 12 = 24 faces. It has one box from each cube and 6 more from connecting pairs of faces: total 2 x 1 + 6 = 8 boxes. And sure enough, the next row of H is 16, 32, 24, 8, 1.
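MATLAB builds these matrices directly; abs(pascal(4,1)) is this lower triangular L (the check below is our own; Problem 40 in the next problem set uses the same command):

    L = abs(pascal(4,1));   % lower triangular Pascal matrix L
    H = L^2;                % the hypercube matrix H = L*L
    L*ones(4,1)             % row sums 1, 2, 4, 8: powers of 2
    H*ones(4,1)             % row sums 1, 3, 9, 27: powers of 3
    H(4,:)                  % 8 12 6 1: corners, edges, faces, box of a cube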
2.4 B For these matrices, when does AB = BA? When does BC = CB? When does A times BC equal AB times C? Give the conditions on their entries p, q, r, z:

A = [p q; 0 r]      B = [1 0; 1 1]      C = [0 0; z 0]

If p, q, r, z are 4 by 4 blocks instead of numbers, do the answers change?

Solution First of all, A times BC always equals AB times C. We don't need parentheses in A(BC) = (AB)C = ABC. But we do need to keep the matrices in this order A, B, C. Compare AB with BA:

AB = [p + q, q; r, r]      BA = [p, q; p, q + r].

We only have AB = BA if q = 0 and p = r. Now compare BC with CB:

BC = [0 0; z 0] = CB.

B and C happen to commute. One explanation is that the diagonal part of B is I, which commutes with all 2 by 2 matrices. The off-diagonal part of B looks exactly like C (except for a scalar factor z) and every matrix commutes with itself.

When p, q, r, z are 4 by 4 blocks and the 1's change to the 4 by 4 identity matrix, all these products remain correct. So the answers are the same. (If the 1's in B were changed to blocks t, 1, 1, then BC would have the block z and CB would have the block zt. Those would normally be different; the order is important in block multiplication.)

2.4 C A directed graph starts with n nodes. There are n² possible edges: each edge leaves one of the n nodes and enters one of the n nodes (possibly itself). The n by n adjacency matrix has aij = 1 when an edge leaves node i and enters node j; if no edge then aij = 0. Here are two directed graphs and their adjacency matrices:

(Figure) The first graph has edges from node 1 to node 1, node 1 to node 2, and node 2 to node 1, with A = [1 1; 1 0].

The i, j entry of A² is ai1 a1j + ··· + ain anj. Why does that sum count the two-step paths from i to any node to j? The i, j entry of A^k counts k-step paths:

A² = [2 1; 1 1]  counts the paths  [1 to 1 to 1 and 1 to 2 to 1, 1 to 1 to 2; 2 to 1 to 1, 2 to 1 to 2]  with two edges.

List all of the 3-step paths between each pair of nodes and compare with A³. When A^k has no zeros, that number k is the diameter of the graph: the number of edges needed to connect the most distant pair of nodes. What is the diameter of the second graph?

Solution The number aik akj will be "1" if there is an edge from node i to k and an edge from k to j. This is a 2-step path. The number aik akj will be "0" if either of those edges (i to k, k to j) is missing. So the sum of aik akj is the number of 2-step paths leaving i and entering j. Matrix multiplication is just right for this count.

The 3-step paths are counted by A³; we look at paths to node 2:

A³ = [3 2; 2 1]  counts the paths  [..., 1 to 1 to 1 to 2 and 1 to 2 to 1 to 2; ..., 2 to 1 to 1 to 2]  with three steps.

These A^k contain the Fibonacci numbers 0, 1, 1, 2, 3, 5, 8, 13, ... coming in Section 6.2. Fibonacci's rule Fk+2 = Fk+1 + Fk (as in 13 = 8 + 5) shows up in (A)(A^k) = A^(k+1):

[1 1; 1 0] [Fk+1, Fk; Fk, Fk-1] = [Fk+2, Fk+1; Fk+1, Fk].

There are 13 six-step paths from node 1 to node 1, but I can't find them all.

A^k also counts words. A path like 1 to 1 to 2 to 1 corresponds to the number 1121 or the word aaba. The number 2 (the letter b) is not allowed to repeat because the graph has no edge from node 2 to node 2. The i, j entry of A^k counts the allowed numbers (or words) of length k + 1 that start with the ith letter and end with the jth. The second graph also has diameter 2; A² has no zeros.
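For the first graph above, a few lines of MATLAB count the paths (our own check):

    A = [1 1; 1 0];      % adjacency matrix: edges 1 to 1, 1 to 2, 2 to 1
    A^2                  % [2 1; 1 1]: numbers of 2-step paths
    A^3                  % [3 2; 2 1]: 3-step paths (Fibonacci numbers)
    A^6                  % its (1,1) entry 13 counts six-step paths 1 to 1
    % A^2 has no zeros, so this graph also has diameter 2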
Problem Set 2.4

Problems 1-17 are about the laws of matrix multiplication.

1 A is 3 by 5, B is 5 by 3, C is 5 by 1, and D is 3 by 1. All entries are 1. Which of these matrix operations are allowed, and what are the results?

BA      AB      ABD      DBA      A(B + C).

2 What rows or columns or matrices do you multiply to find
(a) the third column of AB?
(b) the first row of AB?
(c) the entry in row 3, column 4 of AB?
(d) the entry in row 1, column 1 of CDE?

3 Add AB to AC and compare with A(B + C).

4 In Problem 3, multiply A times BC. Then multiply AB times C.

5 Compute A² and A³. Make a prediction for A⁵ and A^n:

A = [1 b; 0 1]      and      A = [2 2; 2 2].

6 Show that (A + B)² is different from A² + 2AB + B², when

A = [1 2; 0 0]      and      B = [1 0; 3 0].

Write down the correct rule for (A + B)(A + B) = A² + __ + B².

7 True or false. Give a specific example when false:
(a) If columns 1 and 3 of B are the same, so are columns 1 and 3 of AB.
(b) If rows 1 and 3 of B are the same, so are rows 1 and 3 of AB.
(c) If rows 1 and 3 of A are the same, so are rows 1 and 3 of ABC.
(d) (AB)² = A²B².

8 How is each row of DA and EA related to the rows of A? How is each column of AD and AE related to the columns of A?

9 Row 1 of A is added to row 2. This gives EA below. Then column 1 of EA is added to column 2 to produce (EA)F:

EA = [1 0; 1 1] [a b; c d] = [a, b; a + c, b + d]

and

(EA)F = (EA) [1 1; 0 1] = [a, a + b; a + c, a + c + b + d].

(a) Do those steps in the opposite order. First add column 1 of A to column 2 by AF, then add row 1 of AF to row 2 by E(AF).
(b) Compare with (EA)F. What law is obeyed by matrix multiplication?

10 Row 1 of A is again added to row 2 to produce EA. Then F adds row 2 of EA to row 1. The result is F(EA):

F(EA) = [1 1; 0 1] [a, b; a + c, b + d] = [2a + c, 2b + d; a + c, b + d].

(a) Do those steps in the opposite order: first add row 2 to row 1 by FA, then add row 1 of FA to row 2.
(b) What law is or is not obeyed by matrix multiplication?

11 (3 by 3 matrices) Choose the only B so that for every matrix A
(a) BA = 4A
(b) BA = 4B
(c) BA has rows 1 and 3 of A reversed and row 2 unchanged
(d) All rows of BA are the same as row 1 of A.

12 Suppose AB = BA and AC = CA for these two particular matrices B and C:

A = [a b; c d]  commutes with  B = [1 0; 0 0]  and  C = [0 1; 0 0].

Prove that a = d and b = c = 0. Then A is a multiple of I. The only matrices that commute with B and C and all other 2 by 2 matrices are A = multiple of I.

13 Which of the following matrices are guaranteed to equal (A - B)²: A² - B², (B - A)², A² - 2AB + B², A(A - B) - B(A - B), A² - AB - BA + B²?

14 True or false:
(a) If A² is defined then A is necessarily square.
(b) If AB and BA are defined then A and B are square.
(c) If AB and BA are defined then AB and BA are square.
(d) If AB = B then A = I.

15 If A is m by n, how many separate multiplications are involved when
(a) A multiplies a vector x with n components?
(b) A multiplies an n by p matrix B?
(c) A multiplies itself to produce A²? Here m = n.

16 To prove that (AB)C = A(BC), use the column vectors b1, ..., bn of B. First suppose that C has only one column c with entries c1, ..., cn:

AB has columns Ab1, ..., Abn and Bc has one column c1b1 + ··· + cnbn.

Then (AB)c = c1Ab1 + ··· + cnAbn equals A(c1b1 + ··· + cnbn) = A(Bc).

Linearity gives equality of those two sums, and (AB)c = A(Bc). The same is true for all other __ of C. Therefore (AB)C = A(BC).

17 For A = __ and B = __, compute these answers and nothing more:
(a) column 2 of AB
(b) row 2 of AB
(c) row 2 of AA = A²
(d) row 2 of AAA = A³.

Problems 18-20 use aij for the entry in row i, column j of A.

18 Write down the 3 by 3 matrix A whose entries are
(a) aij = minimum of i and j
(b) aij = (-1)^(i+j)
(c) aij = i/j.

19 What words would you use to describe each of these classes of matrices? Give a 3 by 3 example in each class. Which matrix belongs to all four classes?
(a) aij = 0 if i ≠ j
(b) aij = 0 if i < j
(c) aij = aji
(d) aij = a1j.

20 The entries of A are aij. Assuming that zeros don't appear, what is
(a) the first pivot?
(b) the multiplier ℓ31 of row 1 to be subtracted from row 3?
(c) the new entry that replaces a32 after that subtraction?
(d) the second pivot?

Problems 21-25 involve powers of A.

21 Compute A², A³, A⁴ and also Av, A²v, A³v for

A = [0 1 0 0; 0 0 1 0; 0 0 0 1; 0 0 0 0]      and      v = [x; y; z; t].

22 Find all the powers A², A³, ... and AB, (AB)², ... for

A = [.5 .5; .5 .5]      and      B = [1 0; 0 -1].

23 By trial and error find real nonzero 2 by 2 matrices such that

A² = -I      BC = 0      DE = -ED (not allowing DE = 0).

24 (a) Find a nonzero matrix A for which A² = 0.
(b) Find a matrix that has A² ≠ 0 but A³ = 0.

25 By experiment with n = 2 and n = 3 predict A^n.

Problems 26-34 use column-row multiplication and block multiplication.

26 Multiply AB using columns times rows.

27 The product of upper triangular matrices is always upper triangular:

AB = [x x x; 0 x x; 0 0 x] [x x x; 0 x x; 0 0 x] = [x x x; 0 x x; 0 0 x].

Row times column is dot product: (Row 2 of A)·(column 1 of B) = 0. Which other dot products give zeros?

Column times row is full matrix: Draw x's and 0's in (column 2 of A) times (row 2 of B) and in (column 3 of A) times (row 3 of B).

28 Draw the cuts in A (2 by 3) and B (3 by 4) and AB to show how each of the four multiplication rules is really a block multiplication:
(1) Matrix A times columns of B.
(2) Rows of A times matrix B.
(3) Rows of A times columns of B.
(4) Columns of A times rows of B.

29 Draw cuts in A and x to multiply Ax a column at a time: x1 (column 1) + ···.

30 Which matrices E21 and E31 produce zeros in the (2, 1) and (3, 1) positions of E21A and E31A? Find the single matrix E = E31E21 that produces both zeros at once. Multiply EA.

31 Block multiplication says in the text that column 1 is eliminated by

EA = [1 0; -c/a I] [a b; c D] = [a b; 0 D - cb/a].

In Problem 30, what are c and D and what is D - cb/a?

32 With i² = -1, the product of (A + iB) and (x + iy) is Ax + iBx + iAy - By. Use blocks to separate the real part without i from the imaginary part that multiplies i:

[A -B; ? ?] [x; y] = [Ax - By; ?]      real part
                                       imaginary part

33 Suppose you solve Ax = b for three special right sides b:

Ax1 = [1; 0; 0]      Ax2 = [0; 1; 0]      Ax3 = [0; 0; 1].

If the three solutions x1, x2, x3 are the columns of a matrix X, what is A times X?

34 If the three solutions in Question 33 are x1 = (1, 1, 1) and x2 = (0, 1, 1) and x3 = (0, 0, 1), solve Ax = b when b = (3, 5, 8). Challenge problem: What is A?
35 Elimination for a 2 by 2 block matrix [A B; C D]: When you multiply the first block row by CA⁻¹ and subtract from the second row, what is the "Schur complement" S that appears?

36 Find all matrices A = [a b; c d] that satisfy A[1 1; 1 1] = [1 1; 1 1]A.

37 Suppose a "circle graph" has 5 nodes connected (in both directions) by edges around a circle. What is its adjacency matrix from Worked Example 2.4 C? What are A² and A³ and the diameter of this graph?

38 If the 5 edges in Question 37 go in one direction only, from nodes 1, 2, 3, 4, 5 to 2, 3, 4, 5, 1, what are A and A² and the diameter of this one-way circle?

39 If you multiply a northwest matrix A and a southeast matrix B, what type of matrices are AB and BA? "Northwest" and "southeast" mean zeros below and above the antidiagonal.

Problem Set 2.5

1 Find the inverses (directly or from the 2 by 2 formula) of A, B, C:
A = ___ and B = ___ and C = ___.

2 For these "permutation matrices" find P⁻¹ by trial and error (with 1's and 0's):
P = [0 0 1; 0 1 0; 1 0 0] and P = [0 1 0; 0 0 1; 1 0 0].

3 Solve for the columns of A⁻¹ = [x t; y z] when A = ___.

4 Show that A = ___ has no inverse by trying to solve A[x; y] = [1; 0] for the column (x, y).

5 Find an upper triangular U (not diagonal) with U² = I and U = U⁻¹.

6 (a) If A is invertible and AB = AC, prove quickly that B = C.
(b) If A = ___, find two matrices B ≠ C such that AB = AC.

7 (Important) If A has row 1 + row 2 = row 3, show that A is not invertible:
(a) Explain why Ax = (1, 0, 0) cannot have a solution.
(b) Which right sides (b₁, b₂, b₃) might allow a solution to Ax = b?
(c) What happens to row 3 in elimination?

8 If A has column 1 + column 2 = column 3, show that A is not invertible:
(a) Find a nonzero solution x to Ax = 0. The matrix is 3 by 3.
(b) Elimination keeps column 1 + column 2 = column 3. Explain why there is no third pivot.

9 Suppose A is invertible and you exchange its first two rows to reach B. Is the new matrix B invertible and how would you find B⁻¹ from A⁻¹?

10 Find the inverses (in any legal way) of
A = [0 0 0 2; 0 0 3 0; 0 4 0 0; 5 0 0 0] and B = [3 2 0 0; 4 3 0 0; 0 0 6 5; 0 0 7 6].

11 (a) Find invertible matrices A and B such that A + B is not invertible.
(b) Find singular matrices A and B such that A + B is invertible.

12 If the product C = AB is invertible (A and B are square), then A itself is invertible. Find a formula for A⁻¹ that involves C⁻¹ and B.

13 If the product M = ABC of three square matrices is invertible, then B is invertible. (So are A and C.) Find a formula for B⁻¹ that involves M⁻¹ and A and C.

14 If you add row 1 of A to row 2 to get B, how do you find B⁻¹ from A⁻¹?
Notice the order. The inverse of B = [1 0; 1 1]A is ___.

15 Prove that a matrix with a column of zeros cannot have an inverse.

16 Multiply [a b; c d] times [d −b; −c a]. What is the inverse of each matrix if ad − bc ≠ 0?

17 (a) What matrix E has the same effect as these three steps? Subtract row 1 from row 2, subtract row 1 from row 3, then subtract row 2 from row 3.
(b) What single matrix L has the same effect as these three reverse steps? Add row 2 to row 3, add row 1 to row 3, then add row 1 to row 2.

18 If B is the inverse of A², show that AB is the inverse of A.

19 Find the numbers a and b that give the inverse of 5*eye(4) − ones(4,4):
[4 −1 −1 −1; −1 4 −1 −1; −1 −1 4 −1; −1 −1 −1 4]⁻¹ = [a b b b; b a b b; b b a b; b b b a].
What are a and b in the inverse of 6*eye(5) − ones(5,5)?
20 Show that A = 4*eye(4) − ones(4,4) is not invertible: Multiply A*ones(4,1).

21 There are sixteen 2 by 2 matrices whose entries are 1's and 0's. How many of them are invertible?

Questions 22-28 are about the Gauss-Jordan method for calculating A⁻¹.

22 Change I into A⁻¹ as you reduce A to I (by row operations): [A I] = ___.

23 Follow the 3 by 3 text example but with plus signs in A. Eliminate above and below the pivots to reduce [A I] to [I A⁻¹]:
[A I] = [2 1 0 1 0 0; 1 2 1 0 1 0; 0 1 2 0 0 1].

24 Use Gauss-Jordan elimination on [A I] to solve AA⁻¹ = I: [A I] = ___.

25 Find A⁻¹ and B⁻¹ (if they exist) by elimination on [A I] and [B I]:
A = [2 −1 −1; −1 2 −1; −1 −1 2] and B = ___.

26 What three matrices E₂₁ and E₁₂ and D⁻¹ reduce A = ___ to the identity matrix? Multiply D⁻¹E₁₂E₂₁ to find A⁻¹.

27 Invert these matrices A by the Gauss-Jordan method starting with [A I]:
A = [1 0 0; 2 1 3; 0 0 1] and A = [1 1 1; 1 2 2; 1 2 3].

28 Exchange rows and continue with Gauss-Jordan to find A⁻¹:
[A I] = [0 2 1 0; 2 2 0 1].

29 True or false (with a counterexample if false and a reason if true):
(a) A 4 by 4 matrix with a row of zeros is not invertible.
(b) A matrix with 1's down the main diagonal is invertible.
(c) If A is invertible then A⁻¹ is invertible.
(d) If A is invertible then A² is invertible.

30 For which three numbers c is this matrix not invertible, and why not?
A = [2 c c; c c c; 8 7 c].

31 Prove that A is invertible if a ≠ 0 and a ≠ b (find the pivots or A⁻¹):
A = [a b b; a a b; a a a].

32 This matrix has a remarkable inverse. Find A⁻¹ by elimination on [A I]. Extend to a 5 by 5 "alternating matrix" and guess its inverse; then multiply to confirm.
A = [1 −1 1 −1; 0 1 −1 1; 0 0 1 −1; 0 0 0 1].

33 Use the 4 by 4 inverse in Question 32 to solve Ax = (1, 1, 1, 1).

34 Suppose P and Q have the same rows as I but in any order. Show that P − Q is singular by solving (P − Q)x = 0.

35 Find and check the inverses (assuming they exist) of these block matrices:
[I 0; C I], [A 0; C D], [0 I; I D].

36 If an invertible matrix A commutes with C (this means AC = CA) show that A⁻¹ commutes with C. If also B commutes with C, show that AB commutes with C. Translation: If AC = CA and BC = CB then (AB)C = C(AB).

37 Could a 4 by 4 matrix A be invertible if every row contains the numbers 0, 1, 2, 3 in some order? What if every row of B contains 0, 1, 2, −3 in some order?

38 In the worked example 2.5 B, the triangular Pascal matrix A has an inverse with "alternating diagonals". Check that this A⁻¹ is DAD, where the diagonal matrix D has alternating entries 1, −1, 1, −1. Then ADAD = I, so what is the inverse of AD = pascal(4, 1)?

39 The Hilbert matrices have Hᵢⱼ = 1/(i + j − 1). Ask MATLAB for the exact 6 by 6 inverse invhilb(6). Then ask for inv(hilb(6)). How can these be different, when the computer never makes mistakes?

40 Use inv(S) to invert MATLAB's 4 by 4 symmetric matrix S = pascal(4). Create Pascal's lower triangular A = abs(pascal(4, 1)) and test inv(S) = inv(A') * inv(A).

41 If A = ones(4,4) and b = rand(4,1), how does MATLAB tell you that Ax = b has no solution? If b = ones(4,1), which solution to Ax = b is found by A\b?

42 If AC = I and AC* = I (all square matrices) use 2I to prove that C = C*.

43 Direct multiplication gives MM⁻¹ = I, and I would recommend doing #3.
M⁻¹ shows the change in A⁻¹ (useful to know) when a matrix is subtracted from A:

1  M = I − uv  and  M⁻¹ = I + uv/(1 − vu)
2  M = A − uv  and  M⁻¹ = A⁻¹ + A⁻¹uvA⁻¹/(1 − vA⁻¹u)
3  M = I − UV  and  M⁻¹ = Iₙ + U(Iₘ − VU)⁻¹V
4  M = A − UW⁻¹V  and  M⁻¹ = A⁻¹ + A⁻¹U(W − VA⁻¹U)⁻¹VA⁻¹

The Woodbury-Morrison formula 4 is the "matrix inversion lemma" in engineering. The four identities come from the 1, 1 block when inverting these matrices (v is 1 by n, u is n by 1, V is m by n, U is n by m, m ≤ n):

[I u; v 1]   [A u; v 1]   [Iₙ U; V Iₘ]   [A U; V W]

ELIMINATION = FACTORIZATION: A = LU ■ 2.6

Students often say that mathematics courses are too theoretical. Well, not this section. It is almost purely practical. The goal is to describe Gaussian elimination in the most useful way. Many key ideas of linear algebra, when you look at them closely, are really factorizations of a matrix. The original matrix A becomes the product of two or three special matrices. The first factorization - also the most important in practice - comes now from elimination. The factors are triangular matrices. The factorization that comes from elimination is A = LU.

We already know U, the upper triangular matrix with the pivots on its diagonal. The elimination steps take A to U. We will show how reversing those steps (taking U back to A) is achieved by a lower triangular L. The entries of L are exactly the multipliers ℓᵢⱼ - which multiplied row j when it was subtracted from row i.

Start with a 2 by 2 example. The matrix A contains 2, 1, 6, 8. The number to eliminate is 6. Subtract 3 times row 1 from row 2. That step is E₂₁ in the forward direction. The return step from U to A is L = E₂₁⁻¹ (an addition using +3):

Forward from A to U:  E₂₁A = [1 0; −3 1][2 1; 6 8] = [2 1; 0 5] = U
Back from U to A:  E₂₁⁻¹U = [1 0; 3 1][2 1; 0 5] = [2 1; 6 8] = A.

The second line is our factorization. Instead of E₂₁⁻¹U = A we write LU = A. Move now to larger matrices with many E's. Then L will include all their inverses.

Each step from A to U multiplies by a matrix Eᵢⱼ to produce zero in the (i, j) position. To keep this clear, we stay with the most frequent case - when no row exchanges are involved. If A is 3 by 3, we multiply by E₂₁ and E₃₁ and E₃₂. The multipliers ℓᵢⱼ produce zeros in the (2, 1) and (3, 1) and (3, 2) positions - all below the diagonal. Elimination ends with the upper triangular U. Now move those E's onto the other side, where their inverses multiply U:

(E₃₂E₃₁E₂₁)A = U  becomes  A = (E₂₁⁻¹E₃₁⁻¹E₃₂⁻¹)U  which is  A = LU.  (1)

The inverses go in opposite order, as they must. That product of three inverses is L. We have reached A = LU. Now we stop to understand it.

Explanation and Examples

First point: Every inverse matrix Eᵢⱼ⁻¹ is lower triangular. Its off-diagonal entry is ℓᵢⱼ, to undo the subtraction with −ℓᵢⱼ. The main diagonals of E and E⁻¹ contain 1's. Our example above had ℓ₂₁ = 3 and E = [1 0; −3 1] and E⁻¹ = [1 0; 3 1].

Second point: Equation (1) shows a lower triangular matrix (the product of the Eᵢⱼ) multiplying A. It also shows a lower triangular matrix (the product of the Eᵢⱼ⁻¹) multiplying U to bring back A. This product of inverses is L.

One reason for working with the inverses is that we want to factor A, not U. The "inverse form" gives A = LU. The second reason is that we get something extra, almost more than we deserve. This is the third point, showing that L is exactly right.

Third point: Each multiplier ℓᵢⱼ goes directly into its i, j position - unchanged - in the product of inverses which is L. Usually matrix multiplication will mix up all the numbers. Here that doesn't happen. The order is right for the inverse matrices, to keep the ℓ's unchanged. The reason is given below in equation (3). Since each E⁻¹ has 1's down its diagonal, the final good point is that L does too.
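Here is a minimal MATLAB check of those points on the 2 by 2 example above (nothing is assumed beyond the matrices already displayed):

A = [2 1; 6 8];
E21 = [1 0; -3 1];           % forward step: subtract 3 times row 1 from row 2
U = E21*A                    % upper triangular [2 1; 0 5]
L = inv(E21)                 % the multiplier +3 sits unchanged in L
L*U                          % brings back A, so A = LU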
2J (A = LU) This is elimination without row exchanges. The upper triangular U has the pivots on its diagonal. The lower triangular L has all 1's on its diagonal. The multipliers ℓᵢⱼ are below the diagonal of L.

Example 1 The matrix A has 1, 2, 1 on its diagonals. Elimination subtracts 1/2 times row 1 from row 2. The last step subtracts 2/3 times row 2 from row 3. The lower triangular L has ℓ₂₁ = 1/2 and ℓ₃₂ = 2/3. Multiplying LU produces A:

A = [2 1 0; 1 2 1; 0 1 2] = [1 0 0; 1/2 1 0; 0 2/3 1][2 1 0; 0 3/2 1; 0 0 4/3] = LU.

The (3, 1) multiplier is zero because the (3, 1) entry in A is zero. No operation needed.

Example 2 Change the top left entry from 2 to 1. The pivots all become 1. The multipliers are all 1. That pattern continues when A is 4 by 4:

A = [1 1 0 0; 1 2 1 0; 0 1 2 1; 0 0 1 2] = [1 0 0 0; 1 1 0 0; 0 1 1 0; 0 0 1 1][1 1 0 0; 0 1 1 0; 0 0 1 1; 0 0 0 1] = LU.

These LU examples are showing something extra, which is very important in practice. Assume no row exchanges. When can we predict zeros in L and U?

When a row of A starts with zeros, so does that row of L. When a column of A starts with zeros, so does that column of U.

If a row starts with zero, we don't need an elimination step. L has a zero, which saves computer time. Similarly, zeros at the start of a column survive into U. But please realize: Zeros in the middle of a matrix are likely to be filled in, while elimination sweeps forward.

We now explain why L has the multipliers ℓᵢⱼ in position, with no mix-up.

The key reason why A equals LU: Ask yourself about the pivot rows that are subtracted from lower rows. Are they the original rows of A? No, elimination probably changed them. Are they rows of U? Yes, the pivot rows never change again. When computing the third row of U, we subtract multiples of earlier rows of U (not rows of A!):

Row 3 of U = (Row 3 of A) − ℓ₃₁(Row 1 of U) − ℓ₃₂(Row 2 of U).  (2)

Rewrite this equation to see that the row [ℓ₃₁ ℓ₃₂ 1] is multiplying U:

(Row 3 of A) = ℓ₃₁(Row 1 of U) + ℓ₃₂(Row 2 of U) + 1(Row 3 of U).  (3)

This is exactly row 3 of A = LU. All rows look like this, whatever the size of A. With no row exchanges, we have A = LU.

Remark The LU factorization is "unsymmetric" because U has the pivots on its diagonal where L has 1's. This is easy to change. Divide U by a diagonal matrix D that contains the pivots. That leaves a new matrix with 1's on the diagonal: split U into D times a new upper triangular factor, where D = diag(d₁, …, dₙ) holds the pivots and row i of the new factor is row i of U divided by its pivot dᵢ.

It is convenient (but a little confusing) to keep the same letter U for this new upper triangular matrix. It has 1's on the diagonal (like L). Instead of the normal LU, the new form has D in the middle: Lower triangular L times diagonal D times upper triangular U.

The triangular factorization can be written A = LU or A = LDU. Whenever you see LDU, it is understood that U has 1's on the diagonal. Each row is divided by its first nonzero entry - the pivot. Then L and U are treated evenly in LDU:

[1 0; 3 1][2 8; 0 5]  splits further into  [1 0; 3 1][2 0; 0 5][1 4; 0 1].  (4)

The pivots 2 and 5 went into D. Dividing the rows by 2 and 5 left the rows [1 4] and [0 1] in the new U. The multiplier 3 is still in L.

My own lectures sometimes stop at this point. The next paragraphs show how elimination codes are organized, and how long they take. If MATLAB (or any software) is available, I strongly recommend the last problems 32 to 35. You can measure the computing time by just counting the seconds!
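Equation (4) is easy to reproduce numerically. A minimal MATLAB sketch (L and U are the factors in equation (4); the only new step is dividing out the pivots):

L = [1 0; 3 1]; U = [2 8; 0 5];     % A = LU with pivots 2 and 5
D = diag(diag(U));                  % the pivots 2 and 5 go into D
Unew = D\U                          % rows divided by their pivots: [1 4; 0 1]
L*D*Unew                            % the same A, now factored as LDU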
One Square System = Two Triangular Systems

The matrix L contains our memory of Gaussian elimination. It holds the numbers that multiplied the pivot rows, before subtracting them from lower rows. When do we need this record and how do we use it?

We need L as soon as there is a right side b. The factors L and U were completely decided by the left side (the matrix A). On the right side of Ax = b, we use L and U in two steps:

Solve:  1 Factor (into L and U, by forward elimination on A)
        2 Solve (forward elimination on b using L, then back substitution using U).

Earlier, we worked on b while we were working on A. No problem with that - just augment A by an extra column b. But most computer codes keep the two sides separate. The memory of forward elimination is held in L and U, at no extra cost in storage. Then we process b whenever we want to. The User's Guide to LINPACK remarks that "This situation is so common and the savings are so important that no provision has been made for solving a single system with just one subroutine."

How does Solve work on b? First, apply forward elimination to the right side (the multipliers are stored in L, use them now). This changes b to a new right side c - we are really solving Lc = b. Then back substitution solves Ux = c as always. The original system Ax = b is factored into two triangular systems:

Solve  Lc = b  and then solve  Ux = c.  (5)

To see that x is correct, multiply Ux = c by L. Then LUx = Lc is just Ax = b.

To emphasize: There is nothing new about those steps. This is exactly what we have done all along. We were really solving the triangular system Lc = b as elimination went forward. Then back substitution produced x. An example shows it all.

Example 3 Forward elimination on Ax = b ends at Ux = c:

u + 2v = 5              u + 2v = 5
4u + 9v = 21  becomes       v = 1.

The multiplier was 4, which is saved in L. The right side used it to find c:

Lc = b  The lower triangular system [1 0; 4 1][c] = [5; 21] gives c = [5; 1].
Ux = c  The upper triangular system [1 2; 0 1][x] = [5; 1] gives x = [3; 1].

It is satisfying that L and U can take the n² storage locations that originally held A. The ℓ's go below the diagonal. The whole discussion is only looking to see what elimination actually did.
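Those two triangular solves are quick to try in MATLAB (a minimal sketch; L, U, b are the matrices of Example 3, and backslash solves each triangular system):

L = [1 0; 4 1]; U = [1 2; 0 1]; b = [5; 21];
c = L\b                      % forward elimination: Lc = b gives c = (5, 1)
x = U\c                      % back substitution: Ux = c gives x = (3, 1)
L*(U*x)                      % check: LUx = Ax recovers b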
The Cost of Elimination

A very practical question is cost - or computing time. Can we solve 1000 equations on a PC? What if n = 10,000? Large systems come up all the time in scientific computing, where a three-dimensional problem can easily lead to a million unknowns. We can let the calculation run overnight, but we can't leave it for 100 years.

The first stage of elimination, on column 1, produces zeros below the first pivot. To find each new entry below the pivot row requires one multiplication and one subtraction. We will count this first stage as n² multiplications and n² subtractions. It is actually less, n² − n, because row 1 does not change.

The next stage clears out the second column below the second pivot. The working matrix is now of size n − 1. Estimate this stage by (n − 1)² multiplications and subtractions. The matrices are getting smaller as elimination goes forward. The rough count to reach U is the sum of squares n² + (n − 1)² + ⋯ + 2² + 1².

There is an exact formula (1/3)n(n + 1/2)(n + 1) for this sum of squares. When n is large, the 1/2 and the 1 are not important. The number that matters is (1/3)n³. The sum of squares is like the integral of x²! The integral from 0 to n is (1/3)n³:

Elimination on A requires about (1/3)n³ multiplications and (1/3)n³ subtractions.

What about the right side b? Going forward, we subtract multiples of b₁ from the lower components b₂, …, bₙ. This is n − 1 steps. The second stage takes only n − 2 steps, because b₁ is not involved. The last stage of forward elimination takes one step.

Now start back substitution. Computing xₙ uses one step (divide by the last pivot). The next unknown uses two steps. When we reach x₁ it will require n steps (n − 1 substitutions of the other unknowns, then division by the first pivot). The total count on the right side, from b to c to x - forward to the bottom and back to the top - is exactly n²:

[(n − 1) + (n − 2) + ⋯ + 1] + [1 + 2 + ⋯ + (n − 1) + n] = n².  (6)

To see that sum, pair off (n − 1) with 1 and (n − 2) with 2. The pairings leave n terms, each equal to n. That makes n². The right side costs a lot less than the left side!

Each right side needs n² multiplications and n² subtractions.

Here are the MATLAB codes to factor A into LU and to solve Ax = b. The program slu stops right away if a number smaller than the tolerance "tol" appears in a pivot position. Later the program plu will look down the column for a pivot, to execute a row exchange and continue solving. These Teaching Codes are on web.mit.edu/18.06/www.

function [L, U] = slu(A)
% Square LU factorization with no row exchanges!
[n, n] = size(A);
tol = 1.e-6;
for k = 1:n
   if abs(A(k, k)) < tol
      error('Cannot proceed without a row exchange: stop')
   end
   L(k, k) = 1;
   for i = k+1:n                    % multipliers for column k are put into L
      L(i, k) = A(i, k)/A(k, k);
      for j = k+1:n                 % elimination beyond row k and column k
         A(i, j) = A(i, j) - L(i, k)*A(k, j);   % matrix still called A
      end
   end
   for j = k:n
      U(k, j) = A(k, j);            % row k is settled, now name it U
   end
end

function x = slv(A, b)
% Solve Ax = b using L and U from slu(A). No row exchanges!
[L, U] = slu(A);
[n, n] = size(A);
for k = 1:n                         % forward elimination to solve Lc = b
   s = 0;
   for j = 1:k-1
      s = s + L(k, j)*c(j);
   end
   c(k) = b(k) - s;
end
for k = n:-1:1                      % back substitution from x(n) to x(1)
   t = 0;
   for j = k+1:n
      t = t + U(k, j)*x(j);
   end
   x(k) = (c(k) - t)/U(k, k);       % divide by pivot
end
x = x';                             % transpose to column vector
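A quick check of these codes on Example 3 might look like this (a sketch, assuming slu.m and slv.m are saved on MATLAB's path):

A = [1 2; 4 9]; b = [5; 21];
[L, U] = slu(A)          % expect L = [1 0; 4 1] and U = [1 2; 0 1]
x = slv(A, b)            % expect x = (3, 1) as in Example 3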
How long does it take to solve Ax = b? For a random matrix of order n = 1000, we tried the MATLAB command tic; A\b; toc. The time on my PC was 3 seconds. For n = 2000 the time was 20 seconds, which is approaching the n³ rule. The time is multiplied by about 8 when n is multiplied by 2. According to this n³ rule, matrices that are 10 times as large (order 10,000) will take thousands of seconds. Matrices of order 100,000 will take millions of seconds.

This is too expensive without a supercomputer, but remember that these matrices are full. Most matrices in practice are sparse (many zero entries). In that case A = LU is much faster. For tridiagonal matrices of order 10,000, storing only the nonzeros, solving Ax = b is a breeze.

■ REVIEW OF THE KEY IDEAS ■

1. Gaussian elimination (with no row exchanges) factors A into L times U.
2. The lower triangular L contains the numbers that multiply pivot rows, going from A to U. The product LU adds those rows back to recover A.
3. On the right side we solve Lc = b (forward) and Ux = c (backwards).
4. There are (1/3)(n³ − n) multiplications and subtractions on the left side.
5. There are n² multiplications and subtractions on the right side.

■ WORKED EXAMPLES ■

2.6 A The lower triangular Pascal matrix PL was in the worked example 2.5 B. (It contains the "Pascal triangle" and Gauss-Jordan found its inverse.) This problem connects PL to the symmetric Pascal matrix PS and the upper triangular PU. The symmetric PS has Pascal's triangle tilted, so each entry is the sum of the entry above and the entry to the left. The n by n symmetric PS is pascal(n) in MATLAB.

Problem: Establish the amazing lower-upper factorization PS = PL PU:

pascal(4) = [1 1 1 1; 1 2 3 4; 1 3 6 10; 1 4 10 20] = [1 0 0 0; 1 1 0 0; 1 2 1 0; 1 3 3 1][1 1 1 1; 0 1 2 3; 0 0 1 3; 0 0 0 1].

Then predict and check the next row and column for 5 by 5 Pascal matrices.

Solution You could multiply PL PU to get PS. Better to start with the symmetric PS and reach the upper triangular PU by elimination:

[1 1 1 1; 1 2 3 4; 1 3 6 10; 1 4 10 20] → [1 1 1 1; 0 1 2 3; 0 2 5 9; 0 3 9 19] → [1 1 1 1; 0 1 2 3; 0 0 1 3; 0 0 3 10] → [1 1 1 1; 0 1 2 3; 0 0 1 3; 0 0 0 1] = PU.

The multipliers ℓᵢⱼ that entered these steps go perfectly into PL. Then PS = PL PU is a particularly neat example of A = LU. Notice that every pivot is 1! The pivots are on the diagonal of PU. The next section will show how symmetry produces a special relationship between the triangular L and U. You see PU as the "transpose" of PL.

You might expect the MATLAB command lu(pascal(4)) to produce these factors PL and PU. That doesn't happen because the lu subroutine chooses the largest available pivot in each column (it will exchange rows so the second pivot is 3). But a different command chol factors without row exchanges. Then [L, U] = chol(pascal(4)) produces the triangular Pascal matrices as L and U. Try it. In the 5 by 5 case the new fifth rows do maintain PS = PL PU:

Next row  1 5 15 35 70 for PS    1 4 6 4 1 for PL

I will only check that this fifth row of PL times the (same) fifth column of PU gives 1² + 4² + 6² + 4² + 1² = 70 in the fifth row of PS. The full proof of PS = PL PU is quite fascinating - this factorization can be reached in at least four different ways. I am going to put these proofs on the course web page web.mit.edu/18.06/www, which is also available through MIT's OpenCourseWare at ocw.mit.edu.

These Pascal matrices PS, PL, PU have so many remarkable properties - we will see them again. You could locate them using the Index at the end of the book.

2.6 B The problem is: Solve PS x = b = (1, 0, 0, 0). This special right side means that x will be the first column of PS⁻¹. That is Gauss-Jordan, matching the columns of PS PS⁻¹ = I. We already know the triangular PL and PU from 2.6 A, so we solve

PL c = b (forward substitution)    PU x = c (back substitution).

Use MATLAB to find the full inverse matrix PS⁻¹.

Solution The lower triangular system PL c = b is solved top to bottom:

c₁ = 1
c₁ + c₂ = 0
c₁ + 2c₂ + c₃ = 0
c₁ + 3c₂ + 3c₃ + c₄ = 0

gives  c₁ = +1, c₂ = −1, c₃ = +1, c₄ = −1.

Forward elimination is multiplication by PL⁻¹. It produces the upper triangular system PU x = c. The solution x comes as always by back substitution, bottom to top:

x₁ + x₂ + x₃ + x₄ = 1
x₂ + 2x₃ + 3x₄ = −1
x₃ + 3x₄ = 1
x₄ = −1

gives  x₁ = +4, x₂ = −6, x₃ = +4, x₄ = −1.

The complete inverse matrix PS⁻¹ has that x in its first column:

inv(pascal(4)) = [4 −6 4 −1; −6 14 −11 3; 4 −11 10 −3; −1 3 −3 1].
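Both worked examples can be checked in a few lines of MATLAB. A minimal sketch (abs(pascal(4,1)) builds PL, as in Problem 40 of Section 2.5; taking its transpose for PU follows the "transpose" relation noted above):

PS = pascal(4);              % symmetric Pascal matrix
PL = abs(pascal(4, 1));      % lower triangular Pascal matrix
PU = PL';                    % upper triangular Pascal matrix
PS - PL*PU                   % the zero matrix confirms PS = PL*PU
inv(PS)                      % column 1 is x = (4, -6, 4, -1) from 2.6 B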
Problem Set 2.6

Problems 1-14 compute the factorization A = LU (and also A = LDU).

1 (Important) Forward elimination changes [1 1; 1 2]x = b to a triangular [1 1; 0 1]x = c:

x + y = 5   →   x + y = 5        [1 1 5]   →   [1 1 5]
x + 2y = 7          y = 2        [1 2 7]       [0 1 2]

That step subtracted ℓ₂₁ = ___ times row 1 from row 2. The reverse step adds ℓ₂₁ times row 1 to row 2. The matrix for that reverse step is L = ___. Multiply this L times the triangular system [1 1; 0 1]x = [5; 2] to get ___ = ___. In letters, L multiplies Ux = c to give ___.

2 (Move to 3 by 3) Forward elimination changes Ax = b to a triangular Ux = c:

x + y + z = 5         x + y + z = 5        x + y + z = 5
x + 2y + 3z = 7   →       y + 2z = 2   →       y + 2z = 2
x + 3y + 6z = 11         2y + 5z = 6               z = 2

The equation z = 2 in Ux = c comes from the original x + 3y + 6z = 11 in Ax = b by subtracting ℓ₃₁ = ___ times equation 1 and ℓ₃₂ = ___ times the final equation 2. Reverse that to recover [1 3 6 11] in A and b from the final [1 1 1 5] and [0 1 2 2] and [0 0 1 2] in U and c:

Row 3 of [A b] = (ℓ₃₁ Row 1 + ℓ₃₂ Row 2 + 1 Row 3) of [U c].

In matrix notation this is multiplication by L. So A = LU and b = Lc.

3 Write down the 2 by 2 triangular systems Lc = b and Ux = c from Problem 1. Check that c = (5, 2) solves the first one. Find x that solves the second one.

4 What are the 3 by 3 triangular systems Lc = b and Ux = c from Problem 2? Check that c = (5, 2, 2) solves the first one. Which x solves the second one?

5 What matrix E puts A into triangular form EA = U? Multiply by E⁻¹ = L to factor A into LU:

A = [2 1 0; 0 4 2; 6 3 5].

6 What two elimination matrices E₂₁ and E₃₂ put A into upper triangular form E₃₂E₂₁A = U? Multiply by E₃₂⁻¹ and E₂₁⁻¹ to factor A into LU = E₂₁⁻¹E₃₂⁻¹U:

A = [1 1 1; 2 4 5; 0 4 0].