
Deriving a method for determining inverses

Determining a method for constructing inverse transformation matrices. Created by Sal Khan.

Want to join the conversation?

  • Sarah:
    What is the difference between a transformation matrix and a permutation matrix?
    (6 votes)
    • newbarker:
      A permutation matrix contains only ones and zeros. All it can do is reorder the entries of the matrix/vector it multiplies, so only some very limited transformations can be represented by a permutation matrix. For instance, here's a permutation matrix that swaps row 1 and row 2 in a matrix/vector with 3 rows:

      [0 1 0]
      [1 0 0]
      [0 0 1]

      It is the identity matrix with rows 1 and 2 swapped. Simple, huh? Multiply that matrix by this column vector

      [x]
      [y]
      [z]

      to verify that it does what we think it should do.

      A transformation matrix has a lot more freedom. It can stretch along one axis independently of the others. For instance, here's a stretch by a factor of 5 along the x axis and a shrink by a half along the z axis:

      [ 5 0 0]
      [ 0 1 0]
      [ 0 0 1/2]

      Transformation matrices can rotate, flip, project, etc.: http://en.wikipedia.org/wiki/Linear_mapping#Examples_of_linear_transformation_matrices.
      (20 votes)
  • adam:
    In my current textbook (and I'm sure other places discussing this topic),
    invertible means the same thing as non-singular: there is a finite sequence of row operations that takes the matrix to the identity matrix. (The product of the matrices representing those row operations is the inverse of the matrix.)
    Likewise, non-invertible corresponds with singular: there is no matrix that, when multiplied by a singular matrix, produces the identity matrix.
    It's been pretty confusing for me with different terminology in my class and online, but I hope this helps out someone!
    (4 votes)
    • DeWain Molter:
      Mathematics progresses at different rates among different people with different perspectives for different reasons all the time, so you end up with differences in language, interpretation, visualization, focus, goal and strategy... all of which leads to differences in vocabulary. You will (hopefully) find that each type of vocabulary reveals the math in a different way, so it is useful to master as many versions of an idea as you can.
      (6 votes)
  • Andrew:
    I do not understand how he got the numbers at 3:38. Can you please explain?
    (4 votes)
    • binhex:
      S is the transformation matrix we're trying to find, and I is the identity matrix.
      The idea is that if we apply the transformation to I, that is, compute S·I, we should get S itself, since I is the identity.
      So apply to I what we know this transformation does, [a1, a2, a3] -> [a1, a2+a1, a3-a1], and the result is S.
      (4 votes)
  • Parul Patel:
    Can anyone please help me out with a way to calculate faster the inverse of a matrix?
    (3 votes)
  • Parth Gupta:
    Is it possible to convert a matrix into row echelon form by transforming the row vectors instead of column vectors?
    (3 votes)
  • pupilmeter:
    I'm having a problem. I've multiplied S1 x A by hand and with an app. Both times I got
    [ 1 -1 -1]
    [-2  2  3]
    [-2  1  4]

    Not
    [1 -1 -1]
    [0  1  2]
    [0  2  5]

    Any thoughts on what I did wrong?
    (2 votes)
    • loumast17:
      Wrong order. Unlike ordinary multiplication of numbers, matrix multiplication is not commutative: A*B is not the same as B*A.

      If you are confused about which order to use, think of each matrix as a function: you start with f(x), and to apply another function to that you write g(f(x)), expanding to the left. That's how I think of it.
      (3 votes)
  • Sanjana Sridhar:
    What is row echelon form?
    (1 vote)
  • chinna.chilukuri:
    What is a homomorphism? (This is an out-of-context question.)
    (2 votes)
    • kzoyogurt:
      A homomorphism is a map between two algebraic structures of the same type (for example, vector spaces) that preserves the structures' operations. For a function f: A -> B mapping from structure A to structure B, f is a homomorphism if f(x·y) = f(x)·f(y) for every x, y in A.
      In short, f is a homomorphism if f preserves the operations.

      For example, let f map each scalar a to the 2x2 matrix

      f(a) =
      [a 0
      0 a]

      Then

      f(a+b) =
      [a+b 0
      0 a+b]

      = f(a) + f(b)

      and

      f(2a) =
      [2a 0
      0 2a]

      = 2 f(a)

      So f is a homomorphism, because f preserves addition and scalar multiplication.
      (1 vote)
  • promanov3815:
    I'm confused by the order of multiplying the matrices. We have matrix A. I thought we would be multiplying this matrix by the transformed identity matrix to get our result. However, that doesn't work out: we multiply I by A (I is first). Is there a general rule in linear algebra that says which matrix should be the first in a multiplication (since the commutative law doesn't hold)?
    (1 vote)
  • InnocentRealist:
    Why is the word "singular" used for matrices that aren't invertible? Are all functions that aren't both 1-1 and onto called "singular"?
    (1 vote)
    • Derek M.:
      It's just an arbitrary term the math community chose. According to Wikipedia, the term "singular" is only used for matrices, but I am sure it is also used for linear operators (i.e., a linear transformation T: V -> V). Note that singular matrices are also square.
      (2 votes)
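As a quick numeric check of the permutation matrix from the first question above (the identity with rows 1 and 2 swapped), here is a minimal NumPy sketch; the numeric vector is just a stand-in for [x, y, z]:

```python
import numpy as np

# The identity matrix with rows 1 and 2 swapped: a permutation matrix
P = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])

v = np.array([10, 20, 30])  # stand-in for the column vector [x, y, z]

print(P @ v)  # [20 10 30]: the first two entries are swapped
```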

Video transcript

I have this matrix A here that I want to put into reduced row echelon form. And we've done this multiple times. You just perform a bunch of row operations. But what I want to show you in this video is that those row operations are equivalent to linear transformations on the column vectors of A. So let me show you by example. So if we just want to put A into reduced row echelon form, the first step that we might want to do if we wanted to zero out these entries right here, is-- let me do it right here-- is we'll keep our first entry the same. So for each of these column vectors, we're going to keep the first entry the same. So they're going to be 1, minus 1, minus 1. And actually, let me simultaneously construct my transformation. So I'm saying that my row operation I'm going to perform is equivalent to a linear transformation on the column vector. So it's going to be a transformation that's going to take some column vector, a1, a2, and a3. It's going to take each of these and then do something to them, do something to them in a linear way. They'll be linear transformations. So we're keeping the first entry of our column vector the same. So this is just going to be a1. This is a line right here. That's going to be a1. Now, what can we do if we want to get to reduced row echelon form? We'd want to make this equal to 0. So we would want to replace our second row with the second row plus the first row, because then these guys would turn out to be 0. So let me write that on my transformation. I'm going to replace the second row with the second row plus the first row. Let me write it out here. Minus 1 plus 1 is 0. 2 plus minus 1 is 1. 3 plus minus 1 is 2. Now, we also want to get a 0 here. So let me replace my third row with my third row minus my first row. So I'm going to replace my third row with my third row minus my first row. So 1 minus 1 is 0. 1 minus minus 1 is 2. 4 minus minus 1 is 5, just like that. So you see this was just a linear transformation. 
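The row operations just described (keep row 1; replace row 2 with row 2 plus row 1; replace row 3 with row 3 minus row 1) can be checked numerically. A minimal NumPy sketch, using the matrix A from the video:

```python
import numpy as np

A = np.array([[ 1, -1, -1],
              [-1,  2,  3],
              [ 1,  1,  4]], dtype=float)

B = A.copy()
B[1] += B[0]  # row 2 -> row 2 + row 1
B[2] -= B[0]  # row 3 -> row 3 - row 1

print(B)
# [[ 1. -1. -1.]
#  [ 0.  1.  2.]
#  [ 0.  2.  5.]]
```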
And any linear transformation you could actually represent as a matrix vector product. So for example, this transformation, I could represent it. To figure out its transformation matrix, so if we say that T of x is equal to, I don't know, let's call it some matrix S times x. We already used the matrix A. So I have to pick another letter. So how do we find S? Well, we just apply the transformation to all of the column vectors, or the standard basis vectors of the identity matrix. So let's do that. So the identity matrix-- I'll draw it really small like this-- the identity matrix looks like this, 1, 0, 0, 0, 1, 0, 0, 0, 1. That's what that identity matrix looks like. To find the transformation matrix, we just apply this guy to each of the column vectors of this. So what do we get? I'll do it a little bit bigger. We apply it to each of these column vectors. But we see the first row always stays the same. So the first row is always going to be the same thing. So 1, 0, 0. I'm essentially applying it simultaneously to each of these column vectors, saying, look, when you transform each of these column vectors, their first entry stays the same. The second entry becomes the second entry plus the first entry. So 0 plus 1 is 1. 1 plus 0 is 1. 0 plus 0 is 0. Then the third entry gets replaced with the third entry minus the first entry. So 0 minus 1 is minus 1. 0 minus 0 is 0. 1 minus 0 is 1. Now notice, when I apply this transformation to the column vectors of our identity matrix, I essentially just performed those same row operations that I did up there. I performed those exact same row operations on this identity matrix. But we know that this is actually the transformation matrix, that if we multiply it by each of these column vectors, or by each of these column vectors, we're going to get these column vectors. So you can view it this way. This right here, this is equal to S. This is our transformation matrix. 
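Applying those same row operations to the identity matrix, as the video does, produces the transformation matrix S. A short sketch:

```python
import numpy as np

S = np.eye(3)
S[1] += S[0]  # row 2 -> row 2 + row 1, applied to the identity
S[2] -= S[0]  # row 3 -> row 3 - row 1

print(S)
# [[ 1.  0.  0.]
#  [ 1.  1.  0.]
#  [-1.  0.  1.]]
```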
So we could say that if we create a new matrix whose columns are S times this column vector, S times 1, minus 1, 1. And then the next column is S times-- I wanted to do it in that other color-- S times this guy, minus 1, 2, 1. And then the third column is going to be S times this third column vector, minus 1, 3, 4. We now know we're applying this transformation, this is S, times each of these column vectors. That is the matrix representation of this transformation. This guy right here will be transformed to this right here. Let me do it down here. I wanted to show that stuff that I had above here as well. Well, I'll just draw an arrow. That's probably the simplest thing. This matrix right here will become that matrix right there. So another way you could write it, this is equivalent to what? What is this equivalent to? When you take a matrix and you multiply it times each of the column vectors, when you transform each of the column vectors by this matrix, this is the definition of a matrix-matrix product. This is equal to our matrix S-- I'll do it in pink-- this is equal to our matrix S, which is 1, 0, 0, 1, 1, 0, minus 1, 0, 1, times our matrix A, times 1, minus 1, 1, minus 1, 2, 1, minus 1, 3, 4. So let me make this very clear. This is our transformation matrix S. This is our matrix A. And when you perform this product you're going to get this guy right over here. I'll just copy and paste it. Edit, copy, and let me paste it. You're going to get that guy just like that. Now the whole reason why I'm doing that is just to remind you that when we perform each of these row operations, we're just multiplying. We're performing a linear transformation on each of these columns. And it is completely equivalent to just multiplying this guy by some matrix S. In this case, we took the trouble of figuring out what that matrix S is. But any of these row operations that we've been doing, you can always represent them by a matrix multiplication. 
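The claim here, that multiplying the transformation matrix S by A reproduces the row-reduced result, can be verified directly:

```python
import numpy as np

S = np.array([[ 1,  0,  0],
              [ 1,  1,  0],
              [-1,  0,  1]])

A = np.array([[ 1, -1, -1],
              [-1,  2,  3],
              [ 1,  1,  4]])

print(S @ A)
# [[ 1 -1 -1]
#  [ 0  1  2]
#  [ 0  2  5]]
```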
So this leads to a very interesting idea. When you put something in reduced row echelon form, let me do it up here. Actually, let's just finish what we started with this guy. Let's put this guy in reduced row echelon form. Let me call this first S. Let's call that S1. So this guy right here is equal to that first S1 times A. We already showed that that's true. Now let's perform another transformation. Let's just do another set of row operations to get us to reduced row echelon form. So let's keep our middle row the same, 0, 1, 2. And let's replace the first row with the first row plus the second row, because I want to make this a 0. So 1 plus 0 is 1. Let me do it in another color. Minus 1 plus 1 is 0. Minus 1 plus 2 is 1. Now, I want to replace the third row with, let's say the third row minus 2 times the first row. So that's 0 minus 2, times 0, is 0. 2 minus 2, times 1, is 0. 5 minus 2, times 2, is 1. 5 minus 4 is 1. We're almost there. We just have to zero out these guys right there. Let's see if we can get this into reduced row echelon form. So what is this? I just performed another linear transformation. Actually, let me write this. Let's say if this was our first linear transformation, what I just did is I performed another linear transformation, T2. I'll write it in a different notation, where you give me some vector, some column vector, x1, x2, x3. What did I just do? What was the transformation that I just performed? My new vector, I made the top row equal to the top row plus the second row. So it's x1 plus x2. I kept the second row the same. And then the third row, I replaced it with the third row minus 2 times the second row. That was a linear transformation we just did. And we could represent this linear transformation as being, we could say T2 applied to some vector x is equal to some transformation vector S2, times our vector x. 
Because if we applied this transformation matrix to each of these columns, it's equivalent to multiplying this guy by this transformation matrix. So you could say that this guy right here-- we haven't figured out what this is, but I think you get the idea-- this matrix right here is going to be equal to this guy. It's going to be equal to S2 times this guy. What is this guy right here? Well, this guy is equal to S1 times A. It's going to be S2 times S1, times A. Fair enough. And you could have gotten straight here if you just multiplied S2 times S1. This could be some other matrix. If you just multiplied it by A, you'd go straight from there to there. Fair enough. Now, we still haven't gotten this guy into reduced row echelon form. So let's try to get there. I've run out of space below him, so I'm going to have to go up. So let's go upwards. What I want to do is, I'm going to keep the third row the same, 0, 0, 1. Let me replace the second row with the second row minus 2 times the third row. So we'll get a 0, we'll get a 1 minus 2, times 0, and we'll get a 2 minus 2, times 1. So that's a 0. Let's replaced the first row with the first row minus the third row. So 1 minus 0 is 1. 0 minus 0 is 0. 1 minus 1 is 0, just like that. Let's just actually write what our transformation was. Let's call it T3. I'll do it in purple. T3 is the transformation of some vector x-- let me write it like this-- of some vector x1, x2, x3. What did we do? We replaced the first row with the first row minus the third row, x1 minus x3. We replaced the second row with the second row minus 2 times the third row. So it's x2 minus 2 times x3. Then the third row just stayed the same. So obviously, this could also be represented. T3 of x could be equal to some other transformation matrix, S3 times x. So this transformation, when you multiply it to each of these columns, is equivalent to multiplying this guy times this transformation matrix, which we haven't found yet. We can write it. 
So this is going to be equal to S3 times this matrix right here, which is S2, S1, A. And what do we have here? We got the identity matrix. We put it in reduced row echelon form. We got the identity matrix. We already know from previous videos that if the reduced row echelon form of something is the identity matrix, then we are dealing with an invertible transformation, or an invertible matrix. Because this obviously could be the transformation matrix for some transformation. Let's just call this transformation, I don't know, did I already use T? Let's just call it Tnaught for our transformation applied to some vector x, that might be equal to Ax. So we know that this is invertible. We put it in reduced row echelon form. We put its transformation matrix in reduced row echelon form. And we got the identity matrix. So that tells us that this is invertible. But something even more interesting happened. We got here by performing some row operations. And we said those row operations were completely equivalent to multiplying this guy right here by multiplying our original transformation matrix by a series of transformation matrices that represent our row operations. And when we multiplied all this, this was equal to the identity matrix. Now, in the last video we said that the inverse matrix, so if this is Tnaught, Tnaught inverse could be represented-- it's also a linear transformation-- it can be represented by some inverse matrix that we just called A inverse times x. And we saw that the inverse transformation matrix times our transformation matrix is equal to the identity matrix. We saw this last time. We proved this to you. Now, something very interesting here. We have a series of matrix products times this guy, times this guy, that also got me the identity matrix. So this guy right here, this series of matrix products, this must be the same thing as my inverse matrix, as my inverse transformation matrix. And so we could actually calculate it if we wanted to.
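Multiplying out the three transformation matrices confirms the conclusion: S3 S2 S1 A is the identity, so the product S3 S2 S1 is the inverse of A. (Each Si below is the corresponding row operation applied to the identity matrix, as described above.)

```python
import numpy as np

A  = np.array([[ 1, -1, -1],
               [-1,  2,  3],
               [ 1,  1,  4]])

S1 = np.array([[ 1,  0,  0],   # row 2 += row 1, row 3 -= row 1
               [ 1,  1,  0],
               [-1,  0,  1]])
S2 = np.array([[ 1,  1,  0],   # row 1 += row 2, row 3 -= 2*row 2
               [ 0,  1,  0],
               [ 0, -2,  1]])
S3 = np.array([[ 1,  0, -1],   # row 1 -= row 3, row 2 -= 2*row 3
               [ 0,  1, -2],
               [ 0,  0,  1]])

A_inv = S3 @ S2 @ S1

print(A_inv @ A)  # the 3x3 identity matrix
print(A_inv)
# [[ 5  3 -1]
#  [ 7  5 -2]
#  [-3 -2  1]]
```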
Just like we did, we actually figured out what S1 was. We did it down here. We could do a similar operation to figure out what S2 was, S3 was, and then multiply them all out. We would have actually constructed A inverse. I guess, something more interesting we could do instead of doing that, what if we applied these same matrix products to the identity matrix. So the whole time we did here, when we did our first row operation. So we have here, we have the matrix A. Let's say we have an identity matrix on the right. Let's call that I, right there. Now, our first linear transformation we did-- we saw that right here-- that was equivalent to multiplying S1 times A. The first set of row operations was this. It got us here. Now, if we perform that same set of row operations on the identity matrix, what are we going to get? We're going to get the matrix S1. S1 times the identity matrix is just S1. All of the columns of anything times the identity times the standard basis columns, it'll just be equal to itself. You'll just be left with that S1. This is S1 times I. That's just S1. Fair enough. Now, you performed your next row operation and you ended up with S2 times S1, times A. Now if you performed that same row operation on this guy right there, what would you have? You would have S2 times S1, times the identity matrix. Now, our last row operation we represented with the matrix product S3. We're multiplying it by the transformation matrix S3. So if you did that, you have S3, S2, S1 A. But if you perform the same exact row operations on this guy right here, you have S3, S2, S1, times the identity matrix. Now when you did this, when you performed these row operations here, this got you to the identity matrix. Well, what are these going to get you to? When you just performed the same exact row operations you performed on A to get to the identity matrix, if you performed those same exact row operations on the identity matrix, what do you get? You get this guy right here. 
Anything times that identity matrix is going to be equal to itself. So what is that right there? That is A inverse. So we have a generalized way of figuring out the inverse for transformation matrix. What I can do is, let's say I have some transformation matrix A. I can set up an augmented matrix where I put the identity matrix right there, just like that, and I perform a bunch of row operations. And you could represent them as matrix products. But you perform a bunch of row operations on all of them. You perform the same operations you perform on A as you would do on the identity matrix. By the time you have A as an identity matrix, you have A in reduced row echelon form. By the time A is like that, your identity matrix, having performed the same exact operations on it, it is going to be transformed into A's inverse. This is a very useful tool for solving actual inverses. Now, I've explained the theoretical reason why this works. In the next video we'll actually solve this. Maybe we'll do it for the example that I started off with in this video.
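The augmented-matrix procedure described here can be sketched as a small Gauss-Jordan routine. This is an illustrative implementation (the function name is mine, and it assumes the input matrix is invertible, with no safeguards for the singular case):

```python
import numpy as np

def inverse_via_row_reduction(A):
    """Row-reduce [A | I]; when the left half becomes I, the right half is A's inverse."""
    n = len(A)
    M = np.hstack([np.array(A, dtype=float), np.eye(n)])  # augmented matrix [A | I]
    for col in range(n):
        # Swap in the row with the largest pivot (assumes A is invertible)
        pivot = col + np.argmax(np.abs(M[col:, col]))
        M[[col, pivot]] = M[[pivot, col]]
        M[col] /= M[col, col]                    # scale the pivot row so the pivot is 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]   # zero out the rest of the column
    return M[:, n:]                              # right half is now the inverse

A = [[1, -1, -1], [-1, 2, 3], [1, 1, 4]]  # the matrix from the video
print(inverse_via_row_reduction(A))
# [[ 5.  3. -1.]
#  [ 7.  5. -2.]
#  [-3. -2.  1.]]
```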