# Compositions of linear transformations 1

Introduction to compositions of Linear Transformations. Created by Sal Khan.

## Want to join the conversation?

• What exactly does composition mean?
• Intuitively, it means do something, and then do another thing to that something.
Formally, composition of functions is when you have two functions f and g, then consider g(f(x)). We call the function g of f "g composed with f".
So in this video, you apply a linear transformation, which warps the space in some way, and then apply another linear transformation to the already warped space. The result is a composition.
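As a minimal sketch of the "do something, then do another thing to the result" idea, the snippet below composes two ordinary Python functions. The names `compose`, `f`, and `g` and the particular formulas are assumptions chosen for illustration (this `f` is not even a linear transformation); the point is only the order of application.

```python
def compose(g, f):
    """Return the function x -> g(f(x)), i.e. "g composed with f"."""
    return lambda x: g(f(x))

# Hypothetical example functions, chosen only for illustration.
def f(x):
    return x + 1      # applied first in g(f(x))

def g(x):
    return 2 * x      # applied second in g(f(x))

g_of_f = compose(g, f)
f_of_g = compose(f, g)

print(g_of_f(3))  # g(f(3)) = 2 * (3 + 1) = 8
print(f_of_g(3))  # f(g(3)) = (2 * 3) + 1 = 7
```

The two results differ, which is why the order in "g composed with f" matters.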
• I have two questions:

1. At one point he says that A will be l x n. That makes sense, except how do we know which of R^n (vector x) or R^l (vector z) gives the number of columns, and which gives the number of rows?

2. This seems awfully similar to the g(f(x)) and f(g(x)) stuff that I did in college algebra/Algebra 2. Is this related at all? Was the stuff they showed us in algebra kind of a precursor to this stuff?
• For an mxn matrix, the matrix is m tall and n wide, so m rows and n columns. An lxn matrix would be n wide and l tall, giving the transformation `A x⃑ = z⃑`.

And yes, it's very similar, just with more variables.
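The row/column bookkeeping can be checked with a small sketch. The helper `mat_vec` and the numbers in `A` and `x` are assumptions made up for this example; matrices are plain lists of rows.

```python
def mat_vec(A, x):
    """Multiply matrix A (a list of rows) by vector x (a list of numbers)."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A needs as many entries as x has components")
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# Hypothetical l x n matrix with l = 2 rows (tall) and n = 3 columns (wide).
A = [[1, 0, 2],
     [0, 1, 3]]
x = [1, 1, 1]            # a vector in R^3, so n = 3 components

z = mat_vec(A, x)        # a vector in R^2, so l = 2 components

print(len(A), len(A[0]))  # 2 3  (rows, columns)
print(z)                  # [3, 4]
```

The input vector has to match the number of columns, and the output vector has as many components as there are rows.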
• What is the trace of a matrix?
• The trace of a matrix is the sum of the elements of the main diagonal of the matrix. It is only defined for square matrices.
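A minimal sketch of that definition, assuming a matrix stored as a list of rows; the function name `trace` is chosen here for illustration.

```python
def trace(M):
    """Sum of the main-diagonal entries of a square matrix (a list of rows)."""
    if any(len(row) != len(M) for row in M):
        raise ValueError("trace is only defined for square matrices")
    return sum(M[i][i] for i in range(len(M)))

print(trace([[1, 2],
             [3, 4]]))  # 1 + 4 = 5
```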
• I found using the same x vector notation throughout every separate transformation somewhat confusing. So is this what we learned?
1. Apply the 1st transformation to the identity matrix of the relevant size: take the dot product of each row vector of matrix A with each column vector of the identity matrix, and write each resulting scalar in the position given by the row number of A and the column number of I. Notice that the result gives us the column vectors of A again. (mxn) and (nxn) matrices give us an (mxn) matrix again.
2. Apply the second transformation to the resulting matrix from 1 above: take the dot product of each row vector of B with each column vector of A, and write each resulting scalar in the position given by the row number of B and the column number of A. (lxm) and (mxn) matrices give us an (lxn) matrix. This is the composite linear transformation.
3. Now multiply the resulting matrix from 2 by the vector x we want to transform. This gives us a new vector with dimensions (lx1), from an (lxn) matrix and (nx1) vector multiplication.
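The three steps in this comment can be traced with a small sketch. The helper `mat_mul` and the specific entries of `A`, `B`, and `x` are assumptions invented for illustration, with m = 2, n = 3, and l = 1.

```python
def mat_mul(P, Q):
    """Matrix product: entry (i, j) is row i of P dotted with column j of Q."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

# Hypothetical matrices: A is m x n = 2 x 3, B is l x m = 1 x 2.
A = [[1, 2, 0],
     [0, 1, 1]]
B = [[3, 1]]
I = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]

step1 = mat_mul(A, I)       # (m x n)(n x n) -> m x n, the columns of A again
step2 = mat_mul(B, step1)   # (l x m)(m x n) -> l x n, the composite matrix
x = [[1], [2], [3]]         # x written as an n x 1 column
step3 = mat_mul(step2, x)   # (l x n)(n x 1) -> l x 1

print(step1 == A)  # True
print(step2)       # [[3, 7, 1]]
print(step3)       # [[20]]
```

Applying the two matrices one after the other, `mat_mul(B, mat_mul(A, x))`, gives the same `[[20]]`.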
• Here's what this video is getting at. Given:
`T(x) = Ax` and `S(x) = Bx`
We know:
`(T∘S)(x) = T(S(x)) = A(Bx) = (AB)x = ABx`
In other words, you can use matrix multiplication to combine multiple linear transformations into a single linear transformation.
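A quick numerical sketch of this identity, using the reply's naming (`T(x) = Ax`, `S(x) = Bx`). The helpers and the two 2 x 2 matrices are assumptions chosen for illustration.

```python
def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def mat_mul(P, Q):
    """Matrix product: entry (i, j) is row i of P dotted with column j of Q."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

# Hypothetical matrices for T(x) = Ax and S(x) = Bx.
A = [[0, -1],
     [1, 0]]    # rotate 90 degrees counterclockwise
B = [[2, 0],
     [0, 3]]    # scale the first component by 2 and the second by 3

def T(v):
    return mat_vec(A, v)

def S(v):
    return mat_vec(B, v)

x = [1, 1]
two_steps = T(S(x))                    # apply S first, then T
one_step = mat_vec(mat_mul(A, B), x)   # apply the single combined matrix AB

print(two_steps)  # [-3, 2]
print(one_step)   # [-3, 2]
```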
• Another way to prove that (T o S)(x) is a L.T. is to use the matrix-vector product definitions of the L.T.'s T and S. Simply evaluate BA into a single matrix K. Since all matrix-vector products are linear transformations and (T o S)(x) = Kx, (T o S)(x) is a linear transformation.
• At this point, we hadn't defined what a matrix-matrix product was.
• In determining the dimensions of the A matrix Sal stated that because x was an element of Rn the A matrix would have n columns. However, if the vector x is in Rn would that require the vector x to have n elements which would translate into A having n rows, not n columns? Why n rows?
• If r1, r2, etc. are the row vectors of A, then Ax = (x dot r1, x dot r2, ... , x dot rm). Each dot product only makes sense if the row vector has n components (the same as x), which means that A is mxn: it has m rows and n columns.
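Written out, the row picture in this reply looks like the following. This is a sketch that assumes A has m rows, each a vector with n components, matching the symbols used in the reply.

```latex
A\vec{x} =
\begin{bmatrix}
  \vec{r}_1^{\,T} \\ \vec{r}_2^{\,T} \\ \vdots \\ \vec{r}_m^{\,T}
\end{bmatrix}
\vec{x}
=
\begin{bmatrix}
  \vec{r}_1 \cdot \vec{x} \\ \vec{r}_2 \cdot \vec{x} \\ \vdots \\ \vec{r}_m \cdot \vec{x}
\end{bmatrix},
\qquad \vec{r}_i, \vec{x} \in \mathbb{R}^n .
```

Each row contributes one component of the output, so the result has m components, while the length of each row (the number of columns) must equal n.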
• At one point Khan talks about how x is a member of Rm.
But at the beginning of the video x is a member of Rn.
How did x go from being a member of Rn to Rm?
Thank you
• Sal is recycling variable names. The x in Rm is a different x than the one in Rn.
• How do I explain in terms of transformations how the graphs of y=16+0.07x and y=26.1+0.07x are related to each other?
• If every linear transformation can be written as a matrix, is there a tool for writing and analysing every non-linear transformation?
• So is this an introduction to transformations?
• It's an introduction to compositions of transformations. For an intro to transformations themselves, you might want to look at some of the earlier videos in the Linear Algebra playlist -- probably starting around the Vector Transformations video and working on from there.

## Video transcript

Let's see if we can build a bit on some of our work with linear transformations. I have two linear transformations. I have the transformation S, that's a mapping, or function, from the set X to the set Y. And let's just say that X is a subset of Rn. Y is a subset of Rm. We know S is a linear transformation, so it can be represented by a matrix-vector product. We can write S of x. Let me do it in the same color as I was doing it before. We can write that S of some vector x is equal to some matrix A times x.

Whatever x we input into the function, whatever we take the mapping of, is going to be in this set right here; it is going to be a member of Rn. Let me do it like this. x is going to be a member of Rn. Well, it's actually going to be a member of X, which is a subset of Rn. I'm just trying to figure out what the dimensions of matrix A are going to be. This is going to have n components right here. Matrix A has to have n columns. Matrix A is going to be, let's just say, an m by n matrix. Fair enough.

Let's say we have another linear transformation. Let me draw what I've done so far. We have some set X, right here, that is set X. It is a subset of Rn. Rn, I can draw out there. We have this mapping, S, or this linear transformation, from X to Y. It goes to a new set, Y, right here. Y is a subset of Rm. The mapping, right here: you take some element here, and you apply the transformation S. I've told you it's a linear transformation. You'll get to some value in set Y, which is in Rm. I said that the matrix representation of our linear transformation is going to be an m by n matrix. You're going to start with something that has n entries, or a vector that's a member of Rn. You want to end up with a vector that's in Rm. Fair enough.

Now, let's say I have another linear transformation, T. It's a mapping from the set Y to the set Z. Let me draw. I have another set here called set Z.
I can map from elements of Y, so I could map from here, into elements of Z using the linear transformation T. Similar to what I did before. We know that Y is a subset, not a member, of Rm. These are just arbitrary letters. It could be 100 or 5, or whatever. I'm just trying to stay abstract. Z is, I'm running out of letters, let's say Z is a subset of Rl.

Then what is the matrix representation of the transformation T going to be? You know it's a linear transformation. I told you that. We know it can be represented in this form. We could say that T of x, where x is a member of Rm, is going to be equal to some matrix B times x. What are the dimensions of matrix B going to be? x is going to be a member of Rm, so B is going to have to have m columns. And then it's a mapping into a set that's a subset of Rl. It's going to map from members of Rm to members of Rl. It's going to be an l by m matrix.

When you see this, a very natural question might arise in your head. Can we construct some mapping that goes all the way from set X to set Z? Maybe we'll call that the composition of-- I mean we can create that mapping using a combination of S and T. Let's just make up some word. Let's just call T, with this little circle, S, let's just call this a mapping from X all the way to Z. We'll call this the composition of T with S. We're essentially just combining the two functions in order to try to create some mapping that takes us from set X all the way to set Z.

We still haven't defined this. How can we actually construct this? A natural thing might be to first apply transformation S. Let's say that this is our x we're dealing with right here. Maybe the first thing we want to do is apply S, and that'll give us an S of x. That will give us this value, right here, that's in set Y.
And then what if we were to take that value and apply the transformation T to it? We would take this value, and apply the transformation T to it, to maybe get to this value. This would be the linear transformation T applied to this value, this member of the set Y, which is in Rm. We are just going to apply that transformation to this guy, right here, which was the transformation S applied to x. This might look fancy, but remember this is just a vector, right here, in the set Y, which is a subset of Rm. This is a vector that is in X. When you apply the mapping S, you get another vector that's in Y. You apply the linear transformation T to that, then you get another vector that's in set Z.

Let's define the composition of T with S. This is going to be a definition. Let's define the composition of T with S to be-- first we apply S to some vector in X to get us here. Then we apply T to that vector to get us to set Z. So we apply T to this thing right there.

The first question might be, is this even a linear transformation? Is the composition of two linear transformations even a linear transformation? Well, there are two requirements to be a linear transformation. The linear transformation of the sum of two vectors should be the sum of the linear transformation of each of them. I know when I just say that verbally, it probably doesn't make a lot of sense. Let's try to take the composition of T with S of the sum of two vectors in X. I'm taking the vectors x and y. By definition, what is this equal to? This is equal to applying the linear transformation T to the linear transformation S applied to the sum of our two vectors, x plus y. What is this equal to? I told you at the beginning of the video that S is a linear transformation.
So by definition of a linear transformation, one of our requirements, we know that S of x plus y is the same thing as S of x plus S of y, because S is a linear transformation. We know that is true. We know that we can replace this thing right there with that thing right there. We also know that T is a linear transformation, which means that the transformation applied to the sum of two vectors is equal to the sum of the transformation applied to each of the vectors. So this is equal to T of S of x, or the transformation T applied to the transformation S applied to x, I know the terminology is getting confusing, plus T of S of y. We can do this because we know that T is a linear transformation.

But what is this right here? All this statement right here is equal to the composition of T with S, applied to x, plus the composition of T with S, applied to y. Given that both T and S are linear transformations, we got our first requirement: the composition applied to the sum of two vectors is equal to the sum of the composition applied to each of the vectors. That was our first requirement for a linear transformation.

Our second one is, we need to apply this to a scalar multiple of a vector in X. So, T of S, or let me say it this way, the composition of T with S applied to some scalar multiple c of some vector x that's in our set X. This is a vector x that's in our set X. This should be a capital X. This is equal to what? Well, by our definition of our composition, this is equal to the transformation T applied to the transformation S applied to c times our vector x. What is this equal to? Given that S is a linear transformation, we know that this can be rewritten as T applied to c times S applied to x. This little replacing that I did: S applied to c times x is the same thing as c times the linear transformation S applied to x.
This just comes out of the fact that S is a linear transformation. We've done that multiple times. Now we have T applied to some scalar multiple of some vector. We can do the same thing. We know that T is a linear transformation. We know that this is equal to, I'll do it down here, this is equal to c times T applied to S applied to some vector x that's in there. What is this equal to? This is equal to the constant c times the composition of T with S applied to our vector x right there. We've met our second requirement for a linear transformation. The composition as we've defined it is definitely a linear transformation.

This means that the composition of T with S can be written as some matrix-- let me write it this way-- the composition of T with S applied to some vector x can be written as some matrix times our vector x. And what will be the dimensions of our matrix? We're going from an n-dimensional space, so this is going to have n columns, to an l-dimensional space. So this is going to have l rows. This is going to be an l by n matrix.

I'll leave you there in this video. I realize I've been making too many 20-minute-plus videos. In the next video, now that we know this is a linear transformation and that we can represent it as a matrix-vector product, we'll actually figure out how to represent this matrix, especially in relation to the two matrices that define our transformations S and T.
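The spoken argument in this transcript can be written out symbolically. This is a sketch in the video's notation, with x and y vectors in X and c a scalar. The name C for the combined matrix is an assumption made here for illustration, since the video does not name it.

```latex
\begin{aligned}
(T \circ S)(\vec{x} + \vec{y})
  &= T\bigl(S(\vec{x} + \vec{y})\bigr)
     && \text{definition of } T \circ S \\
  &= T\bigl(S(\vec{x}) + S(\vec{y})\bigr)
     && S \text{ is linear} \\
  &= T\bigl(S(\vec{x})\bigr) + T\bigl(S(\vec{y})\bigr)
     && T \text{ is linear} \\
  &= (T \circ S)(\vec{x}) + (T \circ S)(\vec{y}) \\[6pt]
(T \circ S)(c\,\vec{x})
  &= T\bigl(S(c\,\vec{x})\bigr)
     && \text{definition of } T \circ S \\
  &= T\bigl(c\,S(\vec{x})\bigr)
     && S \text{ is linear} \\
  &= c\,T\bigl(S(\vec{x})\bigr)
     && T \text{ is linear} \\
  &= c\,(T \circ S)(\vec{x})
\end{aligned}
```

Both requirements hold, so there is some matrix C with

```latex
(T \circ S)(\vec{x}) = C\,\vec{x}, \qquad C \text{ is } l \times n,
```

since the input lives in Rn (n columns) and the output lives in Rl (l rows).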