An introduction to how the Jacobian matrix represents what a multivariable function looks like locally, as a linear transformation.
Want to join the conversation?
- Is the Jacobian matrix an extension of the gradient?(12 votes)
- I've always seen it that way personally.
If you take an N×3 matrix [ u v w ], where u, v and w are N-dimensional column vectors that represent the new basis vectors in our output space, then the Jacobian is similarly an N×3 matrix [ df/dx df/dy df/dz ], where df/dx is the column vector [df1/dx ; df2/dx ; ... ; dfN/dx], and likewise for df/dy and df/dz. In this case f is a function from R³ to R^N.
If you take a scalar-valued function (g from R³ to R¹, for example), then [ dg/dx dg/dy dg/dz ] is your gradient as a row vector! Note, though, that the gradient is usually written as a column vector, so be careful. There is probably some explanation as to why, but I don't know it.(9 votes)
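That row-vector picture is easy to check numerically: approximate each column of the Jacobian with a finite difference, and for a scalar-valued function you get a single row, which is the gradient. A minimal sketch (the function g below is a made-up example, and the finite-difference helper is not from the video):

```python
import math

def g(p):
    # hypothetical scalar-valued function g: R^3 -> R^1
    x, y, z = p
    return [x * y + math.sin(z)]

def jacobian_fd(f, p, h=1e-6):
    # finite-difference Jacobian: column i approximates df/dp_i
    f0 = f(p)
    J = [[0.0] * len(p) for _ in f0]
    for i in range(len(p)):
        dp = list(p)
        dp[i] += h
        fi = f(dp)
        for r in range(len(f0)):
            J[r][i] = (fi[r] - f0[r]) / h
    return J

J = jacobian_fd(g, [1.0, 2.0, 0.0])
print(J)  # a single row, approximately [[2.0, 1.0, 1.0]]: the gradient of g
```

Since g has one output, the Jacobian is 1×3 — exactly the gradient laid out as a row.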
- At 2:01, could anyone please explain to me what Sal means by the input space and the output space?(6 votes)
- Basically, you can think of the "Input Space" as all the possible vectors that could be used as an input to the function f and all the possible vectors that could be the result as making up the "Output Space". So for f(x) = y, all the possible x vectors make up the input space and all the possible y vectors make up the output space.
If you want to learn more about what he's talking about then check out the videos in the Linear Algebra section--Sal does a great job explaining concepts like vector fields and subspaces and all that, which Grant (the guy who made this video--not Sal) doesn't really cover in these videos.(5 votes)
- Is it safe to say that the Jacobian matrix tells you how much the bases transformed from input space to output space?(6 votes)
- No, that's what the matrix at the beginning of the video does.
I would say the Jacobian matrix tells you how values change when you move around on a parametric surface (like how the slope changes when you take different points on a curve, but now in 2D).(4 votes)
- What is an example of a transformation that does not have local linearity?(3 votes)
- Local linearity is just another term for "differentiability", but it emphasizes the geometric perspective described in the video (in the same way that "transformation" means "function", but emphasizes this kind of perspective).
So if you simply want an example of a vector function that's not differentiable everywhere, F(x,y)=(abs[x],abs[y]) would do the trick.(6 votes)
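That non-differentiability can be seen numerically: at the origin, the two one-sided difference quotients of the x-component |x| disagree, so no single linear map can approximate the transformation there. A small sketch:

```python
def F1(x):
    # x-component of F(x, y) = (|x|, |y|)
    return abs(x)

h = 1e-6
right_slope = (F1(0.0 + h) - F1(0.0)) / h   # slope approaching from the right
left_slope  = (F1(0.0) - F1(0.0 - h)) / h   # slope approaching from the left
print(right_slope, left_slope)  # 1.0 and -1.0: the two sides disagree
```

Away from the axes the function is perfectly differentiable; the failure of local linearity only happens where an absolute value has its corner.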
- In the Jacobian matrix, if we replace the first derivative with the 2nd, the 3rd, or an even higher-order derivative, would it not make for an even more accurate local representation than a linear one? If yes, why are these higher-order derivatives not used in the Jacobian matrix?(3 votes)
- Keep watching the videos and you will get to the Hessian matrix - which is exactly what you're referring to.(4 votes)
- At 02:27, what does Sal mean when he says that by dividing $\partial f_1$ by $\partial x$, "it scales up to be a normal-sized vector"?
Both are tiny, infinitesimally small quantities, but since they are of similar sizes, the ratio is a constant? Am I right in this line of thinking?
Also, he says the ratio does not shrink when we zoom in further and further, which seems to imply that the numerator and denominator would shrink if we changed the scale of the graph in order to zoom in. I don't understand that: if we zoomed in, the small changes $\partial f_1$ and $\partial x$ ought to grow larger, right?(1 vote)
- The way I understood what he was saying was that df1/dx is the ratio of the change in f1 to the change in x; it is the factor by which the x component in the input space was scaled to get the new x component (f1) in the output space. As you can see in the video, when the transformation is performed, the new x-component appears to be stretched and larger than the original "tiny step in the x-direction", as he described it.
If you were to zoom in a lot in the output space, the changes partial f1 and partial x would appear to be equal, or at least closer in size (this is what happens with differentials that approximate the change in the independent variable of a function: in single-variable functions, dy approaches delta y as we "zoom in" or decrease delta x to be infinitesimally small). So to have an objective sense of how different the partials are, we take their "ratio", which effectively and mathematically means we take a partial derivative. That's how I've come to understand it. I'm still a freshman at university.(5 votes)
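The point about the ratio not shrinking can also be checked numerically: as the nudge h gets smaller, both the input nudge and the output change shrink together, so their quotient stabilizes at the partial derivative. A minimal sketch using the video's own f1(x, y) = x + sin(y) at the point (-2, 1):

```python
import math

def f1(x, y):
    # x-component of the video's transformation: f1(x, y) = x + sin(y)
    return x + math.sin(y)

x0, y0 = -2.0, 1.0
for h in [0.1, 0.01, 0.001]:
    ratio = (f1(x0, y0 + h) - f1(x0, y0)) / h  # partial f1 / partial y as a ratio of nudges
    print(h, ratio)
# both nudges shrink with h, but the ratio settles near cos(1) ≈ 0.5403
```

The individual changes go to zero, but the ratio converges to a fixed number — that is exactly the "doesn't shrink as we zoom in" behavior described in the video.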
- I've got two questions :)
1. Firstly, how is the function locally linear if, even when we zoom in on a small area of it, the lines that seem linear obviously are not? (They are still part of the sine wave.)
2. And then, in that case, aren't all transformations locally linear?(3 votes)
- Am I missing something, or should the change in f1 caused by the change in x (the green dashed line at 2:14) not go all the way to the gridline?(2 votes)
- Can we say that finding the Jacobian for transformations is like finding the equation of the tangent line to a single variable function?(2 votes)
- Is the simulation available online? If not, where can I find a similar simulation of the visual representation of the transformation to play with myself?(2 votes)
- [Narrator] In the last video we were looking at this particular function. It's a very nonlinear function. And we were picturing it as a transformation that takes every point x, y in space to the point x plus sine of y, y plus sine of x. And moreover, we zoomed in on a specific point. And let me actually write down what point we zoomed in on, it was (-2, 1). That's something we're gonna want to record here, (-2, 1). And I added a couple extra grid lines around it just so we can see in detail what the transformation does to points that are in the neighborhood of that point. And over here, this square shows the zoomed-in version of that neighborhood. And what we saw is that even though the function as a whole, as a transformation, looks rather complicated, around that one point, it looks like a linear function. It's locally linear, so what I'll show you here is what matrix is gonna tell you the linear function that this looks like. And this is gonna be kind of a two-by-two matrix. I'll make a lot of room for ourselves here. It'll be a two-by-two matrix, and the way to think about it is to first go back to our original setup before the transformation, and think of just a tiny step to the right. What I'm gonna think of as a little partial x, a tiny step in the x direction. And what that turns into after the transformation is gonna be some tiny step in the output space. And here, let me actually kind of draw on what that tiny step turned into. It's no longer purely in the x direction. It has some rightward component, but now also some downward component. And to be able to represent this in a nice way, what I'm gonna do is, instead of writing the entire function as something with a vector-valued output, I'm gonna go ahead and represent this as two separate scalar-valued functions. I'm gonna write the scalar-valued function f1 of x, y. So I'm just giving a name to x plus sine of y. And f2 of x, y. Again, all I'm doing is giving a name to the functions we already have written down.
When I look at this vector, the consequence of taking a tiny dx step in the input space, that corresponds to some two-dimensional movement in the output space. And the x component of that movement, right, if I was gonna draw this out and say, hey, what's the x component of that movement, that's something we think of as a little partial change in f1, the x component of our output. And if we divide this, if we take, you know, partial f1 divided by the size of that initial tiny change, it basically scales it up to be a normal-sized vector. Not a tiny nudge, but something that's more constant, that doesn't shrink as we zoom in further and further. And then similarly the change in the y direction, right, the vertical component of that step that was still caused by the dx. Right, it's still caused by that initial step to the right. That is gonna be the tiny partial change in f2, the y component of the output, 'cause here we're all just looking in the output space, that was caused by a partial change in the x direction. And again, I kind of like to think about this: we're dividing by a tiny amount. This partial f2 is really a tiny, tiny nudge, but by dividing by the size of the initial tiny nudge that caused it, we're getting something that's basically a number, something that doesn't shrink when we consider more and more zoomed-in versions. So that, that's all what happens when we take a tiny step in the x direction. But another thing you could do, another thing you can consider, is a tiny step in the y direction. Right, 'cause we wanna know, hey, if you take a single step, some tiny unit upward, what does that turn into after the transformation? And what that looks like is this vector that still has some upward component, but it also has a rightward component. And now I'm gonna write its components as the second column of the matrix.
Because as we know, when you're representing a linear transformation with a matrix, the first column tells you where the first basis vector goes and the second column shows where the second basis vector goes. If that feels unfamiliar, either check out the refresher video or maybe go and look at some of the linear algebra content. But to figure out the coordinates of this guy, we do basically the same thing. Let's say, first of all, the change in the x direction here, the x component of this nudge vector, that's gonna be given as a partial change to f1, right, to the x component of the output. Here we're looking in the output space. We're dealing with f1 and f2, and we're asking what that change was that was caused by a tiny change in the y direction. So the change in f1 caused by some tiny step in the y direction, divided by the size of that tiny step. And then the y component of our output here, the y component of the step in the output space that was caused by the initial tiny step upward in the input space, well, that is the change of f2, the second component of our output, as caused by dy, as caused by that little partial y. And of course, all of this is very specific to the point that we started at, right? We started at the point (-2, 1). So each of these partial derivatives is something where really we're saying, take the function and evaluate it at the point (-2, 1), and when you evaluate each one of these at the point (-2, 1) you'll get some number. And that will give you a very concrete two-by-two matrix that's gonna represent the linear transformation that this guy looks like once you've zoomed in. So this matrix here that's full of all of the partial derivatives has a very special name. It's called, as you may have guessed, the Jacobian, or more fully you'd call it the Jacobian matrix. And one way to think about it is that it carries all of the partial differential information, right?
It's taking into account both of these components of the output and both possible inputs, and giving you kind of a grid of what all the partial derivatives are. But as I hope you see, it's much more than just a way of recording what all the partial derivatives are. There's a reason for organizing it like this in particular, and it really does come down to this idea of local linearity. If you understand that the Jacobian matrix is fundamentally supposed to represent what a transformation looks like when you zoom in near a specific point, almost everything else about it will start to fall into place. And in the next video, I'll go ahead and actually compute this, just to show you what the process looks like and how the result we get kind of matches with the picture we're looking at. See you then.
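For readers who want to peek ahead at the computation the next video walks through: for this particular function the four partial derivatives work out by hand, and evaluating them at (-2, 1) gives the concrete matrix. A minimal sketch (a preview under the video's own setup, not a substitute for the worked derivation):

```python
import math

# For f(x, y) = (x + sin(y), y + sin(x)), the partial derivatives are:
#   df1/dx = 1        df1/dy = cos(y)
#   df2/dx = cos(x)   df2/dy = 1
def jacobian(x, y):
    return [[1.0,         math.cos(y)],
            [math.cos(x), 1.0        ]]

J = jacobian(-2.0, 1.0)
print(J[0])  # first row: [1.0, cos(1)] ≈ [1.0, 0.5403]
print(J[1])  # second row: [cos(-2), 1.0] ≈ [-0.4161, 1.0]
```

The columns are exactly the two nudge vectors from the video: the first column is where a tiny step in the x direction lands, and the second column is where a tiny step in the y direction lands.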