- What do quadratic approximations look like
- Quadratic approximation formula, part 1
- Quadratic approximation formula, part 2
- Quadratic approximation example
- The Hessian matrix
- Expressing a quadratic form with a matrix
- Vector form of multivariable quadratic approximation
- The Hessian
- Quadratic approximation
A continuation from the previous video, leading to the full formula for the quadratic approximation of a two-variable function. Created by Grant Sanderson.
- Ok, now I have a serious question... which song did Grant start singing? :P(45 votes)
- Okay, I've got a question. I'm not sure if it's just me, but at this point I'm starting to mix this up with the Taylor series. Do these have some kind of connection? Is this some kind of multidimensional equivalent of the Taylor series?(23 votes)
- That is an exceptional point. This is in fact very closely related to the Taylor Series.
Just as functions can be multidimensional, so too can the Taylor Series.
How are they related?
Well, the Taylor Series is a means to represent some function (can be multidimensional) as a polynomial. I.e., of the form a+bx+cx^2+dx^3+...
Now, the Taylor Series can have infinitely many terms. For well-behaved functions, the more terms you keep, the closer the series is to the original function near the point you expand around. But, if we cut the Taylor Series short, say, by only including the terms up to x^1, we have ourselves a linear approximation (or a local linearisation) of the function. However, if we include all the terms in the Taylor Series up to x^2, we have ourselves a quadratic approximation to the original function.
So, to summarise, *approximations are just the Taylor Series cut short.*
-If you cut it at x, you've got a linear approximation
-If you cut it at x^2, you've got a quadratic approximation
-If you cut it at x^3, you've got a cubic approximation
-If you cut it at x^4, you've got a quartic approximation
and so on.
Hope this helps.(31 votes)
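As a small illustration of the "cut the series short" idea above, here is a sketch in Python. The choice of e^x expanded about 0, the sample point x = 0.5, and the helper name `taylor_exp` are assumptions made for this example; they are not from the video.

```python
import math

def taylor_exp(x, order):
    """Taylor polynomial of e^x about 0, keeping terms up to x**order."""
    return sum(x**k / math.factorial(k) for k in range(order + 1))

x = 0.5
orders = (1, 2, 3, 4)  # linear, quadratic, cubic, quartic
errors = [abs(math.exp(x) - taylor_exp(x, n)) for n in orders]

for n, err in zip(orders, errors):
    print(f"order {n}: error {err:.6f}")
```

For this particular function and point, each extra term shrinks the error, which matches the list above: linear, then quadratic, then cubic, and so on.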
- At 1:13, why do we want the 2nd partial derivative? I feel dull, but I'm missing the point.(4 votes)
- My understanding of this topic is still growing but I feel I have an understanding sufficient to give you the intuition for an appropriate answer to your question..what I'm trying to say is, I might be wrong but I think I'm right.
The reason you take the second derivative is that it tells you which way the graph curves around that region, i.e. upward or downward (think back to single-variable quadratics and what second derivatives do there: they tell you whether the point you're looking at is curving up or down, which is how you distinguish a min from a max). These second-derivative terms act as the control knobs that make your graph curve, and therefore hug your original function more closely, allowing for closer approximations. In other words, the second derivative is what turns the flat tangent plane into the curvy sheet that Grant showed at 9:15. If you're wondering why the first derivative wasn't used, it's because the first derivative gives you the tangent line, which is just a straight line, and that is what the linear terms already handle. Quadratic terms are what give the approximation a nonzero second derivative, and they also make that second derivative a constant, as mentioned at 5:44.
Hope that helps:)(14 votes)
- Why do we still need the linear part of the quadratic approximation function? Why can't we just throw it out and keep the ax^2 + bxy + cy^2 terms?(4 votes)
- Where can the article on quadratic approximations, mentioned at 9:03, be found?(2 votes)
- Aren't you just doing Taylor expansions, except that instead of approximating curves you're approximating surfaces?(2 votes)
- They both have some similarities, but we have to use slightly different methods because we are working in 3D.
But yes, the Taylor/Maclaurin expansion creates a quadratic that approximates a 2D graph, and now we are creating a quadratic to approximate 3D graphs, so we have the same ideas in mind.
Hope this helps,
- Convenient Colleague(1 vote)
- Why do they call it a quadratic approximation and not a local quadratic approximation? You are still talking about a specific point on the curve.(1 vote)
- I think the linear approximation is only around that point, but the quadratic approximation is able to approximate more than just the point. It is still local but it covers a lot more area and therefore should be differentiated from the local aspect!(1 vote)
- Like Dave already asked, is there a reason we started building this formula from the linear part of the quadratic approximation function? Why can't we just throw it out and keep the ax^2 + bxy + cy^2 terms?
That is what I would have expected, and Grant doesn't really explain why.
Thanks for your time. :)(1 vote)
- [Voiceover] ♫ Line things up a little bit right here. ♫ All right. So in the last video I set up the scaffolding for the quadratic approximation which I'm calling Q of a function, an arbitrary two variable function which I'm calling f, and the form that we have right now looks like quite a lot actually. We have six different terms. Now the first three were just basically stolen from the local linearization formula and written in their full abstractness. It almost makes it seem a little bit more complicated than it is. And then these next three terms are basically the quadratic parts. We have what is basically X squared. We take it as X minus X naught squared so that we don't mess with anything previously once we plug in X equals X naught, but basically we think of this as X squared. And then this here is basically X times Y, but of course we're matching each one of them with the corresponding X naught Y naught, and then this term is the Y squared. And the question at hand is how do we fill in these constants? The coefficients in front of each one of these quadratic terms to make it so that this guy Q hugs the graph of f as closely as possible. And I showed that in the very first video, kind of what that hugging means. Now in formulas, the goal here, I should probably state, what it is that we want is for the second partial derivatives of Q, so for example if we take the partial derivative with respect to X twice in a row, we want it to be the case that if you take that guy and you evaluate it at the point of interest, the point about which we are approximating, it should be the same as when you take the second partial derivative of f or the corresponding second partial derivative I should say since there's multiple different second partial derivatives, and you evaluate it at that same point. And of course we want this to be true not just with the second partial derivative with respect to X twice in a row, but if we did it with the other ones. 
Like for example, let's say we took the partial derivative first with respect to X, and then with respect to Y. This is called the mixed partial derivative. We want it to be the case that when we evaluate that at the point of interest it's the same as taking the mixed partial derivative of f with respect to X, and then with respect to Y, and we evaluate it at that same point. And remember, for almost all functions that you deal with, when you take this second partial derivative where we mix two of the variables, it doesn't matter the order in which you take them, right? You could take it first with respect to X then Y or you could take it first with respect to Y, and then with respect to X. Usually these guys are equal. There are some functions for which this isn't true, but we're going to basically assume that we're dealing with functions where this is. So, that's the only mixed partial derivative that we have to take into account. And I'll just kind of get rid of that guy there. And then, of course, the final one, just to have it on record here, is that we want the partial derivative when we take it with respect to Y two times in a row and we evaluate that at the same point, there's kind of a lot, there's a lot of writing that goes on with these things and that's just kind of par for the course when it comes to multi-variable calculus, but you take the partial derivative with respect to Y of both of them, and you want it to be the same value at this point. So even though there's a lot going on here, all I'm basically saying is all the second partial differential information should be the same for Q as it is for f. So, let's actually go up and take a look at our function and start thinking about what its partial derivatives are. What its first and second partial derivatives are. And to do that, let me first just kind of clear up some of the board here just to make it so we can actually start computing what this second partial derivative is.
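For readers following along without the board, the setup described so far can be sketched in symbols. The names (x_0, y_0) for the point of interest and the subscript shorthand for partial derivatives follow the transcript's "X naught, Y naught" and "f sub X"; the layout itself is only one reasonable way to write it down.

```latex
% The six-term scaffolding, with the constants a, b, c still unknown:
\begin{align*}
Q(x, y) ={}& f(x_0, y_0) + f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0) \\
           & + a\,(x - x_0)^2 + b\,(x - x_0)(y - y_0) + c\,(y - y_0)^2
\end{align*}

% The goal: every second partial derivative of Q matches that of f
% at the point of interest.
\begin{align*}
\frac{\partial^2 Q}{\partial x^2}(x_0, y_0) &= \frac{\partial^2 f}{\partial x^2}(x_0, y_0) \\
\frac{\partial^2 Q}{\partial x\,\partial y}(x_0, y_0) &= \frac{\partial^2 f}{\partial x\,\partial y}(x_0, y_0) \\
\frac{\partial^2 Q}{\partial y^2}(x_0, y_0) &= \frac{\partial^2 f}{\partial y^2}(x_0, y_0)
\end{align*}
```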
So let's go ahead and do it. First, this partial derivative with respect to X twice, what we'll do is I'll take one of those out and think partial derivative with respect to X. And then on the inside I'm going to put what the partial derivative of this entire expression with respect to X is. But we just take it one term at a time. This first term here is a constant, so that goes to zero. The second term here actually has the variable X in it. And when we take its partial derivative, since this is a linear term, it's just going to be that constant sitting in front of it. So it will be that constant which is the value of the partial derivative of f with respect to X evaluated at the point of interest. And that's just a constant. All right, so that's there. This next term has no Xs in it, so that's just going to go to zero. This term is interesting because it's got an X in it. So when we take its derivative with respect to X, that two comes down. So this will be two times a, whatever the constant a is, multiplied by X minus X naught. That's what the derivative of this component is with respect to X. Then this over here, this also has an X, but it's just showing up basically as a linear term. And when we treat Y as a constant, since we're taking the partial derivative with respect to X, what that ends up being is b multiplied by that, what looks like a constant as far as X is concerned, Y minus Y naught. And then the last term doesn't have any Xs in it. So that is the first partial derivative with respect to X. And now we do it again. Now we take the partial derivative with respect to X, and I'll hmm, maybe I should actually clear up even more of this guy. And now when we take the partial derivative of this expression with respect to X, f sub X of X naught, Y naught, that's just a constant, so that goes to zero. Two times a times X, that's going to, we take the derivative with respect to X and we're just going to get two times a.
And this last term doesn't have any Xs in it, so that also goes to zero. So conveniently, when we take the second partial derivative of Q with respect to X, we just get a constant. It's this constant two a. And since we want it to be the case, we want that this entire thing is equal to, well what do we want? We want it to be the second partial derivative of f both times with respect to X. So here I'm going to use the subscript notation. Over here I'm using the kind of Leibniz notation, but here just second partial derivative with respect to X, we want it to match whatever that looks like when we evaluate it at the point of interest. So what we could do to make that happen, to make sure that two a is equal to this guy, is we set a equal to one half of that second partial derivative evaluated at the point of interest. Okay. So this is something we kind of tuck away. We remember this is, we have solved for one of the constants. So now let's start thinking about another one of them. Well I guess actually I don't have to scroll off because let's say we just want to take the mixed partial derivative here where if instead of taking it with respect to X twice, we wanted to, let's see I'll kind of erase this, we wanted to first do it with respect to X, and then do it with respect to Y. Then we can kind of just edit what we have over here and we say, "we already took it with respect to X, so now as our second go we're going to be taking it with respect to Y." So in that case, instead of getting two a let's kind of figure out what it is that we get. When we take the derivative of this whole guy with respect to Y, well this looks like a constant. This here also looks like a constant since we're doing it with respect to Y and no Ys show up. And the partial derivative of this just ends up being b. So again, we just get a constant. This time it's b. Previously it was two a, but now it's just b. And this time we want it to equal the mixed partial derivative.
So instead of saying f sub XX, I'm going to say f XY which basically says you take the partial derivative first with respect to X and then with respect to Y. We want this guy to equal the value of that mixed partial derivative evaluated at that point. So that gives us another fact. That means we can just basically set b equal to that. And this is another fact, another constant that we can record. And now for c, when we're trying to figure out what that should be, the reasoning is almost identical. It's pretty much symmetric. We did everything that we did for the case X, and instead we do it for taking the partial derivative with respect to Y twice in a row, and I encourage you to do that for yourself. It'll definitely solidify everything that we're doing here because it can seem kind of like a lot and a lot of computations. But you're going to get basically the same conclusion you did for the constant a. It's going to be the case that you have the constant c is equal to one half of the second partial derivative of f with respect to Y, so you're differentiating with respect to Y twice evaluated at the point of interest. So this is going to be kind of the third fact. And the way that you get to that conclusion again, it's going to be almost identical to the way that we found this one for X. Now when you plug in these values for a, b and c, and these are constants, even though we've written them as formulas they are constants, when you plug those in to this full formula, you're going to get the quadratic approximation. It'll have six separate terms. One that corresponds to the constant, two that correspond to the linear part, and then three which correspond to the various quadratic terms. And if you wanted to dig into more details and kind of go through an example or two on this, I do have an article on quadratic approximations and hopefully you can kind of step through and do some of the computations yourself as you go.
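The computation narrated above can be summarized in a few lines. This is a written-out sketch of the steps as described in the transcript, using subscript notation throughout; the c case is the one left to the viewer, filled in here by the same reasoning as for a.

```latex
% First partial derivative of Q with respect to x, term by term:
\begin{align*}
Q_x(x, y) &= f_x(x_0, y_0) + 2a\,(x - x_0) + b\,(y - y_0)
\end{align*}

% Differentiating once more, and requiring a match with f at (x_0, y_0):
\begin{align*}
Q_{xx} = 2a = f_{xx}(x_0, y_0) \quad &\Longrightarrow \quad a = \tfrac{1}{2}\, f_{xx}(x_0, y_0) \\
Q_{xy} = b = f_{xy}(x_0, y_0) \quad &\Longrightarrow \quad b = f_{xy}(x_0, y_0) \\
Q_{yy} = 2c = f_{yy}(x_0, y_0) \quad &\Longrightarrow \quad c = \tfrac{1}{2}\, f_{yy}(x_0, y_0)
\end{align*}

% Plugging a, b, c back in gives the full six-term quadratic approximation:
\begin{align*}
Q(x, y) ={}& f(x_0, y_0) + f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0) \\
           & + \tfrac{1}{2} f_{xx}(x_0, y_0)\,(x - x_0)^2
             + f_{xy}(x_0, y_0)\,(x - x_0)(y - y_0)
             + \tfrac{1}{2} f_{yy}(x_0, y_0)\,(y - y_0)^2
\end{align*}
```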
But in all of this, even though there's a lot of formulas going on and it can be pretty notationally heavy, I want you to think back to that original graphical intuition, here, let me actually pull up the graphical intuition here. So if you're approximating a function near a specific point, the quadratic approximation looks like this curve where if you were to chop it in any direction it would be a parabola, but it's hugging the graph pretty closely. So it gives us a pretty close approximation. So even though there's a lot of formulas that go on to get us that, the ultimate visual and I think the ultimate intuition is actually a pretty sensible one. You're just hoping to find something that hugs the function nice and closely. And with that, I will see you next video.
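To see the "hugging" numerically rather than graphically, here is a minimal Python sketch. The example function f(x, y) = sin(x)cos(y), the point (1.0, 0.5), the nearby test point, and the helper name `make_approximations` are all assumptions chosen for illustration; the video does not use this function. The partial derivatives are computed by hand for this specific f.

```python
import math

def f(x, y):
    # Example function (an assumption for this sketch, not from the video).
    return math.sin(x) * math.cos(y)

def make_approximations(x0, y0):
    """Build the linear and quadratic approximations of f near (x0, y0)."""
    # Hand-computed partial derivatives of sin(x)cos(y) at (x0, y0).
    f0 = f(x0, y0)
    fx = math.cos(x0) * math.cos(y0)
    fy = -math.sin(x0) * math.sin(y0)
    fxx = -math.sin(x0) * math.cos(y0)
    fxy = -math.cos(x0) * math.sin(y0)
    fyy = -math.sin(x0) * math.cos(y0)

    def linear(x, y):
        # Constant term plus the two linear terms.
        return f0 + fx * (x - x0) + fy * (y - y0)

    def quadratic(x, y):
        # Linear part plus the three quadratic terms with
        # a = fxx / 2, b = fxy, c = fyy / 2.
        dx, dy = x - x0, y - y0
        return linear(x, y) + 0.5 * fxx * dx**2 + fxy * dx * dy + 0.5 * fyy * dy**2

    return linear, quadratic

lin, quad = make_approximations(1.0, 0.5)
x, y = 1.1, 0.6  # a point near (1.0, 0.5)
err_linear = abs(f(x, y) - lin(x, y))
err_quadratic = abs(f(x, y) - quad(x, y))
print(f"linear error:    {err_linear:.6f}")
print(f"quadratic error: {err_quadratic:.6f}")
```

At this nearby point the quadratic approximation lands noticeably closer to f than the linear one does, which is the numerical counterpart of the surface hugging the graph.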