© 2023 Khan Academy
Expressing a quadratic form with a matrix
How to write an expression like ax^2 + bxy + cy^2 using matrices and vectors. Created by Grant Sanderson.
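For reference, the identity can be written out in one line. This is a sketch under one stated assumption: the cross term is written as bxy, exactly as in the description above, so the symmetric matrix carries b/2 in each off-diagonal entry. The video itself writes the cross term as 2bxy, which puts b itself in those entries.

```latex
a x^2 + b x y + c y^2
  = \begin{bmatrix} x & y \end{bmatrix}
    \begin{bmatrix} a & b/2 \\ b/2 & c \end{bmatrix}
    \begin{bmatrix} x \\ y \end{bmatrix}
```

Multiplying out the right-hand side gives a x^2 + (b/2) xy + (b/2) xy + c y^2, so the two halves of the cross term recombine into bxy.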
Want to join the conversation?
- Is there a way to write a cubic form with vectors and matrices (or possibly tensors of rank 3)? What about forms of greater degrees?(10 votes)
- Why does Grant use the transposed vector instead of taking the dot product of the vector? The computations for both methods are the same, but what is the underlying meaning in using the transposed vector? So instead of x^T Mx it would be x ·Mx.(7 votes)
- You've answered your own question, so there's no point for me to answer this, but yes, we use the transposed version of x so that it becomes a 1 X 2 matrix, which can then be multiplied by a 2 X 1 matrix. Otherwise, the matrix multiplication is undefined.(2 votes)
- As Grant explained on 3B1B, multiplying by the transpose of a vector is the same as taking the dot product, so this quadratic form becomes Mx . x. Then why is the transpose notation used? The dot product seems easier (to me).(6 votes)
- In vector notation, it is (Mx)·x. But when the same is translated to matrices, you get x'Mx. This is just the way matrices implement a dot product. They are one and the same.(2 votes)
- Up till now we've always used dot product notation to represent a transpose inner product. Is there any advantage to writing it this way instead?(4 votes)
- Interesting to note is that you can multiply two quadratic forms with matrix multiplication, except you get an additional term |x|^[2(n - 1)] where n is the number of terms being multiplied.(3 votes)
- Other people have asked, but why not compute (Mx).x instead? Is it just a random choice?(3 votes)
- How do we determine the elements in the M matrix? I know it's gotta be symmetrical, but what other patterns are there?(3 votes)
- Shouldn't it be V^T * X for the linear form? If it is VX, then the dimensions are [3x1] * [3x1], so it can only be pointwise multiplication. To get the original ax + by + cz we need [1x3] * [3x1], so V needs to be transposed.(1 vote)
- What is the connection between this and positive definite matrices?(1 vote)
- Can the solution to the final problem be found anywhere? (x'Ax for [x y z] and [a b c \\ b d e \\ c e f])(0 votes)
Video transcript
- [Voiceover] Hey guys. There's one more thing I need to talk about before I can describe the vectorized form for the quadratic approximation of multivariable functions, which is a mouthful to say. So let's say you have some kind of expression that looks like a times x squared, and I'm thinking of x as a variable, plus b times xy, where y is another variable, plus c times y squared. I'm thinking of a, b and c as being constants and x and y as being variables.
Now, this kind of expression has a fancy name. It's called a quadratic form. And that always threw me off. I always kind of was like, what does form mean? I know what a quadratic expression is, and quadratic typically means something is squared or you have two variables multiplied together, but why do they call it a form? Basically it just means that the only things in here are quadratic. It's not the case that you have an x term sitting on its own, or a constant out here like two, when you're adding all of those together. Instead, you have purely quadratic terms. But of course, mathematicians don't want to call it just a purely quadratic expression. Instead they have to give a fancy name to things so that it seems more intimidating than it needs to be.
But anyways, so we have a quadratic form, and the question is: how can we express this in a vectorized sense? For analogy, let's think about linear terms, where let's say you have a times x plus b times y, and I'll throw another variable in there, another constant c times another variable z. If you see something like this, where every variable is just being multiplied by a constant and then you add terms like that to each other, we can express this nicely with vectors. You pile all of the constants into their own vector, a vector containing a, b and c, and you imagine the dot product between that and a vector that contains all of the variable components, x, y and z.
The convenience here is that you can have just a symbol, like a v let's say, which represents this whole constant vector, and then you take the dot product between that and another symbol, maybe a bold faced x, which represents a vector that contains all of the variables. This way, your notation just kind of looks like a constant times a variable. Just like in the single variable world, when you have a constant number times a variable number, it's kind of like taking a constant vector times a variable vector. And the importance of writing things down like this is that v could be a vector that contains not just three numbers but a hundred numbers, and then x would have a hundred corresponding variables, and the notation doesn't become any more complicated. It's generalizable to higher dimensions.
So the question is, can we do something similar with our quadratic form? Because you can imagine, let's say we started introducing the variable z. Then you would have to have some other term, some other constant times the xz quadratic term, and then some other constant times the z squared quadratic term, and another one for the yz quadratic term, and it would get out of hand. As soon as you start introducing things like a hundred variables, it would get seriously out of hand, because there's a lot of different quadratic terms. So we want a nice way to express this.
I'm just going to kind of show you how we do it, and then we'll work it through to see why it makes sense. Usually, instead of thinking of b times xy, we actually think of this as two times some constant times xy. This of course doesn't make a difference. You would just change what b represents, but you'll see why it's more convenient to write it this way in just a moment.
So the vectorized way to describe a quadratic form like this is to take a matrix, a two by two matrix since this is two dimensions, where a and c are on the diagonal and then b is on the other diagonal. We always think of these as being symmetric matrices, so if you imagine reflecting the whole matrix about this line, the main diagonal, you'll get the same numbers. It's important that we have that kind of symmetry. Now what you do is you multiply the variable vector, the one that's got x, y, on the right side of this matrix, and then you multiply it again, but you turn it on its side. So instead of being a vertical vector, you transpose it to being a horizontal vector on the other side. This is a little bit analogous to having two variables multiplied in. You have two vectors multiplied in, but on either side.
And this is a good point, by the way, if you are uncomfortable with matrix multiplication, to maybe pause the video, go find the videos about matrix multiplication and kind of refresh or learn about that, because moving forward, I'm just going to assume that it's something you're familiar with.
So going about computing this, first, let's tackle this right multiplication here. We have a matrix multiplied by a vector. For the first component that we get, we multiply the top row by each corresponding term in the vector, so it'll be a times x plus b times y. Then similarly for the bottom component, we take the bottom row and multiply the corresponding terms, so b times x plus c times y. That's what it looks like when we do that right multiplication, and of course we've got to keep our transposed vector over there on the left side.
So now this is just a two by one vector, and this is a one by two. You could think of it as a horizontal vector or a one by two matrix. When we multiply these guys, you just kind of line up the corresponding terms. You'll have x multiplied by that entire top expression, so x multiplied by ax plus by, and then we add that to the second term, y, multiplied by the second term of this guy, which is bx plus cy, so y multiplied by bx plus cy.
All of these are numbers, so we can simplify once we start distributing. The first term is x times a times x, so that's ax squared. The next term is x times b times y, so that's b times xy. Over here, we have y times b times x, so that's the same thing as b times xy. That's why it's convenient to write a two there, because it naturally comes out of our expansion. And then the last term is y times c times y, so that's cy squared. So we get back the original quadratic form that we were shooting for: ax squared plus two bxy plus cy squared. That's how this entire term expands. As you work it through, you end up with the same quadratic expression.
Now, the convenience of this quadratic form being written with a matrix like this is that we can write it more abstractly. Instead of writing the whole matrix in, you could just let a letter like M represent that whole matrix, and then take the vector that represents the variables, maybe a bold faced x, and you multiply it on the right, and then you transpose it and multiply it on the left. Typically you denote that by putting a little T as a superscript, so x transposed multiplied by the matrix from the left. This expression is what a quadratic form looks like in vectorized form, and the convenience is the same as it was in the linear case. Just like v could represent something that had a hundred different numbers in it and x would have a hundred different variables, you could do something similar here, where you can write that same expression even if the matrix M is super huge.
Let's just see what this would look like in a three dimensional circumstance. Actually, I'll need more room, so I'll go down even further. So we have x transpose multiplied by the matrix multiplied by x, bold faced x. Let's say this represented x, then y, then z, our transposed vector, and then our matrix, let's say, was a, b, c, then d, e, then f. Because it needs to be symmetric, whatever term is in this spot here, b, needs to be the same as over here, when you reflect it about that diagonal. Similarly, c is going to be the same term here, and e would be over here. So there's only really six free terms that you have, but it fills up this entire matrix. Then on the right side, we would multiply that by x, y, z.
Now, I won't work it out in this video, but you can imagine actually multiplying this matrix by this vector, and then multiplying the vector that you get by this transposed vector, and you'll get some kind of quadratic form with three variables. The point is, you'll get a very complicated one, but it's very simple to express things like this. So with that tool in hand, in the next video, I will talk about how we can use this notation to express the quadratic approximations for multivariable functions. See you then.
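The two expansions described in the transcript can be checked numerically. What follows is a minimal Python sketch, not something shown in the video. The helper names `quadratic_form`, `symmetric_2x2` and `symmetric_3x3` and the sample numbers are made up for illustration. The three-variable layout [[a, b, c], [b, d, e], [c, e, f]] is an assumption read off from the symmetry described above, and the expanded three-variable formula is the standard result of carrying out the multiplication the video leaves to the viewer.

```python
def quadratic_form(M, x):
    """Return x^T M x for a square matrix M (list of rows) and a vector x (list)."""
    n = len(x)
    # Right multiplication first: entry i of Mx is row i of M dotted with x.
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Then the transposed vector on the left: the dot product of x with Mx.
    return sum(x[i] * Mx[i] for i in range(n))


def symmetric_2x2(a, b, c):
    """Hypothetical helper: a and c on the diagonal, b in both off-diagonal spots."""
    return [[a, b],
            [b, c]]


def symmetric_3x3(a, b, c, d, e, f):
    """Hypothetical helper: six free terms fill the whole matrix by reflection."""
    return [[a, b, c],
            [b, d, e],
            [c, e, f]]


# Two variables: x^T M x should equal a*x^2 + 2*b*x*y + c*y^2.
a, b, c = 3, -2, 5
x, y = 2, 7
two_var = quadratic_form(symmetric_2x2(a, b, c), [x, y])
assert two_var == a * x**2 + 2 * b * x * y + c * y**2

# Three variables: squared terms come from the diagonal, and each
# off-diagonal entry appears twice, which doubles the cross terms.
a, b, c, d, e, f = 1, 2, 3, 4, 5, 6
x, y, z = 1, -1, 2
three_var = quadratic_form(symmetric_3x3(a, b, c, d, e, f), [x, y, z])
expanded = (a * x**2 + d * y**2 + f * z**2
            + 2 * b * x * y + 2 * c * x * z + 2 * e * y * z)
assert three_var == expanded

print(two_var, three_var)
```

Computing Mx first and then taking its dot product with x mirrors the order used in the transcript, and it is the same calculation as the (Mx)·x form that several commenters ask about. The function works unchanged for any number of variables, which is the point of the notation.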