## Linear algebra


© 2023 Khan Academy

# Vector triangle inequality

Proving the triangle inequality for vectors in Rn. Created by Sal Khan.

## Want to join the conversation?

- Why should he stop saying magnitude, when the definition of a vector is having both magnitude and direction? The positive or negative sign indicates direction, and the magnitude is the absolute value of the vector^2, which really is neither positive nor negative.(23 votes)
- If I understand Sal, he is trying to avoid the confusion between scalar and vector quantities. A vector has both length and direction - what is referred to as magnitude. (The vector [x] (1, 2) has exactly the same length as vector [y] (-1, -2), but a different direction, hence a different magnitude, and therefore, [x] <> [y]). On the other hand, the length of the two vectors is equal, hence ||x|| = ||y||. In the process of calculating ||x||, all information regarding direction is lost.

A good analogy can be found in physics in the distinction between speed and velocity. An object in uniform circular motion has constant speed but has an ever changing velocity - because velocity includes direction.(22 votes)

- What about ||vector x|| - ||vector y|| <= ||vector(x+y)||? What happens when c is negative? Can this part then be proved?(18 votes)
- Note that `||x|| - ||y|| <= ||x+y||` is a much less restrictive statement than `||x|| + ||y|| <= ||x + y||`. All of the lengths are (by definition) positive values. Multiplying one of them by -1 on the left side of the inequality just makes that side even less than before.

Perhaps it's helpful to think of the axiomatic statement `-||x|| <= ||x||`, since all "negative or zero" values are less than or equal to all "positive or zero" values.(7 votes)
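As a supplementary sketch (not part of the original answer), the inequality asked about can also be derived directly from the triangle inequality proved in the video. The idea is to write x as (x + y) + (-y) and use the fact that -y has the same length as y:

$$
\|x\| = \|(x + y) + (-y)\| \le \|x + y\| + \|-y\| = \|x + y\| + \|y\|
\quad\Longrightarrow\quad
\|x\| - \|y\| \le \|x + y\|
$$

This does not depend on whether x is a positive or a negative scalar multiple of y, so the sign of c does not matter for this particular inequality.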

- Does the equality work if vector x or y are zero? I know we assumed they were Not Zero for the proof, but it seems like if we let either one or both be zero, it still holds true.(9 votes)
- But yes, because if you consider the case where one or both is zero separately, it's easy to see that both sides of the Cauchy Schwarz inequality go to zero. And also || x + 0 || = || x || and || x || + || 0 || = || x ||.(9 votes)
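To make the zero-vector case concrete, here is a minimal Python sketch. The helper names `norm` and `add` and the sample vector are illustrative assumptions, not something from the video.

```python
import math

def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))

def add(x, y):
    """Component-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(x, y)]

x = [3.0, 4.0]      # sample vector, chosen so the length is exactly 5
zero = [0.0, 0.0]   # the zero vector in R2

lhs = norm(add(x, zero))      # ||x + 0||
rhs = norm(x) + norm(zero)    # ||x|| + ||0||
print(lhs, rhs)               # prints: 5.0 5.0
```

Both sides agree, so the inequality holds (as an equality) when one of the vectors is zero.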

- At 5:38, Sal says "I should stop using the term magnitude." Why?(5 votes)
- In 1 or 2 dimensions the vector is graph-able, and length communicates the idea in a more commonly understood way.

The terms are interchangeable (personally, I was taught to use magnitude).(10 votes)

- Why is x dot y <= | x dot y | ?(2 votes)
- Remember that x dot y (I'll write it as x*y) is a real number.

If it's positive, then x*y = |x*y|.

If it's negative, then x*y < |x*y|.

Therefore x*y <= |x*y|.(12 votes)
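A small Python sketch of the negative case, using the situation Sal describes where x has all positive terms and y has all negative terms. The function name `dot` and the sample numbers are assumptions made for illustration.

```python
def dot(x, y):
    """Dot product of two equal-length sequences of numbers."""
    return sum(a * b for a, b in zip(x, y))

x = [1.0, 2.0, 3.0]       # all positive components
y = [-4.0, -5.0, -6.0]    # all negative components

d = dot(x, y)             # 1*(-4) + 2*(-5) + 3*(-6)
print(d, abs(d))          # prints: -32.0 32.0
print(d <= abs(d))        # prints: True
```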

- Sal corrects himself a couple times when he says "magnitude", saying that he should say "length" instead. But I thought magnitude and length are the same thing?(3 votes)
- Perhaps "length" is slightly more precise and clearly refers to vectors; even scalar values have "magnitude," but it wouldn't make sense to talk about the "length" of a scalar.(5 votes)

- what is the difference between the length and the magnitude of a vector?(2 votes)
- When adding two vectors in n-space, don't the three points that define this addition (the tail of the first vector, the head of the first vector/tail of the second vector, and the head of the second vector) just define another plane (or set of planes along the same line for the degenerative case) that's in some equivalent version of R2? So it seems to me that these relationships between pairs of added vectors are always in an R2(-like) space. So I'm not really sure what Sal means about extending these definitions beyond R2.(2 votes)
- Dave, you are correct in your observation that the three vectors considered are all co-planar. This observation is very helpful for having an intuitive grasp of what is going on.

Mathematically, however, an R2-like space is not a well-defined concept. Consider our n-dimensional vectors as before. They are co-planar. We could perform a change of basis, such that the vectors lie along the x-y plane, and then it would act much like in the 2D case, so we can use two dimensional reasoning. However, by extending to Rn, this essentially picks up the flat plane of these vectors and puts it at some angle in n-dimensional space. It is not necessarily immediately obvious that the situation can be reduced to two dimensions.

Extending to n-dimensions is more useful if you have more vectors. Say we have more than the vectors **x**, **y** and **x** + **y**. Say we have another vector which is not co-planar to these vectors in our problem, and we wanted to do some maths or operation which included the triangle inequality as one of the steps. If the triangle inequality only applied to two dimensions, then we would not be able to use it in this case. However, since we have proven it in n dimensions, we can use it even if we have other vectors in the problem which aren't co-planar.(2 votes)
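As a rough numerical illustration (a spot check, not a proof), the sketch below tests the triangle inequality on random vectors with 100 components, the R100 case mentioned in the video. The helper names, the seed, the number of trials and the rounding tolerance `tol` are all assumptions chosen for this example.

```python
import math
import random

def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))

def add(x, y):
    """Component-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(x, y)]

def triangle_holds(x, y, tol=1e-9):
    """True if ||x + y|| <= ||x|| + ||y||, up to a small rounding tolerance."""
    return norm(add(x, y)) <= norm(x) + norm(y) + tol

random.seed(0)   # fixed seed so the run is repeatable
n = 100          # number of components per vector
violations = 0
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    y = [random.uniform(-1.0, 1.0) for _ in range(n)]
    if not triangle_holds(x, y):
        violations += 1

print(violations)  # no violations are expected
```

Nothing in the check refers to a plane or to R2. It only uses component-wise addition and the length formula, which is the point of proving the result in Rn.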

- Why did he write ||x+y|| squared as ||x||^2 + 2(x.y) + ||y||^2? Here he talks about the magnitudes of x and y, but why not about 2(x.y)?(2 votes)
- The magnitude squared of a vector is the same thing as the dot product of that vector with itself. The dot product is also distributive across vector addition. Therefore:

`|x+y|² = (x+y).(x+y)` (rewritten)

`= x.x + 2x.y + y.y` (multiplied out)

`= |x|² + 2x.y + |y|²` (rewritten again)(1 vote)
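The expansion can be checked on concrete numbers. In this minimal Python sketch, x is the vector (2, 4) drawn in the video, while y and the helper names `dot` and `add` are assumptions picked for illustration.

```python
def dot(x, y):
    """Dot product of two equal-length sequences of numbers."""
    return sum(a * b for a, b in zip(x, y))

def add(x, y):
    """Component-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(x, y)]

x = [2.0, 4.0]
y = [3.0, -1.0]

s = add(x, y)                                    # x + y = [5.0, 3.0]
lhs = dot(s, s)                                  # |x+y|^2 = (x+y).(x+y)
rhs = dot(x, x) + 2 * dot(x, y) + dot(y, y)      # |x|^2 + 2x.y + |y|^2
print(lhs, rhs)                                  # prints: 34.0 34.0
```

Note that 2x.y is already a plain number (here 2 * 2.0 = 4.0), which is why no length bars are needed around it.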

- is magnitude and length the same thing?(2 votes)
- Not necessarily. Most of the time yes, but sometimes vectors are not drawn to scale and thus the magnitude will not match the length of the vector arrow.(1 vote)

## Video transcript

In the last video, we showed
you the Cauchy-Schwarz Inequality. I think it's worth rewriting
because this is something that's going to show up a lot. It's a very useful tool. And that just told us if I have
two vectors, x and y, they're both members of Rn. And they're both nonzero
vectors. And that was an assumption we
had to make when we did the proof, otherwise there was a
potential of dividing by one of their magnitudes. So that would've been
a big no-no. But if we assume that they're
both nonzero, then we can say that the absolute value of their
dot product is going to be less than or equal
to the product of their individual lengths. So that's the length of vector x
and we defined that a couple of videos ago. And then this is the
length of vector y. And of course, this is just a
regular number and then each of these are just
regular numbers. They're not vectors once
you take a length. The length of a 50 dimensional
vector could just be the number 3. It's just a scalar value. So this is just scalar
multiplication here. And we also learned that the
only time that this inequality turns into an equality is the
situation where x is equal to some scalar multiple of y. And so in some textbooks you'll
say-- and this has to be a nonzero scalar multiple. But that's a bit obvious. I told you that x and
y are nonzero. So if this was 0 then
x would be 0. And we already said
that x is not 0. But if you want to say there you
could say that you know c also is going to be nonzero. But that essentially
just falls out of this information there. But if this is the case and if
and only if this is the case, then we can say that the
absolute value of the dot product of the two vectors
is equal to the product of their lengths. Now, this is all just
a review of what I did in the last video. Now what else can we do
that's useful with it? So let's just play around
a little bit. I can't claim to be
experimenting, I know where this is going to go. Let's see what happens
if I were to take the length of x plus y. So I'm going to add the two
vectors and then take the length of that vector squared. Well we know from a couple of
videos ago that the length squared can also be rewritten as
the dot product of a vector with itself. This right here, x
plus y, I know it looks like two vectors. But it's two vectors added
to each other. So it's really a vector. x
plus y is a real vector. I could graph x plus y. So the length of x plus y
squared, I can rewrite it as the dot product of that
vector with itself. So x plus y dot x plus y. And all of these are vectors. These aren't just numbers. And this is the dot product. It's just not normal
multiplication. But we saw two videos ago that
the dot product has the distributive and the
associative and the commutative properties just
like regular scalar multiplication. So you can kind of FOIL this out
if that's how you remember multiplying your binomials. Or I will think of it more as
just doing the distributive property twice. So this can be rewritten
as x dot x. Actually, let me write it as
the distributive property because that's sometimes not
obvious to a lot of people. So let me write this x as a
yellow x and let me write this whole term as x plus y. So this right here can be
rewritten as x dot-- so this x dot this x plus y. And then it would be plus
this y dot-- I want to just switch colors. Plus this y dot x plus y. It's good to see that when
you're multiplying these, you're just applying the
distributive property. All I did is I distributed this
term along each of these terms into sum right here. So then I got this. And then I can distribute each
of these into the sum. So then this becomes-- I'll be
careful with the colors-- x dot x plus x dot y. Maybe this was a little bit
overkill, but I think it's good to see that this isn't
just some magic here. And we're just using the exact
properties that we proved with the dot product. So that's that right there. And then it's plus y dot x. Plus this yellow y
dot the yellow x. Sorry, dot the blue y. So the magnitude or the length
of our vector x plus y squared can be rewritten like this. And I'll just switch
back to one color. So this equals that and all of
that-- what does this equal? This is equal to x dot x. What's x dot x? x dot x is just the magnitude. So let me write this is just
equal to the magnitude of our vector x. I should stop using the
word magnitude. The length of our vector
x squared. And then I have two
terms here. I have an x dot y
and a y dot x. We know that x dot y and y dot
x are really the same thing. We showed that order doesn't
matter when you take the dot product, just like it doesn't
matter with regular multiplication. So these are really the same
terms. So we could write plus 2 times x dot y. And then finally, we have that
last term sitting here. We have this y dot y. y dot y is the same thing
as the length of our vector y squared. Now, let's see if we can break
out our Cauchy-Schwarz Inequality. Or maybe Schwarz, I don't know
if I'm pronouncing it right. But x dot y. We have the absolute value
of x dot y here. But we know that just x dot y is
going to be-- it has to be less than or equal to the
absolute value of x dot y. Why is that? Well this could be negative. I could show you
examples of dot products that are negative. In fact, if x has all positive
terms and y has all negative terms, you're going to have
a negative dot product. So this could be negative
or it could be positive. If it's positive the absolute
value-- they're equal to each other. If this is negative, then this
absolute value is definitely going to be greater than it. We can add to the Cauchy-Schwarz
Inequality and this is a bit obvious. We could say look, we could add
a little x dot y is less than or equal to the absolute
value of x dot y. Which is less than or equal to
the length of x times the length of y. So x dot y is definite-- this,
the dot product of x with y is definitely less than its
absolute value of that. Which is definitely less than
the lengths of those two multiplied. So if I rewrite this, this
statement right here is definitely less than or equal
to this exact statement. But if I replace these with the
lengths of the vectors. So that is definitely less than
or equal to-- I'm just rewriting this x squared and
I'll write the plus 2 there. Plus 2. But I want to make it very clear
what I'm replacing here. And then I have the
plus length of my vector y there squared. Now this I'm saying, this is
definitely less than the absolute value of x dot y. Which is definitely less, by the
Cauchy-Schwarz Inequality, definitely less than the product
of the two lengths. So I'm just replacing this
with the product of their two lengths. So I'm going to put the length
of x times the length of y. And since this is the
same as that, this is the same as that. But this is definitely
less than this. This whole term has to be less
than this whole term. Now let me just remind you
what we were doing. I said that this thing that I
wrote over here, this is the same as that. So this thing up here, which is
the same as that, which is less than that also. So we can write that the
magnitude of x plus y squared and not the magnitude, the
length of the vector x plus y squared is less than
this whole thing that I wrote out here. Or less than or equal to. Now, what is this thing? Remember, I mean this might look
all fancy with my little double lines around
everything. But these are just numbers. This length of x squared,
this is just a number. Each of these are numbers and I
can just say hey, look, this looks like a perfect
square to me. This term on the right-hand side
is the exact same thing as the length of x plus
the length of y. Everything squared. If you just squared this out
you'll get the length of x squared plus 2 times the length of x times the length of y plus the length of y squared. So our length of the vector x
plus y squared is less than or equal to this quantity
over here. And if we just take the square
root of both sides of this, you get the length of our vector
x plus y is less than or equal to the length of the
vector x by itself plus the length of the vector y. And we call this the triangle
inequality, which you might have remembered from geometry. Now why is it called the
triangle inequality? Well you could imagine
each of these to be separate side of a triangle. In fact, let's draw it. We can draw this in R2. Let me turn my graph paper on. Let me see where the
graphs show up. If I turn my graph paper
on right there, maybe I'll draw it here. So let's draw my vector x. So let's say that my vector x
looks something like this. Let's say that's my vector x. It's a vector 2, 4. So that's my vector x. And let's say my vector y-- well
I'm just going to do it head to tail because I'm
going to add the two. So my vector y-- I'm going to do
it in nonstandard position. Let's say it looks something
like-- let's say my vector y looks something like this. Draw it properly. That's my vector y. What does x plus y look like? And remember, I can't
necessarily draw any two vectors on a two-dimensional
space like this. I'm just assuming that
these are in R2. But this is just to
give you the idea. So then this is their
sum, right? You took from the tail of
x to the head of y. So this vector right here
is the vector x plus y. And that's why it's called
the triangle inequality. It's just saying that look, this
thing is always going to be less than or equal to-- or
the length of this thing is always going to be less than or
equal to the length of this thing plus the length
of this thing. And that's kind of obvious
when you just learn two-dimensional geometry. That look, this is a much more
efficient way of getting from this point to this point
than going out here and then going out here. And then, what's the case in
which this length is equal to these lengths? Well if you keep flattening this
triangle out and you go to the extreme case where
maybe the vector x looks like this. And if the vector y is just
kind of going in the exact same-- vector y is going in
the exact same direction, maybe it's going a little
bit further. This is vector x, this
is vector y. Now x plus y will just
be this whole vector. Now that whole thing
is x plus y. And this is the case now where
you actually-- where the triangle inequality turns
into an equality. That's why that little
equal sign is there. The extreme case where essentially, x and y are collinear. And why does that work out? We can even go back to our
math and understand that. So let me turn my graph off. We can go back to
our math here. If I go back to this point,
remember, right here I made the statement, look, this thing
is definitely less than this thing over here. But what if I made
an assumption? What if I said that x is equal
to some scalar times y? And actually, I have to be
careful because just some scalar times y-- remember, our
Cauchy-Schwarz Inequality said that look, the inequality turns
into an equality if x is some nonzero scalar multiple of y. And then we can apply this. We can say that the absolute
value of x dot y is the same as this over here. But I don't have the absolute
value of x dot y here. I don't know that this
is positive. I can say definitively that this
is a positive quantity because I took the absolute
value of it. No absolute value here. So the only way that I can
assure that this is a positive quantity, that this is the same
thing as the absolute value of x dot y is to enforce--
if I'm kind of going to go down this road, is to
enforce that this term right here, that c be positive. Because if c is positive, then
x dot y, if x dot y then that would be the same thing
as cy dot y. Which we know is just the same
thing as c times the magnitude of y squared. And the only way that I can
ensure that this expression right here is equal to the
absolute value of x dot y, the only way I can assure this
is that c is positive. If c is negative, then this is
going to be a negative number while this is a positive. So if I assume that this is
positive, then I can say that x dot y is equal to the absolute
value of x dot y. And since it's a scalar
multiple, then I could say that that term is equal to, not
just less than or equal to, the magnitude
of x's and y's. So hopefully I'm not
confusing you. So all I'm saying is, if I can
assume that x is some positive scalar multiple of y,
that this wouldn't be a less than sign. Then I could say that x dot
y is the same thing as the absolute value of x dot y
since this is positive. And if it's the same thing as
the absolute value of x dot y and it's some scalar multiple
of each other, then we could go down this other route. We could say that this
thing here-- I don't want to get too messy. We could say that this
is equal to that. If this is equal to that, then
this would have become an equal sign, not a less than
or equal to sign. And then we would have had the
limiting case-- I don't want to call it the limiting case. But we could say that x plus y--
we would've done the same work, but we would've had an
equal sign the whole way. Would equal the length of x. The length of x plus y would
equal the length of x plus the length of y in the situation
where x is equal to some positive scalar times y. So c is greater than 0. These two imply each other. And we saw that geometrically. I lost my axes here, but we see
that the only time that the length of x plus y is equal
to the length of x plus the length of y is when
they're collinear. Over here this plus this is
clearly-- you can just visually look at it-- longer
than this right there. So you might be saying Sal,
once again, this linear algebra's a little bit silly. We learned the triangle
inequality in eighth or ninth grade. Why did you go through all of
this pain to redefine it? And this is the interesting
thing. What I just drew here and this
is what you learned in ninth grade geometry. This is just in R2. This is just your Cartesian
coordinates, or I don't want to use the word dimension too
much because we're going to define that formally. But this is kind of your
two-dimensional space that's going on. What's interesting or what's
useful about linear algebra is, we've just defined the
triangle inequality for arbitrarily large vectors,
or vectors that have an arbitrarily high number
of components. Each of these, these don't
have to be in R2. This is true if we're in R100
where every vector has a hundred components to it. We've just defined some notion
of the triangle inequality. We've abstracted well beyond
just our two-dimensional Cartesian coordinates. Well beyond even our three
dimensions to essentially, n dimensional space. And I haven't defined dimensions
yet, but I think you're starting to appreciate
what they are. But anyway, hopefully you
found that useful. We can now take this result. And actually, that result with
this result and define what the notion of an angle between
two vectors is. Once again, you know, on some
levels you're like well, why do we have to define an angle? Isn't an angle just-- isn't
that just an angle? Well yeah, we know what an angle
is in two dimensions, but what does an angle mean when
you abstract things to n dimensions? Or when you're in Rn. So that's what we'll talk
about in the next video.
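The equality case from the transcript (x equal to some scalar c times y, with c greater than 0) can be illustrated numerically. This is a minimal Python sketch under stated assumptions: the function `compare`, the helpers `norm` and `add`, and the sample vector y = (3, 4) are made up for the example and do not come from the video.

```python
import math

def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))

def add(x, y):
    """Component-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(x, y)]

def compare(c, y):
    """Return (||x + y||, ||x|| + ||y||) for x = c * y."""
    x = [c * b for b in y]
    return norm(add(x, y)), norm(x) + norm(y)

y = [3.0, 4.0]             # sample vector with length exactly 5

print(compare(2.0, y))     # c > 0: prints (15.0, 15.0), equality
print(compare(-2.0, y))    # c < 0: prints (5.0, 15.0), strict inequality
```

With a positive c the two vectors point the same way and the lengths simply add. With a negative c they point in opposite directions, so x plus y is shorter than the sum of the two lengths.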