Main content

## Linear algebra

### Course: Linear algebra > Unit 3

Lesson 4: Orthonormal bases and the Gram-Schmidt process- Introduction to orthonormal bases
- Coordinates with respect to orthonormal bases
- Projections onto subspaces with orthonormal bases
- Finding projection onto subspace with orthonormal basis example
- Example using orthogonal change-of-basis matrix to find transformation matrix
- Orthogonal matrices preserve angles and lengths
- The Gram-Schmidt process
- Gram-Schmidt process example
- Gram-Schmidt example with 3 basis vectors

© 2023 Khan AcademyTerms of usePrivacy PolicyCookie Notice

# The Gram-Schmidt process

Finding an orthonormal basis for a subspace using the Gram-Schmidt Process. Created by Sal Khan.

## Want to join the conversation?

- Is there an error at16:00?

He copies the text, but should the second term he copies also be negative along with the first?(33 votes)- yes, you are right. should be y_3 = v_3 - [projv_3], and the negative sign will be distributed to all proj terms(22 votes)

- What are practical applications of Gram-Schmidt process?...

Thanks.(7 votes)- My chemistry professor says that the Gram-Schmidt process was used to develop the hybridization of atomic orbitals in chemistry. Linus Pauling, I think.

I'd be really grateful if Salman could do a video showing the relation between the Gram Schmidt process and hybridization orbitals. :)(30 votes)

- I need to reference this video for an article I'm writing, who Is the Author?(3 votes)
- Salman Khan, founder of KhanAcademy(26 votes)

- what happens if the Gram-Schmidt procedure is applied to a list of vectors that is not linearly independent?(5 votes)
- Basically, if you have a vector
**d**that is a linear combination of vectors**a**,**b**, &**c**and you apply the Gram-Schmidt procedure, then subtracting each respective projection from**d**will leave you with a vector of zeroes. [It would be like taking a vector (a,b,c) and subtracting the x-, y-, & z-components--you're left with nothing!] Thus, assuming the procedure is automated, an error will occur because any attempt to "normalize" the zero vector will entail trying to divide by zero.

As a result, you'll need to take this into account in your algorithm, testing to see whether your output is a zero vector, and getting rid of them as they arise...

Hope this helps!(15 votes)

- Question about projection: I have learned that the projection of v onto w = (v dot w / v dot v) times v.

However, I see Sal is calculating projection of v2 onto v1 = (v2 dot u1) times u1

where normally, this would be calculated by projection = (v2 dot v1 / v2 dot v2) times v2

Why not (v2 dot u1 / v2 dot v2) times v2?

Doesn't (v2 dot u1) times u1 just = an orthonormal basis? is the orthonormal basis = projection in this case?

my question is, how do we go from (v2 dot v1 / v2 dot v2) times v2 to (v2 dot u1) times u1? is it because u1 is an orthonormal basis for v1? How were we able to drop out the denominator in the projection equation?(6 votes)- First, when you project a vector
**v**onto a vector**w**, the result is a scaled version of the vector**w**,*NOT*the vector**v**:

proj(**v**) = k**w**,

where "k" is a constant and:

k = (**v**⋅**w**/‖**w**‖²)

The formula you first mention ["(v dot w / v dot v) times v"] is the correct formula for the projection of.**w**onto**v**

Now, the reason why we want to first project the vector**v**onto**w**is so that we can then subtract that projection from**v**, leaving us with a new vector (I'll call it vector "**b**") that is orthogonal to**w**Thus,**b**=**v**- proj(**v**)

(And then, of course, we "normalize" the vector so that it has unit length).

Lastly, the reason why Sal drops out the denominator in the projection equation is because the denominator is the squared length of the vector**u_1**which is equal to 1 because it is a unit vector.

Again, to reiterate, the new orthonormal basis vector is*NOT*the projection vector. Rather, we calculate the projection vector, proj(**v**) so that we can subtract it from**v**itself, leaving us with a new vector that's*orthogonal*to**w**.

Does this make sense?? Hope this helps!!(9 votes)

- At "2:30", you forgot to put the || signs. You wrote u_1 instead of ||u_1||(9 votes)
- Here the letter "u" is meant to stand for "unit vector", so you don't need to write "||u1||".(0 votes)

- Well, here I noticed something interesting.

The result of the Gram-Schmidt process orthogonal basis is dependent on the vector we choose to start up with and so on.

For example, if I start with projecting into v_1 I will have different results than starting with v_2.(6 votes)- This is true. It is worth noting that basis sets are not unique, and neither are orthonormal basis sets. Thus, it is not surprising that the Gram-Schmidt process can produce different basis sets depending on how it is performed.(2 votes)

- I have another method.

Just divide every vector in the basis by their lengths. Is my method valid?(0 votes)- No. You would be missing orthogonality!(21 votes)

- At the beginning of the video it is stated that v1 and v2 are basis vectors for a subspace.

If they are basis vectors then shouldn't they be orthogonal to each other.

If they are orthogonal to each other how can they have a projection onto each other at8:38in the video.(2 votes)- Basis vectors don't have to be orthogonal, they just have to be linearly independent.(5 votes)

- How to calculate the distance of a 2x2 matrices(2 votes)
- To do this you must first define a norm on matrices, it is trivially equivalent to defining a norm on the space of linear operators from R^2 to R^2. Let T: R^2 --> R^2 be linear, define ||T|| = sup{ |Tv| = 1 : |v| = 1}. This definition makes sense since the set is clearly non-empty, and bounded above (To see that it's bounded above, note that Tv = a_1 T e_1 + ... + a_n T e_n, since this sum is finite, and each of the values is finite, the result is finite). To check that it defines a metric you must show that it satisfies the three properties of a metric: 1. Positivity, d(x,y) greater than equal to zero for all x,y; equality iff x = y. 2. d(x,y) = d(x,y) and 3. Triangle Inequality holds, For any x,y,z we have that d(x,z) <= d(x,y) + d(y,z). I leave this an exercise for you!(3 votes)

## Video transcript

Let's say I have a set of
linearly independent vectors, V1, V2, all the way to Vk,
that are a basis for V. We've seen this many
times before. That's nice. It's a basis, but we've learned
over the last few videos it would be even
better to have an orthonormal basis for V. What we're going to address in
this video is, is there some way that we can construct an
orthonormal basis for V, given that we already have this basis,
which I'm assuming isn't orthonormal. It'll work whether or not
it's orthonormal. It'll just generate another
orthonormal basis. But can we somehow, just given
any basis, generate an orthonormal basis for V, and
then be able to benefit from all of the properties of having
an orthonormal basis? Let's see if we can make
some headway here. Let's think about it
for simple cases. Let's say that I have a
one-dimensional subspace. Let's call that V1, for it's
a one-dimensional subspace. I'm going to say it's the span
of just this vector V1. Now, or you could say the vector
V1 is the basis for the subspace V1. If I had this simple case, how
can I ensure that this is orthonormal? What I could do is I could
define some vector, let's call it U1. Obviously, this is orthogonal
to all of the other guys. There aren't any other
guys here. [INAUDIBLE] vectors there, but there aren't
any other members of this set, so it's orthogonal
to everything else because there isn't anything else. And then to make its length
equal one, you can take this vector and divide it
by its length. So if we define some vector U1
to be equal to V1 divided by the length of V1, then
the length of U1 is going to be what? It's going to be the length
of V1 over the length of V1 like this. This is just a constant
right here. This is going to be 1 over
the length of V1. That could be 5, times length
of V1, which is just equal to 1. This one will have
a length of 1. So just like that, if we have
the set of just U1, we could say that just the set of U1 is
an orthonormal basis for V1, for the subspace V1. Now, that was trivially easy. If k happens to be
1, we're done. That was super easy. We just have to divide
by our length. You have to just essentially
just normalize this vector, and you have an orthonormal
basis because there's nothing else to be orthogonal with. Well let's complicated the
issue a little bit. Let's say we go to
two dimensions. Let's say we have
a subspace V2. That's the span of, let's
say it's the first two of these vectors. It's the span of V1
and the vector V2. Now, we know that V1, V1 can
easily be represented as-- or, V1 is a linear combination
of U1. How do I know that? Well, I can just multiply
both sides of this times the length of V1. We have U1 times the length
of V1 is equal to V1. So we could say this
is the same thing. This is equivalent to saying the
span of the vector U1 and the vector V2, where U1 is the
vector we got up here. How can I say this? Because anything that can be a
linear combination of these guys can be a linear combination
of those guys. Because where ever you had a
V1, you can replace the V1 with the linear combination
of U1 that gives you V1. So you just multiply U1 times
that scalar and you get it. I think you get the point. But how can we ensure that this
is an orthonormal set? What do we do? Let's graph it. This is going to be
a plane in Rn. So let's just make our
chalkboard here, or our video board here, the plane. So we have U1, which
is a unit vector. It has length 1. So that is U1. V1 and V2 are linearly
independent, that's by definition of a basis. So you can't represent V2 as
a linear multiple or linear combination of V1. Likewise, you can't represent V2
as a linear combination of U1 because U1 is a linear
combination of U1. So, V1 is not going to be on
the line spanned by U1. So I could actually draw that. The line spanned by U1
is just like this. This is the line
spanned by U1. Let me draw it a little
bit better than that. I don't want to make
it too dark. So this is the line. I'll do one last try. The line spanned by U1
looks like this. It looks like that,
it's just a line. And that is the subspace
V1, right? With the span of U1. So this is equal to
the span of U1. All we did is normalize V1
right there to get to U1. So the span of V1 is the
same span of U1. That is this subspace. That's this line in Rn. We have vector V2,
which is linearly independent from V1 and U1. V2 would look, let's just
say something like that. V2. Now, what we want to do is
replace V2 with another vector that is definitely orthogonal
to this, and will still be able to construct V2 with some
combination of this and our new vector. Well, the most obvious vector
would be some vector that is orthogonal to V1. So it's orthogonal, so it's
a member of the orthogonal complement of V1. If you look at it just visually
here, if I add some member of V1 to that member of
the orthogonal complement of V1, I'm going to get V2. In fact, we've seen that
multiple times. We know that any vector in
Rn, let's say V2, can be represented as a sum
of two vectors. I'll call them x and y, where x
is a member of V1 and y is a member of the orthogonal
complement of V1. We've seen that multiple
times. Now what was this thing
by definition? We're looking for this thing. This is the x. And this is the y. We're looking for the y, because
if we can find this vector y, then if we replace
V with that vector y, we'll still be able to generate
V because you can take a multiple of U and add
it to y and get V. And so anything that you were
able to generate with V2 before, you can now generate
with our U1, multiples of that, plus multiples
of our new vector. So, we're trying to solve, we're
trying to figure out what this vector y
is right there. So how do we do it? Well, it's going to be V2
minus this vector x. Right? What is this vector
x, by definition? This vector x is, by definition,
the projection of V2 onto the subspace V1. So, the vector that we're trying
to find, the vector y, and if we find it, and that we
can replace V2 with y, the vector y, is just equal to
V2-- let me write it this way-- minus the projection
of V2 onto V1. That's what y is. Remember, if we can replace V2--
the reason why the span of U1, V2 is the same thing as
the span of U1-- let me call this y2 here. I'm going to just call that
y2, because we're probably going to have to use
y's in the future. You want y2. So the reason why this span is
the same thing as the span of U1 and y2 is because I can
generate V2 with a linear combination of U1
and y2, right? I can scale up U1 and then add
y2 and I'll get to V2. So anything that I could have
generated with V2 I can get with a linear combination
of these guys. That's why these
are equivalent. And what's neat about this now
is that these guys are orthogonal complements-- or
these guys are orthogonal relative to each other, right? By definition, y was a member of
the orthogonal complements. If you dot these two guys,
you're going to get 0. How can we actually
solve for it? Well, this is useful as
well because V1 has an orthonormal basis. And we saw, I think it was two
or three videos ago, the neat thing about orthonormal bases
is, it's very easy to determine the projection
onto those bases. It's essentially, let me write
it over here, the projection of the vector V2 onto the
subspace V1 is equal to V2-- let me write it this way-- V2
dotted with the subspaces V1's first basis vector, which
is the vector U1, right? That's its first basis vector. We're dealing with an
orthonormal basis-- times U1. And then if we had more basis
vectors, we would say plus V2 dotted with our next basis
vector, times that basis vector, and so on
and so forth. But V1 only has one
basis vector. It only has the basis vector
U1 right here. Right? It is only spanned by that. So, we can rewrite this thing
right here. ys, this vector right here that I'm going to
replace V2 with, is equal to V2 minus the projection of V2
onto the subspace V1, on to that line, which is just this. V2 dot U1 times the vector U1. And just like that, we've
solved for y. So we have a basis for V2, where
this guy and this guy are orthogonal. This guy is the unit vector. He's been normalized,
but this guy hasn't been normalized yet. So to normalize it, let's just
define some other vector as U2 and do the same thing. Let's just normalize it. So U2 is equal to y2, divided
by the length of y2. Now, we can say that the
subspace V2 is equal to the span of U1 and, instead of y2,
I'll put U2 there, because I can generate y2 by just
scaling up U2. What's neat about this now is,
I have two unit vectors, or two normalized vectors, and
they're orthogonal with respect to each and they span
what V1 and V2 spanned originally. Now, we need to keep going. What about if we want to go
to V3 here, what do we do? Well, we do the same idea. So let's define the
subspace V3. This is going to be a
three-dimensional subspace. Once we get beyond V3, it
becomes a little hard to visualize, but I think you're
going to see the pattern after this step. If we define V3 as equal to the
span of these guys, U1, U2, and then, in our
original basis, V3. I didn't write it there but
there's a V3 there. So in our original
non-orthonormal basis, we have V3. What is this going
to look like? Or how can we turn this into
an orthonormal basis? So, if you visualize all of
these, so the span of U1 and U2, our subspace V2 is just
going to be a plane. It's going to be
a plane in R3. It's going to look like
that right there. And so, our new span is going
to be everything in that plane, all linear combinations
of things in that plane, plus linear combinations of our
vector V3, which is linearly independent from these guys
because it was linearly independent from the stuff we
used to construct these guys. So V3 is going to pop
out of this plane. It can't be represented
by the linear combination of those guys. So let's say V3 pops out of
the plane, like that. Now, we want another vector that
can represent everything of this span, but it is
orthogonal to these dudes right here. Or another way to think
about it, it's orthogonal to the plane. So we want another vector that
is orthogonal to the plane. So let's call that vector y3,
y sub 3, not y to the third power, y sub 3. And if we figure out our y sub
three, we can replace V3 with it because V3 can be represented
as a linear combination of U1 and U2. Right? That's going to be a linear
combination of U1, U2 is going to be some vector in
this plane plus y3. I can represent this guy with
this green vector plus this vector right here. So if we replace him with y3, we
can still get to V3 and we can still get to all the linear
combinations that V3 can help construct. So what is this y3? Well, by the same exact logic,
this green vector right here is the projection of my vector
V3 onto the subspace V2 and your vector y3 is just equal
to the vector V3 minus the projection of V3 onto V2. So, what is that going to be? Well, the projection-- I'll
write it here-- the projection onto the subspace V2 of V3, of
the vector V3 is going to be equal to-- we're going
to use the same exact logic we did before. We saw this two or
three videos ago. Because V2 is defined with an
orthonormal basis, we can say that the projection of V3 onto
that subspace is V3, dot our first basis vector, dot U1,
times our first basis vector, plus V3 dot our second basis
vector, our second orthonormal basis vector, times our second
orthonormal basis vector. It's that easy. That's one of the neat things
about having an orthonormal basis was. This is an orthonormal basis. So we can define the projection
in this way. So y is just going to be V3
minus this business right here, and I'll just write out. So y3 is going to be V3 minus
the projection of V3 onto V2. So minus that business
right there. Let me copy it, and then
let me paste it. And then you get y3. And y3 is nice. So if we replace this with y3,
which we can because we can now get V3 as a linear
combination of these guys and y3. That's nice because all of these
guys are orthogonal with respect to each other, but y3
does not have length 1 yet. It hasn't been normalized. So we can just replace y3 with
another unit vector. We'll just say, U3 is equal to
y3 divided by the length of y3, whatever that might be. But if we have what y3 is, we
can figure out its length and divide by that. And then if you put-- so this
is equal to the span of U1, U2, and U3. So U3 might be a scaled-down
version of y3. We're talking about linear
combinations when we talk about span, so you can scale
them up and add it to the projection. You're still going to get V3. And now, these are all
normalized vectors. And so you would not have an
orthonormal basis for the subspace V3. I think you get the
pattern now. You can keep doing this. You can define V4, when I
defined the problem I just said that we're dealing with
a k dimensional subspace. We go up to Vk. But you just keep going until
you go up to your k. If k was 3, we'd be done. If k is 4, you say you define a
subspace V4 is equal to the span of U1, U2, U3, and then
throw our non-orthonormal vector, V4, into the mix. And so you can replace V4 with
y4 is equal V4 minus the projection. It becomes a little
hard to visualize. You're not projecting on to
a three-dimensional space. The projection of V4 onto
our subspace V3. It's completely analogous
to this. It's just V3 is now a
three-dimensional space, not a plane. You're [? finding ?] this projection onto this. This is definitely going to be
orthogonal to everything else, to our subspace V3. And you can construct V4 with y4
because V4, by definition, is equal to-- we can just
rearrange this-- y4 plus the projection of V4 onto V3. So you can construct V4 with y4
and some linear combination of these guys. So I can replace this guy
with y4 and then I would normalize y4. I would divide it by its
length and I'd get U4. And I just keep doing that
until I get up to k. If I do V5, I do the process
over and over and over again. And this process of creating an
orthonormal basis is called the Gram-Schmidt Process. And it might seem a little
abstract, the way I did it here, but in the next video
I'm actually going to find orthonormal bases
for subspaces. You'll see it's not too bad
when you have to deal with real numbers.