© 2023 Khan Academy

# Introduction to orthonormal bases

Looking at sets and bases that are orthonormal, that is, where all the vectors have length 1 and are orthogonal to each other. Created by Sal Khan.

## Want to join the conversation?

It seems to me that when Sal proves that the orthonormal set is linearly independent, he just proves that a member of his set isn't a multiple of another member of his set. Shouldn't he prove that the member is not a linear combination of all the other members of his set, so that the whole set, not just 2 vectors, would be proven linearly independent?

If no 2 vectors in his set are scalar multiples of each other, the set could still be linearly dependent. Please elaborate, thank you. (48 votes)

- You're right, but the proof can be extended to show the v's are linearly independent.

First suppose that the v's are linearly dependent. Then some v_i is a linear combination of the others: v_i = c_1*v_1 + c_2*v_2 + ... + c_{i-1}*v_{i-1} + c_{i+1}*v_{i+1} + ... + c_n*v_n, where the c's can't all be zero (otherwise v_i would be the zero vector, which has length 0, not 1). Since the v's are all orthogonal, v_i . v_k = 0 for every k != i, so (c_1*v_1 + c_2*v_2 + ... + c_{i-1}*v_{i-1} + c_{i+1}*v_{i+1} + ... + c_n*v_n) . v_k = 0. All the terms on the left-hand side except c_k*v_k . v_k are wiped out by the orthogonality, leaving c_k*||v_k||^2 = 0. Since you defined ||v_k||^2 = 1, c_k must be zero, and since k was an arbitrary index, c_k = 0 for every k != i. This contradicts the requirement that the c's can't all be zero. Thus the v's are linearly independent. (36 votes)
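The extended argument can be sanity-checked numerically. A minimal sketch, assuming NumPy is available; the orthonormal set here comes from a QR factorization, which is one standard way to produce one:

```python
import numpy as np

# Columns of Q form an orthonormal set of 4 vectors in R^5
# (QR factorization of a random matrix yields orthonormal columns).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 4)))

# Orthonormal implies linearly independent: full column rank.
print(np.linalg.matrix_rank(Q))  # 4

# The coefficient c_k of any vector x = sum c_k*v_k is recovered by v_k . x,
# which is exactly why the c's in the argument above are forced to zero.
x = Q @ np.array([2.0, -1.0, 0.5, 3.0])
print(Q.T @ x)  # [ 2.  -1.   0.5  3. ] (up to floating point)
```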

- What are the prerequisites for this lesson? How do I determine what other videos I need to watch in order to understand this one?

I have < 1 week (for a Quantum Computing course), it mentions specifically this and one other Linear Algebra topic (eigenvalues/vectors). I've been serially watching every video in the "Linear Algebra" section from the beginning, but there will not be enough time.

So, how do I determine what videos I can skip in order to reach this one and be able to understand it? (3 votes)

- In my honest opinion, you will require more than one week to get to this point. For example, I have been working up to this for about 2 months and I practice every day. (8 votes)

- All these concepts are directly applied to electrons in atoms. In that sense, if I consider Vi to be the wave function of the i-th electron, is it correct to think of it as follows:

Normalized vector Vi: the total probability of finding the electron is 1 (Vi.Vi = 1)

No two electrons can be in the same place (because Vi.Vj = 0)

Vi and Vj are linearly independent: one electron does not cross another electron's position.

Please give corrections and suggestions for further reading at a basic level for this. (3 votes)

- At 7:12, hello there, big fan of yours... how come you say "it can span V"? Doesn't it have to span V in the first place? Also, isn't the definition of an orthonormal set just a standard independent set of vectors at a slightly different angle from the normal x/y/z/... axes?

Hopefully I explained myself OK with my lousy English. Thanks in advance, and keep up the incredible work; you make me love math, simply love it. (2 votes)

- He said it spans a subspace, not the entire space. (2 votes)

- If you have an orthonormal basis set u, then is their inner product <u|u> defined to be 1? (2 votes)
- I think you're confusing sets and their elements. An orthonormal basis is a set of vectors, whereas "u" is a vector. Say B = {v_1, ..., v_n} is an orthonormal basis for the vector space V, with some inner product < , > defined. Now <v_i, v_j> = d_ij, where d_ij = 0 if i is not equal to j, and 1 if i = j. This is called the Kronecker delta. It says that if you take an element of the set B, such as v_1, and consider <v_1, v_1>, then this value must be 1. If the two subscripts differ, you will always get zero! The short answer is yes, but you had a slight conceptual mishap in your question. (1 vote)
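The Kronecker-delta pattern can be seen directly: if you stack an orthonormal basis as the columns of a matrix B, the matrix of all pairwise inner products, B^T B, is the identity. A small sketch, assuming NumPy:

```python
import numpy as np

# The standard basis of R^3 rotated by 45 degrees about the z-axis
# is still an orthonormal basis.
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
B = np.array([[c,  -s,  0.0],
              [s,   c,  0.0],
              [0.0, 0.0, 1.0]])

# Entry (i, j) of B.T @ B is <v_i, v_j>, which is the Kronecker delta d_ij.
print(np.allclose(B.T @ B, np.eye(3)))  # True
```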

- Must a scalar multiple of an orthogonal matrix be orthogonal as well? Is this answered in another video? (2 votes)
- Do you mean: if "M" is an orthogonal matrix, is "kM" orthogonal? If so, let's check the definition. I would recommend trying some examples.

"kM" is orthogonal if all of its columns are unit vectors (and mutually orthogonal). But multiplying "M" by a scalar "k" scales the length of every column by |k|, so if "M" was orthogonal and |k| is not 1, the columns of "kM" are no longer unit vectors and "kM" is not orthogonal. (1 vote)
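One way to try an example, as the answer suggests, assuming NumPy: scale a rotation matrix (which is orthogonal) by k = 3 and check the columns' lengths.

```python
import numpy as np

# A 2x2 rotation matrix is orthogonal: unit, mutually orthogonal columns.
theta = 0.7
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.allclose(M.T @ M, np.eye(2)))  # True: M is orthogonal

# Scaling by k = 3 scales every column length by 3, so kM is not orthogonal.
k = 3.0
print(np.allclose((k * M).T @ (k * M), np.eye(2)))  # False
print(np.linalg.norm(k * M[:, 0]))  # 3.0
```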

- Is it called "Orthonormal bases" or "Orthonormal basis"?

It was "bases" in the title, but he said and wrote (as at 11:12) "basis". (1 vote)

- When bases is the plural of base, it is pronounced bay-sez. One base, several bay-sez.

When bases is the plural of basis, it is pronounced bay-sees. One basis, several bay-sees.

(i.e. you never have several basises) (4 votes)

- For expressing the dot product of the vectors, shouldn't we put the first vector transposed?(1 vote)
- If you treat the vectors as 1-column matrices, then yes, in order to do the dot product you have to express your first vector as a 1-row matrix. But if you are using normal vector notation (as most of the video does) then you are not committed to the matrix representation of vectors, and as such each vector can be seen as either a 1-column matrix, a 1-row matrix, a tuple of numbers, or even as an arrow in space.

In notation, there is no difference between:

```
    ⎡a_x⎤
a = ⎢a_y⎥
    ⎣a_z⎦
```

and

```
a = [a_x  a_y  a_z]
```

(2 votes)
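The equivalence the answer describes can be checked directly, assuming NumPy: in the matrix view, with a as an n×1 column, a^T b is a 1×1 matrix holding the same dot product.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Plain vector notation: no transpose needed.
print(np.dot(a, b))  # 32.0

# Matrix view: both vectors as columns, so the first must be transposed.
a_col = a.reshape(3, 1)
b_col = b.reshape(3, 1)
print((a_col.T @ b_col)[0, 0])  # 32.0
```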

- If you have a set of 30 vectors in R2, how can they all be orthogonal to each other? It seems like you could have at most 2? (1 vote)
- That's correct; you could never have more than two vectors in R2 and have them all be orthogonal to one another. To see a visual example of this, try drawing three straight lines (vectors in R2) such that each line intersects the origin and is perpendicular to the other two lines.(2 votes)
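The same fact can be checked algebraically, assuming NumPy: a third vector w orthogonal to two orthogonal nonzero vectors in R2 must solve a full-rank 2×2 homogeneous system, whose only solution is the zero vector.

```python
import numpy as np

# Two orthogonal nonzero vectors in R2.
u = np.array([1.0, 2.0])
v = np.array([2.0, -1.0])
print(np.dot(u, v))  # 0.0

# w orthogonal to both must satisfy A w = 0 with rows u and v.
# Since u and v are linearly independent, the only solution is w = 0.
A = np.vstack([u, v])
w = np.linalg.solve(A, np.zeros(2))
print(w)  # [0. 0.]
```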

- What are the coordinates for the translation of a triangle given the matrix addition? Or rather, how can I solve the problem? (1 vote)

## Video transcript

Let's say I've got me a set of vectors. So let me call my set B. And let's say I have the vectors v1, v2, all the way through vk. Now let's say this isn't just any set of vectors. There are some interesting things about these vectors. The first thing is that all of these guys have length 1. So we could say the length of vector vi is equal to 1 for i between 1 and k, or i equal to 1, 2, all the way to k. All of these guys have length equal to 1. Or another way to say it is that the square of their lengths is 1: vi dot vi is equal to 1 for any of these i's, where i can be 1, 2, 3, all the way to k. So that's the first interesting thing about it. Let me write it in regular words. All the vectors in B have length 1. Or another way to say it is that they've all been normalized. Or they're all unit vectors. Normalized vectors are vectors whose lengths you've made 1; you've turned them into unit vectors. They have all been normalized. So that's the first interesting thing about my set B.

And then the next interesting thing about my set B is that all of the vectors are orthogonal to each other. So if you dot a vector with itself, you get length 1. But if you take a vector and dot it with any other vector -- if you take vi and you were to dot it with vj, say v2 dotted with v1 -- it's going to be equal to 0 for i not equal to j. All of these guys are orthogonal. Let me write that down. All of the vectors are orthogonal to each other. And of course they're not orthogonal to themselves, because they all have length 1. So if you take the dot product of a vector with itself, you get 1. If you take the dot product with some other guy in your set, you get 0. Maybe I can write it this way: vi dot vj, for all the members of the set, is going to be equal to 0 for i not equal to j. And if these guys are the same vector -- I'm dotting with myself -- it's going to equal 1, for i equal to j.

So I've got a special set. All of these guys have length 1 and they're all orthogonal to each other. They're normalized and they're all orthogonal. And we have a special word for this. This is called an orthonormal set. So B is an orthonormal set. Normal for normalized. Everything is orthogonal relative to each other, and everything has been normalized; everything has length 1.

Now, the first interesting thing about an orthonormal set is that it's also going to be a linearly independent set. So if B is orthonormal, B is also going to be linearly independent. And how can I show that to you? Well, let's assume that it isn't linearly independent. Let me take vi and vj that are members of my set, and let's assume that i does not equal j. Now, we already know that it's an orthonormal set, so vi dot vj is going to be equal to 0. They are orthogonal. These are two vectors in my set. Now, let's assume that they are linearly dependent. I want to prove that they are linearly independent, and the way I'm going to prove that is by assuming they are linearly dependent and then arriving at a contradiction.

So let's assume that vi and vj are linearly dependent. Well, then that means that I can represent one of these guys as a scalar multiple of the other. And I can pick either way. So let's just say, for the sake of argument, that vi is equal to some scalar c times vj. That's what linear dependency means -- that one of them can be represented as a scalar multiple of the other. Well, if this is true, then I can just substitute this back in for vi. And what do I get? I get c times vj -- which is just another way of writing vi, because I assumed linear dependence -- and that dot vj has got to be equal to 0. This guy was vi, this is vj, and they are orthogonal to each other. But this right here is just equal to c times vj dot vj, which is just c times the length of vj squared. And that has to equal 0; they are orthogonal, so that has to equal 0. Which implies that the length of vj has to be equal to 0, if we assume that c is some non-zero multiple -- and c does have to be non-zero; I should have written it there: c does not equal 0. Why does c have to be non-zero? Because these were both non-zero vectors. This is a non-zero vector; this guy has length 1. So if this is a non-zero vector, there's no way that I can just put a 0 here, because if I put a 0 then I would get the 0 vector. So c can't be 0. And if c isn't 0, then this guy right here has to be 0. And so we get that the length of vj is 0. Which we know is false. The length of vj is 1. This is an orthonormal set; the lengths of all the members of B are 1. So we reach a contradiction: vj is not the 0 vector, it has length 1. Contradiction. So if you have a bunch of vectors that are orthogonal and they're non-zero, they have to be linearly independent. Which is pretty interesting.

So if I have this orthonormal set right here, it's also a set of linearly independent vectors, so it can be a basis for a subspace. So let's say that B is the basis for some subspace V. Or we could say that V is equal to the span of v1, v2, all the way to vk. Then we call B -- if it was just a set, we'd call it an orthonormal set, but it can be an orthonormal basis when it spans some subspace. So we can say that B is an orthonormal basis for V.

Now, everything I've done is very abstract, but let me do some quick examples for you, just so you understand what an orthonormal basis looks like with real numbers. So let's say I have two vectors. Let's say I have the vector v1 -- say we're dealing in R3 -- that is 1/3, 2/3, 2/3. And let's say I have another vector, v2, that is equal to 2/3, 1/3, and minus 2/3. And let's say that B is the set of v1 and v2.

So the first question is, what are the lengths of these guys? So let's take the length. The length of v1 squared is just v1 dot v1. Which is just 1/3 squared, which is 1/9, plus 2/3 squared, which is 4/9, plus 2/3 squared, which is 4/9. Which is equal to 1. So if the length squared is 1, then that tells us that the length of our first vector is equal to 1. If the square of the length is 1, you take the square root, so the length is 1. What about vector 2? Well, the length of vector 2 squared is equal to v2 dot v2. Which is equal to -- let's see, 2/3 squared is 4/9, plus 1/3 squared is 1/9, plus 2/3 squared is 4/9. So that is 9/9, which is equal to 1. Which tells us that the length of vector v2 is equal to 1. So we know that these guys are definitely normalized. We can call this a normalized set. But is it an orthonormal set? Are these guys orthogonal to each other? To test that out we just take their dot product. So v1 dot v2 is equal to 1/3 times 2/3, which is 2/9, plus 2/3 times 1/3, which is 2/9, plus 2/3 times minus 2/3, which is minus 4/9. 2 plus 2 minus 4 is 0, so it equals 0. So these guys are indeed orthogonal. So B is an orthonormal set. And if I have some subspace -- let's say that V is equal to the span of v1 and v2 -- then we can say that B is an orthonormal basis for V.
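The arithmetic in this worked example can be replayed numerically, assuming NumPy:

```python
import numpy as np

v1 = np.array([1/3, 2/3, 2/3])
v2 = np.array([2/3, 1/3, -2/3])

# Both vectors are normalized: 1/9 + 4/9 + 4/9 = 1 and 4/9 + 1/9 + 4/9 = 1.
print(np.dot(v1, v1))  # 1.0 (up to floating point)
print(np.dot(v2, v2))  # 1.0 (up to floating point)

# And they are orthogonal: 2/9 + 2/9 - 4/9 = 0.
print(np.dot(v1, v2))  # 0.0 (up to floating point)
```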