Main content

## Linear algebra

### Course: Linear algebra > Unit 3

Lesson 2: Orthogonal projections- Projections onto subspaces
- Visualizing a projection onto a plane
- A projection onto a subspace is a linear transformation
- Subspace projection matrix example
- Another example of a projection matrix
- Projection is closest vector in subspace
- Least squares approximation
- Least squares examples
- Another least squares example

© 2023 Khan AcademyTerms of usePrivacy PolicyCookie Notice

# Least squares examples

An example using the least squares solution to an unsolvable system. Created by Sal Khan.

## Want to join the conversation?

- Is this
**least squares solution**equivalent to the**orthocenter**of the triangle?

If the shortest distance is the projection, then the solution point should be on the perpendicular to each line.

...or did I miss something?(9 votes)- Nope, it's not the orthocenter. The least squares solution is always inside the triangle whereas the orthocenter can be outside. Nor does the least squares solution minimize the sum of the perpendicular distances to each line as some commenters claim, nor the distances to the corners of the triangle which was my second guess. In fact, the least squares solution has no geometric significance in R^2 because it's not unique! If you take the equation for one of the lines and multiply through by a constant then the least squares solution changes, even though it's the exact same line. It turns out every point in the triangle is a valid least squares solution if you use the right coefficients.

The way to see what's going on geometrically in this problem is to look at the codomain in R^3 and the solution vector**b**= [2, 1, 4]. Applying transformation**A**to the original triangle gets you a new triangle in R^3 whose corners are [2, 1, 1] [2, 6, 4] and [17, 1, 4]. If we subtract**b**from each corner vector we get [0, 0, -3] [0, 5, 0] and [15, 0, 0]. And now, if we multiply the original linear equations by the constants c1, c2, and c3 respectively then those difference vectors become [0, 0, -3*c1] [0, 5*c2, 0] and [15*c3, 0, 0]. So without changing the original triangle we can stretch the transformed triangle relative to**b**along all 3 axes at will. And if we stretch it the right way we can move the shortest vector from**b**to the transformed triangle wherever we like, and thus we can move the least squares point of the original triangle wherever we like.(8 votes)

- I'm having trouble figuring out why I'm getting two different answers. The second line written in slope intercept form has a slope of -1/2 and an intercept of 1/2. So I would think the line could also be written as (1/2)x + y = 1/2. However, when I then go through the process of finding the approximate intersection using these values in my matrices (using for the second rows of A and b [1/2 1] and [1/2] instead of [1 2] and [1] as they are in the video) I end up with a different approximate intersection (x: 52/31 y: 69/62). I've gone back through the math a couple times and don't seem to be finding a place where I'm making an error. Am I missing something? Or is such an approximate solution not unique and I'll find different ones depending on which numbers I use? Thanks.(6 votes)
- You're absolutely right, equivalent forms of the same lines will yield different least squares solutions.(1 vote)

- At around 9 minutes, when he swaps the rows (1,6 and 6,1), why does it not make the matrix negative?(3 votes)
- At around 9 minutes, when he swaps the rows (1,6 and 6,1), why does not swap x1 and x2, so the result it [3/7; 10/7]; if you don swap x1 and x2, [6 1; 1 6]*[10/7; 3/7] is not equal [9; 4](3 votes)
- If he didn't swap the rows, he would've gotten the same answer:
`[6 1 | 9]`

[1 6 | 4]

divide R1 by 6:`[1 1/6 | 3/2]`

[1 6 | 4]

R2-R1:`[1 1/6 | 3/2]`

(4-3/2 = 8/2-3/2 = 5/2)

[0 35/6 | 5/2]

R2*6/35:`[1 1/6 | 3/2]`

(5/2*6/35 = 30/70 = 3/7)

[0 1 | 3/7]

R1-R2/6:`[1 0 | 10/7]`

(3/2-(3/7)/6 = 3/2-3/42 = 21/14-1/14 = 20/14 = 10/7)`[0 1 | 3/7]`

This is the same thing he got in the video, but I never swapped the 2 rows.(3 votes)

- Shouldn't it be x ~= [(A^T * A)^-1] * [A^T * b]?

(the inverse of A^T * A)(3 votes)- That's basically what he did, except he solved the system by reducing to reduced row echelon form instead of multiplying by the inverse.

1st he multiplied A^T*A to get a matrix, we'll call it C

2nd he multiplied A^T*b to get a vectord we'll call it d

Lastly he solved Cx=d

which you could solve by reducing to reduce row echelon form or you could solve by C^-1*d=x

which is the same as (A^T*A)^-1*(A^T*b)=x

Hope this makes sense(3 votes)

- Is there a way to associate a confidence with the final intersection point in this least squares example? I've been looking at confidence videos and lessons and not finding a direct way to apply that to this example.(2 votes)
- In the previous video this formula was derived with the idea of vectors in R3 and its projection onto some C(A). How does this link to finding a point closest to 3 lines?(2 votes)
- I can't really visualise how this fits in with the idea of projections (like the example in the last video did). Would someone be able to share an intuition of how this example works in terms of a projection of b onto C(A)?(1 vote)
- How does this work in a more general setting? Say we have A*f(x) + B*f(y) = C ? Being more specific/example, something like having an equation like y=Ae^(Bt) + C. How would we regress a sample of data points if we have such a model? I tried working this out using a linear model as a template but the B coefficient remained. I am now guessing there is no general way to regress coefficients using reduced row echelon form.(1 vote)
- i love this lesson because it feels like the entire class up to this point is just exposition for this extremely useful application(1 vote)

## Video transcript

So I've got three lines in R2,
and I want to find their intersection. So the first one is 2x minus y
is equal to 2, the second one is x plus 2y is equal to 1, and
the third one is x plus y is equal to 4. So let's first just graph these,
just to have a visual representation of what
we're trying to do. So I like writing my lines in
y equals mx plus b form. So this top line becomes what? Minus y is equal to minus
2x, minus 2x plus 2. I just subtracted 2x
from both sides. Or we could write that y
is equal to 2x minus 2. That's the first line here. The second line-- I'm doing it
in green --we could write this as 2y is equal to minus x plus
1, or we could write that y is equal to minus 1/2 x plus 1/2. I just divided both
sides by 2. And this last line right here,
we could write this as y is equal to minus x plus 4. Go straight here. y is equal to minus x plus 4. Now let me graph these. Let me draw an axis,
that is my y-axis. I can call it my y-axis since
we're actually dealing with x and y's now. I'll do it slightly lighter. I'll do it in this
grey as well. I'd say that is my x-axis. Just like that. And this first guy is going
to be 2x minus 2. So it's y-intercept is going
to be at minus 2. It's going to have a slope of
2, so it's going to be a pretty steep line,
just like that. So that's that first line,
just like that. 2x minus 2. The next line is 1/2,
minus 1/2, plus 1/2. So if we go plus 1/2, that's
right there, and then the slope is minus 1/2. So for every 2 we go
over, we go down 1. So it's going to be like that. It's actually going to
be orthogonal, right? Because it's the negative
inverse of this guy. So it's going to look
something like that. We draw it like that,
just like that. And this guy is minus
x plus 4. This is 1, 2, 3, 4, and you go
minus x, so for every 1 you go over, you go down 1. So this other line is going to
look something like this. This other line, this last
line, is going to look something like that. Just like that. Now, I said at the beginning of
this video that I want to find the intersection of
these three lines. But notice, there is
no intersection of these three lines. They all intersect the other
two, but they don't all intersect each other
in one point. We can kind of call the system
as being overdetermined. We've overconstrained it. There is no intersection of
all three of these points. So if I were to actually try to
solve this system, I would find no solution. And to say that this has no
solution is equivalent to saying that this
matrix, or this equation has no solutions. Let me write this. I'm just going to rewrite
the system like this. This is equivalent to the
matrix, let me make sure I get this right, the matrix times
the vector xy is equal to 2, 1, and 4. And so, this first equation is
2 times x, minus 1 times y. So it's 2 and minus 1. 2 times x minus 1 times
y is equal to 2. That's that first equation
over there. The second equation-- actually
I could even, well I won't color code it, that'll take
forever --that's 1 times x plus 2 times y is equal to 1. And then we have x plus
y is equal to 4. This system and this equation,
this system right here, these are equivalent. Now, this isn't going to
have any solution. You could try to find
a solution to this. You could create an augmented
matrix, put it in reduced row echelon form. But there is no intersection
to the three things. So you're not going to find
a solution to A times some vector-- we could call this some
vector x --is equal to B. Or another way to say it, is
that B is not in the column space of this matrix
right here. Now, we learned in the last
video that sure, we can't find a solution to Ax equals B. Ax equals B has no solution. We see it graphically here;
these lines don't intersect with each other. And you could prove it for
yourself algebraically by trying to find a
solution here. You'll end up with
a 0 equals 1. But we can almost get
there by finding a least squares solution. And we find a least squares
solution if we multiply both sides by A transpose. We know that A transpose times
A times our least squares solution is going to be equal
to A transpose times B. So at least we can find the
closest fit for our solution. So let's find the vector
x this is our least squares solution. So what is A transpose
times A? A transpose looks like
this, you'll have 2 minus 1, 1, 2, 1, 1. That is A transpose. And then of course A is
just this thing: 2 minus 1, 1, 1, 1, 1. So A transpose A is going to be
equal to-- We have a 2 by 3 times a 3 by 2 matrix, so it's
going to be a 2 by 2 matrix. So it's going to be
a 2 by 2 matrix. So what do we get, we get 2
times 2 which is 4, plus 1 times 1, plus 1 times 1. So it's 4 plus 1 plus 1. So that's equal to 6. And then we have 2 times minus
1, which is minus 2, plus 1 times 2, so those cancel out. Minus 2 plus 2 is 0,
plus 1 times 1. That's just going to be 1. And then we get minus 1 times
2, which is minus 2, plus 2 times 1, which is 2. So the minus 2 plus 2 is 0, plus
1 times 1, so we get a 1. And then finally, we get minus
1 times minus 1, which is positive 1. Plus 2 times 2, which is
4, so we're now at 5. Plus 1 times 1, so this
is going to be 6. So this is A transpose A. Now, what is A transpose
times B? A transpose is 2, 1,
1, minus 1, 2, 1. And then B is just the 3 by 1
vector, the vector that's a member of R3, 2, 1, 4. So what is this going
to be equal to? This is going to be equal
to-- We get a 3. Sorry, this is a 2 by
3 times a 3 by 1. We're going to get
a 2 by 1 vector. You get a 2 by 1 vector here. So 2 times 2 is 4, plus 1
times 1, so that's plus 1, so that's 5. Plus-- let me actually write it
down, I'm going to make a careless mistake --2 times 2,
which is 4, plus 1 times 1, which is 1, plus 1 times
4, which is 4. Then here, you have minus 1
times 2, which is minus 2, 2 times 1, which is 2, plus
1 times 4, which is 4. So A transpose times B is
equal to 9, and this is going to be 4. So we can rewrite this guy right
here as the matrix A transpose A. Which is just 6, 1, 1, 6 times
my least squares solution-- so this is actually going to be in
the column space of A --is equal to A transpose times B,
which is just the vector 9 4. And this'll be a little bit more
straightforward to find a solution for. In fact, there will
be a solution. We proved it in the
last video. So to find a solution, let's
create our little augmented matrix: 6, 1, augmented
with a 9. You have the 4 there,
you get a 1 and a 6. Just like that. Let's put the left hand side in
reduced row echelon form. Actually, first I'm going
to swap these two rows. That's my first row operation
that I choose to do, just because I like to have
that 1 there. It's a nice pivot entry. So then it goes to 1,
6, 4, and 6, 1, 9. And then let me replace my
second row with the second row minus 6 times the first row. So I'm going to keep my
first row the same. So I have 1, 6, and 4. And then my second row, I'm
going to replace my second row with a second row minus
6 times the first row. So 6 minus 6 is 0. 1 minus 6 times 6, that's 1
minus 36, so that's minus 35. And then a 9 minus 6 times
4, is 9 minus 24. This one always gets
me in trouble. So 9 minus 24, that's the
negative of 24 minus 9, so that is minus 15. Let me make sure I didn't
make a careless mistake. 1 minus 36 is minus 35, 9
minus 24, is minus 15. So that's what I get
right there. And then let me go to the right
now, so let me let me divide this row right here, let
me divide it by minus 35. So I'm going to keep my
first row the same. 1, 6, and 4. And then this guy right
here, I'm going to divide by minus 35. So I'm going to get 0, 1, and
then minus 15 over minus 35. That's 15 over 35, or
that's 3 over 7, so that is equal to 3/7. And then let me just put this
in complete reduced row echelon form. That'll be nice. Let me keep my second
row the same. So my second row is
0, 1, and 3/7. And then my first row, I'm going
to replace it with my first row minus 6 times
my second row. So 1 minus 6 times 0 is 1, 6
minus 6 times 1 is 0, and then we have 4 minus 6 times 3/7. So 4 is 28/7. Let me write it up here. So we have 4, which is 28/7,
minus 6 times 3/7, so minus 18/7, right? That's 6 times 3/7. So this is going to
be equal to 10/7. And just like that, I've solved
this new equation. So you could say that-- Let
me write it this way. We could write that x star--
This is going to the first entry of x star, which
we could call x, is going to be 10/7. Let me write this. x star, the solution is going
to be 10/7 and 3/7. So I'm saying that if you take
x is equal to 10/7 and y is equal to 3/7, you're going
to get as close to a solution as possible. Let's see what that looks
like visually? What is 10/7? Let me write this down. x star
is equal to 10/7 and 3/7. We're saying the closest-- Our
least squares solution is x is equal to 10/7, so x is
a little over one. And then y is going to be 3/7,
a little less than 1/2. So our least squares solution
is going to be this one, right there. And so this, when you put this
value for x, when you put x is equal to 10/7 and y is equal
to 3/7, you're going to minimize the collective squares
of the distances between all of these guys. I drew this a little bit
too small to show that. But let's actually figure out
what our least, what our minimized difference is. Remember, the whole point of
the is to minimize the distance between Ax star, is
to minimize the distance between Ax star and B. Or between B and Ax star. Now, what was Ax
star equal to? Ax star was equal to 9/4. So this right here. Sorry, Ax star, that's not
equal to 9/4, that's A transpose Ax star
is equal to 9/4. Ax star is our original matrix
A, which is this one right here, so 2-- Let me write
it down here. I know it's off the
page right now. So our original matrix A was
2, minus 1, 1, 2, 1, 1. That was our original matrix
A right there. And then our x star,
we were able to determine, is 10/7 and 3/7. So Ax star is going to be
this product, so what does it equal to? It is equal to, it is going
to be a 3 by 1 matrix. So we got 2 times 10/7, which
is 20/7, minus 1 times 3/7, so minus 3/7. And then we get, let me see,
yep, that's 10 minus 3/7, and we have 10/7, minus 2 times
3/7, so minus 6/7. Or plus, sorry, this
is a plus. Right? 2 times 10/7 is 20/7 minus 1
times 3/7, then we have 1 times 10/7 plus 2 times 3/7. And then we have
10/7 plus 3/7. So Ax, so this is A and x
star, our least squares approximation for x, is
equal to what is this? This is 17/7, this is 16/7,
and this is 13/7. We want to find out with this
minimum distance is. So, let's see, this is going
to be this thing. So 17/7, 16/7, and 13/7
minus our original B. Our original B was
2, 1, and 4. So I'm claiming that my solution
that we just found, this minimizes this distance. Because this is the projection
of B onto the column space of A. We saw that before. So our B we see all the
way up here is 2, 1, 4, just like that. So if we take the length of
this-- let me switch colors --this is equal to the length. Let me write all of us
in sevenths-- I'll just do it in my head. I don't want to waste
too much time. So this is 17/7 minus
14/7, right? 2 is 14/7, so this is
going to be 3/7. Then it's 16/7 minus
7/7, so that's 9/7. And then we have 13/7 minus
28/7, so that is minus 15 over 7. So this is the vector that
separates the B that was not in my column space of A from
the projection of B. And if we find it's length,
it's length is going to be equal to-- Let's find the square
of it's length first. The square of its length is
going to be 3/7 squared, so that is 9/49, plus 9/7 squared,
which is 81/49, plus minus 15/7 squared. What's 15 squared? 15 squared is 225, I think. Let me make sure. I'm prone to careless
mistakes. 5 times 5 is 25, 1 times
5, this is 75, and then I have a 150. Yep, 225. So plus 225/49. Which is equal to-- So 9 plus 81
is 90, and then so 225 plus 90, to get a 5, was a 315. So this is equal to 315/49. Or, if we actually wanted the
difference, that's going to square root of this. So if we take just the regular
distance, that's equal to the square root of that. So it's equal to the square
root of 315 over 7. And square root of 315,
it looks like that is simplifiable. Does 9 go into it? Looks like 9 goes into it,
maybe 35 five times? So it would be what, 3 square
roots of 35 over 7. So that is just a measure. Let me put it this way, you're
not going to be able to find any member of R2, any values
of x and y, that's going to give you a smaller value than
this when you find the distance between it's solution
and the solution you were trying to get to. So this, based on our least
squares solution, is the best estimate you're going to get.
x is equal to 10/7, y is equal to 3/7. A little bit right,
just like that. Anyway, hopefully you found
that useful, and you're starting to appreciate that the
least squares solution is pretty useful.