### Course: Integral Calculus > Unit 5

Lesson 12: Lagrange error bound

# Taylor polynomial remainder (part 2)

The more terms we have in a Taylor polynomial approximation of a function, the closer we get to the function. But HOW close? In this video, we prove the Lagrange error bound for Taylor polynomials. Created by Sal Khan.

## Want to join the conversation?

- In the video when you integrate both sides isn't the value of n supposed to increase by one therefore making it n+1 instead of n-1?(51 votes)
- It's the order of the derivative that's decreasing. If you take the integral of f'(x), it's f(x) + C. If you take the integral of f''(x), it's f'(x) + C. You can see that the order of the derivative goes down by one each time: from ' to none, and from '' to '.(61 votes)

- I don't understand how Sal goes from the integral of E^(n+1)(x) dx to just E^(n)(x), around the 10 minute mark.(22 votes)
- The function is not E^(n+1). The (n+1) is actually how many derivatives we've taken. So when we take the antiderivative, we "lose" one derivative and go from (n+1) to (n). It's not a power, it's the order of the derivative :). Hope this helps!(54 votes)

- Can anyone explain to me why at 10:00, where -Ma <= c, we have to take the lower bound of c, making c = -Ma? I am really confused at that part.(17 votes)
- This part actually has an error, which can be fixed by using a definite integral between x and a (or a and x, depending on whether x is less than a), rather than an indefinite integral. I have sent a message to Mr. Khan about this error, so hopefully this gets corrected soon. Anyway, a definite integral with these bounds gets you from step to step quite smoothly :) Hope that helps!(21 votes)
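
One way to write out the definite-integral version of the first step that this answer describes is sketched below. It assumes a ≤ x ≤ b, that |E^(n+1)(t)| ≤ M on [a, b], and that E^(n)(a) = 0 as stated in the video; the dummy variable t is introduced here only for illustration.

```latex
\left|E^{(n)}(x)\right|
  = \left|E^{(n)}(x) - E^{(n)}(a)\right|
  = \left|\int_a^x E^{(n+1)}(t)\,dt\right|
  \le \int_a^x \left|E^{(n+1)}(t)\right|\,dt
  \le \int_a^x M\,dt
  = M\,(x - a)
```

With fixed limits of integration there is no constant c left to choose, so the step from the bound on the (n+1)th derivative to the bound on the nth derivative needs no minimizing argument.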

- Sal just assumed we know M in this video, but how would one actually find it?(18 votes)
- Since the function (here, the (n+1)th derivative of f) is continuous on the interval [a, b], you know there will be a maximum value. It can be at b, at a, or at some c where the derivative is zero (an extreme value).(13 votes)

- Why are we free to choose the value of the integration constant in the bound on the error function? Can we always choose any value of c while finding a function?(9 votes)
- We aren't "free" to choose c; it must be within certain constraints that arise when trying to calculate it. We choose the smallest value we can so that the potential difference between the rhs and lhs is as small as possible.(7 votes)

- Are there any videos that instruct on situations where M is given?(9 votes)
- There are subsequent examples that demonstrate how to find M for specific functions:

https://www.khanacademy.org/math/calculus-home/series-calc/taylor-series-calc/v/lagrange-error-bound-for-sine-function

https://www.khanacademy.org/math/calculus-home/series-calc/taylor-series-calc/v/lagrange-error-bound-exponential-example

(4 votes)

- Is it possible to make a better bound?(5 votes)
- In a general sense, for any given n, there is no better bound. You can prove this to yourself by constructing examples where E(x) is exactly equal to the bound shown in the video. Here is one such example. Let's say that f(x) = x + x^2 / 2 and that one takes a Taylor polynomial approximation with degree 1 ( n = 1 ) at zero ( a = 0). Then, the polynomial approximation is P(x) = x; the error function is E(x) = x^2 / 2; E''(x) = 1 and, thus, M = 1; and M (x-a)^(n+1) / (n+1)! = 1 * x^2 / 2! = E(x).

Of course, it is true that you can get a better approximation by increasing n. In the above example, you get a perfect approximation ( E(x) = 0 ) by increasing n from 1 to 2.(5 votes)
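
The example in this answer can be checked numerically. The following is a minimal sketch, not part of the original answer; the function names `f`, `taylor_deg1`, and `lagrange_bound` are made up for illustration. It compares the actual error of the degree-1 approximation with the bound M (x-a)^(n+1) / (n+1)! for M = 1, a = 0, n = 1.

```python
import math

def f(x):
    # The function from the example: f(x) = x + x^2 / 2.
    return x + x**2 / 2

def taylor_deg1(x):
    # Degree-1 Taylor polynomial of f centered at a = 0.
    return x

def lagrange_bound(M, x, a, n):
    # Lagrange error bound: M * |x - a|^(n+1) / (n+1)!
    return M * abs(x - a) ** (n + 1) / math.factorial(n + 1)

for x in (0.0, 0.25, 0.5, 1.0):
    error = abs(f(x) - taylor_deg1(x))
    bound = lagrange_bound(1.0, x, 0.0, 1)
    # For this particular f the error and the bound coincide.
    print(x, error, bound, math.isclose(error, bound, abs_tol=1e-12))
```

For this particular function the two columns agree at every sample point, which is what the answer means by the bound being attained.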

- When taking the antiderivative of the remainder at 8:10, why does that not have a constant added to it?

Shouldn't anti[E^(n+1)(x)] be equal to E^(n)(x) + C?(2 votes)
- It should indeed be E^(n)(x) + C, which Sal says if you wait until 8:45, after which he goes further and finds an appropriate value for C. You just need to have patience.(5 votes)

- so how do you determine the M in the final formula?(4 votes)
- You need to figure it out yourself. Let's say you have the function `e^x` and you have a Maclaurin polynomial with degree 3. The polynomial is `1 + x + 1/2 * x^2 + 1/6 * x^3`. Let's also say that you want to approximate it at `0.1`. Every derivative of `e^x` is `e^x` again, and the biggest it can get in the interval `[0, 0.1]` is obviously `e^0.1`, since `e^x` is an increasing function. This means that the `M` in this case will be `e^0.1`.(3 votes)
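
The numbers in this answer can be worked through in a few lines of Python. This is only a sketch under the answer's assumptions (degree 3, centered at 0, evaluated at 0.1, M = e^0.1); the helper names `maclaurin_exp` and `lagrange_bound` are hypothetical and chosen for illustration.

```python
import math

def maclaurin_exp(x, n):
    # Degree-n Maclaurin polynomial of e^x: sum of x^k / k! for k = 0..n.
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def lagrange_bound(M, x, a, n):
    # Lagrange error bound: M * |x - a|^(n+1) / (n+1)!
    return M * abs(x - a) ** (n + 1) / math.factorial(n + 1)

x, a, n = 0.1, 0.0, 3
M = math.exp(x)  # largest value of e^t on [0, 0.1], since e^t is increasing

approx = maclaurin_exp(x, n)
actual_error = abs(math.exp(x) - approx)
bound = lagrange_bound(M, x, a, n)

print("approximation:", approx)
print("actual error: ", actual_error)
print("error bound:  ", bound)
```

The actual error comes out at roughly 4.25e-6 and the bound at roughly 4.60e-6, so the bound holds with a little room to spare.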

- How do you find M for the error function?(4 votes)
- For sin(x), M is 1 because no matter how many times you take derivatives, the result always lies between -1 and 1. However, for unbounded functions like e^x, you have to check both endpoints a and b and take the greater value, so M is e^a or e^b. Otherwise it's the same.(2 votes)

## Video transcript

In the last video, we started to explore
the notion of an error function. Not to be confused with the expected value, because it really does use the same
notation. But here E is for error. And you will sometimes see it referred to as a remainder
function. And we saw it's really just the difference
between the function and
our approximation of the function. So for example, this distance right
over here, that is our error. That is our error at x is equal to b. And what we really care about is the
absolute value of it. Because at some points f of x might be
larger than the polynomial. Sometimes the polynomial might be larger
than f of x. What we care is the absolute distance
between them. And so what I want to do in this video is try to bound our error at
some b. Try to bound our error, so say it's less than or equal to some
constant value. Try to bound it at b for some b that is greater
than a. We're just going to assume that b is
greater than a. And we got to
a bit of a tantalizing result that seems like we might be able to
bound it in the last video. We saw that the n plus 1th derivative of
our error function is equal to the n plus 1th
derivative of our function. Or their absolute values would also be
equal to. So if we could somehow bound the n plus
1th derivative of our function over some interval, an
interval that matters to us. An interval that maybe has b in it. Then, we can, at least bound the n plus
1th derivative our error function. And then, maybe we can do a little bit of integration to bound the error itself at
some value b. So, let's see if we can do that. Well, let's just assume, let's just assume
that we're in a reality where we do know something about the n plus 1th
derivative of f of x. Let's say we do know this. Let me do it in a color that I haven't used
yet. Well, I'll do it in white. So let's say that that thing over there
looks something like that. So that is the n plus 1th derivative of f. And I only care about it over this
interval right over here. Who cares what it does later, I just gotta
bound it over the interval, cuz at the end of the day I just wanna
bound it at b right over there. So let's say that the absolute value of
this. Let's say that we know. Let me write it over here, let's say that
we know. We know that the absolute value of the n
plus 1th derivative, the n plus 1th. And, I apologize I actually switch between
the capital N and the lower-case n and I did that in the
last video. I shouldn't have, but now that you know that I did that, hopefully it doesn't
confuse you. N plus 1th, so let's say we know that the
n plus 1th derivative of f of x, the absolute value
of it, let's say it's bounded. Let's say it's less than or equal to some
m over the interval, cuz we only care about
the interval. It might not be bounded in general, but
all we care is it takes some maximum value over
this interval. So over the interval, I
could write it this way, x is a member of the interval between a
and b, so this includes both of them. It's a closed interval, x could be a, x could be b, or x could be anything in
between. And we can say this generally that, that this derivative will have some
maximum value. So this is its, the absolute value,
maximum value, max value, m for max. We know that it will have a maximum value,
if this thing is continuous. So once again we're going to assume that
it is continuous, that it has some maximum value over this
interval right over here. Well this thing, this thing right over
here, we know is the same thing as the n plus 1 derivative of
the error function. So then we know, so that
implies, that's a new color,
let me do that in blue, or that green. That implies that the n plus
1th derivative of the error function, the absolute value of it, because these are
the same thing, is also bounded by m. So that's a little bit of an interesting
result, but it gets us nowhere near there. It might look similar, but this is the n
plus 1th derivative of the error function. And we'll have to think about how we
can get an m in the future. We're assuming that we somehow know it,
and maybe we'll do some example problems where we
figure that out. But this is the n plus 1th derivative. We bounded its absolute value, but we really want to bound the actual error
function. The 0th derivative, you could say,
the actual function itself. What we could do is try to integrate both sides
of this and see if we can eventually get to E
of x, to get to our error function or our
remainder function. So let's do that. Let's take the integral, let's take the
integral of both sides of this. Now the integral on this left hand side,
it's a little interesting. We take the integral of the absolute
value. It would be easier if we were taking the
absolute value of the integral. And lucky for us, the way it's set up. So let me just write a little aside here. We know generally that if I take, and it's
something for you to think about. If I take, so if I have two options, if I
have two options, this option versus and I don't
know, they look the same right now. I know they look the same right now. So over here, I'm gonna have the integral
of the absolute value, and over here I'm going to have the
absolute value of the integral. Which of
these can be larger? Well, you just have to think about the
scenarios. So, if f of x is always positive over the
interval that you're integrating over, then
they're going to be the same thing. You're gonna get positive values. Take the absolute value of a positive
value. It doesn't make a difference. What matters is if f of x is negative. If f of x is negative the entire time, so if this is our x-axis, that
is our y-axis. Well, we saw if it's positive
the entire time, you're taking the absolute value of
a positive. It's not going to matter. These two things are going to be equal. If f of x is negative the whole time, then
this integral is going to
evaluate to a negative value. But then, you would take the absolute
value of it. And then over here,
the integral is going to evaluate to a positive value, and it's still
going to be the same thing. The interesting case is when f of x is
both positive and negative, so you can imagine
a situation like this. If f of x looks something like that, then this right over here, the integral, you'd
have positive. This would be positive and then this would
be negative right over here. And so they would cancel each other out. So this would be a smaller value than if you took the integral of the absolute
value. So the absolute value of f
would look something like this. So all of the areas, if
you view this as a definite
integral, all of the areas would
be positive. So you are going to get a bigger value when you take the integral of
an absolute value, especially when f of x goes both positive and negative over the
interval, than you would if you took the integral
first and then the absolute value. Cuz once again, if you took the integral
first, for something like this, you'd get a low value cause this
stuff would cancel out with this stuff right
over here, and then you'd take the absolute value of just a
lower magnitude number. And so in general, the integral, sorry, the absolute value of the
integral is going to be less than or equal to the
integral of the absolute value. So we can say, so this right here is the
integral of the absolute value which is going to be
greater than or equal. What we have written over here is just
this. That's going to be greater than or equal
to, and I think you'll see why I'm doing
this in a second. Greater than or equal to the absolute
value of the integral of the n plus
1th derivative of the error function of x, dx. And the reason why this is useful is that
we can still keep the inequality, that this is less
than or equal to this. But now, this is a pretty straightforward
integral to evaluate. The anti-derivative of the n
plus 1th derivative is going to be the nth
derivative. So this business right over here is just going to be the absolute value of the
nth derivative, the absolute value of the nth derivative
of our error function. Did I say expected value? I shouldn't. See, it even confuses me. This is the error function. I should've used R, R for remainder. But this is all error. Nothing about probability or
expected value in this video. This is E for error. So anyway, this is going to be the nth
derivative of our error function, which is going to be less
than or equal to this. Which is less than or equal to the
anti-derivative of M. Well, that's a constant. So that's going to be mx, mx. And since we're just taking indefinite
integrals. We can't forget the idea that we have a
constant over here. And in general, when you're trying to
create an upper bound you want as low of an upper bound as
possible. So we wanna minimize, we wanna minimize
what this constant is. And lucky for us, we do have, we do know
what this, what this function, what value this
function takes on at a point. We know that the nth derivative of our
error function at a is equal to 0. I think we wrote it over here. The nth derivative at a is equal to 0. And that's because the nth derivative of
the function and the approximation at a are going to be the
same exact thing. And so, if we evaluate both sides of this
at a, I'll do that over here on the side, we know
that the absolute value. We know the absolute value of the nth
derivative at a, we know that this thing is going to be equal to
the absolute value of 0. Which is 0. Which needs to be less than or equal to
when you evaluate this thing at a, which is less than or equal to
m a plus c. And so you can, if you look at this part of the inequality, you subtract m a from
both sides. You get negative m a is less than or equal
to c. So our constant here, based on that little
condition that we were able to get in the last
video. Our constant is going to be greater than
or equal to negative ma. So if we want to minimize the constant, if
we wanna get this as low of a bound as possible, we would wanna
pick c is equal to negative Ma. That is the lowest possible c that will meet these constraints that we know are
true. So, we will actually pick c to be negative
Ma. And then we can rewrite this whole thing
as the absolute value of the nth derivative of
the error function. The nth derivative of the error function. Not the expected value. I have a strange suspicion I might have
said expected value. But, this is the error function. The nth der. The absolute value of the nth derivative
of the error function is less than or equal to M times x minus
a. And once again all of the constraints
hold. This is for, this is for x as part of the
interval. The closed interval between, the closed
interval between a and b. But it looks like we're making progress. We at least went from the n plus 1th
derivative to the nth derivative. Let's see if we can keep going. So same general idea. If we know this, then we know that we can take the integral of both sides of
this. So we can take the integral of both sides
of this, the anti-derivative of both sides. And we know from what we figured out up
here that something that's even smaller than
this right over here is the absolute value of the integral
of the expected value. Now [LAUGH] see, I said it. Of the error function, not the expected
value. Of the error function. The nth derivative of the error function
of x dx. So we know that this is less than or equal
to that, based on the exact same logic there. And this is useful because
this is just going to be the n minus 1th derivative of
our error function of x. And of course we have the absolute value
outside of it. And now this is going to be less than or
equal to. It's less than or equal to this, which is
less than or equal to this, which is less than or equal to
this right over here. The anti-derivative of this right over
here is going to be M times (x minus a) squared over 2. You could do u-substitution if you want, or
you could just say, hey look, I have a little expression here, its
derivative is 1. So it's implicitly there, so I can just
treat it as kind of a u. So raise the exponent and then divide
by that exponent. But once again I'm taking indefinite
integrals. So I'm going to have a plus c over here. But let's use that same exact logic.
Let's evaluate
both sides of this at a. The left side, evaluated at a, we know is
going to be zero. We figured that out up here in the
last video. So you get, I'm gonna do it on the right
over here, you get zero when you evaluate the left
side at a. The right
side, evaluated at a, you get M times (a minus a)
squared over 2. So you are gonna get 0 plus c, so you are
gonna get 0 is less than or equal to c. Once again we want to minimize our constant, we wanna minimize our upper bound up
here. So we wanna pick the lowest possible c
that meets our constraints. So the lowest possible c that meets our
constraint is zero. And so the general idea here is that we
can keep doing this, we can keep doing exactly what we're doing all
the way. We keep integrating it in the exact
same way that I've done it, using
this exact same property here, all the way until we get the bound on the
error function of x. So you could view this as the 0th
derivative. You know, we're going all the way to the 0th derivative, which is really just the
error function. The absolute value of the error function of x is
going to be less than or equal to, and what's it
going to be? And you can already see the pattern here. It's going to be m times x minus
a. And the exponent, one way to think
about it, this exponent plus the order of this derivative is going to be equal
to n plus 1. Now this derivative is the 0th, so this
exponent is going to be n plus 1. And whatever the exponent is, you're going
to have, and maybe I should have done it, you're going to have n plus
1 factorial over here. And if you say, wait, where does this n
plus 1 factorial come from? I just had a two here. Well, think about what happens when we
integrate this again. You're going to raise this to the third
power and then divide by three. So your denominator is going to have two
times three. Then when you integrate it again, you're
going to raise it to the fourth power and then divide by
four. So then your denominator is going to be
two times three times four. Four factorial. So whatever power you're raising to, the denominator is going to be that power
factorial. But what's really interesting now is if we
are able to figure out that maximum value of
our function. If we're able to figure out that maximum
value of our function right there. We now have a way of bounding our error
function over that interval, over that interval
between a and b. So for example, the error function at b. We can now bound it if we know what an m
is. We can say the error function at b is
going to be less than or equal to m times b minus a to the n plus 1th power over n
plus 1 factorial. So that gets us a really powerful, I guess
you could call it, result, kinda the, the math
behind it. And now we can show some examples where
this could actually be applied.
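
Written in symbols, the pattern described in the transcript can be summarized as follows. This is a sketch rather than the notation used on screen: P is used here for the degree-n Taylor polynomial of f centered at a (a symbol chosen for illustration), E = f - P, and M is any number with |f^(n+1)(x)| ≤ M for all x in [a, b].

```latex
% Intermediate bounds after integrating k times, for a <= x <= b:
\left|E^{(n+1-k)}(x)\right| \le \frac{M\,(x-a)^{k}}{k!},
\qquad k = 0, 1, \dots, n+1

% The case k = n+1, evaluated at x = b, is the Lagrange error bound:
\left|E(b)\right| = \left|f(b) - P(b)\right| \le \frac{M\,(b-a)^{n+1}}{(n+1)!}
```

The cases k = 1 and k = 2 are the two steps worked out explicitly in the video, M (x - a) and M (x - a)^2 / 2; each further integration raises the exponent by one and multiplies the denominator by the new exponent, which is where the factorial comes from.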