Lesson 5: Multivariable chain rule

Multivariable chain rule

Name: Multivariable chain rule
Uploaded: 2016-05-21T00:30:03Z
Description: This is the simplest case of taking the derivative of a composition involving multivariable functions.

This is the simplest case of taking the derivative of a composition involving multivariable functions. Created by Grant Sanderson.

Aryan Chouhan
Posted 6 years ago.
Is this the 3Blue1Brown guy? Sounds really really similar!
(30 votes)
- m
  Posted 5 years ago.
  Yes, looks like he is. When I google his name, the first result is 3blue1brown.
  (24 votes)
steve
Posted 7 years ago.
Just want to clarify: is this the total differential? I thought of this not as the multivariable chain rule but as the product rule (since the chain rule is usually implied). Is that a different, but acceptable, understanding of it?
(10 votes)
- Still No Sheep
  Posted 7 years ago.
  Grant has introduced the multivariable chain rule by finding the derivative of a simple multivariable function using the single-variable chain and product rules. He then rewrites the result in a form that matches the multivariable chain rule, to demonstrate that the multivariable chain rule is equivalent to applying rules that we already know work.
  (7 votes)
White
Posted 7 years ago.
I'm surprised by how often the dot product comes up in multivar calc. Your Essence of Linear Algebra series was really helpful!
(8 votes)
sangitasharma7nov
Posted 10 months ago.
grant is baccc
(6 votes)
Fabian the Panda
Posted 8 months ago.
Grant is backkk!
(4 votes)
{Rayeed}^3
Posted 4 years ago.
dx/dx =1 But dx/∂x= ?
(3 votes)
- Tanzz
  Posted 6 months ago.
  In calculus, "dx" represents an infinitesimal change in the variable "x," and it's often used in the context of finding derivatives. When you write "dx/dx = 1," it means the derivative of "x" with respect to "x" is equal to 1, which is a tautological statement. Essentially, it's saying that a change in "x" with respect to "x" is always 1, which is true because it's a straightforward change in the same variable.
  
  However, when you write "dx/∂x," you are mixing an ordinary differential (dx) with the partial-derivative symbol (∂x), which typically implies that you are dealing with a function of multiple variables. The partial derivative symbol (∂) is used in multivariable calculus to indicate that you are taking a derivative with respect to one variable while keeping the other variables constant.
  
  So, "dx/∂x" doesn't have a straightforward interpretation without context. The result would depend on the specific function you are differentiating with respect to "x" (∂x) and how it depends on other variables.
  
  In general, "dx/∂x" is a notation that isn't commonly used because it's somewhat ambiguous. You would typically see "∂f/∂x" to represent the partial derivative of a function "f" with respect to "x."
  
  Let's consider a simple example of a function of two variables, say, "f(x, y) = x^2 + 2xy + y^2." We can find the partial derivative of this function with respect to "x," denoted as ∂f/∂x:
  
  f(x, y) = x^2 + 2xy + y^2
  
  ∂f/∂x is found by treating "y" as a constant and taking the derivative of "f" with respect to "x." The derivative of "x^2" with respect to "x" is "2x," the derivative of "2xy" with respect to "x" is "2y," and the derivative of "y^2" with respect to "x" is 0 because "y" is a constant with respect to "x." So, we have:
  
  ∂f/∂x = 2x + 2y
  
  Note that "dx/∂x" is not the reciprocal of ∂f/∂x. The reciprocal, 1 / (2x + 2y), would instead describe how "x" changes with "f" while "y" is held constant (wherever 2x + 2y is nonzero), which is a different quantity.
  
  If "x" is simply one of the independent variables, the most natural reading of "dx/∂x" is ∂x/∂x: the rate of change of "x" with respect to itself while the other variables are held constant. That is 1, just like dx/dx.
  (2 votes)
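The partial derivative ∂f/∂x = 2x + 2y computed in this answer can be sanity-checked numerically. The sketch below is only an illustration: the helper name `partial_x`, the step size `h`, and the test point are assumptions made here, and a central difference gives an approximation rather than an exact derivative.

```python
def f(x, y):
    """Example function from the answer: f(x, y) = x^2 + 2xy + y^2."""
    return x**2 + 2 * x * y + y**2


def partial_x(func, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in x.

    Hypothetical helper for this sketch: y is held fixed while x is
    nudged by an assumed small step h in both directions.
    """
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


# Arbitrary test point, chosen only for illustration.
x0, y0 = 1.5, -0.5
estimate = partial_x(f, x0, y0)
exact = 2 * x0 + 2 * y0  # the formula 2x + 2y from the answer

# The two values should agree up to a small numerical error.
print(abs(estimate - exact) < 1e-6)
```

Holding `y` fixed inside `partial_x` is exactly what "treating y as a constant" means in the answer.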
diamantidisno3
Posted 5 years ago.
Why is f[x(t), y(t)] considered a function of one variable?
(2 votes)
- sedon
  Posted 4 years ago.
  One input (t) goes to two intermediate outputs, x(t) and y(t), and those two values go into f to give one output. The two intermediate values are part of the process; overall there is only one input and one output.
  (3 votes)
Blackout119
Posted 7 years ago.
Hey, so quick question. At the very end you write out the multivariable chain rule with the "x" term leading. However, the example worked throughout the video ends up with the "y" term in front. Would this not be a contradiction, since the placement of a negative within this rule influences the result? For example, look at -sin(t). This value makes the right side of the addition negative, so now you are essentially subtracting. If you change the order, as you would have to based on your "complete" formula at the end, the negative would now be in front of the addition, and you are adding a positive to a negative.
Isn't this wrong or am I just off my rocker?
(0 votes)
- Ethan Zhu
  Posted 6 years ago.
  Isn't adding a positive to a negative the same as adding a negative to a positive?
  a + (-b) = (-b) + a
  (5 votes)
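Written out for the example in the video, with f(x, y) = x^2 y, x(t) = cos(t) and y(t) = sin(t), the two orderings of the terms give the same sum, because addition is commutative:

```latex
\begin{aligned}
\frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial x}\frac{dx}{dt}
  &= \cos^2(t)\cos(t) + 2\cos(t)\sin(t)\,\bigl(-\sin(t)\bigr) \\
  &= 2\cos(t)\sin(t)\,\bigl(-\sin(t)\bigr) + \cos^2(t)\cos(t) \\
  &= \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}
\end{aligned}
```

The negative sign stays attached to the factor -sin(t) inside its own term, so moving the whole term does not change the result.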
wangjianan45
Posted 18 days ago.
Lesson 5 changed back to Grant. welcome.
the teacher of Lesson 4 is great as well.
(2 votes)
Noah Schwartz
Posted 7 years ago.
To visualize f(x(t), y(t)) in 3D space, would t be the length of the curve?
(2 votes)

Video transcript

- [Voiceover] So I've written here three different functions. The first one is a multivariable function: it has a two-variable input, x, y, and a single-variable output, x squared times y, which is just a number. The other two functions are each just regular old single-variable functions. And what I want to do is start thinking about the composition of them. So, I'm going to take, as the first component, the value of the function x of t, so you pump t through that, and then you make that the first component of f. And the second component will be the value of the function y of t.

So, the image that you might have in your head for something like this is you can think of t as just living on a number line of some kind, then you have x and y, which is just the plane, so that will be, you know, your x-coordinate, your y-coordinate, two-dimensional space, and then you have your output, which is just whatever the value of f is. And for this whole composition of functions, you're thinking of x of t, y of t as taking a single point in t and kind of moving it over to two-dimensional space somewhere, and then from there, our multivariable function takes that back down. So, this is just a single-variable function, nothing too fancy going on in terms of where you start and where you end up; it's just what's happening in the middle.

And what I want to know is: what's the derivative of this function? It's just an ordinary derivative, not a partial derivative, because this is just a single-variable function, one variable input, one variable output. So how do you take its derivative? There's a special rule for this, it's called the chain rule, the multivariable chain rule, but you don't actually need it. So, let's actually walk through this, showing that you don't need it. It's not that you'll never need it, it's just that for computations like this you could go without it.
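As a rough illustration of the setup just described, the three functions can be sketched in a few lines of Python. The choices x(t) = cos(t) and y(t) = sin(t) are the ones used later in the video; the function names, and in particular the name `g` for the composition, are assumptions made for this sketch.

```python
import math


def f(x, y):
    """Multivariable function from the video: two inputs, one output."""
    return x**2 * y


def x_of_t(t):
    """First intermediary function, x(t) = cos(t)."""
    return math.cos(t)


def y_of_t(t):
    """Second intermediary function, y(t) = sin(t)."""
    return math.sin(t)


def g(t):
    """Composition f(x(t), y(t)); the name 'g' is assumed for this sketch.

    A single number t goes in, passes through the two-dimensional
    point (x(t), y(t)), and a single number comes out.
    """
    return f(x_of_t(t), y_of_t(t))


# Evaluate the composition at one sample input.
value = g(math.pi / 4)
print(value)
```

Because `g` takes one number and returns one number, its derivative is an ordinary derivative, even though a two-variable function sits in the middle.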
The chain rule is still a very useful theoretical tool, a very useful model to have in mind for what function composition looks like and implies for derivatives in the multivariable world. So, let's just start plugging things in here. If I have f of x of t, y of t, the first thing I might do is write, okay, f, and instead of x of t, just write in cosine of t, since that's the function that I have for x of t, and then y, we replace that with sine of t, and of course I'm hoping to take the derivative of this. And then from there, we can go to the definition of f, f of x, y equals x squared times y, which means we take that first component squared. So we'll take that first component, cosine of t, and then square it, and then we'll multiply it by the second component, sine of t, and again we're just taking this derivative.

And you might be wondering, okay, why am I doing this, you're just showing me how to take a first derivative, an ordinary derivative? But the pattern that we'll see is gonna lead us to the multivariable chain rule. And it's actually kind of surprising when you see it in this context, because it pops out in a way that you might not expect things to pop out.

So, continuing our chugging along, when you take the derivative of this, you do the product rule, left d right plus right d left. In this case, the left is cosine squared of t, we just leave that as it is, and multiply it by the derivative of the right, d right, so that's going to be cosine of t. Then we add to that the right, which is sine of t, keep that right side unchanged, and multiply it by the derivative of the left, and for that we use the chain rule, the single variable chain rule, where you think of taking the derivative of the outside, so you bring the two down, like you're taking the derivative of x squared to get two x, but you're just writing in cosine instead of x.
So that's two cosine of t, and then you multiply that by the derivative of the inside (that's a tongue twister), which is negative sine of t. And I'm afraid I'm gonna run off the edge here, certainly with the many, many parentheses that I need. I'm gonna rewrite it anyway because there's a certain pattern that I hope to make clear. So, let me just rewrite this side, let's copy that down here. You might be wondering why, but it'll become clear in just a moment why I want to do this. So, in this case, I'm gonna write this as two times cosine of t, times sine of t, and then all of that multiplied by negative sine of t. So this is the derivative of the composition of functions that ultimately was a single variable function, but it kind of wound through two different variables.

And I just want to make an observation in terms of the partial derivatives of f. So, let me just make a copy of this guy, give ourselves a little bit of room down here, paste that over here. So let's look at the partial derivatives of f for a second here. If I take the partial derivative with respect to x, partial x, that means y is treated as a constant. So I take the derivative of x squared to get two x, and then multiply it by that constant, which is just y. And if I also do it with respect to y, now y looks like a variable, x looks like a constant, so x squared also looks like a constant; constant times a variable, the derivative is just that constant, x squared.

These two, their pattern comes up in the ultimate result that we got. And this is the whole reason that I rewrote it. If you look at this two x y, you can see that over here, where cosine corresponds to x, sine corresponds to y, based on our original functions, and the x squared here corresponds with squaring the x that we put in there.
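For reference, the computation described so far can be written out in notation. This is a transcription of the spoken steps, with x(t) = cos(t) and y(t) = sin(t):

```latex
\begin{aligned}
\frac{d}{dt} f\bigl(x(t), y(t)\bigr)
  &= \frac{d}{dt}\Bigl[\cos^2(t)\,\sin(t)\Bigr] \\
  &= \cos^2(t)\,\cos(t) + \sin(t)\cdot 2\cos(t)\,\bigl(-\sin(t)\bigr) \\
  &= \cos^2(t)\,\cos(t) + 2\cos(t)\sin(t)\,\bigl(-\sin(t)\bigr), \\[4pt]
\frac{\partial f}{\partial x} &= 2xy, \qquad
\frac{\partial f}{\partial y} = x^2 .
\end{aligned}
```

Substituting x = cos(t) and y = sin(t) into 2xy and x^2 reproduces the factors 2cos(t)sin(t) and cos^2(t) in the last line of the derivative.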
Then if we take the derivative of our two intermediary functions, the ordinary derivative of x with respect to t, that's the derivative of cosine, negative sine of t. And then similarly, the derivative of y, just the ordinary derivative, no partials going on here, with respect to t, that's equal to cosine of t; the derivative of sine is cosine. And these guys show up, right? You see negative sine over here, and you see cosine show up over here.

And we can generalize this. We can write it down and say, at least for this specific example, it looks like the derivative of the composition is this part, which is the partial of f with respect to y, right, that's kind of what it looks like here once we've plugged in the intermediary functions, multiplied by this guy, which was the ordinary derivative of y with respect to t. And then very similarly, this guy was the partial of f with respect to x, partial x, and we're multiplying it by the ordinary derivative of x with respect to t.

And of course, when I write this partial f, partial y, what I really mean is you plug in for x and y the two coordinate functions, x of t, y of t. So, if I say partial f, partial y over here, what I really mean is you take that x squared and then you plug in x of t, squared, to get cosine squared. And same deal over here, you're always plugging things in, so you ultimately have a function of t. But this right here has a name: this is the multivariable chain rule. And it's important enough that I'll just write it out all on its own here.
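The pattern observed in this example can also be checked numerically. The sketch below compares the sum "partial of f times ordinary derivative" against a finite-difference estimate of the derivative of the composition. The function names, the step size `h`, and the sample point `t0` are assumptions made for illustration; the formulas for the partials and for the derivatives of cosine and sine are the ones from the video.

```python
import math


def f(x, y):
    """f(x, y) = x^2 * y, as in the video."""
    return x**2 * y


def df_dx(x, y):
    """Partial derivative of f with respect to x: 2xy."""
    return 2 * x * y


def df_dy(x, y):
    """Partial derivative of f with respect to y: x^2."""
    return x**2


def chain_rule(t):
    """Partials evaluated at (x(t), y(t)), times the ordinary derivatives."""
    x, y = math.cos(t), math.sin(t)
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    return df_dx(x, y) * dx_dt + df_dy(x, y) * dy_dt


def numeric_derivative(t, h=1e-6):
    """Central-difference estimate of d/dt f(cos t, sin t); h is assumed."""
    def g(s):
        return f(math.cos(s), math.sin(s))
    return (g(t + h) - g(t - h)) / (2 * h)


# Arbitrary sample point, chosen only for illustration.
t0 = 0.7
difference = abs(chain_rule(t0) - numeric_derivative(t0))
print(difference < 1e-8)
```

This only tests agreement at sample points, so it supports the pattern without proving it.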
If we take the ordinary derivative, with respect to t, of a composition of a multivariable function, in this case with just two variables, f of x of t, y of t, where we're plugging in two intermediary functions, x of t and y of t, each of which is just single variable, the result is that we take the partial derivative of f with respect to x, and we multiply it by the derivative of x with respect to t, and then we add to that the partial derivative of f with respect to y, multiplied by the derivative of y with respect to t. So, this entire expression here is what you might call the simple version of the multivariable chain rule.

There's a more general version, and we'll kind of build up to it, but this is the simplest example you can think of, where you start with one dimension, and then you move over to two dimensions somehow, and then you move from those two dimensions down to one. So, this is that, and in the next video I'm gonna talk about the intuition for why this is true. You know, here I just went through an example and showed, oh, it just happens to be true, it fits this pattern. But there's a very nice line of reasoning for where this comes from, and I'll also talk about a more generalized form, where we start using vector notation, which makes things look very clean, and I might even get around to a more formal argument for why this is true. So, we'll see you in the next video.
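In symbols, the rule stated in the transcript reads as follows, with both partial derivatives evaluated at the point (x(t), y(t)):

```latex
\frac{d}{dt}\, f\bigl(x(t), y(t)\bigr)
  = \frac{\partial f}{\partial x}\,\frac{dx}{dt}
  + \frac{\partial f}{\partial y}\,\frac{dy}{dt}
```

The right-hand side can equivalently be read as the dot product of the gradient of f with the vector (dx/dt, dy/dt), which is presumably the vector notation the next video refers to.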