Interpretation of Lagrange multipliers

Lagrange multipliers are more than mere ghost variables that help to solve constrained optimization problems...

Lagrange multipliers technique, quick recap

Image credit: By Nexcis (Own work) [Public domain], via Wikimedia Commons
When you want to maximize (or minimize) a multivariable function f(x, y, …) subject to the constraint that another multivariable function equals a constant, g(x, y, …) = c, follow these steps:
  • Step 1: Introduce a new variable λ, and define a new function L as follows:
    \[ L(x, y, \dots, \lambda) = f(x, y, \dots) - \lambda\,(g(x, y, \dots) - c) \]
    This function L is called the "Lagrangian", and the new variable λ is referred to as a "Lagrange multiplier".
  • Step 2: Set the gradient of L equal to the zero vector.
    \[ \nabla L(x, y, \dots, \lambda) = \mathbf{0} \quad \text{(the zero vector)} \]
    In other words, find the critical points of L.
  • Step 3: Consider each solution, which will look something like (x₀, y₀, …, λ₀). Plug each one into f, after first removing the λ₀ component, since f does not take λ as an input. Whichever one gives the greatest (or smallest) value is the maximum (or minimum) point you are seeking.
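The three steps above can be sketched in a few lines of Python. The problem used here (maximize f(x, y) = x + y on the circle x² + y² = 1) is not from this article; it is an assumed toy problem chosen because its critical points can be found by hand. The helper names `lagrangian` and `numerical_gradient` are illustrative only.

```python
import math

C = 1.0  # the constant c in the constraint g(x, y) = c (assumed toy value)

def f(x, y):
    # Function to maximize (assumed toy example).
    return x + y

def g(x, y):
    # Constraint function (assumed toy example).
    return x**2 + y**2

def lagrangian(x, y, lam):
    # Step 1: L(x, y, lam) = f(x, y) - lam * (g(x, y) - c)
    return f(x, y) - lam * (g(x, y) - C)

def numerical_gradient(func, point, eps=1e-6):
    # Central-difference approximation to the gradient of func at point.
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += eps
        down[i] -= eps
        grad.append((func(*up) - func(*down)) / (2 * eps))
    return grad

# Step 2: solving grad L = 0 by hand gives x = y = lam = 1/sqrt(2)
# or x = y = lam = -1/sqrt(2). Confirm both are critical points of L.
r = 1 / math.sqrt(2)
critical_points = [(r, r, r), (-r, -r, -r)]
for point in critical_points:
    assert all(abs(d) < 1e-6 for d in numerical_gradient(lagrangian, point))

# Step 3: drop the lam component, plug into f, and keep the largest value.
best = max(critical_points, key=lambda p: f(p[0], p[1]))
print("maximizing point:", best[:2], "maximum value:", f(best[0], best[1]))
```

Note that the gradient check treats λ as just another input of L, which is why the constraint itself shows up as the last component of the gradient.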

Budgetary constraints, revisited

The last article covering examples of the Lagrange multiplier technique included the following problem.
  • Problem: Suppose you are running a factory, producing some sort of widget that requires steel as a raw material. Your costs are predominantly human labor, which is $20 per hour for your workers, and the steel itself, which runs for $170 per ton. Suppose your revenue R is loosely modeled by the equation
    • h represents hours of labor
    • s represents tons of steel
    If your budget is $20,000, what is the maximum possible revenue?
You can get a feel for this problem using the following interactive diagram, which lets you see which values of (h,s) yield a given revenue (blue curve) and which values satisfy the constraint (red line).
The full details of the solution can be found in the last article. For our purposes here, you just need to know what happens in principle as we follow the steps of the Lagrange multiplier technique.
  • We start by writing the Lagrangian L(h,s,λ) based on the function R(h,s) and the constraint 20h+170s=20,000.
  • Then we find the critical points of L, meaning the solutions to ∇L(h, s, λ) = 0.
  • There might be several solutions (h,s,λ) to this equation,
    so for each one you plug in the h and s components to the revenue function R(h,s) to see which one actually corresponds with the maximum.
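As a rough numerical sketch of this procedure, the Python below searches along the budget line for the best split between labor and steel, then reads off λ from the gradient condition. The revenue formula is not shown in this text, so `revenue` uses an assumed stand-in, R(h, s) = 200·h^(2/3)·s^(1/3); the function names are also hypothetical. Under that assumption the multiplier comes out at about 2.59.

```python
WAGE = 20.0          # dollars per hour of labor (from the problem statement)
STEEL_PRICE = 170.0  # dollars per ton of steel (from the problem statement)
BUDGET = 20_000.0    # dollars (from the problem statement)

def revenue(h, s):
    # ASSUMPTION: stand-in revenue model, not taken from the text above.
    return 200 * h ** (2 / 3) * s ** (1 / 3)

def steel_on_budget_line(h, budget):
    # Solve 20h + 170s = budget for s (clamped at 0 to guard against rounding).
    return max((budget - WAGE * h) / STEEL_PRICE, 0.0)

def best_allocation(budget, iterations=200):
    # Ternary search over h along the budget line. This relies on the
    # assumed revenue having a single peak along that line.
    def revenue_on_line(h):
        return revenue(h, steel_on_budget_line(h, budget))

    lo, hi = 0.0, budget / WAGE
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if revenue_on_line(m1) < revenue_on_line(m2):
            lo = m1
        else:
            hi = m2
    h = (lo + hi) / 2
    return h, steel_on_budget_line(h, budget)

h_star, s_star = best_allocation(BUDGET)

# At a critical point of L, grad R = lam * (20, 170),
# so both ratios below should give the same multiplier.
dR_dh = 200 * (2 / 3) * h_star ** (-1 / 3) * s_star ** (1 / 3)
dR_ds = 200 * (1 / 3) * h_star ** (2 / 3) * s_star ** (-2 / 3)
lam_from_labor = dR_dh / WAGE
lam_from_steel = dR_ds / STEEL_PRICE

print(f"h* = {h_star:.2f} hours, s* = {s_star:.2f} tons")
print(f"lambda* from labor = {lam_from_labor:.4f}, from steel = {lam_from_steel:.4f}")
```

A one-dimensional search is used here instead of solving ∇L = 0 directly, simply because the constraint lets s be written in terms of h; the agreement of the two λ values is what confirms the point is a critical point of L.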
It's common to write this maximizing critical point as (h*, s*, λ*), using asterisk superscripts to indicate that this is a solution. This means h* and s* represent the hours of labor and tons of steel you should allocate to maximize revenue subject to your budget. But how can we interpret the Lagrange multiplier λ* that comes with these maximizing values? This is the core question of the article.
It turns out that λ* tells us how much more money we can make by changing our budget.
Let's get a feel for what it means to change the budget. The following tool is similar to the one above, but now the red line representing which points (h,s) satisfy the budget constraint will shift as you let the budget vary around $20,000. This budget is represented with the variable b.
For each value of the budget b, try to maximize R while ensuring that the curves still touch each other. Notice that the maximum R-value you can achieve changes as b changes. We are interested in studying the specifics of that change.
Let M* represent the maximum revenue you achieve. In the next interactive diagram, the only variable you can change is b, and you can see how the value of M* depends on b.
In other words, this maximum revenue M* is a function of the budget b, so we write it as M*(b).
We can now express a truly wonderful fact: The Lagrange multiplier λ*(b) gives the derivative of M*:
    \[ \frac{dM^*}{db}(b) = \lambda^*(b) \]
In terms of the interactive diagram above, this means λ*(b) tells you the rate of change of the black dot representing M* as you move around the green dot representing b.
Showing why this is true is a bit tricky, but first, let's take a moment to interpret it. For example, if we found that λ*(b) = 2.59, it would mean each additional dollar added to your budget would yield roughly another $2.59 in revenue. Conversely, decreasing your budget by a dollar would cost you about that much in lost revenue.
This interpretation of λ comes up commonly enough in economics to deserve a name: "Shadow price". It is the money gained by loosening the constraint by a single dollar, or conversely the price of strengthening the constraint by one dollar.
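To see the shadow price as an actual dollar figure, the sketch below computes the maximum revenue at three nearby budgets and looks at the difference. As before, the revenue formula is an assumed stand-in (R(h, s) = 200·h^(2/3)·s^(1/3)), since the article's own formula does not appear in this text, and `max_revenue` is a hypothetical helper that plays the role of M*(b).

```python
def revenue(h, s):
    # ASSUMPTION: stand-in revenue model; the article's formula is not shown here.
    return 200 * h ** (2 / 3) * s ** (1 / 3)

def max_revenue(budget, iterations=200):
    # M*(b): the best revenue on the line 20h + 170s = budget,
    # found by ternary search over h (assumes a single peak on the line).
    def revenue_on_line(h):
        s = max((budget - 20 * h) / 170, 0.0)
        return revenue(h, s)

    lo, hi = 0.0, budget / 20
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if revenue_on_line(m1) < revenue_on_line(m2):
            lo = m1
        else:
            hi = m2
    return revenue_on_line((lo + hi) / 2)

b = 20_000
gain_from_extra_dollar = max_revenue(b + 1) - max_revenue(b)
loss_from_dollar_cut = max_revenue(b) - max_revenue(b - 1)

print(f"M*(b)              = {max_revenue(b):.2f}")
print(f"M*(b + 1) - M*(b)  = {gain_from_extra_dollar:.4f}")
print(f"M*(b) - M*(b - 1)  = {loss_from_dollar_cut:.4f}")
```

With this particular stand-in, both differences land near 2.59. In general the one-dollar difference only approximates the derivative dM*/db, so the two numbers need not match exactly for other revenue models.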

Generally speaking

Let's generalize what we just did with the budget example and see why it's true. Spelling out the full result is actually quite a mouthful, but it should be made clear by holding the following mantra in the back of your mind: "How does the solution change as the constraint changes?".
We start with the usual Lagrange multiplier setup. There is a function we want to maximize,
    \[ f(x, y) \]
and a constraint,
    \[ g(x, y) = c \]
Next we write the Lagrangian,
    \[ L(x, y, \lambda) = f(x, y) - \lambda\,(g(x, y) - c) \]
Let (x*, y*, λ*) be the critical point of L which solves our constrained optimization problem. In other words,
    \[ \nabla L(x^*, y^*, \lambda^*) = \mathbf{0} \]
and (x*, y*) maximizes f (subject to the constraint).
When we start to think of c as a variable, we must account for the fact that the solution (x*, y*, λ*) changes as the constraint c changes. To do this, we start writing each component as a function of c:
    \[ x^*(c), \quad y^*(c), \quad \lambda^*(c) \]
In other words, when the constraint equals some value c, the solution triplet to the Lagrange multiplier problem is (x*(c), y*(c), λ*(c)).
We now let M*(c) represent the (constrained) maximum value of f as a function of c, which can be written in terms of f, x*(c) and y*(c) as follows:
    \[ M^*(c) = f(x^*(c), y^*(c)) \]
The core result we wish to show is that
    \[ \frac{dM^*}{dc}(c) = \lambda^*(c) \]
This says that the Lagrange multiplier λ* gives the rate of change of the maximum value M* of the constrained problem as the constraint constant c varies.
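Before proving the result, it can help to watch it hold on a case where everything has a formula. The sketch below uses an assumed toy problem (not from this article): maximize f(x, y) = x + y subject to x² + y² = c. Solving ∇L = 0 by hand gives x*(c) = y*(c) = √(c/2) and λ*(c) = 1/√(2c); the code compares a numerical derivative of M*(c) against λ*(c).

```python
import math

# ASSUMED toy problem: maximize f(x, y) = x + y subject to x^2 + y^2 = c.
# The formulas below come from solving grad L = 0 by hand for this problem.

def x_star(c):
    return math.sqrt(c / 2)

def y_star(c):
    return math.sqrt(c / 2)

def lam_star(c):
    return 1 / math.sqrt(2 * c)

def M_star(c):
    # M*(c) = f(x*(c), y*(c)), the constrained maximum as a function of c.
    return x_star(c) + y_star(c)

def derivative(func, c, eps=1e-5):
    # Central-difference approximation to d(func)/dc.
    return (func(c + eps) - func(c - eps)) / (2 * eps)

for c in (0.5, 1.0, 4.0):
    print(f"c = {c}: dM*/dc = {derivative(M_star, c):.6f}, "
          f"lambda*(c) = {lam_star(c):.6f}")
```

Unlike a problem where M* happens to be linear in c, here λ* changes with c, so the match at several values of c is a slightly stronger sanity check.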

Want to outsmart your teacher?

Proving this result could be an algebraic nightmare, since there is no explicit formula for the functions x*(c), y*(c), λ*(c) or M*(c). This means you would have to start with the defining property of x*, y* and λ*, namely that ∇L(x*, y*, λ*) = 0, and reason your way towards dM*/dc. This is not at all straightforward (try it!).
There is a fun story, in which a professor was asked what the harshest truth he ever learned from a student was. He recalled a class he taught when he went through a long and algebraically heavy proof, only to be shown by a student that there is a much simpler approach. The lesson, he said, was that he was not as smart as he thought he was.
The result he was talking about just so happens to be what we are now trying to prove. Although the student's approach is not quite so simple as the story makes it out to be, it is still a clean way to view the problem. More importantly, it is easier to remember than other proofs, so I'll spell it out in full here. As happens so often in math, a little insight can save us from excessive algebra.

The insight

The underlying insight is that evaluating the Lagrangian itself at a solution (x*, y*, λ*) will give the maximum value M*. This is because the "g(x, y) - c" term in the Lagrangian goes to zero (since a solution must satisfy the constraint), so we have
    \[ L(x^*, y^*, \lambda^*) = f(x^*, y^*) - \lambda^*\,(g(x^*, y^*) - c) = f(x^*, y^*) = M^* \]
Given that we want to find dM*/dc, this suggests that we should find a way to treat L as a function of c. Then we might be able to relate the derivative we want to a derivative of L with respect to c.

The followthrough

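Start by treating L as a function of four variables instead of three, since c is now modeled as a changing value:
    \[ L(x, y, \lambda, c) = f(x, y) - \lambda\,(g(x, y) - c) \]
Reflection question: When L is written as a four-variable function like this, what is ∂L/∂c?
Answer: c appears only in the term -λ(g(x, y) - c), which contributes +λc, so ∂L/∂c = λ.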

This partial derivative is promising, since our goal is to show that dM*/dc = λ*, and we know that M* = L at solutions. However, we still have work to do.
To encode the fact that we only care about the value of L at a solution (x*, y*, λ*) for a given value of c, we replace x, y and λ with x*(c), y*(c) and λ*(c). These are functions of c which correspond to the solution of the Lagrangian problem for a given choice of the "constant" c.
This lets us write M* as a function of c as follows:
    \[ M^*(c) = L(x^*(c), y^*(c), \lambda^*(c), c) \]
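Even though this expression has only one variable, c, there is a four-variable function L as an intermediary. Therefore, to take its (ordinary) derivative with respect to c, we use the multivariable chain rule:
    \[ \frac{dM^*}{dc} = \frac{\partial L}{\partial x}\frac{dx^*}{dc} + \frac{\partial L}{\partial y}\frac{dy^*}{dc} + \frac{\partial L}{\partial \lambda}\frac{d\lambda^*}{dc} + \frac{\partial L}{\partial c}\frac{dc}{dc} \]
Note, each partial derivative in the expression above should be evaluated at (x*(c), y*(c), λ*(c), c), but writing that would make the expression messier than it already is.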
This might seem like a lot, but remember where the terms x*, y* and λ* each came from. Each partial derivative ∂L/∂x, ∂L/∂y, and ∂L/∂λ is zero when evaluated at (x*, y*, λ*). That's how a solution (x*, y*, λ*) is defined! This means the first three terms go to zero.
Moreover, since dc/dc = 1, the entire expression simplifies to
    \[ \frac{dM^*}{dc} = \frac{\partial L}{\partial c} \]
It's important to notice that this simplification relies on the special properties of solution points (x*, y*, λ*). Otherwise, working out the full derivative based on the multivariable chain rule could have been a nightmare!
For the sake of notational cleanliness, we left out the inputs to these derivatives, but let's write them in.
    \[ \frac{dM^*}{dc}(c) = \frac{\partial L}{\partial c}(x^*(c), y^*(c), \lambda^*(c), c) \]
Since we saw in the reflection question above that ∂L/∂c = λ, this means
    \[ \frac{dM^*}{dc}(c) = \lambda^*(c) \]
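The key facts used in this argument can be checked numerically on a small case. The sketch below uses an assumed toy problem (f(x, y) = x + y, g(x, y) = x² + y², with c = 2, none of which comes from this article), whose hand-computed solution is x* = y* = 1 and λ* = 1/2. It evaluates all four partial derivatives of the four-variable Lagrangian at the solution; the helper `partial` is illustrative.

```python
import math

def lagrangian(x, y, lam, c):
    # Four-variable Lagrangian for the ASSUMED toy problem:
    # f(x, y) = x + y, g(x, y) = x^2 + y^2.
    return (x + y) - lam * (x**2 + y**2 - c)

def partial(func, point, index, eps=1e-6):
    # Central-difference partial derivative of func in coordinate `index`.
    up = list(point)
    down = list(point)
    up[index] += eps
    down[index] -= eps
    return (func(*up) - func(*down)) / (2 * eps)

c = 2.0
x_s = y_s = math.sqrt(c / 2)     # x*(c) = y*(c) = 1 for c = 2
lam_s = 1 / math.sqrt(2 * c)     # lambda*(c) = 1/2 for c = 2
solution = (x_s, y_s, lam_s, c)

dL_dx, dL_dy, dL_dlam, dL_dc = (partial(lagrangian, solution, i) for i in range(4))

print("dL/dx, dL/dy, dL/dlam at the solution:", dL_dx, dL_dy, dL_dlam)
print("dL/dc at the solution:", dL_dc, " lambda*:", lam_s)
print("L at the solution:", lagrangian(*solution), " M*:", x_s + y_s)
```

The first three partials vanish, which is what wipes out the first three chain-rule terms, and the remaining partial with respect to c equals λ*. The last line also confirms the earlier insight that L evaluated at a solution equals M*.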

Want to join the conversation?

  • Shubham
    When calculating dM*/dc, why do we take partial derivatives with respect to x, y and λ and not x*, y* and λ*?
    (5 votes)
  • metavert
    Weird that so much time was spent on Lagrangians in this unit, but it doesn't appear on the unit test at all and there's not even a quiz. I'd have liked to test my understanding of it.
    (5 votes)
  • suhas
    A lot of textbooks interpret the Lagrange multiplier this way (see Strang, Gilbert). But there is an easier way without having to invent an auxiliary function with four variables.

    dM*/dc = df(x*, y*)/dc
    df(x*, y*)/dc = f_x(x*, y*) (dx*/dc) + f_y(x*, y*) (dy*/dc)
    , where the _x and _y are subscripts representing partial derivatives

    But, f_x(x*, y*) = λ* g_x(x*, y*)
    f_y(x*, y*) = λ* g_y(x*, y*)

    df(x*, y*)/dc = λ* [g_x(x*, y*) (dx*/dc) + g_y(x*, y*) (dy*/dc)] = λ* dg(x*, y*)/dc

    g(x*, y*) = c
    λ* dg(x*, y*)/dc = λ* dc/dc = λ*
    (5 votes)
  • H Adnan
    In the previous article, there was an example with λ = 0. Does this mean that increasing the budget does not affect the revenue? And how are the constraints related to the budget now?
    (3 votes)
    • Cihan Baran
      Yes, this isn't explained all that clearly.

      We are implicitly assuming that you are constrained by the budget - and thus increasing your budget should give you further revenue.

      Mathematically, if you are constrained by your budget, then the optimal solution is at the boundary of the surface, meaning for optimal x*, and optimal y*, g(x*,y*) = c . In this case you have a positive lambda. Increasing c will lead to different, better x* and y*.

      If you are not constrained by your budget, in the optimal case, you have g(x*, y*) < c . Thus increasing c doesn't give you any extra juice, as x* and y* don't change. In this case, lambda is 0.

      In this article, it is implicitly assumed that you are constrained by your budget (or whatever your constraint is) so that increasing c will lead to different solutions. Otherwise, it becomes trivial.
      (4 votes)
  • iam_apocalypse
    very nice explanation! I'm sure why we are interested in how the solution changes with a change in c?
    (2 votes)
  • maycol.medina
    Yeah, I can kind of understand and track the explanation, but I still have this feeling like I am leaving something out.
    I just can't say I understand the why of the Lagrange multipliers at all.
    (1 vote)
  • curty
    How do I check for cases when x or y is equal to 0?
    (1 vote)