Mathematical methods for economic theory

Martin J. Osborne

6.3 The envelope theorem

In economic theory we are often interested in how the maximal value of a function depends on some parameters.

Consider, for example, a firm that can produce output with a single input using the production function f. The standard theory is that the firm chooses the amount x of the input to maximize its profit pf(x) − wx, where p is the price of output and w is the price of the input. Denote by x*(w, p) the optimal amount of the input when the prices are w and p. An economically interesting question is: how does the firm's maximal profit pf(x*(w, p)) − wx*(w, p) depend upon p?

We have already answered this question in an earlier example. To do so, we used the chain rule to differentiate pf(x*(w, p)) − wx*(w, p) with respect to p, yielding

f(x*(w, p)) + x*p'(w, p)[pf'(x*(w, p)) − w],
and then used the fact that x*(w, p) satisfies the first-order condition pf'(x*(w, p)) − w = 0 to conclude that the derivative is simply f(x*(w, p)).

That is, the fact that the value of the variable satisfies the first-order condition allows us to dramatically simplify the expression for the derivative of the firm's maximal profit. On this page I describe results that generalize this observation to an arbitrary maximization problem.

Unconstrained problems

Consider the unconstrained maximization problem
maxx f(x, r),
where x is an n-vector and r is a k-vector of parameters. Assume that for any vector r the problem has a unique solution; denote this solution x*(r). Denote the maximum value of f, for any given value of r, by f*(r):
f*(r) = f(x*(r), r).
We call f* the value function.
Example
Let n = 1, k = 2, and f(x, r) = x^r1 − r2x, where 0 < r1 < 1. This function is concave (look at its second derivative), and any solution of maxx f(x, r) is positive, so a solution satisfies the first-order condition r1x^(r1−1) − r2 = 0. Thus x*(r) = (r1/r2)^(1/(1−r1)) is the solution of the problem, so that the value function of f is
f*(r)  =  (x*(r))^r1 − r2x*(r)  =  (r1/r2)^(r1/(1−r1)) − r2(r1/r2)^(1/(1−r1)).
We wish to find the derivatives of f* with respect to each parameter rh for h = 1, ..., k. First, because x*(r) is a solution of the problem when the parameter vector is r, if f is differentiable then x*(r) satisfies the first-order conditions
f'i(x*(r), r) = 0 for i = 1, ..., n.
Now, if the function x* is differentiable then differentiating f* with respect to rh using the chain rule we have
f*h'(r)  =  ∑i=1,...,n f'i(x*(r), r)·(∂xi*/∂rh)(r) + f'n+h(x*(r), r).
The first term corresponds to the change in f* caused by the change in the solution of the problem that occurs when rh changes; the second term corresponds to the direct effect of a change in rh on the value of f.

Given the first-order conditions, this expression simplifies to

f*h'(r)  =  f'n+h(x*(r), r) for h = 1, ..., k.
Note that the derivative on the right-hand side is the partial derivative of f with respect to rh (the (n + h)th variable in the vector (x, r)), holding x fixed at x*(r). The argument I have given for this result assumes that the maximization problem has a unique solution x* and this solution is differentiable (in r). These assumptions are not required for the conclusion; only the differentiability of f* is required. A precise statement (with proof) follows.
Proposition (Envelope theorem for an unconstrained maximization problem)  
Let f be a function of n + k variables, let r be a k-vector, and let the n-vector x* be a maximizer of f(x, r). Assume that the partial derivative f'n+h(x*, r) (i.e. the partial derivative of f with respect to rh at (x*, r)) exists. Define the function f* of k variables by
f*(r) = maxx f(x, r) for all r.
If the partial derivative f*h'(r) exists then
f*h'(r)  =  f'n+h(x*, r).
Proof  
We have f(x*, r) = f*(r) and f(x*, s) ≤ f*(s) for all s. Put differently, f(x*, s) − f*(s) ≤ 0 for all s and f(x*, r) − f*(r) = 0. Thus r maximizes f(x*, s) − f*(s). Because the partial derivatives f'n+h(x*, r) and f*h'(r) exist, the partial derivative of f(x*, s) − f*(s) with respect to sh at r exists, so that by a previous result we have f'n+h(x*, r) − f*h'(r) = 0.
This result says that the change in the maximal value of the function as a parameter changes is the change caused by the direct impact of the parameter on the function, holding the value of x fixed at its optimal value; the indirect effect, resulting from the change in the optimal value of x caused by a change in the parameter, is zero.

The next two examples illustrate how the result simplifies the calculation of the derivatives of the value function.

Example
Consider the problem studied at the start of this section, in which a firm can produce output, with price p, using a single input, with price w, according to the production function f. The firm's profit when it uses the amount x of the input is π(x, (w, p)) = pf(x) − wx, and its maximal profit is
π*(w, p)  =  pf(x*(w, p)) − wx*(w, p),
where x*(w, p) is the optimal amount of the input at the prices (w, p). The function π* is known as the firm's profit function. By the envelope theorem, the derivative of this function with respect to p is the partial derivative of π with respect to p evaluated at x = x*(w, p), namely
f(x*(w, p)).
In particular, the derivative is positive: if the price of output increases, then the firm's maximal profit increases.

Also by the envelope theorem, the derivative of the firm's maximal profit with respect to w is

−x*(w, p).
(This result is known as Hotelling's Lemma, after Harold Hotelling, 1895–1973.) In particular, this derivative is negative: if the price of the input increases, then the firm's maximal profit decreases.

A consequence of Hotelling's Lemma is that we can easily find the firm's input demand function x* if we know the firm's profit function, even if we do not know the firm's production function: we have x*(w, p) = −π*w'(w, p) for all (w, p), so we may obtain the input demand function by simply differentiating the profit function.
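To see these two results at work numerically, here is a short sanity check (my addition, not part of the text). The production function f(x) = √x is a hypothetical choice, made only because the firm's problem then has a closed-form solution; the check compares finite differences of the profit function with the envelope-theorem formulas.

```python
import math

# Hypothetical production function (my choice, for a closed form): f(x) = sqrt(x)
def x_star(w, p):
    # first-order condition p/(2*sqrt(x)) = w  gives  x*(w, p) = p^2/(4w^2)
    return p ** 2 / (4 * w ** 2)

def profit_star(w, p):
    # maximal profit pi*(w, p) = p*f(x*) - w*x*
    x = x_star(w, p)
    return p * math.sqrt(x) - w * x

w, p, h = 3.0, 5.0, 1e-6

# envelope theorem: d(pi*)/dp = f(x*(w, p))
dpi_dp = (profit_star(w, p + h) - profit_star(w, p - h)) / (2 * h)
assert abs(dpi_dp - math.sqrt(x_star(w, p))) < 1e-6

# Hotelling's Lemma: d(pi*)/dw = -x*(w, p)
dpi_dw = (profit_star(w + h, p) - profit_star(w - h, p)) / (2 * h)
assert abs(dpi_dw + x_star(w, p)) < 1e-6
```

The finite differences approximate the derivatives of π*; both assertions pass because the envelope formulas give those derivatives without differentiating π* directly.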

Example
Consider the earlier example, in which f(x, r) = x^r1 − r2x, where 0 < r1 < 1. We found that the solution of the problem
maxx f(x, r)
is given by
x*(r) = (r1/r2)^(1/(1−r1)).
Thus by the envelope theorem, the derivative of the maximal value of f with respect to r1 is the derivative of f with respect to r1 evaluated at x*(r), namely
(x*(r))^r1·ln x*(r),
or
(r1/r2)^(r1/(1−r1))·ln (r1/r2)^(1/(1−r1)).
(If you have forgotten how to differentiate x^r1 with respect to r1 (not with respect to x!), remind yourself of the rules.)

If you approach this problem directly, by calculating the value function explicitly and then differentiating it, rather than using the envelope theorem, you are faced with the task of differentiating

(x*(r))^r1 − r2x*(r)  =  (r1/r2)^(r1/(1−r1)) − r2(r1/r2)^(1/(1−r1))
with respect to r1, which is much more difficult than the task of differentiating the function f partially with respect to r1.
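The two routes can also be compared numerically. The sketch below (my addition, not part of the text) evaluates the envelope-theorem formula at the hypothetical parameter values r1 = 0.5, r2 = 2 and checks it against a central finite difference of the value function.

```python
import math

def x_star(r1, r2):
    # maximizer: x*(r) = (r1/r2)^(1/(1 - r1))
    return (r1 / r2) ** (1.0 / (1.0 - r1))

def f_star(r1, r2):
    # value function: f*(r) = (x*)^r1 - r2 * x*
    x = x_star(r1, r2)
    return x ** r1 - r2 * x

r1, r2, h = 0.5, 2.0, 1e-6  # hypothetical parameter values

# envelope theorem: df*/dr1 = (x*)^r1 * ln(x*), holding x fixed at x*(r)
x = x_star(r1, r2)
envelope = x ** r1 * math.log(x)

# direct route: central finite difference of the value function
numeric = (f_star(r1 + h, r2) - f_star(r1 - h, r2)) / (2 * h)

assert abs(envelope - numeric) < 1e-6
```

At these values x*(r) = 1/16, so the envelope formula gives (1/4)·ln(1/16) = −ln 2, which the finite difference of f* reproduces.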
A trivial example in which f does not have a unique maximizer is f(x, r) = r. In this case, the value of f does not depend on x, so that every value of x maximizes f. Thus a straightforward application of the chain rule to demonstrate the envelope theorem is not possible for this example. But the assumptions in my statement of the result are satisfied, and indeed the conclusion holds: f*(r) = r, so that f*'(r) = 1, and f'r(x*, r) = 1.

Here is an example in which f has a unique maximizer, but this maximizer is not a differentiable function of r.

Example
Define the function f of two variables by
f(x, r) = −(x − r)^2 if r ≥ 0, and f(x, r) = −(x + r)^2 if r < 0.
For each value of r, this function has a unique maximizer, given by x*(r) = |r|, so the maximizer is not differentiable at r = 0. However, the partial derivative f'r(x*(0), 0) exists (f(0, r) = −r^2 for all r) and f*(r) = 0 for all r, so that the assumptions of the result are satisfied. The conclusion holds for r = 0, the point at which x* is not differentiable: f*'(0) = 0 and f'r(0, 0) = 0.
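A numerical sketch (my addition, not part of the text) confirms the claims of this example at r = 0, the point where the maximizer has a kink.

```python
# The piecewise objective from the example
def f(x, r):
    return -(x - r) ** 2 if r >= 0 else -(x + r) ** 2

def x_star(r):
    # unique maximizer; has a kink at r = 0
    return abs(r)

def f_star(r):
    # value function; equals 0 for every r
    return f(x_star(r), r)

h = 1e-6

# f*'(0) = 0 even though x* is not differentiable at r = 0
slope_value = (f_star(h) - f_star(-h)) / (2 * h)

# partial derivative of f with respect to r at (x*(0), 0) = (0, 0): f(0, r) = -r^2
slope_direct = (f(0.0, h) - f(0.0, -h)) / (2 * h)

assert abs(slope_value) < 1e-9 and abs(slope_direct) < 1e-9
```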
Why is the result called the envelope theorem? The American Heritage Dictionary (3rd ed.) gives one meaning of “envelope” to be “A curve or surface that is tangent to every one of a family of curves or surfaces”. In the following figure, each black curve is the graph of f as a function of r for a fixed value of x. (Only a few values of x are considered; one can construct as many as one wishes.) Each of these graphs shows how f changes as r changes, for a given value of x. To find the solution of the maximization problem for any given value of r, we find the highest function for that value of r. For example, for r = r', the highest function is the one colored blue. The graph of the value function f* is the locus of these highest points; it is the envelope of the graphs for each given value of x. From the figure, the envelope theorem is apparent: the slope of the envelope at any given value of r is the slope of the graph of f(x*(r), r). (For example, the slope of the envelope at r' is the slope of the blue curve at r'.)

[Figure: graphs of f(x, ·) as functions of r for several fixed values of x, with their upper envelope, the value function f*(r); at r = r' the envelope is tangent to the graph of f(x*(r'), ·).]

Constrained problems

We may apply the same arguments to maximization problems with constraints. Consider the problem
maxx f(x, r) subject to gj(x, r) = 0 for j = 1, ..., m,
where x is an n-vector, r is a k-vector of parameters, and f and gj for j = 1, ..., m are continuously differentiable functions. Assume that for every value of r the problem has a single solution, and denote this solution x*(r). As before, denote the maximum value of f, for any given value of r, by f*(r):
f*(r) = f(x*(r), r).
Call f* the value function.

We want to calculate the derivatives f*h'(r) for h = 1, ..., k of the function f*. If x* is differentiable, then using the chain rule we have

f*h'(r)  =  ∑i=1,...,n f'i(x*(r), r)·(∂x*i/∂rh)(r) + f'n+h(x*(r), r).
Now, if the Jacobian matrix of the constraints has rank m then by an earlier result for any value of r there are unique numbers λ1(r), ..., λm(r) such that the solution x*(r) satisfies the first-order conditions
f'i(x*(r), r) − ∑j=1,...,m λj(r)(∂gj/∂xi)(x*(r), r) = 0 for i = 1, ..., n.
Thus we have
f*h'(r)  =  ∑i=1,...,n [∑j=1,...,m λj(r)(∂gj/∂xi)(x*(r), r)]·(∂x*i/∂rh)(r) + f'n+h(x*(r), r).
Reversing the sums, we obtain
f*h'(r)  =  ∑j=1,...,m λj(r)[∑i=1,...,n (∂gj/∂xi)(x*(r), r)·(∂x*i/∂rh)(r)] + f'n+h(x*(r), r).
Now, for all j = 1, ..., m we have gj(x*(r), r) = 0 for all r. Differentiating this equation with respect to rh we get
∑i=1,...,n (∂gj/∂xi)(x*(r), r)·(∂xi*/∂rh)(r) + (∂gj/∂rh)(x*(r), r) = 0.
Hence
f*h'(r)  =  −∑j=1,...,m λj(r)(∂gj/∂rh)(x*(r), r) + f'n+h(x*(r), r).
Now define the Lagrangean function L by
L(x, λ, r)  =  f(x, r) − ∑j=1,...,m λjgj(x, r),
where λ = (λ1, ..., λm). Then we have
f*h'(r) = L'n+m+h(x*(r), λ(r), r).
We have proved the following result.
Proposition (Envelope theorem for constrained maximization problems)
Let f and g1, ..., gm be continuously differentiable functions of n + k variables, with m ≤ n. Suppose that for all values of the k-vector r the problem
maxx f(x, r) subject to gj(x, r) = 0 for j = 1, ..., m,
where x is an n-vector, has a unique solution, which is differentiable in r. Denote this solution x*(r) and define the function f* of k variables by
f*(r)  =  maxx f(x, r) subject to gj(x, r) = 0 for j = 1, ..., m.
Suppose that the rank of the m × n matrix in which the (j, i)th component is (∂gj/∂xi)(x*(r), r) is m. Define the function L by
L(x, λ, r)  =  f(x, r) − ∑j=1,...,m λjgj(x, r) for every (x, λ, r),
where x is an n-vector, λ is an m-vector, and r is a k-vector. Then
f*h'(r)  =  L'n+m+h(x*(r), λ(r), r) for h = 1, ..., k,
where λ(r) = (λ1(r), ..., λm(r)) and λj(r) is the value of the Lagrange multiplier associated with the jth constraint at the solution of the problem.
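A quick numerical check of this result (my addition), using a hypothetical problem not in the text: maximize x1x2 subject to x1 + x2 − r = 0, for which x1* = x2* = r/2, λ(r) = r/2, and f*(r) = r^2/4.

```python
# Hypothetical constrained problem (not from the text):
#   max x1*x2  subject to  x1 + x2 - r = 0
# Solution: x1 = x2 = r/2, multiplier lambda(r) = r/2, value f*(r) = r^2/4.

def f_star(r):
    # value function
    return r ** 2 / 4.0

def lam(r):
    # Lagrange multiplier at the solution
    return r / 2.0

r, h = 3.0, 1e-6

# envelope theorem: f*'(r) equals dL/dr = lambda(r), since the constraint
# enters the Lagrangean as -lambda*(x1 + x2 - r)
numeric = (f_star(r + h) - f_star(r - h)) / (2 * h)
assert abs(numeric - lam(r)) < 1e-6
```

The derivative of the value function is recovered from the multiplier alone, without differentiating the solution (x1*, x2*) with respect to r.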
Example
Consider a utility maximization problem:
maxx u(x) subject to p·x = w,
where x is a vector (a bundle of goods), p is the price vector, and w is the consumer's wealth (a number). Denote the solution of the problem by x*(p, w), and denote the value function by v, so that
v(p, w) = u(x*(p, w)) for every (p, w).
The function v is known as the indirect utility function.

By the envelope theorem for constrained maximization problems we have

(∂v/∂pi)(p, w) = −λ*(p, w)xi*(p, w)
(since u does not depend independently on p or w) and
(∂v/∂w)(p, w) = λ*(p, w).
Thus
(∂v/∂pi)(p, w)/(∂v/∂w)(p, w)  =  −xi*(p, w).
That is, if you know the indirect utility function then you can recover the demand functions. This result is known as Roy's identity (after René Roy, 1894–1977).
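Roy's identity can be checked numerically for a hypothetical utility function (my choice, not from the text), u(x) = ln x1 + ln x2, whose demands and indirect utility have closed forms.

```python
import math

# Hypothetical utility function (my choice): u(x) = ln(x1) + ln(x2)
def demand(p1, p2, w):
    # Cobb-Douglas demands: x_i*(p, w) = w / (2 * p_i)
    return w / (2 * p1), w / (2 * p2)

def v(p1, p2, w):
    # indirect utility: utility evaluated at the demanded bundle
    x1, x2 = demand(p1, p2, w)
    return math.log(x1) + math.log(x2)

p1, p2, w, h = 2.0, 5.0, 20.0, 1e-6

dv_dp1 = (v(p1 + h, p2, w) - v(p1 - h, p2, w)) / (2 * h)
dv_dw = (v(p1, p2, w + h) - v(p1, p2, w - h)) / (2 * h)

# Roy's identity: x1*(p, w) = -(dv/dp1)/(dv/dw)
assert abs(-dv_dp1 / dv_dw - demand(p1, p2, w)[0]) < 1e-4
```

The demand for good 1 is recovered from the indirect utility function alone, as the identity promises.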