Mathematical methods for economic theory: 5.2 Local optima

5.2 Local optima

One variable

Occasionally we are interested only in the local maximizers or minimizers of a function. We may be able to tell whether a stationary point is a local maximizer, a local minimizer, or neither by examining the second derivative of the function at the stationary point.

Proposition 5.2.1 (Second-order conditions for local optimum of function of one variable)

Let f be a twice-differentiable function of a single variable with a continuous second derivative, defined on the interval I. Suppose that f'(x*) = 0 for some x* in the interior of I (so that x* is a stationary point of f).

If f"(x*) < 0 then x* is a local maximizer of f.
If x* is a local maximizer of f then f"(x*) ≤ 0.
If f"(x*) > 0 then x* is a local minimizer of f.
If x* is a local minimizer of f then f"(x*) ≥ 0.

Proof

To prove the first point, note that given that f" is the derivative of f', we have

f"(x*) = lim_{h→0} [f'(x* + h) − f'(x*)]/h.

Given f'(x*) = 0, we thus have

f"(x*) = lim_{h→0} f'(x* + h)/h.

Hence if f"(x*) < 0, there exists ε > 0 such that f'(x* + h) > 0 for −ε < h < 0 and f'(x* + h) < 0 for 0 < h < ε. Thus f is increasing on (x* − ε, x*) and decreasing on (x*, x* + ε), so that x* is a local maximizer of f.

The proof of the third point is similar.

To prove the second point, suppose that x* is a local maximizer of f. If f"(x*) > 0 then from the third point, x* is also a local minimizer of f. But then for some ε > 0, f is constant on the interval (x* − ε, x* + ε), in which case f"(x*) = 0, contradicting f"(x*) > 0.

The proof of the fourth point is similar.

If f"(x*) = 0 then we don't know, without further investigation, whether x* is a local maximizer or local minimizer of f, or neither (check the functions x⁴, −x⁴, and x³ at x = 0). In this case, information about the signs of the higher order derivatives may tell us whether a point is a local maximizer or a local minimizer. In practice, however, these conditions are rarely useful; I do not discuss them.
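The test in Proposition 5.2.1 can be sketched in a few lines of Python for polynomials, whose derivatives can be computed exactly. This is only an illustration: the coefficient-list representation and the names derivative, evaluate, and second_order_test are my own assumptions, not part of the text above.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where ck multiplies x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def second_order_test(coeffs, x_star):
    """Apply the sufficient conditions of the one-variable second-order test at x_star."""
    first = derivative(coeffs)
    second = derivative(first)
    if evaluate(first, x_star) != 0:
        return "not stationary"
    d2 = evaluate(second, x_star)
    if d2 < 0:
        return "local maximizer"
    if d2 > 0:
        return "local minimizer"
    # f''(x*) = 0: the test says nothing without further investigation.
    return "inconclusive"


examples = {
    "x^2": [0, 0, 1],
    "-x^2": [0, 0, -1],
    "x^4": [0, 0, 0, 0, 1],
    "-x^4": [0, 0, 0, 0, -1],
    "x^3": [0, 0, 0, 1],
}
for name, coeffs in examples.items():
    print(name, "at 0:", second_order_test(coeffs, 0))
```

The functions x⁴, −x⁴, and x³ all come back as inconclusive at 0, even though 0 is a minimizer of the first, a maximizer of the second, and neither for the third.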

Many variables

As for a function of a single variable, a stationary point of a function of many variables may be a local maximizer, a local minimizer, or neither, and we may be able to distinguish the cases by examining the second-order derivatives of the function at the stationary point.

Let (x₀, y₀) be a stationary point of the function f of two variables. Suppose it is a local maximizer. Then certainly it must be a maximizer along the two lines through (x₀, y₀) parallel to the axes. Using the theory for functions of a single variable, we conclude that

f"₁₁(x₀, y₀) ≤ 0 and f"₂₂(x₀, y₀) ≤ 0,

where f"_ij denotes the second partial derivative of f with respect to its ith argument, then with respect to its jth argument.

However, even the variant of this condition in which both inequalities are strict is not sufficient for (x₀, y₀) to be a maximizer, as the following example shows.

Example 5.2.1

Consider the function f(x, y) = 3xy − x² − y². The first-order conditions are

f'₁(x, y)	= 3y − 2x = 0
f'₂(x, y)	= 3x − 2y = 0

so that f has a single stationary point, (x, y) = (0, 0). Now,

f"₁₁(0, 0)	= −2 ≤ 0
f"₂₂(0, 0)	= −2 ≤ 0.

But (0, 0) is not a local maximizer: at (0, 0) the value of the function is 0, but at (ε, ε) with ε > 0 the value of the function is 3ε² − ε² − ε² = ε², which is positive (and hence exceeds f(0, 0) = 0) no matter how small ε is.

The point (0, 0) is a col. If you walk due north or due south, you descend; if you walk due east or due west, you also descend. But if you walk northeast or southwest, you climb mountains. The Col d'Arratille is an example.

The function is plotted in the following figure.

This example shows that we cannot determine the nature of a stationary point of a function f of two variables by looking only at the partial derivatives f"₁₁ and f"₂₂ at the stationary point.
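The behavior of the function in Example 5.2.1 near (0, 0) can be checked numerically. The sketch below evaluates f a small step away from the origin in several compass directions; the step size eps and the choice of directions are illustrative assumptions.

```python
def f(x, y):
    return 3 * x * y - x**2 - y**2


eps = 1e-3  # illustrative step size; any small positive number gives the same signs

# Unit steps along the axes and along the two diagonals.
directions = {
    "east": (1, 0),
    "north": (0, 1),
    "northeast": (1, 1),
    "northwest": (-1, 1),
}

# Change in f when moving eps times each direction away from (0, 0).
changes = {
    name: f(eps * dx, eps * dy) - f(0, 0)
    for name, (dx, dy) in directions.items()
}

for name, change in changes.items():
    print(name, "climb" if change > 0 else "descend")
```

The function falls along both axes but rises along the northeast diagonal, which is exactly the information that f"₁₁ and f"₂₂ alone cannot reveal.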

The next result gives a condition that involves the definiteness of the Hessian of the function, and thus all the cross-partials. The result assumes that all the second-order partial derivatives f"_ij are continuous for all x in some set S, so that by Young's theorem we have f"_ij(x) = f"_ji(x) for all x ∈ S, and hence the Hessian is symmetric. (The condition on f is satisfied, for example, by any polynomial.)

Proposition 5.2.2 (Second-order conditions for local optimum of function of many variables)

Let f be a twice-differentiable function of n variables with continuous partial derivatives and cross partial derivatives, defined on the set S. Suppose that f'_i(x*) = 0 for i = 1, ..., n for some x* in the interior of S (so that x* is a stationary point of f). Let H be the Hessian of f.

If H(x*) is negative definite then x* is a local maximizer of f.
If x* is a local maximizer of f then H(x*) is negative semidefinite.
If H(x*) is positive definite then x* is a local minimizer of f.
If x* is a local minimizer of f then H(x*) is positive semidefinite.

Source: Proofs may be found in Sydsæter (1981) (Theorem 5.11, p. 243) and Simon and Blume (1994) (Theorem 30.10, p. 836).

An implication of this result is that if x* is a stationary point of f then

if H(x*) is negative definite then x* is a local maximizer
if H(x*) is negative semidefinite, but neither negative definite nor positive semidefinite, then x* is not a local minimizer, but might be a local maximizer
if H(x*) is positive definite then x* is a local minimizer
if H(x*) is positive semidefinite, but neither positive definite nor negative semidefinite, then x* is not a local maximizer, but might be a local minimizer
if H(x*) is neither positive semidefinite nor negative semidefinite then x* is neither a local maximizer nor a local minimizer.

For a function f of two variables, the Hessian is

	f"₁₁(x*)	f"₁₂(x*)
	f"₂₁(x*)	f"₂₂(x*)

This matrix is negative definite if f"₁₁(x*) < 0 and |H(x*)| > 0. (These two inequalities imply that f"₂₂(x*) < 0.) Thus the extra condition, in addition to the two conditions f"₁₁(x*) < 0 and f"₂₂(x*) < 0 considered originally, for x* to be a local maximizer is f"₁₁(x*)f"₂₂(x*) − f"₂₁(x*)f"₁₂(x*) > 0.

Similarly, sufficient conditions for a stationary point x* of a function of two variables to be a local minimizer are f"₁₁(x*) > 0 and |H(x*)| > 0 (which imply that f"₂₂(x*) > 0).

In particular, if, for a function of two variables, |H(x*)| < 0, then x* is neither a local maximizer nor a local minimizer. (Note that this condition is only sufficient, not necessary.)
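The conditions for a function of two variables can be collected into a small classifier. The following is a minimal sketch, assuming that the point is already known to be stationary and that the Hessian is symmetric (f"₁₂ = f"₂₁); the function name and the returned labels are my own choices.

```python
def classify_stationary_point(f11, f12, f22):
    """Classify a stationary point of a function of two variables.

    Arguments are the second partial derivatives at the point, with the
    Hessian assumed symmetric (f12 == f21). Only sufficient conditions
    are applied, so the result may be "inconclusive".
    """
    det = f11 * f22 - f12 * f12  # |H(x*)|
    if det > 0 and f11 < 0:
        return "local maximizer"  # Hessian negative definite
    if det > 0 and f11 > 0:
        return "local minimizer"  # Hessian positive definite
    if det < 0:
        return "neither"  # Hessian is neither positive nor negative semidefinite
    return "inconclusive"  # det == 0: the sufficient conditions do not apply


# Example 5.2.1: f(x, y) = 3xy - x^2 - y^2 at (0, 0), where f11 = f22 = -2, f12 = 3.
print(classify_stationary_point(-2, 3, -2))
```

When |H(x*)| = 0 the classifier deliberately returns "inconclusive" rather than guessing, since in that case the point may be a maximizer, a minimizer, or neither.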

A stationary point that is neither a local maximizer nor a local minimizer is called a saddle point. Examples are the point (0, 0) for the function f(x, y) = x² − y² and the point (0, 0) for the function f(x, y) = x⁴ − y⁴. In both cases, (0, 0) is a maximizer in the y direction given x = 0 and a minimizer in the x direction given y = 0; the graph of each function resembles a saddle for a horse. Note that not all saddle points look like saddles. For example, every point (0, y) is a saddle point of the function f(x, y) = x³. From the results above, a sufficient, though not necessary, condition for a stationary point x* of a function f of two variables to be a saddle point is |H(x*)| < 0.

A saddle point is sometimes defined to be a stationary point at which the Hessian is indefinite. (See, for example, Mathematics for economists by Simon and Blume, p. 399.) Under this definition, (0, 0) is a saddle point of f(x, y) = x² − y², but is not a saddle point of f(x, y) = x⁴ − y⁴. The definition I give appears to be more standard.

Example 5.2.2

Consider the function f(x, y) = x³ + y³ − 3xy. The first-order conditions for an optimum are

3x² − 3y	= 0
3y² − 3x	= 0.

Thus the stationary points satisfy y = x² = y⁴, so that either (x, y) = (0, 0) or y³ = 1. So there are two stationary points: (0, 0), and (1, 1).

Now, the Hessian of f at any point (x, y) is

H(x, y) =

	6x	−3
	−3	6y

Thus |H(0, 0)| = −9, so that (0, 0) is neither a local maximizer nor a local minimizer (i.e. is a saddle point). We have f"₁₁(1, 1) = 6 > 0 and |H(1, 1)| = 36 − 9 > 0, so that (1, 1) is a local minimizer.
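The conclusions of Example 5.2.2 can be cross-checked by brute force. The sketch below confirms that both points are stationary, recomputes the determinant of the Hessian, and compares f at each point with its values on a small surrounding grid. The grid (steps of 0.01 out to a distance of 0.1 in each coordinate) is an arbitrary choice, and a grid check of this kind is suggestive only, not a proof.

```python
def f(x, y):
    return x**3 + y**3 - 3 * x * y


def gradient(x, y):
    return (3 * x**2 - 3 * y, 3 * y**2 - 3 * x)


def hessian_det(x, y):
    # The Hessian is [[6x, -3], [-3, 6y]].
    return (6 * x) * (6 * y) - (-3) * (-3)


# Offsets -0.10, -0.09, ..., 0.10 in each coordinate (an illustrative neighborhood).
offsets = [k / 100 for k in range(-10, 11)]


def neighbor_differences(x0, y0):
    """Values of f(nearby point) - f(x0, y0) over the grid around (x0, y0)."""
    return [f(x0 + a, y0 + b) - f(x0, y0) for a in offsets for b in offsets]


diffs_origin = neighbor_differences(0, 0)
diffs_one = neighbor_differences(1, 1)

print("gradients:", gradient(0, 0), gradient(1, 1))
print("determinants:", hessian_det(0, 0), hessian_det(1, 1))
print("(0, 0) has lower and higher neighbors:",
      min(diffs_origin) < 0 < max(diffs_origin))
print("(1, 1) has no lower neighbor on the grid:", min(diffs_one) >= 0)
```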

The function is plotted in the following figure.

Example 5.2.3

Consider the function f(x, y) = 8x³ + 2xy − 3x² + y² + 1. We have

f'₁(x, y)	= 24x² + 2y − 6x
f'₂(x, y)	= 2x + 2y.

To find the stationary points of the function, solve the first-order conditions. From the second equation we have y = −x; substituting this into the first equation we find that 24x² − 8x = 8x(3x − 1) = 0. This equation has two solutions, x = 0 and x = 1/3. Thus there are two stationary points:

(x*, y*) = (0, 0) and (x**, y**) = (1/3, −1/3).

We have

f"₁₁(x, y) = 48x − 6, f"₂₂(x, y) = 2, and f"₁₂(x, y) = f"₂₁(x, y) = 2,

so the Hessian is

	48x − 6	2
	2	2

Look at each stationary point in turn.

(x*, y*) = (0, 0): We have f"₁₁(0, 0) = −6 < 0 and
f"₁₁(0, 0)f"₂₂(0, 0) − (f"₁₂(0, 0))² = −16 < 0.
So (x*, y*) = (0, 0) is neither a local maximizer nor a local minimizer (i.e. it is a saddle point).
(x**, y**) = (1/3, −1/3): We have f"₁₁(1/3, −1/3) = 10 > 0 and
f"₁₁(1/3, −1/3)f"₂₂(1/3, −1/3) − (f"₁₂(1/3, −1/3))² = 96/3 − 16 = 16 > 0.
So (x**, y**) = (1/3, −1/3) is a local minimizer, with f(1/3, −1/3) = 23/27.
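The arithmetic in Example 5.2.3 can be verified exactly with rational numbers. The sketch below uses Python's fractions module so that 1/3 is represented without rounding; the helper names are my own.

```python
from fractions import Fraction


def f(x, y):
    return 8 * x**3 + 2 * x * y - 3 * x**2 + y**2 + 1


def gradient(x, y):
    return (24 * x**2 + 2 * y - 6 * x, 2 * x + 2 * y)


def f11(x):
    return 48 * x - 6


F22 = 2  # second partial with respect to y (constant)
F12 = 2  # cross partial (constant)


def hessian_det(x):
    return f11(x) * F22 - F12**2


third = Fraction(1, 3)

for x, y in [(Fraction(0), Fraction(0)), (third, -third)]:
    print("point:", (x, y))
    print("  gradient:", gradient(x, y))
    print("  f11:", f11(x), " |H|:", hessian_det(x))
    print("  f:", f(x, y))
```

Using exact fractions avoids any doubt about whether a small nonzero gradient is a rounding error or a sign that the point is not stationary.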

The function is plotted in the following figure.
