Mathematical methods for economic theory

Martin J. Osborne

7.3 Optimization with inequality constraints: the sufficiency of the Kuhn-Tucker conditions

We saw previously that for both an unconstrained maximization problem and a maximization problem with an equality constraint the first-order conditions are sufficient for a global optimum when the objective and constraint functions satisfy appropriate concavity/convexity conditions. The same is true for an optimization problem with inequality constraints.

Concave objective function

Proposition 7.3.1 (Sufficiency of Kuhn-Tucker conditions for concave objective function)  
Let f and gj for j = 1, ..., m be continuously differentiable functions of n variables defined on an open convex set S and let cj for j = 1, ..., m be constants. Suppose that f is concave and each gj is quasiconvex. If there exists an m-vector of numbers λ = (λ1, ..., λm) and a point x* ∈ S such that (x*, λ) satisfies the Kuhn-Tucker conditions then x* solves the problem
maxx∈S f(x) subject to gj(x) ≤ cj for j = 1, ..., m.
Proof  
The fact that (x*, λ) satisfies the Kuhn-Tucker conditions means that
f'i(x*) − ∑_{j=1}^{m} λjg'ji(x*) = 0 for i = 1, ..., n
λj ≥ 0, gj(x*) ≤ cj, and λj[gj(x*) − cj] = 0 for j = 1, ..., m
(where g'ji(x*) is the derivative of gj with respect to its ith argument, evaluated at x*).

Let x be such that gj(x) ≤ cj for all j. We need to show that f(x) ≤ f(x*).

Let j be an index such that gj(x*) = cj. (That is, the jth constraint is binding at x*.) Then gj(x) ≤ gj(x*), so that by a characterization of quasiconvex functions,
∑_{i=1}^{n} g'ji(x*)·(xi − x*i) ≤ 0.
Thus given λj ≥ 0 we have
λj∑_{i=1}^{n} g'ji(x*)·(xi − x*i) ≤ 0.

Now let j be an index such that gj(x*) < cj. Then by the complementary slackness condition for constraint j in the Kuhn-Tucker conditions, λj = 0.

Thus for every constraint j we have

λj∑_{i=1}^{n} g'ji(x*)·(xi − x*i) ≤ 0
and hence
∑_{j=1}^{m} λj∑_{i=1}^{n} g'ji(x*)·(xi − x*i) ≤ 0,
or, reversing the order of summation,
∑_{i=1}^{n} ∑_{j=1}^{m} λjg'ji(x*)·(xi − x*i) ≤ 0.
Now, multiplying the ith first-order condition in the Kuhn-Tucker conditions by xi − x*i and summing over i we get
∑_{i=1}^{n} f'i(x*)·(xi − x*i) − ∑_{i=1}^{n} ∑_{j=1}^{m} λjg'ji(x*)·(xi − x*i) = 0.
By the previous inequality, the second term is nonpositive, so
∑_{i=1}^{n} f'i(x*)·(xi − x*i) ≤ 0.
Thus, because f is concave, a characterization of concave functions yields f(x) ≤ f(x*), completing the proof.
This result together with Proposition 7.2.1, giving conditions under which the Kuhn-Tucker conditions are necessary, and the fact that a convex function (and hence a linear one) is quasiconvex yields the following useful corollary.
Corollary 7.3.1 (Necessity and sufficiency of Kuhn-Tucker conditions for concave objective function)
Let f and gj for j = 1, ..., m be continuously differentiable functions of n variables defined on an open convex set S and let cj for j = 1, ..., m be constants. Suppose that f is concave and
  • either each gj is linear
  • or each gj is convex and there exists x ∈ S such that gj(x) < cj for j = 1, ..., m.
Then x* ∈ S solves the problem
maxx∈S f(x) subject to gj(x) ≤ cj for j = 1, ..., m
if and only if there exists an m-vector of numbers λ = (λ1, ..., λm) such that (x*, λ) satisfies the Kuhn-Tucker conditions.
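The conditions in this corollary can be verified mechanically at a candidate point. The following Python sketch is an illustrative aside, not part of the text; the function name and interface are hypothetical, and the gradients of f and the gj are assumed to be supplied by the user.

```python
import numpy as np

def satisfies_kuhn_tucker(x_star, lam, grad_f, grads_g, gs, cs, tol=1e-8):
    """Check the Kuhn-Tucker conditions for max f(x) s.t. g_j(x) <= c_j
    at a candidate (x_star, lam), up to a numerical tolerance."""
    x_star, lam, cs = np.asarray(x_star), np.asarray(lam), np.asarray(cs)
    # Stationarity: f'_i(x*) - sum_j lam_j g'_ji(x*) = 0 for every i.
    stat = grad_f(x_star) - sum(l * gg(x_star) for l, gg in zip(lam, grads_g))
    gvals = np.array([gj(x_star) for gj in gs])
    return bool(np.max(np.abs(stat)) <= tol              # first-order conditions
                and np.all(gvals <= cs + tol)            # feasibility
                and np.all(lam >= -tol)                  # nonnegative multipliers
                and np.all(np.abs(lam * (gvals - cs)) <= tol))  # complementary slackness

# Small test problem: max -x^2 subject to x <= 1.
# The pair x* = 0, lam = 0 satisfies the conditions.
ok = satisfies_kuhn_tucker([0.0], [0.0],
                           grad_f=lambda x: -2 * x,
                           grads_g=[lambda x: np.ones(1)],
                           gs=[lambda x: x[0]], cs=[1.0])
```

Because −x² is concave and x is linear, the corollary then guarantees that x* = 0 solves this small test problem.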

Quasiconcave objective function

The condition that the objective function be concave is a bit too strong to be useful in some economic applications. For example, the assumption we would like to impose on a consumer's utility function is that it be quasiconcave. The next result is useful in this case.
Proposition 7.3.2 (Sufficiency of Kuhn-Tucker conditions for quasiconcave objective function)  
Let f and gj for j = 1, ..., m be continuously differentiable functions of n variables defined on an open convex set S and let cj for j = 1, ..., m be constants. Suppose that f is quasiconcave and each gj is quasiconvex. If there exists an m-vector of numbers λ = (λ1, ..., λm) and a point x* ∈ S such that (x*, λ) satisfies the Kuhn-Tucker conditions and it is not the case that f'i(x*) = 0 for i = 1, ..., n, then x* solves the problem
maxx∈S f(x) subject to gj(x) ≤ cj for j = 1, ..., m.
Proof  
The proof of the previous result, Proposition 7.3.1, uses the concavity of f only in the last sentence. In particular, under the assumption that f is quasiconcave we can conclude that
∑_{i=1}^{n} f'i(x*)·(xi − x*i) ≤ 0,
as in that proof. Now suppose, contrary to the claim that x* solves the maximization problem, that f(x) > f(x*) for some x that satisfies the constraints.

Denote the vector of partial derivatives of f by ∇f. Given the continuity of f, the fact that f(x) > f(x*) means that there exists t > 0 (small enough) such that f(x − t∇f(x*)) > f(x*), and the conclusion from the proof of Proposition 7.3.1 may be written

∇f(x*)·(x − x*) ≤ 0.
Thus
∇f(x*)·(x − t∇f(x*) − x*) = −t∇f(x*)·∇f(x*) + ∇f(x*)·(x − x*) ≤ −t∑_{i=1}^{n} (f'i(x*))^2 < 0,
given the assumption that ∇f(x*) ≠ (0, ..., 0). But now, by a characterization of quasiconcave functions, this inequality implies that f(x − t∇f(x*)) < f(x*), contradicting f(x − t∇f(x*)) > f(x*). Thus in fact f(x) ≤ f(x*).
The constraints in a standard consumer's optimization problem are linear, so the following implication of this result and Proposition 7.2.1, giving conditions under which the Kuhn-Tucker conditions are necessary, is useful.
Corollary 7.3.2 (Necessity and sufficiency of Kuhn-Tucker conditions for quasiconcave objective function)
Let f and gj for j = 1, ..., m be continuously differentiable functions of n variables defined on an open convex set S and let cj for j = 1, ..., m be constants. Suppose that f is quasiconcave and each gj is linear. If x* ∈ S solves the problem
maxx∈S f(x) subject to gj(x) ≤ cj for j = 1, ..., m
then there exists a unique m-vector λ such that (x*, λ) satisfies the Kuhn-Tucker conditions, and if there exists an m-vector λ and a point x* ∈ S such that (x*, λ) satisfies the Kuhn-Tucker conditions and it is not the case that f'i(x*) = 0 for i = 1, ..., n then x* solves the problem.
If you have a minimization problem, remember that you can transform it to a maximization problem by multiplying the objective function by −1. Thus for a minimization problem the condition on the objective function in the first result above is that it be convex, and the condition in the second result is that it be quasiconvex.
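In the other direction, numerical routines typically minimize, so a maximization problem is passed to them by negating the objective. A minimal Python sketch, assuming scipy is available (the helper name maximize is hypothetical):

```python
from scipy.optimize import minimize

def maximize(f, x0, constraints=()):
    """Maximize f by minimizing -f.  Constraints use scipy's convention
    {'type': 'ineq', 'fun': h}, meaning h(x) >= 0, i.e. h(x) = c_j - g_j(x)."""
    res = minimize(lambda x: -f(x), x0, constraints=constraints)
    return res.x, -res.fun

# max -(x - 2)^2 subject to x >= 1 (the first example below); solution x = 2.
x_opt, f_opt = maximize(lambda x: -(x[0] - 2) ** 2, x0=[0.0],
                        constraints=[{'type': 'ineq', 'fun': lambda x: x[0] - 1}])
```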

Examples

The next two simple examples illustrate how to use these results and the ones in Section 7.2 to solve optimization problems.
Example 7.3.1
Consider the problem
maxx∈S [−(x − 2)^2] subject to x ≥ 1,
where S is the set of all numbers. This problem is illustrated in the following figure.

[Figure: the graph of −(x − 2)^2 for 0 ≤ x ≤ 4, with the constraint marked by the vertical line x = 1.]

Written in the standard format, this problem is

maxx∈S [−(x − 2)^2] subject to 1 − x ≤ 0.
The set S is open and convex, and the objective and constraint functions are continuously differentiable on this set. The objective function is concave and the constraint function is linear, and hence both concave and quasiconvex. Thus by Proposition 7.2.1 the Kuhn-Tucker conditions are necessary (if x* solves the problem then there is a number λ such that (x*, λ) satisfies the Kuhn-Tucker conditions) and by Proposition 7.3.1 they are sufficient (if (x*, λ) satisfies the Kuhn-Tucker conditions then x* solves the problem).

The Lagrangean is the function L defined by

L(x) = −(x − 2)^2 − λ(1 − x).

The Kuhn-Tucker conditions are

−2(x − 2) + λ  = 0
1 − x ≤ 0, λ ≥ 0, and λ(1 − x) = 0.
From the last condition we have either λ = 0 or x = 1.
  • x = 1: 2 + λ = 0, or λ = −2, which violates λ ≥ 0.
  • λ = 0: −2(x − 2) = 0; the only solution is x = 2.
Thus the Kuhn-Tucker conditions have a unique solution, (x, λ) = (2, 0). Hence x = 2 is the unique solution of the problem.
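The two-branch argument can be mirrored in code. This Python sketch is specific to this example and simply follows the cases just listed, keeping a branch only if every Kuhn-Tucker condition holds:

```python
solutions = []

# Case x = 1 (the constraint binds): -2(1 - 2) + lam = 0 gives lam = -2.
lam = 2 * (1 - 2)          # stationarity rearranged: lam = 2(x - 2) at x = 1
if lam >= 0:               # violated (lam = -2), so this branch is discarded
    solutions.append((1.0, float(lam)))

# Case lam = 0 (the constraint is slack): -2(x - 2) = 0 gives x = 2.
x = 2.0
if 1 - x <= 0:             # feasible, so (x, lam) = (2, 0) is kept
    solutions.append((x, 0.0))
```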
Example 7.3.2
Consider the problem
maxx∈S [−(x − 2)^2] subject to x ≥ 3,
where S is the set of all numbers. This problem is illustrated in the following figure.

[Figure: the graph of −(x − 2)^2 for 0 ≤ x ≤ 4, with the constraint marked by the vertical line x = 3.]

Written in the standard format, this problem is

maxx∈S [−(x − 2)^2] subject to 3 − x ≤ 0.
As in the previous example, 7.3.1, the objective function is concave and the constraint function is linear, so that x* is a solution of the problem if and only if there is a number λ such that (x*, λ) is a solution of the Kuhn-Tucker conditions.

The Lagrangean is

L(x) = −(x − 2)^2 − λ(3 − x).

The Kuhn-Tucker conditions are

−2(x−2) + λ  = 0
3 − x ≤ 0, λ ≥ 0, and λ(3 − x) = 0.
From the last condition we have either λ = 0 or x = 3.
  • x = 3: −2 + λ = 0, or λ = 2.
  • λ = 0: −2(x − 2) = 0; since x ≥ 3 this has no solution compatible with the other conditions.
Thus the Kuhn-Tucker conditions have a single solution, (x, λ) = (3, 2). Hence x = 3 is the unique solution of the problem.
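As a numerical cross-check of this conclusion (a sketch assuming scipy is installed), solve the problem directly and recover the multiplier from the stationarity condition −2(x − 2) + λ = 0:

```python
from scipy.optimize import minimize

# Maximize -(x - 2)^2 subject to x >= 3 by minimizing (x - 2)^2.
res = minimize(lambda x: (x[0] - 2) ** 2, x0=[5.0],
               constraints=[{'type': 'ineq', 'fun': lambda x: x[0] - 3}])
x_star = res.x[0]
lam = 2 * (x_star - 2)     # stationarity: -2(x - 2) + lam = 0
```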
These two examples illustrate a procedure for finding solutions of the Kuhn-Tucker conditions that is useful in many problems. First, look at the complementary slackness conditions, which imply that either a Lagrange multiplier is zero or a constraint is binding. Then follow through the implications of each case, using the other equations. In the two examples, this procedure is very easy to follow. In the following examples doing so is more complicated.
Example 7.3.3
Consider the problem
max(x1, x2)∈S [−(x1 − 4)^2 − (x2 − 4)^2] subject to x1 + x2 ≤ 4 and x1 + 3x2 ≤ 9,
where S is the set of all pairs of numbers. The set S is open and convex, and the objective and constraint functions are continuously differentiable on this set. The objective function is concave and each constraint function is linear, and hence both concave and quasiconvex. Thus by Proposition 7.2.1 the Kuhn-Tucker conditions are necessary (if x* solves the problem then there is a vector λ such that (x*, λ) satisfies the Kuhn-Tucker conditions) and by Proposition 7.3.1 they are sufficient (if (x*, λ) satisfies the Kuhn-Tucker conditions then x* solves the problem).

In Example 7.1.1 we previously found that the Kuhn-Tucker conditions for this problem are

−2(x1 − 4) − λ1 − λ2  = 0
−2(x2 − 4) − λ1 − 3λ2  = 0
x1 + x2 ≤ 4, λ1 ≥ 0, and λ1(x1 + x2 − 4)  = 0
x1 + 3x2 ≤ 9, λ2 ≥ 0, and λ2(x1 + 3x2 − 9)  = 0.
What are the solutions of these conditions? Start by looking at the two conditions λ1(x1 + x2 − 4) = 0 and λ2(x1 + 3x2 − 9) = 0. These two conditions yield the following four cases.
  • x1 + x2 = 4 and x1 + 3x2 = 9: In this case we have x1 = 3/2 and x2 = 5/2. Then the first two equations are
    5 − λ1 −   λ2  = 0
    3 − λ1 − 3λ2  = 0
    which imply that λ1 = 6 and λ2 = −1, which violates the condition λ2 ≥ 0.
  • x1 + x2 = 4 and x1 + 3x2 < 9, so that λ2 = 0: Then the first two equations imply x1 = x2 = 2 and λ1 = 4. All the conditions are satisfied, so (x1, x2, λ1, λ2) = (2, 2, 4, 0) is a solution.
  • x1 + x2 < 4 and x1 + 3x2 = 9, so that λ1 = 0: Then the first two equations imply x1 = 33/10 and x2 = 19/10, violating x1 + x2 < 4.
  • x1 + x2 < 4 and x1 + 3x2 < 9, so that λ1 = λ2 = 0: Then the first two equations imply x1 = x2 = 4, violating x1 + x2 < 4.
So (x1, x2, λ1, λ2) = (2, 2, 4, 0) is the only solution of the Kuhn-Tucker conditions. Hence the unique solution of the problem is (x1, x2) = (2, 2).
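The four cases can also be enumerated mechanically: for each subset of binding constraints, solve the resulting linear system in (x1, x2, λ1, λ2) and keep any solution satisfying all the Kuhn-Tucker conditions. A Python sketch specific to this example:

```python
import itertools
import numpy as np

A = np.array([[1.0, 1.0],      # g1(x) = x1 + x2
              [1.0, 3.0]])     # g2(x) = x1 + 3*x2
c = np.array([4.0, 9.0])

solutions = []
for binding in itertools.product([True, False], repeat=2):
    # Unknowns z = (x1, x2, lam1, lam2).  Stationarity for x_i reads
    # -2(x_i - 4) - lam1*A[0, i] - lam2*A[1, i] = 0, i.e.
    # -2*x_i - lam1*A[0, i] - lam2*A[1, i] = -8.
    M = np.zeros((4, 4))
    b = np.zeros(4)
    for i in range(2):
        M[i, i] = -2.0
        M[i, 2:] = -A[:, i]
        b[i] = -8.0
    for j in range(2):
        if binding[j]:
            M[2 + j, :2] = A[j]      # constraint j binds: g_j(x) = c_j
            b[2 + j] = c[j]
        else:
            M[2 + j, 2 + j] = 1.0    # constraint j slack: lam_j = 0
    z = np.linalg.solve(M, b)
    x, lam = z[:2], z[2:]
    # Keep the case only if it is feasible and the multipliers are nonnegative.
    if np.all(A @ x <= c + 1e-9) and np.all(lam >= -1e-9):
        solutions.append((x, lam))
```

Only the case in which the first constraint binds and the second is slack survives, reproducing (x1, x2, λ1, λ2) = (2, 2, 4, 0).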
The next two examples involve problems of the form
maxx u(x) subject to p·x ≤ w, x ≥ 0,
where u is a function of many variables, p is a vector, and w is a scalar. A standard consumer's maximization problem in economic theory takes this form; the techniques used in the example may be used also in problems with other specifications of the function u.
Example 7.3.4
Consider the problem
max(x, y)∈S xy subject to x + y ≤ 6, x ≥ 0, and y ≥ 0,
where S is the set of all pairs of numbers. This set is open and convex, and the objective and constraint functions are differentiable on it. Each constraint function is linear, and hence concave. Thus by Proposition 7.2.1 the Kuhn-Tucker conditions are necessary (if x* solves the problem then there is a vector λ such that (x*, λ) satisfies the Kuhn-Tucker conditions). Also, the objective function is continuous and the constraint set is compact, so by the extreme value theorem the problem has a solution. Thus x* is a solution of the problem if and only if there is a vector λ* such that (x*, λ*) satisfies the Kuhn-Tucker conditions and f(x*) ≥ f(x) for all values of x for which there is a vector λ such that (x, λ) satisfies the Kuhn-Tucker conditions.

(Note that we cannot use Proposition 7.3.1 because the objective function is not concave, and we cannot use Proposition 7.3.2 because although the objective function is quasiconcave on the set of vectors x with x ≥ 0, it is not quasiconcave on any open set that contains this set.)

The Lagrangean is given by

L(x, y) = xy − λ1(x + y − 6) + λ2x + λ3y.
The Kuhn-Tucker conditions are
y − λ1 + λ2  = 0
x − λ1 + λ3  = 0
λ1 ≥ 0, x + y ≤ 6, λ1(x + y − 6)  = 0
λ2 ≥ 0, x ≥ 0, λ2x  = 0
λ3 ≥ 0, y ≥ 0, λ3y  = 0.
  • If x > 0 and y > 0 then λ2 = λ3 = 0, so that λ1 = x = y from the first two conditions. Hence x = y = λ1 = 3 from the third condition. These values satisfy all the conditions.
  • If x = 0 and y > 0 then λ3 = 0 from the last condition and hence λ1 = x = 0 from the second condition. But now from the first condition λ2 = −y < 0, contradicting λ2 ≥ 0.
  • If x > 0 and y = 0 then λ2 = 0, and a symmetric argument yields a contradiction.
  • If x = y = 0 then λ1 = 0 from the third set of conditions, so that λ2 = λ3 = 0 from the first and second conditions. These values satisfy all the conditions.
We conclude that the Kuhn-Tucker conditions have two solutions, (x, y, λ1, λ2, λ3) = (3, 3, 3, 0, 0) and (0, 0, 0, 0, 0). We have f(3, 3) = 9 and f(0, 0) = 0, so the solution of the problem is (3, 3).
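The final step, comparing the objective at the two candidate points, is trivial to state in code (Python):

```python
# The two solutions of the Kuhn-Tucker conditions, projected onto (x, y),
# and the objective f(x, y) = xy; the problem's solution maximizes f among them.
kt_points = [(3.0, 3.0), (0.0, 0.0)]

def f(x, y):
    return x * y

best = max(kt_points, key=lambda p: f(*p))   # (3.0, 3.0), where f = 9
```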
In this example, the objective function is defined on an open convex set that contains the constraint set and is differentiable there. In the next example, there is no such open convex set.
Example 7.3.5
Consider the problem
max(x, y)∈S [x^{1/2} + y] subject to px + y ≤ I, x ≥ 0, and y ≥ 0
where S is the set of pairs (x, y) of numbers with x ≥ 0, and p > 0 and I > 0 are parameters. The set S is not open, and the objective function is defined only for x ≥ 0, so it is not defined on any open set that contains the constraint set. Thus the argument in the previous example cannot be used.

Instead, first consider the problem

max(x, y)∈S' [x^{1/2} + y] subject to px + y ≤ I and y ≥ 0
where S' = {(x, y): x > 0}. The set S' is open and convex, the objective and constraint functions are continuously differentiable on this set, the objective function is quasiconcave, and each constraint function is linear, so by Corollary 7.3.2 if x* ∈ S' solves the problem then there is a vector λ such that (x*, λ) satisfies the Kuhn-Tucker conditions and if there is a vector λ and a point x* ∈ S' such that (x*, λ) satisfies the Kuhn-Tucker conditions and not all the partial derivatives of the objective function are 0 at x* then x* solves the problem.

The Lagrangean of the modified problem is given by

L(x, y) = x^{1/2} + y − λ1(px + y − I) + λ2y.
The Kuhn-Tucker conditions are
(1/2)x^{−1/2} − λ1p  = 0
1 − λ1 + λ2  = 0
λ1 ≥ 0, px + y ≤ I, and λ1(px + y − I) = 0
λ2 ≥ 0, −y ≤ 0, and λ2y  = 0.
From the first condition, λ1 > 0 and x = 1/(2λ1p)^2. From the last condition, either λ2 = 0 or y = 0.
  • If λ2 = 0 then λ1 = 1 from the second condition, and hence x = 1/(4p^2) and y = I − 1/(4p). We have 1/(4p^2) > 0, so (x, y, λ1, λ2) = (1/(4p^2), I − 1/(4p), 1, 0) is a solution of the Kuhn-Tucker conditions if I − 1/(4p) ≥ 0, or p ≥ 1/(4I).
  • If y = 0 then x = I/p, λ1 = 1/(2(pI)^{1/2}), and λ2 = λ1 − 1, so that λ2 ≥ 0 if and only if λ1 ≥ 1, or p ≤ 1/(4I).
Thus for each pair (p, I) of values of the parameters, the Kuhn-Tucker conditions have a single solution, in which
(x, y) = (I/p, 0) if p ≤ 1/(4I)
(x, y) = (1/(4p^2), I − 1/(4p)) if p > 1/(4I).
Neither partial derivative of the objective function is 0 at any point (x, y), so we conclude that the solution of the modified problem for each value of (p, I) is given by this expression.

Now return to the original problem. The maximal value of the objective function when x = 0 is I. At the solution of the modified problem, the value of the objective function is (I/p)^{1/2} if p ≤ 1/(4I), and 1/(2p) + I − 1/(4p) = I + 1/(4p) if p > 1/(4I). If p ≤ 1/(4I) then (I/p)^{1/2} ≥ 2I, so in both cases the value of the objective function exceeds I. We conclude that the solution of the original problem is the solution of the modified problem.
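The piecewise solution can be packaged as a function of the parameters and checked against a brute-force search along the budget line (a Python sketch; the search uses the fact that the budget constraint binds at any solution, since the objective is increasing in y):

```python
import numpy as np

def demand(p, I):
    """The solution derived above for max x^(1/2) + y s.t. px + y <= I, x, y >= 0."""
    if p <= 1 / (4 * I):
        return I / p, 0.0
    return 1 / (4 * p ** 2), I - 1 / (4 * p)

def grid_best(p, I, n=2001):
    """Brute-force check: evaluate the objective along the budget line y = I - px."""
    xs = np.linspace(0.0, I / p, n)
    ys = I - p * xs
    k = int(np.argmax(np.sqrt(xs) + ys))
    return xs[k], ys[k]
```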