Mathematical methods for economic theory: 3.3 Concave and convex functions of many variables

3.3 Concave and convex functions of many variables

Convex sets

To extend the notions of concavity and convexity to functions of many variables we first define the notion of a convex set.

Definition: A set S of n-vectors is convex if
(1−λ)x + λx' ∈ S whenever x ∈ S, x' ∈ S, and λ ∈ [0,1].

We call (1 − λ)x + λx' a convex combination of x and x'. Geometrically, the set of all convex combinations of two points x and x' is the line segment connecting x and x'.

For n = 1, the definition coincides with the definition of an interval: a set of numbers is convex if and only if it is an interval.

For n = 2, two examples are given in the following figures. The set in the first figure is convex, because every line segment joining a pair of points in the set lies entirely in the set. The set in the second figure is not convex, because the line segment joining the points x and x' does not lie entirely in the set.

The following property of convex sets (which you are asked to prove in an exercise) is sometimes useful.

Proposition 3.3.1: The intersection of convex sets is convex.

Note that the union of convex sets is not necessarily convex.
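The definition of a convex set can be explored numerically. The following Python sketch is only an illustration under assumptions of my own: the two sets (a closed disk and a ring), the helper names, and the random search are not part of the text. The search looks for two points of a set and a λ ∈ [0, 1] whose convex combination falls outside the set. Finding no such triple is evidence of convexity, not a proof.

```python
import random

def in_disk(p):
    """Closed unit disk in the plane: a convex set."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0

def in_ring(p):
    """Ring with inner radius 0.5 and outer radius 1: not a convex set."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0

def convex_combination(x, x_prime, lam):
    """Return (1 - lam) * x + lam * x_prime, computed componentwise."""
    return tuple((1 - lam) * a + lam * b for a, b in zip(x, x_prime))

def find_violation(member, trials=20000, seed=0):
    """Search for x, x' in the set and lam in [0, 1] with the combination outside the set.

    Returns a counterexample (x, x', lam), or None if the search finds none.
    """
    rng = random.Random(seed)

    def draw():
        # Rejection sampling: draw from the square [-1, 1]^2 until the point is in the set.
        while True:
            p = (rng.uniform(-1, 1), rng.uniform(-1, 1))
            if member(p):
                return p

    for _ in range(trials):
        x, x_prime, lam = draw(), draw(), rng.random()
        if not member(convex_combination(x, x_prime, lam)):
            return x, x_prime, lam
    return None

disk_violation = find_violation(in_disk)   # expected: no counterexample found
ring_violation = find_violation(in_ring)   # expected: some counterexample found
```

For the ring, a counterexample is easy to see by hand: the points (−0.8, 0) and (0.8, 0) are both in the ring, but their midpoint (0, 0) is not.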

Concave and convex functions

Let f be a function of many variables, defined on a convex set S. We say that f is concave if the line segment joining any two points on the graph of f is never above the graph; f is convex if the line segment joining any two points on the graph is never below the graph. (That is, the definitions are the same as the definitions for functions of a single variable.)

More precisely, we can make the following definition (which is again essentially the same as the corresponding definition for a function of a single variable). Note that only functions defined on convex sets are covered by the definition.

Definition

Let f be a function of many variables defined on a convex set S. Then f is

concave if for all x ∈ S, all x' ∈ S, and all λ ∈ (0,1) we have

f((1−λ)x + λx') ≥ (1−λ)f(x) + λf(x')
convex if for all x ∈ S, all x' ∈ S, and all λ ∈ (0,1) we have

f((1−λ)x + λx') ≤ (1−λ)f(x) + λf(x').

As for a function of a single variable, a strictly concave function satisfies the definition for concavity with a strict inequality (> rather than ≥) for all x ≠ x', and a strictly convex function satisfies the definition for convexity with a strict inequality (< rather than ≤) for all x ≠ x'.

Definition

Let f be a function of many variables defined on a convex set S. Then f is

strictly concave if for all x ∈ S, all x' ∈ S with x' ≠ x, and all λ ∈ (0,1) we have

f((1−λ)x + λx') > (1−λ)f(x) + λf(x')
strictly convex if for all x ∈ S, all x' ∈ S with x' ≠ x, and all λ ∈ (0,1) we have

f((1−λ)x + λx') < (1−λ)f(x) + λf(x').

Example 3.3.1

Let f be a linear function, defined by f(x) = a₁x₁ + ... + a_nx_n = a·x on a convex set, where a_i is a constant for each i. Then f is both concave and convex:

f((1 − λ)x + λx')	=	a·[(1−λ)x + λx'] for all x, x', and λ ∈ [0, 1]
	=	(1−λ)a·x + λa·x' for all x, x', and λ ∈ [0, 1]
	=	(1−λ)f(x) + λf(x') for all x, x', and λ ∈ [0, 1].
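The definitions can be checked on sample points. The sketch below is an illustration, not a proof. It computes the difference between the two sides of the defining inequality for a linear function (for which the difference should vanish, up to rounding) and for the function −(x₁² + x₂² + x₃²) (for which it should be positive whenever x ≠ x'). The coefficient vector, the second function, and all names are assumptions chosen for the example.

```python
import random

def combo(x, x_prime, lam):
    """Convex combination (1 - lam) * x + lam * x_prime."""
    return tuple((1 - lam) * a + lam * b for a, b in zip(x, x_prime))

def gap(f, x, x_prime, lam):
    """f((1-lam)x + lam x') - [(1-lam)f(x) + lam f(x')].

    Nonnegative for a concave f, nonpositive for a convex f.
    """
    return f(combo(x, x_prime, lam)) - ((1 - lam) * f(x) + lam * f(x_prime))

def linear(x, a=(2.0, -1.0, 0.5)):
    """f(x) = a . x for an arbitrarily chosen coefficient vector a."""
    return sum(ai * xi for ai, xi in zip(a, x))

def neg_sq_norm(x):
    """f(x) = -(x_1^2 + x_2^2 + x_3^2), a strictly concave function."""
    return -sum(xi * xi for xi in x)

rng = random.Random(1)

def draw():
    return tuple(rng.uniform(-5, 5) for _ in range(3))

# Each sample is a triple (x, x', lam) with lam strictly between 0 and 1.
samples = [(draw(), draw(), rng.uniform(0.01, 0.99)) for _ in range(1000)]
linear_gaps = [gap(linear, x, xp, lam) for x, xp, lam in samples]
strict_gaps = [gap(neg_sq_norm, x, xp, lam) for x, xp, lam in samples]
```

The values in linear_gaps are zero up to rounding error, consistent with a linear function being both concave and convex; the values in strict_gaps are all positive.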

Example 3.3.2

Suppose the function g of a single variable is concave on [a,b], and the function f of two variables is defined by f(x,y) = g(x) on [a, b] × [c, d]. Is f concave?

First note that the domain of f is a convex set, so the definition of concavity can apply.

The functions g and f are illustrated in the following figures. (The axes for g are shown in perspective, like those for f, to make the relation between the two figures clear. If we were plotting only g, we would view it straight on, so that the x-axis would be horizontal. Note that every cross-section of the graph of f parallel to the x-axis is the graph of the function g.)

From the graph of f (the roof of a horizontal tunnel), you can see that it is concave. The following argument is precise.

f((1−λ)(x, y) + λ(x', y'))
	=	f((1−λ)x + λx', (1−λ)y + λy')
	=	g((1−λ)x + λx')
	≥	(1−λ)g(x) + λg(x')
	=	(1−λ)f(x, y) + λf(x', y')

so f is concave.

Example 3.3.3

Let f and g be defined as in the previous example. Assume now that g is strictly concave. Is f strictly concave?

The strict concavity of g implies that

f((1−λ)(x, y) + λ(x', y'))
	>	(1−λ)f(x, y) + λf(x', y')

for all x ≠ x'. But to show that f is strictly concave we need to show that the inequality is strict whenever (x, y) ≠ (x', y')—in particular, for cases in which x = x' and y ≠ y'. In such a case, we have

f((1−λ)(x, y) + λ(x', y'))
	=	f(x, (1−λ)y + λy')
	=	g(x)
	=	(1−λ)f(x, y) + λf(x, y').

Thus f is not strictly concave. You can see the lack of strict concavity in the figure (in the previous example): if you take two (x, y) pairs with the same value of x, the line joining them lies everywhere on the surface of the function, never below it.
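The two cases in this argument can be seen in a small computation. The sketch assumes, purely for illustration, that g(x) = −x², which is strictly concave; the points and values of λ are also my own choices.

```python
def g(x):
    """An assumed example of a strictly concave function of one variable."""
    return -x * x

def f(x, y):
    """f depends on (x, y) only through x, as in the example."""
    return g(x)

def gap(p, q, lam):
    """f((1-lam)p + lam q) - [(1-lam)f(p) + lam f(q)] for points p, q in the plane."""
    (x, y), (xp, yp) = p, q
    combination = ((1 - lam) * x + lam * xp, (1 - lam) * y + lam * yp)
    return f(*combination) - ((1 - lam) * f(*p) + lam * f(*q))

# Case x != x': the inequality is strict.
gap_different_x = gap((0.0, 1.0), (2.0, 1.0), 0.5)

# Case x = x' and y != y': the two sides are equal, so f is not strictly concave.
gap_same_x = gap((1.0, 0.0), (1.0, 3.0), 0.25)
```

Here gap_different_x is 1 and gap_same_x is 0: moving only in the y direction leaves the value of f unchanged, so the inequality cannot be strict.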

Characterizations of concave and convex functions

Having seen many examples of concave functions, you should find it plausible that a function is concave if and only if the set of points under its graph—the set shaded pink in the following figure—is convex. The result is stated precisely in the following proposition.

Proposition 3.3.2

A function f of many variables defined on a convex set S is

concave if and only if the set of points on or below its graph is convex:
{(x, y): x ∈ S and y ≤ f(x)} is convex
convex if and only if the set of points on or above its graph is convex:
{(x, y): x ∈ S and y ≥ f(x)} is convex.

Proof

Let L = {(x, y): x ∈ S and y ≤ f(x)}.

First suppose f is concave and let (x, y) ∈ L and (x', y') ∈ L. Then x ∈ S, x' ∈ S, y ≤ f(x) and y' ≤ f(x'). The last two inequalities imply that

(1 − λ)y + λy' ≤ (1 − λ)f(x) + λf(x') for any λ ∈ [0, 1].

Now, because S is convex, (1 − λ)x + λx' ∈ S, so that (1 − λ)x + λx' is in the domain of f, and because f is concave,

(1 − λ)f(x) + λf(x') ≤ f((1 − λ)x + λx').

Thus

(1 − λ)y + λy' ≤ f((1 − λ)x + λx'),

and hence ((1 − λ)x + λx', (1 − λ)y + λy') = (1 − λ)(x, y) + λ(x', y') ∈ L, establishing that L is convex.

Conversely, suppose L is convex. Let x ∈ S and x' ∈ S. Then (x, f(x)) ∈ L and (x', f(x')) ∈ L, so by the convexity of L, (1 − λ)(x, f(x)) + λ(x', f(x')) = ((1 − λ)x + λx', (1 − λ)f(x) + λf(x')) ∈ L for any λ ∈ [0, 1]. Thus (1 − λ)f(x) + λf(x') ≤ f((1 − λ)x + λx'), establishing that f is concave.

The argument for a convex function is symmetric.

Another characterization of a concave function is the following generalization of Jensen's inequality for functions of a single variable.

Proposition 3.3.3 (Jensen's inequality)

A function f of many variables defined on a convex set S is concave if and only if for all n ≥ 2

f(λ₁x₁ + ... + λ_n x_n) ≥ λ₁f(x₁) + ... + λ_n f(x_n)

for all x₁ ∈ S, ..., x_n ∈ S and all λ₁ ≥ 0, ..., λ_n ≥ 0 with ∑_{i=1}^{n} λ_i = 1.

The function f of many variables defined on the convex set S is convex if and only if for all n ≥ 2

f(λ₁x₁ + ... + λ_n x_n) ≤ λ₁f(x₁) + ... + λ_n f(x_n)

for all x₁ ∈ S, ..., x_n ∈ S and all λ₁ ≥ 0, ..., λ_n ≥ 0 with ∑_{i=1}^{n} λ_i = 1.

Proof

Here is the proof for concavity; the proof for convexity is analogous.

If the inequality is satisfied for all n, it is satisfied in particular for n = 2, so that f is concave directly from the definition of a concave function.

Now suppose that f is concave. Then the definition of a concave function implies directly that the inequality is satisfied for n = 2. To show that it is satisfied for all n ≥ 3 I argue by induction. Let m ≥ 2 and suppose that the inequality is satisfied for all n ≤ m. I show that it is satisfied for n = m + 1. Take any x₁ ∈ S, ..., x_{m+1} ∈ S and λ₁ ≥ 0, ..., λ_{m+1} ≥ 0 with ∑_{i=1}^{m+1} λ_i = 1. If λ₁ = 1 then λ₂ = ... = λ_{m+1} = 0, so that the inequality is trivially satisfied. If λ₁ < 1 then

f(∑_{i=1}^{m+1} λ_i x_i)	=	f(λ₁x₁ + (1 − λ₁)∑_{i=2}^{m+1} (λ_i/(1−λ₁))x_i)
	≥	λ₁f(x₁) + (1 − λ₁)f(∑_{i=2}^{m+1} (λ_i/(1−λ₁))x_i)

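by taking x = x₁, x' = ∑_{i=2}^{m+1} (λ_i/(1−λ₁))x_i, and λ = 1 − λ₁ in the definition of a concave function. The m weights λ_i/(1−λ₁) for i = 2, ..., m + 1 are nonnegative and sum to 1, because ∑_{i=2}^{m+1} λ_i = 1 − λ₁. Thus, because the inequality is satisfied for n = m, we have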

f(∑_{i=2}^{m+1} (λ_i/(1−λ₁))x_i) ≥ ∑_{i=2}^{m+1} (λ_i/(1−λ₁))f(x_i),

so that

f(∑_{i=1}^{m+1} λ_i x_i)	≥	λ₁f(x₁) + (1 − λ₁)∑_{i=2}^{m+1} (λ_i/(1−λ₁))f(x_i)
	=	∑_{i=1}^{m+1} λ_i f(x_i),

completing the argument.
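Jensen's inequality and the splitting step used in the induction can both be checked on random data. The following sketch is an illustration under assumptions of my own: the concave function f(x₁, x₂) = √x₁ + √x₂ on the set of nonnegative pairs, the way the weights are drawn, and the helper names are not taken from the text.

```python
import math
import random

def f(x):
    """An assumed concave function on the set of pairs of nonnegative numbers."""
    return math.sqrt(x[0]) + math.sqrt(x[1])

def combine(points, weights):
    """The convex combination sum_i weights[i] * points[i]."""
    dim = len(points[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim))

def random_weights(rng, n):
    """n nonnegative weights that sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def jensen_gap(points, weights):
    """f(sum_i lam_i x_i) - sum_i lam_i f(x_i); nonnegative for a concave f."""
    return f(combine(points, weights)) - sum(w * f(p) for w, p in zip(weights, points))

def peel_first(points, weights):
    """Induction step: write the combination as lam_1 x_1 + (1 - lam_1) x'.

    x' is the convex combination of the remaining points with weights
    lam_i / (1 - lam_1). Requires lam_1 < 1.
    """
    lam1 = weights[0]
    rest = [w / (1 - lam1) for w in weights[1:]]
    return lam1, combine(points[1:], rest)

rng = random.Random(2)
gaps, split_errors = [], []
for _ in range(500):
    n = rng.randint(2, 6)
    points = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]
    weights = random_weights(rng, n)
    gaps.append(jensen_gap(points, weights))
    # Check that the split reproduces the original combination.
    lam1, x_prime = peel_first(points, weights)
    rebuilt = tuple(lam1 * a + (1 - lam1) * b for a, b in zip(points[0], x_prime))
    direct = combine(points, weights)
    split_errors.append(max(abs(r - d) for r, d in zip(rebuilt, direct)))
```

Every entry of gaps is nonnegative (up to rounding), and every entry of split_errors is zero up to rounding, which is the identity used in the first line of the induction step.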

Differentiable concave and convex functions

A previous result shows that the graph of a concave function of a single variable lies everywhere on or below all of its tangents. The generalization of this result to concave functions of many variables says that the graph of such a function lies everywhere on or below all of its tangent planes. As for a function of a single variable, a symmetric result holds for convex functions. Like the result for functions of a single variable, it is used to show that stationary points are global maximizers of concave functions and global minimizers of convex functions.

Proposition 3.3.4

The differentiable function f of n variables defined on an open convex set S is concave on S if and only if

f(x) − f(x*) ≤ ∑_{i=1}^{n} f'_i(x*)·(x_i − x*_i) for all x ∈ S and x* ∈ S

and is convex on S if and only if

f(x) − f(x*) ≥ ∑_{i=1}^{n} f'_i(x*)·(x_i − x*_i) for all x ∈ S and x* ∈ S.

Source: One proof follows the lines of the proof of the analogous result for a function of a single variable. For details, see Sydsæter and Hammond (1995), Theorem 17.7 (p. 629). (Note that by this result their assumption that the function has continuous partial derivatives is equivalent to the assumption that the function is continuously differentiable, and by Corollary 25.5.1 (p. 246) in Rockafellar (1970), a differentiable convex function, and hence a differentiable concave function, is continuously differentiable.) Simon and Blume (1994), Theorem 21.3 (p. 511), give a different proof.
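The tangent-plane inequality can be checked numerically for a particular function. The sketch below assumes, for illustration only, the function f(x₁, x₂) = −x₁² − x₁x₂ − x₂², whose partial derivatives are written out by hand. It evaluates the difference between the left and right sides of the inequality for concave functions at random pairs (x, x*).

```python
import random

def f(x):
    """An assumed concave function of two variables."""
    return -x[0] ** 2 - x[0] * x[1] - x[1] ** 2

def grad_f(x):
    """Partial derivatives f'_1 and f'_2 of f, computed by hand."""
    return (-2 * x[0] - x[1], -x[0] - 2 * x[1])

def tangent_gap(x, x_star):
    """[f(x) - f(x*)] - sum_i f'_i(x*) (x_i - x*_i); nonpositive for a concave f."""
    slope_term = sum(d * (xi - si) for d, xi, si in zip(grad_f(x_star), x, x_star))
    return f(x) - f(x_star) - slope_term

rng = random.Random(3)

def draw():
    return (rng.uniform(-5, 5), rng.uniform(-5, 5))

pairs = [(draw(), draw()) for _ in range(1000)]
gaps = [tangent_gap(x, x_star) for x, x_star in pairs]

# At the stationary point x* = (0, 0) both partials are zero, so the inequality
# reduces to f(x) <= f(x*): the stationary point is a global maximizer.
stationary = (0.0, 0.0)
values_vs_max = [f(x) - f(stationary) for x, _ in pairs]
```

All entries of gaps and of values_vs_max are nonpositive, which illustrates both the proposition and its use in showing that a stationary point of a concave function is a global maximizer.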

Twice-differentiable concave and convex functions

A twice-differentiable function of a single variable is concave if and only if its second derivative is nonpositive everywhere.

To determine whether a twice-differentiable function of many variables is concave or convex, we need to examine all its second partial derivatives. We call the matrix of all the second partial derivatives the Hessian of the function.

Definition

Let f be a twice-differentiable function of n variables. The Hessian of f at x is

H(x) =

f"₁₁(x)	f"₁₂(x)	...	f"_{1n}(x)
f"₂₁(x)	f"₂₂(x)	...	f"_{2n}(x)
...	...	...	...
f"_{n1}(x)	f"_{n2}(x)	...	f"_{nn}(x)

Note that by Young's theorem, the Hessian at x of any function that is twice-differentiable on an open set containing x is symmetric.

We can determine the concavity/convexity of a function by determining whether the Hessian is negative or positive semidefinite, as follows.

Proposition 3.3.5

Let f be a twice-differentiable function of many variables defined on an open convex set S and denote the Hessian of f at the point x by H(x). Then

f is concave if and only if H(x) is negative semidefinite for all x ∈ S
if H(x) is negative definite for all x ∈ S then f is strictly concave
f is convex if and only if H(x) is positive semidefinite for all x ∈ S
if H(x) is positive definite for all x ∈ S then f is strictly convex.

Source: For a proof, see Sydsæter and Hammond (1995), Theorems 17.13 (p. 640) and 17.14 (p. 641).

Note that the result does not claim that if f is strictly concave then H(x) is negative definite for all x ∈ S. Indeed, consider the function f of a single variable defined by f(x) = −x⁴. This function is strictly concave, but the 1 × 1 matrix H(0) is not negative definite (its single component is 0).

The result implies the following procedure.

Procedure for checking the concavity/convexity and strict concavity/convexity of a function of many variables

Hessian negative definite for all x ⇒ function is strictly concave; Hessian positive definite for all x ⇒ function is strictly convex.
Hessian not negative semidefinite for some x ⇒ function is not concave, and hence not strictly concave; Hessian not positive semidefinite for some x ⇒ function is not convex, and hence not strictly convex.
Hessian negative semidefinite for all x but not negative definite for some x ⇒ function is concave and may or may not be strictly concave; you need to use other methods (for example, the basic definition of concavity) to determine whether it is strictly concave. Similarly, Hessian positive semidefinite for all x but not positive definite for some x ⇒ function is convex and may or may not be strictly convex.
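To apply the procedure at a given point you need to classify a symmetric matrix. One way to do so, sketched below, uses the standard tests based on minors, which are not restated in this section: a symmetric matrix is positive definite if and only if all its leading principal minors are positive, negative definite if and only if its k-th order leading principal minor has the sign of (−1)^k for every k, and semidefinite if and only if the corresponding weak inequalities hold for all principal minors (not only the leading ones). The function names are my own, and exact fractions are used to avoid rounding problems when a minor is zero.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant of a square matrix by Gaussian elimination over exact fractions."""
    a = [[Fraction(v) for v in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot in this column: determinant is zero
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign                # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result

def leading_principal_minors(h):
    """Determinants of the top-left k x k submatrices, k = 1, ..., n."""
    return [det([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]

def principal_minors(h, k):
    """All k-th order principal minors: keep the same k rows and columns."""
    return [det([[h[i][j] for j in idx] for i in idx])
            for idx in combinations(range(len(h)), k)]

def classify(h):
    """Classify a symmetric matrix h by the signs of its minors."""
    orders = range(1, len(h) + 1)
    lead = leading_principal_minors(h)
    if all(d > 0 for d in lead):
        return "positive definite"
    if all((-1) ** k * d > 0 for k, d in zip(orders, lead)):
        return "negative definite"
    if all(d >= 0 for k in orders for d in principal_minors(h, k)):
        return "positive semidefinite"
    if all((-1) ** k * d >= 0 for k in orders for d in principal_minors(h, k)):
        return "negative semidefinite"
    return "indefinite"

examples = {
    "negative definite": [[-2, 0], [0, -1]],
    "negative semidefinite": [[0, 0], [0, -1]],
    "positive semidefinite": [[0, 0], [0, 1]],
    "indefinite": [[1, 0], [0, -1]],
}
results = {name: classify(h) for name, h in examples.items()}
```

The second and third example matrices have the same leading principal minors (both are 0), which is why the semidefiniteness tests look at all principal minors. Remember that the procedure requires the classification to hold at every point x of the domain; a check at one point, or at finitely many points, can refute concavity or convexity but cannot establish it.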

Example 3.3.4

Consider the function f(x, y) = 2x − y − x² + 2xy − y² defined on the set of all pairs of numbers. Its Hessian is

	−2	2
	2	−2

which is negative semidefinite, but not negative definite (its determinant is zero). (For this function, the Hessian does not depend on (x, y); in general it does.) Thus f is concave; the analysis does not tell us whether it is strictly concave.
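In this example the question left open by the Hessian can be settled with the basic definition. The function can be written as f(x, y) = 2x − y − (x − y)², so along any line on which x − y is constant it is linear. The following sketch, with points and a value of λ chosen by me for illustration, exhibits two distinct points for which the two sides of the defining inequality are equal.

```python
from fractions import Fraction

def f(x, y):
    """The function of the example: 2x - y - x^2 + 2xy - y^2."""
    return 2 * x - y - x ** 2 + 2 * x * y - y ** 2

# Two distinct points with the same value of x - y, and a weight strictly between 0 and 1.
p = (Fraction(0), Fraction(0))
q = (Fraction(1), Fraction(1))
lam = Fraction(1, 3)

combination = tuple((1 - lam) * a + lam * b for a, b in zip(p, q))
lhs = f(*combination)                        # f((1 - lam) p + lam q)
rhs = (1 - lam) * f(*p) + lam * f(*q)        # (1 - lam) f(p) + lam f(q)
```

Both lhs and rhs equal 1/3, so the inequality in the definition of strict concavity fails for this pair of points: f is concave but not strictly concave.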

Example 3.3.5

Consider the function f(x₁, x₂, x₃) = x₁² + 2x₂² + 3x₃² + 2x₁x₂ + 2x₁x₃ defined on the set of all triples of numbers. Its first partials are

f'₁(x₁, x₂, x₃)	= 2x₁ + 2x₂ + 2x₃
f'₂(x₁, x₂, x₃)	= 4x₂ + 2x₁
f'₃(x₁, x₂, x₃)	= 6x₃ + 2x₁.

So its Hessian is

f"₁₁	f"₁₂	f"₁₃
f"₂₁	f"₂₂	f"₂₃
f"₃₁	f"₃₂	f"₃₃

=

2	2	2
2	4	0
2	0	6

The leading principal minors of the Hessian are 2 > 0, 4 > 0, and 8 > 0. So the Hessian is positive definite, and f is strictly convex.
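For reference, here is how the three leading principal minors can be computed. The notation D₁, D₂, D₃ for the minors is mine; the third determinant is expanded along the first row.

```latex
\begin{align*}
D_1 &= 2, \\
D_2 &= \begin{vmatrix} 2 & 2 \\ 2 & 4 \end{vmatrix}
     = 2\cdot 4 - 2\cdot 2 = 4, \\
D_3 &= \begin{vmatrix} 2 & 2 & 2 \\ 2 & 4 & 0 \\ 2 & 0 & 6 \end{vmatrix}
     = 2\,(4\cdot 6 - 0\cdot 0) - 2\,(2\cdot 6 - 0\cdot 2) + 2\,(2\cdot 0 - 4\cdot 2) \\
    &= 48 - 24 - 16 = 8.
\end{align*}
```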

In these two examples, the Hessian of f is independent of its argument, because f is a quadratic. In the next example, the Hessian of the function does not have this property.

Example 3.3.6

Consider the Cobb-Douglas function, defined by f(K, L) = AK^aL^b on the set of pairs (K, L) with K ≥ 0 and L ≥ 0. Assume that A > 0. The Hessian of this function is

	a(a−1)AK^(a−2)L^b	abAK^(a−1)L^(b−1)
	abAK^(a−1)L^(b−1)	b(b−1)AK^aL^(b−2)

Thus for f to be concave on the set of pairs (K, L) with K ≥ 0 and L ≥ 0 we need a(a−1)AK^(a−2)L^b ≤ 0, b(b−1)AK^aL^(b−2) ≤ 0, and abA²K^(2a−2)L^(2b−2)(1 − (a + b)) ≥ 0 for all K ≥ 0 and L ≥ 0. Thus f is concave on this set if and only if a ≥ 0, b ≥ 0, and a + b ≤ 1 (so that a ≤ 1 and b ≤ 1). Similarly, it is strictly concave on the set of pairs (K, L) with K > 0 and L > 0 if a > 0, b > 0, and a + b < 1 (so that a < 1 and b < 1). (In fact, under these conditions it is strictly concave on the set of pairs (K, L) with K ≥ 0 and L ≥ 0, but this conclusion does not follow from the signs of the minors of the Hessian.)
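The sign conditions can be examined numerically at sample points with K > 0 and L > 0. The sketch below is an illustration only: the grid of points, the parameter values, and the tolerance are assumptions of mine, and a check on a grid cannot replace the argument for all (K, L). It also compares the determinant computed from the entries of the Hessian with the closed form abA²K^(2a−2)L^(2b−2)(1 − (a + b)).

```python
def hessian(K, L, A, a, b):
    """Second partials (f_KK, f_KL, f_LL) of f(K, L) = A K^a L^b, for K > 0 and L > 0."""
    f_KK = a * (a - 1) * A * K ** (a - 2) * L ** b
    f_KL = a * b * A * K ** (a - 1) * L ** (b - 1)
    f_LL = b * (b - 1) * A * K ** a * L ** (b - 2)
    return f_KK, f_KL, f_LL

def determinant(K, L, A, a, b):
    """Determinant of the Hessian, computed from its entries."""
    f_KK, f_KL, f_LL = hessian(K, L, A, a, b)
    return f_KK * f_LL - f_KL ** 2

def determinant_closed_form(K, L, A, a, b):
    """The expression ab A^2 K^(2a-2) L^(2b-2) (1 - (a + b))."""
    return a * b * A ** 2 * K ** (2 * a - 2) * L ** (2 * b - 2) * (1 - (a + b))

def negative_semidefinite(K, L, A, a, b, tol=1e-12):
    """2 x 2 test: both diagonal entries <= 0 and determinant >= 0 (up to tol)."""
    f_KK, _, f_LL = hessian(K, L, A, a, b)
    return f_KK <= tol and f_LL <= tol and determinant(K, L, A, a, b) >= -tol

# A small grid of points with K > 0 and L > 0 (an arbitrary choice).
GRID = [(K, L) for K in (0.5, 1.0, 2.0, 4.0) for L in (0.5, 1.0, 2.0, 4.0)]

def nsd_on_grid(a, b, A=1.0):
    return all(negative_semidefinite(K, L, A, a, b) for K, L in GRID)

decreasing_returns = nsd_on_grid(0.3, 0.5)   # a + b < 1
constant_returns = nsd_on_grid(0.5, 0.5)     # a + b = 1: determinant is zero
increasing_returns = nsd_on_grid(0.8, 0.6)   # a + b > 1: determinant is negative
```

On this grid the Hessian passes the test when a + b ≤ 1 and fails it when a + b > 1, in line with the condition derived above.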

Two properties of concave and convex functions

The following result says that any weighted sum of concave functions with nonnegative weights is concave; the proof is an exercise.

Proposition 3.3.6

Let f and g be functions of many variables defined on a convex set S.

If f and g are concave and a ≥ 0 and b ≥ 0 then the function h defined by h(x) = af(x) + bg(x) for all x ∈ S is concave.
If f and g are convex and a ≥ 0 and b ≥ 0 then the function h defined by h(x) = af(x) + bg(x) for all x ∈ S is convex.

Example 3.3.7: A firm produces the output f(x) from the vector x of inputs, which costs it c(x). The function f is concave and the function c is convex. The firm sells its output at a fixed price p > 0. Its profit when it uses the input vector x is
π(x) = pf(x) − c(x).
That is, π is the sum of two functions, pf and −c. The function −c is concave because c is convex, so by the proposition π is concave.

The last result is a generalization of a previous result to functions of many variables; the proof is the same as the proof of the previous result.

Proposition 3.3.7

Let U be a function of many variables defined on a convex set S and let g be a function of a single variable defined on a set that contains the range of U.

If U is concave and g is nondecreasing and concave then the function f defined by f(x) = g(U(x)) for all x ∈ S is concave.
If U is convex and g is nondecreasing and convex then the function f defined by f(x) = g(U(x)) for all x ∈ S is convex.
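The assumption that g is nondecreasing matters. The following sketch illustrates this with functions chosen by me for the purpose: U(x₁, x₂) = −(x₁² + x₂²), which is concave, composed first with g(u) = −e^(−u) (concave and increasing) and then with g(u) = −u (concave, because linear, but decreasing). In the second case the composition is x₁² + x₂², which is not concave.

```python
import math
import random

def U(x):
    """An assumed concave function of two variables."""
    return -(x[0] ** 2 + x[1] ** 2)

def g_increasing(u):
    """Concave and increasing."""
    return -math.exp(-u)

def g_decreasing(u):
    """Concave (linear) but decreasing."""
    return -u

def gap(f, x, x_prime, lam):
    """f((1-lam)x + lam x') - [(1-lam)f(x) + lam f(x')]; nonnegative for a concave f."""
    combination = tuple((1 - lam) * a + lam * b for a, b in zip(x, x_prime))
    return f(combination) - ((1 - lam) * f(x) + lam * f(x_prime))

def f_good(x):
    return g_increasing(U(x))   # covered by the proposition: concave

def f_bad(x):
    return g_decreasing(U(x))   # equals x_1^2 + x_2^2: not concave

rng = random.Random(4)

def draw():
    return (rng.uniform(-2, 2), rng.uniform(-2, 2))

good_gaps = [gap(f_good, draw(), draw(), rng.random()) for _ in range(1000)]

# A single pair of points is enough to show that f_bad is not concave.
bad_gap = gap(f_bad, (-1.0, 0.0), (1.0, 0.0), 0.5)
```

The entries of good_gaps are nonnegative up to rounding, while bad_gap is −1: the value of f_bad at the midpoint (0, 0) is 0, below the average value 1 at the two endpoints.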
