Mathematical methods for economic theory: 3.1 Concave and convex functions of a single variable

3.1 Concave and convex functions of a single variable

Definitions

The twin notions of concavity and convexity are used widely in economic theory, and are also central to optimization theory. A function of a single variable is concave if every line segment joining two points on its graph does not lie above the graph at any point. Symmetrically, a function of a single variable is convex if every line segment joining two points on its graph does not lie below the graph at any point. These concepts are illustrated in the following figures.

Here is a precise definition.

Definition

Let f be a function of a single variable defined on an interval. Then f is

concave if every line segment joining two points on its graph is never above the graph
convex if every line segment joining two points on its graph is never below the graph.

To make this definition more useful, we can translate it into an algebraic condition. Let f be a function defined on the interval [x₁, x₂]. This function is concave according to the definition if, for every pair of numbers a and b with x₁ ≤ a ≤ x₂ and x₁ ≤ b ≤ x₂, the line segment from (a, f(a)) to (b, f(b)) lies on or below the graph of the function, as illustrated in the following figure.

Denote the height of the line segment from (a, f(a)) to (b, f(b)) at the point x by h_a,b(x). Then according to the definition, the function f is concave if and only if for every pair of numbers a and b with x₁ ≤ a ≤ x₂ and x₁ ≤ b ≤ x₂ we have

f(x) ≥ h_a,b(x) for all x with a ≤ x ≤ b. (*)

Now, every point x with a ≤ x ≤ b may be written as x = (1 − λ)a + λb, where λ is a real number from 0 to 1. (When λ = 0, we have x = a; when λ = 1 we have x = b.) The fact that h_a,b is linear means that

h_a,b((1 − λ)a + λb) = (1 − λ)h_a,b(a) + λh_a,b(b)

for any value of λ with 0 ≤ λ ≤ 1. Further, we have h_a,b(a) = f(a) and h_a,b(b) = f(b) (the line segment coincides with the function at its endpoints), so

h_a,b((1 − λ)a + λb) = (1 − λ)f(a) + λf(b).

Thus the condition (*) is equivalent to

f((1−λ)a + λb) ≥ (1 − λ)f(a) + λf(b) for all λ with 0 ≤ λ ≤ 1.

We can make a symmetric argument for a convex function. Thus the definition of concave and convex functions may be rewritten as follows.

Definition

Let f be a function of a single variable defined on the interval I. Then f is

concave if for all a ∈ I, all b ∈ I, and all λ ∈ (0, 1) we have

f((1−λ)a + λb) ≥ (1 − λ)f(a) + λf(b)
convex if for all a ∈ I, all b ∈ I, and all λ ∈ (0, 1) we have

f((1−λ)a + λb) ≤ (1 − λ)f(a) + λf(b).

In an exercise you are asked to show that f is convex if and only if −f is concave.
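The algebraic definition lends itself to a numerical spot check. The following is a minimal sketch, not a proof: it samples pairs (a, b) and values of λ on a grid and reports whether the defining inequality ever fails. The function names `is_concave_on_grid` and `is_convex_on_grid`, the grid sizes, and the tolerance `tol` are illustrative assumptions and do not come from the text. A result of `True` means only that no violation was found at the sampled points.

```python
import math
from typing import Callable


def is_concave_on_grid(f: Callable[[float], float], lo: float, hi: float,
                       n_points: int = 41, n_lambdas: int = 9,
                       tol: float = 1e-9) -> bool:
    """Return False if f((1-lam)a + lam*b) < (1-lam)f(a) + lam*f(b)
    at any sampled a, b in [lo, hi] and lam in (0, 1); True otherwise."""
    xs = [lo + (hi - lo) * i / (n_points - 1) for i in range(n_points)]
    # Values of lam strictly between 0 and 1, as in the definition.
    lams = [(j + 1) / (n_lambdas + 1) for j in range(n_lambdas)]
    for a in xs:
        for b in xs:
            for lam in lams:
                lhs = f((1 - lam) * a + lam * b)
                rhs = (1 - lam) * f(a) + lam * f(b)
                if lhs < rhs - tol:  # tol absorbs floating-point rounding
                    return False
    return True


def is_convex_on_grid(f: Callable[[float], float], lo: float, hi: float,
                      **kwargs) -> bool:
    """Use the fact that f is convex if and only if -f is concave."""
    return is_concave_on_grid(lambda x: -f(x), lo, hi, **kwargs)


print(is_concave_on_grid(math.sqrt, 0.0, 4.0))           # square root on [0, 4]
print(is_concave_on_grid(lambda x: x * x, -1.0, 1.0))    # x^2 is not concave
print(is_convex_on_grid(lambda x: x * x, -1.0, 1.0))     # but it is convex
```

An affine function such as 2 + 3x passes both checks, in line with the observation that a function is both concave and convex exactly when it is affine.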

Note that a function may be both concave and convex. Let f be such a function. Then for all values of a and b we have

f((1−λ)a + λb) ≥ (1 − λ)f(a) + λf(b) for all λ ∈ (0, 1)

and

f((1−λ)a + λb) ≤ (1 − λ)f(a) + λf(b) for all λ ∈ (0, 1).

Equivalently, for all values of a and b we have

f((1−λ)a + λb) = (1 − λ)f(a) + λf(b) for all λ ∈ (0, 1).

That is, a function is both concave and convex if and only if it is linear (or, more properly, affine), taking the form f(x) = α + βx for all x, for some constants α and β.

Economists often assume that a firm's production function is increasing and concave. Examples of such a function for a firm that uses a single input are shown in the next two figures. The fact that such a production function is increasing means that more input generates more output. The fact that it is concave means that the increase in output generated by each one-unit increase in the input does not increase as more input is used. In economic jargon, there are “nonincreasing returns” to the input, or, given that the firm uses a single input, “nonincreasing returns to scale”. In the example in the first of the following two figures, the increase in output generated by each one-unit increase in the input not only does not increase as more of the input is used, but in fact decreases, so that in economic jargon there are “diminishing returns”, not merely “nonincreasing returns”, to the input.

The notions of concavity and convexity are important in optimization theory because, as we shall see, a simple condition is sufficient (as well as necessary) for a maximizer of a differentiable concave function and for a minimizer of a differentiable convex function. (Precisely, every point at which the derivative of a concave differentiable function is zero is a maximizer of the function, and every point at which the derivative of a convex differentiable function is zero is a minimizer of the function.)

The next result shows that a nondecreasing concave transformation of a concave function is concave.

Proposition 3.1.1: Let U be a concave function of a single variable and g a nondecreasing and concave function of a single variable. Define the function f by f(x) = g(U(x)) for all x. Then f is concave.

Proof

We need to show that f((1−λ)a + λb) ≥ (1−λ)f(a) + λf(b) for all values of a and b with a ≤ b and all λ ∈ (0, 1).

By the definition of f we have

f((1−λ)a + λb) = g(U((1−λ)a + λb)).

Now, because U is concave we have

U((1−λ)a + λb) ≥ (1 − λ)U(a) + λU(b).

Further, because g is nondecreasing, r ≥ s implies g(r) ≥ g(s). Hence

g(U((1−λ)a + λb)) ≥ g((1−λ)U(a) + λU(b)).

But now by the concavity of g we have

g((1−λ)U(a) + λU(b)) ≥ (1−λ)g(U(a)) + λg(U(b)) = (1−λ)f(a) + λf(b).

So f is concave.
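To see why the proof needs g to be nondecreasing, it may help to compare two transformations of the same concave function. The sketch below is illustrative only: the choices U(x) = √x, g(u) = ln u, and g(u) = −u are assumptions picked for the example, as is the helper name `violates_concavity`. With g(u) = ln u (nondecreasing and concave) the composition is ½ ln x, which is concave. With g(u) = −u (concave, being affine, but decreasing) the composition is −√x, which is convex rather than concave.

```python
import math


def violates_concavity(f, lo, hi, n=30):
    """Search a grid on [lo, hi] for a, b, lam with
    f((1-lam)a + lam*b) < (1-lam)f(a) + lam*f(b)."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    for a in xs:
        for b in xs:
            for lam in (0.25, 0.5, 0.75):
                lhs = f((1 - lam) * a + lam * b)
                rhs = (1 - lam) * f(a) + lam * f(b)
                if lhs < rhs - 1e-9:  # small tolerance for rounding
                    return True
    return False


U = math.sqrt                # concave on (0, infinity)
g_ok = math.log              # nondecreasing and concave on (0, infinity)
g_bad = lambda u: -u         # concave (affine) but decreasing


def f_ok(x):
    return g_ok(U(x))        # equals 0.5 * ln(x)


def f_bad(x):
    return g_bad(U(x))       # equals -sqrt(x)


print(violates_concavity(f_ok, 0.1, 4.0))    # False: no violation found
print(violates_concavity(f_bad, 0.1, 4.0))   # True: concavity fails
```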

Jensen's inequality: another characterization of concave and convex functions

If we let λ₁ = 1 − λ and λ₂ = λ in the earlier definition of a concave function and replace a by x₁ and b by x₂, the definition becomes: f is concave on the interval I if for all x₁ ∈ I, all x₂ ∈ I, and all λ₁ ≥ 0 and λ₂ ≥ 0 with λ₁ + λ₂ = 1 we have

f(λ₁x₁ + λ₂x₂) ≥ λ₁f(x₁) + λ₂f(x₂).

The following result, due to Johan Jensen (1859–1925), shows that this characterization can be generalized. (The J in each of Jensen's names is, incidentally, pronounced the way an English speaker pronounces a Y.)

Proposition 3.1.2 (Jensen's inequality)

A function f of a single variable defined on the interval I is concave if and only if for all n ≥ 2

f(λ₁x₁ + ... + λ_nx_n) ≥ λ₁f(x₁) + ... + λ_nf(x_n)

for all x₁ ∈ I, ..., x_n ∈ I and all λ₁ ≥ 0, ..., λ_n ≥ 0 with ∑ᵢ₌₁ⁿ λ_i = 1.

The function f of a single variable defined on the interval I is convex if and only if for all n ≥ 2

f(λ₁x₁ + ... + λ_nx_n) ≤ λ₁f(x₁) + ... + λ_nf(x_n)

for all x₁ ∈ I, ..., x_n ∈ I and all λ₁ ≥ 0, ..., λ_n ≥ 0 with ∑ᵢ₌₁ⁿ λ_i = 1.

Source: The result is a special case of a result for functions of many variables.
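Jensen's inequality can be illustrated numerically for a particular concave function. The sketch below assumes f(x) = ln x on the interval [0.1, 10] and draws random points and random nonnegative weights that sum to 1. The names `jensen_gap` and `random_weights`, the seed, and the sample sizes are hypothetical choices for the illustration. The gap f(λ₁x₁ + ... + λ_nx_n) − (λ₁f(x₁) + ... + λ_nf(x_n)) should be nonnegative, up to rounding, in every draw.

```python
import math
import random


def jensen_gap(f, xs, weights):
    """Return f(sum w_i x_i) - sum w_i f(x_i), which is >= 0 for concave f."""
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-12
    weighted_x = sum(w * x for w, x in zip(weights, xs))
    return f(weighted_x) - sum(w * f(x) for w, x in zip(weights, xs))


def random_weights(n, rng):
    """Draw n nonnegative weights and normalize them to sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


rng = random.Random(0)  # fixed seed so that the run is reproducible
gaps = []
for _ in range(1000):
    n = rng.randint(2, 6)
    xs = [rng.uniform(0.1, 10.0) for _ in range(n)]
    gaps.append(jensen_gap(math.log, xs, random_weights(n, rng)))

# For the concave function ln, no draw should give a negative gap.
print(min(gaps) >= -1e-12)
```

For a convex function the gap has the opposite sign: with f(x) = x², points 0 and 2, and weights ½ and ½, the gap is 1 − 2 = −1.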

Differentiable functions

The following diagram of a differentiable concave function should convince you that the graph of such a function lies on or below every tangent to the function. In the figure, the red line is the graph of the function and the gray line is the tangent at the point x*, which has slope f'(x*).

The fact that the graph of the function lies on or below this tangent is equivalent to

f(x) − f(x*) ≤ f'(x*)(x − x*) for all x.

The next result states this observation, and the similar one for convex functions, precisely. It is used to show the important result that for a concave differentiable function f every point x for which f'(x) = 0 is a global maximizer, and for a convex differentiable function every such point is a global minimizer.

Proposition 3.1.3

The differentiable function f of a single variable defined on an open interval I is concave on I if and only if

f(x) − f(x*) ≤ f'(x*)(x − x*) for all x ∈ I and x* ∈ I

and is convex on I if and only if

f(x) − f(x*) ≥ f'(x*)(x − x*) for all x ∈ I and x* ∈ I.

Proof

I first show that if f is concave on I then the first inequality in the result holds. From the definition of a concave function, for all x ∈ I, all x* ∈ I, and all λ ∈ (0, 1) we have

f((1 − λ)x* + λx) ≥ (1 − λ)f(x*) + λf(x),

or

f((1 − λ)x* + λx) ≥ f(x*) + λ(f(x) − f(x*)),

or, subtracting f(x*) from both sides and dividing by λ > 0,

f(x) − f(x*) ≤ (f((1 − λ)x* + λx) − f(x*))/λ.

Define the function g of a single variable by

g(λ) = f((1 − λ)x* + λx).

Then we can write the previous inequality as

f(x) − f(x*) ≤ (g(λ) − g(0))/λ.

The function g is differentiable, because f is differentiable, and the limit of the right-hand side of this inequality as λ converges to 0 is g'(0) by the definition of a derivative. Thus

f(x) − f(x*) ≤ g'(0).

Now, we have

g'(λ) = f'((1 − λ)x* + λx)(x − x*),

so that

g'(0) = f'(x*)(x − x*).

Substituting this expression for the right-hand side of the inequality two lines above, we get the inequality in the result.

I now show that if the first inequality in the result holds then f is concave. Let x* ∈ I, x ∈ I, and λ ∈ [0, 1], and define x' = (1 − λ)x* + λx. Then x' ∈ I and by the inequality, which holds for all values of x and x* in I, we have both

f(x*) − f(x') ≤ f'(x')(x* − x')

(taking x* in the role of x and x' in the role of x*) and

f(x) − f(x') ≤ f'(x')(x − x')

(taking x' in the role of x*). Now multiply the first inequality by 1 − λ and the second by λ, and add, to get

(1 − λ)(f(x*) − f(x')) + λ(f(x) − f(x')) ≤ (1 − λ)f'(x')(x* − x') + λf'(x')(x − x'),

or

(1 − λ)f(x*) + λf(x) − f(x') ≤ f'(x')((1 − λ)x* + λx − x').

Given that x' = (1 − λ)x* + λx, the right-hand side is 0 and the inequality is

(1 − λ)f(x*) + λf(x) ≤ f((1 − λ)x* + λx),

showing that f is concave.

Symmetric arguments apply for a convex function.
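The tangent inequality in Proposition 3.1.3 can be checked on a grid for a specific function. The sketch below assumes f(x) = ln x with derivative 1/x on [0.5, 5], where f is concave, and contrasts it with x³, which is convex on that interval and so violates the concave version of the inequality. The helper name `tangent_bound_holds` and the grid are illustrative assumptions. A `True` result means only that the inequality held at every sampled pair.

```python
import math


def tangent_bound_holds(f, fprime, points, tol=1e-12):
    """Check f(x) - f(x_star) <= fprime(x_star) * (x - x_star)
    for every pair (x, x_star) drawn from points."""
    return all(
        f(x) - f(x_star) <= fprime(x_star) * (x - x_star) + tol
        for x in points
        for x_star in points
    )


points = [0.5 + 0.1 * i for i in range(46)]   # grid on [0.5, 5.0]

# ln is concave on this interval, so its graph lies on or below each tangent.
print(tangent_bound_holds(math.log, lambda x: 1 / x, points))

# x^3 is convex on this interval, so the concave inequality fails.
print(tangent_bound_holds(lambda x: x ** 3, lambda x: 3 * x ** 2, points))
```

Setting f'(x*) = 0 in the inequality gives f(x) ≤ f(x*) for all x ∈ I, which is how the proposition yields the result that a stationary point of a concave differentiable function is a global maximizer.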

Twice-differentiable functions

We often assume that the functions in economic models (e.g. a firm's production function, a consumer's utility function) are twice-differentiable. We may determine the concavity or convexity of such a function by examining its second derivative: a function whose second derivative is nonpositive everywhere is concave, and a function whose second derivative is nonnegative everywhere is convex.

Proposition 3.1.4

A twice-differentiable function f of a single variable defined on the interval I is

concave if and only if f"(x) ≤ 0 for all x in the interior of I
convex if and only if f"(x) ≥ 0 for all x in the interior of I.

Source: The result is a special case of a result for functions of many variables. For a direct proof, see Rockafellar (1970), Theorem 4.4 (p. 26).

Example 3.1.1: Is x² − 2x + 2 concave or convex on any interval? Its second derivative is 2 ≥ 0, so it is convex for all values of x.

Example 3.1.2: Is x³ − x² concave or convex on any interval? Its second derivative is 6x − 2, so it is convex on the interval [1/3, ∞) and concave on the interval (−∞, 1/3].
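When a second derivative is awkward to compute by hand, a finite-difference approximation can give a quick indication of its sign. The sketch below applies a central difference to the function of Example 3.1.2 and compares the result with the exact value 6x − 2. The helper name `second_derivative` and the step size `h` are assumptions made for the illustration. A numerical sign check of this kind is suggestive only and does not replace the analytical argument.

```python
def second_derivative(f, x, h=1e-4):
    """Central-difference approximation to the second derivative of f at x."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def f(x):
    return x ** 3 - x ** 2     # Example 3.1.2; the exact f''(x) is 6x - 2


for x in (-1.0, 0.0, 1 / 3, 1.0, 2.0):
    approx = second_derivative(f, x)
    exact = 6 * x - 2
    print(f"x = {x:.3f}  approx f'' = {approx:+.4f}  exact = {exact:+.4f}")
```

The approximation is negative to the left of 1/3 and positive to the right, matching the conclusion that the function is concave on (−∞, 1/3] and convex on [1/3, ∞).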

The next result shows how the characterization of concave twice-differentiable functions can be used to prove an earlier result when the functions involved are twice-differentiable. The earlier result is true for all functions, so the next result proves something we already know to be true. I include it only as an example of the usefulness of the characterization of concavity in the previous proposition.

Proposition 3.1.5: Let U be a concave function of a single variable and g a nondecreasing and concave function of a single variable. Assume that U and g are twice-differentiable. Define the function f by f(x) = g(U(x)) for all x. Then f is concave.

Proof

We have f'(x) = g'(U(x))U'(x), so that

f"(x) = g"(U(x))·U'(x)·U'(x) + g'(U(x))U"(x).

Since g"(U(x)) ≤ 0 (g is concave), U'(x)·U'(x) ≥ 0, g'(U(x)) ≥ 0 (g is nondecreasing), and U"(x) ≤ 0 (U is concave), both terms are nonpositive, so that f"(x) ≤ 0 for all x. That is, by Proposition 3.1.4, f is concave.

A point at which a twice-differentiable function changes from being convex to concave, or vice versa, is an inflection point.

Definition

The point c is an inflection point of a twice-differentiable function f of a single variable if f"(c) = 0 and for some values of a and b with a < c < b we have

either f"(x) > 0 if a < x < c and f"(x) < 0 if c < x < b
or f"(x) < 0 if a < x < c and f"(x) > 0 if c < x < b.

The function f in the following figure has an inflection point at c. For x between a and c, the value of f"(x) is negative, and for x between c and b, it is positive.

Note that some authors, including Sydsæter and Hammond (1995) (p. 308), give a slightly different definition, in which the conditions f"(x) > 0 and f"(x) < 0 are replaced by f"(x) ≥ 0 and f"(x) ≤ 0. According to this alternative definition, f" does not have to change sign at c. For example, for a linear function, every point satisfies the alternative definition.
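Under the definition used here, locating an inflection point amounts to finding a point at which f" is zero and changes sign. The sketch below uses bisection on a given second derivative. It assumes that f" is continuous and takes strictly opposite signs at the two ends of the bracket. The function name `find_sign_change` and the tolerance are hypothetical. The example uses f(x) = x³ − x², for which f"(x) = 6x − 2 changes sign at 1/3.

```python
from typing import Callable, Optional


def find_sign_change(f2: Callable[[float], float], lo: float, hi: float,
                     tol: float = 1e-10) -> Optional[float]:
    """Bisect for a point at which f2 changes sign on [lo, hi].

    Returns None if f2(lo) and f2(hi) do not have strictly opposite signs,
    in which case this simple bracket test finds no sign change.
    """
    if f2(lo) * f2(hi) >= 0:
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f2(lo) * f2(mid) <= 0:
            hi = mid      # the sign change lies in [lo, mid]
        else:
            lo = mid      # the sign change lies in [mid, hi]
    return (lo + hi) / 2


c = find_sign_change(lambda x: 6 * x - 2, 0.0, 1.0)    # f(x) = x^3 - x^2
print(c)

# For f(x) = x^4 we have f''(x) = 12x^2, which is zero at 0 but never negative,
# so 0 is not an inflection point under the definition with strict inequalities.
print(find_sign_change(lambda x: 12 * x ** 2, -1.0, 1.0))
```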

Strict convexity and concavity

The inequalities in the definition of concave and convex functions are weak: such functions may have linear parts, as does the function in the following figure for x > a.

A concave function that has no linear parts is said to be strictly concave.

Definition

The function f of a single variable defined on the interval I is

strictly concave if for all a ∈ I, all b ∈ I with a ≠ b, and all λ ∈ (0,1) we have

f((1−λ)a + λb) > (1 − λ)f(a) + λf(b).
strictly convex if for all a ∈ I, all b ∈ I with a ≠ b, and all λ ∈ (0,1) we have

f((1−λ)a + λb) < (1 − λ)f(a) + λf(b).

An earlier result states that if f is twice differentiable then

f is concave on [a, b] if and only if f"(x) ≤ 0 for all x ∈ (a, b).

Does this result have an analogue for strictly concave functions? Not exactly. If f"(x) < 0 for all x ∈ (a,b) then f is strictly concave on [a, b], but the converse is not true: if f is strictly concave then its second derivative is not necessarily negative at all points. (Consider the function f(x) = −x⁴. It is strictly concave, but its second derivative at 0 is zero, not negative.) That is,

f is strictly concave on [a, b] if f"(x) < 0 for all x ∈ (a, b), but if f is strictly concave on [a, b] then f"(x) is not necessarily negative for all x ∈ (a, b).

Analogous observations apply to the case of convex and strictly convex functions, with the conditions f"(x) ≥ 0 and f"(x) > 0 replacing the conditions f"(x) ≤ 0 and f"(x) < 0.
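The example f(x) = −x⁴ can be examined numerically. The sketch below checks the strict inequality from the definition on a grid of distinct pairs and, for contrast, applies the same check to −|x|, which is concave but has linear parts. The helper name `strictly_concave_on_grid`, the grid, and the choice of −|x| as the contrasting function are assumptions made for the illustration. As with any grid check, a `True` result shows only that no violation was found at the sampled points.

```python
def strictly_concave_on_grid(f, lo, hi, n=40):
    """Check f((1-lam)a + lam*b) > (1-lam)f(a) + lam*f(b)
    for all sampled a != b in [lo, hi] and lam in (0, 1)."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return all(
        f((1 - lam) * a + lam * b) > (1 - lam) * f(a) + lam * f(b)
        for a in xs
        for b in xs
        if a != b
        for lam in (0.25, 0.5, 0.75)
    )


def f(x):
    return -x ** 4             # the example from the text


def f2(x):
    return -12 * x ** 2        # its second derivative


print(f2(0.0) == 0)                                  # f'' vanishes at 0
print(strictly_concave_on_grid(f, -1.0, 1.0))        # strict inequality holds
print(strictly_concave_on_grid(lambda x: -abs(x), -1.0, 1.0))  # fails on linear parts
```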
